SPSS Tutorials: Rank Cases

In its simplest form, a rank transform converts a set of data values by ordering them from smallest to largest, and then assigning a rank to each value. In SPSS, the Rank Cases procedure can be used to compute the rank transform of a variable.

Rank Cases

A rank variable represents the ordering of the values of a numeric variable. Because ranks are the cornerstone of many nonparametric statistical methods, it is useful to know how to compute the rank transform of a variable in your dataset.

In SPSS, rank variables can be computed using the Rank Cases procedure. To open Rank Cases, click Transform > Rank Cases.

The Rank Cases dialog window.

A Variables: The variables to compute rank transforms on. The new ranks will be saved to new variables (whose names will be automatically generated).

B By: (Optional) Assign ranks within groups. By variables should be nominal or ordinal, and have a small number of categories.

C Assign Rank 1 to: Should ranks be assigned in increasing or decreasing order? By default, ranks are assigned by ordering the data values in ascending order (smallest to largest), then labeling the smallest value as rank 1. Alternatively, Largest value orders the data in descending order (largest to smallest), and assigns the largest value the rank of 1.

D Display summary tables: When checked, a summary of the new rank variables is printed to the Output window. The summary includes the original variables, the names of the new variables, the rank order, the ranking method, and the method used for ties. This option is on by default.


E Rank types: (Optional) Choose one or more formulas to compute the ranks. Each box you check on this screen will add another rank variable to your dataset.

By default, only the "Rank" option is selected; this computes simple ranks. For details about the other rank types and the proportion estimation formulas, please see the official SPSS documentation for Rank Cases. Note that the Proportion Estimation Formula options are inactive unless Proportion estimates and/or Normal scores are selected.


F Ties: How should ranks be assigned in the case of ties? (A tie occurs when two or more observations share the exact same value.) There are four options for how to resolve ties: Mean, Low, High, and Sequential ranks to unique values. By default, mean ranks are assigned to ties.

  • Mean - First, the observations are ordered and given unique, sequential ranks. Then, each tied observation receives the average of the ranks assigned to its group of ties.
  • Low - First, the observations are ordered and given unique, sequential ranks. Then, each tied observation receives the smallest rank assigned to its group of ties.
  • High - First, the observations are ordered and given unique, sequential ranks. Then, each tied observation receives the largest rank assigned to its group of ties.
  • Sequential ranks to unique values - First, the observations are ordered. Ranks are then assigned sequentially to the distinct values, so tied observations share one rank and the next distinct value receives the next whole-number rank. (The number of distinct ranks assigned is therefore equal to the number of unique values.)
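
The four tie-handling options can be compared side by side on a tiny made-up dataset. The sketch below is an illustration only: the variable x, its four values, and the new variable names (r_mean, r_low, r_high, r_seq) are hypothetical and not part of the sample dataset. It assumes the INTO keyword is used to name each rank variable, and that the TIES keyword CONDENSE is the syntax counterpart of the Sequential ranks to unique values option.

```spss
* Hypothetical toy data: two cases are tied at the value 20.
DATA LIST FREE / x.
BEGIN DATA
10 20 20 30
END DATA.

* One RANK command per tie method, each saved to its own variable.
RANK VARIABLES=x (A) /RANK INTO r_mean /PRINT=NO /TIES=MEAN.
RANK VARIABLES=x (A) /RANK INTO r_low /PRINT=NO /TIES=LOW.
RANK VARIABLES=x (A) /RANK INTO r_high /PRINT=NO /TIES=HIGH.
RANK VARIABLES=x (A) /RANK INTO r_seq /PRINT=NO /TIES=CONDENSE.

* Print the data with all four rank variables.
LIST.
```

For the values 10, 20, 20, 30, the two tied cases occupy sequential ranks 2 and 3. Following the definitions above, Mean gives 1, 2.5, 2.5, 4; Low gives 1, 2, 2, 4; High gives 1, 3, 3, 4; and Sequential ranks to unique values gives 1, 2, 2, 3.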

Example: Rank Transforms for Non-Normal Data

Many hypothesis tests make assumptions about the distribution of the data or residuals. A common way to adjust for non-normality is to transform the variable; for example, by taking its log, square root, or square. Rank transforms are another type of transform. Suppose we want to perform a rank transform on a variable in the sample dataset that is non-normally distributed: MileMinDur.

Before the Procedure

Before we compute the ranks, let's check how many nonmissing values MileMinDur has. Let's also check the distribution of the mile run times graphically. The Frequencies procedure makes it easy to do both of these things at once:
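
A minimal sketch of Frequencies syntax that would do both, assuming the histogram is requested with the HISTOGRAM subcommand and the long frequency table is suppressed with FORMAT=NOTABLE (both choices are assumptions, not settings confirmed by this tutorial):

```spss
* Valid and missing counts plus a histogram for MileMinDur.
* FORMAT=NOTABLE (an assumption) suppresses the long frequency table.
FREQUENCIES VARIABLES=MileMinDur
  /FORMAT=NOTABLE
  /HISTOGRAM.
```

The counts of valid and missing cases appear in the Statistics table that Frequencies prints by default.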

Histogram of the variable MileMinDur.

There are two important things we want to take note of:

  • The full dataset has 435 observations, but only 392 have nonmissing values for mile run time.
  • The histogram shows that the mile run times are strongly right-skewed; additionally, on the low end, the mile run times cut off at 5 minutes.

Because ranks can only be assigned to cases that have a value, the variable created by the Rank Cases procedure will contain ranks for only the 392 cases with nonmissing mile run times.

Running the Procedure

  1. Click Transform > Rank Cases.
  2. Add variable MileMinDur to the Variables box.
  3. Click OK.

SPSS Rank Cases dialog window; MileMinDur has been moved to the Variables box.

Syntax

RANK VARIABLES=MileMinDur (A)
  /RANK
  /PRINT=YES
  /TIES=MEAN.
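
The dialog options described earlier correspond to pieces of this syntax. The variations below are a sketch rather than syntax pasted from the dialog: Gender is a hypothetical grouping variable standing in for whatever By variable you choose, and Blom's formula is picked only as an example of a proportion estimation formula.

```spss
* Largest value gets rank 1: (D) requests descending order.
RANK VARIABLES=MileMinDur (D)
  /RANK
  /PRINT=YES
  /TIES=MEAN.

* Rank within groups: Gender is a hypothetical By variable.
RANK VARIABLES=MileMinDur (A) BY Gender
  /RANK
  /PRINT=YES
  /TIES=MEAN.

* Simple ranks plus normal scores, using Blom's proportion estimation formula.
RANK VARIABLES=MileMinDur (A)
  /RANK
  /NORMAL
  /PRINT=YES
  /TIES=MEAN
  /FRACTION=BLOM.
```

Each command adds its own new variable or variables. If an automatically generated name is already in use, SPSS should choose a different one, so check the summary table for the names actually created.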

Output

After executing the procedure, SPSS will add a new variable at the end of your dataset, and will print a table summarizing the computation in the Output window:

This table summarizes what the Rank Cases procedure did. It created a new variable named RMileMin, and assigned it the variable label "Rank of MileMinDur". It ranked the values in ascending order (i.e., the smallest value has rank 1), and it used the mean rank for values with ties.

We can inspect the new variable using the Descriptives procedure to get the sample size, minimum, maximum, mean, and standard deviation of the new variable:
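
In syntax, this could look like the following sketch. The statistics listed are the Descriptives defaults, written out explicitly here for clarity:

```spss
* Summary statistics for the new rank variable.
DESCRIPTIVES VARIABLES=RMileMin
  /STATISTICS=MEAN STDDEV MIN MAX.
```

As a quick sanity check: because mean ranks were used for ties, the ranks of 392 cases sum to 392 × 393 / 2, so the mean of the new variable should be 196.5.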

Notice that the new rank variable has the same number of valid cases as the original variable (392).
