
SPSS Tutorials

This LibGuide contains written and illustrated tutorials for the statistical software SPSS.
Last Updated: May 12, 2015

Recoding Variables

Recoding (Transforming) Variables

Sometimes you will want to change the form of a variable so that you can work with it in different ways. For example, you may want to change a continuous variable into a categorical variable (e.g., change continuous income to categories of income). This section describes how to change the form of your variables.

There are two basic options for recoding variables:

1.    Recode into different variables
2.    Recode into the same variables

Each of these options changes the form (or values) of your selected variable. The only difference between these options is whether the changes are applied to the original variable, or whether a new variable is created in addition to leaving the original variable unchanged. In general, it is best to recode a variable into a different variable so that you never alter the original data and can easily access the original data if you need to make different changes later on.


Recode into Different Variables

Recoding into a different variable transforms an original variable into a new variable. That is, the changes are not applied to the original variable but are instead applied to a copy of the original variable under a new name. For example, we might recode the variable “Height” (continuous variable) into a new variable called “Height_categ” (a categorical variable). The original “Height” variable remains unchanged and our transformation is applied only to the new variable “Height_categ.”

To recode into different variables, click Transform > Recode into Different Variables.

The Recode into Different Variables window will appear.

A. The left column lists all of the variables in your dataset. Select the variable you wish to recode by clicking it. Click the arrow in the center to move the selected variable to the center text box, (B).

B. Input Variable -> Output Variable: The center text box lists the variable(s) you have selected to recode, as well as the name your new variable(s) will have after the recode. You will define the new name in (C).

C. Output Variable: Define the name and label for your recoded variable(s) by typing them in the text fields. Once you are finished, click Change. Now the center text box, (B), will display both the name of the original variable as well as the name for the new variable (e.g., “Height --> Height_categ”).

D. Old and New Values: Click Old and New Values to specify how you wish to recode the values for the selected variable.

E. If: The If option allows you to specify the conditions under which your recode will be applied. (We discuss the If option in more detail later in this tutorial.)

Once you click Old and New Values, a new window will appear where you specify how to transform the values.

1. Old Value: Specify the type of value you wish to recode (e.g., a specific value, missing data, or a range of values) and the specific value to be recoded (e.g., a value of “1” or a range of “1-5”).

2. New Value: Specify the new value for your variable (i.e., a specific value such as “2,” system-missing, or copy old values).

3. Old -> New: Once you have selected the old and new values for your selected variable in (1) and (2), click Add in area (3), Old-->New. The recode that you have specified now appears in the text field. If you need to change one of the recodes that you have added to the Old-->New area, click the one you wish to change, make changes in (1) and (2) as necessary, and then click Change.

You will need to repeat these steps for each value that you wish to recode. Once you have specified all the transformations that you wish to make for the selected variable, click the “Continue” button.
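The Old-->New list built in this window behaves like a small table of rules. The sketch below is not SPSS code; it is a minimal Python illustration of the idea, assuming that rules are checked in the order listed with the first match applied, and that a value matched by no rule ends up missing in the new variable. The function name recode, the rule values, and the use of None for system-missing are all hypothetical choices made for this illustration.

```python
# Hypothetical rule table mirroring an Old-->New list.
# Each rule is (test, new_value). None stands in for system-missing.

def recode(value, rules):
    """Return the new value given by the first rule that matches `value`."""
    for matches, new_value in rules:
        if matches(value):
            return new_value
    # Assumption: a value matched by no rule is missing in the new variable.
    return None

rules = [
    (lambda v: v is None, None),      # System-missing --> System-missing
    (lambda v: v == 1, 10),           # the single value 1 --> 10
    (lambda v: 2 <= v <= 5, 20),      # the range 2 through 5 --> 20
]

old_values = [1, 3, 5, 7, None]
new_values = [recode(v, rules) for v in old_values]
print(new_values)  # [10, 20, 20, None, None]
```

Listing the missing-value rule first keeps the later numeric comparisons from ever seeing a missing value.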

The “If” option. Sometimes you may wish to recode values for a specific variable only when other conditions in your data are satisfied. This means that cases meeting the conditions will be recoded, and cases not meeting the conditions will be assigned a missing value. To specify such conditions, click If to bring up the Recode into Different Variables: If Cases window.

1. The left column displays all of the variables in your dataset. You will use one or more variables to define the conditions under which your recode should be applied to the data.

2. The default specification for a recode is to Include all cases. To restrict the recode to certain cases, click Include if case satisfies condition. You can then enter the condition in the text field.

3. The center of the window includes a collection of arithmetic operators, Boolean operators, and numeric characters, which you can use to specify the conditions under which your recode will be applied to the data. There are many kinds of conditions you can specify by selecting a variable (or multiple variables) from the left column, moving them to the center text field, and using the blue buttons to specify values (e.g., “1”) and operations (e.g., +, *, /). You can also use the options in the Function group list.

When you are finished defining the conditions under which your recode will be applied to the data, click Continue.
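The effect of a condition can be sketched outside SPSS as follows. This is a minimal Python illustration, not the SPSS implementation: it assumes, as described above, that cases failing the condition receive a missing value (None here) in the new variable. The helper conditional_recode, the target name Tall_female, and the sample rows are hypothetical.

```python
def conditional_recode(cases, condition, recode_value, source, target):
    """Return copies of `cases` with `target` added; recode only where `condition` holds."""
    result = []
    for case in cases:
        new_case = dict(case)  # copy so the input rows are left untouched
        if condition(case):
            new_case[target] = recode_value(case[source])
        else:
            new_case[target] = None  # None stands in for system-missing
        result.append(new_case)
    return result

# Hypothetical sample rows (1 = male, 2 = female).
cases = [
    {"Gender": 1, "Height": 70},
    {"Gender": 2, "Height": 62},
    {"Gender": 2, "Height": 68},
]

# A condition built from a comparison and a Boolean operator,
# similar in spirit to typing "Gender = 2 & Height >= 65".
recoded = conditional_recode(
    cases,
    condition=lambda c: c["Gender"] == 2 and c["Height"] >= 65,
    recode_value=lambda height: 1,
    source="Height",
    target="Tall_female",
)
print([c["Tall_female"] for c in recoded])  # [None, None, 1]
```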

When you are ready to run the procedure, click OK. Now your new variable will be recoded according to the criteria you specified. You can find your new variable in the last column in Data View or in the last row of Variable View.

Note: It is always good practice to check that your recode was successful and that no mistakes occurred. If you recode into a different variable, as we have done here, it is easy to check that the recode worked correctly. Simply click the Data View tab and select a few cases (rows) to check. In the current example, you can examine rows that have a value of “2” for the variable Gender (which represents females) and then compare the values of the old variable (Height) and the new variable you created (Height_categ). The values of the new variable should correspond to the changes you specified during the recode. If the values are wrong (or missing), you will need to re-check the options you selected during the recode. (In a future tutorial we will discuss better ways to check that a recode worked, such as running Frequencies for old and new variables if they are categorical and comparing the output. For now, it is sufficient to visually check the data in Data View.)
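A visual check can be backed up by summarizing which old values landed in each new category. The sketch below is a Python illustration of that idea, not the Frequencies procedure: for each new category it reports the smallest and largest old value assigned to it, so overlapping or unexpected ranges are easy to spot. The function name ranges_by_category and the sample values are hypothetical.

```python
def ranges_by_category(old_values, new_values):
    """Map each new category to the (min, max) of the old values it received."""
    ranges = {}
    for old, new in zip(old_values, new_values):
        if old is None:
            continue  # skip cases with no original value
        low, high = ranges.get(new, (old, old))
        ranges[new] = (min(low, old), max(high, old))
    return ranges

# Hypothetical old and new values for five cases.
height = [60, 65, 66, 72, 63]
height_categ = [1, 1, 2, 2, 1]

print(ranges_by_category(height, height_categ))  # {1: (60, 65), 2: (66, 72)}
```

If the ranges for two categories overlap, or a category's range crosses the intended cut point, the recode rules should be re-checked.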


Recode into Same Variable

Recoding into the same variable works the same way as described above, except that any changes made will permanently alter the original variable. That is, the original values will be replaced by the recoded values.

In general, it is good practice not to recode into the same variable because it overwrites the original variable. If you ever needed to use the variable in its original form (or wanted to double-check your steps), that information would be lost.
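The difference between the two options can be illustrated with a short Python sketch (not SPSS code). The function dichotomize, the cut point, and the sample heights are hypothetical and serve only to show why overwriting loses information.

```python
def dichotomize(height, cut=65):
    """Return 1 for heights at or below `cut`, 2 for heights above it."""
    return 1 if height <= cut else 2

heights = [60, 65, 66, 72]

# Recode into a different variable: the original list is left unchanged.
height_categ = [dichotomize(h) for h in heights]
print(heights)       # [60, 65, 66, 72]
print(height_categ)  # [1, 1, 2, 2]

# Recode into the same variable: each original value is replaced.
overwritten = list(heights)
for i, h in enumerate(overwritten):
    overwritten[i] = dichotomize(h)
print(overwritten)   # [1, 1, 2, 2]
```

After overwriting, the heights 60 and 65 are both stored as 1, so the original measurements cannot be recovered from the recoded values alone.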


Example: Discretizing a Continuous Variable

One important use of the Recode procedure is dichotomizing a continuous variable. Dichotomizing a continuous variable transforms a scale variable into a binary categorical variable by splitting the values into two groups based on a cut point.

In the sample data file, students' heights and genders are recorded. Heights are measured in inches and range from 54 to 81 inches. Suppose we want to classify the females in our sample into one of two groups, based on whether they are at or below, or above, the average height for females (65 inches). That is, we want to create two groups of females: one group that represents females who are 65 inches or shorter, and another group that represents females who are taller than 65 inches. This can be done by recoding the Height variable and specifying that the recode should apply only to females in our sample.

1. Click Transform > Recode into Different Variables.

2. Select Height from the list on the left and click the arrow (in the center) to move it to the center text box.

3. Enter the name and label for the new variable on the right. We will call our new variable Height_categ (with “categ” representing that our new variable will be categorical). We will label our new variable “height categories for females.” Click Change.

4. Click Old and New Values. This will display the Recode into Different Variables: Old and New Values window.

  • To create new categories for height, we will need to define the old and new values. Since this recode will change the continuous Height variable to a categorical variable called Height_categ, we will specify a range of values on the left side of the dialog box and tell SPSS what new value to apply to the selected range on the right side of the dialog box.
  • Handle missing values: First, we want to explicitly state how missing height values should be handled. In the Old Value area, click System-missing. In the New Value area, click System-missing. Then click Add. This will add “SYSMIS-->SYSMIS” to the Old-->New text field.
  • Create group 1: Now we will define the “65 inches and under” group. In the Old Value area, click Range, LOWEST through value and enter “65” in the text field. This tells SPSS that we want to select a range from the lowest value in our data to a maximum value of 65. In the New Value area, click Value and enter a “1” in the text field. This tells SPSS that the range we specified for our old value will be given a new value of “1” in the recoded variable. Click Add to complete this part of the transformation. Now the Old-->New text field will display the following: “Lowest thru 65-->1.”
  • Create group 2: Now we will define the “taller than 65 inches” group. In the Old Value area, click All other values. This tells SPSS that any value that does not fall into the “missing” group or the “65 inches and under” group should go into group 2. In the New Value area, click Value and enter a “2” in the text field. This tells SPSS that all remaining old values will be given a new value of “2” in the recoded variable. Click Add to complete this second part of the transformation. Now the Old-->New text field will display the following: “ELSE-->2.”
  • Click the Continue button.

5. Now we will specify that the transformation we just defined should only be applied to values of height for the females in our data. Click If. The Recode into Different Variables: If Cases window will appear. Click Include if case satisfies condition.

  • Select the variable Gender in the list of variables in the left column. Click the arrow button to move this variable to the text field. Since we want to select only females, we will click the “=” and “2” buttons. (In our data, 1 = male and 2 = female.) Now the text field displays “Gender = 2,” which tells SPSS that we are selecting only females. Click Continue.

6. Click OK to complete the transformation and apply the changes to the data.

7. Finally, let’s make sure that a new variable called Height_categ was successfully created.

  • We can find the new variable in the last column in Data View or in the last row of Variable View. If you do not see the new variable, the recode was unsuccessful.
  • It is also useful to explore whether the changes you specified were applied correctly to the data. You can spot-check the transformation by viewing your data in the Data View tab. For any case (row of data) that is female (has a value of “2” in the column for the Gender variable), a value of “1” (65 inches or shorter) or “2” (taller than 65 inches) should appear in the new Height_categ variable. Any female who is missing data for the original Height variable, and any male (“1”) in the data, should have a missing value for the new Height_categ variable.
  • Another way to spot-check the data is to sort the data based on Height_categ and then on Height, then run a Case Summary (Analyze > Reports > Case Summaries).
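
The logic of the whole example, including the sorted spot-check, can be sketched in Python. This is an illustration under stated assumptions rather than what SPSS runs internally: the sample rows are made up, None stands in for a missing value, and the sort places missing values first (an assumption chosen for this sketch). The names height_categ and sort_key are hypothetical helpers.

```python
FEMALE = 2  # coding from the tutorial: 1 = male, 2 = female

def height_categ(case, cut=65):
    """Return 1 or 2 for females with a recorded height, otherwise None."""
    if case["Gender"] != FEMALE:
        return None  # the condition "Gender = 2" is not met
    height = case["Height"]
    if height is None:
        return None  # missing stays missing
    return 1 if height <= cut else 2  # Lowest thru 65 --> 1, ELSE --> 2

# Hypothetical sample rows, not taken from the tutorial's data file.
cases = [
    {"Gender": 2, "Height": 68},
    {"Gender": 1, "Height": 70},
    {"Gender": 2, "Height": 62},
    {"Gender": 2, "Height": None},
    {"Gender": 2, "Height": 65},
]
for case in cases:
    case["Height_categ"] = height_categ(case)

def sort_key(case):
    """Sort by Height_categ, then Height, with missing values first (assumed)."""
    categ, height = case["Height_categ"], case["Height"]
    return (categ is not None, categ or 0, height is not None, height or 0)

ordered = sorted(cases, key=sort_key)
for case in ordered:
    print(case["Gender"], case["Height"], case["Height_categ"])
```

In the sorted listing, every row in category 1 should show a height of 65 or less, every row in category 2 a height above 65, and the rows with a missing category should be males or females without a recorded height.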
