Recoding Variables
Sometimes you will want to change the form of a variable so that you can work with it in different ways. For example, you may want to change a continuous variable into a categorical variable (e.g., change continuous income to categories of income). This section describes how to change the form of your variables.
There are two basic options for recoding variables:
1. Recode into different variables
2. Recode into the same variables
Each of these options changes the form (or values) of your selected variable. The only difference between these options is whether the changes are applied to the original variable, or whether a new variable is created in addition to leaving the original variable unchanged. In general, it is best to recode a variable into a different variable so that you never alter the original data and can easily access the original data if you need to make different changes later on.
Recoding into a different variable transforms an original variable into a new variable. That is, the changes are not applied to the original variable but are instead applied to a copy of the original variable under a new name. For example, we might recode the variable “Height” (continuous variable) into a new variable called “Height_categ” (a categorical variable). The original “Height” variable remains unchanged and our transformation is applied only to the new variable “Height_categ.”
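Outside the point-and-click interface, the same idea can be sketched in a few lines of Python. This is only an illustration, not what SPSS runs internally: the helper name `recode_into_different`, the sample heights, and the cut point of 65 are assumptions made for the example.

```python
def recode_into_different(rows, source, target, recode):
    """Return new rows with `target` added; `source` is left untouched."""
    return [{**row, target: recode(row[source])} for row in rows]

# Hypothetical data and cut point, for illustration only.
rows = [{"Height": 62.0}, {"Height": 70.5}]
recoded = recode_into_different(
    rows, "Height", "Height_categ", lambda h: 1 if h <= 65 else 2
)

print(recoded)  # each row now carries both Height and Height_categ
print(rows)     # the original rows are unchanged
```

Because the function builds new rows rather than editing the existing ones, the original “Height” values stay available for any later recode.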
To recode into different variables, click Transform > Recode into Different Variables.
The Recode into Different Variables window will appear.
A. The left column lists all of the variables in your dataset. Select the variable you wish to recode by clicking it. Click the arrow in the center to move the selected variable to the center text box, (B).
B. Input Variable -> Output Variable: The center text box lists the variable(s) you have selected to recode, as well as the name your new variable(s) will have after the recode. You will define the new name in (C).
C. Output Variable: Define the name and label for your recoded variable(s) by typing them in the text fields. Once you are finished, click Change. The center text box, (B), will now display both the name of the original variable and the name of the new variable (e.g., “Height --> Height_categ”).
D. Old and New Values: Click Old and New Values to specify how you wish to recode the values for the selected variable.
E. If: The If option allows you to specify the conditions under which your recode will be applied. (We discuss the If option in more detail later in this tutorial.)
Once you click Old and New Values, a new window will appear in which you specify how to transform the values.
1. Old Value: Specify the type of value you wish to recode (e.g., a specific value, missing data, or a range of values) and the specific value to be recoded (e.g., a value of “1” or a range of “1-5”).
2. New Value: Specify the new value for your variable (i.e., a specific value such as “2,” system-missing, or copy old values).
3. Old -> New: Once you have selected the old and new values for your selected variable in (1) and (2), click Add in area (3), Old-->New. The recode that you have specified now appears in the text field. If you need to change one of the recodes that you have added to the Old-->New area, click the one you wish to change and make changes in (1) and (2) as necessary.
You will need to repeat these steps for each value that you wish to recode. Once you have specified all the transformations that you wish to make for the selected variable, click the “Continue” button.
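The Old-->New list can be thought of as an ordered set of rules, each pairing a test on the old value with a new value. The Python sketch below illustrates that idea under stated assumptions: the rule values are hypothetical, the first matching rule is taken to win, and values matched by no rule are treated as missing (represented by `None`).

```python
COPY = object()  # sentinel meaning "copy the old value unchanged"

def apply_rules(value, rules):
    """Return the new value from the first rule whose test matches."""
    for matches, new_value in rules:
        if matches(value):
            return value if new_value is COPY else new_value
    return None  # assumption: values matched by no rule become missing

# Hypothetical rules, in the order they would be added to the list.
rules = [
    (lambda v: v is None, None),   # missing -> missing
    (lambda v: v == 1, 2),         # specific value: 1 -> 2
    (lambda v: 3 <= v <= 5, 9),    # range: 3 through 5 -> 9
    (lambda v: True, COPY),        # all other values -> copy old value
]

old_values = [1, 4, 7, None]
new_values = [apply_rules(v, rules) for v in old_values]
print(new_values)  # [2, 9, 7, None]
```

Placing the missing-value rule first keeps the later comparisons from ever being run against a missing value.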
The “If” option. Sometimes you may wish to recode values for a specific variable only when other conditions in your data are satisfied. This means that cases meeting the conditions will be recoded, and cases not meeting the conditions will be assigned a missing value. To specify such conditions, click If to bring up the Recode into Different Variables: If Cases window.
1. The left column displays all of the variables in your dataset. You will use one or more variables to define the conditions under which your recode should be applied to the data.
2. The default specification for a recode is to Include all cases. To restrict the recode to certain cases, click Include if case satisfies condition. You can then enter the condition that a case must meet for the recode to be applied.
3. The center of the window includes a collection of arithmetic operators, Boolean operators, and numeric characters, which you can use to specify the conditions under which your recode will be applied to the data. There are many kinds of conditions you can specify by selecting a variable (or multiple variables) from the left column, moving them to the center text field, and using the blue buttons to specify values (e.g., “1”) and operations (e.g., +, *, /). You can also use the options in the Function group list.
When you are finished defining the conditions under which your recode will be applied to the data, click Continue.
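As a rough illustration of how such a condition behaves, the Python sketch below combines a comparison on one variable with a comparison on another using a Boolean “and”. The condition itself, the target name `Height_flag`, and the sample cases are hypothetical. Cases that fail the condition receive a missing value (`None`), as described above.

```python
def satisfies_condition(case):
    """Hypothetical condition: female (Gender = 2) and at least 60 inches tall."""
    height = case["Height"]
    return case["Gender"] == 2 and height is not None and height >= 60

def recode_if(cases, source, target, recode, condition):
    """Recode `source` into `target` only where `condition` holds; else missing."""
    return [
        {**case, target: recode(case[source]) if condition(case) else None}
        for case in cases
    ]

cases = [
    {"Gender": 2, "Height": 64.0},
    {"Gender": 2, "Height": 58.0},
    {"Gender": 1, "Height": 70.0},
]
result = recode_if(cases, "Height", "Height_flag", lambda h: 1, satisfies_condition)
print([c["Height_flag"] for c in result])  # [1, None, None]
```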
When you are ready to run the procedure, click OK. Now your new variable will be recoded according to the criteria you specified. You can find your new variable in the last column in Data View or in the last row of Variable View.
Note: It is always good practice to check that your recode was successful and that no mistakes occurred. If you recode into a different variable, as we have done here, it is easy to check that the recode worked correctly. Click the Data View tab and select a few cases (rows) to check. In the current example, you can examine rows that have a value of “2” for the variable Gender (which represents females) and then compare the values of the old variable (Height) with those of the new variable you created (Height_categ). The values of the new variable should correspond to the changes you specified during the recode. If the values are wrong (or unexpectedly missing), you will need to re-check the options you selected during the recode. (In a future tutorial we will discuss better ways to check that a recode worked, such as running Frequencies for the old and new variables if they are categorical and comparing the output. For now, it is sufficient to visually check the data in Data View.)
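For larger datasets, a visual check can be supplemented by summarizing the old values within each new category. The Python sketch below is one possible way to do this and is not an SPSS feature: the helper `summarize_recode` and the sample rows are hypothetical.

```python
from collections import defaultdict

def summarize_recode(rows, old, new):
    """Group old values by new category and report (count, min, max)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[new]].append(row[old])
    return {cat: (len(vals), min(vals), max(vals)) for cat, vals in groups.items()}

# Hypothetical recoded data; None stands for a missing value.
rows = [
    {"Height": 60, "Height_categ": 1},
    {"Height": 65, "Height_categ": 1},
    {"Height": 66, "Height_categ": 2},
    {"Height": 72, "Height_categ": 2},
    {"Height": 70, "Height_categ": None},
]
summary = summarize_recode(rows, "Height", "Height_categ")
print(summary)  # {1: (2, 60, 65), 2: (2, 66, 72), None: (1, 70, 70)}
```

If the maximum of one category overlaps the minimum of the next, or the missing group is larger than expected, the recode rules deserve a second look.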
Recoding into the same variable works the same way as described above, except that any changes made will permanently alter the original variable. That is, the original values will be replaced by the recoded values.
In general, it is good practice not to recode into the same variable because it overwrites the original variable. If you ever needed to use the variable in its original form (or wanted to double-check your steps), that information would be lost.
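The loss of information is easy to see in a small Python sketch. The values and the cut point of 65 are hypothetical. Once the list is overwritten, the original heights can be recovered only from a copy made beforehand.

```python
heights = [62.0, 70.5, 65.0]  # hypothetical original values
backup = list(heights)        # copy taken before overwriting

# Overwrite each value in place, as recoding into the same variable does.
for i, h in enumerate(heights):
    heights[i] = 1 if h <= 65 else 2

print(heights)  # [1, 2, 1]  (the original measurements are gone)
print(backup)   # [62.0, 70.5, 65.0]
```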
One important use of the Recode procedure is dichotomizing a continuous variable. Dichotomizing a continuous variable transforms a scale variable into a binary categorical variable by splitting the values into two groups based on a cut point.
In the sample data file, students' heights and genders are recorded. Heights are measured in inches and range from 54 to 81 inches. Suppose we want to classify the females in our sample into one of two groups, based on whether they are at or below, or above, the average height for females (65 inches). That is, we want to create two groups of females: one group that represents females who are 65 inches or shorter, and another group that represents females who are taller than 65 inches. This can be done by recoding the Height variable and specifying that the recode should apply only to the females in our sample.
1. Click Transform > Recode into Different Variables.
2. Select Height from the list on the left and click the arrow (in the center) to move it to the center text box.
3. Enter the name and label for the new variable on the right. We will call our new variable Height_categ (with “categ” representing that our new variable will be categorical). We will label our new variable “height categories for females.” Click Change.
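4. Click Old and New Values. This will display the Recode into Different Variables: Old and New Values window. Specify one recode for heights of 65 inches or less and another for heights above 65 inches, clicking Add after each. Then click Continue.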
5. Now we will specify that the transformation we just defined should only be applied to values of height for the females in our data. Click If. The Recode into Different Variables: If Cases window will appear. Click Include if case satisfies condition, enter a condition that selects females (a value of “2” for Gender in this dataset), and click Continue.
6. Click OK to complete the transformation and apply the changes to the data.
7. Finally, let’s make sure that a new variable called Height_categ was successfully created.
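For readers who want to see the logic of this example in one place, the steps above can be sketched in Python. The cut point (65 inches) and the Gender code for females (2) come from the text. The category codes 1 and 2, the male code of 1, and the sample cases are assumptions made for illustration.

```python
CUT_POINT = 65  # average female height in inches, as given in the text
FEMALE = 2      # Gender code for females, as given in the text

def height_category(student):
    """Return 1 or 2 for females (codes assumed), None for everyone else."""
    if student["Gender"] != FEMALE or student["Height"] is None:
        return None
    return 1 if student["Height"] <= CUT_POINT else 2

# Hypothetical cases; the male code of 1 is also an assumption.
students = [
    {"Gender": 2, "Height": 63.0},
    {"Gender": 2, "Height": 65.0},
    {"Gender": 2, "Height": 68.5},
    {"Gender": 1, "Height": 71.0},
]

recoded = [{**s, "Height_categ": height_category(s)} for s in students]

# Step 7: confirm the new variable exists for every case.
assert all("Height_categ" in s for s in recoded)
print([s["Height_categ"] for s in recoded])  # [1, 1, 2, None]
```

A female who is exactly 65 inches tall falls in the first group, matching the “65 inches or shorter” definition, and the male case receives a missing value because it does not satisfy the condition.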