
The "Compute Variable" command allows you to create new variables from existing variables by applying formulas. This tutorial shows how the "Compute Variable" command can compute a variable using an equation, a built-in function, or conditional logic.

Sometimes you may need to compute a new variable based on existing information (from other variables) in your data. For example, you may want to convert the units of a variable from feet to meters, or use a subject's height and weight to compute their BMI. You may also want to apply a computation conditionally, so that a new variable is only computed for cases where certain conditions are met.
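As a rough sketch of what such computations look like in SPSS syntax, the two examples mentioned above might be written as follows. The variable names `HeightFt`, `HeightM`, `WeightKg`, and `BMI` are hypothetical and are not part of the sample dataset; the sketch assumes height is recorded in feet and weight in kilograms.

```
* Convert height from feet to meters (1 foot = 0.3048 meters).
COMPUTE HeightM = HeightFt * 0.3048.
* BMI is weight in kilograms divided by the square of height in meters.
COMPUTE BMI = WeightKg / (HeightM ** 2).
EXECUTE.
```

The same expressions could be typed into the dialog described in the rest of this tutorial instead of being run as syntax.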

To compute a new variable, click **Transform > Compute Variable**.

The Compute Variable window will open where you will specify how to calculate your new variable.

A **Target Variable:** The name of the new variable that will be created during the computation. Simply type a name for the new variable in the text field. Once a variable is entered here, you can click on “Type & Label” to assign a variable type and give it a label. The default type for new variables is numeric.

B The left column lists all of the variables in your dataset. You can use this list to add variables to a computation: either double-click a variable to add it to the **Numeric Expression** field, or select the variable(s) that will be used in your computation and click the arrow to move them to the **Numeric Expression** field (C).

C **Numeric Expression:** Specify how to compute the new variable by writing a numeric expression.

D The center of the window includes a collection of arithmetic operators, Boolean operators, and numeric characters, which you can use to specify how your new variable will be calculated. There are many kinds of calculations you can specify by selecting a variable (or multiple variables) from the left column, moving them to the center text field, and using the blue buttons to specify values (e.g., “1”) and operations (e.g., +, *, /).

E **If:** The **If** option allows you to specify the conditions under which your computation will be applied.

F **Function group**: You can also use the built-in functions in the **Function group** list on the right-hand side of the window. The function group contains many useful, common functions that may be used for calculating values for new variables (e.g., mean, logarithm). To find a specific function, simply click one of the function groups in the **Function Group** list. You will now see a list of functions that belong to that function group in the **Functions and Special Variables** area. If you click on a specific function, a description of that function will appear in the text field to the left.
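To give a sense of how these built-in functions read once they are placed in the Numeric Expression field, here is a minimal sketch using two functions from the Arithmetic group. The target variable names `MathLog` and `MathRounded` are made up for illustration; `Math` is one of the test score variables used later in this tutorial.

```
* Natural logarithm of the Math score; zero or negative scores yield system-missing.
COMPUTE MathLog = LN(Math).
* Math score rounded to the nearest integer.
COMPUTE MathRounded = RND(Math).
EXECUTE.
```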

Click **If** (indicated by letter E in the above image) to open the Compute Variable: If Cases window.

1 The left column displays all of the variables in your dataset. You will use one or more variables to define the conditions under which your computation should be applied to the data.

2 The default specification is to **Include all cases**. To restrict the computation to certain cases, click **Include if case satisfies condition**, which lets you enter the condition that a case must meet.

3 The center of the dialog box includes a collection of arithmetic operators, Boolean operators, and numeric characters, which you can use to specify the conditions under which your computation will be applied to the data. There are many kinds of conditions you can specify by selecting a variable (or multiple variables) from the left column, moving them to the center text field, and using the blue buttons to specify values (e.g., “1”) and operations (e.g., <, >=, &). You can also use the built-in functions in the **Function group** list in the right column.

After you are finished defining the conditions under which your computation will be applied to the data, click **Continue**. Note that when you specify a condition in the *Compute Variable: If Cases* window, the computation will only be performed on the cases meeting the specified condition. If a case does not meet that condition, it will be assigned a missing value for the new variable.
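In syntax form, a conditional computation is typically written with the `IF` command rather than `COMPUTE`. The sketch below assumes the test score variables used later in this tutorial; the target name `AverageHighEnglish` and the cutoff of 80 are arbitrary choices for illustration.

```
* Compute the average only for cases whose English score is at least 80.
* Cases that do not satisfy the condition stay system-missing on the new variable.
IF (English >= 80) AverageHighEnglish = (English + Reading + Math + Writing) / 4.
EXECUTE.
```

A case whose `English` score is itself missing does not satisfy the condition, so it is also left missing on the new variable.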

Now we will use what we have learned throughout this tutorial to demonstrate how to compute a new variable. In this example, we wish to compute a new variable called `AverageScore` that is the average of four test scores—variables `English`, `Reading`, `Math`, and `Writing`.

- Click **Transform > Compute Variable**.
- In the **Target Variable** field, type a name for the new variable that will be computed. Let's call our new variable `AverageScore`.
- Select each variable (`English`, `Reading`, `Math`, and `Writing`) in the list on the left and click the arrow to move it to the **Numeric Expression** field. (Alternatively, you can double-click a variable name to move it to the **Numeric Expression** field.) Press the spacebar after each variable so that the names are separated by spaces.
- Your four variables will now appear in the **Numeric Expression** field. Place your cursor between each pair of variables and click the “+” button to insert an addition sign. Your expression should now read `English + Reading + Math + Writing`.
- Insert parentheses around the expression so that it reads `(English + Reading + Math + Writing)`.
- At the end of the expression, add the “/” sign and the number 4. Your expression should now read `(English + Reading + Math + Writing) / 4`. This final expression indicates that the new variable, `AverageScore`, will be calculated as the average of the four test scores.
- Click **OK** to complete the computation and apply the changes to the data.
- Finally, let's make sure that a new variable called `AverageScore` was successfully created.
  - We can find the new variable in the last column in Data View or in the last row of Variable View. If you do not see the new variable, the computation was unsuccessful.
  - We can check the syntax that was executed by looking at the log in the Output Viewer window. After running Compute Variable, the syntax that should have appeared in the output window is `COMPUTE AverageScore=(English + Reading + Math + Writing) / 4.` followed by `EXECUTE.` If there was an error in how the computation was specified, the log in the Output Viewer will often show an error message.
  - It is also useful to check whether the computation was applied correctly to the data. In the Data View tab, manually calculate the average for a few cases and compare each result with the value of the new variable.
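One way to do this spot-check without scrolling through Data View is to print the first few cases to the Output Viewer. This is only a sketch: it assumes the variable names from this example, and the choice of five cases is arbitrary.

```
* Print the four scores and the computed average for the first five cases.
LIST VARIABLES=English Reading Math Writing AverageScore
  /CASES=FROM 1 TO 5.
```

Each printed row can then be checked by hand against the formula.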

Let's instead try computing the average test score using the built-in mean function.

- Click **Transform > Compute Variable**.
- In the **Target Variable** field, type a name for the new variable that will be computed; let's call the new variable `AverageScore2`.
- In the **Function group** list, click **All**.
- In the **Functions and Special Variables** list, scroll down until you find “Mean”, then click on it. A description of this function will appear in the text box to the left.
- Double-click “Mean” in the **Functions and Special Variables** list. When you do this, the syntax `MEAN(?,?)` should appear in the **Numeric Expression** field.
- Now add each of the variables (i.e., `English`, `Reading`, `Math`, `Writing`) to the numeric expression by double-clicking its name in the list on the left. The variable names should replace the question marks, be separated by commas, and remain inside the parentheses.
- Your final numeric expression should read `MEAN(English,Reading,Math,Writing)`. This expression indicates that the new variable, `AverageScore2`, will be calculated as the average of the four test scores.
- Click **OK** to complete the computation and apply the changes to the data.
- Finally, let's make sure that a new variable called `AverageScore2` was successfully created.
  - We can find the new variable in the last column in Data View or in the last row of Variable View. If you do not see the new variable in Variable View, the computation was unsuccessful. Additionally, if you see the new column in Data View but every row has a missing value, there was an issue with your computation.
  - We can check the syntax that was executed by looking at the log in the Output Viewer window. After running Compute Variable, the syntax that should have appeared in the output window is `COMPUTE AverageScore2=MEAN(English,Reading,Math,Writing).` followed by `EXECUTE.` If there was an error in how the computation was specified, the log in the Output Viewer will often show an error message.
  - It is also useful to check whether the computation was applied correctly to the data. In the Data View tab, manually calculate the average for a few cases and compare each result with the value of the new variable.

Notice that in the sample dataset, the test score variables are all next to each other. In the previous example, we explicitly specified all four test score variables in the `MEAN` function. But what if there had been ten or twenty test score variables? It would take much longer to enter all of the variable names manually.

What if we wanted to refer to the entire range of test score variables, beginning with `English` and ending with `Writing`, without having to type out each variable's name?

When using SPSS's built-in functions, you can refer to a range of variables by using the keyword `TO`. Let's repeat the previous example and show how `TO` is used to refer to a range of variables inside a function.

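**WARNING:** This method depends on the positions of the variables in the dataset. `TO` includes every variable located between the two named variables in file order, so if the test score variables are not adjacent, or if other variables sit between them, the result will not be what you intended.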
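Before relying on `TO`, it can help to confirm the order of the variables in the file. A minimal way to do that, assuming the sample dataset is the active dataset, is to print the data dictionary, which lists the variables in file order by default:

```
* List the variables in file order, along with their positions.
DISPLAY DICTIONARY.
```

Every variable that appears between `English` and `Writing` in that listing would be picked up by `English TO Writing`.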

- Click **Transform > Compute Variable**.
- In the **Target Variable** field, type a name for the new variable that will be computed; let's call the new variable `AverageScore3`.
- In the **Function group** list, click **All**.
- In the **Functions and Special Variables** list, scroll down until you find “Mean”, then click on it.
- Double-click “Mean” in the **Functions and Special Variables** list. The basic setup for using this function will now appear in the **Numeric Expression** field.
- Inside the `MEAN` function, change the arguments to `English TO Writing`. Your final numeric expression should read `MEAN(English TO Writing)`. This expression indicates that the new variable, `AverageScore3`, will be calculated as the average of all the variables from `English` through `Writing` in the dataset.
- Click **OK** to complete the computation.
- Finally, let's make sure that a new variable called `AverageScore3` was successfully created.
  - We can find the new variable in the last column in Data View or in the last row of Variable View. If you do not see the new variable, the computation was unsuccessful.
  - We can check the syntax that was executed by looking at the log in the Output Viewer window. After running Compute Variable, the syntax that should have appeared in the output window is `COMPUTE AverageScore3=MEAN(English TO Writing).` followed by `EXECUTE.` If there was an error in how the computation was specified, the log in the Output Viewer will often show an error message.
  - It is also useful to check whether the computation was applied correctly to the data. In the Data View tab, manually calculate the average for a few cases and compare each result with the value of the new variable.

If you've already verified the computation for `AverageScore` or `AverageScore2`, then you should be able to verify that `AverageScore`, `AverageScore2`, and `AverageScore3` are equal for every case that has valid values on all four test scores.
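A quick way to check this across the whole file, rather than case by case, is to compute the differences between the three versions and look at their range. The names `Diff12` and `Diff23` are hypothetical helper variables introduced only for this check.

```
* Differences between the three versions of the average.
COMPUTE Diff12 = AverageScore - AverageScore2.
COMPUTE Diff23 = AverageScore2 - AverageScore3.
* If the versions agree, the minimum and maximum of each difference are 0.
DESCRIPTIVES VARIABLES=Diff12 Diff23
  /STATISTICS=MIN MAX.
```

Cases where one of the averages is missing produce a missing difference and are left out of the summary, so this check only covers cases where both versions were computed.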

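In the previous examples, we did not talk about what happens when one or more of the variables has a missing value for a given case. The answer depends on how the computation is written. With an arithmetic expression such as `(English + Reading + Math + Writing) / 4`, a missing value in any input variable produces a missing value for the new variable; there must be a valid value for every input variable in order for the computation to work. This is called *listwise exclusion*. By contrast, statistical functions such as `MEAN` use whatever valid values are available, so by default `MEAN` returns a result as long as at least one of its arguments is nonmissing.

Listwise exclusion can end up throwing out a lot of data, especially if you are computing a subscale from many variables. The default behavior of `MEAN` has the opposite drawback: an average based on a single valid score may not be meaningful.

In SPSS, you can control this for statistical functions that take a list of variables as arguments (such as `MEAN`, `SUM`, and `SD`) using the `.n` suffix, where *n* is an integer indicating the minimum number of nonmissing values a given case must have. As long as a case has at least *n* valid values, the computation will be carried out using just the valid values; otherwise, the result is missing.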

In the previous example, we used the built-in `MEAN()` function to compute the average of the four test scores. If we change the syntax to:

```
COMPUTE AverageScore=MEAN.3(English TO Writing).
EXECUTE.
```

Then any case with three or more nonmissing values will have a successful, nonmissing value for `AverageScore`. (Stated another way, a given case could have at most one missing test score and still be OK.)

Alternatively, using the syntax

```
COMPUTE AverageScore=MEAN.2(English TO Writing).
EXECUTE.
```

would require that two or more of the test score variables have valid values (i.e., a given case could have at most two missing test scores).
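To see how many cases a given threshold would affect, it can be useful to count the valid scores per case first. The sketch below uses the `NVALID` function; the variable name `NumValidScores` is made up for illustration. As a worked case, a record with scores of 80, 90, and 70 and one missing score has three valid values, so `MEAN.3` returns 240 / 3 = 80, while `MEAN.4` would return a missing value.

```
* Count how many of the test scores are nonmissing for each case.
COMPUTE NumValidScores = NVALID(English TO Writing).
* Tabulate the counts to see how many cases fall below a given threshold.
FREQUENCIES VARIABLES=NumValidScores.
```

Like `MEAN(English TO Writing)`, this relies on the test score variables being adjacent in the file.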