Data Creation in SPSS
When you open the SPSS program, you will see a blank spreadsheet in Data View. If you already have another dataset open but want to create a new one, click File > New > Data to open a blank spreadsheet.
You will notice that each of the columns is labeled “var.” The column names will represent the variables that you enter in your dataset. You will also notice that each row is labeled with a number (“1,” “2,” and so on). The rows will represent cases that will be a part of your dataset. When you enter values for your data in the spreadsheet cells, each value will correspond to a specific variable (column) and a specific case (row).
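The grid described above can be pictured as a small table in which every value is located by a case (row) and a variable (column). The following is only a rough sketch in Python, not SPSS itself; the variable name Age, the values, and the helper get_value are hypothetical and used purely for illustration.

```python
# A stand-in for the Data View grid. All values here are made up.
variables = ["School_Class", "Age"]   # columns = variables
cases = [                             # rows = cases
    [3, 15],
    [1, 13],
    [2, 14],
]

def get_value(case_index, variable_name):
    """Return the value at the given row (0-based) and named column."""
    column_index = variables.index(variable_name)
    return cases[case_index][column_index]

# The cell for the second case and the School_Class variable.
print(get_value(1, "School_Class"))
```

Each cell is reached by naming both a row and a column, which mirrors how each value you type into SPSS belongs to one case and one variable.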
Follow these steps to enter data:
Now that you know how to enter data, it is important to discuss a special type of variable called an ID variable. When data are collected, each piece of information is tied to a particular case. For example, perhaps you distributed a survey as part of your data collection, and each survey was labeled with a number (“1,” “2,” etc.). In this example, the survey numbers essentially represent ID numbers: numbers that help you identify which pieces of information go with which respondents in your sample. Without these ID numbers, you would have no way of tracking which information goes with which respondent, and it would be impossible to enter the data accurately into SPSS.
When you enter data into SPSS, you need to make sure that the values you enter for each variable correspond to the correct person or object in your sample. It might seem like a simple solution to use the conveniently labeled rows in SPSS as ID numbers: you could enter your first respondent’s information in the row that is already labeled “1,” the second respondent’s information in the row labeled “2,” etc. However, you should never rely on these pre-numbered rows for keeping track of the specific respondents in your sample. The row numbers are visual guides only; they are not attached to specific lines of data, and thus cannot be used to identify specific cases. If your data become rearranged (e.g., after sorting), the row numbers will no longer be associated with the same cases as when you first entered the data. Instead, you should create a variable in your dataset that identifies each case, for example, a variable called StudentID.
Here is an example that illustrates why using the row numbers in SPSS as case identifiers is flawed:
Let’s say that you have entered values for each person for the School_Class variable. You relied on the row numbers in SPSS to correspond to your survey ID numbers. Thus, for survey #1, you entered the first respondent’s information in row 1, for survey #2 you entered the second person’s information in row 2, and so on. Now you have entered all of your data.
But suppose the data get rearranged in the spreadsheet view. A common way of rearranging data is by sorting—and you may very well need to do this as you explore and analyze your data. Sorting will rearrange the rows of data so that the values appear in ascending or descending order. If you right-click on any variable name, you can select “Sort Ascending” or “Sort Descending.” In the example below, the data are sorted in ascending order on the values for the variable School_Class.
But what happens if you need to view a specific respondent’s information? Or perhaps you need to double-check your data entry by comparing the original survey to the values you entered in SPSS. Now that the data have been rearranged, there is no way to identify which row corresponds to which participant or survey number.
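The problem can be imitated outside SPSS with a short Python sketch. The School_Class values below are hypothetical, and the row labels are simulated by counting positions from 1, on the assumption that this matches how Data View relabels rows after a sort.

```python
# Hypothetical School_Class values entered in survey order:
# survey #1 went into row 1, survey #2 into row 2, and so on.
# No ID variable was stored with the data.
school_class = [3, 1, 4, 2, 1]

# Row label -> value as originally entered (rows numbered from 1).
before = {row: value for row, value in enumerate(school_class, start=1)}

# Sorting ascending rearranges the values; the row labels simply
# run 1..n again and do not travel with the data.
after = {row: value for row, value in enumerate(sorted(school_class), start=1)}

# Rows whose contents no longer match the survey with the same number.
mismatched_rows = [row for row in before if before[row] != after[row]]

print(before)
print(after)
print(mismatched_rows)
```

Once sorted, nothing in the data records which survey each value came from, so the original order cannot be recovered from the row labels alone.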
The main point is that you should not rely on the row numbers in SPSS, since they are merely visual guides and not part of your data. Instead, you should create a specific variable that will serve as an ID for each case so that you can always identify specific cases in your data, no matter how much you rearrange the data. In the sample data file, the variable StudentID acts as the ID variable.
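To see why a stored ID variable solves the problem, here is a minimal Python sketch, again only an analogy to SPSS. The StudentID and School_Class values are hypothetical, and find_case is an illustrative helper, not an SPSS feature.

```python
# Each case carries its own StudentID, so the ID travels with the data.
cases = [
    {"StudentID": 1, "School_Class": 3},
    {"StudentID": 2, "School_Class": 1},
    {"StudentID": 3, "School_Class": 4},
    {"StudentID": 4, "School_Class": 2},
]

def find_case(rows, student_id):
    """Return the case with the given StudentID, or None if it is absent."""
    for row in rows:
        if row["StudentID"] == student_id:
            return row
    return None

# Rearrange the rows by sorting ascending on School_Class.
sorted_cases = sorted(cases, key=lambda row: row["School_Class"])

# The third row position now holds a different student...
print(sorted_cases[2]["StudentID"])

# ...but student 3 can still be found by ID, wherever the row ended up.
print(find_case(sorted_cases, 3))
```

Because the ID is part of each case rather than a label on the grid, you can check any entry against its original survey after any amount of sorting.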