This is the "Frequencies Part II (Categorical Data)" page of the "SPSS Tutorials" guide.

SPSS Tutorials
Tags: spss, statistics, tutorials

This LibGuide contains written and illustrated tutorials for the statistical software SPSS.
Last Updated: May 12, 2015

Frequencies Part II (Categorical Data)

Frequencies for Categorical Data

The Frequencies procedure can produce summary measures for categorical variables in the form of frequency tables, bar charts, or pie charts.

To run the Frequencies procedure, click Analyze > Descriptive Statistics > Frequencies.

The Frequencies window will appear.

By default, the Display frequency tables check box is selected. If this check box is not selected, no frequency tables will be produced, and the only output will come from the supplementary options from Statistics or Charts. For categorical variables, you will want to leave this box checked.

All variables in the dataset are listed in the column on the left. To select a variable for analysis, double-click on its name to move it to the right-hand column. You can move multiple variables to the right-hand column to create several frequency tables at once.


Using the sample dataset, let's create a frequency table for class rank (variable School_Class). Open the Frequencies window (Analyze > Descriptive Statistics > Frequencies), double-click on variable School_Class, and click OK. The following results should appear in the Output Viewer window.

The first table, Statistics, reports how many missing and nonmissing observations were in the dataset. Here, only one observation did not specify the student's class rank.

The second table contains the frequencies for the selected variable. The table title is determined by the variable's label (or the variable name, if a label is not assigned); here, the table is entitled Class Rank.

This table contains four columns of summary measures:

  • The Frequency column indicates how many observations fell into the given category.
  • The Percent column indicates the percentage of observations in that category out of all observations (both missing and nonmissing).
  • The Valid Percent column displays the percentage of observations in that category out of the total number of nonmissing responses.
  • The Cumulative Percent column is the total percentage of the sample that has been accounted for up to and including that row; it can be computed by adding the Valid Percent value for the current row to all of the Valid Percent values above it.

Notice how the categories are grouped into "Valid" and "Missing" sections. This grouping allows for easy comparison of missing versus nonmissing observations. Note that "System" missing responses are observations that use SPSS's default method, a period (.), for indicating missing values. If a user has assigned special codes for missing values in the Variable View window, those codes would appear here.
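
The arithmetic behind the four columns can be sketched in a few lines of Python. This is only an illustration of the definitions above, not how SPSS itself computes the table; the `responses` list is made-up data, `None` stands in for a system-missing value, and the function name `frequency_table` is hypothetical.

```python
from collections import Counter

# Made-up class-rank responses; None stands in for a system-missing value.
responses = ["Freshman", "Sophomore", "Freshman", "Junior", None,
             "Senior", "Sophomore", "Freshman", "Junior", "Senior"]


def frequency_table(values, order):
    """Return (rows, n_missing); each row is
    (category, frequency, percent, valid percent, cumulative percent)."""
    counts = Counter(v for v in values if v is not None)
    n_total = len(values)            # missing and nonmissing observations
    n_valid = sum(counts.values())   # nonmissing observations only
    rows = []
    cumulative = 0.0
    for category in order:
        freq = counts[category]
        percent = 100.0 * freq / n_total        # out of all observations
        valid_percent = 100.0 * freq / n_valid  # out of nonmissing observations
        cumulative += valid_percent             # running total of Valid Percent
        rows.append((category, freq, percent, valid_percent, cumulative))
    return rows, n_total - n_valid


order = ["Freshman", "Sophomore", "Junior", "Senior"]
table, n_missing = frequency_table(responses, order)
for category, freq, pct, valid_pct, cum_pct in table:
    print(f"{category:<10} {freq:>3} {pct:6.1f} {valid_pct:6.1f} {cum_pct:6.1f}")
print(f"{'Missing':<10} {n_missing:>3} {100.0 * n_missing / len(responses):6.1f}")
```

Because Percent divides by all observations while Valid Percent divides by the nonmissing ones only, the two columns differ whenever at least one value is missing, and the last Cumulative Percent value always reaches 100.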


Frequencies: Charts

Clicking Charts in the Frequencies window will bring up the Frequencies: Charts prompt.

For categorical variables, bar charts and pie charts are appropriate. Histograms should only be used for continuous variables; they should not be used for ordinal variables.

A Bar chart displays the categories on the graph's x-axis, and either the frequencies or the percentages on the y-axis. In general, bar charts are the preferred graph type for displaying categorical variables, because the format allows for easy comparison of the relative sizes of the categories. Note that the options in the Chart Values area apply only to bar charts. In particular, these options affect whether the y-axis of the bar chart displays counts or percentages. This setting will not affect pie charts, and will be greyed out if the radio button next to Histograms is selected.
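
To see that the Chart Values choice changes only the scale of the y-axis and not the relative sizes of the bars, here is a rough Python sketch. It is not SPSS output: the `ranks` data are made up, the helper names `bar_heights` and `text_bar_chart` are hypothetical, and percentages are taken out of the nonmissing responses as an assumption for illustration.

```python
from collections import Counter

# Made-up class-rank responses with missing values already excluded.
ranks = ["Freshman", "Sophomore", "Freshman", "Junior", "Senior",
         "Sophomore", "Freshman", "Junior", "Senior"]


def bar_heights(values, as_percent=False):
    """Return {category: bar height}, as counts or as percentages of all values."""
    counts = Counter(values)
    if not as_percent:
        return dict(counts)
    total = len(values)
    return {category: 100.0 * n / total for category, n in counts.items()}


def text_bar_chart(heights, scale=1.0):
    """Render one row of '#' marks per category (a crude stand-in for a bar chart)."""
    lines = []
    for category, height in heights.items():
        lines.append(f"{category:<10} {'#' * round(height * scale)} {height:.1f}")
    return "\n".join(lines)


print(text_bar_chart(bar_heights(ranks)))                             # counts
print(text_bar_chart(bar_heights(ranks, as_percent=True), scale=0.1)) # percentages
```

Both charts rank the categories identically; only the numbers on the axis differ.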

Alternatively, a Pie chart depicts the categories of a variable as "slices" of a circular "pie". Pie charts should typically be reserved for variables where each component can be thought of as a proportion of a whole. For example, a pie chart would be appropriate for depicting the proportion of time devoted to different activities during a normal work week. Here, a pie chart of class rank is given for comparison with the bar chart above.


Frequencies: Format

Clicking Format in the Frequencies window brings up the Frequencies: Format prompt.

The Order by options affect only categorical variables:

  • Ascending or Descending values will arrange the rows of the frequency table with respect to the category names. (Note: if your categorical variable is coded numerically as 0, 1, 2, ..., sorting by value will arrange the rows with respect to the numeric code.)
  • Sorting by Ascending or Descending counts will order the rows of the frequency table from least frequent to most frequent (if ascending), or most frequent to least frequent (if descending).
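
The difference between ordering by values and ordering by counts can be sketched in Python as follows. The numeric codes and their meanings are made up for illustration, and the way ties are ordered here is a property of this sketch, not a statement about SPSS.

```python
from collections import Counter

# Made-up numeric codes for class rank: 1=Freshman, 2=Sophomore, 3=Junior, 4=Senior.
codes = [1, 2, 1, 3, 4, 2, 1, 3, 4, 4, 4]
counts = Counter(codes)  # maps each code to its frequency

# Order rows by the category value (the numeric code).
by_value_ascending = sorted(counts.items())
by_value_descending = sorted(counts.items(), reverse=True)

# Order rows by how often each category occurs.
by_count_ascending = sorted(counts.items(), key=lambda item: item[1])
by_count_descending = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for name, rows in [("ascending values", by_value_ascending),
                   ("descending values", by_value_descending),
                   ("ascending counts", by_count_ascending),
                   ("descending counts", by_count_descending)]:
    print(name, rows)
```

Sorting by counts is often the more readable choice for nominal variables, since the code order carries no meaning, while sorting by values preserves the natural order of an ordinal variable.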

When working with two or more categorical variables, the Multiple Variables options affect only the order of the output. If Compare variables is selected, then the frequency tables for all of the variables will appear first, and all of the graphs for the variables will appear after. If Organize output by variables is selected, then the frequency table and graph for the first variable will appear together; then the frequency table and graph for the second variable will appear together; etc.


Frequencies: Statistics with Categorical Variables

The vast majority of the descriptive statistics available in the Frequencies: Statistics window are not appropriate for nominal or ordinal categorical variables in most situations. There are two exceptions to this.

  • The Mode (which is the most frequent response) has a clear interpretation when applied to most categorical variables.
  • The Values are group midpoints option can be applied to certain ordinal variables that have been coded in such a way that their value takes on the midpoint of a range. For example, this would be the case if you had measured subjects' ages and had coded anyone between the ages of 20 and 29 as 25, or between the ages of 30 and 39 as 35 (source: IBM SPSS Statistics Information Center).
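
As a small Python sketch of the two ideas above, the snippet below codes made-up ages as the midpoint of their ten-year band (following the 20 to 29 becomes 25 example) and then finds the mode of the coded values. The helper names `decade_midpoint` and `modes` are hypothetical, and the sketch does not reproduce what SPSS does internally when the group midpoints option is checked.

```python
from collections import Counter


def decade_midpoint(age):
    """Code an age as the midpoint of its ten-year band, e.g. 20-29 -> 25."""
    return (age // 10) * 10 + 5


def modes(values):
    """Return every value tied for the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, n in counts.items() if n == top]


ages = [21, 29, 34, 38, 30, 45]                 # made-up raw ages
coded = [decade_midpoint(age) for age in ages]  # ages recoded to band midpoints
print(coded)
print(modes(coded))
```

Returning a list from `modes` is a deliberate choice here: a categorical variable can have more than one most frequent category, and reporting only one of them would hide the tie.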

Warning: If your categorical variables are coded numerically, it is very easy to misuse measures like the mean and standard deviation. SPSS will compute those statistics if they are requested, regardless of whether or not they are meaningful. It is up to the researcher to determine if these measures are appropriate for their data. In general, you should never use any of these statistics for (a) binary / dichotomous variables, or (b) nominal variables, and should only use these statistics with caution for ordinal variables.
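
One way to see why the mean is not meaningful for a nominal variable is that it depends on which arbitrary number was assigned to each category, while the mode does not. The Python sketch below uses a made-up `majors` variable and two hypothetical coding schemes; it is an illustration of the warning, not SPSS behavior.

```python
from statistics import mean, mode

# Made-up nominal variable: the categories have no inherent order.
majors = ["Biology", "History", "History", "Nursing", "History", "Biology"]

# Two equally arbitrary numeric codings of the same categories.
coding_a = {"Biology": 1, "History": 2, "Nursing": 3}
coding_b = {"Biology": 2, "History": 3, "Nursing": 1}


def summarize(values, coding):
    """Return (mean of the numeric codes, label of the most frequent code)."""
    codes = [coding[v] for v in values]
    labels = {code: label for label, code in coding.items()}
    return mean(codes), labels[mode(codes)]


mean_a, mode_a = summarize(majors, coding_a)
mean_b, mode_b = summarize(majors, coding_b)
print(mean_a, mode_a)
print(mean_b, mode_b)
```

The two codings describe exactly the same responses, yet the means differ, so neither number says anything about the students. The mode identifies the same category under both codings.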

