Frequencies Part II (Categorical Data)
The Frequencies procedure can produce summary measures for categorical variables in the form of frequency tables, bar charts, or pie charts.
To run the Frequencies procedure, click Analyze > Descriptive Statistics > Frequencies.
The Frequencies window will appear.
By default, the Display frequency tables check box is selected. If this check box is not selected, no frequency tables will be produced, and the only output will come from the supplementary options from Statistics or Charts. For categorical variables, you will want to leave this box checked.
All variables in the dataset are listed in the column on the left. To select a variable for analysis, double-click on its name to move it to the right-hand column. You can move multiple variables to the right-hand column to create several frequency tables at once.
Using the sample dataset, let's create a frequency table for class rank (variable School_Class). Open the Frequencies window (Analyze > Descriptive Statistics > Frequencies), double-click on variable School_Class, and click OK. The following results should appear in the Output Viewer window.
The first table, Statistics, reports how many missing and nonmissing observations were in the dataset. Here, only one observation did not specify the student's class rank.
The second table contains the frequencies for the selected variable. The table title is determined by the variable's label (or the variable name, if a label is not assigned); here, the table is entitled Class Rank.
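This table contains four columns of summary measures: Frequency (the number of observations in each category), Percent (each category's share of all observations, including missing ones), Valid Percent (each category's share of the nonmissing observations only), and Cumulative Percent (the running total of Valid Percent from the first category down).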
Notice how the categories are grouped into "Valid" and "Missing" sections. This grouping allows for easy comparison of missing versus nonmissing observations. Note that "System" missing responses are observations that use SPSS's default method -- a period (.) -- for indicating missing values. If a user has assigned special codes for missing values in the Variable View window, those codes would appear here.
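To make the distinction between the percentage columns concrete, here is a minimal Python sketch of how such a table could be computed by hand. It is not SPSS's implementation; the `responses` data and the `frequency_table` helper are hypothetical, and `None` stands in for a system-missing value.

```python
from collections import Counter

# Hypothetical class-rank responses; None stands in for a missing value.
responses = ["Freshman", "Sophomore", "Junior", "Senior",
             "Freshman", "Junior", "Freshman", None]

def frequency_table(values):
    """Return rows of (category, frequency, percent, valid percent, cumulative percent)."""
    total = len(values)                        # all observations, missing included
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for category in sorted(counts):            # ascending category values
        n = counts[category]
        percent = 100 * n / total              # denominator includes missing cases
        valid_percent = 100 * n / len(valid)   # denominator excludes missing cases
        cumulative += valid_percent            # running total of valid percent
        rows.append((category, n, percent, valid_percent, cumulative))
    return rows

rows = frequency_table(responses)
for category, n, pct, valid_pct, cum_pct in rows:
    print(f"{category:<10} {n:>3} {pct:6.1f} {valid_pct:6.1f} {cum_pct:6.1f}")
```

Because one of the eight hypothetical responses is missing, Percent divides by 8 while Valid Percent divides by 7, so the two columns differ in every row.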
Clicking Charts in the Frequencies window will bring up the Frequencies: Charts prompt.
For categorical variables, bar charts and pie charts are appropriate. Histograms should only be used for continuous variables; they should not be used for ordinal variables.
A Bar chart displays the categories on the graph's x-axis, and either the frequencies or the percentages on the y-axis. In general, bar charts are the preferred graph type for displaying categorical variables, because the format allows for easy comparison of the relative sizes of the categories. Note that the options in the Chart Values area apply only to bar charts. In particular, these options affect whether the y-axis of the bar chart displays counts or percentages. This setting will not affect pie charts, and will be greyed out if the radio button next to Histograms is selected.
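The effect of the Chart Values setting can be illustrated with a short Python sketch. It is an illustration only: the `responses` data and the `bar_heights` helper are hypothetical, and the sketch assumes that missing values are left out of the chart and that percentages are taken over the nonmissing cases.

```python
from collections import Counter

# Hypothetical responses; None marks a missing value.
responses = ["Freshman", "Junior", "Freshman", "Senior", None, "Junior", "Freshman"]

def bar_heights(values, as_percent=False):
    """Return {category: bar height} as counts or as percentages of valid cases."""
    valid = [v for v in values if v is not None]   # assumption: missing cases are not charted
    counts = Counter(valid)
    if not as_percent:
        return dict(counts)
    return {category: 100 * n / len(valid) for category, n in counts.items()}

count_heights = bar_heights(responses)
percent_heights = bar_heights(responses, as_percent=True)

for label, heights in (("Frequencies", count_heights), ("Percentages", percent_heights)):
    print(label)
    for category, height in sorted(heights.items()):
        print(f"  {category:<10} {height:6.1f}")
```

Switching between the two only rescales the y-axis: the ratio between any two bars is the same in both versions, so the shape of the chart does not change.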
Alternatively, a Pie chart depicts the categories of a variable as "slices" of a circular "pie". Pie charts should typically be reserved for variables where each component can be thought of as a proportion of a whole. For example, a pie chart would be appropriate for depicting the proportion of time devoted to different activities during a normal work week. Here, a pie chart of class rank is given for comparison with the bar chart above.
Clicking Format in the Frequencies window brings up the Frequencies: Format prompt.
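The Order by options affect only categorical variables: Ascending values and Descending values sort the rows of the frequency table by category value, while Ascending counts and Descending counts sort the rows by how often each category occurs.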
When working with two or more categorical variables, the Multiple Variables options affect only the order of the output. If Compare variables is selected, then the frequency tables for all of the variables will appear first, and all of the graphs for the variables will appear after. If Organize output by variables is selected, then the frequency table and graph for the first variable will appear together; then the frequency table and graph for the second variable will appear together; etc.
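Both kinds of ordering in the Format prompt can be sketched in a few lines of Python. This is only an illustration of the ordering logic, not of how SPSS builds its output; the `counts` data, the second variable name `Gender`, and the helpers `order_rows` and `output_order` are hypothetical.

```python
# Hypothetical category counts for one variable.
counts = {"Freshman": 3, "Sophomore": 1, "Junior": 2, "Senior": 4}

def order_rows(counts, by="values", descending=False):
    """Sort table rows by category value or by count, ascending or descending."""
    if by == "values":
        key = lambda item: item[0]   # sort on the category value
    else:
        key = lambda item: item[1]   # sort on the frequency
    return sorted(counts.items(), key=key, reverse=descending)

def output_order(variables, layout):
    """List output items in the order each Multiple Variables option would give."""
    kinds = ("table", "chart")
    if layout == "compare":
        # All tables first, then all charts.
        return [(kind, var) for kind in kinds for var in variables]
    # Otherwise keep each variable's table and chart together.
    return [(kind, var) for var in variables for kind in kinds]

print(order_rows(counts, by="values"))                    # like Ascending values
print(order_rows(counts, by="counts", descending=True))   # like Descending counts
print(output_order(["School_Class", "Gender"], "compare"))
print(output_order(["School_Class", "Gender"], "organize"))
```

Note that sorting string categories by value gives alphabetical order, which is why Junior comes before Senior and Sophomore in the first result even though that is not the natural class-rank order.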
The vast majority of the descriptive statistics available in the Frequencies: Statistics window are not appropriate for nominal or ordinal categorical variables in most situations. There are two exceptions to this.
Warning: If your categorical variables are coded numerically, it is very easy to misuse measures like the mean and standard deviation. SPSS will compute those statistics if they are requested, whether or not they are meaningful. It is up to the researcher to determine if these measures are appropriate for their data. In general, you should never use any of these statistics for (a) binary / dichotomous variables, or (b) nominal variables, and should only use these statistics with caution for ordinal variables.
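A small Python sketch shows why the mean of a numerically coded nominal variable is not meaningful. The variable, its codes, and its labels are all hypothetical. The same five responses are coded two different ways: the mean changes with the arbitrary coding, while the most frequent category does not.

```python
from statistics import mean, mode

# Hypothetical nominal variable: how each respondent commutes.
labels = {1: "Bus", 2: "Car", 3: "Bicycle"}
codes = [1, 1, 3, 3, 3]

# The same responses under a different, equally arbitrary coding scheme.
recoded_labels = {3: "Bus", 2: "Car", 1: "Bicycle"}
recode = {1: 3, 2: 2, 3: 1}
recoded = [recode[c] for c in codes]

# The mean depends entirely on which numbers were assigned to which category.
print("Mean, original coding:", mean(codes))
print("Mean, recoded:        ", mean(recoded))

# The most frequent category is the same under either coding.
print("Mode, original coding:", labels[mode(codes)])
print("Mode, recoded:        ", recoded_labels[mode(recoded)])
```

Under the original coding the mean falls between the codes for Car and Bicycle, even though no respondent chose Car, which is the kind of result a numeric summary can silently produce for nominal data.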