SAS Tutorials: Summarizing dataset contents with PROC CONTENTS

This SAS software tutorial shows how to summarize a SAS dataset's contents and metadata using PROC CONTENTS.

Summarizing Data with PROC CONTENTS

The CONTENTS procedure generates summary information about the contents of a dataset, including:

  • The variables' names, types, and attributes (including formats, informats, and labels)
  • How many observations are in the dataset
  • How many variables are in the dataset
  • When the dataset was created

This procedure is especially useful after importing data from a file, when you want to confirm that each variable was read correctly and has the appropriate type and format. (For example, you may wish to check that none of your character variables have been truncated, and that your date variables have not been misread.) The basic syntax of PROC CONTENTS is:

PROC CONTENTS DATA=sample;
RUN;

As with most SAS procedures, the DATA= option (which specifies the name of the dataset) is optional, but recommended. If you do not specify a dataset, SAS will use the most recently created dataset by default.
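The default-dataset behavior can be illustrated with a small sketch. The dataset name work.demo and its variables are hypothetical and used only for illustration; they are not part of the sample data file.

```sas
/* Hypothetical one-row dataset, created only for this illustration */
DATA work.demo;
    id = 1;
    group = "A";
RUN;

/* No DATA= option: SAS falls back to the most recently created
   dataset, which at this point is work.demo */
PROC CONTENTS;
RUN;

/* Explicit form: describes the same dataset, but keeps working
   as intended even if another dataset is created in between */
PROC CONTENTS DATA=work.demo;
RUN;
```

Naming the dataset explicitly avoids surprises when a program is reordered or another step creates a dataset between the DATA step and the procedure call.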

By default, PROC CONTENTS lists the variables in alphabetical order by name, rather than in the order they appear in the dataset. You can change this by adding the ORDER=VARNUM option to the PROC CONTENTS statement:

PROC CONTENTS DATA=sample ORDER=varnum;
RUN;
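Listing variables in dataset order is particularly convenient when checking an import, because the output can then be compared column by column against the source file. The following is a minimal sketch of that workflow; the file path and the dataset name work.sample_import are placeholders, and the sketch assumes the source is a CSV file with variable names in its first row.

```sas
/* Placeholder path: replace with the location of your own file */
PROC IMPORT DATAFILE="/path/to/sample.csv"
    OUT=work.sample_import
    DBMS=CSV
    REPLACE;
    GETNAMES=YES;      /* take variable names from the first row */
    GUESSINGROWS=MAX;  /* scan all rows before choosing types and lengths
                          (MAX is accepted in recent SAS releases) */
RUN;

/* List variables in file order to check types, lengths, and formats */
PROC CONTENTS DATA=work.sample_import ORDER=varnum;
RUN;
```

If a character variable shows a shorter length than expected, or a date column appears as a character variable, the import settings are the first thing to revisit.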

The screenshot below shows the output of PROC CONTENTS on the sample data file. Key elements are labeled and described below the screenshot.

A The number of observations (or rows) in the dataset. Here, the sample dataset contains 435 observations.

B The number of variables (or columns) in the dataset. Here, the sample dataset contains 23 variables.

C The date and time that the dataset was created and last modified.

D This part of the output lists the dataset’s variables and their attributes.

  • (#): The position of the variable in the columns of the dataset. (By default, PROC CONTENTS prints the variables in alphabetical order by name, instead of in the order that they appear in the dataset.)
  • Type: Whether the variable is numeric (Num) or character (Char).
  • Len: Short for "Length"; the number of bytes used to store each value of the variable.
  • Format: The assigned format that will be used when the values of the variable are printed in the Output window.
  • Informat: The informat assigned to the variable, which tells SAS how to interpret the raw values when they are read into SAS.
  • Label: The assigned variable label that will be used in place of the variable name when output is printed in the Output window. If a variable does not have a label, its entry in this column is blank.
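To see where each of these attributes comes from, it can help to assign them by hand and then run PROC CONTENTS. The sketch below is illustrative only: the dataset work.attr_demo, its variables, and its two data rows are made up for this example and are not taken from the sample data file.

```sas
/* Hypothetical dataset with explicitly assigned attributes */
DATA work.attr_demo;
    LENGTH name $ 20;                    /* Len: 20 bytes for this character variable */
    INFORMAT visit_date MMDDYY10.;       /* Informat: how raw values are read in */
    FORMAT visit_date DATE9.;            /* Format: how values are displayed */
    LABEL visit_date = "Date of visit";  /* Label: descriptive text for output */
    INPUT name $ visit_date;
    DATALINES;
Alice 01/15/2020
Bob 02/20/2021
;
RUN;

/* The variable table should now show the attributes assigned above */
PROC CONTENTS DATA=work.attr_demo ORDER=varnum;
RUN;
```

In the resulting table, name should appear as Char with a length of 20, and visit_date as Num with the default numeric length of 8, along with its format, informat, and label. Because name was not given a label, its Label entry is blank.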