Our tutorials reference a dataset called "sample" in many examples. If you'd like to download the sample dataset to work through the examples, choose one of the files below:
Most of the time, you'll need to make modifications to your variables before you can analyze your data. These types of modifications can include changing a variable's type from numeric to string (or vice versa), merging the categories of a nominal or ordinal variable, dichotomizing a continuous variable at a cut point, or computing a new summary variable from existing variables. This section will focus on transformations applied to individual variables, particularly recoding and computing new variables.
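As a rough illustration of the variable transformations described above, here is a minimal Python sketch. It is not tied to the "sample" dataset: the records, the variable names (`id`, `rank`, `height`, `test1`, `test2`), and the 68-inch cut point are all hypothetical and chosen only to show each kind of change.

```python
# Hypothetical records; these variable names and values are invented
# for illustration and are not taken from the "sample" dataset.
rows = [
    {"id": "1", "rank": "Freshman", "height": 66.5, "test1": 80, "test2": 90},
    {"id": "2", "rank": "Senior", "height": 70.0, "test1": 70, "test2": 75},
]

HEIGHT_CUT = 68.0  # assumed cut point for dichotomizing height


def transform(row):
    """Return a copy of the row with four derived variables added."""
    out = dict(row)
    # Change a variable's type: string ID to numeric ID.
    out["id_num"] = int(row["id"])
    # Merge categories of an ordinal variable into two groups.
    lower = {"Freshman", "Sophomore"}
    out["rank_group"] = "Lower" if row["rank"] in lower else "Upper"
    # Dichotomize a continuous variable at a cut point.
    out["tall"] = 1 if row["height"] >= HEIGHT_CUT else 0
    # Compute a new summary variable from existing variables.
    out["test_mean"] = (row["test1"] + row["test2"]) / 2
    return out


transformed = [transform(r) for r in rows]
```

Each transformation writes to a new variable rather than overwriting the original, so the source values stay available for checking the result.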
Managing a dataset often includes tasks such as sorting data, subsetting data into separate samples, merging multiple sources of data, aggregating data based on some key indicator, or restructuring a dataset. These types of data management tasks are sometimes called data cleaning, data munging, or data wrangling. This section covers these types of "cleaning" tasks.
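To make these tasks concrete, the following minimal Python sketch shows sorting, subsetting, merging, and aggregating on a tiny in-memory dataset. The records, the `advisors` lookup, and all field names are hypothetical; they are assumptions made for the example and do not come from the "sample" dataset.

```python
from collections import defaultdict

# Hypothetical data; names and values are invented for illustration.
students = [
    {"id": 3, "major": "Math", "gpa": 3.0},
    {"id": 1, "major": "Biology", "gpa": 3.75},
    {"id": 2, "major": "Math", "gpa": 3.5},
]
# A second data source, keyed by major.
advisors = {"Math": "Dr. A", "Biology": "Dr. B"}

# Sorting: order the cases by ID.
by_id = sorted(students, key=lambda r: r["id"])

# Subsetting: keep only the cases that meet a condition.
math_only = [r for r in students if r["major"] == "Math"]

# Merging: attach the matching advisor from the second source to each case.
merged = [{**r, "advisor": advisors.get(r["major"])} for r in by_id]

# Aggregating: mean GPA within each level of a key indicator (major).
gpas = defaultdict(list)
for r in students:
    gpas[r["major"]].append(r["gpa"])
mean_gpa = {major: sum(v) / len(v) for major, v in gpas.items()}
```

Each step builds a new list or dictionary and leaves `students` unchanged, which makes it easy to compare the result of a step against the original data.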