When starting a data analysis project in SAS OnDemand for Academics, your data files may or may not be in the SAS data file format (*.sas7bdat). Your data might be an Excel spreadsheet, an SPSS dataset, or a text or CSV file.
Regardless of what format your data is in, the process of getting your data into your SAS OnDemand for Academics account requires two steps. First, you must upload the file to your SAS OnDemand for Academics account -- that is, your data file must actually be uploaded to the cloud. Then, you must "import" the uploaded file into SAS Studio (using a process that is similar to what you would do with "desktop" SAS). The first step is unique to those using SAS OnDemand for Academics (if you were using "desktop" SAS, you could simply import the file directly from your computer), but fortunately only needs to be done once.
In this tutorial, we'll cover how to get data files from your computer into your SAS OnDemand for Academics account so that you can analyze them.
If you are using SAS OnDemand for Academics and have data files you want to work with in the program, you will need to upload data to your account. This goes for ANY dataset you want to work with in SAS ODA, including "native" SAS datasets (*.sas7bdat) or other formats such as Excel and CSV.
To upload a file to SAS ODA:
Left-click on the folder you want to upload the data file to (usually "Files (Home)" or a folder within that directory). Then click Upload. (Note that the Upload button will not be clickable until a folder location is selected, so if it's greyed out, make sure you've selected a folder.)
The Upload Files window will open. Click Choose Files, then choose the data file on your computer to upload.
If successful, you should see the data file(s) appear in the directory you selected in step 2:
At this stage, if you have uploaded a SAS dataset file (*.sas7bdat) and have already mapped a library to the folder where the data file is located, you can skip the next part of the tutorial and start working with the data right away! However, if you have uploaded a text, CSV, Excel, SPSS, or other non-SAS format, you'll need to proceed to the next step: importing your uploaded data.
If you have SAS code files (*.sas) on your computer that you want to use in SAS OnDemand for Academics, you can upload them to your account using these same steps. After uploading *.sas files to your account, you can open and edit them just like a SAS script created directly in ODA.
After your datafiles have been uploaded to your SAS OnDemand for Academics account, you can import the data into a SAS data step. The easiest way to do so is to use the Import Data task:
Click Tasks and Utilities in the Navigation pane, then click Import Data.
This will create a new tab in the work area, titled Import Data 1.
You can decide to drag and drop your files or select a file to import. In our case, the Excel file with the sample data is located in our Sample data folder. (Remember: when using SAS OnDemand for Academics, your data must be uploaded into the Server Files and Folders tab. If you do not see any data files in this window, you most likely need to upload them; see the previous section on uploading data to SAS ODA.)
Once you’ve chosen the Excel file with the data, click Open. SAS will populate the file information and generate code for importing.
To run the code, click the Run button (running man icon) to execute the code.
Once you’ve executed the code, check the Log to see if any errors have occurred. If successful, the Log window should tell us how many observations and variables the new dataset contains.
Additionally, you can preview the dataset in the Output Data tab.
If you are satisfied with the result of the import process, you have the option to save the Import task by clicking the Save button (floppy disk icon). This will create a *.ctl file in your SAS ODA account, which you can use to repeat/re-create the import process in the future. We highly recommend doing this so you have a record of how you imported the data.
Let's take a closer look at the syntax that SAS generates when using the Import tool and understand what each line is doing:
The first four lines are comments -- they do not contain executable code, and exist only as notes for human readers. The notes contain the name and location of the imported file, and the date that the import code was generated.
The first executable line of code is
%web_drop_table(WORK.IMPORT);
This particular command "clears out" any data in the system saved under the WORK.IMPORT name. You might notice that this command looks very different than the PROC IMPORT and PROC CONTENTS commands. Notice the percent symbol (%) at the start of the line: this tells us that %web_drop_table
is a user-defined macro command. In this case, it's a function that SAS themselves defined to quickly delete a particular dataset. If you've imported another dataset into SAS using the Import task under the name WORK.IMPORT and want to keep it, make sure you save it under a name other than WORK.IMPORT, or else you will lose it!
The first pertinent line starts with the FILENAME
statement, which specifies the path to the file containing the data to import. In particular, it assigns the nickname REFFILE to the full file path where our data is located. This means that we can simply type the nickname REFFILE in the PROC IMPORT
step instead of having to write out the full file path.
The next block of code is the PROC IMPORT
statement, which triggers the data import action. Note the first semicolon is not until the third line. This is because the syntax includes quite a few options associated with the PROC IMPORT
statement:
OUT
option tells SAS where to put the new SAS dataset it is creating – in this case we told it to put the new SAS dataset “import” in the Work library.DATAFILE
option points to the file directory of the dataset you are importing.DBMS
option tells SAS what engine to use to read the data (in this case, the XLSX engine). This value will depend on what type of file you're importing.GETNAMES=YES
instructs SAS to use the first row of the file as variable names.The PROC IMPORT code in this step is very similar to the PROC IMPORT code one would use for "desktop" SAS, so for more information about PROC IMPORT, see our Importing Excel Files into SAS tutorial.
Following the PROC IMPORT
command, SAS generates code for PROC CONTENTS
, which displays information about the newly-imported dataset (WORK.IMPORT).
PROC CONTENTS is a standard SAS procedure; for more information about how to use and understand it, see the Summarizing Dataset Contents with PROC CONTENTS tutorial.
Data saved in Excel or *.csv formats should work with the Import Wizard's default choices, but data saved in *.txt format with a special character separator requires a bit of extra work. The process starts the same, but if your data is saved as a .txt file, SAS requires you to explicitly say you're importing delimited data, and specify what character is used as the delimiter.
"Delimiters" are the character used to separate columns (i.e., variables) in the data file. In a CSV file, the delimiter is a comma; other common delimiters include tabs and spaces, but any character could be used. (The only way to know what character is used as the delimiter in a given *.txt file is to look at the data's accompanying documentation (if available), or look at the data file itself using a text editor.)
If the delimiter is not a white-space character, the character can be entered into the Delimiter field. If the delimiter is a white-space character, like a tab or a space, then the character's ASCII code must be used instead.
To import delimited data in a *.txt file, take the following steps:
","
'09'x
'20'x
When finished, click the Run button.
For tab-delimited data where the first line contains variable names and data begins on line 2, your inputs should look like the following:
If you used the above setup, your generated import syntax should look like this:
The import syntax generated for delimited data is very similar to the syntax generated for Excel files described earlier, but there are several notable differences. For the Excel file, PROC IMPORT syntax contained four special parameters: DATAFILE
, DBMS
, OUT
, and GETNAMES
. The generated PROC IMPORT syntax for delimited files has these same four parameters, plus two new ones:DELIMITER
and DATAROW
. Let's go over each one:
PROC IMPORT
statement is associated with three options, all of which appear before the first semicolon: DATAFILE
, DBMS
, and OUT
.
DATAFILE=REFFILE
option points to the file directory of the dataset you are importing. The term REFFILE
is a "nickname" for the file path where the data is actually located, and is defined on line 9 of the script.DBMS=DLM
option tells SAS what engine to use to read the data, and differs based on what type of data you're importing. Because we are reading delimited data, we are using the DLM engine (instead of the XLSX engine).OUT=WORK.IMPORT
option tells SAS what to call and where to put the new SAS dataset. The name WORK.IMPORT means that the dataset will be named IMPORT, and it will be placed in the WORK library.DELIMITER='09'x
statement declares that tabs are the delimiter used in the file ('09'x
is the ASCII code for the tab character).GETNAMES=YES
statement tells SAS to use the first row of the file as the variable names.DATAROW=2
statement says to begin reading data/observations on line 2.