The problem: although packages for reading SAS datasets into R exist, they do not handle many formats, especially custom formats. A user must therefore enter those manually in R. This becomes particularly onerous with survey datasets involving custom Likert scales.
Solution: SAS-R script. This handy script from an anonymous contributor generates R code to set the levels, labels and formatting of each variable.
Getting the Dataset into R
Two main options:
Haven package
read_sas(): reads .sas7bdat and .sas7bcat files from SAS
read_sav(): reads .sav files from SPSS
read_dta(): reads .dta files from Stata.
sas7bdat package
read.sas7bdat("psu97ai.sas7bdat")
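As a rough sketch of the two options above, the helper below tries haven first and falls back to sas7bdat. The function name read_sas_any and its arguments are hypothetical and not part of either package; it assumes at least one of the two packages is installed.

```r
# Hypothetical helper (not from either package): read a .sas7bdat file
# with whichever reader happens to be installed.
read_sas_any <- function(path, catalog = NULL) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  if (requireNamespace("haven", quietly = TRUE)) {
    # haven::read_sas() takes an optional .sas7bcat format catalog
    return(as.data.frame(haven::read_sas(path, catalog_file = catalog)))
  }
  if (requireNamespace("sas7bdat", quietly = TRUE)) {
    # sas7bdat reads the data file only; no catalog support is assumed here
    return(sas7bdat::read.sas7bdat(path))
  }
  stop("Install either the 'haven' or the 'sas7bdat' package")
}

# Usage, assuming the file sits in the working directory:
# dat <- read_sas_any("psu97ai.sas7bdat")
```

Either way the values arrive as plain codes, so the custom labels still have to be applied afterwards.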
Solution: sas-r
A Simple SAS Program
- Edit three items:
Source Dataset Location (reads in the column names and assigned formats)
Formats
From a format library
From a sas program
Or paste the custom formats directly into the sas-r script
Output R Program Name
- Click run in SAS
- Paste the generated R code into your program and lean back in your chair
Example
SAS: value am_terrified_about_being_o_ 1='Always' 2='Usually' 3='Often' 4='Sometimes' 5='Rarely' 6='Never';
R: EAT26$am_terrified_about_being_o <- factor(EAT26$am_terrified_about_being_o, c(1, 2, 3, 4, 5, 6), exclude = "")
levels(EAT26$am_terrified_about_being_o) <- c("Always", "Usually", "Often", "Sometimes", "Rarely", "Never")
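To see what the generated code does, here is a minimal sketch on a toy data frame. The values are made up for illustration; only the variable name and labels come from the example above. It also shows a one-step factor() call that should give the same result.

```r
# Toy stand-in for the survey data; the values are invented.
EAT26 <- data.frame(am_terrified_about_being_o = c(1, 6, 3, NA, 2))

codes  <- c(1, 2, 3, 4, 5, 6)
labels <- c("Always", "Usually", "Often", "Sometimes", "Rarely", "Never")

# Two-step form, as in the generated code: set the codes, then rename the levels
two_step <- factor(EAT26$am_terrified_about_being_o, codes, exclude = "")
levels(two_step) <- labels

# One-step form: pass the labels directly to factor()
one_step <- factor(EAT26$am_terrified_about_being_o,
                   levels = codes, labels = labels)

# Both forms map each code to the same label and keep NA as NA
identical(as.character(two_step), as.character(one_step))
table(two_step, useNA = "ifany")
```

Levels follow the order given in codes, not the alphabetical order of the labels.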
Final Tips/Questions?/Link
Add the notsorted option (otherwise factor levels will be sorted alphanumerically)
For ordinal factors, you must manually apply this to every variable:
ordered_vars <- c(34:63)
EAT26[ordered_vars] <- lapply(EAT26[ordered_vars], as.ordered)
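A small self-contained sketch of the same conversion, using an invented data frame (survey, q1 and q2 are hypothetical names). It selects columns by name rather than by position, which may be less brittle than 34:63 if the column order changes.

```r
# Toy data frame; column names and values are invented for illustration.
lv <- c("Always", "Often", "Never")
survey <- data.frame(
  id = 1:4,
  q1 = factor(c("Never", "Always", "Often", "Never"), levels = lv),
  q2 = factor(c("Often", "Often", "Always", "Never"), levels = lv)
)

# Convert the chosen columns to ordered factors; the level order is kept
ordered_vars <- c("q1", "q2")
survey[ordered_vars] <- lapply(survey[ordered_vars], as.ordered)

# Ordered factors support comparisons against a level
is.ordered(survey$q1)
survey$q1 >= "Often"
```

The ordering comes from the existing level order, so the levels need to be in the right sequence before as.ordered is applied.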