This is a short comparison of SAS and R code for generating multiple datasets. The example uses the mpg dataset from R (shipped with the ggplot2 package) to show how one might use a SAS macro to subset the data by car class, followed by the R equivalent. The SAS macro is a little complicated, whereas the R version takes only a few lines to accomplish the same task.
First, the SAS Code.
    /* Get the unique list of car classes */
    proc sql;
      create table class as
      select distinct class from mpg;
    quit;

    %macro Subset_Data();
      /* Open the list of car classes, and go from the first item to the last in the list */
      /* SYSNOBS holds the number of observations in the last dataset processed by SAS */
      %do i = 1 %to &SYSNOBS.;
        data _null_;
          set class;
          if _n_ = &i.;
          /* symput saves the value of class as the macro variable name */
          call symput('name', class);
        run;

        /* Subset the data */
        data dataset_&name.;
          set mpg;
          where class = "&name.";
        run;
      %end;
    %mend Subset_Data;

    %Subset_Data;
Now, the R Code.
Here the R equivalent to the do loop is a for loop. The proc sql code used to generate a unique list of car classes is replaced by simply unique(mpg$class).
    library(ggplot2)
    data(mpg)

    for (var in unique(mpg$class)) {
      assign(paste("dataset", var, sep = "_"), mpg[which(mpg$class == var), ])
    }
In the SAS code, the datasets are named by reading the list of classes and passing the value from a specific observation to a macro variable. In R, the assign function names the data frame produced by mpg[which(mpg$class == var), ].
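As a side note, the same subsetting can also be done without a loop. The sketch below is not part of the original comparison: it assumes base R's split() is an acceptable substitute for the for loop, and the object name datasets is made up for illustration. It builds a named list with one data frame per car class, then optionally copies the elements into the global environment under the same dataset_<class> names that assign() produces.

```r
# A sketch of a list-based alternative to the for/assign loop
# (an assumption for illustration, not the original author's method).
library(ggplot2)  # provides the mpg dataset
data(mpg)

# split() returns a named list: one data frame per value of mpg$class
datasets <- split(mpg, mpg$class)  # 'datasets' is a hypothetical name

names(datasets)          # the car classes, used as list element names
nrow(datasets[["suv"]])  # number of rows in one of the subsets

# Optional: recreate the dataset_<class> objects in the global environment
names(datasets) <- paste("dataset", names(datasets), sep = "_")
invisible(list2env(datasets, envir = globalenv()))
```

Keeping the subsets in a list rather than as separate objects tends to make later steps easier, since functions such as lapply() can then be applied to every subset at once.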