简体   繁体   中英

How do I add a variable to a dataset within a SET statement in a data step?

I have a SAS data step that uses a %do loop to build a set statement that combines multiple files. The files have names of the form datafile2020q1 or datafile2021q3 , and they're each stored in libraries with names of the form DataLibrary2020 or DataLibrary2021 .

%macro processclaimsdata;
data rawclaimsdata (keep=&claimvariables.);
set
    %do yr=2019 %to 2021;
        %do qrtr=1 %to 4;
            DataLibrary&yr..datafile&yr.q&qrtr. (keep=&claimvariables.)
        %end;
    %end;
;
run;
%mend;

My goal is to add a variable to the dataset that has YYYYqQ in it, eg, 2020q1 or 2021q3, for each dataset, so I can keep track of which file it came from. Is this possible by modifying my code above, or do I need to rework it to use proc append and/or proc sql ?

Use the indsname= option to keep track of the dataset that a row came from. Although it won't be in the format you specify, this will tell you the exact dataset name that a row came from.

The code below will create a new variable in the dataset called dsn that stores the name of the dataset a row came from.

data want;
    set have1
        have2
        have3
    indsname=dsn;
run;

You can then use substr() or other string functions to snag the name of the quarter from each dataset name. For example:

qtr = substr(dsn, -6);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM