
Proc compare loop through multiple datasets in a folder SAS

I have two folders containing numerous datasets. Each folder contains the same set of datasets, and I'd like to compare them to confirm that they match. Is it possible to loop through each folder and compare each dataset?

%macro compare(dpath=, cpath=,);

%do i = 1 %to n;

proc compare base = &dpath data = &cpath;
run;

%mend;

%compare(dpath=folder1_path, cpath=folder2_path);

Point librefs to the "folders". Get the list of datasets. Use the list to drive the code generation.

%macro compare(dpath,cpath);
libname left "&dpath";
libname right "&cpath";

proc contents data=left._all_ noprint out=contents;
run;

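/* Generate one PROC COMPARE step per dataset found in LEFT */
data _null_;
  set contents;
  by memname;
  /* CONTENTS has one row per variable; keep the first row per dataset */
  if first.memname;
  call execute(catx(' '
    ,'proc compare base=',cats('left.',memname)
    ,'compare=',cats('right.',memname)
    ,';run;'
  ));
run;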

%mend;

%compare
(dpath=folder1_path
,cpath=folder2_path
);

To make it more robust you might want to check that the member names in LEFT actually match the member names in RIGHT. You could also add an ID statement to the generated PROC COMPARE code so that PROC COMPARE knows how to match observations; otherwise it will just match the observations in the order they appear.
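Both suggestions could be sketched roughly as follows. This is only an illustration, not part of the macro above: it assumes the librefs LEFT and RIGHT are already assigned, and `keyvar` is a hypothetical key variable that would have to exist in every dataset. The table names `both` and `unmatched` are likewise made up for the example.

```sas
proc sql noprint;
  /* Datasets that exist in both libraries: safe to compare */
  create table both as
    select l.memname
    from dictionary.tables as l
         inner join dictionary.tables as r
         on l.memname = r.memname
    where l.libname = 'LEFT'  and l.memtype = 'DATA'
      and r.libname = 'RIGHT' and r.memtype = 'DATA'
    order by l.memname;

  /* Datasets that exist in only one library: report, do not compare */
  create table unmatched as
    select coalesce(l.memname, r.memname) as memname,
           case when r.memname is missing then 'LEFT ' else 'RIGHT' end as found_in
    from (select memname from dictionary.tables
          where libname = 'LEFT' and memtype = 'DATA') as l
         full join
         (select memname from dictionary.tables
          where libname = 'RIGHT' and memtype = 'DATA') as r
         on l.memname = r.memname
    where l.memname is missing or r.memname is missing;
quit;

/* Generate PROC COMPARE only for the matched names.
   KEYVAR is a hypothetical ID variable; both datasets must be sorted
   by it unless the NOTSORTED option is added to the ID statement. */
data _null_;
  set both;
  call execute(catx(' '
    ,'proc compare base=',cats('left.',memname)
    ,'compare=',cats('right.',memname)
    ,'; id keyvar; run;'
  ));
run;
```

Any names listed in `unmatched` would then need to be reviewed by hand, since there is nothing to compare them against.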

This macro compares the contents of every dataset exactly as-is and records the name of each dataset that shows any difference at all in the dataset all_differences. If there are no differences, all_differences will not be created.

%macro compare(dpath=, cpath=);
    libname d "&dpath";
    libname c "&cpath";

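    /* Default to zero so the %do loop is skipped if D has no datasets */
    %local i n dsid nobs rc;
    %let n = 0;

    /* Save the dataset names in D to macro variables:
       &name1 &name2 etc., with the count in &n */
    data _null_;
        set sashelp.vmember;
        where libname = 'D' and memtype = 'DATA';

        call symputx(cats('name', _N_), memname);
        call symputx('n', _N_);
    run;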

    /* Remove any result left over from a previous run */
    proc datasets lib=work nolist nowarn;
        delete all_differences;
    quit;

    %do i = 1 %to &n;

        /* Compare the two datasets; only unequal observations go to outcomp */
        proc compare base    = d.&&name&i 
                     compare = c.&&name&i 
                     out     = outcomp
                     outnoequal;
        run;

        /* Get the number of obs from outcomp */
        %let dsid = %sysfunc(open(outcomp));
        %let nobs = %sysfunc(attrn(&dsid, nlobs));
        %let rc   = %sysfunc(close(&dsid));

        /* If outcomp is populated, log the dataset with differences */
        %if(&nobs > 0) %then %do;
            data _difference_;
                length dsn $32;
                dsn = "&&name&i.";
            run;

            proc append base=all_differences
                        data=_difference_
                        force;
            run;
        %end;
    %end;
%mend;
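
A possible way to call the macro and review the result is sketched below. The paths are the placeholders from the question, and `%report_differences` is a hypothetical helper added only for illustration. It checks whether all_differences exists before printing, since the macro above does not create it when nothing differs.

```sas
/* Hypothetical helper: print the result only if it was created */
%macro report_differences;
    %if %sysfunc(exist(work.all_differences)) %then %do;
        title "Datasets that differ between the two folders";
        proc print data=work.all_differences noobs;
        run;
        title;
    %end;
    %else %do;
        %put NOTE: No differences were found.;
    %end;
%mend report_differences;

/* folder1_path and folder2_path stand in for the real directories */
%compare(dpath=folder1_path, cpath=folder2_path);
%report_differences;
```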

