How do I merge by more than one variable using proc SQL in SAS?

I have 2 datasets in SAS:

main_1

ID  Rep  Dose  Response
1   2    34    567
1   1    45    756
2   1    35    456
3   1    56    345

main_2

ID  Rep  Hour  Day
1   1    89    157
2   1    62    365
3   1    12    689

I can easily merge these 2 datasets by ID and then by Rep (as one of the IDs has two observations) with the following code in SAS:

proc import out=main_1 
    datafile='/folders/myfolders/sasuser.v94/main_1.xls'
    dbms=xls replace;
    /*optional*/
    sheet='Sheet1';
    getnames=yes;
run;

proc import out=main_2 
    datafile='/folders/myfolders/sasuser.v94/main_2.xls'
    dbms=xls replace;
    /*optional*/
    sheet='Sheet1';
    getnames=yes;
run;

/* merge datasets based on common variables (ID, then Rep) */
/* first sort both datasets by the BY variables */
proc sort data=main_1;
    by ID Rep;
run;
proc sort data=main_2;
    by ID Rep;
run;
/* the sorted datasets can now be merged */
data main_merge;
    merge main_1 main_2;
    by ID Rep;
run;

This produces the following table:

ID  Rep  Dose  Response  Hour  Day
1   1    45    756       89    157
1   2    34    567       .     .
2   1    35    456       62    365
3   1    56    345       12    689

I currently have the following proc SQL alternative (I am learning, so sorry if it's terrible), but I cannot seem to merge by more than 1 variable (i.e. ID and Rep):

proc sql;
    create table merged_sql as 
    select L.*, R.*
    from main_1 as L
    LEFT JOIN main_2 as R
    on L.ID = R.ID;
quit;

This produces the following:

ID  Rep  Dose  Response  Hour  Day
1   2    34    567       89    157
1   1    45    756       89    157
2   1    35    456       62    365
3   1    56    345       12    689

Any suggestions for proc SQL code that produces the same table as before? My current code adds the '89 157' to both ID=1 observations.

Many thanks.

You're almost there...

proc sql;
    create table merged_sql as 
      select L.*, 
             R.HOUR,
             R.DAY
        from main_1 as L
          LEFT JOIN main_2 as R
                 on L.ID = R.ID
                and L.REP = R.REP;
quit;

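Joining on ID alone matches both ID=1 rows of main_1 to the single ID=1 row of main_2, which is why '89 157' appeared twice. With the second condition on REP in the ON clause, the ID=1, Rep=2 row finds no partner in main_2 and gets missing values for Hour and Day, as in the data step merge.

The reason not to use R.* is to avoid a note or warning about having duplicate ID and REP fields.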
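One difference worth keeping in mind: a data step MERGE with BY keeps rows from both datasets, while a LEFT JOIN keeps only the rows of main_1. For the sample data the two give the same result, but if main_2 could contain ID/Rep pairs that main_1 lacks, a FULL JOIN with COALESCE on the keys is a closer match. The following is only a sketch under that assumption: the datasets are rebuilt inline with datalines (values copied from the question) so it runs without the Excel files, and the table name merged_full is made up for illustration.

```sas
/* Recreate the two example datasets inline (values copied from the question) */
data main_1;
    input ID Rep Dose Response;
    datalines;
1 2 34 567
1 1 45 756
2 1 35 456
3 1 56 345
;
run;

data main_2;
    input ID Rep Hour Day;
    datalines;
1 1 89 157
2 1 62 365
3 1 12 689
;
run;

/* FULL JOIN on both keys; COALESCE takes each key from whichever side has it.
   merged_full is a hypothetical table name, not from the original thread. */
proc sql;
    create table merged_full as
    select coalesce(L.ID, R.ID)   as ID,
           coalesce(L.Rep, R.Rep) as Rep,
           L.Dose,
           L.Response,
           R.Hour,
           R.Day
    from main_1 as L
    full join main_2 as R
        on  L.ID  = R.ID
        and L.Rep = R.Rep
    order by 1, 2;   /* sort by the first two output columns: ID, then Rep */
quit;

proc print data=merged_full noobs;
run;
```

For the sample data this should give the same four rows as the data step merge, with Hour and Day missing for ID=1, Rep=2. Note that this equivalence only holds while each ID/Rep pair appears at most once in at least one of the datasets; with repeated keys on both sides, a SQL join and a data step merge behave differently.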
