How do I merge by more than one variable using PROC SQL in SAS?
I have 2 datasets in SAS:

main_1
ID  Rep  Dose  Response
1   2    34    567
1   1    45    756
2   1    35    456
3   1    56    345

main_2
ID  Rep  Hour  Day
1   1    89    157
2   1    62    365
3   1    12    689

I can easily merge these 2 datasets by ID and then by Rep (as one of the IDs has two observations) with the following code in SAS:
proc import out=main_1
    datafile='/folders/myfolders/sasuser.v94/main_1.xls'
    dbms=xls replace;
    /*optional*/
    sheet='Sheet1';
    getnames=yes;
run;
proc import out=main_2
    datafile='/folders/myfolders/sasuser.v94/main_2.xls'
    dbms=xls replace;
    /*optional*/
    sheet='Sheet1';
    getnames=yes;
run;
/*merge datasets based on common variables (ID then Rep)*/
/*first sort all datasets by target variables*/
proc sort data=main_1;
    by ID Rep;
run;
proc sort data=main_2;
    by ID Rep;
run;
/*can now be merged*/
data main_merge;
    merge main_1 main_2;
    by ID Rep;
run;
This produces the following table:

ID  Rep  Dose  Response  Hour  Day
1   1    45    756       89    157
1   2    34    567       .     .
2   1    35    456       62    365
3   1    56    345       12    689

I currently have the following PROC SQL alternative (I am still learning, so sorry if it's terrible), but I cannot seem to merge by more than one variable (i.e. ID and Rep):
proc sql;
create table merged_sql as
select L.*, R.*
from main_1 as L
LEFT JOIN main_2 as R
on L.ID = R.ID;
quit;
producing the following:

ID  Rep  Dose  Response  Hour  Day
1   2    34    567       89    157
1   1    45    756       89    157
2   1    35    456       62    365
3   1    56    345       12    689

Any suggestions for PROC SQL code that produces the same table as the data step merge? My current code adds the '89 157' to both ID=1 observations.
Many thanks.
You're almost there...
proc sql;
create table merged_sql as
select L.*,
R.HOUR,
R.DAY
from main_1 as L
LEFT JOIN main_2 as R
on L.ID = R.ID
and L.REP = R.REP;
quit;
The reason not to use R.* is to avoid a note or warning about duplicate ID and REP fields: with R.*, both columns would be selected from main_1 and again from main_2, and the output table can only hold one column of each name.
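
To try the join without the .xls files, the two example tables can be recreated inline. This is a minimal sketch rather than a definitive solution: the datalines values are copied from the question, the ORDER BY clause is an addition (PROC SQL does not guarantee row order without it, whereas the data step output is in ID/Rep order because of the preceding sorts), and the table name merged_full is a hypothetical one chosen only for this illustration.

```sas
/* Recreate the example tables from the question */
data main_1;
    input ID Rep Dose Response;
    datalines;
1 2 34 567
1 1 45 756
2 1 35 456
3 1 56 345
;
run;

data main_2;
    input ID Rep Hour Day;
    datalines;
1 1 89 157
2 1 62 365
3 1 12 689
;
run;

proc sql;
    /* Left join on both keys; every row of main_1 is kept, and
       Hour/Day are missing where main_2 has no matching ID+Rep */
    create table merged_sql as
    select L.*,
           R.Hour,
           R.Day
    from main_1 as L
    left join main_2 as R
        on  L.ID  = R.ID
        and L.Rep = R.Rep
    order by L.ID, L.Rep;

    /* Assumption: if main_2 could also contain ID+Rep pairs that are
       absent from main_1, a full join with COALESCE on the keys keeps
       them too. merged_full is a hypothetical table name. */
    create table merged_full as
    select coalesce(L.ID,  R.ID)  as ID,
           coalesce(L.Rep, R.Rep) as Rep,
           L.Dose,
           L.Response,
           R.Hour,
           R.Day
    from main_1 as L
    full join main_2 as R
        on  L.ID  = R.ID
        and L.Rep = R.Rep
    order by 1, 2;   /* sort by the first two selected columns */
quit;
```

For the data in the question both queries should return the same four rows, because every ID/Rep pair in main_2 also exists in main_1. The difference only matters when main_2 has unmatched keys: the data step MERGE with BY keeps unmatched observations from either dataset, which the left join does not.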