SAS data step / PROC SQL: insert rows from another table with an auto-increment primary key

I have two datasets, as shown below:

id name status 
1  A    a
2  B    b
3  C    c

The second dataset:

name status new
C    c      0
D    d      1
E    e      1
F    f      1

How do I insert all rows from the second table into the first? The first table is permanent, while the second is refreshed monthly, so I would like to add the rows from the monthly table to the permanent table, so that it looks like this:

id name status
1  A    a
2  B    b
3  C    c
4  D    d
5  E    e
6  F    f

The problem is that I cannot auto-increment the id in dataset 1. As far as I can tell, SAS datasets have no auto-increment property. Auto-increment can be done in a data step, but I don't know whether a data step can be used with two tables like this. The usual SQL would be:

Insert into table1 (name, status) 
select name, status from table2 where new = 1;

But SAS datasets do not support auto-increment columns, hence my problem. I could solve it by running the following data step after the PROC SQL above:

data table1;
set table1;
if _n_ > 3 then id = _n_;
run;

This fills in the id column for the new rows, but the code is rather ugly. Also, id is a primary key that is used as a foreign key in other tables, so I don't want to disturb the ids of existing rows.

I'm both learning and working with SAS, so help is really appreciated. Thanks in advance.

Extra question: if the second table does not have the new column, is there any way to do what I want (add new rows from the monthly table (2nd) to the permanent table (1st)) with a data step? Currently, I use this ugly PROC SQL/data step to create the new column:

proc sql;
/* create a temp table from table2 with a "new" flag */
create table t2temp as select t2.*,
(case when t2.name = t1.name and t2.status = t1.status then 0 else 1 end) as new
from table2 as t2
left join table1 as t1
on t2.name = t1.name and t2.status = t1.status;
drop table table2; /* drop the old table2, which has no "new" column */
quit;
data table2; /* recreate table2 from t2temp */
set t2temp;
run;

You can do it in the data step. By the way, if you were creating the table from scratch, you could just use

id+1;

to create an autonumbered field (assuming your data step wasn't too complicated). The data step below keeps track of the current highest ID and assigns the next number to each row that comes from the new dataset.

data have;
input id name $ status $;
datalines;
2  A    a
3  B    b
1  C    c
;;;;
run;

data addon;
input name $ status $ new;
datalines;
C    c      0
D    d      1
E    e      1
F    f      1
;;;;
run;

data want;
retain _maxID;                    *keep the value of _maxID from one row to the next, 
                                   do not reset it;
set have(in=old) addon(in=add);   *in= creates a temporary variable indicating which 
                                   dataset a row came from;
if (old) or (add and new);        *as in C, 0 and missing (null) are false, 
                                   and any other number is true;
if add then ID = _maxID+1;        *assigns ID to the new records;
_maxID = max(id,_maxID);          *determines the new maximum ID - 
                                   this structure guarantees it works 
                                   even if the old DS is not sorted;
put id= name=;
drop _maxID;
run;
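
If the permanent table should be extended in place rather than rebuilt as WANT, one possible alternative is to look up the current maximum id first and then append only the numbered new rows. This is a minimal sketch, not part of the original answer: it assumes the HAVE and ADDON datasets defined above, a SAS version that supports the TRIMMED option of INTO (9.3 or later), and the hypothetical names maxid and to_add.

```sas
/* Look up the current highest id in the permanent table (0 if it is empty). */
proc sql noprint;
  select coalesce(max(id), 0) into :maxid trimmed
  from have;
quit;

/* Keep only the rows flagged as new and number them after the maximum.  */
/* WHERE filters rows before they are read, so _n_ counts 1, 2, 3, ...   */
data to_add;
  set addon;
  where new = 1;
  id = &maxid + _n_;
  drop new;
run;

/* Append to the permanent table; existing rows and their ids are untouched. */
proc append base=have data=to_add;
run;
```

PROC APPEND matches variables by name, so the different column order in to_add does not matter.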

Response to the second question:

Yes, you can still do that. One of the easiest ways, if both datasets are sorted by NAME, is:

data want;
retain _maxID;
set have(in=old) addon(in=add);
by name;
if (old) or (add and first.name);
if add then ID = _maxID+1;
_maxID = max(id,_maxID);
put id= name=;
drop _maxID;
run;

first.name is true for the first record with a given value of name, so if HAVE already has a record with that name, ADDON is not permitted to add a new one.

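This does require name to be unique in HAVE; otherwise you might delete some records. Because the BY statement interleaves the two datasets, it also assumes that the highest existing ID has already been read before the first new name appears. If either condition does not hold, you need a more complicated solution.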
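
One way such a more complicated case might be approached is to pick out the genuinely new rows with an anti-join before numbering them. The following is only a sketch under assumptions: it treats a row as new when no row in HAVE has the same name and status (the matching rule used in the question), it uses the HAVE and ADDON datasets from the answer, and new_rows is a hypothetical name.

```sas
/* Select the ADDON rows that have no match in HAVE on name and status. */
/* DISTINCT guards against the same new row appearing twice in ADDON.   */
proc sql;
  create table new_rows as
  select distinct a.name, a.status
  from addon as a
  where not exists (
    select *
    from have as h
    where h.name = a.name
      and h.status = a.status
  );
quit;
```

Neither dataset has to be sorted for this, and duplicate names in HAVE are left alone. The rows in new_rows still need ids, which can be assigned from the current maximum id in a data step as in the answer.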
