SAS: transpose a data set
I have a large transaction data set that I need to transpose. The data I have:
id prod
1 A
1 B
1 C
1 B
1 B
2 A
2 B
2 B
2 B
2 D
I need to transpose it to:
id PROD_1 PROD_2 PROD_3
1 A B C
2 A B D
There are a lot of variables that need this type of work, and I really have no clue right now. Any help is greatly appreciated. If you have a better idea for transforming this information about prod into a data set that can be analyzed, please let me know.
You can transpose as many variables as you like in one data step. This will typically be much faster than doing the same thing with proc transpose:
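data want;
  if 0 then set have; /* Never executes, but defines all columns in their original order */
  array prods[5] $ prod1-prod5; /* One slot per row of an id; at most 5 rows assumed */
  do _n_ = 1 by 1 until(last.id); /* DOW-loop: read every row of the current id */
    set have;
    by id;
    prods[_n_] = prod;
  end; /* One output row is written per id when the step iteration ends */
run;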
Just add more arrays as necessary for each variable you want to transpose. The array size (5 here) must be at least the largest number of rows for any id; an id with more rows will trigger an array subscript out of range error. If you're not sure how many rows there are, you'll need an extra initial pass over the data to find out how large to make the arrays.
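The extra initial pass could look something like the following. This is only a sketch: it assumes the input is named have, is sorted by id, and that prod is a character variable; the macro variable name maxn is made up for illustration, and the trimmed option requires SAS 9.3 or later.

```sas
/* Pass 1: find the largest number of rows for any id.        */
/* maxn is a hypothetical macro variable name.                */
proc sql noprint;
  select max(n) into :maxn trimmed
  from (select count(*) as n from have group by id);
quit;

/* Pass 2: same DOW-loop as above, with the array sized from maxn. */
data want;
  if 0 then set have;
  /* $ alone gives the new variables the default length of 8; */
  /* specify a longer length if prod is wider than that.      */
  array prods[&maxn] $ prod1-prod&maxn;
  do _n_ = 1 by 1 until (last.id);
    set have;
    by id;
    prods[_n_] = prod;
  end;
  drop prod; /* the original long-format column is no longer needed */
run;
```

The first pass costs one more read of the data, but it removes the need to hard-code the array size.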
This technique is known as a DOW-loop. Further reading: http://analytics.ncsu.edu/sesug/2010/BB13.Dorfman.pdf
If it is a large data set, you need to think about efficiency. It needs to be sorted (or indexed) on the ID variable first. Also, make sure you only keep (process) the relevant variables:
proc transpose data=input(keep=id prod) out=output(drop=_name_) prefix=PROD_;
by id;
var prod;
run;
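Note that the desired output in the question lists each product only once per id, whereas transposing the raw rows produces one column per row, duplicates included. One possible way to get the distinct products is to remove duplicates before transposing. The sketch below assumes the input is named have; the intermediate name have_distinct is made up for illustration. Be aware that it orders the products alphabetically within each id rather than by first appearance, which happens to give the same result for the sample data.

```sas
/* Keep one row per id/prod combination; this also sorts by id. */
proc sort data=have(keep=id prod) out=have_distinct nodupkey;
  by id prod;
run;

/* Transpose the de-duplicated rows into PROD_1, PROD_2, ... */
proc transpose data=have_distinct out=want(drop=_name_) prefix=PROD_;
  by id;
  var prod;
run;
```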