
SAS: transpose a data set

I have a large transaction data set that I need to transpose. The data I have:

id        prod
1            A
1            B
1            C
1            B
1            B
2            A
2            B
2            B
2            B
2            D

I need to transpose it to:

id   PROD_1   PROD_2   PROD_3
1      A        B        C 
2      A        B        D

There are many variables that need this kind of treatment. Any help is greatly appreciated; I really have no clue right now. Or, if you have a better idea for transforming this product information into a data set that can be analyzed, please let me know.

You can transpose as many variables as you like in one data step. This will typically be much faster than doing the same thing with proc transpose:

data want;
  if 0 then set have; /*Keeps all columns in the original order*/
  array prods[5] $ prod1-prod5;
  do _n_ = 1 by 1 until(last.id);
    set have;
    by id;
    prods[_n_] = prod;
  end;
run;

Just add more arrays as necessary for each variable you want to transpose. This assumes that you want to look at the same fixed number of rows for each id (five here); an id with more rows than the array has elements causes an "array subscript out of range" error. If you're not sure how many rows there are, you'll need an extra initial pass to find out how large to make the arrays.

This technique is known as a DOW-loop. Further reading: http://analytics.ncsu.edu/sesug/2010/BB13.Dorfman.pdf
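The extra initial pass mentioned above could look something like the sketch below. It is only an illustration under a few assumptions: the input is called have and is already sorted by id, the macro variable name max_n is made up for this example, and the TRIMMED option on INTO requires SAS 9.3 or later.

```sas
/* Initial pass: find the largest number of rows any single id has. */
/* max_n is a hypothetical macro variable name chosen for this sketch. */
proc sql noprint;
  select max(n) into :max_n trimmed
  from (select count(*) as n from have group by id);
quit;

/* Same DOW-loop as above, with the array sized from the first pass. */
data want;
  if 0 then set have;                      /* keeps the original column order */
  array prods[&max_n] $ prod1-prod&max_n;  /* one element per possible row */
  do _n_ = 1 by 1 until(last.id);
    set have;
    by id;                                 /* have must be sorted by id */
    prods[_n_] = prod;
  end;
  drop prod;                               /* optional: drop the long-format column */
run;
```

Ids with fewer rows than &max_n simply end up with blank values in the trailing prod columns.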

If it is a large data set, you need to think about efficiency. It needs to be sorted (or indexed) on the ID variable first. Also, make sure you keep (process) only the relevant variables:

proc transpose data=input(keep=id prod) out=output(drop=_name_) prefix=PROD_;
by id;
var prod;
run;
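Note that the step above keeps repeated products, so with the sample data it produces PROD_1 through PROD_5 rather than the three columns shown in the question. If the goal really is one column per distinct product within each id (an assumption based on the desired output in the question), one option is to de-duplicate first. A minimal sketch, assuming the input is named have and using have_unique as a made-up name for the intermediate data set:

```sas
/* Keep one row per id/prod combination; this also sorts by id. */
proc sort data=have(keep=id prod) out=have_unique nodupkey;
  by id prod;
run;

/* Transpose the de-duplicated rows into PROD_1, PROD_2, ... */
proc transpose data=have_unique out=want(drop=_name_) prefix=PROD_;
  by id;
  var prod;
run;
```

The products come out in sorted order within each id, not in order of first appearance; for the sample data the two happen to coincide.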

