Reshaping R data.table that is both wide and long (and sparse)

I have a data.table that is both wide and long, and also sparse. This is the simplest example:

Row Val1 Val2
1   1
2        1

Reshaping from wide to long yields:

Row Idx Val
1   1   1
1   2
2   1
2   2   1

Reshaping from long to wide (the index is implicit, based on the non-missing rows, in this case the row numbers) yields:

Row Val1.1 Val2.1 Val1.2 Val2.2
1   1                    1

What I want is:

Row Idx Val
1   1   1
2   2   1

Missing values are structurally missing and should be discarded.
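One way to discard them during the reshape is the na.rm argument of data.table's melt(). The following is only a minimal sketch: it assumes the blank cells are stored as NA, and the table name df1 and the final conversion of the column labels to an integer index are illustrative choices, not part of the original data.

```r
library(data.table)

# Assumption: the blank cells in the example are NA
df1 <- data.table(Row = 1:2, Val1 = c(1, NA), Val2 = c(NA, 1))

# Wide to long; na.rm = TRUE drops the rows whose value is missing
long <- melt(df1, id.vars = "Row",
             variable.name = "Idx", value.name = "Val",
             variable.factor = FALSE, na.rm = TRUE)

# Turn the labels "Val1"/"Val2" into the integer index 1/2
long[, Idx := as.integer(sub("^Val", "", Idx))]
setorder(long, Row)
print(long)
```

Only the two filled cells survive, one per Row, so no separate filtering step is needed afterwards.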

The data set is very complex (400+ columns); it is from a survey in which one question was replicated in six different ways for six different cases, with answers selectively filled based on the case. Each question has six binary answers, making 36 columns. These need to be collapsed into the eight columns representing the eight unique binary answers, plus a new column identifying the case.

There are several other questions with similar issues, so I need to find an algorithm to do this, and I don't have the vocabulary to explain it to Google. What is the right way to do this?

Try

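# For each Row, Idx is the position of the non-missing value among the Val columns
setDT(df1)[, list(Idx = which.max(unlist(.SD)), Val = 1), by = Row]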
#   Row Idx Val
#1:   1   1   1
#2:   2   2   1
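The same collapse can be sketched for the survey layout described in the question, scaled down here to two cases and two answers. Everything named below is hypothetical: the table survey, the column pattern Q_<case>_<answer>, and the output names Ans1 and Ans2 are assumptions made for illustration, and the approach (melt with na.rm, split the column name, dcast back) is one possible route rather than the method used above.

```r
library(data.table)

# Hypothetical data: each respondent has answers filled for one case only
survey <- data.table(
  Row   = 1:3,
  Q_A_1 = c(1, NA, NA), Q_A_2 = c(0, NA, NA),
  Q_B_1 = c(NA, 0, 1),  Q_B_2 = c(NA, 1, 1)
)

# Long format, dropping the structurally missing cells
long <- melt(survey, id.vars = "Row", na.rm = TRUE, variable.factor = FALSE)

# Split "Q_A_1" into its case ("A") and answer ("1") parts
long[, c("Case", "Answer") := tstrsplit(variable, "_", fixed = TRUE, keep = 2:3)]

# Back to wide: one column per answer, plus the Case identifier
collapsed <- dcast(long, Row + Case ~ Answer, value.var = "value")
setnames(collapsed, c("1", "2"), c("Ans1", "Ans2"))
print(collapsed)
```

Because the case is recovered from the column name, the same steps should apply to other questions in the data set as long as their columns follow a consistent naming pattern.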
