简体   繁体   English

如何有效地重塑我的data.table

[英]How can I reshape my data.table efficiently

I have a data.table DT 我有一个data.table DT

set.seed(1)
DT <- data.table(x=rep(c(1,2,3),each=4), y=c("A","B"), v=sample(1:100,12))
DT
    x y   v
 1: 1 A  29
 2: 1 B  92
 3: 1 A 100
 4: 1 B  82
 5: 2 A  28
 6: 2 B  26
 7: 2 A  18
 8: 2 B  22
 9: 3 A  30
10: 3 B  96
11: 3 A  15
12: 3 B   4

I would like to expand it like bellow, creating a new column for each value of x and reporting the v values, no structure in the data should be expected (not by blocks like bellow) 我想像下面这样展开它,为x每个值创建一个新列并报告v值,不应该预期数据中的结构(不是像下面的块那样)

    x y v.1  v.2 v.3
 1: 1 A  29   NA  NA
 2: 1 B  92   NA  NA
 3: 1 A 100   NA  NA
 4: 1 B  82   NA  NA
 5: 2 A  NA   28  NA
 6: 2 B  NA   26  NA
 7: 2 A  NA   18  NA
 8: 2 B  NA   22  NA
 9: 3 A  NA   NA  30
10: 3 B  NA   NA  96
11: 3 A  NA   NA  15
12: 3 B  NA   NA   4

I asked a very similar question here but cannot adapt the answer G Grothendieck gave us at the time... 在这里问了一个非常相似的问题但无法适应G格洛腾迪克当时给我们的答案......

EDIT: As usual I just almost got it after I wrote the post... I just need to replace those 0 by NA (I might get 0 in v and I want to be able to dissociate v==0 from missing items) 编辑:像往常一样,我在撰写帖子后几乎得到了它...我只需要用NA替换那些0(我可能在v中得到0并且我希望能够从缺失的项目中分离v == 0)

DT2 <- DT[, {SUM.<-factor(x); data.table(model.matrix(~ SUM.:v + 0))}]
txtR) DT2
    SUM.1:v SUM.2:v SUM.3:v
 1:      29       0       0
 2:      92       0       0
 3:     100       0       0
 4:      82       0       0
 5:       0      28       0
 6:       0      26       0
 7:       0      18       0
 8:       0      22       0
 9:       0       0      30
10:       0       0      96
11:       0       0      15
12:       0       0       4
set.seed(1)
DT <- data.table(x=rep(c(1,2,3),each=4), y=c("A","B"), v=sample(1:100,12))

This gives 这给了

    x y  v
 1: 1 A 27
 2: 1 B 37
 3: 1 A 57
 4: 1 B 89
 5: 2 A 20
 6: 2 B 86
 7: 2 A 97
 8: 2 B 62
 9: 3 A 58
10: 3 B  6
11: 3 A 19
12: 3 B 16

Next, the answer: 接下来,答案是:

ux <- unique(DT$x)
DT[,c(v.=lapply(ux,function(i)v[x==i])),by="x,y"]

which gives 这使

    x y v.1 v.2 v.3
 1: 1 A  27  NA  NA
 2: 1 A  57  NA  NA
 3: 1 B  37  NA  NA
 4: 1 B  89  NA  NA
 5: 2 A  NA  20  NA
 6: 2 A  NA  97  NA
 7: 2 B  NA  86  NA
 8: 2 B  NA  62  NA
 9: 3 A  NA  NA  58
10: 3 A  NA  NA  19
11: 3 B  NA  NA   6
12: 3 B  NA  NA  16

That answer might break in later versions of R, but the OP pointed out that this works too and may be faster: 在R的后续版本中,这个答案可能会中断,但OP指出这也有效,可能更快:

DT[,paste0("v.",ux):=lapply(ux,function(i)v[x==i]),by="x"]

Here's one way: 这是一种方式:

tt <- model.matrix(data=DT, ~ factor(x):rep(1, nrow(DT)) + 0)
tt[tt==0] <- NA
cbind(DT, DT$v * tt)
#     x y   v factor(x)1:v factor(x)2:v factor(x)3:v
#  1: 1 A  69           69           NA           NA
#  2: 1 B  39           39           NA           NA
#  3: 1 A  76           76           NA           NA
#  4: 1 B  49           49           NA           NA
#  5: 2 A 100           NA          100           NA
#  6: 2 B  95           NA           95           NA
#  7: 2 A  36           NA           36           NA
#  8: 2 B  73           NA           73           NA
#  9: 3 A  86           NA           NA           86
# 10: 3 B  20           NA           NA           20
# 11: 3 A  59           NA           NA           59
# 12: 3 B  12           NA           NA           12

You could simply loop over the x 's and use data.table assignment: 可以简单地遍历x并使用data.table赋值:

setkey(DT, x)
for (i in unique(DT$x)) {
  DT[J(i), paste0("v.", i) := v]
}

PS I really wish the following worked, but .GRP is not available there: PS我真的希望以下工作,但.GRP在那里不可用:

DT[, paste0("v.", .GRP) := v, by = x]

edit one more solution (trying to get the above .GRP idea to work somehow), using rbind.fill (I didn't run very careful benches, but this seemed to scale quite well) 编辑另一个解决方案(试图让上面的.GRP想法以某种方式工作),使用rbind.fill (我没有运行非常小心的长凳,但这似乎很好地扩展)

library(plyr)

cbind(DT,
      rbind.fill(DT[, list(list(setnames(data.table(v), paste0("v.", .GRP)))),
                      by = x]$V1))

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM