data.table: create new columns with lapply

I have a data.table and want to apply a function to each subset of its rows. Normally one would do as follows: DT[, lapply(.SD, function), by = y]

But in my case the function does not return a single (length-one) value but a vector with several elements. Is there a way to do something like this?

library(data.table)
set.seed(9)
DT <- data.table(x1=letters[sample(x=2L,size=6,replace=TRUE)],
                 x2=letters[sample(x=2L,size=6,replace=TRUE)],
                 y=rep(1:2,3), key="y")
DT
#   x1 x2 y
#1:  a  a 1
#2:  a  b 1
#3:  a  a 1
#4:  a  a 2
#5:  a  b 2
#6:  a  a 2

DT[, lapply(.SD, table), by = y]
# Desired Result, something like this:
# x1_a x2_a x2_b
#    3    2    1
#    3    2    1

Thanks in advance. Also, I would not mind if the result of the function had to have a fixed length.

You simply need to unlist the tables and then coerce the result back to a list:

> DTCounts <- DT[, as.list(unlist(lapply(.SD, table))), by=y]
> DTCounts

   y x1.a x2.a x2.b
1: 1    3    2    1
2: 2    3    2    1
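
To see why this works, it may help to walk through the intermediate steps for a single group. The following is a minimal sketch: the data is written out explicitly to match the table printed in the question (rather than sampled), and the names grp, counts, flat and row are only illustrative.

```r
library(data.table)

# Same rows as the table printed in the question, written out without sampling
DT <- data.table(x1 = c("a", "a", "a", "a", "a", "a"),
                 x2 = c("a", "b", "a", "a", "b", "a"),
                 y  = c(1L, 1L, 1L, 2L, 2L, 2L), key = "y")

# The non-grouping columns of the y == 1 group, i.e. what .SD holds for it
grp <- DT[y == 1L, .(x1, x2)]

counts <- lapply(grp, table)  # list of two tables, one per column
flat   <- unlist(counts)      # one named vector: x1.a, x2.a, x2.b
row    <- as.list(flat)       # named list of length-one elements

names(flat)
# "x1.a" "x2.a" "x2.b"
```

Because each element of the final list has length one, data.table turns it into one column per element and one row per group, with the names produced by unlist as column names.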


If you do not like the dots in the names, you can sub them out:

> setnames(DTCounts, sub("\\.", "_", names(DTCounts)))
> DTCounts

   y x1_a x2_a x2_b
1: 1    3    2    1
2: 2    3    2    1

Note that if not all values in a column are present for each group (i.e., if x2=c("a", "b") when y=1, but x2=c("b", "b") when y=2), then the above breaks, because the groups no longer produce the same set of columns.

The solution is to convert the columns to factors before counting.

# convert every column (including y) to a factor; assign the result to keep it
DT <- DT[, lapply(.SD, as.factor)]

## OR
columnsToConvert <- c("x1", "x2")  # or .. <- setdiff(names(DT), "y") 
DT <- cbind(DT[, lapply(.SD, factor), .SDcols=columnsToConvert], y=DT[, y])
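
As a rough illustration of the factor approach, here is a small sketch on made-up data in which "a" never occurs in x2 for y == 2. The table DT2 and the names cols and res are hypothetical, and the in-place conversion with := is an alternative to the cbind call above, not the method from the answer.

```r
library(data.table)

# Hypothetical data: "a" never occurs in x2 when y == 2
DT2 <- data.table(x1 = c("a", "a", "a", "a"),
                  x2 = c("a", "b", "b", "b"),
                  y  = c(1L, 1L, 2L, 2L))

# Convert the counted columns to factors on the whole table, so that every
# group carries the full set of levels
cols <- c("x1", "x2")
DT2[, (cols) := lapply(.SD, factor), .SDcols = cols]

# table() reports a zero count for levels that are absent from a group,
# so every group yields the same columns
res <- DT2[, as.list(unlist(lapply(.SD, table))), by = y]
res
# y = 1: x1.a = 2, x2.a = 1, x2.b = 1
# y = 2: x1.a = 2, x2.a = 0, x2.b = 2
```

The conversion has to happen before grouping: if factor() were called inside each group, the levels would again be derived from that group's values only.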
