R data.table Group By Created Column
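I'm new to the awesome data.table package and have run into a problem that hopefully has a simple solution. I'd like to filter a data.table, add some columns, and summarise by some columns of that data.table, including one of the columns I create in the j clause.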
If I were using dplyr, it would look like this:
library(dplyr)
mtcars %>%
  filter(vs == 1) %>%
  mutate(trans = ifelse(am == 1, "Manual", "Auto")) %>%
  group_by(gear, carb, trans) %>%
  summarise(num_cars = n(),
            avg_qsec = mean(qsec))
# A tibble: 6 x 5
# Groups: gear, carb [?]
gear carb trans num_cars avg_qsec
<dbl> <dbl> <chr> <int> <dbl>
1 3 1 Auto 3 19.9
2 4 1 Manual 4 19.2
3 4 2 Auto 2 21.4
4 4 2 Manual 2 18.6
5 4 4 Auto 2 18.6
6 5 2 Manual 1 16.9
My attempt with data.table:
library(data.table)
dtmt <- as.data.table(mtcars)
dtmt[vs == 1,
     .(num_cars = .N,
       avg_qsec = mean(qsec),
       trans = ifelse(am == 1,
                      "Manual", "Auto")),
     by = list(gear, carb, trans)]
Error in eval(bysub, xss, parent.frame()) : object 'trans' not found
So a column I create in the j clause can't be used in by? It works fine if I don't try to transform the am column.
dtmt[vs == 1,
.(num_cars = .N,
avg_qsec = mean(qsec)),
by = list(gear, carb, am)]
gear carb am num_cars avg_qsec
1: 4 1 1 4 19.22
2: 3 1 0 3 19.89
3: 4 2 0 2 21.45
4: 4 4 0 2 18.60
5: 4 2 1 2 18.56
6: 5 2 1 1 16.90
Thanks!
After filtering the rows where "vs" is 1, we create a column "trans". Then we use it as a grouping variable for the summary:
dtmt[vs==1 # subset the rows
][, trans := c("Auto", "Manual")[(am==1)+1] # create trans
][, .(num_cars = .N, avg_qsec = mean(qsec)), by = .(gear, carb, trans)]
It can all be done in a single []:
as.data.table(mtcars)[
vs == 1,
.(num_cars = .N, avg_qsec = mean(qsec)),
by = .(gear, carb, trans = ifelse(am == 1, "Manual", "Auto"))]
# gear carb trans num_cars avg_qsec
# 1: 4 1 Manual 4 19.22
# 2: 3 1 Auto 3 19.89
# 3: 4 2 Auto 2 21.45
# 4: 4 4 Auto 2 18.60
# 5: 4 2 Manual 2 18.56
# 6: 5 2 Manual 1 16.90
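The single-call version works because the expressions in by are evaluated on the filtered rows before j runs, so a grouping column can be computed there even though a column defined in j is not visible to by. As a variant sketch of the same idea (assuming a data.table version that provides fifelse, and using keyby only to get the groups in sorted order; neither choice comes from the answers above):

```r
library(data.table)

dtmt <- as.data.table(mtcars)

# Compute the grouping column inline in keyby instead of in j.
# fifelse() is data.table's type-strict counterpart to ifelse();
# keyby groups like by and then sorts the result by the grouping columns.
res <- dtmt[
  vs == 1,
  .(num_cars = .N, avg_qsec = mean(qsec)),
  keyby = .(gear, carb, trans = fifelse(am == 1, "Manual", "Auto"))
]

print(res)
```

With keyby, the rows come back ordered by gear, carb, and trans, which roughly matches the row order of the dplyr output in the question.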