data.table & column names

I am using data.table to compute some basic statistics on one column while filtering (grouping) by one or more other columns.

This is the command:

stats <- as.data.frame(mydata[, j = list(Sum = sum(as.numeric(get(selection))),
                                         Average = mean(as.numeric(get(selection))),
                                         Count = length(get(selection))), 
                                by = list(get(filters))])

where:

  • mydata is a data.table with 20 or so columns
  • selection is a column name that is passed programmatically
  • filters is also a column name that is passed programmatically

If I limit myself to one filter (one column), everything works, but I would like to filter by more than one column.

It is possible to do:

by = list(get(filters[1]), get(filters[2]), ...) 

However, that requires knowing in advance how many filters will be used, which is a limitation I don't want.

How do I write the by = argument so that it takes any number of filters (column names)? I tried mget(filters), and that doesn't work.

The by argument of data.table accepts a character vector of column names (see the documentation: help("data.table")). There is no need for get. Just use by = c(filters).

Example:

library(data.table)
DT <- data.table(mtcars)

filters <- c("am", "gear")
DT[, mean(mpg), by=c(filters)]
#   am gear       V1
#1:  1    4 26.27500
#2:  0    3 16.10667
#3:  0    4 21.05000
#4:  1    5 21.38000
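
Applied to the command from the question, the same idea might look like the sketch below. It assumes mtcars as a stand-in for mydata, and the values given to selection and filters are made up for illustration. Only the by = part changes; get(selection) is kept in j because selection is still a single column name held in a string.

```r
library(data.table)

# Stand-in for mydata (assumption: any data.table with these columns would do)
mydata <- as.data.table(mtcars)

selection <- "mpg"            # column to summarise, passed as a string
filters   <- c("am", "gear")  # any number of grouping columns, passed as strings

stats <- as.data.frame(
  mydata[, list(Sum     = sum(as.numeric(get(selection))),
                Average = mean(as.numeric(get(selection))),
                Count   = length(get(selection))),
         by = c(filters)]     # character vector of column names, no get() needed
)

print(stats)
```

The result has one row per combination of the columns named in filters, followed by the Sum, Average and Count columns. Changing filters to a vector of a different length requires no change to the call itself.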
