Data tables and column names

Question

I am using data.table to get some basic statistics on one column while filtering by one or more other columns.

This is the command:

stats <- as.data.frame(mydata[, j = list(Sum = sum(as.numeric(get(selection))),
                                         Average = mean(as.numeric(get(selection))),
                                         Count = length(get(selection))), 
                                by = list(get(filters))])

where:

mydata is a data.table with 20 or so columns
selection is a column name which is passed programmatically
filters is also a column name which is passed programmatically

If I limit myself to one filter (one column), everything works, but I would like to filter by more than one column.

It is possible to do:

by = list(get(filters[1]), get(filters[2]), ...)

However, that requires that I know how many filters will be used. That is a limitation I don't want to have.

How do I write the by = argument to take any number of filters (column names)? I just tried mget(filters) and that doesn't work.

Answer 1

data.table's by argument accepts a character vector of column names (see the documentation: help("data.table")). There is no need for get. Just use by = c(filters).

Example:

library(data.table)
DT <- data.table(mtcars)

filters <- c("am", "gear")
DT[, mean(mpg), by=c(filters)]
#   am gear       V1
#1:  1    4 26.27500
#2:  0    3 16.10667
#3:  0    4 21.05000
#4:  1    5 21.38000
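
Applied to the command in the question, the same idea might look like the sketch below. It uses mtcars as a stand-in for mydata and sets selection to "mpg" purely for illustration; both are assumptions, not values from the original post. get(selection) is kept as in the question, and .N (data.table's built-in per-group row count) is swapped in for length(get(selection)).

```r
library(data.table)

# Stand-in data: mtcars replaces the asker's mydata (assumption for illustration)
mydata <- data.table(mtcars)

# Column names passed programmatically, as in the question
selection <- "mpg"              # hypothetical choice of column to summarise
filters   <- c("am", "gear")    # any number of grouping columns

# by takes the character vector directly, so no get()/mget() is needed there
stats <- as.data.frame(
  mydata[, list(Sum     = sum(as.numeric(get(selection))),
                Average = mean(as.numeric(get(selection))),
                Count   = .N),  # rows per group
         by = c(filters)]
)

print(stats)
```

Because filters is an ordinary character vector, adding or removing a grouping column only means changing that vector; the call itself stays the same.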

Answer 1
6 Accepted 2014-09-30 14:55:56