Pass data.table column by name in function
I want to pass a column name to a function and use column indexing and the setorder function:
require(data.table)
data(iris)
top3 = function(t, n) {
setorder(t, n, order=-1)
return ( t[1:3, .(Species, n)])
}
DT = data.table(iris)
top3(DT, Petal.Width)
However, this returns an error:
Error in setorderv(x, cols, order, na.last) : some columns are not in the data.table: n,1
I think I'm misunderstanding how passing bare column names works in R. What are my options?
You can do:
top3 = function(DT, nm) eval(substitute( DT[order(-nm), .(Species, nm)][, head(.SD, 3L)] ))
top3(DT, Petal.Width)
Species Petal.Width
1: virginica 2.5
2: virginica 2.5
3: virginica 2.5
I would advise against (1) calling setorder inside a function, since it reorders the caller's data.table by reference, which is a side effect; (2) indexing with 1:3, since you may use this on a data.table with fewer than three rows in the future, to strange effect; (3) hard-coding 3 instead of making it an argument to the function; and (4) using n for a name... it's just my personal preference to reserve n for counts.
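To make those four points concrete, here is a minimal sketch that takes the column name as a string rather than a bare name. The names `top_n_by`, `col`, and `k` are hypothetical and not from the original posts; it assumes the table has a `Species` column, as `iris` does.

```r
library(data.table)

# Hypothetical helper (illustrative names): the column is passed as a string,
# and the row count `k` is an argument instead of a hard-coded 3.
top_n_by <- function(DT, col, k = 3L) {
  # order() only computes row positions, so the caller's table is not reordered
  idx <- order(-DT[[col]])
  # head() returns at most k rows, so short tables are handled gracefully
  head(DT[idx, c("Species", col), with = FALSE], k)
}

DT <- data.table(iris)
top_n_by(DT, "Petal.Width")
top_n_by(DT[1:2], "Petal.Width")  # returns only the 2 available rows
```

Passing the name as a string avoids non-standard evaluation entirely, at the cost of the caller having to quote the column name.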
Assuming your dataset will always have at least 3 rows and that this is the ONLY operation you want to perform on that data table, it may be in your interest to use setorderv instead, which takes the column name as a string:
top3 = function(t, n) {
setorderv(t, n, -1)
return ( t[1:3, c("Species", n), with=FALSE])
}
DT = data.table(iris)
top3(DT, "Petal.Width")
Result:
Species Petal.Width
1: virginica 2.5
2: virginica 2.5
3: virginica 2.5
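Note that setorderv sorts by reference, so the DT in the calling environment is reordered after the call above. If that is not wanted, one possible variation is to sort a copy. This is only a sketch; `top3_copy` is a hypothetical name, not part of the answer above.

```r
library(data.table)

# Hypothetical variant: same logic as above, but sorting a deep copy
top3_copy <- function(t, n) {
  t <- copy(t)              # copy() so the caller's table is left untouched
  setorderv(t, n, -1L)      # sort the copy in descending order of column n
  head(t[, c("Species", n), with = FALSE], 3L)
}

DT <- data.table(iris)
res <- top3_copy(DT, "Petal.Width")
identical(DT$Petal.Width, iris$Petal.Width)  # TRUE: original order preserved
```

The copy costs memory proportional to the table, so for large tables the by-reference version may still be preferable when the reordering is acceptable.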