
Pass data.table column by name in function

I want to pass a column name to a function and use column indexing and the setorder function:

require(data.table)
data(iris)

top3 = function(t, n) {
  setorder(t, n, order=-1)
  return ( t[1:3, .(Species, n)])
}

DT = data.table(iris)
top3(DT, Petal.Width)

However, this returns an error:

Error in setorderv(x, cols, order, na.last) : some columns are not in the data.table: n,1

I think I'm misunderstanding how passing bare column names works in R. What are my options?

You can do:

top3 = function(DT, nm) eval(substitute( DT[order(-nm), .(Species, nm)][, head(.SD, 3L)] ))
top3(DT, Petal.Width)

     Species Petal.Width
1: virginica         2.5
2: virginica         2.5
3: virginica         2.5
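To see why this works, it may help to look at what substitute returns before eval runs it. The sketch below uses a hypothetical helper, show_call (not part of data.table or of the answer above), that only builds the call and does not evaluate it:

```r
# show_call is a hypothetical helper for illustration only.
# substitute() swaps each argument name for the expression the caller typed,
# so nm becomes the bare symbol Petal.Width inside the call.
show_call = function(DT, nm) substitute(DT[order(-nm), .(Species, nm)])

e = show_call(DT, Petal.Width)
print(e)        # the unevaluated call, with Petal.Width in place of nm
print(e[[3]])   # the i argument of the call: order(-Petal.Width)
```

Because the call is built first and evaluated afterwards, data.table sees Petal.Width as if it had been typed directly, rather than a variable called nm.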

I would advise against (1) using setorder inside a function, since it has side effects (it reorders the caller's table by reference); (2) indexing with 1:3, since on a data.table with fewer than three rows this pads the result with NA rows; (3) hard-coding 3 instead of making it an argument to the function; and (4) using n for a name, though that is just my personal preference to reserve n for counts.
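Putting those four points together, one possible variant looks like the sketch below. The name top_n and the default n = 3L are my own choices for illustration, and it assumes the table has a Species column as in iris:

```r
library(data.table)

# top_n is a hypothetical name; nm is a bare column name, n a row count.
top_n = function(DT, nm, n = 3L) {
  nm   = substitute(nm)                    # capture the bare column name as a symbol
  cols = c("Species", as.character(nm))    # columns to return, as strings
  ord  = order(-eval(nm, DT))              # look the column up inside DT, sort descending
  head(DT[ord, cols, with = FALSE], n)     # head() copes with fewer than n rows
}

DT = data.table(iris)
top_n(DT, Petal.Width)          # top three rows by Petal.Width
top_n(DT[1:2], Petal.Width)     # only two rows available, so two rows come back
```

Since order() returns an index vector and nothing is sorted by reference, DT keeps its original row order after the call.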

Assuming your dataset will always have at least 3 rows and that this is the ONLY operation you want to perform on that data table, it may be in your interest to use setorderv instead, which takes the column name as a character string.

top3 = function(t, n) {
  setorderv(t, n, -1)
  return ( t[1:3, c("Species", n), with=FALSE])
}

DT = data.table(iris)
top3(DT, "Petal.Width")

Result:

     Species Petal.Width
1: virginica         2.5
2: virginica         2.5
3: virginica         2.5
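Note that setorderv sorts by reference, so after the call above DT itself is reordered. If that matters, a rough workaround is to sort a copy instead. The sketch below assumes the extra memory for the copy is acceptable; top3_copy is a hypothetical name:

```r
library(data.table)

# top3_copy is a hypothetical variant of top3 that leaves its input untouched.
top3_copy = function(t, n) {
  t = copy(t)                                   # sort a copy, not the caller's table
  setorderv(t, n, -1)                           # descending order on column n
  head(t[, c("Species", n), with = FALSE], 3L)  # at most three rows
}

DT = data.table(iris)
top3_copy(DT, "Petal.Width")
DT[1:3, Petal.Width]   # DT is still in the original iris row order
```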
