Transposed vector by group within data.table
What is the idiomatic data.table approach to produce a data.table with separate columns for the elements of a vector returned by a function, calculated by group?
Consider the data.table:
library(data.table)
data(iris)
setDT(iris)
If the function is range(), I'd want output similar to:
iris[, .(min_petal_width = min(Petal.Width),
max_petal_width = max(Petal.Width)
), keyby = Species] # produces desired output
but using the range() function.
I can use dcast, but it's ugly:
dcast(
iris[, .( petal_width = range(Petal.Width),
value = c("min_petal_width", "max_petal_width")),
keyby = Species],
Species ~ value, value.var = "petal_width")
I'm hoping there's a simpler expression, along the lines of:
iris[, (c("min_petal_width","max_petal_width")) = range(Petal.Width),
keyby = Species] # doesn't work
You can also do:
iris[, lapply(list(min=min, max=max), function(f) f(Petal.Width)), by=Species]
# Species min max
# 1: setosa 0.1 0.6
# 2: versicolor 1.0 1.8
# 3: virginica 1.4 2.5
Your approach was very close. Just remember that you need to feed a list to data.table and it will happily accept it. Hence, you can use:
iris[, c("min_petal_width","max_petal_width") := as.list(range(Petal.Width)),
by = Species]
I misread the question. Since you want to aggregate the result instead of adding new columns, you could use:
cols <- c("min_petal_width", "max_petal_width")
iris[, setNames(as.list(range(Petal.Width)), cols), keyby = Species]
But I'm sure there are a few other data.table approaches, too.
If readability and conciseness are really important to you, I would define a custom function or binary operator that you can then easily use in your data.table subset expression, e.g.:
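To see why the list matters, here is a minimal sketch comparing what `j` returns in each case. The object names `dt`, `long`, `wide`, and `named` are only illustrative, and the `quantile()` call is an extra example that is not part of the original question. It assumes the built-in `iris` data set.

```r
library(data.table)

# Work on a copy so the built-in data.frame is left untouched
dt <- as.data.table(iris)

# Bare vector in j: each element becomes its own row (2 rows per group)
long <- dt[, range(Petal.Width), keyby = Species]

# List in j: each element becomes its own column (1 row per group)
wide <- dt[, as.list(range(Petal.Width)), keyby = Species]

# If the function already returns a named vector, as.list() keeps the
# names, so no setNames() call is needed
named <- dt[, as.list(quantile(Petal.Width, c(0.25, 0.75))), keyby = Species]

dim(long)     # 6 rows, 2 columns
dim(wide)     # 3 rows, 3 columns
names(wide)   # "Species" "V1" "V2"
names(named)  # "Species" "25%" "75%"
```

The unnamed list produces the default column names `V1` and `V2`, which is why the answer above wraps the list in `setNames()`.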
# custom function
.nm <- function(v,vnames){
`names<-`(as.list(v),vnames)
}
# custom binary operator
`%=%` <- function(vnames,v){
`names<-`(as.list(v),vnames)
}
# using custom function
iris[, .nm(range(Petal.Width),c("min_petal_width", "max_petal_width")), keyby = Species]
# using custom binary operator
iris[, c("min_petal_width", "max_petal_width") %=% range(Petal.Width), keyby = Species]