
Variable as name in aggregate list of data.table

I'm aggregating a data.table in R (data.table v1.12.2) and I need to use the value of a variable as the name of the aggregated column. For example:

library(data.table)

DT <- data.table(x= 1:5, y= c('A', 'A', 'B', 'B', 'B'))

aggname <- 'max_x'  ## 'max_x' should be the name of the aggregated column

DT2 <- DT[, list(aggname= max(x)), by= y]
DT2
   y aggname  <- This should be 'max_x' not 'aggname'!
1: A       2
2: B       5

I can rename the column(s) afterwards with something like:

setnames(DT2, 'aggname', aggname)
DT2
   y max_x
1: A     2
2: B     5

But I would first have to check that the placeholder name 'aggname' doesn't clash with an existing column name. Is there a better way of doing this?

We can use setNames on the list returned in j:

DT[, setNames(.(max(x)), aggname), by = y]
#   y max_x
#1: A     2
#2: B     5
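
As a rough illustration of why this works (a minimal base R sketch, with `literal` and `computed` as hypothetical names): in `list(aggname = ...)` the name to the left of `=` is taken literally, while `setNames()` receives the names as an ordinary argument, so the variable is evaluated.

```r
aggname <- 'max_x'

# The name on the left of `=` is used as written; the variable is not looked up
literal <- list(aggname = 5L)
names(literal)   # "aggname"

# setNames() gets the names as a regular argument, so `aggname` is evaluated
computed <- setNames(list(5L), aggname)
names(computed)  # "max_x"
```

The same thing happens inside `DT[, ...]`, since `.()` there is just an alias for `list()`.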

aggname2 <- 'min_x'
DT[, setNames(.(max(x), min(x)), c(aggname, aggname2)), by = y]
#   y max_x min_x
#1: A     2     1
#2: B     5     3
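
If the column names and the aggregation functions are both only known at run time, a related sketch is to keep them together in a named list and `lapply` over it in j; `lapply` preserves the names, so they become the column names. Here `aggs` and `DT3` are hypothetical names, and this assumes every function is applied to the same column `x`.

```r
library(data.table)

DT <- data.table(x = 1:5, y = c('A', 'A', 'B', 'B', 'B'))

# Hypothetical named list: names are the output columns, values the functions
aggs <- list(max_x = max, min_x = min)

# lapply() returns a list carrying the names of `aggs`,
# which data.table uses as the column names of the result
DT3 <- DT[, lapply(aggs, function(f) f(x)), by = y]
DT3
```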

Another option is lst (from tibble, re-exported by dplyr), which lets you unquote the name with !! and :=

library(dplyr)
DT[, lst(!! aggname := max(x)), by = y]
#   y max_x
#1: A     2
#2: B     5


DT[, lst(!! aggname := max(x), !! aggname2 := min(x)), by = y]
#   y max_x min_x
#1: A     2     1
#2: B     5     3
