
Set multiple columns in R `data.table` with a named list and `:=`

Using := to create new columns is one of my favorite features of data.table. I know of two approaches to using it to add multiple columns at once. Here's a simple example:

library(data.table)

dt <- data.table("widths" = seq(2, 10, 2), "heights" = 8:4)
dt
   widths heights
1:      2       8
2:      4       7
3:      6       6
4:      8       5
5:     10       4

Suppose I want to add two columns, one for areas and another for perimeters. The first approach is a call such as:

new_cols <- c("areas", "perimeters")

my_fun <- function(x, y){
  areas <- x * y
  perimeters <- 2*(x + y)
  return(list(areas = areas, perimeters = perimeters))
}

dt[ , (new_cols) := my_fun(widths, heights)]
dt
   widths heights areas perimeters
1:      2       8   16        20
2:      4       7   28        22
3:      6       6   36        24
4:      8       5   40        26
5:     10       4   40        28

Equivalently, we could use the functional form of := as follows:

dt[ , `:=`("areas" = widths * heights, "perimeters" = 2*(widths + heights))]

Both of these approaches require entering the names of the new columns in advance. You can enter them manually, you can save them in an object prior to creating the columns, or you could have a function on the left-hand side of := that produces names. What I don't know is a way to pass both the names and the values to := in a single call.

Is there a way to do this? Here's an example of what I'm hoping to do:

dt[ , (new_cols) := NULL] # delete the previously added area and perimeter cols.
dt[ , `:=`(my_fun(widths, heights))]
dt
   widths heights areas perimeters
1:      2       8   16        20
2:      4       7   28        22
3:      6       6   36        24
4:      8       5   40        26
5:     10       4   40        28

Ideally, there's a way to make := see that my_fun() returns names and then use these as the names for the new columns. I know the above produces an error, but I'm wondering if there's a simple way to get the desired functionality, since this would be useful in larger problems where there are many columns or where the column names depend on the input to the function.

Edit: The key thing I'm looking for is a way to assign these columns by reference, i.e., using := or set(), and I also want to maintain the class of the output as a data.table.

This is too long for a comment, and it's not pretty:

dt[, {
    # Compute the named list once, then add each element by reference.
    a <- my_fun(widths, heights)
    for (x in names(a))
        set(dt, j = x, value = a[[x]])
}]
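
The loop can probably be avoided. Here is a minimal sketch, assuming the result of my_fun() is first stored in a temporary object (res and dt2 are names introduced here only for illustration). It relies on two things: when the left-hand side of := is a call, it is evaluated in the calling scope rather than inside the table, and set() accepts a character vector of column names together with a list of values.

```r
library(data.table)

dt <- data.table(widths = seq(2, 10, 2), heights = 8:4)

my_fun <- function(x, y) {
  list(areas = x * y, perimeters = 2 * (x + y))
}

# Evaluate the function once and keep the named list
# (res is a name used only for this illustration).
res <- my_fun(dt$widths, dt$heights)

# Option 1: names(res) supplies the column names, res supplies the values.
dt[, names(res) := res]

# Option 2: the same assignment with set(), on a fresh copy of the data.
dt2 <- data.table(widths = seq(2, 10, 2), heights = 8:4)
res2 <- my_fun(dt2$widths, dt2$heights)
set(dt2, j = names(res2), value = res2)
```

Both options add the columns by reference and keep dt a data.table. Note that option 1 assumes dt has no column that is itself called res; otherwise the right-hand side would pick up that column instead.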

Alternatively, if my_fun() is a function you wrote yourself, could you pass dt into it and add the columns from inside the function?
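
A minimal sketch of that idea, assuming you control the function. add_rect_cols() is a hypothetical helper written for this example, not something data.table provides. It builds the named list internally and hands it to set(), so the caller's table is modified by reference.

```r
library(data.table)

# Hypothetical helper: computes the named list and adds every element
# to the table that was passed in, by reference.
add_rect_cols <- function(dt) {
  res <- list(
    areas = dt$widths * dt$heights,
    perimeters = 2 * (dt$widths + dt$heights)
  )
  set(dt, j = names(res), value = res)
  invisible(dt)
}

dt <- data.table(widths = seq(2, 10, 2), heights = 8:4)
add_rect_cols(dt)  # dt now has areas and perimeters; no reassignment needed
```

The column names live in one place (inside the function), so they can depend on the function's input without being repeated at the call site.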

I don't think this is what you are looking for, but it works:

data.frame(dt, my_fun(dt$widths, dt$heights))

#  widths heights areas perimeters
#1      2       8    16         20
#2      4       7    28         22
#3      6       6    36         24
#4      8       5    40         26
#5     10       4    40         28

Unfortunately, data.table(dt, my_fun(dt$widths, dt$heights)) doesn't work.
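
If a copy is acceptable, one way to keep the data.table class is to convert the named list with as.data.table() before binding. This is only a sketch (combined is a name made up for this example), and it does not assign by reference, so it does not meet the requirement in the question's edit.

```r
library(data.table)

dt <- data.table(widths = seq(2, 10, 2), heights = 8:4)
my_fun <- function(x, y) list(areas = x * y, perimeters = 2 * (x + y))

# Convert the named list to a data.table, then bind it column-wise.
# This creates a new object; dt itself is left unchanged.
combined <- cbind(dt, as.data.table(my_fun(dt$widths, dt$heights)))
```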
