Product of list of columns in R data.table

Question

I have a large list of column names (variables) of an R data.table and I want to create a column containing the product of these columns.

Example:

col_names <- c("season_1","season_2","season_3")
DT_example <- data.table(id=1:4,
                 season_1=c(1,1,0,0),
                 season_2=c(0,1,1,1),
                 season_3=c(1,0,1,0),
                 product=1)

data.table:

   id season_1 season_2 season_3 product
1:  1        1        0        1       1
2:  2        1        1        1       1
3:  3        0        1        1       1
4:  4        0        1        0       1

The solution I have is using a "for" loop but it is not very efficient:

for(inc in col_names){
  nm1 <- as.symbol(inc)
  DT_example[,product:= product * eval(nm1)]
}

result:

   id season_1 season_2 season_3 product
1:  1        1        0        1       0
2:  2        1        1        0       0
3:  3        1        1        1       1
4:  4        0        1        0       0

Is there a faster way to do this using data.table native syntax?

Answer 1

Here are four options. The first one is by far the most efficient but assumes we are dealing with only zeros and ones.

DT_example[, product := do.call(pmin, .SD), .SDcols = patterns("season")]

DT_example[, product := Reduce(`*`, .SD), .SDcols = patterns("season")]

DT_example[, product := apply(.SD, 1, prod), .SDcols = patterns("season")]

DT_example[, product := melt(.SD, id.vars = "id")[, prod(value), by = id]$V1]

# > DT_example
#    id season_1 season_2 season_3 product
# 1:  1        1        0        1       0
# 2:  2        1        1        1       1
# 3:  3        0        1        1       0
# 4:  4        0        1        0       0

Data:

DT_example <- data.table(
  id=1:4,
  season_1=c(1,1,0,0),
  season_2=c(0,1,1,1),
  season_3=c(1,1,1,0),
  product=1
)

Answer 2

We can use prod grouped by sequence of rows after selecting the columns in .SDcols . With prod , there is na.rm option as well to remove NA elements if needed.

DT_example[,  Product := prod(.SD, na.rm = TRUE), by = 1:nrow(DT_example),
     .SDcols = patterns("season")]

-output

DT_example
#   id season_1 season_2 season_3 product Product
#1:  1        1        0        1       1       0
#2:  2        1        1        1       1       1
#3:  3        0        1        1       1       0
#4:  4        0        1        0       1       0

Answer 3

I think you could use "apply" and "prod" functions:

DT_example$product = apply(DT_example[,2:4], 1, prod)

This is applying the "prod" function (multiplies every element of what ir receives), to every line (defined by the "1" argument, as "2" would be column), of "DT_example[,2:4]".

Product of list of columns in R data.table

Question

3 answers

solution1
3 ACCPTED 2020-10-14 13:10:24

solution2
0 2020-10-14 13:14:39

solution3
0 2020-10-14 13:15:38

Product of list of columns in R data.table

Question

3 answers

solution1 3 ACCPTED 2020-10-14 13:10:24

solution2 0 2020-10-14 13:14:39

solution3 0 2020-10-14 13:15:38

solution1
3 ACCPTED 2020-10-14 13:10:24

solution2
0 2020-10-14 13:14:39

solution3
0 2020-10-14 13:15:38