Product of a list of columns in an R data.table

Question

I have a large list of column names (variables) of an R data.table, and I want to create a column containing the product of these columns.

Example:

library(data.table)

col_names <- c("season_1","season_2","season_3")
DT_example <- data.table(id=1:4,
                 season_1=c(1,1,0,0),
                 season_2=c(0,1,1,1),
                 season_3=c(1,1,1,0),
                 product=1)

data.table:

   id season_1 season_2 season_3 product
1:  1        1        0        1       1
2:  2        1        1        1       1
3:  3        0        1        1       1
4:  4        0        1        0       1

My current solution uses a "for" loop, but it is not very efficient:

for(inc in col_names){
  nm1 <- as.symbol(inc)
  DT_example[,product:= product * eval(nm1)]
}

Result:

   id season_1 season_2 season_3 product
1:  1        1        0        1       0
2:  2        1        1        1       1
3:  3        0        1        1       0
4:  4        0        1        0       0

Is there a faster way to do this using native data.table syntax?

Answer 1

Here are four options. The first one is by far the most efficient, but it assumes the columns contain only zeros and ones.

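# 1. Row-wise minimum; equals the product only when all values are 0 or 1
DT_example[, product := do.call(pmin, .SD), .SDcols = patterns("season")]

# 2. Multiply the selected columns together, one column at a time
DT_example[, product := Reduce(`*`, .SD), .SDcols = patterns("season")]

# 3. Apply prod() to each row of the selected columns
DT_example[, product := apply(.SD, 1, prod), .SDcols = patterns("season")]

# 4. Reshape to long format and take the product per id (.SD here holds every column)
DT_example[, product := melt(.SD, id.vars = "id")[, prod(value), by = id]$V1]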

# > DT_example
#    id season_1 season_2 season_3 product
# 1:  1        1        0        1       0
# 2:  2        1        1        1       1
# 3:  3        0        1        1       0
# 4:  4        0        1        0       0

Data:

DT_example <- data.table(
  id=1:4,
  season_1=c(1,1,0,0),
  season_2=c(0,1,1,1),
  season_3=c(1,1,1,0),
  product=1
)
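
The zeros-and-ones caveat can be checked directly. The sketch below uses a hypothetical table, DT_general (not part of the question), holding values other than 0 and 1. There, pmin returns the row minimum rather than the product, while Reduce still multiplies the columns. The column names via_pmin and via_reduce are made up for illustration.

```r
library(data.table)

# Hypothetical data with values other than 0 and 1 (not from the question)
DT_general <- data.table(
  id = 1:3,
  season_1 = c(2, 1, 0),
  season_2 = c(3, 1, 5),
  season_3 = c(4, 1, 6)
)

# pmin gives the row-wise minimum, which equals the product only for 0/1 data
DT_general[, via_pmin := do.call(pmin, .SD), .SDcols = patterns("season")]

# Reduce multiplies the columns one after another, so it works for any numeric data
DT_general[, via_reduce := Reduce(`*`, .SD), .SDcols = patterns("season")]

# Row 1 holds 2, 3 and 4: via_pmin is 2, via_reduce is 24
DT_general[, .(id, via_pmin, via_reduce)]
```

So the first option is only a safe shortcut when the columns are known to be 0/1 indicators.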

Answer 2

We can use prod grouped by row number after selecting the columns in .SDcols. prod also has an na.rm argument to drop NA elements if needed.

DT_example[,  Product := prod(.SD, na.rm = TRUE), by = 1:nrow(DT_example),
     .SDcols = patterns("season")]

-output

DT_example
#   id season_1 season_2 season_3 product Product
#1:  1        1        0        1       1       0
#2:  2        1        1        1       1       1
#3:  3        0        1        1       1       0
#4:  4        0        1        0       1       0
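
To see what na.rm changes, here is a small sketch on a hypothetical table, DT_na (not from the question), that contains one missing value. The column names strict and lenient are made up for illustration.

```r
library(data.table)

# Hypothetical data containing a missing value (not from the question)
DT_na <- data.table(
  id = 1:2,
  season_1 = c(1, 1),
  season_2 = c(NA, 1),
  season_3 = c(1, 0)
)

# Without na.rm, any NA in a row makes that row's product NA
DT_na[, strict := prod(.SD), by = 1:nrow(DT_na), .SDcols = patterns("season")]

# With na.rm = TRUE, the NA is dropped before multiplying
DT_na[, lenient := prod(.SD, na.rm = TRUE), by = 1:nrow(DT_na),
      .SDcols = patterns("season")]

# Row 1: strict is NA, lenient is 1; row 2: both are 0
DT_na
```

Whether dropping NA is appropriate depends on the data: na.rm = TRUE treats a missing season as if it were 1.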

Answer 3

I think you could use the "apply" and "prod" functions:

DT_example$product = apply(DT_example[,2:4], 1, prod)

This applies the "prod" function (which multiplies all the elements it receives) to every row (selected by the "1" argument; "2" would mean columns) of "DT_example[,2:4]".
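
Selecting by position (2:4) breaks if the column order changes. Since the question already has the names in col_names, one possible variant passes that vector to .SDcols. This is a sketch using the data from Answer 1; the column name product_reduce is made up for illustration.

```r
library(data.table)

col_names <- c("season_1", "season_2", "season_3")
DT_example <- data.table(
  id = 1:4,
  season_1 = c(1, 1, 0, 0),
  season_2 = c(0, 1, 1, 1),
  season_3 = c(1, 1, 1, 0)
)

# Select the columns by name through .SDcols, then take the row-wise product
DT_example[, product := apply(.SD, 1, prod), .SDcols = col_names]

# Same selection, multiplied column by column without converting to a matrix
DT_example[, product_reduce := Reduce(`*`, .SD), .SDcols = col_names]

# Both columns are 0, 1, 0, 0 for ids 1 to 4
DT_example
```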

Answer 1
3 accepted 2020-10-14 13:10:24

Answer 2
0 2020-10-14 13:14:39

Answer 3
0 2020-10-14 13:15:38
