Subset data.table columns independently

Question

I'm starting with the table dt below and trying to subset its columns by the list keys:

library(data.table)

set.seed(123)

randomchar <- function(n, w){
  chararray <- replicate(w, sample(c(letters, LETTERS), n, replace = TRUE))
  apply(chararray, 1, paste0, collapse = "")
}

dt <- data.table(x = randomchar(1000, 3),
                 y = randomchar(1000, 3),
                 z = randomchar(1000, 3),
                 key = c("x", "y", "z"))

keys <- with(dt, list(x = sample(x, 501),
              y = sample(y, 500),
              z = sample(z, 721)))

I can get the result I want by using a loop:

desired <- copy(dt)

for(i in seq_along(keys)){
  keyname <- names(keys)[i]
  desired <- desired[get(keyname) %in% keys[[i]]]
}

desired

The question is: is there a more idiomatic data.table way to do this subset?

I tried using CJ, as in dt[CJ(keys)], but it takes a very long time.
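One plausible reason for the slowness, assuming the join is built from the full cross product of the three key vectors: CJ generates every combination of its inputs, so the row count is the product of the input lengths. The sketch below uses a small hypothetical list (small_keys is not from the question) to show the growth, then multiplies the key counts given above.

```r
library(data.table)

# Hypothetical small key lists, only to show how a cross join grows
small_keys <- list(x = c("a", "b"),
                   y = c("c", "d", "e"),
                   z = c("f", "g"))

# CJ builds one row per combination of the supplied vectors
combos <- do.call(CJ, small_keys)
n_small <- nrow(combos)   # 2 * 3 * 2 = 12

# Key counts from the question: 501, 500 and 721
n_full <- prod(c(501, 500, 721))   # 180610500 combinations
```

With roughly 180 million rows to build and join against, the cross join costs far more than three independent %in% filters over 1000 rows.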

Answer 1

How about building a mask and filtering dt on it:

dt[Reduce(`&`, Map(function(key, col) col %in% key, keys, dt)),]
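Note that Map pairs keys with the columns of dt by position, so the one-liner relies on both being in the same order (x, y, z here). The following is a minimal sketch of the same mask idea with the pairing done by name instead. It uses a small hypothetical table (toy and toy_keys are illustrative stand-ins for dt and keys) and checks the result against the loop from the question.

```r
library(data.table)

# Small hypothetical table and key list, standing in for dt and keys
toy <- data.table(x = c("a", "b", "c", "d"),
                  y = c("p", "q", "r", "s"),
                  z = c("u", "v", "w", "k"))
toy_keys <- list(x = c("a", "b", "c"),
                 y = c("p", "r", "s"),
                 z = c("u", "w", "k"))

# One logical vector per key column, looked up by name rather than position
masks <- Map(function(col, key) toy[[col]] %in% key,
             names(toy_keys), toy_keys)

# Elementwise AND collapses the per-column masks into one row mask
mask <- Reduce(`&`, masks)
result <- toy[mask]

# Loop from the question, for comparison
expected <- copy(toy)
for (nm in names(toy_keys)) {
  expected <- expected[get(nm) %in% toy_keys[[nm]]]
}

same <- isTRUE(all.equal(result, expected))
```

Each column is scanned once and the table is subset a single time, whereas the loop creates an intermediate table per key.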


Solution 1
4 accepted 2016-10-11 12:16:11
