Is there a fast way to compare two lists of vectors in R?
Suppose there are two lists of the form:
list_a <- list(c(1,2,3,4,5), c(6,7,8,9,10), c(11,12,13,14))
list_b <- list(c(1,9,3,7,5), c(2,7,13,9,10), c(1,12,5,14))
I want to compare each element of list_a against each element of list_b and count the matching values (that is, the sum of the logical vector returned by %in%). In other words, I want to compute the following:
sum(c(1,2,3,4,5) %in% c(1,9,3,7,5))
sum(c(1,2,3,4,5) %in% c(2,7,13,9,10))
sum(c(1,2,3,4,5) %in% c(1,12,5,14))
sum(c(6,7,8,9,10) %in% c(1,9,3,7,5))
sum(c(6,7,8,9,10) %in% c(2,7,13,9,10))
sum(c(6,7,8,9,10) %in% c(1,12,5,14))
sum(c(11,12,13,14) %in% c(1,9,3,7,5))
sum(c(11,12,13,14) %in% c(2,7,13,9,10))
sum(c(11,12,13,14) %in% c(1,12,5,14))
I tried the following code using the sapply() function, and the output is as expected (columns map to elements of list_a and rows map to elements of list_b). However, when the lists are long this code is too slow (imagine list_a with 10000 elements and list_b with 10000 elements).
test <- sapply(list_a, function(x) {
  out_sum <- sapply(list_b, function(y) {
    matches <- sum(x %in% y)
    return(matches)
  })
  return(out_sum)
})
Output:
     [,1] [,2] [,3]
[1,]    3    2    0
[2,]    1    3    1
[3,]    2    0    2
Does anyone have an idea?
You can try using the map function from the purrr package. It reduces the runtime by more than half.
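library(purrr)
# For each y in list_b, count how many values of each x in list_a occur in y.
out_sum <- list_b %>% map(function(y) {
  list_a %>% map(function(x) sum(x %in% y))
})
# Reshape so that rows map to list_b and columns map to list_a.
out_matrix <- matrix(unlist(out_sum), ncol = length(list_a), byrow = TRUE)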
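How large the speed-up is will depend on the data, so it may be worth timing both versions on your own input. The following is a minimal sketch of such a check, not a rigorous benchmark. The helper names (nested_sapply, nested_map, make_list), the list sizes, and the value range are all assumptions made for illustration, and the purrr variant only runs if that package is installed.

```r
# Nested sapply, as in the question.
nested_sapply <- function(list_a, list_b) {
  sapply(list_a, function(x) sapply(list_b, function(y) sum(x %in% y)))
}

# Nested purrr::map, as in the answer above.
nested_map <- function(list_a, list_b) {
  out <- purrr::map(list_b, function(y) {
    purrr::map(list_a, function(x) sum(x %in% y))
  })
  matrix(unlist(out), ncol = length(list_a), byrow = TRUE)
}

set.seed(1)  # arbitrary seed, only so the random data is reproducible
# Hypothetical test data: n vectors of 5 values drawn from 1..100.
make_list <- function(n) replicate(n, sample(100, 5), simplify = FALSE)
big_a <- make_list(200)  # sizes are assumptions; scale up gradually
big_b <- make_list(200)

t_sapply <- system.time(res_sapply <- nested_sapply(big_a, big_b))
print(t_sapply[["elapsed"]])

if (requireNamespace("purrr", quietly = TRUE)) {
  t_map <- system.time(res_map <- nested_map(big_a, big_b))
  print(t_map[["elapsed"]])
  # Both variants must agree before their timings are worth comparing.
  stopifnot(all(res_sapply == res_map))
}
```

Both versions still perform one %in% call per pair of vectors, so the total work grows with length(list_a) * length(list_b) either way.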
Another option is to use outer:
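# x indexes list_a and y indexes list_b.
check <- function(x, y) sum(list_a[[x]] %in% list_b[[y]])
# Note: rows map to list_a and columns to list_b here, i.e. the transpose
# of the sapply() result in the question; wrap in t() to match it.
test <- outer(seq_along(list_a), seq_along(list_b), Vectorize(check))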
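The approaches above all evaluate %in% once per pair of vectors. A different idea, not taken from the answers in this thread, is to encode each list as a matrix over the distinct values and get all pairwise counts from a single matrix product. The sketch below uses base R only; the function name count_matches is hypothetical, and it assumes the dense matrices (one row per vector, one column per distinct value) fit in memory.

```r
# Sketch: all pairwise match counts via one matrix product.
count_matches <- function(list_a, list_b) {
  vals <- unique(c(unlist(list_a), unlist(list_b)))  # every distinct value
  n_a <- length(list_a)
  n_b <- length(list_b)
  n_v <- length(vals)

  # Row index (which vector) and column index (which value) per element.
  a_row <- rep(seq_len(n_a), lengths(list_a))
  a_col <- match(unlist(list_a), vals)
  b_row <- rep(seq_len(n_b), lengths(list_b))
  b_col <- match(unlist(list_b), vals)

  # A[i, v]: how often value v occurs in list_a[[i]]. Keeping the counts
  # means duplicates in x are counted, just as sum(x %in% y) does.
  A <- matrix(tabulate((a_col - 1L) * n_a + a_row, nbins = n_a * n_v),
              nrow = n_a)
  # B[j, v]: 1 if value v occurs in list_b[[j]], otherwise 0.
  B <- matrix(as.integer(tabulate((b_col - 1L) * n_b + b_row,
                                  nbins = n_b * n_v) > 0),
              nrow = n_b)

  # B %*% t(A): entry [j, i] equals sum(list_a[[i]] %in% list_b[[j]]).
  tcrossprod(B, A)
}

list_a <- list(c(1,2,3,4,5), c(6,7,8,9,10), c(11,12,13,14))
list_b <- list(c(1,9,3,7,5), c(2,7,13,9,10), c(1,12,5,14))
fast <- count_matches(list_a, list_b)

# Cross-check against the nested sapply() from the question.
slow <- sapply(list_a, function(x) sapply(list_b, function(y) sum(x %in% y)))
stopifnot(all(fast == slow))
```

The result has the same orientation as the sapply() version (rows map to list_b, columns to list_a). If the number of distinct values is large, the dense matrices may become too big; a sparse matrix representation could be worth trying in that case.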