
Matching between a vector and multiple vectors in a list in R

I have a list of vectors such as:

> list

[[1]]
[1] "a" "m" "l" "s" "t" "o"

[[2]]
[1] "a" "y" "o" "t" "e"

[[3]]
[1] "n" "a" "s" "i" "d"

I want to find the matches between each vector and the remaining ones (i.e., between the 1st and the other two, the 2nd and the other two, and so on) and keep the pair with the highest number of matches. I could do it with a "for" loop, intersecting the vectors pair by pair. For example:

for (i in 2:3) { intersect(list[[1]],list[[i]]) }

and then save the output into a vector or some other structure. However, this seems inefficient to me (given that I have thousands of vectors rather than 3), and I am wondering whether R has a built-in function to do this in a clever way.

So the question would be:

Is there a way to look for matches of one vector to a list of vectors without the explicit use of a "for" loop?

I don't believe there is a built-in function for this. The best you could try is something like:

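lsts <- lapply(1:5, function(x) sample(letters, 10)) # make some random data (dumped below)
combs <- combn(length(lsts), 2) # every pair of indices, one pair per column
# count the shared elements of each pair and pick the pair with the most
maxcomb <- which.max(apply(combs, 2,
  function(ix) length(intersect(lsts[[ix[1]]], lsts[[ix[2]]]))))
lsts <- lsts[combs[, maxcomb]] # keep only the best-matching pair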
# [[1]]
#  [1] "m" "v" "x" "d" "a" "g" "r" "b" "s" "t"

# [[2]]
#  [1] "w" "v" "t" "i" "d" "p" "l" "e" "s" "x"

A dump of the original:

[[1]]
 [1] "z" "r" "j" "h" "e" "m" "w" "u" "q" "f"

[[2]]
 [1] "m" "v" "x" "d" "a" "g" "r" "b" "s" "t"

[[3]]
 [1] "w" "v" "t" "i" "d" "p" "l" "e" "s" "x"

[[4]]
 [1] "c" "o" "t" "j" "d" "g" "u" "k" "w" "h"

[[5]]
 [1] "f" "g" "q" "y" "d" "e" "n" "s" "w" "i"
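
As a small worked check, the same combn() approach can be applied to the three vectors from the question. This is only a sketch: the names vecs, pairs, n_common and best_pair are made up for illustration and do not appear in the original posts.

```r
# The three vectors from the question (vecs is an illustrative name).
vecs <- list(
  c("a", "m", "l", "s", "t", "o"),
  c("a", "y", "o", "t", "e"),
  c("n", "a", "s", "i", "d")
)

# Every pair of indices, one pair per column: (1,2), (1,3), (2,3).
pairs <- combn(length(vecs), 2)

# Number of shared elements for each pair.
n_common <- apply(pairs, 2, function(ix) {
  length(intersect(vecs[[ix[1]]], vecs[[ix[2]]]))
})

# Indices of the pair with the most matches (first one wins on ties).
best_pair <- pairs[, which.max(n_common)]
```

Here the 1st and 2nd vectors share "a", "t" and "o", so they form the best pair with three matches. Note that which.max() returns only the first maximum, so ties are silently dropped.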

datal <- list(a = c(2, 2, 1, 2),
              b = c(2, 2, 2, 4, 3),
              c = c(1, 2, 3, 4))

# all possible combinations
combs <- combn(length(datal), 2)
# split into list
combs <- split(combs, rep(1:ncol(combs), each = nrow(combs)))

# calculate length of intersection for every combination
intersections_length <- sapply(combs, function(y) {
  length(intersect(datal[[y[1]]], datal[[y[2]]]))
})

# Which pairs have the largest intersection (all ties are returned)
combs[which(intersections_length == max(intersections_length))]
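
For thousands of vectors, a different technique from the ones above may be worth trying: build a 0/1 membership matrix and let a single matrix product count the shared elements of every pair at once. The sketch below is an assumption-laden illustration, not something from the original answers. It treats each vector as a set (duplicates ignored, like intersect() does), and the names vecs, elems, incidence, overlap and best are made up here.

```r
# The three vectors from the question (vecs is an illustrative name).
vecs <- list(
  c("a", "m", "l", "s", "t", "o"),
  c("a", "y", "o", "t", "e"),
  c("n", "a", "s", "i", "d")
)

# Rows = distinct elements, columns = vectors; 1 marks membership.
elems <- unique(unlist(vecs))
incidence <- sapply(vecs, function(v) as.integer(elems %in% v))

# t(incidence) %*% incidence: entry [i, j] is the number of
# elements that vectors i and j have in common.
overlap <- crossprod(incidence)

# Blank out the diagonal and the duplicated lower triangle,
# so each pair is considered exactly once.
overlap[lower.tri(overlap, diag = TRUE)] <- NA

# Row/column indices of the pair(s) with the largest overlap.
best <- which(overlap == max(overlap, na.rm = TRUE), arr.ind = TRUE)
```

The trade-off is memory: overlap has one cell per pair of vectors, and incidence has one cell per (element, vector) combination, so this only pays off when both fit comfortably in RAM.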
