Find all intersecting vectors in a list of vectors in R

I have a list of vectors, sets, as follows.

sets <- list(b = c("b4", "b5", "b6"),
             c = c("c2", "c3", "b4", "b5", "c6"),
             d = c("d1", "d2"),
             e = c("e45", "e55", "e65"),
             f = c("f4", "f5", "d1", "f6"),
             g = c("g1", "g2"),
             h = c("h5", "h6", "h7"),
             i = c("i9", "h5", "g1", "h6", "i8", "i7"),
             j = c("j1", "j2", "j3"))

I want to identify all the elements of this list which are unique (sharing no values with any other element), as well as all those which overlap/intersect with at least one other element.

How can I do this in R? The expected output is:

unique <- list(e = c("e45", "e55", "e65"),
               j = c("j1", "j2", "j3"))

intersects <- list(d = c("d1", "d2"),
                   b = c("b4", "b5", "b6"),
                   c = c("c2", "c3", "b4", "b5", "c6"),
                   f = c("f4", "f5", "d1", "f6"),
                   g = c("g1", "g2"),
                   h = c("h5", "h6", "h7"),
                   i = c("i9", "h5", "g1", "h6", "i8", "i7"))

Given that the list elements should be partitioned into:

  • list elements with an empty intersection w.r.t. all the other list components,
  • list elements with a non-empty intersection w.r.t. some other list component,

a way to achieve this in base R is as follows:

## find set components w/ empty intersections w/ all other components
isUnique <- sapply(seq_along(sets), function(i) length(intersect(sets[[i]], unlist(sets[-i]))) < 1)

## empty intersect components
sets[isUnique]
#> $e
#> [1] "e45" "e55" "e65"
#> 
#> $j
#> [1] "j1" "j2" "j3"

## non-empty intersect components 
sets[!isUnique]
#> $b
#> [1] "b4" "b5" "b6"
#> 
#> $c
#> [1] "c2" "c3" "b4" "b5" "c6"
#> 
#> $d
#> [1] "d1" "d2"
#> 
#> $f
#> [1] "f4" "f5" "d1" "f6"
#> 
#> $g
#> [1] "g1" "g2"
#> 
#> $h
#> [1] "h5" "h6" "h7"
#> 
#> $i
#> [1] "i9" "h5" "g1" "h6" "i8" "i7"
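
As a cross-check on the partition above, here is a minimal sketch of an alternative route: count how many list components contain each value, then flag the components that contain no shared value. The names counts, shared and isUnique2 are introduced here for illustration only and are not part of the answer above.

```r
sets <- list(b = c("b4", "b5", "b6"),
             c = c("c2", "c3", "b4", "b5", "c6"),
             d = c("d1", "d2"),
             e = c("e45", "e55", "e65"),
             f = c("f4", "f5", "d1", "f6"),
             g = c("g1", "g2"),
             h = c("h5", "h6", "h7"),
             i = c("i9", "h5", "g1", "h6", "i8", "i7"),
             j = c("j1", "j2", "j3"))

## count in how many components each value occurs
## (unique() per component so a repeat inside one vector is not counted twice)
counts <- table(unlist(lapply(sets, unique)))

## values that occur in more than one component
shared <- names(counts)[counts > 1]

## a component is "unique" if none of its values is shared
isUnique2 <- vapply(sets, function(s) !any(s %in% shared), logical(1))

names(sets)[isUnique2]
#> [1] "e" "j"
```

This scans the data once instead of rebuilding unlist(sets[-i]) for every component, which may matter for long lists; for a list of this size either approach is fine.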

Here is my attempt to get the intersect list:

sets <- list(b = c("b4", "b5", "b6"),
             c = c("c2", "c3", "b4", "b5", "c6"),
             d = c("d1", "d2"),
             e = c("e45", "e55", "e65"),
             f = c("f4", "f5", "d1", "f6"),
             g = c("g1", "g2"),
             h = c("h5", "h6", "h7"),
             i = c("i9", "h5", "g1", "h6", "i8", "i7"),
             j = c("j1", "j2", "j3"))

set.names <- names(sets)
names(set.names) <- set.names

sets.intersect <- lapply(set.names, function(x) {
  res <- lapply(set.names, function(y) {
    if (x != y) {
      intersect(sets[[x]], sets[[y]])
    } else {
      character(0)
    }
  })
  Filter(function(x) length(x) > 0, res)
})

output.intersect <- lapply(sets.intersect, function(x) {
  res <- unlist(unname(x))
})
output.intersect <- Filter(function(x) !is.null(x), output.intersect)

# RESULT
dput(output.intersect)
structure(
  list(
    b = c("b4", "b5"), 
    c = c("b4", "b5"), 
    d = "d1", 
    f = "d1", 
    g = "g1", 
    h = c("h5", "h6"), 
    i = c("g1", "h5", "h6")
  ), .Names = c("b", "c", "d", "f", "g", "h", "i")
 )

I tried to do this without for loops, which requires some tricks with list and vector names.
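
For comparison, a shorter sketch that gives the same values per component, assuming only the shared values are needed and not which other component they came from. It intersects each component with the pooled values of all the others; shared_with_others is a name made up for this example.

```r
sets <- list(b = c("b4", "b5", "b6"),
             c = c("c2", "c3", "b4", "b5", "c6"),
             d = c("d1", "d2"),
             e = c("e45", "e55", "e65"),
             f = c("f4", "f5", "d1", "f6"),
             g = c("g1", "g2"),
             h = c("h5", "h6", "h7"),
             i = c("i9", "h5", "g1", "h6", "i8", "i7"),
             j = c("j1", "j2", "j3"))

## intersect every component with the union of all other components
shared_with_others <- lapply(seq_along(sets), function(k) {
  intersect(sets[[k]], unlist(sets[-k]))
})
names(shared_with_others) <- names(sets)

## drop components that share nothing (length 0 is treated as FALSE)
shared_with_others <- Filter(length, shared_with_others)

names(shared_with_others)
#> [1] "b" "c" "d" "f" "g" "h" "i"
```

The values appear in the order they have in each component, so the entry for i comes out as "h5" "g1" "h6" rather than "g1" "h5" "h6".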

Intersect Values

For the intersecting values, R has a built-in function that does the job: intersect. For example:

intersect(c("b4", "b5", "b6"),c("c2", "c3", "b4", "b5", "c6"))
# [1] "b4" "b5"

However, if you want to apply it to more than two vectors, you will need another built-in function, Reduce. For example:

sets <- list(b = c("b4", "b5", "b6"),
         c = c("c2", "c3", "b4", "b5", "c6"),
         d = c("d1", "d2"),
         e = c("e45", "e55", "e65"),
         f = c("f4", "f5", "d1", "f6"),
         g = c("g1", "g2"),
         h = c("h5", "h6", "h7"),
         i = c("i9", "h5", "g1", "h6", "i8", "i7"),
         j = c("j1", "j2", "j3"))

Reduce(intersect, sets)
# character(0)
# (only values present in every vector are kept; this list has none)

source
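
Reduce(intersect, sets) keeps only values that occur in every vector, so it cannot show which vectors overlap with which. A minimal sketch of a pairwise variant, assuming base R's combn is acceptable for enumerating the pairs; the names pairs and pairwise are introduced here for illustration.

```r
sets <- list(b = c("b4", "b5", "b6"),
             c = c("c2", "c3", "b4", "b5", "c6"),
             d = c("d1", "d2"),
             e = c("e45", "e55", "e65"),
             f = c("f4", "f5", "d1", "f6"),
             g = c("g1", "g2"),
             h = c("h5", "h6", "h7"),
             i = c("i9", "h5", "g1", "h6", "i8", "i7"),
             j = c("j1", "j2", "j3"))

## all unordered pairs of component names
pairs <- combn(names(sets), 2, simplify = FALSE)

## intersect the two components of every pair
pairwise <- lapply(pairs, function(p) intersect(sets[[p[1]]], sets[[p[2]]]))
names(pairwise) <- vapply(pairs, paste, character(1), collapse = "&")

## keep only the pairs that actually overlap
pairwise <- Filter(length, pairwise)

names(pairwise)
#> [1] "b&c" "d&f" "g&i" "h&i"
```

The number of pairs grows quadratically with the number of vectors, so this is a sketch for small lists rather than a general solution.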

Unique Values in list

You can use the do.call function; for this example it would be:

unique(do.call("c",sets))
# [1] "b4"  "b5"  "b6"  "c2"  "c3"  "c6" ....
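
Note that unique() returns every distinct value, including the ones that are shared between vectors. A short sketch, using duplicated() from base R, of how the shared values could be separated from the values that occur only once; all_values, distinct, shared and only_once are names chosen for this example.

```r
sets <- list(b = c("b4", "b5", "b6"),
             c = c("c2", "c3", "b4", "b5", "c6"),
             d = c("d1", "d2"),
             e = c("e45", "e55", "e65"),
             f = c("f4", "f5", "d1", "f6"),
             g = c("g1", "g2"),
             h = c("h5", "h6", "h7"),
             i = c("i9", "h5", "g1", "h6", "i8", "i7"),
             j = c("j1", "j2", "j3"))

## pool all values into one unnamed character vector
all_values <- unlist(sets, use.names = FALSE)

## every distinct value, shared or not
distinct <- unique(all_values)

## values seen more than once in the pooled vector
shared <- unique(all_values[duplicated(all_values)])

## values seen exactly once
only_once <- setdiff(distinct, shared)

shared
#> [1] "b4" "b5" "d1" "h5" "g1" "h6"
```

This assumes no value is repeated inside a single vector, which holds for the example data; otherwise such a repeat would also be reported as shared.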

Hope this helps.
