Enumerate all possible connected nodes
Say I have 100 nodes and I give each a unique ID from 1 to 100. If I wanted a list of every combination of nodes, I believe it would be of length 2^100. This is if any node can be missing from the diagram.
But say I have a dataframe that represents the connections between nodes:
head(conn_)
from to
2 1 2
3 1 4
4 2 3
5 2 5
6 4 6
7 154 100
8 102 101
So, row one of this df states that there exists a connection from node 1 to node 2.
Say I want to enumerate every combination of valid nodes, where a combination is only valid if there is no broken connection between the elements of the set. How could I do this?
For example, if I have nodes 1->2->3->4->5->6->7->8->9, where -> represents a two-way connection (1 connects to 2 and 2 connects to 1), then two valid subsets would be {1, 2, 3} and {4, 5, 6}, but an invalid subset would be {1, 3, 4, 6}. It would be invalid because there is a broken connection between two elements in the set.
If one node connects to multiple other nodes, that counts as a valid connection, meaning that for the dataframe above I can have the valid set {1, 2, 4, 6}.
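The validity rule above can be checked mechanically. The following is a minimal base-R sketch, not part of the original question: the helper name is_connected_set is hypothetical, and it assumes the connections are stored in a data frame with columns from and to, with every row treated as a two-way connection. It starts from one node of the candidate set and repeatedly follows edges that stay inside the set.

```r
# Hypothetical helper: TRUE if `nodes` form a connected set, using only
# edges of `conn` whose two endpoints are both inside `nodes`.
is_connected_set <- function(nodes, conn) {
  nodes <- unique(nodes)
  if (length(nodes) <= 1) return(TRUE)
  # keep only the edges that lie entirely inside the candidate set
  keep <- conn$from %in% nodes & conn$to %in% nodes
  from <- conn$from[keep]
  to <- conn$to[keep]
  visited <- nodes[1]
  frontier <- nodes[1]
  while (length(frontier) > 0) {
    # neighbours of the frontier, treating every edge as two-way
    nbrs <- c(to[from %in% frontier], from[to %in% frontier])
    frontier <- setdiff(nbrs, visited)
    visited <- c(visited, frontier)
  }
  # connected exactly when every node of the set was reached
  length(visited) == length(nodes)
}

conn_ <- data.frame(from = c(1, 1, 2, 2, 4), to = c(2, 4, 3, 5, 6))
is_connected_set(c(1, 2, 4, 6), conn_)  # TRUE
is_connected_set(c(1, 3, 5), conn_)     # FALSE
```

A check like this only validates one candidate set at a time; testing all 2^100 candidates this way would not be practical.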
I'm having a really hard time trying to figure out a method to do this, recursively or with for/while loops.
Also, if there is a maximum of five two-way connections per node, is it possible to enumerate all valid sets for the case of 100 nodes? How does this problem grow?
Edit:
Here is an example of input / output:
conn_ =
from to
1 2
1 4
2 3
2 5
4 6
Expected output : { {1}, {1, 2}, {1, 4}, {1, 2, 4}, {1, 2, 3}, {1, 2, 5}, {1, 4, 6}, {1, 2, 4, 6}, {1, 2, 3, 4}, {1, 2, 3, 4, 6}, {1, 2, 3, 4, 5, 6}, {2}, {2, 3}, {2, 5}, {2, 3, 5}, {3}, {4}, {4, 6} }
Notice that {1, 3, 5} is not in the output, because there can't exist a break between elements in the set, but {1, 2, 4, 6} is valid because 1 connects to 2 and 1 connects to 4.
Here is a solution with igraph. It will quickly exhaust your resources for big graphs with high connectivity.
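As a rough illustration of how the number of valid sets can grow, the sketch below counts connected sets for three hand-picked graph shapes. These shapes are assumptions chosen for illustration, not the asker's data: a simple path, a star (one hub connected to every other node), and a "comb" (a path of k nodes, each with one extra leaf attached, so no node has more than three connections).

```r
n <- 100

# Path 1-2-...-n: a connected set is a run of consecutive nodes,
# so there is one set per (start, end) pair.
path_count <- n * (n + 1) / 2        # 5050

# Star: any subset of the n - 1 leaves together with the hub is connected,
# plus the n - 1 single-leaf sets.
star_count <- 2^(n - 1) + (n - 1)    # about 6.3e29

# Comb with k spine nodes and k leaves (2 * k = 100 nodes in total):
# pick a run of spine nodes of length len, then any subset of the leaves
# attached to that run, plus the k single-leaf sets.
k <- n / 2
len <- 1:k
comb_count <- k + sum((k - len + 1) * 2^len)   # about 4.5e15

c(path = path_count, star = star_count, comb = comb_count)
```

Under these assumptions, even a graph with at most three connections per node can have more than 10^15 connected sets at 100 nodes, so a cap of five connections per node does not by itself make full enumeration feasible. The count depends heavily on how much the graph branches.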
Basically, we search for all paths from each vertex. That will give us each combination twice, so we subset that to unique combinations in the end. Someone who knows more about graphs than I do might be able to create a more efficient solution.
DF <- read.table(text = "from to
1 2
1 4
2 3
2 5
4 6", header = TRUE)
library(igraph)
g <- graph_from_data_frame(DF, directed = FALSE)
plot(g)
#all paths starting from each vertex
paths <- unlist(lapply(V(g), function(from) all_simple_paths(g, from)), FALSE)
res <- lapply(paths, names) #extract vertex names from each path
res <- c(as.list(names(V(g))), res) #add single vertices
res <- lapply(res, sort) #sort
res <- res[!duplicated(res)] #remove duplicates
#for compact printing:
unname(sapply(res, paste, collapse = ","))
#[1] "1" "2" "4" "3" "5" "6" "1,2" "1,2,3" "1,2,5" "1,4" "1,4,6" "1,2,4" "1,2,4,6" "2,3"
#[15] "2,5" "1,2,3,4" "1,2,4,5" "4,6" "1,2,3,4,6" "2,3,5" "1,2,4,5,6"
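Note that all_simple_paths only returns vertex sets that lie along a single path, so a branching set such as {1, 2, 3, 5} does not appear in the output above. If every connected set is wanted, one possible alternative is to grow sets one neighbour at a time and discard duplicates. The following is a minimal base-R sketch of that idea. The function name enumerate_connected_sets is hypothetical, and the input is assumed to be a data frame with from and to columns, with every row treated as a two-way connection.

```r
# Hypothetical helper: list every non-empty connected set of nodes.
enumerate_connected_sets <- function(conn) {
  nodes <- sort(unique(c(conn$from, conn$to)))
  # nodes adjacent to `set` but not in it (edges are two-way)
  neighbours <- function(set) {
    nb <- c(conn$to[conn$from %in% set], conn$from[conn$to %in% set])
    setdiff(nb, set)
  }
  # canonical text key, used to detect sets that were already produced
  key <- function(set) paste(sort(set), collapse = ",")
  found <- list()
  seen <- character(0)
  queue <- as.list(nodes)  # start from every single node
  while (length(queue) > 0) {
    set <- queue[[1]]
    queue <- queue[-1]
    k <- key(set)
    if (k %in% seen) next
    seen <- c(seen, k)
    found[[length(found) + 1]] <- sort(set)
    # extend the set by each adjacent node in turn
    for (v in neighbours(set)) {
      queue[[length(queue) + 1]] <- c(set, v)
    }
  }
  found
}

DF <- data.frame(from = c(1, 1, 2, 2, 4), to = c(2, 4, 3, 5, 6))
sets <- enumerate_connected_sets(DF)
length(sets)  # 24 for this example
unname(sapply(sets, paste, collapse = ","))
```

This sketch favours clarity over speed: the duplicate check and the list appends are slow for large inputs, and the number of sets itself can grow exponentially, so it is only practical for small graphs.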