What is the best way to count the cliques of size k in an undirected graph using networkx in Python?
I am surprised that networkx does not seem to have a built-in function for this, but maybe I am missing some clever way to do it using the built-in algorithms?
You can use one of these built-in functions, enumerate_all_cliques or find_cliques, to get all k-cliques in an undirected graph.
The difference between these functions is that enumerate_all_cliques goes over all possible cliques, while find_cliques goes over only the maximal cliques. As we will see at the end, this affects the run time.
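To make the difference concrete, here is a minimal sketch on a complete graph of 4 nodes (the graph and variable names are chosen only for illustration). Every non-empty subset of nodes is a clique there, but only the full node set is maximal:

```python
import networkx as nx

# Complete graph on 4 nodes: every non-empty node subset is a clique.
G = nx.complete_graph(4)

# enumerate_all_cliques yields every clique, in order of increasing size.
all_cliques = list(nx.enumerate_all_cliques(G))

# find_cliques yields only maximal cliques (not contained in a larger clique).
maximal_cliques = list(nx.find_cliques(G))

print(len(all_cliques))      # 4 + 6 + 4 + 1 = 15
print(len(maximal_cliques))  # 1
```

So enumerate_all_cliques visits 15 cliques here while find_cliques visits a single one, which is the gap the timing comparison later in this answer reflects.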
Option 1, using enumerate_all_cliques:
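import networkx as nx

def enumerate_all_cliques_size_k(G, k):
    i = 0
    # Cliques are yielded in order of increasing size,
    # so we can stop at the first clique larger than k.
    for clique in nx.enumerate_all_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            return i
    return i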
Option 2, using find_cliques:
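import networkx as nx
import itertools

def find_cliques_size_k(G, k):
    i = 0
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            # Count every k-subset of a larger maximal clique.
            # Note: a k-subset shared by two maximal cliques is counted twice.
            i += len(list(itertools.combinations(clique, k)))
    return i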
The first option is more straightforward, but its run time is problematic since we go over all possible subsets of the maximal cliques, even if the maximal clique size is less than k. We can see that enumerate_all_cliques_size_k takes about 10 times longer to run on a complete graph of size 20:
G = nx.complete_graph(20)

# `timing` is a decorator that prints the elapsed time; its definition is not shown here.
@timing
def test_enumerate_all_cliques_size_k(G, k):
    print(enumerate_all_cliques_size_k(G, k))

@timing
def test_find_cliques_size_k(G, k):
    print(find_cliques_size_k(G, k))

test_enumerate_all_cliques_size_k(G, 5)
test_find_cliques_size_k(G, 5)
# --------------------Result-----------------------
15504
test_enumerate_all_cliques_size_k function took 616.645 ms
15504
test_find_cliques_size_k function took 56.967 ms
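The @timing decorator used in the benchmark is not defined in the answer. A minimal sketch of what it might look like, assuming it simply measures wall-clock time with time.perf_counter and prints a line in the same shape as the results above (the implementation and the `example` function are hypothetical, not the author's):

```python
import functools
import time

def timing(func):
    """Hypothetical stand-in for the @timing decorator used in the benchmark."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        print(f"{func.__name__} function took {elapsed_ms:.3f} ms")
        return result
    return wrapper

@timing
def example(n):
    # Placeholder workload, only here to show the decorator in use.
    return sum(range(n))

example(1000)
```

Absolute timings will differ by machine and networkx version; only the relative gap between the two options is meaningful.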
When using the find_cliques function you need to be careful when going through all the possibilities (itertools.combinations): in some cases you will count the same clique more than once. For example, suppose you have a graph of six nodes (let's name them A, B, C, D, E and G). Four of them are fully connected (A-D), E is connected to each of A-D, and G is also connected to each of A-D (but E is not connected to G). In this situation you have two 5-cliques that share 4 nodes (A,B,C,D,E and A,B,C,D,G). Now let's say you are looking for 4-cliques in this graph: by using find_cliques you will go over the two 5-cliques, and in each one of them you will count every 4-clique, including the 4-clique A,B,C,D, so it will be counted twice.
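The six-node example can be checked directly. The sketch below builds that graph (variable names are illustrative), counts 4-cliques by summing combinations over the maximal cliques, and compares the result with a count taken from enumerate_all_cliques, which visits each clique exactly once:

```python
import itertools
import networkx as nx

# A-D are fully connected; E and G each connect to all of A-D, but not to each other.
graph = nx.Graph()
core = ["A", "B", "C", "D"]
graph.add_edges_from(itertools.combinations(core, 2))
graph.add_edges_from(("E", v) for v in core)
graph.add_edges_from(("G", v) for v in core)

k = 4

# Naive count: every k-subset of every maximal clique with at least k nodes.
naive = sum(
    len(list(itertools.combinations(clique, k)))
    for clique in nx.find_cliques(graph)
    if len(clique) >= k
)

# Reference count: enumerate_all_cliques yields each clique once.
exact = sum(1 for clique in nx.enumerate_all_cliques(graph) if len(clique) == k)

print(naive, exact)  # 10 9
```

The naive count is 5 + 5 = 10, while there are only 9 distinct 4-cliques, because A,B,C,D belongs to both maximal cliques.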
Here is a version of the suggested function that fixes this problem by using a set, so each clique is counted only once:
import itertools
import networkx as nx

def find_cliques_size_k(G, k):
    all_cliques = set()
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            # Sort so the same node set always produces the same tuple.
            all_cliques.add(tuple(sorted(clique)))
        elif len(clique) > k:
            for mini_clique in itertools.combinations(clique, k):
                all_cliques.add(tuple(sorted(mini_clique)))
    return len(all_cliques)
(If you want the cliques themselves, you can return all_cliques itself.)
Welcome to SO.
Based on this reference, I think there is currently no existing function to do this. If you want to use nx functions, you can do something like this:
import networkx as nx

def count_k_cliques(G, k):
    k_cliques_count = 0
    # Cliques arrive in order of increasing size, so stop once they exceed k.
    for clique in nx.enumerate_all_cliques(G):
        if len(clique) > k:
            break
        elif len(clique) == k:
            k_cliques_count += 1
    return k_cliques_count
Edit: I recommend considering option 2 in Michal's answer.