What is the best way to count the cliques of size k in an undirected graph using networkx in Python?

I am surprised that networkx does not seem to have a built-in function for this, but maybe I am missing some clever way to do it with the built-in algorithms?

You can use one of these built-in functions, enumerate_all_cliques or find_cliques, to get all k-cliques in an undirected graph.

The difference between them is that enumerate_all_cliques iterates over every clique in the graph, whereas find_cliques iterates over only the maximal cliques. As we will see at the end, this affects the run time.
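To make the difference concrete, here is a minimal sketch on a tiny graph (a triangle with one extra edge attached; the graph is my own illustration, not from the question). It assumes networkx is installed.

```python
import networkx as nx

# Small illustrative graph: a triangle 0-1-2 plus a pendant edge 2-3.
G = nx.Graph([(0, 1), (0, 2), (1, 2), (2, 3)])

# Every clique in the graph, yielded in non-decreasing order of size.
all_cliques = [sorted(c) for c in nx.enumerate_all_cliques(G)]

# Only the maximal cliques (those not contained in any larger clique).
maximal_cliques = sorted(sorted(c) for c in nx.find_cliques(G))

print(len(all_cliques))   # 9: four nodes, four edges, one triangle
print(maximal_cliques)    # [[0, 1, 2], [2, 3]]
```

The single edges inside the triangle show up in enumerate_all_cliques but not in find_cliques, because they are contained in a larger clique.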

Option 1, using enumerate_all_cliques:

import networkx as nx

def enumerate_all_cliques_size_k(G, k):
    i = 0
    for clique in nx.enumerate_all_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            return i
    return i

Option 2, using find_cliques:

import networkx as nx
import itertools

def find_cliques_size_k(G, k):
    i = 0
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            i += len(list(itertools.combinations(clique, k)))
    return i

The first option is more straightforward, but its run time is problematic because it goes over all possible subsets of the maximal cliques, even if the maximal clique size is less than k. We can see that enumerate_all_cliques_size_k takes roughly 10 times longer to run on a complete graph of size 20:

G = nx.complete_graph(20)


@timing
def test_enumerate_all_cliques_size_k(G,k):
    print(enumerate_all_cliques_size_k(G, k))

@timing
def test_find_cliques_size_k(G, k):
    print(find_cliques_size_k(G, k))

test_enumerate_all_cliques_size_k(G,5)
test_find_cliques_size_k(G,5)

# --------------------Result-----------------------

15504
test_enumerate_all_cliques_size_k function took 616.645 ms
15504
test_find_cliques_size_k function took 56.967 ms
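The @timing decorator used in the benchmark is not defined anywhere in the answer. A minimal sketch of what it might look like is below; the name timing and the output format are assumptions, chosen only to match the result lines shown above.

```python
import functools
import time


def timing(func):
    """Hypothetical stand-in for the @timing decorator used in the benchmark."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # Mirror the "<name> function took <ms> ms" lines from the results.
        print(f"{func.__name__} function took {elapsed_ms:.3f} ms")
        return result
    return wrapper
```

Absolute timings will vary by machine and networkx version, so treat the numbers above as indicative only.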

When using the find_cliques function you need to be careful when going through all the possibilities (itertools.combinations): in some cases you will count the same clique more than once. For example, take a graph of six nodes (call them A, B, C, D, E and G). Four of them are fully connected (A-D), E is connected to each of A-D, and G is also connected to each of A-D (but E is not connected to G). In this situation there are two 5-cliques that share 4 nodes (A,B,C,D,E and A,B,C,D,G). Now say you are looking for 4-cliques in this graph. Using find_cliques you will go over the two 5-cliques and count every 4-clique inside each of them; both contain the 4-clique A,B,C,D, so it will be counted twice.
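The six-node example can be reproduced directly. The following sketch builds that graph and compares a per-maximal-clique sum with a count of distinct 4-node subsets; the variable names are mine and networkx is assumed to be installed.

```python
import itertools
import networkx as nx

# Build the graph described above: A-D fully connected, E and G each
# joined to all of A-D, and no edge between E and G.
core = ["A", "B", "C", "D"]
G = nx.Graph()
G.add_edges_from(itertools.combinations(core, 2))
G.add_edges_from((extra, node) for extra in ("E", "G") for node in core)

k = 4
maximal = list(nx.find_cliques(G))

# Naive count: add up the k-subsets of each maximal clique separately.
naive = sum(len(list(itertools.combinations(c, k))) for c in maximal)

# Distinct count: collect the k-subsets in a set before counting.
distinct = {frozenset(s) for c in maximal for s in itertools.combinations(c, k)}

print(naive, len(distinct))  # 10 9
```

Each 5-clique contributes five 4-subsets, giving 10 in total, but {A, B, C, D} appears in both, so only 9 are distinct.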

Here is a version of the suggested function that fixes this problem by using a set, so each clique is counted only once:

def find_cliques_size_k(G, k):
    all_cliques = set()
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            all_cliques.add(tuple(sorted(clique)))
        elif len(clique) > k:
            for mini_clique in itertools.combinations(clique, k):
                all_cliques.add(tuple(sorted(mini_clique)))
    return len(all_cliques)

(If you want the cliques themselves, you can return all_cliques itself.)
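One way to gain confidence in the set-based version is to cross-check it against a direct enumeration on a random graph. This is only a sanity-check sketch: the helper names, graph size, edge probability and seed are arbitrary choices of mine, and networkx is assumed to be installed.

```python
import itertools
import networkx as nx


def count_via_enumeration(G, k):
    # Count k-cliques directly; cliques arrive in non-decreasing size order.
    count = 0
    for clique in nx.enumerate_all_cliques(G):
        if len(clique) > k:
            break
        if len(clique) == k:
            count += 1
    return count


def count_via_maximal(G, k):
    # Collect distinct k-subsets of the maximal cliques in a set.
    seen = set()
    for clique in nx.find_cliques(G):
        for sub in itertools.combinations(sorted(clique), k):
            seen.add(sub)
    return len(seen)


# Arbitrary small random graph used only for the comparison.
G = nx.gnp_random_graph(12, 0.5, seed=1)
agree = all(count_via_enumeration(G, k) == count_via_maximal(G, k)
            for k in range(1, 7))
print(agree)
```

The two counts should agree for every k, since every k-clique is a subset of at least one maximal clique and every k-subset of a clique is itself a clique.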

Welcome to SO.

Based on this reference, I think there is currently no existing function to do this. If you want to use nx functions, you can do something like this:

def count_k_cliques(G, k):
    k_cliques_count = 0
    for clique in nx.enumerate_all_cliques(G): 
        if len(clique) > k: 
            break 
        elif len(clique) == k: 
            k_cliques_count += 1
    return k_cliques_count

Edit: I recommend considering option 2 in Michal's answer.
