简体   繁体   中英

What is the best way to count the cliques of size k in an undirected graph using networkx in python?

I am surprised that networkx does not seem to have a built in function to do this, but maybe I am missing some clever way to do this using the built-in algorithms?

You can use one of these built in functions: enumerate_all_cliques or find_cliques in order to get all k-clique in an un-directed graph.

The difference between these functions is that enumerate_all_cliques goes over all possible cliques and find_cliques goes over only the maximal cliques. We will see in the end it affects the run time.

Option 1 using enumerate_all_cliques :

import networkx as nx

def enumerate_all_cliques_size_k(G, k):
    i = 0
    for clique in nx.enumerate_all_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            return i
    return i

Option 2 using find_cliques :

import networkx as nx
import itertools

def find_cliques_size_k(G, k):
    i = 0
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            i += 1
        elif len(clique) > k:
            i += len(list(itertools.combinations(clique, k)))
    return i

The first option is more straight forward but it's run time is problematic since we go over all possible sub-sets of the maximal cliques, even if the maximal clique size is less than k. We can see enumerate_all_cliques_size_k takes 10 times longer to run on a complete graph of size 20:

G = nx.complete_graph(20)


@timing
def test_enumerate_all_cliques_size_k(G,k):
    print(enumerate_all_cliques_size_k(G, k))

@timing
def test_find_cliques_size_k(G, k):
    print(find_cliques_size_k(G, k))

test_enumerate_all_cliques_size_k(G,5)
test_find_cliques_size_k(G,5)

# --------------------Result-----------------------

15504
test_enumerate_all_cliques_size_k function took 616.645 ms
15504
test_find_cliques_size_k function took 56.967 ms

When using the find_cliques function you need to be carfull when you are going through all the possibilities (itertools.combinations) - in some cases you will count the same clique more than once. For example, if you have a graph of six nodes (let's name them AG). Four of them are fully connected (AD) and E is connected to AD, and G is also connected to AD (but E is not connected to G). In this situation you have two 5-cliques that share 4 nodes (A,B,C,D,E and A,B,C,D,G). Now let's say that you are looking for 4-cliques in this suggested garph, by using find_cliques you will go over the two 5-cliques, and in each one of them you will count every 4-clique, which includes the 4-clique A,B,C,D, so it will be counted twice (.).

here is a version of the suggested function that fix this problem by using set so you will count each clique only once:

def find_cliques_size_k(G, k):
    all_cliques = set()
    for clique in nx.find_cliques(G):
        if len(clique) == k:
            all_cliques.add(tuple(sorted(clique)))
        elif len(clique) > k:
            for mini_clique in itertools.combinations(clique, k):
                all_cliques.add(tuple(sorted(mini_clique)))
    return len(all_cliques)

(If you want the cliques themselves you can return the 'all_cliques' itself)

Welcome to SO.

Based on this reference , I think currently there is no existing function to do this. If you want to use nx functions you can do something like this:

def count_k_cliques(G, k):
    k_cliques_count = 0
    for clique in nx.enumerate_all_cliques(G): 
        if len(clique) > k: 
            break 
        elif len(clique) == k: 
            k_cliques_count += 1
    return k_cliques_count

Edit: I recommend considering option 2 in Michal's answer

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM