Maximum Common Subgraph in a Directed Graph

Question

I am trying to represent a group of sentences as a directed graph in which each word is represented by one node. If a word is repeated, no new node is created; the existing node is reused. Let's call this graph MainG.

Next, I take a new sentence, create a directed graph from it (call this graph SubG), and then look for the maximum common subgraph of SubG in MainG.

I am using the NetworkX API in Python 3.5. My understanding is that this is an NP-complete problem for general graphs, but that for directed graphs it is a linear problem. One of the links I referred to:

How can I find Maximum Common Subgraph of two graphs?

I tried the following code:

import networkx as nx
import pandas as pd
import nltk

class GraphTraversal:
    def createGraph(self, sentences):
        DG=nx.DiGraph()
        tokens = nltk.word_tokenize(sentences)
        token_count = len(tokens)
        for i in range(token_count):
            if i == 0:
                continue
            DG.add_edges_from([(tokens[i-1], tokens[i])], weight=1)
        return DG


    def getMCS(self, G_source, G_new):
        """
        Creator: Bonson
        Return the MCS of the G_new graph that is present 
        in the G_source graph
        """
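        # list() keeps len() and indexing working on networkx 2.x,
        # where topological_sort returns a generator
        order = list(nx.topological_sort(G_new))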
        print("##### topological sort #####")
        print(order)

        objSubGraph = nx.DiGraph()

        for i in range(len(order)-1):

            if G_source.nodes().__contains__(order[i]) and G_source.nodes().__contains__(order[i+1]):
                print("Contains Nodes {0} -> {1} ".format(order[i], order[i+1]))
                objSubGraph.add_node(order[i])
                objSubGraph.add_node(order[i+1])
                objSubGraph.add_edge(order[i], order[i+1])
            else:
                print("Does Not Contains Nodes {0} -> {1} ".format(order[i], order[i+1]))
                continue


obj_graph_traversal = GraphTraversal()
SourceSentences = "A series of escapades demonstrating the adage that what is good for the goose is also good for the gander , some of which occasionally amuses but none of which amounts to much of a story ."
SourceGraph = obj_graph_traversal.createGraph(SourceSentences)

TestSentence_1 = "not much of a story"    # This works
TestSentence_1 = "not much of a story of what is good"    # This does NOT work
TestGraph = obj_graph_traversal.createGraph(TestSentence_1)

obj_graph_traversal.getMCS(SourceGraph, TestGraph)

Because I rely on a topological sort, the second test sentence doesn't work.

I would be interested in understanding the possible approaches to this.
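A likely reason the second sentence fails is that it repeats the word "of", so the path of -> a -> story -> of forms a cycle, and a topological order only exists for acyclic graphs. The snippet below is a minimal sketch of that check. The helper `sentence_graph` is hypothetical and uses `str.split` instead of `nltk.word_tokenize` so that it stays self-contained.

```python
import networkx as nx


def sentence_graph(sentence):
    """Hypothetical helper: one node per distinct word, one edge per adjacent word pair."""
    words = sentence.split()  # assumption: whitespace split instead of nltk tokenization
    graph = nx.DiGraph()
    graph.add_edges_from(zip(words, words[1:]))
    return graph


acyclic = sentence_graph("not much of a story")
cyclic = sentence_graph("not much of a story of what is good")

# A topological sort is only defined for directed acyclic graphs.
first_is_dag = nx.is_directed_acyclic_graph(acyclic)
second_is_dag = nx.is_directed_acyclic_graph(cyclic)

# find_cycle returns the edges of one cycle in the graph.
cycle_edges = nx.find_cycle(cyclic)
print(first_is_dag, second_is_dag, cycle_edges)
```

Any approach built on `nx.topological_sort` would therefore need either a guarantee that no word repeats or a different traversal.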

Answer 1

The following code gets the maximum common subgraph from a directed graph:

def getMCS(self, G_source, G_new):
    matching_graph=nx.Graph()

    for n1,n2,attr in G_new.edges(data=True):
        if G_source.has_edge(n1,n2) :
            matching_graph.add_edge(n1,n2,weight=1)

    graphs = list(nx.connected_component_subgraphs(matching_graph))

    mcs_length = 0
    mcs_graph = nx.Graph()
    for i, graph in enumerate(graphs):

        if len(graph.nodes()) > mcs_length:
            mcs_length = len(graph.nodes())
            mcs_graph = graph

    return mcs_graph

Answer 2

The edit queue for Bonson's answer is full, but that answer no longer works with networkx 2.4 and leaves room for some improvements:

connected_component_subgraphs was removed in networkx 2.4; connected_components, which returns sets of nodes, should be used instead.
Because only the number of nodes is needed to find the largest component, the code can be simplified significantly.
This version is no longer tailored to the original question, because this page is the top hit when searching for "Maximum Common Subgraph in a Directed Graph", which I needed for something completely different.

My adapted version is:

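import networkx

def getMCS(g1, g2):
    # Collect every edge of g2 that also exists in g1
    matching_graph = networkx.Graph()

    for n1, n2 in g2.edges():
        if g1.has_edge(n1, n2):
            matching_graph.add_edge(n1, n2)

    # connected_components yields one set of nodes per component
    components = networkx.connected_components(matching_graph)

    # max() raises ValueError if the two graphs share no edges
    largest_component = max(components, key=len)
    return networkx.induced_subgraph(matching_graph, largest_component)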

If the last line is replaced with return networkx.induced_subgraph(g1, largest_component), it should also work correctly and return a directed graph.
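One caveat with inducing the result on g1 is that the induced subgraph keeps every g1 edge between the selected nodes, including edges that g2 does not contain. The following is a sketch of an alternative that collects the shared edges in a DiGraph and uses weakly connected components, so the result is directed and contains only shared edges. The names `sentence_graph` and `largest_common_component` are hypothetical, and the whitespace tokenization is an assumption made to keep the example self-contained.

```python
import networkx as nx


def sentence_graph(sentence):
    """Hypothetical helper: one node per distinct word, one edge per adjacent word pair."""
    words = sentence.split()  # assumption: whitespace split instead of nltk tokenization
    graph = nx.DiGraph()
    graph.add_edges_from(zip(words, words[1:]))
    return graph


def largest_common_component(g1, g2):
    """Return the largest weakly connected set of edges present in both g1 and g2."""
    common = nx.DiGraph()
    # has_edge on a DiGraph respects direction, so only same-direction edges match.
    common.add_edges_from(edge for edge in g2.edges() if g1.has_edge(*edge))

    # Avoid calling max() on an empty sequence when nothing is shared.
    if common.number_of_nodes() == 0:
        return common

    largest = max(nx.weakly_connected_components(common), key=len)
    return common.subgraph(largest).copy()


source = sentence_graph(
    "A series of escapades demonstrating the adage that what is good for "
    "the goose is also good for the gander , some of which occasionally "
    "amuses but none of which amounts to much of a story ."
)
test = sentence_graph("not much of a story of what is good")

result = largest_common_component(source, test)
print(sorted(result.edges()))
```

Ties between equally large components are broken by whichever component `max` sees first, which may or may not matter for a given use case.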

Answer 1
3 Accepted 2017-05-09 23:30:22

Answer 2
1 2020-04-24 13:04:52