Find all possible paths in a Python graph data structure without using a recursive function

Question

I have a serious issue with finding all possible paths in my csv file, which looks like this:

Source	Target	Source_repo	Target_repo
SOURCE1	Target2	repo-1	repo-2
SOURCE5	Target3	repo-5	repo-3
SOURCE8	Target5	repo-8	repo-5

There is a large number of lines in the dataset, more than 5000. I want to generate all possible paths like the ones below and return them as a list (Target5 is equal to SOURCE5):

SOURCE1 Target2
SOURCE8 Target5 Target3

I want to implement this solution without using recursive functions, since recursion causes problems (maximum recursion depth exceeded).

This is the current code example:

import pandas as pd


def attach_co_changing_components(base_component):
    co_changes = df_depends_on.loc[df_depends_on["Source_repo"] ==
                                   base_component, "Target_repo"].values
    result = {base_component: list(co_changes)}
    return result


def dfs(data, path, paths):
    datum = path[-1]
    if datum in data:
        for val in data[datum]:
            new_path = path + [val]
            paths = dfs(data, new_path, paths)
    else:
        paths += [path]
    return paths



def enumerate_paths(graph, nodes=None):
    # Avoid a mutable default argument; fall back to every key in the graph.
    if nodes is None:
        nodes = graph.keys()
    all_paths = []
    for node in nodes:
        node_paths = dfs(graph, [node], [])
        all_paths += node_paths
    return all_paths


if __name__ == "__main__":

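    df = pd.read_csv("clean_openstack_evolution.csv")
    # The rest of the code refers to the data as df_depends_on, which is never
    # defined; it is assumed here to be the same frame.
    df_depends_on = df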

    co_changing_components = df[["Source"]].copy()

    co_changing_components = co_changing_components.drop_duplicates(
    ).reset_index(drop=True)

    co_changing_components = co_changing_components["Source"].map(
        attach_co_changing_components)

    co_changing_components = co_changing_components.rename("Path")

    co_changing_components = co_changing_components.reset_index(drop=True)

    newdict = {}
    for k, v in [(key, d[key]) for d in co_changing_components for key in d]:
        if k not in newdict: newdict[k] = v
        else: newdict[k].append(v)

    graph_keys = df_depends_on["Source_repo"].drop_duplicates().to_dict(
    ).values()
    graph_keys = {*graph_keys}
    graph_keys = set([
        k for k in graph_keys
        if len(df_depends_on[df_depends_on["Target"] == k]) > 0
    ])

    result = enumerate_paths(newdict)

Here is the output after executing the preceding code:

Here is the data link: Google Drive

I tried to solve the problem using a recursive function, but the code failed because the maximum recursion depth was exceeded. I aim to solve it without recursive functions.

Answer 1

I'm not sure whether you want all paths or only the paths from one specific node to another. Either way, this looks like a job for networkx.

Setup (`nx.from_pandas_edgelist`)

import networkx as nx
import pandas as pd


df = pd.read_csv("...")

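# from_pandas_edgelist looks for columns named "source" and "target" by
# default, so the column names from the CSV have to be passed explicitly.
# Assumption: Source_repo / Target_repo are the columns that identify nodes.
graph = nx.from_pandas_edgelist(
    df, source="Source_repo", target="Target_repo", create_using=nx.DiGraph
)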
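To make the shape of the edge list concrete, here is a minimal sketch that builds a plain adjacency dictionary from the three sample rows in the question, using only the standard library. It rests on several assumptions that are not confirmed by the question: the file is comma-delimited, and `Source_repo` / `Target_repo` are the columns that identify nodes (the question notes that Target5 and SOURCE5 are the same node, which matches `repo-5`). The names `SAMPLE_CSV` and `build_adjacency` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question; the comma delimiter is an assumption.
SAMPLE_CSV = """Source,Target,Source_repo,Target_repo
SOURCE1,Target2,repo-1,repo-2
SOURCE5,Target3,repo-5,repo-3
SOURCE8,Target5,repo-8,repo-5
"""


def build_adjacency(rows, source_col="Source_repo", target_col="Target_repo"):
    """Map each source node to the list of nodes it points to."""
    adjacency = defaultdict(list)
    for row in rows:
        adjacency[row[source_col]].append(row[target_col])
    return dict(adjacency)


adjacency = build_adjacency(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(adjacency)
# {'repo-1': ['repo-2'], 'repo-5': ['repo-3'], 'repo-8': ['repo-5']}
```

This is the same directed structure that `nx.DiGraph` would hold for those rows: one edge per line, pointing from the source repo to the target repo.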

All paths (`nx.all_simple_paths`)

from itertools import chain, product, starmap
from functools import partial


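# Roots are nodes with no incoming edges.
roots = (node for node, d in graph.in_degree if d == 0)

# Leaves are nodes with no outgoing edges.
leaves = (node for node, d in graph.out_degree if d == 0)

all_paths = partial(nx.all_simple_paths, graph)

# Collect every simple path for each (root, leaf) pair.
paths = list(chain.from_iterable(starmap(all_paths, product(roots, leaves))))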
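If an extra dependency is not an option, the same root-to-leaf enumeration can be sketched with an explicit stack instead of recursion, so the interpreter's recursion limit never comes into play. This is an illustrative sketch, not how networkx does it internally; the function name `all_root_to_leaf_paths` and the dictionary-of-lists input format are assumptions. Nodes already on the current path are skipped, so a cycle cannot make it loop forever.

```python
def all_root_to_leaf_paths(adjacency):
    """Enumerate every simple path from a root to a leaf, without recursion.

    `adjacency` maps each node to a list of the nodes it points to.
    """
    targets = {t for successors in adjacency.values() for t in successors}
    # Roots are nodes that no edge points to.
    roots = [node for node in adjacency if node not in targets]

    paths = []
    for root in roots:
        # Each stack entry is a full path from the root to its last node.
        stack = [[root]]
        while stack:
            path = stack.pop()
            successors = adjacency.get(path[-1], [])
            if not successors:
                # No outgoing edges: the path ends at a leaf.
                paths.append(path)
                continue
            # Push in reverse so paths are explored in the original edge order;
            # skip nodes already on the path to keep paths simple.
            for node in reversed(successors):
                if node not in path:
                    stack.append(path + [node])
    return paths


# Sample graph taken from the repo columns of the question's three rows.
graph = {"repo-1": ["repo-2"], "repo-5": ["repo-3"], "repo-8": ["repo-5"]}
paths = all_root_to_leaf_paths(graph)
print(paths)
# [['repo-1', 'repo-2'], ['repo-8', 'repo-5', 'repo-3']]
```

As with the networkx version, a graph in which every node has an incoming edge has no roots and therefore yields no paths. The number of simple paths can also grow very quickly on dense graphs, so collecting them all in a list may not be practical for every dataset.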

From one node to another

source_node = "some_node_in_graph"
target_node = "some_other_node_in_graph"
list(nx.all_simple_paths(graph, source=source_node, target=target_node))

Answer 1: accepted 2022-11-30 22:43:18
