Python: All simple paths in a directed graph

Question

I am working with a number of directed graphs with no cycles in them, and I need to find all simple paths between any two nodes. In general I wouldn't worry about the execution time, but I have to do this for very many nodes during very many timesteps, since I am dealing with a time-based simulation.

In the past I tried the facilities offered by NetworkX, but in general I found them slower than my approach. I am not sure whether anything has changed lately.

I have implemented this recursive function:

import timeit

def all_simple_paths(adjlist, start, end, path):

    path = path + [start]

    if start == end:
        return [path]

    paths = []

    for child in adjlist[start]:

        if child not in path:

            child_paths = all_simple_paths(adjlist, child, end, path)
            paths.extend(child_paths)

    return paths


fid = open('digraph.txt', 'rt')
adjlist = eval(fid.read().strip())

number = 1000
stmnt  = 'all_simple_paths(adjlist, 166, 180, [])'
setup  = 'from __main__ import all_simple_paths, adjlist'
elapsed = timeit.timeit(stmnt, setup=setup, number=number)/number
print 'Elapsed: %0.2f ms'%(1000*elapsed)

On my computer, I get an average of 1.5 ms per iteration. I know this is a small number, but I have to do this operation very many times.

In case you're interested, I have uploaded a small file containing the adjacency list here:

adjlist

I am using adjacency lists as inputs, coming from a NetworkX DiGraph representation.

Any suggestions for improving the algorithm (i.e., does it have to be recursive?) or other approaches I could try are more than welcome.

Thank you.

Andrea.

Answer 1

You can save time without changing the algorithm's logic by caching the results of shared sub-problems.

For example, calling all_simple_paths(adjlist, 'A', 'D', []) in the following graph will compute all_simple_paths(adjlist, 'D', 'E', []) multiple times:
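As an illustration of repeated sub-problems, here is a minimal sketch on a hypothetical five-node DAG (it is not the graph from the original figure, and the `calls` counter is added only for demonstration). It uses the uncached function from the question and counts how often each node is expanded:

```python
from collections import Counter

# Hypothetical DAG, assumed for illustration only:
# A -> B -> D -> E  and  A -> C -> D -> E
adjlist = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}

calls = Counter()  # number of times the function is entered for each node


def all_simple_paths(adjlist, start, end, path):
    calls[start] += 1
    path = path + [start]
    if start == end:
        return [path]
    paths = []
    for child in adjlist[start]:
        if child not in path:
            paths.extend(all_simple_paths(adjlist, child, end, path))
    return paths


paths = all_simple_paths(adjlist, 'A', 'E', [])
print(paths)       # the two paths through D
print(calls['D'])  # D is expanded once per incoming branch
```

In this small graph the work below D is done twice, once when arriving via B and once via C. In larger graphs with many converging branches the repetition grows accordingly.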

Python has a built-in decorator, functools.lru_cache, for this task. It hashes the arguments to memoize results, so you will need to convert adjlist and path to tuples, since a list is not hashable.
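To see why the conversion is needed, here is a small standalone sketch. The function `count_children` is hypothetical and exists only to show how `lru_cache` reacts to list and tuple arguments:

```python
import functools


@functools.lru_cache(maxsize=None)
def count_children(children):
    # Trivial body; the point is how the cache treats the argument.
    return len(children)


try:
    count_children([1, 2, 3])      # a list cannot be hashed for the cache key
except TypeError as exc:
    error = str(exc)
    print(error)

first = count_children((1, 2, 3))   # cache miss, computed
second = count_children((1, 2, 3))  # cache hit, looked up
print(count_children.cache_info().hits)
```

The same reasoning applies to the function below, where both adjlist and path are passed as tuples: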

import timeit
import functools

@functools.lru_cache()
def all_simple_paths(adjlist, start, end, path):

    path = path + (start,)

    if start == end:
        return [path]

    paths = []

    for child in adjlist[start]:

        if child not in path:

            # adjlist is already a tuple here, so pass it through unchanged
            child_paths = all_simple_paths(adjlist, child, end, path)
            paths.extend(child_paths)

    return paths


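with open('digraph.txt', 'rt') as fid:
    # eval trusts the file contents; only use it on files you control
    adjlist = eval(fid.read().strip())

# lru_cache needs hashable arguments, so convert the nested lists to tuples
# (you can also change your data format in the txt file)
adjlist = tuple(tuple(children) for children in adjlist)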

number = 1000
stmnt  = 'all_simple_paths(adjlist, 166, 180, ())'
setup  = 'from __main__ import all_simple_paths, adjlist'
elapsed = timeit.timeit(stmnt, setup=setup, number=number)/number
print('Elapsed: %0.2f ms'%(1000*elapsed))

Running time on my machine:
- original: 0.86 ms
- with cache: 0.01 ms

Note that this method only pays off when there are many shared sub-problems.
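One detail worth keeping in mind: lru_cache builds its key from all arguments, and path is different for every recursive call within a single search, so the version above mostly benefits when the same top-level query is repeated (as timeit does). Since the question states that the graphs are acyclic, the `child not in path` check can be dropped and the cache can be keyed on (node, end) alone. The following is a minimal sketch under that assumption; `make_path_finder` and `paths_from` are hypothetical names, the example graph is made up, and the function would recurse forever on a graph with cycles:

```python
import functools


def make_path_finder(adjlist):
    """Return a cached function paths_from(node, end) for one fixed DAG."""

    @functools.lru_cache(maxsize=None)
    def paths_from(node, end):
        if node == end:
            return ((node,),)
        found = []
        for child in adjlist[node]:
            # Paths from child to end are computed once, then reused.
            for tail in paths_from(child, end):
                found.append((node,) + tail)
        return tuple(found)

    return paths_from


# Hypothetical DAG, assumed for illustration only.
adjlist = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}

find = make_path_finder(adjlist)
result = find('A', 'E')
print(result)
print(find.cache_info())  # the second visit to D is served from the cache
```

Because the cache lives inside the closure, the adjacency structure itself does not need to be hashable. Create a new finder whenever the graph changes (for example at each timestep), otherwise stale paths would be returned.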

Answer 1
1 accepted 2017-09-30 07:50:58
