
Function takes a long time

I'm currently trying to count the unique paths of maximum length from node 1 to node N in a weighted directed acyclic graph. I have worked out how to get the maximum length, but I am stuck on getting the number of paths that have that length.

The data is input like this:

91 120  # Number of nodes, number of edges

1 2 34

1 3 15

2 4 10

... and so on, where "1 2 34" means an edge from node 1 to node 2 with a weight of 34.

I read the data into a dictionary, so my dict looks like:
_distance = {1: [(2, 34), (3, 15)], 2: [(4, 10)], 3: [(4, 17)], 4: [(5, 36), (6, 22)], 5: [(7, 8)], ...}  # and so on
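Building that dictionary from the input format could be sketched as follows. This is a minimal sketch, not the asker's actual reader: the function name parse_graph is hypothetical, and the "4 4" header in the sample is made up to match the first four edges shown above.

```python
from collections import defaultdict

def parse_graph(text):
    """Parse 'nodes edges' followed by 'u v w' lines into an adjacency dict."""
    # Drop trailing '# ...' comments and blank lines before splitting fields.
    rows = [line.split("#")[0].split() for line in text.strip().splitlines()]
    rows = [row for row in rows if row]
    num_nodes, num_edges = int(rows[0][0]), int(rows[0][1])
    distance = defaultdict(list)
    for u, v, w in rows[1:1 + num_edges]:
        distance[int(u)].append((int(v), int(w)))
    return num_nodes, dict(distance)

# Hypothetical 4-node sample built from the edges quoted in the question.
sample = """4 4  # Number of nodes, number of edges
1 2 34
1 3 15
2 4 10
3 4 17
"""
num_nodes, distance = parse_graph(sample)
```

Nodes without outgoing edges (node 4 here) get no key, so later lookups are safer with distance.get(node, []).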

I have worked out how to get the length of the longest path using this:

First I make a list of vertices:

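class Vertice:
    def __init__(self, name, weight=0, visited=False):
        self._n = name            # vertex index
        self._w = weight          # longest distance found so far
        self._visited = visited   # whether any edge has reached this vertex
        self.pathsTo = 0          # path counter used in the EDIT below; initial value assumed

_nodes = []
for i in range(numberOfNodes):  # List of vertices (0..n-1)
    _nodes.append(Vertice(i))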

Next I iterate through my dictionary, setting each node to the maximum weight it can have:

for vert, neighbors in _distance.items():  # items() works on Python 2 and 3
    _vert = _nodes[vert-1]  # Current vertex; the list starts at 0, so use n-1

    for x, y in neighbors:  # x = neighbor, y = weight of the edge to it
        _v = _nodes[x-1]    # Node #1 is at index 0

        if _v._visited:
            # Keep the larger of the stored weight and the new candidate
            if _v._w < _vert._w + y:
                _v._w = y + _vert._w
        else:
            _v._w = y + _vert._w
            _v._visited = True

With this done, the last node holds the maximum weight, so I can just call

max_length = _nodes[-1]._w

to get the maximum weight. This seems to run fast and has no trouble finding the maximum path length even on the bigger data set. I then take my max value and pass it to this function:

#  Start from the first node in the dictionary; distances is our dict
#  Target is the last node in the list of nodes, i.e. the total number of nodes.
numLongestPaths(currentLocation=1, target=_numNodes, distances=_distance, maxlength=max_length)

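def numLongestPaths(currentLocation, maxlength, target, total=0, distances=None):
    # Count the paths from currentLocation to target whose weights sum to maxlength.
    # total is the weight accumulated so far (renamed from sum, which shadows the builtin).
    distances = distances or {}
    _count = 0

    if currentLocation == target:
        if total == maxlength:
            _count += 1
    else:
        # .get() avoids a KeyError for nodes without outgoing edges
        for vert, weight in distances.get(currentLocation, []):
            _count += numLongestPaths(vert, maxlength, target, total + weight, distances)

    return _count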

Once we hit the end node, I simply check whether the current sum equals the max; if it does, I add one to the count, and if not, I do nothing.

This works instantly for inputs such as 8 nodes with a longest path of 20 (finding 3 paths), and 100 nodes with a longest length of 149 and only 1 unique path of that length. But when I try a data set with 91 nodes, a longest path of 1338 and 32 unique paths, the function takes an extremely long time. It works, but it is very slow.

Can someone give me some tips on what is wrong with my function that makes it take so long to find the number of paths of length X from 1..N? I'm assuming it has exponential run time, but I'm unsure how to fix it.

Thank you for your help!

EDIT: Okay, I was overthinking this and going about it the wrong way. I restructured my approach and my code is now as follows:

# BEGIN SEARCH.
for vert, neighbors in _distance.items():  # items() works on Python 2 and 3
    _vert = _nodes[vert-1]  # Current vertex; the list starts at 0, so use n-1

    for x, y in neighbors:  # x = neighbor, y = weight of the edge to it
        _v = _nodes[x-1]    # Node #1 is at index 0

        if _v._visited:
            if _v._w > _vert._w + y:
                pass  # existing path is longer; keep it
            elif _v._w == _vert._w + y:
                _v.pathsTo += _vert.pathsTo  # tie: add the paths arriving via _vert
            else:
                _v.pathsTo = _vert.pathsTo   # new longest path replaces the count
                _v._w = y + _vert._w
        else:
            _v._w = y + _vert._w
            _v.pathsTo = max(_vert.pathsTo, _v.pathsTo + 1)
            _v._visited = True

I added a pathsTo variable to my Vertice class, which holds the number of unique paths of maximum length.

Your numLongestPaths is slow because you're recursively trying every possible path, and there can be exponentially many of those. Find a way to avoid computing numLongestPaths for any node more than once.
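One way to avoid the repeated work is to memoize the recursion. The sketch below is an illustration rather than the asker's final code: the name count_paths_of_length and the small example graph are hypothetical, and it caches on the pair (node, remaining length) so each such state is evaluated only once.

```python
from functools import lru_cache

def count_paths_of_length(distances, source, target, length):
    """Count source->target paths whose edge weights sum to exactly `length`."""

    @lru_cache(maxsize=None)
    def count(node, remaining):
        # Reaching the target ends the path; it counts only if the budget is used up.
        if node == target:
            return 1 if remaining == 0 else 0
        # Nodes with no outgoing edges contribute nothing.
        return sum(count(nxt, remaining - weight)
                   for nxt, weight in distances.get(node, []))

    return count(source, length)

# Hypothetical graph: 1->2->4 and 1->3->4 both have total weight 7.
distances = {1: [(2, 3), (3, 3)], 2: [(4, 4)], 3: [(4, 4)], 4: []}
print(count_paths_of_length(distances, 1, 4, 7))  # prints 2
```

The recursion depth is bounded by the number of nodes on a path, which is fine for graphs of around 100 nodes; much deeper graphs would call for an iterative version.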

Also, your original _w computation is broken, because when it computes a node's _w value, it does nothing to ensure the other _w values it's relying on have themselves been computed. You will need to avoid using uninitialized values; a topological sort may be useful, although it sounds like the vertex labels may have already been assigned in topological sort order.
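A sketch of the topological-sort idea, computing both the longest length and the number of paths that reach it in a single pass. It assumes nodes are labelled 1..num_nodes with the target being the last node, as in the question; the function name longest_path_count and the use of Kahn's algorithm for the ordering are choices made for this example.

```python
from collections import deque

def longest_path_count(distances, num_nodes, source=1):
    """Return (max_length, path_count) for longest paths from source to num_nodes."""
    # Count incoming edges for every node.
    indegree = [0] * (num_nodes + 1)
    for edges in distances.values():
        for nxt, _ in edges:
            indegree[nxt] += 1

    # Kahn's algorithm: repeatedly take nodes whose predecessors are all done.
    queue = deque(v for v in range(1, num_nodes + 1) if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for nxt, _ in distances.get(v, []):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    best = [None] * (num_nodes + 1)  # None marks nodes not reachable from source
    ways = [0] * (num_nodes + 1)
    best[source], ways[source] = 0, 1
    for v in order:
        if best[v] is None:
            continue
        for nxt, weight in distances.get(v, []):
            candidate = best[v] + weight
            if best[nxt] is None or candidate > best[nxt]:
                best[nxt], ways[nxt] = candidate, ways[v]  # strictly longer path
            elif candidate == best[nxt]:
                ways[nxt] += ways[v]                        # another path of equal length
    return best[num_nodes], ways[num_nodes]

# The first four edges from the question: 1->2->4 has length 44, 1->3->4 has length 32.
print(longest_path_count({1: [(2, 34), (3, 15)], 2: [(4, 10)], 3: [(4, 17)]}, 4))  # prints (44, 1)
```

Because every node is finalized before its outgoing edges are relaxed, the result does not depend on dictionary iteration order, and the run time is linear in the number of nodes plus edges.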

In addition to @user2357112's answer, here are two more recommendations:

Language

If you want this code to be as efficient as possible, I recommend using C. Python is a great scripting language, but it is really slow compared to compiled alternatives.

Data structure

Nodes are named in an ordered fashion, so you can optimize your code a lot by using a list instead of a dictionary, i.e.:

_distance = [[] for i in range(_length)]
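As a rough illustration of that layout, the dictionary from the question could be converted as below. The helper name to_adjacency_list is hypothetical, and leaving index 0 unused so that node k lives at index k is an assumption made here to avoid the x-1 offsets.

```python
def to_adjacency_list(distance, num_nodes):
    """Convert {node: [(neighbor, weight), ...]} into a list indexed by node number."""
    # Index 0 is left empty so that node k is stored at adjacency[k].
    adjacency = [[] for _ in range(num_nodes + 1)]
    for node, edges in distance.items():
        adjacency[node] = list(edges)
    return adjacency

adjacency = to_adjacency_list({1: [(2, 34), (3, 15)], 2: [(4, 10)], 3: [(4, 17)]}, 4)
```

Nodes with no outgoing edges simply keep an empty list, so no missing-key handling is needed.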
