What's the most efficient algorithm for topologically sorting a (special case of a) directed acyclic graph?

I have a data set that is a special case of a directed acyclic graph (DAG). Each node in my DAG has either 0 or 1 outgoing arcs. All arcs are equally weighted (that is, the only information contained in an arc is the node it points at; there is no "distance", "cost", or "weight").

My users will input nodes in a semi-random order, with the expectation that the order of arcless nodes is preserved, but every pointed-at node gets sorted before the node that points at it (so child-first ordering).

Here is a simplified class representing my data:

class Node:
    def __init__(self, name, arc=None):
        self.name = str(name)
        # Keep None for arcless nodes; str(None) would store the string 'None'
        self.arc = None if arc is None else str(arc)

(Note that self.arc is the name of the pointed-at node as a string, not the node object itself.)

So, given an input like this:

input = [Node('A'),
         Node('B'),
         Node('C'),
         Node('Z', 'Y'),
         Node('X', 'W'),
         Node('Y', 'X'),
         Node('W')]

The output should look like this, preferably produced with the fewest loops and intermediate data structures:

output = [Node('A'),
          Node('B'),
          Node('C'),
          Node('W'),
          Node('X', 'W'),
          Node('Y', 'X'),
          Node('Z', 'Y')]
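
To make the child-first requirement concrete, here is a minimal sketch of a check that a candidate ordering satisfies it. The helper name `is_child_first` is made up for illustration, the `Node` class is a stand-in for the simplified class above (it assumes `arc` stays `None` for arcless nodes), and treating an arc to an unknown node as a failure is an assumption of this sketch.

```python
class Node:
    """Stand-in for the simplified class above (assumption: arc is None when absent)."""
    def __init__(self, name, arc=None):
        self.name = str(name)
        self.arc = None if arc is None else str(arc)


def is_child_first(nodes):
    """Return True if every pointed-at node appears before the node pointing at it.

    Assumption for this sketch: an arc naming a node that is not in the
    list counts as a failure.
    """
    # Map each node name to its index in the candidate ordering
    position = {node.name: i for i, node in enumerate(nodes)}
    return all(
        node.arc is None
        or (node.arc in position and position[node.arc] < position[node.name])
        for node in nodes
    )


unsorted_nodes = [Node('A'), Node('B'), Node('C'), Node('Z', 'Y'),
                  Node('X', 'W'), Node('Y', 'X'), Node('W')]
expected = [Node('A'), Node('B'), Node('C'), Node('W'),
            Node('X', 'W'), Node('Y', 'X'), Node('Z', 'Y')]

print(is_child_first(unsorted_nodes))  # False: Z comes before Y
print(is_child_first(expected))        # True
```

This only checks the arc constraint; it says nothing about whether the relative order of arcless nodes was kept.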

So far this is the best algorithm I've been able to come up with:

from collections import OrderedDict


def topo_sort(nodes):
    objects = {}           # maps node names back to node objects
    graph = OrderedDict()  # maps node names to arc names
    output = []
    for node in nodes:
        objects[node.name] = node
        graph[node.name] = node.arc
    arcs = graph.values()

    # optional
    missing = set(arcs) - set(graph) - {None}
    if missing:
        print('Invalid arcs: {}'.format(', '.join(missing)))

    def walk(arc):
        """Recurse down the tree from root to leaf nodes."""
        obj = objects.get(arc)
        if obj and obj not in output:
            walk(graph.get(arc))
            output.append(obj)

    # Find all root nodes
    for node in graph:
        if node not in arcs:
            walk(node)
    return output

What I like about this is that there are only 2 loops (one to build the graph and one to find the roots), and the output is built exclusively with list.append(), with no slow list.insert() calls. Can anybody think of any improvements to this? Any obvious inefficiencies I've overlooked?
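
One candidate inefficiency is the membership tests: `node not in arcs` scans the dictionary's values and `obj not in output` scans a list, so each test is linear in the number of nodes. Below is a minimal sketch of a variant that uses sets for those tests and follows each chain with a loop instead of recursion. The names `topo_sort_sets`, `pointed_at`, and `emitted` are made up for illustration, the `Node` class is a stand-in that keeps `arc` as `None` when absent, and node names are assumed to be unique.

```python
class Node:
    """Stand-in for the simplified class above (assumption: arc is None when absent)."""
    def __init__(self, name, arc=None):
        self.name = str(name)
        self.arc = None if arc is None else str(arc)


def topo_sort_sets(nodes):
    """Child-first ordering using set lookups (sketch; assumes unique names)."""
    objects = {node.name: node for node in nodes}  # name -> node object
    pointed_at = {node.arc for node in nodes if node.arc is not None}
    emitted = set()  # names already placed in a chain or in the output
    output = []

    for node in nodes:
        if node.name in pointed_at:
            continue  # not a root; it is reached by following arcs from a root
        # Follow the single outgoing arc until an arcless, unknown,
        # or already-emitted node is reached
        chain = []
        current = node
        while current is not None and current.name not in emitted:
            emitted.add(current.name)
            chain.append(current)
            current = objects.get(current.arc)
        # The chain was collected root-first, so reverse it for child-first order
        output.extend(reversed(chain))
    return output


nodes = [Node('A'), Node('B'), Node('C'), Node('Z', 'Y'),
         Node('X', 'W'), Node('Y', 'X'), Node('W')]
print([n.name for n in topo_sort_sets(nodes)])
# ['A', 'B', 'C', 'W', 'X', 'Y', 'Z']
```

The loop also sidesteps Python's recursion limit on very long chains. Whether this is actually faster for realistic input sizes is something I have not measured.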

Thanks!
