简体   繁体   中英

What's the most efficient algorithm for topologically sorting a (special case of a) directed acyclic graph?

I have a data set that is a special case of a directed acyclic graph (DAG). The nodes in my DAG have either 0 or 1 arcs. All arcs are equally weighted (that is, the only information contained in an arc is the node it points at, no "distance" or "cost" or "weight").

My users will input nodes in a semi-random order, with the expectation that the order of arcless nodes be preserved, but all arcs get sorted before the nodes that point at them (so child-first ordering).

Here is a simplified class representing my data:

class Node:
    def __init__(self, name, arc=None):
        self.name = str(name)
        self.arc = str(arc)

(note that self.arc is a string representation of the pointed-at node and not the object itself)

So, given an input like this:

input = [Node('A'),
         Node('B'),
         Node('C'),
         Node('Z', 'Y'),
         Node('X', 'W'),
         Node('Y', 'X'),
         Node('W')]

You would get output like this, preferably using the fewest loops and intermediate data structures:

output = [Node('A'),
          Node('B'),
          Node('C'),
          Node('W'),
          Node('X', 'W'),
          Node('Y', 'X'),
          Node('Z', 'Y')]

So far this is the best algorithm I've been able to come up with:

def topo_sort(nodes):
    objects = {}           # maps node names back to node objects
    graph = OrderedDict()  # maps node names to arc names
    output = []
    for node in nodes:
        objects[node.name] = node
        graph[node.name] = node.arc
    arcs = graph.values()

    # optional
    missing = set(arcs) - set(graph) - {None}
    if missing:
        print('Invalid arcs: {}'.format(', '.join(missing)))

    def walk(arc):
        """Recurse down the tree from root to leaf nodes."""
        obj = objects.get(arc)
        if obj and obj not in output:
            follow_the_arcs(graph.get(arc))
            output.append(obj)

    # Find all root nodes
    for node in graph:
        if node not in arcs:
            walk(node)
    return ordered

What I like about this is that there are only 2 loops (one to build the graph and one to find the roots), and the output is built exclusively with list.append() , no slow list.insert() s. Can anybody think of any improvements to this? Any obvious inefficiencies I've overlooked?

Thanks!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM