简体   繁体   中英

How to cleanly avoid loops in recursive function (breadth-first traversal)

I'm writing a recursive breadth-first traversal of a network. The problem I ran into is that the network often looks like this:

    1
   / \
  2   3
   \ /
    4
    |
    5

So my traversal starts at 1, then traverses to 2, then 3. The next stop is to proceed to 4, so 2 traverses to 4. After this, 3 traverses to 4, and suddenly I'm duplicating work as both lines try to traverse to 5.

The solution I've found is to create a list called self.already_traversed , and every time a node is traversed, I append it to the list. Then, when I'm traversing from node 4, I check to make sure it hasn't already been traversed.

The problem here is that I'm using an instance variable for this, so I need a way to set up the list before the first recursion and a way to clean it up afterwards. The way I'm currently doing this is:

self.already_traversed = []
self._traverse_all_nodes(start_id)
self.already_traversed = []

Of course, it sucks to be twoggling variables outside of the function that's using them. Is there a better way to do this so this can be put into my traversal function?


Here's the actual code, though I recognize it's a bit dense:

def _traverse_all_nodes(self, root_authority, max_depth=6):
    """Recursively build a networkx graph

    Process is:
     - Work backwards through the authorities for self.cluster_end and all
       of its children.
     - For each authority, add it to a networkx graph, if:
        - it happened after self.cluster_start
        - it's in the Supreme Court
        - we haven't exceeded a max_depth of six cases.
        - we haven't already followed this path
    """
    g = networkx.Graph()
    if hasattr(self, 'already_traversed'):
        is_already_traversed = (root_authority.pk in self.visited_nodes)
    else:
        # First run. Create an empty list.
        self.already_traversed = []
        is_already_traversed = False

    is_past_max_depth = (max_depth <= 0)
    is_cluster_start_obj = (root_authority == self.cluster_start)
    blocking_conditions = [
        is_past_max_depth,
        is_cluster_start_obj,
        is_already_traversed,
    ]
    if not any(blocking_conditions):
        print "  No blocking conditions. Pressing on."
        self.visited_nodes.append(root_authority.pk)
        for authority in root_authority.authorities.filter(
                docket__court='scotus',
                date_filed__gte=self.cluster_start.date_filed):

            g.add_edge(root_authority.pk, authority.pk)
            # Combine our present graph with the result of the next
            # recursion
            g = networkx.compose(g, self._build_graph(
                authority,
                max_depth - 1,
            ))

    return g

def add_clusters(self):
    """Do the network analysis to add clusters to the model.

    Process is to:
     - Build a networkx graph
     - For all nodes in the graph, add them to self.clusters
    """
    self.already_traversed = []
    g = self._traverse_all_nodes(
        self.cluster_end,
        max_depth=6,
    )
    self.already_traversed = []

Check out:

How do I pass a variable by reference?

which contains an example on how to past a list by reference. If you pass the list by reference, every call to your function will refer to the same list.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM