Iteratively compute subtree sizes for all nodes?

Question

I am trying to create an iterative version of this:

def computeSize(id):
    subtreeSize[id] = 1
    for child in children[id]:
        computeSize(child)
        subtreeSize[id] += subtreeSize[child]

"Iterative" here means no recursion: in Python, if the graph is large and contains long linear chains anywhere, the recursive version exceeds the maximum recursion depth and raises a RecursionError.
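To make the failure concrete, here is a minimal sketch that reproduces it. The `children` dictionary is hypothetical data (a single chain one link longer than the interpreter allows), and `overflowed` is just an illustrative flag:

```python
import sys

# Hypothetical data: a linear chain 0 -> 1 -> 2 -> ... that is deeper
# than the interpreter's current recursion limit.
depth = sys.getrecursionlimit() + 100
children = {i: [i + 1] for i in range(depth)}
children[depth] = []  # the last node is a leaf
subtreeSize = {}

def computeSize(id):
    subtreeSize[id] = 1
    for child in children[id]:
        computeSize(child)
        subtreeSize[id] += subtreeSize[child]

try:
    computeSize(0)
    overflowed = False
except RecursionError:
    # Raised once the call depth passes the recursion limit.
    overflowed = True

print(overflowed)
```

Raising the limit with `sys.setrecursionlimit` only postpones the problem, which is why I want a version with an explicit stack.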

I'm trying to use an explicit stack instead (modeled on the iterative DFS algorithm), but I'm having difficulty with the details:

def computeSubtreeSizes(self): #self.sizes[nodeID] has size of subtree
    stack = [self.rootID] #e.g. rootID = 1
    visited = set()

    while stack:
        nodeID = stack.pop()
        if nodeID not in visited:
            visited.add(nodeID)
            for nextNodeID in self.nodes[nodeID]:
                stack.append(nextNodeID)

For example, the first thing I do is pop the root ID off the stack, but once the loop over its children has run I've basically "lost" that ID and have no way to assign its size later.

Do I need a second stack somehow?

Answer 1

Untested, so consider this pseudo-code. The idea is to keep a stack of nodes being processed and, for each node, a corresponding list of its direct subnodes that have not yet been processed. Each item on the main stack is therefore a tuple: the first item is the node, and the second is the list of unprocessed subnodes. A node stays on the stack until all of its subnodes are done, so its ID is never lost.

def computeSubtreeSizes(self):
    stack = [(self.rootID, [])] #e.g. rootID = 1
    visited = self.sizes = {}

    while stack:
        nodeID, subnodes = stack[-1]
        size = visited.get(nodeID)
        if size is None:
            # Haven't seen it before.  Set total to 1,
            # and set up the list of subnodes.
            visited[nodeID] = size = 1
            subnodes[:] = self.nodes[nodeID]
        if subnodes:
            # Process all the subnodes one by one
            stack.append((subnodes.pop(), []))
        else:
            # When finished, update the parent
            stack.pop()
            if stack:
                visited[stack[-1][0]] += size
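For quick experimentation, the same idea can be sketched as a standalone function. This is only an illustration under assumptions: the name `compute_subtree_sizes` and the `children` dictionary (node ID mapped to a list of child IDs) are hypothetical stand-ins for `self.nodes` and `self.rootID`.

```python
def compute_subtree_sizes(children, root):
    """Return {node: subtree size} using an explicit stack, no recursion."""
    sizes = {}
    # Each stack entry is (node, list of its children still to be processed).
    stack = [(root, [])]
    while stack:
        node, pending = stack[-1]
        size = sizes.get(node)
        if size is None:
            # First visit: count the node itself and copy its child list.
            sizes[node] = size = 1
            pending[:] = children[node]
        if pending:
            # Descend into the next unprocessed child.
            stack.append((pending.pop(), []))
        else:
            # All children done: fold this node's total into its parent.
            stack.pop()
            if stack:
                sizes[stack[-1][0]] += size
    return sizes

# Small hypothetical tree: 1 -> (2, 3), 2 -> (4, 5)
children = {1: [2, 3], 2: [4, 5], 3: [], 4: [], 5: []}
sizes = compute_subtree_sizes(children, 1)
print(sizes[1], sizes[2], sizes[3])  # 5 3 1

# A long chain that would overflow the recursive version.
chain = {i: [i + 1] for i in range(100_000)}
chain[100_000] = []
print(compute_subtree_sizes(chain, 0)[0])  # 100001
```

The main stack only ever holds the path from the root to the current node, so memory use is proportional to the depth of the tree plus the pending child lists along that path.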

An obvious potential performance enhancement would be to avoid pushing nodes that have already been visited onto the main stack. This only pays off if duplicate subtrees (nodes reachable through more than one parent) are extremely common. It takes more code and is less readable, but might look something like this:

def computeSubtreeSizes(self):
    stack = [(self.rootID, [])] #e.g. rootID = 1
    visited = self.sizes = {}

    while stack:
        nodeID, subnodes = stack[-1]
        size = visited.get(nodeID)
        if size is None:
            # Haven't seen it before.  Add totals of
            # all previously visited subnodes, and
            # add the others to the list of nodes to
            # be visited.
            size = 1
            for sn in self.nodes[nodeID]:
                sn_size = visited.get(sn)
                if sn_size is None:
                    subnodes.append(sn)
                else:
                    size += sn_size
            visited[nodeID] = size

        if subnodes:
            # Process all the subnodes one by one
            stack.append((subnodes.pop(), []))
        else:
            # When finished, update the parent
            stack.pop()
            if stack:
                visited[stack[-1][0]] += size
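If the structure is a strict tree (every node has exactly one parent), a different and simpler technique may also work: record the order in which a plain DFS stack visits the nodes, then add up sizes while walking that order backwards. This is not the approach above, just a sketch under that assumption; the function name and the `children` dictionary are hypothetical.

```python
def subtree_sizes_by_reverse_order(children, root):
    """Two-pass variant. Assumes a tree: each node is reachable only once."""
    # Pass 1: plain iterative DFS. A parent is always recorded before
    # any of its descendants.
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Pass 2: walk the order backwards, so every child's total is final
    # before it is added to its parent.
    sizes = {node: 1 for node in order}
    for node in reversed(order):
        for child in children[node]:
            sizes[node] += sizes[child]
    return sizes

children = {1: [2, 3], 2: [4, 5], 3: [], 4: [], 5: []}
print(subtree_sizes_by_reverse_order(children, 1)[1])  # 5
```

The `order` list plays the role of the "second stack" asked about in the question. Unlike the versions above, this one would double-count if a node had more than one parent.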

Edits (especially by the question author after testing) are certainly welcome.

Answer 1: 2, accepted, 2015-08-20 20:03:04