Efficiently iterating arbitrary depth dict tree in Python

Question

I have the following tree data structure stored in dictionaries:

1
   2
      3
         4 -> ["a", "b", "c"]
         5 -> ["x", "y", "z"]
   3
      5
         7 -> ["e", "f", "j"]

Here is how I build a sample of it in Python:

tree = dict()
for i in range(100):
    tree[i] = dict()
    for j in range(10):
        tree[i][j] = dict()
        for k in range(10):
            tree[i][j][k] = dict()
            for l in range(10):
                tree[i][j][k][l] = dict()
                for m in range(10):
                    tree[i][j][k][l][m] = dict()
                    for n in range(10):
                        tree[i][j][k][l][m][n] = ["a", "b", "c", "d", "e", "f", "g"]
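The nested loops above hard-code six key levels. A sample of any depth could instead be built with a small recursive helper. This is only a sketch: `build_tree`, `depth`, `fanout` and `leaf_factory` are names made up for illustration, and unlike the loops above it uses the same number of keys at every level.

```python
def build_tree(depth, fanout, leaf_factory=lambda: ["a", "b", "c", "d", "e", "f", "g"]):
    """Build a dict tree with `depth` levels of keys above the leaves.

    depth=1 gives {0: leaf, 1: leaf, ...}; each extra level wraps that
    in another dict with `fanout` keys.
    """
    if depth == 1:
        return {i: leaf_factory() for i in range(fanout)}
    return {i: build_tree(depth - 1, fanout, leaf_factory) for i in range(fanout)}

# A much smaller tree than the benchmark one: 3 key levels, 4 keys per level.
small = build_tree(depth=3, fanout=4)
```

With this helper, `small[3][2][1]` is a leaf list and `small[3][2]` is still a dict.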

I want to traverse it and do some calculation when reaching each leaf. While doing the calculation I need to know the path to the leaf.

I.e., given a callback

def callback(p1, p2, p3, p4, leaf):
    ...

I want it to be called as follows for my example tree:

callback(1, 2, 3, 4, ["a", "b", "c"])
callback(1, 2, 3, 5, ["x", "y", "z"])
callback(1, 3, 5, 7, ["e", "f", "j"])
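To make the expected behavior concrete, here is the small tree from the diagram as a dict literal, together with a naive reference walker that reproduces the three calls. It is a sketch for checking correctness, not a candidate for speed: `example`, `walk` and `calls` are illustrative names, and treating any non-dict value as a leaf is an assumption (the attempts below pass an explicit depth instead).

```python
# The example tree from the diagram, written as a dict literal.
example = {
    1: {
        2: {3: {4: ["a", "b", "c"], 5: ["x", "y", "z"]}},
        3: {5: {7: ["e", "f", "j"]}},
    }
}

calls = []

def callback(*args):
    # Record the arguments instead of computing anything.
    calls.append(args)

def walk(node, path=()):
    # Naive reference walker: a non-dict value is treated as a leaf.
    for key, child in node.items():
        if isinstance(child, dict):
            walk(child, path + (key,))
        else:
            callback(*(path + (key, child)))

walk(example)
for args in calls:
    print(args)
```

On Python 3.7+, where dicts keep insertion order, `calls` holds the three argument tuples in the order listed above.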

Question: how can I implement this traversal most efficiently? Note that the tree depth is not static.

Here is what I tried:

1. Inline code. This is the fastest one, but is not usable in practice since, again, tree depth is not static.

import time

def callback(*args):
    assert isinstance(args[-1], list)

start = time.time()
for k1, leafs1 in tree.items():
    for k2, leafs2 in leafs1.items():
        for k3, leafs3 in leafs2.items():
            for k4, leafs4 in leafs3.items():
                for k5, leafs5 in leafs4.items():
                    for k6, val in leafs5.items():
                        callback(k1, k2, k3, k4, k5, k6, val)
print("inline: %f" % (time.time() - start))

This takes 3.5 seconds on average using Python 3.4.2 on my laptop.
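For repeatable averages, the standard-library `timeit` module could be used instead of manual `time.time()` calls. The following is a sketch on a deliberately tiny tree so it finishes quickly; `build` and `inline` are illustrative names, and the numbers it prints are not comparable to the 3.5 seconds above.

```python
import timeit

def build(depth, fanout):
    # Hypothetical helper: `depth` key levels above leaf lists.
    if depth == 0:
        return ["a", "b", "c"]
    return {i: build(depth - 1, fanout) for i in range(fanout)}

tree = build(3, 5)  # 5 * 5 * 5 = 125 leaves

def callback(*args):
    assert isinstance(args[-1], list)

def inline():
    # Hard-coded three-level version of the inline traversal.
    for k1, leafs1 in tree.items():
        for k2, leafs2 in leafs1.items():
            for k3, val in leafs2.items():
                callback(k1, k2, k3, val)

# Five measurements of 100 full traversals each.
timings = timeit.repeat(inline, number=100, repeat=5)
average = sum(timings) / len(timings)
print("inline: %f s per 100 runs (best %f)" % (average, min(timings)))
```

The `timeit` documentation suggests that the minimum of the repeats is usually the more informative figure, since higher values tend to reflect interference from other processes.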

2. Recursive approach

from functools import partial
def iterate_tree(tree, depth, callback):
    if depth:
        for k, subtree in tree.items():
            cb = partial(callback, k)
            yield from iterate_tree(subtree, depth-1, cb)
    else:
        for k, v in tree.items():
            rv = callback(k, v)
            yield rv

start = time.time()
for i in iterate_tree(tree, 5, callback):
    pass
print("iterate_tree: %f" % (time.time() - start))

This is generic and nice, but 2 times slower!

3. Non-recursive approach. I thought that maybe recursion, yield from and partial were slowing me down, so I tried to flatten it:

def iterate_tree2(tree, depth, callback):
    iterators = [iter(tree.items())]
    args = []
    while iterators:
        try:
            k, v = next(iterators[-1])
        except StopIteration:
            depth += 1
            iterators.pop()
            if args:
                args.pop()
            continue

        if depth:
            args.append(k)
            iterators.append(iter(v.items()))
            depth -= 1
        else:
            yield callback(*(args + [k, v]))

start = time.time()
for i in iterate_tree2(tree, 5, callback):
    pass
print("iterate_tree2: %f" % (time.time() - start))

This is generic and works, but there is no performance improvement compared to recursion, i.e. it is still two times slower than the inline version.

So how can I implement my traversal in a generic way? And what makes the inline version so much faster?

PS The code above is for Python 3.3+. I've adapted it to Python 2 and results are similar.

SOLUTION AND ANALYSIS

I've made a comparative analysis of all of the solutions and optimizations. The code and results can be obtained from the gist.

TL;DR: The fastest solution is to use the optimized loop-based version:

It's the fastest version that supports convenient result reporting from the callback
It's only 30% slower than the inline version (on Python 3.4)
On PyPy it gets a magnificent speed boost, outperforming even the inline version

Loop-based iterations own everything when run on PyPy.

On interpreters other than PyPy, the major slowdown comes from reporting results from the callback:

Yielding results is the slowest: a ~30% penalty compared to inline. See iterate_tree6 for the loop version and iterate_tree3 for the recursive version
Reporting by calling a callback from the callback is slightly better: 17% slower than inline (on Python 3.4). See iterate_tree3_noyield
No reporting at all can run faster than inline. See iterate_tree6_nofeedback

For recursion-based versions, accumulate the arguments in tuples rather than lists. The performance difference is rather significant.
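As an illustration of the last two points, here is a sketch of a recursive traversal that accumulates the path in a tuple and reports each result by calling a second callable instead of yielding. It is not the code from the gist: `walk_report`, `report` and `path` are names chosen for this example.

```python
def walk_report(tree, depth, callback, report, path=()):
    """Recursive traversal; `depth` is the number of dict levels below this one."""
    if depth:
        for key, subtree in tree.items():
            # path + (key,) builds a new tuple, so nothing has to be
            # popped on the way back up.
            walk_report(subtree, depth - 1, callback, report, path + (key,))
    else:
        for key, leaf in tree.items():
            # Hand the result to `report` instead of yielding it.
            report(callback(*(path + (key, leaf))))

tree = {
    1: {
        2: {3: {4: ["a", "b", "c"], 5: ["x", "y", "z"]}},
        3: {5: {7: ["e", "f", "j"]}},
    }
}

results = []
# The callback here simply returns the path part of its arguments.
walk_report(tree, 3, lambda *args: args[:-1], results.append)
```

Because nothing is yielded, the function is a plain function rather than a generator, and the caller collects results through `report` (here `results.append`).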

Thanks to everyone who contributed to this topic.

Answer 1

I managed to improve the performance to about halfway between the inlined version and your first recursive version with the following, which I think is equivalent.

def iterate_tree_2(tree, depth, accumulator, callback):
    if depth:
        for k, subtree in tree.items():
            yield from iterate_tree_2(subtree, depth-1, accumulator + (k,), callback)
    else:
        for k, v in tree.items():
            rv = callback(accumulator + (k,), v)
            yield rv

>>> for i in iterate_tree_2(tree, depth, (), callback): pass

It's slightly different in that it calls the callback with

callback((1, 2, 3, 4), ["a", "b", "c"])

instead of

callback(1, 2, 3, 4, ["a", "b", "c"])

The implementation difference is that it builds up tuples of arguments rather than using partial, which I guess makes sense since every time you call partial you're adding an extra layer of function call to the callback.
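If the original positional signature is required, the tuple-style call could be adapted with a thin wrapper. This is a sketch only: `spread_path` and `positional_callback` are hypothetical names, and the wrapper adds one extra function call per leaf, so it gives back some of the gain.

```python
def spread_path(callback):
    """Wrap a positional-style callback so it accepts (path_tuple, leaf)."""
    def adapter(path, leaf):
        # Unpack the accumulated path and append the leaf as the last argument.
        return callback(*(path + (leaf,)))
    return adapter

def positional_callback(p1, p2, p3, p4, leaf):
    return "%s/%s/%s/%s -> %s" % (p1, p2, p3, p4, "".join(leaf))

wrapped = spread_path(positional_callback)
result = wrapped((1, 2, 3, 4), ["a", "b", "c"])
print(result)  # 1/2/3/4 -> abc
```

Passing `wrapped` as the callback to the function above should then produce the same calls as the version from the question.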

Answer 2

Here's a recursive approach that seems to perform about 5-10% better than your inline method:

def iter_tree(node, depth, path):
    path.append(node)
    for v in node.values():
        if depth:
            iter_tree(v, depth-1, path)
        else:
            callback(path)

Which you could call with:

iter_tree(tree, 5, [])

Edit: a similar approach that preserves keys, per your comment:

def iter_tree4(node, depth, path):
    for (k,v) in node.items():
        kpath = path + [k]
        if depth:
            iter_tree4(v, depth-1, kpath)
        else:
            callback(kpath, v)

Called the same way.

Note that we lost the performance gain from just keeping track of values, but it's still competitive with your inline method:

Iteration 1  21.3142
Iteration 2  11.2947
Iteration 3   1.3979

The number listed is the percent performance loss: [(recursive-inline)/inline]
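Spelled out as code, the figure could be computed like this. The function name and the sample timings are hypothetical, and the multiplication by 100 is my reading of "percent"; a negative value would mean the recursive version was faster.

```python
def percent_loss(recursive_seconds, inline_seconds):
    """Relative slowdown of the recursive version versus inline, in percent."""
    return (recursive_seconds - inline_seconds) / inline_seconds * 100.0

# Hypothetical timings, not measurements from this answer.
slower = percent_loss(4.2, 3.5)   # roughly 20: recursive is about 20% slower
faster = percent_loss(3.0, 4.0)   # -25.0: recursive is 25% faster
```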

Answer 3

Here's an optimized version of the iterative iterate_tree2. It is 40% faster on my system, mainly thanks to an improved looping structure and the elimination of the try/except. Andrew Magee's recursive code performs approximately the same.

def iterate_tree4(tree, depth, callback):
    iterators = [iter(tree.items())]
    args = [] 
    while iterators:
        while depth:
            for k, v in iterators[-1]:
                args.append(k)
                iterators.append(iter(v.items()))
                depth -= 1
                break
            else:
                break
        else:
            for k, v in iterators[-1]:
                yield callback(*(args + [k, v]))
        depth += 1
        del iterators[-1]
        del args[-1:]
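The inner `for ... break ... else` in the function above is what replaces `next()` wrapped in try/except: it takes at most one item from the iterator, and the `else` branch runs only when the iterator is already exhausted. A minimal isolated sketch of that idiom follows; `take_one` is an illustrative name, not part of the answer's code.

```python
def take_one(iterator):
    """Return (True, item) for the next item, or (False, None) when exhausted."""
    for item in iterator:
        # Leave after the first item; the iterator keeps its position.
        break
    else:
        # The else clause runs only if the loop finished without break,
        # which here means the iterator had nothing left.
        return False, None
    return True, item

it = iter([10, 20])
first = take_one(it)    # (True, 10)
second = take_one(it)   # (True, 20)
third = take_one(it)    # (False, None)
```

This only works as shown when an iterator is passed in; with a list, each call would start again from the first element.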

3 answers

solution1
2 2015-03-19 09:10:30

solution2
1 2015-03-19 09:45:15

solution3
1 ACCEPTED 2015-03-19 11:45:31