I have the following tree data structure stored in dictionaries:
1
2
3
4 -> ["a", "b", "c"]
5 -> ["x", "y", "z"]
3
5
7 -> ["e", "f", "j"]
Here is how I build sample of it in Python:
tree = dict()
for i in range(100):
tree[i] = dict()
for j in range(10):
tree[i][j] = dict()
for k in range(10):
tree[i][j][k] = dict()
for l in range(10):
tree[i][j][k][l] = dict()
for m in range(10):
tree[i][j][k][l][m] = dict()
for n in range(10):
tree[i][j][k][l][m][n] = ["a", "b", "c", "d", "e", "f", "g"]
I want to traverse it and do some calculation when reaching each leaf. While doing calculation I need to know the path to the leaf.
Ie given callback
def callback(p1, p2, p3, p4, leaf):
...
I want it to be called like following using my tree example:
callback(1, 2, 3, 4, ["a", "b", "c"])
callback(1, 2, 3, 5, ["x", "y", "z"])
callback(1, 3, 5, 7, ["e", "f", "j"])
Question: How to implement traversal most efficiently? Note, that tree depth is not static.
Here is what I tried:
1. Inline code. This is the fastest one, but is not usable in practice since, again, tree depth is not static.
def callback(*args):
assert isinstance(args[-1], list)
start = time.time()
for k1, leafs1 in tree.items():
for k2, leafs2 in leafs1.items():
for k3, leafs3 in leafs2.items():
for k4, leafs4 in leafs3.items():
for k5, leafs5 in leafs4.items():
for k6, val in leafs5.items():
callback(k1, k2, k3, k4, k5, k6, val)
print("inline: %f" % (time.time() - start))
This runs 3.5 seconds average using Python 3.4.2 on my laptop.
2. Recursive approach
from functools import partial
def iterate_tree(tree, depth, callback):
if depth:
for k, subtree in tree.items():
cb = partial(callback, k)
yield from iterate_tree(subtree, depth-1, cb)
else:
for k, v in tree.items():
rv = callback(k, v)
yield rv
start = time.time()
for i in iterate_tree(tree, 5, callback):
pass
print("iterate_tree: %f" % (time.time() - start))
This is generic and all that nice, but 2 times slower!
3. Non-recursive approach I thought that may be recursion, yield from
and partial
are slowing me down. So I tried the to flaten it:
def iterate_tree2(tree, depth, callback):
iterators = [iter(tree.items())]
args = []
while iterators:
try:
k, v = next(iterators[-1])
except StopIteration:
depth += 1
iterators.pop()
if args:
args.pop()
continue
if depth:
args.append(k)
iterators.append(iter(v.items()))
depth -= 1
else:
yield callback(*(args + [k, v]))
start = time.time()
for i in iterate_tree2(tree, 5, callback):
pass
print("iterate_tree2: %f" % (time.time() - start))
This is generics and works, but performance improvement compared to recursion, ie still two times slower than inline version.
So how to implement my traversal in a generic way? And what makes inline version so much faster?
PS The code above is for Python 3.3+. I've adapted it to Python 2 and results are similar.
SOLUTION AND ANALYSIS
I've made comparative analysis of all of the solutions and optimizations. The code and results can be obtained from the gist .
TL;DR; The fastest solution is to use optimized loop-based version:
Loop-based iterations own everything when run on PyPy.
On non-pypy, the major slowdown is a result reporting from callback:
yield
ing results is the slowest - ~30% penalty compared to inline. See iterate_tree6
for loop version and iterate_tree3
for recursive version iterate_tree3_noyield
iterate_tree6_nofeedback
For recursion-based versions, use tuples for argument accumulating and not list. The performance difference is rather significant.
Thanks to everyone who contributed to this topic.
I managed to improve the performance to about half way between the inlined version and your first recursive version with this which I think is equivalent.
def iterate_tree_2(tree, depth, accumulator, callback):
if depth:
for k, subtree in tree.items():
yield from iterate_tree_2(subtree, depth-1, accumulator + (k,), callback)
else:
for k, v in tree.items():
rv = callback(accumulator + (k,), v)
yield rv
>>> for i in iterate_tree_2(tree, depth, (), callback): pass
It's slightly different in that it calls the callback with
callback((1, 2, 3, 4), ["a", "b", "c"])
instead of
callback(1, 2, 3, 4, ["a", "b", "c"])
The implementation difference is that it builds up tuples of arguments rather than using partial
. Which I guess makes sense since every time you call partial
you're adding an extra layer of a function call to the callback.
Here's a recursive approach that seems to perform about 5-10% better than your inline method:
def iter_tree(node, depth, path):
path.append(node)
for v in node.values():
if depth:
iter_tree(v, depth-1, path)
else:
callback(path)
Which you could call with:
iter_tree(tree, 5, [])
Edit a similar approach, but preserved keys , per your comment:
def iter_tree4(node, depth, path):
for (k,v) in node.items():
kpath = path + [k]
if depth:
iter_tree4(v, depth-1, kpath)
else:
callback(kpath, v)
Called the same way.
Note that we lost the performance gain from just keeping track of values, but it's still competitive with your inline method:
Iteration 1 21.3142
Iteration 2 11.2947
Iteration 3 1.3979
The number listed is the percent performance loss: [(recursive-inline)/inline]
Here's an optimized version of the iterative iterate_tree2
. It is 40 % faster on my system, mainly thanks to improved looping structure and elimination of the try except
. Andrew Magee's recursive code performs approximately the same.
def iterate_tree4(tree, depth, callback):
iterators = [iter(tree.items())]
args = []
while iterators:
while depth:
for k, v in iterators[-1]:
args.append(k)
iterators.append(iter(v.items()))
depth -= 1
break
else:
break
else:
for k, v in iterators[-1]:
yield callback(*(args + [k, v]))
depth += 1
del iterators[-1]
del args[-1:]
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.