How can I write nested for loops recursively?

I have the following code for processing an XML file:

for el in root:
    checkChild(rootDict, el)
    for child in el:
        checkChild(rootDict, el, child)
        for grandchild in child:
            checkChild(rootDict, el, child, grandchild)
            for grandgrandchild in grandchild:
                checkChild(rootDict, el, child, grandchild, grandgrandchild)
                ...
                   ...

As you can see, on every iteration I just call the same function with one extra parameter. Is there a way to avoid writing so many nested for loops that basically do the same thing?

Any help would be appreciated. Thank you.

Assuming that root comes from an ElementTree parse, you can build a data structure containing the list of all ancestors of each node, and then iterate over it to call checkChild:

import xml.etree.ElementTree as ET

# Sample document for illustration only; substitute your own XML string.
xml = "<root><a><b/></a><c/></root>"

def checkChild(*element_chain):
    # Code placeholder
    print("Checking %s" % '.'.join(t.tag for t in reversed(element_chain)))

tree = ET.fromstring(xml)
# Build a dict mapping each node to the list of its ancestors (nearest first)
nodes_and_parents = {}
for elem in tree.iter():  # tree.iter() yields every element in the XML, not only the root's children
    for child in elem:
        nodes_and_parents[child] = [elem, ] + nodes_and_parents.get(elem, [])

for t, parents in nodes_and_parents.items():
    checkChild(t, *parents)

def recurse(tree):
    """Walks a tree depth-first and yields the path at every step."""
    # We convert the tree to a list of paths through it,
    # with the most recently visited path last. This is the stack.
    def explore(stack):
        try:
            # Popping from the stack means reading the most recently
            # discovered but yet unexplored path in the tree. We yield it
            # so you can call your method on it.
            path = stack.pop()
        except IndexError:
            # The stack is empty. We're done.
            return
        yield path
        # Then we expand this path further, adding all extended paths to the
        # stack. In reversed order so the first child element will end up at
        # the end, and thus will be yielded first.
        stack.extend(path + (elm,) for elm in reversed(path[-1]))
        # Continue with the rest of the stack. Note that the recursion depth
        # grows with the number of nodes, so very large trees may hit
        # Python's recursion limit.
        yield from explore(stack)
    yield from explore([(tree,)])

# The linear structure yields tuples (root, child, ...)
linear = recurse(root)

# Then call checkChild(rootDict, child, ...)
next(linear)  # skip checkChild(rootDict)
for path in linear:
    checkChild(rootDict, *path[1:])

To illustrate, suppose the root looked something like this:

root
  child1
    sub1
    sub2
  child2
    sub3
      subsub1
    sub4
  child3

That is a tree. We can find several paths through this tree, e.g. (root, child1). Feeding this path to the code above results in the call checkChild(rootDict, child1). Eventually checkChild will be called exactly once for every path in the tree. We can thus write the tree as a list of paths like so:

[(root,),
 (root, child1),
 (root, child1, sub1),
 (root, child1, sub2),
 (root, child2),
 (root, child2, sub3),
 (root, child2, sub3, subsub1),
 (root, child2, sub4),
 (root, child3)]

The order of paths in this list happens to match your loop structure. It is called depth-first. (Another sort order, breadth-first, would first list all child nodes, then all sub nodes and finally all subsub nodes.)
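For comparison, here is a minimal sketch of the breadth-first order mentioned above. It is not part of the original answer: the function name `breadth_first_paths` and the sample XML (which mirrors the example tree) are assumptions made for illustration. The only change from the depth-first idea is that paths are taken from the front of a queue instead of the end of a stack.

```python
from collections import deque
import xml.etree.ElementTree as ET

def breadth_first_paths(tree):
    """Yield every path (root, ..., node), shallowest paths first."""
    queue = deque([(tree,)])
    while queue:
        # Taking the oldest path first visits the tree level by level.
        path = queue.popleft()
        yield path
        # Queue one extended path per child of the last element.
        queue.extend(path + (child,) for child in path[-1])

# Hypothetical sample document mirroring the example tree.
root = ET.fromstring(
    "<root>"
    "<child1><sub1/><sub2/></child1>"
    "<child2><sub3><subsub1/></sub3><sub4/></child2>"
    "<child3/>"
    "</root>"
)

tags = [".".join(el.tag for el in path) for path in breadth_first_paths(root)]
for line in tags:
    print(line)
```

Here all three child nodes are listed before any sub node, and subsub1 comes last.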

The depth-first list of paths above corresponds to the stack variable in the code, with one small difference: stack stores only the minimal number of paths it needs to remember, namely those discovered but not yet explored.

To conclude, recurse yields those paths one by one, and the last bit of code invokes the checkChild method as you do in your question.
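If the explicit stack is more machinery than you need, the same depth-first order can probably be had with a plain recursive function. The following is a sketch under assumptions, not the code from the answer above: `walk`, `check_child` (a stand-in for your checkChild that only records tag names) and the sample XML are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

def walk(path, visit):
    """Call visit(*path), then recurse into each child of the last element."""
    visit(*path)
    for child in path[-1]:
        walk(path + (child,), visit)

def check_child(root_dict, *chain):
    # Hypothetical stand-in for checkChild: record the chain of tag names.
    root_dict.setdefault("calls", []).append(tuple(el.tag for el in chain))

# Hypothetical sample document mirroring the example tree.
root = ET.fromstring(
    "<root>"
    "<child1><sub1/><sub2/></child1>"
    "<child2><sub3><subsub1/></sub3><sub4/></child2>"
    "<child3/>"
    "</root>"
)

root_dict = {}
# Start one walk per top-level element, like the outermost for loop.
for el in root:
    walk((el,), lambda *chain: check_child(root_dict, *chain))

for call in root_dict["calls"]:
    print(".".join(call))
```

The recursion depth here equals the nesting depth of the XML rather than the number of nodes, which is usually small for real documents.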

Whatever operation you wish to perform on files and directories, you can traverse them. In Python, the easiest way I know is:

#!/usr/bin/env python

import os

# Set the directory you want to start from
root_dir = '.'
for dir_name, subdir_list, file_list in os.walk(root_dir):
    print(f'Found directory: {dir_name}')
    for file_name in file_list:
        print(f'\t{file_name}')

While traversing, you can add them to groups or perform other operations.
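As one possible reading of "adding to groups", the sketch below collects file paths by extension during the walk. The grouping criterion and the function name `group_by_extension` are assumptions made for illustration; only `os.walk` and `os.path.splitext` come from the standard library.

```python
import os
from collections import defaultdict

def group_by_extension(root_dir):
    """Walk root_dir and return {extension: [file paths]}."""
    groups = defaultdict(list)
    for dir_name, _subdir_list, file_list in os.walk(root_dir):
        for file_name in file_list:
            # splitext returns (stem, extension); the extension keeps its dot.
            ext = os.path.splitext(file_name)[1]
            groups[ext].append(os.path.join(dir_name, file_name))
    return dict(groups)
```

Calling `group_by_extension('.')` would return a dictionary keyed by extension, with an empty-string key for files that have none.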
