How best to iterate (breadth-first) over an lxml etree using Python

Question

I'm new to lxml and trying to wrap my head around how to use it for what I want to do. I've got a well-formed and valid XML file

<root>
  <a>
    <b>Text</b>
    <c>More text</c>
  </a>
  <!-- some comment -->
  <a>
    <d id="10" />
  </a>
</root>

something like this. Now I'd like to visit the children breadth-first, and the best I can come up with is something like this:

for e in xml.getroot()[0].itersiblings() :
    print(e.tag, e.attrib)

and then take it from there. However, this gives me all elements including comments:

a {}
<built-in function Comment> {}
a {}

How do I skip over comments? Is there a better way to iterate over the direct children of a node?

In general, what are the recommendations for parsing an XML document into a tree vs. event-driven pull-parsing using, say, iterparse()?

Answer 1

This works for your case:

for child in doc.getroot().iterchildren("*"):
    print(child.tag, child.attrib)
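
To see why the "*" filter helps, here is a minimal sketch that runs the three variants side by side on the XML from the question. It assumes lxml is installed; the variable names (XML, unfiltered, by_wildcard, by_type) are made up for illustration. Passing tag=etree.Element is an alternative that filters by node type instead of by name.

```python
from lxml import etree

# The sample document from the question, inlined for illustration.
XML = b"""<root>
  <a>
    <b>Text</b>
    <c>More text</c>
  </a>
  <!-- some comment -->
  <a>
    <d id="10" />
  </a>
</root>"""

root = etree.fromstring(XML)

# No filter: the comment node is yielded alongside the elements.
unfiltered = [child.tag for child in root.iterchildren()]

# "*" matches elements of any name, so the comment is skipped.
by_wildcard = [child.tag for child in root.iterchildren("*")]

# Filtering by node type instead of by name.
by_type = [child.tag for child in root.iterchildren(tag=etree.Element)]

print(by_wildcard)
print(by_type)
```

Both filtered variants only look at direct children, so they need to be applied again at each level to walk the whole tree.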

Answer 2

This question was asked over 9 years ago, but I just ran into this issue myself, and I solved it with the following:

import xml.etree.ElementTree as ET

xmlfile = ET.parse("file.xml")
root = xmlfile.getroot()

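visit = [root]  # nodes waiting to be visited, oldest first
while visit:
    curr = visit.pop(0)  # take the oldest node (first in, first out)
    print(curr.tag, curr.attrib, curr.text)
    visit += list(curr)  # append this node's direct children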

list(node) gives a list of all the immediate children of that node. By appending those children to the end of a queue and repeating the process with whatever is at the front of the queue (popping it off at the same time), we end up with a standard breadth-first traversal. Note that the list is used as a queue here, not a stack: pop(0) removes the oldest entry, which is what makes the order breadth-first rather than depth-first. Also note that this uses the standard library's xml.etree.ElementTree rather than lxml.
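
As a variation on the loop above, the same traversal can be sketched with collections.deque, whose popleft() avoids the cost of shifting the whole list that pop(0) incurs. The helper name breadth_first and the inlined XML are assumptions for illustration, not part of the original answer. The isinstance check is there on the assumption that the function might also be handed an lxml tree, where comments and processing instructions have a non-string tag; with xml.etree.ElementTree's default parser, comments are dropped at parse time anyway.

```python
from collections import deque
import xml.etree.ElementTree as ET

# The sample document from the question, inlined for illustration.
XML = """<root>
  <a>
    <b>Text</b>
    <c>More text</c>
  </a>
  <!-- some comment -->
  <a>
    <d id="10" />
  </a>
</root>"""


def breadth_first(root):
    """Yield elements level by level, starting with root itself."""
    queue = deque([root])
    while queue:
        node = queue.popleft()  # oldest node first (FIFO)
        yield node
        # Element tags are strings; skip any child whose tag is not
        # (e.g. comments in an lxml tree).
        queue.extend(child for child in node if isinstance(child.tag, str))


root = ET.fromstring(XML)
order = [node.tag for node in breadth_first(root)]
print(order)
```

Writing it as a generator keeps the traversal separate from whatever is done with each node, so the caller can print, collect, or stop early.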
