
Python ElementTree - iterate through child nodes and text in order

I am using Python 3 and the ElementTree API. I have some XML of the form:

<root>
  <item>Over the <ref id="river" /> and through the <ref id="woods" />.</item>
  <item>To Grandmother's <ref id="house" /> we go.</item>
</root>

I want to be able to iterate through the text and child nodes for a given item in order. So, for the first item, the list I want printed line by line would be:

Over the 
<Element 'ref' at 0x######>
 and through the 
<Element 'ref' at 0x######>
.

But I can't figure out how to do this with ElementTree. I can get the text in order via itertext(), and the child elements in order in several ways, but not the two interleaved in order. I was hoping I could use an XPath expression like ./@text|./ref, but ElementTree's subset of XPath doesn't seem to support attribute selection. If I could even just get the original raw XML contents of each item node, I could parse it out myself if necessary.

Try this:

from xml.etree import ElementTree as ET

xml = """<root>
  <item>Over the <ref id="river" /> and through the <ref id="woods" />.</item>
  <item>To Grandmother's <ref id="house" /> we go.</item>
</root>"""

root = ET.fromstring(xml)

for item in root:
    if item.text:
        print(item.text)
    for ref in item:
        print(ref)
        if ref.tail:
            print(ref.tail)

ElementTree's representation of "mixed content" is based on the .text and .tail attributes. The .text of an element holds the element's text up to its first child element. That child's .tail then holds the text of the parent that follows the child, up to the next child or the end of the parent. See the API doc.
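If the interleaved sequence is needed in more than one place, the same .text/.tail logic can be wrapped in a generator. This is only a sketch: iter_mixed is a hypothetical helper name chosen here, not part of the ElementTree API, and it only looks at direct children (it does not descend into nested elements).

```python
from xml.etree import ElementTree as ET


def iter_mixed(element):
    """Yield text chunks and direct child elements of `element` in document order.

    Hypothetical helper, not part of ElementTree itself.
    """
    if element.text:
        yield element.text          # text before the first child
    for child in element:
        yield child                 # the child element itself
        if child.tail:
            yield child.tail        # parent text that follows this child


root = ET.fromstring(
    "<root>"
    '<item>Over the <ref id="river" /> and through the <ref id="woods" />.</item>'
    "</root>"
)

for part in iter_mixed(root[0]):
    # Strings are text chunks; anything else is a child Element.
    if isinstance(part, str):
        print(repr(part))
    else:
        print("<%s id=%r>" % (part.tag, part.get("id")))
```

Checking isinstance(part, str) is one simple way to tell text apart from elements when consuming the generator.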
