Targeting specific sub-elements when parsing XML with Python

Question

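I'm building a simple parser to handle a regular data feed at work. This post on converting XML to a CSV(-like) format has been very helpful. I'm using a for loop in my solution to go through all of the elements/sub-elements I need to target, but I'm still a bit stuck.

For example, my XML file is structured like so: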

<root>
  <product>
    <identifier>12</identifier>
    <identifier>ab</identifier>
    <contributor>Alex</contributor>
    <contributor>Steve</contributor>
  </product>
</root>

I want to target only the second identifier and only the first contributor. Any suggestions on how I might do that?

Cheers!

Answer 1

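The other answer you pointed to has an example of how to turn all instances of a tag into a list. You could just loop through those and discard the ones you're not interested in.

However, there is a way to do this directly with XPath: the mini-language supports item indices in square brackets: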
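A minimal sketch of that list-based approach, assuming the sample document from the question is held in a string (the names `XML` and `picked` are illustrative, not from the original post):

```python
import xml.etree.ElementTree as ET

# Sample document from the question, with a proper closing </root> tag.
XML = """<root>
  <product>
    <identifier>12</identifier>
    <identifier>ab</identifier>
    <contributor>Alex</contributor>
    <contributor>Steve</contributor>
  </product>
</root>"""

root = ET.fromstring(XML)

picked = []
for product in root.findall("product"):
    # Collect every instance of each tag into a plain Python list.
    identifiers = [e.text for e in product.findall("identifier")]
    contributors = [e.text for e in product.findall("contributor")]
    # Python lists are 0-based: [1] is the second item, [0] is the first.
    picked.append((identifiers[1], contributors[0]))

print(picked)  # [('ab', 'Alex')]
```

This raises an IndexError for a product with fewer than two identifiers or no contributor, so a real feed would probably need a length check first.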

import xml.etree.ElementTree as etree

# parse() accepts a filename directly.
document = etree.parse("your.xml")

secondIdentifier = document.find(".//product/identifier[2]")
firstContributor = document.find(".//product/contributor[1]")
# find() returns Element objects, so print their text content.
print(secondIdentifier.text, firstContributor.text)

prints

ab Alex

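Note that in XPath, the first index is 1, not 0.

ElementTree's find and findall support only a subset of XPath, which is described in the ElementTree documentation. Full XPath, introduced briefly on W3Schools and covered more thoroughly in the W3C's normative document, is available from lxml, a third-party package, but one that is widely available. Using lxml, the example would look like this: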
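To make the 1-based indexing concrete, here is a small sketch using the standard library's ElementTree on a cut-down version of the question's product element. The variable names are illustrative only:

```python
import xml.etree.ElementTree as ET

product = ET.fromstring(
    "<product>"
    "<identifier>12</identifier>"
    "<identifier>ab</identifier>"
    "</product>"
)

# Positional predicates count from 1.
first = product.find("identifier[1]")
second = product.find("identifier[2]")
# There is no third identifier, so find() returns None.
third = product.find("identifier[3]")

# The same elements through a 0-based Python list.
by_list = product.findall("identifier")

print(first.text, second.text, third)  # 12 ab None
```

So `identifier[2]` in the path expression corresponds to `by_list[1]` in Python, and a position past the end yields None from find() rather than an exception.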

import lxml.etree as etree

document = etree.parse("your.xml")

# xpath() returns a list of matching elements; take the first match.
secondIdentifier = document.xpath(".//product/identifier[2]")[0]
firstContributor = document.xpath(".//product/contributor[1]")[0]
print(secondIdentifier.text, firstContributor.text)

Solution 1
0 Accepted 2013-05-31 23:38:28
