Targeting specific sub-elements when parsing XML with Python

Question

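I'm building a simple parser to handle a regular data feed at work. This post on converting XML to a csv(-like) format has been very helpful. I'm using a for loop in my solution to iterate over all the elements/sub-elements I need to target, but I'm still a bit stuck.

For example, my XML file is structured like this: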

<root>
  <product>
    <identifier>12</identifier>
    <identifier>ab</identifier>
    <contributor>Alex</contributor>
    <contributor>Steve</contributor>
  </product>
</root>

I only want to target the second identifier and only the first contributor. Any suggestions on how I can do that?

Cheers!

Answer 1

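The other answer you linked to has an example of how to turn all instances of a tag into a list. You could loop over those and discard the ones you're not interested in.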
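As a rough sketch of that list-based approach (the XML is inlined here via `fromstring` purely for illustration, and the variable names are my own, not taken from the linked answer), you can collect the matches with `findall` and pick items by ordinary Python index:

```python
import xml.etree.ElementTree as ET

# Sample data mirroring the structure shown in the question
XML = """<root>
  <product>
    <identifier>12</identifier>
    <identifier>ab</identifier>
    <contributor>Alex</contributor>
    <contributor>Steve</contributor>
  </product>
</root>"""

root = ET.fromstring(XML)
product = root.find("product")

# findall() returns a plain Python list of matching child elements
identifiers = product.findall("identifier")
contributors = product.findall("contributor")

# Python lists are 0-based: index 1 is the second item, index 0 the first
second_identifier = identifiers[1].text
first_contributor = contributors[0].text
print(second_identifier, first_contributor)
```

This assumes every product really has at least two identifiers and one contributor; with a real feed you would probably want to check the list lengths first.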

However, there is a way to do this directly with XPath: the mini-language supports item indexes in square brackets:

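import xml.etree.ElementTree as etree

# parse() accepts a file name directly
document = etree.parse("your.xml")

# Positions inside [] are 1-based
secondIdentifier = document.find(".//product/identifier[2]")
firstContributor = document.find(".//product/contributor[1]")
# find() returns Element objects; .text holds the string content
print(secondIdentifier.text, firstContributor.text)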

prints

ab Alex

Note that in XPath the first index is 1, not 0.
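To make the off-by-one difference concrete, here is a small sketch (again with inlined sample XML, and names of my own choosing) comparing an XPath position with a Python list index, and showing what `find` gives back when the position doesn't exist:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><product>"
    "<identifier>12</identifier><identifier>ab</identifier>"
    "</product></root>"
)

# XPath position 1 and Python list index 0 both refer to the first identifier
by_xpath = root.find("./product/identifier[1]")
by_list = root.findall("./product/identifier")[0]
same_text = by_xpath.text == by_list.text

# There is no third identifier, so find() returns None instead of raising
missing = root.find("./product/identifier[3]")
print(same_text, missing)
```

Because `find` returns `None` for a missing position, it is worth checking the result before reading `.text` from it.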

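ElementTree's find and findall support only a subset of XPath, which is described in the ElementTree documentation. Full XPath, introduced briefly at W3Schools and covered more thoroughly in the W3C's normative documents, is available from lxml, which is a third-party package but a widely available one. With lxml, the example would look like this: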

import lxml.etree as etree

# parse() accepts a file name directly
document = etree.parse("your.xml")

# xpath() returns a list of matches, so take the first item
secondIdentifier = document.xpath(".//product/identifier[2]")[0]
firstContributor = document.xpath(".//product/contributor[1]")[0]
print(secondIdentifier.text, firstContributor.text)
