Parsing .xml with prefixes on tags? xml.etree.ElementTree

Question

I can read tags, except when they have a prefix. I had no luck searching previous SO questions.

I need to read media:content. I tried image = node.find("media:content"). RSS input:

<channel>
  <title>Popular  Photography in the last 1 week</title>
  <item>
    <title>foo</title>
    <media:category label="Miscellaneous">photography/misc</media:category>
    <media:content url="http://foo.com/1.jpg" height="375" width="500" medium="image"/>
  </item>
  <item> ... </item>
</channel>

I can read a sibling tag, title.

from xml.etree import ElementTree
with open('cache1.rss', 'rt') as f:
    tree = ElementTree.parse(f)

for node in tree.findall('.//channel/item'):
    title = node.find("title").text

I have been working through the documentation, but I am still stuck on the 'prefix' part.

Answer 1

Here is an example of using XML namespaces with ElementTree:

>>> from xml.etree import ElementTree
>>> x = '''\
<channel xmlns:media="http://www.w3.org/TR/html4/">
  <title>Popular  Photography in the last 1 week</title>
  <item>
    <title>foo</title>
    <media:category label="Miscellaneous">photography/misc</media:category>
    <media:content url="http://foo.com/1.jpg" height="375" width="500" medium="image"/>
  </item>
  <item> ... </item>
</channel>
'''
>>> node = ElementTree.fromstring(x)
>>> for elem in node.findall('item/{http://www.w3.org/TR/html4/}category'):
        print(elem.text)


photography/misc
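
The same idea applies to the media:content element from the question. The following is a minimal sketch, not the answerer's code: it assumes the placeholder namespace URI used in the example above, and the names RSS, NS and image_urls are made up for illustration. It passes a prefix-to-URI mapping as the second argument of find() so the path can be written with the prefix instead of the full {uri} form.

```python
from xml.etree import ElementTree

# Trimmed copy of the feed from the example above.
RSS = '''\
<channel xmlns:media="http://www.w3.org/TR/html4/">
  <item>
    <title>foo</title>
    <media:category label="Miscellaneous">photography/misc</media:category>
    <media:content url="http://foo.com/1.jpg" height="375" width="500" medium="image"/>
  </item>
</channel>
'''

# Assumption: this URI must match the xmlns:media declaration of the
# document being parsed; here it is the placeholder from the example.
NS = {'media': 'http://www.w3.org/TR/html4/'}


def image_urls(xml_text):
    """Collect the url attribute of each item's media:content element."""
    root = ElementTree.fromstring(xml_text)
    urls = []
    for item in root.findall('item'):
        # The second argument maps the 'media' prefix to its URI.
        content = item.find('media:content', NS)
        if content is not None:
            urls.append(content.get('url'))
    return urls


print(image_urls(RSS))
```

Only the URI has to match the document. The prefix used as the key in the mapping is local to the query and could be any name.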

Answer 2

media is an XML namespace, so it must first be declared with xmlns:media="...". For how to define XML namespaces for use in XPath expressions in lxml, see http://lxml.de/xpathxslt.html#namespaces-and-prefixes .
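
If the URI behind the prefix is not known in advance, it can be read from the document's own xmlns declarations. The following is a rough sketch using the standard library's xml.etree.ElementTree rather than lxml. The helper name declared_namespaces and the one-line sample document are assumptions for illustration, and the URI is the placeholder from Answer 1.

```python
import io
from xml.etree import ElementTree


def declared_namespaces(source):
    """Return {prefix: uri} for each xmlns declaration met while parsing."""
    namespaces = {}
    # With events=('start-ns',), iterparse yields ('start-ns', (prefix, uri)).
    for _event, (prefix, uri) in ElementTree.iterparse(source, events=('start-ns',)):
        namespaces[prefix] = uri
    return namespaces


# Hypothetical sample; a real call would pass a file name or file object.
sample = io.BytesIO(
    b'<channel xmlns:media="http://www.w3.org/TR/html4/"><item/></channel>'
)
ns = declared_namespaces(sample)
print(ns)
```

The resulting dictionary can be passed as the namespaces argument of find() or findall(), as long as the document does not bind the same prefix to different URIs in different places.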

Answer 1
5 (accepted) 2011-10-31 01:24:15

Answer 2
0 2011-10-31 01:05:22
