How do I use empty namespaces in an lxml XPath query?

I have an XML document in the following format:

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:gsa="http://schemas.google.com/gsa/2007">
  ...
  <entry>
    <id>https://ip.ad.dr.ess:8000/feeds/diagnostics/smb://ip.ad.dr.ess/path/to/file</id>
    <updated>2011-11-07T21:32:39.795Z</updated>
    <app:edited xmlns:app="http://purl.org/atom/app#">2011-11-07T21:32:39.795Z</app:edited>
    <link rel="self" type="application/atom+xml" href="https://ip.ad.dr.ess:8000/feeds/diagnostics"/>
    <link rel="edit" type="application/atom+xml" href="https://ip.ad.dr.ess:8000/feeds/diagnostics"/>
    <gsa:content name="entryID">smb://ip.ad.dr.ess/path/to/directory</gsa:content>
    <gsa:content name="numCrawledURLs">7</gsa:content>
    <gsa:content name="numExcludedURLs">0</gsa:content>
    <gsa:content name="type">DirectoryContentData</gsa:content>
    <gsa:content name="numRetrievalErrors">0</gsa:content>
  </entry>
  <entry>
    ...
  </entry>
  ...
</feed>

I need to retrieve all entry elements using XPath in lxml. My problem is that I can't figure out how to query elements in the empty (default) namespace. I have tried the following, but none of it works. Please advise.

import lxml.etree as et

tree = et.fromstring(xml)

The various things I have tried are:

for node in tree.xpath('//entry'):

or

ns = {None: "http://www.w3.org/2005/Atom", "openSearch": "http://a9.com/-/spec/opensearchrss/1.0/", "gsa": "http://schemas.google.com/gsa/2007"}

for node in tree.xpath('//entry', namespaces=ns):

or

for node in tree.xpath('//\"{http://www.w3.org/2005/Atom}entry\"'):

At this point I just don't know what else to try. Any help is greatly appreciated.

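XPath has no concept of a default namespace, so bind the Atom namespace URI to a prefix of your choice and use that prefix in the query. Something like this should work: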

import lxml.etree as et

ns = {"atom": "http://www.w3.org/2005/Atom"}
tree = et.fromstring(xml)
for node in tree.xpath('//atom:entry', namespaces=ns):
    print(node)

See also http://lxml.de/xpathxslt.html#namespaces-and-prefixes.
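
To make the prefix mapping concrete, here is a minimal self-contained sketch. The sample feed is a made-up, cut-down stand-in for the document in the question, and the prefix name atom is arbitrary: it only has to match a key in the ns dictionary, not anything in the XML itself. As far as I know, lxml's xpath() does not accept None as a prefix, which is why the attempts in the question fail.

```python
import lxml.etree as et

# Made-up sample feed, reduced from the one in the question.
xml = b"""<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:gsa="http://schemas.google.com/gsa/2007">
  <entry>
    <id>first</id>
    <gsa:content name="numCrawledURLs">7</gsa:content>
  </entry>
  <entry>
    <id>second</id>
    <gsa:content name="numCrawledURLs">3</gsa:content>
  </entry>
</feed>"""

ATOM_NS = "http://www.w3.org/2005/Atom"
# The prefixes are chosen here; only the URIs have to match the document.
ns = {"atom": ATOM_NS, "gsa": "http://schemas.google.com/gsa/2007"}

tree = et.fromstring(xml)

# An unprefixed name in XPath 1.0 means "no namespace", so nothing matches.
unprefixed = tree.xpath("//entry")

# With the prefix bound to the Atom URI, both entries are found.
entries = tree.xpath("//atom:entry", namespaces=ns)
ids = [e.findtext("{%s}id" % ATOM_NS) for e in entries]

# Prefixes can be mixed freely within one expression.
crawled = tree.xpath(
    "//atom:entry/gsa:content[@name='numCrawledURLs']/text()",
    namespaces=ns,
)

print(len(unprefixed), ids, crawled)
```

Every element in the path needs its prefix, including children of entry such as atom:id, since they inherit the default namespace from feed.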

Alternative, matching on the local name and ignoring the namespace:

for node in tree.xpath("//*[local-name() = 'entry']"):
    print(node)
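
One thing to keep in mind with local-name() is that it matches an element called entry in any namespace. The sketch below uses a made-up document (the urn:example:other namespace is purely hypothetical) to show the difference, and how an extra namespace-uri() test narrows the match again.

```python
import lxml.etree as et

# Made-up document: one Atom entry plus an unrelated element also named "entry".
xml = b"""<feed xmlns="http://www.w3.org/2005/Atom" xmlns:x="urn:example:other">
  <entry><id>atom-entry</id></entry>
  <x:entry>not an Atom entry</x:entry>
</feed>"""

tree = et.fromstring(xml)

# Matches on the local name only, whatever namespace the element is in.
by_local_name = tree.xpath("//*[local-name() = 'entry']")
tags = [el.tag for el in by_local_name]

# Adding a namespace-uri() test restricts the match to Atom entries.
atom_only = tree.xpath(
    "//*[local-name() = 'entry'"
    " and namespace-uri() = 'http://www.w3.org/2005/Atom']"
)

print(tags)
print(len(atom_only))
```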

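Use the findall method with the tag written in {namespace}tag form. Note that this path matches only direct children of the element it is called on, which is fine here because the entries sit directly under feed.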

for item in tree.findall('{http://www.w3.org/2005/Atom}entry'):
    print(item)
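
Here is a small sketch of the difference between searching direct children and searching all descendants with findall. The feed and the group wrapper element are made up for illustration; only the {namespace}tag notation and the .// prefix come from the ElementPath syntax that lxml supports.

```python
import lxml.etree as et

# Namespace URI wrapped in braces, ready to be prepended to a tag name.
ATOM = "{http://www.w3.org/2005/Atom}"

# Made-up feed: one top-level entry and one nested inside a wrapper element.
xml = b"""<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id>top-level</id></entry>
  <group><entry><id>nested</id></entry></group>
</feed>"""

tree = et.fromstring(xml)

# A bare tag only looks at direct children of the element.
children = tree.findall(ATOM + "entry")
child_ids = [e.findtext(ATOM + "id") for e in children]

# A leading ".//" searches all descendants instead.
descendants = tree.findall(".//" + ATOM + "entry")
all_ids = [e.findtext(ATOM + "id") for e in descendants]

print(child_ids)
print(all_ids)
```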
