Making lxml.objectify ignore XML namespaces?

I have to deal with some XML that looks like this:

<ns2:foobarResponse xmlns:ns2="http://api.example.com">
  <duration>206</duration>
  <artist>
    <tracks>...</tracks>
  </artist>
</ns2:foobarResponse>

I found lxml and its objectify module, which lets you traverse an XML document in a Pythonic way, much like a dictionary.
The problem is that it applies the bogus XML namespace every time you try to access an element, like this:

from lxml import objectify

tree = objectify.fromstring(xml)  # xml holds the document shown above
print(tree.artist)
# AttributeError: no such child: {http://api.example.com}artist

It's trying to look up <artist> in the parent's namespace, but that tag isn't in any namespace.

Any ideas on how to get around this? Thanks.

According to the lxml.objectify documentation, attribute lookups default to the namespace of the parent element.
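To make that default concrete, here is a minimal sketch. The `root` documents and the names `same_ns`, `mixed` and `error_message` are made up for illustration; only the namespace URI comes from the question. It contrasts a child that shares the parent's namespace with one that has none:

```python
from lxml import objectify

# Hypothetical document: the child shares the root's namespace,
# so plain attribute access finds it.
same_ns = objectify.fromstring(
    '<ns2:root xmlns:ns2="http://api.example.com">'
    "<ns2:artist>x</ns2:artist>"
    "</ns2:root>"
)
print(same_ns.artist.text)

# Hypothetical document: the child is in no namespace (as in the question),
# so the lookup in the parent's namespace finds nothing.
mixed = objectify.fromstring(
    '<ns2:root xmlns:ns2="http://api.example.com">'
    "<artist>x</artist>"
    "</ns2:root>"
)
try:
    mixed.artist
    error_message = None
except AttributeError as exc:
    # The message names the namespaced tag that objectify searched for.
    error_message = str(exc)
print(error_message)
```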

What you would probably expect to work is:

print(tree["{}artist"])

QName syntax like this would work if your children had a non-empty namespace ("{http://foo/}artist", for instance). Unfortunately, the current source code appears to treat an empty namespace as no namespace at all, so objectify's lookup machinery helpfully replaces the empty namespace with the parent's namespace, and you're out of luck.

This is either a bug ("{}artist" should work) or an enhancement request to file with the lxml folks.

For the moment, the best thing to do is probably:

print(tree.xpath("artist"))

It's unclear to me how much of a performance hit you'll take by using xpath here, but this certainly works.
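As a rough sketch of how the xpath workaround might be used in practice (the variable names `XML`, `matches`, `tracks_text` and `duration` are only illustrative): `xpath()` returns a list, so the first match has to be picked out before going deeper. Once you hold the un-namespaced `<artist>` element, ordinary dotted access should work for its children, because they share its (empty) namespace.

```python
from lxml import objectify

XML = """<ns2:foobarResponse xmlns:ns2="http://api.example.com">
  <duration>206</duration>
  <artist>
    <tracks>...</tracks>
  </artist>
</ns2:foobarResponse>"""

tree = objectify.fromstring(XML)

# An unprefixed name in XPath 1.0 matches only elements in no namespace,
# which is exactly what <artist> and <duration> are here.
matches = tree.xpath("artist")
artist = matches[0]

# <artist> has no namespace, so its children are looked up without one too.
tracks_text = artist.tracks.text

# .pyval gives the plain Python value of an objectify data element.
duration = tree.xpath("duration")[0].pyval

print(artist.tag, tracks_text, duration)
```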

Just for reference: this works as expected since lxml 2.3.

From the lxml changelog:

"[...]

2.3 (2011-02-06) Features added

  • When looking for children, lxml.objectify takes '{}tag' as meaning an empty namespace, as opposed to the parent namespace.

[...]"

In action:

>>> from lxml import objectify
>>> xml = """<ns2:foobarResponse xmlns:ns2="http://api.example.com">
...   <duration>206</duration>
...   <artist>
...     <tracks>...</tracks>
...   </artist>
... </ns2:foobarResponse>"""
>>> tree = objectify.fromstring(xml)
>>> print(objectify.dump(tree['{}artist']))
artist = None [ObjectifiedElement]
    tracks = '...' [StringElement]
>>>
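If the same code has to run on both older and newer lxml releases, the two approaches above could be combined. The helper below is only a sketch: `child_without_namespace` is a hypothetical name, not part of lxml. It relies on `etree.LXML_VERSION` to choose between the `'{}tag'` lookup and the xpath fallback.

```python
from lxml import etree, objectify


def child_without_namespace(parent, name):
    """Return the first child named `name` that is in no namespace.

    Hypothetical helper, not an lxml API.
    """
    if etree.LXML_VERSION >= (2, 3):
        # lxml 2.3 and later read '{}tag' as "empty namespace".
        return parent["{}" + name]
    # Older releases: fall back to XPath, which returns a list of matches.
    matches = parent.xpath(name)
    if not matches:
        raise AttributeError("no such child: " + name)
    return matches[0]


XML = """<ns2:foobarResponse xmlns:ns2="http://api.example.com">
  <duration>206</duration>
  <artist>
    <tracks>...</tracks>
  </artist>
</ns2:foobarResponse>"""

tree = objectify.fromstring(XML)
artist = child_without_namespace(tree, "artist")
duration = child_without_namespace(tree, "duration").pyval
print(artist.tracks.text, duration)
```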
