Python lxml.objectify does not provide attribute access to gco:CharacterString node

Question

I have a geo-metadata file:

<gmd:MD_Metadata
xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:gco="http://www.isotc211.org/2005/gco" >
  <gmd:fileIdentifier>
    <gco:CharacterString>2ce585df-df23-45f6-b8e1-184e64e7e3b5</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:language>
    <gmd:LanguageCode codeList="https://www.loc.gov/standards/iso639-2/" codeListValue="ger">ger</gmd:LanguageCode>
  </gmd:language>
</gmd:MD_Metadata>

Using lxml.objectify, I can parse it easily:

from lxml import objectify
data = objectify.parse('rec.xml').getroot()

In data, the 'language' element is exposed as:

(Pdb) data.language.LanguageCode
'ger'

but the content of 'fileIdentifier' is not:

(Pdb) data.fileIdentifier.CharacterString
*** AttributeError: no such child: {http://www.isotc211.org/2005/gmd}CharacterString

Obviously lxml is looking in the wrong namespace for "CharacterString".

But the information is there:

(Pdb) data.fileIdentifier['{http://www.isotc211.org/2005/gco}CharacterString']
'2ce585df-df23-45f6-b8e1-184e64e7e3b5'

How can I convince lxml to use the correct namespace? Any help is appreciated.

Volker

Answer 1

This is expected; it is simply how lxml.objectify works. Dot lookup resolves a child name in the namespace of its parent. Since {http://www.isotc211.org/2005/gco}CharacterString is in a different namespace than its parent element ({http://www.isotc211.org/2005/gmd}fileIdentifier), it cannot be retrieved from the parent with simple (dot) attribute lookup.
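
To make the rule concrete, here is a minimal sketch with a toy document. The namespaces urn:example:a and urn:example:b and the element names are made up for illustration; only the lookup behaviour matters.

```python
from lxml import objectify

# Toy document: the namespaces and element names are hypothetical.
root = objectify.fromstring(
    '<a:root xmlns:a="urn:example:a" xmlns:b="urn:example:b">'
    "<a:same>1</a:same><b:other>2</b:other>"
    "</a:root>"
)

# Child in the same namespace as its parent: dot lookup works.
same = root.same

# Child in a different namespace: dot lookup searches for
# {urn:example:a}other, which does not exist, and raises AttributeError.
try:
    root.other
    found = True
    message = ""
except AttributeError as exc:
    found = False
    message = str(exc)

print(same.text, found, message)
```

The error message names the tag that was actually searched for, which is the parent's namespace combined with the attribute name.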

Instead, you have to use index access (just like you already did) or getattr:

>>> from lxml import etree, objectify
>>> data = objectify.fromstring(
... """<gmd:MD_Metadata
... xmlns:gmd="http://www.isotc211.org/2005/gmd"
... xmlns:gco="http://www.isotc211.org/2005/gco" >
...   <gmd:fileIdentifier>
...     <gco:CharacterString>2ce585df-df23-45f6-b8e1-184e64e7e3b5</gco:CharacterString>
...   </gmd:fileIdentifier>
...   <gmd:language>
...     <gmd:LanguageCode codeList="https://www.loc.gov/standards/iso639-2/" codeListValue="ger">ger</gmd:LanguageCode>
...   </gmd:language>
... </gmd:MD_Metadata>
... """)
>>> data.fileIdentifier["{http://www.isotc211.org/2005/gco}CharacterString"]
'2ce585df-df23-45f6-b8e1-184e64e7e3b5'
>>> getattr(data.fileIdentifier, "{http://www.isotc211.org/2005/gco}CharacterString")
'2ce585df-df23-45f6-b8e1-184e64e7e3b5'
>>>

The reason is that {http://www.isotc211.org/2005/gco}CharacterString is not a valid Python identifier, so it cannot be written after a dot, and it makes sense to restrict unqualified lookup to children from the same namespace. See also "Namespace handling" in the official docs: https://lxml.de/objectify.html#the-lxml-objectify-api
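
If typing the full namespace URI at every access gets tedious, one possible way to tidy it up is to build the qualified tag once, or to use the regular ElementTree-style find() with a prefix map. This is only a sketch: the NS dictionary and the variable names are my own choices, not something lxml requires.

```python
from lxml import etree, objectify

XML = b"""<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>2ce585df-df23-45f6-b8e1-184e64e7e3b5</gco:CharacterString>
  </gmd:fileIdentifier>
</gmd:MD_Metadata>"""

# Prefix map chosen for this example (an assumption, any prefixes would do).
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

data = objectify.fromstring(XML)

# Option 1: build "{uri}CharacterString" once and reuse it for
# index access or getattr.
char_string_tag = etree.QName(NS["gco"], "CharacterString").text
via_index = data.fileIdentifier[char_string_tag]
via_getattr = getattr(data.fileIdentifier, char_string_tag)

# Option 2: ElementPath lookup with prefixes resolved through NS.
via_find = data.fileIdentifier.find("gco:CharacterString", namespaces=NS)

print(via_index.text)
print(via_getattr.text)
print(via_find.text)
```

All three lookups return the same element. find() returns None instead of raising when the child is missing, so check the result before using it.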

Best, Holger

