Best practice: parsing an XML API response in Python 3

Question

I have referenced several guides, but I'm still finding it difficult to wrap my head around this (Python newb):

/docs.python.org/3.7/library/xml.etree.elementtree.html
/effbot.org/zone/element-xpath.htm

XML output example

The intent is to retrieve the zipcode text value. However, I haven't done this before, and from reading the guides I believe I want the output of the following XPath:

/SearchResults:searchresults[@xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"]/response/results/result/address/zipcode/text()

Here's an example of what's working from a local file:

from xml.etree import ElementTree as ET

tree = ET.parse('<destination_of_xml>.xml')

for elem in tree.iterfind('/response/results/result/address/zipcode'):
    print(elem.tag, elem.text)
----------------------------------------------------------------------
output: 
zipcode {90292}
zipcode {90292}
...

What's good practice in this instance for retrieving zipcode values while accounting for any future schema changes (i.e. iterating through the XML until the zipcode element is found)? Are there better solutions to this?

Answer 1

You may need to know about XPath expressions.

I'm using the lxml library to parse a simpler XML hierarchy. I don't need to know what sits above the zipcode element, because I can write an XPath expression that says, in effect, "look anywhere below the top of the document for zipcode elements" (note the plural): .//zipcode. This yields a list of the matching elements. Since I know there is just one here, I select the first, get its text, and strip off the leading and trailing whitespace.

Provided that the name of the element remains unchanged ...

>>> from lxml import etree
>>> tree = etree.fromstring('''\
... <company>
...     <name>XYZ</name>
...     <industry>chemicals</industry>
...     <address>
...         <street>
...             14234 Onyx Drive West
...         </street>
...         <city>
...             Ainslie
...         </city>
...         <state>
...             Idaho
...         </state>
...         <zipcode>
...             87734
...         </zipcode>
...     </address>
... </company>''')
>>> tree.xpath('.//zipcode')
[<Element zipcode at 0xb5e9c8>]

>>> tree.xpath('.//zipcode')[0].text.strip()
'87734'
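
The same idea should carry over to the standard library's ElementTree, which the question already uses, and to responses that contain more than one zipcode. Below is a minimal sketch under some assumptions: the SAMPLE document is made up to mirror the element names in the question's path (it is not a real API response), and extract_zipcodes is a hypothetical helper name, not part of any library.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample shaped like the path in the question;
# a real API response will contain more elements than this.
SAMPLE = """\
<searchresults>
  <response>
    <results>
      <result><address><zipcode> 90292 </zipcode></address></result>
      <result><address><zipcode>90292</zipcode></address></result>
      <result><address><zipcode/></address></result>
    </results>
  </response>
</searchresults>"""


def extract_zipcodes(xml_text):
    """Return the stripped text of every non-empty <zipcode> element, at any depth."""
    root = ET.fromstring(xml_text)
    # './/zipcode' matches any descendant named zipcode, whatever its parents are,
    # so the result does not depend on the intermediate elements staying the same.
    return [
        el.text.strip()
        for el in root.iterfind('.//zipcode')
        if el.text and el.text.strip()
    ]


print(extract_zipcodes(SAMPLE))
```

Empty zipcode elements are skipped because their text is None. One caveat: if the zipcode elements themselves are namespace-qualified, the plain name will not match and the path has to include the namespace.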

Answer 1 · 0 · Accepted · 2017-08-25 17:22:53
