Best practices - Parsing XML API response - Python 3

Question

I have referenced several guides, but I'm still finding it difficult to wrap my head around this (Python newb):

/docs.python.org/3.7/library/xml.etree.elementtree.html
/effbot.org/zone/element-xpath.htm

The intent is to retrieve the zipcode text value; however, I haven't done this before and from referencing the guides, I want the output of the following xpath:

/SearchResults:searchresults[@xmlns:xsi="http://www.w3.org/2001/XMLSchema-
instance"]/response/results/result/address/zipcode/text()

Here's an example of what's working from a local file:

from xml.etree import ElementTree as ET

tree = ET.parse(<destination_of_xml>.xml')

for elem in tree.iterfind('/response/results/result/address/zipcode'):
    print(elem.tag, elem.text)
----------------------------------------------------------------------
output: 
zipcode {90292}
zipcode {90292}
...

What's good practice in this instance to retrieve zipcode values and account for any schema changes in the future (ie iterate through XML until finding the element zipcode)? Are there better solutions to this?

Answer 1

You may need to know about xpath expressions.

I'm using the lxml library to parse a simpler xml hierarchy. I don't need to know what's above the zipcode element because I can write an xpath expression that says, in effect, look anywhere from the top of the document for zipcode elements (note, plural): .//zipcode . This yields the element. Now that I have them, since I know there's just one, I select the 'first', get its text and strip off leading and trailing blanks.

Providing that the name of the element remains unchanged ...

>>> from xml.etree import ElementTree as ET
>>> from lxml import etree
>>> tree = etree.fromstring('''\
... <company>
...     <name>XYZ</name>
...     <industry>chemicals</industry>
...     <address>
...         <street>
...             14234 Onyx Drive West
...         </street>
...         <city>
...             Ainslie
...         </city>
...         <state>
...             Idaho
...         </state>
...         <zipcode>
...             87734
...         </zipcode>
...     </address>
... </company>''')
>>> tree.xpath('.//zipcode')
[<Element zipcode at 0xb5e9c8>]

>>> tree.xpath('.//zipcode')[0].text.strip()
'87734'

Best practices - Parsing XML API response - Python 3

Question

1 answers

solution1
0 ACCPTED 2017-08-25 17:22:53

Best practices - Parsing XML API response - Python 3

Question

1 answers

solution1 0 ACCPTED 2017-08-25 17:22:53

solution1
0 ACCPTED 2017-08-25 17:22:53