Using ElementTree's iterfind for deep XML parsing

Question

I am trying to parse all the IPv6 address elements using iterfind. I thought my match string was correct, but I am not seeing any results. I am not familiar with parsing deeply nested XML files, so I am starting to question whether this method is the best approach.

import requests 
import xml.etree.ElementTree as ET

r = requests.get('https://support.content.office.net/en-us/static/O365IPAddresses.xml')

root = ET.fromstring(r.text)

for node in root.iterfind(".//products/product/[@name='o365']/addresslist/[@type='IPv6']"):
    data = []
    for d in node:  # iterate over the child elements directly; getchildren() was removed in Python 3.9
        if d.text:
            data.append(d.text)
    print(' '.join(data))

Answer 1

Take a step back and make sure your XPath expression is correct. Start with:

>>> r = requests.get('https://support.content.office.net/en-us/static/O365IPAddresses.xml')
>>> root = ET.fromstring(r.text)

If you search for just the beginning of your XPath expression, .//products/product, what do you get?

>>> root.findall('.//products/product')
[]

You get an empty list, which means there's a problem with your expression. That's because the root of your tree is the products element, and .//products only searches below the root, so it never matches the root itself:

>>> root
<Element 'products' at 0x7f16be5a9450>

So the first level of the hierarchy will be product:

>>> root.findall('product')
[<Element 'product' at 0x7f16be5a9490>, <Element 'product' at 0x7f16be0e4190>, ...]

If we substitute that back into the full expression, we get:

>>> root.findall("product/[@name='o365']/addresslist/[@type='IPv6']")
[<Element 'addresslist' at 0x7f16be5a94d0>]

That seems much better.

Using that expression in your example code produces output that seems reasonable.
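
For reference, here is a minimal sketch of the corrected loop that runs without network access. The inline SAMPLE document is hypothetical: it only mimics the products/product/addresslist nesting discussed above, and the address child tag and the example prefixes are assumptions rather than content taken from the real feed. The helper name ipv6_addresses is likewise made up for illustration. The predicates are written directly after the tag name (product[@name='o365']), which is the usual XPath spelling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mimics the nesting described in the answer.
# The <address> tag name and the prefixes are assumptions for illustration.
SAMPLE = """\
<products>
  <product name="o365">
    <addresslist type="IPv6">
      <address>2001:db8::/32</address>
      <address>2001:db8:1::/48</address>
    </addresslist>
    <addresslist type="IPv4">
      <address>192.0.2.0/24</address>
    </addresslist>
  </product>
  <product name="other">
    <addresslist type="IPv6">
      <address>2001:db8:2::/48</address>
    </addresslist>
  </product>
</products>
"""

root = ET.fromstring(SAMPLE)

# The root element is <products> itself, so searching for a <products>
# element below the root finds nothing.
assert root.findall(".//products/product") == []


def ipv6_addresses(root, product="o365"):
    """Return the text of each address under the product's IPv6 addresslist."""
    # The path is relative to the root, so it starts at <product>.
    path = "product[@name='{}']/addresslist[@type='IPv6']/address".format(product)
    return [el.text for el in root.iterfind(path) if el.text]


print(" ".join(ipv6_addresses(root)))
```

To run this against the real file, replace SAMPLE with r.text from the requests call in the question; whether the child elements are actually named address there should be checked against the downloaded document.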

solution1
2 ACCEPTED 2017-09-07 19:32:58