
Parsing XML data with python - how to capture everything in a more pythonic way?

The aim is to capture all the data in an XML file. Once captured, I compare it to a reference XML file to check that nothing has changed, and then report what the differences are.

What I wrote works for what I need, but it is cumbersome and a bit messy. Is there a better way to iterate through all items at all depths of an XML file? The solution just has to be robust enough to capture everything.

Currently, iterating as I do below uses too many layers of iteration with try/except, which is very ugly!

import xml.etree.ElementTree as ET

def xml_iter(file):
    tree = ET.parse(file)
    root = tree.getroot()

    texts = []
    for elem in root:
        for i in elem:
            try:
                texts.append(i.text.strip())
            except AttributeError:  # element has no text
                pass

            for j in i:
                try:
                    texts.append(j.text.strip())
                except AttributeError:
                    pass

                for k in j:
                    try:
                        texts.append(k.text.strip())
                    except AttributeError:
                        pass
    return texts

Any help would be greatly appreciated.

Use Element.iter(). It iterates recursively over all the sub-elements.

For your case, it would be something like:

dict_list = []
text_list = []
for node in root.iter():
    dict_list.append(node.attrib)  # the attribute dictionary of each element
    text_list.append(node.text)

# Do the same for the other file and compare the dictionaries/strings
# in the corresponding lists.

You can look at this official tutorial for examples.
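To make the idea concrete, here is a minimal, self-contained sketch of the whole workflow: walk every element at any depth with root.iter(), record its tag, attributes, and text, and compare the two snapshots position by position. The helper names snapshot and diff are hypothetical, the sample documents are made up, and the comparison assumes both files list their elements in the same order.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; in practice you would use ET.parse(path).
CURRENT = "<root><a id='1'>foo</a><b><c>bar</c></b></root>"
REFERENCE = "<root><a id='1'>foo</a><b><c>baz</c></b></root>"

def snapshot(xml_string):
    """Collect (tag, attrib, text) for every element at any depth."""
    root = ET.fromstring(xml_string)
    return [(node.tag, node.attrib, (node.text or "").strip())
            for node in root.iter()]

def diff(current, reference):
    """Return the pairs of snapshot entries that differ."""
    return [(c, r) for c, r in zip(snapshot(current), snapshot(reference))
            if c != r]

print(diff(CURRENT, REFERENCE))
# → [(('c', {}, 'bar'), ('c', {}, 'baz'))]
```

Note that zip() stops at the shorter list, so if one file has extra elements you would also want to compare the snapshot lengths.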
