Parse multiple XML files with Python when not all files contain the element

This is my code, and I have a problem with it:

import glob
from xml.dom import minidom
openFiles = 'myxml/*.xml'

list = []

for xmlfiles in glob.glob(openFiles):
    doc = minidom.parse(xmlfiles)
    root = doc.getElementsByTagName("info")[0]
    project_name = root.getAttribute('project_name')
    list.append(project_name)
    ....

This code works when a file contains an 'info' element. But when I run it over multiple files it raises an error, because not all of the files have an 'info' element. Is there a way to keep it running and record 'none' for those files?

For example, the result would look like this:

project1, project2, none, project3, none

Sorry for my bad English, and thank you in advance.

You could add try...except handling, following the common Python coding principle EAFP (Easier to Ask Forgiveness than Permission). If an XML file has no 'info' element, the resulting IndexError is caught and None is appended to the list:

for xmlfiles in glob.glob(openFiles):
    doc = minidom.parse(xmlfiles)
    try:
        # Indexing an empty NodeList raises IndexError
        root = doc.getElementsByTagName('info')[0]
    except IndexError:
        # No 'info' element in this file: record a placeholder
        list.append(None)
    else:
        project_name = root.getAttribute('project_name')
        list.append(project_name)
    ....
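
As a self-contained illustration of the EAFP approach, here is a minimal sketch that parses XML from strings instead of files so it can be run directly. The helper name project_name_or_none and the sample documents are made up for this example. It also maps a missing project_name attribute to None, since minidom's getAttribute() returns an empty string in that case; whether you want that behaviour is an assumption on my part.

```python
from xml.dom import minidom


def project_name_or_none(xml_text):
    """Return project_name of the first <info> element, or None (hypothetical helper)."""
    doc = minidom.parseString(xml_text)
    try:
        info = doc.getElementsByTagName("info")[0]
    except IndexError:
        # No <info> element in this document
        return None
    # getAttribute() returns "" when the attribute is absent; treat that as None too
    return info.getAttribute("project_name") or None


# Made-up sample documents: one complete, one without <info>, one without the attribute
samples = [
    '<root><info project_name="project1"/></root>',
    "<root><other/></root>",
    "<root><info/></root>",
]
names = [project_name_or_none(s) for s in samples]
print(names)
```

Returning None from a small helper keeps the loop body to a single append, which also avoids repeating list.append in both branches.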

Or you could use the LBYL (Look Before You Leap) coding principle and check that an 'info' element exists in the XML before reading its attribute:

for xmlfiles in glob.glob(openFiles):
    doc = minidom.parse(xmlfiles)
    # Look the elements up once; an empty NodeList is falsy
    infos = doc.getElementsByTagName('info')
    if infos:
        project_name = infos[0].getAttribute('project_name')
        list.append(project_name)
    else:
        list.append(None)
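
For completeness, here is a hedged end-to-end sketch of the LBYL variant that can be run as-is. The function name collect_project_names, the temporary directory, and the file names are assumptions made for the demonstration, not part of the original code. The paths are sorted because glob.glob() does not guarantee any particular order, which matters if the position of each None in the result is meaningful.

```python
import glob
import os
import tempfile
from xml.dom import minidom


def collect_project_names(pattern):
    """Return one entry per matching file: the project_name, or None if <info> is absent."""
    names = []
    # Sort so that the output order is deterministic
    for path in sorted(glob.glob(pattern)):
        doc = minidom.parse(path)
        infos = doc.getElementsByTagName("info")
        names.append(infos[0].getAttribute("project_name") if infos else None)
    return names


# Demonstration with throwaway files (names and contents are invented)
with tempfile.TemporaryDirectory() as tmp:
    files = {
        "a.xml": '<root><info project_name="project1"/></root>',
        "b.xml": "<root/>",
        "c.xml": '<root><info project_name="project3"/></root>',
    }
    for fname, text in files.items():
        with open(os.path.join(tmp, fname), "w", encoding="utf-8") as fh:
            fh.write(text)
    result = collect_project_names(os.path.join(tmp, "*.xml"))

print(result)
```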
