
python parse/process all xml files in folder

I am trying to run my code on all the XML files in a folder. I get a few errors when I run it, and it generates some of the output files but not all of them.

Here is my code:

import xml.etree.ElementTree as ET
import os
import glob
path = 'C:/xml/'

for infile in glob.glob(os.path.join(path, '*.xml')):
    tree = ET.parse(infile)
    root = tree.getroot()
    with open(infile + 'new.csv', 'w') as outfile:
        for elem in root.findall('.//event[@type="MEDIA"]'):
            mediaidelem = elem.find('./mediaid')
            if mediaidelem is not None:
                outfile.write("{}\n".format(mediaidelem.text))

Here is the error log:

Traceback (most recent call last):
  File "C:\xml\2.py", line 8, in <module>
    tree = ET.parse(infile)
  File "C:\Python34\lib\xml\etree\ElementTree.py", line 1187, in parse
    tree.parse(source, parser)
  File "C:\Python34\lib\xml\etree\ElementTree.py", line 598, in parse
    self._root = parser._parse_whole(source)
  File "<string>", line None
xml.etree.ElementTree.ParseError: no element found: line 1, column 0

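Judging by the error message, you may have some empty (or malformed) files. "no element found: line 1, column 0" means the parser reached the end of the input before it saw any element, which is what an empty file produces.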
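One quick way to check that guess is to list the zero-byte .xml files before parsing anything. This is only a minimal sketch: the helper name find_empty_xml is made up for illustration, and it assumes the same folder as in the question.

```python
import glob
import os


def find_empty_xml(path):
    """Return the .xml files in `path` that are zero bytes long (hypothetical helper)."""
    pattern = os.path.join(path, '*.xml')
    return sorted(f for f in glob.glob(pattern) if os.path.getsize(f) == 0)


if __name__ == '__main__':
    # 'C:/xml/' is the folder from the question; adjust as needed.
    for name in find_empty_xml('C:/xml/'):
        print('empty file:', name)
```

A file that is non-empty but malformed will not show up in this list, so it does not replace error handling around the parse call.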

I would add error handling here to warn the user about such an error and then skip the file. Something like:

for infile in glob.glob(os.path.join(path, '*.xml')):
    try:
        tree = ET.parse(infile)
    except ET.ParseError as e:
        # Empty or malformed file: report it and skip to the next one
        print(infile, e)
        continue
    ...

I have not tried to reproduce this here; it is just a guess.
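For reference, here is how the loop from the question might look with that error handling folded in. Treat it as a sketch under assumptions rather than a tested fix for your data: the function name export_media_ids and the returned list of skipped files are additions for illustration and are not part of the original code.

```python
import glob
import os
import xml.etree.ElementTree as ET


def export_media_ids(path):
    """Write one CSV per parseable .xml file in `path`; return the skipped files."""
    skipped = []
    for infile in glob.glob(os.path.join(path, '*.xml')):
        try:
            root = ET.parse(infile).getroot()
        except ET.ParseError as e:
            # Empty or malformed file: report it and move on to the next one.
            print(infile, e)
            skipped.append(infile)
            continue
        # Same output naming as in the question ('a.xml' -> 'a.xmlnew.csv').
        with open(infile + 'new.csv', 'w') as outfile:
            for elem in root.findall('.//event[@type="MEDIA"]'):
                mediaidelem = elem.find('./mediaid')
                if mediaidelem is not None:
                    outfile.write("{}\n".format(mediaidelem.text))
    return skipped


if __name__ == '__main__':
    export_media_ids('C:/xml/')
```

Parsing before opening the output file means a file that fails to parse does not leave an empty CSV behind.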
