
Parsing repeating child elements in Python

I am trying to parse an XML document that contains repeating child elements using Python. When I run the script, it creates an empty output file. If I comment out the code that handles the repeating child elements (the HistoricReturns loops marked in the Python script below), the output is generated correctly. Can someone help?


<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<FRPerformance xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
          <Date>Fri, 28 Feb 1997 00:00:00 -0600</Date>
          <Date>Fri, 28 Feb 2003 00:00:00 -0600</Date>
          <Date>Fri, 29 Feb 2008 00:00:00 -0600</Date>
          <Date>Fri, 30 Apr 1993 00:00:00 -0600</Date>

Python script

import sys
import xml.etree.ElementTree as ET

## Create command line arguments for the XML file and tagName
xmlFile = sys.argv[1]
tagName = sys.argv[2]

tree = ET.parse(xmlFile)
root = tree.getroot()

## Setup the file for output
saveout = sys.stdout
output_file =  open('parsedXML.csv', 'w')
sys.stdout = output_file

## Parse XML

for node in root.findall(tagName):
    fundCode = node.find('FundCode').text
    curr = node.find('CurrencyID').text
    shareClass = node.find('FundShareClassCode').text
    for node2 in node.findall('./Net/Annualized'):
        year1 = node2.findtext('Year1')
        year3 = node2.findtext('Year3')
        year5 = node2.findtext('Year5')
        year10 = node2.findtext('Year10')
        year15 = node2.findtext('Year15')
        year20 = node2.findtext('Year20')
        SI = node2.findtext('SI')
        for node3 in node.findall('./Gross'):
            for node4 in node3.findall('./Annualized'):
                month3 = node4.findtext('Month3')
                ytd = node4.findtext('YTD')
                year1g = node4.findtext('Year1')
                year3g = node4.findtext('Year3')
                year5g = node4.findtext('Year5')
                year10g = node4.findtext('Year10')
                year15g = node4.findtext('Year15')
                year20g = node4.findtext('Year20')
                SIg = node4.findtext('SI')
            for node5 in node3.findall('./Cumulative'):
                month1b = node5.findtext('Month1Back')
                month2b = node5.findtext('Month2Back')
                month3b = node5.findtext('Month3Back')
                curYear = node5.findtext('CurrentYear')
                year1b = node5.findtext('Year1Back')
                year2b = node5.findtext('Year2Back')
                year3b = node5.findtext('Year3Back')
                year4b = node5.findtext('Year4Back')
                year5b = node5.findtext('Year5Back')
                year6b = node5.findtext('Year6Back')
                year7b = node5.findtext('Year7Back')
                year8b = node5.findtext('Year8Back')
                year9b = node5.findtext('Year9Back')
                year10b = node5.findtext('Year10Back')
        ## Repeating child elements (the section in question)
        for node6 in node.findall('./HistoricReturns'):
            for node7 in node6.findall('./HistoricReturns_Item'):
                hDate = node7.findall('Date')
                hReturn = node7.findall('Return')
                print(fundCode, curr, shareClass, year1, year3, year5, year10, year15, year20, SI, month3, ytd, year1g, year3g, year5g, year10g, year15g, year20g, SIg, month1b, month2b, month3b, curYear, year1b, year2b, year3b, year4b, year5b, year6b, year7b, year8b, year9b, year10b, hDate, hReturn)

The sample XML and the python code don't match up in terms of structure. Either

  • you're missing a closing </Gross> tag in the XML (it should come before the <HistoricReturns> section starts), in which case the code is correct, or
  • the code should read for node6 in node3.findall('./HistoricReturns'): i.e. node3 instead of node.

NB: The XML sample isn't complete (it isn't well-formed XML) because it's missing the closing tags for Gross, FRPerformanceShareClassCurrency and FRPerformance, which makes it impossible to answer the question definitively. Hope this helps though.
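
For illustration, here is a minimal sketch of reading the repeated items in a way that works for either structure. The sample document is hypothetical: the element names come from the question, but the nesting of HistoricReturns inside Gross, the FundCode value and the Return values are assumptions, since the full XML was not posted. It also uses findtext() for Date and Return, because findall() returns a list of Element objects rather than their text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names follow the question, but the nesting
# (HistoricReturns inside Gross) and the FundCode/Return values are assumed.
SAMPLE = """<FRPerformance>
  <FRPerformanceShareClassCurrency>
    <FundCode>ABC</FundCode>
    <Gross>
      <HistoricReturns>
        <HistoricReturns_Item>
          <Date>Fri, 28 Feb 1997 00:00:00 -0600</Date>
          <Return>1.23</Return>
        </HistoricReturns_Item>
        <HistoricReturns_Item>
          <Date>Fri, 28 Feb 2003 00:00:00 -0600</Date>
          <Return>4.56</Return>
        </HistoricReturns_Item>
      </HistoricReturns>
    </Gross>
  </FRPerformanceShareClassCurrency>
</FRPerformance>"""


def historic_rows(root, tag_name):
    """Return one (fund code, date, return) tuple per HistoricReturns_Item."""
    rows = []
    for node in root.findall(tag_name):
        fund_code = node.findtext('FundCode')
        # './/' searches all descendants, so this matches whether
        # HistoricReturns sits directly under the node or inside Gross.
        for item in node.findall('.//HistoricReturns/HistoricReturns_Item'):
            rows.append((fund_code, item.findtext('Date'), item.findtext('Return')))
    return rows


root = ET.fromstring(SAMPLE)
rows = historic_rows(root, 'FRPerformanceShareClassCurrency')
for row in rows:
    print(*row, sep=',')
```

Each repeated item becomes its own output row, so the fund-level values are repeated per row rather than being printed only when the innermost loop happens to run.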

