
Parsing a large XML file with Python

Example XML file:

<header>
<name>name</name>

<items>

<item>
<title>title</title>
<add>add</add>
</item>

<item>
<title>title</title>
<add>add</add>
</item>

</items>
</header>

I would like to parse the info into groups broken up by each header and subgroup item:

The XML should parse to:

name
----title
----add

----title
----add

next header

name
----title
----add
----etc
----etc

Could someone post an example, preferably using ElementTree's iterparse, since it's a large XML file?

My example, which doesn't work, is:

import xml.etree.ElementTree as etree
infile = open("c:/1.xml", "rb")
context = etree.iterparse(infile, events=("start", "end"))

for event, element in context:
    if event == "end":
        if element.tag == "header":
            print(element.findtext('name'))
        elif element.tag == "item":
            print(element.findtext('title'))
            print(element.findtext('add'))

So, nice and simple, with the input file you provided:

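import xml.etree.ElementTree as etree

# iterparse reports only "end" events by default, so element.text is complete here.
for event, element in etree.iterparse("C:/1.xml"):
    if element.tag == "name":
        print(element.text)
    elif element.tag in ["title", "add"]:
        # Four dashes, to match the requested output format.
        print("----" + element.text)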

This gives the output:

name
----title
----add
----title
----add
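
Since the file is large, it may also be worth freeing elements once they have been handled; otherwise iterparse still ends up holding the whole tree in memory. Here is a minimal sketch of that idea, assuming the structure from the question. The function name outline_lines, the SAMPLE data, and the <root> wrapper around the headers are made up for illustration and are not from the original post:

```python
import io
import xml.etree.ElementTree as etree

# Hypothetical sample: two <header> blocks inside an assumed <root> wrapper.
SAMPLE = (
    b"<root>"
    b"<header><name>n1</name><items>"
    b"<item><title>t1</title><add>a1</add></item>"
    b"</items></header>"
    b"<header><name>n2</name><items>"
    b"<item><title>t2</title><add>a2</add></item>"
    b"</items></header>"
    b"</root>"
)

def outline_lines(source):
    """Yield one output line per <name>, <title> and <add> element."""
    for event, element in etree.iterparse(source, events=("end",)):
        if element.tag == "name":
            yield element.text or ""
        elif element.tag in ("title", "add"):
            yield "----" + (element.text or "")
        elif element.tag in ("item", "header"):
            # Everything inside has already been yielded, so drop it.
            element.clear()

for line in outline_lines(io.BytesIO(SAMPLE)):
    print(line)
```

The cleared elements themselves stay attached to their parent as empty shells, so memory still grows slowly with the number of headers; for most files that is probably small enough not to matter.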

If you want a spacer between headers, you could add this branch to the loop:

if element.tag == "header":
    print("\n")
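
If you would rather have the groups as data than as printed lines, one option is to wait for the end of each header and collect its name and items at that point. This is only a sketch under the same assumptions as the question's layout; iter_headers is a hypothetical helper name, and the sample data with its <root> wrapper is invented for the example:

```python
import io
import xml.etree.ElementTree as etree

# Hypothetical sample data; the <root> wrapper is an assumption.
SAMPLE = (
    b"<root>"
    b"<header><name>n1</name><items>"
    b"<item><title>t1</title><add>a1</add></item>"
    b"<item><title>t2</title><add>a2</add></item>"
    b"</items></header>"
    b"<header><name>n2</name><items>"
    b"<item><title>t3</title><add>a3</add></item>"
    b"</items></header>"
    b"</root>"
)

def iter_headers(source):
    """Yield (name, [(title, add), ...]) for each <header> element."""
    for event, header in etree.iterparse(source, events=("end",)):
        if header.tag != "header":
            continue
        name = header.findtext("name")
        items = [(item.findtext("title"), item.findtext("add"))
                 for item in header.iter("item")]
        yield name, items
        # The group has been handed to the caller; free the subtree.
        header.clear()

for name, items in iter_headers(io.BytesIO(SAMPLE)):
    print(name)
    for title, add in items:
        print("----" + title)
        print("----" + add)
    print("")  # blank spacer line between headers
```

Only one header's worth of items is held at a time here, which should be fine as long as no single header is itself huge.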


 