Parsing nested XML with lxml and Python

I am having trouble parsing XML when it is in the form of:

<Cars>
    <Car>
        <Color>Blue</Color>
        <Make>Ford</Make>
        <Model>Mustang</Model>
    </Car>
    <Car>
        <Color>Red</Color>
        <Make>Chevy</Make>
        <Model>Camaro</Model>
    </Car>
</Cars>

I have figured out how to parse first-level children like this:

<Car>
    <Color>Blue</Color>
    <Make>Chevy</Make>
    <Model>Camaro</Model>
</Car>

With this kind of code:

import os
from lxml import etree

# localPath and file are defined elsewhere
a = os.path.join(localPath, file)
element = etree.parse(a)
# Selects the child nodes of Car that contain text
cars = element.xpath('//Root/Foo/Bar/Car/node()[text()]')
parsedCars = [{field.tag: field.text for field in cars} for action in cars]
print(parsedCars[0]['Make'])  # Chevy

How can I parse out multiple "Car" tags that are children of "Cars"?

Try this. Select each Car element first, then query its children relative to that element:

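import os
from lxml import etree

# localPath and file are defined elsewhere, as in the question
a = os.path.join(localPath, file)
element = etree.parse(a)
# Select the Car elements themselves rather than their children
cars = element.xpath('//Root/Foo/Bar/Car')
for car in cars:
    # Each relative XPath returns a list of matching child elements
    colors = car.xpath('./Color')
    makes = car.xpath('./Make')
    models = car.xpath('./Model')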
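
To get the same list-of-dicts result as in the question, one option is to build one dict per Car. The following is only a minimal sketch under some assumptions: the sample XML from the question is inlined as a string instead of being read from a file, the names XML and parse_cars are made up for illustration, and the import falls back to the standard library's xml.etree.ElementTree (which also provides fromstring and iter) so the snippet still runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # assumption: fall back to the stdlib if lxml is missing
    import xml.etree.ElementTree as etree

# Sample document from the question, inlined for a self-contained example
XML = """<Cars>
    <Car>
        <Color>Blue</Color>
        <Make>Ford</Make>
        <Model>Mustang</Model>
    </Car>
    <Car>
        <Color>Red</Color>
        <Make>Chevy</Make>
        <Model>Camaro</Model>
    </Car>
</Cars>"""


def parse_cars(xml_text):
    """Return one {tag: text} dict per <Car> element (hypothetical helper)."""
    root = etree.fromstring(xml_text)
    # iter('Car') walks the tree and yields Car elements at any depth
    return [
        # Iterating an element yields its direct children; the isinstance
        # check skips comments, whose tag is not a string in lxml
        {field.tag: field.text for field in car if isinstance(field.tag, str)}
        for car in root.iter('Car')
    ]


parsed_cars = parse_cars(XML)
print(parsed_cars[0]['Make'])   # Ford
print(parsed_cars[1]['Model'])  # Camaro
```

The key difference from the code in the question is that the inner dict is built from the children of one Car at a time, so the fields of different cars no longer overwrite each other.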
