简体   繁体   中英

Parsing XML with ElementTree in Python

I have XML like this:

<parameter>
 <name>ec_num</name>
 <value>none</value>
 <units/>
 <url/>
 <id>2455</id>
 <m_date>2008-11-29 13:15:14</m_date>
 <user_id>24</user_id>
 <user_name>registry</user_name>
</parameter>
<parameter>
 <name>swisspro</name>
 <value>Q8H6N2</value>
 <units/>

I want to parse the XML and extract the <value> entry which is just below the <name> entry marked 'swisspro'. Ie I want to parse and extract the 'Q8H6N2' value.

How would I do this using ElementTree?

It would by much easier to do via lxml , but here' a solution using ElementTree library:

import xml.etree.ElementTree as ET

data = """<parameters>
<parameter>
 <name>ec_num</name>
 <value>none</value>
 <units/>
 <url/>
 <id>2455</id>
 <m_date>2008-11-29 13:15:14</m_date>
 <user_id>24</user_id>
 <user_name>registry</user_name>
</parameter>
<parameter>
 <name>swisspro</name>
 <value>Q8H6N2</value>
 <units/>
</parameter>
</parameters>"""

tree = ET.fromstring(data)

for parameter in tree.iter(tag='parameter'):
    name = parameter.find('name')
    if name is not None and name.text == 'swisspro':
        print parameter.find('value').text
        break

prints:

Q8H6N2

The idea is pretty simple: iterate over all parameter tags, check the value of the name tag and if it is equal to swisspro , get the value element.

Hope that helps.

Here is an example: xml file

<span style="font-size:13px;"><?xml version="1.0" encoding="utf-8"?>
<root>
 <person age="18">
    <name>hzj</name>
    <sex>man</sex>
 </person>
 <person age="19" des="hello">
    <name>kiki</name>
    <sex>female</sex>
 </person>
</root></span>

parse method

from xml.etree import ElementTree
def print_node(node):
    '''print basic info'''
    print "=============================================="
    print "node.attrib:%s" % node.attrib
    if node.attrib.has_key("age") > 0 :
        print "node.attrib['age']:%s" % node.attrib['age']
    print "node.tag:%s" % node.tag
    print "node.text:%s" % node.text
def read_xml(text):
    '''read xml file'''
    # root = ElementTree.parse(r"D:/test.xml")  #first method
    root = ElementTree.fromstring(text)  #second method

    # get element
    # 1 by getiterator 
    lst_node = root.getiterator("person")
    for node in lst_node:
        print_node(node)

    # 2 by getchildren
    lst_node_child = lst_node[0].getchildren()[0]
    print_node(lst_node_child)

    # 3 by .find
    node_find = root.find('person')
    print_node(node_find)

    #4. by findall
    node_findall = root.findall("person/name")[1]
    print_node(node_findall)

if __name__ == '__main__':
     read_xml(open("test.xml").read())

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM