简体   繁体   中英

How to get xml elements which have childs with a certain tag and attribute

I want to find xml elements which have certain child elements. The child elements need to have a given tag and an attribute set to a specific value.

To give a concrete example (based on the official documentation ). I want to find all country elements which have a child element neighbor with attribute name="Austria" :

import xml.etree.ElementTree as ET

data = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""

root = ET.fromstring(data)

What I've tried without success:

countries1 = root.findall('.//country[neighbor@name="Austria"]')
countries2 = root.findall('.//country[neighbor][@name="Austria"]')
countries3 = root.findall('.//country[neighbor[@name="Austria"]]')

which all give:

SyntaxError: invalid predicate


Following solutions are obviously wrong, as too much elements are found:

countries4 = root.findall('.//country/*[@name="Austria"]')
countries5 = root.findall('.//country/[neighbor]')

where countries4 contains all elements having an attribute name="Austria" , but including the partner element. countries5 contains all elements which have any neighbor element as a child.

I want to find all country elements which have a child element neighbor with attribute name="Austria"

see below

import xml.etree.ElementTree as ET

data = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""

root = ET.fromstring(data)
countries_with_austria_as_neighbor = [c.attrib['name'] for c in root.findall('.//country') if
                                      'Austria' in [n.attrib['name'] for n in c.findall('neighbor')]]
print(countries_with_austria_as_neighbor)

output

['Liechtenstein']
import xml.etree.ElementTree as ET

data = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <neighbor name="Malaysia" direction="N"/>
        <partner name="Austria"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="dummy"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    
</data>
"""

root = ET.fromstring(data)
for x in root.findall(".//country/neighbor[@name='Austria']"):
    print(x.attrib)

Output:

{'name': 'Austria', 'direction': 'E'}
{'name': 'Austria', 'direction': 'dummy'}

// : Selects all subelements, on all levels beneath the current element. For example, .//egg selects all egg elements in the entire tree. Selects all subelements, on all levels beneath the current element. For example, .//egg selects all egg elements in the entire tree.

[@attrib='value'] : Selects all elements for which the given attribute has the given value. The value cannot contain quotes Selects all elements for which the given attribute has the given value. The value cannot contain quotes

for x in root.find('.'):
    if x[0].attrib['name'] == 'Austria':
                   print(x.attrib['name']) 

Output: Liechtenstein

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM