How to access nested XML tags for comparison using Python?

I have this original XML, which needs to be modified:

            <COUNTRY>
                <NAME>Place ="MALTA"</NAME>
                <DETAILS ID = "tag1"/>
                    <EUROPE CAPITAL="Valletta" />
                    <EUROPE population=123456 />
                    <EUROPE tag = "new"/>
                </DETAILS>
                <DETAILS ID = "tag2"/>
                    <EUROPE CAPITAL="NEW_CAPITAL" />
                    <EUROPE GDP=66666666 />
                    <EUROPE tag = "new"/>
                </DETAILS>
                <DETAILS ID = "tag3"/>
                    <EUROPE CLIMATE="Warm" />
                    <EUROPE Votes=123 />
                    <EUROPE tag = "new"/>
                </DETAILS>
            </COUNTRY>

Now I need to modify this XML after comparing the tags. Here I need to compare the COUNTRY/DETAILS/ID value. For example: if ID == "tag1", add a new tag ( <EUROPE tag = "tag1"/> ); if ID == "tag2", add ( <EUROPE tag = "tag2"/> ). Basically, I'm trying to modify a particular block of XML using its "TEXT" as a reference instead of its TAG or ATTRIBUTE. TL;DR: the explanation might be a little confusing, so the attempted code below may help. The expected output is:

           <COUNTRY>
                <NAME>Place ="MALTA"</NAME>
                <DETAILS ID = "tag1"/>
                    <EUROPE CAPITAL="Valletta" />
                    <EUROPE population=123456 />
                    <EUROPE tag = "new"/>
                    <EUROPE tag = "tag1"/>
                </DETAILS>
                <DETAILS ID = "tag2"/>
                    <EUROPE CAPITAL="NEW_CAPITAL" />
                    <EUROPE GDP=66666666 />
                    <EUROPE tag = "new"/>
                    <EUROPE tag = "tag2"/>
                </DETAILS>
                <DETAILS ID = "tag3"/>
                    <EUROPE CLIMATE="Warm" />
                    <EUROPE Votes=123 />
                    <EUROPE tag = "new"/>
                </DETAILS>
            </COUNTRY>

STEP 1 - Compare the ID to the tag (if ID == "tag1").

STEP 2 - Do something if the comparison succeeds (in this case, add <EUROPE tag = "tag1"/> ).

I tried the approach below, but it wasn't successful. When I try to iterate over the "details" variable, it's empty. I'm not sure whether the expression is able to match the specified XML entries.

tree = ET.parse('abc.xml')
root = tree.getroot()
details= tree.findall(".//COUNTRY[DETAILS='ID:\"tag1\"')
for d in details:
     d.append(ET.fromstring('<EUROPE tag = "tag1"/>'))
details2= tree.findall(".//COUNTRY[DETAILS='ID:\"tag2\"')
for d in details2:
     d.append(ET.fromstring('<EUROPE tag = "tag2"/>'))

As mentioned in the comments on your question, neither your sample XML nor your expected output is well formed. But assuming your sample XML is fixed like so:

<COUNTRY>
  <NAME>Place ="MALTA"</NAME>
  <DETAILS ID = "tag1">
    <EUROPE CAPITAL="Valletta" />
    <EUROPE population="123456" />
    <EUROPE tag = "new"/>
  </DETAILS>
  <DETAILS ID = "tag2">
    <EUROPE CAPITAL="NEW_CAPITAL" />
    <EUROPE GDP="66666666" />
    <EUROPE tag = "new"/>
  </DETAILS>
  <DETAILS ID = "tag3">
    <EUROPE CLIMATE="Warm" />
    <EUROPE Votes="123" />
  </DETAILS>
</COUNTRY>

and that I understand your question correctly, your main issue is with your XPath expression .//COUNTRY[DETAILS='ID:\"tag1\" , which seems to confuse elements and attributes: ID is an attribute of <DETAILS>, not its text content. This should work:

import xml.etree.ElementTree as ET

tree = ET.parse('abc.xml')
root = tree.getroot()

for details in root.findall('.//DETAILS'):
    # build the new child from the ID attribute of the current <DETAILS>
    new_europe = ET.fromstring(f'<EUROPE tag = "{details.get("ID")}"/>')
    # append() puts the new element after the existing <EUROPE> children,
    # however many of them each <DETAILS> has
    details.append(new_europe)

# indent() works with python 3.9 and above; otherwise - just delete it
ET.indent(root, space=' ', level=2)
print(ET.tostring(root).decode())

Output:

<COUNTRY>
   <NAME>Place ="MALTA"</NAME>
   <DETAILS ID="tag1">
    <EUROPE CAPITAL="Valletta" />
    <EUROPE population="123456" />
    <EUROPE tag="new" />
    <EUROPE tag="tag1" />
   </DETAILS>
   <DETAILS ID="tag2">
    <EUROPE CAPITAL="NEW_CAPITAL" />
    <EUROPE GDP="66666666" />
    <EUROPE tag="new" />
    <EUROPE tag="tag2" />
   </DETAILS>
   <DETAILS ID="tag3">
    <EUROPE CLIMATE="Warm" />
    <EUROPE Votes="123" />
    <EUROPE tag="tag3" />
   </DETAILS>
  </COUNTRY>
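
Note that the loop above tags every <DETAILS>, including the one with ID "tag3". If only specific IDs should get the new child, as in the question, one possible sketch is to select on the attribute with an [@ID='...'] predicate, which ElementTree's limited XPath support accepts. The inline XML string and the wanted_ids name are assumptions made for illustration; the question reads the document from 'abc.xml' instead.

```python
import xml.etree.ElementTree as ET

# Inline copy of the corrected sample XML (an assumption for illustration;
# the question parses it from 'abc.xml').
xml_text = """
<COUNTRY>
  <NAME>Place ="MALTA"</NAME>
  <DETAILS ID="tag1">
    <EUROPE CAPITAL="Valletta" />
    <EUROPE population="123456" />
    <EUROPE tag="new" />
  </DETAILS>
  <DETAILS ID="tag2">
    <EUROPE CAPITAL="NEW_CAPITAL" />
    <EUROPE GDP="66666666" />
    <EUROPE tag="new" />
  </DETAILS>
  <DETAILS ID="tag3">
    <EUROPE CLIMATE="Warm" />
    <EUROPE Votes="123" />
  </DETAILS>
</COUNTRY>
"""
root = ET.fromstring(xml_text)

# Hypothetical name: the IDs whose <DETAILS> block should get a new child.
wanted_ids = ["tag1", "tag2"]

for wanted in wanted_ids:
    # [@ID='...'] matches on the attribute value, not on element text.
    for details in root.findall(f".//DETAILS[@ID='{wanted}']"):
        # SubElement creates the <EUROPE> child and appends it in one step.
        ET.SubElement(details, "EUROPE", {"tag": wanted})

print(ET.tostring(root).decode())
```

With this predicate, the <DETAILS ID="tag3"> block is left unchanged, which matches the expected output in the question.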
