
Convert XML to a list of dictionaries in Python

I'm very new to Python, so please bear with me. When I try to convert the XML content below into a list of dictionaries, I get output, but not in the shape I expect, even after a lot of experimenting.

XML Content:

<project>
    <panelists>
        <panelist panelist_login="pradeep">
            <login/>
            <firstname/>
            <lastname/>
            <gender/>
            <age>0</age>
        </panelist>
        <panelist panelist_login="kumar">
            <login>kumar</login>
            <firstname>kumar</firstname>
            <lastname>Pradeep</lastname>
            <gender/>
            <age>24</age>
        </panelist>
    </panelists>
</project>

Code I have used:

import xml.etree.ElementTree as ET

tree = ET.parse('xml_file.xml')   # path to the XML file, passed as a string
root = tree.getroot()  

Panelist_list = []

for item in root.findall('./panelists/panelist'):    # find every <panelist> node
    Panelist = {}              # dictionary to store the content of each panelist
    panelist_login = {}
    panelist_login = item.attrib
    Panelist_list.append(panelist_login)
    for child in item:

      Panelist[child.tag] = child.text

    Panelist_list.append(Panelist)

print(Panelist_list)

Output:

[{
  'panelist_login': 'pradeep'
}, {
  'login': None,
  'firstname': None,
  'lastname': None,
  'gender': None,
  'age': '0'
}, {
  'panelist_login': 'kumar'
}, {
  'login': 'kumar',
  'firstname': 'kumar',
  'lastname': 'Pradeep',
  'gender': None,
  'age': '24'
}]

I'm expecting the output below:

[{
  'panelist_login': 'pradeep',
  'login': None,
  'firstname': None,
  'lastname': None,
  'gender': None,
  'age': '0'
}, {
  'panelist_login': 'kumar',
  'login': 'kumar',
  'firstname': 'kumar',
  'lastname': 'Pradeep',
  'gender': None,
  'age': '24'
}]

I have referred to many Stack Overflow questions on XML trees, but they didn't help me.

Any help or suggestion is appreciated.

Your code appends the dict panelist_login, which holds the tag attributes, to the list in this line: Panelist_list.append(panelist_login). That happens separately from the Panelist dict. So for every <panelist> tag the code appends 2 dicts: one dict of tag attributes and one dict of subtags. Inside the loop you have 2 append() calls, which means 2 items in the list for each pass through the loop.

But you actually want a single dict for each <panelist> tag, and you want the tag attribute to appear inside the Panelist dict as if it were a subtag also.

So use a single dict: update the Panelist dict with the tag attributes instead of keeping them in a separate dict.

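import xml.etree.ElementTree as ET

tree = ET.parse('xml_file.xml')   # path to the XML file
root = tree.getroot()
Panelist_list = []

for item in root.findall('./panelists/panelist'):    # find every <panelist> node
    Panelist = {}                        # one dictionary per <panelist>
    Panelist.update(item.attrib)         # copy the tag attributes in first, so panelist_login is the first key
    for child in item:
        Panelist[child.tag] = child.text # then add each subtag; an empty tag gives None
    Panelist_list.append(Panelist)       # a single append per <panelist>
print(Panelist_list)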

I get this output, which I think is what you had in mind:

[
  {'panelist_login': 'pradeep', 
  'login': None, 
  'firstname': None, 
  'lastname': None, 
  'gender': None, 
  'age': '0'}, 
  {'panelist_login': 'kumar', 
  'login': 'kumar', 
  'firstname': 'kumar', 
  'lastname': 'Pradeep', 
  'gender': None, 
  'age': '24'}
 ]
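
For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same idea. It inlines the XML from the question and parses it with ET.fromstring instead of ET.parse. The names XML_TEXT and panelists_to_dicts are just illustrative choices of mine, not anything from the original code, and merging the two dicts in one expression is only a more compact way of writing the update() loop above.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question, inlined so the sketch needs no file.
XML_TEXT = """<project>
    <panelists>
        <panelist panelist_login="pradeep">
            <login/>
            <firstname/>
            <lastname/>
            <gender/>
            <age>0</age>
        </panelist>
        <panelist panelist_login="kumar">
            <login>kumar</login>
            <firstname>kumar</firstname>
            <lastname>Pradeep</lastname>
            <gender/>
            <age>24</age>
        </panelist>
    </panelists>
</project>"""


def panelists_to_dicts(xml_text):
    """Return one dict per <panelist>: its attributes, then its child tags."""
    root = ET.fromstring(xml_text)
    result = []
    for item in root.findall('./panelists/panelist'):
        # Attributes go in first, then the subtags. If a subtag had the same
        # name as an attribute, the subtag value would overwrite the attribute.
        row = {**item.attrib, **{child.tag: child.text for child in item}}
        result.append(row)  # exactly one append per <panelist>
    return result


panelist_list = panelists_to_dicts(XML_TEXT)
print(panelist_list)
```

Note that child.text is None for empty tags such as <gender/>, and every other value comes back as a string (so age is '24', not 24). If you need numbers you would have to convert them yourself.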

