I am a relatively new programmer using xml.etree.ElementTree to iterate through an XML file of member data. For each member I extract data and assign it to variables for surname, firstname and id.
The problem is that if a member has a missing element (as opposed to missing data within an element), my current code keeps the variable's value from the previous iteration (member).
My data:
<?xml version='1.0' ?>
<members>
    <member>
        <field name="surname">
            <text>Smith</text>
        </field>
        <field name="firstname" type="text">
            <text>John</text>
        </field>
        <field name="id" type="text">
            <text>123</text>
        </field>
    </member>
    <member>
        <field name="surname" type="text">
            <text>Bloggs</text>
        </field>
        <!--missing firstname element here -->
        <field name="id" type="text">
            <text>789</text>
        </field>
    </member>
    <member>
        <field name="surname" type="text">
            <text>Jones</text>
        </field>
        <field name="firstname" type="text">
            <text>Jane</text>
        </field>
        <field name="id" type="text">
            <text>456</text>
        </field>
    </member>
</members>
My code:
import xml.etree.ElementTree as ET

tree = ET.parse('mydata.xml')
root = tree.getroot()
y = root.findall('member')
for member in y:
    fields = member.findall("field")
    for field in fields:
        if field.get('name') == 'surname':
            surname = field.find('text').text
        if field.get('name') == 'firstname':
            firstname = field.find('text').text
        if field.get('name') == 'id':
            id = field.find('text').text
    print(surname, firstname, id)
Desired output:
Smith John 123
Bloggs 789
Jones Jane 456
Actual output, which shows Bloggs' firstname as John rather than blank:
Smith John 123
Bloggs John 789
Jones Jane 456
I can avoid this by resetting the surname, firstname and id variables at the start of each member iteration:
for member in y:
    surname = ''
    firstname = ''
    id = ''
    fields = member.findall("field")
    for field in fields:
        if field.get('name') == 'surname':
            surname = field.find('text').text
        if field.get('name') == 'firstname':
            firstname = field.find('text').text
        if field.get('name') == 'id':
            id = field.find('text').text
    print(surname, firstname, id)
which gives the desired result:
Smith John 123
Bloggs 789
Jones Jane 456
However, this feels like a bit of a workaround. Is there an alternative, more Pythonic way to achieve this?
What you have is actually quite OK and readable. But if you really want to, you could use a conditional (ternary) expression. The variables still have to be reset for each member, and each else branch keeps the current value so that one field does not wipe out the others:
for member in y:
    surname = firstname = id = ''
    for field in member.findall("field"):
        name, text = field.get('name'), field.find('text').text
        surname = text if name == 'surname' else surname
        firstname = text if name == 'firstname' else firstname
        id = text if name == 'id' else id
    print(surname, firstname, id)
Maybe this is easier to achieve by putting the data into a dict using a dict comprehension. This way the dict is rebuilt for every member, even if it ends up completely empty because the member has no fields:
for member in root.findall("member"):
    data = {field.get("name"): field.find("text").text for field in member.findall("field")}
    print(
        data.get("surname", "(no surname)"),
        data.get("firstname", "(no firstname)"),
        data.get("id", "(no id)")
    )
=>
Smith John 123
Bloggs (no firstname) 789
Jones Jane 456
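Another option, sketched here on the assumption that the file has the same layout as the sample in the question, is ElementTree's findtext() with an attribute predicate. It returns the given default when nothing matches the path, so no value can carry over from the previous member. The inline XML_SAMPLE string, the member_id name and the rows list are only there to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample data; the second member has no firstname field.
XML_SAMPLE = """<members>
  <member>
    <field name="surname"><text>Smith</text></field>
    <field name="firstname" type="text"><text>John</text></field>
    <field name="id" type="text"><text>123</text></field>
  </member>
  <member>
    <field name="surname" type="text"><text>Bloggs</text></field>
    <field name="id" type="text"><text>789</text></field>
  </member>
</members>"""

root = ET.fromstring(XML_SAMPLE)

rows = []
for member in root.findall("member"):
    # Select the <field> by its name attribute, then its <text> child.
    # findtext() falls back to the default when the path matches nothing.
    surname = member.findtext("field[@name='surname']/text", default="")
    firstname = member.findtext("field[@name='firstname']/text", default="")
    member_id = member.findtext("field[@name='id']/text", default="")
    rows.append((surname, firstname, member_id))
    print(surname, firstname, member_id)
```

Using member_id instead of id also avoids shadowing Python's built-in id() function.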
This logic is probably better isolated in its own function. And yes, you should reset the variables first if you want empty fields.
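A minimal sketch of what such a function could look like, assuming the field layout from the question. The names member_fields and FIELD_NAMES are made up for this example. Because the result dict is built fresh on every call, there is nothing to reset between members.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of the field names we expect on every member.
FIELD_NAMES = ("surname", "firstname", "id")


def member_fields(member, names=FIELD_NAMES, default=""):
    """Return {name: text} for one <member>, using default for missing fields."""
    # Collect whatever fields are present; findtext() also covers a
    # <field> that has no <text> child.
    found = {
        field.get("name"): field.findtext("text", default=default)
        for field in member.findall("field")
    }
    # Guarantee that every expected name is present in the result.
    return {name: found.get(name, default) for name in names}


# The member from the question that has no firstname field.
member = ET.fromstring(
    '<member>'
    '<field name="surname" type="text"><text>Bloggs</text></field>'
    '<field name="id" type="text"><text>789</text></field>'
    '</member>'
)
record = member_fields(member)
print(record["surname"], record["firstname"], record["id"])
```

In the original loop this would be called once per member, for example record = member_fields(member) inside for member in root.findall("member").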