XML extracting parsing with ElementTree

I am trying to parse some XML data from this url: http://py4e-data.dr-chuck.net/comments_42.xml , return the Count value and sum the extracted values.

import urllib as ur
import xml.etree.ElementTree as ET

url = input(('Enter location: '))
print'Retrieving:', url

data = ur.urlopen(url).read()
tree = ET.fromstring(data)
counts = tree.findall('.//count')

print('Count: ', sum(counts))
#print('Sum: ', sum_all)

I understand there is some basic issue here, but I have not managed to fix my code. I am receiving a TypeError as follows:

Enter location: 'http://py4e-data.dr-chuck.net/comments_42.xml'
Retrieving: http://py4e-data.dr-chuck.net/comments_42.xml
Traceback (most recent call last):
  File "extracting_xml.py", line 11, in <module>
    print('Count: ', sum(counts))
TypeError: unsupported operand type(s) for +: 'int' and 'Element'

The error comes from the summation sum(counts). Instead you should do:

sum([int(el.text) for el in counts])

As the exception indicates, you are trying to sum the found nodes, which are of type Element and have no addition operator defined. (sum starts from 0, so the first addition it attempts is 0 + Element, which is why the message names 'int' and 'Element'.) The nodes contain plain integers, so you need to convert each node's text to int and then sum the results.
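To make the fix concrete, here is a minimal sketch in Python 3 (the question's code appears to be Python 2). The sample XML, its values, and the helper names sum_counts and sum_counts_from_url are made up for illustration; the real feed may be laid out differently, and the only assumption taken from the question is that it contains count elements holding integers.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical sample data: an assumed layout with made-up values,
# not a copy of the real feed.
SAMPLE_XML = b"""<commentinfo>
  <comments>
    <comment><name>A</name><count>3</count></comment>
    <comment><name>B</name><count>5</count></comment>
    <comment><name>C</name><count>7</count></comment>
  </comments>
</commentinfo>"""


def sum_counts(xml_data):
    """Return (number of <count> elements, sum of their integer values)."""
    root = ET.fromstring(xml_data)
    counts = root.findall('.//count')
    # Each item is an Element; its .text holds the number as a string,
    # so convert to int before summing.
    return len(counts), sum(int(el.text) for el in counts)


def sum_counts_from_url(url):
    """Fetch XML from a URL (Python 3: urlopen lives in urllib.request)."""
    with urllib.request.urlopen(url) as response:
        return sum_counts(response.read())


print(sum_counts(SAMPLE_XML))  # (3, 15)
```

Calling sum_counts_from_url with the URL from the question should work the same way, provided the feed really uses count elements. Note that an empty count element has .text set to None, which int() would reject.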

If your nodes contain floats, use the float constructor instead.
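A short sketch of the float variant, using a made-up XML string and a hypothetical helper name (sum_float_counts); only the conversion function changes:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with fractional values, for illustration only.
FLOAT_XML = "<items><count>1.5</count><count>2.25</count></items>"


def sum_float_counts(xml_data):
    """Sum the values of all <count> elements, parsed as floats."""
    root = ET.fromstring(xml_data)
    return sum(float(el.text) for el in root.findall('.//count'))


print(sum_float_counts(FLOAT_XML))  # 3.75
```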
