
How can I access the data in an XML node when using ElementTree?

I am parsing the XML located at this link:

XML File to Parse

I need to access the data inside a node, but the program I have written seems to be telling me that there is nothing inside the node. Here is my code:

import urllib
import xml.etree.ElementTree as ET 

#prompt for link where xml data resides
#Use this link for testing: http://python-data.dr-chuck.net/comments_42.xml
url = raw_input('Enter URL Link: ')

#open url and prep for parsing
data = urllib.urlopen(url).read()

#read url data and convert to XML Node Tree for parsing
comments = ET.fromstring(data)

#the comment below is part of another approach to the solution
#both approaches are leading me into the same direction
#it appears as if the data inside the node is not being parsed/extracted
#counts = comments.findall('comments/comment/count')

for count in comments.findall('count'):
    print comments.find('count').text

When I print the 'data' variable on its own, I get the complete XML tree. However, when I try to access the data inside a particular node, the node comes back empty.

I also tried printing the following code to see what data I would get back:

for child in comments:
    print child.tag, child.attrib

The output I got was:

note {}
comments {}

What am I doing wrong, and what am I missing?

One of the errors I get when trying a different looping strategy to access the node is this:

Traceback (most recent call last):
  File "xmlextractor.py", line 16, in <module>
    print comments.find('count').text
AttributeError: 'NoneType' object has no attribute 'text'

Please help and thanks!!!


I've realized, after looking through the Python etree docs, that my approach has been trying to 'get' the node attributes instead of the contents of the nodes. I still haven't found an answer, but I am definitely closer!


So I tried out this code:

import urllib
import xml.etree.ElementTree as ET 

#prompt for link where xml data resides
#Use this link for testing: http://python-data.dr-chuck.net/comments_42.xml

url = raw_input('Enter URL Link: ')

#open url and prep for parsing
data = urllib.urlopen(url).read()

#read url data and convert to XML Node Tree for parsing
comments = ET.fromstring(data)

counts = comments.findall('comments/comment/count')

print len(counts)

for count in counts:
    print 'count', count.find('count').text

When I run the code above, the line

print len(counts)

reports that I have 50 nodes in my counts list, but I still get the same error:

Traceback (most recent call last):
  File "xmlextractor.py", line 18, in <module>
    print 'count', count.find('count').text
AttributeError: 'NoneType' object has no attribute 'text'

I don't understand why it says that there is no 'text' attribute when I am trying to access the contents of the node.

What am I doing wrong??

A few comments on your approaches:

for count in comments.findall('count'):
    print comments.find('count').text

comments.findall('count') returns an empty list because comments does not have any immediate child elements with the name count .
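To see this on a small scale, here is a minimal sketch in Python 3 syntax that parses an inline XML string instead of the downloaded file. The root tag name commentinfo and the sample values are assumptions made for illustration; only the note / comments / comment / count nesting is taken from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the nesting described in the question;
# the root tag name and the values are made up for illustration.
SAMPLE = """<commentinfo>
  <note>sample data</note>
  <comments>
    <comment><name>A</name><count>97</count></comment>
    <comment><name>B</name><count>3</count></comment>
  </comments>
</commentinfo>"""

root = ET.fromstring(SAMPLE)

# A bare tag name only matches direct children of the element.
direct = root.findall('count')

# A path walks down one level per step.
by_path = root.findall('comments/comment/count')

# './/count' searches all descendants, whatever their depth.
anywhere = root.findall('.//count')

print(len(direct), len(by_path), len(anywhere))
```

The bare 'count' lookup finds nothing because the count elements sit three levels below the root, while both the explicit path and the './/count' form reach them.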

for child in comments:
    print child.tag, child.attrib

This iterates over the immediate child elements of your root node, which, according to your output, are note and comments.

# From update #2
for count in comments.findall('comments/comment/count'):
    print 'count', count.find('count').text

Here, count is an Element object representing a count node, which itself does not contain any count nodes. Thus, count.find('count') returns None, and accessing .text on None raises the AttributeError you are seeing.
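The same behaviour can be reproduced in isolation. This is a small sketch in Python 3 syntax, using a made-up single comment element rather than your real data. It also shows findtext, which returns a default value instead of failing when nothing matches.

```python
import xml.etree.ElementTree as ET

# Hypothetical single <comment> element, made up for illustration
comment = ET.fromstring('<comment><name>A</name><count>97</count></comment>')

count = comment.find('count')   # an Element: <count> is a child of <comment>
nested = count.find('count')    # None: <count> has no <count> child of its own

print(count.text)
print(nested is None)

# findtext returns the default rather than raising when no child matches
missing = count.findtext('count', default='n/a')
print(missing)
```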

If I understand correctly, your goal is to retrieve the text values of the count nodes. Here are two ways this can be achieved:

for count in comments.findall('comments/comment/count'):
    print count.text

for comment in comments.iter('comment'):
    print comment.find('count').text
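For anyone on Python 3, where raw_input and urllib.urlopen no longer exist, the first approach might be ported along these lines. The helper names count_texts and fetch, the root tag commentinfo and the inline sample values are all made up for this sketch. fetch is defined but not called, so the example runs without network access.

```python
import urllib.request
import xml.etree.ElementTree as ET


def count_texts(xml_data):
    """Return the text of every count element found under comments/comment."""
    root = ET.fromstring(xml_data)
    return [c.text for c in root.findall('comments/comment/count')]


def fetch(url):
    """Download the raw XML bytes (Python 3 stand-in for urllib.urlopen)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


# Inline stand-in for the downloaded document; values are illustrative only.
sample = (
    b"<commentinfo>"
    b"<note>sample data</note>"
    b"<comments>"
    b"<comment><name>A</name><count>97</count></comment>"
    b"<comment><name>B</name><count>3</count></comment>"
    b"</comments>"
    b"</commentinfo>"
)

for text in count_texts(sample):
    print(text)
```

With a real URL, count_texts(fetch(url)) would run the same lookup on the downloaded data.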
