
How can I access the data in an XML node when using ElementTree?

I am parsing the XML located at this link:

XML File to Parse

I need to access the data inside a node, but the program I have written seems to be telling me that the node is empty. Here is my code:

import urllib
import xml.etree.ElementTree as ET 

#prompt for link where xml data resides
#Use this link for testing: http://python-data.dr-chuck.net/comments_42.xml
url = raw_input('Enter URL Link: ')

#open url and prep for parsing
data = urllib.urlopen(url).read()

#read url data and convert to XML Node Tree for parsing
comments = ET.fromstring(data)

#the comment below is part of another approach to the solution
#both approaches are leading me into the same direction
#it appears as if the data inside the node is not being parsed/extracted
#counts = comments.findall('comments/comment/count')

for count in comments.findall('count'):
    print comments.find('count').text

When I print the 'data' variable on its own, I get the complete XML tree. However, when I try to access the data inside a particular node, the node comes back empty.

I also tried running the following code to see what data I would get back:

for child in comments:
    print child.tag, child.attrib

The output I got was:

note {}
comments {}

What am I doing wrong, and what am I missing?

One of the errors I get when trying a different looping strategy to access the node is this:

Traceback (most recent call last):
  File "xmlextractor.py", line 16, in <module>
    print comments.find('count').text
AttributeError: 'NoneType' object has no attribute 'text'

Please help, and thanks!

UPDATE:

I've realized, while looking through the etree docs for Python, that my approach has been trying to 'get' the node attributes instead of the contents of the nodes. I still haven't found an answer, but I am definitely closer!

2nd UPDATE:

So I tried out this code:

import urllib
import xml.etree.ElementTree as ET 

#prompt for link where xml data resides
#Use this link for testing: http://python-data.dr-chuck.net/comments_42.xml

url = raw_input('Enter URL Link: ')

#open url and prep for parsing
data = urllib.urlopen(url).read()

#read url data and convert to XML Node Tree for parsing
comments = ET.fromstring(data)

counts = comments.findall('comments/comment/count')

print len(counts)

for count in counts:
    print 'count', count.find('count').text

When I run the code above, the line

print len(counts)

reports that I have 50 nodes in my counts list, but I still get the same error:

Traceback (most recent call last):
  File "xmlextractor.py", line 18, in <module>
    print 'count', count.find('count').text
AttributeError: 'NoneType' object has no attribute 'text'

I don't understand why it says that there is no 'text' attribute when I am trying to access the contents of the node.

What am I doing wrong?

A few comments on your approaches:

for count in comments.findall('count'):
    print comments.find('count').text

comments.findall('count') returns an empty list because comments does not have any immediate child elements named count. The loop body therefore never runs.
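To see why the list is empty, here is a minimal sketch written for Python 3 (where print is a function). The inline XML is a made-up stand-in that assumes the same nesting as the real file (root, then comments, then comment, then count); the root tag name and all values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that mimics the assumed layout of the real file.
SAMPLE = """<commentinfo>
  <note>sample data</note>
  <comments>
    <comment><name>Ann</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
  </comments>
</commentinfo>"""

root = ET.fromstring(SAMPLE)

# A bare tag name only matches direct children of the element.
direct = root.findall('count')

# A path walks down level by level, starting from the element.
by_path = root.findall('comments/comment/count')

# './/count' searches all descendants, at any depth.
anywhere = root.findall('.//count')

print(len(direct), len(by_path), len(anywhere))  # prints: 0 2 2
```

The './/count' form is handy when the exact nesting is unknown, while the explicit path makes the expected structure visible in the code.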

for child in comments:
    print child.tag, child.attrib

This iterates over the immediate child elements of your root node, which are called note and comments.

# From update #2
for count in comments.findall('comments/comment/count'):
    print 'count', count.find('count').text

Here, count is an Element object representing a count node, which itself does not contain any count nodes. Thus, count.find('count') returns None, and accessing .text on None raises the AttributeError.
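The failure can be reproduced on a single element. This is a sketch for Python 3 using a hypothetical one-element input; it shows that find returns None when nothing matches, and that the value sits in the element's own text.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one <count> node from the file.
count = ET.fromstring('<count>97</count>')

# <count> has no child named 'count', so find() returns None.
inner = count.find('count')
print(inner is None)  # prints: True

# The value is the element's own text.
print(count.text)  # prints: 97

# findtext() returns a default instead of raising when nothing matches.
missing = count.findtext('count', default='(no such child)')
print(missing)  # prints: (no such child)
```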

If I understand correctly, your goal is to retrieve the text values of the count nodes. Here are two ways this can be achieved:

for count in comments.findall('comments/comment/count'):
    print count.text

for comment in comments.iter('comment'):
    print comment.find('count').text
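Both loops use Python 2 syntax, like the question. For readers on Python 3, a rough port might look like the sketch below. The helper names count_values and fetch are hypothetical, and the inline sample assumes the same layout as the real file so that the example runs without network access.

```python
import urllib.request
import xml.etree.ElementTree as ET


def count_values(xml_data):
    """Return the text of every count node as a list of strings."""
    root = ET.fromstring(xml_data)
    return [node.text for node in root.findall('comments/comment/count')]


def fetch(url):
    """Download raw XML bytes (Python 3 counterpart of urllib.urlopen)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


# Hypothetical inline sample; the root tag name and values are made up.
SAMPLE = b"""<commentinfo>
  <comments>
    <comment><name>Ann</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
  </comments>
</commentinfo>"""

print(count_values(SAMPLE))  # prints: ['97', '3']
```

With a live URL, the same function would be called as count_values(fetch(url)). The values come back as strings, so they need int() before any arithmetic.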
