Python 3.4 - XML Parse - IndexError: List Index Out of Range - How do I find range of XML?
Okay guys, I'm new to parsing XML and Python, and am trying to get this to work. If someone could help me with this, it would be greatly appreciated. If you can help me (educate me) on how to figure it out for myself, that would be even better!
I am having trouble figuring out the range to reference for an XML document, as I can't find any documentation on it. Here is my code, and I'll include the entire traceback after.
#import library to do http requests:
import urllib.request
#import easy to use xml parser called minidom:
from xml.dom.minidom import parseString
#all these imports are standard on most modern python implementations
#download the file:
file = urllib.request.urlopen('http://www.wizards.com/dndinsider/compendium/CompendiumSearch.asmx/KeywordSearch?Keywords=healing%20%word&nameOnly=True&tab=')
#convert to string:
data = file.read()
#close file because we dont need it anymore:
file.close()
#parse the xml you downloaded
dom = parseString(data)
#retrieve the first xml tag (<tag>data</tag>) that the parser finds with name tagName:
xmlTag = dom.getElementsByTagName('Data.Results.Power.ID')[0].toxml()
#strip off the tag (<tag>data</tag> ---> data):
xmlData=xmlTag.replace('<id>','').replace('</id>','')
#print out the xml tag and data in this format: <tag>data</tag>
print(xmlTag)
#just print the data
print(xmlData)
Traceback
/usr/bin/python3.4 /home/mint/PycharmProjects/DnD_Project/Power_Name.py
Traceback (most recent call last):
File "/home/mint/PycharmProjects/DnD_Project/Power_Name.py", line 14, in <module>
xmlTag = dom.getElementsByTagName('id')[0].toxml()
IndexError: list index out of range
Process finished with exit code 1
# check how many elements were found before indexing into the result
print(len(dom.getElementsByTagName('id')))
EDIT:
ids = dom.getElementsByTagName('id')
if len(ids) > 0:
    xmlTag = ids[0].toxml()
    # rest of code
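To see why the check above matters, here is a minimal sketch of how the IndexError arises. The sample XML is a made-up stand-in (an assumption, not the real service response): when no element matches, getElementsByTagName() returns an empty list, and indexing it with [0] fails.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; the real response from the server may differ.
sample = b"<Data><Results><Power><ID>123</ID></Power></Results></Data>"
dom = parseString(sample)

# Tag names are case-sensitive, so 'id' does not match the <ID> element.
ids = dom.getElementsByTagName('id')
print(len(ids))

# Indexing an empty result is what raises the error from the traceback.
caught = False
try:
    ids[0]
except IndexError as err:
    caught = True
    print("IndexError:", err)
```

Printing the length first, as suggested above, shows right away whether the tag name matched anything.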
EDIT: I added an example because I saw in another comment that you don't know how to use it.
BTW: I added some comments in the code about the file/connection.
import urllib.request
from xml.dom.minidom import parseString
# create connection to data/file on server
connection = urllib.request.urlopen('http://www.wizards.com/dndinsider/compendium/CompendiumSearch.asmx/KeywordSearch?Keywords=healing%20%word&nameOnly=True&tab=')
# read from server as string (not "convert" to string):
data = connection.read()
#close connection because we dont need it anymore:
connection.close()
dom = parseString(data)
# get tags from dom
ids = dom.getElementsByTagName('Data.Results.Power.ID')
# check if there are any data
if len(ids) > 0:
    xmlTag = ids[0].toxml()
    xmlData = xmlTag.replace('<id>','').replace('</id>','')
    print(xmlTag)
    print(xmlData)
else:
    print("Sorry, there was no data")
Or you can use a for loop if there are more tags:
dom = parseString(data)
# get tags from dom
ids = dom.getElementsByTagName('Data.Results.Power.ID')
# get all tags - one by one
for one_tag in ids:
    xmlTag = one_tag.toxml()
    xmlData = xmlTag.replace('<id>','').replace('</id>','')
    print(xmlTag)
    print(xmlData)
BTW: getElementsByTagName() expects the tag name ID, not the path Data.Results.Power.ID.
The tag is ID, so you have to replace <ID>, not <id>.
You can also use one_tag.firstChild.nodeValue in place of xmlTag.replace.
dom = parseString(data)
# get tags from dom
ids = dom.getElementsByTagName('ID') # tagname
# get all tags - one by one
for one_tag in ids:
    xmlTag = one_tag.toxml()
    #xmlData = xmlTag.replace('<ID>','').replace('</ID>','')
    xmlData = one_tag.firstChild.nodeValue
    print(xmlTag)
    print(xmlData)
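The points above can be tried without hitting the server. The following is only a sketch: the inline XML and its element names (Data, Results, Power, ID, Name) are assumptions modeled on the path in the question, not the actual response. It compares a lookup by dotted path with a lookup by tag name, then reads the text with firstChild.nodeValue.

```python
from xml.dom.minidom import parseString

# Hypothetical sample data; the layout is assumed from the path in the question.
sample = (
    "<Data><Results>"
    "<Power><ID>101</ID><Name>First Power</Name></Power>"
    "<Power><ID>202</ID><Name>Second Power</Name></Power>"
    "</Results></Data>"
)
dom = parseString(sample)

# A dotted path is treated as one literal tag name, so nothing matches.
by_path = dom.getElementsByTagName('Data.Results.Power.ID')
print(len(by_path))

# The plain tag name matches every <ID> element at any depth.
by_name = dom.getElementsByTagName('ID')
print(len(by_name))

# Read the text node inside each element; skip empty elements such as <ID/>.
values = [node.firstChild.nodeValue for node in by_name if node.firstChild is not None]
print(values)
```

The guard on firstChild is there because an empty element has no child node, and reading nodeValue from None would raise an AttributeError.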
I haven't used the built-in xml library in a while, but it's covered in Mark Pilgrim's great Dive Into Python book. I see as I'm typing this that your question has already been answered, but since you mention being new to Python, I think you will find the text useful for XML parsing and as an excellent introduction to the language.
If you would like to try another approach to parsing XML and HTML, I highly recommend lxml.
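lxml is a third-party package, so as a rough illustration of that style of parsing, here is a sketch that uses the standard library's xml.etree.ElementTree instead (lxml.etree offers a largely compatible interface). The sample XML is again an assumption, not the real response. Unlike getElementsByTagName(), findall() does accept a simple path.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real response may be structured differently.
sample = (
    "<Data><Results>"
    "<Power><ID>101</ID></Power>"
    "<Power><ID>202</ID></Power>"
    "</Results></Data>"
)

# fromstring() returns the root element, here <Data>.
root = ET.fromstring(sample)

# The path is relative to the root, so it starts at Results, not Data.
power_ids = [element.text for element in root.findall('./Results/Power/ID')]
print(power_ids)
```

If no element matches, findall() returns an empty list, so the loop simply produces nothing instead of raising an IndexError.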