Python 3.4 - XML Parse - IndexError: List Index Out of Range - How do I find range of XML?

Okay guys, I'm new to parsing XML and Python, and am trying to get this to work. If someone could help me with this, it would be greatly appreciated. If you can help me (educate me) on how to figure it out for myself, that would be even better!

I am having trouble figuring out the range to reference for an XML document, as I can't find any documentation on it. Here is my code, followed by the entire traceback.

#import library to do http requests:
import urllib.request

#import easy to use xml parser called minidom:
from xml.dom.minidom import parseString
#all these imports are standard on most modern python implementations

#download the file:
file = urllib.request.urlopen('http://www.wizards.com/dndinsider/compendium/CompendiumSearch.asmx/KeywordSearch?Keywords=healing%20%word&nameOnly=True&tab=')
#convert to string:
data = file.read()
#close file because we don't need it anymore:
file.close()
#parse the xml you downloaded
dom = parseString(data)
#retrieve the first xml tag (<tag>data</tag>) that the parser finds with name tagName:
xmlTag = dom.getElementsByTagName('Data.Results.Power.ID')[0].toxml()
#strip off the tag (<tag>data</tag>  --->   data):
xmlData=xmlTag.replace('<id>','').replace('</id>','')
#print out the xml tag and data in this format: <tag>data</tag>
print(xmlTag)
#just print the data
print(xmlData)

Traceback

/usr/bin/python3.4 /home/mint/PycharmProjects/DnD_Project/Power_Name.py
Traceback (most recent call last):
  File "/home/mint/PycharmProjects/DnD_Project/Power_Name.py", line 14, in <module>
    xmlTag = dom.getElementsByTagName('id')[0].toxml()
IndexError: list index out of range

Process finished with exit code 1

# Check how many elements matched before indexing [0]:
print(len(dom.getElementsByTagName('id')))

EDIT:

ids = dom.getElementsByTagName('id')

if len(ids) > 0:
    xmlTag = ids[0].toxml()
    # rest of code
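
The length check above can be wrapped in a small helper. This is only a sketch: `first_element_xml` is a hypothetical name, and the XML strings are made-up stand-ins, since the real server response is not shown in the question.

```python
from xml.dom.minidom import parseString


def first_element_xml(dom, tag_name):
    """Return the first <tag_name> element as an XML string, or None if absent.

    Hypothetical helper for illustration; it is not part of minidom.
    """
    matches = dom.getElementsByTagName(tag_name)
    # getElementsByTagName returns an empty list when nothing matches,
    # so indexing [0] without this check raises IndexError.
    if not matches:
        return None
    return matches[0].toxml()


# Made-up sample documents (assumptions, not the real response).
with_id = parseString("<Data><ID>42</ID></Data>")
without_id = parseString("<Data><Name>x</Name></Data>")

print(first_element_xml(with_id, "ID"))
print(first_element_xml(without_id, "ID"))
```

Returning None instead of raising lets the caller decide what "no data" should mean.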

EDIT: I added an example because I saw in another comment that you don't know how to use it.

BTW: I added some comments in the code about the file/connection.

import urllib.request

from xml.dom.minidom import parseString

# create connection to data/file on server
connection = urllib.request.urlopen('http://www.wizards.com/dndinsider/compendium/CompendiumSearch.asmx/KeywordSearch?Keywords=healing%20%word&nameOnly=True&tab=')

# read the response body from the server (read() returns bytes; nothing is "converted"):
data = connection.read()

# close connection because we don't need it anymore:
connection.close()

dom = parseString(data)

# get tags from dom
ids = dom.getElementsByTagName('Data.Results.Power.ID')

# check if there are any data
if len( ids ) > 0 :
    xmlTag = ids[0].toxml()
    xmlData=xmlTag.replace('<id>','').replace('</id>','')
    print(xmlTag)
    print(xmlData)
else:
    print("Sorry, there was no data")

Or you can use a for loop if there are more tags:

dom = parseString(data)

# get tags from dom
ids = dom.getElementsByTagName('Data.Results.Power.ID')

# get all tags - one by one
for one_tag in ids:
    xmlTag = one_tag.toxml()
    xmlData = xmlTag.replace('<id>','').replace('</id>','')
    print(xmlTag)
    print(xmlData)

BTW:

  • getElementsByTagName() expects the tag name ID, not the path Data.Results.Power.ID
  • the tag name is ID, so you have to replace <ID>, not <id>
  • for this tag you can even use one_tag.firstChild.nodeValue in place of xmlTag.replace

dom = parseString(data)

# get tags from dom
ids = dom.getElementsByTagName('ID') # tagname

# get all tags - one by one
for one_tag in ids:
    xmlTag = one_tag.toxml()
    #xmlData = xmlTag.replace('<ID>','').replace('</ID>','')
    xmlData = one_tag.firstChild.nodeValue
    print(xmlTag)
    print(xmlData)
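
To see the three points above without hitting the server, here is a minimal sketch on an inline document. The XML is made up to match the path in the question (an assumption about the response shape), and the names `by_path`, `by_lower`, `by_name` and `values` are only for illustration.

```python
from xml.dom.minidom import parseString

# Made-up XML shaped like the path in the question (an assumption;
# the real server response is not shown).
sample = (
    "<Data><Results>"
    "<Power><ID>101</ID></Power>"
    "<Power><ID>202</ID></Power>"
    "</Results></Data>"
)
dom = parseString(sample)

# A dotted path is treated as one literal tag name, so nothing matches.
by_path = dom.getElementsByTagName("Data.Results.Power.ID")
# Tag names are case-sensitive: 'id' does not match <ID>.
by_lower = dom.getElementsByTagName("id")
# The bare tag name finds every <ID> element in the document.
by_name = dom.getElementsByTagName("ID")

print(len(by_path), len(by_lower), len(by_name))  # 0 0 2

# firstChild is None for an empty element, so guard before reading nodeValue.
values = [tag.firstChild.nodeValue for tag in by_name if tag.firstChild is not None]
print(values)  # ['101', '202']
```

The first two lookups return empty lists, which is exactly the situation where `[0]` raises IndexError.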

I haven't used the built-in xml library in a while, but it's covered in Mark Pilgrim's great book Dive into Python.

I see as I'm typing this that your question has already been answered, but since you mention being new to Python, I think you will find the book useful for XML parsing and as an excellent introduction to the language.

If you would like to try another approach to parsing XML and HTML, I highly recommend lxml.
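
As a rough sketch of what that might look like: lxml's `etree` module accepts path expressions in `findall`, which is closer to the `Data.Results.Power.ID` idea from the question. The XML below is a made-up sample (an assumption about the response shape), and the fallback to the standard library's `xml.etree.ElementTree` is only there so the snippet runs where lxml is not installed.

```python
try:
    # lxml is a third-party package (pip install lxml).
    from lxml import etree
except ImportError:
    # The standard library offers a compatible API for this simple usage.
    import xml.etree.ElementTree as etree

# Made-up sample document (an assumption, not the real response).
sample = (
    "<Data><Results>"
    "<Power><ID>101</ID></Power>"
    "<Power><ID>202</ID></Power>"
    "</Results></Data>"
)

root = etree.fromstring(sample)  # root is the <Data> element

# Paths are relative to the root element, so 'Data' itself is not repeated.
ids = [element.text for element in root.findall("Results/Power/ID")]
print(ids)

# A path that matches nothing yields an empty list instead of raising.
missing = root.findall("Results/Power/id")
print(missing)
```

Note that the path separator is `/`, not `.`, and that tag names are still case-sensitive.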
