
Error with xmltodict

EDIT: I can print rev['contributor'] for a while, but then every attempt to access rev['contributor'] returns the following:

 TypeError: string indices must be integers

ORIGINAL POST: I'm trying to extract data from an XML file using xmltodict with the following code:

import xmltodict, json

with open('Sockpuppet_articles.xml', encoding='utf-8') as xml_file:
    dic_xml = xmltodict.parse(xml_file.read(), xml_attribs=False)
    print("parsed")
    for page in dic_xml['mediawiki']['page']:
        for rev in  page['revision']:
            for user in open("Sockpuppet_names.txt", "r", encoding='utf-8'):
                user = user.strip()

                if 'username' in rev['contributor'] and rev['contributor']['username'] == user:
                    dosomething()

I get this error on the line with the if statement:

TypeError: string indices must be integers

The weird thing is that it works on another XML file.

I got the same error when the next level contains only one element.

...

## Read XML
pastas = [os.path.join(caminho, name) for name in os.listdir(caminho)]
pastas = filter(os.path.isdir, pastas)
for pasta in pastas:
    for arq in glob.glob(os.path.join(pasta, "*.xml")):
        xmlData = codecs.open(arq, 'r', encoding='utf8').read()
        xmlDict = xmltodict.parse(xmlData, xml_attribs=True)["XMLBIBLE"]
        bible_name = xmlDict["@biblename"]
        list_verse = []
        for xml_inBook in xmlDict["BIBLEBOOK"]:
            bnumber = xml_inBook["@bnumber"]
            bname = xml_inBook["@bname"]
            for xml_chapter in xml_inBook["CHAPTER"]:
                cnumber = xml_chapter["@cnumber"]
                for xml_verse in xml_chapter["VERS"]:
                    vnumber = xml_verse["@vnumber"]
                    vtext = xml_verse["#text"]
...


TypeError: string indices must be integers

The error occurs when the book is "Obadiah", which has only one chapter.

[Image: xml_inBook]

Clicking the CHAPTER value shows the following view, and one would expect xml_chapter to look the same. That is true only if the book has more than one chapter:

[Image not available]

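But with a single chapter the loop yields the string "@cnumber" instead of an OrderedDict. In that case xml_inBook["CHAPTER"] is itself one OrderedDict rather than a list of them, and iterating over a dict yields its keys.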
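The behavior can be reproduced without any XML at all. The following is a minimal sketch, assuming the parsed structure looks like the plain dicts below; the sample values and the helper name first_item are made up for illustration.

```python
# Shapes assumed to mirror what the parser returns; the values are invented.
many_chapters = {
    "CHAPTER": [
        {"@cnumber": "1", "VERS": []},
        {"@cnumber": "2", "VERS": []},
    ]
}
one_chapter = {"CHAPTER": {"@cnumber": "1", "VERS": []}}


def first_item(book):
    """Return the first thing a for-loop over book["CHAPTER"] would yield."""
    return next(iter(book["CHAPTER"]))


list_case = first_item(many_chapters)  # a dict describing chapter 1
dict_case = first_item(one_chapter)    # the key "@cnumber", a plain string

# Indexing that string with another string raises the TypeError from the post.
try:
    dict_case["@cnumber"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Here list_case is a dict, while dict_case is just a key name, which is why the inner lookup fails only for single-chapter books.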

I solved it by converting the OrderedDict to a list when there is only one chapter.

...

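            # A single chapter is parsed as a dict, not a list of dicts.
            # (A len(...) == 2 check would also match a book with two chapters.)
            if isinstance(xml_inBook["CHAPTER"], dict):
                xml_chapter = list(xml_inBook["CHAPTER"].items())
                cnumber = xml_chapter[0][1]  # value of "@cnumber"
                for xml_verse in xml_chapter[1][1]:  # value of "VERS"
                    vnumber = xml_verse["@vnumber"]
                    vtext = xml_verse["#text"]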
...
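A more general variant of the same idea is to normalize every level before looping, so that single and repeated children take the same code path. This is only a sketch: the names as_list and iter_verses are hypothetical, and the sample book is invented to match the structure shown above.

```python
def as_list(value):
    """Wrap a lone element in a list so callers can always iterate over elements."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def iter_verses(book):
    """Yield (cnumber, vnumber, text) for a parsed book, whatever the child counts."""
    for chapter in as_list(book.get("CHAPTER")):
        for verse in as_list(chapter.get("VERS")):
            yield chapter["@cnumber"], verse["@vnumber"], verse["#text"]


# Invented sample shaped like a one-chapter book.
obadiah = {
    "@bname": "Obadiah",
    "CHAPTER": {
        "@cnumber": "1",
        "VERS": [
            {"@vnumber": "1", "#text": "first verse"},
            {"@vnumber": "2", "#text": "second verse"},
        ],
    },
}
verses = list(iter_verses(obadiah))
```

The same helper would cover rev in the original question if a page has only one revision. xmltodict.parse also accepts a force_list argument (for example force_list=('CHAPTER', 'VERS')) that makes the parser return lists for the named elements, which removes the special case at the source.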

I am using Python 3.6.
