
Python: Object of type 'NoneType' has no len()

I'm running into a bit of an issue and I'm hoping the Stack Overflow crew will be able to help.

I keep getting the error message Object of type 'NoneType' has no len() whenever I attempt to separate a class of documents.

The full traceback is:

C:\>C:\Python27\python.exe C:\Testing\test.py "C:\Testing\IN" "C:\Testing\Outputs" "C:\Testing\Test.csv"

C:\Testing\IN\000001.000001.xml

Traceback (most recent call last):
  File "C:\Testing\test.py", line 46, in <module>
    if len(documentclass)==0:
TypeError: object of type 'NoneType' has no len()

C:\>

Here's the code:

import csv, sys, os
import shutil
import xml.etree.ElementTree as ET


if __name__ == '__main__':
    if not (len(sys.argv) == 4):
        print 'USAGE: %s inFolder OutFolder csvFile' % (sys.argv[0])
    else:        
        inFolder    = sys.argv[1]
        outFolder   = sys.argv[2]
        className   = sys.argv[3]

        count = 0        
        for fileName in os.listdir(inFolder):
            if fileName.endswith(".pdf"):                
                baseName = fileName.split('.pdf')[0]
                pdfFile = inFolder+"\\"+baseName+".pdf"
                xmlFile = inFolder+"\\"+baseName+".xml"
                validatedXmlFile = inFolder+"\\"+baseName+".xml.validated.xml"                                
                xmlSize = os.path.getsize(xmlFile)
                pdfSize = os.path.getsize(pdfFile)                
                if xmlSize>0 and pdfSize>0:
                    print
                    print xmlFile
                    count = count + 1
                    tree = ET.parse(xmlFile)
                    root_xml = tree.getroot()
                    form_xml = root_xml[0]
                    #form_xml = root_xml[1]
                    documentclass_xml = form_xml.find('DocumentClassGlobal')                    
                    documentclassLocal_xml = form_xml.find('DocumentClassLocal')                
                    #documentclass_xml = form_xml.find('SSMClassID')
                    if documentclass_xml is not None:
                        documentclass = documentclass_xml.find('data').text                       
                    elif documentclassLocal_xml is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
                    else:
                        documentclass = ""                        
                    if len(documentclass)==0:
                        documentclass = "UNKNOWN"                
                    print documentclass                 

                    if documentclass == className:                                    
                        if not os.path.exists(outFolder + "\\" + documentclass):
                            os.makedirs(outFolder + "\\" + documentclass)
                        inBaseFile = inFolder + "\\"+baseName
                        outBaseFile = outFolder + "\\" + documentclass+"\\"+baseName                                
                        inFile = inBaseFile+".pdf"
                        outFile = outBaseFile+".pdf"
                        print inFile
                        print outFile
                        shutil.copy(inBaseFile+".pdf", outBaseFile+".pdf")
                        shutil.copy(inBaseFile+".pdf.conf.xml", outBaseFile+".pdf.conf.xml")
                        shutil.copy(inBaseFile+".pdf.multi.txt", outBaseFile+".pdf.multi.txt")
                        shutil.copy(inBaseFile+".pdf.txt", outBaseFile+".pdf.txt")
                        #shutil.move(inBaseFile+".wdb", outBaseFile+".wdb")
                        shutil.copy(inBaseFile+".xml", outBaseFile+".xml")
                        if os.path.exists(inBaseFile+".xml.validated.xml"):
                            shutil.copy(inBaseFile+".xml.validated.xml", outBaseFile+".xml.validated.xml")
                        if os.path.exists(inBaseFile+".xml.validationinfo.xml"):
                            shutil.copy(inBaseFile+".xml.validationinfo.xml", outBaseFile+".xml.validationinfo.xml")


        print '%d files found and copied.' % (count)

Obviously, documentclass is None by the time if len(documentclass)==0 is evaluated. The idea is to set documentclass to "UNKNOWN" both when it is None and when it is an empty string.

Thus far, I have come up with the below, but with no success. Any ideas?

Many Thanks

                    if documentclass_xml is not None:
                        documentclass = documentclass_xml.find('data').text
                    elif documentclassLocal_xml is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
                    if documentclass_xml is None:
                        documentclass = "UNKNOWN"
                    elif documentclassLocal_xml is None:
                        documentclass = "UNKNOWN"
                    else:
                        documentclass = ""
                    if len(documentclass)==0:
                        documentclass = "UNKNOWN"
                    print documentclass

The traceback tells you that documentclass has the value None. You initialized it with:

                if documentclass_xml is not None:
                    documentclass = documentclass_xml.find('data').text                       
                elif documentclassLocal_xml is not None:
                    documentclass = documentclassLocal_xml.find('data').text
                    documentclass = documentclass + "_Local"
                else:
                    documentclass = ""   

so at least one branch of the if must assign None to it. Accessing the text attribute of an ElementTree node returns None if the node has no text content. It has to be the first branch of the condition: in the second branch, the attempt to append "_Local" to None would have raised its own TypeError before len() was ever reached.

>>> data = ET.fromstring('<test/>')
>>> data.text is None
True

Therefore you are accessing an empty <data/> node.
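
To make this concrete, here is a minimal sketch that reproduces the situation and shows one way to guard against it. The sample XML and the helper name text_or_empty are made up for illustration; only the tag names DocumentClassGlobal and data come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: DocumentClassGlobal exists, but its <data/> child is empty.
form_xml = ET.fromstring(
    '<form><DocumentClassGlobal><data/></DocumentClassGlobal></form>'
)


def text_or_empty(node):
    """Return node.text, or '' when the node is missing or has no text."""
    if node is None or node.text is None:
        return ""
    return node.text


data_node = form_xml.find('DocumentClassGlobal').find('data')
assert data_node.text is None        # this None is what len() choked on

documentclass = text_or_empty(data_node)
if len(documentclass) == 0:          # safe now: documentclass is always a string
    documentclass = "UNKNOWN"
```

Normalizing None to an empty string right where the text is read keeps the existing len(documentclass)==0 check working unchanged.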

It turns out some changes were made to DocumentClassGlobal and DocumentClassLocal. Before, these elements were created only if there was a value in documentclass_xml.find('data').text. Now this rule is ignored, and documentclass_xml.find('data').text can be None. I made the adjustment below and it did the trick. Thanks mgilson for pointing it out.

                    if documentclass_xml is not None and documentclass_xml.find('data').text is not None:
                        documentclass = documentclass_xml.find('data').text
                    elif documentclassLocal_xml is not None and documentclassLocal_xml.find('data').text is not None:
                        documentclass = documentclassLocal_xml.find('data').text
                        documentclass = documentclass + "_Local"
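
The same fallback logic could also be written with Element.findtext, which returns the supplied default when the path is missing and an empty string when the element exists but has no text. The following is only a sketch under that approach, not the code the script actually uses; the function name get_document_class and the sample XML in the usage lines are hypothetical.

```python
import xml.etree.ElementTree as ET


def get_document_class(form_xml):
    """Hypothetical helper: global class, else local class + '_Local', else 'UNKNOWN'."""
    # findtext never returns None here: a missing path yields the default (''),
    # and an existing but empty <data/> element yields '' as well.
    global_class = form_xml.findtext('DocumentClassGlobal/data', default='')
    if global_class:
        return global_class
    local_class = form_xml.findtext('DocumentClassLocal/data', default='')
    if local_class:
        return local_class + "_Local"
    return "UNKNOWN"


# Usage with made-up documents:
empty_global = ET.fromstring(
    '<form><DocumentClassGlobal><data/></DocumentClassGlobal>'
    '<DocumentClassLocal><data>Memo</data></DocumentClassLocal></form>'
)
result = get_document_class(empty_global)   # falls through to the local class
```

Like the adjusted if/elif above, this falls through to DocumentClassLocal when the global element is present but empty, and it also covers the case where a data child is missing altogether.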
