
Python SUDS unicode decode error returned from Webservice

I am attempting to use a Webservice created by one of our developers that allows us to upload files into the system, within certain restrictions.

Using SUDS, I get the following information:

Suds ( https://fedorahosted.org/suds/ ) version: 0.4 GA  build: R699-20100913

Service ( ConnectToEFS ) tns="http://tempuri.org/"
   Prefixes (3)
      ns0 = "http://schemas.microsoft.com/2003/10/Serialization/"
      ns1 = "http://schemas.microsoft.com/Message"
      ns2 = "http://tempuri.org/"
   Ports (1):
      (BasicHttpBinding_IConnectToEFS)
         Methods (2):
            CreateContentFolder(xs:string FileCode, xs:string FolderName, xs:string ContentType, xs:string MetaDataXML, )
            UploadFile(ns1:StreamBody FileByteStream, )
         Types (4):
            ns1:StreamBody
            ns0:char
            ns0:duration
            ns0:guid

My method for calling UploadFile is as follows:

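from suds.client import Client
from suds.transport.https import WindowsHttpAuthenticated

# uname, upass, and webservice_url are assumed to be defined elsewhere in the test module.
def webserviceUploadFile(self, targetLocation, fileName, fileSource):
    fileSource = './test_files/' + fileSource
    # Authenticate against the service with NTLM.
    ntlm = WindowsHttpAuthenticated(username=uname, password=upass)
    client = Client(webservice_url, transport=ntlm)
    client.set_options(soapheaders={'TargetLocation': targetLocation, 'FileName': fileName})
    body = client.factory.create('AIRDocument')
    # Read the file as raw bytes; the handle is closed when the block exits.
    with open(fileSource, 'rb') as body_file:
        body_data = body_file.read()
    body.FileByteStream = body_data
    return client.service.UploadFile(body)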

Running this gets me the following result:

Traceback (most recent call last):
  File "test_cases.py", line 639, in test_upload_file_invalid_extension
    result_string = self.HM.webserviceUploadFile('9999', 'AD-1234-5424__44.exe', 'test_data.pdf')
  File "test_cases.py", line 81, in webserviceUploadFile
    return client.service.UploadFile(body)
  File "build\bdist.win32\egg\suds\client.py", line 542, in __call__
    return client.invoke(args, kwargs)
  File "build\bdist.win32\egg\suds\client.py", line 595, in invoke
    soapenv = binding.get_message(self.method, args, kwargs)
  File "build\bdist.win32\egg\suds\bindings\binding.py", line 120, in get_message
    content = self.bodycontent(method, args, kwargs)
  File "build\bdist.win32\egg\suds\bindings\document.py", line 63, in bodycontent
    p = self.mkparam(method, pd, value)
  File "build\bdist.win32\egg\suds\bindings\document.py", line 105, in mkparam
    return Binding.mkparam(self, method, pdef, object)
  File "build\bdist.win32\egg\suds\bindings\binding.py", line 287, in mkparam
    return marshaller.process(content)
  File "build\bdist.win32\egg\suds\mx\core.py", line 62, in process
    self.append(document, content)
  File "build\bdist.win32\egg\suds\mx\core.py", line 75, in append
    self.appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 102, in append
    appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 243, in append
    Appender.append(self, child, cont)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 182, in append
    self.marshaller.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\core.py", line 75, in append
    self.appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 102, in append
    appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 198, in append
    child.setText(tostr(content.value))
  File "build\bdist.win32\egg\suds\sax\element.py", line 251, in setText
    self.text = Text(value)
  File "build\bdist.win32\egg\suds\sax\text.py", line 43, in __new__
    result = super(Text, cls).__new__(cls, *args, **kwargs)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 10: ordinal not in range(128)

After much research and talking with the developer of the webservice, I changed body_data = body_file.read() to body_data = body_file.read().decode("UTF-8"), which produces this error:

Traceback (most recent call last):
  File "test_cases.py", line 639, in test_upload_file_invalid_extension
    result_string = self.HM.webserviceUploadFile('9999', 'AD-1234-5424__44.exe', 'test_data.pdf')
  File "test_cases.py", line 79, in webserviceUploadFile
    body_data = body_file.read().decode("utf-8")
  File "C:\python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe2 in position 10: invalid continuation byte

Which is less than helpful.
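
Both tracebacks fail on the same byte (0xe2 at position 10), which suggests the file content is binary data rather than text in any encoding. A PDF commonly starts with a version line followed by a comment of high-bit bytes, and in that layout 0xe2 lands at offset 10. A minimal sketch of that failure, written for Python 3 and assuming a typical PDF header (the sample bytes and the try_decode helper are illustrations, not the contents of test_data.pdf):

```python
# Hypothetical sample: a typical PDF header, not the actual test file.
pdf_header = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"

def try_decode(data, encoding):
    """Return the decoded text, or a short description of the decode failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "%s failed at position %d: %s" % (encoding, exc.start, exc.reason)

ascii_result = try_decode(pdf_header, "ascii")
utf8_result = try_decode(pdf_header, "utf-8")
print(ascii_result)
print(utf8_result)
```

In UTF-8, 0xe2 opens a three-byte sequence, so the decoder rejects it when the next byte is not a continuation byte. No choice of text codec turns arbitrary binary into the original bytes on the wire.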

After more research into the problem, I tried adding errors='ignore' to the UTF-8 decode call, and the webservice returned this:

<TransactionDescription>Error in INTL-CONF_France_PROJ_MA_126807.docx: An exception has been thrown when reading the stream.. Inner Exception: System.Xml.XmlException: The byte 0x03 is not valid at this location.  Line 1, position 318.
    at System.Xml.XmlExceptionHelper.ThrowXmlException(XmlDictionaryReader reader, String res, String arg1, String arg2, String arg3)
    at System.Xml.XmlUTF8TextReader.Read()
    at System.ServiceModel.Dispatcher.StreamFormatter.MessageBodyStream.Exhaust(XmlDictionaryReader reader)
    at System.ServiceModel.Dispatcher.StreamFormatter.MessageBodyStream.Read(Byte[] buffer, Int32 offset, Int32 count). Source: System.ServiceModel</TransactionDescription>
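
The server-side error is consistent with what errors='ignore' does to binary input: undecodable bytes are silently dropped, while control bytes such as 0x03 decode without complaint and end up in the XML, where XML 1.0 does not allow them. A .docx is a ZIP archive, and the ZIP signature begins with PK\x03\x04, so a 0x03 byte shows up almost immediately. A small sketch for Python 3, using a made-up payload rather than the real file:

```python
# Hypothetical binary payload: ZIP signature, two high-bit bytes, then ASCII.
payload = b"PK\x03\x04\xe2\xe3data"

# errors="ignore" drops every byte that is not valid UTF-8.
text = payload.decode("utf-8", errors="ignore")
round_trip = text.encode("utf-8")

dropped = len(payload) - len(round_trip)
has_control_byte = "\x03" in text
print(dropped, has_control_byte)
```

The decoded text is shorter than the file and still contains a character the XML reader refuses, so the upload is both corrupted and rejected.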

This pretty much stumps me. Judging by the stack trace returned by the webservice, it wants UTF-8, but I can't get the data to the webservice without Python or SUDS throwing a fit unless I ignore problems in the encoding. The system I'm working on only accepts Microsoft Office files (doc, xls, and the like), PDFs, and TXT files, so switching to a format whose encoding I have more control over is not an option. I also tried detecting the encoding of the sample PDF and the sample DOCX; every encoding it suggested (Latin-1, ISO8859-x, and several Windows XXXX) was accepted by Python and SUDS, but not by the webservice.
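
One likely reason the single-byte encodings get past Python and SUDS but not the webservice: Latin-1 maps every byte to some character, so decoding never fails, but the bytes that reach the server are the re-encoded characters, not the original file. A sketch for Python 3, assuming the SOAP envelope is serialized as UTF-8 and using a made-up payload:

```python
# Hypothetical binary payload, not the real sample file.
payload = b"PK\x03\x04\xe2\xe3data"

# Latin-1 accepts any byte, so this never raises.
text = payload.decode("latin-1")
lossless = text.encode("latin-1") == payload

# Assumption: the text is written into the request as UTF-8.
wire = text.encode("utf-8")
print(lossless, len(payload), len(wire), b"\x03" in wire)
```

Each high-bit byte becomes a two-byte UTF-8 sequence, so the server sees a different byte stream, and the 0x03 control byte is still present for the XML reader to reject.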

Also note that the example shown mostly references a test for an invalid extension. The same error occurs in what should be a test of a successful upload, which is really the only time the final stack trace shows up.

You can use base64.b64encode(body_file.read()), which returns the file contents as a Base64 string. The value you put in the request must be a string, not raw bytes.
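
A minimal sketch of that suggestion, written for Python 3. The helper name encode_file_for_soap is hypothetical, and whether the service decodes Base64 on its side is an assumption that should be checked against the WSDL or with the service's developer:

```python
import base64
import os
import tempfile

def encode_file_for_soap(path):
    """Read a file as bytes and return its contents as an ASCII Base64 string."""
    with open(path, "rb") as handle:
        raw = handle.read()
    # b64encode returns bytes on Python 3; decode to get a plain ASCII string.
    return base64.b64encode(raw).decode("ascii")

# Demonstration with a throwaway file holding made-up binary content.
sample = b"PK\x03\x04\xe2\xe3data"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample)
encoded = encode_file_for_soap(tmp.name)
os.remove(tmp.name)
print(encoded)
```

Base64 output contains only ASCII letters, digits, "+", "/", and "=", so it passes through the ASCII and XML layers untouched and decodes back to the exact original bytes. In the question's method, the equivalent change would be assigning the encoded string to body.FileByteStream instead of the raw file contents.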


 