
Error with UTF-16 encoded data in Python

Here is a piece of code in which a string is supposed to be encoded as UTF-16 and sent over the network:

# -*- coding: utf8-*-

import unit_test_utils
import os
import sys

...
...
def run():
    test_dir = unit_test_utils.get_test_dir("test")

    try:
        file_name = u'débárquér.txt'
        open_req = createrequest.CreateRequest(factory)
        open_req.create_disp_ = defines.FILE_OPEN_IF
        open_req.file_name_ = '%s\\%s' % (test_dir, file_name)
        res = unit_test_utils.test_send(client, open_req)
        ....
        ....
    finally:
        client.close()

if __name__ == '__main__':
    run()

When I run it, I get the following error:

# python /root/python/tests/unicode_test.py
Traceback (most recent call last):
  File "/root/python/tests/unicode_test.py", line 47, in <module>
    run()
  File "/root/python/tests/unicode_test.py", line 29, in run
    res = unit_test_utils.test_send(client, open_req)
  File "/root/python/unit_test_utils.py", line 336, in test_send
    handle_class=handle_class)
  File "/root/python/unit_test_utils.py", line 321, in test_async_send
    test_handle_class(handle_class, expected_status))
  File "/root/usr/lib/python2.7/site-packages/client.py", line 220, in async_send
    return self._async_send(msg, function, handle_class, pdu_splits)
  File "/root/usr/lib/python2.7/site-packages/client.py", line 239, in _async_send
    data, handle = self._handle_request(msg, function, handle_class)
  File "/root/usr/lib/python2.7/site-packages/client.py", line 461, in _handle_request
    return handler(self, msg, *args, **kwargs)
  File "/root/usr/lib/python2.7/site-packages/client.py", line 473, in _common_request
    msg.encode(buf, smb_ver=2)
  File "/root/usr/lib/python2.7/site-packages/message.py", line 17, in encode
    new_offset = composite.Composite.encode(self, buf, offset, **kwargs)
  File "/root/usr/lib/python2.7/site-packages/pycifs/composite.py", line 36, in encode
    new_offset = self._encode(buf, offset, **kwargs)
  File "/root/usr/lib/python2.7/site-packages/packets/createrequest.py", line 128, in _encode
    offset = self._file_name.encode(self._file_name_value(**kwargs), buf, offset, **kwargs)
  File "/root/usr/lib/python2.7/site-packages/fields/unicode.py", line 76, in encode
    buf.append(_UTF16_ENC(value)[0])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 8: ordinal not in range(128)

What is wrong with the code?

When I try the same thing locally in the interpreter, it seems to work fine:

$ python
Python 2.6.6 (r266:84292, Jul 22 2015, 16:47:47)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> file_name = 'débárquér.txt'
>>> print type(file_name)
<type 'str'>
>>> utf16_filename = file_name.decode('utf8').encode('UTF-16LE')
>>> print type(utf16_filename)
<type 'str'>
>>> utf16_filename.decode('UTF-16LE')
u'd\xe9b\xe1rqu\xe9r.txt'

Don't assign text to byte strings. In Python 2, that means you must use a unicode literal:

file_name = u'débárquér.txt'  # <-- unicode literal
utf16_filename = file_name.encode('UTF-16LE')
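A minimal sketch of the difference between the text and its encoded forms. It is written so that it also runs on Python 3, where `str` is always text and `bytes` is the byte-string type; the variable names `utf8_bytes` and `utf16_bytes` are only illustrative and do not come from the question:

```python
# -*- coding: utf-8 -*-
file_name = u'débárquér.txt'                  # text: 13 characters
utf8_bytes = file_name.encode('utf-8')        # bytes: each accented letter takes 2 bytes
utf16_bytes = file_name.encode('utf-16-le')   # bytes: every character here takes 2 bytes

# The character count and the two byte counts all differ.
print(len(file_name), len(utf8_bytes), len(utf16_bytes))
# Decoding with the same codec gives the original text back.
print(utf16_bytes.decode('utf-16-le') == file_name)
```

The point is that only the text object has an unambiguous meaning; each byte string is only meaningful together with the name of the codec that produced it.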

Then make sure you declare the encoding of your source file accurately.

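When working with Unicode text, convert incoming byte strings to Unicode as early as possible, work with Unicode text inside the script, and convert back to byte strings as late as possible.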
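As a rough illustration of that pattern, here is a small sketch. The function `rename_extension` and its parameters are made up for this example and are not part of the code in the question. Bytes are decoded at the input boundary, all processing happens on text, and bytes are produced again only at the output boundary:

```python
# -*- coding: utf-8 -*-

def rename_extension(raw_name, new_ext, in_encoding='utf-8', out_encoding='utf-16-le'):
    """Swap the extension of a file name that arrives and leaves as bytes."""
    text = raw_name.decode(in_encoding)        # bytes -> text, as early as possible
    stem = text.rsplit(u'.', 1)[0]             # pure text processing, no codecs involved
    renamed = stem + u'.' + new_ext
    return renamed.encode(out_encoding)        # text -> bytes, as late as possible


incoming = u'débárquér.txt'.encode('utf-8')    # stands in for bytes read from disk or a socket
outgoing = rename_extension(incoming, u'bak')
print(repr(outgoing.decode('utf-16-le')))
```

Because the middle step never sees byte strings, it cannot accidentally combine data in two different encodings.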

You are mixing byte strings in different encodings. The likely cause of the trouble is this line:

open_req.file_name_ = '%s\\%s' % (test_dir, utf16_filename)

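It isn't clear how test_dir is encoded, but the format string is an ASCII byte string and utf16_filename is a UTF-16LE encoded byte string. The result is a single byte string containing a mix of encodings.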
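The effect can be reproduced in a small sketch. The value of `test_dir` below is purely hypothetical (the real one is not shown in the question), and `first_non_ascii` is a helper invented for this example. In Python 2, handing a byte string to a Unicode encoder first decodes it implicitly with the ASCII codec, so the helper performs that ASCII decode explicitly to show where it would fail:

```python
# -*- coding: utf-8 -*-
file_name = u'débárquér.txt'
utf16_filename = file_name.encode('utf-16-le')

test_dir = b'\\test'  # hypothetical ASCII directory name; the real value is unknown
mixed = test_dir + b'\\' + utf16_filename   # ASCII bytes glued to UTF-16LE bytes


def first_non_ascii(data):
    """Return (index, byte value) of the first byte the ASCII codec rejects, or None."""
    try:
        data.decode('ascii')
    except UnicodeDecodeError as exc:
        return exc.start, bytearray(data)[exc.start]
    return None


print(first_non_ascii(mixed))
# Decoding the whole thing as UTF-16LE does not raise here, but the
# directory part comes out as unrelated characters.
print(repr(mixed.decode('utf-16-le')))
```

With this particular made-up directory name, the first rejected byte is 0xe9, the low byte of the UTF-16LE encoding of "é". That has the same shape as the error in the traceback, though the actual inputs may of course have been different.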

Instead, find out what test_dir is, decode it to Unicode (if it isn't already), and then use Unicode strings everywhere. Here is an example:

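test_dir = unit_test_utils.get_test_dir("test")
# If test_dir is not already Unicode, decode it. You need to know its
# encoding; `encoding` below is a placeholder for that codec name.
if isinstance(test_dir, str):
    test_dir = test_dir.decode(encoding)
file_name = u'débárquér.txt'  # Unicode string!
open_req = createrequest.CreateRequest(factory)
open_req.create_disp_ = defines.FILE_OPEN_IF
# This would work...
# fullname = u'%s\\%s' % (test_dir, file_name)
# But a better way to join is this. Note that os.path.join uses the local
# separator ('/' on Linux); ntpath.join always joins with a backslash.
fullname = os.path.join(test_dir, file_name)
# I assume UTF-16LE is required for "file_name_" at this point.
# If the library encodes the field itself (the traceback shows
# fields/unicode.py calling _UTF16_ENC on the value), assign the
# Unicode `fullname` directly instead.
open_req.file_name_ = fullname.encode('utf-16le')
res = unit_test_utils.test_send(client, open_req)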
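Since the snippet above depends on modules that are not shown, here is a self-contained sketch of just the path-building step. The function name `encode_remote_path`, its `dir_encoding` parameter, and the sample directory are assumptions made for illustration. It uses `ntpath.join` from the standard library, which joins with a backslash regardless of the host operating system:

```python
# -*- coding: utf-8 -*-
import ntpath


def encode_remote_path(test_dir, file_name, dir_encoding='utf-8'):
    """Join a directory and a file name as text, then encode once as UTF-16LE."""
    if isinstance(test_dir, bytes):
        # Byte string input: decode it first (the codec is an assumption).
        test_dir = test_dir.decode(dir_encoding)
    full_name = ntpath.join(test_dir, file_name)   # backslash-separated, still text
    return full_name.encode('utf-16-le')           # single encode at the very end


wire_bytes = encode_remote_path(b'\\test', u'débárquér.txt')
print(repr(wire_bytes.decode('utf-16-le')))
```

Whether the result should be assigned as bytes or left as text depends on what the request object expects, which the question does not make clear.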

Try replacing:

utf16_filename = file_name.decode('utf8').encode('UTF-16LE')

with:

utf16_filename = unicode(file_name.decode('utf8')).encode('UTF-16LE')

