
Python: UnicodeEncodeError 'ascii' codec

I would just like the Python code to work, but I don't understand these conversion errors (I always get some type of 'ascii' encoding or decoding error). I went crazy and did a decode and encode on every part of the line, and it's still giving me trouble. It's available via Git at https://github.com/TBOpen/papercut if you would be so kind as to correct it. (I also solved a similar error, not checked in, on line 885 using self.wfile.write(message.decode('cp1250', 'replace').encode('ascii', 'replace') + "\r\n").)

However, here's the traceback for the one I can't solve (where I gave up).

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/SocketServer.py", line 535, in process_request
    self.finish_request(request, client_address)
  File "/usr/local/lib/python2.6/SocketServer.py", line 320, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python2.6/SocketServer.py", line 615, in __init__
    self.handle()
  File "./papercut.py", line 221, in handle
    getattr(self, "do_%s" % (command))()
  File "./papercut.py", line 410, in do_ARTICLE
    self.send_response("%s\r\n%s\r\n\r\n%s\r\n.".decode('cp1250', 'replace').encode('ascii', 'replace') % (response.decode('cp1250', 'replace').encode('ascii', 'replace'), result[0].decode('cp1250', 'replace').encode('ascii', 'replace'), result[1].decode('cp1250', 'replace').encode('ascii', 'replace')))
  File "/usr/local/lib/python2.6/encodings/cp1250.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2122' in position 20: ordinal not in range(128)

TIA!!

The root problem is that one of response, result[0], or result[1] is actually a unicode string, not an encoded str string.

So, when you call (picking one arbitrarily) response.decode('cp1250', 'replace'), you're asking to decode something that's already decoded to Unicode. What Python 2.x does with this is to first encode it to your default encoding (ASCII) so that it can decode it as you requested. And that's why you're getting a UnicodeEncodeError from trying to call decode.*
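To make the hidden step visible, here is a minimal sketch that spells out by hand what Python 2 does implicitly. The function name py2_style_decode and the sample string are made up for illustration; the snippet should run on both Python 2 and Python 3. Note that the 'replace' argument only reaches the decode step, which is why it doesn't save you from the encode error.

```python
def py2_style_decode(text, encoding, errors="strict"):
    """Mimic Python 2's unicode.decode(): encode to ASCII first, then decode.

    `text` is assumed to be an already-decoded Unicode string.
    """
    # The implicit step: strict ASCII encode, which fails on any non-ASCII character.
    raw = text.encode("ascii")
    # Only now does the codec you asked for (and its error handler) get used.
    return raw.decode(encoding, errors)


# Hypothetical sample value containing the same character as the traceback (U+2122).
already_decoded = u"Papercut\u2122"

try:
    py2_style_decode(already_decoded, "cp1250", "replace")
except UnicodeEncodeError as exc:
    error_message = str(exc)
    print(error_message)
```

A pure-ASCII unicode string passes through this round trip unharmed, which is why the bug only shows up once an article contains a character like the trademark sign.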

To fix this, you're going to have to figure out which one of the three is wrong, and why. That's not possible with a giant mess of a statement with 4 decode calls in it, but it's easy if you break it up into separate statements, or just add some print debugging to see what's in those variables right before they get used.
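The print debugging could look something like the sketch below. The describe helper and the stand-in values are hypothetical; in papercut you would pass the real response, result[0], and result[1] just before the send_response call. On Python 2 the type names printed are str and unicode; on Python 3 they are bytes and str.

```python
def describe(name, value):
    """Return a one-line summary of a value's type and repr for debugging."""
    return "%s: %s %r" % (name, type(value).__name__, value)


# Stand-in values for illustration only, not taken from papercut.
response = b"220 0 article"
result = (u"Subject: Papercut\u2122", b"body text")

for name, value in [("response", response),
                    ("result[0]", result[0]),
                    ("result[1]", result[1])]:
    print(describe(name, value))
```

Whichever variable reports a Unicode type is the one that must not be passed through decode again.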

However, it would make your life a whole lot easier to reorganize your code completely. Instead of converting everything back and forth all over the place, giving yourself dozens of places to make a simple mistake that ends up causing an un-debuggable error halfway across your program, just decode all of your input at input time, process everything as Unicode, then encode everything at output time.
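As a rough sketch of that shape, assuming cp1250 input and ASCII output as in the question: the helper names to_text, format_article, and to_wire are invented here and are not part of papercut, and the sample values are placeholders. The format string is the one from the traceback.

```python
def to_text(value, encoding="cp1250"):
    """Decode byte strings once, at input time; pass Unicode through untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


def format_article(response, head, body):
    """Build the reply entirely in Unicode; no encoding happens here."""
    return u"%s\r\n%s\r\n\r\n%s\r\n." % (
        to_text(response), to_text(head), to_text(body))


def to_wire(text, encoding="ascii"):
    """Encode exactly once, at output time."""
    return text.encode(encoding, "replace")


# Mixed byte and Unicode inputs, as seems to be the case in the question.
reply = format_article(b"220 0 article", u"Subject: Papercut\u2122", b"body")
wire = to_wire(reply)
print(repr(wire))
```

Because to_text leaves Unicode values alone, it no longer matters which of the three inputs arrives already decoded, and the only encode call is the one right before the data leaves the program.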

By the way, if you haven't read Python's Unicode HOWTO and the blog post The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), go read them before going any further.


* If you think this is a silly design for a language… well, that's the main reason Python 3 exists. In Python 3, you can't decode a Unicode string (str) or encode a bytes, so the error shows up as early as possible, and tells you exactly what's wrong, instead of making you try to hunt down where you called the wrong method on the wrong type and got an error that makes no sense. So if you want to use Python 2 instead of 3, you don't get to complain that Python 2's design is sillier than 3's.
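For comparison, a small Python 3 only sketch of that early failure (the sample string is made up): the methods simply don't exist on the wrong type, so the mistake is reported at the call site.

```python
# Python 3 only: str is the Unicode type, bytes is the encoded type.
text = "Papercut\u2122"
data = text.encode("utf-8")

text_has_decode = hasattr(text, "decode")  # str has no decode method
data_has_encode = hasattr(data, "encode")  # bytes has no encode method

try:
    text.decode("cp1250")
except AttributeError as exc:
    print("fails immediately:", exc)
```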

