
Redirecting Python's stdout to a file fails with UnicodeEncodeError

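I have a Python script that connects to the Twitter Firehose and sends data downstream for processing. It was working fine before, but now I'm trying to get only the text body. (This is not a question about how I should extract data from Twitter or how to encode/decode ASCII characters.) When I launch my script directly like this: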

python -u fetch_script.py

it works just fine and I can see the messages coming to the screen. For example:

root@domU-xx-xx-xx-xx:/usr/local/streaming# python -u fetch_script.py 
Cuz I'm checking you out >on Facebook<
RT @SearchlightNV: #BarryLies👳🎌 has crapped on all honest patriotic hard-working citizens in the USA but his abuse of WWII Vets is sick #2A…
"Why do men chase after women? Because they fear death."~Moonstruck
RT @SearchlightNV: #BarryLies👳🎌 has crapped on all honest patriotic hard-working citizens in the USA but his abuse of WWII Vets is sick #2A…
Never let anyone tell you not to chase your dreams. My sister came home crying today, because someone told her she's not good enough.
"I can't even ask anyone out on a date because if it doesn't end up in a high speed chase, I get bored."
RT @ColIegeStudent: Double-checking the attendance policy while still in bed
Well I just handed my life savings to ya.. #trustingyou #abouttomakebankkkkk
Zillow $Z and Redfin useless to Wells Fargo Home Mortgage, $WFC, and FannieMae $FNM. Sale history LTV now 48%, $360 appraisal fee 4 no PMI.
The latest Dump and Chase Podcast http://somedomain.com/viaRSA9W3i check it out and subscribe on iTunes, or your favorite android app #Isles


However, when I redirect the output to a file like this:

python -u fetch_script.py >fetch_output.txt

it immediately fails with an error:


root@domU-xx-xx-xx-xx:/usr/local/streaming# python -u fetch_script.py >fetch_output.txt
ERROR:tornado.application:Uncaught exception, closing connection.
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/iostream.py", line 341, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 331, in wrapped
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 302, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/local/streaming/twitter-stream.py", line 203, in parse_json
  File "/usr/local/streaming/twitter-stream.py", line 226, in parse_response
  File "fetch_script.py", line 57, in callback
    print msg['text']
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 139: ordinal not in range(128)
ERROR:tornado.application:Exception in callback <functools.partial object at 0x187c2b8>
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 458, in _run_callback
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 331, in wrapped
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 302, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tornado/iostream.py", line 341, in wrapper
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 331, in wrapped
  File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 302, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/local/streaming/twitter-stream.py", line 203, in parse_json
  File "/usr/local/streaming/twitter-stream.py", line 226, in parse_response
  File "fetch_script.py", line 57, in callback
    print msg['text']
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 139: ordinal not in range(128)




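Here is the relevant part of fetch_script.py:

def callback(self, message):
    if message:
        msg = message
        # AMQP message properties for the downstream queue
        msg_props = pika.BasicProperties()
        msg_props.content_type = 'application/text'
        msg_props.delivery_mode = 2  # 2 = persistent message
        #print self.count
        # msg['text'] is a unicode string; this is the line that raises
        print msg['text']
        #self.count += 1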

However, if I remove ['text'] and leave just print msg, both cases work like a charm.

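Since nobody has jumped in yet, here's my shot. Python sets the encoding of stdout when writing to a console, but not when writing to a file. This script reproduces the problem: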

import sys

# u'\u2026' is HORIZONTAL ELLIPSIS, the character from the question's traceback
msg = {'text': u'\u2026'}
# Report on stderr so the message stays visible while stdout is redirected
sys.stderr.write('default encoding: %s\n' % sys.stdout.encoding)
print msg['text']


$ python bad.py >/tmp/xxx
default encoding: None
Traceback (most recent call last):
  File "bad.py", line 7, in <module>
    print msg['text']
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 0: ordinal not in range(128)
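
The 'ascii' in that message is the fallback codec used when stdout reports no encoding. As a rough illustration (not part of the original script), the same failure can be triggered by encoding the string by hand. The helper name try_ascii is made up for this sketch, and the snippet is written so that it should run on both Python 2.7 and Python 3:

```python
# The character from the traceback: U+2026 HORIZONTAL ELLIPSIS
text = u'\u2026'


def try_ascii(value):
    """Return the error message from encoding value as ASCII, or None on success.

    Hypothetical helper for illustration only.
    """
    try:
        value.encode('ascii')
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


error = try_ascii(text)            # fails: U+2026 is outside the ASCII range
plain = try_ascii(u'plain ascii')  # succeeds: every character is below 128
print(error)
```

Printing a unicode string to a redirected stdout in Python 2 amounts to this implicit ASCII encode, which is why only tweets containing non-ASCII characters trip the error.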


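Adding the encoding fixes the problem:

import sys

# u'\u2026' is HORIZONTAL ELLIPSIS, the character from the question's traceback
msg = {'text': u'\u2026'}
sys.stderr.write('default encoding: %s\n' % sys.stdout.encoding)
# sys.stdout.encoding is None when stdout is redirected, so fall back to UTF-8
encoding = sys.stdout.encoding or 'utf-8'
print msg['text'].encode(encoding)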


$ python good.py >/tmp/xxx
default encoding: None
$ cat /tmp/xxx

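
If encoding at every print call is inconvenient, one alternative (an assumption on my part, not something the script above does) is to wrap the output stream once so that everything written to it is encoded as UTF-8. The sketch below uses the standard codecs.getwriter call. The name utf8_writer is hypothetical, and io.BytesIO stands in for a redirected stdout so the snippet can run on its own:

```python
import codecs
import io


def utf8_writer(byte_stream):
    """Wrap a byte-oriented stream so unicode text written to it is UTF-8 encoded.

    Hypothetical helper for illustration only.
    """
    return codecs.getwriter('utf-8')(byte_stream)


# io.BytesIO is a stand-in for a redirected stdout, which only accepts bytes.
fake_file = io.BytesIO()
out = utf8_writer(fake_file)

# The writer encodes before writing, so no implicit ASCII encode takes place.
out.write(u'\u2026\n')

written = fake_file.getvalue()
```

In a Python 2 script the same wrapper could presumably be installed once at startup with sys.stdout = utf8_writer(sys.stdout). Setting the PYTHONIOENCODING environment variable to utf-8 before launching the script is another option that avoids touching the code.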
