Antiword with UTF-8 in Python
This is my code:
from subprocess import Popen, PIPE
cmd = ['antiword', 'tbhocbong151.doc']
p = Popen(cmd, stdout=PIPE)
stdout, stderr = p.communicate()
print(stdout.decode('utf-8', 'ignore'))
My Word file contains text like this: "Chào bạn"
But the output I get is: "Ch?ob?n"
How can I make the output match the input? Thanks for your help.
I think the problem is that the locale is not set correctly when antiword
runs. Try this:
import os
from subprocess import Popen, PIPE
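# Copy the current environment so the parent process is left unchanged
myenv = dict(os.environ)
# LC_ALL overrides LANG, so remove it if it is set
if 'LC_ALL' in myenv:
    del myenv['LC_ALL']
myenv['LANG'] = 'en_US.UTF-8'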
cmd = ['antiword', 'tbhocbong151.doc']
p = Popen(cmd, stdout=PIPE, env=myenv)
stdout, stderr = p.communicate()
print(stdout.decode('utf-8', 'ignore'))
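The same idea can be wrapped in a small helper. This is only a sketch: the names utf8_env and doc_to_text are made up for illustration, and it also drops LC_CTYPE on the assumption that antiword follows the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG), which the answer above does not state.

```python
import os
from subprocess import Popen, PIPE


def utf8_env(base_env, lang="en_US.UTF-8"):
    """Return a copy of base_env in which LANG decides the locale."""
    env = dict(base_env)
    # Assumption: LC_ALL and LC_CTYPE take precedence over LANG,
    # so both are removed before LANG is set.
    for name in ("LC_ALL", "LC_CTYPE"):
        env.pop(name, None)
    env["LANG"] = lang
    return env


def doc_to_text(path):
    """Run antiword on path with a UTF-8 locale and return its output."""
    p = Popen(["antiword", path], stdout=PIPE, env=utf8_env(os.environ))
    stdout, _ = p.communicate()
    # 'replace' keeps undecodable bytes visible as U+FFFD instead of
    # silently dropping them the way 'ignore' does.
    return stdout.decode("utf-8", "replace")
```

Calling print(doc_to_text('tbhocbong151.doc')) would then be the equivalent of the snippet above. The locale name en_US.UTF-8 has to exist on the system for this to help.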
If that doesn't work, try setting the LANG environment variable in your
shell before running the Python program, for example:
export LANG=en_US.UTF-8
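To check whether the export actually reaches the Python process (and therefore antiword), something like the following may help. It is a rough sketch: the function name effective_ctype_locale is hypothetical, and it assumes the usual POSIX lookup order LC_ALL, LC_CTYPE, LANG, with "C" as the fallback when none is set.

```python
import os


def effective_ctype_locale(env):
    """Return (variable, value) for the first locale variable that is set."""
    # Assumed POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return name, value
    # Nothing set: most systems fall back to the "C" locale.
    return None, "C"


name, value = effective_ctype_locale(os.environ)
print("character locale comes from", name, "=", value)
```

If this reports LC_ALL or LC_CTYPE with a non-UTF-8 value, setting LANG alone will probably not be enough.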