
Python Error; UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026'

I am trying to extract some data from a JSON file that contains tweets and write it to a CSV file. The file contains all kinds of characters, and I'm guessing this is why I get this error message:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026'

I guess I have to convert the output to UTF-8 before writing the CSV file, but I have not been able to do that. I have found similar questions here on Stack Overflow, but I have not been able to adapt the solutions to my problem. (I should add that I am not really familiar with Python. I'm a social scientist, not a programmer.)

import csv
import json

fieldnames = ['id', 'text']

with open('MY_SOURCE_FILE', 'r') as f, open('MY_OUTPUT', 'a') as out:

    writer = csv.DictWriter(
                    out, fieldnames=fieldnames, delimiter=',', quoting=csv.QUOTE_ALL)

    for line in f:
        tweet = json.loads(line)
        user = tweet['user']
        output = {
            'text': tweet['text'],
            'id': tweet['id'],
        }
        writer.writerow(output)

You just need to encode the text as UTF-8 before passing it to the writer:

for line in f:
    tweet = json.loads(line)
    user = tweet['user']
    output = {
        'text': tweet['text'].encode("utf-8"),
        'id': tweet['id'],
    }
    writer.writerow(output)
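
If more text fields are added to `fieldnames` later, encoding each one by hand gets repetitive. The following is a minimal sketch of a helper that encodes every text value in a row before it is written. The name `encode_row` is hypothetical and not part of the `csv` module, and the approach is only meant for Python 2, where the writer expects byte strings.

```python
def encode_row(row, encoding="utf-8"):
    """Return a copy of `row` with every text value encoded to bytes.

    Hypothetical helper for Python 2, where csv writers expect byte strings.
    Non-text values (such as the integer tweet id) are passed through unchanged.
    """
    text_type = type(u"")  # `unicode` on Python 2
    encoded = {}
    for key, value in row.items():
        if isinstance(value, text_type):
            encoded[key] = value.encode(encoding)
        else:
            encoded[key] = value
    return encoded
```

Inside the loop, the call would then be `writer.writerow(encode_row(output))`, with `output` built from the unencoded `tweet['text']`.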

The csv module does not support writing unicode in Python 2. From the documentation:

Note: This version of the csv module doesn't support Unicode input. Also, there are currently some issues regarding ASCII NUL characters. Accordingly, all input should be UTF-8 or printable ASCII to be safe; see the examples in section Examples.
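
That limitation is specific to Python 2. If running the script under Python 3 is an option, the csv module accepts text directly and the encoding is chosen when the output file is opened, so no manual `.encode()` call is needed. A rough sketch under that assumption follows; the function name `tweets_to_csv` is made up for illustration, and the input is assumed to hold one JSON object per line, as in the question.

```python
import csv
import json

FIELDNAMES = ['id', 'text']


def tweets_to_csv(src_path, out_path):
    """Append the id and text of each tweet in src_path to the CSV at out_path."""
    # Python 3 only: the encoding is set on the file objects, so csv sees plain text.
    # newline='' lets the csv module control line endings itself.
    with open(src_path, 'r', encoding='utf-8') as f, \
            open(out_path, 'a', encoding='utf-8', newline='') as out:
        writer = csv.DictWriter(
            out, fieldnames=FIELDNAMES, delimiter=',', quoting=csv.QUOTE_ALL)
        for line in f:
            if not line.strip():
                continue  # skip blank lines between tweets
            tweet = json.loads(line)
            writer.writerow({'id': tweet['id'], 'text': tweet['text']})
```

It would be called as `tweets_to_csv('MY_SOURCE_FILE', 'MY_OUTPUT')`, using the placeholder names from the question.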
