
python csv unicode 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)

I copied this script from the [python web site][1]. This follows on from another question, but now the problem is with encoding:

import sqlite3
import csv
import codecs
import cStringIO
import sys

class UTF8Recoder:
    """
    Iterator that reads an encoded stream and reencodes the input to UTF-8
    """
    def __init__(self, f, encoding):
        self.reader = codecs.getreader(encoding)(f)

    def __iter__(self):
        return self

    def next(self):
        return self.reader.next().encode("utf-8")

class UnicodeReader:
    """
    A CSV reader which will iterate over lines in the CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        f = UTF8Recoder(f, encoding)
        self.reader = csv.reader(f, dialect=dialect, **kwds)

    def next(self):
        row = self.reader.next()
        return [unicode(s, "utf-8") for s in row]

    def __iter__(self):
        return self

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """

    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)

When I ran it, I got this error:

Traceback (most recent call last):
  File "makeCSV.py", line 87, in <module>
    uW.writerow(d)
  File "makeCSV.py", line 54, in writerow
    self.writer.writerow([s.encode("utf-8") for s in row])
AttributeError: 'int' object has no attribute 'encode'

Then I converted all integers to strings, but this time I got this error:

Traceback (most recent call last):
  File "makeCSV.py", line 87, in <module>
    uW.writerow(d)
  File "makeCSV.py", line 54, in writerow
    self.writer.writerow([str(s).encode("utf-8") for s in row])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)

I implemented the classes above to handle Unicode characters, but I still get this error. What is the problem, and how do I fix it?

"Then I converted all integers to string,"

You converted both integers and strings to byte strings. For Unicode strings, str() uses the default character encoding, which happens to be ASCII, and this fails when the string contains non-ASCII characters. You want unicode instead of str:

self.writer.writerow([unicode(s).encode("utf-8") for s in row])
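To see why str() fails here, it helps to spell out the implicit step. In Python 2, calling str() on a unicode object encodes it with the default codec (ASCII), which is the same as calling .encode("ascii") explicitly. The sketch below makes that step explicit so it runs on both Python 2 and Python 3; the sample value is hypothetical, chosen so that the offending character sits at position 1 as in the traceback.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample value: u'\xf6' (o with diaeresis) is at index 1.
text = u"k\xf6ln"

# Python 2's str(text) is equivalent to this explicit ASCII encode,
# which cannot represent any code point above 127.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Encoding explicitly as UTF-8 works, because UTF-8 covers all of Unicode.
utf8_bytes = text.encode("utf-8")

print(ascii_error)
print(repr(utf8_bytes))
```

The exception carries the same information as the traceback in the question: the codec name, the character, and its position.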

It might be better to convert everything to unicode before calling that method. The class is designed specifically for handling Unicode strings; it was not designed to support other data types.
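One way to do that conversion is a small helper applied to each row before it reaches writerow. The following is a minimal sketch, not part of the original script: to_text and normalize_row are hypothetical names, byte strings are assumed to be UTF-8 encoded, and mapping None to an empty string is an arbitrary choice. It is written to run on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
# unicode on Python 2, str on Python 3
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Return value as a text (unicode) string.

    Assumption: byte strings hold data encoded with `encoding`.
    """
    if value is None:
        return u""                      # arbitrary choice for missing values
    if isinstance(value, text_type):
        return value                    # already text, leave untouched
    if isinstance(value, bytes):
        return value.decode(encoding)   # byte string -> text
    return text_type(value)             # int, float, etc.


def normalize_row(row):
    """Convert every cell of a row to text."""
    return [to_text(cell) for cell in row]


# Hypothetical mixed row, similar to what a database query might return.
row = normalize_row([1, u"k\xf6ln", b"caf\xc3\xa9", 2.5, None])
```

With the rows normalized this way, the original writerow, which calls s.encode("utf-8") on every cell, should no longer meet an int or a byte string.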

From the cStringIO documentation:

Unlike the StringIO module, this module is not able to accept Unicode strings that cannot be encoded as plain ASCII strings.

That is, only 7-bit clean strings can be stored.

If you are using Python 2, encode each value as str(s.encode("utf-8")), i.e.:

def writerow(self, row):
    self.writer.writerow([str(s.encode("utf-8")) for s in row])
    # Fetch UTF-8 output from the queue ...
    data = self.queue.getvalue()
    data = data.decode("utf-8")
    # ... and reencode it into the target encoding
    data = self.encoder.encode(data)
    # write to the target stream
    self.stream.write(data)
    # empty queue
    self.queue.truncate(0)
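For comparison, on Python 3 the csv module works with text strings directly, so the recoding wrapper is not needed there. This is a rough sketch under that assumption (Python 3 only); the sample rows are hypothetical, and an in-memory io.StringIO stands in for a file opened with encoding="utf-8" and newline="".

```python
import csv
import io

# Hypothetical sample data mixing integers and non-ASCII text.
rows = [[1, "k\xf6ln"], [2, "z\xfcrich"]]

# csv.writer accepts text and converts non-string values with str().
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows(rows)

csv_text = buffer.getvalue()

# Encoding happens once, at the boundary where text becomes bytes.
csv_bytes = csv_text.encode("utf-8")
```

When writing to a real file, the encoding step is handled by open() itself, so csv_bytes is only here to show what ends up on disk.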


