UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 134: ordinal not in range(128)
python csv unicode 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
I copied this script from the [python web site][1]. This follows on from another question, but now I have an encoding problem:
import sqlite3
import csv
import codecs
import cStringIO
import sys

class UTF8Recoder:
    """
    Iterator that reads an encoded stream and reencodes the input to UTF-8
    """
    def __init__(self, f, encoding):
        self.reader = codecs.getreader(encoding)(f)

    def __iter__(self):
        return self

    def next(self):
        return self.reader.next().encode("utf-8")

class UnicodeReader:
    """
    A CSV reader which will iterate over lines in the CSV file "f",
    which is encoded in the given encoding.
    """
    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        f = UTF8Recoder(f, encoding)
        self.reader = csv.reader(f, dialect=dialect, **kwds)

    def next(self):
        row = self.reader.next()
        return [unicode(s, "utf-8") for s in row]

    def __iter__(self):
        return self

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """
    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
This time the problem is with encoding. When I run it, I get this error:
Traceback (most recent call last):
  File "makeCSV.py", line 87, in <module>
    uW.writerow(d)
  File "makeCSV.py", line 54, in writerow
    self.writer.writerow([s.encode("utf-8") for s in row])
AttributeError: 'int' object has no attribute 'encode'
Then I converted all the integers to strings, but this time I got this error:
Traceback (most recent call last):
  File "makeCSV.py", line 87, in <module>
    uW.writerow(d)
  File "makeCSV.py", line 54, in writerow
    self.writer.writerow([str(s).encode("utf-8") for s in row])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 1: ordinal not in range(128)
I implemented the code above to handle Unicode characters, yet it still gives me this error. What is the problem, and how do I fix it?
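"Then I converted all the integers to strings,"
You converted both the integers and the strings to byte strings. For the strings, str() uses the default character encoding, which happens to be ASCII, so it fails as soon as a string contains a non-ASCII character. You want unicode instead of str.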
self.writer.writerow([unicode(s).encode("utf-8") for s in row])
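To make the failure mode concrete, here is a minimal sketch of what str() does implicitly to a Unicode string in Python 2: it encodes with the ASCII codec. The variable names are only illustrative, and the snippet is written so that it should also run on Python 3, where the explicit .encode("ascii") call raises the same kind of error.

```python
# 'ö' is U+00F6, which lies outside the 7-bit ASCII range.
text = u"\xf6"

# Encoding with the ASCII codec fails; this is the step that
# str(u'\xf6') performs behind the scenes in Python 2.
try:
    text.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# Encoding with UTF-8 succeeds and yields two bytes.
utf8_bytes = text.encode("utf-8")
```

The ASCII attempt ends with "ordinal not in range(128)", while the UTF-8 encoding of the same character works.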
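It is probably better to convert everything to unicode before calling the method. The class is designed specifically to work with Unicode strings; it was not designed to support other data types.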
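One way to do that conversion up front is sketched below. The helpers to_text and normalize_row are hypothetical names, not part of the recipe, and the sketch assumes that any byte strings in a row hold UTF-8 data. It is written to run on both Python 2 and Python 3.

```python
import sys

# Text type on either major version: `unicode` on Python 2, `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return `value` as a text (Unicode) string."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):
        # Assumption: byte strings hold data in `encoding` (UTF-8 by default).
        return value.decode(encoding)
    # Integers, floats and other objects go through the text constructor.
    return text_type(value)


def normalize_row(row):
    """Convert every cell to text so that each one has an .encode() method."""
    return [to_text(cell) for cell in row]


row = normalize_row([1, u"K\xf6ln", b"Z\xc3\xbcrich", 3.5])
```

With a helper like this, the unmodified writerow from the recipe could be called as uW.writerow(normalize_row(d)), leaving the class itself untouched.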
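From the cStringIO documentation:
Unlike the StringIO module, this module is not able to accept Unicode strings that cannot be encoded as plain ASCII strings.
In other words, it can only store 7-bit clean strings.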
If you are using Python 2:
Change the encoding call to str(s.encode("utf-8")), i.e.:
def writerow(self, row):
    self.writer.writerow([str(s.encode("utf-8")) for s in row])
    # Fetch UTF-8 output from the queue ...
    data = self.queue.getvalue()
    data = data.decode("utf-8")
    # ... and reencode it into the target encoding
    data = self.encoder.encode(data)
    # write to the target stream
    self.stream.write(data)
    # empty queue
    self.queue.truncate(0)
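For comparison, if moving to Python 3 is an option (an assumption, since the question targets Python 2), the csv module works with text strings directly and the recoding wrapper is not needed. The following is only a sketch; rows_to_csv_bytes is a hypothetical helper name and the sample rows are made up.

```python
import csv
import io


def rows_to_csv_bytes(rows, encoding="utf-8"):
    """Serialize rows to CSV text in memory, then encode once at the end."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    # csv.writer converts non-string cells such as integers with str().
    writer.writerows(rows)
    return buffer.getvalue().encode(encoding)


data = rows_to_csv_bytes([[1, u"K\xf6ln"], [2, u"Z\xfcrich"]])
```

When writing to a real file on Python 3, opening it with open(path, "w", newline="", encoding="utf-8") and passing it to csv.writer achieves the same thing without the intermediate buffer.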