I have a list with different values. It looks like this:
data = [
    ('Column1', 'Column2'),
    ('myFirstNovel', 'myAge'),
    ('mySecondNovel', 'myAge2'),
    ('myThirdNovel', 'myAge3'),
    ('myFourthNovel', 'myAge4')
]
I get encoding errors when I write the data to CSV, so I want to encode the data before exporting. I tried this:
[[all.encode('utf-8') for all in items] for items in data]
This doesn't really solve my problem to begin with (the data gets filled with \\xe2\\x80\\x94\\xc2\\xa0 and other stuff). But the main issue is that it takes ages and Python pretty much crashes.
Is there a better method, or should I just change the export method?
(I'm using the csv module and writerows right now.)
If you are using Python 2.x, you can use the following UnicodeWriter
class, which Python suggests in its documentation:
import codecs
import csv
import cStringIO

class UnicodeWriter:
    """
    A CSV writer which will write rows to CSV file "f",
    which is encoded in the given encoding.
    """
    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()

    def writerow(self, row):
        self.writer.writerow([s.encode("utf-8") for s in row])
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)

    def writerows(self, rows):
        for row in rows:
            self.writerow(row)
In Python 3.x you can simply pass your encoding to the open
function and write the strings as they are, without encoding them yourself.
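As a minimal sketch of the Python 3 approach, assuming the rows are ordinary str values: the file is opened in text mode with an explicit encoding, and the csv module writes the rows unchanged. The helper names write_csv and read_csv, the temporary output path, and the em dash and non-breaking space added to the sample data are assumptions made for illustration; they are not part of the original question.

```python
import csv
import os
import tempfile

# Sample rows in the shape used in the question. The em dash (U+2014) and
# non-breaking space (U+00A0) are added here only to exercise non-ASCII output.
data = [
    ("Column1", "Column2"),
    ("myFirstNovel \u2014\u00a0draft", "myAge"),
    ("mySecondNovel", "myAge2"),
]


def write_csv(path, rows, encoding="utf-8"):
    """Write rows of str values to path; open() performs the encoding."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)


def read_csv(path, encoding="utf-8"):
    """Read the file back as a list of tuples of str."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return [tuple(row) for row in csv.reader(f)]


# Hypothetical output location, chosen only for this example.
out_path = os.path.join(tempfile.mkdtemp(), "novels.csv")
write_csv(out_path, data)
round_trip = read_csv(out_path)
```

With this approach there is no need to call encode on each cell. Doing so in Python 3 would hand bytes objects to the writer, which would then appear in the file as their b'...' representation with the \x escapes written out literally.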