
UnicodeDecodeError when writing to Excel using Python

I am trying to add a sheet to an Excel file by calling

df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно'.decode('utf-8'), path='{}.xlsx'.format(x.name)))

where the function is

import pandas as pd
from openpyxl import load_workbook

def add_xlsx_sheet(df, sheet_name=u'Смартфоны кратко', index=True, digits=2, path=None):
    # Open the existing workbook and attach it to the pandas writer
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    # Drop any existing sheet with the same name before writing
    if sheet_name in list(writer.sheets.keys()):
        sh = book.get_sheet_by_name(sheet_name)
        book.remove_sheet(sh)
    df.to_excel(excel_writer=writer, sheet_name=sheet_name, startrow=0, startcol=0,
                float_format='%.{}f'.format(digits), index=index, encoding='utf-8')
    writer.save()

and I get this error:

Traceback (most recent call last):
  File "C:/Users/ /PycharmProjects/14-27/desktop.py", line 142, in <module>
    df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно'.decode('utf-8'), path='{}.xlsx'.format(x.name)))
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 651, in apply
    return self._python_apply_general(f)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 655, in _python_apply_general
    self.axis)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 1527, in apply
    res = f(group)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 647, in f
    return func(g, *args, **kwargs)
  File "C:/Users/ /PycharmProjects/14-27/desktop.py", line 142, in <lambda>
    df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно'.decode('utf-8'), path='{}.xlsx'.format(x.name)))
  File "C:/Users/ /PycharmProjects/14-27/desktop.py", line 137, in add_xlsx_sheet
    float_format='%.{}f'.format(digits), index=index)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1425, in to_excel
    startrow=startrow, startcol=startcol)
  File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 1257, in write_cells
    xcell.value = _conv_value(cell.val)
  File "C:\Python27\lib\site-packages\openpyxl\cell\cell.py", line 291, in value
    self._bind_value(value)
  File "C:\Python27\lib\site-packages\openpyxl\cell\cell.py", line 190, in _bind_value
    value = self.check_string(value)
  File "C:\Python27\lib\site-packages\openpyxl\cell\cell.py", line 149, in check_string
    value = unicode(value, self.encoding)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc4 in position 0: invalid continuation byte

Why does this happen? And when I try add_sheet without rewriting the file:

df1.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы кратко', path='{}.xlsx'.format(x.name)))
df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно', path='{}.xlsx'.format(x.name)))

it returns this error:

Traceback (most recent call last):
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 141, in <module>
    df1.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы кратко', path='{}.xlsx'.format(x.name)))
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 651, in apply
    return self._python_apply_general(f)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 655, in _python_apply_general
    self.axis)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 1527, in apply
    res = f(group)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 647, in f
    return func(g, *args, **kwargs)
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 141, in <lambda>
    df1.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы кратко', path='{}.xlsx'.format(x.name)))
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 138, in add_xlsx_sheet
    writer.save()
  File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 732, in save
    return self.book.save(self.path)
  File "C:\Python27\lib\site-packages\openpyxl\workbook\workbook.py", line 294, in save
    save_workbook(self, filename)
  File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 270, in save_workbook
    writer.save(filename)
  File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 251, in save
    self.write_data()
  File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 94, in write_data
    archive.writestr(ARC_WORKBOOK, write_workbook(self.workbook))
  File "C:\Python27\lib\site-packages\openpyxl\writer\workbook.py", line 85, in write_workbook
    active = get_active_sheet(wb)
  File "C:\Python27\lib\site-packages\openpyxl\writer\workbook.py", line 59, in get_active_sheet
    sheet = wb.active
  File "C:\Python27\lib\site-packages\openpyxl\workbook\workbook.py", line 115, in active
    return self._sheets[self._active_sheet_index]
IndexError: list index out of range

The error is caused by this snippet:

u'Десктопы полно'.decode('utf-8')

The 'u' prefix makes the string a Unicode string. A Unicode string is not encoded in any way; it is already in decoded form.

For example,

>>> s='Десктопы полно'
>>> u=u'Десктопы полно'
>>> s
'\xd0\x94\xd0\xb5\xd1\x81\xd0\xba\xd1\x82\xd0\xbe\xd0\xbf\xd1\x8b \xd0\xbf\xd0\xbe\xd0\xbb\xd0\xbd\xd0\xbe'
>>> u
u'\u0414\u0435\u0441\u043a\u0442\u043e\u043f\u044b \u043f\u043e\u043b\u043d\u043e'
>>> s.decode('utf-8')
u'\u0414\u0435\u0441\u043a\u0442\u043e\u043f\u044b \u043f\u043e\u043b\u043d\u043e'
>>> u.encode('utf-8')
'\xd0\x94\xd0\xb5\xd1\x81\xd0\xba\xd1\x82\xd0\xbe\xd0\xbf\xd1\x8b \xd0\xbf\xd0\xbe\xd0\xbb\xd0\xbd\xd0\xbe'

We can see that s == u.encode('utf-8').
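For readers on Python 3, the same relationship can be sketched as follows. This is an illustration rather than part of the original session: the variable names are arbitrary, and it relies on the fact that a Python 3 str is always decoded Unicode text while bytes holds the encoded form.

```python
# Python 3 sketch: str is decoded text, bytes is the encoded form.
text = 'Десктопы полно'            # already Unicode, like u'...' in Python 2
data = text.encode('utf-8')        # text -> bytes
round_trip = data.decode('utf-8')  # bytes -> text

# str has no .decode() method in Python 3, so "decoding" text is not possible.
str_has_decode = hasattr(text, 'decode')

print(data)
print(round_trip == text, str_has_decode)
```

The bytes printed here are the same ones shown for s in the Python 2 session above.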

For a more detailed explanation of why, see http://pythoncentral.io/python-unicode-encode-decode-strings-python-2x/

So, basically, a Unicode string must be encoded rather than decoded, i.e.,

u'Десктопы полно'.encode('utf-8')
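Separately, the first traceback ends inside openpyxl's check_string, which is decoding a cell value, and it stops at byte 0xc4 in position 0. One way that exact message can arise is sketched below in Python 3. It assumes the DataFrame holds Cyrillic text stored as cp1251 byte strings, which the question does not state. The sample word and the variable names are only illustrative.

```python
# Assumption (not confirmed by the question): a cell holds cp1251-encoded bytes.
raw = 'Десктопы'.encode('cp1251')
first_byte = raw[0]  # 0xc4 is the cp1251 code for the letter 'Д'

# Decoding those bytes as UTF-8 fails: 0xc4 starts a two-byte UTF-8 sequence,
# but the byte that follows is not a valid continuation byte.
error = None
try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    error = exc

# Decoding with the codec the bytes were written in recovers the text.
recovered = raw.decode('cp1251')

print(hex(first_byte), error)
print(recovered)
```

If that assumption holds, decoding the affected values with the codec they were actually written in, before calling to_excel, would be one way to avoid the failure.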

