Pandas read_excel strange error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2

When I read an Excel file in IPython (Jupyter, to be exact), the dataframe seems to be read correctly, but I can't display it or work on its text columns (for instance, to merge with another Excel file when the key is a text field), because I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2...

The strange part is that if I do something like:

for i in df['Textual Col Name']:
    print i

it prints all the values fine.

I've tried the different solutions offered for other similar questions here, but nothing worked. I don't think there's a good answer on SO for the case where the df is read from an Excel file.

I would love your help working out this problem and, if possible, an explanation of why, after all that, I can still print the individual items just fine.

Thanks in advance!

You need to specify the encoding of the file. Without having the file, it is impossible to know how it is encoded, but if you don't know, you can try a few and see which one works: encoding='utf-8', encoding='latin-1' or encoding='cp1252' in pd.read_excel (if your pandas version accepts that argument).
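To see what "try a few and see which works" looks like on raw bytes, here is a minimal sketch. The helper name try_encodings and the sample byte string are made up for illustration; the sample assumes the 0xe2 byte is the start of a UTF-8 encoded curly apostrophe, which is a common case but only a guess about the actual file.

```python
def try_encodings(raw, candidates=("ascii", "utf-8", "cp1252", "latin-1")):
    """Decode raw bytes with each candidate codec; None marks a failure."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Hypothetical sample: "it's" with a curly apostrophe, encoded as UTF-8.
# The apostrophe becomes the three bytes 0xe2 0x80 0x99.
sample = b"it\xe2\x80\x99s"
results = try_encodings(sample)

for name, text in results.items():
    print(name, "->", "failed" if text is None else repr(text))
```

Note that latin-1 never fails, because every byte maps to some character, so "decodes without error" is not proof that the encoding is right. Check that the decoded text actually looks correct.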

Try using the method of bisection to isolate the problematic row:

import numpy as np
import pandas as pd

# substitute your df here
df = pd.DataFrame({'textcol':np.random.randint(10, size=[1000])})

def isokay(df):
    try:
        print(df)
    except UnicodeDecodeError:
        return False
    return True

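i = 0
# Guard against a zero chunk size for frames with fewer than two rows
chunksize = max(1, len(df) // 2)
while True:
    if isokay(df.iloc[i:i+chunksize]):
        # This chunk prints fine, so move on to the next one
        i += chunksize
        if i >= len(df):
            print('No error found')
            break
    else:
        if chunksize <= 1:
            # Problem occurs at row i
            print('Problem occurs on row {}'.format(i))
            print(df.iloc[i])
            break
        else:
            # Integer division keeps chunksize usable as a slice bound
            chunksize //= 2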

The line print(df.iloc[i]) may itself raise the error. If so, you could consult the Excel file to find out what data is contained on row i.
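As a complement to bisection, you could also scan the text values directly for non-ASCII characters. This is only a sketch and assumes Python 3; the helper name non_ascii_positions and the sample values are hypothetical, not part of pandas.

```python
def non_ascii_positions(values):
    """Return (position, value) pairs for text values with non-ASCII content."""
    hits = []
    for pos, value in enumerate(values):
        if isinstance(value, bytes):
            # Byte strings: any byte above 0x7f cannot be decoded as ASCII
            suspicious = any(b > 0x7f for b in bytearray(value))
        elif isinstance(value, str):
            # Text strings: any code point above 0x7f is outside ASCII
            suspicious = any(ord(ch) > 0x7f for ch in value)
        else:
            # Numbers, None, etc. are not text and are skipped
            suspicious = False
        if suspicious:
            hits.append((pos, value))
    return hits

# Hypothetical sample values standing in for a text column
sample = ["plain text", "it\u2019s", 42, b"caf\xc3\xa9", None]
hits = non_ascii_positions(sample)
print([pos for pos, _ in hits])
```

With a real dataframe you would pass the column itself, for example non_ascii_positions(df['Textual Col Name']), and then look up the reported positions in the Excel file.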
