I keep getting UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 1: invalid start byte

I'm trying to read monthly CSV files, but for some reason I keep getting this error.

This is my code:

import os

import pandas as pd

df = pd.DataFrame()

for file in os.listdir("Performance_Data"):
    if file.endswith(".csv"):
        df = pd.concat([df, pd.read_csv(os.path.join("Performance_Data", file))], axis=0)

df.head()

What do I do?

By default, pandas assumes that your file is encoded in UTF-8. Byte 0x92 cannot start a UTF-8 sequence, but in Windows-1252 it is the right single quotation mark (’), so your file is most likely encoded in Windows-1252. You can tell pandas to use this encoding with the encoding argument:

pd.read_csv(os.path.join("Performance_Data", file), encoding='cp1252')
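
To see why the error names byte 0x92, here is a small sketch that decodes the same byte both ways. The sample bytes are a made-up stand-in for your file's contents, not taken from it.

```python
# Hypothetical sample: the letters "I" and "m" around the single byte 0x92.
raw = b"I\x92m"

try:
    raw.decode("utf-8")
    utf8_ok = True
    error_message = ""
except UnicodeDecodeError as exc:
    # 0x92 is a continuation byte in UTF-8, so it cannot start a character.
    utf8_ok = False
    error_message = str(exc)

# Windows-1252 maps 0x92 to U+2019, the right single quotation mark.
decoded = raw.decode("cp1252")

print(utf8_ok)
print(error_message)
print(decoded)
```

If the UTF-8 attempt fails and the cp1252 result reads naturally (typically curly quotes or apostrophes), that supports the Windows-1252 guess.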

Detecting the encoding of a file automatically is a bit tricky, because a detector can only make an educated guess from the bytes, but you can use a package called "chardet". For your code, it could look like this:

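import os

import chardet
import pandas as pd

df = pd.DataFrame()

for file in os.listdir("Performance_Data"):
    if file.endswith(".csv"):
        # os.listdir() returns bare file names, so build the full path once
        # and use it for both open() and read_csv().
        path = os.path.join("Performance_Data", file)
        # chardet inspects raw bytes, so open the file in binary mode.
        with open(path, "rb") as fp:
            encoding = chardet.detect(fp.read())["encoding"]
        df = pd.concat(
            [
                df,
                pd.read_csv(path, encoding=encoding),
            ],
            axis=0,
        )

df.head()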
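
If you would rather not add a dependency, one alternative is to try a short list of likely encodings in order. The following is only a sketch under assumptions: the helper name pick_encoding, the candidate list, and the sample file are all hypothetical, and latin-1 is placed last because it accepts any byte sequence and so acts as a catch-all rather than a real detection.

```python
import os
import tempfile


def pick_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file."""
    with open(path, "rb") as fp:
        raw = fp.read()
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # This candidate cannot represent the bytes; try the next.
        return name
    raise ValueError(f"none of {candidates} could decode {path!r}")


# Demo on a throwaway file that contains the problematic byte 0x92.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    with open(sample, "wb") as fp:
        fp.write(b"name,note\nA,it\x92s fine\n")
    detected = pick_encoding(sample)

print(detected)
```

The returned name can then be passed as the encoding argument of pd.read_csv for that file. A successful decode only shows that the bytes are valid in that encoding, not that the text is what the author intended, so it is worth checking a few rows by eye.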
