I am using the code below to read a CSV file into a DataFrame. However, I get the error
pandas.parser.CParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2
and hence I changed pd.read_csv('D:/TRYOUT.csv')
to pd.read_csv('D:/TRYOUT.csv', error_bad_lines=False)
as suggested here. However, I now get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 1: invalid continuation byte
on the same line.
import pandas as pd

def ExcelFileReader():
    # error_bad_lines=False skips rows with an unexpected number of fields
    mergedf = pd.read_csv('D:/TRYOUT.csv', error_bad_lines=False)
    return mergedf
Thank You
If you are on Windows, you may need to use pd.read_csv(filename, encoding='latin-1')
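To see why switching the encoding can make the error go away, here is a minimal sketch using only the standard library. The sample bytes are hypothetical (not the asker's file) and `try_decode` is just an illustrative helper. It assumes the file was saved in a single-byte Windows/Latin encoding, in which case the byte 0xf0 is the letter 'ð' but is not valid at that position in UTF-8.

```python
# Hypothetical sample data, not the asker's file: CSV text saved as Latin-1.
raw = "name,city\nGu\u00f0r\u00fan,Reykjav\u00edk\n".encode("latin-1")

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# 0xf0 is a UTF-8 lead byte that must be followed by continuation bytes,
# so decoding these Latin-1 bytes as UTF-8 fails.
utf8_text = try_decode(raw, "utf-8")

# Latin-1 maps every possible byte to a character, so this never raises.
latin1_text = try_decode(raw, "latin-1")

print(utf8_text)    # None
print(latin1_text)
```

Keep in mind that Latin-1 never raises a decode error, so if the guess is wrong you get silently wrong characters instead of an exception. It is worth checking a few rows by eye after loading.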
I had a similar problem and had to use utf-8-sig as the encoding.
I used utf-8-sig because, if the file ever contains non-Latin characters, a Latin-only encoding won't handle them correctly. There are a few ways around the problem, so choose whichever best suits your needs.
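For what it's worth, here is a small sketch of what utf-8-sig actually changes, again with made-up sample bytes. The only difference from plain utf-8 is that a leading byte order mark (BOM), which some Windows tools write at the start of a file, is stripped on read. A file containing bytes that are invalid in UTF-8 will still fail with either codec.

```python
import codecs

# Hypothetical file contents: UTF-8 text with a BOM prepended by the editor.
raw = codecs.BOM_UTF8 + "id,name\n1,Zo\u00eb\n".encode("utf-8")

# Plain utf-8 keeps the BOM as an invisible U+FEFF character,
# which would end up glued to the first column name.
plain = raw.decode("utf-8")

# utf-8-sig drops the BOM if present and otherwise behaves like utf-8.
stripped = raw.decode("utf-8-sig")

print(repr(plain.splitlines()[0]))
print(repr(stripped.splitlines()[0]))
```

So if the first column name comes back with a strange invisible prefix, utf-8-sig is the likely fix. If you get "invalid continuation byte", the file is probably not UTF-8 at all.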
Hope that helps.
If you want to skip the lines that raise errors and ignore malformed data, you need to use pd.read_csv(file_path, encoding="utf8", error_bad_lines=False, encoding_errors="ignore")
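A rough standard-library sketch of what those two options amount to, using hypothetical data and a hypothetical helper `read_rows` (this is not how pandas implements it): undecodable bytes are dropped or replaced during decoding, and rows whose field count differs from the header are skipped. As far as I know, encoding_errors was added in pandas 1.3, and error_bad_lines was deprecated in that same release in favor of on_bad_lines="skip" and removed in pandas 2.0, so check which one your version accepts.

```python
import csv
import io

# Hypothetical file contents: the "2,Ben,extra" row has a stray extra field,
# and the last row contains 0xf0, which is not valid UTF-8 at that position.
raw = b"id,name\n1,Anna\n2,Ben,extra\n3,Gu\xf0ni\n"

def read_rows(data, errors):
    """Decode as UTF-8 with the given error policy, keep rows matching the header width."""
    text = data.decode("utf-8", errors=errors)
    rows = list(csv.reader(io.StringIO(text, newline="")))
    header, body = rows[0], rows[1:]
    return [row for row in body if len(row) == len(header)]

ignored = read_rows(raw, "ignore")    # the bad byte is silently dropped
replaced = read_rows(raw, "replace")  # the bad byte becomes U+FFFD

print(ignored)
print(replaced)
```

Note that "ignore" changes the data without telling you (the name loses a letter here), so it is more of a last resort. Reading the file with the encoding it was actually written in is usually the better fix.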