UnicodeDecodeError when reading CSV file in Pandas with Python: "'utf-8' codec can't decode byte 0xff in position 0: invalid start byte"
I am having trouble reading a csv file using read_csv in Pandas. Here's the error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
I have tried a bunch of different encoding types with the file I am dealing with and none seem to work. The file is from Google's Search Ads 360 product, which says the csv should be in the 'UTF-16' format. Strangely, if I open the file in Excel and save it in utf-8 format, I can use read_csv normally.
I've tried the solutions to a similar problem here, but they did not work for me. This is the only code I am running:
import pandas as pd
df = pd.read_csv('path/file.csv')
Edit: I read in the file as tab delimited, and that seemed to work. I still don't understand why I got the error I did when I tried to read it in as a normal csv. Any insight into this would be appreciated!
Try this encoding:
import pandas as pd
df = pd.read_csv('path/file.csv', encoding='cp1252')
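As for why the error appears in the first place: read_csv decodes as UTF-8 by default, and 0xff is never a valid byte in UTF-8. A 0xff at position 0 is, however, what a UTF-16 little-endian byte-order mark (FF FE) starts with, which would fit the UTF-16 format mentioned in the question. Below is a minimal sketch, not a definitive fix, that checks the first bytes of the file for a BOM and passes the matching codec to read_csv. The helper names `sniff_bom_encoding` and `read_report` are made up for illustration, and the tab separator is an assumption based on the question's edit.

```python
import codecs

# Byte-order marks to check. UTF-32 comes first because its little-endian
# BOM (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom_encoding(path, default="utf-8"):
    """Return a codec name guessed from the file's BOM, or `default` if none."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def read_report(path):
    """Hypothetical helper: read a tab-delimited export using the sniffed codec."""
    import pandas as pd  # imported here so the sniffing helper works without pandas

    # sep="\t" is an assumption taken from the question's edit.
    return pd.read_csv(path, encoding=sniff_bom_encoding(path), sep="\t")
```

If `sniff_bom_encoding('path/file.csv')` reports a UTF-16 BOM, calling `pd.read_csv('path/file.csv', encoding='utf-16', sep='\t')` directly should be equivalent. Note that a single-byte codec such as cp1252 can silence the exception on a UTF-16 file while still producing garbled column names and values, so it is worth inspecting the resulting DataFrame whichever encoding is used.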