
How to open a CSV file as a pandas DataFrame

I have a CSV file with three columns, and the third column contains long text. When I tried to open the file with pandas.read_csv, I got this error message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa1 in position 0: invalid start byte

However, opening the file this way works without any problem:

with open('file.csv', 'r', encoding='utf-8', errors='ignore') as csvfile:
    text = csvfile.read()  # reads without raising, but undecodable bytes are dropped

I don't know how to convert this data into a DataFrame, and I don't think pandas.read_csv handles this error properly.

So how can I open this file and get a DataFrame?
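To make the error message concrete, here is a minimal sketch of what the decoder is complaining about. The sample bytes are hypothetical (the real file was not posted); they only assume that the file begins with the byte 0xa1, as the message reports.

```python
# Hypothetical sample line: 0xa1 is an inverted exclamation mark in ISO-8859-1,
# but it is not a valid first byte of any UTF-8 sequence.
raw = b"\xa1Hola!,2,some long text\n"

try:
    raw.decode("utf-8")  # strict decoding, the default used by read_csv
    strict_ok = True
except UnicodeDecodeError as exc:
    strict_ok = False
    print(exc)

# errors="ignore" silently drops every byte that cannot be decoded.
ignored = raw.decode("utf-8", errors="ignore")

# ISO-8859-1 maps every possible byte to a character, so decoding cannot fail.
latin1 = raw.decode("ISO-8859-1")

print(repr(ignored))
print(repr(latin1))
```

This is why open() with errors="ignore" succeeds while a strict UTF-8 read fails: the offending byte is discarded rather than reported, so some characters may be missing from the result.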

Try this:

Open the CSV file in a text editor and make sure to save it in UTF-8 format.

Then read the file as normal:

import pandas
csvfile = pandas.read_csv('file.csv', encoding='utf-8')
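If a text editor is not convenient, the re-save step can also be done in Python. The following is a sketch only: the helper name reencode_to_utf8, the output file name, and the assumption that the file is currently encoded as ISO-8859-1 are all hypothetical, since the question does not say what the real encoding is.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="ISO-8859-1"):
    """Decode src with source_encoding and write the same text to dst as UTF-8.

    source_encoding is an assumption; replace it with the file's real encoding.
    Working on bytes keeps the original line endings untouched.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

After calling reencode_to_utf8('file.csv', 'file_utf8.csv'), the new file can be passed to pandas.read_csv with encoding='utf-8'. If the assumed source encoding is wrong, the conversion still succeeds with ISO-8859-1 but the non-ASCII characters will come out garbled, so it is worth checking a few rows by eye.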

I would try reading the file with the built-in csv module and then putting the data into pandas.

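import csv
# The delimiter and quotechar below are placeholders; set them to match your file.
with open('eggs.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        # Each row is a list of strings, one per field.
        print(', '.join(row))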

If this doesn't work, then at least you can confirm that the problem is in the CSV file itself and not in pandas choking on the encoding.

The other recommendation is to make sure you are using Python 3.x, which handles encoding issues much better than 2.7.

If you can provide a sample of your file, I can test it myself and update my answer accordingly.
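The "put the data into pandas" step mentioned in this answer could look roughly like the sketch below. It assumes a comma-delimited file whose first row is a header, and it reuses the errors="ignore" setting from the question; the function name csv_to_dataframe is made up for illustration.

```python
import csv

import pandas as pd


def csv_to_dataframe(path, encoding="utf-8", errors="ignore"):
    """Read a CSV with the csv module, then build a DataFrame from its rows.

    Assumes the first row holds the column names and every row has the
    same number of fields. All values are kept as strings.
    """
    with open(path, newline="", encoding=encoding, errors=errors) as csvfile:
        rows = list(csv.reader(csvfile))
    if not rows:
        return pd.DataFrame()
    header, body = rows[0], rows[1:]
    return pd.DataFrame(body, columns=header)
```

Calling csv_to_dataframe('file.csv') returns a DataFrame of strings. Numeric columns would still need converting, for example with pandas.to_numeric.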

You can try another encoding, such as "ISO-8859-1".

In your case:

with open('file.csv', 'r', encoding='ISO-8859-1', errors='ignore') as csvfile:
    text = csvfile.read()  # ISO-8859-1 can decode any byte, so this does not raise

Or try this:

import pandas as pd
data_file = pd.read_csv("file.csv", encoding = "ISO-8859-1")
print(data_file)
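If the file is mostly UTF-8 and only contains a few stray bytes, decoding everything as ISO-8859-1 would turn its multi-byte characters into garbage. An alternative sketch, which reuses the open(..., errors="ignore") call from the question, is to hand the already-opened file object to pandas.read_csv, since read_csv accepts file-like objects as well as paths. The function name read_csv_lenient is hypothetical.

```python
import pandas as pd


def read_csv_lenient(path, encoding="utf-8"):
    """Parse a CSV with pandas while dropping bytes that cannot be decoded.

    The file is opened by Python with errors="ignore", so pandas only ever
    sees text that has already been decoded.
    """
    with open(path, "r", encoding=encoding, errors="ignore", newline="") as csvfile:
        return pd.read_csv(csvfile)
```

In pandas 1.3 and later, read_csv also takes an encoding_errors argument, so pd.read_csv('file.csv', encoding='utf-8', encoding_errors='ignore') should have the same effect without the explicit open call. Either way the undecodable bytes are discarded silently, so the long text column may lose a few characters.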

