
Read a csv file in Python correctly using pandas

I am trying to read this file using read_csv in pandas (Python), but I am not able to capture all columns. Can you help?

Here is the code:

import pandas as pd

file = r'path of file'
df = pd.read_csv(file, encoding='cp1252', on_bad_lines='skip')

Thank you

I tried to read your file, and the first thing I noticed is that the encoding you specified does not match the one the file actually uses. I also noticed that the separator is not a comma ( , ) but a tab ( \t ). Since read_csv splits on commas by default, both the encoding and the separator have to be passed explicitly.

First, to get the file encoding (on Linux), you just need to run:

$ file -i kopie.csv 
kopie.csv: text/plain; charset=utf-16le
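
Where the file utility is not available (on Windows, for example), a rough check can be sketched in Python by looking at the byte-order mark (BOM) at the start of the file. This is only a minimal sketch: the helper name guess_encoding_from_bom is made up for illustration, it covers BOM-marked UTF-8/16/32 files only, and it returns None for anything else (including UTF-16 files saved without a BOM).

```python
import codecs
from typing import Optional


def guess_encoding_from_bom(raw: bytes) -> Optional[str]:
    """Return a codec name if `raw` starts with a known BOM, else None.

    Hypothetical helper for illustration; it does not detect encodings
    that carry no BOM (cp1252, plain UTF-8, BOM-less UTF-16, ...).
    """
    # UTF-32 must be checked before UTF-16 because the UTF-32-LE BOM
    # begins with the same two bytes as the UTF-16-LE BOM.
    known_boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, codec_name in known_boms:
        if raw.startswith(bom):
            return codec_name
    return None


def guess_file_encoding(path: str) -> Optional[str]:
    """Read the first four bytes of a file and inspect them for a BOM."""
    with open(path, "rb") as handle:
        return guess_encoding_from_bom(handle.read(4))
```

The sketch returns the generic names "utf-16" and "utf-32" rather than the endian-specific ones, because those codecs read the BOM to pick the byte order and then drop it from the decoded text.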

In Python:

import pandas as pd

path_to_file = 'kopie.csv'
df = pd.read_csv(path_to_file, encoding='utf-16le', sep='\t')

And when I print the shape of the loaded DataFrame:

>>> df.shape
(869, 161)
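
The separator can also be checked programmatically instead of by eye. The following is a small sketch using csv.Sniffer from the standard library; the helper name guess_delimiter, the candidate list and the sample text are assumptions made for illustration, not taken from the original file.

```python
import csv


def guess_delimiter(sample: str, candidates: str = ",\t;|") -> str:
    """Guess the field delimiter of already-decoded text.

    Hypothetical helper: `candidates` limits the guess to common
    separators. csv.Sniffer raises csv.Error if it cannot decide.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Made-up sample standing in for the first few lines of a real file.
sample_text = "id\tname\tvalue\n1\tfoo\t3.5\n2\tbar\t4.0\n"
print(repr(guess_delimiter(sample_text)))
```

For a real file, the sample would be the first few kilobytes read with the correct encoding, and the result could then be passed as sep to read_csv.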

