Pandas read_csv is shifting columns

Question

I'm trying to create a dataframe from a csv file that has 4 empty columns. When I open it in LibreOffice or Excel, the empty columns are identified correctly. However, opening it with pd.read_csv() ends up shifting the columns' values by one.

How can I solve this? It seems like a problem with the pandas read_csv() method.

My code is really standard:

import pandas as pd
df = pd.read_csv('csv_file.csv', sep=',')
df.head()

I changed the headers and used this:

df = pd.read_csv('csv_file.csv', sep=',', index_col=False)

This solved the problem, but what in my previous headers was causing this?

Answer 1

It seems you need the parameter index_col=False so that read_csv does NOT use the first column as the index. The sep=',' parameter can be omitted, because it is the default value:

df = pd.read_csv('csv_file.csv', index_col=False)

Your sample:

df = pd.read_csv('teste2.csv', index_col=False)
print(df)
  Header1 Header2  Header3  Unnamed: 3  Unnamed: 4  Header4  Header5  Header6  \
0     ptn  M00001        0         NaN         NaN        2        0        0   

   Header7  Header8    ...     Header22  Header23  Header24  Header25  \
0        0  -31.573    ...       -0.375       0.0   -64.168   276.586   

   Header26  Header27  Unnamed: 29  Unnamed: 30  Header28  Header29  
0    -0.232       0.0          NaN          NaN     0.702       1.0  

[1 rows x 33 columns]
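
As for what caused the shift: a likely explanation, assuming the data rows contain one more field than the header row (for example because each row ends with a delimiter), is that read_csv then treats the first column as the index. The small in-memory CSV below is made up for illustration and is not the asker's file; it is only a sketch of that behaviour and of what index_col=False changes.

```python
import io

import pandas as pd

# Hypothetical data: 3 header fields, but every data row ends with a
# trailing comma, so each data row has 4 fields.
csv_text = "a,b,c\n4,apple,bat,\n8,orange,cow,\n"

# Default behaviour: the first column becomes the index and the
# remaining values shift one column to the left.
shifted = pd.read_csv(io.StringIO(csv_text))

# index_col=False: the first column stays a regular column and the
# extra empty trailing field is dropped.
fixed = pd.read_csv(io.StringIO(csv_text), index_col=False)

print(shifted)
print(fixed)
```

Recent pandas versions may emit a ParserWarning about the header length not matching the data when index_col=False is used on such a file; the frame is still returned.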

Answer 2

I encountered the same problem. Try writing headings on top of each column if there are none. This time, read_csv() also reads the headings and lists them.
After that, convert the data frame to an array with

df=df.values

and the headings are gone.
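
A minimal sketch of that conversion, using made-up headings and values. It uses df.to_numpy(), which the pandas documentation recommends over df.values; both return the data without the column labels.

```python
import io

import pandas as pd

# Hypothetical file content with a heading for every column.
csv_text = "x,y,z\n1,2,3\n4,5,6\n"

df = pd.read_csv(io.StringIO(csv_text))

# The array holds only the values; the headings stay behind in df.columns.
arr = df.to_numpy()
print(arr)
```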

Answer 3

The problem occurs if your line ends with a delimiter (here a comma), which creates an empty cell that is generally not visible in MS Excel. If your csv line looks like this:

1,2282816,102.97245065789474,2432,0.8333333333333334,0.1388888888888889,certain,

then modify it to:

1,2282816,102.97245065789474,2432,0.8333333333333334,0.1388888888888889,certain

and pd.read_csv(fileName) will work fine.
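
If editing the file by hand is not practical, the same cleanup can be sketched in code. The helper name strip_trailing_delimiter and the sample rows are made up for illustration. Note the assumption: a trailing delimiter never stands for a real empty last field, otherwise this would drop data.

```python
import io

import pandas as pd


def strip_trailing_delimiter(text, delimiter=","):
    """Remove one trailing delimiter from each line that ends with it."""
    cleaned = []
    for line in text.splitlines():
        if line.endswith(delimiter):
            line = line[: -len(delimiter)]
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"


# Hypothetical content: the data rows end with a comma, the header does not.
raw = "id,size,label\n1,2282816,certain,\n2,100,likely,\n"

df = pd.read_csv(io.StringIO(strip_trailing_delimiter(raw)))
print(df)
```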

Answer 4

I had a similar problem. Here is how I solved it:

Opened the Excel file with Google Sheets on Google Drive
Downloaded the spreadsheet as a csv file
Read the csv file via pandas.read_csv('filename', sep=',', index_col=False)

Problem resolved.
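
To check whether a file (original or re-exported) still has data rows with more fields than the header, a quick count with the standard csv module may help. This is only a diagnostic sketch; the function name field_counts and the sample text are made up for illustration.

```python
import csv
import io


def field_counts(text, delimiter=","):
    """Return the number of fields found on each line of the CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]


# Hypothetical content: 3 header fields, 4 fields in each data row.
sample = "a,b,c\n1,2,3,\n4,5,6,\n"

counts = field_counts(sample)
header_count, data_counts = counts[0], counts[1:]

# 1-based line numbers of data rows whose field count differs from the header.
mismatched = [
    line_no
    for line_no, n in enumerate(data_counts, start=2)
    if n != header_count
]
print(counts, mismatched)
```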

4 answers

Answer 1: score 7, 2017-08-12 16:59:26
Answer 2: score 2, 2018-08-24 22:29:38
Answer 3: score 1, 2019-02-26 06:42:57
Answer 4: score 0, 2019-06-23 07:53:04
