Pandas read csv is shifting columns
I'm trying to create a DataFrame from a CSV file that has 4 empty columns. When I open it in LibreOffice or Excel, the empty columns are identified correctly. However, opening it with pd.read_csv() ends up shifting the columns' values by one.
How can I solve this? It seems like a problem with pandas' read_csv() function.
My code is really standard:
import pandas as pd
df = pd.read_csv('csv_file.csv', sep=',')
df.head()
I changed the headers and used this:
df = pd.read_csv('csv_file.csv', sep=',', index_col=False)
This solved the problem, but what in my previous headers was causing this?
It seems you need the parameter index_col=False so that read_csv does NOT use the first column as the index. The sep=',' parameter can be omitted, because it is the default value:
df = pd.read_csv('csv_file.csv', index_col=False)
Your sample:
df = pd.read_csv('teste2.csv', index_col=False)
print (df)
Header1 Header2 Header3 Unnamed: 3 Unnamed: 4 Header4 Header5 Header6 \
0 ptn M00001 0 NaN NaN 2 0 0
Header7 Header8 ... Header22 Header23 Header24 Header25 \
0 0 -31.573 ... -0.375 0.0 -64.168 276.586
Header26 Header27 Unnamed: 29 Unnamed: 30 Header28 Header29
0 -0.232 0.0 NaN NaN 0.702 1.0
[1 rows x 33 columns]
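As for what causes the shift: when the data rows contain one more field than the header row (for example because each row ends with a delimiter), read_csv treats the first column as the index unless index_col=False is passed. The following is a minimal sketch of that behaviour. The in-memory data and the column names a, b, c are made up for illustration and are not the asker's file.

```python
import io
import pandas as pd

# Hypothetical data: the header names 3 columns, but every data row
# ends with a trailing comma and therefore has 4 fields.
csv_text = "a,b,c\n1,2,3,\n4,5,6,\n"

# Default: the extra field makes pandas use the first column as the
# index, so the remaining values look shifted one column to the left.
shifted = pd.read_csv(io.StringIO(csv_text))

# index_col=False: the first column stays a regular column.
fixed = pd.read_csv(io.StringIO(csv_text), index_col=False)

print(shifted)
print(fixed)
```

In the first frame the values 1 and 4 become the index and column c is all NaN. In the second frame each value sits under its own header.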
I encountered the same problem. Try writing headings at the top of each column if there are none. This time, read_csv() also reads the headings and lists them.
After that, convert the data frame to an array with
df = df.values
and the headings are gone.
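If editing the file to add headings is not convenient, a possible alternative (not what this answer describes) is to tell pandas that the file has no header row with header=None and then convert to an array. A minimal sketch, assuming a made-up header-less file held in memory:

```python
import io
import pandas as pd

# Hypothetical header-less data, used only for illustration.
csv_text = "1,2,3\n4,5,6\n"

# header=None: no row is consumed as column names; pandas numbers
# the columns 0, 1, 2 instead.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# to_numpy() returns the values without the column labels,
# like df.values in the answer above.
values = df.to_numpy()
print(values)
```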
The problem occurs if your line ends with a delimiter (here a comma), which creates an empty cell that is generally not visible in MS Excel. If your csv line looks like this:
1,2282816,102.97245065789474,2432,0.8333333333333334,0.1388888888888889,certain,
then modify it to:
1,2282816,102.97245065789474,2432,0.8333333333333334,0.1388888888888889,certain
and pd.read_csv(fileName) will work fine.
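For a file with many rows, the trailing delimiter can be removed in code rather than by hand. This is a rough sketch under stated assumptions: the data, the column names id, count, label and the helper drop_one_trailing_comma are all hypothetical, and the approach assumes that a trailing comma never represents a legitimately empty last field.

```python
import io
import pandas as pd

# Hypothetical data: the header is clean, each data row ends with a comma.
raw = "id,count,label\n1,2282816,certain,\n2,2432,certain,\n"

def drop_one_trailing_comma(line):
    # Remove exactly one trailing comma, if present.
    return line[:-1] if line.endswith(",") else line

cleaned = "\n".join(drop_one_trailing_comma(line) for line in raw.splitlines())

# With header and data rows the same length, no column is used as the index.
df = pd.read_csv(io.StringIO(cleaned))
print(df)
```

For a file on disk, the same helper could be applied to the lines read from the file before passing the joined text to pd.read_csv.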
I had a similar problem. Here is how I solved it: read the csv file with
pandas.read_csv('filename', sep=',', index_col=False)
Problem resolved.