Why is a dummy '0' row created after pd.read_csv()?
Why is there an additional row named '0' after read_csv?
In the following code, I save df1 to a CSV file and read it back. However, the result has an additional row named '0'.
How can I avoid it?
import pandas as pd

d1 = {'a': 1, 'b': 2, 'c': 3}
df1 = pd.Series(d1)
print('\ndf1:'); print(df1)
> df1:
> a    1
> b    2
> c    3
> dtype: int64
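# header=False keeps the file header-less; pandas 0.24+ writes a header line for a Series by default
df1.to_csv("df1.csv", header=False)
df1 = pd.read_csv("df1.csv", index_col=0, header=None)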
print('\ndf1:'); print(df1)
> df1:
>    1
> 0      <<<<---- ????
> a  1
> b  2
> c  3
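That is not a dummy row; it is the name of your index column. Because the file is read with header=None, the columns get the integer labels 0 and 1, and index_col=0 turns column 0 into the index, which keeps 0 as its name.
You can check it by running:
> df1.index.name
0
You can also change it by assigning a new name:
> df1.index.name = "my_index"
> df1
          1
my_index
a         1
b         2
c         3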
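If the goal is simply to get rid of the extra line in the printed output, clearing the index name should be enough. The following is a minimal sketch, assuming the CSV holds the same three header-less rows as in the question; it reads from an in-memory string (csv_text is an illustrative name) instead of df1.csv so that it is self-contained.

```python
import io
import pandas as pd

# Assumed file contents: the three rows from the question, no header line.
csv_text = "a,1\nb,2\nc,3\n"

df = pd.read_csv(io.StringIO(csv_text), index_col=0, header=None)

# The label printed above the rows is the index name, not a data row.
original_name = df.index.name
print(original_name)
print(list(df.columns))  # the remaining column keeps its positional label

# Clearing the name removes that extra line from the printed frame.
df.index.name = None
print(df)
```

The data itself is unchanged by this; only the label on the index is dropped.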
One way to avoid it is to create a DataFrame (as your variable name suggests) instead of a Series (which is what you have at the moment).
import pandas as pd
d1 = {'a' : 1, 'b' : 2, 'c' : 3}
df1 = pd.Series(d1).to_frame() # Use to_frame() here
df1.to_csv("df1.csv")
df1 = pd.read_csv("df1.csv", index_col=0)
print(df1) gives:
   0
a  1
b  2
c  3
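A small variation on the to_frame() approach is to give the column a name at conversion time, so the header in the file is more descriptive than 0. This is only a sketch: the column name "values" is an arbitrary choice, and the CSV text is kept in memory rather than written to df1.csv.

```python
import io
import pandas as pd

s1 = pd.Series({'a': 1, 'b': 2, 'c': 3})

# "values" is an illustrative column name, not something the question requires.
frame = s1.to_frame(name="values")

# Calling to_csv() without a path returns the CSV text instead of writing a file.
csv_text = frame.to_csv()

# The header line now supplies the column name; the index column has no name.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(restored)
```

Because a DataFrame is written with a header line, read_csv needs no header=None here, and the unnamed first column becomes an index without a name.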
Another solution is to specify the column names when you read the file. If you opt for this solution, I would rename df1 to s1 where you create the Series, for readability.
df1 = pd.read_csv("df1.csv", names=['index','values'], index_col='index')
And you get:
       values
index
a           1
b           2
c           3
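Full example:
import pandas as pd

d1 = {'a': 1, 'b': 2, 'c': 3}
s1 = pd.Series(d1)
print('s1:\n{}\n'.format(s1))
# header=False keeps the file header-less; pandas 0.24+ writes a header line for a Series by default
s1.to_csv("df1.csv", header=False)
df1 = pd.read_csv("df1.csv", names=['index', 'values'], index_col='index')
print('df1:\n{}\n'.format(df1))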
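Since the object that was saved is a Series, it may be more natural to end up with a Series again after reading. One possible way, sketched here under the same naming assumptions as the full example, is to select the single column and clear the names. The CSV text is held in memory so the snippet does not depend on df1.csv.

```python
import io
import pandas as pd

s1 = pd.Series({'a': 1, 'b': 2, 'c': 3})

# Write without a header line so every row in the text is data.
csv_text = s1.to_csv(header=False)

df1 = pd.read_csv(io.StringIO(csv_text), names=['index', 'values'], index_col='index')

# Selecting the only column yields a Series; drop the helper names afterwards.
s2 = df1['values']
s2.index.name = None
s2.name = None
print(s2)
```

The names 'index' and 'values' exist only to drive read_csv, which is why they are removed at the end.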