
Question about drop=True in pd.DataFrame.reset_index()

In a pandas DataFrame, the index can be reset with the reset_index() method. One optional argument is drop=True, which according to the documentation:

drop : bool, default False
    Do not try to insert index into dataframe columns. 
    This resets the index to the default integer index.

My question is, what does the first sentence mean? If I leave it as False, will it try to convert the integer index into a new column in my df?

Also, will my row order be preserved, or should I also sort to ensure proper ordering?

As you can see below, df.reset_index() moves the index into the dataframe as a column. If the index was just a generic numerical index, you probably don't care about it and can simply discard it. Below is a simple dataframe; I dropped the first row just so the index values differ from the default 0, 1, 2.

import pandas as pd

df = pd.DataFrame([['a', 10], ['b', 20], ['c', 30], ['d', 40]], columns=['letter','number'])
df = df[df.number > 10]
print(df)
#   letter  number
# 1      b      20
# 2      c      30
# 3      d      40

The default behavior produces a column named index that holds the previous index. You can see that the index column matches the index from above, while the index itself has been renumbered starting from 0.

print(df.reset_index())
#    index letter  number
# 0      1      b      20
# 1      2      c      30
# 2      3      d      40

With drop=True, the old index is not kept as a column; you simply get a new default integer index.

print(df.reset_index(drop=True))
#   letter  number
# 0      b      20
# 1      c      30
# 2      d      40
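The two behaviors can be checked side by side. The following is a minimal sketch that rebuilds the same frame; the variable names kept, dropped, and named, and the index name row_id, are made up for illustration. It also shows two details not covered above: reset_index() returns a new DataFrame rather than modifying df, and a named index is inserted under its own name instead of "index".

```python
import pandas as pd

df = pd.DataFrame(
    [["a", 10], ["b", 20], ["c", 30], ["d", 40]],
    columns=["letter", "number"],
)
df = df[df.number > 10]  # surviving index labels: 1, 2, 3

kept = df.reset_index()              # old index becomes a column named "index"
dropped = df.reset_index(drop=True)  # old index is discarded

print(list(kept.columns))      # ['index', 'letter', 'number']
print(list(dropped.columns))   # ['letter', 'number']
print(dropped.index.tolist())  # [0, 1, 2]

# reset_index() returns a new DataFrame; df itself keeps labels 1, 2, 3
print(df.index.tolist())       # [1, 2, 3]

# If the index has a name, that name is used for the inserted column.
# "row_id" is a hypothetical name chosen for this example.
named = df.rename_axis("row_id").reset_index()
print(list(named.columns))     # ['row_id', 'letter', 'number']
```

To keep the renumbered result, assign it back, for example df = df.reset_index(drop=True).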

Regarding row order, reset_index() does not reorder rows; it only replaces the index labels, so the order you had before the call is maintained. That said, if a later step depends on a particular order (for example an order-sensitive aggregation), sort explicitly before that step rather than relying on whatever order the data happens to be in.
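A small check of the row-order point, as a sketch with made-up data and deliberately non-sequential index labels: the rows stay in the same positions after reset_index(drop=True), and an explicit sort_values() is what changes the order.

```python
import pandas as pd

# Hypothetical frame with out-of-order, non-sequential index labels
df = pd.DataFrame(
    {"letter": ["d", "b", "c"], "number": [40, 20, 30]},
    index=[7, 2, 5],
)

reset = df.reset_index(drop=True)

# Same rows in the same positions; only the labels changed
print(reset["letter"].tolist())  # ['d', 'b', 'c']
print(reset.index.tolist())      # [0, 1, 2]

# If a later step needs a specific order, sort first, then renumber
ordered = df.sort_values("number").reset_index(drop=True)
print(ordered["number"].tolist())  # [20, 30, 40]
```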
