What are use cases for *not* resetting a groupby index in pandas

When working with groupby on a pandas DataFrame, I have always used either as_index=False or reset_index(). I cannot actually think of any reason why I wouldn't do so. Because my behavior is not the pandas default (indeed, because the groupby index exists at all), I suspect that there is some functionality of pandas that I am not taking advantage of.

Can anyone describe cases where it would be advantageous not to reset the index?

When you perform a groupby/agg operation, it is natural to think of the result as a mapping from the groupby keys to the aggregated scalar values. If we were using plain Python, a dict would be the natural data structure to hold such a mapping from keys to values. Since we are using pandas, a Series is the natural data structure. Its index holds the keys, and the Series values are the aggregated scalars. If there is more than one aggregated value for each key, then the natural data structure to use is a DataFrame.
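A minimal sketch of that mapping view, assuming a small made-up DataFrame (the city and sales columns are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome", "Rome"],
    "sales": [10, 20, 5, 15, 25],
})

# One aggregated value per key: a Series whose index holds the groupby keys
totals = df.groupby("city")["sales"].sum()

# The Series can be read much like a dict from key to value
as_dict = totals.to_dict()

# Several aggregated values per key: a DataFrame indexed by the same keys
stats = df.groupby("city")["sales"].agg(["sum", "mean"])
```

Here totals["Rome"] reads like a dict lookup, and stats.loc["Rome"] returns the whole row of aggregates for that key.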

The advantage of holding the keys in an index rather than a column is that looking up a value by index label is typically an O(1) operation (for a unique index, pandas uses a hash-based lookup), whereas looking up a value by matching against a column means comparing every row, which is an O(n) operation.
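The two lookup styles can be sketched side by side. This is an illustration on hypothetical data rather than a benchmark; on a frame this small the timing difference is not measurable, and the point is only the shape of each lookup:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome", "Rome"],
    "sales": [10, 20, 5, 15, 25],
})

# Default behavior: the groupby keys become the index
by_index = df.groupby("city")["sales"].sum()

# as_index=False: the groupby keys stay in an ordinary column
by_column = df.groupby("city", as_index=False)["sales"].sum()

# Keys in the index: a direct label-based lookup
rome_from_index = by_index.loc["Rome"]

# Keys in a column: build a boolean mask over every row, then select
mask = by_column["city"] == "Rome"
rome_from_column = by_column.loc[mask, "sales"].iloc[0]
```

Both lookups return the same value; the second has to compare the key against every row before it can select anything.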

Since the result of a groupby/agg operation fits naturally into a Series or DataFrame with the groupby keys as the index, and since indexes have this special fast lookup property, it is better to return the result in this form by default.

