Difference between "as_index = False" and "reset_index()" in pandas groupby

Question

I just wanted to know what the difference is between what these two do.

Data:

import pandas as pd
df = pd.DataFrame({"ID":["A","B","A","C","A","A","C","B"], "value":[1,2,4,3,6,7,3,4]})

reset_index():

df_group1 = df.groupby("ID").sum().reset_index()

as_index=False:

df_group2 = df.groupby("ID", as_index=False).sum()

Both of them give exactly the same output.

  ID  value
0  A     18
1  B      6
2  C      6

Can anyone tell me what the difference is, with an example that illustrates it?

Answer 1

When you use as_index=False, you tell groupby() that you don't want the ID column set as the index (duh!). When both implementations yield the same result, use as_index=False, because it saves you some typing and an unnecessary pandas operation ;)
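To make the point above concrete, here is a small sketch using the data from the question. It only compares the shape of the two results; the variable names with_index and no_index are mine, not from the question.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# Default (as_index=True): the group key becomes the index of the result
with_index = df.groupby("ID").sum()
print(with_index.index.name)      # the key lives in the index
print(list(with_index.columns))   # only the aggregated column is left

# as_index=False: the group key stays an ordinary column
no_index = df.groupby("ID", as_index=False).sum()
print(list(no_index.columns))

# Moving the index back into a column gives the same frame
same = with_index.reset_index().equals(no_index)
print(same)
```

So for a plain aggregation like this one, the two spellings end up in the same place, and as_index=False just skips the intermediate step.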

However, sometimes you want to apply more complicated operations to your groups. On those occasions, you might find that one is better suited than the other.

Example 1: You want to sum the values of three variables (i.e. columns) in a group on both axes.

Using as_index=True allows you to apply a sum over axis=1 without specifying the column names, and then sum the values over axis 0. When the operation is finished, you can use reset_index(drop=True/False) to get the dataframe into the right form.
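A rough sketch of Example 1, under some assumptions: the columns x, y, z and their values are made up for illustration, and the sketch sums within each group first and across columns second (for a plain sum the order does not change the total).

```python
import pandas as pd

# Hypothetical data: three numeric variables per row
df = pd.DataFrame({
    "ID": ["A", "B", "A", "C"],
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
    "z": [100, 200, 300, 400],
})

# ID goes to the index, so only x, y, z remain as columns
per_group = df.groupby("ID").sum()

# Sum across the columns without naming them
totals = per_group.sum(axis=1)

# Bring ID back as a column once the arithmetic is done
result = totals.reset_index(name="total")
print(result)
```

Because ID sits in the index during the arithmetic, sum(axis=1) never touches it, and there is no need to list x, y and z by name.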

Example 2: You need to set a value for the group based on the columns in the groupby().

Setting as_index=False allows you to check the condition on an ordinary column rather than on an index, which is often much easier.
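A possible sketch of Example 2, reusing the data from the question. The label column and the values "priority" and "other" are invented here purely to have something to set.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

# ID stays a regular column, so it can be used in a boolean mask
out = df.groupby("ID", as_index=False).sum()

# Hypothetical rule: flag group "A", leave everything else as "other"
out["label"] = "other"
out.loc[out["ID"] == "A", "label"] = "priority"
print(out)
```

With the default as_index=True, the same condition would have to be written against out.index instead of out["ID"].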

At some point, you might come across a KeyError when applying operations to groups. In that case, it is often because you are trying to use a column in your aggregate function that is currently an index of your GroupBy object.
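One simple way this kind of KeyError can show up (not necessarily the only one) is asking for the group key as a column after it has been moved into the index. A minimal sketch with the data from the question:

```python
import pandas as pd

df = pd.DataFrame({"ID": ["A", "B", "A", "C", "A", "A", "C", "B"],
                   "value": [1, 2, 4, 3, 6, 7, 3, 4]})

grouped = df.groupby("ID").sum()   # ID is now the index, not a column

try:
    grouped["ID"]                  # column lookup fails
    raised = False
except KeyError:
    raised = True
print(raised)

# Two ways to get at the keys instead
ids_from_index = grouped.index.tolist()
ids_from_column = grouped.reset_index()["ID"].tolist()
print(ids_from_index, ids_from_column)
```

Either reading from the index or calling reset_index() (or passing as_index=False up front) avoids the error.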

1 solution

Solution 1
23 Accepted 2018-08-20 15:12:07
