
pandas.DataFrame.groupby.nunique() does not drop the groupby column/s. Is this a bug?

Although I set the parameter as_index to True, pandas.DataFrame.groupby.nunique() keeps the columns I am grouping by in the result.

The pandas version is 0.24.1.

import pandas as pd

df = pd.DataFrame(
    {'a': [1, 1, 2, 3, 2],
     'b': [1, 2, 3, 4, 4]}
)
df.groupby('a', as_index=True).nunique()

The output is:

#    a  b
# a      
# 1  1  2
# 2  1  2
# 3  1  1

I expected:

#    b
# a   
# 1  2
# 2  2
# 3  1

As a counterexample that behaves as expected:

df.groupby('a', as_index=True).max()

results in:

#    b
# a   
# 1  2
# 2  4
# 3  4
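A possible workaround, offered as a sketch rather than an official fix: selecting the non-key column explicitly before calling nunique() restricts the count to that column, however a given pandas version treats the grouping column. The names counts and counts_frame are illustrative and do not come from the original post.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 1, 2, 3, 2],
     'b': [1, 2, 3, 4, 4]}
)

# Select column 'b' before aggregating, so only 'b' is counted per group.
counts = df.groupby('a')['b'].nunique()

# Convert the Series to a one-column DataFrame indexed by 'a'.
counts_frame = counts.to_frame()
print(counts_frame)
```

The result has a as its index and b as its only column, which is the shape described as expected above.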

If you run [print(df.to_string() + '\n') for i, df in df.groupby('a', as_index=True)], this is printed:

   a  b
0  1  1
1  1  2

   a  b
2  2  3
4  2  4

   a  b
3  3  4

The a column isn't set as the index of each group's data frame. With as_index=True (which is also the default), it is the output of the groupby aggregation that has its index set to the group keys, not the group data frames themselves.
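This distinction can be checked directly. The sketch below assumes the same df as in the question; the helper names grouped, group_columns, group_row_labels and result are illustrative. It records the columns and row labels of each group produced by iteration, then compares them with the index of an aggregated result.

```python
import pandas as pd

df = pd.DataFrame(
    {'a': [1, 1, 2, 3, 2],
     'b': [1, 2, 3, 4, 4]}
)

grouped = df.groupby('a', as_index=True)

# Each group is a slice of the original frame: 'a' stays a regular column
# and the original row labels are kept.
group_columns = {key: list(group.columns) for key, group in grouped}
group_row_labels = {key: list(group.index) for key, group in grouped}

# as_index=True only affects the aggregated output: group keys become its index.
result = grouped.max()

print(group_columns)
print(group_row_labels)
print(result.index.name, list(result.index))
```

Every group still lists both a and b as columns, while only the aggregated frame is indexed by a.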


 