
Using pandas .agg to do value_counts() twice

I am trying to do a groupby on a DataFrame and apply both value_counts(normalize=True) and value_counts(normalize=False) to it at the same time using .agg.

However, I cannot find a way to do this without it throwing an error. I have tried several of the methods from "Multiple aggregations of the same column using pandas GroupBy.agg()", but none seem to work for me. Part of the issue is having to pass normalize to value_counts.

I created a test example like this:

example = pd.DataFrame({'A': ['a','a','a','b','b','c'], 'B':[1,1,2,3,3,4]})

which gives me:

+---+---+---+
|   | A | B |
+---+---+---+
| 0 | a | 1 |
| 1 | a | 1 |
| 2 | a | 2 |
| 3 | b | 3 |
| 4 | b | 3 |
| 5 | c | 4 |
+---+---+---+

and I want to return:

A   B   False   True
a   1   2       0.666
    2   1       0.333
b   3   2       1.000
c   4   1       1.000

Doing something like:

example.groupby('A')['B'].value_counts(normalize=True)

gives me half of what I want, but I can never get .agg to work.

Thanks

agg isn't a great fit here because pd.Series.value_counts returns a Series rather than a scalar, and getting the normalized result requires an additional level of aggregation. Either concat the two value_counts results or calculate the percentage manually after the first groupby.

# df is the DataFrame from the question (called `example` there)
pd.concat([df.groupby('A').B.value_counts().rename('N'),
           df.groupby('A').B.value_counts(normalize=True).rename('pct')], axis=1)

# or: count once, then divide each count by its group's total
res = df.groupby('A').B.value_counts().rename('N')
res = pd.concat([res, (res/res.groupby(level='A').transform('sum')).rename('pct')], axis=1)

     N       pct
A B             
a 1  2  0.666667
  2  1  0.333333
b 3  2  1.000000
c 4  1  1.000000
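
For anyone who wants to run this end to end, here is a minimal self-contained sketch of both approaches using the data from the question. The variable names (`grouped`, `counts`, `shares`, `totals`, `by_concat`, `by_division`) are my own and are not part of the original answer.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({'A': ['a', 'a', 'a', 'b', 'b', 'c'],
                   'B': [1, 1, 2, 3, 3, 4]})

grouped = df.groupby('A')['B']

# Approach 1: call value_counts twice and join on the shared (A, B) index
counts = grouped.value_counts().rename('N')
shares = grouped.value_counts(normalize=True).rename('pct')
by_concat = pd.concat([counts, shares], axis=1)

# Approach 2: call value_counts once, then divide by each group's total
totals = counts.groupby(level='A').transform('sum')
by_division = pd.concat([counts, (counts / totals).rename('pct')], axis=1)

print(by_concat)
```

Both frames should hold the same numbers. The second approach avoids repeating the groupby count, which may matter on larger data.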


 