How to get the mean of a pandas cut categorical column
I am using pandas cut to bin continuous values. I wonder how to get the mean of the values that fall in each bin.
import numpy as np
import pandas as pd
np.random.seed(100)
df = pd.DataFrame({'a': np.random.randint(1,10,10)})
df['bins_a'] = pd.cut(df['a'],4)
print(df)
   a        bins_a
0  9    (7.0, 9.0]
1  9    (7.0, 9.0]
2  4    (3.0, 5.0]
3  8    (7.0, 9.0]
4  8    (7.0, 9.0]
5  1  (0.992, 3.0]
6  5    (3.0, 5.0]
7  3  (0.992, 3.0]
8  6    (5.0, 7.0]
9  3  (0.992, 3.0]
I tried:
df['bins_a_mean'] = df['bins_a'].mean()
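But this fails, because bins_a is a categorical column of intervals and mean is not defined for it.
How can I get the mean of a for each interval?
Group the original values by the bin column and broadcast each group's mean back to its rows with transform: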
df['bins_a_mean'] = df.groupby('bins_a')['a'].transform('mean')
print(df)
   a        bins_a  bins_a_mean
0  9    (7.0, 9.0]     8.500000
1  9    (7.0, 9.0]     8.500000
2  4    (3.0, 5.0]     4.500000
3  8    (7.0, 9.0]     8.500000
4  8    (7.0, 9.0]     8.500000
5  1  (0.992, 3.0]     2.333333
6  5    (3.0, 5.0]     4.500000
7  3  (0.992, 3.0]     2.333333
8  6    (5.0, 7.0]     6.000000
9  3  (0.992, 3.0]     2.333333
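If one row per bin is wanted rather than a value repeated on every row, a plain groupby mean may be enough. The sketch below is only an illustration: it hard-codes the ten values from the printed frame instead of relying on the random seed, and the name bin_means is made up for this example. Passing observed explicitly is an assumption about what is wanted for empty bins; newer pandas versions warn when it is omitted for a categorical key.

```python
import numpy as np
import pandas as pd

# Same ten values as in the printed frame, hard-coded so the example
# does not depend on the random seed.
df = pd.DataFrame({'a': [9, 9, 4, 8, 8, 1, 5, 3, 6, 3]})
df['bins_a'] = pd.cut(df['a'], 4)

# One mean per bin. observed=False keeps bins that contain no rows
# (their mean would be NaN); observed=True would drop them.
bin_means = df.groupby('bins_a', observed=False)['a'].mean()
print(bin_means)

# Per-row version, as in the answer: each row gets the mean of its bin.
df['bins_a_mean'] = df.groupby('bins_a', observed=False)['a'].transform('mean')
print(df)
```

Here bin_means is a Series indexed by the four intervals, while bins_a_mean has the same length as df, so the second form is the one to use when the mean has to sit next to each original row.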