How to get the mean of pandas cut categorical column
I am using pandas cut to bin continuous values. I would like to know how to get the mean of the values in each bin.
import numpy as np
import pandas as pd
np.random.seed(100)
df = pd.DataFrame({'a': np.random.randint(1,10,10)})
df['bins_a'] = pd.cut(df['a'],4)
print(df)
   a        bins_a
0  9    (7.0, 9.0]
1  9    (7.0, 9.0]
2  4    (3.0, 5.0]
3  8    (7.0, 9.0]
4  8    (7.0, 9.0]
5  1  (0.992, 3.0]
6  5    (3.0, 5.0]
7  3  (0.992, 3.0]
8  6    (5.0, 7.0]
9  3  (0.992, 3.0]
I tried:
df['bins_a_mean'] = df['bins_a'].mean()
But this fails.
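How can I get the mean of each interval?
Try this, which groups the rows by bin and writes each bin's mean of a back onto every row of that bin: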
df['bins_a_mean'] = df.groupby('bins_a')['a'].transform('mean')
print(df)
   a        bins_a  bins_a_mean
0  9    (7.0, 9.0]     8.500000
1  9    (7.0, 9.0]     8.500000
2  4    (3.0, 5.0]     4.500000
3  8    (7.0, 9.0]     8.500000
4  8    (7.0, 9.0]     8.500000
5  1  (0.992, 3.0]     2.333333
6  5    (3.0, 5.0]     4.500000
7  3  (0.992, 3.0]     2.333333
8  6    (5.0, 7.0]     6.000000
9  3  (0.992, 3.0]     2.333333
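If a compact summary with one row per bin is more useful than a per-row column, the same groupby can be aggregated instead of transformed. The following is a minimal sketch, not part of the original answer: it hard-codes the ten values of a printed above so it does not depend on the random seed, and the name bin_means is only illustrative. Passing observed=True is an assumption about what is wanted here: it keeps only bins that actually contain rows.

```python
import pandas as pd

# Same ten values of 'a' as in the printed DataFrame above
# (hard-coded here so the sketch does not depend on the random seed).
df = pd.DataFrame({'a': [9, 9, 4, 8, 8, 1, 5, 3, 6, 3]})
df['bins_a'] = pd.cut(df['a'], 4)

# Aggregate instead of transform: one mean per bin, indexed by interval.
# observed=True keeps only the bins that actually contain rows.
bin_means = df.groupby('bins_a', observed=True)['a'].mean()
print(bin_means)
```

The result is a Series indexed by the interval categories, so a single bin's mean can be looked up by position with bin_means.iloc[i] or joined back onto the frame if needed.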