
Grouping values in Pandas value_counts()

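I want to create a histogram from my pandas DataFrame. I have one column that holds percentage values. I used value_counts(), but it returns too many distinct percentage values. Example: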

0.752        1
0.769        2
0.800        1
0.823        1
          ... 
80.365       1
84.000       1
84.615       1
85.000       10
85.714       1

I need to group these values into bins of equal width, for example 5% (0 - 4.999, 5.000 - 9.999, ...). This is the result I want:

(Example)

0  - 4.999       24
5  - 9.999       12
10 - 14.999      30
...

You can group the data by the result of pd.cut():

In [38]: df
Out[38]:
    value  count
0   0.752      1
1  11.769      3
2  22.800      4
3  33.823      5
4  55.365      1
5  84.000      1
6  84.615      1
7  85.000     10
8  99.714      1

In [39]: df.groupby(pd.cut(df.value, bins=np.linspace(0, 100, 21)))['count'].sum().fillna(0)
Out[39]:
value
(0, 5]        1.0
(5, 10]       0.0
(10, 15]      3.0
(15, 20]      0.0
(20, 25]      4.0
(25, 30]      0.0
(30, 35]      5.0
(35, 40]      0.0
(40, 45]      0.0
(45, 50]      0.0
(50, 55]      0.0
(55, 60]      1.0
(60, 65]      0.0
(65, 70]      0.0
(70, 75]      0.0
(75, 80]      0.0
(80, 85]     12.0
(85, 90]      0.0
(90, 95]      0.0
(95, 100]     1.0
Name: count, dtype: float64

Alternatively, you can drop the NaN values (the empty bins) instead of filling them with 0:

In [40]: df.groupby(pd.cut(df.value, bins=np.linspace(0, 100, 21)))['count'].sum().dropna()
Out[40]:
value
(0, 5]        1.0
(10, 15]      3.0
(20, 25]      4.0
(30, 35]      5.0
(55, 60]      1.0
(80, 85]     12.0
(95, 100]     1.0
Name: count, dtype: float64
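
The sessions above assume that df, np and pd already exist. A minimal standalone sketch of the same grouping follows; the DataFrame is rebuilt from the sample shown in Out[38], and the observed=False argument is an addition that is not in the original answer. It is passed explicitly so that empty bins stay in the result regardless of the pandas version's default.

```python
import numpy as np
import pandas as pd

# Same sample data as in Out[38] above.
df = pd.DataFrame({
    "value": [0.752, 11.769, 22.800, 33.823, 55.365, 84.000, 84.615, 85.000, 99.714],
    "count": [1, 3, 4, 5, 1, 1, 1, 10, 1],
})

# 21 evenly spaced edges from 0 to 100 give 20 bins of width 5:
# (0, 5], (5, 10], ..., (95, 100]
edges = np.linspace(0, 100, 21)
bins = pd.cut(df["value"], bins=edges)

# observed=False (an assumption added here) keeps bins that contain no rows.
binned = df.groupby(bins, observed=False)["count"].sum()
print(binned)

# Keep only the bins that actually received rows.
non_empty = binned[binned > 0]
print(non_empty)
```

In recent pandas versions the sum over an empty bin should come out as 0 rather than NaN, so fillna(0) and dropna() may have no visible effect there; filtering with binned > 0, as in the last step, is one way to hide empty bins in that case.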

Explanation:

In [41]: pd.cut(df.value, bins=np.linspace(0, 100, 21))
Out[41]:
0       (0, 5]
1     (10, 15]
2     (20, 25]
3     (30, 35]
4     (55, 60]
5     (80, 85]
6     (80, 85]
7     (80, 85]
8    (95, 100]
Name: value, dtype: category
Categories (20, object): [(0, 5] < (5, 10] < (10, 15] < (15, 20] ... (80, 85] < (85, 90] < (90, 95] < (95, 100]]
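
The bins above are closed on the right, so 85.000 lands in (80, 85]. The question asks for ranges like 0 - 4.999 and 5.000 - 9.999, which are closed on the left. A hedged sketch of that variant, applied directly to a raw percentage column instead of pre-aggregated counts, is shown below; the Series pct and its values are hypothetical illustration data, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical raw percentage column, one entry per observation.
pct = pd.Series([0.752, 0.769, 0.769, 4.999, 5.0, 12.5, 84.0, 85.0, 85.0], name="pct")

# Integer edges 0, 5, ..., 100 give 20 bins of width 5.
edges = np.arange(0, 101, 5)

# right=False makes each bin left-closed: [0, 5), [5, 10), ...
# so 5.0 is counted in [5, 10) and 85.0 in [85, 90).
# Note: with these edges a value of exactly 100 would fall outside
# [95, 100) and become NaN; extend the edges if that case matters.
binned = pd.cut(pct, bins=edges, right=False)

# sort=False keeps the bins in numeric order instead of ordering by frequency.
counts = binned.value_counts(sort=False)
print(counts)
```

Since the goal in the question is a histogram, counts.plot.bar() could then be used to draw it, assuming matplotlib is installed.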

