
Is there a way to select the upper end of a bin using pandas.cut() function?

I have a question regarding the bins in pandas. My code so far looks like this:

africa_uhc = pd.cut(africa[("Universal health coverage (UHC) service coverage index")]/100, [0, 0.25, 0.50, 0.75, 1])

which prints out

29     (0.5, 0.75]
36     (0.25, 0.5]
111    (0.5, 0.75]
112    (0.5, 0.75]
118    (0.25, 0.5]
140    (0.25, 0.5]
141    (0.5, 0.75]
142      (0, 0.25]
Name: Universal health coverage (UHC) service coverage index, dtype: category
Categories (4, object): [(0, 0.25] < (0.25, 0.5] < (0.5, 0.75] < (0.75, 1]]

I want to pull out the second number of each interval (the upper bound) to do some more aggregations. Is there a way to do this? Or is there a way to bin the values by rounding them up? E.g., index 29 would have 0.75 rather than (0.5, 0.75]. Thanks!

You can use the labels argument of pd.cut to control what is returned for each bin. Passing bins[1:] labels each bin with its upper edge.

import pandas as pd

df = pd.DataFrame({'UHC': [60,30,60,70,40,50,70,10]})
bins = [0, 0.25, 0.50, 0.75, 1]

print(pd.cut(df.UHC/100, bins))
#0    (0.5, 0.75]
#1    (0.25, 0.5]
#2    (0.5, 0.75]
#3    (0.5, 0.75]
#4    (0.25, 0.5]
#5    (0.25, 0.5]
#6    (0.5, 0.75]
#7    (0.0, 0.25]
#Name: UHC, dtype: category
#Categories (4, interval[float64]): [(0.0, 0.25] < (0.25, 0.5] < (0.5, 0.75] < (0.75, 1.0]]

print(pd.cut(df.UHC/100, bins, labels=bins[1:]))
#0    0.75
#1    0.50
#2    0.75
#3    0.75
#4    0.50
#5    0.50
#6    0.75
#7    0.25
#Name: UHC, dtype: category
#Categories (4, float64): [0.25 < 0.50 < 0.75 < 1.00]
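
Note that the result above is still categorical, so numeric aggregations may not work on it directly. A minimal sketch of one way to continue, assuming the same toy df as above: cast the labels to float before aggregating. The second half shows a rounding-up alternative with np.ceil, which only applies when the bins have equal width. The names upper, counts, step and rounded are illustrative, not from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'UHC': [60, 30, 60, 70, 40, 50, 70, 10]})
bins = [0, 0.25, 0.50, 0.75, 1]

# Label each bin with its upper edge, then cast the categorical to plain floats
upper = pd.cut(df.UHC / 100, bins, labels=bins[1:]).astype(float)

# The values are now numeric, so they can be aggregated, e.g. counted per upper edge
counts = upper.value_counts().sort_index()

# Alternative for equal-width bins only: round up to the next multiple of the bin width
step = 0.25  # assumed bin width, matching the bins above
rounded = np.ceil(df.UHC / 100 / step) * step

print(upper.tolist())
print(rounded.tolist())
```

The two approaches agree on this data, but they can differ at the edges: pd.cut returns NaN for a value of exactly 0 (the first bin is open on the left), while the np.ceil version returns 0.0, and floating-point division may push values that sit very close to a bin edge into the neighbouring bin.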
