I have a question regarding the bins in pandas. My code so far looks like this:
africa_uhc = pd.cut(africa[("Universal health coverage (UHC) service coverage index")]/100, [0, 0.25, 0.50, 0.75, 1])
which prints out
29 (0.5, 0.75]
36 (0.25, 0.5]
111 (0.5, 0.75]
112 (0.5, 0.75]
118 (0.25, 0.5]
140 (0.25, 0.5]
141 (0.5, 0.75]
142 (0, 0.25]
Name: Universal health coverage (UHC) service coverage index, dtype: category
Categories (4, object): [(0, 0.25] < (0.25, 0.5] < (0.5, 0.75] < (0.75, 1]]
I want to pull the numbers out of the second column so I can do some more aggregations. Is there a way to do this? Or is there a way to split the values into bins by rounding them up? E.g., index 29 would have 0.75 rather than (0.5, 0.75]. Thanks!
You can use the labels argument of pd.cut to control what is returned. Passing bins[1:] labels each value with the right edge of its bin.
import pandas as pd
df = pd.DataFrame({'UHC': [60,30,60,70,40,50,70,10]})
bins = [0, 0.25, 0.50, 0.75, 1]
print(pd.cut(df.UHC/100, bins))
#0 (0.5, 0.75]
#1 (0.25, 0.5]
#2 (0.5, 0.75]
#3 (0.5, 0.75]
#4 (0.25, 0.5]
#5 (0.25, 0.5]
#6 (0.5, 0.75]
#7 (0.0, 0.25]
#Name: UHC, dtype: category
#Categories (4, interval[float64]): [(0.0, 0.25] < (0.25, 0.5] < (0.5, 0.75] < (0.75, 1.0]]
print(pd.cut(df.UHC/100, bins, labels=bins[1:]))
#0 0.75
#1 0.50
#2 0.75
#3 0.75
#4 0.50
#5 0.50
#6 0.75
#7 0.25
#Name: UHC, dtype: category
#Categories (4, float64): [0.25 < 0.50 < 0.75 < 1.00]
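Note that the labelled result is still categorical, so if you want to aggregate on it numerically you will probably need to cast it first. A minimal sketch, assuming the same toy data as above (the column names UHC_bin and UHC_upper are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'UHC': [60, 30, 60, 70, 40, 50, 70, 10]})
bins = [0, 0.25, 0.50, 0.75, 1]

# Label each value with the right edge of its bin (hypothetical column name)
df['UHC_bin'] = pd.cut(df['UHC'] / 100, bins, labels=bins[1:])

# The labels are categories; cast to float before doing arithmetic on them
df['UHC_upper'] = df['UHC_bin'].astype(float)

# Example aggregations: rows per bin edge, and the mean of the bin edges
counts = df.groupby('UHC_upper')['UHC'].count()
mean_upper = df['UHC_upper'].mean()

print(counts)
print(mean_upper)
```

With this data, grouping on the float column only yields the bin edges that actually occur, so 1.0 does not appear in counts.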