Matplotlib error when plotting interval bins for discretized values from a pandas DataFrame

I get an error when I try to plot an interval. I binned my age column into intervals, and now I want to show on a chart how revenue compares across the age intervals.

My code:

import pandas as pd
import matplotlib.pyplot as plt

bins = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
clients['tranche'] = pd.cut(clients.age, bins)
clients.head()

    client_id   sales       revenue         birth       age     sex     tranche
0   c_1         39          558.18          1955        66      m       (60, 70]
1   c_10        58          1353.60         1956        65      m       (60, 70]
2   c_100       8           254.85          1992        29      m       (20, 30]
3   c_1000      125         2261.89         1966        55      f       (50, 60]
4   c_1001      102         1812.86         1982        39      m       (30, 40]
    
# Plot a scatter tranche x revenue
df = clients.groupby('tranche')[['revenue']].sum().reset_index().copy()
plt.scatter(df.tranche, df.revenue)
plt.show()

But this raises an error that ends with:

TypeError: float() argument must be a string or a number, not 'pandas._libs.interval.Interval'

How can I use an interval for plotting?

You'll need to add labels, because matplotlib cannot convert pandas Interval objects to numbers. (I tried converting them to strings with .astype(str), but that did not seem to work in 3.9.)

If you do the following, it will work just fine:

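# pd.cut needs one label per bin: 10 bin edges give 9 labels
labels = ['10-20', '20-30', '30-40', '40-50', '50-60', '60-70', '70-80', '80-90', '90-100']
clients['tranche'] = pd.cut(clients.age, bins, labels=labels)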
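
For reference, here is a minimal end-to-end sketch of the labelled approach, using the five rows shown in the question as sample data. Building the labels from bins with a comprehension is my own shortcut rather than something the approach requires, and the observed=False argument, the Agg backend line, and the output file name are likewise assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Sample data: the five rows shown in the question (only the columns used here).
clients = pd.DataFrame({
    "client_id": ["c_1", "c_10", "c_100", "c_1000", "c_1001"],
    "revenue": [558.18, 1353.60, 254.85, 2261.89, 1812.86],
    "age": [66, 65, 29, 55, 39],
})

bins = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
# One string label per bin, so len(labels) == len(bins) - 1.
labels = [f"{lo}-{hi}" for lo, hi in zip(bins[:-1], bins[1:])]
clients["tranche"] = pd.cut(clients["age"], bins, labels=labels)

# observed=False keeps the empty age brackets in the result (their sum is 0).
df = clients.groupby("tranche", observed=False)[["revenue"]].sum().reset_index()

# Pass plain strings so matplotlib draws a categorical x-axis.
fig, ax = plt.subplots()
ax.scatter(df["tranche"].astype(str).tolist(), df["revenue"])
ax.set_xlabel("age bracket")
ax.set_ylabel("total revenue")
fig.savefig("revenue_by_tranche.png")  # hypothetical file name
```

When running interactively, remove the matplotlib.use line and call plt.show() instead of saving the figure.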
