Take top 5 values from dataframe and plot graph

I have a list of objects from which I created a DataFrame, grouped it by usage_start_date, and plotted a stacked bar graph. I'm wondering how I can extract the top 5 most expensive services by summed cost. For one date I can have 10 services, but I want the graph to show only the 5 most expensive ones. Here is the current code:

import random

import matplotlib.pyplot as plt
import pandas as pd

dates = ["2022-02-13T13:43:22+00:00", "2022-02-14T13:43:22+00:00", "2022-02-15T13:43:22+00:00", "2022-02-16T13:43:22+00:00"]

service_name = ["Example service 1", "Example service 2", "Example service 3", "Example service 4", "Example service 5", "Example service 6", "Example service 7", "Example service 8", "Example service 9", 'Example 10']
data = []
for i in range(0,50):
    tmp_data = {
        "usage_start_date": random.choice(dates),
        "cost": random.randrange(100),
        "service_name": random.choice(service_name)
        }
    data.append(tmp_data)
df = pd.DataFrame(data)
df['usage_start_date'] = pd.to_datetime(df['usage_start_date'], utc=True).dt.tz_convert(None).dt.date

df['usage_start_date'] = pd.to_datetime(df['usage_start_date'], format='%Y-%m-%d')
grouped_data = df.groupby(['usage_start_date', 'service_name'], as_index=False, group_keys=True).sum() #.nlargest(n=5, columns=['cost'])

df1 = grouped_data.sort_values(by=['usage_start_date', 'cost'], ascending=[False, False]) #.nlargest(n=5, columns=['cost'])

df1.pivot(index="usage_start_date", columns="service_name", values="cost").plot(kind="bar", stacked=True, width=0.2)
plt.legend(title="Service names")
plt.show()

I tried adding nlargest(n=5, columns=['cost']), but it's not working.

You can replace the plotting line with

df1.sort_values('cost', ascending = False).groupby('usage_start_date').head(5).pivot(index="usage_start_date", columns="service_name", values="cost").plot(kind="bar", stacked=True, width=0.2)

Here we first sort by cost in descending order, then group by usage_start_date and keep the first 5 rows of each group, i.e. the 5 most expensive services for each date.

The 5 most expensive services differ from date to date, and I did not relabel them, so the legend still lists all 10 services. However, each date has only 5 segments. The graph looks like this:

[Image: the resulting stacked bar chart]
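
The sort, group and head steps can also be checked without plotting. Below is a minimal sketch on a small hand-made sample. The helper name top_n_per_date and the sample values are hypothetical and not part of the original code, and the helper sums the cost itself rather than relying on the already grouped df1.

```python
import pandas as pd


def top_n_per_date(df, n=5):
    """Pivot to date x service, keeping only the n costliest services per date."""
    # Total cost for each (date, service) pair.
    totals = df.groupby(
        ["usage_start_date", "service_name"], as_index=False
    )["cost"].sum()
    # Sort by cost so that head(n) keeps the n largest rows within each date.
    top = (
        totals.sort_values("cost", ascending=False)
        .groupby("usage_start_date")
        .head(n)
    )
    # Services outside a date's top n end up as NaN in the pivot.
    return top.pivot(index="usage_start_date", columns="service_name", values="cost")


# Hypothetical sample: three services on the first date, two on the second.
sample = pd.DataFrame({
    "usage_start_date": ["2022-02-13"] * 4 + ["2022-02-14"] * 2,
    "service_name": ["A", "A", "B", "C", "A", "B"],
    "cost": [10, 5, 40, 1, 7, 3],
})
pivot = top_n_per_date(sample, n=2)
print(pivot)
```

Calling pivot.plot(kind="bar", stacked=True) on the result should give the same kind of stacked chart as above. A service that never makes any date's top n is absent from the pivot, so it does not appear in the legend either.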
