
How to group by multiple columns using Python and pandas

I have a dataframe with 4 columns. How can I group by 3 of those columns, count the rows in each group, and plot the result of the groupby as a bar chart?

code:

import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342, 1232, 13456, 123244, 1235],
    "location": ["loc1", "loc2", "loc3", "loc1", "loc2", "loc2", "loc3", "loc2", "loc1"],
    "category": ["cat1", "cat3", "cat1", "cat3", "cat3", "cat2", "cat2", "cat3", "cat1"],
    "date": ["2021/04", "2021/05", "2021/04", "2021/05", "2021/05", "2021/04", "2021/03", "2021/05", "2021/04"],
})

# Group by date, category and location, and count the rows in each group
df_group = df.groupby(["date", "category", "location"]).size().reset_index(name="count")
df_group

Starting from the grouped counts:

df_group = (
    df.groupby(["date", "category", "location"])
        .size()
        .reset_index(name='count')
)
      date category location  count
0  2021/03     cat2     loc3      1
1  2021/04     cat1     loc1      2
2  2021/04     cat1     loc3      1
3  2021/04     cat2     loc2      1
4  2021/05     cat3     loc1      1
5  2021/05     cat3     loc2      3
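
As a cross-check on the counts above, the same table can probably be reached with DataFrame.value_counts on the three key columns. This is only a sketch: the names keys, by_size and by_value_counts are introduced here for illustration, and it assumes pandas 1.1 or later, where DataFrame.value_counts is available.

```python
import pandas as pd

df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342, 1232, 13456, 123244, 1235],
    "location": ["loc1", "loc2", "loc3", "loc1", "loc2", "loc2", "loc3", "loc2", "loc1"],
    "category": ["cat1", "cat3", "cat1", "cat3", "cat3", "cat2", "cat2", "cat3", "cat1"],
    "date": ["2021/04", "2021/05", "2021/04", "2021/05", "2021/05", "2021/04", "2021/03", "2021/05", "2021/04"],
})

# Hypothetical helper name for the three grouping columns
keys = ["date", "category", "location"]

# Approach from the answer: group sizes turned into a "count" column
by_size = df.groupby(keys).size().reset_index(name="count")

# Alternative: value_counts returns a Series sorted by count (descending),
# so rename it, flatten the index, and re-sort by the keys to match by_size
by_value_counts = (
    df.value_counts(subset=keys)
      .rename("count")
      .reset_index()
      .sort_values(keys)
      .reset_index(drop=True)
)

print(by_size)
print(by_value_counts)
```

Both frames should list the same six groups; groupby(...).size() keeps the rows in key order without the extra sort, which is likely why it reads more naturally here.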

Try creating a color indicator column that combines category and location, plot with it, and then drop it:

import plotly.express as px

df_group['cat_location'] = df_group['category'] + '_' + df_group['location']
fig = px.bar(df_group, x="date", y="count", color='cat_location')
df_group = df_group.drop('cat_location', axis=1)
fig.show()

Or, without adding a column to df_group:

fig = px.bar(df_group,
             x="date",
             y="count",
             color=df_group['category'] + '_' + df_group['location'])
fig.show()

