How to group by multiple columns using Python and pandas

I have a dataframe with 4 columns. My question is: how do I group by 3 of the columns and plot the result as a bar chart?

How do I plot the result of the groupby?

Code:

import pandas as pd
import plotly.offline
import plotly.express as px
import plotly.graph_objs as go

df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342, 1232, 13456, 123244, 1235],
    "location": ["loc1", "loc2", "loc3", "loc1", "loc2", "loc2", "loc3", "loc2", "loc1"],
    "category": ["cat1", "cat3", "cat1", "cat3", "cat3", "cat2", "cat2", "cat3", "cat1"],
    "date": ["2021/04", "2021/05", "2021/04", "2021/05", "2021/05", "2021/04", "2021/03", "2021/05", "2021/04"],
})
# Group by date, category, and location, and count the rows in each group
df_group = df.groupby(["date", "category", "location"]).size().reset_index(name="count")
df_group

Using:

df_group = (
    df.groupby(["date", "category", "location"])
        .size()
        .reset_index(name='count')
)
      date category location  count
0  2021/03     cat2     loc3      1
1  2021/04     cat1     loc1      2
2  2021/04     cat1     loc3      1
3  2021/04     cat2     loc2      1
4  2021/05     cat3     loc1      1
5  2021/05     cat3     loc2      3
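
To make the two steps in that chain easier to follow, here is a minimal sketch that splits them apart. It reuses the question's data; the names keys and sizes are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342, 1232, 13456, 123244, 1235],
    "location": ["loc1", "loc2", "loc3", "loc1", "loc2", "loc2", "loc3", "loc2", "loc1"],
    "category": ["cat1", "cat3", "cat1", "cat3", "cat3", "cat2", "cat2", "cat3", "cat1"],
    "date": ["2021/04", "2021/05", "2021/04", "2021/05", "2021/05", "2021/04", "2021/03", "2021/05", "2021/04"],
})

keys = ["date", "category", "location"]  # illustrative name

# Step 1: size() returns a Series whose index is a MultiIndex built from the
# three grouping columns; each value is the number of rows in that group.
sizes = df.groupby(keys).size()

# Step 2: reset_index(name="count") moves the index levels back into ordinary
# columns and names the value column "count", giving a flat table.
df_group = sizes.reset_index(name="count")

print(sizes.index.names)
print(df_group)
```

The flat (long) shape matters because plotly.express expects one row per bar segment, with the x, y, and color values each in their own column.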

Try creating a color indicator column from category and location, plot, and then drop the column:

import plotly.express as px

df_group['cat_location'] = df_group['category'] + '_' + df_group['location']
fig = px.bar(df_group, x="date", y="count", color='cat_location')
df_group = df_group.drop('cat_location', axis=1)
fig.show()

Or, without adding a column to df_group:

fig = px.bar(df_group,
             x="date",
             y="count",
             color=df_group['category'] + '_' + df_group['location'])
fig.show()

[Image: plot]
