How to group by multiple columns using Python and pandas
I have a dataframe with 4 columns. My question is: how do I group by 3 of the columns and plot a bar chart?
How do I plot the result of the groupby?
import pandas as pd
import plotly.offline
import plotly.express as px
import plotly.graph_objs as go
df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342, 1232, 13456, 123244, 1235],
    "location": ["loc1", "loc2", "loc3", "loc1", "loc2", "loc2", "loc3", "loc2", "loc1"],
    "category": ["cat1", "cat3", "cat1", "cat3", "cat3", "cat2", "cat2", "cat3", "cat1"],
    "date": ["2021/04", "2021/05", "2021/04", "2021/05", "2021/05", "2021/04", "2021/03", "2021/05", "2021/04"],
})
# group by date with category and location AND COUNT THE VALUES
df_group = df.groupby(["date","category","location"]).size().reset_index(name="count")
df_group
Using:
df_group = (
    df.groupby(["date", "category", "location"])
    .size()
    .reset_index(name='count')
)
      date category location  count
0  2021/03     cat2     loc3      1
1  2021/04     cat1     loc1      2
2  2021/04     cat1     loc3      1
3  2021/04     cat2     loc2      1
4  2021/05     cat3     loc1      1
5  2021/05     cat3     loc2      3
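The table above is in long format, with one row per (date, category, location) combination. As a rough sketch, the same counts can also be reshaped into a wide table with `unstack`, which may be easier to read by eye. The names `keys` and `df_wide` are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "source_number": [11199, 11328, 11287, 32345, 12342, 1232, 13456, 123244, 1235],
    "location": ["loc1", "loc2", "loc3", "loc1", "loc2", "loc2", "loc3", "loc2", "loc1"],
    "category": ["cat1", "cat3", "cat1", "cat3", "cat3", "cat2", "cat2", "cat3", "cat1"],
    "date": ["2021/04", "2021/05", "2021/04", "2021/05", "2021/05",
             "2021/04", "2021/03", "2021/05", "2021/04"],
})

keys = ["date", "category", "location"]  # illustrative name for the grouping columns

# Long format: one row per observed combination, with its row count.
df_group = df.groupby(keys).size().reset_index(name="count")

# Wide format (illustrative): one row per date, one column per
# (category, location) pair; missing combinations are filled with 0.
df_wide = df.groupby(keys).size().unstack(["category", "location"], fill_value=0)

print(df_group)
print(df_wide)
```

Because `size()` counts rows, the counts in either layout should add up to the number of rows in `df`.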
Try creating a color indicator from category and location, plot, and then drop the helper column:
import plotly.express as px
df_group['cat_location'] = df_group['category'] + '_' + df_group['location']
fig = px.bar(df_group, x="date", y="count", color='cat_location')
df_group = df_group.drop('cat_location', axis=1)
fig.show()
Or without adding a column to df_group:
fig = px.bar(df_group,
             x="date",
             y="count",
             color=df_group['category'] + '_' + df_group['location'])
fig.show()