Pandas groupby to get a total of a column by each comma separated value in another column
From the sample dataset below, I want to draw a clustered bar chart showing total revenue by each feature for every year.
-------------------------------
Year Product Feature Revenue
-------------------------------
2012 P1 a,d,e 98
2016 P2 a,b,c 167
2014 P3 d,e 120
2014 P4 a,c 144
2016 P5 b,c,d 156
2016 P6 e,a 107
The data to draw the chart could be:
---------------------------------
Year | Feature_wise_total_revenue
---------------------------------
        a    b    c    d    e
2012   98    0    0   98   98
2014  144    0  144  120  120
2016  274  323  323  156  107
Please help me with the code to get the total revenue by each feature for every year from the sample dataset.
Try using the string accessor, .str, with split and explode, then groupby and sum with unstack:
df_out = (
    df.assign(Feature=df['Feature'].str.split(',')).explode('Feature')
      .groupby(['Year', 'Feature'])['Revenue'].sum().unstack(1).fillna(0)
)
Output:
Feature a b c d e
Year
2012 98.0 0.0 0.0 98.0 98.0
2014 144.0 0.0 144.0 120.0 120.0
2016 274.0 323.0 323.0 156.0 107.0
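To reproduce the result end to end, here is a minimal self-contained sketch. The DataFrame is typed in by hand from the question's sample table, which is only an assumption about how the data is loaded. It also swaps fillna(0) for unstack(fill_value=0), which should keep the totals as integers instead of floats. Note that explode needs pandas 0.25 or later.

```python
import pandas as pd

# Sample data copied from the question (hand-built here for illustration)
df = pd.DataFrame({
    "Year": [2012, 2016, 2014, 2014, 2016, 2016],
    "Product": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "Feature": ["a,d,e", "a,b,c", "d,e", "a,c", "b,c,d", "e,a"],
    "Revenue": [98, 167, 120, 144, 156, 107],
})

df_out = (
    df.assign(Feature=df["Feature"].str.split(","))  # "a,d,e" -> ["a", "d", "e"]
      .explode("Feature")                            # one row per (product, feature)
      .groupby(["Year", "Feature"])["Revenue"]
      .sum()                                         # total revenue per year and feature
      .unstack(fill_value=0)                         # features become columns, gaps become 0
)
print(df_out)
```

Each product's full revenue is counted once for every feature it lists, which matches the totals asked for in the question.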
Plotting:
df_out.plot.bar()
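If you want axis labels and a saved image, something along these lines may help. It is only a sketch: the totals are typed in from the output above so the snippet runs on its own, the Agg backend is assumed so it also works without a display, and the file name revenue_by_feature.png is made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Totals copied from the output above
df_out = pd.DataFrame(
    {"a": [98, 144, 274], "b": [0, 0, 323], "c": [0, 144, 323],
     "d": [98, 120, 156], "e": [98, 120, 107]},
    index=pd.Index([2012, 2014, 2016], name="Year"),
)
df_out.columns.name = "Feature"

# One cluster of bars per year, one bar per feature within each cluster
ax = df_out.plot.bar(rot=0)
ax.set_ylabel("Total revenue")
ax.legend(title="Feature")
plt.tight_layout()
plt.savefig("revenue_by_feature.png")  # hypothetical output file name
```

Because the years are the index and the features are the columns, plot.bar groups the bars by year, which gives the clustered layout.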