Pandas groupby to get a total of a column by each comma separated value in another column

From the sample dataset below, I want to draw a clustered bar chart showing total revenue by each feature for every year.

-------------------------------
Year  Product Feature   Revenue
-------------------------------
2012  P1      a,d,e     98
2016  P2      a,b,c     167
2014  P3      d,e       120
2014  P4      a,c       144
2016  P5      b,c,d     156
2016  P6      e,a       107

The data to draw the chart could be:

---------------------------------
Year | Feature_wise_total_revenue
---------------------------------
       a    b    c     d     e
2012   98   0    0     98    98
2014   144  0    144   120   120
2016   274  323  323   156   107

Please help with the code to get the total revenue by each feature for every year from the sample dataset.

Try using the string accessor, .str, to split the Feature column on commas, then explode it so each feature gets its own row. Then groupby and sum, and pivot the features into columns with unstack:

df_out = df.assign(Feature=df['Feature'].str.split(',')).explode('Feature')\
  .groupby(['Year','Feature'])['Revenue'].sum().unstack(1).fillna(0)

Output:

Feature      a      b      c      d      e
Year                                      
2012      98.0    0.0    0.0   98.0   98.0
2014     144.0    0.0  144.0  120.0  120.0
2016     274.0  323.0  323.0  156.0  107.0
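
For anyone who wants to reproduce the table above from scratch, here is a minimal self-contained sketch. The DataFrame is typed in by hand from the sample in the question, and the intermediate name `exploded` is only an illustrative choice, not something from the original answer.

```python
import pandas as pd

# Sample data copied by hand from the question (an assumption about how df is built).
df = pd.DataFrame({
    "Year": [2012, 2016, 2014, 2014, 2016, 2016],
    "Product": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "Feature": ["a,d,e", "a,b,c", "d,e", "a,c", "b,c,d", "e,a"],
    "Revenue": [98, 167, 120, 144, 156, 107],
})

# Split each comma-separated string into a list, then give every feature its own row.
exploded = df.assign(Feature=df["Feature"].str.split(",")).explode("Feature")

# Sum revenue per (Year, Feature), then pivot the features into columns.
# Missing (Year, Feature) pairs become NaN after unstack, so fill them with 0.
df_out = (
    exploded.groupby(["Year", "Feature"])["Revenue"]
    .sum()
    .unstack("Feature")
    .fillna(0)
)

print(df_out)
```

The result has one row per year and one column per feature, which is the shape that plot.bar() draws as clustered bars. The values come out as floats because unstack introduces NaN before fillna replaces it.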

Plotting:

df_out.plot.bar()

