Pandas groupby: sum one column for each comma-separated value in another column

Question

From the sample dataset below, I want to draw a clustered bar chart showing the total revenue for each feature in every year.

-------------------------------
Year  Product Feature   Revenue
-------------------------------
2012  P1      a,d,e     98
2016  P2      a,b,c     167
2014  P3      d,e       120
2014  P4      a,c       144
2016  P5      b,c,d     156
2016  P6      e,a       107

The data to draw the chart could be:

---------------------------------
Year | Feature_wise_total_revenue
---------------------------------
       a    b    c     d     e
2012   98   0    0     98    98
2014   144  0    144   120   120
2016   274  323  323   156   107

Please help me with code to get the total revenue for each feature in every year from the sample dataset.

Answer 1

Try using the string accessor, .str, with split, followed by explode. Then groupby and sum with unstack:

# one row per (product, feature) pair, then sum revenue per year and feature
df_out = df.assign(Feature=df['Feature'].str.split(',')).explode('Feature')\
  .groupby(['Year','Feature'])['Revenue'].sum().unstack(1).fillna(0)

Output:

Feature      a      b      c      d      e
Year                                      
2012      98.0    0.0    0.0   98.0   98.0
2014     144.0    0.0  144.0  120.0  120.0
2016     274.0  323.0  323.0  156.0  107.0
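
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same split, explode, groupby and unstack steps. Building the DataFrame inline and the intermediate name long_df are my assumptions for illustration; the real data may be loaded differently.

```python
import pandas as pd

# Sample rows copied from the question; constructing them inline is an
# assumption about how the data is loaded.
df = pd.DataFrame({
    "Year": [2012, 2016, 2014, 2014, 2016, 2016],
    "Product": ["P1", "P2", "P3", "P4", "P5", "P6"],
    "Feature": ["a,d,e", "a,b,c", "d,e", "a,c", "b,c,d", "e,a"],
    "Revenue": [98, 167, 120, 144, 156, 107],
})

# Split each comma-separated string into a list, then give every feature
# its own row (the Revenue value is repeated on each of those rows).
long_df = df.assign(Feature=df["Feature"].str.split(",")).explode("Feature")

# Sum revenue per (Year, Feature), move features into columns, and
# replace the missing year/feature combinations with 0.
df_out = (
    long_df.groupby(["Year", "Feature"])["Revenue"]
    .sum()
    .unstack("Feature")
    .fillna(0)
)

print(df_out)
```

Note that a product's full revenue is counted once under each of its features, so the feature columns do not add up to the yearly total.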

Plotting:

df_out.plot.bar()

Answer 1: 4, accepted, 2020-04-29 00:28:32