Group by count and sum based on a particular column in a pandas DataFrame, in separate columns along with other columns
Group by column values based on another column condition, along with sum and count
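I want to set my policy as a variable so that I can pass in whichever policy I want. For that policy, I want to group by show, count how many times each show appears, sum the views, and sum the revenue. How can I do this?
My table looks like this: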
policy   show      views  revenue
10 min   batman    100    10
10 min   batman    200    20
5 min    joker     100    10
5 min    joker     300    15
15 min   superman  500    30
My expected output for policy = '10 min' is:
show    count  total_views  total_revenue
batman  2      300          30
If I set policy = '5 min', my output should be:
show    count  total_views  total_revenue
joker   2      400          25
The same should work for any other value I assign to the variable policy.
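This might help you:
import pandas as pd

def set_policy(df, policy):
    # Keep only the rows for the requested policy
    filtered = df[df['policy'] == policy]
    # Assumes the policy covers a single show: only the first show name is reported
    t = {'show': filtered['show'].unique()[0], 'count': filtered.shape[0],
         'total_views': filtered['views'].sum(), 'total_revenue': filtered['revenue'].sum()}
    return pd.DataFrame([t])

result = set_policy(df, '10min')
Output: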
show count total_views total_revenue
0 batman 2 300 30
Update
Sample dataframe:
policy show views revenue
0 10min batman 100 10
1 10min batman 200 20
2 5min joker 100 10
3 5min joker 300 15
4 15min superman 500 30
5 10min superman 100 20
Code:
from collections import defaultdict
import pandas as pd

def set_policy(df, policy):
    t = defaultdict(list)
    filtered = df[df['policy'] == policy]
    # Aggregate each show within the selected policy separately
    for show, group in filtered.groupby('show'):
        t['show'].append(show)
        t['count'].append(group.shape[0])
        t['total_views'].append(group['views'].sum())
        t['total_revenue'].append(group['revenue'].sum())
    return pd.DataFrame(t)

result = set_policy(df, '10min')
Output:
show count total_views total_revenue
0 batman 2 300 30
1 superman 1 100 20
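As a possible alternative to the explicit loop, the same result can probably be obtained with pandas named aggregation. The sketch below is not part of the original answer: the helper name summarize_policy is hypothetical, the sample data is copied from the updated dataframe above, and it assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

# Sample data copied from the updated example dataframe above.
df = pd.DataFrame({
    "policy": ["10min", "10min", "5min", "5min", "15min", "10min"],
    "show": ["batman", "batman", "joker", "joker", "superman", "superman"],
    "views": [100, 200, 100, 300, 500, 100],
    "revenue": [10, 20, 10, 15, 30, 20],
})

def summarize_policy(df, policy):
    """Hypothetical helper: per-show row count and totals for one policy."""
    # Keep only the rows for the requested policy.
    filtered = df[df["policy"] == policy]
    # Named aggregation: each keyword becomes an output column,
    # defined as (source column, aggregation function).
    return (
        filtered.groupby("show")
        .agg(
            count=("views", "size"),
            total_views=("views", "sum"),
            total_revenue=("revenue", "sum"),
        )
        .reset_index()
    )

result = summarize_policy(df, "10min")
print(result)
```

Because the original df is left untouched, the helper can be called again with another policy value such as '5min'. The policy string has to match the values stored in the policy column exactly, so '10 min' and '10min' are treated as different policies.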