Pandas: Group by and aggregation with function
Assume I have a dataframe with the following values:
  name  start  end description
0   ag     20   30        None
1  bgb     21  111         'a'
2  cdd     31  101        None
3  bgb     17   19       'Bla'
4   ag     20   22        None
I want to group by name and then get the average of the (end - start) values.
I can use mean ( df.groupby(['name'], as_index=False).mean() ), but how can I pass the mean function the difference of two columns (end - start)?
You can subtract the columns first and then group the result by the column df['name']:
df1 = df['end'].sub(df['start']).groupby(df['name']).mean().reset_index(name='diff')
print(df1)
  name  diff
0   ag   6.0
1  bgb  46.0
2  cdd  70.0
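To check the result end to end, the sample frame from the question can be rebuilt by hand and run through the same expression. This is only a minimal sketch: the way the frame is constructed (plain Python strings and None in description) and the variable name diff_by_name are assumptions, not part of the original post.

```python
import pandas as pd

# Sample data from the question; the construction itself is an assumption
df = pd.DataFrame({
    "name": ["ag", "bgb", "cdd", "bgb", "ag"],
    "start": [20, 21, 31, 17, 20],
    "end": [30, 111, 101, 19, 22],
    "description": [None, "a", None, "Bla", None],
})

# Row-wise difference, grouped by the name column of the same frame
diff_by_name = (
    df["end"].sub(df["start"])   # end - start for every row
    .groupby(df["name"])         # group the differences by name
    .mean()                      # average difference per name
    .reset_index(name="diff")    # back to a two-column DataFrame
)
print(diff_by_name)
```

Passing the Series df['name'] as the grouping key works here because it shares its index with the Series of differences, so each difference is matched to the name from the same row.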
Another idea is to create a new column diff first:
df1 = (df.assign(diff=df['end'].sub(df['start']))
         .groupby('name', as_index=False)['diff']
         .mean())
print(df1)
  name  diff
0   ag   6.0
1  bgb  46.0
2  cdd  70.0
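One possible reason to prefer the helper-column variant is that several statistics can then be computed on the difference in a single pass. The sketch below uses pandas named aggregation for that; it goes beyond what was asked, and the output column names mean_diff, max_diff and rows are made up for illustration.

```python
import pandas as pd

# Sample data from the question (description column omitted for brevity)
df = pd.DataFrame({
    "name": ["ag", "bgb", "cdd", "bgb", "ag"],
    "start": [20, 21, 31, 17, 20],
    "end": [30, 111, 101, 19, 22],
})

summary = (
    df.assign(diff=df["end"] - df["start"])   # helper column: end - start
    .groupby("name", as_index=False)
    .agg(
        mean_diff=("diff", "mean"),   # average difference per name
        max_diff=("diff", "max"),     # largest difference per name
        rows=("diff", "count"),       # number of non-null differences per name
    )
)
print(summary)
```

If only the mean is needed, the shorter versions above are enough.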