python pandas reattach column after aggregation
My DataFrame looks like this:
exams = pd.DataFrame({'id1':['1x', '1x','2x','3x','3x'], 'id2':['a','a','b','a','a'],'data':[1,2,3,4,5]})
id1 id2 data
0 1x a 1
1 1x a 2
2 2x b 3
3 3x a 4
4 3x a 5
Then I aggregate it:
exams_agg = exams.groupby('id1')[['data']].agg('mean')
Then exams_agg looks like:
data
id1
1x 1.5
2x 3.0
3x 4.5
I want to reattach the id2 column to exams_agg, so I was thinking about creating a lookup table:
lookup = exams[['id1', 'id2']]
exams_agg = pd.merge(exams_agg, lookup, left_index=True, right_on='id1')
But since lookup contains duplicate pairs of ids, exams_agg ends up with duplicate rows as well. What is a good way to create the following?
data id2
id1
1x 1.5 a
2x 3.0 b
3x 4.5 a
If a unique id1 always corresponds to the same id2, you can simply add id2 to your groupby:
In [5]: exams.groupby(['id1', 'id2']).agg('mean')
Out[5]:
data
id1 id2
1x a 1.5
2x b 3.0
3x a 4.5
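Note that grouping on both keys leaves id2 in the index rather than as a regular column. Here is a minimal sketch of three ways to get the exact layout asked for (id1 as the index, with data and id2 as columns), assuming each id1 maps to exactly one id2. The variable names by_both, by_first and joined are just illustrative, and the named-aggregation form assumes pandas 0.25 or later.

```python
import pandas as pd

exams = pd.DataFrame({
    'id1': ['1x', '1x', '2x', '3x', '3x'],
    'id2': ['a', 'a', 'b', 'a', 'a'],
    'data': [1, 2, 3, 4, 5],
})

# Option 1: group on both keys, then move id2 out of the index
# and put the columns in the requested order.
by_both = (
    exams.groupby(['id1', 'id2'])['data']
    .mean()
    .reset_index(level='id2')[['data', 'id2']]
)

# Option 2: group on id1 only and carry id2 along by taking the
# first value seen in each group (named aggregation).
by_first = exams.groupby('id1').agg(
    data=('data', 'mean'),
    id2=('id2', 'first'),
)

# Option 3: keep the lookup-table idea, but drop the duplicate
# (id1, id2) pairs before joining on the index.
lookup = exams[['id1', 'id2']].drop_duplicates().set_index('id1')
joined = exams.groupby('id1')[['data']].mean().join(lookup)

print(by_both)
```

If an id1 could ever map to more than one id2, these options behave differently: option 1 would produce one row per (id1, id2) pair, option 2 would silently keep the first id2, and option 3 would again produce duplicate rows, so it is worth checking that assumption on your data first.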