Pandas merge by keeping the first element of one column and the last of another column
I have a dataframe with different values and IDs, some of which are shared between rows.
import pandas as pd

df = pd.DataFrame({'A': ['chr1','chr1','chr1','chr1','chr1','chr2'],
                   'B': [700,750,800,850,900,200],
                   'C': [750,800,850,900,950,250],
                   'D': ['id_1','id_1','id_1','id_1','id_1','id_2']})
What I'm trying to do is keep the lowest value of B and the highest value of C for rows that share the same A and D.
The output should look like this:
A B C D
0 chr1 700 950 id_1
1 chr2 200 250 id_2
I tried to use
df.groupby('D').agg(['first', 'last'])
But it's not what I want...
Use GroupBy.agg with a dictionary mapping column names to aggregate functions:
df1 = (df.groupby('D', as_index=False)
.agg({'A':'first', 'B':'first', 'C':'last'})
[['A','B','C','D']])
print(df1)
A B C D
0 chr1 700 950 id_1
1 chr2 200 250 id_2
Pass a dict of column names and functions to agg, grouping by both A and D:
df.groupby(['A','D'],as_index=False).agg({'B':'first','C':'last'}).reindex(columns=df.columns)
A B C D
0 chr1 700 950 id_1
1 chr2 200 250 id_2
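A note on the answers above: 'first' and 'last' pick values by row position, so they return the lowest B and highest C only because the sample rows happen to be sorted. If the row order is not guaranteed, one possible variant (a sketch, not part of the original answers) is to aggregate with 'min' and 'max' instead. The shuffle below and the names `shuffled` and `df_minmax` are illustrative assumptions, used only to show that the result no longer depends on row order.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['chr1', 'chr1', 'chr1', 'chr1', 'chr1', 'chr2'],
    'B': [700, 750, 800, 850, 900, 200],
    'C': [750, 800, 850, 900, 950, 250],
    'D': ['id_1', 'id_1', 'id_1', 'id_1', 'id_1', 'id_2'],
})

# Shuffle the rows (illustrative only) to simulate unsorted input.
shuffled = df.sample(frac=1, random_state=0)

# 'min' and 'max' look at the values themselves rather than row position,
# so the result is the same for any row order.
df_minmax = (shuffled.groupby(['A', 'D'], as_index=False)
             .agg({'B': 'min', 'C': 'max'})
             [['A', 'B', 'C', 'D']])  # restore the original column order
print(df_minmax)
```

On the sample data this gives the same two rows as the answers above (700/950 for id_1 and 200/250 for id_2).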