Pandas: merging rows by keeping the first element of one column and the last element of another

Question

I have a dataframe in which several rows can share the same value and ID.

    import pandas as pd

    df = pd.DataFrame({'A': ['chr1','chr1','chr1','chr1','chr1','chr2'],
                       'B': [700,750,800,850,900,200],
                       'C': [750,800,850,900,950,250],
                       'D': ['id_1','id_1','id_1','id_1','id_1','id_2']})

What I'm trying to do is keep the lowest value of B and the highest value of C for rows with identical values of A and D.

The output should look like this:

          A    B    C     D
    0  chr1  700  950  id_1
    1  chr2  200  250  id_2

I tried to use:

    df.groupby('D').agg(['first', 'last'])

But that's not what I want.

Answer 1

Use GroupBy.agg with a dictionary that maps column names to aggregate functions:

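# 'first' and 'last' take the first and last row of each group,
# so this relies on the rows already being ordered within each ID.
df1 = (df.groupby('D', as_index=False)
         .agg({'A':'first', 'B':'first', 'C':'last'})
         [['A','B','C','D']])  # restore the original column order
print(df1)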
      A    B    C     D
0  chr1  700  950  id_1
1  chr2  200  250  id_2
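
Since the question asks for the lowest B and the highest C, 'first' and 'last' only give that result when the rows are already sorted within each group. A minimal sketch of an order-independent variant, assuming 'min' and 'max' are what is wanted (the shuffled frame and the names `shuffled` and `result` are only there for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': ['chr1','chr1','chr1','chr1','chr1','chr2'],
                   'B': [700,750,800,850,900,200],
                   'C': [750,800,850,900,950,250],
                   'D': ['id_1','id_1','id_1','id_1','id_1','id_2']})

# Shuffle the rows to show that the result does not depend on row order.
shuffled = df.sample(frac=1, random_state=0)

# Group on both key columns, then take the smallest B and the largest C.
result = (shuffled.groupby(['A', 'D'], as_index=False)
                  .agg({'B': 'min', 'C': 'max'})
                  [['A', 'B', 'C', 'D']])  # restore the original column order
print(result)
```

On the sample data this should give the same two rows as the 'first'/'last' version.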

Answer 2

Pass a dict of column names and functions to agg:

(df.groupby(['A','D'], as_index=False)
   .agg({'B':'first', 'C':'last'})
   .reindex(columns=df.columns))
      A    B    C     D
0  chr1  700  950  id_1
1  chr2  200  250  id_2
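
The same aggregation can also be written with keyword-style "named aggregation" instead of a dict. This is a sketch assuming pandas 0.25 or later, where that syntax is available; the variable name `out` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': ['chr1','chr1','chr1','chr1','chr1','chr2'],
                   'B': [700,750,800,850,900,200],
                   'C': [750,800,850,900,950,250],
                   'D': ['id_1','id_1','id_1','id_1','id_1','id_2']})

# Each keyword is an output column: name=(source column, aggregate function).
out = (df.groupby(['A', 'D'])
         .agg(B=('B', 'first'), C=('C', 'last'))
         .reset_index()                 # turn the group keys back into columns
         .reindex(columns=df.columns))  # restore the original column order
print(out)
```

As with the dict form, 'first' and 'last' depend on the row order inside each group.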

Answer 1: score 4, accepted, 2020-01-07 15:07:56

Answer 2: score 1, 2020-01-07 15:08:41
