How to drop duplicates in one column based on the values in 2 other columns of a DataFrame in Python Pandas?

Question

I have a DataFrame in Python Pandas like the one below:

Data types:

TG_B - int

ID  | TYPE | TG_A | TG_B
----|------|------|-----
111 | A    | 1    | 0
111 | B    | 1    | 0
222 | B    | 1    | 0
222 | A    | 1    | 0
333 | B    | 0    | 1
333 | A    | 0    | 1

I need to drop duplicates in the above DataFrame according to this rule:

If a value in ID is duplicated in my DataFrame, drop the rows where TYPE = B and TG_A = 1, or where TYPE = A and TG_B = 1.

As a result, I need something like this:

ID  | TYPE | TG_A | TG_B
----|------|------|-----
111 | A    | 1    | 0
222 | A    | 1    | 0
333 | B    | 0    | 1

How can I do that in Python Pandas?

Answer 1

You can use two boolean masks and groupby.idxmax to get, for each ID, the first row that matches neither condition:

m1 = df['TYPE'].eq('B') & df['TG_A'].eq(1)
m2 = df['TYPE'].eq('A') & df['TG_B'].eq(1)

out = df.loc[(~(m1|m2)).groupby(df['ID']).idxmax()]

Output:

    ID TYPE  TG_A  TG_B
0  111    A     1     0
3  222    A     1     0
4  333    B     0     1
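
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample from the question and applies the same masks. The names `keep`, `first_keep_idx`, `dup` and `out_all` are illustrative and not part of the original answer. The last part is a variant, not the answer's method: it assumes you would rather keep every non-matching row of a duplicated ID than only the first one.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "ID":   [111, 111, 222, 222, 333, 333],
    "TYPE": ["A", "B", "B", "A", "B", "A"],
    "TG_A": [1, 1, 1, 1, 0, 0],
    "TG_B": [0, 0, 0, 0, 1, 1],
})

# Rows the question wants dropped
m1 = df["TYPE"].eq("B") & df["TG_A"].eq(1)
m2 = df["TYPE"].eq("A") & df["TG_B"].eq(1)

# True for rows that match neither drop condition
keep = ~(m1 | m2)

# Within each ID, idxmax returns the index label of the first True.
# If a group has no True at all, it falls back to the group's first row,
# so an ID is never removed entirely.
first_keep_idx = keep.groupby(df["ID"]).idxmax()
out = df.loc[first_keep_idx]
print(out)

# Variant (assumption, not from the answer): drop matching rows only
# when the ID occurs more than once, and keep all remaining rows.
dup = df["ID"].duplicated(keep=False)
out_all = df[~(dup & (m1 | m2))]
print(out_all)
```

On the sample data both approaches should select the same three rows. They can differ when an ID has several rows that match neither condition: the idxmax version returns exactly one row per ID, while the variant keeps all of them.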

Answer 2

df[df['TYPE'].eq('A').eq(df['TG_A'])]

Result:

    ID  TYPE    TG_A    TG_B
0   111 A       1       0
3   222 A       1       0
4   333 B       0       1
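
The one-liner above comes without an explanation, so here is one possible reading, spelled out step by step. It compares the boolean "TYPE is A" with TG_A element by element, which works because True equals 1 and False equals 0. Note the assumptions this seems to rely on: TG_A holds only 0 or 1, TG_B is always the opposite of TG_A (as in the sample), and every ID is duplicated. The `extra` frame below is a hypothetical row added only to illustrate the last point.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "ID":   [111, 111, 222, 222, 333, 333],
    "TYPE": ["A", "B", "B", "A", "B", "A"],
    "TG_A": [1, 1, 1, 1, 0, 0],
    "TG_B": [0, 0, 0, 0, 1, 1],
})

# Boolean Series: True where TYPE is "A"
is_a = df["TYPE"].eq("A")

# Keep a row when (TYPE is A) agrees with TG_A:
#   TYPE A with TG_A 1 -> True == 1 -> keep
#   TYPE B with TG_A 0 -> False == 0 -> keep
mask = is_a.eq(df["TG_A"])
out = df[mask]
print(out)

# Hypothetical non-duplicated ID: the rule in the question would keep it
# (the ID is not duplicated), but this filter ignores ID and drops it.
extra = pd.DataFrame({"ID": [444], "TYPE": ["B"], "TG_A": [1], "TG_B": [0]})
extra_out = extra[extra["TYPE"].eq("A").eq(extra["TG_A"])]
print(len(extra_out))
```

So this filter is compact and matches the expected result for the sample, but it never looks at ID or TG_B. If the real data can contain unique IDs or rows where TG_A and TG_B are not complementary, the mask-based approach from Answer 1 is likely the safer choice.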