How to aggregate a DataFrame and drop duplicates based on the values in two columns in Python Pandas?

Question

I have a DataFrame in Python Pandas like the one below:

ID  | COL1| COL2 | COL3
----|-----|------|-----
123 | XXX | 0    | 1
123 | XXX | 1    | 1
444 | ABC | 1    | 1
444 | ABC | 1    | 1 
555 | PPP | 0    | 0

I need to drop duplicates in the DataFrame above in the following way:

If COL2 or COL3 contains 1 at least once for an ID, that column should be 1 for that ID (no matter how many times 0 appears in that column).
The rest of the columns should still be in the output.
In COL1 there are no duplicates per ID.

So as a result I need output like the one below (I have many more columns, so the output needs to contain not only ID, COL2, COL3, but ID, COL1, COL2, COL3):

ID  | COL1| COL2 | COL3
----|-----|------|-----
123 | XXX | 1    | 1
444 | ABC | 1    | 1
555 | PPP | 0    | 0

How can I do that in Python Pandas?

Answer 1

Use a groupby.max:

out = df.groupby(['ID', 'COL1'], as_index=False).max()

Output:

    ID COL1  COL2  COL3
0  123  XXX     1     1
1  444  ABC     1     1
2  555  PPP     0     0
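
For reference, here is a minimal self-contained sketch that rebuilds the sample data from the question and runs the groupby above. It also shows one possible variant for the "many more columns" case: grouping on ID only, taking max for the flag columns and first for everything else. The names flag_cols, agg_spec and out_by_id are my own, and the variant assumes every non-flag column is constant within an ID, as the question states for COL1.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [123, 123, 444, 444, 555],
    "COL1": ["XXX", "XXX", "ABC", "ABC", "PPP"],
    "COL2": [0, 1, 1, 1, 0],
    "COL3": [1, 1, 1, 1, 0],
})

# Approach from the answer: group on the identifying columns, take the max of the rest
out = df.groupby(["ID", "COL1"], as_index=False).max()

# Variant (assumption: non-flag columns hold a single value per ID):
# max for the 0/1 flag columns, first value for every other column
flag_cols = ["COL2", "COL3"]  # hypothetical name for the 0/1 columns
agg_spec = {
    col: ("max" if col in flag_cols else "first")
    for col in df.columns
    if col != "ID"
}
out_by_id = df.groupby("ID", as_index=False).agg(agg_spec)

print(out)
print(out_by_id)
```

Because the flags are 0/1, max returns 1 as soon as a 1 appears anywhere in the group, which is exactly the "at least once" rule. The variant avoids listing every extra column in the groupby key, but it silently keeps the first value if a non-flag column does differ within an ID.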

Solution 1
0 Accepted 2022-10-07 18:23:46