Remove when 2 columns are duplicated, but keep based on value of a third column (pandas)
I'm looking for a way to remove all rows that are duplicated on Barcode and Product No., but keep the duplicated rows that belong to the latest Input ID. Example below:

What I have:

| Input ID | Barcode | Product No. |
|---|---|---|
| 001 | 225 | 111 |
| 001 | 225 | 111 |
| 001 | 225 | 111 |
| 002 | 225 | 111 |
| 002 | 225 | 111 |
| 002 | 225 | 111 |
| 002 | 225 | 111 |
| 003 | 226 | 222 |
| 003 | 226 | 222 |
| 004 | 226 | 222 |
| 004 | 226 | 222 |
| 005 | 227 | 222 |
| 005 | 227 | 222 |
| 006 | 227 | 222 |
| 006 | 227 | 222 |

Desired output:

| Input ID | Barcode | Product No. |
|---|---|---|
| 002 | 225 | 111 |
| 002 | 225 | 111 |
| 002 | 225 | 111 |
| 002 | 225 | 111 |
| 004 | 226 | 222 |
| 004 | 226 | 222 |
| 006 | 227 | 222 |
| 006 | 227 | 222 |

You can see that where the Barcode and Product No. are the same, all rows except those with the highest Input ID have been removed, leaving only the duplicates that have the latest input.

Thanks, Oli

You could run `duplicated` to identify the last duplicate of each (Product No., Barcode) pair, then extend the selection to the whole Input ID group using `groupby` + `transform('any')`. This relies on the rows of the latest input coming last in the frame:

    df[((~df[['Product No.', 'Barcode']].duplicated(keep='last'))
        .groupby(df['Input ID']).transform('any'))]

Output:

        Input ID  Barcode  Product No.
    3          2      225          111
    4          2      225          111
    5          2      225          111
    6          2      225          111
    9          4      226          222
    10         4      226          222
    13         6      227          222
    14         6      227          222
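
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample data from the question (with Input ID stored as an integer, which is an assumption, since the question shows it zero-padded), applies the approach above, and compares it with a possible alternative that is not part of the original answer: keeping the rows whose Input ID equals the maximum within each (Barcode, Product No.) group. The names `is_last`, `by_position`, `latest` and `by_max` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question. Input ID is an integer here
# (an assumption; the question displays it as 001, 002, ...).
df = pd.DataFrame({
    "Input ID":    [1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "Barcode":     [225] * 7 + [226] * 4 + [227] * 4,
    "Product No.": [111] * 7 + [222] * 8,
})

# Approach from the answer: flag the last row of each
# (Product No., Barcode) pair, then keep every row that shares
# an Input ID with a flagged row. Depends on row order.
is_last = ~df[["Product No.", "Barcode"]].duplicated(keep="last")
by_position = df[is_last.groupby(df["Input ID"]).transform("any")]

# Alternative sketch: keep rows whose Input ID equals the largest
# Input ID within their (Barcode, Product No.) group.
# Does not depend on row order, but assumes "latest" means "largest".
latest = df.groupby(["Barcode", "Product No."])["Input ID"].transform("max")
by_max = df[df["Input ID"] == latest]

print(by_max)
```

On this sample both selections keep the same rows. The second variant may be the safer choice if the frame is not guaranteed to be sorted by Input ID, or if one Input ID can appear under several Barcode and Product No. combinations.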