Remove duplicates from a pandas DataFrame based on other column values

Question

The DataFrame I am using is as below:

Name    NoOfTrans   Avg_pass_time    Cons.Error            RunCounts
Jan     0                            Failed:abcd           4
Jan                                                        4
Jan                                                        4
Jan                                                        4
May     2                            Failed:abcFailed:cde  5
May                                                        5
May                  1200                                  5
May                  1200                                  5
May                                                        5

I need to remove the duplicates in the "Name", "Avg_pass_time" and "RunCounts" columns, grouped by the "Name" column, so that the output is as below:

Name    NoOfTrans   Avg_pass_time    Cons.Error            RunCounts
Jan     0                            Failed:abcd           4
May     2           1200             Failed:abcFailed:cde  5

Any guidance would be useful.

Answer 1

You can select a subset of columns that will be used to identify the duplicates:

df = df.drop_duplicates(subset=['Name','Avg_pass_time','RunCounts'])

Untested, but this should work.
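As a quick check of what this call does, here is a minimal sketch that rebuilds the sample data from the question. It assumes the blank cells are empty strings and that NoOfTrans and Avg_pass_time hold strings (the question does not say either way), so the frame construction is a reconstruction, not the asker's actual code.

```python
import pandas as pd

# Reconstruction of the question's data.
# ASSUMPTION: blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "Name": ["Jan"] * 4 + ["May"] * 5,
    "NoOfTrans": ["0", "", "", "", "2", "", "", "", ""],
    "Avg_pass_time": ["", "", "", "", "", "", "1200", "1200", ""],
    "Cons.Error": ["Failed:abcd", "", "", "", "Failed:abcFailed:cde", "", "", "", ""],
    "RunCounts": [4, 4, 4, 4, 5, 5, 5, 5, 5],
})

# Keep the first row for each distinct (Name, Avg_pass_time, RunCounts) combination.
deduped = df.drop_duplicates(subset=["Name", "Avg_pass_time", "RunCounts"])
print(deduped)
```

Under those assumptions, May keeps two rows (one with a blank Avg_pass_time and one with 1200), because drop_duplicates treats the blank and 1200 as different values. Getting a single row per Name requires combining the values within each group instead.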

Answer 2

If each group contains only empty strings or duplicated values, use:

import numpy as np
df = df.replace('', np.nan).groupby('Name', as_index=False).first().fillna('')
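To see why this works: once the empty strings become NaN, groupby(...).first() returns the first non-null value of each column within every group, and fillna('') restores the blanks for columns that had no value at all. Below is a minimal sketch on data rebuilt from the question, assuming the blank cells are empty strings and that NoOfTrans and Avg_pass_time hold strings (the question does not specify).

```python
import numpy as np
import pandas as pd

# Reconstruction of the question's data.
# ASSUMPTION: blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "Name": ["Jan"] * 4 + ["May"] * 5,
    "NoOfTrans": ["0", "", "", "", "2", "", "", "", ""],
    "Avg_pass_time": ["", "", "", "", "", "", "1200", "1200", ""],
    "Cons.Error": ["Failed:abcd", "", "", "", "Failed:abcFailed:cde", "", "", "", ""],
    "RunCounts": [4, 4, 4, 4, 5, 5, 5, 5, 5],
})

result = (
    df.replace("", np.nan)             # treat blanks as missing values
      .groupby("Name", as_index=False)
      .first()                         # first non-null value per column, per group
      .fillna("")                      # turn remaining gaps back into blanks
)
print(result)
```

This yields one row per Name. Note that if a group held several different non-empty values in the same column, only the first would survive, which is why the approach is limited to groups containing only blanks or repeated values.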
