
How to drop duplicates for the following data-frame based on multiple conditions?

I have a data frame as follows:

df

cycle  csec    dist   vel
1      -40.1   9.87   2.7
1      -40.1   9.89   2.2
2      -39.1   14.07  2.0
2      -39.1   14.09  2.8
3      -38.7   18.09  3.2
4      -36.6   15.37  0.5
4      -38.01  16.23  1.8
4      -38.4   16.66  3.1

I have to drop the duplicate cycles based on these conditions:
    - if the csec is the same:
            - look at the dist and keep the row with the highest dist
                   - if the dist values are also the same, check the vel and keep the row with the highest vel
    - if the csec is different:
            - keep the row with the highest csec

Expected output:

cycle  csec    dist   vel
1      -40.1   9.89   2.2
2      -39.1   14.09  2.8
3      -38.7   18.09  3.2
4      -36.6   15.37  0.5

I am able to get the duplicated rows using the following code:

    duplicate_cycle = df[df.duplicated('cycle', keep=False)]

I want to know how to drop the rows based on these conditions.

Sort the data frame by cycle, then by csec, dist and vel in descending order, and then drop the duplicates, for example:

out = (
    df.sort_values(['cycle', 'csec', 'dist', 'vel'], ascending=[True, False, False, False])
    .drop_duplicates(subset=['cycle'])
)
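
For reference, here is a self-contained sketch of the same approach that rebuilds the sample data from the question so it can be run directly. The column names `cycle`, `csec`, `dist` and `vel` are assumed from the code and the conditions in the question, and the final `reset_index` call is an optional addition that is not part of the original answer.

```python
import pandas as pd

# Sample data from the question (column names assumed: cycle, csec, dist, vel).
df = pd.DataFrame({
    "cycle": [1, 1, 2, 2, 3, 4, 4, 4],
    "csec": [-40.1, -40.1, -39.1, -39.1, -38.7, -36.6, -38.01, -38.4],
    "dist": [9.87, 9.89, 14.07, 14.09, 18.09, 15.37, 16.23, 16.66],
    "vel": [2.7, 2.2, 2.0, 2.8, 3.2, 0.5, 1.8, 3.1],
})

# Within each cycle, sort so the preferred row comes first:
# highest csec, then highest dist, then highest vel.
out = (
    df.sort_values(["cycle", "csec", "dist", "vel"],
                   ascending=[True, False, False, False])
      .drop_duplicates(subset=["cycle"])  # keep="first" is the default
      .reset_index(drop=True)             # optional: renumber the rows
)

print(out)
```

Because `drop_duplicates` keeps the first row of each `cycle` by default, the sort order alone encodes all three conditions: ties on `csec` fall through to `dist`, and ties on `dist` fall through to `vel`.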
