Removing rows from a DataFrame based on a condition

Question

I have the following dataframe:

df = pd.DataFrame({
    "Code": ['9958S135K108MF-1', '9958S135-1', '9958S105-1', '9958S105K84MF-1'],
    "ID": ['FO995877000581098', 'FO995877000581098', 'FO995877000581098', 'FO995877000581098'],
    "NUM": ['9958S135', '9958S135', '9958S105', '9958S105'],
})

I need the following output:

    Code                ID                  NUM
0   9958S135K108MF-1    FO995877000581098   9958S135
3   9958S105K84MF-1     FO995877000581098   9958S105

For every "ID" there should be a unique "NUM". There will be many duplicate "ID" values.

The trick is that when several rows share the same "ID" and "NUM", I need to keep the row whose "Code" ends in MF-1 and drop the others.

I have tried adding a "Mapping" column and deleting the rows where it is True, but it does not always flag the correct row: sometimes the row whose "Code" contains 'MF-1' is the one marked True.

Here is what I have tried:

import pandas as pd

df['Mapping'] = df['NUM'].eq(df['NUM'].shift()) & df['ID'].eq(df['ID'].shift())

    Code                ID                  NUM         Mapping
0   9958S135K108MF-1    FO995877000581098   9958S135    False
1   9958S135-1          FO995877000581098   9958S135    True
2   9958S105-1          FO995877000581098   9958S105    False
3   9958S105K84MF-1     FO995877000581098   9958S105    True

Answer 1

I was able to achieve my outcome using the following:

df[~df.duplicated(['ID', 'NUM'], keep=False) | df['Code'].astype(str).str.contains('MF-1')]
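
To make the one-liner easier to follow, here is a minimal sketch that splits it into named masks and runs it on the sample data from the question. The names `is_dup`, `has_mf1` and `result` are my own, and `regex=False` is an addition so that 'MF-1' is matched as literal text; the filtering logic is otherwise the same as above.

```python
import pandas as pd

df = pd.DataFrame({
    "Code": ["9958S135K108MF-1", "9958S135-1", "9958S105-1", "9958S105K84MF-1"],
    "ID": ["FO995877000581098"] * 4,
    "NUM": ["9958S135", "9958S135", "9958S105", "9958S105"],
})

# True for every row whose (ID, NUM) pair occurs more than once.
# keep=False marks all members of a duplicate group, not just the later ones.
is_dup = df.duplicated(["ID", "NUM"], keep=False)

# True for rows whose Code contains the literal text "MF-1".
has_mf1 = df["Code"].astype(str).str.contains("MF-1", regex=False)

# Keep a row if its (ID, NUM) pair is unique, or if its Code contains "MF-1".
result = df[~is_dup | has_mf1]
print(result)
```

On the sample data this keeps the rows with index 0 and 3. Note that, unlike the shift-based "Mapping" column, this does not depend on row order. Two things to watch for: a duplicated ("ID", "NUM") group with no 'MF-1' row is dropped entirely, and a group with several 'MF-1' rows keeps all of them.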
