I want to set every column after the first occurrence of an event (ONE-OFF) to NaN in a pandas DataFrame.
Note: the DataFrame can have multiple rows, and ONE-OFF can appear in any column or not at all.
import numpy as np
import pandas as pd

input_df = pd.DataFrame(
{
1: {'15': 'Normal'},
2: {'15': 'Normal'},
3: {'15': 'Normal'},
4: {'15': 'ONE-OFF'},
5: {'15': 'Normal'},
6: {'15': 'Normal'},
}
)
In this row, all columns after the first occurrence of ONE-OFF should be NaN:
output_df = pd.DataFrame(
{
1: {'15': 'Normal'},
2: {'15': 'Normal'},
3: {'15': 'Normal'},
4: {'15': 'ONE-OFF'},
5: {'15': np.nan},
6: {'15': np.nan},
}
)
Please suggest
Thanks
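Shift the columns one position to the right with DataFrame.shift, compare the result with 'ONE-OFF', and use DataFrame.cummax to build a boolean mask. Then pass the mask to DataFrame.mask to replace the masked values with NaN. This replaces the values after the first match in each row separately: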
print(input_df)
         1        2       3        4       5        6
0   Normal   Normal  Normal  ONE-OFF  Normal   Normal
1  ONE-OFF   Normal  Normal   Normal  Normal   Normal
2   Normal   Normal  Normal  ONE-OFF  Normal   Normal
3   Normal  ONE-OFF  Normal   Normal  Normal   Normal
4   Normal   Normal  Normal   Normal  Normal  ONE-OFF
df = input_df.mask(input_df.shift(axis=1).eq('ONE-OFF').cummax(axis=1))
print(df)
         1        2       3        4       5        6
0   Normal   Normal  Normal  ONE-OFF     NaN      NaN
1  ONE-OFF      NaN     NaN      NaN     NaN      NaN
2   Normal   Normal  Normal  ONE-OFF     NaN      NaN
3   Normal  ONE-OFF     NaN      NaN     NaN      NaN
4   Normal   Normal  Normal   Normal  Normal  ONE-OFF
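To see why this works, the one-liner can be split into its intermediate steps. The following is only a minimal sketch on a small made-up frame; the names shifted, hit and after_first are illustrative and not part of the answer above. The last row contains no ONE-OFF, so it should come out unchanged.

```python
import pandas as pd

# Small illustrative frame (sample data assumed for this sketch).
input_df = pd.DataFrame(
    [
        ["Normal", "Normal", "Normal", "ONE-OFF", "Normal", "Normal"],
        ["ONE-OFF", "Normal", "Normal", "Normal", "Normal", "Normal"],
        ["Normal", "Normal", "Normal", "Normal", "Normal", "ONE-OFF"],
        ["Normal", "Normal", "Normal", "Normal", "Normal", "Normal"],
    ],
    columns=[1, 2, 3, 4, 5, 6],
)

# Step 1: move every value one column to the right, so each cell
# holds the value of its left neighbour (first column becomes NaN).
shifted = input_df.shift(axis=1)

# Step 2: True where the left neighbour is 'ONE-OFF'.
hit = shifted.eq("ONE-OFF")

# Step 3: a running maximum along each row keeps the flag True for
# every column to the right of the first hit.
after_first = hit.cummax(axis=1)

# Step 4: replace the flagged cells with NaN; returns a new frame.
result = input_df.mask(after_first)
print(result)
```

Because of the shift, the ONE-OFF cell itself is not flagged; only the cells to its right are. DataFrame.mask returns a new frame, so input_df is left as it was.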
If you need to set entire columns to NaN, starting after the earliest occurrence in any row, use DataFrame.loc with DataFrame.any to build the column mask:
m = input_df.shift(axis=1).eq('ONE-OFF').cummax(axis=1).any()
input_df.loc[:, m] = np.nan
print(input_df)
         1    2    3    4    5    6
0   Normal  NaN  NaN  NaN  NaN  NaN
1  ONE-OFF  NaN  NaN  NaN  NaN  NaN
2   Normal  NaN  NaN  NaN  NaN  NaN
3   Normal  NaN  NaN  NaN  NaN  NaN
4   Normal  NaN  NaN  NaN  NaN  NaN
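Note that the assignment through DataFrame.loc modifies input_df in place. If the original frame is still needed, one possible variant is to work on a copy. This is only a sketch: the sample data and the names col_mask and out are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (sample data assumed for this sketch).
input_df = pd.DataFrame(
    [
        ["Normal", "Normal", "ONE-OFF", "Normal"],
        ["Normal", "ONE-OFF", "Normal", "Normal"],
        ["Normal", "Normal", "Normal", "Normal"],
    ],
    columns=[1, 2, 3, 4],
)

# One boolean per column: True if that column lies after the first
# 'ONE-OFF' in at least one row.
col_mask = input_df.shift(axis=1).eq("ONE-OFF").cummax(axis=1).any()

# Work on a copy so input_df keeps its original values.
out = input_df.copy()
out.loc[:, col_mask] = np.nan
print(out)
```

Here the earliest ONE-OFF is in column 2 (second row), so columns 3 and 4 are blanked for every row while columns 1 and 2 stay intact.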