
Find pattern in pandas dataframe based on multiple columns

I have data that looks like the following:

Equipment   Timestamp           col       value
D1          18/04/2020 23:59    Command     1
            18/04/2020 23:59    Run_status  1
            19/04/2020 23:59    Run_status  0
            21/04/2020 00:59    Command     1
            22/04/2020 01:09    Command     1

I need to find the following pattern:

(d['col'] == 'Command') & (d['col'].shift() == 'Run_status')

AND (d['value'] == 1) & (d['value'].shift() == 1)

AND (d['Timestamp'] - d['Timestamp'].shift()) < timedelta(minutes=5)

Then I want to create a new column that is True wherever such a pattern is found:

Equipment   Timestamp           col          value  New_col
D1          18/04/2020 23:59    Command        1    TRUE
            18/04/2020 23:59    Run_status     1    FALSE
            19/04/2020 23:59    Run_status     0    FALSE
            21/04/2020 00:59    Command        1    FALSE
            22/04/2020 01:09    Command        1    FALSE

How can I create New_col so that it flags the required pattern?

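Comparisons in pandas produce boolean Series. You can combine them element-wise with the binary & and | operators; wrap each comparison in parentheses, because these operators bind more tightly than ==. To add a new column, just assign the result to it. Since the expected output flags the Command row that is followed by a Run_status row, each row is compared with the next one using shift(-1) rather than shift(). The Timestamp column must hold datetime values for the subtraction to work.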

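from datetime import timedelta

# shift(-1) aligns each row with the row that follows it.
cond1 = (d['col'] == 'Command') & (d['col'].shift(-1) == 'Run_status')
cond2 = (d['value'] == 1) & (d['value'].shift(-1) == 1)
# Assumes d['Timestamp'] is already a datetime column, not strings.
cond3 = (d['Timestamp'].shift(-1) - d['Timestamp']) < timedelta(minutes=5)
d['New_col'] = cond1 & cond2 & cond3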
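
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same three conditions. It makes a few assumptions not stated in the question: that every row belongs to equipment D1, that the timestamps are day-first strings in the form `%d/%m/%Y %H:%M`, and that the helper names `nxt` and `gap` are purely illustrative.

```python
import pandas as pd

# Sample data from the question; assumes every row belongs to equipment D1.
d = pd.DataFrame({
    "Equipment": ["D1"] * 5,
    "Timestamp": [
        "18/04/2020 23:59",
        "18/04/2020 23:59",
        "19/04/2020 23:59",
        "21/04/2020 00:59",
        "22/04/2020 01:09",
    ],
    "col": ["Command", "Run_status", "Run_status", "Command", "Command"],
    "value": [1, 1, 0, 1, 1],
})

# Parse the day-first strings so that subtraction yields timedeltas.
d["Timestamp"] = pd.to_datetime(d["Timestamp"], format="%d/%m/%Y %H:%M")

# nxt holds, for each row, the values of the following row (hypothetical helper name).
nxt = d.shift(-1)
gap = nxt["Timestamp"] - d["Timestamp"]

d["New_col"] = (
    (d["col"] == "Command") & (nxt["col"] == "Run_status")
    & (d["value"] == 1) & (nxt["value"] == 1)
    & (gap < pd.Timedelta(minutes=5))
)

print(d)
```

On this sample only the first row ends up True, which matches the expected output in the question. The last row has no following row, so its shifted values are missing and every comparison against them evaluates to False.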
