
How to condense rows in Pandas by removing everything between two conditions

I have a keyboard log that records when keys are pressed and released:

key state   time
z   1   0.133
d   1   0.298
d   0   0.36
a   1   0.522
a   1   0.6455
a   1   0.7744
a   1   0.9033
a   1   1.0322
a   1   1.1611
a   1   1.29
a   1   1.4189
a   1   1.5478
a   1   1.6767
a   1   1.8056
a   1   1.9345
a   1   2.0634
z   0   2.1923
a   0   2.3212

When a key is pressed (state == 1), the log keeps writing a row for that key until it returns to the up state (state == 0). How can I condense such a table so that it only includes the rows where each key is first pressed and where it is released? This form would make it easier to calculate the keypress duration.

key state   time
z   1   0.133
d   1   0.298
d   0   0.36
a   1   0.522
z   0   2.1923
a   0   2.3212
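
To illustrate why the condensed form helps, here is a minimal sketch of the duration calculation on the table above. It assumes each key has exactly one press row and one release row, as in this sample; the names `condensed`, `wide`, and `durations` are only illustrative.

```python
import pandas as pd

# Condensed log from the table above: one press and one release per key.
condensed = pd.DataFrame({
    "key":   ["z", "d", "d", "a", "z", "a"],
    "state": [1, 1, 0, 1, 0, 0],
    "time":  [0.133, 0.298, 0.36, 0.522, 2.1923, 2.3212],
})

# One row per key; column 1 holds the press time, column 0 the release time.
# pivot() raises an error if a (key, state) pair appears more than once.
wide = condensed.pivot(index="key", columns="state", values="time")

# Duration = release time minus press time.
durations = (wide[0] - wide[1]).rename("duration")
print(durations)
```

If a key can be pressed several times in one log, the press and release rows would first need to be paired up per press before subtracting.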

My first thought is to use what I know, i.e. an ugly loop that would repeat for each key:

(1) Detect first instance of keypress and add row to new dataframe, (2) Go through rows until we see that key has been released, then add that to dataframe, (3) Append everything into one dataframe and then sort by time

I'm new to Pandas, but I know there must be a better way that properly takes advantage of the dataframe. I've discovered DataFrame.shift(), but can't quite wrap my head around how to deal with the variable number of rows between the key presses and releases.

Any suggestions would be appreciated:)

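It's a simple application of first(): group by key and state, keep the first row of each group, then sort by time. Note that this keeps one press row and one release row per key, so it assumes each key is pressed only once in the log, as in the sample.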

dfu = df.groupby(["key","state"], as_index=False).first().sort_values("time")

   key  state    time
5    z      1   0.133
3    d      1   0.298
2    d      0   0.36
1    a      1   0.522
4    z      0   2.1923
0    a      0   2.3212
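
If the same key can be pressed more than once in a log, grouping on key and state keeps only the first press and the first release of each key. A possible alternative, sketched here as a different technique from the answer above, is to compare each row's state with the previous state of the same key using groupby(...).shift() and keep only the rows where it changes. The DataFrame below rebuilds the sample log from the question; `prev_state` and `condensed` are illustrative names.

```python
import pandas as pd

# Sample log from the question: "a" repeats while it is held down.
df = pd.DataFrame({
    "key": ["z", "d", "d"] + ["a"] * 13 + ["z", "a"],
    "state": [1, 1, 0] + [1] * 13 + [0, 0],
    "time": [0.133, 0.298, 0.36, 0.522, 0.6455, 0.7744, 0.9033, 1.0322,
             1.1611, 1.29, 1.4189, 1.5478, 1.6767, 1.8056, 1.9345, 2.0634,
             2.1923, 2.3212],
})

df = df.sort_values("time")

# Previous state of the same key (NaN for the first row of each key).
prev_state = df.groupby("key")["state"].shift()

# Keep a row only when its state differs from that key's previous state.
# The first row of each key compares against NaN, so it is always kept.
condensed = df[df["state"] != prev_state]
print(condensed)
```

On the sample data this should give the same six rows as the first() approach, while also keeping later press/release pairs of a key if they occur.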
