简体   繁体   中英

Python Pandas DataFrame delete rows based on values in multiple previous rows

I have a dataframe with dates as index and one column which is instructions to enter and exit trades. Each row is one of

'short_entry', 'short_exit', 'long_entry', 'long_exit'.

Rules:

1 - You cannot exit a short (short_exit) position if you don't already hold a short position (short_entry). Likewise for long positions.

2 - You can only enter another short posn, if the previous short_entry has been closed with a corresponding short_exit. Likewise with long entry and exits.

在此处输入图片说明

Based on the rules the first four rows would be deleted and the first trade entered would be on 2008-02-28 followed by short_exit on 2008-03-27. The rest of the df would be updated accordingly.

I have read pretty much everything I can find in pandas docs and online helps. There are answers to delete rows based on values on a single row above (use .shift()), or use if-statements inside .loc(). But I just cannot get my head around how to put all these together to delete a row based on values of multiple previous rows. I can do it easily using for loops and df.itertuples().

Is there a pandas pythonic way of doing this? Any help and hints would be greatly appreciated.

Thanks

Your rules imply a state machine, which should result in marking the row index to delete.

I do not think there is one single function in pandas, that picks these rules as arguements.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM