Python Pandas DataFrame delete rows based on values in multiple previous rows

Question

I have a dataframe with dates as index and one column which is instructions to enter and exit trades. Each row is one of

'short_entry', 'short_exit', 'long_entry', 'long_exit'.

Rules:

1 - You cannot exit a short (short_exit) position if you don't already hold a short position (short_entry). Likewise for long positions.

2 - You can only enter another short posn, if the previous short_entry has been closed with a corresponding short_exit. Likewise with long entry and exits.

Based on the rules the first four rows would be deleted and the first trade entered would be on 2008-02-28 followed by short_exit on 2008-03-27. The rest of the df would be updated accordingly.

I have read pretty much everything I can find in pandas docs and online helps. There are answers to delete rows based on values on a single row above (use .shift()), or use if-statements inside .loc(). But I just cannot get my head around how to put all these together to delete a row based on values of multiple previous rows. I can do it easily using for loops and df.itertuples().

Is there a pandas pythonic way of doing this? Any help and hints would be greatly appreciated.

Thanks

Answer 1

Your rules imply a state machine, which should result in marking the row index to delete.

I do not think there is one single function in pandas, that picks these rules as arguements.

Python Pandas DataFrame delete rows based on values in multiple previous rows

Question

1 answers

solution1
0 2017-09-24 12:33:43

Python Pandas DataFrame delete rows based on values in multiple previous rows

Question

1 answers

solution1 0 2017-09-24 12:33:43

solution1
0 2017-09-24 12:33:43