How to use the previous value from calculated new pandas column based on conditions?

Question

import pandas as pd
import numpy as np

df = pd.DataFrame(
    (
        [6, 5, 10],
        [12, 6, 11],
        [7, 6, 10],
        [7, 5, 11],
        [4, 5, 10],
        [6, 5, 10],
        [7, 4, 9],
    ),
    columns=[
        "val", "lower", "upper"
    ]
)

# define conditions
conditions = [df['val'] > df['upper'],
              df['val'] < df['lower']]

# define choices
choices = [1, -1]

# create new column in DataFrame that displays results of comparisons
df['cond'] = np.select(conditions, choices, default=0)

print(df)

The result of the above is now this:

    val   lower upper  cond
0    6      5     10     0
1   12      6     11     1
2    7      6     10     0
3    7      5     11     0
4    4      5     10    -1
5    6      5     10     0
6    7      4      9     0

What I want to achieve is the following:

row[0]: cond should be NaN, because I don't know whether the last cross was at the upper or the lower bound
row[1]: 'val' crossed the upper bound, which results in cond = 1
row[2]: 'val' is between lower and upper, so there is no cross; 'cond' should take the previous 'cond' value from row[1], so cond = 1
row[3]: 'val' is between lower and upper, so there is no cross; 'cond' should take the previous 'cond' value from row[2], so cond = 1
row[4]: 'val' crossed the lower bound, which results in cond = -1
row[5]: 'val' is between lower and upper, so there is no cross; 'cond' should take the previous 'cond' value from row[4], so cond = -1
row[6]: 'val' is between lower and upper, so there is no cross; 'cond' should take the previous 'cond' value from row[5], so cond = -1

The following does not work:

df['cond'] = np.select(conditions, choices, default=df["cond"].shift(1))

So the result should be:

    val   lower upper  cond
0    6      5     10     NaN
1   12      6     11     1
2    7      6     10     1
3    7      5     11     1
4    4      5     10    -1
5    6      5     10    -1
6    7      4      9    -1

What is the easiest way to get this done?

Answer 1

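IIUC, you can turn the zeros into NaN and forward-fill them with the previous non-zero value. Any zero before the first cross (here only the first row) has nothing to fill from, so it stays NaN.

df['cond'] = np.select(conditions, choices, default=0)

# mark "no cross" rows as missing, then carry the last cross forward
# (replace(..., method='ffill') is deprecated in recent pandas versions)
df['cond'] = df['cond'].replace(0, np.nan).ffill()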

print(df)

   val  lower  upper  cond
0    6      5     10   NaN
1   12      6     11     1
2    7      6     10     1
3    7      5     11     1
4    4      5     10    -1
5    6      5     10    -1
6    7      4      9    -1
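
For contrast, here is a small sketch of what the `shift(1)` attempt from the question likely produces. The point it illustrates: `shift(1)` is evaluated once on the existing column, so a row only sees its direct neighbour's original value and the last cross is not carried across several rows. The names `prev` and `attempt` are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "val": [6, 12, 7, 7, 4, 6, 7],
        "lower": [5, 6, 6, 5, 5, 5, 4],
        "upper": [10, 11, 10, 11, 10, 10, 9],
    }
)

conditions = [df["val"] > df["upper"], df["val"] < df["lower"]]
choices = [1, -1]

# first pass: 0 means "no cross in this row"
df["cond"] = np.select(conditions, choices, default=0)

# shift(1) looks exactly one row back in the column as it exists now;
# it is not re-evaluated while the new column is being built
prev = df["cond"].shift(1).to_numpy()
attempt = np.select(conditions, choices, default=prev)

print(attempt)
# rows 3 and 6 fall back to 0, because the row before each of them held 0
```

Rows 2 and 5 happen to come out right only because they directly follow a cross. A forward fill does not have that limitation.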

As mozway suggests, rather than setting 0 as the default value in np.select, you can use NaN directly:

df['cond'] = np.select(conditions, choices, default=np.nan)

df['cond'] = df['cond'].ffill()

# or in one line
# np.select returns an array,
# here we use pd.Series to chain ffill method
df['cond'] = pd.Series(np.select(conditions, choices, default=np.nan), index=df.index).ffill()
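
Putting the pieces together, a self-contained sketch of the NaN-plus-ffill approach on the question's data could look like the following. The helper name `last_cross` is hypothetical and only wraps the two steps shown above. It assumes the column names `val`, `lower` and `upper` from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [6, 5, 10],
        [12, 6, 11],
        [7, 6, 10],
        [7, 5, 11],
        [4, 5, 10],
        [6, 5, 10],
        [7, 4, 9],
    ],
    columns=["val", "lower", "upper"],
)


def last_cross(frame):
    """Hypothetical helper: 1 after an upper cross, -1 after a lower cross,
    NaN until the first cross has happened."""
    conditions = [
        frame["val"] > frame["upper"],  # crossed the upper bound
        frame["val"] < frame["lower"],  # crossed the lower bound
    ]
    # rows without a cross become NaN so that ffill can fill them
    raw = np.select(conditions, [1, -1], default=np.nan)
    # np.select returns a plain array; wrap it to keep the frame's index
    return pd.Series(raw, index=frame.index).ffill()


df["cond"] = last_cross(df)
print(df)
```

Because the column contains NaN, it is stored as float, so the values print as 1.0 and -1.0 rather than 1 and -1.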

solution1
1 2022-05-27 17:53:32
