I have the following pandas data frame df:
COL1 COL2 COL3 Y
10 2 A 1
20 5 A 3
30 2 B 1
20 7 B 4
15 2 A 2
25 1 B 1
10 3 A 1
25 1 A 1
I apply rolling to y as follows:
window = 2
y = df["Y"]
y = y.rolling(window).apply(lambda x: np.max(x) if len(x)>0 else 0).dropna()
But now I need to add a restriction to y: the max should be calculated only over rows where COL3 is equal to A. If there is no A value in the rows of a window, then y should be equal to 0. For example, rows 3 and 4 (with a window of 2) are both B, so the result there should be 0.
I tried:
y = df.rolling(window).apply(lambda row: np.max(row[row["COL3"=="A"]]["Y"]) if len(row["Y"])>0 else 0).dropna()["Y"]
But got the error:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
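The error most likely has two causes. First, "COL3"=="A" compares two string literals, so it is simply False rather than a test on the column. Second, rolling(...).apply calls the function on the window of one column at a time (1-D data), never on rows of the whole frame, so COL3 and Y cannot be seen together inside the lambda. A small check of both points, assuming a recent pandas version; the frame `small` and the function `probe` are made up for illustration:

```python
import pandas as pd

# The comparison in the attempt is between two string literals, not a column test.
literal_comparison = ("COL3" == "A")  # evaluates to False

# A tiny numeric frame (hypothetical, for illustration only).
small = pd.DataFrame({"COL1": [10, 20, 30], "Y": [1, 3, 1]})

seen = []  # records what rolling.apply hands to the function

def probe(window_values):
    # window_values holds the window of a single column, not a row of the frame
    seen.append((type(window_values).__name__, window_values.ndim))
    return 0.0

small.rolling(2).apply(probe)
```

Every entry in `seen` is one-dimensional, which is why indexing it with ["Y"] cannot work.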
We can filter y to the rows where COL3 is A before rolling, then reindex to the original index and fill the missing rows with 0:
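y = df["Y"]
# keep only the rows where COL3 is A, then take the rolling max over those rows
y1 = y[df["COL3"] == "A"].rolling(window).max()
# rows where COL3 is not A get 0; the first A row has no full window (NaN) and is dropped
y = y1.reindex(y.index, fill_value=0).dropna()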
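Note that filtering first makes each window span consecutive A rows, not consecutive rows of the original frame. If the window is meant to cover `window` consecutive rows of df and only ignore the non-A values inside it, one possible sketch is to mask Y instead of filtering it. This is an alternative under that assumption, not the approach above; the name `masked` is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "COL1": [10, 20, 30, 20, 15, 25, 10, 25],
    "COL2": [2, 5, 2, 7, 2, 1, 3, 1],
    "COL3": ["A", "A", "B", "B", "A", "B", "A", "A"],
    "Y": [1, 3, 1, 4, 2, 1, 1, 1],
})
window = 2

# Hide Y wherever COL3 is not "A"; those rows become NaN.
masked = df["Y"].where(df["COL3"] == "A")

# min_periods=1 lets a window with at least one "A" row produce a max;
# windows with no "A" rows stay NaN and are then set to 0.
y = masked.rolling(window, min_periods=1).max().fillna(0)

# Drop the first window-1 rows, which never had a full window
# (this mirrors the dropna() in the original code).
y = y.iloc[window - 1:]
```

With the sample data, the window over rows 3 and 4 (both B) gives 0, as required in the question.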