
Pandas - Numpy.Where referencing previous row value

I have the following dataframe:

         date Price_C_1  OI_C_1
0  2021-03-05    549.75   37442
1  2021-03-08    549.75   37739
2  2021-03-09     542.5   37448
3  2021-03-10     537.0   39707
4  2021-03-11     551.0   39136
..        ...       ...     ...
95 2021-07-19     562.5  188911
96 2021-07-20    562.25  186953
97 2021-07-21    585.25  176430
98 2021-07-22     592.5       0
99 2021-07-23    597.75       0

I want to replace "0" values in OI_C_1 with the previous non-0 value.

I tried:

data['OI_C_1'] = np.where(data['OI_C_1'] == 0, data['OI_C_1'].shift(), data['OI_C_1'])

but I got this:

         date Price_C_1  OI_C_1
0  2021-03-05    549.75   37442
1  2021-03-08    549.75   37739
2  2021-03-09     542.5   37448
3  2021-03-10     537.0   39707
4  2021-03-11     551.0   39136
..        ...       ...     ...
95 2021-07-19     562.5  188911
96 2021-07-20    562.25  186953
97 2021-07-21    585.25  176430
98 2021-07-22     592.5  176430
99 2021-07-23    597.75       0

So there is still one "0" value.

Is there a way to force np.where to run one row at a time, starting from the top? Thanks

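Instead of iterating row by row, you can use ffill. shift() only looks back one row in the original column, so the second of two consecutive 0s just picks up the first 0. Use where or mask to turn the 0s into NaN, then forward-fill them.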

df['OI_C_1'] = df['OI_C_1'].mask(df['OI_C_1'].eq(0)).ffill(downcast='infer')

          date  Price_C_1  OI_C_1
0   2021-03-05     549.75   37442
1   2021-03-08     549.75   37739
2   2021-03-09     542.50   37448
3   2021-03-10     537.00   39707
4   2021-03-11     551.00   39136
95  2021-07-19     562.50  188911
96  2021-07-20     562.25  186953
97  2021-07-21     585.25  176430
98  2021-07-22     592.50  176430
99  2021-07-23     597.75  176430
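
To make the difference concrete, here is a minimal self-contained sketch. The four-row frame is made-up sample data that mirrors only the tail of the question's table, and the names `one_step` and `filled` are just for illustration. It skips the `downcast` argument (which, as far as I know, is deprecated in recent pandas releases) and casts back to integers explicitly instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the last rows of the question's data.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-07-20", "2021-07-21", "2021-07-22", "2021-07-23"]),
    "OI_C_1": [186953, 176430, 0, 0],
})

# The question's attempt: shift() reads the previous row of the ORIGINAL
# column, so the second of two consecutive zeros is replaced by a zero.
one_step = np.where(df["OI_C_1"] == 0, df["OI_C_1"].shift(), df["OI_C_1"])

# The answer's approach: turn zeros into NaN, then forward-fill, which
# carries the last non-missing value across any number of rows.
filled = df["OI_C_1"].mask(df["OI_C_1"].eq(0)).ffill()

# mask() introduces NaN, which makes the column float; cast back to int
# only when nothing is left unfilled (e.g. no zero in the very first row).
if filled.notna().all():
    filled = filled.astype("int64")

df["OI_C_1"] = filled
print(df)
```

Note that if the column starts with a 0 there is no earlier value to carry forward, so that row stays NaN and the column stays float.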
