
Pandas DataFrame replace negative values with latest preceding positive value

Consider a DataFrame such as

import pandas as pd

df = pd.DataFrame({'a': [1,-2,0,3,-1,2],
                   'b': [-1,-2,-5,-7,-1,-1],
                   'c': [-1,-2,-5,4,5,3]})

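For each column, how can I replace every negative value with the most recent non-negative value above it (a positive number or zero), or with zero if no such value exists? "Last" here means scanning each column from top to bottom. The closest solution I have found is df[df < 0] = 0, which sets every negative value to zero instead of carrying the previous value forward.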

The expected result would be a DataFrame such as

df_res = pd.DataFrame({'a': [1,1,0,3,3,2], 
                       'b': [0,0,0,0,0,0], 
                       'c': [0,0,0,4,5,3]})

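You can use DataFrame.mask to turn all values < 0 into NaN, then forward-fill with ffill and replace any NaN that is left (leading negatives) with fillna. The NaN step makes the columns float, so convert_dtypes is applied at the end to get integer columns back: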

df = df.mask(df.lt(0)).ffill().fillna(0).convert_dtypes()
   a  b  c
0  1  0  0
1  1  0  0
2  0  0  0
3  3  0  4
4  3  0  5
5  2  0  3
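
The chain above can be unpacked into its three steps. The following is a minimal sketch rather than part of the original answer: the intermediate names masked, filled and result are chosen here for illustration, and astype(int) is used in place of convert_dtypes to get plain int64 columns.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, -2, 0, 3, -1, 2],
                   'b': [-1, -2, -5, -7, -1, -1],
                   'c': [-1, -2, -5, 4, 5, 3]})

# Step 1: negative values become NaN; zeros and positives are kept.
masked = df.mask(df.lt(0))

# Step 2: each NaN takes the nearest non-NaN value above it in its column.
filled = masked.ffill()

# Step 3: NaN with nothing valid above it (leading negatives) becomes 0,
# then the float columns are cast back to integers.
result = filled.fillna(0).astype(int)

print(result)
```

Printing masked and filled separately is a quick way to see which cells each step touches.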

Use DataFrame.where, keeping the non-negative values (df.ge(0)) so that zeros are preserved:

df.where(df.ge(0)).ffill().fillna(0).astype(int)

   a  b  c
0  1  0  0
1  1  0  0
2  0  0  0
3  3  0  4
4  3  0  5
5  2  0  3
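
With where, the choice of comparison decides what happens to zeros. The sketch below, which is an illustration and not part of the original answer, runs the same chain with a strict (gt) and a non-strict (ge) condition on the question's data; the variable names strict and non_strict are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, -2, 0, 3, -1, 2],
                   'b': [-1, -2, -5, -7, -1, -1],
                   'c': [-1, -2, -5, 4, 5, 3]})

# Strict comparison: zeros are masked too, so they get overwritten
# by the previous positive value.
strict = df.where(df.gt(0)).ffill().fillna(0).astype(int)

# Non-strict comparison: zeros count as valid values and are kept.
non_strict = df.where(df.ge(0)).ffill().fillna(0).astype(int)

print(strict['a'].tolist())      # [1, 1, 1, 3, 3, 2]
print(non_strict['a'].tolist())  # [1, 1, 0, 3, 3, 2]
```

Only the non-strict version reproduces the expected column a from the question, where the 0 in the third row is kept.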

The expected result can be obtained with these steps:

mask = df >= 0  # boolean mask of the non-negative values
df_res = (df.where(mask)  # replace negative values with NaN (the default for where)
          .ffill()        # forward-fill each NaN from the value above it
          .fillna(0)      # fill any remaining NaN with zeros
          .astype(int))   # cast back to integers, since NaN made the columns float
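
To confirm that a candidate solution reproduces the expected frame from the question, one option is pd.testing.assert_frame_equal. This is a sketch under stated assumptions: the names expected and candidate are illustrative, and check_dtype=False is passed because the NaN round trip can leave float columns while the expected frame holds integers.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, -2, 0, 3, -1, 2],
                   'b': [-1, -2, -5, -7, -1, -1],
                   'c': [-1, -2, -5, 4, 5, 3]})

# Expected result, copied from the question.
expected = pd.DataFrame({'a': [1, 1, 0, 3, 3, 2],
                         'b': [0, 0, 0, 0, 0, 0],
                         'c': [0, 0, 0, 4, 5, 3]})

# Candidate: keep non-negative values, forward-fill, zero-fill the rest.
candidate = df.where(df >= 0).ffill().fillna(0)

# Raises AssertionError with a diff if the frames differ;
# dtypes are ignored so float 0.0 matches integer 0.
pd.testing.assert_frame_equal(candidate, expected, check_dtype=False)
print("candidate matches the expected result")
```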
