I have a pandas dataframe and I would like to identify all negative values and replace them with NaN. Additionally, all zeros that follow a negative value should be replaced with NaN as well, up until the first positive value occurs.
I think it should be possible to achieve my goal using a for loop over all negative values in the data frame.
For example, for the negative value with index label 1737, I could use something like this:
# values from the negative value (index label 1737) onwards
tail = df['counter_diff'].loc[1737:]
# index label of the first value greater than zero
first_index = next(label for label, val in tail.items() if val > 0)
And then fill the values from index 1737 up to first_index with NaN.
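Spelled out in full, that loop-based idea might look roughly like the sketch below. The small frame is a stand-in built from the example input further down, and the `counter_diff_clean` column name is made up for illustration; only `counter_diff` comes from the snippet above.

```python
import numpy as np
import pandas as pd

# Stand-in data: same values and index labels as the example input below.
df = pd.DataFrame(
    {"counter_diff": [1, 3, -1, 0, 0, 1, 3, 0, -2, 1]},
    index=[0, 2, 3, 4, 5, 7, 9, 10, 11, 14],
)

col = df["counter_diff"].astype(float)  # float so the column can hold NaN
result = col.copy()

# Visit every negative value by its index label.
for label in col[col < 0].index:
    # Walk forward from the negative value until the first positive value.
    for tail_label, val in col.loc[label:].items():
        if val > 0:
            break
        result.loc[tail_label] = np.nan

# Hypothetical output column name, used here only for illustration.
df["counter_diff_clean"] = result
```

This does the job on small inputs, but the nested Python loop touches rows one at a time, which is what makes it slow on a large frame.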
However, my dataframe is very large, so I was wondering whether it might be possible to come up with a more computationally efficient method that leverages pandas.
This is an example of the input:
# input column
In[]
pd.Series({0 : 1, 2 : 3, 3 : -1, 4 : 0, 5 : 0, 7 : 1, 9 : 3, 10 : 0, 11 : -2, 14 : 1})
Out[]
0 1
2 3
3 -1
4 0
5 0
7 1
9 3
10 0
11 -2
14 1
dtype: int64
And the desired output:
# desired output
In[]
pd.Series({0 : 1, 2 : 3, 3 : np.nan, 4 : np.nan, 5:np.nan, 7:1, 9:3, 10:0, 11 : np.nan, 14:1})
Out[]
0 1.0
2 3.0
3 NaN
4 NaN
5 NaN
7 1.0
9 3.0
10 0.0
11 NaN
14 1.0
dtype: float64
Any help would be appreciated!
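You could mask all 0s (which turns them into NaN) and forward fill them with ffill, so that every zero takes on the last non-zero value before it, and then check which values in the resulting series are less than 0. Then use the resulting boolean Series to mask the original Series: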
s = pd.Series({0 : 1, 2 : 3, 3 : -1, 4 : 0, 5 : 0, 7 : 1, 9 : 3, 10 : 0, 11 : -2, 14 : 1})
s.mask(s.mask(s.eq(0)).ffill().lt(0))
0 1.0
2 3.0
3 NaN
4 NaN
5 NaN
7 1.0
9 3.0
10 0.0
11 NaN
14 1.0
dtype: float64
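To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below uses the same example series; the variable names `no_zeros`, `carried` and `to_blank` are made up for illustration.

```python
import pandas as pd

s = pd.Series({0: 1, 2: 3, 3: -1, 4: 0, 5: 0, 7: 1, 9: 3, 10: 0, 11: -2, 14: 1})

# Step 1: hide the zeros by turning them into NaN.
no_zeros = s.mask(s.eq(0))

# Step 2: forward fill, so each former zero carries the last non-zero value.
# Zeros after -1 become -1; the zero after 3 becomes 3.
carried = no_zeros.ffill()

# Step 3: True for negatives and for zeros that directly follow a negative run.
to_blank = carried.lt(0)

# Step 4: blank out those positions in the original series.
result = s.mask(to_blank)
print(result)
```

Zeros at the very start of the series have nothing to inherit, so they stay NaN in `carried`, compare as not less than 0, and are therefore kept in the result. For a DataFrame, the same expression can presumably be applied to a single column, for example the `counter_diff` column from the question.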