I have a column 'X' in a DataFrame. I want to replace the negative values and the values above 10 with the median.
Below is my sample data:
index X
0 -3
1 5
2 7
3 6
4 0
5 8
6 6
7 -2
8 9
9 2465
Below is the code that I have tried:
median = df.loc[(df['X']<10) & (df['X']>=0), 'X'].median()
df.loc[(df['X'] > 10) & (df['X']<0), 'X'] = np.nan
df['X'].fillna(median,inplace=True)
The 'X' column is still unchanged after running the code above.
Use Series.where if you need the median of the filtered values:
mask = (df['X']<10) & (df['X']>=0)
df['X'] = df['X'].where(mask, df.loc[mask, 'X'].median())
print (df)
X
0 6
1 5
2 7
3 6
4 0
5 8
6 6
7 6
8 9
9 6
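As a self-contained sketch of the approach above, the snippet below rebuilds the sample data from the question and applies Series.where with the median of the in-range values. The names valid_median and result are illustrative only, and the resulting dtype (int or float) may differ between pandas versions.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'X': [-3, 5, 7, 6, 0, 8, 6, -2, 9, 2465]})

# True for the values to keep (same condition as in the answer)
mask = (df['X'] < 10) & (df['X'] >= 0)

# Median computed only from the values that pass the mask
valid_median = df.loc[mask, 'X'].median()

# where() keeps values where mask is True and substitutes the median elsewhere
result = df['X'].where(mask, valid_median)
print(result.tolist())
```

Rows 0, 7 and 9 are the ones that fail the mask, so they are the only rows replaced.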
Or, if you need the median of all values:
mask = (df['X']<10) & (df['X']>=0)
df['X'] = df['X'].where(mask, df['X'].median())
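For context on why the attempt in the question changed nothing: its condition joins `> 10` and `< 0` with `&`, and no number satisfies both, so no row is ever selected. A minimal sketch of that point follows, using `|` and Series.mask (the inverse of Series.where) with the median of all values. The variable names are illustrative, and this is one possible fix rather than the only one.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'X': [-3, 5, 7, 6, 0, 8, 6, -2, 9, 2465]})

# Condition from the question: a value cannot be both > 10 and < 0,
# so this mask is False for every row
never_true = (df['X'] > 10) & (df['X'] < 0)
print(never_true.any())

# Combining the two tests with OR selects the out-of-range rows
out_of_range = (df['X'] > 10) | (df['X'] < 0)

# Median of all values, including the out-of-range ones
median_all = df['X'].median()

# mask() replaces values where the condition is True
fixed = df['X'].mask(out_of_range, median_all)
print(fixed.tolist())
```

On this sample the median of all values and the median of the filtered values happen to coincide; on other data they can differ, so choose the one that matches the intent.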
You could also use:
df.loc[(df['X'] < 0) | (df['X'] > 10), 'X'] = df['X'].median()