
Imputing the range of values with median

I have a column 'X' in a dataframe. I want to replace the negative values and the values above 10 with the median.

Below is my sample data

index   X
0    -3
1     5
2     7
3     6
4     0
5     8
6     6
7    -2
8     9
9     2465

Below is the code that I have tried:

median = df.loc[(df['X']<10) & (df['X']>=0), 'X'].median()
df.loc[(df['X'] > 10) & (df['X']<0), 'X'] = np.nan
df['X'].fillna(median,inplace=True)

There is still no change in the 'X' column after running the code above.

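The mask in the second line can never be true, because no value is both greater than 10 and less than 0, so nothing is set to NaN; the two conditions would need | (or) instead of & (and). Use Series.where if you need the median of the filtered values: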

mask = (df['X']<10) & (df['X']>=0)
df['X'] = df['X'].where(mask, df.loc[mask, 'X'].median())
print (df)
   X
0  6
1  5
2  7
3  6
4  0
5  8
6  6
7  6
8  9
9  6
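
For reference, here is a self-contained sketch that rebuilds the sample data from the question, shows that the original `&` condition matches no rows, and then applies the Series.where approach. The variable names (`broken`, `valid`, `filtered_median`) are only illustrative, and the kept range [0, 10) is taken from the mask in this answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"X": [-3, 5, 7, 6, 0, 8, 6, -2, 9, 2465]})

# The original condition joins two mutually exclusive tests with &,
# so it never matches any row.
broken = (df["X"] > 10) & (df["X"] < 0)
print(broken.sum())  # 0

# Rows to keep: values in the range [0, 10).
valid = (df["X"] >= 0) & (df["X"] < 10)

# Median computed from the valid rows only.
filtered_median = df.loc[valid, "X"].median()

# Keep valid values and replace everything else with that median.
df["X"] = df["X"].where(valid, filtered_median)
print(df)
```

Series.where keeps the values where the mask is True and replaces the rest, which is why the mask describes the rows to keep rather than the rows to change.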

Or use the median of all values:

mask = (df['X']<10) & (df['X']>=0)
df['X'] = df['X'].where(mask, df['X'].median())
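
The two variants happen to give the same median on the sample data, but they can differ when the out-of-range values pull the overall median. The sketch below compares both; the `skewed` series is hypothetical data made up for illustration and is not from the question.

```python
import pandas as pd

# Sample data from the question: both medians coincide here.
sample = pd.Series([-3, 5, 7, 6, 0, 8, 6, -2, 9, 2465])
sample_valid = (sample >= 0) & (sample < 10)
print(sample[sample_valid].median())  # 6.0
print(sample.median())                # 6.0

# Hypothetical data where most values are out of range.
skewed = pd.Series([-50, -40, -30, 1, 2])
skewed_valid = (skewed >= 0) & (skewed < 10)
print(skewed[skewed_valid].median())  # 1.5
print(skewed.median())                # -30.0
```

If the replacement value itself should fall inside the accepted range, the median of the filtered values is probably the safer choice.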

Alternatively, you could use:

df.loc[(df['X'] < 0) | (df['X'] > 10), 'X'] = df['X'].median()
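
A self-contained sketch of the .loc assignment approach follows. To match the question, the mask has to select the rows to replace, that is, values that are negative or above 10. The name `out_of_range` is illustrative, and the cast to float is an assumption added so that a non-integer median could be stored without a dtype conflict.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"X": [-3, 5, 7, 6, 0, 8, 6, -2, 9, 2465]})

# Cast to float up front so any median value fits the column dtype.
df["X"] = df["X"].astype(float)

# Rows to replace: negative values OR values above 10.
out_of_range = (df["X"] < 0) | (df["X"] > 10)

# Median of the remaining in-range rows.
in_range_median = df.loc[~out_of_range, "X"].median()

# Overwrite only the out-of-range rows.
df.loc[out_of_range, "X"] = in_range_median
print(df)
```

Unlike Series.where, .loc assignment takes the mask of rows to change, so the condition here is the inverse of the one used in the answer above.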
