I have a DataFrame df of shape (268, 4) and found the outliers (22,1) for one column. I want to remove those outliers from df. How do I do that?
> import pandas as pd  # to manipulate dataframes
> import numpy as np   # to manipulate arrays
>
> df = df_nonull
>
> # a number "a" from the vector "x" is an outlier if
> # a > median(x)+1.5*iqr(x) or a < median(x)-1.5*iqr(x)
> # iqr: interquartile range = third quartile - first quartile
> def outliers(x):
>     return np.abs(x - x.median()) > 1.5*(x.quantile(.75) - x.quantile(0.25))
>
> # Give the outliers for the first column for example
> # (stored under a new name so the outliers() function is not overwritten)
> stock_outliers = df.StockValue[outliers(df.StockValue)]
You can only remove a whole row, not a single cell like (22,1). If you want to remove the complete row of the data:
df = df.drop(df.index[[22]])
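If the goal is to get rid of every flagged outlier rather than one row, a boolean mask is probably simpler. Below is a minimal sketch under some assumptions: the small frame, the Volume column, and the is_outlier helper are made up for illustration (only StockValue and the median/IQR rule come from the question). It shows two options: dropping the affected rows, or keeping the rows and setting just the outlying cells to NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the 268x4 frame in the question.
df = pd.DataFrame({
    "StockValue": [10.0, 11.0, 9.0, 10.0, 12.0, 11.0, 10.0, 500.0],
    "Volume": [1, 2, 3, 4, 5, 6, 7, 8],  # assumed extra column
})

def is_outlier(x):
    # Same rule as in the question: farther than 1.5 * IQR from the median.
    iqr = x.quantile(0.75) - x.quantile(0.25)
    return np.abs(x - x.median()) > 1.5 * iqr

# Boolean Series, True where StockValue is an outlier.
mask = is_outlier(df["StockValue"])

# Option 1: drop every row whose StockValue is an outlier.
df_rows_dropped = df[~mask]

# Option 2: keep all rows but replace only the outlying cells with NaN.
df_cells_blanked = df.copy()
df_cells_blanked.loc[mask, "StockValue"] = np.nan
```

Option 1 changes the number of rows, so it affects the other columns too. Option 2 keeps the shape and leaves a missing value that most pandas aggregations skip by default.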