
Error when replacing outliers below the 2.5% and above the 97.5% quantile in R

I used the following code to try to replace each variable's values that fall below the 2.5% quantile or above the 97.5% quantile with those quantile values. You can run the code yourself, since it reads an open data file.

credit<-read.csv("http://freakonometrics.free.fr/german_credit.csv", header=TRUE)
fun <- function(x){
  quantiles <- quantile( x, c(.025, .975 ) )
  x[ x < quantiles[1] ] <- quantiles[1]
  x[ x > quantiles[2] ] <- quantiles[2]
  x
}
fun(credit)

But I get this error message:

Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : 
  undefined columns selected 

What's the problem? I'd be grateful for any help!

Additional comment

I found that the above function does not work on a data frame; it only works on a vector.

I can replace the outliers of a single variable in the data file with the following code:

credit$Duration.of.Credit..month. <- pmax(quantile(credit$Duration.of.Credit..month.,.025), 
                                          pmin(credit$Duration.of.Credit..month., quantile(credit$Duration.of.Credit..month.,.975)))

However, my data file has so many variables that it is inconvenient to write this code for each one.

So how can I replace the outliers of all the variables with the quantile values, without writing out pmax and pmin for every column?

There's actually nothing wrong with your function as long as you apply it to a single column. quantile() expects a numeric vector, so passing the whole data frame fails. I'd use mutate_at or mutate_all (if you really want to apply it to all columns) from the dplyr package. Something like this:

library(dplyr)
# Pass the function directly; the funs() wrapper is deprecated in newer dplyr versions.
credit_trunc <- credit %>% 
   mutate_at(vars(Credit.Amount, Creditability), fun)

or

credit_trunc <- credit %>%
   mutate_all(fun)

or, if you also have columns of another type (e.g. factor, character) in your data frame, you can use:

credit_trunc <- credit %>% 
   mutate_if(is.numeric, fun)

Each of these returns the data frame with the chosen columns, all columns, or all numeric columns modified as you wanted.
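
If you'd rather not depend on dplyr, the same column-by-column idea can be sketched in base R. This is only an illustration under a few assumptions: `df` is a small made-up data frame standing in for `credit`, `clamp` is a hypothetical helper that mirrors your `fun` using pmin/pmax, and `na.rm = TRUE` is something I added that your original function does not have.

```r
# Toy data frame standing in for `credit` (values are made up for illustration).
df <- data.frame(
  amount   = c(1, 2, 3, 4, 100),
  duration = c(-50, 10, 11, 12, 13),
  purpose  = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: same clamping logic as `fun`, written with pmin/pmax.
# na.rm = TRUE is an added assumption so that missing values do not stop quantile().
clamp <- function(x, probs = c(0.025, 0.975)) {
  q <- quantile(x, probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

# Find the numeric columns, then apply clamp() to each of them separately.
num_cols <- vapply(df, is.numeric, logical(1))
df_trunc <- df
df_trunc[num_cols] <- lapply(df[num_cols], clamp)
```

Non-numeric columns such as `purpose` are left untouched. With dplyr 1.0.0 or later, `mutate(across(where(is.numeric), fun))` should be the current spelling of the mutate_if call.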
