
How do I remove outliers using R?

weight<-c(117,  118,    125,    86,     131,     93,    103,    107,    112,    97, 105,    105,    111,    105,    124,    111,    103,    113,    112,    127,    111,    115,    108,    105,    108,    127,    148,    131,    126,    119,    131,    134,    127,    139,    106,    133,    139,    125,    127,    127,    113,    135,    113,    131,    145,    147,    139,    136)

gender<-c(1,    1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2)

data<-data.frame(weight,gender)
attach(data)

boxplot(weight[gender==1], weight[gender==2], names = c("Male", "Female"),
    col = topo.colors(6))

After running this I got one outlier in each category. How do I remove these outliers using R? I have also attached an image of the boxplot.

Try a tidy (dplyr) solution:

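library(dplyr)

cleaned_data <- data %>%
  group_by(gender) %>%
  mutate(
    # Fences at 1.5 * IQR beyond the quartiles, computed within each gender
    lo_whisker = quantile(weight, 0.25) - 1.5 * IQR(weight),
    hi_whisker = quantile(weight, 0.75) + 1.5 * IQR(weight)
  ) %>%
  ungroup() %>%
  # Keep only the rows that fall inside their own group's fences
  filter(weight >= lo_whisker & weight <= hi_whisker) %>%
  # Drop the helper columns once the filtering is done
  select(-lo_whisker, -hi_whisker)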
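
One caveat: boxplot() places its whiskers using hinges (see fivenum()), which can differ slightly from the quartiles that quantile() returns by default, so a quantile-based filter may drop a slightly different set of points than the plot marks. If the goal is to remove exactly the points that boxplot() draws as outliers, a base R sketch along these lines may help. It uses boxplot.stats(), which is what boxplot() relies on internally; the names keep and cleaned are only illustrative.

```r
# Data copied from the question (1 = "Male", 2 = "Female" per the plot labels)
weight <- c(117, 118, 125, 86, 131, 93, 103, 107, 112, 97, 105, 105,
            111, 105, 124, 111, 103, 113, 112, 127, 111, 115, 108, 105,
            108, 127, 148, 131, 126, 119, 131, 134, 127, 139, 106, 133,
            139, 125, 127, 127, 113, 135, 113, 131, 145, 147, 139, 136)
gender <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
            1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
            2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
            2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
data <- data.frame(weight, gender)

# Start by keeping every row, then switch off the flagged rows group by group
keep <- rep(TRUE, nrow(data))
for (g in unique(data$gender)) {
  rows <- data$gender == g
  # $out holds the values lying beyond the whiskers (default coef = 1.5)
  out <- boxplot.stats(data$weight[rows])$out
  keep[rows] <- !(data$weight[rows] %in% out)
}

cleaned <- data[keep, ]

# Re-draw the plot from the cleaned data
boxplot(weight ~ gender, data = cleaned, names = c("Male", "Female"),
        col = topo.colors(6))
```

Note that the second plot recomputes the hinges from the remaining values, so it can flag new points that were previously inside the whiskers. Removing outliers repeatedly until none are left is rarely what you want.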


 