How do I remove outliers using R?
weight<-c(117, 118, 125, 86, 131, 93, 103, 107, 112, 97, 105, 105, 111, 105, 124, 111, 103, 113, 112, 127, 111, 115, 108, 105, 108, 127, 148, 131, 126, 119, 131, 134, 127, 139, 106, 133, 139, 125, 127, 127, 113, 135, 113, 131, 145, 147, 139, 136)
gender<-c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
data<-data.frame(weight,gender)
attach(data)
boxplot(weight[gender == 1], weight[gender == 2], names = c("Male", "Female"),
        col = topo.colors(6))
After running this program, I get one outlier in each category. How can I remove these outliers using R? I have also attached an image of the box plot.
Try a tidyverse solution:
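library(dplyr)
cleaned_data <- data %>%
  group_by(gender) %>%
  # Per-gender fences: 1.5 * IQR below the first and above the third quartile
  mutate(
    lo_whisker = quantile(weight, .25) - IQR(weight) * 1.5,
    hi_whisker = quantile(weight, .75) + IQR(weight) * 1.5
  ) %>%
  ungroup() %>%
  # Keep only the rows that fall inside the fences
  filter(
    weight >= lo_whisker & weight <= hi_whisker
  )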
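One caveat: boxplot() takes its box edges from the hinges returned by fivenum(), not from quantile(), so the two rules can disagree on borderline points. On the data above, the quantile-based filter appears to drop 108 in group 2 as well as the plotted outlier 106. If the goal is to remove exactly the points the plot marks, a minimal base R sketch is to reuse boxplot.stats(), which is the routine boxplot() relies on. The names is_outlier and cleaned are only illustrative.

```r
weight <- c(117, 118, 125, 86, 131, 93, 103, 107, 112, 97, 105, 105, 111, 105, 124, 111, 103, 113, 112, 127, 111, 115, 108, 105, 108, 127, 148, 131, 126, 119, 131, 134, 127, 139, 106, 133, 139, 125, 127, 127, 113, 135, 113, 131, 145, 147, 139, 136)
gender <- rep(c(1, 2), each = 24)
data <- data.frame(weight, gender)

# For each gender, flag the weights that boxplot.stats() reports as outliers.
# ave() stores the logical result as 0/1, so compare with 1 to get TRUE/FALSE.
is_outlier <- ave(data$weight, data$gender,
                  FUN = function(w) w %in% boxplot.stats(w)$out) == 1

# Keep only the rows that were not flagged.
cleaned <- data[!is_outlier, ]
```

Re-running the boxplot on cleaned is a quick check. Removing points changes the hinges, so new outliers can occasionally appear after filtering.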