Box plot with outliers for all variables with percentile in R
This is my dataset:
comp=structure(list(MYCT = c(125L, 29L, 29L, 29L, 29L, 26L, 23L, 23L,
23L, 23L, 400L, 400L), MMIN = c(256L, 8000L, 8000L, 8000L, 8000L,
8000L, 16000L, 16000L, 16000L, 32000L, 1000L, 512L), MMAX = c(6000L,
32000L, 32000L, 32000L, 16000L, 32000L, 32000L, 32000L, 64000L,
64000L, 3000L, 3500L), CACH = c(256L, 32L, 32L, 32L, 32L, 64L,
64L, 64L, 64L, 128L, 0L, 4L), CHMIN = c(16L, 8L, 8L, 8L, 8L,
8L, 16L, 16L, 16L, 32L, 1L, 1L), CHMAX = c(128L, 32L, 32L, 32L,
16L, 32L, 32L, 32L, 32L, 64L, 2L, 6L), PRP = c(198L, 269L, 220L,
172L, 132L, 318L, 367L, 489L, 636L, 1144L, 38L, 40L), ERP = c(199L,
253L, 253L, 253L, 132L, 290L, 381L, 381L, 749L, 1238L, 23L, 24L
)), .Names = c("MYCT", "MMIN", "MMAX", "CACH", "CHMIN", "CHMAX",
"PRP", "ERP"), class = "data.frame", row.names = c(NA, -12L))
I have 8 variables. I need a box plot in which the outliers are drawn as red circles and the axis shows a percentile scale. So far I have only written
boxplot(comp$MMIN)
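For example, in this picture I see two outliers above the 75th percentile. I need this plot for each of the 8 variables. How can I do that?
This is by no means a ready-made solution, but it should get you started.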
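library(ggplot2)
# x position of the hand-drawn scale next to the box
off=0.55
ggplot() +
  geom_boxplot(data=comp,
    aes(x="",y=MMIN),
    # custom outliers
    outlier.colour="red",
    outlier.fill="red",
    outlier.size=3
  ) +
  # placeholder scale: the y positions and labels still need to be calculated
  geom_line(aes(x=c(off,off),y=c(5000,20000))) +
  geom_text(aes(x=c(off,off),y=c(5000,20000),label=c("needs to", "be calculated")))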
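To cover all 8 variables in one figure, one option is to reshape the data to long format and facet by variable. The following is a minimal sketch, not part of the answer above: it assumes ggplot2 is installed, uses base R's `stack()` for the reshaping, and the names `long` and `p` are illustrative. It handles only the red outliers and the per-variable panels; the percentile labels are not included.

```r
library(ggplot2)

# Data from the question
comp <- data.frame(
  MYCT  = c(125L, 29L, 29L, 29L, 29L, 26L, 23L, 23L, 23L, 23L, 400L, 400L),
  MMIN  = c(256L, 8000L, 8000L, 8000L, 8000L, 8000L, 16000L, 16000L, 16000L,
            32000L, 1000L, 512L),
  MMAX  = c(6000L, 32000L, 32000L, 32000L, 16000L, 32000L, 32000L, 32000L,
            64000L, 64000L, 3000L, 3500L),
  CACH  = c(256L, 32L, 32L, 32L, 32L, 64L, 64L, 64L, 64L, 128L, 0L, 4L),
  CHMIN = c(16L, 8L, 8L, 8L, 8L, 8L, 16L, 16L, 16L, 32L, 1L, 1L),
  CHMAX = c(128L, 32L, 32L, 32L, 16L, 32L, 32L, 32L, 32L, 64L, 2L, 6L),
  PRP   = c(198L, 269L, 220L, 172L, 132L, 318L, 367L, 489L, 636L, 1144L,
            38L, 40L),
  ERP   = c(199L, 253L, 253L, 253L, 132L, 290L, 381L, 381L, 749L, 1238L,
            23L, 24L)
)

# Long format: column "values" holds the data, "ind" the variable name
long <- stack(comp)

# One panel per variable, each with its own y scale, outliers in red
p <- ggplot(long, aes(x = "", y = values)) +
  geom_boxplot(outlier.colour = "red", outlier.size = 3) +
  facet_wrap(~ ind, scales = "free_y", nrow = 2) +
  labs(x = NULL, y = NULL)

print(p)
```

The free y scales matter here because the variables differ by several orders of magnitude (compare CHMIN with MMAX).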
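Here is a possible solution using base graphics. The key is to suppress the y axis and then add tick marks based on the summary statistics.
#build the box plot and suppress the y axis labels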
b<-boxplot(comp$MMIN, yaxt="n", range=1.1)
points(x=rep(1, nrow(comp)), y=comp$MMIN)
#highlight outliers
points(x=rep(1, length(b$out)), y=b$out, col="red", pch=19)
#get the points for the y axis
myscale<-summary(comp$MMIN)
#remove the median
myscale<-myscale[-3]
#add the y-axis; b$stats holds the lower whisker, lower hinge, median, upper hinge and upper whisker
axis(2, b$stats, labels=c(0, 25, 50, 75, 100))
#use this option for labels on both the right and left side
b<-boxplot(comp$MMIN, outline = FALSE)
axis(4, b$stats, labels=c(0, 25, 50, 75, 100))
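The same base-graphics idea can be applied to all 8 variables by wrapping it in a function and looping over the columns. This is a sketch under stated assumptions rather than the answer's own code: `percentile_boxplot` is a hypothetical helper, the default whisker range of 1.5 is used, and the ticks are placed at `quantile()` values, which can differ slightly from the box hinges that `boxplot()` draws.

```r
# Data from the question
comp <- data.frame(
  MYCT  = c(125L, 29L, 29L, 29L, 29L, 26L, 23L, 23L, 23L, 23L, 400L, 400L),
  MMIN  = c(256L, 8000L, 8000L, 8000L, 8000L, 8000L, 16000L, 16000L, 16000L,
            32000L, 1000L, 512L),
  MMAX  = c(6000L, 32000L, 32000L, 32000L, 16000L, 32000L, 32000L, 32000L,
            64000L, 64000L, 3000L, 3500L),
  CACH  = c(256L, 32L, 32L, 32L, 32L, 64L, 64L, 64L, 64L, 128L, 0L, 4L),
  CHMIN = c(16L, 8L, 8L, 8L, 8L, 8L, 16L, 16L, 16L, 32L, 1L, 1L),
  CHMAX = c(128L, 32L, 32L, 32L, 16L, 32L, 32L, 32L, 32L, 64L, 2L, 6L),
  PRP   = c(198L, 269L, 220L, 172L, 132L, 318L, 367L, 489L, 636L, 1144L,
            38L, 40L),
  ERP   = c(199L, 253L, 253L, 253L, 132L, 290L, 381L, 381L, 749L, 1238L,
            23L, 24L)
)

# Hypothetical helper: one box plot with red outliers and a percentile axis
percentile_boxplot <- function(x, name) {
  # draw the box plot without the default y axis
  b <- boxplot(x, yaxt = "n", main = name)
  # overplot the outliers (if any) as filled red circles
  if (length(b$out) > 0) {
    points(x = rep(1, length(b$out)), y = b$out, col = "red", pch = 19)
  }
  # place ticks at the 0th, 25th, 50th, 75th and 100th percentiles
  probs <- c(0, 0.25, 0.5, 0.75, 1)
  axis(2, at = quantile(x, probs), labels = probs * 100, las = 1)
  invisible(b)
}

# 2 x 4 grid: one panel per variable
op <- par(mfrow = c(2, 4))
res <- lapply(names(comp), function(v) percentile_boxplot(comp[[v]], v))
names(res) <- names(comp)
par(op)
```

Each element of `res` is the list returned by `boxplot()`, so `res$MMIN$out`, for instance, gives the values that were flagged as outliers for MMIN.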