Ignore outliers in ggplot2 geom_violin

Is there a way to ignore outliers in geom_violin and have the y-axis range be based on the Q1 and Q3 quantiles (as with range = 1.5 in base R's boxplot)? It would be great if this could be automated (i.e. not just hard-coding a specific y-axis limit).

I see a solution using geom_boxplot here: Ignore outliers in ggplot2 boxplot

But is there a way to replicate this type of solution with geom_violin? Thanks in advance!

Example code below with desired outcome

library(ggplot2)
Result <- as.numeric(c(.2, .03, .11,  .05, .2, .02, .22, 1.1, .02, 120))
Group <- as.factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "c"))
x <- data.frame(Result, Group)

plot = ggplot(x, aes(x=Group, y=Result)) +
  geom_violin()

print(plot)

Here is the output of the above (not a super helpful graphic):

[image]

I'd like something like the plot below using the above data: [image]

I think a method similar to the one you link to will work here, except that you will need to compute those statistics for each group and pass the minimum Q1 and maximum Q3 as the ylim argument of coord_cartesian:

library(dplyr)
# compute Q1 and Q3 for each group
ylims <- x %>%
  group_by(Group) %>%
  summarise(Q1 = quantile(Result, 1/4), Q3 = quantile(Result, 3/4)) %>%
  ungroup() %>%
  #get lowest Q1 and highest Q3
  summarise(lowQ1 = min(Q1), highQ3 = max(Q3))

plot + coord_cartesian(ylim = as.numeric(ylims)*1.05)

Note that you can change the scaling factor in the call to coord_cartesian, and the quantile probabilities in the piped code that computes the range of Q1s and Q3s.
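
If you want limits that follow the 1.5 * IQR whisker rule mentioned in the question rather than the raw quartiles, one option is to take the whisker ends from base R's boxplot.stats() per group. The following is only a sketch under stated assumptions: the helper name whisker_limits and the data frame d are hypothetical and made up for illustration, not taken from the question.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): returns the lowest lower whisker
# and the highest upper whisker across groups, using the coef * IQR rule.
whisker_limits <- function(values, groups, coef = 1.5) {
  per_group <- sapply(split(values, groups), function(v) {
    # $stats is: lower whisker, lower hinge, median, upper hinge, upper whisker
    boxplot.stats(v, coef = coef)$stats[c(1, 5)]
  })
  # Row 1 holds the lower whiskers, row 2 the upper whiskers
  c(min(per_group[1, ]), max(per_group[2, ]))
}

# Illustrative data (an assumption; larger than the question's example)
d <- data.frame(
  Result = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 120,
             0.05, 0.15, 0.25, 0.35, 0.45),
  Group = factor(c(rep("a", 7), rep("b", 5)))
)

lims <- whisker_limits(d$Result, d$Group)

# Zoom the y axis to the whisker range without dropping any rows
p <- ggplot(d, aes(x = Group, y = Result)) +
  geom_violin() +
  coord_cartesian(ylim = lims)
print(p)
```

Because coord_cartesian only zooms the view, the densities are still estimated from all points, including the outlier. Also note that with very small groups the rule may not flag anything: in the question's data, group c has only four values, and 120 appears to fall inside the 1.5 * IQR fence, so the limits would not shrink there.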
