binning geom_boxplot in ggplot2 in R?

I want to use geom_boxplot to make boxplots that correlate two variables: for each bin of x values, plot the distribution (as boxplot) of y values for that bin. I tried:

ggplot(cars) + geom_boxplot(aes(x=dist, y=speed))

but this creates basically one large bin of x values. How can I make it so that, for each bin of dist, there's a boxplot representing the corresponding speed values?

Not sure what you mean by "bin", since you haven't provided any bins in your question. If you just mean that you would like a speed boxplot for each unique dist value, you can do it like this (treating dist as discrete):

ggplot(cars) + geom_boxplot(aes(factor(dist), speed))

If you were to actually create bins you could do something like:

cars$bin <- cut(cars$dist, c(1, 10, 30, 50, 200))
ggplot(cars) + geom_boxplot(aes(bin, speed))
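
If you want equal-width bins rather than hand-picked breaks, one way is to generate the breaks with seq(). This is only a sketch: the width of 20 and the names cars_binned, bin_width, upper and breaks are my own choices, not anything from the question. Keep in mind that cut() returns NA for any value that falls outside the breaks, so it is worth checking the bin counts before plotting.

```r
library(ggplot2)

# Work on a copy so the built-in `cars` data stays untouched
cars_binned <- cars

bin_width <- 20  # assumed width; choose whatever suits the data

# Equal-width breaks from 0 up to the first multiple of bin_width >= max(dist)
upper <- ceiling(max(cars_binned$dist) / bin_width) * bin_width
breaks <- seq(0, upper, by = bin_width)

# right = FALSE gives left-closed bins such as [0,20);
# include.lowest = TRUE closes the last bin so the maximum is not dropped
cars_binned$bin <- cut(cars_binned$dist, breaks = breaks,
                       right = FALSE, include.lowest = TRUE)

# Sanity check: counts per bin, with any NA (unbinned) values shown
print(table(cars_binned$bin, useNA = "ifany"))

ggplot(cars_binned, aes(bin, speed)) + geom_boxplot()
```

The factor levels produced by cut() are already in increasing order, so the boxes appear left to right without any extra sorting.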

Just to put it out there, you could also do

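library(dplyr)
library(ggplot2)

bin_size <- 10

cars %>%
  # Integer division gives the lower edge of each bin (0, 10, 20, ...)
  mutate(bin_dist = factor(dist %/% bin_size * bin_size)) %>%
  ggplot(aes(x = bin_dist, y = speed)) +
  geom_boxplot()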

Figure: geom_boxplot, bin size 10
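
To see what the integer-division step does, here is a tiny worked example in base R. The sample values in dist_values are made up for illustration. Multiplying by bin_size again (rather than by a hard-coded 10) keeps the bin starts correct if you later change the width.

```r
bin_size <- 10

# A few illustrative dist values (hypothetical, not taken from `cars`)
dist_values <- c(2, 10, 17, 26, 120)

# %/% drops the remainder; multiplying back gives the lower edge of the bin
bin_start <- dist_values %/% bin_size * bin_size
print(bin_start)
# 0 10 10 20 120
```

So a value of 17 lands in the bin starting at 10, and a value sitting exactly on an edge (10, 120) starts a new bin.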

And to make the labeling better:

cars2 <- cars %>%
  mutate(bin_dist = dist %/% bin_size * bin_size)

# Sort the bin starts so the labels line up with the (sorted) factor levels
bin_starts <- sort(unique(cars2$bin_dist))

ggplot(cars2, aes(x = factor(bin_dist), y = speed)) +
  geom_boxplot() +
  scale_x_discrete(labels = paste0(bin_starts, "-", bin_starts + bin_size)) +
  labs(x = "dist")

Figure: the same plot with improved labels

cars2 is saved as its own object so that its bin values can be used to build the labels with paste0.
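
Another option, for what it's worth, is ggplot2's own cut_width() helper used as the group aesthetic. This keeps dist on a continuous x axis, so no manual labels are needed. This is a sketch under my own assumptions: the bin width of 10 and the choice of boundary = 0 with left-closed bins are mine, not something the question specifies.

```r
library(ggplot2)

bin_size <- 10  # assumed bin width

# group = cut_width(...) draws one box per bin while x stays continuous.
# boundary = 0 aligns the bin edges to multiples of bin_size;
# closed = "left" makes the bins left-closed, e.g. [10,20)
p <- ggplot(cars, aes(x = dist, y = speed)) +
  geom_boxplot(aes(group = cut_width(dist, width = bin_size,
                                     boundary = 0, closed = "left")))
p
```

Each box is positioned and sized from the dist values that actually fall in its bin, so empty bins simply leave a gap on the axis.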
