
Grouped ggplot boxplot in R

For a sample dataframe:

   df <- structure(list(year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L), letter_group = c("A", "A", "A", "B", "B", "B", "C", 
"C", "C", "C", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", 
"A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", 
"A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"), value = c(2L, 
3L, 4L, 5L, 6L, 6L, 7L, 8L, 5L, 6L, 7L, 3L, 4L, 5L, 6L, 4L, 5L, 
6L, 2L, 3L, 4L, 4L, 5L, 6L, 7L, 8L, 5L, 3L, 2L, 4L, 5L, 6L, 4L, 
3L, 4L, 5L, 6L, 7L, 1L, 2L, 4L, 5L, 6L, 4L)), .Names = c("year", 
"letter_group", "value"), row.names = c(NA, -44L), class = c("tbl_df", 
"tbl", "data.frame"), spec = structure(list(cols = structure(list(
    year = structure(list(), class = c("collector_integer", "collector"
    )), letter_group = structure(list(), class = c("collector_character", 
    "collector")), value = structure(list(), class = c("collector_integer", 
    "collector"))), .Names = c("year", "letter_group", "value"
)), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

I am trying to create a box plot with the years on the x-axis, and with the letter groups shown side by side within each year.

That is, A, B, C for year 1, then a small space, then A, B, C for year 2, and so on.

I have the following:

library(ggplot2)

p1 <- ggplot(df, aes(year, value))
p1 + geom_boxplot(aes(group=letter_group))

But this produces only three box plots.

Could someone please help me?

An alternative to @nouse's solution (which is the best solution) is to use faceting. One benefit of faceting is that you also get letter group labels on the x-axis.

Define data structure

# Load library
library(ggplot2)

# Define data frame
df <- structure(list(year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                              2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 
                              3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
                              4L, 4L), letter_group = c("A", "A", "A", "B", "B", "B", "C", 
                                                        "C", "C", "C", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", 
                                                        "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", 
                                                        "A", "A", "A", "B", "B", "B", "C", "C", "C", "C", "C"), 
                     value = c(2L, 3L, 4L, 5L, 6L, 6L, 7L, 8L, 5L, 6L, 7L, 3L, 4L, 5L, 6L, 4L, 5L, 
                               6L, 2L, 3L, 4L, 4L, 5L, 6L, 7L, 8L, 5L, 3L, 2L, 4L, 5L, 6L, 4L, 
                               3L, 4L, 5L, 6L, 7L, 1L, 2L, 4L, 5L, 6L, 4L)), 
                .Names = c("year", "letter_group", "value"), 
                row.names = c(NA, -44L), 
                class = c("tbl_df","tbl", "data.frame"), 
                spec = structure(list(cols = structure(list(year = structure(list(), class = c("collector_integer", "collector")), 
                                                             letter_group = structure(list(), class = c("collector_character", "collector")), 
                                                             value = structure(list(), class = c("collector_integer",  "collector"))), 
                                                       .Names = c("year", "letter_group", "value")), 
                                      default = structure(list(), class = c("collector_guess", "collector"))), 
                                 .Names = c("cols", "default"), class = "col_spec"))

Plot results

# Plot results
g <- ggplot(df)
g <- g + geom_boxplot(aes(letter_group, value))
g <- g + facet_grid(. ~ year, switch = "x")
g <- g + theme(strip.placement = "outside",
               strip.background = element_blank(),
               panel.background = element_rect(fill = "white"),
               panel.grid.major = element_line(colour = alpha("gray50", 0.25), linetype = "dashed"))
g <- g + ylab("Value") + xlab("Year & Letter Group")
print(g)

Created on 2019-05-23 by the reprex package (v0.2.1)

Your question has been largely answered here.

Your data frame does not include factors, so you first need to turn your grouping variables into factors. Then there are two options, as per the link given above. You can construct a new factor by combining your two original factors (as shown in z-cool's answer), although this does not create the desired space between factor levels on the x-axis. Alternatively, you can assign one of your factors to fill or col. In your case, the quickest way to solve your problem is

ggplot(df, aes(as.factor(year), value, fill=as.factor(letter_group))) + geom_boxplot()

If you do not want to colorize your plot, you can change this with scale_fill_manual or scale_color_manual, depending on your choice in aes before:

ggplot(df, aes(as.factor(year), value, fill=as.factor(letter_group))) + geom_boxplot() +
  scale_fill_manual(values=c("white", "white", "white")) +
  theme(legend.position = "none")
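
A variant that avoids the fill aesthetic altogether is to keep year on the x-axis and set the group aesthetic to the combination of both variables. The following is only a sketch: it assumes ggplot2 is installed, rebuilds the question's data compactly with rep() as a plain data.frame, and uses the illustrative name p_grouped.

```r
library(ggplot2)

# Compact rebuild of the question's 44 rows (same values, plain data.frame)
df <- data.frame(
  year = rep(1:4, times = c(10, 10, 13, 11)),
  letter_group = rep(rep(c("A", "B", "C"), times = 4),
                     times = c(3, 3, 4, 3, 3, 4, 3, 3, 7, 3, 3, 5)),
  value = c(2, 3, 4, 5, 6, 6, 7, 8, 5, 6,
            7, 3, 4, 5, 6, 4, 5, 6, 2, 3,
            4, 4, 5, 6, 7, 8, 5, 3, 2, 4, 5, 6, 4,
            3, 4, 5, 6, 7, 1, 2, 4, 5, 6, 4)
)

# One group per year/letter combination; the default boxplot position
# places the groups side by side within each year
p_grouped <- ggplot(df, aes(factor(year), value,
                            group = interaction(year, letter_group))) +
  geom_boxplot() +
  xlab("Year")
print(p_grouped)
```

With this approach the boxes carry no letter labels, so it is mainly useful when the order A, B, C within each year is obvious from context.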

This should work:

library(tidyverse)
df %>% 
  mutate(year_group = paste(year, letter_group)) %>% 
  ggplot(aes(year_group, value)) +
  geom_boxplot()
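
Note that paste() returns a character vector, so ggplot2 orders the boxes alphabetically; with ten or more years, "10 A" would sort before "2 A". One possible way around this, sketched here under the assumption that base R's interaction() with lex.order = TRUE is acceptable, is to build the combined variable as a factor. The column name year_group follows the answer above; p_combined and the compact rebuild of the data are illustrative.

```r
library(ggplot2)

# Compact rebuild of the question's 44 rows (same values, plain data.frame)
df <- data.frame(
  year = rep(1:4, times = c(10, 10, 13, 11)),
  letter_group = rep(rep(c("A", "B", "C"), times = 4),
                     times = c(3, 3, 4, 3, 3, 4, 3, 3, 7, 3, 3, 5)),
  value = c(2, 3, 4, 5, 6, 6, 7, 8, 5, 6,
            7, 3, 4, 5, 6, 4, 5, 6, 2, 3,
            4, 4, 5, 6, 7, 8, 5, 3, 2, 4, 5, 6, 4,
            3, 4, 5, 6, 7, 1, 2, 4, 5, 6, 4)
)

# lex.order = TRUE makes year vary slowest, so the levels run
# "1 A", "1 B", "1 C", "2 A", ... in numeric year order
df$year_group <- interaction(df$year, df$letter_group,
                             sep = " ", lex.order = TRUE)

p_combined <- ggplot(df, aes(year_group, value)) +
  geom_boxplot()
print(p_combined)
```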
