How to get same legend categories on multiple stacked bar graphs when data categories are not identical ggplot2

Question

This is my first time posting on here so please go easy on me. I've been googling this problem for days and have not been able to find a solution so sorry if this has been answered elsewhere.

I am making several stacked bar graphs in ggplot and want the legend categories to be identical on all the graphs (i.e., each category has the same color on each graph) without having to manually set all the colors. The issue is that the categories are not identical between graphs, so simply specifying a palette results in the same category getting different colors on different graphs.

I can't use the actual data I'm working with so I've created a similar data frame that mimics the problem.

Here is the example df:

  Year    Trial    Concentration    Chemical
  2013      1           0.8         Benzene
  2013      1           1.5         Toluene
  2013      1           0.8         Hexane 
  2013      2           1.5         Toluene
  2013      2           0.8         Carboxylic Acid
  2013      2           1.5         Acetone
  2013      3           0.8         Ethanol
  2013      3           1.9         Carboxylic Acid
  2013      3           3.1         Acetone
  2014      1           1.8         Benzene
  2014      1           2.5         Toluene
  2014      1           0.6         Methanol 
  2014      2           1.3         Toluene
  2014      2           1.8         Carboxylic Acid
  2014      2           2.5         Butane
  2014      3           1.5         Ethanol
  2014      3           1.2         Carboxylic Acid
  2014      3           3.5         Acetone
  ...      ...          ...         ...

Here is the code for the graphs:

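  library(ggplot2)

  year_list <- split(df, df$Year)   # one data frame per year
  plot_list <- list()

  for (i in seq_along(year_list)) {
    year_df <- year_list[[i]]

    p <- ggplot(year_df, aes(x = Trial, y = Concentration, width = 0.8)) +
         geom_bar(stat = "identity", aes(fill = Chemical))
    plot_list[[i]] <- p   # store each plot instead of overwriting the list
  }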

And here are the resulting graphs:

[Image: stacked bar graphs]

So for example, on the 2013 graph brown-yellow = benzene and on the 2014 graph brown-yellow = butane. What I would like is for the legend to be identical on both graphs (i.e., the 2014 graph will show benzene in the legend, even though it was not measured in that year) and for each chemical to be the same color on each graph. Like this:

[Image: desired stacked bar graphs]

I know how to do this by hand with scale_fill_manual; however, I have about 30 chemicals, so I would prefer not to set them manually. Let me know if you have questions or need any additional information. Thanks in advance for any help!

Answer 1

I would set up a table ahead of time linking the colors and the chemical names

library(data.table)
library(tidyverse)
library(RColorBrewer)

df <-
  fread("
    Year    Trial    Concentration    Chemical
    2013      1           0.8         Benzene
    2013      1           1.5         Toluene
    2013      1           0.8         Hexane 
    2013      2           1.5         Toluene
    2013      2           0.8         Carboxylic_Acid
    2013      2           1.5         Acetone
    2013      3           0.8         Ethanol
    2013      3           1.9         Carboxylic_Acid
    2013      3           3.1         Acetone
    2014      1           1.8         Benzene
    2014      1           2.5         Toluene
    2014      1           0.6         Methanol 
    2014      2           1.3         Toluene
    2014      2           1.8         Carboxylic_Acid
    2014      2           2.5         Butane
    2014      3           1.5         Ethanol
    2014      3           1.2         Carboxylic_Acid
    2014      3           3.5         Acetone
  ")

chem_colors <-
  tibble(Chemical = factor(unique(df$Chemical))) %>% 
  mutate(color = brewer.pal(n = n(), name = "RdBu")[as.integer(Chemical)])

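# you can use your loop here instead
plot_trials <- function(year) {
  ggplot(filter(df, Year == year), aes(x = Trial, y = Concentration, width=0.8)) +
    # show.legend = TRUE keeps legend keys for chemicals absent from this year
    geom_bar(stat = "identity", aes(fill = Chemical), show.legend = TRUE) +
    scale_fill_manual(
      # name the colors so each chemical keeps its color in every plot
      values = setNames(chem_colors$color, as.character(chem_colors$Chemical)),
      # list every chemical so the legend is identical across plots
      limits = levels(chem_colors$Chemical)
    )
}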


gridExtra::grid.arrange(
  plot_trials(2013),
  plot_trials(2014), 
  nrow = 1
)
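If you do not need specific colors at all, a possible alternative is to make Chemical a factor with the full set of levels before subsetting and then tell the fill scale not to drop unused levels. The sketch below assumes a small toy data frame shaped like the one in the question, and plot_year is a hypothetical helper name. The show.legend = TRUE argument is there because recent ggplot2 releases may otherwise leave out the legend key for levels that have no data in a given plot.

```r
library(ggplot2)

# Toy data shaped like the question's example (values are illustrative)
df <- data.frame(
  Year          = c(2013, 2013, 2013, 2014, 2014, 2014),
  Trial         = c(1, 1, 2, 1, 1, 2),
  Concentration = c(0.8, 1.5, 0.8, 1.8, 0.6, 2.5),
  Chemical      = c("Benzene", "Toluene", "Acetone", "Benzene", "Methanol", "Butane")
)

# Fix the full set of levels once, before subsetting by year.
# Row subsetting keeps unused factor levels, so every subset shares them.
df$Chemical <- factor(df$Chemical, levels = sort(unique(df$Chemical)))

# plot_year is a hypothetical helper, not part of ggplot2
plot_year <- function(year) {
  ggplot(df[df$Year == year, ], aes(x = Trial, y = Concentration, fill = Chemical)) +
    # show.legend = TRUE: assumed necessary in recent ggplot2 versions
    # to draw keys for levels that are absent from this subset
    geom_col(width = 0.8, show.legend = TRUE) +
    # drop = FALSE keeps unused levels, so the default palette is
    # spread over all chemicals and stays the same in every plot
    scale_fill_discrete(drop = FALSE) +
    ggtitle(paste("Year", year))
}

p2013 <- plot_year(2013)
p2014 <- plot_year(2014)
```

Because the default hue palette is assigned over all levels rather than only the ones present, both plots should list the same five chemicals with the same colors.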

Answer 2

Here is the answer I got to work for my large data set. I used yake84's answer above and added the colorRampPalette() function to be able to extract more colors from a palette. I also changed chem_colors into a named vector because as a tibble the colors were not being mapped to the chemicals in my dataframe.

getPalette <- colorRampPalette(brewer.pal(9, "Set1"))   # create a palette with more than 9 colors

chem_colors <-
   tibble(Chemical = factor(unique(df$Chemical))) %>%
   mutate(color = getPalette(n()))   # one color per chemical (about 30 in my data)
chem_colors <- setNames(chem_colors$color, as.character(chem_colors$Chemical))   # create named vector

plot_trials <- function(year) {
 ggplot(filter(df, Year == year), aes(x = Trial, y = Concentration, width=0.8)) +
  geom_bar(stat = "identity", aes(fill = Chemical)) +
  scale_fill_manual(values = chem_colors)
}
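For anyone who wants to keep the loop-style workflow from the question, here is a minimal sketch of the same named-vector idea applied to a list of per-year plots. It assumes toy data, three arbitrary anchor colors for colorRampPalette, and the hypothetical names get_palette and by_year. Passing limits is an addition on top of the answer above: it should make every legend list all chemicals, including those not measured in a given year.

```r
library(ggplot2)

# Toy data shaped like the question's example (values are illustrative)
df <- data.frame(
  Year          = c(2013, 2013, 2013, 2014, 2014, 2014),
  Trial         = c(1, 1, 2, 1, 1, 2),
  Concentration = c(0.8, 1.5, 0.8, 1.8, 0.6, 2.5),
  Chemical      = c("Benzene", "Toluene", "Acetone", "Benzene", "Methanol", "Butane")
)

# All chemicals across all years, in a fixed order
chemicals <- sort(unique(df$Chemical))

# Interpolate as many colors as needed from a few anchor colors
# (the anchors are arbitrary choices for this illustration)
get_palette <- grDevices::colorRampPalette(c("#E41A1C", "#377EB8", "#4DAF4A"))

# Named vector: names are chemicals, values are colors
chem_colors <- setNames(get_palette(length(chemicals)), chemicals)

# One plot per year, stored in a named list
by_year <- split(df, df$Year)
plot_list <- lapply(by_year, function(d) {
  ggplot(d, aes(x = Trial, y = Concentration, fill = Chemical)) +
    # show.legend = TRUE: assumed necessary in recent ggplot2 versions
    # to draw keys for chemicals with no data in this year
    geom_col(width = 0.8, show.legend = TRUE) +
    # values are matched to chemicals by name; limits lists every
    # chemical so each legend has the same entries
    scale_fill_manual(values = chem_colors, limits = names(chem_colors))
})
```

The plots can then be pulled out by year, for example plot_list[["2013"]], and arranged with gridExtra::grid.arrange as in the first answer.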
