Subset and plot data by for loop / lapply

Question

I have about 300 sites spread over multiple mountain types, and I am trying to produce some meaningful plots. I would therefore like to subset my data by mountain type (type) and plot each subset with ggplot2. I would like to automate the process with a for loop or with lapply, but I am a beginner in both.

I have found some good examples using a for loop: http://www.reed.edu/data-at-reed/resources/R/loops_with_ggplot2.html or using lapply: Use for loop in ggplot2 to generate a list

However, both approaches generate empty plots. What am I doing wrong? How can I fix my code?

# Create dummy data
df<- data.frame(loc = rep(c("l1", "l2"), each = 3),
                name = rep(c("A", "B"), 3),
                grid = c(5,6,7,2,3,5),
                area = c(5,10,1,1,3,1),
                areaOrig = rep(c(20, 10, 5), each = 2))

df2<-rbind(df, df)

# Create two mountain types
df2$type = rep(c("y", "z"), each = 6)

Create function to produce plots:

require(ggplot2)

type.graph <- function(df2, na.rm = TRUE, ...) {

  # Create list of mountain types
  type_list <-unique(df2$type)

  # Create a for loop to produce ggplot2 plots
  for (i in seq_along(type_list)) {

    # create a plot for each type in df2
    plot<-

      windows()

      ggplot(subset(df2, df2$type == type_list[i]),
             aes(x = grid, 
                 y = area)) +
        geom_bar(stat = "identity") +
        ggtitle(type_list[i]) +
        facet_grid(loc ~name)

    print(plot)
  }
}

type.graph(df2)

Use lapply to produce plots:

# unique mountain types
type_list <- unique(df2$type)

#create list of ggplots per type
p_re <-
  lapply(type_list, function(i){

    ggplot(subset(df2, type == type_list[i]), 
           aes(x = grid, 
               y = area)) +
      geom_bar(stat = "identity")

  })

#assign names
names(p_re) <- type_list

#plot
p_re$y

Answer 1

I would suggest using the purrr package from the tidyverse: nest the data frame by the grouping factor, then loop through the nested subsets. Below is an example:

library(tidyverse)

by_type <- df2 %>% 
  group_by(type) %>% 
  nest() %>% 
  mutate(plot = map(data, 
                    ~ggplot(. ,aes(x = grid, y = area)) +
                      geom_bar(stat = "identity") +
                      ggtitle(.) +
                      facet_grid(loc ~name)))

by_type
# A tibble: 2 x 3
  type  data             plot    
  <chr> <list>           <list>  
1 y     <tibble [6 × 5]> <S3: gg>
2 z     <tibble [6 × 5]> <S3: gg>

The above gives you a normal data frame, but the data and plot columns are list columns. So the first "cell" of data contains all the data for type == y and the second contains all the data for type == z. This basic structure is created by tidyr::nest. You then create a new variable, which I've called plot, by looping through the data list column with purrr::map; inside the formula, . stands for the current element of data, so you pass it where ggplot expects its data argument. Note that there are map2 and pmap functions for when you want to loop through more than one thing at a time (for example, if you wanted your title to be something different).

You can then easily look at your plots with by_type$plot, or save them with

walk2(by_type$type, by_type$plot, 
      ~ggsave(paste0(.x, ".pdf"), .y))
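The map2 remark above can be sketched as follows. In the map example, ggtitle(.) receives the whole nested data frame rather than the type label, so this variant passes both the data and the type to the plotting formula. It is a minimal sketch, not the answer's original code: the name by_type_titled is made up for illustration, and the ungroup() call is an assumption added so that mutate works on the whole nested table at once.

```r
library(ggplot2)
library(dplyr)
library(tidyr)
library(purrr)

# Same dummy data as in the question
df <- data.frame(loc = rep(c("l1", "l2"), each = 3),
                 name = rep(c("A", "B"), 3),
                 grid = c(5, 6, 7, 2, 3, 5),
                 area = c(5, 10, 1, 1, 3, 1),
                 areaOrig = rep(c(20, 10, 5), each = 2))
df2 <- rbind(df, df)
df2$type <- rep(c("y", "z"), each = 6)

# by_type_titled is a hypothetical name used only in this sketch
by_type_titled <- df2 %>%
  group_by(type) %>%
  nest() %>%
  ungroup() %>%
  # .x is one nested data frame, .y is the matching type label
  mutate(plot = map2(data, type,
                     ~ ggplot(.x, aes(x = grid, y = area)) +
                       geom_bar(stat = "identity") +
                       ggtitle(.y) +
                       facet_grid(loc ~ name)))

# Show the plot for the first type
by_type_titled$plot[[1]]
```

The same walk2 call shown above should work unchanged on by_type_titled$type and by_type_titled$plot.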

Answer 2

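In your loop, plot <- is immediately followed by windows(), so plot is assigned the return value of windows() rather than the ggplot object, and the ggplot call on the following lines is evaluated but never stored or printed. Build the plot first, then open the window and print it. Try this: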

require(ggplot2)

type.graph <- function(df2, na.rm = TRUE, ...) {

  # Create list of mountain types
  type_list <- unique(df2$type)

  # Create a for loop to produce ggplot2 plots
  for (i in seq_along(type_list)) {

    # create a plot for each type in df2
    plot <-
      ggplot(subset(df2, df2$type == type_list[i]),
             aes(x = grid,
                 y = area)) +
      geom_bar(stat = "identity") +
      ggtitle(type_list[i]) +
      facet_grid(loc ~ name)

    # open a new graphics window (windows() exists only on Windows;
    # dev.new() is the portable equivalent), then draw the plot in it
    windows()
    print(plot)
  }
}

type.graph(df2)
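The lapply attempt in the question fails for a different reason: lapply passes each element of type_list (the string "y" or "z") to the function, so type_list[i] indexes an unnamed vector by name and returns NA, which leaves subset() with no rows. A minimal sketch of one way to fix it is below; the argument name current_type is chosen here for illustration, and naming the vector before the call is just one convenient way to get a named list back.

```r
library(ggplot2)

# Same dummy data as in the question
df <- data.frame(loc = rep(c("l1", "l2"), each = 3),
                 name = rep(c("A", "B"), 3),
                 grid = c(5, 6, 7, 2, 3, 5),
                 area = c(5, 10, 1, 1, 3, 1),
                 areaOrig = rep(c(20, 10, 5), each = 2))
df2 <- rbind(df, df)
df2$type <- rep(c("y", "z"), each = 6)

type_list <- unique(df2$type)

# Name the vector so that lapply() returns a list named by type
names(type_list) <- type_list

p_re <- lapply(type_list, function(current_type) {
  # current_type is already the value ("y" or "z"), so compare against it directly
  ggplot(df2[df2$type == current_type, ],
         aes(x = grid, y = area)) +
    geom_bar(stat = "identity") +
    ggtitle(current_type) +
    facet_grid(loc ~ name)
})

# At top level the plot is auto-printed
p_re$y
```

Alternatively, keeping the index-based body from the question and iterating over seq_along(type_list) should also work, at the cost of assigning the names afterwards.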

Answer 3

Several years ago, before the tidyverse, I used ggplot2 to produce a list of plot objects in a way similar to yours. At the end of the custom function I put an explicit return() statement to return the created object. That worked for me (for example, to run ggsave() later).

Example of a custom histogram function, with df as the main dataset to plot, followed by some extra parameters:

# Build a histogram of column cl of df and mark the mean, median,
# 10th and 90th percentiles, which are looked up in the summary table st
ggHistFunc <- function (cl, df, ymax, st) {
    # summary statistics for this column
    mn <- st$means[st$variable==cl]
    P50 <- st$medians[st$variable==cl]
    P10 <- st$P10[st$variable==cl]
    P90 <- st$P90[st$variable==cl]
    gghist <-
        ggplot(data = df, aes_string(x = cl)) +
        # ten bins across the range of the column
        geom_histogram(binwidth = diff(range(df[,cl]))/10, aes(y = ..count..),
                       fill = "white", colour = "black") +
        # each geom_line draws a vertical line from y = 0 to y = ymax
        geom_line(data = data.frame(x = c(mn, mn)), y = c(0, ymax),
                  aes(x=x), colour="green", size=1) +
        geom_line(data = data.frame(x = c(P50, P50)), y = c(0, ymax),
                  aes(x=x), colour="brown", size=1) +
        geom_line(data = data.frame(x = c(P10, P10)), y = c(0, ymax),
                  aes(x=x), colour="blue", size=1) +
        geom_line(data = data.frame(x = c(P90, P90)), y = c(0, ymax),
                  aes(x=x), colour="red", size=1)
    #print(gghist)
    # return the plot object so it can be printed or saved later
    return(gghist)
}

This is followed by a "loop" that creates a histogram for every parameter:

gg_Hist_HM <- lapply(X = as.list(names(params_HM)),
                     FUN = ggHistFunc, df = params_HM, ymax = 100, st = stat_HM)
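Since params_HM and stat_HM are not shown, here is a self-contained sketch of the same pattern (a function that returns the plot object, called through lapply, with saving done afterwards). It uses the built-in mtcars data; the helper name make_hist, the chosen columns, the fixed bin count, and the output location in tempdir() are all assumptions made for illustration, and the marker line is drawn with geom_vline rather than the geom_line approach above.

```r
library(ggplot2)

# Hypothetical helper: build and return a histogram for one numeric column
make_hist <- function(col_name, data) {
  p <- ggplot(data, aes(x = .data[[col_name]])) +
    geom_histogram(bins = 10, fill = "white", colour = "black") +
    # mark the column mean with a vertical line
    geom_vline(xintercept = mean(data[[col_name]]), colour = "green") +
    xlab(col_name)
  # return the plot object instead of printing it
  return(p)
}

# Columns to plot (chosen arbitrarily for this sketch)
cols <- c("mpg", "hp", "wt")
hist_list <- lapply(cols, make_hist, data = mtcars)
names(hist_list) <- cols

# Save the stored plots later, one PDF per column
out_files <- file.path(tempdir(), paste0(cols, ".pdf"))
for (k in seq_along(cols)) {
  ggsave(out_files[k], hist_list[[k]], width = 5, height = 4)
}
```

Returning the object keeps plot creation separate from display, so the same list can be printed interactively, for example with hist_list$mpg, or written to files.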

Now I see that the approach proposed above with the purrr package looks more elegant!

3 answers

solution1: 4, ACCEPTED, 2018-01-30 18:19:31
solution2: 1, 2018-01-30 18:02:19
solution3: 0, 2018-01-31 14:37:56
