简体   繁体   中英

Problem with storing results of R for loop into an object and using apply function to plot graphs

The problem is storing graphs from for loop as a vector in R.

I have written a function that can plot the graphs table by table (see below).

# packages used
library(xlsx)
library(ggplot2)
library(tidyverse)
library(readxl)
library(ggplot2)
library(reshape2)

d1 <- data.frame(options = c("Strongly Agree", "Agree", "Disagree", "N/A",NA), foo2016 =
                   c(1, 4, 5, 6, NA), foo2017 = c(10, 7, 8, 9, NA), foo2018 = c(10, 7, 15, 14, NA))
d2 <- data.frame(options = c("options","Strongly Agree", "Agree", "Disagree", "N/A",NA),
                 foo2016 = c(11, 4, 3, 2, 1, NA), foo2017 = c(12, 6, 5, 4, 5, NA), foo2018 = c(10, 7, 6, 15, 14, NA))

mytables_in_a_list <- list(d1, d2)


x <- mytables_in_a_list

# where x = my tables in a list, n = table index in the list 
foo_graph <- function(x, n){
  tbl1 <- x[[n]]

  if(tbl1[1,1] != "Strongly Agree"){
    tbl1 <- tbl1[-1,]
  }

  #rename column
  names(tbl1) <- c("Options", "2016", "2017", "2018")

  # remove rofoo with NAs
  tbl1 <- tbl1 %>% drop_na()

  cols.num <- c("2016","2017", "2018")

  tbl1[cols.num] <- sapply(tbl1[cols.num],as.numeric)

  sapply(tbl1, class)

  # alternative to removing rofoo with NAs
  # na.omit(tbl)
  mdf <- melt(tbl1, value.name="value", variable.name="year", id.vars="Options")

  foo_graph <- ggplot(data=mdf, aes(x=year, y=value, group = Options, colour = Options)) +
    geom_line() +
    geom_point( size=4, shape=21, fill="white")

foo_graph
}

The code above works fine. However, due to the fact that I have a lot of tables (about 40), I think I can save a lot of time by using for loop to iterate the plotting so that the graphs (about 40) can be stored in a single R object. I have tried for loop (see code below) but my resultant R object is empty with no error message.

# packages used
library(xlsx)
library(ggplot2)
library(tidyverse)
library(readxl)
library(ggplot2)
library(reshape2)

x <- mytables_in_a_list
foo_graph <- list()
for (i in length(x)){
tbl1 <- x[[i]]

# delete table 1st row if the 1st element in the 1st row is not "Strongly Agree" 
if(tbl1[1,1] != "Strongly Agree"){
  tbl1 <- tbl1[-1,]
}

#rename column
names(tbl1) <- c("Options", "2016", "2017", "2018")

# remove rows with NAs
tbl1 <- tbl1 %>% drop_na()

# change "2016","2017", "2018" columns to numeric
cols.num <- c("2016","2017", "2018")
tbl1[cols.num] <- sapply(tbl1[cols.num],as.numeric)

# melt the table
mdf <- melt(tbl1, value.name="value", variable.name="year", id.vars="Options")

# plot the graph with ggplot
foo_graph[[i]] <- ggplot(data=mdf, aes(x=year, y=value, group = Options, colour = Options)) +
  geom_line() +
  geom_point( size=4, shape=21, fill="white")
}

foo_graph

I also know it is possible to use lapply function to do the same thing. I tried mapply because my function has two arguments but unfortunately, I got the error below.

> mapply(x, foo_graph, n)
Error in get(as.character(FUN), mode = "function", envir = envir) : 
  object 'alistoftables' of mode 'function' was not found

I expect the output to be graphs stored in an R object, such that if I query the 3rd element in the object such as foo_graph[3] , the 3rd graph in the object will be displayed. However, instead of the expected result, below is what is displayed.

> foo_graph[3]
[[1]]
NULL

Without your data, we can't reproduce the behavior. But here is some example code to do this. I use purrr::map instead of loops or the apply family, but you can replace map with lapply and get the same results.

library(tidyverse)
library(cowplot) # to plot a list of plots

# create some fake data
# make a vector for table size
sz <- 21:60

# function to make data frames
make_tbl <- function(size) {
  a <- sample(x = 1:50, size = size, replace = TRUE)
  b <- sample(x = LETTERS[1:3], size = size, replace = TRUE)
  return(tibble(a,b))
}

# a list of tables
list_of_tbls <- map(sz, make_tbl )

# function to plot
make_plot <- function(tbl) {
  ggplot(data=tbl) + geom_boxplot(aes(x=b, y=a, fill=b))
}

# make plot for all tables
list_of_plots <- map(list_of_tbls, make_plot)

# plot (all 40 if on a big screen)
cowplot::plot_grid(plotlist = list_of_plots[1:8], nrow=2)

在此处输入图片说明

An alternative is to bind the tables by row and plot with faceting. Of course the tables should have the same columns. Here the argument .id will create a column tbl that keeps track of the table, so faceting by the 40 tables is straightforward.

# alternative to bind the tables if they have the same columns
bound_tbls <- bind_rows(list_of_tbls, .id = "tbl")

# then plot with facet
ggplot(bound_tbls) + geom_boxplot(aes(x=b, y=a, fill=b)) + facet_wrap("tbl", ncol=8)

在此处输入图片说明

EDIT: With OP's code and data . Slightly modified the plotting function to match the dummy data. Note the use of cowplot::plot_grid at the end. But you should be able to get plots to display one by one if you just run all_graphs[[graph_number]] .

# library(xlsx)
library(ggplot2)
library(tidyverse)
library(readxl)
library(ggplot2)
library(reshape2)

d1 <-
  data.frame(
    options = c("Strongly Agree", "Agree", "Disagree", "N/A", NA),
    foo2016 =
      c(1, 4, 5, 6, NA),
    foo2017 = c(10, 7, 8, 9, NA)
  )
d2 <-
  data.frame(
    options = c("options", "Strongly Agree", "Agree", "Disagree", "N/A", NA),
    foo2016 = c(11, 4, 3, 2, 1, NA),
    foo2017 = c(12, 6, 5, 4, 5, NA)
  )

mytables_in_a_list <- list(d1, d2)

# where x = my tables in a list, n = table index in the list
foo_graph <- function(x, n) {
  tbl1 <- x[[n]]

  if (tbl1[1, 1] != "Strongly Agree") {
    tbl1 <- tbl1[-1, ]
  }

  #rename column
  # edited to match input data that doesn't have 2018
  names(tbl1) <- c("Options", "2016", "2017")

  # remove rofoo with NAs
  tbl1 <- tbl1 %>% drop_na()

  # edited to match input data that doesn't have 2018
  cols.num <- c("2016", "2017")

  tbl1[cols.num] <- sapply(tbl1[cols.num], as.numeric)

  sapply(tbl1, class)

  # alternative to removing rofoo with NAs
  # na.omit(tbl)
  mdf <-
    melt(
      tbl1,
      value.name = "value",
      variable.name = "year",
      id.vars = "Options"
    )

  foo_graph <-
    ggplot(data = mdf, aes(
      x = year,
      y = value,
      group = Options,
      colour = Options
    )) +
    geom_line() +
    geom_point(size = 4,
               shape = 21,
               fill = "white")

  foo_graph
}

all_graphs <-
  lapply(1:length(x), function(i)
    foo_graph(x = mytables_in_a_list, n = i))

# plot all of them
library(cowplot)
pp <- plot_grid(plotlist = all_graphs,
          align = "hv",
          axis = "ltbr")

# to save:
# ggsave(pp, filename = "all_plots.pdf", width=10, height=5)

To see the plots plotted together, just call the pp object:

pp

在此处输入图片说明

After you have all_graphs , you should be able to see individual plots by calling:

all_graphs[[1]]

在此处输入图片说明

If you just call all_graphs you'll only see the last plot in the display window because each one is displayed and replaced with the following. In Rstudio, you can browse backwards in the display pane to see previous plots in the list.

> all_graphs
[[1]]

[[2]]

**Edit 2: Use faceting instead of cowplot . With 40 tables this should work better. Still, a question is whether there is a way to summarize/extract the interesting information from each of the 40 tables and make a single summary plot. Instead of plotting the raw results for 40 surveys.

library(tidyverse)

d1 <-
  data.frame(
    options = c("Strongly Agree", "Agree", "Disagree", "N/A", NA),
    foo2016 =
      c(1, 4, 5, 6, NA),
    foo2017 = c(10, 7, 8, 9, NA)
  )
d2 <-
  data.frame(
    options = c("options", "Strongly Agree", "Agree", "Disagree", "N/A", NA),
    foo2016 = c(11, 4, 3, 2, 1, NA),
    foo2017 = c(12, 6, 5, 4, 5, NA)
  )

mytables_in_a_list <- list(d1, d2)

# combine into a single table
mytables_df <- bind_rows(mytables_in_a_list, .id="table")

# a single chain instead of function. 
# You could make this a function, but not necessary

mytables_df %>%
  drop_na() %>%
  rename("Options" = options,
         "2016" = foo2016,
         "2017" = foo2017) %>%
  filter(Options %in% c("Strongly Agree", "Agree", "Disagree", "N/A")) %>%
# make sure the options are ordered appropriatelly
  mutate(Options = factor(Options, levels = c(
    "Strongly Agree", "Agree", "Disagree", "N/A"
  ))) %>%
# using `gather` instead of `melt`, but its the same operation
  gather("Year", "Value", -table, -Options) %>% 
  ggplot(data = ., aes(x=Year, y=Value, group=Options, color=Options)) +
  geom_line() +
  geom_point() +
  facet_wrap("table", ncol=2) +
  theme(legend.position = "top")

Makes this plot:

在此处输入图片说明

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM