
Loop ggplot graphs of y vs. x for different categories with linear regression: how to plot the categories consecutively?

EDITED: I have a large database and am trying to repeatedly assess energy expenditure over time, with the aim of comparing multiple binary variables (0/1, e.g. presence of severe head trauma vs. none). The graph analysis should be repeated for all available variables in the database. All graphs should be exported to a PDF file.

Currently I'm using the following code:

library(tidyverse)
library(ggpmisc)
my_data %>%
pdf(file="Plots.pdf" )
print(colnames(my_data) %>%
        map(function(x) my_data%>%
              ggplot(aes(x = Day, 
                         y = REE,
                         color=as_factor(x)))+
              scale_x_continuous(breaks = c(0,2,4,6,8,10,12,14,16,18,20,22,24,26,28))+
              scale_y_continuous(limits= c(0000,4000))+
              geom_point()+
              geom_smooth(method=lm,
                          se=TRUE,
                          size=2/10,
                          aes(group=as_factor(x)))+
              stat_poly_eq(aes(label = paste(after_stat(eq.label),
                                             after_stat(rr.label), 
                                             after_stat(p.value.label),
                                             sep = "*\", \"*")),
                           label.y="bottom", label.x="right")+
              labs(x="Time [d]",
                   y="Resting Energy Expenditure [kcal]")+
              scale_colour_grey(start=0.7,
                                  end=0.3)+
              theme_bw()
))
dev.off()

It generates the PDF File with all graphs. However, it does not group/color according to the as_factor(x) and all data points are categorised into the same group.

Does anyone have an explanation for why categorising by the factor variable doesn't work, and how to resolve it?

The issue is that you loop over column names, which are character strings. With color=as_factor(x) you map a constant character string to the color aesthetic, i.e. you are doing something like color="foo". To tell ggplot2 to map the data column whose name is stored in x to the color aesthetic, use e.g. the .data pronoun, i.e. color=as_factor(.data[[x]]). The same applies to the group aesthetic inside geom_smooth.
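To make the difference concrete, here is a minimal sketch on mtcars that counts the groups ggplot2 actually builds in each case. It uses base factor() instead of as_factor() only to avoid extra dependencies; the variable names p_constant, p_column, n_constant and n_column are made up for this illustration.

```r
library(ggplot2)

x <- "cyl"  # a column name held as a character string, as in the loop

# Maps the literal string "cyl": every row gets the same value
p_constant <- ggplot(mtcars, aes(mpg, hp, color = factor(x))) +
  geom_point()

# Looks up the column whose name is stored in x
p_column <- ggplot(mtcars, aes(mpg, hp, color = factor(.data[[x]]))) +
  geom_point()

# layer_data() returns the computed data for a layer, including its groups
n_constant <- length(unique(layer_data(p_constant)$group))
n_column   <- length(unique(layer_data(p_column)$group))

n_constant  # one group: all points treated alike
n_column    # one group per distinct value of cyl
```

With the constant mapping everything falls into a single group, which matches the behaviour described in the question.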

Using a minimal reprex based on mtcars:

Note: Personally, I would suggest putting your plotting code in a separate function, as I do below, instead of passing it as an anonymous function to purrr::map. That makes debugging easier and your code cleaner.

library(tidyverse)
library(ggpmisc)

my_data <- mtcars

plot_fun <- function(x) {
  ggplot(my_data, aes(
    x = mpg,
    y = hp,
    color = as_factor(.data[[x]])
  )) +
    #scale_x_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28)) +
    #scale_y_continuous(limits = c(0000, 4000)) +
    geom_point() +
    geom_smooth(
      method = lm,
      se = TRUE,
      size = 2 / 10,
      aes(group = as_factor(.data[[x]]))
    ) +
    stat_poly_eq(aes(label = paste(after_stat(eq.label),
      after_stat(rr.label),
      after_stat(p.value.label),
      sep = "*\", \"*"
    )),
    label.y = "bottom", label.x = "right"
    ) +
    labs(
      x = "Time [d]",
      y = "Resting Energy Expenditure [kcal]"
    ) +
    scale_colour_grey(
      start = 0.7,
      end = 0.3
    ) +
    theme_bw()
}

cols <- c("cyl", "am", "gear") # colnames(my_data)

#pdf(file = "Plots.pdf")
purrr::map(cols, plot_fun)
#> [[1]]
#> `geom_smooth()` using formula 'y ~ x'

#> 
#> [[2]]
#> `geom_smooth()` using formula 'y ~ x'

#> 
#> [[3]]
#> `geom_smooth()` using formula 'y ~ x'

#dev.off()
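For the PDF export asked about in the question, one possible approach is to build the list of plots first and then print each one explicitly while the pdf device is open. The following is only a sketch under assumptions: plot_by is a simplified, hypothetical stand-in for the plotting function, and the file is written to a temporary directory rather than to the working directory.

```r
library(ggplot2)

# Simplified, hypothetical plotting function: one plot per grouping column
plot_by <- function(col) {
  ggplot(mtcars, aes(mpg, hp, color = factor(.data[[col]]))) +
    geom_point() +
    geom_smooth(method = lm, formula = y ~ x, se = TRUE) +
    labs(color = col)
}

cols <- c("cyl", "am", "gear")
plots <- lapply(cols, plot_by)

# Assumed output location; replace with "Plots.pdf" to write to the working directory
out_file <- file.path(tempdir(), "Plots.pdf")

pdf(file = out_file)
for (p in plots) print(p)  # each print() call draws one page
dev.off()
```

Printing each plot explicitly makes it clear that one page is produced per column, and keeping the plots in a list allows inspecting them individually before writing the file.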
