
Call many variables in a for loop with dplyr/ggplot function

Sometimes, when performing exploratory analysis or producing reports, we want to plot univariate distributions for many variables. I could do this by faceting the plot after reshaping the data to a tidy format, but some variables are ordered factors and I want to keep them ordered in the plots.

So, to accomplish this more efficiently, I built a simple dplyr / ggplot based function. The example below uses the Arthritis dataset from the vcd package.

library(dplyr)
library(ggplot2)

data(Arthritis, package = "vcd")

head(Arthritis)

plotUniCat <- function(df, x) {
  x <- enquo(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y=prop, x=!!x)) +
    geom_bar(stat = "identity")
}

plotUniCat(Arthritis, Improved)

I can plot a formatted graph in a very short way, which is cool, but with just one variable.

I tried to call more than one variable with a for loop, but it's not working. The code runs, but nothing happens.

variables <- c("Improved", "Sex", "Treatment")

for (i in variables) {
  plotUniCat(Arthritis, noquote(i))
}

I searched for this, but it's still not clear to me. Does anyone know what I am doing wrong or how to make it work?

Thanks in advance.

You need to use rlang::sym instead of enquo to convert the strings to symbols. I also replaced the for loop with purrr::map to iterate over the variables.

library(tidyverse)

data(Arthritis, package = "vcd")

head(Arthritis)
#>   ID Treatment  Sex Age Improved
#> 1 57   Treated Male  27     Some
#> 2 46   Treated Male  29     None
#> 3 77   Treated Male  30     None
#> 4 17   Treated Male  32   Marked
#> 5 36   Treated Male  46   Marked
#> 6 23   Treated Male  58   Marked

plotUniCat2 <- function(df, x) {
  x <- rlang::sym(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y=prop, x=!!x)) +
    geom_bar(stat = "identity")
}

variables <- c("Improved", "Sex", "Treatment")

variables %>% purrr::map(., ~ plotUniCat2(Arthritis, .x))
#> [[1]]

#> 
#> [[2]]

#> 
#> [[3]]

Created on 2018-06-13 by the reprex package (v0.2.0).
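
To see why a string passed through enquo does not behave like a column name, it can help to inspect what each function returns. This is a minimal sketch, assuming the rlang package is installed; show_enquo and show_sym are hypothetical helper names used only for illustration, not part of the original answer.

```r
library(rlang)

# Hypothetical helpers, for illustration only
show_enquo <- function(x) quo_get_expr(enquo(x))  # the expression the caller typed
show_sym   <- function(x) sym(x)                  # the string's value as a symbol

i <- "Improved"

show_enquo(noquote(i))  # the call noquote(i), not a column name
show_sym(i)             # the symbol Improved
```

enquo captures the code written at the call site, so inside a loop it sees noquote(i) rather than the value of i. sym works on the value of the string, which is why it suits a vector of column names.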

Change enquo in the function to sym, which converts the variable name string to a symbol. That is,

plotUniCat <- function(df, x) {
  x <- sym(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y=prop, x=!!x)) +
    geom_bar(stat = "identity")
}

or, more concisely, if raw counts rather than proportions are acceptable,

plotUniCat <- function(df, x) {
  x <- sym(x)
  df %>%
    filter(!is.na(!!x)) %>%
    ggplot(aes(x = as.factor(!!x))) +
    # geom_bar() counts rows per level by default (stat = "count")
    geom_bar()
}

and then

out <- lapply(variables, function(i) plotUniCat(Arthritis,i))

Finally, use grid.arrange to display the plots. For example:

library(gridExtra)
do.call(grid.arrange, c(out, ncol = 2))


I guess the OP would like plotUniCat to work with both quoted and unquoted variable names. If we change the function to use sym, plotUniCat(Arthritis, Improved) no longer works.

Therefore, instead of changing the function, we can change how we call plotUniCat:

for (i in variables) {
    plotUniCat(Arthritis, !!rlang::sym(i))
}

However, a for loop does not auto-print the value of each iteration, so the plots are created but never displayed. Wrap the call in print to force the display, or use lapply to collect the generated plots:

lapply(variables, function(i) plotUniCat(Arthritis, !!rlang::sym(i)))
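
The print variant of the loop could look like the sketch below. It reuses plotUniCat exactly as defined in the question and assumes the dplyr, ggplot2, rlang and vcd packages are installed; the intermediate name p is only for illustration.

```r
library(dplyr)
library(ggplot2)

data(Arthritis, package = "vcd")  # assumes the vcd package is installed

# plotUniCat as defined in the question (expects a bare column name)
plotUniCat <- function(df, x) {
  x <- enquo(x)
  df %>%
    filter(!is.na(!!x)) %>%
    count(!!x) %>%
    mutate(prop = prop.table(n)) %>%
    ggplot(aes(y = prop, x = !!x)) +
    geom_bar(stat = "identity")
}

variables <- c("Improved", "Sex", "Treatment")

for (i in variables) {
  # Turn the string into a symbol and unquote it at the call site
  p <- plotUniCat(Arthritis, !!rlang::sym(i))
  print(p)  # ggplot objects are drawn only when printed
}
```

This keeps the original function usable with bare names, e.g. plotUniCat(Arthritis, Improved), while still allowing a loop over a character vector.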
