How do I use a dynamically declared variable in R ggplot when using count() and factor() functions?

I would like to plot some relative frequency data using ggplot in a more efficient manner.

I have many variables of interest and want to plot a separate bar chart for each. The following is my current code for one variable of interest, Gender:

chart.gender <- data %>% 
     count(Gender = factor(Gender)) %>% 
     mutate(Gender = fct_reorder(Gender,desc(n))) %>% 
     mutate(pct = prop.table(n)) %>% 
     ggplot(aes(x=Gender, y=n, fill=Gender)) +
            geom_col()

This works, but the variable Gender is repeated many times. Since I need to repeat plots for many variables of interest (Gender, Age, Location, etc.) with similar code, I would like to simplify this by declaring the variable of interest once at the top and using that declared variable for the rest of the code. Intuitively, something like:

var <- "Gender"
chart.gender <- data %>% 
     count(var = factor(var)) %>% 
     mutate(var = fct_reorder(var,desc(n))) %>% 
     mutate(pct = prop.table(n)) %>% 
     ggplot(aes(x=var, y=n, fill=var)) +
            geom_col()

This does not produce a plot of the three-level factor counts of Gender, but merely a single column named 'Gender'. I believe I see why it's not working, but I do not know the solution: I want R to retrieve the variable name I stored in var, and then use that name to retrieve the data for that variable in 'data'.
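A minimal sketch of what happens here, assuming the reproducible data from this post: inside count(), var is not a column of data, so it is looked up as the ordinary string "Gender", and factor("Gender") is a one-level factor recycled over every row. The names bad and good are only illustrative and do not come from the original code.

```r
library(tidyverse)

set.seed(1)
data <- data.frame(
  Gender = sample(c("Male", "Female", "Prefer not to say"), 20, replace = TRUE)
)

var <- "Gender"

# `var` is not a column, so this counts the literal string "Gender":
# the result is one row with var = "Gender" and n = 20.
bad <- data %>% count(var = factor(var))

# Looking the column up by name counts the actual Gender values instead.
good <- data %>% count(var = factor(.data[[var]]))

print(bad)
print(good)
```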

With some research I've found suggestions like using as.name(var), but there seems to be (at the least) a problem with declaring the variable var as a factor within the count() function.

Some reproducible data:

library(tidyverse)
library(ggplot2)

set.seed(1)
data <- data.frame(sample(c("Male", "Female", "Prefer not to say"),20,replace=TRUE))
colnames(data) <- c("Gender")

I'm using the following packages in R: tidyverse, ggplot2

Use the .data pronoun to subset the column whose name is stored in var.

library(tidyverse)

var <- "Gender"
data %>% 
  count(var = factor(.data[[var]])) %>% 
  mutate(var = fct_reorder(var,desc(n))) %>% 
  mutate(pct = prop.table(n)) %>% 
  ggplot(aes(x=var, y=n, fill=var)) +
  geom_col()
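Since the goal is to repeat the plot for many variables, the same .data[[var]] lookup can be wrapped in a function and applied to a vector of column names. This is only a sketch: plot_counts is a hypothetical helper, the Location column is made-up sample data, and the counted column is called value here so that labs() can put the real variable name on the axis and legend.

```r
library(tidyverse)

# Hypothetical helper (not from the original post): one bar chart of counts
# for the column whose name is passed as a string in `var`.
plot_counts <- function(data, var) {
  data %>%
    count(value = factor(.data[[var]])) %>%
    mutate(value = fct_reorder(value, desc(n)),   # most frequent level first
           pct = prop.table(n)) %>%
    ggplot(aes(x = value, y = n, fill = value)) +
    geom_col() +
    labs(x = var, fill = var)                     # label with the real name
}

set.seed(1)
data <- data.frame(
  Gender   = sample(c("Male", "Female", "Prefer not to say"), 20, replace = TRUE),
  Location = sample(c("North", "South"), 20, replace = TRUE)  # assumed column
)

# One chart per variable of interest, stored in a named list.
vars <- c("Gender", "Location")
charts <- lapply(vars, function(v) plot_counts(data, v))
names(charts) <- vars

charts$Gender
```

Each element of charts is an ordinary ggplot object, so it can be printed, saved with ggsave(), or extended with further layers.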

Another way is to use sym() and !!:

data %>% 
  count(var = factor(!!sym(var))) %>% 
  mutate(var = fct_reorder(var,desc(n))) %>% 
  mutate(pct = prop.table(n)) %>% 
  ggplot(aes(x=var, y=n, fill=var)) +
  geom_col()

If you use as.name() when you set the variable initially, you can use !! ("bang-bang") to unquote the variable for the count() step.

var <- as.name("Gender")

chart.gender <- data %>% 
     count(var = factor(!! var)) %>% 
     mutate(var = fct_reorder(var,desc(n))) %>% 
     mutate(pct = prop.table(n)) %>% 
     ggplot(aes(x=var, y=n, fill=var)) +
     geom_col()
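As a quick check, as.name() and rlang's sym() both turn the string into the same symbol, so unquoting either with !! should give the same counts as the .data[[var]] lookup. The sketch below assumes the reproducible data from the question; the names var_chr, var_sym, counts_sym and counts_data are illustrative only.

```r
library(tidyverse)

set.seed(1)
data <- data.frame(
  Gender = sample(c("Male", "Female", "Prefer not to say"), 20, replace = TRUE)
)

var_chr <- "Gender"
var_sym <- as.name(var_chr)

# Both constructors produce the symbol `Gender`.
same_symbol <- identical(var_sym, rlang::sym(var_chr))

# Counting via the unquoted symbol and via the .data pronoun.
counts_sym  <- data %>% count(var = factor(!!var_sym))
counts_data <- data %>% count(var = factor(.data[[var_chr]]))

same_counts <- isTRUE(all.equal(counts_sym, counts_data))

print(same_symbol)
print(same_counts)
```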
