Error when creating dataframe using dplyr in loop

I have a data frame with all numeric variables and one date variable. For each variable VARIABLE I want to create a dataframe using the following dplyr code:

avg_price = full_data_noNO %>% 
group_by(Month, Country) %>%
dplyr::summarize(avg = mean(VARIABLE, na.rm = TRUE))

This works fine if I hard-code the column name in place of VARIABLE, but when I do it in a loop I get the warning In mean.default(data.matrix(VARIABLE), na.rm = TRUE) : argument is not numeric or logical: returning NA . As a result, the avg column in my avg_price data frame contains only NA values. Does anyone know how to solve this problem?

Update: I currently have a function:

make_plots_expl_vars <- function (VARIABLE, full_data_noNO) {
   avg_price = full_data_noNO %>% 
   group_by(Month, Country) %>%
   dplyr::summarize(avg = mean(VARIABLE, na.rm = TRUE))
   return(avg_price)
}

I call it with, for example, make_plots_expl_vars("price", full_data_noNO) . I want to call this function for every variable in my data frame using a loop, and I already know how to write that loop.

You can use either rlang::sym or rlang::enquo .
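For context, the looped version most likely fails because VARIABLE holds a character string such as "price", so mean() is applied to the string itself rather than to the column of that name. Below is a minimal sketch of that behaviour; the toy data frame and its values are made up for illustration, and only the column names follow the question.

```r
library(dplyr)

# Made-up data; only the column names follow the question.
toy <- data.frame(
  Month   = c("2020-01", "2020-01", "2020-02"),
  Country = c("NL", "NL", "NL"),
  price   = c(10, 20, 30)
)

VARIABLE <- "price"

# mean() receives the string "price", not the column, so it returns NA
# (the warning is suppressed here only to keep the output quiet).
string_mean <- suppressWarnings(mean(VARIABLE, na.rm = TRUE))

# Inside summarize() the same thing happens once per group.
broken <- suppressWarnings(
  toy %>%
    group_by(Month, Country) %>%
    summarize(avg = mean(VARIABLE, na.rm = TRUE)) %>%
    ungroup()
)
```

Both string_mean and every value in broken$avg are NA, which matches the symptom described in the question.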

With rlang::sym :

make_plots_expl_vars <- function (VARIABLE, data=full_data_noNO) {
  xx = sym(VARIABLE)
  avg_price = data %>% 
    group_by(Month, Country) %>%
    dplyr::summarize(avg = mean(!!xx, na.rm = TRUE))
  return(avg_price)
}
make_plots_expl_vars("price", full_data_noNO)
make_plots_expl_vars("price") # the second argument can be omitted because data defaults to full_data_noNO

With rlang::enquo :

make_plots_expl_vars <- function (VARIABLE, data=full_data_noNO) {
  xx = enquo(VARIABLE)
  avg_price = data %>% 
    group_by(Month, Country) %>%
    dplyr::summarize(avg = mean(!!xx, na.rm = TRUE))
  return(avg_price)
}
make_plots_expl_vars(price, full_data_noNO)

The difference is that with sym you pass the column name as a quoted string, while with enquo you pass it as a bare, unquoted name. In both cases the captured variable is then unquoted inside dplyr functions with the !! operator. If you want more information, you can take a look at the quasiquotation documentation or at the "Programming with dplyr" vignette.
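The same two calling styles can also be written without an explicit sym or enquo step, using the .data pronoun for string names and the {{ }} (embrace) operator for bare names. This is a sketch rather than part of the original answer: it assumes a reasonably recent dplyr and rlang (0.4.0 or later for {{ }}), and the function names avg_by_string and avg_by_bare and the toy data are hypothetical.

```r
library(dplyr)

# Made-up data; only the column names follow the question.
toy <- data.frame(
  Month   = c("2020-01", "2020-01", "2020-02", "2020-02"),
  Country = c("NL", "NL", "NL", "NL"),
  price   = c(10, 20, 30, NA)
)

# Column name passed as a string and looked up through the .data pronoun.
avg_by_string <- function(VARIABLE, data) {
  data %>%
    group_by(Month, Country) %>%
    summarize(avg = mean(.data[[VARIABLE]], na.rm = TRUE)) %>%
    ungroup()
}

# Column name passed bare and forwarded with the embrace operator.
avg_by_bare <- function(VARIABLE, data) {
  data %>%
    group_by(Month, Country) %>%
    summarize(avg = mean({{ VARIABLE }}, na.rm = TRUE)) %>%
    ungroup()
}

res_string <- avg_by_string("price", toy)
res_bare   <- avg_by_bare(price, toy)
```

Both calls should give the same per-group averages; which form to prefer mostly depends on whether the column names arrive as strings (as in a loop over names) or are typed interactively.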

Note that dplyr re-exports sym and enquo from rlang, so loading dplyr is enough and you don't have to attach rlang separately.
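Since the goal in the question is to run this for every variable, here is a sketch of how the string-based version could be driven from a loop. The toy data frame, the volume column, and the numeric_vars and results names are assumptions made for illustration; the function body follows the sym version above, with ungroup() added so each result is a plain table.

```r
library(dplyr)

# Made-up data; "volume" is a hypothetical second numeric column.
toy <- data.frame(
  Month   = c("2020-01", "2020-01", "2020-02", "2020-02"),
  Country = c("NL", "NL", "NL", "NL"),
  price   = c(10, 20, 30, 40),
  volume  = c(1, 3, 5, 7)
)

make_plots_expl_vars <- function(VARIABLE, data) {
  xx <- sym(VARIABLE)  # turn the string into a symbol
  data %>%
    group_by(Month, Country) %>%
    summarize(avg = mean(!!xx, na.rm = TRUE)) %>%
    ungroup()
}

# Names of the numeric columns; the grouping columns are not numeric here.
numeric_vars <- names(toy)[sapply(toy, is.numeric)]

# One summary data frame per variable, stored in a named list.
results <- list()
for (v in numeric_vars) {
  results[[v]] <- make_plots_expl_vars(v, toy)
}
```

Each element of results can then be accessed by variable name, for example results[["price"]].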
