
dplyr: apply sequential functions to variables without creating new variables in a single mutate(across(...))

tl;dr: is it possible to use dplyr syntax to apply more than one function, one after the other, to a selection of variables in a single call to mutate(across(...)), without creating extra variables?

By way of example, say we want to apply mean and then factor to mpg and cyl. We can do this by repeating ourselves:

library(dplyr)

# desired output (but we repeat ourselves)
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            mean
        )
    ) %>%
    mutate(
        across(c('mpg', 'cyl'),
            factor
        )
    )

I want to avoid repeating the mutate(across(...)) selection.

According to the reference for across, we can supply multiple functions or purrr-style lambdas in a list. However, I can't figure out how to mutate in place (overwrite the variable) rather than create new variables.

With the default arguments, applying a single function at a time overwrites the selected variables and does not create new ones:

# single mean function mutates in place
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            ~mean(.)    
        )
    )

# single factor function mutates in place
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            ~factor(.)    
        )
    ) %>%
    glimpse()

But passing in a list creates new variables:

# this creates new vars
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            .fns = list(
                mean, factor
            )    
        )
    )

# as does this
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            .fns = list(
                ~mean(.), ~factor(.)
            )    
        )
    )
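It may help to look at which columns the list form actually adds. Each function in the list is applied to the original column independently, not to the output of the previous function, and with the default naming an unnamed list is numbered by position. A small sketch to inspect this (the list names avg and fct are arbitrary labels chosen for illustration):

```r
library(dplyr)

# unnamed list: new columns are suffixed with the position of each function
unnamed <- mtcars %>%
    mutate(across(c('mpg', 'cyl'), .fns = list(mean, factor)))
new_unnamed <- setdiff(names(unnamed), names(mtcars))

# named list: new columns are suffixed with the list names instead
named <- mtcars %>%
    mutate(across(c('mpg', 'cyl'), .fns = list(avg = mean, fct = factor)))
new_named <- setdiff(names(named), names(mtcars))

new_unnamed
new_named
```

In both cases the original mpg and cyl are left untouched, and factor is applied to the raw values rather than to the mean.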

I've tried to specify the variable names directly with .names, but this does not work:

# trying to specify that we want to preserve
# the original names with {col} leads to a
# duplicated names error
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            .fns = list(
                mean, factor
            ),
            .names = "{col}"
        )
    )

# the same occurs with purrr-style lambda syntax
mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            .fns = list(
                ~mean(.), ~factor(.)
            ),
            .names = "{col}"
        )
    )
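The error is consistent with how a list of functions is handled: every function produces its own output column, so a name template that contains only the column name asks across() for two columns called mpg and two called cyl. A minimal sketch that catches the error rather than stopping the script (it uses "{.col}", which is the placeholder spelling in the current dplyr documentation; the exact error message is not assumed here):

```r
library(dplyr)

attempt <- tryCatch(
    mtcars %>%
        mutate(
            across(c('mpg', 'cyl'),
                .fns = list(mean, factor),
                .names = "{.col}"
            )
        ),
    # return the condition object instead of aborting
    error = function(e) e
)

# TRUE when the call above failed
inherits(attempt, "error")

# the message reported by dplyr for the failed call
conditionMessage(attempt)
```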

Is this possible in a single mutate(across(...)) call?

So you want to first take the mean of those variables and then turn the result into a factor?

This can be achieved by nesting the two calls in a single lambda:

library(dplyr)

mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            ~factor(mean(.))
        )
    )
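If more than two functions need to be chained, the same idea might be generalized by folding a list of functions into one call before handing it to across(). This is only a sketch: apply_in_order is a hypothetical helper written for this example, not part of dplyr, and it relies on base R's Reduce().

```r
library(dplyr)

# Hypothetical helper: applies each function in `fns` to `x`, left to right,
# feeding the output of one function into the next
apply_in_order <- function(x, fns) {
    Reduce(function(acc, f) f(acc), fns, x)
}

result <- mtcars %>%
    mutate(
        across(c('mpg', 'cyl'),
            ~apply_in_order(.x, list(mean, factor))
        )
    )

# mpg and cyl are overwritten; no extra columns appear
names(result)
```

Because across() receives a single lambda, the default naming applies and the selected columns are overwritten in place.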
