I'm using mutate_if to modify columns of some data frames in my workspace. When using plain mutate, I can create variables based on ones created earlier in the same call, e.g.:
x %>%
  mutate(new = column_a * 2,
         new_2 = new * 2)
But this approach doesn't work with mutate_if, so I have to use a kind of 'recursive method' that builds each variable from scratch, e.g.:
mutate_if(!str_detect(names(.), 'date|PIB|Deflator|[$]'),
          .funs = list(Real = ~ . / Deflator,
                       Real_YoY = ~ (((. / Deflator) / lag((. / Deflator), 12))-1) * 100))
The code I would like to write looks like this:
mutate_if(!str_detect(names(.), 'date|PIB|Deflator|[$]'),
          .funs = list(Real = ~ . / Deflator,
                       Real_YoY = ~ ((Real / lag(Real, 12))-1) * 100))
Is there some way to organize the code to get close to this? Thank you!
Reproducible example:
x <- data.frame(x = seq(1,10),
                x1 = seq(21,30),
                y = seq(10,19))
x %>% mutate_if(str_detect(colnames(.), 'x'),
                .funs = list(new = ~ (. * 2),
                             new2 = ~ (. * 2) * 4)) # where (. * 2) could make reference to the variable 'new'
Instead of a list, return a tibble, which can also refer to a previously created column by name, and then unnest the tibble columns:
library(dplyr)
library(tidyr)
x %>%
  mutate(across(starts_with('x'),
                ~ tibble(`1` = (.x * 2),
                         `2` = `1` * 4), .names = "{.col}_new")) %>%
  unnest(where(is_tibble), names_sep = "")
-output
# A tibble: 10 × 7
x x1 y x_new1 x_new2 x1_new1 x1_new2
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 1 21 10 2 8 42 168
2 2 22 11 4 16 44 176
3 3 23 12 6 24 46 184
4 4 24 13 8 32 48 192
5 5 25 14 10 40 50 200
6 6 26 15 12 48 52 208
7 7 27 16 14 56 54 216
8 8 28 17 16 64 56 224
9 9 29 18 18 72 58 232
10 10 30 19 20 80 60 240
Or use mutate after converting to a tibble:
x %>%
  transmute(across(starts_with('x'), ~ tibble(new1 = .x * 2) %>%
                     mutate(new2 = new1 * 4))) %>%
  unnest(where(is_tibble), names_sep = "_") %>%
  bind_cols(x, .)
-output
x x1 y x_new1 x_new2 x1_new1 x1_new2
1 1 21 10 2 8 42 168
2 2 22 11 4 16 44 176
3 3 23 12 6 24 46 184
4 4 24 13 8 32 48 192
5 5 25 14 10 40 50 200
6 6 26 15 12 48 52 208
7 7 27 16 14 56 54 216
8 8 28 17 16 64 56 224
9 9 29 18 18 72 58 232
10 10 30 19 20 80 60 240
Or wrap the multiple statements in a {} block:
x %>%
  mutate(across(starts_with('x'), ~
    {
      new <- .x * 2
      new2 <- new * 4
      tibble(new, new2)
    }, .names = "{.col}_")) %>%
  unnest(where(is_tibble), names_sep = "")
-output
# A tibble: 10 × 7
x x1 y x_new x_new2 x1_new x1_new2
<int> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 1 21 10 2 8 42 168
2 2 22 11 4 16 44 176
3 3 23 12 6 24 46 184
4 4 24 13 8 32 48 192
5 5 25 14 10 40 50 200
6 6 26 15 12 48 52 208
7 7 27 16 14 56 54 216
8 8 28 17 16 64 56 224
9 9 29 18 18 72 58 232
10 10 30 19 20 80 60 240
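Applied to the Real / Real_YoY case from the question, the same tibble pattern might look like the sketch below. The data frame df and its sales column are hypothetical stand-ins for the real data, and where(is.data.frame) is used in place of where(is_tibble) only so the example depends on nothing beyond dplyr, tidyr and base R.

```r
library(dplyr)
library(tidyr)

# Hypothetical monthly data: 13 rows, so a 12-month lag yields one non-NA value
df <- data.frame(
  date     = seq(as.Date("2020-01-01"), by = "month", length.out = 13),
  Deflator = rep(2, 13),
  sales    = c(rep(100, 12), 110)
)

res <- df %>%
  mutate(across(!matches("date|Deflator"),
                # Real is defined first, so Real_YoY can refer to it by name
                ~ tibble(Real     = .x / Deflator,
                         Real_YoY = (Real / dplyr::lag(Real, 12) - 1) * 100),
                .names = "{.col}_")) %>%
  # the new columns are data frame columns; unnest flattens them
  unnest(where(is.data.frame), names_sep = "")

names(res)
```

With this toy input the new columns should be named sales_Real and sales_Real_YoY, and only the last row has a non-missing year-over-year value.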
You need to do this in two mutate calls, because across is not aware of columns created within the same call. For example, even if you refer to a specific column that you know will be created, this causes an error:
x %>%
  mutate(across(
    .cols = contains('x'),
    .fns = list(
      new = ~(.x*2),
      new2 = x_new
    )
  ))
#> Error in `mutate()`:
#> ! Problem while computing `..1 = across(.cols = contains("x"), .fns =
#> list(new = ~(.x * 2), new2 = x_new))`.
#> Caused by error:
#> ! object 'x_new' not found
The second issue is making sure each call refers to the appropriate *_new column. This can be done by using cur_column() to build a symbol, which is then evaluated in the context of the data frame:
x %>%
  mutate(across(
    .cols = contains('x'),
    .fns = list(
      new = ~(.x*2)
    )
  )) %>%
  mutate(across(
    .cols = matches("x[[:digit:]]?$"),
    .fns = list(
      new2 = ~eval(as.symbol(paste0(cur_column(), "_new"))) * 4
    )
  ))
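A possible variation on the two-call idea, offered only as a sketch and not as the method above: in the second call, select the freshly created *_new columns directly and build the new names through .names, which avoids constructing and evaluating a symbol. The name res is just illustrative.

```r
library(dplyr)

x <- data.frame(x = seq(1, 10),
                x1 = seq(21, 30),
                y = seq(10, 19))

res <- x %>%
  # first call: create x_new and x1_new
  mutate(across(contains("x"), ~ .x * 2, .names = "{.col}_new")) %>%
  # second call: operate on the *_new columns themselves;
  # "{.col}2" turns x_new into x_new2 and x1_new into x1_new2
  mutate(across(ends_with("_new"), ~ .x * 4, .names = "{.col}2"))

names(res)
```

This only fits when the second variable depends solely on the first derived column; if it also needs the original column, the cur_column() approach is the more general one.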