I'm trying to recode a large number of variables with 5 levels ("1_Disagree", "2_SomeD", "3_Neither", "4_SomeA", "5_Agree") into variables with 3 levels ("1_Disagree", "2_Neither", "3_Agree"). All these variables have similar names, so I'm using the across function from dplyr. Here's an example:
> df <- tibble(Q1_cat5 = as.factor(c("1_Disagree","2_SomeD","2_SomeD","4_SomeA","5_Agree")),
Q2_cat5 = as.factor(c("5_Agree","5_Agree","3_Neither","4_SomeA","5_Agree")),
Q3_cat5 = as.factor(c("3_Neither","2_SomeD","2_SomeD","1_Disagree","5_Agree")))
> df
# A tibble: 5 × 3
Q1_cat5 Q2_cat5 Q3_cat5
<fct> <fct> <fct>
1 1_Disagree 5_Agree 3_Neither
2 2_SomeD 5_Agree 2_SomeD
3 2_SomeD 3_Neither 2_SomeD
4 4_SomeA 4_SomeA 1_Disagree
5 5_Agree 5_Agree 5_Agree
What I'm trying to obtain:
> df2
# A tibble: 5 × 6
Q1_cat5 Q2_cat5 Q3_cat5 Q1_cat3 Q2_cat3 Q3_cat3
<fct> <fct> <fct> <fct> <fct> <fct>
1 1_Disagree 5_Agree 3_Neither 1_Disagree 3_Agree 2_Neither
2 2_SomeD 5_Agree 2_SomeD 1_Disagree 3_Agree 1_Disagree
3 2_SomeD 3_Neither 2_SomeD 1_Disagree 2_Neither 1_Disagree
4 4_SomeA 4_SomeA 1_Disagree 3_Agree 3_Agree 1_Disagree
5 5_Agree 5_Agree 5_Agree 3_Agree 3_Agree 3_Agree
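As you can see, the new variables work as follows: "1_Disagree" and "2_SomeD" become "1_Disagree", "3_Neither" becomes "2_Neither", and "4_SomeA" and "5_Agree" become "3_Agree".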
I've tried the following code:
df2 <- df %>% mutate(across(.cols = starts_with('Q') & ends_with('cat5'),
.funs = case_when(
(. == "1_Disagree" | . == "2_SomeD") ~ '1_Disagree',
. == "3_Neither" ~ '2_Neither',
(. == "4_SomeA" |. == "5_Agree") ~ '3_Agree',
is.na(.) ~ NA,
),
.names = '{str_sub(.col,1,-5)}cat3'
)
)
This does create new variables Q1_cat3, Q2_cat3, etc., but they keep the old values of Q1_cat5, Q2_cat5, etc. So instead of what I want, it duplicates the old variables and just renames them:
> df2
# A tibble: 5 × 6
Q1_cat5 Q2_cat5 Q3_cat5 Q1_cat3 Q2_cat3 Q3_cat3
<fct> <fct> <fct> <fct> <fct> <fct>
1 1_Disagree 5_Agree 3_Neither 1_Disagree 5_Agree 3_Neither
2 2_SomeD 5_Agree 2_SomeD 2_SomeD 5_Agree 2_SomeD
3 2_SomeD 3_Neither 2_SomeD 2_SomeD 3_Neither 2_SomeD
4 4_SomeA 4_SomeA 1_Disagree 4_SomeA 4_SomeA 1_Disagree
5 5_Agree 5_Agree 5_Agree 5_Agree 5_Agree 5_Agree
Even after doing a lot of research and trying several other solutions, I can't figure out why this isn't working, nor can I find another way to do what I want. I've read other posts about "case_when" with "across", but none of the solutions work for me. Could you help me?
Firstly, across has an argument .fns, not .funs. However, the main issue is that you're trying to pass a lambda function without the necessary operator, such as the tilde (~) in the tidyverse. Try with:
library(dplyr)
library(stringr) # needed for str_sub() in .names

df2 <- df %>%
  mutate(
    across(.cols = starts_with('Q') & ends_with('cat5'),
           # The tilde turns the case_when() call into a lambda applied to each column
           .fns = ~ case_when(
             (. == "1_Disagree" | . == "2_SomeD") ~ '1_Disagree',
             . == "3_Neither" ~ '2_Neither',
             (. == "4_SomeA" | . == "5_Agree") ~ '3_Agree',
             is.na(.) ~ NA_character_ # You can skip this part though
           ),
           .names = '{str_sub(.col,1,-5)}cat3')
  )
Output:
df2
# A tibble: 5 x 6
Q1_cat5 Q2_cat5 Q3_cat5 Q1_cat3 Q2_cat3 Q3_cat3
<fct> <fct> <fct> <chr> <chr> <chr>
1 1_Disagree 5_Agree 3_Neither 1_Disagree 3_Agree 2_Neither
2 2_SomeD 5_Agree 2_SomeD 1_Disagree 3_Agree 1_Disagree
3 2_SomeD 3_Neither 2_SomeD 1_Disagree 2_Neither 1_Disagree
4 4_SomeA 4_SomeA 1_Disagree 3_Agree 3_Agree 1_Disagree
5 5_Agree 5_Agree 5_Agree 3_Agree 3_Agree 3_Agree
As you can see, instead of a plain NA you'll need to specify NA_character_, as all values need to be of the same type, including NA. I am not sure about your use case though; normally you could skip the last step, as anything not matching the previous rules will be NA anyhow.
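Note that the output above has <chr> columns, while the desired result in the question shows <fct>. If factors are needed, one possible approach is to wrap the case_when() call in factor(). The following is only a sketch under that assumption: recode_cat3 is a hypothetical helper name, not something from the original post, and it is passed to across as a plain named function instead of a tilde lambda.

library(dplyr)
library(stringr)

df <- tibble(
  Q1_cat5 = as.factor(c("1_Disagree", "2_SomeD", "2_SomeD", "4_SomeA", "5_Agree")),
  Q2_cat5 = as.factor(c("5_Agree", "5_Agree", "3_Neither", "4_SomeA", "5_Agree")),
  Q3_cat5 = as.factor(c("3_Neither", "2_SomeD", "2_SomeD", "1_Disagree", "5_Agree"))
)

# Hypothetical helper (assumption, not from the original post):
# collapse the 5 levels into 3 and return a factor with a fixed level order.
recode_cat3 <- function(x) {
  factor(
    case_when(
      x %in% c("1_Disagree", "2_SomeD") ~ "1_Disagree",
      x == "3_Neither"                  ~ "2_Neither",
      x %in% c("4_SomeA", "5_Agree")    ~ "3_Agree"
      # anything else, including NA, falls through to NA
    ),
    levels = c("1_Disagree", "2_Neither", "3_Agree")
  )
}

df2 <- df %>%
  mutate(across(
    .cols  = starts_with("Q") & ends_with("cat5"),
    .fns   = recode_cat3,                    # a named function needs no tilde
    .names = "{str_sub(.col, 1, -5)}cat3"
  ))

Setting levels explicitly means every new column gets the same three levels, even when one of them (for example "2_Neither" in Q1) never occurs in the data.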