Using dplyr mutate_at to change specified list of variables with case_when statement
I'm trying to recode some columns in a data set. The columns have a lot of weird names like S3__8 or C4__2. There are also some categorical columns that start with C, like Case, which I want to leave alone.
I used this segment to successfully recode all of the S columns:
Sa_Recode <- Sa %>%
  mutate_at(vars(starts_with("S")),
            funs(case_when(grepl("Yes", ., ignore.case = TRUE) ~ "1",
                           grepl("No", ., ignore.case = TRUE) ~ "0",
                           grepl("Some", ., ignore.case = TRUE) ~ "0.5",
                           TRUE ~ "Else")))
I want to recode the C columns, but can't use the same logic because some of my other columns start with C. I've tried editing the mutate line in the following ways, with no luck:
Creating a list of the columns I need
list <- c('C1_(*)__', 'C2_4__', 'C3_(*)__', 'C3a_(*)__')
mutate_at(vars(list),
Listing them as variables
mutate_at(c('C1_(*)__', 'C2_4__', 'C3_(*)__', 'C3a_(*)__'),
Listing them differently as variables
mutate_at(vars(c('C1_(*)__', 'C2_4__', 'C3_(*)__', 'C3a_(*)__')),
Calling a range of columns
mutate_at(Sa[,8:53],
I'll be repeating this process with about nine other sets (with different starting letters) and am hoping to learn how to manipulate the logic. Alternatively, is there a way to make the "else" in the case statement leave the value as it is instead of recoding it? That could also fix the issue. Thanks!
Sample Input:
Case  S25_  S26_(*)__  C1_(*)__
A     No    Some       Yes
B     Yes   Skipped    Yes
C     No    N/A        Some
Desired output:
Case  S25_  S26_(*)__  C1_(*)__
A     0     0.5        1
B     1     Skipped    1
C     0     N/A        0.5
You can use regular expressions to correctly identify columns that you want to change.
library(dplyr)
Sa %>%
  mutate_at(vars(matches('^S|C\\d+')),
            ~case_when(grepl("Yes", ., ignore.case = TRUE) ~ "1",
                       grepl("No", ., ignore.case = TRUE) ~ "0",
                       grepl("Some", ., ignore.case = TRUE) ~ "0.5",
                       TRUE ~ "Else"))
This selects columns whose names start with "S" or contain "C" followed by a number. A column such as Case is left alone because its "C" is not followed by a digit.
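To see which names a pattern would pick up before running the recode, the regular expression can be tried on the column names directly. The sketch below uses base R's grepl() with ignore.case = TRUE to mimic the case-insensitive default of matches(). The names come from the question, except "XC2_", which is a made-up name added only to show that the unanchored "C\\d+" part also matches a "C" plus digit in the middle of a name. The anchored variant "^(S|C\\d)" is one possible tightening, not something the original answer used.

```r
# Column names from the question, plus one hypothetical name ("XC2_")
cols <- c("Case", "S25_", "S26_(*)__", "C1_(*)__", "C3a_(*)__", "XC2_")

# Pattern from the answer: starts with S, OR contains C followed by digits
loose <- cols[grepl("^S|C\\d+", cols, ignore.case = TRUE)]

# Anchored variant (an assumption): starts with S, or starts with C + digit
strict <- cols[grepl("^(S|C\\d)", cols, ignore.case = TRUE)]

print(loose)
print(strict)
```

Both patterns skip Case; only the loose one also selects the hypothetical "XC2_".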
Also, mutate_at has been superseded by across (as of dplyr 1.0.0), so you can now use:
Sa %>%
  mutate(across(matches('^S|C\\d+'),
                ~case_when(grepl("Yes", ., ignore.case = TRUE) ~ "1",
                           grepl("No", ., ignore.case = TRUE) ~ "0",
                           grepl("Some", ., ignore.case = TRUE) ~ "0.5",
                           TRUE ~ "Else")))
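On the second part of the question (keeping values that match none of the conditions), one option is to have the fallback branch return the original value instead of "Else". The following is a minimal sketch on the sample input from the question, not a definitive solution. The helper name recode_yes_no and the vector c_cols are made up for illustration, and the anchored pattern "^(S|C\\d)" is an assumption. all_of() is shown as a way to pass an explicit vector of exact column names.

```r
library(dplyr)

# Toy data copied from the sample input in the question
Sa <- tibble(
  Case        = c("A", "B", "C"),
  S25_        = c("No", "Yes", "No"),
  `S26_(*)__` = c("Some", "Skipped", "N/A"),
  `C1_(*)__`  = c("Yes", "Yes", "Some")
)

# Hypothetical helper: same rules as above, but values that match
# nothing are returned unchanged instead of becoming "Else"
recode_yes_no <- function(x) {
  case_when(
    grepl("Yes",  x, ignore.case = TRUE) ~ "1",
    grepl("No",   x, ignore.case = TRUE) ~ "0",
    grepl("Some", x, ignore.case = TRUE) ~ "0.5",
    TRUE ~ as.character(x)
  )
}

# Option 1: select columns by (anchored) regular expression
Sa_Recode <- Sa %>%
  mutate(across(matches("^(S|C\\d)"), recode_yes_no))

# Option 2: select columns by an explicit vector of exact names
c_cols <- c("C1_(*)__")
Sa_C_only <- Sa %>%
  mutate(across(all_of(c_cols), recode_yes_no))

print(Sa_Recode)
```

Because the helper is defined once, the same function could be reused for the other sets of columns by changing only the selection inside across().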