Most efficient way to recode variable?

I have the following two lines of code that I would like to turn into a one-liner, preferably in base R. Any help would be appreciated.

#recode the sex variable
ibr.sub$SEX[ibr.sub$SEX == "1" | ibr.sub$SEX == "3"] <- "-1"
ibr.sub$SEX[ibr.sub$SEX == "2" | ibr.sub$SEX == "4"] <- "1"

We may use dplyr::case_when:

library(dplyr)
ibr.sub$SEX <- case_when(ibr.sub$SEX %in% c(1, 3) ~ "-1",
                         ibr.sub$SEX %in% c(2, 4) ~ "1",
                         TRUE ~ ibr.sub$SEX)

If SEX is a factor, forcats::fct_collapse can merge the levels:

library(forcats)
ibr.sub <- data.frame(SEX = factor(c("1", "2", "3", "4")))
ibr.sub
#>   SEX
#> 1   1
#> 2   2
#> 3   3
#> 4   4
ibr.sub$SEX <- forcats::fct_collapse(ibr.sub$SEX,
                                     "-1" = c("1", "3"),
                                     "1"  = c("2", "4"))
ibr.sub
#>   SEX
#> 1  -1
#> 2   1
#> 3  -1
#> 4   1

Created on 2023-01-31 by the reprex package (v2.0.1)

Using stringi::stri_replace_all_regex:

df$sex2 <- stringi::stri_replace_all_regex(df$sex, c('1|3', '2|4'), c('-1', '1'), vectorize_all=FALSE)
df
#    sex sex2
# 1    3   -1
# 2    4    1
# 3    3   -1
# 4    4    1
# 5    1   -1
# 6    1   -1
# 7    2    1
# 8    4    1
# 9    2    1
# 10   2    1
# 11   3   -1
# 12   3   -1
# 13   1   -1
# 14   1   -1
# 15   3   -1
# 16   4    1
# 17   1   -1
# 18   3   -1
# 19   1   -1
# 20   1   -1

Data:

set.seed(42)
df <- data.frame(sex=sample(1:4, 20, replace=TRUE))
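
Since the question asked for base R, here is a minimal sketch that needs no packages. It assumes SEX is a character column, as in the question; the toy ibr.sub data frame below is made up for illustration. The nested ifelse leaves any value other than "1" to "4" unchanged, which matches the behaviour of the original two lines.

```r
# Toy data for illustration only; assumes SEX is stored as character.
ibr.sub <- data.frame(SEX = c("1", "2", "3", "4"), stringsAsFactors = FALSE)

# One-liner: "1"/"3" become "-1", "2"/"4" become "1", anything else is kept.
ibr.sub$SEX <- ifelse(ibr.sub$SEX %in% c("1", "3"), "-1",
                      ifelse(ibr.sub$SEX %in% c("2", "4"), "1", ibr.sub$SEX))

ibr.sub$SEX
```

If the column is known to contain only these four codes, a named lookup vector such as unname(c("1" = "-1", "2" = "1", "3" = "-1", "4" = "1")[as.character(ibr.sub$SEX)]) is another option; note that it returns NA for any value not listed in the lookup.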
