
How can I split variable X into 2 variables depending on a character in X?

I have a variable that looks like this:

df$Code
22
34
24
12
44

How can I create a new variable in the data frame, such that a subject whose df$Code contains a "4" is labelled "Patient", while everyone else is labelled "Control", in a new df$Groups?

df$Groups
Control
Patient
Patient
Control
Patient

Thank you!

If only the last digit should be tested for a 4, endsWith or grepl can be used:

c("Control", "Patient")[1 + endsWith(as.character(df$Code), "4")]
#[1] "Control" "Patient" "Patient" "Control" "Patient"

c("Control", "Patient")[1 + grepl("4$", df$Code)]
#[1] "Control" "Patient" "Patient" "Control" "Patient"

or at any position:

c("Control", "Patient")[1 + grepl("4", df$Code)]
#[1] "Control" "Patient" "Patient" "Control" "Patient"

Data:

df <- data.frame(Code = c(22, 34, 24, 12, 44))
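The expressions above return a character vector but do not store it. As a minimal sketch of writing the result back into the data frame as df$Groups, which is what the question asks for, something like the following should work. The helper name is_patient is only illustrative. The indexing trick relies on FALSE/TRUE being treated as 0/1, so adding 1 selects the first or second label.

```r
# Example data from the question
df <- data.frame(Code = c(22, 34, 24, 12, 44))

# TRUE where the code ends in "4" (is_patient is an illustrative name)
is_patient <- endsWith(as.character(df$Code), "4")

# 1 + FALSE = 1 -> "Control", 1 + TRUE = 2 -> "Patient"
df$Groups <- c("Control", "Patient")[1 + is_patient]

print(df)
```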

Using tidyverse:

library(tidyverse)
df %>% 
        mutate(group = ifelse(str_detect(as.character(Code), "4"), "Patient", "Control"))

Output:

   Code group  
  <dbl> <chr>  
1    22 Control
2    34 Patient
3    24 Patient
4    12 Control
5    44 Patient

Note that this detects "4" whether it comes first (e.g. 42) or second (e.g. 24), as I assumed this is what you want. If only the last digit should match, use:

df %>% 
        mutate(group = ifelse(str_ends(as.character(Code), "4"), "Patient", "Control"))
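The two rules only disagree for codes that contain a 4 somewhere other than the last position, which none of the question's values do. A small sketch of the difference, using made-up codes (42 and 40 are assumptions, not part of the question's data) and base R grepl in place of the stringr helpers:

```r
# Hypothetical codes, chosen so that the two rules give different answers
code <- c(24, 42, 40, 12)

# "4" anywhere in the code (the same idea as str_detect)
any_pos <- ifelse(grepl("4", code), "Patient", "Control")

# "4" as the last character only (the same idea as str_ends)
last_pos <- ifelse(grepl("4$", code), "Patient", "Control")

print(data.frame(code, any_pos, last_pos))
```

Only 24 is a "Patient" under both rules; 42 and 40 are a "Patient" only when the 4 may appear at any position.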

Alternatively, a function such as recode() works well for this, especially if you have more than two categories.

library(tidyverse)

tibble(code = c(22, 34, 24, 12, 44)) %>% 
  mutate(
    # last digit 4 -> "Patient"; any other last digit -> "Control"
    group = recode(code %% 10, `4` = "Patient", .default = "Control")
  )

#> # A tibble: 5 x 2
#>    code group  
#>   <dbl> <chr>  
#> 1    22 Control
#> 2    34 Patient
#> 3    24 Patient
#> 4    12 Control
#> 5    44 Patient

Created on 2021-07-15 by the reprex package (v1.0.0)
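To illustrate the "more than two categories" case, here is a hedged sketch with dplyr's recode(). The third group ("Relative" for a last digit of 7) and the codes 27 and 19 are invented for the example and do not come from the question. Any last digit without an explicit mapping falls through to .default.

```r
library(dplyr)

# Hypothetical codes: 27 and 19 are not in the question's data
code <- c(22, 34, 27, 19)

group <- recode(
  code %% 10,             # last digit of each code
  `2` = "Control",
  `4` = "Patient",
  `7` = "Relative",       # assumed third category, for illustration only
  .default = "Other"      # every last digit not listed above
)

print(group)
```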

We could use grepl in combination with ifelse:

library(dplyr)
df  %>% 
  mutate(Groups = ifelse(
    grepl("4", as.character(Code)), 'Patient', 'Control'))

Output:

   Code Groups
  <dbl> <chr>  
1    22 Control
2    34 Patient
3    24 Patient
4    12 Control
5    44 Patient
