
How to group individual values from an existing variable into a new variable in R

I'm new to R and I'm stuck. I'm working on a health dataset with each row as one patient's information.

I have a variable called diag_code. It holds the patient's medical condition as a numeric diagnostic code. I want to group the individual condition codes into broader categories (heart disease, resp disease, liver disease) and store the result in a new variable.

E.g. I know that 1, 2, 3, 4, 84 are all respiratory diseases. I also know that 5, 6, 7, 32, 56 are all cardiovascular diseases. I want to create a new variable called diagnosis.

diag_code  diagnosis
1          "resp disease"
2          "resp disease"
56         "CVD disease"
3          "resp disease"
4          "resp disease"
84         "resp disease"
5          "CVD disease"
6          "CVD disease"
7          "CVD disease"
32         "CVD disease"

I have tried to use case_when() and mutate(), or ifelse() and mutate(), but they usually involve a single true or false condition.

I want to be able to do something like this (I know this is incorrect):

data <- data %>%
mutate(diagnosis = case_when(diag_code==c(1,2,3,5,84)) ~ "Resp disease",
                   case_when(diag_code==c(5,6,7,32,56)) ~ "CVD disease", 
                   TRUE ~ "Unknown)

There are two things that you need to correct to make it work:

First, use a single case_when() call with one condition ~ value pair per category, rather than a separate case_when() for each category. Second, to test whether a value matches any element of a vector, use %in% instead of ==. The corrected code looks like this:

library(dplyr)

data <- data %>%
  mutate(diagnosis = case_when(
    diag_code %in% c(1, 2, 3, 4, 84)  ~ "Resp disease",  # 4, not 5: the question lists 1,2,3,4,84 as respiratory
    diag_code %in% c(5, 6, 7, 32, 56) ~ "CVD disease",
    TRUE                              ~ "Unknown"
  ))
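
To see why %in% matters here, the following base R sketch compares the two operators on the codes from the question's example table. The names resp_codes, by_position and by_membership are made up for illustration and are not part of the original question or answer.

```r
# Codes in the order they appear in the question's example table
diag_code  <- c(1, 2, 56, 3, 4, 84, 5, 6, 7, 32)
resp_codes <- c(1, 2, 3, 4, 84)  # hypothetical name for the respiratory codes

# `==` compares position by position and recycles the shorter vector,
# so diag_code[3] (56) is compared only with resp_codes[3] (3), and so on.
by_position <- diag_code == resp_codes

# `%in%` checks, for every element of diag_code, whether it occurs
# anywhere in resp_codes, which is the test case_when() needs.
by_membership <- diag_code %in% resp_codes

print(by_position)
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
print(by_membership)
# [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
```

With ==, only the first two rows match, and only because they happen to line up by position. With %in%, all five respiratory codes are flagged regardless of row order.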


 