
R: Recode 2 continuous variables into 1 categorical variable

I have a data set with one variable for systolic blood pressure and one for diastolic blood pressure. I want to create a single categorical variable for blood pressure level. This requires combining ranges of values from both variables, which is proving difficult.

       ID   Systolic Diastolic
       1      130     80
       2      118     76
       3      120     80
       4      115     74
       5      184     107
       6      114     69
       7       95     72

This is the closest I've gotten, but I don't believe I'm on the right path. Can someone point me in the right direction?

  df$BPLevel[Systolic < 120 | Diastolic < 80] <- "Normal"
  df$BPLevel[120 < Systolic < 139 | 80 < Diastolic < 89] <- "Prehypertension"
  df$BPLevel[Systolic >= 140 | Diastolic >= 90] <- "Hypertension"
  df$BPLevel[Systolic == "." | Diastolic == "."] <- "Missing"

In situations like this, my first attempt is usually dplyr's case_when() function.

library(dplyr)

df <- data.frame(ID = 1:7,
                 Systolic = c(130, 118, 120, 115, 184, 114, 95),
                 Diastolic = c(80, 76, 80, 74, 107, 69, 72))

# case_when() checks the conditions in order; the first TRUE one sets the label.
df <- df %>%
  mutate(BPLevel = case_when(
    Systolic < 120 | Diastolic < 80 ~ "Normal",
    between(Systolic, 120, 139) | between(Diastolic, 80, 89) ~ "Prehypertension",
    Systolic >= 140 | Diastolic >= 90 ~ "Hypertension",
    TRUE ~ "Missing"
  ))
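
If you would rather stay in base R, here is a minimal sketch of the same logic. The main change from your attempt is that a chained comparison such as 120 < Systolic < 139 is not valid R, so each bound is tested separately and joined with &, and the columns are referenced through df$. The inclusive bounds (120 to 139, 80 to 89) are an assumption chosen to match between() above, and in_pre_range is just an illustrative helper name.

```r
# Same sample data as in the question.
df <- data.frame(ID = 1:7,
                 Systolic = c(130, 118, 120, 115, 184, 114, 95),
                 Diastolic = c(80, 76, 80, 74, 107, 69, 72))

# Test each bound separately and combine with &.
# Inclusive bounds are an assumption, chosen to mirror dplyr::between().
in_pre_range <- (df$Systolic >= 120 & df$Systolic <= 139) |
                (df$Diastolic >= 80 & df$Diastolic <= 89)

# Later assignments overwrite earlier ones, so assign in reverse priority
# to reproduce the first-match order of the case_when() call.
# which() drops NA comparisons, so rows with missing values stay "Missing".
df$BPLevel <- "Missing"
df$BPLevel[which(df$Systolic >= 140 | df$Diastolic >= 90)] <- "Hypertension"
df$BPLevel[which(in_pre_range)] <- "Prehypertension"
df$BPLevel[which(df$Systolic < 120 | df$Diastolic < 80)] <- "Normal"

print(df)
```

On the sample data this should give the same labels as the case_when() version.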

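One open question: in your example, what should happen when Systolic is exactly 120 or Diastolic is exactly 80? Your ranges are written with strict inequalities, which would leave those values in no category. dplyr::between() is inclusive at both ends, so my version puts 120 and 80 in "Prehypertension". See ?dplyr::between for details.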
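
One more thing to watch: because the conditions are joined with | and "Normal" is checked first, a reading such as 150/70 is labelled "Normal" (its diastolic value is below 80), and a row with a missing systolic value but a low diastolic value is also labelled "Normal". If that is not what you want, a possible variant is to check for missing values first and then test from the most severe category down, so the first match is always the highest level either reading reaches. This is only a sketch: it assumes missing values have been read in as NA rather than ".", it assumes 120 and 80 belong to "Prehypertension", and rows 8 and 9 are hypothetical readings added to show the edge cases.

```r
library(dplyr)

# Sample data from the question plus two hypothetical rows (IDs 8 and 9).
df <- data.frame(ID = 1:9,
                 Systolic = c(130, 118, 120, 115, 184, 114, 95, 150, NA),
                 Diastolic = c(80, 76, 80, 74, 107, 69, 72, 70, 75))

df <- df %>%
  mutate(BPLevel = case_when(
    # Handle missing readings explicitly before any comparison.
    is.na(Systolic) | is.na(Diastolic) ~ "Missing",
    # Most severe category first: either reading is enough to qualify.
    Systolic >= 140 | Diastolic >= 90 ~ "Hypertension",
    Systolic >= 120 | Diastolic >= 80 ~ "Prehypertension",
    # Everything left has Systolic < 120 and Diastolic < 80.
    TRUE ~ "Normal"
  ))

print(df)
```

Ordering the checks this way also removes the need for between(), so non-integer readings such as 139.5 cannot fall between two ranges.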

Does this help solve your problem?


 