
Create new column from conditions on multiple columns in R

What I'm trying to write would be written with the apply function in Python:

def categorise(row):
  # Map colC to a category; anything outside the ranges below is 'D'
  if 0 < row['colC'] <= 99:
    return 'A'
  elif 100 < row['colC'] <= 199:
    return 'B'
  elif 200 < row['colC'] <= 299:
    return 'C'
  return 'D'

# axis=1 passes each row to categorise
df['colF'] = df.apply(categorise, axis=1)

This is the R code I have at the moment:

myf <- function(x) {
  count <- 0
  if (x[, "BMICat"] == 4) {
    count <- count + 1
  }
  if (x[, "SleepTimeCat"] == 1 | x[, "SleepTimeCat"] == 4) {
    count <- count + 1
  }
  if (x[, "MentalHealthCat"] == 3) {
    count <- count + 1
  }
  if (x[, "Smoking"] == TRUE) {
    count <- count + 1
  }
  if (x[, "PhysicalActivity"] == FALSE) {
    count <- count + 1
  }

  return(count)
}

dfAugment %>%
  mutate(BadHabits = myf(.))

I often get stuck trying to apply this pattern in R. Is my approach uncommon in R?

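If I understand your question correctly, one possible solution is to create a 0/1 dummy variable for each condition and then add them together. Unlike if(), which expects a single TRUE or FALSE, if_else() is vectorized, so each condition is evaluated for every row at once.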

library(dplyr)

dfAugment <- data.frame(BMICat = c(1, 2, 4, 4),
                        SleepTimeCat = c(1, 2, 3, 4))

dfAugment |> 
  mutate(risk_sum = if_else(BMICat == 4, 1, 0) +
                    if_else(SleepTimeCat == 1 | SleepTimeCat == 4, 1, 0))

Output

#>   BMICat SleepTimeCat risk_sum
#> 1      1            1        1
#> 2      2            2        0
#> 3      4            3        1
#> 4      4            4        2

Created on 2022-06-22 by the reprex package (v2.0.1)
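
The same idea can be extended to all five conditions from the question. The sketch below is only an illustration: the MentalHealthCat, Smoking and PhysicalActivity values are made up, and it uses base R instead of dplyr. It relies on the fact that adding logical vectors treats TRUE as 1 and FALSE as 0, so no if_else() is needed.

```r
# Hypothetical sample data: column names come from the question,
# but the last three columns and all values are invented for illustration.
dfAugment <- data.frame(
  BMICat           = c(1, 2, 4, 4),
  SleepTimeCat     = c(1, 2, 3, 4),
  MentalHealthCat  = c(3, 1, 3, 2),
  Smoking          = c(TRUE, FALSE, FALSE, TRUE),
  PhysicalActivity = c(TRUE, TRUE, FALSE, FALSE)
)

# Each comparison yields one TRUE/FALSE per row. Adding the logical
# vectors coerces TRUE to 1 and FALSE to 0, so the sum is the per-row
# count of conditions that hold.
dfAugment$BadHabits <- with(
  dfAugment,
  (BMICat == 4) +
    (SleepTimeCat %in% c(1, 4)) +
    (MentalHealthCat == 3) +
    Smoking +
    (!PhysicalActivity)
)

print(dfAugment)
```

The same sum of conditions should also work unchanged as the right-hand side of a mutate() call.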

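For the Python categorise() example in the question, a vectorized counterpart in base R could look like the sketch below. The data frame df and its colC values are assumptions chosen to exercise each branch. Nested ifelse() calls play the role of the if/elif chain, and dplyr's case_when() would be an alternative way to write the same thing.

```r
# Hypothetical data: values chosen to hit every branch, including the
# boundary value 100, which matches no range and so falls through to "D".
df <- data.frame(colC = c(50, 100, 150, 250, 300))

# ifelse() is vectorized: it tests every element of colC at once and
# falls through to the next ifelse() where the condition is FALSE.
df$colF <- with(
  df,
  ifelse(colC > 0 & colC <= 99, "A",
    ifelse(colC > 100 & colC <= 199, "B",
      ifelse(colC > 200 & colC <= 299, "C", "D")
    )
  )
)

print(df)
```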