R if statement with multiple conditions and multiple outcomes

I wonder if there is a function for the following. I have a data frame like this:

test <- data.frame(replicate(2,sample(0:100,1000,rep=TRUE)))
cutoff.X1 <- 20
cutoff.X2 <- 60
test
    X1 X2
1   63 79
2   68 76
3   23 67
4   21 48
5   78 84
6   35 47
7   34 20
8   24 36
9   32 41
10  92 90
11  77 20
12  21 54
13  90 81
14  69 43
15  70 30
16  56 31
17   1 74
18 100 71
19  72 36
20  88 55

What I want is an added column stating X1, X2, none, or both, depending on whether the value of X1 is above cutoff.X1, the value of X2 is above cutoff.X2, neither is above its cutoff, or both are. I know how to do it with multiple if statements, but since the real data set is large, I wonder if there is a way that avoids processing more data than necessary.

Here's a simple approach using some math and logical vectors. This leverages the fact that R coerces TRUE to 1 and FALSE to 0 in arithmetic, so + TRUE evaluates to 1.

First, make a character vector with the choices. Then test if X1 is greater than its cutoff. That will equal 1 when TRUE. Then test if X2 is greater than its cutoff and multiply by 2. Then add those two numbers together and add 1. The total will equal 1 when neither is above its cutoff, 2 when only X1 is, 3 when only X2 is, and 4 when both are.

Finally, subset your character vector using the integer vector you just created.

test$above <- c("neither","X1","X2","both")[((test$X1 > cutoff.X1) + ((test$X2 > cutoff.X2) * 2)) + 1]
head(test,10)
   X1 X2   above
1  64 51      X1
2  39 31      X1
3  24 14      X1
4  74 57      X1
5  67 91    both
6   7  6 neither
7  14 78      X2
8  74 92    both
9  18 93      X2
10 27 31      X1
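
To make the index arithmetic easier to follow, here is a minimal sketch on four hand-picked rows, one per possible outcome. The data frame demo and its values are made up for illustration and are not part of the original data; the cutoffs are the ones from the question.

```r
# Hypothetical four-row example: one row for each possible outcome
demo <- data.frame(X1 = c(10, 50, 10, 50), X2 = c(30, 30, 90, 90))
cutoff.X1 <- 20
cutoff.X2 <- 60

# Each comparison yields TRUE/FALSE, which arithmetic coerces to 1/0.
# X1 contributes 0 or 1, X2 contributes 0 or 2, and the final + 1
# shifts the range from 0..3 to 1..4 so it can be used as an index.
idx <- (demo$X1 > cutoff.X1) + (demo$X2 > cutoff.X2) * 2 + 1
idx
# 1 2 3 4

# Look up the label for each row by position
labels <- c("neither", "X1", "X2", "both")
demo$above <- labels[idx]
demo$above
# "neither" "X1" "X2" "both"
```

Note that a missing value in X1 or X2 makes the index NA, so the label for that row will be NA as well.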

If you don't like that, there's always dplyr::case_when , which is admittedly easier to read:

library(dplyr)
test$above <- case_when(test$X1 > cutoff.X1 & test$X2 > cutoff.X2 ~ "both",
                        test$X1 > cutoff.X1 ~ "X1",
                        test$X2 > cutoff.X2 ~ "X2",
                        TRUE ~ "neither")

Just remember that case_when checks the conditions in order and, for each row, uses the first one that evaluates to TRUE, so be sure to put the "both" condition first.
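
As a rough illustration of why the order matters, the sketch below runs the same conditions in two different orders on two made-up values (x1, x2, wrong and right are names chosen for this example only). It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical values: the second element is above both cutoffs
x1 <- c(50, 50)
x2 <- c(30, 90)

# "both" listed after the single-column checks: the second element
# already matches x1 > 20, so it never reaches the "both" condition
wrong <- case_when(
  x1 > 20 ~ "X1",
  x2 > 60 ~ "X2",
  x1 > 20 & x2 > 60 ~ "both",
  TRUE ~ "neither"
)
wrong
# "X1" "X1"

# "both" listed first: the most specific condition wins
right <- case_when(
  x1 > 20 & x2 > 60 ~ "both",
  x1 > 20 ~ "X1",
  x2 > 60 ~ "X2",
  TRUE ~ "neither"
)
right
# "X1" "both"
```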

I think you are after if_else() from dplyr:

library(dplyr)

test <- data.frame(replicate(2, sample(0:100, 1000, replace = TRUE)))
cutoff.X1 <- 20
cutoff.X2 <- 60
test %>% 
  mutate(
    X3 = if_else(
      X1 > cutoff.X1,
      if_else(
        X2 > cutoff.X2,
        "both",
        "X1"
      ),
      if_else(
        X2 > cutoff.X2,
        "X2",
        "none"
      )
    )
  )

Result:

     X1  X2   X3
1     9   1 none
2    32  30   X1
3    30   7   X1
4    79  36   X1
5    70   0   X1
6     0  12 none
7    59  21   X1
8     5  38 none
9    57   4   X1
10   41  69 both
11   20  98   X2

You can use multiple ifelse() statements in this way:

test$Result<-ifelse(test$X1<cutoff.X1,"X1 under cutoff","")

One for every condition you want to check.
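
One possible way to extend this to all four outcomes in base R is to nest the ifelse() calls, as sketched below. The small data frame and the helper vectors above.X1 and above.X2 are assumptions made for this example; the labels follow the question's wording.

```r
# Hypothetical four-row data frame: one row for each possible outcome
test <- data.frame(X1 = c(10, 50, 10, 50), X2 = c(30, 30, 90, 90))
cutoff.X1 <- 20
cutoff.X2 <- 60

# Compute each comparison once and reuse it
above.X1 <- test$X1 > cutoff.X1
above.X2 <- test$X2 > cutoff.X2

# Nested ifelse(): check "both" first, then each column on its own
test$Result <- ifelse(above.X1 & above.X2, "both",
               ifelse(above.X1, "X1",
               ifelse(above.X2, "X2", "none")))
test$Result
# "none" "X1" "X2" "both"
```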
