简体   繁体   中英

Ifelse statement order with is.na using R Dplyr Mutate

I created some sample data below to help illustrate my question.

 library(dplyr)
 col1 = paste(rep('var',5),seq(1:5), sep = "")
 Value = c(1,1,0,NA,NA)
 p1 <- data.frame(col1,Value)

> p1
 col1 Value
 var1     1
 var2     1
 var3     0
 var4    NA
 var5    NA

When is.na(Value) is placed first in the ifelse statement, mutate works as expected.

> p1 %>% mutate(NewCol = ifelse(is.na(Value), "TestYes",
                         ifelse(Value == 1, "Test1Yes",
                         ifelse(Value == 0, "Test0Yes","No"))))
 col1 Value   NewCol
 var1     1 Test1Yes
 var2     1 Test1Yes
 var3     0 Test0Yes
 var4    NA  TestYes
 var5    NA  TestYes

When I place is.na(Value) as the second ifelse statement, it doesnt work. But the third ifelse statement still works checking for Value == 0. The second ifelse statement with is.na(Value) is skipped over.

> p1 %>% mutate(NewCol = ifelse(Value == 1, "Test1Yes",
                         ifelse(is.na(Value), "TestYes",
                         ifelse(Value == 0, "Test0Yes","No"))))
  col1 Value   NewCol
  var1     1 Test1Yes
  var2     1 Test1Yes
  var3     0 Test0Yes
  var4    NA     <NA>
  var5    NA     <NA>

Am I missing something in the code or is there a reason why is.na needs to be placed first in the ifelse statements?

When comparing with == NA values return NA . When the first statement returns an NA value it doesn't go and check the next ifelse statement. To go to the next ifelse statement it needs a FALSE value.

p1$Value == 1
#[1]  TRUE  TRUE FALSE    NA    NA

A workaround would be to use %in% instead of == which returns FALSE for NA values.

p1$Value %in% 1
#[1]  TRUE  TRUE FALSE FALSE FALSE

library(dplyr)

p1 %>% mutate(NewCol = ifelse(Value %in% 1, "Test1Yes",
                              ifelse(is.na(Value), "TestYes",
                                     ifelse(Value %in% 0, "Test0Yes","No"))))

#  col1 Value   NewCol
#1 var1     1 Test1Yes
#2 var2     1 Test1Yes
#3 var3     0 Test0Yes
#4 var4    NA  TestYes
#5 var5    NA  TestYes

You can also get the desired behaviour using case_when statement instead of nested ifelse .

p1 %>% 
  mutate(NewCol = case_when(Value == 1 ~ "Test1Yes",
                            is.na(Value) ~ "TestYes",
                            Value == 0 ~ "Test0Yes",
                            TRUE ~ "No"))

#  col1 Value   NewCol
#1 var1     1 Test1Yes
#2 var2     1 Test1Yes
#3 var3     0 Test0Yes
#4 var4    NA  TestYes
#5 var5    NA  TestYes

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM