
Ifelse statement order with is.na using R Dplyr Mutate

I created some sample data below to help illustrate my question.

 library(dplyr)
 # Labels var1..var5 and a Value column that ends in two NAs
 col1 <- paste0("var", 1:5)
 Value <- c(1, 1, 0, NA, NA)
 p1 <- data.frame(col1, Value)

> p1
 col1 Value
 var1     1
 var2     1
 var3     0
 var4    NA
 var5    NA

When is.na(Value) is placed first in the ifelse statement, mutate works as expected.

> p1 %>% mutate(NewCol = ifelse(is.na(Value), "TestYes",
                         ifelse(Value == 1, "Test1Yes",
                         ifelse(Value == 0, "Test0Yes","No"))))
 col1 Value   NewCol
 var1     1 Test1Yes
 var2     1 Test1Yes
 var3     0 Test0Yes
 var4    NA  TestYes
 var5    NA  TestYes

When I place is.na(Value) in the second ifelse statement, it doesn't work: the rows where Value is NA come back as <NA> instead of "TestYes". The third ifelse statement, which checks Value == 0, still works, so it looks as if the second ifelse statement with is.na(Value) is skipped over.

> p1 %>% mutate(NewCol = ifelse(Value == 1, "Test1Yes",
                         ifelse(is.na(Value), "TestYes",
                         ifelse(Value == 0, "Test0Yes","No"))))
  col1 Value   NewCol
  var1     1 Test1Yes
  var2     1 Test1Yes
  var3     0 Test0Yes
  var4    NA     <NA>
  var5    NA     <NA>

Am I missing something in the code, or is there a reason why is.na needs to be placed first in the ifelse statements?

Comparing an NA value with == returns NA, not FALSE. When the test of the first ifelse is NA for a row, ifelse returns NA for that row and the result of the nested ifelse is never used there. For a row to fall through to the next ifelse statement, the test has to be FALSE.

p1$Value == 1
#[1]  TRUE  TRUE FALSE    NA    NA
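To make the NA handling visible in isolation, here is a minimal base R sketch. The names test and res and the placeholder string "fallthrough" are only illustrative and are not part of the original code.

```r
# Same Value column as in the question
Value <- c(1, 1, 0, NA, NA)

# The comparison gives a three-valued logical vector: TRUE, FALSE or NA
test <- Value == 1

# ifelse() takes TRUE positions from `yes` and FALSE positions from `no`;
# positions where the test is NA stay NA, whatever `no` contains
res <- ifelse(test, "Test1Yes", "fallthrough")
print(res)
```

Only the third element reaches the `no` branch. The last two stay NA, which is why a nested is.na(Value) check placed in `no` never gets a chance to relabel them.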

A workaround would be to use %in% instead of ==. Unlike ==, %in% returns FALSE rather than NA when an NA value is compared against a non-NA value.

p1$Value %in% 1
#[1]  TRUE  TRUE FALSE FALSE FALSE
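If you would rather keep ==, an explicit NA guard should give the same logical vector. This is only a sketch of an alternative; the names via_in and via_guard are illustrative.

```r
Value <- c(1, 1, 0, NA, NA)

# %in% is built on match(), so an NA on the left simply fails to match 1
via_in <- Value %in% 1

# Equivalent guard with ==: FALSE & NA is FALSE, so NA rows become FALSE
via_guard <- !is.na(Value) & Value == 1

print(via_in)
print(via_guard)

# Caveat: NA does match NA under %in%, so this is TRUE for the NA rows
print(Value %in% NA)
```

Either form yields a test that is never NA, so every row that is not a match falls through to the next ifelse.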

library(dplyr)

p1 %>% mutate(NewCol = ifelse(Value %in% 1, "Test1Yes",
                              ifelse(is.na(Value), "TestYes",
                                     ifelse(Value %in% 0, "Test0Yes","No"))))

#  col1 Value   NewCol
#1 var1     1 Test1Yes
#2 var2     1 Test1Yes
#3 var3     0 Test0Yes
#4 var4    NA  TestYes
#5 var5    NA  TestYes

You can also get the desired behaviour using a case_when statement instead of nested ifelse.

p1 %>% 
  mutate(NewCol = case_when(Value == 1 ~ "Test1Yes",
                            is.na(Value) ~ "TestYes",
                            Value == 0 ~ "Test0Yes",
                            TRUE ~ "No"))

#  col1 Value   NewCol
#1 var1     1 Test1Yes
#2 var2     1 Test1Yes
#3 var3     0 Test0Yes
#4 var4    NA  TestYes
#5 var5    NA  TestYes
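The reason this works is that case_when uses the first condition that is TRUE for each row, and a condition that evaluates to NA simply does not match. As a small sketch (assuming dplyr is installed; res is an illustrative name), the is.na check can even go last:

```r
library(dplyr)

Value <- c(1, 1, 0, NA, NA)

# Each row takes the first branch whose condition is TRUE.
# For the NA rows, Value == 1 and Value == 0 are NA, so they are
# passed over and is.na(Value) is the first condition that matches.
res <- case_when(
  Value == 1   ~ "Test1Yes",
  Value == 0   ~ "Test0Yes",
  is.na(Value) ~ "TestYes",
  TRUE         ~ "No"
)
print(res)
```

So with case_when the position of the is.na check only matters relative to a catch-all TRUE branch, which should stay last.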
