Ifelse statement order with is.na using R Dplyr Mutate
I created some sample data below to help illustrate my question.
library(dplyr)
col1 <- paste0("var", 1:5)
Value <- c(1, 1, 0, NA, NA)
p1 <- data.frame(col1, Value)
> p1
col1 Value
var1 1
var2 1
var3 0
var4 NA
var5 NA
When is.na(Value) is placed first in the ifelse statement, mutate works as expected.
> p1 %>% mutate(NewCol = ifelse(is.na(Value), "TestYes",
ifelse(Value == 1, "Test1Yes",
ifelse(Value == 0, "Test0Yes","No"))))
col1 Value NewCol
var1 1 Test1Yes
var2 1 Test1Yes
var3 0 Test0Yes
var4 NA TestYes
var5 NA TestYes
When I place is.na(Value) as the second ifelse statement, it doesn't work: the rows with NA come back as NA instead of "TestYes". The third ifelse statement, which checks for Value == 0, still works, so it looks as if the second ifelse statement with is.na(Value) is skipped over.
> p1 %>% mutate(NewCol = ifelse(Value == 1, "Test1Yes",
ifelse(is.na(Value), "TestYes",
ifelse(Value == 0, "Test0Yes","No"))))
col1 Value NewCol
var1 1 Test1Yes
var2 1 Test1Yes
var3 0 Test0Yes
var4 NA <NA>
var5 NA <NA>
Am I missing something in the code, or is there a reason why is.na needs to be placed first in the ifelse statements?
When compared with ==, NA values return NA rather than FALSE. When the test of the first ifelse evaluates to NA for an element, ifelse returns NA for that element and never reaches the nested ifelse. To fall through to the next ifelse statement, the test needs to be FALSE.
p1$Value == 1
#[1] TRUE TRUE FALSE NA NA
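The same effect can be seen without dplyr. The following is a minimal base R sketch; the names value, test and res are only illustrative and do not come from the question.

```r
# Sample vector mirroring the Value column from the question
value <- c(1, 1, 0, NA, NA)

# The comparison propagates NA instead of returning FALSE
test <- value == 1

# ifelse() yields NA wherever the test is NA, so the "no" branch
# (where a nested ifelse would sit) is never used for those elements
res <- ifelse(test, "yes", "no")
print(res)
```

The last two elements of res are NA, which matches the <NA> rows in the question's second output.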
A workaround would be to use %in% instead of ==, since %in% returns FALSE for NA values.
p1$Value %in% 1
#[1] TRUE TRUE FALSE FALSE FALSE
library(dplyr)
p1 %>% mutate(NewCol = ifelse(Value %in% 1, "Test1Yes",
ifelse(is.na(Value), "TestYes",
ifelse(Value %in% 0, "Test0Yes","No"))))
# col1 Value NewCol
#1 var1 1 Test1Yes
#2 var2 1 Test1Yes
#3 var3 0 Test0Yes
#4 var4 NA TestYes
#5 var5 NA TestYes
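If you would rather keep == than switch to %in%, another option is to guard each comparison with !is.na(). This is only a sketch in base R; the names value, is_one, is_zero and new_col are illustrative, and the labels are copied from the question.

```r
value <- c(1, 1, 0, NA, NA)

# FALSE & NA is FALSE, so the guarded tests never evaluate to NA
is_one  <- !is.na(value) & value == 1
is_zero <- !is.na(value) & value == 0

# With NA-free tests, the order of the nested ifelse calls no longer matters
new_col <- ifelse(is_one, "Test1Yes",
           ifelse(is.na(value), "TestYes",
           ifelse(is_zero, "Test0Yes", "No")))
print(new_col)
```

The same guarded expressions can be used inside mutate() in place of Value == 1 and Value == 0.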
You can also get the desired behaviour using a case_when statement instead of nested ifelse.
p1 %>%
mutate(NewCol = case_when(Value == 1 ~ "Test1Yes",
is.na(Value) ~ "TestYes",
Value == 0 ~ "Test0Yes",
TRUE ~ "No"))
# col1 Value NewCol
#1 var1 1 Test1Yes
#2 var2 1 Test1Yes
#3 var3 0 Test0Yes
#4 var4 NA TestYes
#5 var5 NA TestYes
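The order does not matter here because case_when appears to treat a condition that evaluates to NA as "no match" and moves on to the next condition, rather than returning NA straight away. A small sketch on a bare vector, assuming dplyr is installed (value and res are illustrative names):

```r
library(dplyr)

value <- c(1, 0, NA)

# value == 1 is NA for the third element; case_when() does not stop there
# but goes on to test is.na(value), which matches
res <- case_when(value == 1 ~ "one",
                 is.na(value) ~ "missing",
                 TRUE ~ "other")
print(res)
```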