Why does dplyr recode generate an error when recoding to NA but not NaN?

I'm recoding with dplyr. I'm getting an error when I recode a value to NA, but not NaN. Here's an example:

df <- df %>% mutate(var=recode(var,`2`=0,`3`=NaN))

Works fine, whereas

df <- df %>% mutate(var=recode(var,`2`=0,`3`=NA))

gives me the following error:

Error: Vector 2 must be a double vector, not a logical vector

Running the code reproduces the error:

tibble(var = rep(2:3, 4)) %>% 
 mutate(var=recode(var,`2`=0,`3`=NA)) 
# Error: Vector 2 must be a double vector, not a logical vector

This is because a bare NA is logical, while recode expects a double here: the first replacement, 0, is a double, and every replacement must have the same type.

class(NA)
# [1] "logical"

You can use NA_real_ instead, since that is a double:

class(NA_real_)
# [1] "numeric"
is.double(NA_real_)
# [1] TRUE

tibble(var = rep(2:3, 4)) %>% 
 mutate(var=recode(var,`2`=0,`3`=NA_real_)) 
#     var
#   <dbl>
# 1     0
# 2    NA
# 3     0
# 4    NA
# 5     0
# 6    NA
# 7     0
# 8    NA
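
The same fix carries over to other column types. As a small base R sketch (the name na_types is illustrative, not from the question), each atomic type has its own typed missing value, and a bare NA is the logical one:

```r
# Typed missing-value constants in base R; a bare NA is logical.
na_types <- c(
  bare      = typeof(NA),
  integer   = typeof(NA_integer_),
  double    = typeof(NA_real_),
  character = typeof(NA_character_)
)
print(na_types)
```

So when the other replacements are strings, NA_character_ would be the one to pass to recode, and NA_integer_ when they are integers such as 0L.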

As for why it expects a double, ?recode says:

All replacements must be the same type, and must have either length one or the same length as .x.

I think this is unexpected because base functions like c don't require the elements to be of the same type; they simply coerce everything up to the highest type present. So this works:

c(1, NA, 3)

Because for the c function:

The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression
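
To make the hierarchy concrete, here is a minimal base R check (the variable names x, y, z are illustrative) showing that the logical NA is promoted to whatever type the other elements have:

```r
# c() promotes all elements to the highest type present,
# so a logical NA is silently converted.
x <- c(1, NA, 3)     # double elements
y <- c(1L, NA)       # integer elements
z <- c("a", NA)      # character elements

typeof(x)   # "double"
typeof(y)   # "integer"
typeof(z)   # "character"
is.na(x[2]) # TRUE
```

recode does not perform this promotion for its replacements, which is why it needs the typed NA.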

Another option for changing a specific value to NA is na_if:

library(dplyr)
df %>% 
   mutate(var = na_if(var, 3))
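
As a quick check on the sample data given at the end of this answer (a sketch, assuming dplyr is installed; out is just an illustrative name), na_if keeps the column's type, so no typed NA is needed:

```r
library(dplyr)

df <- data.frame(var = c(2, 3, 4, 5, 3))

# Replace every 3 with NA; the column stays a double.
out <- df %>% mutate(var = na_if(var, 3))
print(out$var)
typeof(out$var)
```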

With recode, @IceCreamToucan's answer is great. But if we want the NA to adapt automatically to an integer or numeric column, we can exploit the fact that multiplying NA by a value returns an NA of that value's type:

df %>% 
    mutate(var = recode(var,`2`=0,`3`=NA* var[!is.na(var)][1]))
#    var
#1   0
#2  NA
#3   4
#4   5
#5  NA

Other functions that return an NA of the matching type work as well:

df %>% 
    mutate(var = recode(var,`2`=0,`3`= max(var[1], NA)))
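
Both tricks rely on base R's coercion rules: combining a logical NA with a number yields an NA of that number's type. A minimal check, using literal values in place of the column (na_result_types is an illustrative name):

```r
# A logical NA takes on the type of the number it is combined with.
na_result_types <- c(
  times_double  = typeof(NA * 2),
  times_integer = typeof(NA * 2L),
  max_double    = typeof(max(2, NA)),
  max_integer   = typeof(max(2L, NA))
)
print(na_result_types)

# The value itself is still missing in every case.
is.na(NA * 2)
is.na(max(2L, NA))
```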

data

df <- data.frame(var = c(2, 3, 4, 5, 3))
