
Why is dplyr filter() not capturing NAs?

I have the following data frame

  FileNumber ReferralDate Status
1  510709784   2018-10-07 CLOSED
2         NA         <NA>   <NA>
3  510704781   2018-05-04 CLOSED
4         NA         <NA>   <NA>
5         NA         <NA>   <NA>
6         NA         <NA>   <NA>

This is the structure of the data frame

'data.frame':   6 obs. of  3 variables:
 $ FileNumber  : int  510709784 NA 510704781 NA NA NA
 $ ReferralDate: chr  "2018-10-07" NA "2018-05-04" NA ...
 $ Status      : chr  "CLOSED" NA "CLOSED" NA ...

When I try to capture the NA values in either the FileNumber column or the Status column with the following code, I get zero rows back. Why is this happening?

  > df%>%filter(Status=="<NA>")
[1] FileNumber   ReferralDate Status      
<0 rows> (or 0-length row.names)
> df%>%mutate(Status=as.factor(Status))%>%filter(Status=="<NA>")
[1] FileNumber   ReferralDate Status      
<0 rows> (or 0-length row.names)
> df%>%filter(FileNumber=="NA")
[1] FileNumber   ReferralDate Status      
<0 rows> (or 0-length row.names)
A reproducible version of the data frame:

library(dplyr)

df <- data.frame(FileNumber = c(510709784, NA, 510704781, NA, NA, NA),
                 ReferralDate = c("2018-10-07", NA, "2018-05-04", NA, NA, NA),
                 Status = c("CLOSED", NA, "CLOSED", NA, NA, NA),
                 stringsAsFactors = FALSE)

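Use is.na() to test for NA, not ==. Comparing a value to NA with == returns NA rather than TRUE, and filter() keeps only the rows where the condition is TRUE. The "<NA>" in the printed output is just how R displays a missing string; it is not a value stored in the column.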

df %>% filter(is.na(Status))
  FileNumber ReferralDate Status
1         NA         <NA>   <NA>
2         NA         <NA>   <NA>
3         NA         <NA>   <NA>
4         NA         <NA>   <NA>
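
To illustrate the point, here is a minimal sketch that reuses the same df and shows what the == comparisons from the question actually evaluate to, plus two variations on the is.na() filter. The variable names (na_compare, status_match, missing_either, complete_status) are made up for this example and are not from the original post.

```r
library(dplyr)

# Same data as in the answer above.
df <- data.frame(FileNumber = c(510709784, NA, 510704781, NA, NA, NA),
                 ReferralDate = c("2018-10-07", NA, "2018-05-04", NA, NA, NA),
                 Status = c("CLOSED", NA, "CLOSED", NA, NA, NA),
                 stringsAsFactors = FALSE)

# Comparing NA with anything yields NA, never TRUE.
na_compare <- NA == "NA"

# "<NA>" is only the printed representation of a missing string,
# so no element of Status is equal to it.
status_match <- df$Status == "<NA>"   # FALSE NA FALSE NA NA NA

# filter() drops rows where the condition is FALSE or NA,
# which is why the attempts in the question return zero rows.

# Rows where either FileNumber or Status is missing.
missing_either <- df %>% filter(is.na(FileNumber) | is.na(Status))

# The complement: rows where Status is present.
complete_status <- df %>% filter(!is.na(Status))
```

With this data, missing_either should contain the four all-NA rows and complete_status the two CLOSED rows.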
