How to filter data without losing NA rows using dplyr

Question

How to subset data in R without losing NA rows?

The post above subsets using logical indexing. Is there a way to do it in dplyr?

Also, when does dplyr automatically drop NAs? In my experience, it removes NA rows when I filter out a specific string, e.g.:

b = a %>% filter(col != "str")

I would expect this not to exclude NA values, but it does. However, when I write the filter in a different form, it does not automatically exclude NA, e.g.:

b = a %>% filter(!grepl("str", col))

I would like to understand this behavior of filter. I would appreciate any help. Thank you!

Answer 1

The documentation for dplyr::filter says: "Unlike base subsetting, rows where the condition evaluates to NA are dropped."

NA != "str" evaluates to NA, so that row is dropped by filter.

grepl("str", NA) returns FALSE rather than NA, so !grepl("str", NA) returns TRUE and that row is kept.

If you want filter to keep NA rows, you could use filter(is.na(col) | col != "str").
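The behavior described above can be checked on a small example. This is only a sketch: the data frame `a` and its column `col` are made-up illustration data (borrowed from the shape used in Answer 2), and `dropped` and `kept` are hypothetical names.

```r
library(dplyr)

# Made-up example data: one ordinary value, one NA, one value to filter out
a <- data.frame(col = c("hello", NA, "str"), stringsAsFactors = FALSE)

# A comparison involving NA yields NA, not TRUE or FALSE
cmp <- a$col != "str"

# grepl() reports FALSE for NA input, so the negation is TRUE
neg <- !grepl("str", a$col)

# filter() drops rows where the condition is NA
dropped <- a %>% filter(col != "str")

# Adding is.na(col) makes the condition TRUE for NA rows, so they are kept
kept <- a %>% filter(is.na(col) | col != "str")
```

Here `dropped` contains only the "hello" row, while `kept` contains both the "hello" row and the NA row.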

Answer 2

If you want to keep the rows where the filter condition evaluates to NA, you can simply turn the NAs in the condition into TRUEs using replace_na from tidyr.

library(dplyr)
library(tidyr)

a <- data.frame(col = c("hello", NA, "str"))
a %>% filter((col != "str") %>% replace_na(TRUE))
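As a rough check, the sketch below compares this approach with the is.na() form from Answer 1 on the same example data. It also shows `dplyr::coalesce()` as a possible alternative that avoids the tidyr dependency; that variant is not from either answer and is offered only as an assumption-labeled sketch. The names `via_is_na`, `via_replace_na`, and `via_coalesce` are hypothetical.

```r
library(dplyr)
library(tidyr)

a <- data.frame(col = c("hello", NA, "str"), stringsAsFactors = FALSE)

# Answer 1: explicitly allow NA rows through
via_is_na <- a %>% filter(is.na(col) | col != "str")

# Answer 2: turn NA results of the condition into TRUE
via_replace_na <- a %>% filter((col != "str") %>% replace_na(TRUE))

# Alternative (not from the answers): coalesce() fills NA with TRUE
via_coalesce <- a %>% filter(coalesce(col != "str", TRUE))
```

On this data all three keep the same two rows ("hello" and NA). The replace_na and coalesce forms may be more convenient when the condition is long, since the column does not have to be named twice.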

2 answers

solution1
24 ACCEPTED 2017-09-23 10:26:30

solution2
13 2019-06-19 19:59:28
