How do I pick the rows in a data frame that have at least one variable with a non-missing value?

Question

In a data frame, I only want to keep the rows in which at least one of the variables starting with DSDECOD is not empty. How can I do that?

The following code seems to work.

ds_sub <- subset(ds_supp, (DSDECOD1 !="" | DSDECOD2 !="" |
    DSDECOD3 !="" | DSDECOD4 !=""))

But is there a simpler way, so that I don't have to write out all of the variables starting with DSDECOD?

Answer 1

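Maybe using rowSums and grepl: select the columns whose names start with DSDECOD, compare every cell with "", and keep the rows where at least one comparison is TRUE.

ds_supp[rowSums(ds_supp[, grepl("^DSDECOD", names(ds_supp))] != "") > 0, ]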

  ID DSDECOD1 DSDECOD2 DSDECOD3 DSDECOD4
1  1                          B         
2  2        A                 A        A
3  3        B                          B
5  5        C                 C        C
6  6                          D        D

Data:

  ID DSDECOD1 DSDECOD2 DSDECOD3 DSDECOD4
1  1                          B         
2  2        A                 A        A
3  3        B                          B
4  4                                     # <- empty row
5  5        C                 C        C
6  6                          D        D
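
The rowSums idea can be sketched as a self-contained script. The data frame below is hypothetical: it is only modelled on the example above, with one NA added to show that a plain != "" comparison would propagate NA into rowSums. The names code_cols, codes and has_value are illustrative and not part of the original answer.

```r
# Hypothetical data modelled on the example above; row 4 has no codes at all
ds_supp <- data.frame(
  ID       = 1:6,
  DSDECOD1 = c("", "A", "B", "", "C", ""),
  DSDECOD2 = c("", "", "", NA, "", ""),
  DSDECOD3 = c("B", "A", "", "", "C", "D"),
  DSDECOD4 = c("", "A", "B", "", "C", "D"),
  stringsAsFactors = FALSE
)

# Logical index of the columns whose names start with "DSDECOD"
code_cols <- startsWith(names(ds_supp), "DSDECOD")

# drop = FALSE keeps a data frame even if only one column matches
codes <- ds_supp[, code_cols, drop = FALSE]

# TRUE where a cell holds a real value (neither NA nor an empty string)
has_value <- !is.na(codes) & codes != ""

# Keep the rows with at least one real value
ds_sub <- ds_supp[rowSums(has_value) > 0, ]
print(ds_sub)
```

Treating NA as empty is an assumption here; if the columns never contain NA, the one-liner above behaves the same way.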

Answer 2

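You could try select together with the remove_empty function from the janitor package. Note that remove_empty only drops rows that are entirely NA, so the empty strings have to be converted to NA first. Also note that select discards every column whose name does not contain DSDECOD, such as an ID column.

library(dplyr)

ds_supp %>%
  select(contains("DSDECOD")) %>%
  mutate(across(everything(), ~ na_if(.x, ""))) %>%  # "" becomes NA
  janitor::remove_empty(which = "rows")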
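
If the other columns (for example an ID) need to stay in the result, one alternative is to filter on the DSDECOD columns instead of selecting them. This is only a sketch: it swaps in dplyr's if_any (available from dplyr 1.0.4) in place of remove_empty, and the data frame is hypothetical, modelled on the example in Answer 1.

```r
library(dplyr)

# Hypothetical data; row 4 has no codes at all
ds_supp <- data.frame(
  ID       = 1:6,
  DSDECOD1 = c("", "A", "B", "", "C", ""),
  DSDECOD2 = c("", "", "", NA, "", ""),
  DSDECOD3 = c("B", "A", "", "", "C", "D"),
  DSDECOD4 = c("", "A", "B", "", "C", "D"),
  stringsAsFactors = FALSE
)

# Keep rows where any DSDECOD column is neither NA nor an empty string;
# all columns, including ID, are retained
ds_sub <- ds_supp %>%
  filter(if_any(starts_with("DSDECOD"), ~ !is.na(.x) & .x != ""))

print(ds_sub)
```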

Answer 3

This regex solution works:

df[!grepl("\\d$", apply(df, 1, paste0, collapse = "")), ]

   id DSDECOD1 DSDECOD2 DSDECOD3
1   1                 A         
2   2        B                  
3   3                          A
4   4                 B         
8   8                          A
9   9                          B
10 10                          A

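This solution works by pasting the values of each row together with paste0 and then dropping the rows whose pasted string ends ($) in a digit (\\d). That happens only when all DSDECOD columns in the row are empty, because the string then consists of the id alone. It assumes that id is the first column and that no DSDECOD value itself ends in a digit.

Reproducible data: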

df <- data.frame(
  id = 1:10,
  DSDECOD1 = c("", "B", rep("",8)),
  DSDECOD2 = c("A","","","B","","","","","",""),
  DSDECOD3 = c("", "", "A", "","","","", "A", "B", "A"))

df
   id DSDECOD1 DSDECOD2 DSDECOD3
1   1                 A         
2   2        B                  
3   3                          A
4   4                 B         
5   5                             # empty 
6   6                             # empty 
7   7                             # empty
8   8                          A
9   9                          B
10 10                          A
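
A variant that does not depend on the id column or on what characters the codes contain is to paste only the DSDECOD columns and test whether the result is non-empty. This is a sketch of that swapped-in idea, not the answer's regex; code_cols, pasted and kept are illustrative names.

```r
df <- data.frame(
  id = 1:10,
  DSDECOD1 = c("", "B", rep("", 8)),
  DSDECOD2 = c("A", "", "", "B", "", "", "", "", "", ""),
  DSDECOD3 = c("", "", "A", "", "", "", "", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Names of the columns that start with "DSDECOD"
code_cols <- grep("^DSDECOD", names(df), value = TRUE)

# Concatenate the code columns row by row (paste0 is vectorised over columns)
pasted <- do.call(paste0, df[code_cols])

# nzchar() is TRUE for non-empty strings, i.e. rows with at least one code
kept <- df[nzchar(pasted), ]
print(kept)
```

Note that paste0 turns NA into the string "NA", so this variant assumes missing codes are stored as empty strings, as in the data above.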

3 answers

solution1
1 ACCEPTED 2020-03-31 16:08:40

solution2
0 2020-03-31 16:48:14

solution3
0 2020-03-31 17:06:46
