简体   繁体   中英

How to drop rows containing NA in specified columns?

I have a dataframe like this

      dep_delay  temp  humid wind_dir precip pressure  date
16983  3         68.00 53.06 NA       0      1020.8    2013-05-07 

26477  42        NA  64.93 360      0        NA        2013-03-07
...
29299  -1        NA    NA    NA       NA     NA        2013-12-31 

29300  33        NA    NA    NA       NA     NA        2013-12-31 

I want to drop only the rows like 29299 and 29300, which contain 5 NA s from temp to pressure (these are consecutive columns), and keep the rows like 16983 and 26477.

desired result:

      dep_delay  temp  humid wind_dir precip pressure  date
16983  3         68.00 53.06 NA       0      1020.8    2013-05-07 

26477  42        NA  64.93 360      0        NA        2013-03-07

In other words, the problem is how to remove only the rows where there are at least 5 NA s in a row.

apparently this is not the right way to do it:

df <- df[!is.na(df$temp:df$pressure),]

Updated based on Yacine Jajji comment.

You can use standard filter function in dplyr package. You set the number of columns which should be never NA . In your case there are 2 : dep_delay and date . Then calculate amount of NA in each row, if the number equals 5 the row will be filtered out. See the code below:

df <- read.table( text = "dep_delay  temp  humid wind_dir precip pressure  date
16983  3         68.00 53.06 NA       0      1020.8    2013-05-07 
26477  42        NA  64.93 360      0        NA        2013-03-07
29299  -1        NA    NA    NA       NA     NA        2013-12-31 
29300  33        NA    NA    NA       NA     NA        2013-12-31")

library(dplyr)

cols_to_remove <- c("temp", "humid", "wind_dir", "precip", "pressure")
df[rowSums(is.na(df[, cols_to_remove])) != 
     ncol(df[, cols_to_remove]), ]

Output:

      dep_delay temp humid wind_dir precip pressure       date
16983         3   68 53.06       NA      0   1020.8 2013-05-07
26477        42   NA 64.93      360      0       NA 2013-03-07

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM