
Searching for greater/less than values with NAs

I have a dataframe for which I've calculated and added a difftime column:

    name   amount   1st_date   2nd_date  days_out
    JEAN  318.5 1971-02-16 1972-11-27  650 days
 GREGORY 1518.5       <NA>       <NA>   NA days
    JOHN  318.5       <NA>       <NA>   NA days
  EDWARD  318.5       <NA>       <NA>   NA days
  WALTER  518.5 1971-07-06 1975-03-14 1347 days
   BARRY 1518.5 1971-11-09 1972-02-09   92 days
   LARRY  518.5 1971-09-08 1972-02-09  154 days
   HARRY  318.5 1971-09-16 1972-02-09  146 days
   GARRY 1018.5 1971-10-26 1972-02-09  106 days

I want to break it out and take subtotals where days_out is 0-60, 61-90, 91-120, 121-180.

For some reason I can't even get bracket notation to work reliably. I would expect members[members$days_out <= 120, ] to show just Barry and Garry, but I get a whole lot of lines like:

NA.1095     <NA>     NA       <NA>       <NA>  NA days
NA.1096     <NA>     NA       <NA>       <NA>  NA days
NA.1097     <NA>     NA       <NA>       <NA>  NA days

Those don't exist in the original data. There's no one without a name. What am I doing wrong here?

This is standard behavior for < and other relational operators: when asked to evaluate whether NA is less than (or greater than, or equal to, or ...) some other number, they return NA, rather than TRUE or FALSE.

Here's an example that should make clear what is going on and point to a simple fix.

x <- c(1, 2, NA, 4, 5)
x[x < 3]
# [1]  1  2 NA
x[x < 3 & !is.na(x)]
# [1] 1 2
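
As a small sketch of why the second form works: a comparison with NA yields NA, but NA & FALSE is FALSE, so the !is.na(x) term turns every NA in the index into FALSE. The variable name keep is only illustrative.

```r
# A comparison involving NA is itself NA
NA < 3        # NA

# Combining with & resolves the NA only when the other side is FALSE
NA & FALSE    # FALSE
NA & TRUE     # NA

x <- c(1, 2, NA, 4, 5)

# The raw comparison carries the NA through to the index
x < 3         # TRUE TRUE NA FALSE FALSE

# Adding !is.na(x) replaces that NA with FALSE, so nothing extra is selected
keep <- x < 3 & !is.na(x)
keep          # TRUE TRUE FALSE FALSE FALSE
x[keep]       # 1 2
```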

To see why all of those rows indexed by NAs have row.names like NA.1095, NA.1096, and so on, try this:

data.frame(a=1:2, b=1:2)[rep(NA, 5),]
#       a  b
# NA   NA NA
# NA.1 NA NA
# NA.2 NA NA
# NA.3 NA NA
# NA.4 NA NA

If you are working at the console, the subset function does not have that annoying 'feature', which is actually due to the behavior of [ more than to the relational operators. subset treats an NA in the condition as FALSE, so those rows are simply dropped.

subset(members, days_out <= 120)

If you are programming, then you can use which, or Josh's conjunction with & !is.na(.). Both have the same effect, because which returns only the positions that are TRUE and ignores NA:

members[ which(members$days_out <= 120), ]
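
To tie this back to the question, here is a hedged sketch that rebuilds the sample rows and compares the NA-safe forms. The column names first_date and second_date are assumptions standing in for 1st_date and 2nd_date, which are not syntactic R names. The last part is one possible way to get the subtotals the question asks for, using cut() on the numeric day counts. It is not taken from the answers above.

```r
# Rebuild the sample data from the question (column names are assumed)
members <- data.frame(
  name   = c("JEAN", "GREGORY", "JOHN", "EDWARD", "WALTER",
             "BARRY", "LARRY", "HARRY", "GARRY"),
  amount = c(318.5, 1518.5, 318.5, 318.5, 518.5,
             1518.5, 518.5, 318.5, 1018.5),
  first_date  = as.Date(c("1971-02-16", NA, NA, NA, "1971-07-06",
                          "1971-11-09", "1971-09-08", "1971-09-16",
                          "1971-10-26")),
  second_date = as.Date(c("1972-11-27", NA, NA, NA, "1975-03-14",
                          "1972-02-09", "1972-02-09", "1972-02-09",
                          "1972-02-09")),
  stringsAsFactors = FALSE
)
members$days_out <- difftime(members$second_date, members$first_date,
                             units = "days")

# Three NA-safe ways to keep rows with days_out <= 120
by_bracket <- members[members$days_out <= 120 & !is.na(members$days_out), ]
by_which   <- members[which(members$days_out <= 120), ]
by_subset  <- subset(members, days_out <= 120)

by_which$name   # "BARRY" "GARRY"

# Subtotals per range: convert to plain numbers, then bin with cut().
# Values above 180 and NA values fall outside the breaks and become NA.
days <- as.numeric(members$days_out, units = "days")
members$bucket <- cut(days,
                      breaks = c(0, 60, 90, 120, 180),
                      labels = c("0-60", "61-90", "91-120", "121-180"),
                      include.lowest = TRUE)

# aggregate() with a formula drops rows whose bucket is NA
subtotals <- aggregate(amount ~ bucket, data = members, FUN = sum)
subtotals
```

With the sample rows, only the 91-120 and 121-180 ranges contain anyone, so aggregate() returns two rows. If the empty ranges should appear as well, tapply(members$amount, members$bucket, sum) lists every level and shows NA for the empty ones.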
