Finding the min or max of POSIXct date with NA values

The data below has columns for an individual ID (with repeat observations), Date and Fate.

         ID       Date  Fate
1  BHS_1149 2017-04-11   MIA
2  BHS_1154       <NA>  <NA>
3  BHS_1155       <NA>  <NA>
4  BHS_1156       <NA>  <NA>
5  BHS_1157       <NA>  Mort
6  BHS_1159 2017-04-11 Alive
7  BHS_1169 2017-04-11 Alive
8  BHS_1259       <NA>  <NA>
9  BHS_1260       <NA>  <NA>
10 BHS_1262 2017-04-11   MIA
11 BHS_1262 2017-07-05 Alive
12 BHS_1262 2017-12-06 Alive
13 BHS_1262 2017-12-06   MIA
14 BHS_1262 2018-01-17  Mort

For each ID I want to make new columns that represent the min Date and the max Date when Fate is Alive. I have tried different combinations of including and excluding the na.rm = T argument in the code below but still get the following warnings.

library(tidyverse)
library(lubridate)

dat %>% 
  group_by(ID) %>%
  mutate(
    #the first or min of Date
    FstSurvey = min(Date),
    LstAlive = max(Date[Fate == "Alive"])) %>%
  as.data.frame()

         ID       Date  Fate  FstSurvey   LstAlive
1  BHS_1149 2017-04-11   MIA 2017-04-11       <NA>
2  BHS_1154       <NA>  <NA>       <NA>       <NA>
3  BHS_1155       <NA>  <NA>       <NA>       <NA>
4  BHS_1156       <NA>  <NA>       <NA>       <NA>
5  BHS_1157       <NA>  Mort       <NA>       <NA>
6  BHS_1159 2017-04-11 Alive 2017-04-11 2017-04-11
7  BHS_1169 2017-04-11 Alive 2017-04-11 2017-04-11
8  BHS_1259       <NA>  <NA>       <NA>       <NA>
9  BHS_1260       <NA>  <NA>       <NA>       <NA>
10 BHS_1262 2017-04-11   MIA 2017-04-11 2017-12-06
11 BHS_1262 2017-07-05 Alive 2017-04-11 2017-12-06
12 BHS_1262 2017-12-06 Alive 2017-04-11 2017-12-06
13 BHS_1262 2017-12-06   MIA 2017-04-11 2017-12-06
14 BHS_1262 2018-01-17  Mort 2017-04-11 2017-12-06

Warning messages:
1: In max.default(numeric(0), na.rm = FALSE) :
  no non-missing arguments to max; returning -Inf
2: In max.default(numeric(0), na.rm = FALSE) :
  no non-missing arguments to max; returning -Inf

The code seems to work as expected, but I have not been able to interpret or avoid the warnings and was not able to find a solution through the max or min help pages. The reproducible code is included below.

dat <- structure(list(ID = c("BHS_1149", "BHS_1154", "BHS_1155", "BHS_1156", 
"BHS_1157", "BHS_1159", "BHS_1169", "BHS_1259", "BHS_1260", "BHS_1262", 
"BHS_1262", "BHS_1262", "BHS_1262", "BHS_1262"), Date = structure(c(1491890400, 
NA, NA, NA, NA, 1491890400, 1491890400, NA, NA, 1491890400, 1499234400, 
1512543600, 1512543600, 1516172400), class = c("POSIXct", "POSIXt"
), tzone = ""), Fate = c("MIA", NA, NA, NA, "Mort", "Alive", 
"Alive", NA, NA, "MIA", "Alive", "Alive", "MIA", "Mort")), row.names = c(NA, 
-14L), .Names = c("ID", "Date", "Fate"), class = "data.frame")

I also like to write code that doesn't give me errors. Here is a suggestion on how to make the same calculations without warnings. By using ordered first and last instead of min and max, you don't get the weird scenario where R evaluates max(NULL) to -Inf.

dat %>% 
  group_by(ID) %>%
  mutate(FstSurvey = first(Date, 
                     order_by = Date),
         LstAlive  = last(Date[Fate == "Alive"], 
                     order_by = Date[Fate == "Alive"]))
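If you would rather keep min and max, another option is to guard against the empty case yourself. The sketch below is only an illustration under a few assumptions: `toy` is a made-up stand-in with the same columns as `dat`, and `safe_extreme` is a hypothetical helper name, not something from dplyr or base R. It drops NA values first and returns a typed NA when nothing is left, so max never sees a zero-length vector. Note that this removes NA dates before taking the min, which differs from min(Date) without na.rm when a group mixes NA and non-NA dates.

```r
library(dplyr)

# The warning itself comes from calling max() on a zero-length vector.
empty_max <- suppressWarnings(max(numeric(0)))  # -Inf; warns if not suppressed

# Hypothetical helper: apply `fun` (min or max) after dropping NAs, and
# return an NA of the same class as `x` when no values remain.
safe_extreme <- function(x, fun = max) {
  x <- x[!is.na(x)]
  if (length(x) == 0) x[NA_integer_] else fun(x)
}

# Made-up stand-in for `dat`, same column names and types.
toy <- data.frame(
  ID   = c("A", "B", "B", "C"),
  Date = as.POSIXct(c("2017-04-11", "2017-07-05", "2017-12-06", NA), tz = "UTC"),
  Fate = c("MIA", "Alive", "Alive", NA),
  stringsAsFactors = FALSE
)

res <- toy %>%
  group_by(ID) %>%
  mutate(
    FstSurvey = safe_extreme(Date, min),
    # %in% yields FALSE (not NA) for missing Fate, so no NA indices are created
    LstAlive  = safe_extreme(Date[Fate %in% "Alive"], max)
  ) %>%
  ungroup()
```

Using `Fate %in% "Alive"` instead of `Fate == "Alive"` is a deliberate choice here: `==` returns NA for rows with missing Fate, and indexing with NA pulls in extra NA elements.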
