
R: Remove rows that satisfy condition based on time frame

I have data with 1000s of subjects, each with multiple rows per ID. Here is an excerpt of an individual in my data:

    ID      servicedate   firstdate     group      firstdateplus90
    AAA     01/01/2019    01/01/2019    A          04/01/2019
    AAA     03/01/2019    01/01/2019    B          04/01/2019

I'd like to remove all subjects like AAA who, within 90 days of their first date (firstdate), have a row showing them in a different group. In the example above, subject AAA started in group A, but by 03/01/2019, which is before 04/01/2019 (firstdateplus90, i.e. 90 days after firstdate), they are in group B.

I first tried to create a new variable that tells us which group a subject was in on the first date:

library(dplyr)
mydata <- mydata %>% group_by(ID) %>%
  # if() needs a single TRUE/FALSE, so index the matching row directly (first match per ID)
  mutate(first_group = group[which(servicedate == firstdate)[1]])

But I am not sure where to go from here, or whether there is an easier way to drop the subjects whose group on any row up to firstdateplus90 differs from their group at firstdate.

Any help is appreciated!

This worked for me:

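    library(dplyr)
    library(lubridate)  # for as_date()

    df <- data.frame(ID = c("AAA", "AAA", "AAA", "BBB", "BBB"),
                     servicedate = as_date(c(17774, 17794, 17804, 17374, 17386)),
                     group = c("A", "A", "B", "A", "A"))

    # Inner pipeline: rows within 90 days of each ID's first service date
    # whose group differs from the group on that first date.
    # anti_join() then drops every row of any ID that appears there.
    df %>%
      anti_join(df %>%
                  group_by(ID) %>%
                  filter(servicedate - min(servicedate) < 90 &
                           group != group[which.min(servicedate)]) %>%
                  ungroup() %>%
                  select(ID),
                by = "ID")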
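
The same idea can be written against the columns from the question (firstdate and firstdateplus90) with a group-wise `filter()` instead of an `anti_join()`. This is only a sketch under a few assumptions: the dates are month/day/year strings, a row dated exactly on firstdateplus90 counts as inside the window, and every subject has a row where servicedate equals firstdate. Subject BBB is made up for contrast and is not in the original data.

```r
library(dplyr)

# Toy data mirroring the question's columns; BBB is a hypothetical subject
# whose group changes only after the 90-day window.
mydata <- data.frame(
  ID              = c("AAA", "AAA", "BBB", "BBB"),
  servicedate     = c("01/01/2019", "03/01/2019", "01/01/2019", "06/01/2019"),
  firstdate       = c("01/01/2019", "01/01/2019", "01/01/2019", "01/01/2019"),
  group           = c("A", "B", "A", "B"),
  firstdateplus90 = c("04/01/2019", "04/01/2019", "04/01/2019", "04/01/2019"),
  stringsAsFactors = FALSE
)

kept <- mydata %>%
  # Assumption: dates are stored as month/day/year strings.
  mutate(across(c(servicedate, firstdate, firstdateplus90),
                ~ as.Date(.x, format = "%m/%d/%Y"))) %>%
  group_by(ID) %>%
  # Group recorded on the first date (first match if there are several).
  mutate(first_group = group[which(servicedate == firstdate)[1]]) %>%
  # Drop the whole subject if any row inside the window has a different group.
  filter(!any(servicedate <= firstdateplus90 & group != first_group)) %>%
  ungroup()
```

Here AAA is removed because the row dated 03/01/2019 falls inside the window with group B, while BBB is kept because its change to group B happens after firstdateplus90. Change `<=` to `<` if the boundary day should not count.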
