
R: Remove rows that satisfy condition based on time frame

I have data on thousands of subjects, with multiple rows per ID. Here is an excerpt for one individual in my data:

    ID      servicedate   firstdate     group      firstdateplus90
    AAA     01/01/2019    01/01/2019    A          04/01/2019
    AAA     03/01/2019    01/01/2019    B          04/01/2019

I'd like to remove all subjects like AAA who, within the 90 days following their first date, have a row showing them in a different group. In the example above, subject AAA starts in group A, but by 03/01/2019, which is before 04/01/2019 (90 days after the first date), they are in group B.
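To make the window concrete, here is a minimal base R sketch that rebuilds the excerpt. It assumes the dates are stored as month/day/year strings and that the window includes day 90 itself; neither is stated in the question, and `in_window` is a name made up for illustration.

```r
# Rebuild the two-row excerpt; dates are assumed to be month/day/year strings
mydata <- data.frame(
  ID          = c("AAA", "AAA"),
  servicedate = as.Date(c("01/01/2019", "03/01/2019"), format = "%m/%d/%Y"),
  firstdate   = as.Date(c("01/01/2019", "01/01/2019"), format = "%m/%d/%Y"),
  group       = c("A", "B")
)

# Adding an integer to a Date shifts it by that many days
mydata$firstdateplus90 <- mydata$firstdate + 90

# TRUE for rows that fall inside the 90-day window (inclusive, by assumption)
in_window <- mydata$servicedate <= mydata$firstdateplus90
```

With real Date columns, the comparison against `firstdateplus90` is a plain `<=`, which is what any filtering step would build on.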

I first tried to create a new variable that records which group a subject was in on the first date:

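    library(dplyr)

    # Group on the first service date, repeated on every row of the subject.
    # Indexing replaces the original if(), which fails on a vector condition.
    mydata <- mydata %>%
      group_by(ID) %>%
      mutate(first_group = group[which(servicedate == firstdate)[1]])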

But I am not sure where to go from here, or whether there is an easier way to filter out subjects whose group within firstdateplus90 differs from their group at firstdate.

Any help is appreciated!

This worked for me:

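    library(dplyr)
    library(lubridate)  # for as_date()

    df <- data.frame(ID = c("AAA", "AAA", "AAA", "BBB", "BBB"),
                     servicedate = as_date(c(17774, 17794, 17804, 17374, 17386)),
                     group = c("A", "A", "B", "A", "A"))

    # Find rows within 90 days of each ID's first service date whose group
    # differs from the first group, then drop every row for those IDs.
    df %>%
      anti_join(df %>%
                  group_by(ID) %>%
                  filter(servicedate - min(servicedate) < 90 &
                           group != group[servicedate == min(servicedate)][1]) %>%
                  select(ID),
                by = "ID")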
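
The same idea can be written against the question's own columns without a join. The following is only a sketch: the three subjects are made-up data, `first_group` and `kept` are illustrative names, and it assumes the window includes day 90 and that `servicedate` is already a Date.

```r
library(dplyr)

# Hypothetical data in the question's layout: AAA switches groups inside
# the window, BBB switches only after it, CCC never switches
mydata <- data.frame(
  ID          = c("AAA", "AAA", "BBB", "BBB", "CCC"),
  servicedate = as.Date(c("2019-01-01", "2019-03-01",
                          "2019-01-01", "2019-06-01", "2019-02-01")),
  group       = c("A", "B", "A", "B", "A")
) %>%
  group_by(ID) %>%
  mutate(firstdate = min(servicedate),
         firstdateplus90 = firstdate + 90) %>%
  ungroup()

kept <- mydata %>%
  group_by(ID) %>%
  # group recorded on the earliest service date (first match if tied)
  mutate(first_group = group[which.min(servicedate)]) %>%
  # drop the whole subject if any in-window row has a different group
  filter(!any(servicedate <= firstdateplus90 & group != first_group)) %>%
  ungroup()
```

Because `any()` returns a single value per group, `filter()` keeps or drops all rows of a subject together, so BBB keeps its later group B row, which falls outside the window.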


 