R: group by one column, return the first row with a value greater than 0 in any other column, plus all rows after it

Question

I'm new to R programming and hope someone can help me with the situation below:

I have a dataframe shown in the picture (Original Dataframe). For each [ID], I would like to return the first record, ordered by the [Date] column, that has a value >= 1 in any of the four columns (A, B, C, or D), together with all the records that come after it (the desired dataframe should look like the Output Dataframe shown in the picture). Basically, remove all the records highlighted in yellow. I would greatly appreciate it if you could provide the R code to achieve this.

structure(list(ID = c(101L, 101L, 101L, 101L, 101L, 101L, 103L, 
103L, 103L, 103L), Date = c(43338L, 43306L, 43232L, 43268L, 43183L, 
43144L, 43310L, 43246L, 43264L, 43209L), A = c(0L, 0L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L), B = c(0L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L), C = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), D = c(0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("ID", "Date", 
"A", "B", "C", "D"), row.names = c(NA, -10L), class = c("data.table", 
"data.frame"))
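
The Date column in the dput output above is stored as integers. They look like Excel serial day numbers; that is an assumption, but it matches the dates shown in Answer 1 (for example 43338 for 26.08.2018). A minimal sketch of converting them to the Date class, in case readable dates are wanted:

```r
# Assumption: Date holds Excel serial day numbers, which count days from 1899-12-30.
serial <- c(43338L, 43306L, 43232L)

# as.Date() on a number needs an explicit origin
dates <- as.Date(serial, origin = "1899-12-30")
print(dates)
```

The conversion is optional for the filtering itself, since the integers already sort in chronological order.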

Answer 1

Here is a solution. The data, with Date stored as a day.month.year string:

    ID       Date A B C D
1  101 26.08.2018 0 0 0 0
2  101 25.07.2018 0 2 0 0
3  101 12.05.2018 0 0 1 0
4  101 17.06.2018 0 0 0 0
5  101 24.03.2018 0 0 0 0
6  101 13.02.2018 0 0 0 0
7  103 29.07.2018 0 0 0 0
8  103 26.05.2018 1 1 0 0
9  103 13.06.2018 0 0 0 0
10 103 19.04.2018 0 0 0 0


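# Row-wise sum of columns 3 to 6 (A-D); a positive sum flags a row with a value > 0
data$Check <- rowSums(data[3:6])

# Parse the day.month.year strings so the rows can be sorted chronologically
data$Date <- as.Date(data$Date, "%d.%m.%Y")

data <- data[order(data$ID, data$Date), ]

id <- unique(data$ID)

for (i in seq_along(id)) {

    data_sample <- data[data$ID == id[i], ]

    # Keep everything from the first flagged row to the end of the group.
    # This assumes every ID has at least one flagged row.
    data_sample <- data_sample[min(which(data_sample$Check > 0)):nrow(data_sample), ]

    if (i == 1) {
        final <- data_sample
    } else {
        final <- rbind(final, data_sample)
    }

}

# Drop the helper column Check (column 7)
final <- final[, -7]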

   ID       Date A B C D
3 101 2018-05-12 0 0 1 0
4 101 2018-06-17 0 0 0 0
2 101 2018-07-25 0 2 0 0
1 101 2018-08-26 0 0 0 0
8 103 2018-05-26 1 1 0 0
9 103 2018-06-13 0 0 0 0
7 103 2018-07-29 0 0 0 0
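
The same result can probably be reached without the loop. The following is only a sketch of an alternative, not the answer's own method: it uses a per-group running count via ave() and cumsum(), keeps Date as the integer day numbers from the question, and tests each column for > 0 instead of summing. The names hit and seen are made up for illustration.

```r
# Rebuild the example data (Date kept as the integer day numbers from the question)
data <- data.frame(
  ID   = c(101L, 101L, 101L, 101L, 101L, 101L, 103L, 103L, 103L, 103L),
  Date = c(43338L, 43306L, 43232L, 43268L, 43183L, 43144L, 43310L, 43246L, 43264L, 43209L),
  A    = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L),
  B    = c(0L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L),
  C    = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  D    = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)
)

data <- data[order(data$ID, data$Date), ]

# 1 where any of A-D is positive, 0 otherwise
hit <- as.integer(rowSums(data[, c("A", "B", "C", "D")] > 0) > 0)

# Running count of hits within each ID; it stays 0 until the first hit
seen <- ave(hit, data$ID, FUN = cumsum)

# Keep the first hit of each ID and every later row
final <- data[seen > 0, ]
print(final)
```

One consequence of this design is that an ID with no positive value at all is simply dropped, whereas the loop would fail on it.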

Answer 2

Here's a tidyverse solution. The filter condition deserves some explanation:

First, we sort by ID and Date and group by ID.
Then, for each ID (since we're grouped by ID), we apply the filter condition:
1. Test, for each row, whether any of the variables is > 0.
2. Get the row numbers of all rows (in the group) where this is the case.
3. Find the lowest one (since rows are sorted by Date, this will be the earliest).
4. Get the value of Date for that row.
5. Keep only the rows where Date is >= that value.

Since we're still grouped by ID, all these calculations happen separately for each group:

library(dplyr)

df %>%
    arrange(ID, Date) %>%
    group_by(ID) %>%
    filter(Date >= Date[min(which(A > 0 | B > 0 | C > 0 | D > 0))])

# A tibble: 7 x 6
# Groups:   ID [2]
     ID  Date     A     B     C     D
  <int> <int> <int> <int> <int> <int>
1   101 43232     0     0     1     0
2   101 43268     0     0     0     0
3   101 43306     0     2     0     0
4   101 43338     0     0     0     0
5   103 43246     1     1     0     0
6   103 43264     0     0     0     0
7   103 43310     0     0     0     0
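
As a hedged variation on the filter above, dplyr's cumany() expresses "from the first matching row onwards" directly, which avoids calling min() on an empty set when a group has no value > 0. This is a sketch under the assumption that dplyr is installed; the data frame is rebuilt from the question's dput output, and result is an illustrative name.

```r
library(dplyr)

# Example data from the question (Date as integer day numbers)
df <- data.frame(
  ID   = c(101L, 101L, 101L, 101L, 101L, 101L, 103L, 103L, 103L, 103L),
  Date = c(43338L, 43306L, 43232L, 43268L, 43183L, 43144L, 43310L, 43246L, 43264L, 43209L),
  A    = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L),
  B    = c(0L, 2L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L),
  C    = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
  D    = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)
)

result <- df %>%
  arrange(ID, Date) %>%
  group_by(ID) %>%
  # cumany() becomes TRUE at the first matching row and stays TRUE afterwards
  filter(cumany(A > 0 | B > 0 | C > 0 | D > 0)) %>%
  ungroup()

print(result)
```

On this data it should keep the same seven rows as the output above; the trailing ungroup() only removes the grouping from the result.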


Answer 1
0 2018-09-18 21:24:34

Answer 2
0 2018-09-18 22:13:37
