
Finding the max value of a column by group and condition in R

I have two data.tables in R.

Table A has ID_A, days, and group.

Table B has ID_B, days, group, and value_of_interest.

I'm trying to add a column to A, max_value_of_interest, that holds, for each row of A, the maximum value_of_interest among the rows of B that are in the same group and whose days is greater than days in that row of A.

I'll try to describe it another way:

Table A:

ID_A    days    group
A1      5       X

I want to add a column to A containing the maximum value_of_interest from B, where the maximum value is chosen from B where B.group=X and B.days > 5 (greater than the value in row A1).

I've found solutions for finding the maximum by group, but I can't work out how to add the condition that only rows with B.days > A.days within the same group are considered.

I'm not sure of the best way to approach this. I'd appreciate any help.

It might be easiest to loop through the rows of Table A. For each row, select the rows of B in the same group with a larger days value, then take the max of their values. Rows of A with no matching row in B keep NA.

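library(tidyverse)

set.seed(1)  # added so the random example values are reproducible

# Example data; val stands in for value_of_interest
A <- tibble(ID_A = paste0("A", 1:5),
            days = seq(5, 1, -1),
            group = c("X", "X", "X", "Y", "Y"),
            max_val = NA_real_)  # numeric NA so the column type matches val
B <- tibble(ID_B = paste0("B", 1:5),
            days = seq(3, 7, 1),
            group = c("X", "X", "X", "Y", "Y"),
            val = runif(5))

for (i in seq_len(nrow(A))) {
  # Rows of B in the same group with strictly more days than row i of A
  B_sel <- B %>%
    filter(group == A$group[i], days > A$days[i])
  # Leave NA when nothing matches; max() of an empty vector would give -Inf
  if (nrow(B_sel) > 0)
    A$max_val[i] <- max(B_sel$val)
}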

or

for (i in seq_len(nrow(A))) {
  # Indices of B rows in the same group with days greater than in row i of A
  rows <- which(B$group == A$group[i] & B$days > A$days[i])
  if (length(rows) > 0)
    A$max_val[i] <- max(B$val[rows])
}
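
Since the question mentions data.table, a non-equi join might avoid the explicit loop altogether. The following is only a sketch under some assumptions: the data.table package is installed, val again stands in for value_of_interest, and the values in B are made up so the result is easy to check by hand. In on = .(group, days > days), the left-hand days is taken from B and the right-hand days from A, and by = .EACHI evaluates max(val) once per row of A.

```r
library(data.table)

# Illustrative data (values of val are invented for this sketch)
A <- data.table(ID_A  = paste0("A", 1:5),
                days  = 5:1,
                group = c("X", "X", "X", "Y", "Y"))
B <- data.table(ID_B  = paste0("B", 1:5),
                days  = 3:7,
                group = c("X", "X", "X", "Y", "Y"),
                val   = c(0.2, 0.9, 0.5, 0.7, 0.1))

# Non-equi join: for each row of A, take the rows of B with the same group
# and B$days > A$days, then compute max(val) over those rows.
# Rows of A without any match get NA.
A[, max_val := B[A, on = .(group, days > days), max(val), by = .EACHI]$V1]

print(A)
```

With this example data, A1 has no row in B with days greater than 5 in group X, so its max_val stays NA, while A3 (days 3) picks the larger of the two X values at days 4 and 5.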
