How to match one row from one column to the next 5-10 rows in two other columns in R?
I have a data frame which looks like this:
df1 <- structure(list(day = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), observ1 = c(1, 0, 0, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), observ2 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1),
observ3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
Previously, I got TRUE if observ1 equals 1 and observ2 also equals 1 within the following 5 to 10 days.
Now I need to add a third condition: if observ1 equals 1, observ2 equals 1 within the following 5 to 10 days, AND observ3 also equals 1 within the same 5 to 10 days, then return TRUE.
So, the new 'check' column should look like this:
df1 <- structure(list(day = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
observ1 = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
observ2 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1),
observ3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
check = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 'TRUE', 0, 0, 0, 0, 0, 0)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
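A side note on the expected output above: because c() coerces mixed inputs to a single type, combining 0 with a quoted 'TRUE' gives a character vector, not a logical one. Below is a minimal sketch, assuming a logical column is what is actually wanted. The names check_chr and check_lgl are made up for illustration only.

```r
# Mixing numbers with a quoted 'TRUE' makes c() return a character vector
check_chr <- c(0, 0, "TRUE")
class(check_chr)

# Assumed alternative: a logical vector with TRUE only at row 14
check_lgl <- rep(FALSE, 20)
check_lgl[14] <- TRUE
which(check_lgl)
```

A logical column can be used directly in subsetting, e.g. df1[check_lgl, ], which a character column cannot.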
Hopefully this helps. Thanks for asking a separate question; that is generally the recommended approach when you need to extend your original question. I'm not sure this is correct, so please let me know whether this is what you are after.
df1$check <- with(
  df1,
  vapply(
    seq_along(observ1),
    function(i){
      # If we are five days in or fewer:
      if(i - 5 <= 0){
        # Return NA: logical scalar => env
        NA
      # Otherwise:
      }else{
        # Ensure no negative indices by setting a lower bound of 1:
        # idx_lower_bound => numeric scalar
        idx_lower_bound <- max(
          i-10,
          1
        )
        # Window of rows from idx_lower_bound to i + 5: idx => vector of row indices
        # (i + 5 can exceed the number of rows; those positions index as NA)
        idx <- seq(
          idx_lower_bound,
          i+5,
          by = 1
        )
        # Test if all conditions are true:
        # check => logical scalar
        check <- all(
          # The current value of observ1 == 1 ? logical scalar
          observ1[i] == 1,
          # Any observ2 values in the range == 1 ? logical scalar
          any(observ2[idx] == 1),
          # Any observ3 values in the range == 1 ? logical scalar
          any(observ3[idx] == 1)
        )
        # Replace FALSE with NA: logical scalar => env
        ifelse(
          check,
          check,
          NA
        )
      }
    },
    logical(1)
  )
)
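Note that the code above leaves NA in every row where the conditions are not met. If FALSE is preferred for those rows, one possible follow-up step is sketched here. The vector check is a small stand-in, not the real column.

```r
# Stand-in for a result column that holds TRUE or NA
check <- c(NA, NA, TRUE, NA)

# Swap every NA for FALSE, leaving TRUE values untouched
check_filled <- replace(check, is.na(check), FALSE)
check_filled
```

Applied to the data frame, the same idea would read df1$check <- replace(df1$check, is.na(df1$check), FALSE).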
Data:
df1 <- structure(
  list(
    day = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
    observ1 = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
    observ2 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1),
    observ3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
  ),
  class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L)
)
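Since I'm unsure which window is intended, here is a rough sketch that makes the window explicit so both readings can be compared. The helper window_check and its arguments from and to (row offsets relative to the current row) are hypothetical names, not part of any package, and the data is rebuilt with data.frame() so the block runs on its own.

```r
# Same values as the data above, rebuilt as a plain data.frame
df1 <- data.frame(
  day = 1:20,
  observ1 = c(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
  observ2 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1),
  observ3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
)

# Hypothetical helper: TRUE where observ1 is 1 in row i and both observ2
# and observ3 contain a 1 somewhere in rows (i + from) to (i + to)
window_check <- function(df, from, to) {
  n <- nrow(df)
  vapply(seq_len(n), function(i) {
    idx <- seq(i + from, i + to)
    # Keep only indices that exist in the data
    idx <- idx[idx >= 1 & idx <= n]
    df$observ1[i] == 1 &&
      any(df$observ2[idx] == 1) &&
      any(df$observ3[idx] == 1)
  }, logical(1))
}

# Window used in the answer's code (10 rows back to 5 rows ahead)
which(window_check(df1, from = -10, to = 5))  # 14

# Literal reading of "after 5 to 10 days" (5 to 10 rows ahead)
which(window_check(df1, from = 5, to = 10))   # 6
```

Unlike the vapply version above, this sketch returns FALSE instead of NA for non-matching rows and does not treat the first five rows specially. The two windows flag different rows on this data, so it is worth confirming which one matches the intended rule.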