I have the following dataframe df. I would like to return a vector result that indicates which rows meet the following criterion: at least 2 consecutive values in that row are lower than -1.7.
set.seed(123)
df <- data.frame(V1=rnorm(10,-1.5,.5),
V2=rnorm(10,-1.5,.5),
V3=rnorm(10,-1.5,.5),
V4=rnorm(10,-1.5,.5),
V5=rnorm(10,-1.5,.5),
V6=rnorm(10,-1.5,.5),
V7=rnorm(10,-1.5,.5),
V8=rnorm(10,-1.5,.5),
V9=rnorm(10,-1.5,.5),
V10=rnorm(10,-1.5,.5))
rownames(df) <- c(seq(1976,1985,1))
The result would be a vector:
result <- c(1977,1979,1980,1982,1983,1985)
One option is to loop through the rows with apply, create a logical condition with rle, check whether any run of TRUE elements has a length greater than 1, and extract the names of the matching rows:
names(which(apply(df, 1, function(x) with(rle(x < -1.7), any(lengths[values] > 1)))))
#[1] "1977" "1979" "1980" "1982" "1983" "1985"
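To make the rle step concrete, here is a minimal sketch on a single made-up row. The vector x and the names r and has_run are hypothetical, chosen for illustration only, and are not taken from df.

```r
# Hypothetical toy row (not from df)
x <- c(-1.8, -1.9, -1.2, -2.0, -1.0)

# Run-length encode the logical condition
r <- rle(x < -1.7)
r$lengths  # 2 1 1 1
r$values   # TRUE FALSE TRUE FALSE

# Keep only the lengths of the TRUE runs and test for a run longer than 1
has_run <- any(r$lengths[r$values] > 1)
has_run    # TRUE
```

The first two values form a TRUE run of length 2, which is what the apply call detects row by row.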
A better approach is to vectorize it. Build two logical matrices (drop the last column of the dataset and check which values are less than -1.7, then drop the first column and do the same), Reduce them to a single logical matrix by checking whether the corresponding elements are both TRUE, and take the rowSums. If a row's sum is greater than 0, we extract its row name:
names(which(rowSums(Reduce(`&`, list(df[-ncol(df)] < -1.7, df[-1] < -1.7))) > 0))
#[1] "1977" "1979" "1980" "1982" "1983" "1985"
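The shifted-matrix idea may be easier to follow on a small example. The sketch below uses a hypothetical 3 x 3 data frame m (with made-up row names r1 to r3) rather than df, and writes out the intermediate matrices that the one-liner combines.

```r
# Hypothetical toy data (not from df)
m <- data.frame(a = c(-2, -2, -1),
                b = c(-2, -1, -1),
                c = c(-1, -2, -2))
rownames(m) <- c("r1", "r2", "r3")

left  <- m[-ncol(m)] < -1.7  # columns a, b: each value paired with its right neighbour
right <- m[-1] < -1.7        # columns b, c: the right neighbours themselves
both  <- left & right        # TRUE where two adjacent columns both meet the condition

hits <- names(which(rowSums(both) > 0))
hits  # "r1"
```

Only r1 has two adjacent values below -1.7 (columns a and b); r2 has two qualifying values, but they are not adjacent.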
A fun option using which with arr.ind = TRUE:
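# Row/column positions of every value below -1.7
temp <- which(df < -1.7, arr.ind = TRUE)
# For each row, is any pair of those column positions adjacent?
agg <- aggregate(col ~ row, temp, function(x) any(diff(x) == 1))
# Index by row position so rows with no value below -1.7 cannot misalign the result
rownames(df)[agg$row[agg$col]]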
#[1] "1977" "1979" "1980" "1982" "1983" "1985"
We first get all row and column positions where the value is less than -1.7. Using aggregate, we group col by row and check whether any two of those column positions are adjacent (a difference of 1); for the rows that return TRUE, we subset the row names.
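As an illustration of what the position matrix looks like, here is a small sketch on a hypothetical 2 x 3 data frame m (not df). It groups with tapply instead of aggregate, purely to keep the example short; this is a swapped-in grouping function, not the call used in the answer.

```r
# Hypothetical toy data (not from df)
m <- data.frame(a = c(-2, -1),
                b = c(-2, -1),
                c = c(-1, -2))

# One row per matching cell, with columns "row" and "col"
pos <- which(m < -1.7, arr.ind = TRUE)
pos[, "row"]  # 1 1 2
pos[, "col"]  # 1 2 3

# Per row: is there a pair of adjacent column positions?
consec <- tapply(pos[, "col"], pos[, "row"], function(x) any(diff(x) == 1))
as.vector(consec)  # TRUE FALSE
```

Row 2 has a single matching cell, so diff returns an empty vector and any() gives FALSE.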
Another solution uses a lagged sum: add each element of the row's logical vector to its successor. If any of these pairwise sums equals 2, at least 2 consecutive values in that row meet the condition.
rownames(df)[apply(df < -1.7, 1, function(x) any(x[-length(x)] + x[-1] == 2))]
# [1] "1977" "1979" "1980" "1982" "1983" "1985"
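A minimal sketch of the lagged sum on one hypothetical logical vector (x and pair_sums are illustrative names, not part of the answer):

```r
# Hypothetical logical row: TRUE where the value is below the threshold
x <- c(TRUE, TRUE, FALSE, TRUE)

# Drop the last element, drop the first element, and add the two aligned vectors
pair_sums <- x[-length(x)] + x[-1]
pair_sums            # 2 1 1

any(pair_sums == 2)  # TRUE: positions 1 and 2 are both TRUE
```

Note that the element to drop is the last one of the row vector, so its index is the number of columns, not the number of rows; the two only coincide when the data frame is square, as df happens to be.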