Return rows where consecutive values meet a condition

Question

I have the following data frame df. I would like to return a vector result that indicates which rows meet the following criterion: at least 2 consecutive values in that row are lower than -1.7.

set.seed(123)

df <- data.frame(V1=rnorm(10,-1.5,.5),
                 V2=rnorm(10,-1.5,.5),
                 V3=rnorm(10,-1.5,.5),
                 V4=rnorm(10,-1.5,.5),
                 V5=rnorm(10,-1.5,.5),
                 V6=rnorm(10,-1.5,.5),
                 V7=rnorm(10,-1.5,.5),
                 V8=rnorm(10,-1.5,.5),
                 V9=rnorm(10,-1.5,.5),
                 V10=rnorm(10,-1.5,.5))
rownames(df) <- c(seq(1976,1985,1))

The result would be a vector:

result <- c(1977,1979,1980,1982,1983,1985)

Answer 1

One option is to loop through the rows with apply(), run-length encode the logical condition with rle(), check whether any run of TRUE values has a length greater than 1, and extract the names of the rows that pass.

names(which(apply(df, 1, function(x) with(rle(x < -1.7), any(lengths[values] > 1)))))
#[1] "1977" "1979" "1980" "1982" "1983" "1985"
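To make the rle() step concrete, here is a minimal sketch on a single hand-made vector. The vector x and the names r and has_run are invented for illustration and are not part of the original answer.

```r
# Hypothetical row of values, chosen only for illustration
x <- c(-1.2, -1.8, -1.9, -1.0, -2.0)

# Run-length encode the logical condition: FALSE, TRUE, TRUE, FALSE, TRUE
r <- rle(x < -1.7)
r$values   # FALSE  TRUE FALSE  TRUE
r$lengths  # 1 2 1 1

# Keep only the lengths of the TRUE runs (2 and 1) and test for a run longer than 1
has_run <- any(r$lengths[r$values] > 1)
has_run    # TRUE
```

Inside apply(), the same test runs once per row, and which() plus names() then recover the row names.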

A better approach is to vectorize it. Build two logical matrices: drop the last column of the data frame and test whether the values are less than -1.7, then drop the first column and do the same. Combine them element-wise with Reduce() and &, so that an entry is TRUE only when a value and its right-hand neighbor are both below -1.7. Take the rowSums(); any row with a sum greater than 0 qualifies, and we extract its name.

names(which(rowSums(Reduce(`&`, list(df[-ncol(df)] < -1.7, df[-1] < -1.7))) > 0))
#[1] "1977" "1979" "1980" "1982" "1983" "1985"
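The shifted-matrix idea may be easier to follow on a small example. The sketch below uses a made-up 3 x 4 data frame named toy (its values, column names, and row names are assumptions for illustration) and writes the two-matrix AND directly with & instead of Reduce().

```r
# Hypothetical 3 x 4 data frame; all values are invented for illustration
toy <- data.frame(v1 = c(-2.0, -1.0, -1.8),
                  v2 = c(-1.9, -2.0, -1.0),
                  v3 = c(-1.0, -1.0, -1.9),
                  v4 = c(-1.0, -2.0, -2.0),
                  row.names = c("r1", "r2", "r3"))

left  <- toy[-ncol(toy)] < -1.7   # columns v1..v3 as a logical matrix
right <- toy[-1] < -1.7           # columns v2..v4 as a logical matrix

# TRUE where a value and its right-hand neighbor are both below -1.7
both <- left & right

hits <- names(which(rowSums(both) > 0))
hits  # "r1" "r3"
```

Row r2 has two values below -1.7, but they sit in v2 and v4, so they are not adjacent and the row is not selected.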

Answer 2

A fun option using which() with arr.ind = TRUE:

temp <- which(df < -1.7, arr.ind = TRUE)
agg <- aggregate(col ~ row, temp, function(x) any(diff(x) == 1))
# index via agg$row: rows with no value below -1.7 are absent from agg
rownames(df)[agg$row[agg$col]]

#[1] "1977" "1979" "1980" "1982" "1983" "1985"

We first get all row and column positions where the value is less than -1.7. Using aggregate(), we group the column positions by row and check whether any two of them are adjacent, that is, whether diff() returns 1 at least once. For the rows that return TRUE, we subset the row names.
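As a sketch of the grouping step, the example below applies the same idea to a made-up data frame toy in which one row has no value below -1.7. The data, the names pos, agg, and hits, and the choice to select names through the row column of the aggregate() result are assumptions of this sketch.

```r
# Hypothetical data; row r3 has no value below -1.7
toy <- data.frame(v1 = c(-2.0, -1.0, -1.0),
                  v2 = c(-1.9, -2.0, -1.0),
                  v3 = c(-1.0, -1.0, -1.0),
                  v4 = c(-1.0, -2.0, -1.0),
                  row.names = c("r1", "r2", "r3"))

# Positions of qualifying cells; unname() drops dimnames so the result
# carries no duplicated row names
pos <- as.data.frame(which(unname(toy < -1.7), arr.ind = TRUE))
# pos holds (row, col) pairs: (1,1), (1,2), (2,2), (2,4)

# Per row: are any two qualifying column positions adjacent?
agg <- aggregate(col ~ row, pos, function(x) any(diff(x) == 1))

# agg has one line per row that appears in pos, so map back through agg$row
hits <- rownames(toy)[agg$row[agg$col]]
hits  # "r1"
```

Because r3 never appears in pos, agg has only two lines; going through agg$row keeps the result aligned with the original rows.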

Answer 3

A solution that uses a lagged sum: add each element of the row's logical vector to its successor. If any of these pairwise sums equals 2, at least 2 consecutive values in that row meet the condition.

# x is one row, so drop its last element with length(x), not nrow(df)
rownames(df)[apply(df < -1.7, 1, function(x) any(x[-length(x)] + x[-1] == 2))]

# [1] "1977" "1979" "1980" "1982" "1983" "1985"
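The lagged sum can be checked by hand on one logical vector. This is a minimal sketch; the vector x and the name pair_sums are made up for illustration.

```r
# Hypothetical result of (row < -1.7) for one row
x <- c(FALSE, TRUE, TRUE, FALSE, TRUE)

# Add each element to its successor; logicals are coerced to 0/1
pair_sums <- x[-length(x)] + x[-1]
pair_sums            # 1 2 1 1

# A sum of 2 means two neighboring TRUE values
any(pair_sums == 2)  # TRUE
```

The sum has one element fewer than x because each entry describes a pair of neighbors.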

3 answers

Answer 1 (score 3, accepted): 2019-01-10 15:53:00
Answer 2 (score 3): 2019-01-10 15:58:54
Answer 3 (score 2): 2019-01-10 16:37:21
