Using rle() to index a data.frame: how to show zeros in the function to keep the same vector length?

Question

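In the following example, my goal is to find where the values in df_new stay below the threshold of -1.2 for 5 consecutive entries. Then I want to return the corresponding unique values from the df_new$year column. My problem with linking the result of rle() is that its length does not match the length of df_new$year, so I cannot index with it properly. The issue with rle() is that it does not return zeros: k holds one entry per run, so every entry has a length of at least 1 and k is shorter than the original vector. How can I improve this piece of code to get what I need? Is there a way to force rle() to include zeros in k, or should I take another approach?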

# Example reproducible df:
set.seed(125)
df <- data.frame(V1=rnorm(10,-1.5,.5),
                 V2=rnorm(10,-1.5,.5),
                 V3=rnorm(10,-1.5,.5),
                 V4=rnorm(10,-1.5,.5),
                 V5=rnorm(10,-1.5,.5),
                 V6=rnorm(10,-1.5,.5),
                 V7=rnorm(10,-1.5,.5),
                 V8=rnorm(10,-1.5,.5),
                 V9=rnorm(10,-1.5,.5),
                 V10=rnorm(10,-1.5,.5))
# Flatten df row by row (column-major order of the transposed matrix).
# This gives the same values as melt(t(df))$value without extra packages.
df_t <- t(df)
df_new <- data.frame(value = as.vector(df_t),
                     year = rep(1976:1985, each = nrow(df)))

# Threshold values:
threshold <- -1.2
consecutiveentries <- 5
number <- consecutiveentries - 1
# Start of the problem:
k <- rle(df_new$value < threshold)
years <- unique(df_new$year[k$lengths > number])

Current result:

> years
[1] 1976 1978 1979 1980 1982 1984 1985

What I want:

> years
[1] 1976 1980 1983 1985

Answer 1

It's ugly, but it works :)

df_new$year[cumsum(k$lengths)[which(k$lengths >= 5)-1]+1]

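Each part:

idx <- which(k$lengths >= 5)-1 gives you the positions in k$lengths of the runs just before those with a length greater than or equal to 5.

With cumsum(k$lengths) we then build the cumulative sum over k$lengths and take the elements at idx. As a result, we get the number of rows that come before the first row of each >=5 sequence.

Adding 1 to this result gives us the index of the row where each sequence starts.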
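
The steps above can be traced on a small made-up vector. This is only a sketch: x, year, min_run, run_start and run_len are illustrative names and values that do not appear in the question. It also shows two caveats of the one-liner: a qualifying run that starts at row 1 is dropped (index 0 is silently ignored in R), and k$values is never checked, so a long run of values above the threshold would be picked up too. Whether the variants at the end reproduce the asker's desired output depends on whether a year should count when a run starts in it or when any part of a run falls in it.

```r
# Toy data (hypothetical): two runs below the threshold, one starting at row 1
x    <- c(-2, -2, -2, -2, -2, 0, 0, -2, -2, -2, -2, -2, -2, 0)
year <- rep(c(2001, 2002), each = 7)
threshold <- -1.2
min_run   <- 5

k <- rle(x < threshold)   # lengths: 5 2 6 1; values: TRUE FALSE TRUE FALSE

# The answer's one-liner, step by step
idx <- which(k$lengths >= min_run) - 1        # 0 2
answer_starts <- cumsum(k$lengths)[idx] + 1   # 8 (the run starting at row 1 is lost)

# Variant 1: start row of every run, then keep long runs of TRUE only
run_start <- cumsum(c(1, head(k$lengths, -1)))   # 1 6 8 14
keep      <- k$values & k$lengths >= min_run
years_by_start <- unique(year[run_start[keep]])  # 2001 2002

# Variant 2: expand the run lengths back to the length of x,
# which makes them usable as a row-wise index
run_len     <- rep(k$lengths, k$lengths)
in_long_run <- (x < threshold) & run_len >= min_run
years_by_membership <- unique(year[in_long_run]) # 2001 2002
```

Variant 2 is the closest match to "keeping the same vector length": rep(k$lengths, k$lengths) has one entry per row of the original data, so it can be combined with df_new$year directly.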

Answer 1
1 accepted 2019-01-22 13:39:25
