Count consecutive values in groups with a condition using dplyr and rle

Question

My question is very similar to the one linked below, but I want to add a condition so that only runs of more than 2 consecutive values are counted.

How do I count the number of runs of consecutive "successes" (i.e. 1 in $consec) that are more than 2 values long, within a given Era and a given Year?

Similar question: Summarize consecutive failures with dplyr and rle. For comparison, I've modified the example used in that question:

library(dplyr)
df <- data.frame(Era=c(1,1,1,1,1,1,1,1,1,1),Year = c(1,2,2,3,3,3,3,3,3,3), consec = c(0,0,1,0,1,1,0,1,1,1))

df %>%
  group_by(Era,Year) %>%
  do({tmp <- with(rle(.$consec==1), lengths[values])
      data.frame(Year= .$Year, Count=(length(tmp)))}) %>% 
  slice(1L)

> Source: local data frame [3 x 3]
> Groups: Era, Year

>   Era Year Count
> 1   1    1     0
> 2   1    2     1
> 3   1    3     2

All I need now is a condition that counts only runs of more than 2 consecutive values. Desired result:

> Source: local data frame [3 x 3]
> Groups: Era, Year

>   Era Year Count
> 1   1    1     0
> 2   1    2     0
> 3   1    3     1

Any advice would be greatly appreciated.

Answer 1

We need to build a logical index from the run lengths returned by rle() (TRUE where a run is longer than 2) and take its sum, which counts the qualifying runs:
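As a small illustration of that idea on a single group, the sketch below applies rle() to the Year 3 values from the question's example. The variable names (x, r, long_success, n_long) are only for illustration, and the sketch assumes that only runs of 1s should count, as the question describes.

```r
# consec values for Era 1, Year 3 in the question's example data
x <- c(0, 1, 1, 0, 1, 1, 1)

# Run-length encode "is this value a success?"
r <- rle(x == 1)
# r$values  tells whether each run is a run of 1s
# r$lengths tells how long each run is

# Logical index: TRUE for runs of 1s that are longer than 2
long_success <- r$values & r$lengths > 2

# Summing a logical vector counts the TRUE entries,
# i.e. the number of qualifying runs
n_long <- sum(long_success)
n_long
```

The same expression is what gets evaluated once per Era/Year group in the grouped pipeline.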

df %>%
   group_by(Era, Year) %>% 
   # encode runs of successes; count only runs of 1s longer than 2
   do({ tmp <- with(rle(.$consec == 1), sum(values & lengths > 2))
   data.frame(Count = tmp)})
#   Era  Year Count
#  <dbl> <dbl> <int>
#1     1     1     0    
#2     1     2     0
#3     1     3     1

Answer 1 (score 2, accepted, 2016-10-17 09:51:36)
