Create counter of consecutive runs of a certain value

I have data where consecutive runs of zeros are separated by runs of non-zero values. I want to create a counter for the runs of zeros in the column 'SOG'.

For the first run of zeros in SOG, set the counter in column 'Stops' to 1. For the second run of zeros, set 'Stops' to 2, and so on. Rows where SOG is non-zero get 0.

SOG Stops
--- -----
4   0
4   0
0   1
0   1
0   1
3   0
4   0
5   0
0   2
0   2
1   0
2   0
0   3
0   3
0   3

SOG <- c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)
# run-length encoding
tmp <- rle(SOG)
# turn the run values into logicals (TRUE for a run of zeros)
tmp$values <- tmp$values == 0
# replace each TRUE with its cumulative count (1, 2, 3, ...)
tmp$values[tmp$values] <- cumsum(tmp$values[tmp$values])
# invert the run-length encoding
inverse.rle(tmp)
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
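
The same steps can be wrapped in a small function so that the counter works for any target value, not just 0. This is only a sketch: the name count_runs and its value argument are made up here for illustration and are not part of the original answer, and NA values in the input are not handled.

```r
# Hypothetical helper (not from the original answer): number the runs of
# `value` in `x` as 1, 2, 3, ... and give every other element 0.
count_runs <- function(x, value = 0) {
  r <- rle(x == value)                                # runs of TRUE / FALSE
  r$values <- ifelse(r$values, cumsum(r$values), 0L)  # k-th TRUE run becomes k
  inverse.rle(r)                                      # expand back to length(x)
}

SOG <- c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)
stops <- count_runs(SOG)             # runs of 0
fours <- count_runs(SOG, value = 4)  # runs of 4
```

Here stops reproduces the Stops column from the question, while fours numbers the two runs of 4 in the same data.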

Try

 df$stops <- with(df, cumsum(c(0, diff(!SOG)) > 0) * !SOG)
 df$stops
 # [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
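
For readers who find the one-liner dense, it can be unpacked into named steps. The intermediate names (is_zero, run_start, run_id) are invented here for readability, the data frame is rebuilt from the question's values, and the column is written as Stops to match the question.

```r
df <- data.frame(SOG = c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0))

is_zero   <- !df$SOG                   # TRUE where SOG is 0
run_start <- c(0, diff(is_zero)) > 0   # TRUE on the first row of each zero run
run_id    <- cumsum(run_start)         # number of zero runs started so far
df$Stops  <- run_id * is_zero          # zero out rows that are not in a zero run

df$Stops
# [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
```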

Using dplyr:

 library(dplyr)
 df <- df %>% mutate(Stops = ifelse(SOG == 0, yes = cumsum(c(0, diff(!SOG) > 0)), no = 0))
 df$Stops
 #[1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3

EDIT: As an aside for those of us who are still beginners, many of the answers to this question make use of logicals (i.e. TRUE and FALSE). Putting ! before a numeric variable like SOG tests whether each value is 0, returning TRUE if it is and FALSE otherwise. (The examples here use a SOG vector that is one element shorter than the one in the question.)

SOG
#[1] 4 0 0 0 3 4 5 0 0 1 2 0 0 0
!SOG
#[1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
#[12]  TRUE  TRUE  TRUE

diff() takes the difference between each value and the one before it. Note that the result has one fewer element than SOG, since the first element has no predecessor with which to compute a difference. When it comes to logicals, diff(!SOG) produces 1 for TRUE - FALSE, -1 for FALSE - TRUE, and 0 otherwise.

diff(SOG)
#[1] -4  0  0  3  1  1 -5  0  1  1 -2  0  0
diff(!SOG)
#[1]  1  0  0 -1  0  0  1  0 -1  0  1  0  0

So cumsum(diff(!SOG) > 0) counts only the FALSE-to-TRUE changes, i.e. the starts of zero runs:

cumsum(diff(!SOG) > 0)
#[1] 1 1 1 1 1 1 2 2 2 2 3 3 3

But since the vector of differences is one element shorter, we prepend a 0 to restore the original length:

cumsum(c(0, diff(!SOG) > 0))  #Or cumsum( c(0, diff(!SOG)) > 0 ) 
#[1] 0 1 1 1 1 1 1 2 2 2 2 3 3 3

Then either multiply that vector by !SOG, as in @akrun's answer, or use ifelse(). If a particular element of SOG is 0, we use the corresponding element of cumsum(c(0, diff(!SOG) > 0)); if it isn't 0, we assign 0.
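
One caveat worth checking before relying on the diff()-based versions: because a 0 is prepended, a run of zeros at the very start of the vector has no FALSE-to-TRUE change in front of it and is never counted. The sketch below uses a made-up vector x to show this, along with one possible workaround (an assumption added here, not part of the original answers) that prepends the first element of !x instead of 0. The rle()-based version is included for comparison.

```r
x <- c(0, 0, 3, 0, 0, 5, 0)   # made-up example that starts with a zero run
is_zero <- !x

# Original pattern: the leading run has no preceding FALSE, so it is missed
naive <- cumsum(c(0, diff(is_zero)) > 0) * is_zero

# Possible workaround: seed with is_zero[1] so a leading run counts as a start
fixed <- cumsum(c(is_zero[1], diff(is_zero)) > 0) * is_zero

# rle-based version for comparison
via_rle <- with(rle(x == 0), rep(cumsum(values) * values, lengths))

naive
# [1] 0 0 0 1 1 0 2
fixed
# [1] 1 1 0 2 2 0 3
```

On the question's data, which starts with a non-zero value, the naive and fixed versions give the same result, so the difference only matters when the first observation is already a stop.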

A one-liner with rle would be:

df <- data.frame(SOG = c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0))
df <- transform(df, Stops = with(rle(SOG == 0), rep(cumsum(values) * values, lengths)))
df

#   SOG Stops
#1    4     0
#2    4     0
#3    0     1
#4    0     1
#5    0     1
#6    3     0
#7    4     0
#8    5     0
#9    0     2
#10   0     2
#11   1     0
#12   2     0
#13   0     3
#14   0     3
#15   0     3
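
Once Stops exists, it can double as a grouping key. As a small illustration (not part of the original answer), the snippet below rebuilds the column with the one-liner and then counts how many zero runs there are and how many rows each one spans, using only base R. The names n_stops and stop_length are made up for this example.

```r
df <- data.frame(SOG = c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0))
df <- transform(df, Stops = with(rle(SOG == 0), rep(cumsum(values) * values, lengths)))

n_stops     <- max(df$Stops)                     # number of zero runs
stop_length <- tabulate(df$Stops[df$Stops > 0])  # rows in stop 1, 2, 3, ...

n_stops
# [1] 3
stop_length
# [1] 3 2 3
```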
