
Finding consecutive values by row

Is there a clever way to find out, row by row, whether there are consecutive "Yes" values and how long the longest run is?

        1/01 1/02 1/03 1/04
UserA   Yes  Yes  Yes  Yes
UserB   No   Yes  No   No
UserC   Yes  No   Yes  Yes
UserD   Yes  No   Yes  No

UserA would have 4 consecutive Yes's

UserB would have 0

UserC would have 2 consecutive Yes's

UserD would have 0 consecutive Yes's

I will assume that you have a data.frame d:

d <- structure(list(X1.01 = c("Yes", "No", "Yes", "Yes"), X1.02 = c("Yes", 
"Yes", "No", "No"), X1.03 = c("Yes", "No", "Yes", "Yes"), X1.04 = c("Yes", 
"No", "Yes", "No")), .Names = c("X1.01", "X1.02", "X1.03", "X1.04"
), class = "data.frame", row.names = c("UserA", "UserB", "UserC", 
"UserD"))

You can use apply by rows (apply(d, 1, ...)) to calculate the longest consecutive series of 'Yes':

result <- apply(d,1,function(s) {z<-rle(s); max(z$lengths[z$values=='Yes'])})
#UserA UserB UserC UserD 
#    4     1     2     1 

The key function here is rle, which finds all runs of consecutive identical values. We keep only the runs corresponding to 'Yes' (z$lengths[z$values=='Yes']) and return the longest one. The last step is to convert ones to zeroes, since a single 'Yes' does not count as consecutive:

result[result==1] <- 0

#UserA UserB UserC UserD 
#    4     0     2     0 
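
One caveat with the approach above: for a row that contains no 'Yes' at all, max() receives an empty vector and returns -Inf with a warning, and the ones-to-zeroes step would not catch that. Below is a minimal sketch of a variant that handles this case and folds the last step into the function. The helper name longest_yes_run and its min_run parameter are illustrative assumptions, not part of the original answer.

```r
# Same data as in the answer above, rebuilt so the block runs on its own
d <- data.frame(
  X1.01 = c("Yes", "No", "Yes", "Yes"),
  X1.02 = c("Yes", "Yes", "No", "No"),
  X1.03 = c("Yes", "No", "Yes", "Yes"),
  X1.04 = c("Yes", "No", "Yes", "No"),
  row.names = c("UserA", "UserB", "UserC", "UserD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: longest run of "Yes" with at least min_run elements,
# or 0 when no such run exists (including rows with no "Yes" at all)
longest_yes_run <- function(s, min_run = 2) {
  z <- rle(as.character(s))                 # runs of identical values
  runs <- z$lengths[z$values == "Yes"]      # lengths of the "Yes" runs
  runs <- runs[runs >= min_run]             # drop isolated "Yes" values
  if (length(runs) == 0) 0L else max(runs)
}

result <- apply(d, 1, longest_yes_run)
result
#UserA UserB UserC UserD 
#    4     0     2     0 
```

Calling longest_yes_run(c("No", "No")) returns 0 rather than -Inf, so no post-processing is needed.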

Here's a similar approach using apply and rle (I'll post it because I was already in the middle of writing it up):

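# d is the data.frame defined in the answer above
apply(d, 1, function(x) {
  temp <- rle(x == "Yes")                # runs of TRUE ("Yes") and FALSE
  temp2 <- with(temp, lengths[values])   # lengths of the "Yes" runs only
  temp2[temp2 > 1]                       # keep runs of two or more
})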
# $UserA
# 
# 4 
# 
# $UserB
# named integer(0)
# 
# $UserC
# 
# 2 
# 
# $UserD
# named integer(0)
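
To make the indexing in this second approach easier to follow, here is a small sketch that runs the same steps on a single row (UserC's values, typed in by hand). The variable names are illustrative only.

```r
x <- c("Yes", "No", "Yes", "Yes")   # UserC's row

runs <- rle(x == "Yes")             # run-length encode the logical vector
runs$lengths                        # 1 1 2
runs$values                         # TRUE FALSE TRUE

# Indexing lengths by the logical values keeps only the "Yes" runs
yes_runs <- runs$lengths[runs$values]
yes_runs                            # 1 2

# Runs longer than one element are the consecutive "Yes" streaks
yes_runs[yes_runs > 1]              # 2
```

Because a row can contain zero, one, or several qualifying runs, apply cannot simplify the results to a vector and returns a list instead, which is why the output above shows integer(0) for UserB and UserD.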
