
Count the maximum of consecutive letters in a string

I have this vector:

vector <- c("XXXX-X-X", "---X-X-X", "--X---XX", "--X-X--X", "-X---XX-", "-X--X--X", "X-----XX", "X----X-X", "X---XX--", "XX--X---", "---X-XXX", "--X-XX-X")

I want to find the maximum number of consecutive times that X appears in each string. So, my expected result would be:

4, 1, 2, 1, 2, 1, 2, 1, 2, 2, 3, 2

In base R, we can split each string into separate characters and then use rle to find the maximum run length for "X".

sapply(strsplit(vector, ""), function(x) {
   # run-length encode the characters of one string
   inds <- rle(x)
   # longest run whose value is "X"
   max(inds$lengths[inds$values == "X"])
})

#[1] 4 1 2 1 2 1 2 1 2 2 3 2
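
To make the rle step concrete, here is a small sketch on a single string. The helper name max_run and its fallback of 0 for strings without any X are assumptions added for illustration; the answer above returns -Inf with a warning in that case.

```r
# Run-length encode the characters of one string
chars <- strsplit("--X---XX", "")[[1]]
runs <- rle(chars)
runs$values   # "-" "X" "-" "X"
runs$lengths  # 2 1 3 2

# Hypothetical helper: longest run of `ch`, or 0 if `ch` never occurs
max_run <- function(s, ch = "X") {
  r <- rle(strsplit(s, "")[[1]])
  len <- r$lengths[r$values == ch]
  if (length(len) == 0) 0L else max(len)
}

max_run("--X---XX")  # 2
max_run("----")      # 0
```

Guarding the empty case avoids the warning that max() raises on a zero-length vector.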

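Here is a slightly different approach. We can split each element of the input vector on runs of one or more dashes, then take the length of the longest resulting piece. Because sapply is applied to the character vector directly, the result is named by the input strings.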

sapply(vector, function(x) {
    max(nchar(unlist(strsplit(x, "-+"))))
})

XXXX-X-X ---X-X-X --X---XX --X-X--X -X---XX- -X--X--X X-----XX X----X-X 
       4        1        2        1        2        1        2        1 
X---XX-- XX--X--- ---X-XXX --X-XX-X 
       2        2        3        2 

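I suspect that X really just represents any non-dash character, so we don't need to check for it explicitly. If you really do want to count only X, note that deleting the other characters from each piece would merge X's that are not adjacent (for example, "XAX" would count as 2). It is safer to split on runs of any non-X character instead:

sapply(vector, function(x) {
    # split on runs of anything that is not "X", then measure the pieces
    max(nchar(unlist(strsplit(x, "[^X]+"))))
})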

Use strapply from the gsubfn package to extract each run of X's and apply nchar to it, which produces a list with one vector of run lengths per string. Then use sapply to take the max of each vector.

library(gsubfn)

sapply(strapply(vector, "X+", nchar), max)
## [1] 4 1 2 1 2 1 2 1 2 2 3 2
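
If installing gsubfn is not an option, the same extract-then-measure idea can be sketched in base R with gregexpr and regmatches. This is a swapped-in base R technique rather than the answer's own code, and the short sample vector x and the floor of 0 for strings without a match are assumptions for illustration.

```r
# Short sample taken from the question's data
x <- c("XXXX-X-X", "---X-XXX", "--X---XX")

# One character vector of "X+" matches per input string
runs <- regmatches(x, gregexpr("X+", x))

# Longest match per string; max(0L, ...) yields 0 when nothing matched
res <- sapply(runs, function(m) max(0L, nchar(m)))
res  # 4 3 2
```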

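Here are a couple of tidyverse alternatives (using purrr and stringr):

library(purrr)
library(stringr)

# count how many of "X", "XX", ..., "XXXXXXXX" occur in each string
map_dbl(vector, ~sum(str_detect(., strrep("X", 1:8))))
# [1] 4 1 2 1 2 1 2 1 2 2 3 2
# split on each dash and take the length of the longest piece
map_dbl(strsplit(vector,"-"), ~max(nchar(.)))
# [1] 4 1 2 1 2 1 2 1 2 2 3 2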
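
The str_detect one-liner works because a string whose longest run of X has length k contains "X", "XX", and so on up to k X's, but nothing longer, so counting the patterns that match gives k. Here is a base R sketch of that reasoning on one string. The upper bound of 8 presumably reflects the 8-character strings in the question; for other data, nchar(s) would be a safer bound (an assumption, not stated in the original).

```r
s <- "---X-XXX"

# Candidate runs: "X", "XX", ..., up to the full length of the string
patterns <- strrep("X", seq_len(nchar(s)))

# Which candidates occur somewhere in s (fixed = TRUE: literal matching)
hits <- vapply(patterns, grepl, logical(1), x = s, fixed = TRUE)

sum(hits)  # 3
```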
