
Collapse a vector into a string of characters with their respective numbers of consecutive occurrences

I would like to collapse a CIGAR vector into a CIGAR string. That is, I want a function that converts:

cigar.vector = c("M", "M", "I", "I", "M", "I", "", "M", "D", "D", "M", "I", "D", "M", "I")

to this:

cigar.string = "2M2I1M1I1M2D1M1I1D1M1I"

and vice versa.

Note that the vector contains a "" (empty string), which should not be counted. Thanks!

rle (run-length encoding) seems the obvious choice here, after dropping the empty strings:

rcv <- rle(cigar.vector[cigar.vector!=""])
paste0(rcv$lengths,rcv$values,collapse="")
#[1] "2M2I1M1I1M2D1M1I1D1M1I"
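To see why this works, it can help to look at what rle returns for the filtered vector. The following is a small sketch; the variable name runs is just an illustrative choice, not something from the question:

```r
cigar.vector <- c("M", "M", "I", "I", "M", "I", "", "M", "D", "D", "M", "I", "D", "M", "I")

# Drop the empty strings first, then run-length encode what is left
runs <- rle(cigar.vector[cigar.vector != ""])

runs$lengths
#[1] 2 2 1 1 1 2 1 1 1 1 1
runs$values
#[1] "M" "I" "M" "I" "M" "D" "M" "I" "D" "M" "I"
```

paste0 then glues each length to its value element by element, and collapse="" joins the pieces into one string.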

If you want to get fancy, you could also exploit the fact that rle gives a list of length 2:

paste(do.call(rbind,rle(cigar.vector[cigar.vector!=""])),collapse="")
#[1] "2M2I1M1I1M2D1M1I1D1M1I"
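The rbind trick relies on R storing matrices in column-major order: binding lengths and values as two rows gives a 2-row character matrix, and paste walks it column by column, so each count is immediately followed by its letter. A rough sketch to make that visible (the name m and the unclass call are illustrative additions; unclass only turns the rle object into a plain list):

```r
cigar.vector <- c("M", "M", "I", "I", "M", "I", "", "M", "D", "D", "M", "I", "D", "M", "I")

# Two rows (lengths, values), one column per run; everything is coerced to character
m <- do.call(rbind, unclass(rle(cigar.vector[cigar.vector != ""])))

dim(m)
#[1]  2 11
as.vector(m)[1:4]   # column-major order: count, letter, count, letter
#[1] "2" "M" "2" "I"
```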

Going backwards exactly is impossible given only the result (assign the output above to result), because the information about the "" entries has been lost. Ignoring those entries, you can recover the rest of the vector with something like:

backwards <- rep(
  unlist(strsplit(result,"\\d+"))[-1],
  as.numeric(unlist(strsplit(result,"[^0-9]")))
)
identical(cigar.vector[cigar.vector!=""],backwards)
#[1] TRUE
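If both directions are needed repeatedly, the two steps could be wrapped into a pair of helper functions. This is only a sketch: the names vector_to_cigar and cigar_to_vector are made up for illustration, and the reverse direction uses regmatches with gregexpr instead of the strsplit approach above, which avoids having to drop a leading empty element. It assumes the string strictly alternates counts and operation letters.

```r
# Hypothetical helper: CIGAR vector -> CIGAR string (empty strings are ignored)
vector_to_cigar <- function(x) {
  runs <- rle(x[x != ""])
  paste0(runs$lengths, runs$values, collapse = "")
}

# Hypothetical helper: CIGAR string -> CIGAR vector (the "" entries cannot be restored)
cigar_to_vector <- function(cigar) {
  lens <- as.integer(regmatches(cigar, gregexpr("[0-9]+", cigar))[[1]])
  ops  <- regmatches(cigar, gregexpr("[^0-9]+", cigar))[[1]]
  stopifnot(length(lens) == length(ops))  # every count needs an operation
  rep(ops, times = lens)
}

cigar.vector <- c("M", "M", "I", "I", "M", "I", "", "M", "D", "D", "M", "I", "D", "M", "I")
cigar.string <- vector_to_cigar(cigar.vector)
cigar.string
#[1] "2M2I1M1I1M2D1M1I1D1M1I"
identical(cigar_to_vector(cigar.string), cigar.vector[cigar.vector != ""])
#[1] TRUE
```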


 