How to generate a new numeric column to represent a sequence of characters in R?

The data I have is like sdf:

s = c("aa", "bb", "cc","cc","cc","cc","aa", "bb", "cc", "bb", "bb", "bb", "bb")
sdf = data.frame(s)

I want to generate a column that counts up from 1 and increases only when the value of s changes from one row to the next, so consecutive repeats of the same value keep the same number. I can get a plain row index with the following:

sdf$wrongseq <- seq_len(nrow(sdf))

But how do I get a sequence like the one described above? The expected result is:

rightseq <- c(1, 2, 3, 3, 3, 3, 4, 5, 6, 7, 7, 7, 7)
sdf = cbind(sdf,rightseq)

With the data.table package, rleid() returns exactly this kind of run ID:

library(data.table)
rleid(sdf$s)
#[1] 1 2 3 3 3 3 4 5 6 7 7 7 7

# Base R alternative, if you would rather not load a package:
x <- rle(as.character(sdf$s))$lengths  # rle() gives the length of each run of equal values
# x
# [1] 1 1 4 1 1 1 4
rep(seq_along(x), x)  # repeat each run's index by that run's length
#[1] 1 2 3 3 3 3 4 5 6 7 7 7 7
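
The same run ID can also be built in base R by flagging each position where the value differs from the one before it and taking a cumulative sum of those flags. This is a minimal sketch, not part of the original answer: the helper name run_id is hypothetical, and it assumes the input contains no NA values.

```r
# Hypothetical helper (not from any package): assigns an ID to each run
# of consecutive equal values. Assumes x has no NA values.
run_id <- function(x) {
  x <- as.character(x)              # also handles factor columns
  n <- length(x)
  if (n == 0L) return(integer(0))   # nothing to label
  # TRUE wherever a new run starts; the first element always starts one
  starts <- c(TRUE, x[-1] != x[-n])
  cumsum(starts)                    # running count of run starts
}

s <- c("aa", "bb", "cc", "cc", "cc", "cc", "aa", "bb", "cc", "bb", "bb", "bb", "bb")
sdf <- data.frame(s)
sdf$rightseq <- run_id(sdf$s)
sdf$rightseq
#[1] 1 2 3 3 3 3 4 5 6 7 7 7 7
```

Unlike the rle() version, this never materialises the run lengths, but both approaches give the same result on this data.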
