
Rolling Text Concatenation with Data.Table in R

I have a dataset that looks like the following:


I would like to add a new text column that cumulatively concatenates the values of the text column, in rownum order, within each name. The vector for the new column would be:


I am at a loss as to how to do this. For numbers I would just use the cumsum function, but for text I suspect I need a for loop or one of the apply functions.

You can use Reduce with the accumulate option:

a[, rolltext := Reduce(paste0, text, accumulate = TRUE), by = name]

    rownum name text rolltext
 1:      1 jeff    a        a
 2:      2 jeff    b       ab
 3:      3 mary    c        c
 4:      4 jeff    d      abd
 5:      5 jeff    e     abde
 6:      6 jeff    f    abdef
 7:      7 mary    g       cg
 8:      8 mary    h      cgh
 9:      9 mary    i     cghi
10:     10 mary    j    cghij
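For anyone who wants to run this end to end, here is a minimal sketch. The input table is not shown in the question, so the `a` below is reconstructed from the output above and should be treated as an assumption.

```r
library(data.table)

# Input reconstructed from the output shown above (an assumption; the
# question's original table is not reproduced in the post).
a <- data.table(
  rownum = 1:10,
  name   = c("jeff", "jeff", "mary", "jeff", "jeff",
             "jeff", "mary", "mary", "mary", "mary"),
  text   = letters[1:10]
)

# Reduce() with accumulate = TRUE returns every intermediate result,
# so each name group gets its running concatenation.
a[, rolltext := Reduce(paste0, text, accumulate = TRUE), by = name]

print(a)
```

Because the grouping is done with by = name, the rows of the two names can be interleaved and each group still accumulates in row order.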

Alternatively, as @DavidArenburg suggested, construct each row using sapply:

a[, rolltext := sapply(1:.N, function(x) paste(text[1:x], collapse = '')), by = name]

Note that this is a running (cumulative) operation, whereas a rolling operation (as in the OP's title) is something different, at least in R lingo: it is computed over a fixed-width moving window.
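To make the distinction concrete, here is a small base R sketch on a plain character vector. The window width k = 2 is a hypothetical choice for illustration, not something from the question.

```r
x <- c("a", "b", "d", "e", "f")

# Running (cumulative): each element includes everything before it.
running <- Reduce(paste0, x, accumulate = TRUE)

# Rolling with a hypothetical window of width k: each element includes
# only itself and the k - 1 elements before it.
k <- 2
rolling <- sapply(seq_along(x), function(i) {
  paste(x[max(1, i - k + 1):i], collapse = "")
})

running  # "a" "ab" "abd" "abde" "abdef"
rolling  # "a" "ab" "bd" "de" "ef"
```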

Here's an idea using substring(), which works when each text value is a single character.

a[, rolltext := substring(paste(text, collapse = ""), 1, 1:.N), by = name]

which gives

    rownum name text rolltext
 1:      1 jeff    a        a
 2:      2 jeff    b       ab
 3:      3 mary    c        c
 4:      4 jeff    d      abd
 5:      5 jeff    e     abde
 6:      6 jeff    f    abdef
 7:      7 mary    g       cg
 8:      8 mary    h      cgh
 9:      9 mary    i     cghi
10:     10 mary    j    cghij
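One caveat worth sketching: substring() cuts by character position, so 1:.N is only correct when every text value is one character wide. The vector `words` below is made up for illustration; using cumsum(nchar(...)) as the end positions is one possible way to handle wider values.

```r
words <- c("ab", "c", "def")
joined <- paste(words, collapse = "")

# Cutting at positions 1, 2, 3 slices inside "ab".
by_position <- substring(joined, 1, seq_along(words))

# Cutting at the cumulative widths restores the element boundaries.
by_width <- substring(joined, 1, cumsum(nchar(words)))

by_position  # "a" "ab" "abc"
by_width     # "ab" "abc" "abcdef"
```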

We might be able to speed this up a bit with the stringi package:

library(stringi)
a[, rolltext := stri_sub(stri_c(text, collapse = ""), length = 1:.N), by = name]
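Whether any of these is actually faster depends on the data, so it is worth measuring. The following is a rough sketch on a made-up vector of 2000 single letters. The helper names (via_reduce, via_substring, via_stringi) are hypothetical, and the stringi part only runs if the package is installed.

```r
set.seed(1)
x <- sample(letters, 2000, replace = TRUE)

via_reduce    <- function(x) Reduce(paste0, x, accumulate = TRUE)
via_substring <- function(x) substring(paste(x, collapse = ""), 1, seq_along(x))

# Both approaches should agree before their timings are compared.
stopifnot(identical(via_reduce(x), via_substring(x)))

print(system.time(for (i in 1:20) via_reduce(x)))
print(system.time(for (i in 1:20) via_substring(x)))

# stringi is optional here; the call mirrors the one in the answer above.
if (requireNamespace("stringi", quietly = TRUE)) {
  via_stringi <- function(x) {
    stringi::stri_sub(stringi::stri_c(x, collapse = ""), length = seq_along(x))
  }
  stopifnot(identical(via_reduce(x), via_stringi(x)))
  print(system.time(for (i in 1:20) via_stringi(x)))
}
```

Timings from system.time() are coarse; a dedicated benchmarking package would give more reliable numbers.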
