Group a DNA sequence in codons

Question

I have generated a random DNA sequence:

base <- c("A","G","U")
seq <- sample(base, 15, replace = TRUE)
seq
# [1] "A" "G" "A" "U" "A" "G" "U" "A" "U" "A" "G" "U" "G" "U" "G"

How can I group the resulting sequence into codons (sets of three nucleotides) so that I can look for stop codons? I need something like this:

new_seq <- c("AGA","UAG", "UAU", "AGU", "GUG")

Answer 1

Convert to a 3-column matrix (filled by row), then paste the columns together:

base <- c("A","G","U")
set.seed(1)  # make the random sample reproducible
x <- sample(base, 15, replace = TRUE)
x
# [1] "A" "U" "A" "G" "A" "U" "U" "G" "G" "U" "U" "A" "A" "A" "G"

do.call(paste0, as.data.frame(matrix(x, ncol = 3, byrow = TRUE)))
# [1] "AUA" "GAU" "UGG" "UUA" "AAG"
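To get from the grouped codons to the stop codons the question asks about, a minimal sketch could look like this. It uses the fixed sequence from the question instead of a random sample, so the result does not depend on the seed. The names codons, stop_codons and stop_pos are only illustrative, and the lookup table is the three standard RNA stop codons (UAA, UAG, UGA).

```r
# Fixed example sequence copied from the question (no random sampling)
x <- c("A","G","A","U","A","G","U","A","U","A","G","U","G","U","G")

# Group into codons with the matrix approach shown above
codons <- do.call(paste0, as.data.frame(matrix(x, ncol = 3, byrow = TRUE)))

# Standard RNA stop codons (illustrative lookup table)
stop_codons <- c("UAA", "UAG", "UGA")

# Positions (in codon units) of any stop codons in the sequence
stop_pos <- which(codons %in% stop_codons)
stop_pos
```

Here stop_pos holds the index of each codon that is a stop codon, counted in codons rather than in single bases.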

Answer 2

We can use gl to create a grouping index, and then use tapply to paste the elements within each group:

unname(tapply(seq, as.integer(gl(length(seq), 3, 
        length(seq))), FUN = paste, collapse=""))
#[1] "GAU" "UUG" "AAG" "GGU" "AGA"

NOTE: This also works when the length of the sequence is not a multiple of 3; the last group is then simply shorter.
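As a small illustration of that note, here is a sketch of the same gl/tapply idea on a sequence of length 8. The names rna, grp, codons and complete are made up for this example, and as.vector is used in place of unname only to get a plain character vector back.

```r
# Sequence whose length (8) is not a multiple of 3
rna <- c("A","G","A","U","A","G","U","A")

# Grouping index: 1 1 1 2 2 2 3 3
grp <- as.integer(gl(length(rna), 3, length(rna)))

# Paste the bases within each group; as.vector drops names and dims
codons <- as.vector(tapply(rna, grp, FUN = paste, collapse = ""))
codons

# Optionally keep only the complete codons
complete <- codons[nchar(codons) == 3]
complete
```

The trailing "UA" is kept as an incomplete codon, so it is probably worth filtering it out before looking for stop codons.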

Another option is to paste the vector into a single string and then split it after every three characters:

strsplit(paste(seq, collapse=""), "(?<=...)", perl = TRUE)[[1]]
#[1] "GAU" "UUG" "AAG" "GGU" "AGA"
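The pattern "(?<=...)" is a lookbehind: it matches the empty position that is preceded by three characters, and that is where strsplit cuts. If the regex feels too opaque, a possible alternative is to cut the pasted string by position with substring. This is only a sketch on the fixed sequence from the question; s, starts and codons are illustrative names.

```r
# Fixed example sequence copied from the question
x <- c("A","G","A","U","A","G","U","A","U","A","G","U","G","U","G")

# Collapse to a single string
s <- paste(x, collapse = "")

# Start position of every codon: 1, 4, 7, ...
starts <- seq(1, nchar(s), by = 3)

# Extract three characters per start; pmin guards a short final chunk
codons <- substring(s, starts, pmin(starts + 2, nchar(s)))
codons
```

Note that calling seq(...) still works even if a variable named seq exists, as in the question, because R skips non-function objects when it looks up a function call.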

solution1
1 ACCEPTED 2020-11-30 19:48:45

solution2
0 2020-11-30 19:31:52
