
Python: how to split an input DNA sequence string into a list of nucleotide triplets as single-element tuples

def s_seq(dna_seq):
    '''
    parses an input sequence in string format to a list of nucleotide triplets/codons as single-valued tuples
    '''
    codons = []

    # arrange codons as list of single element tuples
    if len(dna_seq) % 3 == 0:
        for i in range(0, len(dna_seq), 3):
            codons = dna_seq[i:i + 3]

    return codons

dna_seq01 = 'ATATTAAAGAATAATTTTATAAAAATATGT'
codons01 = s_seq(dna_seq01)

It keeps returning only the last codon, but what I want is the whole sequence split into codons: 'ATA', 'TTA', and so on. I don't know what I am doing wrong here.

You just need to append each codon, wrapped in a single-element tuple, to the list you initialized above:

codons = []
if len(dna_seq) % 3 == 0:
    for i in range(0, len(dna_seq), 3):
        codons.append((dna_seq[i:i + 3],))

Output:

[('ATA',), ('TTA',), ('AAG',), ('AAT',), ('AAT',), ('TTT',), ('ATA',), ('AAA',), ('ATA',), ('TGT',)]

By writing codons = dna_seq[i:i + 3], you rebind codons to a new string on every loop iteration, so only the last triplet is left when the function returns.
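For reference, here is a minimal sketch of the whole function with the fix applied. It keeps the names from the question (s_seq, dna_seq, codons01) and the same behavior of returning an empty list when the length is not a multiple of 3. The list comprehension is just a compact alternative to the explicit loop with append, not something the question requires.

```python
def s_seq(dna_seq):
    '''
    Parse a DNA sequence string into a list of codons,
    each wrapped in a single-element tuple.
    '''
    # Same guard as the original: return an empty list unless the
    # length is a multiple of 3.
    if len(dna_seq) % 3 != 0:
        return []

    # (x,) with the trailing comma builds a one-element tuple;
    # (x) alone would just be the string itself.
    return [(dna_seq[i:i + 3],) for i in range(0, len(dna_seq), 3)]


dna_seq01 = 'ATATTAAAGAATAATTTTATAAAAATATGT'
codons01 = s_seq(dna_seq01)
print(codons01)
```

If silently returning an empty list for a sequence whose length is not a multiple of 3 is not what you want, raising a ValueError in the guard may be easier to debug.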
