
Question about a list of lists in Python 3

Code:

file = open(files, 'r')
seqs = []
title = []
f = file.readlines()

for line in f:
    if line[0] == '>':
        title.append(line[1:-1])
    if line[0] != '>':
        seqs.append(line.rstrip())

final = []
for t, s in zip(title, seqs):
    final.append([t, s])

return final

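I want to pair each title line with its complete sequence, even when the sequence is split across several lines, as shown in the output further down.

Instead, my output is misaligned: a sequence can occupy multiple lines, so every continuation line becomes its own entry in seqs, and zip then pairs titles with the wrong sequence lines.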
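To make the misalignment concrete, here is a minimal sketch using a made-up three-record input (the sample records and the name `titles` are illustrative, not taken from the original file). Because the second record spans two lines, `seqs` ends up longer than `titles`, and `zip` pairs strictly by position:

```python
# Hypothetical sample: the second record's sequence spans two lines.
lines = [">id1", "aaaa", ">id2", "cccc", "gggg", ">id3", "tttt"]

titles = [line[1:] for line in lines if line.startswith(">")]
seqs = [line for line in lines if not line.startswith(">")]

# 3 titles but 4 sequence lines: the lists no longer line up.
print(len(titles), len(seqs))

# zip pairs by position and stops at the shorter list, so id2 loses
# its second line and id3 is paired with a line that belongs to id2.
pairs = list(zip(titles, seqs))
print(pairs)
```

Here `id3` ends up paired with `gggg`, and its real sequence `tttt` is dropped without any error.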

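You can read the input line by line. If a line starts with '>', it begins a new record, so append a new two-element list to the output list: the title from that line, plus an empty string that will collect the sequence. Any other line is a continuation of the current sequence, so concatenate it onto the string in the last record.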

data = """>A18178 1
caccaataaaaaaacaagcttaacctaattc
>A21196 1
cggccagatcta
>A21197 1
agcttagatctggccgggg
>AX557348 1
gcggatttactcaggggagagcccagataaatggagtctgtgcgtccaca
gaattcgcacca
>AX557349 1
gcggatttactcaggggagagcccagataaatggagtctgtgcgtccaca
gaattcgcacca
>AX557350 1
tccgtgaaacaaagcggatgtaccggatttttattccggctatggggcaa
ttccccgtcgcggagcca"""

output = []
for line in data.splitlines():
    if line.startswith('>'):
        output.append([line[1:], ''])
    else:
        output[-1][-1] += line

print(output)

OUTPUT (line breaks added for readability)

[
  ['A18178 1', 'caccaataaaaaaacaagcttaacctaattc'], 
  ['A21196 1', 'cggccagatcta'], 
  ['A21197 1', 'agcttagatctggccgggg'], 
  ['AX557348 1', 'gcggatttactcaggggagagcccagataaatggagtctgtgcgtccacagaattcgcacca'], 
  ['AX557349 1', 'gcggatttactcaggggagagcccagataaatggagtctgtgcgtccacagaattcgcacca'], 
  ['AX557350 1', 'tccgtgaaacaaagcggatgtaccggatttttattccggctatggggcaattccccgtcgcggagcca']
]
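The same idea can be applied directly to a file, as in the question. The following is a sketch rather than a definitive implementation: the function name `read_fasta` is hypothetical, and it assumes the file uses the same layout as the sample data (a '>' title line followed by one or more sequence lines). It collects the sequence lines in a list and joins them once at the end, skips blank lines, and ignores any sequence line that appears before the first title.

```python
def read_fasta(path):
    """Return a list of [title, sequence] pairs from a FASTA-style file."""
    records = []  # each entry is [title, list_of_sequence_lines]
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A title line starts a new record.
                records.append([line[1:], []])
            elif records:
                # A continuation line belongs to the most recent record.
                # (Lines before the first title are ignored: an assumption.)
                records[-1][1].append(line)
    # Join each record's pieces into a single sequence string.
    return [[title, "".join(parts)] for title, parts in records]
```

With the variable from the question, this would be called as `final = read_fasta(files)`.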
