The code below extracts short subsequences from every sequence using a window size of 100. The window shifts by a step size of one each time. I would like to extract the subsequences with a step size of 50 instead. Can anyone help me?
from Bio import SeqIO

with open("B.fasta", "w") as f:
    for seq_record in SeqIO.parse("A.fasta", "fasta"):
        for i in range(len(seq_record.seq) - 99):
            f.write(str(">" + seq_record.id) + "\n")
            f.write(str(seq_record.seq[i:i+100]) + "\n")
Example FASTA file:
>hg17_ct_ER_ER_142
CTAAAAAAGTAAAAAAGAAAAAAAGAGAAAGAAAGAATATAGAAGCAACAAGTGTAGATTTACATTCTATTAGACAGTGACCCATTAGACCCGGACAAGGGG
Example output:
>hg17_ct_ER_ER_142
CTAAAAAAGTAAAAAAGAAAAAAAGAGAAAGAAAGAATATAGAAGCAACAAGTGTAGATTTACATTCTATTAGACAGTGACCCATTAGACCCGGACAAGG
>hg17_ct_ER_ER_142
TAAAAAAGTAAAAAAGAAAAAAAGAGAAAGAAAGAATATAGAAGCAACAAGTGTAGATTTACATTCTATTAGACAGTGACCCATTAGACCCGGACAAGGG
>hg17_ct_ER_ER_142
AAAAAAGTAAAAAAGAAAAAAAGAGAAAGAAAGAATATAGAAGCAACAAGTGTAGATTTACATTCTATTAGACAGTGACCCATTAGACCCGGACAAGGGG
Expected output:
>hg17_ct_ER_ER_142
CTAAAAAAGTAAAAAAGAAAAAAAGAGAAAGAAAGAATATAGAAGCAACA
>hg17_ct_ER_ER_142
AGTGTAGATTTACATTCTATTAGACAGTGACCCATTAGACCCGGACAAGG
Just use the step argument of the range function:
for i in range(0, len(seq_record.seq) - 99, 50):
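For completeness, here is a minimal sketch of how the stepped range could fit into the full script. The helper names `sliding_windows` and `write_windows` are made up for illustration and are not part of Biopython; only `SeqIO.parse` comes from the library. Note that the expected output in the question shows 50-character chunks, which would correspond to a window of 50 with a step of 50 rather than a window of 100, so both values are left as parameters here.

```python
def sliding_windows(sequence, window=100, step=50):
    """Yield (start, subsequence) pairs, keeping only full-length windows.

    Hypothetical helper for illustration; not a Biopython function.
    """
    # The last valid start index is len(sequence) - window, hence the + 1.
    for start in range(0, len(sequence) - window + 1, step):
        yield start, sequence[start:start + window]


def write_windows(in_path, out_path, window=100, step=50):
    """Write every window of every FASTA record to out_path (hypothetical helper)."""
    # Imported here so sliding_windows can be used without Biopython installed.
    from Bio import SeqIO

    with open(out_path, "w") as out:
        for record in SeqIO.parse(in_path, "fasta"):
            for _, chunk in sliding_windows(str(record.seq), window, step):
                out.write(f">{record.id}\n{chunk}\n")
```

Calling `write_windows("A.fasta", "B.fasta")` would reproduce the loop from the question with a step of 50, while `write_windows("A.fasta", "B.fasta", window=50, step=50)` would produce non-overlapping 50-character chunks like those in the expected output. Any trailing bases that do not fill a whole window are dropped in both cases.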