BioPython iterating through sequences from fasta file

I'm new to BioPython and I'm trying to import a fasta/fastq file and iterate through each sequence, performing some operation on each one. I know this seems basic, but for some reason my code below is not printing correctly.

from Bio import SeqIO

newfile = open("new.txt", "w")
records = list(SeqIO.parse("rosalind_gc.txt", "fasta"))

i = 0
dna = records[i]

while i <= len(records):
    print (dna.name)
    i = i + 1

I'm basically trying to iterate through records and print each name. However, my code only ever prints the name of "records[0]", whereas I want it to print "records[1-10]". Can someone explain why it only prints "records[0]"?

The reason for your problem is here:

i = 0
dna = records[i]

Your variable 'dna' is bound to index 0 of records, i.e., records[0], before the loop starts. Since you never assign to it again, dna keeps referring to that first record no matter how i changes. In the print statement within your while loop, index records directly. Note also that the loop condition should be i < len(records): with <=, the last pass asks for records[len(records)], which is one past the end of the list and raises an IndexError.

while i < len(records):
    print(records[i].name)
    i = i + 1
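To see the difference between a name bound once and a fresh lookup on each pass, here is a minimal sketch. It uses plain strings as stand-in data instead of real SeqRecord objects, and the names seen_stale and seen_fresh are made up for the illustration.

```python
# Stand-in data: plain strings instead of Biopython SeqRecord objects.
records = ["Rosalind_0001", "Rosalind_0002", "Rosalind_0003"]

i = 0
dna = records[i]  # dna is bound to records[0] once, right here

seen_stale = []
seen_fresh = []
while i < len(records):
    seen_stale.append(dna)         # never re-evaluated: always records[0]
    seen_fresh.append(records[i])  # looked up again on every pass
    i = i + 1

print(seen_stale)  # the first entry repeated three times
print(seen_fresh)  # each entry once, in order
```

Changing i afterwards has no effect on dna, because the assignment stored the object that records[i] referred to at that moment, not the expression itself.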

If you would like to keep a variable dna that refers to the current record, you need to reassign it on every pass, inside your while loop, like this:

while i < len(records):
    dna = records[i]  # rebind dna to the current record on every pass
    print(dna.name)
    i = i + 1

That works, but it is not the neatest way. A much nicer approach than a while loop with i = i + 1 is to use a for loop, like this:

for i in range(0,len(records)):
    print (records[i].name)

A for loop does the iteration for you, one item at a time. range(0, len(records)) yields the integers from 0 up to, but not including, len(records), which are exactly the valid indices of records. There are also other ways, but I'm keeping it simple.
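One of those other ways is to loop over the items themselves rather than over their indices. The sketch below assumes nothing about the real file: Record is a hypothetical stand-in for Biopython's SeqRecord with only two fields, and the sample names and sequences are invented.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqRecord.SeqRecord, keeping only two fields.
Record = namedtuple("Record", ["name", "seq"])

records = [
    Record("Rosalind_0001", "ACGT"),
    Record("Rosalind_0002", "GGCC"),
    Record("Rosalind_0003", "ATAT"),
]

# Iterate over the items directly; no index variable is needed.
names = []
for record in records:
    names.append(record.name)
    print(record.name)

# enumerate() supplies a counter when the position is also wanted.
numbered = []
for position, record in enumerate(records, start=1):
    numbered.append(f"{position}: {record.name}")
```

With Biopython, the same kind of loop can run directly over SeqIO.parse("rosalind_gc.txt", "fasta"), which yields one record at a time, so wrapping it in list() is only needed when you want indexing or len().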


