Get ID and protein sequences in Biopython
I have this code:
from Bio import SeqIO
for seq_record in SeqIO.parse("aminoacids.txt", "fasta"):
    print(seq_record.id)
    print(repr(seq_record.seq))
Output:
NP_414584.1
Seq('MNTFSQVWVFSDTPSRLPELMNGAQALANQINTFVLNDADGAQAIQLGANHVWK...LAR')
NP_414563.1
Seq('MASVSISCPSCSATDGVVRNGKSTAGHQRYLCSHCRKTWQLQFTYTASQPGTHQ...RSR')
NP_414564.1
Seq('MANIKSAKKRAIQSEKARKHNASRRSMMRTFIKKVYAAIEAGDKAAAQKAFNEM...KLA')
NP_414565.1
Seq('MCRHSLRSDGAGFYQLAGCEYSFSAIKIAAGGQFLPVICAMAMKSHFFLISVLN...SLF')
NP_414566.1
Seq('MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVML...KPA')
Problem: I should get the ID and the full sequence as a single string, without "Seq" at the beginning. Something like this:
NP_414584.1
MNTFSQVWVFSDTPSRLPELMNGAQALANQINTFVLNDADGAQAIQLGANHVWKLNGKPDDRMIEDYAGVMADTIRQHGADGLVLLPNTRRGKLLAAKLGYRLKAAVSNDASTVSVQDGKATVKHMVYGGLAIGEERIATPYAVLTISSGTFDAAQPDASRTGETHTVEWQAPAVAITRTATQARQSNSVDLDKARLVVSVGRGIGSKENIALAEQLCKAIGAELACSRPVAENEKWMEHERYVGISNLMLKPELYLAVGISGQIQHMVGANASQTIFAINKDKNAPIFQYADYGIVGDAVKILPALTAALAR
How can I get this output?
repr is not designed for producing final output. It's essentially a debugging tool, which is why it abbreviates long sequences with "..." and wraps them in Seq('...'). What you have is a Seq object. You probably need to do:
    print(seq_record.seq)
which uses the object's str method and prints the full sequence as plain text.
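As a minimal sketch of the same idea, assuming Biopython is installed, the snippet below collects each record's ID together with its full sequence as an ordinary Python string via str(record.seq). The FASTA text, the record names demo_1 and demo_2, and the helper id_and_sequence are made up for illustration; an in-memory handle stands in for aminoacids.txt so the example runs on its own.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical FASTA content standing in for aminoacids.txt.
# The sequences are made up and deliberately wrapped over several lines.
fasta_text = """>demo_1 first made-up record
MNTFSQVWVF
SDTPSRLPEL
>demo_2 second made-up record
MASVSISCPS
"""


def id_and_sequence(handle):
    """Return (id, full sequence string) pairs for every FASTA record in handle."""
    # str() on a Seq object gives the whole sequence, with no Seq('...')
    # wrapper and no "..." abbreviation.
    return [(record.id, str(record.seq)) for record in SeqIO.parse(handle, "fasta")]


pairs = id_and_sequence(StringIO(fasta_text))
for record_id, sequence in pairs:
    print(record_id)
    print(sequence)
```

With a real file, passing the filename to SeqIO.parse as in the question should behave the same way; the FASTA parser joins wrapped sequence lines, so each sequence comes back as one string.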