
Get ID and protein sequences in biopython

I have this code.

from Bio import SeqIO

for seq_record in SeqIO.parse("aminoacids.txt", "fasta"):
    print(seq_record.id)
    print(repr(seq_record.seq))

Output:

NP_414584.1

Seq('MNTFSQVWVFSDTPSRLPELMNGAQALANQINTFVLNDADGAQAIQLGANHVWK...LAR')

NP_414563.1

Seq('MASVSISCPSCSATDGVVRNGKSTAGHQRYLCSHCRKTWQLQFTYTASQPGTHQ...RSR')

NP_414564.1

Seq('MANIKSAKKRAIQSEKARKHNASRRSMMRTFIKKVYAAIEAGDKAAAQKAFNEM...KLA')

NP_414565.1

Seq('MCRHSLRSDGAGFYQLAGCEYSFSAIKIAAGGQFLPVICAMAMKSHFFLISVLN...SLF')

NP_414566.1

Seq('MKLIRGIHNLSQAPQEGCVLTIGNFDGVHRGHRALLQGLQEEGRKRNLPVMVML...KPA')

Problem: I should get the ID and the full sequence, without "Seq" at the beginning and as a single string. Something like this:

NP_414584.1
MNTFSQVWVFSDTPSRLPELMNGAQALANQINTFVLNDADGAQAIQLGANHVWKLNGKPDDRMIEDYAGVMADTIRQHGADGLVLLPNTRRGKLLAAKLGYRLKAAVSNDASTVSVQDGKATVKHMVYGGLAIGEERIATPYAVLTISSGTFDAAQPDASRTGETHTVEWQAPAVAITRTATQARQSNSVDLDKARLVVSVGRGIGSKENIALAEQLCKAIGAELACSRPVAENEKWMEHERYVGISNLMLKPELYLAVGISGQIQHMVGANASQTIFAI NKDKNAPIFQYADYGIVGDAVKILPALTAALAR

How can I get this output?

repr is not designed for final output; it is essentially a debugging tool, which is why the long sequences above appear shortened with "...". What you have is a Seq object. You probably need to do:

print(seq_record.seq)

which uses the object's __str__ method, so the full sequence is printed as plain text.
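As a minimal sketch of the whole loop with that change applied: the FASTA text below is a hypothetical two-record stand-in for aminoacids.txt (the IDs and sequences are made up for illustration), read from memory so the example runs without the file. The helper name id_and_sequence_lines is also just an illustrative choice.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical stand-in for "aminoacids.txt"; IDs and sequences are invented.
FASTA_TEXT = """>NP_000001.1 example protein one
MNTFSQVWVF
SDTPSRLPEL
>NP_000002.1 example protein two
MASVSISCPS
"""


def id_and_sequence_lines(handle):
    """Return each record's ID followed by its full sequence as a plain string."""
    lines = []
    for seq_record in SeqIO.parse(handle, "fasta"):
        lines.append(seq_record.id)
        # str() gives the complete sequence, with no Seq('...') wrapper.
        lines.append(str(seq_record.seq))
    return lines


lines = id_and_sequence_lines(StringIO(FASTA_TEXT))
print("\n".join(lines))
```

With the real file, passing open("aminoacids.txt") (or the filename itself to SeqIO.parse) in place of the StringIO object should produce the same ID-then-sequence layout. Sequences that span several lines in the FASTA file come back joined into one string.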


