Open and parse multiple .fasta files from a folder with a for loop and print/extract the sequences (Python)
I want to print the IDs and sequences from multiple .fasta files and also store them in an array, but I'm having trouble accessing the sequences themselves. I used SeqIO from Biopython to parse the .fasta files, and os and glob to get at the files in the folder. What am I doing wrong here? I'm really struggling with the code, since I don't have much programming experience. I don't get an error, but nothing is printed either. Any advice?
from Bio import SeqIO
import os, glob
folder_path = 'genome_nucseq_unique/data/'
for seq_record in SeqIO.parse(glob.glob(os.path.join(folder_path, '*.fasta')), "fasta"):
    print(seq_record.id)
    print(seq_record.id)
SeqIO.parse expects a single file: a path given as a str, bytes or os.PathLike object (or an open file handle), not a list like the one glob.glob() returns. Loop over the paths and parse each file in turn:
from Bio import SeqIO
import os, glob
folder_path = 'genome_nucseq_unique/data/'
# glob.glob() returns a list with one path per matching file
fasta_paths = glob.glob(os.path.join(folder_path, '*.fasta'))
for fasta_path in fasta_paths:
    print(fasta_path)
    # Parse one file at a time; each file may hold several records
    for seq_record in SeqIO.parse(fasta_path, "fasta"):
        print(seq_record.id)
        print(seq_record.seq)
        print()
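The question also asks about putting the IDs and sequences in an array, which the loop above does not do yet. One possible way is sketched here, assuming a plain Python list of tuples is an acceptable "array". The helper name collect_records and its pattern parameter are made up for this illustration and are not part of Biopython.

```python
import glob
import os

from Bio import SeqIO


def collect_records(folder_path, pattern="*.fasta"):
    """Return a list of (file name, record id, sequence string) tuples."""
    records = []
    # sorted() gives a reproducible order; glob.glob() does not promise one
    for fasta_path in sorted(glob.glob(os.path.join(folder_path, pattern))):
        # Each FASTA file may contain more than one record
        for seq_record in SeqIO.parse(fasta_path, "fasta"):
            file_name = os.path.basename(fasta_path)
            # str() turns the Seq object into an ordinary string
            records.append((file_name, seq_record.id, str(seq_record.seq)))
    return records


if __name__ == "__main__":
    # Folder name taken from the question
    for file_name, record_id, sequence in collect_records("genome_nucseq_unique/data/"):
        print(file_name, record_id, len(sequence))
```

If the returned list is empty, the glob pattern matched no files, so it is worth checking that folder_path is correct relative to the directory the script is run from.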