Opening and parsing multiple .fasta files from a folder with a for loop and printing/extracting the sequences (Python)

Question

I want to print the IDs and sequences from multiple .fasta files and also store them in an array, but I can't get access to the sequences themselves. I used SeqIO from Biopython to parse the .fasta files and tried os and glob to reach the files in the folder. What am I doing wrong? I'm struggling with the code because I don't have much programming experience. I don't get an error, but nothing is printed either. Any advice?

from Bio import SeqIO
import os,glob
folder_path = ('genome_nucseq_unique/data/')
for seq_record in SeqIO.parse(glob.glob(os.path.join(folder_path, '*.fasta')), "fasta"):
    print(seq_record.id)
    print(seq_record.id)

Answer 1

SeqIO.parse expects a str, bytes or os.PathLike object, not a list like the one glob.glob() returns. Modify your code to loop over the paths and parse each file in turn:

from Bio import SeqIO
import os, glob
folder_path = 'genome_nucseq_unique/data/'
fasta_paths = glob.glob(os.path.join(folder_path, '*.fasta'))
for fasta_path in fasta_paths:
    print(fasta_path)
    for seq_record in SeqIO.parse(fasta_path, "fasta"):
        print(seq_record.id)
        print(seq_record.seq)
        print()
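
The question also asks about putting the IDs and sequences in an array. A minimal sketch of one way to do that, building on the loop above: the function name collect_records, the pattern parameter and the tuple layout are assumptions made for illustration, not part of the original answer.

```python
import glob
import os

from Bio import SeqIO


def collect_records(folder_path, pattern="*.fasta"):
    """Return a list of (file name, record id, sequence string) tuples."""
    records = []
    # glob.glob returns paths in arbitrary order, so sort them for a stable result
    for fasta_path in sorted(glob.glob(os.path.join(folder_path, pattern))):
        # SeqIO.parse is called once per file, with a single path
        for seq_record in SeqIO.parse(fasta_path, "fasta"):
            records.append(
                (os.path.basename(fasta_path), seq_record.id, str(seq_record.seq))
            )
    return records


if __name__ == "__main__":
    # Folder name taken from the question; adjust it to your own data
    for file_name, record_id, sequence in collect_records("genome_nucseq_unique/data/"):
        print(file_name, record_id, len(sequence))
```

Converting seq_record.seq with str() stores plain strings; keep the Seq object instead if you need methods such as reverse_complement() later. If the list comes back empty, check that folder_path is correct relative to the directory the script is run from.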

Answered 2020-01-23 10:22:13
