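Modify a Python script to run on multiple input files
I'm new to Python. I have a Python script that runs on one specific file (input1.txt) and produces an output (output1.fasta), but I would like to run it on multiple files, for example input2.txt, input3.txt, ..., and produce the corresponding outputs: output2.fasta, output3.fasta, ...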
from Bio import SeqIO
fasta_file = "sequences.txt"
wanted_file = "input1.txt"
result_file = "output1.fasta"
wanted = set()
with open(wanted_file) as f:
    for line in f:
        line = line.strip()
        if line != "":
            wanted.add(line)
fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
with open(result_file, "w") as f:
    for seq in fasta_sequences:
        if seq.id in wanted:
            SeqIO.write([seq], f, "fasta")
I tried adding the glob function, but I don't know how to handle the output file names.
from Bio import SeqIO
import glob
fasta_file = "sequences.txt"
for filename in glob.glob('*.txt'):
    wanted = set()
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)
    fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")
The error message is: NameError: name 'result_file' is not defined
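Your glob currently picks up the "sequences" file as well as the inputs, because *.txt also matches sequences.txt. If the FASTA file is always the same and you only want to iterate over the input files, then you need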
for filename in glob.glob('input*.txt'):
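To see the difference between the two patterns without touching the disk, here is a small sketch using the standard-library fnmatch module, which applies the same shell-style wildcard rules. The file list and the helper name matching are made up for illustration; they are not part of the original script.

```python
from fnmatch import fnmatch

# Hypothetical directory listing, used only for illustration.
names = ["sequences.txt", "input1.txt", "input2.txt", "input3.txt", "notes.md"]


def matching(pattern, candidates):
    """Return the candidates that match a shell-style pattern, in order."""
    return [name for name in candidates if fnmatch(name, pattern)]


all_txt = matching("*.txt", names)            # includes sequences.txt
inputs_only = matching("input*.txt", names)   # only the input files

print(all_txt)
print(inputs_only)
```

With the broader pattern, sequences.txt would be read as if it were a list of wanted IDs; the narrower pattern leaves it out.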
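Also, to repeat the whole process for each file, you may want to put it in a function. And if the output file name is always built to correspond to the input, you can create that name dynamically.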
from Bio import SeqIO
import glob


def create_fasta_outputs(fasta_file, wanted_file):
    # Derive the output name from the input name, e.g. input2.txt -> output2.fasta
    result_file = wanted_file.replace("input", "output").replace(".txt", ".fasta")

    # Collect the wanted sequence IDs, skipping blank lines
    wanted = set()
    with open(wanted_file) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    # Write every record whose ID is wanted to the output file
    fasta_sequences = SeqIO.parse(open(fasta_file), 'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")


fasta_file = "sequences.txt"
for wanted_file in glob.glob('input*.txt'):
    create_fasta_outputs(fasta_file, wanted_file)
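One caveat with the replace chain above: str.replace changes every occurrence of "input", so a path such as input_data/input3.txt would become output_data/output3.fasta. If the inputs might live in a directory whose name contains "input", a variant along these lines may be safer. This is only a sketch; the helper name output_name_for is hypothetical, and it assumes the input file names start with "input".

```python
from pathlib import Path


def output_name_for(wanted_file):
    """Map e.g. 'input2.txt' to 'output2.fasta', changing only the file name."""
    path = Path(wanted_file)
    stem = path.stem  # file name without its last extension
    if stem.startswith("input"):
        # Swap only the leading 'input' prefix of the file name
        stem = "output" + stem[len("input"):]
    # Keep the original directory, replace name and extension
    return str(path.with_name(stem + ".fasta"))


print(output_name_for("input2.txt"))
```

Inside create_fasta_outputs, the line that sets result_file could then call output_name_for(wanted_file) instead of chaining replace.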