Modify Python script to run for multiple input files
I am very new to Python. I have a script that runs on one particular file (input1.txt) and generates an output (output1.fasta). I would like to run this script on multiple files, for example input2.txt, input3.txt, ..., and generate the corresponding outputs: output2.fasta, output3.fasta, ...
from Bio import SeqIO

fasta_file = "sequences.txt"
wanted_file = "input1.txt"
result_file = "output1.fasta"

wanted = set()
with open(wanted_file) as f:
    for line in f:
        line = line.strip()
        if line != "":
            wanted.add(line)

fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
with open(result_file, "w") as f:
    for seq in fasta_sequences:
        if seq.id in wanted:
            SeqIO.write([seq], f, "fasta")
I tried to add the glob function, but I do not know how to deal with the output file name.
from Bio import SeqIO
import glob

fasta_file = "sequences.txt"

for filename in glob.glob('*.txt'):
    wanted = set()
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    fasta_sequences = SeqIO.parse(open(fasta_file),'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")
The error message is: NameError: name 'result_file' is not defined
Your glob is currently picking up your "sequences" file as well as the inputs, because *.txt also matches sequences.txt. If the "fasta" file is always the same and you only want to iterate over the input files, then you need
for filename in glob.glob('input*.txt'):
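To see why the narrower pattern helps, here is a small sketch that applies both patterns to a made-up list of file names. It uses fnmatch.filter from the standard library, which does the same shell-style matching that glob relies on, so no real files are needed. The file names and the helper name match are only illustrative.

```python
import fnmatch

# Hypothetical directory listing, used only for illustration
names = ["sequences.txt", "input1.txt", "input2.txt", "notes.md"]


def match(pattern, names):
    """Return the names that match a shell-style pattern, sorted."""
    return sorted(fnmatch.filter(names, pattern))


broad = match("*.txt", names)        # also picks up sequences.txt
narrow = match("input*.txt", names)  # only the input files

print(broad)
print(narrow)
```

With the broad pattern, sequences.txt would be treated as a list of wanted IDs, which is not what you want.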
Also, to run your entire process once per file, you may want to put it in a function. And if the output filename always corresponds to the input filename, you can build it dynamically.
import glob

from Bio import SeqIO


def create_fasta_outputs(fasta_file, wanted_file):
    # Derive the output name from the input name, e.g. input2.txt -> output2.fasta
    result_file = wanted_file.replace("input", "output").replace(".txt", ".fasta")

    # Collect the non-empty IDs listed in the input file
    wanted = set()
    with open(wanted_file) as f:
        for line in f:
            line = line.strip()
            if line != "":
                wanted.add(line)

    # Write every sequence whose ID is wanted to the output file
    fasta_sequences = SeqIO.parse(open(fasta_file), 'fasta')
    with open(result_file, "w") as f:
        for seq in fasta_sequences:
            if seq.id in wanted:
                SeqIO.write([seq], f, "fasta")


fasta_file = "sequences.txt"
for wanted_file in glob.glob('input*.txt'):
    create_fasta_outputs(fasta_file, wanted_file)
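One caveat with the replace chain: str.replace changes every occurrence of "input", so a path such as input_data/input3.txt would also have its directory renamed. If that could matter for you, a possible alternative is to rewrite only the file name with pathlib. This is just a sketch: output_name_for is a hypothetical helper, and it assumes your input files are named input<something>.txt.

```python
from pathlib import Path


def output_name_for(wanted_file):
    """Map .../input<N>.txt to .../output<N>.fasta, leaving directories untouched.

    Hypothetical helper; assumes the file name starts with "input".
    """
    path = Path(wanted_file)
    stem = path.stem  # file name without the .txt suffix, e.g. "input3"
    if stem.startswith("input"):
        stem = "output" + stem[len("input"):]
    return path.with_name(stem + ".fasta")


print(output_name_for("input2.txt"))
print(output_name_for("input_data/input3.txt"))
```

The returned Path can be passed straight to open(), so it could stand in for the result_file line inside create_fasta_outputs.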