I have a perl script that can be executed from the console as follows:
perl perlscript.pl -i input.txt -o output.txt --append
I want to execute this script from my Python code. I figured out that subprocess.Popen can be used to run perl and pass my arguments to it. However, I also want to pass a variable (made by splitting up a text file) in place of input.txt. I have tried the following, but it doesn't work and raises a TypeError on line 8:
import re, shlex, subprocess, StringIO
f=open('fulltext.txt','rb')
text= f.read()
l = re.split('\n\n',str(text))
intxt = StringIO.StringIO()
for i in range(len(l)):
    intxt.write(l[i])
command_line='perl cnv_ltrfinder2gff.pl -i '+intxt+' -o output.gff --append'
args=shlex.split(command_line)
p = subprocess.Popen(args)
Is there a workaround for this?
EDIT: Here is a sample of the file fulltext.txt. Entries are separated by a blank line.
Predict protein Domains 0.021 second
>Sequence: seq1 Len:13143 [1] seq1 Len:13143 Location : 9 - 13124 Len: 13116 Strand:+ Score : 6 [LTR region similarity:0.959] Status : 11110110000 5'-LTR : 9 - 501 Len: 493 3'-LTR : 12633 - 13124 Len: 492 5'-TG : TG , TG 3'-CA : CA , CA TSR : NOT FOUND Sharpness: 1,1 Strand + : PBS : [14/20] 524 - 543 (LysTTT) PPT : [12/15] 12553 - 12567
Predict protein Domains 0.019 second
>Sequence: seq5 Len:11539 [1] seq5 Len:11539 Location : 7 - 11535 Len: 11529 Strand:+ Score : 6 [LTR region similarity:0.984] Status : 11110110000 5'-LTR : 7 - 506 Len: 500 3'-LTR : 11036 - 11535 Len: 500 5'-TG : TG , TG 3'-CA : CA , CA TSR : NOT FOUND Sharpness: 1,1 Strand + : PBS : [15/22] 515 - 536 (LysTTT) PPT : [11/15] 11020 - 11034
I want to separate them and pass each entry block to the perl script. All the files are in the same directory.
You might be interested in the os module and string formatting.
Edit
I think I understand what you want now. Correct me if I am wrong, but I think:
If this is what you want, you could use the following code.
import os
in_file = 'fulltext.txt'
seq = []
# collect the sequence name from every header line (">Sequence: seq1 ...")
with open(in_file, 'r') as handle:
    for line in handle:
        if line.startswith(">"):
            seq.append(line.rstrip().split(" ")[1])
# run the script once per sequence; this assumes a file <name>.txt exists for each
for x in seq:
    command = "perl cnv_ltrfinder2gff.pl -i %s.txt -o output.txt --append" % x
    os.system(command)
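If the script has to receive a real path through -i, one option is to write each block to a temporary file first. The original attempt failed because a StringIO object cannot be concatenated into a command string, and -i expects a file name in any case. The following is only a minimal sketch: `run_on_block`, `base_args` and `in_flag` are hypothetical names introduced here, not part of the perl script or of the code above.

```python
import os
import subprocess
import tempfile


def run_on_block(block, base_args, in_flag="-i"):
    """Write one block to a temporary file and pass its path after in_flag.

    base_args is the command without the input option, for example
    ['perl', 'cnv_ltrfinder2gff.pl', '-o', 'output.gff', '--append'].
    Returns the exit status of the child process.
    """
    # delete=False lets us close the file before the child opens it
    # (required on Windows); the file is removed manually afterwards.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(block)
        path = tmp.name
    try:
        # A list of arguments avoids shell quoting problems.
        return subprocess.call(list(base_args) + [in_flag, path])
    finally:
        os.remove(path)
```

It could be called once per block, e.g. `run_on_block(block, ['perl', 'cnv_ltrfinder2gff.pl', '-o', 'output.gff', '--append'])`, checking the returned status for non-zero values.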
The docs for the --infile option:
Path of the input file. If an input file is not provided, the program will expect input from STDIN.
You could omit --infile and pass input via a pipe (stdin) instead:
#!/usr/bin/env python
from subprocess import Popen, PIPE
with open('fulltext.txt') as file: # read input data
    blocks = file.read().split('\n\n')
# run a separate perl process for each block
args = 'perl cnv_ltrfinder2gff.pl -o output.gff --append'.split()
for block in blocks:
    p = Popen(args, stdin=PIPE, universal_newlines=True)
    p.communicate(block)
    if p.returncode != 0:
        print('non-zero exit status: %s on block: %r' % (p.returncode, block))
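Splitting on exactly '\n\n' can yield empty blocks when the file ends with blank lines or uses Windows line endings. A slightly more defensive variant is sketched below; `split_blocks` and `feed_block` are hypothetical helper names, and capturing the child's stdout is an addition for illustration (the script itself writes its result to the -o file).

```python
import re
from subprocess import Popen, PIPE


def split_blocks(text):
    """Split text into entries separated by one or more blank lines."""
    # Normalise Windows line endings, then drop leading/trailing whitespace.
    text = text.replace("\r\n", "\n").strip()
    # A blank line may contain stray whitespace, hence \s* between newlines.
    return [b for b in re.split(r"\n\s*\n", text) if b.strip()]


def feed_block(args, block):
    """Send one block to the child's stdin; return (exit status, stdout)."""
    p = Popen(args, stdin=PIPE, stdout=PIPE, universal_newlines=True)
    out, _ = p.communicate(block)
    return p.returncode, out
```

With these helpers the loop becomes `for block in split_blocks(text): status, out = feed_block(args, block)`, where `args` is the same argument list as above.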
You can run several instances of the perl script concurrently:
from multiprocessing.dummy import Pool # use threads
from subprocess import Popen, PIPE
def run(item):
    i, block = item # tuple parameters in the signature are Python 2 only
    filename = 'out%03d.gff' % i
    args = ['perl', 'cnv_ltrfinder2gff.pl', '-o', filename]
    p = Popen(args, stdin=PIPE, universal_newlines=True, close_fds=True)
    p.communicate(block)
    return p.returncode, filename
# blocks is the list of entries read from fulltext.txt
exit_statuses, filenames = zip(*Pool().map(run, enumerate(blocks, start=1)))
It runs several child processes in parallel (as many as there are CPUs on your system). You could specify a different number of worker threads by passing it to Pool().
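As an illustration of choosing the worker count explicitly, the sketch below wraps the same idea in a function. `run_all`, `make_args` and `workers` are hypothetical names used only for this example; `make_args(i)` is assumed to return the argument list for block number i, for instance `['perl', 'cnv_ltrfinder2gff.pl', '-o', 'out%03d.gff' % i]`.

```python
from multiprocessing.dummy import Pool  # thread-based pool
from subprocess import Popen, PIPE


def run_all(blocks, make_args, workers=4):
    """Feed each block to its own child process, at most `workers` at a time.

    Returns the exit statuses in the same order as `blocks`.
    """
    def run(item):
        i, block = item
        p = Popen(make_args(i), stdin=PIPE, universal_newlines=True)
        p.communicate(block)
        return p.returncode

    pool = Pool(workers)  # Pool() without an argument uses the CPU count
    try:
        return pool.map(run, enumerate(blocks, start=1))
    finally:
        pool.close()
        pool.join()
```

Threads are sufficient here because each worker only waits for its child process; the actual work happens in the perl processes.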