How can Python wait for a batch SGE script to finish execution?

I have a problem I'd like your help with.

I am working in Python and I want to do the following:

  • call an SGE batch script on a server
  • see if it works correctly
  • do something

What I do now is approximately the following:

    import subprocess
    try:
        tmp = subprocess.call(qsub ....)
        if tmp != 0:
            error_handler_1()
        else:
            correct_routine()
    except:
        error_handler_2()

My problem is that once the script has been submitted to SGE, my Python script interprets the submission as a success and carries on as if the job had already finished.

Do you have any suggestions on how I could make the Python code wait for the actual result of the SGE script?

By the way, I tried using qrsh, but I don't have permission to use it on this SGE installation.

Thanks!

From your code, you want the program to wait for the job to finish and then get its return code, right? If so, the qsub sync option (-sync y) is likely what you want:

http://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html
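As a rough sketch of how the sync option could fit into the code from the question: with -sync y, qsub should block until the job ends and, for a job that completes normally, exit with the job's exit code. The names build_qsub_command, submit_and_wait and my_job.sh are made up for illustration, and the runner parameter exists only so the function can be tried without a real SGE installation.

```python
import subprocess


def build_qsub_command(script_path, extra_args=()):
    """Build a qsub argument list that blocks until the job ends (-sync y)."""
    return ["qsub", "-sync", "y", *extra_args, script_path]


def submit_and_wait(script_path, extra_args=(), runner=subprocess.call):
    """Submit the script and return the exit status reported by qsub.

    runner is a hypothetical hook for testing; by default the command
    is executed with subprocess.call, which waits for qsub to return.
    """
    cmd = build_qsub_command(script_path, extra_args)
    return runner(cmd)


# Possible use in place of the original subprocess.call(qsub ....) line
# (my_job.sh is a placeholder name):
#
#     tmp = submit_and_wait("my_job.sh")
#     if tmp != 0:
#         error_handler_1()
#     else:
#         correct_routine()
```

Passing the command as a list avoids shell quoting problems; any extra qsub options for your site would go in extra_args.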

An additional answer, for easier processing: the Python drmaa module allows more complete control over SGE jobs. The working example below comes from its documentation and assumes a sleeper.sh script in the current directory. Note that the -b n option is needed to run a .sh script; otherwise a binary is expected by default.

    import os

    import drmaa


    def main():
        """Submit a job and wait for it to finish.

        Note: needs a file called sleeper.sh in the current directory.
        """
        s = drmaa.Session()
        s.initialize()
        print('Creating job template')
        jt = s.createJobTemplate()
        jt.remoteCommand = os.path.join(os.getcwd(), 'sleeper.sh')
        jt.args = ['42', 'Simon says:']
        jt.joinFiles = False
        # -b n: treat the command as a script rather than a binary
        jt.nativeSpecification = "-m abe -M mymail -q so-el6 -b n"
        jobid = s.runJob(jt)
        print('Your job has been submitted with id ' + jobid)
        # Block until the job ends, however long that takes
        retval = s.wait(jobid, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        print('Job: {0} finished with status {1}'.format(retval.jobId, retval.hasExited))
        print('Cleaning up')
        s.deleteJobTemplate(jt)
        s.exit()


    if __name__ == '__main__':
        main()
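The example prints hasExited, which only says that the job terminated, not that it succeeded. Here is a hedged sketch of how the value returned by s.wait might be turned into a success/failure decision. It assumes the result exposes the fields hasExited, hasSignal, wasAborted and exitStatus, as I recall from the drmaa-python documentation; check this against your installed version. job_succeeded and FakeJobInfo are hypothetical names, and FakeJobInfo is only a stand-in so the logic can be tried without a cluster.

```python
from collections import namedtuple

# Stand-in mirroring the field names assumed for the drmaa wait() result
FakeJobInfo = namedtuple(
    "FakeJobInfo",
    "jobId hasExited hasSignal terminatedSignal hasCoreDump "
    "wasAborted exitStatus resourceUsage",
)


def job_succeeded(info):
    """Return True only if the job ran, exited normally and returned 0."""
    if info.wasAborted:   # the job never ran to completion
        return False
    if info.hasSignal:    # the job was killed by a signal
        return False
    return bool(info.hasExited) and int(info.exitStatus) == 0


# Illustrative values only, not real scheduler output
ok = FakeJobInfo("1", True, False, None, False, False, 0, {})
failed = FakeJobInfo("2", True, False, None, False, False, 1, {})
print(job_succeeded(ok), job_succeeded(failed))
```

In the drmaa example, job_succeeded(retval) could then choose between correct_routine() and error_handler_1() from the question.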

