
Executing a bash command through a Python script run from an HTCondor job

I have a Python script that, at the end, runs an executable called "quickFit" through subprocess. Whenever I start my terminal I move to the quickFit directory and source setup.sh, so afterwards I can run that executable from anywhere. However, when I run the script in an HTCondor job, it fails. My shell is zsh. Here's an example:

test.py:

#!/usr/bin/env python

import subprocess
out = subprocess.check_output("quickFit -h", shell=True)
print(out)

test.sub:

executable              = ~/private/scripts/TEST.py
universe                = vanilla
log                     = ~/private/scripts/TEST/log.txt
error                   = ~/private/scripts/TEST/err.txt

should_transfer_files   = IF_NEEDED
when_to_transfer_output = ON_EXIT

queue 1

Running test.py directly results in the expected behaviour: the quickFit command runs and displays a list of options and possible arguments (-h is for help). This is exactly the same behaviour as when I run quickFit -h from my terminal.
Running condor_submit test.sub, however, results in the job ending prematurely and the err.txt file reporting a non-zero exit status 127: /bin/sh: quickFit: command not found

I have tried chmod -R 777 on everything in the quickFit directory, thinking it was a permissions issue, but that didn't help.
I have also tried (in Python) changing directory to the quickFit directory and re-sourcing setup.sh, but that caused even more problems.
Lastly, I tried adding getenv = True to the .sub file, which resulted in the following error: quickFit: error while loading shared libraries: libquickFit.so: cannot open shared object file: No such file or directory

There are two ways an administrator can configure an HTCondor pool -- either with a shared filesystem between the submit machine and the worker nodes, or without. It sounds like there is no shared filesystem between your two nodes, so you will need to tell HTCondor explicitly to transfer quickFit and any files it depends on, using transfer_input_files.
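In that case, the submit file needs a transfer_input_files line. A sketch with placeholder paths -- the actual file list depends on what quickFit needs at runtime (the libquickFit.so error above suggests at least the shared library must come along):

```
# Hypothetical paths -- adjust to where quickFit and its library actually live
transfer_input_files    = /path/to/quickFit/quickFit, /path/to/quickFit/libquickFit.so
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
```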

Otherwise, if there is a shared filesystem, or if quickFit is pre-installed on the worker node, try invoking popen with the full, absolute path to quickFit.
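To illustrate the absolute-path approach: invoking the binary by its full path bypasses PATH lookup entirely, so no shell setup is needed on the worker node. The quickFit location below is an assumption -- substitute the real install path:

```python
import subprocess

# Hypothetical absolute path -- replace with the real quickFit location.
QUICKFIT = "/path/to/quickFit/bin/quickFit"

def run_tool(binary, *args):
    """Invoke a binary by absolute path; no PATH or setup.sh needed."""
    return subprocess.check_output([binary, *args])

# run_tool(QUICKFIT, "-h") would then return quickFit's help text.
```

Passing a list instead of a string also avoids shell=True altogether, which sidesteps the /bin/sh issue described next.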

From the Popen documentation:

On POSIX with shell=True, the shell defaults to /bin/sh

and

If shell=True, on POSIX the executable argument specifies a replacement shell for the default /bin/sh

Since your quickFit command works in zsh, change test.py to:

#!/usr/bin/env python

import subprocess
# Use zsh (where quickFit is set up) instead of the default /bin/sh
out = subprocess.check_output("quickFit -h", shell=True, executable='/path/to/zsh')
print(out)

changing /path/to/zsh to whatever which zsh returns, for example:

which zsh
# /usr/local/bin/zsh on my mac with Mojave
# /usr/bin/zsh on my Kubuntu VM

This will invoke zsh rather than sh when running the command, and will work provided your zsh environment is set up the same way as usual (moving to the quickFit directory and sourcing setup.sh).
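If quickFit is still not found under zsh (for example because setup.sh is only sourced in interactive shells), you can source it explicitly inside the command string. A sketch; both paths are assumptions:

```python
import subprocess

def run_in_shell(cmd, shell_path="/bin/sh"):
    """Run cmd through the given shell (Popen's default is /bin/sh)."""
    return subprocess.check_output(cmd, shell=True, executable=shell_path)

# Hypothetical paths -- point these at your real setup.sh and zsh:
# out = run_in_shell("source /path/to/quickFit/setup.sh && quickFit -h",
#                    shell_path="/usr/bin/zsh")
```

This makes the job self-contained: the environment is rebuilt on every invocation instead of depending on shell startup files.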
