I have an R script which I'm running in the terminal by first generating a .ksh file called myscript.ksh with the following contents:
#!/bin/ksh
Rscript myscript.R 'Input1'
and then run it with
./myscript.ksh
which sends the script to a node on the cluster in our department (the processes that we send to the cluster must be as a .ksh file).
'Input1' is an input argument that the R script uses to do some analysis.
The issue that I now have is that I need to run this script a number of times with different input arguments to the function. One solution is to generate a few .ksh files, such as:
#!/bin/ksh
Rscript myscript.R 'Input2'
and
#!/bin/ksh
Rscript myscript.R 'Input3'
and then execute them separately, but I was hoping to find a better solution.
Note that I have to do this for 100 different input arguments, so it is not realistic to write 100 of these files. Is there a way of generating another file with the information that needs to be supplied to the function, e.g. 'Input1' 'Input2' 'Input3', and then running myscript.ksh for each of these individually?
For example, I could have a variable defining the name of the input arguments and then have a loop which would pass it to myscript.ksh. Is that possible?
The reason for running these in this manner is so that each iteration will hopefully be sent to a different node on the cluster, thus analysing the data at a much faster rate.
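You need to do two things: create an array of all your input arguments, then loop through the array and launch one call per item.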
The following illustrates the concept:
#!/bin/ksh
#Create array of inputs - space separator
inputs=(Input1 Input2 Input3 Input4)
# Loop through all the array items {0 ... n-1}
for i in {0..3}
do
echo ${inputs[i]}
done
This will output all the values in the inputs array.
You just need to replace the contents of the do-loop with:
Rscript myscript.R "${inputs[i]}"
Also, you may need to add ' &' at the end of the Rscript command line to run each Rscript command as a separate background process; otherwise the shell waits for each Rscript command to return before going on to the next.
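With 100 inputs, hard-coding the array and the index range gets awkward. Here is a minimal sketch of an alternative, assuming the inputs are stored one per line in a file. The file name inputs.txt, the function run_all and the RUNNER variable are hypothetical names introduced only for this illustration. It starts one background job per line and then waits for all of them:

```shell
#!/bin/ksh
# Sketch only: inputs.txt, run_all and RUNNER are illustrative names,
# not part of the original script.
# RUNNER is the command run for each input; it defaults to the Rscript call
# from the question.
RUNNER=${RUNNER:-"Rscript myscript.R"}

run_all() {
    # $1 = file with one input argument per line
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS= read -r input || [ -n "$input" ]; do
        # Skip empty lines
        [ -z "$input" ] && continue
        # Start each run in the background so the loop does not wait for it
        $RUNNER "$input" &
    done < "$1"
    # Block until every background job has finished
    wait
}

# Usage: run_all inputs.txt
```

Note that this starts every run at once on the machine where the script executes, so it does not by itself spread the work across nodes.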
EDIT:
Based on your comments, you need to actually generate .ksh scripts to submit to qsub. For this you just need to expand the do loop.
For example:
#!/bin/ksh
#Create array of inputs - space separator
inputs=(Input1 Input2 Input3 Input4)
# Loop through all the array items {0 ... n-1}
for i in {0..3}
do
cat > submission.ksh << EOF
#!/bin/ksh
Rscript myscript.R ${inputs[i]}
EOF
chmod u+x submission.ksh
qsub submission.ksh
done
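The two EOF markers delimit a here-document: the lines between them are fed to cat as input (STDIN), and cat's output (STDOUT) is written to submission.ksh. Because the EOF marker is unquoted, ${inputs[i]} is expanded when the file is written, so each generated script contains the actual input value.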
Then submission.ksh is made executable with the chmod command.
And then the script is submitted via qsub. I'll let you fill in any other arguments you need for qsub.
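Because the loop above rewrites submission.ksh on every pass, it can be handy to keep one script per input instead, for example to inspect or resubmit a single job later. The following is a minimal sketch under that assumption. The functions write_job and submit_all, the SUBMIT variable and the submit_<input>.ksh naming are all hypothetical, and the inputs are assumed to be simple tokens that are safe to use in a file name and contain no single quotes:

```shell
#!/bin/ksh
# Sketch only: write_job, submit_all, SUBMIT and the submit_<input>.ksh naming
# are illustrative assumptions, not part of the original answer.
# SUBMIT is the submission command; set SUBMIT=echo for a dry run that only
# prints the generated script names.
SUBMIT=${SUBMIT:-qsub}

write_job() {
    # $1 = input argument, $2 = path of the script to generate
    # The unquoted EOF lets $1 expand now, so the generated file contains
    # the literal input value.
    cat > "$2" << EOF
#!/bin/ksh
Rscript myscript.R '$1'
EOF
    chmod u+x "$2"
}

submit_all() {
    # Generate and submit one script per argument
    for input in "$@"; do
        job="submit_${input}.ksh"
        write_job "$input" "$job"
        $SUBMIT "$job"
    done
}

# Usage: submit_all Input1 Input2 Input3
```

Running it once with SUBMIT=echo is a cheap way to check the generated files before anything reaches the queue.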
If the full list of inputs is not known when the script starts, you can make a .ksh file called mycode.ksh that takes a single input as its argument:
#!/bin/ksh
if [ $# -ne 1 ]; then
echo "Usage: $0 input"
exit 1
fi
# Or run it in the background with nohup ... & (that is a separate question)
Rscript myscript.R "$1"
and then run it with ./mycode.ksh inputX
If all the arguments are known up front, you can use a loop:
#!/bin/ksh
if [ $# -eq 0 ]; then
echo "Usage: $0 input(s)"
exit 1
fi
for input in "$@"; do
Rscript myscript.R "${input}"
done
and then run it with
./mycode.ksh input1 input2 "input with space in double quotes" input4
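The quoted argument in that call only stays in one piece if the loop iterates over "$@"; an unquoted $* is split again on whitespace. A small sketch of the difference (the function names count_star and count_at are made up for this illustration):

```shell
#!/bin/ksh
# Sketch only: count_star and count_at are illustrative helpers.

count_star() {
    # Unquoted $* is re-split on whitespace, so "a b" becomes two words
    n=0
    for arg in $*; do n=$((n + 1)); done
    echo "$n"
}

count_at() {
    # "$@" expands to exactly the arguments that were passed in
    n=0
    for arg in "$@"; do n=$((n + 1)); done
    echo "$n"
}

count_star input1 "input with space" input4   # prints 5
count_at   input1 "input with space" input4   # prints 3
```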