I have a large set of jobs using the exact same script, just different input and output files.
How can I run several hundred at a time?
(we have Slurm and SGE)
On SGE the test script looks like:
#!/bin/bash
#$ -N myjobname
#$ -l h_vmem=100M
#$ -pe smp 1
#$ -cwd
myscript infile.txt outfile.txt
Curator Jpessin
That sounds like a good use of the “Array” option. Array jobs are a general-purpose way to run the same code many times with something different for each run (inputs, outputs, random seeds, etc.), which is what you want for most simple parameter sweeps. Take a look at:
Grid Engine (SGE & co.): http://wiki.gridengine.info/wiki/index.php/Simple-Job-Array-Howto
or
Slurm: Slurm Workload Manager - Job Array Support
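For comparison, a minimal Slurm version of the same idea could look like the sketch below. This is an assumption-laden example, not taken from the original post: it reuses the hypothetical myscript from the question and mirrors the SGE resource requests with roughly equivalent #SBATCH options. Slurm exposes the task number as $SLURM_ARRAY_TASK_ID.

```shell
#!/bin/bash
#SBATCH --job-name=myjobname
#SBATCH --mem=100M
#SBATCH --cpus-per-task=1
#SBATCH --array=1-100
# Slurm runs this script once per array task; each run sees its own
# SLURM_ARRAY_TASK_ID (1..100), which we substitute into the file names.
myscript infile.$SLURM_ARRAY_TASK_ID.txt outfile.$SLURM_ARRAY_TASK_ID.txt
```

You would submit it once with sbatch, and the scheduler fans it out into 100 tasks.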
With SGE, for simple situations use -t <m>-<n> in your submit script, where <m> and <n> are the first and last task numbers in the range. Each task gets its own unique $SGE_TASK_ID from that range, but all tasks share a single $JOB_ID.
So if you have input files infile.1.txt to infile.100.txt:
#!/bin/bash
#$ -N myjobname
#$ -l h_vmem=100M
#$ -pe smp 1
#$ -cwd
#$ -t 1-100
myscript infile.$SGE_TASK_ID.txt outfile.$SGE_TASK_ID.txt
The directives above apply to each individual task separately. The -t 1-100 line makes the scheduler run the script once for every task ID in the range, with that task's ID substituted into the input and output file names.
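Before submitting, it can help to dry-run the file-name expansion outside the scheduler. The sketch below simulates the first three task IDs by setting SGE_TASK_ID in a plain loop (normally the scheduler sets it for you) and echoes the command each task would run; myscript is the hypothetical program from the question, so nothing is actually executed.

```shell
#!/bin/bash
# Simulate the first three array tasks: for each ID, build the command
# line that task would run and print it instead of executing it.
for SGE_TASK_ID in 1 2 3; do
  cmd="myscript infile.$SGE_TASK_ID.txt outfile.$SGE_TASK_ID.txt"
  echo "$cmd"
done
```

If the printed file names look right, the same substitution will hold for all 100 tasks under the scheduler.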