Slurm arrays: Easily submitting many similar jobs at once

How can I submit many similar jobs at once using Slurm?

General idea

Using Slurm’s job-array functionality in sbatch, plus a little Linux shell scripting, you can easily submit anywhere from a couple to tens of thousands of similar jobs at once.

  • create a text file with all the parameters you wish to vary in your set of similar jobs: one line per job-variation (the example below will explain)
  • create a single sbatch script, but use variables where appropriate
    • establish how many “sub”-jobs you want, and their numbering range, using --array
    • arrays don’t need to start at 1, nor must they increment by 1 (see the sketch after this list)
    • check the sbatch manpage for more usage info
  • utilize the $SLURM_ARRAY_TASK_ID environment variable and Linux scripting/tools to populate necessary variables
    • the --array parameter defines the range of $SLURM_ARRAY_TASK_ID values
    • the value of $SLURM_ARRAY_TASK_ID is unique to each sub-job variation
    • all basic Linux tools are available: below is just one method of many
  • submit the single script using sbatch
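
For reference, here is a minimal sketch of the --array forms described in the sbatch manpage (the ranges, steps, and limits below are purely illustrative):

#SBATCH --array=0-9          # ten sub-jobs, task IDs 0 through 9
#SBATCH --array=1-100:10     # start at 1, step by 10 (1, 11, 21, ... 91)
#SBATCH --array=3,6,9        # an explicit, comma-separated list of task IDs
#SBATCH --array=1-500%20     # 500 sub-jobs, but at most 20 running at once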

Simple example

Let’s say you’ve written a routine in R that analyzes data from a given county, where each county’s data is stored in an external file, like a CSV. You could submit one job for each county, each with a command like…

Rscript analyze.r <county file>

…but it might be easier to slightly adjust your script and let an array do the repetitive work, especially if you have many thousands of input files!

Examine the two demonstration files shown below. The first is just a list of filenames, which you might prepare with a command like ls *county.csv > input_file_list.txt. The other is a jobscript I will submit to Monsoon using Slurm’s sbatch command.

In that jobscript, I use Awk to read an individual line from input_file_list.txt and assign the contents of that line to the csv_filename variable. Then I simply use that variable in the command instead of an explicit filename. (Each job-variation’s unique $SLURM_ARRAY_TASK_ID value determines which line Awk pulls.)

input_file_list.txt

apache_county.csv
cochise_county.csv
coconino_county.csv
gila_county.csv

my_job.sh

#!/bin/bash
#SBATCH --array=1-4                 # max num of array elements is 50,000
#SBATCH --output=output_%A_%a.txt   # %A/%a will be replaced by job/array nums
 
# Pull one single line/filename from the list
# (In Awk, "NR==x" means "where row# is x")
csv_filename=$( awk "NR==$SLURM_ARRAY_TASK_ID" input_file_list.txt )

# Run the Rscript on the input CSV
# (first load the R module, so we can use Rscript)
module load R
Rscript analyze.r "$csv_filename"   # quoted, in case a filename contains spaces

(A third file named “analyze.r” is also referenced in that jobscript for demonstrative purposes. It merely represents a general-purpose R script that expects one CSV file as an input. For the purposes of this exercise, it also prints results to stdout where Slurm will redirect them to the output file specified at the top of the script.)
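
Before submitting, it can help to dry-run the Awk extraction on a login node by setting the task-ID variable yourself; this is just an interactive sanity check, not part of the jobscript:

$ SLURM_ARRAY_TASK_ID=3
$ awk "NR==$SLURM_ARRAY_TASK_ID" input_file_list.txt
coconino_county.csv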


Advanced example: usage

Examine the following two demonstration files. The first has some statistics on four states, and the second is a script I will submit to Monsoon using Slurm’s sbatch command. As in the previous example, I use Awk to pull an individual line from states.txt, but this time I also use Cut to read the individual parts of that line into their own variables.

(Please note: this is just one of a million possible variations on the idea!)

states.txt

L1,Alabama,135765,4903185
L2,Alaska,1717856,710249
L3,Arizona,295234,7278717
L4,Arkansas,137732,3017804

my_array.sh

#!/bin/bash
#SBATCH --array=1-4                 # max num of array elements is 50,000
#SBATCH --output=output_%A_%a.txt   # %A/%a will be replaced by job/array nums
 
## load up variables we might need
line_N=$( awk "NR==$SLURM_ARRAY_TASK_ID" states.txt )  # NR means row-# in Awk
field_2=$( echo "$line_N" | cut -d "," -f 2 )  # grab comma-delim'd field #2
field_3=$( echo "$line_N" | cut -d "," -f 3 )  # grab comma-delim'd field #3
field_4=$( echo "$line_N" | cut -d "," -f 4 )  # grab comma-delim'd field #4

## Use the variables as needed
pop_dens=$(( field_4 / field_3 ))  # Bash arithmetic is integer-only (truncates)
echo "ARR TASK#: $SLURM_ARRAY_TASK_ID"
echo "FULL LINE: $line_N"
echo "VAR PARTS: $field_2, area $field_3, pop $field_4"
echo "Population density of $field_2 is $pop_dens per sq km"
echo "---"

Advanced example: output

You would then simply submit one sbatch job: sbatch my_array.sh
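
Each array task then appears individually in the queue, suffixed with its task ID. For example (using the same hypothetical job ID as below):

$ sbatch my_array.sh
Submitted batch job 1234567
$ squeue -u $USER   # tasks are listed as 1234567_1, 1234567_2, etc.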

Assuming your job ID was 1234567, the contents of your output files would then be:

$ ls output_1234567_*
output_1234567_1.txt  output_1234567_2.txt
output_1234567_3.txt  output_1234567_4.txt

$ cat output_1234567_*.txt
ARR TASK#: 1
FULL LINE: L1,Alabama,135765,4903185
VAR PARTS: Alabama, area 135765, pop 4903185
Population density of Alabama is 36 per sq km
---
ARR TASK#: 2
FULL LINE: L2,Alaska,1717856,710249
VAR PARTS: Alaska, area 1717856, pop 710249
Population density of Alaska is 0 per sq km
---
ARR TASK#: 3
FULL LINE: L3,Arizona,295234,7278717
VAR PARTS: Arizona, area 295234, pop 7278717
Population density of Arizona is 24 per sq km
---
ARR TASK#: 4
FULL LINE: L4,Arkansas,137732,3017804
VAR PARTS: Arkansas, area 137732, pop 3017804
Population density of Arkansas is 21 per sq km
---
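
As a closing aside, the three Cut calls in my_array.sh could be collapsed into a single Bash read; this is just one of the “million possible variations” mentioned above (the variable name line_label is mine, holding the leading L1/L2/... field):

## Split the comma-delimited line in one step instead of three cut calls
IFS=',' read -r line_label field_2 field_3 field_4 <<< "$line_N"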
