On a Linux cluster with a SLURM scheduler, how do I discover the amount of temporary disk space available on each node so I can configure my job and set sbatch --tmp?
CURATOR: Katia
ANSWER:
Clusters may be set up slightly differently, and which partition is recommended for use as "temp" space can differ from cluster to cluster. One way to explore the environment under any scheduler is to submit a job that executes the env command. Here is an example of the output of this command on the c3ddb cluster:
SLURM_CHECKPOINT_IMAGE_DIR=/scratch/users/koleinik
SLURM_NODELIST=node005
...
TMPDIR=/tmp
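The probe job described above can be sketched as a minimal batch script (the job name, output file, and time limit are placeholders; adjust them for your cluster):

```shell
#!/bin/bash
#SBATCH --job-name=env-probe      # hypothetical job name
#SBATCH --output=env-probe.out    # file to capture the environment dump
#SBATCH --time=00:01:00           # a minute is plenty for env

# Print every environment variable the scheduler sets for the job;
# look for TMPDIR (and the SLURM_* variables) in env-probe.out.
env | sort
```

Submit it with `sbatch env-probe.sh`, then inspect env-probe.out once the job completes.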
ANSWER: The amount of temporary disk space that Slurm has configured for each node can be displayed with the sinfo command.
sinfo -l -N
will list every node/partition combination along with various data, including
a TMP_DISK field giving the available temporary disk space for the node in MB.
As Katia mentioned, what this temporary disk is/where it is mounted might
vary from cluster to cluster, but the value returned by sinfo should be the
same as is used by the scheduler when trying to meet the temporary disk requirement
specified by the --tmp flag to sbatch.
Sample output:
Tue Apr 17 11:12:25 2018
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
compute-0001 1 standard allocated 20 2:10:1 128000 750000 1 (null) none
compute-0001 1 scavenger allocated 20 2:10:1 128000 750000 1 (null) none
compute-0002 1 standard allocated 20 2:10:1 128000 750000 1 (null) none
...
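Putting the two pieces together, a hedged sketch (the --Format field names and the 100G value below are illustrative; check `man sinfo` and `man sbatch` on your cluster):

```shell
# List only each node and its TMP_DISK value (reported in MB);
# "nodelist" and "disk" are sinfo --Format (-O) field names.
sinfo -N -O nodelist,disk

# Request 100 GB of temporary disk for a job. The scheduler will only
# place the job on nodes whose TMP_DISK satisfies the request.
sbatch --tmp=100G myjob.sh
```

If a node's TMP_DISK is smaller than the --tmp request, the job will stay pending rather than run on that node.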