We have node types in a partition of our cluster with different numbers of CPU cores:
Feature      CPU cores   Memory
dodeca96gb   24          96 GB
icosa192gb   40          192 GB
The features are defined in slurm.conf:
NodeName=compt[199-290] Procs=24 CoresPerSocket=12 RealMemory=95000 Feature=dodeca96gb
NodeName=compt[291-316] Procs=40 CoresPerSocket=20 RealMemory=191000 Feature=icosa192gb
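For reference, the same mapping can be confirmed from the command line; something like the following (using sinfo's documented %f/%c/%m output fields and the batch partition used in the script below) should list node names, features, core counts, and memory:
sinfo -p batch -o "%N %f %c %m"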
I want an allocation that provides an equal number of CPU cores (120) from each feature type.
Following ‘man sbatch’ (or slurm.schedmd.com/sbatch.html), the constraint syntax for requesting counts of nodes by feature appears to be:
#SBATCH --constraint=[dodeca96gb*5&icosa192gb*3]
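With those node counts, 5 x 24 = 120 cores from dodeca96gb and 3 x 40 = 120 cores from icosa192gb, i.e. exactly the even split I am after.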
I use the following Slurm declarations:
#!/bin/bash
#SBATCH -p batch
#SBATCH -N 8
#SBATCH --constraint=[dodeca96gb*5&icosa192gb*3]
#SBATCH -n 240
#SBATCH --ntasks-per-core=1
#SBATCH --exclusive
This, however, results in an error from Slurm (we run 19.05):
error: Batch job submission failed: Requested node configuration is not available
If I do not specify the number of nodes, the job receives resources, but Slurm treats my declarations as a minimum resource request and allocates more nodes than I want.
All of the above is to support running a benchmark, so I’m particular about the node configuration.
I have come to understand that this amounts to a ‘heterogeneous job step’, which, from the documentation, is not a standard type of job supported by Slurm.
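For completeness, my (untested) reading of the 19.05 heterogeneous-job documentation is that such a request would be split into per-feature components separated by a packjob directive, roughly as sketched below; I have not verified this on our cluster, and I am not sure how --exclusive and the task counts interact across components:

#!/bin/bash
#SBATCH -p batch
#SBATCH -N 5
#SBATCH --constraint=dodeca96gb
#SBATCH -n 120
#SBATCH --ntasks-per-core=1
#SBATCH --exclusive
#SBATCH packjob
#SBATCH -p batch
#SBATCH -N 3
#SBATCH --constraint=icosa192gb
#SBATCH -n 120
#SBATCH --ntasks-per-core=1
#SBATCH --exclusive

# benchmark launch would go here; the 19.05 docs suggest srun --pack-group
# is needed to span both components, but I have not tried it.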
Thanks,
~ Em
PS Thanks to Julie for suggesting this post, and to those who offered their thoughts when I made a similar post to the cc list ( :