What is a partition?

I’m not sure what hardware is on each partition. Can you explain what I get when I specify a partition?

The Discovery cluster has many compute nodes (physical machines), and each node has many CPUs, GPUs, etc. When you submit a job through SLURM, you pass parameters that describe the hardware you are requesting, for example the number of CPUs per task, the total number of CPUs, and so on. All of the parameters are documented in the man pages for srun and sbatch.

An additional parameter for srun/sbatch is the partition. You must specify a partition when you submit your job. Each partition is a group of compute nodes, and partitions differ in the processor architectures they include and in how many nodes they contain. All users at NEU have access to certain partitions by default (short, debug, express, gpu) and need approval for certain others (long, large, multigpu). The approval process is linked from the RC documentation page below.
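
To make the idea of "partition plus hardware parameters" concrete, here is a minimal sketch in Python that only builds the text of a batch script; it does not submit anything. The helper name build_sbatch_script, the default values, and the example command are assumptions made for illustration. The #SBATCH options used (--partition, --ntasks, --cpus-per-task, --time) are standard sbatch options, but the right values for a given partition should be checked against the RC documentation.

```python
# Hypothetical helper for illustration: returns the text of an sbatch script.
# It does not talk to SLURM; it only formats the directives as a string.
def build_sbatch_script(partition, command, ntasks=1, cpus_per_task=1,
                        time_limit="01:00:00"):
    # time_limit default is an assumed placeholder, not a partition limit.
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",          # which partition to run in
        f"#SBATCH --ntasks={ntasks}",                # number of tasks
        f"#SBATCH --cpus-per-task={cpus_per_task}",  # CPUs for each task
        f"#SBATCH --time={time_limit}",              # wall-clock limit (HH:MM:SS)
        "",
        command,                                     # the work the job runs
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: one task with 4 CPUs on the "short" partition.
    print(build_sbatch_script("short", "srun hostname", cpus_per_task=4), end="")
```

The printed text could be saved to a file (for example job.sh, a hypothetical name) and submitted with sbatch job.sh. Running sinfo on the cluster lists the partitions and the state of their nodes.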

For more information about partitions, please refer to https://rc-docs.northeastern.edu/en/latest/hardware/partitions.html