I am an LSF user moving to SLURM. What are a few basic commands that I can use to get started?

I am an expert user of a cluster with an LSF Scheduler and need to use a cluster with a SLURM Scheduler. Can someone help me get started by translating the following simple examples from LSF to SLURM?

What Jobs are currently running?
bjobs -u all -r

What Jobs am I currently running?
bjobs -r -u username

Launch an interactive session on one node with 16 cores:
bsub -I -n 16

Launch a batch job on one node with 16 cores:
bsub -n 16

Cancel a batch job
bkill -J jobname

Cancel all my jobs
bkill -u myusername 0

CURATOR: John Goodhue

ANSWER: The HPC Wales Portal has a useful mapping of LSF commands to Slurm equivalents - see
http://portal.hpcwales.co.uk/wordpress/index.php/index/slurm/migrating-jobs/

There’s also the Slurm Rosetta Stone: https://slurm.schedmd.com/rosetta.html

Some quick examples (note: some clusters default the squeue command to displaying all users' jobs, others to only your own jobs):

## Your jobs:
squeue -u $USER

## All jobs:
squeue -u \*

## Show only running jobs for all users
squeue -u \* --state=RUNNING

## Cancel a job
scancel <JOBID>

## Cancel all of my jobs
scancel -u $USER
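
The question also asks about cancelling a job by name (bkill -J jobname). A rough Slurm counterpart is scancel with --name, sketched here as a small shell helper. The function name cancel_by_name is made up for illustration, and restricting the match to your own user is an assumption about what you want:

```shell
# Hypothetical helper: cancel every job of yours whose job name matches $1.
# --name and --user are standard scancel filters; the wrapper itself is only a sketch.
cancel_by_name() {
    if [ -z "$1" ]; then
        echo "usage: cancel_by_name <jobname>" >&2
        return 1
    fi
    scancel --name="$1" --user="${USER:-$(id -un)}"
}
```

Usage would be something like `cancel_by_name myjob`. Note that this cancels all of your jobs with that name, not just the most recent one.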

The two common job start commands in Slurm are (descriptions taken from the man pages):

  • srun: Run a parallel job on cluster managed by Slurm. If necessary, srun will first create a resource allocation in which to run the parallel job.
  • sbatch: submits a batch script to Slurm. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input. The batch script may contain options preceded with “#SBATCH” before any executable commands in the script.
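
To tie srun and sbatch back to the 16-core examples in the question, here is a minimal sketch. The file name hello_job.sh, the job name, and the time limit are placeholders, and your cluster may also require a partition or account option, so treat this as a starting point rather than a site-correct script:

```shell
#!/bin/bash
# Write a small batch script. All values below are illustrative placeholders.
cat > hello_job.sh <<'EOF'
#!/bin/bash
# Hypothetical job name
#SBATCH --job-name=hello
# Keep the whole job on a single node
#SBATCH --nodes=1
# 16 tasks; with the default of one CPU per task this requests 16 cores
#SBATCH --ntasks=16
# Placeholder wall-clock limit (HH:MM:SS)
#SBATCH --time=00:10:00
# %j expands to the job ID
#SBATCH --output=hello_%j.out

# Launch one copy of the command per task inside the allocation
srun hostname
EOF

# Submit only if Slurm is actually installed on this machine
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh
else
    echo "sbatch not found; wrote hello_job.sh without submitting"
fi

# Interactive counterpart to "bsub -I -n 16" (run by hand on a login node):
#   srun --nodes=1 --ntasks=16 --pty bash -i
```

For a single multithreaded program, `--ntasks=1 --cpus-per-task=16` is usually the better fit than 16 separate tasks. The same options can also be passed on the sbatch command line instead of as #SBATCH lines.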