I am an expert user of a cluster with an LSF scheduler and need to use a cluster with a SLURM scheduler. Can someone help me get started by translating the following simple examples from LSF to SLURM?
What jobs are currently running? bjobs -u all
What jobs am I currently running? bjobs
Launch an interactive session on one node with 16 cores: bsub -I -n 16
Launch a batch job on one node with 16 cores: bsub -n 16
Some quick ones (note: some clusters configure squeue to show all users' jobs by default, others to show only your own):
## Your jobs:
squeue -u $USER
## All jobs (squeue with no user filter):
squeue
## Show only running jobs for all users:
squeue --states=RUNNING
## Cancel a job
scancel <JOBID>
## Cancel all of my jobs
scancel -u $USER
The two common job start commands in Slurm are (descriptions taken from the man pages):
srun: Run a parallel job on a cluster managed by Slurm. If necessary, srun will first create a resource allocation in which to run the parallel job.
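For example, the interactive session from the question (bsub -I -n 16) might translate roughly as the sketch below; the trailing bash assumes you want an interactive shell, and partition, account, and time-limit options are site-specific and omitted here:

# Interactive shell on one node with 16 cores
srun --nodes=1 --ntasks=16 --pty bash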
sbatch: Submit a batch script to Slurm. The batch script may be given to sbatch through a file name on the command line, or if no file name is specified, sbatch will read in a script from standard input. The batch script may contain options preceded with "#SBATCH" before any executable commands in the script.
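As a rough equivalent of the batch example (bsub -n 16), a minimal script might look like this sketch; the job name, time limit, and ./my_program are placeholders, and your site may require extra options such as a partition or account:

#!/bin/bash
#SBATCH --job-name=example   # placeholder job name
#SBATCH --nodes=1            # one node
#SBATCH --ntasks=16          # 16 tasks (cores)
#SBATCH --time=01:00:00      # adjust the walltime to your needs

srun ./my_program            # placeholder for your executable

Submit it with sbatch myscript.sh (myscript.sh being whatever you named the file), or pipe the script to sbatch on standard input as described above.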