Slurm Workshop: Summary
Goal: Provide a summary of concepts and commands covered.
Slurm Commands Covered
| Command | Description |
|---|---|
| `salloc` | Create an interactive session. Notice use of short and long form options. Format for: |
| `sbatch` | Submit a job to the cluster. |
| `scancel` | Cancel a pending/running job. Variations: |
| `squeue` | View the status of your currently running jobs. |
| `sacct` | View jobs that have finished. Use the |
| | View the … Only accurate if the job completes successfully. If a job fails with an Out-Of-Memory (OOM) error, this will not be accurate. |
| `sinfo` | View the status of the Slurm partitions/nodes. |
| | Print a table showing active projects and jobs. |
| | Print a node list with allocated jobs - can query individual nodes. |
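
As a rough illustration of the short and long option forms mentioned for interactive sessions, and of two common ways to cancel jobs, the sketch below only builds and prints the command lines rather than running them. The account name `myproject`, the time limit, and the job ID `123456` are hypothetical placeholders, not values from this workshop.

```shell
#!/usr/bin/env bash
# Sketch only: the account name, time limit, and job ID are hypothetical placeholders.
ACCOUNT="myproject"
WALLTIME="01:00:00"
JOBID="123456"

# Interactive session: the short (-A, -t) and long (--account, --time) forms are equivalent.
SALLOC_SHORT="salloc -A ${ACCOUNT} -t ${WALLTIME}"
SALLOC_LONG="salloc --account=${ACCOUNT} --time=${WALLTIME}"

# Cancelling: a single job by ID, or every job owned by the current user.
SCANCEL_ONE="scancel ${JOBID}"
SCANCEL_MINE="scancel -u ${USER:-$(id -un)}"

# Print the commands instead of executing them, so this is safe to run anywhere.
printf '%s\n' "$SALLOC_SHORT" "$SALLOC_LONG" "$SCANCEL_ONE" "$SCANCEL_MINE"
```

Copy a printed line into a terminal on the cluster to run it; your site may require additional options such as a partition.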
Summary
Looked at:
What Slurm is and the core functionality it provides.
How to start an interactive session using `salloc`, and perform job submission using `sbatch`.
How to select appropriate resource allocations.
How to monitor your jobs using `squeue` and `sacct`.
What a general workflow looks like: use a small/short interactive session to test and debug, then submit large/long jobs.
Best practices in using HPC: do not perform computation on the login nodes, and be mindful of the resources you actually require and request.
How to be a good cluster citizen with respect to general cluster use and other users.
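
The workflow above (test small, then submit and monitor) could be sketched roughly as follows. This is a minimal example under stated assumptions: the file name `test_job.sh`, the account `myproject`, and all resource values are hypothetical and should be replaced with what your job actually needs. The script is only submitted if `sbatch` is found on the machine.

```shell
#!/usr/bin/env bash
# Sketch only: account name, resource values, and file name are hypothetical.
JOB_SCRIPT="test_job.sh"

# Write a small batch script. The quoted 'EOF' keeps variables such as
# SLURM_JOB_ID unexpanded until the job itself runs.
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --job-name=test_job
#SBATCH --account=myproject
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --output=slurm-%j.out

echo "Job ${SLURM_JOB_ID} running on $(hostname)"
EOF

# Catch shell syntax errors before submitting.
bash -n "$JOB_SCRIPT" || exit 1

if command -v sbatch >/dev/null 2>&1; then
  # --parsable prints the job ID, optionally followed by ";cluster".
  JOBID=$(sbatch --parsable "$JOB_SCRIPT") || exit 1
  JOBID=${JOBID%%;*}
  # Status while the job is pending or running.
  squeue -j "$JOBID"
  echo "After it finishes: sacct -j $JOBID --format=JobID,JobName,State,Elapsed,MaxRSS"
else
  echo "sbatch not found; run this on a cluster login node to submit $JOB_SCRIPT"
fi
```

Requesting small values first (here 10 minutes, 1 CPU, 1 GB) keeps a test job cheap; comparing what `sacct` reports against the request can then guide the allocation for larger jobs.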
Use the following link to provide feedback on this training: Intro to Job Scheduling - Evaluation, or use the QR code below.