Goal: Introduction to Slurm: how to start interactive sessions, submit jobs, and monitor them.

...

Info
  • You submit a job to the queue and walk away.

  • Monitor its progress/state using command-line and/or email notifications.

  • Once complete, come back and analyze results.
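
The submit-and-walk-away workflow above, with email notifications, could be sketched as a batch script like the one below. This is a hypothetical example, not the script used in this training: the account, time limit, and email address are placeholders, and the variable names are illustrative. The `--mail-type` and `--mail-user` options are standard `sbatch` options.

```shell
#!/bin/bash
# Hypothetical batch script (e.g. run.sh). All values below are placeholders.
#SBATCH --account=arccantrain
#SBATCH --time=00:05:00
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=you@example.edu

# SLURM_JOB_ID is set by Slurm inside a job; fall back to a marker otherwise.
job_id=${SLURM_JOB_ID:-not-running-under-slurm}

echo "SLURM_JOB_ID: ${job_id}"
echo "Start: $(date +'%D %T')"
# ... your actual work goes here ...
echo "End: $(date +'%D %T')"
```

You would submit this with `sbatch run.sh`. Slurm then emails the given address when the job begins, ends, or fails, so you do not need to watch the queue.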

...

Submit Jobs: sbatch: Example

...

Info
  • By default, an output file named slurm-<job-id>.out is generated in the directory you submitted the job from.

  • You can view this file while the job is still running. Only view, do not edit.
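
As a minimal sketch of viewing the output file without editing it, the snippet below builds the default file name from a job ID and prints the last lines with `tail`, which only reads the file. The helper name `job_output_file` is hypothetical, and the job ID is the example ID used on this page.

```shell
# Build the default Slurm output file name for a given job ID.
job_output_file() {
  printf 'slurm-%s.out\n' "$1"
}

# Example job ID; replace with the ID that sbatch reported for your job.
out_file=$(job_output_file 13526340)

# Only read the file. Use "tail -f" instead to follow it while the job runs.
if [ -f "$out_file" ]; then
  tail -n 20 "$out_file"
fi
```

Using read-only tools such as `tail`, `cat`, or `less` avoids accidentally modifying a file the job is still writing to.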

...

Submit Jobs: squeue: What’s happening? Continued

...

Code Block
[]$ squeue -u arcc-t05
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          13526340     moran   run.sh arcc-t05  R       0:29      1 m233
[]$ squeue -u arcc-t05
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)

[]$ cat slurm-13526340.out
SLURM_JOB_ID: 13526340
Start: 03/22/24 09:38:36
Python version: 3.10.6 (main, Oct 17 2022, 16:47:32) [GCC 12.2.0]
Version info: sys.version_info(major=3, minor=10, micro=6, releaselevel='final', serial=0)
End: 03/22/24 09:39:36
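
The pattern shown above (run squeue until the job no longer appears, then read the output file) can be automated. The following is a minimal sketch, not part of Slurm: `wait_for_job` is a hypothetical helper that relies only on squeue's `-h` (no header) and `-j` (job ID) options, and the polling interval is an assumption.

```shell
# Poll squeue until the given job no longer appears in the queue.
# Usage: wait_for_job <job-id> [seconds-between-checks]
wait_for_job() {
  job_id=$1
  interval=${2:-30}
  # With -h, squeue prints nothing once the job has left the queue.
  # Note: the loop also ends if squeue itself fails (errors are discarded).
  while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
    sleep "$interval"
  done
  echo "Job $job_id is no longer in the queue"
}
```

For example, `wait_for_job 13526340 && cat slurm-13526340.out` would print the output file once the job has left the queue. A job that has left the queue may have completed, failed, or been cancelled, so still check the output file.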

...

Code Block
# Lots more information
[]$ squeue --help
[]$ man squeue

# Display more columns:
# For example, TimeLeft shows how much of your requested wall time remains:
[]$ squeue -u arcc-t05 --Format="Account,UserName,JobID,SubmitTime,StartTime,TimeLeft"
ACCOUNT             USER                JOBID               SUBMIT_TIME         START_TIME          TIME_LEFT
arccantrain         arcc-t05            1795458             2024-08-14T10:31:07 2024-08-14T10:31:09 6-04:42:51
arccantrain         arcc-t05            1795453             2024-08-14T10:31:06 2024-08-14T10:31:07 6-04:42:49
arccantrain         arcc-t05            1795454             2024-08-14T10:31:06 2024-08-14T10:31:07 6-04:42:49
...
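
The TIME_LEFT values above use the form days-hours:minutes:seconds. If you want to compare or sort them in a script, one option is to convert them to seconds. The helper below is a hypothetical sketch (`timeleft_to_seconds` is not a Slurm command) and assumes the value looks like `D-HH:MM:SS`, `HH:MM:SS`, or `MM:SS`; anything else, such as a non-numeric value, makes it return a non-zero status.

```shell
# Convert a squeue time value (D-HH:MM:SS, HH:MM:SS or MM:SS) to seconds.
timeleft_to_seconds() {
  printf '%s\n' "$1" | awk '{
    # Reject anything that is not digits separated by "-" and ":".
    if ($0 !~ /^([0-9]+-)?[0-9]+(:[0-9]+)?(:[0-9]+)?$/) exit 1
    days = 0
    rest = $0
    dash = index(rest, "-")
    if (dash > 0) {
      days = substr(rest, 1, dash - 1)
      rest = substr(rest, dash + 1)
    }
    # Fold the colon-separated fields into seconds.
    n = split(rest, part, ":")
    secs = 0
    for (i = 1; i <= n; i++) secs = secs * 60 + part[i]
    print days * 86400 + secs
  }'
}

timeleft_to_seconds "6-04:42:51"   # prints 535371
```

Here 6 days, 4 hours, 42 minutes, and 51 seconds come to 535371 seconds of remaining wall time.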

...