...

Slurm automatically detects whether you have an investment and tries to run your job on it first.

  1. If jobs belonging to users who are not part of your project are running on your investment, Slurm will preempt those jobs (i.e. stop them and return them to the queue) and immediately start your job.

  2. If your investment is 'full' with jobs from users who are members of your project, Slurm will try to allocate your job on the other partitions, if resources are available. The other partitions are tried in this order: moran, teton, teton-cascade, teton-gpu and teton-hugemem.

  3. If there are no resources available to fit your job (i.e. cluster usage is very high), your job will have a state of pending (i.e. waiting in the queue). Slurm checks the queue at regular intervals and runs the job when appropriate resources become available.
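The three cases above can be sketched as a small decision function. This is only an illustrative model of the order described here, not Slurm's actual scheduling code: the function name, its arguments, and the idea of counting capacity in whole jobs are all hypothetical simplifications. Only the partition names come from the text.

```python
# Illustrative model only: this is NOT how Slurm is implemented.
# Partition names come from the text; everything else is hypothetical.
FALLBACK_PARTITIONS = ["moran", "teton", "teton-cascade", "teton-gpu", "teton-hugemem"]


def place_job(investment_jobs, investment_capacity, free_partitions, project_members):
    """Return a short description of where a new job would go.

    investment_jobs: list of usernames, one per job running on the investment
    investment_capacity: number of jobs the investment can hold (a simplification)
    free_partitions: set of partition names that currently have room
    project_members: set of usernames belonging to the investing project
    """
    # The investment is tried first: if it has room, the job runs there.
    if len(investment_jobs) < investment_capacity:
        return "run on investment"
    # Case 1: a non-member's job occupies the investment, so it is preempted.
    if any(user not in project_members for user in investment_jobs):
        return "preempt and run on investment"
    # Case 2: the investment is full of members' jobs; try the other partitions in order.
    for partition in FALLBACK_PARTITIONS:
        if partition in free_partitions:
            return f"run on {partition}"
    # Case 3: nothing fits, so the job waits in the queue.
    return "pending"
```

For instance, under these assumptions `place_job(["bob"], 1, set(), {"alice"})` falls into case 1, because bob is not a member of the project that owns the investment.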

A job can be pending for a number of reasons, but Slurm shows only one of them at a time, which can be confusing. For example, it might report BadConstraint (the job's constraints cannot be satisfied) or Resources (the job is waiting for resources to become available). A full list can be found here: https://slurm.schedmd.com/squeue.html#SECTION_JOB-REASON-CODES
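As a minimal sketch of how to look up the reason Slurm is currently reporting for a job, the following calls squeue with its standard `-h` (no header), `-j` (job ID) and `-o` (output format) options, using the `%T` (state) and `%r` (reason) format fields. The helper names and the `|` separator are choices made for this example, and it assumes squeue is available on your PATH.

```python
import subprocess


def parse_state_and_reason(line):
    """Split one 'STATE|REASON' line of squeue output into a (state, reason) tuple."""
    state, _, reason = line.strip().partition("|")
    return state, reason


def job_state_and_reason(job_id):
    """Ask squeue for the state and reason of one job (requires Slurm on PATH)."""
    result = subprocess.run(
        ["squeue", "-h", "-j", str(job_id), "-o", "%T|%r"],
        capture_output=True,
        text=True,
        check=True,
    )
    lines = result.stdout.splitlines()
    # squeue prints nothing once a job has left the queue.
    if not lines:
        return "", ""
    return parse_state_and_reason(lines[0])
```

A line such as `PENDING|Resources` is returned as the tuple `("PENDING", "Resources")`. Remember that the reason shown is only the one Slurm has chosen to report.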

...

However, the fact that your job has been accepted into the queue (i.e. you have a job number) means it will eventually run when appropriate resources become available.

...