...

Request your desired resources without defining a partition

Here, we assume:
(tick) your project is associated with an investment
(tick) and you did not define a partition in your request

  1. Slurm will identify a list of partitions to try to run your job against.

  2. Slurm will automatically prioritize your investment partition and hardware to try to run your job there. 

  3. Slurm checks for jobs running on your investment.

    1. Slurm checks if:
      (tick) there are no jobs running on your investment
      (tick) and the hardware requested in your job fits within the hardware confines of your partition

      1. Slurm immediately starts your job.

    2. Slurm checks if:
      (tick) there are jobs running on your investment
      (tick) and they were requested by HPC users who are not members of any project associated with your investment

      1. Slurm will pre-empt these jobs (i.e. stop them and add them back to the queue).

      2. Slurm will immediately start your job.

    3. Slurm checks if:
      (tick) your investment hardware is 'full' with jobs from users who are members of project(s) associated with your investment

      1. Slurm will try to allocate your requested job across the other partitions

        1. On MedicineBow: The other partitions are tried in this order: mb, mb-a30, mb-l40s, mb-h100.

        2. On Beartooth: The other partitions are tried in this order: moran, teton, teton-cascade, teton-gpu, teton-hugemem.

      2. Slurm checks if:
        (tick) resources are available

        1. Slurm will start your job

      3. Slurm checks if:
        (tick) there are no resources available to fit your job (i.e. cluster usage is very high)

        1. Slurm will place your job in the queue and your job will have a state of pending (i.e. waiting in the queue).

        2. Slurm will monitor the queue at regular intervals and run the job when appropriate resources become available.
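
The decision flow above can be sketched as a small Python model. This is only an illustration of the order in which the cases are considered, not how Slurm is implemented: the `Investment` class, the `place_job` function, the partition name `inv-demo`, and the use of core counts as the only resource are all assumptions made for this example. The fallback list is the MedicineBow order given above.

```python
from dataclasses import dataclass

# Fallback order on MedicineBow, as listed in the steps above.
MEDICINEBOW_FALLBACK = ["mb", "mb-a30", "mb-l40s", "mb-h100"]


@dataclass
class Investment:
    """Hypothetical snapshot of an investment partition (cores only)."""
    name: str
    total_cores: int
    member_cores_in_use: int       # used by members of the investment's projects
    non_member_cores_in_use: int   # used by other HPC users (can be pre-empted)


def place_job(cores, investment, free_cores_by_partition,
              fallback=MEDICINEBOW_FALLBACK):
    """Return (action, partition) for a job submitted without a partition."""
    free = (investment.total_cores
            - investment.member_cores_in_use
            - investment.non_member_cores_in_use)

    # Case 1: the job fits on the investment hardware right away.
    if cores <= free:
        return ("start", investment.name)

    # Case 2: the job fits once non-member jobs are pre-empted.
    if cores <= free + investment.non_member_cores_in_use:
        return ("preempt-and-start", investment.name)

    # Case 3: the investment is full of member jobs, so try the
    # other partitions in their listed order.
    for partition in fallback:
        if free_cores_by_partition.get(partition, 0) >= cores:
            return ("start", partition)

    # Nothing fits anywhere: the job waits in the queue as pending.
    return ("pending", None)
```

For example, with a fully member-occupied 64-core investment, `place_job(32, Investment("inv-demo", 64, 64, 0), {"mb": 16, "mb-a30": 48})` skips `mb` (too few free cores) and selects `mb-a30`.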

...

Explicitly Define a Partition

Here, we assume:
(tick) your project is associated with an investment
(tick) and you explicitly define a partition in your request

  1. Slurm will only use that partition (and the nodes within it) to allocate your job.

  2. Slurm checks if:
    (tick) you define a combination of nodes/cores/memory/gpus that is not available within the explicitly defined partition

    1. Slurm will reject the job and provide an appropriate message.
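
The kind of check described above can be illustrated with a short Python sketch. It is a simplified model for reasoning about a request before submitting it, not Slurm's own validation: the `PartitionLimits` class, the `check_request` function, and the limit values in the usage note are hypothetical and do not describe any real partition.

```python
from dataclasses import dataclass


@dataclass
class PartitionLimits:
    """Hypothetical per-partition hardware limits."""
    nodes: int
    cores_per_node: int
    mem_gb_per_node: int
    gpus_per_node: int


def check_request(limits, nodes, cores_per_node, mem_gb_per_node,
                  gpus_per_node=0):
    """Return a list of reasons the request cannot fit (empty if it fits)."""
    problems = []
    if nodes > limits.nodes:
        problems.append(f"nodes: requested {nodes}, partition has {limits.nodes}")
    if cores_per_node > limits.cores_per_node:
        problems.append(
            f"cores: requested {cores_per_node} per node, "
            f"limit is {limits.cores_per_node}")
    if mem_gb_per_node > limits.mem_gb_per_node:
        problems.append(
            f"memory: requested {mem_gb_per_node} GB per node, "
            f"limit is {limits.mem_gb_per_node} GB")
    if gpus_per_node > limits.gpus_per_node:
        problems.append(
            f"gpus: requested {gpus_per_node} per node, "
            f"limit is {limits.gpus_per_node}")
    return problems
```

With made-up limits such as `PartitionLimits(nodes=4, cores_per_node=32, mem_gb_per_node=256, gpus_per_node=0)`, a request for one GPU per node returns a single `gpus` problem, which corresponds to the case where the job is not accepted.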

...