...
Once configuration is initialized or updated, your investment is available to the associated projects on the cluster.
Jobs run under projects associated with an investment
...
Any job run on the HPC should specify an account in its submission/request. Slurm processes jobs from projects associated with an investment in one of two ways, depending on whether you define a partition in your job request:
Request your desired resources without defining a partition
Here, we assume:
your project is associated with an investment
and you did not define a partition in your request
Slurm will identify a list of partitions to try to run your job against.
Slurm will automatically prioritize your investment partition and hardware to try to run your job there.
Slurm checks for jobs running on your investment.
Slurm checks if:
there are no jobs running on your investment
and the hardware requested in your job fits within the hardware confines of your partition
Slurm immediately starts your job.
Slurm checks if:
there are jobs running on your investment
and they were requested by HPC users who are not members of any project associated with your investment
Slurm will pre-empt these jobs (i.e. stop them and add them back to the queue).
Slurm will immediately start your job.
Slurm checks if:
your investment hardware is 'full' with jobs from users who are members of project(s) associated with your investment
Slurm will try to allocate your requested job across the other partitions.
On MedicineBow, the other partitions are tried in this order: mb, mb-a30, mb-l40s, mb-h100.
On Beartooth, the other partitions are tried in this order: moran, teton, teton-cascade, teton-gpu, teton-hugemem.
Slurm checks if:
resources are available
Slurm will start your job.
Slurm checks if:
there are no resources available to fit your job (i.e. cluster usage is very high)
Slurm will place your job in the queue and your job will have a state of pending (i.e. waiting in the queue).
Slurm will monitor the queue at regular intervals and run the job when appropriate resources become available.
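The flow above can be summarised as a small decision function. This is only an illustrative model of the steps described on this page, not how Slurm is implemented; the names `InvestmentUsage`, `Outcome` and `place_job` are made up for this sketch, and it assumes the requested hardware fits within your investment.

```python
from enum import Enum


class InvestmentUsage(Enum):
    """Hypothetical summary of what is currently running on your investment."""
    IDLE = "no jobs running on the investment"
    NON_MEMBER_JOBS = "jobs from users outside the investment's projects"
    FULL_WITH_MEMBER_JOBS = "full with jobs from the investment's own projects"


class Outcome(Enum):
    START_ON_INVESTMENT = "job starts immediately on the investment"
    PREEMPT_THEN_START = "non-member jobs are pre-empted, then the job starts"
    START_ON_OTHER_PARTITION = "job starts on one of the other partitions"
    PENDING = "job waits in the queue"


def place_job(usage: InvestmentUsage, other_partitions_have_room: bool) -> Outcome:
    """Model of a request that does NOT define a partition.

    Assumes the requested hardware fits within the investment.
    """
    if usage is InvestmentUsage.IDLE:
        return Outcome.START_ON_INVESTMENT
    if usage is InvestmentUsage.NON_MEMBER_JOBS:
        return Outcome.PREEMPT_THEN_START
    # The investment is full with member jobs: fall back to the other partitions.
    if other_partitions_have_room:
        return Outcome.START_ON_OTHER_PARTITION
    return Outcome.PENDING


if __name__ == "__main__":
    for usage in InvestmentUsage:
        for room in (True, False):
            print(f"{usage.name:22} other room={room!s:5} -> {place_job(usage, room).name}")
```

Note that `other_partitions_have_room` only matters in the last case, which matches the description above: the other partitions are tried only when the investment is full with jobs from its own projects.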
A job can be pending for a number of reasons, and Slurm will only show one of the many potential reasons, which can be a bit confusing. For example, it might state BadConstraint (The job's constraints can not be satisfied) or Resources (The job is waiting for resources to become available). A full list can be found here: https://slurm.schedmd.com/squeue.html#SECTION_JOB-REASON-CODES
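If you want to look up the reason from a script, one possible approach is to ask squeue for just the state and reason of a job. The sketch below uses the squeue options `-h` (no header), `-j` (job id) and `-o` with the `%T` (state) and `%r` (reason) fields; the helper names and the job id `12345` are hypothetical, and `job_state_and_reason` only works on a machine where squeue is available.

```python
import subprocess


def squeue_command(job_id: str) -> list[str]:
    """Build a squeue call that prints 'STATE|REASON' for a single job."""
    return ["squeue", "-h", "-j", job_id, "-o", "%T|%r"]


def parse_state_and_reason(line: str) -> tuple[str, str]:
    """Split one 'STATE|REASON' line, e.g. 'PENDING|Resources'."""
    state, _, reason = line.strip().partition("|")
    return state, reason


def job_state_and_reason(job_id: str) -> tuple[str, str]:
    """Run squeue for the job (requires Slurm on the current machine)."""
    result = subprocess.run(
        squeue_command(job_id), capture_output=True, text=True, check=True
    )
    return parse_state_and_reason(result.stdout)


if __name__ == "__main__":
    # Parsing a sample line, so this runs without Slurm installed.
    print(parse_state_and_reason("PENDING|Resources\n"))
```

On the cluster you would call `job_state_and_reason("12345")` with your own job id in place of the hypothetical one.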
...
Explicitly Define a Partition
Here, we assume:
your project is associated with an investment
and you explicitly define a partition in your request
Slurm will use only THAT partition (and its nodes) to allocate your job.
Slurm checks if:
you define a combination of nodes/cores/memory/gpus that is not available within the explicitly defined partition
Slurm will not accept the job and will provide an appropriate message.
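To make the difference between the two kinds of request concrete, here is a minimal sketch that generates the header of a batch script with or without a partition line. The helper `sbatch_header`, the account name `myproject` and the time limit are assumptions for illustration; `--account`, `--time` and `--partition` are standard sbatch options, and `mb` is one of the MedicineBow partitions listed on this page.

```python
from typing import Optional


def sbatch_header(account: str, time_limit: str, partition: Optional[str] = None) -> str:
    """Return the #SBATCH header lines for a job script.

    The account is always included; the partition line is only added when
    a partition is explicitly defined.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --account={account}",
        f"#SBATCH --time={time_limit}",
    ]
    if partition is not None:
        lines.append(f"#SBATCH --partition={partition}")
    return "\n".join(lines)


if __name__ == "__main__":
    # No partition: Slurm chooses, trying your investment first.
    print(sbatch_header("myproject", "01:00:00"))
    print()
    # Explicit partition: Slurm only considers that partition.
    print(sbatch_header("myproject", "01:00:00", partition="mb"))
```

Leaving the partition out gives Slurm the freedom to try your investment first and then the other partitions, whereas adding the `--partition` line restricts the job to that one partition.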
...