...

“The job's QOS has reached its aggregate CPU limit.”

This message arises because every project on Beartooth is subject to a maximum concurrent core usage policy, which limits the total number of CPUs that can be allocated at one time across all users in a project.

In a nutshell, the user's project as a whole has submitted enough jobs that the total number of CPUs in use (across all jobs, potentially spanning both their investment and resources outside their investment) has reached the project's Quality of Service (QOS) CPU limit. ARCC reviews this limit on a regular basis. As of March 28, 2023, it is set to 100% of the non-investor partition (5920 cores), plus the project's investment if it has one.
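The arithmetic behind this limit can be illustrated with a small Python sketch. It is only an illustration, not how Slurm or ARCC implement the check: the function names are hypothetical, the 256-core investment is a made-up figure, and the only value taken from this page is the 5920-core non-investor partition size as of March 28, 2023.

```python
# Size of the non-investor partition quoted on this page (as of March 28, 2023).
NON_INVESTOR_CORES = 5920


def qos_cpu_limit(investment_cores=0):
    """Project-wide CPU limit: 100% of the non-investor partition,
    plus the project's own investment (0 if it has none)."""
    return NON_INVESTOR_CORES + investment_cores


def would_exceed_limit(running_cpus, requested_cpus, investment_cores=0):
    """True if starting a job that requests `requested_cpus` would push the
    project's total CPU usage above its limit (hypothetical helper)."""
    return running_cpus + requested_cpus > qos_cpu_limit(investment_cores)


if __name__ == "__main__":
    # A project with no investment.
    print(qos_cpu_limit())                 # 5920
    # A project with an assumed (made-up) 256-core investment.
    print(qos_cpu_limit(256))              # 6176
    # 5900 CPUs already in use across the project; a 32-CPU job must wait.
    print(would_exceed_limit(5900, 32))    # True
    # A 20-CPU job lands exactly on the limit, so it can still start.
    print(would_exceed_limit(5900, 20))    # False
```

Under these assumptions, a job held with this message starts once enough of the project's running jobs finish to bring the combined total back under the limit.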

...