...
Slurm Partition name | Requestable features | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
mb | amd, epyc | 25 | 2 | 48 | 1 | 96 | 1024 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4 TB SSD | RHEL 9.3 | For compute jobs running on the latest and greatest MedicineBow hardware. | MB Compute with 1TB RAM |
mb-a30 | amd, epyc | 8 | | | | | 768 | | | | DL Inference, AI, Mainstream Acceleration | MB Compute with 24GB RAM/GPU & A30 GPUs |
mb-l40s | amd, epyc | 5 | | | | | 768 | | | | DL Inference, Omniverse/Rendering, Mainstream Acceleration | MB Compute with 48GB RAM/GPU & L40S GPUs |
mb-h100 | amd, epyc | 6 | | | | | 768 or 1228 | | | | DL Training and Inference, DA, AI, Mainstream Acceleration | MB Compute with 80GB RAM/GPU & NVIDIA SXM5 H100 GPUs |
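As a minimal sketch of how one of these partitions might be requested from a batch script (assuming the partition names shown in the table above; the account name `myproject` and the resource sizes are placeholders to replace with your own), a job asking for a single H100 GPU could look like:

```bash
#!/bin/bash
#SBATCH --account=myproject     # placeholder project/account name; replace with your own
#SBATCH --partition=mb-h100     # H100 GPU partition from the table above
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:1            # request one GPU on the allocated node
#SBATCH --time=01:00:00

# Report which GPU was actually allocated to the job
nvidia-smi --query-gpu=name,memory.total --format=csv
```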
Former Beartooth Hardware (to be consolidated into MedicineBow or retired - pending)
Slurm Partition name | Requestable features | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
moran | fdr, intel, sandy, ivy, community | 273 | 2 | 8 | 1 | 16 | 64 or 128 | Intel Sandy Bridge/Ivy Bridge | 1 TB HD | RHEL 8.8 | For compute jobs not needing the latest and greatest hardware. | Original Moran compute |
moran-bigmem | fdr, intel, haswell | 2 | 2 | 8 | 1 | 16 | 512 | Intel Haswell | 1 TB HD | RHEL 8.8 | For jobs not needing the latest hardware, w/ above average memory requirements. | Moran compute w/ 512G of RAM |
moran-hugemem | fdr, intel, haswell, community | 2 | 2 | 8 | 1 | 16 | 1024 | Intel Haswell | 1 TB HD | RHEL 8.8 | For jobs that don’t need the latest hardware, w/ escalated memory requirements. | Moran compute w/ 1TB of RAM |
dgx | edr, intel, broadwell | 2 | 2 | 20 | 2 | 40 | 512 | Intel Broadwell | 7 TB SSD | RHEL 8.8 | For GPU and AI-enabled workloads. | Special DGX GPU compute nodes |
teton | edr, intel, broadwell, community | 175 | 2 | 16 | 1 | 32 | 128 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For regular compute jobs. | Teton compute |
teton-cascade | edr, intel, cascade, community | 56 | 2 | 20 | 1 | 40 | 192 or 768 | Intel Cascade Lake | 240 GB SSD | RHEL 8.8 | For compute jobs on newer (though still prior-generation) hardware, w/ somewhat higher memory requirements. | Teton compute w/ Cascade Lake CPUs |
teton-gpu | edr, intel, broadwell, community | 6 | 2 | 16 | 1 | 32 | 512 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs utilizing GPUs on prior cluster hardware. | Teton GPU compute |
teton-hugemem | edr, intel, broadwell | 8 | 2 | 16 | 1 | 32 | 1024 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs w/ large memory requirements, running on fast prior cluster hardware. | Teton compute w/ 1TB of RAM |
teton-massmem | edr, amd, epyc | 2 | 2 | 24 | 1 | 48 | 4096 | AMD EPYC | 4096 GB SSD | RHEL 8.6 | For compute jobs w/ exceedingly demanding memory requirements. | Teton compute w/ 4TB of RAM |
teton-knl | edr, intel, knl | 12 | 1 | 72 | 4 | 72 | 384 | Intel Knights Landing | 240 GB SSD | RHEL 8.8 | For jobs using many cores on a single node where speed isn't critical. | Teton compute w/ Intel Knights Landing CPUs |
beartooth | edr, intel, icelake | 2 | 2 | 28 | 1 | 56 | 256 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For general compute jobs running with newer hardware | Beartooth compute |
beartooth-gpu | edr, intel, icelake | 4 | 2 | 28 | 1 | 56 | 250 or 1024 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For compute jobs needing GPU. | Beartooth GPU compute |
beartooth-bigmem | edr, intel, icelake | 6 | 2 | 28 | 1 | 56 | 515 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For jobs w/ above average memory requirements, on newer hardware. | Beartooth compute w/ 512G of RAM |
beartooth-hugemem | edr, intel, icelake | 8 | 2 | 28 | 1 | 56 | 1024 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For jobs w/ large memory requirements, on newer hardware. | Beartooth compute w/ 1TB of RAM |
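The entries in the "Requestable features" column can be combined with Slurm's `--constraint` option to pin a job to a particular hardware generation. A minimal sketch, assuming the partition and feature names from the table above (the executable name is a placeholder):

```bash
#!/bin/bash
#SBATCH --partition=teton-cascade   # Cascade Lake nodes from the table above
#SBATCH --constraint=cascade        # feature name from the "Requestable features" column
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40        # a full teton-cascade node provides 40 cores
#SBATCH --time=02:00:00

srun ./my_program                   # placeholder executable; replace with your own workload
```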
...
The ARCC clusters include a number of compute nodes that contain GPUs. The following table lists each type of GPU available, the nodes that provide it, and its key specifications.
GPU Type | Partition | Example slurm value to request | # of Nodes | GPU devices per node | CUDA Cores | Tensor Cores | GPU Memory Size (GB) | Compute Capability |
---|---|---|---|---|---|---|---|---|
Tesla P100 | (all available on …) | | 8 | 2 | 3584 | 0 | 16 | 6.0 |
V100 | | | 2 | 8 | 5120 | 640 | 16/32 | 7.0 |
A30 | | | 15 | 7 on BT/non-investor, 8 on MedicineBow | 3584 | 224 | 24 | 8.0 |
T4 | | | 2 | 3 | 2560 | 320 | 16 | 7.5 |
L40S | | | 5 | 8 | | 568 TC/GPU on MB | 48GB/GPU | |
H100 | | | 6 | 8 | 16896 FP32 CUDA/GPU | 528 TC/GPU on MB | 80GB/GPU | |
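Before requesting a specific GPU type, the GRES that each partition actually advertises can be checked with standard Slurm tooling. A minimal sketch, assuming the GRES type names match the GPU names in the table above (e.g. `a30`); the exact names depend on how the cluster is configured:

```bash
# Show each partition, its node count, and the generic resources (GPUs) it advertises
sinfo -o "%P %D %G"

# Request one A30 GPU by type, assuming the GRES type name matches the table above
sbatch --partition=mb-a30 --gres=gpu:a30:1 --wrap="nvidia-smi -L"
```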
...
```bash
#SBATCH --cpus-per-task=70              # Fails with: sbatch: error: CPU count per node can not be satisfied

#SBATCH --cpus-per-task=70
#SBATCH --partition=teton-knl           # Job is allocated
```
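The first request fails because the nodes it would otherwise be scheduled on have fewer than 70 cores per node, whereas a teton-knl node provides 72 cores (see the table above), so the second request is allocated. One way to confirm the per-node CPU layout before submitting is a sketch like the following (standard Slurm options; output formatting will vary):

```bash
# Show CPUs per node and the sockets:cores:threads layout for the teton-knl partition
sinfo -p teton-knl -o "%P %c %z"
```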
Specialty Hardware
ARCC also offers some specialty hardware for unique workloads. This hardware is currently still in the development and testing phase.
GPU Type | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor | Local Disks | Use Case | Notes |
---|---|---|---|---|---|---|---|---|---|---|
GH200 | 2 | 1 | 72 | 1 | 72 | 480 | NVIDIA Grace™ | 1TB SSD | Specialty nodes specifically designed for LLM inference, vector database search, and large data processing. | Not generally available to the public. |