...
New MedicineBow Hardware
Slurm Partition name | Requestable features | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
mb | amd, epyc | 25 | 2 | 48 | 1 | 96 | 1024 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4TB SSD | RHEL 9.3 | For compute jobs running on the latest and greatest MedicineBow hardware | MB Compute with 1TB RAM |
mb-a30 | amd, epyc | 8 | | | | | 768 | | | | DL Inference, AI, Mainstream Acceleration | MB Compute with 24GB RAM/GPU & A30 GPU |
mb-l40s | amd, epyc | 5 | | | | | 768 | | | | DL Inference, Omniverse/Rendering, Mainstream Acceleration | MB Compute with 48GB RAM/GPU & L40S GPU |
mb-h100 | amd, epyc | 6 | | | | | 1228 | | | | DL Training and Inference, DA, AI, Mainstream Acceleration | MB Compute with 80GB RAM/GPU & Nvidia SXM5 H100 GPU |
Former Beartooth Hardware (to be consolidated into MedicineBow or retired - pending)
Slurm Partition name | Requestable features | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
moran | fdr, intel, sandy, ivy, community | 273 | 2 | 8 | 1 | 16 | 64 or 128 | Intel Ivybridge/ | 1 TB HD | RHEL 8.8 | For compute jobs not needing the latest and greatest hardware. | Original Moran compute |
moran-bigmem | fdr, intel, haswell | 2 | 2 | 8 | 1 | 16 | 512 | Intel Haswell | 1 TB HD | RHEL 8.8 | For jobs not needing the latest hardware, w/ above average memory requirements. | Moran compute w/ 512G of RAM |
moran-hugemem | fdr, intel, haswell, community | 2 | 2 | 8 | 1 | 16 | 1024 | Intel Haswell | 1 TB HD | RHEL 8.8 | For jobs that don’t need the latest hardware, w/ escalated memory requirements. | Moran compute w/ 1TB of RAM |
dgx | edr, intel, broadwell | 2 | 2 | 20 | 2 | 40 | 512 | Intel Broadwell | 7 TB SSD | RHEL 8.8 | For GPU and AI-enabled workloads. | Special DGX GPU compute nodes |
teton | edr, intel, broadwell, community | 175 | 2 | 16 | 1 | 32 | 128 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For regular compute jobs. | Teton compute |
teton-cascade | edr, intel, cascade, community | 56 | 2 | 20 | 1 | 40 | 192 or 768 | Intel Cascade Lake | 240 GB SSD | RHEL 8.8 | For compute jobs on the newer generation of older hardware, w/ somewhat higher memory requirements. | Teton compute w/ Cascade Lake CPUs |
teton-gpu | edr, intel, broadwell, community | 6 | 2 | 16 | 1 | 32 | 512 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs utilizing GPUs on prior cluster hardware. | Teton GPU compute |
teton-hugemem | edr, intel, broadwell | 8 | 2 | 16 | 1 | 32 | 1024 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs w/ large memory requirements, running on fast prior cluster hardware. | Teton compute w/ 1TB of RAM |
teton-massmem | edr, amd, epyc | 2 | 2 | 24 | 1 | 48 | 4096 | AMD/EPYC | 4096 GB SSD | RHEL 8.6 | For compute jobs w/ exceedingly demanding memory requirements | Teton compute w/ 4TB of RAM |
teton-knl | edr, intel, knl | 12 | 1 | 72 | 4 | 72 | 384 | Intel Knights Landing | 240 GB SSD | RHEL 8.8 | For jobs using many cores on a single node, where speed isn’t critical | Teton compute w/ Intel Knights Landing CPUs |
beartooth | edr, intel, icelake | 2 | 2 | 28 | 1 | 56 | 256 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For general compute jobs running on the newer Beartooth hardware | Beartooth compute |
beartooth-gpu | edr, intel, icelake | 4 | 2 | 28 | 1 | 56 | 250 or 1024 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For compute jobs needing GPU on the latest and greatest hardware. | Beartooth GPU compute |
beartooth-bigmem | edr, intel, icelake | 6 | 2 | 28 | 1 | 56 | 512 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For jobs w/ above average memory requirements, on the newer hardware. | Beartooth compute w/ 512G of RAM |
beartooth-hugemem | edr, intel, icelake | 8 | 2 | 28 | 1 | 56 | 1024 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For jobs w/ large memory requirements, on the newer hardware. | Beartooth compute w/ 1TB of RAM |
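The values in the "Requestable features" column can typically be passed to Slurm's `--constraint` option. The batch script below is a minimal sketch of that idea, not an ARCC-provided template: the job name and the account `myproject` are hypothetical, and whether a constraint alone is enough (without `--partition`) depends on the cluster configuration.

```shell
#!/bin/bash
#SBATCH --job-name=feature-demo     # hypothetical job name
#SBATCH --account=myproject         # hypothetical project/account name
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --constraint=icelake        # a feature from the "Requestable features" column

# Report where the job landed. The SLURM_* variables are set by Slurm inside a job;
# the fallbacks let the script also run outside of Slurm as a dry run.
node_name="${SLURMD_NODENAME:-$(uname -n)}"
partition="${SLURM_JOB_PARTITION:-none}"
cpus="${SLURM_CPUS_PER_TASK:-1}"
summary="node=${node_name} partition=${partition} cpus=${cpus}"
echo "${summary}"
```

Submitting this with `sbatch` and checking the printed summary is a quick way to confirm that the job was placed on a node carrying the requested feature.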
...
The ARCC Beartooth cluster has a number of compute nodes that contain GPUs. The following tables list each node that has GPUs and the type of GPU installed.
GPU Type | Partition | Example slurm value to request | # of Nodes | GPU devices per node | CUDA Cores | Tensor Cores | GPU Memory Size (GB) | Compute Capability |
---|---|---|---|---|---|---|---|---|
Tesla P100 | | | 8 | 2 | 3584 | 0 | 16 | 6.0 |
V100 | | | 5 | 8 | 5120 | 640 | 16/32 | 7.0 |
A30 | | | 7 | 2 | 3584 | 224 | 25 | 8.0 |
T4 | | | 2 | 3 | 2560 | 320 | 16 | 7.5 |

GPU Type | Partition | Example slurm value to request | # of Nodes | GPU devices per node | CUDA Cores | Tensor Cores | GPU Memory Size (GB) | Compute Capability |
---|---|---|---|---|---|---|---|---|
A30 | (all available on non-investor) | | 15 | 7 on BT/non-investor, 8 on MedicineBow | 3584 | 224 | 25 | 8.0 |
L40S | | | | | | 568 TC/GPU on MB | 48GB/GPU | |
H100 | | | 6 | 8 | 16896 FP32 CUDA/GPU | 528 TC/GPU on MB | 80GB/GPU | |
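As an illustration of requesting one of these GPUs, the sketch below pairs a partition name from the hardware table with Slurm's generic `--gres=gpu:<count>` option. The job name and the account `myproject` are hypothetical, and whether ARCC expects a typed GRES instead of the generic form is an assumption this page does not settle.

```shell
#!/bin/bash
#SBATCH --job-name=gpu-demo       # hypothetical job name
#SBATCH --account=myproject       # hypothetical project/account name
#SBATCH --time=00:10:00
#SBATCH --partition=mb-h100       # partition name taken from the hardware table
#SBATCH --gres=gpu:1              # generic Slurm GPU request: one GPU on the node

# List the GPUs visible to the job. nvidia-smi is only present on GPU nodes,
# so fall back to a plain message when it is missing or reports nothing.
gpu_report=""
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_report="$(nvidia-smi -L 2>/dev/null || true)"
fi
if [ -z "${gpu_report}" ]; then
    gpu_report="no GPU visible"
fi
echo "${gpu_report}"
```

Inside an allocation, the listing should show only the devices Slurm assigned to the job; outside of one, the script simply prints the fallback message.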
Specialty Partitions
In some cases you must name the partition explicitly to be allocated certain compute nodes; requesting the associated resources alone is not enough. For example:
...
    #SBATCH --cpus-per-task=70
    # Fails with: sbatch: error: CPU count per node can not be satisfied

    #SBATCH --cpus-per-task=70
    #SBATCH --partition=teton-knl
    # Job is allocated
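One way to anticipate this kind of failure is to compare the requested CPU count against the "Total Cores/Node" column before submitting. The sketch below does that with two hypothetical helper functions, `cores_per_node` and `fits_on_node`; the core counts are copied from the tables on this page, and the check is only a rough local approximation, since the scheduler makes the real decision.

```shell
#!/bin/bash
# Total cores per node, copied from the hardware tables on this page.
# Only a few partitions are listed; extend the case statement as needed.
cores_per_node() {
    case "$1" in
        mb)        echo 96 ;;
        teton)     echo 32 ;;
        teton-knl) echo 72 ;;
        beartooth) echo 56 ;;
        *)         echo 0 ;;   # unknown partition
    esac
}

# Succeeds (exit status 0) when the requested CPUs fit on one node of the partition.
fits_on_node() {
    local cpus="$1"
    local partition="$2"
    [ "$cpus" -le "$(cores_per_node "$partition")" ]
}

# Compare the 70-CPU request from the example against two partitions.
for p in teton teton-knl; do
    if fits_on_node 70 "$p"; then
        echo "--cpus-per-task=70 fits on ${p}"
    else
        echo "--cpus-per-task=70 does not fit on ${p}"
    fi
done
```

With the table values used here, a 70-CPU task fits on a `teton-knl` node (72 cores) but not on a `teton` node (32 cores), which is consistent with the request succeeding only once the partition is named.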
Specialty Hardware
ARCC also offers some specialty hardware for unique workloads. This hardware is currently still in the development and testing phase.
GPU Type | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor | Local Disks | Use Case | Notes |
---|---|---|---|---|---|---|---|---|---|---|
GH200 | 2 | 1 | 72 | 1 | 72 | 480 | NVIDIA Grace™ | 1TB SSD | Specialty nodes specifically designed for LLM inference, vector database search, and large data processing. | Not generally available to the public. |