MedicineBow Hardware Summary Table
New MedicineBow Hardware
Slurm Partition name | Requestable features | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
mb | amd, epyc | 25 | 2 | 48 | 1 | 96 | 1024 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4TB SSD | RHEL 9.3 | For compute jobs running the latest and greatest MedicineBow hardware | MB Compute with 1TB RAM |
mb-a30 | amd, epyc | 8 | 2 | 48 | 1 | 96 | 768 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4TB SSD | RHEL 9.3 | DL Inference, AI, Mainstream Acceleration | MB Compute with 24GB RAM/GPU & A30 GPU |
mb-l40s | amd, epyc | 5 | 2 | 48 | 1 | 96 | 768 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4TB SSD | RHEL 9.3 | DL Inference, Omniverse/Rendering, Mainstream Acceleration | MB Compute with 48GB RAM/GPU & L40S GPU |
mb-h100 | amd, epyc | 6 | 2 | 48 | 1 | 96 | 1228 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4TB SSD | RHEL 9.3 | DL Training and Inference, DA, AI, Mainstream Acceleration | MB Compute with 80GB RAM/GPU & Nvidia SXM5 H100 GPU |
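As a rough illustration of how the table maps onto a job request, the sketch below asks for one whole node in the `mb` partition. The task count follows from the table (2 sockets x 48 cores = 96 cores per node); the time limit and the echo line are illustrative assumptions, not ARCC recommendations, and your project may need further options such as an account.

```shell
#!/bin/bash
# Sketch only: request one full node in the mb partition.
# 96 tasks = 2 sockets x 48 cores per socket (values from the table).
#SBATCH --partition=mb
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=96
# The time limit is an illustrative assumption.
#SBATCH --time=01:00:00

# Recompute the per-node core count from the table values.
sockets=2
cores_per_socket=48
total_cores=$((sockets * cores_per_socket))
echo "Cores per mb node: ${total_cores}"
```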
Former Beartooth Hardware (to be consolidated into MedicineBow or retired - pending)
Slurm Partition name | Requestable features | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
---|---|---|---|---|---|---|---|---|---|---|---|---|
moran | fdr, intel, sandy, ivy, community | 273 | 2 | 8 | 1 | 16 | 64 or 128 | Intel Ivybridge/ | 1 TB HD | RHEL 8.8 | For compute jobs not needing the latest and greatest hardware. | Original Moran compute |
moran-bigmem | fdr, intel, haswell | 2 | 2 | 8 | 1 | 16 | 512 | Intel Haswell | 1 TB HD | RHEL 8.8 | For jobs not needing the latest hardware, w/ above average memory requirements. | Moran compute w/ 512G of RAM |
moran-hugemem | fdr, intel, haswell, community | 2 | 2 | 8 | 1 | 16 | 1024 | Intel Haswell | 1 TB HD | RHEL 8.8 | For jobs that don’t need the latest hardware, w/ escalated memory requirements. | Moran compute w/ 1TB of RAM |
dgx | edr, intel, broadwell | 2 | 2 | 20 | 2 | 40 | 512 | Intel Broadwell | 7 TB SSD | RHEL 8.8 | For GPU and AI-enabled workloads. | Special DGX GPU compute nodes |
teton | edr, intel, broadwell, community | 175 | 2 | 16 | 1 | 32 | 128 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For regular compute jobs. | Teton compute |
teton-cascade | edr, intel, cascade, community | 56 | 2 | 20 | 1 | 40 | 192 or 768 | Intel Cascade Lake | 240 GB SSD | RHEL 8.8 | For compute jobs on the newer of the prior-generation hardware, w/ somewhat higher memory requirements. | Teton compute w/ Cascade Lake CPUs |
teton-gpu | edr, intel, broadwell, community | 6 | 2 | 16 | 1 | 32 | 512 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs utilizing GPUs on prior cluster hardware. | Teton GPU compute |
teton-hugemem | edr, intel, broadwell | 8 | 2 | 16 | 1 | 32 | 1024 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs w/ large memory requirements, running on fast prior cluster hardware. | Teton compute w/ 1TB of RAM |
teton-massmem | edr, amd, epyc | 2 | 2 | 24 | 1 | 48 | 4096 | AMD/EPYC | 4096 GB SSD | RHEL 8.6 | For compute jobs w/ exceedingly demanding memory requirements | Teton compute w/ 4TB of RAM |
teton-knl | edr, intel, knl | 12 | 1 | 72 | 4 | 72 | 384 | Intel Knights Landing | 240 GB SSD | RHEL 8.8 | For jobs that use many cores on a single node where speed isn’t critical. | Teton compute w/ Intel Knights Landing CPUs |
beartooth | edr, intel, icelake | 2 | 2 | 28 | 1 | 56 | 256 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For general compute jobs running with newer hardware | Beartooth compute |
beartooth-gpu | edr, intel, icelake | 4 | 2 | 28 | 1 | 56 | 250 or 1024 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For compute jobs needing GPU. | Beartooth GPU compute |
beartooth-bigmem | edr, intel, icelake | 6 | 2 | 28 | 1 | 56 | 512 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For jobs w/ above average memory requirements, on newer hardware. | Beartooth compute w/ 512G of RAM |
beartooth-hugemem | edr, intel, icelake | 8 | 2 | 28 | 1 | 56 | 1024 | Intel Icelake | 436 GB SSD | RHEL 8.8 | For jobs w/ large memory requirements, on newer hardware. | Beartooth compute w/ 1TB of RAM |
Feature | Description of Feature |
---|---|
fdr | Requests nodes connected by FDR InfiniBand, which has a signaling rate of 14.0625 Gbit/s per lane. |
edr | Requests nodes connected by EDR InfiniBand, which has a signaling rate of 25.78125 Gbit/s per lane. |
intel | Requests a node with an Intel processor. Includes all Intel CPU versions in Beartooth. |
ivy | Requests an Intel Ivy Bridge CPU. |
sandy | Requests an Intel Sandy Bridge CPU. |
broadwell | Requests an Intel Broadwell CPU. |
haswell | Requests an Intel Haswell CPU. |
knl | Requests an Intel Knights Landing CPU. This is a specialized chip and is not well suited to all workloads. |
icelake | Requests an Intel Icelake CPU. |
amd | Requests a node with an AMD processor. Includes all AMD CPU versions in Beartooth. |
epyc | Requests an AMD EPYC CPU. |
community | This feature indicates a node shared equally among the research community. Jobs on these nodes can’t be pre-empted, but can be queued up for far longer. |
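Slurm exposes node features through its `--constraint` option, so the features above could be requested as in the sketch below. The partition, the particular features chosen, and the time limit are illustrative assumptions.

```shell
#!/bin/bash
# Sketch only: ask for a node that has both the edr and broadwell features.
# Inside --constraint, "&" means AND and "|" means OR.
#SBATCH --partition=teton
#SBATCH --constraint="edr&broadwell"
# The time limit is an illustrative assumption.
#SBATCH --time=00:30:00

# SLURMD_NODENAME is set by Slurm inside a job; fall back when run by hand.
node="${SLURMD_NODENAME:-unknown (not running under Slurm)}"
echo "Running on node: ${node}"
```

Running `sinfo -o "%P %f"` on a login node should list each partition together with the features its nodes advertise, which is a quick way to check a constraint before submitting.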
GPUs and Accelerators
The ARCC Beartooth cluster has a number of compute nodes that contain GPUs. The following table lists each GPU type, the number of nodes that carry it, and the specifications of the installed GPUs.
GPU Type | Partition | Example Slurm directives to request | # of Nodes | GPU devices per node | CUDA Cores | Tensor Cores | GPU Memory Size (GB) | Compute Capability |
---|---|---|---|---|---|---|---|---|
A30 | | #SBATCH --partition=beartooth-gpu<br>#SBATCH --gres=gpu:<#_gpu_requested> | 15 | 7 on BT/non-investor, 8 on MedicineBow | 3584 | 224 | 24 | 8.0 |
L40S | | #SBATCH --partition=beartooth-gpu<br>#SBATCH --gres=gpu:<#_gpu_requested> | 5 | 8 | | 568 TC/GPU on MB | 48GB/GPU | |
H100 | | #SBATCH --partition=beartooth-gpu<br>#SBATCH --gres=gpu:<#_gpu_requested> | 6 | 8 | 16896 FP32 CUDA/GPU | 528 TC/GPU on MB | 80GB/GPU | |
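Putting the two request lines into a complete batch script might look like the sketch below. It assumes the H100 nodes are reached through the `mb-h100` partition named in the hardware summary table; the GPU count and time limit are likewise illustrative.

```shell
#!/bin/bash
# Sketch only: request two GPUs on one node.
# Assumption: H100 nodes are in the mb-h100 partition (hardware summary table).
#SBATCH --partition=mb-h100
#SBATCH --nodes=1
#SBATCH --gres=gpu:2
# The time limit is an illustrative assumption.
#SBATCH --time=01:00:00

# Slurm normally sets CUDA_VISIBLE_DEVICES for jobs that are granted GPUs.
gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "GPUs visible to this job: ${gpus}"
```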
Specialty Partitions
In some cases you must explicitly specify the partition to reach certain compute nodes; simply requesting the associated resources will not be enough. For example:
Teton Massmem Nodes:
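A minimal sketch of a batch script that names the partition explicitly; the node count, time limit, and echo line are illustrative assumptions:

```shell
#!/bin/bash
# Sketch only: target the 4 TB massmem nodes by naming their partition.
#SBATCH --partition=teton-massmem
#SBATCH --nodes=1
# The time limit is an illustrative assumption.
#SBATCH --time=01:00:00

# SLURM_JOB_PARTITION is set by Slurm inside a job; fall back when run by hand.
massmem_partition="${SLURM_JOB_PARTITION:-not set (submit with sbatch)}"
echo "Job partition: ${massmem_partition}"
```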
Teton KNL nodes:
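A comparable sketch for the KNL nodes, assuming one task per physical core (72 cores per node, per the hardware table); the time limit and echo line are illustrative assumptions:

```shell
#!/bin/bash
# Sketch only: target the Knights Landing nodes by naming their partition.
#SBATCH --partition=teton-knl
#SBATCH --nodes=1
# One task per physical core: 1 socket x 72 cores (from the hardware table).
#SBATCH --ntasks-per-node=72
# The time limit is an illustrative assumption.
#SBATCH --time=01:00:00

# SLURM_NTASKS is set by Slurm inside a job; fall back when run by hand.
knl_tasks="${SLURM_NTASKS:-not set (submit with sbatch)}"
echo "Tasks in this job: ${knl_tasks}"
```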
Specialty Hardware
ARCC also offers some specialty hardware for unique workloads. This hardware is currently still in the development and testing phase.
GPU Type | Nodes | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor | Local Disks | Use Case | Notes |
---|---|---|---|---|---|---|---|---|---|---|
GH200 | 2 | 1 | 72 | 1 | 72 | 480 | NVIDIA Grace™ | 1TB SSD | Specialty nodes specifically designed for LLM inference, vector database search, and large data processing. | Not generally available to the public. |