...

New MedicineBow Hardware

| Slurm Partition Name | Requestable Features | Node Count | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mb | amd, epyc | 25 | 2 | 48 | 1 | 96 | 1024 | 2x 48-Core/96-Thread 4th Gen AMD EPYC 9454 | 4 TB SSD | RHEL 9.3 | For compute jobs running on the latest and greatest MedicineBow hardware. | MB Compute with 1 TB RAM |
| mb-a30 | amd, epyc | 8 | 2 | 48 | 1 | 96 | 768 | AMD EPYC | 4 TB SSD |  | DL Inference, AI, Mainstream Acceleration | MB Compute with 24 GB RAM/GPU & A30 GPUs |
| mb-l40s | amd, epyc | 5 | 2 | 48 | 1 | 96 | 768 | AMD EPYC | 4 TB SSD |  | DL Inference, Omniverse/Rendering, Mainstream Acceleration | MB Compute with 48 GB RAM/GPU & L40S GPUs |
| mb-h100 | amd, epyc | 6 | 2 | 48 | 1 | 96 | 1228 | AMD EPYC | 4 TB SSD |  | DL Training and Inference, DA, AI, Mainstream Acceleration | MB Compute with 80 GB RAM/GPU & NVIDIA SXM5 H100 GPUs |
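
As a quick illustration of how a job targets one of these partitions, a minimal batch script might look like the sketch below. The account name and resource sizes are placeholders, not recommendations; substitute your own project and requirements.

Code Block
#!/bin/bash
#SBATCH --account=<your_project>   # placeholder: your ARCC project/account name
#SBATCH --partition=mb             # MedicineBow CPU partition from the table above
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=01:00:00
srun hostname                      # report which node the job ran on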

Former Beartooth Hardware (to be consolidated into MedicineBow or retired - pending)

| Slurm Partition Name | Requestable Features | Node Count | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor (x86_64) | Local Disks | OS | Use Case | Key Attributes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| moran | fdr, intel, sandy, ivy, community | 273 | 2 | 8 | 1 | 16 | 64 or 128 | Intel Ivy Bridge/Sandy Bridge | 1 TB HDD | RHEL 8.8 | For compute jobs not needing the latest and greatest hardware. | Original Moran compute |
| moran-bigmem | fdr, intel, haswell | 2 | 2 | 8 | 1 | 16 | 512 | Intel Haswell | 1 TB HDD | RHEL 8.8 | For jobs not needing the latest hardware, w/ above-average memory requirements. | Moran compute w/ 512 GB of RAM |
| moran-hugemem | fdr, intel, haswell, community | 2 | 2 | 8 | 1 | 16 | 1024 | Intel Haswell | 1 TB HDD | RHEL 8.8 | For jobs that don't need the latest hardware, w/ escalated memory requirements. | Moran compute w/ 1 TB of RAM |
| dgx | edr, intel, broadwell | 2 | 2 | 20 | 2 | 40 | 512 | Intel Broadwell | 7 TB SSD | RHEL 8.8 | For GPU and AI-enabled workloads. | Special DGX GPU compute nodes |
| teton | edr, intel, broadwell, community | 175 | 2 | 16 | 1 | 32 | 128 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For regular compute jobs. | Teton compute |
| teton-cascade | edr, intel, cascade, community | 56 | 2 | 20 | 1 | 40 | 192 or 768 | Intel Cascade Lake | 240 GB SSD | RHEL 8.8 | For compute jobs on newer prior-generation hardware, w/ somewhat higher memory requirements. | Teton compute w/ Cascade Lake CPUs |
| teton-gpu | edr, intel, broadwell, community | 6 | 2 | 16 | 1 | 32 | 512 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs utilizing GPUs on prior cluster hardware. | Teton GPU compute |
| teton-hugemem | edr, intel, broadwell | 8 | 2 | 16 | 1 | 32 | 1024 | Intel Broadwell | 240 GB SSD | RHEL 8.8 | For compute jobs w/ large memory requirements, running on fast prior cluster hardware. | Teton compute w/ 1 TB of RAM |
| teton-massmem | edr, amd, epyc | 2 | 2 | 24 | 1 | 48 | 4096 | AMD EPYC | 4096 GB SSD | RHEL 8.6 | For compute jobs w/ exceedingly demanding memory requirements. | Teton compute w/ 4 TB of RAM |
| teton-knl | edr, intel, knl | 12 | 1 | 72 | 4 | 72 | 384 | Intel Knights Landing | 240 GB SSD | RHEL 8.8 | For jobs using many cores on a single node where speed isn't critical. | Teton compute w/ Intel Knights Landing CPUs |
| beartooth | edr, intel, icelake | 2 | 2 | 28 | 1 | 56 | 256 | Intel Ice Lake | 436 GB SSD | RHEL 8.8 | For general compute jobs on the newer Beartooth hardware. | Beartooth compute |
| beartooth-gpu | edr, intel, icelake | 4 | 2 | 28 | 1 | 56 | 250 or 1024 | Intel Ice Lake | 436 GB SSD | RHEL 8.8 | For compute jobs needing GPUs on the newer Beartooth hardware. | Beartooth GPU compute |
| beartooth-bigmem | edr, intel, icelake | 6 | 2 | 28 | 1 | 56 | 515 | Intel Ice Lake | 436 GB SSD | RHEL 8.8 | For jobs w/ above-average memory requirements on the newer Beartooth hardware. | Beartooth compute w/ 512 GB of RAM |
| beartooth-hugemem | edr, intel, icelake | 8 | 2 | 28 | 1 | 56 | 1024 | Intel Ice Lake | 436 GB SSD | RHEL 8.8 | For jobs w/ large memory requirements on the newer Beartooth hardware. | Beartooth compute w/ 1 TB of RAM |


| Feature | Description |
|---|---|
| fdr | Requests nodes connected with FDR InfiniBand, which has a signaling rate of 14.0625 Gbit/s. |
| edr | Requests nodes connected with EDR InfiniBand, which has a signaling rate of 25.78125 Gbit/s. |
| intel | Requests an Intel-based processor. Includes all Intel CPU versions in Beartooth. |
| ivy | Requests an Intel Ivy Bridge CPU. |
| sandy | Requests an Intel Sandy Bridge CPU. |
| broadwell | Requests an Intel Broadwell CPU. |
| haswell | Requests an Intel Haswell CPU. |
| knl | Requests an Intel Knights Landing CPU. This is a specialized chip and not well suited to all workloads. |
| icelake | Requests an Intel Ice Lake CPU. |
| amd | Requests an AMD-based processor. Includes all AMD CPU versions in Beartooth. |
| epyc | Requests an AMD EPYC CPU. |
| community | Indicates a node shared equally among the research community. Jobs on these nodes can't be preempted, but can be queued up for far longer. |
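
Features are requested in addition to a partition, typically through Slurm's --constraint option. The sketch below is an illustration only; the partition and features shown are examples drawn from the tables above, and features can be combined with & (and) or | (or).

Code Block
#SBATCH --partition=teton          # partition from the tables above
#SBATCH --constraint=broadwell     # request a single feature from this table
# Features can be combined, e.g.:
#   #SBATCH --constraint="intel&community"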

...

The ARCC Beartooth and MedicineBow clusters have a number of compute nodes that contain GPUs. The following table lists each type of GPU installed, the partitions where it is available, and how to request it.

| GPU Type | Partition (some partitions may be in the process of migration to MB; run sinfo for current partitions) | Example Slurm options to request | # of Nodes | GPU Devices per Node | CUDA Cores | Tensor Cores | GPU Memory Size (GB) | Compute Capability |
|---|---|---|---|---|---|---|---|---|
| Tesla P100 | teton-gpu (all available on non-investor) | --partition=teton-gpu --gres=gpu:<#_gpu_requested> | 8 | 2 | 3584 | 0 | 16 | 6.0 |
| V100 | dgx (both available on non-investor) | --partition=dgx --gres=gpu:<#_gpu_requested> | 2 | 8 | 5120 | 640 | 16/32 | 7.0 |
| A30 | beartooth-gpu (4), mb-a30 (8), non-investor (3) | --partition=beartooth-gpu --gres=gpu:<#_gpu_requested> | 15 | 7 on BT/non-investor, 8 on MedicineBow | 3584 | 224 | 24 | 8.0 |
| T4 | non-investor | --partition=non-investor --gres=gpu:<#_gpu_requested> | 2 | 3 | 2560 | 320 | 16 | 7.5 |
| L40S | mb-l40s (5) | --partition=mb-l40s --gres=gpu:<#_gpu_requested> | 5 | 8 | 18176 | 568 | 48 | 8.9 |
| H100 | mb-h100 (6) | --partition=mb-h100 --gres=gpu:<#_gpu_requested> | 6 | 8 | 16896 | 528 | 80 | 9.0 |
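
Putting the partition and GPU request together, a minimal GPU batch script might look like the sketch below. The account name, partition, GPU count, and wall time are placeholders to adapt to your own allocation and workload.

Code Block
#!/bin/bash
#SBATCH --account=<your_project>   # placeholder: your ARCC project/account name
#SBATCH --partition=mb-l40s        # GPU partition from the table above
#SBATCH --gres=gpu:2               # request two GPUs on the node
#SBATCH --cpus-per-task=8
#SBATCH --time=02:00:00
srun nvidia-smi                    # confirm which GPUs were allocated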

...

Code Block
#SBATCH --cpus-per-task=70
# Fails with: sbatch: error: CPU count per node can not be satisfied

#SBATCH --cpus-per-task=70
#SBATCH --partition=teton-knl
# Job is allocated
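# teton-knl nodes provide 72 cores per node (see the hardware table above),
# so a 70-CPU-per-task request can be satisfied on that partition.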

Specialty Hardware

ARCC also offers some specialty hardware for unique workloads. Currently this is still in the development and testing phase.

| GPU Type | Node Count | Sockets/Node | Cores/Socket | Threads/Core | Total Cores/Node | RAM (GB) | Processor | Local Disks | Use Case | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| GH200 | 2 | 1 | 72 | 1 | 72 | 480 (+96 HBM3 shared w/ GPU) | NVIDIA Grace, 72 Arm Neoverse V2 cores (aarch64) | 1 TB SSD | Specialty nodes specifically designed for LLM inference, vector database search, and large data processing. | Not generally available to the public. Please contact us if you have a specialty workload. |