Hardware - Teton

Overview

This wiki section describes the hardware used for research on the high-performance computing system at UWyo.

Available Nodes

| Type | Scheduler Partition | Series | Arch | Count | Sockets | Cores | Threads / Core | Clock (GHz) | RAM (GB) | GPU Type | GPU Count | Local Disk Type | Local Disk Capacity (GB) | IB Network | Operating System | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Teton Regular | teton | Intel Broadwell | x86_64 | 180 | 2 | 32 | 1 | 2.1 | 128 | N/A | N/A | SSD | 240 | EDR | RHEL 7.9 | No longer available for purchase |
| Teton Cascade | teton-cascade | Intel Cascade Lake | x86_64 | 44 | 2 | 40 | 1 | 2.1 | 192/768 | N/A | N/A | SSD | 240 | EDR | RHEL 7.9 | Current Model |
| Teton BigMem GPU | teton-gpu | Intel Broadwell | x86_64 | 8 | 2 | 32 | 1 | 2.1 | 512 | NVIDIA P100 16G | 2 | SSD | 240 | EDR | RHEL 7.9 | No longer available for purchase |
| Teton HugeMem | teton-hugemem | Intel Broadwell | x86_64 | 10 | 2 | 32 | 1 | 2.1 | 1024 | N/A | N/A | SSD | 240 | EDR | RHEL 7.9 | No longer available for purchase |
| Teton Massive Memory | teton-massmem | AMD/EPYC | x86_64 | 2 | 2 | 48 | 1 |  | 4096 | N/A | N/A | SSD | 4096 | EDR | RHEL 7.9 | Current Model |
| Teton KNL | teton-knl | Intel Knights Landing | x86_64 | 12 | 1 | 72 | 4 | 1.5 | 384 + 16 | N/A | N/A | SSD | 240 | EDR | RHEL 7.9 | No longer available for purchase |
| Teton DGX | dgx | Intel Broadwell | x86_64 | 1 | 2 | 40 | 2 | 2.2 | 512 | NVIDIA V100 32G | 8 | SSD | 7 TB | EDR | Ubuntu 18.04.2 LTS | Available as special order |
| Teton Test | arcc | Intel Broadwell | x86_64 | 8 | 2 | 14 | 1 | 2.4 | 128 | N/A | N/A | HD | 240 | EDR | RHEL 7.9 | No longer available for purchase |
| Moran Regular | moran | Intel Sandy Bridge/Ivy Bridge | x86_64 | 280 | 2 | 16 | 1 | 2.6 | 64 or 128 | K20 on some | 2 | HD | 1 TB | FDR | RHEL 7.9 | No longer available for purchase |
| Moran BigMem | moran-bigmem-gpu | Intel Haswell | x86_64 | 2 | 2 | 16 | 1 | 2.6 | 512 | K80 | 8 | HD | 1 TB | FDR | RHEL 7.9 | No longer available for purchase |
| Moran Debug | moran | Intel Ivy Bridge | x86_64 | 2 | 2 | 16 | 1 | 2.6 | 64 | K20m | 2 | HD | 1 TB | FDR | RHEL 7.9 | No longer available for purchase |
| Moran HugeMem | moran-hugemem | Intel Haswell | x86_64 | 2 | 2 | 16 | 1 | 2.6 | 1024 | K20 | 2 | HD | 1 TB | FDR | RHEL 7.9 | No longer available for purchase |
| Moran DGX | dgx | Intel Broadwell | x86_64 | 1 | 2 | 40 | 2 | 2.2 | 512 | NVIDIA V100 16G | 8 | SSD | 7 TB | EDR | Ubuntu 18.04.2 LTS | Available as special order |
| Moran Test | arcc | Intel Haswell | x86_64 | 1 | 2 | 20 | 1 | 2.6 | 64 | N/A | N/A | HD | 300 | FDR | RHEL 7.9 | No longer available for purchase |
| TOTAL Nodes |  |  |  | 553 |  |  |  |  |  |  |  |  |  |  |  |  |

GPUs and Accelerators

The ARCC Teton cluster has a number of compute nodes that contain GPUs. This section describes the hardware, as well as access and usage of the GPU nodes.

Teton GPU Hardware

The following tables list each node that has GPUs and the type of GPU installed.

| Partition | GPU | Devices | Nodes | CUDA Cores | GPU Memory Size (GB) | Compute Capability |
|---|---|---|---|---|---|---|
| moran | GeForce GTX Titan | [1-2] | mdbg01 | 2688 | 6 | 3.5 |
| moran | GeForce GTX Titan X | 0; [2-3] | mdbg01; mdbg02 | 3072 | 12 | 5.2 |
| moran | Tesla K20m | [0-1]; 1 | m[025-32], m[075-82], m086; m268 | 2496 | 4.7 | 3.5 |
| moran | Tesla K20Xm | [0-1]; 0 | m219/20/27/28, m235/36, m243/4, m251/2/9, m260/7; m268 | 2688 | 5.7 | 3.5 |
| moran | Tesla K40c | [0-1] | mdbg02 | 2880 | 11.4 | 3.5 |
| moran-bigmem-gpu | Tesla K80 | [0-7] | mbm[01-02] | 2496 | 11.4 | 3.7 |
| teton-gpu | Tesla P100 | [0-1] | tbm[03-10] | 3584 | 16 | 6.0 |

Where the Devices and Nodes cells hold two entries separated by a semicolon, the entries are listed in matching order.

Notes:

  • Review NVIDIA’s Compute Capabilities to understand what each version provides.

    • The CUDA FAQ states that “the compute capability of a GPU determines its general specifications and available features.”

    • For example, none of the above GPUs have tensor cores, which require compute capability 7.0 or higher.

  • The above table will be updated as old nodes are decommissioned and new nodes are brought into the cluster.
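
The tensor-core rule in the notes above (compute capability 7.0 and higher) can be checked with a small shell helper. This is only an illustrative sketch: the function name has_tensor_cores is hypothetical, and the list of versions is simply the set of compute capabilities that appear in the tables on this page.

```shell
# Succeeds (exit status 0) when the given compute capability is 7.0 or higher.
has_tensor_cores() {
    major="${1%%.*}"    # keep the part before the dot, e.g. "6" from "6.0"
    [ "$major" -ge 7 ]
}

# Compute capabilities listed in the GPU tables on this page.
for cc in 3.5 3.7 5.2 6.0 7.0; do
    if has_tensor_cores "$cc"; then
        echo "$cc: tensor cores"
    else
        echo "$cc: no tensor cores"
    fi
done
```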

The following two GPU nodes are reserved for AI use. These are special nodes running Ubuntu and CUDA 11.0.

| Partition | GPU | Devices | Nodes | CUDA Cores | Tensor Cores | GPU Memory Size (GB) | Compute Capability |
|---|---|---|---|---|---|---|---|
| dgx | Tesla V100 | [0-7] | mdgx01 | 5120 | 640 | 16 | 7.0 |
| dgx | Tesla V100 | [0-7] | tdgx01 | 5120 | 640 | 32 | 7.0 |


For how to request GPUs in a bash script submitted with sbatch or in an interactive session started with salloc, and how to check the resources you requested, see: Introduction to Job Submission: 02: Memory and GPUs
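
The linked page is the authoritative guide. As a rough sketch only, a batch script requesting one GPU might look like the following. It assumes the teton-gpu partition from the table above and the standard Slurm --gres option; the job name, the account name myproject, and the time limit are hypothetical placeholders, and the function report_gpus is introduced here purely for illustration.

```shell
#!/bin/bash
#SBATCH --job-name=gpu-check        # hypothetical job name
#SBATCH --account=myproject         # hypothetical: replace with your own project
#SBATCH --partition=teton-gpu       # P100 partition from the table above
#SBATCH --nodes=1
#SBATCH --gres=gpu:1                # request one GPU on the node
#SBATCH --time=00:05:00             # hypothetical time limit

# Print what the job was given. The fallbacks keep the script
# harmless if it is run outside of a Slurm allocation.
report_gpus() {
    echo "Job: ${SLURM_JOB_ID:-not set}"
    echo "Node: $(uname -n)"
    echo "CUDA_VISIBLE_DEVICES: ${CUDA_VISIBLE_DEVICES:-not set}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L    # list the GPUs visible to this job
    fi
}

report_gpus
```

Saved under a name of your choice, the script would be submitted with sbatch. Slurm typically sets CUDA_VISIBLE_DEVICES for jobs that request GPUs, so the output gives a quick check of what was allocated.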

 

GPU Programming Environment

CUDA

NVIDIA CUDA, PGI CUDA Fortran, and the OpenACC compilers are installed on Teton. Use "module spider cuda" to see the available versions of CUDA, then "module load" the version you require.

You can compile your CUDA code on any login node, since the CUDA tools are available there.

To compile CUDA code using the CUDA compiler "nvcc" so that it runs on all types of GPUs that ARCC has, use the following compiler flags:

-gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70
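
The flag list above follows a regular pattern: one -gencode pair per compute capability. The sketch below builds the same flags from a list, which may help when a GPU type is added or retired. The function name gencode_flags, the source file saxpy.cu, and the output name saxpy are hypothetical; the script only prints the nvcc command line and does not run the compiler.

```shell
# Build one "-gencode arch=compute_XX,code=sm_XX" pair per argument.
# Arguments are compute capabilities without the dot, e.g. 35 for 3.5.
gencode_flags() {
    flags=""
    for cc in "$@"; do
        flags="${flags}-gencode arch=compute_${cc},code=sm_${cc} "
    done
    printf '%s\n' "${flags% }"    # drop the trailing space
}

# Compute capabilities of the GPUs listed on this page.
ARCC_CCS="35 37 52 60 70"

# Print the full compile command (saxpy.cu is a placeholder source file).
echo "nvcc $(gencode_flags $ARCC_CCS) -o saxpy saxpy.cu"
```

Which architectures nvcc accepts depends on the CUDA version you load, so the list may need trimming for newer toolkits.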

For more info on the CUDA compilation and linking flags, please view the CUDA C++ Programming Guide.

 

OpenACC

To invoke OpenACC, use the "-acc" flag. More information on OpenACC can be obtained on the OpenACC website.

 

PGI Compilers - Deprecated

As of October 2020, the PGI compilers are not installed on the Teton cluster, so the text below is no longer relevant. If you require the PGI compilers, please contact ARCC.

The PGI compilers come with their own, fairly recent version of CUDA, which can be accessed by loading the PGI module with "module load pgi".

The PGI compilers specify the GPU architecture with the "-ta=tesla" flag. If no further option is given, the compiler generates code for all available compute capabilities (at the time of writing: cc35, cc37, cc50, cc60, and cc70). The flag for each specific GPU is:

| GPU Type | Compiler Flag |
|---|---|
| K20m | -ta=tesla:cc35 |
| K20Xm | -ta=tesla:cc35 |
| Titan | -ta=tesla:cc35 |
| Titan X | -ta=tesla:cc50 |
| K40c | -ta=tesla:cc35 |
| K80 | -ta=tesla:cc35 |
| P100 | -ta=tesla:cc60 |
| V100 | -ta=tesla:cc70 |