...
On Mount Moran, the default limits were expressed as the number of cores each project account could use concurrently, and investors received a higher concurrent-core limit. To allow more flexible scheduling for all research groups, ARCC is looking at implementing limits based on concurrent usage of cores, memory, and job walltime. These limits will be defined in the near future and will be subject to FAC review.
Partitions
...
The Slurm configuration on Teton is fairly complex because it must account for the hardware layout, investor-owned nodes, and runtime limits. The following tables describe the partitions on Teton; some require a QoS, which is auto-assigned during job submission. The tables list Slurm allocatable units rather than hardware units.
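You can also query these partition limits directly from the scheduler. For example, `sinfo` with a custom format string will list each partition's time limit, node count, CPUs per node, and memory per node:

```
# List partitions with time limit, node count, CPUs per node, and memory (MB) per node
sinfo -o "%P %l %D %c %m"
```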
General Partitions
Teton General Slurm Partitions

| Partition | Max Walltime | Node Cnt | Core Cnt | Thds / Core | CPUs | Mem (MB) / Node | Req'd QoS |
|---|---|---|---|---|---|---|---|
| moran | 7-00:00:00 | 284 | 4544 | 1 | 4544 | 64000 or 128000 | N/A |
| teton | 7-00:00:00 | 180 | 5760 | 1 | 5760 | 128000 | N/A |
| teton-gpu | 7-00:00:00 | 8 | 256 | 1 | 256 | 512000 | N/A |
| teton-hugemem | 7-00:00:00 | 10 | 256 | 1 | 256 | 1024000 | N/A |
| teton-knl | 7-00:00:00 | 12 | 864 | 4 | 3456 | 384000 | N/A |
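For example, an interactive allocation on one of these general partitions might look like the following sketch; the account name, task count, memory, and walltime are placeholders to replace with your own values:

```
# Request an interactive allocation on the teton partition
# (1 task, 4000 MB of memory, 1 hour of walltime, well under the 7-day limit)
salloc --account=<your-project> --partition=teton --ntasks=1 --mem=4000 --time=01:00:00
```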
Investor Partitions
Investor partitions can be heterogeneous; where a partition contains a mix of hardware, this is indicated below. They require a special QoS for access.
Teton Investor Slurm Partitions

| Partition | Max Walltime | Node Cnt | Core Cnt | Thds / Core | Mem (MB) / Node | Req'd QoS | Preemption | Owner | Associated Projects |
|---|---|---|---|---|---|---|---|---|---|
| inv-arcc | Unlimited | 2 | 44 | 1 | 64000 or 192000 | TODO | Disabled | Jeffrey Lang | arcc |
| inv-atmo2grid | 7-00:00:00 | 31 | 496 | 1 | 64000 | TODO | Disabled | Dr. Naughton, Dr. Mavriplis, Dr. Stoellinger | turbmodel, rotarywingcfg |
| inv-chemistry | 7-00:00:00 | 6 | 96 | 1 | 128000 | TODO | Disabled | Dr. Hulley | hulleylab, pahc, chemcal |
| inv-clune | 7-00:00:00 | 16 | 256 | 1 | Mixed | TODO | Disabled | Dr. Clune | evolvingai, iwctml |
| inv-compmicrosc | 7-00:00:00 | 6 | 96 | 1 | 128000 | TODO | Disabled | Dr. Aidey (Composite Micro Sciences) | rd-hea |
| inv-compsci | 7-00:00:00 | 12 | 288 | 1 | 384999 | TODO | Disabled | Dr. Lars Kotthoff | mallet |
| inv-fertig | 7-00:00:00 | 1 | 16 | 1 | 128000 | TODO | Disabled | Dr. Fertig | gbfracture |
| inv-geology | 7-00:00:00 | 16 | 256 | 1 | 64000 | TODO | Disabled | Dr. Chen, Dr. Mallick | inversion, f3dt, geologiccarbonseq, stochasticaquiferinv |
| inv-inbre | 7-00:00:00 | 24 | 160 | 1 | 128000 | TODO | Disabled | Dr. Blouin | inbre-train, inbreb, inbrev, human_microbiome |
| inv-jang-condel | 7-00:00:00 | 2 | 32 | 1 | 128000 | TODO | Disabled | Dr. Jang-Condel | exoplanets, planets |
| inv-liu | 7-00:00:00 | 4 | 64 | 1 | 128000 | TODO | Disabled | Dr. Liu | gwt |
| inv-microbiome | 7-00:00:00 | 85 | 2816 | 1 | 128000 | TODO | Disabled | Dr. Ewers | bbtrees, plantanalytics |
| inv-multicfd | 7-00:00:00 | 11 | 352 | 1 | 128000 | TODO | Disabled | Dr. Mousaviraad, Mechanical Engineering | multicfd |
| inv-physics | 7-00:00:00 | 4 | 128 | 1 | 128000 | TODO | Disabled | Dr. Dahnovsky | euo, 2dferromagnetism, d0ferromagnetism, microporousmat |
| inv-wagner | 7-00:00:00 | 2 | 32 | 1 | 128000 | TODO | Disabled | Dr. Wagner | wagnerlab, latesgenomics, ltcichlidgenomics, phylogenref, ysctrout |
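Since these partitions require a special QoS, a submission typically names both the partition and the QoS. The sketch below uses placeholder account and QoS values (the actual QoS names are still listed as TODO above):

```
# Submit a batch script to an investor partition; --qos is required
# (the account and QoS values here are placeholders)
sbatch --account=<your-project> --partition=inv-arcc --qos=<investor-qos> job.sh
```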
Special Partitions
Special partitions require access to be granted directly to user or project accounts and typically require additional approval.
Teton Special Slurm Partitions

| Partition | Max Walltime | Node Cnt | Core Cnt | Thds / Core | Mem (MB) / Node | Owner | Associated Projects | Notes |
|---|---|---|---|---|---|---|---|---|
| dgx | 7-00:00:00 | 2 | 40 | 2 | 512000 | Dr. Clune | See partition inv-clune above | NVIDIA V100 with NVLink, Ubuntu 16.04 |
| inv-compsci | 7-00:00:00 | 12 | 72 | 4 | 512000 | Dr. Kotthoff | See partition inv-compsci above | This includes the KNL nodes only |
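As a sketch, requesting a GPU on the dgx partition would commonly use a `--gres` specification; the GRES name and count below assume a typical Slurm GPU configuration and may differ on this system:

```
# Request one GPU on the dgx partition
# (the GRES syntax is an assumption about this cluster's configuration)
salloc --account=<your-project> --partition=dgx --gres=gpu:1 --time=01:00:00
```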
More Details
Generally, to run a job on the cluster you will need the following: a project account to charge the job to, a partition to run in, a walltime request within the partition's limit, a resource request (nodes, cores, and memory), and, for investor or special partitions, the appropriate QoS.
A handy migration reference comparing MOAB/Torque commands to Slurm commands can be found on the Slurm home site: Batch System Rosetta Stone.
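Putting these pieces together, a minimal batch script for one of the general partitions might look like the sketch below; the job name, account, resource values, and program are all placeholders:

```
#!/bin/bash
#SBATCH --job-name=example          # placeholder job name
#SBATCH --account=<your-project>    # project account (placeholder)
#SBATCH --partition=teton           # a general partition from the table above
#SBATCH --time=1-00:00:00           # walltime, within the 7-day partition limit
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=64000                 # memory per node in MB

# Launch the (placeholder) program as a job step
srun ./my_program
```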
Commands
- `sacct` - report accounting data for running and completed jobs
- `salloc` - request an interactive allocation of resources
- `sbatch` - submit a batch script to the scheduler
- `scancel` - cancel a pending or running job
- `sinfo` - view partition and node information
- `squeue` - view the state of the job queue
- `sreport` - generate usage reports from accounting data
- `srun` - launch a parallel job step (or an interactive job)
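A few common usage patterns for these commands (the job IDs shown are placeholders):

```
sinfo                    # show partition and node status
squeue -u $USER          # show your pending and running jobs
sacct -j 123456          # accounting information for a (placeholder) job ID
scancel 123456           # cancel the same (placeholder) job
```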
...