
Table of Contents

Glossary

...

Required Inputs and Default Values and Limits

There are some default limits set for Slurm jobs. By default, the following are required for submission (a minimal example follows the list):

  1. Walltime limit (--time=[days-hours:mins:secs])

  2. Project account (--account=<account>)
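A minimal sketch of a submission specifying only these two required options (the project name "myproject" and script name "myscript.sh" are placeholders):

Code Block
sbatch --time=1-00:00:00 --account=myproject myscript.sh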

Default Values

Additionally, the default submission has the following characteristics:

  • Node count: one node (-N 1, --nodes=1)
  • Task count: one task (-n 1, --ntasks-per-node=1)
  • Memory: 1000 MB of RAM per CPU (--mem-per-cpu=1000)

These defaults can be changed by modifying the appropriate flags to request a different allocation scheme; please reference our Slurm documentation.
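As an illustrative sketch, a batch script that overrides these defaults might look like the following (the account name and program name are placeholders, not an ARCC-provided example):

Code Block
#!/bin/bash
#SBATCH --account=myproject          # project account (required)
#SBATCH --time=2-00:00:00            # walltime (required)
#SBATCH --nodes=2                    # override default of 1 node
#SBATCH --ntasks-per-node=16         # override default of 1 task
#SBATCH --mem-per-cpu=2000           # override default of 1000 MB per CPU

# Launch the program across all allocated tasks
srun ./my_program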

Default Limits

On Mount Moran, default limits were expressed as the number of concurrently used cores per project account, and investors received an increase in concurrent core usage. To provide more flexible scheduling for all research groups, ARCC is looking at implementing limits based on concurrent usage of cores, memory, and job walltime. These limits will be defined in the near future and will be subject to FAC review.

Partitions


The Slurm configuration on Teton is fairly complex in order to accommodate the layout of hardware, investor partitions, and runtime limits. The following tables describe the partitions on Teton. Some require a QoS, which is auto-assigned during job submission. The tables represent Slurm allocatable units rather than hardware units.

General Partitions

Teton General Slurm Partitions

Partition     | Max Walltime | Node Cnt | Core Cnt | Thds / Core | CPUs | Mem (MB) / Node | Req'd QoS
moran         | 7-00:00:00   | 284      | 4544     | 1           | 4544 | 64000 or 128000 | N/A
teton         | 7-00:00:00   | 180      | 5760     | 1           | 5760 | 128000          | N/A
teton-gpu     | 7-00:00:00   | 8        | 256      | 1           | 256  | 512000          | N/A
teton-hugemem | 7-00:00:00   | 10       | 256      | 1           | 256  | 1024000         | N/A
teton-knl     | 7-00:00:00   | 12       | 864      | 4           | 3456 | 384000          | N/A

Investor Partitions

Investor partitions are often heterogeneous and may contain a mix of hardware, as indicated below where appropriate. They require a special QoS for access.

Teton Investor Slurm Partitions

Partition       | Max Walltime | Node Cnt | Core Cnt | Thds / Core | Mem (MB) / Node | Req'd QoS | Preemption | Owner                                        | Associated Projects
inv-arcc        | Unlimited    | 2        | 44       | 1           | 64000 or 192000 | TODO      | Disabled   | Jeffrey Lang                                 | arcc
inv-atmo2grid   | 7-00:00:00   | 31       | 496      | 1           | 64000           | TODO      | Disabled   | Dr. Naughton, Dr. Mavriplis, Dr. Stoellinger | turbmodel, rotarywingcfg
inv-chemistry   | 7-00:00:00   | 6        | 96       | 1           | 128000          | TODO      | Disabled   | Dr. Hulley                                   | hulleylab, pahc, chemcal
inv-clune       | 7-00:00:00   | 16       | 256      | 1           | Mixed           | TODO      | Disabled   | Dr. Clune                                    | evolvingai, iwctml
inv-compmicrosc | 7-00:00:00   | 6        | 96       | 1           | 128000          | TODO      | Disabled   | Dr. Aidey (Composite Micro Sciences)         | rd-hea
inv-compsci     | 7-00:00:00   | 12       | 288      | 1           | 384999          | TODO      | Disabled   | Dr. Lars Kotthoff                            | mallet
inv-fertig      | 7-00:00:00   | 1        | 16       | 1           | 128000          | TODO      | Disabled   | Dr. Fertig                                   | gbfracture
inv-geology     | 7-00:00:00   | 16       | 256      | 1           | 64000           | TODO      | Disabled   | Dr. Chen, Dr. Mallick                        | inversion, f3dt, geologiccarbonseq, stochasticaquiferinv
inv-inbre       | 7-00:00:00   | 24       | 160      | 1           | 128000          | TODO      | Disabled   | Dr. Blouin                                   | inbre-train, inbreb, inbrev, human_microbiome
inv-jang-condel | 7-00:00:00   | 2        | 32       | 1           | 128000          | TODO      | Disabled   | Dr. Jang-Condel                              | exoplanets, planets
inv-liu         | 7-00:00:00   | 4        | 64       | 1           | 128000          | TODO      | Disabled   | Dr. Liu                                      | gwt
inv-microbiome  | 7-00:00:00   | 85       | 2816     | 1           | 128000          | TODO      | Disabled   | Dr. Ewers                                    | bbtrees, plantanalytics
inv-multicfd    | 7-00:00:00   | 11       | 352      | 1           | 128000          | TODO      | Disabled   | Dr. Mousaviraad (Mechanical Engineering)     | multicfd
inv-physics     | 7-00:00:00   | 4        | 128      | 1           | 128000          | TODO      | Disabled   | Dr. Dahnovsky                                | euo, 2dferromagnetism, d0ferromagnetism, microporousmat
inv-wagner      | 7-00:00:00   | 2        | 32       | 1           | 128000          | TODO      | Disabled   | Dr. Wagner                                   | wagnerlab, latesgenomics, ltcichlidgenomics, phylogenref, ysctrout

Special Partitions

Special partitions require access to be granted directly to user accounts or project accounts, and typically require additional approval.

Partition   | Max Walltime | Node Cnt | Core Cnt | Thds / Core | Mem (MB) / Node | Owner        | Associated Projects             | Notes
dgx         | 7-00:00:00   | 2        | 40       | 2           | 512000          | Dr. Clune    | See partition inv-clune above   | NVIDIA V100 with NVLink, Ubuntu 16.04
inv-compsci | 7-00:00:00   | 12       | 72       | 4           | 512000          | Dr. Kotthoff | See partition inv-compsci above | Includes the KNL nodes only

More Details

Generally, to run a job on the cluster you will need a project account, a walltime limit, and a script or command to run (see Required Inputs and Default Values and Limits above).

A handy migration reference comparing MOAB/Torque commands to Slurm commands can be found on the Slurm home site: Batch System Rosetta Stone.


Commands


sacct

  • Query detailed accounting information about jobs that have completed. Use this utility to get information about running or completed jobs.
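For example, to summarize a single job (the job ID is a placeholder):

Code Block
sacct -j 1234567 --format=JobID,JobName,Partition,State,Elapsed,MaxRSS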

salloc

  • Request an interactive job for debugging and/or interactive computing. ARCC configures the salloc command to launch an interactive shell on individual compute nodes, with your current environment carried over from the current session (except in the dgx partition, where the environment is reinitialized for Ubuntu). This command requires specifying a project account (-A / --account=) and walltime (-t / --time=).
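A sketch of an interactive request with the two required options plus four tasks (the project name is a placeholder):

Code Block
salloc --account=myproject --time=2:00:00 --ntasks=4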

sbatch

  • Submit a batch job consisting of a single job or a job array. Several methods can be used to submit batch jobs: a script file can be provided as an argument on the command line, or, less commonly, the batch job can be created interactively from standard input. We recommend writing the batch job as a script so that it can be referenced later.
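A sketch of both submission methods (the script, account, and command are placeholders):

Code Block
# Submit a job script provided as an argument
sbatch myscript.sh

# Less commonly, create the batch job from standard input
sbatch --account=myproject --time=1:00:00 <<'EOF'
#!/bin/bash
hostname
EOF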

scancel

  • Cancel jobs after submission. Works on pending and running jobs. By default, provide a job ID or set of job IDs to cancel. Alternatively, flags can be used to cancel specific jobs by account, name, partition, QoS, reservation, or node list. To cancel all array tasks, specify the parent job ID.
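For example (the job IDs are placeholders):

Code Block
# Cancel one or more jobs by job ID
scancel 1234567 1234568

# Cancel all of your own pending jobs in a given partition
scancel --user=$USER --state=PENDING --partition=teton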

sinfo

  • View the status of the Slurm partitions or nodes. The reason nodes are drained or down can be seen using the -R flag.
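For example:

Code Block
# Summarize partition and node states
sinfo

# Show offline (down/drained) nodes along with the recorded reason
sinfo -R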

squeue

  • View what is running or waiting to run in the job queue. Several modifiers and formats can be supplied to the command. You may be interested in the use of arccq as an alternative. The command arccjobs also provides a summary.
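For example, to show only your own jobs:

Code Block
squeue -u $USER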

sreport

  • Obtain information regarding usage since the last database roll up (usually around midnight each day). sreport can be used as an interactive tool to see usage of the clusters.
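A sketch of a per-user utilization report for one account over a date range (the account name and dates are placeholders):

Code Block
sreport cluster AccountUtilizationByUser Accounts=arcc Start=2019-01-01 End=2019-02-01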

srun

  • A front-end launcher for job steps, including serial and parallel jobs. srun can be considered an equivalent to mpirun or mpiexec when launching MPI jobs. Each use of srun inside a job creates a job step, which provides accounting information relating to memory, CPU time, and other parameters that are valuable when a job terminates unexpectedly or when historical information is needed.
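Inside a batch script, each srun invocation becomes an accounted job step; a sketch assuming an MPI program named my_mpi_program:

Code Block
# Launch the MPI program on all allocated tasks as a tracked job step
srun ./my_mpi_program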

Info

There are additional Slurm commands, but they are not covered here because they are of limited use to general users on our systems. Reading the man pages for the Slurm commands can be highly beneficial, and if you have questions about submitting jobs, ARCC encourages you to contact arcc-help@uwyo.edu.


...

Slurm is a flexible and powerful workload manager. It has been configured to allow considerable expressiveness when allocating particular node features and specialized hardware. Some resources are requested through the Generic Resource (GRES) mechanism, while others are requested through the constraints option.

GPU Requests

Request 16 CPUs and 2 GPUs for an interactive session:

...
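The original example is elided above. A minimal sketch of such a request, assuming standard GRES syntax and the teton-gpu partition (the project name is a placeholder), might look like:

Code Block
salloc --account=myproject --time=2:00:00 --partition=teton-gpu --ntasks=16 --gres=gpu:2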

Examples

Example 1

In the following example, we use ARCC as our project. We want to give ARCC access to run longer jobs, and we assume that the "long-jobs-14" QOS has previously been created.

  • We run the command "assoc", which returns the following definition for ARCC from the Slurm database:

Code Block
            Account       User   Def QOS                  QOS 
-------------------- ---------- --------- --------------------

inv-arcc                            arcc          arcc,normal 
 arcc                               arcc          arcc,normal
  arcc                awillou2      arcc          arcc,normal 
  arcc                dperkin6      arcc          arcc,normal 
  arcc                 jbaker2      arcc          arcc,normal 
  arcc                  jrlang      arcc          arcc,normal 
  arcc                mkillean      arcc          arcc,normal 
  arcc                powerman      arcc          arcc,normal 
  arcc                salexan5      arcc          arcc,normal

This shows the default QOS configuration: "arcc" is the default QOS that all arcc jobs run under, while "arcc" project users also have access to the "normal" QOS.

  • We want to give the "arcc" project access to the 14-day job runtime feature. We do this by adding the proper QOS to the ARCC project:

Code Block
sacctmgr modify account arcc where cluster=teton set qos+=long-jobs-14
  • To verify that the QOS has been added to the "arcc" project, we run the "assoc" command as root:

Code Block
            Account       User   Def QOS                  QOS 
-------------------- ---------- --------- --------------------

inv-arcc                            arcc            arcc,normal 
 arcc                               arcc arcc,long-job-14,norm+ 
  arcc                awillou2      arcc arcc,long-job-14,norm+
  arcc                dperkin6      arcc arcc,long-job-14,norm+
  arcc                 jbaker2      arcc arcc,long-job-14,norm+
  arcc                  jrlang      arcc arcc,long-job-14,norm+
  arcc                mkillean      arcc arcc,long-job-14,norm+
  arcc                powerman      arcc arcc,long-job-14,norm+
  arcc                salexan5      arcc arcc,long-job-14,norm+
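Once the QOS is attached, a member of the arcc project could request the extended walltime by specifying the QOS at submission time; a sketch (the script name is a placeholder, and the QOS name follows the sacctmgr command above):

Code Block
sbatch --account=arcc --qos=long-jobs-14 --time=14-00:00:00 myscript.sh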

Notes

  • Do we advertise this?

    Keep it under wraps for now, since this will be allowed on a per-request basis.

  • How do we stop people from abusing it?

    There are a couple of things in place to keep this from being abused:

      • We allow only a maximum of 10 jobs running under this QOS.
      • ARCC must enable access to the long-job-14 QOS.
      • By default, we don't attach this QOS to projects. Once the requirement for the project to run long jobs is over, we will remove the QOS from the project.


Troubleshooting

  • Node won't come online

If a node won't come online for some reason, check the node information for a Slurm reason by running:

Code Block
scontrol show node=XXX

The command output should include the reason why Slurm won't bring the node online. For example:

Code Block
root@tmgt1:/apps/s/lenovo/dsa# scontrol show node=mtest2
NodeName=mtest2 Arch=x86_64 CoresPerSocket=10 
   CPUAlloc=0 CPUTot=20 CPULoad=0.02
   AvailableFeatures=ib,dau,haswell,arcc
   ActiveFeatures=ib,dau,haswell,arcc
   Gres=(null)
   NodeAddr=mtest2 NodeHostName=mtest2 Version=18.08
   OS=Linux 3.10.0-693.21.1.el7.x86_64 #1 SMP Fri Feb 23 18:54:16 UTC 2018 
   RealMemory=64000 AllocMem=0 FreeMem=55805 Sockets=2 Boards=1
   State=IDLE+DRAIN ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=arcc 
   BootTime=06.08-11:44:57 SlurmdStartTime=06.08-11:47:35
   CfgTRES=cpu=20,mem=62.50G,billing=20
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
   Reason=Low RealMemory [slurm@06.10-10:00:27]

This indicates that the memory definition for the node and what Slurm actually found are different. You can use

Code Block
free -m

to see what the system thinks it has in terms of memory.

The node definition should specify a memory value less than or equal to the total shown by the "free" command. Verify that the settings match the memory the node should have; if not, investigate to determine the cause of the discrepancy.
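Once the underlying cause has been corrected (for example, the RealMemory value in the node definition), the node can typically be returned to service with scontrol; a sketch using the node from the example above:

Code Block
scontrol update NodeName=mtest2 State=RESUME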


Configuring Slurm for Investments


The Teton cluster is the University of Wyoming's condo cluster, providing computing resources to the general UW research community. Because it is a condo cluster, researchers can invest funds in the cluster to expand its capacity. As an investor, a researcher is afforded special privileges, specifically first access to the nodes their funds purchased.

To establish an investment within Slurm, follow these steps:

  1. First, define an investor partition that refers to the purchased nodes. To create the partition definition, edit /apps/s/slurm/latest/etc/partitions-invest.conf and add:

Code Block
# Comment describing the investment
PartitionName=inv-<investment-name> AllowQos=<investment-name> \
  Default=No \
  Priority=10 \
  State=UP \
  Nodes=<nodelist> \
  PreemptMode=off \
  TRESBillingWeights="CPU=1.0,Mem=.00025"

Where:
  • investment-name is the name you wish to give the new investment

  • nodelist is the list of nodes to be included in the investment definition, e.g. t[305-315],t317

  • TRESBillingWeights should be adjusted based on the node specifications

Note: the nodes should also be added to the general partition list, i.e. teton.

...

  1. Once you have checked and re-checked your work for correctness, configure Slurm with the new partition definition:

Code Block
scontrol reconfigure
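To confirm that Slurm picked up the new partition, you can query it directly; a sketch (the investment name is a placeholder):

Code Block
scontrol show partition inv-<investment-name>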

For the following steps, you will need access to two ARCC-created commands:

  • add_slurm_inv

  • add_project_to_inv

...

  1. Now that you have the investor partition set up, you need to create the associated Slurm DB entries. First, run:

Code Block
/root/bin/idm_scripts/add_slurm_inv inv-<investment-name>

This will create the investor umbrella account that ties the investment to projects.
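To verify the umbrella account and its associations, the accounting database can be queried; a sketch (the investment name is a placeholder):

Code Block
sacctmgr show account name=inv-<investment-name> withassoc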

...

  1. Now add the investor project to the investor umbrella account.

Code Block
/root/bin/idm_scripts/add_proj_to_inv inv-<investment-name> <project>

...