
Steps to get started in HPC with ARCC:

1: Get an ARCC HPC account by being added to an HPC project

To access an ARCC HPC resource, you must be added as a member of a project on that resource, whether you’re a UWyo faculty member (Principal Investigator; PI), researcher, or student. (If you’ve received an e-mail from arcc-admin@uwyo.edu indicating you’ve been added to a project, you already have access to the HPC cluster.)

  1. If you are a PI, you may request a project be created in which you will automatically be listed as PI and added as a member.

  2. If you are not a PI, a PI may request creation of a project on your behalf, then request that you be added as a member.

2: Log into HPC

ARCC HPC users should be aware that accessing our HPC means working in a Linux environment, which will be different from a Windows/Mac PC.

  1. If you’ve received e-mails from arcc-admin (granting you access to a project) you’re ready to connect/login to the cluster!

  2. Log in through Southpass: ARCC’s OnDemand resource, which makes ARCC’s Beartooth HPC available through your web browser. You will be redirected to the wyologin page and be prompted for your UWYO login credentials and 2-factor authentication.

 Southpass Login Directions


  1. If you prefer to log into the cluster over SSH/command line, directions depend on the client you’re connecting from. See this page to log in using SSH.

  2. In your command line window, type the following command: ssh <your_username>@<cluster_name>.arcc.uwyo.edu (a concrete example appears after this list).

    1. When connected, a bunch of text will scroll by. This will vary depending on the cluster. On Beartooth, for example, there are usage rules, tips, and a summary of your storage utilization across all projects that you are part of.

    2. Upon login to the HPC, the command prompt will look something like this: [arccuser@blog1 ~]$. To learn more about the command prompt and command line, please look through our documentation on Command Line Interface.
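
As a concrete sketch of the SSH command above (the username is hypothetical, and the hostname follows the <cluster_name>.arcc.uwyo.edu pattern; the exact cluster name depends on which ARCC system you have access to):

ssh cowboyjoe@beartooth.arcc.uwyo.edu
# After authenticating, the prompt changes to show you are on a login node, e.g.:
[cowboyjoe@blog1 ~]$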

3: Start Processing

While processing, you may also need to access software (3a), transfer data on or off the HPC (3b), or use a graphical interface (3c); see the subsections below.

A key principle of any shared computing environment is that resources are shared among users and therefore must be scheduled. Please DO NOT simply log into the HPC and run your computations without requesting or scheduling resources from Slurm through a batch script or interactive job.

ARCC uses the Slurm Workload Manager to regulate and schedule user-submitted jobs on our HPC systems. In order for your job to submit properly to Slurm, you must at minimum specify your account and a time in your submission. There are two ways to run your work on an ARCC HPC system from the command line:

Option 1: Run it as an Interactive Job

These are jobs that allow users access to computing nodes where applications can be run in real time. This may be necessary when performing heavy processing of files, or compiling large applications. Interactive jobs can be requested with an salloc command. ARCC has configured the clusters so that interactive jobs provide shell access on compute nodes themselves rather than running on the login node. An example of an salloc request can be expanded below.

 Interactive Job Examples and Explanations (Examples include using: salloc, --account, --time, --partition, --nodes, --cpus-per-task, --mem)

The following is the simplest example of a command to start an interactive job. This command has the bare minimum information (account and time) in order to run any job on an ARCC cluster:

[cowboyjoe@hpclog1 ~]$ salloc --account=<your project name> --time=01:00

Breaking it down:

  1. [cowboyjoe@hpclog1 ~]$ This is our command prompt and shows our username (cowboyjoe), the node we're currently working on (hpclog1), and the folder we're located in on the HPC (~, which is short for our /home directory). To learn more about the command line in Linux, see our Linux Command Line tutorial here.

  2. salloc is the Slurm command to allocate a work session on the cluster.

  3. --account is a flag specifying the account/project under which you’re performing your work in the session.

  4. --time is a flag specifying your “walltime limit”, which is how long you will have access to the HPC resources you’re requesting in your work session. At the end of this time, you will be disconnected from your requested resources.

    1. Syntax for time may be in the form of “minutes”, “minutes:seconds”, “hours:minutes:seconds”, “days-hours”, “days-hours:minutes”, or “days-hours:minutes:seconds” (see the sketch after this list).

    2. On ARCC HPC, the maximum time you may request for any job is 7 days (job runtime may be extended, upon request).

  5. In this basic example, aside from time requested and account used, all allocated resources are set to the default.

    1. Total CPUs/cores available for this session is set to 1 by default.

    2. Total nodes we have access to in our session is set to 1 by default.

    3. Total memory allocated to the session is 1GB by default.
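
As a sketch of the accepted --time formats described in item 4 above (the values here are arbitrary; pick whatever fits your work):

# All of the following are valid walltime requests
salloc --account=<your project name> --time=30            # 30 minutes
salloc --account=<your project name> --time=02:30:00      # 2 hours, 30 minutes
salloc --account=<your project name> --time=1-12          # 1 day, 12 hours
salloc --account=<your project name> --time=2-00:00:00    # 2 days (under the 7-day maximum)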

The next example is a set of commands. The first line is a command to allocate an interactive job requesting specific hardware to perform the computations in our session. The second line runs a python script:

 [cowboyjoe@hpclog1 ~]$ salloc --account=arcc --time=40:00 --partition=moran --nodes=1 --cpus-per-task=8 --mem=8G
 python my_job_sequential_steps.py

Breaking it down:

  1. --account is a flag specifying the account/project under which you’re performing your work in the session. In this case, the project’s name is arcc.

  2. --time is a flag specifying the “walltime limit” for this job, in this case, 40 minutes.

  3. --partition is a flag specifying which partition we want our nodes to come from for this session. You can view the partitions and learn about hardware specific to different partitions by viewing the hardware summary associated with the HPC you’re using. Since our overall computational needs are not significant, we requested the moran partition. These nodes have 16 cores/node and at least 32GB of RAM/node.

  4. --nodes is a flag telling Slurm how many compute nodes we need available to us to run the computations we want to run in the session. In this example, we only ask for 1.

  5. --cpus-per-task is a flag specifying how many cores/cpus we need available to run any single task. By default, ntasks (the number of tasks we run concurrently) is set to 1. Since our salloc command didn't specify --ntasks or any other "tasks-per" related parameters, the total CPU count for the requested session will be what was requested using the --cpus-per-task flag. In this example, 8 cores x (default) 1 task run concurrently = 8 cores total.

  6. --mem is a flag to request the minimum memory/RAM per node that we’ll need for our job. Above we requested 8G, so 8GB. This is under the 32GB of RAM/node for the partition (moran) we asked for in our salloc command, so Slurm accepts this memory request.

    1. The --mem value should include a unit suffix (G for GB).

    2. If a unit suffix is not specified and only an integer is provided, the default unit is M (megabytes).

  7. In response, Slurm assigns us a single node (node1), with access to a total of 8 cores and 8GB RAM for 40 minutes.

salloc: Granted job allocation 1012024
salloc: Nodes node1 are ready for job
  1. In the last line of the example we use the granted resources to run a python script named ‘my_job_sequential_steps.py' using the default version of python installed on the cluster. If we need more hardware resources than we asked for in our salloc command, we may get an error (such as an “oom-kill” or out-of-memory error indicating we didn’t request enough RAM), or our job will run for an extremely long time (which may mean we didn’t request enough CPU and/or we didn’t parse out our computational work appropriately in the script). A sketch of the full interactive session appears below.
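
Putting the pieces together, a minimal sketch of a complete interactive session might look like the following (the compute node name, job id, and python invocation are illustrative, not output you should expect verbatim):

# Request an interactive session from the login node
[cowboyjoe@hpclog1 ~]$ salloc --account=arcc --time=40:00 --partition=moran --nodes=1 --cpus-per-task=8 --mem=8G
salloc: Granted job allocation 1012024
salloc: Nodes node1 are ready for job

# We now have a shell on the compute node; run the work there
[cowboyjoe@node1 ~]$ python my_job_sequential_steps.py

# When finished, exit to release the allocation back to the scheduler
[cowboyjoe@node1 ~]$ exit
salloc: Relinquishing job allocation 1012024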

Option 2: Run it as a Batch Job

This means running one or more tasks on a compute environment. Batch jobs are initiated using scripts or command-line parameters. They run to completion without further human intervention (fire and forget). Batch jobs are submitted to a job scheduler (on ARCC HPC, Slurm) and run on the first available compute node(s).

 Batch Script Example and Explanation (Example using --account, --time, --partition, --mem, --job-name, --mail-type, --mail-user)

In the following example we create our own batch script, which Slurm then runs to execute our job and any associated tasks. Below is an example of a batch script we created named myfirstjob.sh that runs our computational work in a python script named my_job_sequential_steps.py:

#!/bin/bash
#SBATCH --account=myproject
#SBATCH --time=1-01:00:00
#SBATCH --partition=moran
#SBATCH --mem=8G
#SBATCH --job-name sequential_run
#SBATCH --mail-type=ALL
#SBATCH --mail-user=cowboyjoe@uwyo.edu

python my_job_sequential_steps.py

Breaking it down:

  1. #!/bin/bash is the “shebang” line, telling the HPC to use the bash shell to interpret the script.

  2. --account is a flag specifying the account/project under which you’re performing your work in the session. Here the account we’re using is myproject.

  3. --time is a flag specifying your “walltime limit”. This is how long the script can run on HPC resources once it begins. At the end of this time, the script will end regardless of whether computations are completed. The example sets the total time for our job to 1-01:00:00, i.e., 1 day and 1 hour.

    1. Syntax for time may be in the form of “minutes”, “minutes:seconds”, “hours:minutes:seconds”, “days-hours”, “days-hours:minutes”, or “days-hours:minutes:seconds”.

    2. On ARCC HPC, the maximum time you may request for any job is 7 days (job runtime may be extended, upon request).

  4. --partition is a flag specifying which nodes we want to use for our job. You can view the partitions and learn about hardware specific to different partitions by viewing the hardware summary associated with the HPC you’re using. Since our overall computational needs are not significant, we requested to run on the moran partition. These nodes have 16 cores/node and at least 32GB of RAM/node.

  5. --mem is a flag to request the minimum memory/RAM per node that we’ll need for our job. Above we specify 8G, so 8GB. This is under the 32GB of RAM/node for the partition (moran) we asked for in our batch script, so Slurm accepts this memory request.

    1. The --mem value should include a unit suffix (G for GB).

    2. If a unit suffix is not specified and only an integer is provided, the default unit is M (megabytes).

  6. --job-name is a flag to specify the name of our job allocation. This will appear with the job id number if we query running jobs on the HPC.

  7. --mail-type is a flag to specify which job events should trigger notification e-mails. Setting it to ALL means notification e-mails will be sent when a job begins, ends, fails, hits time limits, never runs due to problems with the request, or gets requeued (a sketch of a more selective setting appears after this list).

  8. --mail-user is a flag to specify the e-mail address to notify when job events occur.

  9. In this basic example, aside from time requested and account used, all allocated resources are set to the default.

    1. Total CPUs/cores available for this session is set to 1 by default.

    2. Total nodes we have access to in our session is set to 1 by default.

    3. Total tasks we will run concurrently is set to 1 by default.

  10. In the last line we run a python script named ‘my_job_sequential_steps.py' using the default version of python installed on the cluster.
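
As a small sketch of a more selective --mail-type setting (mentioned in item 7 above), the header lines below would send e-mail only when the job ends or fails; the other #SBATCH lines in the script stay the same:

#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=cowboyjoe@uwyo.edu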

Assuming our batch script and the python script are complete and ready to run, we log into the HPC, navigate to the location of our script, and submit our job with the following command:

sbatch myfirstjob.sh

Since this batch script first makes a request to Slurm to schedule our job and allocate resources before performing any computations, we can submit it on the login node.
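
Once the job is submitted, you can check on it from the login node. A minimal sketch (the job id here is made up; by default Slurm writes the job’s output to a file named slurm-<jobid>.out in the directory you submitted from):

[cowboyjoe@hpclog1 ~]$ sbatch myfirstjob.sh
Submitted batch job 1012025

# List your own pending and running jobs
[cowboyjoe@hpclog1 ~]$ squeue -u $USER

# After the job finishes, review its output
[cowboyjoe@hpclog1 ~]$ cat slurm-1012025.out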

To learn more about running parallel jobs, running jobs with GPUs, and avoiding common issues, see our SLURM tutorial.

3a. Get access to software

Option 1: Use The Module System

LMOD is software used on HPC clusters to maintain dynamic user environments and allow users to switch between software stacks and packages. You can check whether software is available as a module by running module spider, as shown in the following expandable example.

 Using module spider to search for software modules:

Module spider

The spider subcommand is a great search tool to find out if the software package has been installed as a system package. From the command line, run the module spider command to output a list of the available software for the entire system:

$ module spider

To search for specific packages and/or versions, you can supply the names/arguments to the command:

 $ module spider samtools 

If there is only one version, the output will contain information regarding which compilers are required to be loaded before the package can be loaded, as well as a brief help section for the module. If there are multiple versions available, the output lists which versions are available and instructions to get more information on an individual version. To get information regarding a specific version of the package, include the version as part of the argument:

 $ module spider samtools/1.6 

You can also use regular expressions to search for modules. See the output from module help for more information.

Example

The general process for all apps you might want to load is:

  1. Find versions.

  2. Find a version’s dependencies.

  3. Check what is already loaded and what is missing.

  4. Load required (missing) dependencies.

  5. Load application.

# Find versions of samtools
[@blog2 ~]$ module spider samtools
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  samtools:
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Versions:
        samtools/1.14
        samtools/1.16.1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  For detailed information about a specific "samtools" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:
     $ module spider samtools/1.16.1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

# Does samtools/1.16.1 have any dependencies that need to be loaded first?
[@blog2 ~]$ module spider samtools/1.16.1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  samtools: samtools/1.16.1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    You will need to load all module(s) on any one of the lines below before the "samtools/1.16.1" module is available to load.
      arcc/1.0  gcc/12.2.0
    Help:
      SAM Tools provide various utilities for manipulating alignments in the
      SAM format, including sorting, merging, indexing and generating
      alignments in a per-position format

# Check what is already loaded
# In this case, arcc/1.0 is loaded by default whenever a new session is started.
# But gcc/12.2.0 is missing.
[@blog2 ~]$ ml
Currently Loaded Modules:
  1) slurm/latest (S)   2) arcc/1.0 (S)   3) singularity/3.10.3
  Where:
   S:  Module is Sticky, requires --force to unload or purge

# You can load gcc/12.2.0 on the same line as samtools/1.16.1, but it must be loaded before it, i.e. it appears to the left of it.
[@blog2 ~]$ module load gcc/12.2.0 samtools/1.16.1

# If you'd tried the following, i.e. loading samtools before gcc, you'd see the following error:
[@blog2 ~]$ module load samtools/1.16.1 gcc/12.2.0
Lmod has detected the following error:  These module(s) or extension(s) exist but cannot be loaded as requested: "samtools/1.16.1"
   Try: "module spider samtools/1.16.1" to see how to load the module(s).
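
As a follow-on sketch, the same load line can go inside a batch script so the software is available when the job runs on a compute node (the project name and input file below are hypothetical):

#!/bin/bash
#SBATCH --account=myproject
#SBATCH --time=01:00:00

# Load the dependency first, then the application
module load gcc/12.2.0 samtools/1.16.1

# Run the tool on a hypothetical input file
samtools sort my_alignments.bam -o my_alignments.sorted.bam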

If you have a software package that is not installed as a module, but you think it would be widely utilized, make a request with us to see if it can be installed. Learn more about using LMOD here.

Option 2: Install it Yourself

If your software packages are somewhat research specific, you may install them to your project. ARCC will be providing an additional allocation of 250GB in every MedicineBow /project directory under /project/ for software installations. Information on installing software on your own will vary depending on the software. General instructions may be found here.
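
Details vary by package, but as one hedged sketch, a Python package could be installed into a personal virtual environment under your project’s software space (the module name, project path, and package name below are placeholders, not an ARCC-prescribed layout):

# Load a Python module first (the exact module name/version varies by cluster)
module load python

# Create and activate a virtual environment in your project's software area
python -m venv /project/<your_project>/software/my-env
source /project/<your_project>/software/my-env/bin/activate

# Install the package you need into that environment
pip install <package_name>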

3b. Transfer Data on/off HPC

Data transfer can be performed between HPC resources using a number of methods. The two easiest ways to transfer data are detailed below. A cumulative list of methods to transfer data on or off of ARCC resources is detailed here.

Option 1: Southpass

 Southpass Directions


Option 2: Globus (For big data transfers)

 Globus Configuration Directions

Configuring Globus Online

  1. Login to Globus' Web app.

    1. Click “Login” on the top right corner of the webpage.

    2. To use your UWYO organizational login, search for ‘University of Wyoming’. It should autofill as you type.

      1. Hit 'Continue' to continue setting up your Globus account, then click 'Allow' to allow Globus to search data using your ID and groups, and to manage transfers.

    3. Note that sign-up and setup may be skipped if you’ve already logged into Globus and your session is still cached.

  2. On the left side of your browser window will be a file manager menu.

  3. Select this menu, then type uw-arcc in the Collection field to pull up the ARCC storage spaces you have access to. ARCC manages several data storage endpoints, so be sure you pick the one associated with your storage and HPC cluster (if applicable). You may have access to multiple ARCC endpoints/collections.

    1. MedicineBow and Data-Alcova are accessible under the GCSv5.4 endpoint named Medicine Bow.

      1. MedicineBow data (/gscratch, /home, and /project) are under /cluster/medbow

      2. Alcova storage (aka new Alcova) is under path /cluster/alcova

    2. Alcova (aka the Old Alcova) is designated with the GCSv5.4 endpoint named Alcova FileSystem Access.

    3. Beartooth is designated with GCSv5.4 endpoint named TetonCreek/Beartooth.

      1. Older Beartooth storage will be available on the TetonCreek/Beartooth Collection, and clicking on the link will take you to your /~/ home directory.

        1. Putting / as the path will display all shares you have access to. You should be able to access your project, home and gscratch directories from this collection.

    4. Pathfinder is designated with the GCSv5.4 endpoint named Pathfinder S3 Access. You will need to set up S3 keys for access through Globus.

  4. Note: Recently migrated Globus endpoints will be set as “Managed Mapped Collection (GCS)” and you should use these endpoints to access ARCC resources through Globus. If you have mapped private collections (those shared from your personal computer or elsewhere) those will be set as “Private Mapped Collections” (GCP). Older collections will be mapped as GCSv4 Shares.

  5. When you get to the folder you’re looking for, you can save it for future access by clicking the bookmark icon to the right of the Path text box.

  6. To return to a saved location, go to the Bookmarks option on the far left side of the screen.

  7. If you wish to copy, sync, or share, select the files/folders you want, choose a location or e-mail address to transfer or share with, and click ‘Start’.

  8. Click ‘Activity’ in the left pane to observe the transfer progress.

3c. View Visual Data or Access HPC with Graphics / Visual Interface

If you want to view visual output you’ve created on Beartooth or just need access to a GUI (Graphical User Interface), please use Southpass. Pages have been created for accessing Beartooth and Wildiris in a graphical user interface.
