We will provide a quick tour of the high-level ideas behind using Linux and HPC on our cluster, which should allow you to access and use our Beartooth cluster to perform the analyses associated with this workshop.
Goals:
Introduce ARCC and what types of services we provide including “what is HPC?”
Define “what is a cluster”, and how is it made of partitions and compute nodes.
How to access and start using ARCC’s Beartooth cluster - using our SouthPass service.
How to start an interactive desktop and open a terminal to use Linux commands within.
Introduce the basics of Linux, the command-line, and how its File System looks on Beartooth.
Introduce Linux commands to allow navigation and file/folder manipulation.
Introduce Linux commands to allow text files to be searched and manipulated.
Introduce using a command-line text-editor and an alternative GUI based application.
How to set up a Linux environment to use R(/Python) and start RStudio, by loading modules.
How to start interactive sessions to run on a compute node, to allow computation, requesting appropriate resources.
How to put elements together to construct a workflow that can be submitted as a job to the cluster, which can then be monitored.
0 Getting Started
Users may log in from their own devices (BYOD). Do you have a computer with you to follow along with the workshop?
Log into UWYO wifi if you can. (Non-UW users will be unable to)
Follow along with our slides available at: final link
Logging in:
If you have a UWYO username and password: UW Users may test their HPC access by opening a browser and then going to the following URL: https://southpass.arcc.uwyo.edu.
The standard wyologin page will be presented. Log in with your UWYO username and password.
If you do not have a UWYO username and password: Come see me for a Yubikey and directions that will allow you to access the Beartooth HPC cluster.
00 Introduction and Setting the Scope:
The roadmap to becoming a proficient HPC user can be long, complicated, and varies depending on the user. There are a large number of concepts to cover. Some of these concepts are included in today’s training but given time constraints, it’s impossible to get to all of them. This workshop session introduces key high-level concepts, and follows a very hands-on demonstration approach, for you to follow.
Our training will help provide the foundation necessary for you to use Beartooth cluster specifically to perform some of the exercises later in this workshop.
Because of our limited time this morning, please submit any questions to the Slack channel for this workshop, and workshop instructors will address them as they are available.
More extensive and in-depth information and walkthroughs are available on our wiki and you are welcome to dive into those in your own time. Content within them should provide you with a lot of the foundational concepts you would need to be familiar with to become a proficient HPC user.
01 About UW ARCC and HPC
Goals:
Describe ARCC’s role at UW
Provide resources for ARCC Researchers to seek help
Introduce staff members, including those available throughout the workshop
Introduce the concept of an HPC cluster, its architecture, and when to use one
Introduce the Beartooth HPC architecture, hardware, and partitions
About ARCC and how to reach us
Based on: Wiki Front Page: About ARCC
How to Reach Us: | About ARCC: |
---|---|
E-Mail: arcc-help@uwyo.edu | UW ARCC is the primary research computing facility for the University of Wyoming, and our department is housed within the Division of Research and Economic Development. Our expert staff are committed to providing the UW research community with access to specialized research computing infrastructure, knowledge, support, and a large range of scientific software pre-configured for your use. If you are new to ARCC, please begin with our “getting started” wiki pages. We manage, maintain, and support all internally housed scientific computing resources, including HPC and high performance data storage. UW ARCC also aims to support all high performance computing resources available to any UW researcher. This includes support for External HPC Resources, which include but are not limited to Wyoming-NCAR Alliance Allocations through NWSC. Our internally offered services are detailed in our service list. |
In short, we maintain internally housed scientific resources including more than one HPC Cluster, data storage, and several research computing servers and resources.
We are here to assist UW researchers like yourself with your research computing needs.
Three ARCC staff members will be available throughout the workshop if you need help using Beartooth:
ARCC End User Support | |
---|---|
Simon Alexander | HPC & Research Software Manager |
Dylan Perkins | Research Computing Facilitator |
Lisa Stafford | Research Computing Facilitator |
What is HPC
HPC stands for High Performance Computing and is one of UW ARCC’s core services. HPC is the practice of aggregating computing power in a way that delivers a much higher performance than one could get out of a typical desktop or workstation. HPC is commonly used to solve large problems, and has some common use cases:
Performing computation-intensive analyses on large datasets: MB/GB/TB in a single or many files, computations requiring RAM in excess of what is available on a single workstation, or analysis performed across multiple CPUs (cores) or GPUs.
Performing long, large-scale simulations: Hours, days, weeks, spread across multiple nodes each using multiple cores.
Running repetitive tasks in parallel: 10s/100s/1000s of small short tasks.
Homogeneous vs Heterogeneous HPCs
There are 2 types of HPC systems:
Homogeneous: All compute nodes in the system share the same architecture. CPU, memory, and storage are the same across the system. (Ex: NWSC’s Derecho)
Heterogeneous: The compute nodes in the system can vary architecturally with respect to CPU, memory, even storage, and whether they have GPUs or not. Usually, the nodes are grouped in partitions. Beartooth is a heterogeneous cluster and our partitions are described on the Beartooth Hardware Summary Table on our ARCC Wiki.
Beartooth Hardware and Partitions
See Beartooth Hardware Summary Table on the ARCC Wiki.
02 Using Southpass to access the Beartooth HPC Cluster
Southpass is our Open OnDemand resource allowing users to access Beartooth over a web-based portal. Learn more about Southpass here.
Goals:
Demonstrate how users log into Southpass
Demonstrate requesting and using an XFCE Desktop Session
Introduce the Linux File System and how it compares to common workstation environments
Introduce HPC specific directories and how they’re used
Introduce Beartooth specific directories and how they’re used
Demonstrate how to access files using the Beartooth File Browsing Application
Demonstrate the use of emacs, available as a GUI based text-editor
Based on: SouthPass
Log in and Access the Cluster
Login to Southpass
Using Southpass
Interactive Applications in Southpass are requested by filling out a webform to specify hardware requirements while you use the application. Other applications can be accessed without filling out a webform:
Exercise: Beartooth XFCE Desktop
Requests are made through a webform in which you specifically request certain hardware or software to use on Beartooth.
While we use a webform to request Beartooth resources on Southpass, later training will show how resource configurations can be requested from the command line via the salloc or sbatch commands.
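As a preview of what the command-line equivalent of the webform can look like, here is a minimal sketch of a batch script. The account name arccanetrain is borrowed from the salloc example in the Login Node Policy section of this training; the resource values (one node, one core, 4G of memory, 40 minutes) and the file name example_job.sh are illustrative assumptions, not ARCC recommendations.

```shell
# Write a minimal batch script. Each #SBATCH line plays the role of
# one field in the Southpass webform. All values are illustrative.
cat > example_job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=arccanetrain
#SBATCH --time=00:40:00
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

# The commands below run on the compute node that Slurm assigns.
echo "Running on $(hostname)"
EOF

# Check the script for shell syntax errors without running it.
bash -n example_job.sh && echo "syntax OK"

# On Beartooth, the script would then be submitted with:
#   sbatch example_job.sh
```

The same options can be passed to salloc on the command line to request an interactive session instead of a batch job.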
Structure of the Linux File System and HPC Directories
Linux File Structure
This is specific to the Beartooth HPC but most Linux environments will look very similar
Linux Operating Systems (Generally)
Compare and Contrast: Linux, HPC Specific, Beartooth Specific
Based on: Beartooth Filesystem
HPC Specific Folders:
/home (Common across most shared HPC resources)
What is it for? Similar to your user folder on a PC, or Macintosh HD → Users on a Mac.
Permissions: It should have files specific to you, personally, as the HPC user. By default no one else has access to the files in your home.
Directory path: Every HPC user on Beartooth has a folder under /home/<your_username>, also available as $HOME.
Default Quota: 25GB
/project (Common across most shared HPC resources)
What is it for? Think of it as a shared folder for you and all your project members. Similar to /glade/campaign on NCAR HPC.
Permissions: All project members have access to the folder. By default, all project members can read any files or folders within, and can write in the main project directory.
Directory path: Get to it at /project/biocompworkshop/
A subfolder in /project/biocompworkshop/ is created for each user when that user is added to the project, but only that user can write to their own folder.
Default Quota: 1TB, which covers the project folder itself and all its contents and subfolders.
/gscratch (Scratch folder, common across most HPC resources but sometimes just called "scratch")
What is it for? It’s “scratch space”: storage dedicated for you to hold temporary data you need access to.
Permissions: Like /home, its contents are specific to you, personally, as the HPC user. By default no one else has access to the files in your /gscratch.
Directory path: Every HPC user on Beartooth has a gscratch directory under /gscratch/<your_username>, also available as $SCRATCH.
Default Quota: 5TB
Don’t store anything in /gscratch that you need or don't have backed up elsewhere. It's not meant to store anything long term.
Everyone’s /gscratch directory is subject to ARCC's purge policy.
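To see how these locations look from a terminal, the sketch below lists each directory's owner, group, and permissions. It assumes the paths described above; because $SCRATCH may not be set outside Beartooth, the script falls back to the documented /gscratch/<your_username> layout and simply reports any directory that does not exist on the machine where it runs.

```shell
# Locations described above. The fallbacks are assumptions for
# machines where the variables are not set.
home_dir="${HOME:-/home/$(id -un)}"
scratch_dir="${SCRATCH:-/gscratch/$(id -un)}"
project_dir="/project/biocompworkshop"

for dir in "$home_dir" "$scratch_dir" "$project_dir"; do
    if [ -d "$dir" ]; then
        # -d shows the directory entry itself (permissions, owner,
        # group) rather than listing its contents.
        ls -ld "$dir"
    else
        echo "$dir: not available on this machine"
    fi
done
```

The permission string at the start of each line shows who can read from or write to the directory, which is a quick way to confirm that your home and gscratch folders are private to you.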
Beartooth Specific
/apps (Specific to ARCC HPC)
Similar to the folder where applications are installed on Windows or on a Mac. It is where applications are installed and where modules are loaded from. (More on that later).
/alcova (Specific to ARCC HPC)
Additional research storage for research projects that may not require HPC but is accessible from Beartooth.
You won’t have access to it unless you were added to an alcova project by the PI.
Exercise: File Browsing in Southpass GUI
Users can access their files using the Southpass file browser app.
Demonstration opening emacs GUI based text editor
03 Using Linux and the Command Line
Goals:
Introduce the shell terminal and command line interface
Demonstrate starting a Beartooth SSH shell using Southpass
Demonstrate information provided in a command prompt
Introduce Policy for HPC Login Nodes
Demonstrate how to navigate the file system to create and remove files and folders using command line interface (CLI): mkdir, cd, ls, mv, cp
Demonstrate the use of man, --help and identify when these should be used
Demonstrate using a command-line text editor, vi
Based on: The Command Line Interface
Exercise: Shell Terminal Introducing Command Line
Login Node Policy
As a courtesy to your colleagues, please do not run the following on any login nodes:
Anything compute-intensive (tasks using significant computational/hardware resources - Ex: using 100% cluster CPU)
Long running tasks (over 10 min)
Any collection of a large # of tasks resulting in a similar hardware footprint to actions mentioned previously.
Not sure? Use salloc to be on the safe side.
Ex: salloc --account=arccanetrain --time 40:00
See more on ARCC’s Login Node Policy here
Demonstrating how to get help in CLI
    [arcc-t10@blog2 ~]$ man pwd
    NAME
           pwd - print name of current/working directory
    SYNOPSIS
           pwd [OPTION]...
    DESCRIPTION
           Print the full filename of the current working directory.
           -L, --logical
                  use PWD from environment, even if it contains symlinks
           -P, --physical
                  avoid all symlinks
           --help display this help and exit
           --version
                  output version information and exit
           If no option is specified, -P is assumed.
           NOTE: your shell may have its own version of pwd, which usually supersedes the version described here. Please refer to your shell's documentation for details about the options it supports.
    [arcc-t10@blog1 ~]$ cp --help
    Usage: cp [OPTION]... [-T] SOURCE DEST
      or:  cp [OPTION]... SOURCE... DIRECTORY
      or:  cp [OPTION]... -t DIRECTORY SOURCE...
    Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.
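The note in the pwd manual page about the shell having "its own version" hints at how to decide where to look for help. The sketch below, which assumes a bash shell with GNU coreutils (as is typical on Linux clusters), uses type to check whether a command is built into the shell or is a separate program, then asks for help in the matching way.

```shell
# 'type' reports whether a name is a shell builtin or an external program.
type cd
type cp

# Shell builtins such as cd are documented by bash's own 'help' command.
help cd | head -n 3

# External GNU programs such as cp print a usage summary with --help;
# 'man cp' gives the full manual page.
cp --help | head -n 4
```

As a rule of thumb, --help is the quickest reminder of a command's options, while man is the place to read when you need the full description.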
Demonstrating file navigation in CLI
File Navigation demonstrating the use of:
    [arcc-t10@blog2 ~]$ pwd
    /home/arcc-t10
    [arcc-t10@blog2 ~]$ ls
    Desktop  Documents  Downloads  ondemand  R
    [arcc-t10@blog2 ~]$ cd /project/biocompworkshop
    [arcc-t10@blog2 biocompworkshop]$ pwd
    /project/biocompworkshop
    [arcc-t10@blog2 biocompworkshop]$ cd arcc-t10
    [arcc-t10@blog2 arcc-t10]$ ls -la
    total 2.0K
    drwxr-sr-x  2 arcc-t10 biocompworkshop 4.0K May 23 11:05 .
    drwxrws--- 80 root     biocompworkshop 4.0K Jun  4 14:39 ..
    [arcc-t10@blog2 arcc-t10]$ pwd
    /project/biocompworkshop/arcc-t10
    [arcc-t10@blog2 arcc-t10]$ cd ..
    [arcc-t10@blog2 biocompworkshop]$ pwd
    /project/biocompworkshop
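The session above mixes absolute paths (starting with /) and relative paths (such as arcc-t10 or ..). The following sketch repeats the same moves inside a throwaway directory so it can be tried anywhere without touching real data. The directory names project and user are hypothetical stand-ins for /project/biocompworkshop and your own subfolder.

```shell
# Build a small practice tree in a temporary location.
base="$(mktemp -d)"
mkdir -p "$base/project/user"

# Absolute path: works no matter where you currently are.
cd "$base/project/user"
pwd

# Relative path: '..' means the parent of the current directory.
cd ..
parent="$(pwd)"

# Relative path: move back down into the subfolder by name.
cd user

# 'cd -' returns to the previous directory after going elsewhere.
cd /
cd - > /dev/null
back="$(pwd)"

echo "parent directory: $parent"
echo "returned to:      $back"
```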
Demonstrating how to create and remove files and folders using CLI
Creating, moving and copying files and folders:
    [arcc-t10@blog2 arcc-t10]$ touch testfile
    [arcc-t10@blog2 arcc-t10]$ mkdir testdirectory
    [arcc-t10@blog2 arcc-t10]$ ls
    testdirectory  testfile
    [arcc-t10@blog2 arcc-t10]$ mv testfile testdirectory
    [arcc-t10@blog2 arcc-t10]$ cd testdirectory
    [arcc-t10@blog2 testdirectory]$ ls
    testfile
    [arcc-t10@blog2 testdirectory]$ cd ..
    [arcc-t10@blog2 arcc-t10]$ cp -r testdirectory ~
    [arcc-t10@blog2 arcc-t10]$ cd ~
    [arcc-t10@blog2 ~]$ ls
    Desktop  Documents  Downloads  ondemand  R  testdirectory
    [arcc-t10@blog2 ~]$ cd testdirectory
    [arcc-t10@blog2 testdirectory]$ ls
    testfile
    [arcc-t10@blog2 testdirectory]$ rm testfile
    [arcc-t10@blog2 testdirectory]$ ls
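The same create, move, copy, and remove steps can be collected into a short script. This is a sketch that works inside a temporary directory rather than your project or home folder, and the name testdirectory_copy is a hypothetical addition used to keep the copy next to the original.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir="$(mktemp -d)"
cd "$workdir"

touch testfile                    # create an empty file
mkdir testdirectory               # create a folder
mv testfile testdirectory/        # move the file into the folder

# -r copies the folder together with everything inside it.
cp -r testdirectory testdirectory_copy

# rm removes files; rmdir removes a folder only once it is empty.
rm testdirectory_copy/testfile
rmdir testdirectory_copy

# Show what is left: the original folder and the file inside it.
ls -R
```

Note that rm deletes immediately, with no recycle bin, so it is worth running ls first to confirm what a path refers to.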
Text Editor Cheatsheets
Vi/Vim Cheatsheet | Nano Cheatsheet |
---|---|
Demonstrating vi/vim text editor
Vi/Vim is one of several text editors available for the Linux command line.
    [arcc-t10@blog2 arcc-t10]$ vi testfile
    stuff and things
    ~
    ~
    ~
    ~
    :wq
    [arcc-t10@blog2 arcc-t10]$ cat testfile
    stuff and things
Try the vim tutor
Vim Tutor is a walkthrough for new users to get used to Vim. Run vimtutor to start it:
    [arcc-t10@blog2 ~]$ vimtutor
    ===============================================================================
    =    W e l c o m e   t o   t h e   V I M   T u t o r    -    Version 1.7      =
    ===============================================================================
    Vim is a very powerful editor that has many commands, too many to explain in a tutor such as this. This tutor is designed to describe enough of the commands that you will be able to easily use Vim as an all-purpose editor.
    ...