Introduction: The workshop session will provide a quick tour covering high-level concepts, commands and processes for using Linux and HPC on our Beartooth cluster. It will cover enough to allow an attendee to access the cluster and to perform analysis associated with this workshop.

...

What is HPC

HPC stands for High Performance Computing and is one of UW ARCC’s core services. HPC is the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop or workstation. HPC is commonly used to solve large problems; typical use cases include:

...

  • We typically have multiple users independently running jobs concurrently across compute nodes.

  • Resources are shared, but your jobs do not interfere with anyone else’s resources.

    • i.e. you have your own cores, your own block of memory.

  • If someone else’s job fails it does NOT affect yours.

...

There are 2 types of HPC systems:

  1. Homogeneous: All compute nodes in the system share the same architecture. CPU, memory, and storage are the same across the system. (Ex: NWSC’s Derecho)

  2. Heterogeneous: The compute nodes in the system can vary architecturally with respect to CPU, memory, even storage, and whether they have GPUs or not. Usually, the nodes are grouped in partitions. Beartooth is a heterogeneous cluster and our partitions are described on the Beartooth Hardware Summary Table on our ARCC Wiki.
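A quick way to list the partitions defined on a cluster is Slurm’s sinfo command, run from any Beartooth shell (the partitions you see should match the Hardware Summary Table):

Code Block
# Summarize the partitions (groups of compute nodes) on the cluster:
[<username>@blog2 ~]$ sinfo --summarize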

...

A reservation can be considered a temporary partition.

It is a set of compute nodes reserved for a period of time for a set of users/projects, who get priority use.
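As a sketch (the reservation name below is a placeholder, not the workshop’s actual reservation), a reservation is used by naming it when requesting resources:

Code Block
# Request an interactive session on the reserved nodes (reservation name is hypothetical):
[<username>@blog2 ~]$ salloc --account=arccanetrain --reservation=<reservation_name> --time=40:00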

...

Important Dates:

  1. After the 17th of June this reservation will stop, and you will drop down to general usage if you have another Beartooth project.

  2. The project itself will be removed after the 24th of June, and you will no longer be able to use or access it. Please copy anything you require out of the project before then.
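For example (the source path below is an assumption about the workshop project’s location; substitute your actual directories), you could copy a folder into your home space:

Code Block
# Copy a directory out of the workshop project before it is removed
# (/project/arccanetrain/<your_folder> is a hypothetical path - adjust to your layout):
[<username>@blog2 ~]$ cp -r /project/arccanetrain/<your_folder> ~/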

...

Walk-through (displays a filled-out web form): steps for requesting a Beartooth XFCE Desktop
  1. Click on Beartooth XFCE Desktop
    You will be presented with a form asking for specific information.

    1. Project/Account: specifies the project you have access to on the HPC Cluster

    2. Reservation: not usually used for our general cluster use, but set up to access specific hardware that has been reserved for this workshop.

    3. Number of Hours: How long you plan to use the Remote Desktop Connection to the Beartooth HPC.

    4. Desktop Configuration: How many CPUs and how much memory you require to perform your computations within this remote desktop session.

    5. GPU Type: The GPU hardware you want to access, specific to your use case. This may be set to “None - No GPU” if your computations do not require a GPU. Note: you can select DGX GPUs (listed as V100s in the GPU Type drop-down).

  2. You should see an interactive session starting. When it’s ready, it will turn green.

    1. Note the Host: field. Your Interactive session has been allocated to a specific host on the cluster. This is the node you are working on when you’re using your remote desktop session.

    2. Click Launch Beartooth XFCE Desktop to open your Remote Desktop session

  3. You should now see a Linux Desktop in your browser window

    (Screenshot: the XFCE desktop as it appears in the browser.)

    1. Beartooth runs Red Hat Enterprise Linux. If you’ve worked on a Red Hat System, it will probably look familiar.

    2. If not, hopefully it looks similar enough to a Windows or Mac Graphical OS Interface.

      1. Apps dock at the bottom (Similar to Mac OS, or Pinned apps in taskbar on Windows OS)

      2. Desktop icons provide links to specific folder locations and files, as on Mac and PC.

Note: While we use a webform to request Beartooth resources on SouthPass, later training will show how resource configurations can be requested through the command line via salloc or sbatch commands, as sketched below.
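As a preview (the resource values below are illustrative), the web-form fields map directly onto Slurm flags:

Code Block
# Web-form field        ->  Slurm flag (values illustrative)
#   Project/Account     ->  --account
#   Number of Hours     ->  --time
#   Desktop CPUs/Memory ->  --cpus-per-task / --mem
#   GPU Type            ->  --gres=gpu:1 (only if you need a GPU)
[<username>@blog2 ~]$ salloc --account=arccanetrain --time=02:00:00 --cpus-per-task=2 --mem=8G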

...

  1. /apps (Specific to ARCC HPC) is like Program Files on Windows or Applications on a Mac.

    1. Where applications are installed and where modules are loaded from. (More on that later).

  2. /alcova (Specific to ARCC HPC).

    1. Additional research storage for projects that may not require HPC but is accessible from Beartooth.

    2. You won’t have access to it unless you were added to an alcova project by the PI.
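A quick way to check whether you have access (the project name below is a placeholder) is to try listing the directory; “Permission denied” means you are not a member:

Code Block
# Try listing an alcova project directory (project name is hypothetical):
[<username>@blog2 ~]$ ls /alcova/<project_name>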

...

Exercise: File Browsing in SouthPass GUI

...

  • The Beartooth Shell Access opens a new browser tab with a shell running on a login node. Do not run any computation on these.
    [<username>@blog2 ~]$

  • The SouthPass Interactive Desktop (terminal) is already running on a compute node.
    [<username>@t402 ~]$
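If you are ever unsure which node a terminal is on, the prompt shows it, and the hostname command confirms it (output may include the full domain name):

Code Block
# On a login node:
[<username>@blog2 ~]$ hostname
blog2
# On a compute node (e.g. from a SouthPass Interactive Desktop terminal):
[<username>@t402 ~]$ hostname
t402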

...

Login Node Policy

...

  1. Anything compute-intensive (tasks using significant computational/hardware resources - Ex: running CPUs at 100%)

  2. Long running tasks (over 10 min)

  3. Any collection of a large number of tasks resulting in a hardware footprint similar to the actions mentioned previously.

  4. Not sure? Use salloc to be on the safe side. This will be covered later.
    Ex: salloc --account=arccanetrain --time=40:00

  5. See more on ARCC’s Login Node Policy here

...

  • man - Short for the manual page. This is an interface to view the reference manual for the application or command.

  • man pages are only available on the login nodes.

Code Block
[arcc-t10@blog2 ~]$ man pwd
NAME
       pwd - print name of current/working directory
SYNOPSIS
       pwd [OPTION]...
DESCRIPTION
       Print the full filename of the current working directory.
       -L, --logical
              use PWD from environment, even if it contains symlinks
       -P, --physical
              avoid all symlinks
       --help display this help and exit
       --version
              output version information and exit
       If no option is specified, -P is assumed.
       NOTE:  your  shell  may have its own version of pwd, which usually supersedes the version described here.  Please refer to your shell's documentation
       for details about the options it supports.
  • --help - an option supported by most commands. It prints a brief usage summary and the options the command accepts, as shown below.

Code Block
[arcc-t10@blog1 ~]$ cp --help
Usage: cp [OPTION]... [-T] SOURCE DEST
  or:  cp [OPTION]... SOURCE... DIRECTORY
  or:  cp [OPTION]... -t DIRECTORY SOURCE...
Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.

...

Note: On Beartooth, vi maps to vim i.e. if you open vi, you're actually starting vim.
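As a quick reference (the filename is a placeholder), the basic vim workflow is:

Code Block
# Open (or create) a file:
[<username>@blog2 ~]$ vi myfile.txt
# Inside vim:
#   i      enter insert mode and type your text
#   Esc    return to normal mode
#   :w     save (write) the file
#   :q     quit (:wq saves and quits; :q! quits without saving)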

...

Demonstrating vi/vim text editor

...

Since the cluster has to cater to everyone, we cannot provide a single environment that provides everything for everybody.

Instead, we provide modules that a user loads to configure their environment for their particular needs within a session.
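A typical session looks like the following (the module name and version are illustrative; run module avail to see what Beartooth actually provides):

Code Block
# See which modules are available:
[<username>@t402 ~]$ module avail
# Load a module (name/version illustrative):
[<username>@t402 ~]$ module load python/3.10
# Confirm what is loaded in this session:
[<username>@t402 ~]$ module list
# Unload everything when done:
[<username>@t402 ~]$ module purge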

...

We’ve covered the following high-level concepts, commands and processes:

  • What is HPC and what is a cluster - focusing on ARCC’s Beartooth cluster.

  • An introduction to Linux and its File System, and how to navigate around using an Interactive Desktop and/or using the command-line.

  • Linux command-line commands to view, search, parse, sort text files.

  • How to pipe the output of one command to the input of another, and how to redirect output to a file.

  • Using vim as a command-line text editor and/or emacs as a GUI within an Interactive Desktop.

  • Setting up your environment (using modules) to provide R/Python environments, and other software applications.

  • Accessing compute nodes via a SouthPass Interactive Desktop, and requesting different resources (cores, memory, GPUs).

  • Requesting interactive sessions (from a login node) using salloc.

  • Setting up a workflow, within a script, that can then be submitted to the Slurm queue using sbatch, and how to monitor jobs (see the sketch after this list).
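As a closing sketch tying several of these together (the filenames and resource values are placeholders), a small script that pipes one command into another and redirects the result to a file might look like:

Code Block
#!/bin/bash
# Illustrative workflow script - the account is from this workshop; other values are placeholders:
#SBATCH --account=arccanetrain
#SBATCH --time=10:00
#SBATCH --cpus-per-task=1
# Pipe the output of one command into another, then redirect the result to a file:
grep "pattern" input.txt | sort > sorted_matches.txt

Submit it with sbatch <script_name>.sh and monitor it with squeue -u <username>.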

...