Introduction: This workshop shows users how to access the clusters remotely and how to set up and personalize their environment for their computational needs. After the workshop, participants will understand:
How to (remotely) access the cluster.
How to set up various environments allowing the use of a variety of programming languages, applications, libraries and utilities.
Participants need introductory-level experience with Linux and the ability to use a text editor from the command line.
Course Goals:
Introduce a number of ways to access the Beartooth cluster.
Explain what LMOD is and how to use it to set up an environment.
01: Accessing the Cluster
Topics:
Access the Beartooth Cluster
Navigate various folders.
Take a look at the basic system/cluster.
Exercises: Log on, then create and run a Python script.
(SSH onto a login node – if using an existing ARCC account.)
Access via SouthPass and start a shell terminal tab.
Commands: man, id, groups, arccquota
Create and run a Python script.
You have an Existing Account: Log on
There are a number of ways to access the cluster: Logging Into HPC:
Open up a terminal.
Use a client such as MobaXterm.
ssh <username>@beartooth.arcc.uwyo.edu
[username@blog1 ~]$ # UW: Beartooth (blog1/blog2)
Temporarily use a test account:
Open up Chrome
Navigate to: https://southpass.arcc.uwyo.edu/
Start Beartooth Shell Access
Opening Screen:
Message of the day
Storage usage - across project spaces
General Format: [<username>@<server/node-name> <folder>]$
[<username>@blog2 ~]$ arccquota
# ‘man’ is only available on the login nodes.
# It is not available on the compute nodes.
[<username>@blog2 ~]$ man id
[<username>@blog2 ~]$ id --version
[<username>@blog2 ~]$ id
[<username>@blog2 ~]$ groups
Which groups?
[arcc-t05@blog1 ~]$ id
uid=10339923(arcc-t05) gid=10339923(arcc-t05) groups=10339923(arcc-t05),89997(beartooth),446824(uwit-research-arccanetraining),5735503(teton_backup),6000211(arccanetrain)
[arcc-t05@blog1 ~]$ groups
arcc-t05 beartooth uwit-research-arccanetraining teton_backup arccanetrain
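The same information that id and groups print can also be read from inside Python. The following is a minimal sketch using only the standard-library os, pwd and grp modules (Unix only). The helper name describe_user is illustrative and is not part of any ARCC tooling.

```python
import grp
import os
import pwd


def describe_user():
    """Return (username, uid, gid, group names) for the current process."""
    uid = os.getuid()
    gid = os.getgid()
    try:
        username = pwd.getpwuid(uid).pw_name
    except KeyError:
        username = str(uid)  # this uid has no passwd entry
    names = []
    # Include the primary group alongside the supplementary groups.
    for group_id in sorted(set(os.getgroups()) | {gid}):
        try:
            names.append(grp.getgrgid(group_id).gr_name)
        except KeyError:
            names.append(str(group_id))  # group id without a name
    return username, uid, gid, names


if __name__ == "__main__":
    user, uid, gid, groups = describe_user()
    print(f"uid={uid}({user}) gid={gid}")
    print("groups:", " ".join(groups))
```

The group names matter because membership in a project group (arccanetrain in the example above) is what grants access to the matching /project folder.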
Beartooth: FileSystem
Type | Location | Description |
home | /home/<username> | Space for configuration files and software installations. |
project | /project/<project-name>/[username] | Space to collaborate among project members. Data here is persistent and is exempt from purge policy. |
gscratch | /gscratch/<username> | Space to perform computing for individual users. Data here is subject to a purge policy defined below. |
node local scratch | /lscratch | Available only on compute nodes. |
memory filesystem | /dev/shm | RAM-based tmpfs available as part of RAM for very rapid I/O operations; small capacity. |
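To get a rough feel for how full each of these locations is, a script can ask the operating system directly. This is a sketch only: shutil.disk_usage reports on the whole filesystem that holds a path, not on your personal quota, so arccquota remains the tool to use for quota figures. The helper name report_usage is an assumption made for illustration, and any path that does not exist on the current machine is simply reported as missing.

```python
import os
import shutil


def report_usage(path):
    """Return (total, used, free) in GiB for the filesystem holding path.

    Returns None when the path does not exist on this machine.
    """
    if not os.path.exists(path):
        return None
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.used / gib, usage.free / gib


if __name__ == "__main__":
    user = os.environ.get("USER", "<username>")
    for path in (os.path.expanduser("~"), f"/gscratch/{user}", "/lscratch", "/dev/shm"):
        result = report_usage(path)
        if result is None:
            print(f"{path}: not present on this machine")
        else:
            total, used, free = result
            print(f"{path}: {used:.1f} GiB used of {total:.1f} GiB ({free:.1f} GiB free)")
```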
Home and Project Folders:
# Home folder:
[]$ cd ~
[]$ pwd
/home/arcc-t05
[]$ cd /gscratch/arcc-t05
# Shared project space
[]$ cd /project/arccanetrain/
[arcc-t05@blog1 arccanetrain]$ ls
arcc-t01 arcc-t06 arcc-t11 arcc-t16 arcc-t21 arcc-t26 brewer mkillean
arcc-t02 arcc-t07 arcc-t12 arcc-t17 arcc-t22 arcc-t27 excotest salexan5
arcc-t03 arcc-t08 arcc-t13 arcc-t18 arcc-t23 arcc-t28 intro_to_hpc
arcc-t04 arcc-t09 arcc-t14 arcc-t19 arcc-t24 arcc-t29 lmainzer
arcc-t05 arcc-t10 arcc-t15 arcc-t20 arcc-t25 arcc-t30 lreilly
Copy files:
[]$ cd
[~]$ cp -r /project/arccanetrain/intro_to_hpc/ .
[~]$ cd intro_to_hpc/
[intro_to_hpc]$ ls
Intro_to_hpc.pdf python01.py python01.py.fixed run_gpu.sh run.sh
Let's run a Python script:
# If you have NOT already copied the files.
# Navigate back to your home folder.
[]$ cd
[~]$ mkdir intro_to_hpc
[~]$ cd intro_to_hpc/
[intro_to_hpc]$ ls
[intro_to_hpc]$ pwd
/home/<username>/intro_to_hpc
[intro_to_hpc]$ vim python01.py
"python01.py" [New File]
# Using vim: Press ESC followed by ‘i’ to INSERT, then start typing:
import sys
print("Python version: " + sys.version)
print("Version info: " + sys.version_info)
# Using vim: ESC followed by ‘:wq’, then Return
Let's run a Python script: Fixed:
[]$ python python01.py
Python version: 3.8.16 (default, May 31 2023, 12:44:21)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-18)]
Traceback (most recent call last):
  File "python01.py", line 3, in <module>
    print("Version info: " + sys.version_info)
TypeError: can only concatenate str (not "sys.version_info") to str
# Let's update the code:
From: print("Version info: " + sys.version_info)
To: print("Version info: " + str(sys.version_info))
[]$ python python01.py
Python version: 3.8.17 (default, Aug 10 2023, 12:50:17)
[GCC 8.5.0 20210514 (Red Hat 8.5.0-20)]
Version info: sys.version_info(major=3, minor=8, micro=17, releaselevel='final', serial=0)
[]$ python --version
Python 3.8.17
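An alternative to wrapping the value in str() is to use an f-string, which formats each value as text and so avoids the TypeError. This is a sketch of the same script written that way. The function name version_report is introduced here for illustration and is not in the workshop files.

```python
import sys


def version_report():
    """Build the two lines the workshop script prints."""
    # f-strings format non-string values for us, so no explicit str() is needed.
    return [
        f"Python version: {sys.version}",
        f"Version info: {sys.version_info}",
    ]


if __name__ == "__main__":
    for line in version_report():
        print(line)
```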
02: Setting up the Environment and LMOD
Topics:
LMOD: Setting up an Environment.
What’s available.
How to find (spider) modules.
Loading / purging modules.
Dependencies.
What do we have available?
Compilers and ecosystems: GNU family, Intel’s oneAPI, Nvidia’s hpc-sdk
Languages: C/C++, Fortran, Go, Java, Julia, Perl, Python, R, Ruby, Rust
Scientific libraries and toolkits: Built with a specific compiler: GNU by default
Standalone applications and utilities: Installed using:
Conda
Containers: running Singularity (not Docker)
We can create a Singularity image from a Docker image.
Binaries/Executables
Setting up your environment:
Exercises
Environment variables: env, echo
Find and load modules.
Commands: module spider/load/purge/ml
Understand dependencies and compiler stacks.
Run our python script using a specific version of python.
Load R (and Python).
Default modules – why define the module version.
Environment Variables:
[]$ env
[]$ echo $PATH
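PATH is a single string of directories separated by colons, searched from left to right. As a small sketch, the same variable can be read from Python and split into one directory per line, which is often easier to read than the raw echo output. The helper name path_entries is illustrative.

```python
import os


def path_entries(path_value=None):
    """Split a PATH-style string into its directories, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [entry for entry in path_value.split(os.pathsep) if entry]


if __name__ == "__main__":
    for position, entry in enumerate(path_entries(), start=1):
        print(f"{position:2d}. {entry}")
```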
What’s being used?
[]$ which python
/usr/bin/python
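Python's standard library offers a similar lookup through shutil.which, which searches the directories in PATH and returns the first match. A minimal sketch, where locate is a hypothetical wrapper name:

```python
import shutil


def locate(command):
    """Return the full path found on PATH for command, or None if absent."""
    return shutil.which(command)


if __name__ == "__main__":
    for command in ("python", "python3", "R"):
        print(f"{command}: {locate(command)}")
```

A result of None means the command is not on your PATH yet, which on a cluster usually means the relevant module has not been loaded.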
You can’t break the System:
[]$ ls /usr/bin
[]$ cd /usr/bin
[bin]$ ls
[]$ pwd
/usr/bin
# Permissions (ugo:rwx) and Ownership (user:group)
[bin]$ ls -al
[bin]$ ls -al python*
# You can't break the system.
[bin]$ rm python
rm: cannot remove 'python': Permission denied
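The permission string and ownership that ls -al shows can also be inspected programmatically. This sketch uses os.stat, stat.filemode and os.access from the standard library. The function name describe is an assumption for illustration.

```python
import os
import stat


def describe(path):
    """Return (mode string, owner uid, group gid, writable by me) for path."""
    info = os.stat(path)
    mode = stat.filemode(info.st_mode)      # e.g. 'drwxr-xr-x', as printed by ls -l
    writable = os.access(path, os.W_OK)     # can the current user write here?
    return mode, info.st_uid, info.st_gid, writable


if __name__ == "__main__":
    for path in ("/usr/bin", os.path.expanduser("~")):
        mode, uid, gid, writable = describe(path)
        print(f"{path}: {mode} uid={uid} gid={gid} writable={writable}")
```

For a regular user, /usr/bin is expected to report writable=False, which is why the rm above is refused.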
sudo: you will not be granted sudo access – do not ask!
What’s available?
[bin]$ cd ~
[~]$ cd intro_to_hpc/
[]$ ml
[]$ module avail
[]$ module load gcc/12.2.0
[]$ ml
[]$ module avail
# What is different compared to the first time we called ml?
[]$ module load gcc/11.2.0
# What happened?
What’s available? Compiler tree:
[]$ module load gcc/11.2.0
# What happened?
Due to MODULEPATH changes, the following have been reloaded:
  1) gmp/6.2.1   2) mpfr/4.1.0   3) zlib/1.2.12   4) zstd/1.5.2
The following have been reloaded with a version change:
  1) gcc/12.2.0 => gcc/11.2.0
# You can only have one compiler loaded at a time within a session.
[]$ ml
[]$ module avail
[]$ module purge
[]$ ml
# What do you notice?
What’s available? Modules loaded by default:
[]$ module purge
The following modules were not unloaded:
  (Use "module --force purge" to unload all):
  1) slurm/latest   2) arcc/1.0
[]$ ml
Currently Loaded Modules:
  1) slurm/latest (S)   2) arcc/1.0 (S)
  Where:
   S: Module is Sticky, requires --force to unload or purge
Module load/spider: Dependencies:
[]$ module load python/3.10.6
Lmod has detected the following error: These module(s) or extension(s) exist but cannot be loaded as requested: "python/3.10.6"
   Try: "module spider python/3.10.6" to see how to load the module(s).
What’s different between these command-lines?
[]$ module spider python/3.10.6
[]$ module spider python/3.10.8
What’s different between these command-lines? Dependencies:
[]$ module spider python/3.10.6
----------------------------------------
  python: python/3.10.6
----------------------------------------
    You will need to load all module(s) on any one of the lines below before the "python/3.10.6" module is available to load.
      arcc/1.0 gcc/12.2.0
    Help:
      The Python programming language.
[]$ module spider python/3.10.8
----------------------------------------
  python: python/3.10.8
----------------------------------------
    You will need to load all module(s) on any one of the lines below before the "python/3.10.8" module is available to load.
      arcc/1.0 gcc/11.2.0
    Help:
      The Python programming language.
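One way to think about the spider output is as a lookup table: each module lists the modules that must already be loaded before it becomes available. The sketch below is a toy model of that idea, not how Lmod is implemented. The PREREQUISITES table is hand-copied from the two spider results above, and missing_prerequisites is a hypothetical helper.

```python
# Toy prerequisite table, hand-copied from the spider output above.
# This is an illustration only; Lmod derives this from its module tree.
PREREQUISITES = {
    "python/3.10.6": ["arcc/1.0", "gcc/12.2.0"],
    "python/3.10.8": ["arcc/1.0", "gcc/11.2.0"],
}


def missing_prerequisites(module, loaded):
    """Return the prerequisites of module that are not yet in loaded."""
    return [name for name in PREREQUISITES.get(module, []) if name not in loaded]


if __name__ == "__main__":
    # After 'module purge' only the sticky modules remain.
    loaded = ["slurm/latest", "arcc/1.0"]
    for module in PREREQUISITES:
        print(f"{module}: still need {missing_prerequisites(module, loaded)}")
```

With only the sticky modules loaded, each Python version is missing exactly one prerequisite: its compiler.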
Setup Python environment:
[]$ module purge
[]$ module load gcc/12.2.0
[]$ module load python/3.10.6
[]$ python --version
# Single line:
# Order matters:
[]$ module purge
[]$ module load python/3.10.6 gcc/12.2.0
# vs
[]$ module load gcc/12.2.0 python/3.10.6
What’s happened to the PATH environment variable?
[]$ module purge
[]$ echo $PATH
[]$ module load gcc/12.2.0 python/3.10.6
[]$ echo $PATH
[]$ which python
/apps/u/spack/gcc/12.2.0/python/3.10.6-7ginwsd/bin/python
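Comparing two long PATH strings by eye is error-prone. As a sketch, if you save the value of PATH before and after a module load, a few lines of Python can list only the directories that were added. The function name added_entries and the sample values are illustrative. The spack path is the one reported by which python above.

```python
import os


def added_entries(before, after):
    """Return directories present in the 'after' PATH but not in 'before'."""
    seen = set(before.split(os.pathsep))
    return [entry for entry in after.split(os.pathsep) if entry and entry not in seen]


if __name__ == "__main__":
    # Sample values for illustration; substitute your own echo $PATH output.
    before = os.pathsep.join(["/usr/local/bin", "/usr/bin"])
    after = os.pathsep.join(
        ["/apps/u/spack/gcc/12.2.0/python/3.10.6-7ginwsd/bin", "/usr/local/bin", "/usr/bin"]
    )
    for entry in added_entries(before, after):
        print("added:", entry)
```

Because the module's directory is placed ahead of /usr/bin, the shell finds the module's python first.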
Can we use the R language?
[]$ module purge
[]$ r
[]$ R
# Can we find an 'R' module?
[]$ module avail
[]$ module spider r
# What do we see and why?
[]$ module load r/4.2.2
# How do we fix it?
Can we use the R language? Fixed:
[]$ module purge
[]$ r
[]$ R
# Can we find an 'R' module?
[]$ module avail
[]$ module spider r
# What do we see and why?
[]$ module load r/4.2.2
# How do we fix it?
[]$ module load gcc/12.2.0 r/4.2.2
[]$ ml
[]$ R --version
Can we also use the Python language?
[]$ module load python/3.10.6
[]$ python --version
Python 3.10.6
# Where’s the gcc/12.2.0?
# What happens if we:
[]$ module load gcc/11.2.0 python/3.10.8
[]$ python --version
[]$ R --version
Remember:
Only one compiler/version can be loaded into your environment at a time.
You can only load languages/applications built with the same compiler.
But, even this can introduce dependency issues.
Defaults:
[]$ module purge
[]$ module avail
[]$ module load python
[]$ python --version
Python 2.7.18 :: Anaconda, Inc.
Defaults:
Modules change and are updated.
The defaults will change and update – you might not realize.
Please define the version of a module you’re using.
This helps to replicate and triage issues.
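A script can also guard itself against a silently changed default by checking the interpreter version when it starts. This is a minimal sketch. The function name check_python is hypothetical, and 3.10 is used only because it matches the python/3.10.6 module loaded earlier.

```python
import sys


def check_python(expected_major, expected_minor, version_info=None):
    """Return True when the interpreter matches the expected major.minor."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == (expected_major, expected_minor)


if __name__ == "__main__":
    # 3.10 matches the python/3.10.6 module used earlier; adjust as needed.
    if not check_python(3, 10):
        running = sys.version.split()[0]
        print(f"Warning: expected Python 3.10 but running {running}")
```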
Advanced: Why do versions matter?
Consider Python packages installed using pip:
You install a package with respect to the version of Python you are using.
A package is not automatically available/installed across different versions.
If you start using python/3.9 and then swap to python/3.10, you will need to re-install any packages.
# Example:
[lib]$ pwd
/home/<username>/.local/lib
[]$ ls
python2.7 python3.10 python3.6 python3.8 python3.9
# These folders will only be created if you use that version.
# Each child folder has its own site-packages folder.
It is the same concept for R libraries.
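For the Python case, you can ask the interpreter itself where it keeps user-installed packages. The sketch below uses site.getusersitepackages() from the standard library. The helper user_site_name is a hypothetical function that rebuilds the per-version folder name seen in the listing above.

```python
import site
import sys


def user_site_name(version_info=None):
    """Return the per-version folder name, e.g. 'python3.10'."""
    if version_info is None:
        version_info = sys.version_info
    return f"python{version_info[0]}.{version_info[1]}"


if __name__ == "__main__":
    print("Folder name for this interpreter:", user_site_name())
    print("User site-packages directory:", site.getusersitepackages())
```

Running this under two different Python modules should print two different directories, which is why packages installed under one version are not visible to the other.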
Summary
LMOD: Setting up an Environment.