Beartooth

Overview

The Beartooth Compute Environment (also known as Beartooth) is a high-performance computing (HPC) cluster that offers over 500 compute nodes and 1.2 PB of storage. ARCC maintains an expected uptime of 98%, allowing researchers to perform computation-intensive analyses on datasets of various sizes.

Beartooth can be accessed securely from anywhere, at any time, over SSH with UWyo two-factor authentication.

Beartooth hardware

To see a summary of the Beartooth hardware, click here.

Beartooth Storage

Beartooth’s storage is divided into three isolated filesystems to ensure that researchers control where their data resides and who can access it.

  • /home: for configuration files and user-specific software installations.

  • /project: a collaborative area shared among project members.

  • /gscratch: a large, fast storage area for temporarily storing large datasets while they are being processed. This area is not backed up and is subject to periodic purges of old data.

Additional information on Beartooth Storage is accessible here. General storage policy information is available for reference here.
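As a quick orientation once you are logged in, the three areas appear as ordinary directories. Below is a minimal sketch using a placeholder project name, myproject; the exact directory layout under /project and /gscratch may differ, so consult the storage documentation above:

    # List your home directory (configuration files, user-specific software)
    ls ~

    # Move into a project's shared space (replace myproject with your project name)
    cd /project/myproject

    # Stage large temporary datasets in gscratch (not backed up, periodically purged)
    cd /gscratch/myproject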

Software

Click here to view a summary of the Beartooth software and the Beartooth software list.

Project and Account Requests

For research projects, UWyo faculty members (Principal Investigators, or PIs) can request a research project on ARCC HPC resources using this form.
Note: you may also submit an initial set of users for your research project using the same form.

User accounts require a valid UWyo email address and a UWyo-affiliated PI sponsor. UWyo faculty members can sponsor their own accounts, while students, post-doctoral researchers, and research associates must use their PI as their sponsor. Users with a valid UWyo email address can be added in the project request or added later, using the Request a Change to Your Project form.

Non-UWyo external collaborators (Ex_Co) must be sponsored by a current UWyo faculty member. Ex_Co accounts can be requested here. Please supply the Ex_Co username when requesting they be added to a project.

Logging Into Beartooth

Once access is granted, connection to ARCC HPC resources may be established via SSH.
Note that SSH connections require Two-Factor Authentication.

  1. Ensure that you are not connected to the UWguest wireless network.

  2. Open a terminal window or similar command line interface (CLI). Learn how.

  3. Type “ssh <username>@<clustername>.arcc.uwyo.edu” and press Enter.

    1. For example: ssh <username>@beartooth.arcc.uwyo.edu

      1. The first time you log in, you will get a message saying the ‘authenticity of the host … can’t be established’ and asking if you are ‘sure you want to continue’.

      2. Enter ‘yes’.

    2. You will then see a Notice to Users and a two-factor authentication message. With your mobile device ready, enter your password and accept the Duo Mobile (2FA) challenge when it pops up.

      1. The two-factor message may mention entering your password in the form <password, token>. This is no longer necessary, but it is still possible to do.

  4. Once you are connected, informational text will scroll by. This varies depending on the cluster. On Beartooth, for example, it includes usage rules, tips, and a summary of your storage utilization across all projects that you are part of.

    1. Note that when you are logged in, the command prompt will look something like this: [your_username@blog3 ~]$

      1. This shows your username and which login node you are currently utilizing. The login node (here, blog3) can, and probably will, change from one session to the next.

      2. The “~” indicates that you are in your home directory on the storage system.
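If you connect frequently, an entry in your SSH client configuration can shorten the login command. The following is a minimal sketch assuming a hypothetical username, jdoe (substitute your own); add it to ~/.ssh/config on your local machine:

    # ~/.ssh/config (jdoe is a placeholder; substitute your UWyo username)
    Host beartooth
        HostName beartooth.arcc.uwyo.edu
        User jdoe

With this entry in place, the login command shortens to “ssh beartooth”; the password and Duo Mobile (2FA) challenge are still required.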

Condo Model

Our Beartooth cluster follows a condominium-style model of resource management. Investors, who are individual researchers or groups of researchers, work with us at ARCC to purchase compute nodes and storage. These resources are then installed in our cluster and administered by the ARCC system administrators.

Condo computing resources are used simultaneously by multiple users. When investor-purchased resources are not in use by the investor-defined users, they are made available to the community. Within the condo model, investors have priority on invested resources; this is implemented through preemption. General access to investor-owned resources can be revoked immediately when investors wish to use them, and the preempted job is automatically requeued to ease the burden on users.

If an investor chooses not to implement preemption, ARCC can disable it on their resources and instead offer next-in-line access.

Beartooth also includes a number of non-investor nodes in their own partition. These nodes function differently from other nodes on the cluster: they are not based on a specific hardware set or investment level. Instead, they are a collection of community nodes and are not subject to preemption.
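To illustrate how preemption interacts with job submission, below is a minimal sketch of a Slurm batch script that opts into requeueing, so a job restarts automatically if an investor reclaims the node. The account name myproject is a placeholder; consult the scheduler documentation for the partitions and accounts available to you:

    #!/bin/bash
    # Minimal sketch of a preemption-tolerant Slurm batch script.
    #SBATCH --account=myproject    # placeholder; use your project account
    #SBATCH --requeue              # if preempted, requeue rather than fail
    #SBATCH --time=1-00:00:00      # one day of walltime
    #SBATCH --nodes=1
    #SBATCH --ntasks=1

    # Work that is safe to restart from the beginning (or that checkpoints itself)
    srun hostname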

Limitations Implemented within the Condo Model:

  • ARCC has implemented default concurrency limits to prevent individual project accounts and users from saturating the cluster at the expense of others. The default limits are listed below. To incentivize investment in the condo system, investors have their limits increased.

  • The system leverages a fair share mechanism to give projects that run jobs only on rare occasions priority over those that continuously run jobs on the system. To incentivize investment in the condo system, investors have their fair share value increased.

  • Individual jobs are subject to runtime limits based on a study performed around 2014, which set the maximum walltime for a compute job at 7 days. ARCC is currently evaluating whether the orthogonal limits of CPU count and walltime are the optimal operational mode, and is considering concurrent usage limits based on a relational combination of CPU count, memory, and walltime that would allow more flexibility for different areas of science. There will likely still be an upper limit on individual compute job walltime, as ARCC will not allow unlimited walltime, in part due to possible hardware faults.
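For reference, walltime is requested per job through the scheduler, and Slurm exposes fair share standing through its sshare utility. A minimal sketch, with myproject and job.sh as placeholders:

    # Request the stated 7-day maximum walltime (format: days-hours:minutes:seconds)
    sbatch --account=myproject --time=7-00:00:00 job.sh

    # Inspect your project's current fair share standing
    sshare -A myproject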

Citing Beartooth

For information on citing Beartooth, please reference the citing section in Documentation and Help.