AI4WY


AI4WY Overview

The AI4WY Compute Environment (Advanced Infrastructure to Accelerate Impact of AI through Applications and Innovation for Wyoming) is a high-performance computing (HPC) cluster built on NVIDIA's Grace Hopper architecture, which combines Hopper GPUs and ARM CPUs on a single chip with a fast memory interconnect. By providing high-bandwidth, coherent data transfers between CPU and GPU, this architecture addresses critical computing challenges related to data movement and energy consumption. The cluster was funded through an award from the NSF. Read more about the award details here.

The AI4WY cluster is expected to be available for users to submit requests for allocations in early 2026.

 

  1. Overview

  2. AI4WY Access & Usage

    1. Requesting Access: Subject to Approval

    2. Allocations

    3. Logging In

      1. Web Portal Login

        1. Non-UW Logins

      2. SSH Key Management

      3. SSH Key Device Configuration

  3. AI4WY Hardware & Cluster Architecture

    1. AI4WY Hardware Profile

    2. Operating System

    3. Running a Job on AI4WY

    4. Software

  4. Data Storage & Management

    1. Commonly Used Cluster Directories

    2. Storage Quotas

    3. Using Globus on AI4WY

  5. Known Issues [AI4WY]

    1. Reporting issues

AI4WY Access & Usage

Requesting Access: Subject to Approval

The fundamental difference between the AI4WY service and other UW ARCC services is that, at this time, PIs must be invited or approved by the AI4WY review committee to use the AI4WY cluster. We are not providing general access to all UW researchers. This policy also applies to any project changes that are associated with additional billing or costs.

Allocations may be requested using the project request form on the ARCC portal. In the dropdown, select the option indicating that you are requesting a new allocation on the AI4WY cluster.

Allocations

Allocation/Project Names: All AI4WY allocations follow a generic, systematic naming convention, and requestors will not be able to specify project names.

Expiration: All project allocations expire after 1 year and must be renewed or re-requested.

Requests for New Allocations, Renewals, or Allocation Changes: All requests for new allocations, adjustments to allocations, or renewal of allocations must be approved by the AI4WY review committee.

Logging In

Access to the AI4WY cluster is available through both OnDemand (web) and SSH.

Initial Login

Users who have been granted access to AI4WY should review these instructions when accessing the cluster for the first time.

With your UWYO Account:

  1. Open a browser and go to: https://ai4wy.arcc.uwyo.edu

  2. Follow the login instructions in "Web Access to the Cluster: Featuring OnDemand", replacing the URL with https://ai4wy.arcc.uwyo.edu

With a NON-UWYO (AKA “ARCC-Only”) Account:

  1. Follow the instructions in the e-mails from ARCC, then open a browser to AI4WY OnDemand at: https://ai4wy.arcc.uwyo.edu.
    Refer to the instructions available here: Cluster Access Setup for Non-UW Users (ARCC-account)

SSH Logins (Once initial login complete)

If you have set up SSH keys for Medicinebow, those keys will also work on the AI4WY cluster, provided you have been granted approval for an AI4WY project and account. (See Requesting Access: Subject to Approval.)

AI4WY Hardware and Cluster Architecture

The AI4WY cluster is made up of 2 login nodes, 12 single-GPU Grace Hopper nodes, and 24 dual-GPU Grace Hopper nodes.

Nodes are named based on usage and compute components. Please see the AI4WY Hardware Profile Page for more information about the cluster and node architecture.

Grace Hopper nodes feature NVLink-C2C, which allows CPU and GPU threads to access both CPU and GPU memory concurrently, enhancing performance and developer productivity. Users should be aware that code built on a different architecture (for example, x86_64) may need to be recompiled or rebuilt before it will run on these ARM-based nodes.
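Because rebuild problems often come down to a simple architecture mismatch, a quick check of the node's CPU architecture can help. The following is a minimal sketch, not an ARCC-provided tool; the `describe_arch` helper is hypothetical and only interprets the output of the standard `uname -m` command, which reports `aarch64` on 64-bit ARM Linux systems.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to a short hint about whether existing binaries need a rebuild.
describe_arch() {
    case "$1" in
        aarch64|arm64) echo "ARM (aarch64): binaries built for x86_64 must be rebuilt" ;;
        x86_64)        echo "x86_64: binaries built for ARM must be rebuilt" ;;
        *)             echo "unrecognized architecture: $1" ;;
    esac
}

# Report on the node this script is running on.
describe_arch "$(uname -m)"
```

To inspect an existing executable rather than the node, the standard `file` utility prints the architecture the binary was built for.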

Operating System

AI4WY was set up with the NVIDIA-provided OS image based on the Ubuntu 24.04 release, in contrast to ARCC's Medicinebow cluster, which runs RHEL 9.7 as of early 2026. Because the two clusters run different distributions, some aspects of the systems will differ, and the software installed on each operating system may also differ in version and compilation.

Software

Grace Hopper nodes are designed for high-performance computing and AI workloads, making them well suited to complex, large-scale applications. However, applications that ran on older architectures will likely not run on Grace Hopper nodes without recompiling the source code for ARM or otherwise adapting the code to the new infrastructure and hardware. AI4WY users can query preinstalled modules using the module command. The module -h command shows the full set of module commands available on the cluster.

  1. Using Preinstalled Modules

  2. Compilers and Languages

  3. NVIDIA Optimized Libraries

  4. Using MPI

  5. Container Environments

  6. Using PyTorch
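As a brief illustration of the module command mentioned above, the sketch below lists available and loaded modules. It uses only the `avail`, `list`, and `load` subcommands, which are common to the usual module systems; the specific module names on AI4WY are not given on this page, so the `load` line is left as a commented placeholder. The guard around the commands is an addition for safety when the script is run somewhere the module system is not set up.

```shell
#!/usr/bin/env bash
# Only run module commands where the module system is present (i.e. on the cluster).
if type module >/dev/null 2>&1; then
    module avail            # list modules that can be loaded
    module list             # show modules currently loaded in this shell
    # module load <name>    # placeholder: take <name> from the `module avail` output
    have_modules=yes
else
    echo "The 'module' command is not available in this shell." >&2
    have_modules=no
fi
```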

Data Storage and Management on AI4WY

Common User Directories

As on most clusters, home, project, and scratch spaces are available for users to store their data, code, and scripts. Paths for these directories are detailed below:

| Directory | Path | Specifics |
| --- | --- | --- |
| Project | /project/<allocation#> | |
| Shared Project Software | /project/<allocation#>/software | Location for project members to install software to be used and shared among project members. |
| Home | /home/<username> | |
| Scratch | /gscratch/<username> | Subject to periodic purge |
| Local Scratch | /lscratch | Only available on compute nodes |

Note that AI4WY and Medicinebow are independent clusters with independent storage. Changes made within your /home, /gscratch, or /project directories on one system will NOT be reflected on the other system.

Default Storage Quotas

Storage on AI4WY for /home, /project, /gscratch, and /lscratch is subject to default quotas, which are set when project allocations and user accounts are created. Allocations and their listed PIs may be subject to charges associated with quota increases. Quota changes and requests for increases require submission and approval through the AI4WY allocation approval process detailed above. Current default quotas are listed below:

| Directory | Current Default Quota |
| --- | --- |
| Project | 1TB |
| Shared Project Software | |
| Home | 50GB |
| Scratch | 5TB |
| Local Scratch | |
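To get a rough sense of how close a directory is to its default quota, something like the following sketch may help. The `usage_report` function is hypothetical, not an ARCC tool; it relies only on the standard `du` and `awk` utilities and assumes 1 GB = 1024 MB. The storage system may account for usage differently than `du` does, so treat the result as an estimate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: usage_report DIR QUOTA_GB
# Prints DIR's approximate size and the share of QUOTA_GB it occupies.
usage_report() {
    local dir="$1" quota_gb="$2"
    local used_kb quota_kb
    used_kb=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')
    used_kb=${used_kb:-0}                      # treat unreadable paths as empty
    quota_kb=$(( quota_gb * 1024 * 1024 ))
    echo "$dir: $(( used_kb / 1024 )) MB used, $(( 100 * used_kb / quota_kb ))% of ${quota_gb} GB"
}

# Default quotas from this page: 50GB home, 1TB project, 5TB scratch.
usage_report "$HOME" 50
# usage_report "/project/<allocation#>" 1024   # replace <allocation#> with your allocation
# usage_report "/gscratch/$USER" 5120
```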

Moving Your Data

One of the first things users typically need to do after logging into the cluster is to move their data from another location onto the AI4WY system. There are many ways to move your data and detailed examples are provided in the links below.

Using Globus to move data between AI4WY and other systems

Other ways to move your data

Computing on the Cluster: Running a Job on AI4WY

There is a single QOS for all jobs on the AI4WY cluster, and all jobs are subject to a 24-hour wall-time limit.

The AI4WY cluster uses the Slurm resource scheduler. General information about Slurm may be found here: Slurm Workload Manager.

This link provides additional information about running a job on AI4WY and a few simple examples.
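For orientation, a batch script that stays within the 24-hour wall-time limit might look like the sketch below. This is an illustration under stated assumptions rather than an official template: the account placeholder must be replaced with your own allocation name, and requesting a GPU through `--gres=gpu:1` is an assumption about how GPUs are exposed on AI4WY. The page does not list partition names, so none is set here.

```shell
#!/bin/bash
#SBATCH --job-name=ai4wy-check
#SBATCH --account=<allocation>    # placeholder: replace with your AI4WY allocation name
#SBATCH --time=01:00:00           # requested wall time; must not exceed 24:00:00
#SBATCH --nodes=1
#SBATCH --gres=gpu:1              # assumption: GPUs are requested as a generic resource

# Record where the job ran and confirm the node architecture.
node_arch="$(uname -m)"
echo "Host: $(uname -n)"
echo "Architecture: ${node_arch}"

# Show GPU status if the NVIDIA tools are on the PATH.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
fi
```

Saved as, for example, `check_node.sh`, the script would be submitted with `sbatch check_node.sh`, and `squeue -u $USER` would show its state in the queue.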

Additional pages related to running jobs on the Medicinebow Cluster that should also apply to AI4WY may be found at the following links:

Users will also be able to run interactive jobs with a GUI interface using OnDemand. Information for running a general interactive job through OnDemand may be found here:
Interactive Applications

Known Issues

Users may report issues by e-mailing arcc-help@uwyo.edu with AI4WY in the subject. This will appropriately route incidents related to the AI4WY cluster and assist our team with reporting.
A current list of known issues for AI4WY may be found here: Known Issues [AI4WY]
