Table of Contents

Glossary

Frequently Asked Questions

...

How it Works

Pathfinder uses the Simple Storage Service (S3) protocol, originally developed by Amazon, which describes it as “storage for the Internet”. On Pathfinder, S3 runs on top of object storage provided by Ceph, a storage platform available from Red Hat.

Unique Characteristics of S3

ARCC's S3 services, such as Pathfinder, do not function like Windows or traditional storage systems. A few characteristics unique to S3 are listed below.

  • S3 has two primary entities called buckets and objects.

    • Buckets are the access points and objects are stored inside them.

      • Bucket names have to be globally unique irrespective of which region they are created in.

      • Because buckets can be accessed using URLs, bucket names should follow DNS naming conventions: all letters should be lowercase, and names should not contain special characters.

    • Objects are the files stored in a bucket; directories appear as prefixes in an object's name.

      • If you upload images and want to keep them separate from other files, you can create a folder for them so that the logical address of each file carries the prefix ‘pictures/’.

      • For example, pictures/hello.jpg is a different object from images/hello.jpg.

  • 'Users' are replaced with Access Keys and 'passwords' are replaced with Secret Keys.

    • Access keys consist of two parts: an access key ID (for example, AKIAIOSFODNN7EXAMPLE) and a secret access key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). 

    • Like a user name and password, you must use both the access key ID and secret access key together to authenticate your access to your buckets.

      • Manage your access keys as securely as you do your user name and password.

      • However, access keys are associated with a project, lab, or department and cannot be associated with a specific UWYO user.

  • Permissions are functionally limited and are only supported for basic usage.

    • For example, granular access to a single folder or directory that is possible in a traditional storage system is not well supported in S3.

    • Typically, multiple users share a single access key to reach the same buckets, and nothing distinguishes one user from another.
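The naming guidance and prefix behavior above can be sketched in a few lines of Python. This is only an illustration, not ARCC's validation logic: the helper names looks_dns_safe and group_by_prefix are hypothetical, and the pattern encodes the rules stated above (lowercase letters, no special characters) plus the assumptions that digits and interior hyphens are acceptable and that names are 3 to 63 characters long.

```python
import re
from collections import defaultdict

# Assumed DNS-style pattern: lowercase letters, digits, interior hyphens,
# 3 to 63 characters. Pathfinder's exact rules may differ.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def looks_dns_safe(bucket_name):
    """Return True if the name fits the assumed DNS-style pattern."""
    return _BUCKET_RE.fullmatch(bucket_name) is not None


def group_by_prefix(keys):
    """Group object keys by their folder-like prefix.

    The prefix is everything up to and including the last '/';
    keys without a '/' are grouped under the empty prefix ''.
    """
    groups = defaultdict(list)
    for key in keys:
        prefix, sep, name = key.rpartition("/")
        groups[prefix + sep].append(name)
    return dict(groups)
```

With these helpers, pictures/hello.jpg and images/hello.jpg fall into separate groups even though both files are named hello.jpg, which is the distinction the prefix is meant to provide.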

Purpose of System

The Pathfinder S3-Ceph storage architecture is designed and hosted by ARCC to serve two primary purposes:

...

  • Programmatic access to data hosted within S3. Code running on Teton can pull data from and push data to S3.

  • Onsite backup of important user-defined data, for users who run their own storage system but still need backups.

  • Timed/temporary download links. S3 allows users to make data publicly available through a tokenized link that expires after a specified time frame, e.g., to make a file temporarily available to external users.
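The programmatic access and timed-link purposes above might look like the following with boto3, one of the packages ARCC has tested. This is a minimal sketch under stated assumptions: the wrapper names push_file, pull_file, and timed_link are hypothetical, each function expects an already-configured boto3 S3 client, and the one-hour default expiry is an arbitrary choice rather than a Pathfinder setting.

```python
def push_file(s3_client, local_path, bucket, key):
    """Upload a local file to the bucket under the given object key."""
    s3_client.upload_file(local_path, bucket, key)


def pull_file(s3_client, bucket, key, local_path):
    """Download an object from the bucket to a local path."""
    s3_client.download_file(bucket, key, local_path)


def timed_link(s3_client, bucket, key, expires_seconds=3600):
    """Return a tokenized download URL that expires after expires_seconds.

    The default of 3600 seconds is an assumption for illustration.
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_seconds,
    )
```

Anyone holding the URL returned by timed_link can download the object until it expires, so the expiry should be kept as short as the transfer allows.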

Use Cases

  • Data Transfer

Host data that end users can download either directly or with credentials.

...

Note

This system is *NOT* backed up. Data that reside on this system should also be available in other location(s). This system is intended ONLY as a secondary backup and a temporary repository for data transfers.

S3 Clients

The S3 protocol requires a client to connect to the server. A variety of Graphical User Interface (GUI) and Command Line Interface (CLI) clients can be used to connect to Pathfinder. With so many S3 clients available, ARCC has not tested them all; the ones we have tested are detailed in the table below.

...

Instructions for using Pathfinder with rclone

Scripting/Programming Packages

Some programming languages provide software packages that can use the S3 protocol to access data. ARCC has tried a few of these; they are detailed in the table below.

Package Name    Language    ARCC Tested
boto3           Python      Yes
aws.s3          R           Yes
AWS             C#          No
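As a sketch of how a script might connect with boto3, the helper below collects the arguments that boto3.client accepts for an S3-compatible service. The function names are hypothetical, and the endpoint URL must be supplied by the caller: Pathfinder's actual endpoint is not given here, so any value used with this sketch is a placeholder.

```python
def pathfinder_client_kwargs(endpoint_url, access_key, secret_key):
    """Build keyword arguments for boto3.client.

    endpoint_url is whatever address ARCC provides for Pathfinder
    (not stated in this document, so it is an assumption here).
    """
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_client(endpoint_url, access_key, secret_key):
    """Create a boto3 S3 client pointed at the given endpoint."""
    # Imported here so the helper above can be used without boto3 installed.
    import boto3

    return boto3.client(
        **pathfinder_client_kwargs(endpoint_url, access_key, secret_key)
    )
```

In practice the access key and secret key would be read from environment variables or a protected configuration file rather than written into the script.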

Cost

Price Structure for S3

This price structure is based on actual hardware costs and does not include personnel or infrastructure (network/datacenter) costs. Those have been subsidized by ARCC and the Office of Research and Economic Development.

  • One-time fee of $50 per access key/secret key pair

  • $35 $45 per terabyte per year, billed monthly based on usage
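The arithmetic behind this price structure can be sketched as follows. This is an assumption about how the billing works, not ARCC's billing code: it supposes that each monthly charge is one twelfth of the yearly per-terabyte rate multiplied by that month's usage, and the function names are hypothetical. The per-terabyte rate is passed in as a parameter rather than hard-coded.

```python
# One-time fee per access key / secret key pair, from the price list.
KEY_FEE_USD = 50.0


def monthly_charge(usage_tb, rate_per_tb_year):
    """Charge for one month, assuming 1/12 of the yearly rate per terabyte used."""
    return usage_tb * rate_per_tb_year / 12.0


def first_year_cost(usage_tb, rate_per_tb_year, key_pairs=1):
    """One-time key fees plus twelve months at constant usage."""
    return key_pairs * KEY_FEE_USD + 12 * monthly_charge(usage_tb, rate_per_tb_year)
```

Because billing is based on usage, a project whose stored volume changes from month to month would sum monthly_charge over each month's actual usage instead of using the constant-usage estimate.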

Requesting Access

To request access to Pathfinder and receive an access key/secret key pair, email arcc-help@uwyo.edu with the subject “Pathfinder access request”.

...