Named after one of Wyoming’s reservoirs on the North Platte River, Pathfinder is a low-cost storage solution that enables a Cloud-like presence for research data hosted by ARCC. The system is built to be expandable and provides data protection. Its core functionality is hosting onsite backups as well as enabling data sharing and collaboration.
Pathfinder uses the Simple Storage Service (S3) protocol, originally developed by Amazon, which they define as “storage for the Internet”. S3 works on object storage, provided here by Ceph, an open-source storage platform supported by Red Hat.
ARCC's S3 presences, like Pathfinder, do not function like Windows or traditional storage systems. Below is a list of a few unique characteristics of S3.
S3 has two primary entities called buckets and objects.
Buckets are the access points and objects are stored inside them.
Bucket names have to be globally unique irrespective of which region they are created in.
As buckets can be accessed using URLs, it is recommended that bucket names follow DNS naming conventions: all letters should be lowercase and names should not contain special characters.
Objects are directories or files.
Suppose you upload images and want to differentiate them from other files: you can create a folder for them and store them there, so that the logical address of each file has the prefix ‘pictures’.
For example, the prefix differentiates pictures/hello.jpg from images/hello.jpg.
'Users' are replaced with Access Keys and 'passwords' are replaced with Secret Keys.
Access keys consist of two parts: an access key ID and a secret access key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY).
Like a user name and password, you must use both the access key ID and secret access key together to authenticate your access to your buckets.
Manage your access keys as securely as you do your user name and password.
However, access keys are associated with a project, lab, or department and cannot be associated with a specific UWYO user.
Permissions are functionally limited and are only supported for basic usage.
For example, granular access to a single folder or directory, which is possible in a traditional storage system, is not well supported in S3.
Typically, multiple users share a single access key to the same buckets, with nothing to distinguish between them.
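The bucket naming and prefix conventions described above can be sketched in a few lines of Python. This is only an illustration, not an ARCC tool: the function names `is_dns_style_bucket_name` and `keys_with_prefix` are hypothetical, and the 3 to 63 character length limit is the rule AWS documents for S3 bucket names, assumed here to apply to Pathfinder as well. The check is deliberately conservative and allows only lowercase letters, digits, and hyphens.

```python
import re

# Lowercase letters, digits, and hyphens only; the name must start and end
# with a letter or digit. The 3-63 character limit is the AWS S3 rule and is
# an assumption for Pathfinder.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_dns_style_bucket_name(name: str) -> bool:
    """Return True if `name` follows the conservative DNS-style convention."""
    return _BUCKET_NAME.fullmatch(name) is not None


def keys_with_prefix(keys, prefix):
    """Return the object keys stored under `prefix` (a pseudo-directory)."""
    return [key for key in keys if key.startswith(prefix)]


if __name__ == "__main__":
    # A lowercase, hyphenated name passes; uppercase and underscores do not.
    print(is_dns_style_bucket_name("lab-results-2024"))
    print(is_dns_style_bucket_name("Lab_Results"))

    # The same file name can exist under two different prefixes.
    keys = ["pictures/hello.jpg", "images/hello.jpg", "notes.txt"]
    print(keys_with_prefix(keys, "pictures/"))
```

Because a prefix is just the leading part of an object's key, "directories" in S3 are a naming convention rather than separate containers, which is one reason per-folder permissions are hard to express.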
The Pathfinder S3-Ceph storage architecture is designed and hosted by ARCC to serve two primary purposes:
The system will act as an onsite backup target for other ARCC services such as the petaLibrary or Teton.
The system will act as a publicly accessible data transfer platform via the S3 protocol.
Users will be able to host their own 'bucket' to share data.
Users can also obtain data from external collaborators.
The system also serves a wide variety of supplementary functions:
Programmatically access data hosted within S3. Code running on Teton can pull data from and push data to S3.
Onsite backup of important user-defined data, for users who run their own storage system but still need backups.
Timed/temporary download links - S3 allows users to make data available publicly (with a tokenized link) that will expire after a specified time frame, for example to make a file temporarily available to external users.
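The timed download links mentioned above correspond to what S3 clients call presigned URLs. The following is a minimal sketch using boto3's `generate_presigned_url`, not ARCC's documented procedure: the helper names `expiry_seconds` and `make_timed_link` are hypothetical, and the endpoint URL, bucket, and key are placeholders that must be replaced with values supplied by ARCC.

```python
from datetime import timedelta


def expiry_seconds(lifetime: timedelta) -> int:
    """Convert a link lifetime into the whole seconds that boto3 expects."""
    seconds = int(lifetime.total_seconds())
    if seconds <= 0:
        raise ValueError("link lifetime must be positive")
    return seconds


def make_timed_link(endpoint_url, access_key, secret_key, bucket, key, lifetime):
    """Return a presigned GET URL for one object (requires boto3)."""
    import boto3  # imported lazily so expiry_seconds works without boto3

    client = boto3.client(
        "s3",
        endpoint_url=endpoint_url,  # placeholder: the Pathfinder endpoint from ARCC
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    # The URL is signed locally; anyone holding it can download the object
    # until the lifetime runs out.
    return client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expiry_seconds(lifetime),
    )
```

A call such as `make_timed_link(endpoint, access_key, secret_key, "my-bucket", "pictures/hello.jpg", timedelta(hours=24))` would yield a link that stops working after one day. The bucket and object names here are illustrative only.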
Data Transfer
Host data that end users can download directly (publicly) or with credentials.
User-based Backups
Back data up to Pathfinder as a second (or third) copy of your critical research, using a wide variety of open-source tools.
This space is a stand-alone entity, and will not be mounted directly on other ARCC resources.
This system is *NOT* backed up. Data that reside on this system should be available in other location(s). This system is intended as a secondary backup and a temporary repository for data transfers ONLY.
The S3 protocol requires a client to connect to the server. There are a variety of Graphical User Interface (GUI) and Command Line Interface (CLI) clients that can be used to connect to Pathfinder. With so many S3 clients available, not all have been tested by ARCC, but the few that we have tested are detailed in the table below.
| Client Name | Operating System | GUI or CLI | Free? | ARCC recommended/supported |
|---|---|---|---|---|
| | Windows, macOS | GUI | Yes, but larger transfers will require a license | Yes |
| | Windows, macOS | GUI | Yes | Best Effort |
| | macOS | GUI | No | Best Effort |
| | Windows, macOS, Linux | GUI | Yes | No |
| | Windows, macOS, Linux | CLI | Yes | Yes |
| | macOS, Linux | CLI | Yes | Best Effort |
Instructions for using Pathfinder with MSP360 Explorer (Cloudberry)
Instructions for using Pathfinder with rclone
Some programming languages provide software packages that can use the S3 protocol for accessing data. ARCC has tried a few of these, which are detailed in the table below.
| Package Name | Language | ARCC Tested |
|---|---|---|
| boto3 | Python | Yes |
| aws.s3 | R | Yes |
| AWS | C# | No |
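As an illustration of programmatic access with the boto3 package listed above, the sketch below pushes a file to a bucket and pulls it back. It is a hedged example rather than ARCC's recommended setup: the environment variable names `PATHFINDER_ACCESS_KEY` and `PATHFINDER_SECRET_KEY`, the helper names, and the endpoint URL argument are all assumptions. Reading the keys from the environment keeps them out of source code, in line with the advice to manage access keys as carefully as a password.

```python
import os


def load_pathfinder_credentials(environ=os.environ):
    """Read the key pair from environment variables instead of hard-coding it.

    The variable names are hypothetical; use whatever your project agrees on.
    """
    try:
        return {
            "aws_access_key_id": environ["PATHFINDER_ACCESS_KEY"],
            "aws_secret_access_key": environ["PATHFINDER_SECRET_KEY"],
        }
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None


def push_and_pull(endpoint_url, bucket, local_path, key, download_path):
    """Upload one file to a bucket, then download it again (requires boto3)."""
    import boto3  # imported lazily so the credential helper works without boto3

    client = boto3.client(
        "s3",
        endpoint_url=endpoint_url,  # placeholder: the Pathfinder endpoint from ARCC
        **load_pathfinder_credentials(),
    )
    client.upload_file(local_path, bucket, key)       # push data to S3
    client.download_file(bucket, key, download_path)  # pull data from S3
```

The same pattern applies to code running on a cluster: export the two variables in the job environment and call `push_and_pull` with the endpoint and bucket assigned to your project.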
Price Structure for S3
This price structure is based on actual hardware costs and does not include personnel or infrastructure (network/datacenter) costs. Those have been subsidized by ARCC and the Office of Research and Economic Development.
One-time fee of $50 per Access Key/Secret Key pair
$45 per terabyte per year, billed monthly based on usage
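As a worked example of these two charges, the sketch below estimates a monthly bill and a first-year total. It rests on assumptions not stated in the price list: that usage stays constant over the year, that the annual rate is split evenly across twelve monthly bills ($45 / 12 = $3.75 per terabyte per month), and that fractional terabytes are charged proportionally. The function names are hypothetical.

```python
from decimal import Decimal

KEY_FEE = Decimal("50")           # one-time fee per key pair
RATE_PER_TB_YEAR = Decimal("45")  # storage rate per terabyte per year


def monthly_charge(terabytes_used):
    """Storage charge for one month, assuming an even 1/12 split of the annual rate."""
    monthly_rate = RATE_PER_TB_YEAR / 12  # 3.75 per terabyte per month
    return (monthly_rate * Decimal(str(terabytes_used))).quantize(Decimal("0.01"))


def first_year_cost(terabytes_used, key_pairs=1):
    """One-time key fee plus twelve monthly charges at constant usage."""
    return KEY_FEE * key_pairs + monthly_charge(terabytes_used) * 12


if __name__ == "__main__":
    print(monthly_charge(10))   # 10 TB stored for one month
    print(first_year_cost(10))  # one key pair plus 10 TB for a year
```

Under these assumptions, 10 TB costs $37.50 per month, and the first year with one key pair comes to $50 + $450 = $500.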
To request access to Pathfinder and receive an Access Key/Secret Key pair, email arcc-help@uwyo.edu with the subject “Pathfinder access request”.