Pathfinder

Overview

Named after one of Wyoming’s reservoirs on the North Platte River, Pathfinder is a low-cost, expandable storage solution that enables a cloud-like presence for research data hosted by ARCC. Its core functions are hosting on-site backups and enabling data sharing and collaboration. If you want to make data, reports, and similar material available to collaborators outside UWyo, Pathfinder is a tool you can use.

Pathfinder uses the Simple Storage Service (S3) protocol originally developed by Amazon. On Pathfinder, S3 runs on top of object storage provided by Ceph, a storage service supplied through Red Hat Enterprise Linux.
Pathfinder does not handle files the way Windows or traditional storage systems do. Instead of folders and files, Pathfinder works with buckets and objects. Instead of a file path, Pathfinder buckets are accessed via a URL, and instead of usernames and passwords, Pathfinder uses access keys and secret keys. These characteristics make Pathfinder well suited to providing a cloud-like data repository.
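
As a rough illustration of the bucket/object and access key/secret key model described above, the following Python sketch uses boto3 to connect to an S3-compatible endpoint and list the objects in a bucket. The endpoint URL, key values, region name, and bucket name are hypothetical placeholders, not actual Pathfinder settings; substitute the values you receive from ARCC. It is a minimal sketch under those assumptions, not an ARCC-provided script.

```python
# Hypothetical placeholders: replace with the values issued to your project.
ENDPOINT_URL = "https://pathfinder.example.edu"  # assumed URL, not the real endpoint
ACCESS_KEY = "YOUR_ACCESS_KEY"
SECRET_KEY = "YOUR_SECRET_KEY"


def make_client(endpoint_url, access_key, secret_key):
    """Build a boto3 S3 client that talks to a non-AWS, S3-compatible endpoint."""
    import boto3  # third-party package: pip install boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name="us-east-1",  # assumption: placeholder region used for request signing
    )


def list_keys(client, bucket, prefix=""):
    """Return the keys of all objects in `bucket` whose key starts with `prefix`."""
    keys = []
    # The paginator follows continuation tokens, so large buckets are fully listed.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Example usage (requires network access and valid keys):
#   s3 = make_client(ENDPOINT_URL, ACCESS_KEY, SECRET_KEY)
#   for key in list_keys(s3, "my-project-bucket", prefix="reports/"):
#       print(key)
```

Note that a "folder" such as reports/ is only a key prefix here: objects are addressed by bucket name plus key rather than by a file path.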

Use Cases

  • On-premises, ARCC-managed backup target for other ARCC services such as Alcova or Teton. Note that Pathfinder itself is not backed up.

  • Web-enabled S3 buckets for data storage, data transfer, etc.

    • Time-based, token-authenticated links for file sharing. The S3 protocol allows users to make data publicly available through a tokenized link that expires after a specified time frame.

  • Publication of research data in partnership with UW Libraries.
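
The expiring, tokenized links mentioned in the use cases above correspond to what S3 tooling calls presigned URLs. The sketch below shows one way to generate such a link with boto3's `generate_presigned_url` method. The helper name `make_share_link`, the bucket and key names, and the one-hour default are illustrative assumptions; `client` is expected to be a boto3 S3 client already configured with your Pathfinder endpoint and keys.

```python
def make_share_link(client, bucket, key, expires_seconds=3600):
    """Return a time-limited download URL for a single object.

    `client` is a boto3 S3 client (or any object exposing the same
    generate_presigned_url method). The link stops working once
    `expires_seconds` have elapsed.
    """
    if expires_seconds <= 0:
        raise ValueError("expires_seconds must be positive")
    return client.generate_presigned_url(
        "get_object",                          # operation the link authorizes
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_seconds,             # lifetime of the link in seconds
    )


# Example usage (hypothetical bucket and key names):
#   url = make_share_link(s3, "my-project-bucket", "reports/summary.pdf", 24 * 3600)
#   print(url)  # send this URL to an external collaborator
```

The URL is signed locally from the secret key, so anyone holding the link can download that one object until it expires, without needing keys of their own.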

S3 Clients

The S3 protocol requires a client to connect to the server. There are a variety of Graphical User Interface (GUI) and Command Line Interface (CLI) clients that can be used to connect to Pathfinder. ARCC has tested a subset of these clients:

| Client Name | Operating System | GUI or CLI | Free? | ARCC recommended/supported |
| --- | --- | --- | --- | --- |
| MSP360 Explorer (Cloudberry) | Windows, macOS | GUI | Yes, but larger transfers will require a license | Yes |
| Cyberduck | Windows, macOS | GUI | Yes | Best Effort |
| Transmit | macOS | GUI | No | Best Effort |
| Dragon Disk | Windows, macOS, Linux | GUI | Yes | No |
| rclone | Windows, macOS, Linux | CLI | Yes | Yes |
| s3cmd | macOS, Linux | CLI | Yes | Best Effort |

Instructions for using Pathfinder with MSP360 Explorer (Cloudberry)

Instructions for using Pathfinder with Rclone

Scripting/Programming Packages

Some programming languages provide software packages that can use the S3 protocol to access data. ARCC has tried a few of these; they are detailed in the table below.

| Package Name | Language | ARCC Tested |
| --- | --- | --- |
| boto3 | Python | Yes |
| aws.s3 | R | Yes |
| AWS | C# | No |
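
As an illustration of scripted access with one of the packages listed above, the following sketch uses boto3's `upload_file` method to copy a local directory tree into a bucket, for example as a simple backup. The helper names `object_key_for` and `upload_tree`, the bucket name, and the prefix are hypothetical; `client` is assumed to be a boto3 S3 client already configured for Pathfinder.

```python
from pathlib import Path


def object_key_for(local_path, root, prefix=""):
    """Map a local file to an object key: prefix + path relative to root, using '/'."""
    relative = Path(local_path).relative_to(root).as_posix()
    prefix = prefix.strip("/")
    return f"{prefix}/{relative}" if prefix else relative


def upload_tree(client, root, bucket, prefix=""):
    """Upload every regular file under `root` to `bucket`; return the keys written."""
    root = Path(root)
    keys = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = object_key_for(path, root, prefix)
        client.upload_file(str(path), bucket, key)  # boto3 handles multipart for large files
        keys.append(key)
    return keys


# Example usage (hypothetical names):
#   upload_tree(s3, "/project/results", "my-project-bucket", prefix="backups/2024")
```

Because object storage has no real directories, the local folder structure is preserved only through the "/" separators embedded in each object key.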

Cost

Charges for using Pathfinder are based on actual hardware costs and do not include personnel or infrastructure (network/datacenter) costs. Those costs are subsidized by ARCC and the Office of Research and Economic Development. For details, see https://arccwiki.atlassian.net/wiki/spaces/AIP/pages/1627815937

Requesting Access

To request access to Pathfinder and receive an access key/secret key pair, please fill out the "Request a new research project" form.

Data sharing

For a few methods of sharing data from Pathfinder, see: Sharing data from Pathfinder via expiring URLs.

Maintenance Policy

To apply critical security updates and new feature releases, Pathfinder may undergo ad-hoc maintenance. Most often, maintenance occurs in the background with no user impact. If a major upgrade or other significant maintenance is planned, users will be notified by email one week in advance. In general, Pathfinder adheres to the upstream Ceph project upgrade schedule, with a maximum of 24 months between major upgrades.