Pathfinder
Overview
Named after one of Wyoming’s reservoirs on the North Platte River, Pathfinder is a low-cost, expandable storage solution that enables a cloud-like presence for research data hosted by ARCC. Its core functions are hosting onsite backups and enabling data sharing and collaboration. If you want to make data, reports, and similar material available to collaborators outside UWyo, Pathfinder is a tool you can use.
Pathfinder uses the Simple Storage Service (S3) protocol originally developed by Amazon. Pathfinder's S3 interface is served by Ceph, an object storage service provided by Red Hat Enterprise Linux.
Pathfinder does not handle files the way Windows or traditional storage systems do. It treats folders and files as buckets and objects. Instead of a file path, Pathfinder buckets are accessed via a URL, and instead of usernames and passwords, Pathfinder uses access keys and secret keys. These characteristics make Pathfinder well suited to providing a cloud-like data repository.
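As a rough illustration of the bucket/object model described above, the sketch below maps a file-style path onto a bucket name and an object key, then opens a connection with an access key and secret key using boto3. The endpoint URL, environment variable names, bucket name, and helper names are placeholders invented for this example, not ARCC-published values; substitute the endpoint and keys ARCC provides.

```python
import os

# Hypothetical endpoint, for illustration only; use the URL supplied by ARCC.
ENDPOINT_URL = "https://pathfinder.example.edu"


def path_to_bucket_and_key(path):
    """Map a file-style path 'bucket/dir/file.txt' to (bucket, key).

    The first path component plays the role of the bucket; everything
    after it is the object key. Objects have no real directories, so the
    slashes in a key are just part of its name.
    """
    bucket, _, key = path.strip("/").partition("/")
    if not bucket:
        raise ValueError("path must start with a bucket name")
    return bucket, key


def make_client(access_key, secret_key, endpoint_url=ENDPOINT_URL):
    """Create a boto3 S3 client pointed at an S3-compatible endpoint."""
    import boto3  # third-party package: pip install boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def example_download():
    """Download one object; the environment variable names are assumptions."""
    s3 = make_client(
        os.environ["PATHFINDER_ACCESS_KEY"],
        os.environ["PATHFINDER_SECRET_KEY"],
    )
    bucket, key = path_to_bucket_and_key("my-project/results/run1.csv")
    s3.download_file(bucket, key, "run1.csv")
```

Reading the keys from the environment rather than hard-coding them keeps the secret key out of scripts that may later be shared.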
Use Cases
On-premises, ARCC-managed backup target for other ARCC services such as Alcova or Teton. Note that Pathfinder itself is not backed up.
Web-enabled S3 buckets for data storage, data transfer, etc.
Time-based, token-authenticated links for file sharing. The S3 protocol allows users to make data publicly available through a tokenized link that expires after a specified time frame.
Publication of research data in partnership with UW Libraries.
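The expiring links mentioned in the list above correspond to what S3 clients usually call presigned URLs. The following is a minimal sketch using boto3's `generate_presigned_url`; the function name `expiring_link` and the default 24-hour lifetime are choices made for this example, and `s3_client` is assumed to be a boto3 S3 client already configured with your Pathfinder endpoint and keys.

```python
from datetime import timedelta


def expiring_link(s3_client, bucket, key, valid_for=timedelta(hours=24)):
    """Return a download URL for one object that stops working after `valid_for`.

    `s3_client` is a boto3 S3 client, e.g. one created with
    boto3.client("s3", endpoint_url=..., aws_access_key_id=...,
    aws_secret_access_key=...). The URL is signed locally with the
    client's secret key; anyone holding the link can fetch the object
    until it expires.
    """
    seconds = int(valid_for.total_seconds())
    if seconds <= 0:
        raise ValueError("valid_for must be a positive duration")
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=seconds,
    )
```

A call such as `expiring_link(s3, "my-project", "report.pdf", timedelta(days=2))` would produce a link to send to a collaborator; the bucket and object names here are hypothetical.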
S3 Clients
The S3 protocol requires a client to connect to the server. There are a variety of Graphical User Interface (GUI) and Command Line Interface (CLI) clients that can be used to connect to Pathfinder. ARCC has tested a subset of these clients:
| Client Name | Operating System | GUI or CLI | Free? | ARCC recommended/supported |
|---|---|---|---|---|
| | Windows, macOS | GUI | Yes, but larger transfers will require a license | Yes |
| | Windows, macOS | GUI | Yes | Best Effort |
| | macOS | GUI | No | Best Effort |
| | Windows, macOS, Linux | GUI | Yes | No |
| | Windows, macOS, Linux | CLI | Yes | Yes |
| | macOS, Linux | CLI | Yes | Best Effort |
Instructions for using Pathfinder with MSP360 Explorer (Cloudberry)
Instructions for using Pathfinder with Rclone
Scripting/Programming Packages
Some programming languages provide software packages that can use the S3 protocol for accessing data. ARCC has tried a few of these; they are detailed in the table below.
Package Name | Language | ARCC Tested |
---|---|---|
boto3 | Python | Yes |
aws.s3 | R | Yes |
AWS | C# | No |
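As one possible example of scripted access with boto3, the sketch below lists the objects stored under a folder-like prefix in a bucket. The function name `list_folder` is illustrative, and `s3_client` is assumed to be a boto3 S3 client already configured with your Pathfinder endpoint, access key, and secret key.

```python
def list_folder(s3_client, bucket, prefix=""):
    """Yield (key, size_in_bytes) for every object whose key starts with `prefix`.

    A paginator is used because a single list request returns a limited
    number of objects; the paginator keeps requesting pages until the
    listing is complete.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no "Contents" entry when nothing matches the prefix.
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]
```

For instance, `for key, size in list_folder(s3, "my-project", "results/"): print(key, size)` would print each object under the hypothetical `results/` prefix along with its size.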
Cost
The charges for using Pathfinder are based on actual hardware costs and do not include personnel or infrastructure (network/datacenter) costs. Those costs are subsidized by ARCC and the Office of Research and Economic Development. See here for details: 2.2 Cost of resources and services
Requesting Access
To request access to Pathfinder and receive an access key/secret key pair, please fill out the request a new research project form.
Data sharing
For a few methods of sharing data from Pathfinder, please see: Sharing data from Pathfinder via expiring URLs.
Maintenance Policy
To apply critical security updates and new feature releases, Pathfinder may undergo ad hoc maintenance. Most often, maintenance occurs in the background with no user impact. If a major upgrade or other significant maintenance is planned, users will be notified one week in advance via email. In general, Pathfinder adheres to the upstream Ceph project upgrade schedule, with a maximum of 24 months between major upgrades.