Advanced Usage
Overview
Pathfinder uses Ceph internally to provide a generic S3-compliant storage endpoint. Ceph’s S3 implementation has many user-configurable settings that can be adjusted from the CLI. This document gives an overview of some less common operations on S3 buckets.
Editing Cross-Origin Resource Sharing (CORS) Restrictions
Cross-Origin Resource Sharing (CORS) is a mechanism that allows a server to indicate other origins from which a browser may permit loading resources. Using S3 with CORS may offer better performance because requests can be made directly from the user's browser to a public-facing S3 bucket. By default, all CORS attributes are restricted, as they are a potential security hazard. These attributes can be edited on a per-bucket basis using the CLI tool s3cmd.
Perform the following steps on your local workstation.
Installing s3cmd
Make sure you have s3cmd installed. If you are using Ubuntu or a RHEL-based Linux distro, you can install it via your native package manager. If you are on Windows or Mac, you can install it with pip.
# Ubuntu
$ sudo apt install -y s3cmd
# RHEL-based
$ sudo dnf install -y s3cmd
# Windows/Mac, you may have to install pip first.
# Refer to this page: https://pip.pypa.io/en/stable/installation/
$ pip install --user s3cmd
Setup
Once s3cmd is installed, we need to set a few initial parameters to support Pathfinder. Run the command below to configure s3cmd, entering your access key and secret key when prompted. Make sure to override the defaults for “S3 Endpoint” and “DNS-style bucket+hostname:port template” so that they match the values shown in the transcript below.
$ s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: <YOUR_ACCESS_KEY>
Secret Key: <YOUR_SECRET_KEY>
Default Region [US]: US
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: pathfinder.arcc.uwyo.edu
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: %(bucket)s.pathfinder.arcc.uwyo.edu
Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: Yes
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:
New settings:
Access Key: <YOUR_ACCESS_KEY>
Secret Key: <YOUR_SECRET_KEY>
Default Region: US
S3 Endpoint: pathfinder.arcc.uwyo.edu
DNS-style bucket+hostname:port template for accessing a bucket: %(bucket)s.pathfinder.arcc.uwyo.edu
Encryption password:
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] n
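As a quick sanity check after the configuration wizard has saved its settings, the stored endpoint values can be inspected directly. This is a minimal sketch, assuming s3cmd wrote its configuration to the default location ~/.s3cfg (the path may differ on Windows) and that the endpoint settings are stored under the keys host_base, host_bucket, and use_https. The helper name show_s3cfg_endpoints is hypothetical.

```shell
# Hypothetical helper: print the endpoint-related settings from an s3cmd config file.
# Assumes the default config path ~/.s3cfg unless another path is passed as $1.
show_s3cfg_endpoints() {
    grep -E '^(host_base|host_bucket|use_https)[[:space:]]*=' "${1:-$HOME/.s3cfg}"
}

# Usage:
# show_s3cfg_endpoints
```

If the configuration was applied as described, host_base and host_bucket should reference pathfinder.arcc.uwyo.edu rather than s3.amazonaws.com.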
Testing s3cmd
Once s3cmd is configured, we can try listing your buckets to make sure everything works correctly.
$ s3cmd ls
2023-11-10 14:39 s3://<BUCKET_1>
If you see your corresponding buckets, s3cmd is configured correctly.
Setting CORS Attributes
Once s3cmd is installed and configured, we can proceed with editing the CORS attributes on your bucket.
Create an XML file containing your CORS rules. For example, save it as cors.xml.
More details about the XML format for CORS attributes are available here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/ManageCorsUsing.html
Note: Pathfinder only supports using XML, not the newer JSON format.
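As an illustration of the file format, the sketch below writes a minimal cors.xml. The rule contents are assumptions chosen for the example: the origin https://www.example.com is a placeholder, and the allowed methods, headers, and max-age value should be adapted to your application. The element names follow the S3 CORS XML schema described at the link above.

```shell
# Write an example CORS configuration to cors.xml in the current directory.
# https://www.example.com is a placeholder origin; replace it with your site's origin.
cat > cors.xml <<'EOF'
<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>https://www.example.com</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
  </CORSRule>
</CORSConfiguration>
EOF
```

This example rule would let pages served from the placeholder origin issue GET and HEAD requests against the bucket; a narrower AllowedHeader list is generally the safer choice when the required headers are known.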
Apply the newly created XML CORS definition to your bucket using the setcors sub-command. Note that setcors overwrites any CORS rules previously defined on the bucket.
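A minimal sketch of this step, assuming the rules were saved as cors.xml in the current directory. The wrapper name apply_cors is hypothetical, and the bucket in the usage comment is a placeholder for one of the buckets listed by s3cmd ls.

```shell
# Hypothetical wrapper around the setcors sub-command.
# $1: path to the CORS XML file
# $2: bucket URI, for example s3://my-bucket
apply_cors() {
    s3cmd setcors "$1" "$2"
}

# Usage (replace the placeholder bucket name with your own):
# apply_cors cors.xml "s3://<BUCKET_1>"
```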
Removing CORS Attributes
To remove all CORS attributes from a bucket, use the delcors sub-command.
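A minimal sketch of the removal step. The wrapper name remove_cors is hypothetical, and the bucket in the usage comment is a placeholder.

```shell
# Hypothetical wrapper around the delcors sub-command.
# $1: bucket URI, for example s3://my-bucket
remove_cors() {
    s3cmd delcors "$1"
}

# Usage (replace the placeholder bucket name with your own):
# remove_cors "s3://<BUCKET_1>"
```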
Caveats
Pathfinder only supports the XML format of CORS attributes. The newer JSON format is not yet supported.
Setting CORS attributes will overwrite any existing attributes.
There is currently no way to query existing CORS attributes from a given bucket.
Setting attributes must be done on a per-bucket basis; there is no way to set CORS rules across all buckets owned by a particular user.
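Because existing rules cannot be listed, one indirect way to confirm that a rule is active is to send a request carrying an Origin header and look at the response headers. This is only a sketch under several assumptions: the object is publicly readable, the bucket is reachable at a URL of the form https://<bucket>.pathfinder.arcc.uwyo.edu/<object>, and requests matching a rule are answered with Access-Control-* headers. The helper name check_cors is hypothetical.

```shell
# Hypothetical helper: request an object with an Origin header and print only
# the CORS-related response headers, if the server returned any.
# $1: origin to test, for example https://www.example.com
# $2: full URL of an object in the bucket
check_cors() {
    curl -sS -o /dev/null -D - -H "Origin: $1" "$2" | grep -i '^access-control-'
}

# Usage (placeholder origin and URL):
# check_cors "https://www.example.com" "https://<bucket>.pathfinder.arcc.uwyo.edu/<object>"
```

If nothing is printed, either no rule matched the tested origin or the request did not reach the object, so an empty result is not proof that the bucket has no CORS rules.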