Rclone

Overview

Rclone is a command-line program for transferring and syncing files between a number of storage services. It is a feature-rich alternative to cloud vendors' web storage interfaces: over 40 cloud storage products support rclone, including S3 object stores, business and consumer file storage services, and standard transfer protocols. Rclone provides powerful cloud equivalents of the unix commands rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat, and its familiar syntax includes shell pipeline support and --dry-run protection. Storage that can be managed using rclone includes:

  • All UW ARCC Hosted Storage (Beartooth/Teton-Creek, Alcova, Pathfinder)

  • Cloud based storage (Google Drive, Dropbox, OneDrive, Sharepoint)

  • Your local computer (or any host on which it is installed)  

Rclone can be used at the command line, in scripts, or via its API. This page describes rclone and includes instructions for using it with Pathfinder.
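For instance, rclone's command syntax mirrors the unix tools it is named after. A minimal sketch, assuming a remote named "remote" and a bucket named "bucket" have already been configured:

rclone ls remote:bucket                           # list the files in a bucket, like unix ls
rclone copy --dry-run ~/data remote:bucket/data   # preview a copy without transferring anything

Dropping --dry-run from the second command performs the actual transfer.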

Features

  • MD5/SHA-1 hashes checked at all times for file integrity

  • Timestamps preserved on files

  • Partial syncs supported on a whole file basis

  • Copy mode to just copy new/changed files

  • Sync (one way) mode to make a directory identical

  • Check mode to check for file hash equality

  • Can sync to and from network, e.g. two different cloud accounts

  • Optional large file chunking (Chunker)

  • Optional encryption (Crypt)

  • Optional cache (Cache)

  • Optional FUSE mount (rclone mount)

  • Multi-threaded downloads to local disk

  • Can serve local or remote files over HTTP/WebDAV/FTP/SFTP/DLNA

What does rclone do?

Rclone can help you:

  • Backup (and encrypt) files to cloud storage

  • Restore (and decrypt) files from cloud storage

  • Mirror cloud data to other cloud services or locally

  • Migrate data to cloud, or between cloud storage vendors

  • Mount multiple, encrypted, cached or diverse cloud storage as a disk

  • Analyse and account for data held on cloud storage using lsf, lsjson, size, and ncdu

  • Union file systems together to present multiple local and/or cloud file systems as one


How to Use rclone with Pathfinder

Since rclone is intended to be used with cloud technologies, any server that can use cloud protocols can use rclone to transfer data. ARCC's on-premises, cloud-like storage, Pathfinder, uses the S3 protocol developed by Amazon Web Services (AWS). This enables researchers to store and share data with all of the capabilities of cloud storage without the costs of a third-party vendor.

The following are step-by-step instructions for using rclone with Pathfinder.

Before going through these instructions, please make sure you have access to Pathfinder (or any other cloud service you wish to use rclone with) and already have your access key/secret key credentials.

Step 1. Make sure rclone is installed

On a laptop or desktop: Before using Pathfinder from your own workstation, download rclone from the rclone website and make sure it is installed properly. The rclone website also has much more information on using rclone, including command references and installation instructions.

On Beartooth: Use the module spider rclone command to discover which version or versions of rclone are installed on Beartooth (module avail also lists the modules available to you). Once you have identified a version, use the module load rclone command to enable rclone for your current Beartooth session, as shown in the sketch below.
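A minimal sketch of that Beartooth session setup (check the module spider output for the exact version string to load):

module spider rclone      # list the rclone versions installed on Beartooth
module load rclone        # load the default rclone module for this session
module list               # confirm the rclone module is now loaded
rclone version            # verify rclone runs and report its version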

Step 2. Setting up the rclone .conf file

Before using rclone you must set up a configuration file that describes the remote server you want to transfer data to. Since Pathfinder uses the S3 protocol, the following examples all use S3, but it's important to know that there are more options available.

There are two different ways of setting up your rclone configuration file:

Create the .conf file manually

At the command prompt, do the following:

cd ~/.config
mkdir rclone
cd rclone
vim rclone-test.conf

(Note that vim is only one way to perform the last step; use whichever editor you prefer.)

Then enter options similar to these:

[pf-mybucket]
type = s3
provider = Ceph
env_auth = false
access_key_id = <enter your access key here>
secret_access_key = <enter your secret key here>
endpoint = pathfinder.arcc.uwyo.edu
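Because rclone-test.conf is not rclone's default configuration file name (rclone.conf), you will need to point rclone at it with the --config option when running commands. A hedged example, listing the buckets visible to your keys:

rclone --config ~/.config/rclone/rclone-test.conf lsd pf-mybucket:

If you name the file ~/.config/rclone/rclone.conf instead, rclone will find it automatically and the --config option can be dropped.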

Using the rclone config prompt to create the .conf file

Once rclone is available to use, run the command rclone config to see the options to create the configuration file.

Enter the value you wish to start with. For the purposes of this example, we are going to create a new remote, so the value we will enter is 'n', and we will then give a name to the remote we are going to create. Example below:
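The initial prompt looks roughly like the sketch below (exact menu wording varies between rclone versions); pf-mybucket is just an example name:

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q> n
name> pf-mybucket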

Next, you will be asked for a type of remote connection. Since Pathfinder uses Ceph, which is an S3-compliant storage provider, the option "s3" is what we'll choose, so the value we will enter here is 4 (menu numbers may differ between rclone versions). However, it's important to know that there are many other options.

Once you have selected the "Amazon S3 Compliant Storage Provider" option, you will need to specify that the remote uses Ceph; in this case, option 3.

Next, you will be asked whether rclone should get AWS credentials from the runtime environment. At this point we are going to enter "false" because, if you already have access to Pathfinder, ARCC has provided you with your access key/secret key combination, which we will enter at the next step.

Your access key/secret key combination is unique to you or your research group. It is important to keep these credentials secure to keep your data on Pathfinder protected. Your entries for each value will look something like the faux example below:

AWS Access Key ID.
Leave blank for anonymous access or runtime credentials.
Enter a string value. Press Enter for the default ("").
access_key_id> AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key (password)
Leave blank for anonymous access or runtime credentials.
Enter a string value. Press Enter for the default ("").
secret_access_key> wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

You will then be prompted to enter your region. Here we are going to enter the value of '1' because we are using on-premises storage, so this value doesn't matter.

The next section asks for the endpoint to connect to. In our example, the endpoint is the web address for Pathfinder, pathfinder.arcc.uwyo.edu.

Next will be a location constraint prompt. We are going to leave this blank since we are not using a region; just hit enter/return to move on.

The last step in the configuration file setup is to choose the ACL settings. This part is very important: there are many options for both private and public read/write/delete permissions, so take extra care in choosing the value you enter. This example chooses the "private" option for informational purposes only; we will enter the value '1' and hit enter/return.

Next, choose 'n' to skip the advanced settings, and then 'q' to quit the configuration file setup.

Once that is completed, you can check your configuration file by navigating to the hidden ~/.config/rclone folder and viewing the file.

It should look similar to the manual configuration that we mentioned earlier.
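You can also ask rclone itself where the configuration file lives and what it contains (assuming the file was created by rclone config in the default location):

rclone config file    # print the path of the configuration file in use
rclone config show    # print the contents of the configuration file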

Step 3. Basic usage commands for rclone

The basic syntax is: rclone <function> <source> <destination endpoint>:<bucket>

The basic functions are:

  • copy

  • sync

  • move

  • check

  • mount

  • serve

More information on each function can be found at https://rclone.org/#what. An example of a copy from Teton to Pathfinder would be:
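A hedged example, assuming the remote defined above is named pf-mybucket, the target bucket is called mybucket, and /project/myproject/data is a hypothetical directory on Teton:

rclone copy /project/myproject/data pf-mybucket:mybucket/data

Adding --dry-run to the command first will show what would be transferred without moving any data.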