Rclone
Overview
Rclone is a command-line program to support file transfers and syncing between a number of storage services. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 40 cloud storage products support rclone, including S3 object stores, business and consumer file storage services, and standard transfer protocols. Rclone has powerful cloud equivalents to the unix commands rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat. Rclone's familiar syntax includes shell pipeline support and --dry-run protection. Storage that can be managed using rclone includes:
All UW ARCC Hosted Storage (Beartooth/Teton-Creek, Alcova, Pathfinder)
Cloud based storage (Google Drive, Dropbox, OneDrive, Sharepoint)
Your local computer (or any host on which it is installed)
Rclone can be used at the command line, in scripts, or via its API. This page describes rclone and includes instructions for using Pathfinder with rclone.
Features
MD5/SHA-1 hashes checked at all times for file integrity
Timestamps preserved on files
Partial syncs supported on a whole file basis
Copy mode to just copy new/changed files
Sync (one way) mode to make a directory identical
Check mode to check for file hash equality
Can sync to and from network, e.g. two different cloud accounts
Optional large file chunking (Chunker)
Optional encryption (Crypt)
Optional cache (Cache)
Optional FUSE mount (rclone mount)
Multi-threaded downloads to local disk
Can serve local or remote files over HTTP/WebDAV/FTP/SFTP/DLNA
What does rclone do?
Rclone can help you:
Backup (and encrypt) files to cloud storage
Restore (and decrypt) files from cloud storage
Mirror cloud data to other cloud services or locally
Migrate data to cloud, or between cloud storage vendors
Mount multiple, encrypted, cached or diverse cloud storage as a disk
Analyse and account for data held on cloud storage using lsf, lsjson, size, ncdu
Union file systems together to present multiple local and/or cloud file systems as one
How to Use rclone with Pathfinder
Since rclone is intended to be used with cloud technologies, any server that can use cloud protocols can use rclone to transfer data. ARCC's on-premises cloud-like storage, Pathfinder, uses the S3 protocol developed by Amazon Web Services (AWS). This enables researchers to store and share data with all of the capabilities of cloud storage without the costs of a third-party vendor.
The following are step-by-step instructions for using rclone with Pathfinder.
Before going through these instructions, please make sure you have access to Pathfinder (or any other cloud service you wish to use rclone with) and already have your access key/secret key credentials.
Step 1. Make sure rclone is installed
On a laptop or desktop: Before using Pathfinder with rclone on your own workstation, you will need to download rclone from the rclone website and make sure it is installed properly. The rclone website also has much more information on using rclone, including installation instructions and a full command reference.
On Beartooth: Use the module spider rclone command to discover which version or versions of rclone are installed on Beartooth; you can also list the available rclone modules with the module avail rclone command. Once you have identified a version, use the module load rclone command to enable rclone for your current Beartooth session.
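For example, a typical sequence on Beartooth might look like the following (the exact versions reported by module spider will vary):

module spider rclone    # list the rclone versions installed on Beartooth
module load rclone      # load the default rclone module for this session
rclone version          # confirm rclone is now available on your PATH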
Step 2. Setting up the rclone .config file
Before using rclone you must set up a configuration file that details the information about the remote server you want to transfer data to. Since Pathfinder uses the S3 protocol, our following examples will all use S3, but it’s important to know that there are more options available.
There are two different ways of setting up your rclone configuration file:
Create the .conf file manually
At the command prompt, do the following:
cd ~/.config
mkdir rclone
cd rclone
vim rclone-test.conf
(Note that you can use any text editor you prefer in place of vim for the last step.)
Then enter options that are similar to these:
[pf-mybucket]
type = s3
provider = Ceph
env_auth = false
access_key_id = <enter your access key here>
secret_access_key = <enter your secret key here>
endpoint = pathfinder.arcc.uwyo.edu
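A note on the file name, assuming rclone's default behavior: rclone normally reads its configuration from ~/.config/rclone/rclone.conf, so a file saved under a different name (such as rclone-test.conf above) must be passed explicitly with the --config flag. For example, to list the buckets visible through the pf-mybucket remote defined above:

rclone --config ~/.config/rclone/rclone-test.conf lsd pf-mybucket: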
Using the rclone prompt to create the .conf file
Once rclone is available to use, run the rclone config command to see the options for creating the configuration file.
Enter the value you wish to start with. For the purposes of this example, we are going to create a new remote, so the value we will enter is 'n', and then we will give a name to the remote we are going to create. For example:
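The opening menu looks roughly like the sketch below (the exact wording varies between rclone versions, and pf-mybucket is just an example name):

No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> pf-mybucket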
Next, you will be asked for the type of remote connection. Since Pathfinder uses Ceph, which is an S3-compliant storage provider, the option "s3" is what we'll choose; the value to enter in this example is 4 (menu numbers can differ between rclone versions). However, it's important to know that there are many other options.
Once you have selected the "Amazon S3 Compliant Storage Provider" option, you will need to specify Ceph as the provider, in this case option 3.
Next, you will be asked for AWS credentials. At this point we are going to enter "false", because if you already have access to Pathfinder, ARCC has provided you with your access key/secret key combo, which we will enter at the next step.
Your access key/secret key combo is unique to you or your research group. It is important to keep these secure to keep your data on Pathfinder protected. Your entries for each value will look similar to the faux example below:
AWS Access Key ID.
Leave blank for anonymous access or runtime credentials.
Enter a string value. Press Enter for the default ("").
access_key_id> AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key (password)
Leave blank for anonymous access or runtime credentials.
Enter a string value. Press Enter for the default ("").
secret_access_key> wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
You will then be prompted to enter your region; here we are going to enter the value '1'. Because we are using on-premises storage, this value does not matter.
The next section asks for the endpoint to connect to. In our example the endpoint is the web address for Pathfinder, pathfinder.arcc.uwyo.edu.
Next is a location constraint prompt; we are going to leave this blank since we are also not using a region. Just press Enter/Return to move on.
The last step in the configuration file setup is to choose the ACL settings. This part is very important: there are many options for both private and public read/write/delete permissions, so take extra care in choosing the value you enter. This example chooses the "private" option for informational purposes only; we will enter the value '1' and press Enter/Return.
Next, choose 'n' to skip the advanced settings and then 'q' to quit the configuration file setup.
Once that is completed, you can check your configuration file by navigating to your hidden .config folder and viewing the file.
It should look similar to the manual configuration that we mentioned earlier.
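You can also ask rclone itself to report what it has configured, for example:

rclone listremotes    # list the names of all configured remotes
rclone config show    # print the contents of the current configuration file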
Step 3. Basic usage commands for rclone
The basic syntax is as follows: rclone <function> <source> <destination endpoint>:<bucket>
The basic functions are:
copy
sync
move
check
mount
serve
More information on each function can be found at https://rclone.org/#what. An example of a copy from Teton to Pathfinder would be:
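Assuming a remote named pf-mybucket (as configured above), a destination bucket called mybucket, and source data in /project/myproject/data (these paths and names are placeholders):

rclone copy /project/myproject/data pf-mybucket:mybucket/data

Adding the --dry-run flag first shows what would be transferred without copying anything, and -P (--progress) displays live transfer statistics.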