Transferring data through a web application such as Open OnDemand, or through a client such as Cyberduck, is easy, but these methods are difficult to automate within a compute job. In some use cases, people may want to transfer data, run some computation on that data, transfer the results back, and so on. These types of tasks can be accomplished using the command line interface (CLI) on MedicineBow. There are many CLI options, including the previously discussed Globus CLI, scp, SFTP, and rsync, all of which work on MedicineBow. In this module we focus on rclone because it is ARCC’s recommended command line tool: it works with desktops, HPC, and cloud storage systems, and it can run multi-threaded to facilitate faster transfers.



scp and SFTP CLI Tools and Examples

Before diving into rclone, we’ll briefly cover some of the other command line tools and give examples of how to use them on MedicineBow.

  • scp - Uses SSH (Secure Shell) to authenticate, then securely transfers data. This means the transfer is authenticated as the user initiating the connection.

    • Example of a transfer from a local machine to MedicineBow on Linux/Mac

    • Basic syntax of an scp command: scp <file> <username>@<server>:<destination directory>

    • dylan@fireball:~$ scp transfer.file dperkin6@medicinebow.arcc.uwyo.edu:/project/arcc
      transfer.file                                                                                                                 100%    0     0.0KB/s   00:00    
      dylan@fireball:~$
  • SFTP - Can be used on the CLI as well as through clients. Compared to the SCP protocol, which only allows file transfers, the SFTP protocol allows for a wider range of operations on remote files. SFTP clients provide extra capabilities, including resuming interrupted transfers, directory listings, and remote file removal.

    • SFTP is generally more platform-independent than SCP.

    • Example of interactive use of SFTP

    • dylan@fireball:~$ sftp dperkin6@medicinebow.arcc.uwyo.edu
      Connected to medicinebow.arcc.uwyo.edu.
      sftp>
    • Helpful SFTP commands

      • ? - to access the help, put - to upload a file, mput - to upload multiple files

      • get - to download a file or directory, mget - to download multiple files
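
The interactive commands above can also be run unattended, which is closer to what an automated job needs. The following is a minimal sketch, assuming key-based (non-interactive) authentication is already set up. The file results.file and the DRY_RUN switch are hypothetical names used only for illustration; the -b option is sftp's batch mode.

```shell
#!/bin/sh
# Sketch: non-interactive SFTP using a batch file (sftp -b).
# REMOTE and the file names are illustrative; substitute your own.
REMOTE="dperkin6@medicinebow.arcc.uwyo.edu"
BATCH_FILE="$(mktemp)"

# Each line is a command you would otherwise type at the sftp> prompt.
# results.file is a hypothetical file name.
cat > "$BATCH_FILE" <<'EOF'
cd /project/arcc
put transfer.file
get results.file
EOF

# DRY_RUN=1 (the default in this sketch) only prints the command, so the
# script can be tried without a connection. Set DRY_RUN=0 to run the transfer.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "sftp -b $BATCH_FILE $REMOTE"
else
    sftp -b "$BATCH_FILE" "$REMOTE"
fi
```

Because batch mode cannot prompt for a password, it is normally paired with an SSH key or ssh-agent.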


rsync CLI Tool and Example

rsync (Remote Sync) is another very useful tool with many options. It is one of the most commonly used commands for copying and synchronizing files and directories, both remotely and locally, on Linux/Unix systems. With rsync you can copy and synchronize data across directories, disks, and networks, and perform backups and mirroring between two Linux machines.

  • Basic syntax of an rsync command: rsync <options> <source> <destination>

  • dylan@fireball:~$ rsync transfer.file dperkin6@medicinebow.arcc.uwyo.edu:/project/arcc/dperkin6

Some common options used with rsync commands

-v : verbose
-r : copies data recursively (but doesn’t preserve timestamps and permissions while transferring data)
-a : archive mode, copies files recursively and also preserves symbolic links, file permissions, user & group ownership, and timestamps
-z : compress file data
-h : human-readable, output numbers in a human-readable format
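
To show how these options combine, here is a small sketch that previews a transfer with --dry-run and then performs it. It uses two throwaway local directories (created with mktemp, an assumption made so the example can be tried without a remote). For a real transfer, the destination would be a remote path such as the one in the example above.

```shell
#!/bin/sh
# Sketch: preview an rsync transfer with --dry-run, then run it for real.
# SRC and DST are temporary local directories used only for illustration.
SRC="$(mktemp -d)"
DST="$(mktemp -d)"
echo "example data" > "$SRC/transfer.file"

if command -v rsync >/dev/null 2>&1; then
    # -a archive mode, -v verbose, -h human-readable sizes.
    # --dry-run lists what would be transferred without copying anything.
    rsync -avh --dry-run "$SRC/" "$DST/"
    # The same command without --dry-run performs the copy.
    # The trailing slash on "$SRC/" copies the directory's contents.
    rsync -avh "$SRC/" "$DST/"
else
    echo "rsync is not installed on this machine" >&2
fi

# For a remote destination, replace "$DST/" with user@host:/path, e.g.
# dperkin6@medicinebow.arcc.uwyo.edu:/project/arcc/dperkin6/
# and consider adding -z to compress data in transit.
```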

About rclone

Rclone is a command-line program to manage files on remote storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 70 cloud storage products support rclone including S3 object stores, business & consumer file storage services, as well as standard transfer protocols. Rclone has powerful cloud equivalents to the unix commands rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat. It is used at the command line, in scripts or via its API.

Rclone mounts any local, cloud or virtual filesystem as a disk on Windows, macOS, Linux and FreeBSD, and also serves these over SFTP, HTTP, WebDAV, FTP and DLNA.

Rclone helps you:

  • Backup (and encrypt) files to cloud storage

  • Restore (and decrypt) files from cloud storage

  • Mirror cloud data to other cloud services or locally

  • Migrate data to the cloud, or between cloud storage vendors

  • Mount multiple, encrypted, cached or diverse cloud storage as a disk

  • Union file systems together to present multiple local and/or cloud file systems as one

Download rclone


rclone Configuration

Rclone requires some configuration for each “transfer partner” (remote). This is a long process, but once it is set up the remote can be reused over and over. The examples below configure transfers to/from MedicineBow using authentication with an SSH key:

dylan@fireball:~$ rclone config
2024/07/10 14:32:44 NOTICE: Config file "/home/dylan/.config/rclone/rclone.conf" not found - using defaults
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> medbow

You will then be asked to choose a storage type for the ‘medbow’ remote. In our case we pick number ‘27’, SSH/SFTP Connection (the number may differ between rclone versions, so check the list that is displayed):

Storage> 27
** See help for sftp backend at: https://rclone.org/sftp/ **

SSH host to connect to
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
 1 / Connect to example.com
   \ "example.com"
host> medicinebow.arcc.uwyo.edu
SSH username, leave blank for current username, dylan
Enter a string value. Press Enter for the default ("").
user> dperkin6
SSH port, leave blank to use default (22)
Enter a string value. Press Enter for the default ("").
port> 

Next you will be asked for more login information. Here we accept the defaults by pressing Enter/Return until we reach the ‘key_file’ option, where we enter the location of the SSH key file:

SSH password, leave blank to use ssh-agent.
y) Yes type in my own password
g) Generate random password
n) No leave this optional password blank (default)
y/g/n> 
Raw PEM-encoded private key, If specified, will override key_file parameter.
Enter a string value. Press Enter for the default ("").
key_pem> 
Path to PEM-encoded private key file, leave blank or set key-use-agent to use ssh-agent.

Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.

Enter a string value. Press Enter for the default ("").
key_file> /home/dylan/.ssh/id_rsa

The next options relate to passwords and certificates. For MedicineBow, none of these apply, so we keep pressing Enter until we reach the cipher option, where we enter '1' for false. We also accept the default for disable_hashcheck. The final two steps are declining the advanced configuration (n) and then confirming the remote (y) to save it.

The passphrase to decrypt the PEM-encoded private key file.

Only PEM encrypted key files (old OpenSSH format) are supported. Encrypted keys
in the new OpenSSH format can't be used.
y) Yes type in my own password
g) Generate random password
n) No leave this optional password blank (default)
y/g/n> 
Choose a number from below, or type in your own value
 1 / Use default Cipher list.
   \ "false"
 2 / Enables the use of the aes128-cbc cipher and diffie-hellman-group-exchange-sha256, diffie-hellman-group-exchange-sha1 key exchange.
   \ "true"
use_insecure_cipher> 1
Disable the execution of SSH commands to determine if remote file hashing is available.
Leave blank or set to false to enable hashing (recommended), set to true to disable hashing.
Enter a boolean value (true or false). Press Enter for the default ("false").
disable_hashcheck> 
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
--------------------
[medbow]
host = medicinebow.arcc.uwyo.edu
user = dperkin6
key_file = /home/dylan/.ssh/id_rsa
use_insecure_cipher = false
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
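
The interactive session stores its answers in a plain-text config file. As a rough sketch of what that file contains, the snippet below writes an equivalent remote definition by hand and asks rclone to list it. CONF is a throwaway path chosen for illustration (rclone's default location appears in the NOTICE line of the first transcript), and the `type = sftp` line is how rclone records the storage type selected earlier.

```shell
#!/bin/sh
# Sketch: write the 'medbow' remote definition to a config file by hand.
# CONF is a temporary, illustrative path, not rclone's default location.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
[medbow]
type = sftp
host = medicinebow.arcc.uwyo.edu
user = dperkin6
key_file = /home/dylan/.ssh/id_rsa
use_insecure_cipher = false
EOF

# Point rclone at this file with --config and list the configured remotes.
if command -v rclone >/dev/null 2>&1; then
    rclone --config "$CONF" listremotes
else
    echo "rclone is not installed on this machine" >&2
fi
```

Editing the file directly can be convenient when the same remote has to be set up on several machines, but the interactive `rclone config` walkthrough above remains the safer route for a first setup.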

Using rclone

The basic syntax is: rclone <function> <source> <destination endpoint>:<bucket>. For an SFTP remote such as medbow, the part after the colon is a path rather than a bucket.

The basic functions are:

  • copy - copy files/directories to or from somewhere

  • sync - (one way) make a destination directory identical to the source; files in the destination that are not in the source are deleted

  • move - move files to the destination, deleting the local copy after verification

  • check - compare hashes and look for missing/extra files

  • mount - mount your cloud storage as a network disk

More information on each function can be found at https://rclone.org/#what. An example of a copy from local to MedicineBow would be:

dylan@fireball:~$ rclone copy transfer.file medbow:/project/arcc
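
Building on the copy example, the sketch below strings several functions together into the transfer, compute, transfer-back pattern described at the start of this module. It is only an outline under stated assumptions: the results directory, LOCAL_DIR, and the DRY_RUN switch are hypothetical names, and the run helper simply prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: upload input, retrieve results later, and verify them with rclone.
# "medbow" is the remote configured above; the results directory and
# LOCAL_DIR are hypothetical names used only for illustration.
REMOTE="medbow:/project/arcc"
LOCAL_DIR="${LOCAL_DIR:-.}"

# With DRY_RUN=1 (the default in this sketch) each command is printed
# instead of executed. Set DRY_RUN=0 to run the transfers.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Upload the input file, showing progress.
run rclone copy "$LOCAL_DIR/transfer.file" "$REMOTE" --progress
# 2. After the computation, pull the results back with 8 parallel transfers.
run rclone copy "$REMOTE/results" "$LOCAL_DIR/results" --transfers 8
# 3. Confirm that the downloaded results match the remote copy.
run rclone check "$REMOTE/results" "$LOCAL_DIR/results"
```

rclone also has its own --dry-run flag, which is worth adding to sync and move commands in particular, since those can delete files.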

Finally The End

Back to Previous Module

Data Transfer with Desktop Clients

Back to Home

Intro to Data Transfer

Move on to Next Module

Mapping/Mounting with SMB
