While transferring data through a web application such as Open OnDemand or a client such as Cyberduck is easy, these methods are difficult to automate within a compute job. In some use cases, however, people may want to transfer data, run some computation on that data, transfer the results back, and so on. These types of tasks can be accomplished using the command line interface (CLI) on MedicineBow. There are many CLI options, including the previously discussed Globus CLI, scp, SFTP, and rsync, all of which work on MedicineBow. In this module we detail rclone because it is ARCC’s recommended command line tool, due to its ability to work with desktops, HPC, and cloud storage systems, as well as its support for multi-threaded transfers, which make transfers faster.
scp and SFTP CLI Tools and Examples
Before diving into rclone, we’ll briefly cover some of the other command line tools and give examples of how to use them on MedicineBow.
scp - Uses SSH (Secure Shell) to authenticate and then securely transfers the data. This means the transfer is authenticated as the user who initiates the connection.
Example: copying from a local machine to MedicineBow on Linux/Mac
Basic syntax of an scp command: scp <file> <username>@<server>:<destination directory>
    dylan@fireball:~$ scp transfer.file dperkin6@medicinebow.arcc.uwyo.edu:/project/arcc
    transfer.file                                 100%    0     0.0KB/s   00:00
    dylan@fireball:~$
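The same syntax also works in the other direction and for whole directories. The following is a minimal sketch rather than an official ARCC procedure: the directory name results and the file name output.dat are hypothetical placeholders, and the DRY_RUN switch and run helper are illustrative additions. By default the script only prints the commands, so nothing is transferred until you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: scp in both directions. "results" and "output.dat" are hypothetical names.
REMOTE="dperkin6@medicinebow.arcc.uwyo.edu"   # username and host from the example above
REMOTE_DIR="/project/arcc"                    # destination directory from the example above
DRY_RUN="${DRY_RUN:-1}"                       # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Upload a whole directory; -r tells scp to copy recursively.
upload_dir() {
    run scp -r "$1" "$REMOTE:$REMOTE_DIR"
}

# Download one remote file into the current local directory.
download_file() {
    run scp "$REMOTE:$REMOTE_DIR/$1" .
}

upload_dir results
download_file output.dat
```

Running it with DRY_RUN=0 executes the two scp commands for real, so check the printed commands first.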
SFTP - Can be used from the CLI as well as from graphical clients. Compared to the SCP protocol, which only allows file transfers, the SFTP protocol allows a wider range of operations on remote files. SFTP clients provide extra capabilities, including resuming interrupted transfers, directory listings, and remote file removal.
SFTP is generally more platform-independent than SCP.
Example of interactive use of SFTP
    dylan@fireball:~$ sftp dperkin6@medicinebow.arcc.uwyo.edu
    Connected to medicinebow.arcc.uwyo.edu.
    sftp>
Helpful SFTP commands
? - is how you access the help
put - to upload a file
mput - to upload multiple files
get - to download a file or directory
mget - to download multiple files
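The commands above can also be run without the interactive prompt by placing them in a batch file and passing it to sftp with the -b option. The sketch below assumes that key-based, non-interactive login works for your account, since batch mode cannot prompt for a password. The batch file name and results.tar are hypothetical. The script only writes the batch file and prints the sftp command; it does not connect anywhere.

```shell
#!/bin/sh
# Sketch: non-interactive SFTP using a batch file (sftp -b).
# sftp_batch.txt and results.tar are hypothetical names.
BATCH_FILE="${BATCH_FILE:-sftp_batch.txt}"

# Write the same commands you would type at the sftp> prompt, one per line.
cat > "$BATCH_FILE" <<'EOF'
cd /project/arcc
put transfer.file
get results.tar
ls -l
bye
EOF

# Print the command that replays the batch file against MedicineBow.
echo "sftp -b $BATCH_FILE dperkin6@medicinebow.arcc.uwyo.edu"
```

Copy the printed command into your terminal to run the batch.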
rsync CLI Tool and Example
rsync (Remote Sync) is another very useful tool with many options. It is one of the most commonly used commands for copying and synchronizing files and directories, both remotely and locally, on Linux/Unix systems. With rsync you can copy and synchronize data across directories, disks, and networks, and perform data backups and mirroring between two Linux machines.
Basic syntax of an rsync command: rsync <options> <source> <destination>
dylan@fireball:~$ rsync transfer.file dperkin6@medicinebow.arcc.uwyo.edu:/project/arcc/dperkin6
Some common options used with rsync commands
-v : verbose
-r : copies data recursively (but does not preserve timestamps and permissions while transferring data)
-a : archive mode; copies files recursively and also preserves symbolic links, file permissions, user & group ownership, and timestamps
-z : compresses file data during the transfer
-h : human-readable; outputs numbers in a human-readable format
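These options are usually combined, and rsync's --dry-run option lets you preview a transfer before committing to it. The sketch below is one possible way to put that together: the source directory data/ is a hypothetical name, and the rsync_command helper is an illustrative addition that prints the command lines rather than running them.

```shell
#!/bin/sh
# Sketch: combine the options above and preview a transfer before running it.
# The source directory name (data/) is hypothetical.
DEST="dperkin6@medicinebow.arcc.uwyo.edu:/project/arcc/dperkin6"  # from the example above

# Print an rsync command line for the given mode and source.
#   preview  -> adds --dry-run, so rsync only lists what it would copy
#   transfer -> performs the copy
# Note: "data/" (trailing slash) copies the directory's contents,
# while "data" copies the directory itself into the destination.
rsync_command() {
    opts="-avzh"
    if [ "$1" = "preview" ]; then
        opts="$opts --dry-run"
    fi
    echo "rsync $opts $2 $DEST"
}

rsync_command preview data/
rsync_command transfer data/
```

Running the preview command first is a cheap way to confirm that the source and destination paths are what you intended.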
About rclone
Rclone is a command-line program to manage files on remote storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 70 cloud storage products support rclone including S3 object stores, business & consumer file storage services, as well as standard transfer protocols. Rclone has powerful cloud equivalents to the unix commands rsync, cp, mv, mount, ls, ncdu, tree, rm, and cat. It is used at the command line, in scripts or via its API.
Rclone mounts any local, cloud or virtual filesystem as a disk on Windows, macOS, Linux and FreeBSD, and also serves these over SFTP, HTTP, WebDAV, FTP and DLNA.
Rclone helps you:
Back up (and encrypt) files to cloud storage
Restore (and decrypt) files from cloud storage
Mirror cloud data to other cloud services or locally
Migrate data to the cloud, or between cloud storage vendors
Mount multiple, encrypted, cached or diverse cloud storage as a disk
Union file systems together to present multiple local and/or cloud file systems as one
rclone Configuration
Rclone requires some configuration for each “transfer partner” (called a remote). This is a long process, but once a remote is set up it can be reused over and over. The examples below configure transfers to/from MedicineBow using authentication with an SSH key:
    dylan@fireball:~$ rclone config
    2024/07/10 14:32:44 NOTICE: Config file "/home/dylan/.config/rclone/rclone.conf" not found - using defaults
    No remotes found - make a new one
    n) New remote
    s) Set configuration password
    q) Quit config
    n/s/q> n
    name> medbow
You will then be shown a list of storage types to choose from for the ‘medbow’ remote. In our case we pick number ‘27’, SSH/SFTP Connection. The number can differ between rclone versions, so check the list printed on your system.
    Storage> 27
    ** See help for sftp backend at: https://rclone.org/sftp/ **
    SSH host to connect to
    Enter a string value. Press Enter for the default ("").
    Choose a number from below, or type in your own value
     1 / Connect to example.com
       \ "example.com"
    host> medicinebow.arcc.uwyo.edu
    SSH username, leave blank for current username, dylan
    Enter a string value. Press Enter for the default ("").
    user> dperkin6
    SSH port, leave blank to use default (22)
    Enter a string value. Press Enter for the default ("").
    port>
You will then be asked for more login information. In this case we accept the defaults by pressing Enter/Return until we reach the ‘key_file’ option, where we enter the location of the SSH key file:
    SSH password, leave blank to use ssh-agent.
    y) Yes type in my own password
    g) Generate random password
    n) No leave this optional password blank (default)
    y/g/n>
    Raw PEM-encoded private key, If specified, will override key_file parameter.
    Enter a string value. Press Enter for the default ("").
    key_pem>
    Path to PEM-encoded private key file, leave blank or set key-use-agent to use ssh-agent.
    Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.
    Enter a string value. Press Enter for the default ("").
    key_file> /home/dylan/.ssh/id_ecdsa
The next options relate to passwords and certificates. For MedicineBow none of these apply, so we keep pressing Enter until we reach the cipher prompt (use_insecure_cipher), where we enter '1' for false. The final two steps are declining the advanced configuration (n) and then confirming the remote (y), which saves it.
    The passphrase to decrypt the PEM-encoded private key file.
    Only PEM encrypted key files (old OpenSSH format) are supported. Encrypted keys in the new OpenSSH format can't be used.
    y) Yes type in my own password
    g) Generate random password
    n) No leave this optional password blank (default)
    y/g/n>
    Choose a number from below, or type in your own value
     1 / Use default Cipher list.
       \ "false"
     2 / Enables the use of the aes128-cbc cipher and diffie-hellman-group-exchange-sha256, diffie-hellman-group-exchange-sha1 key exchange.
       \ "true"
    use_insecure_cipher> 1
    Disable the execution of SSH commands to determine if remote file hashing is available.
    Leave blank or set to false to enable hashing (recommended), set to true to disable hashing.
    Enter a boolean value (true or false). Press Enter for the default ("false").
    disable_hashcheck>
    Edit advanced config? (y/n)
    y) Yes
    n) No (default)
    y/n> n
    Remote config
    --------------------
    [medbow]
    host = medicinebow.arcc.uwyo.edu
    user = dperkin6
    key_file = /home/dylan/.ssh/id_ecdsa
    use_insecure_cipher = false
    --------------------
    y) Yes this is OK (default)
    e) Edit this remote
    d) Delete this remote
    y/e/d> y
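For reference, the interactive session stores the remote as a short plain-text entry in rclone's config file (/home/dylan/.config/rclone/rclone.conf in the session above). The sketch below writes an equivalent entry to a scratch file so that your real configuration is not touched. The scratch path is a hypothetical name, and the type = sftp line is an assumption based on the SSH/SFTP choice made earlier, since it does not appear in the summary printed above. The key_file value is the path entered at the key_file prompt.

```shell
#!/bin/sh
# Sketch: the remote from the interactive session, written as a config file.
# CONF is a hypothetical scratch path so your real rclone.conf is not touched.
CONF="${CONF:-./rclone-medbow.conf}"

cat > "$CONF" <<'EOF'
[medbow]
type = sftp
host = medicinebow.arcc.uwyo.edu
user = dperkin6
key_file = /home/dylan/.ssh/id_ecdsa
use_insecure_cipher = false
EOF
chmod 600 "$CONF"   # keep the file private to your user

# If rclone is installed, list the remotes it finds in this file.
# This only reads the file and does not contact MedicineBow.
if command -v rclone >/dev/null 2>&1; then
    rclone --config "$CONF" listremotes
else
    echo "rclone not found; wrote $CONF"
fi
```

Pointing rclone at a specific file with --config is also a convenient way to test a remote before merging it into your main configuration.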
Using rclone
The basic syntax is as follows: rclone <function> <source> <destination endpoint>:<bucket>. Either side can be a local path or a configured remote such as medbow:.
The basic functions are:
copy - to copy files/directories to or from somewhere
sync - (one way) to make a directory identical
move - to move files to cloud storage, deleting the local copy after verification
check - for missing/extra files
mount - your cloud storage as a network disk
More information on each function can be found at https://rclone.org/#what. An example of a copy from local to MedicineBow would be:
dylan@fireball:~$ rclone copy transfer.file medbow:/project/arcc
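Putting the functions together, a transfer workflow might upload a directory, verify it, and later bring results back. The sketch below is an illustration under stated assumptions, not a prescribed ARCC workflow: the directories input_data and results are hypothetical, the value 8 for --transfers is an arbitrary example (rclone's default is 4), and the RCLONE variable is an illustrative switch that makes the script print the commands by default instead of running them.

```shell
#!/bin/sh
# Sketch: upload, verify, and retrieve with the 'medbow' remote configured above.
# The directories input_data and results are hypothetical.
# By default the commands are only printed; set RCLONE=rclone to run them.
RCLONE="${RCLONE:-echo rclone}"
REMOTE_DIR="medbow:/project/arcc"

# Copy a local directory to MedicineBow using 8 parallel file transfers.
# rclone copies the directory's contents, so the destination names the directory.
upload() {
    $RCLONE copy --transfers 8 "$1" "$REMOTE_DIR/$1"
}

# Compare source and destination and report missing or differing files.
verify() {
    $RCLONE check "$1" "$REMOTE_DIR/$1"
}

# Bring a remote directory back to the local machine.
download() {
    $RCLONE copy "$REMOTE_DIR/$1" "$1"
}

upload input_data
verify input_data
download results
```

Once the printed commands look right, rerun the script with RCLONE=rclone to perform the transfers.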