Basic Data Transfer with Globus

Globus is ARCC’s recommended method for transferring data onto our resources. Globus provides a secure, unified interface to your research data. Use Globus to 'fire and forget' high-performance data transfers between systems within and across organizations. Globus does more than transfer data: you can move, share, sync, and find data through a single interface, regardless of where the data actually “lives”. Whether your files are housed on a supercomputer, lab cluster, tape archive, public cloud, or your laptop, you can manage them from anywhere through your web browser, using your existing identities.



Globus Key Concepts

It’s important to be aware of a few key concepts that are specific to Globus transfers. Understanding them will clear up much of the confusion around the content discussed below.

  • Collection - A collection is a named location containing data you can access with Globus. Collections can be hosted on many different kinds of systems, including campus storage, HPC clusters, laptops, Amazon S3 buckets, Google Drive, and scientific instruments. When you use Globus, you don’t need to know a physical location or details about storage. You only need a collection name. A collection allows authorized Globus users to browse and transfer files. Collections can also be used for sharing data with others and for enabling discovery by other Globus users.

  • Endpoint - An endpoint is a server that hosts collections. If you want to be able to access, share, transfer, or manage data using Globus, the first step is to create an endpoint on the system where the data is (or will be) stored. An endpoint can be a laptop, a personal desktop system, a laboratory server, a campus data storage service, a cloud service, or an HPC cluster. It’s easy to set up your own Globus endpoint on a laptop or other personal system using Globus Connect Personal. Administrators of shared services (like campus storage servers) can set up multi-user endpoints using Globus Connect Server. You can use endpoints set up by others as long as you’re authorized by the endpoint administrator or by a collection manager.

  • Fire-And-Forget Data Transfer - After you request a file transfer, Globus takes over and does the work on your behalf. You can navigate away from the File Manager, close the browser window, and even log out. Globus will optimize the transfer for performance, monitor the transfer for completion and correctness, and recover from network errors and collection downtime. When a problem is encountered partway through the transfer, Globus resumes from the point of failure and does not retransmit all of the data specified in the original request. Globus can handle extremely large data transfers, even those that don’t complete within the authentication expiration period of a collection (which is controlled by the collection administrator). If your credentials expire before the transfer completes, Globus will notify you to re-authenticate on the collection, after which Globus will continue the transfer from where it was paused.
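
The "resume from the point of failure" behavior described above can be pictured with a small toy model. This is only an illustrative sketch, not how Globus is implemented: the function `transfer_with_resume`, the `send` callable, and the chunk names are all hypothetical, and a simulated `ConnectionError` stands in for a network fault.

```python
def transfer_with_resume(chunks, send, max_attempts=5):
    """Send chunks in order, retrying from the last confirmed chunk on failure.

    `send` is a hypothetical callable that raises ConnectionError on a
    simulated network fault. Returns the total number of send calls made.
    """
    next_index = 0   # checkpoint: index of the first chunk not yet confirmed
    calls = 0        # total send attempts, including retries
    attempts = 0     # consecutive failures on the current chunk
    while next_index < len(chunks):
        try:
            calls += 1
            send(chunks[next_index])
            next_index += 1   # advance the checkpoint only after success
            attempts = 0
        except ConnectionError:
            attempts += 1
            if attempts >= max_attempts:
                raise         # give up after repeated failures on one chunk
    return calls


received = []
failed_once = set()


def flaky_send(chunk):
    # Fail the first time chunk "c" is sent, then succeed on the retry.
    if chunk == "c" and chunk not in failed_once:
        failed_once.add(chunk)
        raise ConnectionError("simulated network drop")
    received.append(chunk)


calls = transfer_with_resume(["a", "b", "c", "d"], flaky_send)
print(received, calls)
```

In this toy run, only the failed chunk is sent a second time. Chunks that were already confirmed before the fault are not retransmitted, which is the idea behind resuming from a checkpoint rather than restarting the whole transfer.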


Globus Web Interface

The first step in transferring data with Globus is to navigate to the Globus website at https://www.globus.org/. From there, click the LOG IN icon.

 


Logging into Globus

To begin using Globus with ARCC systems, use the option for logging in with your existing organizational login and search for “University of Wyoming”.

  • Caveat: If the account does not have an associated UWyo e-mail address, you will not be able to set up Globus with that account.


Account Linking

After logging in with your UWyo credentials, you will be prompted with a message about linking accounts. Consider linking accounts if you have used Globus at another institution or if you have a separate Globus ID.


Globus File Manager

The Globus File Manager is the primary interface for transferring files. This screen is where you begin looking for the collections you want to transfer from or to.

  • Start by clicking on the ‘search’ box to begin looking for the collection you want to use.

 


ARCC’s Collections

By searching for ‘arcc’ in the collection search, you can find all collections managed by ARCC. The results show each collection’s owner and description. For UWyo ARCC collections, the owner will have a @uwyo.edu email address, and the description will name the filesystem being used. Here you can find:

  • MedicineBow

  • Pathfinder

  • And others that may or may not still be active
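
The owner check described above can also be expressed as a small filter. This is a minimal sketch under stated assumptions: the helper `arcc_collections` is hypothetical, the field names `display_name` and `owner_string` are assumed to match the records a Globus collection search returns, and the sample records (including the owner addresses) are invented purely for illustration.

```python
def arcc_collections(results, owner_domain="@uwyo.edu"):
    """Return the display names of search results owned by a UWyo account."""
    return [
        record["display_name"]
        for record in results
        # Keep only records whose owner address ends with the UWyo domain.
        if record.get("owner_string", "").lower().endswith(owner_domain)
    ]


# Invented sample records; real search results will differ.
sample_results = [
    {"display_name": "MedicineBow", "owner_string": "example-owner@uwyo.edu"},
    {"display_name": "Pathfinder", "owner_string": "example-owner@uwyo.edu"},
    {"display_name": "arcc-lookalike", "owner_string": "someone@example.org"},
]

print(arcc_collections(sample_results))
```

A search for ‘arcc’ can match collections that are not managed by ARCC, so checking the owner before transferring data is a sensible extra step.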

 

 

To access the collections managed by ARCC, you will need to provide your username and password along with your preferred two-factor authentication method (Duo Push).


Basic Globus Transfer

Below is an example of a basic transfer from a Globus Connect Personal endpoint on a laptop to an ARCC system. After navigating to the directory you want to transfer from and the directory you want to transfer to, select the files or directories you want and click “Start”.
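
For readers who later want to script the same kind of transfer, the request behind the “Start” button can be pictured as a small document naming the source collection, the destination collection, and the items to move. The sketch below only builds such a document locally and sends nothing. The helper `build_transfer_request`, the collection IDs, and the paths are hypothetical placeholders, and the field names follow the Globus Transfer API as we understand it, so check the official documentation before relying on them. A real submission also needs authentication and a submission ID obtained from the service, both of which are omitted here.

```python
import json


def build_transfer_request(source_id, dest_id, items, label="laptop to ARCC"):
    """Build a transfer request document (nothing is sent anywhere).

    `items` is a list of (source_path, destination_path, recursive) tuples.
    """
    return {
        "DATA_TYPE": "transfer",
        "source_endpoint": source_id,
        "destination_endpoint": dest_id,
        "label": label,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                "recursive": recursive,  # True for directories
            }
            for src, dst, recursive in items
        ],
    }


# Placeholder IDs and paths; substitute your own collections and directories.
request = build_transfer_request(
    "SOURCE-COLLECTION-UUID",
    "DESTINATION-COLLECTION-UUID",
    [
        ("/~/results/", "/example/destination/results/", True),
        ("/~/notes.txt", "/example/destination/notes.txt", False),
    ],
)
print(json.dumps(request, indent=2))
```

Each entry in `DATA` corresponds to one file or directory selected in the File Manager, and the two collection IDs correspond to the two collections chosen in the interface.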

 


Enable Globus Personal Endpoint

If you want to enable Globus on your personal or work computer, please follow the instructions for your operating system on the Globus website.


Next Steps