...

Research data storage is a core service that ARCC provides. The available storage options are discussed in more detail in subsequent modules. Briefly, ARCC provides three core storage systems, each filling a different role in the Research Data Life-cycle, as summarized below:

The ARCC Data Portal (Storage)

  • Free for UWyo researchers up to a default limit

  • Accessible via the UWyo network or VPN

  • Includes backups and snapshots

MedicineBow HPC system (Analysis)

  • Home (for configuration and profiles)

  • Project (for shared data during analysis)

  • gscratch (for data that is actively read and written during analysis)

    • MedicineBow storage is NOT backed up, but includes snapshots

Pathfinder (Storage)

  • Cloud-like backend

  • Web-enabled S3 buckets for data storage, data transfer, etc.

  • Is NOT backed up

Transferring data to and from these systems is discussed in another workshop.

...

The Analysis Phase

The analysis phase can involve a variety of methodologies and tools, and it often produces several stages and versions of the data. Here are a few questions to ask yourself before entering this phase of the Research Data Life-cycle:

  • How large are the data that I am working with?

  • Will I need a powerful system such as a High Performance Computing system to complete this work?

  • What software will I need to perform the analysis?

  • Will new data be generated as a result of this work (simulated data for model training, a summarized subset of the raw data, etc.)?

  • Will this work change my raw data, and do I need to keep a copy of the raw data, the results, or both?

  • How will I manage the changes that will happen during this phase and maintain a record of them?
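As one possible way to approach the questions about data size and keeping a record of changes, the sketch below builds a simple manifest of a data directory (file sizes and SHA-256 checksums) and compares two manifests. It is only an illustration using the Python standard library, not an ARCC tool; the function names and the manifest layout are assumptions made for this example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's relative path to its size in bytes and its checksum."""
    root = Path(data_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = {
                "bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            }
    return manifest


def total_bytes(manifest):
    """Total size of all files listed in the manifest."""
    return sum(entry["bytes"] for entry in manifest.values())


def changed_files(old, new):
    """List paths that were added, removed, or modified between two manifests."""
    return sorted(
        name for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    )


def save_manifest(manifest, out_path):
    """Write the manifest as JSON so it can be kept with the project records."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
```

A typical use would be to call `build_manifest` on the raw data before analysis starts, save the result with `save_manifest`, and later call `changed_files` on the saved and current manifests to see which files were altered. The manifest file is best stored outside the directory being scanned so that it does not appear in its own listing.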

...

How ARCC Can Help With the Analysis Phase

...