For convenience and organization, the petaLibrary is split into two distinct areas. The first area, Data Commons, is for data being used in active research projects. This is the primary storage location, where most research data is kept. The second area, Data Curation, is for data used in published research. This area is home to data that will no longer be modified and/or is not being used in any active research project. The petaLibrary is connected to the UWyo network at 40 Gbps, upgradeable to 80 Gbps. The UWyo network is in turn connected to a specialized research network at 100 Gbps to allow for high-volume bulk data transfer, remote experiment control, and data visualization.
ARCC petaLibrary
petaLibrary System Overview
The UW Libraries and UW/IT have collaborated on a shared digital data repository designed to provide two main functions: 1) a storage service for UW researchers who need to reliably store and exchange data with students and collaborators anywhere in the world; and 2) a place for UW researchers to store data linked to publications, or datasets that are publications themselves, in compliance with funder requirements.
The petaLibrary is connected to the UW network backbone at 40 Gbps (upgradeable to 80 Gbps), which in turn is connected to the Internet2 research network at 100 Gbps, so data can be served to the wide-area network (WAN) at high speed. The service is designed and operated so that access to data in the Data Commons should be limited only by the receiving endpoint or intervening network connections, not by the UW infrastructure. UW researchers will thus be able to share (and/or host) research data without the UW side of the connection being the bottleneck.
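To put the link speeds above in perspective, the following minimal sketch estimates the best-case time to move a dataset at a given line rate. It is an idealized lower bound only: it assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and ignores protocol overhead, disk throughput, and the receiving endpoint, all of which reduce real-world speeds. The function name `transfer_seconds` is illustrative and not part of any petaLibrary tooling.

```python
def transfer_seconds(size_tb: float, link_gbps: float) -> float:
    """Best-case transfer time in seconds at full line rate.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s.
    Ignores protocol overhead and endpoint limits (an idealization).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    size_bits = size_tb * 1e12 * 8
    return size_bits / (link_gbps * 1e9)


if __name__ == "__main__":
    # Link speeds quoted in the text: 40 Gbps backbone, 80 Gbps upgrade,
    # 100 Gbps Internet2 connection.
    for gbps in (40, 80, 100):
        print(f"1 TB at {gbps} Gbps: {transfer_seconds(1, gbps):.0f} s")
```

Under these assumptions, a 1 TB dataset needs at least 200 seconds at 40 Gbps and 80 seconds at 100 Gbps; actual transfers will take longer.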
Storage Overview
The petaLibrary is broken down into two main areas: Data Commons and Data Curation, the latter comprising Publications and Archive.
Data Commons
Data Commons is led by the UW/IT/Research Support group and is divided into two functional areas:
Commons
Commons is a collaborative, project-oriented storage area intended to be used jointly with other researchers (UW and/or beyond) to store data from an active research project. Principal investigators are able to delegate access permissions to other campus users or external collaborators. This area of the storage system is very similar to Bighorn on Mount Moran, but with fewer restrictions on what the system can be used for.
We anticipate making Commons available via the Shibboleth authentication system so that external collaborators may use their home credentials for access (and not have to maintain separate UW credentials). This is similar to networking services enabled with EduRoam.
Data Curation
Overseen by the UW Libraries, Data Curation is divided into two functional areas:
Publications
Publications is provided by the University as a service to UW researchers who, by choice or necessity, cannot publish their data in domain-specific repositories, publisher repositories, or other external repositories, either because suitable services don't exist or because the researcher prefers to establish UW (and their research team) as the authoritative home for the data. Publications is used to present research materials and accompanying documentation or data on the internet. All data contained within the Publications store is publicly available for download via a UW Libraries-provided web service and/or direct web access.
There is no charge for data stored in this repository, but the data must meet publication criteria established by the UW Libraries and cannot be changed once published.
Archive
Archive refers to the "cold storage" portion of the petaLibrary. It will serve as a Curation repository for unpublished data that is no longer part of an ongoing research project but still has value to the research community.
The system will allow data to be retrieved, but will not allow data stored within the Archive to be modified. If a change is required, the data will be migrated to Data Commons until it meets the criteria for archival again.
Price Structure
The cost of storage allocations is based on capacity and is available in three distinct billing models:
Commons Storage — $100 per terabyte per year, billed monthly based on current usage (Free up to 500 GB)
Long Term Storage — $1,000 per terabyte for 10 years, billed upfront based on the allocation
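As a rough illustration of the two rates listed above, the sketch below computes a monthly Commons charge and an upfront Long Term Storage charge. It rests on assumptions the text does not confirm: that 1 TB is counted as 1000 GB, that the first 500 GB of Commons usage is always free (rather than free only when total usage stays under 500 GB), and that the yearly Commons rate is split evenly across 12 monthly bills. The function names are hypothetical.

```python
GB_PER_TB = 1000.0                 # assumption: decimal terabytes
COMMONS_RATE_PER_TB_YEAR = 100.0   # from the price list
COMMONS_FREE_GB = 500.0            # from the price list
LONG_TERM_RATE_PER_TB = 1000.0     # from the price list; covers 10 years


def commons_monthly_cost(usage_gb: float) -> float:
    """Monthly Commons charge in dollars for the current usage.

    Assumes the first 500 GB is always free and the yearly rate
    is divided evenly over 12 months.
    """
    billable_gb = max(0.0, usage_gb - COMMONS_FREE_GB)
    return billable_gb / GB_PER_TB * COMMONS_RATE_PER_TB_YEAR / 12


def long_term_upfront_cost(allocation_tb: float) -> float:
    """Upfront Long Term Storage charge in dollars for a 10-year allocation."""
    return allocation_tb * LONG_TERM_RATE_PER_TB


if __name__ == "__main__":
    print(f"Commons, 1500 GB used: ${commons_monthly_cost(1500):.2f} per month")
    print(f"Long Term, 5 TB allocated: ${long_term_upfront_cost(5):.2f} upfront")
```

Under these assumptions, 1500 GB in Commons leaves 1 TB billable, or about $8.33 per month, while a 5 TB Long Term allocation costs $5,000 upfront for the 10-year term. Actual invoices may be computed differently.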