For convenience and organization, the petaLibrary is split into two distinct areas. The first area, Data Commons, is for data being utilized in active research projects. This is the primary storage location where most research data is contained. The second area, Data Curation, is for data used in published research. This area is home to data that will no longer be manipulated and/or is not being used in any active research projects.
The petaLibrary is connected to the UW network backbone at 40 Gbps (upgradeable to 80 Gbps). The backbone is in turn connected to the Internet2 research network at 100 Gbps, which allows for high-volume bulk data transfer, remote experiment control, and data visualization across the wide-area network (WAN). The service is designed and operated so that access to data in the Data Commons is limited only by the receiving endpoint or intervening network connections, not by the UW infrastructure. UW researchers are thus able to share (and/or host) research data at these speeds.
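To put the quoted link speeds in perspective, the following is a minimal sketch that estimates a theoretical lower bound on transfer time. It assumes decimal units (1 TB = 10^12 bytes) and ignores protocol overhead, disk throughput, and competing traffic, so real transfers will be slower. The function names are illustrative and not part of any petaLibrary tooling.

```python
def bottleneck_gbps(*link_speeds_gbps: float) -> float:
    """Return the slowest link on a path; it bounds end-to-end throughput."""
    if not link_speeds_gbps:
        raise ValueError("at least one link speed is required")
    return min(link_speeds_gbps)


def min_transfer_seconds(size_terabytes: float, link_gbps: float) -> float:
    """Theoretical minimum transfer time, ignoring all overhead.

    Assumes decimal terabytes (1 TB = 1e12 bytes) and 1 Gbps = 1e9 bits/s.
    """
    if link_gbps <= 0:
        raise ValueError("link speed must be positive")
    bits = size_terabytes * 1e12 * 8
    return bits / (link_gbps * 1e9)


# Speeds quoted in the text: 40 Gbps backbone link, 100 Gbps research network.
path_speed = bottleneck_gbps(40, 100)
for gbps in (40, 80, 100):
    print(f"{gbps} Gbps: {min_transfer_seconds(1, gbps):.0f} s per TB (best case)")
print(f"Path through both links is bounded by {path_speed} Gbps")
```

Under these assumptions the slower of the two links sets the ceiling, which is why the text notes that the receiving endpoint or intervening connections are the expected limit.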
The petaLibrary is organized into Data Commons and Data Curation; Data Curation comprises Publications and Archive.
Led by the UW/IT/Research Support group, Data Commons is divided into two functional areas:
Commons is a collaborative, project-oriented storage area intended to be used jointly with other researchers (UW and/or beyond) to store data from an active research project. Principal investigators are able to delegate access permissions to other campus users or external collaborators. This area of the storage system is very similar to Bighorn on Mount Moran, but with fewer restrictions on what the system can be used for.
We anticipate making Commons available via the Shibboleth authentication system so that external collaborators can use their home credentials for access (and need not maintain separate UW credentials). This is similar to the way eduroam enables network access.
Overseen by the UW Libraries, Data Curation is divided into two functional areas:
Publications is provided by the University as a service to UW researchers who, by choice or necessity, cannot publish their data in research domain-specific repositories, publisher repositories, or other external repositories, either because suitable services don't exist or because the researcher prefers to establish UW (and their research team) as the authoritative home for the data. Publications is used to present research materials, accompanying documentation, or data on the internet. All data contained within the Publications store is publicly available for download via a UW/Libraries-provided web service and/or direct web access.
There is no charge for data stored in this repository, but it must meet publication criteria established by the UW Libraries and cannot be changed once published.
Archive refers to the "cold storage" portion of the petaLibrary. It will serve as a Curation repository for unpublished data that is no longer part of an ongoing research project but has value to the research community.
The system will allow data to be retrieved, but will not allow data stored within the Archive to be modified. If a change is required, the data will be migrated to Data Commons until it again meets the criteria for archival.
The cost of a storage allocation is based on capacity and depends on the billing model:
Commons Storage — $100 per terabyte per year, billed monthly based on current usage (Free up to 500 GB)
Long Term Storage — $1,000 per terabyte for 10 years, billed upfront based on the allocation
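The two rates listed above can be compared with a short calculation. This is a rough sketch, not an official billing formula: it assumes decimal units (1 TB = 1000 GB), assumes the free 500 GB is deducted from usage before billing (the text could also be read as "free only while total usage stays under 500 GB"), and assumes the monthly Commons charge is one twelfth of the annual rate. The function names are hypothetical.

```python
GB_PER_TB = 1000.0                 # assumption: decimal units
COMMONS_RATE_PER_TB_YEAR = 100.0   # from the text: $100 per TB per year
COMMONS_FREE_GB = 500.0            # from the text: free up to 500 GB
LONG_TERM_RATE_PER_TB = 1000.0     # from the text: $1,000 per TB for 10 years


def commons_monthly_cost(usage_gb: float) -> float:
    """Estimated monthly Commons charge for the current usage.

    Assumption: the first 500 GB is never billed, and the monthly
    charge is the annual rate divided by 12.
    """
    billable_gb = max(0.0, usage_gb - COMMONS_FREE_GB)
    return billable_gb / GB_PER_TB * COMMONS_RATE_PER_TB_YEAR / 12


def long_term_upfront_cost(allocation_tb: float) -> float:
    """Upfront charge for a 10-year Long Term Storage allocation."""
    return allocation_tb * LONG_TERM_RATE_PER_TB


print(f"Commons, 2.5 TB in use: ${commons_monthly_cost(2500):.2f} per month")
print(f"Long Term, 5 TB allocation: ${long_term_upfront_cost(5):.2f} upfront")
```

At these rates both models work out to the same $100 per terabyte per year; the practical difference is that Commons is billed on current usage while Long Term Storage is billed once on the full allocation.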
Connecting to the petaLibrary
Requesting Access
To request a storage allocation on the petaLibrary, the Principal Investigator of a research project fills out the petaLibrary Access Request form.
Getting Connected
SMB (Linux, Mac OS, Windows): While users are on the UWyo network, the petaLibrary can be accessed from any of the three primary OS families using their standard tools.
Connecting While Off-Campus
Globus
Getting connected to Globus Data Transfer: Globus
The Globus endpoint name for the petaLibrary is "ARCC petaLibrary".
RHEL 7 and Ubuntu 16.04/10 Linux users will need to add the following lines to the Global section of the /etc/samba/smb.conf file:
Note: RHEL 6 is currently unable to connect to the petaLibrary. ARCC is working on a solution.
Access from Teton
The petaLibrary is accessible from the Teton compute cluster at 10 Gbps. You can access homes and common by changing directory to /petalibrary. Teton documentation can be found at HPC system: Teton.
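As a quick check from a Teton session, a script can confirm that /petalibrary is mounted and list what is visible under it. The sketch below uses only the Python standard library; the helper name describe_mount is hypothetical, and the layout of directories beneath /petalibrary is an assumption that may differ on the actual system.

```python
import os
import shutil


def describe_mount(root: str = "/petalibrary") -> dict:
    """Summarize a mounted storage root: top-level directories and free space.

    Raises FileNotFoundError if the root is not present on this host.
    """
    if not os.path.isdir(root):
        raise FileNotFoundError(f"{root} is not mounted or not a directory")
    usage = shutil.disk_usage(root)  # named tuple: total, used, free (bytes)
    areas = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {
        "root": root,
        "areas": areas,
        "free_tb": usage.free / 1e12,  # decimal terabytes
    }


# Only query the real path when it exists, so the sketch is safe to run anywhere.
if os.path.isdir("/petalibrary"):
    print(describe_mount())
else:
    print("/petalibrary is not mounted on this host")
```

The free-space figure reports whatever the underlying filesystem exposes and may not reflect an individual project's allocation.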
petaLibrary Road Map
ownCloud
Within the next year, we hope to enable ownCloud to provide Dropbox-like functionality.
Collaboration with Regional Institutions
Coming Soon