Documentation of research data, also called metadata, is an often overlooked but critical aspect of research data management. This type of metadata goes beyond the basic system metadata familiar to computer scientists, such as file sizes and ownership. In a data management context, metadata provides descriptive elements that tell viewers of the data who collected it, how it was collected, where and when it was collected, what processing methods were applied, and much more. Not only is this a requirement for data publishing, it is also very useful for collaboration with other researchers and can serve as a tool to ensure consistency throughout the other steps of the Research Data Management Life-cycle.

Common Metadata Standards

Several standards are available, depending on discipline. They help ensure that you have a complete, standard set of information about each part of your data, and they allow your dataset to be organized alongside other datasets. A few examples are:

While these standards exist and can help researchers record complete metadata, their use is not universally required. A best practice is to record the information that works for you and your collaborators. It’s better to have something than nothing at all.

Common Metadata Fields

While the standards listed above are each unique and recommend recording different information, they share several commonalities. Listed below are a few fields that a metadata document could contain for a directory of research data.

Creator/Author: <Researcher name/ORCID>
Subject/title: <Name of the data>
Description: <Short paragraph describing the data and how they got to this state e.g., image taken, data processed, etc.>
Contributor(s)/Collaborator(s): <Names of people associated with the project>
Date: <use a format that is standardized across all the data e.g., YYYYMMDD>
Original Format/File types: <.txt .csv .png .sql>
Relation: <list any relating files/folders>
Location: <e.g., Latitude & Longitude in decimal degrees>
Rights: <funder grant number, or open source>
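As a minimal sketch, the fields above could be written to a plain text file from the command line. The file name metadata.txt and every value shown are hypothetical placeholders, not part of any standard; replace them with your own project's details.

```shell
#!/bin/sh
# Hypothetical output file name; adjust to your project's conventions.
metadata_file="metadata.txt"

# All values below are illustrative placeholders, not real project data.
# The date is filled in automatically in YYYYMMDD format.
cat > "$metadata_file" <<EOF
Creator/Author: Jane Researcher (ORCID: 0000-0000-0000-0000)
Subject/title: Example stream temperature readings
Description: Hourly logger readings exported to CSV and checked for gaps.
Contributor(s)/Collaborator(s): Sam Collaborator
Date: $(date +%Y%m%d)
Original Format/File types: .csv .txt
Relation: raw/ processed/ README.txt
Location: 41.0000, -105.0000
Rights: <funder grant number, or open source>
EOF

# Show the result so it can be reviewed before sharing.
cat "$metadata_file"
```

Keeping the field names identical across every directory makes the files easy to search later, for example with grep.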

Metadata File Types / README

Metadata can be recorded in multiple ways, including in a filename, in a spreadsheet, in an XML file, or in a database. However, a very common approach is a simple text file called a README file. A README provides information about a data file and is intended to help ensure that the data can be correctly interpreted, by yourself at a later date or by others when sharing or publishing data. In general, a good README should include several things in addition to what is listed above:

General Items:
File naming system (with examples)
Folder structure
Relationships and dependencies between files
Other documentation files of interest within dataset (notes, companion files)
For each major file, short description of contents
Date of creation of each major file

Optional Items:
Experimental & environmental conditions of collection (if appropriate)
Standards and calibration for data collection (if applicable)
Uncertainty, precision and accuracy of measurements (if appropriate)

Other Recommendations:
Methods used for data processing
Software used in data collection and processing, including version numbers
File formats used in the dataset & recommended software
Quality control procedure applied
Description of file versioning system if appropriate
Dataset changelog
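Some of these items, such as the folder structure and a date for each major file, can be drafted automatically and then pasted into the README. The sketch below is one possible approach, assuming a Linux system with GNU coreutils (where `date -r FILE` prints a file's last-modified time). The directory example_dataset, its sample file, and the output name file_inventory.txt are hypothetical and are created only so the sketch runs as-is.

```shell
#!/bin/sh
# Hypothetical throwaway dataset so the sketch is runnable as written;
# point data_dir at your real project directory instead.
data_dir="example_dataset"
mkdir -p "$data_dir/raw"
printf 'site,temp_c\nA,4.2\n' > "$data_dir/raw/readings.csv"

inventory="file_inventory.txt"

# One line per file: last-modified date (YYYY-MM-DD), two spaces, then the path.
# This is the modification date, not the creation date, which many Linux
# filesystems and tools do not report reliably.
find "$data_dir" -type f | sort | while IFS= read -r f; do
    printf '%s  %s\n' "$(date -r "$f" +%Y-%m-%d)" "$f"
done > "$inventory"

cat "$inventory"
```

The listing is only a starting point: the short description of each file's contents still has to be written by hand.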

README Example

Luckily for UWYO researchers, the University Data Librarians provide an extensive sample README that can be downloaded directly from their website. Below is an example of what the beginning of the file looks like:

This DATASETNAMEreadme.txt file was generated on YYYY-MM-DD by NAME
GENERAL INFORMATION
1. Title of Dataset: 
2. Description or abstract of dataset:
3. Author Information
	A. Principal Investigator Contact Information
		Name: 
		Institution: 
		Address: 
		Email: 
4. Date of data collection (single date, range, approximate date) <suggested format YYYY-MM-DD>: 
5. Geographic location of data collection <latitude, longitude, or city/region, State, Country, as appropriate>: 
6. Information about funding sources that supported the collection of the data: 
7. Keywords for dataset: 
8. Discipline of dataset:
9. License for dataset:

How to Download the Libraries' README on ARCC HPC

ARCC recommends that researchers download the Libraries README file into the project directory for High Performance Computing (HPC) projects when they initially set up the project. That way the file can be maintained throughout the process, so that when it comes to the publishing phase of the Research Data Life-cycle, the metadata is already recorded and can be shared quickly.

Here is an example of Linux commands that download the Libraries README file and rename it:

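# Download the sample README (the URL is quoted so the shell does not treat "?" as a wildcard)
wget 'https://uwyo.libguides.com/ld.php?content_id=61572044'

# Rename the downloaded file to README.txt
mv 'ld.php?content_id=61572044' README.txt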
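
Once README.txt is in the project directory, the placeholders on its first line can be filled in right away. The following is a rough sketch rather than an official ARCC or Libraries procedure. The author name is hypothetical, and if no README.txt is present the script writes a one-line stand-in so that it can run offline.

```shell
#!/bin/sh
readme="README.txt"

# Stand-in for the downloaded template so this sketch runs without network
# access; an existing README.txt is left in place and edited instead.
if [ ! -f "$readme" ]; then
    printf 'This DATASETNAMEreadme.txt file was generated on YYYY-MM-DD by NAME\n' > "$readme"
fi

author="Jane Researcher"    # hypothetical; must not contain "/" or "&"
today=$(date +%Y-%m-%d)

# Replace the date and name placeholders on the first line only, writing to a
# temporary file first so the original is never left half-edited.
sed "1s/generated on YYYY-MM-DD by NAME/generated on $today by $author/" \
    "$readme" > "$readme.tmp" && mv "$readme.tmp" "$readme"

head -n 1 "$readme"
```

The remaining numbered fields are easier to complete by hand in a text editor as the project progresses.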
