File Naming Conventions

This page walks researchers through the process of creating a file naming convention for a group of files. This process includes: choosing metadata, encoding and ordering the metadata, adding version information, and properly formatting the file names.



Unique File Names

 

What information (metadata) about these files is important and makes each file distinct?

Ideally, pick three pieces of metadata; use no more than five. This metadata should be enough for you to visually scan the file names and easily understand what’s in each one. Example: For my images, I want to know date, sample ID, and image number for that sample on that date.

 


Abbreviations

Do you need to abbreviate any of the metadata or encode it?

If any of the metadata chosen in the previous step is described by a lot of text, decide what shortened information to keep. If any of it falls into regular categories, standardize the categories and/or replace them with 2- or 3-letter codes; be sure to document these codes. Example: Sample ID will use a code made up of: a 2-letter project abbreviation (project 1 = “P1”, project 2 = “P2”); a 3-letter species abbreviation (mouse = “MUS”, fruit fly = “DRS”); and a 3-digit sample number (assigned in my notebook).
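One way to keep codes documented and applied consistently is to store them in a small lookup table and build the ID from it. The sketch below is only an illustration: the dictionary names and the `make_sample_id` function are hypothetical, and it assumes the three parts are joined with no separator, which the example above does not specify.

```python
# Hypothetical lookup tables; the codes themselves come from the example above.
PROJECT_CODES = {"project 1": "P1", "project 2": "P2"}
SPECIES_CODES = {"mouse": "MUS", "fruit fly": "DRS"}


def make_sample_id(project: str, species: str, number: int) -> str:
    """Build a sample ID from the documented codes and a 3-digit sample number."""
    if not 0 <= number <= 999:
        raise ValueError("sample number must fit in 3 digits")
    # Assumption: the parts are concatenated with no separator.
    return f"{PROJECT_CODES[project]}{SPECIES_CODES[species]}{number:03d}"


print(make_sample_id("project 1", "mouse", 23))  # prints P1MUS023
```

An unknown project or species raises a `KeyError`, which helps catch categories that have not been documented yet.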


Ordering

What is the order for the metadata in the file name?

Think about how you want to sort and search for your files to decide what metadata should appear at the beginning of the file name. If date is important, use ISO 8601-formatted dates (YYYYMMDD or YYYY-MM-DD). Example: My sample ID is most important, so I will list it first, followed by project abbreviation, then date.
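The reason ISO 8601 dates are recommended is that they sort chronologically when file names are sorted alphabetically. The short sketch below illustrates this with made-up file names that put the date first; the names are hypothetical and not part of the convention built on this page.

```python
from datetime import date

# Hypothetical file names that start with a YYYYMMDD date.
names = [
    "20240701_samp23.png",
    "20231215_samp02.png",
    "20240105_samp11.png",
]

# Plain alphabetical sorting puts the names in chronological order.
print(sorted(names))

# Formatting a date object in either ISO 8601 style.
d = date(2024, 7, 1)
compact = d.strftime("%Y%m%d")  # "20240701"
extended = d.isoformat()        # "2024-07-01"
```

Formats such as MMDDYYYY do not have this property, because the year is not the leading component.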

 


Separating Characters

What characters will you use to separate each piece of metadata in the file name?

Spaces in file names cause problems for many command-line tools and scripts, so avoid them. To make file names both computer- and human-readable, use dashes (-), underscores (_), and/or capitalize the first letter of each word in the file names. Example: I will use underscores to separate metadata.
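If existing descriptions contain spaces, they can be converted to either style programmatically. The following is a minimal sketch; the function names `to_underscores` and `to_camel_case` are hypothetical helpers written for this illustration.

```python
import re


def to_underscores(text: str) -> str:
    """Replace each run of whitespace with a single underscore."""
    return re.sub(r"\s+", "_", text.strip())


def to_camel_case(text: str) -> str:
    """Drop whitespace and capitalize the first letter of each word."""
    return "".join(word[:1].upper() + word[1:] for word in text.split())


print(to_underscores("mouse brain slice 01"))  # prints mouse_brain_slice_01
print(to_camel_case("mouse brain slice"))      # prints MouseBrainSlice
```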

 


Versioning

Will you need to track different versions of each file?

You can track versions of a file by appending version information to the end of the file name. Consider using a version number (e.g. “v01”) or the version date (use ISO 8601 format: YYYYMMDD or YYYY-MM-DD). Example: As each image goes through my analysis workflow, I will append the version type to the end of the file name (e.g. “_raw”, “_processed”, and “_composite”).
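Appending version information can be scripted so that the tag always has the same form. The sketch below assumes the tag goes before the file extension and is joined with an underscore; `add_version` and `add_status` are hypothetical helper names, and the input file name is made up for illustration.

```python
from pathlib import Path


def add_version(file_name: str, version: int) -> str:
    """Append a zero-padded version tag such as _v01 before the extension."""
    path = Path(file_name)
    return f"{path.stem}_v{version:02d}{path.suffix}"


def add_status(file_name: str, status: str) -> str:
    """Append a workflow status such as raw or processed before the extension."""
    path = Path(file_name)
    return f"{path.stem}_{status}{path.suffix}"


print(add_version("samp23_MUS_20240701.png", 1))     # prints samp23_MUS_20240701_v01.png
print(add_status("samp23_MUS_20240701.png", "raw"))  # prints samp23_MUS_20240701_raw.png
```

Zero-padding the version number keeps “v02” sorted before “v10”.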

 


Patterns

It is a good idea to write down your naming convention pattern in a README file or another place that you can look up.

Make sure the convention uses only alphanumeric characters, dashes, and underscores. Ideally, file names will be 32 characters or fewer. Example: My file naming convention is “SAMPLEID_AB_YYYYMMDD_status.png”. Examples are “samp23_MUS_20240701_raw.png” and “samp25_MUS_20240701_composite.png”.
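Once a pattern is written down, it can also be checked automatically. The sketch below is one possible reading of the example convention as a regular expression, not a definitive rule set: it assumes the sample ID is letters and digits, the abbreviation is 2 or 3 uppercase letters, and the status is a lowercase word. The names `PATTERN`, `matches_convention`, and `is_within_ideal_length` are hypothetical.

```python
import re
from datetime import datetime

# Assumed reading of SAMPLEID_AB_YYYYMMDD_status.png.
PATTERN = re.compile(
    r"(?P<sample>[A-Za-z0-9]+)_(?P<abbr>[A-Z]{2,3})_(?P<date>\d{8})_(?P<status>[a-z]+)\.png"
)
IDEAL_MAX_LENGTH = 32


def matches_convention(file_name: str) -> bool:
    """Return True if the name fits the pattern and contains a real calendar date."""
    match = PATTERN.fullmatch(file_name)
    if match is None:
        return False
    try:
        datetime.strptime(match["date"], "%Y%m%d")
    except ValueError:
        return False
    return True


def is_within_ideal_length(file_name: str) -> bool:
    """Return True if the name is no longer than the suggested 32 characters."""
    return len(file_name) <= IDEAL_MAX_LENGTH


print(matches_convention("samp23_MUS_20240701_raw.png"))   # prints True
print(matches_convention("samp 23_MUS_20240701_raw.png"))  # prints False
```

The length check is kept separate because the 32-character limit is a guideline rather than a hard requirement.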

 


Next Steps

Use the following link to provide feedback on this training: https://forms.gle/zzWGDYqnqT5mLMJi9 or use the QR code below.

Intro to Data Management.png