Maker
Overview
Maker: MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases.
This page details the use of maker and related applications available on the cluster. Please note that this page is a work in progress: as we work with researchers to understand how best to support maker on the cluster, it will likely change.
Tutorials:
Using
Use the module name maker to discover the available versions and to load the application.
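As a minimal sketch of those two steps, assuming the cluster uses Lmod (where `module spider` searches for modules; on other module systems `module avail` is the usual equivalent). The function name `load_maker` is hypothetical, and the guard is only there so the snippet fails gracefully when run off-cluster.

```shell
# Hypothetical helper: list the available maker versions, then load the default one.
load_maker() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this on the cluster" >&2
        return 1
    fi
    module spider maker   # assumption: Lmod; lists every installed maker version
    module load maker     # loads the default version
}
```

To load a specific version, append it to the module name, for example `module load maker/3.01.04`.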
On Beartooth, version 3.01.04 automatically loads the following additional modules:
snap-korf/2022-11-06
repeatmasker/4.1.3-ompi
exonerate/2.4.0
perl-bioperl/1.7.6
blast-plus/2.13.0
Control Files
When setting up a new maker environment, a user will typically run maker -CTL to generate the three core control files:
[]$ maker -CTL
maker_bopts.ctl maker_exe.ctl maker_opts.ctl
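Before editing anything, it can help to confirm that all three files were actually written. The following is a small sketch of such a check; `check_ctl_files` is a hypothetical helper name, not part of maker.

```shell
# Hypothetical helper: report any of the three control files missing from a directory.
check_ctl_files() {
    dir=${1:-.}          # directory to inspect; defaults to the current one
    missing=0
    for f in maker_bopts.ctl maker_exe.ctl maker_opts.ctl; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"    # 0 if all three files exist, 1 otherwise
}
```

Calling `check_ctl_files` with no argument in the directory where maker -CTL was run returns success only if all three files are present.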
maker_exe.ctl
The maker_exe.ctl file requires you to update a number of paths so that they point explicitly to the required applications:
Versions of the blastn, blastx, tblastx and RepeatMasker applications come packaged with maker and are already defined. You will need to explicitly enter the paths for exonerate and augustus. If you’re using the versions of the modules listed above, the paths to these versions are demonstrated below.
Beartooth Example:
Running maker -CTL will generate a file with the following paths:
If you require augustus, load it as a module.
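One way to find the paths to enter is to ask the shell where each executable resolves once the relevant modules are loaded. This is a sketch under two assumptions: that the tools are on your `PATH` after loading their modules, and that maker_exe.ctl uses `exonerate=` and `augustus=` as the entry names. `ctl_path_line` is a hypothetical helper.

```shell
# Hypothetical helper: print a "name=path" line for a tool found on the PATH.
ctl_path_line() {
    if tool_path=$(command -v "$1"); then
        printf '%s=%s\n' "$1" "$tool_path"
    else
        printf '%s=\n' "$1"
        echo "$1 not found; load its module first" >&2
    fi
}

# Run after loading the modules, then copy the results into maker_exe.ctl.
for tool in exonerate augustus; do
    ctl_path_line "$tool"
done
```

An empty value after the `=` means the tool was not found, which for augustus usually means its module has not been loaded yet.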
Teton Example:
maker_opts.ctl
On Beartooth
We have observed a problem with the model_org=all option, related to RepeatMasker, of the form:
We have worked around this issue by setting model_org to simple or leaving it blank.
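The workaround can be applied by editing maker_opts.ctl by hand, or scripted along the lines below. This sketch assumes the option sits on its own line in the form `model_org=value #comment`. `set_model_org` is a hypothetical helper, and `sed -i.bak` leaves a backup copy next to the edited file.

```shell
# Hypothetical helper: set model_org in a maker_opts.ctl file, keeping any trailing comment.
set_model_org() {
    # $1: path to maker_opts.ctl; $2: new value ("simple", or "" to leave it blank)
    sed -i.bak "s|^model_org=[^ #]*|model_org=$2|" "$1"
}

# Example: set_model_org maker_opts.ctl simple
```

Passing an empty string as the second argument leaves the option blank rather than removing the line.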
Under External Application Behaviour Options, you have the following multicore option:
Multicore
Maker can be run across multiple nodes by setting, for example:
The value after the -n option equals the number of nodes multiplied by the number of tasks per node (4 x 8 = 32).
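The arithmetic can be sketched as follows. The variable names and the `total_tasks` helper are hypothetical, and the `#SBATCH` options in the comments assume the job is requested through Slurm with those two settings.

```shell
# Hypothetical helper: total task count = nodes x tasks per node.
total_tasks() {
    echo $(( $1 * $2 ))
}

nodes=4            # e.g. from #SBATCH --nodes=4
tasks_per_node=8   # e.g. from #SBATCH --ntasks-per-node=8
echo "value for -n: $(total_tasks "$nodes" "$tasks_per_node")"
# prints "value for -n: 32"
```

If the resource request changes, the value passed to -n has to change with it so that the two stay in step.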
On Beartooth use:
On Teton use:
Memory Issues
Memory is always a concern in bioinformatics analysis, and there are no straightforward recommendations we can make. As a researcher, you’ll need to track the size of your data sets, the type of analysis, the resources you’ve requested, and how efficiently they’ve been used.
One indicator that you have not allocated enough memory is an error of the following form:
Please refer to our Slurm page on requesting and using memory: Introduction to Job Submission: 02: Memory and GPUs