Maker

1 Overview | 2 Using | 2.1 Control Files | 2.1.1 maker_exe.ctl | 2.1.1.1 Beartooth Example: | 2.1.1.2 Teton Example: | 2.1.2 maker_opts.ctl | 2.1.2.1 On Beartooth | 2.1.3 Multicore | 2.1.4 Memory Issues

Overview

Maker: MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases.

Using

Use the module name maker to discover versions available and to load the application.

On Beartooth, version 3.01.04 automatically loads the following additional modules:

snap-korf/2022-11-06 repeatmasker/4.1.3-ompi exonerate/2.4.0 perl-bioperl/1.7.6 blast-plus/2.13.0

Control Files

When setting up a new maker environment, a user will typically run maker -CTL to generate the three core control files in the current directory:

[]$ maker -CTL
[]$ ls
maker_bopts.ctl maker_exe.ctl maker_opts.ctl

maker_exe.ctl

The maker_exe.ctl file requires you to update a number of paths so that they point explicitly to the required applications.

The blastn, blastx, tblastx and RepeatMasker applications come packaged with maker, and their paths are already defined. You will need to explicitly enter the paths for exonerate and augustus; if you are using the versions of the modules listed above, the paths to those versions are shown below.

Beartooth Example:

Running maker -CTL will generate a file with the following paths:

#-----Location of Executables Used by MAKER/EVALUATOR
makeblastdb=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/makeblastdb #location of NCBI+ makeblastdb executable
blastn=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/blastn #location of NCBI+ blastn executable
blastx=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/blastx #location of NCBI+ blastx executable
tblastx=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/tblastx #location of NCBI+ tblastx executable
formatdb= #location of NCBI formatdb executable
blastall= #location of NCBI blastall executable
xdformat= #location of WUBLAST xdformat executable
blasta= #location of WUBLAST blasta executable
prerapsearch= #location of prerapsearch executable
rapsearch= #location of rapsearch executable
RepeatMasker=/apps/u/opt/gcc/12.2.0/repeatmasker/4.1.3/RepeatMasker #location of RepeatMasker executable
exonerate=/apps/u/spack/gcc/12.2.0/exonerate/2.4.0-4d3tjyb/bin/exonerate #location of exonerate executable

#-----Ab-initio Gene Prediction Algorithms
snap=/apps/u/opt/gcc/12.2.0/snap-korf/20221106/snap #location of snap executable
gmhmme3= #location of eukaryotic genemark executable
gmhmmp= #location of prokaryotic genemark executable
augustus= #location of augustus executable
fgenesh= #location of fgenesh executable
evm= #location of EvidenceModeler executable
tRNAscan-SE= #location of trnascan executable
snoscan= #location of snoscan executable

#-----Other Algorithms
probuild= #location of probuild executable (required for genemark)

If you require augustus then load it as a module.

Teton Example:

maker_opts.ctl

On Beartooth

We have observed a problem with the model_org=all option, related to RepeatMasker, of the form:

We have worked around this issue by setting model_org to simple, or by leaving it blank.
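The workaround can be applied by editing maker_opts.ctl by hand, or with a small helper such as the sketch below. The function name set_model_org is hypothetical, and the sketch assumes the option sits on a single line of the form model_org=VALUE, optionally followed by a # comment, as in the generated control files.

```shell
#!/bin/bash
# Hypothetical helper: set the model_org entry in a MAKER options file.
# Assumes a line of the form "model_org=VALUE #comment"; any trailing comment is kept.
# sed writes a backup copy of the original file next to it with a .bak suffix.
set_model_org() {
    local ctl_file="$1" value="$2"
    sed -i.bak -E "s|^model_org=[^[:space:]#]*|model_org=${value}|" "$ctl_file"
}

# Apply the workaround only if a control file exists in the current directory.
if [ -f maker_opts.ctl ]; then
    set_model_org maker_opts.ctl simple
fi
```

Calling set_model_org maker_opts.ctl "" leaves the option blank instead.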

Under External Application Behaviour Options, you have the following multicore option:

Multicore

Maker can be used across multiple nodes, by setting, for example:

The value after the -n option should equal the number of nodes multiplied by the number of tasks per node (4 x 8 = 32).
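The arithmetic behind the -n value can be sketched as follows. The node and task counts mirror the 4 x 8 example; the mpiexec launcher and the variable names are assumptions made for illustration, so use the cluster-specific command for the system you are on.

```shell
#!/bin/bash
# Example allocation matching the 4 x 8 case (values are illustrative).
NODES=4
TASKS_PER_NODE=8

# Total MPI tasks = nodes x tasks per node.
TOTAL_TASKS=$((NODES * TASKS_PER_NODE))

# Assumed launcher: mpiexec. The command is only printed here, not executed.
MAKER_CMD="mpiexec -n ${TOTAL_TASKS} maker"
echo "$MAKER_CMD"
```

If the node or task count in the job request changes, the -n value has to change with it, which is why deriving it from the two numbers is less error-prone than hard-coding it.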

On Beartooth use:

On Teton use:

Memory Issues

Memory is always an issue with any form of bioinformatics analysis, and there are no straightforward recommendations we can make. As a researcher you will need to track the size of your data sets, the type of analysis, the resources you have requested, and how efficiently they have been used.
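One way to compare what a finished job requested with what it actually used is Slurm's sacct command. The sketch below is a minimal example: the function name job_memory_report is hypothetical, and it assumes job accounting is enabled on the cluster.

```shell
#!/bin/bash
# Report requested versus peak memory for a finished Slurm job.
# ReqMem is the memory requested; MaxRSS is the peak resident memory of each job step.
SACCT_FIELDS="JobID,JobName,ReqMem,MaxRSS,Elapsed,State"

# Hypothetical helper: query the accounting records for one job ID.
job_memory_report() {
    local job_id="$1"
    sacct -j "$job_id" --format="$SACCT_FIELDS"
}

# Run the query only when a job ID was passed and sacct is available.
if [ -n "${1:-}" ] && command -v sacct >/dev/null 2>&1; then
    job_memory_report "$1"
fi
```

A MaxRSS value close to ReqMem suggests the job was near its memory limit, while a much smaller value suggests the request could be reduced.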

One indicator that you have not allocated enough memory is if you see an error of the following form:

Please refer to our Slurm page on requesting and using memory: https://arccwiki.atlassian.net/wiki/spaces/DOCUMENTAT/pages/377454617