
Overview

Maker: MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases.

Using

Use the module name maker to discover versions available and to load the application.

On Beartooth, version 3.01.04 automatically loads the following additional modules:

snap-korf/2022-11-06
repeatmasker/4.1.3-ompi
exonerate/2.4.0
perl-bioperl/1.7.6
blast-plus/2.13.0

Control Files

When setting up a new maker environment, you will typically run maker -CTL to generate the three core control files:

[]$ maker -CTL
maker_bopts.ctl  maker_exe.ctl  maker_opts.ctl

maker_exe.ctl

The maker_exe.ctl file requires you to update a number of paths so that they point explicitly to the required applications.

Versions of the blastn, blastx, tblastx and RepeatMasker applications come packaged with maker and are already defined. You will need to enter the paths for exonerate and augustus explicitly; if you are using the module versions listed above, the corresponding paths are shown below.

Beartooth Example:

Running maker -CTL will generate a file with the following paths:

 Beartooth: maker_exe.ctl
#-----Location of Executables Used by MAKER/EVALUATOR
makeblastdb=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/makeblastdb #location of NCBI+ makeblastdb executable
blastn=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/blastn #location of NCBI+ blastn executable
blastx=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/blastx #location of NCBI+ blastx executable
tblastx=/apps/u/spack/gcc/12.2.0/blast-plus/2.13.0-5zhb232/bin/tblastx #location of NCBI+ tblastx executable
formatdb= #location of NCBI formatdb executable
blastall= #location of NCBI blastall executable
xdformat= #location of WUBLAST xdformat executable
blasta= #location of WUBLAST blasta executable
prerapsearch= #location of prerapsearch executable
rapsearch= #location of rapsearch executable
RepeatMasker=/apps/u/opt/gcc/12.2.0/repeatmasker/4.1.3/RepeatMasker #location of RepeatMasker executable
exonerate=/apps/u/spack/gcc/12.2.0/exonerate/2.4.0-4d3tjyb/bin/exonerate #location of exonerate executable

#-----Ab-initio Gene Prediction Algorithms
snap=/apps/u/opt/gcc/12.2.0/snap-korf/20221106/snap #location of snap executable
gmhmme3= #location of eukaryotic genemark executable
gmhmmp= #location of prokaryotic genemark executable
augustus= #location of augustus executable
fgenesh= #location of fgenesh executable
evm= #location of EvidenceModeler executable
tRNAscan-SE= #location of trnascan executable
snoscan= #location of snoscan executable

#-----Other Algorithms
probuild= #location of probuild executable (required for genemark)

If you require augustus, load it as a module and enter the path to its executable in the augustus= entry.
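As a minimal sketch of filling in an empty entry such as augustus=, the helper below rewrites the value of one key in a control file and keeps the trailing comment. The function name set_ctl_value is hypothetical (it is not part of maker), and the module name augustus in the usage comment is an assumption about how the module is named on your cluster.

```shell
# Replace the value of one "key=value #comment" entry in a MAKER control
# file, keeping the trailing comment. Usage: set_ctl_value FILE KEY VALUE
# (hypothetical helper, not part of maker)
set_ctl_value() (
    file=$1 key=$2 value=$3
    # "|" is the sed delimiter so "/" in paths needs no escaping;
    # -i.bak edits in place and keeps the original as FILE.bak.
    sed -i.bak "s|^${key}=[^#]*|${key}=${value} |" "$file"
)

# Example, after "module load augustus" (module name assumed):
# set_ctl_value maker_exe.ctl augustus "$(command -v augustus)"
```

The value is pasted into a sed expression, so this sketch assumes the path contains no "|" or "&" characters.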

Teton Example:
 Teton: maker_exe.ctl
#-----Location of Executables Used by MAKER/EVALUATOR
makeblastdb=/pfs/tc1/apps/el7-x86_64/u/opt/maker/bin/../exe/blast/bin/makeblastdb #location of NCBI+ makeblastdb executable
blastn=/pfs/tc1/apps/el7-x86_64/u/opt/maker/bin/../exe/blast/bin/blastn #location of NCBI+ blastn executable
blastx=/pfs/tc1/apps/el7-x86_64/u/opt/maker/bin/../exe/blast/bin/blastx #location of NCBI+ blastx executable
tblastx=/pfs/tc1/apps/el7-x86_64/u/opt/maker/bin/../exe/blast/bin/tblastx #location of NCBI+ tblastx executable
formatdb= #location of NCBI formatdb executable
blastall= #location of NCBI blastall executable
xdformat= #location of WUBLAST xdformat executable
blasta= #location of WUBLAST blasta executable
RepeatMasker=/pfs/tc1/apps/el7-x86_64/u/opt/maker/bin/../exe/RepeatMasker/RepeatMasker #location of RepeatMasker executable
exonerate=/apps/u/gcc/7.3.0/exonerate/2.4.0-3bglywa/bin/exonerate #location of exonerate executable

#-----Ab-initio Gene Prediction Algorithms
snap= #location of snap executable
gmhmme3= #location of eukaryotic genemark executable
gmhmmp= #location of prokaryotic genemark executable
augustus=/apps/u/gcc/7.3.0/augustus/3.3.2-etubcuo/bin/augustus #location of augustus executable
fgenesh= #location of fgenesh executable
tRNAscan-SE= #location of trnascan executable
snoscan= #location of snoscan executable

#-----Other Algorithms
probuild= #location of probuild executable (required for genemark)
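
Because these paths are entered by hand, it can be worth confirming that each one resolves before submitting a job. The following is a sketch only: check_ctl_paths is a hypothetical helper (not part of maker) that assumes the "key=path #comment" layout shown above and that paths contain no spaces.

```shell
# Report configured executables in a maker_exe.ctl that are missing or not
# executable. Entries with an empty value are skipped, since only the tools
# you actually use need a path. Usage: check_ctl_paths FILE
check_ctl_paths() (
    status=0
    while IFS= read -r line; do
        case $line in
            '#'*) continue ;;      # section headers
            *=*)  ;;               # key=value entries are checked below
            *)    continue ;;      # blank or unrecognised lines
        esac
        key=${line%%=*}            # text before the first "="
        value=${line#*=}           # text after the first "="
        value=${value%%#*}         # drop the trailing comment
        value=$(printf '%s' "$value" | tr -d '[:space:]')
        [ -n "$value" ] || continue
        if [ ! -x "$value" ]; then
            echo "MISSING: $key=$value"
            status=1
        fi
    done < "$1"
    exit "$status"                 # becomes the function's return status
)
```

Running check_ctl_paths maker_exe.ctl on a login node with the relevant modules loaded prints one MISSING line per unresolved path and returns non-zero if any were found.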

maker_opts.ctl

On Beartooth

We have observed a problem with the model_org=all option, related to RepeatMasker, of the form:

running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
...
Species "all" is not known to RepeatMasker.  There may
not be any TE families defined in the libraries for this
species/clade or there may be an error in the spelling.
Please check your entry against the NCBI Taxonomy database
and/or try using a broader clade or related species instead.
The full list of species/clades defined in the library may be
obtained using the famdb.py script.

We have worked around this issue by setting model_org to simple or by leaving it blank.
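
One way to apply this workaround from the command line is sketched below. The function name fix_model_org is hypothetical, and the sketch assumes the entry appears as model_org=all at the start of a line in maker_opts.ctl; any other value is left unchanged.

```shell
# Change model_org=all to model_org=simple, keeping the trailing comment.
# -i.bak edits in place and leaves the original as <file>.bak.
# Usage: fix_model_org [FILE]   (defaults to maker_opts.ctl)
fix_model_org() (
    sed -i.bak 's/^model_org=all[[:space:]]*/model_org=simple /' "${1:-maker_opts.ctl}"
)
```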

Under External Application Behavior Options in maker_opts.ctl, the following option controls multicore use:

cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

Multicore

Maker can run across multiple nodes. For example, request the nodes and tasks in your batch script:

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8

On Teton run the following:

mpiexec -n 32 maker

The value after the -n option equals the number of nodes multiplied by the number of tasks per node (4 x 8 = 32).
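
To avoid hard-coding that product, the rank count can be derived inside the job from the variables Slurm exports. This is a sketch under assumptions: mpi_rank_count is a hypothetical helper, and it relies on SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE being set, which Slurm does when --nodes and --ntasks-per-node are given.

```shell
# Print nodes x tasks-per-node. Arguments override the Slurm environment;
# if neither is available each factor falls back to 1.
# Usage: mpi_rank_count [NODES] [TASKS_PER_NODE]
mpi_rank_count() (
    nodes=${1:-${SLURM_JOB_NUM_NODES:-1}}
    tasks_per_node=${2:-${SLURM_NTASKS_PER_NODE:-1}}
    echo $(( nodes * tasks_per_node ))
)

# Inside the batch script on Teton:
# mpiexec -n "$(mpi_rank_count)" maker
```

With the request shown above, mpi_rank_count 4 8 prints 32, which matches the mpiexec example.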

On Beartooth, simply run:

srun maker
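
Putting the pieces together, a complete Beartooth submission might look like the sketch below, written as a here-document so it can be pasted in one go. The account name, time limit and job name are placeholders to replace with your own, and the module name maker/3.01.04 is assumed from the version mentioned earlier on this page.

```shell
# Write an example batch script for a multi-node maker run.
cat > maker_job.sh <<'EOF'
#!/bin/bash
# Placeholder values: replace the job name, account and time limit.
#SBATCH --job-name=maker
#SBATCH --account=myproject
#SBATCH --time=24:00:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8

# Module name and version are assumed; check what is installed first.
module load maker/3.01.04

# Run from the directory that holds the three control files.
# Leave cpus=1 in maker_opts.ctl when running under MPI.
srun maker
EOF

# Submit with:
# sbatch maker_job.sh
```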

Memory Issues

Memory is a common constraint in bioinformatics analyses, and there are no straightforward recommendations we can make. As a researcher you will need to track the size of your data sets, the type of analysis, the resources you have requested, and how efficiently they have been used.
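
As one hedged way to compare requested and used memory after a job has finished, Slurm's sacct command can report both. The wrapper name job_mem_report is hypothetical, and the sketch assumes job accounting is enabled on the cluster.

```shell
# Show requested memory (ReqMem) against peak memory per step (MaxRSS).
# Usage: job_mem_report JOBID
job_mem_report() {
    sacct -j "$1" --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State
}
```

MaxRSS is reported for the job's steps (for example the batch and srun steps) rather than for the allocation line itself.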

One indicator that you have not allocated enough memory is an error of the following form:

ERROR: Chunk failed at level:0, tier_type:1
FAILED CONTIG:tig00000510
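
To see which contigs were affected, the log can be scanned for this pattern. The sketch below assumes the FAILED CONTIG line directly follows the ERROR line, as in the example above, and that maker's messages were captured in a file such as the Slurm output file; failed_contigs is a hypothetical helper name.

```shell
# Print the unique contig names reported after "ERROR: Chunk failed".
# Usage: failed_contigs LOGFILE
failed_contigs() (
    awk '
        /^ERROR: Chunk failed/ { chunk = 1; next }
        chunk && /^FAILED CONTIG:/ {
            sub(/^FAILED CONTIG:[ \t]*/, "")   # keep only the contig name
            print
        }
        { chunk = 0 }                          # reset after any other line
    ' "$1" | sort -u
)
```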

Please refer to our Slurm page on requesting and using memory: Introduction to Job Submission: 02: Memory and GPUs
