Canu

Overview

  • Canu: Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION).

Using

Use the module name canu to discover versions available and to load the application.

Multicore

The canu application can use multiple cores, and it can also submit its own sbatch jobs as part of its pipeline.

Depending on your pipeline and how you want your jobs to run, there are two main use cases:

  1. Running locally: If you create an allocation, the entire canu pipeline runs only within that allocation, and pipeline tasks essentially run sequentially.

  2. Using the Grid: In this mode, canu can submit its own sbatch jobs. It enables independent tasks of the pipeline to run concurrently.

Example Options:

# Running "locally"
[]$ canu useGrid=false

# Running using the "grid"
[]$ canu useGrid=true gridOptions=" -A <your-project> -t <time-required> "

# To get an email for each Slurm job use:
[]$ canu useGrid=true gridOptions=" -A <your-project> -t <time-required> --mail-type=ALL --mail-user=<email address> "

The default (i.e. when you do not set the option yourself) is useGrid=true. If you do not also set the required gridOptions that define your account and the time required, you will see the following error:

CRASH:
CRASH: canu 2.2
CRASH: Please panic, this is abnormal.
CRASH:
CRASH: Failed to submit compute jobs.
CRASH:
CRASH: Failed at /apps/u/opt/gcc/12.2.0/canu/2.2/build/bin/../lib/site_perl/canu/Execution.pm line 1259.
CRASH: canu::Execution::submitOrRunParallelJob("ecoli", "meryl", "correction/0-mercounts", "meryl-count", 1) called at /apps/u/opt/gcc/12.2.0/canu/2.2/build/bin/../lib/site_perl/canu/Meryl.pm line 847
CRASH: canu::Meryl::merylCountCheck("ecoli", "cor") called at /apps/u/opt/gcc/12.2.0/canu/2.2/build/bin/canu line 1076
CRASH:
CRASH: Last 50 lines of the relevant log file (correction/0-mercounts/meryl-count.jobSubmit-01.out):
CRASH:
CRASH: sbatch: error: You didn't specify a project account (-A,--account). Please open a ticket at arcc-help@uwyo.edu for help.
CRASH: sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified
CRASH:
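One way to avoid this error is to build the gridOptions value with a small shell function that refuses to continue when the account or time limit is missing. The sketch below is only an illustration: build_grid_options is a hypothetical helper name, not part of canu or Slurm, and it only assembles the same -A and -t options used in the examples above.

```shell
# Hypothetical helper (not part of canu): build a gridOptions value.
# $1 = project account, $2 = time limit, any remaining arguments are
# passed through unchanged as extra sbatch options.
build_grid_options() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "build_grid_options: project account and time limit are both required" >&2
        return 1
    fi
    account="$1"
    walltime="$2"
    shift 2
    # The leading and trailing spaces mirror the gridOptions examples above.
    if [ "$#" -gt 0 ]; then
        printf ' -A %s -t %s %s ' "$account" "$walltime" "$*"
    else
        printf ' -A %s -t %s ' "$account" "$walltime"
    fi
}

# Possible usage, with the same placeholders as above:
#   canu useGrid=true gridOptions="$(build_grid_options "<your-project>" "<time-required>")"
```

Because the function returns a non-zero status when either value is empty, a wrapper script can check that status and stop before canu tries to submit any jobs.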

Using the grid will spawn a series of sbatch jobs. You can monitor what is currently running with squeue, and sacct will show you the jobs that have run.
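As a minimal sketch of that monitoring, the two functions below wrap squeue and sacct for the current user. The function names are hypothetical, and the chosen sacct columns are just one reasonable selection rather than anything canu requires.

```shell
# Hypothetical helpers (not part of canu or Slurm) wrapping standard
# Slurm commands for the current user.

# Jobs that are currently pending or running.
canu_jobs_running() {
    squeue -u "$USER"
}

# Jobs recorded by Slurm accounting; with no start time given,
# sacct reports jobs since midnight of the current day.
canu_jobs_finished() {
    sacct -u "$USER" --format=JobID,JobName%30,State,Elapsed
}
```

Both commands list all of your jobs, not only those submitted by canu, so on a busy account you may want to filter the output further.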

Within the generated output folder, there is the canu-scripts child folder that contains the list of jobs submitted and a log for each job.
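To inspect that folder quickly, something like the following could be used. This is a sketch only: list_canu_scripts is a hypothetical helper, and it makes no assumption about the file names canu writes, it simply lists the folder with the newest entries first.

```shell
# Hypothetical helper (not part of canu): list the contents of the
# canu-scripts folder, newest first. $1 = the canu output folder.
list_canu_scripts() {
    scripts_dir="$1/canu-scripts"
    if [ ! -d "$scripts_dir" ]; then
        echo "list_canu_scripts: $scripts_dir not found" >&2
        return 1
    fi
    ls -t "$scripts_dir"
}
```

The newest entries are usually the most useful place to start when a submitted job has failed.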