Cutadapt

Overview

Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.

Using

Use the module name cutadapt to discover versions available and to load the application.
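As a minimal sketch, assuming the cluster provides an Environment Modules or Lmod style module command, discovering and loading cutadapt could look like the following. Loading without a version number assumes the site defines a default version, and the MODULE_STATUS variable is purely illustrative.

```shell
# Assumes an Environment Modules or Lmod style "module" command.
# The guard keeps the snippet harmless on machines without a module system.
if command -v module >/dev/null 2>&1; then
    module avail cutadapt   # list the cutadapt versions installed on the system
    module load cutadapt    # load the default version (assumes the site defines one)
    module list             # show the modules that are now loaded
    MODULE_STATUS="module command found"
else
    MODULE_STATUS="module command not found"
fi
echo "$MODULE_STATUS"
```

To pin a specific release, append the version to the module name, in the form shown by the module avail listing.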

Multicore

Cutadapt can run on multiple cores through the -j CORES option (long form: --cores CORES). Run cutadapt --help for more information.
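The sketch below shows one way the core count might be passed on the command line, assuming the job runs under Slurm. The adapter sequence and file names are placeholders, not values from this page, and the fallback to a single core is an illustrative choice.

```shell
# Use the number of CPUs Slurm allocated to the task; fall back to 1 if unset.
# SLURM_CPUS_PER_TASK is only set when --cpus-per-task was requested.
CORES="${SLURM_CPUS_PER_TASK:-1}"

# Placeholder adapter sequence and file names; replace them with your own.
ADAPTER="AGATCGGAAGAGC"
INPUT="reads.fastq.gz"
OUTPUT="trimmed.fastq.gz"

# Run only when cutadapt is available and the input file exists;
# otherwise just print the command that would have been run.
if command -v cutadapt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    cutadapt -j "$CORES" -a "$ADAPTER" -o "$OUTPUT" "$INPUT"
else
    echo "Would run: cutadapt -j $CORES -a $ADAPTER -o $OUTPUT $INPUT"
fi
```

Deriving the value from the allocation, rather than hard-coding it, keeps the -j setting from exceeding the CPUs the job was actually given.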

Using cutadapt with dada2

A number of researchers use cutadapt within dada2-related R scripts, as described in the DADA2 ITS Pipeline Workflow (1.8). Following that page, the example below shows one way to call cutadapt from within your dada2-related script.

Step 1 Load Module

Within your bash script you will need to first load the cutadapt module:

    ...
    module load cutadapt/2.10
    ...
    srun Rscript dada2_code.R
    ...

Behind the scenes, this adds the location of the cutadapt binary to your environment (typically the PATH variable), so the command can be found by name.
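One way to confirm that the module load had this effect is to look the command up by name. This is a small sketch, and the CUTADAPT_PATH variable name is only illustrative.

```shell
# Look up cutadapt by name; when the module is loaded this prints its full path.
if CUTADAPT_PATH="$(command -v cutadapt)"; then
    echo "cutadapt found at: $CUTADAPT_PATH"
else
    echo "cutadapt was not found; load the module first" >&2
fi
```

If the lookup fails inside a job, the module load line is probably missing from the job script or placed after the Rscript call.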

Step 2 Make System Call in R Code

The next step is to make a system call from within your R script that calls cutadapt.

Add the following to your dada2_code.R file (you can rename this file to whatever you want):

    ...
    # Since you have loaded the cutadapt module and its path is in your
    # environment variables, you do not have to state the full path.
    cutadapt_cmd <- "cutadapt"

    # Run a shell command from R to print the version of cutadapt being called.
    system2(cutadapt_cmd, args = "--version")
    ...
    # Run Cutadapt
    # Change options and variable names to match your code and call requirements.
    # Notice this example also uses the -j option.
    # The number of cores defined MUST match the number of cpus-per-task requested.
    for (i in seq_along(fnFs)) {
      system2(cutadapt_cmd, args = c(R1.flags, R2.flags,
                                     "-n", 2,  # -n 2 required to remove FWD and REV from reads
                                     "-o", fnFs.cut[i], "-p", fnRs.cut[i],  # Output files.
                                     "-j", 16,  # Number of CPU cores to use.
                                     fnFs.filtN[i], fnRs.filtN[i]))  # Input files.
    }
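To illustrate the matching requirement, the sketch below writes a hypothetical Slurm job script that requests the same 16 CPUs passed to -j in the R code. The file name run_dada2.sh is an assumption, and a real job will likely need further settings (time, memory, an R module) that depend on your site.

```shell
# Write a hypothetical job script; the file name and any settings beyond
# the CPU count are assumptions to adapt to your site.
cat > run_dada2.sh <<'EOF'
#!/bin/bash
# Request the same number of CPUs as the value passed to cutadapt -j.
#SBATCH --cpus-per-task=16

module load cutadapt/2.10
# Load R as well if your site provides it through a module.
srun Rscript dada2_code.R
EOF

# Submit it on the cluster with: sbatch run_dada2.sh
```

If you change the -j value in the R script, change the --cpus-per-task value in the job script to the same number.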