Cutadapt
Overview
Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
Using
Use the module name cutadapt to discover the available versions and to load the application.
Multicore
The cutadapt command provides the -j CORES, --cores CORES option, which lets you run over multiple cores. Call cutadapt --help for more information.
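As a rough illustration, a single-end trimming run on four cores might look like the sketch below. The adapter sequence, core count, and file names are placeholders chosen for this example, not values from this page. The command only runs if cutadapt is on your PATH and the input file exists; otherwise the sketch just prints the command.

```shell
# Placeholder values (assumptions for illustration only).
CORES=4
ADAPTER="AGATCGGAAGAGC"
INPUT="reads.fastq.gz"
OUTPUT="trimmed.fastq.gz"

# Print the command that would be run; -j and --cores are the same option.
cutadapt_command() {
    echo "cutadapt -j $CORES -a $ADAPTER -o $OUTPUT $INPUT"
}
cutadapt_command

# Run it only when cutadapt and the input file are actually available.
if command -v cutadapt >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    cutadapt -j "$CORES" -a "$ADAPTER" -o "$OUTPUT" "$INPUT"
fi
```

Printing the command first makes it easy to confirm the core count before submitting a long job.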
Using cutadapt with dada2
A number of researchers use cutadapt within dada2-related R scripts, as detailed under DADA2 ITS Pipeline Workflow (1.8). Following that workflow, below is one example of how to call cutadapt from within your dada2-related script.
Step 1 Load Module
Within your bash script, you first need to load the cutadapt module:
...
module load cutadapt/2.10
...
srun Rscript dada2_code.R
...
Behind the scenes, this adds the location of the cutadapt binary to your environment variables (typically PATH), so the command can be called by name.
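If you want the batch script to confirm that the binary is visible before R starts, a small check along these lines may help. This is a minimal sketch: check_tool is a hypothetical helper name introduced here, not part of cutadapt or the module system.

```shell
# check_tool is a hypothetical helper: report where a command was found,
# or return non-zero if it is not on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH; was the module loaded?" >&2
        return 1
    fi
}

# Print the cutadapt version only if the binary is available.
if check_tool cutadapt; then
    cutadapt --version
fi
```

Placing the check after the module load line and before the srun line means a missing module shows up early in the job log.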
Step 2 Make System Call in R Code
The next step is to make a system call to cutadapt from within your R script.
So, within your dada2_code.R file (you can name this file whatever you want):
...
# Since you have loaded the cutadapt module and its path is in your environment variables,
# you do not have to state the full path.
cutadapt_cmd <- "cutadapt"
# Run shell commands from R to print the version of cutadapt being called.
system2(cutadapt_cmd, args = "--version")
...
# Run Cutadapt
# Change the options and variable names to match your code and call requirements.
# Notice this example also uses the -j option.
# The number of cores given to -j MUST match the number of cpus-per-task requested.
for (i in seq_along(fnFs)) {
  system2(cutadapt_cmd, args = c(R1.flags, R2.flags, "-n", 2, # -n 2 required to remove FWD and REV from reads
                                 "-o", fnFs.cut[i], "-p", fnRs.cut[i], # Output files.
                                 "-j", 16, # Number of CPU cores to use.
                                 fnFs.filtN[i], fnRs.filtN[i])) # Input files.
}
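Because the -j value has to match cpus-per-task, one way to keep the two from drifting apart is to derive the core count from Slurm in the batch script and hand it to R. The sketch below assumes the job requested --cpus-per-task, in which case Slurm sets SLURM_CPUS_PER_TASK inside the job. CUTADAPT_CORES and resolve_cores are hypothetical names chosen for this example.

```shell
# Resolve the number of cores: Slurm's value if set, otherwise 1.
resolve_cores() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# CUTADAPT_CORES is a hypothetical variable name chosen for this example.
CUTADAPT_CORES="$(resolve_cores)"
export CUTADAPT_CORES
echo "cutadapt will be given -j $CUTADAPT_CORES"

# Then launch the R script as before, for example:
#   srun Rscript dada2_code.R
```

Inside the R script, the value could then be read with Sys.getenv("CUTADAPT_CORES") and passed to "-j" in place of the hard-coded 16.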