Guppy
Overview
Guppy provides local, accelerated basecalling for Oxford Nanopore sequencing data.
To use it outside of the cluster you will need to register with Oxford Nanopore to directly access the software and documentation.
Documentation: Here are some generally accessible links for using Guppy:
- Guppy: GPU Acceleration: specify which GPU to use with `cuda:<device_id>`.
Trainings:
- Welcome to the de.NBI Nanopore Training Course! “Welcome to the two-day nanopore training course. This tutorial will guide you through the typical steps of a nanopore assembly of a microbial genome.”
Using
Guppy is available in two flavors:
- A CPU-only version: use the module `guppy-cpu`.
- A GPU version: use the module `guppy-gpu`.
The GPU version only supports GPUs with an NVIDIA compute capability of 6.1 or higher.
You will need to request a GPU: see Introduction to Job Submission: 02: Memory and GPUs.
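As a rough illustration (assuming the cluster uses Slurm, per the job-submission pages above; the account and partition names are placeholders), a minimal batch script might look like:

```bash
#!/bin/bash
#SBATCH --account=<your-project>     # placeholder: your project/account
#SBATCH --partition=<gpu-partition>  # placeholder: a partition with GPUs
#SBATCH --gres=gpu:1                 # request one GPU
#SBATCH --time=01:00:00

module load guppy-gpu

# Confirm a GPU is visible before launching a long basecalling run.
nvidia-smi
```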
Multicore
Both versions provide multi-core functionality across their commands. Inspect the particular command to see what it offers: for example, run `guppy_aligner -h` to understand the `--worker_threads` option.
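As a sketch (the input, output, and reference paths are hypothetical), an aligner run using several CPU worker threads might look like:

```bash
module load guppy-cpu

# Hypothetical paths; substitute your own FASTQ folder and reference.
guppy_aligner \
  --input_path fastq/ \
  --save_path aligned/ \
  --align_ref reference.fasta \
  --worker_threads 8
```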
GPU Example
Guppy has a lot of command-line arguments; knowing which to use is domain-specific knowledge that you’ll gain from experience with the software.
To request GPUs, please read Introduction to Job Submission: 02: Memory and GPUs, and look at the Guppy-specific `cuda` command-line option, sketched below.
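As a sketch, a GPU basecalling run might look like the following; the paths are hypothetical, and you would pick the config file matching your flow cell and kit:

```bash
module load guppy-gpu

# Hypothetical paths; the config shown is one of Guppy's standard HAC configs.
guppy_basecaller \
  --input_path fast5/ \
  --save_path basecalled/ \
  --config dna_r9.4.1_450bps_hac.cfg \
  --device cuda:0
```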
Setting custom GPU parameters in Guppy
The Guppy documentation also includes the following recommendation (quoted directly) for calculating a rough ceiling on the amount of GPU memory that Guppy will use. Since ARCC doesn’t directly use Guppy, this is provided for users who will understand the various domain-specific parameters:
`memory used [in bytes] = gpu_runners_per_device * chunks_per_runner * chunk_size * model_factor`

where `model_factor` depends on the basecall model used:
| Basecall model | model_factor |
|---|---|
| Fast | 1200 |
| HAC | 3300 |
| SUP | 12600 |
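As a quick sanity check of this formula, the ceiling on `chunks_per_runner` for a given memory budget can be computed directly; here is a sketch using the SUP figures from the worked example further below:

```bash
# SUP model (model_factor = 12600), chunk_size = 2000, 8 GiB of GPU memory.
gpu_mem_bytes=$((8 * 1024 * 1024 * 1024))
chunk_size=2000
model_factor=12600
echo $((gpu_mem_bytes / (chunk_size * model_factor)))   # prints 340
```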
Note that `gpu_runners_per_device` is a limit setting. Guppy will create at least one runner per device and will dynamically increase this number as needed, up to `gpu_runners_per_device`. Performance is usually much better with multiple runners, so it is important to choose parameters such that `chunks_per_runner * chunk_size * model_factor` is less than half the total GPU memory available. If this value is more than the available GPU memory, Guppy will exit with an error.
For example, when basecalling using a SUP model and a chunk size of 2000 on a GPU with 8 GB of memory, we have to make sure that the GPU can fit at least one runner:
- `chunks_per_runner * chunk_size * model_factor` should be lower than the GPU memory
- `chunks_per_runner * 2000 * 12600` should be lower than 8 GB
- `chunks_per_runner` should be lower than ~340
This represents the limit beyond which Guppy will not run at all. For best performance we recommend using an integer fraction of this number, rounded down to an even number, e.g. a third (~112) or a quarter (~84). Especially for fast models, it can be best to have a dozen runners or more. The ideal value varies depending on GPU architecture and available memory.
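Putting this together, here is a sketch of a SUP run tuned along these lines for an 8 GB GPU; the paths and config are hypothetical, and the parameter values follow the calculation above:

```bash
# chunks_per_runner 84 is roughly a quarter of the ~340 ceiling, rounded down
# to an even number; gpu_runners_per_device is only an upper limit that Guppy
# scales up to dynamically.
guppy_basecaller \
  --input_path fast5/ \
  --save_path basecalled/ \
  --config dna_r9.4.1_450bps_sup.cfg \
  --device cuda:0 \
  --chunk_size 2000 \
  --chunks_per_runner 84 \
  --gpu_runners_per_device 8
```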