Guppy

Overview

Using

Guppy is available in two flavors:

  1. A CPU-only version: use the module guppy-cpu

  2. A GPU version: use the module guppy-gpu

    1. The GPU version only supports GPUs with an NVIDIA compute capability of 6.1 or higher.

    2. You will need to request a GPU; see Introduction to Job Submission: 02: Memory and GPUs.

Multicore

Both versions provide multicore functionality across their commands.

Inspect a particular command's help text to see what it offers. For example, run guppy_aligner -h to read about the --worker_threads option.

GPU Example

Guppy has many command line arguments, and choosing them well requires domain-specific knowledge that you will gain from experience with the software.

To request GPUs, please read Introduction to Job Submission: 02: Memory and GPUs, and look at the Guppy-specific cuda command line option.

Setting custom GPU parameters in Guppy

The Guppy documentation also includes the following calculation (taken directly from that documentation), which gives a rough ceiling on the amount of GPU memory that Guppy will use. Since ARCC doesn't directly use Guppy, this is provided for users who understand the various domain-specific parameters:

memory used [in bytes] = gpu_runners_per_device * chunks_per_runner * chunk_size * model_factor

Where model_factor depends on the basecall model used:

Basecall model    model_factor
Fast              1200
HAC               3300
SUP               12600

Note that gpu_runners_per_device is a limit setting. Guppy will create at least one runner per device and will dynamically increase this number as needed up to gpu_runners_per_device. Performance is usually much better with multiple runners, so it is important to choose parameters such that chunks_per_runner * chunk_size * model_factor is less than half the total GPU memory available. If this value is more than the available GPU memory, Guppy will exit with an error.
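The formula and the half-memory guideline above can be sketched in a few lines of Python. This is only an illustration, not part of Guppy: the function names, the returned labels, and the example settings (4 runners, 100 chunks per runner, chunk size 2000, an 8 GiB card) are assumptions chosen for the example.

```python
# model_factor values copied from the table above.
MODEL_FACTOR = {"fast": 1200, "hac": 3300, "sup": 12600}


def per_runner_bytes(chunks_per_runner, chunk_size, model):
    """Memory estimate for one runner: chunks_per_runner * chunk_size * model_factor."""
    return chunks_per_runner * chunk_size * MODEL_FACTOR[model.lower()]


def memory_ceiling_bytes(gpu_runners_per_device, chunks_per_runner, chunk_size, model):
    """Rough ceiling on GPU memory use in bytes, following the documented formula."""
    return gpu_runners_per_device * per_runner_bytes(chunks_per_runner, chunk_size, model)


def check_settings(chunks_per_runner, chunk_size, model, gpu_memory_bytes):
    """Classify settings against the guidance above (labels are hypothetical)."""
    per_runner = per_runner_bytes(chunks_per_runner, chunk_size, model)
    if per_runner > gpu_memory_bytes:
        return "too_large"        # Guppy is expected to exit with an error
    if per_runner < gpu_memory_bytes / 2:
        return "ok"               # room for at least two runners
    return "fits_one_runner"      # runs, but leaves no room for a second runner


if __name__ == "__main__":
    gpu_memory = 8 * 1024 ** 3    # assumed 8 GiB card
    print(memory_ceiling_bytes(4, 100, 2000, "hac"))
    print(check_settings(100, 2000, "hac", gpu_memory))
```

Treat the result as a planning estimate only; the formula is described as a rough ceiling, and actual usage depends on the GPU and the Guppy version.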

For example, when basecalling using a SUP model and a chunk size of 2000 on a GPU with 8 GB of memory, we have to make sure that the GPU can fit at least one runner:

chunks_per_runner * chunk_size * model_factor should be lower than GPU memory

chunks_per_runner * 2000 * 12600 should be lower than 8 GB

chunks_per_runner should be lower than ~340

This represents the limit beyond which Guppy will not run at all. For best performance we recommend using an integer fraction of this number, rounded down to an even number, e.g. a third (~112) or a quarter (~84). Especially for fast models, it can be best to have a dozen runners or more. The ideal value varies depending on GPU architecture and available memory.
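The worked example can be reproduced with a short Python sketch. It assumes that "8 GB" means 8 GiB (8 * 1024^3 bytes), which is the interpretation that yields the ~340 limit quoted above; the helper names are hypothetical and not part of Guppy.

```python
GIB = 1024 ** 3  # assumption: "8 GB" in the example is read as 8 GiB


def max_chunks_per_runner(gpu_memory_bytes, chunk_size, model_factor):
    """Largest chunks_per_runner for which a single runner still fits in GPU memory."""
    return gpu_memory_bytes // (chunk_size * model_factor)


def even_fraction(limit, divisor):
    """Integer fraction of the limit, rounded down to an even number."""
    value = limit // divisor
    return value - (value % 2)


if __name__ == "__main__":
    # SUP model (model_factor 12600), chunk size 2000, 8 GiB of GPU memory.
    limit = max_chunks_per_runner(8 * GIB, 2000, 12600)
    print(limit)                    # 340
    print(even_fraction(limit, 3))  # 112
    print(even_fraction(limit, 4))  # 84
```

Swapping in the model_factor for Fast or HAC and the memory size of your own GPU gives starting points to try; the best value still has to be found by experiment.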