Using Dadi

ARCC is aware that the exact details and versions presented here are out-of-date, but the general process is still valid.

We will endeavor to update this page as soon as we can.

Overview

Dadi is a powerful software tool for simulating the joint frequency spectrum (FS) of genetic variation among multiple populations and employing the FS for population-genetic inference.

An important aspect of dadi is its flexibility, particularly in model specification, but with that flexibility comes some complexity. dadi is not a GUI program, nor can dadi be run usefully with a single command at the command-line; using dadi requires at least rudimentary Python scripting. Luckily for us, Python is a beautiful and simple language. Together with a few examples, this manual will quickly get you productive with dadi even if you have no prior Python experience.

Using Dadi with GPU on Teton

Here we will describe setting up a Conda environment, with dadi installed, that allows you to run related source code and utilize GPUs.

This page will:

  1. Step through creating a basic Conda environment. Dadi uses the PyCUDA and scikit-cuda packages to assist with interfacing with GPUs.

  2. Provide a template for a bash script to submit jobs using sbatch.

  3. Provide a very simple script that tests that dadi can be imported and can identify the allocated GPU.

Note:

  • This is a short page and assumes some familiarity with using Conda. The “Package and Dependency Management with Conda” training can be found on ARCC’s Training/Consultation page.

  • The installation of dadi within the conda environment will also install related dependencies, but nothing else. Since you’re creating the conda environment, you can extend it and install other packages. You can view the installed conda packages by running conda list while the environment is active.

  • The bash script only uses a single node and a single core. It is up to the user to explore other configurations.

  • In the scripts and examples below, please remember to edit the account, email address, folder locations, etc. to match your own setup.

Creating the Conda Environment

Set up the basic Conda environment to run with Python version 3.8:

cd /project/arcc/salexan5/conda/gpu/dadi
module load miniconda3/4.3.30
conda create --prefix=dadi_env python=3.8

There are a number of conda options for how/where to install an environment. In this case --prefix (-p) will create an environment called dadi_env in the folder you’re running the command from. Once set up, make a note of the installation message that indicates how to activate your environment when you want to use it.

# To activate this environment, use:
# > source activate /pfs/tsfs1/project/arcc/salexan5/conda/gpu/dadi/dadi_env
#
# To deactivate an active environment, use:
# > source deactivate

Activate your environment, and install the dadi-related packages. Once installation has finished, deactivate your environment.

source activate /pfs/tsfs1/project/arcc/salexan5/conda/gpu/dadi/dadi_env
conda install -c conda-forge dadi
conda install numpy scipy matplotlib ipython
python3 -m pip install pycuda
python3 -m pip install scikit-cuda
source deactivate

Bash Script to use with sbatch

Below is a basic template to use; you’ll need to insert your own account and email details.
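The following is a minimal sketch of such a template. The account, email, wall time and CUDA module name are placeholders (the exact NVIDIA/CUDA module available on your system may differ), and the partition/GPU options are discussed in the “Requesting GPUs and Testing” section below.

#!/bin/bash
#SBATCH --job-name=dadi_gpu
#SBATCH --account=<your-project-account>    # edit: your project account
#SBATCH --mail-user=<your-email>            # edit: your email address
#SBATCH --mail-type=END
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --partition=moran                   # see "Requesting GPUs and Testing" below
#SBATCH --gres=gpu:1                        # request a single GPU

# Write out confirmation of the GPU configuration that was allocated.
srun nvidia-smi -L

# Load the required modules and activate the Conda environment created above.
module load miniconda3/4.3.30
module load cuda                            # placeholder: use your system's NVIDIA/CUDA module
source activate /pfs/tsfs1/project/arcc/salexan5/conda/gpu/dadi/dadi_env

# Prefix the program call with srun (see "GPU Not Found/Detected" below).
srun python dadi_test.py

source deactivate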

Simple Source Code Example

Below is some very simple source code that will test that your environment and GPU request are functioning properly.

It simply imports the dadi package and then, via PyCUDA, checks that it can identify the allocated GPU(s). To work with the bash script above, save this file as dadi_test.py.
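A minimal sketch of such a test follows, using the dadi and PyCUDA packages installed above; the printed messages are illustrative.

# dadi_test.py
# Minimal check that dadi imports and that PyCUDA can see the allocated GPU(s).
import dadi
import pycuda.autoinit        # initializes the CUDA driver and creates a context
import pycuda.driver as drv

print("dadi imported from:", dadi.__file__)

num_gpus = drv.Device.count()
print("Number of GPUs detected:", num_gpus)
for i in range(num_gpus):
    print("GPU {}: {}".format(i, drv.Device(i).name()))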

Requesting GPUs and Testing

We have a variety of GPUs on Teton, and depending on which you require, you'll need to adjust your bash script. The reason for the nvidia-smi -L call within the bash script is that it will write out confirmation of the GPU configuration you’ve requested.

Below demonstrates the bash options for each GPU, as well as what you’d see from running the nvidia-smi -L command and the source code from the bash script:

To request a k20 GPU on one of the moran nodes, use:
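The sketch below uses standard Slurm options; check ARCC’s documentation for the exact GPU types available.

#SBATCH --partition=moran
#SBATCH --gres=gpu:1

The nvidia-smi -L line in the bash script should then report something like the following (UUID truncated):

GPU 0: Tesla K20m (UUID: GPU-...)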

For other GPUs you’ll have to also specify the partition.
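For example, something along these lines (the teton-gpu partition name is an assumption; run sinfo to list the partitions actually available to you):

#SBATCH --partition=teton-gpu
#SBATCH --gres=gpu:1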


Running/Testing in an Interactive Session

If you’re just exploring, trying things out, and/or performing tests, then you can just as easily use an interactive session. Below is an example of using salloc. Notice the steps are the same as if running a bash script via sbatch.

Once logged onto one of the login nodes, request an interactive session:
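For example (the account, wall time and partition are placeholders to edit):

salloc --account=<your-project-account> --time=01:00:00 --nodes=1 --ntasks=1 --partition=moran --gres=gpu:1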

Load the modules you require. Since we’re using GPUs, we need to load the appropriate NVIDIA drivers.
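For example (the CUDA module name is a placeholder; load whichever NVIDIA/CUDA module your system provides):

module load miniconda3/4.3.30
module load cuda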

If you want, you can check that the requested GPU has been allocated.
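srun nvidia-smi -L

This should list something like (UUID truncated):

GPU 0: Tesla K20m (UUID: GPU-...)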

Because we are using a Conda environment, we need to activate it.
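Use the path from the installation message noted earlier:

source activate /pfs/tsfs1/project/arcc/salexan5/conda/gpu/dadi/dadi_env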

Navigate to the folder containing the source code and then run it:
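For example (the folder below is a placeholder for wherever you saved dadi_test.py):

cd /project/<your-project>/<your-folder>
srun python dadi_test.py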

Once finished, deactivate the Conda environment and cancel your interactive session.
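For example (exit ends the salloc session and releases the allocation):

source deactivate
exit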


GPU Not Found/Detected

Remember to prefix the line where you call your application/program with srun. This is what actually makes the GPU allocation you requested available to your process.

If you forget, then you’ll see a warning like the following:
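For example, PyCUDA will typically fail with an error along these lines (the exact wording depends on your driver and PyCUDA versions):

pycuda._driver.LogicError: cuInit failed: no CUDA-capable device is detected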