GPU-BLAST

Overview

The Basic Local Alignment Search Tool (BLAST) is one of the most widely used bioinformatics tools. GPU-BLAST can align query sequences against those present in a selected target database.

GPU-BLAST is an accelerated version of the popular NCBI-BLAST that runs on a general-purpose graphics processing unit (GPU). Its developers report that, compared with the sequential NCBI-BLAST, GPU-BLAST is nearly four times faster while producing identical results.

Using

Use the module name gpu-blast to discover versions available and to load the application.
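
As a minimal sketch, the module commands might be wrapped as shown here. The version gpu-blast/1.1 is the one used in the batch example on this page and may differ on your system; the function name load_gpu_blast is hypothetical, and the guard exists only so that the snippet fails cleanly on a machine without the module system.

```shell
# Hypothetical helper: list and load gpu-blast if the module system is present.
load_gpu_blast() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; run this on the cluster" >&2
        return 1
    fi
    module avail gpu-blast                  # discover the installed versions
    module load gpu-blast/1.1 || return 1   # load one version (1.1 is assumed here)
    module list                             # confirm what is now loaded
}

load_gpu_blast || echo "gpu-blast was not loaded"
```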

Basic Command Line

blastp -help

Test Example: based on section III, "How to use GPU-BLAST", of the README document.

# Create a working directory and download the env_nr protein database
mkdir test
cd test/
mkdir database
cd database/
wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/env_nr.gz
gunzip env_nr.gz
cd ..

# Download the sample query sequences
mkdir queries
cd queries/
wget http://thales.cheme.cmu.edu/gpublast/queries.tar.gz
tar -xzf queries.tar.gz
rm queries.tar.gz
cd ..

# Build the sorted BLAST database
makeblastdb -in database/env_nr -out database/sorted_env_nr -dbtype prot -sort_volumes -max_file_sz 500MB

# First GPU run, with explicit -method, -gpu_blocks and -gpu_threads settings
blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t -method 2 -gpu_blocks 256 -gpu_threads 32

# GPU run with default settings
blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t

# Time the GPU run and the CPU run, saving each report to a file
time blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t > gpu_output.txt
time blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu f > cpu_output.txt
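
The overview states that GPU-BLAST produces results identical to the CPU run, so the two saved reports can be checked against each other. The sketch below is one hedged way to do that: it assumes the files gpu_output.txt and cpu_output.txt written by the timed runs above, and it assumes the reports contain no run-specific lines. The function name compare_reports is hypothetical.

```shell
# Hypothetical helper: report whether two BLAST output files match byte for byte.
compare_reports() {
    gpu_file=$1
    cpu_file=$2
    if [ ! -f "$gpu_file" ] || [ ! -f "$cpu_file" ]; then
        echo "missing"
        return 2
    fi
    if cmp -s "$gpu_file" "$cpu_file"; then
        echo "identical"
        return 0
    fi
    echo "different"
    # Show the first differing lines for inspection.
    diff "$gpu_file" "$cpu_file" | head -n 20
    return 1
}

# Only compare when both reports from the timed runs exist.
if [ -f gpu_output.txt ] && [ -f cpu_output.txt ]; then
    compare_reports gpu_output.txt cpu_output.txt || true
fi
```

A result of "different" is not necessarily an error; inspect the printed lines before drawing conclusions.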

Note

  • The -gpu option selects whether a GPU is used (-gpu t) or not (-gpu f). To run this software on a GPU, you must request an appropriate GPU node in your job allocation request.

  • If you run with -gpu t on a node that has no GPU, you will see the following output:

WARNING: There is no available device supporting CUDA. Continuing with the CPU only...
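
Because the fallback to the CPU produces only a warning, a job can run without acceleration unnoticed. One hedged way to guard against this is to check for a visible NVIDIA device before choosing the -gpu value. The sketch assumes that the nvidia-smi utility shipped with the NVIDIA driver is installed on GPU nodes; the function name choose_gpu_flag is hypothetical.

```shell
# Hypothetical helper: print "t" when nvidia-smi lists at least one GPU, else "f".
choose_gpu_flag() {
    if command -v nvidia-smi >/dev/null 2>&1 &&
       nvidia-smi -L 2>/dev/null | grep -q '^GPU '; then
        echo t
    else
        echo f
    fi
}

gpu_flag=$(choose_gpu_flag)
echo "Using -gpu $gpu_flag"

# Example use with the test data from this page:
# blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu "$gpu_flag"
```

In a batch job it may be preferable to exit with an error when the flag is "f", so that a misconfigured allocation is noticed immediately.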

Example Batch

#!/bin/bash
#SBATCH -J blastp
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<insert-your-email-address>
#SBATCH --account=<insert-your-project-name>
#SBATCH --gres=gpu:1
#SBATCH --partition=moran-gpu

echo "Start:"
module load gpu-blast/1.1
echo "Loaded gpu-blast:"
srun blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t -method 1 -gpu_blocks 128 -gpu_threads 32
echo "Finished Successfully:"
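
A batch script like this one is normally submitted with sbatch. The sketch below assumes the script has been saved as blastp.sh, which is a hypothetical file name, and that it is submitted from the test directory so that the relative queries/ and database/ paths resolve. The function name submit_blast_job is also hypothetical.

```shell
# Hypothetical helper: submit a batch script and show the job's queue entry.
submit_blast_job() {
    script=$1
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found; submit from a cluster login node" >&2
        return 127
    fi
    # --parsable makes sbatch print only the job ID (optionally followed by ";cluster").
    job_id=$(sbatch --parsable "$script") || return 1
    job_id=${job_id%%;*}
    echo "Submitted job $job_id"
    squeue -j "$job_id"
}

submit_blast_job blastp.sh || echo "job was not submitted"
```

Unless the script sets an output file, Slurm writes the job's output to slurm-<jobid>.out in the submission directory by default.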