GPU-BLAST
Overview
The Basic Local Alignment Search Tool (BLAST) is one of the most widely used bioinformatics tools. It aligns query sequences against those present in a selected target database.
GPU-BLAST is an accelerated version of the popular NCBI-BLAST that uses a general-purpose graphics processing unit (GPU). According to its developers, GPU-BLAST is nearly four times faster than the sequential NCBI-BLAST while producing identical results.
Using
Use the module name gpu-blast to discover the available versions and to load the application.
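On a system that uses Environment Modules or Lmod, the two steps might look like the sketch below. It assumes the `module` command is available in your shell; the function name `load_gpu_blast` is hypothetical and only there to keep the guard in one place.

```shell
# Sketch: list and load gpu-blast, assuming the `module` command
# (Environment Modules or Lmod) is available in the current shell.
# load_gpu_blast is a hypothetical helper name.
load_gpu_blast() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found"
        return 1
    fi
    module avail gpu-blast   # show the installed versions
    module load gpu-blast    # load the default version
}

load_gpu_blast || true
```

To load a specific version, give it after a slash, as in the batch example on this page (gpu-blast/1.1).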
Basic Command Line
blastp -help
Test Example: based on section III, "How to use GPU-BLAST", of the README document.
mkdir test
cd test/
mkdir database
cd database/
wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/env_nr.gz
gunzip env_nr.gz
cd ..
mkdir queries
cd queries/
wget http://thales.cheme.cmu.edu/gpublast/queries.tar.gz
tar -xzf queries.tar.gz
rm queries.tar.gz
cd ..
makeblastdb -in database/env_nr -out database/sorted_env_nr -dbtype prot -sort_volumes -max_file_sz 500MB
blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t -method 2 -gpu_blocks 256 -gpu_threads 32
blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t
time blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t > gpu_output.txt
time blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu f > cpu_output.txt
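Because GPU-BLAST is described as producing the same results as the CPU run, one quick sanity check is to compare the two output files. The following is a minimal sketch using `diff`; the helper name `compare_outputs` is hypothetical, and the file names are the ones written by the timed commands above.

```shell
# Hypothetical helper: report whether two BLAST output files match.
# Prints "identical", "different", or "missing".
compare_outputs() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "missing"
        return 2
    fi
    if diff -q "$1" "$2" >/dev/null; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

compare_outputs gpu_output.txt cpu_output.txt || true
```

If the files differ, running `diff gpu_output.txt cpu_output.txt` without `-q` shows the differing lines.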
Note
Note the -gpu option, which selects whether a GPU is used (t) or not (f). To run this software on a GPU, you must request an appropriate GPU node in your job allocation request. If you run with -gpu t on a node that has no GPU, you will see the following output:
WARNING: There is no available device supporting CUDA. Continuing with the CPU only...
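To avoid falling back to the CPU by accident, a script could check for a visible GPU before choosing the -gpu value. The sketch below assumes the NVIDIA driver utility `nvidia-smi` is installed on GPU nodes, which may not hold on every system; the function name `gpu_flag` is hypothetical.

```shell
# Hypothetical helper: print "t" if an NVIDIA GPU appears to be visible,
# otherwise "f". Assumes nvidia-smi is installed on GPU nodes.
gpu_flag() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
        echo t
    else
        echo f
    fi
}

echo "Would run blastp with: -gpu $(gpu_flag)"
```

The result can be passed straight to the command line, for example `blastp ... -gpu "$(gpu_flag)"`.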
Example Batch
#!/bin/bash
#SBATCH -J blastp
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<insert-your-email-address>
#SBATCH --account=<insert-your-project-name>
#SBATCH --gres=gpu:1
#SBATCH --partition=moran-gpu
echo "Start:"
module load gpu-blast/1.1
echo "Loaded gpu-blast:"
srun blastp -query queries/SequenceLength_00000100.txt -db database/sorted_env_nr -gpu t -method 1 -gpu_blocks 128 -gpu_threads 32
echo "Finished Successfully:"
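The batch script can then be submitted to Slurm. This is a sketch only: `blastp.sh` is a hypothetical file name for the script above, and `submit_blast_job` is a hypothetical helper that guards against running outside a Slurm environment.

```shell
# Sketch: submit the batch script and list your queued jobs.
# blastp.sh is a hypothetical file name for the batch script above.
submit_blast_job() {
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found"
        return 1
    fi
    sbatch blastp.sh      # submit the job
    squeue -u "$USER"     # show your pending and running jobs
}

submit_blast_job || true
```

The script uses the relative paths queries/ and database/, so it should be submitted from the test directory created earlier; by default Slurm starts a job in the directory it was submitted from.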