Overview

  • InterPro is a database that integrates predictive information about protein function from a number of partner resources, giving an overview of the families a protein belongs to and the domains and sites it contains.

    Users with novel nucleotide or protein sequences that they wish to characterise functionally can use the InterProScan software package to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are calculated against the signatures of all required member databases, and the results are output in a variety of formats.
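
Because sequences are submitted in FASTA format, it can help to sanity-check an input file before requesting an allocation. The following is a minimal sketch, not part of InterProScan itself: the sample sequences and the temporary file are hypothetical stand-ins for your own input.

```shell
#!/bin/bash
# Hypothetical example input; replace "$fasta" with the path to your own file.
fasta=$(mktemp)
cat > "$fasta" <<'EOF'
>seq1 hypothetical protein
MKTAYIAKQRQISFVKSHFSRQ
>seq2 hypothetical protein
MSDNELQKLAEEAGV
EOF

# Each FASTA record starts with a '>' header line, so counting those
# lines gives the number of sequences in the file.
num_seqs=$(grep -c '^>' "$fasta")
echo "Sequences: $num_seqs"

# Count headers that are not followed by any sequence line (empty records).
empty_records=$(awk '
    /^>/ { if (hdr) n++; hdr = 1; next }   # previous header had no sequence
    NF   { hdr = 0 }                       # a non-blank sequence line was seen
    END  { if (hdr) n++; print n + 0 }
' "$fasta")
echo "Empty records: $empty_records"
```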

Using

Use the module name interproscan.

More information on running InterProScan, along with further documentation, can be found here: https://interproscan-docs.readthedocs.io/en/5.56-89.0/HowToRun.html

Example

[]$ module load interproscan
[]$ ml
Currently Loaded Modules:
  1) slurm/latest       (S)   5) mpfr/4.1.0    9) gcc/12.2.0     13) expat/2.4.8     17) libiconv/1.16   21) tar/1.34        25) sqlite/3.39.2           29) perl/5.34.1
  2) arcc/1.0           (S)   6) mpc/1.2.1    10) bzip2/1.0.8    14) ncurses/6.3     18) xz/5.2.5        22) gettext/0.21    26) util-linux-uuid/2.37.4  30) openjdk/11.0.15_10
  3) singularity/3.10.3       7) zlib/1.2.12  11) libmd/1.0.4    15) readline/8.1.2  19) libxml2/2.10.1  23) libffi/3.4.2    27) python/3.10.6           31) interproscan/5.61-93.0
  4) gmp/6.2.1                8) zstd/1.5.2   12) libbsd/0.11.5  16) gdbm/1.19       20) pigz/2.7        24) openssl/1.1.1q  28) berkeley-db/18.1.40

  Where:
   S:  Module is Sticky, requires --force to unload or purge
 
[]$ salloc --account=<my account> --time=1:00:00 --mem=60GB
salloc: Granted job allocation
salloc: Waiting for resource configuration
salloc: Nodes are ready for job
[]$ interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp
01/05/2023 08:22:41:521 Welcome to InterProScan-5.61-93.0
01/05/2023 08:22:41:523 Running InterProScan v5 in STANDALONE mode... on Linux
01/05/2023 08:22:54:075 RunID: mtest2_20230501_082253698_fi3i
01/05/2023 08:23:12:222 Loading file /test_all_appl.fasta
01/05/2023 08:23:12:234 Running the following analyses:
[AntiFam-7.0,CDD-3.20,Coils-2.2.1,FunFam-4.3.0,Gene3D-4.3.0,Hamap-2021_04,MobiDBLite-2.0,PANTHER-17.0,Pfam-35.0,PIRSF-3.10,PIRSR-2021_05,PRINTS-42.0,ProSitePatterns-2022_05,ProSiteProfiles-2022_05,SFLD-4,SMART-9.0,SUPERFAMILY-1.75,TIGRFAM-15.0]
Pre-calculated match lookup service DISABLED.  Please wait for match calculations to complete...

01/05/2023 08:24:15:125 25% completed
01/05/2023 08:25:04:303 50% completed
01/05/2023 08:26:06:396 76% completed
01/05/2023 08:27:04:227 90% completed
01/05/2023 08:27:52:696 100% done:  InterProScan analyses completed
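
Once a run like the one above completes, the TSV results can be summarised with standard shell tools. The sketch below assumes the tab-separated layout described in the InterProScan documentation, in which column 1 is the protein accession and column 4 is the analysis name. The three rows are hypothetical sample data, not output from the run shown above.

```shell
#!/bin/bash
# Hypothetical TSV rows standing in for a real InterProScan output file.
tsv=$(mktemp)
printf 'seq1\tmd5a\t120\tPfam\tPF00001\tHypothetical domain A\t5\t100\t1.2E-10\tT\t01-05-2023\n' > "$tsv"
printf 'seq1\tmd5a\t120\tSMART\tSM00001\tHypothetical domain B\t8\t95\t3.4E-8\tT\t01-05-2023\n' >> "$tsv"
printf 'seq2\tmd5b\t80\tPfam\tPF00002\tHypothetical domain C\t2\t60\t5.6E-5\tT\t01-05-2023\n' >> "$tsv"

# Number of matches reported by each analysis (column 4).
awk -F'\t' '{ count[$4]++ } END { for (a in count) print a "\t" count[a] }' "$tsv" | sort

# Number of matches for a single analysis, here Pfam.
pfam_hits=$(awk -F'\t' '$4 == "Pfam" { n++ } END { print n + 0 }' "$tsv")

# Number of distinct proteins with at least one match (column 1).
proteins_hit=$(cut -f1 "$tsv" | sort -u | grep -c .)
echo "Pfam matches: $pfam_hits, proteins with matches: $proteins_hit"
```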

Batch / Interactive Session Example

After logging onto Beartooth you may:

  1. Create an interactive session:

 In the example below, change arcc to your project name and adjust the time to what you expect to need. The example requests 60 minutes.

[]$ module load interproscan
[]$ ml
Currently Loaded Modules:
  1) slurm/latest       (S)   5) mpfr/4.1.0    9) gcc/12.2.0     13) expat/2.4.8     17) libiconv/1.16   21) tar/1.34        25) sqlite/3.39.2           29) perl/5.34.1
  2) arcc/1.0           (S)   6) mpc/1.2.1    10) bzip2/1.0.8    14) ncurses/6.3     18) xz/5.2.5        22) gettext/0.21    26) util-linux-uuid/2.37.4  30) openjdk/11.0.15_10
  3) singularity/3.10.3       7) zlib/1.2.12  11) libmd/1.0.4    15) readline/8.1.2  19) libxml2/2.10.1  23) libffi/3.4.2    27) python/3.10.6           31) interproscan/5.61-93.0
  4) gmp/6.2.1                8) zstd/1.5.2   12) libbsd/0.11.5  16) gdbm/1.19       20) pigz/2.7        24) openssl/1.1.1q  28) berkeley-db/18.1.40

  Where:
   S:  Module is Sticky, requires --force to unload or purge
 
[]$ salloc --account=arcc --time=1:00:00 --mem=60GB
salloc: Granted job allocation
salloc: Waiting for resource configuration
salloc: Nodes are ready for job
[]$ interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp
01/05/2023 08:22:41:521 Welcome to InterProScan-5.61-93.0
01/05/2023 08:22:41:523 Running InterProScan v5 in STANDALONE mode... on Linux
01/05/2023 08:22:54:075 RunID: mtest2_20230501_082253698_fi3i
01/05/2023 08:23:12:222 Loading file /test_all_appl.fasta
01/05/2023 08:23:12:234 Running the following analyses:
[AntiFam-7.0,CDD-3.20,Coils-2.2.1,FunFam-4.3.0,Gene3D-4.3.0,Hamap-2021_04,MobiDBLite-2.0,PANTHER-17.0,Pfam-35.0,PIRSF-3.10,PIRSR-2021_05,PRINTS-42.0,ProSitePatterns-2022_05,ProSiteProfiles-2022_05,SFLD-4,SMART-9.0,SUPERFAMILY-1.75,TIGRFAM-15.0]
Pre-calculated match lookup service DISABLED.  Please wait for match calculations to complete...

01/05/2023 08:24:15:125 25% completed
01/05/2023 08:25:04:303 50% completed
01/05/2023 08:26:06:396 76% completed
01/05/2023 08:27:04:227 90% completed
01/05/2023 08:27:52:696 100% done:  InterProScan analyses completed

  2. Submit a job:

 Below is an example of a batch file:

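#!/bin/bash

# Change arcc to your project name.
#SBATCH --account=arcc
#SBATCH --nodes=1
# A memory value of 0 requests all of the memory on the node.
#SBATCH --mem=0
# Wall-time limit of 30 minutes.
#SBATCH --time=0:30:00

# Load InterProScan and run it in standalone mode.
# -f tsv writes tab-separated output; -dp disables the pre-calculated match lookup.
module load interproscan
interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp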
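
A batch file like this is submitted with sbatch. The sketch below saves the example under the hypothetical name interproscan_job.sh, checks it for shell syntax errors, and submits it only if Slurm is available in the current environment. The file name is an assumption for illustration.

```shell
#!/bin/bash
# Hypothetical file name for the batch file shown above.
job_script="interproscan_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --account=arcc
#SBATCH --nodes=1
#SBATCH --mem=0
#SBATCH --time=0:30:00

module load interproscan
interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp
EOF

# Parse the script without executing it to catch syntax errors early.
syntax_ok=no
bash -n "$job_script" && syntax_ok=yes

# Submit only where Slurm is installed, for example on a cluster login node.
if command -v sbatch >/dev/null 2>&1; then
    job_id=$(sbatch --parsable "$job_script")   # --parsable prints the job ID
    echo "Submitted job $job_id"
    squeue --jobs "$job_id"                     # show the job's state in the queue
else
    echo "sbatch not found; run this on the cluster"
fi
```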

Match Lookup Service

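The pre-calculated match lookup service is disabled on Beartooth, so all analyses run locally. The -dp flag in the examples above disables the lookup, which is why the run log reports "Pre-calculated match lookup service DISABLED".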

Parallel Jobs

InterProScan ships with a cluster mode, but it does not integrate with Slurm, so on Beartooth InterProScan is installed to run in standalone mode.
The interproscan.properties file is configured accordingly.
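
One possible way to spread a large input across several standalone runs, which this page does not prescribe, is to split the FASTA file into chunks and pass each chunk to interproscan.sh in its own job. The sketch below only performs the split. The sample sequences, the chunk size, and the chunk file names are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical input with five short sequences.
fasta=$(mktemp)
for i in 1 2 3 4 5; do
    printf '>seq%s hypothetical protein\nMKTAYIAKQRQISFVKSHFSRQ\n' "$i"
done > "$fasta"

chunk_dir=$(mktemp -d)
seqs_per_chunk=2   # assumed chunk size; tune for your data

# Start a new chunk file every seqs_per_chunk header lines.
awk -v n="$seqs_per_chunk" -v dir="$chunk_dir" '
    /^>/ {
        if (count % n == 0) {
            if (out) close(out)
            out = sprintf("%s/chunk_%03d.fasta", dir, ++chunk)
        }
        count++
    }
    { print > out }
' "$fasta"

# Count the chunk files and confirm no sequences were lost.
set -- "$chunk_dir"/chunk_*.fasta
num_chunks=$#
total_seqs=$(cat "$@" | grep -c '^>')
echo "Wrote $num_chunks chunks containing $total_seqs sequences"
```

Each chunk could then be used as the -i argument of a separate job, for example one job per chunk file.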
