InterProScan

Overview

  • InterPro is a database that integrates predictive information about protein function from a number of partner resources, giving an overview of the families a protein belongs to and the domains and sites it contains.

    Users who have novel nucleotide or protein sequences that they wish to characterise functionally can use the InterProScan software package to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against the signatures of all required member databases, and the results are output in a variety of formats.
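As a quick sanity check before submitting, the number of records in a FASTA file can be counted from its header lines. The sketch below is illustrative only: the function name count_fasta_records and the two short sequences are made up for the example.

```shell
# Count FASTA records: each record starts with a header line beginning with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Tiny illustrative input; the sequences are invented for this example.
demo_fasta="$(mktemp)"
printf '>seq1 hypothetical protein\nMKTAYIAKQR\n>seq2 hypothetical protein\nMSLLTEVETY\n' > "$demo_fasta"
count_fasta_records "$demo_fasta"   # prints 2
rm -f "$demo_fasta"
```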

Using

Use the module name interproscan. Version 5.61-93.0 is available to load.

More information on running interproscan and further documentation may be found here: https://interproscan-docs.readthedocs.io/en/5.56-89.0/HowToRun.html

Allocate at least 60 GB of memory (--mem=60GB) when running an analysis. If you receive an out-of-memory error, request more memory.
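To judge how much memory to request on the next run, Slurm's accounting data can be compared against the original request. This is a minimal sketch that assumes job accounting (sacct) is enabled on the cluster; job_mem_report is a hypothetical helper name and the job ID is a placeholder.

```shell
# Hypothetical helper: show requested memory (ReqMem) next to peak usage (MaxRSS)
# for a finished job. Assumes Slurm accounting is available on the cluster.
job_mem_report() {
    sacct -j "$1" --format=JobID,State,ReqMem,MaxRSS,Elapsed
}

# Usage (replace <jobid> with the ID printed by salloc or sbatch):
#   job_mem_report <jobid>
```

A State of OUT_OF_MEMORY, or a MaxRSS close to ReqMem, suggests that a larger --mem value is needed.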

Example

[]$ module load interproscan
[]$ salloc --account=<my account> --time=1:00:00 --mem=60GB
salloc: Granted job allocation
salloc: Waiting for resource configuration
salloc: Nodes are ready for job
[]$ interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp
01/05/2023 08:22:41:521 Welcome to InterProScan-5.61-93.0
01/05/2023 08:22:41:523 Running InterProScan v5 in STANDALONE mode... on Linux
01/05/2023 08:22:54:075 RunID: mtest2_20230501_082253698_fi3i
01/05/2023 08:23:12:222 Loading file /test_all_appl.fasta
01/05/2023 08:23:12:234 Running the following analyses:
[AntiFam-7.0,CDD-3.20,Coils-2.2.1,FunFam-4.3.0,Gene3D-4.3.0,Hamap-2021_04,MobiDBLite-2.0,PANTHER-17.0,Pfam-35.0,PIRSF-3.10,PIRSR-2021_05,PRINTS-42.0,ProSitePatterns-2022_05,ProSiteProfiles-2022_05,SFLD-4,SMART-9.0,SUPERFAMILY-1.75,TIGRFAM-15.0]
Pre-calculated match lookup service DISABLED. Please wait for match calculations to complete...
01/05/2023 08:24:15:125 25% completed
01/05/2023 08:25:04:303 50% completed
01/05/2023 08:26:06:396 76% completed
01/05/2023 08:27:04:227 90% completed
01/05/2023 08:27:52:696 100% done: InterProScan analyses completed
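Once a run with -f tsv has finished, the results can be summarised per member database. The sketch below relies on the fourth tab-separated column of InterProScan's TSV output being the analysis name. The function name count_matches_by_analysis is hypothetical, and the output file name in the usage comment assumes the default of writing the input file name plus .tsv to the working directory.

```shell
# Hypothetical helper: count matches per analysis (column 4 of the TSV output)
# and print "analysis<TAB>count", sorted by analysis name.
count_matches_by_analysis() {
    awk -F'\t' '{ n[$4]++ } END { for (a in n) print a "\t" n[a] }' "$1" | sort
}

# Usage (file name assumed, see above):
#   count_matches_by_analysis test_all_appl.fasta.tsv
```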

 

Below is an example of a batch file:

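#!/bin/bash
#SBATCH --account=arcc
#SBATCH --nodes=1
# Request 60 GB; without a unit suffix Slurm reads --mem in megabytes.
#SBATCH --mem=60G
#SBATCH --time=0:30:00

module load interproscan
# -f tsv writes tab-separated output; -dp disables the pre-calculated match lookup.
interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp
wait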

Match Lookup Service

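This service is disabled on Beartooth, so all analyses run locally without match lookup. The -dp (--disable-precalc) option used in the examples above requests this explicitly, and the run log reports the lookup service as DISABLED.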

 

Parallel Jobs

Out of the box, InterProScan has a cluster mode, but it does not integrate with SLURM. InterProScan is therefore installed on Beartooth to run in standalone mode, and the interproscan.properties file has been configured accordingly.
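Within standalone mode, a single job can still use several cores on one node. The sketch below writes a job script that requests cores with --cpus-per-task and passes the same number to InterProScan's -cpu option. The core count of 8, the file name interproscan_job.sh, and the assumption that the Beartooth configuration honors -cpu are illustrative and should be checked against local guidance.

```shell
# Write a hypothetical job script. The quoted 'EOF' keeps $SLURM_CPUS_PER_TASK
# unexpanded here so that it is evaluated inside the job instead.
cat > interproscan_job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=<my account>
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=60G
#SBATCH --time=1:00:00

module load interproscan
# -cpu sets how many cores InterProScan may use; match it to the allocation.
interproscan.sh -i ~/test_all_appl.fasta -f tsv -dp -cpu "$SLURM_CPUS_PER_TASK"
EOF

# Submit with:
#   sbatch interproscan_job.sh
```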