HMMaccel
The aim of this script is to speed up hmmsearch runs without sacrificing sensitivity.
HMMaccel can be downloaded HERE.
The archive should contain three files: a readme, a configuration file and the Perl script.
To run the program you will need a Perl interpreter and working local installations of both PSI-BLAST (1) and HMMER (2).
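As a quick sanity check before the first run, the following minimal sketch reports which of the required programs cannot be found on the PATH. The executable names are assumptions (the HMMER 2 programs and the legacy PSI-BLAST binary blastpgp); they are not taken from the HMMaccel script itself and may differ on your system. Finding a program on the PATH does not prove that it works, so a short test run of each tool is still advisable.

```python
import shutil

# Assumed executable names: the HMMER 2 programs and the legacy PSI-BLAST
# binary (blastpgp). Adjust the list to match the local installation.
REQUIRED_TOOLS = ["perl", "blastpgp", "hmmbuild", "hmmcalibrate",
                  "hmmsearch", "hmmalign"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required programs were found on PATH.")
```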
INPUT:
Alignment in fasta format.
Program parameters as read from the configuration file (hmmaccel.conf).
Steps:
1. Start a PSI-BLAST run from the input alignment and search for ONE round only.
2. Extract the full-length sequences of all hits up to a very high E-value (default: 1000), to avoid false negatives.
3. Build an HMM from the input alignment.
4. Calibrate the HMM.
5. Run hmmsearch against this smaller database.
This generally speeds up an hmmsearch run by a factor of roughly 20 or more (about 5 minutes instead of 1.5 hours for a search against nr).
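Step 2 above, building the smaller database, can be sketched as follows. This is an illustration only, not the code used by HMMaccel: it assumes the PSI-BLAST hits have already been parsed into (sequence ID, E-value) pairs and that the first word of each FASTA header is the sequence ID. All function names are hypothetical.

```python
def select_hit_ids(hits, max_evalue=1000.0):
    """Return the set of sequence IDs whose E-value is at or below max_evalue."""
    return {seq_id for seq_id, evalue in hits if evalue <= max_evalue}


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_subdatabase(db_lines, hit_ids):
    """Keep the full-length records whose ID (first header word) is a hit."""
    kept = []
    for header, seq in read_fasta(db_lines):
        words = header.split()
        if words and words[0] in hit_ids:
            kept.append((header, seq))
    return kept


# Toy data: seqC is above the E-value cutoff, seqD was never hit.
hits = [("seqA", 1e-30), ("seqB", 250.0), ("seqC", 5000.0)]
db = [">seqA kinase", "MKV", "LLA", ">seqB unknown", "MTT",
      ">seqC other", "MAA", ">seqD no hit", "MGG"]
sub = build_subdatabase(db, select_hit_ids(hits))
for header, seq in sub:
    print(">" + header)
    print(seq)
```

Because the cutoff is deliberately permissive, the sub-database still contains many unrelated sequences; the speed-up comes from hmmsearch scanning only this subset rather than the complete database.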
Optionally, the hmmsearch hits can also be converted into a multiple alignment.
Steps:
1. Extract the hmmsearch hits (only the domains that hit).
2. Realign these domains to the HMM using hmmalign.
3. Convert the Stockholm-format alignment to FASTA.
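The last step, converting Stockholm to FASTA, can be sketched as below. This is a minimal illustration under stated assumptions, not the conversion routine from the HMMaccel script: it handles a single alignment, skips all markup lines (those starting with "#"), joins interleaved blocks per sequence name, and leaves gap characters unchanged. The function name is hypothetical.

```python
def stockholm_to_fasta(lines):
    """Convert one Stockholm alignment (iterable of lines) to FASTA text."""
    seqs = {}  # name -> list of aligned chunks; dicts keep insertion order
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and markup (#=GF, #=GC, #=GS, #=GR)
        if line == "//":
            break  # end of the alignment
        parts = line.split()
        if len(parts) != 2:
            continue  # not a "name sequence" line; skip it
        name, chunk = parts
        seqs.setdefault(name, []).append(chunk)
    return "".join(">%s\n%s\n" % (name, "".join(chunks))
                   for name, chunks in seqs.items())


# Toy interleaved alignment with two blocks.
example = [
    "# STOCKHOLM 1.0",
    "#=GF ID demo",
    "seq1/1-6  ACD-EF",
    "seq2/3-8  ACDGEF",
    "#=GC RF   xxxxxx",
    "",
    "seq1/1-6  GH.",
    "seq2/3-8  GHI",
    "//",
]
fasta_text = stockholm_to_fasta(example)
print(fasta_text, end="")
```

Each sequence is written on a single line here; wrap the lines if a downstream tool expects a fixed line width.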
OUTPUT:
*.db sub-database used in the search (generated via psiblast).
*.hmm HMM built from the input alignment.
*.hms HMMsearch results.
*.hln HMMalign output file (Stockholm format).
*.hln.fasta fasta version of the alignment.
Hope you find the program helpful.
References:
1) Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res., 25(17), 3389-3402.
2) Eddy, S. (1998) Profile hidden Markov models. Bioinformatics, 14(9), 755-763.