note - FIZ Karlsruhe
RUN BLAST<br />
Advanced similarity searching<br />
The RUN BLAST function makes use of the industry standard BLAST methodology, and is used<br />
in DGENE with the permission of the National Center for Biotechnology Information (NCBI).<br />
The Basic Local Alignment Search Tool (BLAST) was described by Altschul et al.<sup>1</sup> in 1997.<br />
BLAST is a sequence comparison algorithm<sup>2</sup>, optimized for speed, that searches sequence<br />
databases for optimal local alignments to a query. The initial search looks for a word of length<br />
"W" that scores at least "T" when compared to the query using a substitution matrix. Word hits are<br />
then extended in either direction in an attempt to generate an alignment with a score exceeding the<br />
threshold of "S". The "T" parameter dictates the speed and sensitivity of the search.<br />
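The word search and extension steps described above can be illustrated with a toy sketch.<br />
This is not the NCBI implementation: the +1/-1 scoring scheme, the ungapped extension and<br />
all function names are assumptions chosen only to show how "W", "T" and "S" interact.<br />

```python
# Toy seed-and-extend search. Illustrative only; NOT the NCBI BLAST code.
# Assumptions: +1 match / -1 mismatch scoring, ungapped extension,
# and hypothetical function names.

def score_pair(a, b):
    """Toy substitution matrix: +1 for a match, -1 for a mismatch."""
    return 1 if a == b else -1


def word_score(w1, w2):
    """Score two equal-length words position by position."""
    return sum(score_pair(a, b) for a, b in zip(w1, w2))


def find_word_hits(query, subject, W, T):
    """Return (query_pos, subject_pos) pairs whose length-W words score >= T."""
    hits = []
    for i in range(len(query) - W + 1):
        for j in range(len(subject) - W + 1):
            if word_score(query[i:i + W], subject[j:j + W]) >= T:
                hits.append((i, j))
    return hits


def extend_hit(query, subject, i, j, W):
    """Extend a word hit in both directions and return the best total score."""
    seed = word_score(query[i:i + W], subject[j:j + W])

    def best_gain(qi, sj, step):
        # Walk outward from the seed, remembering the best running score.
        best = running = 0
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            running += score_pair(query[qi], subject[sj])
            best = max(best, running)
            qi += step
            sj += step
        return best

    return seed + best_gain(i + W, j + W, 1) + best_gain(i - 1, j - 1, -1)


def toy_search(query, subject, W=3, T=3, S=5):
    """Report (query_pos, subject_pos, score) for hits that extend past S."""
    results = []
    for i, j in find_word_hits(query, subject, W, T):
        score = extend_hit(query, subject, i, j, W)
        if score > S:
            results.append((i, j, score))
    return results
```

In this sketch, lowering T admits more word hits, so more extensions are attempted: the search<br />
becomes more sensitive but slower, which matches the role of "T" described above.<br />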
BLAST search modes<br />
• Protein similarity (BLASTP) (/SQP) [default]<br />
• Translated protein similarity<sup>3</sup> (TBLASTN) (/TSQN)<br />
• Nucleic acid similarity (BLASTN) (/SQN)<br />
1. Single strand (/SQN SIN)<br />
2. Complementary strand (/SQN COM)<br />
3. Both strands (/SQN BOTH) [default]<br />
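As a rough illustration of the three nucleic acid strand options, the sketch below lists which<br />
query strands would be compared under each setting. It assumes that "complementary strand"<br />
means the reverse complement of the query as entered; the function names are hypothetical and<br />
are not part of DGENE.<br />

```python
# Illustrative only. Assumes COM means the reverse complement of the query.
# Function names are hypothetical and not part of DGENE or STN.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strands_to_search(query, mode="BOTH"):
    """Return the query strands implied by SIN, COM or BOTH."""
    mode = mode.upper()
    if mode == "SIN":
        return [query]
    if mode == "COM":
        return [reverse_complement(query)]
    if mode == "BOTH":
        return [query, reverse_complement(query)]
    raise ValueError("mode must be SIN, COM or BOTH")
```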
RUN BLAST offers both offline BATCH (page 79) and ALERT (page 83) search options. For the<br />
basic steps of a BLAST search, including how to gather, display and review results, see page 14.<br />
BLAST query limits<br />
BLAST accepts sequence queries up to 10,000 characters in length for all search modes. While<br />
the command line is limited to 256 characters, longer queries can be conveniently prepared offline<br />
and uploaded (see page 14). The uploaded query can be displayed and saved online for future<br />
use. BLAST reports at most the 10,000 best-scoring answers.<br />
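The limits above can be checked before going online. The following is a minimal sketch, assuming<br />
that the 256-character limit applies to the whole command line, so any characters taken up by the<br />
command itself are passed in as a hypothetical command_overhead value. The helper is an<br />
illustration, not an STN tool.<br />

```python
# Illustrative pre-check of the query limits quoted above.
# The helper name and the command_overhead parameter are assumptions.

MAX_QUERY_CHARS = 10_000        # longest query BLAST accepts
MAX_COMMAND_LINE_CHARS = 256    # longest command line


def entry_method(query, command_overhead=0):
    """Suggest how a query of this length could be entered."""
    length = len(query)
    if length == 0:
        raise ValueError("query is empty")
    if length > MAX_QUERY_CHARS:
        return "too long"
    if length + command_overhead <= MAX_COMMAND_LINE_CHARS:
        return "command line"
    return "upload"
```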
<strong>note</strong><br />
RUN BLAST requires substantially fewer computational resources than RUN<br />
GETSIM, which is based on FASTA methodology. Searches conducted using<br />
RUN BLAST will therefore usually take much less time to run to completion<br />
online than RUN GETSIM. However, BLAST is sometimes less sensitive than<br />
FASTA, and so may not retrieve as comprehensive a set of results as a<br />
GETSIM (FASTA) search (see page 47).<br />
1 Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and<br />
David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs."<br />
Nucleic Acids Res. 25:3389-3402. See: http://www.ncbi.nlm.nih.gov/pubmed/9254694.<br />
2 See: http://www.ncbi.nlm.nih.gov/books/NBK21097/.<br />
3 A protein query sequence searched against a nucleotide database translated in all three reading frames.<br />
GENESEQ on STN (DGENE) Workshop Manual | Page 47