21.01.2013 Views

note - FIZ Karlsruhe


RUN GETSIM for a SQN BOTH search

Introduction to similarity searching

Sequence queries can be entered directly on the command line, or sent as a plain text file via the Query Upload Wizard (page 14).

=> FILE DGENE
FILE 'DGENE' ENTERED AT 13:06:43 ON 28 DEC 2010
COPYRIGHT (C) 2010 THOMSON REUTERS

=> RUN GETSIM GGGUUUAGGAGUGGUAGGUCUUACGAUGCCAGCUGUAAUGCCUACCGGATAA/SQN BOTH
RUN GETSIM AT 13:07:03 ON 28 DEC 2010
COPYRIGHT (C) 2010 FIZ KARLSRUHE GMBH
. . . .
8725 ANSWERS FOUND ABOVE A THRESHOLD OF 67
QUERY SELF SCORE VALUE IS 260
BEST ANSWER SCORE VALUE IS 260

The Best Answer Score value is also given. In this example, there is at least one perfect answer match to the query.

The graphic representation plots the count of hit sequences (x-axis) against the similarity score (y-axis). The graph gives a visual clue about the proportion of similar and not so similar sequences in the answer set.

[Graph: Similarity Score (y-axis, ticks at 130 and 260) against Answer Count (x-axis, ticks at 1750, 3500, 5250, 7000 and 8750)]
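One way to read such a plot is to ask how many answers reach a given similarity score. The following is a minimal Python sketch of that idea, not the way STN draws the graph; the function name `answers_at_or_above` and the score values are hypothetical and are not taken from the DGENE answer set.

```python
from bisect import bisect_left


def answers_at_or_above(scores, level):
    """Count answers whose similarity score is at least `level`."""
    ordered = sorted(scores)
    # bisect_left gives the index of the first score >= level.
    return len(ordered) - bisect_left(ordered, level)


# Hypothetical similarity scores for eight answers (illustration only).
scores = [260, 180, 130, 95, 90, 88, 70, 68]

for level in (260, 130, 67):
    print(level, answers_at_or_above(scores, level))
```

With these made-up values, only one answer reaches the self score of 260, while all eight lie above the threshold of 67, which mirrors the shape described above: few very similar sequences and many less similar ones.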

ENTER EITHER THE NUMBER OF ANSWERS YOU WISH TO KEEP
OR ENTER MINIMUM PERCENT OF SELF SCORE FOLLOWED BY %
(BEST ANSWER PERCENTAGE OF SELF SCORE IS 100%)
ENTER (ALL) OR ? :ALL
L1 RUN STATEMENT CREATED
L1 8722 GGGUUUAGGAGUGGUAGGUCUUACGAUGCCAGCUGUAAUGCCUACCGGATAA/SQN.BOTH

Answer set arranged by accession number; to sort by descending
similarity score, enter at an arrow prompt (=>) "sor score d".

In this example, all answers are kept (L1).
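The three possible replies to the prompt (a number of answers, a minimum percent of the self score, or ALL) can be illustrated with a short Python sketch. This is only a model of the selection logic, not STN code: the function `keep_answers`, the accession numbers, and the assumption that a numeric reply keeps the highest-scoring answers are all hypothetical.

```python
def keep_answers(hits, self_score, choice="ALL"):
    """Select answers the way the GETSIM prompt is described.

    hits       -- list of (accession, score) pairs
    self_score -- the Query Self Score (260 in the example session)
    choice     -- "ALL", a count such as "2", or a percent such as "50%"
    """
    # Rank by descending similarity score, like "sor score d".
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    choice = choice.strip().upper()
    if choice == "ALL":
        return ranked
    if choice.endswith("%"):
        minimum = self_score * float(choice[:-1]) / 100.0
        return [hit for hit in ranked if hit[1] >= minimum]
    # Assumption: a plain number keeps that many top-scoring answers.
    return ranked[: int(choice)]


# Hypothetical accession numbers and scores (illustration only).
hits = [("HYP0003", 90), ("HYP0001", 260), ("HYP0002", 130)]

print(keep_answers(hits, 260, "ALL"))
print(keep_answers(hits, 260, "50%"))
print(keep_answers(hits, 260, "1"))
```

A reply of 50% with a self score of 260 keeps every answer scoring 130 or more, so the percent form is convenient when the absolute scores are not known in advance.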

The Query Self Score value (260 in the RUN GETSIM output above) is the ideal score for a perfect answer match.

Further refinement with sequence length

For this example we are not interested in sequences under 101 nucleotides. These are excluded by using the sequence length field, i.e. SQL>100.

=> S L1 AND SQL>100
13125023 SQL>100
L2 8554 L1 AND SQL>100
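The effect of combining the answer set with SQL>100 can be pictured as a simple length filter. The Python sketch below shows the idea on locally held records; the function `longer_than`, the record layout, and the accession numbers are hypothetical and do not reflect how DGENE stores or searches its data.

```python
def longer_than(records, min_exclusive=100):
    """Keep records whose sequence is longer than `min_exclusive`."""
    return [rec for rec in records if len(rec["sequence"]) > min_exclusive]


# Hypothetical records (illustration only): lengths 100, 101 and 52.
records = [
    {"accession": "HYP0001", "sequence": "A" * 100},
    {"accession": "HYP0002", "sequence": "G" * 101},
    {"accession": "HYP0003", "sequence": "C" * 52},
]

kept = longer_than(records)
print([rec["accession"] for rec in kept])
```

Note that the comparison is strict: a sequence of exactly 100 nucleotides is dropped, which matches the stated aim of excluding everything under 101 nucleotides.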

GENESEQ on STN (DGENE) Workshop Manual | Page 37
