29.07.2014 Views

On the Analysis of Optical Mapping Data - University of Wisconsin ...


Significance: An optimal alignment exists in any map comparison problem, irrespective of any actual association. In order to minimize the potential effects of misaligned maps, it is essential to limit alignments by some additional criterion. This is the problem of assessing the significance of a given alignment. The significance problem in optical map alignment is more difficult than in sequence alignment, because of a greater degree of noise and also because of differences in the nature of the data. We find deficiencies in the current state of the art, and in Chapter 3 we introduce and evaluate an alternative approach to measuring the significance of optical map alignments. Here, we give a general overview of the mechanics of map alignment.

Notation: We restrict our attention to pairwise alignments, i.e. those between two restriction maps. Let $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_n)$ denote two restriction maps with $m$ and $n$ fragments respectively. Let the corresponding representations in terms of cut sites be $S(x) = \{s_0 < s_1 < \cdots < s_m\}$ and $S(y) = \{t_0 < t_1 < \cdots < t_n\}$. An alignment between $x$ and $y$ can be represented by an ordered set of index pairs

$$C = \left( \begin{pmatrix} i_1 \\ j_1 \end{pmatrix}, \begin{pmatrix} i_2 \\ j_2 \end{pmatrix}, \ldots, \begin{pmatrix} i_k \\ j_k \end{pmatrix} \right)$$

indicating a correspondence between the cut sites $s_{i_l}$ and $t_{j_l}$ for $l = 1, \ldots, k$, where $0 < i_1 < \cdots < i_k < m$ and $0 < j_1 < \cdots < j_k < n$. To allow missing fragments in the alignment, this last condition can be modified to allow successive indices to be equal, as long as successive index pairs are not identical. For non-trivial alignments $k \ge 2$, in which case the alignment consists of $k-1$ aligned chunks. The $l$th chunk ($l = 1, \ldots, k-1$) has lengths $\tilde{x}_l = s_{i_{l+1}} - s_{i_l}$ and $\tilde{y}_l = t_{j_{l+1}} - t_{j_l}$, involving $m_l = i_{l+1} - i_l$ and $n_l = j_{l+1} - j_l$ fragments respectively in the original maps $x$ and $y$. To be used successfully in a dynamic programming algorithm, a score function must be additive, in the sense that the score of a complete alignment must be the sum of the scores for its component chunks.
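
The notation above can be made concrete with a short Python sketch. It converts fragment lengths to cut sites, checks the ordering conditions on the index pairs, extracts the chunks between consecutive index pairs, and sums a per-chunk score. All function names (`cut_sites`, `is_valid_alignment`, `chunks`, `chunk_score`, `alignment_score`) are assumptions made for this illustration. In particular, `chunk_score` is a hypothetical placeholder chosen only to demonstrate additivity; it is not the score function discussed in the text. Placing the first cut site at 0 is likewise an assumption.

```python
from itertools import accumulate


def cut_sites(fragments, origin=0.0):
    """Return cut-site positions s_0 < s_1 < ... < s_m for m fragment lengths.

    Placing s_0 at `origin` is an assumption of this sketch.
    """
    return list(accumulate(fragments, initial=origin))


def is_valid_alignment(pairs, m, n, allow_missing=False):
    """Check the ordering conditions on index pairs (i_l, j_l), l = 1..k.

    Requires k >= 2, 0 < i_l < m and 0 < j_l < n. Indices must be strictly
    increasing, unless allow_missing is True, in which case successive
    indices may be equal as long as successive pairs are not identical.
    """
    if len(pairs) < 2:
        return False
    if any(not (0 < i < m and 0 < j < n) for i, j in pairs):
        return False
    for (i0, j0), (i1, j1) in zip(pairs, pairs[1:]):
        if allow_missing:
            if i1 < i0 or j1 < j0 or (i1, j1) == (i0, j0):
                return False
        elif i1 <= i0 or j1 <= j0:
            return False
    return True


def chunks(x, y, pairs):
    """Return (x_len, y_len, m_l, n_l) for each of the k - 1 aligned chunks."""
    s, t = cut_sites(x), cut_sites(y)
    return [
        (s[i1] - s[i0], t[j1] - t[j0], i1 - i0, j1 - j0)
        for (i0, j0), (i1, j1) in zip(pairs, pairs[1:])
    ]


def chunk_score(x_len, y_len, m_l, n_l, site_penalty=1.0):
    """Hypothetical placeholder score for one chunk (NOT the text's score).

    Penalizes the length discrepancy and any deviation from a
    one-fragment-to-one-fragment match.
    """
    return -abs(x_len - y_len) - site_penalty * (abs(m_l - 1) + abs(n_l - 1))


def alignment_score(x, y, pairs):
    """Additive score: the sum of the chunk scores."""
    return sum(chunk_score(*c) for c in chunks(x, y, pairs))


if __name__ == "__main__":
    x = [4.0, 6.0, 5.0, 3.0, 7.0]   # m = 5 fragments
    y = [4.5, 11.0, 3.0, 6.5]       # n = 4 fragments
    pairs = [(1, 1), (3, 2), (4, 3)]
    print(is_valid_alignment(pairs, len(x), len(y)))
    print(chunks(x, y, pairs))
    print(alignment_score(x, y, pairs))
```

Because the score is a plain sum over chunks, the score of an alignment equals the score of any prefix of its index pairs plus the score of the remaining suffix that starts at the shared pair. This decomposition is the property a dynamic programming recursion relies on.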
