On the Analysis of Optical Mapping Data - University of Wisconsin ...
hard, do not always have a unique solution, and may not scale well. Additionally, these<br />
methods typically require measurements from multiple copies of the target DNA, usually<br />
through the creation of clone libraries.<br />
Optical mapping: Optical mapping (Schwartz et al., 1993; Dimalanta et al., 2004) produces<br />
ordered restriction maps from single DNA molecules. Briefly, DNA from hundreds of<br />
thousands of cells in solution is randomly sheared to produce pieces that are around 500 Kb<br />
long. The solution is then passed through a micro-channel, where the DNA molecules are<br />
stretched and then attached to a positively charged glass support. A restriction enzyme is<br />
then applied, cleaving the DNA at the corresponding restriction sites. The DNA molecules remain<br />
attached to the surface, but the elasticity of the stretched DNA pulls back the molecule<br />
ends at the cleaved sites. The surface is photographed under a microscope after being stained<br />
with a fluorochrome. The cleavage sites show up in the image as tiny gaps in the fluorescent<br />
line of the molecule, giving a snapshot of the full restriction map. Even though these<br />
molecules are large by many standards, they may still represent only a small fraction of the<br />
chromosome they come from. Naturally, the amount of information in an optical map data<br />
set is related to the size of the underlying genome. It is common to measure the effective<br />
size of a data set by its coverage, which is the ratio of the accumulated length of all optical<br />
maps to the estimated length of the genome.<br />
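The coverage definition above can be illustrated with a minimal sketch. The function name `coverage`, the choice of Kb as the unit, and the example numbers are assumptions made for illustration only; they are not taken from the source.

```python
def coverage(map_lengths_kb, genome_length_kb):
    """Return the summed optical-map length divided by the estimated genome length.

    Both arguments are assumed to be in the same unit (Kb here).
    """
    if genome_length_kb <= 0:
        raise ValueError("estimated genome length must be positive")
    return sum(map_lengths_kb) / genome_length_kb


# Hypothetical data set: 10,000 maps of 500 Kb each over a 100 Mb (100,000 Kb) genome.
example = coverage([500.0] * 10_000, 100_000.0)
print(example)  # 50.0
```

A coverage of 50 here means that, on average, each position in the genome is spanned by about 50 optical maps, which is the redundancy that later analysis relies on.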
Several types of noise affect optical map data, and a reliable picture of the true map can<br />
only be obtained by combining information from multiple optical maps that redundantly tile<br />
the genome. Most of the algorithmic challenges in optical mapping stem from trying to model<br />
the various kinds of noise, which are not all completely understood, and making inferences<br />
about the underlying map. Figure 1.1 outlines the basic steps of data collection, image<br />
processing and data analysis that together form the cornerstones of the optical mapping<br />
system.<br />
Uses: Optical mapping has various applications. It has been successfully used to assist in<br />
sequence assembly and validation efforts (Ivens et al., 2005; Armbrust et al., 2004), usually