
On the Analysis of Optical Mapping Data - University of Wisconsin ...




Software: The SOMA software suite can be used to perform restriction map alignments. As in sequence alignment, one is often interested in sub-optimal alignments as well, i.e. high-scoring alignments in addition to the top-scoring one. SOMA is able to find such alignments. Genspect can be used to visualize alignments reported by SOMA. Figure 1.5 shows a typical visualization of optical map alignments.

Assembly

The assembly problem can be viewed as a multiple alignment problem, with an additional step of producing an inferred consensus map. The most successful optical map assembly software to date is Gentig, based on ideas described in Anantharaman et al. (1997) (for clones) and Anantharaman et al. (1999) (for genomic DNA). Briefly, they develop a Bayesian approach where a prior model for the unknown restriction map and a conditional distribution for optical maps given the true map are used to derive the posterior density for a hypothesized map. The inferred restriction map is, in principle, the one that maximizes this posterior density. Due to the complexities of the problem, a complete search is infeasible, and various heuristics are employed to enable an efficient implementation. We have little to add on the assembly problem, and refer the reader to the original papers for further details. Gentig results can also be visualized using Genspect, as shown in Figure 1.6.
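
The Bayesian formulation described above can be written schematically as follows. The notation is assumed here for illustration and is not taken from the original papers: H denotes a hypothesized restriction map, D_1, ..., D_n the observed optical maps, \pi(H) the prior, and f(D_j | H) the conditional density of one optical map given the true map. The product form additionally assumes that the optical maps are conditionally independent given H.

```latex
% Notation assumed for illustration: H = hypothesized map, D_1..D_n = optical maps.
% The product form assumes conditional independence of the optical maps given H.
\[
  \pi(H \mid D_1, \dots, D_n) \;\propto\; \pi(H) \prod_{j=1}^{n} f(D_j \mid H),
  \qquad
  \hat{H} = \arg\max_{H} \, \pi(H \mid D_1, \dots, D_n).
\]
```

Under this reading, the heuristics mentioned above replace the exact maximization over all candidate maps H, which is infeasible, with a restricted search.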

1.3.5 Example (continued)

The goal of an optical mapping project is to infer the underlying restriction map of the genome being studied. For small genomes, Gentig serves this purpose well. However, for large genomes such as GM07535 and CHM (Table 1.1), the sizes of the data sets exceed its capacity, and new algorithms are required. Fortunately, additional information is available for these data sets in the form of an in silico reference map, derived from the human genome sequence by locating instances of the SwaI recognition pattern. The genomes being studied are largely similar to this reference, so we are primarily interested in how their restriction maps differ from the reference.
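
As a rough illustration of how an in silico reference map might be derived, the sketch below scans a sequence for the SwaI recognition pattern (ATTTAAAT, cut after the fourth base) and reports cut positions and fragment lengths. The function name `in_silico_map` and the toy sequence are hypothetical, and the code is not the procedure used to build the reference map discussed here. It treats the input as a single linear sequence and ignores ambiguous bases.

```python
import re

# SwaI recognition sequence (palindromic, so scanning one strand suffices).
SWAI_SITE = "ATTTAAAT"
# SwaI cuts in the middle of the site (ATTT^AAAT), leaving blunt ends.
SWAI_CUT_OFFSET = 4


def in_silico_map(sequence, site=SWAI_SITE, cut_offset=SWAI_CUT_OFFSET):
    """Return (cut_positions, fragment_lengths) for a linear sequence.

    Hypothetical helper for illustration only. Positions and lengths
    are in base pairs, with positions counted from the sequence start.
    """
    seq = sequence.upper()
    # A zero-width lookahead reports overlapping occurrences of the site,
    # which are possible here because the site begins and ends with "AT".
    pattern = re.compile("(?=" + re.escape(site) + ")")
    cuts = [m.start() + cut_offset for m in pattern.finditer(seq)]
    # Fragments are the gaps between consecutive cuts, including both ends.
    boundaries = [0] + cuts + [len(seq)]
    fragments = [b - a for a, b in zip(boundaries, boundaries[1:])]
    return cuts, fragments


if __name__ == "__main__":
    # Toy sequence with two SwaI sites, invented for this example.
    toy = "GGGG" + "ATTTAAAT" + "CCCCCC" + "ATTTAAAT" + "TT"
    cuts, fragments = in_silico_map(toy)
    print("cut positions:", cuts)
    print("fragment lengths (bp):", fragments)
```

The ordered list of fragment lengths is the quantity that would be compared against optical maps. In practice the lengths would typically be converted to kilobases and computed per chromosome.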
