On the Analysis of Optical Mapping Data - University of Wisconsin ...
Software: The SOMA software suite can be used to perform restriction map alignments.
As in sequence alignment, one is often interested in sub-optimal alignments as well, i.e., high-scoring
alignments in addition to the top-scoring one. SOMA is able to find such alignments.
Genspect can be used to visualize alignments reported by SOMA. Figure 1.5 shows a typical
visualization of optical map alignments.
Assembly
The assembly problem can be viewed as a multiple alignment problem, with an additional
step of producing an inferred consensus map. The most successful optical map assembly software
to date is Gentig, based on ideas described in Anantharaman et al. (1997) (for clones)
and Anantharaman et al. (1999) (for genomic DNA). Briefly, the authors develop a Bayesian approach
in which a prior model for the unknown restriction map and a conditional distribution
for optical maps given the true map are used to derive the posterior density for a hypothesized
map. The inferred restriction map is, in principle, the one that maximizes this posterior
density. Due to the complexities of the problem, a complete search is infeasible, and various
heuristics are employed to enable an efficient implementation. We have little to add on the
assembly problem, and refer the reader to the original papers for further details. Gentig
results can also be visualized using Genspect, as shown in Figure 1.6.
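In symbols, the Bayesian formulation described above can be sketched as follows. The notation is ours and is not taken from the original papers: H denotes a hypothesized restriction map, D_1, ..., D_n the observed optical maps, \pi the prior on maps, and f the conditional density of an optical map given the true map. The product form additionally assumes that the optical maps are conditionally independent given the true map, which is a simplifying assumption made here for illustration.

```latex
\[
  p(H \mid D_1, \dots, D_n) \;\propto\; \pi(H) \prod_{j=1}^{n} f(D_j \mid H),
  \qquad
  \hat{H} \;=\; \arg\max_{H} \; \pi(H) \prod_{j=1}^{n} f(D_j \mid H).
\]
```

The space of candidate maps H is far too large to enumerate, which is why the maximization is carried out heuristically rather than by a complete search.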
1.3.5 Example (continued)
The goal of an optical mapping project is to infer the underlying restriction map of the
genome being studied. For small genomes, Gentig serves this purpose well. However, for
large genomes such as GM07535 and CHM (Table 1.1), the sizes of the data sets exceed its
capacity, and new algorithms are required. Fortunately, additional information is available
for these data sets in the form of an in silico reference map, derived from the human genome
sequence by locating instances of the SwaI recognition pattern. The genomes being studied
are largely similar to this reference, so we are primarily interested in how their restriction
maps differ from the reference.
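To make the idea of an in silico reference map concrete, the following minimal Python sketch locates SwaI sites in a DNA string and reports the resulting fragment lengths. It is an illustration only, not the procedure used to build the reference map in this work: the function name in_silico_map and its parameters are hypothetical, the sequence is treated as a single linear molecule held in memory, and ambiguous bases are not handled. The SwaI recognition sequence is ATTTAAAT, cut in the middle to leave blunt ends.

```python
import re

# SwaI recognition sequence (5'-ATTTAAAT-3'). The site is palindromic, so
# searching one strand is enough. The enzyme cuts in the middle of the site
# (blunt ends), i.e. 4 bases after the start of each occurrence.
SWAI_SITE = "ATTTAAAT"
CUT_OFFSET = 4


def in_silico_map(sequence, site=SWAI_SITE, cut_offset=CUT_OFFSET):
    """Return (cut_positions, fragment_lengths) for a linear DNA sequence.

    Hypothetical helper for illustration. Cut positions are 0-based
    offsets of the first base after each cut; lengths are in bases.
    """
    seq = sequence.upper()
    # A zero-width lookahead reports overlapping occurrences as well.
    pattern = re.compile("(?=" + re.escape(site) + ")")
    cuts = [m.start() + cut_offset for m in pattern.finditer(seq)]
    # Fragments are the gaps between consecutive cuts, including the
    # two end fragments of the linear molecule.
    bounds = [0] + cuts + [len(seq)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:])]
    return cuts, fragments


if __name__ == "__main__":
    toy = "GG" + "ATTTAAAT" + "CCCCC" + "ATTTAAAT" + "G"
    cuts, fragments = in_silico_map(toy)
    print(cuts)       # [6, 19]
    print(fragments)  # [6, 13, 5]
```

The ordered list of fragment lengths is the in silico restriction map; for a real chromosome the same computation would be applied to each sequence in turn, with lengths typically reported in kilobases.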