On the Analysis of Optical Mapping Data - University of Wisconsin ...
resulting in an unusually large sizing error for that particular fragment. The image processing
step attempts to control such errors, but they cannot be eliminated entirely.
2.2 Parameter estimation
Estimation of parameters in the stochastic model described above is difficult, but it is
important for several reasons. First, estimates of the parameters are required in certain
fundamental procedures. For example, likelihood ratio based score functions are expressed
in terms of model parameters, and exact values of the parameters are required to completely
define the score. Parameter values are also required for null distributions used in determining
p-values for potential genomic variations (Reslewic et al.). Second, estimates are necessary
in order to simulate optical maps. Due to the complex nature of the data, simulation is
often the only reasonable approach to investigate the operating characteristics of various
inferential procedures, despite the fact that the model may not capture all the variability in
real data. Simulation can also be a useful tool in directing laboratory research, since it can
provide guidance about which aspects of the experiment have the maximum impact on the
final results.
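To make concrete why simulation depends on parameter estimates, the following is a minimal sketch of simulating one optical map from a known restriction map. It does not reproduce the stochastic model referred to above, which is not restated here. It assumes three commonly used error components: each true recognition site is cut independently with probability `p_digest`, false cuts follow a homogeneous Poisson process with rate `false_cut_rate` per kb, and a fragment of true length x is observed with Gaussian sizing error of variance `sizing_sd**2 * x`. The function name, parameter names, and the choice of kb as the unit are all hypothetical.

```python
import random


def simulate_optical_map(true_sites, length, p_digest, false_cut_rate,
                         sizing_sd, rng):
    """Simulate observed fragment lengths (kb) for one molecule.

    Assumed error model (an illustration, not the source's exact model):
      - each true site is cut independently with probability p_digest
      - false cuts form a Poisson process with false_cut_rate cuts per kb
      - a fragment of true length x is observed as Normal(x, sizing_sd^2 * x)
    true_sites must be sorted positions strictly inside (0, length).
    """
    # Missing cuts: keep each true recognition site with probability p_digest.
    cuts = [s for s in true_sites if rng.random() < p_digest]

    # False cuts: exponential gaps generate a homogeneous Poisson process.
    if false_cut_rate > 0:
        pos = rng.expovariate(false_cut_rate)
        while pos < length:
            cuts.append(pos)
            pos += rng.expovariate(false_cut_rate)
    cuts.sort()

    # True fragment lengths between consecutive cuts, including both ends.
    bounds = [0.0] + cuts + [float(length)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:])]

    # Sizing error: standard deviation grows with the square root of length;
    # negative draws are truncated at zero.
    return [max(0.0, rng.gauss(x, sizing_sd * x ** 0.5)) for x in fragments]


if __name__ == "__main__":
    rng = random.Random(1)
    reference_sites = [10.0, 25.0, 40.0, 72.5, 90.0]  # made-up positions
    print(simulate_optical_map(reference_sites, 100.0, 0.8, 0.005, 0.3, rng))
```

Every argument other than the reference map itself is a model parameter that would have to be estimated from data before such a simulation could be considered realistic.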
Difficulty: The difficulty in estimation arises primarily because the true restriction map
is rarely known. Even for optical maps from genomes whose sequence (and hence restriction
map) is completely known, the correspondence between cut sites in observed optical maps
and recognition sites in the true restriction map is never known with certainty. In fact,
inferring this correspondence is precisely the goal of alignment. One possibility is to assume
the correctness of alignments that are declared to be statistically significant, and then use
these alignments for estimation. We will briefly discuss such methods, noting that the
resulting estimates are likely to be biased. A secondary difficulty in estimation is due to the
fact that the parameters may not remain constant over the course of an experiment. This is
difficult to address, and we can only assume that the changes are not substantial enough to