29.07.2014 Views

On the Analysis of Optical Mapping Data - University of Wisconsin ...



resulting in an unusually large sizing error for that particular fragment. The image processing step attempts to control such errors, but they cannot be eliminated entirely.

2.2 Parameter estimation

Estimation of parameters in the stochastic model described above is difficult, but it is important for several reasons. First, estimates of the parameters are required in certain fundamental procedures. For example, likelihood ratio based score functions are expressed in terms of model parameters, and exact values of the parameters are required to completely define the score. Parameter values are also required for null distributions used in determining p-values for potential genomic variations (Reslewic et al.). Second, estimates are necessary in order to simulate optical maps. Due to the complex nature of the data, simulation is often the only reasonable approach to investigate the operating characteristics of various inferential procedures, despite the fact that the model may not capture all the variability in real data. Simulation can also be a useful tool in directing laboratory research, since it can provide guidance about which aspects of the experiment have the maximum impact on the final results.
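To make the role of parameter estimates in simulation concrete, the following is a minimal sketch of how a single optical map might be simulated once parameter values are fixed. It does not reproduce the stochastic model of this chapter. It assumes a simplified, commonly used error model: each true recognition site is cut independently with probability `p_digest`, false cuts arrive as a Poisson process with rate `false_cut_rate` per kb, and each fragment is measured with Gaussian error whose standard deviation is `sizing_sd` times the square root of its true length. The function name, the parameter names, and the default values are all hypothetical.

```python
import random


def simulate_optical_map(true_sites, length, p_digest=0.8,
                         false_cut_rate=0.005, sizing_sd=0.3, rng=None):
    """Simulate observed fragment sizes (kb) for one molecule.

    true_sites: sorted recognition-site positions (kb) in (0, length).
    All parameters and the error model itself are illustrative assumptions.
    """
    rng = rng or random.Random()

    # Missing cuts: each true site is observed with probability p_digest.
    cuts = [s for s in true_sites if rng.random() < p_digest]

    # False cuts: a homogeneous Poisson process along the molecule,
    # generated from exponential inter-arrival distances.
    if false_cut_rate > 0:
        pos = rng.expovariate(false_cut_rate)
        while pos < length:
            cuts.append(pos)
            pos += rng.expovariate(false_cut_rate)
    cuts.sort()

    # True fragment lengths between consecutive observed cut positions.
    bounds = [0.0] + cuts + [float(length)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:])]

    # Sizing error: Gaussian noise with sd proportional to sqrt(length);
    # sizes are floored at a small positive value to stay physical.
    return [max(0.01, rng.gauss(x, sizing_sd * x ** 0.5)) for x in fragments]


if __name__ == "__main__":
    sites = [10, 25, 40, 72, 90]
    print(simulate_optical_map(sites, 100, rng=random.Random(1)))
```

Repeating such a simulation over many molecules, with parameter values set to their estimates, gives a way to study how an inferential procedure behaves when the truth is known. The conclusions are only as reliable as the assumed model and the estimates plugged into it.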

Difficulty: The difficulty in estimation arises primarily because the true restriction map is rarely known. Even for optical maps from genomes whose sequence (and hence restriction map) is completely known, the correspondence between cut sites in observed optical maps and recognition sites in the true restriction map is never known with certainty. In fact, inferring this correspondence is precisely the goal of alignment. One possibility is to assume the correctness of alignments that are declared to be statistically significant, and then use these alignments for estimation. We will briefly discuss such methods, noting that the resulting estimates are likely to be biased. A secondary difficulty in estimation is due to the fact that the parameters may not remain constant over the course of an experiment. This is difficult to address, and we can only assume that the changes are not substantial enough to
