On the Analysis of Optical Mapping Data - University of Wisconsin ...
affect inference. If necessary, the degree of change can be assessed by dividing the data by
time of collection and comparing estimates across time periods.
To illustrate estimation techniques, we use the GM07535 data set and the in silico reference
map derived from the human genome sequence (Build 35).
2.2.1 Desorption<br />
Random truncation: We begin with the estimation of desorption rates, since this can
be achieved without alignments, under certain assumptions. For a fragment in the true
restriction map spanned by an observed optical map, let Z be a random variable indicating
whether that particular fragment was observed, and let Y be its measured length had it been
observed. The desorption rate is quantified by the probability that a fragment is observed
(not truncated), given by
\[
\pi(y) = P(Z = 1 \mid Y = y)
\]
Suppose that the marginal density of the unobserved (pre-truncation) random variable Y is
g. Let X represent the length of an observed (truncated version of Y) fragment, i.e. X = Y
if Z = 1. Then, the marginal density of X is given by
\[
h(x) = \frac{1}{K} \, \pi(x) \, g(x)
\]
where K is the normalizing constant
\[
K = \int_0^\infty \pi(t) \, g(t) \, dt
\]
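The truncation model can be checked numerically. The following is a minimal sketch, not the estimation procedure used here: it assumes an exponential density g with a hypothetical mean of 20 Kb and a hypothetical observation probability π(y) = 1 − exp(−y/3), neither of which comes from the data. It computes K by the trapezoidal rule and compares it with the fraction of simulated fragments that survive truncation, using the fact that K = E[π(Y)] = P(Z = 1).

```python
import math
import random

MEAN_LEN = 20.0  # hypothetical mean of Y in Kb (assumption, not from the data)


def g(y):
    """Assumed pre-truncation density of Y: exponential with mean MEAN_LEN."""
    return math.exp(-y / MEAN_LEN) / MEAN_LEN


def pi(y):
    """Assumed observation probability P(Z = 1 | Y = y); increases towards 1."""
    return 1.0 - math.exp(-y / 3.0)


def normalizing_constant(upper=400.0, steps=40000):
    """Approximate K = integral of pi(t) g(t) dt over [0, upper] (trapezoidal rule)."""
    dt = upper / steps
    total = 0.5 * (pi(0.0) * g(0.0) + pi(upper) * g(upper))
    for i in range(1, steps):
        t = i * dt
        total += pi(t) * g(t)
    return total * dt


def h(x, K):
    """Density of an observed fragment length X: pi(x) g(x) / K."""
    return pi(x) * g(x) / K


def simulate_observed(n, rng):
    """Draw Y from g and keep each draw with probability pi(Y), i.e. when Z = 1."""
    kept = []
    for _ in range(n):
        y = rng.expovariate(1.0 / MEAN_LEN)
        if rng.random() < pi(y):
            kept.append(y)
    return kept


rng = random.Random(0)
K = normalizing_constant()
n = 200000
observed = simulate_observed(n, rng)
retained = len(observed) / n  # Monte Carlo estimate of P(Z = 1), which equals K
```

Under these assumed forms K has the closed form 20/23, so both the numerical integral and the retained fraction should be close to 0.87. A histogram of `observed` can be compared against `h(x, K)` in the same way.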
As formulated, π(·) is only identifiable up to scale, since π′(y) := α π(y) with 0 < α ≤ 1 induces the
same h from g. Desorption is known to affect small fragments only, so we may additionally
assume that π(y) → 1 as y → ∞, making π(·) identifiable. Empirically, π(y) = 1 for y > 15 Kb.
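The scale argument can be spelled out in one line. The derivation below is a sketch that uses only the definitions of h and K given earlier; the symbol h′ for the density induced by π′ is introduced here for illustration.

```latex
\[
h'(x)
  = \frac{\pi'(x)\, g(x)}{\int_0^\infty \pi'(t)\, g(t)\, dt}
  = \frac{\alpha\, \pi(x)\, g(x)}{\alpha \int_0^\infty \pi(t)\, g(t)\, dt}
  = \frac{\pi(x)\, g(x)}{K}
  = h(x)
\]
```

The factor α cancels, so the observed lengths cannot distinguish π from α π. Under the limit constraint, α π(y) tends to α as y grows, which forces α = 1.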
Length distribution: Let us consider the distribution of the unobserved random variable
Y. Suppose the true recognition sites are realizations of a homogeneous Poisson process with
rate θ, true recognition sites are observed independently with probability p, and false cuts