On the Analysis of Optical Mapping Data - University of Wisconsin ...
[Figure 3.5. Panels: Number of fragments, Length (Kb), Regression Fit. Vertical axis: Mean spurious score. Legend: Counts.]
Figure 3.5 Parametric models for $\mu(M)$. The average of four spurious scores for each map is plotted against the number of fragments $N$, the length $L$, and the fitted values from a linear model with terms $N$, $L$ and their product $NL$. The multiple regression model explains more of the variability, and also suggests better symmetry.
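The linear model named in the caption can be illustrated with a minimal sketch. It assumes ordinary least squares on an intercept plus $N$, $L$ and $NL$; the function names `fit_ols` and `design_full` and the synthetic data are hypothetical stand-ins, not the actual maps or code used here.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares; returns coefficients, fitted values and R^2."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    return coef, fitted, r2

def design_full(N, L):
    """Design matrix with intercept, N, L and the product NL."""
    N = np.asarray(N, dtype=float)
    L = np.asarray(L, dtype=float)
    return np.column_stack([np.ones_like(N), N, L, N * L])

# Synthetic stand-in data (hypothetical values, not the GM07535 maps).
rng = np.random.default_rng(0)
N = rng.integers(10, 70, size=500).astype(float)   # number of fragments
L = rng.uniform(300.0, 1000.0, size=500)           # length in Kb
y = -2.0 - 0.2 * N - 0.01 * L + 1e-4 * N * L + rng.normal(0.0, 1.0, size=500)

# Compare a single-predictor fit with the multiple regression fit.
_, _, r2_N = fit_ols(np.column_stack([np.ones_like(N), N]), y)
_, _, r2_full = fit_ols(design_full(N, L), y)
print(f"R^2 with N only: {r2_N:.3f}; with N, L, NL: {r2_full:.3f}")
```

Because the single-predictor model is nested in the full model, the full fit can never explain less variability on the same data; the figure's claim is that the gain is substantial for the real maps.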
3.5 demonstrates the utility of this approach. As before, a generalized least squares model with standard deviation linear in the fitted values is more appropriate, giving standardized test statistics
$$T_2(M) = \frac{S(\tilde{G} \mid M) - \tilde{\mu}(M)}{\hat{\delta}_2 - \tilde{\mu}(M)}$$
where $\tilde{\mu}(M)$ are the fitted responses.
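As a rough illustration of how such a standardized statistic might be computed, the sketch below alternates a weighted least squares fit with an estimate of the offset in the standard deviation. The estimator of $\delta_2$ (regressing absolute residuals on fitted values) is an assumption made for this example only, as are all function names; the text does not specify how the parameters were actually estimated.

```python
import numpy as np

def estimate_delta(fitted, resid):
    """Estimate delta assuming sd(resid) is proportional to (delta - fitted).

    Regressing |resid| on fitted gives |resid| ~ a + b * fitted, which equals
    -b * (delta - fitted) with delta = -a / b. Illustrative assumption only.
    """
    fitted = np.asarray(fitted, dtype=float)
    A = np.column_stack([np.ones_like(fitted), fitted])
    (a, b), *_ = np.linalg.lstsq(A, np.abs(resid), rcond=None)
    return -a / b

def weighted_fit(X, y, sd):
    """Weighted least squares: scale each row by 1/sd, then solve by OLS."""
    coef, *_ = np.linalg.lstsq(X / sd[:, None], y / sd, rcond=None)
    return coef

def fit_null_model(N, L, y, n_iter=3):
    """Alternate between fitting coefficients and re-estimating delta."""
    N, L, y = (np.asarray(v, dtype=float) for v in (N, L, y))
    X = np.column_stack([np.ones_like(N), N, L, N * L])
    sd = np.ones_like(y)  # first pass is plain OLS
    for _ in range(n_iter):
        coef = weighted_fit(X, y, sd)
        fitted = X @ coef
        delta = estimate_delta(fitted, y - fitted)
        sd = np.maximum(delta - fitted, 1e-8)  # guard against non-positive sd
    return coef, delta

def t2_statistic(score, N, L, coef, delta):
    """Standardized statistic (score - fitted mean) / (delta - fitted mean)."""
    N = np.asarray(N, dtype=float)
    L = np.asarray(L, dtype=float)
    mu = coef[0] + coef[1] * N + coef[2] * L + coef[3] * N * L
    return (np.asarray(score, dtype=float) - mu) / (delta - mu)
```

In this sketch `fit_null_model` would be run once on spurious scores from a permuted reference, after which `t2_statistic` needs only the score of a map against the real reference together with its $N$ and $L$.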
Comparison: Table 3.1 summarizes the results from both approaches. Specifically, the mean spurious scores for each of the 206796 GM07535 maps were estimated using $n = 4$ permutations of the reference. A fifth permutation was used for parameter estimation: $\delta_1$ in the direct approach, $\delta_2$ and the regression coefficients in the regression approach. A sixth permutation was used to sample from the null distributions, and 99% and 99.9% cutoffs were determined by the appropriate quantiles of these samples of size 206796. The two approaches largely agree in both cases. For aligning a future map, the regression method is of more practical value, as it would require only one alignment to $\tilde{G}$, whereas the direct method would require additional alignments to several permuted references to estimate $\mu(M)$.
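The cutoff step described above amounts to taking empirical quantiles of the null sample. A minimal sketch follows, assuming that larger standardized statistics indicate stronger evidence against a spurious alignment (which is what upper quantiles imply); `null_cutoffs` and `exceeds_cutoff` are hypothetical names, and the null sample here is simulated rather than taken from a permuted reference.

```python
import numpy as np

def null_cutoffs(null_stats, levels=(0.99, 0.999)):
    """Empirical quantiles of test statistics computed under the null."""
    null_stats = np.asarray(null_stats, dtype=float)
    return {lvl: float(np.quantile(null_stats, lvl)) for lvl in levels}

def exceeds_cutoff(stats, cutoff):
    """Flag statistics that lie above the chosen null cutoff."""
    return np.asarray(stats, dtype=float) > cutoff

# Simulated null sample of the same size as the map collection (stand-in data).
rng = np.random.default_rng(0)
null_sample = rng.normal(size=206796)
cuts = null_cutoffs(null_sample)
flags = exceeds_cutoff([0.5, 4.0], cuts[0.99])
```

With a null sample of this size, even the 99.9% quantile rests on roughly two hundred observations in the upper tail, so the empirical cutoffs are reasonably stable without a parametric assumption.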