On the Analysis of Optical Mapping Data - University of Wisconsin ...

Figure 3.3 [Q-Q plots of quantiles of difference against theoretical quantiles; panels: Normal, Logistic.] The distribution of ε(G) induces a distribution of the difference between two independent realizations of the best spurious score for a map. This distribution can be compared to observed data to indirectly check models for ε(G). The Q-Q plots here suggest that a logistic distribution for the differences (induced by an extreme value distribution for ε) is a better fit than a normal distribution (induced when the ε's are normal).

Figure 3.4 [Plot of absolute deviation from average against average spurious score; legend: counts.] Variance of errors. µ(M) is estimated by the average spurious score against four permutations from P_G. Absolute deviations of scores against a fifth permutation are plotted against these averages. The LOESS smooth suggests that the standard deviation of the errors is a linear function of the average spurious score.
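The link between the two error models and the two reference distributions in Figure 3.3 can be checked numerically. The following is a minimal sketch, not the thesis code: it relies on the standard fact that the difference of two independent, identically distributed Gumbel (type I extreme value) variables is logistic, while the difference of two independent normals is normal. The function names, the sample size, the seed, and the choice of a mean absolute Q-Q gap as a summary are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def gumbel(rng):
    """Draw one standard Gumbel variate by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def logistic_quantile(p):
    """Quantile of a logistic distribution rescaled to unit standard deviation."""
    return (math.sqrt(3) / math.pi) * math.log(p / (1 - p))


def mean_qq_gap(sample, quantile_fn):
    """Mean |standardized sample quantile - theoretical quantile| over all points."""
    xs = sorted(sample)
    n = len(xs)
    center, sd = fmean(xs), pstdev(xs)
    return fmean(
        abs((x - center) / sd - quantile_fn((i + 0.5) / n))
        for i, x in enumerate(xs)
    )


rng = random.Random(0)  # fixed seed, illustration only
n = 20000

# Differences of two independent realizations under each error model.
gumbel_diffs = [gumbel(rng) - gumbel(rng) for _ in range(n)]
normal_diffs = [rng.gauss(0, 1) - rng.gauss(0, 1) for _ in range(n)]

normal_quantile = NormalDist().inv_cdf
gap_logistic = mean_qq_gap(gumbel_diffs, logistic_quantile)
gap_normal = mean_qq_gap(gumbel_diffs, normal_quantile)
gap_normal_on_normal = mean_qq_gap(normal_diffs, normal_quantile)
gap_logistic_on_normal = mean_qq_gap(normal_diffs, logistic_quantile)

print("Gumbel errors: gap vs logistic", gap_logistic, "vs normal", gap_normal)
print("Normal errors: gap vs normal", gap_normal_on_normal,
      "vs logistic", gap_logistic_on_normal)
```

A smaller gap corresponds to a Q-Q plot that lies closer to the reference line, so this comparison plays the same role as the visual check in the figure.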
Figure 3.5 [Panels: mean spurious score against number of fragments, length (Kb), and regression fit; legend: counts.] Parametric models for µ(M). The average of four spurious scores for each map is plotted against the number of fragments N, the length L, and the fitted values from a linear model with terms N, L and their product NL. The multiple regression model explains more of the variability, and also suggests better symmetry.

Figure 3.5 demonstrates the utility of this approach. As before, a generalized least squares model with standard deviation linear in the fitted values is more appropriate, giving standardized test statistics

$$T_2(M) = \frac{S(\tilde{G} \mid M) - \tilde{\mu}(M)}{\hat{\delta}_2 - \tilde{\mu}(M)}$$

where $\tilde{\mu}(M)$ are the fitted responses.

Comparison: Table 3.1 summarizes the results from both approaches. Specifically, the mean spurious scores for each of the 206796 GM07535 maps were estimated using n = 4 permutations of the reference. A fifth permutation was used for parameter estimation: δ_1 in the direct approach, and δ_2 together with the regression coefficients in the regression approach. A sixth permutation was used to sample from the null distributions, and 99% and 99.9% cutoffs were determined by the appropriate quantiles of these samples of size 206796. The two approaches largely agree in both cases. For aligning a future map, the regression method is of more practical value, as it would require only one alignment to $\tilde{G}$, whereas the direct method would require additional alignments to several permuted references to estimate µ(M).
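The regression approach described above can be sketched on synthetic data as follows. This is an illustration under stated assumptions, not the procedure used in the thesis: the coefficients, the value of δ, the noise level, and the map sizes are invented; ordinary least squares stands in for the generalized least squares fit; δ is estimated by a hypothetical shortcut (the x-intercept of a straight line fitted to absolute residuals against fitted values, which is consistent with a standard deviation linear in the mean); and the cutoffs are taken as upper quantiles of the null sample.

```python
import random


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def design(n_frag, length):
    """Regression terms named in the text: intercept, N, L and the product NL."""
    return [1.0, n_frag, length, n_frag * length]


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (a simplification of GLS)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, row):
    return sum(b * x for b, x in zip(beta, row))


def estimate_delta(fitted, scores):
    """Hypothetical estimator: x-intercept of a line fitted to |residual| vs fitted."""
    abs_res = [abs(s - f) for s, f in zip(scores, fitted)]
    n = len(fitted)
    mf, ma = sum(fitted) / n, sum(abs_res) / n
    slope = (sum((f - mf) * (a - ma) for f, a in zip(fitted, abs_res))
             / sum((f - mf) ** 2 for f in fitted))
    return -(ma - slope * mf) / slope


def t2(score, mu, delta):
    """Standardized statistic T_2(M) = (S - mu) / (delta - mu)."""
    return (score - mu) / (delta - mu)


def upper_quantile(values, p):
    xs = sorted(values)
    return xs[min(len(xs) - 1, int(p * len(xs)))]


# Synthetic stand-ins: all numbers below are invented for illustration.
rng = random.Random(1)
TRUE_BETA = [-2.0, -0.3, -0.5, -0.02]
TRUE_DELTA = 5.0
maps = [(rng.randint(5, 60), rng.uniform(2.0, 10.0)) for _ in range(5000)]  # L in 100 Kb
rows = [design(n_frag, length) for n_frag, length in maps]


def spurious_score(row):
    """Simulated spurious score with standard deviation linear in the mean."""
    mu = predict(TRUE_BETA, row)
    return mu + 0.1 * (TRUE_DELTA - mu) * rng.gauss(0, 1)


fifth = [spurious_score(r) for r in rows]  # parameter estimation
sixth = [spurious_score(r) for r in rows]  # null sample

beta_hat = fit_ols(rows, fifth)
fitted = [predict(beta_hat, r) for r in rows]
delta_hat = estimate_delta(fitted, fifth)

null_t = [t2(s, f, delta_hat) for s, f in zip(sixth, fitted)]
cut99 = upper_quantile(null_t, 0.99)
cut999 = upper_quantile(null_t, 0.999)
print("delta_hat:", delta_hat, "99% cutoff:", cut99, "99.9% cutoff:", cut999)
```

Once the coefficients and δ have been estimated, scoring a new map in this sketch needs only its N, its L, and a single alignment score, which mirrors the practical advantage noted in the text.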
