29.07.2014 Views

On the Analysis of Optical Mapping Data - University of Wisconsin ...



[Figure 3.11: three panels plotting Density (0.000 to 0.015) against Location (Mb), with x-axis ranges 75 to 105, 50 to 75, and 20 to 50. Curves: "Actual alignments" and "Predicted by model".]

Figure 3.11 Estimated thinning rates. The data are approximately 10,000 simulated maps from human chromosome 14. The first curve is the kernel density estimate of locations obtained from alignments declared significant. The second curve is the density of the true locations of the same simulated maps, but with weights given by model (3.2).

The fitted model was then used to estimate P(aligned | M) for a new set of maps simulated from chromosome 14, which were also actually aligned. Figure 3.11 compares the kernel density estimate obtained from the aligned locations with the estimated density of the true locations of all simulated maps, weighted according to model (3.2). The densities estimated by the two methods are very close, suggesting that we can do away with the alignment step without substantial drawbacks.
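The comparison described above can be illustrated with a minimal sketch of a weighted Gaussian kernel density estimate. Everything in it is an assumption made for illustration: the function name `weighted_kde`, the bandwidth, and the toy locations and probabilities are hypothetical, the `p_aligned` values merely stand in for the output of model (3.2), which is not reproduced here, and the aligned locations are taken to equal the true locations for simplicity.

```python
import math

def weighted_kde(points, weights, grid, bandwidth):
    """Weighted Gaussian kernel density estimate evaluated at each grid point."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = total * bandwidth * math.sqrt(2.0 * math.pi)
    density = []
    for x in grid:
        s = 0.0
        for p, w in zip(points, weights):
            z = (x - p) / bandwidth
            s += w * math.exp(-0.5 * z * z)
        density.append(s / norm)
    return density

# Hypothetical toy data (not from the thesis): true map locations in Mb,
# an assumed P(aligned | M) for each map, and whether it was actually aligned.
true_loc = [22.0, 25.5, 31.0, 38.2, 44.7, 47.1]
p_aligned = [0.9, 0.2, 0.8, 0.7, 0.1, 0.6]
was_aligned = [True, False, True, True, False, True]

grid = [20.0 + 0.5 * i for i in range(61)]  # 20 to 50 Mb in 0.5 Mb steps
h = 2.0                                     # assumed bandwidth in Mb

# Curve 1: unweighted KDE of the locations of maps that were aligned.
aligned_loc = [x for x, a in zip(true_loc, was_aligned) if a]
kde_aligned = weighted_kde(aligned_loc, [1.0] * len(aligned_loc), grid, h)

# Curve 2: KDE of all true locations, weighted by the model probabilities.
kde_model = weighted_kde(true_loc, p_aligned, grid, h)

# Largest pointwise difference between the two curves on the grid.
max_gap = max(abs(a - b) for a, b in zip(kde_aligned, kde_model))
```

If the modelled probabilities are well calibrated, the two curves should nearly coincide and `max_gap` should be small relative to the density scale; with real data the second curve requires no alignment at all.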

The calibration provided by (3.2) can also help in the preliminary filtering of optical maps. Currently, it is common to remove maps shorter than a certain length (typically 300 Kb) from the analysis entirely, as they are expected to carry little information. Our observations suggest that ψ(M) is a better quantity on which to base this decision. This is also related to our earlier discussion motivated by a comparison of Figures 3.8 and 3.9. The subset of maps that have a high probability of being aligned based on ψ(M), but are not actually aligned to the reference, is likely to contain a higher proportion of maps that originate from regions of the genome not represented in the reference copy.
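The two filtering rules discussed above, and the idea of flagging high-ψ(M) maps that fail to align, can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the function names, and the ψ thresholds (0.5 and 0.9) are hypothetical, and the `psi` values are made up rather than computed from model (3.2). Only the 300 Kb length cutoff comes from the text.

```python
def length_filter(maps, min_length_kb=300.0):
    """Conventional rule: keep maps at least `min_length_kb` long."""
    return [m for m in maps if m["length_kb"] >= min_length_kb]

def psi_filter(maps, min_psi=0.5):
    """Alternative rule: keep maps whose psi(M) is at least `min_psi` (assumed cutoff)."""
    return [m for m in maps if m["psi"] >= min_psi]

def unaligned_high_psi(maps, min_psi=0.9):
    """Maps expected to align (high psi) that nevertheless did not align."""
    return [m for m in maps if m["psi"] >= min_psi and not m["aligned"]]

# Hypothetical records; psi is an assumed alignment probability, not a model output.
maps = [
    {"id": "m1", "length_kb": 250.0, "psi": 0.85, "aligned": True},
    {"id": "m2", "length_kb": 420.0, "psi": 0.30, "aligned": False},
    {"id": "m3", "length_kb": 510.0, "psi": 0.95, "aligned": False},
    {"id": "m4", "length_kb": 380.0, "psi": 0.92, "aligned": True},
]

kept_by_length = [m["id"] for m in length_filter(maps)]
kept_by_psi = [m["id"] for m in psi_filter(maps)]
flagged = [m["id"] for m in unaligned_high_psi(maps)]
```

In this toy input the length rule discards the short but informative map `m1` and keeps the long but uninformative `m2`, while the ψ-based rule does the opposite; `m3` is flagged as a candidate for a region absent from the reference.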

3.4.3 Other topics

Choice of null hypothesis: Independence of M and G̃ is not necessarily the obvious hypothesis to test when determining significance. It is not unlikely for an optical map,
