02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal


Table 3.1: Down selection numbers. Number of ProbeSets giving values in the down-selected gene lists that result from applying Welch's t-test to the output of each probe cleansing method, per dataset, with the Base Model giving the original size of each dataset. Each of the down-selected lists is then used as input into the 3 types of models. All of these data are provided in the Supplementary Material Data Folder.

                                     RMA       dCHIP     BaFL
Base Model                           12,625    12,625    4,200
Stearman DE                          5,291     5,208     3,344
Bhattacharjee DE                     6,595     6,429     480
Cross-Experiment Intersection !DE    3,761     3,407     325
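The down selection summarized in Table 3.1 can be illustrated with a minimal sketch of Welch's t-test applied per ProbeSet. This is not the code used to produce the table: the function names (`welch_t`, `two_sided_p`, `down_select`), the nested-dictionary data layout, the toy expression values, and the 0.05 significance threshold are all assumptions made for illustration, and the p-value is obtained by numerically integrating the t density rather than by a statistics library.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples.

    Assumes each sample has at least two values and that the two
    sample variances are not both zero.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def down_select(expr, group_a, group_b, alpha=0.05):
    """Keep ProbeSets whose Welch p-value is below alpha.

    expr maps probeset_id -> {sample_id: value}; alpha is an assumed threshold.
    """
    kept = {}
    for probeset_id, values in expr.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        p = two_sided_p(*welch_t(a, b))
        if p < alpha:
            kept[probeset_id] = p
    return kept


# Toy data (made-up values, hypothetical ProbeSet and sample identifiers).
group_a = ["a1", "a2", "a3", "a4", "a5"]
group_b = ["b1", "b2", "b3", "b4", "b5"]
toy_expr = {
    "probeset_1": dict(zip(group_a + group_b, [1, 2, 3, 4, 5, 11, 12, 13, 14, 15])),
    "probeset_2": dict(zip(group_a + group_b, [1, 2, 3, 4, 5, 2, 3, 4, 5, 6])),
}
print(sorted(down_select(toy_expr, group_a, group_b)))
```

Counting the keys returned by `down_select` for each cleansing method and dataset would give one cell of a table laid out like Table 3.1, under the assumptions stated above.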

Figure 3.1 presents the p-value kernel densities resulting from each of the probe cleansing methods, for the Bhattacharjee data and for normally distributed random sampling. The top left graph estimates the probability distribution (default Gaussian smoothing) of all the p-values for the three methods, with respect to the random population. The top right graph presents the quantiles for the three methods, with respect to the random population. While the RMA and dCHIP kernels appear to be more normal, they also demonstrate a large, exaggerated skew, and their quantiles deviate from the expected values. The skewed tail represents the population of null p-values, as shown in the lower left graph, with the accompanying quantiles presented in the lower right graph. This disproportion in the RMA and dCHIP t-test results is associated with the skewed batch intensity distributions we presented in Chapter 2. In stark contrast, the BaFL p-value density demonstrates a skew for the upper quantiles that becomes more pronounced for the null p-values [39]. These graphs explain the observed weakness of the t-test for RMA and dCHIP and are intriguing for the BaFL hypothesis tests, since there seems to have been an increase in the power of the t-test. Is this increase due to the BaFL cleansing process or a result of bias in the dataset?
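The two kinds of graph described for Figure 3.1, a Gaussian-smoothed kernel density of the p-values and a quantile comparison against a random reference population, can be sketched as follows. This is an illustration only, not the procedure behind the figure: the function names (`gaussian_kde`, `qq_pairs`), the normal-reference bandwidth rule, the use of deciles, and the simulated p-values are assumptions. In particular, the reference sample is drawn uniformly on [0, 1], on the assumption that the "random population" consists of p-values from tests on random data with no true group difference, which are approximately uniform.

```python
import random
from statistics import NormalDist, quantiles, stdev


def gaussian_kde(sample, bandwidth=None):
    """Return a function that estimates the density of `sample`.

    Each observation contributes one Gaussian kernel. The default bandwidth
    is a normal-reference rule of thumb (an assumption, not necessarily the
    default used for Figure 3.1).
    """
    n = len(sample)
    if bandwidth is None:
        bandwidth = 1.06 * stdev(sample) * n ** (-1 / 5)
    kernel = NormalDist(0.0, bandwidth)

    def density(x):
        return sum(kernel.pdf(x - xi) for xi in sample) / n

    return density


def qq_pairs(observed, reference, n=10):
    """Pair matching quantile cut points as (reference, observed) tuples.

    With n=10 this returns the nine decile cut points of each sample.
    """
    return list(zip(quantiles(reference, n=n), quantiles(observed, n=n)))


# Simulated p-values (made up): a uniform reference and a sample piled up
# near zero, standing in for a method with many small p-values.
rng = random.Random(0)
reference_p = [rng.random() for _ in range(200)]
observed_p = [rng.random() ** 3 for _ in range(200)]

density = gaussian_kde(observed_p)
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"density at p={x:.2f}: {density(x):.3f}")

for ref_q, obs_q in qq_pairs(observed_p, reference_p):
    print(f"reference quantile {ref_q:.3f} vs observed quantile {obs_q:.3f}")
```

If the observed p-values followed the reference population, the paired quantiles would lie close to the identity line; systematic departures from that line correspond to the skew discussed above.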
