Sample A: Cover Page of Thesis, Project, or Dissertation Proposal
Table 3.1: Down selection numbers. Number of ProbeSets giving values in the down-selected gene lists
that result from applying Welch’s t-test to the output of each probe cleansing method, per dataset, with the
Base Model giving the original size of each dataset. Each of the down-selected lists is then used as input
into the 3 types of models. All of these data are provided in the Supplementary Material Data Folder.
                                    RMA      dCHIP    BaFL
Base Model                          12,625   12,625   4,200
Stearman DE                         5,291    5,208    3,344
Bhattacharjee DE                    6,595    6,429    480
Cross-Experiment Intersection !DE   3,761    3,407    325
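The down selection summarized in Table 3.1 can be illustrated with a minimal sketch of Welch’s t-test applied per ProbeSet. This is not the code used to produce the table: the function names, the input layout (a dictionary of per-sample values for each ProbeSet), and the 0.05 cutoff are all assumptions made for illustration, and the p-value is obtained by simple numerical integration of the t density so that the sketch needs only the standard library. In practice a library routine such as `scipy.stats.ttest_ind(..., equal_var=False)` would normally be used instead.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test.

    Assumes each group has at least two values and non-zero total variance.
    """
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    x = abs(t)
    if x == 0.0:
        return 1.0
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def down_select(expr, group_a, group_b, alpha=0.05):
    """Keep ProbeSet IDs whose Welch p-value is below alpha (assumed cutoff).

    expr maps ProbeSet ID -> {sample ID -> expression value} (hypothetical layout).
    """
    keep = []
    for probeset_id, values in expr.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        t, df = welch_t(a, b)
        if two_sided_p(t, df) < alpha:
            keep.append(probeset_id)
    return keep
```

Calling `down_select` once per cleansing method and dataset, and taking the length of each returned list, would give counts of the kind reported in the table rows.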
Figure 3.1 presents the p-value kernel densities resulting from each of the probe cleansing
methods, for the Bhattacharjee data and for normally distributed random sampling. The top left graph
estimates the probability distribution (default Gaussian smoothing) of all the p-values for the three
methods, with respect to the random population. The top right graph presents the quantiles for
the three methods, with respect to the random population. While the RMA and dCHIP kernels
appear to be more normal, they also demonstrate a large, exaggerated skew, and their
quantiles deviate from the expected values. The skewed tail represents the population of null
p-values, as shown in the lower left graph, with the accompanying quantiles presented in the lower
right graph. This disproportion in the RMA and dCHIP t-test results is
associated with the skewed batch intensity distributions we presented in Chapter 2. In stark
contrast, the BaFL p-value density demonstrates a skew for the upper quantiles that becomes
more pronounced for the null p-values [39]. These graphs explain the observed weakness of the
t-test for RMA and dCHIP and are intriguing for the BaFL hypothesis tests, since there seems to
have been an increase in the power of the t-test. Is this increase due to the BaFL cleansing
process or a result of bias in the dataset?
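The two kinds of panel described for Figure 3.1, a Gaussian kernel density of the p-values and a quantile comparison against a random reference population, can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which software or bandwidth produced the "default Gaussian smoothing", so the rule-of-thumb bandwidth below (Silverman's rule) is an assumed stand-in, and all function names are hypothetical.

```python
import math
from statistics import quantiles, stdev


def silverman_bandwidth(xs):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5).

    Assumed default; requires data with non-zero spread.
    """
    q1, _, q3 = quantiles(xs, n=4)
    spread = min(stdev(xs), (q3 - q1) / 1.34)
    return 0.9 * spread * len(xs) ** -0.2


def gaussian_kde(xs, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of xs at each grid point."""
    h = bandwidth or silverman_bandwidth(xs)
    norm = 1.0 / (len(xs) * h * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in xs)
        for g in grid
    ]


def qq_pairs(pvals, reference, n=20):
    """Pair the quantile cut points of the reference and observed p-values.

    Each tuple is (reference quantile, observed quantile); points far from
    the diagonal indicate a departure from the reference population.
    """
    return list(zip(quantiles(reference, n=n), quantiles(pvals, n=n)))
```

For example, passing the p-values from one cleansing method as `pvals` and the p-values obtained from t-tests on randomly sampled normal data as `reference` gives the points of a quantile plot, while `gaussian_kde` evaluated on a grid spanning [0, 1] gives the corresponding density curve.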