Sample A: Cover Page of Thesis, Project, or Dissertation Proposal
fold decrease in the number of candidate genes is important for diagnostic applications [6, 41]. It
is also useful that the classification does not depend strongly on any particular model.
Most notably, we see very good cross-experiment performance, even with quite limited candidate
gene lists, a rare achievement with microarray data.
Author’s List (Validation)
Of the 72 AUC scores recorded for classification performance, the values generated by the BaFL
pipeline, when used for candidate gene lists, achieved 5 of the 6 best overall scores (87-97%) across the
three classification algorithms, although not all of these differences are significant given the variance in the
results. The sixth case, and the sole exception, was the kNN result when the Bhattacharjee DE candidate
genes, using the RMA-generated values, were used to predict the Stearman data (93.88%). In
addition, the BaFL-defined DE ProbeSets achieve the highest AUC for almost every individual
analysis of the BaFL intersecting DE and Stearman list; again, the sole exception is the kNN
implementation of Bhattacharjee predicting Stearman. When performance is averaged across
the lists per cleansing routine, values obtained using the BaFL method achieved the highest
performance 5 of 6 times (bottom row of Figures 3.4 and 3.5). Equal or improved performance
across experiments, with less variability and smaller candidate gene lists, and low sensitivity to
the model, are diagnostic goals that the BaFL method appears to be well positioned to achieve.
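
The per-routine averaging described above can be illustrated with a minimal sketch. It is not the code used to produce the reported scores. It assumes that AUC is computed in its pairwise (Mann-Whitney) form and that each result is stored as a (pipeline, gene list, AUC) record. The function names, the pipeline labels `pipeline_a` and `pipeline_b`, and all numbers are hypothetical placeholders, not values from this study.

```python
from collections import defaultdict
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    labels: 0/1 class labels; scores: classifier scores, where a higher
    score means the sample is more likely to belong to class 1.
    A tie between a positive and a negative sample counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc_per_pipeline(records):
    """Average AUC over all gene lists for each cleansing pipeline.

    records: iterable of (pipeline, gene_list, auc_value) tuples.
    """
    by_pipeline = defaultdict(list)
    for pipeline, _gene_list, value in records:
        by_pipeline[pipeline].append(value)
    return {name: mean(values) for name, values in by_pipeline.items()}


# Toy inputs only: hypothetical labels, scores, and pipeline names.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
toy_means = mean_auc_per_pipeline([
    ("pipeline_a", "list_1", 0.75),
    ("pipeline_a", "list_2", 0.25),
    ("pipeline_b", "list_1", 0.5),
])
```

Ranking the resulting per-pipeline means is then a direct comparison of the dictionary values; whether a difference between two means is significant still depends on the variance of the underlying scores.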
We note that a possible cause of the relatively poorer BaFL performance in the kNN implementation of
Bhattacharjee predicting Stearman may be random replicate removal from
the small Stearman dataset, coupled with the absence of scaling across arrays in the BaFL
pipeline, in contrast to the RMA and dCHIP pipelines, which do incorporate scaling steps. We
support this statement by noting that the degradation of performance was not observed in the