02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal

fold decrease in the number of candidate genes is important for diagnostic applications [6, 41]. It is also useful that the classification does not depend strongly on any particular model. Most notably, we see very good cross-experiment performance, even with quite limited candidate gene lists, a rare achievement with microarray data.

Author’s List (Validation)

Of the 72 AUC scores recorded for classification performance, the values generated by the BaFL pipeline when used for candidate gene lists achieved 5 of the 6 best overall scores (87-97%) over the three classification algorithms, although not all of these are significant given the variance in the results. The sixth case, and the exception, was the kNN result when Bhattacharjee DE candidate genes, using the RMA-generated values, were used to predict Stearman data (93.88%). In addition, the BaFL-defined DE ProbeSets achieve the highest AUC for almost every individual analysis of the BaFL intersecting DE and Stearman list; again, the sole exception is the kNN implementation of Bhattacharjee predicting Stearman. When performance is averaged across the lists per cleansing routine, values obtained using the BaFL method achieved the highest performance 5 of 6 times (bottom row of Figures 3.4 and 3.5). Equal or improved performance across experiments, with less variability and smaller candidate gene lists, and low sensitivity to the model, are diagnostic goals that the BaFL method appears to be well positioned to achieve.
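The comparison described above (one AUC per classification run, then an average per cleansing routine) can be illustrated with a minimal sketch. It is not the analysis code used for this work: the function names `auc`, `mean_auc_by_pipeline`, and `best_pipeline` are hypothetical, the AUC is computed with the standard rank-based (Mann-Whitney) formulation, and the records in the usage note are placeholders rather than reported results.

```python
from collections import defaultdict
from statistics import mean


def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    sample receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc_by_pipeline(records):
    """records: iterable of (pipeline, gene_list, auc_value) tuples.
    Returns a dict mapping each pipeline to its mean AUC across lists."""
    grouped = defaultdict(list)
    for pipeline, _gene_list, value in records:
        grouped[pipeline].append(value)
    return {pipeline: mean(values) for pipeline, values in grouped.items()}


def best_pipeline(records):
    """Name of the pipeline with the highest mean AUC."""
    means = mean_auc_by_pipeline(records)
    return max(means, key=means.get)
```

As a usage illustration with placeholder values only, `best_pipeline([("A", "list1", 0.8), ("A", "list2", 0.6), ("B", "list1", 0.9), ("B", "list2", 0.3)])` selects pipeline `"A"`, because its mean across lists is higher even though `"B"` holds the single best score. That distinction is why the per-routine averages and the individual best scores are worth reporting separately.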

We note that a possible cause of the relatively poorer BaFL performance in the kNN implementation of Bhattacharjee predicting Stearman may be random replicate removal from the small Stearman dataset, coupled with the absence of scaling across arrays in the BaFL pipeline, in contrast to the RMA and dCHIP pipelines, which do incorporate scaling steps. We support this statement by noting that the degradation of performance was not observed in the
