Sample A: Cover Page of Thesis, Project, or Dissertation Proposal
classification models coinciding with down selection, as demonstrated in the last column. This
phenomenon mirrors what has been observed generally with Microarray data, the poor prediction
performance of proposed gene lists given different data and classification approaches [37, 42, 43].
Gains are much more consistent, if not large, when the BaFL-cleansed t-test down-selected data
are used (the third row of graphs in Figures 3.2 and 3.3).
Additionally, of the three models employed, Random Forest consistently performed poorly on the
RMA and dCHIP ProbeSets that were generated as the DE ProbeSets for the training model;
however, this was not observed with BaFL-generated values. This suggests that during the
stochastic development of the decision trees, the selection of important features is often specific to
the dataset and not the disease condition. When the ProbeSet response is variable, either its importance
to different models can diminish, weakening its role as a classifier, or the regulation pattern
can be inverted, generating conflicting classifications. In the RMA and dCHIP interpretations of
the datasets, 1.8% and 3.6% (respectively) of the intersecting DE ProbeSets show
conflicting regulation patterns between the two datasets, as shown in red in Figure 3.6. Linear
discriminant analysis, which weights all the features and hence allows variable genes to be more
or less important, did not show the same sensitivity as Random Forest.
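
The share of intersecting DE ProbeSets with conflicting regulation patterns can be illustrated with a minimal Python sketch. It assumes each dataset's DE list is available as a mapping from ProbeSet ID to a signed log fold change (positive for up-regulation, negative for down-regulation); the function name `conflicting_fraction`, the ProbeSet IDs, and all numeric values are hypothetical and are not taken from the study.

```python
def conflicting_fraction(fold_a, fold_b):
    """Compare regulation direction of DE ProbeSets shared by two datasets.

    fold_a, fold_b: dicts mapping ProbeSet ID -> signed log fold change
    (assumed convention: positive = up-regulated, negative = down-regulated).
    Returns (shared IDs, conflicting IDs, fraction of shared that conflict).
    """
    # Intersecting DE ProbeSets: present in both lists.
    shared = sorted(set(fold_a) & set(fold_b))
    # A conflict is a sign flip between the two datasets.
    conflicting = [p for p in shared if fold_a[p] * fold_b[p] < 0]
    fraction = len(conflicting) / len(shared) if shared else 0.0
    return shared, conflicting, fraction


# Toy, made-up values for illustration only.
dataset_1 = {"ps_001": 1.4, "ps_002": -0.9, "ps_003": 2.1, "ps_004": -1.2, "ps_005": 0.8}
dataset_2 = {"ps_001": 1.1, "ps_002": 0.7, "ps_003": 1.8, "ps_004": -1.5, "ps_006": -0.6}

shared, conflicting, fraction = conflicting_fraction(dataset_1, dataset_2)
print(len(shared), conflicting, fraction)
```

In this toy input, one of the four shared ProbeSets flips sign between the datasets. A tree-based model that happens to split on such a ProbeSet would assign opposite classes in the two datasets, which is the failure mode described above.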