02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal


classification models coinciding with down selection, as demonstrated in the last column. This phenomenon mirrors what has been observed generally with Microarray data, the poor prediction performance of proposed gene lists given different data and classification approaches [37, 42, 43]. Gains are much more consistent, if not large, when the BaFL-cleansed t-test down-selected data are used (the third row of graphs in Figures 3.2 and 3.3).

Additionally, of the three models employed, Random Forest consistently performed poorly on the RMA and dCHIP ProbeSets that were generated as the DE ProbeSets for the training model; however, this was not observed with BaFL-generated values. This suggests that, during the stochastic development of the decision trees, the selection of important features is often specific to the dataset and not the disease condition. When the ProbeSet response is variable, either its importance to different models can diminish, weakening its role as a classifier, or its regulation pattern can be inverted, generating conflicting classifications. In the RMA and dCHIP interpretations of the datasets, 1.8% and 3.6% (respectively) of the intersecting DE ProbeSets have conflicting regulation patterns between the two datasets, as shown in red in Figure 3.6. Linear discriminant analysis, which weights all the features and hence allows variable genes to be more or less important, did not show the same sensitivity as Random Forest.
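The share of intersecting DE ProbeSets with conflicting regulation patterns can be illustrated with a minimal sketch. It assumes each dataset's DE list is available as a mapping from ProbeSet ID to a signed log fold change (positive for up-regulation, negative for down-regulation). The function name, the ProbeSet IDs, and the values are hypothetical and are not taken from the study.

```python
def conflicting_regulation(de_a, de_b):
    """Compare two DE lists given as {probeset_id: signed log fold change}.

    Returns (number of shared ProbeSets, sorted list of shared ProbeSets
    whose regulation direction differs, fraction of shared that conflict).
    """
    shared = set(de_a) & set(de_b)
    # A conflict is a shared ProbeSet that is up-regulated in one dataset
    # and down-regulated in the other, i.e. the signs disagree.
    conflicting = sorted(p for p in shared if de_a[p] * de_b[p] < 0)
    fraction = len(conflicting) / len(shared) if shared else 0.0
    return len(shared), conflicting, fraction


# Toy values for illustration only (hypothetical IDs and fold changes).
dataset_1 = {"ps_001": 1.4, "ps_002": -0.9, "ps_003": 2.1, "ps_004": -1.2}
dataset_2 = {"ps_001": 1.1, "ps_002": 0.8, "ps_003": 1.7, "ps_005": -2.0}

n_shared, conflicts, frac = conflicting_regulation(dataset_1, dataset_2)
print(f"{len(conflicts)} of {n_shared} shared ProbeSets conflict ({frac:.1%})")
```

Only ProbeSets present in both lists enter the denominator, which matches the "intersecting DE ProbeSets" wording above; a ProbeSet that is DE in just one dataset is ignored rather than counted as a conflict.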
