02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal


Data Analysis Overview

1. Each of three probe-cleansing methods (RMA, dCHIP, BaFL) is used to generate ProbeSet values on the same sample sets from each experiment (2-state case).

2. Down selection is performed for each cleansing method's interpretation of the data. Starting with the ProbeSet values produced by each method, down selection identifies differentially expressed (DE) genes based on the outcome of a Welch's t-test of those values across the sample sets [5]. Three such down-selection lists are generated: one list of DE genes from each experiment and a third that is the intersection of those two lists. The values of the genes in the lists (Stearman DE, Bhattacharjee DE, and Intersection of DE) are then used as input to three types of classifiers: kNN [6, 7], LDA [8, 9], and RF [10-12]. The resulting models are assessed, based on the area under the ROC curve (AUC) [13, 14], for their cross-experiment sample class prediction ability relative to the base model (ALL), which uses the complete set of genes' values.

3. A second type of comparison uses the two candidate gene lists proposed by Bhattacharjee et al. [3] and the candidate gene list proposed by Stearman et al. [4], sub-selected in each case for those genes passed by the BaFL pipeline (but not necessarily identified as DE). A fourth candidate gene list comprises the BaFL-passed genes that the t-test identified as DE in both BaFL datasets; it is the same final list used in step 2. The four lists use the ProbeSet values produced by each cleansing method (not the values that resulted from the methods of the original papers, since the underlying sample sets have been modified), and the analysis then proceeds as in step 2 to compare classification strength across the three types of models.

These steps are discussed in more detail in the following sections.
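As an illustration of step 2, the following minimal sketch runs a Welch's t-test down selection on two toy experiments, intersects the resulting DE lists, and scores a cross-experiment kNN model by AUC. It is not the implementation used in this work. The gene names, the toy values, the `t_cutoff` threshold on |t|, and all function names are hypothetical, and only one of the three classifier types (kNN) is shown. Plain Python is used so that the example is self-contained.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def down_select(values, labels, t_cutoff):
    """Return the genes whose |t| between the two classes exceeds t_cutoff.

    values: {gene: [one value per sample]}, labels: [0 or 1 per sample].
    The cutoff on |t| is an assumption made for this sketch.
    """
    selected = set()
    for gene, row in values.items():
        a = [v for v, y in zip(row, labels) if y == 0]
        b = [v for v, y in zip(row, labels) if y == 1]
        t, _df = welch_t(a, b)
        if abs(t) > t_cutoff:
            selected.add(gene)
    return selected


def sample_vectors(values, genes):
    """Turn per-gene rows into one feature vector per sample."""
    return [list(col) for col in zip(*(values[g] for g in genes))]


def knn_scores(train_x, train_y, test_x, k=3):
    """Score each test sample by the fraction of its k nearest training
    samples (Euclidean distance) that belong to class 1."""
    scores = []
    for x in test_x:
        order = sorted(range(len(train_x)),
                       key=lambda i: math.dist(x, train_x[i]))
        scores.append(sum(train_y[i] for i in order[:k]) / k)
    return scores


def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy two-state data: three samples of class 0 followed by three of class 1.
labels = [0, 0, 0, 1, 1, 1]
exp_a = {  # hypothetical ProbeSet values from the first experiment
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],
    "g2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "g3": [5.0, 5.2, 4.9, 1.0, 1.1, 0.8],
}
exp_b = {  # hypothetical ProbeSet values from the second experiment
    "g1": [1.1, 0.8, 1.0, 2.9, 3.2, 3.0],
    "g2": [1.0, 1.1, 0.9, 4.0, 4.2, 3.9],
    "g3": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],
}

de_a = down_select(exp_a, labels, t_cutoff=4.0)
de_b = down_select(exp_b, labels, t_cutoff=4.0)
intersection = de_a & de_b

# Cross-experiment assessment: train on one experiment, predict the other.
genes = sorted(intersection)
scores = knn_scores(sample_vectors(exp_a, genes), labels,
                    sample_vectors(exp_b, genes))
auc_intersection = auc(scores, labels)
print(sorted(de_a), sorted(de_b), genes, auc_intersection)
```

Passing `sorted(exp_a)` instead of `genes` would give the corresponding score for the base model (ALL), so the two feature sets can be compared on the same samples.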

