02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal


The process of determining a candidate gene list is often multi-step: a (relatively) simple statistical method is used to obtain an initial down-selected list of genes showing significant expression change, followed by a more sophisticated technique that suggests a final subset of genes whose correlation to the factor or phenotype of interest is robust [52, 69-71, 77-83]. These subsets can be assessed using supervised or unsupervised methods [69]. Clustering is the most common form of unsupervised method, where the goal is to achieve homogeneous clusters [84]. Supervised learning methods develop models from training data and assess the quality of prediction on test data [77, 85]. Performance metrics are necessary for choosing among the learning algorithms: the most common metric is the area under the receiver operating characteristic (ROC) curve, which incorporates the sensitivity and specificity of the classification results [86]. Other metrics include precision-recall and cost-sensitive analysis [87, 88].
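
To make the ROC-based metric concrete, the following is a minimal sketch of how the area under the ROC curve can be computed from classifier scores. It is an illustration only, not the procedure used in the cited work: the function name `roc_auc` and the arguments `labels` and `scores` are hypothetical, and the trapezoidal rule is assumed for the integration. Each point on the curve pairs sensitivity (true positive rate) with 1 - specificity (false positive rate) at one score threshold.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed with the trapezoidal rule.

    labels: sequence of 0/1 class labels (1 = positive class); illustrative name.
    scores: sequence of classifier scores, higher = more likely positive.
    """
    # Sort samples from highest to lowest score.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    tp = fp = 0
    # Each point is (1 - specificity, sensitivity) at one threshold.
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))

    # Integrate sensitivity over (1 - specificity) by trapezoids.
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc


# Toy data (made up for illustration): one negative outranks one positive.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(example_labels, example_scores), 3))  # 0.889 (i.e. 8/9)
```

An area of 1.0 corresponds to a classifier that ranks every positive sample above every negative one, while 0.5 corresponds to ranking no better than chance, which is why the metric is convenient for comparing learning algorithms on the same test data.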

Specific Aims

A number of confounding factors in microarray experiments are well described in the scientific literature [17, 18, 22, 26, 59], and the relative irreproducibility of microarray analysis results is attributed to these factors [35, 44, 57, 68]. While a number of investigators have reported the effect of removing individual classes of contributions on the robustness of the results, to our knowledge no investigation has removed the complete set of factors that we have established in our cleansing pipeline. Of the sophisticated probe cleansing algorithms that have been developed and are commonly used, all proceed by identifying and eliminating probes with large variance, without exploring the underlying cause of that variance [45, 46, 48, 49, 89]. This black-box method leads to both inclusion of probes having dubious properties and exclusion of probes that
