02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal


is differentially expressed. A Python script determined the pattern of expression, µ1 > µ2 or vice versa, and adjusted all probes in the down-regulated class by an increment of 1/50 of each probe mean. This perturbation was decided upon by trial and error. This was sufficient to create pattern inversions in cases of similar expression probes and exaggerate existing pattern inversions, without inverting those probes having differential expression. The ProbeSets which possessed probes demonstrating the pattern inversion after the minor perturbation were reassigned to one of four categories: unique or singular exception, statistical exception, specific transcript region event, and multiple transcript region events.

A Priori Prediction

Candidate ProbeSets were selected from the intersection of the BaFL-validated ProbeSets in the adenocarcinoma stage I and squamous (unknown stage progression) samples in the Bhattacharjee dataset [47]. This set includes 4,257 ProbeSets (from a comparison of 125 adenocarcinoma and 17 squamous samples). Classification results (using kNN, LDA, and randomForest [2, 52-57]), using DE ProbeSets trained on the Bhattacharjee dataset, demonstrated that there is a significant impact of the stage of disease on the expression profiles (data not shown). Therefore, the training set was subdivided to create a stage I adenocarcinoma group (72), which was intersected with the 17 squamous samples (drawn from multiple stages, but not labeled by stage, so subdivision was not possible). This yielded 5,174 ProbeSets, of which ~4,000 were classified as DE, for the (Ad x Sq) comparison.
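
The intersection step used to select candidate ProbeSets can be sketched as a plain set operation. The helper name `shared_probesets` and the ProbeSet identifiers below are illustrative placeholders, not values taken from the BaFL output.

```python
def shared_probesets(*validated):
    """Return the ProbeSet IDs present in every input collection."""
    id_sets = [set(ids) for ids in validated]
    if not id_sets:
        return set()
    return set.intersection(*id_sets)


# Hypothetical ProbeSet IDs that passed filtering in each sample group.
adeno_stage1 = ["ps_001", "ps_002", "ps_003", "ps_004"]
squamous = ["ps_002", "ps_003", "ps_005"]

candidates = shared_probesets(adeno_stage1, squamous)
print(sorted(candidates))
```

Because the helper accepts any number of collections, the same call could also take a third list, for example the IDs retained in a second dataset.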

Restriction to the Stage I samples eliminated 16 samples in addition to the batch 3 samples; only 3 of those 16 samples had been originally included in the full adenocarcinoma training set. The intersection of the above BaFL (AdI x SqI) output with the Lu dataset resulted in ~400 DE ProbeSets. The constituent probe intensities were recovered, by their x and y location identifiers,
