02.08.2013 Views

Sample A: Cover Page of Thesis, Project, or Dissertation Proposal



It has long been known that the variation due to sample handling may be far greater than the variation due to the primary experimental variable [17], but in the absence of internal controls and general calibration standards we must resort to experiment-specific calibrations [18]. The total fluorescence per array has previously been suggested as one test of batch consistency [19], alternatively represented as the average signal per probe or ProbeSet, although those investigators did not incorporate the scanner limitation. This metric reflects the labeling efficiency per molecule but is not sensitive to sample degradation or to large differences in the number of genes expressed, so we extended the metric to include the total number of responsive probes in the linear range [15, 16]. As indicated by the references given for each factor, individual investigators have shown that each of these effects can have a significant impact on the outcome of an analysis; yet, to the best of our knowledge, no one has combined all of them into a simple-to-use pipeline and then tested the final effect on the analysis and comparison of experiments.
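The per-array metrics described above (total fluorescence, average signal per probe, and the number of responsive probes in the linear range) can be illustrated with a minimal sketch. This is not the pipeline used in this work: the function name, the input layout (one flat list of probe intensities per array), and the linear-range bounds are all assumptions chosen for illustration, and the real bounds would depend on the scanner and platform.

```python
def array_qc_metrics(intensities, linear_low, linear_high):
    """Summarize one array's probe intensities.

    intensities: list of non-negative probe signals for a single array.
    linear_low, linear_high: hypothetical bounds of the scanner's linear
    range; these values are assumptions, not taken from the text.
    """
    if not intensities:
        raise ValueError("intensities must not be empty")

    total_signal = sum(intensities)
    # Probes whose signal falls inside the assumed linear range.
    responsive = [x for x in intensities if linear_low <= x <= linear_high]

    return {
        "total_signal": total_signal,
        "mean_signal_per_probe": total_signal / len(intensities),
        "responsive_probes": len(responsive),
    }


# Illustrative values only: two probes fall inside the assumed range.
metrics = array_qc_metrics([50, 200, 1000, 70000], linear_low=100, linear_high=60000)
print(metrics)
```

Comparing these three numbers across the arrays of a batch is one simple way to flag an array whose total signal looks normal but whose count of in-range probes is unusually low.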

The significance of the factors varies across datasets according to sample characteristics that are independent of the experimental factor (i.e., still biological variation, but not correlated with the factor of interest and not subject to controls). This type of biological variation has created distinct dilemmas for the microarray field: 1) cross-experiment analysis, particularly across platforms, has been deemed impractical, and 2) the resultant gene lists are not reproducible in classification accuracy across datasets, across classification algorithms, or in their construction [7, 8, 20, 21]. We will demonstrate that the commonly applied statistical algorithms interpret signal intensities differently for each dataset, which changes each individual ProbeSet's significance within each dataset. We will also demonstrate that, by identifying and removing these types of biological variation, the behavior of the ProbeSets becomes more consistent with the experimental factor across datasets, thereby minimizing the identified dilemmas in the microarray field.

