Sample A: Cover Page of Thesis, Project, or Dissertation Proposal
Hereafter, the pipeline we present is referred to as BaFL, or Biologically applied Filter
Levels.
Black Box Strategies
A large number of purely statistical approaches (e.g. dCHIP, RMA, gcRMA) [5, 22-24]
have been applied to remove variation (sample or technical) unrelated to the factor of interest, but these
function as black-box techniques that do not tell the investigator how strongly each
factor influences the experimental results. These methods tend to increase the data’s sensitivity
to classification algorithms; the outcome has been that the processed data perform well within
but not between experiments, whether the same or different classification methods are used [8]. The
implication is that these approaches over-train for the factors that apply in one experiment, and
that those factors are not consistent in the next experiment. This would be expected if some of the
result is due to variables with systematic effects on a subset of particular probes, such as the
occurrence of different SNP-responsive probes that give distinct patterns in different study
populations [9, 10, 12, 25-30]. To demonstrate that the data inconsistencies are
sample/population or platform dependent, an investigator needs to be able to delve into the
aggregated signal, identify discordant probes and the likely cause of their behavior, and then
perform follow-up assays as needed, such as genotyping samples. A black-box method does not
tell the investigator which particular type of secondary assay must be performed.
Our approach is to identify and remove all problematic probes in a progressive manner,
categorizing them as they are removed. After BaFL filtering, the final set of data from all samples
gives a considerably more homogeneous response; in addition, the investigator is provided with
categorizations of the excluded sets that allow examination of each subcategory, and subsets of