12.07.2015 Views

View - ResearchGate

174 Ho et al.

allowing for cases wherein 1D differences exist but are small or harder to detect than the joint differences. Despite this indeterminacy, it is a useful guide in identifying pairs that may represent interesting novel biological interactions, for example, genes that are in the same pathway.

In practice, identification of gene pairs with joint differential expression is challenged by the large number of possible pairs. The usual number p of genes on a chip is in the tens of thousands, so the number of gene pairs, p(p - 1)/2, is in the tens of millions or more (for p = 20,000, about 2 × 10^8 pairs). Challenges increase exponentially as sets of more than two genes are considered.

In this chapter, several statistical methods that can be used to search for joint differential expression are discussed: a correlation-based approach (5,6), liquid association (LA) (7) and a generalization (8), the expected conditional F-statistic (ECF) (9), and a novel entropy-based method. These are examined in some detail and compared in simulations. Some algorithmically and computationally more complex methods of investigating gene coregulation are also mentioned (4,10,11).

Several classification and network analysis approaches search more generally for sets of differentially coexpressing genes. These methods seek larger sets of differentially expressed genes, often without specifying the size of the set in advance. In these cases the search space is too large for an exhaustive canvass, and so results depend on efficient search algorithms. Not surprisingly, these methods tend to be complex both algorithmically and computationally. See Note 4.1. for a brief overview.

2. Materials

1. The CorScor R package, which implements the correlation-based method described in Subheading 3.1.2., is downloadable from http://stat.ethz.ch/~dettling/jde.html. The package uses the object definition of Bioconductor, an open source and open development software project for the analysis and comprehension of genomic data, available at http://www.bioconductor.org/. Both require the R language, which is available at http://cran.r-project.org/.
2. The statistical tools for implementing the LA and projection-based LA (PLA) methods described in Subheading 3.1.3. are available as a downloadable R package from http://kiefer.stat.ucla.edu/LAP/index.php?tools.
3. The R code for calculating the ECF-statistics described in Subheading 3.1.4. is available from http://bioinformatics.med.yale.edu/microarray/BioSuppl.html.

3. Methods

3.1. Statistical Approaches

3.1.1. Notation

Most of the discussion is developed for the basic case in which microarray experiments are available for two different phenotypes or classes (say normal, denoted by one; and cancer, by two), and interactions between pairs of genes
