12.07.2015 Views

View - ResearchGate


represented by one if the corresponding TRE is present on the promoter for that gene and by a zero otherwise. This matrix represents the constraints to a network identification scheme. The interaction parameters corresponding to zeros in the candidate matrix need not be computed, substantially reducing the dimensionality of the identification problem (Figs. 2 and 3).

The FeasnetAnalyzer module contains a submodule named StatFilter that calculates the significance of "enrichment" for each TRE resulting from comparing the selected genes submitted to PAINT with a random selection of genes. StatFilter computes p-values for the overrepresentation of each TRE in the set of promoters considered with respect to a background set of promoters. Specifically, the p-values give the probability that the observed counts for the TREs in the set of promoters could be explained by random occurrence in the background set of promoters. The p-values are calculated using the hypergeometric distribution (11,26–28). These raw p-values are adjusted for multiple testing using a false discovery rate (FDR) estimate (29). Typically, for a microarray experiment, the reference set is that of the genes on the microarray utilized in the experiments.

For each TRE V$X, given (1) a reference Feasnet of n promoters of which n_1 promoters contain V$X, and (2) a Feasnet of interest with m promoters of which h contain V$X, the associated p-value for overrepresentation is given as in Eq. 1.

$$p = \sum_{i=h}^{\min(n_1,\, m)} \frac{\binom{n_1}{i}\binom{n-n_1}{m-i}}{\binom{n}{m}} \tag{1}$$

The p-value for underrepresentation of a TRE in the observed Feasnet is calculated similarly, with the summation in the aforementioned equation going from 1 to h.
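The overrepresentation p-value of Eq. 1 can be sketched in a few lines of Python. This is only an illustration of the hypergeometric upper-tail sum as defined above, not the StatFilter implementation; the function name and the toy counts in the usage note are assumptions made for the example.

```python
from math import comb


def overrepresentation_p_value(n: int, n1: int, m: int, h: int) -> float:
    """Upper-tail hypergeometric probability of Eq. 1.

    n  : promoters in the reference Feasnet
    n1 : reference promoters that contain the TRE
    m  : promoters in the Feasnet of interest
    h  : promoters of interest that contain the TRE
    (The function name is hypothetical; it does not come from PAINT.)
    """
    if not (0 <= n1 <= n and 0 <= m <= n and 0 <= h):
        raise ValueError("counts are inconsistent")
    upper = min(n1, m)
    # Number of ways to draw m promoters with exactly i TRE-containing ones,
    # summed over i = h .. min(n1, m); the sum is empty (zero) if h > upper.
    favourable = sum(comb(n1, i) * comb(n - n1, m - i) for i in range(h, upper + 1))
    return favourable / comb(n, m)
```

For instance, with made-up counts n = 10, n_1 = 4, m = 3 and h = 2, the sum is (6·6 + 4·1)/120 = 1/3, which is what `overrepresentation_p_value(10, 4, 3, 2)` evaluates. Exact integer binomials are used here for clarity; for large reference sets a survival-function routine from a statistics library may be preferable.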
These estimates of significance can be utilized in filtering for those TREs that meet a threshold (say, p ≤ 0.05, or FDR-adjusted p ≤ 0.3) to identify the most likely regulators of the genes considered in the experimental context of interest. Given no information about the source of the genes from which the input list to PAINT is generated, PAINT can optionally utilize the Feasnet corresponding to all the genes in the PAINT promoter database as a reference Feasnet in the earlier enrichment analysis (also termed overrepresentation analysis).

Fig. 2. A visualization of a Feasnet. The elements are color-coded to indicate the over- and underrepresentation of the transcriptional regulatory elements. Each row in the vertical color bar next to the gene identifiers indicates the cluster membership of the corresponding gene. The dendrograms are based on hierarchical clustering using the average-linkage method and the binary distance as the dissimilarity metric. A high-resolution color version of the gray-scale image presented herein is available online at http://www.dbi.tju.edu/dbi/publications/MiMBchapter/.
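The threshold filtering described above can be illustrated with a short Python sketch. The text cites only "a false discovery rate (FDR) estimate (29)" without naming the procedure, so the Benjamini-Hochberg step-up adjustment used here is an assumption, as are the function names and the 0.3 default taken from the example threshold in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH stands in for the FDR estimate cited in the text.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_tres(raw_p_by_tre, fdr_threshold=0.3):
    """Keep TREs whose adjusted p-value meets the threshold (hypothetical helper)."""
    names = list(raw_p_by_tre)
    adjusted = benjamini_hochberg([raw_p_by_tre[name] for name in names])
    return {name: q for name, q in zip(names, adjusted) if q <= fdr_threshold}
```

A caller would pass a mapping from TRE identifier to raw p-value, for example `filter_tres({"V$A": 0.01, "V$B": 0.5})`, where the identifiers and values are placeholders rather than results from PAINT.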
