02.03.2013 Views

Target Discovery and Validation Reviews and Protocols


48 Imoto et al.

Fig. 8. Strategy to identify the drug-active pathways. The red nodes represent drug-active genes considered to be independent of their parent genes. The green nodes are parent-active genes considered to be affected by their parent genes. The red arrows represent the active paths (13).

To avoid these problems, we use Bayesian networks for discrete data for the probabilistic inference after estimating the structure of the gene network. We convert the estimated Bayesian network and nonparametric regression model into the Bayesian network for discrete data (Fig. 2b). The drug-active pathways are identified for every time-point (Fig. 2c). This step is based on identifying the genes affected directly and indirectly by the drug. However, several genes might still be falsely identified as affected. Therefore, we define a time-expanded network (Fig. 8) and consider the time-dependent relationships among the estimated affected genes to reduce the number of false positives. The definition of the time-dependent network is given below (see Subheading 3.2.2.). The connected components in the time-expanded network are selected as the estimated drug-active pathways (Fig. 2d). This step can also reduce the number of falsely identified pathways.
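The selection of connected components can be illustrated with a minimal sketch. The exact definition of the time-expanded network is given in Subheading 3.2.2 and is not reproduced here, so the construction below is only an assumption: each node is a hypothetical (gene, time-point) pair for a gene estimated as affected, and an edge links a parent at time t to its child at time t + 1 when both are affected. The gene names, the `affected` and `network` inputs, and the rule of discarding single-node components are all illustrative choices, not the authors' procedure.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from `start`.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Hypothetical input: genes estimated as affected at each time-point.
affected = {1: {"g1"}, 2: {"g2", "g5"}, 3: {"g3"}}
# Hypothetical parent -> child edges of the estimated gene network.
network = [("g1", "g2"), ("g2", "g3"), ("g4", "g5")]

# Assumed construction: one node per (affected gene, time-point) pair.
nodes = [(gene, t) for t in sorted(affected) for gene in sorted(affected[t])]
# Assumed construction: link parent at t to child at t + 1 when both are affected.
edges = [
    ((parent, t), (child, t + 1))
    for t in sorted(affected)
    for parent, child in network
    if parent in affected[t] and child in affected.get(t + 1, set())
]

components = connected_components(nodes, edges)
# Assumption: isolated nodes are treated as likely false positives and dropped.
pathways = [c for c in components if len(c) >= 2]
```

In this toy input, g5 is affected at time-point 2 but has no affected parent or child in adjacent time-points, so under the assumed rule it forms a single-node component and is not reported as part of a pathway.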

The identification of drug-active pathways is based on probabilistic inference on the Bayesian networks for discrete data. For this purpose, we convert the Bayesian network and nonparametric regression model that was used for structural learning of the gene network into the Bayesian network for discrete data.

First, we need to discretize the expression values into three categories. We regard expression values that are greater than a certain threshold h (or less than h^-1) as “overexpressed” (or “suppressed”) and otherwise as “unchanged.” The threshold h is determined so that the network estimated by the Bayesian network for discrete data with h becomes the closest to the network from the Bayesian
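The three-category discretization described in this paragraph can be sketched as follows. This is a minimal illustration that assumes the expression values are positive ratios, so that h > 1 and h^-1 = 1/h lies below it. The function name `discretize`, the gene names, and the value h = 2.0 are hypothetical, and the procedure for choosing h is not implemented here.

```python
def discretize(value, h):
    """Map an expression value to one of three categories for a threshold h > 1."""
    if h <= 1:
        raise ValueError("h must be greater than 1 so that 1/h < h")
    if value > h:
        return "overexpressed"
    if value < 1.0 / h:
        return "suppressed"
    return "unchanged"


# Hypothetical expression ratios for three genes.
ratios = {"geneA": 2.5, "geneB": 0.3, "geneC": 1.1}
# h = 2.0 is an arbitrary illustrative threshold, not a value from the text.
labels = {gene: discretize(value, h=2.0) for gene, value in ratios.items()}
```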
