27.12.2012 Views

Hinge prediction by combining sequence features

As the GOR-like method did not work well, we sought to measure the predictive power
of the various sequence features studied above. The HI scores we have reported provide
an intuitive means of weighing the relative predictive value of each sequence feature.
We show how to combine the HI scores for several features in order to make a more
powerful predictor, which we call HingeSeq. We define this predictor as follows:

\[
HS(i) = \log_{10}\left[\frac{p(a_j \mid h)\,p(a_k \mid h)\,p(a_l \mid h)}{p(a_j)\,p(a_k)\,p(a_l)}\right]
      = HI_{\text{amino acid}}(i) + HI_{\text{secondary structure}}(i) + HI_{\text{active site}}(i)
\]

Equation 4

For simplicity, statistical independence of the various features was assumed in creating
this definition. Here each index i corresponds to an individual amino acid in the protein
sequence. For each i, j designates one of the 20 amino acid types, k designates the
secondary structural classification, and l designates active site versus non-active site
classification.
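The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup values below are hypothetical placeholders, not the entries of Tables 2.1 to 2.3, and the names `hinge_index` and `hinge_seq_scores` are invented for this sketch. It also assumes, reading Equation 4 term by term, that each HI score is the base-10 log ratio of a feature's frequency in hinges to its overall frequency.

```python
import math

# HYPOTHETICAL placeholder scores, for illustration only.
# They are NOT the values reported in Tables 2.1-2.3.
HI_AMINO_ACID = {"G": 0.10, "S": 0.05, "L": -0.08}
HI_SECONDARY_STRUCTURE = {"helix": -0.05, "strand": -0.02, "coil": 0.04}
HI_ACTIVE_SITE = {True: -0.30, False: 0.01}


def hinge_index(p_given_hinge, p_background):
    """Assumed form of one HI term: log10 of p(a | h) / p(a)."""
    return math.log10(p_given_hinge / p_background)


def hinge_seq_scores(residues, sec_struct, active_site):
    """Per-residue HS(i): the sum of the three HI lookups.

    residues    -- sequence of one-letter amino acid codes (index j)
    sec_struct  -- secondary structure class per residue (index k)
    active_site -- True/False active-site flag per residue (index l)
    """
    if not (len(residues) == len(sec_struct) == len(active_site)):
        raise ValueError("all three annotations must have the same length")
    return [
        HI_AMINO_ACID[aa] + HI_SECONDARY_STRUCTURE[ss] + HI_ACTIVE_SITE[site]
        for aa, ss, site in zip(residues, sec_struct, active_site)
    ]


if __name__ == "__main__":
    scores = hinge_seq_scores("GL", ["coil", "helix"], [False, True])
    for position, score in enumerate(scores, start=1):
        print(position, round(score, 3))
```

Because the features are treated as independent, the log of the product of the three probability ratios equals the sum of the three individual log ratios, which is why the predictor reduces to three table lookups and an addition per residue.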

The term $HI_{\text{amino acid}}(i)$ is assigned according to residue type by looking up the corresponding
value in Table 2.1. Similarly, $HI_{\text{secondary structure}}(i)$ is obtained according to secondary
structure type from Table 2.3. Following Table 2.2 approximately, we assign
