12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf


198 E.C. Meng et al.

within subtilisins, which contain the catalytic triad in a different fold (Fischer et al. 1994). Sequence-order independence is an important feature of many 3D motif methods; however, in this work, these substructures were relatively large (>50 residues) and were identified from the entire structures rather than being predefined.

In other early work in this field, the Thornton group classified catalytic-triad-containing protease and lipase structures into four fold groups (Wallace et al. 1996). It was observed that oxygen atoms from the serine and aspartic acid residues occupy nearly constant positions relative to the histidine ring across all four groups, whereas the rest of the side chain atoms only superimposed well within each group. An overall 3D motif or template containing just the histidine ring and two oxygens was generated, along with group-specific templates containing the entire side chains. To speed comparison, the TESS (TEmplate Search and Superposition) geometric hashing method was developed (Wallace et al. 1997). In this approach, one residue in a template provides a frame of reference, the surrounding atoms are binned into 3D grid cells, and the information is hashed. Structures to be searched require similar pre-processing, where each residue of the same type as the template reference residue (histidine in the catalytic triad example) is used to define a spatial pattern for hashing. Besides the need for pre-processing and file storage, TESS imposes some limitations on how motifs are defined and matched.
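The geometric hashing idea described above can be illustrated with a minimal sketch. This is not the TESS implementation: the function names (`local_frame`, `cell_of`, `build_table`, `query`), the choice of three atoms to define the reference frame, the 1 Å cell size, and all coordinates are assumptions made for illustration only.

```python
import math
from collections import defaultdict

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)

def local_frame(p0, p1, p2):
    """Orthonormal frame built from three non-collinear reference atoms."""
    x = _unit(_sub(p1, p0))
    z = _unit(_cross(x, _sub(p2, p0)))
    y = _cross(z, x)
    return p0, (x, y, z)

def cell_of(point, frame, cell_size):
    """Grid cell of a point expressed in the local frame."""
    origin, axes = frame
    d = _sub(point, origin)
    return tuple(math.floor(_dot(d, ax) / cell_size) for ax in axes)

def build_table(ref_frames, atoms, cell_size=1.0):
    """Pre-process a structure: hash every atom relative to every reference residue.

    ref_frames: {ref_id: (p0, p1, p2)}, atoms: [(atom_label, xyz)].
    """
    table = defaultdict(set)
    for ref_id, ref_atoms in ref_frames.items():
        frame = local_frame(*ref_atoms)
        for label, xyz in atoms:
            table[(label, cell_of(xyz, frame, cell_size))].add(ref_id)
    return table

def query(table, template_ref_atoms, template_atoms, cell_size=1.0):
    """Return reference residues whose surroundings contain every template atom."""
    frame = local_frame(*template_ref_atoms)
    keys = [(label, cell_of(xyz, frame, cell_size)) for label, xyz in template_atoms]
    votes = defaultdict(int)
    for key in keys:
        for ref_id in table.get(key, ()):
            votes[ref_id] += 1
    return [ref_id for ref_id, v in votes.items() if v == len(keys)]

# Hypothetical structure: two histidine-like reference residues and two oxygens.
structure_refs = {
    "HIS57": ((0, 0, 0), (1, 0, 0), (0, 1, 0)),
    "HIS99": ((5, 5, 5), (6, 5, 5), (5, 6, 5)),
}
structure_atoms = [("OG", (2.5, 1.5, 0.5)), ("OD1", (-1.5, 2.5, 1.5))]
table = build_table(structure_refs, structure_atoms)

# Template: the HIS57 arrangement rotated 90 degrees about z and translated.
template_ref = ((10, 20, 30), (10, 21, 30), (9, 20, 30))
template_atoms = [("OG", (8.5, 22.5, 30.5)), ("OD1", (7.5, 18.5, 31.5))]
hits = query(table, template_ref, template_atoms)
```

Because positions are stored relative to the reference residue, the lookup does not depend on how the structure is oriented or on the sequence order of the residues. A real implementation would also need to handle atoms that fall near cell boundaries, which this sketch ignores.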
The backtracking constraint solver JESS (not an acronym) was developed to address these issues without sacrificing speed (Barker and Thornton 2003); it performs a depth-first search on efficiently arranged descriptions of structures. The JESS paper also describes obtaining E-values by comparing each 3D motif to a reference set of structures and modelling the resulting range of RMSD values as a mixture of normal distributions (Barker and Thornton 2003).

“Fuzzy functional forms” (FFF) consisting of just the alpha-carbons of important residues were shown to be useful for screening both experimentally determined structures and modeled structures of low to moderate resolution (Fetrow and Skolnick 1998; Di Gennaro et al. 2001). Glutaredoxins/thioredoxins were recognized with a motif of two cysteines and a proline, with additional restrictions that the proline must be in the cis conformation and the cysteines must be in a CxxC motif near the N-terminus of a helix. T1 ribonucleases were recognized with a six-residue motif. In subsequent work, fuzzy functional forms for identifying broad families were combined with sequence-based active site profiles for finer subclassification (Cammer et al. 2003). The FFF motif for the disulphide oxidoreductase active site found in many proteins is shown in Fig. 8.3.

The program ASSAM uses subgraph isomorphism to find occurrences of a user-defined pattern of residues (Artymiuk et al. 1994).
Side chain functional groups are each represented by two or three pseudoatoms, and the distances among these points in a motif are compared to the corresponding distances within a structure. Residues can be labelled by type or chemical classification (for example, hydrophobic). The tradeoff between specificity and distance tolerance was demonstrated for the catalytic triad, and further examples were presented and discussed. Enhancements to the original program include the ability to use backbone atoms and to label residues by secondary structure and degree of solvent exposure (Spriggs et al. 2003).
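The comparison of inter-residue distances under a tolerance can be sketched as follows. This is an illustration only, not ASSAM itself: a plain backtracking search stands in for the subgraph-isomorphism algorithm, each residue is reduced to a single point rather than two or three pseudoatoms, and the function name `find_matches`, the residue identifiers, and all coordinates are hypothetical.

```python
import math

def find_matches(pattern, structure, tol=1.0):
    """Find residue sets whose pairwise distances match the pattern within tol.

    pattern:   [(res_type, xyz)]
    structure: [(res_id, res_type, xyz)]
    Returns a list of tuples of res_ids, ordered like the pattern.
    """
    n = len(pattern)
    # Pairwise distances between pattern points, computed once.
    pat_d = [[math.dist(pattern[i][1], pattern[j][1]) for j in range(n)]
             for i in range(n)]
    matches = []

    def extend(assigned):
        k = len(assigned)
        if k == n:
            matches.append(tuple(structure[i][0] for i in assigned))
            return
        for idx, (_, res_type, xyz) in enumerate(structure):
            # Residue labels must agree and a residue is used at most once.
            if idx in assigned or res_type != pattern[k][0]:
                continue
            # Every distance to already-assigned residues must fit the pattern.
            if all(abs(math.dist(xyz, structure[a][2]) - pat_d[k][m]) <= tol
                   for m, a in enumerate(assigned)):
                extend(assigned + [idx])

    extend([])
    return matches

# Hypothetical Ser-His-Asp pattern (distances 3, 4 and 5 units).
pattern = [("SER", (0, 0, 0)), ("HIS", (3, 0, 0)), ("ASP", (3, 4, 0))]

# Hypothetical structure; residue order is unrelated to pattern order.
structure = [
    ("H57", "HIS", (10, 13, 10)),
    ("D102", "ASP", (14, 13, 10)),
    ("S195", "SER", (10, 10, 10)),
    ("S214", "SER", (30, 30, 30)),
    ("H40", "HIS", (10, 10, 20)),
]

strict = find_matches(pattern, structure, tol=0.5)
loose = find_matches(pattern, structure, tol=8.0)
```

With the tight tolerance only the intended triad is returned, while the loose tolerance also admits a second arrangement involving `H40`. This mirrors, on toy data, the tradeoff between specificity and distance tolerance mentioned above.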
