12.07.2015 Views

From Protein Structure to Function with Bioinformatics.pdf



6 Function Diversity Within Folds and Superfamilies

6.3 Function Diversity Between Homologous Proteins

In general, detecting homology (superfamily relationship) is much more helpful for function prediction than structural similarity alone (fold relationship). In this section, the relation between function diversity and structural homology is examined, and it is shown that even when homology is identified, many obstacles remain when attempting to transfer functional annotations from one protein to another.

6.3.1 Definitions

Before explaining how function diverges within superfamilies, it is necessary to define clearly what a superfamily is, and how it is used in practice. The term family, which is used throughout this section, is also introduced.

6.3.1.1 General Understanding

A superfamily is an ensemble of proteins that are thought to be evolutionarily related. Superfamily relationships can be determined by sequence similarities, which are detected using either traditional sequence alignment methods or more sensitive HMM searches (Reid et al. 2007). In the absence of sequence similarity, remote homologies can also be uncovered from structure and/or function similarities. But contrary to the situation with sequence similarity, there is no widely accepted approach to assess whether a structural or functional similarity is statistically significant. Because of that, the cut-offs used to define superfamily relationships can be arbitrary and somewhat subjective. Today, several databases such as CATH and SCOP have come up with standard and widely accepted definitions of what superfamilies are (see Section 6.3.1.2). But in all of these, some degree of subjectivity in the assignment of proteins to superfamilies remains, as hinted by the facts that they still rely on manual validation for this specific process, and that incompatible assignments are made in the different databases for a number of domains (Greene et al. 2007; Andreeva et al. 2008). It is worth noting that both CATH and SCOP now pre-classify new protein structures using automated protocols, but final assignment to superfamilies still ultimately involves manual processing.

The concept of a family is vaguer. Nowadays, a family is commonly understood as a sub-classification of homologous proteins according to some criteria. For example, a sequence family at a particular level of sequence similarity groups together all proteins that share at least that level of sequence similarity; a functional family groups together homologues that have the same function; an orthologous family groups together orthologues; etc. Depending on the focus of the databases, the definition of a family will vary.
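The idea of a sequence family at a given similarity level can be illustrated with a small sketch. It is not the procedure used by CATH, SCOP, or any other database: it assumes the sequences are already aligned to equal length, measures similarity as simple percent identity, and merges proteins by single linkage. The function names (`percent_identity`, `sequence_families`) and the toy sequences are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns where both sequences carry a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def sequence_families(aligned: dict, threshold: float) -> list:
    """Group proteins by single linkage (an assumption of this sketch):
    two proteins are linked when their identity is >= threshold percent,
    and every chain of linked proteins forms one family."""
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        # Follow parent pointers to the group representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for p, q in combinations(aligned, 2):
        if percent_identity(aligned[p], aligned[q]) >= threshold:
            parent[find(p)] = find(q)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy alignment, invented for illustration only.
toy = {
    "protA": "MKVLAAGIVG",
    "protB": "MKVLAAGIVA",
    "protC": "MKILSAGLVA",
    "protD": "QRTWEEPHNC",
}

for level in (90, 70):
    families = sorted(sorted(f) for f in sequence_families(toy, level))
    print(level, families)
```

With these toy sequences, the 90% level separates protC from protA and protB, while the 70% level merges the three. Under single linkage, protA and protC end up in the same family at 70% even though they share only 60% identity with each other, which shows how the choice of criterion shapes what a "family" contains.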
