14.07.2022 Views

Essential Cell Biology 5th edition


A:24 Answers

the second intron is noticeably different (Figure A9–11B).

If the introns were the same length, the line segments

that represent sequence similarity would fall on the same

diagonal. The easiest way to test for the colinearity of

the line segments is to tilt the page and sight along the

diagonal. It is impossible to tell from this comparison if

the change in length is due to a shortening of the mouse

intron or to a lengthening of the human intron, or some

combination of those possibilities.
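
The colinearity test described above can also be done numerically rather than by eye. The sketch below is only an illustration: the two toy sequences, the function name `diagonal_offsets`, and the word size `k=8` are assumptions made for this example and are not taken from Figure A9–11. Every exact match between position i in one sequence and position j in the other lies on the diagonal j − i, so segments that are colinear share one offset, and a change in intron length shows up as a second offset.

```python
from collections import Counter

def diagonal_offsets(seq_a, seq_b, k=8):
    """Count exact k-mer matches per diagonal (j - i) between two sequences."""
    positions = {}
    for i in range(len(seq_a) - k + 1):
        positions.setdefault(seq_a[i:i + k], []).append(i)
    offsets = Counter()
    for j in range(len(seq_b) - k + 1):
        for i in positions.get(seq_b[j:j + k], []):
            offsets[j - i] += 1
    return offsets

# Hypothetical toy data: two shared "exons" separated by an "intron"
# whose length differs between the two sequences (14 vs. 29 nucleotides).
exon1 = "ATGGCTAGCTTGACCGTA"
exon2 = "CCGTTAGGCATCGATTGA"
seq_a = exon1 + "GT" + "A" * 10 + "AG" + exon2
seq_b = exon1 + "GT" + "C" * 25 + "AG" + exon2

offsets = diagonal_offsets(seq_a, seq_b)
print(sorted(offsets))  # prints [0, 15]
```

The two offsets differ by 15, the difference in intron length. As in the visual comparison, the numbers alone cannot say whether one intron shrank or the other grew.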

ANSWER 9–12 Computer algorithms that search for exons

are complex, as you might imagine. To identify unknown

genes, these programs combine statistical information

derived from known genes, such as:

1. An exon that encodes protein will have an open reading

frame. If the amino acid sequence specified by this

open reading frame matches a protein sequence in any

database, there is a high likelihood that it is an authentic

exon.

2. The reading frames of adjacent exons in the same gene

will match up when the intron sequences are omitted.

3. Internal exons (excluding the first and the last) will have

splicing signals at each end; most of the time (~98%)

these will be AG at the 5′ ends of the exons and GT at

the 3′ ends.

4. The multiple codons for most individual amino acids are

not used with equal frequency. This so-called codon

bias, which varies from one species to the next, can be

factored in to aid in the recognition of true exons.

5. Exons and introns have characteristic length

distributions. The median length of exons in human

genes is about 120 nucleotide pairs. Introns tend to be

much larger: a median length of about 2 kb in genomic

regions of 30–40% GC content, and a median length of

about 500 nucleotide pairs in regions above 50% GC.

6. The initiation codon for protein synthesis (nearly always

an ATG) has a statistical association with adjacent

nucleotides that seem to enhance its recognition by

translation factors.

7. The terminal exon will have a signal (most commonly

AATAAA) for cleavage and polyadenylation close to its

3′ end.

The statistical nature of these features, coupled with the

low frequency of coding information in the genome (1.5%

for humans) and the high frequency of alternative splicing

(estimated to occur in 95% of human genes), makes it

difficult for an algorithm to correctly identify all exons. As

shown in Figure 9–36, these bioinformatic approaches are

usually coupled with direct experimental data, such as those

obtained from full-genome RNA sequencing (RNA-Seq).
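
To make criteria 1 and 3 from the list above concrete, here is a minimal sketch of a scan for candidate internal exons. It is a toy, not how real gene-finding programs work: those weigh all of the listed features statistically. The function names, the length limits, and the test sequence are assumptions chosen for illustration. A segment is reported if it is immediately preceded by AG, immediately followed by GT, and free of stop codons in at least one reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def open_frames(segment):
    """Return the frames (0, 1, 2) in which segment has no in-frame stop codon."""
    frames = []
    for frame in range(3):
        codons = (segment[p:p + 3] for p in range(frame, len(segment) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames

def candidate_internal_exons(dna, min_len=30, max_len=300):
    """Yield (start, end, frames) for each segment dna[start:end] that is
    preceded by AG, followed by GT, and open in at least one frame.
    min_len and max_len are arbitrary illustrative limits."""
    dna = dna.upper()
    # Exon starts: the position just after an AG (end of the upstream intron).
    starts = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    # Exon ends: the position of a GT (start of the downstream intron).
    ends = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    for start in starts:
        for end in ends:
            if min_len <= end - start <= max_len:
                frames = open_frames(dna[start:end])
                if frames:
                    yield start, end, frames

# Hypothetical test sequence: a 30-nucleotide open segment between AG and GT.
toy = "TTTCAG" + "GCTGCC" * 5 + "GTAAGT"
candidates = list(candidate_internal_exons(toy))
```

Even on this short sequence the scan returns more than one candidate, which hints at why the signals alone are not enough and why the remaining criteria and experimental data are needed.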

ANSWER 9–13 It is often not a simple matter to determine

the function of a gene, nor is there a universal recipe for

doing so. Nevertheless, there are a variety of standard

questions whose answers help to narrow down the

possibilities. Below we list some of these questions.

In what tissues is the gene expressed? If the gene is

expressed in all tissues, it is likely to have a general function.

If it is expressed in one or a few tissues, its function is

likely to be more specialized, perhaps related to the

specific functions of the tissues. If the gene is expressed

in the embryo but not the adult, it probably functions in

development.

In what compartment of the cell is the protein found?

Knowing the subcellular localization of the protein—nucleus,

plasma membrane, mitochondria, etc.—can also help to rule

out or support potential functions. For example, a protein

that is localized to the plasma membrane is likely to be a

transporter, a receptor or other component of a signaling

pathway, a cell adhesion molecule, etc.

What are the effects of mutations in the gene? Mutations

that eliminate or modify the function of the gene product

can provide important clues to function. For example,

if the gene product is critical at a certain time during

development, mutant embryos will often die at that stage or

develop obvious abnormalities.

With what other proteins does the encoded protein

interact? In carrying out their function, proteins often

interact with other proteins involved in the same or closely

related processes. If an interacting protein can be identified,

and if its function is already known (through previous

research or through the searching of databases), the range

of possible functions can often be narrowed.

Can mutations in other genes alter the effects of a mutation in

the unknown gene? Searching for such mutations can be

a very powerful approach to investigating gene function,

especially in organisms such as bacteria and yeast, which

have simple genetic systems. Although much more

difficult to perform in the mouse, this type of approach

can nonetheless be used. The rationale for this strategy

is analogous to that of looking for interacting proteins:

genes that interact genetically, so that the double-mutant

phenotype is more severe than either of the

individual mutants, are often involved in the same process

or in closely related processes. Identification of such an

interacting gene (and knowledge of its function) would

provide an important clue to the function of the unknown

gene.

Addressing each of these questions requires specialized

experimental expertise and a substantial time commitment

from the investigator. It is no wonder that progress is made

much more rapidly when a clue to a gene’s function can be

found simply by identifying a similar gene of known function

in the database. As more and more genes are studied, this

strategy will become increasingly successful.

ANSWER 9–14 In a long, random sequence of DNA, each

of the 64 different codons will occur with equal frequency.

Because 3 of the 64 are stop codons, they will be expected

to occur on average every 21 codons (64/3 = 21.3).
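
The arithmetic can be checked with a short script. This is a sketch under the same assumption as the answer (all 64 codons equally likely in a single reading frame); the simulation size and the random seed are arbitrary choices made here.

```python
import random
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
ALL_CODONS = ["".join(bases) for bases in product("ACGT", repeat=3)]

# Exact expectation: a stop codon has probability 3/64 at each position,
# so the mean spacing between stops is 64/3 codons (about 21.3).
p_stop = len(STOP_CODONS) / len(ALL_CODONS)
expected_spacing = 1 / p_stop

def simulated_stop_spacing(n_codons=200_000, seed=0):
    """Draw random codons and return codons read per stop codon encountered."""
    rng = random.Random(seed)
    stops = sum(rng.choice(ALL_CODONS) in STOP_CODONS for _ in range(n_codons))
    return n_codons / stops

print(round(expected_spacing, 1))  # prints 21.3
```

Calling `simulated_stop_spacing()` should give a value close to the exact figure, with small fluctuations that depend on the seed and the number of codons drawn.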

ANSWER 9–15 All of these mechanisms contribute to

the evolution of new protein-coding genes. A, C, D, and

E were discussed in the text. Recent studies indicate that

certain short protein-coding genes arose from previously

untranslated regions of genomes, so choice B is also correct.

ANSWER 9–16

A. Because synonymous changes do not alter the amino

acid sequence of the protein, they usually do not affect

the overall fitness of the organism and are therefore not

selected against. By contrast, nonsynonymous changes,

which substitute a new amino acid in place of the original

one, can alter the function of the encoded protein and

change the fitness of the organism. Since most amino

acid substitutions probably harm the protein, they tend

to be selected against.

B. Virtually all amino acid substitutions in the histone

H3 protein are deleterious and are therefore selected

against. The extreme conservation of histone H3 argues
