Essential Cell Biology 5th edition
348HOW WE KNOWSEQUENCING THE HUMAN GENOMEWhen DNA sequencing techniques became fully automated,determining the order of the nucleotides in apiece of DNA went from being an elaborate Ph.D. thesisproject to a routine laboratory chore. Feed DNA into thesequencing machine, add the necessary reagents, andout comes the sought-after result: the order of As, Ts,Gs, and Cs. Nothing could be simpler.So why was sequencing the human genome such aformidable task? Largely because of its size. The DNAsequencing methods employed at the time were limitedby the physical size of the gel used to separate thelabeled fragments (see, for example, Figure Q10−9). Atmost, only a few hundred nucleotides could be readfrom a single gel. How, then, do you handle a genomethat contains billions of nucleotide pairs?The solution is to break the genome into fragments andsequence these smaller pieces. The main challenge thencomes in piecing the short fragments together in thecorrect order to yield a comprehensive sequence of awhole chromosome, and ultimately a whole genome.There are two main strategies for accomplishing thisgenomic breakage and reassembly: the shotgun methodand the clone-by-clone approach.Shotgun sequencingThe most straightforward approach to sequencing agenome is to break it into random fragments, separateand sequence each of the single-stranded fragments,and then use a powerful computer to order these piecesusing sequence overlaps to guide the assembly (Figure10–18). This approach is called the shotgun sequencingstrategy. As an analogy, imagine shredding several copiesof Essential Cell Biology (ECB), mixing up the pieces,RANDOMFRAGMENTATIONmultiple copiesof genomeand then trying to put one whole copy of the book backtogether again by matching up the words or phrases orsentences that appear on each piece. (Several copieswould be needed to generate enough overlap for reassembly.)It could be done, but it would be much easier ifthe book were, say, only two pages long.For this reason, a straight-out shotgun approach is thestrategy of choice only for sequencing small genomes.The method proved its worth in 1995, when it was usedto sequence the genome of the infectious bacteriumHaemophilus influenzae, the first organism to have itscomplete genome sequence determined. The troublewith shotgun sequencing is that the reassembly processcan be derailed by repetitive nucleotide sequences.Although rare in bacteria, these sequences make up alarge fraction of vertebrate genomes (see Figure 9–33).Highly repetitive DNA segments make it difficult topiece DNA sequences back together accurately (Figure10–19). Returning to the ECB analogy, this chapter alonecontains more than a few instances of the phrase “thehuman genome.” Imagine that one slip of paper from theshredded ECBs contains the information: “So why wassequencing the human genome” (which appears at thestart of this section); another contains the information:“the human genome sequence consortium combinedshotgun sequencing with a clone-by-clone approach”(which appears below). You might be tempted to jointhese two segments together based on the overlappingphrase “the human genome.” But you would wind upwith the nonsensical statement: “So why was sequencingthe human genome sequence consortium combinedshotgun sequencing with a clone-by-clone approach.”You would also lose the several paragraphs of importanttext that originally appeared between these twoinstances of “the human genome.”And that’s just in this section. The phrase “the humangenome” appears in many chapters of this book. Suchrepetition compounds the problem of placing eachfragment in its correct context. To circumvent theseassembly problems, researchers in the human genomesequence consortium combined shotgun sequencingwith a clone-by-clone approach.---GCCATTAGTTCASEQUENCE ONE STRANDOF FRAGMENTSGTTCAGCATTG---ASSEMBLESEQUENCE---GCCATTAGTTCAGCATTG---ORIGINAL SEQUENCERECONSTRUCTED BASEDON SEQUENCE OVERLAPsequencesof twofragmentsFigure 10–18 Shotgunsequencing is themethod of choice forsmall genomes. Thegenome is first brokeninto much smaller,overlapping fragments.Each fragment is thensequenced, and thegenome is assembledbased on overlappingsequences.Clone-by-cloneIn this approach, researchers started by preparing agenomic DNA library. They broke the human genomeinto overlapping fragments, 100–200 kilobase pairs insize. They then plugged these segments into bacterialartificial chromosomes (BACs) and inserted them intoE. coli. (BACs are similar to the bacterial plasmids discussedearlier, except they can carry much larger piecesof DNA.) As the bacteria divided, they copied the BACs,thus producing a collection of overlapping cloned fragments(see Figure 10–8).
Sequencing DNA349repetitive DNAinterveninginformationRANDOMFRAGMENTATIONSEQUENCEFRAGMENTSGATTACAGATTACAGATTACA------GATTACAGATTACAGATTACASEQUENCE ASSEMBLEDINCORRECTLYmultiple copiesof genome---GATTACAGATTACAGATTACAGATTACA---intervening information is lostsequencesof twofragmentsFigure 10–19 Repetitive DNA sequences in a genome make itdifficult to accurately assemble its fragments. In this example,the DNA contains two segments of repetitive DNA, each madeof many copies of the sequence GATTACA. When the resultingsequences are examined, two fragments from different partsof the DNA appear to overlap. Assembling these sequencesincorrectly would result in a loss of the information (in brackets)that lies between the original repeats.ECB5 e10.25/10.19The researchers then determined where each of theseDNA fragments fit into the existing map of the humangenome. To do this, different restriction enzymes wereused to cut each clone to generate a unique restrictionsite“signature.” The locations of the restriction sites ineach fragment allowed researchers to map each BACclone onto a restriction map of a whole human genomethat had been generated previously using the same set ofrestriction enzymes (Figure 10–20).Knowing the relative positions of the cloned fragments,the researchers then selected some 30,000 BACs, shearedeach into smaller fragments, and determined the nucleotidesequence of each BAC separately using the shotgunmethod. They could then assemble the whole genomesequence by stitching together the sequences of thousandsof individual BACs that span the length of thegenome.The beauty of this approach was that it was relativelyeasy to accurately determine where the BAC fragmentsbelong in the genome. This mapping step reduced thelikelihood that regions containing repetitive sequenceswere assembled incorrectly, and it virtually eliminatedthe possibility that sequences from different chromosomeswere mistakenly joined together. Returning tothe textbook analogy, the BAC-based approach is akin tofirst separating your copies of ECB into individual pagesand then shredding each page into its own separate pile.It should be much easier to put the book back togetherwhen one pile of fragments contains words from page 1,a second pile from page 2, and so on. And there’s virtuallyno chance of mistakenly sticking a sentence frompage 40 into the middle of a paragraph on page 412.All together nowThe clone-by-clone approach produced the first draftof the human genome sequence in 2000 and the completedsequence in 2004. As the set of instructions thatspecify all of the RNA and protein molecules needed tobuild a human being, this string of genetic bits holds thesecrets to human development and physiology. But thesequence was also of great value to researchers interestedin comparative genomics or in the physiology ofother organisms: it eased the assembly of nucleotidesequences from other mammalian genomes—mice, rats,dogs, and other primates. It also made it much easier todetermine the nucleotide sequences of the genomes ofindividual humans by providing a framework on whichthe new sequences could be simply superimposed.The first human sequence was the only mammaliangenome completed in this methodical way. But theHuman Genome Project was an unqualified success inthat it provided the techniques, confidence, and momentumthat drove the development of the next generationof DNA sequencing methods, which are now rapidlytransforming all areas of biology.restriction map of one segmentof human genomerestriction patternfor individual BACclonescleavage sites for restriction nucleases A, B, C, D, and EAA D B BA B C ECFigure 10–20 Individual BAC clones arepositioned on the physical map of thehuman genome sequence on the basis oftheir restriction-site “signatures.” Clones aredigested with five different restriction enzymes,and the sites at which the different enzymes cuteach clone are recorded. The distinctive patternof restriction sites allows investigators to orderthe fragments and place them on a restrictionmap of a human genome that had beenpreviously generated using the same nucleases.
- Page 332 and 333: 298 CHAPTER 9 How Genes and Genomes
- Page 334 and 335: 300 CHAPTER 9 How Genes and Genomes
- Page 336 and 337: 302 CHAPTER 9 How Genes and Genomes
- Page 338 and 339: 304 CHAPTER 9 How Genes and Genomes
- Page 340 and 341: 306 CHAPTER 9 How Genes and Genomes
- Page 342 and 343: 308 CHAPTER 9 How Genes and Genomes
- Page 344 and 345: 310 CHAPTER 9 How Genes and Genomes
- Page 346 and 347: 312 CHAPTER 9 How Genes and Genomes
- Page 348 and 349: 314 CHAPTER 9 How Genes and Genomes
- Page 350 and 351: 316 CHAPTER 9 How Genes and Genomes
- Page 352 and 353: 318 CHAPTER 9 How Genes and Genomes
- Page 354 and 355: 320 CHAPTER 9 How Genes and Genomes
- Page 356 and 357: 322 CHAPTER 9 How Genes and Genomes
- Page 358 and 359: 324HOW WE KNOWCOUNTING GENESHow man
- Page 360 and 361: 326 CHAPTER 9 How Genes and Genomes
- Page 362 and 363: 328 CHAPTER 9 How Genes and Genomes
- Page 364 and 365: 5′ 3′5′3′330 CHAPTER 9 How
- Page 367 and 368: CHAPTER TEN10Analyzing the Structur
- Page 369 and 370: Isolating and Cloning DNA Molecules
- Page 371 and 372: Isolating and Cloning DNA Molecules
- Page 373 and 374: Isolating and Cloning DNA Molecules
- Page 375 and 376: IIIIIIIIIIIIIDNA Cloning by PCR341D
- Page 377 and 378: DNA Cloning by PCR343HEAT TOSEPARAT
- Page 379 and 380: DNA Cloning by PCR345(A)ANALYSIS OF
- Page 381: Sequencing DNA347single-stranded DN
- Page 385 and 386: Exploring Gene Function351each loca
- Page 387 and 388: Exploring Gene Function353(A)CONSTR
- Page 389 and 390: Exploring Gene Function355E. coli,
- Page 391 and 392: Exploring Gene Function357(A)(B)Fig
- Page 393 and 394: Exploring Gene Function359fused to
- Page 395 and 396: Exploring Gene Function361Figure 10
- Page 397 and 398: Questions363• DNA sequencing tech
- Page 399 and 400: CHAPTER ELEVEN11Membrane StructureA
- Page 401 and 402: ECB5 e11.05/11.05The Lipid Bilayer3
- Page 403 and 404: The Lipid Bilayer369hydrogen bondsC
- Page 405 and 406: The Lipid Bilayer371sets a lower li
- Page 407 and 408: The Lipid Bilayer373Figure 11-16 Ne
- Page 409 and 410: Membrane Proteins375TRANSPORTERS AN
- Page 411 and 412: Membrane Proteins377A Polypeptide C
- Page 413 and 414: Membrane Proteins379Figure 11-26 SD
- Page 415 and 416: Membrane Proteins381spectrin dimera
- Page 417 and 418: Membrane Proteins383apical plasmame
- Page 419 and 420: ECB5 e11.36/11.36Membrane Proteins3
- Page 421 and 422: Questions387• Most cell membranes
- Page 423 and 424: CHAPTER TWELVE12Transport Across Ce
- Page 425 and 426: Principles of Transmembrane Transpo
- Page 427 and 428: Principles of Transmembrane Transpo
- Page 429 and 430: Transporters and Their Functions395
- Page 431 and 432: Transporters and Their Functions397
348
HOW WE KNOW
SEQUENCING THE HUMAN GENOME
When DNA sequencing techniques became fully automated,
determining the order of the nucleotides in a
piece of DNA went from being an elaborate Ph.D. thesis
project to a routine laboratory chore. Feed DNA into the
sequencing machine, add the necessary reagents, and
out comes the sought-after result: the order of As, Ts,
Gs, and Cs. Nothing could be simpler.
So why was sequencing the human genome such a
formidable task? Largely because of its size. The DNA
sequencing methods employed at the time were limited
by the physical size of the gel used to separate the
labeled fragments (see, for example, Figure Q10−9). At
most, only a few hundred nucleotides could be read
from a single gel. How, then, do you handle a genome
that contains billions of nucleotide pairs?
The solution is to break the genome into fragments and
sequence these smaller pieces. The main challenge then
comes in piecing the short fragments together in the
correct order to yield a comprehensive sequence of a
whole chromosome, and ultimately a whole genome.
There are two main strategies for accomplishing this
genomic breakage and reassembly: the shotgun method
and the clone-by-clone approach.
Shotgun sequencing
The most straightforward approach to sequencing a
genome is to break it into random fragments, separate
and sequence each of the single-stranded fragments,
and then use a powerful computer to order these pieces
using sequence overlaps to guide the assembly (Figure
10–18). This approach is called the shotgun sequencing
strategy. As an analogy, imagine shredding several copies
of Essential Cell Biology (ECB), mixing up the pieces,
RANDOM
FRAGMENTATION
multiple copies
of genome
and then trying to put one whole copy of the book back
together again by matching up the words or phrases or
sentences that appear on each piece. (Several copies
would be needed to generate enough overlap for reassembly.)
It could be done, but it would be much easier if
the book were, say, only two pages long.
For this reason, a straight-out shotgun approach is the
strategy of choice only for sequencing small genomes.
The method proved its worth in 1995, when it was used
to sequence the genome of the infectious bacterium
Haemophilus influenzae, the first organism to have its
complete genome sequence determined. The trouble
with shotgun sequencing is that the reassembly process
can be derailed by repetitive nucleotide sequences.
Although rare in bacteria, these sequences make up a
large fraction of vertebrate genomes (see Figure 9–33).
Highly repetitive DNA segments make it difficult to
piece DNA sequences back together accurately (Figure
10–19). Returning to the ECB analogy, this chapter alone
contains more than a few instances of the phrase “the
human genome.” Imagine that one slip of paper from the
shredded ECBs contains the information: “So why was
sequencing the human genome” (which appears at the
start of this section); another contains the information:
“the human genome sequence consortium combined
shotgun sequencing with a clone-by-clone approach”
(which appears below). You might be tempted to join
these two segments together based on the overlapping
phrase “the human genome.” But you would wind up
with the nonsensical statement: “So why was sequencing
the human genome sequence consortium combined
shotgun sequencing with a clone-by-clone approach.”
You would also lose the several paragraphs of important
text that originally appeared between these two
instances of “the human genome.”
And that’s just in this section. The phrase “the human
genome” appears in many chapters of this book. Such
repetition compounds the problem of placing each
fragment in its correct context. To circumvent these
assembly problems, researchers in the human genome
sequence consortium combined shotgun sequencing
with a clone-by-clone approach.
---GCCATTAGTTCA
SEQUENCE ONE STRAND
OF FRAGMENTS
GTTCAGCATTG---
ASSEMBLE
SEQUENCE
---GCCATTAGTTCAGCATTG---
ORIGINAL SEQUENCE
RECONSTRUCTED BASED
ON SEQUENCE OVERLAP
sequences
of two
fragments
Figure 10–18 Shotgun
sequencing is the
method of choice for
small genomes. The
genome is first broken
into much smaller,
overlapping fragments.
Each fragment is then
sequenced, and the
genome is assembled
based on overlapping
sequences.
Clone-by-clone
In this approach, researchers started by preparing a
genomic DNA library. They broke the human genome
into overlapping fragments, 100–200 kilobase pairs in
size. They then plugged these segments into bacterial
artificial chromosomes (BACs) and inserted them into
E. coli. (BACs are similar to the bacterial plasmids discussed
earlier, except they can carry much larger pieces
of DNA.) As the bacteria divided, they copied the BACs,
thus producing a collection of overlapping cloned fragments
(see Figure 10–8).