
Initial sequencing and analysis of the human genome - Vitagenes



Figure 22 Density of the major repeat classes as a function of local GC content, in windows of 50 kb. [Plot: proportion of genome comprised by each class (%) against GC content (%); classes shown: SSR, LINE2, MIR, DNA, LINE1, Alu, LTR.]

Figure 23 Alu elements target AT-rich DNA, but accumulate in GC-rich DNA. This graph shows the relative distribution of various Alu cohorts as a function of local GC content. The divergence levels (including CpG sites) and ages of the cohorts are shown in the key. [Plot: frequency of Alu class relative to its average density in the genome against GC content bins (%); key: AluY < 1% (… 4% (5–30 Myr); AluSc (25–35 Myr); AluS (35–60 Myr); AluJ, FAM (60–100 Myr).]

Figure 24 DNA transposon copies in AT-rich DNA tend to be younger than those in more GC-rich DNA. DNA transposon families were grouped into five age categories by their median substitution level (see Fig. 19). The proportion attributed to each age class is shown as a function of GC content. Similar patterns are seen for LINE1 and LTR elements. [Plot: proportion of DNA transposons comprised by each age group (%) against GC content bins (%); nucleotide substitution levels: 24–30%, 21–23%, 16–20%, 14–15.5%, 0–13%.]

similarly resisted the insertion of transposable elements during rodent evolution.

Distribution by GC content. We next focused on the correlation between the nature of the transposons in a region and its GC content.
We calculated the density of each repeat type as a function of the GC content in 50-kb windows (Fig. 22). As has been reported 142,173–176, LINE sequences occur at much higher density in AT-rich regions (roughly fourfold enriched), whereas SINEs (MIR, Alu) show the opposite trend (for Alu, up to fivefold lower in AT-rich DNA). LTR retroposons and DNA transposons show a more uniform distribution, dipping only in the most GC-rich regions.

The preference of LINEs for AT-rich DNA seems like a reasonable way for a genomic parasite to accommodate its host, by targeting gene-poor AT-rich DNA and thereby imposing a lower mutational burden. Mechanistically, selective targeting is nicely explained by the fact that the preferred cleavage site of the LINE endonuclease is TTTT/A (where the slash indicates the point of cleavage), which is used to prime reverse transcription from the poly(A) tail of LINE RNA 177.

The contrary behaviour of SINEs, however, is baffling. How do SINEs accumulate in GC-rich DNA, particularly if they depend on the LINE transposition machinery 178? Notably, the same pattern is seen for the Alu-like B1 and the tRNA-derived SINEs in mouse and for MIR in human 142. One possibility is that SINEs somehow target GC-rich DNA for insertion.
The alternative is that SINEs initially insert with the same proclivity for AT-rich DNA as LINEs, but that the distribution is subsequently reshaped by evolutionary forces 142,179.

We used the draft genome sequence to investigate this mystery by comparing the proclivities of young, adolescent, middle-aged and old Alus (Fig. 23). Strikingly, recent Alus show a preference for AT-rich DNA resembling that of LINEs, whereas progressively older Alus show a progressively stronger bias towards GC-rich DNA. These results indicate that the GC bias must result from strong pressure: Fig. 23 shows that a 13-fold enrichment of Alus in GC-rich DNA has occurred within the last 30 Myr, and possibly more recently.

These results raise a new mystery. What is the force that produces the great and rapid enrichment of Alus in GC-rich DNA? One explanation may be that deletions are more readily tolerated in gene-poor AT-rich regions than in gene-rich GC-rich regions, resulting in older elements being enriched in GC-rich regions. Such an enrichment is seen for transposable elements such as

884 © 2001 Macmillan Magazines Ltd NATURE | VOL 409 | 15 FEBRUARY 2001 | www.nature.com
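
The windowed calculation behind Fig. 22 (repeat density as a function of GC content in 50-kb windows) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `gc_bin` and `repeat_density_by_gc`, the 2% bin width, and the annotation format (0-based, half-open `(start, end, repeat_class)` tuples that do not overlap within a class) are all assumptions made for illustration.

```python
from collections import defaultdict


def gc_bin(window_seq, bin_pct):
    """Lower edge (in %) of the GC bin for one window; None if it has no A/C/G/T."""
    s = window_seq.upper()
    gc = s.count("G") + s.count("C")
    n = gc + s.count("A") + s.count("T")  # ambiguous bases such as N are ignored
    if n == 0:
        return None
    # Integer arithmetic avoids floating-point errors at bin boundaries.
    return (100 * gc) // (n * bin_pct) * bin_pct


def repeat_density_by_gc(sequence, repeats, window=50_000, bin_pct=2):
    """Return {gc_bin_lower_edge: {repeat_class: fraction of bases covered}}.

    `repeats` is a hypothetical annotation list of (start, end, repeat_class)
    tuples, 0-based and half-open, assumed non-overlapping within a class.
    """
    covered = defaultdict(lambda: defaultdict(int))  # bin -> class -> bases
    total = defaultdict(int)                         # bin -> window bases
    for w_start in range(0, len(sequence), window):
        w_end = min(w_start + window, len(sequence))
        b = gc_bin(sequence[w_start:w_end], bin_pct)
        if b is None:
            continue  # window consists only of ambiguous bases
        total[b] += w_end - w_start
        for start, end, cls in repeats:
            # Count only the part of the repeat that lies inside this window.
            overlap = min(end, w_end) - max(start, w_start)
            if overlap > 0:
                covered[b][cls] += overlap
    return {
        b: {cls: n / total[b] for cls, n in covered.get(b, {}).items()}
        for b in total
    }
```

For a real chromosome one would load the sequence and a repeat annotation table, then plot each class's fraction against the bin edges. Scanning every repeat for every window is quadratic and only suitable for small inputs; sorted or indexed annotations would be needed at genome scale.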
