12.07.2015 Views

Initial sequencing and analysis of the human genome - Vitagenes

Initial sequencing and analysis of the human genome - Vitagenes

Initial sequencing and analysis of the human genome - Vitagenes

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

articles<strong>the</strong> IAC anticodon for a Val tRNA could recognize GUU, GUC <strong>and</strong>even GUA codons. Were this same I34 to be utilized in two-codonboxes, however, misreading <strong>of</strong> <strong>the</strong> NNA codon would occur, resultingin translational havoc. Eukaryotic glycine tRNAs represent aconserved exception to this last rule; <strong>the</strong>y use a GCC anticodon todecode GGU <strong>and</strong> GGC, ra<strong>the</strong>r than <strong>the</strong> expected ICC anticodon.Satisfyingly, <strong>the</strong> <strong>human</strong> tRNA set follows <strong>the</strong>se wobble rulesalmost perfectly (Fig. 34). Only three unexpected tRNA speciesare found: single genes for a tRNATyr-AUA, tRNAIle-GAU, <strong>and</strong>tRNAAsn-AUU. Perhaps <strong>the</strong>se are pseudogenes, but <strong>the</strong>y appear tobe plausible tRNAs. We also checked <strong>the</strong> possibility <strong>of</strong> <strong>sequencing</strong>errors in <strong>the</strong>ir anticodons, but each <strong>of</strong> <strong>the</strong>se three genes is in a region<strong>of</strong> high sequence accuracy, with PHRAP quality scores higherthan 70 for every base in <strong>the</strong>ir anticodons.As in all o<strong>the</strong>r organisms, <strong>human</strong> protein-coding genes showcodon biasÐpreferential use <strong>of</strong> one synonymous codon overano<strong>the</strong>r 258 (Fig. 34). In less complex organisms, such as yeast orbacteria, highly expressed genes show <strong>the</strong> strongest codon bias.Cytoplasmic abundance <strong>of</strong> tRNA species is correlated with bothcodon bias <strong>and</strong> overall amino-acid frequency (for example, tRNAsfor preferred codons <strong>and</strong> for more common amino acids are moreabundant). This is presumably driven by selective pressure foref®cient or accurate translation 259 . In many organisms, tRNAabundance in turn appears to be roughly correlated with tRNAgene copy number, so tRNA gene copy number has been used as aproxy for tRNA abundance 260 . In vertebrates, however, codon bias isnot so obviously correlated with gene expression level. Differingcodon biases between <strong>human</strong> genes is more a function <strong>of</strong> <strong>the</strong>irlocation in regions <strong>of</strong> different GC composition 261 . In agreementwith <strong>the</strong> literature, we see only a very rough correlation <strong>of</strong> <strong>human</strong>tRNA gene number with ei<strong>the</strong>r amino-acid frequency or codon bias(Fig. 34). The most obvious outliers in <strong>the</strong>se weak correlations are<strong>the</strong> strongly preferred CUG leucine codon, with a mere six tRNA-Leu-CAG genes producing a tRNA to decode it, <strong>and</strong> <strong>the</strong> relativelyrare cysteine UGU <strong>and</strong> UGC codons, with 30 tRNA genes to decode<strong>the</strong>m.The tRNA genes are dispersed throughout <strong>the</strong> <strong>human</strong> <strong>genome</strong>.However, this dispersal is nonr<strong>and</strong>om. tRNA genes have sometimesbeen seen in clusters at small scales 262,263 but we can now see strikingclustering on a <strong>genome</strong>-wide scale. More than 25% <strong>of</strong> <strong>the</strong> tRNAgenes (140) are found in a region <strong>of</strong> only about 4 Mb on chromosome6. This small region, only about 0.1% <strong>of</strong> <strong>the</strong> <strong>genome</strong>, containsan almost suf®cient set <strong>of</strong> tRNA genes all by itself. The 140 tRNAgenes contain a representative for 36 <strong>of</strong> <strong>the</strong> 49 anticodons found in<strong>the</strong> complete set; <strong>and</strong> <strong>of</strong> <strong>the</strong> 21 isoacceptor types, only tRNAs todecode Asn, Cys, Glu <strong>and</strong> selenocysteine are missing. Many <strong>of</strong> <strong>the</strong>setRNA genes, meanwhile, are clustered elsewhere; 18 <strong>of</strong> <strong>the</strong> 30 CystRNAs are found in a 0.5-Mb stretch <strong>of</strong> chromosome 7 <strong>and</strong> many <strong>of</strong><strong>the</strong> Asn <strong>and</strong> Glu tRNA genes are loosely clustered on chromosome 1.More than half <strong>of</strong> <strong>the</strong> tRNA genes (280 out <strong>of</strong> 497) reside on ei<strong>the</strong>rchromosome 1 or chromosome 6. Chromosomes 3, 4, 8, 9, 10, 12,18, 20, 21 <strong>and</strong> X appear to have fewer than 10 tRNA genes each; <strong>and</strong>chromosomes 22 <strong>and</strong> Y have none at all (each has a singlepseudogene).Ribosomal RNA genes. The ribosome, <strong>the</strong> protein syn<strong>the</strong>ticmachine <strong>of</strong> <strong>the</strong> cell, is made up <strong>of</strong> two subunits <strong>and</strong> contains fourrRNA species <strong>and</strong> many proteins. The large ribosomal subunitcontains 28S <strong>and</strong> 5.8S rRNAs (collectively called `large subunit'(LSU) rRNA) <strong>and</strong> also a 5S rRNA. The small ribosomal subunitcontains 18S rRNA (`small subunit' (SSU) rRNA). The genes forLSU <strong>and</strong> SSU rRNA occur in <strong>the</strong> <strong>human</strong> <strong>genome</strong> as a 44-kb t<strong>and</strong>emrepeat unit 264 . There are thought to be about 150±200 copies <strong>of</strong> thisrepeat unit arrayed on <strong>the</strong> short arms <strong>of</strong> acrocentric chromosomes13, 14, 15, 21 <strong>and</strong> 22 (refs 254, 264). There are no true completePheLeu171 UUU203 UUC73 UUA125 UUGAAA 0GAA 14UAA 8CAA 6Ser14717211845UCUUCCUCAUCGAGAGGAUGACGA10054Tyrstopstop12415800UAUUACUAAUAGAUAGUAUUACUA11100CysstopTrp991190122UGUUGCUGAUGGACA 0GCA 30UCA 0CCA 7Leu127 CUU187 CUC69 CUA392 CUGAAG 13GAG 0UAG 2CAG 6Pro17519717069CCUCCCCCACCGAGGGGGUGGCGG110104HisGln104147121343CAUCACCAACAGAUGGUGUUGCUG0121121Arg4710763115CGUCGCCGACGGACG 9GCG 0UCG 7CCG 5IleMet165 AUU218 AUC71 AUA221 AUGAAU 13GAU 1UAU 5CAU 17Thr13119215063ACUACCACAACGAGUGGUUGUCGU80107AsnLys174199248331AAUAACAAAAAGAUUGUUUUUCUU1331622SerArg121191113110AGUAGCAGAAGGACU 0GCU 7UCU 5CCU 4Val111 GUU146 GUC72 GUA288 GUGAAC 20GAC 0UAC 5CAC 19Ala18528216074GCUGCCGCAGCGAGCGGCUGCCGC250105AspGlu230262301404GAUGACGAAGAGAUCGUCUUCCUC010148Gly112230168160GGUGGCGGAGGGACC 0GCC 11UCC 5CCC 8Figure 34 The <strong>human</strong> genetic code <strong>and</strong> associated tRNA genes. For each <strong>of</strong> <strong>the</strong> 64codons, we show: <strong>the</strong> corresponding amino acid; <strong>the</strong> observed frequency <strong>of</strong> <strong>the</strong> codon per10,000 codons; <strong>the</strong> codon; predicted wobble pairing to a tRNA anticodon (black lines); anunmodi®ed tRNA anticodon sequence; <strong>and</strong> <strong>the</strong> number <strong>of</strong> tRNA genes found with thisanticodon. For example, phenylalanine is encoded by UUU or UUC; UUC is seen morefrequently, 203 to 171 occurrences per 10,000 total codons; both codons are expected tobe decoded by a single tRNA anticodon type, GAA, using a G/U wobble; <strong>and</strong> <strong>the</strong>re are 14tRNA genes found with this anticodon. The modi®ed anticodon sequence in <strong>the</strong> maturetRNA is not shown, even where post-transcriptional modi®cations can be con®dentlypredicted (for example, when an A is used to decode a U/C third position, <strong>the</strong> A is almostcertainly an inosine in <strong>the</strong> mature tRNA). The Figure also does not show <strong>the</strong> number <strong>of</strong>distinct tRNA species (such as distinct sequence families) for each anticodon; <strong>of</strong>ten <strong>the</strong>reis more than one species for each anticodon.894 © 2001 Macmillan Magazines Ltd NATURE | VOL 409 | 15 FEBRUARY 2001 | www.nature.com

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!