Special topic: Metagenomics - Genome Sciences

Gene prediction: Database analysis
Pfam database: provides models (statistical definitions) of ≈9,000 protein families.
Query: Do my predicted proteins fit a known protein family?
350 million comparisons: on a standard computer, this would have taken over a century to complete.
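To put the "over a century" figure in perspective, the sketch below works out the average time per comparison that it implies. It assumes exactly 100 years of serial runtime and a Julian year of 365.25 days; both are illustrative assumptions, since the slide only says "over a century", and the function name is hypothetical.

```python
# Back-of-the-envelope check of the "350 million comparisons, over a century" claim.
# Assumptions (not from the slide): exactly 100 years, serial execution,
# and a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def seconds_per_comparison(total_years: float, comparisons: int) -> float:
    """Average wall-clock seconds per comparison implied by a total serial runtime."""
    return total_years * SECONDS_PER_YEAR / comparisons


per_cmp = seconds_per_comparison(100, 350_000_000)
print(f"{per_cmp:.1f} s per comparison")  # prints "9.0 s per comparison"
```

Under these assumptions each sequence-versus-model comparison would cost roughly nine seconds on average, so a runtime of more than a century corresponds to at least that much per comparison.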
Sorcerer II: What did you do to my database!?

BAC/fosmid cloning (40–100 kb): gentle lysis, extraction, DNA size separation
Shotgun cloning (3 kb): lysis, DNA shearing

5.7 Million proteins

Expanding the Protein F…

Table 2. Clustering and HMM Profiling Results Showing the Number of Predicted Proteins (Including Both Redundant and Nonredundant Sequences) in Each Dataset

Dataset        Original Set   Clustering (A)   HMM Profiling (B)   A ∩ B       A − B       B − A     Total Predicted Proteins (A ∪ B)
NCBI-nr        2,317,995      1,939,056        1,645,146           1,566,123   372,933     79,023    2,018,079
PG ORFs        3,049,695      575,729          448,159             418,503     157,226     29,656    605,385
TGI-EST ORFs   5,458,820      1,097,083        606,779             576,532     520,551     30,247    1,127,330
ENS            361,668        319,855          253,007             241,671     78,184      11,336    331,191
GOS ORFs       17,422,766     6,046,914        3,701,388           3,624,907   2,422,007   76,481    6,123,395
Total          28,610,944     9,978,637        6,654,479           6,427,736   3,550,901   226,743   10,205,380

A ∩ B denotes the number of predicted proteins common to both the clustering and the HMM profiling; A − B, the number of predicted proteins in clusters but not in the HMM profiling.
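The counts in Table 2 should satisfy simple set identities: A − B = A − (A ∩ B), B − A = B − (A ∩ B), and A ∪ B = A + B − (A ∩ B). The sketch below checks each row against those identities and confirms that the Total row is the column-wise sum of the five datasets. The column order (Original Set, A, B, A ∩ B, A − B, B − A, A ∪ B) is an assumption inferred from the numbers, and the helper names are hypothetical.

```python
# Consistency check for Table 2.
# Assumed column order: original set, A (clustering), B (HMM profiling),
# A∩B, A−B, B−A, A∪B.
TABLE2 = {
    "NCBI-nr":      (2_317_995, 1_939_056, 1_645_146, 1_566_123, 372_933, 79_023, 2_018_079),
    "PG ORFs":      (3_049_695, 575_729, 448_159, 418_503, 157_226, 29_656, 605_385),
    "TGI-EST ORFs": (5_458_820, 1_097_083, 606_779, 576_532, 520_551, 30_247, 1_127_330),
    "ENS":          (361_668, 319_855, 253_007, 241_671, 78_184, 11_336, 331_191),
    "GOS ORFs":     (17_422_766, 6_046_914, 3_701_388, 3_624_907, 2_422_007, 76_481, 6_123_395),
}
TOTAL = (28_610_944, 9_978_637, 6_654_479, 6_427_736, 3_550_901, 226_743, 10_205_380)


def row_is_consistent(row):
    """True if the set-difference and union columns follow from A, B and A∩B."""
    _, a, b, both, a_only, b_only, union = row
    return (a - both == a_only
            and b - both == b_only
            and a + b - both == union)


def column_sums(table):
    """Column-wise sums over all dataset rows."""
    return tuple(sum(col) for col in zip(*table.values()))


for name, row in TABLE2.items():
    print(name, "consistent" if row_is_consistent(row) else "INCONSISTENT")
print("Total row matches column sums:", column_sums(TABLE2) == TOTAL)
```

Every row passes, which supports the reading of the garbled column headers as intersection, the two set differences, and union.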
Page 1 and 2: MetaGenomics Carlos Araya | 2.6.08
Page 3 and 4: What is metagenomics? (and why now?
Page 5 and 6: 16s rRNA: Molecular clocks & evolut
Page 7 and 8: How complex are different environme
Page 9 and 10: percent similarity 50 100 Bible: Fr
Page 11 and 12: Yooseph et al., 2007
Page 13 and 14: Ocean’s diversity landscape is fa
Page 15 and 16: How much sequencing does it take? #
Page 17 and 18: # reads required fraction covered
Page 19 and 20: Sequencing of entire BAC insert Gen
Page 21: Gene prediction: Clustering ≥3 fi
Page 25 and 26: grouping shows that our current GOS
Page 27 and 28: Sargasso Sea North Atlantic Coast 1
Page 29 and 30: and phosphatases protein kinases se
Page 31 and 32: A few words on Craig Venter…
Page 35 and 36: “The best-laid plans of mice & me
Page 37 and 38: Computational: ‣ Genome assembly
Page 39 and 40: Biocatalysis Bio-catalysis: Greener
Page 41 and 42: TTG-8 [2] TTG-9 [5] TTG-10 [39] Bio
Page 44 and 45: Woese, 1987
