10.04.2013 Views

Special topic: Metagenomics - Genome Sciences


grouping shows that our current GOS sampling methodology will not cover all protein families, and perhaps misses some protein families that are exclusive to higher eukaryotes. The large section of clusters that include all three groupings

Figure 2. Rate of Discovery of Clusters as (Nonredundant) Sequences Are Added

The x-axis denotes the number of sequences (in millions) and the y-axis denotes the number of clusters (in thousands). Seven datasets with increasing numbers of (nonredundant) sequences are chosen as described in the text. The blue curve shows the number of core sets of size 3 for the seven datasets. Curves for core set sizes 5, 10, and 20 are also shown. Linear regression gives slopes 0.027 (R² = 0.999), 0.011 (R² = 0.999), 0.0053 (R² = 0.999), and 0.0024 (R² = 0.996) for size 3, size 5, size 10, and size 20, respectively.

Yooseph et al., 2007 | March 2007 | Volume 5 | Issue 3 | e16
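The slopes and R² values quoted in the Figure 2 caption come from a linear regression of cluster counts against sequence counts. As a minimal sketch of that kind of fit, assuming ordinary least squares (the caption does not state the exact procedure), the function below computes a slope, an intercept, and R² for paired points. The name `fit_line` and the seven data points are hypothetical placeholders for illustration; they are not the paper's code or data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations in x, and the x-y cross term.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be equal")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, intercept, r_squared


# Synthetic placeholder data (NOT from the paper): seven points lying
# on a line whose slope is chosen to match the size-3 value in the caption.
sequences = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
clusters = [0.027 * x + 0.01 for x in sequences]

slope, intercept, r2 = fit_line(sequences, clusters)
print(f"slope={slope:.4f}, intercept={intercept:.4f}, R^2={r2:.4f}")
```

Running the same function once per core set size (3, 5, 10, and 20) on the seven datasets would give one slope and one R² per curve, which is the shape of the result the caption reports.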
