Comparing genomes to computer operating systems in terms of the ...


are revised more often. In fact, the adaptive functions distinguish themselves by having higher values of reuse (12.6% versus 4.4%, Wilcoxon rank-sum test P < 10^-20) than the conservative functions.

Discussion

We have presented a comparative analysis between the transcriptional regulatory network of E. coli and the call graph of the Linux operating system and explored their similarities and differences in hierarchical structure, modularity of organization, and persistence of nodes. A summary of the comparison can be found in Table 2. The two networks are shaped by different underlying design principles, which are deeply connected to the interplay between the systems and their environments. From a topological standpoint, it is intriguing that two distinct evolutionary processes both lead to the emergence of hierarchy in the control and regulation layouts, probably because hierarchy is a most effective way to transfer information and coordinate processes. Nevertheless, we have observed several intrinsic differences between the two hierarchical networks.
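To make the notion of hierarchical shape concrete, the following minimal sketch assigns a level to every node of a small directed graph and counts the nodes per level. It rests on several assumptions that are not taken from the paper: the graph is acyclic, a node's level is the length of its longest downstream path (this is not necessarily the authors' leveling procedure), and all node names are hypothetical toy data.

```python
from collections import Counter
from functools import lru_cache


def hierarchy_levels(edges):
    """Map each node to a level: 0 for nodes with no outgoing edges,
    otherwise 1 + the highest level among the nodes it points to.
    Assumes the graph is acyclic (a cycle would recurse without end)."""
    targets = {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)
        targets.setdefault(dst, set())

    @lru_cache(maxsize=None)
    def level(node):
        # default=-1 makes a node without targets land on level 0
        return 1 + max(map(level, targets[node]), default=-1)

    return {node: level(node) for node in targets}


def level_sizes(edges):
    """Number of nodes on each level, listed from the bottom (level 0) up."""
    counts = Counter(hierarchy_levels(edges).values())
    return [counts[lvl] for lvl in range(max(counts) + 1)]


# Hypothetical regulatory-style network: one master regulator on top,
# many target genes at the bottom (edges point regulator -> target).
pyramidal = [
    ("tf_top", "tf_a"), ("tf_top", "tf_b"),
    ("tf_a", "g1"), ("tf_a", "g2"),
    ("tf_b", "g3"), ("tf_b", "g4"),
]

# Hypothetical call-graph-style network: many starting functions on top,
# one shared utility at the bottom (edges point caller -> callee).
top_heavy = [
    ("main_a", "mid_a"), ("main_b", "mid_a"),
    ("main_c", "mid_b"), ("main_d", "mid_b"),
    ("mid_a", "util"), ("mid_b", "util"),
]

print(level_sizes(pyramidal))  # bottom-to-top counts: [4, 2, 1]
print(level_sizes(top_heavy))  # bottom-to-top counts: [1, 2, 4]
```

Reading the counts from bottom to top, a shrinking sequence corresponds to a pyramidal layout and a growing one to a top-heavy layout.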
To a certain extent, the presence of in-degree hub functions and the top-heavy hierarchy found in the call graph can be readily explained by common programming practices. In general, for the sake of clarity and easy debugging, programmers are encouraged to break code down into pieces and reuse certain functions; functions that are called by many others, i.e., in-degree hubs, are therefore favored. The reuse of code leads to generic functions, which also accounts for the increase of overlap between modules in the Linux call graph. These programming practices are rooted in considerations of cost effectiveness. From an engineering point of view, the reuse of common nodes between modules is a cost-effective way to construct a complex system. However, such optimized usage of functions comes at the expense of robustness, because breakdown of a generic function causes problems in many modules. More importantly, generic functions lead to potential fragility in the sense that modifying any module may require compensating changes in a generic function.
As a result, generic functions have to be updated more often (as reflected by the class of rapidly revising functions in Fig. 4A). The low overlap between modules in biological networks, on the other hand, increases robustness. Modules tend to work more independently by recruiting different sets of workhorses from the broad base of the network hierarchy.

The study of persistent genes in biological networks and persistent functions in call graphs offers insight into the evolution of hierarchies. Persistent genes form the core machinery of life, the so-called paleome (23). They usually are not regulators but workhorse genes that perform vital tasks. In fact, most persistent genes encode enzymes. The enrichment of persistent genes at the bottom of the regulatory hierarchy in E. coli is in accordance with the view that orthologous proteins are rather similar in function whereas regulatory changes are the main driving forces of evolution (9). To a certain extent, biological evolution is building from the bottom to the top.
In contrast, persistent functions in the Linux call graph are usually not bottom-level workhorses but “controllers.” This difference suggests that not only do software networks possess more regulators than workhorses, but the regulators are also maintained on purpose, and thus the evolution goes from top to bottom.

The trade-off between robustness and cost effectiveness in biological and software systems is deeply related to the nature of their evolutionary processes. Biological evolution is mediated by random mutations followed by natural selection; a hub protein in a biological network is in general hard to evolve because of the constraints imposed by its many interactions. This constrained evolution is evinced by the negative correlation between node centrality and evolutionary rate in biological networks (24, 25). The random mutation and selection process underlying biological evolution prohibits the frequent targeted changes required for nodes to become generic. The system is then forced to pay for maintaining a large set of specially designed components performing a variety of functions in response to environmental changes.
Engineered systems, in contrast, are fundamentally different. Both in-degree and betweenness centrality (26) are positively correlated with the rate of revision in the Linux call graph (see Fig. 4B for in-degree; Spearman correlation r = 0.26, P < 10^-82 for betweenness). In other words, in software engineering, a system that needs to continually adapt to new conditions is cost effective only by paying the price of constantly fine-tuning its most highly accessed functions.

Reuse is extremely common in designing man-made systems. For biological systems, to what extent they reuse their repertoires and by what means they sustain robustness at the same time are questions of much interest. It was recently proposed that the repertoire of enzymes could be viewed as the toolbox of an organism (27). As the genome of an organism grows larger, it can reuse its tools more often and thus require fewer and fewer new tools for novel metabolic tasks.
In other words, the number of enzymes grows slower than the number of transcription factors when the size of the genome increases. Previous studies (4) have made the related finding that as one moves towards more complex organisms, the transcriptional regulatory network has an increas-

Table 2. One-to-one comparison between the E. coli regulatory network and the Linux call graph

Property | E. coli transcriptional regulatory network | Linux call graph

Basic properties of systems
  Nodes | Genes (TFs & targets) | Functions (subroutines)
  Edges | Transcriptional regulation | Function calls
  External constraints | Natural environment | Hardware architecture, customer requirements
  Origin of evolutionary changes | Random mutation & natural selection | Designers’ fine-tuning

Hierarchical organization
  Structure | Pyramidal | Top-heavy
  Characteristic hubs | Upper-level TFs with high out-degree | Generic workhorse functions with high in-degree

Organization of modules
  Downstream modules as labeled by | Master TFs responsible for sensing environmental signals | High-level starting functions that initiate execution for specific tasks
  Node reuse | Low | High
  Overlap between modules | Low | High

Persistent nodes
  Characteristics | Specialized (nongeneric) workhorses | Generic or reusable functions
  Location in hierarchy | Mostly bottom | Mostly top
  Evolutionary rate | Mostly conservative (e.g., dnaA) | Conservative (e.g., strlen) & adaptive (e.g., mempool_alloc)

Design principles
  Building of hierarchy | Bottom up | Top down
  Optimal solution favors | Robustness | Cost effectiveness (reuse of components)

BIOPHYSICS AND COMPUTATIONAL BIOLOGY
Yan et al. PNAS Early Edition | 5 of 6
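The hub, node-reuse, and module-overlap rows of Table 2 can be illustrated on a toy call graph. The sketch below is only an illustration under stated assumptions: the function names are hypothetical (not real Linux functions), a module is taken to be everything reachable from a starting function, reuse is taken to be the fraction of modules containing a node, and overlap is measured with the Jaccard index. These definitions are chosen for simplicity and may differ from the ones used by the authors.

```python
from itertools import combinations


def build_graph(edges):
    """Adjacency map caller -> set of callees; every node gets a key."""
    callees = {}
    for caller, callee in edges:
        callees.setdefault(caller, set()).add(callee)
        callees.setdefault(callee, set())
    return callees


def in_degrees(callees):
    """Number of distinct callers of each function."""
    deg = {node: 0 for node in callees}
    for targets in callees.values():
        for target in targets:
            deg[target] += 1
    return deg


def module(callees, start):
    """All functions reachable from `start` by following calls."""
    seen, stack = set(), [start]
    while stack:
        for nxt in callees[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reuse(modules):
    """Fraction of modules in which each node appears (assumed definition)."""
    nodes = set().union(*modules.values())
    return {n: sum(n in m for m in modules.values()) / len(modules)
            for n in nodes}


def mean_overlap(modules):
    """Mean Jaccard index over all pairs of modules (assumed measure)."""
    pairs = list(combinations(modules.values(), 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Hypothetical call graph: three starting functions share generic helpers.
edges = [
    ("task_a", "parse_a"), ("task_a", "log_msg"),
    ("task_b", "parse_b"), ("task_b", "log_msg"),
    ("task_c", "parse_c"), ("task_c", "log_msg"),
    ("parse_a", "alloc_buf"), ("parse_b", "alloc_buf"),
    ("parse_c", "log_msg"),
]

callees = build_graph(edges)
deg = in_degrees(callees)

# Starting functions are those that nothing else calls.
starts = sorted(node for node, d in deg.items() if d == 0)
modules = {start: module(callees, start) for start in starts}
node_reuse = reuse(modules)

print(max(deg, key=deg.get), deg["log_msg"])  # in-degree hub: log_msg 4
print(node_reuse["log_msg"])                  # 1.0: present in every module
print(round(mean_overlap(modules), 3))
```

In this toy graph the generic helper `log_msg` is both the in-degree hub and the most reused node, so a change to it would touch every module, which mirrors the fragility argument made in the Discussion.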
