Transition Pathways in Complex Systems: Application of the Finite ...


Abstract

The finite-temperature string method proposed by E, Ren and Vanden-Eijnden is a very effective way of identifying transition mechanisms and transition rates between metastable states in systems with complex energy landscapes. In this paper, we discuss the theoretical background and algorithmic details of the finite-temperature string method, as well as its application to the study of the isomerization reaction of the alanine dipeptide, both in vacuum and in explicit solvent. We demonstrate that the method allows us to identify directly the iso-committor surfaces, which are approximated by hyperplanes, in the region of configuration space where the most probable transition trajectories are concentrated. These results are verified subsequently by computing directly the committor distribution on the hyperplanes that define the transition state region.

PACS numbers: 02.50.-r, 02.60.-x, 82.20.Pm


I. INTRODUCTION

This paper is a continuation of the works presented in Refs. 1 and 2 on the study of transition pathways and transition rates between metastable states in complex systems. In the present paper, we will focus on the finite-temperature string method (FTS), presented in Ref. 2. We will discuss the theoretical background behind FTS, the algorithmic details and implementation issues, and we will demonstrate the success and difficulties of FTS for the example of conformation changes of an alanine dipeptide molecule, both in vacuum and with explicit solvent.

The study of transitions between metastable states has been a topic of great interest in many areas for many years. For systems with simple energy landscapes in which the metastable states are separated by a few isolated barriers, there exists a solid theoretical foundation as well as satisfactory computational tools for identifying the transition pathways and transition rates. The key objects in this case are the transition states, which are saddle points on the potential energy landscape that separate the metastable states. The relevant notion for the transition pathways is that of minimum energy paths (MEP).
MEPs are paths in configuration space that connect the metastable states and along which the potential force is parallel to the tangent vector. The MEP allows us to identify the relevant saddle points, which act as bottlenecks for a particular transition. Several computational methods have been developed for finding the MEPs. Most successful among these methods are the nudged elastic band (NEB) method [3] and the zero-temperature string method (ZTS) [1]. Once the MEP is obtained, transition rates can be computed using several strategies (see for example Refs. 1 and 4). Alternatively, one may look directly for the saddle point, and several methods have been developed for this purpose, including the conjugate peak refinement method [18], the ridge method [19] and the dimer method [20].

The situation is quite different for systems with rough energy landscapes, as is the case for typical chemical reactions of solvated systems. In this case, traditional notions of transition states have to be reconsidered since there may not exist specific microscopic configurations that identify the bottleneck of the transition. Instead the potential energy landscape typically contains numerous saddle points, most of which are separated by barriers that are less than or comparable to $k_B T$, and therefore do not act as barriers.
There is not a unique most probable path for the transition. Instead, a collection of paths is important [21]. Describing transitions in such systems has become a central issue in recent years.

The standard practice for identifying transition pathways and transition rates in complex systems is to coarse-grain the system using a pre-determined “reaction coordinate,” usually selected on an empirical basis. The free energy landscape associated with this reaction coordinate is then calculated, using for example the blue-moon sampling technique [5], and the transition states are identified as the saddle points in the free energy landscape.

Such a procedure based on intuitive notions of reaction coordinates has been both helpful and misleading. For simple enough systems, for which a lot is known about the mechanisms of the transition, this procedure allows one to make use of and further refine the existing knowledge. But as has been pointed out by Chandler and collaborators [6,7], if little is known about the mechanism of the transition, a poor choice of the reaction coordinate can lead to wrong predictions for the transition mechanism. Furthermore, intuitively reasonable coarse-grained variables may not be good reaction coordinates [7].

In view of this, it is helpful to make the notion of reaction coordinates more precise in order to place our discussions on a firm basis. There is indeed a distinguished reaction coordinate, defined with the help of the backward Kolmogorov equation, which specifies, at each point of the configuration space or phase space, the probability that the reaction will succeed if the system is initiated at that particular configuration or phase space location (see the discussion in section II). All other “reaction coordinates” are approximate, and the quality of the approximation can in principle be quantified. This particular notion of reaction coordinate is crucial for our discussion, since it identifies the so-called iso-committor surfaces, which are central objects in our study.

There are several ways of characterizing this distinguished reaction coordinate that identifies the iso-committor surfaces. Besides being the solution of the backward Kolmogorov equation, it also satisfies a variational principle, as we discuss below. Nevertheless, it is often impractical to find this reaction coordinate, since it is an object that may depend on very many variables.
The backward Kolmogorov equation is an equation in a space of very high dimension. What we are really interested in, after all, is the ensemble of transition trajectories. This point of view has been emphasized by Chandler and co-workers in their development of the transition path sampling technique (TPS) [6,7].

Our viewpoint is consistent with that of TPS, but there is a crucial difference: TPS views the transition path ensemble in the space of trajectories parameterized by the physical time, whereas we view the transition path ensemble in the configuration space with parameterization chosen for the convenience of computations. As we show below, this latter viewpoint offers a number of advantages, the most important of which is that in configuration space, the transition path ensemble can be identified by looking at the equilibrium distribution of the system restricted on the family of the iso-committor surfaces that separate the metastable sets.

We assume that the transition path ensemble is localized in the configuration space, i.e. that it forms isolated tubes. We call them transition tubes. Our primary objective is to identify the transition tubes as well as the iso-committor surfaces within the tubes. By focusing on the transition tubes, we reduce a problem in large dimensional space, the backward Kolmogorov equation, to a large coupled system in one dimensional space, along the tube.

With this understanding, we can easily appreciate the basic principles behind the finite-temperature string method. Consistent with the localization assumption, we approximate the iso-committor surfaces within the tubes by hyperplanes.
We then adopt the variational characterization of the iso-committor surfaces (see (10)), but we restrict the trial functions to a family of hyperplanes parametrized by a curve that passes through the center of mass (with respect to the equilibrium distribution) on each plane. We call this curve the string. The finite-temperature string method is a way of finding the family of minimizing hyperplanes.

It is worth noting that in the case when the energy landscape is simple and smooth, the transition tube reduces to the MEP in the zero-temperature limit. At the same time, the finite-temperature string method reduces to the zero-temperature string method [1].

This paper is organized as follows. In section II, we recall some theoretical background for describing transitions in complex systems. In section III we discuss the finite-temperature string method, its foundation and implementation. FTS is then used in section IV to study the conformation changes of the alanine dipeptide, first in vacuum (section IV A) and then in explicit solvent (section IV B).
In each case we identify the transition tube using FTS and analyze the reaction coordinates and transition states. This is possible since FTS gives the iso-committor surfaces near the transition tube.


II. THEORETICAL BACKGROUND

We begin with some theoretical background. To simplify the presentation, we will focus on the case when the system of interest obeys over-damped friction dynamics:

$$\gamma \dot{x} = -\nabla V(x) + \sqrt{2\gamma\beta^{-1}}\,\eta \tag{1}$$

where $V$ is the potential energy of the system, $\gamma$ is the friction coefficient, $\eta$ is a white noise forcing, and $\beta = 1/k_B T$ is the inverse temperature. More general forms of Langevin dynamics are considered in Ref. 9, where it is shown that the mechanism of transition is unaffected by the value of the friction coefficient, at least under the assumption that the mechanism of transition can be described within the configuration space alone. The system described by (1) has a standard equilibrium distribution

$$\rho_e(x) = Z^{-1} e^{-\beta V(x)} \tag{2}$$

in the configuration space $\Omega$, where $Z = \int_\Omega e^{-\beta V(x)}\,dx$ is a normalization constant. We will assume that there are two metastable states described by two subsets, $A$ and $B$, of the configuration space $\Omega$ respectively, and we will refer to $A$ and $B$ as being the metastable sets. For simplicity, we will assume that there are no other metastable sets, i.e.

$$\int_{A\cup B} \rho_e(x)\,dx \approx 1 \tag{3}$$

We are interested in characterizing the mechanism of transition between the metastable sets $A$ and $B$.
In the simplest situation when the energy landscape is smooth and the metastable states are separated by a few isolated barriers, this is usually done by identifying the transition states, which are saddle points on the potential energy landscape. The most probable path for the transition is the so-called minimum energy path, which is the path in the configuration space such that the potential force is parallel to the tangent along the path [1,3].

As has been discussed earlier [2,6], these concepts are no longer appropriate if the energy landscape is rough with many saddle points, most of which are separated by potential energy barriers that are less than or comparable to $k_B T$, and therefore do not act as barriers for the transition. In this case, the notion of transition states has to be replaced by a transition state ensemble, which is a probability distribution of states [8,9].
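To make the over-damped dynamics (1) concrete, the following is a minimal sketch of an Euler-Maruyama discretization, not the integrator used in this paper. The one-dimensional double-well potential $V(x) = (x^2-1)^2$, the choice of $A = \{x < -0.5\}$ and $B = \{x > 0.5\}$, and all function names and parameter values are assumptions made only for illustration.

```python
import math
import random


def grad_double_well(x):
    """Gradient of the illustrative potential V(x) = (x^2 - 1)^2 (an assumption)."""
    return 4.0 * x * (x * x - 1.0)


def simulate_overdamped(grad_v, x0, gamma, beta, dt, n_steps, rng):
    """Euler-Maruyama discretization of gamma*dx/dt = -grad V + sqrt(2*gamma/beta)*eta.

    Dividing (1) by gamma gives the update
        x <- x - (dt/gamma) * grad V(x) + sqrt(2*dt/(beta*gamma)) * xi,
    where xi is a standard normal variate.
    """
    noise_scale = math.sqrt(2.0 * dt / (beta * gamma))
    x = x0
    path = [x]
    for _ in range(n_steps):
        x = x - (dt / gamma) * grad_v(x) + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def count_crossings(path, a=-0.5, b=0.5):
    """Count completed A->B or B->A transitions, with A = {x < a}, B = {x > b}."""
    state = None
    crossings = 0
    for x in path:
        # Between a and b the system keeps the label of the last set visited.
        new_state = "A" if x < a else "B" if x > b else state
        if state is not None and new_state != state:
            crossings += 1
        state = new_state
    return crossings


if __name__ == "__main__":
    rng = random.Random(0)
    path = simulate_overdamped(grad_double_well, -1.0, gamma=1.0, beta=3.0,
                               dt=1e-3, n_steps=200_000, rng=rng)
    print("transitions observed:", count_crossings(path))
```

In such a run the trajectory spends most of its time inside one of the two wells, which is the sense in which (3) holds for metastable sets.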


A. Transition path ensemble in trajectory space

How do we characterize such an ensemble and the most probable transition paths? To address this question, Chandler et al. proposed studying the ensemble of reactive trajectories, which are trajectories that successfully make the transition, parametrized by the physical time. By assigning appropriate probability densities to these trajectories, Monte Carlo procedures can be developed to sample the reactive trajectories. Indeed this is the basic idea behind transition path sampling [6,7] (see also Ref. 10 for an alternative technique to sample reactive trajectories). The efficiency gain of TPS over standard molecular dynamics comes from the fact that reactive trajectories are much shorter than the time between successive transitions. Therefore, TPS allows one to sample many more reactive trajectories than in a standard molecular dynamics run. Each reactive trajectory can be post-processed to determine the committor value of the points along this trajectory, i.e. the probability that the reaction will succeed if initiated at that particular point. The collection of points with committor value close to 1/2 forms the transition state ensemble.

TPS in principle gives a systematic procedure for studying reactive paths in complex systems. In practice, however, it has several difficulties.
First of all, harvesting enough reactive paths for statistical analysis might be very demanding. This task is not made easier by the fact that the paths are parametrized by physical time, and these trajectories can be quite long compared to the time-step used in the simulations. In addition, the ensemble of reactive trajectories does not give directly the main object of interest, the transition state ensemble, and it requires a very non-trivial post-processing step to analyze the reactive paths in order to extract the transition state ensemble. As explained before, this post-processing step involves launching trajectories at each point along each reactive path to determine the committor value of that point, and this can be quite tedious indeed [11].

B. Transition path ensemble in configuration space

An alternative viewpoint was proposed in Ref. 2 in connection with the finite-temperature string method. Its theoretical background was explained in Refs. 8 and 9. Instead of considering the ensemble of dynamic trajectories parametrized by the physical time, one views the transition paths in the configuration space. To understand why it is advantageous to do so, it is helpful to consider first a distinguished reaction coordinate $q(x)$ defined by the solution of the backward Kolmogorov equation:

$$-\nabla V \cdot \nabla q + \beta^{-1} \Delta q = 0, \qquad q|_{x\in A} = 0, \qquad q|_{x\in B} = 1. \tag{4}$$

The solution to this equation has a very simple probabilistic interpretation [12]: $q(x)$ is the probability that a trajectory initiated at $x$ reaches the metastable set $B$ before it reaches the set $A$. In other words, in terms of the reaction from $A$ to $B$, $q(x)$ gives the probability that this reaction will succeed if it has already made it to the point $x$.

The level sets (or iso-surfaces) of $q$, $\Gamma_z = \{x : q(x) = z\}$, $z \in [0, 1]$, define the iso-committor surfaces: for any fixed $z$, trajectories initiated at points on $\Gamma_z$ have the same probability to reach $B$ before reaching $A$.

The iso-committor surfaces allow one to define the transition path ensemble in the configuration space in a different way, by restricting the equilibrium distribution (2) on these surfaces (see figure 1). To see how this is done, consider an arbitrary surface $S$ in the configuration space and an arbitrary point $x$ on $S$.
The probability density of hitting $S$ at $x$ by any typical trajectory is

$$\rho_S(x) = Z_S^{-1} e^{-\beta V(x)} \tag{5}$$

where the normalization factor $Z_S$ is the following integral over the surface $S$: $Z_S = \int_S e^{-\beta V(x)}\,d\sigma(x)$, where $d\sigma(x)$ is the surface element on $S$. (Note that $\rho_S(x)$ is a density that is attached to and lives in $S$, and that is why $Z_S$ is not equal to the normalization factor $Z$ in (2).) On the other hand, if we consider only reactive trajectories, then this probability density changes to

$$\tilde{\rho}_S(x) = \tilde{Z}_S^{-1}\, q(x)(1 - q(x))\, e^{-\beta V}, \tag{6}$$

where $\tilde{Z}_S = \int_S q(x)(1 - q(x))\, e^{-\beta V}\,d\sigma(x)$. This can be seen as follows (see also Ref. 9). $\tilde{\rho}_S(x)$ is equal to the probability density that a trajectory (whether reactive or not) hits $S$ at $x$, times the probability that the trajectory came from $A$ in the past and the probability that it reaches $B$ before reaching $A$ in the future. Using the strong Markov property and statistical time reversibility, this latter quantity is given by $q(x)(1 - q(x))$, and this gives (6).

If $S$ is an iso-committor surface, then $q$ is constant on $S$, so the factor $q(x)(1 - q(x))$ cancels against the normalization and we see that $\tilde{\rho}_S = \rho_S$, i.e., the probability of hitting an iso-committor surface at $x$ by a hopping trajectory is simply the equilibrium distribution restricted on the surface.
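The probabilistic interpretation of $q(x)$ given above can be sketched numerically by launching many independent trajectories from a point and counting how many reach $B$ before $A$. The example below is only an illustration under stated assumptions: a one-dimensional double well $V(x) = (x^2-1)^2$, sets $A = \{x \le -0.8\}$ and $B = \{x \ge 0.8\}$, an Euler-Maruyama discretization of (1), and hypothetical function names and parameter values.

```python
import math
import random


def grad_v(x):
    """Gradient of the illustrative double well V(x) = (x^2 - 1)^2 (an assumption)."""
    return 4.0 * x * (x * x - 1.0)


def reaches_b_first(x0, a, b, beta, dt, rng, gamma=1.0, max_steps=1_000_000):
    """Integrate (1) from x0 until the path enters A = {x <= a} or B = {x >= b}."""
    noise_scale = math.sqrt(2.0 * dt / (beta * gamma))
    x = x0
    for _ in range(max_steps):
        if x <= a:
            return False
        if x >= b:
            return True
        x += -(dt / gamma) * grad_v(x) + noise_scale * rng.gauss(0.0, 1.0)
    raise RuntimeError("trajectory did not reach A or B within max_steps")


def estimate_committor(x0, a=-0.8, b=0.8, beta=3.0, dt=1e-3, n_traj=400, seed=0):
    """Fraction of trajectories started at x0 that reach B before A."""
    rng = random.Random(seed)
    hits = sum(reaches_b_first(x0, a, b, beta, dt, rng) for _ in range(n_traj))
    return hits / n_traj


if __name__ == "__main__":
    for x0 in (-0.4, 0.0, 0.4):
        print(f"q({x0:+.1f}) ~ {estimate_committor(x0):.2f}")
```

For this symmetric potential the estimate at the barrier top should be close to 1/2, up to a statistical error of order $1/\sqrt{n_{\mathrm{traj}}}$. The cost of repeating this at many points is the post-processing burden referred to in the discussion of TPS.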


Let $\rho_z(x) = \rho_S(x)$ when $S = \Gamma_z$, the iso-committor surfaces labeled by $z \in [0, 1]$. We can now define the transition path ensemble as the one-parameter family of probability densities $\{\rho_z(x), z \in [0, 1]\}$ on the iso-committor surfaces. The discussion above tells us that $\rho_z(x)$ gives the target distribution when reactive trajectories hit the surface $\Gamma_z$.

C. Transition tubes in configuration space

In principle the set on which the transition path ensemble concentrates can be a very complicated set. In the following, we will assume the distribution $\rho_z(x)$ is singly peaked on each iso-committor surface, and consequently the set becomes a localized tube. The case when $\rho_z$ is peaked at several isolated locations and hence there are several isolated tubes is similar. The tube can be characterized by its centerline and width. To be more precise, the centerline is defined as

$$\varphi(z) = \langle x \rangle_{\Gamma_z} \tag{7}$$

where the average is with respect to $\rho_z(x)$, which is the restricted equilibrium distribution on $\Gamma_z$. The width of the tube can be characterized by the eigenvalues of the covariance matrix

$$C(z) = \langle (x - \varphi(z)) \otimes (x - \varphi(z)) \rangle_{\Gamma_z}. \tag{8}$$

The tube can be parameterized by other parameters, e.g. the normalized arc-length of the centerline.

Remark.
The transition tube can also be characterized by the probability that the reactive trajectories stay inside the tube. To be more precise, we proceed as follows. Fix a positive number $p \in (0, 1)$. Let $c(z)$ be the subset of $\Gamma_z$ with smallest volume such that

$$\int_{c(z)} e^{-\beta V}\,d\sigma(x) = p \int_{q(x)=z} e^{-\beta V}\,d\sigma(x). \tag{9}$$

The union of the $c(z)$ for $z \in [0, 1]$ defines the effective transition tube.

The above discussion establishes a most remarkable fact, namely if considered in configuration space, the ensemble of reactive trajectories can be characterized by equilibrium distributions on the iso-committor surfaces. This is the basic idea behind the finite-temperature string method. It is easy to see that if the potential is smooth, then in the zero-temperature limit, the transition path ensemble or the transition tube reduces to the minimum energy path.
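The centerline (7) and covariance (8) are averages over $\rho_z$, so given a set of configurations sampled from $\rho_z$ on one surface they can be estimated by a sample mean and a sample covariance. The sketch below assumes such samples are already available as equally weighted points; the function names are hypothetical, and the closed-form eigenvalue helper is restricted to two dimensions purely to keep the example self-contained.

```python
import math


def centerline_point(samples):
    """Sample mean, the estimator of phi(z) = <x>_{Gamma_z} in (7)."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[k] for s in samples) / n for k in range(dim)]


def covariance(samples):
    """Sample estimate of C(z) in (8), normalized by the number of samples."""
    n = len(samples)
    mean = centerline_point(samples)
    dim = len(mean)
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
             for j in range(dim)] for i in range(dim)]


def eigenvalues_2x2(c):
    """Eigenvalues (largest first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = c[0][0], c[0][1], c[1][1]
    center = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return center + radius, center - radius


if __name__ == "__main__":
    # Hypothetical samples standing in for configurations drawn from rho_z.
    samples = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
    print("centerline point:", centerline_point(samples))
    print("tube widths squared:", eigenvalues_2x2(covariance(samples)))
```

The square roots of the eigenvalues give the extent of the tube along the principal directions of the sampled distribution.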


III. FINITE-TEMPERATURE STRING METHOD

A. Derivation

For numerical purposes we will label the iso-committor surfaces by $\alpha \in [0, 1]$, which is an increasing function of $z$, e.g. the normalized arc-length of the centerline of the transition tube. We make the following two assumptions:

1. The transition tube is thin compared to the local radius of curvature of the centerline.
2. The iso-committor surfaces can be approximated by hyperplanes within the transition tube.

The precise form of the second assumption is given in (30). By this assumption we avoid the intersection of the hyperplanes inside the transition tube.

Our objective is to find the transition tube as well as the reaction coordinate $q(x)$ whose level sets are the iso-committor surfaces. For this purpose it is useful to recall a variational formulation of $q(x)$: as the solution of (4), $q(x)$ has an equivalent characterization as being the minimizer of

$$I = \int_{\Omega \setminus (A \cup B)} e^{-\beta V} |\nabla q|^2\,dx \tag{10}$$

subject to the boundary conditions in (4).

In general this variational problem is difficult to solve since it is on a large dimensional space.
But under the assumptions above, we can make an approximation by restricting the trial functions to functions whose level sets are hyperplanes. We will represent the one-parameter family of hyperplanes, $\{P_\alpha, \alpha \in [0, 1]\}$, by their unit normal $\hat{n}(\alpha)$ and the mean position in the plane with respect to the equilibrium distribution restricted to $P_\alpha$, i.e.

$$\varphi(\alpha) = \langle x \rangle_{P_\alpha} \equiv \frac{\int_{P_\alpha} x\, e^{-\beta V}\,d\sigma(x)}{\int_{P_\alpha} e^{-\beta V}\,d\sigma(x)}. \tag{11}$$

It was shown in Refs. 8 and 9 (see also the Appendix) that (10) reduces to

$$I = \int_0^1 (f'(\alpha))^2\, e^{-\beta F(\alpha)}\, |\hat{n} \cdot \varphi'|^{-1}\,d\alpha \tag{12}$$

under the assumptions described above. Here $f(\alpha) = q(\varphi(\alpha))$, the prime denotes the derivative with respect to $\alpha$, and $F(\alpha)$ is the free energy

$$F(\alpha) = -\beta^{-1} \log \int_{P_\alpha} e^{-\beta V(x)}\,d\sigma(x). \tag{13}$$
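The quantities in (11) and (13) can be illustrated in two dimensions, where each hyperplane $P_\alpha$ is a straight line and the surface integrals become one-dimensional integrals. The sketch below uses a simple midpoint rule rather than the sampling employed by FTS; the separable potential, the truncation of the line to a finite interval, and all names and parameter values are assumptions chosen for illustration.

```python
import math


def potential(x, y):
    """Illustrative separable potential: double well in x, harmonic in y."""
    return (x * x - 1.0) ** 2 + 5.0 * y * y


def plane_statistics(point, normal, beta, half_width=4.0, n_quad=4000):
    """Midpoint-rule estimates of phi(alpha) in (11) and F(alpha) in (13).

    In two dimensions the hyperplane is the line through `point`
    perpendicular to `normal`, truncated here to |t| <= half_width.
    """
    nx, ny = normal
    norm = math.hypot(nx, ny)
    tx, ty = -ny / norm, nx / norm      # unit vector lying in the plane
    dt = 2.0 * half_width / n_quad
    z = mx = my = 0.0
    for k in range(n_quad):
        t = -half_width + (k + 0.5) * dt
        x, y = point[0] + t * tx, point[1] + t * ty
        w = math.exp(-beta * potential(x, y)) * dt
        z += w          # partition function restricted to the plane
        mx += w * x     # unnormalized mean position, x component
        my += w * y     # unnormalized mean position, y component
    return (mx / z, my / z), -math.log(z) / beta


if __name__ == "__main__":
    beta = 3.0
    for c in (-1.0, -0.5, 0.0, 0.5, 1.0):
        mean, free_energy = plane_statistics((c, 0.0), (1.0, 0.0), beta)
        print(f"plane x = {c:+.1f}: mean = {mean}, F = {free_energy:.4f}")
```

Because this potential is separable, the free energy difference between the plane through the saddle and the plane through a minimum equals the potential barrier; for a non-separable potential the entropic contribution of the in-plane integral would change this.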


The minimization condition for this functional is given by:

$$\hat{n}(\alpha) \parallel \varphi'(\alpha) \tag{14}$$

where $\varphi'(\alpha)$ is the tangent vector of the string at parameter $\alpha$, and

$$f(\alpha) = \frac{\int_0^\alpha e^{\beta F(\alpha')}\,d\alpha'}{\int_0^1 e^{\beta F(\alpha')}\,d\alpha'}. \tag{15}$$

Equation (14) says that the centerline $\varphi(\alpha)$ must be normal to the family of hyperplanes. A simple application of Laplace's method on the integrals in (15) implies that the iso-committor surface at $\alpha^*$ where $f(\alpha^*) = 1/2$ roughly coincides with the surface on which $F$ attains its maximum. Therefore the transition state ensemble can be characterized by either of the following:

(i). $F(\alpha)$ is within $k_B T$ of its maximum value;
(ii). $f(\alpha) \approx 1/2$.

We emphasize that for (i) to be true, the correct free energy has to be used, i.e. one has to select the right reaction coordinate. We also note that the transition state region can be quite broad, if $F$ or $f$ changes slowly at $\alpha^*$.

A rigorous derivation of the conditions (14) and (15) can be found in Ref. 8. Heuristically, the minimum of the factor $|\hat{n} \cdot \varphi'|^{-1}$ in (12) is achieved under the condition (14) (we assume the length of $\varphi'(\alpha)$ is a constant). Then the functional (12) reduces to

$$I = \int_0^1 e^{-\beta F(\alpha)} (f'(\alpha))^2\,d\alpha \tag{16}$$

up to a constant.
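Formula (15) and the two characterizations (i) and (ii) of the transition state ensemble can be checked numerically once $F(\alpha)$ is tabulated. The following is a minimal sketch assuming $F$ is known on a uniform grid in $\alpha$; the model free energy $F(\alpha) = 4\alpha(1-\alpha)$, the value of $\beta$, and the function name are hypothetical choices made for illustration only.

```python
import math


def committor_along_string(free_energy, beta):
    """Evaluate (15) on a uniform grid in alpha with the trapezoidal rule.

    free_energy[i] is F at alpha_i = i / (len(free_energy) - 1).
    """
    n = len(free_energy)
    h = 1.0 / (n - 1)
    f_max = max(free_energy)                      # shift to avoid overflow
    w = [math.exp(beta * (f - f_max)) for f in free_energy]
    cumulative = [0.0]
    for i in range(1, n):
        cumulative.append(cumulative[-1] + 0.5 * h * (w[i - 1] + w[i]))
    # The constant shift cancels in the ratio of the two integrals.
    return [c / cumulative[-1] for c in cumulative]


if __name__ == "__main__":
    n, beta = 201, 5.0
    alphas = [i / (n - 1) for i in range(n)]
    F = [4.0 * a * (1.0 - a) for a in alphas]     # hypothetical barrier profile
    f = committor_along_string(F, beta)
    i_half = min(range(n), key=lambda i: abs(f[i] - 0.5))
    i_top = max(range(n), key=lambda i: F[i])
    print("alpha where f is closest to 1/2:", alphas[i_half])
    print("alpha where F is maximal:       ", alphas[i_top])
```

For this symmetric profile the two locations agree; for an asymmetric $F$ they coincide only roughly, as stated by the Laplace argument above.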
The minimizer of (16) is given by (15), which is also the solution to the one-dimensional version of the backward Kolmogorov equation

$$-F'(\alpha) f'(\alpha) + \beta^{-1} f''(\alpha) = 0 \tag{17}$$

with boundary conditions $f(0) = 0$ and $f(1) = 1$.

B. Implementation

The finite-temperature string method is an iterative method for solving

$$\varphi(\alpha) = \langle x \rangle_{P_\alpha} \tag{18}$$

and

$$\hat n(\alpha) \parallel \varphi'(\alpha). \tag{19}$$

This is done by seamlessly combining sampling on the hyperplanes with the dynamics of the hyperplanes. In (18) and (19), one of the equations can be viewed as a kinematic constraint and the other as the equilibrium condition. In the discussions above, we parametrized the planes according to (18), and (19) then came out as the equilibrium condition. In the implementation of FTS, particularly for systems with rough energy landscapes, we find it more convenient to parametrize the planes according to (19) and set up an iterative procedure to solve (18) in order to find the transition tube. The object that we iterate upon is the so-called string, which is normal to the hyperplanes. At steady state, it coincides with the centerline of the transition tube. At each iteration step $n$, we denote the mean string and associated perpendicular hyperplanes by $\varphi^n(\alpha)$ and $P^n_\alpha$ respectively, where $\alpha \in [0, 1]$ is a parameterization of the string.
The next iteration is given by

$$\varphi^{n+1}(\alpha') = \langle x \rangle_{P^n_\alpha}, \tag{20}$$

where $\alpha' = g(\alpha)$ is obtained by reparameterization in order to enforce a particular constraint on the parametrization (say, by normalized arc-length).

However, due to the roughness of the energy landscape, using (20) directly may lead to an unstable numerical algorithm. This difficulty is overcome by adding a smoothing step for the mean string. The computation of $\varphi^{n+1}$ therefore follows four steps:

a. Calculate the mean position $\tilde\varphi(\alpha)$ on $P^n_\alpha$ by constrained dynamics;
b. Smoothing $\tilde\varphi(\alpha)$ gives $\bar\varphi(\alpha)$;
c. Reparameterization of $\bar\varphi(\alpha)$ gives $\varphi^{n+1}(\alpha)$;
d. Reinitialization of the sampling on the new hyperplanes.

Notice that the smoothing step is used to accelerate convergence and avoid numerical instability. The details of each step are given below. A schematic of the computation procedure is shown in figure 2.
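Step c of the list above (reparameterization by normalized arc-length) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function name and the quarter-circle test string are hypothetical, and piecewise-linear interpolation via `numpy.interp` is used here as a simple stand-in for a higher-order interpolation scheme.

```python
import numpy as np

def reparameterize_by_arc_length(points):
    """Redistribute the images of a discretized string at equal arc-length spacing.

    points: array of shape (N, dim) holding the N images of the string.
    Piecewise-linear interpolation is an assumption made for brevity.
    """
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate(([0.0], np.cumsum(segment_lengths)))
    s /= s[-1]                                    # normalized arc length in [0, 1]
    s_new = np.linspace(0.0, 1.0, len(points))    # target: uniform in arc length
    # Interpolate every coordinate separately onto the uniform grid.
    return np.column_stack([np.interp(s_new, s, points[:, k])
                            for k in range(points.shape[1])])

# Hypothetical example: unevenly spaced images on a quarter circle.
t = np.linspace(0.0, 1.0, 21) ** 2 * (np.pi / 2)
string = np.column_stack([np.cos(t), np.sin(t)])
new_string = reparameterize_by_arc_length(string)
spacing = np.linalg.norm(np.diff(new_string, axis=0), axis=1)
```

The end points of the string are left where they are, and the distances between consecutive images in `new_string` are equal up to the small error introduced by the polyline approximation.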


a. Sampling the energy landscape around $\varphi^n$. The computation of $\langle x \rangle_{P^n_\alpha}$ is done as follows. On each hyperplane $P^n_\alpha$ a Langevin dynamics

$$\dot x = -(\nabla V)^\perp(x) + \sqrt{2\beta^{-1}}\, \eta^\perp \tag{21}$$

is carried out, where $(\cdot)^\perp$ denotes the projection to the hyperplane $P^n_\alpha$, and $\eta$ is a white-noise forcing. The mean position of the system on $P^n_\alpha$ is calculated from (21):

$$\tilde\varphi(\alpha) = \frac{1}{T_1 - T_0} \int_{T_0}^{T_1} x(t)\, dt, \tag{22}$$

where $T_0$ is the relaxation time and $T_1$ is the total simulation time. Constrained molecular dynamics can be used as well to calculate the mean position.

For systems with very rough energy landscapes, it is convenient to add a constraining potential that localizes the sampling in (21) to a region close to the current mean string. This is again used to avoid numerical instability, but has no effect on the final result. The following potential is used in our calculation:

$$V_c(x) = \begin{cases} 0, & r < c \\ (r - d)^{-12} - 2\,((c - d)(r - d))^{-6}, & c \le r < d \\ \infty, & r \ge d. \end{cases} \tag{23}$$

Here $r = |x - \varphi^n|$, $\varphi^n$ is the current mean string, and $c$ and $d$ are two parameters. Under this constraining force, sampling in (21) is confined in a tube with radius $d$ around the current mean string. Moreover, the system experiences no constraining force in an even smaller tube with radius $c$.
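The sampling step (21)-(22) can be illustrated on a toy problem. The sketch below is an assumption-laden example rather than the production algorithm: it uses an Euler-Maruyama discretization, omits the constraining potential (23), and all names (`sample_plane_mean`, `p0`, `n_hat`) and the quadratic potential $V(x) = \tfrac12 |x|^2$ are hypothetical. For this potential the exact mean on the plane is the point of the plane closest to the origin, here $(0.5, 0.5, 0)$, which gives a simple check on the time average.

```python
import numpy as np

def sample_plane_mean(grad_V, x0, normal, beta, dt, n_steps, n_relax, rng):
    """Euler-Maruyama version of the projected dynamics (21).

    Returns the time average (22) taken after n_relax steps, and the
    final configuration. x0 must lie on the hyperplane.
    """
    n_hat = normal / np.linalg.norm(normal)

    def project(v):
        # Remove the component along the plane normal: (.)^perp
        return v - np.dot(v, n_hat) * n_hat

    x = np.array(x0, dtype=float)
    running_sum = np.zeros_like(x)
    for step in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - dt * project(grad_V(x)) + np.sqrt(2.0 * dt / beta) * project(noise)
        if step >= n_relax:          # discard the relaxation period [0, T0)
            running_sum += x
    return running_sum / (n_steps - n_relax), x

# Toy setup: V(x) = |x|^2 / 2, plane through p0 with unit normal n_hat.
p0 = np.array([1.0, 0.0, 0.0])
n_hat = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
rng = np.random.default_rng(0)
mean, x_final = sample_plane_mean(lambda x: x, p0, n_hat, beta=10.0,
                                  dt=0.01, n_steps=20000, n_relax=2000, rng=rng)
```

Because both the drift and the noise are projected, the trajectory never leaves the plane (up to round-off), so no separate constraint solver is needed in this simple setting.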
The constraining force is gradually switched off as the computation converges. This is done by gradually increasing the confining parameter $c$.

b. Smoothing the mean string. The calculated mean string $\tilde\varphi(\alpha)$ may not be very smooth, and this may make the tangent vector so inaccurate that it introduces numerical instabilities. Therefore at each iteration, we smooth out $\tilde\varphi(\alpha)$ by solving the following optimization problem:

$$\min_{\bar\varphi(\alpha)} \int_0^1 \left( C_1\, \kappa(\alpha) + C_2\, \|\bar\varphi(\alpha) - \tilde\varphi(\alpha)\|_M \right) d\alpha. \tag{24}$$

Here $\kappa(\alpha)$ is the curvature of $\bar\varphi(\alpha)$, and $\|x\|_M = (x, Mx)^{1/2}$ is the $L^2$ norm of $x$ weighted by $M = (C(\alpha) + I)^{-1}$, where $C(\alpha)$ is the covariance matrix of $x$ generated by constrained dynamics on each hyperplane. The parameters $C_1, C_2 \in [0, 1]$ determine the relative weights of the two terms. As two extreme cases, the solution of (24) is a line segment when $C_1 = 1$, $C_2 = 0$, and the solution is $\bar\varphi(\alpha) = \tilde\varphi(\alpha)$ when $C_1 = 0$, $C_2 = 1$.

The minimization problem (24) can be solved by the steepest descent method, or by more advanced techniques, e.g. the quasi-Newton (BFGS) method.

c. Reparameterization. At the third step of each iteration, the mean string $\bar\varphi(\alpha)$ is reparameterized, for example by normalized arc-length, to give $\varphi^{n+1}(\alpha)$. The reparameterization is done by polynomial interpolation [1].

d. Reinitialization. A new collection of hyperplanes $P^{n+1}_\alpha$, which are orthogonal to $\varphi^{n+1}(\alpha)$, is obtained after the reparameterization step. To start the sampling process on the new planes, we need to generate the initial configurations for the Langevin dynamics (21). A naive way to generate the initial configurations is to project the realizations on $P^n_\alpha$ from the previous sampling process onto the new planes $P^{n+1}_\alpha$. Unfortunately, this procedure usually results in abrupt changes in the molecular structure and thus a very large potential force.
In the case that the artificial constraining potential (23) is used, the direct projection becomes even worse, since it is possible that the projected realizations are out of the tube defined by the constraining potential and thus have infinite force.

In our implementation, this difficulty is overcome by inserting a few intermediate hyperplanes, denoted by $\{\tilde P^j,\ j = 0, 1, \dots, J\}$, in between $P^n_\alpha$ and $P^{n+1}_\alpha$, and each time we project the realization from $\tilde P^j$ to $\tilde P^{j+1}$, followed by a few steps of relaxation on $\tilde P^{j+1}$. This is analogous to choosing small time steps for numerical stability when solving ODEs or PDEs. Here $\tilde P^0 = P^n_\alpha$, $\tilde P^J = P^{n+1}_\alpha$, and $\{\tilde P^j,\ j = 1, \dots, J-1\}$ are obtained by linear interpolation between $P^n_\alpha$ and $P^{n+1}_\alpha$, i.e. the unit normals are given by

$$\hat n^j = c \left( \left(1 - \tfrac{j}{J}\right) \hat n^0 + \tfrac{j}{J}\, \hat n^J \right), \quad j = 1, 2, \dots, J-1, \tag{25}$$

where $\hat n^0$ and $\hat n^J$ are the unit normals to $P^n_\alpha$ and $P^{n+1}_\alpha$ respectively, and $c$ is the normalization factor. The interpolated plane $\tilde P^j$ goes through $\phi^j$, where $\phi^j$ is given by

$$\phi^j = \left(1 - \tfrac{j}{J}\right) \phi^0 + \tfrac{j}{J}\, \phi^J, \quad j = 1, 2, \dots, J-1, \tag{26}$$

where $\phi^0 = \varphi^n_\alpha$ and $\phi^J = \varphi^{n+1}_\alpha$. On each interpolated hyperplane $\tilde P^j$, the constraining force (23) is centered at $\phi^j$.
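The interpolated planes of (25) and (26) and the stepwise transfer of a realization across them can be sketched as follows. This is a minimal illustration with hypothetical names and test data; the short relaxation run that follows each projection in the actual procedure is deliberately left out, and the degenerate case of antiparallel normals (where the interpolated normal can vanish) is not handled.

```python
import numpy as np

def intermediate_planes(n_old, n_new, phi_old, phi_new, J):
    """Unit normals and anchor points of the planes j = 0, ..., J,
    obtained by the linear interpolation of Eqs. (25) and (26)."""
    normals, anchors = [], []
    for j in range(J + 1):
        w = j / J
        n_j = (1.0 - w) * n_old + w * n_new
        normals.append(n_j / np.linalg.norm(n_j))   # normalization factor c in (25)
        anchors.append((1.0 - w) * phi_old + w * phi_new)
    return normals, anchors

def project_onto_plane(x, anchor, n_hat):
    """Orthogonal projection of x onto the plane through `anchor` with unit normal `n_hat`."""
    return x - np.dot(x - anchor, n_hat) * n_hat

# Hypothetical old and new planes, and a realization lying on the old plane.
n_old, n_new = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
phi_old, phi_new = np.zeros(3), np.array([0.1, 0.1, 0.0])
J = 10
normals, anchors = intermediate_planes(n_old, n_new, phi_old, phi_new, J)

x = np.array([0.0, 0.3, -0.2])
for j in range(1, J + 1):
    x = project_onto_plane(x, anchors[j], normals[j])
    # (a few relaxation steps on plane j would go here)
```

After the last step the realization lies on the new plane, having moved there through a sequence of small displacements instead of one large jump.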


For each $\alpha$, we move the realization $x$ on $P^n_\alpha$ to $P^{n+1}_\alpha$ in $J$ steps. At the $j$-th step, $j = 1, 2, \dots, J$, we first project $x$ from $\tilde P^{j-1}$ to $\tilde P^j$:

$$x := x - ((x - \phi^j) \cdot \hat n^j)\, \hat n^j. \tag{27}$$

Then we relax the projected configuration on $\tilde P^j$ for a short time $T_2$ (a few time steps):

$$\dot x(t) = -(\nabla V)^{\perp,j}(x) + \sqrt{2\beta^{-1}}\, \eta^{\perp,j}, \quad 0 \le t \le T_2, \tag{28}$$

where $(\nabla V)^{\perp,j} = \nabla V - (\nabla V, \hat n^j)\, \hat n^j$ and $V(x)$ includes the potential of the system and the constraining potential. A similar projection is done for $\eta$.

In the following application to the alanine dipeptide, the parameter $T_2$ is taken to be a few time steps for (28), and $J = 10$.

Sampling on the new hyperplanes $P^{n+1}$ begins after we obtain the new set of realizations, and the above four-step procedure repeats.

When the iteration converges with satisfactory accuracy, the mean string gives the centerline of the tube, the variance of $x(\alpha)$ on the hyperplanes gives the width of the transition tube, and the function $f(\alpha)$ in (15) gives the committor value of each plane.
To obtain $f(\alpha)$, the free energy (13) must be computed, which can be done by calculating the mean force by sampling via (21) on the final planes, followed by thermodynamic integration.

IV. APPLICATION TO THE ALANINE DIPEPTIDE

The ball and stick model of the alanine dipeptide is shown in figure 3. Despite its simple chemical structure, it exhibits some of the important features common to biomolecules, with the backbone degrees of freedom (dihedral angles $\phi$ and $\psi$), three methyl groups (CH$_3$), as well as the polar groups (N-H, C=O) which form hydrogen bonds with water in aqueous solution. The isomerization of alanine dipeptide has been the subject of several theoretical and computational studies and therefore serves as an excellent test for FTS.

Apostolakis et al. calculated the transition pathways and barriers for the isomerization of alanine dipeptide [13]. In their work, a two-dimensional potential of mean force on the $\phi$-$\psi$ plane was first calculated by the adaptive umbrella sampling scheme. The conjugate peak refinement algorithm and targeted molecular dynamics technique were applied on the obtained two-dimensional potential to obtain the transition pathways. As a result, the mechanism of the transition is pre-assumed to depend only on the two torsion angles $\phi$ and $\psi$. Bolhuis et al. applied TPS to study the isomerization of alanine dipeptide in vacuum and in solution [14]. Their calculation used an all-atom representation for the molecule and an explicit solvent model. An ensemble of transition pathways was collected, from which the transition state ensemble was found by determining the committor for each configuration visited by the trajectories. Their analysis shows that more degrees of freedom than the two torsion angles are necessary to describe the reaction coordinates. Recently, Ma and Dinner introduced a statistical method for identifying reaction coordinates from a database of candidate physical variables, and applied their method to study the isomerization of alanine dipeptide [11]. In the following, we apply FTS to study the transition of the alanine dipeptide both in vacuum and in explicit solvent.

A. Alanine dipeptide in vacuum

We use the full atomic representation of the alanine dipeptide molecule with the CHARMM22 force field [15]. We have not used the CMAP correction to the dihedral angle potentials [17], since it is unclear how the CMAP correction influences the dynamics of proteins. Figure 4 shows the adiabatic energy landscape of the molecule as a function of the two backbone dihedral angles $\phi$ and $\psi$. There are two stable conformers, $C_{7eq}$ and $C_{ax}$. The state $C_{7eq}$ is further split into two sub-states (denoted by $C_{7eq}$ and $C'_{7eq}$ in figure 4) separated by a small barrier.

Normally, one needs to identify the initial and final states before FTS can be used. This case is different since there is some periodicity in the system. The initial string (and the realizations on each hyperplane) is obtained by rotating the two backbone dihedral angles of the dipeptide along the diagonal line from $(-180^\circ, 180^\circ)$ to $(180^\circ, -180^\circ)$, with all other internal degrees of freedom kept fixed. Only non-hydrogen atoms are represented on the string. The string is discretized using 40 points and correspondingly 40 hyperplanes are evolved in the computation. At each iteration, time averaging is performed for 0.5 ps following the dynamics described in (21) with $T = 272$ K.
A hard-core constraining potential is added to restrict the sampling to a tube with width 0.2 Å ($c = 0.2$ and $d = 1$ in (23)) around the string. This constraint is gradually removed as the string converges to the steady state. The parameters used in smoothing the string are $C_1 = C_2 = 1$.

To fix the overall rotation and translation of the molecule, we constrained one carbon atom at the origin, one neighboring carbon atom on the positive x-axis, and one nitrogen atom in the xy-plane by harmonic springs. The computation converges after about 60-70 iterations. It takes several hours of CPU time on a single-processor PC.

The converged string and the tube are shown in figure 5. This tube is represented by projecting onto the $\phi$-$\psi$ plane the ensemble of the realizations on each hyperplane corresponding to the converged string. As we discussed earlier, this tube should coincide with the tube obtained from the most probable transition paths between the stable conformers $C_{ax}$ and $C_{7eq}$. The free energy $F$ along the string is shown in figure 6 (upper panel). The free energy has three local maxima $S_1$, $S_2$ and $S_3$, corresponding to two transition pathways from $C_{ax}$ to $C_{7eq}$: (a) $C_{ax}$ - $S_1$ - $C'_{7eq}$ - $S_2$ - $C_{7eq}$; and (b) $C_{ax}$ - $S_3$ - $C_{7eq}$. The transitions at $S_1$ and $S_3$ are quite sharp. The energy difference of $C_{ax}$ relative to $C_{7eq}$ is about $E = 2.5$ kcal/mol. The energy barrier is about $\Delta E = 7.0$ kcal/mol at $S_1$, and about $\Delta E = 7.5$ kcal/mol at $S_3$. Path (a) goes through an intermediate metastable state $C'_{7eq}$ around $(-150^\circ, 170^\circ)$.
But the energy barrier from $C'_{7eq}$ to $C_{7eq}$ is comparable to $k_B T$ and it has little effect on the transition from $C_{ax}$ to $C_{7eq}$.

Shown in the lower panel of figure 6 is the committor function $f(\alpha)$ to the state $C_{ax}$ for the pathway $C_{7eq}$ - $S_3$ - $C_{ax}$, calculated using the formula (15). The transition state region is identified as the hyperplane labeled by $\alpha^\star$ that satisfies $f(\alpha^\star) \approx \tfrac12$. This plane is close to the one with maximum free energy between $C_{7eq}$ and $C_{ax}$.

Figure 7 shows the projection of the level curves of the equilibrium probability density on the hyperplanes corresponding to the local maxima of the free energy. We see that at $S_1$ and $S_2$, the projected transition states are localized on lines which are roughly perpendicular to the projected mean path (dotted line), which suggests that the two backbone dihedral angles may parametrize reasonably well the reaction coordinate $q(x)$ for this transition. However, the projected transition states at $S_3$ spread out in the $\phi$-$\psi$ plane and are no longer concentrated on the line perpendicular to the projected mean path. This indicates that in addition to $\phi$ and $\psi$, there are other degrees of freedom that are important for this transition.
In Ref. 14, the calculation by TPS shows that an additional torsion angle $\theta$ plays an important role in this transition, and the angles $\phi$ and $\theta$ provide a good set of coarse variables to parametrize the reaction coordinate $q(x)$. However, our calculation shows that the angle $\theta$ is approximately constant along the transition tube, and the projected transition states on the $\phi$-$\theta$ plane are still quite broad and far from being concentrated on the line perpendicular to the projected mean string (see figure 8). Therefore, we conclude that $\phi$ and $\theta$ are not sufficient to parametrize the reaction coordinate $q(x)$. This discrepancy may be due to the different force fields used in the two calculations: we used the CHARMM22 force field, and the authors of Ref. 14 used AMBER.

To see that we indeed obtained the right transition tube and transition state ensemble, we computed the committor distribution of the plane $P_{\alpha^\star}$ for which $f(\alpha^\star) \approx \tfrac12$ by initiating trajectories from this plane (this plane corresponds to the transition state region $S_3$). Recall that in our calculation the string is discretized to a collection of points parametrized by $\{\alpha_i\}$. In general $\alpha^\star$ in figure 6 is not in this discrete set. To compensate for this, we compute the committor distribution for the two neighboring hyperplanes, one on each side of $\alpha^\star$. We randomly picked 2490 points based on the equilibrium distribution on each of the two planes.
Starting from each of these points, 5000 trajectories were launched, from which the committor distribution $P_{C_{ax}}$, i.e. the probability that the trajectory goes to $C_{ax}$, is calculated. Figure 9 shows the distribution of $P_{C_{ax}}$. The distributions on the two planes are peaked at $P_{C_{ax}} = 0.3$ and $P_{C_{ax}} = 0.7$ respectively. This confirms our prediction that the transition state region must be in between these two hyperplanes.

We next refined the string and added one more plane between the two neighboring planes discussed above. The committor distribution of the new hyperplane is shown in figure 10. The peak is located around $\tfrac12$, indicating that we indeed obtained the right transition state ensemble. The small deviation of the peak from $\tfrac12$ is due to the approximation of the transition state surface by hyperplanes, and also to sampling errors in estimating the committor values. In figure 11, we plot four of the trajectories initiated from $S_1$. They stay well within the transition tube that we obtained by FTS. It is interesting to note that computing the committor distribution of $P_{\alpha^\star}$ to confirm the result of FTS is much more expensive than obtaining $P_{\alpha^\star}$ by FTS.

B. Alanine dipeptide in explicit solvent

The system consists of a full atom representation of an alanine dipeptide molecule modeled with the CHARMM22 force field, and 198 water molecules in a periodic cubic box.


The temperature of the solvated system is 298.15 K. The side length of the cubic box is 18.4 Å. Ewald summation was used to treat the long-range Coulomb force.

Only the backbone atoms of the dipeptide are represented on the string. Other atoms of the dipeptide as well as the water molecules are not included in the string, but they contribute to the force field as well as the width of the transition tube. For the initial string, we rotated $\psi$ from $-180^\circ$ to $180^\circ$ while keeping $\phi$ fixed at $50^\circ$. The string is discretized using 30 points in the configuration space, and correspondingly 30 hyperplanes are evolved. To remove the rigid body translation and rotations of the dipeptide, we used a frame of reference which moves with the dipeptide. At each iteration, a constrained Langevin dynamics is carried out for 2 ps on each hyperplane in order to calculate the new position of the string. A hard-core constraining potential is applied to restrict the Langevin dynamics to a neighborhood of width 0.4 Å ($c = 0.4$ and $d = 1$ in (23)) of the string.
The parameters used in smoothing the string are $C_1 = C_2 = 1$. The computation converged to a steady state at 45 iterations of 40,000 dynamics steps. The mean string was subsequently fixed, the constraining potential was removed, and the distributions in the hyperplanes were sampled.

Figure 12 shows the converged transition tube on the $\phi$-$\psi$ plane. The background of the figure shows iso-contours of the free energy as a function of the $\phi$-$\psi$ variables. The free energy was obtained from 50 adaptive umbrella sampling [22] simulations of 100 ps at 300 K using the CHARMM program [16]. The biasing potential of those simulations was described by a combination of 12 harmonic functions and a constant in each dimension. Compared to the alanine dipeptide in vacuum, the angle $\psi$ of $C_{7eq}$ is shifted to a higher value by about $90^\circ$. In addition, the helical metastable state has appeared. This state has a significant $\pi$-helix contribution in the CHARMM22 potential without the CMAP correction [17]. There are two transition path ensembles from $C_{7eq}$ to $\alpha_R$: (a) $C_{7eq}$ - $S_1$ - $\alpha_R$; and (b) $C_{7eq}$ - $S_2$ - $\alpha_R$.

The projected hyperplanes onto the $\phi$-$\psi$ plane concentrate on the lines perpendicular to the projected string.
Therefore, the two backbone dihedral angles, $\phi$ and $\psi$, are qualified as a set of coarse variables to parametrize the reaction coordinate. Note that the transition tube is curved and deviates from vertical lines; therefore both $\phi$ and $\psi$ are necessary to describe the transition dynamics, but they may not be sufficient. This is consistent with the conclusion in Ref. 14, where Bolhuis and coauthors calculated the committor distribution for configurations with $\psi = 60^\circ$. A uniform distribution was obtained, from which the authors inferred that the reaction coordinate includes other important degrees of freedom, e.g. the solvent motion.

To see whether the computed transition tube identifies with reasonable accuracy the transition state region, we computed the committor distribution on the hyperplane where the committor estimate was closest to 1/2. This is the plane centered around $(-96^\circ, 51.5^\circ)$ in the $\phi$-$\psi$ map. We initiated 100 trajectories from each of 896 points sampled according to the equilibrium distribution restricted to the hyperplane. We used finite-friction Langevin dynamics with a friction coefficient of 20 ps$^{-1}$ as implemented in the CHARMM program [16]. A trajectory was stopped once the $\psi$ angle was larger than $100^\circ$ or smaller than $-20^\circ$, and the trajectory was counted as committed to the extended or the helical basin respectively. Due to the very long commitment times required by these trajectories, we limited ourselves to a smaller ensemble than in the vacuum study. The calculated committor distribution is shown in figure 13. The distribution is peaked around $\tfrac12$, but it is less peaked than the one obtained in vacuum. The average value of the committor distribution is 0.50 and the standard deviation is 0.21.
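How much of such a spread is purely statistical can be estimated with a binomial argument. The sketch below rests on two assumptions that are ours, not statements from the calculation: each trajectory is treated as an independent Bernoulli trial with success probability equal to the true committor, and the sampling noise is taken to be independent of the intrinsic spread so that variances add. The function name is hypothetical.

```python
import math

def committor_sampling_std(p, n_trajectories):
    """Standard deviation of a committor estimate obtained from
    n_trajectories independent trajectories, each committing to the
    target basin with probability p (binomial statistics)."""
    return math.sqrt(p * (1.0 - p) / n_trajectories)

# 100 trajectories per initial point on a sharp p = 0.5 iso-committor surface.
sampling_std = committor_sampling_std(0.5, 100)

# Reported spread of the committor distribution in explicit solvent.
observed_std = 0.21

# Assumption: independent contributions, so the variances add.
intrinsic_std = math.sqrt(observed_std ** 2 - sampling_std ** 2)
```

With these numbers the sampling contribution is 0.05, and subtracting it in quadrature leaves about 0.20, so under these assumptions most of the observed width is not sampling noise.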
One component of the distribution width is due to the limited sampling of the ensemble: with only 100 trajectories per initial condition, sampling a sharp 0.5 committor iso-surface would by itself yield a standard deviation of 0.05. The remaining part of the error is due to the approximation of the hyper-surfaces by hyperplanes and to the assumption that the isocommittors can be described in terms of the peptide backbone degrees of freedom.

Figure 14 shows four typical reactive trajectories. These trajectories have equal probability to fall into C7eq or αR, and they follow the transition tube quite well. Compared to the behavior in vacuum, the dynamics of the solvated system is quite diffusive due to the collisions with water molecules and the lack of a high energetic barrier in the transition region.

V. CONCLUSIONS

In this paper FTS was successfully applied to study the transition events of the alanine dipeptide in vacuum and in explicit solvent.
The results of FTS were verified by launching trajectories from the transition state region and calculating the committor distributions, confirming that FTS does allow one to find iso-committor surfaces, in particular the one with committor value 1/2 corresponding to the transition state region.

It is illuminating to compare the philosophies behind TPS and FTS. TPS and FTS are after the same objects, namely the ensemble of transition paths and transition states. Their key difference is in the way that the transition paths are parametrized. TPS considers transition paths in the space of physical trajectories, parametrized by the real time. FTS considers transition paths in configuration space, parametrized, for example, by the arc-length of the centerline of the transition tube. This different parametrization gives FTS considerable advantages. First, the ensemble of transition paths can be characterized by equilibrium densities, if we consider where they hit the iso-committor surfaces. Secondly, this allows us to give a direct link between the transition path ensemble and the transition state ensemble.
In addition, it also allows us to develop sampling procedures by which a collection of paths is harvested at the same time, instead of one by one as in TPS. In other words, FTS allows one to bypass completely the calculation of reactive trajectories parametrized by time and to obtain directly the iso-committor surfaces and the transition tubes, which are more relevant for describing the mechanism of the transition.

To conclude, FTS is a powerful tool for determining effective transition pathways in complex systems with multiscale energy landscapes. It does not require specifying a reaction coordinate beforehand. It allows us to determine the hyperplanes which are approximations to the iso-committor surfaces in configuration space by evolving a smooth curve. The smooth curve converges to the center of a tube by which transitions occur with high probability.

Acknowledgments

The authors thank M. Karplus for helpful discussions. E and Vanden-Eijnden were partially supported by ONR grants N00014-01-1-0674 and N00014-04-1-0565, and NSF grants DMS01-01439, DMS02-09959 and DMS02-39625. PM acknowledges support by the Marie Curie European Fellowship grant number MEIF-CT-2003-501953. The explicit water calculations were performed on the Crimson Grid cluster at Harvard, the SDSC teragrid and the CIMS computer center.


APPENDIX

We derive the formula (12) from (10):

\[
\begin{aligned}
I &= \int_{\Omega'} |\nabla q|^2 e^{-\beta V}\,dx \\
  &= \int_{\Omega'} \int_0^1 |\nabla q|^2 e^{-\beta V}\,\delta(q(x)-z)\,dz\,dx \\
  &= \int_0^1 dz \int_{\Omega'} |\nabla q|^2 e^{-\beta V}\,\delta(q(x)-z)\,dx \\
  &= \int_0^1 f'(\alpha)\,d\alpha \int_{\Omega'} |\nabla q|\,e^{-\beta V}\,\delta\big(\hat n\cdot(x-\varphi(\alpha))\big)\,dx \\
  &= \int_0^1 f'(\alpha)\,|\nabla q(\varphi)|\,d\alpha \int_{\Omega'} e^{-\beta V}\,\delta\big(\hat n\cdot(x-\varphi(\alpha))\big)\,dx \\
  &= \int_0^1 f'(\alpha)^2\,(\hat n\cdot\varphi')^{-1}\,e^{-\beta F(\alpha)}\,d\alpha .
\end{aligned}
\tag{29}
\]

Here we denote Ω\(A ∪ B) by Ω′. To go from the third to the fourth line, we made the assumption that the iso-committor surface is locally planar with normal n̂, and defined f(α) = q(ϕ(α)). To go from the fourth to the fifth line, we used the following assumption

\[
\big\langle (\hat n'\cdot(x-\varphi))^2 \big\rangle_{P_\alpha} \ll (\hat n\cdot\varphi')^2
\tag{30}
\]

where the average is with respect to the equilibrium distribution restricted to P_α. In the last step, we defined the free energy F(α) as in (13) and used f′(α) = ∇q(ϕ)·ϕ′ = |∇q(ϕ)|(n̂·ϕ′).

∗ Electronic address: weiqing@math.pinceton.edu
† Electronic address: eve2@cims.nyu.edu
‡ Electronic address: plm@tammy.harvard.edu
§ Electronic address: weinan@math.princeton.edu

1 W. E, W. Ren, and E. Vanden-Eijnden, Phys. Rev. B 66, 052301 (2002).
2 W. E, W. Ren, and E. Vanden-Eijnden, J. Phys. Chem. B 109, 6688 (2005).
3 H. Jónsson, G. Mills, and K. W. Jacobsen, Nudged elastic band method for finding minimum energy paths of transitions. In: Classical and Quantum Dynamics in Condensed Phase Simulations. Edited by: Berne, B. J.; Ciccotti, G.; Coker, D. F. World Scientific, 1998.
4 W. E, W. Ren, and E. Vanden-Eijnden, unpublished.


5 E. A. Carter, G. Ciccotti, J. T. Hynes, and R. Kapral, Chem. Phys. Lett. 156, 472 (1989).
6 P. G. Bolhuis, D. Chandler, C. Dellago, and P. Geissler, Ann. Rev. Phys. Chem. 53, 291 (2002).
7 C. Dellago, P. G. Bolhuis, and P. L. Geissler, Adv. Chem. Phys. 123 (2002).
8 W. E and E. Vanden-Eijnden, Metastability, conformation dynamics, and transition pathways in complex systems, in: Multiscale Modeling and Simulation, S. Attinger and P. Koumoutsakos eds. LNCSE 39, Springer, 2004.
9 W. E, W. Ren, and E. Vanden-Eijnden, submitted to Chem. Phys. Lett.
10 R. Olender and R. Elber, J. Chem. Phys. 105, 9299 (1996).
11 A. Ma and A. Dinner, J. Phys. Chem. B 109, 6769 (2005).
12 C. W. Gardiner, Handbook of stochastic methods for physics, chemistry, and the natural sciences. Springer, 1997.
13 J. Apostolakis, P. Ferrara, and A. Caflisch, J. Chem. Phys. 110, 2099 (1999).
14 P. G. Bolhuis, C. Dellago, and D. Chandler, Proc. Natl. Acad. Sci. 97, 5877 (2000).
15 A. D. MacKerell, D. Bashford, M. Bellott, R. L. Dunbrack, J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, J. Phys. Chem. B 102, 3586-3616 (1998).
16 B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus, J. Comp. Chem. 4, 187-217 (1983).
17 M. Feig, A. D. MacKerell, and C. L. Brooks, J. Phys. Chem. B 107, 2831 (2003).
18 S. Fischer and M. Karplus, Chem. Phys. Lett. 194, 252-261 (1992).
19 I. V. Ionova and E. A. Carter, J. Chem. Phys. 98, 6377-6386 (1993).
20 G. Henkelman and H. Jónsson, J. Chem. Phys. 111, 7010-7022 (1999).
21 L. R. Pratt, J. Chem. Phys. 85, 5045-5048 (1986).
22 C. Bartels and M. Karplus, J. Comput. Chem. 18, 1450-1462 (1997).


LIST OF FIGURES

FIG. 1: Transition tube between the metastable sets A and B. Also shown are the isocommittor surfaces Γ_z.

FIG. 2: Schematic of the computation procedure. Top figure (1st step): Sampling the hyperplane P^n_α. The empty circles are the mean positions on the hyperplanes; Second figure (2nd step): Smoothing ϕ̃(α) to obtain ϕ̄(α); Third figure (3rd step): Reparameterization of ϕ̄(α) to obtain ϕ^{n+1}(α); Last figure (4th step): Reinitializing the sampling process on P^{n+1}. The realizations on P^n (empty circles) are moved to P^{n+1} (empty diamonds).

FIG. 3: Schematic representation of the alanine dipeptide (CH3-CONH-CHCH3-CONH-CH3). The backbone dihedral angles are labeled by φ: C-N-C-C and ψ: N-C-C-N. The picture is taken from Ref. 14.

FIG. 4: Adiabatic energy landscape of the alanine dipeptide. The energy landscape is obtained by minimizing the potential energy of the molecule with (φ, ψ) fixed. The contours are drawn at multiples of 1 kcal/mol above the C7eq minimum. There are two stable conformers C7eq and Cax.

FIG. 5: The transition tube between C7eq and Cax. Two pathways exist: (a) Cax - S1 - C′7eq - S2 - C7eq; and (b) Cax - S3 - C7eq. The dotted line is the path of (φ, ψ) along the mean string. The converged hyperplanes across the points denoted by diamonds are approximations to the transition state surfaces.

FIG. 6: Top figure: Free energy of the alanine dipeptide along the transition tube shown in figure 5; Lower figure: Committor f(α) along the path C7eq - S3 - Cax. The transition state ensemble for the transition C7eq - S3 - Cax is on the hyperplane labeled by α⋆ determined by f(α⋆) = 1/2. Note that this plane is also very close to the one with maximum free energy between C7eq and Cax.

FIG. 7: Transition state ensemble projected onto the φ-ψ plane. The level curves show the density of the realizations on the hyperplanes corresponding to the local maxima of the free energy. These planes have committor value f(α) very close to 1/2 for the corresponding transition.


FIG. 8: Transition tube projected onto the φ-θ plane. The level curves show the density of the transition states at S3.

FIG. 9: Committor distribution for the transition state region S3. Two histograms correspond to the two hyperplanes with index α directly above and below α⋆, where f(α⋆) = 1/2.

FIG. 10: Committor distribution for the refined hyperplane with committor value 1/2.

FIG. 11: Four trajectories initiated from the transition state region S1. The trajectories have equal probabilities to go to C7eq and Cax. The trajectories stay in the transition tube obtained by the string method.

FIG. 12: Transition tube between C7eq and αR projected onto the φ-ψ plane. The contour plot shows the potential of mean force of the φ-ψ angles. The iso-contours are separated by 0.5 kcal/mol. The thick black points that are connected with lines show the location of the mean string.

FIG. 13: Committor distribution function for the transition state region S1, which is the plane centered around (−96°, 51.5°) in the φ-ψ map.

FIG. 14: Four typical trajectories initiated from the transition state region S1. The trajectories have equal probability to go to C7eq and αR, and they follow the transition tube.


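FIG. 1: [Schematic; labels: A, B, Γ_z.]

FIG. 2: [Four-panel schematic; labels: A, B, ϕ^n, ϕ̃, ϕ̄, ϕ^{n+1}.]

FIG. 3: [Image not recovered.]

FIG. 4: [Contour plot; horizontal axis φ, vertical axis ψ, both from −180 to 180; labels: C7eq, C′7eq, Cax.]

FIG. 5: [Image not recovered.]

FIG. 6: [Top panel: E (kcal/mol) versus α from 0 to 1, with labels S1, S2, S3, C7eq, Cax; lower panel: f(α) versus α, with α⋆ marked.]

FIG. 7: [Plot; horizontal axis φ, vertical axis ψ, both from −180 to 180; labels: S1, S2, S3, C7eq, C′7eq, Cax.]

FIG. 8: [Image not recovered.]

FIG. 9: [Two histograms of P(p_Cax) versus p_Cax from 0 to 1.]

FIG. 10: [Histogram of P(p_Cax) versus p_Cax from 0 to 1.]

FIG. 11: [Image not recovered.]

FIG. 12: [Image not recovered.]

FIG. 13: [Histogram of P(p_αR) versus p_αR from 0 to 1.]

FIG. 14: [Image not recovered.]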
