
Nonlinear Mechanics

A. W. Stetz

January 8, 2012


Contents

1 Lagrangian Dynamics
  1.1 Introduction
  1.2 Generalized Coordinates and the Lagrangian
  1.3 Virtual Work and Generalized Force
  1.4 Conservative Forces and the Lagrangian
      1.4.1 The Central Force Problem in a Plane
  1.5 The Hamiltonian Formulation
      1.5.1 The Spherical Pendulum

2 Canonical Transformations
  2.1 Contact Transformations
      2.1.1 The Harmonic Oscillator: Cracking a Peanut with a Sledgehammer
  2.2 The Second Generating Function
  2.3 Hamilton's Principle Function
      2.3.1 The Harmonic Oscillator: Again
  2.4 Hamilton's Characteristic Function
      2.4.1 Examples
  2.5 Action-Angle Variables
      2.5.1 The harmonic oscillator (for the last time)

3 Abstract Transformation Theory
  3.1 Notation
      3.1.1 Poisson Brackets
  3.2 Geometry in n Dimensions: The Hairy Ball
      3.2.1 Example: Uncoupled Oscillators
      3.2.2 Example: A Particle in a Box

4 Canonical Perturbation Theory
  4.1 One-Dimensional Systems
      4.1.1 Summary
      4.1.2 The simple pendulum
  4.2 Many Degrees of Freedom

5 Introduction to Chaos
  5.1 The total failure of perturbation theory
  5.2 Fixed points and linearization
  5.3 The Hénon oscillator
  5.4 Discrete Maps
  5.5 Linearized Maps
  5.6 Lyapunov Exponents
  5.7 The Poincaré-Birkhoff Theorem
  5.8 All in a tangle
  5.9 The KAM theorem and its consequences
      5.9.1 Two Conditions
  5.10 Conclusion


Chapter 1

Lagrangian Dynamics

1.1 Introduction

The possibility that deterministic mechanical systems could exhibit the behavior we now call chaos was first realized by the French mathematician Henri Poincaré sometime toward the end of the nineteenth century. His discovery emerged from analytic or classical mechanics, which is still part of the foundation of physics. To put it a bit facetiously, classical mechanics deals with those problems that can be "solved," in the sense that it is possible to derive equations of motion that describe the positions of the various parts of a system as functions of time using standard analytic functions. Nonlinear dynamics treats problems that cannot be so solved, and it is only in these problems that chaos can appear. The simple pendulum makes a good example. The differential equation of motion is

θ̈ + ω² sin θ = 0    (1.1)

The sine is a nonlinear function of θ. If we linearize by setting sin θ ≈ θ, the solutions are the elementary functions sin ωt and cos ωt. If we keep the sine, the solutions can only be expressed in terms of elliptic integrals. This is not a chaotic system, because there is only one degree of freedom, but if we hang one pendulum from the end of another, the equations of motion are hopeless to find (even with elliptic integrals) and the resulting motion can be chaotic.¹

¹ I should emphasize the distinction between the differential equations of motion, which are usually simple (though nonlinear), and the equations that describe the positions of the elements of the system as functions of time, which are usually non-existent.
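To make the distinction concrete, here is a small numerical sketch (my own addition, not part of the original notes) comparing the full pendulum equation (1.1) with its linearized version; the frequency and initial angle are arbitrary choices.

import numpy as np
from scipy.integrate import solve_ivp

# Integrate the nonlinear pendulum (1.1) and its linearization side by side.
# The two solutions agree for small angles and drift apart for large ones.
w = 1.0                 # omega, arbitrary
theta0 = 2.0            # a large initial angle, where sin(theta) != theta

def nonlinear(t, y):
    theta, thetadot = y
    return [thetadot, -w**2 * np.sin(theta)]

def linearized(t, y):
    theta, thetadot = y
    return [thetadot, -w**2 * theta]

t_eval = np.linspace(0, 20, 500)
full = solve_ivp(nonlinear, [0, 20], [theta0, 0.0], t_eval=t_eval, rtol=1e-9)
lin = solve_ivp(linearized, [0, 20], [theta0, 0.0], t_eval=t_eval, rtol=1e-9)

print(np.max(np.abs(full.y[0] - lin.y[0])))   # large: the linearization fails at this amplitude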


In order to arrive at Poincaré's moment of discovery, we will have to review the development of classical mechanics through the nineteenth century. This material is found in many standard texts, but I will cover it here in some detail. This is partly to ensure uniform notation throughout these lectures and partly to focus on those things that lead directly to chaos in nonlinear systems. We will begin by formulating mechanics in terms of generalized coordinates and Lagrange's equations of motion. We then study Legendre transformations and use them to derive Hamilton's equations of motion. These equations are particularly suited to conservative systems in which the Hamiltonian is constant in time, and it is such systems that will be our primary concern. It turns out that canonical transformations can be used to transform Hamiltonians in a myriad of ways. One particularly elegant form uses action-angle variables to transform a certain class of problems into a set of uncoupled harmonic oscillators. Systems that can be so transformed are said to be integrable, which is to say that they can be "solved," at least in principle. What happens, Poincaré asked, to a system that is almost but not quite integrable? The answer entails perturbation theory and leads to the disastrous problem of small divisors. This is the path that led originally to the discovery of chaos, and it is the one we will pursue here.

1.2 Generalized Coordinates and the Lagrangian

Vector equations, like F = ma, seem to imply a coordinate system. Beginning students learn to use cartesian coordinates and then learn that this is not always the best choice. If the system has cylindrical symmetry, for example, it is best to use cylindrical coordinates: it makes the problem easier. By "symmetry" we mean that the number of degrees of freedom of the system is less than the dimensionality of the space in which it is embedded. The familiar example of the block sliding down the inclined plane will make this clear. Let's say that it's a two-dimensional problem with an x-y coordinate system. The block is constrained to move in a straight line, however, so that its position can be completely specified by one variable, i.e. it has one degree of freedom. The clever student chooses the x axis so that it lies along the path of the block. This reduces the problem to one dimension, since y = 0 and the x coordinate is given by one simple equation. In the pendulum example from the previous section, it was most convenient to use a polar coordinate system centered at the pivot. Since r is constant, the motion can be described completely in terms of θ.



These coordinate systems conceal a subtle point: the pendulum moves in a circular arc and the block moves in a straight line because they are acted on by forces of constraint. In most cases we are not interested in these forces. Our choice of coordinates simply makes them disappear from the problem. Most problems don't have obvious symmetries, however. Consider a bead sliding along a wire following some complicated snaky path in 3-d space. There's only one degree of freedom, since the particle's position is determined entirely by its distance measured along the wire from some reference point. The forces are so complicated, however, that it is out of the question to solve the problem by using F = ma in any straightforward way. This is the problem that Lagrangian mechanics is designed to handle. The basic (and quite profound) idea is that even though there may be no coordinate system (in the usual sense) that will reduce the dimensionality of the problem, yet there is usually a system of coordinates that will do this. Such coordinates are called generalized coordinates.

To be more specific, suppose that a system consists of N point masses with positions specified by ordinary three-dimensional cartesian vectors, r_i, i = 1 · · · N, subject to some constraints. The easiest constraints to deal with are those that can be expressed as a set of l equations of the form

f_j(r_1, r_2, . . . , t) = 0,    (1.2)

where j = 1 · · · l. Such constraints are said to be holonomic. If, in addition, the equations of constraint do not involve time explicitly, they are said to be scleronomous; otherwise they are called rheonomous. These constraints can be used to reduce the 3N cartesian components to a set of 3N − l variables q_1, q_2, . . . , q_{3N−l}. The relationship between the two is given by a set of N equations of the form

r_i = r_i(q_1, q_2, . . . , q_{3N−l}, t).    (1.3)

The q's used in this way are the generalized coordinates. In the example of the bead on a curved wire, the equations would reduce to r = r(q), where q is a distance measured along the wire. This simply specifies the curvature of the wire.

It should be noted that the q's need not all have the same units. Also note that we can use the same notation even if there are no constraints. For example, the position of an unconstrained particle could be written r = r(q_1, q_2, q_3), and the q's might represent cartesian, spherical, or cylindrical coordinates. In order to simplify the notation, we will often pack the q's into an array and use vector notation,

q = (q_1, q_2, q_3, . . .)    (1.4)

This is not meant to imply that q is a vector in the usual sense. For one thing, it does not necessarily possess "a magnitude and a direction" as good vectors are supposed to have. By the same token, we cannot use the notion of orthogonal unit vectors.

Along with the notion of generalized coordinates comes that of generalized velocities,

q̇_k ≡ dq_k/dt    (1.5)

Since q_k depends only on t, this is a total derivative, but when we differentiate r_i, we must remember that it depends both explicitly on time as well as implicitly through the q's.

ṙ_i = ∑_k (∂r_i/∂q_k) q̇_k + ∂r_i/∂t    (1.6)

(In this chapter I will consistently use the index i to sum over the N point masses and k to sum over the 3N − l degrees of freedom.) Differentiating both sides with respect to q̇_k yields

∂ṙ_i/∂q̇_k = ∂r_i/∂q_k    (1.7)

which will be useful in the following derivations.

1.3 Virtual Work and Generalized Force

There are several routes for deriving Lagrange's equations of motion. The most elegant and general makes use of the principle of least action and the calculus of variations. I will use a much more pedestrian approach based on Newton's second law of motion. First note that F = ma can be written in the rather arcane form

d/dt (∂T/∂v_i) = F_i    (1.8)

where F_i is the i-th component of the total force acting on a particle with kinetic energy T. The point of writing this in terms of energy rather than acceleration is that we can separate out the forces of constraint, which are always perpendicular to the direction of motion and hence do no work. The trick is to write this in terms of generalized coordinates and velocities. This is rather technical, but the underlying idea is simple, and the result looks much like (1.8).

The q_k's are all independent, so we can vary one by a small amount δq_k while holding all others constant.

δr_i = ∑_k (∂r_i/∂q_k) δq_k    (1.9)

This is sometimes called a virtual displacement. The corresponding virtual work is

δW_k = ∑_i (F_i · ∂r_i/∂q_k) δq_k    (1.10)

We define a generalized force

ℑ_k = ∑_i F_i · ∂r_i/∂q_k    (1.11)

The forces of constraint can be excluded from the sum for the reason explained above. We are left with

ℑ_k = δW_k/δq_k    (1.12)

The kinetic energy is calculated using ordinary velocities.

T = (1/2) ∑_i m_i ṙ_i · ṙ_i    (1.13)

∂T/∂q_k = ∑_i m_i ṙ_i · ∂ṙ_i/∂q_k = ∑_i p_i · ∂ṙ_i/∂q_k    (1.14)

∂T/∂q̇_k = ∑_i m_i ṙ_i · ∂ṙ_i/∂q̇_k = ∑_i p_i · ∂r_i/∂q_k    (1.15)

Equation (1.7) was used to obtain the last term. A straightforward calculation now leads to

ℑ_k = d/dt (∂T/∂q̇_k) − ∂T/∂q_k    (1.16)

which is the generalized form of (1.8).



1.4 Conservative Forces and the Lagrangian

So far we have made no assumptions about the nature of the forces included in ℑ except that they are not forces of constraint. Equation (1.16) is therefore quite general, although seldom used in this form. In these notes we are primarily concerned with conservative forces, i.e. forces that can be derived from a potential.

F_i = −∇_i V(r_1 · · · r_N)    (1.17)

Notice that V doesn't depend on velocity. (Electromagnetic forces are velocity dependent, of course, but they can easily be accommodated into the Lagrangian framework. I will return to this issue later on.) Now calculate the work done by changing some of the q's.

W = ∑_i ∫ F_i · dr_i = −∑_i ∫ ∇_i V · dr_i
  = −∑_i ∫ ∇_i V · ∑_k (∂r_i/∂q_k) dq_k
  = −∑_k ∫ (∑_i ∇_i V · ∂r_i/∂q_k) dq_k
  = −∑_k ∫ (∂V/∂q_k) dq_k    (1.18)

The integral is a multidimensional definite integral over the various q's that have changed. Summing over (1.12) then gives

δW = ∑_k δW_k = ∑_k ℑ_k δq_k    (1.19)

W = ∑_k ∫ ℑ_k dq_k    (1.20)

Comparison with (1.18) yields

ℑ_k = −∂V/∂q_k    (1.21)

Finally define the Lagrangian

L = T − V    (1.22)



Equation (1.16) becomes

d/dt (∂L/∂q̇_k) − ∂L/∂q_k = 0.    (1.23)

Equation (1.23) represents a set of 3N − l second-order differential equations called Lagrange's equations of motion. I can summarize this long development by giving you a "cookbook" procedure for using (1.23) to solve mechanics problems: first select a convenient set of generalized coordinates. Then calculate T and V in the usual way using the r_i's. Use equation (1.3) to eliminate the r_i's in favor of the q_k's. Finally, substitute L into (1.23) and solve the resulting equations.
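As a concrete illustration (my own addition, not part of the original notes), the following sketch uses the sympy library to grind out (1.23) for the plane pendulum of Section 1.1; the symbol names and the use of sympy's euler_equations helper are my choices.

import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
m, g, R = sp.symbols('m g R', positive=True)
theta = sp.Function('theta')

# Steps of the cookbook: T and V written directly in the generalized coordinate theta.
T = sp.Rational(1, 2) * m * R**2 * sp.Derivative(theta(t), t)**2
V = -m * g * R * sp.cos(theta(t))
L = T - V

# Lagrange's equation (1.23) for the single coordinate theta.
eq = euler_equations(L, [theta(t)], t)[0]
print(sp.simplify(eq))   # equivalent to R*theta'' + g*sin(theta) = 0, i.e. (1.1) with omega**2 = g/R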

Classical mechanics texts are full of examples in which this program is carried to a successful conclusion. In fact, most of these problems are contrived and of little interest except to illustrate the method. The vast majority of systems lead to differential equations that cannot be solved in closed form. The modern emphasis is to understand the solutions qualitatively and then obtain numerical solutions using the computer. The Hamiltonian formalism described in the next section is better suited to both these ends.

1.4.1 The Central Force Problem in a Plane

Consider the central force problem as an example of this technique.

V = V(r)    F = −∇V    (1.24)

L = T − V = (1/2) m (ṙ² + r²ϕ̇²) − V(r)    (1.25)

Let's choose our generalized coordinates to be q_1 = r and q_2 = ϕ. Equation (1.23) becomes

m r̈ − m r ϕ̇² + dV/dr = 0    (1.26)

d/dt (m r² ϕ̇) = 0    (1.27)

This last equation tells us that there is a quantity m r² ϕ̇ that does not change with time. Such a quantity is said to be conserved. In this case we have rediscovered the conservation of angular momentum. This reduces the problem to one dimension.

m r² ϕ̇ ≡ l_z = constant    (1.28)

m r̈ = l_z²/(m r³) − dV/dr    (1.29)



Since there are no constraints, the generalized forces are identical with the ordinary forces.

ℑ_ϕ = −dV/dϕ = 0    ℑ_r = −dV/dr    (1.30)

This equation has an elegant closed-form solution in the special case of gravitational attraction.

V = −GmM/r ≡ −k/r    (1.31)

m r̈ = l_z²/(m r³) − k/r²    (1.32)

This apparently nonlinear equation yields to a simple trick: let u = 1/r.

d²u/dϕ² + u = mk/l_z²    (1.33)

If the motion is circular, u is constant. Otherwise it oscillates around the value mk/l_z² with simple harmonic motion.² The period of oscillation is identical with the period of rotation, so the corresponding orbit is an ellipse.
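Here is a rough numerical check of this claim (my addition; the constants are arbitrary): integrating (1.28) and (1.29) for the gravitational potential, the radius returns to its starting value after ϕ has advanced by 2π, so the bounded orbit closes.

import numpy as np
from scipy.integrate import solve_ivp

m, k, lz = 1.0, 1.0, 0.9          # arbitrary units, chosen so the orbit is bounded

def rhs(t, y):
    r, rdot, phi = y
    rddot = lz**2 / (m**2 * r**3) - k / (m * r**2)   # (1.29) with V = -k/r
    phidot = lz / (m * r**2)                         # (1.28)
    return [rdot, rddot, phidot]

sol = solve_ivp(rhs, [0, 50], [1.0, 0.0, 0.0], rtol=1e-10, atol=1e-12)

# r should return to its initial value each time phi advances by 2*pi.
phi = sol.y[2]
r_at_2pi = np.interp(2 * np.pi, phi, sol.y[0])
print(r_at_2pi)   # ~1.0, the starting radius: the orbit closes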

This problem was easy to solve because we were able to discover a nontrivial quantity that was constant, in this case the angular momentum. The constant enabled us to reduce the number of independent variables from two to one. Such a conserved quantity is called an integral of the motion or a constant of the motion. Obviously, the more such quantities one can find, the easier the problem. This raises two practical problems. First, how can we tell, perhaps from looking at the physics of a problem, how many independent conserved quantities there are? Second, how are we to find them?

In the central force problem, both of these questions answered themselves. We know that angular momentum is conserved. This fact manifests itself in the Lagrangian in that L depends on ϕ̇ but not on ϕ. Such a coordinate is said to be cyclic or ignorable. Let q be such a coordinate. Then

d/dt (∂L/∂q̇) = 0    (1.34)

The quantity in brackets has a special significance. It is called the canonically conjugate momentum.³

∂L/∂q̇_k ≡ p_k    (1.35)

² This illustrates a general principle in physics: When correctly viewed, everything is a harmonic oscillator.
³ This notation is universally used, hence the old aphorism that mechanics is a matter of minding your p's and q's.
of minding your p’s and q’s.



To summarize, if q is cyclic, p is conserved.

Suppose we had tried to do the central force problem in cartesian coordinates. Both x and y would appear in the Lagrangian, and neither p_x nor p_y would be constant. If we insisted on this, central forces would remain an intractable problem in two dimensions. We need to choose our generalized coordinates so that there are as many cyclic variables as possible. The two questions reemerge: how many are we entitled to, and how do we find the corresponding p's and q's?

A partial answer to the first is given by a well-known result called Noether's theorem: for every transformation that leaves the Lagrangian invariant there is a constant of the motion.⁴ This theorem (which underlies all of modern particle physics) says that there is a fundamental connection between symmetries and invariance principles on one hand and conservation laws on the other. Momentum is conserved because the laws of physics are invariant under translation. Angular momentum is conserved because the laws of physics are invariant under rotation. Despite its fundamental significance, Noether's theorem is not much help in practical calculations. Granted, it gives a procedure for finding the conserved quantity after the corresponding symmetry transformation has been found, but how is one to find the transformation? The physicist must rely on his traditional tools: inspiration, the Ouija Board, and simply pounding one's head against a wall. The fact remains that there are simple systems, e.g. the Hénon-Heiles problem to be discussed later, that have fascinated physicists for decades and for which the existence of these transformations is still controversial.

I will have much more to say about the second question. As you will see, there is a more or less "cookbook" procedure for finding the right set of variables and some fundamental results about the sorts of problems for which these procedures are possible.

⁴ See Finch and Hand for a simple proof and further discussion.

1.5 The Hamiltonian Formulation

I will explain the Hamiltonian assuming that there is only one degree of freedom. It's easy to generalize once the basic ideas are clear. Lagrangians are functions of q and q̇. We define a new function of q and p (with p given by (1.35)).

H(p, q) = p q̇ − L(q, q̇)    (1.36)

The new function is called the Hamiltonian, and the transformation L → H is called a Legendre transformation. The equation is much more subtle than it looks. In fact, it's worth several pages of explanation.

It's clear from elementary mechanics that q, q̇, and p can't all be independent variables, since p = m q̇. You might say that there are two ways of formulating Newton's second law: a (q, q̇) formulation, F = m q̈, and a (q, p) formulation, F = ṗ. The connection between q and its canonically conjugate momentum is usually more complicated than this, but there is still a (q, q̇) formulation, the Lagrangian, and a (q, p) formulation, the Hamiltonian. The Legendre transformation is a procedure for transforming the one formulation into the other. The key point is that it is invertible.⁵ To see what this means, let's first assume that q, q̇, and p are all independent.

H(q, q̇, p) = p q̇ − L(q, q̇)    (1.37)

dH = (p − ∂L/∂q̇) dq̇ + q̇ dp − (∂L/∂q) dq    (1.38)

What is the condition that H not depend on q̇?

p(q, q̇) = ∂L(q, q̇)/∂q̇    (1.39)

OK. This is the definition of p anyhow, so we're on the right track.

dH = q̇ dp − (∂L/∂q) dq

dH = (∂H/∂p) dp + (∂H/∂q) dq

Adding and subtracting these two equations gives

q̇(q, p) = ∂H/∂p    (1.40)

−∂L/∂q = ∂H/∂q    (1.41)

Combining (1.23), (1.39), and (1.41) gives the fourth major result.

ṗ(q, p) = −∂H/∂q    (1.42)

⁵ The following argument is taken from Finch & Hand.



Now here's what I mean when I say that Legendre transformations are invertible. First follow the steps from L → H. We start with L = L(q, q̇). Equation (1.39) gives p = p(q, q̇). Invert this to find q̇ = q̇(q, p). The Hamiltonian is now

H(q, p) = q̇(q, p) p − L[q, q̇(q, p)].    (1.43)

Now suppose that we start from H = H(p, q). Use (1.40) to find q̇ = q̇(q, p). Invert to find p = p(q, q̇). Finally

L(q, q̇) = q̇ p(q, q̇) − H[q, p(q, q̇)]    (1.44)

In both cases we were able to complete the transformation without knowing ahead of time the functional relationship among q, q̇, and p. To summarize: equations (1.37), (1.39), and (1.41) enable us to transform between the (q, q̇) (Lagrangian) prescription and the (q, p) (Hamiltonian) prescription, while (1.40) and (1.42) are Hamilton's equations of motion.
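The following sketch (my own, using sympy and the harmonic oscillator Lagrangian as a stand-in for a generic L) runs the round trip L → H → L described above.

import sympy as sp

q, qdot, p = sp.symbols('q qdot p', real=True)
m, k = sp.symbols('m k', positive=True)

L = sp.Rational(1, 2) * m * qdot**2 - sp.Rational(1, 2) * k * q**2

# (1.39): p = dL/dqdot, then invert to get qdot(q, p).
p_of_qdot = sp.diff(L, qdot)
qdot_of_p = sp.solve(sp.Eq(p, p_of_qdot), qdot)[0]

# (1.43): H(q, p) = qdot*p - L evaluated at qdot(q, p).
H = sp.simplify(qdot_of_p * p - L.subs(qdot, qdot_of_p))
print(H)                                              # p**2/(2*m) + k*q**2/2

# (1.40) and (1.44): recover the Lagrangian from H.
qdot_from_H = sp.diff(H, p)                           # = p/m
p_back = sp.solve(sp.Eq(qdot, qdot_from_H), p)[0]
L_back = sp.simplify(qdot * p_back - H.subs(p, p_back))
print(sp.simplify(L_back - L))                        # 0: the transformation inverts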

1.5.1 The Spherical Pendulum

A mass m hangs from a string of length R. The string makes an angle θ with the vertical and can rotate about the vertical with an angle ϕ.

T = (1/2) m R² (θ̇² + sin²θ ϕ̇²)    (1.45)

V = mgR(1 − cos θ)    (1.46)

The mgR constant doesn't appear in the equations of motion, so we can forget about it. The Lagrangian is L = T − V as usual.

p_θ = ∂L/∂θ̇ = m R² θ̇    (1.47)

p_ϕ = ∂L/∂ϕ̇ = m R² sin²θ ϕ̇ ≡ l_ϕ    (1.48)

The angle ϕ is cyclic, so p_ϕ = l_ϕ is constant. At this point we are still in the (q, q̇) prescription. Invert (1.47) and (1.48) to obtain θ̇ and ϕ̇ as functions of p_θ and l_ϕ.

θ̇ = p_θ/(m R²)    (1.49)

ϕ̇ = l_ϕ/(m R² sin²θ)    (1.50)

H = p_θ²/(2mR²) + l_ϕ²/(2mR² sin²θ) − mgR cos θ    (1.51)
θ



The equations of motion follow from this.

θ̇ = ∂H/∂p_θ = p_θ/(m R²)    (1.52)

ṗ_θ = −∂H/∂θ = l_ϕ² cos θ/(m R² sin³θ) − mgR sin θ    (1.53)

ϕ̇ = ∂H/∂p_ϕ = l_ϕ/(m R² sin²θ)    (1.54)

ṗ_ϕ = 0    (1.55)
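Before trying for an analytic solution, it's worth noting that these equations are easy to solve numerically. The sketch below (my addition; m, R, g, and l_ϕ are arbitrary) integrates (1.52)-(1.53) and checks that H stays constant along the trajectory.

import numpy as np
from scipy.integrate import solve_ivp

m, R, g, lphi = 1.0, 1.0, 9.8, 0.5

def H(theta, ptheta):
    return (ptheta**2 / (2 * m * R**2)
            + lphi**2 / (2 * m * R**2 * np.sin(theta)**2)
            - m * g * R * np.cos(theta))

def rhs(t, y):
    theta, ptheta = y   # phi decouples; it can be recovered afterwards from (1.54)
    thetadot = ptheta / (m * R**2)                                      # (1.52)
    pthetadot = (lphi**2 * np.cos(theta) / (m * R**2 * np.sin(theta)**3)
                 - m * g * R * np.sin(theta))                           # (1.53)
    return [thetadot, pthetadot]

sol = solve_ivp(rhs, [0, 20], [1.0, 0.0], rtol=1e-10, atol=1e-12)
E = H(sol.y[0], sol.y[1])
print(E.max() - E.min())   # tiny: H = E is conserved, as claimed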

Suppose we were to try to find an analytic solution to this system of equations. First note that there are two constants of motion, the angular momentum l_ϕ and the total energy E = H.

1. Invert (1.51) to obtain p_θ = p_θ(θ, E, l_ϕ).

2. Substitute p_θ into (1.49) and integrate

   ∫ (mR²/p_θ) dθ = t ≡ N(θ)

   The integral is hopeless anyhow, so we label its output N(θ) (short for an exceedingly nasty function).

3. Invert the nasty function to find θ as a function of t.

4. Take the sine of this even nastier function and substitute it into (1.54) to find ϕ̇.

5. Integrate and invert to find ϕ as a function of t.

This makes sense in principle, but is wildly impossible in practice. Now suppose we could change the problem so that both θ and ϕ were cyclic, so that the two constants of motion were p_θ and p_ϕ (rather than E and p_ϕ). Then

θ̇ = ∂H/∂p_θ = ω_θ    θ = ω_θ t + θ_0

ϕ̇ = ∂H/∂p_ϕ = ω_ϕ    ϕ = ω_ϕ t + ϕ_0

Here ω_θ and ω_ϕ are two constant "frequencies" that we could easily extract from the Hamiltonian. This apparently small change makes the problem trivial! In both cases there are two constants of motion: it makes all the difference which two constants. This is the basis of the idea we will be pursuing in the next chapter.


Chapter 2

Canonical Transformations

We saw at the end of the last chapter that a problem in which all the generalized coordinates are cyclic is trivial to solve. We also saw that there is a great flexibility allowed in the choice of coordinates for any particular problem. It turns out that there is an important class of problems for which it is possible to choose the coordinates so that they are in fact all cyclic. The choice is usually far from obvious, but there is a formal procedure for finding the "magic" variables. One formulates the problem in terms of the natural p's and q's and then transforms to a new set of variables, usually called Q_k and P_k, that have the right properties.

2.1 Contact Transformations

The most general transformation is called a contact transformation.

Q_k = Q_k(q, p, t)    P_k = P_k(q, p, t)    (2.1)

(In this formula and what follows, the symbols p and q when used as arguments stand for the complete set, q_1, q_2, q_3, · · ·, etc.) There is a certain privileged class of transformations called canonical transformations that preserve the structure of Hamilton's equations of motion for all dynamical systems. This means that there is a new Hamiltonian function called K(Q, P) for which the new equations of motion are

Q̇_k = ∂K/∂P_k    Ṗ_k = −∂K/∂Q_k    (2.2)

In a footnote in Classical Mechanics, Goldstein suggested that K be called the Kamiltonian. The idea has caught on with several authors, and I will use it without further apology. The trick is to find it.



Theorem: Let F be any function of q_k and Q_k and possibly p_k and P_k, as well as time. Then the new Lagrangian defined by

L̄ = L − dF/dt    (2.3)

is equivalent to L in the sense that it yields the same equations of motion.

Proof:

Ḟ = ∑_k (∂F/∂q_k) q̇_k + ∑_k (∂F/∂Q_k) Q̇_k + ∂F/∂t    (2.4)

d/dt (∂Ḟ/∂q̇_k) = d/dt (∂F/∂q_k) = ∂Ḟ/∂q_k

d/dt (∂Ḟ/∂Q̇_k) = d/dt (∂F/∂Q_k) = ∂Ḟ/∂Q_k

These last two can be rewritten

d/dt (∂Ḟ/∂q̇_k) − ∂Ḟ/∂q_k = 0

d/dt (∂Ḟ/∂Q̇_k) − ∂Ḟ/∂Q_k = 0

So Ḟ satisfies Lagrange's equation whether we regard it as a function of q_k or Q_k. Obviously, if L satisfies Lagrange's equation, then so does L − Ḟ. (The conclusion is unchanged if F contains p_k and/or P_k.) The function F is called the generating function of the transformation.

K is obtained by a Legendre transformation just as H was.

K(Q, P) = ∑_k P_k Q̇_k − L̄(Q, Q̇, t)    (2.5)

This has the same form as (1.36), so the derivation of the equations of motion (1.39) through (1.42) is unchanged as well.

P_k = ∂L̄/∂Q̇_k    Q̇_k = ∂K/∂P_k    Ṗ_k = −∂K/∂Q_k    (2.6)

These simple results provide the framework for canonical transformations. In order to use them we will need to know two more things: (1) how to find F, and, given F, (2) how to find the transformation (q, p) → (Q, P). We deal with (2) now and postpone (1) to later sections.

Consider the variables q, Q, p, and P. Any two of these constitute a complete set, so there are four kinds of generating functions, usually called F_1(q, Q, t), F_2(q, P, t), F_3(p, Q, t), and F_4(p, P, t). All four are discussed in Goldstein. F_1 provides a good introduction. Most of our work will make use of F_2.

Starting with F_1(q, Q), (2.3) becomes

L̄(Q, Q̇, t) = L(q, q̇, t) − (d/dt) F_1(q, Q, t)    (2.7)

Since L̄ does not depend on q̇_k, we get with the help of (2.4)

∂L̄/∂q̇_k = ∂L/∂q̇_k − ∂Ḟ_1/∂q̇_k = p_k − ∂F_1/∂q_k = 0,

and since L does not depend on Q̇_k,

P_k = ∂L̄/∂Q̇_k = ∂L/∂Q̇_k − ∂Ḟ_1/∂Q̇_k = −∂F_1/∂Q_k.

This yields the two transformation equations

p_k = ∂F_1/∂q_k    (2.8)

P_k = −∂F_1/∂Q_k    (2.9)

A straightforward set of substitutions gives our final formula for the Kamiltonian.

K = ∑_k [−(∂F/∂Q_k) Q̇_k + (∂F/∂q_k) q̇_k + (∂F/∂Q_k) Q̇_k] − L + ∂F/∂t = −L + ∑_k p_k q̇_k + ∂F/∂t

To be more explicit,

K(Q, P) = H(q(Q, P), p(Q, P), t) + (∂/∂t) F_1(q(Q, P), Q, t)    (2.10)

Summary:



1. Here is the typical problem: we are given the Hamiltonian H = H(q, p) for some conservative system. H = E is constant, but the q's and p's change with time in a complicated way. Our goal is to find the functions q = q(t) and p = p(t) using the technique of canonical transformations.

2. We need to know the generating function F = F_1(q, Q). This is the hard part, and I'm postponing it as long as possible.

3. Substitute F into (2.8) and (2.9). This gives a set of coupled algebraic equations for q, Q, p, and P. They must be combined in such a way as to give q_k = q_k(Q, P) and p_k = p_k(Q, P).

4. Use (2.10) to find K. If we had the right generating function to start with, Q will be cyclic, i.e. K = K(P). The equations of motion are obtained from (2.6): Ṗ_k = 0 and Q̇_k = ω_k. The ω's are a set of constants, as are the P's. Q_k(t) = ω_k t + α_k. The α's are constants obtained from the initial conditions.

5. Finally, q_k(t) = q_k(Q(t), P) and p_k(t) = p_k(Q(t), P).

2.1.1 The Harmonic Oscillator: Cracking a Peanut with a Sledgehammer

H = p²/(2m) + kq²/2 = (1/(2m))(p² + m²ω²q²)    (2.11)

It's useful to try a new technique on an old problem. As it turns out, the generating function is

F = (mωq²/2) cot Q    (2.12)

The transformation is found from (2.8) and (2.9).

p = ∂F/∂q = mωq cot Q

P = −∂F/∂Q = mωq²/(2 sin²Q)

Solve for p and q in terms of P and Q and then substitute into (2.10) to find K.

q = √(2P/(mω)) sin Q    p = √(2Pmω) cos Q



K = ωP    P = E/ω

We have achieved our goal. Q is cyclic, and the equations of motion are trivial.

Q̇ = ∂K/∂P = ω    Q = ωt + Q_0    (2.13)

q = √(2E/(mω²)) sin(ωt + Q_0)    p = √(2mE) cos(ωt + Q_0)    (2.14)
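As a quick symbolic check (my own, using sympy), one can verify that the trajectory (2.14) produced by this transformation does satisfy Hamilton's equations for the original Hamiltonian (2.11).

import sympy as sp

t, Q0 = sp.symbols('t Q0', real=True)
m, w, E = sp.symbols('m omega E', positive=True)
qs, ps = sp.symbols('q p', real=True)

# The transformed solution (2.14).
q = sp.sqrt(2*E/(m*w**2)) * sp.sin(w*t + Q0)
p = sp.sqrt(2*m*E) * sp.cos(w*t + Q0)

# The original Hamiltonian (2.11), with k = m*omega**2.
H = ps**2/(2*m) + m*w**2*qs**2/2

dHdp = sp.diff(H, ps).subs({qs: q, ps: p})
dHdq = sp.diff(H, qs).subs({qs: q, ps: p})

print(sp.simplify(sp.diff(q, t) - dHdp))   # 0: qdot = dH/dp
print(sp.simplify(sp.diff(p, t) + dHdq))   # 0: pdot = -dH/dq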

2.2 The Second Generating Function

There's an old recipe for tiger stew that begins, "First catch the tiger." In our quest for the tiger, we now turn our attention to the second generating function, F_2 = F_2(q, P, t). F_2 is obtained from F_1 by means of a Legendre transformation.¹

F_2(q, P) = F_1(q, Q) + ∑_k Q_k P_k    (2.15)

¹ When in doubt, do a Legendre transformation.

We are looking for transformation equations analogous to (2.8) and (2.9). Since L = L̄ + Ḟ_1,

∑_k p_k q̇_k − H = ∑_k P_k Q̇_k − K + (d/dt)(F_2 − ∑_k Q_k P_k)
                = −∑_k Q_k Ṗ_k − K + Ḟ_2

Substitute

Ḟ_2 = ∑_k [(∂F_2/∂q_k) q̇_k + (∂F_2/∂P_k) Ṗ_k] + ∂F_2/∂t

−H = −K + ∑_k [(∂F_2/∂q_k − p_k) q̇_k + (∂F_2/∂P_k − Q_k) Ṗ_k] + ∂F_2/∂t

We are working on the assumption that q̇ and Ṗ are not independent variables. We enforce this by requiring that

∂F_2/∂q_k = p_k    (2.16)

∂F_2/∂P_k = Q_k    (2.17)

K(q, P) = H(q(Q, P), P) + (∂/∂t) F_2(q(Q, P), P)    (2.18)



2.3 Hamilton's Principle Function

The F_1-style generating functions were used to transform to a new set of variables (q, p) → (Q, P) such that all the Q's were cyclic. As a consequence, the P's were constants of the motion, and the Q's were linear functions of time. The generating function itself was hard to find, however. The F_2 generating function goes one step further; it can transform to a set of variables in which both the Q's and P's are constant and simple functions of the initial values of the phase space variables. In essence, our transformation is

(q(t), p(t)) ↔ (q_0, p_0)

This is a time-dependent transformation, of course. The fact that we can find such transformations shows that the time evolution of a system is itself a canonical transformation.

We look for an F_2 so that K in (2.18) is identically zero! Then from (2.6), Q̇_k = 0 and Ṗ_k = 0. The appropriate generating function will be a solution to

H(q, p, t) + ∂F_2/∂t = 0    (2.19)

We eliminate p_k using (2.16):

H(q_1, . . . , q_n; ∂F_2/∂q_1, . . . , ∂F_2/∂q_n; t) + ∂F_2/∂t = 0.    (2.20)

The solution to this equation is usually called S, Hamilton's principle function. The equation itself is the Hamilton-Jacobi equation.²

There are two serious issues here: does it have a solution, and if it does, can we find it? We will take a less serious approach: if we can find a solution, then it most surely exists. Furthermore, if we can find it, it will have the form

S = ∑_k W_k(q_k) − αt    (2.21)

Partial differential equations that have solutions of the form (2.21) are said to be separable.³ Most of the familiar textbook problems in classical mechanics and atomic physics can be separated in this form. The question of separability does depend on the system of generalized coordinates used. For example, the Kepler problem is separable in spherical coordinates but not in cartesian coordinates. It would be nice to know whether a particular Hamiltonian could be separated with some system of coordinates, but no completely general criterion is known.⁴ As a rule of thumb, Hamiltonians with explicit time dependence are not separable.

² See Goldstein, Classical Mechanics, Chapter 10.
³ Or to be meticulous, completely separable.
⁴ There is a very technical result, the so-called Staeckel conditions, which gives necessary and sufficient conditions for separability in orthogonal coordinate systems.

If our Hamiltonian is separable, then when (2.21) is substituted into (2.20), the result will look like

f_1(q_1, dW_1/dq_1) + f_2(q_2, dW_2/dq_2) + · · · = α    (2.22)

Each function f_k is a function only of q_k and dW_k/dq_k. Since all the q's are independent, each function must be separately constant. This gives us a system of n independent, first-order, ordinary differential equations for the W_k's.

f_k(q_k, dW_k/dq_k) = α_k.    (2.23)

The W's so obtained are then substituted into (2.21). The resulting function for S is

F_2 ≡ S = S(q_1, . . . , q_n; α_1, . . . , α_n; α, t)

The final constant α is redundant for two reasons: first, ∑ α_k = α, and second, the transformation equations (2.16) and (2.17) involve derivatives with respect to q_k and P_k. When S is so differentiated, the −αt piece will disappear. In order to make this apparent, we will write S as follows:

F_2 ≡ S = S(q_1, . . . , q_n; α_1, . . . , α_n; t)    (2.24)

Since the F_2 generating functions have the form F_2(q, P), we are entitled to think of the α's as "momenta," i.e. α_k in (2.24) corresponds to P_k in (2.17). In a way this makes sense. Our goal was to transform the time-dependent q's and p's into a new set of constant Q's and P's, and the α's are most certainly constant. On the other hand, they are not the initial momenta p_0 that evolve into p(t). The relationship between α and p_0 will be determined later.

If we have done our job correctly, the Q's given by (2.17) are also constant. They are traditionally called β, so

Q_k = β_k = ∂S(q, α, t)/∂α_k    (2.25)

Again, the β's are constant, but they are not equal to q_0.

We can turn this into a cookbook algorithm.
and sufficient conditions for separability in orthogonal coordin<strong>at</strong>e systems.



1. Substitute (2.21) into (2.20) and separate variables.

2. Integrate the resulting first-order ODEs. The result will be n independent functions W_k = W_k(q, α). Put the W_k's back into (2.21) to construct S = S(q, α, t).

3. Find the constant β coordinates using

   β_k = ∂S/∂α_k

4. Invert these equations to find q_k = q_k(β, α, t).

5. Find the momenta with

   p_k = ∂S/∂q_k

2.3.1 The Harmonic Oscillator: Again

The harmonic oscillator provides an easy example of this procedure.

H = (1/(2m))(p² + m²ω²q²)

(1/(2m))[(∂S/∂q)² + m²ω²q²] + ∂S/∂t = 0    (2.26)

(1/(2m))[(∂W/∂q)² + m²ω²q²] = α    (2.27)

Since there is only one q, the entire quantity on the left of the equal sign is a constant.

W(q, α) = √(2mα) ∫ dq √(1 − mω²q²/(2α))

The new transformed constant "momentum" is P = α.

β = ∂S(q, α, t)/∂α = ∂W(q, α)/∂α − t = (1/ω) sin⁻¹[√(mω²/(2α)) q] − t    (2.28)

Invert this equation to find q as a function of t and β.

q = √(2α/(mω²)) sin(ωt + βω)
mω2



Evidently, β has something to do with initial conditions: ωβ = ϕ_0, the initial phase angle.

p = ∂S/∂q = √(2mα − m²ω²q²) = √(2mα) cos(ωt + ϕ_0)

The maximum value of p is √(2mE), so that makes sense too.
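A short symbolic check (again my addition, using sympy) that S = W − αt built from the integral above really does satisfy the Hamilton-Jacobi equation (2.26):

import sympy as sp

q, t = sp.symbols('q t', real=True)
m, w, alpha = sp.symbols('m omega alpha', positive=True)

# W left as an unevaluated integral, exactly as in the text.
W = sp.sqrt(2*m*alpha) * sp.Integral(sp.sqrt(1 - m*w**2*q**2/(2*alpha)), q)
S = W - alpha*t

hj = (sp.diff(S, q)**2 + m**2*w**2*q**2) / (2*m) + sp.diff(S, t)
print(sp.simplify(hj))   # 0 (inside the classical turning points)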

2.4 Hamilton's Characteristic Function

There is another way to use the F_2 generating function to turn a difficult problem into an easy one. In the previous section we chose F_2 = S = W − αt, so that K = 0. It is also possible to take F_2 = W(q) so that

K = H(q_k, ∂W/∂q_k) = E = α_1    (2.29)

The W obtained in this way is called Hamilton's characteristic function.

W = ∑_k W_k(q_k, α_1, . . . , α_n) = W(q_1, . . . , q_n, E, α_2, . . . , α_n) = W(q_1, . . . , q_n, α_1, . . . , α_n)    (2.30)

It generates a contact transformation with properties quite different from that generated by S. The equations of motion are

Ṗ_k = −∂K/∂Q_k = 0    (2.31)

Q̇_k = ∂K/∂P_k = ∂K/∂α_k = δ_k1    (2.32)

The new feature is that Q̇_1 = 1, so Q_1 = t − t_0. In general

Q_k = ∂W/∂α_k = β_k    (2.33)

but now β_1 = t − t_0. As before,

p_k = ∂W/∂q_k    (2.34)

The algorithm now works like this:



1. Substitute (2.30) into (2.29) and separate variables.

2. Integrate the resulting first-order ODEs. The result will be n independent functions W_k = W_k(q, α). Put the W_k's back into (2.30) to construct W = W(q, α).

3. Find the constant β coordinates using

   β_k = ∂W/∂α_k    (2.35)

   Remember that β_1 = t − t_0.

4. Invert these equations to find q_k = q_k(β, α, t).

5. Find the momenta with

   p_k = ∂W/∂q_k    (2.36)

2.4.1 Examples

Problems with one degree of freedom are virtually identical whether they are formulated in terms of the characteristic function or the principle function. Take, for example, the harmonic oscillator from the previous section. Equation (2.28) becomes

β = ∂W(q, α)/∂α = (1/ω) sin⁻¹[√(mω²/(2α)) q] = t − t_0

q = √(2α/(mω²)) sin[ω(t − t_0)]    (2.37)

The following problem raises some new issues. Consider a particle in a stable orbit in a central potential. The motion will lie in a plane, so we can do the problem in two dimensions.

H = (1/(2m))(p_r² + p_ψ²/r²) + V(r)    (2.38)

p_ψ = mr²ψ̇ is the angular momentum. It is conserved since ψ is cyclic.

(1/(2m))[(∂W/∂r)² + (1/r²)(∂W/∂ψ)²] + V(r) = α_1    (2.39)



r²(dW_r/dr)² + 2mr²V(r) − 2mα_1r² + (dW_ψ/dψ)² = 0    (2.40)

At this point we notice that ∂W/∂ψ = p_ψ, which we know is constant. Why not call it something like α_ψ? Then W_ψ = α_ψψ. This is worth stating as a general principle: if q is cyclic, W_q = α_q q, where α_q is one of the n constant α's appearing in (2.30).

W = ∫ dr √(2m(α_1 − V) − α_ψ²/r²) + α_ψψ    (2.41)

We can find r as a function of time by inverting the equation for β_1, just as we did in (2.37), but more to the point

β_ψ = ∂W/∂α_ψ = −∫ α_ψ dr/(r²√(2m(α_1 − V) − α_ψ²/r²)) + ψ    (2.42)

Make the usual substitution, u = 1/r.

ψ − β_ψ = −∫ du/√(2m(α_1 − V(r))/α_ψ² − u²)    (2.43)

This is a new kind of equation of motion, which gives ψ = ψ(r) or r = r(ψ) (assuming we can do the integral), i.e. there is no explicit time dependence. Such equations are called orbit equations. Often it will be more useful to have the equations in this form, when we are concerned with the geometric properties of the trajectories.
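To see the orbit equation in action, here is a small numerical sketch (my own, for the Kepler potential V = −k/r with arbitrary constants): integrating dψ/dr from (2.42) between the turning points shows that ψ advances by 2π over one full radial oscillation, which is another way of saying that the bounded Kepler orbit closes.

import numpy as np
from scipy.integrate import quad

m, k, alpha_psi = 1.0, 1.0, 0.9      # mass, force constant, angular momentum
alpha1 = -0.3                        # total energy (negative: bound orbit)

# Turning points: roots of 2m(alpha1 + k/r) - alpha_psi**2/r**2 = 0.
r_minus, r_plus = np.sort(np.roots([2*m*alpha1, 2*m*k, -alpha_psi**2]).real)

def dpsi_dr(r):
    return alpha_psi / (r**2 * np.sqrt(2*m*(alpha1 + k/r) - alpha_psi**2/r**2))

half_cycle, _ = quad(dpsi_dr, r_minus, r_plus, limit=200)
print(2 * half_cycle / np.pi)        # ~2.0: delta-psi per radial period is 2*pi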

2.5 Action-Angle Variables

We are pursuing a route to chaos that begins with periodic or quasi-periodic systems. A particularly elegant approach to these systems makes use of a variant of Hamilton's characteristic function. In this technique, the integration constants α_k appearing directly in the solution of the Hamilton-Jacobi equation are not themselves chosen to be the new momenta. Instead, we define a set of constants I_k, which form a set of n independent functions of the α's known as action variables. The coordinates conjugate to the I's are angles that increase linearly with time. You are familiar with a system that behaves just like this, the harmonic oscillator!

q = √(2E/k) sin ψ    p = √(2mE) cos ψ



where ψ = ωt + ψ_0. In the language of action-angle variables I = E/ω, so

q = √(2I/(mω)) sin ψ    p = √(2mIω) cos ψ

I is the "momentum" conjugate to the "coordinate" ψ.

Action-angle variables are only appropriate to periodic motion, and there are other restrictions we will learn as we go along, but within these limitations, all systems can be transformed into a set of uncoupled harmonic oscillators.⁵ To see what "periodic motion" implies, have a look at the simple pendulum.

H = p_θ²/(2ml²) − mgl cos θ = E = α    (2.44)

p_θ = ±√(2ml²(E + mgl cos θ))    (2.45)

There are two kinds of motion possible. If E is small, the pendulum will reverse at the points where p_θ = 0. The motion is called libration, i.e. bounded and periodic. If E is large enough, however, the pendulum will swing around a complete circle. Such motion is called rotation (obviously). There is a critical value of E = mgl for which, in principle, the pendulum could stand straight up motionless at θ = π. An orbit in the p_θ-θ phase space corresponding to this energy forms the dividing line between the two kinds of motion. Such a trajectory is called a separatrix.

⁵ When correctly viewed, everything is a harmonic oscillator.
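The three kinds of trajectories are easy to visualize directly from (2.45). The sketch below (my addition, using matplotlib with arbitrary m, g, l) traces a libration curve, the separatrix, and a rotation curve in the p_θ-θ plane.

import numpy as np
import matplotlib.pyplot as plt

m, g, l = 1.0, 9.8, 1.0
theta = np.linspace(-np.pi, np.pi, 1000)

# Energies below, at, and above the critical value E = mgl.
for E in [0.5 * m * g * l, m * g * l, 1.5 * m * g * l]:
    radicand = 2 * m * l**2 * (E + m * g * l * np.cos(theta))      # (2.45)
    ptheta = np.where(radicand >= 0, np.sqrt(np.abs(radicand)), np.nan)
    plt.plot(theta, ptheta, label=f"E = {E:.1f}")
    plt.plot(theta, -ptheta)

plt.xlabel("theta")
plt.ylabel("p_theta")
plt.legend()
plt.show()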

For either type of periodic motion, we can introduce a new variable I designed to replace α as the new constant momentum.

I(α) = (1/2π) ∮ p(q, α) dq    (2.46)

This is a definite integral taken over a complete period of libration or rotation.⁶ I will prove (1) the angle ψ conjugate to I is cyclic, and (2) ∆ψ = 2π corresponds to one complete cycle of the periodic motion.

⁶ Textbooks are about equally divided on whether to call the action I or J and whether or not to include the factor 1/2π.

1. Since I = I(α) and H = α, it follows that H is a function of I only, H = H(I).

   İ = −∂H/∂ψ = 0    ψ̇ = ∂H/∂I = ω(I)    (2.47)



2. We are using an F2 type gener<strong>at</strong>ing function, which is a function of the<br />

old coordin<strong>at</strong>e and new momentum. Hamilton’s characteristic function<br />

can be written as<br />

W = W (q, I). (2.48)<br />

The transform<strong>at</strong>ion equ<strong>at</strong>ions are<br />

Note th<strong>at</strong><br />

so<br />

<br />

dψ =<br />

∂ψ<br />

∂q<br />

ψ = ∂W<br />

∂I<br />

∂ψ<br />

∂q<br />

<br />

∂ ∂W<br />

dq =<br />

∂I ∂q<br />

p = ∂W<br />

∂q<br />

( )<br />

∂ ∂W<br />

=<br />

∂I ∂q<br />

<br />

∂<br />

dq =<br />

∂I<br />

2.5.1 The harmonic oscill<strong>at</strong>or (for the last time)<br />

H = 1<br />

2m (p2 + m 2 ω 2 q 2 )<br />

p = ± √ 2mE − m2ω2q 2<br />

I = 1<br />

<br />

√2mE<br />

− m2ω2q 2 dq<br />

2π<br />

(2.49)<br />

p dq = ∂<br />

(2πI) = 2π.<br />

∂I<br />

The integral is tricky in this form because p changes sign <strong>at</strong> the turning<br />

points. We won’t have to worry about this if we make the substitution<br />

q =<br />

√<br />

2E<br />

sin ψ (2.50)<br />

mω2 This substitution not only makes the integral easy and takes care of the sign<br />

change, it also makes clear the meaning of an integral over a complete cycle,<br />

i.e. ψ goes from 0 to 2π.<br />

I = E<br />

πω<br />

<br />

cos 2 ψ dψ = E/ω<br />

From this point of view the introduction of ψ <strong>at</strong> (50) seems nothing<br />

more th<strong>at</strong> a m<strong>at</strong>hem<strong>at</strong>ical trick. We would have stumbled on it eventually,


30 CHAPTER 2. CANONICAL TRANSFORMATIONS<br />

however, as the following argument shows. The Hamilton-Jacobi equ<strong>at</strong>ion<br />

is<br />

[ (dW ) 2<br />

1<br />

+ m<br />

2m dq<br />

2 ω 2 q 2<br />

]<br />

= E<br />

∫<br />

√2mIω<br />

W =<br />

− m2ω2q 2 dq<br />

∂W<br />

∂I<br />

∫<br />

= mω<br />

dq<br />

√ 2mIω − m 2 ω 2 q 2<br />

= sin −1<br />

( √ )<br />

mω2 q − ψ0 = ψ<br />

2I<br />

√<br />

2E<br />

q = sin(ψ − ψ0)<br />

mω2 In the last equ<strong>at</strong>ion ψ0 appears as an integr<strong>at</strong>ion constant. Evidentally, ψ<br />

is the angle variable conjug<strong>at</strong>e to I.<br />

In summary, to use action-angle variables for problems with one degree<br />

of freedom:<br />

1. Find p as a function of E = α and q.<br />

2. Calcul<strong>at</strong>e I(E) using (2.46).<br />

3. Solve the Hamilton-Jacobi equ<strong>at</strong>ion to find W = W (q, I).<br />

4. Find ψ = ψ(q, I) using (2.49).<br />

5. Invert this equ<strong>at</strong>ion to get q = q(I, ψ).<br />

6. Use (2.47) to get ω(I).<br />

7. Calcul<strong>at</strong>e p = p(I, q) from (2.49).<br />

One <strong>at</strong>tractive fe<strong>at</strong>ure of this scheme is th<strong>at</strong> you can find the frequency<br />

without using the characteristic function and without finding the equ<strong>at</strong>ions<br />

of motion. The phase space plot is particularly important. Use polar coordin<strong>at</strong>es<br />

(wh<strong>at</strong> else) for (I, ψ). Every trajectory, wh<strong>at</strong>ever the system, is a<br />

circle!<br />

Our deriv<strong>at</strong>ion was based on the following assumptions: (1) The system<br />

had one degree of freedom. (2) Energy was conserved and the Hamiltonian<br />

had no explicit time dependence. (3) The motion was periodic. Every such<br />

system is <strong>at</strong> heart, a harmonic oscill<strong>at</strong>or. Phase space trajectories are circles.


2.5. ACTION-ANGLE VARIABLES 31<br />

The frequency can be found with a few deft moves. From a philosophical<br />

point of view, (and we will be getting deeper and deeper into philosophy as<br />

these lectures proceed) problems in this c<strong>at</strong>egory are “as good as solved,”<br />

nothing more needs to be said about them. The same is definitely not true<br />

true with more than one degree of freedom. I will take a paragraph to<br />

generalize before going on to some more abstract developments.<br />

We must assume th<strong>at</strong> the system is separable, so<br />

W (q1, . . . , qn, α1, . . . , αn) = ∑<br />

Wk(qk, α1, . . . , αn) (2.51)<br />

k<br />

pk = ∂<br />

Wk(qk, α1, . . . , αn) (2.52)<br />

∂qk<br />

Ik = 1<br />

<br />

pk(qk, α1, . . . , αn) (2.53)<br />

2π<br />

Next find all the q’s as function of the I’s and substitute into W .<br />

Finally<br />

ψk = ∂W<br />

∂Ik<br />

W = W (q1, . . . , qn; I1, . . . , In)<br />

˙<br />

Ik = 0<br />

˙ ψk = ∂H<br />

∂Ik<br />

= ωk<br />

(2.54)


32 CHAPTER 2. CANONICAL TRANSFORMATIONS


Chapter 3<br />

Abstract Transform<strong>at</strong>ion<br />

Theory<br />

So, one-dimensional problems are simple. Given the restrictions listed in<br />

the previous section, their phase space trajectories are circles. How does<br />

this generalize to problems with two or more degrees of freedom? A brief<br />

answer is th<strong>at</strong>, given a number of conditions th<strong>at</strong> we must discuss carefully,<br />

the phase space trajectories of a system with n degrees of freedom, move<br />

on the surface of an n-dimensional torus imbedded in 2n dimensional space.<br />

The final answer is a donut! In order to prove this remarkable assertion and<br />

understand the conditions th<strong>at</strong> must be s<strong>at</strong>isfied, we must slog through a<br />

lot of technical m<strong>at</strong>erial about transform<strong>at</strong>ions in general.<br />

3.1 Not<strong>at</strong>ion<br />

Our first job is to devise some compact not<strong>at</strong>ion for dealing with higher<br />

dimensional spaces. I will show you the not<strong>at</strong>ion in one dimension. It will<br />

then be easy to generalize. Recall Hamilton’s equ<strong>at</strong>ions of motion.<br />

˙p = − ∂H<br />

∂q<br />

We will turn this into a vector equ<strong>at</strong>ion.<br />

( )<br />

q<br />

η =<br />

p<br />

(<br />

0<br />

J =<br />

−1<br />

1<br />

0<br />

The equ<strong>at</strong>ions of motion in vector form are<br />

˙q = ∂H<br />

∂p<br />

)<br />

∇ =<br />

( ∂<br />

∂q<br />

∂<br />

∂p<br />

)<br />

(3.1)<br />

˙η = J · ∇H (3.2)<br />

33


34 CHAPTER 3. ABSTRACT TRANSFORMATION THEORY<br />

J is not a vector of course. Sometimes an array used in this way is called a<br />

dyadic. At any r<strong>at</strong>e this is just shorthand for m<strong>at</strong>rix multiplic<strong>at</strong>ion, i.e.<br />

( ) ( )<br />

˙q 0 1<br />

=<br />

˙p −1 0<br />

( )<br />

∂H<br />

∂q<br />

∂H<br />

∂p<br />

The structure of J is important. Notice th<strong>at</strong> it does two things: it exchanges<br />

p and q and it changes one sign. This is called a symplectic transform<strong>at</strong>ion.<br />

I want to explore the connection between canonical transform<strong>at</strong>ions and<br />

symlpectic transform<strong>at</strong>ions.<br />

I’ll start with the generic canonical transform<strong>at</strong>ion, (q, p) → (Q, P ). How<br />

do the velocities transform? Define<br />

)<br />

Using the not<strong>at</strong>ion<br />

this can be written<br />

( ˙Q<br />

˙<br />

P<br />

M =<br />

)<br />

=<br />

(<br />

∂Q ∂Q<br />

∂q<br />

∂P<br />

∂p<br />

∂P<br />

∂q ∂p<br />

(<br />

∂Q ∂Q<br />

∂q<br />

∂P<br />

∂p<br />

∂P<br />

∂q ∂p<br />

ζ =<br />

( Q<br />

P<br />

)<br />

) ( ˙q<br />

˙p<br />

)<br />

(3.3)<br />

(3.4)<br />

(3.5)<br />

˙ζ = M · ˙η = M · J · ∇H (3.6)<br />

The gradient oper<strong>at</strong>or differenti<strong>at</strong>es H with respect to q and p. These deriv<strong>at</strong>ives<br />

transform e.g.<br />

∂H ∂H ∂Q ∂H ∂P<br />

= +<br />

∂q ∂Q ∂q ∂P ∂q<br />

consequently<br />

The T stands for transpose, of course:<br />

but<br />

Combining (3.8) and (3.9):<br />

∇ (q,p) = M T · ∇ (Q,P )H (3.7)<br />

˙ζ = M · J · M T · ∇ (Q,P )H (3.8)<br />

˙ζ = J · ∇ (Q,P )H (3.9)<br />

J = M · J · M T<br />

(3.10)


3.1. NOTATION 35<br />

Those of you who have studied special rel<strong>at</strong>ivity should find (??) congenial.<br />

Remember the definition of a Lorentz transform<strong>at</strong>ion: any 4×4 m<strong>at</strong>rix<br />

Λ th<strong>at</strong> s<strong>at</strong>isfies<br />

g = Λ · g · Λ T<br />

(3.11)<br />

is a Lorentz transform<strong>at</strong>ion. 1 The m<strong>at</strong>rix<br />

⎛<br />

1 0 0<br />

⎞<br />

0<br />

⎜<br />

g = ⎜ 0<br />

⎝ 0<br />

−1<br />

0<br />

0<br />

−1<br />

0 ⎟<br />

0 ⎠<br />

0 0 0 −1<br />

(3.12)<br />

is called the metric or metric tensor. Forgive me for exagger<strong>at</strong>ing slightly:<br />

everything there is to know about special rel<strong>at</strong>ivity flows out of (3.11). We<br />

say th<strong>at</strong> Lorentz transform<strong>at</strong>ions “preserve the metric,” i.e. leave the metric<br />

invariant. The geometry of space and time is encapsul<strong>at</strong>ed in (12). By the<br />

same token, canonical transform<strong>at</strong>ions preserve the metric J. The geometry<br />

of phase space is encapsul<strong>at</strong>ed in the definition of J. Since J is symplectic,<br />

canonical transform<strong>at</strong>ions are symplectic transform<strong>at</strong>ion, they preserve the<br />

symplectic metric.<br />

Equ<strong>at</strong>ion (4.10) is the starting point for the modern approach to mechanics<br />

th<strong>at</strong> uses the tools of Lie group theory. I will only mention in passing<br />

some points of contact with group theory. Both Goldstein’s and Schenk’s<br />

texts have much more on the subject.<br />

3.1.1 Poisson Brackets<br />

Equ<strong>at</strong>ion (3.10) is really shorthand for four equ<strong>at</strong>ions, e.g.<br />

∂Q ∂P<br />

∂q ∂p<br />

∂P ∂Q<br />

− = 1 (3.13)<br />

∂q ∂p<br />

This combin<strong>at</strong>ion of deriv<strong>at</strong>ives is called a Poisson bracket. The usual not<strong>at</strong>ion<br />

is<br />

∂X ∂Y ∂X ∂Y<br />

− ≡ [X, Y ]q,p<br />

∂q ∂p ∂q ∂p<br />

(3.14)<br />

The quantity on the left is called a Poisson bracket. Then (3.13) becomes<br />

[Q, P ]q,p = 1 (3.15)<br />

1 It is not a good idea to use m<strong>at</strong>rix not<strong>at</strong>ion in rel<strong>at</strong>ivity because of the ambiguity<br />

inherent in covariant and contravariant indices. Normally one would write (11) using<br />

tensor not<strong>at</strong>ion.


36 CHAPTER 3. ABSTRACT TRANSFORMATION THEORY<br />

This together with the trivially true<br />

[q, p]q,p = 1 (3.16)<br />

are called the fundamental Poisson brackets. We conclude th<strong>at</strong> canonical<br />

transform<strong>at</strong>ions leave the fundamental Poisson brackets invariant. It turns<br />

out th<strong>at</strong> all Poisson brackets have the same value when evalu<strong>at</strong>ed with respect<br />

to any canonical set of variables. This assertion requires some proof,<br />

however. I will start by generalizing to n dimensions.<br />

⎛<br />

⎜<br />

η = ⎜<br />

⎝<br />

q1<br />

q2<br />

.<br />

qn<br />

p1<br />

p2<br />

.<br />

pn<br />

⎞<br />

⎟<br />

⎠<br />

J =<br />

( 0 ℑn<br />

−ℑn 0<br />

)<br />

⎛<br />

⎜<br />

∇ = ⎜<br />

⎝<br />

∂<br />

∂q1<br />

∂<br />

∂q2<br />

.<br />

∂<br />

∂qn<br />

∂<br />

∂p1<br />

∂<br />

∂p2<br />

∂<br />

∂pn<br />

⎞<br />

⎟<br />

⎠<br />

(3.17)<br />

The symbol ℑn is the anti-diagonal n × n unit m<strong>at</strong>rix. (refe3.14) becomes<br />

[X, Y ]η ≡ ∑<br />

(<br />

∂X ∂Y<br />

∂qk ∂pk<br />

− ∂X<br />

)<br />

∂Y<br />

,<br />

∂pk ∂qk<br />

(3.18)<br />

or in m<strong>at</strong>rix not<strong>at</strong>ion<br />

The following should look familiar:<br />

k<br />

[X, Y ]η = (∇ηX) T · J · ∇ηY. (3.19)<br />

[qi, qk] = [pi, pk] = 0 [qi, pk] = δik<br />

These are, of course, the commut<strong>at</strong>ion rel<strong>at</strong>ions for position and momentum<br />

oper<strong>at</strong>ors in quantum mechanics. The resemblance is not accidental. The<br />

oper<strong>at</strong>or formul<strong>at</strong>ion of quantum mechanics grew out of Poisson bracket<br />

formul<strong>at</strong>ion of classical mechanics. This development is reviewed in all the<br />

standard texts. In m<strong>at</strong>rix not<strong>at</strong>ion<br />

[η, η]η = [ζ, ζ]η = J (3.20)<br />

The Poisson bracket of two vectors is itself a n × n m<strong>at</strong>rix. i.e.<br />

[X, Y ]ij ≡ [Xi, Yj] (3.21)


3.1. NOTATION 37<br />

The proof of the above assertion is straightforward.<br />

∇ηY = M T · ∇ζY<br />

(∇ηX) T = (M T · ∇ζX) T = (∇ζX) T · M<br />

[X, Y ]η = (∇ζX) T · M · J · M T · ∇ζY<br />

= (∇ζX) T · J · ∇ζY = [X, Y ]ζ<br />

The last step makes use of (3.10). The invariance of the Poisson brackets is<br />

a non-trivial consequence of the symplectic n<strong>at</strong>ure of canonical transform<strong>at</strong>ions.<br />

From now on will will not bother with the subscripts on the Poisson<br />

brackets.<br />

Here is another similarity with quantum mechanics. Let f be any func-<br />

tion of canonical variables.<br />

f ˙ = ∑<br />

(<br />

∂f<br />

∂qk k<br />

= ∑<br />

(<br />

∂f ∂H<br />

∂qk ∂pk<br />

k<br />

df<br />

dt<br />

˙qk + ∂f<br />

)<br />

˙p +<br />

∂pk<br />

∂f<br />

∂t<br />

− ∂f<br />

∂pk<br />

∂H<br />

∂qk<br />

= [f, H] + ∂f<br />

∂t<br />

)<br />

+ ∂f<br />

∂t<br />

(3.22)<br />

This looks like Heisenberg’s equ<strong>at</strong>ion of motion. For our purposes it means<br />

th<strong>at</strong> if f doesn’t depend on time explicitly and if [f, H] = 0, then f is a<br />

constant of the motion. We can use (3.22) to test if our favorite function is<br />

in fact constant, and we can also use it to construct new constants as the<br />

following argument shows.<br />

Let f, g, and h be arbitrary functions of canonical variables. The following<br />

Jacobi identity is just a m<strong>at</strong>ter of algebra.<br />

[f, [g, h]] + [g, [h, f]] + [h, [f, g]] = 0 (3.23)<br />

Now suppose h = H, the Hamiltonian, and f and g are constants of the<br />

motion. Then<br />

[H, [f, g]] = 0<br />

Consequence: If f and g are constants of the motion, then so is [f, g].<br />

This should make us uneasy. Take any two constants. Well, maybe they<br />

commute, but if not, then we have three constants. Commute the new<br />

constant with f and g and get two more constants, etc. How many constants<br />

are we entitled to – anyway? This is a deep question, which has something<br />

to do with the notion of involution. I’ll get to th<strong>at</strong> l<strong>at</strong>er.


38 CHAPTER 3. ABSTRACT TRANSFORMATION THEORY<br />

3.2 Geometry in n Dimensions: The Hairy Ball<br />

Have another look <strong>at</strong> equ<strong>at</strong>ion (3.2). Let’s call ˙η a velocity field. By this<br />

I mean th<strong>at</strong> it associ<strong>at</strong>es a complete set of ˙q’s and ˙p’s with each point in<br />

phase space. In wh<strong>at</strong> direction does ˙η point? This is an easy question in<br />

one dimension; ˙η evalu<strong>at</strong>ed <strong>at</strong> the point P points in the direction tangent<br />

to the trajectory through P . Since trajectories can’t cross in phase space,<br />

there is only one trajectory through P , and the direction is unambiguous.<br />

If we use action-angle variables, the trajectory is a circle, and ˙η is wh<strong>at</strong> we<br />

would call in Ph211, a tangent velocity. The same is true, no doubt, for<br />

n > 1, but how do these circles fit together? How does one visualize this in<br />

higher dimensions?<br />

The answer, as I have mentioned before, is th<strong>at</strong> the trajectories all lie<br />

on the surface of an n dimensional torus imbedded in 2n dimensional space.<br />

Your ordinary breakfast donut is a two dimensional torus imbedded in three<br />

dimensional space. 2 This is easy to visualize, so let’s limit the discussion<br />

to two degrees of freedom for the time being. The step from one degree of<br />

freedom to two involves some profound new ideas. The step from two to<br />

higher dimension is mostly a m<strong>at</strong>ter of m<strong>at</strong>hem<strong>at</strong>ical generaliz<strong>at</strong>ion.<br />

Since we are dealing with conserv<strong>at</strong>ive systems, the trajectories are limited<br />

by the conserv<strong>at</strong>ion of energy, i.e. H(q1, q2; p1, p2) = E is an equ<strong>at</strong>ion<br />

of constraint. The trajectories move on a manifold with three independent<br />

variables. Now the gradient of a function has a well defined geometrical<br />

significance: <strong>at</strong> the point P , ∇f points in a direction perpendicular to the<br />

surface or contour of constant f through P . In this case ∇H is a four component<br />

vector perpendicular to the surface of constant energy. Unfortun<strong>at</strong>ely,<br />

˙η points in the direction of J · ∇H. Wh<strong>at</strong> direction is th<strong>at</strong>? Well,<br />

(∇H) T · J · ∇H = [H, H] = 0<br />

Consequently, J · ∇H points in a direction perpendicular to ∇H, which is<br />

perpendicular to the plane of constant H, i.e. ˙η lies somewhere on the three<br />

dimensional surface of constant H.<br />

We could have guessed th<strong>at</strong> ahead of time, of course, but we can take the<br />

argument further. H is probably not the only constant of motion. Suppose<br />

there are others; call them F , G, etc. For each of these constants we can<br />

2 The word “dimension” gets used in two different ways. When we talk about physical<br />

systems, Lagrangians, Hamiltonians, etc., the dimension is equal to the number of degrees<br />

of freedom. Here I am using dimension to mean the number of independent variables<br />

required to describe the system.


3.2. GEOMETRY IN N DIMENSIONS: THE HAIRY BALL 39<br />

construct a vector field using (2).<br />

˙ηF = J · ∇F<br />

˙ηG = J · ∇G<br />

· · · etc. · · ·<br />

How many such fields can we construct th<strong>at</strong> are independent of one another?<br />

To put it another way, how many independent constants of motion are there?<br />

Th<strong>at</strong>’s a good question – wh<strong>at</strong> do you mean by “independent”? The answer<br />

comes from differential geometry. I’m afraid I can only give a hand-waving<br />

introduction to it. There are two rel<strong>at</strong>ed requirements:<br />

1. Suppose F and G are independent constants of motion. Take any<br />

trajectory from the manifold of constant F and another from constant<br />

G. There is no continuous canonical transform<strong>at</strong>ion th<strong>at</strong> maps the one<br />

trajectory into another.<br />

2. For each point P in space there must be one unique trajectory th<strong>at</strong> lies<br />

in the plane of constant F and simultaneously in the plane of constant<br />

G.<br />

Think about this last requirement in the case where there are two degrees<br />

of freedom and two independent constants of motion. The trajectories must<br />

lie on a two- dimensional surface. If we use action-angle variables, the<br />

trajectories are circles. This sounds like a globe of the earth. Trajectories<br />

with constant ϕ are called longitudes, lines of constant θ are l<strong>at</strong>itudes. But<br />

wait! We have a serious problem <strong>at</strong> the poles. The north and south poles<br />

have all possible longitudes. Requirement 2 is viol<strong>at</strong>ed. Could you rearrange<br />

the lines so th<strong>at</strong> this problem doesn’t occur? It turns out th<strong>at</strong> this is not<br />

possible. This deep result is known in m<strong>at</strong>hem<strong>at</strong>ical circles a the Poincare-<br />

Hopf theorem. In the sort of less exalted company we keep, it’s the Hairy<br />

Ball Theorem. The idea is this: try to comb the hair on a hairy ball so th<strong>at</strong><br />

there is no bald spot. It can’t be done. So long as you really use a comb, i.e.<br />

so long as the trajectories don’t cross, you will always be left with one hair<br />

standing straight up! This is not a proof, of course, but it is a vivid way<br />

of visualizing the content of the theorem. It is easy to see, however, th<strong>at</strong><br />

wh<strong>at</strong> is impossible on a sphere is trivially easy on the surface of a donut. It<br />

can be done in an infinite variety of ways. The simplest is to choose your<br />

“longitudes” so they go around the donut the long way. L<strong>at</strong>itudes go around<br />

the short way. This also s<strong>at</strong>isfies requirement 1. You can’t deform a l<strong>at</strong>itude<br />

into a longitude without cutting through the donut.


40 CHAPTER 3. ABSTRACT TRANSFORMATION THEORY<br />

OK. Suppose you have two constants of motion F and G. How can you<br />

tell if they are independent? The answer is surprisingly simple, [F, G] = 0<br />

Proof: Take a point P on the surface of the donut. We should be able<br />

to set up a local coordin<strong>at</strong>e system with its origin <strong>at</strong> P to describe the<br />

trajectories on the surface. We need two unit vectors, ˆ ξF and ˆ ξG, such th<strong>at</strong><br />

every trajectory in the ˆ ξF - ˆ ξG plane has constant F and G. Choose<br />

ˆξF = ϵJ · ∇F<br />

This is guaranteed to lie in the surface of constant F ; however, G should<br />

remain constant along ˆ ξF . This means th<strong>at</strong><br />

0 = ( ˆ ξF ) T · ∇G = ϵ(J · ∇F ) T · ∇G<br />

= −ϵ(∇F ) T · J · ∇G = −ϵ[F, G]<br />

This proves the assertion. It’s worth reflecting on the fact th<strong>at</strong> this construction<br />

would be impossible on the surface of a sphere. The sphere, unlike<br />

the donut, has only one independent constant, its radius. This theorem also<br />

relieves our anxiety about extra constants. If F and G are independent, we<br />

don’t get a “free” constant K = [F, G], because K = 0.<br />

Summary and generaliz<strong>at</strong>ion:<br />

1. A system with n degrees of freedom has <strong>at</strong> most n independent constants<br />

of motion. Otherwise we could use the additional constants to<br />

elimin<strong>at</strong>e one or more of these degrees. For example, we could use the<br />

Hamilton-Jacobi procedure to make all the momenta constant. The<br />

Hamiltonian would then only be a function of the n coordin<strong>at</strong>es, but<br />

these would not be independent because of the additional constraints.<br />

2. Let’s say there are k constants, Fi, i = 1, . . . , k. If they are independent<br />

we must have [Fi, Fj] = 0.<br />

3. In the best case there are exactly n independent constants. Such<br />

constants are said to be in involution. Such a system is said to be<br />

integrable.<br />

4. All trajectories of integrable systems are confined to the surfaces of<br />

n-dimensional tori imbedded in 2n-dimension space.<br />

5. If k < n there are no general st<strong>at</strong>ements we can make about the<br />

behavior of the trajectories. We will be very much concerned in the<br />

next chapter with systems th<strong>at</strong> are “almost” integrable.


3.2. GEOMETRY IN N DIMENSIONS: THE HAIRY BALL 41<br />

6. There are no general criteria known for deciding whether or not a<br />

system is integrable; however, if the Hamiltonian is separable, the<br />

system is integrable.<br />

3.2.1 Example: Uncoupled Oscill<strong>at</strong>ors<br />

The Hamiltonian for two uncoupled harmonic oscill<strong>at</strong>ors (with m = 1) is<br />

H = 1<br />

2 (p2 1 + p 2 2 + ω 2 1q 2 1 + ω2q 2 2)<br />

This is an important problem because every linear oscill<strong>at</strong>ing system can<br />

be put in this form by a suitable choice of coordin<strong>at</strong>es. 3 There are two<br />

constants of motion<br />

E1 = 1<br />

2 (p2 1 + ω 2 1q 2 1) E2 = 1<br />

2 (p2 2 + ω 2 2q 2 2)<br />

In terms of action-angle variables, the constants are I1 and I2.<br />

H = I1ω1 + I2ω2 = E1 + E2 = E<br />

Every integrable system can be put in this form, although in general the<br />

ω’s will be functions of the I’s. Here they are just parameters from the<br />

Hamiltonian.<br />

This is a simple problem, but the phase space is four dimensional. Let’s<br />

think about all possible ways we might visualize it. In the q1 - p1 or (q2 -<br />

p2) plane the trajectories are ellipses with<br />

qk(max) = √ 2Ek/ωk<br />

pk(max) = √ 2Ek,<br />

where k = 1, 2. The area enclosed by each ellipse is significant, because<br />

∫ <br />

area = dq dp = p dq = 2πI (3.24)<br />

s<br />

The first integral is a surface integral over the area of the ellipse. The second<br />

is a line integral around the ellipse. This identity is a variant of Stokes’s<br />

theorem. It’s useful to rescale the variables so th<strong>at</strong> they both have the same<br />

units and the trajectory is a circle. An n<strong>at</strong>ural choice would be<br />

q ′ k<br />

√<br />

= qk ωk = √ 2Ik sin ψk<br />

p ′ k = pk/ √ ωk = √ 2Ik cos ψk<br />

3 This comes under the heading of “theory of small oscill<strong>at</strong>ions.” Most mechanics texts<br />

devote a chapter to it.


42 CHAPTER 3. ABSTRACT TRANSFORMATION THEORY<br />

The trajectories are now circles with radius √ 2Ik. The area enclosed is 2πIk,<br />

as required by (3.24).<br />

The motion in the q1 - q2 plane is more complic<strong>at</strong>ed. It depends on<br />

the r<strong>at</strong>io ω1/ω2 called the winding number. If this is a r<strong>at</strong>ional number,<br />

say N1/N2 then after N1 cycles of q1 and N2 cycles of q2, the trajectory<br />

will come back to its starting point. This is called a Lissajou figure. If the<br />

winding number is irr<strong>at</strong>ional, the trajectory will be confined to a limited<br />

area but will never return to its starting point. It will eventually “color<br />

in” all available space. In the next chapter we will be concerned with systems<br />

th<strong>at</strong> are “almost” integrable. For such systems the winding number is<br />

all-important. Systems with irr<strong>at</strong>ional winding numbers tend to be stable<br />

under perturb<strong>at</strong>ion. Those with r<strong>at</strong>ional winding numbers disintegr<strong>at</strong>e <strong>at</strong><br />

the slightest push!<br />

The centerpiece of this chapter is the torus. The trajectories spiral<br />

around the donut. If the winding number is r<strong>at</strong>ional they “wear a p<strong>at</strong>h”<br />

around the donut. If it’s irr<strong>at</strong>ional they cover the donut evenly. A useful<br />

way of visualizing this was invented by Poincaré. Imagine a fl<strong>at</strong> plane cutting<br />

through the donut in such a way th<strong>at</strong> every point on the plane has the<br />

angle ψ1 = 0. Place a dot on the plane <strong>at</strong> he point where each trajectory<br />

passes through it. If the winding number is a r<strong>at</strong>ional fraction, there will<br />

be a finite number of points. Each time a trajectory passes through ψ1 = 0<br />

it will pass through one of the dots. If the winding number is irr<strong>at</strong>ional the<br />

crossings will mark out a continuous circle. The Poincaré section as it is<br />

called (some books call it the surface of section) is a useful diagnostic tool.<br />

Suppose you have a system of equ<strong>at</strong>ions th<strong>at</strong> are not integrable (so far as<br />

you know) but is amenable to computer calcul<strong>at</strong>ion. Take various Poincaré<br />

sections. If they are circles then the system is <strong>at</strong> least approxim<strong>at</strong>ely integrable<br />

and can be described with action-angle variables. As we will see,<br />

there are often regions of phase space, “islands” as it were, where motion is<br />

simply periodic and other regions th<strong>at</strong> are wildly chaotic.<br />

Pictures of this motion appear in all the standard texts. I have yet to<br />

see a clear explan<strong>at</strong>ion of the coordin<strong>at</strong>es involved, however. Wh<strong>at</strong> does it<br />

mean really to say th<strong>at</strong> the donut is a 2-d surface in a 4-d space? Your<br />

breakfast donut, after all, is imbedded in 3-d space. If we take a Poincaré<br />

section through the donut <strong>at</strong> the plane ψ2 = 0 and plot q ′ 1 versus p′ 1<br />

, we<br />

will get either a circle of dots or a continuous circle with a radius equal to<br />

√ 2I1, or we can take a slice through ψ1 = 0 and get a circle with radius<br />

√ 2I2. Put it this way, any point on the torus has four (polar) coordin<strong>at</strong>es,<br />

( √ 2I1, ψ1, √ 2I2, ψ2), but in 3-d space, only three of them are independent.<br />

When the torus is in 4-d space, all four of them are independent. If we really


3.2. GEOMETRY IN N DIMENSIONS: THE HAIRY BALL 43<br />

lived in 4-d space, we would label the axes of the donut plot (q ′ 1 , p′ 1 , q′ 2 , p′ 2 ).<br />

This is impossible for us to imagine. The donut is easy; just remember th<strong>at</strong><br />

there is no equ<strong>at</strong>ion of constraint among the four variables. 4<br />

3.2.2 Example: A Particle in a Box<br />

Consider a particle in a two-dimensional box with elastic walls.<br />

0 ≤ x ≤ a 0 ≤ y ≤ b<br />

H = 1<br />

2m (p2x + p 2 y) = π2<br />

(<br />

I2 1<br />

2m a2 + I2 2<br />

b2 )<br />

I1 = 1<br />

<br />

px dx =<br />

2π<br />

a<br />

π |px| I2 = b<br />

π |py|<br />

ω1 = ∂H<br />

∂I1<br />

= π2<br />

I1 ω2<br />

ma2 = π2<br />

I2<br />

mb2 There are several interesting points about this apparently trivial problem.<br />

The Hamiltonian looks linear, but in fact it contains an invisible nonlinear<br />

potential th<strong>at</strong> reverses the particle’s momentum when it hits the wall. One<br />

symptom of this is th<strong>at</strong> the frequencies depend on I. This looks odd, but<br />

it’s just the action-angle way of saying th<strong>at</strong> the particle makes a round trip<br />

(in the x direction) in a time T = 2am/px. The loop integral in this context<br />

is an integral over one “round trip” of the particle.<br />

∫ a ∫ 0<br />

pxdx = |px| dx + (−|px|) dx = 2a|px|<br />

0<br />

My real point in showing this example is to call your <strong>at</strong>tention to the<br />

angle variable. I will work through the calcul<strong>at</strong>ion for the x variable. This<br />

same thing holds for y of course.<br />

1<br />

2m<br />

a<br />

( ) 2<br />

dWx<br />

= E1<br />

dx<br />

∫<br />

Wx = (±) √ ∫<br />

2mE1 dx =<br />

ψ1 = ∂Wx<br />

∂I1<br />

= ± π<br />

∫<br />

a<br />

(±) πI1<br />

a dx<br />

dx = ± π<br />

a x + ψ10 = ψ1<br />

4 Of course, I1 and I2 are constant for any given set of initial conditions. It is this sense<br />

in which the torus is a 2-d surface.


44 CHAPTER 3. ABSTRACT TRANSFORMATION THEORY<br />

The term ψ10 is an integr<strong>at</strong>ion constant. There is no reason why it must be<br />

the same for both legs of the journey. We are free to choose it as follows:<br />

0 → x → a: ψ1 = πx/a<br />

0 ← x ← a: ψ1 = 2π − πx/a<br />

While the particle is bouncing violently between the walls, the angle variables<br />

are increasing smoothly with time, ψ1 = ω1t and ψ2 = ω2t. Even this<br />

strange problem is equivalent to a donut! 5<br />

5 When correctly viewed, everything is a harmonic oscill<strong>at</strong>or – in this case two harmonic<br />

oscill<strong>at</strong>ors.


Chapter 4<br />

Canonical Perturb<strong>at</strong>ion<br />

Theory<br />

So far we have assumed th<strong>at</strong> our systems had exact analytic solutions. One<br />

way of st<strong>at</strong>ing this is th<strong>at</strong> we can find a canonical transform<strong>at</strong>ion to action<br />

angle variables such th<strong>at</strong> the new Hamiltonian is a function of the action<br />

variables only, H = H(I). Such problems are the exception r<strong>at</strong>her than the<br />

rule. For our purposes they are also uninteresting. All periodic integrable<br />

systems are equivalent to a set of uncoupled harmonic oscill<strong>at</strong>ors. Once you<br />

get over the thrill of this discovery, the oscill<strong>at</strong>ors are boring! The existence<br />

of chaos depends on the system not being equivalent to a set of oscill<strong>at</strong>ors.<br />

In order to deal with systems th<strong>at</strong> are non-trivial in this sense, we need some<br />

way of doing perturb<strong>at</strong>ion theory. 1<br />

4.1 One-Dimensional Systems<br />

I will present the theory first for systems with one degree of freedom. This<br />

will simplify the not<strong>at</strong>ion, however the interesting complic<strong>at</strong>ions only appear<br />

in higher dimensions. Here is the basic situ<strong>at</strong>ion: A bounded conserv<strong>at</strong>ive<br />

system with one degree of freedom is described by a constant Hamiltonian<br />

H(q, p) = E. We need to obtain the equ<strong>at</strong>ions of motion in the form q = q(t)<br />

1 I will follow the tre<strong>at</strong>ment in Chaos and Integrability in <strong>Nonlinear</strong> Dynamics, Michael<br />

Tabor, Wiley-Interscience, 1989. Another good reference is Classical <strong>Mechanics</strong> by R. A.<br />

Metzner and L.C. Shepley, Prentice Hall, 1991. The subject is also discussed in Classical<br />

<strong>Mechanics</strong>, Goldstein, Poole and Safko, third edition, Addison-Wesley, 2002. Goldstein<br />

discusses time-dependent and time-independent perturb<strong>at</strong>ion theory. We are doing the<br />

time-independent variety.<br />

45


46 CHAPTER 4. CANONICAL PERTURBATION THEORY<br />

and p = p(t), but this is impossible due to the non-linear n<strong>at</strong>ure of the<br />

problem. We are able to split up the Hamiltonian<br />

H = H0 + ϵH1<br />

in such a way th<strong>at</strong> H0 is amenable to exact solution, and H1 is in some<br />

sense small. We indic<strong>at</strong>e the smallness by multiplying it by ϵ. This is a<br />

bookkeeping device; it will be set to one after the approxim<strong>at</strong>ions have been<br />

derived.<br />

The first step is to find the canonical transform<strong>at</strong>ion th<strong>at</strong> makes H0<br />

cyclic, i.e. q = q(I, ψ), p = p(I, ψ), and H0 = H0(I) where I and ˙ ψ = ω0<br />

are both constant. Unfortun<strong>at</strong>ely, this transform<strong>at</strong>ion does not render the<br />

complete Hamiltonian cyclic, so we write<br />

H(I, ψ) = H0(I) + ϵH1(I, ψ). (4.1)<br />

H is still constant, and consequently I now depends on ψ. H0 is not an<br />

explicit function of ψ, but it does depend on ψ implicitly through I.<br />

Despite this inconvenience, I and ψ are still a perfectly good set of<br />

canonical variables, so th<strong>at</strong> the equ<strong>at</strong>ions of motion<br />

I ˙ = − ∂<br />

H(I, ψ) ˙<br />

∂ψ<br />

∂<br />

ψ = H(I, ψ) (4.2)<br />

∂I<br />

are valid without approxim<strong>at</strong>ion, even though we are unable to solve them in<br />

this form. The so-called time-dependent perturb<strong>at</strong>ion proceeds from here by<br />

expanding the solutions of (4.2) as power series in ϵ. Our approach is to find<br />

a second canonical transform<strong>at</strong>ion, i.e. (q, p) → (I, ψ) → (J, φ) such th<strong>at</strong><br />

H(I, ψ) → K(J). This last step must be done as a series of approxim<strong>at</strong>ions,<br />

of course, otherwise the problem would be exactly solvable.<br />

In order to make the transform<strong>at</strong>ion (I, ψ) → (J, φ) we will use a gener<strong>at</strong>ing<br />

function of the F2 genus, i.e. F = F (ψ, J). We need to expand<br />

F = F0(ψ, J) + ϵF1(ψ, J) + · · · (4.3)<br />

where F0 = Jψ. This is the identity transform<strong>at</strong>ion as can be seen as follows:<br />

I = ∂<br />

∂ψ F0 = J φ = ∂<br />

∂J F0 = ψ<br />

In terms of (??)the transform<strong>at</strong>ion equ<strong>at</strong>ions are<br />

I = ∂F<br />

∂ψ<br />

= J + ϵ∂F1 (ψ, J) + · · · (4.4)<br />

∂ψ


4.1. ONE-DIMENSIONAL SYSTEMS 47<br />

φ = ∂F<br />

∂J<br />

= ψ + ϵ∂F1 (ψ, J) + · · · (4.5)<br />

∂J<br />

Before going on there are some technical points about ψ and J th<strong>at</strong> need<br />

to be discussed. When ϵ = 0, ψ is the exact angle variable for the system.<br />

This means th<strong>at</strong> we can find p and q as functions of ψ such th<strong>at</strong> p and q<br />

return to their original values when ∆ψ = 2π. We can in principle invert<br />

this transform<strong>at</strong>ion to find ψ as a function of p and q.<br />

ψ = ψ(q, p) (4.6)<br />

When p and q run through a complete cycle, ψ advances by 2π. When ϵ ̸= 0<br />

the orbit will be different from the unperturbed case, but the functional<br />

rel<strong>at</strong>ionship doesn’t change, so when p and q run through a complete cycle,<br />

we must still have ∆ψ = 2π. Of course, the exact angle variable will also<br />

advance 2π. In summary<br />

∆ψ = ∆φ = 2π (4.7)<br />

for one complete cycle.<br />

The following integrals are all equal because canonical transform<strong>at</strong>ions<br />

preserve phase space volume.<br />

J = 1<br />

2π<br />

<br />

p dq = 1<br />

2π<br />

Now integr<strong>at</strong>e (4.4) around one orbit:<br />

th<strong>at</strong> is<br />

<br />

1<br />

2π<br />

I dψ = 1<br />

<br />

2π<br />

<br />

J dφ = 1<br />

<br />

2π<br />

J dψ + 1<br />

2π ϵ<br />

<br />

∂F1<br />

∂ψ<br />

J = J + 1<br />

2π ϵ<br />

<br />

∂F1<br />

∂ψ<br />

dψ + · · · ;<br />

I dψ (4.8)<br />

dψ + · · · ;<br />

We have just seen th<strong>at</strong> ∆ψ = 2π around one cycle. Consequently<br />

<br />

∂F1<br />

dψ = 0 (4.9)<br />

∂ψ<br />

implies th<strong>at</strong> the deriv<strong>at</strong>ive of F1 is purely oscill<strong>at</strong>ory with a fundamental<br />

period of 2π in ψ. (The same is true of the higher order terms as well.)<br />

The Hamiltonian is transformed using (??) with the new variables.<br />

K(φ, J) = H(ψ(φ, J), I(φ, J)) + ∂<br />

∂t F2(ψ(φ, J), J, t) (4.10)


48 CHAPTER 4. CANONICAL PERTURBATION THEORY<br />

As explained above, we seek a transform<strong>at</strong>ion th<strong>at</strong> makes φ cyclic so th<strong>at</strong><br />

K = K(J). The appropri<strong>at</strong>e gener<strong>at</strong>ing function does not depend on time,<br />

so (4.10) becomes<br />

K(J) = H(ψ(φ, J), I(φ, J)) (4.11)<br />

The approxim<strong>at</strong>ion procedure consists in expanding the left and right sides<br />

of this equ<strong>at</strong>ion in powers of ϵ and then equ<strong>at</strong>ing terms of zeroth and first<br />

order. This procedure could be carried out to higher order. I’m interested<br />

in first order corrections only.<br />

The so-called Kamiltonian is expanded as follows:<br />

K(J) = K0(J) + ϵK1(J) + · · ·<br />

At first sight, this agenda looks hopeless. We need to know the exact value<br />

of J to make use any of these terms, even the zeroth order approxim<strong>at</strong>ion.<br />

The exquisite point is th<strong>at</strong> we can use (4.8) to calcul<strong>at</strong>e J exactly without<br />

knowing the complete transform<strong>at</strong>ion.<br />

The zeroth order Hamiltonian is expanded with the help of (4.4).<br />

H0(I) = H0( ∂F<br />

∂ψ ) = H0(J + ϵ ∂F1<br />

∂ψ + · · · ) = H0(J) + ϵ ∂F1<br />

∂ψ<br />

∂H0(J)<br />

∂J<br />

<br />

<br />

<br />

ϵ=0<br />

The first order term is already multiplied by ϵ.<br />

Substitute all this into (??) gives<br />

∂H0<br />

∂J<br />

<br />

<br />

<br />

ϵ=0<br />

+ · · ·<br />

= ∂H0(I)<br />

∂I = ω0 (4.12)<br />

ϵH1(ψ, I) = ϵH1(φ, J) + · · ·<br />

K0(J) = H0(J) (4.13)<br />

K1(J) = ∂F1<br />

∂ψ ω0 + H1(φ, J) (4.14)<br />

The not<strong>at</strong>ion H0(J) means th<strong>at</strong> you take your formula for H0(I) and replace<br />

the symbol I with the symbol J without making any change in the functional<br />

form of H0.<br />

Integr<strong>at</strong>e (4.14) around one cycle and use (4.9); (4.14) becomes<br />

K1(J) = H1(J) ≡ 1<br />

∫ 2π<br />

H1(ψ, J) dψ, (4.15)<br />

2π 0


4.1. ONE-DIMENSIONAL SYSTEMS 49<br />

and<br />

∂<br />

∂ψ F1(ψ, J) = 1 [<br />

H1 − H1(ψ, J)<br />

ω0(J)<br />

] ≡ − ˜ H1(ψ, J)<br />

ω0<br />

(4.16)<br />

˜H1 is the periodic part of H1. We are left with a differential equ<strong>at</strong>ion th<strong>at</strong><br />

is easy to integr<strong>at</strong>e.<br />

F1(ψ, J) = − 1<br />

∫<br />

dψ<br />

ω0(J)<br />

˜ H1(ψ, J) (4.17)<br />

4.1.1 Summary<br />

I will summarize all these technical details in the form of an algorithm for<br />

doing first order perturb<strong>at</strong>ion theory. Remember th<strong>at</strong> the object is to find<br />

equ<strong>at</strong>ions of motion in the form q = q(t) and p = p(t). We do this in three<br />

steps: (1) Find q = q(I, ψ) and p = p(I, ψ). (2) Find I = I(J, φ) and<br />

ψ = ψ(J, φ). (3) J and ˙φ are constant, so φ = ˙φt + φ0.<br />

1. Identify the H0 part of the Hamiltonian. Find the transform<strong>at</strong>ion<br />

equ<strong>at</strong>ions q = q(I, ψ) and p = p(I, ψ) using the Hamiltonian-Jacobi<br />

equ<strong>at</strong>ion as described in the previous section. Use (??) to get ω0.<br />

2. Equ<strong>at</strong>ion (4.8) can be used to find J in terms of the total energy<br />

E. The integral presents no difficulties in principle, especially if the<br />

Hamiltonian is separable. In fact, textbooks never bother to do this.<br />

It seems sufficient to display the results in terms of J, the assumption<br />

being th<strong>at</strong> we could find J = J(E) if we really had to.<br />

3. The first order correction to the energy is obtained from the integral in<br />

(4.15). Get the first order correction to the frequency by differenti<strong>at</strong>ing<br />

it with respect to J.<br />

4. The gener<strong>at</strong>ing function F1 is calcul<strong>at</strong>ed from (4.17). It is then substituted<br />

into (4.4) and (4.5). These give implicit equ<strong>at</strong>ions for ψ =<br />

ψ(J, φ) and I = I(J, φ). Unfortun<strong>at</strong>ely, it is usually impossible to<br />

invert them to obtain these formula explicitly.<br />

4.1.2 The simple pendulum<br />

The pendulum makes a nice example<br />

H = l2<br />

+ mgR(1 − cos θ)<br />

2mR2


50 CHAPTER 4. CANONICAL PERTURBATION THEORY<br />

The angular momentum l = mR2θ˙ is canonically conjug<strong>at</strong>e to the angle θ.<br />

H = 1<br />

2mR2 [<br />

l 2 + m 2 R 4 ω 2 0θ 2<br />

(<br />

1 − θ2<br />

)]<br />

+ · · ·<br />

12<br />

The first two terms reduce to the familiar harmonic oscill<strong>at</strong>or. This is the<br />

zeroth order problem.<br />

l 2 =<br />

Make the n<strong>at</strong>ural substitution<br />

H0 = E0 = l2 mgRθ2<br />

+<br />

2mR2 2<br />

( ) 2<br />

dW<br />

= 2mR<br />

dθ<br />

2 E0 − m 2 R 4 ω 2 0θ 2<br />

l =<br />

sin 2 ψ = ml2 ω 2 0<br />

2E0<br />

( dW<br />

dθ<br />

θ 2<br />

(4.18)<br />

)<br />

= √ 2mR2E cos ψ (4.19)<br />

We can look on this as a convenient change of variable, but ψ is also the<br />

angle variable. This can be seen as follows:<br />

I = 1<br />

<br />

2π<br />

l dθ =<br />

√ 2mR 2 E0<br />

2π<br />

[<br />

1 − mR2 ω 2 0<br />

2E0<br />

θ 2<br />

] 1/2<br />

Use (4.19) to get the familiar result, I = E0/ω0. The gener<strong>at</strong>ing function is<br />

obtained from the indefinite integral<br />

∫ ( ) ∫<br />

dW [2mR2 W =<br />

dθ = ω0I − m<br />

dθ<br />

2 R 4 ω 2 0θ 2] 1/2<br />

dθ (4.20)<br />

According to the basic transform<strong>at</strong>ion formula we should have<br />

ψ = ∂W<br />

∂I<br />

One can show by differenti<strong>at</strong>ing (4.20) and using (4.19) to complete the<br />

integr<strong>at</strong>ion, th<strong>at</strong> this is indeed so.<br />

Equ<strong>at</strong>ions (4.18) and (4.19) can be rearranged to give<br />

l = √ 2mR 2 Iω0 cos ψ (4.21)<br />


4.2. MANY DEGREES OF FREEDOM 51<br />

θ =<br />

√ 2I<br />

mR 2 ω0<br />

sin ψ (4.22)<br />

The goal of the action-angle program is to express the original coordin<strong>at</strong>es<br />

and momenta in terms of the action-angle variables. This has now been<br />

completed to zeroth order.<br />

The first order correction is<br />

H1(I, ψ) = − mR2 ω 2 0 θ4<br />

24<br />

= − I2<br />

6mR 2 sin4 ψ.<br />

We are now in a position to recast our Hamiltonian à la (4.1).<br />

(<br />

H(I, ψ) = Iω0 + ϵ − I2<br />

6mR2 sin4 )<br />

ψ<br />

We have also obtained ω0 = √ g/R “for free.” The ϵ is there for bookkeeping<br />

purposes only. We have no further need for it.<br />

K0(J) = H0(J) = Jω0<br />

K1(J) = H1(J) = 1<br />

∫ 2π<br />

2π 0<br />

F1(J, ψ) = − 1<br />

˜H = H1 − H1 =<br />

ω0<br />

∫<br />

J 2<br />

H1 dψ = −<br />

16mR2 J 2<br />

48mR 2 (3 − 8 sin4 ψ)<br />

dψ ˜ J<br />

H1 =<br />

2<br />

192 mR2 (sin 4ψ − 8 sin 2ψ)<br />

ω0<br />

ω = ω0 −<br />

J<br />

32mR 2<br />

4.2 Many Degrees of Freedom<br />

For systems of two or more degrees of freedom, canonical perturb<strong>at</strong>ion theory<br />

is formul<strong>at</strong>ed in exactly the same way as before – but now profound<br />

difficulties arise, even to first order in ϵ. The problem centers around equ<strong>at</strong>ion<br />

(4.16) repe<strong>at</strong>ed here for reference<br />

ω0(J) ∂F1(ψ, J)<br />

∂ψ<br />

= − ˜ H1(ψ, J)<br />

We were able to solve this with a simple integr<strong>at</strong>ion (4.17). This is not<br />

possible for more th<strong>at</strong> one degree of freedom, so we must resort to Fourier


52 CHAPTER 4. CANONICAL PERTURBATION THEORY<br />

series. Before doing this, however, we will need to generalize our not<strong>at</strong>ion.<br />

Let’s use the vectors<br />

J = (J1, · · · , Jn) ω0 = (ω01, · · · , ω0n) ∇ψ = ( ∂<br />

∂ψ1<br />

, · · · , ∂<br />

)<br />

∂ψn<br />

where n is the number of degrees of freedom. In this not<strong>at</strong>ion (4.16) becomes<br />

where<br />

ω0(J) · ∇ψF1(ψ, J) = − ˜ H1(J, ψ) (4.23)<br />

¯H1(J, ψ) =<br />

∫ 2π<br />

0<br />

∫ 2π<br />

dψ1 · · · dψnH1(J, ψ) (4.24)<br />

0<br />

and ˜ H1 = H1 − ¯ H1. Since both sides of (4.16) are periodic, we can solve<br />

them with Fourier series.<br />

˜H1(J, ψ) = ∑<br />

Ak(J)e ik·ψ<br />

(4.25)<br />

where k is a vector of integers<br />

k<br />

F1(J, ψ) = ∑<br />

Bk(J)e ik·ψ<br />

k<br />

k = k1, · · · , kn<br />

(4.26)<br />

It seems as if we could proceed as follows: ˜ H1 is known <strong>at</strong> this point, so we<br />

can find Ak Substitute these definitions into (4.16) we get<br />

Bk = i Ak<br />

ω0 · k<br />

(4.27)<br />

Now here’s the infamous problem. Suppose, for example, there were only<br />

two degrees of freedom. In this case the denomin<strong>at</strong>or of (??) would be<br />

ω0 · k = ω01k1 + ω02k2<br />

(4.28)<br />

You can see th<strong>at</strong> if the winding number ω01/ω02 is a r<strong>at</strong>ional number, then<br />

for some k, Bk will be infinite. It seems th<strong>at</strong> the slightest perturb<strong>at</strong>ion<br />

will blow this system into outer space! Even if the winding number is not<br />

r<strong>at</strong>ional, there will always be values of k th<strong>at</strong> will make ω0 · k arbitrarily<br />

small.<br />

This problem was discovered in the early twentieth century, and all the<br />

effort of the most eminent m<strong>at</strong>hem<strong>at</strong>icians of the day failed to solve it.


4.2. MANY DEGREES OF FREEDOM 53<br />

One opinion held th<strong>at</strong> the slightest perturb<strong>at</strong>ion would cause the system to<br />

become “ergodic,” th<strong>at</strong> is to say, the trajectories would fill up all of phase<br />

space. Numerical calcul<strong>at</strong>ions l<strong>at</strong>er showed th<strong>at</strong> this was often not the case.<br />

Trajectories will often “lock in” to stable p<strong>at</strong>terns. This has been the subject<br />

of much contemporary research. When and why do trajectories lock in, and<br />

wh<strong>at</strong> happens when they do not? The question of wh<strong>at</strong> trajectories remain<br />

stable under small perturb<strong>at</strong>ions is <strong>at</strong> least partly answered by the so-called<br />

KAM (Kolmogorov, Arnold, Moser) theorem. In the general case there is,<br />

if not a complete theory, <strong>at</strong> least a well-developed taxonomy. We will turn<br />

to these m<strong>at</strong>ters in the next chapter.


54 CHAPTER 4. CANONICAL PERTURBATION THEORY


Chapter 5<br />

Introduction to Chaos<br />

The canonical perturb<strong>at</strong>ion theory of the previous chapter is a lot of work,<br />

and in two or more degrees of freedom it summons up the ogre of small<br />

denomin<strong>at</strong>ors. Many people have tried to solve this problem by pounding<br />

their heads on it. This turns out not to be a fruitful approach. I will<br />

illustr<strong>at</strong>e the limit<strong>at</strong>ions of perturb<strong>at</strong>ion theory by considering the van der<br />

Pol oscill<strong>at</strong>or. This is a simple nonlinear, one-dimensional, second-order<br />

differential equ<strong>at</strong>ion closely resembling a damped harmonic oscill<strong>at</strong>or. It<br />

has stable solutions which can easily be found numerically, yet it has no<br />

known analytic solutions, and perturb<strong>at</strong>ion theory, on general principles,<br />

just can’t work! 1 We then go on to discuss linear stability theory. With<br />

these simple techniques you can analyze most nonlinear systems (the van<br />

der Pol oscill<strong>at</strong>or is an exception) and get a qualit<strong>at</strong>ive picture of the phase<br />

space dynamics. In one degree of freedom (two-dimensional phase space)<br />

it will become immedi<strong>at</strong>ely apparent where perturb<strong>at</strong>ion theory is possible<br />

and a qualit<strong>at</strong>ive idea of the motion of the system where it is not.<br />

Higher dimensional spaces are not so easy to analyze, in part because<br />

they are hard to visualize and in part because they are often not integrable.<br />

It is this non-intagrability th<strong>at</strong> leads to chaos. Here we resort to the<br />

Poincareé section and the notion of discrete maps. The Poincaré-Birkoff and<br />

KAM theorems can then tell us something about the onset and structure of<br />

chaos.<br />

1 It should be remembered th<strong>at</strong> all the major developments in elementary particle theory<br />

over the last few decades starting with the standard model in the 1970’s are based on the<br />

notion of spontaneous symmetry breaking. Spontaneous symmetry breaking, almost by<br />

definition, cannot be described with perturb<strong>at</strong>ion theory. When perturb<strong>at</strong>ion theory fails<br />

we always expect new physics. The same is true (to a lesser extent) in classical mechanics<br />

as well.<br />

55


56 CHAPTER 5. INTRODUCTION TO CHAOS<br />

5.1 The total failure of perturb<strong>at</strong>ion theory<br />

To get some feeling for how perturb<strong>at</strong>ion theory might be useless, look <strong>at</strong><br />

the following “toy” example.<br />

¨x = −x + ϵ(x 2 + ˙x 2 − 1) sin( √ 2t) (5.1)<br />

This looks like a harmonic oscill<strong>at</strong>or with a resonant frequency ω = 1 and<br />

a “small” driving term with a frequency ω = √ 2. Obvious solutions are<br />

x(t) = sin t and x(t) = cos t, which hold for all values of ϵ. If we set ϵ = 0<br />

then the solutions more generally are x(t) = x0 sin(t + t0). This solution<br />

plotted on a phase space plot of x(t) versus ˙x(t) will be a circle with radius<br />

r = x0. Wh<strong>at</strong> would you expect for finite ϵ? There presumably are other<br />

solutions, but don’t waste your time looking for them! You should convince<br />

yourself however, th<strong>at</strong> there are no solutions of the form<br />

x(t) = sin t +<br />

∞∑<br />

ϵ n fn(t) (5.2)<br />

Also convince yourself th<strong>at</strong> the trouble comes from the non-linear terms.<br />

The point is because of the non-linearity, it is not possible to start with<br />

unperturbed solutions and get new solutions by adding to them.<br />

A more interesting and oft-studied example is the van der Pol equ<strong>at</strong>ion.<br />

It was first introduced by van der Pol in 1926 in a study of the nonlinear<br />

vacuum tube circuits of early radios.<br />

n=1<br />

¨x + ϵ(x 2 − 1) ˙x + x = 0 (5.3)<br />

Again the ϵ = 0 equ<strong>at</strong>ions are x(t) = x0 sin(t + t0). In phase space this is<br />

a circle of radius x0. If we make ϵ ever so much larger than zero, however,<br />

something remarkable happens as shown in the first of the plots in Fig.<br />

5.1. Yes the orbit eventually becomes a circle, but regardless of the initial<br />

conditions, the radius r ≈ 2. The same sort of behavior is shown in Fig. 5.1<br />

for larger values of ϵ. The shape of the final orbit is determined entirely by ϵ<br />

and is completely unaffected by the initial conditions. A curve of the sort is<br />

called a limit cycle. It’s easy to see in vague way why the limit cycle exists.<br />

The term proportional to ϵ in (5.3) looks like an oscill<strong>at</strong>or damping term, but<br />

its sign depends on whether x 2 is gre<strong>at</strong>er or less than 1. If it is gre<strong>at</strong>er, the<br />

oscill<strong>at</strong>ion is damped; if it is smaller the oscill<strong>at</strong>ion is “undamped.” Indeed,<br />

if ϵ is made neg<strong>at</strong>ive, the orbits either collapse to zero or diverge to infinity<br />

depending on the initial conditions. For obvious reasons the solutions with


5.1. THE TOTAL FAILURE OF PERTURBATION THEORY 57<br />

dx/dt<br />

dx/dt<br />

4<br />

2<br />

0<br />

−2<br />

Epsilon=0.1<br />

−4<br />

−4<br />

4<br />

−2 0<br />

x<br />

Epsilon=1.5<br />

2 4<br />

2<br />

0<br />

−2<br />

−4<br />

−4 −2 0<br />

x<br />

2 4<br />

dx/dt<br />

dx/dt<br />

4<br />

2<br />

0<br />

−2<br />

Epsilon=0.5<br />

−4<br />

−4<br />

10<br />

−2 0<br />

x<br />

Epsilon=3<br />

2 4<br />

5<br />

0<br />

−5<br />

−10<br />

−4 −2 0<br />

x<br />

2 4<br />

Figure 5.1: The van der Pol plot for four values of ϵ and two starting values<br />

(indic<strong>at</strong>ed by asterisks)


58 CHAPTER 5. INTRODUCTION TO CHAOS<br />

positive ϵ are said to be stable and those with neg<strong>at</strong>ive ϵ are said to be<br />

unstable.<br />

This simple model makes an important point. Conventional perturb<strong>at</strong>ion<br />

theory starts with unperturbed, i.e. ϵ = 0 solutions, and then looks for<br />

series solutions in powers of ϵ. This is obviously hopeless here since even<br />

a smidgeon of ϵ is enough to completely alter the n<strong>at</strong>ure of the orbits. It<br />

would be better to start with some simple function th<strong>at</strong> approxim<strong>at</strong>ed the<br />

limit cycle and then expand in powers of some parameter th<strong>at</strong> characterized<br />

the devi<strong>at</strong>ion of the actual orbit from the simple function. Alas, I don’t<br />

know how to do this. The trouble is th<strong>at</strong> the limit cycle is so weird, <strong>at</strong> least<br />

for large ϵ, th<strong>at</strong> it’s hard to come up with a “lowest-order” solution. For<br />

many systems however, this is a practical approach. The trick is to look for<br />

the fixed points.<br />

5.2 Fixed points and lineariz<strong>at</strong>ion<br />

Equations of motion can always be cast in the form

ξ̇ = f(ξ, t).    (5.4)

With n degrees of freedom, ξ and f are 2n-dimensional vectors. For example, Hamilton's equations with one degree of freedom are

\dot q = \frac{\partial H}{\partial p}, \qquad \dot p = -\frac{\partial H}{\partial q}, \qquad \xi = \begin{pmatrix} p \\ q \end{pmatrix}    (5.5)

To keep the notation simple and general (and to save typing) I will keep the notation in the form (5.5) for the time being and not type out the p's and q's. I will also restrict the discussion to autonomous systems, i.e. those in which the Hamiltonian does not depend explicitly on time.²

A fixed point (also called a stationary point, equilibrium point, or critical point) is simply the point ξf where all the time derivatives vanish, f(ξf) = ξ̇f = 0. It's the place where nothing happens.

²The notation in this section is taken from Classical Dynamics by J. V. José and E. J. Saletan.


Detailed information about the motion of a system close to a fixed point can be obtained by linearizing the equations of motion. This is done as follows: First, the origin is moved to the fixed point by writing

ζ(t) = ξ(t; ξ0) − ξf    (5.6)

Second, (5.4) is written for ζ rather than for ξ,

ζ̇ = f(ζ + ξf) ≡ g(ζ)    (5.7)

Third, g is expanded in a Taylor series about ζ = 0,

\dot\zeta^j = \left. \frac{dg^j}{d\zeta^k} \right|_{\xi_f} \zeta^k + O(\zeta^2) \equiv A^j_{\ k}\, \zeta^k + O(\zeta^2)    (5.8)

I am using the Einstein summation convention, in which one sums over repeated indices. Dropping the ζ² terms gives the matrix equation

ż = A · z    (5.9)

A is a constant matrix called (among other things) the stability matrix. It's easy to solve (5.9) using the matrix exponential,

z(t) = e^{At} z0    (5.10)

where

e^{At} \equiv \sum_{n=0}^{\infty} \frac{A^n t^n}{n!}    (5.11)

For our purposes it will be enough to take the case of one degree of freedom, in which case A is a 2 × 2 real, constant matrix. If A is diagonal,

e^{At} = \begin{pmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{pmatrix}    (5.12)

where λ1 and λ2 are the eigenvalues, which might be real or complex. If they are complex they come in complex-conjugate pairs, λ1* = λ2.

Various cases can be identified. If both eigenvalues are real and positive, all trajectories flow away from the fixed point, which is then called unstable. If they are both negative, all trajectories flow toward it and the fixed point is said to be stable. If the eigenvalues have opposite signs, then the trajectories are repelled from one axis and attracted to the other. This is called a hyperbolic fixed point or a saddle point.
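As a concrete illustration of the recipe (my own example, not from the notes), the sketch below builds the stability matrix for a damped pendulum, θ̈ = −sin θ − γθ̇, at its two fixed points (θ, θ̇) = (0, 0) and (π, 0), and classifies each one from the eigenvalues of A as just discussed. The function names are placeholders.

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify the fixed point of zdot = A.z from the eigenvalues of A."""
    lam = np.linalg.eigvals(A)
    re = lam.real
    if np.any(np.abs(lam.imag) > tol):
        if np.all(np.abs(re) < tol):
            return lam, "center (elliptic point)"
        return lam, "spiral, " + ("unstable" if re.max() > 0 else "stable")
    if re.min() > 0:
        return lam, "unstable node"
    if re.max() < 0:
        return lam, "stable node"
    return lam, "hyperbolic (saddle) point"

gamma = 0.2  # damping; set gamma = 0 for the conservative pendulum
for theta_f in (0.0, np.pi):
    # Linearization of thetadot = v, vdot = -sin(theta) - gamma*v about (theta_f, 0)
    A = np.array([[0.0, 1.0],
                  [-np.cos(theta_f), -gamma]])
    lam, kind = classify(A)
    print(f"fixed point theta = {theta_f:.2f}: eigenvalues {np.round(lam, 3)} -> {kind}")
```

With the damping switched off, the lower fixed point becomes a center, exactly the elliptic case described above.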


Figure 5.2: Unstable fixed point for real λ2 > λ1 > 0.

Figure 5.3: Unstable fixed point for real λ1 = λ2 > 0.

Figure 5.4: Hyperbolic fixed point for real λ1 < 0, λ2 > 0.

It is possible that A cannot be diagonalized. In that case it can at least be put in triangular form,

A = \begin{pmatrix} \lambda & 0 \\ \mu & \lambda \end{pmatrix}    (5.13)

and then

z(t) = e^{\lambda t} \begin{pmatrix} 1 & 0 \\ \mu t & 1 \end{pmatrix} z_0    (5.14)

Complex eigenvalues require a bit more discussion. Let λ = α + iβ and z = u + iv, where α and β are real numbers and u and v are real vectors orthogonal to one another. Separating real and imaginary parts,

A · u = αu − βv,   A · v = βu + αv    (5.15)

Evidently A · z* = λ*z*, so z* is an eigenvector with eigenvalue λ*. Eigenvectors belonging to different eigenvalues are independent. We can construct the independent real vectors u and v as follows:

u = (z + z*)/2,   v = (z − z*)/2i    (5.16)


Figure 5.5: Unstable fixed point for a nondiagonalizable A matrix. All of the integral curves are tangent to z2 at the fixed point.

Substituting these definitions into (5.10) gives

e^{At} u = e^{αt}(u cos βt − v sin βt),    (5.17)

e^{At} v = e^{αt}(u sin βt + v cos βt).    (5.18)

There are two important cases: α > 0, in which case the fixed point is unstable and the orbits are spirals, and α = 0, in which case the phase portrait consists of circles. In the latter case the fixed point is called a center or an elliptic point.

It should be remembered that (5.9) is a linearized equation. It holds in some small region around the fixed point, and of course, as is so often the case, the theory gives us no way to tell how small that region might be. The damped oscillator makes a good example of the method, and there are many other examples in the textbooks. On the other hand, the theory fails completely for the van der Pol oscillator of the previous section.

5.3 The Henon oscillator

Figure 5.6: Unstable fixed point for complex λ with ℜ(λ) > 0.

Figure 5.7: Stable fixed point for λ pure imaginary.

Although the theory from the previous section is perfectly general in the sense that it can be applied to systems with any number of degrees of freedom, it is almost impossible to visualize in four or more dimensions, and the number of cases that must be considered increases rapidly. The best tool for visualizing higher-dimensional spaces is the Poincaré section. This was described briefly in Chapter 3, and we will make more use of it shortly. Before doing so it will be useful to have a good example of motion with two degrees of freedom. A fascinating and oft-studied case is the Hénon-Heiles Hamiltonian. The Hamiltonian was originally used to model the motion of stars in the galaxy.³ Written in terms of dimensionless variables the Hamiltonian is

H = \frac{1}{2}\left(\dot x^2 + \dot y^2 + k_1 x^2 + k_2 y^2\right) + \lambda\left(x^2 y - \frac{y^3}{3}\right)    (5.19)

This is the Hamiltonian of two uncoupled harmonic oscillators with a perturbation proportional to λ. The oscillators have frequencies ω1 = √k1 and ω2 = √k2. The phase space is the four-dimensional space spanned by x, ẋ, y, and ẏ. We can think of the unperturbed orbit as lying on two tori. In this case their cross sections are circular, with radii determined by the initial conditions. If the winding number is w = r/s, x will complete r cycles while y completes s. Let us make a Poincaré section through the y torus at x = 0. Each time the orbit passes through from x < 0 to x > 0 we mark a point at y and ẏ on the x = 0 plane. An example is shown in Figure (5.8) for w = 7/2. Because the winding number is rational there are seven discrete dots on the Poincaré section. The case of an irrational winding number is shown in Figure (5.9). The x vs. y plot is completely filled in, and the Poincaré plot is a continuous loop. Continuous loops like this on the Poincaré plot are a sign that the system is circulating around an invariant torus and hence is integrable.
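Here is a minimal sketch of how such a Poincaré section can be generated numerically. It is my own, taking λ = 1 as in the footnote below and assuming k1 = k2 = 1 (my choice for the unperturbed oscillators): integrate Hamilton's equations for (5.19) and record (y, ẏ) every time the orbit crosses x = 0 with ẋ > 0.

```python
import numpy as np
from scipy.integrate import solve_ivp

def henon_heiles(t, s):
    """Hamilton's equations for H = (xdot^2 + ydot^2 + x^2 + y^2)/2 + x^2*y - y^3/3."""
    x, y, vx, vy = s
    ax = -x - 2.0 * x * y
    ay = -y - x**2 + y**2
    return [vx, vy, ax, ay]

def crossing(t, s):
    return s[0]            # x = 0 defines the section
crossing.direction = 1     # only count crossings from x < 0 to x > 0

# Pick an initial condition on the x = 0 plane with a chosen total energy E.
E, y0, vy0 = 0.125, 0.1, 0.0
vx0 = np.sqrt(2 * E - vy0**2 - y0**2 + 2 * y0**3 / 3)
sol = solve_ivp(henon_heiles, (0, 2000), [0.0, y0, vx0, vy0],
                events=crossing, rtol=1e-9, atol=1e-9)

section = sol.y_events[0][:, [1, 3]]   # the (y, ydot) pairs on the x = 0 plane
print(f"{len(section)} section points; first few:\n{np.round(section[:5], 4)}")
```

Scatter-plotting the array "section" for increasing energies reproduces the progression shown in Figures 5.10 through 5.13.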

When we turn on the perturbation by making λ ≠ 0, something remarkable happens.⁴ Figures (5.10) through (5.13) show a progression from the orderly motion of the uncoupled oscillators in Figure (5.9), through the loop in Figure (5.10) suggesting motion around a single distorted torus. As the interaction strength is increased, this one torus breaks up into five separate tori. In the next plot, Figure (5.12), the points are beginning to disperse in a random way with some structure remaining. Because of the poor resolution of the plots one cannot see the fine details that remain. Finally, in the last plot the points are arranged in a completely random pattern. This is paradigmatic: as the strength of the perturbation increases, orderly motion disintegrates into chaos. One of the goals of chaos theory is to explain and predict this phenomenon. This will require some new formalism.

³See Goldstein's Classical Mechanics for a review of the physics.
⁴I am following standard practice by varying the perturbation strength by changing the total energy, with λ = 1.


Figure 5.8: Harmonic oscillator coordinates (x vs. y, and the Poincaré section through x = 0) for w = 7/2.

Figure 5.9: Harmonic oscillator coordinates for an irrational winding number (Poincaré section through x = 0).

Figure 5.10: Henon-Heiles Hamiltonian. The orbit circulates around a distorted torus (Poincaré section through x = 0).

Figure 5.11: The orbit breaks up into smaller tori.

Figure 5.12: Chaos begins to set in.

Figure 5.13: Complete chaos.



5.4 Discrete Maps

Suppose we were to number the points on the Poincaré plot in the order they appeared as the orbit repeatedly cut through the x = 0 plane. This would give us a series of coordinates (x1, y1), (x2, y2), · · · , (xn, yn). Think of this in terms of a mapping operator T that maps the n'th point into the n + 1'th point,

T(xn, yn) ≡ (xn+1, yn+1)    (5.20)

In principle we could derive the exact mathematical form for this operator. (I doubt that anyone has actually done this.) Certainly we could write a computer program to do the mapping, and certainly we could derive a linearized version of T that would be adequate for small displacements. For my purposes it will be enough to consider the general properties such operators must have. The first of these (from which all others flow) is that they must be area preserving.

Canonical transformations preserve the volume of phase space. This is called Liouville's theorem; it's proved in most mechanics texts. For a one-degree-of-freedom system, this is just preservation of area in the (p, q) phase plane. Thus for some area A, enclosed by a closed curve C, we can use Stokes' theorem to write

\oint_C p\, dq = \oint_{C'} p\, dq    (5.21)

where C′ is the shape of the curve after it has been changed by some canonical transformation, including the passage of time, which is itself a canonical transformation. Another way to say the same thing is that if the (q, p) point in phase space is transformed to (q′, p′), then the Jacobian

\left| \frac{\partial(q', p')}{\partial(q, p)} \right| = 1    (5.22)

These results can be extended to higher dimensions in a completely straightforward manner.

So far, so good. There is a corollary to Liouville's theorem that is not so easy to prove: the transformations of the form (5.20) on the Poincaré section also preserve area in the sense of (5.22).⁵ Discrete maps are area preserving.

⁵Tabor, Appendix 4.1.


Figure 5.14: The standard map with ϵ = 0 (J vs. ϕ).

Let's take time out for an example. The following transformation is called the standard map, presumably because it appears in so many different contexts. Thanks to the Jn+1 (rather than Jn) in the first of equations (5.23) it is trivially area preserving:

ϕn+1 = (ϕn + Jn+1) mod 2π
Jn+1 = ϵ sin ϕn + Jn    (5.23)

This is a one-dimensional map written in terms of action-angle variables J and ϕ. Not only ϕ, but also J is periodic with period 2π. We can imagine all the orbits wrapped around a cylinder. In the case ϵ = 0, the (ϕn, Jn)'s lie along parallel circles as shown in Figure 5.14. When ϵ is increased to 0.050 a new feature appears, a loop in the center of the plot. This is unusual in the sense that it can be contracted to a point; it is topologically distinct from all the ϵ = 0 circles. As ϵ is increased, an assortment of smaller loops appears together with a smattering of completely random points. Because of limitations on plot resolution, computer time, and my patience you cannot see the really significant thing about this plot: this pattern of islands of loopy order interspersed with random dots persists at ever smaller and smaller scales. The islands have a property called self-similarity; in this sense they are similar to fractal patterns. Finally, as ϵ is increased further, all appearance of order disappears and the dots become completely random. This is the state of complete chaos.
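The standard map is trivial to iterate on a computer. The following sketch is mine; it generates the (ϕ, J) points that make up plots like Figures 5.14 through 5.16.

```python
import numpy as np

def standard_map(phi, J, eps, n_steps):
    """Iterate the area-preserving standard map (5.23), returning the orbit."""
    orbit = np.empty((n_steps, 2))
    for i in range(n_steps):
        J = eps * np.sin(phi) + J              # J_{n+1} = eps*sin(phi_n) + J_n
        phi = (phi + J) % (2 * np.pi)          # phi_{n+1} = (phi_n + J_{n+1}) mod 2*pi
        # wrap J into (-pi, pi] since J is also periodic with period 2*pi
        J = (J + np.pi) % (2 * np.pi) - np.pi
        orbit[i] = (phi, J)
    return orbit

eps = 0.75
orbits = [standard_map(phi0, J0, eps, 500)
          for J0 in np.linspace(-3, 3, 15)
          for phi0 in (0.5, 3.0)]
pts = np.vstack(orbits)
print(pts.shape)   # scatter-plot pts[:, 0] vs pts[:, 1] to reproduce a plot like Figure 5.16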

Figure 5.15: The standard map with ϵ = 0.050.


5.5 Linearized Maps

Like the continuous transformations we studied in Section 5.2, discrete maps have fixed points about which one can analyze the local topology. Consider a generic mapping of the form⁶

\begin{pmatrix} x_{i+1} \\ y_{i+1} \end{pmatrix} = T \begin{pmatrix} x_i \\ y_i \end{pmatrix}    (5.24)

A fixed point of the mapping would be a point where xi+1 = xi and yi+1 = yi. I will argue later on that in a plot like Figure 5.16 there are an infinite number of fixed points, but to keep the algebra simple here I will assume that the fixed point is at the origin (0, 0). Linearizing T about this point gives

\begin{pmatrix} \delta x_{i+1} \\ \delta y_{i+1} \end{pmatrix} = \begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix} \begin{pmatrix} \delta x_i \\ \delta y_i \end{pmatrix}    (5.25)

where of course

T_{ij} = \left. \frac{\partial T_i}{\partial x_j} \right|_{x_i, x_j = 0}    (5.26)

⁶I am using Tabor's notation from section 4.3.4.


Figure 5.16: The standard map with ϵ = 0.750.

The eigenvalues λi of the Tij matrix must satisfy

λ² − λ (trace(T)) + det(T) = 0    (5.27)

The all-important point here is that because of the area-preserving property of T, det(T) = 1. This greatly restricts the allowed types of fixed points. There are only three cases to consider.

If |trace(T)| < 2, λ1 and λ2 are a complex-conjugate pair lying on the unit circle, that is,

λ1 = e^{+iα},   λ2 = e^{−iα}    (5.28)

This is simply a rotation in the vicinity of the fixed point (0, 0). It corresponds to a stable or elliptic point. Thus in the immediate neighborhood of (0, 0) we expect to find invariant curves like Figure 5.7.

If |trace(T)| > 2, λ1 and λ2 are real numbers satisfying

λ1 = 1/λ2    (5.29)

There are two subcases to consider here depending on whether λ is positive or negative. If it is positive we have a regular hyperbolic fixed point, in which successive iterates stay on the same branch of the hyperbola, as in Figure 5.17 (a). If λ < 0 we have a hyperbolic-with-reflection fixed point, in which successive iterates jump backwards and forwards between opposite branches of the hyperbola. (See Figure 5.17 (b).)
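To tie this to the standard map, here is a short sketch of my own. It builds the linearized map T at the two period-1 fixed points of (5.23), (ϕ, J) = (0, 0) and (π, 0), checks that det(T) = 1, and classifies each point by its trace.

```python
import numpy as np

def tangent_map(phi, eps):
    """Jacobian of the standard map (5.23) with respect to (phi_n, J_n)."""
    return np.array([[1.0 + eps * np.cos(phi), 1.0],
                     [eps * np.cos(phi),       1.0]])

eps = 0.75
for phi_f in (0.0, np.pi):       # the two period-1 fixed points sit at J = 0
    T = tangent_map(phi_f, eps)
    tr, det = np.trace(T), np.linalg.det(T)
    kind = "elliptic (stable)" if abs(tr) < 2 else "hyperbolic (unstable)"
    print(f"phi_f = {phi_f:.2f}: det = {det:.3f}, trace = {tr:.3f} -> {kind}")
```

The elliptic point at ϕ = π is the center of the loop that appears in Figure 5.15; the hyperbolic point at ϕ = 0 is where the chaotic band first develops.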


Figure 5.17: (a) Hyperbolic fixed point. (b) Hyperbolic-with-reflection fixed point.

5.6 Lyapunov Exponents

Loosely speaking, systems are chaotic because adjacent trajectories diverge exponentially from one another. If this were literally true we could parameterize this divergence with the function e^{λx}, where λ is some constant and x is the independent variable, which might be continuous or discrete depending on the application. This is the basic idea behind Lyapunov exponents, a formalism with many alternate definitions (and spellings).

Let's apply this idea first to a one-dimensional iterative map of the form

xi+1 = f(xi)    (5.30)

We can characterize the divergence of two trajectories separated by ϵ upon the n-th iteration as

\lim_{\epsilon \to 0} \frac{|f(x_n + \epsilon) - f(x_n)|}{\epsilon} = \left| \frac{df(x_n)}{dx_n} \right|    (5.31)

A small but finite deviation at the n-th iteration, say δxn, should grow to

\delta x_{n+1} \approx \left| \frac{df(x_n)}{dx_n} \right| \delta x_n    (5.32)

Continuing this reasoning,

\delta x_{n+1} = \left| \frac{df(x_n)}{dx_n}\,\frac{df(x_{n-1})}{dx_{n-1}} \times \cdots \times \frac{df(x_0)}{dx_0} \right| \delta x_0 = \prod_{i=0}^{n} |f'(x_i)|\, \delta x_0 \equiv e^{\lambda n}\, \delta x_0    (5.33)

The last equality is just a hypothesis. λ will certainly depend on the point n where we stop iterating. We should write instead

\lambda(n) = \frac{1}{n} \ln \prod_{i=0}^{n} |f'(x_i)|    (5.34)

with the understanding that the definition only makes sense if there is some range of n over which λ(n) is more or less constant. λ defined in this way is a Lyapunov exponent.
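As a worked example of (5.34), the sketch below (mine) estimates the Lyapunov exponent of the logistic map xi+1 = r xi(1 − xi), a map not discussed in these notes but convenient here because its exponent at r = 4 is known to be ln 2.

```python
import numpy as np

def lyapunov_1d(f, dfdx, x0, n_iter=50000, n_transient=1000):
    """Estimate the Lyapunov exponent of x_{i+1} = f(x_i) via (5.34)."""
    x = x0
    for _ in range(n_transient):      # discard transients
        x = f(x)
    total = 0.0
    for _ in range(n_iter):
        total += np.log(abs(dfdx(x)))
        x = f(x)
    return total / n_iter

r = 4.0
f = lambda x: r * x * (1.0 - x)
dfdx = lambda x: r * (1.0 - 2.0 * x)
lam = lyapunov_1d(f, dfdx, x0=0.3)
print(f"lambda = {lam:.4f}   (analytic value for r = 4 is ln 2 = {np.log(2):.4f})")
```

A positive λ like this is the numerical signature of chaos; for a stable periodic orbit the same calculation gives a negative value.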

In the case of multidimensional mappings,

xi+1 = F(xi)    (5.35)

where x and F are n-dimensional vectors, there will be a set of n characteristic exponents corresponding to the n eigenvalues of the linearized map. Introducing the eigenvalues λi(N), i = 1, . . . , n, of the matrix

(LM)_N = \left( T(x_N)\, T(x_{N-1}) \cdots T(x_1) \right)^{1/N}    (5.36)

where T(xi) is the linearization of F at the point xi, the exponents are defined as

σi(N) = ln |λi(N)|    (5.37)

Since the T's have unit determinant for area-preserving maps, it is clear that the sum of the exponents must be zero.
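For a multidimensional example, the sketch below (my own) estimates both exponents of the standard map along an orbit. Rather than forming the matrix product in (5.36) directly (its entries overflow), it uses a standard trick: re-orthogonalize with a QR factorization at each step and accumulate the logarithms of the diagonal of R. The two exponents should sum to roughly zero, as claimed.

```python
import numpy as np

def step(phi, J, eps):
    J = J + eps * np.sin(phi)
    phi = (phi + J) % (2 * np.pi)
    return phi, J

def tangent(phi, eps):
    # Jacobian of one step of (5.23) with respect to (phi_n, J_n)
    return np.array([[1.0 + eps * np.cos(phi), 1.0],
                     [eps * np.cos(phi),       1.0]])

def lyapunov_spectrum(phi, J, eps, n_steps=20000):
    """Both exponents of the standard map, keeping the product in (5.36)
    under control with repeated QR factorizations."""
    Q = np.eye(2)
    sums = np.zeros(2)
    for _ in range(n_steps):
        M = tangent(phi, eps) @ Q
        Q, R = np.linalg.qr(M)
        sums += np.log(np.abs(np.diag(R)))
        phi, J = step(phi, J, eps)
    return sums / n_steps

sig = lyapunov_spectrum(phi=2.0, J=0.0, eps=2.0)
print(f"exponents: {sig}, sum = {sig.sum():.2e} (should be ~0 for an area-preserving map)")
```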

For the final example, suppose the equation of motion is

ẋ = f(x)    (5.38)

Let s(t) = x(t) − x0(t) be the difference between two nearby trajectories. If this does indeed diverge exponentially with time, then ṡ = λs. Then we can argue that

ṡ = ẋ − ẋ0 = f(x) − f(x0) = λs = λ(x − x0)    (5.39)

\lambda = \frac{f(x) - f(x_0)}{x - x_0} \approx \left. \frac{df}{dx} \right|_{x_0}    (5.40)


5.7 The Poincaré-Birkhoff Theorem

The phase-space trajectories of integrable systems move on smooth tori. The appearance of the Poincaré section depends on whether the winding number is rational or irrational. If it is rational the section shows discrete points. If irrational, the points are 'ergodic' and form a continuous loop. Under the influence of nonlinear perturbations the tori become distorted, then break up into smaller tori, and finally disintegrate into chaos. It turns out that the way this happens depends on whether the winding number is rational or irrational. If it is irrational the tori are preserved, distorted but preserved, under small perturbations. This is a gross oversimplification of the KAM theorem, which I will discuss in Section 5.9. If the winding number is rational, the tori break up in a way that is governed by the so-called Poincaré-Birkhoff theorem, the subject of this section. This may seem like a swindle, since every irrational number can be approximated to arbitrary accuracy by a rational number. But, as it turns out, some numbers are more irrational than others!

I will prove the PB theorem for the standard map, equation (5.23), but it is true under quite general assumptions. I will use the symbol Tϵ for (5.23), i.e.

Tϵ(ϕn, Jn) = (ϕn+1, Jn+1).

Now imagine the points in Figure 5.14 (ϵ = 0) plotted in polar coordinates (for positive J) with ϕ the angular and J the radial coordinate. The points now lie on concentric circles of constant J. Choose J ≡ Jr = 2πj/k, with k and j integers, i.e. Jr has a rational winding number. If we iterate T0 k times, J remains unchanged and ϕ is incremented by j full factors of 2π, which is to say, ϕ is not changed at all. Symbolically,

T0^k(ϕ, Jr) = (ϕ, Jr)

Now take a J+ slightly larger than Jr. T0^k will increment ϕ by slightly more than 2πj, so ϕ will increase. In the same way, if J− < Jr, T0^k will cause ϕ to decrease. We can imagine the values of ϕ lying on three circles J+, Jr, and J−, as shown in Figure 5.18(a).

Now turn on a small perturbation ϵ > 0. Tϵ^k will map some ϕ's to larger values and some to smaller, but there will be some locus of points, called C in Figure 5.18(b), whose ϕ is not changed at all. In other words, the curve C is mapped purely radially,

Tϵ^k(Jr, ϕ) = (Jc, ϕ)
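A small numerical aside of my own: the construction above is easy to check for the standard map. With ϵ = 0, k iterations rotate a point on the rational circle Jr = 2πj/k by exactly 2πj; with a small ϵ, one can solve for the curve C of points whose net rotation after k steps is still exactly 2πj.

```python
import numpy as np
from scipy.optimize import brentq

def net_rotation(phi0, J0, eps, k, j):
    """Total phase advance of the standard map after k steps, minus 2*pi*j."""
    phi, J = phi0, J0
    for _ in range(k):
        J = J + eps * np.sin(phi)
        phi = phi + J          # no mod: keep track of the accumulated angle
    return phi - phi0 - 2 * np.pi * j

j, k = 1, 3
Jr = 2 * np.pi * j / k         # torus with rational winding number j/k

# Unperturbed map: J = Jr is rotated by exactly 2*pi*j after k steps.
print(net_rotation(0.7, Jr, eps=0.0, k=k, j=j))     # ~0 for any phi0

# Small perturbation: for each phi0 there is a J near Jr with zero net rotation;
# these points trace out the curve C of the Poincare-Birkhoff construction.
eps = 0.05
curve_C = [(phi0, brentq(lambda J: net_rotation(phi0, J, eps, k, j),
                         Jr - 0.5, Jr + 0.5))
           for phi0 in np.linspace(0, 2 * np.pi, 12, endpoint=False)]
print(np.round(curve_C[:4], 4))
```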


Figure 5.18: (a) Three orbits of the unperturbed standard map T0^k. (b) The ϕ coordinate is left invariant on C by the perturbed map Tϵ^k.

Curve C is mapped into a new curve, called D in Figure 5.19,

Tϵ^k(Jc, ϕ) = (Jd, ϕ)

The curves C and D must enclose the same area (remember these are area-preserving transformations), so they must cross one another an even number of times. This situation is shown in Figure 5.19. The crossings represent points that are invariant under Tϵ^k – they are fixed points.

This is our first result. A torus with rational winding number j/k is invariant under T0^k, i.e. every point on the torus is a fixed point of T0^k. When ϵ is even slightly larger than zero, only a discrete (even) number of fixed points of Tϵ^k survive. You can ascertain the type of fixed points by seeing how other points in their immediate vicinity are mapped. Compare this flow, as it is called, with the arrows in Figures 5.4 and 5.17. You should be able to convince yourself that the points along the curve C are alternately hyperbolic and elliptic. Figure 5.20 should help you visualize this. Since there are an even number of fixed points, half of them will be elliptic and half hyperbolic. How many are there? Suppose (ϕ0, J0) is a fixed point of Tϵ^k. We can create more fixed points by multiplying by Tϵ, as the following simple argument shows:

Tϵ^k[Tϵ(ϕ0, J0)] = Tϵ Tϵ^k(ϕ0, J0) = Tϵ(ϕ0, J0)

Starting with (ϕ0, J0) we can create k − 1 additional fixed points by multiplying repeatedly with Tϵ. To put it another way, every fixed point of Tϵ^k is a member of a family of k fixed points obtained by multiplying by various powers of Tϵ. Because each mapping is a continuous function of ϕ and J, all the members of an elliptic family are elliptic and all the members of a hyperbolic family are hyperbolic. I claim that all the members of a family are distinct. Proof: Let (ϕs, Js) be the fixed point obtained by Tϵ^s(ϕ0, J0) = (ϕs, Js) with s < k. Then of course all such points are fixed points of Tϵ^k. The claim is that there is no m < k such that Tϵ^m(ϕs, Js) = (ϕs, Js). Multiply both sides of this equation with Tϵ^{−s}. The result is Tϵ^m(ϕ0, J0) = (ϕ0, J0). It is just this equation with m replaced by k that defines (ϕ0, J0). Hence m = k. Finally, note that none of these newly created fixed points can lie along the original curve C. If they did, there would be instances in which two hyperbolic or two elliptic points appeared side by side, which we know to be impossible. Consequently each torus breaks up into k fixed points for every fixed point on the curve C. This is the Poincaré-Birkhoff theorem.

Figure 5.19: The curves C and D. Crossings, like a and b, are fixed points.

Figure 5.20: A closer look at the fixed points a and b.

5.8 All in a tangle

Have another look at the hyperbolic fixed points in Figures 5.4 and 5.17. There are always two loci of points leading directly toward the fixed point and two loci leading away from it. These are called the stable and unstable manifolds respectively. Following the notation of Hand and Finch I will call them H+ and H−. Call the fixed point pf. Any point along H+ will be mapped asymptotically back to pf under repeated applications of Tϵ, and any point on H− will be mapped asymptotically back to pf under repeated applications of Tϵ^{−1}. Can these manifolds cross one another? I claim the following.

• H+ and H− cannot intersect themselves, but they can and do intersect one another.

• Stable manifolds of different fixed points cannot intersect one another.

• Unstable manifolds of different fixed points cannot intersect one another.

• Stable manifolds can intersect with unstable ones. The stable and unstable manifolds of a single fixed point intersect in what are called homoclinic points, and those of two different fixed points, in heteroclinic points.

• Neither H+ nor H− can cross the tori surrounding elliptic fixed points.

• There are, depending on the size of ϵ, narrow bands surrounding tori with irrational winding number that are not broken up into isolated fixed points. This is the content of the KAM theorem to be discussed in the next section. Neither H+ nor H− can cross these bands.

The proofs of these assertions are easy and are given in Finch and Hand. Referring to Figure 5.21(a), x0 is a heteroclinic point that lies on the unstable manifold H− of pf1 and the stable manifold H+ of pf2. Since both manifolds are invariant under Tϵ, the Tϵ^k x0 are a set of discrete points that lie on both manifolds, so the two manifolds must therefore intersect again. For instance, because x1 = Tϵ x0 is on both manifolds, H− must loop around to meet H+. Similarly the xk = Tϵ^k x0 must lie on both manifolds, so H− must loop around over and over again, as illustrated in Figure 5.21(b). The inverse map also leaves H+ and H− invariant, and hence the x−k = Tϵ^{−k} x0 are intersections that force H+ to loop around to meet H−. As k increases and xk approaches one of the fixed points, the spacing between the intersections gets smaller, so the loops they create get narrower. But because Tϵ is area-preserving, the loop areas are the same, so the loops get longer, which leads to many intersections among them, as shown in Figure 5.21(c) and (d).
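These manifolds can be traced numerically, which is a good way to convince yourself that the loops really do pile up. The sketch below is my own recipe (not from Hand and Finch): it seeds a short segment along the unstable eigenvector of the standard map's hyperbolic fixed point at (0, 0) and pushes it forward; plotting the resulting points shows the unstable manifold beginning to stretch and fold.

```python
import numpy as np

def step(phi, J, eps):
    J = J + eps * np.sin(phi)
    phi = (phi + J) % (2 * np.pi)
    return phi, J

eps = 0.75
# Tangent map at the hyperbolic fixed point (phi, J) = (0, 0)
T = np.array([[1 + eps, 1.0], [eps, 1.0]])
lam, vecs = np.linalg.eig(T)
v_unstable = vecs[:, np.argmax(lam)]          # eigenvector with eigenvalue > 1

# Seed a tiny segment along the unstable direction and iterate it forward.
seeds = np.outer(np.linspace(1e-6, 1e-5, 200), v_unstable)
points = []
for phi, J in seeds:
    for _ in range(12):                        # a dozen iterations is plenty here
        phi, J = step(phi, J, eps)
        points.append((phi, J))
points = np.array(points)
print(points.shape)   # scatter-plot these to see the unstable manifold folding back
```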

Try explaining all this to an intimate friend on a date. The more you explain, the more you will see that this mechanism produces a tangle of fathomless complexity.⁷ Nonetheless the mess is contained, at least for small ϵ. Since stable manifolds cannot cross, the stable manifold emanating from pf1 acts as a barrier to the stable manifold emanating from pf2. The same is true of the unstable manifolds. The tangle also cannot cross the stable tori surrounding the elliptic fixed points, nor can it cross the KAM tori. As a consequence we expect to see islands of chaos developing between stable ellipses. This is clear in Figures 5.15 and 5.16. As ϵ increases, the KAM tori also break down and chaos engulfs the entire plot.

⁷Don't try to explain this for higher-dimensional spaces. That way lies madness.


Figure 5.21: A heteroclinic intersection. (a) Two hyperbolic fixed points pf1 and pf2, and an intersection x0 of the unstable manifold of pf1 with the stable manifold of pf2. (b) Adding the forward maps T^k x0 of the intersection. (c) Adding the backward maps T^{−k} x0 of the intersection. (d) Adding another intersection x′ and some of its backward maps. U – unstable manifold; S – stable manifold.


5.9 The KAM theorem and its consequences

For a system with n independent degrees of freedom to be integrable, it is a necessary and sufficient condition that n independent constants of the motion exist. In this case the system can be transformed into a set of action-angle variables. In this notation

ω0 ≡ (ω01, ω02, · · · , ω0n)   I0 ≡ (I01, I02, · · · , I0n)    (5.41)

and

ω0 = ∂H0/∂I0    (5.42)

Now suppose the system is perturbed slightly,

H(ω0, I0, ϵ) = H0(I0) + ϵH1(ω0, I0)    (5.43)

where I0 and ω0 are the AA variables of H0. According to our perturbation formalism from Chapter 4, there are two series that must converge, (4.25) and (4.26), repeated here for convenience:

\tilde H_1(I, \psi) = \sum_k A_k(I)\, e^{i\,k\cdot\psi}    (5.44)

and

F_1(I, \psi) = \sum_k B_k(I)\, e^{i\,k\cdot\psi}    (5.45)

where k is a vector of integers,⁸ k = (k1, · · · , kn), and

B_k = \frac{i\, A_k}{\omega_0 \cdot k}    (5.46)

The rate of decrease of the |Bk| depends both on the |Ak| and the denominators |ω0 · k|, so even if the |Ak| decrease fast enough for (5.44) to converge, (5.45) will not converge if the |ω0 · k| decrease too rapidly.

The situation seems hopeless. If any of the ω0's yields a rational winding number, the series will blow up immediately, and if one is working with finite precision – on a computer, for example – every number is a rational number. And yet we have seen from our computer models that some stable periodic trajectories persist even under the influence of small perturbations.

⁸The sum over k means the sum over all possible combinations of the n integers k1, · · · , kn.


The circumstances under which this happens are spelled out in a remarkable theorem first outlined by Kolmogorov and later proved independently by Arnold and Moser. The theorem is extremely difficult and sophisticated, although Tabor has a nice explanation of the basic ideas, and an understandable outline of the proof is given in Classical Dynamics by José and Saletan. I will explain the theorem as carefully as I can and let it go at that.

5.9.1 Two Conditions

The KAM theorem claims that in regions of phase space where certain conditions hold, the perturbation series converges to all orders in ϵ. The first condition involves the Hessian matrix:

\det\left|\frac{\partial \omega_{0\alpha}}{\partial I_{0\beta}}\right| \equiv \det\left|\frac{\partial^2 H_0}{\partial I_{0\alpha}\,\partial I_{0\beta}}\right| \neq 0.    (5.47)

The content of this statement is as follows: we assume that each torus has a unique frequency associated with it. Thus if we knew all the ω0's we could calculate all the I0's and vice versa. Equation (5.47) ensures that this is true. A simple (albeit artificial) example is provided by José and Saletan. Consider the one-degree-of-freedom Hamiltonian

H = aI³/3    (5.48)

in which I takes on values in the interval −1 < I < +1. The above condition requires that

d²H/dI² = 2aI ≠ 0    (5.49)

Why is this significant? Note that ω(I) = dH/dI = aI². Inverting this gives I(ω) = ±√(ω/a). The ± is a sign that the inversion is not unique. There are two regions separated by I = 0. In the region 0 < I ≤ 1, I = √(ω/a); in the region −1 ≤ I < 0, I = −√(ω/a). Thus there are two "good" regions separated by a barrier.

There is a second condition restricting the frequencies. Of course we are only considering frequencies with irrational winding numbers. Even if the frequencies are incommensurate, |ω0 · k| could be arbitrarily small. The KAM theorem requires that it be bounded from below by the so-called "weak diophantine condition,"

|ω0 · k| ≥ γ|k|^{−κ}   for all integer k    (5.50)

where |k| = √(k · k) and γ and κ > n are positive constants.

What is the significance of this strange inequality? The best way to understand it, I think, is to face up to the paradox I mentioned earlier: the series can only converge for irrational winding numbers, and yet it seems that every irrational number is "arbitrarily close" to a rational number. It is this last statement that needs to be examined more carefully. This requires a brief excursion into number theory. Consider the unit interval [0, 1]. The rationals have measure zero in the interval. That means, roughly, that they don't take up any space. This can be proved as follows. First put the rationals in a one-to-one correspondence with the integers. Construct a small open interval of length ϵ < 1 about the first rational, one of length ϵ² about the second, and so forth. The sum of all these little intervals (this is a geometric series) is σ = ϵ/(1 − ϵ), which can be made arbitrarily small by choosing ϵ small enough. Thus the space occupied by the rationals is less than any positive number. This requires taking the limit ϵ → 0. The paradoxical thing is that it is possible to remove a finite interval around each rational without deleting all of [0, 1]. This can be seen as follows. Write each rational in [0, 1] in its lowest form as p/q, and about each one construct an interval of length 1/q³. For each q there are at most q − 1 rationals. Thus for a given q no more than (q − 1)/q³ is covered by the intervals, and the total length Q that is covered is less (because of overlaps) than the sum of these intervals over all q,

Q < \sum_{q=2}^{\infty} \frac{q-1}{q^3} < \sum_{q=2}^{\infty} \frac{1}{q^2}    (5.51)

This sum is related to the Riemann zeta function. At any rate, Q < 0.645. We can make this number as small as we like by replacing 1/q³ with Γ/q³ where Γ < 1. Even if we leave Γ = 1, the fraction of [0, 1] covered by the finite intervals is less than 1.
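A quick numerical check of (5.51), added as my own aside: the partial sums converge to ζ(2) − ζ(3) ≈ 0.443 and ζ(2) − 1 ≈ 0.645 respectively, so the covered length really is well below 1.

```python
# Partial sums of the two bounds appearing in (5.51)
N = 200000
covered_bound = sum((q - 1) / q**3 for q in range(2, N))    # ~ zeta(2) - zeta(3)
looser_bound = sum(1.0 / q**2 for q in range(2, N))         # ~ zeta(2) - 1
print(f"sum (q-1)/q^3 ~ {covered_bound:.4f}")               # ~ 0.4429
print(f"sum 1/q^2     ~ {looser_bound:.4f}")                # ~ 0.6449 < 0.645
```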

Now we can divide the irrationals into two sets: those covered by the intervals around the rationals and those outside the intervals. Those uncovered satisfy the condition

\left| \omega - \frac{p}{q} \right| \geq \frac{\Gamma}{q^3}.    (5.52)

Several comments are in order regarding this inequality.

• Equation (5.52) makes irrationality a quantitative concept.⁹ Those irrationals that satisfy (5.52) are "more irrational" than those that don't, and the extent of their irrationality can be quantified by that value of Γ for which they just do or do not satisfy the inequality. (A short numerical illustration follows this list.)

• Equation (5.50) is just an n-dimensional version of (5.52). The constants γ and κ characterize the degree of irrationality of ω in the same way that Γ and the exponent 3 characterize the irrationality of ω in (5.52).

• The uncovered irrationals occupy isolated "islands" between the covered intervals. We expect that as the perturbation parameter ϵ is increased, those tori with less irrational winding numbers will be destroyed first, but islands of stability will remain between them. Eventually, as the perturbation is increased, all will be swept away in chaos.

• The KAM theorem gives us no clue how to calculate the appropriate values of γ and κ or the values of ϵ for which chaos will set in. Some estimates placed the critical value of ϵ at something around 10⁻⁵⁰! If this were true, of course, the theorem would be quite pointless. Numerical tests with specific models have found critical values of ϵ as large as ϵc ≈ 1. I will close with a quote from José and Saletan: "To our knowledge a rigorous formal estimate of a realistic critical value for ϵ remains an open question."

⁹This can also be quantified in terms of continued fraction expansions.
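The quantity in (5.52) is easy to probe numerically. The sketch below (mine) estimates, for a given ω, the smallest value of q³|ω − p/q| over denominators up to some cutoff. For the golden mean this never drops below about 0.38, while for a number sitting a distance 10⁻⁷ from 1/3 (not literally irrational, but it makes the point) it is of order 10⁻⁶: in the language above, the golden mean is far "more irrational."

```python
import numpy as np

def gamma_bound(omega, q_max=10000):
    """Smallest value of q^3 * |omega - p/q| (equivalently q^2 times the distance
    of q*omega from the nearest integer) over q <= q_max: roughly the largest
    Gamma for which (5.52) holds up to that cutoff."""
    q = np.arange(1, q_max + 1, dtype=float)
    dist = np.abs(q * omega - np.round(q * omega))   # distance of q*omega from an integer
    return np.min(q**2 * dist)

golden = (np.sqrt(5.0) - 1.0) / 2.0        # winding number of the most robust tori
nearly_rational = 1.0 / 3.0 + 1.0e-7       # a number extremely close to 1/3

print(f"golden mean : Gamma ~ {gamma_bound(golden):.4f}")
print(f"1/3 + 1e-7  : Gamma ~ {gamma_bound(nearly_rational):.2e}")
```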

5.10 Conclusion

This is the end of our story about chaos. Remember that we have only dealt with bounded, conservative systems with time-independent Hamiltonians. (Classical mechanics is a big subject.) Systems with one degree of freedom are trivial (in principle) to solve using the method of quadratures. Systems with n degrees of freedom are trivial (again in principle) if they have n constants of motion. Such a system can be reduced, by using action-angle variables, to an ensemble of uncoupled oscillators. These systems are said to be integrable, and they do not display chaos. The trouble comes when we introduce some non-integrability as a perturbation. Perturbation theory is straightforward with one degree of freedom, but with two or more degrees of freedom comes the notorious problem of small denominators.¹⁰ Perturbation theory fails immediately for all periodic trajectories with rational winding number. According to the Poincaré-Birkhoff theorem, these trajectories on the Poincaré section break up into complicated whorls and tangles surrounded by regions of stability corresponding to irrational winding numbers. According to the KAM theorem these regions break down, with those with "more irrational" winding numbers outlasting those with less. At last "Universal darkness covers all," and the trajectories, though deterministic, show no order or pattern.

¹⁰There are other ways of doing perturbation theory in addition to the one described here. They all suffer from the same problem.
