Nonlinear Mechanics - Physics at Oregon State University
Nonlinear Mechanics

A. W. Stetz

January 8, 2012
Contents

1 Lagrangian Dynamics                                           5
  1.1 Introduction                                              5
  1.2 Generalized Coordinates and the Lagrangian                6
  1.3 Virtual Work and Generalized Force                        8
  1.4 Conservative Forces and the Lagrangian                   10
      1.4.1 The Central Force Problem in a Plane               11
  1.5 The Hamiltonian Formulation                              13
      1.5.1 The Spherical Pendulum                              15

2 Canonical Transformations                                    17
  2.1 Contact Transformations                                  17
      2.1.1 The Harmonic Oscillator: Cracking a Peanut with a Sledgehammer  20
  2.2 The Second Generating Function                           21
  2.3 Hamilton's Principle Function                            22
      2.3.1 The Harmonic Oscillator: Again                     24
  2.4 Hamilton's Characteristic Function                       25
      2.4.1 Examples                                           26
  2.5 Action-Angle Variables                                   27
      2.5.1 The harmonic oscillator (for the last time)        29

3 Abstract Transformation Theory                               33
  3.1 Notation                                                 33
      3.1.1 Poisson Brackets                                   35
  3.2 Geometry in n Dimensions: The Hairy Ball                 38
      3.2.1 Example: Uncoupled Oscillators                     41
      3.2.2 Example: A Particle in a Box                       43

4 Canonical Perturbation Theory                                45
  4.1 One-Dimensional Systems                                  45
      4.1.1 Summary                                            49
      4.1.2 The simple pendulum                                49
  4.2 Many Degrees of Freedom                                  51

5 Introduction to Chaos                                        55
  5.1 The total failure of perturbation theory                 56
  5.2 Fixed points and linearization                           58
  5.3 The Hénon oscillator                                     62
  5.4 Discrete Maps                                            68
  5.5 Linearized Maps                                          70
  5.6 Lyapunov Exponents                                       72
  5.7 The Poincaré-Birkhoff Theorem                            74
  5.8 All in a tangle                                          77
  5.9 The KAM theorem and its consequences                     80
      5.9.1 Two Conditions                                     81
  5.10 Conclusion                                              83
Chapter 1

Lagrangian Dynamics

1.1 Introduction
The possibility that deterministic mechanical systems could exhibit the behavior we now call chaos was first realized by the French mathematician Henri Poincaré sometime toward the end of the nineteenth century. His discovery emerged from analytic or classical mechanics, which is still part of the foundation of physics. To put it a bit facetiously, classical mechanics deals with those problems that can be "solved," in the sense that it is possible to derive equations of motion that describe the positions of the various parts of a system as functions of time using standard analytic functions. Nonlinear dynamics treats problems that cannot be so solved, and it is only in these problems that chaos can appear. The simple pendulum makes a good example. The differential equation of motion is

    \ddot{\theta} + \omega^2 \sin\theta = 0.    (1.1)

The sine is a nonlinear function of θ. If we linearize by setting sin θ ≈ θ, the solutions are elementary functions, sin ωt and cos ωt. If we keep the sine, the solutions can only be expressed in terms of elliptic integrals. This is not a chaotic system, because there is only one degree of freedom, but if we hang one pendulum from the end of another, the equations of motion are hopeless to solve (even with elliptic integrals) and the resulting motion can be chaotic.^1

^1 I should emphasize the distinction between the differential equations of motion, which are usually simple (though nonlinear), and the equations that describe the positions of the elements of the system as functions of time, which are usually non-existent.
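The gap between the linearized and the full pendulum is easy to see numerically. The following sketch is mine, not the text's: it integrates both versions of (1.1) with an assumed ω = 1 and a large initial amplitude, and compares the exact nonlinear period, which involves the complete elliptic integral K (SciPy's `ellipk` takes the parameter m = k²).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.special import ellipk

omega = 1.0    # assumed frequency (omega^2 = g/L for a real pendulum)
theta0 = 2.0   # large initial amplitude in radians, released from rest

# Equation (1.1) and its linearization, written as first-order systems
full = lambda t, y: [y[1], -omega**2 * np.sin(y[0])]
lin  = lambda t, y: [y[1], -omega**2 * y[0]]

t = np.linspace(0.0, 10.0, 500)
sf = solve_ivp(full, (0.0, 10.0), [theta0, 0.0], t_eval=t, rtol=1e-9)
sl = solve_ivp(lin,  (0.0, 10.0), [theta0, 0.0], t_eval=t, rtol=1e-9)

# Exact period of the full pendulum via the complete elliptic integral
T_full = 4.0 / omega * ellipk(np.sin(theta0 / 2.0) ** 2)
T_lin = 2.0 * np.pi / omega

print(T_full, T_lin)                       # the nonlinear period is longer
print(np.max(np.abs(sf.y[0] - sl.y[0])))   # the two solutions drift apart
```

The amplitude-dependent period is exactly what "solutions expressible only in elliptic integrals" means in practice: the linearized solution is a fine approximation for small swings and a poor one here.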
5
6 CHAPTER 1. LAGRANGIAN DYNAMICS<br />
In order to arrive at Poincaré's moment of discovery, we will have to review the development of classical mechanics through the nineteenth century. This material is found in many standard texts, but I will cover it here in some detail. This is partly to ensure uniform notation throughout these lectures and partly to focus on those things that lead directly to chaos in nonlinear systems. We will begin by formulating mechanics in terms of generalized coordinates and Lagrange's equations of motion. We then study Legendre transformations and use them to derive Hamilton's equations of motion. These equations are particularly suited to conservative systems in which the Hamiltonian is constant in time, and it is such systems that will be our primary concern. It turns out that canonical transformations can be used to transform Hamiltonians in a myriad of ways. One particularly elegant form uses action-angle variables to transform a certain class of problems into a set of uncoupled harmonic oscillators. Systems that can be so transformed are said to be integrable, which is to say that they can be "solved," at least in principle. What happens, Poincaré asked, to a system that is almost but not quite integrable? The answer entails perturbation theory and leads to the disastrous problem of small divisors. This is the path that led originally to the discovery of chaos, and it is the one we will pursue here.
1.2 Generalized Coordin<strong>at</strong>es and the Lagrangian<br />
Vector equ<strong>at</strong>ions, like F = ma, seem to imply a coordin<strong>at</strong>e system. Beginning<br />
students learn to use cartesian coordin<strong>at</strong>es and then learn th<strong>at</strong> this<br />
is not always the best choice. If the system has cylindrical symmetry, for<br />
example, it is best to use cylindrical coordin<strong>at</strong>es: it makes the problem easier.<br />
By “symmetry” we mean th<strong>at</strong> the number of degrees of freedom of the<br />
system is less th<strong>at</strong> the dimensionality of the space in which it is imbedded.<br />
The familiar example of the block sliding down the incline plane will make<br />
this clear. Let’s say th<strong>at</strong> it’s a two dimensional problem with an x-y coordin<strong>at</strong>e<br />
system. The block is constrained to move in a straight line, however,<br />
so th<strong>at</strong> its position can be completely specified by one variable, i.e. it has<br />
one degree of freedom. The clever student chooses the x axis so th<strong>at</strong> it lies<br />
along the p<strong>at</strong>h of the block. This reduces the problem to one dimension,<br />
since y = 0 and the x coordin<strong>at</strong>e is given by one simple equ<strong>at</strong>ion. In the<br />
pendulum example from the previous section, it was most convenient to use<br />
a polar coordin<strong>at</strong>e system centered <strong>at</strong> the pivot. Since r is constant, the<br />
motion can be described completely in terms of θ.
These coordinate systems conceal a subtle point: the pendulum moves in a circular arc and the block moves in a straight line because they are acted on by forces of constraint. In most cases we are not interested in these forces. Our choice of coordinates simply makes them disappear from the problem. Most problems don't have obvious symmetries, however. Consider a bead sliding along a wire that follows some complicated snaky path in 3-d space. There's only one degree of freedom, since the particle's position is determined entirely by its distance, measured along the wire, from some reference point. The forces are so complicated, however, that it is out of the question to solve the problem by using F = ma in any straightforward way. This is the problem that Lagrangian mechanics is designed to handle. The basic (and quite profound) idea is that even though there may be no coordinate system (in the usual sense) that will reduce the dimensionality of the problem, there is usually a system of coordinates that will do this. Such coordinates are called generalized coordinates.
To be more specific, suppose that a system consists of N point masses with positions specified by ordinary three-dimensional cartesian vectors, r_i, i = 1 ··· N, subject to some constraints. The easiest constraints to deal with are those that can be expressed as a set of l equations of the form

    f_j(r_1, r_2, \ldots, r_N, t) = 0,    (1.2)

where j = 1 ··· l. Such constraints are said to be holonomic. If, in addition, the equations of constraint do not involve time explicitly, they are said to be scleronomous; otherwise they are called rheonomous. These constraints can be used to reduce the 3N cartesian components to a set of 3N − l variables q_1, q_2, …, q_{3N−l}. The relationship between the two is given by a set of N equations of the form

    r_i = r_i(q_1, q_2, \ldots, q_{3N-l}, t).    (1.3)

The q's used in this way are the generalized coordinates. In the example of the bead on a curved wire, the equations would reduce to r = r(q), where q is a distance measured along the wire. This simply specifies the curvature of the wire.

It should be noted that the q's need not all have the same units. Also note that we can use the same notation even if there are no constraints. For example, the position of an unconstrained particle could be written r = r(q_1, q_2, q_3), and the q's might represent cartesian, spherical, or cylindrical coordinates. In order to simplify the notation, we will often pack the q's
into an array and use vector notation,

    \mathbf{q} = \begin{pmatrix} q_1 \\ q_2 \\ q_3 \\ \vdots \end{pmatrix}.    (1.4)
This is not meant to imply that q is a vector in the usual sense. For one thing, it does not necessarily possess "a magnitude and a direction" as good vectors are supposed to have. By the same token, we cannot use the notion of orthogonal unit vectors.

Along with the notion of generalized coordinates comes that of generalized velocities,

    \dot{q}_k \equiv \frac{dq_k}{dt}.    (1.5)

Since q_k depends only on t, this is a total derivative, but when we differentiate r_i, we must remember that it depends both explicitly on time as well as implicitly through the q's:

    \dot{r}_i = \sum_k \frac{\partial r_i}{\partial q_k} \dot{q}_k + \frac{\partial r_i}{\partial t}    (1.6)

(In this chapter I will consistently use the index i to sum over the N point masses and k to sum over the 3N − l degrees of freedom.) Differentiating both sides with respect to \dot{q}_k yields

    \frac{\partial \dot{r}_i}{\partial \dot{q}_k} = \frac{\partial r_i}{\partial q_k},    (1.7)

which will be useful in the following derivations.

1.3 Virtual Work and Generalized Force
There are several routes for deriving Lagrange's equations of motion. The most elegant and general makes use of the principle of least action and the calculus of variations. I will use a much more pedestrian approach based on Newton's second law of motion. First note that F = ma can be written in the rather arcane form

    \frac{d}{dt}\left(\frac{\partial T}{\partial v_i}\right) = F_i,    (1.8)

where F_i is the i-th component of the total force acting on a particle with kinetic energy T. The point of writing this in terms of energy rather than acceleration is that we can separate out the forces of constraint, which are always perpendicular to the direction of motion and hence do no work. The trick is to write this in terms of generalized coordinates and velocities. This is rather technical, but the underlying idea is simple, and the result looks much like (1.8).
The q_k's are all independent, so we can vary one by a small amount δq_k while holding all the others constant:

    \delta r_i = \sum_k \frac{\partial r_i}{\partial q_k}\, \delta q_k    (1.9)

This is sometimes called a virtual displacement. The corresponding virtual work is

    \delta W_k = \sum_i \left( F_i \cdot \frac{\partial r_i}{\partial q_k} \right) \delta q_k.    (1.10)

We define a generalized force

    \Im_k = \sum_i F_i \cdot \frac{\partial r_i}{\partial q_k}.    (1.11)

The forces of constraint can be excluded from the sum for the reason explained above. We are left with

    \Im_k = \frac{\delta W_k}{\delta q_k}.    (1.12)

The kinetic energy is calculated using ordinary velocities:

    T = \frac{1}{2} \sum_i m_i\, \dot{r}_i \cdot \dot{r}_i    (1.13)

    \frac{\partial T}{\partial q_k} = \sum_i m_i\, \dot{r}_i \cdot \frac{\partial \dot{r}_i}{\partial q_k} = \sum_i p_i \cdot \frac{\partial \dot{r}_i}{\partial q_k}    (1.14)

    \frac{\partial T}{\partial \dot{q}_k} = \sum_i m_i\, \dot{r}_i \cdot \frac{\partial \dot{r}_i}{\partial \dot{q}_k} = \sum_i p_i \cdot \frac{\partial r_i}{\partial q_k}    (1.15)

Equation (1.7) was used to obtain the last term. A straightforward calculation now leads to

    \Im_k = \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_k} \right) - \frac{\partial T}{\partial q_k},    (1.16)

which is the generalized form of (1.8).
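As a quick check of (1.16), a computer algebra system can grind out the right-hand side for a free particle in plane polar coordinates. This sketch in SymPy is my own example; the symbol names are not from the text.

```python
import sympy as sp

t, m = sp.symbols("t m", positive=True)
r = sp.Function("r")(t)
phi = sp.Function("phi")(t)

# Kinetic energy of a particle in a plane, in polar coordinates
T = sp.Rational(1, 2) * m * (r.diff(t)**2 + r**2 * phi.diff(t)**2)

def gen_force(T, qk):
    # Right-hand side of (1.16): d/dt(dT/d(qk_dot)) - dT/d(qk)
    qk_dot = qk.diff(t)
    return sp.simplify(sp.diff(sp.diff(T, qk_dot), t) - sp.diff(T, qk))

print(gen_force(T, r))    # radial component: m*r'' - m*r*phi'^2
print(gen_force(T, phi))  # angular component: d/dt(m*r^2*phi')
```

The radial result reproduces mass times radial acceleration including the centripetal term, and the angular result is the time derivative of m r² φ̇, foreshadowing the angular momentum conservation of Section 1.4.1.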
1.4 Conservative Forces and the Lagrangian

So far we have made no assumptions about the nature of the forces included in ℑ except that they are not forces of constraint. Equation (1.16) is therefore quite general, although seldom used in this form. In these notes we are primarily concerned with conservative forces, i.e. forces that can be derived from a potential,

    F_i = -\nabla_i V(r_1 \cdots r_N).    (1.17)

Notice that V doesn't depend on velocity. (Electromagnetic forces are velocity dependent, of course, but they can easily be accommodated into the Lagrangian framework. I will return to this issue later on.) Now calculate the work done by changing some of the q's:
    W = \sum_i \int F_i \cdot dr_i = -\sum_i \int \nabla_i V \cdot dr_i
      = -\sum_i \int \nabla_i V \cdot \sum_k \frac{\partial r_i}{\partial q_k}\, dq_k
      = -\sum_k \int \left( \sum_i \nabla_i V \cdot \frac{\partial r_i}{\partial q_k} \right) dq_k
      = -\sum_k \int \frac{\partial V}{\partial q_k}\, dq_k    (1.18)

The integral is a multidimensional definite integral over the various q's that have changed. Summing over (1.12) then gives

    \delta W = \sum_k \delta W_k = \sum_k \Im_k\, \delta q_k    (1.19)

    W = \sum_k \int \Im_k\, dq_k    (1.20)

Comparison with (1.18) yields

    \Im_k = -\frac{\partial V}{\partial q_k}.    (1.21)

Finally define the Lagrangian

    L = T - V.    (1.22)
Equation (1.16) becomes

    \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_k} \right) - \frac{\partial L}{\partial q_k} = 0.    (1.23)
Equation (1.23) represents a set of 3N − l second-order ordinary differential equations called Lagrange's equations of motion. I can summarize this long development by giving you a "cookbook" procedure for using (1.23) to solve mechanics problems: First select a convenient set of generalized coordinates. Then calculate T and V in the usual way using the r_i's. Use equation (1.3) to eliminate the r_i's in favor of the q_k's. Finally, substitute L into (1.23) and solve the resulting equations.

Classical mechanics texts are full of examples in which this program is carried to a successful conclusion. In fact, most of these problems are contrived and of little interest except to illustrate the method. The vast majority of systems lead to differential equations that cannot be solved in closed form. The modern emphasis is to understand the solutions qualitatively and then obtain numerical solutions using the computer. The Hamiltonian formalism described in the next section is better suited to both these ends.
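The cookbook procedure itself can be automated with a computer algebra system. The following SymPy sketch carries it out for the simple pendulum of Section 1.1; the identification ω² = g/R is the standard one, though it was not spelled out above.

```python
import sympy as sp

t, m, g, R = sp.symbols("t m g R", positive=True)
theta = sp.Function("theta")(t)

# Generalized coordinate: theta. For a pendulum of length R,
# T = (1/2) m R^2 theta_dot^2 and (dropping a constant) V = -m g R cos(theta).
T = sp.Rational(1, 2) * m * R**2 * sp.diff(theta, t)**2
V = -m * g * R * sp.cos(theta)
L = T - V

# Lagrange's equation (1.23): d/dt(dL/d(theta_dot)) - dL/d(theta) = 0
thetadot = sp.diff(theta, t)
eq = sp.diff(sp.diff(L, thetadot), t) - sp.diff(L, theta)
eq = sp.simplify(eq / (m * R**2))
print(eq)  # theta'' + (g/R) sin(theta), i.e. (1.1) with omega^2 = g/R
```

Solving the resulting equation in closed form is, of course, where the program stalls; this is exactly the point at which one turns to qualitative analysis and numerics.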
1.4.1 The Central Force Problem in a Plane

Consider the central force problem as an example of this technique.

    V = V(r), \qquad F = -\nabla V    (1.24)

    L = T - V = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\varphi}^2 \right) - V(r)    (1.25)

Let's choose our generalized coordinates to be q_1 = r and q_2 = φ. Equation (1.23) becomes

    m\ddot{r} - mr\dot{\varphi}^2 + \frac{dV}{dr} = 0    (1.26)

    \frac{d}{dt}\left( mr^2 \dot{\varphi} \right) = 0    (1.27)

This last equation tells us that there is a quantity mr²φ̇ that does not change with time. Such a quantity is said to be conserved. In this case we have rediscovered the conservation of angular momentum. This reduces the problem to one dimension.

    mr^2 \dot{\varphi} \equiv l_z = \text{constant}    (1.28)

    m\ddot{r} = \frac{l_z^2}{mr^3} - \frac{dV}{dr}    (1.29)
Since there are no constraints, the generalized forces are identical with the ordinary forces,

    \Im_\varphi = -\frac{dV}{d\varphi} = 0, \qquad \Im_r = -\frac{dV}{dr}.    (1.30)

This equation has an elegant closed-form solution in the special case of gravitational attraction.

    V = -\frac{GmM}{r} \equiv -\frac{k}{r}    (1.31)

    m\ddot{r} = \frac{l_z^2}{mr^3} - \frac{k}{r^2}    (1.32)

This apparently nonlinear equation yields to a simple trick: let u = 1/r and use φ rather than t as the independent variable.

    \frac{d^2 u}{d\varphi^2} + u = \frac{mk}{l_z^2}    (1.33)

If the motion is circular, u is constant. Otherwise it oscillates around the value mk/l_z² with simple harmonic motion.^2 The period of oscillation is identical with the period of rotation, so the corresponding orbit is an ellipse.
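Equation (1.33) is just a harmonic oscillator with a constant driving term, which SymPy solves directly. In this sketch (my own, not from the text) the symbol c stands for mk/l_z².

```python
import sympy as sp

phi = sp.symbols("phi")
c = sp.symbols("c", positive=True)   # plays the role of m*k/l_z**2
u = sp.Function("u")

# Equation (1.33): u'' + u = c, with phi as the independent variable
sol = sp.dsolve(sp.Eq(u(phi).diff(phi, 2) + u(phi), c), u(phi))
print(sol.rhs)  # the constant c plus simple harmonic terms of period 2*pi
```

Since u = 1/r returns to the same value every time φ increases by 2π, the orbit closes after one revolution; this is the sense in which the period of oscillation matches the period of rotation.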
This problem was easy to solve because we were able to discover a nontrivial quantity that was constant, in this case the angular momentum. The constant enabled us to reduce the number of independent variables from two to one. Such a conserved quantity is called an integral of the motion or a constant of the motion. Obviously, the more such quantities one can find, the easier the problem. This raises two practical problems. First, how can we tell, perhaps from looking at the physics of a problem, how many independent conserved quantities there are? Second, how are we to find them?

In the central force problem, both of these questions answered themselves. We know that angular momentum is conserved. This fact manifests itself in the Lagrangian in that L depends on φ̇ but not on φ. Such a coordinate is said to be cyclic or ignorable. Let q be such a coordinate. Then

    \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}} \right) = 0.    (1.34)

The quantity in brackets has a special significance. It is called the canonically conjugate momentum.^3

    \frac{\partial L}{\partial \dot{q}_k} \equiv p_k    (1.35)

^2 This illustrates a general principle in physics: When correctly viewed, everything is a harmonic oscillator.
^3 This notation is universally used, hence the old aphorism that mechanics is a matter of minding your p's and q's.
To summarize: if q is cyclic, p is conserved.

Suppose we had tried to do the central force problem in cartesian coordinates. Both x and y would appear in the Lagrangian, and neither p_x nor p_y would be constant. If we insisted on this, central forces would remain an intractable problem in two dimensions. We need to choose our generalized coordinates so that there are as many cyclic variables as possible. The two questions reemerge: how many are we entitled to, and how do we find the corresponding p's and q's?

A partial answer to the first is given by a well-known result called Noether's theorem: For every transformation that leaves the Lagrangian invariant there is a constant of the motion.^4 This theorem (which underlies all of modern particle physics) says that there is a fundamental connection between symmetries and invariance principles on one hand and conservation laws on the other. Momentum is conserved because the laws of physics are invariant under translation. Angular momentum is conserved because the laws of physics are invariant under rotation. Despite its fundamental significance, Noether's theorem is not much help in practical calculations. Granted, it gives a procedure for finding the conserved quantity after the corresponding symmetry transformation has been found, but how is one to find the transformation? The physicist must rely on his traditional tools: inspiration, the Ouija board, and simply pounding one's head against a wall. The fact remains that there are simple systems, e.g. the Hénon-Heiles problem to be discussed later, that have fascinated physicists for decades and for which the existence of these transformations is still controversial.

I will have much more to say about the second question. As you will see, there is a more or less "cookbook" procedure for finding the right set of variables and some fundamental results about the sorts of problems for which these procedures are possible.
1.5 The Hamiltonian Formulation

I will explain the Hamiltonian assuming that there is only one degree of freedom. It's easy to generalize once the basic ideas are clear. Lagrangians are functions of q and q̇. We define a new function of q and p (with p given by (1.35)),

    H(p, q) = p\dot{q} - L(q, \dot{q}).    (1.36)

The new function is called the Hamiltonian, and the transformation L → H is called a Legendre transformation. The equation is much more subtle than it looks. In fact, it's worth several pages of explanation.

^4 See Finch and Hand for a simple proof and further discussion.
It's clear from elementary mechanics that q, q̇, and p can't all be independent variables, since p = mq̇. You might say that there are two ways of formulating Newton's second law: a (q, q̇) formulation, F = mq̈, and a (q, p) formulation, F = ṗ. The connection between q and its canonically conjugate momentum is usually more complicated than this, but there is still a (q, q̇) formulation, the Lagrangian, and a (q, p) formulation, the Hamiltonian. The Legendre transformation is a procedure for transforming the one formulation into the other. The key point is that it is invertible.^5 To see what this means, let's first assume that q, q̇, and p are all independent.

    H(q, \dot{q}, p) = p\dot{q} - L(q, \dot{q})    (1.37)

    dH = \left( p - \frac{\partial L}{\partial \dot{q}} \right) d\dot{q} + \dot{q}\, dp - \frac{\partial L}{\partial q}\, dq    (1.38)

What is the condition that H not depend on q̇?

    p(q, \dot{q}) = \frac{\partial L(q, \dot{q})}{\partial \dot{q}}    (1.39)

OK. This is the definition of p anyhow, so we're on the right track.

    dH = \dot{q}\, dp - \frac{\partial L}{\partial q}\, dq

    dH = \frac{\partial H}{\partial p}\, dp + \frac{\partial H}{\partial q}\, dq

Comparing these two expressions for dH term by term gives

    \dot{q}(q, p) = \frac{\partial H}{\partial p}    (1.40)

    -\frac{\partial L}{\partial q} = \frac{\partial H}{\partial q}    (1.41)

Combining (1.23), (1.39), and (1.41) gives the fourth major result,

    \dot{p}(q, p) = -\frac{\partial H}{\partial q}.    (1.42)

^5 The following argument is taken from Finch & Hand.
Now here's what I mean when I say that Legendre transformations are invertible. First follow the steps from L → H. We start with L = L(q, q̇). Equation (1.39) gives p = p(q, q̇). Invert this to find q̇ = q̇(q, p). The Hamiltonian is now

    H(q, p) = \dot{q}(q, p)\, p - L[q, \dot{q}(q, p)].    (1.43)

Now suppose that we start from H = H(q, p). Use (1.40) to find q̇ = q̇(q, p). Invert to find p = p(q, q̇). Finally,

    L(q, \dot{q}) = \dot{q}\, p(q, \dot{q}) - H[q, p(q, \dot{q})].    (1.44)

In both cases we were able to complete the transformation without knowing ahead of time the functional relationship among q, q̇, and p. To summarize: equations (1.37), (1.39), and (1.40) enable us to transform between the (q, q̇) (Lagrangian) prescription and the (q, p) (Hamiltonian) prescription, while (1.40) and (1.42) are Hamilton's equations of motion.
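The two-way procedure can be traced concretely for a harmonic oscillator. The example below is my own (the oscillator is not the text's), sketched in SymPy: compute p from (1.39), invert for q̇(q, p), and assemble H via (1.43).

```python
import sympy as sp

q, qdot, p, m, k = sp.symbols("q qdot p m k", positive=True)

# An assumed example Lagrangian: the harmonic oscillator
L = sp.Rational(1, 2) * m * qdot**2 - sp.Rational(1, 2) * k * q**2

# (1.39): p = dL/d(qdot); then invert to get qdot as a function of (q, p)
p_of_qdot = sp.diff(L, qdot)
qdot_of_p = sp.solve(sp.Eq(p, p_of_qdot), qdot)[0]

# (1.43): H(q, p) = qdot(q, p) * p - L[q, qdot(q, p)]
H = sp.simplify(qdot_of_p * p - L.subs(qdot, qdot_of_p))
print(H)  # p**2/(2*m) + k*q**2/2
```

Running the same steps in reverse, starting from this H and using (1.40), recovers the original L, which is the invertibility claimed above.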
1.5.1 The Spherical Pendulum

A mass m hangs from a string of length R. The string makes an angle θ with the vertical and can rotate about the vertical with an angle φ.

    T = \frac{1}{2} mR^2 \left( \dot{\theta}^2 + \sin^2\theta\, \dot{\varphi}^2 \right)    (1.45)

    V = mgR(1 - \cos\theta)    (1.46)

The mgR constant doesn't appear in the equations of motion, so we can forget about it. The Lagrangian is L = T − V as usual.

    p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mR^2 \dot{\theta}    (1.47)

    p_\varphi = \frac{\partial L}{\partial \dot{\varphi}} = mR^2 \sin^2\theta\, \dot{\varphi} \equiv l_\varphi    (1.48)

The angle φ is cyclic, so p_φ = l_φ is constant. At this point we are still in the (q, q̇) prescription. Invert (1.47) and (1.48) to obtain θ̇ and φ̇ as functions of p_θ and l_φ.

    \dot{\theta} = p_\theta / mR^2    (1.49)

    \dot{\varphi} = l_\varphi / mR^2 \sin^2\theta    (1.50)

    H = \frac{p_\theta^2}{2mR^2} + \frac{l_\varphi^2}{2mR^2 \sin^2\theta} - mgR\cos\theta    (1.51)
The equations of motion follow from this.

    \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mR^2}    (1.52)

    \dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{l_\varphi^2 \cos\theta}{mR^2 \sin^3\theta} - mgR\sin\theta    (1.53)

    \dot{\varphi} = \frac{\partial H}{\partial p_\varphi} = \frac{l_\varphi}{mR^2 \sin^2\theta}    (1.54)

    \dot{p}_\varphi = 0    (1.55)
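Equations (1.52)-(1.55) are exactly what a numerical integrator wants. In the sketch below (parameter values are my own, chosen for illustration), p_φ is constant by (1.55), so only (θ, p_θ) need to be integrated, and conservation of H provides a consistency check.

```python
import numpy as np
from scipy.integrate import solve_ivp

m, g, R = 1.0, 9.8, 1.0   # assumed values
l_phi = 0.4               # conserved by (1.55)

def rhs(t, y):
    # y = [theta, p_theta]; phi decouples and follows from (1.54)
    theta, p_theta = y
    theta_dot = p_theta / (m * R**2)                       # (1.52)
    p_theta_dot = (l_phi**2 * np.cos(theta) / (m * R**2 * np.sin(theta)**3)
                   - m * g * R * np.sin(theta))            # (1.53)
    return [theta_dot, p_theta_dot]

def H(theta, p_theta):
    # The Hamiltonian (1.51), which should stay constant along the motion
    return (p_theta**2 / (2 * m * R**2)
            + l_phi**2 / (2 * m * R**2 * np.sin(theta)**2)
            - m * g * R * np.cos(theta))

y0 = [0.5, 0.0]
sol = solve_ivp(rhs, (0.0, 20.0), y0, rtol=1e-10, atol=1e-12)
E0, E1 = H(*y0), H(sol.y[0, -1], sol.y[1, -1])
print(abs(E1 - E0))  # energy drift of the integrator, very small
```

The centrifugal term in (1.53) keeps θ away from 0, so the integration is well behaved; the tiny energy drift is purely numerical.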
Suppose we were to try to find an analytic solution to this system of equations. First note that there are two constants of the motion, the angular momentum l_φ and the total energy E = H.

1. Invert (1.51) to obtain p_θ = p_θ(θ, E, l_φ).

2. Substitute p_θ into (1.49) and integrate:

       \int \frac{mR^2}{p_\theta}\, d\theta = t \equiv N(\theta)

   The integral is hopeless anyhow, so we label its output N(θ) (short for an exceedingly nasty function).

3. Invert the nasty function to find θ as a function of t.

4. Take the sine of this even nastier function and substitute it into (1.54) to find φ̇.

5. Integrate and invert to find φ as a function of t.

This makes sense in principle, but is wildly impossible in practice. Now suppose we could change the problem so that both θ and φ were cyclic, so that the two constants of motion were p_θ and p_φ (rather than E and p_φ). Then

    \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \omega_\theta, \qquad \theta = \omega_\theta t + \theta_0

    \dot{\varphi} = \frac{\partial H}{\partial p_\varphi} = \omega_\varphi, \qquad \varphi = \omega_\varphi t + \varphi_0

Here ω_θ and ω_φ are two constant "frequencies" that we could easily extract from the Hamiltonian. This apparently small change makes the problem trivial! In both cases there are two constants of motion: it makes all the difference which two constants. This is the basis of the idea we will be pursuing in the next chapter.
Chapter 2

Canonical Transformations

We saw at the end of the last chapter that a problem in which all the generalized coordinates are cyclic is trivial to solve. We also saw that there is great flexibility allowed in the choice of coordinates for any particular problem. It turns out that there is an important class of problems for which it is possible to choose the coordinates so that they are in fact all cyclic. The choice is usually far from obvious, but there is a formal procedure for finding the "magic" variables. One formulates the problem in terms of the natural p's and q's and then transforms to a new set of variables, usually called Q_k and P_k, that have the right properties.
2.1 Contact Transformations

The most general transformation is called a contact transformation.

$$Q_k = Q_k(q, p, t) \qquad P_k = P_k(q, p, t) \qquad (2.1)$$

(In this formula and what follows, the symbols p and q, when used as arguments, stand for the complete sets q_1, q_2, q_3, ..., etc.) There is a certain privileged class of transformations called canonical transformations that preserve the structure of Hamilton's equations of motion for all dynamical systems. This means that there is a new Hamiltonian function K(Q, P) for which the new equations of motion are
$$\dot Q_k = \frac{\partial K}{\partial P_k} \qquad \dot P_k = -\frac{\partial K}{\partial Q_k} \qquad (2.2)$$

In a footnote in Classical Mechanics, Goldstein suggested that K be called the Kamiltonian. The idea has caught on with several authors, and I will use it without further apology. The trick is to find it.
17
18 CHAPTER 2. CANONICAL TRANSFORMATIONS<br />
Theorem: Let F be any function of q_k and Q_k (and possibly p_k and P_k), as well as time. Then the new Lagrangian defined by

$$\bar L = L - \frac{dF}{dt} \qquad (2.3)$$

is equivalent to L in the sense that it yields the same equations of motion.

Proof:

$$\dot F = \sum_k \frac{\partial F}{\partial q_k}\dot q_k + \sum_k \frac{\partial F}{\partial Q_k}\dot Q_k + \frac{\partial F}{\partial t} \qquad (2.4)$$
$$\frac{d}{dt}\left(\frac{\partial \dot F}{\partial \dot q_k}\right) = \frac{d}{dt}\left(\frac{\partial F}{\partial q_k}\right) = \frac{\partial \dot F}{\partial q_k} \qquad \frac{d}{dt}\left(\frac{\partial \dot F}{\partial \dot Q_k}\right) = \frac{d}{dt}\left(\frac{\partial F}{\partial Q_k}\right) = \frac{\partial \dot F}{\partial Q_k}$$

These last two can be rewritten

$$\frac{d}{dt}\left(\frac{\partial \dot F}{\partial \dot q_k}\right) - \frac{\partial \dot F}{\partial q_k} = 0 \qquad \frac{d}{dt}\left(\frac{\partial \dot F}{\partial \dot Q_k}\right) - \frac{\partial \dot F}{\partial Q_k} = 0$$

So Ḟ satisfies Lagrange's equation whether we regard it as a function of q_k or Q_k. Obviously, if L satisfies Lagrange's equation, then so does L − Ḟ. (The conclusion is unchanged if F contains p_k and/or P_k.) The function F is called the generating function of the transformation.

K is obtained by a Legendre transformation just as H was.
$$K(Q, P) = \sum_k P_k \dot Q_k - \bar L(Q, \dot Q, t) \qquad (2.5)$$

This has the same form as (1.36), so the derivation of the equations of motion (1.39) through (1.42) is unchanged as well.

$$P_k = \frac{\partial \bar L}{\partial \dot Q_k} \qquad \dot Q_k = \frac{\partial K}{\partial P_k} \qquad \dot P_k = -\frac{\partial K}{\partial Q_k} \qquad (2.6)$$
These simple results provide the framework for canonical transformations. In order to use them we will need to know two more things: (1) how to find F, and, given F, (2) how to find the transformation (q, p) → (Q, P). We deal with (2) now and postpone (1) to later sections.

Consider the variables q, Q, p, and P. Any two of these constitute a complete set, so there are four kinds of generating functions, usually called F_1(q, Q, t), F_2(q, P, t), F_3(p, Q, t), and F_4(p, P, t). All four are discussed in Goldstein. F_1 provides a good introduction. Most of our work will make use of F_2.
Starting with F_1(q, Q), (2.3) becomes

$$\bar L(Q, \dot Q, t) = L(q, \dot q, t) - \frac{d}{dt}F_1(q, Q, t) \qquad (2.7)$$

Since

$$\frac{\partial \bar L}{\partial \dot q_k} = \frac{\partial L}{\partial \dot q_k} - \frac{\partial \dot F_1}{\partial \dot q_k} = 0,$$

we get, with the help of (2.4),

$$\frac{\partial \bar L}{\partial \dot q_k} = p_k - \frac{\partial F_1}{\partial q_k} = 0 \qquad \frac{\partial \bar L}{\partial \dot Q_k} = P_k = \frac{\partial L}{\partial \dot Q_k} - \frac{\partial \dot F_1}{\partial \dot Q_k} = -\frac{\partial F_1}{\partial Q_k}.$$

This yields the two transformation equations

$$p_k = \frac{\partial F_1}{\partial q_k} \qquad (2.8)$$

$$P_k = -\frac{\partial F_1}{\partial Q_k} \qquad (2.9)$$
A straightforward set of substitutions gives our final formula for the Kamiltonian.

$$K = \sum_k\left[-\frac{\partial F}{\partial Q_k}\dot Q_k\right] - L + \sum_k\left[\frac{\partial F}{\partial q_k}\dot q_k + \frac{\partial F}{\partial Q_k}\dot Q_k\right] + \frac{\partial F}{\partial t} = -L + \sum_k p_k \dot q_k + \frac{\partial F}{\partial t}$$

To be more explicit,

$$K(Q, P) = H(q(Q, P), p(Q, P), t) + \frac{\partial}{\partial t}F_1(q(Q, P), Q, t) \qquad (2.10)$$

Summary:
1. Here is the typical problem: we are given the Hamiltonian H = H(q, p) for some conservative system. H = E is constant, but the q's and p's change with time in a complicated way. Our goal is to find the functions q = q(t) and p = p(t) using the technique of canonical transformations.

2. We need to know the generating function F = F_1(q, Q). This is the hard part, and I'm postponing it as long as possible.

3. Substitute F into (2.8) and (2.9). This gives a set of coupled algebraic equations for q, Q, p, and P. They must be combined in such a way as to give q_k = q_k(Q, P) and p_k = p_k(Q, P).

4. Use (2.10) to find K. If we had the right generating function to start with, Q will be cyclic, i.e. K = K(P). The equations of motion are obtained from (2.6): Ṗ_k = 0 and Q̇_k = ω_k. The ω's are a set of constants, as are the P's, so Q_k(t) = ω_k t + α_k. The α's are constants obtained from the initial conditions.

5. Finally, q_k(t) = q_k(Q(t), P) and p_k(t) = p_k(Q(t), P).
2.1.1 The Harmonic Oscillator: Cracking a Peanut with a Sledgehammer

It's useful to try a new technique on an old problem.

$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}(p^2 + m^2\omega^2 q^2) \qquad (2.11)$$

As it turns out, the generating function is

$$F = \frac{m\omega q^2}{2}\cot Q \qquad (2.12)$$

The transformation is found from (2.8) and (2.9).

$$p = \frac{\partial F}{\partial q} = m\omega q \cot Q \qquad P = -\frac{\partial F}{\partial Q} = \frac{m\omega q^2}{2\sin^2 Q}$$

Solve for p and q in terms of P and Q and then substitute into (2.10) to find K.

$$q = \sqrt{\frac{2P}{m\omega}}\sin Q \qquad p = \sqrt{2Pm\omega}\cos Q$$
$$K = \omega P \qquad P = E/\omega$$

We have achieved our goal. Q is cyclic, and the equations of motion are trivial.

$$\dot Q = \frac{\partial K}{\partial P} = \omega \qquad Q = \omega t + Q_0 \qquad (2.13)$$

$$q = \sqrt{\frac{2E}{m\omega^2}}\sin(\omega t + Q_0) \qquad p = \sqrt{2mE}\cos(\omega t + Q_0) \qquad (2.14)$$
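These claims are easy to verify with a computer algebra system. A sketch of such a check (not in the original text; the symbols are as above): substituting q(Q, P) and p(Q, P) into H should collapse to K = ωP, and the transformation should have unit Jacobian determinant ∂(q, p)/∂(Q, P) = 1, a hallmark of a canonical transformation in one degree of freedom.

```python
import sympy as sp

m, w, Q, P = sp.symbols('m omega Q P', positive=True)

# the transformation generated by F = (m*w*q**2/2)*cot(Q), eq. (2.12)
q = sp.sqrt(2*P/(m*w)) * sp.sin(Q)
p = sp.sqrt(2*P*m*w) * sp.cos(Q)

# the Kamiltonian: H expressed in the new variables
H = (p**2 + m**2 * w**2 * q**2) / (2*m)
K = sp.simplify(H)                      # collapses to omega*P

# Jacobian determinant of (q, p) with respect to (Q, P)
jac = sp.simplify(sp.Matrix([[sp.diff(q, Q), sp.diff(q, P)],
                             [sp.diff(p, Q), sp.diff(p, P)]]).det())
```

Both results are exact: K − ωP and jac − 1 simplify to zero identically in m, ω, Q, and P.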
2.2 The Second Generating Function

There's an old recipe for tiger stew that begins, "First catch the tiger." In our quest for the tiger, we now turn our attention to the second generating function, F_2 = F_2(q, P, t). F_2 is obtained from F_1 by means of a Legendre transformation.¹

$$F_2(q, P) = F_1(q, Q) + \sum_k Q_k P_k \qquad (2.15)$$
We are looking for transformation equations analogous to (2.8) and (2.9). Since L = L̄ + Ḟ_1,

$$\sum_k p_k\dot q_k - H = \sum_k P_k\dot Q_k - K + \frac{d}{dt}\Big(F_2 - \sum_k Q_k P_k\Big) = -\sum_k Q_k\dot P_k - K + \dot F_2$$

Substitute

$$\dot F_2 = \sum_k\left[\frac{\partial F_2}{\partial q_k}\dot q_k + \frac{\partial F_2}{\partial P_k}\dot P_k\right] + \frac{\partial F_2}{\partial t}$$

$$-H = -K + \sum_k\left[\left(\frac{\partial F_2}{\partial q_k} - p_k\right)\dot q_k + \left(\frac{\partial F_2}{\partial P_k} - Q_k\right)\dot P_k\right] + \frac{\partial F_2}{\partial t}$$

We are working on the assumption that q̇ and Ṗ are now independent variables. We enforce this by requiring that

$$\frac{\partial F_2}{\partial q_k} = p_k \qquad (2.16)$$

$$\frac{\partial F_2}{\partial P_k} = Q_k \qquad (2.17)$$

$$K(Q, P) = H(q(Q, P), p(Q, P), t) + \frac{\partial}{\partial t}F_2(q(Q, P), P, t) \qquad (2.18)$$

¹ When in doubt, do a Legendre transformation.
2.3 Hamilton's Principal Function

The F_1 style generating functions were used to transform to a new set of variables (q, p) → (Q, P) such that all the Q's were cyclic. As a consequence, the P's were constants of the motion, and the Q's were linear functions of time. The generating function itself was hard to find, however. The F_2 generating function goes one step further; it can transform to a set of variables in which both the Q's and P's are constant and simple functions of the initial values of the phase space variables. In essence, our transformation is

$$(q(t), p(t)) \leftrightarrow (q_0, p_0)$$

This is a time-dependent transformation, of course. The fact that we can find such transformations shows that the time evolution of a system is itself a canonical transformation.
We look for an F_2 so that K in (2.18) is identically zero! Then from (2.6), Q̇_k = 0 and Ṗ_k = 0. The appropriate generating function will be a solution to

$$H(q, p, t) + \frac{\partial F_2}{\partial t} = 0 \qquad (2.19)$$

We eliminate p_k using (2.16):

$$H\left(q_1, \ldots, q_n; \frac{\partial F_2}{\partial q_1}, \ldots, \frac{\partial F_2}{\partial q_n}; t\right) + \frac{\partial F_2}{\partial t} = 0 \qquad (2.20)$$
The solution to this equation is usually called S, Hamilton's principal function. The equation itself is the Hamilton-Jacobi equation.²

There are two serious issues here: does it have a solution, and if it does, can we find it? We will take a less serious approach: if we can find a solution, then it most surely exists. Furthermore, if we can find it, it will have the form

$$S = \sum_k W_k(q_k) - \alpha t \qquad (2.21)$$

Partial differential equations that have solutions of the form (2.21) are said to be separable.³ Most of the familiar textbook problems in classical mechanics and atomic physics can be separated in this form. The question of separability does depend on the system of generalized coordinates used. For example, the Kepler problem is separable in spherical coordinates but not in Cartesian coordinates. It would be nice to know whether a particular

² See Goldstein, Classical Mechanics, Chapter 10.
³ Or, to be meticulous, completely separable.
Hamiltonian could be separated with some system of coordinates, but no completely general criterion is known.⁴ As a rule of thumb, Hamiltonians with explicit time dependence are not separable.
If our Hamiltonian is separable, then when (2.21) is substituted into (2.20), the result will look like

$$f_1\left(q_1, \frac{dW_1}{dq_1}\right) + f_2\left(q_2, \frac{dW_2}{dq_2}\right) + \cdots = \alpha \qquad (2.22)$$

Each function f_k is a function only of q_k and dW_k/dq_k. Since all the q's are independent, each function must be separately constant. This gives us a system of n independent, first-order, ordinary differential equations for the W_k's.

$$f_k\left(q_k, \frac{dW_k}{dq_k}\right) = \alpha_k \qquad (2.23)$$
The W's so obtained are then substituted into (2.21). The resulting function for S is

$$F_2 \equiv S = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_n; \alpha, t)$$

The final constant α is redundant for two reasons: first, Σ_k α_k = α, and second, the transformation equations (2.16) and (2.17) involve derivatives with respect to q_k and P_k. When S is so differentiated, the −αt piece will disappear. In order to make this apparent, we will write S as follows:

$$F_2 \equiv S = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_n; t) \qquad (2.24)$$

Since the F_2 generating functions have the form F_2(q, P), we are entitled to think of the α's as "momenta," i.e. α_k in (2.24) corresponds to P_k in (2.17).
In a way this makes sense. Our goal was to transform the time-dependent q's and p's into a new set of constant Q's and P's, and the α's are most certainly constant. On the other hand, they are not the initial momenta p_0 that evolve into p(t). The relationship between α and p_0 will be determined later.

If we have done our job correctly, the Q's given by (2.17) are also constant. They are traditionally called β, so

$$Q_k = \beta_k = \frac{\partial S(q, \alpha, t)}{\partial \alpha_k} \qquad (2.25)$$

Again, the β's are constant, but they are not equal to q_0.

We can turn this into a cookbook algorithm.

⁴ There is a very technical result, the so-called Staeckel conditions, which gives necessary and sufficient conditions for separability in orthogonal coordinate systems.
1. Substitute (2.21) into (2.20) and separate variables.

2. Integrate the resulting first-order ODE's. The result will be n independent functions W_k = W_k(q, α). Put the W_k's back into (2.21) to construct S = S(q, α, t).

3. Find the constant β coordinates using

$$\beta_k = \frac{\partial S}{\partial \alpha_k}$$

4. Invert these equations to find q_k = q_k(β, α, t).

5. Find the momenta with

$$p_k = \frac{\partial S}{\partial q_k}$$
2.3.1 The Harmonic Oscillator: Again

The harmonic oscillator provides an easy example of this procedure.

$$H = \frac{1}{2m}(p^2 + m^2\omega^2 q^2)$$

$$\frac{1}{2m}\left[\left(\frac{\partial S}{\partial q}\right)^2 + m^2\omega^2 q^2\right] + \frac{\partial S}{\partial t} = 0 \qquad (2.26)$$

$$\frac{1}{2m}\left[\left(\frac{\partial W}{\partial q}\right)^2 + m^2\omega^2 q^2\right] = \alpha \qquad (2.27)$$

Since there is only one q, the entire quantity on the left of the equal sign is a constant.

$$W(q, \alpha) = \sqrt{2m\alpha}\int dq\,\sqrt{1 - \frac{m\omega^2 q^2}{2\alpha}}$$

The new transformed constant "momentum" is P = α.

$$\beta = \frac{\partial S(q, \alpha, t)}{\partial \alpha} = \frac{\partial W(q, \alpha)}{\partial \alpha} - t = \frac{1}{\omega}\sin^{-1}\left[q\sqrt{\frac{m\omega^2}{2\alpha}}\right] - t \qquad (2.28)$$

Invert this equation to find q as a function of t and β.

$$q = \sqrt{\frac{2\alpha}{m\omega^2}}\sin(\omega t + \beta\omega)$$

Evidently, β has something to do with initial conditions: ωβ = ϕ_0, the initial phase angle.

$$p = \frac{\partial S}{\partial q} = \sqrt{2m\alpha - m^2\omega^2 q^2} = \sqrt{2m\alpha}\cos(\omega t + \phi_0)$$

The maximum value of p is √(2mE), so that makes sense too.
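A quick symbolic check (not in the original text) confirms that the q(t) and p(t) just found really do solve the problem: they satisfy Hamilton's equations for the oscillator and keep H pinned at the constant value α.

```python
import sympy as sp

t = sp.symbols('t')
m, w, alpha, phi0 = sp.symbols('m omega alpha phi_0', positive=True)

# the trajectory produced by the Hamilton-Jacobi procedure (omega*beta = phi_0)
q = sp.sqrt(2*alpha/(m*w**2)) * sp.sin(w*t + phi0)
p = sp.sqrt(2*m*alpha) * sp.cos(w*t + phi0)

H = (p**2 + m**2 * w**2 * q**2) / (2*m)

energy_check = sp.simplify(H - alpha)          # 0: H = alpha on the trajectory
eq1 = sp.simplify(sp.diff(q, t) - p/m)         # 0: qdot =  dH/dp
eq2 = sp.simplify(sp.diff(p, t) + m*w**2*q)    # 0: pdot = -dH/dq
```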
2.4 Hamilton's Characteristic Function

There is another way to use the F_2 generating function to turn a difficult problem into an easy one. In the previous section we chose F_2 = S = W − αt, so that K = 0. It is also possible to take F_2 = W(q) so that

$$K = H\left(q_k, \frac{\partial W}{\partial q_k}\right) = E = \alpha_1 \qquad (2.29)$$

The W obtained in this way is called Hamilton's characteristic function.

$$W = \sum_k W_k(q_k, \alpha_1, \ldots, \alpha_n) = W(q_1, \ldots, q_n, E, \alpha_2, \ldots, \alpha_n) = W(q_1, \ldots, q_n, \alpha_1, \ldots, \alpha_n) \qquad (2.30)$$
It generates a contact transformation with properties quite different from that generated by S. The equations of motion are

$$\dot P_k = -\frac{\partial K}{\partial Q_k} = 0 \qquad (2.31)$$

$$\dot Q_k = \frac{\partial K}{\partial P_k} = \frac{\partial K}{\partial \alpha_k} = \delta_{k1} \qquad (2.32)$$

The new feature is that Q̇_1 = 1, so Q_1 = t − t_0. In general

$$Q_k = \frac{\partial W}{\partial \alpha_k} = \beta_k \qquad (2.33)$$

but now β_1 = t − t_0.

$$p_k = \frac{\partial W}{\partial q_k} \qquad (2.34)$$

as before.

The algorithm now works like this:
1. Substitute (2.30) into (2.29) and separate variables.

2. Integrate the resulting first-order ODE's. The result will be n independent functions W_k = W_k(q, α). Put the W_k's back into (2.30) to construct W = W(q, α).

3. Find the constant β coordinates using

$$\beta_k = \frac{\partial W}{\partial \alpha_k} \qquad (2.35)$$

Remember that β_1 = t − t_0.

4. Invert these equations to find q_k = q_k(β, α, t).

5. Find the momenta with

$$p_k = \frac{\partial W}{\partial q_k} \qquad (2.36)$$

2.4.1 Examples
Problems with one degree of freedom are virtually identical whether they are formulated in terms of the characteristic function or the principal function. Take, for example, the harmonic oscillator from the previous section. Equation (2.28) becomes

$$\beta = \frac{\partial W(q, \alpha)}{\partial \alpha} = \frac{1}{\omega}\sin^{-1}\left[q\sqrt{\frac{m\omega^2}{2\alpha}}\right] = t - t_0$$

$$q = \sqrt{\frac{2\alpha}{m\omega^2}}\sin[\omega(t - t_0)] \qquad (2.37)$$
The following problem raises some new issues.

Consider a particle in a stable orbit in a central potential. The motion will lie in a plane, so we can do the problem in two dimensions.

$$H = \frac{1}{2m}\left(p_r^2 + \frac{p_\psi^2}{r^2}\right) + V(r) \qquad (2.38)$$

p_ψ = mr²ψ̇ is the angular momentum. It is conserved since ψ is cyclic.

$$\frac{1}{2m}\left[\left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial \psi}\right)^2\right] + V(r) = \alpha_1 \qquad (2.39)$$
$$\left[r^2\left(\frac{dW_r}{dr}\right)^2 + 2mr^2 V(r) - 2m\alpha_1 r^2\right] + \left(\frac{dW_\psi}{d\psi}\right)^2 = 0 \qquad (2.40)$$

At this point we notice that ∂W/∂ψ = p_ψ, which we know is constant. Why not call it something like α_ψ? Then W_ψ = α_ψ ψ. This is worth stating as a general principle: if q is cyclic, W_q = α_q q, where α_q is one of the n constant α's appearing in (2.30).

$$W = \int dr\,\sqrt{2m(\alpha_1 - V) - \alpha_\psi^2/r^2} + \alpha_\psi \psi \qquad (2.41)$$
We can find r as a function of time by inverting the equation for β_1, just as we did in (2.37), but more to the point,

$$\beta_\psi = \frac{\partial W}{\partial \alpha_\psi} = -\int \frac{\alpha_\psi\,dr}{r^2\sqrt{2m(\alpha_1 - V) - \alpha_\psi^2/r^2}} + \psi \qquad (2.42)$$

Make the usual substitution, u = 1/r.

$$\psi - \beta_\psi = -\int \frac{du}{\sqrt{2m(\alpha_1 - V(r))/\alpha_\psi^2 - u^2}} \qquad (2.43)$$

This is a new kind of equation of motion, which gives ψ = ψ(r) or r = r(ψ) (assuming we can do the integral), i.e. there is no explicit time dependence. Such equations are called orbit equations. Often it will be more useful to have the equations in this form, when we are concerned with the geometric properties of the trajectories.
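For the Kepler problem, V(r) = −k/r, the integral in (2.43) is elementary; here is a sketch (not in the original text; k denotes the strength of the attractive potential). With u = 1/r the radicand becomes a quadratic in u, and completing the square turns the integral into an inverse cosine:

```latex
\frac{1}{r} \;=\; \frac{mk}{\alpha_\psi^2}\Bigl(1 + e\cos(\psi - \psi_0)\Bigr),
\qquad
e \;=\; \sqrt{1 + \frac{2\alpha_1\,\alpha_\psi^2}{mk^2}}
```

This is a conic section: an ellipse for α_1 < 0 (e < 1), a parabola for α_1 = 0, and a hyperbola for α_1 > 0, which is the standard result for the Kepler orbit with energy α_1 and angular momentum α_ψ.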
2.5 Action-Angle Variables

We are pursuing a route to chaos that begins with periodic or quasi-periodic systems. A particularly elegant approach to these systems makes use of a variant of Hamilton's characteristic function. In this technique, the integration constants α_k appearing directly in the solution of the Hamilton-Jacobi equation are not themselves chosen to be the new momenta. Instead, we define a set of constants I_k, which form a set of n independent functions of the α's known as action variables. The coordinates conjugate to the I's are angles that increase linearly with time. You are familiar with a system that behaves just like this, the harmonic oscillator!

$$q = \sqrt{\frac{2E}{k}}\sin\psi \qquad p = \sqrt{2mE}\cos\psi$$

where ψ = ωt + ψ_0. In the language of action-angle variables I = E/ω, so

$$q = \sqrt{\frac{2I}{m\omega}}\sin\psi \qquad p = \sqrt{2mI\omega}\cos\psi$$

I is the "momentum" conjugate to the "coordinate" ψ.
Action-angle variables are only appropriate to periodic motion, and there are other restrictions we will learn as we go along, but within these limitations, all systems can be transformed into a set of uncoupled harmonic oscillators.⁵ To see what "periodic motion" implies, have a look at the simple pendulum.

$$H = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta = E = \alpha \qquad (2.44)$$

$$p_\theta = \pm\sqrt{2ml^2(E + mgl\cos\theta)} \qquad (2.45)$$

There are two kinds of motion possible. If E is small, the pendulum will reverse at the points where p_θ = 0. The motion is called libration, i.e. bounded and periodic. If E is large enough, however, the pendulum will swing around a complete circle. Such motion is called rotation (obviously). There is a critical value E = mgl for which, in principle, the pendulum could stand straight up motionless at θ = π. An orbit in p_θ-θ phase space corresponding to this energy forms the dividing line between the two kinds of motion. Such a trajectory is called a separatrix.
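The libration action integral defined in (2.46) can be evaluated numerically for the pendulum. A sketch (the parameter values are illustrative, not from the text): for energies just above the bottom of the well the pendulum is effectively a harmonic oscillator with ω_0 = √(g/l), so I(E) should approach (E + mgl)/ω_0 in that limit.

```python
import math

m, g, l = 1.0, 9.81, 1.0   # illustrative values

def action(E, n=200_000):
    """I(E) = (1/2pi) * loop integral of p_theta over one libration cycle.

    The cycle runs -theta_max..theta_max and back, so the loop integral is
    4 times the integral from 0 to theta_max (midpoint rule below)."""
    theta_max = math.acos(-E / (m*g*l))        # turning point: p_theta = 0
    h = theta_max / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        total += math.sqrt(2*m*l*l*(E + m*g*l*math.cos(th))) * h
    return 4.0 * total / (2.0 * math.pi)

eps = 0.1                       # energy measured from the bottom of the well
I = action(-m*g*l + eps)
w0 = math.sqrt(g / l)
# small-oscillation limit: I -> eps / w0, with a tiny anharmonic correction
```

As eps grows toward 2mgl the orbit approaches the separatrix and I(E) deviates strongly from the harmonic value.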
For either type of periodic motion, we can introduce a new variable I designed to replace α as the new constant momentum.

$$I(\alpha) = \frac{1}{2\pi}\oint p(q, \alpha)\,dq \qquad (2.46)$$

This is a definite integral taken over a complete period of libration or rotation.⁶ I will prove (1) that the angle ψ conjugate to I is cyclic, and (2) that Δψ = 2π corresponds to one complete cycle of the periodic motion.

1. Since I = I(α) and H = α, it follows that H is a function of I only: H = H(I).

$$\dot I = -\frac{\partial H}{\partial \psi} = 0 \qquad \dot\psi = \frac{\partial H}{\partial I} = \omega(I) \qquad (2.47)$$

⁵ When correctly viewed, everything is a harmonic oscillator.
⁶ Textbooks are about equally divided on whether to call the action I or J and whether or not to include the factor 1/2π.
2. We are using an F_2 type generating function, which is a function of the old coordinate and new momentum. Hamilton's characteristic function can be written as

$$W = W(q, I) \qquad (2.48)$$

The transformation equations are

$$\psi = \frac{\partial W}{\partial I} \qquad p = \frac{\partial W}{\partial q} \qquad (2.49)$$

Note that

$$\frac{\partial \psi}{\partial q} = \frac{\partial}{\partial I}\left(\frac{\partial W}{\partial q}\right)$$

so

$$\oint d\psi = \oint \frac{\partial \psi}{\partial q}\,dq = \oint \frac{\partial}{\partial I}\left(\frac{\partial W}{\partial q}\right)dq = \frac{\partial}{\partial I}\oint p\,dq = \frac{\partial}{\partial I}(2\pi I) = 2\pi.$$

2.5.1 The harmonic oscillator (for the last time)

$$H = \frac{1}{2m}(p^2 + m^2\omega^2 q^2)$$

$$p = \pm\sqrt{2mE - m^2\omega^2 q^2}$$

$$I = \frac{1}{2\pi}\oint\sqrt{2mE - m^2\omega^2 q^2}\,dq$$
The integral is tricky in this form because p changes sign at the turning points. We won't have to worry about this if we make the substitution

$$q = \sqrt{\frac{2E}{m\omega^2}}\sin\psi \qquad (2.50)$$

This substitution not only makes the integral easy and takes care of the sign change, it also makes clear the meaning of an integral over a complete cycle, i.e. ψ goes from 0 to 2π.

$$I = \frac{E}{\pi\omega}\oint\cos^2\psi\,d\psi = E/\omega$$

From this point of view the introduction of ψ at (2.50) seems nothing more than a mathematical trick. We would have stumbled on it eventually,
however, as the following argument shows. The Hamilton-Jacobi equation is

$$\frac{1}{2m}\left[\left(\frac{dW}{dq}\right)^2 + m^2\omega^2 q^2\right] = E$$

$$W = \int\sqrt{2mI\omega - m^2\omega^2 q^2}\,dq$$

$$\frac{\partial W}{\partial I} = m\omega\int\frac{dq}{\sqrt{2mI\omega - m^2\omega^2 q^2}} = \sin^{-1}\left(q\sqrt{\frac{m\omega}{2I}}\right) = \psi + \psi_0$$

$$q = \sqrt{\frac{2E}{m\omega^2}}\sin(\psi + \psi_0)$$

In the last equation ψ_0 appears as an integration constant (from the unspecified lower limit in W). Evidently, ψ is the angle variable conjugate to I.
In summary, to use action-angle variables for problems with one degree of freedom:

1. Find p as a function of E = α and q.
2. Calculate I(E) using (2.46).
3. Solve the Hamilton-Jacobi equation to find W = W(q, I).
4. Find ψ = ψ(q, I) using (2.49).
5. Invert this equation to get q = q(I, ψ).
6. Use (2.47) to get ω(I).
7. Calculate p = p(I, q) from (2.49).

One attractive feature of this scheme is that you can find the frequency without using the characteristic function and without finding the equations of motion. The phase space plot is particularly important. Use polar coordinates (what else) for (I, ψ). Every trajectory, whatever the system, is a circle!
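The claim that the frequency comes out without solving the equations of motion can be illustrated numerically. A sketch (the quartic oscillator H = p²/2 + q⁴/4 is my example, not the text's): compute I(E) by quadrature, get ω(I) = dE/dI by a finite difference, and check it against the period T = ∮ dq/q̇ computed directly. The exact identity dI/dE = T/2π means ωT should come out 2π.

```python
import math

def cycle_integrals(E, n=20_000):
    """For H = p**2/2 + q**4/4 (m = 1), return (I, T) over one full cycle.

    Substituting q = A*sin(phi) with A = (4E)**(1/4) removes the
    turning-point singularity: p = sqrt(2E)*cos(phi)*sqrt(1 + sin(phi)**2)."""
    A = (4.0 * E) ** 0.25
    h = (math.pi / 2) / n
    I = T = 0.0
    for i in range(n):
        phi = (i + 0.5) * h
        c, s = math.cos(phi), math.sin(phi)
        p = math.sqrt(2.0 * E) * c * math.sqrt(1.0 + s * s)
        dq = A * c * h
        I += p * dq              # a quarter of the loop integral of p dq
        T += dq / p              # a quarter of T = loop integral of dq/qdot
    return 2.0 * I / math.pi, 4.0 * T   # I = (1/2pi)*4*(quarter), T = 4*(quarter)

E, dE = 1.0, 1e-4
I_plus, _ = cycle_integrals(E + dE)
I_minus, _ = cycle_integrals(E - dE)
omega = 2 * dE / (I_plus - I_minus)     # omega(I) = dE/dI, step 6 of the recipe
_, T = cycle_integrals(E)
```

No generating function and no q(t) were needed: the frequency of a strongly anharmonic oscillator drops out of a single definite integral.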
Our derivation was based on the following assumptions: (1) the system had one degree of freedom; (2) energy was conserved and the Hamiltonian had no explicit time dependence; (3) the motion was periodic. Every such system is, at heart, a harmonic oscillator. Phase space trajectories are circles. The frequency can be found with a few deft moves. From a philosophical point of view (and we will be getting deeper and deeper into philosophy as these lectures proceed), problems in this category are "as good as solved"; nothing more needs to be said about them. The same is definitely not true with more than one degree of freedom. I will take a paragraph to generalize before going on to some more abstract developments.
We must assume that the system is separable, so

$$W(q_1, \ldots, q_n, \alpha_1, \ldots, \alpha_n) = \sum_k W_k(q_k, \alpha_1, \ldots, \alpha_n) \qquad (2.51)$$

$$p_k = \frac{\partial}{\partial q_k}W_k(q_k, \alpha_1, \ldots, \alpha_n) \qquad (2.52)$$

$$I_k = \frac{1}{2\pi}\oint p_k(q_k, \alpha_1, \ldots, \alpha_n)\,dq_k \qquad (2.53)$$

Next find all the α's as functions of the I's and substitute into W.

$$W = W(q_1, \ldots, q_n; I_1, \ldots, I_n)$$

Finally

$$\psi_k = \frac{\partial W}{\partial I_k} \qquad \dot I_k = 0 \qquad \dot\psi_k = \frac{\partial H}{\partial I_k} = \omega_k \qquad (2.54)$$
Chapter 3

Abstract Transformation Theory

So, one-dimensional problems are simple. Given the restrictions listed in the previous section, their phase space trajectories are circles. How does this generalize to problems with two or more degrees of freedom? A brief answer is that, given a number of conditions that we must discuss carefully, the phase space trajectories of a system with n degrees of freedom move on the surface of an n-dimensional torus embedded in 2n-dimensional space. The final answer is a donut! In order to prove this remarkable assertion and understand the conditions that must be satisfied, we must slog through a lot of technical material about transformations in general.
3.1 Notation

Our first job is to devise some compact notation for dealing with higher dimensional spaces. I will show you the notation in one dimension. It will then be easy to generalize. Recall Hamilton's equations of motion.

$$\dot p = -\frac{\partial H}{\partial q} \qquad \dot q = \frac{\partial H}{\partial p}$$

We will turn this into a vector equation.

$$\eta = \begin{pmatrix} q \\ p \end{pmatrix} \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \qquad \nabla = \begin{pmatrix} \partial/\partial q \\ \partial/\partial p \end{pmatrix} \qquad (3.1)$$

The equations of motion in vector form are

$$\dot\eta = J\cdot\nabla H \qquad (3.2)$$
J is not a vector, of course. Sometimes an array used in this way is called a dyadic. At any rate, this is just shorthand for matrix multiplication, i.e.

$$\begin{pmatrix} \dot q \\ \dot p \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} \partial H/\partial q \\ \partial H/\partial p \end{pmatrix}$$

The structure of J is important. Notice that it does two things: it exchanges p and q, and it changes one sign. This is called a symplectic transformation. I want to explore the connection between canonical transformations and symplectic transformations.
I'll start with the generic canonical transformation, (q, p) → (Q, P). How do the velocities transform? Define

$$\begin{pmatrix} \dot Q \\ \dot P \end{pmatrix} = \begin{pmatrix} \partial Q/\partial q & \partial Q/\partial p \\ \partial P/\partial q & \partial P/\partial p \end{pmatrix}\begin{pmatrix} \dot q \\ \dot p \end{pmatrix} \qquad (3.3)$$

Using the notation

$$M = \begin{pmatrix} \partial Q/\partial q & \partial Q/\partial p \\ \partial P/\partial q & \partial P/\partial p \end{pmatrix} \qquad (3.4)$$

$$\zeta = \begin{pmatrix} Q \\ P \end{pmatrix} \qquad (3.5)$$

this can be written

$$\dot\zeta = M\cdot\dot\eta = M\cdot J\cdot\nabla H \qquad (3.6)$$
The gradient operator differentiates H with respect to q and p. These derivatives transform, e.g.

$$\frac{\partial H}{\partial q} = \frac{\partial H}{\partial Q}\frac{\partial Q}{\partial q} + \frac{\partial H}{\partial P}\frac{\partial P}{\partial q}$$

consequently

$$\nabla_{(q,p)}H = M^T\cdot\nabla_{(Q,P)}H \qquad (3.7)$$

The T stands for transpose, of course. Combining (3.6) and (3.7),

$$\dot\zeta = M\cdot J\cdot M^T\cdot\nabla_{(Q,P)}H \qquad (3.8)$$

but if the transformation is canonical,

$$\dot\zeta = J\cdot\nabla_{(Q,P)}H \qquad (3.9)$$

Combining (3.8) and (3.9):

$$J = M\cdot J\cdot M^T \qquad (3.10)$$
Those of you who have studied special rel<strong>at</strong>ivity should find (??) congenial.<br />
Remember the definition of a Lorentz transform<strong>at</strong>ion: any 4×4 m<strong>at</strong>rix<br />
Λ th<strong>at</strong> s<strong>at</strong>isfies<br />
g = Λ · g · Λ T<br />
(3.11)<br />
is a Lorentz transform<strong>at</strong>ion. 1 The m<strong>at</strong>rix<br />
⎛<br />
1 0 0<br />
⎞<br />
0<br />
⎜<br />
g = ⎜ 0<br />
⎝ 0<br />
−1<br />
0<br />
0<br />
−1<br />
0 ⎟<br />
0 ⎠<br />
0 0 0 −1<br />
(3.12)<br />
is called the metric or metric tensor. Forgive me for exagger<strong>at</strong>ing slightly:<br />
everything there is to know about special rel<strong>at</strong>ivity flows out of (3.11). We<br />
say th<strong>at</strong> Lorentz transform<strong>at</strong>ions “preserve the metric,” i.e. leave the metric<br />
invariant. The geometry of space and time is encapsulated in (3.12). By the
same token, canonical transform<strong>at</strong>ions preserve the metric J. The geometry<br />
of phase space is encapsul<strong>at</strong>ed in the definition of J. Since J is symplectic,<br />
canonical transformations are symplectic transformations: they preserve the
symplectic metric.<br />
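To make (3.10) concrete, here is a numerical sketch (the sample transformation, the test point, and the tolerances are my own choices, not from the text). It builds the Jacobian M of a canonical transformation by central differences and checks that M · J · Mᵀ = J:

```python
import numpy as np

def jacobian(f, x, h=1e-6):
    """Jacobian of f: R^2 -> R^2 by central differences."""
    J = np.zeros((2, 2))
    for j in range(2):
        dx = np.zeros(2)
        dx[j] = h
        J[:, j] = (f(x + dx) - f(x - dx)) / (2.0 * h)
    return J

# Symplectic metric for one degree of freedom
Jmetric = np.array([[0.0, 1.0], [-1.0, 0.0]])

# A sample canonical transformation: Q = q^2, P = p/(2q)
# (generated by F2 = q^2 P; my choice, not from the text)
def transform(x):
    q, p = x
    return np.array([q**2, p / (2.0 * q)])

M = jacobian(transform, np.array([1.3, 0.7]))
# Canonical <=> symplectic, eq. (3.10): M . J . M^T = J
print(np.allclose(M @ Jmetric @ M.T, Jmetric, atol=1e-8))  # True
```

A map that is not canonical, for example Q = q², P = p, fails the same check, so this makes a handy numerical diagnostic.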
Equation (3.10) is the starting point for the modern approach to mechanics
th<strong>at</strong> uses the tools of Lie group theory. I will only mention in passing<br />
some points of contact with group theory. Both Goldstein’s and Schenk’s<br />
texts have much more on the subject.<br />
3.1.1 Poisson Brackets<br />
Equ<strong>at</strong>ion (3.10) is really shorthand for four equ<strong>at</strong>ions, e.g.<br />
$$\frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial P}{\partial q}\frac{\partial Q}{\partial p} = 1 \tag{3.13}$$
This combination of derivatives is called a Poisson bracket. The usual notation is
$$\frac{\partial X}{\partial q}\frac{\partial Y}{\partial p} - \frac{\partial X}{\partial p}\frac{\partial Y}{\partial q} \equiv [X, Y]_{q,p} \tag{3.14}$$
Then (3.13) becomes
[Q, P ]q,p = 1 (3.15)<br />
1 It is not a good idea to use m<strong>at</strong>rix not<strong>at</strong>ion in rel<strong>at</strong>ivity because of the ambiguity<br />
inherent in covariant and contravariant indices. Normally one would write (3.11) using
tensor not<strong>at</strong>ion.
This together with the trivially true<br />
[q, p]q,p = 1 (3.16)<br />
are called the fundamental Poisson brackets. We conclude th<strong>at</strong> canonical<br />
transform<strong>at</strong>ions leave the fundamental Poisson brackets invariant. It turns<br />
out th<strong>at</strong> all Poisson brackets have the same value when evalu<strong>at</strong>ed with respect<br />
to any canonical set of variables. This assertion requires some proof,<br />
however. I will start by generalizing to n dimensions.<br />
$$\eta = \begin{pmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \\ p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix} \qquad J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix} \qquad \nabla = \begin{pmatrix} \partial/\partial q_1 \\ \partial/\partial q_2 \\ \vdots \\ \partial/\partial q_n \\ \partial/\partial p_1 \\ \partial/\partial p_2 \\ \vdots \\ \partial/\partial p_n \end{pmatrix} \tag{3.17}$$
The symbol $I_n$ is the n × n unit matrix. With this notation, (3.14) becomes
$$[X, Y]_\eta \equiv \sum_k \left( \frac{\partial X}{\partial q_k}\frac{\partial Y}{\partial p_k} - \frac{\partial X}{\partial p_k}\frac{\partial Y}{\partial q_k} \right), \tag{3.18}$$
or in matrix notation
$$[X, Y]_\eta = (\nabla_\eta X)^T \cdot J \cdot \nabla_\eta Y. \tag{3.19}$$
The following should look familiar:
[qi, qk] = [pi, pk] = 0 [qi, pk] = δik<br />
These are, of course, the commut<strong>at</strong>ion rel<strong>at</strong>ions for position and momentum<br />
oper<strong>at</strong>ors in quantum mechanics. The resemblance is not accidental. The<br />
operator formulation of quantum mechanics grew out of the Poisson bracket
formulation of classical mechanics. This development is reviewed in all the
standard texts. In m<strong>at</strong>rix not<strong>at</strong>ion<br />
[η, η]η = [ζ, ζ]η = J (3.20)<br />
The Poisson bracket of two vectors is itself a matrix, i.e.
[X, Y ]ij ≡ [Xi, Yj] (3.21)
The proof of the above assertion is straightforward.<br />
$$\nabla_\eta Y = M^T \cdot \nabla_\zeta Y$$
$$(\nabla_\eta X)^T = (M^T \cdot \nabla_\zeta X)^T = (\nabla_\zeta X)^T \cdot M$$
$$[X, Y]_\eta = (\nabla_\zeta X)^T \cdot M \cdot J \cdot M^T \cdot \nabla_\zeta Y = (\nabla_\zeta X)^T \cdot J \cdot \nabla_\zeta Y = [X, Y]_\zeta$$
The last step makes use of (3.10). The invariance of the Poisson brackets is<br />
a non-trivial consequence of the symplectic n<strong>at</strong>ure of canonical transform<strong>at</strong>ions.<br />
From now on we will not bother with the subscripts on the Poisson
brackets.<br />
Here is another similarity with quantum mechanics. Let f be any function of canonical variables.
$$\dot f = \frac{df}{dt} = \sum_k\left(\frac{\partial f}{\partial q_k}\dot q_k + \frac{\partial f}{\partial p_k}\dot p_k\right) + \frac{\partial f}{\partial t} = \sum_k\left(\frac{\partial f}{\partial q_k}\frac{\partial H}{\partial p_k} - \frac{\partial f}{\partial p_k}\frac{\partial H}{\partial q_k}\right) + \frac{\partial f}{\partial t} = [f, H] + \frac{\partial f}{\partial t} \tag{3.22}$$
This looks like Heisenberg’s equ<strong>at</strong>ion of motion. For our purposes it means<br />
th<strong>at</strong> if f doesn’t depend on time explicitly and if [f, H] = 0, then f is a<br />
constant of the motion. We can use (3.22) to test if our favorite function is<br />
in fact constant, and we can also use it to construct new constants as the<br />
following argument shows.<br />
Let f, g, and h be arbitrary functions of canonical variables. The following<br />
Jacobi identity is just a m<strong>at</strong>ter of algebra.<br />
[f, [g, h]] + [g, [h, f]] + [h, [f, g]] = 0 (3.23)<br />
Now suppose h = H, the Hamiltonian, and f and g are constants of the<br />
motion. Then<br />
[H, [f, g]] = 0<br />
Consequence: If f and g are constants of the motion, then so is [f, g].<br />
This should make us uneasy. Take any two constants. Well, maybe they<br />
commute, but if not, then we have three constants. Commute the new<br />
constant with f and g and get two more constants, etc. How many constants
are we entitled to, anyway? This is a deep question, which has something
to do with the notion of involution. I’ll get to th<strong>at</strong> l<strong>at</strong>er.
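The algebra behind (3.23) can be spot-checked numerically. This is a sketch using central-difference Poisson brackets; the three polynomial functions (the third is named k to avoid clashing with the step size) and the test point are arbitrary choices of mine:

```python
def pb(X, Y, q, p, h=1e-4):
    """Poisson bracket [X, Y] at (q, p) by central differences, eq. (3.14)."""
    dXdq = (X(q + h, p) - X(q - h, p)) / (2.0 * h)
    dXdp = (X(q, p + h) - X(q, p - h)) / (2.0 * h)
    dYdq = (Y(q + h, p) - Y(q - h, p)) / (2.0 * h)
    dYdp = (Y(q, p + h) - Y(q, p - h)) / (2.0 * h)
    return dXdq * dYdp - dXdp * dYdq

# Three arbitrary polynomial functions of my own choosing
f = lambda q, p: q**2 * p
g = lambda q, p: q + p**2
k = lambda q, p: q * p        # plays the role of "h" in (3.23)

q0, p0 = 0.7, 1.2
# The cyclic sum [f,[g,k]] + [g,[k,f]] + [k,[f,g]] should vanish
total = (pb(f, lambda q, p: pb(g, k, q, p), q0, p0)
         + pb(g, lambda q, p: pb(k, f, q, p), q0, p0)
         + pb(k, lambda q, p: pb(f, g, q, p), q0, p0))
print(abs(total) < 1e-5)  # True
```

Nested finite differences are noisy, which is why the comparison tolerance is deliberately loose.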
3.2 Geometry in n Dimensions: The Hairy Ball<br />
Have another look <strong>at</strong> equ<strong>at</strong>ion (3.2). Let’s call ˙η a velocity field. By this<br />
I mean th<strong>at</strong> it associ<strong>at</strong>es a complete set of ˙q’s and ˙p’s with each point in<br />
phase space. In wh<strong>at</strong> direction does ˙η point? This is an easy question in<br />
one dimension; ˙η evalu<strong>at</strong>ed <strong>at</strong> the point P points in the direction tangent<br />
to the trajectory through P . Since trajectories can’t cross in phase space,<br />
there is only one trajectory through P , and the direction is unambiguous.<br />
If we use action-angle variables, the trajectory is a circle, and ˙η is what,
in Ph211, we would call a tangential velocity. The same is true, no doubt, for
n > 1, but how do these circles fit together? How does one visualize this in<br />
higher dimensions?<br />
The answer, as I have mentioned before, is th<strong>at</strong> the trajectories all lie<br />
on the surface of an n dimensional torus imbedded in 2n dimensional space.<br />
Your ordinary breakfast donut is a two dimensional torus imbedded in three<br />
dimensional space. 2 This is easy to visualize, so let’s limit the discussion<br />
to two degrees of freedom for the time being. The step from one degree of<br />
freedom to two involves some profound new ideas. The step from two to<br />
higher dimension is mostly a m<strong>at</strong>ter of m<strong>at</strong>hem<strong>at</strong>ical generaliz<strong>at</strong>ion.<br />
Since we are dealing with conserv<strong>at</strong>ive systems, the trajectories are limited<br />
by the conserv<strong>at</strong>ion of energy, i.e. H(q1, q2; p1, p2) = E is an equ<strong>at</strong>ion<br />
of constraint. The trajectories move on a manifold with three independent<br />
variables. Now the gradient of a function has a well defined geometrical<br />
significance: <strong>at</strong> the point P , ∇f points in a direction perpendicular to the<br />
surface or contour of constant f through P . In this case ∇H is a four component<br />
vector perpendicular to the surface of constant energy. Unfortun<strong>at</strong>ely,<br />
˙η points in the direction of J · ∇H. Wh<strong>at</strong> direction is th<strong>at</strong>? Well,<br />
(∇H) T · J · ∇H = [H, H] = 0<br />
Consequently, J · ∇H points in a direction perpendicular to ∇H, which is<br />
perpendicular to the plane of constant H, i.e. ˙η lies somewhere on the three<br />
dimensional surface of constant H.<br />
We could have guessed th<strong>at</strong> ahead of time, of course, but we can take the<br />
argument further. H is probably not the only constant of motion. Suppose<br />
there are others; call them F , G, etc. For each of these constants we can<br />
2 The word “dimension” gets used in two different ways. When we talk about physical<br />
systems, Lagrangians, Hamiltonians, etc., the dimension is equal to the number of degrees<br />
of freedom. Here I am using dimension to mean the number of independent variables<br />
required to describe the system.
construct a vector field using (3.2).
˙ηF = J · ∇F<br />
˙ηG = J · ∇G<br />
· · · etc. · · ·<br />
How many such fields can we construct th<strong>at</strong> are independent of one another?<br />
To put it another way, how many independent constants of motion are there?<br />
Th<strong>at</strong>’s a good question – wh<strong>at</strong> do you mean by “independent”? The answer<br />
comes from differential geometry. I’m afraid I can only give a hand-waving<br />
introduction to it. There are two rel<strong>at</strong>ed requirements:<br />
1. Suppose F and G are independent constants of motion. Take any<br />
trajectory from the manifold of constant F and another from constant<br />
G. There is no continuous canonical transform<strong>at</strong>ion th<strong>at</strong> maps the one<br />
trajectory into another.<br />
2. For each point P in space there must be one unique trajectory th<strong>at</strong> lies<br />
in the plane of constant F and simultaneously in the plane of constant<br />
G.<br />
Think about this last requirement in the case where there are two degrees<br />
of freedom and two independent constants of motion. The trajectories must<br />
lie on a two-dimensional surface. If we use action-angle variables, the
trajectories are circles. This sounds like a globe of the earth. Trajectories<br />
with constant ϕ are called longitudes, lines of constant θ are l<strong>at</strong>itudes. But<br />
wait! We have a serious problem <strong>at</strong> the poles. The north and south poles<br />
have all possible longitudes. Requirement 2 is viol<strong>at</strong>ed. Could you rearrange<br />
the lines so th<strong>at</strong> this problem doesn’t occur? It turns out th<strong>at</strong> this is not<br />
possible. This deep result is known in mathematical circles as the Poincaré-
Hopf theorem. In the sort of less exalted company we keep, it's the Hairy
Ball Theorem. The idea is this: try to comb the hair on a hairy ball so th<strong>at</strong><br />
there is no bald spot. It can’t be done. So long as you really use a comb, i.e.<br />
so long as the trajectories don’t cross, you will always be left with one hair<br />
standing straight up! This is not a proof, of course, but it is a vivid way<br />
of visualizing the content of the theorem. It is easy to see, however, th<strong>at</strong><br />
wh<strong>at</strong> is impossible on a sphere is trivially easy on the surface of a donut. It<br />
can be done in an infinite variety of ways. The simplest is to choose your<br />
“longitudes” so they go around the donut the long way. L<strong>at</strong>itudes go around<br />
the short way. This also s<strong>at</strong>isfies requirement 1. You can’t deform a l<strong>at</strong>itude<br />
into a longitude without cutting through the donut.
OK. Suppose you have two constants of motion F and G. How can you<br />
tell if they are independent? The answer is surprisingly simple: [F, G] = 0.
Proof: Take a point P on the surface of the donut. We should be able<br />
to set up a local coordinate system with its origin at P to describe the
trajectories on the surface. We need two unit vectors, $\hat\xi_F$ and $\hat\xi_G$, such that
every trajectory in the $\hat\xi_F$-$\hat\xi_G$ plane has constant F and G. Choose
$$\hat\xi_F = \epsilon\, J \cdot \nabla F$$
This is guaranteed to lie in the surface of constant F; however, G should
remain constant along $\hat\xi_F$. This means that
$$0 = (\hat\xi_F)^T \cdot \nabla G = \epsilon\,(J \cdot \nabla F)^T \cdot \nabla G = -\epsilon\,(\nabla F)^T \cdot J \cdot \nabla G = -\epsilon\,[F, G]$$
This proves the assertion. It’s worth reflecting on the fact th<strong>at</strong> this construction<br />
would be impossible on the surface of a sphere. The sphere, unlike<br />
the donut, has only one independent constant, its radius. This theorem also<br />
relieves our anxiety about extra constants. If F and G are independent, we<br />
don’t get a “free” constant K = [F, G], because K = 0.<br />
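Here is a numerical sketch of the involution condition (the sample frequencies and phase-space point are my own choices; the system anticipates the uncoupled oscillators of the next example). The two energies E1 and E2 satisfy [E1, E2] = (∇E1)ᵀ · J · ∇E2 = 0:

```python
import numpy as np

# Ordering follows eq. (3.17): z = (q1, q2, p1, p2).
w1, w2 = 1.0, np.sqrt(2.0)
E1 = lambda z: 0.5 * (z[2]**2 + w1**2 * z[0]**2)
E2 = lambda z: 0.5 * (z[3]**2 + w2**2 * z[1]**2)

def grad(f, z, h=1e-5):
    """Phase-space gradient by central differences."""
    g = np.zeros(4)
    for i in range(4):
        dz = np.zeros(4)
        dz[i] = h
        g[i] = (f(z + dz) - f(z - dz)) / (2.0 * h)
    return g

# The 4x4 symplectic metric J of eq. (3.17) with n = 2
J = np.block([[np.zeros((2, 2)), np.eye(2)],
              [-np.eye(2), np.zeros((2, 2))]])

z0 = np.array([0.4, -1.1, 0.9, 0.3])
bracket = grad(E1, z0) @ J @ grad(E2, z0)   # [E1, E2]
print(abs(bracket) < 1e-8)  # True
```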
Summary and generaliz<strong>at</strong>ion:<br />
1. A system with n degrees of freedom has <strong>at</strong> most n independent constants<br />
of motion. Otherwise we could use the additional constants to<br />
elimin<strong>at</strong>e one or more of these degrees. For example, we could use the<br />
Hamilton-Jacobi procedure to make all the momenta constant. The<br />
Hamiltonian would then only be a function of the n coordin<strong>at</strong>es, but<br />
these would not be independent because of the additional constraints.<br />
2. Let’s say there are k constants, Fi, i = 1, . . . , k. If they are independent<br />
we must have [Fi, Fj] = 0.<br />
3. In the best case there are exactly n independent constants. Such<br />
constants are said to be in involution. Such a system is said to be<br />
integrable.<br />
4. All trajectories of integrable systems are confined to the surfaces of<br />
n-dimensional tori imbedded in 2n-dimensional space.
5. If k < n there are no general st<strong>at</strong>ements we can make about the<br />
behavior of the trajectories. We will be very much concerned in the<br />
next chapter with systems th<strong>at</strong> are “almost” integrable.
6. There are no general criteria known for deciding whether or not a<br />
system is integrable; however, if the Hamiltonian is separable, the<br />
system is integrable.<br />
3.2.1 Example: Uncoupled Oscill<strong>at</strong>ors<br />
The Hamiltonian for two uncoupled harmonic oscill<strong>at</strong>ors (with m = 1) is<br />
$$H = \frac{1}{2}\left(p_1^2 + p_2^2 + \omega_1^2 q_1^2 + \omega_2^2 q_2^2\right)$$
This is an important problem because every linear oscill<strong>at</strong>ing system can<br />
be put in this form by a suitable choice of coordin<strong>at</strong>es. 3 There are two<br />
constants of motion<br />
$$E_1 = \frac{1}{2}\left(p_1^2 + \omega_1^2 q_1^2\right) \qquad E_2 = \frac{1}{2}\left(p_2^2 + \omega_2^2 q_2^2\right)$$
In terms of action-angle variables, the constants are I1 and I2.<br />
H = I1ω1 + I2ω2 = E1 + E2 = E<br />
Every integrable system can be put in this form, although in general the<br />
ω’s will be functions of the I’s. Here they are just parameters from the<br />
Hamiltonian.<br />
This is a simple problem, but the phase space is four dimensional. Let’s<br />
think about all possible ways we might visualize it. In the q1-p1 (or q2-p2)
plane the trajectories are ellipses with
$$q_k(\mathrm{max}) = \sqrt{2E_k}/\omega_k \qquad p_k(\mathrm{max}) = \sqrt{2E_k},$$
where k = 1, 2. The area enclosed by each ellipse is significant, because<br />
$$\mathrm{area} = \int_S dq\, dp = \oint p\, dq = 2\pi I \tag{3.24}$$
The first integral is a surface integral over the area of the ellipse. The second<br />
is a line integral around the ellipse. This identity is a variant of Stokes’s<br />
theorem. It’s useful to rescale the variables so th<strong>at</strong> they both have the same<br />
units and the trajectory is a circle. A natural choice would be
$$q_k' = q_k\sqrt{\omega_k} = \sqrt{2I_k}\,\sin\psi_k \qquad p_k' = p_k/\sqrt{\omega_k} = \sqrt{2I_k}\,\cos\psi_k$$
3 This comes under the heading of “theory of small oscill<strong>at</strong>ions.” Most mechanics texts<br />
devote a chapter to it.
The trajectories are now circles with radius √2Ik. The area enclosed is 2πIk,
as required by (3.24).<br />
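A quick numerical check of (3.24) on the rescaled circle (the value of the action is an arbitrary choice of mine): the discrete line integral ∮p′ dq′ reproduces 2πI.

```python
import numpy as np

I = 1.7                                    # arbitrary action (my choice)
psi = np.linspace(0.0, 2.0 * np.pi, 100001)
q = np.sqrt(2.0 * I) * np.sin(psi)         # q' = sqrt(2I) sin(psi)
p = np.sqrt(2.0 * I) * np.cos(psi)         # p' = sqrt(2I) cos(psi)

# Discrete line integral  oint p dq  around the circle (trapezoid rule)
area = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(q)))
print(abs(area - 2.0 * np.pi * I) < 1e-6)  # True
```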
The motion in the q1-q2 plane is more complicated. It depends on
the ratio ω1/ω2, called the winding number. If this is a rational number,
say N1/N2, then after N1 cycles of q1 and N2 cycles of q2, the trajectory
will come back to its starting point. This is called a Lissajous figure. If the
winding number is irr<strong>at</strong>ional, the trajectory will be confined to a limited<br />
area but will never return to its starting point. It will eventually “color<br />
in” all available space. In the next chapter we will be concerned with systems<br />
th<strong>at</strong> are “almost” integrable. For such systems the winding number is<br />
all-important. Systems with irr<strong>at</strong>ional winding numbers tend to be stable<br />
under perturb<strong>at</strong>ion. Those with r<strong>at</strong>ional winding numbers disintegr<strong>at</strong>e <strong>at</strong><br />
the slightest push!<br />
The centerpiece of this chapter is the torus. The trajectories spiral<br />
around the donut. If the winding number is r<strong>at</strong>ional they “wear a p<strong>at</strong>h”<br />
around the donut. If it’s irr<strong>at</strong>ional they cover the donut evenly. A useful<br />
way of visualizing this was invented by Poincaré. Imagine a fl<strong>at</strong> plane cutting<br />
through the donut in such a way th<strong>at</strong> every point on the plane has the<br />
angle ψ1 = 0. Place a dot on the plane at the point where each trajectory
passes through it. If the winding number is a r<strong>at</strong>ional fraction, there will<br />
be a finite number of points. Each time a trajectory passes through ψ1 = 0<br />
it will pass through one of the dots. If the winding number is irr<strong>at</strong>ional the<br />
crossings will mark out a continuous circle. The Poincaré section as it is<br />
called (some books call it the surface of section) is a useful diagnostic tool.<br />
Suppose you have a system of equ<strong>at</strong>ions th<strong>at</strong> are not integrable (so far as<br />
you know) but is amenable to computer calcul<strong>at</strong>ion. Take various Poincaré<br />
sections. If they are circles then the system is <strong>at</strong> least approxim<strong>at</strong>ely integrable<br />
and can be described with action-angle variables. As we will see,<br />
there are often regions of phase space, “islands” as it were, where motion is<br />
simply periodic and other regions th<strong>at</strong> are wildly chaotic.<br />
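Here is a sketch of the Poincaré section for the two uncoupled oscillators (the frequencies are my own choices). It records ψ2 mod 2π at successive crossings of the plane ψ1 = 0 and counts the distinct dots:

```python
import numpy as np

def section_angles(w1, w2, n_cross=600):
    """psi2 (mod 2pi) each time the trajectory crosses the plane psi1 = 0."""
    n = np.arange(1, n_cross + 1)          # crossing times t_n = 2 pi n / w1
    return np.mod(2.0 * np.pi * n * (w2 / w1), 2.0 * np.pi)

def distinct_dots(angles, tol=1e-6):
    """Count well-separated clusters of points on the circle."""
    pts = np.sort(angles)
    gaps = np.diff(pts)
    wrap_gap = pts[0] + 2.0 * np.pi - pts[-1]
    return int(np.sum(gaps > tol)) + int(wrap_gap > tol)

# Rational winding number omega1/omega2 = 3/2: a finite set of dots
print(distinct_dots(section_angles(3.0, 2.0)))           # 3
# Irrational winding number: every crossing lands on a new point
print(distinct_dots(section_angles(1.0, np.sqrt(2.0))))  # 600
```

Plotted in the q′1-p′1 plane, the first case is a few dots on a circle of radius √2I1; the second gradually fills the whole circle in.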
Pictures of this motion appear in all the standard texts. I have yet to<br />
see a clear explan<strong>at</strong>ion of the coordin<strong>at</strong>es involved, however. Wh<strong>at</strong> does it<br />
mean really to say th<strong>at</strong> the donut is a 2-d surface in a 4-d space? Your<br />
breakfast donut, after all, is imbedded in 3-d space. If we take a Poincaré<br />
section through the donut at the plane ψ2 = 0 and plot q′1 versus p′1, we
will get either a circle of dots or a continuous circle with a radius equal to
√2I1, or we can take a slice through ψ1 = 0 and get a circle with radius
√2I2. Put it this way: any point on the torus has four (polar) coordinates,
(√2I1, ψ1, √2I2, ψ2), but in 3-d space only three of them are independent.
When the torus is in 4-d space, all four of them are independent. If we really
lived in 4-d space, we would label the axes of the donut plot (q′1, p′1, q′2, p′2).
This is impossible for us to imagine. The donut is easy; just remember th<strong>at</strong><br />
there is no equ<strong>at</strong>ion of constraint among the four variables. 4<br />
3.2.2 Example: A Particle in a Box<br />
Consider a particle in a two-dimensional box with elastic walls.<br />
0 ≤ x ≤ a 0 ≤ y ≤ b<br />
$$H = \frac{1}{2m}\left(p_x^2 + p_y^2\right) = \frac{\pi^2}{2m}\left(\frac{I_1^2}{a^2} + \frac{I_2^2}{b^2}\right)$$
$$I_1 = \frac{1}{2\pi}\oint p_x\, dx = \frac{a}{\pi}|p_x| \qquad I_2 = \frac{b}{\pi}|p_y|$$
$$\omega_1 = \frac{\partial H}{\partial I_1} = \frac{\pi^2 I_1}{m a^2} \qquad \omega_2 = \frac{\pi^2 I_2}{m b^2}$$
There are several interesting points about this apparently trivial problem.
The Hamiltonian looks linear, but in fact it contains an invisible nonlinear<br />
potential th<strong>at</strong> reverses the particle’s momentum when it hits the wall. One<br />
symptom of this is th<strong>at</strong> the frequencies depend on I. This looks odd, but<br />
it’s just the action-angle way of saying th<strong>at</strong> the particle makes a round trip<br />
(in the x direction) in a time T = 2am/px. The loop integral in this context<br />
is an integral over one “round trip” of the particle.<br />
$$\oint p_x\, dx = \int_0^a |p_x|\, dx + \int_a^0 (-|p_x|)\, dx = 2a|p_x|$$
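As a sanity check (the numbers are arbitrary choices of mine), the action-angle frequency ω1 = π²I1/(ma²) is exactly 2π divided by the round-trip time T = 2am/|px|:

```python
import math

m, a, px = 1.0, 2.0, 3.0      # arbitrary illustrative values (my choices)

I1 = a * abs(px) / math.pi                # I1 = (1/2pi) * oint px dx = a|px|/pi
omega1 = math.pi**2 * I1 / (m * a**2)     # omega1 = dH/dI1

T = 2.0 * a * m / abs(px)                 # round trip: a/(px/m) out, the same back
print(math.isclose(omega1, 2.0 * math.pi / T))  # True
```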
My real point in showing this example is to call your <strong>at</strong>tention to the<br />
angle variable. I will work through the calcul<strong>at</strong>ion for the x variable. This<br />
same thing holds for y of course.<br />
$$\frac{1}{2m}\left(\frac{dW_x}{dx}\right)^2 = E_1$$
$$W_x = \pm\sqrt{2mE_1}\int dx = \pm\frac{\pi I_1}{a}\int dx$$
$$\psi_1 = \frac{\partial W_x}{\partial I_1} = \pm\frac{\pi}{a}\int dx = \pm\frac{\pi}{a}\,x + \psi_{10}$$
4 Of course, I1 and I2 are constant for any given set of initial conditions. It is this sense<br />
in which the torus is a 2-d surface.
The term ψ10 is an integr<strong>at</strong>ion constant. There is no reason why it must be<br />
the same for both legs of the journey. We are free to choose it as follows:<br />
0 → x → a: ψ1 = πx/a<br />
0 ← x ← a: ψ1 = 2π − πx/a<br />
While the particle is bouncing violently between the walls, the angle variables<br />
are increasing smoothly with time, ψ1 = ω1t and ψ2 = ω2t. Even this<br />
strange problem is equivalent to a donut! 5<br />
5 When correctly viewed, everything is a harmonic oscill<strong>at</strong>or – in this case two harmonic<br />
oscill<strong>at</strong>ors.
Chapter 4
Canonical Perturbation Theory
So far we have assumed th<strong>at</strong> our systems had exact analytic solutions. One<br />
way of st<strong>at</strong>ing this is th<strong>at</strong> we can find a canonical transform<strong>at</strong>ion to action<br />
angle variables such th<strong>at</strong> the new Hamiltonian is a function of the action<br />
variables only, H = H(I). Such problems are the exception r<strong>at</strong>her than the<br />
rule. For our purposes they are also uninteresting. All periodic integrable<br />
systems are equivalent to a set of uncoupled harmonic oscill<strong>at</strong>ors. Once you<br />
get over the thrill of this discovery, the oscill<strong>at</strong>ors are boring! The existence<br />
of chaos depends on the system not being equivalent to a set of oscill<strong>at</strong>ors.<br />
In order to deal with systems th<strong>at</strong> are non-trivial in this sense, we need some<br />
way of doing perturb<strong>at</strong>ion theory. 1<br />
4.1 One-Dimensional Systems<br />
I will present the theory first for systems with one degree of freedom. This<br />
will simplify the notation; however, the interesting complications only appear
in higher dimensions. Here is the basic situ<strong>at</strong>ion: A bounded conserv<strong>at</strong>ive<br />
system with one degree of freedom is described by a constant Hamiltonian<br />
H(q, p) = E. We need to obtain the equ<strong>at</strong>ions of motion in the form q = q(t)<br />
1 I will follow the tre<strong>at</strong>ment in Chaos and Integrability in <strong>Nonlinear</strong> Dynamics, Michael<br />
Tabor, Wiley-Interscience, 1989. Another good reference is Classical <strong>Mechanics</strong> by R. A.<br />
Metzner and L.C. Shepley, Prentice Hall, 1991. The subject is also discussed in Classical<br />
<strong>Mechanics</strong>, Goldstein, Poole and Safko, third edition, Addison-Wesley, 2002. Goldstein<br />
discusses time-dependent and time-independent perturb<strong>at</strong>ion theory. We are doing the<br />
time-independent variety.<br />
and p = p(t), but this is impossible due to the non-linear n<strong>at</strong>ure of the<br />
problem. Suppose, however, that we are able to split up the Hamiltonian
H = H0 + ϵH1<br />
in such a way th<strong>at</strong> H0 is amenable to exact solution, and H1 is in some<br />
sense small. We indic<strong>at</strong>e the smallness by multiplying it by ϵ. This is a<br />
bookkeeping device; it will be set to one after the approxim<strong>at</strong>ions have been<br />
derived.<br />
The first step is to find the canonical transform<strong>at</strong>ion th<strong>at</strong> makes H0<br />
cyclic, i.e. q = q(I, ψ), p = p(I, ψ), and H0 = H0(I) where I and ˙ ψ = ω0<br />
are both constant. Unfortun<strong>at</strong>ely, this transform<strong>at</strong>ion does not render the<br />
complete Hamiltonian cyclic, so we write<br />
H(I, ψ) = H0(I) + ϵH1(I, ψ). (4.1)<br />
H is still constant, and consequently I now depends on ψ. H0 is not an<br />
explicit function of ψ, but it does depend on ψ implicitly through I.<br />
Despite this inconvenience, I and ψ are still a perfectly good set of<br />
canonical variables, so th<strong>at</strong> the equ<strong>at</strong>ions of motion<br />
$$\dot I = -\frac{\partial}{\partial\psi}H(I, \psi) \qquad \dot\psi = \frac{\partial}{\partial I}H(I, \psi) \tag{4.2}$$
are valid without approxim<strong>at</strong>ion, even though we are unable to solve them in<br />
this form. The so-called time-dependent perturbation theory proceeds from here by
expanding the solutions of (4.2) as power series in ϵ. Our approach is to find<br />
a second canonical transform<strong>at</strong>ion, i.e. (q, p) → (I, ψ) → (J, φ) such th<strong>at</strong><br />
H(I, ψ) → K(J). This last step must be done as a series of approxim<strong>at</strong>ions,<br />
of course, otherwise the problem would be exactly solvable.<br />
In order to make the transform<strong>at</strong>ion (I, ψ) → (J, φ) we will use a gener<strong>at</strong>ing<br />
function of the F2 genus, i.e. F = F (ψ, J). We need to expand<br />
F = F0(ψ, J) + ϵF1(ψ, J) + · · · (4.3)<br />
where F0 = Jψ. This is the identity transformation, as can be seen as follows:
$$I = \frac{\partial F_0}{\partial\psi} = J \qquad \varphi = \frac{\partial F_0}{\partial J} = \psi$$
In terms of (4.3) the transformation equations are
$$I = \frac{\partial F}{\partial\psi} = J + \epsilon\,\frac{\partial F_1(\psi, J)}{\partial\psi} + \cdots \tag{4.4}$$
$$\varphi = \frac{\partial F}{\partial J} = \psi + \epsilon\,\frac{\partial F_1(\psi, J)}{\partial J} + \cdots \tag{4.5}$$
Before going on there are some technical points about ψ and J th<strong>at</strong> need<br />
to be discussed. When ϵ = 0, ψ is the exact angle variable for the system.<br />
This means th<strong>at</strong> we can find p and q as functions of ψ such th<strong>at</strong> p and q<br />
return to their original values when ∆ψ = 2π. We can in principle invert<br />
this transform<strong>at</strong>ion to find ψ as a function of p and q.<br />
ψ = ψ(q, p) (4.6)<br />
When p and q run through a complete cycle, ψ advances by 2π. When ϵ ≠ 0
the orbit will be different from the unperturbed case, but the functional<br />
rel<strong>at</strong>ionship doesn’t change, so when p and q run through a complete cycle,<br />
we must still have ∆ψ = 2π. Of course, the exact angle variable will also<br />
advance 2π. In summary<br />
∆ψ = ∆φ = 2π (4.7)<br />
for one complete cycle.<br />
The following integrals are all equal because canonical transformations
preserve phase space volume.
$$J = \frac{1}{2\pi}\oint p\, dq = \frac{1}{2\pi}\oint J\, d\varphi = \frac{1}{2\pi}\oint I\, d\psi \tag{4.8}$$
Now integrate (4.4) around one orbit:
$$\frac{1}{2\pi}\oint I\, d\psi = \frac{1}{2\pi}\oint J\, d\psi + \frac{\epsilon}{2\pi}\oint\frac{\partial F_1}{\partial\psi}\, d\psi + \cdots;$$
that is,
$$J = J + \frac{\epsilon}{2\pi}\oint\frac{\partial F_1}{\partial\psi}\, d\psi + \cdots$$
We have just seen that ∆ψ = 2π around one cycle. Consequently
$$\oint\frac{\partial F_1}{\partial\psi}\, d\psi = 0 \tag{4.9}$$
This implies that the derivative of F1 is purely oscillatory with a
fundamental period of 2π in ψ. (The same is true of the higher order terms as well.)
The Hamiltonian is transformed in the usual way for an F2 generating function with the new variables.
$$K(\varphi, J) = H(\psi(\varphi, J), I(\varphi, J)) + \frac{\partial}{\partial t}F_2(\psi(\varphi, J), J, t) \tag{4.10}$$
As explained above, we seek a transform<strong>at</strong>ion th<strong>at</strong> makes φ cyclic so th<strong>at</strong><br />
K = K(J). The appropri<strong>at</strong>e gener<strong>at</strong>ing function does not depend on time,<br />
so (4.10) becomes<br />
K(J) = H(ψ(φ, J), I(φ, J)) (4.11)<br />
The approxim<strong>at</strong>ion procedure consists in expanding the left and right sides<br />
of this equ<strong>at</strong>ion in powers of ϵ and then equ<strong>at</strong>ing terms of zeroth and first<br />
order. This procedure could be carried out to higher order. I’m interested<br />
in first order corrections only.<br />
The so-called Kamiltonian is expanded as follows:<br />
K(J) = K0(J) + ϵK1(J) + · · ·<br />
At first sight, this agenda looks hopeless. We need to know the exact value
of J to make use of any of these terms, even the zeroth order approximation.
The exquisite point is th<strong>at</strong> we can use (4.8) to calcul<strong>at</strong>e J exactly without<br />
knowing the complete transform<strong>at</strong>ion.<br />
The zeroth order Hamiltonian is expanded with the help of (4.4).
$$H_0(I) = H_0\!\left(\frac{\partial F}{\partial\psi}\right) = H_0\!\left(J + \epsilon\frac{\partial F_1}{\partial\psi} + \cdots\right) = H_0(J) + \epsilon\,\frac{\partial F_1}{\partial\psi}\left.\frac{\partial H_0(J)}{\partial J}\right|_{\epsilon=0} + \cdots$$
where
$$\left.\frac{\partial H_0}{\partial J}\right|_{\epsilon=0} = \frac{\partial H_0(I)}{\partial I} = \omega_0 \tag{4.12}$$
The first order term is already multiplied by ϵ:
$$\epsilon H_1(\psi, I) = \epsilon H_1(\varphi, J) + \cdots$$
Substituting all this into (4.11) gives
$$K_0(J) = H_0(J) \tag{4.13}$$
$$K_1(J) = \frac{\partial F_1}{\partial\psi}\,\omega_0 + H_1(\varphi, J) \tag{4.14}$$
The not<strong>at</strong>ion H0(J) means th<strong>at</strong> you take your formula for H0(I) and replace<br />
the symbol I with the symbol J without making any change in the functional<br />
form of H0.<br />
Integrate (4.14) around one cycle and use (4.9); (4.14) becomes
$$K_1(J) = \overline{H}_1(J) \equiv \frac{1}{2\pi}\int_0^{2\pi} H_1(\psi, J)\, d\psi, \tag{4.15}$$
and
$$\frac{\partial}{\partial\psi}F_1(\psi, J) = \frac{1}{\omega_0(J)}\left[\overline{H}_1 - H_1(\psi, J)\right] \equiv -\frac{\tilde H_1(\psi, J)}{\omega_0} \tag{4.16}$$
˜H1 is the periodic part of H1. We are left with a differential equ<strong>at</strong>ion th<strong>at</strong><br />
is easy to integr<strong>at</strong>e.<br />
$$F_1(\psi, J) = -\frac{1}{\omega_0(J)}\int \tilde H_1(\psi, J)\, d\psi \tag{4.17}$$
4.1.1 Summary<br />
I will summarize all these technical details in the form of an algorithm for doing first order perturbation theory. Remember that the object is to find equations of motion in the form q = q(t) and p = p(t). We do this in three steps: (1) Find q = q(I, ψ) and p = p(I, ψ). (2) Find I = I(J, φ) and ψ = ψ(J, φ). (3) J and φ̇ are constant, so φ = φ̇t + φ0.
1. Identify the H0 part of the Hamiltonian. Find the transformation equations q = q(I, ψ) and p = p(I, ψ) using the Hamilton-Jacobi equation as described in the previous section. Use (4.12) to get ω0.
2. Equation (4.8) can be used to find J in terms of the total energy E. The integral presents no difficulties in principle, especially if the Hamiltonian is separable. In fact, textbooks never bother to do this. It seems sufficient to display the results in terms of J, the assumption being that we could find J = J(E) if we really had to.
3. The first order correction to the energy is obtained from the integral in (4.15). Get the first order correction to the frequency by differentiating it with respect to J.
4. The generating function F1 is calculated from (4.17). It is then substituted into (4.4) and (4.5). These give implicit equations for ψ = ψ(J, φ) and I = I(J, φ). Unfortunately, it is usually impossible to invert them to obtain these formulas explicitly.
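Step 3 of this algorithm lends itself to a quick numerical sketch. The routine below is my own illustration, not from the text (the function names are mine): it averages a given H1(ψ, J) over one cycle as in (4.15) and estimates the frequency shift ∂K1/∂J by a central difference. It is checked against the pendulum perturbation H1 = −(J²/6mR²) sin⁴ψ, treated below with m = R = 1, whose average is −J²/16.

```python
import math

def K1(H1, J, n=4096):
    # first order energy shift, eq. (4.15): average H1 over one cycle of psi
    # using the midpoint rule, which is essentially exact for periodic integrands
    return sum(H1(2 * math.pi * (k + 0.5) / n, J) for k in range(n)) / n

def omega1(H1, J, h=1e-6):
    # first order frequency shift dK1/dJ, estimated by a central difference
    return (K1(H1, J + h) - K1(H1, J - h)) / (2 * h)

# pendulum perturbation with m = R = 1: H1 = -(J^2/6) sin^4(psi)
H1 = lambda psi, J: -(J**2 / 6.0) * math.sin(psi)**4
```

With this H1, K1 comes out −J²/16 and the frequency shift −J/8, matching the pendulum results derived in the next section.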
4.1.2 The simple pendulum<br />
The pendulum makes a nice example.

$$H = \frac{l^2}{2mR^2} + mgR(1 - \cos\theta)$$
The angular momentum $l = mR^2\dot\theta$ is canonically conjugate to the angle θ. Expanding the cosine to fourth order in θ gives
$$H = \frac{1}{2mR^2}\left[l^2 + m^2R^4\omega_0^2\,\theta^2\left(1 - \frac{\theta^2}{12}\right)\right] + \cdots$$
The first two terms reduce to the familiar harmonic oscillator. This is the zeroth order problem.

$$H_0 = E_0 = \frac{l^2}{2mR^2} + \frac{mgR\,\theta^2}{2}$$

$$l^2 = \left(\frac{dW}{d\theta}\right)^2 = 2mR^2 E_0 - m^2R^4\omega_0^2\,\theta^2$$

Make the natural substitution

$$\sin^2\psi = \frac{mR^2\omega_0^2}{2E_0}\,\theta^2 \tag{4.18}$$

so that

$$l = \frac{dW}{d\theta} = \sqrt{2mR^2E_0}\,\cos\psi \tag{4.19}$$
We can look on this as a convenient change of variable, but ψ is also the<br />
angle variable. This can be seen as follows:<br />
$$I = \frac{1}{2\pi}\oint l\, d\theta = \frac{\sqrt{2mR^2E_0}}{2\pi}\oint\left[1 - \frac{mR^2\omega_0^2}{2E_0}\,\theta^2\right]^{1/2} d\theta$$
Use (4.19) to get the familiar result, I = E0/ω0. The generating function is obtained from the indefinite integral

$$W = \int \left(\frac{dW}{d\theta}\right) d\theta = \int \left[2mR^2\omega_0 I - m^2R^4\omega_0^2\,\theta^2\right]^{1/2} d\theta \tag{4.20}$$
According to the basic transformation formula we should have

$$\psi = \frac{\partial W}{\partial I}$$

One can show, by differentiating (4.20) and using (4.19) to complete the integration, that this is indeed so.
Equations (4.18) and (4.19) can be rearranged to give

$$l = \sqrt{2mR^2 I\omega_0}\,\cos\psi \tag{4.21}$$

$$\theta = \sqrt{\frac{2I}{mR^2\omega_0}}\,\sin\psi \tag{4.22}$$
The goal of the action-angle program is to express the original coordinates and momenta in terms of the action-angle variables. This has now been completed to zeroth order.
The first order correction is

$$H_1(I, \psi) = -\frac{mR^2\omega_0^2\,\theta^4}{24} = -\frac{I^2}{6mR^2}\sin^4\psi.$$
We are now in a position to recast our Hamiltonian à la (4.1).

$$H(I, \psi) = I\omega_0 + \epsilon\left(-\frac{I^2}{6mR^2}\sin^4\psi\right)$$

We have also obtained ω0 = √(g/R) “for free.” The ϵ is there for bookkeeping purposes only. We have no further need for it.
$$K_0(J) = H_0(J) = J\omega_0$$

$$K_1(J) = \bar H_1(J) = \frac{1}{2\pi}\int_0^{2\pi} H_1\, d\psi = -\frac{J^2}{16mR^2}$$

$$\tilde H_1 = H_1 - \bar H_1 = \frac{J^2}{48mR^2}\left(3 - 8\sin^4\psi\right)$$

$$F_1(J, \psi) = -\frac{1}{\omega_0}\int d\psi\, \tilde H_1 = \frac{J^2}{192\,mR^2\,\omega_0}\left(\sin 4\psi - 8\sin 2\psi\right)$$

$$\omega = \omega_0 + \frac{\partial K_1}{\partial J} = \omega_0 - \frac{J}{8mR^2}$$
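These results are easy to check numerically. With J = E0/ω0 = ½mR²ω0θ0², the first order frequency ω = ω0 − J/(8mR²) is the textbook result ω ≈ ω0(1 − θ0²/16). The sketch below is my own check, not from the text: it compares this against the exact pendulum frequency, written via the arithmetic-geometric-mean form of the complete elliptic integral, ω_exact = ω0 · agm(1, cos(θ0/2)).

```python
import math

def agm(a, g, tol=1e-15):
    # arithmetic-geometric mean; converges quadratically
    while abs(a - g) > tol:
        a, g = 0.5 * (a + g), math.sqrt(a * g)
    return a

def omega_exact(theta0, omega0=1.0):
    # exact pendulum frequency omega0 * pi / (2 K(sin(theta0/2))),
    # using the identity K(k) = pi / (2 agm(1, sqrt(1 - k^2)))
    return omega0 * agm(1.0, math.cos(theta0 / 2.0))

def omega_first_order(theta0, omega0=1.0, m=1.0, R=1.0):
    # omega = omega0 - J/(8 m R^2) with J = (1/2) m R^2 omega0 theta0^2
    J = 0.5 * m * R ** 2 * omega0 * theta0 ** 2
    return omega0 - J / (8.0 * m * R ** 2)
```

For an amplitude of 0.5 radians the two frequencies agree to better than one part in ten thousand; the residual difference is O(θ0⁴), as it should be for a first order result.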
4.2 Many Degrees of Freedom<br />
For systems of two or more degrees of freedom, canonical perturbation theory is formulated in exactly the same way as before – but now profound difficulties arise, even to first order in ϵ. The problem centers around equation (4.16), repeated here for reference:

$$\omega_0(J)\,\frac{\partial F_1(\psi, J)}{\partial \psi} = -\tilde H_1(\psi, J)$$
We were able to solve this with a simple integration (4.17). This is not possible for more than one degree of freedom, so we must resort to Fourier series. Before doing this, however, we will need to generalize our notation.
Let's use the vectors

$$\mathbf{J} = (J_1, \cdots, J_n) \qquad \boldsymbol{\omega}_0 = (\omega_{01}, \cdots, \omega_{0n}) \qquad \nabla_\psi = \left(\frac{\partial}{\partial \psi_1}, \cdots, \frac{\partial}{\partial \psi_n}\right)$$

where n is the number of degrees of freedom. In this notation (4.16) becomes

$$\boldsymbol{\omega}_0(\mathbf{J}) \cdot \nabla_\psi F_1(\boldsymbol{\psi}, \mathbf{J}) = -\tilde H_1(\mathbf{J}, \boldsymbol{\psi}) \tag{4.23}$$

where

$$\bar H_1(\mathbf{J}) = \frac{1}{(2\pi)^n}\int_0^{2\pi}\!\!\cdots\int_0^{2\pi} d\psi_1 \cdots d\psi_n\, H_1(\mathbf{J}, \boldsymbol{\psi}) \tag{4.24}$$

and $\tilde H_1 = H_1 - \bar H_1$. Since both sides of (4.23) are periodic, we can solve them with Fourier series.
$$\tilde H_1(\mathbf{J}, \boldsymbol{\psi}) = \sum_{\mathbf{k}} A_{\mathbf{k}}(\mathbf{J})\, e^{i\mathbf{k}\cdot\boldsymbol{\psi}} \tag{4.25}$$

$$F_1(\mathbf{J}, \boldsymbol{\psi}) = \sum_{\mathbf{k}} B_{\mathbf{k}}(\mathbf{J})\, e^{i\mathbf{k}\cdot\boldsymbol{\psi}} \tag{4.26}$$

where $\mathbf{k}$ is a vector of integers, $\mathbf{k} = (k_1, \cdots, k_n)$.
It seems as if we could proceed as follows: $\tilde H_1$ is known at this point, so we can find the $A_{\mathbf{k}}$. Substituting these expansions into (4.23) gives

$$B_{\mathbf{k}} = i\,\frac{A_{\mathbf{k}}}{\boldsymbol{\omega}_0 \cdot \mathbf{k}} \tag{4.27}$$
Now here's the infamous problem. Suppose, for example, there were only two degrees of freedom. In this case the denominator of (4.27) would be

$$\boldsymbol{\omega}_0 \cdot \mathbf{k} = \omega_{01}k_1 + \omega_{02}k_2 \tag{4.28}$$

You can see that if the winding number ω01/ω02 is a rational number, then for some $\mathbf{k}$, $B_{\mathbf{k}}$ will be infinite. It seems that the slightest perturbation will blow this system into outer space! Even if the winding number is not rational, there will always be values of $\mathbf{k}$ that will make $\boldsymbol{\omega}_0 \cdot \mathbf{k}$ arbitrarily small.
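A brute-force scan makes the small-denominator problem concrete (this sketch is my own, not from the text): for a rational frequency ratio the denominator ω0·k vanishes exactly at finite k, while for an irrational ratio such as the golden mean it never vanishes but creeps toward zero as the allowed |k| grows.

```python
import math

def smallest_denominator(w1, w2, kmax):
    # smallest |k1*w1 + k2*w2| over integer vectors k != 0 with |k1|, |k2| <= kmax
    best = float("inf")
    for k1 in range(-kmax, kmax + 1):
        for k2 in range(-kmax, kmax + 1):
            if (k1, k2) != (0, 0):
                best = min(best, abs(k1 * w1 + k2 * w2))
    return best
```

For ω0 = (3/2, 1) the minimum is exactly zero (take k = (2, −3)); for ω0 = (φ, 1) with φ the golden mean the minimum is small but nonzero, and it shrinks along the Fibonacci sequence of k values as kmax increases.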
This problem was discovered in the early twentieth century, and all the effort of the most eminent mathematicians of the day failed to solve it. One opinion held that the slightest perturbation would cause the system to become “ergodic,” that is to say, the trajectories would fill up all of phase space. Numerical calculations later showed that this was often not the case. Trajectories will often “lock in” to stable patterns. This has been the subject of much contemporary research. When and why do trajectories lock in, and what happens when they do not? The question of what trajectories remain stable under small perturbations is at least partly answered by the so-called KAM (Kolmogorov, Arnold, Moser) theorem. In the general case there is, if not a complete theory, at least a well-developed taxonomy. We will turn to these matters in the next chapter.
Chapter 5<br />
Introduction to Chaos<br />
The canonical perturbation theory of the previous chapter is a lot of work, and in two or more degrees of freedom it summons up the ogre of small denominators. Many people have tried to solve this problem by pounding their heads on it. This turns out not to be a fruitful approach. I will illustrate the limitations of perturbation theory by considering the van der Pol oscillator. This is a simple nonlinear, one-dimensional, second-order differential equation closely resembling a damped harmonic oscillator. It has stable solutions which can easily be found numerically, yet it has no known analytic solutions, and perturbation theory, on general principles, just can't work!1 We then go on to discuss linear stability theory. With these simple techniques you can analyze most nonlinear systems (the van der Pol oscillator is an exception) and get a qualitative picture of the phase space dynamics. In one degree of freedom (two-dimensional phase space) it will be immediately apparent where perturbation theory is possible, and you will get a qualitative idea of the motion of the system where it is not.
Higher dimensional spaces are not so easy to analyze, in part because they are hard to visualize and in part because they are often not integrable. It is this non-integrability that leads to chaos. Here we resort to the Poincaré section and the notion of discrete maps. The Poincaré-Birkhoff and KAM theorems can then tell us something about the onset and structure of chaos.
1It should be remembered that all the major developments in elementary particle theory over the last few decades, starting with the standard model in the 1970s, are based on the notion of spontaneous symmetry breaking. Spontaneous symmetry breaking, almost by definition, cannot be described with perturbation theory. When perturbation theory fails we always expect new physics. The same is true (to a lesser extent) in classical mechanics as well.
5.1 The total failure of perturbation theory

To get some feeling for how perturbation theory might be useless, look at the following “toy” example.

$$\ddot x = -x + \epsilon\left(x^2 + \dot x^2 - 1\right)\sin(\sqrt{2}\,t) \tag{5.1}$$
This looks like a harmonic oscillator with a resonant frequency ω = 1 and a “small” driving term with a frequency ω = √2. Obvious solutions are x(t) = sin t and x(t) = cos t, which hold for all values of ϵ. If we set ϵ = 0 then the solutions more generally are x(t) = x0 sin(t + t0). This solution plotted on a phase space plot of x(t) versus ẋ(t) will be a circle with radius r = x0. What would you expect for finite ϵ? There presumably are other solutions, but don't waste your time looking for them! You should convince yourself, however, that there are no solutions of the form
yourself however, th<strong>at</strong> there are no solutions of the form<br />
x(t) = sin t +<br />
∞∑<br />
ϵ n fn(t) (5.2)<br />
Also convince yourself that the trouble comes from the nonlinear terms. The point is that, because of the nonlinearity, it is not possible to start with unperturbed solutions and get new solutions by adding to them.
A more interesting and oft-studied example is the van der Pol equation. It was first introduced by van der Pol in 1926 in a study of the nonlinear vacuum tube circuits of early radios.

$$\ddot x + \epsilon\left(x^2 - 1\right)\dot x + x = 0 \tag{5.3}$$
Again the ϵ = 0 solutions are x(t) = x0 sin(t + t0). In phase space this is a circle of radius x0. If we make ϵ ever so slightly larger than zero, however, something remarkable happens, as shown in the first of the plots in Fig. 5.1. Yes, the orbit eventually becomes a circle, but regardless of the initial conditions, the radius r ≈ 2. The same sort of behavior is shown in Fig. 5.1 for larger values of ϵ. The shape of the final orbit is determined entirely by ϵ and is completely unaffected by the initial conditions. A curve of this sort is called a limit cycle. It's easy to see in a vague way why the limit cycle exists. The term proportional to ϵ in (5.3) looks like an oscillator damping term, but its sign depends on whether x² is greater or less than 1. If it is greater, the oscillation is damped; if it is smaller the oscillation is “undamped.” Indeed, if ϵ is made negative, the orbits either collapse to zero or diverge to infinity depending on the initial conditions. For obvious reasons the solutions with
Figure 5.1: Phase plots (ẋ versus x) of the van der Pol oscillator for four values of ϵ (0.1, 0.5, 1.5, and 3) and two starting values (indicated by asterisks)
positive ϵ are said to be stable and those with negative ϵ are said to be unstable.
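The limit cycle is easy to verify numerically. The sketch below is my own, not from the text (the step size and integration time are arbitrary choices): it integrates (5.3) with a classical fourth-order Runge-Kutta scheme and reports the final orbit radius √(x² + ẋ²). For small ϵ this comes out near 2 whether the orbit starts inside or outside the limit cycle.

```python
import math

def van_der_pol_radius(eps, x0, v0, dt=0.001, t_end=200.0):
    # integrate x'' + eps (x^2 - 1) x' + x = 0 with a classical RK4 step
    # and return the orbit radius sqrt(x^2 + v^2) at the final time
    def f(x, v):
        return v, -x - eps * (x * x - 1.0) * v
    x, v = x0, v0
    for _ in range(int(t_end / dt)):
        k1x, k1v = f(x, v)
        k2x, k2v = f(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = f(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = f(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return math.hypot(x, v)
```

Starting from (0.1, 0) the orbit spirals outward to the limit cycle; starting from (3, 0) it spirals inward to the same cycle.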
This simple model makes an important point. Conventional perturbation theory starts with unperturbed, i.e. ϵ = 0, solutions and then looks for series solutions in powers of ϵ. This is obviously hopeless here, since even a smidgeon of ϵ is enough to completely alter the nature of the orbits. It would be better to start with some simple function that approximated the limit cycle and then expand in powers of some parameter that characterized the deviation of the actual orbit from the simple function. Alas, I don't know how to do this. The trouble is that the limit cycle is so weird, at least for large ϵ, that it's hard to come up with a “lowest-order” solution. For many systems, however, this is a practical approach. The trick is to look for the fixed points.
5.2 Fixed points and linearization

Equations of motion can always be cast in the form

$$\dot\xi = f(\xi, t). \tag{5.4}$$
With n degrees of freedom ξ and f are 2n-dimensional vectors. For example,<br />
Hamilton's equations with one degree of freedom are

$$\dot q = \frac{\partial H}{\partial p} \qquad \dot p = -\frac{\partial H}{\partial q} \qquad \xi = \begin{bmatrix} p \\ q \end{bmatrix} \tag{5.5}$$
To keep the notation simple and general (and to save typing) I will keep the notation in the form (5.5) for the time being and not type out the p's and q's. I will also restrict the discussion to autonomous systems, i.e. those in which the Hamiltonian does not depend explicitly on time.2
A fixed point (also called a stationary point, equilibrium point, or critical point) is simply a point ξf where all the time derivatives vanish, $f(\xi_f) = \dot\xi_f = 0$. It's the place where nothing happens. Detailed information about
2The notation in this section is taken from Classical Dynamics by J. V. José and E. J. Saletan.
the motion of a system close to a fixed point can be obtained by linearizing the equations of motion. This is done as follows: First the origin is moved to the fixed point by writing

$$\zeta(t) = \xi(t; \xi_0) - \xi_f \tag{5.6}$$

Second, (5.4) is written for ζ rather than for ξ.

$$\dot\zeta = f(\zeta + \xi_f) \equiv g(\zeta) \tag{5.7}$$
Third, g is expanded in a Taylor series about ζ = 0.

$$\dot\zeta^j = \left.\frac{\partial g^j}{\partial \zeta^k}\right|_{\xi_f}\,\zeta^k + O(\zeta^2) \equiv A^j_{\ k}\,\zeta^k + O(\zeta^2) \tag{5.8}$$
I am using the Einstein summation convention in which one sums over repeated indices. Dropping the ζ² terms gives the matrix equation

$$\dot z = A \cdot z \tag{5.9}$$

A is a constant matrix called (among other things) the stability matrix. It's easy to solve (5.9) using the matrix exponential
$$z(t) = e^{At} z_0 \tag{5.10}$$

where

$$e^{At} \equiv \sum_{n=0}^{\infty} \frac{A^n t^n}{n!} \tag{5.11}$$
For our purposes it will be enough to take the case of one degree of freedom, in which case A is a 2 × 2 real, constant matrix. If A is diagonal,

$$e^{At} = \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix} \tag{5.12}$$

where λ1 and λ2 are eigenvalues, which might be real or complex. If they are complex they come in complex-conjugate pairs, λ1* = λ2.
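The series (5.11) converges quickly for small matrices and provides a direct numerical check of (5.12). The sketch below is my own illustration (the truncation at 40 terms is an arbitrary choice): it builds e^{At} for a 2 × 2 matrix by summing the series term by term.

```python
import math

def mat_exp(A, t, terms=40):
    # truncated series e^{At} = sum_{n=0}^{terms} (A t)^n / n!  for a 2x2 matrix A
    result = [[1.0, 0.0], [0.0, 1.0]]  # identity: the n = 0 term
    term = [[1.0, 0.0], [0.0, 1.0]]    # running value of (A t)^n / n!
    for n in range(1, terms + 1):
        # term <- term * (A t) / n
        term = [[sum(term[i][k] * A[k][j] * t / n for k in range(2))
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result
```

For a diagonal A = diag(λ1, λ2) the result reproduces (5.12) to machine precision.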
Various cases can be identified. If both eigenvalues are real and positive, all trajectories flow away from the fixed point, which is then called unstable. If they are both negative, all trajectories flow toward it, and the fixed point is said to be stable. If the eigenvalues have opposite signs then the trajectories are repelled from one axis and attracted to the other. This is called a hyperbolic fixed point or a saddle point.
Figure 5.2: Unstable fixed point for real λ2 > λ1 > 0.<br />
Figure 5.3: Unstable fixed point for real λ1 = λ2 > 0.
Figure 5.4: Hyperbolic fixed point for real λ1 < 0, λ2 > 0.<br />
It is possible that A cannot be diagonalized. In that case it can at least be put in triangular form, i.e.

$$A = \begin{bmatrix} \lambda & 0 \\ \mu & \lambda \end{bmatrix} \tag{5.13}$$

then

$$z(t) = e^{\lambda t}\begin{bmatrix} 1 & 0 \\ \mu t & 1 \end{bmatrix} z_0 \tag{5.14}$$
Complex eigenvalues require a bit more discussion. Let λ = α + iβ and z = u + iv, where α and β are real numbers, and u and v are real vectors orthogonal to one another. Separating real and imaginary parts,

$$A \cdot u = \alpha u - \beta v \qquad A \cdot v = \beta u + \alpha v \tag{5.15}$$

Evidently A · z* = λ* z*, so z* is an eigenvector with eigenvalue λ*. Eigenvectors belonging to different eigenvalues are independent. We can construct the independent real vectors u and v as follows:

$$u = \frac{z + z^*}{2} \qquad v = \frac{z - z^*}{2i} \tag{5.16}$$
Figure 5.5: Unstable fixed point for nondiagonalizable A matrix. All of the integral curves are tangent to z2 at the fixed point.
Substituting these definitions into (5.10) gives

$$e^{At} u = e^{\alpha t}\left(u\cos\beta t - v\sin\beta t\right), \tag{5.17}$$

$$e^{At} v = e^{\alpha t}\left(u\sin\beta t + v\cos\beta t\right). \tag{5.18}$$
There are two important cases: α > 0, in which case the fixed point is unstable and the orbits are spirals, and α = 0, when the phase portrait consists of circles. In this case the fixed point is called a center or an elliptic point.
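The whole taxonomy of this section (nodes, saddles, spirals, centers) follows from the trace and determinant of A, since the eigenvalues satisfy λ² − λ tr A + det A = 0. A minimal classification sketch of my own, not from the text (border cases are lumped under "degenerate"):

```python
def classify_fixed_point(a, b, c, d):
    # classify the fixed point of zdot = A z for A = [[a, b], [c, d]]
    # via the eigenvalues, computed from the trace and determinant
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc >= 0:  # real eigenvalues
        l1 = 0.5 * (tr + disc ** 0.5)
        l2 = 0.5 * (tr - disc ** 0.5)
        if l1 > 0 and l2 > 0:
            return "unstable node"
        if l1 < 0 and l2 < 0:
            return "stable node"
        if l1 * l2 < 0:
            return "hyperbolic (saddle) point"
        return "degenerate"
    # complex pair alpha +/- i beta; stability is decided by alpha = tr/2
    if tr > 0:
        return "unstable spiral"
    if tr < 0:
        return "stable spiral"
    return "center (elliptic point)"
```

For example, A = [[0, 1], [−1, 0]] (the harmonic oscillator) gives a center, while A = [[1, 0], [0, −1]] gives a saddle.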
It should be remembered that (5.9) is a linearized equation. It holds in some small region around the fixed point, and of course, as is so often the case, the theory gives us no way to tell how small that region might be. The damped oscillator makes a good example of the method, and there are many other examples in the textbooks. On the other hand, the theory fails completely for the van der Pol oscillator of the previous section.
Figure 5.6: Unstable fixed point for complex λ with ℜ(λ) > 0.

Figure 5.7: Stable fixed point for λ pure imaginary.

5.3 The Henon oscillator

Although the theory from the previous section is perfectly general in the sense that it can be applied to systems with any number of degrees of freedom, it is almost impossible to visualize in four or more dimensions, and the
number of cases that must be considered increases rapidly. The best tool for visualizing higher dimensional spaces is the Poincaré section. This was described briefly in Chapter 3, and we will make more use of it shortly. Before doing so it will be useful to have a good example of motion with two degrees of freedom. A fascinating and oft-studied case is the Hénon-Heiles Hamiltonian. It was originally used to model the motion of stars in the galaxy.3 Written in terms of dimensionless variables the Hamiltonian is
$$H = \frac{1}{2}\left(\dot x^2 + \dot y^2 + k_1 x^2 + k_2 y^2\right) + \lambda\left(x^2 y - \frac{y^3}{3}\right) \tag{5.19}$$
This is the Hamiltonian of two uncoupled harmonic oscillators with a perturbation proportional to λ. The oscillators have frequencies ω1 = √k1 and ω2 = √k2. The phase space is the four-dimensional space spanned by x, ẋ, y, and ẏ. We can think of the unperturbed orbit as lying on two tori. In this case their cross sections are circular with radii determined by the initial conditions. If the winding number is w = r/s, x will complete r cycles while y completes s. Let us make a Poincaré section through the y torus at x = 0. Each time the orbit passes from x < 0 to x > 0 we mark a point at y and ẏ on the x = 0 plane. An example is shown in Figure (5.8) for w = 7/2. Because the winding number is rational there are seven discrete dots on the Poincaré section. The case of an irrational winding number is shown in Figure (5.9). The x vs. y plot is completely filled in, and the Poincaré plot is a continuous loop. Continuous loops like this on the Poincaré plot are a sign that the system is circulating around an invariant torus and hence is integrable.
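A Poincaré section like those in the figures can be generated with a few lines of code. The sketch below is my own, not from the text: it integrates the equations of motion that follow from (5.19), taking k1 = k2 = 1 for simplicity, with a fourth-order Runge-Kutta step, and records (y, ẏ) at each upward crossing of x = 0. The conserved energy provides a consistency check on the integration.

```python
import math

def henon_heiles_section(E, lam=1.0, y0=0.1, dt=0.001, t_end=100.0):
    # Poincare section of the Henon-Heiles system (k1 = k2 = 1) at x = 0.
    # Returns the section points (y, vy) at upward crossings and the final energy.
    def deriv(s):
        x, y, vx, vy = s
        return (vx, vy, -x - 2.0 * lam * x * y, -y - lam * (x * x - y * y))

    def rk4_step(s):
        k1 = deriv(s)
        k2 = deriv(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = deriv(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = deriv(tuple(si + dt * ki for si, ki in zip(s, k3)))
        return tuple(si + dt * (a + 2 * b + 2 * c + d) / 6.0
                     for si, a, b, c, d in zip(s, k1, k2, k3, k4))

    def energy(s):
        x, y, vx, vy = s
        return 0.5 * (vx * vx + vy * vy + x * x + y * y) + lam * (x * x * y - y ** 3 / 3.0)

    # start on the section: x = 0, vy = 0, vx fixed by the total energy E
    vx = math.sqrt(2.0 * E - y0 * y0 + (2.0 / 3.0) * lam * y0 ** 3)
    s = (0.0, y0, vx, 0.0)
    points = []
    for _ in range(int(t_end / dt)):
        x_old = s[0]
        s = rk4_step(s)
        if x_old < 0.0 <= s[0]:          # upward crossing of x = 0
            points.append((s[1], s[3]))  # record (y, dy/dt)
    return points, energy(s)
```

Scatter-plotting the returned points for increasing E (with λ = 1, as in the figures) reproduces the progression from smooth loops to scattered dots described below.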
When we turn on the perturbation by making λ ≠ 0, something remarkable happens.4 Figures (5.10) through (5.13) show a progression from the orderly motion of the uncoupled oscillators in Figure (5.9), through the loop in (5.10) suggesting motion around a single distorted torus. As the interaction strength is increased, this one torus breaks up into five separate tori. In the next plot, Figure (5.12), the points are beginning to disperse in a random way with some structure remaining. Because of the poor resolution of the plots one cannot see the fine details that remain. Finally, in the last plot, the points are arranged in a completely random pattern. This is paradigmatic. As the strength of the perturbation increases, orderly motion disintegrates

3See Goldstein's Classical Mechanics for a review of the physics.

4I am following standard practice by varying the perturbation strength by changing the total energy with λ = 1.
Figure 5.8: Harmonic oscillator coordinates for w = 7/2: the x vs. y orbit and the Poincaré section through x = 0.
Figure 5.9: Harmonic oscillator coordinates for an irrational winding number: the Poincaré section through x = 0.
Figure 5.10: Henon-Heiles Hamiltonian. Orbit circul<strong>at</strong>es a distorted torus.<br />
Figure 5.11: The orbit breaks up into smaller tori.
Figure 5.12: Chaos begins to set in.<br />
Figure 5.13: Complete Chaos
into chaos. One of the goals of chaos theory is to explain and predict this phenomenon. This will require some new formalism.
5.4 Discrete Maps<br />
Suppose we were to number the points on the Poincaré plot in the order they appeared as the orbit repeatedly cut through the x = 0 plane. This would give us a series of coordinates (x1, y1), (x2, y2), · · · , (xn, yn). Think of this in terms of a mapping operator T that maps the n-th point into the (n + 1)-th point.
T(xn, yn) ≡ (xn+1, yn+1) (5.20)<br />
In principle we could derive the exact mathematical form for this operator. (I doubt that anyone has actually done this.) Certainly we could write a computer program to do the mapping, and certainly we could derive a linearized version of T that would be OK for small displacements. For my purposes it will be enough to consider the general properties such operators must have. The first of these (from which all others flow) is that they must be area preserving.
Canonical transformations preserve the volume of phase space. This is called Liouville's theorem; it's proved in most mechanics texts. For a one-degree-of-freedom system, this is just preservation of area in the (p, q) phase plane. Thus for some area A, enclosed by a closed curve C, we can use Stokes' theorem to write

$$\oint_C p\, dq = \oint_{C'} p\, dq \tag{5.21}$$

where C′ is the shape of the curve after it has been changed by some canonical transformation, including the passage of time, which is itself a canonical transformation. Another way to say the same thing is that if the (q, p) point in phase space is transformed to (q′, p′), then the Jacobian

$$\left|\frac{\partial(q', p')}{\partial(q, p)}\right| = 1 \tag{5.22}$$

These results can be extended to higher dimensions in a completely straightforward manner.
So far, so good. There is a corollary to Liouville's theorem that is not so easy to prove: transformations of the form (5.20) on the Poincaré
Figure 5.14: The standard map with ϵ = 0.<br />
section also preserve area in the sense of (5.22). 5 Discrete maps are area<br />
preserving.<br />
Let's take time out for an example. The following transformation is called the standard map, presumably because it appears in so many different contexts. Thanks to the J_{n+1} (rather than J_n) in the first of equations (5.23) it is trivially area preserving.

$$\phi_{n+1} = (\phi_n + J_{n+1}) \bmod 2\pi \tag{5.23}$$
$$J_{n+1} = J_n + \epsilon\sin\phi_n$$
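In code the map is two lines per iteration (a sketch of my own; note that J must be updated first, since the φ update uses J_{n+1}):

```python
import math

def standard_map(phi, J, eps, n):
    # iterate the standard map (5.23) n times from the initial point (phi, J)
    for _ in range(n):
        J = J + eps * math.sin(phi)        # J_{n+1}
        phi = (phi + J) % (2.0 * math.pi)  # phi_{n+1}, wrapped onto [0, 2 pi)
    return phi, J
```

For ϵ = 0 the action J is an exact invariant and φ simply advances by J each step, reproducing the parallel circles of Figure 5.14.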
This is a map with one degree of freedom, written in terms of action-angle variables J and ϕ. Not only ϕ, but also J is periodic with period 2π. We can imagine all the orbits wrapped around a cylinder. In the case ϵ = 0, the (ϕn, Jn)'s lie along parallel circles as shown in Figure 5.14. When ϵ is increased to 0.050 a new feature appears: a loop in the center of the plot. This is unusual in the sense that it can be contracted to a point; it is topologically distinct from all the ϵ = 0 circles. As ϵ is increased, an assortment of smaller loops appears together with a smattering of completely random points. Because of limitations on plot resolution, computer time, and my patience you cannot see the really significant thing about this plot: this pattern of islands of loopy order interspersed with random dots persists at ever smaller and smaller scales. The islands have a property called self-similarity. In this sense they are
Figure 5.15: The standard map with ϵ = 0.050.<br />
similar to fractal patterns. Finally, as ϵ is increased further, all appearance of order disappears and the dots become completely random. This is the state of complete chaos.
5.5 Linearized Maps<br />
Like the continuous transformations we studied in Section 5.2, discrete maps have fixed points about which one can analyze the local topology. Consider a generic mapping of the form6

$$\begin{bmatrix} x_{i+1} \\ y_{i+1} \end{bmatrix} = T \begin{bmatrix} x_i \\ y_i \end{bmatrix} \tag{5.24}$$
A fixed point of the mapping would be a point where x_{i+1} = x_i and y_{i+1} = y_i. I will argue later on that in a plot like Figure 5.16 there are an infinite number of fixed points, but to keep the algebra simple here I will assume that the fixed point is at the origin (0, 0). Linearizing T about this point gives

$$\begin{bmatrix} \delta x_{i+1} \\ \delta y_{i+1} \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix} \begin{bmatrix} \delta x_i \\ \delta y_i \end{bmatrix} \tag{5.25}$$

where of course

$$T_{ij} = \left.\frac{\partial T_i}{\partial x_j}\right|_{(0,0)} \tag{5.26}$$

6I am using Tabor's notation from section 4.3.4.
Figure 5.16: The standard map with ϵ = 0.750.<br />
The eigenvalues λi of the Tij matrix must satisfy

$$\lambda^2 - \lambda\,\mathrm{trace}(T) + \det(T) = 0 \tag{5.27}$$
The all-important point here is that because of the area-preserving property of T, det(T) = 1. This greatly restricts the allowed types of fixed points. There are only three cases to consider.

If |trace(T)| < 2, λ1 and λ2 are a complex conjugate pair lying on the unit circle, that is,

$$\lambda_1 = e^{+i\alpha}, \qquad \lambda_2 = e^{-i\alpha} \tag{5.28}$$

This is simply a rotation in the vicinity of the fixed point (0, 0). This corresponds to a stable or elliptic point. Thus in the immediate neighborhood of (0, 0) we expect to find invariant curves like Figure 5.7.
If |trace(T )| > 2, λ1 λ2 are real numbers s<strong>at</strong>isfying<br />
λ1 = 1/λ2<br />
(5.29)<br />
There are two subcases to consider here depending on whether λ is positive<br />
or neg<strong>at</strong>ive. If it is positive we have a regular hyperbolic fixed point in<br />
which successive iter<strong>at</strong>e stay on the same branch of the hyperbola as in<br />
Figure 5.17 (a). If λ < 0 we have a hyperbolic-with-reflection fixed point<br />
in which successive iter<strong>at</strong>es jump backwards and forwards between opposite<br />
branches of the hyperbola. (See Figure 5.17 (b).)
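These three cases can be checked numerically. The sketch below builds T_{ij} by finite differences, verifies det T = 1, and classifies by the trace as in (5.27)–(5.29). It assumes one common convention for the standard map, which has fixed points at (ϕ, J) = (0, 0) and (π, 0), where sin ϕ vanishes.

```python
import math

def T(phi, J, eps):
    # One assumed convention for the standard map; signs may differ from (5.23).
    Jn = J + eps * math.sin(phi)
    return phi + Jn, Jn

def jacobian(phi, J, eps, h=1e-6):
    """Finite-difference linearization T_ij of Eqs. (5.25)-(5.26)."""
    f0 = T(phi, J, eps)
    cols = []
    for dphi, dJ in ((h, 0.0), (0.0, h)):
        f1 = T(phi + dphi, J + dJ, eps)
        cols.append(((f1[0] - f0[0]) / h, (f1[1] - f0[1]) / h))
    # cols[j] is the derivative with respect to the j-th variable
    return [[cols[0][0], cols[1][0]],
            [cols[0][1], cols[1][1]]]

def classify(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    tr = M[0][0] + M[1][1]
    assert abs(det - 1.0) < 1e-4          # area preservation: det T = 1
    if abs(tr) < 2.0:
        return "elliptic"
    return "hyperbolic" if tr > 0 else "hyperbolic-with-reflection"

eps = 0.75
print(classify(jacobian(0.0, 0.0, eps)))       # trace = 2 + eps > 2
print(classify(jacobian(math.pi, 0.0, eps)))   # |trace| = |2 - eps| < 2
```

For this convention the point at ϕ = 0 comes out hyperbolic and the one at ϕ = π elliptic, matching the island-around-center structure seen in Figure 5.16.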
72 CHAPTER 5. INTRODUCTION TO CHAOS
Figure 5.17: (a) Hyperbolic fixed point. (b) Hyperbolic-with-reflection fixed point.
5.6 Lyapunov Exponents
Loosely speaking, systems are chaotic because adjacent trajectories diverge exponentially from one another. If this were literally true we could parameterize this divergence with the function e^{λx}, where λ is some constant and x is the independent variable, which might be continuous or discrete depending on the application. This is the basic idea behind Lyapunov exponents, a formalism with many alternate definitions (and spellings).
Let's apply this idea first to a one-dimensional iterative map of the form
\[
x_{i+1} = f(x_i) \tag{5.30}
\]
We can characterize the divergence of two trajectories separated by ϵ upon the n-th iteration as
\[
\lim_{\epsilon \to 0} \frac{|f(x_n + \epsilon) - f(x_n)|}{\epsilon} = \left| \frac{df(x_n)}{dx_n} \right| \tag{5.31}
\]
A small but finite deviation at the n-th iteration, say δx_n, should grow to
\[
\delta x_{n+1} \approx \left| \frac{df(x_n)}{dx_n} \right| \delta x_n \tag{5.32}
\]
Continuing this reasoning,
\[
\left| \frac{\delta x_{n+1}}{\delta x_0} \right|
= \left| \frac{df(x_n)}{dx_n} \, \frac{df(x_{n-1})}{dx_{n-1}} \times \cdots \times \frac{df(x_0)}{dx_0} \right| \tag{5.33}
\]
\[
= \prod_{i=0}^{n} |f'(x_i)| = e^{\lambda n}
\]
The last equality is just a hypothesis. λ will certainly depend on the point n where we stop iterating. We should write instead
\[
\lambda(n) = \frac{1}{n} \ln \prod_{i=0}^{n} |f'(x_i)| \tag{5.34}
\]
with the understanding that the definition only makes sense if there is some range of n over which λ(n) is more or less constant. λ defined in this way is a Lyapunov exponent.
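As a concrete illustration of (5.34), here is a sketch using the logistic map x_{n+1} = r x_n (1 − x_n) — an example of my own choosing rather than one from the text. At r = 4 the exponent is known analytically to be ln 2 ≈ 0.693.

```python
import math

def lyapunov(f, fprime, x0, n, n_transient=100):
    """Estimate lambda(n) = (1/n) * sum_i ln|f'(x_i)|, Eq. (5.34)."""
    x = x0
    for _ in range(n_transient):      # discard the transient approach
        x = f(x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(fprime(x)))
        x = f(x)
    return total / n

r = 4.0
f = lambda x: r * x * (1.0 - x)
fprime = lambda x: r * (1.0 - 2.0 * x)
lam = lyapunov(f, fprime, 0.3, 100_000)
print(lam)   # settles near ln 2 ~ 0.693 for the fully chaotic case r = 4
```

Plotting λ(n) against n shows exactly the behavior demanded above: after the transient there is a long range of n over which λ(n) is more or less constant.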
In the case of multidimensional mappings
\[
x_{i+1} = F(x_i) \tag{5.35}
\]
where x and F are n-dimensional vectors, there will be a set of n characteristic exponents corresponding to the n eigenvalues of the linearized map (5.25). Introducing the eigenvalues λ_i(N), i = 1, . . . , n, of the matrix
\[
(LM)_N = \bigl( T(x_N)\, T(x_{N-1}) \cdots T(x_1) \bigr)^{1/N} \tag{5.36}
\]
where T(x_i) is the linearization of F at the point x_i, the exponents are defined as
\[
\sigma_i(N) = \ln |\lambda_i(N)| \tag{5.37}
\]
Since the T's have unit determinant for area-preserving maps, it is clear that the sum of the exponents must be zero.
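This last claim is easy to verify numerically: multiply the linearizations T(x_i) along an orbit as in (5.36) and take the eigenvalues of the product. The sketch below again assumes one common convention for the standard map; the values ϵ = 5.0 and N = 8 are arbitrary choices (strong kicking, short orbit) that keep the floating-point products well conditioned.

```python
import math, cmath

def step(phi, J, eps):
    # One assumed convention for the standard map (cf. Eq. 5.23).
    Jn = J + eps * math.sin(phi)
    return (phi + Jn) % (2.0 * math.pi), Jn

def tangent(phi, eps):
    # Analytic linearization T(x_i) of the map above; note det = 1.
    c = eps * math.cos(phi)
    return [[1.0 + c, 1.0], [c, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Accumulate the product T(x_N) ... T(x_1) along an orbit, as in Eq. (5.36).
eps, N = 5.0, 8
phi, J = 1.0, 0.5
P = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(N):
    P = matmul(tangent(phi, eps), P)
    phi, J = step(phi, J, eps)

tr = P[0][0] + P[1][1]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
disc = cmath.sqrt(tr * tr - 4.0 * det)      # complex-safe quadratic formula
l1, l2 = (tr + disc) / 2.0, (tr - disc) / 2.0
s1, s2 = math.log(abs(l1)) / N, math.log(abs(l2)) / N   # Eq. (5.37)
print(s1 + s2)   # ~0: for an area-preserving map the exponents sum to zero
```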
For the final example, suppose the equation of motion is
\[
\dot{x} = f(x) \tag{5.38}
\]
Let s(t) = x(t) − x_0(t) be the difference between two nearby trajectories. If this does indeed diverge exponentially with time, then ṡ = λs. Then we can argue that
\[
\dot{s} = \dot{x} - \dot{x}_0 = f(x) - f(x_0) = \lambda s = \lambda (x - x_0) \tag{5.39}
\]
\[
\lambda = \frac{f(x) - f(x_0)}{x - x_0} \approx \left. \frac{df}{dx} \right|_{x_0} \tag{5.40}
\]
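A quick numerical check of (5.38)–(5.40): integrate two nearby trajectories of ẋ = f(x) and compare the growth of their separation with df/dx. The choice f(x) = sin x is mine, picked because near the unstable fixed point x = 0 the prediction is λ = cos 0 = 1.

```python
import math

def flow(x, t, dt=1e-4):
    """Integrate xdot = sin(x) with fixed-step RK4 (illustrative choice of f)."""
    for _ in range(int(round(t / dt))):
        k1 = math.sin(x)
        k2 = math.sin(x + 0.5 * dt * k1)
        k3 = math.sin(x + 0.5 * dt * k2)
        k4 = math.sin(x + dt * k3)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

# Two nearby trajectories close to the unstable fixed point x = 0,
# where df/dx = cos(0) = 1, so s(t) should grow roughly like e^t.
t = 1.0
x0, x1 = 1e-4, 1.2e-4
s_initial = x1 - x0
s_final = flow(x1, t) - flow(x0, t)
lam_est = math.log(s_final / s_initial) / t
print(lam_est)   # close to df/dx at x = 0, i.e. 1
```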
5.7 The Poincaré-Birkhoff Theorem
The phase-space trajectories of integrable systems move on smooth tori. The appearance of the Poincaré section depends on whether the winding number is rational or irrational. If it is rational the section shows discrete points. If irrational, the points are 'ergodic' and form a continuous loop. Under the influence of nonlinear perturbations the tori become distorted, then break up into smaller tori, and finally disintegrate into chaos. It turns out that the way this happens depends on whether the winding number is rational or irrational. If it is irrational the tori are preserved, distorted but preserved, under small perturbations. This is a gross oversimplification of the KAM theorem, which I will discuss in Section 5.9. If the winding number is rational, the tori break up in a way that is governed by the so-called Poincaré-Birkhoff theorem, the subject of this section. This may seem like a swindle, since every irrational number can be approximated to arbitrary accuracy by a rational number. But, as it turns out, some numbers are more irrational than others!
I will prove the PB theorem for the standard map equation (5.23), but it is true under quite general assumptions. I will use the symbol T_ϵ for (5.23), i.e.
T_ϵ(ϕ_n, J_n) = (ϕ_{n+1}, J_{n+1}).
Now imagine the points in Figure 5.14 (ϵ = 0) plotted in polar coordinates (for positive J) with ϕ the angular and J the radial coordinate. The points now lie on concentric circles of constant J. Choose J ≡ J_r = 2πj/k, with k and j integers, i.e. J_r has a rational winding number. If we iterate T_0 k times, J remains unchanged and ϕ is incremented by j factors of 2π, which is to say, ϕ is not changed at all. Symbolically
T_0^k(ϕ, J_r) = (ϕ, J_r)
Now take a J_+ slightly larger than J_r. T_0^k will increment ϕ by slightly more than 2πj, so ϕ will increase. In the same way, if J_− < J_r, T_0^k will cause ϕ to decrease. We can imagine the values of ϕ lying on three circles J_+, J_r, and J_− as shown in Figure 5.18(a).
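The behavior of the three circles is easy to verify for the unperturbed map. The sketch below again assumes a common convention in which, for ϵ = 0, J is conserved and ϕ advances by J each step.

```python
import math

TWO_PI = 2.0 * math.pi

def T0(phi, J):
    # Unperturbed (eps = 0) standard map: J is conserved, phi advances by J.
    return (phi + J) % TWO_PI, J

def iterate(phi, J, k):
    for _ in range(k):
        phi, J = T0(phi, J)
    return phi, J

j, k = 1, 3
Jr = TWO_PI * j / k          # rational winding number j/k
phi0 = 0.4

# Every point on the J = Jr circle is a fixed point of T_0^k ...
phi_k, _ = iterate(phi0, Jr, k)
print(abs(phi_k - phi0))     # ~0

# ... while circles just above/below rotate forward/backward under T_0^k.
phi_plus, _ = iterate(phi0, Jr + 0.01, k)
phi_minus, _ = iterate(phi0, Jr - 0.01, k)
print(phi_plus - phi0, phi_minus - phi0)   # +0.03 and -0.03
```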
Now turn on a small perturbation ϵ > 0. T_ϵ^k will map some ϕ's to larger values and some to smaller, but there will be some locus of points, called C in Figure 5.19, which are not changed at all. In other words, the curve C is mapped purely radially.
T_ϵ^k(J_r, ϕ) = (J_c, ϕ)
Figure 5.18: (a) Three orbits of the unperturbed standard map T_0^k. (b) The ϕ coordinate is left invariant on C by the perturbed map T_ϵ^k.
Curve C is mapped into a new curve called D in Figure 5.18(b).
T_ϵ^k(J_c, ϕ) = (J_d, ϕ)
The curves C and D must have the same area (remember these are area-preserving transformations) so they must cross one another an even number of times. This situation is shown in Figure 5.19. The crossings represent points that are invariant under T_ϵ^k – they are fixed points.
This is our first result. A torus with rational winding number j/k is invariant under T_0^k, i.e. every point on the torus is a fixed point of T_0^k. When ϵ is even slightly larger than zero, only a discrete (even) number of fixed points of T_ϵ^k survive. You can ascertain the type of fixed points by seeing how other points in their immediate vicinity are mapped. Compare this flow, as it's called, with the arrows in Figures 5.4 and 5.17. You should be able to convince yourself that the points along the curve C are alternately hyperbolic and elliptic. Figure 5.20 should help you visualize this. Since there are an even number of fixed points, half of them will be elliptic and half hyperbolic. How many are there? Suppose (ϕ_0, J_0) is a fixed point of T_ϵ^k. We can create more fixed points by multiplying by T_ϵ as the following
Figure 5.19: The curves C and D. Crossings, like a and b, are fixed points.
Figure 5.20: A closer look at the fixed points a and b.
simple argument shows.
T_ϵ^k[T_ϵ(ϕ_0, J_0)] = T_ϵ T_ϵ^k(ϕ_0, J_0) = T_ϵ(ϕ_0, J_0)
Starting with (ϕ_0, J_0) we can create k − 1 additional fixed points by multiplying repeatedly with T_ϵ. To put it another way, every fixed point of T_ϵ^k is a member of a family of k fixed points obtained by multiplying by various powers of T_ϵ. Because each mapping is a continuous function of ϕ and J, all the members of an elliptic family are elliptic and all the members of a hyperbolic family are hyperbolic. I claim that all the members of a family are distinct. Proof: Let (ϕ_s, J_s) be the fixed point obtained by T_ϵ^s(ϕ_0, J_0) = (ϕ_s, J_s) with s < k. Then of course all such points are fixed points of T_ϵ^k. The claim is that there is no m < k such that T_ϵ^m(ϕ_s, J_s) = (ϕ_s, J_s). Multiply both sides of this equation with T_ϵ^{−s}. The result is T_ϵ^m(ϕ_0, J_0) = (ϕ_0, J_0). It is just this equation with m replaced by k that defines (ϕ_0, J_0). Hence m = k. Finally, note that none of these newly created fixed points can lie along the original curve C. If they did, there would be instances in which two hyperbolic or two elliptic points appeared side by side. This we know to be impossible. Consequently each torus breaks up into k fixed points for every fixed point on the curve C. This is the Poincaré-Birkhoff theorem.
5.8 All in a tangle
Have another look at the hyperbolic fixed points in Figures 5.4 and 5.17. There are always two loci of points leading directly toward the fixed point and two loci leading away from it. These are called the stable and unstable manifolds respectively. Following the notation of Hand and Finch I will call them H_+ and H_−. Call the fixed point p_f. Any point along H_+ will be mapped asymptotically back to p_f under repeated applications of T_ϵ, and any point on H_− will be mapped asymptotically back to p_f under repeated applications of T_ϵ^{−1}. Can these manifolds cross one another? I claim the following.
• H_+ and H_− cannot intersect themselves, but they can and do intersect one another.
• Stable manifolds of different fixed points cannot intersect one another.
• Unstable manifolds of different fixed points cannot intersect one another.
• Stable manifolds can intersect with unstable ones. The stable and unstable manifolds of a single fixed point intersect in what are called homoclinic points and those of two different fixed points, in heteroclinic points.
• Neither H_+ nor H_− can cross the tori surrounding elliptic fixed points.
• There are, depending on the size of ϵ, narrow bands surrounding tori with irrational winding number that are not broken up into isolated fixed points. This is the content of the KAM theorem to be discussed in the next section. Neither H_+ nor H_− can cross these bands.
The proofs of these assertions are easy and are given in Hand and Finch. Referring to Figure 5.21(a), x_0 is a heteroclinic point that lies on the unstable manifold H_− of p_f1 and the stable manifold H_+ of p_f2. Since both manifolds are invariant under T_ϵ, the T_ϵ^k x_0 are a set of discrete points that lie on both manifolds, so the two manifolds must therefore intersect again. For instance, because x_1 = T_ϵ x_0 is on both manifolds, H_− must loop around to meet H_+. Similarly the x_k = T_ϵ^k x_0 must lie on both manifolds, so H_− must loop around over and over again as illustrated in Figure 5.21(b). The inverse map also leaves H_+ and H_− invariant, and hence the x_{−k} = T_ϵ^{−k} x_0 are intersections that force H_+ to loop around to meet H_−. As k increases and x_k approaches one of the fixed points, the spacing between the intersections gets smaller, so the loops they create get narrower. But because T_ϵ is area-preserving, the loop areas are the same, so the loops get longer, which leads to many intersections among them, as shown in Figure 5.21(c) and (d).
Try explaining all this to an intimate friend on a date. The more you explain the more you will see that this mechanism produces a tangle of fathomless complexity. 7 Nonetheless the mess is contained, at least for small ϵ. Since stable manifolds cannot cross, the stable manifold emanating from p_f1 acts as a barrier to the stable manifold emanating from p_f2. The same is true of the unstable manifolds. The tangle also cannot cross the stable tori surrounding the elliptic fixed points, nor can it cross the KAM tori. As a consequence we expect to see islands of chaos developing between stable ellipses. This is clear in Figures 5.15 and 5.16. As ϵ increases, the KAM tori also break down and chaos engulfs the entire plot.
7 Don't try to explain this for higher-dimensional spaces. That way lies madness.
Figure 5.21: A heteroclinic intersection. (a) Two hyperbolic fixed points p_f1 and p_f2, and an intersection x_0 of the unstable manifold of p_f1 with the stable manifold of p_f2. (b) Adding the forward maps T^k x_0 of the intersection. (c) Adding the backward maps T^{−k} x_0 of the intersection. (d) Adding another intersection x′ and some of its backward maps. U – unstable manifold; S – stable manifold.
5.9 The KAM theorem and its consequences
For a system with n independent degrees of freedom to be integrable, it is a necessary and sufficient condition that n independent constants of the motion exist. In this case the system can be transformed into a set of action-angle variables
\[
\omega_0 \equiv (\omega_{01}, \omega_{02}, \cdots, \omega_{0n}) \qquad I_0 \equiv (I_{01}, I_{02}, \cdots, I_{0n}) \tag{5.41}
\]
In this notation
\[
\omega_0 = \frac{\partial H_0}{\partial I_0} \tag{5.42}
\]
Now suppose the system is perturbed slightly,
\[
H(\omega_0, I_0, \epsilon) = H_0(I_0) + \epsilon H_1(\omega_0, I_0) \tag{5.43}
\]
where I_0 and ω_0 are the AA variables of H_0. According to our perturbation formalism from Chapter 4, there are two series that must converge, (4.25) and (4.26), repeated here for convenience:
\[
\tilde{H}_1(I, \psi) = \sum_{k} A_k(I)\, e^{i k \cdot \psi} \tag{5.44}
\]
\[
F_1(I, \psi) = \sum_{k} B_k(I)\, e^{i k \cdot \psi} \tag{5.45}
\]
where k = (k_1, \cdots, k_n) is a vector of integers 8 and
\[
B_k = i\, \frac{A_k}{\omega_0 \cdot k} \tag{5.46}
\]
The rate of decrease of the |B_k| depends both on the |A_k| and the denominators |ω_0 · k|, so even if the |A_k| decrease fast enough for (5.44) to converge, (5.45) will not converge if the |ω_0 · k| decrease too rapidly.
The situation seems hopeless. If any of the ω_0's yields a rational winding number, the series will blow up immediately, and if one is working with finite precision – on a computer, for example – every number is a rational number. And yet we have seen from our computer models that some stable periodic
8 The sum over k means the sum over all possible combinations of the n integers k_1, · · · , k_n.
trajectories persist even under the influence of small perturbations. The circumstances under which this happens are spelled out in a remarkable theorem first outlined by Kolmogorov and later proved independently by Arnold and Moser. The theorem is extremely difficult and sophisticated, although Tabor has a nice explanation of the basic ideas, and an understandable outline of the proof is given in Classical Dynamics by José and Saletan. I will explain the theorem as carefully as I can and let it go at that.
5.9.1 Two Conditions
The KAM theorem claims that in regions of phase space where certain conditions hold, the perturbation series converges to all orders in ϵ. The first condition involves the Hessian matrix:
\[
\det \left| \frac{\partial \omega_{0\alpha}}{\partial I_{0\beta}} \right|
\equiv \det \left| \frac{\partial^2 H_0}{\partial I_{0\alpha}\, \partial I_{0\beta}} \right| \neq 0. \tag{5.47}
\]
The content of this statement is as follows: We assume that each torus has a unique frequency associated with it. Thus if we knew all the ω_0's we could calculate all the I_0's and vice versa. Equation (5.47) ensures that this is true. A simple (albeit artificial) example is provided by José and Saletan. Consider the one degree of freedom Hamiltonian
\[
H = a I^3 / 3 \tag{5.48}
\]
in which I takes on values in the interval −1 < I < +1. The above condition requires that
\[
\frac{d^2 H}{dI^2} = 2 a I \neq 0 \tag{5.49}
\]
Why is this significant? Note that ω(I) = dH/dI = aI². Inverting this gives I(ω) = ±√(ω/a). The ± is a sign that the inversion is not unique. There are two regions separated by I = 0. In the region 0 < I ≤ 1, I = +√(ω/a). In the region −1 ≤ I < 0, I = −√(ω/a). Thus there are two "good" regions separated by a barrier.
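The two-branched inversion can be seen in a few lines of code. In the sketch below the Hamiltonian is H = aI³/3 with an arbitrary positive constant a = 2.0 (the particular value is my choice).

```python
# Jose-Saletan-style example: H = a*I**3/3 on -1 < I < 1, with a an
# arbitrary positive constant (a = 2.0 here, purely for illustration).
a = 2.0
H = lambda I: a * I**3 / 3.0
omega = lambda I: a * I**2            # omega = dH/dI
hessian = lambda I: 2.0 * a * I       # d^2 H / dI^2, vanishes at I = 0

def invert(w, branch):
    """I(omega) is double-valued: branch = +1 for 0 < I, -1 for I < 0."""
    return branch * (w / a) ** 0.5

I = 0.7
w = omega(I)
print(invert(w, +1), invert(w, -1))   # ~ +0.7 and -0.7: inversion not unique
print(hessian(0.0))                   # 0: nondegeneracy condition fails at I = 0
```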
There is a second condition restricting the frequencies. Of course we are only considering frequencies with irrational winding numbers. Even if the frequencies are incommensurate, |ω_0 · k| could be arbitrarily small. The KAM theorem requires that it be bounded from below by the so-called "weak diophantine condition"
\[
|\omega_0 \cdot k| \ge \gamma |k|^{-\kappa} \quad \text{for all integer } k \tag{5.50}
\]
where |k| = √(k · k) and γ and κ > n are positive constants.
What is the significance of this strange inequality? The best way to understand it, I think, is to face up to the paradox I mentioned earlier that the series can only converge for irrational winding numbers, and yet it seems that every irrational number is "arbitrarily close" to a rational number. It is this last statement that needs to be examined more carefully. This requires a brief excursion into number theory. Consider the unit interval [0, 1]. The rationals have measure zero in the interval. That means roughly that they don't take up any space. This can be proved as follows. First put the rationals in a one-to-one correspondence with the integers. Construct a small open interval of length ϵ < 1 about the first rational, one of length ϵ² about the second, and so forth. The sum of all these little intervals (this is a geometric series) is σ = ϵ/(1 − ϵ), which can be made arbitrarily small by choosing ϵ small enough. Thus the space occupied by the rationals is less than any positive number. This requires taking the limit ϵ → 0. The paradoxical thing is that it is possible to remove a finite interval around each rational without deleting all of [0, 1]. This can be seen as follows. Write each rational in [0, 1] in its lowest form as p/q, and about each one construct an interval of length 1/q³. For each q there are at most q − 1 rationals. Thus for a given q no more than (q − 1)/q³ is covered by the intervals, and the total length Q that is covered is less (because of overlaps) than the sum of these intervals over all q.
\[
Q < \sum_{q=2}^{\infty} \frac{q-1}{q^3} < \sum_{q=2}^{\infty} \frac{1}{q^2} \tag{5.51}
\]
This sum is related to the Riemann zeta function. At any rate Q < 0.645. We can make this number as small as we like by replacing 1/q³ with Γ/q³, where Γ < 1. Even if we leave Γ = 1, the fraction of [0, 1] covered by the finite intervals is less than 1.
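The numbers quoted here are easy to reproduce. The partial sums below check that the covered length is bounded by Σ_{q≥2} 1/q² = π²/6 − 1 ≈ 0.645.

```python
import math

# Partial sums of Eq. (5.51): the total covered length Q is bounded by
# sum_{q>=2} (q-1)/q^3 < sum_{q>=2} 1/q^2 = pi^2/6 - 1.
Q_MAX = 1_000_000
s_cover = sum((q - 1) / q**3 for q in range(2, Q_MAX))
s_bound = sum(1.0 / q**2 for q in range(2, Q_MAX))

print(s_cover)                  # ~ 0.4429
print(s_bound)                  # ~ 0.6449
print(math.pi**2 / 6.0 - 1.0)   # the closed form of the bounding sum
```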
Now we can divide the irrationals into two sets, those covered by the intervals around the rationals and those outside the intervals. Those uncovered satisfy the condition
\[
\left| \omega - \frac{p}{q} \right| \ge \frac{\Gamma}{q^3}. \tag{5.52}
\]
Several comments are in order regarding this inequality.
q3 • Equ<strong>at</strong>ion (5.52) makes irr<strong>at</strong>ionality a quantit<strong>at</strong>ive concept. 9 Those<br />
irr<strong>at</strong>ionals th<strong>at</strong> s<strong>at</strong>isfy (5.52) are “more irr<strong>at</strong>ional” than those th<strong>at</strong><br />
9 This can also be quantified in terms of continued fraction expansions.
5.10. CONCLUSION 83<br />
don’t, and the extent of their irr<strong>at</strong>ionality can be quantified by th<strong>at</strong><br />
value of Γ for which they just do or do not s<strong>at</strong>isfy the inequality.<br />
• Equ<strong>at</strong>ion (5.50) is just an n-dimensional version of (5.52). The constants<br />
γ and κ characterize the degree of irr<strong>at</strong>ionality of ω in the same<br />
way th<strong>at</strong> Γ and the exponent 3 characterize the irr<strong>at</strong>ionality of ω in<br />
(5.52).<br />
• The uncovered irr<strong>at</strong>ionals occupy isol<strong>at</strong>ed “islands” between the covered<br />
intervals. We expect th<strong>at</strong> as the perturb<strong>at</strong>ion parameter ϵ is<br />
increased, those tori with less irr<strong>at</strong>ional winding numbers will be destroyed<br />
first, but islands of stability will remain between them. Eventually<br />
as the perturb<strong>at</strong>ion is increased, all will be swept away in chaos.<br />
• The KAM theorem gives us no clue how to calcul<strong>at</strong>e the appropri<strong>at</strong>e<br />
values of γ and κ or the values of ϵ for which chaos will set in. Some<br />
estim<strong>at</strong>es placed the critical value of ϵ to be something around 10 −50 !<br />
If this were true, of course, the theorem would be quite pointless.<br />
Numerical test with specific models have found critical values of ϵ as<br />
large as ϵc ≈ 1. I will close with a quote from José and Saletan, “To<br />
our knowledge a rigorous formal estim<strong>at</strong>e of a realistic critical value<br />
for ϵ remains an open question.”<br />
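"More irrational than others" can also be made concrete numerically. Since q²|ω − p/q| = q · dist(qω, Z) when p is the nearest integer to qω, the minimum of this quantity over q measures how well ω is approximated by rationals. The comparison of the golden mean (√5 − 1)/2 with π below is my own illustration; π is famously well approximated by 355/113.

```python
import math

def irrationality(w, q_max):
    """min over q <= q_max of q^2 * |w - p/q|, with p the nearest integer
    to q*w. Small values mean w is unusually well approximated by rationals."""
    best = float("inf")
    for q in range(1, q_max + 1):
        dist = abs(q * w - round(q * w))   # q^2|w - p/q| = q * dist(q*w, Z)
        best = min(best, q * dist)
    return best

golden = (math.sqrt(5.0) - 1.0) / 2.0
print(irrationality(golden, 3000))   # bounded away from zero (~ 0.382)
print(irrationality(math.pi, 3000))  # tiny: 355/113 approximates pi very well
```

In the sense of (5.52), the golden mean is among the "most irrational" numbers, which is why tori with golden-mean winding numbers are the last to be destroyed.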
5.10 Conclusion
This is the end of our story about chaos. Remember that we have only dealt with bounded, conservative systems with time-independent Hamiltonians. (Classical mechanics is a big subject.) Systems with one degree of freedom are trivial (in principle) to solve using the method of quadratures. Systems with n degrees of freedom are trivial (again in principle) if they have n constants of motion. Such a system can be reduced by using action-angle variables to an ensemble of uncoupled oscillators. These systems are said to be integrable and they do not display chaos. The trouble comes when we introduce some non-integrability as a perturbation. Perturbation theory is straightforward with one degree of freedom, but with two or more degrees of freedom comes the notorious problem of small denominators. 10 Perturbation theory fails immediately for all periodic trajectories with rational
10 There are other ways of doing perturbation theory in addition to the one described here. They all suffer the same problem.
winding number. According to the Poincaré-Birkhoff theorem, these trajectories on the Poincaré section break up into complicated whorls and tangles surrounded by regions of stability corresponding to irrational winding numbers. According to the KAM theorem these regions break down as well, with those with "more irrational" winding numbers surviving those with less. At last "Universal darkness covers all," and the trajectories, though deterministic, show no order or pattern.