DIFFERENTIAL EQUATIONS, DYNAMICAL SYSTEMS, AND AN INTRODUCTION TO CHAOS


This is volume 60, second edition, in the PURE AND APPLIED MATHEMATICS series.
Founding Editors: Paul A. Smith and Samuel Eilenberg


DIFFERENTIAL EQUATIONS, DYNAMICAL SYSTEMS, AND AN INTRODUCTION TO CHAOS

Morris W. Hirsch, University of California, Berkeley
Stephen Smale, University of California, Berkeley
Robert L. Devaney, Boston University

Amsterdam Boston Heidelberg London New York Oxford Paris San Diego San Francisco Singapore Sydney Tokyo

Academic Press is an imprint of Elsevier


Senior Editor, Mathematics: Barbara Holland
Associate Editor: Tom Singer
Project Manager: Kyle Sarofeen
Marketing Manager: Linda Beattie
Production Services: Beth Callaway, Graphic World
Cover Design: Eric DeCicco
Copyediting: Graphic World
Composition: Cepha Imaging Pvt. Ltd.
Printer: Maple-Vail

This book is printed on acid-free paper.

Copyright 2004, Elsevier (USA). All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.com.uk. You may also complete your request on-line via the Elsevier Science homepage (http://elsevier.com), by selecting "Customer Support" and then "Obtaining Permissions."

Academic Press, an imprint of Elsevier
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.academicpress.com

Academic Press, an imprint of Elsevier
84 Theobald's Road, London WC1X 8RR, UK
http://www.academicpress.com

Library of Congress Cataloging-in-Publication Data

Hirsch, Morris W., 1933-
Differential equations, dynamical systems, and an introduction to chaos / Morris W. Hirsch, Stephen Smale, Robert L. Devaney. p. cm.
Rev. ed. of: Differential equations, dynamical systems, and linear algebra / Morris W. Hirsch and Stephen Smale. 1974.
Includes bibliographical references and index.
ISBN 0-12-349703-5 (alk. paper)
1. Differential equations. 2. Algebras, Linear. 3. Chaotic behavior in systems. I. Smale, Stephen, 1930- II. Devaney, Robert L., 1948- III. Hirsch, Morris W., 1933- Differential equations, dynamical systems, and linear algebra. IV. Title.
QA372.H67 2003
515'.35--dc22
2003058255

PRINTED IN THE UNITED STATES OF AMERICA
03 04 05 06 07 08 9 8 7 6 5 4 3 2 1


Contents

Preface

CHAPTER 1  First-Order Equations
1.1 The Simplest Example
1.2 The Logistic Population Model
1.3 Constant Harvesting and Bifurcations
1.4 Periodic Harvesting and Periodic Solutions
1.5 Computing the Poincaré Map
1.6 Exploration: A Two-Parameter Family

CHAPTER 2  Planar Linear Systems
2.1 Second-Order Differential Equations
2.2 Planar Systems
2.3 Preliminaries from Algebra
2.4 Planar Linear Systems
2.5 Eigenvalues and Eigenvectors
2.6 Solving Linear Systems
2.7 The Linearity Principle

CHAPTER 3  Phase Portraits for Planar Systems
3.1 Real Distinct Eigenvalues
3.2 Complex Eigenvalues


3.3 Repeated Eigenvalues
3.4 Changing Coordinates

CHAPTER 4  Classification of Planar Systems
4.1 The Trace-Determinant Plane
4.2 Dynamical Classification
4.3 Exploration: A 3D Parameter Space

CHAPTER 5  Higher Dimensional Linear Algebra
5.1 Preliminaries from Linear Algebra
5.2 Eigenvalues and Eigenvectors
5.3 Complex Eigenvalues
5.4 Bases and Subspaces
5.5 Repeated Eigenvalues
5.6 Genericity

CHAPTER 6  Higher Dimensional Linear Systems
6.1 Distinct Eigenvalues
6.2 Harmonic Oscillators
6.3 Repeated Eigenvalues
6.4 The Exponential of a Matrix
6.5 Nonautonomous Linear Systems

CHAPTER 7  Nonlinear Systems
7.1 Dynamical Systems
7.2 The Existence and Uniqueness Theorem
7.3 Continuous Dependence of Solutions
7.4 The Variational Equation
7.5 Exploration: Numerical Methods

CHAPTER 8  Equilibria in Nonlinear Systems
8.1 Some Illustrative Examples
8.2 Nonlinear Sinks and Sources
8.3 Saddles
8.4 Stability
8.5 Bifurcations
8.6 Exploration: Complex Vector Fields


CHAPTER 9  Global Nonlinear Techniques
9.1 Nullclines
9.2 Stability of Equilibria
9.3 Gradient Systems
9.4 Hamiltonian Systems
9.5 Exploration: The Pendulum with Constant Forcing

CHAPTER 10  Closed Orbits and Limit Sets
10.1 Limit Sets
10.2 Local Sections and Flow Boxes
10.3 The Poincaré Map
10.4 Monotone Sequences in Planar Dynamical Systems
10.5 The Poincaré-Bendixson Theorem
10.6 Applications of Poincaré-Bendixson
10.7 Exploration: Chemical Reactions That Oscillate

CHAPTER 11  Applications in Biology
11.1 Infectious Diseases
11.2 Predator/Prey Systems
11.3 Competitive Species
11.4 Exploration: Competition and Harvesting

CHAPTER 12  Applications in Circuit Theory
12.1 An RLC Circuit
12.2 The Lienard Equation
12.3 The van der Pol Equation
12.4 A Hopf Bifurcation
12.5 Exploration: Neurodynamics

CHAPTER 13  Applications in Mechanics
13.1 Newton's Second Law
13.2 Conservative Systems
13.3 Central Force Fields
13.4 The Newtonian Central Force System


13.5 Kepler's First Law
13.6 The Two-Body Problem
13.7 Blowing Up the Singularity
13.8 Exploration: Other Central Force Problems
13.9 Exploration: Classical Limits of Quantum Mechanical Systems

CHAPTER 14  The Lorenz System
14.1 Introduction to the Lorenz System
14.2 Elementary Properties of the Lorenz System
14.3 The Lorenz Attractor
14.4 A Model for the Lorenz Attractor
14.5 The Chaotic Attractor
14.6 Exploration: The Rössler Attractor

CHAPTER 15  Discrete Dynamical Systems
15.1 Introduction to Discrete Dynamical Systems
15.2 Bifurcations
15.3 The Discrete Logistic Model
15.4 Chaos
15.5 Symbolic Dynamics
15.6 The Shift Map
15.7 The Cantor Middle-Thirds Set
15.8 Exploration: Cubic Chaos
15.9 Exploration: The Orbit Diagram

CHAPTER 16  Homoclinic Phenomena
16.1 The Shil'nikov System
16.2 The Horseshoe Map
16.3 The Double Scroll Attractor
16.4 Homoclinic Bifurcations
16.5 Exploration: The Chua Circuit

CHAPTER 17  Existence and Uniqueness Revisited
17.1 The Existence and Uniqueness Theorem
17.2 Proof of Existence and Uniqueness


17.3 Continuous Dependence on Initial Conditions
17.4 Extending Solutions
17.5 Nonautonomous Systems
17.6 Differentiability of the Flow

Bibliography
Index


Preface

In the 30 years since the publication of the first edition of this book, much has changed in the field of mathematics known as dynamical systems. In the early 1970s, we had very little access to high-speed computers and computer graphics. The word chaos had never been used in a mathematical setting, and most of the interest in the theory of differential equations and dynamical systems was confined to a relatively small group of mathematicians.

Things have changed dramatically in the ensuing three decades. Computers are everywhere, and software packages that can be used to approximate solutions of differential equations and view the results graphically are widely available. As a consequence, the analysis of nonlinear systems of differential equations is much more accessible than it once was. The discovery of such complicated dynamical systems as the horseshoe map, homoclinic tangles, and the Lorenz system, and their mathematical analyses, convinced scientists that simple stable motions such as equilibria or periodic solutions were not always the most important behavior of solutions of differential equations. The beauty and relative accessibility of these chaotic phenomena motivated scientists and engineers in many disciplines to look more carefully at the important differential equations in their own fields. In many cases, they found chaotic behavior in these systems as well. Now dynamical systems phenomena appear in virtually every area of science, from the oscillating Belousov-Zhabotinsky reaction in chemistry to the chaotic Chua circuit in electrical engineering, from complicated motions in celestial mechanics to the bifurcations arising in ecological systems.

As a consequence, the audience for a text on differential equations and dynamical systems is considerably larger and more diverse than it was in the 1970s.


We have accordingly made several major structural changes to this text, including the following:

1. The treatment of linear algebra has been scaled back. We have dispensed with the generalities involved with abstract vector spaces and normed linear spaces. We no longer include a complete proof of the reduction of all $n \times n$ matrices to canonical form. Rather we deal primarily with matrices no larger than $4 \times 4$.
2. We have included a detailed discussion of the chaotic behavior in the Lorenz attractor, the Shil'nikov system, and the double scroll attractor.
3. Many new applications are included; previous applications have been updated.
4. There are now several chapters dealing with discrete dynamical systems.
5. We deal primarily with systems that are $C^\infty$, thereby simplifying many of the hypotheses of theorems.

The book consists of three main parts. The first part deals with linear systems of differential equations together with some first-order nonlinear equations. The second part of the book is the main part of the text: Here we concentrate on nonlinear systems, primarily two dimensional, as well as applications of these systems in a wide variety of fields. The third part deals with higher dimensional systems. Here we emphasize the types of chaotic behavior that do not occur in planar systems, as well as the principal means of studying such behavior, the reduction to a discrete dynamical system.

Writing a book for a diverse audience whose backgrounds vary greatly poses a significant challenge. We view this book as a text for a second course in differential equations that is aimed not only at mathematicians, but also at scientists and engineers who are seeking to develop sufficient mathematical skills to analyze the types of differential equations that arise in their disciplines. Many who come to this book will have strong backgrounds in linear algebra and real analysis, but others will have less exposure to these fields. To make this text accessible to both groups, we begin with a fairly gentle introduction to low-dimensional systems of differential equations. Much of this will be a review for readers with deeper backgrounds in differential equations, so we intersperse some new topics throughout the early part of the book for these readers.

For example, the first chapter deals with first-order equations. We begin this chapter with a discussion of linear differential equations and the logistic population model, topics that should be familiar to anyone who has a rudimentary acquaintance with differential equations. Beyond this review, we discuss the logistic model with harvesting, both constant and periodic. This allows us to introduce bifurcations at an early stage as well as to describe Poincaré maps and periodic solutions. These are topics that are not usually found in elementary differential equations courses, yet they are accessible to anyone with a background in multivariable calculus.


Of course, readers with a limited background may wish to skip these specialized topics at first and concentrate on the more elementary material.

Chapters 2 through 6 deal with linear systems of differential equations. Again we begin slowly, with Chapters 2 and 3 dealing only with planar systems of differential equations and two-dimensional linear algebra. Chapters 5 and 6 introduce higher dimensional linear systems; however, our emphasis remains on three- and four-dimensional systems rather than completely general $n$-dimensional systems, though many of the techniques we describe extend easily to higher dimensions.

The core of the book lies in the second part. Here we turn our attention to nonlinear systems. Unlike linear systems, nonlinear systems present some serious theoretical difficulties such as existence and uniqueness of solutions, dependence of solutions on initial conditions and parameters, and the like. Rather than plunge immediately into these difficult theoretical questions, which require a solid background in real analysis, we simply state the important results in Chapter 7 and present a collection of examples that illustrate what these theorems say (and do not say). Proofs of all of these results are included in the final chapter of the book.

In the first few chapters in the nonlinear part of the book, we introduce such important techniques as linearization near equilibria, nullcline analysis, stability properties, limit sets, and bifurcation theory. In the latter half of this part, we apply these ideas to a variety of systems that arise in biology, electrical engineering, mechanics, and other fields.

Many of the chapters conclude with a section called "Exploration." These sections consist of a series of questions and numerical investigations dealing with a particular topic or application relevant to the preceding material. In each Exploration we give a brief introduction to the topic at hand and provide references for further reading about this subject. But we leave it to the reader to tackle the behavior of the resulting system using the material presented earlier. We often provide a series of introductory problems as well as hints as to how to proceed, but in many cases, a full analysis of the system could become a major research project. You will not find "answers in the back of the book" for these questions; in many cases nobody knows the complete answer. (Except, of course, you!)

The final part of the book is devoted to the complicated nonlinear behavior of higher dimensional systems known as chaotic behavior. We introduce these ideas via the famous Lorenz system of differential equations. As is often the case in dimensions three and higher, we reduce the problem of comprehending the complicated behavior of this differential equation to that of understanding the dynamics of a discrete dynamical system or iterated function. So we then take a detour into the world of discrete systems, discussing along the way how symbolic dynamics may be used to describe completely certain chaotic systems.


We then return to nonlinear differential equations to apply these techniques to other chaotic systems, including those that arise when homoclinic orbits are present.

We maintain a website at math.bu.edu/hsd devoted to issues regarding this text. Look here for errata, suggestions, and other topics of interest to teachers and students of differential equations. We welcome any contributions from readers at this site.

It is a special pleasure to thank Bard Ermentrout, John Guckenheimer, Tasso Kaper, Jerrold Marsden, and Gareth Roberts for their many fine comments about an earlier version of this edition. Thanks are especially due to Daniel Look and Richard Moeckel for a careful reading of the entire manuscript. Many of the phase plane drawings in this book were made using the excellent Mathematica package called DynPac: A Dynamical Systems Package for Mathematica, written by Al Clark. See www.me.rochester.edu/~clark/dynpac.html. And, as always, Killer Devaney digested the entire manuscript; all errors that remain are due to her.


Acknowledgements

I would like to thank the following reviewers:

Bruce Peckham, University of Minnesota
Bard Ermentrout, University of Pittsburgh
Richard Moeckel, University of Minnesota
Jerry Marsden, CalTech
John Guckenheimer, Cornell University
Gareth Roberts, College of Holy Cross
Hans Lindblad, University of California San Diego
Tom LoFaro, Gustavus Adolphus College
Daniel M. Look, Boston University


1
First-Order Equations

The purpose of this chapter is to develop some elementary yet important examples of first-order differential equations. These examples illustrate some of the basic ideas in the theory of ordinary differential equations in the simplest possible setting.

We anticipate that the first few examples in this chapter will be familiar to readers who have taken an introductory course in differential equations. Later examples, such as the logistic model with harvesting, are included to give the reader a taste of certain topics (bifurcations, periodic solutions, and Poincaré maps) that we will return to often throughout this book. In later chapters, our treatment of these topics will be much more systematic.

1.1 The Simplest Example

The differential equation familiar to all calculus students

$$\frac{dx}{dt} = ax$$

is the simplest differential equation. It is also one of the most important. First, what does it mean? Here $x = x(t)$ is an unknown real-valued function of a real variable $t$, and $dx/dt$ is its derivative (we will also use $x'$ or $x'(t)$ for the derivative). Also, $a$ is a parameter; for each value of $a$ we have a different differential equation.


The equation tells us that for every value of $t$ the relationship

$$x'(t) = a x(t)$$

is true.

The solutions of this equation are obtained from calculus: If $k$ is any real number, then the function $x(t) = ke^{at}$ is a solution since

$$x'(t) = a k e^{at} = a x(t).$$

Moreover, there are no other solutions. To see this, let $u(t)$ be any solution and compute the derivative of $u(t)e^{-at}$:

$$\frac{d}{dt}\left(u(t)e^{-at}\right) = u'(t)e^{-at} + u(t)\left(-ae^{-at}\right) = au(t)e^{-at} - au(t)e^{-at} = 0.$$

Therefore $u(t)e^{-at}$ is a constant $k$, so $u(t) = ke^{at}$. This proves our assertion. We have therefore found all possible solutions of this differential equation. We call the collection of all solutions of a differential equation the general solution of the equation.

The constant $k$ appearing in this solution is completely determined if the value $u_0$ of a solution at a single point $t_0$ is specified. Suppose that a function $x(t)$ satisfying the differential equation is also required to satisfy $x(t_0) = u_0$. Then we must have $ke^{at_0} = u_0$, so that $k = u_0 e^{-at_0}$. Thus we have determined $k$, and this equation therefore has a unique solution satisfying the specified initial condition $x(t_0) = u_0$. For simplicity, we often take $t_0 = 0$; then $k = u_0$. There is no loss of generality in taking $t_0 = 0$, for if $u(t)$ is a solution with $u(0) = u_0$, then the function $v(t) = u(t - t_0)$ is a solution with $v(t_0) = u_0$.

It is common to restate this in the form of an initial value problem:

$$x' = ax, \qquad x(0) = u_0.$$

A solution $x(t)$ of an initial value problem must not only solve the differential equation, but it must also take on the prescribed initial value $u_0$ at $t = 0$.

Note that there is a special solution of this differential equation when $k = 0$. This is the constant solution $x(t) \equiv 0$. A constant solution such as this is called an equilibrium solution or equilibrium point for the equation. Equilibria are often among the most important solutions of differential equations.
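Although the point here is the closed-form solution, the formula is easy to check numerically. The sketch below is our illustration, not part of the original text; the values of $a$ and $k$ are arbitrary choices. It integrates $x' = ax$ with a standard solver and compares the result with $ke^{at}$.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, k = -0.7, 2.0    # arbitrary parameter and initial value x(0) = k

sol = solve_ivp(lambda t, x: a * x,              # right-hand side x' = ax
                (0.0, 5.0), [k],
                t_eval=np.linspace(0.0, 5.0, 11),
                rtol=1e-9, atol=1e-12)

exact = k * np.exp(a * sol.t)                    # closed-form solution ke^{at}
print(np.max(np.abs(sol.y[0] - exact)))          # should be tiny (~1e-9)
```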


The constant $a$ in the equation $x' = ax$ can be considered a parameter. If $a$ changes, the equation changes and so do the solutions. Can we describe qualitatively the way the solutions change? The sign of $a$ is crucial here:

1. If $a > 0$, $\lim_{t \to \infty} ke^{at}$ equals $\infty$ when $k > 0$ and equals $-\infty$ when $k < 0$;
2. if $a = 0$, $ke^{at}$ is the constant solution $x(t) \equiv k$;
3. if $a < 0$, $\lim_{t \to \infty} ke^{at} = 0$ for every $k$.

This qualitative behavior may be summarized on the phase line: The equilibrium point $x = 0$ sits on the $x$-axis (indicated in Figure 1.1 by a solid dot), while any other solution moves up or down the $x$-axis, as indicated by the arrows in Figure 1.1.

The equation $x' = ax$ is stable in a certain sense if $a \neq 0$. More precisely, if $a$ is replaced by another constant $b$ whose sign is the same as $a$, then the qualitative behavior of the solutions does not change. But if $a = 0$, the slightest change in $a$ leads to a radical change in the behavior of solutions. We therefore say that we have a bifurcation at $a = 0$ in the one-parameter family of equations $x' = ax$.

1.2 The Logistic Population Model

The differential equation $x' = ax$ above can be considered a simplistic model of population growth when $a > 0$. The quantity $x(t)$ measures the population of some species at time $t$. The assumption that leads to the differential equation is that the rate of growth of the population (namely, $dx/dt$) is directly proportional to the size of the population. Of course, this naive assumption omits many circumstances that govern actual population growth, including, for example, the fact that actual populations cannot increase without bound.

To take this restriction into account, we can make the following further assumptions about the population model:

1. If the population is small, the growth rate is nearly directly proportional to the size of the population;
2. but if the population grows too large, the growth rate becomes negative.

One differential equation that satisfies these assumptions is the logistic population growth model. This differential equation is

$$x' = ax\left(1 - \frac{x}{N}\right).$$

Here $a$ and $N$ are positive parameters: $a$ gives the rate of population growth when $x$ is small, while $N$ represents a sort of "ideal" population or "carrying capacity." Note that if $x$ is small, the differential equation is essentially $x' = ax$ [since the term $1 - (x/N) \approx 1$], but if $x > N$, then $x' < 0$. Thus this simple equation satisfies the above assumptions. We should add here that there are many other differential equations that correspond to these assumptions; our choice is perhaps the simplest.

Without loss of generality we will assume that $N = 1$. That is, we will choose units so that the carrying capacity is exactly 1 unit of population, and $x(t)$ therefore represents the fraction of the ideal population present at time $t$.


Therefore the logistic equation reduces to

$$x' = f_a(x) = ax(1 - x).$$

This is an example of a first-order, autonomous, nonlinear differential equation. It is first order since only the first derivative of $x$ appears in the equation. It is autonomous since the right-hand side of the equation depends on $x$ alone, not on time $t$. And it is nonlinear since $f_a(x)$ is a nonlinear function of $x$. The previous example, $x' = ax$, is a first-order, autonomous, linear differential equation.

The solution of the logistic differential equation is easily found by the tried-and-true calculus method of separation and integration:

$$\int \frac{dx}{x(1 - x)} = \int a\, dt.$$

The method of partial fractions allows us to rewrite the left integral as

$$\int \left( \frac{1}{x} + \frac{1}{1 - x} \right) dx.$$

Integrating both sides and then solving for $x$ yields

$$x(t) = \frac{Ke^{at}}{1 + Ke^{at}}$$

where $K$ is the arbitrary constant that arises from integration. Evaluating this expression at $t = 0$ and solving for $K$ gives

$$K = \frac{x(0)}{1 - x(0)}.$$

Using this, we may rewrite this solution as

$$\frac{x(0)e^{at}}{1 - x(0) + x(0)e^{at}}.$$

So this solution is valid for any initial population $x(0)$. When $x(0) = 1$, we have an equilibrium solution, since $x(t)$ reduces to $x(t) \equiv 1$. Similarly, $x(t) \equiv 0$ is also an equilibrium solution.

Thus we have "existence" of solutions for the logistic differential equation. We have no guarantee that these are all of the solutions to this equation at this stage; we will return to this issue when we discuss the existence and uniqueness problem for differential equations in Chapter 7.
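As a quick numerical sanity check (ours, not the book's), the closed-form solution can be compared with a direct integration of the logistic equation; the growth rate and initial fraction below are arbitrary.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, x0 = 1.5, 0.2    # arbitrary growth rate and initial fraction x(0)

t = np.linspace(0.0, 6.0, 13)
sol = solve_ivp(lambda t, x: a * x * (1.0 - x), (0.0, 6.0), [x0],
                t_eval=t, rtol=1e-9, atol=1e-12)

exact = x0 * np.exp(a * t) / (1.0 - x0 + x0 * np.exp(a * t))
print(np.max(np.abs(sol.y[0] - exact)))   # should be tiny: the formula checks out
print(sol.y[0, -1])                       # approaches the carrying capacity 1
```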


[Figure 1.3: The slope field, solution graphs, and phase line for $x' = ax(1 - x)$.]

To get a qualitative feeling for the behavior of solutions, we sketch the slope field for this equation. The right-hand side of the differential equation determines the slope of the graph of any solution at each time $t$. Hence we may plot little slope lines in the $tx$-plane as in Figure 1.3, with the slope of the line at $(t, x)$ given by the quantity $ax(1 - x)$. Our solutions must therefore have graphs that are everywhere tangent to this slope field. From these graphs, we see immediately that, in agreement with our assumptions, all solutions for which $x(0) > 0$ tend to the ideal population $x(t) \equiv 1$. For $x(0) < 0$, solutions tend to $-\infty$, although these solutions are irrelevant in the context of a population model.

Note that we can also read this behavior from the graph of the function $f_a(x) = ax(1 - x)$. This graph, displayed in Figure 1.4, crosses the $x$-axis at the two points $x = 0$ and $x = 1$, so these represent our equilibrium points. When $0 < x < 1$, we have $f_a(x) > 0$, so slopes are positive and solutions increase in this strip; when $x < 0$ or $x > 1$, we have $f_a(x) < 0$, so solutions decrease. More generally, the sign of the derivative of the right-hand side at an equilibrium point determines its type: The equilibrium is a sink (nearby solutions tend to it) when the derivative there is negative, and a source (nearby solutions move away from it) when the derivative there is positive.
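Slope fields like the one in Figure 1.3 are straightforward to draw with standard plotting tools; here is a minimal sketch of our own, using matplotlib, for $x' = ax(1 - x)$:

```python
import numpy as np
import matplotlib.pyplot as plt

a = 1.0
t, x = np.meshgrid(np.linspace(0.0, 5.0, 20), np.linspace(-0.5, 1.5, 20))
slope = a * x * (1.0 - x)          # the right-hand side gives the slope at (t, x)

# draw a short segment with direction (1, slope) at each grid point,
# normalized so all segments have the same length
norm = np.hypot(1.0, slope)
plt.quiver(t, x, 1.0 / norm, slope / norm, angles='xy')
plt.xlabel('t')
plt.ylabel('x')
plt.show()
```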


[Figure 1.4: The graph of the function $f(x) = ax(1 - x)$ with $a = 3.2$.]

[Figure 1.5: The slope field, solution graphs, and phase line for $x' = x - x^3$.]

Example. As a further illustration of these qualitative ideas, consider the differential equation

$$x' = g(x) = x - x^3.$$

There are three equilibrium points, at $x = 0, \pm 1$. Since $g'(x) = 1 - 3x^2$, we have $g'(0) = 1$, so the equilibrium point $0$ is a source. Also, $g'(\pm 1) = -2$, so the equilibrium points at $\pm 1$ are both sinks. Between these equilibria, the sign of the slope field of this equation is nonzero. From this information we can immediately display the phase line, which is shown in Figure 1.5.

1.3 Constant Harvesting and Bifurcations

Now let's modify the logistic model to take into account harvesting of the population. Suppose that the population obeys the logistic assumptions with the parameter $a = 1$, but is also harvested at the constant rate $h$.


[Figure 1.6: The graphs of the functions $f_h(x) = x(1 - x) - h$.]

The differential equation becomes

$$x' = x(1 - x) - h$$

where $h \geq 0$ is a new parameter.

Rather than solving this equation explicitly (which can be done; see Exercise 6 at the end of this chapter), we use the graphs of the functions

$$f_h(x) = x(1 - x) - h$$

to "read off" the qualitative behavior of solutions. In Figure 1.6 we display the graph of $f_h$ in three different cases: $0 < h < 1/4$, $h = 1/4$, and $h > 1/4$. When $h < 1/4$, the graph of $f_h$ crosses the $x$-axis at two equilibrium points,

$$x_\ell = \frac{1}{2}\left(1 - \sqrt{1 - 4h}\right), \qquad x_r = \frac{1}{2}\left(1 + \sqrt{1 - 4h}\right);$$

when $h = 1/4$ these merge into the single equilibrium $x = 1/2$, and when $h > 1/4$ the graph lies entirely below the $x$-axis, so there are no equilibria at all. Since $f_h$ is negative outside the interval $[x_\ell, x_r]$ and positive inside it, solutions tend to the equilibrium point $x_r$ provided the initial population is sufficiently large ($x(0) \geq x_\ell$). But a very small change in the rate of harvesting when $h = 1/4$ leads to a major change in the fate of the population: At any rate of harvesting $h > 1/4$, the species becomes extinct.

[Figure 1.7: The bifurcation diagram for $f_h(x) = x(1 - x) - h$.]

This phenomenon highlights the importance of detecting bifurcations in families of differential equations, a procedure that we will encounter many times in later chapters. We should also mention that, despite the simplicity of this population model, the prediction that small changes in harvesting rates can lead to disastrous changes in population has been observed many times in real situations on earth.
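The arithmetic behind Figures 1.6 and 1.7 is just the quadratic formula: the equilibria are the real roots of $x(1 - x) - h = 0$, which exist only for $h \leq 1/4$. A small sketch of our own:

```python
import numpy as np

for h in (0.1, 0.25, 0.4):
    disc = 1.0 - 4.0 * h              # discriminant of x^2 - x + h = 0
    if disc >= 0:
        x_l = 0.5 * (1.0 - np.sqrt(disc))
        x_r = 0.5 * (1.0 + np.sqrt(disc))
        print(f"h = {h}: equilibria at x = {x_l:.3f} and x = {x_r:.3f}")
    else:
        print(f"h = {h}: no equilibria; every solution eventually decreases")
```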


Example. As another example of a bifurcation, consider the family of differential equations

$$x' = g_a(x) = x^2 - ax = x(x - a)$$

which depends on a parameter $a$. The equilibrium points are given by $x = 0$ and $x = a$. We compute $g_a'(0) = -a$, so $0$ is a sink if $a > 0$ and a source if $a < 0$. Similarly, $g_a'(a) = a$, so the equilibrium point at $x = a$ is a sink if $a < 0$ and a source if $a > 0$. As $a$ passes through $0$ the two equilibria collide and exchange their stability: We have a bifurcation at $a = 0$. The bifurcation diagram for this family is shown in Figure 1.8.

[Figure 1.8: The bifurcation diagram for $x' = x^2 - ax$.]

1.4 Periodic Harvesting and Periodic Solutions

Now let's change our assumptions about how the population is harvested. Harvesting need not occur at a constant rate: For example, populations of many species of fish are harvested at a higher rate in warmer seasons than in colder months. So we assume that the population is harvested at a periodic rate. One such model is then

$$x' = f(t, x) = ax(1 - x) - h(1 + \sin(2\pi t))$$

where again $a$ and $h$ are positive parameters. Thus the harvesting reaches a maximum rate $-2h$ at time $t = \frac{1}{4} + n$ where $n$ is an integer (representing the year), and the harvesting reaches its minimum value 0 when $t = \frac{3}{4} + n$, exactly one-half year later. Note that this differential equation now depends explicitly on time; this is an example of a nonautonomous differential equation. As in the autonomous case, a solution $x(t)$ of this equation must satisfy $x'(t) = f(t, x(t))$ for all $t$. Also, this differential equation is no longer separable, so we cannot generate an analytic formula for its solution using the usual methods from calculus. Thus we are forced to take a more qualitative approach.

To describe the fate of the population in this case, we first note that the right-hand side of the differential equation is periodic with period 1 in the time variable. That is, $f(t + 1, x) = f(t, x)$. This fact simplifies somewhat the problem of finding solutions. Suppose that we know the solution of all initial value problems, not for all times, but only for $0 \leq t \leq 1$. Then in fact we know the solutions for all time. For example, suppose $x_1(t)$ is the solution that is defined for $0 \leq t \leq 1$ and satisfies $x_1(0) = x_0$. Suppose that $x_2(t)$ is the solution that satisfies $x_2(0) = x_1(1)$. Then we may extend the solution $x_1$ by defining $x_1(t + 1) = x_2(t)$ for $0 \leq t \leq 1$. The extended function is a solution since we have

$$x_1'(t + 1) = x_2'(t) = f(t, x_2(t)) = f(t + 1, x_1(t + 1)).$$


[Figure 1.9: The slope field for $x' = x(1 - x) - h(1 + \sin(2\pi t))$.]

Thus if we know the behavior of all solutions in the interval $0 \leq t \leq 1$, then we can extrapolate in similar fashion to all time intervals and thereby know the behavior of solutions for all time.

Secondly, suppose that we know the value at time $t = 1$ of the solution satisfying any initial condition $x(0) = x_0$. Then, to each such initial condition $x_0$, we can associate the value $x(1)$ of the solution $x(t)$ that satisfies $x(0) = x_0$. This gives us a function $p(x_0) = x(1)$. If we compose this function with itself, we derive the value of the solution through $x_0$ at time 2; that is, $p(p(x_0)) = x(2)$. If we compose this function with itself $n$ times, then we can compute the value of the solution curve at time $n$ and hence we know the fate of the solution curve.

The function $p$ is called a Poincaré map for this differential equation. Having such a function allows us to move from the realm of continuous dynamical systems (differential equations) to the often easier-to-understand realm of discrete dynamical systems (iterated functions). For example, suppose that we know that $p(x_0) = x_0$ for some initial condition $x_0$. That is, $x_0$ is a fixed point for the function $p$. Then from our previous observations, we know that $x(n) = x_0$ for each integer $n$. Moreover, for each time $t$ with $0 \leq t \leq 1$, we then have $x(t + n) = x(t)$ for every integer $n$; that is, the solution through $x_0$ is a periodic solution of the equation with period 1.
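The passage from the flow to the iteration $x_0 \mapsto p(x_0)$ is easy to mimic numerically. The sketch below is ours; it uses the parameter values $a = 5$, $h = 0.8$ from Figure 1.10, approximates $p$ by integrating over one period, and then iterates.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, h = 5.0, 0.8     # parameter values used in Figure 1.10

def f(t, x):
    # logistic growth with periodic harvesting
    return a * x * (1.0 - x) - h * (1.0 + np.sin(2.0 * np.pi * t))

def p(x0):
    """Poincare map: the value x(1) of the solution with x(0) = x0."""
    return solve_ivp(f, (0.0, 1.0), [x0], rtol=1e-10, atol=1e-12).y[0, -1]

x = 0.7             # an arbitrary initial population
for n in range(1, 6):
    x = p(x)        # p(p(x0)) = x(2), and so on
    print(n, x)     # the iterates should settle toward a fixed point of p
```

Because $f$ has period 1 in $t$, applying `p` repeatedly with the same integration window $[0, 1]$ is exactly the composition described in the text.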


[Figure 1.10: The Poincaré map for $x' = 5x(1 - x) - 0.8(1 + \sin(2\pi t))$.]

1.5 Computing the Poincaré Map

Before computing the Poincaré map for this equation, we introduce some important terminology. To emphasize the dependence of a solution on the initial value $x_0$, we will denote the corresponding solution by $\phi(t, x_0)$. This function $\phi: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is called the flow associated to the differential equation. If we hold the variable $x_0$ fixed, then the function

$$t \mapsto \phi(t, x_0)$$

is just an alternative expression for the solution of the differential equation satisfying the initial condition $x_0$. Sometimes we write this function as $\phi_t(x_0)$.

Example. For our first example, $x' = ax$, the flow is given by

$$\phi(t, x_0) = x_0 e^{at}.$$

For the logistic equation (without harvesting), the flow is

$$\phi(t, x_0) = \frac{x(0)e^{at}}{1 - x(0) + x(0)e^{at}}.$$

Now we return to the logistic differential equation with periodic harvesting

$$x' = f(t, x) = ax(1 - x) - h(1 + \sin(2\pi t)).$$


The solution satisfying the initial condition $x(0) = x_0$ is given by $t \mapsto \phi(t, x_0)$. While we do not have a formula for this expression, we do know that, by the fundamental theorem of calculus, this solution satisfies

$$\phi(t, x_0) = x_0 + \int_0^t f(s, \phi(s, x_0))\, ds$$

since

$$\frac{\partial \phi}{\partial t}(t, x_0) = f(t, \phi(t, x_0))$$

and $\phi(0, x_0) = x_0$.

If we differentiate this solution with respect to $x_0$, we obtain, using the chain rule:

$$\frac{\partial \phi}{\partial x_0}(t, x_0) = 1 + \int_0^t \frac{\partial f}{\partial x_0}(s, \phi(s, x_0)) \cdot \frac{\partial \phi}{\partial x_0}(s, x_0)\, ds.$$

Now let

$$z(t) = \frac{\partial \phi}{\partial x_0}(t, x_0).$$

Note that

$$z(0) = \frac{\partial \phi}{\partial x_0}(0, x_0) = 1.$$

Differentiating $z$ with respect to $t$, we find

$$z'(t) = \frac{\partial f}{\partial x_0}(t, \phi(t, x_0)) \cdot \frac{\partial \phi}{\partial x_0}(t, x_0) = \frac{\partial f}{\partial x_0}(t, \phi(t, x_0)) \cdot z(t).$$

Again, we do not know $\phi(t, x_0)$ explicitly, but this equation does tell us that $z(t)$ solves the differential equation

$$z'(t) = \frac{\partial f}{\partial x_0}(t, \phi(t, x_0))\, z(t)$$

with $z(0) = 1$. Consequently, via separation of variables, we may compute that the solution of this equation is

$$z(t) = \exp \int_0^t \frac{\partial f}{\partial x_0}(s, \phi(s, x_0))\, ds$$


and so we find

$$\frac{\partial \phi}{\partial x_0}(1, x_0) = \exp \int_0^1 \frac{\partial f}{\partial x_0}(s, \phi(s, x_0))\, ds.$$

Since $p(x_0) = \phi(1, x_0)$, we have determined the derivative $p'(x_0)$ of the Poincaré map. Note that $p'(x_0) > 0$. Therefore $p$ is an increasing function.

Differentiating once more, we find

$$p''(x_0) = p'(x_0)\left(\int_0^1 \frac{\partial^2 f}{\partial x_0^2}(s, \phi(s, x_0)) \cdot \exp\left(\int_0^s \frac{\partial f}{\partial x_0}(u, \phi(u, x_0))\, du\right) ds\right),$$

which looks pretty intimidating. However, since

$$f(t, x_0) = ax_0(1 - x_0) - h(1 + \sin(2\pi t)),$$

we have

$$\frac{\partial^2 f}{\partial x_0^2} \equiv -2a.$$

Thus we know in addition that $p''(x_0) < 0$. Consequently, the graph of the Poincaré map is concave down. This implies that the graph of $p$ can cross the diagonal line $y = x$ at most two times. That is, there can be at most two values of $x$ for which $p(x) = x$. Therefore the Poincaré map has at most two fixed points. These fixed points yield periodic solutions of the original differential equation. These are solutions that satisfy $x(t + 1) = x(t)$ for all $t$. Another way to say this is that the flow $\phi(t, x_0)$ is a periodic function in $t$ with period 1 when the initial condition $x_0$ is one of the fixed points. We saw these two solutions in the particular case when $h = 0.8$ in Figure 1.10. In Figure 1.11, we again see two solutions that appear to be periodic. Note that one of these solutions appears to attract all nearby solutions, while the other appears to repel them. We will return to these concepts often and make them more precise later in the book.

Recall that the differential equation also depends on the harvesting parameter $h$. For small values of $h$ there will be two fixed points such as shown in Figure 1.11. Differentiating $f$ with respect to $h$, we find

$$\frac{\partial f}{\partial h}(t, x_0) = -(1 + \sin(2\pi t)).$$

Hence $\partial f/\partial h \leq 0$, so increasing the harvesting rate pushes every solution downward. The two fixed points of $p$ therefore move closer together as $h$ increases, until at some critical harvesting rate $h = h^*$ they coalesce. For $h > h^*$, there are no fixed points for $p$ and so $p(x_0) < x_0$ for all $x_0$: Whatever the initial population, the species eventually becomes extinct.
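Both conclusions, $p' > 0$ and $p'' < 0$, can be checked numerically by sampling the map on a grid. The sketch below is ours; it again uses $a = 5$, $h = 0.8$, with a window of initial values chosen so the solutions do not escape to $-\infty$, and approximates the derivatives by finite differences.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, h = 5.0, 0.8

def f(t, x):
    return a * x * (1.0 - x) - h * (1.0 + np.sin(2.0 * np.pi * t))

def p(x0):
    return solve_ivp(f, (0.0, 1.0), [x0], rtol=1e-10, atol=1e-12).y[0, -1]

xs = np.linspace(0.45, 0.9, 24)
ps = np.array([p(x) for x in xs])

dp = np.diff(ps) / np.diff(xs)       # finite-difference approximation to p'
print((dp > 0).all())                # expected True: p is increasing
print((np.diff(dp) < 0).all())       # expected True: p' decreases, so p is concave down
```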


[Figure 1.11: Several solutions of $x' = 5x(1 - x) - 0.8(1 + \sin(2\pi t))$.]

1.6 Exploration: A Two-Parameter Family

Consider the family of differential equations

$$x' = f_{a,b}(x) = ax - x^3 - b$$

which depends on two parameters, $a$ and $b$. The goal of this exploration is to combine all of the ideas in this chapter to put together a complete picture of the two-dimensional parameter plane (the $ab$-plane) for this differential equation. Feel free to use a computer to experiment with this differential equation at first, but then try to verify your observations rigorously.

1. First fix $a = 1$. Use the graph of $f_{1,b}$ to construct the bifurcation diagram for this family of differential equations depending on $b$.
2. Repeat the previous step for $a = 0$ and then for $a = -1$.
3. What does the bifurcation diagram look like for other values of $a$?
4. Now fix $b$ and use the graph to construct the bifurcation diagram for this family, which this time depends on $a$.
5. In the $ab$-plane, sketch the regions where the corresponding differential equation has different numbers of equilibrium points, including a sketch of the boundary between these regions.
6. Describe, using phase lines and the graph of $f_{a,b}(x)$, the bifurcations that occur as the parameters pass through this boundary.
7. Describe in detail the bifurcations that occur at $a = b = 0$ as $a$ and/or $b$ vary.


8. Consider the differential equation $x' = x - x^3 - b\sin(2\pi t)$ where $|b|$ is small. What can you say about solutions of this equation? Are there any periodic solutions?
9. Experimentally, what happens as $|b|$ increases? Do you observe any bifurcations? Explain what you observe.

EXERCISES

1. Find the general solution of the differential equation $x' = ax + 3$ where $a$ is a parameter. What are the equilibrium points for this equation? For which values of $a$ are the equilibria sinks? For which are they sources?
2. For each of the following differential equations, find all equilibrium solutions and determine if they are sinks, sources, or neither. Also, sketch the phase line.
(a) $x' = x^3 - 3x$
(b) $x' = x^4 - x^2$
(c) $x' = \cos x$
(d) $x' = \sin^2 x$
(e) $x' = |1 - x^2|$
3. Each of the following families of differential equations depends on a parameter $a$. Sketch the corresponding bifurcation diagrams.
(a) $x' = x^2 - ax$
(b) $x' = x^3 - ax$
(c) $x' = x^3 - x + a$
4. Consider the function $f(x)$ whose graph is displayed in Figure 1.12.

[Figure 1.12: The graph of the function $f$.]


(a) Sketch the phase line corresponding to the differential equation $x' = f(x)$.
(b) Let $g_a(x) = f(x) + a$. Sketch the bifurcation diagram corresponding to the family of differential equations $x' = g_a(x)$.
(c) Describe the different bifurcations that occur in this family.
5. Consider the family of differential equations

$$x' = ax + \sin x$$

where $a$ is a parameter.
(a) Sketch the phase line when $a = 0$.
(b) Use the graphs of $ax$ and $\sin x$ to determine the qualitative behavior of all of the bifurcations that occur as $a$ increases from $-1$ to $1$.
(c) Sketch the bifurcation diagram for this family of differential equations.
6. Find the general solution of the logistic differential equation with constant harvesting

$$x' = x(1 - x) - h$$

for all values of the parameter $h > 0$.
7. Consider the nonautonomous differential equation

$$x' = \begin{cases} x - 4 & \text{if } t \ldots \end{cases}$$


10. Consider the differential equation $x' = x + \cos t$.
(a) Find the general solution of this equation.
(b) Prove that there is a unique periodic solution for this equation.
(c) Compute the Poincaré map $p: \{t = 0\} \to \{t = 2\pi\}$ for this equation and use this to verify again that there is a unique periodic solution.
11. First-order differential equations need not have solutions that are defined for all times.
(a) Find the general solution of the equation $x' = x^2$.
(b) Discuss the domains over which each solution is defined.
(c) Give an example of a differential equation for which the solution satisfying $x(0) = 0$ is defined only for $-1 < t < 1$.


Exercises 19for all t. Suppose there are constants p, q such thatf (t, p) > 0, f (t, q) < 0for all t. Prove that there is a periodic solution x(t) for this <strong>eq</strong>uation withp




2
Planar Linear Systems

In this chapter we begin the study of systems of differential equations. A system of differential equations is a collection of $n$ interrelated differential equations of the form

$$\begin{aligned}
x_1' &= f_1(t, x_1, x_2, \ldots, x_n)\\
x_2' &= f_2(t, x_1, x_2, \ldots, x_n)\\
&\;\;\vdots\\
x_n' &= f_n(t, x_1, x_2, \ldots, x_n).
\end{aligned}$$

Here the functions $f_j$ are real-valued functions of the $n + 1$ variables $x_1, x_2, \ldots, x_n$, and $t$. Unless otherwise specified, we will always assume that the $f_j$ are $C^\infty$ functions. This means that the partial derivatives of all orders of the $f_j$ exist and are continuous.

To simplify notation, we will use vector notation:

$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

We often write the vector $X$ as $(x_1, \ldots, x_n)$ to save space.


Our system may then be written more concisely as

$$X' = F(t, X)$$

where

$$F(t, X) = \begin{pmatrix} f_1(t, x_1, \ldots, x_n) \\ \vdots \\ f_n(t, x_1, \ldots, x_n) \end{pmatrix}.$$

A solution of this system is then a function of the form $X(t) = (x_1(t), \ldots, x_n(t))$ that satisfies the equation, so that

$$X'(t) = F(t, X(t))$$

where $X'(t) = (x_1'(t), \ldots, x_n'(t))$. Of course, at this stage, we have no guarantee that there is such a solution, but we will begin to discuss this complicated question in Section 2.7.

The system of equations is called autonomous if none of the $f_j$ depends on $t$, so the system becomes $X' = F(X)$. For most of the rest of this book we will be concerned with autonomous systems.

In analogy with first-order differential equations, a vector $X_0$ for which $F(X_0) = 0$ is called an equilibrium point for the system. An equilibrium point corresponds to a constant solution $X(t) \equiv X_0$ of the system as before.

Just to set some notation once and for all, we will always denote real variables by lowercase letters such as $x$, $y$, $x_1$, $x_2$, $t$, and so forth. Real-valued functions will also be written in lowercase such as $f(x, y)$ or $f_1(x_1, \ldots, x_n, t)$. We will reserve capital letters for vectors such as $X = (x_1, \ldots, x_n)$, or for vector-valued functions such as

$$F(x, y) = (f(x, y), g(x, y))$$

or

$$H(x_1, \ldots, x_n) = \begin{pmatrix} h_1(x_1, \ldots, x_n) \\ \vdots \\ h_n(x_1, \ldots, x_n) \end{pmatrix}.$$

We will denote $n$-dimensional Euclidean space by $\mathbb{R}^n$, so that $\mathbb{R}^n$ consists of all vectors of the form $X = (x_1, \ldots, x_n)$.


2.1 Second-Order Differential Equations

Many of the most important differential equations encountered in science and engineering are second-order differential equations. These are differential equations of the form

$$x'' = f(t, x, x').$$

Important examples of second-order equations include Newton's equation

$$m x'' = f(x),$$

the equation for an RLC circuit in electrical engineering

$$LC x'' + RC x' + x = v(t),$$

and the mainstay of most elementary differential equations courses, the forced harmonic oscillator

$$m x'' + b x' + k x = f(t).$$

We will discuss these and more complicated relatives of these equations at length as we go along. First, however, we note that these equations are a special subclass of two-dimensional systems of differential equations that are defined by simply introducing a second variable $y = x'$.

For example, consider a second-order constant coefficient equation of the form

$$x'' + ax' + bx = 0.$$

If we let $y = x'$, then we may rewrite this equation as a system of first-order equations

$$\begin{aligned} x' &= y\\ y' &= -bx - ay. \end{aligned}$$

Any second-order equation can be handled in a similar manner. Thus, for the remainder of this book, we will deal primarily with systems of equations.
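A minimal sketch (ours, with arbitrary coefficients) of this reduction in code: the second-order equation $x'' + ax' + bx = 0$ becomes a first-order system in $(x, y)$ with $y = x'$, which any standard solver can integrate.

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b = 3.0, 2.0     # arbitrary coefficients in x'' + ax' + bx = 0

def rhs(t, X):
    x, y = X        # y plays the role of x'
    return [y, -b * x - a * y]      # x' = y,  y' = -bx - ay

sol = solve_ivp(rhs, (0.0, 10.0), [1.0, 0.0],
                t_eval=np.linspace(0.0, 10.0, 101))
print(sol.y[0, -1])  # x(10) is near 0: solutions decay for these coefficients
```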


2.2 Planar Systems

For the remainder of this chapter we will deal with autonomous systems in $\mathbb{R}^2$, which we will write in the form

$$\begin{aligned} x' &= f(x, y)\\ y' &= g(x, y) \end{aligned}$$

thus eliminating the annoying subscripts on the functions and variables. As above, we often use the abbreviated notation $X' = F(X)$ where $X = (x, y)$ and $F(X) = F(x, y) = (f(x, y), g(x, y))$.

In analogy with the slope fields of Chapter 1, we regard the right-hand side of this equation as defining a vector field on $\mathbb{R}^2$. That is, we think of $F(x, y)$ as representing a vector whose $x$- and $y$-components are $f(x, y)$ and $g(x, y)$, respectively. We visualize this vector as being based at the point $(x, y)$. For example, the vector field associated to the system

$$\begin{aligned} x' &= y\\ y' &= -x \end{aligned}$$

is displayed in Figure 2.1. Note that, in this case, many of the vectors overlap, making the pattern difficult to visualize. For this reason, we always draw a direction field instead, which consists of scaled versions of the vectors.

A solution of this system should now be thought of as a parameterized curve in the plane of the form $(x(t), y(t))$ such that, for each $t$, the tangent vector at the point $(x(t), y(t))$ is $F(x(t), y(t))$. That is, the solution curve $(x(t), y(t))$ winds its way through the plane always tangent to the given vector $F(x(t), y(t))$ based at $(x(t), y(t))$.

[Figure 2.1: The vector field, direction field, and several solutions for the system $x' = y$, $y' = -x$.]


Example. The curve

$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} a \sin t \\ a \cos t \end{pmatrix}$$

for any $a \in \mathbb{R}$ is a solution of the system

$$\begin{aligned} x' &= y\\ y' &= -x \end{aligned}$$

since

$$\begin{aligned} x'(t) &= a \cos t = y(t)\\ y'(t) &= -a \sin t = -x(t) \end{aligned}$$

as required by the differential equation. These curves define circles of radius $|a|$ in the plane that are traversed in the clockwise direction as $t$ increases. When $a = 0$, the solutions are the constant functions $x(t) \equiv 0 \equiv y(t)$.

Note that this example is equivalent to the second-order differential equation $x'' = -x$ by simply introducing the second variable $y = x'$. This is an example of a linear second-order differential equation, which, in more general form, can be written

$$a(t) x'' + b(t) x' + c(t) x = f(t).$$

An important special case of this is the linear, constant coefficient equation

$$a x'' + b x' + c x = f(t),$$

which we write as a system as

$$\begin{aligned} x' &= y\\ y' &= -\frac{c}{a} x - \frac{b}{a} y + \frac{f(t)}{a}. \end{aligned}$$

An even more special case is the homogeneous equation in which $f(t) \equiv 0$.
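A numerical look at these circles (our sketch, not part of the text): integrate the system from $(0, a)$ and check that $x(t)^2 + y(t)^2$ stays constant and that the solution returns to its starting point after time $2\pi$.

```python
import numpy as np
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, X: [X[1], -X[0]],         # x' = y, y' = -x
                (0.0, 2.0 * np.pi), [0.0, 2.0],     # start at (0, a) with a = 2
                t_eval=np.linspace(0.0, 2.0 * np.pi, 9),
                rtol=1e-10, atol=1e-12)

print(np.hypot(sol.y[0], sol.y[1]))   # stays at 2: the orbit is a circle of radius |a|
print(sol.y[:, -1])                   # back near (0, 2) after one full period
```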


Example. One of the simplest yet most important second-order, linear, constant coefficient differential equations is the equation for a harmonic oscillator. This equation models the motion of a mass attached to a spring. The spring is attached to a vertical wall and the mass is allowed to slide along a horizontal track. We let $x$ denote the displacement of the mass from its natural resting place (with $x > 0$ if the spring is stretched and $x < 0$ if it is compressed). The mass is subject to a restoring force $-kx$ exerted by the spring and a damping force $-bx'$ due to friction, where $b \geq 0$ is the damping constant and $k > 0$ is the spring constant. Newton's law states that the force acting on the oscillator is equal to mass times acceleration. Therefore the differential equation for the damped harmonic oscillator is

$$m x'' + b x' + k x = 0.$$

If $b = 0$, the oscillator is said to be undamped; otherwise, we have a damped harmonic oscillator. This is an example of a second-order, linear, constant coefficient, homogeneous differential equation. As a system, the harmonic oscillator equation becomes

$$\begin{aligned} x' &= y\\ y' &= -\frac{k}{m} x - \frac{b}{m} y. \end{aligned}$$

More generally, the motion of the mass-spring system can be subjected to an external force (such as moving the vertical wall back and forth periodically). Such an external force usually depends only on time, not position, so we have a more general forced harmonic oscillator system

$$m x'' + b x' + k x = f(t)$$

where $f(t)$ represents the external force. This is now a nonautonomous, second-order, linear equation.

2.3 Preliminaries from Algebra

Before proceeding further with systems of differential equations, we need to recall some elementary facts regarding systems of algebraic equations. We will often encounter simultaneous equations of the form

$$\begin{aligned} ax + by &= \alpha\\ cx + dy &= \beta \end{aligned}$$


where the values of $a$, $b$, $c$, and $d$ as well as $\alpha$ and $\beta$ are given. In matrix form, we may write this equation as

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}.$$

We denote by $A$ the $2 \times 2$ coefficient matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

This system of equations is easy to solve, assuming that there is a solution. There is a unique solution of these equations if and only if the determinant of $A$ is nonzero. Recall that this determinant is the quantity given by

$$\det A = ad - bc.$$

If $\det A = 0$, we may or may not have solutions, but if there is a solution, then in fact there must be infinitely many solutions.

In the special case where $\alpha = \beta = 0$, we always have infinitely many solutions of

$$A \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

when $\det A = 0$. Indeed, if the coefficient $a$ of $A$ is nonzero, we have $x = -(b/a)y$ and so

$$-c\left(\frac{b}{a}\right)y + dy = 0.$$

Thus $(ad - bc)y = 0$. Since $\det A = 0$, the solutions of the equation assume the form $(-(b/a)y, y)$ where $y$ is arbitrary. This says that every solution lies on a straight line through the origin in the plane. A similar line of solutions occurs as long as at least one of the entries of $A$ is nonzero. We will not worry too much about the case where all entries of $A$ are 0; in fact, we will completely ignore it.

Let $V$ and $W$ be vectors in the plane. We say that $V$ and $W$ are linearly independent if $V$ and $W$ do not lie along the same straight line through the origin. The vectors $V$ and $W$ are linearly dependent if either $V$ or $W$ is the zero vector or if both lie on the same line through the origin.

A geometric criterion for two vectors in the plane to be linearly independent is that they do not point in the same or opposite directions.


That is, $V$ and $W$ are linearly independent if and only if $V \neq \lambda W$ for any real number $\lambda$. An equivalent algebraic criterion for linear independence is as follows:

Proposition. Suppose $V = (v_1, v_2)$ and $W = (w_1, w_2)$. Then $V$ and $W$ are linearly independent if and only if

$$\det \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \end{pmatrix} \neq 0.$$

For a proof, see Exercise 11 at the end of this chapter.

Whenever we have a pair of linearly independent vectors $V$ and $W$, we may always write any vector $Z \in \mathbb{R}^2$ in a unique way as a linear combination of $V$ and $W$. That is, we may always find a pair of real numbers $\alpha$ and $\beta$ such that

$$Z = \alpha V + \beta W.$$

Moreover, $\alpha$ and $\beta$ are unique. To see this, suppose $Z = (z_1, z_2)$. Then we must solve the equations

$$\begin{aligned} z_1 &= \alpha v_1 + \beta w_1\\ z_2 &= \alpha v_2 + \beta w_2 \end{aligned}$$

where the $v_i$, $w_i$, and $z_i$ are known. But this system has a unique solution $(\alpha, \beta)$ since

$$\det \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \end{pmatrix} \neq 0.$$

The linearly independent vectors $V$ and $W$ are said to define a basis for $\mathbb{R}^2$. Any vector $Z$ has unique "coordinates" relative to $V$ and $W$. These coordinates are the pair $(\alpha, \beta)$ for which $Z = \alpha V + \beta W$.

Example. The unit vectors $E_1 = (1, 0)$ and $E_2 = (0, 1)$ obviously form a basis called the standard basis of $\mathbb{R}^2$. The coordinates of $Z$ in this basis are just the "usual" Cartesian coordinates $(x, y)$ of $Z$.
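Computing coordinates relative to a basis is just a $2 \times 2$ linear solve. A small sketch of our own, using the basis of the next example, $V_1 = (1, 1)$ and $V_2 = (-1, 1)$:

```python
import numpy as np

V1, V2 = np.array([1.0, 1.0]), np.array([-1.0, 1.0])
B = np.column_stack([V1, V2])            # columns are the basis vectors

print(np.linalg.det(B))                  # 2.0: nonzero, so V1 and V2 are independent

alpha, beta = np.linalg.solve(B, np.array([1.0, 0.0]))
print(alpha, beta)                       # 0.5 -0.5: the coordinates of E1
```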


Example. The vectors $V_1 = (1, 1)$ and $V_2 = (-1, 1)$ also form a basis of $\mathbb{R}^2$. Relative to this basis, the coordinates of $E_1$ are $(1/2, -1/2)$ and those of $E_2$ are $(1/2, 1/2)$, because

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} -1 \\ 1 \end{pmatrix}, \qquad \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$

These "changes of coordinates" will become important later.

Example. The vectors $V_1 = (1, 1)$ and $V_2 = (-1, -1)$ do not form a basis of $\mathbb{R}^2$ since these vectors are collinear. Any linear combination of these vectors is of the form

$$\alpha V_1 + \beta V_2 = \begin{pmatrix} \alpha - \beta \\ \alpha - \beta \end{pmatrix},$$

which yields only vectors on the straight line through the origin containing $V_1$ and $V_2$.

2.4 Planar Linear Systems

We now further restrict our attention to the most important class of planar systems of differential equations, namely, linear systems. In the autonomous case, these systems assume the simple form

$$\begin{aligned} x' &= ax + by\\ y' &= cx + dy \end{aligned}$$

where $a$, $b$, $c$, and $d$ are constants. We may abbreviate this system by using the coefficient matrix $A$ where

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

Then the linear system may be written as

$$X' = AX.$$

Note that the origin is always an equilibrium point for a linear system. To find other equilibria, we must solve the linear system of algebraic equations

$$\begin{aligned} ax + by &= 0\\ cx + dy &= 0. \end{aligned}$$


This system has a nonzero solution if and only if $\det A = 0$. As we saw previously, if $\det A = 0$, then there is a straight line through the origin on which each point is an equilibrium. Thus we have:

Proposition. The planar linear system $X' = AX$ has
1. a unique equilibrium point $(0, 0)$ if $\det A \neq 0$;
2. a straight line of equilibrium points if $\det A = 0$ (and $A$ is not the 0 matrix).

2.5 Eigenvalues and Eigenvectors

Now we turn to the question of finding nonequilibrium solutions of the linear system $X' = AX$. The key observation here is this: Suppose $V_0$ is a nonzero vector for which we have $AV_0 = \lambda V_0$ where $\lambda \in \mathbb{R}$. Then the function

$$X(t) = e^{\lambda t} V_0$$

is a solution of the system. To see this, we compute

$$X'(t) = \lambda e^{\lambda t} V_0 = e^{\lambda t} (\lambda V_0) = e^{\lambda t} (A V_0) = A(e^{\lambda t} V_0) = AX(t),$$

so $X(t)$ does indeed solve the system of equations. Such a vector $V_0$ and its associated scalar have names:

Definition. A nonzero vector $V_0$ is called an eigenvector of $A$ if $AV_0 = \lambda V_0$ for some $\lambda$. The constant $\lambda$ is called an eigenvalue of $A$.

As we observed, there is an important relationship between eigenvalues, eigenvectors, and solutions of systems of differential equations:

Theorem. Suppose that $V_0$ is an eigenvector for the matrix $A$ with associated eigenvalue $\lambda$. Then the function $X(t) = e^{\lambda t} V_0$ is a solution of the system $X' = AX$.


Note that if $V_0$ is an eigenvector for $A$ with eigenvalue $\lambda$, then any nonzero scalar multiple of $V_0$ is also an eigenvector for $A$ with eigenvalue $\lambda$. Indeed, if $AV_0 = \lambda V_0$, then

$$A(\alpha V_0) = \alpha A V_0 = \lambda(\alpha V_0)$$

for any nonzero constant $\alpha$.

Example. Consider

$$A = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix}.$$

Then $A$ has an eigenvector $V_0 = (3, 1)$ with associated eigenvalue $\lambda = 2$ since

$$\begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 2 \end{pmatrix} = 2 \begin{pmatrix} 3 \\ 1 \end{pmatrix}.$$

Similarly, $V_1 = (1, -1)$ is an eigenvector with associated eigenvalue $\lambda = -2$. Thus, for the system

$$X' = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} X$$

we now know three solutions: the equilibrium solution at the origin together with

$$X_1(t) = e^{2t} \begin{pmatrix} 3 \\ 1 \end{pmatrix} \quad \text{and} \quad X_2(t) = e^{-2t} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

We will see that we can use these solutions to generate all solutions of this system in a moment, but first we address the question of how to find eigenvectors and eigenvalues.

To produce an eigenvector $V = (x, y)$, we must find a nonzero solution $(x, y)$ of the equation

$$A \begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix}.$$

Note that there are three unknowns in this system of equations: the two components of $V$ as well as $\lambda$. Let $I$ denote the $2 \times 2$ identity matrix

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$


Then we may rewrite the equation in the form

$$(A - \lambda I)V = 0,$$

where $0$ denotes the vector $(0, 0)$.

Now $A - \lambda I$ is just a $2 \times 2$ matrix (having entries involving the variable $\lambda$), so this linear system of equations has nonzero solutions if and only if $\det(A - \lambda I) = 0$, as we saw previously. But this equation is just a quadratic equation in $\lambda$, whose roots are therefore easy to find. This equation will appear over and over in the sequel; it is called the characteristic equation. As a function of $\lambda$, we call $\det(A - \lambda I)$ the characteristic polynomial. Thus the strategy to generate eigenvectors is first to find the roots of the characteristic equation. This yields the eigenvalues. Then we use each of these eigenvalues to generate in turn an associated eigenvector.

Example. We return to the matrix

$$A = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix}.$$

We have

$$A - \lambda I = \begin{pmatrix} 1 - \lambda & 3 \\ 1 & -1 - \lambda \end{pmatrix}.$$

So the characteristic equation is

$$\det(A - \lambda I) = (1 - \lambda)(-1 - \lambda) - 3 = 0.$$

Simplifying, we find

$$\lambda^2 - 4 = 0,$$

which yields the two eigenvalues $\lambda = \pm 2$. Then, for $\lambda = 2$, we next solve the equation

$$(A - 2I) \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

In component form, this reduces to the system of equations

$$\begin{aligned} (1 - 2)x + 3y &= 0\\ x + (-1 - 2)y &= 0, \end{aligned}$$

or $-x + 3y = 0$, because these equations are redundant. Thus any vector of the form $(3y, y)$ with $y \neq 0$ is an eigenvector associated to $\lambda = 2$. In similar fashion, any vector of the form $(y, -y)$ with $y \neq 0$ is an eigenvector associated to $\lambda = -2$.
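In practice one usually lets a numerical library find the eigenvalues and eigenvectors. A sketch of ours for the matrix of this example:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [1.0, -1.0]])

evals, evecs = np.linalg.eig(A)    # columns of evecs are (unit-length) eigenvectors
print(evals)                       # 2 and -2 (the order may vary)

v = evecs[:, np.argmax(evals)]     # eigenvector for the eigenvalue 2
print(v / v[1])                    # rescaled: [3., 1.], i.e. the eigenvector (3, 1)
```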


Of course, the astute reader will notice that there is more to the story of eigenvalues, eigenvectors, and solutions of differential equations than what we have described previously. For example, the roots of the characteristic equation may be complex, or they may be repeated real numbers. We will handle all of these cases shortly, but first we return to the problem of solving linear systems.

2.6 Solving Linear Systems

As we saw in the example in the previous section, if we find two real roots $\lambda_1$ and $\lambda_2$ (with $\lambda_1 \neq \lambda_2$) of the characteristic equation, then we may generate a pair of solutions of the system of differential equations of the form $X_i(t) = e^{\lambda_i t} V_i$ where $V_i$ is the eigenvector associated to $\lambda_i$. Note that each of these solutions is a straight-line solution. Indeed, we have $X_i(0) = V_i$, which is a nonzero point in the plane. For each $t$, $e^{\lambda_i t} V_i$ is a scalar multiple of $V_i$ and so lies on the straight ray emanating from the origin and passing through $V_i$. Note that, if $\lambda_i > 0$, then

$$\lim_{t \to \infty} |X_i(t)| = \infty$$

and

$$\lim_{t \to -\infty} X_i(t) = (0, 0).$$

The magnitude of the solution $X_i(t)$ increases monotonically to $\infty$ along the ray through $V_i$ as $t$ increases, and $X_i(t)$ tends to the origin along this ray in backward time. The exact opposite situation occurs if $\lambda_i < 0$, whereas, if $\lambda_i = 0$, the solution $X_i(t)$ is the constant solution $X_i(t) = V_i$ for all $t$.

So how do we find all solutions of the system given this pair of special solutions? The answer is now easy and important. Suppose we have two distinct real eigenvalues $\lambda_1$ and $\lambda_2$ with eigenvectors $V_1$ and $V_2$. Then $V_1$ and $V_2$ are linearly independent, as is easily checked (see Exercise 14). Thus $V_1$ and $V_2$ form a basis of $\mathbb{R}^2$, so, given any point $Z_0 \in \mathbb{R}^2$, we may find a unique pair of real numbers $\alpha$ and $\beta$ for which

$$\alpha V_1 + \beta V_2 = Z_0.$$


Now consider the function $Z(t) = \alpha X_1(t) + \beta X_2(t)$, where the $X_i(t)$ are the straight-line solutions constructed previously. We claim that $Z(t)$ is a solution of $X' = AX$. To see this we compute

$$Z'(t) = \alpha X_1'(t) + \beta X_2'(t) = \alpha A X_1(t) + \beta A X_2(t) = A(\alpha X_1(t) + \beta X_2(t)) = AZ(t).$$

The last step follows from the linearity of matrix multiplication (see Exercise 13). Hence we have shown that $Z'(t) = AZ(t)$, so $Z(t)$ is a solution. Moreover, $Z(t)$ is a solution that satisfies $Z(0) = Z_0$. Finally, we claim that $Z(t)$ is the unique solution of $X' = AX$ that satisfies $Z(0) = Z_0$. Just as in Chapter 1, we suppose that $Y(t)$ is another such solution with $Y(0) = Z_0$. Then we may write

$$Y(t) = \zeta(t)V_1 + \mu(t)V_2$$

with $\zeta(0) = \alpha$, $\mu(0) = \beta$. Hence

$$AY(t) = Y'(t) = \zeta'(t)V_1 + \mu'(t)V_2.$$

But

$$AY(t) = \zeta(t)AV_1 + \mu(t)AV_2 = \lambda_1\zeta(t)V_1 + \lambda_2\mu(t)V_2.$$

Therefore we have

$$\zeta'(t) = \lambda_1\zeta(t), \qquad \mu'(t) = \lambda_2\mu(t)$$

with $\zeta(0) = \alpha$, $\mu(0) = \beta$. As we saw in Chapter 1, it follows that

$$\zeta(t) = \alpha e^{\lambda_1 t}, \qquad \mu(t) = \beta e^{\lambda_2 t},$$

so that $Y(t)$ is indeed equal to $Z(t)$.

As a consequence, we have now found the unique solution to the system $X' = AX$ that satisfies $X(0) = Z_0$ for any $Z_0 \in \mathbb{R}^2$. The collection of all such solutions is called the general solution of $X' = AX$. That is, the general solution is the collection of solutions of $X' = AX$ that features a unique solution of the initial value problem $X(0) = Z_0$ for each $Z_0 \in \mathbb{R}^2$.
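This construction is easy to test numerically. The following sketch is an illustration only (it assumes the NumPy library is available; the constants alpha and beta are arbitrary illustrative choices): it computes the eigenvalues and eigenvectors of the matrix $A = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix}$ from the example above and checks, via a finite-difference derivative, that $X(t) = \alpha e^{\lambda_1 t}V_1 + \beta e^{\lambda_2 t}V_2$ really satisfies $X' = AX$.

    import numpy as np

    A = np.array([[1.0, 3.0],
                  [1.0, -1.0]])

    # Eigenvalues/eigenvectors; NumPy returns unit eigenvectors as columns.
    evals, evecs = np.linalg.eig(A)
    l1, l2 = evals
    V1, V2 = evecs[:, 0], evecs[:, 1]

    alpha, beta = 2.0, -1.0  # arbitrary choice of initial condition
    X = lambda t: alpha * np.exp(l1 * t) * V1 + beta * np.exp(l2 * t) * V2

    t, h = 0.7, 1e-6
    dX = (X(t + h) - X(t - h)) / (2 * h)  # numerical derivative X'(t)
    print(evals)                          # approximately [2, -2]
    print(np.allclose(dX, A @ X(t)))      # True: X'(t) = A X(t)

The theorem below states the same fact in general.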


We therefore have shown the following:

Theorem. Suppose $A$ has a pair of real eigenvalues $\lambda_1 \neq \lambda_2$ and associated eigenvectors $V_1$ and $V_2$. Then the general solution of the linear system $X' = AX$ is given by

$$X(t) = \alpha e^{\lambda_1 t}V_1 + \beta e^{\lambda_2 t}V_2.$$

Example. Consider the second-order differential equation

$$x'' + 3x' + 2x = 0.$$

This is a specific case of the damped harmonic oscillator discussed earlier, where the mass is 1, the spring constant is 2, and the damping constant is 3. As a system, this equation may be rewritten:

$$X' = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}X = AX.$$

The characteristic equation is

$$\lambda^2 + 3\lambda + 2 = (\lambda + 2)(\lambda + 1) = 0,$$

so the system has eigenvalues $-1$ and $-2$. The eigenvector corresponding to the eigenvalue $-1$ is given by solving the equation

$$(A + I)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

In component form this equation becomes

$$x + y = 0, \qquad -2x - 2y = 0.$$

Hence, one eigenvector associated to the eigenvalue $-1$ is $(1, -1)$. In similar fashion we compute that an eigenvector associated to the eigenvalue $-2$ is $(1, -2)$. Note that these two eigenvectors are linearly independent. Therefore, by the previous theorem, the general solution of this system is

$$X(t) = \alpha e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} + \beta e^{-2t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$

That is, the position of the mass is given by the first component of the solution,

$$x(t) = \alpha e^{-t} + \beta e^{-2t},$$


and the velocity is given by the second component,

$$y(t) = x'(t) = -\alpha e^{-t} - 2\beta e^{-2t}.$$

2.7 The Linearity Principle

The theorem discussed in the previous section is a very special case of the fundamental theorem for $n$-dimensional linear systems, which we shall prove in Section 6.1 of Chapter 6. For the two-dimensional version of this result, note that if $X' = AX$ is a planar linear system for which $Y_1(t)$ and $Y_2(t)$ are both solutions, then, just as before, the function $\alpha Y_1(t) + \beta Y_2(t)$ is also a solution of this system. We do not need real and distinct eigenvalues to prove this. This fact is known as the linearity principle. More importantly, if the initial conditions $Y_1(0)$ and $Y_2(0)$ are linearly independent vectors, then these vectors form a basis of $\mathbb{R}^2$. Hence, given any vector $X_0 \in \mathbb{R}^2$, we may determine constants $\alpha$ and $\beta$ such that $X_0 = \alpha Y_1(0) + \beta Y_2(0)$. Then the linearity principle tells us that the solution $X(t)$ satisfying the initial condition $X(0) = X_0$ is given by $X(t) = \alpha Y_1(t) + \beta Y_2(t)$. Hence we have produced a solution of the system that solves any given initial value problem. The existence and uniqueness theorem for linear systems in Chapter 6 will show that this solution is also unique. This important result may then be summarized:

Theorem. Let $X' = AX$ be a planar system. Suppose that $Y_1(t)$ and $Y_2(t)$ are solutions of this system, and that the vectors $Y_1(0)$ and $Y_2(0)$ are linearly independent. Then

$$X(t) = \alpha Y_1(t) + \beta Y_2(t)$$

is the unique solution of this system that satisfies $X(0) = \alpha Y_1(0) + \beta Y_2(0)$.

EXERCISES

1. Find the eigenvalues and eigenvectors of each of the following $2 \times 2$ matrices:

(a) $\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$  (b) $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$  (c) $\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$  (d) $\begin{pmatrix} 1 & 3 \\ \sqrt{2} & 3\sqrt{2} \end{pmatrix}$


Figure 2.2 Match these direction fields with the systems in Exercise 2. (Four direction fields, numbered 1 through 4.)

2. Find the general solution of each of the following linear systems:

(a) $X' = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}X$  (b) $X' = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}X$  (c) $X' = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}X$  (d) $X' = \begin{pmatrix} 1 & 2 \\ 3 & -3 \end{pmatrix}X$

3. In Figure 2.2, you see four direction fields. Match each of these direction fields with one of the systems in the previous question.

4. Find the general solution of the system

$$X' = \begin{pmatrix} a & b \\ c & a \end{pmatrix}X,$$

where $bc > 0$.

5. Find the general solution of the system

$$X' = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}X.$$

6. For the harmonic oscillator system $x'' + bx' + kx = 0$, find all values of $b$ and $k$ for which this system has real, distinct eigenvalues. Find the


general solution of this system in these cases. Find the solution of the system that satisfies the initial condition $(0, 1)$. Describe the motion of the mass in this particular case.

7. Consider the $2 \times 2$ matrix

$$A = \begin{pmatrix} a & 1 \\ 0 & 1 \end{pmatrix}.$$

Find the value $a_0$ of the parameter $a$ for which $A$ has repeated real eigenvalues. What happens to the eigenvectors of this matrix as $a$ approaches $a_0$?

8. Describe all possible $2 \times 2$ matrices whose eigenvalues are 0 and 1.

9. Give an example of a linear system for which $(e^{-t}, \alpha)$ is a solution for every constant $\alpha$. Sketch the direction field for this system. What is the general solution of this system?

10. Give an example of a system of differential equations for which $(t, 1)$ is a solution. Sketch the direction field for this system. What is the general solution of this system?

11. Prove that two vectors $V = (v_1, v_2)$ and $W = (w_1, w_2)$ are linearly independent if and only if

$$\det\begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \end{pmatrix} \neq 0.$$

12. Prove that if $\lambda, \mu$ are real eigenvalues of a $2 \times 2$ matrix $A$, then any nonzero column of the matrix $A - \lambda I$ is an eigenvector for $\mu$.

13. Let $A$ be a $2 \times 2$ matrix and $V_1$ and $V_2$ vectors in $\mathbb{R}^2$. Prove that $A(\alpha V_1 + \beta V_2) = \alpha AV_1 + \beta AV_2$.

14. Prove that the eigenvectors of a $2 \times 2$ matrix corresponding to distinct real eigenvalues are always linearly independent.


3 Phase Portraits for Planar Systems

Given the linearity principle from the previous chapter, we may now compute the general solution of any planar system. There is a seemingly endless number of distinct cases, but we will see that these represent, in the simplest possible form, nearly all of the types of solutions we will encounter in the higher dimensional case.

3.1 Real Distinct Eigenvalues

Consider $X' = AX$ and suppose that $A$ has two real eigenvalues $\lambda_1 < \lambda_2$. Assuming for the moment that $\lambda_i \neq 0$, there are three cases to consider:

1. $\lambda_1 < 0 < \lambda_2$;
2. $\lambda_1 < \lambda_2 < 0$;
3. $0 < \lambda_1 < \lambda_2$.

We give a specific example of each case; any system that falls into any one of these three categories may be handled in a similar manner, as we show later.

Example. (Saddle) First consider the simple system $X' = AX$ where

$$A = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$


with $\lambda_1 < 0 < \lambda_2$. This can be solved immediately, since the system decouples into two unrelated first-order equations:

$$x' = \lambda_1 x, \qquad y' = \lambda_2 y.$$

We already know how to solve these equations, but, having in mind what comes later, let's find the eigenvalues and eigenvectors. The characteristic equation is

$$(\lambda - \lambda_1)(\lambda - \lambda_2) = 0,$$

so $\lambda_1$ and $\lambda_2$ are the eigenvalues. An eigenvector corresponding to $\lambda_1$ is $(1, 0)$ and to $\lambda_2$ is $(0, 1)$. Hence we find the general solution

$$X(t) = \alpha e^{\lambda_1 t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta e^{\lambda_2 t}\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Since $\lambda_1 < 0$, the straight-line solutions of the form $\alpha e^{\lambda_1 t}(1, 0)$ lie on the $x$-axis and tend to $(0, 0)$ as $t \to \infty$. This axis is called the stable line. Since $\lambda_2 > 0$, the solutions $\beta e^{\lambda_2 t}(0, 1)$ lie on the $y$-axis and tend away from $(0, 0)$ as $t \to \infty$; this axis is the unstable line. All other solutions (with $\alpha, \beta \neq 0$) tend to $\infty$ in the direction of the unstable line as $t \to \infty$, since $X(t)$ comes closer and closer to $(0, \beta e^{\lambda_2 t})$ as $t$ increases. In backward time, these solutions tend to $\infty$ in the direction of the stable line.

In Figure 3.1 we have plotted the phase portrait of this system. The phase portrait is a picture of a collection of representative solution curves of the system in $\mathbb{R}^2$, which we call the phase plane. The equilibrium point of a system of this type (eigenvalues satisfying $\lambda_1 < 0 < \lambda_2$) is called a saddle.

Figure 3.1 Saddle phase portrait for $x' = -x$, $y' = y$.

For a slightly more complicated example of this type, consider $X' = AX$ where

$$A = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix}.$$

As we saw in Chapter 2, the eigenvalues of $A$ are $\pm 2$. The eigenvector associated to $\lambda = 2$ is the vector $(3, 1)$; the eigenvector associated to $\lambda = -2$ is $(1, -1)$. Hence we have an unstable line that contains straight-line solutions of the form

$$X_1(t) = \alpha e^{2t}\begin{pmatrix} 3 \\ 1 \end{pmatrix},$$

each of which tends away from the origin as $t \to \infty$. The stable line contains the straight-line solutions

$$X_2(t) = \beta e^{-2t}\begin{pmatrix} 1 \\ -1 \end{pmatrix},$$

which tend toward the origin as $t \to \infty$. By the linearity principle, any other solution assumes the form

$$X(t) = \alpha e^{2t}\begin{pmatrix} 3 \\ 1 \end{pmatrix} + \beta e^{-2t}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

for some $\alpha, \beta$. Note that, if $\alpha \neq 0$, as $t \to \infty$ we have

$$X(t) \sim \alpha e^{2t}\begin{pmatrix} 3 \\ 1 \end{pmatrix} = X_1(t),$$

whereas, if $\beta \neq 0$, as $t \to -\infty$,

$$X(t) \sim \beta e^{-2t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = X_2(t).$$

Thus, as time increases, the typical solution approaches $X_1(t)$ while, as time decreases, this solution tends toward $X_2(t)$, just as in the previous case. Figure 3.2 displays this phase portrait.

Figure 3.2 Saddle phase portrait for $x' = x + 3y$, $y' = x - y$.

In the general case where $A$ has a positive and negative eigenvalue, we always find a similar stable and unstable line on which solutions tend toward or away from the origin. All other solutions approach the unstable line as $t \to \infty$, and tend toward the stable line as $t \to -\infty$.

Example. (Sink) Now consider the case $X' = AX$ where

$$A = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

but $\lambda_1 < \lambda_2 < 0$. As above we find two straight-line solutions and then the general solution:

$$X(t) = \alpha e^{\lambda_1 t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta e^{\lambda_2 t}\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Unlike the saddle case, now all solutions tend to $(0, 0)$ as $t \to \infty$. The question is: How do they approach the origin? To answer this, we compute the slope $dy/dx$ of a solution with $\beta \neq 0$. We write

$$x(t) = \alpha e^{\lambda_1 t}, \qquad y(t) = \beta e^{\lambda_2 t}$$

and compute

$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\lambda_2\beta e^{\lambda_2 t}}{\lambda_1\alpha e^{\lambda_1 t}} = \frac{\lambda_2\beta}{\lambda_1\alpha}e^{(\lambda_2 - \lambda_1)t}.$$

Since $\lambda_2 - \lambda_1 > 0$, it follows that these slopes approach $\pm\infty$ (provided $\beta \neq 0$). Thus these solutions tend to the origin tangentially to the $y$-axis.
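This slope computation can be watched happening numerically. In the following sketch (NumPy assumed; the eigenvalues $\lambda_1 = -2$, $\lambda_2 = -1$ and the initial condition are illustrative choices), the slope $dy/dx$ along a solution grows without bound as $t$ increases, which is exactly the tangency to the $y$-axis just derived.

    import numpy as np

    l1, l2 = -2.0, -1.0      # stronger and weaker eigenvalues (illustrative)
    alpha, beta = 1.0, 1.0   # initial condition x(0) = y(0) = 1

    for t in [0.0, 2.0, 4.0, 8.0]:
        x = alpha * np.exp(l1 * t)
        y = beta * np.exp(l2 * t)
        slope = (l2 * y) / (l1 * x)   # dy/dx = (dy/dt)/(dx/dt)
        print(t, slope)               # equals (l2*beta)/(l1*alpha) * e^{(l2-l1)t},
                                      # which tends to infinity as t grows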


Since $\lambda_1 < \lambda_2 < 0$, we call $\lambda_1$ the stronger eigenvalue and $\lambda_2$ the weaker eigenvalue. The reason for this in this particular case is that the $x$-coordinates of solutions tend to 0 much more quickly than the $y$-coordinates. This accounts for why solutions (except those on the line corresponding to the $\lambda_1$ eigenvector) tend to "hug" the straight-line solution corresponding to the weaker eigenvalue as they approach the origin.

The phase portrait for this system is displayed in Figure 3.3a. In this case the equilibrium point is called a sink.

Figure 3.3 Phase portraits for (a) a sink and (b) a source.

More generally, if the system has eigenvalues $\lambda_1 < \lambda_2 < 0$ with eigenvectors $(u_1, u_2)$ and $(v_1, v_2)$, respectively, then the general solution is

$$\alpha e^{\lambda_1 t}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + \beta e^{\lambda_2 t}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}.$$

The slope of this solution is given by

$$\frac{dy}{dx} = \frac{\lambda_1\alpha e^{\lambda_1 t}u_2 + \lambda_2\beta e^{\lambda_2 t}v_2}{\lambda_1\alpha e^{\lambda_1 t}u_1 + \lambda_2\beta e^{\lambda_2 t}v_1} = \frac{\left(\lambda_1\alpha e^{\lambda_1 t}u_2 + \lambda_2\beta e^{\lambda_2 t}v_2\right)e^{-\lambda_2 t}}{\left(\lambda_1\alpha e^{\lambda_1 t}u_1 + \lambda_2\beta e^{\lambda_2 t}v_1\right)e^{-\lambda_2 t}} = \frac{\lambda_1\alpha e^{(\lambda_1 - \lambda_2)t}u_2 + \lambda_2\beta v_2}{\lambda_1\alpha e^{(\lambda_1 - \lambda_2)t}u_1 + \lambda_2\beta v_1},$$

which tends to the slope $v_2/v_1$ of the $\lambda_2$ eigenvector, unless we have $\beta = 0$. If $\beta = 0$, our solution is the straight-line solution corresponding to the eigenvalue $\lambda_1$. Hence all solutions (except those on the straight line corresponding to the stronger eigenvalue) tend to the origin tangentially to the straight-line solution corresponding to the weaker eigenvalue in this case as well.

Example. (Source) When the matrix

$$A = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

satisfies $0 < \lambda_2 < \lambda_1$, our vector field may be regarded as the negative of the previous example. The general solution and phase portrait remain the same, except that all solutions now tend away from $(0, 0)$ along the same paths. See Figure 3.3b.

Now one may argue that we are presenting examples here that are much too simple. While this is true, we will soon see that any system of differential equations whose matrix has real distinct eigenvalues can be manipulated into the above special forms by changing coordinates.

Finally, a special case occurs if one of the eigenvalues is equal to 0. As we have seen, there is a straight line of equilibrium points in this case. If the other eigenvalue $\lambda$ is nonzero, then the sign of $\lambda$ determines whether the other solutions tend toward or away from these equilibria (see Exercises 10 and 11 at the end of this chapter).

3.2 Complex Eigenvalues

It may happen that the roots of the characteristic polynomial are complex numbers. In analogy with the real case, we call these roots complex eigenvalues. When the matrix $A$ has complex eigenvalues, we no longer have straight-line solutions. However, we can still derive the general solution as before by using a few tricks involving complex numbers and functions. The following examples indicate the general procedure.

Example. (Center) Consider $X' = AX$ with

$$A = \begin{pmatrix} 0 & \beta \\ -\beta & 0 \end{pmatrix}$$

and $\beta \neq 0$. The characteristic polynomial is $\lambda^2 + \beta^2 = 0$, so the eigenvalues are now the imaginary numbers $\pm i\beta$. Without worrying about the resulting complex vectors, we react just as before to find the eigenvector corresponding to $\lambda = i\beta$. We therefore solve

$$\begin{pmatrix} -i\beta & \beta \\ -\beta & -i\beta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

or $i\beta x = \beta y$, since the second equation is redundant. Thus we find a complex eigenvector $(1, i)$, and so the function

$$X(t) = e^{i\beta t}\begin{pmatrix} 1 \\ i \end{pmatrix}$$

is a complex solution of $X' = AX$.

Now in general it is not polite to hand someone a complex solution to a real system of differential equations, but we can remedy this with the help of Euler's formula

$$e^{i\beta t} = \cos\beta t + i\sin\beta t.$$

Using this fact, we rewrite the solution as

$$X(t) = \begin{pmatrix} \cos\beta t + i\sin\beta t \\ i(\cos\beta t + i\sin\beta t) \end{pmatrix} = \begin{pmatrix} \cos\beta t + i\sin\beta t \\ -\sin\beta t + i\cos\beta t \end{pmatrix}.$$

Better yet, by breaking $X(t)$ into its real and imaginary parts, we have

$$X(t) = X_{\mathrm{Re}}(t) + iX_{\mathrm{Im}}(t),$$

where

$$X_{\mathrm{Re}}(t) = \begin{pmatrix} \cos\beta t \\ -\sin\beta t \end{pmatrix}, \qquad X_{\mathrm{Im}}(t) = \begin{pmatrix} \sin\beta t \\ \cos\beta t \end{pmatrix}.$$

But now we see that both $X_{\mathrm{Re}}(t)$ and $X_{\mathrm{Im}}(t)$ are (real!) solutions of the original system. To see this, we simply check

$$X'_{\mathrm{Re}}(t) + iX'_{\mathrm{Im}}(t) = X'(t) = AX(t) = A(X_{\mathrm{Re}}(t) + iX_{\mathrm{Im}}(t)) = AX_{\mathrm{Re}}(t) + iAX_{\mathrm{Im}}(t).$$


Equating the real and imaginary parts of this equation yields $X'_{\mathrm{Re}} = AX_{\mathrm{Re}}$ and $X'_{\mathrm{Im}} = AX_{\mathrm{Im}}$, which shows that both are indeed solutions. Moreover, since

$$X_{\mathrm{Re}}(0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad X_{\mathrm{Im}}(0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

the linear combination of these solutions

$$X(t) = c_1X_{\mathrm{Re}}(t) + c_2X_{\mathrm{Im}}(t),$$

where $c_1$ and $c_2$ are arbitrary constants, provides a solution to any initial value problem.

We claim that this is the general solution of this equation. To prove this, we need to show that these are the only solutions of this equation. Suppose that this is not the case. Let

$$Y(t) = \begin{pmatrix} u(t) \\ v(t) \end{pmatrix}$$

be another solution. Consider the complex function $f(t) = (u(t) + iv(t))e^{i\beta t}$. Differentiating this expression and using the fact that $Y(t)$ is a solution of the equation yields $f'(t) = 0$. Hence $u(t) + iv(t)$ is a complex constant times $e^{-i\beta t}$. From this it follows directly that $Y(t)$ is a linear combination of $X_{\mathrm{Re}}(t)$ and $X_{\mathrm{Im}}(t)$.

Note that each of these solutions is a periodic function with period $2\pi/\beta$. Indeed, the phase portrait shows that all solutions lie on circles centered at the origin. These circles are traversed in the clockwise direction if $\beta > 0$, counterclockwise if $\beta < 0$. See Figure 3.4. This type of system is called a center.

Figure 3.4 Phase portrait for a center.
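As a numerical illustration of the last two observations (circular orbits and period $2\pi/\beta$), the following sketch assumes NumPy and picks $\beta = 3$ and the constants $c_1, c_2$ arbitrarily; it checks that $|X(t)|$ is constant and that $X(t)$ returns to its starting point after time $2\pi/\beta$.

    import numpy as np

    beta = 3.0               # illustrative value of the parameter
    c1, c2 = 2.0, -1.0       # arbitrary constants in the general solution

    def X(t):
        # X(t) = c1*(cos bt, -sin bt) + c2*(sin bt, cos bt)
        return np.array([c1*np.cos(beta*t) + c2*np.sin(beta*t),
                         -c1*np.sin(beta*t) + c2*np.cos(beta*t)])

    radii = [np.linalg.norm(X(t)) for t in np.linspace(0.0, 2*np.pi/beta, 7)]
    print(np.allclose(radii, radii[0]))           # True: the orbit is a circle
    print(np.allclose(X(0.0), X(2*np.pi/beta)))   # True: period 2*pi/beta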


Example. (Spiral Sink, Spiral Source) More generally, consider $X' = AX$ where

$$A = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$$

and $\alpha, \beta \neq 0$. The characteristic equation is now $\lambda^2 - 2\alpha\lambda + \alpha^2 + \beta^2 = 0$, so the eigenvalues are $\lambda = \alpha \pm i\beta$. An eigenvector associated to $\alpha + i\beta$ is determined by the equation

$$(\alpha - (\alpha + i\beta))x + \beta y = 0.$$

Thus $(1, i)$ is again an eigenvector. Hence we have complex solutions of the form

$$X(t) = e^{(\alpha + i\beta)t}\begin{pmatrix} 1 \\ i \end{pmatrix} = e^{\alpha t}\begin{pmatrix} \cos\beta t \\ -\sin\beta t \end{pmatrix} + ie^{\alpha t}\begin{pmatrix} \sin\beta t \\ \cos\beta t \end{pmatrix} = X_{\mathrm{Re}}(t) + iX_{\mathrm{Im}}(t).$$

As above, both $X_{\mathrm{Re}}(t)$ and $X_{\mathrm{Im}}(t)$ yield real solutions of the system whose initial conditions are linearly independent. Thus we find the general solution

$$X(t) = c_1e^{\alpha t}\begin{pmatrix} \cos\beta t \\ -\sin\beta t \end{pmatrix} + c_2e^{\alpha t}\begin{pmatrix} \sin\beta t \\ \cos\beta t \end{pmatrix}.$$

Without the term $e^{\alpha t}$, these solutions would wind periodically around circles centered at the origin. The $e^{\alpha t}$ term converts solutions into spirals that either spiral into the origin (when $\alpha < 0$) or away from the origin ($\alpha > 0$). In these cases the equilibrium point is called a spiral sink or spiral source, respectively. See Figure 3.5.

Figure 3.5 Phase portraits for a spiral sink and a spiral source.

3.3 Repeated Eigenvalues

The only remaining cases occur when $A$ has repeated real eigenvalues. One simple case occurs when $A$ is a diagonal matrix of the form

$$A = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}.$$

The eigenvalues of $A$ are both equal to $\lambda$. In this case every nonzero vector is an eigenvector, since

$$AV = \lambda V$$

for any $V \in \mathbb{R}^2$. Hence solutions are of the form

$$X(t) = \alpha e^{\lambda t}V.$$

Each such solution lies on a straight line through $(0, 0)$ and either tends to $(0, 0)$ (if $\lambda < 0$) or away from $(0, 0)$ (if $\lambda > 0$). So this is an easy case.

A more interesting case occurs when

$$A = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}.$$

Again both eigenvalues are equal to $\lambda$, but now there is only one linearly independent eigenvector, given by $(1, 0)$. Hence we have one straight-line solution

$$X_1(t) = \alpha e^{\lambda t}\begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

To find other solutions, note that the system can be written

$$x' = \lambda x + y, \qquad y' = \lambda y.$$


Thus, if $y \neq 0$, we must have

$$y(t) = \beta e^{\lambda t}.$$

Therefore the differential equation for $x(t)$ reads

$$x' = \lambda x + \beta e^{\lambda t}.$$

This is a nonautonomous, first-order differential equation for $x(t)$. One might first expect solutions of the form $e^{\lambda t}$, but the nonautonomous term is also in this form. As you perhaps saw in calculus, the best option is to guess a solution of the form

$$x(t) = \alpha e^{\lambda t} + \mu te^{\lambda t}$$

for some constants $\alpha$ and $\mu$. This technique is often called "the method of undetermined coefficients." Inserting this guess into the differential equation shows that $\mu = \beta$ while $\alpha$ is arbitrary. Hence the solution of the system may be written

$$\alpha e^{\lambda t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta e^{\lambda t}\begin{pmatrix} t \\ 1 \end{pmatrix}.$$

This is in fact the general solution (see Exercise 12).

Note that, if $\lambda < 0$, each term in this solution tends to 0 as $t \to \infty$. This is clear for the $\alpha e^{\lambda t}$ and $\beta e^{\lambda t}$ terms. For the term $\beta te^{\lambda t}$ this is an immediate consequence of l'Hôpital's rule. Hence all solutions tend to $(0, 0)$ as $t \to \infty$. When $\lambda > 0$, all solutions tend away from $(0, 0)$. See Figure 3.6. In fact, solutions tend toward or away from the origin in a direction tangent to the eigenvector $(1, 0)$ (see Exercise 7).

Figure 3.6 Phase portrait for a system with repeated negative eigenvalues.

3.4 Changing Coordinates

Despite differences in the associated phase portraits, we really have dealt with only three types of matrices in these past three sections:

$$\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}, \qquad \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}, \qquad \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix},$$

where $\lambda$ may equal $\mu$ in the first case.

Any $2 \times 2$ matrix that is in one of these three forms is said to be in canonical form. Systems in this form may seem rather special, but they are not. Given any linear system $X' = AX$, we can always "change coordinates" so that the new system's coefficient matrix is in canonical form and hence easily solved. Here is how to do this.

A linear map (or linear transformation) on $\mathbb{R}^2$ is a function $T: \mathbb{R}^2 \to \mathbb{R}^2$ of the form

$$T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}.$$

That is, $T$ simply multiplies any vector by the $2 \times 2$ matrix

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

We will thus think of the linear map and its matrix as being interchangeable, so that we also write

$$T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

We hope no confusion will result from this slight imprecision.

Now suppose that $T$ is invertible. This means that the matrix $T$ has an inverse matrix $S$ that satisfies $TS = ST = I$, where $I$ is the $2 \times 2$ identity matrix. It is traditional to denote the inverse of a matrix $T$ by $T^{-1}$. As is easily checked, the matrix

$$S = \frac{1}{\det T}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

serves as $T^{-1}$ if $\det T \neq 0$. If $\det T = 0$, then we know from Chapter 2 that there are infinitely many vectors $(x, y)$ for which

$$T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Hence there is no inverse matrix in this case, for we would need

$$\begin{pmatrix} x \\ y \end{pmatrix} = T^{-1}T\begin{pmatrix} x \\ y \end{pmatrix} = T^{-1}\begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

for each such vector. We have shown:

Proposition. The $2 \times 2$ matrix $T$ is invertible if and only if $\det T \neq 0$.

Now, instead of considering a linear system $X' = AX$, suppose we consider a different system

$$Y' = (T^{-1}AT)Y$$

for some invertible linear map $T$. Note that if $Y(t)$ is a solution of this new system, then $X(t) = TY(t)$ solves $X' = AX$. Indeed, we have

$$(TY(t))' = TY'(t) = T(T^{-1}AT)Y(t) = A(TY(t))$$

as required. That is, the linear map $T$ converts solutions of $Y' = (T^{-1}AT)Y$ to solutions of $X' = AX$. Alternatively, $T^{-1}$ takes solutions of $X' = AX$ to solutions of $Y' = (T^{-1}AT)Y$.

We therefore think of $T$ as a change of coordinates that converts a given linear system into one whose coefficient matrix is different. What we hope to be able to do is find a linear map $T$ that converts the given system into a system of the form $Y' = (T^{-1}AT)Y$ that is easily solved. And, as you may have guessed, we can always do this by finding a linear map that converts a given linear system to one in canonical form.

Example. (Real Eigenvalues) Suppose the matrix $A$ has two real, distinct eigenvalues $\lambda_1$ and $\lambda_2$ with associated eigenvectors $V_1$ and $V_2$. Let $T$ be the matrix whose columns are $V_1$ and $V_2$. Thus $TE_j = V_j$ for $j = 1, 2$, where the $E_j$ form the standard basis of $\mathbb{R}^2$. Also, $T^{-1}V_j = E_j$. Therefore we have

$$(T^{-1}AT)E_j = T^{-1}AV_j = T^{-1}(\lambda_jV_j) = \lambda_jT^{-1}V_j = \lambda_jE_j.$$

Thus the matrix $T^{-1}AT$ assumes the canonical form

$$T^{-1}AT = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

and the corresponding system is easy to solve.

Example. As a further specific example, suppose

$$A = \begin{pmatrix} -1 & 0 \\ 1 & -2 \end{pmatrix}.$$

The characteristic equation is $\lambda^2 + 3\lambda + 2 = 0$, which yields eigenvalues $\lambda = -1$ and $\lambda = -2$. An eigenvector corresponding to $\lambda = -1$ is given by solving

$$(A + I)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$

which yields an eigenvector $(1, 1)$. Similarly an eigenvector associated to $\lambda = -2$ is given by $(0, 1)$.

We therefore have a pair of straight-line solutions, each tending to the origin as $t \to \infty$. The straight-line solution corresponding to the weaker eigenvalue lies along the line $y = x$; the straight-line solution corresponding to the stronger eigenvalue lies on the $y$-axis. All other solutions tend to the origin tangentially to the line $y = x$.

To put this system in canonical form, we choose $T$ to be the matrix whose columns are these eigenvectors:

$$T = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \quad\text{so that}\quad T^{-1} = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}.$$


Finally, we compute

$$T^{-1}AT = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix},$$

so $T^{-1}AT$ is in canonical form. The general solution of the system $Y' = (T^{-1}AT)Y$ is

$$Y(t) = \alpha e^{-t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta e^{-2t}\begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

so the general solution of $X' = AX$ is

$$TY(t) = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\left(\alpha e^{-t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta e^{-2t}\begin{pmatrix} 0 \\ 1 \end{pmatrix}\right) = \alpha e^{-t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \beta e^{-2t}\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Thus the linear map $T$ converts the phase portrait for the system

$$Y' = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}Y$$

to that of $X' = AX$, as shown in Figure 3.7.

Note that we really do not have to go through the step of converting a specific system to one in canonical form; once we have the eigenvalues and eigenvectors, we can simply write down the general solution. We take this extra step because, when we attempt to classify all possible linear systems, the canonical form of the system will greatly simplify this process.

Figure 3.7 The change of variables T in the case of a (real) sink.
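This change of coordinates is a one-line check by machine. A sketch assuming NumPy, using the matrices $A$ and $T$ of this example:

    import numpy as np

    A = np.array([[-1.0, 0.0],
                  [ 1.0, -2.0]])
    T = np.array([[1.0, 0.0],     # columns are the eigenvectors (1,1) and (0,1)
                  [1.0, 1.0]])

    canonical = np.linalg.inv(T) @ A @ T
    print(canonical)                                      # [[-1, 0], [0, -2]]
    print(np.allclose(canonical, np.diag([-1.0, -2.0])))  # True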


Example. (Complex Eigenvalues) Now suppose that the matrix $A$ has complex eigenvalues $\alpha \pm i\beta$ with $\beta \neq 0$. Then we may find a complex eigenvector $V_1 + iV_2$ corresponding to $\alpha + i\beta$, where both $V_1$ and $V_2$ are real vectors. We claim that $V_1$ and $V_2$ are linearly independent vectors in $\mathbb{R}^2$. If this were not the case, then we would have $V_1 = cV_2$ for some $c \in \mathbb{R}$. But then we have

$$A(V_1 + iV_2) = (\alpha + i\beta)(V_1 + iV_2) = (\alpha + i\beta)(c + i)V_2.$$

But we also have

$$A(V_1 + iV_2) = (c + i)AV_2.$$

So we conclude that $AV_2 = (\alpha + i\beta)V_2$. This is a contradiction, since the left-hand side is a real vector while the right is complex.

Since $V_1 + iV_2$ is an eigenvector associated to $\alpha + i\beta$, we have

$$A(V_1 + iV_2) = (\alpha + i\beta)(V_1 + iV_2).$$

Equating the real and imaginary components of this vector equation, we find

$$AV_1 = \alpha V_1 - \beta V_2, \qquad AV_2 = \beta V_1 + \alpha V_2.$$

Let $T$ be the matrix whose columns are $V_1$ and $V_2$. Hence $TE_j = V_j$ for $j = 1, 2$. Now consider $T^{-1}AT$. We have

$$(T^{-1}AT)E_1 = T^{-1}(\alpha V_1 - \beta V_2) = \alpha E_1 - \beta E_2$$

and similarly

$$(T^{-1}AT)E_2 = \beta E_1 + \alpha E_2.$$

Thus the matrix $T^{-1}AT$ is in the canonical form

$$T^{-1}AT = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}.$$

We saw that the system $Y' = (T^{-1}AT)Y$ has a phase portrait corresponding to a spiral sink, center, or spiral source depending on whether $\alpha < 0$, $\alpha = 0$, or $\alpha > 0$. Therefore the phase portrait of $X' = AX$ is equivalent to one of these after changing coordinates using $T$.

Example. (Another Harmonic Oscillator) Consider the second-order equation

$$x'' + 4x = 0.$$

This corresponds to an undamped harmonic oscillator with mass 1 and spring constant 4. As a system, we have

$$X' = \begin{pmatrix} 0 & 1 \\ -4 & 0 \end{pmatrix}X = AX.$$

The characteristic equation is

$$\lambda^2 + 4 = 0,$$

so that the eigenvalues are $\pm 2i$. A complex eigenvector associated to $\lambda = 2i$ is a solution of the system

$$-2ix + y = 0, \qquad -4x - 2iy = 0.$$

One such solution is the vector $(1, 2i)$. So we have a complex solution of the form

$$e^{2it}\begin{pmatrix} 1 \\ 2i \end{pmatrix}.$$

Breaking this solution into its real and imaginary parts, we find the general solution

$$X(t) = c_1\begin{pmatrix} \cos 2t \\ -2\sin 2t \end{pmatrix} + c_2\begin{pmatrix} \sin 2t \\ 2\cos 2t \end{pmatrix}.$$

Thus the position of this oscillator is given by

$$x(t) = c_1\cos 2t + c_2\sin 2t,$$

which is a periodic function of period $\pi$.
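A quick symbolic check of this conclusion, assuming the SymPy library is available: it verifies that $x(t) = c_1\cos 2t + c_2\sin 2t$ satisfies $x'' + 4x = 0$ for arbitrary constants $c_1, c_2$.

    import sympy as sp

    t, c1, c2 = sp.symbols('t c1 c2')
    x = c1*sp.cos(2*t) + c2*sp.sin(2*t)   # proposed general solution

    residual = sp.diff(x, t, 2) + 4*x     # substitute into x'' + 4x
    print(sp.simplify(residual))          # 0, so the equation is satisfied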


Now, let $T$ be the matrix whose columns are the real and imaginary parts of the eigenvector $(1, 2i)$. That is,

$$T = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$

Then we compute easily that

$$T^{-1}AT = \begin{pmatrix} 0 & 2 \\ -2 & 0 \end{pmatrix},$$

which is in canonical form. The phase portraits of these systems are shown in Figure 3.8. Note that $T$ maps the circular solutions of the system $Y' = (T^{-1}AT)Y$ to elliptic solutions of $X' = AX$.

Figure 3.8 The change of variables T in the case of a center.

Example. (Repeated Eigenvalues) Suppose $A$ has a single real eigenvalue $\lambda$. If there exists a pair of linearly independent eigenvectors, then in fact $A$ must be in the form

$$A = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix},$$

so the system $X' = AX$ is easily solved (see Exercise 15).

For the more complicated case, let's assume that $V$ is an eigenvector and that every other eigenvector is a multiple of $V$. Let $W$ be any vector for which $V$ and $W$ are linearly independent. Then we have

$$AW = \mu V + \nu W$$

for some constants $\mu, \nu \in \mathbb{R}$. Note that $\mu \neq 0$, for otherwise we would have a second linearly independent eigenvector $W$ with eigenvalue $\nu$. We claim that $\nu = \lambda$. If $\nu - \lambda \neq 0$, a computation shows that

$$A\left(W + \left(\frac{\mu}{\nu - \lambda}\right)V\right) = \nu\left(W + \left(\frac{\mu}{\nu - \lambda}\right)V\right).$$

This says that $\nu$ is a second eigenvalue different from $\lambda$. Hence we must have $\nu = \lambda$.

Finally, let $U = (1/\mu)W$. Then

$$AU = V + \frac{\lambda}{\mu}W = V + \lambda U.$$

Thus if we define $TE_1 = V$, $TE_2 = U$, we get

$$T^{-1}AT = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$$

as required. Thus $X' = AX$ is again in canonical form after this change of coordinates.

EXERCISES

1. In Figure 3.9, you see six phase portraits. Match each of these phase portraits with one of the following linear systems:

(a) $\begin{pmatrix} 3 & 5 \\ -2 & -2 \end{pmatrix}$  (b) $\begin{pmatrix} -3 & -2 \\ 5 & 2 \end{pmatrix}$  (c) $\begin{pmatrix} 3 & -2 \\ 5 & -2 \end{pmatrix}$  (d) $\begin{pmatrix} -3 & 5 \\ -2 & 3 \end{pmatrix}$  (e) $\begin{pmatrix} 3 & 5 \\ -2 & -3 \end{pmatrix}$  (f) $\begin{pmatrix} -3 & 5 \\ -2 & 2 \end{pmatrix}$


Figure 3.9 Match these phase portraits with the systems in Exercise 1. (Six phase portraits, numbered 1 through 6.)

2. For each of the following systems of the form $X' = AX$:

(a) Find the eigenvalues and eigenvectors of $A$.
(b) Find the matrix $T$ that puts $A$ in canonical form.
(c) Find the general solution of both $X' = AX$ and $Y' = (T^{-1}AT)Y$.
(d) Sketch the phase portraits of both systems.

(i) $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$  (ii) $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$  (iii) $A = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$  (iv) $A = \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix}$  (v) $A = \begin{pmatrix} 1 & 1 \\ -1 & -3 \end{pmatrix}$  (vi) $A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$

3. Find the general solution of the following harmonic oscillator equations:

(a) $x'' + x' + x = 0$
(b) $x'' + 2x' + x = 0$

4. Consider the harmonic oscillator system

$$X' = \begin{pmatrix} 0 & 1 \\ -k & -b \end{pmatrix}X,$$

where $b \geq 0$, $k > 0$, and the mass $m = 1$.


(a) For which values of $k$, $b$ does this system have complex eigenvalues? Repeated eigenvalues? Real and distinct eigenvalues?
(b) Find the general solution of this system in each case.
(c) Describe the motion of the mass when the mass is released from the initial position $x = 1$ with zero velocity in each of the cases in part (a).

5. Sketch the phase portrait of $X' = AX$ where

$$A = \begin{pmatrix} a & 1 \\ 2a & 2 \end{pmatrix}.$$

For which values of $a$ do you find a bifurcation? Describe the phase portrait for $a$-values above and below the bifurcation point.

6. Consider the system

$$X' = \begin{pmatrix} 2a & b \\ b & 0 \end{pmatrix}X.$$

Sketch the regions in the $ab$–plane where this system has different types of canonical forms.

7. Consider the system

$$X' = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}X$$

with $\lambda \neq 0$. Show that all solutions tend to (respectively, away from) the origin tangentially to the eigenvector $(1, 0)$ when $\lambda < 0$ (respectively, $\lambda > 0$).

8. Find all $2 \times 2$ matrices that have pure imaginary eigenvalues. That is, determine conditions on the entries of a matrix that guarantee that the matrix has pure imaginary eigenvalues.

9. Determine a computable condition that guarantees that, if a matrix $A$ has complex eigenvalues, then solutions of $X' = AX$ travel around the origin in the counterclockwise direction.

10. Consider the system

$$X' = \begin{pmatrix} a & b \\ c & d \end{pmatrix}X,$$

where $a + d \neq 0$ but $ad - bc = 0$. Find the general solution of this system and sketch the phase portrait.


11. Find the general solution and describe completely the phase portrait for

$$X' = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}X.$$

12. Prove that

$$\alpha e^{\lambda t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \beta e^{\lambda t}\begin{pmatrix} t \\ 1 \end{pmatrix}$$

is the general solution of

$$X' = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}X.$$

13. Prove that a $2 \times 2$ matrix $A$ always satisfies its own characteristic equation. That is, if $\lambda^2 + \alpha\lambda + \beta = 0$ is the characteristic equation associated to $A$, then the matrix $A^2 + \alpha A + \beta I$ is the 0 matrix.

14. Suppose the $2 \times 2$ matrix $A$ has repeated eigenvalue $\lambda$. Let $V \in \mathbb{R}^2$. Using the previous problem, show that either $V$ is an eigenvector for $A$ or else $(A - \lambda I)V$ is an eigenvector for $A$.

15. Suppose the matrix $A$ has repeated real eigenvalue $\lambda$ and there exists a pair of linearly independent eigenvectors associated to $A$. Prove that

$$A = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}.$$

16. Consider the (nonlinear) system

$$x' = |y|, \qquad y' = -x.$$

Use the methods of this chapter to describe the phase portrait.


4 Classification of Planar Systems

In this chapter, we summarize what we have accomplished so far using a dynamical systems point of view. Among other things, this means that we would like to have a complete "dictionary" of all possible behaviors of $2 \times 2$ autonomous linear systems. One of the dictionaries we present here is geometric: the trace-determinant plane. The other dictionary is more dynamic: this involves the notion of conjugate systems.

4.1 The Trace-Determinant Plane

For a matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

we know that the eigenvalues are the roots of the characteristic equation, which can be written

$$\lambda^2 - (a + d)\lambda + (ad - bc) = 0.$$

The constant term in this equation is $\det A$. The coefficient of $\lambda$ also has a name: the quantity $a + d$ is called the trace of $A$ and is denoted by $\operatorname{tr} A$.


Thus the eigenvalues satisfy

$$\lambda^2 - (\operatorname{tr} A)\lambda + \det A = 0$$

and are given by

$$\lambda_\pm = \frac{1}{2}\left(\operatorname{tr} A \pm \sqrt{(\operatorname{tr} A)^2 - 4\det A}\right).$$

Note that $\lambda_+ + \lambda_- = \operatorname{tr} A$ and $\lambda_+\lambda_- = \det A$, so the trace is the sum of the eigenvalues of $A$ while the determinant is the product of the eigenvalues of $A$. We will also write $T = \operatorname{tr} A$ and $D = \det A$. Knowing $T$ and $D$ tells us the eigenvalues of $A$ and therefore virtually everything about the geometry of solutions of $X' = AX$. For example, the values of $T$ and $D$ tell us whether solutions spiral into or away from the origin, whether we have a center, and so forth.

We may display this classification visually by painting a picture in the trace-determinant plane. In this picture a matrix with trace $T$ and determinant $D$ corresponds to the point with coordinates $(T, D)$. The location of this point in the $TD$–plane then determines the geometry of the phase portrait as above. For example, the sign of $T^2 - 4D$ tells us that the eigenvalues are:

1. Complex with nonzero imaginary part if $T^2 - 4D < 0$;
2. Real and distinct if $T^2 - 4D > 0$;
3. Real and repeated if $T^2 - 4D = 0$.

Thus the location of $(T, D)$ relative to the parabola $T^2 - 4D = 0$ in the $TD$–plane tells us all we need to know about the eigenvalues of $A$ from an algebraic point of view.

In terms of phase portraits, however, we can say more. If $T^2 - 4D < 0$, then the real part of each eigenvalue is $T/2$, so we have a spiral sink if $T < 0$, a spiral source if $T > 0$, and a center if $T = 0$. If $T^2 - 4D > 0$ we have a similar breakdown into cases. In this region, both eigenvalues are real. If $D < 0$, then $T^2 - 4D > T^2$, so $\sqrt{T^2 - 4D} > |T|$. Thus we have

$$T + \sqrt{T^2 - 4D} > 0, \qquad T - \sqrt{T^2 - 4D} < 0,$$

so the eigenvalues are real and have different signs, and we have a saddle. If $D > 0$ and $T < 0$, then both eigenvalues are negative and we have a (real) sink, while $D > 0$ and $T > 0$ leads to a (real) source. When $D = 0$ and $T \neq 0$, we have one zero eigenvalue, while both eigenvalues vanish if $D = T = 0$.

Plotting all of this verbal information in the $TD$–plane gives us a visual summary of all of the different types of linear systems. The equations above partition the $TD$–plane into various regions in which systems of a particular type reside. See Figure 4.1. This yields a geometric classification of $2 \times 2$ linear systems.

Figure 4.1 The trace-determinant plane (horizontal axis $T = \operatorname{tr}$, vertical axis $D = \det$, with the parabola $T^2 = 4D$). Any resemblance to any of the authors' faces is purely coincidental.

A couple of remarks are in order. First, the trace-determinant plane is a two-dimensional representation of what is really a four-dimensional space, since $2 \times 2$ matrices are determined by four parameters, the entries of the matrix. Thus there are infinitely many different matrices corresponding to each point in the $TD$–plane. While all of these matrices share the same eigenvalue configuration, there may be subtle differences in the phase portraits, such as the direction of rotation for centers and spiral sinks and sources, or the possibility of one or two independent eigenvectors in the repeated eigenvalue case.

We also think of the trace-determinant plane as the analog of the bifurcation diagram for planar linear systems. A one-parameter family of linear systems corresponds to a curve in the $TD$–plane. When this curve crosses the $T$-axis, the positive $D$-axis, or the parabola $T^2 - 4D = 0$, the phase portrait of the linear system undergoes a bifurcation: a major change occurs in the geometry of the phase portrait.

Finally, note that we may obtain quite a bit of information about the system from $D$ and $T$ without ever computing the eigenvalues. For example, if $D < 0$, we know immediately that we have a saddle at the origin.
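The geometric classification just described is easy to mechanize. The following sketch is plain Python; the function name classify and the returned labels are ours, chosen to match the regions of Figure 4.1. It reports the type of the equilibrium of $X' = AX$ at the origin from $T$ and $D$ alone:

    def classify(T, D):
        """Classify X' = AX from the trace T and determinant D of A."""
        disc = T*T - 4*D
        if D < 0:
            return "saddle"
        if D == 0:
            return "zero eigenvalue(s)"     # one, or two if T == 0
        if disc < 0:                        # complex eigenvalues, real part T/2
            if T < 0: return "spiral sink"
            if T > 0: return "spiral source"
            return "center"
        if disc == 0:
            return "repeated-eigenvalue " + ("sink" if T < 0 else "source")
        return "real sink" if T < 0 else "real source"

    A = [[1, 3], [1, -1]]                   # the saddle example from Chapter 2
    T = A[0][0] + A[1][1]
    D = A[0][0]*A[1][1] - A[0][1]*A[1][0]
    print(classify(T, D))                   # saddle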


4.2 Dynamical Classification

In this section we give a different, more dynamical classification of planar linear systems, phrased in terms of the flow: we write $\phi(t, X_0)$ (or $\phi_t(X_0)$) for the solution of the system that satisfies the initial condition $\phi(0, X_0) = X_0$. We will consider two systems to be dynamically equivalent if there is a function $h$ that takes one flow to the other. We require that this function be a homeomorphism, that is, $h$ is a one-to-one, onto, and continuous function whose inverse is also continuous.

Definition. Suppose $X' = AX$ and $X' = BX$ have flows $\phi^A$ and $\phi^B$. These two systems are (topologically) conjugate if there exists a homeomorphism $h: \mathbb{R}^2 \to \mathbb{R}^2$ that satisfies

$$\phi^B(t, h(X_0)) = h(\phi^A(t, X_0)).$$

The homeomorphism $h$ is called a conjugacy. Thus a conjugacy takes the solution curves of $X' = AX$ to those of $X' = BX$.

Example. For the one-dimensional linear differential equations

$$x' = \lambda_1 x \quad\text{and}\quad x' = \lambda_2 x$$

we have the flows

$$\phi^j(t, x_0) = x_0e^{\lambda_j t}$$

for $j = 1, 2$. Suppose that $\lambda_1$ and $\lambda_2$ are nonzero and have the same sign. Then let

$$h(x) = \begin{cases} x^{\lambda_2/\lambda_1} & \text{if } x \geq 0 \\ -|x|^{\lambda_2/\lambda_1} & \text{if } x < 0. \end{cases}$$

We claim that $h$ is a conjugacy between these two flows. Indeed, when $x_0 > 0$ we have

$$h(\phi^1(t, x_0)) = \left(x_0e^{\lambda_1 t}\right)^{\lambda_2/\lambda_1} = x_0^{\lambda_2/\lambda_1}e^{\lambda_2 t} = \phi^2(t, h(x_0)),$$

as required. A similar computation works when $x_0 < 0$.

There are several things to note here. First, $\lambda_1$ and $\lambda_2$ must have the same sign, because otherwise we would have $|h(0)| = \infty$, in which case $h$ is not a homeomorphism. This agrees with our notion of dynamical equivalence: if $\lambda_1$ and $\lambda_2$ have the same sign, then their solutions behave similarly as either both tend to the origin or both tend away from the origin. Also, note that if $\lambda_2 < \lambda_1$, then $h$ is not differentiable at the origin, whereas if $\lambda_2 > \lambda_1$ then $h^{-1}(x) = x^{\lambda_1/\lambda_2}$ is not differentiable at the origin. This is the reason why we require $h$ to be only a homeomorphism and not a diffeomorphism (a differentiable homeomorphism with differentiable inverse): if we assume differentiability, then we must have $\lambda_1 = \lambda_2$, which does not yield a very interesting notion of "equivalence."

This gives a classification of (autonomous) linear first-order differential equations, which agrees with our qualitative observations in Chapter 1. There are three conjugacy "classes": the sinks, the sources, and the special "in-between" case, $x' = 0$, where all solutions are constants.

Now we move to the planar version of this scenario. We first note that we only need to decide on conjugacies among systems whose matrices are in canonical form. For, as we saw in Chapter 3, if the linear map $T: \mathbb{R}^2 \to \mathbb{R}^2$ puts $A$ in canonical form, then $T$ takes the time $t$ map of the flow of $Y' = (T^{-1}AT)Y$ to the time $t$ map for $X' = AX$.

Our classification of planar linear systems now proceeds just as in the one-dimensional case. We will stay away from the case where the system has eigenvalues with real part equal to 0, but you will tackle this case in the exercises.

Definition. A matrix $A$ is hyperbolic if none of its eigenvalues has real part 0. We also say that the system $X' = AX$ is hyperbolic.

Theorem. Suppose that the $2 \times 2$ matrices $A_1$ and $A_2$ are hyperbolic. Then the linear systems $X' = A_iX$ are conjugate if and only if each matrix has the same number of eigenvalues with negative real part.

Thus two hyperbolic matrices yield conjugate linear systems if both sets of eigenvalues fall into the same category below:

1. One eigenvalue is positive and the other is negative;


2. Both eigenvalues have negative real parts;
3. Both eigenvalues have positive real parts.

Before proving this, note that this theorem implies that a system with a spiral sink is conjugate to a system with a (real) sink. Of course! Even though their phase portraits look very different, it is nevertheless the case that all solutions of both systems share the same fate: they tend to the origin as $t \to \infty$.

Proof: Recall from the previous discussion that we may assume all systems are in canonical form. Then the proof divides into three distinct cases.

Case 1. Suppose we have two linear systems $X' = A_iX$ for $i = 1, 2$ such that each $A_i$ has eigenvalues $\lambda_i < 0 < \mu_i$. Thus each system has a saddle at the origin. This is the easy case. As we saw previously, the real differential equations $x' = \lambda_ix$ have conjugate flows via the homeomorphism

$$h_1(x) = \begin{cases} x^{\lambda_2/\lambda_1} & \text{if } x \geq 0 \\ -|x|^{\lambda_2/\lambda_1} & \text{if } x < 0. \end{cases}$$

Similarly, the equations $y' = \mu_iy$ have conjugate flows via an analogous homeomorphism $h_2$, and one checks easily that $H(x, y) = (h_1(x), h_2(y))$ then provides a conjugacy between the two planar systems.

Case 2. Suppose next that $X' = AX$, where $A$ is in canonical form with eigenvalues having negative real parts, and that $A$ is not in the form

$$\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}.$$

Then $A$ assumes one of the two forms

$$(a) \ \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix} \qquad (b) \ \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$$

with $\alpha, \lambda, \mu < 0$. We will show that, in either case, the system is conjugate to $X' = BX$ where

$$B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}.$$

It then follows that any two systems of this form are conjugate.

Consider the unit circle in the plane parameterized by the curve $X(\theta) = (\cos\theta, \sin\theta)$, $0 \leq \theta \leq 2\pi$. We denote this circle by $S^1$. We first claim that the vector field determined by a matrix in the above form must point inside $S^1$. In case (a), we have that the vector field on $S^1$ is given by

$$AX(\theta) = \begin{pmatrix} \alpha\cos\theta + \beta\sin\theta \\ -\beta\cos\theta + \alpha\sin\theta \end{pmatrix}.$$

The outward pointing normal vector to $S^1$ at $X(\theta)$ is

$$N(\theta) = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}.$$

The dot product of these two vectors satisfies

$$AX(\theta)\cdot N(\theta) = \alpha(\cos^2\theta + \sin^2\theta) < 0$$

since $\alpha < 0$. This shows that $AX(\theta)$ does indeed point inside $S^1$. Case (b) is even easier.

As a consequence, each nonzero solution of $X' = AX$ crosses $S^1$ exactly once. Let $\phi^A_t$ denote the time $t$ map for this system, and let $\tau = \tau(x, y)$ denote the time at which $\phi^A_t(x, y)$ meets $S^1$. Thus

$$\left|\phi^A_{\tau(x,y)}(x, y)\right| = 1.$$

Let $\phi^B_t$ denote the time $t$ map for the system $X' = BX$. Clearly,

$$\phi^B_t(x, y) = (e^{-t}x, e^{-t}y).$$

We now define a conjugacy $H$ between these two systems. If $(x, y) \neq (0, 0)$, let

$$H(x, y) = \phi^B_{-\tau(x,y)}\,\phi^A_{\tau(x,y)}(x, y)$$

and set $H(0, 0) = (0, 0)$. Geometrically, the value of $H(x, y)$ is given by following the solution curve of $X' = AX$ exactly $\tau(x, y)$ time units (forward or backward) until the solution reaches $S^1$, and then following the solution of $X' = BX$ starting at that point on $S^1$ and proceeding in the opposite time direction exactly $\tau$ time units. See Figure 4.2.

Figure 4.2 The definition of $\tau(x, y)$.

To see that $H$ gives a conjugacy, note first that

$$\tau\left(\phi^A_s(x, y)\right) = \tau(x, y) - s$$

since

$$\phi^A_{\tau - s}\,\phi^A_s(x, y) = \phi^A_\tau(x, y) \in S^1.$$

Therefore we have

$$H\left(\phi^A_s(x, y)\right) = \phi^B_{-\tau + s}\left(\phi^A_{\tau - s}\,\phi^A_s(x, y)\right) = \phi^B_s\,\phi^B_{-\tau}\,\phi^A_\tau(x, y) = \phi^B_s\left(H(x, y)\right).$$

So $H$ is a conjugacy.

Now we show that $H$ is a homeomorphism. We can construct an inverse for $H$ by simply reversing the process defining $H$. That is, let

$$G(x, y) = \phi^A_{-\tau_1(x,y)}\,\phi^B_{\tau_1(x,y)}(x, y)$$

and set $G(0, 0) = (0, 0)$. Here $\tau_1(x, y)$ is the time for the solution of $X' = BX$ through $(x, y)$ to reach $S^1$. An easy computation shows that $\tau_1(x, y) = \log r$ where $r^2 = x^2 + y^2$. Clearly, $G = H^{-1}$, so $H$ is one to one and onto. Also, $G$ is continuous at $(x, y) \neq (0, 0)$ since $G$ may be written

$$G(x, y) = \phi^A_{-\log r}\left(\frac{x}{r}, \frac{y}{r}\right),$$


which is a composition of continuous functions. For continuity of $G$ at the origin, suppose that $(x, y)$ is close to the origin, so that $r$ is small. Observe that as $r \to 0$, $-\log r \to \infty$. Now $(x/r, y/r)$ is a point on $S^1$, and for $r$ sufficiently small, $\phi^A_{-\log r}$ maps the unit circle very close to $(0, 0)$. This shows that $G$ is continuous at $(0, 0)$.

We thus need only show continuity of $H$. For this, we need to show that $\tau(x, y)$ is continuous. But $\tau$ is determined by the equation

$$\left|\phi^A_t(x, y)\right| = 1.$$

We write $\phi^A_t(x, y) = (x(t), y(t))$. Taking the partial derivative of $|\phi^A_t(x, y)|$ with respect to $t$, we find

$$\frac{\partial}{\partial t}\left|\phi^A_t(x, y)\right| = \frac{\partial}{\partial t}\sqrt{(x(t))^2 + (y(t))^2} = \frac{x(t)x'(t) + y(t)y'(t)}{\sqrt{(x(t))^2 + (y(t))^2}} = \frac{1}{\left|\phi^A_t(x, y)\right|}\left(\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}\cdot\begin{pmatrix} x'(t) \\ y'(t) \end{pmatrix}\right).$$

But the latter dot product is nonzero when $t = \tau(x, y)$, since the vector field given by $(x'(t), y'(t))$ points inside $S^1$. Hence

$$\frac{\partial}{\partial t}\left|\phi^A_t(x, y)\right| \neq 0$$

at $(\tau(x, y), x, y)$. Thus we may apply the implicit function theorem to show that $\tau$ is differentiable at $(x, y)$ and hence continuous. Continuity of $H$ at the origin follows as in the case of $G = H^{-1}$. Thus $H$ is a homeomorphism, and we have a conjugacy between $X' = AX$ and $X' = BX$. Note that this proof works equally well if the eigenvalues have positive real parts.

Case 3. Finally, suppose that

$$A = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}$$

with $\lambda < 0$. The associated vector field need not point inside the unit circle in this case. However, if we let

$$T = \begin{pmatrix} 1 & 0 \\ 0 & \epsilon \end{pmatrix},$$

then the vector field given by

$$Y' = (T^{-1}AT)Y$$

now does have this property, provided $\epsilon > 0$ is sufficiently small. Indeed,

$$T^{-1}AT = \begin{pmatrix} \lambda & \epsilon \\ 0 & \lambda \end{pmatrix},$$

so that

$$(T^{-1}AT)\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}\cdot\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} = \lambda + \epsilon\sin\theta\cos\theta.$$

Thus if we choose $\epsilon < -\lambda$, this dot product is negative. Therefore the change of variables $T$ puts us into the situation where the same proof as in Case 2 applies. This completes the proof.

4.3 Exploration: A 3D Parameter Space

Consider the three-parameter family of linear systems given by

$$X' = \begin{pmatrix} a & b \\ c & 0 \end{pmatrix}X,$$

where $a$, $b$, and $c$ are parameters.

1. First, fix $a > 0$. Describe the analog of the trace-determinant plane in the $bc$–plane. That is, identify the $bc$–values in this plane where the corresponding system has saddles, centers, spiral sinks, etc. Sketch these regions in the $bc$–plane.
2. Repeat the previous question when $a < 0$.
3. Describe the analogous regions in the $bc$–plane when $a = 0$.
4. Use the above information to assemble a picture of the full three-dimensional parameter space for this family. You might build a three-dimensional model of this space, or sketch a flip book of the two-dimensional $bc$–slices


as, say, $a$ varies, or use a computer model to visualize this image. In any event, your model should accurately capture all of the distinct regions in this space.

EXERCISES

1. Consider the one-parameter family of linear systems given by

$$X' = \begin{pmatrix} a & \sqrt{2} + a/2 \\ \sqrt{2} - a/2 & 0 \end{pmatrix}X.$$

(a) Sketch the path traced out by this family of linear systems in the trace-determinant plane as $a$ varies.
(b) Discuss any bifurcations that occur along this path and compute the corresponding values of $a$.

2. Sketch the analog of the trace-determinant plane for the two-parameter family of systems

$$X' = \begin{pmatrix} a & b \\ b & a \end{pmatrix}X$$

in the $ab$–plane. That is, identify the regions in the $ab$–plane where this system has similar phase portraits.

3. Consider the harmonic oscillator equation (with $m = 1$)

$$x'' + bx' + kx = 0,$$

where $b \geq 0$ and $k > 0$. Identify the regions in the relevant portion of the $bk$–plane where the corresponding system has similar phase portraits.

4. Prove that $H(x, y) = (x, -y)$ provides a conjugacy between

$$X' = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}X \quad\text{and}\quad Y' = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}Y.$$

5. For each of the following systems, find an explicit conjugacy between their flows.

(a) $X' = \begin{pmatrix} -1 & 1 \\ 0 & 2 \end{pmatrix}X$ and $Y' = \begin{pmatrix} 1 & 0 \\ 1 & -2 \end{pmatrix}Y$

(b) $X' = \begin{pmatrix} 0 & 1 \\ -4 & 0 \end{pmatrix}X$ and $Y' = \begin{pmatrix} 0 & 2 \\ -2 & 0 \end{pmatrix}Y$


6. Prove that any two linear systems with the same eigenvalues $\pm i\beta$, $\beta \neq 0$, are conjugate. What happens if the systems have eigenvalues $\pm i\beta$ and $\pm i\gamma$ with $\beta \neq \gamma$? What if $\gamma = -\beta$?

7. Consider all linear systems with exactly one eigenvalue equal to 0. Which of these systems are conjugate? Prove this.

8. Consider all linear systems with two zero eigenvalues. Which of these systems are conjugate? Prove this.

9. Provide a complete description of the conjugacy classes for $2 \times 2$ systems in the nonhyperbolic case.




5 Higher Dimensional Linear Algebra

As in Chapter 2, we need to make another detour into the world of linear algebra before proceeding to the solution of higher dimensional linear systems of differential equations. There are many different canonical forms for matrices in higher dimensions, but most of the algebraic ideas involved in changing coordinates to put matrices into these forms are already present in the $2 \times 2$ case. In particular, the case of matrices with distinct (real or complex) eigenvalues can be handled with minimal additional algebraic complications, so we deal with this case first. This is the "generic case," as we show in Section 5.6. Matrices with repeated eigenvalues demand more sophisticated concepts from linear algebra; we provide this background in Section 5.4. We assume throughout this chapter that the reader is familiar with solving systems of linear algebraic equations by putting the associated matrix in (reduced) row echelon form.

5.1 Preliminaries from Linear Algebra

In this section we generalize many of the algebraic notions of Section 2.3 to higher dimensions. We denote a vector $X \in \mathbb{R}^n$ in coordinate form as

$$X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$


In the plane, we called a pair of vectors $V$ and $W$ linearly independent if they were not collinear. Equivalently, $V$ and $W$ were linearly independent if there were no (nonzero) real numbers $\alpha$ and $\beta$ such that $\alpha V + \beta W$ was the zero vector.

More generally, a collection of vectors $V_1, \ldots, V_k$ in $\mathbb{R}^n$ is said to be linearly independent if, whenever

$$\alpha_1V_1 + \cdots + \alpha_kV_k = 0$$

with $\alpha_j \in \mathbb{R}$, it follows that each $\alpha_j = 0$. If we can find such $\alpha_1, \ldots, \alpha_k$, not all of which are 0, then the vectors are linearly dependent. Note that if $V_1, \ldots, V_k$ are linearly independent and $W$ is the linear combination

$$W = \beta_1V_1 + \cdots + \beta_kV_k,$$

then the $\beta_j$ are unique. This follows since, if we could also write

$$W = \gamma_1V_1 + \cdots + \gamma_kV_k,$$

then we would have

$$0 = W - W = (\beta_1 - \gamma_1)V_1 + \cdots + (\beta_k - \gamma_k)V_k,$$

which forces $\beta_j = \gamma_j$ for each $j$, by linear independence of the $V_j$.

Example. The vectors $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$ are clearly linearly independent in $\mathbb{R}^3$. More generally, let $E_j$ be the vector in $\mathbb{R}^n$ whose $j$th component is 1 and all other components are 0. Then the vectors $E_1, \ldots, E_n$ are linearly independent in $\mathbb{R}^n$. The collection of vectors $E_1, \ldots, E_n$ is called the standard basis of $\mathbb{R}^n$. We will discuss the concept of a basis in Section 5.4.

Example. The vectors $(1, 0, 0)$, $(1, 1, 0)$, and $(1, 1, 1)$ in $\mathbb{R}^3$ are also linearly independent, because if we have

$$\alpha_1\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \alpha_2\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + \alpha_3\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \alpha_1 + \alpha_2 + \alpha_3 \\ \alpha_2 + \alpha_3 \\ \alpha_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},$$

then the third component says that $\alpha_3 = 0$. The fact that $\alpha_3 = 0$ in the second component then says that $\alpha_2 = 0$, and finally the first component similarly tells us that $\alpha_1 = 0$. On the other hand, the vectors $(1, 1, 1)$, $(1, 2, 3)$, and $(2, 3, 4)$ are linearly dependent, for we have

$$1\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + 1\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} - 1\begin{pmatrix} 2 \\ 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$
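Linear independence can be tested mechanically by stacking the vectors as the columns of a matrix and computing its rank; for $n$ vectors in $\mathbb{R}^n$ this is equivalent to the determinant criterion that appears later in this section. A sketch assuming NumPy, applied to the two examples above:

    import numpy as np

    V = np.column_stack([(1, 0, 0), (1, 1, 0), (1, 1, 1)])
    W = np.column_stack([(1, 1, 1), (1, 2, 3), (2, 3, 4)])

    print(np.linalg.matrix_rank(V))   # 3: the columns are linearly independent
    print(np.linalg.matrix_rank(W))   # 2: the columns are linearly dependent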


When solving linear systems of differential equations, we will often encounter special subsets of $\mathbb{R}^n$ called subspaces. A subspace of $\mathbb{R}^n$ is a collection of all possible linear combinations of a given set of vectors. More precisely, given $V_1, \ldots, V_k \in \mathbb{R}^n$, the set

$$S = \{\alpha_1V_1 + \cdots + \alpha_kV_k \mid \alpha_j \in \mathbb{R}\}$$

is a subspace of $\mathbb{R}^n$. In this case we say that $S$ is spanned by $V_1, \ldots, V_k$. Equivalently, it can be shown (see Exercise 12 at the end of this chapter) that a subspace $S$ is a nonempty subset of $\mathbb{R}^n$ having the following two properties:

1. If $X, Y \in S$, then $X + Y \in S$;
2. If $X \in S$ and $\alpha \in \mathbb{R}$, then $\alpha X \in S$.

Note that the zero vector lies in every subspace of $\mathbb{R}^n$ and that any linear combination of vectors in a subspace $S$ also lies in $S$.

Example. Any straight line through the origin in $\mathbb{R}^n$ is a subspace of $\mathbb{R}^n$, since this line may be written as $\{tV \mid t \in \mathbb{R}\}$ for some nonzero $V \in \mathbb{R}^n$. The single vector $V$ spans this subspace. The plane $P$ defined by $x + y + z = 0$ in $\mathbb{R}^3$ is a subspace of $\mathbb{R}^3$. Indeed, any vector $V$ in $P$ may be written in the form $(x, y, -x - y)$, or

$$V = x\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + y\begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix},$$

which shows that the vectors $(1, 0, -1)$ and $(0, 1, -1)$ span $P$.

In linear algebra, one often encounters rectangular $n \times m$ matrices, but in differential equations, most often these matrices are square ($n \times n$). Consequently we will assume that all matrices in this chapter are $n \times n$. We write such a matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix}$$

more compactly as $A = [a_{ij}]$.

For $X = (x_1, \ldots, x_n) \in \mathbb{R}^n$, we define the product $AX$ to be the vector

$$AX = \begin{pmatrix} \sum_{j=1}^n a_{1j}x_j \\ \vdots \\ \sum_{j=1}^n a_{nj}x_j \end{pmatrix},$$


so that the $i$th entry in this vector is the dot product of the $i$th row of $A$ with the vector $X$.

Matrix sums are defined in the obvious way. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $n \times n$ matrices, then we define $A + B = C$ where $C = [a_{ij} + b_{ij}]$. Matrix arithmetic has some obvious linearity properties:

1. $A(k_1X_1 + k_2X_2) = k_1AX_1 + k_2AX_2$, where $k_j \in \mathbb{R}$, $X_j \in \mathbb{R}^n$;
2. $A + B = B + A$;
3. $(A + B) + C = A + (B + C)$.

The product of the $n \times n$ matrices $A$ and $B$ is defined to be the $n \times n$ matrix $AB = [c_{ij}]$, where

$$c_{ij} = \sum_{k=1}^n a_{ik}b_{kj},$$

so that $c_{ij}$ is the dot product of the $i$th row of $A$ with the $j$th column of $B$. We can easily check that, if $A$, $B$, and $C$ are $n \times n$ matrices, then

1. $(AB)C = A(BC)$;
2. $A(B + C) = AB + AC$;
3. $(A + B)C = AC + BC$;
4. $k(AB) = (kA)B = A(kB)$ for any $k \in \mathbb{R}$.

All of the above properties of matrix arithmetic are easily checked by writing out the $ij$ entries of the corresponding matrices. It is important to remember that matrix multiplication is not commutative, so that $AB \neq BA$ in general. For example

$$\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \quad\text{whereas}\quad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.$$

Also, matrix cancellation is usually forbidden; if $AB = AC$, then we do not necessarily have $B = C$, as in

$$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix}.$$

In particular, if $AB$ is the zero matrix, it does not follow that one of $A$ or $B$ is also the zero matrix.
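Both cautions are easy to reproduce by machine. A small sketch (NumPy assumed) using the same matrices as above:

    import numpy as np

    A = np.array([[1, 0], [1, 1]])
    B = np.array([[1, 1], [0, 1]])
    print(np.array_equal(A @ B, B @ A))    # False: multiplication is not commutative

    C = np.array([[1, 1], [1, 1]])
    D1 = np.array([[1, 0], [0, 0]])
    D2 = np.array([[1/2, 1/2], [1/2, -1/2]])
    print(np.array_equal(C @ D1, C @ D2))  # True, yet D1 != D2: no cancellation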


The $n \times n$ matrix $A$ is invertible if there exists an $n \times n$ matrix $C$ for which $AC = CA = I$, where $I$ is the $n \times n$ identity matrix that has 1s along the diagonal and 0s elsewhere. The matrix $C$ is called the inverse of $A$. Note that if $A$ has an inverse, then this inverse is unique. For if $AB = BA = I$ as well, then

$$C = CI = C(AB) = (CA)B = IB = B.$$

The inverse of $A$ is denoted by $A^{-1}$.

If $A$ is invertible, then the vector equation $AX = V$ has a unique solution for any $V \in \mathbb{R}^n$. Indeed, $A^{-1}V$ is one solution. Moreover, it is the only one, for if $Y$ is another solution, then we have

$$Y = (A^{-1}A)Y = A^{-1}(AY) = A^{-1}V.$$

For the converse of this statement, recall that the equation $AX = V$ has unique solutions if and only if the reduced row echelon form of the matrix $A$ is the identity matrix. The reduced row echelon form of $A$ is obtained by applying to $A$ a sequence of elementary row operations of the form:

1. Add $k$ times row $i$ of $A$ to row $j$;
2. Interchange rows $i$ and $j$;
3. Multiply row $i$ by $k \neq 0$.

Note that these elementary row operations correspond exactly to the operations that are used to solve linear systems of algebraic equations:

1. Add $k$ times equation $i$ to equation $j$;
2. Interchange equations $i$ and $j$;
3. Multiply equation $i$ by $k \neq 0$.

Each of these elementary row operations may be represented by multiplying $A$ by an elementary matrix. For example, if $L = [l_{ij}]$ is the matrix that has 1s along the diagonal, $l_{ji} = k$ for some choice of $i$ and $j$ with $i \neq j$, and all other entries 0, then $LA$ is the matrix that is obtained by performing row operation 1 on $A$. Similarly, if $L$ has 1s along the diagonal with the exception that $l_{ii} = l_{jj} = 0$, but $l_{ij} = l_{ji} = 1$, and all other entries are 0, then $LA$ is the matrix that results after performing row operation 2 on $A$. Finally, if $L$ is the identity matrix with a $k$ instead of 1 in the $ii$ position, then $LA$ is the matrix obtained by performing row operation 3. A matrix $L$ in one of these three forms is called an elementary matrix.

Each elementary matrix is invertible, since its inverse is given by the matrix that simply "undoes" the corresponding row operation. As a consequence, any product of elementary matrices is invertible. Therefore, if $L_1, \ldots, L_n$ are the elementary matrices that correspond to the row operations that put $A$ into the reduced row echelon form, which is the identity matrix, then $(L_n \cdots L_1) = A^{-1}$. That is, if the vector equation $AX = V$ has unique solutions for any $V \in \mathbb{R}^n$, then $A$ is invertible. Thus we have our first important result.

Proposition. Let $A$ be an $n \times n$ matrix. Then the system of algebraic equations $AX = V$ has a unique solution for any $V \in \mathbb{R}^n$ if and only if $A$ is invertible.

Thus the natural question now is: How do we tell if $A$ is invertible? One answer is provided by the following result.

Proposition. The matrix $A$ is invertible if and only if the columns of $A$ form a linearly independent set of vectors.

Proof: Suppose first that $A$ is invertible and has columns $V_1, \ldots, V_n$. We have $AE_j = V_j$, where the $E_j$ form the standard basis of $\mathbb{R}^n$. If the $V_j$ are not linearly independent, we may find real numbers $\alpha_1, \ldots, \alpha_n$, not all zero, such that $\sum_j \alpha_jV_j = 0$. But then

$$0 = \sum_{j=1}^n \alpha_jAE_j = A\left(\sum_{j=1}^n \alpha_jE_j\right).$$

Hence the equation $AX = 0$ has two solutions, the nonzero vector $(\alpha_1, \ldots, \alpha_n)$ and the 0 vector. This contradicts the previous proposition.

Conversely, suppose that the $V_j$ are linearly independent. If $A$ is not invertible, then we may find a pair of vectors $X_1$ and $X_2$ with $X_1 \neq X_2$ and $AX_1 = AX_2$. Therefore the nonzero vector $Z = X_1 - X_2$ satisfies $AZ = 0$. Let $Z = (\alpha_1, \ldots, \alpha_n)$. Then we have

$$0 = AZ = \sum_{j=1}^n \alpha_jV_j,$$

so that the $V_j$ are not linearly independent. This contradiction establishes the result.

A more computable criterion for determining whether or not a matrix is invertible, as in the $2 \times 2$ case, is given by the determinant of $A$. Given the $n \times n$ matrix $A$, we will denote by $A_{ij}$ the $(n-1) \times (n-1)$ matrix obtained by deleting the $i$th row and $j$th column of $A$.


Definition
The determinant of $A = [a_{ij}]$ is defined inductively by
$$\det A = \sum_{k=1}^{n} (-1)^{1+k} a_{1k} \det A_{1k}.$$
Note that we know the determinant of a $2 \times 2$ matrix, so this induction makes sense for $n > 2$.

Example. From the definition we compute
$$\det\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix} = 1\det\begin{pmatrix}5&6\\8&9\end{pmatrix} - 2\det\begin{pmatrix}4&6\\7&9\end{pmatrix} + 3\det\begin{pmatrix}4&5\\7&8\end{pmatrix} = -3 + 12 - 9 = 0.$$

We remark that the definition of $\det A$ given above involves "expanding along the first row" of $A$. One can equally well expand along the $j$th row, so that
$$\det A = \sum_{k=1}^{n} (-1)^{j+k} a_{jk} \det A_{jk}.$$
We will not prove this fact; the proof is an entirely straightforward though tedious calculation. Similarly, $\det A$ can be calculated by expanding along a given column (see Exercise 1).

Example. Expanding the matrix in the previous example along the second and third rows yields the same result:
$$\det\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix} = -4\det\begin{pmatrix}2&3\\8&9\end{pmatrix} + 5\det\begin{pmatrix}1&3\\7&9\end{pmatrix} - 6\det\begin{pmatrix}1&2\\7&8\end{pmatrix} = 24 - 60 + 36 = 0$$
$$= 7\det\begin{pmatrix}2&3\\5&6\end{pmatrix} - 8\det\begin{pmatrix}1&3\\4&6\end{pmatrix} + 9\det\begin{pmatrix}1&2\\4&5\end{pmatrix} = -21 + 48 - 27 = 0.$$
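The inductive definition translates directly into a recursive program. The sketch below is our illustration, not the book's; it expands along the first row exactly as in the definition (with the $1 \times 1$ case as base, which agrees with the $2 \times 2$ formula). It is faithful but exponentially slow; `np.linalg.det` is the practical choice.

```python
import numpy as np

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for k in range(n):
        # Minor: A with row 0 and column k deleted (A_{1,k+1} in the text).
        minor = np.delete(np.delete(A, 0, axis=0), k, axis=1)
        total += (-1) ** k * A[0, k] * det(minor)   # (-1)**k matches (-1)^{1+k} for 1-based k
    return total

A = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
print(det(A))                 # 0.0, as computed in the example
print(np.linalg.det(A))       # ~0, library value for comparison
```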


Incidentally, note that this matrix is not invertible, since
$$\begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}\begin{pmatrix}1\\-2\\1\end{pmatrix} = \begin{pmatrix}0\\0\\0\end{pmatrix}.$$

The determinant of certain types of matrices is easy to compute. A matrix $[a_{ij}]$ is called upper triangular if all entries below the main diagonal are 0; that is, $a_{ij} = 0$ if $i > j$. Lower triangular matrices are defined similarly. We have:

Proposition. If $A$ is an upper or lower triangular $n \times n$ matrix, then $\det A$ is the product of the entries along the diagonal; that is, $\det[a_{ij}] = a_{11} \cdots a_{nn}$.

The proof is a straightforward application of induction. The following proposition describes the effects that elementary row operations have on the determinant of a matrix.

Proposition. Let $A$ and $B$ be $n \times n$ matrices.

1. Suppose the matrix $B$ is obtained by adding a multiple of one row of $A$ to another row of $A$. Then $\det B = \det A$.
2. Suppose $B$ is obtained by interchanging two rows of $A$. Then $\det B = -\det A$.
3. Suppose $B$ is obtained by multiplying each element of a row of $A$ by $k$. Then $\det B = k \det A$.

Proof: The proof of the proposition is straightforward when $A$ is a $2 \times 2$ matrix, so we use induction. Suppose $A$ is $k \times k$ with $k > 2$. To compute $\det B$, we expand along a row that is left untouched by the row operation. By induction on $k$, we see that $\det B$ is a sum of determinants of size $(k-1) \times (k-1)$. Each of these subdeterminants has precisely the same row operation performed on it as in the case of the full matrix. By induction, it follows that each of these subdeterminants is multiplied by $1$, $-1$, or $k$ in cases 1, 2, and 3, respectively. Hence $\det B$ has the same property.

In particular, we note that if $L$ is an elementary matrix, then
$$\det(LA) = (\det L)(\det A).$$
Indeed, $\det L = 1$, $-1$, or $k$ in cases 1–3 above (see Exercise 7). The preceding proposition now yields a criterion for $A$ to be invertible:

Corollary. (Invertibility Criterion) The matrix $A$ is invertible if and only if $\det A \neq 0$.


Proof: By elementary row operations, we can manipulate any matrix $A$ into an upper triangular matrix. Then $A$ is invertible if and only if all diagonal entries of this row-reduced matrix are nonzero; in particular, the determinant of this matrix is nonzero. Now, by the previous observation, row operations multiply $\det A$ by nonzero numbers, so we see that all of the diagonal entries are nonzero if and only if $\det A$ is also nonzero. This concludes the proof.

We conclude this section with a further important property of determinants.

Proposition. $\det(AB) = (\det A)(\det B)$.

Proof: If either $A$ or $B$ is noninvertible, then $AB$ is also noninvertible (see Exercise 11). Hence the proposition is true in this case, since both sides of the equation are zero. If $A$ is invertible, then we can write
$$A = L_1 \cdots L_n \cdot I,$$
where each $L_j$ is an elementary matrix. Hence
$$\det(AB) = \det(L_1 \cdots L_n B) = \det(L_1)\det(L_2 \cdots L_n B) = \det(L_1)(\det L_2)\cdots(\det L_n)(\det B) = \det(L_1 \cdots L_n)\det(B) = \det(A)\det(B).$$

5.2 Eigenvalues and Eigenvectors

As we saw in Chapter 3, eigenvalues and eigenvectors play a central role in the process of solving linear systems of differential equations.

Definition
A vector $V$ is an eigenvector of an $n \times n$ matrix $A$ if $V$ is a nonzero solution to the system of linear equations $(A - \lambda I)V = 0$. The quantity $\lambda$ is called an eigenvalue of $A$, and $V$ is an eigenvector associated to $\lambda$.

Just as in Chapter 2, the eigenvalues of a matrix $A$ may be real or complex, and the associated eigenvectors may have complex entries.


By the invertibility criterion of the previous section, it follows that $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is a root of the characteristic equation
$$\det(A - \lambda I) = 0.$$
Since $A$ is $n \times n$, this is a polynomial equation of degree $n$, which therefore has exactly $n$ roots (counted with multiplicity).

As we saw in $\mathbb{R}^2$, there are many different types of solutions of systems of differential equations, and these types depend on the configuration of the eigenvalues of $A$ and the resulting canonical forms. There are many, many more types of canonical forms in higher dimensions. We will describe these types in this and the following sections, but we will relegate some of the more specialized proofs of these facts to the exercises.

Suppose first that $\lambda_1, \ldots, \lambda_l$ are real and distinct eigenvalues of $A$ with associated eigenvectors $V_1, \ldots, V_l$. Here "distinct" means that no two of the eigenvalues are equal. Thus $AV_k = \lambda_k V_k$ for each $k$. We claim that the $V_k$ are linearly independent. If not, we may choose a maximal subset of the $V_i$ that are linearly independent, say $V_1, \ldots, V_j$. Then any other eigenvector may be written in a unique way as a linear combination of $V_1, \ldots, V_j$. Say $V_{j+1}$ is one such eigenvector. Then we may find $\alpha_i$, not all 0, such that
$$V_{j+1} = \alpha_1 V_1 + \cdots + \alpha_j V_j.$$
Multiplying both sides of this equation by $A$, we find
$$\lambda_{j+1} V_{j+1} = \alpha_1 A V_1 + \cdots + \alpha_j A V_j = \alpha_1 \lambda_1 V_1 + \cdots + \alpha_j \lambda_j V_j.$$
Now $\lambda_{j+1} \neq 0$, for otherwise we would have
$$\alpha_1 \lambda_1 V_1 + \cdots + \alpha_j \lambda_j V_j = 0,$$
with each $\lambda_i \neq 0$. This contradicts the fact that $V_1, \ldots, V_j$ are linearly independent. Hence we have
$$V_{j+1} = \frac{\alpha_1 \lambda_1}{\lambda_{j+1}} V_1 + \cdots + \frac{\alpha_j \lambda_j}{\lambda_{j+1}} V_j.$$
Since the $\lambda_i$ are distinct and not all of the $\alpha_i$ vanish, we have now written $V_{j+1}$ in two different ways as a linear combination of $V_1, \ldots, V_j$. This contradicts the uniqueness of such a representation, which follows from the linear independence of these vectors. We have proved:

Proposition. Suppose $\lambda_1, \ldots, \lambda_l$ are real and distinct eigenvalues for $A$ with associated eigenvectors $V_1, \ldots, V_l$. Then the $V_j$ are linearly independent.


Of primary importance when we return to differential equations is the following corollary.

Corollary. Suppose $A$ is an $n \times n$ matrix with real, distinct eigenvalues. Then there is a matrix $T$ such that
$$T^{-1}AT = \begin{pmatrix}\lambda_1 & & \\ & \ddots & \\ & & \lambda_n\end{pmatrix},$$
where all of the entries off the diagonal are 0.

Proof: Let $V_j$ be an eigenvector associated to $\lambda_j$. Consider the linear map $T$ for which $TE_j = V_j$, where the $E_j$ form the standard basis of $\mathbb{R}^n$. That is, $T$ is the matrix whose columns are $V_1, \ldots, V_n$. Since the $V_j$ are linearly independent, $T$ is invertible and we have
$$(T^{-1}AT)E_j = T^{-1}AV_j = \lambda_j T^{-1}V_j = \lambda_j E_j.$$
That is, the $j$th column of $T^{-1}AT$ is just the vector $\lambda_j E_j$, as required.

Example. Let
$$A = \begin{pmatrix}1&2&-1\\0&3&-2\\0&2&-2\end{pmatrix}.$$
Expanding $\det(A - \lambda I)$ along the first column, we find that the characteristic equation of $A$ is
$$\det(A-\lambda I) = (1-\lambda)\det\begin{pmatrix}3-\lambda & -2\\ 2 & -2-\lambda\end{pmatrix} = (1-\lambda)\big((3-\lambda)(-2-\lambda)+4\big) = (1-\lambda)(\lambda-2)(\lambda+1),$$
so the eigenvalues are 2, 1, and $-1$. The eigenvector corresponding to $\lambda = 2$ is given by solving the equations $(A - 2I)X = 0$, which yields
$$-x + 2y - z = 0,\qquad y - 2z = 0,\qquad 2y - 4z = 0.$$


These equations reduce to
$$x - 3z = 0,\qquad y - 2z = 0.$$
Hence $V_1 = (3, 2, 1)$ is an eigenvector associated to $\lambda = 2$. In similar fashion we find that $(1, 0, 0)$ is an eigenvector associated to $\lambda = 1$, while $(0, 1, 2)$ is an eigenvector associated to $\lambda = -1$. We then set
$$T = \begin{pmatrix}3&1&0\\2&0&1\\1&0&2\end{pmatrix}.$$
A simple calculation shows that
$$AT = T\begin{pmatrix}2&0&0\\0&1&0\\0&0&-1\end{pmatrix}.$$
Since $\det T = -3$, $T$ is invertible, and we have
$$T^{-1}AT = \begin{pmatrix}2&0&0\\0&1&0\\0&0&-1\end{pmatrix}.$$
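A quick numerical check of this example (our sketch, not part of the text) confirms the diagonalization:

```python
import numpy as np

A = np.array([[1., 2., -1.],
              [0., 3., -2.],
              [0., 2., -2.]])
# Columns are the eigenvectors found above, for eigenvalues 2, 1, -1.
T = np.array([[3., 1., 0.],
              [2., 0., 1.],
              [1., 0., 2.]])

D = np.linalg.inv(T) @ A @ T
print(np.round(D, 10))        # diag(2, 1, -1)
print(np.linalg.eigvals(A))   # the same eigenvalues, up to ordering
```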


5.3 Complex Eigenvalues

Now we treat the case where $A$ has nonreal (complex) eigenvalues. Suppose $\alpha + i\beta$ is an eigenvalue of $A$ with $\beta \neq 0$. Since the characteristic equation for $A$ has real coefficients, it follows that if $\alpha + i\beta$ is an eigenvalue, then so is its complex conjugate $\overline{\alpha + i\beta} = \alpha - i\beta$.

Another way to see this is the following. Let $V$ be an eigenvector associated to $\alpha + i\beta$. Then the equation
$$AV = (\alpha + i\beta)V$$
shows that $V$ is a vector with complex entries. We write
$$V = \begin{pmatrix} x_1 + iy_1 \\ \vdots \\ x_n + iy_n \end{pmatrix}.$$
Let $\bar V$ denote the complex conjugate of $V$:
$$\bar V = \begin{pmatrix} x_1 - iy_1 \\ \vdots \\ x_n - iy_n \end{pmatrix}.$$
Then we have
$$A\bar V = \overline{AV} = \overline{(\alpha + i\beta)V} = (\alpha - i\beta)\bar V,$$
which shows that $\bar V$ is an eigenvector associated to the eigenvalue $\alpha - i\beta$.

Notice that we have (temporarily) stepped out of the "real" world of $\mathbb{R}^n$ and into the world $\mathbb{C}^n$ of complex vectors. This is not really a problem, since all of the previous linear algebraic results hold equally well for complex vectors.

Now suppose that $A$ is a $2n \times 2n$ matrix with distinct nonreal eigenvalues $\alpha_j \pm i\beta_j$ for $j = 1, \ldots, n$. Let $V_j$ and $\bar V_j$ denote the associated eigenvectors. Then, just as in the previous proposition, this collection of eigenvectors is linearly independent. That is, if we have
$$\sum_{j=1}^n (c_j V_j + d_j \bar V_j) = 0,$$
where the $c_j$ and $d_j$ are now complex numbers, then we must have $c_j = d_j = 0$ for each $j$.

Now we change coordinates to put $A$ into canonical form. Let
$$W_{2j-1} = \tfrac{1}{2}\left(V_j + \bar V_j\right), \qquad W_{2j} = \tfrac{-i}{2}\left(V_j - \bar V_j\right).$$
Note that $W_{2j-1}$ and $W_{2j}$ are both real vectors. Indeed, $W_{2j-1}$ is just the real part of $V_j$, while $W_{2j}$ is its imaginary part. So working with the $W_j$ brings us back home to $\mathbb{R}^{2n}$.

Proposition. The vectors $W_1, \ldots, W_{2n}$ are linearly independent.

Proof: Suppose not. Then we can find real numbers $c_j, d_j$ for $j = 1, \ldots, n$ such that
$$\sum_{j=1}^n \left(c_j W_{2j-1} + d_j W_{2j}\right) = 0,$$
but not all of the $c_j$ and $d_j$ are zero.


But then we have
$$\frac{1}{2}\sum_{j=1}^n \Big(c_j (V_j + \bar V_j) - i d_j (V_j - \bar V_j)\Big) = 0,$$
from which we find
$$\sum_{j=1}^n \Big((c_j - i d_j)V_j + (c_j + i d_j)\bar V_j\Big) = 0.$$
Since the $V_j$ and $\bar V_j$ are linearly independent, we must have $c_j \pm i d_j = 0$, from which we conclude $c_j = d_j = 0$ for all $j$. This contradiction establishes the result.

Note that we have
$$AW_{2j-1} = \tfrac{1}{2}(AV_j + A\bar V_j) = \tfrac{1}{2}\big((\alpha + i\beta)V_j + (\alpha - i\beta)\bar V_j\big) = \tfrac{\alpha}{2}(V_j + \bar V_j) + \tfrac{i\beta}{2}(V_j - \bar V_j) = \alpha W_{2j-1} - \beta W_{2j}.$$
Similarly, we compute
$$AW_{2j} = \beta W_{2j-1} + \alpha W_{2j}.$$

Now consider the linear map $T$ for which $TE_j = W_j$ for $j = 1, \ldots, 2n$. That is, the matrix associated to $T$ has columns $W_1, \ldots, W_{2n}$. Note that this matrix has real entries. Since the $W_j$ are linearly independent, it follows from Section 5.2 that $T$ is invertible. Now consider the matrix $T^{-1}AT$. We have
$$(T^{-1}AT)E_{2j-1} = T^{-1}AW_{2j-1} = T^{-1}(\alpha W_{2j-1} - \beta W_{2j}) = \alpha E_{2j-1} - \beta E_{2j},$$
and similarly
$$(T^{-1}AT)E_{2j} = \beta E_{2j-1} + \alpha E_{2j}.$$


Therefore the matrix associated to $T^{-1}AT$ is
$$T^{-1}AT = \begin{pmatrix} D_1 & & \\ & \ddots & \\ & & D_n \end{pmatrix},$$
where each $D_j$ is a $2 \times 2$ matrix of the form
$$D_j = \begin{pmatrix} \alpha_j & \beta_j \\ -\beta_j & \alpha_j \end{pmatrix}.$$
This is our canonical form for matrices with distinct nonreal eigenvalues. Combining the results of this and the previous section, we have:

Theorem. Suppose that the $n \times n$ matrix $A$ has distinct eigenvalues. Then there exists a linear map $T$ so that
$$T^{-1}AT = \begin{pmatrix} \lambda_1 & & & & & \\ & \ddots & & & & \\ & & \lambda_k & & & \\ & & & D_1 & & \\ & & & & \ddots & \\ & & & & & D_l \end{pmatrix},$$
where the $D_j$ are $2 \times 2$ matrices in the form
$$D_j = \begin{pmatrix} \alpha_j & \beta_j \\ -\beta_j & \alpha_j \end{pmatrix}.$$
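As an illustration (ours, not the book's), the following sketch builds the real canonical form of a $2 \times 2$ matrix with eigenvalues $\alpha \pm i\beta$ from the real and imaginary parts of one eigenvector, exactly as in the construction above. The example matrix is a hypothetical one with eigenvalues $-1 \pm 2i$.

```python
import numpy as np

A = np.array([[1., -2.],
              [4., -3.]])          # eigenvalues -1 +/- 2i

evals, evecs = np.linalg.eig(A)
j = np.argmax(evals.imag)          # pick the eigenvalue with beta > 0
lam, V = evals[j], evecs[:, j]

# W1 = Re V, W2 = Im V; T has columns W1, W2.
T = np.column_stack([V.real, V.imag])
B = np.linalg.inv(T) @ A @ T
print(np.round(B, 10))             # [[alpha, beta], [-beta, alpha]] = [[-1, 2], [-2, -1]]
print(lam)                         # alpha + i*beta
```

Note that the result does not depend on how `eig` scales the eigenvector: the relations $AW_1 = \alpha W_1 - \beta W_2$ and $AW_2 = \beta W_1 + \alpha W_2$ hold for the real and imaginary parts of any eigenvector for $\alpha + i\beta$.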


5.4 Bases and Subspaces

To deal with the case of a matrix with repeated eigenvalues, we need some further algebraic concepts. Recall that the collection of all linear combinations of a given finite set of vectors is called a subspace of $\mathbb{R}^n$. More precisely, given $V_1, \ldots, V_k \in \mathbb{R}^n$, the set
$$S = \{\alpha_1 V_1 + \cdots + \alpha_k V_k \mid \alpha_j \in \mathbb{R}\}$$
is a subspace of $\mathbb{R}^n$. In this case we say that $S$ is spanned by $V_1, \ldots, V_k$.

Definition
Let $S$ be a subspace of $\mathbb{R}^n$. A collection of vectors $V_1, \ldots, V_k$ is a basis of $S$ if the $V_j$ are linearly independent and span $S$.

Note that a subspace always has a basis, for if $S$ is spanned by $V_1, \ldots, V_k$, we can always throw away certain of the $V_j$ in order to reach a linearly independent subset of these vectors that spans $S$. More precisely, if the $V_j$ are not linearly independent, then we may find one of these vectors, say $V_k$, for which
$$V_k = \beta_1 V_1 + \cdots + \beta_{k-1} V_{k-1}.$$
Hence we can write any vector in $S$ as a linear combination of $V_1, \ldots, V_{k-1}$ alone; the vector $V_k$ is extraneous. Continuing in this fashion, we eventually reach a linearly independent subset of the $V_j$ that spans $S$.

More important for our purposes is:

Proposition. Every basis of a subspace $S \subset \mathbb{R}^n$ has the same number of elements.

Proof: We first observe that the system of $k$ linear equations in $k + l$ unknowns given by
$$\begin{aligned} a_{11}x_1 + \cdots + a_{1,k+l}\,x_{k+l} &= 0\\ &\ \,\vdots\\ a_{k1}x_1 + \cdots + a_{k,k+l}\,x_{k+l} &= 0 \end{aligned}$$
always has a nonzero solution if $l > 0$. Indeed, using row reduction, we may first solve for one unknown in terms of the others, and then we may eliminate this unknown to obtain a system of $k - 1$ equations in $k + l - 1$ unknowns. Thus we are finished by induction (the first case, $k = 1$, being obvious).

Now suppose that $V_1, \ldots, V_k$ is a basis for the subspace $S$. Suppose that $W_1, \ldots, W_{k+l}$ is also a basis of $S$, with $l > 0$. Then each $W_j$ is a linear combination of the $V_i$, so we have constants $a_{ij}$ such that
$$W_j = \sum_{i=1}^k a_{ij} V_i, \qquad j = 1, \ldots, k+l.$$


By the previous observation, the system of $k$ equations
$$\sum_{j=1}^{k+l} a_{ij} x_j = 0, \qquad i = 1, \ldots, k,$$
has a nonzero solution $(c_1, \ldots, c_{k+l})$. Then
$$\sum_{j=1}^{k+l} c_j W_j = \sum_{j=1}^{k+l} c_j \left(\sum_{i=1}^k a_{ij} V_i\right) = \sum_{i=1}^k \left(\sum_{j=1}^{k+l} a_{ij} c_j\right) V_i = 0,$$
so that the $W_j$ are linearly dependent. This contradiction completes the proof.

As a consequence of this result, we may define the dimension of a subspace $S$ as the number of vectors that form any basis for $S$. In particular, $\mathbb{R}^n$ is a subspace of itself, and its dimension is clearly $n$. Furthermore, any other subspace of $\mathbb{R}^n$ must have dimension less than $n$, for otherwise we would have a collection of more than $n$ vectors in $\mathbb{R}^n$ that are linearly independent. This cannot happen, by the previous proposition. The set consisting of only the 0 vector is also a subspace, and we define its dimension to be zero. We write $\dim S$ for the dimension of the subspace $S$.

Example. A straight line through the origin in $\mathbb{R}^n$ forms a one-dimensional subspace of $\mathbb{R}^n$, since any vector on this line may be written uniquely as $tV$, where $V \in \mathbb{R}^n$ is a fixed nonzero vector lying on the line and $t \in \mathbb{R}$ is arbitrary. Clearly, the single vector $V$ forms a basis for this subspace.

Example. The plane $P$ in $\mathbb{R}^3$ defined by
$$x + y + z = 0$$
is a two-dimensional subspace of $\mathbb{R}^3$. The vectors $(1, 0, -1)$ and $(0, 1, -1)$ both lie in $P$ and are linearly independent. If $W \in P$, we may write
$$W = \begin{pmatrix} x \\ y \\ -y-x \end{pmatrix} = x\begin{pmatrix}1\\0\\-1\end{pmatrix} + y\begin{pmatrix}0\\1\\-1\end{pmatrix},$$
so these vectors also span $P$.


As in the planar case, we say that a function $T : \mathbb{R}^n \to \mathbb{R}^n$ is linear if $T(X) = AX$ for some $n \times n$ matrix $A$; $T$ is called a linear map or linear transformation. Using the properties of matrices discussed in Section 5.1, we have
$$T(\alpha X + \beta Y) = \alpha T(X) + \beta T(Y)$$
for any $\alpha, \beta \in \mathbb{R}$ and $X, Y \in \mathbb{R}^n$. We say that the linear map $T$ is invertible if the matrix $A$ associated to $T$ has an inverse.

For the study of linear systems of differential equations, the most important types of subspaces are the kernels and ranges of linear maps. We define the kernel of $T$, denoted $\operatorname{Ker} T$, to be the set of vectors mapped to 0 by $T$. The range of $T$ consists of all vectors $W$ for which there exists a vector $V$ with $TV = W$. This, of course, is a familiar concept from calculus. The difference here is that the range of $T$ is always a subspace of $\mathbb{R}^n$.

Example. Consider the linear map
$$T(X) = \begin{pmatrix}0&1&0\\0&0&1\\0&0&0\end{pmatrix}X.$$
If $X = (x, y, z)$, then
$$T(X) = \begin{pmatrix}y\\z\\0\end{pmatrix}.$$
Hence $\operatorname{Ker} T$ consists of all vectors of the form $(\alpha, 0, 0)$, while $\operatorname{Range} T$ is the set of vectors of the form $(\beta, \gamma, 0)$, where $\alpha, \beta, \gamma \in \mathbb{R}$. Both sets are clearly subspaces.

Example. Let
$$T(X) = AX = \begin{pmatrix}1&2&3\\4&5&6\\7&8&9\end{pmatrix}X.$$
For $\operatorname{Ker} T$, we seek vectors $X$ that satisfy $AX = 0$. Using row reduction, we find that the reduced row echelon form of $A$ is the matrix
$$\begin{pmatrix}1&0&-1\\0&1&2\\0&0&0\end{pmatrix}.$$
Hence the solutions $X = (x, y, z)$ of $AX = 0$ satisfy $x = z$, $y = -2z$. Therefore any vector in $\operatorname{Ker} T$ is of the form $(z, -2z, z)$, so $\operatorname{Ker} T$ has dimension one. For $\operatorname{Range} T$, note that the columns of $A$ are vectors in $\operatorname{Range} T$, since they are the images of $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$, respectively. These vectors are not linearly independent, since
$$-1\begin{pmatrix}1\\4\\7\end{pmatrix} + 2\begin{pmatrix}2\\5\\8\end{pmatrix} = \begin{pmatrix}3\\6\\9\end{pmatrix}.$$
However, $(1, 4, 7)$ and $(2, 5, 8)$ are linearly independent, so these two vectors give a basis of $\operatorname{Range} T$.
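Numerically, the kernel and range can be read off from the singular value decomposition. The following sketch is our illustration (not the book's); the tolerance `1e-10` is an assumption suited to this well-scaled example.

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])

U, s, Vt = np.linalg.svd(A)
tol = 1e-10
rank = int(np.sum(s > tol))

print(rank)          # 2, so dim Range T = 2 and dim Ker T = 3 - 2 = 1
print(Vt[rank:].T)   # basis of Ker T: a multiple of (1, -2, 1)
print(U[:, :rank])   # orthonormal basis of Range T
```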


Proposition. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear map. Then $\operatorname{Ker} T$ and $\operatorname{Range} T$ are both subspaces of $\mathbb{R}^n$. Moreover,
$$\dim \operatorname{Ker} T + \dim \operatorname{Range} T = n.$$

Proof: First suppose that $\operatorname{Ker} T = \{0\}$. Let $E_1, \ldots, E_n$ be the standard basis of $\mathbb{R}^n$. Then we claim that $TE_1, \ldots, TE_n$ are linearly independent. If this is not the case, then we may find $\alpha_1, \ldots, \alpha_n$, not all 0, such that
$$\sum_{j=1}^n \alpha_j TE_j = 0.$$
But then we have
$$T\left(\sum_{j=1}^n \alpha_j E_j\right) = 0,$$
which implies that $\sum \alpha_j E_j \in \operatorname{Ker} T$, so that $\sum \alpha_j E_j = 0$. Hence each $\alpha_j = 0$, which is a contradiction. Thus the vectors $TE_j$ are linearly independent. But then, given $V \in \mathbb{R}^n$, we may write
$$V = \sum_{j=1}^n \beta_j TE_j$$
for some $\beta_1, \ldots, \beta_n$.


Hence
$$V = T\left(\sum_{j=1}^n \beta_j E_j\right),$$
which shows that $\operatorname{Range} T = \mathbb{R}^n$. Hence both $\operatorname{Ker} T$ and $\operatorname{Range} T$ are subspaces of $\mathbb{R}^n$, and we have $\dim \operatorname{Ker} T = 0$ and $\dim \operatorname{Range} T = n$.

If $\operatorname{Ker} T \neq \{0\}$, we may find a nonzero vector $V_1 \in \operatorname{Ker} T$. Clearly, $T(\alpha V_1) = 0$ for any $\alpha \in \mathbb{R}$, so all vectors of the form $\alpha V_1$ lie in $\operatorname{Ker} T$. If $\operatorname{Ker} T$ contains additional vectors, choose one and call it $V_2$. Then $\operatorname{Ker} T$ contains all linear combinations of $V_1$ and $V_2$, since
$$T(\alpha_1 V_1 + \alpha_2 V_2) = \alpha_1 TV_1 + \alpha_2 TV_2 = 0.$$
Continuing in this fashion, we obtain a set of linearly independent vectors that span $\operatorname{Ker} T$, thus showing that $\operatorname{Ker} T$ is a subspace. Note that this process must end, since every collection of more than $n$ vectors in $\mathbb{R}^n$ is linearly dependent. A similar argument works to show that $\operatorname{Range} T$ is a subspace.

Now suppose that $V_1, \ldots, V_k$ form a basis of $\operatorname{Ker} T$, where $0 < k < n$ (the case $k = n$ being obvious). Choose vectors $W_{k+1}, \ldots, W_n$ so that $V_1, \ldots, V_k, W_{k+1}, \ldots, W_n$ form a basis of $\mathbb{R}^n$. Let $Z_j = TW_j$ for each $j$. Then the vectors $Z_j$ are linearly independent, for if we had
$$\alpha_{k+1}Z_{k+1} + \cdots + \alpha_n Z_n = 0,$$
then we would also have
$$T(\alpha_{k+1}W_{k+1} + \cdots + \alpha_n W_n) = 0.$$
This implies that
$$\alpha_{k+1}W_{k+1} + \cdots + \alpha_n W_n \in \operatorname{Ker} T.$$
But this is impossible, since we cannot write any $W_j$ (and hence any nonzero linear combination of the $W_j$) as a linear combination of the $V_i$. Moreover, the $Z_j$ span $\operatorname{Range} T$: if $X = \sum a_i V_i + \sum b_j W_j$, then $TX = \sum b_j Z_j$, since $TV_i = 0$. Hence $Z_{k+1}, \ldots, Z_n$ form a basis of $\operatorname{Range} T$, so $\dim \operatorname{Range} T = n - k$. This proves that the sum of the dimensions of $\operatorname{Ker} T$ and $\operatorname{Range} T$ is $n$.

We remark that it is easy to find a set of vectors that spans $\operatorname{Range} T$; simply take the set of vectors that comprise the columns of the matrix associated to $T$. This works since the $i$th column vector of this matrix is the image of the standard basis vector $E_i$ under $T$. In particular, if these column vectors are linearly independent, then $\operatorname{Ker} T = \{0\}$, and there is a unique solution to the equation $T(X) = V$ for every $V \in \mathbb{R}^n$. Hence we have:

Corollary. If $T : \mathbb{R}^n \to \mathbb{R}^n$ is a linear map with $\dim \operatorname{Ker} T = 0$, then $T$ is invertible.


5.5 Repeated Eigenvalues

In this section we describe the canonical forms that arise when a matrix has repeated eigenvalues. Rather than spending an inordinate amount of time developing the general theory in this case, we will give the details only for $3 \times 3$ and $4 \times 4$ matrices with repeated eigenvalues. More general cases are relegated to the exercises. We justify this omission in the next section, where we show that the "typical" matrix has distinct eigenvalues and hence can be handled as in the previous section. (If you happen to meet a random matrix while walking down the street, the chances are very good that this matrix will have distinct eigenvalues!) The most general result regarding matrices with repeated eigenvalues is given by:

Proposition. Let $A$ be an $n \times n$ matrix. Then there is a change of coordinates $T$ for which
$$T^{-1}AT = \begin{pmatrix} B_1 & & \\ & \ddots & \\ & & B_k \end{pmatrix},$$
where each of the $B_j$'s is a square matrix (and all other entries are zero) of one of the following two forms:
$$\text{(i)}\quad \begin{pmatrix} \lambda & 1 & & \\ & \lambda & 1 & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix} \qquad\qquad \text{(ii)}\quad \begin{pmatrix} C_2 & I_2 & & \\ & C_2 & I_2 & \\ & & \ddots & I_2 \\ & & & C_2 \end{pmatrix},$$
where
$$C_2 = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}, \qquad I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$
and where $\alpha, \beta, \lambda \in \mathbb{R}$ with $\beta \neq 0$. The special cases where $B_j = (\lambda)$ or
$$B_j = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$$
are, of course, allowed.


We first consider the case of $\mathbb{R}^3$. If $A$ has repeated eigenvalues in $\mathbb{R}^3$, then all eigenvalues must be real. There are then two cases: either there are two distinct eigenvalues, one of which is repeated, or else all eigenvalues are the same. The former case can be handled by a process similar to that described in Chapter 3, so we restrict our attention here to the case where $A$ has a single eigenvalue $\lambda$ of multiplicity 3.

Proposition. Suppose $A$ is a $3 \times 3$ matrix for which $\lambda$ is the only eigenvalue. Then we may find a change of coordinates $T$ such that $T^{-1}AT$ assumes one of the following three forms:
$$\text{(i)}\ \begin{pmatrix} \lambda&0&0\\0&\lambda&0\\0&0&\lambda\end{pmatrix} \qquad \text{(ii)}\ \begin{pmatrix} \lambda&1&0\\0&\lambda&0\\0&0&\lambda\end{pmatrix} \qquad \text{(iii)}\ \begin{pmatrix} \lambda&1&0\\0&\lambda&1\\0&0&\lambda\end{pmatrix}.$$

Proof: Let $K$ be the kernel of $A - \lambda I$. Any nonzero vector in $K$ is an eigenvector of $A$. There are then three subcases, depending on whether the dimension of $K$ is 1, 2, or 3.

If the dimension of $K$ is 3, then $(A - \lambda I)V = 0$ for any $V \in \mathbb{R}^3$. Hence $A = \lambda I$. This yields matrix (i).

Suppose the dimension of $K$ is 2. Let $R$ be the range of $A - \lambda I$. Then $R$ has dimension 1, since $\dim K + \dim R = 3$, as we saw in the previous section. We claim that $R \subset K$. If this is not the case, let $V \in R$ be a nonzero vector. Since $(A - \lambda I)V \in R$ and $R$ is one dimensional, we must have $(A - \lambda I)V = \mu V$ for some $\mu \neq 0$. But then $AV = (\lambda + \mu)V$, so we have found a new eigenvalue $\lambda + \mu$. This contradicts our assumption, so we must have $R \subset K$.

Now let $V_1 \in R$ be nonzero. Since $V_1 \in K$, $V_1$ is an eigenvector, and so $(A - \lambda I)V_1 = 0$. Since $V_1$ also lies in $R$, we may find $V_2 \in \mathbb{R}^3 - K$ with $(A - \lambda I)V_2 = V_1$. Since $K$ is two dimensional, we may choose a second vector $V_3 \in K$ such that $V_1$ and $V_3$ are linearly independent. Note that $V_3$ is also an eigenvector. If we now choose the change of coordinates $TE_j = V_j$ for $j = 1, 2, 3$, then it follows easily that $T^{-1}AT$ assumes the form of case (ii).

Finally, suppose that $K$ has dimension 1. Thus $R$ has dimension 2. We claim that, in this case, $K \subset R$. If this is not the case, then $(A - \lambda I)R = R$, and so $A - \lambda I$ is invertible on $R$.


Thus, if $V \in R$, there is a unique $W \in R$ for which $(A - \lambda I)W = V$. In particular, we have
$$AV = A(A - \lambda I)W = (A^2 - \lambda A)W = (A - \lambda I)(AW).$$
This shows that if $V \in R$, then so too is $AV$. Hence $A$ also preserves the subspace $R$. It then follows immediately that $A$ must have an eigenvector in $R$, but this then says that $K \subset R$, and we have a contradiction.

Next we claim that $(A - \lambda I)R = K$. To see this, note that $(A - \lambda I)R$ is one dimensional, since $K \subset R$. If $(A - \lambda I)R \neq K$, there is a nonzero vector $V \notin K$ for which $(A - \lambda I)R = \{tV\}$, where $t \in \mathbb{R}$. But then $(A - \lambda I)V = tV$ for some $t \in \mathbb{R}$, $t \neq 0$, and so $AV = (t + \lambda)V$ yields another new eigenvalue. Thus we must in fact have $(A - \lambda I)R = K$.

Now let $V_1 \in K$ be an eigenvector for $A$. As above, there exists $V_2 \in R$ such that $(A - \lambda I)V_2 = V_1$. Since $V_2 \in R$, there exists $V_3$ such that $(A - \lambda I)V_3 = V_2$. Note that $(A - \lambda I)^2 V_3 = V_1$. The $V_j$ are easily seen to be linearly independent. Moreover, the linear map defined by $TE_j = V_j$ finally puts $A$ into canonical form (iii). This completes the proof.

Example. Suppose
$$A = \begin{pmatrix} 2&0&-1\\0&2&1\\-1&-1&2 \end{pmatrix}.$$
Expanding along the first row, we find
$$\det(A - \lambda I) = (2-\lambda)\big[(2-\lambda)^2 + 1\big] - (2-\lambda) = (2-\lambda)^3,$$
so the only eigenvalue is 2. Solving $(A - 2I)V = 0$ yields only one independent eigenvector $V_1 = (1, -1, 0)$, so we are in case (iii) of the proposition. We compute
$$(A - 2I)^2 = \begin{pmatrix} 1&1&0\\-1&-1&0\\0&0&0 \end{pmatrix},$$
so that the vector $V_3 = (1, 0, 0)$ solves $(A - 2I)^2 V_3 = V_1$. We also have
$$(A - 2I)V_3 = V_2 = (0, 0, -1).$$


As before, we let $TE_j = V_j$ for $j = 1, 2, 3$, so that
$$T = \begin{pmatrix} 1&0&1\\-1&0&0\\0&-1&0 \end{pmatrix}.$$
Then $T^{-1}AT$ assumes the canonical form
$$T^{-1}AT = \begin{pmatrix} 2&1&0\\0&2&1\\0&0&2 \end{pmatrix}.$$

Example. Now suppose
$$A = \begin{pmatrix} 1&1&0\\-1&3&0\\-1&1&2 \end{pmatrix}.$$
Again expanding along the first row, we find
$$\det(A - \lambda I) = (1-\lambda)(3-\lambda)(2-\lambda) + (2-\lambda) = (2-\lambda)^3,$$
so again the only eigenvalue is 2. This time, however, we have
$$A - 2I = \begin{pmatrix} -1&1&0\\-1&1&0\\-1&1&0 \end{pmatrix},$$
so that we have two linearly independent eigenvectors $(x, y, z)$, for which we must have $x = y$ while $z$ is arbitrary. Note that $(A - 2I)^2$ is the zero matrix, so we may choose any vector that is not an eigenvector as $V_2$, say $V_2 = (1, 0, 0)$. Then $(A - 2I)V_2 = V_1 = (-1, -1, -1)$ is an eigenvector. A second linearly independent eigenvector is then $V_3 = (0, 0, 1)$, for example. Defining $TE_j = V_j$ as usual then yields the canonical form
$$T^{-1}AT = \begin{pmatrix} 2&1&0\\0&2&0\\0&0&2 \end{pmatrix}.$$
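These chains of generalized eigenvectors are easy to verify numerically. The following sketch (ours, not the book's) checks the first example:

```python
import numpy as np

A = np.array([[ 2.,  0., -1.],
              [ 0.,  2.,  1.],
              [-1., -1.,  2.]])
N = A - 2 * np.eye(3)            # the nilpotent part A - lambda*I

V1 = np.array([1., -1., 0.])     # eigenvector
V3 = np.array([1., 0., 0.])      # solves (A - 2I)^2 V3 = V1
V2 = N @ V3                      # = (0, 0, -1)

assert np.allclose(N @ V1, 0)
assert np.allclose(N @ N @ V3, V1)

T = np.column_stack([V1, V2, V3])
print(np.round(np.linalg.inv(T) @ A @ T, 10))   # [[2,1,0],[0,2,1],[0,0,2]]
```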


Now we turn to the $4 \times 4$ case. The case of all real eigenvalues is similar to the $3 \times 3$ case (though a little more complicated algebraically) and is left as an exercise. Thus we assume that $A$ has repeated complex eigenvalues $\alpha \pm i\beta$ with $\beta \neq 0$.

There are just two cases: either we can find a pair of linearly independent eigenvectors corresponding to $\alpha + i\beta$, or we can find only one such eigenvector. In the former case, let $V_1$ and $V_2$ be the independent eigenvectors. Then $\bar V_1$ and $\bar V_2$ are linearly independent eigenvectors for $\alpha - i\beta$. As before, choose the real vectors
$$\begin{aligned} W_1 &= (V_1 + \bar V_1)/2, & W_2 &= -i(V_1 - \bar V_1)/2,\\ W_3 &= (V_2 + \bar V_2)/2, & W_4 &= -i(V_2 - \bar V_2)/2. \end{aligned}$$
If we set $TE_j = W_j$, then changing coordinates via $T$ puts $A$ in the canonical form
$$T^{-1}AT = \begin{pmatrix} \alpha&\beta&0&0\\ -\beta&\alpha&0&0\\ 0&0&\alpha&\beta\\ 0&0&-\beta&\alpha \end{pmatrix}.$$

If we find only one eigenvector $V_1$ for $\alpha + i\beta$, then we solve the system of equations $(A - (\alpha + i\beta)I)X = V_1$, as in the case of repeated real eigenvalues. The proof of the previous proposition shows that we can always find a nonzero solution $V_2$ of these equations. Then choose the $W_j$ as above and set $TE_j = W_j$. Then $T$ puts $A$ into the canonical form
$$T^{-1}AT = \begin{pmatrix} \alpha&\beta&1&0\\ -\beta&\alpha&0&1\\ 0&0&\alpha&\beta\\ 0&0&-\beta&\alpha \end{pmatrix}.$$
For example, we compute
$$(T^{-1}AT)E_3 = T^{-1}AW_3 = T^{-1}A(V_2 + \bar V_2)/2 = T^{-1}\Big(\big(V_1 + (\alpha+i\beta)V_2\big)/2 + \big(\bar V_1 + (\alpha - i\beta)\bar V_2\big)/2\Big)$$
$$= T^{-1}\Big((V_1 + \bar V_1)/2 + \alpha\big(V_2 + \bar V_2\big)/2 + i\beta\big(V_2 - \bar V_2\big)/2\Big) = E_1 + \alpha E_3 - \beta E_4.$$


Example. Let
$$A = \begin{pmatrix} 1&-1&0&1\\ 2&-1&1&0\\ 0&0&-1&2\\ 0&0&-1&1 \end{pmatrix}.$$
The characteristic equation, after a little computation, is
$$(\lambda^2 + 1)^2 = 0.$$
Hence $A$ has eigenvalues $\pm i$, each repeated twice.

Solving the system $(A - iI)X = 0$ yields one linearly independent complex eigenvector $V_1 = (1, 1-i, 0, 0)$ associated to $i$. Then $\bar V_1$ is an eigenvector associated to the eigenvalue $-i$. Next we solve the system $(A - iI)X = V_1$ to find $V_2 = (0, 0, 1-i, 1)$. Then $\bar V_2$ solves the system $(A + iI)X = \bar V_1$. Finally, choose
$$\begin{aligned} W_1 &= (V_1 + \bar V_1)/2 = \operatorname{Re} V_1, & W_2 &= -i(V_1 - \bar V_1)/2 = \operatorname{Im} V_1,\\ W_3 &= (V_2 + \bar V_2)/2 = \operatorname{Re} V_2, & W_4 &= -i(V_2 - \bar V_2)/2 = \operatorname{Im} V_2, \end{aligned}$$
and let $TE_j = W_j$ for $j = 1, \ldots, 4$. We have
$$T = \begin{pmatrix} 1&0&0&0\\ 1&-1&0&0\\ 0&0&1&-1\\ 0&0&1&0 \end{pmatrix}, \qquad T^{-1} = \begin{pmatrix} 1&0&0&0\\ 1&-1&0&0\\ 0&0&0&1\\ 0&0&-1&1 \end{pmatrix},$$
and we find the canonical form
$$T^{-1}AT = \begin{pmatrix} 0&1&1&0\\ -1&0&0&1\\ 0&0&0&1\\ 0&0&-1&0 \end{pmatrix}.$$
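Again a short numerical check (our sketch, not part of the text) confirms the computation:

```python
import numpy as np

A = np.array([[1., -1.,  0., 1.],
              [2., -1.,  1., 0.],
              [0.,  0., -1., 2.],
              [0.,  0., -1., 1.]])

V1 = np.array([1., 1 - 1j, 0., 0.])      # eigenvector for i
V2 = np.array([0., 0., 1 - 1j, 1.])      # solves (A - iI)X = V1

I4 = np.eye(4)
assert np.allclose(A @ V1, 1j * V1)
assert np.allclose((A - 1j * I4) @ V2, V1)

T = np.column_stack([V1.real, V1.imag, V2.real, V2.imag])
print(np.round(np.linalg.inv(T) @ A @ T, 10))
# [[0,1,1,0], [-1,0,0,1], [0,0,0,1], [0,0,-1,0]]
```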


Example. Let
$$A = \begin{pmatrix} 2&0&1&0\\ 0&2&0&1\\ 0&0&2&0\\ 0&-1&0&2 \end{pmatrix}.$$
The characteristic equation for $A$ is
$$(2-\lambda)^2\big((2-\lambda)^2 + 1\big) = 0,$$
so the eigenvalues are $2 \pm i$ and 2 (with multiplicity 2).

Solving the equations $(A - (2+i)I)X = 0$ yields an eigenvector $V = (0, -i, 0, 1)$ for $2 + i$. Let $W_1 = (0, 0, 0, 1)$ and $W_2 = (0, -1, 0, 0)$ be the real and imaginary parts of $V$.

Solving the equations $(A - 2I)X = 0$ yields only one eigenvector associated to 2, namely $W_3 = (1, 0, 0, 0)$. Then we solve $(A - 2I)X = W_3$ to find $W_4 = (0, 0, 1, 0)$. Setting $TE_j = W_j$ as usual puts $A$ into the canonical form
$$T^{-1}AT = \begin{pmatrix} 2&1&0&0\\ -1&2&0&0\\ 0&0&2&1\\ 0&0&0&2 \end{pmatrix},$$
as is easily checked.

5.6 Genericity

We have mentioned several times that "most" matrices have distinct eigenvalues. Our goal in this section is to make this precise.

Recall that a set $U \subset \mathbb{R}^n$ is open if, whenever $X \in U$, there is an open ball about $X$ contained in $U$; that is, for some $a > 0$ (depending on $X$), the open ball about $X$ of radius $a$,
$$\{Y \in \mathbb{R}^n \mid |Y - X| < a\},$$
is contained in $U$. Using geometric terminology, we say that if $X$ belongs to an open set $U$, then any point sufficiently near $X$ also belongs to $U$.

A set $U \subset \mathbb{R}^n$ is dense in $\mathbb{R}^n$ if, for every $X \in \mathbb{R}^n$ and every $\epsilon > 0$, there exists some $Y \in U$ with $|X - Y| < \epsilon$. Equivalently, $U$ is dense in $\mathbb{R}^n$ if $V \cap U$ is nonempty for every nonempty open set $V \subset \mathbb{R}^n$. For example, the rational numbers form a dense subset of $\mathbb{R}$, as do the irrational numbers. Similarly,
$$\{(x, y) \in \mathbb{R}^2 \mid \text{both } x \text{ and } y \text{ are rational}\}$$
is a dense subset of the plane.


An interesting kind of subset of $\mathbb{R}^n$ is a set that is both open and dense. Such a set $U$ is characterized by the following properties: every point in the complement of $U$ can be approximated arbitrarily closely by points of $U$ (since $U$ is dense), but no point in $U$ can be approximated arbitrarily closely by points in the complement (because $U$ is open).

Here is a simple example of an open and dense subset of $\mathbb{R}^2$:
$$\mathcal{V} = \{(x, y) \in \mathbb{R}^2 \mid xy \neq 1\}.$$
This, of course, is the complement in $\mathbb{R}^2$ of the hyperbola defined by $xy = 1$. Suppose $(x_0, y_0) \in \mathcal{V}$. Then $x_0 y_0 \neq 1$, and if $|x - x_0|$ and $|y - y_0|$ are small enough, then $xy \neq 1$; this proves that $\mathcal{V}$ is open. Given any $(x_0, y_0) \in \mathbb{R}^2$, we can find $(x, y)$ as close as we like to $(x_0, y_0)$ with $xy \neq 1$; this proves that $\mathcal{V}$ is dense.

An open and dense set is a very fat set, as the following proposition shows:

Proposition. Let $V_1, \ldots, V_m$ be open and dense subsets of $\mathbb{R}^n$. Then
$$V = V_1 \cap \cdots \cap V_m$$
is also open and dense.

Proof: It can easily be shown that the intersection of a finite number of open sets is open, so $V$ is open. To prove that $V$ is dense, let $U \subset \mathbb{R}^n$ be a nonempty open set. Then $U \cap V_1$ is nonempty since $V_1$ is dense. Because $U$ and $V_1$ are open, $U \cap V_1$ is also open. Since $U \cap V_1$ is open and nonempty, $(U \cap V_1) \cap V_2$ is nonempty because $V_2$ is dense. Since $V_2$ is open, $U \cap V_1 \cap V_2$ is open. Thus $(U \cap V_1 \cap V_2) \cap V_3$ is nonempty, and so on. So $U \cap V$ is nonempty, which proves that $V$ is dense in $\mathbb{R}^n$.

We therefore think of a subset of $\mathbb{R}^n$ as being large if this set contains an open and dense subset. To make precise what we mean by "most" matrices, we need to transfer the notion of an open and dense set to the set of all matrices. Let $L(\mathbb{R}^n)$ denote the set of $n \times n$ matrices, or, equivalently, the set of linear maps of $\mathbb{R}^n$. To discuss open and dense sets in $L(\mathbb{R}^n)$, we need a notion of how far apart two given matrices in $L(\mathbb{R}^n)$ are. But we can do this by simply writing all of the entries of a matrix as one long vector (in a specified order) and thereby thinking of $L(\mathbb{R}^n)$ as $\mathbb{R}^{n^2}$.

Theorem. The set $\mathcal{M}$ of matrices in $L(\mathbb{R}^n)$ that have $n$ distinct eigenvalues is open and dense in $L(\mathbb{R}^n)$.

Proof: We first prove that $\mathcal{M}$ is dense. Let $A \in L(\mathbb{R}^n)$. Suppose that $A$ has some repeated eigenvalues. The proposition from the previous section states that we can find a matrix $T$ such that $T^{-1}AT$ assumes one of two forms.


Either we have a canonical form with blocks along the diagonal of the form
$$\text{(i)}\quad \begin{pmatrix} \lambda & 1 & & \\ & \lambda & 1 & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix} \qquad\text{or}\qquad \text{(ii)}\quad \begin{pmatrix} C_2 & I_2 & & \\ & C_2 & I_2 & \\ & & \ddots & I_2 \\ & & & C_2 \end{pmatrix},$$
where
$$C_2 = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}, \qquad I_2 = \begin{pmatrix} 1&0\\0&1 \end{pmatrix},$$
and $\alpha, \beta, \lambda \in \mathbb{R}$ with $\beta \neq 0$, or else we have a pair of separate diagonal blocks $(\lambda)$ or $C_2$. Either case can be handled as follows.

Choose distinct values $\lambda_j$ such that $|\lambda - \lambda_j|$ is as small as desired, and replace block (i) above by
$$\begin{pmatrix} \lambda_1 & 1 & & \\ & \lambda_2 & 1 & \\ & & \ddots & 1 \\ & & & \lambda_j \end{pmatrix}.$$
This new block now has distinct eigenvalues. In block (ii) we may similarly replace each $2 \times 2$ block
$$\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$$
with distinct $\alpha_i$'s. The new matrix thus has distinct eigenvalues $\alpha_i \pm i\beta$. In this fashion, we find a new matrix $B$ arbitrarily close to $T^{-1}AT$ with distinct eigenvalues. Then the matrix $TBT^{-1}$ also has distinct eigenvalues and, moreover, this matrix is arbitrarily close to $A$. Indeed, the function $F : L(\mathbb{R}^n) \to L(\mathbb{R}^n)$ given by $F(M) = TMT^{-1}$, where $T$ is a fixed invertible matrix, is a continuous function on $L(\mathbb{R}^n)$ and hence takes matrices close to $T^{-1}AT$ to new matrices close to $A$. This shows that $\mathcal{M}$ is dense.

To prove that $\mathcal{M}$ is open, consider the characteristic polynomial of a matrix $A \in L(\mathbb{R}^n)$. If we vary the entries of $A$ slightly, then the characteristic polynomial's coefficients vary only slightly. Hence the roots of this polynomial in $\mathbb{C}$ also vary only slightly. Thus, if $A \in \mathcal{M}$, so that these $n$ roots are distinct, the roots of the characteristic polynomial of any sufficiently nearby matrix remain distinct, and such matrices also belong to $\mathcal{M}$. This proves that $\mathcal{M}$ is open.


Exercises

5. Put the following matrices in canonical form:
$$\text{(a)}\ \begin{pmatrix}0&0&1\\0&1&0\\1&0&0\end{pmatrix}\quad \text{(b)}\ \begin{pmatrix}1&0&1\\0&1&0\\0&0&1\end{pmatrix}\quad \text{(c)}\ \begin{pmatrix}0&1&0\\-1&0&0\\1&1&1\end{pmatrix}\quad \text{(d)}\ \begin{pmatrix}0&1&0\\1&0&0\\1&1&1\end{pmatrix}$$
$$\text{(e)}\ \begin{pmatrix}1&0&1\\0&1&0\\1&0&1\end{pmatrix}\quad \text{(f)}\ \begin{pmatrix}1&1&0\\1&1&1\\0&1&1\end{pmatrix}\quad \text{(g)}\ \begin{pmatrix}1&0&-1\\-1&1&-1\\0&0&1\end{pmatrix}\quad \text{(h)}\ \begin{pmatrix}1&0&0&1\\0&1&1&0\\0&0&1&0\\1&0&0&0\end{pmatrix}$$

6. Suppose that a $5 \times 5$ matrix has eigenvalues 2 and $1 \pm i$. List all possible canonical forms for a matrix of this type.

7. Let $L$ be the elementary matrix that interchanges the $i$th and $j$th rows of a given matrix. That is, $L$ has 1s along the diagonal, with the exception that $l_{ii} = l_{jj} = 0$ but $l_{ij} = l_{ji} = 1$. Prove that $\det L = -1$.

8. Find a basis for both $\operatorname{Ker} T$ and $\operatorname{Range} T$ when $T$ is the matrix
$$\text{(a)}\ \begin{pmatrix}1&2\\2&4\end{pmatrix}\quad \text{(b)}\ \begin{pmatrix}1&1&1\\1&1&1\\1&1&1\end{pmatrix}\quad \text{(c)}\ \begin{pmatrix}1&9&6\\1&4&1\\2&7&1\end{pmatrix}$$

9. Suppose $A$ is a $4 \times 4$ matrix that has a single real eigenvalue $\lambda$ and only one independent eigenvector. Prove that $A$ may be put in the canonical form
$$\begin{pmatrix}\lambda&1&0&0\\0&\lambda&1&0\\0&0&\lambda&1\\0&0&0&\lambda\end{pmatrix}.$$

10. Suppose $A$ is a $4 \times 4$ matrix with a single real eigenvalue and two linearly independent eigenvectors. Describe the possible canonical forms for $A$, and show that $A$ may indeed be transformed into one of these canonical forms. Describe explicitly the conditions under which $A$ is transformed into a particular form.

11. Show that if $A$ and/or $B$ are noninvertible matrices, then $AB$ is also noninvertible.

12. Suppose that $S$ is a subset of $\mathbb{R}^n$ having the following properties:
(a) if $X, Y \in S$, then $X + Y \in S$;
(b) if $X \in S$ and $\alpha \in \mathbb{R}$, then $\alpha X \in S$.
Prove that $S$ may be written as the collection of all possible linear combinations of a finite set of vectors.


13. Which of the following subsets of $\mathbb{R}^2$ are open and/or dense? Give a brief reason in each case.
(a) $U_1 = \{(x, y) \mid y > 0\}$;
(b) $U_2 = \{(x, y) \mid x^2 + y^2 \neq 1\}$;
(c) $U_3 = \{(x, y) \mid x \text{ is irrational}\}$;
(d) $U_4 = \{(x, y) \mid x \text{ and } y \text{ are not integers}\}$;
(e) $U_5$ is the complement of a set $C_1$, where $C_1$ is closed and not dense;
(f) $U_6$ is the complement of a set $C_2$ that contains exactly 6 billion and two distinct points.

14. Each of the following properties defines a subset of real $n \times n$ matrices. Which of these sets are open and/or dense in $L(\mathbb{R}^n)$? Give a brief reason in each case.
(a) $\det A \neq 0$.
(b) Trace $A$ is rational.
(c) Entries of $A$ are not integers.
(d) $3 \leq \det A$


6 Higher Dimensional Linear Systems

After our little sojourn into the world of linear algebra, it's time to return to differential equations and, in particular, to the task of solving higher dimensional linear systems with constant coefficients. As in the linear algebra chapter, we have to deal with a number of different cases.

6.1 Distinct Eigenvalues

Consider first a linear system $X' = AX$ where the $n \times n$ matrix $A$ has $n$ distinct, real eigenvalues $\lambda_1, \ldots, \lambda_n$. By the results in Chapter 5, there is a change of coordinates $T$ so that the new system $Y' = (T^{-1}AT)Y$ assumes the particularly simple form
$$\begin{aligned} y_1' &= \lambda_1 y_1\\ &\ \,\vdots\\ y_n' &= \lambda_n y_n. \end{aligned}$$
The linear map $T$ is the map that takes the standard basis vector $E_j$ to the eigenvector $V_j$ associated to $\lambda_j$. Clearly, a function of the form
$$Y(t) = \begin{pmatrix} c_1 e^{\lambda_1 t} \\ \vdots \\ c_n e^{\lambda_n t} \end{pmatrix}$$
is a solution of $Y' = (T^{-1}AT)Y$ that satisfies the initial condition $Y(0) = (c_1, \ldots, c_n)$.


As in Chapter 3, this is the only such solution, because if
$$W(t) = \begin{pmatrix} w_1(t)\\ \vdots\\ w_n(t) \end{pmatrix}$$
is another solution, then differentiating each expression $w_j(t)e^{-\lambda_j t}$, we find
$$\frac{d}{dt}\, w_j(t)e^{-\lambda_j t} = (w_j' - \lambda_j w_j)e^{-\lambda_j t} = 0.$$
Hence $w_j(t) = c_j e^{\lambda_j t}$ for each $j$. Therefore the collection of solutions $Y(t)$ yields the general solution of $Y' = (T^{-1}AT)Y$.

It then follows that $X(t) = TY(t)$ is the general solution of $X' = AX$, so this general solution may be written in the form
$$X(t) = \sum_{j=1}^n c_j e^{\lambda_j t} V_j.$$

Now suppose that the eigenvalues $\lambda_1, \ldots, \lambda_k$ of $A$ are negative, while the eigenvalues $\lambda_{k+1}, \ldots, \lambda_n$ are positive. Since there are no zero eigenvalues, the system is hyperbolic. Then any solution that starts in the subspace spanned by the vectors $V_1, \ldots, V_k$ must first of all stay in that subspace for all time, since $c_{k+1} = \cdots = c_n = 0$. Second, each such solution tends to the origin as $t \to \infty$. In analogy with the terminology introduced for planar systems, we call this subspace the stable subspace. Similarly, the subspace spanned by $V_{k+1}, \ldots, V_n$ contains solutions that move away from the origin; this subspace is the unstable subspace. All other solutions tend toward the stable subspace as time goes backward and toward the unstable subspace as time increases. Therefore this system is a higher dimensional analogue of a saddle.

Example. Consider
$$X' = \begin{pmatrix} 1&2&-1\\0&3&-2\\0&2&-2 \end{pmatrix}X.$$
In Section 5.2 in Chapter 5, we showed that this matrix has eigenvalues 2, 1, and $-1$ with associated eigenvectors $(3, 2, 1)$, $(1, 0, 0)$, and $(0, 1, 2)$, respectively. Therefore the matrix
$$T = \begin{pmatrix} 3&1&0\\2&0&1\\1&0&2 \end{pmatrix}$$
converts $X' = AX$ to
$$Y' = (T^{-1}AT)Y = \begin{pmatrix} 2&0&0\\0&1&0\\0&0&-1 \end{pmatrix}Y,$$
which we can solve immediately.


Multiplying the solution by $T$ then yields the general solution
$$X(t) = c_1 e^{2t}\begin{pmatrix}3\\2\\1\end{pmatrix} + c_2 e^{t}\begin{pmatrix}1\\0\\0\end{pmatrix} + c_3 e^{-t}\begin{pmatrix}0\\1\\2\end{pmatrix}$$
of $X' = AX$. The straight line through the origin and $(0, 1, 2)$ is the stable line, while the plane spanned by $(3, 2, 1)$ and $(1, 0, 0)$ is the unstable plane. A collection of solutions of this system, as well as of the system $Y' = (T^{-1}AT)Y$, is displayed in Figure 6.1.

Figure 6.1 The stable and unstable subspaces of a saddle in dimension 3. On the left, the system is in canonical form.

Example. If the $3 \times 3$ matrix $A$ has three real, distinct eigenvalues that are negative, then we may find a change of coordinates so that the system assumes the form
$$Y' = (T^{-1}AT)Y = \begin{pmatrix} \lambda_1&0&0\\0&\lambda_2&0\\0&0&\lambda_3 \end{pmatrix}Y,$$
where $\lambda_3 < \lambda_2 < \lambda_1 < 0$. All solutions therefore tend to the origin, and so we have a higher dimensional sink; see Figure 6.2. For an initial condition $(x_0, y_0, z_0)$ with all three coordinates nonzero, the corresponding solution tends to the origin tangentially to the $x$-axis (see Exercise 2 at the end of the chapter).
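The formula $X(t) = \sum_j c_j e^{\lambda_j t} V_j$ is easy to evaluate numerically. Here is a sketch of ours (not from the text) for the saddle example above; the constants $c_j$ are arbitrary choices, and the derivative is checked against $AX(t)$ by a centered finite difference.

```python
import numpy as np

A = np.array([[1., 2., -1.],
              [0., 3., -2.],
              [0., 2., -2.]])
lams = np.array([2., 1., -1.])
V = np.array([[3., 2., 1.],      # eigenvector for 2
              [1., 0., 0.],      # eigenvector for 1
              [0., 1., 2.]])     # eigenvector for -1
c = np.array([0.5, -1.0, 2.0])   # constants fixed by the initial condition

def X(t):
    return sum(c[j] * np.exp(lams[j] * t) * V[j] for j in range(3))

t, h = 0.7, 1e-6
lhs = (X(t + h) - X(t - h)) / (2 * h)       # approximate X'(t)
print(np.allclose(lhs, A @ X(t), atol=1e-4))  # True: X'(t) = A X(t)
```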


Figure 6.2 A sink in three dimensions.

Now suppose that the $n \times n$ matrix $A$ has $n$ distinct eigenvalues, of which $k_1$ are real and $2k_2$ are nonreal, so that $n = k_1 + 2k_2$. Then, as in Chapter 5, we may change coordinates so that the system assumes the form
$$\begin{aligned} x_j' &= \lambda_j x_j\\ u_l' &= \alpha_l u_l + \beta_l v_l\\ v_l' &= -\beta_l u_l + \alpha_l v_l \end{aligned}$$
for $j = 1, \ldots, k_1$ and $l = 1, \ldots, k_2$. As in Chapter 3, we therefore have solutions of the form
$$\begin{aligned} x_j(t) &= c_j e^{\lambda_j t}\\ u_l(t) &= p_l e^{\alpha_l t}\cos\beta_l t + q_l e^{\alpha_l t}\sin\beta_l t\\ v_l(t) &= -p_l e^{\alpha_l t}\sin\beta_l t + q_l e^{\alpha_l t}\cos\beta_l t. \end{aligned}$$
As before, it is straightforward to check that this is the general solution. We have therefore shown:

Theorem. Consider the system $X' = AX$, where $A$ has distinct eigenvalues $\lambda_1, \ldots, \lambda_{k_1} \in \mathbb{R}$ and $\alpha_1 + i\beta_1, \ldots, \alpha_{k_2} + i\beta_{k_2} \in \mathbb{C}$. Let $T$ be the matrix that puts $A$ in the canonical form
$$T^{-1}AT = \begin{pmatrix} \lambda_1 & & & & & \\ & \ddots & & & & \\ & & \lambda_{k_1} & & & \\ & & & B_1 & & \\ & & & & \ddots & \\ & & & & & B_{k_2} \end{pmatrix},$$
where
$$B_j = \begin{pmatrix} \alpha_j & \beta_j \\ -\beta_j & \alpha_j \end{pmatrix}.$$


Then the general solution of $X' = AX$ is $TY(t)$, where
$$Y(t) = \begin{pmatrix} c_1 e^{\lambda_1 t} \\ \vdots \\ c_{k_1} e^{\lambda_{k_1} t} \\ a_1 e^{\alpha_1 t}\cos\beta_1 t + b_1 e^{\alpha_1 t}\sin\beta_1 t \\ -a_1 e^{\alpha_1 t}\sin\beta_1 t + b_1 e^{\alpha_1 t}\cos\beta_1 t \\ \vdots \\ a_{k_2} e^{\alpha_{k_2} t}\cos\beta_{k_2} t + b_{k_2} e^{\alpha_{k_2} t}\sin\beta_{k_2} t \\ -a_{k_2} e^{\alpha_{k_2} t}\sin\beta_{k_2} t + b_{k_2} e^{\alpha_{k_2} t}\cos\beta_{k_2} t \end{pmatrix}.$$

As usual, the columns of the matrix $T$ in this theorem are the eigenvectors (or the real and imaginary parts of the eigenvectors) corresponding to each eigenvalue. Also, as before, the subspace spanned by the eigenvectors corresponding to eigenvalues with negative (resp., positive) real parts is the stable (resp., unstable) subspace.

Example. Consider the system
$$X' = \begin{pmatrix} 0&1&0\\-1&0&0\\0&0&-1 \end{pmatrix}X,$$
whose matrix is already in canonical form. The eigenvalues are $\pm i$ and $-1$. The solution satisfying the initial condition $(x_0, y_0, z_0)$ is given by
$$Y(t) = x_0\begin{pmatrix}\cos t\\-\sin t\\0\end{pmatrix} + y_0\begin{pmatrix}\sin t\\\cos t\\0\end{pmatrix} + z_0 e^{-t}\begin{pmatrix}0\\0\\1\end{pmatrix},$$
so this is the general solution. The phase portrait for this system is displayed in Figure 6.3. The stable line lies along the $z$-axis, whereas all solutions in the $xy$-plane travel around circles centered at the origin. In fact, each solution that does not lie on the stable line actually lies on a cylinder in $\mathbb{R}^3$ given by $x^2 + y^2 = \text{constant}$. These solutions spiral toward the circular solution of radius $\sqrt{x_0^2 + y_0^2}$ in the $xy$-plane if $z_0 \neq 0$.
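As a numerical cross-check (our sketch, not the book's), one can integrate this system directly with SciPy and compare against the closed-form solution; the initial condition is an arbitrary choice.

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0., 1., 0.],
              [-1., 0., 0.],
              [0., 0., -1.]])
X0 = np.array([1.0, 0.5, 2.0])

def closed_form(t):
    x0, y0, z0 = X0
    return np.array([x0*np.cos(t) + y0*np.sin(t),
                     -x0*np.sin(t) + y0*np.cos(t),
                     z0*np.exp(-t)])

sol = solve_ivp(lambda t, x: A @ x, (0., 10.), X0, rtol=1e-10, atol=1e-12)
print(np.allclose(sol.y[:, -1], closed_form(sol.t[-1]), atol=1e-6))  # True
```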


Figure 6.3 The phase portrait for a spiral center.

Example. Now consider $X' = AX$ where
$$A = \begin{pmatrix} -0.1&0&1\\ -1&1&-1.1\\ -1&0&-0.1 \end{pmatrix}.$$
The characteristic equation is
$$-\lambda^3 + 0.8\lambda^2 - 0.81\lambda + 1.01 = 0,$$
which we have kindly factored for you into
$$(1-\lambda)(\lambda^2 + 0.2\lambda + 1.01) = 0.$$
Therefore the eigenvalues are the roots of this equation, which are 1 and $-0.1 \pm i$. Solving $(A - (-0.1 + i)I)X = 0$ yields the eigenvector $(-i, 1, 1)$ associated to $-0.1 + i$. Let $V_1 = \operatorname{Re}(-i, 1, 1) = (0, 1, 1)$ and $V_2 = \operatorname{Im}(-i, 1, 1) = (-1, 0, 0)$. Solving $(A - I)X = 0$ yields $V_3 = (0, 1, 0)$ as an eigenvector corresponding to $\lambda = 1$. Then the matrix whose columns are the $V_i$,
$$T = \begin{pmatrix} 0&-1&0\\1&0&1\\1&0&0 \end{pmatrix},$$
converts $X' = AX$ into
$$Y' = \begin{pmatrix} -0.1&1&0\\-1&-0.1&0\\0&0&1 \end{pmatrix}Y.$$


Figure 6.4 A spiral saddle in canonical form.

This system has an unstable line along the $z$-axis, while the $xy$-plane is the stable plane. Note that solutions spiral into 0 in the stable plane. We call this system a spiral saddle (see Figure 6.4). Typical solutions off the stable plane spiral toward the $z$-axis while the $z$-coordinate meanwhile increases or decreases (see Figure 6.5).

Figure 6.5 Typical solutions of the spiral saddle tend to spiral toward the unstable line.


6.2 Harmonic Oscillators

Consider a pair of undamped harmonic oscillators whose equations are
$$\begin{aligned} x_1'' &= -\omega_1^2 x_1\\ x_2'' &= -\omega_2^2 x_2. \end{aligned}$$
We can almost solve these equations by inspection as visions of $\sin \omega t$ and $\cos \omega t$ pass through our minds. But let's push on a bit, first to illustrate the theorem in the previous section in the case of nonreal eigenvalues, but more importantly to introduce some interesting geometry.

We first introduce the new variables $y_j = x_j'$ for $j = 1, 2$, so that the equations can be written as the system
$$\begin{aligned} x_j' &= y_j\\ y_j' &= -\omega_j^2 x_j. \end{aligned}$$
In matrix form, this system is $X' = AX$, where $X = (x_1, y_1, x_2, y_2)$ and
$$A = \begin{pmatrix} 0&1&&\\ -\omega_1^2&0&&\\ &&0&1\\ &&-\omega_2^2&0 \end{pmatrix}.$$
This system has eigenvalues $\pm i\omega_1$ and $\pm i\omega_2$. An eigenvector corresponding to $i\omega_1$ is $V_1 = (1, i\omega_1, 0, 0)$, while $V_2 = (0, 0, 1, i\omega_2)$ is associated to $i\omega_2$. Let $W_1$ and $W_2$ be the real and imaginary parts of $V_1$, and let $W_3$ and $W_4$ be the same for $V_2$. Then, as usual, we let $TE_j = W_j$, and the linear map $T$ puts this system into canonical form with the matrix
$$T^{-1}AT = \begin{pmatrix} 0&\omega_1&&\\ -\omega_1&0&&\\ &&0&\omega_2\\ &&-\omega_2&0 \end{pmatrix}.$$
We then derive the general solution
$$Y(t) = \begin{pmatrix} x_1(t)\\ y_1(t)\\ x_2(t)\\ y_2(t) \end{pmatrix} = \begin{pmatrix} a_1\cos\omega_1 t + b_1\sin\omega_1 t\\ -a_1\sin\omega_1 t + b_1\cos\omega_1 t\\ a_2\cos\omega_2 t + b_2\sin\omega_2 t\\ -a_2\sin\omega_2 t + b_2\cos\omega_2 t \end{pmatrix},$$
just as we originally expected.


We could say that this is the end of the story and stop here, since we have the formulas for the solution. However, let's push on a bit more.

Each pair of solutions $(x_j(t), y_j(t))$ for $j = 1, 2$ is clearly a periodic solution of the equation with period $2\pi/\omega_j$, but this does not mean that the full four-dimensional solution is a periodic function. Indeed, the full solution is a periodic function with period $\tau$ if and only if there exist integers $m$ and $n$ such that
$$\omega_1\tau = m\cdot 2\pi \quad\text{and}\quad \omega_2\tau = n\cdot 2\pi.$$
Thus, for periodicity, we must have
$$\tau = \frac{2\pi m}{\omega_1} = \frac{2\pi n}{\omega_2}$$
or, equivalently,
$$\frac{\omega_2}{\omega_1} = \frac{n}{m}.$$
That is, the ratio of the two frequencies of the oscillators must be a rational number. In Figure 6.6 we have plotted $(x_1(t), x_2(t))$ for a particular solution of this system when the ratio of the frequencies is 5/2.

Figure 6.6 A solution with frequency ratio 5/2 projected into the $x_1x_2$-plane. Note that $x_2(t)$ oscillates five times and $x_1(t)$ only twice before returning to the initial position.

When the ratio of the frequencies is irrational, something very different happens. To understand this, we make another (and much more familiar) change of coordinates. In canonical form, our system currently is
$$\begin{aligned} x_j' &= \omega_j y_j\\ y_j' &= -\omega_j x_j. \end{aligned}$$


Let's now introduce polar coordinates $(r_j, \theta_j)$ in place of the $x_j$ and $y_j$ variables. Differentiating
$$r_j^2 = x_j^2 + y_j^2,$$
we find
$$2r_j r_j' = 2x_j x_j' + 2y_j y_j' = 2x_j y_j\omega_j - 2x_j y_j\omega_j = 0.$$
Therefore $r_j' = 0$ for each $j$. Also, differentiating the equation
$$\tan\theta_j = \frac{y_j}{x_j}$$
yields
$$(\sec^2\theta_j)\,\theta_j' = \frac{y_j' x_j - y_j x_j'}{x_j^2} = \frac{-\omega_j r_j^2}{x_j^2},$$
from which we find
$$\theta_j' = -\omega_j.$$
So, in polar coordinates, these equations really are quite simple:
$$r_j' = 0, \qquad \theta_j' = -\omega_j.$$
The first equation tells us that both $r_1$ and $r_2$ remain constant along any solution. Then, no matter what we pick for our initial $r_1$ and $r_2$ values, the $\theta_j$ equations remain the same. Hence we may as well restrict our attention to $r_1 = r_2 = 1$. The resulting set of points in $\mathbb{R}^4$ is a torus (the surface of a doughnut), although this is a little difficult to visualize in four-dimensional space. However, we know that we have two independent variables on this set, namely $\theta_1$ and $\theta_2$, and both are periodic with period $2\pi$. So this is akin to the two independent circular directions that parameterize the familiar torus in $\mathbb{R}^3$.


Restricted to this torus, the equations now read
$$\theta_1' = -\omega_1, \qquad \theta_2' = -\omega_2.$$
It is convenient to think of $\theta_1$ and $\theta_2$ as variables in a square of side length $2\pi$, where we glue together the opposite sides $\theta_j = 0$ and $\theta_j = 2\pi$ to make the torus. In this square our vector field now has constant slope
$$\frac{\theta_2'}{\theta_1'} = \frac{\omega_2}{\omega_1}.$$
Therefore solutions lie along straight lines with slope $\omega_2/\omega_1$ in this square. When a solution reaches the edge $\theta_1 = 2\pi$ (say, at $\theta_2 = c$), it instantly reappears on the edge $\theta_1 = 0$ with the $\theta_2$ coordinate given by $c$, and then continues onward with slope $\omega_2/\omega_1$. A similar identification occurs when the solution meets $\theta_2 = 2\pi$.

So now we have a simplified geometric vision of what happens to these solutions on these tori. But what really happens? The answer depends on the ratio $\omega_2/\omega_1$. If this ratio is a rational number, say $n/m$, then the solution starting at $(\theta_1(0), \theta_2(0))$ will pass horizontally through the torus exactly $m$ times and vertically $n$ times before returning to its starting point. This is the periodic solution we observed above. Incidentally, the picture of the straight-line solutions in the $\theta_1\theta_2$-plane is not at all the same as our depiction of solutions in the $x_1x_2$-plane as shown in Figure 6.6.

In the irrational case, something quite different occurs; see Figure 6.7. To understand what is happening here, we return to the notion of a Poincaré map discussed in Chapter 1. Consider the circle $\theta_1 = 0$, the left-hand edge of our square representation of the torus. Given an initial point on this circle, say $\theta_2 = x_0$, we follow the solution starting at this point until it next hits $\theta_1 = 2\pi$.

Figure 6.7 A solution with frequency ratio $\sqrt{2}$ projected into the $x_1x_2$-plane; the left curve is computed up to time $50\pi$, the right up to time $100\pi$.


By our identification, this solution has now returned to the circle $\theta_1 = 0$. The solution may cross the boundary $\theta_2 = 2\pi$ several times in making this transit, but it does eventually return to $\theta_1 = 0$. So we may define the Poincaré map on $\theta_1 = 0$ by assigning to $x_0$ on this circle the corresponding coordinate of the point of first return. Suppose that this first return occurs at the point $\theta_2(\tau)$, where $\tau$ is the time for which $\theta_1(\tau) = 2\pi$. Since $\theta_1(t) = \theta_1(0) + \omega_1 t$, we have $\tau = 2\pi/\omega_1$. Hence $\theta_2(\tau) = x_0 + \omega_2(2\pi/\omega_1)$. Therefore the Poincaré map on the circle may be written as
$$f(x_0) = x_0 + 2\pi(\omega_2/\omega_1) \bmod 2\pi,$$
where $x_0 = \theta_2(0)$ is our initial $\theta_2$ coordinate on the circle. See Figure 6.8. Thus the Poincaré map on the circle is just the function that rotates points on the circle by the angle $2\pi(\omega_2/\omega_1)$. Since $\omega_2/\omega_1$ is irrational, this function is called an irrational rotation of the circle.

Figure 6.8 The Poincaré map on the circle $\theta_1 = 0$ in the $\theta_1\theta_2$ torus.

Definition
The set of points $x_0, x_1 = f(x_0), x_2 = f(f(x_0)), \ldots, x_n = f(x_{n-1}), \ldots$ is called the orbit of $x_0$ under iteration of $f$.

The orbit of $x_0$ tracks how our solution successively crosses $\theta_1 = 2\pi$ as time increases.
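Iterating this circle map numerically (a sketch of ours, not the book's) already suggests the density result proved next: the orbit points of an irrational rotation gradually fill the whole circle. The iteration count is an arbitrary choice.

```python
import numpy as np

ratio = np.sqrt(2.0)          # omega2 / omega1, irrational
step = 2 * np.pi * ratio      # rotation angle of the Poincare map

x = 0.0
orbit = []
for _ in range(2000):
    x = (x + step) % (2 * np.pi)
    orbit.append(x)

orbit = np.sort(np.array(orbit))
gaps = np.diff(orbit)
print(gaps.max())   # largest gap between neighboring orbit points: already
                    # small, and it shrinks as the number of iterates grows
```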


Proposition. Suppose $\omega_2/\omega_1$ is irrational. Then the orbit of any initial point $x_0$ on the circle $\theta_1 = 0$ is dense in the circle.

Proof: Recall from Section 5.6 that a subset of the circle is dense if there are points in this subset that are arbitrarily close to any point whatsoever in the circle. Therefore we must show that, given any point $z$ on the circle and any $\epsilon > 0$, there is a point $x_n$ on the orbit of $x_0$ such that $|z - x_n| < \epsilon$, where $z$ and $x_n$ are measured mod $2\pi$. To see this, observe first that there must be $n, m$ for which $m > n$ and $|x_n - x_m| < \epsilon$. Indeed, we know that the orbit of $x_0$ is not a finite set of points, since $\omega_2/\omega_1$ is irrational. Hence there must be at least two of these points whose distance apart is less than $\epsilon$, since the circle has finite circumference. These are the points $x_n$ and $x_m$ (actually, there must be infinitely many such points). Now rotate these points in the reverse direction exactly $n$ times. The points $x_n$ and $x_m$ are rotated to $x_0$ and $x_{m-n}$, respectively. We find, after this rotation, that $|x_0 - x_{m-n}| < \epsilon$. Now $x_{m-n}$ is given by rotating the circle through the angle $(m-n)2\pi(\omega_2/\omega_1)$, which, mod $2\pi$, is therefore a rotation of angle less than $\epsilon$. Hence, performing this rotation again, we find
$$|x_{2(m-n)} - x_{m-n}| < \epsilon$$
as well and, inductively,
$$|x_{k(m-n)} - x_{(k-1)(m-n)}| < \epsilon$$
for each $k$. Thus we have found a sequence of points obtained by repeated rotation through the angle $(m-n)2\pi(\omega_2/\omega_1)$, and each of these points is within $\epsilon$ of its predecessor. Hence there must be a point of this form within $\epsilon$ of $z$.

Since the orbit of $x_0$ is dense in the circle $\theta_1 = 0$, it follows that the straight-line solutions connecting these points in the square are also dense, and so the original solutions are dense in the torus on which they reside. This accounts for the densely packed solution shown projected into the $x_1x_2$-plane in Figure 6.7 when $\omega_2/\omega_1 = \sqrt{2}$.

Returning to the actual motion of the oscillators, we see that when $\omega_2/\omega_1$ is irrational, the masses do not move in periodic fashion. However, they do come back very close to their initial positions over and over again as time goes on, due to the density of these solutions on the torus. These types of motions are called quasi-periodic motions. In Exercise 7 we investigate a related set of equations, namely, a pair of coupled oscillators.

6.3 Repeated Eigenvalues

As we saw in the previous chapter, the solution of systems with repeated real eigenvalues reduces to solving systems whose matrices contain blocks of the form
$$\begin{pmatrix} \lambda & 1 & & \\ & \lambda & 1 & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}.$$


Example. Let
$$X' = \begin{pmatrix} \lambda&1&0\\0&\lambda&1\\0&0&\lambda \end{pmatrix}X.$$
The only eigenvalue for this system is $\lambda$, and its only independent eigenvector is $(1, 0, 0)$. We may solve this system as we did in Chapter 3, by first noting that $x_3' = \lambda x_3$, so we must have
$$x_3(t) = c_3 e^{\lambda t}.$$
Now we must have
$$x_2' = \lambda x_2 + c_3 e^{\lambda t}.$$
As in Chapter 3, we guess a solution of the form
$$x_2(t) = c_2 e^{\lambda t} + \alpha t e^{\lambda t}.$$
Substituting this guess into the differential equation for $x_2'$, we determine that $\alpha = c_3$ and find
$$x_2(t) = c_2 e^{\lambda t} + c_3 t e^{\lambda t}.$$
Finally, the equation
$$x_1' = \lambda x_1 + c_2 e^{\lambda t} + c_3 t e^{\lambda t}$$
suggests the guess
$$x_1(t) = c_1 e^{\lambda t} + \alpha t e^{\lambda t} + \beta t^2 e^{\lambda t}.$$


Solving as above, we find
$$x_1(t) = c_1 e^{\lambda t} + c_2 t e^{\lambda t} + c_3 \frac{t^2}{2} e^{\lambda t}.$$
Altogether, we find
$$X(t) = c_1 e^{\lambda t}\begin{pmatrix}1\\0\\0\end{pmatrix} + c_2 e^{\lambda t}\begin{pmatrix}t\\1\\0\end{pmatrix} + c_3 e^{\lambda t}\begin{pmatrix}t^2/2\\t\\1\end{pmatrix},$$
which is the general solution. Despite the presence of the polynomial terms in this solution, when $\lambda < 0$ the exponential term dominates and all solutions do tend to zero. Some representative solutions when $\lambda < 0$ are shown in Figure 6.9. Note that there is only one straight-line solution for this system; this solution lies on the $x$-axis. Also, the $xy$-plane is invariant, and solutions there behave exactly as in the planar repeated-eigenvalue case.

Figure 6.9 The phase portrait for repeated real eigenvalues.

Example. Consider the following four-dimensional system:
$$\begin{aligned} x_1' &= x_1 + x_2 - x_3\\ x_2' &= x_2 + x_4\\ x_3' &= x_3 + x_4\\ x_4' &= x_4. \end{aligned}$$


We may write this system in matrix form as
$$X' = AX = \begin{pmatrix} 1&1&-1&0\\0&1&0&1\\0&0&1&1\\0&0&0&1 \end{pmatrix}X.$$
Because $A$ is upper triangular, all of the eigenvalues are equal to 1. Solving $(A - I)X = 0$, we find two independent eigenvectors $V_1 = (1, 0, 0, 0)$ and $W_1 = (0, 1, 1, 0)$. This reduces the possible canonical forms for $A$ to two possibilities. Solving $(A - I)X = V_1$ yields one solution $V_2 = (0, 1, 0, 0)$, and solving $(A - I)X = W_1$ yields another solution $W_2 = (0, 0, 0, 1)$. Thus we know that the system $X' = AX$ may be transformed into
$$Y' = (T^{-1}AT)Y = \begin{pmatrix} 1&1&0&0\\0&1&0&0\\0&0&1&1\\0&0&0&1 \end{pmatrix}Y,$$
where the matrix $T$ is given by
$$T = \begin{pmatrix} 1&0&0&0\\0&1&1&0\\0&0&1&0\\0&0&0&1 \end{pmatrix}.$$
Solutions of $Y' = (T^{-1}AT)Y$ therefore are given by
$$\begin{aligned} y_1(t) &= c_1 e^t + c_2 t e^t\\ y_2(t) &= c_2 e^t\\ y_3(t) &= c_3 e^t + c_4 t e^t\\ y_4(t) &= c_4 e^t. \end{aligned}$$
Applying the change of coordinates $T$, we find the general solution of the original system:
$$\begin{aligned} x_1(t) &= c_1 e^t + c_2 t e^t\\ x_2(t) &= c_2 e^t + c_3 e^t + c_4 t e^t\\ x_3(t) &= c_3 e^t + c_4 t e^t\\ x_4(t) &= c_4 e^t. \end{aligned}$$


6.4 The Exponential of a Matrix

We turn now to an alternative and elegant approach to solving linear systems using the exponential of a matrix. In a certain sense, this is the more natural way to attack these systems.

Recall from Section 1.1 in Chapter 1 how we solved the $1 \times 1$ "system" of linear equations $x' = ax$, where our matrix was simply $(a)$. We did not go through the process of finding eigenvalues and eigenvectors here. (Well, actually, we did, but the process was pretty simple.) Rather, we just exponentiated the matrix $(a)$ to find the general solution $x(t) = c\exp(at)$. In fact, this process works in the general case where $A$ is $n \times n$. All we need to know is how to exponentiate a matrix.

Here's how: recall from calculus that the exponential function can be expressed as the infinite series
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
We know that this series converges for every $x \in \mathbb{R}$. Now we can add matrices, we can raise them to the power $k$, and we can multiply each entry by $1/k!$. So this suggests that we can use this series to exponentiate them as well.

Definition
Let $A$ be an $n \times n$ matrix. We define the exponential of $A$ to be the matrix given by
$$\exp(A) = \sum_{k=0}^{\infty} \frac{A^k}{k!}.$$

Of course, we have to worry about what it means for this sum of matrices to converge, but let's put that off and try to compute a few examples first.

Example. Let
$$A = \begin{pmatrix} \lambda&0\\0&\mu \end{pmatrix}.$$
Then we have
$$A^k = \begin{pmatrix} \lambda^k&0\\0&\mu^k \end{pmatrix},$$


so that
$$\exp(A) = \begin{pmatrix} \displaystyle\sum_{k=0}^{\infty}\lambda^k/k! & 0\\ 0 & \displaystyle\sum_{k=0}^{\infty}\mu^k/k! \end{pmatrix} = \begin{pmatrix} e^{\lambda}&0\\0&e^{\mu} \end{pmatrix},$$
as you may have guessed.

Example. For a slightly more complicated example, let
$$A = \begin{pmatrix} 0&\beta\\-\beta&0 \end{pmatrix}.$$
We compute
$$A^0 = I,\quad A^2 = -\beta^2 I,\quad A^3 = -\beta^3\begin{pmatrix}0&1\\-1&0\end{pmatrix},\quad A^4 = \beta^4 I,\quad A^5 = \beta^5\begin{pmatrix}0&1\\-1&0\end{pmatrix},\ \ldots$$
so we find
$$\exp(A) = \begin{pmatrix} \displaystyle\sum_{k=0}^{\infty}(-1)^k\frac{\beta^{2k}}{(2k)!} & \displaystyle\sum_{k=0}^{\infty}(-1)^k\frac{\beta^{2k+1}}{(2k+1)!}\\[2ex] -\displaystyle\sum_{k=0}^{\infty}(-1)^k\frac{\beta^{2k+1}}{(2k+1)!} & \displaystyle\sum_{k=0}^{\infty}(-1)^k\frac{\beta^{2k}}{(2k)!} \end{pmatrix} = \begin{pmatrix} \cos\beta & \sin\beta\\ -\sin\beta & \cos\beta \end{pmatrix}.$$
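These closed forms are easy to confirm with SciPy's matrix exponential (our check, not part of the text; the parameter values are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

beta = 0.7
A = np.array([[0., beta],
              [-beta, 0.]])
R = np.array([[np.cos(beta), np.sin(beta)],
              [-np.sin(beta), np.cos(beta)]])
print(np.allclose(expm(A), R))   # True: exp(A) is a rotation matrix

lam, mu = 2.0, -1.0
D = np.diag([lam, mu])
print(np.allclose(expm(D), np.diag([np.exp(lam), np.exp(mu)])))  # True
```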


Example. Now let
$$A = \begin{pmatrix} \lambda&1\\0&\lambda \end{pmatrix}$$
with $\lambda \neq 0$. With an eye toward what comes later, we compute not $\exp A$ but rather $\exp(tA)$. We have
$$(tA)^k = \begin{pmatrix} (t\lambda)^k & k t^k\lambda^{k-1}\\ 0 & (t\lambda)^k \end{pmatrix}.$$
Hence we find
$$\exp(tA) = \begin{pmatrix} \displaystyle\sum_{k=0}^{\infty}\frac{(t\lambda)^k}{k!} & t\displaystyle\sum_{k=0}^{\infty}\frac{(t\lambda)^k}{k!}\\[2ex] 0 & \displaystyle\sum_{k=0}^{\infty}\frac{(t\lambda)^k}{k!} \end{pmatrix} = \begin{pmatrix} e^{t\lambda} & te^{t\lambda}\\ 0 & e^{t\lambda} \end{pmatrix}.$$

Note that, in each of these three examples, the matrix $\exp(A)$ is a matrix whose entries are infinite series. We therefore say that the infinite series of matrices $\exp(A)$ converges absolutely if each of its individual terms does so. In each of the previous cases, this convergence was clear. Unfortunately, in the case of a general matrix $A$, this is not so clear. To prove convergence here, we need to work a little harder.

Let $a_{ij}(k)$ denote the $ij$ entry of $A^k$. Let $a = \max |a_{ij}|$. We have
$$|a_{ij}(2)| = \Big|\sum_{k=1}^n a_{ik}a_{kj}\Big| \le na^2,$$
$$|a_{ij}(3)| = \Big|\sum_{k,l=1}^n a_{ik}a_{kl}a_{lj}\Big| \le n^2 a^3,$$
and, in general,
$$|a_{ij}(k)| \le n^{k-1}a^k.$$
Hence we have a bound for the $ij$ entry of the $n \times n$ matrix $\exp(A)$:
$$\Big|\sum_{k=0}^{\infty}\frac{a_{ij}(k)}{k!}\Big| \le \sum_{k=0}^{\infty}\frac{|a_{ij}(k)|}{k!} \le \sum_{k=0}^{\infty}\frac{n^{k-1}a^k}{k!} \le \sum_{k=0}^{\infty}\frac{(na)^k}{k!} \le \exp(na),$$
so this series converges absolutely by the comparison test. Therefore the matrix $\exp A$ makes sense for any $A \in L(\mathbb{R}^n)$.

The following result shows that matrix exponentiation shares many of the familiar properties of the usual exponential function.


Proposition. Let A, B, and T be n × n matrices. Then:

1. If B = T^{-1}AT, then exp(B) = T^{-1} exp(A) T.
2. If AB = BA, then exp(A + B) = exp(A) exp(B).
3. exp(−A) = (exp(A))^{-1}.

Proof: The proof of (1) follows from the identities T^{-1}(A + B)T = T^{-1}AT + T^{-1}BT and (T^{-1}AT)^k = T^{-1}A^kT. Therefore

\[
T^{-1}\left(\sum_{k=0}^{n} \frac{A^k}{k!}\right) T = \sum_{k=0}^{n} \frac{(T^{-1}AT)^k}{k!},
\]

and (1) follows by taking limits.

To prove (2), observe that because AB = BA we have, by the binomial theorem,

\[
(A + B)^n = n! \sum_{j+k=n} \frac{A^j}{j!} \frac{B^k}{k!}.
\]

Therefore we must show that

\[
\sum_{n=0}^{\infty} \left(\sum_{j+k=n} \frac{A^j}{j!} \frac{B^k}{k!}\right) = \left(\sum_{j=0}^{\infty} \frac{A^j}{j!}\right)\left(\sum_{k=0}^{\infty} \frac{B^k}{k!}\right).
\]

This is not as obvious as it may seem, since we are dealing here with series of matrices, not series of real numbers. So we will prove this in the following lemma, which then proves (2). Putting B = −A in (2) gives (3). □
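Property (2), and its failure when the matrices do not commute, is easy to explore numerically. A minimal sketch (ours, assuming SciPy); compare Exercise 13 at the end of this chapter.

```python
import numpy as np
from scipy.linalg import expm

# Commuting pair: two polynomials in the same matrix always commute.
M = np.array([[1.0, 2.0], [0.0, 3.0]])
A, B = M, M @ M
print(np.allclose(expm(A + B), expm(A) @ expm(B)))    # True

# Non-commuting pair.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.allclose(expm(A + B), expm(A) @ expm(B)))    # False
```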


Lemma. For any n × n matrices A and B, we have

\[
\sum_{n=0}^{\infty} \left(\sum_{j+k=n} \frac{A^j}{j!} \frac{B^k}{k!}\right) = \left(\sum_{j=0}^{\infty} \frac{A^j}{j!}\right)\left(\sum_{k=0}^{\infty} \frac{B^k}{k!}\right).
\]

Proof: We know that each of these infinite series of matrices converges. We just have to check that they converge to each other. To do this, consider the partial sums

\[
\gamma_{2m} = \sum_{n=0}^{2m} \left(\sum_{j+k=n} \frac{A^j}{j!} \frac{B^k}{k!}\right)
\]

and

\[
\alpha_m = \sum_{j=0}^{m} \frac{A^j}{j!} \quad\text{and}\quad \beta_m = \sum_{k=0}^{m} \frac{B^k}{k!}.
\]

We need to show that the matrices γ_2m − α_mβ_m tend to the zero matrix as m → ∞. Toward that end, for a matrix M = [m_ij], we let ||M|| = max |m_ij|. We will show that ||γ_2m − α_mβ_m|| → 0 as m → ∞.

A computation shows that

\[
\gamma_{2m} - \alpha_m \beta_m = {\sum}' \frac{A^j}{j!} \frac{B^k}{k!} + {\sum}'' \frac{A^j}{j!} \frac{B^k}{k!},
\]

where Σ' denotes the sum over terms with indices satisfying

j + k ≤ 2m, 0 ≤ j ≤ m, m + 1 ≤ k ≤ 2m,

while Σ'' is the sum corresponding to

j + k ≤ 2m, m + 1 ≤ j ≤ 2m, 0 ≤ k ≤ m.

Therefore

\[
\|\gamma_{2m} - \alpha_m \beta_m\| \le {\sum}' \Bigl\|\frac{A^j}{j!}\Bigr\| \cdot \Bigl\|\frac{B^k}{k!}\Bigr\| + {\sum}'' \Bigl\|\frac{A^j}{j!}\Bigr\| \cdot \Bigl\|\frac{B^k}{k!}\Bigr\|.
\]

Now

\[
{\sum}' \Bigl\|\frac{A^j}{j!}\Bigr\| \cdot \Bigl\|\frac{B^k}{k!}\Bigr\| \le \left(\sum_{j=0}^{m} \Bigl\|\frac{A^j}{j!}\Bigr\|\right)\left(\sum_{k=m+1}^{2m} \Bigl\|\frac{B^k}{k!}\Bigr\|\right).
\]

This tends to 0 as m → ∞ since, as we saw above,

\[
\sum_{j=0}^{\infty} \Bigl\|\frac{A^j}{j!}\Bigr\| \le \exp(n\|A\|) < \infty.
\]

Similarly,

\[
{\sum}'' \Bigl\|\frac{A^j}{j!}\Bigr\| \cdot \Bigl\|\frac{B^k}{k!}\Bigr\| \to 0
\]

as m → ∞. Therefore lim_{m→∞}(γ_2m − α_mβ_m) = 0, proving the lemma. □


Observe that statement (3) of the proposition implies that exp(A) is invertible for every matrix A. This is analogous to the fact that e^a ≠ 0 for every real number a.

There is a very simple relationship between the eigenvectors of A and those of exp(A).

Proposition. If V ∈ R^n is an eigenvector of A associated to the eigenvalue λ, then V is also an eigenvector of exp(A) associated to e^λ.

Proof: From AV = λV, we obtain

\[
\exp(A)V = \lim_{n\to\infty} \sum_{k=0}^{n} \frac{A^k V}{k!} = \lim_{n\to\infty} \sum_{k=0}^{n} \frac{\lambda^k}{k!} V = \left(\sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\right) V = e^{\lambda} V. \qquad\Box
\]
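A quick numerical illustration of this proposition (a sketch of ours, assuming SciPy; the matrix below is an arbitrary sample):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # symmetric, eigenvalues 1 and 3
eigenvalues, eigenvectors = np.linalg.eig(A)

E = expm(A)
for lam, V in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(E @ V, np.exp(lam) * V))   # True, True
```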


Now let's return to the setting of systems of differential equations. Let A be an n × n matrix and consider the system X' = AX. Recall that L(R^n) denotes the set of all n × n matrices. We have a function R → L(R^n) which assigns the matrix exp(tA) to t ∈ R. Since L(R^n) is identified with R^{n²}, it makes sense to speak of the derivative of this function.

Proposition.

\[
\frac{d}{dt} \exp(tA) = A \exp(tA) = \exp(tA) A.
\]

In other words, the derivative of the matrix-valued function t → exp(tA) is another matrix-valued function, A exp(tA).

Proof: We have

\[
\begin{aligned}
\frac{d}{dt}\exp(tA) &= \lim_{h\to 0} \frac{\exp((t+h)A) - \exp(tA)}{h} \\
&= \lim_{h\to 0} \frac{\exp(tA)\exp(hA) - \exp(tA)}{h} \\
&= \exp(tA) \lim_{h\to 0} \frac{\exp(hA) - I}{h} \\
&= \exp(tA)\, A;
\end{aligned}
\]

that the last limit equals A follows from the series definition of exp(hA). Note that A commutes with each term of the series for exp(tA), hence with exp(tA). This proves the proposition. □

Now we return to solving systems of differential equations. The following may be considered as the fundamental theorem of linear differential equations with constant coefficients.

Theorem. Let A be an n × n matrix. Then the solution of the initial value problem X' = AX with X(0) = X_0 is X(t) = exp(tA)X_0. Moreover, this is the only such solution.

Proof: The preceding proposition shows that

\[
\frac{d}{dt}\bigl(\exp(tA)X_0\bigr) = \left(\frac{d}{dt}\exp(tA)\right) X_0 = A \exp(tA) X_0.
\]

Moreover, since exp(0A)X_0 = X_0, it follows that this is a solution of the initial value problem. To see that there are no other solutions, let Y(t) be another solution satisfying Y(0) = X_0 and set

Z(t) = exp(−tA)Y(t).

Then

\[
\begin{aligned}
Z'(t) &= \left(\frac{d}{dt}\exp(-tA)\right) Y(t) + \exp(-tA)\, Y'(t) \\
&= -A\exp(-tA)Y(t) + \exp(-tA)\, A\, Y(t) \\
&= \exp(-tA)(-A + A)Y(t) \\
&\equiv 0.
\end{aligned}
\]

Therefore Z(t) is a constant. Setting t = 0 shows Z(t) = X_0, so that Y(t) = exp(tA)X_0. This completes the proof of the theorem. □

Note that this proof is identical to that given way back in Section 1.1. Only the meaning of the letter A has changed.
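The theorem can be checked numerically by comparing exp(tA)X_0 with the output of a general-purpose integrator. A minimal sketch (ours, assuming SciPy; the matrix A below is just a sample):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
X0 = np.array([1.0, 0.0])
t_final = 2.0

closed_form = expm(t_final * A) @ X0
numerical = solve_ivp(lambda t, X: A @ X, (0.0, t_final), X0,
                      rtol=1e-10, atol=1e-12).y[:, -1]
print(closed_form, numerical)   # agree to integrator tolerance
```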


Example. Consider the system

\[
X' = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} X.
\]

By the theorem, the general solution is

\[
X(t) = \exp(tA) X_0 = \exp\begin{pmatrix} t\lambda & t \\ 0 & t\lambda \end{pmatrix} X_0.
\]

But this is precisely the matrix whose exponential we computed earlier. We find

\[
X(t) = \begin{pmatrix} e^{t\lambda} & t e^{t\lambda} \\ 0 & e^{t\lambda} \end{pmatrix} X_0.
\]

Note that this agrees with our computations in Chapter 3.

6.5 Nonautonomous Linear Systems

Up to this point, virtually all of the linear systems of differential equations that we have encountered have been autonomous. There are, however, certain types of nonautonomous systems that often arise in applications. One such system is of the form

X' = A(t)X,

where A(t) = [a_ij(t)] is an n × n matrix that depends continuously on time. We will investigate these types of systems further when we encounter the variational equation in subsequent chapters.

Here we restrict our attention to a different type of nonautonomous linear system, given by

X' = AX + G(t),

where A is a constant n × n matrix and G : R → R^n is a forcing term that depends explicitly on t. This is an example of a first-order, linear, nonhomogeneous system of equations.


Example. (Forced Harmonic Oscillator) If we apply an external force to the harmonic oscillator system, the differential equation governing the motion becomes

x'' + bx' + kx = f(t),

where f(t) measures the external force. An important special case occurs when this force is a periodic function of time, which corresponds, for example, to moving the table on which the mass-spring apparatus resides back and forth periodically. As a system, the forced harmonic oscillator equation becomes

\[
X' = \begin{pmatrix} 0 & 1 \\ -k & -b \end{pmatrix} X + G(t) \quad\text{where}\quad G(t) = \begin{pmatrix} 0 \\ f(t) \end{pmatrix}.
\]

For a nonhomogeneous system, the equation that results from dropping the time-dependent term, namely X' = AX, is called the homogeneous equation. We know how to find the general solution of this system. Borrowing the notation from the previous section, the solution satisfying the initial condition X(0) = X_0 is

X(t) = exp(tA)X_0,

so this is the general solution of the homogeneous equation.

To find the general solution of the nonhomogeneous equation, suppose that we have one particular solution Z(t) of this equation, so Z'(t) = AZ(t) + G(t). If X(t) is any solution of the homogeneous equation, then the function Y(t) = X(t) + Z(t) is another solution of the nonhomogeneous equation. This follows since we have

\[
Y' = X' + Z' = AX + AZ + G(t) = A(X + Z) + G(t) = AY + G(t).
\]

Therefore, since we know all solutions of the homogeneous equation, we can now find the general solution to the nonhomogeneous equation, provided that we can find just one particular solution of this equation. Often one gets such a solution by simply guessing that solution (in calculus, this method is usually called the method of undetermined coefficients). Unfortunately, guessing a solution does not always work. The following method, called variation of parameters, does work in all cases. However, there is no guarantee that we can actually evaluate the required integrals.


Theorem. (Variation of Parameters) Consider the nonhomogeneous equation

X' = AX + G(t),

where A is an n × n matrix and G(t) is a continuous function of t. Then

\[
X(t) = \exp(tA)\left(X_0 + \int_0^t \exp(-sA)\, G(s)\, ds\right)
\]

is a solution of this equation satisfying X(0) = X_0.

Proof: Differentiating X(t), we obtain

\[
\begin{aligned}
X'(t) &= A \exp(tA)\left(X_0 + \int_0^t \exp(-sA)\, G(s)\, ds\right) + \exp(tA)\,\frac{d}{dt}\int_0^t \exp(-sA)\, G(s)\, ds \\
&= A \exp(tA)\left(X_0 + \int_0^t \exp(-sA)\, G(s)\, ds\right) + G(t) \\
&= AX(t) + G(t). \qquad\Box
\end{aligned}
\]
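In practice the integral in this formula usually must be evaluated numerically. A minimal sketch (ours, assuming SciPy; the helper vop_solution and the parameter values are our own constructions, not library routines), checked against direct integration of X' = AX + G(t):

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad, solve_ivp

k, b = 2.0, 1.0
A = np.array([[0.0, 1.0],
              [-k, -b]])
G = lambda t: np.array([0.0, np.cos(t)])
X0 = np.array([1.0, 0.0])

def vop_solution(t):
    # X(t) = exp(tA) (X0 + int_0^t exp(-sA) G(s) ds), by quadrature.
    integrand = lambda s, i: (expm(-s * A) @ G(s))[i]
    integral = np.array([quad(integrand, 0.0, t, args=(i,))[0]
                         for i in range(2)])
    return expm(t * A) @ (X0 + integral)

t = 3.0
print(vop_solution(t))
print(solve_ivp(lambda t, X: A @ X + G(t), (0.0, t), X0,
                rtol=1e-10, atol=1e-12).y[:, -1])
# The two answers agree to within the quadrature/integration tolerances.
```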


We now give several applications of this result in the case of the periodically forced harmonic oscillator. Assume first that we have a damped oscillator that is forced by cos t, so the period of the forcing term is 2π. The system is

X' = AX + G(t),

where G(t) = (0, cos t) and A is the matrix

\[
A = \begin{pmatrix} 0 & 1 \\ -k & -b \end{pmatrix}
\]

with b, k > 0. We claim that there is a unique periodic solution of this system that has period 2π. To prove this, we must first find a solution X(t) satisfying X(0) = X_0 = X(2π). By variation of parameters, we need to find X_0 such that

\[
X_0 = \exp(2\pi A) X_0 + \exp(2\pi A) \int_0^{2\pi} \exp(-sA)\, G(s)\, ds.
\]

Now the term

\[
\exp(2\pi A) \int_0^{2\pi} \exp(-sA)\, G(s)\, ds
\]

is a constant vector, which we denote by W. Therefore we must solve the equation

\[
\bigl(\exp(2\pi A) - I\bigr) X_0 = -W.
\]

There is a unique solution to this equation, since the matrix exp(2πA) − I is invertible. For if this matrix were not invertible, we would have a nonzero vector V with

\[
\bigl(\exp(2\pi A) - I\bigr) V = 0,
\]

or, in other words, the matrix exp(2πA) would have an eigenvalue of 1. But, from the previous section, the eigenvalues of exp(2πA) are given by exp(2πλ_j), where the λ_j are the eigenvalues of A. Each λ_j has real part less than 0, so the magnitude of exp(2πλ_j) is smaller than 1. Thus the matrix exp(2πA) − I is indeed invertible, and the unique initial value leading to a 2π-periodic solution is

\[
X_0 = \bigl(\exp(2\pi A) - I\bigr)^{-1}(-W).
\]

So let X(t) be this periodic solution with X(0) = X_0. This solution is called the steady-state solution. If Y_0 is any other initial condition, then we may write Y_0 = (Y_0 − X_0) + X_0, so the solution through Y_0 is given by

\[
\begin{aligned}
Y(t) &= \exp(tA)(Y_0 - X_0) + \exp(tA) X_0 + \exp(tA) \int_0^t \exp(-sA)\, G(s)\, ds \\
&= \exp(tA)(Y_0 - X_0) + X(t).
\end{aligned}
\]

The first term in this expression tends to 0 as t → ∞, since it is a solution of the homogeneous equation. Hence every solution of this system tends to the steady-state solution as t → ∞. Physically, this is clear: The motion of the damped (and unforced) oscillator tends to equilibrium, leaving only the motion due to the periodic forcing. We have proved:

Theorem. Consider the forced, damped harmonic oscillator equation

x'' + bx' + kx = cos t

with k, b > 0. Then all solutions of this equation tend to the steady-state solution, which is periodic with period 2π.
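The steady-state initial condition can be computed directly from this formula. A minimal sketch (ours, assuming SciPy; the parameter values are arbitrary samples), verifying numerically that the resulting solution has period 2π:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import quad, solve_ivp

k, b = 2.0, 1.0
A = np.array([[0.0, 1.0],
              [-k, -b]])
G = lambda t: np.array([0.0, np.cos(t)])

# W = exp(2*pi*A) * integral_0^{2*pi} exp(-sA) G(s) ds, via quadrature.
E = expm(2 * np.pi * A)
integral = np.array([quad(lambda s, i=i: (expm(-s * A) @ G(s))[i],
                          0.0, 2 * np.pi)[0] for i in range(2)])
W = E @ integral

X0 = np.linalg.solve(E - np.eye(2), -W)
X_end = solve_ivp(lambda t, X: A @ X + G(t), (0.0, 2 * np.pi), X0,
                  rtol=1e-10, atol=1e-12).y[:, -1]
print(X0, X_end)   # X(2*pi) equals X(0): the steady-state solution
```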


Now consider a particular example of a forced, undamped harmonic oscillator:

\[
X' = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} X + \begin{pmatrix} 0 \\ \cos\omega t \end{pmatrix},
\]

where the period of the forcing is now 2π/ω with ω ≠ ±1. Let

\[
A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\]

The solution of the homogeneous equation is

\[
X(t) = \exp(tA) X_0 = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} X_0.
\]

Variation of parameters provides a solution of the nonhomogeneous equation starting at the origin:

\[
\begin{aligned}
Y(t) &= \exp(tA) \int_0^t \exp(-sA) \begin{pmatrix} 0 \\ \cos\omega s \end{pmatrix} ds \\
&= \exp(tA) \int_0^t \begin{pmatrix} \cos s & -\sin s \\ \sin s & \cos s \end{pmatrix} \begin{pmatrix} 0 \\ \cos\omega s \end{pmatrix} ds \\
&= \exp(tA) \int_0^t \begin{pmatrix} -\sin s \cos\omega s \\ \cos s \cos\omega s \end{pmatrix} ds \\
&= \frac{1}{2} \exp(tA) \int_0^t \begin{pmatrix} \sin(\omega-1)s - \sin(\omega+1)s \\ \cos(\omega-1)s + \cos(\omega+1)s \end{pmatrix} ds.
\end{aligned}
\]

Recalling that

\[
\exp(tA) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix},
\]


and using the fact that ω ≠ ±1, evaluation of this integral plus a long computation yields

\[
\begin{aligned}
Y(t) &= \frac{1}{2} \exp(tA) \begin{pmatrix} -\dfrac{\cos(\omega-1)t}{\omega-1} + \dfrac{\cos(\omega+1)t}{\omega+1} \\[2mm] \dfrac{\sin(\omega-1)t}{\omega-1} + \dfrac{\sin(\omega+1)t}{\omega+1} \end{pmatrix} + \exp(tA) \begin{pmatrix} (\omega^2-1)^{-1} \\ 0 \end{pmatrix} \\[2mm]
&= \frac{1}{\omega^2-1} \begin{pmatrix} -\cos\omega t \\ \omega \sin\omega t \end{pmatrix} + \exp(tA) \begin{pmatrix} (\omega^2-1)^{-1} \\ 0 \end{pmatrix}.
\end{aligned}
\]

Thus the general solution of this equation is

\[
Y(t) = \exp(tA)\left(X_0 + \begin{pmatrix} (\omega^2-1)^{-1} \\ 0 \end{pmatrix}\right) + \frac{1}{\omega^2-1} \begin{pmatrix} -\cos\omega t \\ \omega \sin\omega t \end{pmatrix}.
\]

The first term in this expression is periodic with period 2π, while the second has period 2π/ω. Unlike the damped case, this solution does not necessarily yield a periodic motion. Indeed, this solution is periodic if and only if ω is a rational number. If ω is irrational, the motion is quasiperiodic, just as we saw in Section 6.2.
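As a consistency check on this long computation, the closed form can be compared with direct numerical integration. A minimal sketch (ours, assuming SciPy; ω = √2 and the initial condition are arbitrary samples):

```python
import numpy as np
from scipy.integrate import solve_ivp

omega = np.sqrt(2.0)          # irrational frequency ratio
X0 = np.array([0.5, -0.25])

def closed_form(t):
    rot = np.array([[np.cos(t), np.sin(t)],
                    [-np.sin(t), np.cos(t)]])      # exp(tA)
    c = 1.0 / (omega**2 - 1.0)
    return rot @ (X0 + np.array([c, 0.0])) + \
           c * np.array([-np.cos(omega * t), omega * np.sin(omega * t)])

def rhs(t, X):
    return np.array([X[1], -X[0] + np.cos(omega * t)])

t = 7.3
numerical = solve_ivp(rhs, (0.0, t), X0, rtol=1e-10, atol=1e-12).y[:, -1]
print(closed_form(t), numerical)   # agree to integrator tolerance
```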


EXERCISES

1. Find the general solution for X' = AX where A is given by:

\[
\text{(a)} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \quad
\text{(b)} \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \quad
\text{(c)} \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}
\]

\[
\text{(d)} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix} \quad
\text{(e)} \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad
\text{(f)} \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}
\]

\[
\text{(g)} \begin{pmatrix} 1 & 0 & -1 \\ -1 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix} \quad
\text{(h)} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}
\]

2. Consider the linear system

\[
X' = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} X,
\]

where λ_3 < λ_2 < λ_1 < 0. Describe how the solution through an arbitrary initial value tends to the origin.

3. Give an example of a 3 × 3 matrix A for which all nonequilibrium solutions of X' = AX are periodic with period 2π. Sketch the phase portrait.

4. Find the general solution of

\[
X' = \begin{pmatrix} 0 & 1 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix} X.
\]

5. Consider the system

\[
X' = \begin{pmatrix} 0 & 0 & a \\ 0 & b & 0 \\ a & 0 & 0 \end{pmatrix} X
\]

depending on the two parameters a and b.

(a) Find the general solution of this system.
(b) Sketch the region in the ab–plane where this system has different types of phase portraits.

6. Consider the system

\[
X' = \begin{pmatrix} a & 0 & b \\ 0 & b & 0 \\ -b & 0 & a \end{pmatrix} X
\]

depending on the two parameters a and b.

(a) Find the general solution of this system.
(b) Sketch the region in the ab–plane where this system has different types of phase portraits.

7. Coupled Harmonic Oscillators. In this series of exercises you are asked to generalize the material on harmonic oscillators in Section 6.2 to the case where the oscillators are coupled.


Figure 6.10 A coupled oscillator.

Suppose there are two masses m_1 and m_2 attached to springs and walls as shown in Figure 6.10. The springs connecting m_j to the walls both have spring constant k_1, while the spring connecting m_1 and m_2 has spring constant k_2. This coupling means that the motion of either mass affects the behavior of the other.

Let x_j denote the displacement of each mass from its rest position, and assume that both masses are equal to 1. The differential equations for these coupled oscillators are then given by

\[
\begin{aligned}
x_1'' &= -(k_1 + k_2) x_1 + k_2 x_2 \\
x_2'' &= k_2 x_1 - (k_1 + k_2) x_2.
\end{aligned}
\]

These equations are derived as follows. If m_1 is moved to the right (x_1 > 0), the left spring is stretched and exerts a restorative force on m_1 given by −k_1x_1. Meanwhile, the central spring is compressed, so it exerts a restorative force on m_1 given by −k_2x_1. If the right spring is stretched, then the central spring is compressed and exerts a restorative force on m_1 given by k_2x_2 (since x_2 < 0). The forces on m_2 are similar.

(a) Write these equations as a first-order linear system.
(b) Determine the eigenvalues and eigenvectors of the corresponding matrix.
(c) Find the general solution.
(d) Let ω_1 = √k_1 and ω_2 = √(k_1 + 2k_2). What can be said about the periodicity of solutions relative to the ω_j? Prove this.

8. Suppose X' = AX where A is a 4 × 4 matrix whose eigenvalues are ±i√2 and ±i√3. Describe this flow.

9. Suppose X' = AX where A is a 4 × 4 matrix whose eigenvalues are ±i and −1 ± i. Describe this flow.

10. Suppose X' = AX where A is a 4 × 4 matrix whose eigenvalues are ±i and ±1. Describe this flow.

11. Consider the system X' = AX where X = (x_1, ..., x_6) and

\[
A = \begin{pmatrix} 0 & \omega_1 & & & & \\ -\omega_1 & 0 & & & & \\ & & 0 & \omega_2 & & \\ & & -\omega_2 & 0 & & \\ & & & & -1 & \\ & & & & & 1 \end{pmatrix}
\]


and ω_1/ω_2 is irrational. Describe qualitatively how a solution behaves when, at time 0, each x_j is nonzero with the exception that

(a) x_6 = 0;
(b) x_5 = 0;
(c) x_3 = x_4 = x_5 = 0;
(d) x_3 = x_4 = x_5 = x_6 = 0.

12. Compute the exponentials of the following matrices:

\[
\text{(a)} \begin{pmatrix} 5 & -6 \\ 3 & -4 \end{pmatrix} \quad
\text{(b)} \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} \quad
\text{(c)} \begin{pmatrix} 2 & -1 \\ 0 & 2 \end{pmatrix} \quad
\text{(d)} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]

\[
\text{(e)} \begin{pmatrix} 0 & 1 & 2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix} \quad
\text{(f)} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 1 & 3 \end{pmatrix} \quad
\text{(g)} \begin{pmatrix} \lambda & 0 & 0 \\ 1 & \lambda & 0 \\ 0 & 1 & \lambda \end{pmatrix}
\]

\[
\text{(h)} \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \quad
\text{(i)} \begin{pmatrix} 1+i & 0 \\ 2 & 1+i \end{pmatrix} \quad
\text{(j)} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}
\]

13. Find an example of two matrices A, B such that

exp(A + B) ≠ exp(A) exp(B).

14. Show that if AB = BA, then
(a) exp(A) exp(B) = exp(B) exp(A)
(b) exp(A)B = B exp(A).

15. Consider the triplet of harmonic oscillators

\[
\begin{aligned}
x_1'' &= -x_1 \\
x_2'' &= -2x_2 \\
x_3'' &= -\omega^2 x_3,
\end{aligned}
\]

where ω is irrational. What can you say about the qualitative behavior of solutions of this six-dimensional system?


7 Nonlinear Systems

In this chapter we begin the study of nonlinear differential equations. Unlike linear (constant coefficient) systems, where we can always find the explicit solution of any initial value problem, this is rarely the case for nonlinear systems. In fact, basic properties such as the existence and uniqueness of solutions, which were so obvious in the linear case, no longer hold for nonlinear systems. As we shall see, some nonlinear systems have no solutions whatsoever to a given initial value problem. On the other hand, some systems have infinitely many different such solutions. Even if we do find a solution of such a system, this solution need not be defined for all time; for example, the solution may tend to ∞ in finite time. And other questions arise: For example, what happens if we vary the initial condition of a system ever so slightly? Does the corresponding solution vary continuously? All of this is clear for linear systems, but not at all clear in the nonlinear case. This means that the underlying theory behind nonlinear systems of differential equations is quite a bit more complicated than that for linear systems.

In practice, most nonlinear systems that arise are "nice" in the sense that we do have existence and uniqueness of solutions, as well as continuity of solutions when initial conditions are varied, and other "natural" properties. Thus we have a choice: Given a nonlinear system, we could simply plunge ahead and either hope that, or, if possible, verify that, in each specific case, the system's solutions behave nicely. Alternatively, we could take a long pause at this stage to develop the necessary hypotheses that guarantee that solutions of a given nonlinear system behave nicely.


In this book we pursue a compromise route. In this chapter, we will spell out in precise detail many of the theoretical results that govern the behavior of solutions of differential equations. We will present examples of how and when these results fail, but we will not prove these theorems here. Rather, we will postpone all of the technicalities until Chapter 17, primarily because understanding this material demands a firm and extensive background in the principles of real analysis. In subsequent chapters, we will make use of the results stated here, but readers who are primarily interested in applications of differential equations or in understanding how specific nonlinear systems may be analyzed need not get bogged down in these details here. Readers who want the technical details may take a detour to Chapter 17 now.

7.1 Dynamical Systems

As mentioned previously, most nonlinear systems of differential equations are impossible to solve analytically. One reason for this is the unfortunate fact that we simply do not have enough functions with specific names that we can use to write down explicit solutions of these systems. Equally problematic is the fact that, as we shall see, higher dimensional systems may exhibit chaotic behavior, a property that makes knowing a particular explicit solution essentially worthless in the larger scheme of understanding the behavior of the system. Hence we are forced to resort to different means in order to understand these systems. These are the techniques that arise in the field of dynamical systems. We will use a combination of analytic, geometric, and topological techniques to derive rigorous results about the behavior of solutions of these equations.

We begin by collecting some of the terminology regarding dynamical systems that we have introduced at various points in the preceding chapters. A dynamical system is a way of describing the passage in time of all points of a given space S. The space S could be thought of, for example, as the space of states of some physical system. Mathematically, S might be a Euclidean space or an open subset of Euclidean space or some other space such as a surface in R³. When we consider dynamical systems that arise in mechanics, the space S will be the set of possible positions and velocities of the system. For the sake of simplicity, we will assume throughout that the space S is Euclidean space R^n, although in certain cases the important dynamical behavior will be confined to a particular subset of R^n.

Given an initial position X ∈ R^n, a dynamical system on R^n tells us where X is located 1 unit of time later, 2 units of time later, and so on. We denote these new positions of X by X_1, X_2, and so forth. At time zero, X is located at position X_0. One unit before time zero, X was at X_{−1}.


In general, the "trajectory" of X is given by X_t. If we measure the positions X_t using only integer time values, we have an example of a discrete dynamical system, which we shall study in Chapter 15. If time is measured continuously with t ∈ R, we have a continuous dynamical system. If the system depends on time in a continuously differentiable manner, we have a smooth dynamical system. These are the three principal types of dynamical systems that arise in the study of systems of differential equations, and they will form the backbone of Chapters 8 through 14.

The function that takes t to X_t yields either a sequence of points or a curve in R^n that represents the life history of X as time runs from −∞ to ∞. Different branches of dynamical systems make different assumptions about how the function X_t depends on t. For example, ergodic theory deals with such functions under the assumption that they preserve a measure on R^n. Topological dynamics deals with such functions under the assumption that X_t varies only continuously. In the case of differential equations, we will usually assume that the function X_t is continuously differentiable. The map φ_t : R^n → R^n that takes X into X_t is defined for each t and, from our interpretation of X_t as a state moving in time, it is reasonable to expect φ_t to have φ_{−t} as its inverse. Also, φ_0 should be the identity function φ_0(X) = X, and φ_t(φ_s(X)) = φ_{t+s}(X) is also a natural condition. We formalize all of this in the following definition:

Definition. A smooth dynamical system on R^n is a continuously differentiable function φ : R × R^n → R^n, where φ(t, X) = φ_t(X) satisfies

1. φ_0 : R^n → R^n is the identity function: φ_0(X_0) = X_0;
2. the composition φ_t ∘ φ_s = φ_{t+s} for each t, s ∈ R.

Recall that a function is continuously differentiable if all of its partial derivatives exist and are continuous throughout its domain. It is traditional to call a continuously differentiable function a C¹ function. If the function is k times continuously differentiable, it is called a C^k function. Note that the definition above implies that the map φ_t : R^n → R^n is C¹ for each t and has a C¹ inverse φ_{−t} (take s = −t in part 2).

Example. For the first-order differential equation x' = ax, the function φ_t(x_0) = x_0 exp(at) gives the solutions of this equation and also defines a smooth dynamical system on R.

Example. Let A be an n × n matrix. Then the function φ_t(X_0) = exp(tA)X_0 defines a smooth dynamical system on R^n.


Clearly, φ_0 = exp(0) = I and, as we saw in the previous chapter, we have

\[
\varphi_{t+s} = \exp((t+s)A) = \bigl(\exp(tA)\bigr)\bigl(\exp(sA)\bigr) = \varphi_t \circ \varphi_s.
\]

Note that these examples are intimately related to the system of differential equations X' = AX. In general, a smooth dynamical system always yields a vector field on R^n via the following rule: Given φ_t, let

\[
F(X) = \left.\frac{d}{dt}\right|_{t=0} \varphi_t(X).
\]

Then φ_t is just the time t map associated to the flow of X' = F(X). Conversely, the differential equation X' = F(X) generates a smooth dynamical system provided the time t map of the flow is well defined and continuously differentiable for all time. Unfortunately, this is not always the case.
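Both defining properties of a smooth dynamical system are easy to test numerically for the flow exp(tA). A minimal sketch (ours, assuming SciPy; the matrix and times are arbitrary samples):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
phi = lambda t, X: expm(t * A) @ X

X0 = np.array([1.0, 2.0])
t, s = 0.8, -1.7
print(np.allclose(phi(0.0, X0), X0))                    # phi_0 = identity
print(np.allclose(phi(t, phi(s, X0)), phi(t + s, X0)))  # phi_t . phi_s = phi_{t+s}
```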


7.2 The Existence and Uniqueness Theorem

We turn now to the fundamental theorem of differential equations, the existence and uniqueness theorem. Consider the system of differential equations

X' = F(X),

where F : R^n → R^n. Recall that a solution of this system is a function X : J → R^n defined on some interval J ⊂ R such that, for all t ∈ J,

X'(t) = F(X(t)).

Geometrically, X(t) is a curve in R^n whose tangent vector X'(t) exists for all t ∈ J and equals F(X(t)). As in previous chapters, we think of this vector as being based at X(t), so that the map F : R^n → R^n defines a vector field on R^n. An initial condition or initial value for a solution X : J → R^n is a specification of the form X(t_0) = X_0, where t_0 ∈ J and X_0 ∈ R^n. For simplicity, we usually take t_0 = 0. The main problem in differential equations is to find the solution of any initial value problem; that is, to determine the solution of the system that satisfies the initial condition X(0) = X_0 for each X_0 ∈ R^n.

Unfortunately, nonlinear differential equations may have no solutions satisfying certain initial conditions.

Example. Consider the simple first-order differential equation

\[
x' = \begin{cases} 1 & \text{if } x < 0 \\ -1 & \text{if } x \ge 0. \end{cases}
\]

This vector field on R points toward the origin from both sides, and there is no solution satisfying x(0) = 0: such a solution would have to decrease initially, since x'(0) = −1, yet every solution with x < 0 must increase.

The right-hand side of this equation is, of course, not even continuous at 0. But continuity alone does not guarantee uniqueness of solutions either: for the equation x' = 3x^{2/3}, both x(t) ≡ 0 and x(t) = t³ are solutions satisfying the initial condition x(0) = 0. Thus solutions of initial value problems may fail to exist, or may fail to be unique. Fortunately, for the differential equations that arise most often in practice, the


phenomenon of nonexistence or nonuniqueness of solutions with given initial conditions is quite exceptional.

The following is the fundamental local theorem of ordinary differential equations.

The Existence and Uniqueness Theorem. Consider the initial value problem

X' = F(X), X(t_0) = X_0,

where X_0 ∈ R^n. Suppose that F : R^n → R^n is C¹. Then, first of all, there exists a solution of this initial value problem and, secondly, this is the only such solution. More precisely, there exists an a > 0 and a unique solution

X : (t_0 − a, t_0 + a) → R^n

of this differential equation satisfying the initial condition X(t_0) = X_0.

The important proof of this theorem is contained in Chapter 17. Without dwelling on the details here, we note that the proof depends on an important technique known as Picard iteration. Before moving on, we illustrate how the Picard iteration scheme used in the proof of the theorem works in several special examples. The basic idea behind this iterative process is to construct a sequence of functions that converges to the solution of the differential equation. The sequence of functions u_k(t) is defined inductively by u_0(t) = x_0, where x_0 is the given initial condition, and then

\[
u_{k+1}(t) = x_0 + \int_0^t F(u_k(s))\, ds.
\]

Example. Consider the simple differential equation x' = x. We will produce the solution of this equation satisfying x(0) = x_0. We know, of course, that this solution is given by x(t) = x_0 e^t. We will construct a sequence of functions u_k(t), one for each k, which converges to the actual solution x(t) as k → ∞. We start with

u_0(t) = x_0,

the given initial value. Then we set

\[
u_1(t) = x_0 + \int_0^t u_0(s)\, ds = x_0 + \int_0^t x_0\, ds,
\]


so that u_1(t) = x_0 + t x_0. Given u_1 we define

\[
u_2(t) = x_0 + \int_0^t u_1(s)\, ds = x_0 + \int_0^t (x_0 + s x_0)\, ds,
\]

so that u_2(t) = x_0 + t x_0 + (t²/2) x_0. You can probably see where this is heading. Inductively, we set

\[
u_{k+1}(t) = x_0 + \int_0^t u_k(s)\, ds,
\]

and so

\[
u_{k+1}(t) = x_0 \sum_{i=0}^{k+1} \frac{t^i}{i!}.
\]

As k → ∞, u_k(t) converges to

\[
x_0 \sum_{i=0}^{\infty} \frac{t^i}{i!} = x_0 e^t = x(t),
\]

which is the solution of our original equation.

Example. For an example of Picard iteration applied to a system of differential equations, consider the linear system

\[
X' = F(X) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} X
\]

with initial condition X(0) = (1, 0). As we have seen, the solution of this initial value problem is

\[
X(t) = \begin{pmatrix} \cos t \\ -\sin t \end{pmatrix}.
\]

Using Picard iteration, we have

\[
\begin{aligned}
U_0(t) &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\
U_1(t) &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \int_0^t F\begin{pmatrix} 1 \\ 0 \end{pmatrix} ds = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} 0 \\ -1 \end{pmatrix} ds = \begin{pmatrix} 1 \\ -t \end{pmatrix} \\
U_2(t) &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -s \\ -1 \end{pmatrix} ds = \begin{pmatrix} 1 - t^2/2 \\ -t \end{pmatrix} \\
U_3(t) &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \int_0^t \begin{pmatrix} -s \\ -1 + s^2/2 \end{pmatrix} ds = \begin{pmatrix} 1 - t^2/2 \\ -t + t^3/3! \end{pmatrix} \\
U_4(t) &= \begin{pmatrix} 1 - t^2/2 + t^4/4! \\ -t + t^3/3! \end{pmatrix},
\end{aligned}
\]

and we see the infinite series for the cosine and sine functions emerging from this iteration.
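The bookkeeping in Picard iteration can be delegated to a computer algebra system. A minimal sketch (ours, assuming SymPy) for x' = x with x_0 = 1; each pass prints the next Taylor polynomial of e^t:

```python
import sympy as sp

t, s = sp.symbols('t s')
F = lambda u: u          # right-hand side of x' = x
u = sp.Integer(1)        # u_0(t) = x_0 = 1

for _ in range(5):
    # u_{k+1}(t) = x_0 + int_0^t F(u_k(s)) ds
    u = 1 + sp.integrate(F(u.subs(t, s)), (s, 0, t))
    print(sp.expand(u))
# 1 + t
# 1 + t + t**2/2
# ...successive partial sums of the series for e**t
```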


Now suppose that we have two solutions Y(t) and Z(t) of the differential equation X' = F(X), and that Y(t) and Z(t) satisfy Y(t_0) = Z(t_0). Suppose that both solutions are defined on an interval J. The existence and uniqueness theorem guarantees that Y(t) = Z(t) for all t in an interval about t_0, which may a priori be smaller than J. However, this is not the case. To see this, suppose that J* is the largest interval on which Y(t) = Z(t). Let t_1 be an endpoint of J*. By continuity, we have Y(t_1) = Z(t_1). The theorem then guarantees that, in fact, Y(t) and Z(t) agree on an open interval containing t_1. This contradicts the assertion that J* is the largest interval on which the two solutions agree.

Thus we can always assume that we have a unique solution defined on a maximal time domain. There is, however, no guarantee that a solution X(t) can be defined for all time, no matter how "nice" F(X) is.

Example. Consider the differential equation in R given by

x' = 1 + x².

This equation has as solutions the functions x(t) = tan(t + c), where c is a constant. Such a function cannot be extended over an interval larger than

−c − π/2 < t < −c + π/2,

since x(t) → ±∞ as t → −c ± π/2.

This example is typical: if a solution X(t) of X' = F(X), with F defined on an open set U, lives on a maximal open interval (α, β) with β < ∞, then the solution must eventually leave every compact subset of U as t → β. Among other things, this implies


that X(t) must come arbitrarily close to the boundary of U as t → β. Similar results hold as t → α.

7.3 Continuous Dependence of Solutions

For the existence and uniqueness theorem to be at all interesting in any physical (or even mathematical) sense, this result needs to be complemented by the property that the solution X(t) depends continuously on the initial condition X(0). The next theorem gives a precise statement of this property.

Theorem. Consider the differential equation X' = F(X), where F : R^n → R^n is C¹. Suppose that X(t) is a solution of this equation which is defined on the closed interval [t_0, t_1] with X(t_0) = X_0. Then there is a neighborhood U ⊂ R^n of X_0 and a constant K such that if Y_0 ∈ U, then there is a unique solution Y(t) also defined on [t_0, t_1] with Y(t_0) = Y_0. Moreover, Y(t) satisfies

\[
|Y(t) - X(t)| \le K |Y_0 - X_0| \exp(K(t - t_0))
\]

for all t ∈ [t_0, t_1].

This result says that, if the solutions X(t) and Y(t) start out close together, then they remain close together for t close to t_0. While these solutions may separate from each other, they do so no faster than exponentially. In particular, since the right-hand side of this inequality depends on |Y_0 − X_0|, which we may assume is small, we have:

Corollary. (Continuous Dependence on Initial Conditions) Let φ(t, X) be the flow of the system X' = F(X), where F is C¹. Then φ is a continuous function of X.

Example. Let k > 0. For the system

\[
X' = \begin{pmatrix} -1 & 0 \\ 0 & k \end{pmatrix} X,
\]

we know that the solution X(t) satisfying X(0) = (−1, 0) is given by

X(t) = (−e^{−t}, 0).

For any η ≠ 0, let Y_η(t) be the solution satisfying Y_η(0) = (−1, η). Then

Y_η(t) = (−e^{−t}, ηe^{kt}).


Figure 7.1 The solutions Y_η(t) separate exponentially from X(t) but nonetheless are continuous in their initial conditions.

As in the theorem, we have

\[
|Y_\eta(t) - X(t)| = |\eta e^{kt} - 0| = |\eta - 0|\, e^{kt} = |Y_\eta(0) - X(0)|\, e^{kt}.
\]

The solutions Y_η do indeed separate from X(t), as we see in Figure 7.1, but they do so at most exponentially. Moreover, for any fixed time t, we have Y_η(t) → X(t) as η → 0.

Differential equations often depend on parameters. For example, the harmonic oscillator equations depend on the parameters b (the damping constant) and k (the spring constant). The natural question, then, is how solutions of these equations depend on these parameters. As in the previous case, solutions depend continuously on these parameters provided that the system depends on the parameters in a continuously differentiable fashion. We can see this easily by using a special little trick. Suppose the system

X' = F_a(X)

depends on the parameter a in a C¹ fashion. Let's consider an "artificially" augmented system of differential equations given by

\[
\begin{aligned}
x_1' &= f_1(x_1, \dots, x_n, a) \\
&\;\;\vdots \\
x_n' &= f_n(x_1, \dots, x_n, a) \\
a' &= 0.
\end{aligned}
\]


This is now an autonomous system of n + 1 differential equations. While this expansion of the system may seem trivial, we may now invoke the previous result about continuous dependence of solutions on initial conditions to verify that solutions of the original system depend continuously on a as well.

Theorem. (Continuous Dependence on Parameters) Let X' = F_a(X) be a system of differential equations for which F_a is continuously differentiable in both X and a. Then the flow of this system depends continuously on a as well.

7.4 The Variational Equation

Consider an autonomous system X' = F(X) where, as usual, F is assumed to be C¹. The flow φ(t, X) of this system is a function of both t and X. From the results of the previous section, we know that φ is continuous in the variable X. We also know that φ is differentiable in the variable t, since t → φ(t, X) is just the solution curve through X. In fact, φ is also differentiable in the variable X, for we shall prove in Chapter 17:

Theorem. (Smoothness of Flows) Consider the system X' = F(X) where F is C¹. Then the flow φ(t, X) of this system is a C¹ function; that is, ∂φ/∂t and ∂φ/∂X exist and are continuous in t and X.

Note that we can compute ∂φ/∂t for any value of t as long as we know the solution passing through X_0, for we have

\[
\frac{\partial \varphi}{\partial t}(t, X_0) = F(\varphi(t, X_0)).
\]

We also have

\[
\frac{\partial \varphi}{\partial X}(t, X_0) = D\varphi_t(X_0),
\]

where Dφ_t is the Jacobian of the function X → φ_t(X). To compute ∂φ/∂X, however, it appears that we need to know the solution through X_0 as well as the solutions through all nearby initial positions, since we need to compute the partial derivatives of the various components of φ_t. However, we can get around this difficulty by introducing the variational equation along the solution through X_0.


To accomplish this, we need to take another brief detour into the world of nonautonomous differential equations. Let A(t) be a family of n × n matrices that depends continuously on t. The system

X' = A(t)X

is a linear, nonautonomous system. We have an existence and uniqueness theorem for these types of equations:

Theorem. Let A(t) be a continuous family of n × n matrices defined for t ∈ [α, β]. Then the initial value problem

X' = A(t)X, X(t_0) = X_0

has a unique solution that is defined on the entire interval [α, β].

Note that there is some additional content to this theorem: We do not assume that the right-hand side is a C¹ function in t. Continuity of A(t) suffices to guarantee existence and uniqueness of solutions.

Example. Consider the first-order, linear, nonautonomous differential equation

x' = a(t)x.

The unique solution of this equation satisfying x(0) = x_0 is given by

\[
x(t) = x_0 \exp\left(\int_0^t a(s)\, ds\right),
\]

as is easily checked using the methods of Chapter 1. All we need is that a(t) is continuous in order that x'(t) = a(t)x(t); we do not need differentiability of a(t) for this to be true.

Note that solutions of linear, nonautonomous equations satisfy the linearity principle. That is, if Y(t) and Z(t) are two solutions of such a system, then so too is αY(t) + βZ(t) for any constants α and β.

Now we return to the autonomous, nonlinear system X' = F(X). Let X(t) be a particular solution of this system defined for t in some interval J = [α, β]. Fix t_0 ∈ J and set X(t_0) = X_0. For each t ∈ J, let

A(t) = DF_{X(t)},

where DF_{X(t)} denotes the Jacobian matrix of F at the point X(t) ∈ R^n. Since F is C¹, A(t) = DF_{X(t)} is a continuous family of n × n matrices.


Consider the nonautonomous linear equation

U' = A(t)U.

This equation is known as the variational equation along the solution X(t). By the previous theorem, we know that this variational equation has a solution defined on all of J for every initial condition U(t_0) = U_0.

The significance of this equation is that, if U(t) is the solution of the variational equation that satisfies U(t_0) = U_0, then the function

t → X(t) + U(t)

is a good approximation to the solution Y(t) of the autonomous equation with initial value Y(t_0) = X_0 + U_0, provided U_0 is sufficiently small. This is the content of the following result.

Proposition. Consider the system X' = F(X) where F is C¹. Suppose

1. X(t) is a solution of X' = F(X) which is defined for all t ∈ [α, β] and satisfies X(t_0) = X_0;
2. U(t) is the solution to the variational equation along X(t) that satisfies U(t_0) = U_0;
3. Y(t) is the solution of X' = F(X) that satisfies Y(t_0) = X_0 + U_0.

Then

\[
\lim_{U_0 \to 0} \frac{|Y(t) - (X(t) + U(t))|}{|U_0|}
\]

converges to 0 uniformly in t ∈ [α, β].

Technically, this means that for every ε > 0 there exists δ > 0 such that if |U_0| ≤ δ, then

|Y(t) − (X(t) + U(t))| ≤ ε|U_0|

for all t ∈ [α, β]. Thus, as U_0 → 0, the curve t → X(t) + U(t) is a better and better approximation to Y(t). In many applications, the solution of the variational equation X(t) + U(t) is used in place of Y(t); this is convenient because U(t) depends linearly on U_0 by the linearity principle.

Example. Consider the nonlinear system of equations

\[
\begin{aligned}
x' &= x + y^2 \\
y' &= -y.
\end{aligned}
\]

We will discuss this system in more detail in the next chapter. For now, note that we know one solution of this system explicitly, namely, the equilibrium


solution at the origin, X(t) ≡ (0, 0). The variational equation along this solution is given by

\[
U' = DF_0(U) = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} U,
\]

which is an autonomous linear system. We obtain the solutions of this equation immediately: They are given by

\[
U(t) = \begin{pmatrix} x_0 e^t \\ y_0 e^{-t} \end{pmatrix}.
\]

The result above then guarantees that the solution of the nonlinear equation through (x_0, y_0) and defined on the interval [−τ, τ] is as close as we wish to U(t), provided (x_0, y_0) is sufficiently close to the origin.

Note that the arguments in this example are perfectly general. Given any nonlinear system of differential equations X' = F(X) with an equilibrium point at X_0, we may consider the variational equation along this solution. But DF_{X_0} is a constant matrix A. The variational equation is then U' = AU, which is an autonomous linear system. This system is called the linearized system at X_0. We know that the flow of the linearized system is exp(tA)U_0, so the result above says that, near an equilibrium point of a nonlinear system, the phase portrait resembles that of the corresponding linearized system. We will make the term resembles more precise in the next chapter.

Using the previous proposition, we may now compute ∂φ/∂X, assuming we know the solution X(t):

Theorem. Let X' = F(X) be a system of differential equations where F is C¹. Let X(t) be a solution of this system satisfying the initial condition X(0) = X_0 and defined for t ∈ [α, β], and let U(t, U_0) be the solution to the variational equation along X(t) that satisfies U(0, U_0) = U_0. Then

\[
D\varphi_t(X_0)\, U_0 = U(t, U_0).
\]

That is, ∂φ/∂X applied to U_0 is given by solving the corresponding variational equation starting at U_0.

Proof: Using the proposition, we have for all t ∈ [α, β]:

\[
D\varphi_t(X_0)\, U_0 = \lim_{h\to 0} \frac{\varphi_t(X_0 + h U_0) - \varphi_t(X_0)}{h} = \lim_{h\to 0} \frac{U(t, h U_0)}{h} = U(t, U_0). \qquad\Box
\]
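The theorem suggests a practical recipe: integrate the variational equation alongside the solution itself. A minimal sketch (ours, assuming SciPy) for the system x' = x + y², y' = −y above, compared against a finite-difference derivative of the flow:

```python
import numpy as np
from scipy.integrate import solve_ivp

F = lambda t, X: np.array([X[0] + X[1]**2, -X[1]])
DF = lambda X: np.array([[1.0, 2.0 * X[1]],
                         [0.0, -1.0]])       # Jacobian of F

X0 = np.array([0.1, 0.2])
U0 = np.array([1.0, 0.0])
t1 = 1.5

def flow(X0):
    return solve_ivp(F, (0.0, t1), X0, rtol=1e-12, atol=1e-14).y[:, -1]

def coupled(t, Z):
    # Integrate X and U together: U' = DF_{X(t)} U.
    X, U = Z[:2], Z[2:]
    return np.concatenate([F(t, X), DF(X) @ U])

U_t = solve_ivp(coupled, (0.0, t1), np.concatenate([X0, U0]),
                rtol=1e-12, atol=1e-14).y[2:, -1]

h = 1e-6
finite_diff = (flow(X0 + h * U0) - flow(X0)) / h
print(U_t, finite_diff)   # agree up to the finite-difference error
```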


Example. As an illustration of these ideas, consider the differential equation x' = x². An easy integration shows that the solution x(t) satisfying the initial condition x(0) = x_0 is

\[
x(t) = \frac{-x_0}{x_0 t - 1}.
\]

Hence we have

\[
\frac{\partial \varphi}{\partial x}(t, x_0) = \frac{1}{(x_0 t - 1)^2}.
\]

On the other hand, the variational equation for x(t) is

\[
u' = 2x(t)\, u = \frac{-2x_0}{x_0 t - 1}\, u.
\]

The solution of this equation satisfying the initial condition u(0) = u_0 is given by

\[
u(t) = u_0 \left(\frac{1}{x_0 t - 1}\right)^2,
\]

as required.

7.5 Exploration: Numerical Methods

In this exploration, we first describe three different methods for approximating the solutions of first-order differential equations. Your task will be to evaluate the effectiveness of each method.

Each of these methods involves an iterative process whereby we find a sequence of points (t_k, x_k) that approximates selected points (t_k, x(t_k)) along the graph of a solution of the first-order differential equation x' = f(t, x). In each case we begin with an initial value x(0) = x_0, so t_0 = 0 and x_0 is our given initial value. We need to produce t_k and x_k.

In each of the three methods we generate the t_k recursively by choosing a step size Δt and simply incrementing t_k at each stage by Δt. Hence

t_{k+1} = t_k + Δt

in each case. Choosing Δt small will (hopefully) improve the accuracy of the method.

Therefore, to describe each method, we only need to determine the values of x_k. In each case, x_{k+1} will be the x-coordinate of the point that sits directly


over t_{k+1} on a certain straight line through (t_k, x_k) in the tx–plane. Thus all we need to do is to provide you with the slope of this straight line, and then x_{k+1} is determined. Each of the three methods involves a different straight line.

1. Euler's method. Here x_{k+1} is generated by moving Δt time units along the straight line generated by the slope field at the point (t_k, x_k). Since the slope at this point is f(t_k, x_k), taking this short step puts us at

x_{k+1} = x_k + f(t_k, x_k) Δt.

2. Improved Euler's method. In this method, we use the average of two slopes to move from (t_k, x_k) to (t_{k+1}, x_{k+1}). The first slope is just that of the slope field at (t_k, x_k), namely

m_k = f(t_k, x_k).

The second is the slope of the slope field at the point (t_{k+1}, y_k), where y_k is the terminal point determined by Euler's method applied at (t_k, x_k). That is,

n_k = f(t_{k+1}, y_k) where y_k = x_k + f(t_k, x_k) Δt.

Then we have

x_{k+1} = x_k + ((m_k + n_k)/2) Δt.

3. (Fourth-order) Runge-Kutta method. This method is the one most often used to solve differential equations. There are more sophisticated numerical methods that are specifically designed for special situations, but this method has served as a general-purpose solver for decades. In this method, we determine four slopes, m_k, n_k, p_k, and q_k (a short implementation of all three methods follows the list). The step from (t_k, x_k) to (t_{k+1}, x_{k+1}) is given by moving along a straight line whose slope is a weighted average of these four values:

x_{k+1} = x_k + ((m_k + 2n_k + 2p_k + q_k)/6) Δt.

These slopes are determined as follows:

(a) m_k is given as in Euler's method:

m_k = f(t_k, x_k).


(b) n_k is the slope at the point obtained by moving halfway along the slope field line at (t_k, x_k) to the intermediate point (t_k + Δt/2, y_k), so that

n_k = f(t_k + Δt/2, y_k) where y_k = x_k + m_k Δt/2.

(c) p_k is the slope at the point obtained by moving halfway along a different straight line at (t_k, x_k), where the slope is now n_k rather than m_k as before. Hence

p_k = f(t_k + Δt/2, z_k) where z_k = x_k + n_k Δt/2.

(d) Finally, q_k is the slope at the point (t_{k+1}, w_k), where we use a line with slope p_k at (t_k, x_k) to determine this point. Hence

q_k = f(t_{k+1}, w_k) where w_k = x_k + p_k Δt.
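Collecting the formulas above, here is a minimal sketch (ours) of all three methods in Python; the final lines run part 2 of the exploration below for x' = x:

```python
def euler_step(f, t, x, dt):
    return x + f(t, x) * dt

def improved_euler_step(f, t, x, dt):
    m = f(t, x)                      # slope at the left endpoint
    n = f(t + dt, x + m * dt)        # slope at the Euler terminal point
    return x + 0.5 * (m + n) * dt

def rk4_step(f, t, x, dt):
    m = f(t, x)
    n = f(t + dt / 2, x + m * dt / 2)
    p = f(t + dt / 2, x + n * dt / 2)
    q = f(t + dt, x + p * dt)
    return x + (m + 2 * n + 2 * p + q) / 6 * dt

def solve(step, f, x0, t_final, dt):
    t, x = 0.0, x0
    while t < t_final - 1e-12:
        x = step(f, t, x, dt)
        t += dt
    return x

# Approximate x(1) = e for x' = x, x(0) = 1, with step size 0.1.
f = lambda t, x: x
for step in (euler_step, improved_euler_step, rk4_step):
    print(step.__name__, solve(step, f, 1.0, 1.0, 0.1))
```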


Your goal in this exploration is to compare the effectiveness of these three methods by evaluating the errors made in carrying out this procedure in several examples. We suggest that you use a spreadsheet to make these lengthy calculations.

1. First, just to make sure you comprehend the rather terse descriptions of the three methods above, draw a picture in the tx–plane that illustrates the process of moving from (t_k, x_k) to (t_{k+1}, x_{k+1}) in each of the three cases.

2. Now let's investigate how the various methods work when applied to an especially simple differential equation, x' = x.

(a) First find the explicit solution x(t) of this equation satisfying the initial condition x(0) = 1 (now there's a free gift from the math department...).
(b) Now use Euler's method to approximate the value of x(1) = e using the step size Δt = 0.1. That is, recursively determine t_k and x_k for k = 1, ..., 10 using Δt = 0.1 and starting with t_0 = 0 and x_0 = 1.
(c) Repeat the previous step with Δt half the size, namely 0.05.
(d) Again use Euler's method, this time reducing the step size by a factor of 5, so that Δt = 0.01, to approximate x(1).
(e) Now repeat the previous three steps using the improved Euler's method with the same step sizes.
(f) And now repeat using Runge-Kutta.
(g) You now have nine different approximations for the value of x(1) = e, three for each method. Now calculate the error in each case. For the record, use the value e = 2.71828182845904523536... in calculating the error.
(h) Finally, calculate how the error changes as you change the step size from 0.1 to 0.05 and then from 0.05 to 0.01. That is, if ρ_Δ denotes the error made using step size Δ, compute both ρ_{0.1}/ρ_{0.05} and ρ_{0.05}/ρ_{0.01}.

3. Now repeat the previous exploration, this time for the nonautonomous equation x' = 2t(1 + x²). Use the value tan 1 = 1.557407724654....

4. Discuss how the errors change as you shorten the step size by a factor of 2 or a factor of 5. Why, in particular, is the Runge-Kutta method called a "fourth-order" method?

EXERCISES

1. Write out the first few terms of the Picard iteration scheme for each of the following initial value problems. Where possible, find explicit solutions and describe the domain of this solution.

(a) x' = x + 2; x(0) = 2
(b) x' = x^{4/3}; x(0) = 0
(c) x' = x^{4/3}; x(0) = 1
(d) x' = cos x; x(0) = 0
(e) x' = 1/(2x); x(1) = 1

2. Let A be an n × n matrix. Show that the Picard method for solving X' = AX, X(0) = X_0 gives the solution exp(tA)X_0.

3. Derive the Taylor series for sin 2t by applying the Picard method to the first-order system corresponding to the second-order initial value problem

x'' = −4x; x(0) = 0, x'(0) = 2.

4. Verify the linearity principle for linear, nonautonomous systems of differential equations.

5. Consider the first-order equation x' = x/t. What can you say about "solutions" that satisfy x(0) = 0? x(0) = a ≠ 0?

6. Discuss the existence and uniqueness of solutions of the equation x' = x^a, where a > 0 and x(0) = 0.

7. Let A(t) be a continuous family of n × n matrices and let P(t) be the matrix solution to the initial value problem P' = A(t)P, P(0) = P_0. Show that

\[
\det P(t) = (\det P_0) \exp\left(\int_0^t \operatorname{Tr} A(s)\, ds\right).
\]


8. Construct an example of a first-order differential equation on R for which there are no solutions to any initial value problem.

9. Construct an example of a differential equation depending on a parameter a for which some solutions do not depend continuously on a.




8 Equilibria in Nonlinear Systems

To avoid some of the technicalities that we encountered in the previous chapter, we henceforth assume that our differential equations are C^∞, except when specifically noted. This means that the right-hand side of the differential equation is k times continuously differentiable for all k. This will at the very least allow us to keep the number of hypotheses in our theorems to a minimum.

As we have seen, it is often impossible to write down explicit solutions of nonlinear systems of differential equations. The one exception to this occurs when we have equilibrium solutions. Provided we can solve the algebraic equations, we can write down the equilibria explicitly. Often, these are the most important solutions of a particular nonlinear system. More importantly, given our extended work on linear systems, we can usually use the technique of linearization to determine the behavior of solutions near equilibrium points. We describe this process in detail in this chapter.

8.1 Some Illustrative Examples

In this section we consider several planar nonlinear systems of differential equations. Each will have an equilibrium point at the origin. Our goal is to see that the solutions of the nonlinear system near the origin resemble those of the linearized system, at least in certain cases.


As a first example, consider the system

\[
\begin{aligned}
x' &= x + y^2 \\
y' &= -y.
\end{aligned}
\]

There is a single equilibrium point at the origin. To picture nearby solutions, we note that, when y is small, y² is much smaller. Hence, near the origin at least, the differential equation x' = x + y² is very close to x' = x. In Section 7.4 in Chapter 7 we showed that the flow of this system near the origin is also "close" to that of the linearized system X' = DF_0 X. This suggests that we consider instead the linearized equation

\[
\begin{aligned}
x' &= x \\
y' &= -y,
\end{aligned}
\]

derived by simply dropping the higher order term. We can, of course, solve this system immediately. We have a saddle at the origin with a stable line along the y-axis and an unstable line along the x-axis.

Now let's go back to the original nonlinear system. Luckily, we can also solve this system explicitly. The second equation y' = −y yields y(t) = y_0 e^{−t}. Inserting this into the first equation, we must solve

\[
x' = x + y_0^2 e^{-2t}.
\]

This is a first-order, nonautonomous equation whose solutions may be determined as in calculus by "guessing" a particular solution of the form ce^{−2t}. Inserting this guess into the equation yields a particular solution:

\[
x(t) = -\tfrac{1}{3} y_0^2 e^{-2t}.
\]

Hence any function of the form

\[
x(t) = c e^t - \tfrac{1}{3} y_0^2 e^{-2t}
\]

is a solution of this equation, as is easily checked. The general solution is then

\[
\begin{aligned}
x(t) &= \left(x_0 + \tfrac{1}{3} y_0^2\right) e^t - \tfrac{1}{3} y_0^2 e^{-2t} \\
y(t) &= y_0 e^{-t}.
\end{aligned}
\]

If y_0 = 0, we find a straight-line solution x(t) = x_0 e^t, y(t) = 0, just as in the linear case. However, unlike the linear case, the y-axis is no longer home


to a solution that tends to the origin. Indeed, the vector field along the y-axis is given by (y², −y), which is not tangent to the axis; rather, all nonzero vectors point to the right along this axis.

On the other hand, there is a curve through the origin on which solutions tend to (0, 0). Consider the curve x + ⅓y² = 0 in R². Suppose (x_0, y_0) lies on this curve, and let (x(t), y(t)) be the solution satisfying this initial condition. Since x_0 + ⅓y_0² = 0, this solution becomes

\[
\begin{aligned}
x(t) &= -\tfrac{1}{3} y_0^2 e^{-2t} \\
y(t) &= y_0 e^{-t}.
\end{aligned}
\]

Note that we have x(t) + ⅓(y(t))² = 0 for all t, so this solution remains for all time on this curve. Moreover, as t → ∞, this solution tends to the equilibrium point. That is, we have found a stable curve through the origin on which all solutions tend to (0, 0). Note that this curve is tangent to the y-axis at the origin. See Figure 8.1.

Figure 8.1 The phase plane for x' = x + y², y' = −y. Note the stable curve tangent to the y-axis.
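The invariance of the stable curve is easy to confirm numerically. A minimal sketch (ours, assuming SciPy; the starting point is an arbitrary sample on the curve):

```python
import numpy as np
from scipy.integrate import solve_ivp

F = lambda t, X: np.array([X[0] + X[1]**2, -X[1]])

y0 = 1.5
X0 = np.array([-y0**2 / 3.0, y0])      # a point on x + y^2/3 = 0
sol = solve_ivp(F, (0.0, 8.0), X0, rtol=1e-11, atol=1e-13,
                t_eval=np.linspace(0.0, 8.0, 5))
for x, y in sol.y.T:
    print(x + y**2 / 3.0, np.hypot(x, y))
# The first column stays ~0 (the curve is invariant); the second
# decreases toward 0 (the solution tends to the equilibrium).
```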


Can we just "drop" the nonlinear terms in a system? The answer is, as we shall see below, it depends! In this case, however, doing so is perfectly legal, for we can find a change of variables that actually converts the original system to the linear system.

To see this, we introduce new variables u and v via

\[
\begin{aligned}
u &= x + \tfrac{1}{3} y^2 \\
v &= y.
\end{aligned}
\]

Then, in these new coordinates, the system becomes

\[
\begin{aligned}
u' &= x' + \tfrac{2}{3} y y' = x + \tfrac{1}{3} y^2 = u \\
v' &= y' = -y = -v.
\end{aligned}
\]

That is to say, the nonlinear change of variables F(x, y) = (x + ⅓y², y) converts the original nonlinear system to a linear one; in fact, to the linearized system above.

Example. In general, it is impossible to convert a nonlinear system to a linear one as in the previous example, since the nonlinear terms almost always make huge changes in the system far from the equilibrium point at the origin. For example, consider the nonlinear system

\[
\begin{aligned}
x' &= \tfrac{1}{2} x - y - \tfrac{1}{2}\bigl(x^3 + y^2 x\bigr) \\
y' &= x + \tfrac{1}{2} y - \tfrac{1}{2}\bigl(y^3 + x^2 y\bigr).
\end{aligned}
\]

Again we have an equilibrium point at the origin. The linearized system is now

\[
X' = \begin{pmatrix} \tfrac{1}{2} & -1 \\ 1 & \tfrac{1}{2} \end{pmatrix} X,
\]

which has eigenvalues ½ ± i. All solutions of this system spiral away from the origin and toward ∞ in the counterclockwise direction, as is easily checked.

Solving the nonlinear system looks formidable. However, if we change to polar coordinates, the equations become much simpler. We compute

\[
\begin{aligned}
r' \cos\theta - r(\sin\theta)\,\theta' &= x' = \tfrac{1}{2}\bigl(r - r^3\bigr)\cos\theta - r\sin\theta \\
r' \sin\theta + r(\cos\theta)\,\theta' &= y' = \tfrac{1}{2}\bigl(r - r^3\bigr)\sin\theta + r\cos\theta,
\end{aligned}
\]

from which we conclude, after equating the coefficients of cos θ and sin θ,

\[
\begin{aligned}
r' &= r(1 - r^2)/2 \\
\theta' &= 1.
\end{aligned}
\]

We can now solve this system explicitly, since the equations are decoupled. Rather than do this, we will proceed in a more geometric fashion. From the equation θ' = 1, we conclude that all nonzero solutions spiral around the


origin in the counterclockwise direction. From the first equation, we see that solutions do not spiral toward ∞. Indeed, we have r' = 0 when r = 1, so all solutions that start on the unit circle stay there forever and move periodically around the circle. Since r' > 0 when 0 < r < 1, nonzero solutions that start inside the unit circle spiral away from the origin and toward the unit circle, and since r' < 0 when r > 1, solutions that start outside the unit circle spiral toward it. The phase plane is shown in Figure 8.2.

Figure 8.2 The phase plane for r' = ½(r − r³), θ' = 1.

Note that no solution of the linearized system behaves in this fashion: every nonzero solution of the linear system spirals off to ∞, so there is no global change of coordinates converting the nonlinear system to the linear one. Near the origin, however, such a conversion is possible. Let φ_t denote the flow of the nonlinear system, let ψ_t denote the flow of the linearized system, and let D denote the open unit disk r < 1. Fix r_0 with 0 < r_0 < 1. Since r' > 0 on 0 < r < 1 for both systems, each nonzero solution of either system crosses the circle r = r_0 exactly once. Hence, for each nonzero point (r, θ) in D, there is a unique time


t = t(r, θ) for which φ_t(r, θ) belongs to the circle r = r_0. We then define the function

\[
h(r, \theta) = \psi_{-t}\, \varphi_t(r, \theta),
\]

where t = t(r, θ). We stipulate also that h takes the origin to the origin. Then it is straightforward to check that

\[
h \circ \varphi_s(r, \theta) = \psi_s \circ h(r, \theta)
\]

for any point (r, θ) in D, so that h gives a conjugacy between the nonlinear and linear systems. It is also easy to check that h takes D onto all of R².

Thus we see that, while it may not always be possible to linearize a system globally, we may sometimes accomplish this locally. Unfortunately, such a local linearization may not provide any useful information about the behavior of solutions.

Example. Now consider the system

\[
\begin{aligned}
x' &= -y + \epsilon x (x^2 + y^2) \\
y' &= x + \epsilon y (x^2 + y^2).
\end{aligned}
\]

Here ε is a parameter that we may take to be either positive or negative.

The linearized system is

\[
\begin{aligned}
x' &= -y \\
y' &= x,
\end{aligned}
\]

so we see that the origin is a center and all solutions travel in the counterclockwise direction around circles centered at the origin with unit angular speed.

This is hardly the case for the nonlinear system. In polar coordinates, this system reduces to

\[
\begin{aligned}
r' &= \epsilon r^3 \\
\theta' &= 1.
\end{aligned}
\]

Thus when ε > 0, all solutions spiral away from the origin, whereas when ε < 0, all solutions spiral toward the origin. The addition of the nonlinear terms, no matter how small near the origin, changes the linearized phase portrait dramatically; we cannot use linearization to determine the behavior of this system near the equilibrium point.
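A short computation makes the dependence on the sign of ε vivid. A minimal sketch (ours, assuming SciPy; the values ε = ±0.2 and the initial condition are arbitrary samples):

```python
import numpy as np
from scipy.integrate import solve_ivp

def make_rhs(eps):
    def rhs(t, X):
        x, y = X
        r2 = x * x + y * y
        return [-y + eps * x * r2, x + eps * y * r2]
    return rhs

X0 = [0.5, 0.0]
for eps in (0.2, -0.2):
    sol = solve_ivp(make_rhs(eps), (0.0, 5.0), X0, rtol=1e-10)
    print(eps, np.hypot(*sol.y[:, -1]))
# The radius grows for eps > 0 and decays for eps < 0, even though the
# linearization is a center in both cases.
```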


8.2 Nonlinear Sinks and Sources 165Figure 8.3 The phase planefor x ′ = x 2 , y ′ = −y.Example. Now consider one final example:x ′ = x 2y ′ =−y.The only <strong>eq</strong>uilibrium solution for this system is the origin. All other solutions(except those on the y-axis) move to the right and toward the x-axis. On they-axis, solutions tend along this straight line to the origin. Hence the phaseportrait is as shown in Figure 8.3.Note that the picture in Figure 8.3 is quite different from the correspondingpicture for the linearized systemx ′ = 0y ′ =−y,for which all points on the x-axis are <strong>eq</strong>uilibrium points, and all other solutionslie on vertical lines x = constant.The problem here, as in the previous example, is that the <strong>eq</strong>uilibrium pointfor the linearized system at the origin is not hyperbolic. When a linear planarsystem has a zero eigenvalue or a center, the addition of nonlinear terms oftencompletely changes the phase portrait.8.2 Nonlinear Sinks and SourcesAs we saw in the examples of the previous section, solutions of planar nonlinearsystems near <strong>eq</strong>uilibrium points resemble those of their linear parts only in


As we saw in the examples of the previous section, solutions of planar nonlinear systems near equilibrium points resemble those of their linear parts only in the case where the linearized system is hyperbolic; that is, when neither of the eigenvalues of the system has zero real part. In this section we begin to describe the situation in the general case of a hyperbolic equilibrium point in a nonlinear system by considering the special case of a sink. For simplicity, we will prove the results below in the planar case, although all of the results hold in R^n.

Let X' = F(X) and suppose that F(X_0) = 0. Let DF_{X_0} denote the Jacobian matrix of F evaluated at X_0. Then, as in Chapter 7, the linear system of differential equations

Y' = DF_{X_0} Y

is called the linearized system near X_0. Note that, if X_0 = 0, the linearized system is obtained by simply dropping all of the nonlinear terms in F, just as we did in the previous section.

In analogy with our work with linear systems, we say that an equilibrium point X_0 of a nonlinear system is hyperbolic if all of the eigenvalues of DF_{X_0} have nonzero real parts.

We now specialize the discussion to the case of an equilibrium of a planar system for which the linearized system has a sink at 0. Suppose our system is

x' = f(x, y)
y' = g(x, y)

with f(x_0, y_0) = 0 = g(x_0, y_0). If we make the change of coordinates u = x − x_0, v = y − y_0, then the new system has an equilibrium point at (0, 0). Hence we may as well assume that x_0 = y_0 = 0 at the outset. We then make a further linear change of coordinates that puts the linearized system in canonical form. For simplicity, let us assume at first that the linearized system has distinct eigenvalues −λ < −μ < 0. Thus after these changes of coordinates, our system becomes

x' = −λx + h_1(x, y)
y' = −μy + h_2(x, y),

where h_j = h_j(x, y) contains all of the "higher order terms." That is, in terms of its Taylor expansion, each h_j contains terms that are quadratic or higher order in x and/or y. Equivalently, we have

lim_{(x,y)→(0,0)} h_j(x, y)/r = 0,

where r^2 = x^2 + y^2.
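In practice, the hyperbolicity of an equilibrium can be checked numerically. The sketch below (illustrative only; the finite-difference step h and the test system x' = x + y^2, y' = −y are our own choices, not the text's) approximates the Jacobian DF at an equilibrium and inspects the real parts of its eigenvalues.

import numpy as np

def jacobian(F, X0, h=1e-6):
    # Forward-difference approximation of the Jacobian DF at X0.
    X0 = np.asarray(X0, dtype=float)
    F0 = np.asarray(F(X0))
    J = np.zeros((len(F0), len(X0)))
    for j in range(len(X0)):
        step = np.zeros(len(X0))
        step[j] = h
        J[:, j] = (np.asarray(F(X0 + step)) - F0) / h
    return J

# Test system (our choice): x' = x + y^2, y' = -y, equilibrium at (0, 0).
F = lambda X: np.array([X[0] + X[1]**2, -X[1]])
eigs = np.linalg.eigvals(jacobian(F, [0.0, 0.0]))
print("eigenvalues:", eigs)                               # roughly 1 and -1
print("hyperbolic:", bool(np.all(np.abs(eigs.real) > 1e-8)))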


The linearized system is now given by

x' = −λx
y' = −μy.

For this linearized system, recall that the vector field always points inside the circle of radius r centered at the origin. Indeed, if we take the dot product of the linear vector field with the outward normal vector (x, y), we find

(−λx, −μy) · (x, y) = −λx^2 − μy^2 < 0

for any nonzero vector (x, y). As we saw in Chapter 4, this forces all solutions to tend to the origin with strictly decreasing radial components.

The same thing happens for the nonlinear system, at least close to (0, 0). Let q(x, y) denote the dot product of the vector field with (x, y). We have

q(x, y) = (−λx + h_1(x, y), −μy + h_2(x, y)) · (x, y)
        = −λx^2 + x h_1(x, y) − μy^2 + y h_2(x, y)
        = −μ(x^2 + y^2) + (μ − λ)x^2 + x h_1(x, y) + y h_2(x, y)
        ≤ −μr^2 + x h_1(x, y) + y h_2(x, y),

since (μ − λ)x^2 ≤ 0. Therefore we have

q(x, y)/r^2 ≤ −μ + (x h_1(x, y) + y h_2(x, y))/r^2.

As r → 0, the right-hand side tends to −μ. Thus it follows that q(x, y) is negative, at least close to the origin. As a consequence, the nonlinear vector field points into the interior of circles of small radius about 0, and so all solutions whose initial conditions lie inside these circles must tend to the origin. Thus we are justified in calling this type of equilibrium point a sink, just as in the linear case.

It is straightforward to check that the same result holds if the linearized system has eigenvalues α + iβ with α < 0, β ≠ 0. In the case of repeated negative eigenvalues, we first need to change coordinates so that the linearized system is

x' = −λx + ɛy
y' = −λy,

where ɛ is sufficiently small. We showed how to do this in Chapter 4. Then again the vector field points inside circles of sufficiently small radius.


We can now conjugate the flow of a nonlinear system near a hyperbolic equilibrium point that is a sink to the flow of its linearized system. Indeed, the argument used in the second example of the previous section goes over essentially unchanged. In similar fashion, nonlinear systems near a hyperbolic source are also conjugate to the corresponding linearized system.

This result is a special case of the following more general theorem.

The Linearization Theorem. Suppose the n-dimensional system X' = F(X) has an equilibrium point at X_0 that is hyperbolic. Then the nonlinear flow is conjugate to the flow of the linearized system in a neighborhood of X_0.

We will not prove this theorem here, since the proof requires analytical techniques beyond the scope of this book when there are eigenvalues present with both positive and negative real parts.

8.3 Saddles

We turn now to the case of an equilibrium for which the linearized system has a saddle at the origin in R^2. As in the previous section, we may assume that this system is in the form

x' = λx + f_1(x, y)
y' = −μy + f_2(x, y),

where −μ < 0 < λ and f_j(x, y)/r tends to 0 as r → 0. As in the case of a linear system, we call this type of equilibrium point a saddle.

For the linearized system, the y-axis serves as the stable line, with all solutions on this line tending to 0 as t → ∞. Similarly, the x-axis is the unstable line. As we saw in Section 8.1, we cannot expect these stable and unstable straight lines to persist in the nonlinear case. However, there do exist a pair of curves through the origin that have similar properties.

Let W^s(0) denote the set of initial conditions whose solutions tend to the origin as t → ∞. Let W^u(0) denote the set of points whose solutions tend to the origin as t → −∞. W^s(0) and W^u(0) are called the stable curve and unstable curve, respectively.

The following theorem shows that solutions near nonlinear saddles behave much the same as in the linear case.


The Stable Curve Theorem. Suppose the system

x' = λx + f_1(x, y)
y' = −μy + f_2(x, y)

satisfies −μ < 0 < λ and f_j(x, y)/r → 0 as r → 0. Then there is an ɛ > 0 and a curve x = h^s(y) that is defined for |y| < ɛ and satisfies h^s(0) = 0. Furthermore:

1. All solutions whose initial conditions lie on this curve remain on this curve for all t ≥ 0 and tend to the origin as t → ∞;
2. The curve x = h^s(y) passes through the origin tangent to the y-axis;
3. All other solutions whose initial conditions lie in the disk of radius ɛ centered at the origin leave this disk as time increases.

Some remarks are in order. The curve x = h^s(y) is called the local stable curve at 0. We can find the complete stable curve W^s(0) by following solutions that lie on the local stable curve backward in time. The function h^s(y) is actually C^∞ at all points, though we will not prove this result here.

A similar unstable curve theorem provides us with a local unstable curve of the form y = h^u(x). This curve is tangent to the x-axis at the origin. All solutions on this curve tend to the origin as t → −∞.

We begin with a brief sketch of the proof of the stable curve theorem. Consider the square bounded by the lines |x| = ɛ and |y| = ɛ for ɛ > 0 sufficiently small. The nonlinear vector field points into the square along the interior of the top and the bottom boundaries y = ±ɛ, since the system is close to the linear system x' = λx, y' = −μy, which clearly has this property. Similarly, the vector field points outside the square along the left and right boundaries x = ±ɛ.

Now consider the initial conditions that lie along the top boundary y = ɛ. Some of these solutions will exit the square to the left, while others will exit to the right. Solutions cannot do both, so these sets are disjoint. Moreover, these sets are open. So there must be some initial conditions whose solutions do not exit at all. We will show first of all that each of these nonexiting solutions tends to the origin. Secondly, we will show that there is only one initial condition on the top and bottom boundary whose solution behaves in this way. Finally we will show that this solution lies along some graph of the form x = h^s(y), which has the required properties.

Now we fill in the details of the proof. Let B_ɛ denote the square bounded by x = ±ɛ and y = ±ɛ. Let S_ɛ^± denote the top and bottom boundaries of B_ɛ. Let C_M denote the conical region given by |y| ≥ M|x| inside B_ɛ. Here we think of the slopes ±M of the boundary of C_M as being large in absolute value. See Figure 8.4.

Lemma. Given M > 0, there exists ɛ > 0 such that the vector field points outside C_M for points on the boundary of C_M ∩ B_ɛ (except, of course, at the origin).


Figure 8.4 The cone C_M.

Proof: Given M, choose ɛ > 0 so that

|f_1(x, y)| ≤ (λ/(2√(M^2 + 1))) r

for all (x, y) ∈ B_ɛ. Now suppose x > 0. Then along the right boundary of C_M we have

x' = λx + f_1(x, Mx)
   ≥ λx − |f_1(x, Mx)|
   ≥ λx − (λ/(2√(M^2 + 1))) r
   = λx − (λ/(2√(M^2 + 1))) (x √(M^2 + 1))
   = (λ/2) x > 0.

Hence x' > 0 on this side of the boundary of the cone.

Similarly, if y > 0, we may choose ɛ > 0 smaller if necessary so that we have y' < 0 on the edges of C_M where y > 0. Indeed, choosing ɛ so that

|f_2(x, y)| ≤ (μ/(2√(M^2 + 1))) r

guarantees this exactly as above. Hence on the edge of C_M that lies in the first quadrant, we have shown that the vector field points down and to the right and hence out of C_M. Similar calculations show that the vector field points outside C_M on all other edges of C_M. This proves the lemma.

It follows from this lemma that there is a set of initial conditions in S_ɛ^± ∩ C_M whose solutions eventually exit from C_M to the right, and another set in S_ɛ^± ∩ C_M whose solutions exit left. These sets are open because of continuity of solutions with respect to initial conditions (see Section 7.3). We next show that each of these sets is actually a single open interval.


Let C_M^+ denote the portion of C_M lying above the x-axis, and let C_M^− denote the portion lying below this axis.

Lemma. Suppose M > 1. Then there is an ɛ > 0 such that y' < 0 in C_M^+ and y' > 0 in C_M^−.

Proof: In C_M^+ we have |Mx| ≤ y, so that

r^2 ≤ y^2/M^2 + y^2

or

r ≤ (y/M) √(1 + M^2).

As in the previous lemma, we choose ɛ so that

|f_2(x, y)| ≤ (μ/(2√(M^2 + 1))) r

for all (x, y) ∈ B_ɛ. We then have in C_M^+

y' ≤ −μy + |f_2(x, y)|
   ≤ −μy + (μ/(2√(M^2 + 1))) r
   ≤ −μy + (μ/(2M)) y
   ≤ −(μ/2) y,

since M > 1. This proves the result for C_M^+; the proof for C_M^− is similar.

From this result we see that solutions that begin on S_ɛ^+ ∩ C_M decrease in the y-direction while they remain in C_M^+. In particular, no solution can remain in C_M^+ for all time unless that solution tends to the origin. By the existence and uniqueness theorem, the set of points in S_ɛ^+ that exit to the right (or left) must then be a single open interval. The complement of these two intervals in S_ɛ^+ is therefore a nonempty closed interval on which solutions do not leave C_M and therefore tend to 0 as t → ∞. We have similar behavior in C_M^−.

We now claim that the interval of initial conditions in S_ɛ^± whose solutions tend to 0 is actually a single point. To see this, recall the variational equation for this system discussed in Section 7.4.


This is the nonautonomous linear system

U' = DF_{X(t)} U,

where X(t) is a given solution of the nonlinear system and

F(x, y) = (λx + f_1(x, y), −μy + f_2(x, y)).

We showed in Chapter 7 that, given τ > 0, if |U_0| is sufficiently small, then the solution of the variational equation U(t) that satisfies U(0) = U_0 and the solution Y(t) of the nonlinear system satisfying Y(0) = X_0 + U_0 remain close to each other for 0 ≤ t ≤ τ.

Now let X(t) = (x(t), y(t)) be one of the solutions of the system that never leaves C_M. Suppose x(0) = x_0 and y(0) = ɛ.

Lemma. Let U(t) = (u_1(t), u_2(t)) be any solution of the variational equation

U' = DF_{X(t)} U

that satisfies |u_2(0)/u_1(0)| ≤ 1. Then we may choose ɛ > 0 sufficiently small so that |u_2(t)| ≤ |u_1(t)| for all t ≥ 0.

Proof: The variational equation may be written

u_1' = λu_1 + f_11 u_1 + f_12 u_2
u_2' = −μu_2 + f_21 u_1 + f_22 u_2,

where the |f_ij(t)| may be made as small as we like by choosing ɛ smaller. Given any solution (u_1(t), u_2(t)) of this equation, let η = η(t) be the slope u_2(t)/u_1(t) of this solution. Using the variational equation and the quotient rule, we find that η satisfies

η' = α(t) − (μ + λ + β(t)) η + γ(t) η^2,

where α(t) = f_21(t), β(t) = f_11(t) − f_22(t), and γ(t) = −f_12(t). In particular, α, β, and γ are all small. Recall that μ + λ > 0, so that

−(μ + λ + β(t)) < 0.

It follows that, if η(t) > 0, there are constants a_1 and a_2 with 0 < a_1 < a_2 such that η'(t) ≤ a_1 − a_2 η(t) as long as 0 ≤ η(t) ≤ 1. In particular, if η(0) = 1, then η'(0) < 0, so η(t) decreases. Similarly,


if η(0) = −1, then η'(0) > 0, so η(t) increases. In particular, any solution that begins with |η(0)| < 1 can never satisfy η(t) = ±1. Since our solution satisfies |η(0)| ≤ 1, it follows that this solution is trapped in the region where |η(t)| < 1, and so we have |u_2(t)| ≤ |u_1(t)| for all t > 0. This proves the lemma.

To complete the proof of the theorem, suppose that we have a second solution Y(t) that lies in C_M for all time, so that Y(t) → 0 as t → ∞, and suppose that Y(0) = (x_0', ɛ) where x_0' ≠ x_0. Any solution with initial value (x, ɛ) with x between x_0 and x_0' is trapped between X(t) and Y(t), and so also lies in C_M for all time. For ɛ sufficiently small, the difference of two such nearby solutions is approximated by a solution U(t) = (u_1(t), u_2(t)) of the variational equation along X(t) with u_2(0) = 0, so that certainly |u_2(0)/u_1(0)| ≤ 1. By the lemma, |u_2(t)| ≤ |u_1(t)| for all t ≥ 0; feeding this estimate back into the equation for u_1 shows that |u_1(t)|, and hence |U(t)|, grows at an exponential rate of at least λ/2. This is incompatible with both solutions tending to the origin. Hence there is exactly one point on the top boundary (and one on the bottom boundary) whose solution tends to the origin. Applying the same argument on each horizontal segment |y| < ɛ yields, for each such y, a unique point x = h^s(y) whose solution tends to the origin. Finally, since these solutions lie in C_M near the origin for every large M (choosing ɛ accordingly), we have |h^s(y)| ≤ |y|/M for |y| small, so the curve passes through the origin tangent to the y-axis.


More generally, suppose the system X' = F(X) in R^n has a hyperbolic equilibrium point at the origin whose linearization has k eigenvalues with negative real parts and n − k eigenvalues with positive real parts. Rather than develop the general stable manifold theory, we simply note that this means there is a linear change of coordinates in which the local stable set is given near the origin by the graph of a C^∞ function g : B_r → R^{n−k} that satisfies g(0) = 0, and all partial derivatives of g vanish at the origin. Here B_r is the disk of radius r centered at the origin in R^k. The local unstable set is a similar graph over an (n − k)-dimensional disk. Each of these graphs is tangent at the equilibrium point to the stable and unstable subspaces at X_0. Hence they meet only at X_0.

Example. Consider the system

x' = −x
y' = −y
z' = z + x^2 + y^2.

The linearized system at the origin has eigenvalues 1 and −1 (repeated). The change of coordinates

u = x
v = y
w = z + (1/3)(x^2 + y^2)

converts the nonlinear system to the linear system

u' = −u
v' = −v
w' = w.

The plane w = 0 for the linear system is the stable plane. Under the change of coordinates this plane is transformed to the surface

z = −(1/3)(x^2 + y^2),

which is a paraboloid passing through the origin in R^3 and opening downward. All solutions tend to the origin on this surface; we call this the stable surface for the nonlinear system. See Figure 8.5.

Figure 8.5 The phase portrait for x' = −x, y' = −y, z' = z + x^2 + y^2.
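The local stable curve of a concrete saddle can also be traced numerically, by integrating backward in time from points near the origin on the stable direction. The Python sketch below (our illustration, not the text's; the planar test system x' = x + y^2, y' = −y is our choice, picked because its stable curve is known exactly, namely x = −y^2/3 via the substitution u = x + y^2/3) compares the numerically computed points with that formula.

from scipy.integrate import solve_ivp

# Test saddle (our choice): x' = x + y^2, y' = -y.  Here u = x + y^2/3
# linearizes the system, so the stable curve is exactly x = -y^2/3.
def f(t, u):
    x, y = u
    return [x + y**2, -y]

def backward(t, u):
    # Reversed time: trajectories leave the origin along the stable curve.
    return [-c for c in f(t, u)]

for y0 in (1e-4, -1e-4):
    sol = solve_ivp(backward, (0.0, 8.0), [0.0, y0], rtol=1e-10, atol=1e-14)
    x, y = sol.y[0, -1], sol.y[1, -1]
    print(f"W^s point ({x:+.5f}, {y:+.5f})   vs   -y^2/3 = {-y**2/3:+.5f}")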


8.4 Stability

The study of equilibria plays a central role in ordinary differential equations and their applications. An equilibrium point, however, must satisfy a certain stability criterion in order to be significant physically. (Here, as in several other places in this book, we use the word physical in a broad sense; in some contexts, physical could be replaced by biological, chemical, or even economic.)

An equilibrium is said to be stable if nearby solutions stay nearby for all future time. In applications of dynamical systems one cannot usually pinpoint positions exactly, but only approximately, so an equilibrium must be stable to be physically meaningful.

More precisely, suppose X* ∈ R^n is an equilibrium point for the differential equation

X' = F(X).

Then X* is a stable equilibrium if for every neighborhood O of X* in R^n there is a neighborhood O_1 of X* in O such that every solution X(t) with X(0) = X_0 in O_1 is defined and remains in O for all t > 0.

A different form of stability is asymptotic stability. If O_1 can be chosen above so that, in addition to the properties for stability, we have lim_{t→∞} X(t) = X*, then we say that X* is asymptotically stable.

An equilibrium X* that is not stable is called unstable. This means there is a neighborhood O of X* such that for every neighborhood O_1 of X* in O, there is at least one solution X(t) starting at X(0) ∈ O_1 that does not lie entirely in O for all t > 0.

A sink is asymptotically stable and therefore stable. Sources and saddles are examples of unstable equilibria. An example of an equilibrium that is stable but not asymptotically stable is the origin in R^2 for a linear equation X' = AX where A has pure imaginary eigenvalues. The importance of this example in applications is limited (despite the famed harmonic oscillator) because the slightest nonlinear perturbation will destroy its character, as we saw in Section 8.1. Even a small linear perturbation can make a center into a sink or a source.


Thus when the linearization of the system at an equilibrium point is hyperbolic, we can immediately determine the stability of that point. Unfortunately, many important equilibrium points that arise in applications are nonhyperbolic. It would be wonderful to have a technique that determined the stability of an equilibrium point that works in all cases. Unfortunately, as yet, we have no universal way of determining stability except by actually finding all solutions of the system, which is usually difficult if not impossible. We will present some techniques that allow us to determine stability in certain special cases in the next chapter.

8.5 Bifurcations

In this section we will describe some simple examples of bifurcations that occur for nonlinear systems. We consider a family of systems

X' = F_a(X),

where a is a real parameter. We assume that F_a depends on a in a C^∞ fashion. A bifurcation occurs when there is a "significant" change in the structure of the solutions of the system as a varies. The simplest types of bifurcations occur when the number of equilibrium solutions changes as a varies.

Recall the elementary bifurcations we encountered in Chapter 1 for first-order equations x' = f_a(x). If x_0 is an equilibrium point, then we have f_a(x_0) = 0. If f_a'(x_0) ≠ 0, then small changes in a do not change the local structure near x_0: that is, the differential equation

x' = f_{a+ɛ}(x)

has an equilibrium point x_0(ɛ) that varies continuously with ɛ for ɛ small. A glance at the (increasing or decreasing) graphs of f_{a+ɛ}(x) near x_0 shows why this is true. More rigorously, this is an immediate consequence of the implicit function theorem (see Exercise 3 at the end of this chapter). Thus bifurcations for first-order equations only occur in the nonhyperbolic case where f_a'(x_0) = 0.

Example. The first-order equation

x' = f_a(x) = x^2 + a


has a single equilibrium point at x = 0 when a = 0. Note f_0'(0) = 0, but f_0''(0) ≠ 0. For a > 0 this equation has no equilibrium points, since f_a(x) > 0 for all x, but for a < 0 this equation has a pair of equilibria, at x = ±√(−a). Thus a bifurcation occurs as the parameter passes through a = 0.

This type of bifurcation is called a saddle-node bifurcation: as a passes through the bifurcation value a_0, a pair of equilibria come together and annihilate one another, so that there are two equilibria nearby on one side of a_0 and none on the other side of a_0. The example above is actually the typical type of bifurcation for first-order equations.

Theorem. (Saddle-Node Bifurcation) Suppose x' = f_a(x) is a first-order differential equation for which

1. f_{a_0}(x_0) = 0;
2. f_{a_0}'(x_0) = 0;
3. f_{a_0}''(x_0) ≠ 0;
4. (∂f_a/∂a)(x_0) ≠ 0 at a = a_0.

Then this differential equation undergoes a saddle-node bifurcation at a = a_0.

Proof: Let G(x, a) = f_a(x). We have G(x_0, a_0) = 0. Also,

(∂G/∂a)(x_0, a_0) = (∂f_{a_0}/∂a)(x_0) ≠ 0,

so we may apply the implicit function theorem to conclude that there is a smooth function a = a(x) such that G(x, a(x)) = 0. In particular, if x* belongs to the domain of a(x), then x* is an equilibrium point for the equation x' = f_{a(x*)}(x), since f_{a(x*)}(x*) = 0. Differentiating G(x, a(x)) = 0 with respect to x, we find

a'(x) = −(∂G/∂x) / (∂G/∂a).


Now (∂G/∂x)(x_0, a_0) = f_{a_0}'(x_0) = 0, while (∂G/∂a)(x_0, a_0) ≠ 0 by assumption. Hence a'(x_0) = 0. Differentiating once more, we find

a''(x) = − [ (∂^2G/∂x^2)(∂G/∂a) − (∂G/∂x)(∂^2G/∂x∂a) ] / (∂G/∂a)^2.

Since (∂G/∂x)(x_0, a_0) = 0, we have

a''(x_0) = − (∂^2G/∂x^2)(x_0, a_0) / (∂G/∂a)(x_0, a_0) ≠ 0,

since (∂^2G/∂x^2)(x_0, a_0) = f_{a_0}''(x_0) ≠ 0. This implies that the graph of a = a(x) is either concave up or concave down, so we have two equilibria near x_0 for a-values on one side of a_0 and no equilibria for a-values on the other side.

We said earlier that such saddle-node bifurcations were the "typical" bifurcations involving equilibrium points for first-order equations. The reason for this is that we must have both

1. f_{a_0}(x_0) = 0
2. f_{a_0}'(x_0) = 0

if x' = f_a(x) is to undergo a bifurcation when a = a_0. Generically (in the sense of Section 5.6), the next higher order derivatives at (x_0, a_0) will be nonzero. That is, we typically have

3. f_{a_0}''(x_0) ≠ 0
4. (∂f_a/∂a)(x_0, a_0) ≠ 0

at such a bifurcation point. But these are precisely the conditions that guarantee a saddle-node bifurcation.

Recall that the bifurcation diagram for x' = f_a(x) is a plot of the various phase lines of the equation versus the parameter a. The bifurcation diagram for a typical saddle-node bifurcation is displayed in Figure 8.6. (The directions of the arrows and the curve of equilibria may change.)

Example. (Pitchfork Bifurcation) Consider

x' = x^3 − ax.


Figure 8.6 The bifurcation diagram for a saddle-node bifurcation.

Figure 8.7 The bifurcation diagram for a pitchfork bifurcation.

There are three equilibria for this equation, at x = 0 and x = ±√a when a > 0. When a ≤ 0, x = 0 is the only equilibrium point. The bifurcation diagram shown in Figure 8.7 explains why this bifurcation is so named.
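Because the equilibria of these one-parameter families are just polynomial roots, both bifurcation diagrams can be tabulated directly. A small Python sketch (our illustration, not the text's; it assumes NumPy):

import numpy as np

def real_roots(coeffs, tol=1e-9):
    # Real roots of a polynomial, listed with multiplicity.
    r = np.roots(coeffs)
    return sorted(np.round(r[np.abs(r.imag) < tol].real, 6).tolist())

for a in (-1.0, 0.0, 1.0):
    sn = real_roots([1.0, 0.0, a])          # equilibria of x' = x^2 + a
    pf = real_roots([1.0, 0.0, -a, 0.0])    # equilibria of x' = x^3 - a x
    print(f"a = {a:+.0f}:  x^2 + a -> {sn},  x^3 - ax -> {pf}")
# Saddle-node: two equilibria for a < 0, one (double) at a = 0, none for a > 0.
# Pitchfork: one equilibrium for a <= 0, three for a > 0.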


Now we turn to some bifurcations in higher dimensions. The saddle-node bifurcation in the plane is similar to its one-dimensional cousin, only now we see where the "saddle" comes from.

Example. Consider the system

x' = x^2 + a
y' = −y.

Figure 8.8 The saddle-node bifurcation when, from left to right, a < 0, a = 0, and a > 0.

When a = 0, this is one of the systems considered in Section 8.1. There is a unique equilibrium point at the origin, and the linearized system has a zero eigenvalue.

When a passes through a = 0, a saddle-node bifurcation occurs. When a > 0, we have x' > 0, so all solutions move to the right; the equilibrium point disappears. When a < 0, there is a pair of equilibria, at the points (±√(−a), 0); the eigenvalues of the linearized system at these points are 2x and −1, so (√(−a), 0) is a saddle while (−√(−a), 0) is a sink. See Figure 8.8.

Saddle-node bifurcations can also have global effects on the phase portrait, as the next example shows.

Example. Consider the system given in polar coordinates by

r' = r − r^3
θ' = sin^2 θ + a,

where a is a parameter. The origin is always an equilibrium point. Equilibria other than the origin must lie on the unit circle, where r' = 0, at angles satisfying sin^2 θ = −a, so there are no such equilibria when a > 0 since, in that case, θ' > 0. When a = 0 two additional equilibria appear at (r, θ) = (1, 0) and (r, θ) = (1, π). When −1 < a < 0, the equation sin^2 θ = −a has four roots in [0, 2π).


We denote these roots by θ± and θ± + π, where we assume that 0 < θ+ < π/2 and −π/2 < θ− < 0.

Note that the flow of this system takes the straight rays through the origin θ = constant to other straight rays. This occurs since θ' depends only on θ, not on r. Also, the unit circle is invariant in the sense that any solution that starts on the circle remains there for all time. This follows since r' = 0 on this circle. All other nonzero solutions tend to this circle, since r' > 0 if 0 < r < 1 and r' < 0 if r > 1.

Now consider the case a = 0. Then θ' = sin^2 θ vanishes exactly on the x-axis. In the upper half-plane we have θ' > 0, so all other solutions in this region wind counterclockwise about 0 and tend to x = −1; the θ-coordinate increases to θ = π while r tends monotonically to 1. No solution winds more than angle π about the origin, since the x-axis acts as a barrier. The system behaves symmetrically in the lower half-plane.

When a > 0 two things happen. First of all, the equilibrium points at x = ±1 disappear and now θ' > 0 everywhere. Thus the barrier on the x-axis has been removed and all solutions suddenly are free to wind forever about the origin. Secondly, we now have a periodic solution on the circle r = 1, and all nonzero solutions are attracted to it.

This dramatic change is caused by a pair of saddle-node bifurcations at a = 0. Indeed, when −1 < a < 0, the equilibria on the circle at θ+ and θ− come together and annihilate one another at x = 1 as a increases to 0, and the pair at θ± + π does the same at x = −1. See Figure 8.9 for the global phase portraits.

Example. Consider the system

x' = ax − y − x(x^2 + y^2)
y' = x + ay − y(x^2 + y^2),

where a is again a parameter.


Figure 8.9 Global effects of saddle-node bifurcations when, from left to right, a < 0, a = 0, and a > 0.

There is an equilibrium point at the origin and the linearized system is

X' = [ a  −1 ] X.
     [ 1   a ]

The eigenvalues are a ± i, so we expect a bifurcation when a = 0.

To see what happens as a passes through 0, we change to polar coordinates. The system becomes

r' = ar − r^3
θ' = 1.

Note that the origin is the only equilibrium point for this system, since θ' ≠ 0. For a < 0 we have r' = ar − r^3 < 0 for all r > 0. Thus all solutions tend to the origin in this case. When a > 0, the equilibrium becomes a source. So what else happens? When a > 0 we have r' = 0 if r = √a. So the circle of radius √a is a periodic solution with period 2π. We also have r' > 0 if 0 < r < √a, while r' < 0 if r > √a. Thus, all nonzero solutions spiral toward this circular solution as t → ∞.

This type of bifurcation is called a Hopf bifurcation. Thus at a Hopf bifurcation, no new equilibria arise. Instead, a periodic solution is born at the equilibrium point as a passes through the bifurcation value. See Figure 8.10.
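The birth of the periodic solution is easy to observe numerically. The following Python sketch (our illustration, assuming SciPy; the Cartesian form of the system is the one given in the example above) integrates from a point near the origin for a < 0 and a > 0 and reports the limiting radius, which matches √a in the latter case.

from scipy.integrate import solve_ivp

def hopf(a):
    # x' = ax - y - x(x^2 + y^2), y' = x + ay - y(x^2 + y^2)
    def f(t, u):
        x, y = u
        s = x**2 + y**2
        return [a*x - y - x*s, x + a*y - y*s]
    return f

for a in (-0.5, 0.5):
    sol = solve_ivp(hopf(a), (0.0, 60.0), [0.01, 0.0], rtol=1e-9)
    x, y = sol.y[0, -1], sol.y[1, -1]
    print(f"a = {a:+.1f}: limiting radius = {(x*x + y*y)**0.5:.4f}")
# a < 0: radius -> 0; a > 0: radius -> sqrt(0.5) = 0.7071, the periodic orbit.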


Figure 8.10 The Hopf bifurcation for a < 0 and a > 0.

8.6 Exploration: Complex Vector Fields

In this exploration, you will investigate the behavior of systems of differential equations in the complex plane of the form z' = F(z). Throughout this section z will denote the complex number z = x + iy and F(z) will be a polynomial with complex coefficients. Solutions of the differential equation will be expressed as curves z(t) = x(t) + iy(t) in the complex plane.

You should be familiar with complex functions such as the exponential, sine, and cosine, as well as with the process of taking complex square roots, in order to comprehend fully what you see below. Theoretically, you should have a grasp of complex analysis as well. However, all of the routine tricks from integration of functions of real variables work just as well when integrating with respect to z. You need not prove this, because you can always check the validity of your solutions when you have completed the integrals.

1. Solve the equation z' = az where a is a complex number. What kind of equilibrium points do you find at the origin for these differential equations?
2. Solve each of the following complex differential equations and sketch the phase portrait. (A numerical aid for experiments like these appears after this list.)
   (a) z' = z^2
   (b) z' = z^2 − 1
   (c) z' = z^2 + 1
3. For a complex polynomial F(z), the complex derivative is defined just as the real derivative

   F'(z_0) = lim_{z→z_0} (F(z) − F(z_0))/(z − z_0),

   only this limit is evaluated along any smooth curve in C that passes through z_0. This limit must exist and yield the same (complex) number for each such curve. For polynomials, this is easily checked. Now write

   F(x + iy) = u(x, y) + iv(x, y).


   Evaluate F'(z_0) in terms of the derivatives of u and v by first taking the limit along the horizontal line z_0 + t, and secondly along the vertical line z_0 + it. Use this to conclude that, if the derivative exists, then we must have

   ∂u/∂x = ∂v/∂y  and  ∂u/∂y = −∂v/∂x

   at every point in the plane. These equations are called the Cauchy-Riemann equations.
4. Use the above observation to determine all possible types of equilibrium points for complex vector fields.
5. Solve the equation

   z' = (z − z_0)(z − z_1),

   where z_0, z_1 ∈ C and z_0 ≠ z_1. What types of equilibrium points occur for different values of z_0 and z_1?
6. Find a nonlinear change of variables that converts the previous system to w' = αw with α ∈ C. Hint: Since the original system has two equilibrium points and the linear system only one, the change of variables must send one of the equilibrium points to ∞.
7. Classify all complex quadratic systems of the form

   z' = z^2 + az + b,

   where a, b ∈ C.
8. Consider the equation

   z' = z^3 + az

   with a ∈ C. First use a computer to describe the phase portraits for these systems. Then prove as much as you can about these systems and classify them with respect to a.
9. Choose your own (nontrivial) family of complex functions depending on a parameter a ∈ C and provide a complete analysis of the phase portraits for each a. Some neat families to consider include a exp z, a sin z, or (z^2 + a)(z^2 − a).
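For the computer experiments suggested in items 2 and 8, a complex equation z' = F(z) can be integrated by splitting z = x + iy into real and imaginary parts. A minimal Python sketch (an aid we supply, not part of the exploration; it assumes SciPy), here applied to item 2(b):

from scipy.integrate import solve_ivp

def F(z):
    return z**2 - 1            # the equation of item 2(b)

def rhs(t, u):
    # Integrate z' = F(z) via the real system for (x, y), with z = x + iy.
    w = F(u[0] + 1j * u[1])
    return [w.real, w.imag]

sol = solve_ivp(rhs, (0.0, 4.0), [0.5, 0.5], rtol=1e-9)
print("z(4) =", complex(sol.y[0, -1], sol.y[1, -1]))
# This solution is attracted to the equilibrium z = -1, where F'(z) = -2.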


EXERCISES

1. For each of the following nonlinear systems,
   (a) Find all of the equilibrium points and describe the behavior of the associated linearized system.
   (b) Describe the phase portrait for the nonlinear system.
   (c) Does the linearized system accurately describe the local behavior near the equilibrium points?

   (i) x' = sin x, y' = cos y
   (ii) x' = x(x^2 + y^2), y' = y(x^2 + y^2)
   (iii) x' = x + y^2, y' = 2y
   (iv) x' = y^2, y' = y
   (v) x' = x^2, y' = y^2

2. Find a global change of coordinates that linearizes the system

   x' = x + y^2
   y' = −y
   z' = −z + y^2.

3. Consider a first-order differential equation

   x' = f_a(x)

   for which f_a(x_0) = 0 and f_a'(x_0) ≠ 0. Prove that the differential equation

   x' = f_{a+ɛ}(x)

   has an equilibrium point x_0(ɛ), where ɛ → x_0(ɛ) is a smooth function satisfying x_0(0) = x_0, for ɛ sufficiently small.

4. Find general conditions on the derivatives of f_a(x) so that the equation

   x' = f_a(x)

   undergoes a pitchfork bifurcation at a = a_0. Prove that your conditions lead to such a bifurcation.

5. Consider the system

   x' = x^2 + y
   y' = x − y + a,

   where a is a parameter.
   (a) Find all equilibrium points and compute the linearized equation at each.
   (b) Describe the behavior of the linearized system at each equilibrium point.


   (c) Describe any bifurcations that occur.

6. Give an example of a family of differential equations x' = f_a(x) for which there are no equilibrium points if a < 0 and two equilibria if a > 0. Sketch the bifurcation diagram for this family.

7. Discuss the local and global behavior of solutions of

   r' = r − r^3
   θ' = sin^2(θ) + a

   at the bifurcation value a = −1.

8. Discuss the local and global behavior of solutions of

   r' = r − r^2
   θ' = sin^2(θ/2) + a

   at all of the bifurcation values.

9. Consider the system

   r' = r − r^2
   θ' = sin θ + a.

   (a) For which values of a does this system undergo a bifurcation?
   (b) Describe the local behavior of solutions near the bifurcation values (at, before, and after the bifurcation).
   (c) Sketch the phase portrait of the system for all possible different cases.
   (d) Discuss any global changes that occur at the bifurcations.

10. Let X' = F(X) be a nonlinear system in R^n. Suppose that F(0) = 0 and that DF_0 has n distinct eigenvalues with negative real parts. Describe the construction of a conjugacy between this system and its linearization.

11. Consider the system X' = F(X) where X ∈ R^n. Suppose that F has an equilibrium point at X_0. Show that there is a change of coordinates that moves X_0 to the origin and converts the system to

    X' = AX + G(X),

    where A is an n × n matrix which is the canonical form of DF_{X_0}, and where G(X) satisfies

    lim_{|X|→0} |G(X)|/|X| = 0.


12. In the definition of an asymptotically stable equilibrium point, we required that the equilibrium point also be stable. This requirement is not vacuous. Give an example of a phase portrait (a sketch is sufficient) that has an equilibrium point toward which all nearby solution curves (eventually) tend, but that is not stable.




9 Global Nonlinear Techniques

In this chapter we present a variety of qualitative techniques for analyzing the behavior of nonlinear systems of differential equations. The reader should be forewarned that none of these techniques works for all nonlinear systems; most work only in specialized situations, which, as we shall see in the ensuing chapters, nonetheless occur in many important applications of differential equations.

9.1 Nullclines

One of the most useful tools for analyzing nonlinear systems of differential equations (especially planar systems) is the nullclines. For a system in the form

x_1' = f_1(x_1, ..., x_n)
⋮
x_n' = f_n(x_1, ..., x_n),

the x_j-nullcline is the set of points where x_j' vanishes, so the x_j-nullcline is the set of points determined by setting f_j(x_1, ..., x_n) = 0.

The x_j-nullclines usually separate R^n into a collection of regions in which the x_j-components of the vector field point in either the positive or negative direction. If we determine all of the nullclines, then this allows us to decompose R^n into a collection of open sets, in each of which the vector field points in a "certain direction."


This is easiest to understand in the case of a planar system

x' = f(x, y)
y' = g(x, y).

On the x-nullclines, we have x' = 0, so the vector field points straight up or down, and these are the only points at which this happens. Therefore the x-nullclines divide R^2 into regions where the vector field points either to the left or to the right. Similarly, on the y-nullclines, the vector field is horizontal, so the y-nullclines separate R^2 into regions where the vector field points either upward or downward. The intersections of the x- and y-nullclines yield the equilibrium points. In any of the regions between the nullclines, the vector field is neither vertical nor horizontal, so it must point in one of four directions: northeast, northwest, southeast, or southwest. We call such regions basic regions. Often, a simple sketch of the basic regions allows us to understand the phase portrait completely, at least from a qualitative point of view.

Example. For the system

x' = y − x^2
y' = x − 2,

the x-nullcline is the parabola y = x^2 and the y-nullcline is the vertical line x = 2. These nullclines meet at (2, 4), so this is the only equilibrium point. The nullclines divide R^2 into four basic regions labeled A through D in Figure 9.1(a). By first choosing one point in each of these regions, and then determining the direction of the vector field at that point, we can decide the direction of the vector field at all points in the basic region. For example, the point (0, 1) lies in region A and the vector field is (1, −2) at this point, which points toward the southeast. Hence the vector field points southeast at all points in this region. Of course, the vector field may be nearly horizontal or nearly vertical in this region; when we say southeast we mean that the angle θ of the vector field lies in the sector −π/2 < θ < 0. Continuing in this fashion we get the direction of the vector field in all four regions, as in Figure 9.1(b). This also determines the horizontal and vertical directions of the vector field on the nullclines.
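This bookkeeping is easily automated: pick one sample point per basic region and read off the signs of x' and y'. In the Python sketch below (illustrative only; region A is keyed to the point (0, 1) as in the text, while the sample points for B, C, and D are our own choices for the remaining regions of Figure 9.1):

# Signs of x' = y - x^2 and y' = x - 2 at one sample point per basic region.
f = lambda x, y: y - x**2          # x': east if positive, west if negative
g = lambda x, y: x - 2             # y': north if positive, south if negative

samples = {"A": (0, 1), "B": (3, 10), "C": (3, 5), "D": (0, -1)}
for name, (x, y) in samples.items():
    ew = "east" if f(x, y) > 0 else "west"
    ns = "north" if g(x, y) > 0 else "south"
    print(f"region {name}: vector field points {ns}{ew}")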


Figure 9.1 The (a) nullclines and (b) direction field.

Figure 9.2 Solutions enter the basic region B and then tend to ∞.

Just from the direction field alone, it appears that the equilibrium point is a saddle. Indeed, this is the case because the linearized system at (2, 4) is

X' = [ −4  1 ] X,
     [  1  0 ]

which has eigenvalues −2 ± √5, one of which is positive, the other negative.

More importantly, we can fill in the approximate behavior of solutions everywhere in the plane. For example, note that the vector field points into the basic region marked B at all points along its boundary, and then it points northeasterly at all points inside B. Thus any solution in region B must stay in region B for all time and tend toward ∞ in the northeast direction. See Figure 9.2. Similarly, solutions in the basic region D stay in that region and head toward ∞ in the southwest direction. Solutions starting in the basic regions A and C have a choice: They must eventually cross one of the nullclines and enter regions B and D (and therefore we know their ultimate behavior), or else they tend to the equilibrium point. However, there is only one curve of such solutions in each region, the stable curve at (2, 4). Thus we completely understand the phase portrait for this system, at least from a qualitative point of view. See Figure 9.3.


Figure 9.3 The nullclines and phase portrait for x' = y − x^2, y' = x − 2.

Example. (Heteroclinic Bifurcation) Next consider the system that depends on a parameter a:

x' = x^2 − 1
y' = −xy + a(x^2 − 1).

The x-nullclines are given by x = ±1, while the y-nullclines are xy = a(x^2 − 1). The equilibrium points are (±1, 0). Since x' = 0 on x = ±1, the vector field is actually tangent to these nullclines. Moreover, we have y' = −y on x = 1 and y' = y on x = −1. So solutions tend to (1, 0) along the vertical line x = 1 and tend away from (−1, 0) along x = −1. This happens for all values of a.

Now, let's look at the case a = 0. Here the system simplifies to

x' = x^2 − 1
y' = −xy,

so y' = 0 along the axes. In particular, the vector field is tangent to the x-axis and is given by x' = x^2 − 1 on this line. So we have x' > 0 if |x| > 1 and x' < 0 if |x| < 1. Thus, at each equilibrium point, we have one straight-line solution tending to the equilibrium and one tending away. So it appears that each equilibrium is a saddle. This is indeed the case, as is easily checked by linearization.

There is a second y-nullcline along x = 0, but the vector field is not tangent to this nullcline. Computing the direction of the vector field in each of the basic regions determined by the nullclines yields Figure 9.4, from which we can deduce immediately the qualitative behavior of all solutions.

Note that, when a = 0, one branch of the unstable curve through (1, 0) matches up exactly with a branch of the stable curve at (−1, 0). All solutions on this curve simply travel from one saddle to the other.


Figure 9.4 The (a) nullclines and (b) phase portrait for x' = x^2 − 1, y' = −xy.

Figure 9.5 The (a) nullclines and (b) phase plane when a > 0, after the heteroclinic bifurcation.

Such solutions are called heteroclinic solutions or saddle connections. Typically, for planar systems, stable and unstable curves rarely meet to form such heteroclinic "connections." When they do, however, one can expect a bifurcation.

Now consider the case where a ≠ 0. The x-nullclines remain the same, at x = ±1. But the y-nullclines change drastically, as shown in Figure 9.5. They are given by y = a(x^2 − 1)/x.

When a > 0, consider the basic region denoted by A. Here the vector field points southwesterly. In particular, the vector field points in this direction along the x-axis between x = −1 and x = 1. This breaks the heteroclinic connection: The right portion of the stable curve associated to (−1, 0) must now come from y = ∞ in the upper half plane, while the left portion of the unstable curve associated to (1, 0) now descends to y = −∞ in the lower half plane.


This opens an "avenue" for certain solutions to travel from y = +∞ to y = −∞ between the two lines x = ±1. Whereas when a = 0 all solutions remain for all time confined to either the upper or lower half-plane, the heteroclinic bifurcation at a = 0 opens the door for certain solutions to make this transit. A similar situation occurs when a < 0.
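The breaking of the saddle connection shows up clearly in a numerical experiment. This Python sketch (our illustration, assuming SciPy; the starting point only approximates the unstable direction of the saddle at (1, 0)) follows that branch for a = 0 and for a = 0.2:

from scipy.integrate import solve_ivp

def system(a):
    # x' = x^2 - 1, y' = -x y + a (x^2 - 1)
    def f(t, u):
        x, y = u
        return [x**2 - 1, -x*y + a*(x**2 - 1)]
    return f

for a in (0.0, 0.2):
    # Start just inside the saddle at (1, 0), near its unstable curve.
    sol = solve_ivp(system(a), (0.0, 10.0), [1 - 1e-3, 0.0], rtol=1e-10)
    x, y = sol.y[0, -1], sol.y[1, -1]
    print(f"a = {a}: orbit ends near ({x:+.3f}, {y:+.3f})")
# a = 0:   the orbit runs along the x-axis into the saddle at (-1, 0).
# a = 0.2: it dips below the axis and escapes toward y = -infinity.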


9.2 Stability of Equilibria

Determining the stability of an equilibrium point that is not hyperbolic requires techniques that go beyond linearization. The most basic of these is due to Liapunov and rests on finding a real-valued function that decreases along solutions. Given such a function L, let L̇(X) denote the derivative of L along the solution through X:

L̇(X) = (d/dt)|_{t=0} L(φ_t(X)),

where φ_t is the flow of the system. Note that L̇(X) = DL_X(F(X)) = grad L(X) · F(X), so L̇ can be computed directly from the vector field, without solving the system.

Liapunov Stability Theorem. Let X* be an equilibrium point for X' = F(X), and let L : O → R be a differentiable function defined on an open set O containing X*. Suppose further that

(a) L(X*) = 0 and L(X) > 0 if X ≠ X*;
(b) L̇ ≤ 0 in O − X*.

Then X* is stable. Furthermore, if L also satisfies

(c) L̇ < 0 in O − X*,

then X* is asymptotically stable.

A function L satisfying (a) and (b) is called a Liapunov function for X*. If (c) also holds, we call L a strict Liapunov function.

Example. Consider the system

x' = (ɛx + 2y)(z + 1)
y' = (−x + ɛy)(z + 1)
z' = −z^3.

The origin is an equilibrium point. The linearized system at the origin has eigenvalues 0 and ɛ ± i√2, so the origin is unstable when ɛ > 0. When ɛ ≤ 0, all we can conclude is that the origin is not hyperbolic.

When ɛ ≤ 0 we search for a Liapunov function for (0, 0, 0) of the form L(x, y, z) = ax^2 + by^2 + cz^2, with a, b, c > 0. For such an L, we have

L̇ = 2(a x x' + b y y' + c z z'),

so that

L̇/2 = ax(ɛx + 2y)(z + 1) + by(−x + ɛy)(z + 1) − cz^4
     = ɛ(ax^2 + by^2)(z + 1) + (2a − b)(xy)(z + 1) − cz^4.


For stability, we want L̇ ≤ 0; this can be arranged, for example, by setting a = 1, b = 2, and c = 1. If ɛ = 0, we then have L̇/2 = −z^4 ≤ 0, so the origin is stable. It can be shown (see Exercise 4) that the origin is not asymptotically stable in this case.

If ɛ < 0, then we find

L̇/2 = ɛ(x^2 + 2y^2)(z + 1) − z^4,

so that L̇ < 0 in the region O given by z > −1 (minus the origin). We conclude that the origin is asymptotically stable in this case, and, indeed, from Exercise 4, that all solutions that start in the region O tend to the origin.

Example. (The Nonlinear Pendulum) Consider a pendulum consisting of a light rod of length l to which is attached a ball of mass m. The other end of the rod is attached to a wall at a point so that the ball of the pendulum moves on a circle centered at this point. The position of the mass at time t is completely described by the angle θ(t) of the mass from the straight-down position, measured in the counterclockwise direction. Thus the position of the mass at time t is given by (l sin θ(t), −l cos θ(t)).

The speed of the mass is the length of the velocity vector, which is l dθ/dt, and the acceleration is l d^2θ/dt^2. We assume that the only two forces acting on the pendulum are the force of gravity and a force due to friction. The gravitational force is a constant force equal to mg acting in the downward direction; the component of this force tangent to the circle of motion is given by −mg sin θ. We take the force due to friction to be proportional to velocity, so this force is given by −bl dθ/dt for some constant b > 0. When there is no force due to friction (b = 0), we have an ideal pendulum. Newton's law then gives the second-order differential equation for the pendulum

m l d^2θ/dt^2 = −b l dθ/dt − m g sin θ.

For simplicity, we assume that units have been chosen so that m = l = g = 1. Rewriting this equation as a system, we introduce v = dθ/dt and get

θ' = v
v' = −bv − sin θ.

Clearly, we have two equilibrium points (mod 2π): the downward rest position at θ = 0, v = 0, and the straight-up position θ = π, v = 0. This upward position is an unstable equilibrium, both from a mathematical (check the linearization) and physical point of view.
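Before analyzing this equilibrium, it is instructive to watch the total energy E(θ, v) = v^2/2 + 1 − cos θ, introduced just below, decrease along a numerically computed solution. A Python sketch (illustrative only; the damping constant b = 0.5 and the initial angle are our own choices, and SciPy is assumed):

import numpy as np
from scipy.integrate import solve_ivp

b = 0.5                                    # damping constant (our choice)
def pend(t, u):
    theta, v = u
    return [v, -b*v - np.sin(theta)]       # theta' = v, v' = -b v - sin(theta)

E = lambda theta, v: 0.5*v**2 + 1 - np.cos(theta)    # total energy

sol = solve_ivp(pend, (0.0, 20.0), [2.5, 0.0], rtol=1e-9, dense_output=True)
for t in (0.0, 5.0, 10.0, 20.0):
    theta, v = sol.sol(t)
    print(f"t = {t:4.1f}:  E = {E(theta, v):.5f}")   # decreases toward 0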


For the downward equilibrium point, the linearized system is

Y' = [  0   1 ] Y.
     [ −1  −b ]

The eigenvalues here are either pure imaginary (when b = 0) or else have negative real parts (when b > 0). So the downward equilibrium is asymptotically stable if b > 0, as everyone on earth who has watched a real-life pendulum knows.

To investigate this equilibrium point further, consider the function E(θ, v) = (1/2)v^2 + 1 − cos θ. For readers with a background in elementary mechanics, this is the well-known total energy function, which we will describe further in Chapter 13. We compute

Ė = v v' + (sin θ) θ' = −b v^2,

so that Ė ≤ 0. Hence E is a Liapunov function. Thus the origin is a stable equilibrium.

If b = 0 (that is, there is no friction), then Ė ≡ 0. That is, E is constant along all solutions of the system. Hence we may simply plot the level curves of E to see where the solution curves reside. We find the phase portrait shown in Figure 9.6. Note that we do not have to solve the differential equation to paint this picture; knowing the level curves of E (and the direction of the vector field) tells us everything. We will encounter many such (very special) functions that are constant along solution curves later in this chapter.

The solutions encircling the origin have the property that −π < θ(t) < π for all t. Therefore these solutions correspond to the pendulum oscillating about the downward rest position without ever crossing the upward position θ = π.

Figure 9.6 The phase portrait for the ideal pendulum.


The special solutions connecting the equilibrium points at (±π, 0) correspond to the pendulum tending to the upward-pointing equilibrium in both the forward and backward time directions. (You don't often see such motions!) Beyond these special solutions we find solutions for which θ(t) either increases or decreases for all time; in these cases the pendulum spins forever in the counterclockwise or clockwise direction.

We will return to the pendulum example for the case b > 0 later, but first we prove Liapunov's theorem.

Proof: Let δ > 0 be so small that the closed ball B_δ(X*) around the equilibrium point X* of radius δ lies entirely in O. Let α be the minimum value of L on the boundary of B_δ(X*), that is, on the sphere S_δ(X*) of radius δ and center X*. Then α > 0 by assumption. Let U = {X ∈ B_δ(X*) | L(X) < α} and note that X* lies in U. Then no solution starting in U can meet S_δ(X*), since L is nonincreasing on solution curves. Hence every solution starting in U never leaves B_δ(X*). This proves that X* is stable.

Now suppose that assumption (c) in the Liapunov stability theorem holds as well, so that L is strictly decreasing on solutions in U − X*. Let X(t) be a solution starting in U − X* and suppose that X(t_n) → Z_0 ∈ B_δ(X*) for some sequence t_n → ∞. We claim that Z_0 = X*. To see this, observe that L(X(t)) > L(Z_0) for all t ≥ 0, since L(X(t)) decreases and L(X(t_n)) → L(Z_0) by continuity of L. If Z_0 ≠ X*, let Z(t) be the solution starting at Z_0. For any s > 0, we have L(Z(s)) < L(Z_0). Hence, by continuous dependence of solutions on initial conditions, any solution starting sufficiently near Z_0 also satisfies L < L(Z_0) at time s. Taking the starting point to be X(t_n) with n sufficiently large yields the contradiction L(X(t_n + s)) < L(Z_0). Therefore Z_0 = X*. Since B_δ(X*) is closed and bounded, X* is thus the only possible limit point of X(t), so X(t) → X* as t → ∞, and X* is asymptotically stable.


Figure 9.7 Solutions decrease through the level sets L^{−1}(c_j) of a strict Liapunov function.

Example. Now consider the system

x' = −x^3
y' = −y(x^2 + z^2 + 1)
z' = −sin z.

The origin is again an equilibrium point. It is not the only one, however, since (0, 0, nπ) is also an equilibrium point for each n ∈ Z. Hence the origin cannot be globally asymptotically stable. Moreover, the planes z = nπ for n ∈ Z are invariant in the sense that any solution that starts on one of these planes remains there for all time. This occurs since z' = 0 when z = nπ. In particular, any solution that begins in the region |z| < π must remain trapped in this region for all time.

Figure 9.8 Level sets of a Liapunov function may look like this.


Linearization at the origin yields the system

Y' = [ 0   0   0 ] Y,
     [ 0  −1   0 ]
     [ 0   0  −1 ]

which tells us nothing about the stability of this equilibrium point.

However, consider the function

L(x, y, z) = x^2 + y^2 + z^2.

Clearly, L > 0 except at the origin. We compute

L̇ = −2x^4 − 2y^2(x^2 + z^2 + 1) − 2z sin z.

Then L̇ < 0 at all points of the region |z| < π except the origin, since z sin z > 0 when 0 < |z| < π. Hence the origin is asymptotically stable. Moreover, we can conclude that the basin of attraction of the origin is the entire region |z| < π. From the proof of the Liapunov stability theorem, it follows immediately that any solution that starts inside a sphere of radius r < π about the origin tends to the origin.

Liapunov functions can also be used to estimate the full basin of attraction of an equilibrium. To state the relevant result, recall that a set P is positively invariant if, for each X in P, the solution through X remains in P for all t ≥ 0, and that an entire solution is one that is defined for all t ∈ R.

Lasalle's Invariance Principle. Let X* be an equilibrium point for X' = F(X) and let L : U → R be a Liapunov function for X*, where U is an open set containing X*. Let P ⊂ U be a neighborhood of X* that is closed and bounded. Suppose that P is positively invariant and that there is no entire solution in P − X* on which L is constant. Then X* is asymptotically stable, and P is contained in the basin of attraction of X*.


Before proving this theorem we apply it to the equilibrium X* = (0, 0) of the damped pendulum discussed previously. Recall that a Liapunov function is given by E(θ, v) = (1/2)v^2 + 1 − cos θ and that Ė = −bv^2. Since Ė = 0 on v = 0, this Liapunov function is not strict.

To estimate the basin of (0, 0), fix a number c with 0 < c < 2, and let

P_c = { (θ, v) | E(θ, v) ≤ c and |θ| < π }.

This set is closed and bounded: since E(±π, v) ≥ 2 > c, no point of the set has |θ| near π. It is positively invariant because E is nonincreasing along solutions, so a solution starting in P_c can neither raise E past c nor reach |θ| = π. Moreover, there is no entire solution in P_c (other than the equilibrium) on which E is constant: on such a solution we would have Ė ≡ 0, hence v ≡ 0, hence θ constant with v' = −sin θ = 0; since |θ| < π, this forces θ = 0. By the invariance principle, every solution starting in P_c tends to the origin, so P_c lies in the basin of attraction of (0, 0).


Figure 9.9 The curve γ_c bounds the region P_c.

Figure 9.9 displays the phase portrait for the damped pendulum. The curves marked γ_c are the level sets E(θ, v) = c. Note that solutions cross each of these curves exactly once and eventually tend to the origin.

This result is quite natural on physical grounds. For if θ ≠ ±π, then E(θ, 0) < 2, and so the solution through (θ, 0) tends to (0, 0). That is, if we start the pendulum from rest at any angle θ except the vertical position, the pendulum will eventually wind down to rest at its stable equilibrium position.

There are other initial positions in the basin of (0, 0) that are not in the set P_c. For example, consider the solution through (−π, u), where u is very small but not zero. Then (−π, u) ∉ P_c, but the solution through this point moves quickly into P_c, and therefore eventually approaches (0, 0). Hence (−π, u) also lies in the basin of attraction of (0, 0). This can be seen in Figure 9.9, where the solutions that begin just above the equilibrium point at (−π, 0) and just below (π, 0) quickly cross γ_c and then enter P_c. See Exercise 5 for further examples of this.

We now prove the theorem.

Proof: Imagine a solution X(t) that lies in the positively invariant set P for 0 ≤ t < ∞, but suppose that X(t) does not tend to X* as t → ∞. Since P is closed and bounded, there must be a point Z ≠ X* in P and a sequence t_n → ∞ such that

lim_{n→∞} X(t_n) = Z.

We may assume that the sequence {t_n} is an increasing sequence.

We claim that the entire solution through Z lies in P. That is, φ_t(Z) is defined and in P for all t ∈ R, not just t ≥ 0. This can be seen as follows.


First, φ_t(Z) is certainly defined for all t ≥ 0, since P is positively invariant. On the other hand, φ_t(X(t_n)) is defined and in P for all t in the interval [−t_n, 0]. Since {t_n} is an increasing sequence, we have that φ_t(X(t_{n+k})) is also defined and in P for all t ∈ [−t_n, 0] and all k ≥ 0. Since the points X(t_{n+k}) → Z as k → ∞, it follows from continuous dependence of solutions on initial conditions that φ_t(Z) is defined and in P for all t ∈ [−t_n, 0]. Since this holds for any t_n, we see that the solution through Z is an entire solution lying in P.

Finally, we show that L is constant on the entire solution through Z. If L(Z) = α, then we have L(X(t_n)) ≥ α and, moreover,

lim_{n→∞} L(X(t_n)) = α.

More generally, if {s_n} is any sequence of times for which s_n → ∞ as n → ∞, then L(X(s_n)) → α as well. This follows from the fact that L is nonincreasing along solutions. Now the sequence X(t_n + s) converges to φ_s(Z), and so L(φ_s(Z)) = α. This contradicts our assumption that there are no entire solutions lying in P on which L is constant, and proves the theorem.

In this proof we encountered certain points that were limits of a sequence of points on the solution through X. The set of all points that are limit points of a given solution is called the set of ω-limit points, or the ω-limit set, of the solution X(t). Similarly, we define the set of α-limit points, or the α-limit set, of a solution X(t) to be the set of all points Z such that lim_{n→∞} X(t_n) = Z for some sequence t_n → −∞. (The reason, such as it is, for this terminology is that α is the first letter and ω the last letter of the Greek alphabet.)

The following facts, essentially proved above, will be used in the following chapter.

Proposition. The α-limit set and the ω-limit set of a solution that is defined for all t ∈ R are closed, invariant sets.


9.3 Gradient Systems

Now we turn to a particular type of system for which the previous material on Liapunov functions is particularly germane. A gradient system on R^n is a system of differential equations of the form

X' = −grad V(X),

where V : R^n → R is a C^∞ function, and

grad V = (∂V/∂x_1, ..., ∂V/∂x_n).

(The negative sign in this system is traditional.) The vector field grad V is called the gradient of V. Note that −grad V(X) = grad(−V(X)).

Gradient systems have special properties that make their flows rather simple. The following equality is fundamental:

DV_X(Y) = grad V(X) · Y.

This says that the derivative of V at X evaluated at Y = (y_1, ..., y_n) ∈ R^n is given by the dot product of the vectors grad V(X) and Y. This follows immediately from the formula

DV_X(Y) = Σ_{j=1}^{n} (∂V/∂x_j)(X) y_j.

Let X(t) be a solution of the gradient system with X(0) = X_0, and let V̇ : R^n → R be the derivative of V along this solution. That is,

V̇(X) = (d/dt) V(X(t)).

Proposition. The function V is a Liapunov function for the system X' = −grad V(X). Moreover, V̇(X) = 0 if and only if X is an equilibrium point.

Proof: By the chain rule we have

V̇(X) = DV_X(X')
      = grad V(X) · (−grad V(X))
      = −|grad V(X)|^2 ≤ 0.

In particular, V̇(X) = 0 if and only if grad V(X) = 0.

An immediate consequence of this is the fact that if X* is an isolated minimum of V, then X* is an asymptotically stable equilibrium of the gradient system. Indeed, the fact that X* is isolated guarantees that V̇ < 0 in a neighborhood of X* (not including X*).

To understand a gradient flow geometrically we look at the level surfaces of the function V : R^n → R. These are the subsets V^{−1}(c) with c ∈ R.


If X ∈ V^{−1}(c) is a regular point, that is, grad V(X) ≠ 0, then V^{−1}(c) looks like a "surface" of dimension n − 1 near X. To see this, assume (by renumbering the coordinates) that ∂V/∂x_n(X) ≠ 0. Using the implicit function theorem, we find a C^∞ function g : R^{n−1} → R such that, near X, the level set V^{−1}(c) is given by

V(x_1, ..., x_{n−1}, g(x_1, ..., x_{n−1})) = c.

That is, near X, V^{−1}(c) looks like the graph of the function g. In the special case where n = 2, V^{−1}(c) is a simple curve through X when X is a regular point. If all points in V^{−1}(c) are regular points, then we say that c is a regular value for V. In the case n = 2, if c is a regular value, then the level set V^{−1}(c) is a union of simple (or nonintersecting) curves. If X is a nonregular point for V, then grad V(X) = 0, so X is a critical point for the function V, since all partial derivatives of V vanish at X.

Now suppose that Y is a vector that is tangent to the level surface V^{−1}(c) at X. Then we can find a curve γ(t) in this level set for which γ'(0) = Y. Since V is constant along γ, it follows that

DV_X(Y) = (d/dt)|_{t=0} V ∘ γ(t) = 0.

We thus have, by our observations above, that grad V(X) · Y = 0, or, in other words, grad V(X) is perpendicular to every tangent vector to the level set V^{−1}(c) at X. That is, the vector field grad V(X) is perpendicular to the level surfaces V^{−1}(c) at all regular points of V. We may summarize all of this in the following theorem.

Theorem. (Properties of Gradient Systems) For the system X' = −grad V(X):

1. If c is a regular value of V, then the vector field is perpendicular to the level set V^{−1}(c).
2. The critical points of V are the equilibrium points of the system.
3. If a critical point is an isolated minimum of V, then this point is an asymptotically stable equilibrium point.

Example. Let V : R^2 → R be the function V(x, y) = x^2(x − 1)^2 + y^2. Then the gradient system

X' = F(X) = −grad V(X)

is given by

x' = −2x(x − 1)(2x − 1)
y' = −2y.


Figure 9.10 The level sets and phase portrait for the gradient system determined by V(x, y) = x^2(x − 1)^2 + y^2.

There are three equilibrium points: (0, 0), (1/2, 0), and (1, 0). The linearizations at these three points yield the following matrices:

DF(0, 0) = [ −2   0 ]
           [  0  −2 ]

DF(1/2, 0) = [ 1   0 ]
             [ 0  −2 ]

DF(1, 0) = [ −2   0 ]
           [  0  −2 ]

Hence (0, 0) and (1, 0) are sinks, while (1/2, 0) is a saddle. Both the x- and y-axes are invariant, as are the lines x = 1/2 and x = 1. Since y' = −2y on these vertical lines, it follows that the stable curve at (1/2, 0) is the line x = 1/2, while the unstable curve at (1/2, 0) is the interval (0, 1) on the x-axis.

The level sets of V and the phase portrait are shown in Figure 9.10. Note that it appears that all solutions tend to one of the three equilibria. This is no accident, for we have:

Proposition. Let Z be an α-limit point or an ω-limit point of a solution of a gradient flow. Then Z is an equilibrium point.

Proof: Suppose Z is an ω-limit point. As in the proof of Lasalle's invariance principle from Section 9.2, one shows that V is constant along the solution through Z. Thus V̇(Z) = 0, and so Z must be an equilibrium point. The case of an α-limit point is similar. In fact, an α-limit point Z of X' = −grad V(X) is an ω-limit point of X' = grad V(X), so that grad V(Z) = 0.
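A numerical sketch of this example's flow confirms the picture. The Python code below (illustrative, assuming SciPy; the initial conditions are chosen to straddle the stable line x = 1/2) integrates the gradient system and reports the limiting equilibrium together with the decrease in V.

from scipy.integrate import solve_ivp

V = lambda x, y: x**2 * (x - 1)**2 + y**2

def f(t, u):
    # Minus the gradient of V.
    x, y = u
    return [-2*x*(x - 1)*(2*x - 1), -2*y]

for x0 in (0.4, 0.6):
    sol = solve_ivp(f, (0.0, 20.0), [x0, 1.0], rtol=1e-10)
    x, y = sol.y[0, -1], sol.y[1, -1]
    print(f"x(0) = {x0}: limit ({x:.4f}, {y:.4f}), "
          f"V: {V(x0, 1.0):.4f} -> {V(x, y):.6f}")
# Starting on either side of the stable line x = 1/2 leads to different sinks.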


In the previous example we see that the sets V⁻¹([0, c]) are closed, bounded, and positively invariant under the gradient flow. Therefore each solution entering such a set is defined for all t ≥ 0 and tends to one of the three equilibria (0, 0), (1, 0), or (1/2, 0). The solution through every point does enter such a set, since the solution through (x, y) enters the set V⁻¹([0, c₀]) where V(x, y) = c₀.

There is one final property that gradient systems share. Note that, in the previous example, the linearizations at the equilibria have only real eigenvalues. Again, this is no accident, for the linearization of a gradient system at an equilibrium point X* is a matrix [a_{ij}] where

\[ a_{ij} = -\frac{\partial^2 V}{\partial x_i\, \partial x_j}(X^*). \]

Since mixed partial derivatives are equal, we have

\[ \frac{\partial^2 V}{\partial x_i\, \partial x_j}(X^*) = \frac{\partial^2 V}{\partial x_j\, \partial x_i}(X^*), \]

and so a_{ij} = a_{ji}. It follows that the matrix corresponding to the linearized system is symmetric. It is known that such matrices have only real eigenvalues. For example, in the 2 × 2 case, a symmetric matrix assumes the form

\[ \begin{pmatrix} a & b \\ b & c \end{pmatrix} \]

and the eigenvalues are easily seen to be

\[ \frac{a + c}{2} \pm \frac{\sqrt{(a - c)^2 + 4b^2}}{2}, \]

both of which are real numbers. A more general case is relegated to Exercise 15. We therefore have:

Proposition. For a gradient system X′ = −grad V(X), the linearized system at any equilibrium point has only real eigenvalues.

9.4 Hamiltonian Systems

In this section we deal with another special type of system, a Hamiltonian system. As we shall see in Chapter 13, these are the types of systems that arise in classical mechanics.


We shall restrict attention in this section to Hamiltonian systems in R². A Hamiltonian system on R² is a system of the form

\[ x' = \frac{\partial H}{\partial y}(x, y), \qquad y' = -\frac{\partial H}{\partial x}(x, y) \]

where H : R² → R is a C^∞ function called the Hamiltonian function.

Example. (Undamped Harmonic Oscillator) Recall that this system is given by

\[ x' = y, \qquad y' = -kx \]

where k > 0. A Hamiltonian function for this system is

\[ H(x, y) = \frac{1}{2}y^2 + \frac{k}{2}x^2. \]

Example. (Ideal Pendulum) The equation for this system, as we saw in Section 9.2, is

\[ \theta' = v, \qquad v' = -\sin\theta. \]

The total energy function

\[ E(\theta, v) = \frac{1}{2}v^2 + 1 - \cos\theta \]

serves as a Hamiltonian function in this case. Note that we say a Hamiltonian function, since we can always add a constant to any Hamiltonian function without changing the equations.

What makes Hamiltonian systems so important is the fact that the Hamiltonian function is a first integral or constant of the motion. That is, H is constant along every solution of the system, or, in the language of the previous sections, Ḣ ≡ 0. This follows immediately from

\[ \dot H = \frac{\partial H}{\partial x}x' + \frac{\partial H}{\partial y}y' = \frac{\partial H}{\partial x}\frac{\partial H}{\partial y} + \frac{\partial H}{\partial y}\left(-\frac{\partial H}{\partial x}\right) = 0. \]
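This identity is easy to confirm symbolically for any particular Hamiltonian. The sketch below is not from the text; it assumes SymPy and checks the pendulum example:

```python
import sympy as sp

theta, v = sp.symbols("theta v", real=True)
H = v**2 / 2 + 1 - sp.cos(theta)       # total energy of the ideal pendulum

# The Hamiltonian vector field: theta' = dH/dv, v' = -dH/dtheta.
theta_dot = sp.diff(H, v)
v_dot = -sp.diff(H, theta)

# H-dot along solutions is grad H dotted with the vector field.
H_dot = sp.diff(H, theta) * theta_dot + sp.diff(H, v) * v_dot
print(sp.simplify(H_dot))              # 0, so H is a first integral
```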


Thus we have:

Proposition. For a Hamiltonian system on R², H is constant along every solution curve.

The importance of knowing that a given system is Hamiltonian is the fact that we can essentially draw the phase portrait without solving the system. Assuming that H is not constant on any open set, we simply plot the level curves H(x, y) = constant. The solutions of the system lie on these level sets; all we need to do is figure out the directions of the solution curves on these level sets. But this is easy since we have the vector field. Note also that the equilibrium points for a Hamiltonian system occur at the critical points of H, that is, at points where both partial derivatives of H vanish.

Example. Consider the system

\[ x' = y, \qquad y' = -x^3 + x. \]

A Hamiltonian function is

\[ H(x, y) = \frac{x^4}{4} - \frac{x^2}{2} + \frac{y^2}{2} + \frac{1}{4}. \]

The constant value 1/4 is irrelevant here; we choose it so that H has minimum value 0, which occurs at (±1, 0), as is easily checked. The only other equilibrium point lies at the origin. The linearized system is

\[ X' = \begin{pmatrix} 0 & 1 \\ 1 - 3x^2 & 0 \end{pmatrix} X. \]

At (0, 0), this system has eigenvalues ±1, so we have a saddle. At (±1, 0), the eigenvalues are ±√2 i, so we have a center, at least for the linearized system. Plotting the level curves of H and adding the directions at nonequilibrium points yields the phase portrait depicted in Figure 9.11. Note that the equilibrium points at (±1, 0) remain centers for the nonlinear system. Also note that the stable and unstable curves at the origin match up exactly. That is, we have solutions that tend to (0, 0) in both forward and backward time. Such solutions are known as homoclinic solutions or homoclinic orbits.

The fact that the eigenvalues of this system assume the special forms ±1 and ±√2 i is again no accident.
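A quick numerical check of these eigenvalues (a sketch, not from the text; assumes NumPy):

```python
import numpy as np

def linearization(x):
    # DF at an equilibrium (x, 0) for x' = y, y' = -x^3 + x.
    return np.array([[0.0, 1.0],
                     [1.0 - 3.0 * x**2, 0.0]])

print(np.linalg.eigvals(linearization(0.0)))   # approximately 1 and -1: a saddle
print(np.linalg.eigvals(linearization(1.0)))   # approximately +/- 1.414j: a center
```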


[Figure 9.11: The phase portrait for x′ = y, y′ = −x³ + x.]

Proposition. Suppose (x₀, y₀) is an equilibrium point for a planar Hamiltonian system. Then the eigenvalues of the linearized system are either ±λ or ±iλ where λ ∈ R.

The proof of the proposition is straightforward (see Exercise 11).

9.5 Exploration: The Pendulum with Constant Forcing

Recall from Section 9.2 that the equations for a nonlinear pendulum are

\[ \theta' = v, \qquad v' = -bv - \sin\theta. \]

Here θ gives the angular position of the pendulum (which we assume to be measured in the counterclockwise direction) and v is its angular velocity. The parameter b > 0 measures the damping.

Now we apply a constant torque to the pendulum in the counterclockwise direction. This amounts to adding a constant to the equation for v′, so the system becomes

\[ \theta' = v, \qquad v' = -bv - \sin\theta + k \]


where we assume that k ≥ 0. Since θ is measured mod 2π, we may think of this system as being defined on the cylinder S¹ × R, where S¹ denotes the unit circle.

1. Find all equilibrium points for this system and determine their stability.
2. Determine the regions in the bk–parameter plane for which there are different numbers of equilibrium points. Describe the motion of the pendulum in each different case.
3. Suppose k > 1. Prove that there exists a periodic solution for this system. Hint: What can you say about the vector field in a strip of the form 0 < c₁ ≤ v ≤ c₂?


when a < 0, a = 0, and a > 0.

3. Suppose that this system is only defined for x, y ≥ 0.
(a) Use the nullclines to sketch the phase portrait for this system for various a- and b-values.
(b) Determine the values of a and b at which a bifurcation occurs.
(c) Sketch the regions in the ab–plane where this system has qualitatively similar phase portraits, and describe the bifurcations that occur as the parameters cross the boundaries of these regions.

4. Consider the system

\[ x' = (\epsilon x + 2y)(z + 1), \qquad y' = (-x + \epsilon y)(z + 1), \qquad z' = -z^3. \]

(a) Show that the origin is not asymptotically stable when ε = 0.
(b) Show that when ε < 0, the basin of attraction of the origin contains the region z > −1.

5. For the nonlinear damped pendulum, show that for every integer n and every angle θ₀ there is an initial condition (θ₀, v₀) whose solution corresponds to the pendulum moving around the circle at least n times, but not n + 1 times, before settling down to the rest position.

6. Find a strict Liapunov function for the equilibrium point (0, 0) of

\[ x' = -2x - y^2, \qquad y' = -y - x^2. \]

Find δ > 0 as large as possible so that the open disk of radius δ and center (0, 0) is contained in the basin of (0, 0).

7. For each of the following functions V(X), sketch the phase portrait of the gradient flow X′ = −grad V(X). Sketch the level surfaces of V on the same diagram. Find all of the equilibrium points and determine their type.
(a) x² + 2y²
(b) x² − y² − 2x + 4y + 5
(c) y sin x
(d) 2x² − 2xy + 5y² + 4x + 4y + 4
(e) x² + y² − z
(f) x²(x − 1) + y²(y − 2) + z²


8. Sketch the phase portraits for the following systems. Determine if the system is Hamiltonian or gradient along the way. (That's a little hint, by the way.)
(a) x′ = x + 2y, y′ = −y
(b) x′ = y² + 2xy, y′ = x² + 2xy
(c) x′ = x² − 2xy, y′ = y² − 2xy
(d) x′ = x² − 2xy, y′ = y² − x²
(e) x′ = −sin² x sin y, y′ = −2 sin x cos x cos y

9. Let X′ = AX be a linear system where

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

(a) Determine conditions on a, b, c, and d that guarantee that this system is a gradient system. Give a gradient function explicitly.
(b) Repeat the previous question for a Hamiltonian system.

10. Consider the planar system

\[ x' = f(x, y), \qquad y' = g(x, y). \]

Determine explicit conditions on f and g that guarantee that this system is a gradient system or a Hamiltonian system.

11. Prove that the linearization at an equilibrium point of a planar Hamiltonian system has eigenvalues that are either ±λ or ±iλ where λ ∈ R.

12. Let T be the torus defined as the square 0 ≤ θ₁, θ₂ ≤ 2π with opposite sides identified. Let F(θ₁, θ₂) = cos θ₁ + cos θ₂. Sketch the phase portrait for the system −grad F in T. Sketch a three-dimensional representation of this phase portrait with T represented as the surface of a doughnut.

13. Repeat the previous exercise, but assume now that F is a Hamiltonian function.

14. On the torus T above, let F(θ₁, θ₂) = cos θ₁ (2 − cos θ₂). Sketch the phase portrait for the system −grad F in T. Sketch a three-dimensional representation of this phase portrait with T represented as the surface of a doughnut.


15. Prove that a 3 × 3 symmetric matrix has only real eigenvalues.

16. A solution X(t) of a system is called recurrent if X(tₙ) → X(0) for some sequence tₙ → ∞. Prove that a gradient dynamical system has no nonconstant recurrent solutions.

17. Show that a closed bounded omega limit set is connected. Give an example of a planar system having an unbounded omega limit set consisting of two parallel lines.


10
Closed Orbits and Limit Sets

In the past few chapters we have concentrated on equilibrium solutions of systems of differential equations. These are undoubtedly among the most important solutions, but there are other types of solutions that are important as well. In this chapter we will investigate another important type of solution, the periodic solution or closed orbit. Recall that a periodic solution occurs for X′ = F(X) if we have a nonequilibrium point X and a time τ > 0 for which φ_τ(X) = X. It follows that φ_{t+τ}(X) = φ_t(X) for all t, so φ_t is a periodic function. The least such τ > 0 is called the period of the solution. As an example, all nonzero solutions of the undamped harmonic oscillator equation are periodic solutions. Like equilibrium points that are asymptotically stable, periodic solutions may also attract other solutions. That is, solutions may limit on periodic solutions just as they can approach equilibria.

In the plane, the limiting behavior of solutions is essentially restricted to equilibria and closed orbits, although there are a few exceptional cases. We will investigate this phenomenon in this chapter in the guise of the important Poincaré-Bendixson theorem. We will see later that, in dimensions greater than two, the limiting behavior of solutions can be quite a bit more complicated.

10.1 Limit Sets

We begin by describing the limiting behavior of solutions of systems of differential equations.


Recall that Y ∈ Rⁿ is an ω-limit point for the solution through X if there is a sequence tₙ → ∞ such that lim_{n→∞} φ_{tₙ}(X) = Y. That is, the solution curve through X accumulates on the point Y as time moves forward. The set of all ω-limit points of the solution through X is the ω-limit set of X and is denoted by ω(X). The α-limit points and the α-limit set α(X) are defined by replacing tₙ → ∞ with tₙ → −∞ in the above definition. By a limit set we mean a set of the form ω(X) or α(X).

Here are some examples of limit sets. If X* is an asymptotically stable equilibrium, it is the ω-limit set of every point in its basin of attraction. Any equilibrium is its own α- and ω-limit set. A periodic solution is the α-limit and ω-limit set of every point on it. Such a solution may also be the ω-limit set of many other points.

Example. Consider the planar system given in polar coordinates by

\[ r' = \tfrac{1}{2}(r - r^3), \qquad \theta' = 1. \]

As we saw in Section 8.1, all nonzero solutions of this equation tend to the periodic solution that resides on the unit circle in the plane. See Figure 10.1. Consequently, the ω-limit set of any nonzero point is this closed orbit.

[Figure 10.1: The phase plane for r′ = ½(r − r³), θ′ = 1.]
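This limiting behavior is easy to observe numerically. The sketch below is not from the text; it assumes SciPy and integrates the radial equation from one initial condition inside and one outside the unit circle:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Radial part of r' = (r - r^3)/2; theta' = 1 merely rotates solutions.
def rdot(t, r):
    return 0.5 * (r - r**3)

for r0 in (0.1, 3.0):
    sol = solve_ivp(rdot, (0.0, 30.0), [r0], rtol=1e-8)
    print(r0, "->", sol.y[0, -1])   # both tend to r = 1, the closed orbit
```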
Example. Consider the system

\[ x' = \sin x\,(-0.1\cos x - \cos y), \qquad y' = \sin y\,(\cos x - 0.1\cos y). \]

There are equilibria which are saddles at the corners of the square (0, 0), (0, π), (π, π), and (π, 0), as well as at many other points.

There are heteroclinic solutions connecting these equilibria in the order listed. See Figure 10.2. There is also a spiral source at (π/2, π/2). All solutions emanating from this source accumulate on the four heteroclinic solutions connecting the equilibria (see Exercise 4 at the end of this chapter). Hence the ω-limit set of any point on these solutions is the square bounded by x = 0, π and y = 0, π.

[Figure 10.2: The ω-limit set of any solution emanating from the source at (π/2, π/2) is the square bounded by the four equilibria and the heteroclinic solutions.]

In three dimensions there are extremely complicated examples of limit sets, which are not very easy to describe. In the plane, however, limit sets are fairly simple. In fact, Figure 10.2 is typical in that one can show that a closed and bounded limit set other than a closed orbit or equilibrium point is made up of equilibria and solutions joining them. The Poincaré-Bendixson theorem discussed in Section 10.5 states that if a closed and bounded limit set in the plane contains no equilibria, then it must be a closed orbit.

Recall from Section 9.2 that a limit set is closed in Rⁿ and is invariant under the flow. We shall also need the following result:

Proposition.
1. If X and Z lie on the same solution curve, then ω(X) = ω(Z) and α(X) = α(Z);
2. If D is a closed, positively invariant set and Z ∈ D, then ω(Z) ⊂ D, and similarly for negatively invariant sets and α-limits;
3. A closed invariant set, in particular, a limit set, contains the α-limit and ω-limit sets of every point in it.


Proof: For (1), suppose that Y ∈ ω(X), and φ_s(X) = Z. If φ_{tₙ}(X) → Y, then we have

\[ \varphi_{t_n - s}(Z) = \varphi_{t_n}(X) \to Y. \]

Hence Y ∈ ω(Z) as well. For (2), if φ_{tₙ}(Z) → Y ∈ ω(Z) as tₙ → ∞, then we have tₙ ≥ 0 for sufficiently large n so that φ_{tₙ}(Z) ∈ D. Hence Y ∈ D since D is a closed set. Finally, part (3) follows immediately from part (2).

10.2 Local Sections and Flow Boxes

For the rest of this chapter, we restrict the discussion to planar systems. In this section we describe the local behavior of the flow associated to X′ = F(X) near a given point X₀ that is not an equilibrium point. Our goal is to construct first a local section at X₀, and then a flow box neighborhood of X₀. In this flow box, solutions of the system behave particularly simply.

Suppose F(X₀) ≠ 0. The transverse line at X₀, denoted by l(X₀), is the straight line through X₀ that is perpendicular to the vector F(X₀) based at X₀. We parametrize l(X₀) as follows. Let V₀ be a unit vector based at X₀ and perpendicular to F(X₀). Then define h : R → l(X₀) by

\[ h(u) = X_0 + uV_0. \]

Since F(X) is continuous, the vector field is not tangent to l(X₀), at least in some open interval of l(X₀) surrounding X₀. We call such an open subinterval containing X₀ a local section at X₀. At each point of a local section S, the vector field points "away from" S, so solutions must cut across a local section. In particular, F(X) ≠ 0 for X ∈ S. See Figure 10.3.

[Figure 10.3: A local section S at X₀ and several representative vectors from the vector field along S.]


[Figure 10.4: The flow box Ψ(N) associated to S.]

Our first use of a local section at X₀ will be to construct an associated flow box in a neighborhood of X₀. A flow box gives a complete description of the behavior of the flow in a neighborhood of a nonequilibrium point by means of a special set of coordinates. An intuitive description of the flow in a flow box is simple: points move in parallel straight lines at constant speed.

Given a local section S at X₀, we may construct a map Ψ from a neighborhood N of the origin in R² to a neighborhood of X₀ as follows. Given (s, u) ∈ R², we define

\[ \Psi(s, u) = \varphi_s(h(u)) \]

where h is the parametrization of the transverse line described above. Note that Ψ maps the vertical line (0, u) in N to the local section S; Ψ also maps horizontal lines in N to pieces of solution curves of the system. Provided that we choose N sufficiently small, the map Ψ is then one to one on N. Also note that DΨ takes the constant vector field (1, 0) in N to the vector field F(X). Using the language of Chapter 4, Ψ is a local conjugacy between the flow of this constant vector field and the flow of the nonlinear system.

We usually take N in the form {(s, u) | |s| < σ} where σ > 0. In this case we sometimes write V_σ = Ψ(N) and call V_σ the flow box at (or about) X₀. See Figure 10.4. An important property of a flow box is that if X ∈ V_σ, then φ_t(X) ∈ S for a unique t ∈ (−σ, σ).

If S is a local section, the solution through a point Z₀ (perhaps far from S) may reach X₀ ∈ S at a certain time t₀; see Figure 10.5. We show that, in a certain local sense, this "time of first arrival" at S is a continuous function of Z₀. More precisely:

Proposition. Let S be a local section at X₀ and suppose φ_{t₀}(Z₀) = X₀. Let W be a neighborhood of Z₀. Then there is an open set U ⊂ W containing Z₀ and a differentiable function τ : U → R such that τ(Z₀) = t₀ and

\[ \varphi_{\tau(X)}(X) \in S \]

for each X ∈ U.


[Figure 10.5: Solutions crossing the local section S.]

Proof: Suppose F(X₀) is the vector (α, β) and recall that (α, β) ≠ (0, 0). For Y = (y₁, y₂) ∈ R², define η : R² → R by

\[ \eta(Y) = Y \cdot F(X_0) = \alpha y_1 + \beta y_2. \]

Recall that Y belongs to the transverse line l(X₀) if and only if Y = X₀ + V where V · F(X₀) = 0. Hence Y ∈ l(X₀) if and only if η(Y) = Y · F(X₀) = X₀ · F(X₀).

Now define G : R² × R → R by

\[ G(X, t) = \eta(\varphi_t(X)) = \varphi_t(X) \cdot F(X_0). \]

We have G(Z₀, t₀) = X₀ · F(X₀) since φ_{t₀}(Z₀) = X₀. Furthermore

\[ \frac{\partial G}{\partial t}(Z_0, t_0) = |F(X_0)|^2 \neq 0. \]

We may thus apply the implicit function theorem to find a smooth function τ : R² → R defined on a neighborhood U₁ of (Z₀, t₀) such that τ(Z₀) = t₀ and

\[ G(X, \tau(X)) \equiv G(Z_0, t_0) = X_0 \cdot F(X_0). \]

Hence φ_{τ(X)}(X) belongs to the transverse line l(X₀). If U ⊂ U₁ is a sufficiently small neighborhood of Z₀, then φ_{τ(X)}(X) ∈ S, as required.
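Numerically, the arrival time τ(X) is exactly what an ODE solver's event detection computes: it locates a zero of η(φ_t(X)) − X₀ · F(X₀) along the solution. A sketch (not from the text; assumes SciPy, with the system, X₀, and the initial point chosen purely for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp

def F(X):
    x, y = X
    return np.array([y, -x])           # illustrative system: a linear center

X0 = np.array([1.0, 0.0])              # base point of the section; F(X0) = (0, -1)

def section_event(t, X):
    # Zero exactly when phi_t(X) lies on the transverse line l(X0):
    # (phi_t(X) - X0) . F(X0) = 0.
    return np.dot(X - X0, F(X0))
section_event.direction = 1            # count crossings in one direction only

sol = solve_ivp(lambda t, X: F(X), (0.0, 10.0), [0.0, -1.0],
                events=section_event, rtol=1e-10)
print(sol.t_events[0][0])              # ~ 3*pi/2: the solution reaches X0 itself
```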


10.3 The Poincaré Map

As in the case of equilibrium points, closed orbits may also be stable, asymptotically stable, or unstable. The definitions of these concepts for closed orbits are entirely analogous to those for equilibria as in Section 8.4. However, determining the stability of closed orbits is much more difficult than the corresponding problem for equilibria. While we do have a tool that resembles the linearization technique used to determine the stability of (most) equilibria, generally this tool is much more difficult to use in practice. Here is the tool.

Given a closed orbit γ, there is an associated Poincaré map for γ, some examples of which we previously encountered in Sections 1.4 and 6.2. Near a closed orbit, this map is defined as follows. Choose X₀ ∈ γ and let S be a local section at X₀. We consider the first return map on S. This is the function P that associates to X ∈ S the point P(X) = φ_t(X) ∈ S, where t is the smallest positive time for which φ_t(X) ∈ S. Now P may not be defined at all points on S, as the solutions through certain points in S may never return to S. But we certainly have P(X₀) = X₀, and the previous proposition guarantees that P is defined and continuously differentiable in a neighborhood of X₀.

In the case of planar systems, a local section is a subset of a straight line through X₀, so we may regard this local section as a subset of R and take X₀ = 0 ∈ R. Hence the Poincaré map is a real function taking 0 to 0. If |P′(0)| < 1, it follows that P assumes the form P(x) = ax + higher order terms, where |a| < 1. Hence, for x near 0, P(x) is closer to 0 than x. This means that the solution through the corresponding point in S moves closer to γ after one passage through the local section. Continuing, we see that each passage through S brings the solution closer to γ, and so γ is asymptotically stable. We have:

Proposition. Let X′ = F(X) be a planar system and suppose that X₀ lies on a closed orbit γ. Let P be a Poincaré map defined on a neighborhood of X₀ in some local section. If |P′(X₀)| < 1, then γ is asymptotically stable.

Example. Consider the planar system given in polar coordinates by

\[ r' = r(1 - r), \qquad \theta' = 1. \]

Clearly, there is a closed orbit lying on the unit circle r = 1. This solution in rectangular coordinates is given by (cos t, sin t) when the initial condition is (1, 0). Also, there is a local section lying along the positive real axis since θ′ = 1. Furthermore, given any x ∈ (0, ∞), the point φ_{2π}(x, 0) also lies on the positive real axis R⁺. Thus we have a Poincaré map P : R⁺ → R⁺. Moreover, P(1) = 1 since the point x = 1, y = 0 is the initial condition giving the periodic solution. To check the stability of this solution, we need to compute P′(1).


To do this, we compute the solution starting at (x, 0). We have θ(t) = t, so we need to find r(2π). To compute r(t), we separate variables to find

\[ \int \frac{dr}{r(1 - r)} = t + \text{constant}. \]

Evaluating this integral yields

\[ r(t) = \frac{x e^t}{1 - x + x e^t}. \]

Hence

\[ P(x) = r(2\pi) = \frac{x e^{2\pi}}{1 - x + x e^{2\pi}}. \]

Differentiating, we find P′(1) = 1/e^{2π}, so that 0 < P′(1) < 1. Hence the closed orbit on the unit circle is asymptotically stable.
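The derivative P′(1) can also be estimated without the closed-form solution, by integrating the radial equation over one revolution and differencing. A sketch (not from the text; assumes SciPy):

```python
import numpy as np
from scipy.integrate import solve_ivp

def P(x):
    # Poincare map on the positive x-axis: integrate r' = r(1 - r)
    # over one revolution, i.e., time 2*pi (since theta' = 1).
    sol = solve_ivp(lambda t, r: r * (1 - r), (0.0, 2 * np.pi), [x],
                    rtol=1e-12, atol=1e-12)
    return sol.y[0, -1]

h = 1e-6
print((P(1 + h) - P(1 - h)) / (2 * h))   # ~ 0.00187
print(np.exp(-2 * np.pi))                # the exact value e**(-2*pi)
```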
10.4 Monotone Sequences

Let X₀, X₁, X₂, ... be a finite or infinite sequence of distinct points on the solution curve through X₀. We say that the sequence is monotone along the solution if φ_{tₙ}(X₀) = Xₙ with 0 ≤ t₁ < t₂ < ···. Let Y₀, Y₁, Y₂, ... be a finite or infinite sequence of distinct points lying on a straight-line segment I in R². We say that this sequence is monotone along I if each Yₙ lies between Yₙ₋₁ and Yₙ₊₁ in the natural order along I. A sequence of points lying on both a solution curve and a segment may be monotone along the solution but not along the segment, or vice versa; see Figure 10.6.

However, this is impossible if the segment is a local section in the plane.

[Figure 10.6: Two solutions crossing a straight line. On the left, X₀, X₁, X₂ is monotone along the solution but not along the straight line. On the right, X₀, X₁, X₂ is monotone along both the solution and the line.]

Proposition. Let S be a local section for a planar system of differential equations, and let Y₀, Y₁, Y₂, ... be a sequence of distinct points in S that lie on the same solution curve. If this sequence is monotone along the solution, then it is also monotone along S.

Proof: It suffices to consider three points Y₀, Y₁, and Y₂ in S. Let Γ be the simple closed curve made up of the part of the solution between Y₀ and Y₁ and the segment T ⊂ S between Y₀ and Y₁. Let D be the region bounded by Γ. We suppose that the solution through Y₁ leaves D at Y₁ (see Figure 10.7; if the solution enters D, the argument is similar). Hence the solution leaves D at every point in T, since T is part of the local section.

It follows that the complement of D is positively invariant, for no solution can enter D at a point of T; nor can it cross the solution connecting Y₀ and Y₁, by uniqueness of solutions.

Therefore φ_t(Y₁) ∈ R² − D for all t > 0. In particular, Y₂ ∈ S − T. The set S − T is the union of two half-open intervals I₀ and I₁ with Y_j an endpoint of I_j for j = 0, 1. One can draw an arc from a point φ_ε(Y₁) (with ε > 0 very small) to a point of I₁ without crossing Γ. Therefore I₁ is outside D. Similarly, I₀ is inside D. It follows that Y₂ ∈ I₁ since it must be outside D. This shows that Y₁ is between Y₀ and Y₂ in S, proving the proposition.

We now come to an important property of limit points.


[Figure 10.7: Solutions exit the region D through T.]

Proposition. For a planar system, suppose that Y ∈ ω(X). Then the solution through Y crosses any local section at no more than one point. The same is true if Y ∈ α(X).

Proof: Suppose for the sake of contradiction that Y₁ and Y₂ are distinct points on the solution through Y, and that S is a local section containing Y₁ and Y₂. Suppose Y ∈ ω(X) (the argument for α(X) is similar). Then Y_k ∈ ω(X) for k = 1, 2. Let V_k be flow boxes at Y_k defined by some intervals J_k ⊂ S; we assume that J₁ and J₂ are disjoint, as depicted in Figure 10.8. The solution through X enters each V_k infinitely often; hence it crosses J_k infinitely often.

[Figure 10.8: The solution through X cannot cross V₁ and V₂ infinitely often.]


Hence there is a sequence

\[ a_1, b_1, a_2, b_2, a_3, b_3, \dots, \]

which is monotone along the solution through X, with aₙ ∈ J₁, bₙ ∈ J₂ for n = 1, 2, .... But such a sequence cannot be monotone along S since J₁ and J₂ are disjoint, contradicting the previous proposition.

10.5 The Poincaré-Bendixson Theorem

In this section we prove a celebrated result concerning planar systems:

Theorem. (Poincaré-Bendixson) Suppose that Ω is a nonempty, closed and bounded limit set of a planar system of differential equations that contains no equilibrium point. Then Ω is a closed orbit.

Proof: Suppose that ω(X) is closed and bounded and that Y ∈ ω(X). (The case of α-limit sets is similar.) We show first that Y lies on a closed orbit, and later that this closed orbit actually is ω(X).

Since Y belongs to ω(X), we know from Section 10.1 that ω(Y) is a nonempty subset of ω(X). Let Z ∈ ω(Y) and let S be a local section at Z. Let V be a flow box associated to S. By the results of the previous section, the solution through Y meets S in exactly one point. On the other hand, there is a sequence tₙ → ∞ such that φ_{tₙ}(Y) → Z; hence infinitely many φ_{tₙ}(Y) belong to V. Therefore we can find r, s ∈ R such that r > s and φ_r(Y), φ_s(Y) ∈ S. It follows that φ_r(Y) = φ_s(Y); hence φ_{r−s}(Y) = Y and r − s > 0. Since ω(X) contains no equilibria, Y must lie on a closed orbit.

It remains to prove that if γ is a closed orbit in ω(X), then γ = ω(X). For this, it is enough to show that

\[ \lim_{t \to \infty} d(\varphi_t(X), \gamma) = 0, \]

where d(φ_t(X), γ) is the distance from φ_t(X) to the set γ (that is, the distance from φ_t(X) to the nearest point of γ).

Let S be a local section at Y ∈ γ. Let ε > 0 and consider a flow box V_ε associated to S. Then there is a sequence t₀ < t₁ < t₂ < ··· with tₙ → ∞ such that the points Xₙ = φ_{tₙ}(X) belong to S, Xₙ → Y as n → ∞, and φ_t(X) ∉ S for tₙ < t < tₙ₊₁.


We claim that there exists an upper bound for the set of positive numbers tₙ₊₁ − tₙ for n sufficiently large. To see this, suppose φ_τ(Y) = Y where τ > 0. Then, for Xₙ sufficiently near Y, φ_τ(Xₙ) ∈ V_ε, and hence

\[ \varphi_{\tau + t}(X_n) \in S \]

for some t ∈ [−ε, ε]. Thus

\[ t_{n+1} - t_n \leq \tau + \varepsilon. \]

This provides the upper bound for tₙ₊₁ − tₙ. Also, tₙ₊₁ − tₙ is clearly at least 2ε, so tₙ → ∞ as n → ∞.

Let β > 0 be small. By continuity of solutions with respect to initial conditions, there exists δ > 0 such that, if |Z − Y| < δ and |t| ≤ τ + ε, then |φ_t(Z) − φ_t(Y)| < β. That is, the distance from the solution φ_t(Z) to γ is less than β for all t satisfying |t| ≤ τ + ε. Let n₀ be so large that |Xₙ − Y| < δ for all n ≥ n₀. Then

\[ |\varphi_t(X_n) - \varphi_t(Y)| < \beta \]

if |t| ≤ τ + ε and n ≥ n₀. Now let t ≥ t_{n₀}. Let n ≥ n₀ be such that

\[ t_n \leq t \leq t_{n+1}. \]

Then

\[ d(\varphi_t(X), \gamma) \leq |\varphi_t(X) - \varphi_{t - t_n}(Y)| = |\varphi_{t - t_n}(X_n) - \varphi_{t - t_n}(Y)| < \beta \]

since |t − tₙ| ≤ τ + ε. This shows that the distance from φ_t(X) to γ is less than β for all sufficiently large t. This completes the proof of the Poincaré-Bendixson theorem.

Example. Another example of an ω-limit set that is neither a closed orbit nor an equilibrium is provided by a homoclinic solution. Consider the system

\[ x' = -y - \left(\frac{x^4}{4} - \frac{x^2}{2} + \frac{y^2}{2}\right)(x^3 - x) \]
\[ y' = x^3 - x - \left(\frac{x^4}{4} - \frac{x^2}{2} + \frac{y^2}{2}\right) y. \]
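Before analyzing this system, one can integrate it numerically and watch solutions approach the level set where the quartic factor vanishes. A sketch (not from the text; assumes SciPy, with the initial condition chosen for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp

def H(x, y):
    return x**4 / 4 - x**2 / 2 + y**2 / 2

def field(t, X):
    x, y = X
    h = H(x, y)
    return [-y - h * (x**3 - x), x**3 - x - h * y]

# Start far outside; the solution accumulates on the set H = 0,
# which contains the pair of homoclinic loops through the origin.
sol = solve_ivp(field, (0.0, 60.0), [2.0, 2.0], rtol=1e-9)
x, y = sol.y
print(H(x[-1], y[-1]))   # close to 0
```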


[Figure 10.9: A pair of homoclinic solutions in the ω-limit set.]

A computation shows that there are three equilibria: at (0, 0), (−1, 0), and (1, 0). The origin is a saddle, while the other two equilibria are sources. The phase portrait of this system is shown in Figure 10.9. Note that solutions far from the origin tend to accumulate on the origin and a pair of homoclinic solutions, each of which leaves and then returns to the origin. Solutions emanating from either source have ω-limit set that consists of just one homoclinic solution and (0, 0). See Exercise 6 for proofs of these facts.

10.6 Applications of Poincaré-Bendixson

The Poincaré-Bendixson theorem essentially determines all of the possible limiting behaviors of a planar flow. We give a number of corollaries of this important theorem in this section.

A limit cycle is a closed orbit γ such that γ ⊂ ω(X) or γ ⊂ α(X) for some X ∉ γ. In the first case γ is called an ω-limit cycle; in the second case, an α-limit cycle. We deal only with ω-limit sets in this section; the case of α-limit sets is handled by simply reversing time.

In the proof of the Poincaré-Bendixson theorem it was shown that limit cycles have the following property: If γ is an ω-limit cycle, there exists X ∉ γ such that

\[ \lim_{t \to \infty} d(\varphi_t(X), \gamma) = 0. \]

Geometrically this means that some solution spirals toward γ as t → ∞. See Figure 10.10.

[Figure 10.10: A solution spiraling toward a limit cycle.]

Not all closed orbits have this property. For example, in the case of a linear system with a center at the origin in R², the closed orbits that surround the origin have no solutions approaching them, and so they are not limit cycles.


Limit cycles possess a kind of (one-sided, at least) stability. Let γ be an ω-limit cycle and suppose φ_t(X) spirals toward γ as t → ∞. Let S be a local section at Z ∈ γ. Then there is an interval T ⊂ S disjoint from γ, bounded by φ_{t₀}(X) and φ_{t₁}(X) with t₀ < t₁. The region A bounded by γ, T, and the piece of the solution curve between φ_{t₀}(X) and φ_{t₁}(X) is positively invariant, and every solution starting in A − γ spirals toward γ; see Figure 10.11. Hence we have:

Corollary 1. Let γ be an ω-limit cycle. If γ = ω(X) where X ∉ γ, then there is a neighborhood U of X such that γ = ω(Y) for all Y ∈ U. In other words, the set of all points whose ω-limit set is γ, minus γ itself, is open.


[Figure 10.11: The region A is positively invariant.]

Corollary 2. A compact set K that is positively or negatively invariant contains either a limit cycle or an equilibrium point.

The next result exploits the spiraling property of limit cycles.

Corollary 3. Let γ be a closed orbit and let U be the open region in the interior of γ. Then U contains either an equilibrium point or a limit cycle.

Proof: Let D be the compact set U ∪ γ. Then D is invariant since no solution in U can cross γ. If U contains no limit cycle and no equilibrium, then, for any X ∈ U,

\[ \omega(X) = \alpha(X) = \gamma \]

by Poincaré-Bendixson. If S is a local section at a point Z ∈ γ, there are sequences tₙ → ∞, sₙ → −∞ such that φ_{tₙ}(X), φ_{sₙ}(X) ∈ S and both φ_{tₙ}(X) and φ_{sₙ}(X) tend to Z as n → ∞. But this leads to a contradiction of the proposition in Section 10.4 on monotone sequences.

Actually this last result can be considerably sharpened:

Corollary 4. Let γ be a closed orbit that forms the boundary of an open set U. Then U contains an equilibrium point.

Proof: Suppose U contains no equilibrium point. Consider first the case that there are only finitely many closed orbits in U. We may choose the closed orbit that bounds the region with smallest area. There are then no closed orbits or equilibrium points inside this region, and this contradicts Corollary 3.


Now suppose that there are infinitely many closed orbits in U. If Xₙ → X in U and each Xₙ lies on a closed orbit, then X must lie on a closed orbit. Otherwise, the solution through X would spiral toward a limit cycle since there are no equilibria in U. By Corollary 1, so would the solution through some nearby Xₙ, which is impossible.

Let ν ≥ 0 be the greatest lower bound of the areas of regions enclosed by closed orbits in U. Let {γₙ} be a sequence of closed orbits enclosing regions of areas νₙ such that limₙ→∞ νₙ = ν. Let Xₙ ∈ γₙ. Since γ ∪ U is compact, we may assume that Xₙ → X ∈ U. Then, if U contains no equilibrium, X lies on a closed orbit β bounding a region of area ν. The usual section argument shows that as n → ∞, γₙ gets arbitrarily close to β, and hence the area νₙ − ν of the region between γₙ and β goes to 0. Then the argument above shows that there can be no closed orbits or equilibrium points inside β, and this provides a contradiction to Corollary 3.

The following result uses the spiraling properties of limit cycles in a subtle way.

Corollary 5. Let H be a first integral of a planar system. If H is not constant on any open set, then there are no limit cycles.

Proof: Suppose there is a limit cycle γ; let c ∈ R be the constant value of H on γ. If X(t) is a solution that spirals toward γ, then H(X(t)) ≡ c by continuity of H. In Corollary 1 we found an open set whose solutions spiral toward γ; thus H is constant on an open set.

Finally, the following result is implicit in our development of the theory of Liapunov functions in Section 9.2.

Corollary 6. If L is a strict Liapunov function for a planar system, then there are no limit cycles.

10.7 Exploration: Chemical Reactions That Oscillate

For much of the 20th century, chemists believed that all chemical reactions tended monotonically to equilibrium. This belief was shattered in the 1950s when the Russian biochemist Belousov discovered that a certain reaction involving citric acid, bromate ions, and sulfuric acid, when combined with a cerium catalyst, could oscillate for long periods of time before settling to equilibrium.


The concoction would turn yellow for a while, then fade, then turn yellow again, then fade, and on and on like this for over an hour. This reaction, now called the Belousov-Zhabotinsky reaction (the BZ reaction, for short), was a major turning point in the history of chemical reactions. Now, many systems are known to oscillate. Some have even been shown to behave chaotically.

One particularly simple chemical reaction is given by a chlorine dioxide–iodine–malonic acid interaction. The exact differential equations modeling this reaction are extremely complicated. However, there is a planar nonlinear system that closely approximates the concentrations of two of the reactants. The system is

\[ x' = a - x - \frac{4xy}{1 + x^2} \]
\[ y' = bx\left(1 - \frac{y}{1 + x^2}\right) \]

where x and y represent the concentrations of I⁻ and ClO₂⁻, respectively, and a and b are positive parameters.

1. Begin the exploration by investigating these reaction equations numerically (a starting sketch appears after this list). What qualitatively different types of phase portraits do you find?
2. Find all equilibrium points for this system.
3. Linearize the system at your equilibria and determine the type of each equilibrium.
4. In the ab–plane, sketch the regions where you find asymptotically stable or unstable equilibria.
5. Identify the a, b-values where the system undergoes bifurcations.
6. Using the nullclines for the system together with the Poincaré-Bendixson theorem, find the a, b-values for which a stable limit cycle exists. Why do these values correspond to oscillating chemical reactions?

For more details on this reaction, see [27]. The very interesting history of the BZ reaction is described in [47]. The original paper by Belousov is reprinted in [17].
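A minimal numerical starting point for item 1 (not from the text; assumes SciPy, with the parameter pairs chosen arbitrarily):

```python
import numpy as np
from scipy.integrate import solve_ivp

def reaction(t, X, a, b):
    x, y = X
    return [a - x - 4 * x * y / (1 + x**2),
            b * x * (1 - y / (1 + x**2))]

# Try a few (a, b) pairs and see whether solutions settle down or oscillate.
for a, b in [(10.0, 6.0), (10.0, 2.0)]:
    sol = solve_ivp(reaction, (0.0, 100.0), [1.0, 1.0], args=(a, b),
                    dense_output=True, rtol=1e-8)
    tail = sol.sol(np.linspace(80.0, 100.0, 400))[0]
    print(a, b, "late-time range of x:", tail.min(), tail.max())
    # A wide late-time range suggests a limit cycle; a narrow one, a sink.
```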
EXERCISES

1. For each of the following systems, identify all points that lie in either an ω- or an α-limit set:
(a) r′ = r − r², θ′ = 1
(b) r′ = r³ − 3r² + 2r, θ′ = 1
(c) r′ = sin r, θ′ = −1
(d) x′ = sin x sin y, y′ = −cos x cos y

2. Consider the three-dimensional system

\[ r' = r(1 - r), \qquad \theta' = 1, \qquad z' = -z. \]

Compute the Poincaré map along the closed orbit lying on the unit circle given by r = 1 and show that this closed orbit is asymptotically stable.

3. Consider the three-dimensional system

\[ r' = r(1 - r), \qquad \theta' = 1, \qquad z' = z. \]

Again compute the Poincaré map for this system. What can you now say about the behavior of solutions near the closed orbit? Sketch the phase portrait for this system.

4. Consider the system

\[ x' = \sin x\,(-0.1\cos x - \cos y), \qquad y' = \sin y\,(\cos x - 0.1\cos y). \]

Show that all solutions emanating from the source at (π/2, π/2) have ω-limit sets equal to the square bounded by x = 0, π and y = 0, π.

5. The system

\[ r' = ar + r^3 - r^5, \qquad \theta' = 1 \]

depends on a parameter a. Determine the phase plane for representative a-values and describe all bifurcations for the system.

6. Consider the system

\[ x' = -y - \left(\frac{x^4}{4} - \frac{x^2}{2} + \frac{y^2}{2}\right)(x^3 - x) \]
\[ y' = x^3 - x - \left(\frac{x^4}{4} - \frac{x^2}{2} + \frac{y^2}{2}\right) y. \]

(a) Find all equilibrium points.
(b) Determine the types of these equilibria.
(c) Prove that all nonequilibrium solutions have ω-limit sets consisting of either one or two homoclinic solutions plus a saddle point.


[Figure 10.12: The region A is positively invariant.]

7. Let A be an annular region in R². Let F be a planar vector field that points inward along the two boundary curves of A. Suppose also that every radial segment of A is a local section. See Figure 10.12. Prove there is a periodic solution in A.

8. Let F be a planar vector field and again consider an annular region A as in the previous problem. Suppose that F has no equilibria and that F points inward along the boundary of the annulus, as before.
(a) Prove there is a closed orbit in A. (Notice that the hypothesis is weaker than in the previous problem.)
(b) If there are exactly seven closed orbits in A, show that one of them has orbits spiraling toward it from both sides.

9. Let F be a planar vector field on a neighborhood of the annular region A above. Suppose that for every boundary point X of A, F(X) is a nonzero vector tangent to the boundary.
(a) Sketch the possible phase portraits in A under the further assumption that there are no equilibria and no closed orbits besides the boundary circles. Include the case where the solutions on the boundary travel in opposite directions.
(b) Suppose the boundary solutions are oppositely oriented and that the flow preserves area. Show that A contains an equilibrium.

10. Show that a closed orbit of a planar system meets a local section in at most one point.

11. Show that a closed and bounded limit set is connected (that is, not the union of two disjoint nonempty closed sets).


12. Let X′ = F(X) be a planar system with no equilibrium points. Suppose the flow φ_t generated by F preserves area (that is, if U is any open set, the area of φ_t(U) is independent of t). Show that every solution curve is a closed set.

13. Let γ be a closed orbit of a planar system. Let λ be the period of γ. Let {γₙ} be a sequence of closed orbits. Suppose the period of γₙ is λₙ. If there are points Xₙ ∈ γₙ such that Xₙ → X ∈ γ, prove that λₙ → λ. (This result can be false for higher dimensional systems. It is true, however, that if λₙ → μ, then μ is an integer multiple of λ.)

14. Consider a system in R² having only a finite number of equilibria.
(a) Show that every limit set is either a closed orbit or the union of equilibrium points and solutions φ_t(X) such that lim_{t→∞} φ_t(X) and lim_{t→−∞} φ_t(X) are these equilibria.
(b) Show by example (draw a picture) that the number of distinct solution curves in ω(X) may be infinite.

15. Let X be a recurrent point of a planar system, that is, there is a sequence tₙ → ±∞ such that

\[ \varphi_{t_n}(X) \to X. \]

(a) Prove that either X is an equilibrium or X lies on a closed orbit.
(b) Show by example that there can be a recurrent point for a nonplanar system that is not an equilibrium and does not lie on a closed orbit.

16. Let X′ = F(X) and X′ = G(X) be planar systems. Suppose that

\[ F(X) \cdot G(X) = 0 \]

for all X ∈ R². If F has a closed orbit, prove that G has an equilibrium point.

17. Let γ be a closed orbit for a planar system, and let U be the bounded, open region inside γ. Show that γ is not simultaneously the omega and alpha limit set of points of U. Use this fact and the Poincaré-Bendixson theorem to prove that U contains an equilibrium that is not a saddle. (Hint: Consider the limit sets of points on the stable and unstable curves of saddles.)


11
Applications in Biology

In this chapter we make use of the techniques developed in the previous few chapters to examine some nonlinear systems that have been used as mathematical models for a variety of biological systems. In Section 11.1 we utilize the preceding results involving nullclines and linearization to describe several biological models involving the spread of communicable diseases. In Section 11.2 we investigate the simplest types of equations that model a predator/prey ecology. A more sophisticated approach is used in Section 11.3 to study the populations of a pair of competing species. Instead of developing explicit formulas for these differential equations, we make only qualitative assumptions about the form of the equations. We then derive geometric information about the behavior of solutions of such systems based on these assumptions.

11.1 Infectious Diseases

The spread of infectious diseases such as measles or malaria may be modeled as a nonlinear system of differential equations. The simplest model of this type is the SIR model. Here we divide a given population into three disjoint groups. The population of susceptible individuals is denoted by S, the infected population by I, and the recovered population by R. As usual, each of these is a function of time. We assume for simplicity that the total population is constant, so that (S + I + R)′ = 0.


In the most basic case we make the assumption that, once an individual has been infected and subsequently has recovered, that individual cannot be reinfected. This is the situation that occurs for such diseases as measles, mumps, and smallpox, among many others. We also assume that the rate of transmission of the disease is proportional to the number of encounters between susceptible and infected individuals. The easiest way to characterize this assumption mathematically is to put S′ = −βSI for some constant β > 0. We finally assume that the rate at which infected individuals recover is proportional to the number of infected. The SIR model is then

\[ S' = -\beta SI, \qquad I' = \beta SI - \nu I, \qquad R' = \nu I \]

where β and ν are positive parameters.

As stipulated, we have (S + I + R)′ = 0, so that S + I + R is a constant. This simplifies the system, for if we determine S(t) and I(t), we then derive R(t) for free. Hence it suffices to consider the two-dimensional system

\[ S' = -\beta SI, \qquad I' = \beta SI - \nu I. \]

The equilibria for this system are given by the S-axis (I = 0). Linearization at (S, 0) yields the matrix

\[ \begin{pmatrix} 0 & -\beta S \\ 0 & \beta S - \nu \end{pmatrix}, \]

so the eigenvalues are 0 and βS − ν. This second eigenvalue is negative if 0 < S < ν/β and positive if S > ν/β. The nullclines and direction field are shown in Figure 11.1. In particular, if S₀ > ν/β and I₀ > 0, the susceptible population decreases monotonically, while the infected population at first rises, but eventually reaches a maximum and then declines to 0.

We can actually prove this analytically, because we can explicitly compute a function that is constant along solution curves. Note that the slope of the vector field is a function of S alone:

\[ \frac{I'}{S'} = \frac{\beta SI - \nu I}{-\beta SI} = -1 + \frac{\nu}{\beta S}. \]


[Figure 11.1: The nullclines and direction field for the SIR model.]

Hence we have

\[ \frac{dI}{dS} = \frac{dI/dt}{dS/dt} = -1 + \frac{\nu}{\beta S}, \]

which we can immediately integrate to find

\[ I = I(S) = -S + \frac{\nu}{\beta}\log S + \text{constant}. \]

Hence the function I + S − (ν/β) log S is constant along solution curves. It then follows that there is a unique solution curve connecting each equilibrium point in the interval S > ν/β to one in the interval 0 < S < ν/β, as shown in Figure 11.2.

[Figure 11.2: The phase portrait for the SIR model.]
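The predicted behavior, and the conserved quantity, are easy to confirm numerically. A sketch (not from the text; assumes SciPy, with β, ν, and the initial data chosen for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp

beta, nu = 1.0, 1.0

def sir(t, X):
    S, I = X
    return [-beta * S * I, beta * S * I - nu * I]

sol = solve_ivp(sir, (0.0, 40.0), [3.0, 0.05], rtol=1e-10, max_step=0.1)
S, I = sol.y

# I peaks where S crosses nu/beta, then declines to 0.
print("max I =", I.max(), "at S =", S[I.argmax()])   # S near nu/beta = 1

# The quantity I + S - (nu/beta) log S is constant along the solution.
C = I + S - (nu / beta) * np.log(S)
print("spread of conserved quantity:", C.max() - C.min())   # ~ 0
```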


A slightly more complicated model for infectious diseases arises when we assume that recovered individuals may lose their immunity and become reinfected with the disease. Examples of this type of disease include malaria and tuberculosis. We assume that the return of recovered individuals to the class S occurs at a rate proportional to the population of recovered individuals. This leads to the SIRS model (where the extra S indicates that recovered individuals may reenter the susceptible group). The system becomes

\[ S' = -\beta SI + \mu R, \qquad I' = \beta SI - \nu I, \qquad R' = \nu I - \mu R. \]

Again we see that the total population S + I + R is a constant, which we denote by τ. We may eliminate R from this system by setting R = τ − S − I:

\[ S' = -\beta SI + \mu(\tau - S - I), \qquad I' = \beta SI - \nu I. \]

Here β, μ, ν, and τ are all positive parameters.

Unlike the SIR model, we now have at most two equilibria, one at (τ, 0) and the other at

\[ (S^*, I^*) = \left(\frac{\nu}{\beta},\; \frac{\mu(\tau - \nu/\beta)}{\nu + \mu}\right). \]

The first equilibrium point corresponds to no disease whatsoever in the population. The second equilibrium point exists only when τ ≥ ν/β. When τ = ν/β, we have a bifurcation as the two equilibria coalesce at (τ, 0). The quantity ν/β is called the threshold level for the disease.

The linearized system is given by

\[ Y' = \begin{pmatrix} -\beta I - \mu & -\beta S - \mu \\ \beta I & \beta S - \nu \end{pmatrix} Y. \]

At the equilibrium point (τ, 0), the eigenvalues are −μ and βτ − ν, so this equilibrium point is a saddle provided that the total population exceeds the threshold level. At the second equilibrium point, a straightforward computation shows that the trace of the matrix is negative, while the determinant is positive. It then follows from the results in Chapter 4 that both eigenvalues have negative real parts, and so this equilibrium point is asymptotically stable.
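That computation can be checked symbolically. A sketch (not from the text; assumes SymPy):

```python
import sympy as sp

S, I, beta, mu, nu, tau = sp.symbols("S I beta mu nu tau", positive=True)

Sdot = -beta * S * I + mu * (tau - S - I)
Idot = beta * S * I - nu * I

J = sp.Matrix([Sdot, Idot]).jacobian([S, I])

# Evaluate at the endemic equilibrium (S*, I*).
Sstar = nu / beta
Istar = mu * (tau - nu / beta) / (nu + mu)
Jstar = J.subs({S: Sstar, I: Istar})

print(sp.simplify(Jstar.trace()))   # negative whenever tau > nu/beta
print(sp.simplify(Jstar.det()))     # positive whenever tau > nu/beta
```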


Biologically, this means that the disease may become established in the community only when the total population exceeds the threshold level. We will only consider this case in what follows.

Note that the SIRS system is only of interest in the region given by I, S ≥ 0 and S + I ≤ τ. Denote this triangular region by Δ (of course!). Note that the I-axis is no longer invariant, while on the S-axis, solutions increase up to the equilibrium at (τ, 0).

Proposition. The region Δ is positively invariant.

Proof: We check the direction of the vector field along the boundary of Δ. The field is tangent to the boundary along the lower edge I = 0 as well as at (0, τ). Along S = 0 we have S′ = μ(τ − I) > 0, so the vector field points inward for 0 < I < τ. Along the hypotenuse S + I = τ we have (S + I)′ = −νI < 0 for I > 0, so here too the vector field points into Δ. Therefore Δ is positively invariant. The nullclines and the phase portrait in Δ are displayed in Figure 11.3.

[Figure 11.3: The nullclines and phase portrait in Δ for the SIRS system. Here β = ν = μ = 1 and τ = 2.]

11.2 Predator/Prey Systems

We next consider a pair of species, one of which, the predator (with population y), preys on the other, the prey (with population x).


We assume that, in the absence of predators, the prey population grows at a rate proportional to the current population. That is, as in Chapter 1, when y = 0 we have x′ = ax where a > 0. So in this case x(t) = x₀ exp(at). When predators are present, we assume that the prey population decreases at a rate proportional to the number of predator/prey encounters. As in the previous section, one simple model for this is bxy where b > 0. So the differential equation for the prey population is x′ = ax − bxy.

For the predator population, we make more or less the opposite assumptions. In the absence of prey, the predator population declines at a rate proportional to the current population. So when x = 0 we have y′ = −cy with c > 0, and thus y(t) = y₀ exp(−ct). The predator species becomes extinct in this case. When there are prey in the environment, we assume that the predator population increases at a rate proportional to the predator/prey meetings, or dxy. We do not at this stage assume anything about overcrowding. Thus our simplified predator/prey system (also called the Volterra-Lotka system) is

\[ x' = ax - bxy = x(a - by) \]
\[ y' = -cy + dxy = y(-c + dx) \]

where the parameters a, b, c, and d are all assumed to be positive. Since we are dealing with populations, we only consider x, y ≥ 0.

As usual, our first job is to locate the equilibrium points. These occur at the origin and at (x, y) = (c/d, a/b). The linearized system is

\[ X' = \begin{pmatrix} a - by & -bx \\ dy & -c + dx \end{pmatrix} X, \]

so when x = y = 0 we have a saddle with eigenvalues a and −c.


We know the stable and unstable curves: They are the y- and x-axes, respectively.

At the other equilibrium point (c/d, a/b), the eigenvalues are pure imaginary ±i√(ac), and so we cannot conclude anything at this stage about stability of this equilibrium point.

We next sketch the nullclines for this system. The x-nullclines are given by the straight lines x = 0 and y = a/b, whereas the y-nullclines are y = 0 and x = c/d. The nonzero nullcline lines separate the region x, y > 0 into four basic regions in which the vector field points as indicated in Figure 11.4. Hence the solutions wind in the counterclockwise direction about the equilibrium point.

[Figure 11.4: The nullclines and direction field for the predator/prey system.]

From this, we cannot determine the precise behavior of solutions: They could possibly spiral in toward the equilibrium point, spiral toward a limit cycle, spiral out toward "infinity" and the coordinate axes, or else lie on closed orbits. To make this determination, we search for a Liapunov function L. Employing the trick of separation of variables, we look for a function of the form

\[ L(x, y) = F(x) + G(y). \]

Recall that L̇ denotes the time derivative of L along solutions. We compute

\[ \dot L(x, y) = \frac{d}{dt}L(x(t), y(t)) = \frac{dF}{dx}x' + \frac{dG}{dy}y'. \]

Hence

\[ \dot L(x, y) = x\frac{dF}{dx}(a - by) + y\frac{dG}{dy}(-c + dx). \]


We obtain L̇ ≡ 0 provided

\[ \frac{x\,dF/dx}{dx - c} \equiv \frac{y\,dG/dy}{by - a}. \]

Since x and y are independent variables, this is possible if and only if

\[ \frac{x\,dF/dx}{dx - c} = \frac{y\,dG/dy}{by - a} = \text{constant}. \]

Setting the constant equal to 1, we obtain

\[ \frac{dF}{dx} = d - \frac{c}{x}, \qquad \frac{dG}{dy} = b - \frac{a}{y}. \]

Integrating, we find

\[ F(x) = dx - c\log x, \qquad G(y) = by - a\log y. \]

Thus the function

\[ L(x, y) = dx - c\log x + by - a\log y \]

is constant on solution curves of the system when x, y > 0.

By considering the signs of ∂L/∂x and ∂L/∂y, it is easy to see that the equilibrium point Z = (c/d, a/b) is an absolute minimum for L. It follows that L [or, more precisely, L − L(Z)] is a Liapunov function for the system. Therefore Z is a stable equilibrium.

We note next that there are no limit cycles; this follows from Corollary 5 in Section 10.6, because L is a first integral that is not constant on any open set. We now prove the following theorem.

Theorem. Every solution of the predator/prey system is a closed orbit (except the equilibrium point Z and the coordinate axes).

Proof: Consider the solution through W ≠ Z, where W does not lie on the x- or y-axis. This solution spirals around Z, crossing each nullcline infinitely often. Thus there is a doubly infinite sequence ··· < t₋₁ < t₀ < t₁ < ··· such that φ_{tₙ}(W) is on the line x = c/d, and tₙ → ±∞ as n → ±∞. If the solution through W is not a closed orbit, the points φ_{tₙ}(W) are monotone along the line x = c/d, as discussed in the previous chapter.


Since there are no limit cycles, either φ_{tₙ}(W) → Z as n → ∞ or φ_{tₙ}(W) → Z as n → −∞. Since L is constant along the solution through W, this implies that L(W) = L(Z). But this contradicts minimality of L(Z). This completes the proof.

The phase portrait for this predator/prey system is displayed in Figure 11.5. We conclude that, for any given initial populations (x(0), y(0)) with x(0) ≠ 0 and y(0) ≠ 0, other than Z, the populations of predator and prey oscillate cyclically. No matter what the populations of prey and predator are, neither species will die out, nor will its population grow indefinitely.

[Figure 11.5: The nullclines and phase portrait for the predator/prey system.]
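The closed orbits can be observed numerically: along any solution the first integral L stays fixed, so each orbit lies on a single level set of L. A sketch (not from the text; assumes SciPy, with the parameters chosen for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, c, d = 1.0, 1.0, 1.0, 1.0       # illustrative parameter choice

def volterra_lotka(t, X):
    x, y = X
    return [x * (a - b * y), y * (-c + d * x)]

def L(x, y):
    return d * x - c * np.log(x) + b * y - a * np.log(y)

sol = solve_ivp(volterra_lotka, (0.0, 20.0), [2.0, 0.5],
                rtol=1e-11, atol=1e-11, max_step=0.05)
x, y = sol.y
print("variation in L:", L(x, y).max() - L(x, y).min())   # ~ 0
```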
Now let us introduce overcrowding into the prey equation. As in the logistic model in Chapter 1, the equations for prey, in the absence of predators, may be written in the form

\[ x' = ax - \lambda x^2. \]

We also assume that the predator population obeys a similar equation

\[ y' = -cy - \mu y^2 \]

when x = 0. Incorporating the assumptions above yields the predator/prey equations for species with limited growth:

\[ x' = x(a - by - \lambda x) \]
\[ y' = y(-c + dx - \mu y). \]

As before, the parameters a, b, c, and d, as well as λ and μ, are all positive. When y = 0, we have the logistic equation x′ = x(a − λx), which yields equilibria at the origin and at (a/λ, 0). As we saw in Chapter 1, all nonzero solutions on the x-axis tend to a/λ.

When x = 0, the equation for y is y′ = −cy − μy². Since y′ < 0 when y > 0, it follows that all solutions on this axis tend to the origin. Thus we confine attention to the upper-right quadrant Q where x, y > 0.

The nullclines are given by the x- and y-axes, together with the lines

\[ L:\ a - by - \lambda x = 0 \]
\[ M:\ -c + dx - \mu y = 0. \]

Along the lines L and M, we have x′ = 0 and y′ = 0, respectively. There are two possibilities, according to whether these lines intersect in Q or not.

We first consider the case where the two lines do not meet in Q. In this case we have the nullcline configuration depicted in Figure 11.6. All solutions to the right of M head upward and to the left until they meet M; between the lines L and M, solutions now head downward and to the left. Thus they either meet L or tend directly to the equilibrium point at (a/λ, 0). If solutions cross L, they then head right and downward, but they cannot cross L again. Thus they too tend to (a/λ, 0). Thus all solutions in Q tend to this equilibrium point. We conclude that, in this case, the predator population becomes extinct and the prey population approaches its limiting value of a/λ.

[Figure 11.6: The nullclines and phase portrait for a predator/prey system with limited growth when the nullclines do not meet in Q.]

We may interpret the behavior of solutions near the nullclines as follows. Since x′ and y′ are never both positive, it is impossible for the prey and predator populations to increase at the same time.


If the prey population is above its limiting value, it must decrease. After a while the lack of prey causes the predator population to begin to decrease (when the solution crosses M). After that point the prey population can never increase past a/λ, and so the predator population continues to decrease. If the solution crosses L, the prey population increases again (but not past a/λ), while the predators continue to die off. In the limit the predators disappear and the prey population stabilizes at a/λ.

Suppose now that L and M cross at a point Z = (x₀, y₀) in the quadrant Q; of course, Z is an equilibrium. The linearization of the vector field at Z is

\[ X' = \begin{pmatrix} -\lambda x_0 & -b x_0 \\ d y_0 & -\mu y_0 \end{pmatrix} X. \]

The characteristic polynomial has trace given by −λx₀ − μy₀ < 0 and determinant (bd + λμ)x₀y₀ > 0. From the trace-determinant plane of Chapter 4, we see that Z has eigenvalues that are either both negative or both complex with negative real parts. Hence Z is asymptotically stable.

Note that, in addition to the equilibria at Z and (0, 0), there is still an equilibrium at (a/λ, 0). Linearization shows that this equilibrium is a saddle; its stable curve lies on the x-axis. See Figure 11.7.

[Figure 11.7: The nullclines and phase portrait for a predator/prey system with limited growth when the nullclines do meet in Q.]

It is not easy to determine the basin of Z, nor do we know whether there are any limit cycles. Nevertheless we can obtain some information. The line L meets the x-axis at (a/λ, 0) and the y-axis at (0, a/b). Let Γ be a rectangle whose corners are (0, 0), (p, 0), (0, q), and (p, q), with p > a/λ, q > a/b, and the point (p, q) lying on the line M. Every solution at a boundary point of Γ either enters Γ or is part of the boundary.


Therefore Γ is positively invariant. Every point in Q is contained in such a rectangle.

By the Poincaré-Bendixson theorem, the ω-limit set of any point (x, y) in Γ, with x, y > 0, must be a limit cycle or contain one of the three equilibria (0, 0), Z, or (a/λ, 0). We rule out (0, 0) and (a/λ, 0) by noting that these equilibria are saddles whose stable curves lie on the x- or y-axes. Therefore ω(x, y) is either Z or a limit cycle in Γ. By Corollary 4 of the Poincaré-Bendixson theorem, any limit cycle must surround Z.

We observe further that any such rectangle Γ contains all limit cycles, because a limit cycle (like any solution) must enter Γ, and Γ is positively invariant. Fixing (p, q) as above, it follows that for any initial values (x(0), y(0)), there exists t₀ > 0 such that x(t) < p and y(t) < q for all t ≥ t₀.

11.3 Competitive Species

We consider now two species that compete for a common food supply. Instead of analyzing specific equations, we follow a different procedure: We consider a large class of systems about which we make only qualitative assumptions. The species have populations x and y governed by a system of the form

\[ x' = M(x, y)\,x, \qquad y' = N(x, y)\,y, \]

where the growth rates M and N are C¹ functions and we restrict attention to x, y ≥ 0. We make the following assumptions:

1. If either species increases, the growth rate of the other goes down.


11.3 Competitive Species

We now turn to the case of two species that compete for a common food supply. Rather than analyzing one specific system, we consider a large class of systems

x' = M(x, y)x
y' = N(x, y)y,

where the growth rates M and N are C^1 functions of the nonnegative variables x and y, about which we assume only a few qualitative features:

1. Because the species compete for the same resources, if the population of either species increases, then the growth rate of the other decreases. Hence

∂M/∂y < 0 and ∂N/∂x < 0.

2. If either population is very large, both populations decrease. Hence there exists K > 0 such that

M(x, y) < 0 and N(x, y) < 0 if x ≥ K or y ≥ K.

3. In the absence of either species, the other has a positive growth rate up to a certain population and a negative growth rate beyond it. Therefore there are constants a, b > 0 such that

M(x, 0) > 0 for x < a and M(x, 0) < 0 for x > a,
N(0, y) > 0 for y < b and N(0, y) < 0 for y > b.

By conditions (1) and (3) each vertical line {x} × R meets the set μ = M^(-1)(0) exactly once if 0 ≤ x ≤ a and not at all if x > a. By condition (1) and the implicit function theorem, μ is the graph of a nonnegative function f: [0, a] → R such that f^(-1)(0) = a. Below the curve μ, M is positive and above it, M is negative. In the same way the set ν = N^(-1)(0) is a smooth curve of the form

{(x, y) | x = g(y)},

where g: [0, b] → R is a nonnegative function with g^(-1)(0) = b. The function N is positive to the left of ν and negative to the right.

Suppose first that μ and ν do not intersect and that μ is below ν. Then the phase portrait can be determined immediately from the nullclines. The equilibria are (0, 0), (a, 0), and (0, b). The origin is a source, while (a, 0) is a saddle (assuming that (∂M/∂x)(a, 0) < 0). The equilibrium at (0, b) is a sink [again assuming that (∂N/∂y)(0, b) < 0]. All solutions with y0 > 0 tend to the asymptotically stable equilibrium (0, b) with the exception of solutions on the x-axis. See Figure 11.8. In the case where μ lies above ν, the situation is reversed, and all solutions with x0 > 0 tend to the sink that now appears at (a, 0).

Suppose now that μ and ν intersect. We make the assumption that μ ∩ ν is a finite set, and at each intersection point, μ and ν cross transversely, that is, they have distinct tangent lines at the intersection points. This assumption may be eliminated; we make it only to simplify the process of determining the flow.
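Before treating the general picture, it may help to compute μ and ν for a concrete system. The sketch below (an illustration of ours; the growth rates M = 2 − x − y and N = 3 − 2x − y are those of Exercise 5 at the end of this chapter, so a = 2 and b = 3) locates the vertices by scanning for sign changes of N along the graph of f and then refining with bisection.

```python
def M(x, y): return 2 - x - y
def N(x, y): return 3 - 2*x - y

def f(x): return 2 - x          # mu is the graph y = f(x) on [0, 2]

a = 2.0
xs = [a * k / 1000 for k in range(1001)]
for x1, x2 in zip(xs, xs[1:]):
    if N(x1, f(x1)) * N(x2, f(x2)) < 0:       # sign change: mu crosses nu
        lo, hi = x1, x2
        for _ in range(50):                   # refine by bisection
            mid = (lo + hi) / 2
            if N(lo, f(lo)) * N(mid, f(mid)) <= 0:
                hi = mid
            else:
                lo = mid
        print("vertex near", ((lo + hi) / 2, f((lo + hi) / 2)))
```

Here the scan reports a single vertex at (1, 1), so for this choice μ and ν cross exactly once in the open quadrant.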


Figure 11.8 The phase portrait when μ and ν do not meet.

The nullclines μ and ν and the coordinate axes bound a finite number of connected open sets in the upper-right quadrant: These are the basic regions where x' ≠ 0 and y' ≠ 0. They are of four types:

A: x' > 0, y' > 0;    B: x' < 0, y' > 0;
C: x' < 0, y' < 0;    D: x' > 0, y' < 0.

Equivalently, these are the regions where the vector field points northeast, northwest, southwest, or southeast, respectively. Some of these regions are indicated in Figure 11.9. The boundary ∂R of a basic region R is made up of points of the following types: points of μ ∩ ν, called vertices; points on μ or ν but not on both nor on the coordinate axes, called ordinary boundary points; and points on the axes.

A vertex is an equilibrium; the other equilibria lie on the axes at (0, 0), (a, 0), and (0, b). At an ordinary boundary point Z ∈ ∂R, the vector field is either vertical (if Z ∈ μ) or horizontal (if Z ∈ ν). This vector points either into or out of R since μ has no vertical tangents and ν has no horizontal tangents. We call Z an inward or outward point of ∂R, accordingly. Note that, in Figure 11.9, the vector field either points inward at all ordinary points on the boundary of a basic region, or else it points outward at all such points. This is no accident, for we have:

Proposition. Let R be a basic region for the competitive species model. Then the ordinary boundary points of R are either all inward or all outward.


Figure 11.9 The basic regions when the nullclines μ and ν intersect.

Proof: There are only two ways in which the curves μ and ν can intersect at a vertex P. As y increases along ν, the curve ν may either pass from below μ to above μ, or from above to below μ. These two scenarios are illustrated in Figures 11.10a and b. There are no other possibilities since we have assumed that these curves cross transversely.

Since x' > 0 below μ and x' < 0 above μ, and since y' > 0 to the left of ν and y' < 0 to the right, we therefore have the following configurations for the vector field in these two cases. See Figure 11.11.

In each case we see that the vector field points inward in two opposite basic regions abutting P, and outward in the other two basic regions.

If we now move along μ or ν to the next vertex along this curve, we see that adjacent basic regions must maintain their inward or outward configuration. Therefore, at all ordinary boundary points on each basic region, the vector field either points outward or points inward, as required.

Figure 11.10 In (a), ν passes from below μ to above μ as y increases. The situation is reversed in (b).


Figure 11.11 Configurations of the vector field near vertices.

As a consequence of the proposition, it follows that each basic region and its closure is either positively or negatively invariant. What are the possible ω-limit points of this system? There are no closed orbits: A closed orbit must be contained in a basic region, but this is impossible since x(t) and y(t) are monotone along any solution curve in a basic region. Therefore all ω-limit points are equilibria.

We note also that each solution is defined for all t ≥ 0, because any point lies in a large rectangle Γ with corners at (0, 0), (x0, 0), (0, y0), and (x0, y0) with x0 > a and y0 > b; such a rectangle is positively invariant. See Figure 11.12. Thus we have shown:

Theorem. The flow φ_t of the competitive species system has the following property: For all points (x, y), with x ≥ 0, y ≥ 0, the limit

lim (t→∞) φ_t(x, y)

exists and is one of a finite number of equilibria.

Figure 11.12 All solutions must enter and then remain in Γ.
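The theorem is easy to watch in action. The following sketch (ours, using simple Euler steps) integrates the concrete competitive system of Exercise 5, x' = x(2 − x − y), y' = y(3 − 2x − y), from a grid of initial conditions; every reported limit is one of the equilibria (0, 0), (2, 0), (0, 3), or (1, 1).

```python
def F(x, y):
    return x * (2 - x - y), y * (3 - 2*x - y)

def limit(x, y, h=0.005, steps=40000):
    # crude Euler integration out to t = 200, enough to settle
    for _ in range(steps):
        dx, dy = F(x, y)
        x, y = x + h*dx, y + h*dy
    return round(x, 3), round(y, 3)

for i in range(1, 6):
    for j in range(1, 6):
        start = (0.6 * i, 0.6 * j)
        print(start, "->", limit(*start))
```

Since the vertex (1, 1) turns out to be a saddle for this system, almost every starting point ends up at (2, 0) or (0, 3); which of the two depends on the side of the saddle's stable curve on which the solution begins.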


Figure 11.13 This configuration of μ and ν leads to an asymptotically stable equilibrium point.

We conclude that the populations of two competing species always tend to one of a finite number of limiting populations.

Examining the equilibria for stability, one finds the following results. A vertex where μ and ν each have negative slope, but μ is steeper, is asymptotically stable. See Figure 11.13. One sees this by drawing a small rectangle with sides parallel to the axes around the equilibrium, putting one corner in each of the four adjacent basic regions. Such a rectangle is positively invariant; since it can be arbitrarily small, the equilibrium is asymptotically stable.

This may also be seen as follows. We have

slope of μ = −M_x/M_y < slope of ν = −N_x/N_y < 0,

where M_x = ∂M/∂x, M_y = ∂M/∂y, and so on, at the equilibrium. Now recall that M_y < 0 and N_x < 0. Therefore, at the equilibrium point, we also have M_x < 0 and N_y < 0. Linearization at the equilibrium point yields the matrix

( xM_x   xM_y
  yN_x   yN_y ).

The trace of this matrix is xM_x + yM_y < 0 while the determinant is xy(M_x N_y − M_y N_x) > 0. Thus the eigenvalues have negative real parts, and so we have a sink.

A case-by-case study of the different ways μ and ν can cross shows that the only other asymptotically stable equilibria in this model are (0, b), when (0, b) is above μ, and (a, 0), when (a, 0) is to the right of ν. All other equilibria are unstable. There must be at least one asymptotically stable equilibrium. If (0, b) is not one, then it lies under μ; and if (a, 0) is not one, it lies under ν. In that case μ and ν cross, and the first crossing to the left of (a, 0) is asymptotically stable.


Figure 11.14 Note that solutions on either side of the point Z in the stable curve of Q have very different fates.

For example, this analysis tells us that, in Figure 11.14, only P and (0, b) are asymptotically stable; all other equilibria are unstable. In particular, assuming that the equilibrium Q in Figure 11.14 is hyperbolic, then it must be a saddle because certain nearby solutions tend toward it, while others tend away. The point Z lies on one branch of the stable curve through Q. All points in the region denoted B∞ to the left of Z tend to the equilibrium at (0, b), while points to the right go to P. Thus as we move across the branch of the stable curve containing Z, the limiting behavior of solutions changes radically. Since solutions just to the right of Z tend to the equilibrium point P, it follows that the populations in this case tend to stabilize. On the other hand, just to the left of Z, solutions tend to an equilibrium point where x = 0. Thus in this case, one of the species becomes extinct. A small change in initial conditions has led to a dramatic change in the fate of populations. Ecologically, this small change could have been caused by the introduction of a new pesticide, the importation of additional members of one of the species, a forest fire, or the like. Mathematically, this event is a jump from the basin of P to that of (0, b).

11.4 Exploration: Competition and Harvesting

In this exploration we will investigate the competitive species model where we allow either harvesting (emigration) or immigration of one of the species.


We consider the system

x' = x(1 − ax − y)
y' = y(b − x − y) + h.

Here a, b, and h are parameters. We assume that a, b > 0. If h < 0, then we harvest species y at a constant rate, while if h > 0, we add to the population y at a constant rate. The goal is to understand this system completely for all possible values of these parameters. As usual, we only consider the regime where x, y ≥ 0. If y(t) < 0 for any t > 0, then we consider this species to have become extinct. (A numerical helper for locating and classifying the equilibria is sketched after the list below.)

1. First assume that h = 0. Give a complete synopsis of the behavior of this system by plotting the different behaviors you find in the a, b–parameter plane.
2. Identify the points or curves in the ab–plane where bifurcations occur when h = 0 and describe them.
3. Now let h < 0. Describe the ab–parameter plane for various (fixed) h-values.
4. Repeat the previous exploration for h > 0.
5. Describe the full three-dimensional parameter space using pictures, flip books, 3D models, movies, or whatever you find most appropriate.
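The following Python helper (a hypothetical starting point of ours, not part of the exploration itself) finds the equilibria with x, y ≥ 0 for given (a, b, h) by solving the two quadratics coming from the nullclines, and prints the eigenvalues of the Jacobian at each; scanning it over a grid of parameter values is one way to begin assembling the ab–plane pictures.

```python
import numpy as np

def equilibria(a, b, h):
    pts = []
    # On x = 0:  y(b - y) + h = 0, i.e. y^2 - b*y - h = 0.
    for y in np.roots([1.0, -b, -h]):
        if np.isreal(y) and y.real >= 0:
            pts.append((0.0, y.real))
    # On 1 - a*x - y = 0: substitute y = 1 - a*x into y(b - x - y) + h = 0,
    # giving  -a(a-1) x^2 + [(a-1) - a(b-1)] x + (b-1) + h = 0.
    for x in np.roots([-a*(a - 1), (a - 1) - a*(b - 1), (b - 1) + h]):
        if np.isreal(x) and x.real >= 0 and 1 - a*x.real >= 0:
            pts.append((x.real, 1 - a*x.real))
    return pts

def jac(x, y, a, b):
    return np.array([[1 - 2*a*x - y, -x],
                     [-y, b - x - 2*y]])

a, b, h = 0.5, 0.5, 0.1          # sample parameter values
for (x, y) in equilibria(a, b, h):
    print((round(x, 4), round(y, 4)), np.linalg.eigvals(jac(x, y, a, b)))
```

For instance, with (a, b, h) = (0.5, 0.5, 0.1) it reports one equilibrium on the y-axis and one in the open quadrant, together with their stability types.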


EXERCISES

1. For the SIRS model, prove that all solutions in the triangular region tend to the equilibrium point (τ, 0) when the total population does not exceed the threshold level for the disease.

2. Sketch the phase plane for the following variant of the predator/prey system:

x' = x(1 − x) − xy
y' = y(1 − y/x).

3. A modification of the predator/prey equations is given by

x' = x(1 − x) − axy/(x + 1)
y' = y(1 − y)

where a > 0 is a parameter.

(a) Find all equilibrium points and classify them.
(b) Sketch the nullclines and the phase portraits for different values of a.
(c) Describe any bifurcations that occur as a varies.

4. Another modification of the predator/prey equations is given by

x' = x(1 − x) − xy/(x + b)
y' = y(1 − y)

where b > 0 is a parameter.

(a) Find all equilibrium points and classify them.
(b) Sketch the nullclines and the phase portraits for different values of b.
(c) Describe any bifurcations that occur as b varies.

5. The equations

x' = x(2 − x − y)
y' = y(3 − 2x − y)

satisfy conditions (1) through (3) in Section 11.3 for competing species. Determine the phase portrait for this system. Explain why these equations make it mathematically possible, but extremely unlikely, for both species to survive.

6. Consider the competing species model

x' = x(a − x − ay)
y' = y(b − bx − y)

where the parameters a and b are positive.

(a) Find all equilibrium points for this system and determine their stability type. These types will, of course, depend on a and b.
(b) Use the nullclines to determine the various phase portraits that arise for different choices of a and b.
(c) Determine the values of a and b for which there is a bifurcation in this system and describe the bifurcation that occurs.
(d) Record your findings by drawing a picture of the ab–plane and indicating in each open region of this plane the qualitative structure of the corresponding phase portraits.


7. Two species x, y are in symbiosis if an increase of either population leads to an increase in the growth rate of the other. Thus we assume

x' = M(x, y)x
y' = N(x, y)y

with

∂M/∂y > 0 and ∂N/∂x > 0

and x, y ≥ 0. We also suppose that the total food supply is limited; hence for some A > 0, B > 0 we have

M(x, y) < 0 if x > A,
N(x, y) < 0 if y > B.

If both populations are very small, they both increase; hence

M(0, 0) > 0 and N(0, 0) > 0.

Assuming that the intersections of the curves M^(-1)(0), N^(-1)(0) are finite, and that all are transverse, show the following:

(a) Every solution tends to an equilibrium in the region 0 < x < A, 0 < y < B.


8. Consider a general predator/prey system

x' = M(x, y)x
y' = N(x, y)y,   x, y ≥ 0,

where the growth rates M and N satisfy the following conditions:

(a) In the absence of prey, the predator population declines; hence N(0, y) < 0.
(b) An increase in the prey improves the predator growth rate; hence ∂N/∂x > 0.
(c) In the absence of predators a small prey population will increase; hence M(0, 0) > 0.
(d) Beyond a certain size, the prey population must decrease; hence there exists A > 0 with M(x, y) < 0 if x > A.
(e) Any increase in predators decreases the rate of growth of prey; hence ∂M/∂y < 0.

9. Show that, for a system satisfying the conditions of the previous exercise, if there is a point (u, v) with u, v > 0 at which M(u, v) > 0 and N(u, v) > 0, then there is either an asymptotically stable equilibrium or an ω-limit cycle. Moreover, show that, if the number of limit cycles is finite and positive, one of them must have orbits spiraling toward it from both sides.

10. Consider the following modification of the predator/prey equations:

x' = x(1 − x) − axy
y' = by(1 − y/(x + c))

where a, b, and c are positive constants. Determine the region in the parameter space for which this system has a stable equilibrium with both x, y ≠ 0. Prove that, if the equilibrium point is unstable, this system has a stable limit cycle.


12 Applications in Circuit Theory

In this chapter we first present a simple but very basic example of an electrical circuit and then derive the differential equations governing this circuit. Certain special cases of these equations are analyzed using the techniques developed in Chapters 8 through 10 in the next two sections; these are the classical equations of Lienard and van der Pol. In particular, the van der Pol equation could perhaps be regarded as one of the fundamental examples of a nonlinear ordinary differential equation. It possesses an oscillation or periodic solution that is a periodic attractor. Every nontrivial solution tends to this periodic solution; no linear system has this property. Whereas asymptotically stable equilibria sometimes imply death in a system, attracting oscillators imply life. We give an example in Section 12.4 of a continuous transition from one such situation to the other.

12.1 An RLC Circuit

In this section, we present our first example of an electrical circuit. This circuit is the simple but fundamental series RLC circuit displayed in Figure 12.1. We begin by explaining what this diagram means in mathematical terms. The circuit has three branches, one resistor marked by R, one inductor marked by L, and one capacitor marked by C. We think of a branch of this circuit as a certain electrical device with two terminals.


Figure 12.1 An RLC circuit.

For example, in this circuit, the branch R has terminals α and β and all of the terminals are wired together to form the points or nodes α, β, and γ.

In the circuit there is a current flowing through each branch that is measured by a real number. More precisely, the currents in the circuit are given by the three real numbers i_R, i_L, and i_C, where i_R measures the current through the resistor, and so on. Current in a branch is analogous to water flowing in a pipe; the corresponding measure for water would be the amount flowing in unit time, or better, the rate at which water passes by a fixed point in the pipe. The arrows in the diagram that orient the branches tell us how to read which way the current (read water!) is flowing; for example, if i_R is positive, then according to the arrow, current flows through the resistor from β to α (the choice of the arrows is made once and for all at the start).

The state of the currents at a given time in the circuit is thus represented by a point i = (i_R, i_L, i_C) ∈ R^3. But Kirchhoff's current law (KCL) says that in reality there is a strong restriction on which i can occur. KCL asserts that the total current flowing into a node is equal to the total current flowing out of that node. (Think of the water analogy to make this plausible.) For our circuit this is equivalent to

KCL: i_R = i_L = −i_C.

This defines the one-dimensional subspace K1 of R^3 of physical current states. Our choice of orientation of the capacitor branch may seem unnatural. In fact, these orientations are arbitrary; in this example they were chosen so that the equations eventually obtained relate most directly to the history of the subject.

The state of the circuit is characterized by the current i = (i_R, i_L, i_C) together with the voltage (or, more precisely, the voltage drop) across each branch. These voltages are denoted by v_R, v_L, and v_C for the resistor branch, inductor branch, and capacitor branch, respectively. In the water analogy one thinks of


the voltage drop as the difference in pressures at the two ends of a pipe. To measure voltage, one places a voltmeter (imagine a water pressure meter) at each of the nodes α, β, and γ that reads V(α) at α, and so on. Then v_R is the difference in the readings at α and β:

V(β) − V(α) = v_R.

The direction of the arrow tells us that v_R = V(β) − V(α) rather than V(α) − V(β).

An unrestricted voltage state of the circuit is then a point v = (v_R, v_L, v_C) ∈ R^3. This time the Kirchhoff voltage law puts a physical restriction on v:

KVL: v_R + v_L − v_C = 0.

This defines a two-dimensional subspace K2 of R^3. KVL follows immediately from our definition of v_R, v_L, and v_C in terms of voltmeters; that is,

v_R + v_L − v_C = (V(β) − V(α)) + (V(α) − V(γ)) − (V(β) − V(γ)) = 0.

The product space R^3 × R^3 is called the state space for the circuit. Those states (i, v) ∈ R^3 × R^3 satisfying Kirchhoff's laws form a three-dimensional subspace of the state space.

Now we give a mathematical definition of the three kinds of electrical devices in the circuit. First consider the resistor element. A resistor in the R branch imposes a "functional relationship" on i_R and v_R. We take in our example this relationship to be defined by a function f: R → R, so that v_R = f(i_R). If R is a conventional linear resistor, then f is linear and so f(i_R) = k i_R. This relation is known as Ohm's law. Nonlinear functions yield a generalized Ohm's law. The graph of f is called the characteristic of the resistor. A couple of examples of characteristics are given in Figure 12.2. (A characteristic like that in Figure 12.2b occurs in a "tunnel diode.")

A physical state (i, v) ∈ R^3 × R^3 is a point that satisfies KCL, KVL, and also f(i_R) = v_R. These conditions define the set of physical states Σ ⊂ R^3 × R^3. Thus Σ is the set of points (i_R, i_L, i_C, v_R, v_L, v_C) in R^3 × R^3 that satisfy:

1. i_R = i_L = −i_C (KCL);
2. v_R + v_L − v_C = 0 (KVL);
3. f(i_R) = v_R (generalized Ohm's law).

Now we turn to the differential equations governing the circuit. The inductor (which we think of as a coil; it is hard to find a water analogy) specifies that

L di_L(t)/dt = v_L(t) (Faraday's law),


where L is a positive constant called the inductance.

Figure 12.2 Several possible characteristics for a resistor.

On the other hand, the capacitor (which may be thought of as two metal plates separated by some insulator; in the water model it is a tank) imposes the condition

C dv_C(t)/dt = i_C(t)

where C is a positive constant called the capacitance.

Let's summarize the development so far: a state of the circuit is given by the six numbers (i_R, i_L, i_C, v_R, v_L, v_C), that is, a point in the state space R^3 × R^3. These numbers are subject to three restrictions: Kirchhoff's current law, Kirchhoff's voltage law, and the resistor characteristic or generalized Ohm's law. Therefore the set of physical states is a certain subset Σ ⊂ R^3 × R^3. The way a state changes in time is determined by the two previous differential equations.

Next, we simplify the set of physical states by observing that i_L and v_C determine the other four coordinates. This follows since i_R = i_L and i_C = −i_L by KCL, v_R = f(i_R) = f(i_L) by the generalized Ohm's law, and v_L = v_C − v_R = v_C − f(i_L) by KVL. Therefore we can use R^2 as the state space, with coordinates given by (i_L, v_C). Formally, we define a map π: R^3 × R^3 → R^2, which sends (i, v) ∈ R^3 × R^3 to (i_L, v_C). Then we set π0 = π|Σ, the restriction of π to Σ. The map π0: Σ → R^2 is one to one and onto; its inverse is given by the map ψ: R^2 → Σ, where

ψ(i_L, v_C) = (i_L, i_L, −i_L, f(i_L), v_C − f(i_L), v_C).

It is easy to check that ψ(i_L, v_C) satisfies KCL, KVL, and the generalized Ohm's law, so ψ does map R^2 into Σ. It is also easy to see that π0 and ψ are inverse to each other. We therefore adopt R^2 as our state space.
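The claimed properties of ψ are a purely mechanical check, which the following short sketch (ours, with an arbitrary cubic chosen for the characteristic f) carries out directly:

```python
def f(i):                      # a sample resistor characteristic
    return i**3 - i

def psi(iL, vC):
    return (iL, iL, -iL, f(iL), vC - f(iL), vC)

iR, iL, iC, vR, vL, vC = psi(0.3, 1.7)
assert iR == iL == -iC                 # KCL
assert vR + vL - vC == 0               # KVL
assert vR == f(iR)                     # generalized Ohm's law
print("psi(0.3, 1.7) lies in the set of physical states")
```

(Note that v_R + v_L − v_C = f(i_L) + (v_C − f(i_L)) − v_C cancels exactly, so even the floating-point check here is exact.)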


The differential equations governing the change of state must be rewritten in terms of our new coordinates (i_L, v_C):

L di_L/dt = v_L = v_C − f(i_L)
C dv_C/dt = i_C = −i_L.

For simplicity, and since this is only an example, we set L = C = 1. If we write x = i_L and y = v_C, we then have a system of differential equations in the plane of the form

dx/dt = y − f(x)
dy/dt = −x.

This is one form of the equation known as the Lienard equation. We analyze this system in the following section.

12.2 The Lienard Equation

In this section we begin the study of the phase portrait of the Lienard system from the circuit of the previous section, namely:

dx/dt = y − f(x)
dy/dt = −x.

In the special case where f(x) = x^3 − x, this system is called the van der Pol equation.

First consider the simplest case where f is linear. Suppose f(x) = kx, where k > 0. Then the Lienard system takes the form Y' = AY where

A = ( −k   1
      −1   0 ).

The eigenvalues of A are given by λ± = (−k ± (k^2 − 4)^(1/2))/2. Since each λ± is either negative or else has a negative real part, the equilibrium point at the origin is a sink. It is a spiral sink if k < 2 and a real sink if k ≥ 2. In either case, for any k > 0, all solutions of the system tend to the origin; physically, this is the dissipative effect of the resistor.


Note that we have

y'' = −x' = −y + kx = −y − ky',

so that the system is equivalent to the second-order equation y'' + ky' + y = 0, which is often encountered in elementary differential equations courses.

Next we consider the case of a general characteristic f. There is a unique equilibrium point for the Lienard system that is given by (0, f(0)). Linearization yields the matrix

( −f'(0)   1
  −1       0 )

whose eigenvalues are given by

λ± = ( −f'(0) ± sqrt( (f'(0))^2 − 4 ) ) / 2.

We conclude that this equilibrium point is a sink if f'(0) > 0 and a source if f'(0) < 0. In particular, for the van der Pol equation where f(x) = x^3 − x, the unique equilibrium point is a source.

To analyze the system further, we define the function W: R^2 → R by W(x, y) = (x^2 + y^2)/2. Then we have

Ẇ = x(y − f(x)) + y(−x) = −x f(x).

In particular, if f satisfies f(x) > 0 if x > 0 and f(x) < 0 if x < 0, then Ẇ ≤ 0 everywhere, with Ẇ = 0 only on the y-axis. As in Chapter 9, W is then a Liapunov function, and all solutions tend to the equilibrium point at the origin: Once again the circuit is dissipative.

12.3 The van der Pol Equation

In this section we study in detail the phase portrait of the van der Pol system, where f(x) = x^3 − x:

dx/dt = y − x^3 + x


dy/dt = −x.

Let φ_t denote the flow of this system. In this case we can give a fairly complete phase portrait analysis.

Theorem. There is one nontrivial periodic solution of the van der Pol equation and every other solution (except the equilibrium point at the origin) tends to this periodic solution. "The system oscillates."

We know from the previous section that this system has a unique equilibrium point at the origin, and that this equilibrium is a source, since f'(0) < 0. The next step is to show that every nonequilibrium solution "rotates" in a certain sense around the equilibrium in a clockwise direction. To see this, note that the x-nullcline is given by y = x^3 − x and the y-nullcline is the y-axis. We subdivide each of these nullclines into two pieces given by

v+ = {(x, y) | x = 0, y > 0}
v− = {(x, y) | x = 0, y < 0}
g+ = {(x, y) | x > 0, y = x^3 − x}
g− = {(x, y) | x < 0, y = x^3 − x}.

These curves, together with the origin, separate the plane into four basic regions A, B, C, and D, arranged clockwise beginning with the region A bounded by v+ and g+. See Figure 12.3. In A the vector field points southeast; in B, C, and D it points southwest, northwest, and northeast, respectively, so solutions wind clockwise around the origin. More precisely, we have:

Proposition. Any solution starting on v+ crosses g+ and then meets v−; similarly, any solution starting on v− crosses g− and then meets v+.


Proof: Any solution starting on v+ immediately enters the region A since x'(0) > 0. In A we have y' < 0, so this solution must decrease in the y direction. Since the solution cannot tend to the source, it follows that this solution must eventually meet g+. On g+ we have x' = 0 and y' < 0. Consequently, the solution crosses g+, then enters the region B. Once inside B, the solution heads southwest. Note that the solution cannot reenter A since the vector field points straight downward on g+. There are thus two possibilities: Either the solution crosses v−, or else the solution tends to −∞ in the y direction and never crosses v−.

We claim that the latter cannot happen. Suppose that it does. Let (x0, y0) be a point on this solution in region B and consider φ_t(x0, y0) = (x(t), y(t)). Since x(t) is never 0, it follows that this solution curve lies for all time in the strip S given by 0 < x ≤ x0, y ≤ y0. By assumption, y(t) → −∞, while f is bounded on S; hence x'(t) = y(t) − f(x(t)) → −∞ as well. But then x(t) must reach 0 in finite time, so the solution leaves S and crosses v−, a contradiction. Therefore the solution must cross v−, and the first statement of the proposition is proved; the second follows in the same way.
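The theorem whose proof we are in the middle of is easy to corroborate numerically. This sketch (ours) integrates the van der Pol system from one point near the source and one far outside; after transients, both settle onto an oscillation of the same amplitude, as measured by max |x| over a final sweep.

```python
def F(x, y):
    return y - (x**3 - x), -x

def rk4_step(x, y, h):
    k1 = F(x, y)
    k2 = F(x + h/2*k1[0], y + h/2*k1[1])
    k3 = F(x + h/2*k2[0], y + h/2*k2[1])
    k4 = F(x + h*k3[0],   y + h*k3[1])
    return (x + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

for x0, y0 in [(0.05, 0.0), (4.0, 4.0)]:    # inside and outside
    x, y = x0, y0
    for _ in range(60000):                  # discard the transient
        x, y = rk4_step(x, y, 0.002)
    amp = 0.0
    for _ in range(30000):                  # measure one more sweep
        x, y = rk4_step(x, y, 0.002)
        amp = max(amp, abs(x))
    print((x0, y0), "-> limiting amplitude ~", round(amp, 3))
```

Both runs report the same limiting amplitude, consistent with a single attracting periodic solution.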


Figure 12.4 The Poincaré map on v+.

By the proposition, a solution starting at a point (0, y) ∈ v+ winds clockwise around the origin and returns to v+ at a point (0, P(y)); the map P: v+ → v+ so obtained is the Poincaré map for this system (see Figure 12.4).

Clearly, any fixed point of P lies on a periodic solution. On the other hand, if P(y0) ≠ y0, then the solution through (0, y0) can never be periodic. Indeed, if P(y0) > y0, then the successive intersections of φ_t(0, y0) with v+ form a monotone sequence as in Section 10.4. Hence the solution crosses v+ in an increasing sequence of points and so the solution can never meet itself. The case P(y0) < y0 is similar.

To analyze P, we define the semi-Poincaré map α: v+ → v−, which sends (0, y) ∈ v+ to the first intersection of φ_t(0, y), for t > 0, with v−; we write α(y) for the y-coordinate of this intersection. Also define

δ(y) = (α(y)^2 − y^2)/2.

Note for later use that there is a unique point (0, y*) ∈ v+ and time t* such that

1. φ_t(0, y*) ∈ A for 0 < t < t*;
2. φ_t*(0, y*) = (1, 0) ∈ g+.

The proof of the theorem rests on the following proposition, whose proof we postpone to the end of this section.

Proposition. 1. If 0 < y ≤ y*, then δ(y) > 0.
2. For y > y*, the function δ(y) decreases monotonically, and δ(y) → −∞ as y → ∞.

Granting the proposition, the theorem is proved as follows. Note first that the vector field is skew-symmetric


Figure 12.5 The semi-Poincaré map.

about the origin. This implies that if (x(t), y(t)) is a solution curve, then so is (−x(t), −y(t)).

Part of the graph of δ(y) is shown schematically in Figure 12.6. The intermediate value theorem and the proposition imply that there is a unique y0 ∈ v+ with δ(y0) = 0. Consequently, α(y0) = −y0, and it follows from the skew-symmetry that the solution through (0, y0) is periodic. Since δ(y) ≠ 0 except at y0, we have α(y) ≠ −y for all other y-values, and so it follows that φ_t(0, y0) is the unique periodic solution.

We next show that all other solutions (except the equilibrium point) tend to this periodic solution. Toward that end, we define a map β: v− → v+, sending each point of v− to the first intersection of the solution (for t > 0) with v+. By symmetry we have

β(y) = −α(−y).

Note also that P(y) = β ∘ α(y).

Figure 12.6 The graph of δ(y).


We identify the y-axis with the real numbers in the y coordinate. Thus if y1, y2 ∈ v+ ∪ v− we write y1 > y2 if y1 is above y2. Note that α and β reverse this ordering while P preserves it.

Now choose y ∈ v+ with y > y0. Since α(y0) = −y0 we have α(y) < −y0 and P(y) > y0. On the other hand, δ(y) < 0, which implies that α(y) > −y, and therefore P(y) = β(α(y)) < y. We have shown that y > y0 implies y > P(y) > y0. Similarly P(y) > P(P(y)) > y0 and, by induction, P^n(y) > P^(n+1)(y) > y0 for all n > 0.

The decreasing sequence P^n(y) has a limit y1 ≥ y0 in v+. Note that y1 is a fixed point of P, because, by continuity of P, we have

P(y1) − y1 = lim (n→∞) P(P^n(y)) − y1 = y1 − y1 = 0.

Since P has only one fixed point, we have y1 = y0. This shows that the solution through y spirals toward the periodic solution as t → ∞. See Figure 12.7. The same is true if y < y0; and since every nonequilibrium solution eventually meets v+, the theorem follows. It remains only to prove the proposition.
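All of the maps in this argument can be computed numerically. The sketch below (ours) approximates α by integrating from (0, y) to the first crossing of v−, forms δ(y) = (α(y)² − y²)/2, and bisects on the sign of δ to locate the fixed point y0 of P; the bracketing interval [0.5, 4] is an assumption of ours that can be checked from the printed signs.

```python
def F(x, y):
    return y - (x**3 - x), -x

def alpha(y0, h=0.001):
    x, y = h * y0, y0                            # one Euler step off v+
    while True:
        xp, yp = x, y
        k1 = F(x, y); k2 = F(x + h/2*k1[0], y + h/2*k1[1])
        k3 = F(x + h/2*k2[0], y + h/2*k2[1]); k4 = F(x + h*k3[0], y + h*k3[1])
        x += h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        y += h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        if xp > 0 and x <= 0:                    # crossed v- (y < 0 here)
            return yp + (y - yp) * xp / (xp - x) # linear interpolation

def delta(y):
    return 0.5 * (alpha(y)**2 - y**2)

lo, hi = 0.5, 4.0                                # delta(lo) > 0 > delta(hi)
for _ in range(40):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if delta(mid) > 0 else (lo, mid)
y0 = (lo + hi) / 2
print("y0 ~", y0, "   alpha(y0) ~", alpha(y0))
```

The run returns α(y0) ≈ −y0 to several digits, which is exactly the condition for the periodic solution.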


For the proof of the proposition we adopt the following notation. Let γ(t) = (x(t), y(t)), for a ≤ t ≤ b, be a segment of a solution curve, and let F: R^2 → R be a function. Define

∫_γ F(x, y) = ∫_a^b F(x(t), y(t)) dt.

If it happens that x'(t) ≠ 0 for a ≤ t ≤ b, then along γ, y is a function of x, so we may write y = y(x). In this case we can change variables:

∫_a^b F(x(t), y(t)) dt = ∫_x(a)^x(b) F(x, y(x)) (dt/dx) dx.

Therefore

∫_γ F(x, y) = ∫_x(a)^x(b) F(x, y(x)) / (dx/dt) dx.

We have a similar expression if y'(t) ≠ 0.

Now recall the function

W(x, y) = (x^2 + y^2)/2

introduced in the previous section. Let p ∈ v+. Suppose α(p) = φ_τ(p). Let γ(t) = (x(t), y(t)), for 0 ≤ t ≤ τ = τ(p), be the solution curve joining p ∈ v+ to α(p) ∈ v−. By definition

δ(p) = (y(τ)^2 − y(0)^2)/2 = W(x(τ), y(τ)) − W(x(0), y(0)).

Thus

δ(p) = ∫_0^τ (d/dt) W(x(t), y(t)) dt.

Recall from Section 12.2 that we have

Ẇ = −x f(x) = −x(x^3 − x).

Thus we have

δ(p) = ∫_0^τ −x(t)( x(t)^3 − x(t) ) dt = ∫_0^τ x(t)^2 ( 1 − x(t)^2 ) dt.

This immediately proves part (1) of the proposition, because the integrand is positive for 0 < x(t) < 1, and solutions with 0 < p ≤ y* remain in this strip.


Figure 12.8 The curves γ1, γ2, and γ3 depicted on the closed orbit through y0.

We may rewrite the last equality as

δ(p) = ∫_γ x^2 ( 1 − x^2 ).

We restrict attention to points p ∈ v+ with p > y*. We divide the corresponding solution curve γ into three curves γ1, γ2, γ3 as displayed in Figure 12.8. The curves γ1 and γ3 are defined for 0 ≤ x ≤ 1, while the curve γ2 is defined for y1 ≤ y ≤ y2. Then

δ(p) = δ1(p) + δ2(p) + δ3(p)

where

δi(p) = ∫_γi x^2 ( 1 − x^2 ), i = 1, 2, 3.

Notice that, along γ1, y(t) may be regarded as a function of x. Hence we have

δ1(p) = ∫_0^1 x^2 (1 − x^2) / (dx/dt) dx = ∫_0^1 x^2 (1 − x^2) / (y − f(x)) dx

where f(x) = x^3 − x. As p moves up the y-axis, y − f(x) increases [for (x, y) on γ1]. Hence δ1(p) decreases as p increases. Similarly δ3(p) decreases as p increases.


On γ2, x(t) may be regarded as a function of y, which is defined for y ∈ [y1, y2] and x ≥ 1. Therefore, since dy/dt = −x, we have

δ2(p) = ∫_y2^y1 −x(y)( 1 − x(y)^2 ) dy = ∫_y1^y2 x(y)( 1 − x(y)^2 ) dy

so that δ2(p) is negative.

As p increases, the domain [y1, y2] of integration becomes steadily larger. The function y → x(y) depends on p, so we write it as x_p(y). As p increases, the curves γ2 move to the right; hence x_p(y) increases and so x_p(y)(1 − x_p(y)^2) decreases. It follows that δ2(p) decreases as p increases; and evidently lim (p→∞) δ2(p) = −∞. Consequently, δ(p) also decreases and tends to −∞ as p → ∞. This completes the proof of the proposition.

12.4 A Hopf Bifurcation

We now describe a more general class of circuit equations where the resistor characteristic depends on a parameter μ and is denoted by f_μ. (Perhaps μ is the temperature of the resistor.) The physical behavior of the circuit is then described by the system of differential equations on R^2:

dx/dt = y − f_μ(x)
dy/dt = −x.

Consider as an example the special case where f_μ is described by

f_μ(x) = x^3 − μx

and the parameter μ lies in the interval [−1, 1]. When μ = 1 we have the van der Pol system from the previous section. As before, the only equilibrium point lies at the origin. The linearized system is

Y' = ( μ   1
      −1   0 ) Y,


and the eigenvalues are

λ± = ( μ ± sqrt(μ^2 − 4) ) / 2.

Thus the origin is a spiral sink for −1 ≤ μ < 0 and a spiral source for 0 < μ ≤ 1. Indeed, when −1 ≤ μ ≤ 0, the resistor is passive, as the graph of f_μ lies in the first and third quadrants. Therefore all solutions tend to the origin in this case. This holds even in the case where μ = 0 and the linearization yields a center. Physically the circuit is dead in that, after a period of transition, all currents and voltages stay at 0 (or as close to 0 as we want).

However, as μ becomes positive, the circuit becomes alive. It begins to oscillate. This follows from the fact that the analysis of Section 12.3 applies to this system for all μ in the interval (0, 1]. We therefore see the birth of a (unique) periodic solution γ_μ as μ increases through 0 (see Exercise 4 at the end of this chapter). Just as above, this solution attracts all other nonzero solutions. As in Section 8.5, this is an example of a Hopf bifurcation. Further elaboration of the ideas in Section 12.3 can be used to show that γ_μ → 0 as μ → 0 with μ > 0. Figure 12.9 shows some phase portraits associated to this bifurcation.

Figure 12.9 The Hopf bifurcation in the system x' = y − x^3 + μx, y' = −x, for several values of μ.
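A parameter scan makes the bifurcation visible without any pictures. This sketch (ours, using Heun steps) integrates past a long transient for several values of μ and records the limiting amplitude max |x|. The measured amplitude decays toward 0 for μ ≤ 0 (only slowly at μ = 0 itself, where the dissipation is cubic rather than linear) and levels off at a nonzero value that shrinks as μ decreases to 0 from above.

```python
def step(x, y, mu, h=0.005):
    # Heun (improved Euler) step for x' = y - x^3 + mu*x, y' = -x
    k1x, k1y = y - (x**3 - mu*x), -x
    x1, y1 = x + h*k1x, y + h*k1y
    k2x, k2y = y1 - (x1**3 - mu*x1), -x1
    return x + h*(k1x + k2x)/2, y + h*(k1y + k2y)/2

for mu in [-0.2, -0.05, 0.0, 0.05, 0.2, 0.5, 1.0]:
    x, y = 0.5, 0.0
    for _ in range(40000):                  # transient, t = 200
        x, y = step(x, y, mu)
    amp = 0.0
    for _ in range(10000):                  # measure, t = 50
        x, y = step(x, y, mu)
        amp = max(amp, abs(x))
    print("mu =", mu, "   amplitude ~", round(amp, 3))
```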


12.5 Exploration: Neurodynamics

One of the most important developments in the study of the firing of nerve cells or neurons was the development of a model for this phenomenon in giant squid in the 1950s by Hodgkin and Huxley [23]. They developed a four-dimensional system of differential equations that described the electrochemical transmission of neuronal signals along the cell membrane, a work for which they later received the Nobel prize. Roughly speaking, this system is similar to systems that arise in electrical circuits. The neuron consists of a cell body, or soma, which receives electrical stimuli. This stimulus is then conducted along the axon, which can be thought of as an electrical cable that connects to other neurons via a collection of synapses. Of course, the motion is not really electrical, because the current is not really made up of electrons, but rather ions (predominantly sodium and potassium). See [15] or [34] for a primer on the neurobiology behind these systems.

The four-dimensional Hodgkin-Huxley system is difficult to deal with, primarily because of the highly nonlinear nature of the equations. An important breakthrough from a mathematical point of view was achieved by Fitzhugh [18] and Nagumo et al. [35], who produced a simpler model of the Hodgkin-Huxley model. Although this system is not as biologically accurate as the original system, it nevertheless does capture the essential behavior of nerve impulses, including the phenomenon of excitability alluded to below.

The Fitzhugh-Nagumo system of equations is given by

x' = y + x − x^3/3 + I
y' = −x + a − by

where a and b are constants satisfying

0 < 3(1 − a)/2 < b < 1

and I is a parameter.
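Before starting the exploration, it may help to see the excitability phenomenon that item 5 below refers to. The following sketch (ours, with the sample values a = 0.7, b = 0.8, I = 0, which satisfy the constraints) locates the equilibrium by bisection along the nullclines and then kicks the variable x by increasing amounts: small kicks die out quietly, while kicks beyond a threshold trigger one large excursion before the solution returns to rest.

```python
a, b, I = 0.7, 0.8, 0.0

def F(x, y):
    return y + x - x**3/3 + I, -x + a - b*y

# The equilibrium solves (a - x)/b = x^3/3 - x - I; the difference of the
# two sides is strictly decreasing, so bisection finds the unique root.
lo, hi = 0.0, 2.0
for _ in range(60):
    m = (lo + hi) / 2
    g = (a - m)/b - (m**3/3 - m - I)
    lo, hi = (m, hi) if g > 0 else (lo, m)
x0 = (lo + hi) / 2
y0 = (a - x0) / b
print("equilibrium ~", (round(x0, 4), round(y0, 4)))

for kick in [0.2, 0.6, 1.0, 1.4]:
    x, y = x0 - kick, y0
    far = 0.0
    for _ in range(40000):                 # Euler steps, t = 200
        dx, dy = F(x, y)
        x, y = x + 0.005*dx, y + 0.005*dy
        far = max(far, abs(x - x0))
    print("kick", kick, "-> maximal |x - x0| ~", round(far, 2))
```

For these values the threshold sits between the first two kicks: the 0.2 kick relaxes directly, while the larger ones produce excursions an order of magnitude bigger than the stimulus.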


1. First assume that I = 0. Prove that this system has a unique equilibrium point (x0, y0). Hint: Use the geometry of the nullclines rather than explicitly solving the equations. Also remember the restrictions placed on a and b.
2. Prove that this equilibrium point is always a sink.
3. Now suppose that I ≠ 0. Prove that there is still a unique equilibrium point (x_I, y_I) and that x_I varies monotonically with I.
4. Determine values of x_I for which the equilibrium point is a source and show that there must be a stable limit cycle in this case.
5. When I ≠ 0, the point (x0, y0) is no longer an equilibrium point. Nonetheless we can still consider the solution through this point. Describe the qualitative nature of this solution as I moves away from 0. Explain in mathematical terms why biologists consider this phenomenon the "excitement" of the neuron.
6. Consider the special case where a = I = 0. Describe the phase plane for each b > 0 (no longer restrict to b < 1).
7. Repeat the previous exploration, now allowing a > 0 as well.
8. Extend the analysis of the previous problem to the case b ≤ 0.
9. Now fix b = 0 and let a and I vary. Sketch the bifurcation plane (the I, a–plane) in this case.

EXERCISES

1. Find the phase portrait for the differential equation

x' = y − f(x),   f(x) = x^2,
y' = −x.

Hint: Exploit the symmetry about the y-axis.

2. Let

f(x) = { 2x − 3   if x > 1
         −x       if −1 ≤ x ≤ 1
         2x + 3   if x < −1.


3. Let

f_a(x) = { 2x + a − 2   if x > 1
           ax           if −1 ≤ x ≤ 1
           2x − a + 2   if x < −1,

where a is a parameter.

5. Consider the system

x' = μ(y − x^3 + x),   μ > 0
y' = −x.

Prove that this system has a unique nontrivial periodic solution γ_μ. Show that, as μ → ∞, γ_μ tends to the closed curve consisting of two horizontal line segments and two arcs on y = x^3 − x, as in Figure 12.10. This type of solution is called a relaxation oscillation.

Figure 12.10

When μ is large, there are two quite different timescales along the periodic solution. When moving horizontally, we have x' very large, and so the solution makes this transit


very quickly. On the other hand, near the cubic nullcline, x' = 0 while y' is bounded. Hence this transit is comparatively much slower.

Figure 12.11 (a network with branches R, C, and L)

6. Find the differential equations for the network in Figure 12.11, where the resistor is voltage controlled; that is, the resistor characteristic is the graph of a function g: R → R, i_R = g(v_R).
7. Show that the LC circuit consisting of one inductor and one capacitor wired in a closed loop oscillates.
8. Determine the phase portrait of the following differential equation and in particular show there is a unique nontrivial periodic solution:

x' = y − f(x)
y' = −g(x)

where all of the following are assumed:

(a) g(−x) = −g(x) and xg(x) > 0 for all x ≠ 0;
(b) f(−x) = −f(x) and f(x) < 0 for 0 < x < a;

(e) Show that all nonzero solutions of the system tend to this closed orbit when a > 0.




13 Applications in Mechanics

We turn our attention in this chapter to the earliest important examples of differential equations that, in fact, are connected with the origins of calculus. These equations were used by Newton to derive and unify the three laws of Kepler. In this chapter we give a brief derivation of two of Kepler's laws and then discuss more general problems in mechanics.

The equations of Newton, our starting point, have retained importance throughout the history of modern physics and lie at the root of that part of physics called classical mechanics. The examples here provide us with concrete examples of historical and scientific importance. Furthermore, the case we consider most thoroughly here, that of a particle moving in a central force gravitational field, is simple enough so that the differential equations can be solved explicitly using exact, classical methods (just calculus!). However, with an eye toward the more complicated mechanical systems that cannot be solved in this way, we also describe a more geometric approach to this problem.

13.1 Newton's Second Law

We will be working with a particle moving in a force field F. Mathematically, F is just a vector field on the (configuration) space of the particle, which in our case will be R^n. From the physical point of view, F(X) is the force exerted on a particle located at position X ∈ R^n.

The example of a force field we will be most concerned with is the gravitational field of the sun: F(X) is the force on a particle located at X that attracts the particle to the sun. We go into details of this system in Section 13.3.


The connection between the physical concept of a force field and the mathematical concept of a differential equation is Newton's second law: F = ma. This law asserts that a particle in a force field moves in such a way that the force vector at the location X of the particle, at any instant, equals the acceleration vector of the particle times the mass m. That is, Newton's law gives the second-order differential equation

mX'' = F(X).

As a system, this equation becomes

X' = V
V' = (1/m) F(X)

where V = V(t) is the velocity of the particle. This is a system of equations on R^n × R^n. This type of system is often called a mechanical system with n degrees of freedom.

A solution X(t) ⊂ R^n of the second-order equation is said to lie in configuration space. The solution of the system (X(t), V(t)) ⊂ R^n × R^n lies in the phase space or state space of the system.

Example. Recall the simple undamped harmonic oscillator from Chapter 2. In this case the mass moves in one dimension and its position at time t is given by a function x(t), where x: R → R. As we saw, the differential equation governing this motion is

mx'' = −kx

for some constant k > 0. That is, the force field at the point x ∈ R is given by −kx.

Example. The two-dimensional version of the harmonic oscillator allows the mass to move in the plane, so the position is now given by the vector X(t) = (x1(t), x2(t)) ∈ R^2. As in the one-dimensional case, the force field is F(X) = −kX, so the equations of motion are the same,

mX'' = −kX,

with solutions in configuration space given by

x1(t) = c1 cos( sqrt(k/m) t ) + c2 sin( sqrt(k/m) t )
x2(t) = c3 cos( sqrt(k/m) t ) + c4 sin( sqrt(k/m) t )


for some choices of the constants c_j, as is easily checked using the methods of Chapter 6.

Before dealing with more complicated cases of Newton's law, we need to recall a few concepts from multivariable calculus. Recall that the dot product (or inner product) of two vectors X, Y ∈ R^n is denoted by X · Y and defined by

X · Y = Σ (i=1 to n) x_i y_i

where X = (x1, ..., xn) and Y = (y1, ..., yn). Thus X · X = |X|^2. If X, Y: I → R^n are smooth functions, then a version of the product rule yields

(X · Y)' = X' · Y + X · Y',

as can be easily checked using the coordinate functions x_i and y_i.

Recall also that if g: R^n → R, the gradient of g, denoted grad g, is defined by

grad g(X) = ( ∂g/∂x1 (X), ..., ∂g/∂xn (X) ).

As we saw in Chapter 9, grad g is a vector field on R^n.

Next, consider the composition of two smooth functions g ∘ F where F: R → R^n and g: R^n → R. The chain rule applied to g ∘ F yields

(d/dt) g(F(t)) = grad g(F(t)) · F'(t) = Σ (i=1 to n) ∂g/∂x_i (F(t)) dF_i/dt (t).

We will also use the cross product (or vector product) U × V of vectors U, V ∈ R^3. By definition,

U × V = (u2 v3 − u3 v2, u3 v1 − u1 v3, u1 v2 − u2 v1) ∈ R^3.

Recall from multivariable calculus that we have

U × V = −V × U = |U| |V| N sin θ

where N is a unit vector perpendicular to U and V with the orientations of the vectors U, V, and N given by the "right-hand rule." Here θ is the angle between U and V.


Note that U × V = 0 if and only if one vector is a scalar multiple of the other. Also, if U × V ≠ 0, then U × V is perpendicular to the plane containing U and V. If U and V are functions of t in R, then another version of the product rule asserts that

(d/dt)(U × V) = U' × V + U × V'

as one can again check by using coordinates.

13.2 Conservative Systems

Many force fields appearing in physics arise in the following way. There is a smooth function U: R^n → R such that

F(X) = −( ∂U/∂x1 (X), ∂U/∂x2 (X), ..., ∂U/∂xn (X) ) = −grad U(X).

(The negative sign is traditional.) Such a force field is called conservative. The associated system of differential equations

X' = V
V' = −(1/m) grad U(X)

is called a conservative system. The function U is called the potential energy of the system. [More properly, U should be called a potential energy since adding a constant to it does not change the force field −grad U(X).]

Example. The planar harmonic oscillator above corresponds to the force field F(X) = −kX. This field is conservative, with potential energy

U(X) = (k/2)|X|^2.

For any moving particle X(t) of mass m, the kinetic energy is defined to be

K = (m/2)|V(t)|^2.

Note that the kinetic energy depends on velocity, while the potential energy is a function of position. The total energy (or sometimes simply energy) is defined on phase space by E = K + U. The total energy function is important


in mechanics because it is constant along any solution curve of the system. That is, in the language of Section 9.4, E is a constant of the motion or a first integral for the flow.

Theorem. (Conservation of Energy) Let (X(t), V(t)) be a solution curve of a conservative system. Then the total energy E is constant along this solution curve.

Proof: To show that E(X(t)) is constant in t, we compute

Ė = (d/dt)( (m/2)|V(t)|^2 + U(X(t)) )
  = mV · V' + (grad U) · X'
  = V · (−grad U) + (grad U) · V
  = 0.

We remark that we may also write this type of system in Hamiltonian form. Recall from Chapter 9 that a Hamiltonian system on R^n × R^n is a system of the form

x_i' = ∂H/∂y_i
y_i' = −∂H/∂x_i

where H: R^n × R^n → R is the Hamiltonian function. As we have seen, the function H is constant along solutions of such a system. To write the conservative system in Hamiltonian form, we make a simple change of variables. We introduce the momentum vector Y = mV and then set

H = K + U = (1/2m) Σ y_i^2 + U(x1, ..., xn).

This puts the conservative system in Hamiltonian form, as is easily checked.
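The computation above is short, but it is also reassuring to see conservation of energy numerically. The sketch below (ours) integrates the planar harmonic oscillator with m = k = 1 by RK4 and reports the total drift in E over a long time interval; the drift sits at the level of the integrator's truncation error, not of the dynamics.

```python
import numpy as np

def rk4(f, u, h):
    k1 = f(u); k2 = f(u + h/2*k1); k3 = f(u + h/2*k2); k4 = f(u + h*k3)
    return u + h/6*(k1 + 2*k2 + 2*k3 + k4)

# state u = (x1, x2, v1, v2); with m = k = 1,  X' = V and V' = -X
f = lambda u: np.array([u[2], u[3], -u[0], -u[1]])
E = lambda u: 0.5*(u[2]**2 + u[3]**2) + 0.5*(u[0]**2 + u[1]**2)

u = np.array([1.0, 0.0, 0.0, 0.7])
E0 = E(u)
for _ in range(10000):            # integrate to t = 100 with h = 0.01
    u = rk4(f, u, 0.01)
print("energy drift over t = 100:", abs(E(u) - E0))
```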


13.3 Central Force Fields

A force field F is called central if F(X) points directly toward or away from the origin for every X. In other words, the vector F(X) is always a scalar multiple of X:

F(X) = λ(X)X

where the coefficient λ(X) depends on X. We often tacitly exclude from consideration a particle at the origin; many central force fields are not defined (or are "infinite") at the origin. We deal with these types of singularities in Section 13.7. Theoretically, the function λ(X) could vary for different values of X on a sphere given by |X| = constant. However, if the force field is conservative, this is not the case.

Proposition. Let F be a conservative force field. Then the following statements are equivalent:

1. F is central;
2. F(X) = f(|X|)X;
3. F(X) = −grad U(X) and U(X) = g(|X|).

Proof: Suppose (3) is true. To prove (2) we find, from the chain rule:

∂U/∂x_j = g'(|X|) ∂/∂x_j ( x1^2 + x2^2 + x3^2 )^(1/2) = g'(|X|) x_j / |X|.

This proves (2) with f(|X|) = −g'(|X|)/|X|. It is clear that (2) implies (1). To show that (1) implies (3) we must prove that U is constant on each sphere

S_α = {X ∈ R^n | |X| = α > 0}.

Because any two points in S_α can be connected by a curve in S_α, it suffices to show that U is constant on any curve in S_α. Hence if J ⊂ R is an interval and γ: J → S_α is a smooth curve, we must show that the derivative of the composition U ∘ γ is identically 0. This derivative is

(d/dt) U(γ(t)) = grad U(γ(t)) · γ'(t)

as in Section 13.1. Now grad U(X) = −F(X) = −λ(X)X since F is central. Thus we have

(d/dt) U(γ(t)) = −λ(γ(t)) γ(t) · γ'(t) = −( λ(γ(t))/2 ) (d/dt)|γ(t)|^2 = 0

because |γ(t)| ≡ α.


Consider now a central force field, not necessarily conservative, defined on R^3. Suppose that, at some time t0, P ⊂ R^3 denotes the plane containing the position vector X(t0), the velocity vector V(t0), and the origin (assuming, for the moment, that the position and velocity vectors are not collinear). Note that the force vector F(X(t0)) also lies in P. This makes it plausible that the particle stays in the plane P for all time. In fact, this is true:

Proposition. A particle moving in a central force field in R^3 always moves in a fixed plane containing the origin.

Proof: Suppose X(t) is the path of a particle moving under the influence of a central force field. We have

(d/dt)(X × V) = V × V + X × V' = X × X'' = 0

because X'' is a scalar multiple of X. Therefore Y = X(t) × V(t) is a constant vector. If Y ≠ 0, this means that X and V always lie in the plane orthogonal to Y, as asserted. If Y = 0, then X'(t) = g(t)X(t) for some real function g(t). This means that the velocity vector of the moving particle is always directed along the line through the origin and the particle, as is the force on the particle. This implies that the particle always moves along the same line through the origin. To prove this, let (x1(t), x2(t), x3(t)) be the coordinates of X(t). Then we have three separable differential equations:

dx_k/dt = g(t) x_k(t), for k = 1, 2, 3.

Integrating, we find

x_k(t) = e^h(t) x_k(0), where h(t) = ∫_0^t g(s) ds.

Therefore X(t) is always a scalar multiple of X(0) and so X(t) moves in a fixed line and hence in a fixed plane.

The vector m(X × V) is called the angular momentum of the system, where m is the mass of the particle. By the proof of the preceding proposition, this vector is also conserved by the system.

Corollary. (Conservation of Angular Momentum) Angular momentum is constant along any solution curve in a central force field.
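The corollary, too, is easy to check numerically, even for the nonlinear force field F(X) = −X/|X|^3 studied in the next section. The sketch below (ours, with an arbitrary initial condition giving a bounded orbit) integrates the system in R^3 and compares X × V at the start and after a long integration.

```python
import numpy as np

def f(u):                          # u = (X, V) stacked in R^6
    X, V = u[:3], u[3:]
    return np.concatenate([V, -X / np.linalg.norm(X)**3])

def rk4(u, h):
    k1 = f(u); k2 = f(u + h/2*k1); k3 = f(u + h/2*k2); k4 = f(u + h*k3)
    return u + h/6*(k1 + 2*k2 + 2*k3 + k4)

u = np.array([1.0, 0.0, 0.2, 0.0, 0.9, 0.1])      # X(0), V(0)
L0 = np.cross(u[:3], u[3:])
for _ in range(20000):                            # t = 100 with h = 0.005
    u = rk4(u, 0.005)
print("X x V initially: ", L0)
print("X x V at t = 100:", np.cross(u[:3], u[3:]))
```

The two printed vectors agree to the accuracy of the integrator, and their common direction is the normal of the fixed plane of motion promised by the proposition.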


We now restrict attention to a conservative central force field. Because of the previous proposition, the particle remains for all time in a plane, which we may take to be x3 = 0. In this case angular momentum is given by the vector (0, 0, m(x1 v2 − x2 v1)). Let

l = m(x1 v2 − x2 v1).

So the function l is also constant along solutions. In the planar case we also call l the angular momentum. Introducing polar coordinates x1 = r cos θ and x2 = r sin θ, we find

v1 = x1' = r' cos θ − r sin θ θ'
v2 = x2' = r' sin θ + r cos θ θ'.

Then

x1 v2 − x2 v1 = r cos θ (r' sin θ + r cos θ θ') − r sin θ (r' cos θ − r sin θ θ')
             = r^2 (cos^2 θ + sin^2 θ) θ'
             = r^2 θ'.

Hence in polar coordinates, l = m r^2 θ'.

We can now prove one of Kepler's laws. Let A(t) denote the area swept out by the vector X(t) in the time from t0 to t. In polar coordinates we have dA = (1/2) r^2 dθ. We define the areal velocity to be

A'(t) = (1/2) r^2(t) θ'(t),

the rate at which the position vector sweeps out area. Kepler observed that the line segment joining a planet to the sun sweeps out equal areas in equal times, which we interpret to mean A' = constant. We have therefore proved more generally that this is true for any particle moving in a conservative central force field.

We now have found two constants of the motion or first integrals for a conservative system generated by a central force field: total energy and angular momentum. In the 19th century, the idea of solving a differential equation was tied to the construction of a sufficient number of such constants of the motion. In the 20th century, it became apparent that first integrals do not exist for differential equations very generally; the culprit here is chaos, which we will discuss in the next two chapters. Basically, chaotic behavior of solutions of a differential equation in an open set precludes the existence of first integrals in that set.


13.4 The Newtonian Central Force System

We now specialize the discussion to the Newtonian central force system. This system deals with the motion of a single planet orbiting around the sun. We assume that the sun is fixed at the origin in R^3 and that the relatively small planet exerts no force on the sun. The sun exerts a force on a planet given by Newton's law of gravitation, which is also called the inverse square law. This law states that the sun exerts a force on a planet located at X ∈ R^3 whose magnitude is g m_s m_p / r^2, where m_s is the mass of the sun, m_p is the mass of the planet, and g is the gravitational constant. The direction of the force is toward the sun. Therefore Newton's law yields the differential equation

m_p X'' = −g m_s m_p X/|X|^3.

For clarity, we change units so that the constants are normalized to one and so the equation becomes more simply

X'' = F(X) = −X/|X|^3,

where F is now the force field. As a system of differential equations, we have

X' = V
V' = −X/|X|^3.

This system is called the Newtonian central force system. Our goal in this section is to describe the geometry of this system; in the next section we derive a complete analytic solution of this system.

Clearly, this is a central force field. Moreover, it is conservative, since

X/|X|^3 = grad U(X)

where the potential energy U is given by

U(X) = −1/|X|.

Observe that F(X) is not defined at 0; indeed, the force field becomes infinite as the moving mass approaches collision with the stationary mass at the origin. As in the previous section we may restrict attention to particles moving in the plane R^2. Thus we look at solutions in the configuration space C = R^2 − {0}. We denote the phase space by P = (R^2 − {0}) × R^2.


We visualize phase space as the collection of all tangent vectors at each point X ∈ C. For a given X ∈ R^2 − {0}, let T_X = {(X, V) | V ∈ R^2}. T_X is the tangent plane to the configuration space at X. Then

P = ∪ (X ∈ C) T_X

is the tangent space to the configuration space, which we may naturally identify with a subset of R^4.

The dimension of phase space is four. However, we can cut this dimension in half by making use of the two known first integrals, total energy and angular momentum. Recall that energy is constant along solutions and is given by

E(X, V) = K(V) + U(X) = (1/2)|V|^2 − 1/|X|.

Let Σ_h denote the subset of P consisting of all points (X, V) with E(X, V) = h. The set Σ_h is called an energy surface with total energy h. If h ≥ 0, then Σ_h meets each T_X in a circle of tangent vectors satisfying

|V|^2 = 2(h + 1/|X|).

The radius of these circles in the tangent planes at X tends to ∞ as X → 0 and decreases to sqrt(2h) as |X| tends to ∞.

When h < 0, the situation is different: If |X| > −1/h, then there are no vectors in T_X ∩ Σ_h. When |X| = −1/h, only the zero vector in T_X lies in Σ_h. The circle r = −1/h in configuration space is therefore known as the zero velocity curve. If X lies inside the zero velocity curve, then T_X meets the energy surface in a circle of tangent vectors as before. Figure 13.1 gives a caricature of Σ_h for the case where h < 0.

To make use of the angular momentum, we now pass to polar coordinates (r, θ) in configuration space, decomposing each velocity vector into its radial component v_r and its tangential component v_θ.


Figure 13.1 Over each nonzero point inside the zero velocity curve, T_X meets the energy surface Σ_h in a circle of tangent vectors.

In the new coordinates (r, θ, v_r, v_θ), the system becomes

r' = v_r
θ' = v_θ / r
v_r' = −1/r^2 + v_θ^2 / r
v_θ' = −v_r v_θ / r.

In these coordinates, total energy is given by

(1/2)( v_r^2 + v_θ^2 ) − 1/r = h

and angular momentum is given by l = r v_θ. Let Σ_h,l consist of all points in phase space with total energy h and angular momentum l. For simplicity, we will restrict attention to the case where h < 0. If l = 0, then v_θ = 0, and so the solution moves along a straight line


through the origin. The solution leaves the origin and travels along a straight line until reaching the zero velocity curve, after which time it recedes back to the origin. In fact, since the vectors in Σ_h,0 have magnitude tending to ∞ as X → 0, these solutions reach the singularity in finite time in both directions. Solutions of this type are called collision-ejection orbits.

When l ≠ 0, a different picture emerges. Given X inside the zero velocity curve, we have v_θ = l/r, so that, from the total energy formula,

r^2 v_r^2 = 2hr^2 + 2r − l^2.    (A)

The quadratic polynomial in r on the right in Eq. (A) must therefore be nonnegative, so this puts restrictions on which r-values can occur for X ∈ Σ_h,l. The graph of this quadratic polynomial is concave down since h < 0. If l^2 > −1/2h, the polynomial is everywhere negative, and therefore the space Σ_h,l is empty in this case. If l^2 = −1/2h, we have a single root that occurs at r = −1/2h. Hence this is the only allowable r-value in Σ_h,l in this case. In the tangent plane at (r, θ), we have v_r = 0, v_θ = −2hl, so this represents a circular closed orbit (traversed clockwise if l < 0, counterclockwise if l > 0).

If l^2 < −1/2h, Eq. (A) has two roots α and β with α < β, and the allowable r-values in Σ_h,l fill out the interval [α, β]. In configuration space, these solutions are therefore confined to the annulus A_α,β given by α ≤ r ≤ β; a selection of vectors in Σ_h,l is shown in Figure 13.2.

Figure 13.2 A selection of vectors in Σ_h,l.

We claim that the r coordinate of each solution in Σ_h,l oscillates between α and β. This follows


since, when r = α, we have

v_r' = −1/α^2 + v_θ^2/α = (1/α^3)( −α + l^2 ).

However, since the right-hand side of Eq. (A) vanishes at α, we have

2hα^2 + 2α − l^2 = 0,

so that

−α + l^2 = (2hα + 1)α.

Since α lies to the left of the vertex of the parabola, we have α < −1/2h, and hence 2hα + 1 > 0. It follows that v_r' > 0 when r = α, so the r coordinate of solutions in Σ_h,l reaches a minimum when the curve meets r = α. Similarly, along r = β, the r coordinate reaches a maximum.

Hence solutions in A_α,β must behave as shown in Figure 13.3. As a remark, one can easily show that these curves are preserved by rotations about the origin, so all of these solutions behave symmetrically.

Figure 13.3 Solutions in Σ_h,l that meet r = α or r = β.

More, however, can be said. Each of these solutions actually lies on a closed orbit that traces out an ellipse in configuration space. To see this, we need to turn to analysis.

13.5 Kepler's First Law

For most nonlinear mechanical systems, the geometric analysis of the previous section is just about all we can hope for. In the Newtonian central force system,


however, we get lucky: As has been known for centuries, we can write down explicit solutions for this system.

Consider a particular solution curve of the differential equation. We have two constants of the motion for this system, namely, the angular momentum l and total energy E. The case l = 0 yields collision-ejection solutions, as we saw above. Hence we assume l ≠ 0. We will show that in polar coordinates in configuration space, a solution with nonzero angular momentum lies on a curve given by r(1 + ε cos θ) = κ, where ε and κ are constants. This equation defines a conic section, as can be seen by rewriting this equation in Cartesian coordinates. This fact is known as Kepler's first law.

To prove this, recall that r^2 θ' is constant and nonzero. Hence the sign of θ' remains constant along each solution curve. Thus θ is always increasing or always decreasing in time. Therefore we may also regard r as a function of θ along the curve.

Let W(t) = 1/r(t); then W is also a function of θ. Note that W = −U. The following proposition gives a convenient formula for kinetic energy.

Proposition. The kinetic energy is given by

K = (l^2/2) ( (dW/dθ)^2 + W^2 ).

Proof: In polar coordinates, we have

K = (1/2)( (r')^2 + (rθ')^2 ).

Since r = 1/W, we also have

r' = (−1/W^2)(dW/dθ) θ' = −l dW/dθ.


Finally,

rθ′ = ℓ/r = ℓW.

Substitution into the formula for K then completes the proof.

Now we find a differential equation relating W and θ along the solution curve. Observe that K = E − U = E + W. From the proposition we get

(dW/dθ)² + W² = (2/ℓ²)(E + W).   (B)

Differentiating both sides with respect to θ, dividing by 2 dW/dθ, and using dE/dθ = 0 (conservation of energy), we obtain

d²W/dθ² + W = 1/ℓ²

where 1/ℓ² is a constant.

Note that this equation is just the equation for a harmonic oscillator with constant forcing 1/ℓ². From elementary calculus, solutions of this second-order equation can be written in the form

W(θ) = 1/ℓ² + A cos θ + B sin θ

or, equivalently,

W(θ) = 1/ℓ² + C cos(θ + θ₀)

where the constants C and θ₀ are related to A and B.

If we substitute this expression into Eq. (B) and solve for C (at, say, θ + θ₀ = π/2), we find

C = ±(1/ℓ²)√(1 + 2ℓ²E).

Inserting this into the solution above, we find

W(θ) = (1/ℓ²)(1 ± √(1 + 2Eℓ²) cos(θ + θ₀)).

There is no need to consider both signs in front of the radical, since

cos(θ + θ₀ + π) = −cos(θ + θ₀).


Moreover, by changing the variable θ to θ − θ₀, we can put any particular solution into the form

W(θ) = (1/ℓ²)(1 + √(1 + 2Eℓ²) cos θ).

This looks pretty complicated. However, recall from analytic geometry (or from Exercise 2 at the end of this chapter) that the equation of a conic in polar coordinates is

1/r = (1/κ)(1 + ε cos θ).

Here κ is the latus rectum and ε ≥ 0 is the eccentricity of the conic. The origin is a focus, and the three cases ε > 1, ε = 1, and ε < 1 correspond, respectively, to a hyperbola, parabola, and ellipse. The case ε = 0 is a circle. In our case we have

ε = √(1 + 2Eℓ²),

so the three different cases occur when E > 0, E = 0, or E < 0. We have proved:

Theorem. (Kepler's First Law) The path of a particle moving under the influence of Newton's law of gravitation is a conic of eccentricity √(1 + 2Eℓ²); this path is a hyperbola, parabola, or ellipse according to whether E > 0, E = 0, or E < 0.
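The theorem is easy to test numerically. The following sketch (again assuming NumPy and SciPy, and reusing the integrator shown earlier) computes ε = √(1 + 2Eℓ²) from the initial condition and compares the predicted perihelion and aphelion distances ℓ²/(1 ± ε) with the extremes of r along a computed orbit:

    # Check Kepler's first law along one numerically computed orbit.
    import numpy as np
    from scipy.integrate import solve_ivp

    def central_force(t, w):
        r, theta, vr, vth = w
        return [vr, vth / r, -1 / r**2 + vth**2 / r, -vr * vth / r]

    w0 = [1.0, 0.0, 0.0, 0.8]
    E = 0.5 * (w0[2]**2 + w0[3]**2) - 1 / w0[0]
    ell = w0[0] * w0[3]
    eps = np.sqrt(1 + 2 * E * ell**2)       # predicted eccentricity

    sol = solve_ivp(central_force, (0, 50), w0, max_step=0.01,
                    rtol=1e-10, atol=1e-12)
    r = sol.y[0]

    # On the conic 1/r = (1/l^2)(1 + eps cos(theta - theta0)):
    print("eccentricity:", eps)
    print("perihelion:", r.min(), "predicted:", ell**2 / (1 + eps))
    print("aphelion:  ", r.max(), "predicted:", ell**2 / (1 - eps))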


13.6 The Two-Body Problem

We now turn our attention to the two-body problem. Here two masses m₁ and m₂ move in R³ under their mutual gravitational attraction. If X₁ and X₂ denote the positions of the two bodies, the equations of motion are

X₁″ = g m₂ (X₂ − X₁)/|X₂ − X₁|³
X₂″ = g m₁ (X₁ − X₂)/|X₁ − X₂|³.

Let's examine these equations from the perspective of a viewer living on the first mass. Let X = X₂ − X₁. We then have

X″ = X₂″ − X₁″
   = g m₁ (X₁ − X₂)/|X₁ − X₂|³ − g m₂ (X₂ − X₁)/|X₁ − X₂|³
   = −g(m₁ + m₂) X/|X|³.

But this is just the Newtonian central force problem, with a different choice of constants.

So, to solve the two-body problem, we first determine the solution X(t) of this central force problem. This then determines the right-hand side of the differential equations for both X₁ and X₂ as functions of t, and so we can simply integrate twice to find X₁(t) and X₂(t).

Another way to reduce the two-body problem to the Newtonian central force problem is as follows. The center of mass of the two-body system is the vector

X_c = (m₁X₁ + m₂X₂)/(m₁ + m₂).

A computation shows that X_c″ = 0. Therefore we must have X_c = At + B, where A and B are fixed vectors in R³. This says that the center of mass of the system moves along a straight line with constant velocity.

We now change coordinates so that the origin of the system is located at X_c. That is, we set Y_j = X_j − X_c for j = 1, 2. Therefore m₁Y₁(t) + m₂Y₂(t) = 0 for all t. Rewriting the differential equations in terms of the Y_j, we find

Y₁″ = −(g m₂³/(m₁ + m₂)²) Y₁/|Y₁|³
Y₂″ = −(g m₁³/(m₁ + m₂)²) Y₂/|Y₂|³,

which yields a pair of central force problems. However, since we know that m₁Y₁(t) + m₂Y₂(t) = 0, we need only solve one of them.

13.7 Blowing Up the Singularity


The singularity at the origin in the Newtonian central force problem is the first time we have encountered such a situation. Usually our vector fields have been well defined on all of Rⁿ. In mechanics, such singularities can sometimes be removed by a combination of judicious changes of variables and time scalings. In the Newtonian central force system, this may be achieved using a change of variables introduced by McGehee [32].

We first introduce scaled variables

u_r = r^(1/2) v_r
u_θ = r^(1/2) v_θ.

In these variables the system becomes

r′ = r^(−1/2) u_r
θ′ = r^(−3/2) u_θ
u_r′ = r^(−3/2) ((1/2)u_r² + u_θ² − 1)
u_θ′ = r^(−3/2) (−(1/2)u_r u_θ).

We still have a singularity at the origin, but note that the last three equations are all multiplied by r^(−3/2). We can remove these terms by simply multiplying the vector field by r^(3/2). In doing so, solution curves of the system remain the same but are parameterized differently.

More precisely, we introduce a new time variable τ via the rule

dt/dτ = r^(3/2).

By the chain rule we have

dr/dτ = (dr/dt)(dt/dτ),

and similarly for the other variables. In this new timescale the system becomes

ṙ = r u_r
θ̇ = u_θ
u̇_r = (1/2)u_r² + u_θ² − 1
u̇_θ = −(1/2)u_r u_θ,

where the dot now indicates differentiation with respect to τ.


Note that, when r is small, dt/dτ is close to zero, so "time" τ moves much more slowly than time t near the origin.

This system no longer has a singularity at the origin. We have "blown up" the singularity and replaced it with a new set, given by r = 0 with θ, u_r, and u_θ arbitrary. On this set the system is now perfectly well defined. Indeed, the set r = 0 is an invariant set for the flow, since ṙ = 0 when r = 0. We have thus introduced a fictitious flow on r = 0. While solutions on r = 0 mean nothing in terms of the real system, by continuity of solutions, they can tell us a lot about how solutions behave near the singularity.

We need not concern ourselves with all of r = 0, since the total energy relation in the new variables becomes

hr = (1/2)(u_r² + u_θ²) − 1.

On the set r = 0, only the subset Λ defined by

u_r² + u_θ² = 2, θ arbitrary

matters. The set Λ is called the collision surface for the system; how solutions behave on Λ dictates how solutions move near the singularity, since any solution that approaches r = 0 necessarily comes close to Λ in our new coordinates. Note that Λ is a two-dimensional torus: It is formed by a circle in the θ direction and a circle in the u_r u_θ plane.

On Λ the system reduces to

θ̇ = u_θ
u̇_r = (1/2)u_θ²
u̇_θ = −(1/2)u_r u_θ,

where we have used the energy relation to simplify u̇_r. This system is easy to analyze. We have u̇_r > 0 provided u_θ ≠ 0. Hence the u_r coordinate must increase along any solution in Λ with u_θ ≠ 0.

On the other hand, when u_θ = 0, the system has equilibrium points. There are two circles of equilibria, one given by u_θ = 0, u_r = √2, and θ arbitrary, the other by u_θ = 0, u_r = −√2, and θ arbitrary. Let C± denote these two circles, with u_r = ±√2 on C±. All other solutions must travel from C− to C+, since u_r increases along solutions.
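A simple experiment illustrates this. The Python sketch below (NumPy and SciPy assumed; the small offset 10⁻⁶ is an arbitrary choice) integrates the blown-up τ-system starting on the collision surface, just off the circle C−. The fictitious solution stays on r = 0, while u_r climbs from near −√2 to near +√2 and θ changes by approximately 2π along the way:

    # The fictitious flow on the collision surface r = 0.
    import numpy as np
    from scipy.integrate import solve_ivp

    def blown_up(tau, w):
        r, theta, ur, uth = w
        return [r * ur, uth,
                0.5 * ur**2 + uth**2 - 1,
                -0.5 * ur * uth]

    ur0 = -np.sqrt(2) + 1e-6             # just off the circle C-
    uth0 = np.sqrt(2 - ur0**2)           # energy relation with r = 0
    sol = solve_ivp(blown_up, (0, 60), [0.0, 0.0, ur0, uth0],
                    max_step=0.05, rtol=1e-10, atol=1e-12)

    r, theta, ur, uth = sol.y
    print("r stays at zero:", np.max(np.abs(r)))
    print("u_r runs from", ur[0], "to", ur[-1])          # -sqrt(2) -> +sqrt(2)
    print("change in theta:", theta[-1], "(compare 2*pi =", 2 * np.pi, ")")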


To fully understand the flow on Λ, we introduce the angular variable ψ in each u_r u_θ plane via

u_r = √2 sin ψ
u_θ = √2 cos ψ.

The torus Λ is now parameterized by θ and ψ. In θψ coordinates, the system becomes

θ̇ = √2 cos ψ
ψ̇ = (1/√2) cos ψ.

The circles C± are now given by ψ = ±π/2. Eliminating time from this equation, we find

dψ/dθ = 1/2.

Thus all nonequilibrium solutions have constant slope 1/2 when viewed in θψ coordinates. See Figure 13.4.

Now recall the collision-ejection solutions described in Section 13.4. Each of these solutions leaves the origin and then returns along a ray θ = θ* in configuration space. The solution departs with v_r > 0 (and so u_r > 0) and returns with v_r < 0 (and so u_r < 0). In the blown-up coordinates, the ejection solution emanates from an equilibrium point (θ*, u_r = +√2) on C+, while the collision solution tends to the equilibrium point (θ*, u_r = −√2) on C− as τ → ∞. On Λ, the unstable curve through this lower equilibrium climbs to the upper circle C+, with θ changing by 2π along the way (since dψ/dθ = 1/2 and ψ changes by π). Gluing such a connecting orbit between the incoming collision solution and the outgoing ejection solution yields the picture shown in Figure 13.5.


Figure 13.5 A collision-ejection solution in the region r > 0 leaving and returning to Λ, and a connecting orbit on the collision surface.

What happens to nearby noncollision solutions? Well, they come close to the "lower" equilibrium point with θ = θ*, u_r = −√2, then follow one of the two branches of the unstable curve through this point up to the "upper" equilibrium point θ = θ*, u_r = +√2, and then depart near the unstable curve leaving this equilibrium point. Interpreting this motion in configuration space, we see that each near-collision solution approaches the origin and then retreats after θ either increases or decreases by 2π units. Of course, we know this already, since these solutions whip around the origin in tight ellipses.

13.8 Exploration: Other Central Force Problems

In this exploration, we consider the (non-Newtonian) central force problem for which the potential energy is given by

U(X) = −1/|X|^ν

where ν > 1. The primary goal is to understand near-collision solutions. (A starter script for the numerical items appears after this list.)

1. Write this system in polar coordinates (r, θ, v_r, v_θ) and state explicitly the formulas for total energy and angular momentum.
2. Using a computer, investigate the behavior of solutions of this system when h < 0.


3. Blow up the singularity at the origin via the change of variables

u_r = r^(ν/2) v_r,  u_θ = r^(ν/2) v_θ

and an appropriate change of timescale, and write down the new system.
4. Compute the vector field on the collision surface Λ determined in r = 0 by the total energy relation.
5. Describe the bifurcation that occurs on Λ when ν = 2.
6. Describe the structure of Σ_{h,ℓ} for all ν > 1.
7. Describe the change of structure of Σ_{h,ℓ} that occurs when ν passes through the value 2.
8. Describe the behavior of solutions in Σ_{h,ℓ} when ν > 2.
9. Suppose 1 < ν < 2. Describe the behavior of solutions as they pass close to the singularity at the origin.
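The following Python sketch (NumPy and SciPy assumed; the seed and the cutoff 10⁻⁴ are arbitrary choices) integrates the polar system for the potential U = −1/rᶺν, in the same pattern as the system of Section 13.4, and reports whether the orbit approaches the singularity. It is a convenient starting point for items 2, 8, and 9:

    # Near-collision experiments for the potential U(r) = -1/r^nu.
    import numpy as np
    from scipy.integrate import solve_ivp

    def system(t, w, nu):
        r, theta, vr, vth = w
        return [vr, vth / r,
                -nu / r**(nu + 1) + vth**2 / r,
                -vr * vth / r]

    def collision(t, w, nu):          # event: r becomes very small
        return w[0] - 1e-4
    collision.terminal = True

    w0 = [1.0, 0.0, 0.0, 0.4]         # h < 0, small angular momentum
    for nu in (1.5, 2.0, 2.5):
        sol = solve_ivp(system, (0, 30), w0, args=(nu,),
                        events=collision, rtol=1e-10, atol=1e-12)
        tag = "reaches the singularity" if sol.status == 1 else "stays away"
        print(f"nu = {nu}: min r = {sol.y[0].min():.4f} ({tag})")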


13.9 Exploration: The Anisotropic Kepler Problem

This exploration concerns the anisotropic Kepler problem, the system

x″ = −μx/(μx² + y²)^(3/2)
y″ = −y/(μx² + y²)^(3/2),

where μ is a parameter that we assume is greater than 1.

1. Show that this system is conservative, with potential energy given by

U(x, y) = −1/√(μx² + y²),

and write down an explicit formula for total energy.
2. Describe the geometry of the energy surface Σ_h for energy h < 0.
3. When μ > 9/8, describe in qualitative terms what happens to solutions that approach collision close to the collision-ejection orbits on the x-axis. In particular, how do they retreat from the origin in configuration space? How do solutions approach collision when traveling near the y-axis?

EXERCISES

1. Which of the following force fields on R² are conservative?
(a) F(x, y) = (−x², −2y²)


(b) F(x, y) = (x² − y², 2xy)
(c) F(x, y) = (x, 0)

2. Prove that the equation

1/r = (1/h)(1 + ε cos θ)

determines a hyperbola, parabola, and ellipse when ε > 1, ε = 1, and ε < 1, respectively.

4. Describe the geometry of the set Σ_{h,ℓ} for the Newtonian central force system when h > 0 and h = 0.

5. Let F(X) be a force field on R³. Let X₀, X₁ be points in R³, and let Y(s) be a path in R³ with s₀ ≤ s ≤ s₁, parameterized by arc length s, from X₀ to X₁. The work done in moving a particle along this path is defined to be the integral

∫ from s₀ to s₁ of F(Y(s)) · Y′(s) ds,

where Y′(s) is the (unit) tangent vector to the path. Prove that the force field is conservative if and only if the work is independent of the path. In fact, if F = −grad V, then the work done is V(X₀) − V(X₁).

6. Describe solutions to the non-Newtonian central force system given by

X″ = −X/|X|⁴.

7. Discuss solutions of the equation

X″ = X/|X|³.

This equation corresponds to a repulsive rather than attractive force at the origin.


8. This and the next two problems deal with the two-body problem. Let the potential energy be

U = −g m₁ m₂/|X₂ − X₁|

and let

grad_j(U) = (∂U/∂x₁ʲ, ∂U/∂x₂ʲ, ∂U/∂x₃ʲ).

Show that the equations for the two-body problem can be written

m_j X_j″ = −grad_j(U).

9. Show that the total energy K + U of the system is a constant of the motion, where

K = (1/2)(m₁|V₁|² + m₂|V₂|²).

10. Define the angular momentum of the system by

ℓ = m₁(X₁ × V₁) + m₂(X₂ × V₂),

and show that ℓ is also a first integral.




14 The Lorenz System

So far, in all of the differential equations we have studied, we have not encountered any "chaos." The reason is simple: The linear systems of the first few chapters always exhibit straightforward, predictable behavior. (OK, we may see solutions wrap densely around a torus as in the oscillators of Chapter 6, but this is not chaos.) Also, for the nonlinear planar systems of the last few chapters, the Poincaré-Bendixson theorem completely eliminates any possibility of chaotic behavior. So, to find chaotic behavior, we need to look at nonlinear, higher dimensional systems.

In this chapter we investigate the system that is, without doubt, the most famous of all chaotic differential equations, the Lorenz system from meteorology. First formulated in 1963 by E. N. Lorenz as a vastly oversimplified model of atmospheric convection, this system possesses what has come to be known as a "strange attractor." Before the Lorenz model started making headlines, the only types of stable attractors known in differential equations were equilibria and closed orbits. The Lorenz system truly opened up new horizons in all areas of science and engineering, because many of the phenomena present in the Lorenz system have since been found in all of the areas we have previously investigated (biology, circuit theory, mechanics, and elsewhere).

In the ensuing 40 years, much progress has been made in the study of chaotic systems. Be forewarned, however, that the analysis of the chaotic behavior of particular systems like the Lorenz system is usually extremely difficult. Most of the chaotic behavior that is readily understandable arises from geometric models for particular differential equations, rather than from the actual equations themselves.


Indeed, this is the avenue we pursue here. We will present a geometric model for the Lorenz system that can be completely analyzed using tools from discrete dynamics. Although this model has been known for some 30 years, it is interesting to note that it was shown to be equivalent to the Lorenz system only in 1999.

14.1 Introduction to the Lorenz System

In 1963, E. N. Lorenz [29] attempted to set up a system of differential equations that would explain some of the unpredictable behavior of the weather. Most viable models for weather involve partial differential equations; Lorenz sought a much simpler and easier-to-analyze system.

The Lorenz model may be somewhat inaccurately thought of as follows. Imagine a planet whose "atmosphere" consists of a single fluid particle. As on earth, this particle is heated from below (and hence rises) and cooled from above (so then falls back down). Can a weather expert predict the "weather" on this planet? Sadly, the answer is no, which raises a lot of questions about the possibility of accurate weather prediction down here on earth, where we have quite a few more particles in our atmosphere.

A little more precisely, Lorenz looked at a two-dimensional fluid cell that was heated from below and cooled from above. The fluid motion can be described by a system of differential equations involving infinitely many variables. Lorenz made the tremendous simplifying assumption that all but three of these variables remained constant. The remaining independent variables then measured, roughly speaking, the rate of convective "overturning" (x), and the horizontal and vertical temperature variation (y and z, respectively). The resulting motion led to a three-dimensional system of differential equations that involved three parameters: the Prandtl number σ, the Rayleigh number r, and another parameter b that is related to the physical size of the system. When all of these simplifications were made, the system of differential equations involved only two nonlinear terms and was given by

x′ = σ(y − x)
y′ = rx − y − xz
z′ = xy − bz.

In this system all three parameters are assumed to be positive and, moreover, σ > b + 1. We denote this system by X′ = L(X). In Figure 14.1, we display the solution curves through two different initial conditions, P₁ = (0, 2, 0) and P₂ = (0, −2, 0), when the parameters are σ = 10, b = 8/3, and r = 28. These are the original parameters that led to Lorenz's discovery.
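Solutions such as those in Figure 14.1 are easy to compute. The following Python sketch (assuming the NumPy, SciPy, and Matplotlib libraries) integrates the Lorenz system from the two initial conditions above and plots the x and z coordinates of the resulting curves:

    # Integrate the Lorenz system for sigma = 10, b = 8/3, r = 28.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.integrate import solve_ivp

    sigma, b, r = 10.0, 8.0 / 3.0, 28.0

    def lorenz(t, X):
        x, y, z = X
        return [sigma * (y - x), r * x - y - x * z, x * y - b * z]

    t = np.linspace(0, 40, 10000)
    for X0 in ([0, 2, 0], [0, -2, 0]):
        sol = solve_ivp(lorenz, (0, 40), X0, t_eval=t, rtol=1e-9)
        plt.plot(sol.y[0], sol.y[2], lw=0.3)   # project onto the xz-plane
    plt.xlabel("x"); plt.ylabel("z")
    plt.show()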


Figure 14.1 The Lorenz attractor. Two solutions with initial conditions P₁ = (0, 2, 0) and P₂ = (0, −2, 0).

Note how both solutions start out very differently but eventually have more or less the same fate: They both seem to wind around a pair of points, alternating at times which point they encircle. This is the first important fact about the Lorenz system: All nonequilibrium solutions tend eventually to the same complicated set, the so-called Lorenz attractor.

There is another important ingredient lurking in the background here. In Figure 14.1, we started with two relatively far apart initial conditions. Had we started with two very close initial conditions, we would not have observed the "transient behavior" apparent in Figure 14.1. Rather, more or less the same picture would have resulted for each solution. This, however, is misleading. When we plot the actual coordinates of the solutions, we see that these two solutions actually move quite far apart during their journey around the Lorenz attractor. This is illustrated in Figure 14.2, where we have graphed the x coordinates of two solutions that start out nearby, one at (0, 2, 0), the other (in gray) at (0, 2.01, 0). These graphs are nearly identical for a certain time period, but then they differ considerably as one solution travels around one of the lobes of the attractor while the other solution travels around the other. No matter how close two solutions start, they always move apart in this manner when they are close to the attractor. This is sensitive dependence on initial conditions, one of the main features of a chaotic system.

We will describe the concepts of an attractor and of chaos in detail in this chapter. But first, we need to investigate some of the more familiar features of the system.
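The separation of nearby solutions is as easy to measure as it is to see. This sketch (same library assumptions as above) integrates the two solutions of Figure 14.2 and prints the distance between them at several times; the distance grows roughly exponentially until it saturates at the size of the attractor:

    # Sensitive dependence: distance between solutions started 0.01 apart.
    import numpy as np
    from scipy.integrate import solve_ivp

    def lorenz(t, X):
        x, y, z = X
        return [10 * (y - x), 28 * x - y - x * z, x * y - (8 / 3) * z]

    t = np.linspace(0, 30, 6000)
    s1 = solve_ivp(lorenz, (0, 30), [0, 2.00, 0], t_eval=t,
                   rtol=1e-10, atol=1e-12)
    s2 = solve_ivp(lorenz, (0, 30), [0, 2.01, 0], t_eval=t,
                   rtol=1e-10, atol=1e-12)

    sep = np.linalg.norm(s1.y - s2.y, axis=0)
    for ti in (0, 5, 10, 15, 20, 25):
        print(f"t = {ti:2d}   separation = {sep[np.searchsorted(t, ti)]:.3e}")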


Figure 14.2 The x(t) graphs for two nearby initial conditions P₁ = (0, 2, 0) and P₂ = (0, 2.01, 0).

14.2 Elementary Properties of the Lorenz System

As usual, to analyze this system, we begin by finding the equilibria. Some easy algebra yields three equilibrium points, the origin and

Q± = (±√(b(r − 1)), ±√(b(r − 1)), r − 1).

The latter two equilibria exist only when r > 1, so already we see that we have a bifurcation when r = 1.

Linearizing, we find the system

Y′ = ( −σ    σ    0
       r−z  −1   −x
        y    x   −b ) Y.

At the origin, the eigenvalues of this matrix are −b and

λ± = (1/2)(−(σ + 1) ± √((σ + 1)² − 4σ(1 − r))).

Note that both λ± are negative when 0 ≤ r < 1.


When x = y = 0, we have x′ = y′ = 0, so the z-axis is invariant. On this axis, we have simply z′ = −bz, so all solutions on this axis tend to the origin. In fact, the solution through any point in R³ tends to the origin when r < 1: the function

L(x, y, z) = x² + σy² + σz²

satisfies

L̇ = −2σ(x² + y² + bz² − (1 + r)xy),

and the quadratic form on the right is positive definite precisely when r < 1, so L is a strict Liapunov function on all of R³ in this case. Hence the origin is a global sink when r < 1.

When r > 1, the equilibria Q± appear, and it is natural to ask when they are sinks. The characteristic polynomial of the linearized system at Q± is

f_r(λ) = λ³ + (σ + b + 1)λ² + b(σ + r)λ + 2bσ(r − 1).

Proposition. The equilibrium points Q± are sinks provided

1 < r < r* = σ(σ + b + 3)/(σ − b − 1).


Proof: When r = 1, the polynomial f₁ has distinct roots at 0, −b, and −σ − 1. These roots are distinct since σ > b + 1, so that

−σ − 1 < −σ + 1 < −b < 0.

Hence the roots of f_r vary continuously with r. Moreover, f_r(λ) > 0 for λ ≥ 0 and r > 1, since every coefficient of f_r is then positive. Looking at the graph of f_r, it follows that, at least for r close to 1, the three roots of f_r must be real and negative.

We now let r increase and ask what is the lowest value of r for which f_r has an eigenvalue with zero real part. Note that this eigenvalue must in fact be of the form ±iω with ω ≠ 0, since f_r is a real polynomial that has no roots equal to 0 when r > 1. Solving f_r(iω) = 0 by equating both real and imaginary parts to zero then yields the result (recall that we have assumed σ > b + 1).

We remark that a Hopf bifurcation is known to occur at r*, but proving this is beyond the scope of this book.

When r > 1 it is no longer true that all solutions tend to the origin. However, we can say that solutions that start far from the origin do at least move closer in. To be precise, let

V(x, y, z) = rx² + σy² + σ(z − 2r)².

Note that V(x, y, z) = ν > 0 defines an ellipsoid in R³ centered at (0, 0, 2r). We will show:

Proposition. There exists ν* such that any solution that starts outside the ellipsoid V = ν* eventually enters this ellipsoid and then remains trapped therein for all future time.

Proof: We compute

V̇ = −2σ(rx² + y² + b(z² − 2rz))
  = −2σ(rx² + y² + b(z − r)² − br²).

The equation

rx² + y² + b(z − r)² = μ

also defines an ellipsoid when μ > 0. When μ > br², we have V̇ < 0. Thus we may choose ν* large enough so that the ellipsoid V = ν* strictly contains the ellipsoid

rx² + y² + b(z − r)² = br²

in its interior. Then V̇ < 0 at all points outside this smaller ellipsoid, and V̇ is in fact bounded away from zero outside any ellipsoid V = ν with ν ≥ ν*. Hence every solution eventually enters the region V ≤ ν* and remains trapped there for all future time.
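The computation of V̇ is mechanical and easily double-checked by machine. The following sketch (assuming the SymPy library) verifies the identity used in the proof:

    # Symbolic check that Vdot = -2 sigma (r x^2 + y^2 + b(z-r)^2 - b r^2).
    import sympy as sp

    x, y, z, sigma, b, r = sp.symbols("x y z sigma b r")
    xp = sigma * (y - x)
    yp = r * x - y - x * z
    zp = x * y - b * z

    V = r * x**2 + sigma * y**2 + sigma * (z - 2 * r)**2
    Vdot = V.diff(x) * xp + V.diff(y) * yp + V.diff(z) * zp
    claimed = -2 * sigma * (r * x**2 + y**2 + b * (z - r)**2 - b * r**2)
    print(sp.simplify(Vdot - claimed))    # prints 0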


Thus the Lorenz system possesses a trapping region: an ellipsoidal region that every solution eventually enters and never leaves. The intersection of the forward images of this region under the flow is then an attracting set for the system. In the plane, the only types of attracting sets we have encountered are limit cycles, equilibrium points, and solutions connecting them.


In higher dimensions, these attractors may be much "stranger," as we show in the next section.

14.3 The Lorenz Attractor

The behavior of the Lorenz system as the parameter r increases is the subject of much contemporary research; we are decades (if not centuries) away from rigorously understanding all of the fascinating dynamical phenomena that occur as the parameters change. Sparrow has written an entire book devoted to this subject [44].

In this section we will deal with one specific set of parameters where the Lorenz system has an attractor. Roughly speaking, an attractor for the flow is an invariant set that "attracts" all nearby solutions. To be more precise:

Definition. Let X′ = F(X) be a system of differential equations in Rⁿ with flow φ_t. A set Λ is called an attractor if

1. Λ is compact and invariant;
2. there is an open set U containing Λ such that for each X ∈ U, φ_t(X) ∈ U for all t ≥ 0 and ∩_{t≥0} φ_t(U) = Λ;
3. (transitivity) given any points Y₁, Y₂ ∈ Λ and any open neighborhoods U_j about Y_j in U, there is a solution curve that begins in U₁ and later passes through U₂.

The transitivity condition in this definition may seem a little strange. Basically, we include it to guarantee that we are looking at a single attractor rather than a collection of dynamically different attractors. For example, the transitivity condition rules out situations such as that given by the planar system

x′ = x − x³
y′ = −y.

The phase portrait of this system is shown in Figure 14.3. Note that any solution of this system enters the set marked U and then tends to one of the three equilibrium points: either to one of the sinks at (±1, 0) or to the saddle (0, 0). The forward intersection of the flow φ_t applied to U is the interval −1 ≤ x ≤ 1. This interval meets conditions 1 and 2 in the definition, but condition 3 is violated, because none of the solution curves passes close to points in both the left and right halves of this interval.


Figure 14.3 The interval on the x-axis between the two sinks is not an attractor for this system, despite the fact that all solutions enter U.

We choose not to consider this set an attractor, since most solutions tend to one of the two sinks. We really have two distinct attractors in this case.

As a remark, there is no universally accepted definition of an attractor in mathematics; some people choose to say that a set Λ that meets only conditions 1 and 2 is an attractor, while if Λ also meets condition 3, it would be called a transitive attractor. For planar systems, condition 3 is usually easily verified; in higher dimensions, however, this can be much more difficult, as we shall see.

For the rest of this chapter, we restrict attention to the very special case of the Lorenz system where the parameters are given by σ = 10, b = 8/3, and r = 28. Historically, these are the values Lorenz used when he first encountered chaotic phenomena in this system. Thus, the specific Lorenz system we consider is

X′ = L(X) = ( 10(y − x)
              28x − y − xz
              xy − (8/3)z ).

As in the previous section, we have three equilibria: the origin and Q± = (±6√2, ±6√2, 27). At the origin we find the eigenvalues λ₁ = −8/3 and

λ± = −11/2 ± √1201/2.
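The eigenvalues at all three equilibria are easily computed numerically. The sketch below (NumPy assumed) evaluates the Jacobian matrix from Section 14.2 at each equilibrium:

    # Eigenvalues of the linearized system at the three equilibria.
    import numpy as np

    sigma, b, r = 10.0, 8.0 / 3.0, 28.0

    def jac(x, y, z):
        return np.array([[-sigma, sigma, 0.0],
                         [r - z, -1.0, -x],
                         [y, x, -b]])

    print("origin:", np.linalg.eigvals(jac(0, 0, 0)))
    print("check: ", (-11 - np.sqrt(1201)) / 2, -8 / 3, (-11 + np.sqrt(1201)) / 2)

    q = np.sqrt(b * (r - 1))             # Q+- = (+-q, +-q, r - 1)
    print("Q+:    ", np.linalg.eigvals(jac(q, q, r - 1)))
    # At Q+-: one negative real eigenvalue and a complex pair
    # with small positive real part.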


For later use, note that these eigenvalues satisfy

λ− < −λ+ < λ₁ < 0 < λ+.

The linearized system at the origin is then

Y′ = ( λ−   0    0
        0   λ+   0
        0   0    λ₁ ) Y.

The phase portrait of the linearized system is shown in Figure 14.4. Note that all solutions in the stable plane of this system tend to the origin tangentially to the z-axis.

Figure 14.4 Linearization at the origin for the Lorenz system.

At Q±, a computation shows that there is a single negative real eigenvalue and a pair of complex conjugate eigenvalues with positive real parts. Note that the symmetry in the system forces the rotations about Q+ and Q− to have opposite orientations.

In Figure 14.5, we display a numerical computation of a portion of the left and right branches of the unstable curve at the origin. Note that the right-hand portion of this curve comes close to Q− and then spirals away. The left portion behaves symmetrically under reflection through the z-axis. In Figure 14.6, we display a significantly larger portion of these unstable curves. Note that they appear to circulate around the two equilibria, sometimes spiraling around Q+, sometimes about Q−. In particular, these curves continually reintersect the portion of the plane z = 27 containing Q± in which the vector field points downward. This suggests that we may construct a Poincaré map on a portion of this plane.


Figure 14.5 The unstable curve at the origin.

As we have seen before, computing a Poincaré map is often impossible, and this case is no different. So we will content ourselves with building a simplified model that exhibits much of the behavior we find in the Lorenz system. As we shall see in the following section, this model provides a computable means to assess the chaotic behavior of the system.

Figure 14.6 More of the unstable curve at the origin.


14.4 A Model for the Lorenz Attractor

In this section we describe a geometric model for the Lorenz attractor originally proposed by Guckenheimer and Williams [20]. Tucker [46] showed that this model does indeed correspond to the Lorenz system for certain parameters. Rather than specify the vector field exactly, we give instead a qualitative description of its flow, much as we did in Chapter 11. The specific numbers we use are not that important; only their relative sizes matter.

We will assume that our model is symmetric under the reflection (x, y, z) → (−x, −y, z), as is the Lorenz system. We first place an equilibrium point at the origin in R³ and assume that, in the cube S given by |x|, |y|, |z| ≤ 5, the system is linear. Rather than use the eigenvalues λ₁ and λ± from the actual Lorenz system, we simplify the computations a bit by assuming that the eigenvalues are −1, 2, and −3, and that the system is given in the cube by

x′ = −3x
y′ = 2y
z′ = −z.

Note that the phase portrait of this system agrees with that in Figure 14.4 and that the relative magnitudes of the eigenvalues are the same as in the Lorenz case.

We need to know how solutions make the transit near (0, 0, 0). Consider a rectangle R₁ in the plane z = 1 given by |x| ≤ 1, 0 < y ≤ ε for some small ε > 0. The solution through (x₀, y₀, 1) ∈ R₁ is

x(t) = x₀e^(−3t), y(t) = y₀e^(2t), z(t) = e^(−t),

and it reaches the plane y = 1 at the time for which e^(2t) = 1/y₀. At that moment its coordinates are

h(x₀, y₀) = (x₀ y₀^(3/2), 1, y₀^(1/2)),

so the transit map h squeezes R₁ strongly in the x direction as it carries it to the plane y = 1. By symmetry, solutions entering through the mirror-image rectangle −R₁ (with −ε ≤ y < 0) make the corresponding transit to the plane y = −1.

Figure 14.7 Solutions making the transit near (0, 0, 0).


Outside the cube S, we place two additional equilibria, Q±, and we assume that the flow carries the two exiting bundles of solutions around these points and back to a square Σ in the plane z = 1 centered at the origin, with solutions spiraling away from Q± in the same manner as in the Lorenz system. We also assume that the stable surface of (0, 0, 0) first meets Σ in the line of intersection of the xz-plane and Σ.

Let ζ± denote the two branches of the unstable curve at the origin. We assume that these curves make a passage around Q± and then enter this square, as shown in Figure 14.8. We denote the first point of intersection of ζ± with Σ by ρ± = (±x*, ∓y*).

Figure 14.8 The solutions ζ± and their intersection with Σ in the model for the Lorenz attractor.


Now consider a straight line y = ν in Σ. If ν = 0, all solutions beginning at points on this line tend to the origin as time moves forward; hence these solutions never return to Σ. We assume that all other solutions originating in Σ do return to Σ as time moves forward. How these solutions return leads to our major assumptions about this model:

1. Return condition: Let Σ+ = Σ ∩ {y > 0} and Σ− = Σ ∩ {y < 0}. We assume that the return map Φ is defined on Σ+ ∪ Σ−, and that Φ carries each of Σ± to a tip-shaped region Φ(Σ±) ⊂ Σ extending to the point ρ±, as in Figure 14.8.
2. Contracting direction: We assume that Φ contracts Σ± in the x direction.
3. Expanding direction: We assume that Φ stretches Σ± in the y direction by a factor exceeding √2.
4. Hyperbolicity condition: Besides the expansion and contraction, we assume that DΦ maps vectors tangent to Σ± whose slopes are ±1 to vectors whose slopes have magnitude larger than μ > 1.

Analytically, these assumptions imply that the map Φ assumes the form

Φ(x, y) = (f(x, y), g(y)),

where g′(y) > √2 and 0 < ∂f/∂x < c for some constant c < 1.
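Nothing in what follows depends on a particular choice of f and g, but it may help to have one concrete instance in hand. In the sketch below (NumPy and Matplotlib assumed), the formulas for f and g are ad hoc choices, not derived from the Lorenz system, satisfying the inequalities above with y* = 1, contraction factor 0.3, and expansion factor 1.6 > √2; iterating a cloud of random points gives a rough picture of the limit set of Φ:

    # One concrete map of the form Phi(x, y) = (f(x, y), g(y)).
    import numpy as np
    import matplotlib.pyplot as plt

    ystar = 1.0

    def Phi(x, y):
        s = np.sign(y)
        f = 0.3 * x + 0.7 * s                 # contraction in x: df/dx = 0.3
        g = s * (1.6 * np.abs(y) - ystar)     # expansion in y: g' = 1.6 > sqrt(2)
        return f, g

    rng = np.random.default_rng(1)
    x = rng.uniform(-1, 1, 5000)
    y = rng.uniform(-ystar, ystar, 5000)
    for _ in range(30):                        # iterate toward the limit set
        x, y = Phi(x, y)
        keep = np.abs(y) > 1e-12               # Phi is undefined on y = 0
        x, y = x[keep], y[keep]

    plt.plot(x, y, ".", markersize=1)
    plt.xlabel("x"); plt.ylabel("y")
    plt.show()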


Now let R denote the rectangle in Σ given by |y| ≤ y*. Any solution that starts in Σ but outside R must eventually meet R, so it suffices to consider the behavior of Φ on R. A planar picture of the action of Φ on R is displayed in Figure 14.9. Note that Φ(R) ⊂ R.

Figure 14.9 The Poincaré map Φ on R.

Let Φⁿ denote the nth iterate of Φ, and let

A = ∩ from n = 0 to ∞ of cl(Φⁿ(R)),

where cl(U) denotes the closure of the set U. The set A will be the intersection of the attractor for the flow with R. That is, let

𝒜 = ( ∪ over t ∈ R of φ_t(A) ) ∪ {(0, 0, 0)}.

We add the origin here so that 𝒜 will be a closed set. We will prove:

Theorem. 𝒜 is an attractor for the model Lorenz system.

Proof: The proof that 𝒜 is an attractor for the flow follows immediately from the fact that A is an attractor for the mapping Φ (where an attractor for a mapping is defined completely analogously to that for a flow). Clearly, A is closed. Technically, A itself is not invariant under Φ, since Φ is not defined along y = 0. However, for the flow, the solutions through all such points do lie in 𝒜, and so 𝒜 is invariant. If we let O be the open set given by the interior of Σ, then for any (x, y) ∈ O, there is an n such that Φⁿ(x, y) ∈ R. Hence

∩ from n = 0 to ∞ of Φⁿ(O) ⊂ A.


By definition, A = ∩_{n≥0} Φⁿ(R), and so A = ∩_{n≥0} Φⁿ(O) as well. Therefore conditions 1 and 2 in the definition of an attractor hold for Φ.

It remains to show the transitivity property. We need to show that if P₁ and P₂ are points in A, and W_j are open neighborhoods of P_j in O, then there exists an n ≥ 0 such that Φⁿ(W₁) ∩ W₂ ≠ ∅.

Given a set U ⊂ R, let π_y(U) denote the projection of U onto the y-axis. Also let l_y(U) denote the length of π_y(U), which we call the y length of U. In the following, U will be a finite collection of connected sets, so l_y(U) is well defined.

We need a lemma.

Lemma. For any open set W ⊂ R, there exists n > 0 such that π_y(Φⁿ(W)) is the interval [−y*, y*]. Equivalently, Φⁿ(W) meets each line y = c in R.

Proof: First suppose that W contains a connected subset W′ that extends from one of the tips to y = 0, so that l_y(W′) = y*. Then Φ(W′) is connected, and we have l_y(Φ(W′)) > √2 y*. Moreover, Φ(W′) also extends from one of the tips, but now crosses y = 0, since its y length exceeds y*.

Now apply Φ again. Note that Φ²(W′) contains two pieces, one of which extends to ρ+, the other to ρ−. Moreover, l_y(Φ²(W′)) > 2y*. Thus it follows that π_y(Φ²(W′)) = [−y*, y*], and so we are done in this special case.

For the general case, suppose first that W is connected and does not cross y = 0. Then we have l_y(Φ(W)) > √2 l_y(W) as above, so the y length of Φ(W) grows by a factor of more than √2.

If W does cross y = 0, then we may find a pair of connected sets W± with W± ⊂ R± ∩ W and l_y(W+ ∪ W−) = l_y(W). The images Φ(W±) extend to the tips ρ±. If either of these sets also meets y = 0, then we are done by the above. If neither Φ(W+) nor Φ(W−) crosses y = 0, then we may apply Φ again. Both Φ²(W+) and Φ²(W−) are connected sets, and we have l_y(Φ²(W±)) > 2 l_y(W±). Hence, for one of W+ or W−, we have l_y(Φ²(W±)) > l_y(W), and again the y length of W grows under iteration.

Thus, if we continue to iterate Φ or Φ² and choose the appropriate largest subset of Φʲ(W) at each stage as above, then we see that the y lengths of these images grow without bound. This completes the proof of the lemma.

We now complete the proof of the theorem. We must find a point in W₁ whose image under an iterate of Φ lies in W₂. Toward that end, note that

|Φᵏ(x₁, y) − Φᵏ(x₂, y)| ≤ cᵏ |x₁ − x₂|,

since Φʲ(x₁, y) and Φʲ(x₂, y) lie on the same straight line parallel to the x-axis for each j, and Φ contracts distances in the x direction by the factor c < 1.


Now choose m so large that cᵐ is smaller than the size of W₂. Since P₂ ∈ A, there is a point (x̂, ŷ) in A with Φᵐ(x̂, ŷ) = P₂. By the lemma, there is an n > 0 such that Φⁿ(W₁) meets the horizontal line y = ŷ, say at a point (x̃, ŷ). Then Φᵐ(x̃, ŷ) lies on the same horizontal line as P₂ and within distance cᵐ of P₂, so Φⁿ⁺ᵐ(W₁) meets W₂. This completes the proof of the theorem.

14.5 The Chaotic Attractor

We have now shown that the model possesses an attractor. In this section we show that the dynamics on this attractor is chaotic. Since Φ contracts distances in the x direction, its essential behavior is governed by the second coordinate of Φ, the one-dimensional function g whose graph is displayed in Figure 14.10. In the remainder of this section we analyze the dynamics of this one-dimensional discrete dynamical system.


Figure 14.10 The graph of the one-dimensional function g on I = [−y*, y*].

In Chapter 15, we plunge more deeply into this topic.

Let I be the interval [−y*, y*]. Recall that g is defined on I, except at y = 0, and satisfies g(−y) = −g(y). From the results of the previous section, we have g′(y) > √2 and 0 < g(y*) < y*, with g(y) tending to ∓y* as y → 0±, as in Figure 14.10.

To understand the structure of the attractor A, we consider backward orbits under g: sequences (y₀, y₋₁, y₋₂, ...) in I with g(y₋ₖ₋₁) = y₋ₖ for each k. Since g maps each of the half-intervals [−y*, 0) and (0, y*] onto a set stretching across almost all of I, a typical point of I has two distinct preimages, one in each half-interval, and so we may typically construct infinitely many distinct backward orbits from a given y₀.


The only exceptions occur at the images of the endpoints ±y*: such a point has two preimages, one of which is the endpoint and the other is a point in I that does not equal the other endpoint. Hence we can continue taking preimages of this second backward orbit as before, thereby generating infinitely many distinct backward orbits as before.

We claim that each of these infinite backward orbits of y₀ corresponds to a unique point in A ∩ {y = y₀}. To see this, consider the line J₋ₖ given by y = y₋ₖ in R. Then Φᵏ(J₋ₖ) is a closed subinterval of J₀ for each k. Note that Φ(J₋ₖ₋₁) ⊂ J₋ₖ, since the image of the line y = y₋ₖ₋₁ is a proper subinterval of the line y = y₋ₖ. Hence the nested intersection of the sets Φᵏ(J₋ₖ) is nonempty, and any point in this intersection has backward orbit (y₀, y₋₁, y₋₂, ...) by construction. Furthermore, the intersection point is unique, since each application of Φ contracts the intervals y = y₋ₖ by a factor of c < 1.

The expansion of g in the y direction, on the other hand, forces nearby orbits in I to separate, by a factor of at least √2 with each iteration, until they spread over a macroscopic portion of I. More precisely, we have:

Proposition. Let 0 < ν < y*. For any y₀ ∈ I and any ε > 0, we may find u₀, v₀ ∈ I with |u₀ − y₀| < ε and |v₀ − y₀| < ε, and n > 0, such that

|gⁿ(u₀) − gⁿ(v₀)| ≥ 2ν.

Proof: Let J be the interval of length 2ε centered at y₀. Each iteration of g expands the length of J by a factor of at least √2, so there is an iteration for which gⁿ(J) contains 0 in its interior. Then gⁿ⁺¹(J) contains points arbitrarily close to both ±y*, and hence there are points in gⁿ⁺¹(J) whose distance from each other is at least 2ν. This completes the proof.

Let's interpret the meaning of this proposition in terms of the attractor A. Given any point in the attractor, we may always find points arbitrarily nearby whose forward orbits move apart just about as far as they possibly can. This is the hallmark of a chaotic system: We call this behavior sensitive dependence on initial conditions. A tiny change in the initial position of the orbit may result in drastic changes in the eventual behavior of the orbit.


Note that we must have a similar sensitivity for the flow in 𝒜; certain nearby solution curves in 𝒜 must also move far apart. This is the behavior we witnessed in Figure 14.2.

This should be contrasted with the behavior of points in A that lie on the same line y = constant, with −y* < y < y*: under iteration of Φ, such points approach each other, since Φ contracts these lines in the x direction.

Next we show that periodic points of Φ are dense in A. Let P ∈ A, and let U be an open neighborhood of P in R. Given a small ε > 0, we construct a rectangle W ⊂ U centered at P and having width 2ε (in the x direction) and height ε (in the y direction). Let W₁ ⊂ W be a smaller square centered at P with sidelength ε/2. By the transitivity result of the previous section, we may find a point Q₁ ∈ W₁ such that Φⁿ(Q₁) = Q₂ ∈ W₁. By choosing a subset of W₁ if necessary, we may assume that n > 4 and, furthermore, that n is so large that cⁿ < ε/8. It follows that the image Φⁿ(W) (not Φⁿ(W₁)) crosses through the interior of W nearly vertically and extends beyond its top and bottom boundaries, as depicted in Figure 14.11. This fact uses the hyperbolicity condition.

Now consider the lines y = c in W. These lines are mapped by Φⁿ to other such lines in R. Since the vertical direction is expanded, some of the lines must be mapped above W and some below. It follows that one such line y = c₀ must be mapped inside itself by Φⁿ, and therefore there must be a fixed point for Φⁿ on this line. Since this line is contained in W, we have produced a periodic point for Φ in W. This proves density of periodic points in A.

In terms of the flow, a solution beginning at a periodic point of Φ is a closed orbit. Hence the set of points lying on closed orbits is a dense subset of 𝒜.


Figure 14.11 Φⁿ maps W across itself.

The structure of these closed orbits is quite interesting from a topological point of view, because many of these closed curves are actually "knotted." See [9] and Exercise 10.

Finally, we say that a function g is transitive on I if, for any pair of points y₁ and y₂ in I and neighborhoods U_i of y_i, we can find ỹ ∈ U₁ and n such that gⁿ(ỹ) ∈ U₂. Just as in the proof of density of periodic points, we may use the fact that g is expanding in the y direction to prove:

Proposition. The function g is transitive on I.

We leave the details to the reader. In terms of Φ, we almost proved the corresponding result when we showed that 𝒜 was an attractor. The only detail we did not provide was the fact that we could find a point in 𝒜 whose orbit makes the transit arbitrarily close to any given pair of points in 𝒜. For this detail, we refer to Exercise 4.

Thus we can summarize the dynamics of Φ on the attractor A of the Lorenz model as follows.

Theorem. (Dynamics of the Lorenz Model) The Poincaré map Φ restricted to the attractor A for the Lorenz model has the following properties:
1. Φ has sensitive dependence on initial conditions;
2. periodic points of Φ are dense in A;
3. Φ is transitive on A.

We say that a mapping with the above properties is chaotic. We caution the reader that, just as in the definition of an attractor, there are many definitions of chaos around. Some involve exponential separation of orbits, others involve positive Liapunov exponents, and others do not require density of periodic points. It is an interesting fact that, for continuous functions of the real line, density of periodic points and transitivity are enough to guarantee sensitive dependence. See [8].


We will delve more deeply into chaotic behavior of discrete systems in the next chapter.

14.6 Exploration: The Rössler Attractor

In this exploration, we investigate a three-dimensional system similar in many respects to the Lorenz system. The Rössler system [37] is given by

x′ = −y − z
y′ = x + ay
z′ = b + z(x − c),

where a, b, and c are real parameters. For simplicity, we will restrict attention to the case where a = 1/4, b = 1, and c ranges from 0 to 7.

As with the Lorenz system, it is difficult to prove specific results about this system, so much of this exploration will center on numerical experimentation and the construction of a model. (A simulation sketch appears after this list.)

1. First find all equilibrium points for this system.
2. Describe the bifurcation that occurs at c = 1.
3. Investigate numerically the behavior of this system as c increases. What bifurcations do you observe?
4. In Figure 14.12 we have plotted a single solution for c = 5.5. Compute other solutions for this parameter value, and display the results from other viewpoints.

Figure 14.12 The Rössler attractor.
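A sketch to get started (NumPy, SciPy, and Matplotlib assumed; change c to explore the bifurcations, and change the projection to view the attractor from other angles):

    # Integrate the Rossler system for a = 1/4, b = 1 and a chosen c.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.integrate import solve_ivp

    a, b, c = 0.25, 1.0, 5.5

    def rossler(t, X):
        x, y, z = X
        return [-y - z, x + a * y, b + z * (x - c)]

    t = np.linspace(0, 300, 60000)
    sol = solve_ivp(rossler, (0, 300), [1, 1, 1], t_eval=t, rtol=1e-9)

    n = len(t) // 3                      # discard the transient
    plt.plot(sol.y[0][n:], sol.y[1][n:], lw=0.2)
    plt.xlabel("x"); plt.ylabel("y")
    plt.show()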


EXERCISES

8. Consider a map Φ on a rectangle R as shown in Figure 14.13, where Φ has properties similar to the model Lorenz Φ. How many periodic points of period n does Φ have?

Figure 14.13 Φ maps R completely across itself.

9. Consider the system

x′ = 10(y − x)
y′ = 28x − y + xz
z′ = xy − (8/3)z.

Show that this system is not chaotic. (Note the +xz term in the equation for y′.) Hint: Show that most solutions tend to ∞.
10. A simple closed curve in R³ is knotted if it cannot be continuously deformed into the "unknot," the unit circle in the xy-plane, without having self-intersections along the way. Using the model Lorenz attractor, sketch a curve that follows the dynamics of Φ (so it should approximate a real solution) and is knotted. (You might want to use some string for this!)
11. Use a computer to investigate the behavior of the Lorenz system as r increases from 1 to 28 (with σ = 10 and b = 8/3). Describe in qualitative terms any bifurcations you observe.


15 Discrete Dynamical Systems

Our goal in this chapter is to begin the study of discrete dynamical systems. As we have seen at several stages in this book, it is sometimes possible to reduce the study of the flow of a differential equation to that of an iterated function, namely, a Poincaré map. This reduction has several advantages. First and foremost, the Poincaré map lives on a lower dimensional space, which therefore makes visualization easier. Second, we do not have to integrate to find "solutions" of discrete systems. Rather, given the function, we simply iterate the function over and over to determine the behavior of the orbit, which then dictates the behavior of the corresponding solution.

Given these two simplifications, it then becomes much easier to comprehend the complicated chaotic behavior that often arises for systems of differential equations. While the study of discrete dynamical systems is a topic that could easily fill this entire book, we will restrict attention here primarily to the portion of this theory that helps us understand chaotic behavior in one dimension. In the following chapter we will extend these ideas to higher dimensions.

15.1 Introduction to Discrete Dynamical Systems

Throughout this chapter we will work with real functions f : R → R. As usual, we assume throughout that f is C^∞, although there will be several special examples where this is not the case.


Let fⁿ denote the nth iterate of f; that is, fⁿ is the n-fold composition of f with itself. Given x₀ ∈ R, the orbit of x₀ is the sequence

x₀, x₁ = f(x₀), x₂ = f²(x₀), ..., xₙ = fⁿ(x₀), ....

The point x₀ is called the seed of the orbit.

Example. Let f(x) = x² + 1. Then the orbit of the seed 0 is the sequence

x₀ = 0
x₁ = 1
x₂ = 2
x₃ = 5
x₄ = 26
...
xₙ = big
xₙ₊₁ = bigger

and so forth, so we see that this orbit tends to ∞ as n → ∞.

In analogy with equilibrium solutions of systems of differential equations, fixed points play a central role in discrete dynamical systems. A point x₀ is called a fixed point if f(x₀) = x₀. Obviously, the orbit of a fixed point is the constant sequence x₀, x₀, x₀, ....

The analog of closed orbits for differential equations is given by periodic points of period n. These are seeds x₀ for which fⁿ(x₀) = x₀ for some n > 0. As a consequence, like a closed orbit, a periodic orbit repeats itself:

x₀, x₁, ..., xₙ₋₁, x₀, x₁, ..., xₙ₋₁, x₀, ....

Periodic orbits of period n are also called n-cycles. We say that the periodic point x₀ has minimal period n if n is the least positive integer for which fⁿ(x₀) = x₀.

Example. The function f(x) = x³ has fixed points at x = 0, ±1. The function g(x) = −x³ has a fixed point at 0 and a periodic point of period 2 at ±1, since g(1) = −1 and g(−1) = 1.
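Computing orbits is a one-line loop. The following plain Python sketch reproduces the examples above:

    # Compute the first few points on an orbit.
    def orbit(f, seed, n):
        xs = [seed]
        for _ in range(n):
            xs.append(f(xs[-1]))
        return xs

    print(orbit(lambda x: x**2 + 1, 0, 5))   # [0, 1, 2, 5, 26, 677]: tends to infinity
    print(orbit(lambda x: -x**3, 1, 4))      # [1, -1, 1, -1, 1]: a 2-cycle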


A fixed point x₀ is a sink or an attracting fixed point for f if there is a neighborhood U of x₀ in R such that, for all y₀ ∈ U, fⁿ(y₀) ∈ U for all n and, moreover, fⁿ(y₀) → x₀ as n → ∞. Similarly, x₀ is a source or a repelling fixed point if all orbits (except x₀) leave U under iteration of f. A fixed point is called neutral or indifferent if it is neither attracting nor repelling.

For differential equations, we saw that it was the derivative of the vector field at an equilibrium point that determined the type of the equilibrium point. This is also true for fixed points, although the numbers change a bit.

Proposition. Suppose f has a fixed point at x₀. Then
1. x₀ is a sink if |f′(x₀)| < 1;
2. x₀ is a source if |f′(x₀)| > 1;
3. we get no information about the type of x₀ if f′(x₀) = ±1.

Proof: We first prove case (1). Suppose |f′(x₀)| = ν < 1. Choose K with ν < K < 1. Since f′ is continuous, there is δ > 0 such that |f′(x)| < K for all x in the interval U = [x₀ − δ, x₀ + δ]. By the mean value theorem, for x ∈ U,

|f(x) − x₀| = |f(x) − f(x₀)| ≤ K|x − x₀| < δ,

so f(x) lies in U and, inductively, |fⁿ(x) − x₀| ≤ Kⁿ|x − x₀|. Hence fⁿ(x) → x₀, which proves (1). The proof of (2) is similar. For (3), consider the functions f(x) = x + x³, g(x) = x − x³, and h(x) = x + x², each of which has a fixed point at 0 with derivative equal to 1; graphical iteration shows that 0 is a source for f, a sink for g, and neither for h (Figure 15.2).


Figure 15.2 In each case the derivative at 0 is 1, but f has a source at 0, g has a sink, and h has neither.

Note that, at a fixed point x₀ for which f′(x₀) < 0, the orbits of nearby points jump from one side of the fixed point to the other at each iteration. See Figure 15.3. This is the reason why the output of graphical iteration is often called a web diagram.

Since a periodic point x₀ of period n for f is a fixed point of fⁿ, we may classify these points as sinks or sources depending on whether |(fⁿ)′(x₀)| < 1 or |(fⁿ)′(x₀)| > 1. One may check that (fⁿ)′(x₀) = (fⁿ)′(x_j) for any other point x_j on the periodic orbit, so this definition makes sense (see Exercise 6 at the end of this chapter).

Example. The function f(x) = x² − 1 has a 2-cycle given by 0 and −1. One checks easily that (f²)′(0) = 0 = (f²)′(−1), so this cycle is a sink. In Figure 15.4, we show a graphical iteration of f with the graph of f² superimposed. Note that 0 and −1 are attracting fixed points for f².

Figure 15.3 Since −1 < f′(x₀) < 0, orbits near x₀ jump from one side of the fixed point to the other as they approach it.
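Graphical iteration is easy to automate. The sketch below (NumPy and Matplotlib assumed) draws the web diagram for any function; applied to f(x) = x² − 1, it shows nearby orbits settling onto the 2-cycle of the example above:

    # Draw a web diagram (graphical iteration).
    import numpy as np
    import matplotlib.pyplot as plt

    def cobweb(f, x0, n, lo=-2.0, hi=2.0):
        xs = np.linspace(lo, hi, 400)
        plt.plot(xs, f(xs), "k")             # graph of f
        plt.plot(xs, xs, "k--")              # diagonal y = x
        x = x0
        for _ in range(n):
            y = f(x)
            plt.plot([x, x, y], [x, y, y], lw=0.8)   # vertical, then horizontal
            x = y
        plt.show()

    cobweb(lambda x: x**2 - 1, 0.3, 25)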


Figure 15.4 The graphs of f(x) = x² − 1 and f², showing that 0 and −1 lie on an attracting 2-cycle for f.

15.2 Bifurcations

Discrete dynamical systems undergo bifurcations when parameters are varied, just as differential equations do. We deal in this section with several types of bifurcations that occur for one-dimensional systems.

Example. Let f_c(x) = x² + c, where c is a parameter. The fixed points for this family are given by solving the equation x² + c = x, which yields

p± = 1/2 ± (1/2)√(1 − 4c).

Hence there are no fixed points if c > 1/4; a single fixed point at x = 1/2 when c = 1/4; and a pair of fixed points at p± when c < 1/4. When c = 1/4, the fixed point at x = 1/2 is neutral, as is easily seen by graphical iteration. See Figure 15.5. When c < 1/4, we have f_c′(p+) = 1 + √(1 − 4c) > 1, so p+ is always repelling. A straightforward computation also shows that −1 < f_c′(p−) < 1 provided −3/4 < c < 1/4, so p− is attracting in this range. Thus a pair of fixed points, one repelling and one attracting, is born as c decreases through 1/4. This is an example of a saddle-node bifurcation.


Figure 15.5 The saddle-node bifurcation for f_c(x) = x² + c at c = 1/4.

Note that, in this example, at the bifurcation point, the derivative at the fixed point equals 1. This is no accident, for we have:

Theorem. (The Bifurcation Criterion) Let f_λ be a family of functions depending smoothly on the parameter λ. Suppose that f_{λ₀}(x₀) = x₀ and f_{λ₀}′(x₀) ≠ 1. Then there are intervals I about x₀ and J about λ₀, and a smooth function p : J → I, such that p(λ₀) = x₀ and f_λ(p(λ)) = p(λ). Moreover, f_λ has no other fixed points in I.

Proof: Consider the function defined by G(x, λ) = f_λ(x) − x. By hypothesis, G(x₀, λ₀) = 0 and

∂G/∂x (x₀, λ₀) = f_{λ₀}′(x₀) − 1 ≠ 0.

By the implicit function theorem, there are intervals I about x₀ and J about λ₀, and a smooth function p : J → I, such that p(λ₀) = x₀ and G(p(λ), λ) ≡ 0 for all λ ∈ J. Moreover, G(x, λ) ≠ 0 unless x = p(λ). This concludes the proof.

As a consequence of this result, f_λ may undergo a bifurcation involving a change in the number of fixed points only if f_λ has a fixed point with derivative equal to 1. The typical bifurcation that occurs at such parameter values is the saddle-node bifurcation (see Exercises 18 and 19). However, many other types of bifurcations of fixed points may occur.

Example. Let f_λ(x) = λx(1 − x). Note that f_λ(0) = 0 for all λ. We have f_λ′(0) = λ, so we have a possible bifurcation at λ = 1. There is a second fixed point for f_λ at x_λ = (λ − 1)/λ. When λ < 1, x_λ is negative, and when λ > 1, x_λ is positive. When λ = 1, x_λ coalesces with the fixed point at 0, so there is a single fixed point for f₁.


A computation shows that 0 is repelling and x_λ is attracting if λ > 1 (and λ < 3), while the reverse is true if λ < 1. For this reason, this type of bifurcation is known as an exchange bifurcation.

Example. Consider the family of functions f_μ(x) = μx + x³. When μ = 1 we have f₁(0) = 0 and f₁′(0) = 1, so we have the possibility for a bifurcation. The fixed points are 0 and ±√(1 − μ), so we have three fixed points when μ < 1 but only one fixed point when μ ≥ 1; a bifurcation does indeed occur as μ passes through 1.

The only other possible bifurcation value for a one-dimensional discrete system occurs when the derivative at the fixed (or periodic) point is equal to −1, since at these values the fixed point may change from a sink to a source or from a source to a sink. At all other values of the derivative, the fixed point simply remains a sink or source, and there are no other periodic orbits nearby. Certain portions of a periodic orbit may come close to a source, but the entire orbit cannot lie close by (see Exercise 7). In the case of derivative −1 at the fixed point, the typical bifurcation is a period doubling bifurcation.

Example. As a simple example of this type of bifurcation, consider the family f_λ(x) = λx near λ₀ = −1. There is a fixed point at 0 for all λ. When −1 < λ < 1, 0 is an attracting fixed point. When |λ| > 1, 0 is repelling and all nonzero orbits tend to ±∞. When λ = −1, 0 is a neutral fixed point and all nonzero points lie on 2-cycles. As λ passes through −1, the type of the fixed point changes from attracting to repelling; meanwhile, a family of 2-cycles appears.

Generally, when a period doubling bifurcation occurs, the 2-cycles do not all exist for a single parameter value. A more typical example of this bifurcation is provided next.

Example. Again consider f_c(x) = x² + c, this time with c near c = −3/4. There is a fixed point at

p− = 1/2 − (1/2)√(1 − 4c).

We have seen that f_{−3/4}′(p−) = −1 and that p− is attracting when c is slightly larger than −3/4 and repelling when c is less than −3/4. Graphical iteration shows that more happens as c descends through −3/4: We see the birth of an (attracting) 2-cycle as well. This is the period doubling bifurcation. See Figure 15.6. Indeed, one can easily solve for the period-two points and check that they are attracting (for −5/4 < c < −3/4).
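The whole story of this family is summarized by its orbit diagram: for each c, iterate away a transient and plot the points the orbit visits afterward. The sketch below (NumPy and Matplotlib assumed; the seed 0 and the iteration counts are arbitrary choices) shows the attracting fixed point giving way to the 2-cycle at c = −3/4:

    # Orbit diagram for Q_c(x) = x^2 + c.
    import numpy as np
    import matplotlib.pyplot as plt

    for c in np.linspace(-1.3, 0.25, 600):
        x = 0.0
        for _ in range(300):          # discard the transient
            x = x * x + c
        tail = []
        for _ in range(32):           # record the eventual behavior
            x = x * x + c
            tail.append(x)
        plt.plot([c] * len(tail), tail, "k.", markersize=0.5)
    plt.xlabel("c"); plt.ylabel("x")
    plt.show()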


Figure 15.6 The period doubling bifurcation for f_c(x) = x² + c at c = −3/4. The fixed point is attracting for c ≥ −0.75 and repelling for c < −0.75.

15.3 The Discrete Logistic Model

In Chapter 1 we introduced one of the simplest nonlinear first-order differential equations, the logistic model for population growth,

x′ = ax(1 − x).

In this model we took into account the fact that there is a carrying capacity for a typical population, and we saw that the resulting solutions behaved quite simply: All nonzero solutions tended to the "ideal" population. Now, something about this model may have bothered you way back then: Populations generally are not continuous functions of time! A more natural type of model would measure populations at specific times, say, every year or every generation. Here we introduce just such a model, the discrete logistic model for population growth.

Suppose we consider a population whose members are counted each year (or at other specified times). Let xₙ denote the population at the end of year n. If we assume that no overcrowding can occur, then one such population model is the exponential growth model, where we assume that

xₙ₊₁ = k xₙ

for some constant k > 0. That is, the next year's population is directly proportional to this year's. Thus we have

x₁ = kx₀
x₂ = kx₁ = k²x₀
x₃ = kx₂ = k³x₀
...


Clearly, xₙ = kⁿx₀, so we conclude that the population explodes if k > 1, becomes extinct if 0 ≤ k < 1, and remains constant if k = 1. This model ignores overcrowding. To take the carrying capacity into account, we again scale the population so that the ideal population is 1 and assume that next year's population is reduced in proportion to the current population's distance below this ideal. The resulting discrete logistic population model is

xₙ₊₁ = λxₙ(1 − xₙ),

where λ > 0 is a parameter. We may therefore predict the fate of the initial population x₀ by simply iterating the quadratic function f_λ(x) = λx(1 − x) (also called the logistic map). Sounds easy, right? Well, suffice it to say that this simple quadratic iteration was only completely understood in the late 1990s, thanks to the work of hundreds of mathematicians. We will see why the discrete logistic model is so much more complicated than its cousin, the logistic differential equation, in a moment, but first let's do some simple cases.

We consider only the logistic map on the unit interval I. We have f_λ(0) = 0, so 0 is a fixed point. The fixed point is attracting in I for 0 < λ ≤ 1, and repelling thereafter. The point 1 is eventually fixed, since f_λ(1) = 0. There is a second fixed point, x_λ = (λ − 1)/λ, in I for λ > 1. The fixed point x_λ is attracting for 1 < λ ≤ 3 and repelling for λ > 3. At λ = 3 a period doubling bifurcation occurs (see Exercise 4). For λ-values between 3 and approximately 3.4, the only periodic points present are the two fixed points and the 2-cycle.

When λ = 4, the situation is much more complicated. Note that f_λ′(1/2) = 0 and that 1/2 is the only critical point for f_λ for each λ. When λ = 4, we have f₄(1/2) = 1, so f₄²(1/2) = 0. Therefore f₄ maps each of the half-intervals [0, 1/2] and [1/2, 1] onto the entire interval I. Consequently, there exist points y₀ ∈ [0, 1/2] and y₁ ∈ [1/2, 1] such that f₄(y_j) = 1/2 and hence f₄²(y_j) = 1.


Figure 15.7 The graphs of the logistic function f_λ(x) = λx(1 − x) as well as f_λ² and f_λ³ over the interval I.

Therefore we have

f₄²[0, y₀] = f₄²[y₀, 1/2] = I

and

f₄²[1/2, y₁] = f₄²[y₁, 1] = I.

Since the function f₄² is a quartic, it follows that the graph of f₄² is as depicted in Figure 15.7. Continuing in this fashion, we find 2³ subintervals of I that are mapped onto I by f₄³, 2⁴ subintervals mapped onto I by f₄⁴, and so forth. We therefore see that f₄ has two fixed points in I; f₄² has four fixed points in I; f₄³ has 2³ fixed points in I; and, inductively, f₄ⁿ has 2ⁿ fixed points in I. The fixed points for f₄ occur at 0 and 3/4. The four fixed points for f₄² include these two fixed points plus a pair of periodic points of period 2. Of the eight fixed points for f₄³, two must be the fixed points and the other six must lie on a pair of 3-cycles. Among the 16 fixed points for f₄⁴ are two fixed points, two periodic points of period 2, and twelve periodic points of period 4. Clearly, a lot has changed as λ varies from 3.4 to 4.

On the other hand, if we choose a random seed in the interval I and plot the orbit of this seed under iteration of f₄ using graphical iteration, we rarely see any of these cycles. In Figure 15.8 we have plotted the orbit of 0.123 under iteration of f₄ using 200 and 500 iterations. Presumably, there is something "chaotic" going on.
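The "chaos" is easy to provoke. In the plain Python sketch below, two seeds that agree to seven decimal places generate orbits of f₄ that are indistinguishable for a while and then completely unrelated:

    # Two nearby orbits of f4(x) = 4x(1 - x).
    f4 = lambda x: 4.0 * x * (1.0 - x)

    x, y = 0.123, 0.123 + 1e-7
    for n in range(51):
        if n % 10 == 0:
            print(f"n = {n:2d}   x = {x:.6f}   y = {y:.6f}")
        x, y = f4(x), f4(y)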


Figure 15.8 The orbit of the seed 0.123 under f₄ using (a) 200 iterations and (b) 500 iterations.

15.4 Chaos

In this section we introduce several quintessential examples of chaotic one-dimensional discrete dynamical systems. Recall that a subset U ⊂ W is said to be dense in W if there are points in U arbitrarily close to any point in the larger set W. As in the Lorenz model, we say that a map f, which takes an interval I = [α, β] to itself, is chaotic if

1. periodic points of f are dense in I;
2. f is transitive on I; that is, given any two subintervals U₁ and U₂ in I, there is a point x₀ ∈ U₁ and an n > 0 such that fⁿ(x₀) ∈ U₂;
3. f has sensitive dependence in I; that is, there is a sensitivity constant β such that, for any x₀ ∈ I and any open interval U about x₀, there is some seed y₀ ∈ U and n > 0 such that

|fⁿ(x₀) − fⁿ(y₀)| > β.

It is known that the transitivity condition is equivalent to the existence of an orbit that is dense in I. Clearly, a dense orbit implies transitivity, for such an orbit repeatedly visits any open subinterval in I. The other direction relies on the Baire category theorem from analysis, so we will not prove this here.

Curiously, for maps of an interval, condition 3 in the definition of chaos is redundant [8]. This is somewhat surprising, since the first two conditions in the definition are topological in nature, while the third is a metric property (it depends on the notion of distance).

We now discuss several classical examples of chaotic one-dimensional maps.

Example. (The Doubling Map) Define the discontinuous function D : [0, 1) → [0, 1) by D(x) = 2x mod 1. That is,

D(x) = 2x       if 0 ≤ x < 1/2
       2x − 1   if 1/2 ≤ x < 1.


Figure 15.9 The graph of the doubling map D and its higher iterates D² and D³ on [0, 1).

An easy computation shows that Dⁿ(x) = 2ⁿx mod 1, so that the graph of Dⁿ consists of 2ⁿ straight lines with slope 2ⁿ, each extending over the entire interval [0, 1). See Figure 15.9.

To see that the doubling function is chaotic on [0, 1), note that Dⁿ maps any interval of the form [k/2ⁿ, (k + 1)/2ⁿ), for k = 0, 1, ..., 2ⁿ − 1, onto the interval [0, 1). Hence the graph of Dⁿ crosses the diagonal y = x at some point in this interval, and so there is a periodic point in any such interval. Since the lengths of these intervals are 1/2ⁿ, it follows that periodic points are dense in [0, 1). Transitivity also follows since, given any open interval J, we may always find an interval of the form [k/2ⁿ, (k + 1)/2ⁿ) inside J for sufficiently large n. Hence Dⁿ maps J onto all of [0, 1). This also proves sensitivity, where we choose the sensitivity constant 1/2.

We remark that it is possible to write down all of the periodic points for D explicitly (see Exercise 5a). It is also interesting to note that, if you use a computer to iterate the doubling function, then it appears that all orbits are eventually fixed at 0, which, of course, is false! See Exercise 5c for an explanation of this phenomenon.
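The phenomenon in the last remark is easy to witness. In the sketch below (plain Python; the seed 1/7 is an arbitrary rational with odd denominator), exact rational arithmetic shows a genuine 3-cycle of D, while floating point arithmetic drives the same orbit to 0, since D shifts binary digits to the left and every float has only finitely many of them:

    # The doubling map in exact arithmetic versus floating point.
    from fractions import Fraction

    def orbit(x, n):
        xs = [x]
        for _ in range(n):
            xs.append((2 * xs[-1]) % 1)
        return xs

    print([str(v) for v in orbit(Fraction(1, 7), 6)])
    # ['1/7', '2/7', '4/7', '1/7', '2/7', '4/7', '1/7']: a genuine 3-cycle

    print(orbit(1 / 7, 60)[-3:])      # [0.0, 0.0, 0.0]: the float orbit collapses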


Example. (The Tent Map) Now consider a continuous cousin of the doubling map given by
$$T(x) = \begin{cases} 2x & \text{if } 0 \le x \le 1/2 \\ 2 - 2x & \text{if } 1/2 \le x \le 1. \end{cases}$$

Figure 15.10 The graph of the tent map $T$ and its higher iterates $T^2$ and $T^3$ on $[0,1]$.

Suppose $I$ and $J$ are intervals and $f\colon I \to I$ and $g\colon J \to J$. We say that $f$ and $g$ are conjugate if there is a homeomorphism $h\colon I \to J$ such that $h$ satisfies the conjugacy equation $h \circ f = g \circ h$. Just as in the case of flows, a conjugacy takes orbits of $f$ to orbits of $g$. This follows since we have $h(f^n(x)) = g^n(h(x))$ for all $x \in I$, so $h$ takes the $n$th point on the orbit of $x$ under $f$ to the $n$th point on the orbit of $h(x)$ under $g$. Similarly, $h^{-1}$ takes orbits of $g$ to orbits of $f$.

Example. Consider the logistic function $f_4\colon [0,1] \to [0,1]$ and the quadratic function $g\colon [-2,2] \to [-2,2]$ given by $g(x) = x^2 - 2$. Let $h(x) = -4x + 2$ and note that $h$ takes $[0,1]$ to $[-2,2]$. Moreover, we have $h(4x(1-x)) = (h(x))^2 - 2$, so $h$ satisfies the conjugacy equation and $f_4$ and $g$ are conjugate.

From the point of view of chaotic systems, conjugacies are important since they map one chaotic system to another.
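The conjugacy equation in this example can be confirmed numerically; here is a minimal sketch (ours, with the sample points chosen arbitrarily):

```python
# Check h(f_4(x)) = g(h(x)) for h(x) = -4x + 2 and g(y) = y^2 - 2.

f4 = lambda x: 4 * x * (1 - x)
g  = lambda y: y * y - 2
h  = lambda x: -4 * x + 2

for x in [0.0, 0.1, 0.25, 0.5, 0.7, 1.0]:
    lhs, rhs = h(f4(x)), g(h(x))
    assert abs(lhs - rhs) < 1e-12, (x, lhs, rhs)
print("h o f4 = g o h verified at the sample points")
```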


Proposition. Suppose $f\colon I \to I$ and $g\colon J \to J$ are conjugate via $h$, where both $I$ and $J$ are closed intervals in $\mathbb{R}$ of finite length. If $f$ is chaotic on $I$, then $g$ is chaotic on $J$.

Proof: Let $U$ be an open subinterval of $J$ and consider $h^{-1}(U) \subset I$. Since periodic points of $f$ are dense in $I$, there is a periodic point $x \in h^{-1}(U)$ for $f$. Say $x$ has period $n$. Then
$$g^n(h(x)) = h(f^n(x)) = h(x)$$
by the conjugacy equation. This gives a periodic point $h(x)$ for $g$ in $U$ and shows that periodic points of $g$ are dense in $J$.

If $U$ and $V$ are open subintervals of $J$, then $h^{-1}(U)$ and $h^{-1}(V)$ are open intervals in $I$. By transitivity of $f$, there exists $x_1 \in h^{-1}(U)$ such that $f^m(x_1) \in h^{-1}(V)$ for some $m$. But then $h(x_1) \in U$ and we have $g^m(h(x_1)) = h(f^m(x_1)) \in V$, so $g$ is transitive also.

For sensitivity, suppose that $f$ has sensitivity constant $\beta$. Let $I = [\alpha_0, \alpha_1]$. We may assume that $\beta < \alpha_1 - \alpha_0$. For any $x \in [\alpha_0, \alpha_1 - \beta]$, consider the function $|h(x + \beta) - h(x)|$. This is a continuous function on $[\alpha_0, \alpha_1 - \beta]$ that is positive. Hence it has a minimum value $\beta'$. It follows that $h$ takes intervals of length $\beta$ in $I$ to intervals of length at least $\beta'$ in $J$. Then it is easy to check that $\beta'$ is a sensitivity constant for $g$. This completes the proof.

It is not always possible to find conjugacies between functions with equivalent dynamics. However, we can relax the requirement that the conjugacy be one to one and still salvage the preceding proposition. A continuous function $h$ that is at most $n$ to one and that satisfies the conjugacy equation $f \circ h = h \circ g$ is called a semiconjugacy between $g$ and $f$. It is easy to check that a semiconjugacy also preserves chaotic behavior on intervals of finite length (see Exercise 12). A semiconjugacy need not preserve the minimal periods of cycles, but it does map cycles to cycles.

Example. The tent function $T$ and the logistic function $f_4$ are semiconjugate on the unit interval. To see this, let
$$h(x) = \frac{1}{2}\left(1 - \cos 2\pi x\right).$$
Then $h$ maps the interval $[0,1]$ in two-to-one fashion over itself, except at 1/2, which is the only point mapped to 1. Then we compute
$$h(T(x)) = \frac{1}{2}(1 - \cos 4\pi x) = \frac{1}{2} - \frac{1}{2}\left(2\cos^2 2\pi x - 1\right) = 1 - \cos^2 2\pi x = 4\left(\frac{1}{2} - \frac{1}{2}\cos 2\pi x\right)\left(\frac{1}{2} + \frac{1}{2}\cos 2\pi x\right) = f_4(h(x)).$$
Thus $h$ is a semiconjugacy between $T$ and $f_4$. As a remark, recall that we may find arbitrarily small subintervals mapped onto all of $[0,1]$ by $T$. Hence $f_4$ maps the images of these intervals under $h$ onto all of $[0,1]$. Since $h$ is continuous, the images of these intervals may be chosen arbitrarily small.


Hence we may choose 1/2 as a sensitivity constant for $f_4$ as well. We have proved the following proposition:

Proposition. The logistic function $f_4(x) = 4x(1-x)$ is chaotic on the unit interval.

15.5 Symbolic Dynamics

We turn now to one of the most useful tools for analyzing chaotic systems, symbolic dynamics. We give just one example of how to use symbolic dynamics here; several more are included in the next chapter.

Consider the logistic map $f_\lambda(x) = \lambda x(1-x)$ where $\lambda > 4$. Graphical iteration seems to imply that almost all orbits tend to $-\infty$. See Figure 15.11.

Figure 15.11 Typical orbits for the logistic function $f_\lambda$ with $\lambda > 4$ seem to tend to $-\infty$ after wandering around the unit interval for a while.

Of course, this is not true, because we have fixed points and other periodic points for this function. In fact, there is an unexpectedly "large" set called a Cantor set that is filled with chaotic behavior for this function, as we shall see.

Unlike the case $\lambda \le 4$, the interval $I = [0,1]$ is no longer invariant when $\lambda > 4$: Certain orbits escape from $I$ and then tend to $-\infty$. Our goal is to understand the behavior of the nonescaping orbits. Let $\Lambda$ denote the set of points in $I$ whose orbits never leave $I$. As shown in Figure 15.12a, there is an open interval $A_0$ on which $f_\lambda > 1$. Hence $f_\lambda^2(x) < 0$ for any $x \in A_0$ and, as a consequence, the orbits of all points in $A_0$ tend to $-\infty$. Note that any orbit that leaves $I$ must first enter $A_0$ before departing toward $-\infty$.


Figure 15.12 (a) The exit set in $I$ consists of a collection of disjoint open intervals. (b) The intervals $I_0$ and $I_1$ lie to the left and right of $A_0$.

Also, the orbits of the endpoints of $A_0$ are eventually fixed at 0, so these endpoints are contained in $\Lambda$. Now let $A_1$ denote the preimage of $A_0$ in $I$: $A_1$ consists of two open intervals in $I$, one on each side of $A_0$. All points in $A_1$ are mapped into $A_0$ by $f_\lambda$, and hence their orbits also tend to $-\infty$. Again, the endpoints of $A_1$ are eventual fixed points. Continuing, we see that each of the two open intervals in $A_1$ has as a preimage a pair of disjoint intervals, so there are four open intervals that consist of points whose first iterate lies in $A_1$ and whose second iterate lies in $A_0$; again, all of these points have orbits that tend to $-\infty$. Call these four intervals $A_2$. In general, let $A_n$ denote the set of points in $I$ whose $n$th iterate lies in $A_0$. $A_n$ consists of $2^n$ disjoint open intervals in $I$. Any point whose orbit leaves $I$ must lie in one of the $A_n$. Hence we see that
$$\Lambda = I - \bigcup_{n=0}^{\infty} A_n.$$

To understand the dynamics of $f_\lambda$ on $\Lambda$, we introduce symbolic dynamics. Toward that end, let $I_0$ and $I_1$ denote the left and right closed intervals, respectively, in $I - A_0$. See Figure 15.12b. Given $x_0 \in \Lambda$, the entire orbit of $x_0$ lies in $I_0 \cup I_1$. Hence we may associate an infinite sequence $S(x_0) = (s_0 s_1 s_2 \ldots)$ consisting of 0's and 1's to the point $x_0$ via the rule
$$s_j = k \quad\text{if and only if}\quad f_\lambda^j(x_0) \in I_k.$$
That is, we simply watch how $f_\lambda^j(x_0)$ bounces around $I_0$ and $I_1$, assigning a 0 or 1 at the $j$th stage depending on which interval $f_\lambda^j(x_0)$ lies in. The sequence $S(x_0)$ is called the itinerary of $x_0$.
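The itinerary rule is easy to transcribe. The sketch below (ours; the value $\lambda = 4.5$ and the seeds are arbitrary choices) records 0 or 1 according to the side of 1/2 on which each iterate lands. Since $\Lambda$ is very thin, a typical double-precision seed eventually escapes from $I$, so only finitely many itinerary entries are produced:

```python
lam = 4.5

def f(x):
    return lam * x * (1 - x)

def itinerary(x0, n):
    s, x = [], x0
    for _ in range(n):
        if not 0.0 <= x <= 1.0:
            return s, "escaped"          # orbit has left I and heads to -infinity
        s.append(0 if x < 0.5 else 1)    # which of I_0, I_1 the iterate is in
        x = f(x)
    return s, "still in I"

print(itinerary(0.123, 20))
print(itinerary(0.5, 20))    # the midpoint lies in A_0 and escapes at once
```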


Example. The fixed point 0 has itinerary $S(0) = (000\ldots)$. The fixed point $x_\lambda$ in $I_1$ has itinerary $S(x_\lambda) = (111\ldots)$. The point $x_0 = 1$ is eventually fixed and has itinerary $S(1) = (1000\ldots)$. A 2-cycle that hops back and forth between $I_0$ and $I_1$ has itinerary $(\overline{01})$ or $(\overline{10})$, where $\overline{01}$ denotes the infinitely repeating sequence consisting of repeated blocks 01.

Let $\Sigma$ denote the set of all possible sequences of 0's and 1's. A "point" in the space $\Sigma$ is therefore an infinite sequence of the form $s = (s_0 s_1 s_2 \ldots)$. To visualize $\Sigma$, we need to tell how far apart different points in $\Sigma$ are. To do this, let $s = (s_0 s_1 s_2 \ldots)$ and $t = (t_0 t_1 t_2 \ldots)$ be points in $\Sigma$. A distance function or metric on $\Sigma$ is a function $d = d(s,t)$ that satisfies

1. $d(s,t) \ge 0$, and $d(s,t) = 0$ if and only if $s = t$;
2. $d(s,t) = d(t,s)$;
3. the triangle inequality: $d(s,u) \le d(s,t) + d(t,u)$.

Since $\Sigma$ is not naturally a subset of a Euclidean space, we do not have a Euclidean distance to use on $\Sigma$. Hence we must concoct one of our own. Here is the distance function we choose:
$$d(s,t) = \sum_{i=0}^{\infty} \frac{|s_i - t_i|}{2^i}.$$
Note that this infinite series converges: The numerators in this series are always either 0 or 1, so this series converges by comparison to the geometric series:
$$d(s,t) \le \sum_{i=0}^{\infty} \frac{1}{2^i} = \frac{1}{1 - 1/2} = 2.$$
It is straightforward to check that this choice of $d$ satisfies the three requirements to be a distance function (see Exercise 13). While this distance function may look a little complicated at first, it is often easy to compute.

Example.
$$\text{(1)}\quad d\big((\overline{0}), (\overline{1})\big) = \sum_{i=0}^{\infty} \frac{|0-1|}{2^i} = \sum_{i=0}^{\infty} \frac{1}{2^i} = 2.$$
$$\text{(2)}\quad d\big((\overline{01}), (\overline{10})\big) = \sum_{i=0}^{\infty} \frac{1}{2^i} = 2,$$
since these sequences differ in every slot.
$$\text{(3)}\quad d\big((\overline{01}), (\overline{1})\big) = \sum_{i=0}^{\infty} \frac{1}{4^i} = \frac{1}{1 - 1/4} = \frac{4}{3},$$
since these sequences differ exactly in the even slots.
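The metric is easy to approximate by truncating the series; a short sketch (ours; the truncation level is arbitrary) reproduces the three computations above:

```python
# Truncated version of d(s, t); the neglected tail contributes at most 2/2^N.

def d(s, t, N=60):
    return sum(abs(s(i) - t(i)) / 2**i for i in range(N))

zero = lambda i: 0            # the sequence (000...)
one  = lambda i: 1            # the sequence (111...)
s01  = lambda i: i % 2        # the sequence (010101...)
s10  = lambda i: (i + 1) % 2  # the sequence (101010...)

print(d(zero, one))   # 2.0, as in example (1)
print(d(s01, s10))    # 2.0, as in example (2)
print(d(s01, one))    # 1.333..., i.e. 4/3, as in example (3)
```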


The importance of having a distance function on $\Sigma$ is that we now know when points are close together or far apart. In particular, we have:

Proposition. Suppose $s = (s_0 s_1 s_2 \ldots)$ and $t = (t_0 t_1 t_2 \ldots) \in \Sigma$.

1. If $s_j = t_j$ for $j = 0, \ldots, n$, then $d(s,t) \le 1/2^n$;
2. Conversely, if $d(s,t) < 1/2^n$, then $s_j = t_j$ for $j = 0, \ldots, n$.

Proof: In case (1), we have
$$d(s,t) = \sum_{i=0}^{n} \frac{|s_i - t_i|}{2^i} + \sum_{i=n+1}^{\infty} \frac{|s_i - t_i|}{2^i} \le 0 + \frac{1}{2^{n+1}} \sum_{i=0}^{\infty} \frac{1}{2^i} = \frac{1}{2^n}.$$
If, on the other hand, $d(s,t) < 1/2^n$, then we must have $s_j = t_j$ for any $j \le n$, because otherwise $d(s,t) \ge |s_j - t_j|/2^j = 1/2^j \ge 1/2^n$.

Now that we have a notion of closeness in $\Sigma$, we are ready to prove the main theorem of this chapter:

Theorem. The itinerary function $S\colon \Lambda \to \Sigma$ is a homeomorphism provided $\lambda > 4$.

Proof: Actually, we will only prove this for the case in which $\lambda$ is sufficiently large that $|f_\lambda'(x)| > K > 1$ for some $K$ and for all $x \in I_0 \cup I_1$. The reader may check that $\lambda > 2 + \sqrt{5}$ suffices for this. For the more complicated proof in the case where $4 < \lambda \le 2 + \sqrt{5}$, see [25].

We first show that $S$ is one to one. Let $x, y \in \Lambda$ and suppose $S(x) = S(y)$. Then, for each $n$, $f_\lambda^n(x)$ and $f_\lambda^n(y)$ lie on the same side of 1/2. This implies that $f_\lambda$ is monotone on the interval between $f_\lambda^n(x)$ and $f_\lambda^n(y)$. Consequently, all points in this interval remain in $I_0 \cup I_1$ when we apply $f_\lambda$. Now $|f_\lambda'| > K > 1$ at all points in this interval, so, as in Section 15.1, each iteration of $f_\lambda$ expands this interval by a factor of $K$. Hence the distance between $f_\lambda^n(x)$ and $f_\lambda^n(y)$ grows without bound, so these two points must eventually lie on opposite sides of $A_0$. This contradicts the fact that they have the same itinerary.

To see that $S$ is onto, we first introduce the following notation. Let $J \subset I$ be a closed interval, and let
$$f_\lambda^{-n}(J) = \{x \in I \mid f_\lambda^n(x) \in J\}$$
denote the preimage of $J$ under $f_\lambda^n$.


A glance at the graph of $f_\lambda$ when $\lambda > 4$ shows that, if $J \subset I$ is a closed interval, then $f_\lambda^{-1}(J)$ consists of two closed subintervals, one in $I_0$ and one in $I_1$.

Now let $s = (s_0 s_1 s_2 \ldots)$. We must produce $x \in \Lambda$ with $S(x) = s$. To that end we define
$$I_{s_0 s_1 \ldots s_n} = \{x \in I \mid x \in I_{s_0},\ f_\lambda(x) \in I_{s_1},\ \ldots,\ f_\lambda^n(x) \in I_{s_n}\} = I_{s_0} \cap f_\lambda^{-1}(I_{s_1}) \cap \cdots \cap f_\lambda^{-n}(I_{s_n}).$$
We claim that the $I_{s_0 \ldots s_n}$ form a nested sequence of nonempty closed intervals. Note that
$$I_{s_0 s_1 \ldots s_n} = I_{s_0} \cap f_\lambda^{-1}(I_{s_1 \ldots s_n}).$$
By induction, we may assume that $I_{s_1 \ldots s_n}$ is a nonempty subinterval, so that, by the observation above, $f_\lambda^{-1}(I_{s_1 \ldots s_n})$ consists of two closed intervals, one in $I_0$ and one in $I_1$. Hence $I_{s_0} \cap f_\lambda^{-1}(I_{s_1 \ldots s_n})$ is a single closed interval. These intervals are nested because
$$I_{s_0 \ldots s_n} = I_{s_0 \ldots s_{n-1}} \cap f_\lambda^{-n}(I_{s_n}) \subset I_{s_0 \ldots s_{n-1}}.$$
Therefore we conclude that
$$\bigcap_{n \ge 0} I_{s_0 s_1 \ldots s_n}$$
is nonempty. Note that if $x \in \bigcap_{n \ge 0} I_{s_0 s_1 \ldots s_n}$, then $x \in I_{s_0}$, $f_\lambda(x) \in I_{s_1}$, etc. Hence $S(x) = (s_0 s_1 \ldots)$. This proves that $S$ is onto.

Observe that $\bigcap_{n \ge 0} I_{s_0 s_1 \ldots s_n}$ consists of a unique point. This follows immediately from the fact that $S$ is one to one. In particular, we have that the diameter of $I_{s_0 s_1 \ldots s_n}$ tends to 0 as $n \to \infty$.

To prove continuity of $S$, we choose $x \in \Lambda$ and suppose that $S(x) = (s_0 s_1 s_2 \ldots)$. Let $\epsilon > 0$. Pick $n$ so that $1/2^n < \epsilon$. Consider the closed subintervals $I_{t_0 t_1 \ldots t_n}$ defined above for all possible combinations $t_0 t_1 \ldots t_n$. These subintervals are all disjoint, and $\Lambda$ is contained in their union. There are $2^{n+1}$ such subintervals, and $I_{s_0 s_1 \ldots s_n}$ is one of them. Hence we may choose $\delta$ such that $|x - y| < \delta$ and $y \in \Lambda$ implies that $y \in I_{s_0 s_1 \ldots s_n}$. Therefore, $S(y)$ agrees with $S(x)$ in the first $n + 1$ terms. So, by the previous proposition, we have
$$d(S(x), S(y)) \le \frac{1}{2^n} < \epsilon.$$


This proves the continuity of $S$. It is easy to check that $S^{-1}$ is also continuous. Thus, $S$ is a homeomorphism.

15.6 The Shift Map

We now construct a map $\sigma\colon \Sigma \to \Sigma$ with the following properties:

1. $\sigma$ is chaotic;
2. $\sigma$ is conjugate to $f_\lambda$ on $\Lambda$;
3. $\sigma$ is completely understandable from a dynamical systems point of view.

The meaning of this last item will become clear as we proceed.

We define the shift map $\sigma\colon \Sigma \to \Sigma$ by
$$\sigma(s_0 s_1 s_2 \ldots) = (s_1 s_2 s_3 \ldots).$$
That is, the shift map simply drops the first digit in each sequence in $\Sigma$. Note that $\sigma$ is a two-to-one map onto $\Sigma$. This follows since, if $(s_0 s_1 s_2 \ldots) \in \Sigma$, then we have
$$\sigma(0 s_0 s_1 s_2 \ldots) = \sigma(1 s_0 s_1 s_2 \ldots) = (s_0 s_1 s_2 \ldots).$$

Proposition. The shift map $\sigma\colon \Sigma \to \Sigma$ is continuous.

Proof: Let $s = (s_0 s_1 s_2 \ldots) \in \Sigma$, and let $\epsilon > 0$. Choose $n$ so that $1/2^n < \epsilon$. Let $\delta = 1/2^{n+1}$. Suppose that $d(s,t) < \delta$, where $t = (t_0 t_1 t_2 \ldots)$. Then we have $s_i = t_i$ for $i = 0, \ldots, n+1$. Now $\sigma(t) = (s_1 s_2 \ldots s_{n+1} t_{n+2} \ldots)$, so that $\sigma(s)$ and $\sigma(t)$ agree in their first $n + 1$ entries, and therefore $d(\sigma(s), \sigma(t)) \le 1/2^n < \epsilon$. This proves that $\sigma$ is continuous.

Note that we can easily write down all of the periodic points of any period for the shift map. Indeed, the fixed points are $(\overline{0})$ and $(\overline{1})$. The 2-cycles are $(\overline{01})$ and $(\overline{10})$. In general, the periodic points of period $n$ are given by repeating sequences that consist of repeated blocks of length $n$: $(\overline{s_0 \ldots s_{n-1}})$. Note how much nicer $\sigma$ is compared to $f_\lambda$: Just try to write down explicitly all of the periodic points of period $n$ for $f_\lambda$ someday! They are there and we know roughly where they are, because we have:

Theorem. The itinerary function $S\colon \Lambda \to \Sigma$ provides a conjugacy between $f_\lambda$ and the shift map $\sigma$.

Proof: In the previous section we showed that $S$ is a homeomorphism. So it suffices to show that $S \circ f_\lambda = \sigma \circ S$. To that end, let $x_0 \in \Lambda$ and suppose that $S(x_0) = (s_0 s_1 s_2 \ldots)$.


Then we have $x_0 \in I_{s_0}$, $f_\lambda(x_0) \in I_{s_1}$, $f_\lambda^2(x_0) \in I_{s_2}$, and so forth. But then the fact that $f_\lambda(x_0) \in I_{s_1}$, $f_\lambda^2(x_0) \in I_{s_2}$, etc., says that $S(f_\lambda(x_0)) = (s_1 s_2 s_3 \ldots)$, so $S(f_\lambda(x_0)) = \sigma(S(x_0))$, which is what we wanted to prove.

Now, not only can we write down all periodic points for $\sigma$, but we can in fact write down explicitly a point in $\Sigma$ whose orbit is dense. Here is such a point:
$$s^* = (\ \underbrace{0\ 1}_{\text{1-blocks}} \mid \underbrace{00\ 01\ 10\ 11}_{\text{2-blocks}} \mid \underbrace{000\ 001\ \cdots}_{\text{3-blocks}} \mid \underbrace{\cdots}_{\text{4-blocks}} \cdots).$$
The sequence $s^*$ is constructed by successively listing all possible blocks of 0's and 1's of length 1, length 2, length 3, and so forth. Clearly, some iterate of $\sigma$ applied to $s^*$ yields a sequence that agrees with any given sequence in an arbitrarily large number of initial places. That is, given $t = (t_0 t_1 t_2 \ldots) \in \Sigma$, we may find $k$ so that the sequence $\sigma^k(s^*)$ begins
$$(t_0 \ldots t_n s_{n+1} s_{n+2} \ldots),$$
so that
$$d(\sigma^k(s^*), t) \le 1/2^n.$$
Hence the orbit of $s^*$ comes arbitrarily close to every point in $\Sigma$. This proves that the orbit of $s^*$ under $\sigma$ is dense in $\Sigma$, and so $\sigma$ is transitive. Note that we may construct a multitude of other points with dense orbits in $\Sigma$ by just rearranging the blocks in the sequence $s^*$. Again, think about how difficult it would be to identify a seed whose orbit under a quadratic function like $f_4$ is dense in $[0,1]$. This is what we meant when we said earlier that the dynamics of $\sigma$ are "completely understandable."

The shift map also has sensitive dependence. Indeed, we may choose the sensitivity constant to be 2, which is the largest possible distance between two points in $\Sigma$. The reason for this is, if $s = (s_0 s_1 s_2 \ldots) \in \Sigma$ and $\hat{s}_j$ denotes "not $s_j$" (that is, if $s_j = 0$, then $\hat{s}_j = 1$, and if $s_j = 1$, then $\hat{s}_j = 0$), then the point $s' = (s_0 s_1 \ldots s_n \hat{s}_{n+1} \hat{s}_{n+2} \ldots)$ satisfies:

1. $d(s, s') = 1/2^n$, but
2. $d(\sigma^{n+1}(s), \sigma^{n+1}(s')) = 2$, since the shifted sequences disagree in every slot.

As a consequence, we have proved the following:

Theorem. The shift map $\sigma$ is chaotic on $\Sigma$, and so by the conjugacy in the previous theorem, the logistic map $f_\lambda$ is chaotic on $\Lambda$ when $\lambda > 4$.
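The construction of $s^*$ is easily carried out by machine. The sketch below (ours; the number of block lengths and the target block are arbitrary choices) builds a finite initial segment of $s^*$ and locates a shift that reproduces a given target block at the front:

```python
# Build the dense-orbit seed s* by listing all binary blocks of length
# 1, 2, 3, ..., then find k with sigma^k(s*) beginning with a target block.

from itertools import product

def s_star(n_blocks=12):
    digits = []
    for length in range(1, n_blocks + 1):
        for block in product("01", repeat=length):
            digits.extend(int(b) for b in block)
    return digits

s = s_star()
target = [1, 1, 0, 1, 0, 0, 1]

k = next(i for i in range(len(s) - len(target))
         if s[i:i + len(target)] == target)
print(f"sigma^{k}(s*) begins with {target}")
```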


Thus symbolic dynamics provides us with a computable model for the dynamics of $f_\lambda$ on the set $\Lambda$, despite the fact that $f_\lambda$ is chaotic on $\Lambda$.

15.7 The Cantor Middle-Thirds Set

We mentioned earlier that $\Lambda$ was an example of a Cantor set. Here we describe the simplest example of such a set, the Cantor middle-thirds set $C$. As we shall see, this set has some unexpectedly interesting properties.

To define $C$, we begin with the closed unit interval $I = [0,1]$. The rule is, each time we see a closed interval, we remove its open middle third. Hence, at the first stage, we remove $(1/3, 2/3)$, leaving us with two closed intervals, $[0, 1/3]$ and $[2/3, 1]$. We now repeat this step by removing the middle thirds of these two intervals. We are left with four closed intervals, $[0, 1/9]$, $[2/9, 1/3]$, $[2/3, 7/9]$, and $[8/9, 1]$. Removing the open middle thirds of these intervals leaves us with $2^3$ closed intervals, each of length $1/3^3$. Continuing in this fashion, at the $n$th stage we are left with $2^n$ closed intervals, each of length $1/3^n$. The Cantor middle-thirds set $C$ is what is left when we take this process to the limit as $n \to \infty$. Note how similar this construction is to that of $\Lambda$ in Section 15.5. In fact, it can be proved that $\Lambda$ is homeomorphic to $C$ (see Exercises 16 and 17).

What points in $I$ are left in $C$ after removing all of these open intervals? Certainly 0 and 1 remain in $C$, as do the endpoints 1/3 and 2/3 of the first removed interval. Indeed, each endpoint of a removed open interval lies in $C$, because such a point never lies in an open middle-third subinterval. At first glance, it appears that these are the only points in the Cantor set, but in fact that is far from the truth. Indeed, most points in $C$ are not endpoints!

To see this, we attach an address to each point in $C$. The address will be an infinite string of L's or R's determined as follows. At each stage of the construction, our point lies in one of two small closed intervals, one to the left of the removed open interval or one to its right. So at the $n$th stage we may assign an L or R to the point depending on its location left or right of the interval removed at that stage. For example, we associate LLL... to 0 and RRR... to 1. The endpoints 1/3 and 2/3 have addresses LRRR... and RLLL..., respectively. At the next stage, 1/9 has address LLRRR..., since 1/9 lies in $[0, 1/3]$ and $[0, 1/9]$ at the first two stages, but then always lies in the right-hand interval. Similarly, 2/9 has address LRLLL..., while 7/9 and 8/9 have addresses RLRRR... and RRLLL....

Notice what happens at each endpoint of $C$. As the above examples indicate, the address of an endpoint always ends in an infinite string of all L's or all R's. But there are plenty of other possible addresses for points in $C$.


For example, there is a point with address LRLRLR.... This point lies in
$$[0, 1/3] \cap [2/9, 1/3] \cap [2/9, 7/27] \cap [20/81, 7/27] \cap \cdots.$$
Note that this point lies in the nested intersection of closed intervals of length $1/3^n$ for each $n$, and it is the unique point that does so. This shows that most points in $C$ are not endpoints, for the typical address will not end in all L's or all R's.

We can actually say quite a bit more: The Cantor middle-thirds set contains uncountably many points. Recall that an infinite set is countable if it can be put in one-to-one correspondence with the natural numbers; otherwise, the set is uncountable.

Proposition. The Cantor middle-thirds set is uncountable.

Proof: Suppose that $C$ is countable. This means that we can pair each point in $C$ with a natural number in some fashion, say as

1: LLLLL...
2: RRRR...
3: LRLR...
4: RLRL...
5: LRRLRR...

and so forth. But now consider the address whose first entry is the opposite of the first entry of sequence 1, whose second entry is the opposite of the second entry of sequence 2, and so forth. This is a new sequence of L's and R's (which, in the example above, begins with RLRRL...). Thus we have created a sequence of L's and R's that disagrees in the $n$th spot with the $n$th sequence on our list. Hence this sequence is not on our list, and so we have failed in our construction of a one-to-one correspondence with the natural numbers. This contradiction establishes the result.

We can actually determine the points in the Cantor middle-thirds set in a more familiar way. To do this we change the address of a point in $C$ from a sequence of L's and R's to a sequence of 0's and 2's; that is, we replace each L with a 0 and each R with a 2. To determine the numerical value of a point $x \in C$, we approach $x$ from below by starting at 0 and moving $s_n/3^n$ units to the right for each $n$, where $s_n = 0$ or 2 depending on the $n$th digit in the address, for $n = 1, 2, 3, \ldots$.


For example, 1 has address RRR... or 222..., so 1 is given by
$$\frac{2}{3} + \frac{2}{3^2} + \frac{2}{3^3} + \cdots = \frac{2}{3}\sum_{n=0}^{\infty}\frac{1}{3^n} = \frac{2}{3}\left(\frac{1}{1 - 1/3}\right) = 1.$$
Similarly, 1/3 has address LRRR... or 0222..., which yields
$$\frac{0}{3} + \frac{2}{3^2} + \frac{2}{3^3} + \cdots = \frac{2}{9}\sum_{n=0}^{\infty}\frac{1}{3^n} = \frac{2}{9}\cdot\frac{3}{2} = \frac{1}{3}.$$
Finally, the point with address LRLRLR... or 020202... is
$$\frac{0}{3} + \frac{2}{3^2} + \frac{0}{3^3} + \frac{2}{3^4} + \cdots = \frac{2}{9}\sum_{n=0}^{\infty}\frac{1}{9^n} = \frac{2}{9}\left(\frac{1}{1 - 1/9}\right) = \frac{1}{4}.$$
Note that this is one of the non-endpoints in $C$ referred to earlier.

The astute reader will recognize that the address of a point $x$ in $C$ with 0's and 2's gives the ternary expansion of $x$. A point $x \in I$ has ternary expansion $a_1 a_2 a_3 \ldots$ if
$$x = \sum_{i=1}^{\infty} \frac{a_i}{3^i},$$
where each $a_i$ is either 0, 1, or 2. Thus we see that points in the Cantor middle-thirds set have ternary expansions that may be written with no 1's among the digits.

We should be a little careful here. The ternary expansion of 1/3 is 1000.... But 1/3 also has ternary expansion 0222..., as we saw above. So 1/3 may be written in ternary form in a way that contains no 1's. In fact, every endpoint in $C$ has a similar pair of ternary representations, one of which contains no 1's.

We have shown that $C$ contains uncountably many points, but we can say even more:

Proposition. The Cantor middle-thirds set contains as many points as the interval $[0,1]$.

Proof: $C$ consists of all points whose ternary expansion $a_1 a_2 a_3 \ldots$ contains only 0's or 2's. Take this expansion, change each 2 to a 1, and then think of this string as a binary expansion. We get every possible binary expansion in this manner. We have therefore made a correspondence (at most two to one) between the points in $C$ and the points in $[0,1]$, since every such point has a binary expansion.
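The conversion from addresses to points is easily mechanized. A short sketch (ours), using exact rational arithmetic and truncating each address after finitely many letters:

```python
# Convert an LR-address into the ternary sum  sum_n s_n / 3^n,
# with s_n = 0 for L and s_n = 2 for R.

from fractions import Fraction

def cantor_point(address):
    return sum(Fraction(2 if c == "R" else 0, 3**n)
               for n, c in enumerate(address, start=1))

print(float(cantor_point("R" * 30)))        # approaches 1
print(float(cantor_point("L" + "R" * 30)))  # approaches 1/3
print(float(cantor_point("LR" * 15)))       # approaches 1/4
```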


Finally, we note that:

Proposition. The Cantor middle-thirds set has length 0.

Proof: We compute the "length" of $C$ by adding up the lengths of the intervals removed at each stage to determine the length of the complement of $C$. These removed intervals have successive lengths 1/3, 2/9, 4/27, ..., and so the length of $I - C$ is
$$\frac{1}{3} + \frac{2}{9} + \frac{4}{27} + \cdots = \frac{1}{3}\sum_{n=0}^{\infty}\left(\frac{2}{3}\right)^n = 1.$$

This fact may come as no surprise since $C$ consists of a "scatter" of points. But now consider the Cantor middle-fifths set, obtained by removing, at the $n$th stage, an open interval of length $1/5^{n+1}$ from the middle of each of the $2^n$ remaining closed intervals. The total length removed is now $\sum_{n=0}^{\infty} 2^n/5^{n+1} = 1/3$, so the length of this set is nonzero, yet it is homeomorphic to $C$. These Cantor sets have, as we said earlier, unexpectedly interesting properties! And remember, the set $\Lambda$ on which $f_\lambda$ (with $\lambda > 4$) is chaotic is just this kind of object.

15.8 Exploration: Cubic Chaos

In this exploration, you will investigate the behavior of the discrete dynamical system given by the family of cubic functions $f_\lambda(x) = \lambda x - x^3$. You should attempt to prove rigorously everything outlined below.

1. Describe the dynamics of this family of functions for all $\lambda < -1$.
2. Describe the bifurcation that occurs at $\lambda = -1$. Hint: Note that $f_\lambda$ is an odd function. In particular, what happens when the graph of $f_\lambda$ crosses the line $y = -x$?
3. Describe the dynamics of $f_\lambda$ when $-1 < \lambda < 1$.
4. Describe the bifurcation that occurs at $\lambda = 1$.
5. Find a $\lambda$-value, $\lambda^*$, for which $f_{\lambda^*}$ has a pair of invariant intervals $[-x^*, 0]$ and $[0, x^*]$, on each of which the behavior of $f_\lambda$ mimics that of the logistic function $4x(1-x)$.
6. Describe the change in dynamics that occurs when $\lambda$ increases through $\lambda^*$.
7. Describe the dynamics of $f_\lambda$ when $\lambda$ is very large. Describe the set of points $\Lambda_\lambda$ whose orbits do not escape to $\pm\infty$ in this case.
8. Use symbolic dynamics to set up a sequence space and a corresponding shift map for $\lambda$ large. Prove that $f_\lambda$ is chaotic on $\Lambda_\lambda$.
9. Find the parameter value $\lambda' > \lambda^*$ above which the results of the previous two investigations hold true.
10. Describe the bifurcation that occurs as $\lambda$ increases through $\lambda'$.


15.9 Exploration: The Orbit Diagram

Unlike the previous exploration, this exploration is primarily experimental. It is designed to acquaint you with the rich dynamics of the logistic family as the parameter increases from 0 to 4. Using a computer and whatever software seems appropriate, construct the orbit diagram for the logistic family $f_\lambda(x) = \lambda x(1-x)$ as follows: Choose $N$ equally spaced $\lambda$-values $\lambda_1, \lambda_2, \ldots, \lambda_N$ in the interval $0 \le \lambda_j \le 4$. For example, let $N = 800$ and set $\lambda_j = 0.005j$. For each $\lambda_j$, compute the orbit of 0.5 under $f_{\lambda_j}$ and plot this orbit as follows.

Let the horizontal axis be the $\lambda$-axis and let the vertical axis be the $x$-axis. Over each $\lambda_j$, plot the points $(\lambda_j, f_{\lambda_j}^k(0.5))$ for, say, $50 \le k \le 250$. That is, compute the first 250 points on the orbit of 0.5 under $f_{\lambda_j}$, but display only the last 200 points on the vertical line over $\lambda = \lambda_j$. Effectively, you are displaying the "fate" of the orbit of 0.5 in this way.

You will need to magnify certain portions of this diagram; one such magnification is displayed in Figure 15.13, where we have displayed only that portion of the orbit diagram for $\lambda$ in the interval $3 \le \lambda \le 4$. A sketch of this construction appears below.
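Here is a minimal sketch of the construction just described (ours; it assumes the matplotlib library is available for plotting):

```python
# Orbit diagram for the logistic family: for each of 800 lambda values,
# iterate from x = 0.5, discard the first 49 iterates as a transient, and
# plot iterates 50 through 250 over that lambda.

import matplotlib.pyplot as plt

N = 800
lams, xs = [], []
for j in range(1, N + 1):
    lam = 0.005 * j
    x = 0.5
    for k in range(1, 251):
        x = lam * x * (1 - x)      # x is now f^k(0.5)
        if k >= 50:
            lams.append(lam)
            xs.append(x)

plt.plot(lams, xs, ",k")           # pixel markers
plt.xlabel("lambda")
plt.ylabel("x")
plt.show()
```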


1. The region bounded by $0 \le \lambda < 3.57\ldots$ is called the period 1 window. Describe what you see as $\lambda$ increases in this window. What type of bifurcations occur?
2. Near the bifurcations in the previous question, you sometimes see a smear of points. What causes this?
3. Observe the period 3 window bounded approximately by $3.828\ldots < \lambda < 3.857\ldots$.
4. There are many other period $n$ windows (named for the least period of the cycle in that window). Discuss any pattern you can find in how these windows are arranged as $\lambda$ increases. In particular, if you magnify portions between the period 1 and period 3 windows, how are the larger windows in each successive enlargement arranged?
5. You observe "darker" curves in this orbit diagram. What are these? Why does this happen?

EXERCISES

1. Find all periodic points for each of the following maps and classify them as attracting, repelling, or neither.
   (a) $Q(x) = x - x^2$  (b) $Q(x) = 2(x - x^2)$
   (c) $C(x) = x^3 - \frac{1}{9}x$  (d) $C(x) = x^3 - x$
   (e) $S(x) = \frac{1}{2}\sin(x)$  (f) $S(x) = \sin(x)$
   (g) $E(x) = e^{x-1}$  (h) $E(x) = e^x$
   (i) $A(x) = \arctan x$  (j) $A(x) = -\frac{\pi}{4}\arctan x$
2. Discuss the bifurcations that occur in the following families of maps at the indicated parameter value.
   (a) $S_\lambda(x) = \lambda \sin x$, $\lambda = 1$
   (b) $C_\mu(x) = x^3 + \mu x$, $\mu = -1$ (Hint: Exploit the fact that $C_\mu$ is an odd function.)
   (c) $G_\nu(x) = x + \sin x + \nu$, $\nu = 1$
   (d) $E_\lambda(x) = \lambda e^x$, $\lambda = 1/e$
   (e) $E_\lambda(x) = \lambda e^x$, $\lambda = -e$
   (f) $A_\lambda(x) = \lambda \arctan x$, $\lambda = 1$
   (g) $A_\lambda(x) = \lambda \arctan x$, $\lambda = -1$
3. Consider the linear maps $f_k(x) = kx$. Show that there are four open sets of parameters for which the behavior of orbits of $f_k$ is similar. Describe what happens in the exceptional cases.
4. For the function $f_\lambda(x) = \lambda x(1-x)$ defined on $\mathbb{R}$:
   (a) Describe the bifurcations that occur at $\lambda = -1$ and $\lambda = 3$.
   (b) Find all period 2 points.
   (c) Describe the bifurcation that occurs at $\lambda = -1.75$.
5. For the doubling map $D$ on $[0,1)$:
   (a) List all periodic points explicitly.


   (b) List all points whose orbits end up landing on 0 and are thereby eventually fixed.
   (c) Let $x \in [0,1)$ and suppose that $x$ is given in binary form as $a_0 a_1 a_2 \ldots$, where each $a_j$ is either 0 or 1. First give a formula for the binary representation of $D(x)$. Then explain why this causes orbits of $D$ generated by a computer to end up eventually fixed at 0.
6. Show that, if $x_0$ lies on a cycle of period $n$, then
   $$(f^n)'(x_0) = \prod_{i=0}^{n-1} f'(x_i).$$
   Conclude that
   $$(f^n)'(x_0) = (f^n)'(x_j)$$
   for $j = 1, \ldots, n-1$.
7. Prove that if $f_{\lambda_0}$ has a fixed point at $x_0$ with $|f_{\lambda_0}'(x_0)| > 1$, then there is an interval $I$ about $x_0$ and an interval $J$ about $\lambda_0$ such that, if $\lambda \in J$, then $f_\lambda$ has a unique fixed source in $I$ and no other orbits that lie entirely in $I$.
8. Verify that the family $f_c(x) = x^2 + c$ undergoes a period doubling bifurcation at $c = -3/4$ by
   (a) computing explicitly the period two orbit, and
   (b) showing that this orbit is attracting for $-5/4 < c < -3/4$.


Prove that $T$ is chaotic on $[0,1]$.

16. Consider a different "tent function" defined on all of $\mathbb{R}$ by
    $$T(x) = \begin{cases} 3x & \text{if } x \le 1/2 \\ -3x + 3 & \text{if } 1/2 \le x. \end{cases}$$
    Identify the set of points whose orbits do not go to $-\infty$. What can you say about the dynamics on this set?
17. Use the results of the previous exercise to show that the set $\Lambda$ in Section 15.5 is homeomorphic to the Cantor middle-thirds set.
18. Prove the following saddle-node bifurcation theorem: Suppose that $f_\lambda$ depends smoothly on the parameter $\lambda$ and satisfies:
    (a) $f_{\lambda_0}(x_0) = x_0$
    (b) $f_{\lambda_0}'(x_0) = 1$
    (c) $f_{\lambda_0}''(x_0) \ne 0$
    (d) $\dfrac{\partial f_\lambda}{\partial \lambda}\Big|_{\lambda = \lambda_0}(x_0) \ne 0$
    Then there is an interval $I$ about $x_0$ and a smooth function $\mu\colon I \to \mathbb{R}$ satisfying $\mu(x_0) = \lambda_0$ and such that
    $$f_{\mu(x)}(x) = x.$$
    Moreover, $\mu'(x_0) = 0$ and $\mu''(x_0) \ne 0$. Hint: Apply the implicit function theorem to $G(x, \lambda) = f_\lambda(x) - x$ at $(x_0, \lambda_0)$.
19. Discuss why the saddle-node bifurcation is the "typical" bifurcation involving only fixed points.

Figure 15.14 The graph of the one-dimensional function $g$ on $[-y^*, y^*]$.


20. Recall that comprehending the behavior of the Lorenz system in Chapter 14 could be reduced to understanding the dynamics of a certain one-dimensional function $g$ on an interval $[-y^*, y^*]$, whose graph is shown in Figure 15.14. Recall also that $|g'(y)| > 1$ for all $y \ne 0$ and that $g$ is undefined at 0. Suppose now that $g^3(y^*) = 0$, as displayed in this graph. By symmetry, we also have $g^3(-y^*) = 0$. Let $I_0 = [-y^*, 0)$ and $I_1 = (0, y^*]$, and define the usual itinerary map on $[-y^*, y^*]$.
    (a) Describe the set of possible itineraries under $g$.
    (b) What are the possible periodic points for $g$?
    (c) Prove that $g$ is chaotic on $[-y^*, y^*]$.




16 Homoclinic Phenomena

In this chapter we investigate several other three-dimensional systems of differential equations that display chaotic behavior. These systems include the Shil'nikov system and the double scroll attractor. As with the Lorenz system, our principal means of studying these systems involves reducing them to lower dimensional discrete dynamical systems and then invoking symbolic dynamics. In these cases the discrete system is a planar map called the horseshoe map. This was one of the first chaotic systems to be analyzed completely.

16.1 The Shil'nikov System

In this section we investigate the behavior of a nonlinear system of differential equations that possesses a homoclinic solution to an equilibrium point that is a spiral saddle. While we deal primarily with a model system here, the work of Shil'nikov and others [4, 40, 41] shows that the phenomena described in this chapter hold in many actual systems of differential equations. Indeed, in the exploration at the end of this chapter, we investigate the system of differential equations governing the Chua circuit, which, for certain parameter values, has a pair of such homoclinic solutions.

For this example, we do not specify the full system of differential equations. Rather, we first set up a linear system of differential equations in a certain cylindrical neighborhood of the origin. This system has a two-dimensional stable surface in which solutions spiral toward the origin and a one-dimensional unstable curve. We then make the simple but crucial dynamical assumption that one of the two branches of the unstable curve is a homoclinic solution and thus eventually enters the stable surface.


We do not write down a specific differential equation having this behavior. Although it is possible to do so, having the equations is not particularly useful for understanding the global dynamics of the system. In fact, the phenomena we study here depend only on the qualitative properties of the linear system described previously, a key inequality involving the eigenvalues of this linear system, and the homoclinic assumption.

The first portion of the system is defined in the cylindrical region $S$ of $\mathbb{R}^3$ given by $x^2 + y^2 \le 1$ and $|z| \le 1$. In this region consider the linear system
$$X' = \begin{pmatrix} -1 & 1 & 0 \\ -1 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix} X.$$
The associated eigenvalues are $-1 \pm i$ and 2. Using the results of Chapter 6, the flow $\phi_t$ of this system is easily derived:
$$x(t) = x_0 e^{-t}\cos t + y_0 e^{-t}\sin t$$
$$y(t) = -x_0 e^{-t}\sin t + y_0 e^{-t}\cos t$$
$$z(t) = z_0 e^{2t}.$$
Using polar coordinates in the $xy$-plane, solutions in $S$ are given more succinctly by
$$r(t) = r_0 e^{-t},\qquad \theta(t) = \theta_0 - t,\qquad z(t) = z_0 e^{2t}.$$
This system has a two-dimensional stable plane (the $xy$-plane) and a pair of unstable curves $\zeta^\pm$ lying on the positive and negative $z$-axis, respectively.

We remark that there is nothing special about our choice of eigenvalues for this system. Everything below works fine for eigenvalues $\alpha \pm i\beta$ and $\lambda$ where $\alpha < 0$, $\beta \ne 0$, and $\lambda > 0$, subject only to the important condition that $\lambda > -\alpha$.

The boundary of $S$ consists of three pieces: the upper and lower disks $D^\pm$ given by $z = \pm 1$, $r \le 1$, and the cylindrical boundary $C$ given by $r = 1$, $|z| \le 1$. The stable plane meets $C$ along the circle $z = 0$ and divides $C$ into two pieces, the upper and lower halves given by $C^+$ and $C^-$, on which $z > 0$ and $z < 0$, respectively.
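For concreteness, here is a small sketch (ours; the sample point is arbitrary) of this linear flow in cylindrical coordinates. It also computes the time a solution starting on $C^+$ needs to reach the top disk $D^+$, which is the transit time used in the next paragraph:

```python
import math

def flow(r0, theta0, z0, t):
    """phi_t for the linear system: r decays, theta rotates, z expands."""
    return r0 * math.exp(-t), theta0 - t, z0 * math.exp(2 * t)

def transit_time(z0):
    """Time tau with z0 * exp(2 tau) = 1, i.e. tau = -log(sqrt(z0))."""
    return -math.log(math.sqrt(z0))

z0 = 0.01
tau = transit_time(z0)
print(flow(1.0, 0.0, z0, tau))   # lands on D+ with r = sqrt(z0) and z = 1
```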


Any solution that starts on $C^+$ has increasing $z$-coordinate and so must eventually exit the region $S$ through the top disk $D^+$. Given $(\theta_0, z_0) \in C^+$, let $\tau = \tau(\theta_0, z_0)$ denote the time it takes for the solution through $(\theta_0, z_0)$ to make the transit to $D^+$. We compute immediately, using $z(t) = z_0 e^{2t}$, that $\tau = -\log(\sqrt{z_0})$. Therefore
$$\psi_1\begin{pmatrix}1\\ \theta_0\\ z_0\end{pmatrix} = \begin{pmatrix} r_1\\ \theta_1\\ 1\end{pmatrix} = \begin{pmatrix}\sqrt{z_0}\\ \theta_0 + \log(\sqrt{z_0})\\ 1\end{pmatrix}.$$
For simplicity, we will regard $\psi_1$ as a map from the $(\theta_0, z_0)$ cylinder to the $(r_1, \theta_1)$ plane. Note that a vertical line given by $\theta_0 = \theta^*$ in $C^+$ is mapped by $\psi_1$ to the spiral
$$z_0 \mapsto \left(\sqrt{z_0},\ \theta^* + \log(\sqrt{z_0})\right),$$
which spirals down to the point $r = 0$ in $D^+$, since $\log\sqrt{z_0} \to -\infty$ as $z_0 \to 0$.

To define the second piece of the system, we assume that the branch $\zeta^+$ of the unstable curve leaving the origin through $D^+$ is a homoclinic solution. That is, $\zeta^+$ eventually returns to the stable plane. See Figure 16.1. We assume that $\zeta^+$ first meets the cylinder $C$ at the point $r = 1$, $\theta = 0$, $z = 0$. More precisely, we assume that there is a time $t_1$ such that $\phi_{t_1}(0, \theta, 1) = (1, 0, 0)$ in $(r, \theta, z)$ coordinates.

Therefore we may define a second map $\psi_2$ by following solutions beginning near $r = 0$ in $D^+$ until they reach $C$. We will assume that $\psi_2$ is, in fact, defined on all of $D^+$. In Cartesian coordinates on $D^+$, we assume that $\psi_2$ takes $(x, y) \in D^+$ to $(\theta_1, z_1) \in C$ via the rule
$$\psi_2\begin{pmatrix}x\\ y\end{pmatrix} = \begin{pmatrix}\theta_1\\ z_1\end{pmatrix} = \begin{pmatrix}y/2\\ x/2\end{pmatrix}.$$

Figure 16.1 The homoclinic orbit $\zeta^+$.


In polar coordinates, $\psi_2$ is given by
$$\theta_1 = (r\sin\theta)/2,\qquad z_1 = (r\cos\theta)/2.$$
Of course, this is a major assumption, since writing down such a map for a particular nonlinear system would be virtually impossible.

Now the composition $\Phi = \psi_2 \circ \psi_1$ defines a Poincaré map on $C^+$. The map $\psi_1$ is defined on $C^+$ and takes values in $D^+$, and then $\psi_2$ takes values in $C$. We have $\Phi\colon C^+ \to C$, where
$$\Phi\begin{pmatrix}\theta_0\\ z_0\end{pmatrix} = \begin{pmatrix}\theta_1\\ z_1\end{pmatrix} = \begin{pmatrix}\frac{1}{2}\sqrt{z_0}\,\sin\left(\theta_0 + \log(\sqrt{z_0})\right)\\ \frac{1}{2}\sqrt{z_0}\,\cos\left(\theta_0 + \log(\sqrt{z_0})\right)\end{pmatrix}.$$
See Figure 16.2.

Figure 16.2 The map $\psi_2\colon D^+ \to C$.

As in the Lorenz system, we have now reduced the study of the flow of this three-dimensional system to the study of a planar discrete dynamical system. As we shall see in the next section, this type of mapping has incredibly rich dynamics that may be (partially) analyzed using symbolic dynamics. For a little taste of what is to come, we content ourselves here with just finding the fixed points of $\Phi$. To do this we need to solve
$$\theta_0 = \frac{1}{2}\sqrt{z_0}\,\sin\left(\theta_0 + \log(\sqrt{z_0})\right)$$
$$z_0 = \frac{1}{2}\sqrt{z_0}\,\cos\left(\theta_0 + \log(\sqrt{z_0})\right).$$
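The Poincaré map is simple to transcribe; here is a sketch (ours, with an arbitrary starting point). Note that an orbit of the planar map is meaningful only while it stays in $C^+$:

```python
import math

def Phi(theta0, z0):
    u = theta0 + math.log(math.sqrt(z0))
    return (0.5 * math.sqrt(z0) * math.sin(u),
            0.5 * math.sqrt(z0) * math.cos(u))

theta, z = 0.0, 0.3
for _ in range(5):
    print(theta, z)
    theta, z = Phi(theta, z)
    if z <= 0:
        print("entered C-; the iteration on C+ stops here")
        break
```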


These equations look pretty formidable. However, if we square both equations and add them, we find
$$\theta_0^2 + z_0^2 = \frac{z_0}{4},$$
so that
$$\theta_0 = \pm\frac{1}{2}\sqrt{z_0 - 4z_0^2},$$
which is well defined provided that $0 \le z_0 \le 1/4$. Substituting this expression into the second equation above, we find that we need to solve
$$\cos\left(\pm\frac{1}{2}\sqrt{z_0 - 4z_0^2} + \log(\sqrt{z_0})\right) = 2\sqrt{z_0}.$$
Now the term $\sqrt{z_0 - 4z_0^2}$ tends to zero as $z_0 \to 0$, but $\log(\sqrt{z_0}) \to -\infty$. Therefore the graph of the left-hand side of this equation oscillates infinitely many times between $\pm 1$ as $z_0 \to 0$. Hence there must be infinitely many places where this graph meets that of $2\sqrt{z_0}$, and so there are infinitely many solutions of this equation. This, in turn, yields infinitely many fixed points for $\Phi$. Each of these fixed points then corresponds to a periodic solution of the system that starts in $C^+$, winds a number of times around the $z$-axis near the origin, and then travels around close to the homoclinic orbit until closing up when it returns to $C^+$. See Figure 16.3.

Figure 16.3 A periodic solution $\gamma$ near the homoclinic solution $\zeta^+$.
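The infinitude of solutions can be glimpsed numerically. The sketch below (ours; only the $+$ branch is scanned, and the grid parameters are arbitrary) counts sign changes of the difference of the two sides on a logarithmically spaced grid, since the oscillations are evenly spaced in $\log z_0$. Pushing the lower endpoint toward 0 reveals more and more roots:

```python
import math

def F(z):
    inner = max(z - 4.0 * z * z, 0.0)   # clip tiny negative rounding error
    return (math.cos(0.5 * math.sqrt(inner) + math.log(math.sqrt(z)))
            - 2.0 * math.sqrt(z))

n, z_lo, z_hi = 200_000, 1e-30, 0.25
zs = [z_lo * (z_hi / z_lo) ** (i / n) for i in range(n + 1)]
count = sum(1 for a, b in zip(zs, zs[1:]) if F(a) * F(b) < 0)
print("sign changes found:", count)     # grows as z_lo is pushed toward 0
```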


We now describe the geometry of this map; in the next section we use these ideas to investigate the dynamics of a simplified version of this map. First note that the circles $z_0 = \alpha$ in $C^+$ are mapped by $\psi_1$ to circles of radius $\sqrt{\alpha}$ centered at $r = 0$ in $D^+$, since
$$\psi_1\begin{pmatrix}\theta_0\\ \alpha\end{pmatrix} = \begin{pmatrix}r_1\\ \theta_1\end{pmatrix} = \begin{pmatrix}\sqrt{\alpha}\\ \theta_0 + \log(\sqrt{\alpha})\end{pmatrix}.$$
Then $\psi_2$ maps these circles to circles of radius $\sqrt{\alpha}/2$ centered at $\theta_1 = z_1 = 0$ in $C$. (To be precise, these are circles in the $\theta z$-plane; in the cylinder, these circles are "bent.") In particular, we see that "one-half" of the domain $C^+$ is mapped into the lower part of the cylinder $C^-$ and therefore no longer comes into play.

Let $H$ denote the half-disk $\Phi(C^+) \cap \{z \ge 0\}$. The half-disk $H$ has center at $\theta_1 = z_1 = 0$ and radius 1/2. The preimage of $H$ in $C^+$ consists of all points $(\theta_0, z_0)$ whose images satisfy $z_1 \ge 0$, so that we must have
$$z_1 = \frac{1}{2}\sqrt{z_0}\,\cos\left(\theta_0 + \log(\sqrt{z_0})\right) \ge 0.$$
It follows that the preimage of $H$ is given by
$$\Phi^{-1}(H) = \{(\theta_0, z_0) \mid -\pi/2 \le \theta_0 + \log(\sqrt{z_0}) \le \pi/2\},$$
where $0 < z_0 \le 1$. Now consider the curves $\ell_\alpha$ in $C^+$ given by
$$\theta_0 + \log(\sqrt{z_0}) = \alpha$$


for $-\pi/2 \le \alpha \le \pi/2$. These curves fill the preimage $\Phi^{-1}(H)$, and each spirals around $C$ just as the boundary curves do. Now we have
$$\Phi(\ell_\alpha) = \frac{\sqrt{z_0}}{2}\begin{pmatrix}\sin\alpha\\ \cos\alpha\end{pmatrix},$$
so $\Phi$ maps each $\ell_\alpha$ to a ray that emanates from $\theta = z = 0$ in $C$ and is parameterized by $\sqrt{z_0}$. In particular, $\Phi$ maps each of the boundary curves $\ell_{\pm\pi/2}$ to $z = 0$ in $C$.

Since the curves $\ell_{\pm\pi/2}$ spiral down toward the circle $z = 0$ in $C$, it follows that $\Phi^{-1}(H)$ meets $H$ in infinitely many strips, which are nearly horizontal close to $z = 0$. See Figure 16.4. We denote these strips by $H_k$ for $k$ sufficiently large. More precisely, let $H_k$ denote the component of $\Phi^{-1}(H) \cap H$ for which we have
$$2k\pi - \frac{1}{2} \le \theta_0 \le 2k\pi + \frac{1}{2}.$$
The top boundary of $H_k$ is given by a portion of the spiral $\ell_{\pi/2}$ and the bottom boundary by a piece of $\ell_{-\pi/2}$. Using the fact that
$$-\frac{\pi}{2} \le \theta_0 + \log(\sqrt{z_0}) \le \frac{\pi}{2},$$
we find that, if $(\theta_0, z_0) \in H_k$, then
$$-(4k+1)\pi - 1 \le -\pi - 2\theta_0 \le 2\log\sqrt{z_0} \le \pi - 2\theta_0 \le -(4k-1)\pi + 1,$$
from which we conclude that
$$\exp(-(4k+1)\pi - 1) \le z_0 \le \exp(-(4k-1)\pi + 1).$$
Now consider the image of $H_k$ under $\Phi$. The upper and lower boundaries of $H_k$ are mapped to $z = 0$. The curves $\ell_\alpha \cap H_k$ are mapped to arcs in rays emanating from $\theta = z = 0$. These rays are given as above by
$$\frac{\sqrt{z_0}}{2}\begin{pmatrix}\sin\alpha\\ \cos\alpha\end{pmatrix}.$$
In particular, the curve $\ell_0$ is mapped to the vertical line $\theta_1 = 0$, $z_1 = \sqrt{z_0}/2$. Using the above estimate of the size of $z_0$ in $H_k$, one checks easily that the image of $\ell_0$ lies completely above $H_k$ when $k \ge 2$. Therefore the image $\Phi(H_k)$ is a "horseshoe-shaped" region that crosses $H_k$ twice, as shown in Figure 16.5. In particular, if $k$ is large, the curves $\ell_\alpha \cap H_k$ meet the horseshoe $\Phi(H_k)$ in nearly horizontal subarcs.


Figure 16.5 The image of $H_k$ is a horseshoe that crosses $H_k$ twice in $C^+$.

Such a map is called a horseshoe map; in the next section we discuss the prototype of such a function.

16.2 The Horseshoe Map

Symbolic dynamics, which played such a crucial role in our understanding of the one-dimensional logistic map, can also be used to study higher dimensional phenomena. In this section, we describe an important example in $\mathbb{R}^2$, the horseshoe map [43]. We shall see that this map has much in common with the Poincaré map described in the previous section.

To define the horseshoe map, we consider a region $D$ consisting of three components: a central square $S$ with sides of length 1, together with two semicircles $D_1$ and $D_2$ at the top and bottom. $D$ is shaped like a "stadium."

The horseshoe map $F$ takes $D$ inside itself according to the following prescription. First, $F$ linearly contracts $S$ in the horizontal direction by a factor $\delta < 1/2$ and expands it in the vertical direction by a factor of $1/\delta$, so that $S$ becomes long and thin; then $F$ curls $S$ back inside $D$ in a horseshoe-shaped figure, as displayed in Figure 16.6. We stipulate that $F$ maps $S$ linearly onto the two vertical "legs" of the horseshoe.

We assume that the semicircular regions $D_1$ and $D_2$ are mapped inside $D_1$ as depicted. We also assume that there is a fixed point in $D_1$ that attracts all other orbits in $D_1$. Note that $F(D) \subset D$ and that $F$ is one-to-one. However, since $F$ is not onto, the inverse of $F$ is not globally defined. The remainder of this section is devoted to the study of the dynamics of $F$ in $D$.

Note first that the preimage of $S$ consists of two horizontal rectangles, $H_0$ and $H_1$, which are mapped linearly onto the two vertical components $V_0$ and $V_1$ of $F(S) \cap S$. The width of $V_0$ and $V_1$ is therefore $\delta$, as is the height of $H_0$ and $H_1$. See Figure 16.7. By linearity of $F\colon H_0 \to V_0$ and $F\colon H_1 \to V_1$, we know that $F$ takes horizontal and vertical lines in $H_j$ to horizontal and vertical lines in $V_j$ for $j = 0, 1$. As a consequence, if both $h$ and $F(h)$ are horizontal line segments in $S$, then the length of $F(h)$ is $\delta$ times the length of $h$.


Figure 16.6 The first iterate of the horseshoe map.

Similarly, if $v$ is a vertical line segment in $S$ whose image also lies in $S$, then the length of $F(v)$ is $1/\delta$ times the length of $v$.

We now describe the forward orbit of each point $X \in D$. Recall that the forward orbit of $X$ is given by $\{F^n(X) \mid n \ge 0\}$. By assumption, $F$ has a unique fixed point $X_0$ in $D_1$ and $\lim_{n\to\infty} F^n(X) = X_0$ for all $X \in D_1$. Also, since $F(D_2) \subset D_1$, all forward orbits in $D_2$ behave likewise. Similarly, if $X \in S$ but $F^k(X) \notin S$ for some $k > 0$, then we must have that $F^k(X) \in D_1 \cup D_2$, so that $F^n(X) \to X_0$ as $n \to \infty$ as well. Consequently, we understand the forward orbits of any $X \in D$ whose orbit enters $D_1$, so it suffices to consider the set of points whose forward orbits never enter $D_1$ and so lie completely in $S$.

Figure 16.7 The rectangles $H_0$ and $H_1$ and their images $V_0$ and $V_1$.


Let
$$\Lambda^+ = \{X \in S \mid F^n(X) \in S \text{ for } n = 0, 1, 2, \ldots\}.$$
We claim that $\Lambda^+$ has properties similar to the corresponding set for the one-dimensional logistic map described in Chapter 15.

If $X \in \Lambda^+$, then $F(X) \in S$, so we must have that either $X \in H_0$ or $X \in H_1$, for all other points in $S$ are mapped into $D_1$ or $D_2$. Since $F^2(X) \in S$ as well, we must also have $F(X) \in H_0 \cup H_1$, so that $X \in F^{-1}(H_0 \cup H_1)$. Here $F^{-1}(W)$ denotes the preimage of a set $W$ lying in $D$. In general, since $F^n(X) \in S$, we have $X \in F^{-n}(H_0 \cup H_1)$. Thus we may write
$$\Lambda^+ = \bigcap_{n=0}^{\infty} F^{-n}(H_0 \cup H_1).$$
Now if $H$ is any horizontal strip connecting the left and right boundaries of $S$ with height $h$, then $F^{-1}(H)$ consists of a pair of narrower horizontal strips of height $\delta h$, one in each of $H_0$ and $H_1$. The images under $F$ of these narrower strips are given by $H \cap V_0$ and $H \cap V_1$. In particular, if $H = H_i$, $F^{-1}(H_i)$ is a pair of horizontal strips, each of height $\delta^2$, with one in $H_0$ and the other in $H_1$. Similarly, $F^{-1}(F^{-1}(H_i)) = F^{-2}(H_i)$ consists of four horizontal strips, each of height $\delta^3$, and $F^{-n}(H_i)$ consists of $2^n$ horizontal strips of height $\delta^{n+1}$. Hence the same procedure we used in Section 15.5 shows that $\Lambda^+$ is a Cantor set of line segments, each extending horizontally across $S$.

The main difference between the horseshoe and the logistic map is that, in the horseshoe case, there is a single backward orbit rather than infinitely many such orbits. The backward orbit of $X \in S$ is $\{F^{-n}(X) \mid n = 1, 2, \ldots\}$, provided $F^{-n}(X)$ is defined and in $D$. If $F^{-n}(X)$ is not defined, then the backward orbit of $X$ terminates. Let $\Lambda^-$ denote the set of points whose backward orbit is defined for all $n$ and lies entirely in $S$. If $X \in \Lambda^-$, then we have $F^{-n}(X) \in S$ for all $n \ge 1$, which implies that $X \in F^n(S)$ for all $n \ge 1$. As above, this forces $X \in F^n(H_0 \cup H_1)$ for all $n \ge 1$. Therefore we may also write
$$\Lambda^- = \bigcap_{n=1}^{\infty} F^n(H_0 \cup H_1).$$
On the other hand, if $X \in S$ and $F^{-1}(X) \in S$, then we must have $X \in F(S) \cap S$, so that $X \in V_0$ or $X \in V_1$. Similarly, if $F^{-2}(X) \in S$ as well, then $X \in F^2(S) \cap S$, which consists of four narrower vertical strips, two in $V_0$ and two in $V_1$. In Figure 16.8 we show the image of $D$ under $F^2$. Arguing entirely analogously as above, it is easy to check that $\Lambda^-$ consists of a Cantor set of vertical lines.


Figure 16.8 The second iterate of the horseshoe map.

Let
$$\Lambda = \Lambda^+ \cap \Lambda^-$$
be the intersection of these two sets. Any point in $\Lambda$ has its entire orbit (both the backward and forward orbit) in $S$.

To introduce symbolic dynamics into this picture, we will assign a doubly infinite sequence of 0's and 1's to each point in $\Lambda$. If $X \in \Lambda$, then, from the above, we have
$$X \in \bigcap_{n=-\infty}^{\infty} F^n(H_0 \cup H_1).$$
Thus we associate to $X$ the itinerary
$$S(X) = (\ldots s_{-2} s_{-1} \cdot s_0 s_1 s_2 \ldots),$$
where $s_j = 0$ or 1, and $s_j = k$ if and only if $F^j(X) \in H_k$. This then provides us with the symbolic dynamics on $\Lambda$. Let $\Sigma_2$ denote the set of all doubly infinite sequences of 0's and 1's:
$$\Sigma_2 = \{(s) = (\ldots s_{-2} s_{-1} \cdot s_0 s_1 s_2 \ldots) \mid s_j = 0 \text{ or } 1\}.$$


We impose a distance function on $\Sigma_2$ by defining
$$d((s), (t)) = \sum_{i=-\infty}^{\infty} \frac{|s_i - t_i|}{2^{|i|}},$$
as in Section 15.5. Thus two sequences in $\Sigma_2$ are "close" if they agree in all spots $k$ where $|k| \le n$ for some (large) $n$. Define the (two-sided) shift map $\sigma$ by
$$\sigma(\ldots s_{-2} s_{-1} \cdot s_0 s_1 s_2 \ldots) = (\ldots s_{-2} s_{-1} s_0 \cdot s_1 s_2 \ldots).$$
That is, $\sigma$ simply shifts each sequence in $\Sigma_2$ one unit to the left (equivalently, $\sigma$ shifts the decimal point one unit to the right). Unlike our previous (one-sided) shift map, this map has an inverse: clearly, shifting one unit to the right gives this inverse. It is easy to check that $\sigma$ is a homeomorphism on $\Sigma_2$ (see Exercise 2 at the end of this chapter).

The shift map is now the model for the restriction of $F$ to $\Lambda$. Indeed, the itinerary map $S$ gives a conjugacy between $F$ on $\Lambda$ and $\sigma$ on $\Sigma_2$. For if $X \in \Lambda$ and $S(X) = (\ldots s_{-2} s_{-1} \cdot s_0 s_1 s_2 \ldots)$, then we have $X \in H_{s_0}$, $F(X) \in H_{s_1}$, $F^{-1}(X) \in H_{s_{-1}}$, and so forth. But then we have $F(X) \in H_{s_1}$, $F(F(X)) \in H_{s_2}$, $X = F^{-1}(F(X)) \in H_{s_0}$, and so forth. This tells us that the itinerary of $F(X)$ is $(\ldots s_{-1} s_0 \cdot s_1 s_2 \ldots)$, so that
$$S(F(X)) = (\ldots s_{-1} s_0 \cdot s_1 s_2 \ldots) = \sigma(S(X)),$$
which is the conjugacy equation. We leave the proof of the fact that $S$ is a homeomorphism to the reader (see Exercise 3).

All of the properties that held for the old one-sided shift from the previous chapter hold for the two-sided shift $\sigma$ as well. For example, there are precisely $2^n$ periodic points of period $n$ for $\sigma$, and there is a dense orbit for $\sigma$. Moreover, $F$ is chaotic on $\Lambda$ (see Exercises 4 and 5).

But new phenomena are present as well. We say that two points $X_1$ and $X_2$ are forward asymptotic if $F^n(X_1), F^n(X_2) \in D$ for all $n \ge 0$ and
$$\lim_{n\to\infty} \left|F^n(X_1) - F^n(X_2)\right| = 0.$$
Points $X_1$ and $X_2$ are backward asymptotic if their backward orbits are defined for all $n$ and the above limit is zero as $n \to -\infty$. Intuitively, two points in $D$ are forward asymptotic if their orbits approach each other as $n \to \infty$. Note that any point that leaves $S$ under forward iteration of $F$ is forward asymptotic to the fixed point $X_0 \in D_1$. Also, if $X_1$ and $X_2$ lie on the same horizontal line in $\Lambda^+$, then $X_1$ and $X_2$ are forward asymptotic. If $X_1$ and $X_2$ lie on the same vertical line in $\Lambda^-$, then they are backward asymptotic.
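The two-sided shift and its metric are easy to model; here is a small sketch (ours, with elements of $\Sigma_2$ modeled as functions on the integers and the truncation level an arbitrary choice):

```python
def sigma(s):
    return lambda i: s(i + 1)       # shift one unit to the left

def d(s, t, N=40):
    """Truncated metric; the neglected tail beyond |i| = N is < 4/2^N."""
    return sum(abs(s(i) - t(i)) / 2 ** abs(i) for i in range(-N, N + 1))

s = lambda i: 0                      # the fixed point (...000.000...)
t = lambda i: 1 if i == 0 else 0     # disagrees with s only at i = 0

print(d(s, t))                   # 1.0: the disagreement is in the central slot
print(d(sigma(s), sigma(t)))     # 0.5: after shifting, it sits at i = -1
```

The inverse shift is just as easy (`lambda i: s(i - 1)`), reflecting the fact that $\sigma$ is a homeomorphism.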


We define the stable set of $X$ to be
$$W^s(X) = \{Z \mid |F^n(Z) - F^n(X)| \to 0 \text{ as } n \to \infty\}.$$
The unstable set of $X$ is given by
$$W^u(X) = \{Z \mid |F^{-n}(X) - F^{-n}(Z)| \to 0 \text{ as } n \to \infty\}.$$
Equivalently, a point $Z$ lies in $W^s(X)$ if $X$ and $Z$ are forward asymptotic. As above, any point in $S$ whose orbit leaves $S$ under forward iteration of the horseshoe map lies in the stable set of the fixed point in $D_1$.

The stable and unstable sets of points in $\Lambda$ are more complicated. For example, consider the fixed point $X^*$, which lies in $H_0$ and therefore has the itinerary $(\ldots 00 \cdot 000 \ldots)$. Any point that lies on the horizontal segment $\ell_s$ through $X^*$ lies in $W^s(X^*)$. But there are many other points in this stable set. Suppose the point $Y$ eventually maps into $\ell_s$. Then there is an integer $n$ such that $|F^n(Y) - X^*| < 1$. Hence
$$|F^{n+k}(Y) - X^*| < \delta^k,$$
and it follows that $Y \in W^s(X^*)$. Thus the union of horizontal intervals given by $F^{-k}(\ell_s)$ for $k = 1, 2, 3, \ldots$ all lie in $W^s(X^*)$. The reader may easily check that there are $2^k$ such intervals.

Since $F(D) \subset D$, the unstable set of the fixed point $X^*$ assumes a somewhat different form. The vertical line segment $\ell_u$ through $X^*$ in $D$ clearly lies in $W^u(X^*)$. As above, all of the forward images of $\ell_u$ also lie in $D$. One may easily check that $F^k(\ell_u)$ is a "snake-like" curve in $D$ that cuts vertically across $S$ exactly $2^k$ times. See Figure 16.9. The union of these forward images is then a very complicated curve that passes through $S$ infinitely often. The closure of this curve in fact contains all points in $\Lambda$ as well as all of their unstable curves (see Exercise 12).

The stable and unstable sets in $\Lambda$ are easy to describe on the shift level. Let
$$s^* = (\ldots s_{-2}^* s_{-1}^* \cdot s_0^* s_1^* s_2^* \ldots) \in \Sigma_2.$$
Clearly, if $t$ is a sequence whose entries agree with those of $s^*$ to the right of some entry, then $t \in W^s(s^*)$. The converse of this is also true, as is shown in Exercise 6.

A natural question that arises is the relationship between the set $\Lambda$ for the one-dimensional logistic map and the corresponding $\Lambda$ for the horseshoe map. Intuitively, it may appear that the $\Lambda$ for the horseshoe has many more points. However, both $\Lambda$'s are actually homeomorphic! This is best seen on the shift level.


Figure 16.9 The unstable set for $X^*$ in $D$.

Let $\Sigma_2^1$ denote the set of one-sided sequences of 0's and 1's and $\Sigma_2$ the set of two-sided such sequences. Define a map
$$\Phi\colon \Sigma_2^1 \to \Sigma_2$$
by
$$\Phi(s_0 s_1 s_2 \ldots) = (\ldots s_5 s_3 s_1 \cdot s_0 s_2 s_4 \ldots).$$
It is easy to check that $\Phi$ is a homeomorphism between $\Sigma_2^1$ and $\Sigma_2$ (see Exercise 11).

Finally, to return to the subject of Section 16.1, note that the return map investigated in that section consists of infinitely many pieces that resemble the horseshoe map of this section. Of course, the horseshoe map here was effectively linear in the region where the map was chaotic, so the results in this section do not go over immediately to prove that the return maps near the homoclinic orbit have similar properties. This can be done; however, the techniques for doing so (involving a generalized notion of hyperbolicity) are beyond the scope of this book. See [13] or [36] for details.
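The interleaving map above is easy to check on finite windows; a sketch (ours, with the window size and sample sequence arbitrary):

```python
# The even-indexed entries s0 s2 s4 ... fill the right half of a two-sided
# sequence and the odd-indexed entries s1 s3 s5 ... fill the left half.

def interleave(s, N):
    """Return the image sequence restricted to indices -N..N-1 as a dict."""
    two_sided = {}
    for i in range(N):
        two_sided[i] = s(2 * i)              # s0, s2, s4, ... to the right
        two_sided[-(i + 1)] = s(2 * i + 1)   # s1, s3, s5, ... to the left
    return two_sided

s = lambda n: n % 2                 # the one-sided sequence 0 1 0 1 ...
w = interleave(s, 4)
print([w[i] for i in range(-4, 4)]) # [1, 1, 1, 1, 0, 0, 0, 0]
```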


16.3 The Double Scroll Attractor

In this section we continue the study of behavior near homoclinic solutions in a three-dimensional system. We return to the system described in Section 16.1, only now we assume that the vector field is skew-symmetric about the origin. In particular, this means that both branches of the unstable curve at the origin, $\zeta^\pm$, now yield homoclinic solutions, as depicted in Figure 16.10.

Figure 16.10 The homoclinic orbits $\zeta^\pm$.

We assume that $\zeta^+$ meets the cylinder $C$ given by $r = 1$, $|z| \le 1$ at the point $\theta = 0$, $z = 0$, so that $\zeta^-$ meets the cylinder at the diametrically opposite point, $\theta = \pi$, $z = 0$.

As in Section 16.1, we have a Poincaré map $\Phi$ defined on the cylinder $C$. This time, however, we cannot disregard solutions that reach $C$ in the region $z < 0$: by the skew-symmetry, solutions starting on $C^-$ pass near the origin and return to $C$ near $\theta = \pi$, just as solutions starting on $C^+$ return near $\theta = 0$. Thus $\Phi$ is defined on all of $C$ except the circle $z = 0$, and the image $\Phi(C^\pm)$ consists of a pair of disks of radius 1/2, one centered at $\theta = 0$, $z = 0$ and the other at $\theta = \pi$, $z = 0$.


Figure 16.11 $\Phi(C^\pm) \cap C$, where we have displayed the cylinder $C$ as a strip.

The centers of these disks do not lie in the image, because these are the points where $\zeta^\pm$ enter $C$. See Figure 16.11.

Now let $X \in C$. Either the solution through $X$ lies on the stable surface of the origin, or else $\Phi(X)$ is defined, so that the solution through $X$ returns to $C$ at some later time. As a consequence, each point $X \in C$ has the property that

1. either the solution through $X$ crosses $C$ infinitely many times as $t \to \infty$, so that $\Phi^n(X)$ is defined for all $n \ge 0$, or
2. the solution through $X$ eventually meets $z = 0$ and hence lies on the stable surface through the origin.

In backward time, the situation is different: only those points that lie in $\Phi(C^\pm)$ have solutions that return to $C$. Strictly speaking, we have not defined the backward solution of points in $C - \Phi(C^\pm)$, but we think of these solutions as being defined in $\mathbb{R}^3$ and eventually meeting $C$, after which time these solutions continually revisit $C$.

As in the case of the Lorenz attractor, we let
$$A = \bigcap_{n=0}^{\infty} \overline{\Phi^n(C)},$$
where $\overline{\Phi^n(C)}$ denotes the closure of the set $\Phi^n(C)$. Then we set
$$\mathcal{A} = \left(\bigcup_{t \in \mathbb{R}} \phi_t(A)\right) \cup \{(0,0,0)\}.$$
Note that $\overline{\Phi^n(C)} - \Phi^n(C)$ is just the two intersection points of the homoclinic solutions $\zeta^\pm$ with $C$. Therefore we only need to add the origin to $\mathcal{A}$ to ensure that $\mathcal{A}$ is a closed set.


The proof of the following result is similar in spirit to the corresponding result for the Lorenz attractor in Section 14.4.

Proposition. The set $\mathcal{A}$ has the following properties:

1. $\mathcal{A}$ is compact and invariant;
2. There is an open set $U$ containing $\mathcal{A}$ such that for each $X \in U$, $\phi_t(X) \in U$ for all $t \ge 0$ and $\bigcap_{t \ge 0} \phi_t(U) = \mathcal{A}$.

Thus $\mathcal{A}$ has all of the properties of an attractor except the transitivity property. Nonetheless, $\mathcal{A}$ is traditionally called a double scroll attractor.

We cannot compute the divergence of the double scroll vector field as we did in the Lorenz case, for the simple reason that we have not written down the formula for this system. However, we do have an expression for the Poincaré map $\Phi$. A straightforward computation shows that $\det D\Phi = 1/8$. That is, the Poincaré map shrinks areas by a factor of 1/8 at each iteration. Hence $A = \bigcap_{n \ge 0} \overline{\Phi^n(C)}$ has area 0 in $C$, and we have:

Proposition. The volume of the double scroll attractor $\mathcal{A}$ is zero.

16.4 Homoclinic Bifurcations

In higher dimensions, bifurcations associated with homoclinic orbits may lead to horribly (or wonderfully, depending on your point of view) complicated behavior. In this section we give a brief indication of some of the ramifications of this type of bifurcation. We deal here with a specific perturbation of the double scroll vector field that breaks both of the homoclinic connections.

The full picture of this bifurcation involves understanding the "unfolding" of infinitely many horseshoe maps. By this we mean the following. Consider a family of maps $F_\lambda$ defined on a rectangle $R$ with parameter $\lambda \in [0,1]$. The image $F_\lambda(R)$ is a horseshoe, as displayed in Figure 16.12. When $\lambda = 0$, $F_\lambda(R)$ lies below $R$. As $\lambda$ increases, $F_\lambda(R)$ rises monotonically. When $\lambda = 1$, $F_\lambda(R)$ crosses $R$ twice, and we assume that $F_1$ is the horseshoe map described in Section 16.2.

Clearly, $F_0$ has no periodic points whatsoever in $R$, but by the time $\lambda$ has reached 1, infinitely many periodic points have been born, and other chaotic behavior has appeared. The family $F_\lambda$ has undergone infinitely many bifurcations en route to the horseshoe map. How these bifurcations occur is the subject of much contemporary research in mathematics.

The situation here is significantly more complex than the bifurcations that occur for the one-dimensional logistic family $f_\lambda(x) = \lambda x(1-x)$ with $0 \le \lambda \le 4$. The bifurcation structure of the logistic family has recently been completely determined; the planar case is far from being resolved.


16.4 Homoclinic Bifurcations

In higher dimensions, bifurcations associated with homoclinic orbits may lead to horribly (or wonderfully, depending on your point of view) complicated behavior. In this section we give a brief indication of some of the ramifications of this type of bifurcation. We deal here with a specific perturbation of the double scroll vector field that breaks both of the homoclinic connections.

The full picture of this bifurcation involves understanding the "unfolding" of infinitely many horseshoe maps. By this we mean the following. Consider a family of maps $F_\lambda$ defined on a rectangle $R$ with parameter $\lambda \in [0, 1]$. The image $F_\lambda(R)$ is a horseshoe, as displayed in Figure 16.12. When $\lambda = 0$, $F_\lambda(R)$ lies below $R$. As $\lambda$ increases, $F_\lambda(R)$ rises monotonically. When $\lambda = 1$, $F_\lambda(R)$ crosses $R$ twice, and we assume that $F_1$ is the horseshoe map described in Section 16.2. (Figure 16.12 shows the images $F_\lambda(R)$ for (a) $\lambda = 0$ and (b) $\lambda = 1$.)

Clearly, $F_0$ has no periodic points whatsoever in $R$, but by the time $\lambda$ has reached 1, infinitely many periodic points have been born, and other chaotic behavior has appeared. The family $F_\lambda$ has undergone infinitely many bifurcations en route to the horseshoe map. How these bifurcations occur is the subject of much contemporary research in mathematics. The situation here is significantly more complex than the bifurcations that occur for the one-dimensional logistic family $f_\lambda(x) = \lambda x(1 - x)$ with $0 \leq \lambda \leq 4$. The bifurcation structure of the logistic family has recently been completely determined; the planar case is far from being resolved.

We now introduce a parameter $\epsilon$ into the double scroll system. When $\epsilon = 0$ the system is the double scroll system considered in the previous section. When $\epsilon \neq 0$ we change the system by simply translating $\zeta^+ \cap C$ (and the corresponding transit map) in the $z$ direction by $\epsilon$. More precisely, we assume that the system remains unchanged in the cylindrical region $r \leq 1$, $|z| \leq 1$, but we change the transit map defined on the upper disk $D^+$ by adding $(0, \epsilon)$ to the image. That is, the new Poincaré map is given on $C^+$ by

$$\Phi_\epsilon(\theta, z) = \left( \frac{\sqrt{z}}{2} \sin\big(\theta + \log(\sqrt{z})\big),\; \frac{\sqrt{z}}{2} \cos\big(\theta + \log(\sqrt{z})\big) + \epsilon \right),$$

and $\Phi_\epsilon^-$ is defined similarly using the skew-symmetry of the system. We further assume that $\epsilon$ is chosen small enough ($|\epsilon| < 1/2$) so that $\Phi_\epsilon^\pm(C) \subset C$.

When $\epsilon > 0$, $\zeta^+$ intersects $C$ in the upper cylindrical region $C^+$ and then, after passing close to the origin, winds around itself before reintersecting $C$ a second time. When $\epsilon < 0$, $\zeta^+$ now meets $C$ in $C^-$ and then takes a very different route back to $C$, this time winding around $\zeta^-$.
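Since $\Phi_\epsilon$ is given explicitly, it is easy to experiment with it numerically. A small sketch (plain numpy; the bookkeeping that stops an orbit once it leaves $C^+$ is our own, not from the text):

```python
import numpy as np

def Phi_eps(theta, z, eps):
    """One iterate of the perturbed Poincare map on C+ (requires z > 0)."""
    alpha = theta + np.log(np.sqrt(z))
    r = np.sqrt(z) / 2
    return r * np.sin(alpha), r * np.cos(alpha) + eps

# Follow an orbit until it leaves C+ (z <= 0 means the corresponding
# solution has hit the stable surface or crossed into C-).
theta, z, eps = 0.0, 0.2, 0.05
for n in range(50):
    theta, z = Phi_eps(theta, z, eps)
    if z <= 0:
        print(f"orbit left C+ after {n + 1} iterates")
        break
else:
    print(f"still in C+ after 50 iterates: theta={theta:.4f}, z={z:.4f}")
```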


Recall that the unperturbed map $\Phi_0^\pm$ has infinitely many fixed points in $C^\pm$. This changes dramatically when $\epsilon \neq 0$.

Proposition. When $\epsilon \neq 0$, the maps $\Phi_\epsilon^\pm$ each have only finitely many fixed points in $C^\pm$.

Proof: To find fixed points of $\Phi_\epsilon^+$, we must solve

$$\theta = \frac{\sqrt{z}}{2} \sin\big(\theta + \log(\sqrt{z})\big), \qquad z = \frac{\sqrt{z}}{2} \cos\big(\theta + \log(\sqrt{z})\big) + \epsilon,$$

where $\epsilon > 0$. As in Section 16.1, we must therefore have

$$\frac{z}{4} = \theta^2 + (z - \epsilon)^2,$$

so that

$$\theta = \pm\frac{1}{2}\sqrt{z - 4(z - \epsilon)^2}.$$

In particular, we must have $z - 4(z - \epsilon)^2 \geq 0$ or, equivalently,

$$\frac{4(z - \epsilon)^2}{z} \leq 1.$$

This inequality holds provided $z$ lies in the interval $I_\epsilon$ defined by

$$\frac{1}{8} + \epsilon - \frac{1}{8}\sqrt{1 + 16\epsilon} \;\leq\; z \;\leq\; \frac{1}{8} + \epsilon + \frac{1}{8}\sqrt{1 + 16\epsilon}.$$

This puts a further restriction on $\epsilon$ for $\Phi_\epsilon^+$ to have fixed points, namely $\epsilon > -1/16$. Note that when $\epsilon > -1/16$ and $\epsilon \neq 0$ we have

$$\frac{1}{8} + \epsilon - \frac{1}{8}\sqrt{1 + 16\epsilon} > 0,$$

so that $I_\epsilon$ has length $\sqrt{1 + 16\epsilon}/4$ and this interval lies to the right of 0.

To determine the $z$-values of the fixed points, we must now solve

$$\cos\left( \pm\frac{1}{2}\sqrt{z - 4(z - \epsilon)^2} + \log(\sqrt{z}) \right) = \frac{2(z - \epsilon)}{\sqrt{z}},$$

or

$$\cos^2\left( \pm\frac{1}{2}\sqrt{z - 4(z - \epsilon)^2} + \log(\sqrt{z}) \right) = \frac{4(z - \epsilon)^2}{z}.$$


With a little calculus, one may check that the function

$$g(z) = \frac{4(z - \epsilon)^2}{z}$$

has a single minimum 0 at $z = \epsilon$ and two maxima equal to 1 at the endpoints of $I_\epsilon$. Meanwhile the graph of

$$h(z) = \cos^2\left( \pm\frac{1}{2}\sqrt{z - 4(z - \epsilon)^2} + \log(\sqrt{z}) \right)$$

oscillates between 0 and 1 only finitely many times in $I_\epsilon$. Hence $h(z) = g(z)$ at only finitely many $z$-values in $I_\epsilon$. These points are the fixed points for $\Phi_\epsilon^\pm$.

Note that, as $\epsilon \to 0$, the interval $I_\epsilon$ tends to $[0, 1/4]$, and so the number of oscillations of $h$ in $I_\epsilon$ increases without bound. Therefore we have:

Corollary. Given $N \in \mathbb{Z}$, there exists $\epsilon_N$ such that if $0 < \epsilon < \epsilon_N$, then $\Phi_\epsilon^+$ has at least $N$ fixed points in $C^+$.

When $\epsilon > 0$, the unstable curve misses the stable surface in its first pass through $C$. Indeed, $\zeta^+$ crosses $C^+$ at $\theta = 0$, $z = \epsilon$. This does not mean that there are no homoclinic orbits when $\epsilon \neq 0$. In fact, we have the following proposition:

Proposition. There are infinitely many values of $\epsilon$ for which $\zeta^\pm$ are homoclinic solutions that pass twice through $C$.

Proof: We need to find values of $\epsilon$ for which $\Phi_\epsilon^+(0, \epsilon)$ lies on the stable surface of the origin. Thus we must solve

$$0 = \frac{\sqrt{\epsilon}}{2} \cos\big(0 - \log(\sqrt{\epsilon})\big) + \epsilon,$$

or

$$-2\sqrt{\epsilon} = \cos\big(-\log(\sqrt{\epsilon})\big).$$

But, as in Section 16.1, the graph of $\cos(-\log\sqrt{\epsilon})$ meets that of $-2\sqrt{\epsilon}$ infinitely often. This completes the proof.
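One can locate a few of these $\epsilon$-values numerically by hunting for sign changes of $\cos(-\log\sqrt{\epsilon}) + 2\sqrt{\epsilon}$. A sketch (scipy's brentq on a logarithmic bracketing grid of our own choosing):

```python
import numpy as np
from scipy.optimize import brentq

# Homoclinic epsilon-values solve cos(-log(sqrt(eps))) + 2*sqrt(eps) = 0.
f = lambda eps: np.cos(-np.log(np.sqrt(eps))) + 2 * np.sqrt(eps)

# Scan a logarithmic grid for sign changes, then refine each root.
grid = np.logspace(-9, -1, 4000)
roots = [brentq(f, a, b) for a, b in zip(grid[:-1], grid[1:])
         if f(a) * f(b) < 0]
print([f"{r:.3e}" for r in roots])
```

Refining the grid toward $\epsilon = 0$ produces more and more roots, in line with the proposition.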


For each of the $\epsilon$-values for which $\zeta^\pm$ is a homoclinic solution, we again have infinitely many fixed points (for $\Phi_\epsilon^\pm \circ \Phi_\epsilon^\pm$) as well as a very different structure for the attractor. Clearly, a lot is happening as $\epsilon$ changes. We invite the reader who has lasted this long with this book to go and figure out everything that is happening here. Good luck! And have fun!

16.5 Exploration: The Chua Circuit

In this exploration, we investigate a nonlinear three-dimensional system of differential equations related to an electrical circuit known as Chua's circuit. These were the first examples of circuit equations to exhibit chaotic behavior. Indeed, for certain values of the parameters these equations exhibit behavior similar to the double scroll attractor in Section 16.3. The original Chua circuit equations possess a piecewise linear nonlinearity; here we investigate a variation in which the nonlinearity is given by a cubic function. For more details on the Chua circuit, refer to [11] and [25].

The nonlinear Chua circuit system is given by

$$\begin{aligned} x' &= a\big(y - \phi(x)\big) \\ y' &= x - y + z \\ z' &= -by, \end{aligned}$$

where $a$ and $b$ are parameters and the function $\phi$ is given by

$$\phi(x) = \frac{1}{16}x^3 - \frac{1}{6}x.$$

Actually, the coefficients of this polynomial are usually regarded as parameters as well, but we fix them for the sake of definiteness in this exploration. When $a = 10.91865\ldots$ and $b = 14$, this system appears to have a pair of symmetric homoclinic orbits, as illustrated in Figure 16.13. (Figure 16.13 shows a pair of homoclinic orbits in the nonlinear Chua system at parameter values $a = 10.91865\ldots$ and $b = 14$.)

The goal of this exploration is to investigate how this system evolves as the parameter $a$ changes; accordingly, we fix the parameter $b$ at 14 and let $a$ vary. We caution the explorer that proving any of the chaotic or bifurcation behavior observed below is nearly impossible; virtually anything that you can do in this regard would qualify as an interesting research result.
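A minimal numerical setup for the experiments below (scipy's solve_ivp; the integration span, tolerances, and initial conditions are merely examples to adapt):

```python
import numpy as np
from scipy.integrate import solve_ivp

def chua(t, X, a, b):
    """Cubic Chua system: x' = a(y - phi(x)), y' = x - y + z, z' = -b y."""
    x, y, z = X
    phi = x**3 / 16 - x / 6
    return [a * (y - phi), x - y + z, -b * y]

a, b = 10.91865, 14.0
# A symmetrically located pair of initial conditions near the origin,
# as item 4 below suggests.
for X0 in ([0.1, 0.0, 0.0], [-0.1, 0.0, 0.0]):
    sol = solve_ivp(chua, (0, 100), X0, args=(a, b),
                    rtol=1e-9, atol=1e-12, dense_output=True)
    print(X0, "->", sol.y[:, -1])
```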


1. As always, begin by finding the equilibrium points.
2. Determine the types of these equilibria, perhaps by using a computer algebra system.
3. This system possesses a symmetry; describe this symmetry and tell what it implies for solutions.
4. Let $a$ vary from 6 to 14. Describe any bifurcations you observe as $a$ varies. Be sure to choose pairs of symmetrically located initial conditions in this and other experiments in order to see the full effect of the bifurcations. Pay particular attention to solutions that begin near the origin.
5. Are there values of $a$ for which there appears to be an attractor for this system? What appears to be happening in this case? Can you construct a model?
6. Describe the bifurcation that occurs near the following $a$-values:
   (a) $a = 6.58$
   (b) $a = 7.3$
   (c) $a = 8.78$
   (d) $a = 10.77$

EXERCISES

1. Prove that
$$d(s, t) = \sum_{i=-\infty}^{\infty} \frac{|s_i - t_i|}{2^{|i|}}$$
is a distance function on $\Sigma_2$, where $\Sigma_2$ is the set of doubly infinite sequences of 0's and 1's as described in Section 16.2.
2. Prove that the shift $\sigma$ is a homeomorphism of $\Sigma_2$.
3. Prove that $S: \Lambda \to \Sigma_2$ gives a conjugacy between $\sigma$ and $F$.
4. Construct a dense orbit for $\sigma$.
5. Prove that periodic points are dense for $\sigma$.
6. Let $s^* \in \Sigma_2$. Prove that $W^s(s^*)$ consists of precisely those sequences whose entries agree with those of $s^*$ to the right of some entry of $s^*$.
7. Let $(0) = (\ldots 00.000\ldots) \in \Sigma_2$. A sequence $s \in \Sigma_2$ is called homoclinic to $(0)$ if $s \in W^s(0) \cap W^u(0)$. Describe the entries of a sequence that is homoclinic to $(0)$. Prove that sequences that are homoclinic to $(0)$ are dense in $\Sigma_2$.


8. Let $(1) = (\ldots 11.111\ldots) \in \Sigma_2$. A sequence $s$ is a heteroclinic sequence if $s \in W^s(0) \cap W^u(1)$. Describe the entries of such a heteroclinic sequence. Prove that such sequences are dense in $\Sigma_2$.
9. Generalize the definitions of homoclinic and heteroclinic points to arbitrary periodic points for $\sigma$, and reprove Exercises 7 and 8 in this case.
10. Prove that the set of points homoclinic to a given periodic point is countable.
11. Let $\Sigma_2^1$ denote the set of one-sided sequences of 0's and 1's. Define $\Phi: \Sigma_2^1 \to \Sigma_2$ by
$$\Phi(s_0 s_1 s_2 \ldots) = (\ldots s_5 s_3 s_1 \,.\, s_0 s_2 s_4 \ldots).$$
Prove that $\Phi$ is a homeomorphism.
12. Let $X^*$ denote the fixed point of $F$ in $H_0$ for the horseshoe map. Prove that the closure of $W^u(X^*)$ contains all points in $\Lambda$ as well as points on their unstable curves.
13. Let $R: \Sigma_2 \to \Sigma_2$ be defined by
$$R(\ldots s_{-2} s_{-1} \,.\, s_0 s_1 s_2 \ldots) = (\ldots s_2 s_1 s_0 \,.\, s_{-1} s_{-2} \ldots).$$
Prove that $R \circ R = \mathrm{id}$ and that $\sigma \circ R = R \circ \sigma^{-1}$. Conclude that $\sigma = U \circ R$, where $U$ is a map that satisfies $U \circ U = \mathrm{id}$. Maps that are their own inverses are called involutions; they represent very simple types of dynamical systems. Hence the shift may be decomposed into a composition of two such maps.
14. Let $s$ be a sequence that is fixed by $R$, where $R$ is as defined in the previous exercise. Suppose that $\sigma^n(s)$ is also fixed by $R$. Prove that $s$ is a periodic point of $\sigma$ of period $2n$.
15. Rework the previous exercise, assuming that $\sigma^n(s)$ is fixed by $U$, where $U$ is given as in Exercise 13. What is the period of $s$?
16. For the Lorenz system in Chapter 14, investigate numerically the bifurcation that takes place for $r$ between 13.92 and 13.96, with $\sigma = 10$ and $b = 8/3$.




17 Existence and Uniqueness Revisited

In this chapter we return to the material presented in Chapter 7, this time filling in all of the technical details and proofs that were omitted earlier. As a result, this chapter is more difficult than the preceding ones; it is, however, central to the rigorous study of ordinary differential equations. To comprehend thoroughly many of the proofs in this section, the reader should be familiar with such topics from real analysis as uniform continuity, uniform convergence of functions, and compact sets.

17.1 The Existence and Uniqueness Theorem

Consider the autonomous system of differential equations

$$X' = F(X),$$

where $F: \mathbb{R}^n \to \mathbb{R}^n$. In previous chapters we usually assumed that $F$ was $C^\infty$; here we relax this condition and assume only that $F$ is $C^1$. Recall that this means that $F$ is continuously differentiable: $F$ and its first partial derivatives exist and are continuous functions on $\mathbb{R}^n$. For the first few sections of this chapter we deal only with autonomous equations; later we assume that $F$ depends on $t$ as well as $X$.


As we know, a solution of this system is a differentiable function $X: J \to \mathbb{R}^n$ defined on some interval $J \subset \mathbb{R}$ such that for all $t \in J$

$$X'(t) = F(X(t)).$$

Geometrically, $X(t)$ is a curve in $\mathbb{R}^n$ whose tangent vector $X'(t)$ equals $F(X(t))$; as in previous chapters, we think of this vector as being based at $X(t)$, so that the map $F: \mathbb{R}^n \to \mathbb{R}^n$ defines a vector field on $\mathbb{R}^n$. An initial condition or initial value for a solution $X: J \to \mathbb{R}^n$ is a specification of the form $X(t_0) = X_0$, where $t_0 \in J$ and $X_0 \in \mathbb{R}^n$. For simplicity, we usually take $t_0 = 0$.

A nonlinear differential equation may have several solutions that satisfy a given initial condition. For example, consider the first-order nonlinear differential equation

$$x' = 3x^{2/3}.$$

In Chapter 7 we saw that the identically zero function $u_0: \mathbb{R} \to \mathbb{R}$ given by $u_0(t) \equiv 0$ is a solution satisfying the initial condition $u(0) = 0$. But $u_1(t) = t^3$ is also a solution satisfying this initial condition, and, in addition, for any $\tau > 0$, the function given by

$$u_\tau(t) = \begin{cases} 0 & \text{if } t \leq \tau \\ (t - \tau)^3 & \text{if } t > \tau \end{cases}$$

is also a solution satisfying the initial condition $u_\tau(0) = 0$.

Besides uniqueness, there is also the question of existence of solutions. When we dealt with linear systems, we were able to compute solutions explicitly. For nonlinear systems this is often not possible, as we have seen. Moreover, certain initial conditions may not give rise to any solutions. For example, as we saw in Chapter 7, the differential equation

$$x' = \begin{cases} 1 & \text{if } x < 0 \\ -1 & \text{if } x \geq 0 \end{cases}$$

has no solution satisfying the initial condition $x(0) = 0$. (Here, of course, the right-hand side is not even continuous.)
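The non-uniqueness of the family $u_\tau$ above is easy to confirm numerically as well. A small sketch (pure numpy; it checks on a grid that each $u_\tau$ satisfies $x' = 3x^{2/3}$ with $x(0) = 0$, up to finite-difference error near the corner at $t = \tau$):

```python
import numpy as np

def u(t, tau):
    """The solution u_tau: identically 0 until time tau, then (t - tau)^3."""
    return np.where(t <= tau, 0.0, (t - tau)**3)

t = np.linspace(0, 3, 3001)
for tau in (0.5, 1.0, 2.0):
    x = u(t, tau)
    residual = np.gradient(x, t) - 3 * np.cbrt(x)**2   # x' - 3 x^(2/3)
    print(f"tau={tau}: x(0)={x[0]}, max residual ~ {np.abs(residual).max():.2e}")
```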


The following is the fundamental local theorem of ordinary differential equations.

The Existence and Uniqueness Theorem. Consider the initial value problem

$$X' = F(X), \quad X(0) = X_0,$$

where $X_0 \in \mathbb{R}^n$. Suppose that $F: \mathbb{R}^n \to \mathbb{R}^n$ is $C^1$. Then there exists a unique solution of this initial value problem. More precisely, there exist $a > 0$ and a unique solution

$$X: (-a, a) \to \mathbb{R}^n$$

of this differential equation satisfying the initial condition $X(0) = X_0$.

We will prove this theorem in the next section.

17.2 Proof of Existence and Uniqueness

We need to recall some multivariable calculus. Let $F: \mathbb{R}^n \to \mathbb{R}^n$. In coordinates $(x_1, \ldots, x_n)$ on $\mathbb{R}^n$, we write

$$F(X) = \big( f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n) \big).$$

Let $DF_X$ be the derivative of $F$ at the point $X \in \mathbb{R}^n$. We may view this derivative in two slightly different ways. From one point of view, $DF_X$ is a linear map defined for each point $X \in \mathbb{R}^n$; this linear map assigns to each vector $U \in \mathbb{R}^n$ the vector

$$DF_X(U) = \lim_{h \to 0} \frac{F(X + hU) - F(X)}{h},$$

where $h \in \mathbb{R}$. Equivalently, from the matrix point of view, $DF_X$ is the $n \times n$ Jacobian matrix

$$DF_X = \left( \frac{\partial f_i}{\partial x_j} \right),$$

where each derivative is evaluated at $(x_1, \ldots, x_n)$. Thus the derivative may be viewed as a function that associates a linear map or matrix to each point in $\mathbb{R}^n$; that is, $DF: \mathbb{R}^n \to L(\mathbb{R}^n)$.


As earlier, the function $F$ is said to be continuously differentiable, or $C^1$, if all of the partial derivatives of the $f_j$ exist and are continuous. We will assume for the remainder of this chapter that $F$ is $C^1$. For each $X \in \mathbb{R}^n$, we define the norm $|DF_X|$ of the Jacobian matrix $DF_X$ by

$$|DF_X| = \sup_{|U| = 1} |DF_X(U)|,$$

where $U \in \mathbb{R}^n$. Note that $|DF_X|$ is not necessarily the magnitude of the largest eigenvalue of the Jacobian matrix at $X$.

Example. Suppose

$$DF_X = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}.$$

Then, indeed, $|DF_X| = 2$, and 2 is the largest eigenvalue of $DF_X$. However, if

$$DF_X = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},$$

then

$$|DF_X| = \sup_{0 \leq \theta \leq 2\pi} \left| \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \right| = \sup_{0 \leq \theta \leq 2\pi} \sqrt{(\cos\theta + \sin\theta)^2 + \sin^2\theta} = \sup_{0 \leq \theta \leq 2\pi} \sqrt{1 + 2\cos\theta\sin\theta + \sin^2\theta} > 1,$$

whereas 1 is the largest eigenvalue.

We do, however, have

$$|DF_X(V)| \leq |DF_X| \, |V|$$

for any vector $V \in \mathbb{R}^n$. Indeed, if we write $V = (V/|V|)\,|V|$, then

$$|DF_X(V)| = \big| DF_X(V/|V|) \big| \, |V| \leq |DF_X| \, |V|,$$

since $V/|V|$ has magnitude 1. Moreover, the fact that $F: \mathbb{R}^n \to \mathbb{R}^n$ is $C^1$ implies that the function $\mathbb{R}^n \to L(\mathbb{R}^n)$ sending $X \mapsto DF_X$ is continuous.
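Numerically, this norm is the largest singular value of the matrix, which numpy computes directly; a quick check of the example above (our own snippet, not from the text):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
op_norm = np.linalg.norm(A, 2)             # operator norm = largest singular value
spec_rad = max(abs(np.linalg.eigvals(A)))  # largest eigenvalue magnitude
print(op_norm, spec_rad)  # ~1.618 vs 1.0: the norm exceeds the spectral radius
```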


Let $\mathcal{O} \subset \mathbb{R}^n$ be an open set. A function $F: \mathcal{O} \to \mathbb{R}^n$ is said to be Lipschitz on $\mathcal{O}$ if there exists a constant $K$ such that

$$|F(Y) - F(X)| \leq K|Y - X|$$

for all $X, Y \in \mathcal{O}$. We call $K$ a Lipschitz constant for $F$. More generally, we say that $F$ is locally Lipschitz if each point in $\mathcal{O}$ has a neighborhood $\mathcal{O}'$ in $\mathcal{O}$ such that the restriction of $F$ to $\mathcal{O}'$ is Lipschitz. The Lipschitz constant of $F|\mathcal{O}'$ may vary with the neighborhood $\mathcal{O}'$.

Another important notion is that of compactness. We say that a set $C \subset \mathbb{R}^n$ is compact if $C$ is closed and bounded. An important fact is that if $f: C \to \mathbb{R}$ is continuous and $C$ is compact, then, first of all, $f$ is bounded on $C$ and, secondly, $f$ actually attains its maximum on $C$. See Exercise 13 at the end of this chapter.

Lemma. Suppose that the function $F: \mathcal{O} \to \mathbb{R}^n$ is $C^1$. Then $F$ is locally Lipschitz.

Proof: Suppose that $F: \mathcal{O} \to \mathbb{R}^n$ is $C^1$ and let $X_0 \in \mathcal{O}$. Let $\epsilon > 0$ be so small that the closed ball $\mathcal{O}_\epsilon$ of radius $\epsilon$ about $X_0$ is contained in $\mathcal{O}$. Let $K$ be an upper bound for $|DF_X|$ on $\mathcal{O}_\epsilon$; this bound exists because $DF_X$ is continuous and $\mathcal{O}_\epsilon$ is compact. The set $\mathcal{O}_\epsilon$ is convex; that is, if $Y, Z \in \mathcal{O}_\epsilon$, then the straight-line segment connecting $Y$ to $Z$ is contained in $\mathcal{O}_\epsilon$. This segment is given by $Y + sU \in \mathcal{O}_\epsilon$, where $U = Z - Y$ and $0 \leq s \leq 1$. Let $\psi(s) = F(Y + sU)$. Using the chain rule we find

$$\psi'(s) = DF_{Y + sU}(U).$$

Therefore

$$F(Z) - F(Y) = \psi(1) - \psi(0) = \int_0^1 \psi'(s)\,ds = \int_0^1 DF_{Y + sU}(U)\,ds.$$

Thus we have

$$|F(Z) - F(Y)| \leq \int_0^1 K|U|\,ds = K|Z - Y|.$$

The following remark is implicit in the proof of the lemma: if $\mathcal{O}$ is convex, and if $|DF_X| \leq K$ for all $X \in \mathcal{O}$, then $K$ is a Lipschitz constant for $F|\mathcal{O}$.


Suppose that $J$ is an open interval containing zero and that $X: J \to \mathcal{O}$ satisfies

$$X'(t) = F(X(t))$$

with $X(0) = X_0$. Integrating, we have

$$X(t) = X_0 + \int_0^t F(X(s))\,ds.$$

This is the integral form of the differential equation $X' = F(X)$. Conversely, if $X: J \to \mathcal{O}$ satisfies this integral equation, then $X(0) = X_0$ and $X$ satisfies $X' = F(X)$, as is seen by differentiation. Thus the integral and differential forms of this equation are equivalent as equations for $X: J \to \mathcal{O}$. To prove the existence of solutions, we will use the integral form of the differential equation.

We now proceed with the proof of existence. Here are our assumptions:

1. $\mathcal{O}_\rho$ is the closed ball of radius $\rho > 0$ centered at $X_0$.
2. There is a Lipschitz constant $K$ for $F$ on $\mathcal{O}_\rho$.
3. $|F(X)| \leq M$ on $\mathcal{O}_\rho$.
4. Choose $a < \min\{\rho/M, 1/K\}$ and let $J = [-a, a]$.

We will need the following standard fact from analysis.

Lemma (from analysis). Suppose $U_k: J \to \mathbb{R}^n$ is a sequence of continuous functions defined on a closed interval $J$ that satisfy the following: given $\epsilon > 0$, there is some $N > 0$ such that for every $p, q > N$,

$$\max_{t \in J} |U_p(t) - U_q(t)| < \epsilon.$$

Then there is a continuous function $U: J \to \mathbb{R}^n$ such that

$$\max_{t \in J} |U_k(t) - U(t)| \to 0 \quad \text{as } k \to \infty.$$

Moreover, for any $t$ with $|t| \leq a$,

$$\lim_{k \to \infty} \int_0^t U_k(s)\,ds = \int_0^t U(s)\,ds.$$


This type of convergence is called uniform convergence of the functions $U_k$. This lemma is proved in elementary analysis books and will not be proved here; see [38].

The sequence of functions $U_k$ is defined recursively using an iteration scheme known as Picard iteration. We gave several illustrative examples of this iterative scheme back in Chapter 7. Let

$$U_0(t) \equiv X_0.$$

For $t \in J$ define

$$U_1(t) = X_0 + \int_0^t F(U_0(s))\,ds = X_0 + tF(X_0).$$

Since $|t| \leq a$ and $|F(X_0)| \leq M$, it follows that

$$|U_1(t) - X_0| = |t| \, |F(X_0)| \leq aM \leq \rho,$$

so that $U_1(t) \in \mathcal{O}_\rho$ for all $t \in J$. By induction, assume that $U_k(t)$ has been defined and that $|U_k(t) - X_0| \leq \rho$ for all $t \in J$. Then let

$$U_{k+1}(t) = X_0 + \int_0^t F(U_k(s))\,ds.$$

This makes sense since $U_k(s) \in \mathcal{O}_\rho$, so the integrand is defined. We show that $|U_{k+1}(t) - X_0| \leq \rho$, so that $U_{k+1}(t) \in \mathcal{O}_\rho$ for $t \in J$; this will imply that the sequence can be continued to $U_{k+2}$, $U_{k+3}$, and so on. This is shown as follows:

$$|U_{k+1}(t) - X_0| \leq \int_0^t |F(U_k(s))|\,ds \leq \int_0^t M\,ds \leq Ma < \rho.$$

Next, we prove that there is a constant $L \geq 0$ such that, for all $k \geq 0$,

$$|U_{k+1}(t) - U_k(t)| \leq (aK)^k L.$$

Let $L$ be the maximum of $|U_1(t) - U_0(t)|$ over $-a \leq t \leq a$. By the above, $L \leq aM$. We have

$$|U_2(t) - U_1(t)| = \left| \int_0^t F(U_1(s)) - F(U_0(s))\,ds \right| \leq \int_0^t K|U_1(s) - U_0(s)|\,ds \leq aKL.$$


Assuming by induction that, for some $k \geq 2$, we have already proved

$$|U_k(t) - U_{k-1}(t)| \leq (aK)^{k-1} L$$

for $|t| \leq a$, we then have

$$|U_{k+1}(t) - U_k(t)| \leq \int_0^t |F(U_k(s)) - F(U_{k-1}(s))|\,ds \leq K \int_0^t |U_k(s) - U_{k-1}(s)|\,ds \leq (aK)(aK)^{k-1}L = (aK)^k L.$$

Let $\alpha = aK$, so that $\alpha < 1$ by assumption. Given any $\epsilon > 0$, we may choose $N$ large enough so that, for any $r > s > N$, we have

$$|U_r(t) - U_s(t)| \leq \sum_{k=N}^{\infty} |U_{k+1}(t) - U_k(t)| \leq \sum_{k=N}^{\infty} \alpha^k L \leq \epsilon,$$

since the tail of the geometric series may be made as small as we please.

By the lemma from analysis, this shows that the sequence of functions $U_0, U_1, \ldots$ converges uniformly to a continuous function $X: J \to \mathbb{R}^n$. From the identity

$$U_{k+1}(t) = X_0 + \int_0^t F(U_k(s))\,ds,$$

we find by taking limits of both sides that

$$X(t) = X_0 + \lim_{k \to \infty} \int_0^t F(U_k(s))\,ds = X_0 + \int_0^t \Big( \lim_{k \to \infty} F(U_k(s)) \Big)\,ds = X_0 + \int_0^t F(X(s))\,ds.$$


The second equality also follows from the lemma from analysis (using the uniform convergence of the $U_k$ and the continuity of $F$). Therefore $X: J \to \mathcal{O}_\rho$ satisfies the integral form of the differential equation and hence is a solution of the equation itself. In particular, it follows that $X: J \to \mathcal{O}_\rho$ is $C^1$.

This takes care of the existence part of the theorem. Now we turn to the uniqueness part.

Suppose that $X, Y: J \to \mathcal{O}$ are two solutions of the differential equation satisfying $X(0) = Y(0) = X_0$, where, as above, $J$ is the closed interval $[-a, a]$. We will show that $X(t) = Y(t)$ for all $t \in J$. Let

$$Q = \max_{t \in J} |X(t) - Y(t)|.$$

This maximum is attained at some point $t_1 \in J$. Then

$$Q = |X(t_1) - Y(t_1)| = \left| \int_0^{t_1} \big( X'(s) - Y'(s) \big)\,ds \right| \leq \int_0^{t_1} |F(X(s)) - F(Y(s))|\,ds \leq \int_0^{t_1} K|X(s) - Y(s)|\,ds \leq aKQ.$$

Since $aK < 1$, this is impossible unless $Q = 0$. Therefore $X(t) \equiv Y(t)$, and this completes the proof of the theorem.

To summarize this result, we have shown: given any ball $\mathcal{O}_\rho \subset \mathcal{O}$ of radius $\rho$ about $X_0$ on which

1. $|F(X)| \leq M$;
2. $F$ has Lipschitz constant $K$; and
3. $0 < a < \min\{\rho/M, 1/K\}$,

there is a unique solution $X: [-a, a] \to \mathcal{O}$ of the differential equation satisfying $X(0) = X_0$.
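The Picard scheme is also a perfectly practical computation. A sketch (numpy only; the integral is discretized by the trapezoid rule on a grid, which is our choice and not part of the proof) for $x' = x$, $x(0) = 1$, where the $k$th iterate is the degree-$k$ Taylor polynomial of $e^t$:

```python
import numpy as np

def picard(F, x0, t, iterations):
    """Iterate U_{k+1}(t) = x0 + integral_0^t F(U_k(s)) ds on the grid t."""
    dt = t[1] - t[0]
    U = np.full_like(t, x0)
    for _ in range(iterations):
        f = F(U)
        # cumulative trapezoid rule for the integral from 0 to each t
        integral = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dt)))
        U = x0 + integral
    return U

t = np.linspace(0.0, 1.0, 1001)
for k in (1, 3, 5, 8):
    err = np.abs(picard(lambda x: x, 1.0, t, k) - np.exp(t)).max()
    print(f"{k} iterations: max error {err:.2e}")
```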


Some remarks are in order. First note that two solution curves of $X' = F(X)$ cannot cross if $F$ satisfies the hypotheses of the theorem. This is an immediate consequence of uniqueness, but it is worth emphasizing geometrically. Suppose $X: J \to \mathcal{O}$ and $Y: J_1 \to \mathcal{O}$ are two solutions of $X' = F(X)$ for which $X(t_1) = Y(t_2)$. If $t_1 = t_2$ we are done immediately by the theorem. If $t_1 \neq t_2$, then let $Y_1(t) = Y(t_2 - t_1 + t)$. Then $Y_1$ is also a solution of the system. Since $Y_1(t_1) = Y(t_2) = X(t_1)$, it follows that $Y_1$ and $X$ agree near $t_1$ by the uniqueness statement of the theorem, and hence so do $X(t)$ and $Y(t)$.

We emphasize the point that if $Y(t)$ is a solution, then so too is $Y_1(t) = Y(t + t_1)$ for any constant $t_1$. In particular, if a solution curve $X: J \to \mathcal{O}$ of $X' = F(X)$ satisfies $X(t_1) = X(t_1 + w)$ for some $t_1$ and $w > 0$, then that solution curve must in fact be a periodic solution in the sense that $X(t + w) = X(t)$ for all $t$.

17.3 Continuous Dependence on Initial Conditions

For the existence and uniqueness theorem to be at all interesting in any physical or even mathematical sense, the result needs to be complemented by the property that the solution $X(t)$ depends continuously on the initial condition $X(0)$. The next theorem gives a precise statement of this property.

Theorem. Let $\mathcal{O} \subset \mathbb{R}^n$ be open and suppose $F: \mathcal{O} \to \mathbb{R}^n$ has Lipschitz constant $K$. Let $Y(t)$ and $Z(t)$ be solutions of $X' = F(X)$ that remain in $\mathcal{O}$ and are defined on the interval $[t_0, t_1]$. Then, for all $t \in [t_0, t_1]$, we have

$$|Y(t) - Z(t)| \leq |Y(t_0) - Z(t_0)| \exp\big(K(t - t_0)\big).$$

Note that this result says that if the solutions $Y(t)$ and $Z(t)$ start out close together, then they remain close together for $t$ near $t_0$. While these solutions may separate from each other, they do so no faster than exponentially. In particular, we have this corollary:

Corollary. (Continuous Dependence on Initial Conditions) Let $\phi(t, X)$ be the flow of the system $X' = F(X)$, where $F$ is $C^1$. Then $\phi$ is a continuous function of $X$.
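The exponential bound is easy to watch in an example. A sketch (scipy; the pendulum-like system $x' = y$, $y' = -\sin x$ is our own choice of test case, for which $K = 1$ is a valid global Lipschitz constant in the Euclidean norm):

```python
import numpy as np
from scipy.integrate import solve_ivp

F = lambda t, X: [X[1], -np.sin(X[0])]   # Lipschitz with K = 1

t_eval = np.linspace(0, 10, 201)
Y = solve_ivp(F, (0, 10), [1.0, 0.0],  t_eval=t_eval, rtol=1e-10, atol=1e-12).y
Z = solve_ivp(F, (0, 10), [1.0, 1e-6], t_eval=t_eval, rtol=1e-10, atol=1e-12).y

gap = np.linalg.norm(Y - Z, axis=0)
bound = 1e-6 * np.exp(1.0 * t_eval)      # |Y(0) - Z(0)| exp(K t)
print(np.all(gap <= bound + 1e-12))      # the theorem's bound holds: True
```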


The proof of the theorem depends on a famous inequality that we prove first.

Gronwall's Inequality. Let $u: [0, \alpha] \to \mathbb{R}$ be continuous and nonnegative. Suppose $C \geq 0$ and $K \geq 0$ are such that

$$u(t) \leq C + \int_0^t K u(s)\,ds$$

for all $t \in [0, \alpha]$. Then, for all $t$ in this interval,

$$u(t) \leq C e^{Kt}.$$

Proof: Suppose first that $C > 0$. Let

$$U(t) = C + \int_0^t K u(s)\,ds > 0.$$

Then $u(t) \leq U(t)$. Differentiating $U$, we find $U'(t) = K u(t)$. Therefore

$$\frac{U'(t)}{U(t)} = \frac{K u(t)}{U(t)} \leq K.$$

Hence

$$\frac{d}{dt}\big( \log U(t) \big) \leq K,$$

so that $\log U(t) \leq \log U(0) + Kt$ by integration. Since $U(0) = C$, we have by exponentiation

$$U(t) \leq C e^{Kt},$$

and so $u(t) \leq C e^{Kt}$.

If $C = 0$, we may apply the above argument to a sequence of positive $c_i$ that tends to 0 as $i \to \infty$. This proves Gronwall's inequality.


We turn now to the proof of the theorem.

Proof: Define

$$v(t) = |Y(t) - Z(t)|.$$

Since

$$Y(t) - Z(t) = Y(t_0) - Z(t_0) + \int_{t_0}^t \big( F(Y(s)) - F(Z(s)) \big)\,ds,$$

we have

$$v(t) \leq v(t_0) + \int_{t_0}^t K v(s)\,ds.$$

Now apply Gronwall's inequality to the function $u(t) = v(t + t_0)$ to get

$$u(t) = v(t + t_0) \leq v(t_0) + \int_{t_0}^{t + t_0} K v(s)\,ds = v(t_0) + \int_0^t K u(\tau)\,d\tau,$$

so $v(t + t_0) \leq v(t_0)\exp(Kt)$, or $v(t) \leq v(t_0)\exp\big(K(t - t_0)\big)$, which is just the conclusion of the theorem.

As we have seen, differential equations that arise in applications often depend on parameters. For example, the harmonic oscillator equations depend on the parameters $b$ (the damping constant) and $k$ (the spring constant); circuit equations depend on the resistance, capacitance, and inductance; and so forth. The natural question is how solutions of these equations depend on these parameters. As in the previous case, solutions depend continuously on these parameters provided that the system depends on the parameters in a continuously differentiable fashion. We can see this easily by using a special little trick. Suppose the system

$$X' = F_a(X)$$

depends on the parameter $a$ in a $C^1$ fashion. Consider the "artificially" augmented system of differential equations

$$\begin{aligned} x_1' &= f_1(x_1, \ldots, x_n, a) \\ &\ \,\vdots \\ x_n' &= f_n(x_1, \ldots, x_n, a) \\ a' &= 0. \end{aligned}$$

This is now an autonomous system of $n + 1$ differential equations. Although this expansion of the system may seem trivial, we may now invoke the previous result about continuous dependence of solutions on initial conditions to verify that solutions of the original system depend continuously on $a$ as well.

Theorem. (Continuous Dependence on Parameters) Let $X' = F_a(X)$ be a system of differential equations for which $F_a$ is continuously differentiable in both $X$ and $a$. Then the flow of this system depends continuously on $a$ as well as on $X$.
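The augmentation trick is also how one treats a parameter in software when it must be handled like part of the state. A sketch (scipy; the logistic equation $x' = ax(1 - x)$ is our own example choice):

```python
import numpy as np
from scipy.integrate import solve_ivp

def augmented(t, w):
    """Logistic equation x' = a x (1 - x), with the parameter a carried
    as an extra state variable satisfying a' = 0."""
    x, a = w
    return [a * x * (1 - x), 0.0]

# Nearby parameter values enter as nearby initial conditions (x0, a0),
# so continuous dependence on initial conditions covers the parameter too.
for a0 in (1.0, 1.001):
    sol = solve_ivp(augmented, (0, 5), [0.1, a0], rtol=1e-9)
    print(f"a = {a0}: x(5) = {sol.y[0, -1]:.6f}")
```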


17.4 Extending Solutions

Suppose we have two solutions $Y(t)$ and $Z(t)$ of the differential equation $X' = F(X)$, where $F$ is $C^1$. Suppose also that $Y(t)$ and $Z(t)$ satisfy $Y(t_0) = Z(t_0)$ and that both solutions are defined on an interval $J$ about $t_0$. The existence and uniqueness theorem guarantees that $Y(t) = Z(t)$ for all $t$ in an interval about $t_0$, which may a priori be smaller than $J$. However, this is not the case. To see this, suppose that $J^*$ is the largest interval on which $Y(t) = Z(t)$. If $J^* \neq J$, there is an endpoint $t_1$ of $J^*$ with $t_1 \in J$. By continuity, we have $Y(t_1) = Z(t_1)$. Now the uniqueness part of the theorem guarantees that, in fact, $Y(t)$ and $Z(t)$ agree on an interval containing $t_1$. This contradicts the assertion that $J^*$ is the largest interval on which the two solutions agree.

Thus we can always assume that we have a unique solution defined on a maximal time domain. There is, however, no guarantee that a solution $X(t)$ can be defined for all time. For example, the differential equation

$$x' = 1 + x^2$$

has as solutions the functions $x(t) = \tan(t - c)$ for any constant $c$. Such a function cannot be extended over an interval larger than

$$c - \frac{\pi}{2} < t < c + \frac{\pi}{2},$$

since $x(t) \to \pm\infty$ as $t \to c \pm \pi/2$.
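A numerical integrator sees this blowup as a failure to reach the end of the requested time interval. A quick illustration (scipy; our own snippet, and the exact stopping behavior near the singularity depends on the solver):

```python
import numpy as np
from scipy.integrate import solve_ivp

f = lambda t, x: [1 + x[0]**2]          # x' = 1 + x^2; solution tan(t)

ok = solve_ivp(f, (0, 1.55), [0.0], rtol=1e-10, atol=1e-10)
print(ok.y[0, -1], np.tan(1.55))        # agree: ~48.08

bad = solve_ivp(f, (0, 3.0), [0.0], rtol=1e-10, atol=1e-10)
print(bad.status, bad.t[-1])            # typically stalls just short of pi/2
```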


The basic result on extending solutions says that a solution on a maximal open interval of definition must leave every compact set:

Theorem. Let $\mathcal{O} \subset \mathbb{R}^n$ be open and let $F: \mathcal{O} \to \mathbb{R}^n$ be $C^1$. Let $Y(t)$ be a solution of $X' = F(X)$ defined on a maximal open interval $J = (\alpha, \beta)$ with $\beta < \infty$. Then, given any compact set $C \subset \mathcal{O}$, there is some $t \in (\alpha, \beta)$ with $Y(t) \notin C$.

This theorem says that if a solution $Y(t)$ cannot be extended to a larger time interval, then this solution leaves any compact set in $\mathcal{O}$. This implies that, as $t \to \beta$, either $Y(t)$ accumulates on the boundary of $\mathcal{O}$ or else a subsequence $|Y(t_i)|$ tends to $\infty$ (or both).

Proof: Suppose $Y(t) \in C$ for all $t \in (\alpha, \beta)$. Since $F$ is continuous and $C$ is compact, there exists $M > 0$ such that $|F(X)| \leq M$ for all $X \in C$.

Let $\gamma \in (\alpha, \beta)$. We claim that $Y$ extends to a continuous function $Y: [\gamma, \beta] \to C$. To see this, it suffices to prove that $Y$ is uniformly continuous on $J$. For $t_0 < t_1$ in $J$ we have

$$|Y(t_1) - Y(t_0)| = \left| \int_{t_0}^{t_1} Y'(s)\,ds \right| \leq \int_{t_0}^{t_1} |F(Y(s))|\,ds \leq M(t_1 - t_0),$$

so $Y$ is uniformly (indeed, Lipschitz) continuous on $J$ and therefore extends continuously to $[\gamma, \beta]$, with $Y(\beta) \in C$ since $C$ is closed. Now apply the local existence and uniqueness theorem at the point $Y(\beta) \in \mathcal{O}$: there is a solution through $Y(\beta)$ defined on an interval about $\beta$, and by uniqueness it agrees with $Y$ where both are defined. Thus, for some $\delta > \beta$, we can extend $Y$ to the interval $(\alpha, \delta)$. Hence $(\alpha, \beta)$ could not have been a maximal domain of a solution. This completes the proof of the theorem.


The following important fact follows immediately from this theorem.

Corollary. Let $C$ be a compact subset of the open set $\mathcal{O} \subset \mathbb{R}^n$ and let $F: \mathcal{O} \to \mathbb{R}^n$ be $C^1$. Let $Y_0 \in C$ and suppose that every solution curve of the form $Y: [0, \beta] \to \mathcal{O}$ with $Y(0) = Y_0$ lies entirely in $C$. Then there is a solution $Y: [0, \infty) \to \mathcal{O}$ satisfying $Y(0) = Y_0$ and $Y(t) \in C$ for all $t \geq 0$, so this solution is defined for all (forward) time.

Given these results, we can now give a slightly stronger theorem on the continuity of solutions in terms of initial conditions than the result discussed in Section 17.3. In that section we assumed that both solutions were defined on the same interval. In the next theorem we drop this requirement. The theorem shows that solutions starting at nearby points are defined on the same closed interval and also remain close to each other on this interval.

Theorem. Let $F: \mathcal{O} \to \mathbb{R}^n$ be $C^1$. Let $Y(t)$ be a solution of $X' = F(X)$ that is defined on the closed interval $[t_0, t_1]$, with $Y(t_0) = Y_0$. There is a neighborhood $U \subset \mathbb{R}^n$ of $Y_0$ and a constant $K$ such that, if $Z_0 \in U$, then there is a unique solution $Z(t)$ also defined on $[t_0, t_1]$ with $Z(t_0) = Z_0$. Moreover, $Z$ satisfies

$$|Y(t) - Z(t)| \leq K|Y_0 - Z_0| \exp\big(K(t - t_0)\big)$$

for all $t \in [t_0, t_1]$.

For the proof we will need the following lemma.

Lemma. If $F: \mathcal{O} \to \mathbb{R}^n$ is locally Lipschitz and $C \subset \mathcal{O}$ is a compact set, then $F|C$ is Lipschitz.

Proof: Suppose not. Then for every $k > 0$, no matter how large, we can find $X$ and $Y$ in $C$ with

$$|F(X) - F(Y)| > k|X - Y|.$$

In particular, we can find $X_n, Y_n$ such that

$$|F(X_n) - F(Y_n)| \geq n|X_n - Y_n| \quad \text{for } n = 1, 2, \ldots.$$

Since $C$ is compact, we can choose convergent subsequences of the $X_n$ and $Y_n$. Relabeling, we may assume $X_n \to X^*$ and $Y_n \to Y^*$ with $X^*$ and $Y^*$ in $C$. Note that we must have $X^* = Y^*$, since, for all $n$,

$$|X^* - Y^*| = \lim_{n \to \infty} |X_n - Y_n| \leq \lim_{n \to \infty} \frac{|F(X_n) - F(Y_n)|}{n} \leq \lim_{n \to \infty} \frac{2M}{n} = 0,$$


where $M$ is the maximum value of $|F(X)|$ on $C$. There is a neighborhood $\mathcal{O}_0$ of $X^*$ on which $F|\mathcal{O}_0$ has Lipschitz constant $K$. Also there is an $n_0$ such that $X_n, Y_n \in \mathcal{O}_0$ if $n \geq n_0$. Therefore, for $n \geq n_0$,

$$|F(X_n) - F(Y_n)| \leq K|X_n - Y_n|,$$

which contradicts the assertion above once $n > \max\{n_0, K\}$. This proves the lemma.

Proof of the theorem: By compactness of $[t_0, t_1]$, there exists $\epsilon > 0$ such that $X \in \mathcal{O}$ if $|X - Y(t)| \leq \epsilon$ for some $t \in [t_0, t_1]$. The set of all such points is a compact subset $C$ of $\mathcal{O}$. The $C^1$ map $F$ is locally Lipschitz, as we saw in Section 17.2. By the lemma, it follows that $F|C$ has a Lipschitz constant $K$.

Let $\delta > 0$ be so small that $\delta \leq \epsilon$ and $\delta \exp(K|t_1 - t_0|) \leq \epsilon$. We claim that if $|Z_0 - Y_0| < \delta$, then there is a unique solution through $Z_0$ defined on all of $[t_0, t_1]$. First of all, $Z_0 \in \mathcal{O}$ since $|Z_0 - Y(t_0)| < \epsilon$, so there is a solution $Z(t)$ through $Z_0$ on a maximal interval $[t_0, \beta)$. We claim that $\beta > t_1$. For suppose $\beta \leq t_1$. Then, by Gronwall's inequality, for all $t \in [t_0, \beta)$ we have

$$|Z(t) - Y(t)| \leq |Z_0 - Y_0| \exp(K|t - t_0|) \leq \delta \exp(K|t - t_0|) \leq \epsilon.$$

Thus $Z(t)$ lies in the compact set $C$. By the results above, $[t_0, \beta)$ could not be a maximal solution domain. Therefore $Z(t)$ is defined on $[t_0, t_1]$. The uniqueness of $Z(t)$ then follows immediately. This completes the proof.

17.5 Nonautonomous Systems

We turn our attention briefly in this section to nonautonomous differential equations. Even though our main emphasis in this book has been on autonomous equations, the theory of nonautonomous (linear) equations is needed as a technical device for establishing the differentiability of autonomous flows.

Let $\mathcal{O} \subset \mathbb{R} \times \mathbb{R}^n$ be an open set, and let $F: \mathcal{O} \to \mathbb{R}^n$ be a function that is $C^1$ in $X$ but perhaps only continuous in $t$. Let $(t_0, X_0) \in \mathcal{O}$. Consider the nonautonomous differential equation

$$X'(t) = F(t, X), \quad X(t_0) = X_0.$$


As usual, a solution of this system is a differentiable curve $X(t)$ in $\mathbb{R}^n$ defined for $t$ in some interval $J$ having the following properties:

1. $t_0 \in J$ and $X(t_0) = X_0$;
2. $(t, X(t)) \in \mathcal{O}$ and $X'(t) = F(t, X(t))$ for all $t \in J$.

The fundamental local theorem for nonautonomous equations is as follows:

Theorem. Let $\mathcal{O} \subset \mathbb{R} \times \mathbb{R}^n$ be open and $F: \mathcal{O} \to \mathbb{R}^n$ a function that is $C^1$ in $X$ and continuous in $t$. If $(t_0, X_0) \in \mathcal{O}$, there is an open interval $J$ containing $t_0$ and a unique solution of $X' = F(t, X)$ defined on $J$ and satisfying $X(t_0) = X_0$.

The proof is the same as that of the fundamental theorem for autonomous equations (Section 17.2), the extra variable $t$ being inserted where appropriate.

An important corollary of this result follows:

Corollary. Let $A(t)$ be a continuous family of $n \times n$ matrices defined on an interval $J$, and let $(t_0, X_0) \in J \times \mathbb{R}^n$. Then the initial value problem

$$X' = A(t)X, \quad X(t_0) = X_0$$

has a unique solution on all of $J$.

For the proof, see Exercise 14 at the end of the chapter.

We call the function $F(t, X)$ Lipschitz in $X$ if there is a constant $K \geq 0$ such that

$$|F(t, X_1) - F(t, X_2)| \leq K|X_1 - X_2|$$

for all $(t, X_1)$ and $(t, X_2)$ in $\mathcal{O}$. Locally Lipschitz in $X$ is defined analogously. As in the autonomous case, solutions of nonautonomous equations are continuous with respect to initial conditions if $F(t, X)$ is locally Lipschitz in $X$. We leave the precise formulation and proof of this fact to the reader.
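Numerically, nonautonomous linear systems are handled exactly like autonomous ones, with $t$ passed through to $A(t)$. A sketch (scipy; the particular family $A(t)$ is an arbitrary choice of ours for illustration):

```python
import numpy as np
from scipy.integrate import solve_ivp

def A(t):
    """An arbitrary continuous family of 2x2 matrices."""
    return np.array([[0.0, 1.0],
                     [-1.0 - 0.5 * np.sin(t), 0.0]])

rhs = lambda t, X: A(t) @ X
sol = solve_ivp(rhs, (0, 20), [1.0, 0.0], rtol=1e-9, atol=1e-9)
print(sol.y[:, -1])   # the solution exists on the whole interval, as the corollary promises
```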


A different kind of continuity is continuity of solutions as functions of the data $F(t, X)$. That is, if $F: \mathcal{O} \to \mathbb{R}^n$ and $G: \mathcal{O} \to \mathbb{R}^n$ are both $C^1$ in $X$, and $|F - G|$ is uniformly small, we expect solutions to $X' = F(t, X)$ and $Y' = G(t, Y)$, having the same initial values, to be close. This is true; in fact, we have the following more precise result:

Theorem. Let $\mathcal{O} \subset \mathbb{R} \times \mathbb{R}^n$ be an open set containing $(0, X_0)$ and suppose that $F, G: \mathcal{O} \to \mathbb{R}^n$ are $C^1$ in $X$ and continuous in $t$. Suppose also that for all $(t, X) \in \mathcal{O}$,

$$|F(t, X) - G(t, X)| < \epsilon.$$

Let $K$ be a Lipschitz constant in $X$ for $F(t, X)$. If $X(t)$ and $Y(t)$ are solutions of the equations $X' = F(t, X)$ and $Y' = G(t, Y)$, respectively, on some interval $J$, and $X(0) = X_0 = Y(0)$, then

$$|X(t) - Y(t)| \leq \frac{\epsilon}{K} \big( \exp(K|t|) - 1 \big)$$

for all $t \in J$.

Proof: For $t \in J$ we have

$$X(t) - Y(t) = \int_0^t \big( X'(s) - Y'(s) \big)\,ds = \int_0^t \big( F(s, X(s)) - G(s, Y(s)) \big)\,ds.$$

Hence

$$|X(t) - Y(t)| \leq \int_0^t |F(s, X(s)) - F(s, Y(s))|\,ds + \int_0^t |F(s, Y(s)) - G(s, Y(s))|\,ds \leq \int_0^t K|X(s) - Y(s)|\,ds + \int_0^t \epsilon\,ds.$$

Let $u(t) = |X(t) - Y(t)|$. Then

$$u(t) \leq K \int_0^t \left( u(s) + \frac{\epsilon}{K} \right) ds,$$

so that

$$u(t) + \frac{\epsilon}{K} \leq \frac{\epsilon}{K} + K \int_0^t \left( u(s) + \frac{\epsilon}{K} \right) ds.$$

It follows from Gronwall's inequality that

$$u(t) + \frac{\epsilon}{K} \leq \frac{\epsilon}{K} \exp(K|t|),$$

which yields the theorem.

17.6 Differentiability of the Flow

Now we return to the case of an autonomous differential equation $X' = F(X)$, where $F$ is assumed to be $C^1$. Our aim is to show that the flow $\phi(t, X) = \phi_t(X)$ determined by this equation is a $C^1$ function of the two variables, and to identify $\partial\phi/\partial X$.


We know, of course, that $\phi$ is continuously differentiable in the variable $t$, so it suffices to prove differentiability in $X$.

Toward that end, let $X(t)$ be a particular solution of the system defined for $t$ in a closed interval $J$ about 0. Suppose $X(0) = X_0$. For each $t \in J$ let

$$A(t) = DF_{X(t)}.$$

That is, $A(t)$ denotes the Jacobian matrix of $F$ at the point $X(t)$. Since $F$ is $C^1$, $A(t)$ is continuous. We define the nonautonomous linear equation

$$U' = A(t)U.$$

This equation is known as the variational equation along the solution $X(t)$. From the previous section we know that the variational equation has a solution on all of $J$ for every initial condition $U(0) = U_0$. Also, as in the autonomous case, solutions of this system satisfy the linearity principle.

The significance of this equation is that, if $U_0$ is small, then the function

$$t \mapsto X(t) + U(t)$$

is a good approximation to the solution of the original autonomous equation with initial value $X(0) = X_0 + U_0$.

To make this precise, suppose that $U(t, \xi)$ is the solution of the variational equation that satisfies $U(0, \xi) = \xi$, where $\xi \in \mathbb{R}^n$. If $\xi$ and $X_0 + \xi$ belong to $\mathcal{O}$, let $Y(t, \xi)$ be the solution of the autonomous equation $X' = F(X)$ that satisfies $Y(0) = X_0 + \xi$.

Proposition. Let $J$ be the closed interval containing 0 on which $X(t)$ is defined. Then

$$\frac{|Y(t, \xi) - X(t) - U(t, \xi)|}{|\xi|}$$

converges to 0 as $\xi \to 0$, uniformly for $t \in J$.

This means that for every $\epsilon > 0$, there exists $\delta > 0$ such that if $|\xi| \leq \delta$, then

$$\big| Y(t, \xi) - \big( X(t) + U(t, \xi) \big) \big| \leq \epsilon|\xi|$$

for all $t \in J$. Thus, as $\xi \to 0$, the curve $t \mapsto X(t) + U(t, \xi)$ is a better and better approximation to $Y(t, \xi)$. In many applications $X(t) + U(t, \xi)$ is used in place of $Y(t, \xi)$; this is convenient because $U(t, \xi)$ is linear in $\xi$.
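One can check the proposition numerically by integrating the variational equation alongside the original system. A sketch (scipy; the pendulum vector field is again merely our test case):

```python
import numpy as np
from scipy.integrate import solve_ivp

F  = lambda X: np.array([X[1], -np.sin(X[0])])
DF = lambda X: np.array([[0.0, 1.0], [-np.cos(X[0]), 0.0]])

def coupled(t, w):
    """Integrate X' = F(X) together with U' = DF_{X(t)} U."""
    X, U = w[:2], w[2:]
    return np.concatenate([F(X), DF(X) @ U])

X0 = np.array([1.0, 0.0])
xi = np.array([1e-4, -2e-4])

both = solve_ivp(coupled, (0, 5), np.concatenate([X0, xi]),
                 rtol=1e-10, atol=1e-12).y[:, -1]
Y    = solve_ivp(lambda t, X: F(X), (0, 5), X0 + xi,
                 rtol=1e-10, atol=1e-12).y[:, -1]

# The error of the linear approximation X + U is O(|xi|^2), far below |xi|:
print(np.linalg.norm(Y - (both[:2] + both[2:])))
```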


We will prove the proposition momentarily, but first we use this result to prove the following theorem:

Theorem. (Smoothness of Flows) The flow $\phi(t, X)$ of the autonomous system $X' = F(X)$ is a $C^1$ function; that is, $\partial\phi/\partial t$ and $\partial\phi/\partial X$ exist and are continuous in $t$ and $X$.

Proof: Of course, $\partial\phi(t, X)/\partial t$ is just $F(\phi_t(X))$, which is continuous. To compute $\partial\phi/\partial X$ we have, for small $\xi$,

$$\phi(t, X_0 + \xi) - \phi(t, X_0) = Y(t, \xi) - X(t).$$

The proposition now implies that $\partial\phi(t, X_0)/\partial X$ is the linear map $\xi \mapsto U(t, \xi)$. The continuity of $\partial\phi/\partial X$ is then a consequence of the continuity in initial conditions and data of solutions of the variational equation.

Denoting the flow again by $\phi_t(X)$, we note that for each $t$ the derivative $D\phi_t(X)$ of the map $\phi_t$ at $X \in \mathcal{O}$ is the same as $\partial\phi(t, X)/\partial X$. We call this the space derivative of the flow, as opposed to the time derivative $\partial\phi(t, X)/\partial t$.

The proof of the preceding theorem actually shows that $D\phi_t(X)$ is the solution of an initial value problem in the space of linear maps on $\mathbb{R}^n$: for each $X_0 \in \mathcal{O}$ the space derivative of the flow satisfies the differential equation

$$\frac{d}{dt}\big( D\phi_t(X_0) \big) = DF_{\phi_t(X_0)} \, D\phi_t(X_0),$$

with the initial condition $D\phi_0(X_0) = I$. Here we may regard $X_0$ as a parameter.

An important special case is that of an equilibrium solution $\bar{X}$, so that $\phi_t(\bar{X}) \equiv \bar{X}$. Putting $DF_{\bar{X}} = A$, we get the differential equation

$$\frac{d}{dt}\big( D\phi_t(\bar{X}) \big) = A \, D\phi_t(\bar{X}),$$

with $D\phi_0(\bar{X}) = I$. The solution of this equation is

$$D\phi_t(\bar{X}) = \exp(tA).$$

This means that, in a neighborhood of an equilibrium point, the flow is approximately linear.

We now prove the proposition. The integral equations satisfied by $X(t)$, $Y(t, \xi)$, and $U(t, \xi)$ are

$$X(t) = X_0 + \int_0^t F(X(s))\,ds,$$


$$Y(t, \xi) = X_0 + \xi + \int_0^t F(Y(s, \xi))\,ds,$$

$$U(t, \xi) = \xi + \int_0^t DF_{X(s)}\big( U(s, \xi) \big)\,ds.$$

From these we get, for $t \geq 0$,

$$|Y(t, \xi) - X(t) - U(t, \xi)| \leq \int_0^t \big| F(Y(s, \xi)) - F(X(s)) - DF_{X(s)}\big( U(s, \xi) \big) \big|\,ds.$$

The Taylor approximation of $F$ at a point $Z$ says

$$F(Y) = F(Z) + DF_Z(Y - Z) + R(Z, Y - Z),$$

where

$$\lim_{Y \to Z} \frac{R(Z, Y - Z)}{|Y - Z|} = 0$$

uniformly in $Z$ for $Z$ in a given compact set. We apply this to $Y = Y(s, \xi)$, $Z = X(s)$. From the linearity of $DF_{X(s)}$ we get

$$|Y(t, \xi) - X(t) - U(t, \xi)| \leq \int_0^t \big| DF_{X(s)}\big( Y(s, \xi) - X(s) - U(s, \xi) \big) \big|\,ds + \int_0^t \big| R\big( X(s), Y(s, \xi) - X(s) \big) \big|\,ds.$$

Denote the left side of this expression by $g(t)$ and set

$$N = \max\{ |DF_{X(s)}| : s \in J \}.$$

Then we have

$$g(t) \leq N \int_0^t g(s)\,ds + \int_0^t \big| R\big( X(s), Y(s, \xi) - X(s) \big) \big|\,ds.$$

Fix $\epsilon > 0$ and pick $\delta_0 > 0$ so small that

$$\big| R\big( X(s), Y(s, \xi) - X(s) \big) \big| \leq \epsilon \, |Y(s, \xi) - X(s)|$$

if $|Y(s, \xi) - X(s)| \leq \delta_0$ and $s \in J$.

From Section 17.3 there are constants $K \geq 0$ and $\delta_1 > 0$ such that

$$|Y(s, \xi) - X(s)| \leq |\xi| e^{Ks} \leq \delta_0$$

if $|\xi| \leq \delta_1$ and $s \in J$.


Assume now that $|\xi| \leq \delta_1$. From the previous equation, we find, for $t \in J$,

$$g(t) \leq N \int_0^t g(s)\,ds + \int_0^t \epsilon |\xi| e^{Ks}\,ds,$$

so that

$$g(t) \leq N \int_0^t g(s)\,ds + C\epsilon|\xi|$$

for some constant $C$ depending only on $K$ and the length of $J$. Applying Gronwall's inequality, we obtain

$$g(t) \leq C\epsilon e^{Nt} |\xi|$$

if $t \in J$ and $|\xi| \leq \delta_1$. (Recall that $\delta_1$ depends on $\epsilon$.) Since $\epsilon$ is any positive number, this shows that $g(t)/|\xi| \to 0$ uniformly in $t \in J$, which proves the proposition.

EXERCISES

1. Write out the first few terms of the Picard iteration scheme for each of the following initial value problems. Where possible, use any method to find explicit solutions. Discuss the domain of the solution.
   (a) $x' = x - 2$; $x(0) = 1$
   (b) $x' = x^{4/3}$; $x(0) = 0$
   (c) $x' = x^{4/3}$; $x(0) = 1$
   (d) $x' = \cos x$; $x(0) = 0$
   (e) $x' = 1/(2x)$; $x(1) = 1$
2. Let $A$ be an $n \times n$ matrix. Show that the Picard method for solving $X' = AX$, $X(0) = X_0$ gives the solution $\exp(tA)X_0$.
3. Derive the Taylor series for $\cos t$ by applying the Picard method to the first-order system corresponding to the second-order initial value problem
$$x'' = -x; \quad x(0) = 1, \quad x'(0) = 0.$$
4. For each of the following functions, find a Lipschitz constant on the region indicated, or prove there is none:
   (a) $f(x) = |x|$, $-\infty < x < \infty$


   (c) $f(x) = 1/x$, $1 \leq x < \infty$
   (d) $f(x, y) = (x + 2y, -y)$, $(x, y) \in \mathbb{R}^2$
   (e) $f(x, y) = \dfrac{xy}{1 + x^2 + y^2}$, $x^2 + y^2 \leq 4$
5. Consider the differential equation $x' = x^{1/3}$. How many different solutions satisfy $x(0) = 0$?
6. What can be said about solutions of the differential equation $x' = x/t$?
7. Define $f: \mathbb{R} \to \mathbb{R}$ by $f(x) = 1$ if $x \leq 1$ and $f(x) = 2$ if $x > 1$. What can be said about solutions of $x' = f(x)$ satisfying $x(0) = 1$, where the right-hand side of the differential equation is discontinuous? What happens if you have instead $f(x) = 0$ if $x > 1$?
8. Let $A(t)$ be a continuous family of $n \times n$ matrices and let $P(t)$ be the matrix solution to the initial value problem $P' = A(t)P$, $P(0) = P_0$. Show that
$$\det P(t) = (\det P_0)\exp\left( \int_0^t \operatorname{Tr} A(s)\,ds \right).$$
9. Suppose $F$ is a gradient vector field. Show that $|DF_X|$ is the magnitude of the largest eigenvalue of $DF_X$. (Hint: $DF_X$ is a symmetric matrix.)
10. Show that there is no solution to the second-order two-point boundary value problem
$$x'' = -x, \quad x(0) = 0, \quad x(\pi) = 1.$$
11. What happens if you replace the differential equation in the previous exercise by $x'' = -kx$ with $k > 0$?
12. Prove the following general fact (see also Section 17.3): if $C \geq 0$ and $u, v: [0, \beta] \to \mathbb{R}$ are continuous and nonnegative, and
$$u(t) \leq C + \int_0^t u(s)v(s)\,ds$$
for all $t \in [0, \beta]$, then $u(t) \leq C e^{V(t)}$, where
$$V(t) = \int_0^t v(s)\,ds.$$
13. Suppose $C \subset \mathbb{R}^n$ is compact and $f: C \to \mathbb{R}$ is continuous. Prove that $f$ is bounded on $C$ and that $f$ attains its maximum value at some point in $C$.


14. Let $A(t)$ be a continuous family of $n \times n$ matrices defined on an interval $J$, and let $(t_0, X_0) \in J \times \mathbb{R}^n$. Prove that the initial value problem
$$X' = A(t)X, \quad X(t_0) = X_0$$
has a unique solution defined on all of $J$.
15. In a lengthy essay not to exceed 50 pages, describe the behavior of all solutions of the system $X' = 0$, where $X \in \mathbb{R}^n$. Ah, yes. Another free and final gift from the Math Department.


Bibliography

1. Abraham, R., and Marsden, J. Foundations of Mechanics. Reading, MA: Benjamin-Cummings, 1978.
2. Abraham, R., and Shaw, C. Dynamics: The Geometry of Behavior. Redwood City, CA: Addison-Wesley, 1992.
3. Alligood, K., Sauer, T., and Yorke, J. Chaos: An Introduction to Dynamical Systems. New York: Springer-Verlag, 1997.
4. Afraimovich, V. S., and Shil'nikov, L. P. Strange attractors and quasiattractors. In Nonlinear Dynamics and Turbulence. Boston: Pitman, 1983, 1.
5. Arnold, V. I. Ordinary Differential Equations. Cambridge: MIT Press, 1973.
6. Arnold, V. I. Mathematical Methods of Classical Mechanics. New York: Springer-Verlag, 1978.
7. Arrowsmith, D., and Place, C. An Introduction to Dynamical Systems. Cambridge: Cambridge University Press, 1990.
8. Banks, J., et al. On Devaney's definition of chaos. Amer. Math. Monthly 99 (1992), 332.
9. Birman, J. S., and Williams, R. F. Knotted periodic orbits in dynamical systems I: Lorenz's equations. Topology 22 (1983), 47.
10. Blanchard, P., Devaney, R. L., and Hall, G. R. Differential Equations. Pacific Grove, CA: Brooks-Cole, 2002.
11. Chua, L., Komuro, M., and Matsumoto, T. The double scroll family. IEEE Trans. on Circuits and Systems 33 (1986), 1073.
12. Coddington, E., and Levinson, N. Theory of Ordinary Differential Equations. New York: McGraw-Hill, 1955.
13. Devaney, R. L. An Introduction to Chaotic Dynamical Systems. Boulder, CO: Westview Press, 1989.
14. Devaney, K. Math texts and digestion. J. Obesity 23 (2002), 1.8.
15. Edelstein-Keshet, L. Mathematical Models in Biology. New York: McGraw-Hill, 1987.


16. Ermentrout, G. B., and Kopell, N. Oscillator death in systems of coupled neural oscillators. SIAM J. Appl. Math. 50 (1990), 125.
17. Field, R., and Burger, M., eds. Oscillations and Traveling Waves in Chemical Systems. New York: Wiley, 1985.
18. Fitzhugh, R. Impulses and physiological states in theoretical models of nerve membrane. Biophys. J. 1 (1961), 445.
19. Golubitsky, M., Josić, K., and Kaper, T. An unfolding theory approach to bursting in fast-slow systems. In Global Theory of Dynamical Systems. Bristol, UK: Institute of Physics, 2001, 277.
20. Guckenheimer, J., and Williams, R. F. Structural stability of Lorenz attractors. Publ. Math. IHES 50 (1979), 59.
21. Guckenheimer, J., and Holmes, P. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. New York: Springer-Verlag, 1983.
22. Gutzwiller, M. The anisotropic Kepler problem in two dimensions. J. Math. Phys. 14 (1973), 139.
23. Hodgkin, A. L., and Huxley, A. F. A quantitative description of membrane current and its application to conduction and excitation in nerves. J. Physiol. 117 (1952), 500.
24. Katok, A., and Hasselblatt, B. Introduction to the Modern Theory of Dynamical Systems. Cambridge, UK: Cambridge University Press, 1995.
25. Khibnik, A., Roose, D., and Chua, L. On periodic orbits and homoclinic bifurcations in Chua's circuit with a smooth nonlinearity. Int. J. Bifurcation and Chaos 3 (1993), 363.
26. Kraft, R. Chaos, Cantor sets, and hyperbolicity for the logistic maps. Amer. Math. Monthly 106 (1999), 400.
27. Lengyel, I., Rabai, G., and Epstein, I. Experimental and modeling study of oscillations in the chlorine dioxide–iodine–malonic acid reaction. J. Amer. Chem. Soc. 112 (1990), 9104.
28. Liapunov, A. M. The General Problem of Stability of Motion. London: Taylor & Francis, 1992.
29. Lorenz, E. Deterministic nonperiodic flow. J. Atmos. Sci. 20 (1963), 130.
30. Marsden, J. E., and McCracken, M. The Hopf Bifurcation and Its Applications. New York: Springer-Verlag, 1976.
31. May, R. M. Theoretical Ecology: Principles and Applications. Oxford: Blackwell, 1981.
32. McGehee, R. Triple collision in the collinear three body problem. Inventiones Math. 27 (1974), 191.
33. Moeckel, R. Chaotic dynamics near triple collision. Arch. Rational Mech. Anal. 107 (1989), 37.
34. Murray, J. D. Mathematical Biology. Berlin: Springer-Verlag, 1993.
35. Nagumo, J. S., Arimoto, S., and Yoshizawa, S. An active pulse transmission line stimulating nerve axon. Proc. IRE 50 (1962), 2061.
36. Robinson, C. Dynamical Systems: Stability, Symbolic Dynamics, and Chaos. Boca Raton, FL: CRC Press, 1995.
37. Rössler, O. E. An equation for continuous chaos. Phys. Lett. A 57 (1976), 397.


38. Rudin, W. Principles of Mathematical Analysis. New York: McGraw-Hill, 1976.
39. Schneider, G., and Wayne, C. E. Kawahara dynamics in dispersive media. Phys. D 152 (2001), 384.
40. Shil'nikov, L. P. A case of the existence of a countable set of periodic motions. Sov. Math. Dokl. 6 (1965), 163.
41. Shil'nikov, L. P. Chua's circuit: Rigorous results and future problems. Int. J. Bifurcation and Chaos 4 (1994), 489.
42. Siegel, C., and Moser, J. Lectures on Celestial Mechanics. Berlin: Springer-Verlag, 1971.
43. Smale, S. Diffeomorphisms with many periodic points. In Differential and Combinatorial Topology. Princeton, NJ: Princeton University Press, 1965, 63.
44. Sparrow, C. The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors. New York: Springer-Verlag, 1982.
45. Strogatz, S. Nonlinear Dynamics and Chaos. Reading, MA: Addison-Wesley, 1994.
46. Tucker, W. The Lorenz attractor exists. C. R. Acad. Sci. Paris Sér. I Math. 328 (1999), 1197.
47. Winfree, A. T. The prehistory of the Belousov-Zhabotinsky reaction. J. Chem. Educ. 61 (1984), 661.




IndexPage numbers followed by “f ” denote figures.Aalgebra<strong>eq</strong>uations, 26–29linear. see linear algebraangular momentumconservation of, 283–284<strong>def</strong>inition of, 283anisotropic Kepler problem, 298–299answersto all exercises, 1000.8, viiareal velocity, 284asymptotic stability, 175asymptotically stable, 175attracting fixed point, 329attractorchaotic, 319–324description of, 310–311double scroll, 372–375Lorenz. see Lorenz attractorRössler, 324–325autonomous, 5, 22Bbackward asymptotic, 370backward orbit, 320–321, 368basic regions, 190basin of attraction, 194, 200basis, 90Belousov-Zhabotinsky reaction, 231bifurcationcriterion, 333<strong>def</strong>inition of, 4, 8–9discrete dynamical systems, 332–335exchange, 334heteroclinic, 192–194homoclinic, 375–378Hopf, 181–182, 270–271, 308nonlinear systems, 176–182period doubling, 334, 335fpitchfork, 178–179, 179fsaddle-node, 177–178, 179f–180f, 181, 332tangent, 332bifurcation diagram, 8biological applicationscompetition and harvesting, 252–253competitive species, 246–252infectious diseases, 235–239predator/prey systems, 239–246blowing up the singularity, 293–297Bob. See MoeCcanonical form, 49, 67, 84, 98, 111, 115Cantor middle-thirds set, 349–352capacitance, 260carrying capacity, 335Cauchy-Riemann <strong>eq</strong>uations, 184center<strong>def</strong>inition of, 44–47spiral, 112fcenter of mass, 293central force fields, 281–284changing coordinates, 49–57411


412 Indexchaoscubic, 352description of, 337–342chaotic, 323chaotic attractor, 319–324characteristic, 259characteristic <strong>eq</strong>uation, 84chemical reactions, 230–231Chua circuit, 359, 379–380circuit theory applicationsdescription of, 257neurodynamics, 272–273RLC circuit, 23, 257–261classical mechanics, 277closed orbitconcepts for, 220–221<strong>def</strong>inition of, 215Poincaré map, 221coefficient matrix, 29collision surface, 295collision-ejection orbits, 288collision-ejection solutions, 296–297compact, 387compact set, 228–229competitive species applications, 246–252complex eigenvalues, 44–47, 54–55,86–89configuration space, 278conjugacy <strong>eq</strong>uation, 340, 370conjugate, 340conservation of angular momentum,283–284conservation of energy, 281conservative systems, 280–281constant coefficient <strong>eq</strong>uation, 25constant harvesting, 7–8constant of the motion, 208continuity, 70continuous dependenceof solutions, 147–149on initial conditions, 147–149,392–395on parameters, 149, 395continuous dynamical system, 141contracting condition, 316coordinates, changing of, 49–57critical point, 205cross product, 279cubic chaos, 352curve<strong>def</strong>inition of, 25local unstable, 169zero velocity, 286, 287fDdamping constant, 26dense set, 101dense subset, 101determinant, 27, 81diffeomorphism, 66difference <strong>eq</strong>uation, 336differentiability of the flow, 400–404dimension, 91direction field, 24, 191fdiscrete dynamical systemsbifurcations, 332–335Cantor middle-thirds set, 349–352chaos, 337–342description of, 141, 327discrete logistic model, 335–337introduction to, 327–332one-dimensional, 320, 329orbits on, 329planar, 362shift map, 347–349symbolic dynamics, 342–347discrete logistic model, 335–337discrete logistic population model, 336distance function, 344–345distinct eigenvalues, 75, 107–113divergence of vector field, 309dot product, 279double scroll attractor, 372–375doubling map, 338–339dynamical systemsdiscrete. see discrete dynamical systemsnonlinear, 140–142planar, 63–71, 222–225Eeccentricity, 292eigenvaluescomplex, 44–47, 54–55, 86–89<strong>def</strong>inition of, 30, 83distinct, 75, 107–113eigenvectors and, relationship between, 30–33real distinct, 39–44, 51–53repeated, 47–49, 56–57, 95–101, 119–122eigenvectors<strong>def</strong>inition of, 30, 83eigenvalues and, relationship between, 30–33elementary matrix, 79–80elementary row operations, 79energy, 280energy surface, 286


Index 413<strong>eq</strong>uationsalgebra, 26–29Cauchy-Riemann, 184characteristic, 84conjugacy, 340, 370constant coefficient, 25difference, 336first-order. see first-order <strong>eq</strong>uationsharmonic oscillator, 25–26homogeneous, 25, 131Lienard, 261–262Newton’s, 23nonautonomous, 10, 149–150, 160, 398–400second-order, 23van der Pol, 261–270variational, 149–153, 172, 401<strong>eq</strong>uilibrium point<strong>def</strong>inition of, 2, 22, 174–175isolated, 206stability of, 194<strong>eq</strong>uilibrium solution, 2errata. See Killer Devaney, viiiEuler’s method, 154exchange bifurcation, 334existence and uniqueness theoremdescription of, 142–147, 383–385proof of, 385–392expanding direction, 316exponential growth model, 335exponential of a matrix, 123–130extending solutions, 395–398Ffacesauthors’, 64masks, 313Faraday’s law, 259first integral, 208first-order <strong>eq</strong>uationsdescription of, 1example of, 1–4logistic population model, 4–7Poincaré map. see Poincaré mapfixed pointattracting, 329description of, 328repelling, 330source, 330z-values, 377flow<strong>def</strong>inition of, 12, 64–65differentiability of, 400–404gradient, 204smoothness of, 149–150, 402flow box, 218–220force fieldcentral, 281–284<strong>def</strong>inition of, 277forced harmonic oscillator, 23, 131forcing term, 130forward asymptotic, 370forward orbit, 320, 367free gift, 155, 406functiondistance, 344–345Hamiltonian, 281Liapunov, 195, 200, 307total energy, 197Ggeneral solution, 34generic property, 104genericity, 101–104gradient, 204gradient systems, 203–207graphical iteration, 329Gronwall’s in<strong>eq</strong>uality, 393–394, 398HHamiltonian function, 281Hamiltonian systems, 207–210, 281harmonic oscillatordescription of, 114–119<strong>eq</strong>uation for, 25–26forced, 23, 131two-dimensional, 278undamped, 55, 114, 208harvestingconstant, 7–8periodic, 9–11heteroclinic bifurcation, 192–194heteroclinic solutions, 193higher dimensional saddles, 173–174Hodgkin-Huxley system, 272homeomorphism, 65, 69, 345homoclinic bifurcations, 375–378homoclinic orbits, 209, 361homoclinic solutions, 209, 226, 363fhomogeneous <strong>eq</strong>uation, 25, 131Hopf bifurcation, 181–182, 270–271, 308horseshoe map, 359, 366–372hyperbolic, 66, 166hyperbolicity condition, 316


414 IndexIideal pendulum, 208–209improved Euler’s method, 154inductance, 260infectious diseases, 235–239initial conditionscontinuous dependence on, 147–149,392–395<strong>def</strong>inition of, 142, 384sensitive dependence on, 305, 321initial value, 142, 384initial value problem, 2, 142inner product, 279invariant, 181, 199inverse matrix, 50, 79inverse square law, 285invertibility criterion, 82–83invertible, 50, 79–80irrational rotation, 118itinerary, 343, 369JJacobian matrix, 149–150, 166, 401KKepler’s lawsfirst law, 289–292proving of, 284kernel, 92, 96Killer Devaney, viii, 407kinetic energy, 280, 290Kirchhoff’s current law, 258, 260Kirchhoff’s voltage law, 259LLasalle’s invariance principle,200–203latus rectum, 292law of gravitation, 285, 292lemma, 126–127, 318, 388–392Liapunov exponents, 323Liapunov function, 195, 200, 307Liapunov stability, 194Lienard <strong>eq</strong>uation, 261–262limit cycle, 227–228limit sets, 215–218linear algebradescription of, 75higher dimensional, 75–105preliminaries from, 75–83linear combination, 28linear map, 50, 88, 92–93, 107linear pendulum, 196linear systemsdescription of, 29–30higher dimensional, 107–138nonautonomous, 130–135solving of, 33–36three-parameter family of, 71linear transformation, 92linearity principle, 36linearization theorem, 168linearized system near X 0 , 166linearly dependent, 27, 76linearly independent, 27, 76Lipschitz constant, 387, 398Lipschitz in X, 399local section, 218local unstable curve, 169locally Lipschitz, 387logistic map, 336logistic population model, 4–7, 335Lorenz attractor<strong>def</strong>inition of, 305, 305fmodel for, 314–319properties of, 310–313, 374Lorenz modeldescription of, 304dynamics of, 323–324Lorenz systembehavior of, 310history of, 303–304introduction to, 304–305linearization, 312properties of, 306–310Lorenz vector field, 306Mmatrixarithmetic, 78elementary, 79–80exponential of, 123–130Jacobian, 149–150, 166, 401reduced row echelon form of, 79square, 95sums, 78symmetric, 207matrix formcanonical, 49, 67, 84, 98, 111, 115description of, 27mechanical system with n degrees of freedom,278


mechanics
    classical, 277
    conservative systems, 280–281
    Newton's second law, 277–280
method of undetermined coefficients, 130
metric, 344
minimal period n, 328
Moe. See Steve
momentum vector, 281
monotone sequences, 222–225

N
n-cycles, 328
near-collision solutions, 297
neat picture, 305
neurodynamics, 272–273
Newtonian central force system
    description of, 285–289
    problems, 297–298
    singularity, 293
Newton's equation, 23
Newton's law of gravitation, 196, 285, 292
Newton's second law, 277–280
nonautonomous differential equations, 10, 149–150, 160, 398–400
nonautonomous linear systems, 130–135
nonlinear pendulum, 196–199
nonlinear saddle, 168–174
nonlinear sink, 165–168
nonlinear source, 165–168
nonlinear systems
    bifurcations, 176–182
    continuous dependence of solutions, 147–149
    description of, 139–140
    dynamical, 140–142
    equilibria in, 159–187
    existence and uniqueness theorem, 142–147
    global techniques, 189–214
    numerical methods, 153–156
    saddle, 168–174
    stability, 174–176
    variational equation, 149–153
nullclines, 189–194

O
Ohm's law, 259–260
one-dimensional discrete dynamical system, 320
orbit
    backward, 320–321, 368
    closed. See closed orbit
    definition of, 118
    forward, 320, 367
    homoclinic, 209, 361
    seed of the, 328, 338f
orbit diagram, 353–354
origin
    definition of, 164
    linearization at, 312
    unstable curve at, 313f

P
parameters
    continuous dependence on, 149, 395
    variation of, 130–131, 134
pendulum
    constant forcing, 210–211
    ideal, 208–209
    linear, 196
    nonlinear, 196–199
period doubling bifurcation, 334, 335f
periodic harvesting, 9–11
periodic points
    definition of, 322
    of period n, 328
periodic solution, 215, 363f
phase line, 3
phase plane, 41
phase portraits
    center, 46f
    definition of, 40
    repeated eigenvalues, 50f
    repeated real eigenvalues, 121f
    saddle, 39–42
    sink, 42–43
    source, 43–44
    spiral center, 112f
    spiral sink, 48f
phase space, 278
physical state, 259
Picard iteration, 144–145, 389
pitchfork bifurcation, 178–179, 179f
planar systems
    description of, 24–26
    dynamical, 63–71, 222–225
    linear, 29–30
    local section, 221
    phase portraits for, 39–60
    Poincaré-Bendixson theorem. See Poincaré-Bendixson theorem
Poincaré map, 319, 323, 373
    closed orbit, 221
    computing of, 12–15
    description of, 11, 221, 319
    graph of, 14
    semi-, 265
Poincaré-Bendixson theorem
    applications of, 227–230, 246
    description of, 225–227
polar coordinates, 164
population
    discrete logistic population model, 336
    logistic population model, 4–7, 335
positive Liapunov exponents, 323
positively invariant, 200
potential energy, 280
Prandtl number, 304
predator/prey systems, 239–246

Q
quantum mechanical systems, 298–299
quasi-periodic motions, 119

R
range, 92, 94
Rayleigh number, 304
real distinct eigenvalues, 39–44, 51–53
reduced row echelon form, 79
regular point, 205
repeated eigenvalues, 47–49, 56–57, 95–101, 119–122
repelling fixed point, 330
resistor
    characteristic, 259
    passive, 262
return condition, 316
RLC circuit, 23, 257–261
Rössler attractor, 324–325
Runge-Kutta method, 154–155

S
saddle
    definition of, 39–42, 108
    higher dimensional, 173–174
    nonlinear, 168–174
    spiral, 113
saddle connections, 193
saddle-node bifurcation, 177–178, 179f–180f, 181, 332
scaled variables, 294
second-order equations, 23
seed of the orbit, 328, 338f
semiconjugacy, 341
semi-Poincaré map, 265
sensitive dependence on initial conditions, 305, 321
sensitivity constant, 338
separation of variables, 241
set
    compact, 228–229
    dense, 101
    limit, 215–218
set of physical states, 259
shift map, 347–349, 370
Shil'nikov system, 359–366
singularity, 293–297
sink
    definition of, 42–44, 329
    higher-dimensional, 109
    nonlinear, 165–168
    spiral, 47, 48f
    three-dimensional, 110f
slope, 117
slope field, 6
smooth dynamical system, 141
smoothness of flows, 149–150, 402
S-nullclines, 236
solutions
    collision-ejection, 296–297
    continuous dependence of, 147–149
    equilibrium, 2
    extending, 395–398
    general, 34
    heteroclinic, 193
    homoclinic, 209, 226, 363f
    near-collision, 297
    periodic, 215, 363f
    straight-line, 33, 52
source
    definition of, 43–44
    nonlinear, 165–168
    spiral, 47, 48f
source fixed point, 330
space
    configuration, 278
    phase, 278
    state, 259, 278
    tangent, 286
    three-parameter, 71–72
space derivative, 402
spanned, 77, 89
spiral center, 112f
spiral saddle, 113, 113f
spiral sink, 47, 48f
spiral source, 47, 48f
spring constant, 26, 394
square matrix, 95
stability
    asymptotic, 175
    equilibrium point, 194
    Liapunov, 194–196
    nonlinear systems, 174–176
stable curve, 161
stable curve theorem, 168–169
stable equilibrium, 175
stable line, 40
stable subspace, 108, 109f
standard basis, 28, 76
state space, 259, 278
Steve. See Bob
straight-line solution, 33, 52
subspace
    definition of, 77, 89–90
    stable, 108, 109f
    unstable, 108, 109f
symbolic dynamics, 342–347
symmetric matrix, 207

T
tangent bifurcation, 332
tangent space, 286
Taylor expansion, 166
tent map, 339–340
ternary expansion, 351
three-parameter space, 71–72
threshold level, 238
time derivative, 402
time t map, 64
torus, 116–117
total energy, 280
total energy function, 197
trace-determinant plane, 61–63, 64f
transitive, 323
two parameters, 15–16
two-body problem, 292–293

U
undamped harmonic oscillator, 55, 114, 208
uniform convergence, 389
unrestricted voltage state, 259
unstable curve, 169
unstable line, 40
unstable subspace, 108, 109f

V
van der Pol equation, 261–270
variation of parameters, 130–131, 134
variational equation, 149–153, 172, 401
vector
    definition of, 83
    momentum, 281
vector field
    complex, 182–184
    definition of, 24
    divergence of, 309
    Lorenz, 306
vector notation, 21–22
Volterra-Lotka system, 240

W
web diagram, 331

Z
zero velocity curve, 286, 287f
