
MAT 280: Multivariable Calculus


MAT 280: Multivariable Calculus
James V. Lambers
July 17, 2012


Contents

1 Partial Derivatives
  1.1 Introduction
    1.1.1 Partial Differentiation
    1.1.2 Multiple Integration
    1.1.3 Vector Calculus
  1.2 Functions of Several Variables
    1.2.1 Terminology and Notation
    1.2.2 Visualization Techniques
  1.3 Limits and Continuity
    1.3.1 Terminology and Notation
    1.3.2 Defining Limits Using Neighborhoods
    1.3.3 Results
    1.3.4 Techniques for Establishing Limits and Continuity
  1.4 Partial Derivatives
    1.4.1 Terminology and Notation
    1.4.2 Clairaut’s Theorem
    1.4.3 Techniques
  1.5 Tangent Planes, Linear Approximations and Differentiability
    1.5.1 Tangent Planes and Linear Approximations
    1.5.2 Functions of More than Two Variables
    1.5.3 The Gradient Vector
    1.5.4 The Jacobian Matrix
    1.5.5 Differentiability
  1.6 The Chain Rule
    1.6.1 The Implicit Function Theorem
  1.7 Directional Derivatives and the Gradient Vector
    1.7.1 The Gradient Vector
    1.7.2 Directional Derivatives
    1.7.3 Tangent Planes to Level Surfaces
  1.8 Maximum and Minimum Values
  1.9 Constrained Optimization
  1.10 Appendix: Linear Algebra Concepts
    1.10.1 Matrix Multiplication
    1.10.2 Eigenvalues
    1.10.3 The Transpose, Inner Product and Null Space

2 Multiple Integrals
  2.1 Double Integrals over Rectangles
  2.2 Double Integrals over More General Regions
    2.2.1 Changing the Order of Integration
    2.2.2 The Mean Value Theorem for Integrals
  2.3 Double Integrals in Polar Coordinates
  2.4 Triple Integrals
  2.5 Applications of Double and Triple Integrals
  2.6 Triple Integrals in Cylindrical Coordinates
  2.7 Triple Integrals in Spherical Coordinates
  2.8 Change of Variables in Multiple Integrals

3 Vector Calculus
  3.1 Vector Fields
  3.2 Line Integrals
  3.3 The Fundamental Theorem for Line Integrals
  3.4 Green’s Theorem
  3.5 Curl and Divergence
  3.6 Parametric Surfaces and Their Areas
  3.7 Surface Integrals
    3.7.1 Surface Integrals of Scalar-Valued Functions
    3.7.2 Surface Integrals of Vector Fields
  3.8 Stokes’ Theorem
    3.8.1 A Note About Orientation
  3.9 The Divergence Theorem
  3.10 Differential Forms


Chapter 1
Partial Derivatives

1.1 Introduction

This course is the fourth course in the calculus sequence, following MAT 167, MAT 168 and MAT 169. Its purpose is to prepare students for more advanced mathematics courses, particularly courses in mathematical programming (MAT 419), advanced engineering mathematics (MAT 430), real analysis (MAT 441), complex analysis (MAT 436), and numerical analysis (MAT 460 and 461). The course will focus on three main areas, which we briefly discuss here.

1.1.1 Partial Differentiation

In single-variable calculus, you learned how to compute the derivative of a function of one variable, $y = f(x)$, with respect to its independent variable $x$, denoted by $dy/dx$. In this course, we consider functions of several variables. In most cases, the functions we use will depend on two or three variables, denoted by $x$, $y$ and $z$, corresponding to spatial dimensions.

When a function $f(x, y, z)$, for example, depends on several variables, it is not possible to describe its rate of change with respect to those variables using a single quantity such as the derivative. Instead, this rate of change is a vector quantity, called the gradient, denoted by $\nabla f$. Each component of the gradient is the partial derivative of $f$ with respect to one of its independent variables, $x$, $y$ or $z$. That is,

$$\nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z} \end{bmatrix}.$$

For example, the partial derivative of $f$ with respect to $x$, denoted by $\partial f/\partial x$, describes the instantaneous rate of change of $f$ with respect to $x$,


when $y$ and $z$ are kept constant. Partial derivatives can be computed using the same differentiation techniques as in single-variable calculus, but one must be careful, when differentiating with respect to one variable, to treat all other variables as if they are constant. For example, if $f(x, y) = x^2 y + y^3$, then

$$\frac{\partial f}{\partial x} = 2xy, \qquad \frac{\partial f}{\partial y} = x^2 + 3y^2,$$

because the $y^3$ term does not depend on $x$, and therefore its partial derivative with respect to $x$ is zero.

If

$$\mathbf{F}(x, y, z) = \begin{bmatrix} F_1(x, y, z) \\ F_2(x, y, z) \\ F_3(x, y, z) \end{bmatrix}$$

is a vector-valued function of three variables, then each of its component functions $F_1$, $F_2$, and $F_3$ has a gradient vector, and the rate of change of $\mathbf{F}$ with respect to $x$, $y$ and $z$ is described by a matrix, called the Jacobian matrix

$$J_{\mathbf{F}}(x, y, z) = \begin{bmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[1ex] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \\[1ex] \dfrac{\partial F_3}{\partial x} & \dfrac{\partial F_3}{\partial y} & \dfrac{\partial F_3}{\partial z} \end{bmatrix},$$

where each entry of $J_{\mathbf{F}}(x, y, z)$ is a partial derivative of one of the component functions with respect to one of the independent variables.

We will learn how to generalize various concepts and techniques from single-variable differential calculus to the multi-variable case. These include:

• tangent lines, which become tangent planes for functions of two variables and tangent spaces for functions of three or more variables. These are used to compute linear approximations similar to those of functions of a single variable.

• The Chain Rule, which generalizes from a product of derivatives to a product of Jacobian matrices, using standard matrix multiplication. This allows computing the rate of change of a function as its independent variables change along any direction in space, not just along any of the coordinate axes, which in turn allows determination of the direction in which a function increases or decreases most rapidly.

• computing maximum and minimum values of functions, which, in the multi-variable case, requires finding points at which the gradient is equal to the zero vector (corresponding to finding points at which the


derivative is equal to zero) and checking whether the matrix of second partial derivatives is positive definite for a minimum, or negative definite for a maximum (which generalizes the second derivative test from the single-variable case). We will also learn how to compute maximum and minimum values subject to constraints on the independent variables, using the method of Lagrange multipliers.

1.1.2 Multiple Integration

Next, we will learn how to compute integrals of functions of several variables over multi-dimensional domains, generalizing the definite integral of a function $f(x)$ over an interval $[a, b]$. The integral of a function of two variables $f(x, y)$ represents the volume under a surface described by the graph of $f$, just as the integral of $f(x)$ is the area under the curve described by the graph of $f$.

In some cases, it is more convenient to evaluate an integral by first performing a change of variables, as in the single-variable case. For example, when integrating a function of two variables, polar coordinates are useful. For functions of three variables, cylindrical and spherical coordinates, which are both generalizations of polar coordinates, are worth considering.

In the general case, evaluating the integral of a function of $n$ variables by first changing to $n$ different variables requires multiplying the integrand by the determinant of the Jacobian matrix of the function that maps the new variables to the old. This is a generalization of the $u$-substitution from single-variable calculus, and also relates to formulas for area and volume from MAT 169 that are defined in terms of determinants, or equivalently, in terms of the dot product and cross product.

1.1.3 Vector Calculus

In the last part of the course, we will study vector fields, which are functions that assign a vector to each point in their domain, like the vector-valued function $\mathbf{F}$ described above. We will first learn how to compute line integrals, which are integrals of functions along curves. A line integral can be viewed as a generalization of the integral of a function on an interval, in that $dx$ is replaced by $ds$, an infinitesimal distance between points on the curve. It can also be viewed as a generalization of an integral that computes the arc length of a curve, as the line integral of a function that is equal to one yields the arc length. A line integral of a vector field is useful for computing the work done by a force applied to an object to move it along a curved path. To


facilitate the computation of line integrals, a variation of the Fundamental Theorem of Calculus is introduced.

Next, we generalize the notion of a parametric curve to a parametric surface, in which the coordinates of points on the surface depend on two parameters $u$ and $v$, instead of a single parameter $t$ for a parametric curve. Using the relation between the cross product and the area of a parallelogram, we can define the integral of a function over a parametric surface, which is similar to how a change of variables in a double integral is handled. Then, we will learn how to integrate vector fields over parametric surfaces, which is useful for computing the mass of fluid that crosses a surface, given the rate of flow per unit area.

We conclude with discussion of several fundamental theorems of vector calculus: Green’s Theorem, Stokes’ Theorem, and the Divergence Theorem. All of these can be seen to be generalizations of the Fundamental Theorem of Calculus to higher dimensions, in that they relate the integral of a function over the interior of a domain to an integral of a related function over its boundary. These theorems can be conveniently stated using the div and curl operations on vector fields. Specifically, if $\mathbf{F} = \langle P, Q, R \rangle$, then

$$\operatorname{curl} \mathbf{F} = \nabla \times \mathbf{F} = \left\langle \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z},\; \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x},\; \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right\rangle,$$

$$\operatorname{div} \mathbf{F} = \nabla \cdot \mathbf{F} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}.$$

However, using the language of differential forms, we can condense the Fundamental Theorem of Calculus and all four of its variations into one theorem, known as the General Stokes’ Theorem. We now state all six results; their discussion is deferred to Chapter 3.

Fundamental Theorem of Calculus:

$$\int_a^b f'(x)\,dx = f(b) - f(a)$$

where $f$ is continuously differentiable on $[a, b]$

Fundamental Theorem of Line Integrals:

$$\int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = f(\mathbf{r}(b)) - f(\mathbf{r}(a))$$


where $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, $a \le t \le b$, is the position function for a curve $C$ and $f(x, y, z)$ is a continuously differentiable function defined on $C$

Green’s Theorem:

$$\int_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \int_{\partial D} P\,dx + Q\,dy$$

where $D$ is a 2-D region with piecewise smooth boundary $\partial D$ and $P$ and $Q$ are continuously differentiable on $D$

Stokes’ Theorem:

$$\int_S \operatorname{curl} \mathbf{F} \cdot \mathbf{n}\,dS = \int_{\partial S} \mathbf{F} \cdot \mathbf{T}\,ds$$

where $S$ is a surface in 3-D with unit normal vector $\mathbf{n}$, and piecewise smooth boundary $\partial S$ with unit tangent vector $\mathbf{T}$, and $\mathbf{F}$ is a continuously differentiable vector field

Divergence Theorem:

$$\int_E \operatorname{div} \mathbf{F}\,dV = \int_{\partial E} \mathbf{F} \cdot \mathbf{n}\,dS$$

where $E$ is a solid region in 3-D with boundary surface $\partial E$, which has outward unit normal vector $\mathbf{n}$, and $\mathbf{F}$ is a continuously differentiable vector field

General Stokes’ Theorem:

$$\int_M d\omega = \int_{\partial M} \omega$$

where $M$ is a $k$-manifold and $\omega$ is a $(k-1)$-form on $M$

1.2 Functions of Several Variables

Multi-variable calculus is concerned with the study of quantities that depend on more than one variable, such as temperature that varies within a three-dimensional object, which is a scalar quantity, or the velocity of a flowing


liquid, which is a vector quantity. To aid in this study, we first introduce some important terminology and notation that is useful for working with functions of more than one variable, and then introduce some techniques for visualizing such functions.

1.2.1 Terminology and Notation

The following standard notation and terminology are used to define and discuss functions of several variables and their visual representations. As they will be used throughout the course, it is important to become acquainted with them immediately.

• The set $\mathbb{R}$ is the set of all real numbers.

• If $S$ is a set and $x$ is an element of $S$, we write $x \in S$.

Example $2 \in \mathbb{R}$, and $\pi \in \mathbb{R}$, but $i \notin \mathbb{R}$, where $i = \sqrt{-1}$ is an imaginary number.

• If $S$ is a set and $T$ is a subset of $S$ (that is, every element of $T$ is in $S$), we write $T \subseteq S$.

Example The set of natural numbers, $\mathbb{N}$, and the set of integers, $\mathbb{Z}$, satisfy $\mathbb{N} \subseteq \mathbb{Z}$. Furthermore, $\mathbb{Z} \subseteq \mathbb{R}$.

• The set $\mathbb{R}^n$ is the set of all ordered $n$-tuples $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ of real numbers. Each real number $x_i$ is called a coordinate of the point $\mathbf{x}$.

Example Each point $(x, y) \in \mathbb{R}^2$ has coordinates $x$ and $y$. Each point $(x, y, z) \in \mathbb{R}^3$ has an $x$-, $y$- and $z$-coordinate.

• A function $f$ with domain $D \subseteq \mathbb{R}^n$ and range $R \subseteq \mathbb{R}^m$ is a set of ordered pairs of the form $\{(\mathbf{x}, \mathbf{y})\}$, where $\mathbf{x} \in D$ and $\mathbf{y} \in R$, such that each element $\mathbf{x} \in D$ is mapped to only one element of $R$. That is, there is only one ordered pair in $f$ such that $\mathbf{x}$ is its first element. We write $f : D \to R$ to indicate that $f$ maps elements of $D$ to elements of $R$. We also say that $f$ maps $D$ into $R$.

Example Let $\mathbb{R}^+$ denote the set of non-negative real numbers. The function $f(x, y) = x^2 + y^2$ maps $\mathbb{R}^2$ into $\mathbb{R}^+$, and we can write $f : \mathbb{R}^2 \to \mathbb{R}^+$.

• Let $f : D \to R$, and let $D \subseteq \mathbb{R}^n$ and $R \subseteq \mathbb{R}^m$. If $m = 1$, we say that $f$ is a scalar-valued function, and if $m > 1$, we say that $f$ is a vector-valued function. If $n = 1$, we say that $f$ is a function of one variable,


and if $n > 1$, we say that $f$ is a function of several variables. For each $\mathbf{x} = (x_1, x_2, \ldots, x_n) \in D$, the coordinates $x_1, x_2, \ldots, x_n$ of $\mathbf{x}$ are called the independent variables of $f$, and for each $\mathbf{y} = (y_1, y_2, \ldots, y_m) \in R$, the coordinates $y_1, y_2, \ldots, y_m$ of $\mathbf{y}$ are called the dependent variables of $f$.

Example The function $z = x^2 + y^2$ is a scalar-valued function of several variables. The independent variables are $x$ and $y$, and the dependent variable is $z$. The function $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, where $x(t) = t \cos t$, $y(t) = t \sin t$, and $z(t) = e^t$, is a vector-valued function with independent variable $t$ and dependent variables $x$, $y$ and $z$.

• Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$. The graph of $f$ is the subset of $\mathbb{R}^{n+1}$ consisting of the points $(x_1, x_2, \ldots, x_n, f(x_1, x_2, \ldots, x_n))$, where $(x_1, x_2, \ldots, x_n) \in D$.

Example The graph of the function $z = x^2 + y^2$ is a paraboloid of revolution obtained by revolving the parabola $z = x^2$ around the $z$-axis. The graph of the function $z = x + y - 1$ is a plane in 3-D space that passes through the points $(0, 0, -1)$ and $(1, 1, 1)$.

• A function $f : \mathbb{R}^n \to \mathbb{R}$ is a linear function if $f$ has the form

$$f(x_1, x_2, \ldots, x_n) = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n + b,$$

where $a_1, a_2, \ldots, a_n$ and $b$ are constants.

Example The function $y = mx + b$ is a linear function of the single independent variable $x$. Its graph is a line contained within the $xy$-plane, with slope $m$, passing through the point $(0, b)$. The function $z = ax + by + c$ is a linear function of the two independent variables $x$ and $y$. Its graph is a plane in 3-D space that passes through the points $(0, 0, c)$ and $(1, 1, a + b + c)$.

• Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$. We say that a set $L$ is a level set of $f$ if $L \subseteq D$ and $f$ is equal to a constant value $k$ on $L$; that is, $f(\mathbf{x}) = k$ if $\mathbf{x} \in L$. If $n = 2$, we say that $L$ is a level curve or level contour; if $n = 3$, we say that $L$ is a level surface.

Example A level surface of the function $f(x, y, z) = x^2 + y^2 + z^2$, where $f(x, y, z) = k$ for a constant $k$, is a sphere of radius $\sqrt{k}$. The level curves of the function $z = x^2 + y^2$ are circles of radius $\sqrt{k}$ with center $(0, 0, k)$, situated in the plane $z = k$, for each nonnegative number $k$.
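The level-curve example for $z = x^2 + y^2$ can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the helper names `f` and `circle_points` and the choice $k = 4$ are assumptions made only for illustration. It samples points on the circle of radius $\sqrt{k}$ centered at the origin and confirms that the function takes the value $k$ at each of them.

```python
import math

def f(x, y):
    """The function z = x^2 + y^2 from the level-curve example."""
    return x * x + y * y

def circle_points(radius, count=8):
    """Return `count` evenly spaced points on the circle of the given
    radius centered at (0, 0). Hypothetical helper for illustration."""
    return [(radius * math.cos(2 * math.pi * i / count),
             radius * math.sin(2 * math.pi * i / count))
            for i in range(count)]

k = 4.0  # assumed sample level; any nonnegative value works
values = [f(x, y) for (x, y) in circle_points(math.sqrt(k))]

# Each sampled point should satisfy f(x, y) = k up to rounding error.
on_level_curve = all(math.isclose(v, k) for v in values)
print(on_level_curve)
```

Sampling only verifies that the circle is contained in the level set; that the level set contains nothing else follows from solving $x^2 + y^2 = k$ directly.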


1.2.2 Visualization Techniques

While it is always possible to obtain the graph of a function $f(x, y)$, for example, by substituting various values for its independent variables and plotting the corresponding points from the graph, this approach is not necessarily helpful for understanding the graph as a whole. Knowing the extent of the possible values of a function’s independent and dependent variables (the domain and range, respectively), along with the behavior of a few select curves that are contained within the function’s graph, can be more helpful. To that end, we mention the following useful techniques for acquiring this information.

Figure 1.1: Level curves of the function $z = x^2 + y^4$

• To find the domain and range of a function $f$, it is often necessary to account for the domains of functions that are included in the definition of $f$. For example, if there is a square root, it is necessary to avoid taking the square root of a negative number.

Example Let $f(x, y) = \ln(x^2 - y^2)$. Since $\ln x$ is only defined for $x > 0$, we must have $x^2 > y^2$, which, upon taking the square root of both sides, yields $|x| > |y|$. Therefore, this inequality defines the domain of $f$. The range of $f$ is the range of $\ln$, which is $\mathbb{R}$.

• To find the level set of a function $f(x_1, x_2, \ldots, x_n)$, solve the equation $f(x_1, x_2, \ldots, x_n) = k$, where $k$ is a constant. This equation will implicitly define the level set. In some cases, it can be solved for one of


the independent variables to obtain an explicit function that describes the level set.

Example Let $z = 2x + y$ be a function of two variables. The graph of this function is a plane. Each level set of this function is described by an equation of the form $2x + y = k$, where $k$ is a constant. Since $z = k$ as well, the graph of this level set is the line with equation $y = -2x + k$, contained within the plane $z = k$.

Example Let $z = \ln y - x$. Each level set of this function is described by an equation of the form $\ln y - x = k$, where $k$ is a constant. Exponentiating both sides, we obtain $y = e^{x+k}$. It follows that the graph of this level set is that of the exponential function, contained within the plane $z = k$, and shifted $k$ units to the left (that is, along the $x$-axis).

• To help visualize a function of two variables $z = f(x, y)$, it can be helpful to use the method of sections. This involves viewing the function when restricted to “vertical” planes, such as the $xz$-plane and the $yz$-plane. To take these two sections, first set $y = 0$ to obtain $z$ as a function of $x$, and then graph that function in the $xz$-plane. Then, set $x = 0$ to obtain $z$ as a function of $y$, and graph that function in the $yz$-plane. Using these graphs as guides, in conjunction with level curves, it is then easier to visualize what the rest of the graph of $f$ looks like.

Example Let $z = x^2 + y^4$. Setting $y = 0$ yields $z = x^2$, the graph of which is a parabola in the $xz$-plane. Setting $x = 0$ yields $z = y^4$, which has a graph that is a parabola-like curve, where $z$ increases much more rapidly. Combining these graphs with selected level curves, which are described by the equations $y = \pm\sqrt[4]{k - x^2}$, where $|x| \le \sqrt{k}$ for $k \ge 0$, allows us to visualize the graph of this function. Level curves and sections are shown in Figures 1.1 and 1.2, respectively.

Figure 1.2: Sections of the function $z = x^2 + y^4$

1.3 Limits and Continuity

Recall that in single-variable calculus, the fundamental concept of a limit was used to define derivatives and integrals of functions, as well as the notion of continuity of a function. We now generalize limits and continuity to the case of functions of several variables.

1.3.1 Terminology and Notation

• Let $f : D \subseteq \mathbb{R} \to \mathbb{R}$, and $a \in D$. We say $f(x)$ approaches $L$ as $x$ approaches $a$, and write

$$\lim_{x \to a} f(x) = L$$

if, for any $\epsilon > 0$, there exists a $\delta > 0$ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \epsilon$.

• If $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is a point in $\mathbb{R}^n$, or, equivalently, if $\mathbf{x} = \langle x_1, x_2, \ldots, x_n \rangle$ is a position vector in $\mathbb{R}^n$, then the magnitude, or length, of $\mathbf{x}$, denoted by $\|\mathbf{x}\|$, is defined by

$$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \left( \sum_{i=1}^n x_i^2 \right)^{1/2}.$$

Note that if $n = 1$, then $\mathbf{x}$ is actually a scalar $x$, and $\|\mathbf{x}\| = |x|$.


Example If $\mathbf{x} = (3, -1, 4) \in \mathbb{R}^3$, then $\|\mathbf{x}\| = \sqrt{3^2 + (-1)^2 + 4^2} = \sqrt{26}$. ✷

• Let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and $\mathbf{a} \in D$. We say $\mathbf{f}(\mathbf{x})$ approaches $\mathbf{b}$ as $\mathbf{x}$ approaches $\mathbf{a}$, and write

$$\lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b},$$

if, for any $\epsilon > 0$, no matter how small, there exists a $\delta > 0$ such that for any $\mathbf{x}$ such that $0 < \|\mathbf{x} - \mathbf{a}\| < \delta$, $\|\mathbf{f}(\mathbf{x}) - \mathbf{b}\| < \epsilon$. This definition is illustrated in Figure 1.3. Note that the condition $\|\mathbf{x} - \mathbf{a}\| > 0$ specifically excludes consideration of $\mathbf{x} = \mathbf{a}$, because limits are used to understand the behavior of a function near a point, not at a point.

Figure 1.3: Illustration of the limit, as $\mathbf{x}$ approaches $\mathbf{a}$ (left plot), of $\mathbf{f}(\mathbf{x})$ being equal to $\mathbf{b}$ (right plot). For any ball around the point $\mathbf{b}$ of radius $\epsilon$ (right plot), no matter how small, there exists a ball around the point $\mathbf{a}$, of radius $\delta$ (left plot), such that every point in the ball around $\mathbf{a}$ is mapped by $\mathbf{f}$ to a point in the ball around $\mathbf{b}$.

Example Consider the function

$$f(x, y) = \frac{xy}{\sqrt{x^2 + y^2}}.$$

We will use the definition of a limit to show that as $(x, y) \to (0, 0)$, $f(x, y) \to 0$. Let $\epsilon > 0$. We need to show that there exists some $\delta > 0$


such that if $0 < \|(x, y) - (0, 0)\| = \sqrt{x^2 + y^2} < \delta$, then

$$|f(x, y) - 0| = \left| \frac{xy}{\sqrt{x^2 + y^2}} \right| < \epsilon.$$

First, we note that […]


• Let $\mathbf{x}_0 \in \mathbb{R}^n$ and let $r > 0$. We define the ball centered at $\mathbf{x}_0$ of radius $r$, denoted by $D_r(\mathbf{x}_0)$, to be the set of all points $\mathbf{x} \in \mathbb{R}^n$ such that $\|\mathbf{x} - \mathbf{x}_0\| < r$.

Example In 1-D, the open interval $(0, 1)$ is also the ball centered at $x_0 = 1/2$ of radius $r = 1/2$. In 3-D, the inside of the sphere with center $(0, 0, 0)$ and radius 2, $\{(x, y, z) \mid x^2 + y^2 + z^2 < 4\}$, is also the ball $D_2((0, 0, 0))$. ✷

• We say that a set $U \subseteq \mathbb{R}^n$ is open if, for any point $\mathbf{x}_0 \in U$, there exists an $r > 0$ such that $D_r(\mathbf{x}_0) \subseteq U$.

Example In 1-D, any open set is an open interval, such as $(-1, 1)$, or a union of open intervals. In 2-D, the interior of the ellipse defined by the equation $4x^2 + 9y^2 = 1$ is an open set; the ellipse itself is not included. ✷

• Let $\mathbf{x}_0 \in \mathbb{R}^n$. We say that $N$ is a neighborhood of $\mathbf{x}_0$ if $N$ is an open set that contains $\mathbf{x}_0$.

• Let $A \subseteq \mathbb{R}^n$ be an open set. We say that $\mathbf{x}_0 \in \mathbb{R}^n$ is a boundary point of $A$ if every neighborhood of $\mathbf{x}_0$ contains at least one point in $A$ and one point not in $A$.

Example Let $D = \{(x, y) \mid x^2 + y^2 < 1\}$, which is often called the unit ball in $\mathbb{R}^2$. This set consists of all points inside the unit circle with center $(0, 0)$ and radius 1, not including the circle itself. The point $(x_0, y_0) = (\sqrt{2}/2, \sqrt{2}/2)$, which is on the circle, is a boundary point of $D$ because, as illustrated in Figure 1.4, any neighborhood of $(x_0, y_0)$ must contain points inside the circle, and points that are outside. ✷

• Let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and let $\mathbf{a} \in D$ or let $\mathbf{a}$ be a boundary point of $D$. We say that $\mathbf{f}(\mathbf{x})$ approaches $\mathbf{b}$ as $\mathbf{x}$ approaches $\mathbf{a}$, and write

$$\lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b},$$

if, for any neighborhood $N$ of $\mathbf{b}$, there exists a neighborhood $U$ of $\mathbf{a}$ such that if $\mathbf{x} \in U$, then $\mathbf{f}(\mathbf{x}) \in N$.

1.3.3 Results

In the statement of the following results concerning limits and continuity, $\mathbf{f}, \mathbf{g} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, $\mathbf{a} \in D$ or $\mathbf{a}$ is a boundary point of $D$, $\mathbf{b}, \mathbf{b}_1, \mathbf{b}_2 \in \mathbb{R}^m$, and $c \in \mathbb{R}$.


Figure 1.4: Boundary point $(x_0, y_0)$ of the set $D = \{(x, y) \mid x^2 + y^2 < 1\}$. The neighborhood of $(x_0, y_0)$ shown, $D_r((x_0, y_0)) = \{(x, y) \mid (x - x_0)^2 + (y - y_0)^2 < r^2\}$, […]


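Furthermore, if $m = 1$, then

$$\lim_{\mathbf{x} \to \mathbf{a}} (fg)(\mathbf{x}) = b_1 b_2.$$

• If $m = 1$ and $\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = b \ne 0$, and $f(\mathbf{x}) \ne 0$ in a neighborhood of $\mathbf{a}$, then

$$\lim_{\mathbf{x} \to \mathbf{a}} \frac{1}{f(\mathbf{x})} = \frac{1}{b}.$$

• If $\mathbf{f}(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x}))$, where $f_1, f_2, \ldots, f_m$ are the component functions of $\mathbf{f}$, and $\mathbf{b} = (b_1, b_2, \ldots, b_m)$, then $\lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b}$ if and only if $\lim_{\mathbf{x} \to \mathbf{a}} f_i(\mathbf{x}) = b_i$ for $i = 1, 2, \ldots, m$.

• If $\mathbf{f}$ and $\mathbf{g}$ are continuous at $\mathbf{a}$, then so are $c\mathbf{f}$ and $\mathbf{f} + \mathbf{g}$. If, in addition, $m = 1$, then $fg$ is continuous at $\mathbf{a}$. Furthermore, if $m = 1$ and if $f$ is nonzero in a neighborhood of $\mathbf{a}$, then $1/f$ is continuous at $\mathbf{a}$.

• If $\mathbf{f}(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x}))$, where $f_1, f_2, \ldots, f_m$ are the component functions of $\mathbf{f}$, then $\mathbf{f}$ is continuous at $\mathbf{a}$ if and only if $f_i$ is continuous at $\mathbf{a}$, for $i = 1, 2, \ldots, m$.

• Any polynomial function $f : \mathbb{R}^n \to \mathbb{R}$ is continuous on all of $\mathbb{R}^n$.

• Any rational function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is continuous wherever it is defined.

Example The function $f(x, y) = 2x/(x^2 - y^2)$ is defined on all of $\mathbb{R}^2$ except where $x^2 - y^2 = 0$; that is, where $|x| = |y|$. Therefore, $f$ is continuous at every point where $|x| \ne |y|$. ✷

• Let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and let $\mathbf{g} : U \subseteq \mathbb{R}^p \to D$. If the composition $(\mathbf{f} \circ \mathbf{g})(\mathbf{x}) = \mathbf{f}(\mathbf{g}(\mathbf{x}))$ is defined on $U$, then $\mathbf{f} \circ \mathbf{g}$ is continuous at $\mathbf{a} \in U$ if $\mathbf{g}$ is continuous at $\mathbf{a}$ and $\mathbf{f}$ is continuous at $\mathbf{g}(\mathbf{a})$.

Example The function $g(x, y) = x^2 + y^2$, being a polynomial, is continuous on all of $\mathbb{R}^2$. The function $f(z) = \sin z$ is continuous on all of $\mathbb{R}$. Therefore, the composition $(f \circ g)(x, y) = f(g(x, y)) = \sin(x^2 + y^2)$ is continuous on all of $\mathbb{R}^2$. ✷

• Algebraic functions, such as $x^r$ where $r$ is any rational number (for example, $f(x) = \sqrt{x}$) and trigonometric functions, such as $\sin x$ or $\tan x$, are continuous wherever they are defined.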
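The two examples above lend themselves to a quick numerical illustration. The following Python sketch is not part of the original notes; the function names `rational`, `h` and `max_deviation`, the sample point $(1, 2)$ and the radii are all assumptions chosen for illustration. It reports where the rational function is undefined, and it measures how far $\sin(x^2 + y^2)$ strays from its value at a point as the surrounding circle shrinks, which is what continuity predicts should tend to zero.

```python
import math

def rational(x, y):
    """f(x, y) = 2x / (x^2 - y^2); returns None where |x| = |y|,
    since the function is undefined there."""
    denom = x * x - y * y
    if denom == 0:
        return None
    return 2 * x / denom

def h(x, y):
    """The composition sin(x^2 + y^2) from the example above."""
    return math.sin(x * x + y * y)

def max_deviation(func, a, radius, samples=16):
    """Largest |func(p) - func(a)| over sample points p on the circle
    of the given radius around a. Hypothetical helper; sampling a circle
    only suggests continuity, it does not prove it."""
    ax, ay = a
    base = func(ax, ay)
    return max(
        abs(func(ax + radius * math.cos(2 * math.pi * i / samples),
                 ay + radius * math.sin(2 * math.pi * i / samples)) - base)
        for i in range(samples)
    )

a = (1.0, 2.0)  # assumed sample point
deviations = [max_deviation(h, a, r) for r in (1e-1, 1e-2, 1e-3)]
print(deviations)            # shrinks as the radius shrinks
print(rational(1.0, 1.0))    # undefined on the line |x| = |y|
print(rational(2.0, 1.0))    # defined away from that line
```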


1.3.4 Techniques for Establishing Limits and Continuity

We now discuss some techniques for computing limits of functions of several variables, or determining that they do not exist. We also demonstrate how to determine if a function is continuous at a point.

To show that the limit of a function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ as $\mathbf{x} \to \mathbf{a}$ does not exist, try letting $\mathbf{x}$ approach $\mathbf{a}$ along different paths to see if different values are approached. If they are, then the limit does not exist.

For example, let $n = 2$ and let $\mathbf{x} = (x, y)$ and $\mathbf{a} = (a_1, a_2)$. Then, try setting $x = a_1$ in the formula for $f(x, y)$ and letting $y$ approach $a_2$, or vice versa. Other possible paths include, for example, setting $x = cy$, where $c = a_1/a_2$, if $a_2 \ne 0$, and letting $y$ approach $a_2$, or considering the cases of $x < a_1$ and $x > a_1$, or $y < a_2$ and $y > a_2$, separately.

Example Let $f(x, y) = x^3 y/(x^4 + y^4)$. If we let $(x, y) \to (0, 0)$ by first setting $y = 0$ and then letting $x \to 0$, we observe that $f(x, 0) = x^3(0)/(x^4 + 0) = 0$ for all $x \ne 0$. This suggests that $f(x, y) \to 0$ as $(x, y) \to (0, 0)$. However, if we set $x = y$ and let $x, y \to 0$ together, we note that $f(x, x) = x^3 x/(x^4 + x^4) = x^4/(2x^4) = 1/2$, which suggests that the limit is equal to $1/2$. We conclude that the limit does not exist. ✷

To show that the limit of a function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ as $\mathbf{x} \to \mathbf{a}$ does exist and is equal to $b$, use the definition of a limit by first assuming $\epsilon > 0$, and then trying to find a $\delta$ so that $|f(\mathbf{x}) - b| < \epsilon$ whenever $0 < \|\mathbf{x} - \mathbf{a}\| < \delta$. To that end, try to find an upper bound on $|f(\mathbf{x}) - b|$ in terms of $\|\mathbf{x} - \mathbf{a}\|$. Specifically, if it can be shown that $|f(\mathbf{x}) - b| < g(\|\mathbf{x} - \mathbf{a}\|)$, where $g$ is an invertible, increasing function, then a suitable choice for $\delta$ is $\delta = g^{-1}(\epsilon)$. Then, if $\|\mathbf{x} - \mathbf{a}\| < \delta = g^{-1}(\epsilon)$, then

$$|f(\mathbf{x}) - b| < g(\|\mathbf{x} - \mathbf{a}\|) < g(g^{-1}(\epsilon)) = \epsilon.$$

Example Let $f(x, y) = (x^3 - y^3)/(x^2 + y^2)$. Letting $(x, y) \to (0, 0)$ along various paths, it appears that $f(x, y) \to 0$ as $(x, y) \to (0, 0)$. To confirm this, we assume $\epsilon > 0$ and try to find $\delta > 0$ such that if $0 < \sqrt{x^2 + y^2} < \delta$, then $|(x^3 - y^3)/(x^2 + y^2)| < \epsilon$.

Factoring the numerator of $f(x, y)$, we obtain

$$\left| \frac{x^3 - y^3}{x^2 + y^2} \right| = \left| \frac{(x - y)(x^2 + xy + y^2)}{x^2 + y^2} \right| = \left| (x - y)\left( 1 + \frac{xy}{x^2 + y^2} \right) \right|.$$
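The path test described in this section can be explored numerically before attempting a proof. The following Python sketch is not part of the original notes; the names `f`, `g`, `along_path`, `x_axis` and `diagonal` are hypothetical, and evaluating along a handful of paths can only suggest a limit, never establish one. It evaluates both example functions along the paths $y = 0$ and $y = x$ as the parameter tends to zero.

```python
def f(x, y):
    """f(x, y) = x^3 y / (x^4 + y^4), undefined at the origin."""
    return x**3 * y / (x**4 + y**4)

def g(x, y):
    """g(x, y) = (x^3 - y^3) / (x^2 + y^2), undefined at the origin."""
    return (x**3 - y**3) / (x**2 + y**2)

def along_path(func, path, ts):
    """Evaluate func at the points path(t) for each parameter value t."""
    return [func(*path(t)) for t in ts]

def x_axis(t):
    """Approach the origin along y = 0."""
    return (t, 0.0)

def diagonal(t):
    """Approach the origin along y = x."""
    return (t, t)

ts = [10.0 ** (-k) for k in range(1, 6)]  # t -> 0 from the right

f_axis = along_path(f, x_axis, ts)
f_diag = along_path(f, diagonal, ts)
g_axis = along_path(g, x_axis, ts)
g_diag = along_path(g, diagonal, ts)

print(f_axis)  # f along y = 0
print(f_diag)  # f along y = x: differs from the y = 0 values
print(g_axis)  # g along y = 0: magnitudes shrink with t
print(g_diag)  # g along y = x
```

For $f$, the two paths give different values, in agreement with the conclusion that its limit at the origin does not exist. For $g$, both paths give values that shrink with $t$, which is consistent with a limit of 0 but still requires the $\epsilon$-$\delta$ argument to confirm.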


Note that only values of $f(x, y)$ for which $y = y_0$ influence the value of the partial derivative with respect to $x$. Similarly, the partial derivative of $f(x, y)$ with respect to $y$ at $(x_0, y_0)$ is defined to be

$$\frac{\partial f}{\partial y}(x_0, y_0) = f_y(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0, y_0 + h) - f(x_0, y_0)}{h}.$$

Note the two methods of denoting partial derivatives used above: $\partial f/\partial x$ or $f_x$ for the partial derivative with respect to $x$. There are other notations, but these are the ones that we will use.

Example Let $f(x, y) = x^2 y$, and let $(x_0, y_0) = (2, -1)$. Then

$$\begin{aligned}
f_x(2, -1) &= \lim_{h \to 0} \frac{(2 + h)^2(-1) - 2^2(-1)}{h} \\
&= \lim_{h \to 0} \frac{-(4 + 4h + h^2) + 4}{h} \\
&= \lim_{h \to 0} \frac{-4h - h^2}{h} \\
&= -4,
\end{aligned}$$

$$\begin{aligned}
f_y(2, -1) &= \lim_{h \to 0} \frac{2^2(-1 + h) - 2^2(-1)}{h} \\
&= \lim_{h \to 0} \frac{4(h - 1) + 4}{h} \\
&= \lim_{h \to 0} \frac{4h}{h} \\
&= 4.
\end{aligned}$$

✷

In the preceding example, the value $f_x(2, -1) = -4$ can be interpreted as the slope of the line that is tangent to the graph of $f(x, -1) = -x^2$ at $x = 2$. That is, we consider the restriction of $f$ to the portion of its domain where $y = -1$, and thus obtain a function of the single variable $x$, $g(x) = f(x, -1) = -x^2$. Note that if we apply differentiation rules from single-variable calculus to $g$, we obtain $g'(x) = -2x$, and $g'(2) = -4$, which is the value we obtained for $f_x(2, -1)$.

Similarly, if we consider $f_y(2, -1) = 4$, this can be interpreted as the slope of a line that is tangent to the graph of $p(y) = f(2, y) = 4y$ at $y = -1$. Note that if we differentiate $p$, we obtain $p'(y) = 4$, which, again, shows that the partial derivative of a function of several variables can be obtained by “freezing” the values of all variables except the one with respect to which we


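$$\begin{aligned}
&= \lim_{h \to 0} \frac{(\mathbf{c} \cdot \mathbf{x}_0 + h\,\mathbf{c} \cdot \mathbf{e}_2)^2 - (\mathbf{c} \cdot \mathbf{x}_0)^2}{h} \\
&= \lim_{h \to 0} \frac{(\mathbf{c} \cdot \mathbf{x}_0)^2 + 2(\mathbf{c} \cdot \mathbf{x}_0)(h\,\mathbf{c} \cdot \mathbf{e}_2) + (h\,\mathbf{c} \cdot \mathbf{e}_2)^2 - (\mathbf{c} \cdot \mathbf{x}_0)^2}{h} \\
&= \lim_{h \to 0} \frac{2h(\mathbf{c} \cdot \mathbf{x}_0)(\mathbf{c} \cdot \mathbf{e}_2) + h^2(\mathbf{c} \cdot \mathbf{e}_2)^2}{h} \\
&= \lim_{h \to 0} \left[ 2(\mathbf{c} \cdot \mathbf{x}_0)(\mathbf{c} \cdot \mathbf{e}_2) + h(\mathbf{c} \cdot \mathbf{e}_2)^2 \right] \\
&= 2(\mathbf{c} \cdot \mathbf{x}_0)(\mathbf{c} \cdot \mathbf{e}_2) \\
&= 2(\langle 4, -3, 2, -1 \rangle \cdot \langle 1, 3, 2, 4 \rangle)(\langle 4, -3, 2, -1 \rangle \cdot \langle 0, 1, 0, 0 \rangle) \\
&= 2[4(1) - 3(3) + 2(2) - 1(4)](-3) \\
&= 2(-5)(-3) \\
&= 30.
\end{aligned}$$

This shows that $f$ is increasing sharply as a function of $x_2$ at the point $\mathbf{x}_0$. Note that the same result can be obtained by defining

$$\begin{aligned}
g(x_2) &= f(1, x_2, 2, 4) \\
&= (\mathbf{c} \cdot \langle 1, x_2, 2, 4 \rangle)^2 \\
&= (\langle 4, -3, 2, -1 \rangle \cdot \langle 1, x_2, 2, 4 \rangle)^2 \\
&= (4 - 3x_2)^2,
\end{aligned}$$

differentiating this function of $x_2$ to obtain $g'(x_2) = 2(4 - 3x_2)(-3)$, and then evaluating this derivative at $x_2 = 3$ to obtain $g'(3) = 2(4 - 3(3))(-3) = 30$. ✷

Just as functions of a single variable can have second derivatives, third derivatives, or derivatives of any order, functions of several variables can have higher-order partial derivatives. To that end, let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ be a scalar-valued function of $n$ variables $x_1, x_2, \ldots, x_n$. Then, the second partial derivative of $f$ with respect to $x_i$ and $x_j$ at $\mathbf{x}_0 \in D$ is defined to be

$$\begin{aligned}
\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}_0) &= f_{x_i x_j}(\mathbf{x}_0) \\
&= \frac{\partial}{\partial x_i}\left( \frac{\partial f}{\partial x_j} \right)(\mathbf{x}_0) \\
&= \lim_{h_i \to 0} \frac{f_{x_j}(\mathbf{x}_0 + h_i \mathbf{e}_i) - f_{x_j}(\mathbf{x}_0)}{h_i} \\
&= \lim_{(h_i, h_j) \to (0, 0)} \frac{1}{h_i h_j} \bigl[ f(\mathbf{x}_0 + h_i \mathbf{e}_i + h_j \mathbf{e}_j) - f(\mathbf{x}_0 + h_i \mathbf{e}_i) - {}
\end{aligned}$$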


\begin{align*}
&\qquad\qquad 3e^{-(x^2+y^2)} \cos 3x \Bigr] \Bigr|_{x=\pi/2,\, y=\pi} \\
&= \cos 4y \left[ -2xe^{-(x^2+y^2)} \sin 3x + 3e^{-(x^2+y^2)} \cos 3x \right] \Bigr|_{x=\pi/2,\, y=\pi} \\
&= \cos 4\pi \left[ -2(\pi/2)e^{-((\pi/2)^2+\pi^2)} \sin(3\pi/2) + 3e^{-((\pi/2)^2+\pi^2)} \cos 3(\pi/2) \right] \\
&= \pi e^{-5\pi^2/4}.
\end{align*}

Similarly, to compute $f_y(\pi/2, \pi)$, we treat $x$ as a constant, and apply these differentiation rules to differentiate with respect to $y$. Finally, we substitute $x = \pi/2$ and $y = \pi$ into the resulting derivative. □

This approach to differentiation can also be applied to compute higher-order partial derivatives, as long as any substitution of values for the variables is deferred to the end.

Example To evaluate the second partial derivatives of $f(x, y) = \ln|x + y^2|$ at $x = 1$, $y = 2$, we first compute the first partial derivatives of $f$:

\[ f_x = \frac{1}{x + y^2} \frac{\partial}{\partial x}[x + y^2] = \frac{1}{x + y^2}, \qquad f_y = \frac{1}{x + y^2} \frac{\partial}{\partial y}[x + y^2] = \frac{2y}{x + y^2}. \]

Next, we differentiate each of these partial derivatives with respect to both $x$ and $y$ to obtain

\begin{align*}
f_{xx} &= (f_x)_x \\
&= \left( \frac{1}{x + y^2} \right)_x \\
&= -\frac{1}{(x + y^2)^2} \frac{\partial}{\partial x}[x + y^2] \\
&= -\frac{1}{(x + y^2)^2}, \\
f_{xy} &= (f_x)_y \\
&= \left( \frac{1}{x + y^2} \right)_y \\
&= -\frac{1}{(x + y^2)^2} \frac{\partial}{\partial y}[x + y^2]
\end{align*}


\begin{align*}
f_{xz} &= (f_x)_z = (2xy^4z^3)_z = (2xy^4)(3z^2) = 6xy^4z^2, \\
f_{yx} &= (f_y)_x = (4x^2y^3z^3)_x = 8xy^3z^3, \\
f_{yy} &= (f_y)_y = (4x^2y^3z^3)_y = (4x^2)(3y^2)(z^3) = 12x^2y^2z^3, \\
f_{yz} &= (f_y)_z = (4x^2y^3z^3)_z = (4x^2y^3)(3z^2) = 12x^2y^3z^2, \\
f_{zx} &= (f_z)_x = (3x^2y^4z^2)_x = 6xy^4z^2, \\
f_{zy} &= (f_z)_y = (3x^2y^4z^2)_y = (3x^2)(4y^3)(z^2) = 12x^2y^3z^2, \\
f_{zz} &= (f_z)_z = (3x^2y^4z^2)_z = (3x^2y^4)(2z) = 6x^2y^4z.
\end{align*}

Then, these can be evaluated at $(x_0, y_0, z_0)$ by substituting $x = -1$, $y = 2$, and $z = 3$ to obtain

\begin{align*}
f_{xx}(-1, 2, 3) &= 864, & f_{xy}(-1, 2, 3) &= -1728, & f_{xz}(-1, 2, 3) &= -864, \\
f_{yx}(-1, 2, 3) &= -1728, & f_{yy}(-1, 2, 3) &= 1296, & f_{yz}(-1, 2, 3) &= 864, \\
f_{zx}(-1, 2, 3) &= -864, & f_{zy}(-1, 2, 3) &= 864, & f_{zz}(-1, 2, 3) &= 288.
\end{align*}

Note that the order in which the partial differentiation operations are carried out does not matter here; for example, $f_{xy} = f_{yx}$. This illustrates that Clairaut's Theorem applies for any number of variables. It also applies to any order of partial derivative. For example,

\begin{align*}
f_{xyy} &= (f_{xy})_y = (8xy^3z^3)_y = 24xy^2z^3, \\
f_{yyx} &= (f_{yy})_x = (12x^2y^2z^3)_x = 24xy^2z^3.
\end{align*}
□

In single-variable calculus, implicit differentiation is applied to an equation that implicitly describes $y$ as a function of $x$, in order to compute $dy/dx$. The same approach can be applied to an equation that implicitly describes any number of dependent variables in terms of any number of independent variables. The approach is the same as in the single-variable case: differentiate both sides of the equation with respect to an independent variable, leaving derivatives of dependent variables in the equation as unknowns. The resulting equation can then be solved for the unknown partial derivatives.

Example Consider the equation

\[ x^2 z + y^2 z + z^2 = 1. \]

If we view this equation as one that implicitly describes $z$ as a function of $x$ and $y$, we can compute $z_x$ and $z_y$ using implicit differentiation with respect to $x$ and $y$, respectively.
Applying the Product Rule yields the equations

\begin{align*}
2xz + x^2 z_x + y^2 z_x + 2z z_x &= 0, \\
x^2 z_y + 2yz + y^2 z_y + 2z z_y &= 0,
\end{align*}

which can then be solved for the partial derivatives to obtain

\[ z_x = -\frac{2xz}{x^2 + y^2 + 2z}, \qquad z_y = -\frac{2yz}{x^2 + y^2 + 2z}. \]
□

1.5 Tangent Planes, Linear Approximations and Differentiability

Now that we have learned how to compute partial derivatives of functions of several independent variables, in order to measure their instantaneous rates of change with respect to these variables, we will discuss another essential application of derivatives: the approximation of functions by linear functions. Linear functions are the simplest to work with, and for this reason, there are many instances in which functions are replaced by a linear approximation in the context of solving a problem such as solving a differential equation.

1.5.1 Tangent Planes and Linear Approximations

In single-variable calculus, we learned that the graph of a function $f(x)$ can be approximated near a point $x_0$ by its tangent line, which has the equation

\[ y = f(x_0) + f'(x_0)(x - x_0). \]

For this reason, the function $L_f(x) = f(x_0) + f'(x_0)(x - x_0)$ is also referred to as the linearization, or linear approximation, of $f(x)$ at $x_0$.

Now, suppose that we have a function of two variables, $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$, and a point $(x_0, y_0) \in D$. Furthermore, suppose that the first partial derivatives of $f$, $f_x$ and $f_y$, exist at $(x_0, y_0)$.
Because the graph of this function is a surface, it follows that a linear function that approximates $f$ near $(x_0, y_0)$ would have a graph that is a plane.

Just as the tangent line of $f(x)$ at $x_0$ passes through the point $(x_0, f(x_0))$, and has a slope that is equal to $f'(x_0)$, the instantaneous rate of change of $f(x)$ with respect to $x$ at $x_0$, a plane that best approximates $f(x, y)$ at $(x_0, y_0)$ must pass through the point $(x_0, y_0, f(x_0, y_0))$, and the slope of the plane in the $x$- and $y$-directions, respectively, should be equal to the values of $f_x(x_0, y_0)$ and $f_y(x_0, y_0)$.


Since a general linear function of two variables can be described by the formula

\[ L_f(x, y) = A(x - x_0) + B(y - y_0) + C, \]

so that $L_f(x_0, y_0) = C$, and a simple differentiation yields

\[ \frac{\partial L_f}{\partial x} = A, \qquad \frac{\partial L_f}{\partial y} = B, \]

we conclude that the linear function that best approximates $f(x, y)$ near $(x_0, y_0)$ is the linear approximation

\[ L_f(x, y) = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0). \]

Furthermore, the graph of this function is called the tangent plane of $f(x, y)$ at $(x_0, y_0)$. Its equation is

\[ z - z_0 = \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0), \]

where $z_0 = f(x_0, y_0)$.

Example Let $f(x, y) = 2x^2 y + 3y^2$, and let $(x_0, y_0) = (1, 1)$. Then $f(x_0, y_0) = 5$, and the first partial derivatives at $(x_0, y_0)$ are

\[ f_x(1, 1) = 4xy \big|_{x=1, y=1} = 4, \qquad f_y(1, 1) = 2x^2 + 6y \big|_{x=1, y=1} = 8. \]

It follows that the tangent plane at $(1, 1)$ has the equation

\[ z - 5 = 4(x - 1) + 8(y - 1), \]

and the linearization of $f$ at $(1, 1)$ is

\[ L_f(x, y) = 5 + 4(x - 1) + 8(y - 1). \]

Let $(x, y) = (1.1, 1.1)$. Then $f(x, y) = 6.292$, while $L_f(x, y) = 6.2$, for an error of $6.292 - 6.2 = 0.092$. However, if $(x, y) = (1.01, 1.01)$, then $f(x, y) = 5.120902$, while $L_f(x, y) = 5.12$, for an error of $5.120902 - 5.12 = 0.000902$. That is, moving 10 times as close to $(1, 1)$ decreased the error by a factor of over 100. □

Another useful application of a linear approximation is to estimate the error in the value of a function, given estimates of error in its inputs. Given a function $z = f(x, y)$, and its linearization $L_f(x, y)$ around a point $(x_0, y_0)$, if $x_0$ and $y_0$ are measured values and $dx = x - x_0$ and $dy = y - y_0$ are


regarded as errors in $x_0$ and $y_0$, then the error in $z$ can be estimated by computing

\begin{align*}
dz = z - z_0 &= L_f(x, y) - f(x_0, y_0) \\
&= [f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)] - f(x_0, y_0) \\
&= f_x(x_0, y_0)\,dx + f_y(x_0, y_0)\,dy.
\end{align*}

The variables $dx$ and $dy$ are called differentials, and $dz$ is called the total differential, as it depends on the values of $dx$ and $dy$. The total differential $dz$ is only an estimate of the error in $z$; the actual error is given by $\Delta z = f(x, y) - f(x_0, y_0)$, when the actual errors in $x$ and $y$, $\Delta x = x - x_0$ and $\Delta y = y - y_0$, are known. Since this is rarely the case in practice, one instead estimates the error in $z$ from estimates $dx$ and $dy$ of the errors in $x$ and $y$.

Example Recall that the volume of a cylinder with radius $r$ and height $h$ is $V = \pi r^2 h$. Suppose that $r = 5$ cm and $h = 10$ cm. Then the volume is $V = 250\pi$ cm$^3$. If the measurement error in $r$ and $h$ is at most 0.1 cm, then, to estimate the error in the computed volume, we first compute

\[ V_r = 2\pi r h = 100\pi, \qquad V_h = \pi r^2 = 25\pi. \]

It follows that the error in $V$ is approximately

\[ dV = V_r\,dr + V_h\,dh = 0.1(100\pi + 25\pi) = 12.5\pi \text{ cm}^3. \]

If we specify $\Delta r = 0.1$ and $\Delta h = 0.1$, and compute the actual volume using radius $r + \Delta r = 5.1$ and height $h + \Delta h = 10.1$, we obtain

\[ V + \Delta V = \pi(5.1)^2(10.1) = 262.701\pi \text{ cm}^3, \]

which yields the actual error

\[ \Delta V = 262.701\pi - 250\pi = 12.701\pi \text{ cm}^3. \]

Therefore, the estimate of the error, $dV$, is quite accurate. □

1.5.2 Functions of More than Two Variables

The concepts of a tangent plane and linear approximation generalize to more than two variables in a straightforward manner. Specifically, given $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ and $\mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}) \in D$, we define the tangent


space of $f(x_1, x_2, \ldots, x_n)$ at $\mathbf{p}_0$ to be the $n$-dimensional hyperplane in $\mathbb{R}^{n+1}$ whose points $(x_1, x_2, \ldots, x_n, y)$ satisfy the equation

\[ y - y_0 = \frac{\partial f}{\partial x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + \frac{\partial f}{\partial x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}), \]

where $y_0 = f(\mathbf{p}_0)$. Similarly, the linearization of $f$ at $\mathbf{p}_0$ is the function $L_f(x_1, x_2, \ldots, x_n)$ defined by

\[ L_f(x_1, x_2, \ldots, x_n) = y_0 + \frac{\partial f}{\partial x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + \frac{\partial f}{\partial x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}). \]

1.5.3 The Gradient Vector

It can be seen from the above definitions that writing formulas that involve the partial derivatives of functions of $n$ variables can be cumbersome. This can be addressed by expressing collections of partial derivatives of functions of several variables using vectors and matrices, especially for vector-valued functions of several variables.

By convention, a point $\mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)})$, which can be identified with the position vector $\mathbf{p}_0 = \langle x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)} \rangle$, is considered to be a column vector

\[ \mathbf{p}_0 = \begin{bmatrix} x_1^{(0)} \\ x_2^{(0)} \\ \vdots \\ x_n^{(0)} \end{bmatrix}. \]

Also, by convention, given a function of $n$ variables, $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$, the collection of its partial derivatives with respect to all of its variables is written as a row vector

\[ \nabla f(\mathbf{p}_0) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f}{\partial x_n}(\mathbf{p}_0) \end{bmatrix}. \]

This vector is called the gradient of $f$ at $\mathbf{p}_0$.

Viewing the partial derivatives of $f$ as a vector allows us to use vector operations to describe, much more concisely, the linearization of $f$. Specifically, the linearization of $f$ at $\mathbf{p}_0$, evaluated at a point $\mathbf{p} = (x_1, x_2, \ldots, x_n)$, can be written as

\begin{align*}
L_f(\mathbf{p}) &= f(\mathbf{p}_0) + \frac{\partial f}{\partial x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + \frac{\partial f}{\partial x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}) \\
&= f(\mathbf{p}_0) + \sum_{i=1}^n \frac{\partial f}{\partial x_i}(\mathbf{p}_0)(x_i - x_i^{(0)}) \\
&= f(\mathbf{p}_0) + \nabla f(\mathbf{p}_0) \cdot (\mathbf{p} - \mathbf{p}_0),
\end{align*}

where $\nabla f(\mathbf{p}_0) \cdot (\mathbf{p} - \mathbf{p}_0)$ is the dot product, also known as the inner product, of the vectors $\nabla f(\mathbf{p}_0)$ and $\mathbf{p} - \mathbf{p}_0$. Recall that given two vectors $\mathbf{u} = \langle u_1, u_2, \ldots, u_n \rangle$ and $\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle$, the dot product of $\mathbf{u}$ and $\mathbf{v}$, denoted by $\mathbf{u} \cdot \mathbf{v}$, is defined by

\[ \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^n u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \|\mathbf{u}\| \|\mathbf{v}\| \cos\theta, \]

where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$.

Example Let $f : \mathbb{R}^3 \to \mathbb{R}$ be defined by

\[ f(x, y, z) = 3x^2 y^3 z^4. \]

Then

\[ \nabla f(x, y, z) = \begin{bmatrix} f_x & f_y & f_z \end{bmatrix} = \begin{bmatrix} 6xy^3z^4 & 9x^2y^2z^4 & 12x^2y^3z^3 \end{bmatrix}. \]

Let $(x_0, y_0, z_0) = (1, 2, -1)$. Then

\begin{align*}
\nabla f(x_0, y_0, z_0) &= \nabla f(1, 2, -1) \\
&= \begin{bmatrix} f_x(1, 2, -1) & f_y(1, 2, -1) & f_z(1, 2, -1) \end{bmatrix} \\
&= \begin{bmatrix} 48 & 36 & -96 \end{bmatrix}.
\end{align*}

It follows that the linearization of $f$ at $(x_0, y_0, z_0)$ is

\begin{align*}
L_f(x, y, z) &= f(1, 2, -1) + \nabla f(1, 2, -1) \cdot \langle x - 1, y - 2, z + 1 \rangle \\
&= 24 + \langle 48, 36, -96 \rangle \cdot \langle x - 1, y - 2, z + 1 \rangle \\
&= 24 + 48(x - 1) + 36(y - 2) - 96(z + 1) \\
&= 48x + 36y - 96z - 192.
\end{align*}

At the point $(1.1, 1.9, -1.1)$, we have $f(1.1, 1.9, -1.1) \approx 36.5$, while $L_f(1.1, 1.9, -1.1) = 34.8$. Because $f$ is changing rapidly in all coordinate directions at $(1, 2, -1)$, it is not surprising that the linearization of $f$ at this point is not highly accurate. □
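The arithmetic in the preceding example can be reproduced with a short Python sketch. The helper name `linearize` is an assumption made for this illustration and does not come from the notes; the gradient is entered by hand exactly as computed in the example.

```python
# Linearization L_f(p) = f(p0) + grad f(p0) . (p - p0)
# for the example f(x, y, z) = 3 x^2 y^3 z^4 at p0 = (1, 2, -1).

def f(x, y, z):
    return 3 * x**2 * y**3 * z**4

def grad_f(x, y, z):
    # Partial derivatives computed by hand, as in the example.
    return (6 * x * y**3 * z**4,
            9 * x**2 * y**2 * z**4,
            12 * x**2 * y**3 * z**3)

def linearize(func, grad, p0):
    """Return the function L(p) = func(p0) + grad(p0) . (p - p0).

    Hypothetical helper written for this sketch.
    """
    f0 = func(*p0)
    g0 = grad(*p0)

    def L(*p):
        return f0 + sum(gi * (pi - p0i) for gi, pi, p0i in zip(g0, p, p0))

    return L

p0 = (1.0, 2.0, -1.0)
L_f = linearize(f, grad_f, p0)

p = (1.1, 1.9, -1.1)
exact = f(*p)      # value of f at the nearby point
approx = L_f(*p)   # value of the linearization there

print("gradient at p0:", grad_f(*p0))
print("f(p)   =", exact)
print("L_f(p) =", approx)
print("error  =", exact - approx)
```

The same `linearize` helper works for any number of variables, since it only relies on the dot-product form of the linearization.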


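1.5.4 The Jacobian Matrix

Now, let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$ be a vector-valued function of $n$ variables, with component functions

\[ \mathbf{f}(\mathbf{p}) = \begin{bmatrix} f_1(\mathbf{p}) \\ f_2(\mathbf{p}) \\ \vdots \\ f_m(\mathbf{p}) \end{bmatrix}, \]

where each $f_i : D \to \mathbb{R}$. Combining the two conventions described above, the partial derivatives of these component functions at a point $\mathbf{p}_0 \in D$ are arranged in an $m \times n$ matrix

\[ J_{\mathbf{f}}(\mathbf{p}_0) = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f_1}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\mathbf{p}_0) \\
\dfrac{\partial f_2}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f_2}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f_2}{\partial x_n}(\mathbf{p}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f_m}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\mathbf{p}_0)
\end{bmatrix}. \]

This matrix is called the Jacobian matrix of $\mathbf{f}$ at $\mathbf{p}_0$. It is also referred to as the derivative of $\mathbf{f}$ at $\mathbf{p}_0$, since it reduces to the scalar $f'(x_0)$ when $f$ is a scalar-valued function of one variable. Note that rows of $J_{\mathbf{f}}(\mathbf{p}_0)$ correspond to component functions, and columns correspond to independent variables. This allows us to view $J_{\mathbf{f}}(\mathbf{p}_0)$ as the following collections of rows or columns:

\[ J_{\mathbf{f}}(\mathbf{p}_0) = \begin{bmatrix} \nabla f_1(\mathbf{p}_0) \\ \nabla f_2(\mathbf{p}_0) \\ \vdots \\ \nabla f_m(\mathbf{p}_0) \end{bmatrix} = \begin{bmatrix} \dfrac{\partial \mathbf{f}}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial \mathbf{f}}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial \mathbf{f}}{\partial x_n}(\mathbf{p}_0) \end{bmatrix}. \]

The Jacobian matrix provides a concise way of describing the linearization of a vector-valued function, just as the gradient does for a scalar-valued function. The linearization of $\mathbf{f}$ at $\mathbf{p}_0$ is the function $L_{\mathbf{f}}(\mathbf{p})$, defined by

\begin{align*}
L_{\mathbf{f}}(\mathbf{p}) &= \begin{bmatrix} f_1(\mathbf{p}_0) \\ f_2(\mathbf{p}_0) \\ \vdots \\ f_m(\mathbf{p}_0) \end{bmatrix}
+ \begin{bmatrix} \frac{\partial f_1}{\partial x_1}(\mathbf{p}_0) \\ \frac{\partial f_2}{\partial x_1}(\mathbf{p}_0) \\ \vdots \\ \frac{\partial f_m}{\partial x_1}(\mathbf{p}_0) \end{bmatrix} (x_1 - x_1^{(0)}) + \cdots
+ \begin{bmatrix} \frac{\partial f_1}{\partial x_n}(\mathbf{p}_0) \\ \frac{\partial f_2}{\partial x_n}(\mathbf{p}_0) \\ \vdots \\ \frac{\partial f_m}{\partial x_n}(\mathbf{p}_0) \end{bmatrix} (x_n - x_n^{(0)}) \\
&= \mathbf{f}(\mathbf{p}_0) + \sum_{j=1}^n \frac{\partial \mathbf{f}}{\partial x_j}(\mathbf{p}_0)(x_j - x_j^{(0)}) \\
&= \mathbf{f}(\mathbf{p}_0) + J_{\mathbf{f}}(\mathbf{p}_0)(\mathbf{p} - \mathbf{p}_0),
\end{align*}

where the expression $J_{\mathbf{f}}(\mathbf{p}_0)(\mathbf{p} - \mathbf{p}_0)$ involves matrix multiplication of the matrix $J_{\mathbf{f}}(\mathbf{p}_0)$ and the vector $\mathbf{p} - \mathbf{p}_0$. Note the similarity between this definition, and the definition of the linearization of a function of a single variable.

In general, given an $m \times n$ matrix $A$; that is, a matrix $A$ with $m$ rows and $n$ columns, and an $n \times p$ matrix $B$, the product $AB$ is the $m \times p$ matrix $C$, where the entry in row $i$ and column $j$ of $C$ is obtained by computing the dot product of row $i$ of $A$ and column $j$ of $B$. When computing the linearization of a vector-valued function $\mathbf{f}$ at the point $\mathbf{p}_0$ in its domain, the $i$th component function of the linearization is obtained by adding the value of the $i$th component function at $\mathbf{p}_0$, $f_i(\mathbf{p}_0)$, to the dot product of $\nabla f_i(\mathbf{p}_0)$ and the vector $\mathbf{p} - \mathbf{p}_0$, where $\mathbf{p}$ is the vector at which the linearization is to be evaluated.

Example Let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by

\[ \mathbf{f}(x, y) = \begin{bmatrix} f_1(x, y) \\ f_2(x, y) \end{bmatrix} = \begin{bmatrix} e^x \cos y \\ e^{-2x} \sin y \end{bmatrix}. \]

Then the Jacobian matrix, or derivative, of $\mathbf{f}$ is the $2 \times 2$ matrix

\[ J_{\mathbf{f}}(x, y) = \begin{bmatrix} \nabla f_1(x, y) \\ \nabla f_2(x, y) \end{bmatrix} = \begin{bmatrix} (f_1)_x & (f_1)_y \\ (f_2)_x & (f_2)_y \end{bmatrix} = \begin{bmatrix} e^x \cos y & -e^x \sin y \\ -2e^{-2x} \sin y & e^{-2x} \cos y \end{bmatrix}. \]

Let $(x_0, y_0) = (0, \pi/4)$. Then we have

\[ J_{\mathbf{f}}(x_0, y_0) = \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ -\sqrt{2} & \frac{\sqrt{2}}{2} \end{bmatrix}, \]

and the linearization of $\mathbf{f}$ at $(x_0, y_0)$ is

\begin{align*}
L_{\mathbf{f}}(x, y) &= \begin{bmatrix} f_1(x_0, y_0) \\ f_2(x_0, y_0) \end{bmatrix} + J_{\mathbf{f}}(x_0, y_0) \begin{bmatrix} x - x_0 \\ y - y_0 \end{bmatrix} \\
&= \begin{bmatrix} \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix} + \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ -\sqrt{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} x - 0 \\ y - \frac{\pi}{4} \end{bmatrix} \\
&= \begin{bmatrix} \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} x - \frac{\sqrt{2}}{2}\left( y - \frac{\pi}{4} \right) \\ \frac{\sqrt{2}}{2} - \sqrt{2}\,x + \frac{\sqrt{2}}{2}\left( y - \frac{\pi}{4} \right) \end{bmatrix}.
\end{align*}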


At the point $(x_1, y_1) = (0.1, 0.8)$, we have

\[ \mathbf{f}(x_1, y_1) \approx \begin{bmatrix} 0.76998 \\ 0.58732 \end{bmatrix}, \qquad L_{\mathbf{f}}(x_1, y_1) \approx \begin{bmatrix} 0.76749 \\ 0.57601 \end{bmatrix}. \]

Because of the relatively small partial derivatives at $(x_0, y_0)$, the linearization at this point yields a fairly accurate approximation at $(x_1, y_1)$. □

1.5.5 Differentiability

Before using a linearization to approximate a function near a point $\mathbf{p}_0$, it is helpful to know whether this linearization is actually an accurate approximation of the function in the first place. That is, we need to know if the function is differentiable at $\mathbf{p}_0$, which, informally, means that its instantaneous rate of change at $\mathbf{p}_0$ is well-defined. In the single-variable case, a function $f(x)$ is differentiable at $x_0$ if $f'(x_0)$ exists; that is, if the limit

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

exists. In other words, we must have

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{x - x_0} = 0. \]

But $f(x_0) + f'(x_0)(x - x_0)$ is just the linearization of $f$ at $x_0$, so we can say that $f$ is differentiable at $x_0$ if and only if

\[ \lim_{x \to x_0} \frac{f(x) - L_f(x)}{x - x_0} = 0. \]

Note that this is a stronger statement than simply requiring that

\[ \lim_{x \to x_0} \left[ f(x) - L_f(x) \right] = 0, \]

because as $x$ approaches $x_0$, $|1/(x - x_0)|$ approaches $\infty$, so the difference $f(x) - L_f(x)$ must approach zero particularly rapidly in order for the fraction $[f(x) - L_f(x)]/(x - x_0)$ to approach zero. That is, the linearization must be a sufficiently accurate approximation of $f$ near $x_0$ for this to be the case, in order for $f$ to be differentiable at $x_0$.

This notion of differentiability is readily generalized to functions of several variables. Given $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and $\mathbf{p}_0 \in D$, we say that $\mathbf{f}$ is differentiable at $\mathbf{p}_0$ if

\[ \lim_{\mathbf{p} \to \mathbf{p}_0} \frac{\|\mathbf{f}(\mathbf{p}) - L_{\mathbf{f}}(\mathbf{p})\|}{\|\mathbf{p} - \mathbf{p}_0\|} = 0, \]


where $L_{\mathbf{f}}(\mathbf{p})$ is the linearization of $\mathbf{f}$ at $\mathbf{p}_0$.

Example Let $f(x, y) = x^2 y$. To verify that this function is differentiable at $(x_0, y_0) = (1, 1)$, we first compute $f_x = 2xy$ and $f_y = x^2$. It follows that the linearization of $f$ at $(1, 1)$ is

\begin{align*}
L_f(x, y) &= f(1, 1) + f_x(1, 1)(x - 1) + f_y(1, 1)(y - 1) \\
&= 1 + 2(x - 1) + (y - 1) = 2x + y - 2.
\end{align*}

Therefore, $f$ is differentiable at $(1, 1)$ if

\[ \lim_{(x,y) \to (1,1)} \frac{|x^2 y - (2x + y - 2)|}{\|(x, y) - (1, 1)\|} = \lim_{(x,y) \to (1,1)} \frac{|x^2 y - (2x + y - 2)|}{\sqrt{(x - 1)^2 + (y - 1)^2}} = 0. \]

By rewriting this expression as

\[ \frac{|x^2 y - (2x + y - 2)|}{\sqrt{(x - 1)^2 + (y - 1)^2}} = \frac{|x - 1|\,|y(x + 1) - 2|}{\sqrt{(x - 1)^2 + (y - 1)^2}}, \]

and noting that

\[ \lim_{(x,y) \to (1,1)} |y(x + 1) - 2| = 0, \qquad 0 \le \frac{|x - 1|}{\sqrt{(x - 1)^2 + (y - 1)^2}} \le 1, \]

we conclude that the limit actually is zero, and therefore $f$ is differentiable at $(1, 1)$. □

There are three important conclusions that we can make regarding differentiable functions:

• If all first partial derivatives of $f$ exist near $\mathbf{p}_0$ and are continuous at $\mathbf{p}_0$, then $f$ is differentiable at $\mathbf{p}_0$.

• Furthermore, if $f$ is differentiable at $\mathbf{p}_0$, then it is continuous at $\mathbf{p}_0$. Note that the converse is not true; for example, $f(x) = |x|$ is continuous at $x = 0$, but it is not differentiable there, because $f'(0)$ does not exist.

• If $f$ is differentiable at $\mathbf{p}_0$, then its first partial derivatives exist at $\mathbf{p}_0$. This statement might seem redundant, because the first partial derivatives are used in the definition of the linearization, but it is important nonetheless, because the converse of this statement is not true. That is, if a function's first partial derivatives exist at a point, it is not necessarily differentiable at that point.
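The limit in the differentiability example above, for $f(x, y) = x^2 y$ at $(1, 1)$, can be illustrated numerically. The following Python sketch evaluates the ratio $|f(\mathbf{p}) - L_f(\mathbf{p})| / \|\mathbf{p} - \mathbf{p}_0\|$ at points approaching $(1, 1)$. The choice of approaching along the diagonal direction is an assumption made for this illustration; sampling one direction only suggests the behavior of the limit and is not a proof.

```python
import math

def f(x, y):
    return x**2 * y

def L_f(x, y):
    # Linearization of f at (1, 1), taken from the example: 2x + y - 2.
    return 2 * x + y - 2

def ratio(x, y):
    """|f(p) - L_f(p)| / ||p - p0|| with p0 = (1, 1)."""
    return abs(f(x, y) - L_f(x, y)) / math.hypot(x - 1, y - 1)

# Approach (1, 1) along the direction (1, 1) with a shrinking step t
# (an illustrative choice of path).
steps = [10.0**(-k) for k in range(1, 6)]
ratios = [ratio(1 + t, 1 + t) for t in steps]

for t, r in zip(steps, ratios):
    print(f"t = {t:.0e}   ratio = {r:.3e}")
```

Along this path the ratio equals $(3t + t^2)/\sqrt{2}$, so it shrinks roughly in proportion to $t$, which is consistent with the limit being zero.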


The notion of differentiability is related to not only partial derivatives, which only describe how a function changes as one of its variables changes, but also the instantaneous rate of change of a function as its variables change along any direction. If a function is differentiable at a point, that means its rate of change along any direction is well-defined. We will explore this idea further later in this chapter.

1.6 The Chain Rule

Recall from single-variable calculus that if a function $g(x)$ is differentiable at $x_0$, and $f(x)$ is differentiable at $g(x_0)$, then the derivative of the composition $(f \circ g)(x) = f(g(x))$ is given by the Chain Rule

\[ (f \circ g)'(x_0) = f'(g(x_0))\,g'(x_0). \]

We now generalize the Chain Rule to functions of several variables. Let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and let $\mathbf{g} : U \subseteq \mathbb{R}^p \to D$. That is, the range of $\mathbf{g}$ is contained in the domain of $\mathbf{f}$.

Assume that $\mathbf{g}$ is differentiable at a point $\mathbf{p}_0 \in U$, and that $\mathbf{f}$ is differentiable at the point $\mathbf{q}_0 = \mathbf{g}(\mathbf{p}_0)$. Then, $\mathbf{f}$ has a Jacobian matrix $J_{\mathbf{f}}(\mathbf{q}_0)$, and $\mathbf{g}$ has a Jacobian matrix $J_{\mathbf{g}}(\mathbf{p}_0)$. These matrices contain the first partial derivatives of $\mathbf{f}$ and $\mathbf{g}$ evaluated at $\mathbf{q}_0$ and $\mathbf{p}_0$, respectively.

Then, the Chain Rule states that the derivative of the composition $(\mathbf{f} \circ \mathbf{g}) : U \to \mathbb{R}^m$, defined by $(\mathbf{f} \circ \mathbf{g})(\mathbf{x}) = \mathbf{f}(\mathbf{g}(\mathbf{x}))$, at $\mathbf{p}_0$, is given by the Jacobian matrix

\[ J_{\mathbf{f} \circ \mathbf{g}}(\mathbf{p}_0) = J_{\mathbf{f}}(\mathbf{g}(\mathbf{p}_0))\,J_{\mathbf{g}}(\mathbf{p}_0). \]

That is, the derivative of $\mathbf{f} \circ \mathbf{g}$ at $\mathbf{p}_0$ is the product, in the sense of matrix multiplication, of the derivative of $\mathbf{f}$ at $\mathbf{g}(\mathbf{p}_0)$ and the derivative of $\mathbf{g}$ at $\mathbf{p}_0$. This is entirely analogous to the Chain Rule from single-variable calculus, in which the derivative of $f \circ g$ at $x_0$ is the product of the derivative of $f$ at $g(x_0)$ and the derivative of $g$ at $x_0$.

It follows from the rules of matrix multiplication that the partial derivative of the $i$th component function of $\mathbf{f} \circ \mathbf{g}$ with respect to the variable $x_j$, an independent variable of $\mathbf{g}$, is given by the dot product of the gradient of the $i$th component function of $\mathbf{f}$ with the vector that contains the partial derivatives of the component functions of $\mathbf{g}$ with respect to $x_j$. We now illustrate the application of this general Chain Rule with some examples.

Example Let $f : \mathbb{R}^3 \to \mathbb{R}$ be defined by

\[ f(x, y, z) = e^z \cos 2x \sin 3y, \]


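and let $\mathbf{g} : \mathbb{R} \to \mathbb{R}^3$ be a vector-valued function of one variable defined by

\[ \mathbf{g}(t) = \langle x(t), y(t), z(t) \rangle = \langle 2t, t^2, t^3 \rangle. \]

Then, $f \circ \mathbf{g}$ is a scalar-valued function of $t$,

\[ (f \circ \mathbf{g})(t) = e^{z(t)} \cos 2x(t) \sin 3y(t) = e^{t^3} \cos 4t \sin 3t^2. \]

To compute its derivative with respect to $t$, we first compute

\[ \nabla f = \begin{bmatrix} f_x & f_y & f_z \end{bmatrix} = \begin{bmatrix} -2e^z \sin 2x \sin 3y & 3e^z \cos 2x \cos 3y & e^z \cos 2x \sin 3y \end{bmatrix}, \]

and

\[ \mathbf{g}'(t) = \langle x'(t), y'(t), z'(t) \rangle = \langle 2, 2t, 3t^2 \rangle, \]

and then apply the Chain Rule to obtain

\begin{align*}
\frac{df}{dt} &= \nabla f(x(t), y(t), z(t)) \cdot \mathbf{g}'(t) \\
&= \begin{bmatrix} f_x(x(t), y(t), z(t)) & f_y(x(t), y(t), z(t)) & f_z(x(t), y(t), z(t)) \end{bmatrix} \begin{bmatrix} \frac{dx}{dt} \\ \frac{dy}{dt} \\ \frac{dz}{dt} \end{bmatrix} \\
&= f_x(x(t), y(t), z(t)) \frac{dx}{dt} + f_y(x(t), y(t), z(t)) \frac{dy}{dt} + f_z(x(t), y(t), z(t)) \frac{dz}{dt} \\
&= (-2e^{z(t)} \sin 2x(t) \sin 3y(t))(2) + (3e^{z(t)} \cos 2x(t) \cos 3y(t))(2t) + (e^{z(t)} \cos 2x(t) \sin 3y(t))(3t^2) \\
&= -4e^{t^3} \sin 4t \sin 3t^2 + 6te^{t^3} \cos 4t \cos 3t^2 + 3t^2 e^{t^3} \cos 4t \sin 3t^2.
\end{align*}
□

Example Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by

\[ f(x, y) = x^2 y + xy^2, \]

and let $\mathbf{g} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by

\[ \mathbf{g}(s, t) = \begin{bmatrix} x(s, t) \\ y(s, t) \end{bmatrix} = \begin{bmatrix} 2s + t \\ s - 2t \end{bmatrix}. \]

Then, $f \circ \mathbf{g}$ is a scalar-valued function of $s$ and $t$,

\[ (f \circ \mathbf{g})(s, t) = x(s, t)^2 y(s, t) + x(s, t) y(s, t)^2 = (2s + t)^2 (s - 2t) + (2s + t)(s - 2t)^2. \]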


To compute its gradient, which includes its partial derivatives with respect to $s$ and $t$, we first compute

\[ \nabla f = \begin{bmatrix} f_x & f_y \end{bmatrix} = \begin{bmatrix} 2xy + y^2 & x^2 + 2xy \end{bmatrix}, \]

and

\[ J_{\mathbf{g}}(s, t) = \begin{bmatrix} x_s & x_t \\ y_s & y_t \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & -2 \end{bmatrix}, \]

and then apply the Chain Rule to obtain, writing $x = x(s, t)$ and $y = y(s, t)$,

\begin{align*}
\nabla (f \circ \mathbf{g})(s, t) &= \nabla f(x, y)\,J_{\mathbf{g}}(s, t) \\
&= \begin{bmatrix} f_x(x, y) & f_y(x, y) \end{bmatrix} \begin{bmatrix} x_s & x_t \\ y_s & y_t \end{bmatrix} \\
&= \begin{bmatrix} f_x(x, y) x_s + f_y(x, y) y_s & f_x(x, y) x_t + f_y(x, y) y_t \end{bmatrix} \\
&= \begin{bmatrix} [2xy + y^2](2) + [x^2 + 2xy](1) & [2xy + y^2](1) + [x^2 + 2xy](-2) \end{bmatrix} \\
&= \bigl[\, 4(2s + t)(s - 2t) + 2(s - 2t)^2 + (2s + t)^2 + 2(2s + t)(s - 2t) \\
&\qquad\; 2(2s + t)(s - 2t) + (s - 2t)^2 - 2(2s + t)^2 - 4(2s + t)(s - 2t) \,\bigr].
\end{align*}
□

Example Let $f : \mathbb{R} \to \mathbb{R}$ be defined by

\[ f(x) = x^3 + 2x^2, \]

and let $g : \mathbb{R}^2 \to \mathbb{R}$ be defined by

\[ g(u, v) = \sin u \cos v. \]

Then $f \circ g$ is a scalar-valued function of $u$ and $v$,

\[ (f \circ g)(u, v) = (\sin u \cos v)^3 + 2(\sin u \cos v)^2. \]

To compute its gradient, which includes partial derivatives with respect to $u$ and $v$, we first compute

\[ f'(x) = 3x^2 + 4x, \]

and

\[ \nabla g = \begin{bmatrix} g_u & g_v \end{bmatrix} = \begin{bmatrix} \cos u \cos v & -\sin u \sin v \end{bmatrix}, \]


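and then use the Chain Rule to obtain

\begin{align*}
\nabla (f \circ g)(u, v) &= f'(g(u, v))\,\nabla g(u, v) \\
&= [3(g(u, v))^2 + 4g(u, v)] \begin{bmatrix} \cos u \cos v & -\sin u \sin v \end{bmatrix} \\
&= [3 \sin^2 u \cos^2 v + 4 \sin u \cos v] \begin{bmatrix} \cos u \cos v & -\sin u \sin v \end{bmatrix}.
\end{align*}
□

Example Let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by

\[ \mathbf{f}(x, y) = \begin{bmatrix} f_1(x, y) \\ f_2(x, y) \end{bmatrix} = \begin{bmatrix} x^2 y \\ xy^2 \end{bmatrix}, \]

and let $\mathbf{g} : \mathbb{R} \to \mathbb{R}^2$ be defined by

\[ \mathbf{g}(t) = \langle x(t), y(t) \rangle = \langle \cos t, \sin t \rangle. \]

Then $\mathbf{f} \circ \mathbf{g}$ is a vector-valued function of $t$,

\[ (\mathbf{f} \circ \mathbf{g})(t) = \langle \cos^2 t \sin t, \cos t \sin^2 t \rangle. \]

To compute its derivative with respect to $t$, we first compute

\[ J_{\mathbf{f}}(x, y) = \begin{bmatrix} (f_1)_x & (f_1)_y \\ (f_2)_x & (f_2)_y \end{bmatrix} = \begin{bmatrix} 2xy & x^2 \\ y^2 & 2xy \end{bmatrix}, \]

and $\mathbf{g}'(t) = \langle -\sin t, \cos t \rangle$, and then use the Chain Rule to obtain

\begin{align*}
(\mathbf{f} \circ \mathbf{g})'(t) &= J_{\mathbf{f}}(x(t), y(t))\,\mathbf{g}'(t) \\
&= \begin{bmatrix} (f_1)_x(x(t), y(t)) & (f_1)_y(x(t), y(t)) \\ (f_2)_x(x(t), y(t)) & (f_2)_y(x(t), y(t)) \end{bmatrix} \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} \\
&= \begin{bmatrix} 2x(t)y(t) & x(t)^2 \\ y(t)^2 & 2x(t)y(t) \end{bmatrix} \begin{bmatrix} -\sin t \\ \cos t \end{bmatrix} \\
&= \langle 2 \cos t \sin t(-\sin t) + \cos^2 t(\cos t), \; \sin^2 t(-\sin t) + 2 \cos t \sin t(\cos t) \rangle \\
&= \langle -2 \cos t \sin^2 t + \cos^3 t, \; -\sin^3 t + 2 \cos^2 t \sin t \rangle.
\end{align*}
□

1.6.1 The Implicit Function Theorem

The Chain Rule can also be used to compute partial derivatives of implicitly defined functions in a more convenient way than is provided by implicit differentiation. Let the equation $F(x, y) = 0$ implicitly define $y$ as a differentiable function of $x$. That is, $y = f(x)$ where $F(x, f(x)) = 0$ for $x$ in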


the domain of $f$. If $F$ is differentiable, then, by the Chain Rule, we can differentiate the equation $F(x, y(x)) = 0$ with respect to $x$ and obtain

\[ F_x + F_y \frac{dy}{dx} = 0, \]

which yields

\[ \frac{dy}{dx} = -\frac{F_x}{F_y}. \]

By the Implicit Function Theorem, the equation $F(x, y) = 0$ defines $y$ implicitly as a function of $x$ near $(x_0, y_0)$, where $F(x_0, y_0) = 0$, provided that $F_y(x_0, y_0) \neq 0$ and $F_x$ and $F_y$ are continuous near $(x_0, y_0)$. Under these conditions, we see that $dy/dx$ is defined at $(x_0, y_0)$ as well.

Example Let $F : \mathbb{R}^2 \to \mathbb{R}$ be defined by

\[ F(x, y) = x^2 + y^2 - 4. \]

The equation $F(x, y) = 0$ defines $y$ implicitly as a function of $x$, provided that $F$ satisfies the conditions of the Implicit Function Theorem. We have

\[ F_x = 2x, \qquad F_y = 2y. \]

Since both of these partial derivatives are polynomials, and therefore are continuous on all of $\mathbb{R}^2$, it follows that if $F_y \neq 0$, then $y$ can be implicitly defined as a function of $x$ at a point where $F(x, y) = 0$, and

\[ \frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{x}{y}. \]

For example, at the point $(x, y) = (0, 2)$, $F(x, y) = 0$, and $F_y = 4$. Therefore, $y$ can be implicitly defined as a function of $x$ near this point, and at $x = 0$, we have $dy/dx = 0$. □

More generally, let $F : D \subseteq \mathbb{R}^{n+1} \to \mathbb{R}$, and let $\mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}, y^{(0)}) \in D$ be such that $F(x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}, y^{(0)}) = 0$. In this case, the Implicit Function Theorem states that if $F_y \neq 0$ near $\mathbf{p}_0$, and all first partial derivatives of $F$ are continuous near $\mathbf{p}_0$, then this equation defines $y$ as a function of $x_1, x_2, \ldots, x_n$, and

\[ \frac{\partial y}{\partial x_i} = -\frac{F_{x_i}}{F_y}, \qquad i = 1, 2, \ldots, n. \]
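As a numerical sanity check of the formula $dy/dx = -F_x/F_y$, the circle example above can be compared against a finite-difference derivative of the explicit branch $y = \sqrt{4 - x^2}$ through $(0, 2)$. This Python sketch is only an illustration; the helper names `dy_dx_implicit`, `y_explicit` and `central_difference`, and the step size `h`, are choices made here rather than anything taken from the notes.

```python
import math

def F_x(x, y):
    return 2 * x

def F_y(x, y):
    return 2 * y

def dy_dx_implicit(x, y):
    """dy/dx = -F_x / F_y for F(x, y) = x^2 + y^2 - 4, valid where F_y != 0."""
    return -F_x(x, y) / F_y(x, y)

def y_explicit(x):
    # Upper semicircle: the branch of F(x, y) = 0 passing through (0, 2).
    return math.sqrt(4 - x**2)

def central_difference(g, x, h=1e-6):
    # Symmetric difference quotient approximating g'(x).
    return (g(x + h) - g(x - h)) / (2 * h)

for x in (0.0, 1.0):
    y = y_explicit(x)
    print(f"x = {x}: implicit formula {dy_dx_implicit(x, y):+.6f}, "
          f"finite difference {central_difference(y_explicit, x):+.6f}")
```

The two values agree to within the accuracy of the difference quotient wherever $F_y = 2y \neq 0$; near $y = 0$ the formula breaks down, as the theorem's hypothesis predicts.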


To see why the formula $\partial y/\partial x_i = -F_{x_i}/F_y$ holds, we differentiate the equation $F(x_1, x_2, \ldots, x_n, y) = 0$, with $y$ regarded as a function of $x_1, x_2, \ldots, x_n$, with respect to $x_i$ to obtain the equation

\[ F_{x_i} + F_y \frac{\partial y}{\partial x_i} = 0, \]

where all partial derivatives are evaluated at $\mathbf{p}_0$, and solve for $\partial y/\partial x_i$ at $\mathbf{p}_0$.

Example Let $F : \mathbb{R}^3 \to \mathbb{R}$ be defined by

\[ F(x, y, z) = x^2 z + z^2 y - 2xyz + 1. \]

The equation $F(x, y, z) = 0$ defines $z$ implicitly as a function of $x$ and $y$, provided that $F$ satisfies the conditions of the Implicit Function Theorem. We have

\[ F_x = 2xz - 2yz, \qquad F_y = z^2 - 2xz, \qquad F_z = x^2 + 2yz - 2xy. \]

Since all of these partial derivatives are polynomials, and therefore are continuous on all of $\mathbb{R}^3$, it follows that if $F_z \neq 0$, then $z$ can be implicitly defined as a function of $x$ and $y$ at a point where $F(x, y, z) = 0$, and

\[ z_x = -\frac{F_x}{F_z} = \frac{2yz - 2xz}{x^2 + 2yz - 2xy}, \qquad z_y = -\frac{F_y}{F_z} = \frac{2xz - z^2}{x^2 + 2yz - 2xy}. \]

For example, at the point $(x, y, z) = (1, 0, -1)$, $F(x, y, z) = 0$, and $F_z = 1$. Therefore, $z$ can be implicitly defined as a function of $x$ and $y$ near this point, and at $(x, y) = (1, 0)$, we have $z_x = 2$ and $z_y = -3$. □

We now consider the most general case: let $\mathbf{F} : D \subseteq \mathbb{R}^{n+m} \to \mathbb{R}^m$, and let

\[ \mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}, y_1^{(0)}, y_2^{(0)}, \ldots, y_m^{(0)}) \in D \]

be such that

\[ \mathbf{F}(x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}, y_1^{(0)}, y_2^{(0)}, \ldots, y_m^{(0)}) = \mathbf{0}. \]

If we differentiate this system of equations with respect to $x_i$, we obtain the systems of linear equations

\[ \mathbf{F}_{x_i} + \mathbf{F}_{y_1} \frac{\partial y_1}{\partial x_i} + \mathbf{F}_{y_2} \frac{\partial y_2}{\partial x_i} + \cdots + \mathbf{F}_{y_m} \frac{\partial y_m}{\partial x_i} = \mathbf{0}, \qquad i = 1, 2, \ldots, n, \]

where all partial derivatives are evaluated at $\mathbf{p}_0$.


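To examine the solvability of these systems of equations, we first define $\mathbf{x}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)})$, and denote the component functions of the vector-valued function $\mathbf{F}$ by $\mathbf{F} = \langle F_1, F_2, \ldots, F_m \rangle$. We then define the Jacobian matrices

\[ J_{\mathbf{x},\mathbf{F}}(\mathbf{p}_0) = \begin{bmatrix}
\dfrac{\partial F_1}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial F_1}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_1}{\partial x_n}(\mathbf{p}_0) \\
\dfrac{\partial F_2}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial F_2}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_2}{\partial x_n}(\mathbf{p}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F_m}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial F_m}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_m}{\partial x_n}(\mathbf{p}_0)
\end{bmatrix}, \]

\[ J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0) = \begin{bmatrix}
\dfrac{\partial F_1}{\partial y_1}(\mathbf{p}_0) & \dfrac{\partial F_1}{\partial y_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_1}{\partial y_m}(\mathbf{p}_0) \\
\dfrac{\partial F_2}{\partial y_1}(\mathbf{p}_0) & \dfrac{\partial F_2}{\partial y_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_2}{\partial y_m}(\mathbf{p}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F_m}{\partial y_1}(\mathbf{p}_0) & \dfrac{\partial F_m}{\partial y_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_m}{\partial y_m}(\mathbf{p}_0)
\end{bmatrix}, \]

and

\[ J_{\mathbf{y}}(\mathbf{x}_0) = \begin{bmatrix}
\dfrac{\partial y_1}{\partial x_1}(\mathbf{x}_0) & \dfrac{\partial y_1}{\partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial y_1}{\partial x_n}(\mathbf{x}_0) \\
\dfrac{\partial y_2}{\partial x_1}(\mathbf{x}_0) & \dfrac{\partial y_2}{\partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial y_2}{\partial x_n}(\mathbf{x}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial y_m}{\partial x_1}(\mathbf{x}_0) & \dfrac{\partial y_m}{\partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial y_m}{\partial x_n}(\mathbf{x}_0)
\end{bmatrix}. \]

Then, from our previous differentiation with respect to $x_i$, for each $i = 1, 2, \ldots, n$, we can concisely express our systems of equations as a single system

\[ J_{\mathbf{x},\mathbf{F}}(\mathbf{p}_0) + J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0)\,J_{\mathbf{y}}(\mathbf{x}_0) = 0. \]

If the matrix $J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0)$ is invertible (also called nonsingular), which is the case if and only if its determinant is nonzero, and if all first partial derivatives of $\mathbf{F}$ are continuous near $\mathbf{p}_0$, then the equation $\mathbf{F}(\mathbf{p}) = \mathbf{0}$ implicitly defines $y_1, y_2, \ldots, y_m$ as a function of $x_1, x_2, \ldots, x_n$, and

\[ J_{\mathbf{y}}(\mathbf{x}_0) = -[J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0)]^{-1} J_{\mathbf{x},\mathbf{F}}(\mathbf{p}_0), \]

where $[J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0)]^{-1}$ is the inverse of the matrix $J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0)$.

Example Let $\mathbf{F} : \mathbb{R}^4 \to \mathbb{R}^2$ be defined by

\[ \mathbf{F}(x, y, u, v) = \begin{bmatrix} F_1(x, y, u, v) \\ F_2(x, y, u, v) \end{bmatrix} = \begin{bmatrix} xu + y^2 v \\ x^2 v + yu + 1 \end{bmatrix}. \]

Then the vector equation $\mathbf{F}(x, y, u, v) = \mathbf{0}$ implicitly defines $(u, v)$ as a function of $(x, y)$, provided that $\mathbf{F}$ satisfies the conditions of the Implicit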


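Function Theorem. We will compute the partial derivatives of $u$ and $v$ with respect to $x$ and $y$, at a point that satisfies this equation.

We have

\[ J_{(x,y),\mathbf{F}}(x, y, u, v) = \begin{bmatrix} \frac{\partial F_1}{\partial x} & \frac{\partial F_1}{\partial y} \\ \frac{\partial F_2}{\partial x} & \frac{\partial F_2}{\partial y} \end{bmatrix} = \begin{bmatrix} u & 2yv \\ 2xv & u \end{bmatrix}, \]

\[ J_{(u,v),\mathbf{F}}(x, y, u, v) = \begin{bmatrix} \frac{\partial F_1}{\partial u} & \frac{\partial F_1}{\partial v} \\ \frac{\partial F_2}{\partial u} & \frac{\partial F_2}{\partial v} \end{bmatrix} = \begin{bmatrix} x & y^2 \\ y & x^2 \end{bmatrix}. \]

From the formula for the inverse of a $2 \times 2$ matrix,

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}, \]

we obtain

\begin{align*}
J_{(u,v)}(x, y) &= \begin{bmatrix} u_x & u_y \\ v_x & v_y \end{bmatrix} \\
&= -[J_{(u,v),\mathbf{F}}(x, y, u, v)]^{-1} J_{(x,y),\mathbf{F}}(x, y, u, v) \\
&= -\frac{1}{x^3 - y^3} \begin{bmatrix} x^2 & -y^2 \\ -y & x \end{bmatrix} \begin{bmatrix} u & 2yv \\ 2xv & u \end{bmatrix} \\
&= \frac{1}{y^3 - x^3} \begin{bmatrix} x^2 u - 2xy^2 v & 2x^2 yv - y^2 u \\ 2x^2 v - yu & xu - 2y^2 v \end{bmatrix}.
\end{align*}

These partial derivatives can then be evaluated at any point $(x, y, u, v)$ such that $\mathbf{F}(x, y, u, v) = \mathbf{0}$, such as $(x, y, u, v) = (0, 1, -1, 0)$. Note that the matrix $J_{(u,v),\mathbf{F}}(x, y, u, v)$ is not invertible (that is, it is singular) if its determinant $x^3 - y^3 = 0$; that is, if $x = y$. When this is the case, the hypotheses of the Implicit Function Theorem are not satisfied, so the theorem cannot be used to conclude that $(u, v)$ is implicitly defined as a function of $(x, y)$. □

1.7 Directional Derivatives and the Gradient Vector

Previously, we defined the gradient as the vector of all of the first partial derivatives of a scalar-valued function of several variables. Now, we will learn about how to use the gradient to measure the rate of change of the function with respect to a change of its variables in any direction, as opposed to a change in a single variable. This is extremely useful in applications in which the minimum or maximum value of a function is sought. We will also learn how the gradient can be used to easily describe tangent planes to level surfaces, thus providing an alternative to implicit differentiation or the Chain Rule.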


1.7.1 The Gradient Vector

Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ be a scalar-valued function of $n$ variables $x_1, x_2, \ldots, x_n$. Recall that the vector of its first partial derivatives,

\[ \nabla f = \begin{bmatrix} f_{x_1} & f_{x_2} & \cdots & f_{x_n} \end{bmatrix}, \]

is called the gradient of $f$.

Example Let $f(x, y, z) = e^{-(x^2+y^2)} \cos z$. Then

\[ \nabla f = \begin{bmatrix} -2xe^{-(x^2+y^2)} \cos z & -2ye^{-(x^2+y^2)} \cos z & -e^{-(x^2+y^2)} \sin z \end{bmatrix}. \]

Therefore, at the point $(x_0, y_0, z_0) = (1, 2, \pi/3)$, the gradient is the vector

\begin{align*}
\nabla f(x_0, y_0, z_0) &= \begin{bmatrix} f_x(1, 2, \pi/3) & f_y(1, 2, \pi/3) & f_z(1, 2, \pi/3) \end{bmatrix} \\
&= \left\langle -e^{-5}, \; -2e^{-5}, \; -\frac{\sqrt{3}}{2} e^{-5} \right\rangle.
\end{align*}
□

It should be noted that various differentiation rules from single-variable calculus have direct generalizations to the gradient. Let $u$ and $v$ be differentiable functions defined on $\mathbb{R}^n$. Then, we have:

• Linearity:
\[ \nabla(au + bv) = a\nabla u + b\nabla v, \]
where $a$ and $b$ are constants

• Product Rule:
\[ \nabla(uv) = u\nabla v + v\nabla u \]

• Quotient Rule:
\[ \nabla\left( \frac{u}{v} \right) = \frac{v\nabla u - u\nabla v}{v^2} \]

• Power Rule:
\[ \nabla u^n = nu^{n-1}\nabla u \]
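The Product Rule above can be spot-checked numerically. The Python sketch below writes the example function $e^{-(x^2+y^2)} \cos z$ as a product $uv$ with $u = e^{-(x^2+y^2)}$ and $v = \cos z$, approximates all gradients by central differences, and compares $\nabla(uv)$ with $u\nabla v + v\nabla u$ and with the hand-computed gradient at $(1, 2, \pi/3)$. The helper `num_grad`, the factorization into `u` and `v`, and the step size are choices made for this illustration.

```python
import math

def num_grad(func, p, h=1e-6):
    """Central-difference approximation of the gradient of func at p."""
    grad = []
    for i in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

# Factors of the example function f = u * v (factorization chosen here).
def u(x, y, z):
    return math.exp(-(x**2 + y**2))

def v(x, y, z):
    return math.cos(z)

def uv(x, y, z):
    return u(x, y, z) * v(x, y, z)

p = (1.0, 2.0, math.pi / 3)

lhs = num_grad(uv, p)                      # grad(uv)
gu = num_grad(u, p)
gv = num_grad(v, p)
rhs = [u(*p) * dv + v(*p) * du             # u grad v + v grad u
       for du, dv in zip(gu, gv)]

# Gradient computed by hand in the example.
e5 = math.exp(-5)
exact = [-e5, -2 * e5, -math.sqrt(3) / 2 * e5]

for name, vec in (("grad(uv)", lhs), ("u grad v + v grad u", rhs), ("by hand", exact)):
    print(f"{name:>22}: " + ", ".join(f"{c:+.8f}" for c in vec))
```

All three rows should agree up to the small error of the finite-difference approximation.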


1.7.2 Directional Derivatives

The components of the gradient vector $\nabla f$ represent the instantaneous rates of change of the function $f$ with respect to each of its independent variables. However, in many applications, it is useful to know how $f$ changes as its variables change along any path from a given point. To that end, given $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$, and a unit vector $\mathbf{u} = \langle a, b \rangle \in \mathbb{R}^2$, we define the directional derivative of $f$ at $(x_0, y_0) \in D$ in the direction of $\mathbf{u}$ to be

\[ D_{\mathbf{u}} f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0)}{h}. \]

When $\mathbf{u} = \mathbf{i} = \langle 1, 0 \rangle$, then $D_{\mathbf{u}} f = f_x$, and when $\mathbf{u} = \mathbf{j} = \langle 0, 1 \rangle$, then $D_{\mathbf{u}} f = f_y$. For general $\mathbf{u}$, $D_{\mathbf{u}} f(x_0, y_0)$ represents the instantaneous rate of change of $f$ as $(x, y)$ changes in the direction of $\mathbf{u}$ from the point $(x_0, y_0)$.

Because it is cumbersome to compute a directional derivative using the definition directly, it is desirable to be able to relate the directional derivative to the partial derivatives, which can be computed easily using differentiation rules. We have

\begin{align*}
D_{\mathbf{u}} f(x_0, y_0) &= \lim_{h \to 0} \frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0)}{h} \\
&= \lim_{h \to 0} \frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0 + bh) + f(x_0, y_0 + bh) - f(x_0, y_0)}{h} \\
&= \lim_{h \to 0} \left[ \frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0 + bh)}{h} + \frac{f(x_0, y_0 + bh) - f(x_0, y_0)}{h} \right] \\
&= \lim_{h \to 0} \left[ \frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0 + bh)}{ah}\,a + \frac{f(x_0, y_0 + bh) - f(x_0, y_0)}{bh}\,b \right] \\
&= f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b \\
&= \nabla f(x_0, y_0) \cdot \mathbf{u}.
\end{align*}

That is, the directional derivative in the direction of $\mathbf{u}$ is the dot product of the gradient with $\mathbf{u}$. It can be shown that this is the case for any number of variables: given $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$, and a unit vector $\mathbf{u} \in \mathbb{R}^n$, the directional derivative of $f$ at $\mathbf{x}_0 \in D$ in the direction of $\mathbf{u}$ is given by

\[ D_{\mathbf{u}} f(\mathbf{x}_0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u}. \]
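The relationship between the limit definition and the formula $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ can be illustrated with a small Python sketch. It uses $f(x, y) = x^2 y$ at $(2, -1)$; the unit vector $\mathbf{u} = \langle 0.6, 0.8 \rangle$ and the helper name `difference_quotient` are assumptions made for this illustration.

```python
def f(x, y):
    return x**2 * y

def grad_f(x, y):
    # Partial derivatives f_x = 2xy and f_y = x^2.
    return (2 * x * y, x**2)

def difference_quotient(func, p, u, h):
    """(f(p + h u) - f(p)) / h, the quotient in the definition of D_u f."""
    x0, y0 = p
    a, b = u
    return (func(x0 + a * h, y0 + b * h) - func(x0, y0)) / h

p0 = (2.0, -1.0)
u = (0.6, 0.8)   # unit vector, since 0.6^2 + 0.8^2 = 1 (illustrative choice)

gx, gy = grad_f(*p0)
via_gradient = gx * u[0] + gy * u[1]   # grad f(p0) . u

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"h = {h:.0e}   quotient = {difference_quotient(f, p0, u, h):.6f}")
print("gradient formula:", via_gradient)
```

For this function the quotient equals $0.8 + 1.56h + 0.288h^2$, so it approaches the value $0.8$ given by the gradient formula as $h \to 0$.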


Because the dot product $\mathbf{a} \cdot \mathbf{b}$ can also be defined as

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \|\mathbf{b}\| \cos\theta,$$

where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, the directional derivative can be used to determine the direction along which $f$ increases most rapidly, decreases most rapidly, or does not change at all.

We first note that if $\theta$ is the angle between $\nabla f(\mathbf{x}_0)$ and the unit vector $\mathbf{u}$, then

$$D_{\mathbf{u}} f(\mathbf{x}_0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u} = \|\nabla f(\mathbf{x}_0)\| \cos\theta.$$

Then we have the following:

• When $\theta = 0$, $\cos\theta = 1$, so $D_{\mathbf{u}} f$ is maximized, and its value is $\|\nabla f(\mathbf{x}_0)\|$. In this case,

$$\mathbf{u} = \frac{\nabla f(\mathbf{x}_0)}{\|\nabla f(\mathbf{x}_0)\|},$$

and this is called the direction of steepest ascent.

• When $\theta = \pi$, $\cos\theta = -1$, so $D_{\mathbf{u}} f$ is minimized, and its value is $-\|\nabla f(\mathbf{x}_0)\|$. In this case,

$$\mathbf{u} = -\frac{\nabla f(\mathbf{x}_0)}{\|\nabla f(\mathbf{x}_0)\|},$$

and this is called the direction of steepest descent.

• When $\theta = \pm\pi/2$, $\cos\theta = 0$, so $D_{\mathbf{u}} f = 0$. In this case, $\mathbf{u}$ is a unit vector that is orthogonal (perpendicular) to $\nabla f(\mathbf{x}_0)$. Since $f$ is not changing at all along this direction, it follows that $\mathbf{u}$ is tangent to the level set of $f$ through $\mathbf{x}_0$, on which $f(\mathbf{x}) = f(\mathbf{x}_0)$.

The direction of steepest descent is of particular interest in applications in which the goal is to find the minimum value of $f$. From a starting point $\mathbf{x}_0$, one can choose a new point $\mathbf{x}_1 = \mathbf{x}_0 + \alpha\mathbf{u}$, where $\mathbf{u} = -\nabla f(\mathbf{x}_0)$ points in the direction of steepest descent, by choosing $\alpha$ so as to minimize $f(\mathbf{x}_1)$. Then, this process can be repeated using the direction of steepest descent at $\mathbf{x}_1$, which is $-\nabla f(\mathbf{x}_1)$, to compute a new point $\mathbf{x}_2$, and so on, until a minimum is found. This process is called the method of steepest descent.

While not used very often in practice, it serves as a useful building block for some of the most powerful methods that are used in practice for minimizing functions.
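The method of steepest descent described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a general-purpose implementation: the quadratic $f(x, y) = 6x^2 + 4xy + 8y^2 - x - 3y$ is chosen here as a test function, and because $f$ is quadratic with constant Hessian $H$, the step $\alpha$ that minimizes $f$ along $-\nabla f$ has the closed form $\alpha = (\mathbf{g} \cdot \mathbf{g}) / (\mathbf{g} \cdot H\mathbf{g})$ with $\mathbf{g} = \nabla f$. For a non-quadratic function, $\alpha$ would have to be found by a one-dimensional search instead.

```python
def grad(x, y):
    # Gradient of the assumed test function f(x, y) = 6x^2 + 4xy + 8y^2 - x - 3y
    return (12 * x + 4 * y - 1, 4 * x + 16 * y - 3)

# Constant Hessian of the quadratic test function
H = ((12.0, 4.0), (4.0, 16.0))

def steepest_descent(x, y, tol=1e-12, max_iter=200):
    """Repeat x_{k+1} = x_k - alpha * grad f(x_k) with an exact line search."""
    for k in range(max_iter):
        gx, gy = grad(x, y)
        gg = gx * gx + gy * gy
        if gg < tol**2:              # gradient is (numerically) zero: stop
            return x, y, k
        # H * g, needed for the exact step length on a quadratic
        hx = H[0][0] * gx + H[0][1] * gy
        hy = H[1][0] * gx + H[1][1] * gy
        alpha = gg / (gx * hx + gy * hy)
        x -= alpha * gx
        y -= alpha * gy
    return x, y, max_iter

x_min, y_min, iterations = steepest_descent(0.0, 0.0)
print("minimizer:", (x_min, y_min), "after", iterations, "iterations")
```

For this test function the iterates approach the point $(1/44, 2/11)$, where the gradient vanishes. Each step moves perpendicular to the previous one, which is why the method tends to zigzag on poorly scaled problems.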


Example. Let $f(x, y) = x^2 y + y^3$, and let $(x_0, y_0) = (2, -2)$. Then

$$\nabla f(x, y) = \begin{bmatrix} f_x(x, y) & f_y(x, y) \end{bmatrix} = \begin{bmatrix} 2xy & x^2 + 3y^2 \end{bmatrix},$$

which yields $\nabla f(x_0, y_0) = \langle f_x(2, -2), f_y(2, -2) \rangle = \langle -8, 16 \rangle$. It follows that the direction of steepest ascent is

$$\mathbf{u} = \frac{\nabla f(2, -2)}{\|\nabla f(2, -2)\|} = \frac{\langle -8, 16 \rangle}{\sqrt{(-8)^2 + 16^2}} = \frac{\langle -8, 16 \rangle}{\sqrt{320}} = \frac{\langle -8, 16 \rangle}{8\sqrt{5}} = \left\langle -\frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right\rangle.$$

For this $\mathbf{u}$, we have $D_{\mathbf{u}} f(2, -2) = \|\nabla f(2, -2)\| = 8\sqrt{5}$.

Furthermore, the direction of steepest descent is

$$\mathbf{u} = \left\langle \frac{1}{\sqrt{5}}, -\frac{2}{\sqrt{5}} \right\rangle,$$

and along this direction, we have $D_{\mathbf{u}} f(2, -2) = -\|\nabla f(2, -2)\| = -8\sqrt{5}$.

Finally, the directions along which $f$ does not change at all are those that are orthogonal to the directions of steepest ascent and descent,

$$\mathbf{u} = \pm\left\langle \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right\rangle.$$

The level curve defined by the equation $f(x, y) = f(2, -2) = -16$ proceeds along these directions from the point $(2, -2)$. ✷

1.7.3 Tangent Planes to Level Surfaces

Let $F : D \subseteq \mathbb{R}^3 \to \mathbb{R}$ be a function of three variables $x$, $y$ and $z$ that implicitly defines a surface through the equation $F(x, y, z) = 0$, and let $(x_0, y_0, z_0)$ be a point on that surface. If $F$ satisfies the conditions of the Implicit Function Theorem at $(x_0, y_0, z_0)$, then the equation of the plane that is tangent to the surface at this point can be obtained using the fact that $z$ is implicitly defined as a function of $x$ and $y$ near this point. It then follows that the equation of the tangent plane is

$$z - z_0 = z_x(x_0, y_0)(x - x_0) + z_y(x_0, y_0)(y - y_0),$$

where, by the Chain Rule,

$$z_x(x_0, y_0) = -\frac{F_x(x_0, y_0, z_0)}{F_z(x_0, y_0, z_0)}, \quad z_y(x_0, y_0) = -\frac{F_y(x_0, y_0, z_0)}{F_z(x_0, y_0, z_0)}.$$


This is not possible if $F_z(x_0, y_0, z_0) = 0$, because then the Implicit Function Theorem does not apply.

It would be desirable to be able to obtain the equation of the tangent plane even if $F_z(x_0, y_0, z_0) = 0$, because the level surface still has a tangent plane at that point, provided that $\nabla F(x_0, y_0, z_0) \neq \mathbf{0}$, even if $z$ cannot be implicitly defined as a function of $x$ and $y$. To that end, we note that any direction $\mathbf{u}$ within the tangent plane is parallel to the tangent vector of some curve that lies within the surface and passes through $(x_0, y_0, z_0)$. Because $F(x, y, z) = 0$ on this surface, it follows that $D_{\mathbf{u}} F(x_0, y_0, z_0) = 0$. However, this implies that $\nabla F(x_0, y_0, z_0)$ must be orthogonal to $\mathbf{u}$, in view of

$$D_{\mathbf{u}} F(x_0, y_0, z_0) = \nabla F(x_0, y_0, z_0) \cdot \mathbf{u} = 0.$$

Since this is the case for any direction $\mathbf{u}$ within the tangent plane, we conclude that $\nabla F(x_0, y_0, z_0)$ is normal to the tangent plane, and therefore the equation of this plane is

$$F_x(x_0, y_0, z_0)(x - x_0) + F_y(x_0, y_0, z_0)(y - y_0) + F_z(x_0, y_0, z_0)(z - z_0) = 0.$$

Note that this equation is equivalent to that obtained using the Chain Rule, when $F_z(x_0, y_0, z_0) \neq 0$.

The gradient not only provides the normal vector to the tangent plane, but also the direction numbers of the normal line to the surface at $(x_0, y_0, z_0)$, which is the line that passes through the surface at this point and is perpendicular to the tangent plane. The equation of this line, in parametric form, is

$$x = x_0 + tF_x(x_0, y_0, z_0), \quad y = y_0 + tF_y(x_0, y_0, z_0), \quad z = z_0 + tF_z(x_0, y_0, z_0).$$

Example. Let $F(x, y, z) = x^2 + y^2 + z^2 - 2x - 4y - 4$. Then the equation $F(x, y, z) = 0$ defines a sphere of radius 3 centered at $(1, 2, 0)$. At the point $(x_0, y_0, z_0) = (3, 3, 2)$, we have

$$\nabla F(x_0, y_0, z_0) = \begin{bmatrix} F_x(x_0, y_0, z_0) & F_y(x_0, y_0, z_0) & F_z(x_0, y_0, z_0) \end{bmatrix} = \begin{bmatrix} 2x_0 - 2 & 2y_0 - 4 & 2z_0 \end{bmatrix} = \langle 4, 2, 4 \rangle.$$

It follows that the equation of the plane that is tangent to the sphere at $(3, 3, 2)$ is

$$4(x - 3) + 2(y - 3) + 4(z - 2) = 0,$$

and the equation of the normal line, in parametric form, is

$$\begin{aligned}
x &= x_0 + tF_x(x_0, y_0, z_0) = 3 + 4t, \\
y &= y_0 + tF_y(x_0, y_0, z_0) = 3 + 2t, \\
z &= z_0 + tF_z(x_0, y_0, z_0) = 2 + 4t.
\end{aligned}$$

Equivalently, we can describe the normal line using its symmetric equations,

$$\frac{x - 3}{4} = \frac{y - 3}{2} = \frac{z - 2}{4}.$$

✷

1.8 Maximum and Minimum Values

In single-variable calculus, one learns how to compute maximum and minimum values of a function. We first recall these methods, and then we will learn how to generalize them to functions of several variables.

Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$. A local maximum of a function $f$ is a point $\mathbf{a} \in D$ such that $f(\mathbf{x}) \leq f(\mathbf{a})$ for $\mathbf{x}$ near $\mathbf{a}$. The value $f(\mathbf{a})$ is called a local maximum value. Similarly, $f$ has a local minimum at $\mathbf{a}$ if $f(\mathbf{x}) \geq f(\mathbf{a})$ for $\mathbf{x}$ near $\mathbf{a}$, and the value $f(\mathbf{a})$ is called a local minimum value.

When a function of a single variable, $f(x)$, has a local maximum or minimum at $x = a$, then $a$ must be a critical point of $f$, which means that $f'(a) = 0$, or $f'$ does not exist at $a$ (which is the case if, for example, the graph of $f$ has a sharp corner at $a$). In general, if $f$ is differentiable at a point $\mathbf{a}$, then in order for $\mathbf{a}$ to be a local maximum or minimum of $f$, the rate of change of $f$, as its independent variables change in any direction, must be zero. The only way to ensure this is to require that $\nabla f(\mathbf{a}) = \mathbf{0}$. Therefore, we say that $\mathbf{a}$ is a critical point if $\nabla f(\mathbf{a}) = \mathbf{0}$ or if any partial derivative of $f$ does not exist at $\mathbf{a}$.

Once we have found the critical points of a function, we must determine whether they correspond to local maxima or minima. In the single-variable case, we can use the Second Derivative Test, which states that if $a$ is a critical point of $f$, and $f''(a) > 0$, then $a$ is a local minimum, while if $f''(a) < 0$, $a$ is a local maximum, and if $f''(a) = 0$, the test is inconclusive.

This test is generalized to the multivariable case as follows: first, we form the Hessian, which is the matrix of second partial derivatives at $\mathbf{a}$. If $f$ is a function of $n$ variables, then the Hessian is an $n \times n$ matrix $H$, and the entry in row $i$, column $j$ of $H$ is defined by

$$H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{a}).$$
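The Hessian defined above can be approximated entry by entry with finite differences, which is a convenient way to check hand-computed second partial derivatives. The sketch below is an illustration under stated assumptions: the helper name `hessian`, the step size `h`, and the test function $f(x, y) = x^2 y^3 + x y^4$ at the point $(1, -2)$ are all chosen here for the example. Direct differentiation of that function gives $f_{xx} = 2y^3$, $f_{xy} = 6xy^2 + 4y^3$ and $f_{yy} = 6x^2 y + 12xy^2$.

```python
def f(x, y):
    # Assumed test function for this sketch
    return x**2 * y**3 + x * y**4

def hessian(func, point, h=1e-4):
    # Hypothetical helper. Entry (i, j) uses the central difference
    #   [f(+,+) - f(+,-) - f(-,+) + f(-,-)] / (4 h^2),
    # where the signs are the shifts applied to coordinates i and j.
    # When i == j this reduces to the usual second difference with step 2h.
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(point)
                p[i] += di * h
                p[j] += dj * h
                return func(*p)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

H = hessian(f, (1.0, -2.0))
for row in H:
    print(row)
```

Up to discretization and rounding error, the printed matrix matches the exact Hessian at $(1, -2)$, whose rows are $[-16, -8]$ and $[-8, 36]$, and the two off-diagonal entries agree with each other.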


Because mixed second partial derivatives are equal if they are continuous, it follows that $H$ is a symmetric matrix, meaning that $H_{ij} = H_{ji}$.

We can now state the Second Derivatives Test. If $\mathbf{a}$ is a critical point of $f$, and the Hessian, $H$, is positive definite, then $\mathbf{a}$ is a local minimum of $f$. The notion of a matrix being positive definite is the generalization to matrices of the notion of a positive number. When a matrix $H$ is symmetric, the following statements are all equivalent:

• $H$ is positive definite.

• $\mathbf{x}^T H \mathbf{x} > 0$ for every nonzero column vector $\mathbf{x}$ of real numbers, where $\mathbf{x}^T$ is the transpose of $\mathbf{x}$, which is a row vector.

• The eigenvalues of $H$ are positive.

Furthermore, if $H$ is positive definite, then the following also hold:

• The determinant of $H$ is positive.

• The diagonal entries of $H$, $H_{ii}$ for $i = 1, 2, \ldots, n$, are positive.

These last two properties are necessary but, on their own, not sufficient for positive definiteness. For example, the symmetric matrix with rows $[1\ 2]$ and $[2\ 1]$ has positive diagonal entries, but its eigenvalues are $3$ and $-1$.

On the other hand, if $H$ is negative definite, then $f$ has a local maximum at $\mathbf{a}$. This means that $\mathbf{x}^T H \mathbf{x} < 0$ for any nonzero real vector $\mathbf{x}$, and that the eigenvalues and diagonal entries of $H$ are negative. However, the determinant is not necessarily negative. Because it is equal to the product of the eigenvalues, the determinant is positive if $n$ is even, and negative if $n$ is odd.

If $H$ is indefinite, meaning that it has both positive and negative eigenvalues, then we say that $f$ has a saddle point at $\mathbf{a}$. This means that the graph of $f$ crosses its tangent plane at $\mathbf{a}$, and the term "saddle point" arises from the fact that $f$ is increasing from $\mathbf{a}$ along some directions, but decreasing along others.

Finally, if $H$ is a singular matrix, meaning that one of its eigenvalues, and therefore its determinant, is equal to zero, the test is inconclusive. Therefore, $\mathbf{a}$ could be a local minimum, local maximum, saddle point, or none of the above. One must instead use other information about $f$, such as its directional derivatives, to determine if $f$ has a maximum, minimum or saddle point at $\mathbf{a}$.

Example. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by

$$f(x, y) = 6x^2 + 4xy + 8y^2 - x - 3y.$$


We wish to find any local minima or maxima of this function. First, we compute its gradient,

$$\nabla f = \begin{bmatrix} 12x + 4y - 1 & 4x + 16y - 3 \end{bmatrix}.$$

To determine where $\nabla f = \mathbf{0}$, we must solve the system of linear equations

$$\begin{aligned}
12x + 4y &= 1, \\
4x + 16y &= 3.
\end{aligned}$$

Using the second equation to obtain $x = (3 - 16y)/4$ and substituting this into the first equation, we obtain $y = 2/11$ and $x = 1/44$. Since the solution of this system is unique, it follows that this is the only critical point of $f$.

To determine whether this critical point corresponds to a maximum or minimum, we must compute the Hessian $H$, whose entries are the second partial derivatives of $f$ at $(1/44, 2/11)$. We have

$$H = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 12 & 4 \\ 4 & 16 \end{bmatrix}.$$

To determine whether this matrix is positive definite, we first compute its determinant,

$$\det(H) = f_{xx} f_{yy} - f_{xy}^2 = 12(16) - 4(4) = 176.$$

Since the determinant, which is the product of $H$'s two eigenvalues, is positive, it follows that they must both be the same sign. To determine that sign, we check the trace of $H$, denoted by $\operatorname{tr}(H)$. The trace of a matrix is the sum of its diagonal entries, which is also the sum of the eigenvalues. We have

$$\operatorname{tr}(H) = f_{xx} + f_{yy} = 12 + 16 = 28.$$

Since both eigenvalues are the same sign, and their sum is positive, they must both be positive. Therefore, $H$ is positive definite, and we conclude that $(1/44, 2/11)$ is a local minimum of $f$. ✷

The preceding example describes how the Second Derivatives Test can be performed for a function of two variables:

• If $\det(H) = f_{xx} f_{yy} - f_{xy}^2 > 0$, and $f_{xx} > 0$, then the critical point is a minimum.

• If $\det(H) > 0$ and $f_{xx} < 0$, then the critical point is a maximum.


• If $\det(H) < 0$, then the critical point is a saddle point.

• If $\det(H) = 0$, then the test is inconclusive.

In many applications, it is desirable to know where a function assumes its largest or smallest values, not just among nearby points, but within its entire domain. We say that a function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ has an absolute maximum at $\mathbf{a}$ if $f(\mathbf{a}) \geq f(\mathbf{x})$ for all $\mathbf{x} \in D$, and that $f$ has an absolute minimum at $\mathbf{a}$ if $f(\mathbf{a}) \leq f(\mathbf{x})$ for all $\mathbf{x} \in D$.

In the single-variable case, it is known, by the Extreme Value Theorem, that if $f$ is continuous on a closed interval $[a, b]$, then it has an absolute maximum and an absolute minimum on $[a, b]$. To find them, it is necessary to check all critical points in $[a, b]$, and the endpoints $a$ and $b$, as the absolute maximum and absolute minimum must each occur at one of these points.

The generalization of a closed interval to the multivariable case is the notion of a compact set. Previously, we defined an open set, and a boundary point. A closed set is a set that contains all of its boundary points. A bounded set is a set that is contained entirely within a ball $D_r(\mathbf{x}_0)$ for some choice of $r$ and $\mathbf{x}_0$. Finally, a set is compact if it is closed and bounded.

We can now state the generalization of the Extreme Value Theorem to the multivariable case. It states that a continuous function on a compact set has an absolute minimum and an absolute maximum. Therefore, given such a compact set $D$, to find the absolute maximum and minimum, it is sufficient to check the critical points of $f$ in $D$, and to find the extreme (maximum and minimum) values of $f$ on the boundary. The largest of all of these values is the absolute maximum value, and the smallest is the absolute minimum value.

It should be noted that in cases where $D$ has a simple shape, such as a rectangle, triangle or cube, it is possible to check boundary points by characterizing them using one or more equations, using these equations to eliminate a variable, and then substituting for the eliminated variable in $f$ to obtain a function of one less variable. Then, it is possible to find extreme values on the boundary by solving a maximization or minimization problem in one less dimension.

Example. Consider the function $f(x, y) = x^2 + 3y^2 - 4x - 6y$. We will find the absolute maximum and minimum values of this function on the triangle with vertices $(0, 0)$, $(4, 0)$ and $(0, 3)$.

First, we look for critical points. We have

$$\nabla f = \begin{bmatrix} 2x - 4 & 6y - 6 \end{bmatrix}.$$


We see that there is only one critical point, at $(x_0, y_0) = (2, 1)$. Because the triangle consists of the points that satisfy the inequalities $x \geq 0$, $y \geq 0$ and $y \leq 3 - 3x/4$, and the point $(2, 1)$ satisfies all of these inequalities, we conclude that this point lies within the triangle. It is therefore a candidate for an absolute maximum or minimum.

We now check the boundary, by examining each edge of the triangle individually. On the edge between $(0, 0)$ and $(0, 3)$, we have $x = 0$, which yields $f(0, y) = 3y^2 - 6y$. We then have $f_y(0, y) = 6y - 6$, which has a critical point at $y = 1$. Therefore, $(0, 1)$ is also a candidate for an absolute extremum. Similarly, along the edge between $(0, 0)$ and $(4, 0)$, we have $y = 0$, which yields $f(x, 0) = x^2 - 4x$. We then have $f_x(x, 0) = 2x - 4$, which has a critical point at $x = 2$. Therefore, $(2, 0)$ is a candidate for an absolute extremum.

We then check the edge between $(0, 3)$ and $(4, 0)$, along which $y = 3 - 3x/4$. Substituting this into $f(x, y)$ yields the function

$$g(x) = f\left(x, 3 - \frac{3x}{4}\right) = \frac{43}{16} x^2 + 9 - 13x.$$

To determine the critical points of this function, we solve $g'(x) = 0$, which yields $x = 104/43$. Since $y = 3 - 3x/4$ along this edge, the point $(104/43, 51/43)$ is a candidate for an absolute extremum.

Finally, we must include the vertices of the triangle, because they too are boundary points of the triangle, as well as boundary points of the edges along which we attempted to find extrema of single-variable functions. In all, we have seven candidates: the critical point of $f$, $(2, 1)$, the three critical points found along the edges, $(0, 1)$, $(2, 0)$ and $(104/43, 51/43)$, and the three vertices, $(0, 0)$, $(4, 0)$ and $(0, 3)$. Evaluating $f(x, y)$ at all of these points, we obtain

    x         y         f(x, y)
    2         1         −7
    0         1         −3
    2         0         −4
    104/43    51/43     −289/43
    0         0         0
    4         0         0
    0         3         9

We conclude that the absolute minimum is at $(2, 1)$, and the absolute maximum is at $(0, 3)$. The function is shown in Figure 1.5. ✷
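The candidate table in this example can be reproduced with exact rational arithmetic. The sketch below is only a check of the arithmetic: the use of `fractions.Fraction`, the variable names, and the coarse grid over the triangle (with spacing $1/3$ in $x$ and $1/4$ in $y$) are assumptions made for illustration. It evaluates $f$ at the seven candidates and confirms that no grid point of the triangle beats them.

```python
from fractions import Fraction as Fr

def f(x, y):
    return x**2 + 3 * y**2 - 4 * x - 6 * y

# The seven candidates: interior critical point, edge critical points, vertices
candidates = [
    (Fr(2), Fr(1)),
    (Fr(0), Fr(1)),
    (Fr(2), Fr(0)),
    (Fr(104, 43), Fr(51, 43)),
    (Fr(0), Fr(0)),
    (Fr(4), Fr(0)),
    (Fr(0), Fr(3)),
]
values = {p: f(*p) for p in candidates}
argmin = min(values, key=values.get)
argmax = max(values, key=values.get)

for (x, y), val in values.items():
    print(f"f({x}, {y}) = {val}")
print("absolute minimum at", argmin, "absolute maximum at", argmax)

# Sanity check on a coarse grid covering the triangle x >= 0, y >= 0, y <= 3 - 3x/4
N = 12
grid_values = [
    f(Fr(4 * i, N), Fr(3 * j, N))
    for i in range(N + 1)
    for j in range(N + 1)
    if Fr(3 * j, N) <= 3 - Fr(3, 4) * Fr(4 * i, N)
]
grid_min, grid_max = min(grid_values), max(grid_values)
```

A grid search like this cannot prove optimality, but it is a quick way to catch an arithmetic slip in the candidate list.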


Figure 1.5: The function $f(x, y) = x^2 + 3y^2 - 4x - 6y$ on the triangle with vertices $(0, 0)$, $(4, 0)$ and $(0, 3)$.

Previously, we learned that when seeking a local minimum or maximum of a function of several variables, the Second Derivative Test from single-variable calculus, in which the sign of the second derivative indicated whether a local extremum was a maximum or minimum, generalizes to the Second Derivatives Test, which indicates that a critical point $\mathbf{x}_0$ is a local minimum if the Hessian, the matrix of second partial derivatives, is positive definite at $\mathbf{x}_0$.

We will now use Taylor series to explain why this test is effective. Recall that in single-variable calculus, Taylor's Theorem states that a function $f(x)$ with at least three continuous derivatives at $x_0$ can be written as

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2 + \frac{1}{6} f'''(\xi)(x - x_0)^3,$$

where $\xi$ is between $x$ and $x_0$. In the multivariable case, Taylor's Theorem states that if $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ has continuous third partial derivatives at $\mathbf{x}_0 \in D$, then

$$f(\mathbf{x}) = f(\mathbf{x}_0) + \nabla f(\mathbf{x}_0) \cdot (\mathbf{x} - \mathbf{x}_0) + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0) \cdot H_f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + R_2(\mathbf{x}_0, \mathbf{x}),$$

where $H_f(\mathbf{x}_0)$ is the Hessian, the matrix of second partial derivatives at $\mathbf{x}_0$, defined by

$$H_f(\mathbf{x}_0) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(\mathbf{x}_0) \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_2^2}(\mathbf{x}_0) & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}(\mathbf{x}_0) \\
\vdots & & & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1}(\mathbf{x}_0) & \dfrac{\partial^2 f}{\partial x_n \partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}(\mathbf{x}_0)
\end{bmatrix},$$

and $R_2(\mathbf{x}_0, \mathbf{x})$ is the Taylor remainder, which satisfies

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{R_2(\mathbf{x}_0, \mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|^2} = 0.$$

If we let $\mathbf{x}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)})$, then Taylor's Theorem can be rewritten using summations:

$$f(\mathbf{x}) = f(\mathbf{x}_0) + \sum_{i=1}^n \frac{\partial f}{\partial x_i}(\mathbf{x}_0)(x_i - x_i^{(0)}) + \frac{1}{2} \sum_{i,j=1}^n \frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}_0)(x_i - x_i^{(0)})(x_j - x_j^{(0)}) + R_2(\mathbf{x}_0, \mathbf{x}).$$

Example. Let $f(x, y) = x^2 y^3 + x y^4$, and let $(x_0, y_0) = (1, -2)$. Then, from partial differentiation of $f$, we obtain its gradient

$$\nabla f = \begin{bmatrix} f_x & f_y \end{bmatrix} = \begin{bmatrix} 2xy^3 + y^4 & 3x^2 y^2 + 4xy^3 \end{bmatrix},$$

and its Hessian,

$$H_f(x, y) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 2y^3 & 6xy^2 + 4y^3 \\ 6xy^2 + 4y^3 & 6x^2 y + 12xy^2 \end{bmatrix}.$$

Therefore

$$\nabla f(1, -2) = \begin{bmatrix} 0 & -20 \end{bmatrix}, \quad H_f(1, -2) = \begin{bmatrix} -16 & -8 \\ -8 & 36 \end{bmatrix},$$

and the Taylor expansion of $f$ around $(1, -2)$ is

$$\begin{aligned}
f(x, y) &= f(x_0, y_0) + \nabla f(x_0, y_0) \cdot \langle x - x_0, y - y_0 \rangle + \frac{1}{2} \langle x - x_0, y - y_0 \rangle \cdot H_f(x_0, y_0) \langle x - x_0, y - y_0 \rangle + R_2((x_0, y_0), (x, y)) \\
&= 8 + \begin{bmatrix} 0 & -20 \end{bmatrix} \begin{bmatrix} x - 1 \\ y + 2 \end{bmatrix} + \frac{1}{2} \langle x - 1, y + 2 \rangle \cdot \begin{bmatrix} -16 & -8 \\ -8 & 36 \end{bmatrix} \begin{bmatrix} x - 1 \\ y + 2 \end{bmatrix} + R_2((1, -2), (x, y)) \\
&= 8 - 20(y + 2) - 8(x - 1)^2 - 8(x - 1)(y + 2) + 18(y + 2)^2 + R_2((1, -2), (x, y)).
\end{aligned}$$

The constant, linear and quadratic terms together represent an approximation of $f(x, y)$ by a quadratic function that is valid near the point $(1, -2)$. ✷

Now, suppose that $\mathbf{x}_0$ is a critical point of $f$. If this point is to be a local minimum, then we must have $f(\mathbf{x}) \geq f(\mathbf{x}_0)$ for $\mathbf{x}$ near $\mathbf{x}_0$. Since $\nabla f(\mathbf{x}_0) = \mathbf{0}$, it follows from Taylor's Theorem that, up to the remainder term, we must have

$$(\mathbf{x} - \mathbf{x}_0) \cdot [H_f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)] \geq 0.$$

However, if the Hessian $H_f(\mathbf{x}_0)$ is a positive definite matrix, then, by definition, this expression is actually strictly greater than zero for $\mathbf{x} \neq \mathbf{x}_0$. Therefore, we are assured that $\mathbf{x}_0$ is a local minimum. In fact, $\mathbf{x}_0$ is a strict local minimum, since we can conclude that $f(\mathbf{x}) > f(\mathbf{x}_0)$ for all $\mathbf{x} \neq \mathbf{x}_0$ sufficiently near $\mathbf{x}_0$.

As discussed previously, there are various properties possessed by symmetric positive definite matrices. One other, which provides a relatively straightforward method of checking whether a symmetric matrix is positive definite, is to check whether the determinants of its leading principal submatrices, known as leading principal minors, are positive. Given an $n \times n$ matrix $A$, its leading principal submatrices are the submatrices consisting of its first $k$ rows and columns, for $k = 1, 2, \ldots, n$. Note that checking these determinants is equivalent to the test that we have previously described for determining whether a $2 \times 2$ matrix is positive definite.

Example. Let $f(x, y, z) = x^2 + y^2 + z^2 + xy$. To find any local maxima or minima of this function, we compute its gradient, which is

$$\nabla f(x, y, z) = \begin{bmatrix} 2x + y & 2y + x & 2z \end{bmatrix}.$$


It follows that the only critical point is at $(x_0, y_0, z_0) = (0, 0, 0)$. To perform the Second Derivatives Test, we compute the Hessian of $f$, which is

$$H_f(x, y, z) = \begin{bmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{bmatrix} = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$

To determine whether this matrix is positive definite, we can compute the determinants of the principal submatrices of $H_f(0, 0, 0)$, which are

$$[H_f(0, 0, 0)]_{11} = 2, \quad [H_f(0, 0, 0)]_{1:2,1:2} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \quad [H_f(0, 0, 0)]_{1:3,1:3} = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$

For the principal minors, we have

$$\det([H_f(0, 0, 0)]_{11}) = 2, \quad \det([H_f(0, 0, 0)]_{1:2,1:2}) = 2(2) - 1(1) = 3,$$

$$\det([H_f(0, 0, 0)]_{1:3,1:3}) = 2 \det([H_f(0, 0, 0)]_{1:2,1:2}) = 6.$$

Since all of the principal minors are positive, we conclude that $H_f(0, 0, 0)$ is positive definite, and therefore the critical point is a minimum of $f$. ✷

1.9 Constrained Optimization

Now, we consider the problem of finding the maximum or minimum value of a function $f(\mathbf{x})$, except that the independent variables $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ are subject to one or more constraints. These constraints prevent us from using the standard approach for finding extrema, but the ideas behind the standard approach are still useful for developing an approach to the constrained problem.

We assume that the constraints are equations of the form

$$g_i(\mathbf{x}) = 0, \quad i = 1, 2, \ldots, m,$$

for given functions $g_i(\mathbf{x})$. That is, we may only consider $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ that belong to the intersection of the hypersurfaces (surfaces, when $n = 3$, or curves, when $n = 2$) defined by the $g_i$, when computing a maximum or minimum value of $f$. For conciseness, we rewrite these constraints as a vector equation $\mathbf{g}(\mathbf{x}) = \mathbf{0}$, where $\mathbf{g} : \mathbb{R}^n \to \mathbb{R}^m$ is a vector-valued function with component functions $g_i$, for $i = 1, 2, \ldots, m$.

By Taylor's theorem, we have, for $\mathbf{x}_0 \in \mathbb{R}^n$ at which $\mathbf{g}$ is differentiable,

$$\mathbf{g}(\mathbf{x}) = \mathbf{g}(\mathbf{x}_0) + J_{\mathbf{g}}(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + R_1(\mathbf{x}_0, \mathbf{x}),$$

where $J_{\mathbf{g}}(\mathbf{x}_0)$ is the Jacobian matrix of $\mathbf{g}$ at $\mathbf{x}_0$, consisting of the first partial derivatives of the $g_i$ evaluated at $\mathbf{x}_0$, and $R_1(\mathbf{x}_0, \mathbf{x})$ is the Taylor remainder, which satisfies

$$\lim_{\mathbf{x} \to \mathbf{x}_0} \frac{R_1(\mathbf{x}_0, \mathbf{x})}{\|\mathbf{x} - \mathbf{x}_0\|} = \mathbf{0}.$$

It follows that if $\mathbf{u}$ is a vector belonging to all of the tangent spaces of the hypersurfaces defined by the $g_i$, then, because each $g_i$ must remain constant as $\mathbf{x}$ deviates from $\mathbf{x}_0$ in the direction of $\mathbf{u}$, we must have $J_{\mathbf{g}}(\mathbf{x}_0)\mathbf{u} = \mathbf{0}$. In other words, $\nabla g_i(\mathbf{x}_0) \cdot \mathbf{u} = 0$ for $i = 1, 2, \ldots, m$.

Now, suppose that $\mathbf{x}_0$ is a local minimum of $f(\mathbf{x})$, subject to the constraints $\mathbf{g}(\mathbf{x}) = \mathbf{0}$. Then, $\mathbf{x}_0$ may not necessarily be a critical point of $f$, but $f$ must not change, to first order, along any direction from $\mathbf{x}_0$ that satisfies the constraints. Therefore, we must have $\nabla f(\mathbf{x}_0) \cdot \mathbf{u} = 0$ for any vector $\mathbf{u}$ in the intersection of tangent spaces, at $\mathbf{x}_0$, of the hypersurfaces defined by the constraints.

It follows that if $\mathbf{u}$ is any such vector in this intersection of tangent spaces, and there exist constants $\lambda_1, \lambda_2, \ldots, \lambda_m$ such that

$$\nabla f(\mathbf{x}_0) = \lambda_1 \nabla g_1(\mathbf{x}_0) + \lambda_2 \nabla g_2(\mathbf{x}_0) + \cdots + \lambda_m \nabla g_m(\mathbf{x}_0),$$

then the requirement $\nabla f(\mathbf{x}_0) \cdot \mathbf{u} = 0$ follows directly from the fact that $\nabla g_i(\mathbf{x}_0) \cdot \mathbf{u} = 0$, and therefore $\mathbf{x}_0$ must be a constrained critical point of $f$. The constants $\lambda_1, \lambda_2, \ldots, \lambda_m$ are called Lagrange multipliers.

Example. When $m = 1$; that is, when there is only one constraint, the problem of finding a constrained minimum or maximum reduces to finding a point $\mathbf{x}_0$ in the domain of $f$ such that

$$\nabla f(\mathbf{x}_0) = \lambda \nabla g(\mathbf{x}_0),$$

for a single Lagrange multiplier $\lambda$.

Let $f(x, y) = 4x^2 + 9y^2$. The minimum value of this function is 0, which is attained at $x = y = 0$, but we wish to find the minimum of $f(x, y)$ subject to the constraint $x^2 + y^2 - 2x - 2y = 2$. That is, we must have $g(x, y) = 0$ where $g(x, y) = x^2 + y^2 - 2x - 2y - 2$. To find any points that are candidates for the constrained minimum, we compute the gradients of $f$ and $g$, which are

$$\nabla f = \begin{bmatrix} 8x & 18y \end{bmatrix}, \quad \nabla g = \begin{bmatrix} 2x - 2 & 2y - 2 \end{bmatrix}.$$

In order for the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ to be satisfied, we must have, for some choice of $\lambda$, $x$ and $y$,

$$8x = \lambda(2x - 2), \quad 18y = \lambda(2y - 2).$$

From these equations, we obtain

$$x = \frac{\lambda}{\lambda - 4}, \quad y = \frac{\lambda}{\lambda - 9}.$$

Substituting these into the constraint $x^2 + y^2 - 2x - 2y - 2 = 0$ yields the fourth-degree equation

$$4\lambda^4 - 104\lambda^3 + 867\lambda^2 - 2808\lambda + 2592 = 0.$$

This equation has two real solutions,

$$\lambda_1 = \frac{3}{2}, \quad \lambda_2 \approx 13.6.$$

Substituting these values into the above equations for $x$ and $y$ yields the critical points

$$x_1 = -\frac{3}{5}, \quad y_1 = -\frac{1}{5}, \quad \lambda_1 = \frac{3}{2},$$

$$x_2 \approx 1.416626, \quad y_2 \approx 2.956124, \quad \lambda_2 \approx 13.6.$$

Substituting the $x$ and $y$ values into $f(x, y)$ yields the minimum value of $9/5$ at $(x_1, y_1)$ and the maximum value of approximately $86.675$ at $(x_2, y_2)$. ✷

Example. Let $f(x, y, z) = x + y + z$. We wish to find the extrema of this function subject to the constraints $x^2 + y^2 = 1$ and $2x + z = 1$. That is, we must have $g_1(x, y, z) = g_2(x, y, z) = 0$, where $g_1(x, y, z) = x^2 + y^2 - 1$ and $g_2(x, y, z) = 2x + z - 1$. We must find $\lambda_1$ and $\lambda_2$ such that

$$\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2,$$

or

$$\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} = \lambda_1 \begin{bmatrix} 2x & 2y & 0 \end{bmatrix} + \lambda_2 \begin{bmatrix} 2 & 0 & 1 \end{bmatrix}.$$

This equation, together with the constraints, yields the system of equations

$$\begin{aligned}
1 &= 2x\lambda_1 + 2\lambda_2 \\
1 &= 2y\lambda_1 \\
1 &= \lambda_2 \\
1 &= x^2 + y^2 \\
1 &= 2x + z.
\end{aligned}$$


From the third equation, $\lambda_2 = 1$, which, by the first equation, yields $2x\lambda_1 = -1$. It follows from the second equation that $x = -y$. This, in conjunction with the fourth equation, yields $(x, y) = (1/\sqrt{2}, -1/\sqrt{2})$ or $(x, y) = (-1/\sqrt{2}, 1/\sqrt{2})$. From the fifth equation, we obtain the two critical points

$$(x_1, y_1, z_1) = \left( \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 1 - \sqrt{2} \right), \quad (x_2, y_2, z_2) = \left( -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 1 + \sqrt{2} \right).$$

Substituting these points into $f$ yields $f(x_1, y_1, z_1) = 1 - \sqrt{2}$ and $f(x_2, y_2, z_2) = 1 + \sqrt{2}$, so we conclude that $(x_1, y_1, z_1)$ is a local minimum of $f$ and $(x_2, y_2, z_2)$ is a local maximum of $f$, subject to the constraints $g_1(x, y, z) = g_2(x, y, z) = 0$. ✷

The method of Lagrange multipliers can be used in conjunction with the method of finding unconstrained local maxima and minima in order to find the absolute maximum and minimum of a function on a compact (closed and bounded) set. The basic idea is as follows:

• Find the (unconstrained) critical points of the function, and exclude those that do not belong to the interior of the set.

• Use the method of Lagrange multipliers to find the constrained critical points that lie on the boundary of the set, using equations that characterize the boundary points as constraints. Also, include corners of the boundary, as they represent critical points due to the function, restricted to the boundary, not being differentiable.

• Evaluate the function at all of the constrained and unconstrained critical points. The largest value is the absolute maximum value on the set, and the smallest value is the absolute minimum value on the set.

From a linear algebra point of view, $\nabla f(\mathbf{x}_0)$ must be orthogonal to any vector $\mathbf{u}$ in the null space of $J_{\mathbf{g}}(\mathbf{x}_0)$ (that is, the set consisting of any vector $\mathbf{v}$ such that $J_{\mathbf{g}}(\mathbf{x}_0)\mathbf{v} = \mathbf{0}$), and therefore it must lie in the range of $J_{\mathbf{g}}(\mathbf{x}_0)^T$, the transpose of $J_{\mathbf{g}}(\mathbf{x}_0)$. That is, $\nabla f(\mathbf{x}_0) = J_{\mathbf{g}}(\mathbf{x}_0)^T \boldsymbol{\lambda}$ for some vector $\boldsymbol{\lambda}$, meaning that $\nabla f(\mathbf{x}_0)$ must be a linear combination of the rows of $J_{\mathbf{g}}(\mathbf{x}_0)$ (the columns of $J_{\mathbf{g}}(\mathbf{x}_0)^T$), which are the gradients of the component functions of $\mathbf{g}$ at $\mathbf{x}_0$.

Another way to view the method of Lagrange multipliers is as a modified unconstrained optimization problem. If we define the function $h(\mathbf{x}, \boldsymbol{\lambda})$ by

$$h(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) - \boldsymbol{\lambda} \cdot \mathbf{g}(\mathbf{x}) = f(\mathbf{x}) - \sum_{i=1}^m \lambda_i g_i(\mathbf{x}),$$

then we can find constrained critical points of $f$ by finding unconstrained critical points of $h$, for

$$\nabla h(\mathbf{x}, \boldsymbol{\lambda}) = \begin{bmatrix} \nabla f(\mathbf{x}) - \boldsymbol{\lambda} \cdot J_{\mathbf{g}}(\mathbf{x}) & -\mathbf{g}(\mathbf{x}) \end{bmatrix}.$$

Because all components of the gradient must be equal to zero at a critical point (when the gradient exists), the constraints must be satisfied at a critical point of $h$, and $\nabla f$ must be a linear combination of the $\nabla g_i$, so $f$ is only changing along directions that violate the constraints. Therefore, a critical point is a candidate for a constrained maximum or minimum. Second-order information from the Hessian of $h$, restricted to directions that satisfy the constraints, can then be used to determine if any constrained extremum is a maximum or minimum.
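The single-constraint example in this section, with $f(x, y) = 4x^2 + 9y^2$ and $g(x, y) = x^2 + y^2 - 2x - 2y - 2$, can be checked numerically. The sketch below is an illustration, not the method used in the notes: the bisection helper `bisect` and the bracketing intervals $[0, 2]$ and $[12, 15]$ are assumptions chosen here because the quartic in $\lambda$ changes sign on each of them. It finds the two real roots of the quartic and recovers the candidate points from $x = \lambda/(\lambda - 4)$ and $y = \lambda/(\lambda - 9)$.

```python
def f(x, y):
    return 4 * x**2 + 9 * y**2

def g(x, y):
    return x**2 + y**2 - 2 * x - 2 * y - 2

def quartic(lam):
    # 4 lam^4 - 104 lam^3 + 867 lam^2 - 2808 lam + 2592, in Horner form
    return (((4 * lam - 104) * lam + 867) * lam - 2808) * lam + 2592

def bisect(func, lo, hi, tol=1e-13):
    # Hypothetical helper: plain bisection on an interval with a sign change
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def point_from_multiplier(lam):
    # From 8x = lam (2x - 2) and 18y = lam (2y - 2)
    return lam / (lam - 4), lam / (lam - 9)

lam1 = bisect(quartic, 0.0, 2.0)      # assumed bracket around lambda_1
lam2 = bisect(quartic, 12.0, 15.0)    # assumed bracket around lambda_2
x1, y1 = point_from_multiplier(lam1)
x2, y2 = point_from_multiplier(lam2)

print("lambda_1 =", lam1, "point", (x1, y1), "f =", f(x1, y1), "g =", g(x1, y1))
print("lambda_2 =", lam2, "point", (x2, y2), "f =", f(x2, y2), "g =", g(x2, y2))
```

Both recovered points satisfy the constraint to within rounding error, and the values of $f$ agree with the constrained minimum $9/5$ and the constrained maximum of approximately $86.675$ stated in the example.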


1.10 Appendix: Linear Algebra Concepts

1.10.1 Matrix Multiplication

As we work with Jacobian matrices for vector-valued functions of several variables, matrix multiplication is a highly relevant operation in multivariable calculus. We have previously defined the product of an $m \times n$ matrix $A$ (that is, $A$ has $m$ rows and $n$ columns) and an $n \times p$ matrix $B$ as the $m \times p$ matrix $C = AB$, where the entry in row $i$ and column $j$ of $C$ is the dot product of row $i$ of $A$ and column $j$ of $B$. This can be written using sigma notation as

$$c_{ij} = \sum_{k=1}^n a_{ik} b_{kj}, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, p.$$

Note that the number of columns in $A$ must equal the number of rows in $B$, or the product $AB$ is undefined. Furthermore, in general, even if $A$ and $B$ can be multiplied in either order (that is, if they are square matrices of the same size), $AB$ does not necessarily equal $BA$. In the special case where the matrix $B$ is actually a column vector $\mathbf{x}$ with $n$ components (that is, $p = 1$), it is useful to be able to recognize the summation

$$y_i = \sum_{j=1}^n a_{ij} x_j$$

as the formula for the $i$th component of the vector $\mathbf{y} = A\mathbf{x}$.

Example. Let $A$ be a $3 \times 2$ matrix, and $B$ be a $2 \times 2$ matrix, whose entries are given by

$$A = \begin{bmatrix} 1 & -2 \\ -3 & 4 \\ 5 & -6 \end{bmatrix}, \quad B = \begin{bmatrix} -7 & 8 \\ 9 & -10 \end{bmatrix}.$$

Then, because the number of columns in $A$ is equal to the number of rows in $B$, the product $C = AB$ is defined, and equal to the $3 \times 2$ matrix

$$C = \begin{bmatrix} 1(-7) + (-2)9 & 1(8) + (-2)(-10) \\ (-3)(-7) + 4(9) & (-3)(8) + 4(-10) \\ 5(-7) + (-6)9 & 5(8) + (-6)(-10) \end{bmatrix} = \begin{bmatrix} -25 & 28 \\ 57 & -64 \\ -89 & 100 \end{bmatrix}.$$

Because the number of columns in $B$ is not the same as the number of rows in $A$, it does not make sense to compute the product $BA$. ✷
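The sigma-notation definition of the product translates directly into code. The following minimal sketch assumes matrices stored as lists of rows; the function name `matmul` is hypothetical. It computes $c_{ij} = \sum_k a_{ik} b_{kj}$ and refuses to multiply when the inner dimensions do not match, as with $BA$ in the example.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B, as lists of rows."""
    m, n, p = len(A), len(B), len(B[0])
    # The number of columns of A must equal the number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    # c_ij is the dot product of row i of A with column j of B
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, -2], [-3, 4], [5, -6]]     # 3 x 2
B = [[-7, 8], [9, -10]]             # 2 x 2

C = matmul(A, B)
print(C)

try:
    matmul(B, A)                    # 2 x 2 times 3 x 2: undefined
    product_BA_defined = True
except ValueError:
    product_BA_defined = False
print("BA defined:", product_BA_defined)
```

The first call reproduces the matrix $C$ computed in the example, and the second shows the dimension check rejecting $BA$.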


1.10. APPENDIX: LINEAR ALGEBRA CONCEPTS 67In multivariable calculus, matrix multiplication most commonly ariseswhen applying the Chain Rule, because the Jacobian matrix of the compositionf ◦ g at point x 0 in the domain of g is the product of the Jacobianmatrix of f, evaluated at g(x 0 ), and the Jacobian matrix of g evaluated atx 0 . It follows that the Chain Rule only makes sense when composing functionsf and g such that the number of dependent variables of g (that is, thenumber of rows in its Jacobian matrix) equals the number of independentvariables of f (that is, the number of columns in its Jacobian matrix).Matrix multiplication also arises in Taylor series expansions of multivariablefunctions, because if f : D ⊆ R n → R, then the Taylor expansionof f around x 0 ∈ D involves the dot product of ∇f(x 0 ) with the vectorx − x 0 , which is a multiplication of a 1 × n matrix with an n × 1 matrixto produce a scalar (by convention, the gradient is written as a row vector,while points are written as column vectors). Also, such an expansion involvesthe dot product of x−x 0 with the product of the Hessian matrix, thematrix of second partial derivatives at x 0 , and the vector x − x 0 . Finally, ifg : U ⊆ R n → R m is a vector-valued function of n variables, then the secondterm in its Taylor expansion around x 0 ∈ U is the product of the Jacobianmatrix of g at x 0 and the vector x − x 0 .1.10.2 EigenvaluesPreviously, it was mentioned that the eigenvalues of a matrix that is bothsymmetric, and positive definite, are positive. A scalar λ, which can be realor complex, is an eigenvalue of an n × n matrix A (that is, A has n rowsand n columns) if there exists a nonzero vector x such thatAx = λx.That is, matrix-vector multiplication of A and x reduces to a simple scalingof x by λ. The vector x is called an eigenvector of A corresponding to λ.The eigenvalues of A are roots of the characteristic polynomial det(A −λI), which is a polynomial of degree n in the variable λ. 
Therefore, an n×nmatrix A has n eigenvalues, which may repeat. Although the eigenvaluesof a matrix may be real or complex, even when the matrix is real, theeigenvalues of a real, symmetric matrix, such as the Hessian of any functionwith continuous second partial derivatives, are real.For a general matrix A, det(A), the determinant of A, is the productof all of the eigenvalues of A. The trace of A, denoted by tr(A), which isdefined to be the sum of the diagonal entries of A, is also the sum of theeigenvalues of A. It follows that when A is a 2 × 2 symmetric matrix, the


68 CHAPTER 1. PARTIAL DERIVATIVESdeterminant and trace can be used to easily confirm that the eigenvalues ofA are either both positive, both negative, or of opposite signs. This is thebasis for the Second Derivatives Test for functions of two variables.Example Let A be a symmetric 2 × 2 matrix defined by[ ] 4 −6A =.−6 10Thentr(A) = 4 + 10 = 14, det(A) = 4(10) − (−6)(−6) = 4.It follows that the product and the sum of A’s two eigenvalues are bothpositive. Because A is symmetric, its eigenvalues are also real. Therefore,they must both also be positive, and we can conclude that A is positivedefinite.To actually compute the eigenvalues, we can compute its characteristicpolynomial, which is([ ])4 − λ −6det(A − λI) = det−6 10 − λNote that= (4 − λ)(10 − λ) − (−6)(−6)= λ 2 − 14λ + 4.det(A − λI) = λ 2 − tr(A)λ + det(A),which is true for 2 × 2 matrices in general. To compute the eigenvalues,we use the quadratic formula to compute the roots of this polynomial, andobtainλ = 14 ± √ 14 2 − 4(4)(1)= 7 ± 3 √ 5 ≈ 13.708, 0.292.2(1)If A represented the Hessian of a function f(x, y) at a point (x 0 , y 0 ), and∇f(x 0 , y 0 ) = 0, then f would have a local minimum at (x 0 , y 0 ). ✷1.10.3 The Transpose, Inner Product and Null SpaceThe dot product of two vectors u and v, denoted by u·v, can also be writtenas u T v, where u and v are both column vectors, and u T is the transpose ofu, which converts u into a row vector. In general, the transpose of a matrixA is the matrix A T whose entries are defined by [A T ] ij = [A] ji . That is, inthe transpose, the sense of rows and columns are reversed. The dot product


is also known as an inner product; the outer product of two column vectors $\mathbf{u}$ and $\mathbf{v}$ is $\mathbf{u}\mathbf{v}^T$, which is a matrix, whereas the inner product is a scalar.

Given an $m \times n$ matrix $A$, the null space of $A$ is the set $\mathcal{N}(A)$ of all $n$-vectors $\mathbf{x}$ such that $A\mathbf{x} = \mathbf{0}$. If $\mathbf{x}$ is such a vector, then for any $m$-vector $\mathbf{v}$, $\mathbf{v}^T(A\mathbf{x}) = \mathbf{v}^T\mathbf{0} = 0$. However, because of two properties of the transpose, $(A^T)^T = A$ and $(AB)^T = B^T A^T$, this inner product can be rewritten as $\mathbf{v}^T A \mathbf{x} = \mathbf{v}^T (A^T)^T \mathbf{x} = (A^T \mathbf{v})^T \mathbf{x}$. It follows that any vector in $\mathcal{N}(A)$ is orthogonal to any vector in the range of $A^T$, denoted by $\mathcal{R}(A^T)$, which is the set of all $n$-vectors of the form $A^T \mathbf{v}$, where $\mathbf{v}$ is an $m$-vector. This is the basis for the condition $\nabla f = J_{\mathbf{g}}^T \boldsymbol{\lambda}$ in the method of Lagrange multipliers when there are multiple constraints.

Example Let
\[
A = \begin{bmatrix} 1 & -2 & 4 \\ 1 & 3 & -6 \\ 1 & -5 & 10 \end{bmatrix}.
\]
Then
\[
A^T = \begin{bmatrix} 1 & 1 & 1 \\ -2 & 3 & -5 \\ 4 & -6 & 10 \end{bmatrix}.
\]
The null space of $A$, $\mathcal{N}(A)$, consists of all vectors that are multiples of the vector
\[
\mathbf{v} = \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix},
\]
as it can be verified by matrix-vector multiplication that $A\mathbf{v} = \mathbf{0}$. Now, if we let $\mathbf{w}$ be any vector in $\mathbb{R}^3$, and we compute $\mathbf{u} = A^T \mathbf{w}$, then $\mathbf{v} \cdot \mathbf{u} = \mathbf{v}^T \mathbf{u} = 0$, because
\[
\mathbf{v}^T \mathbf{u} = \mathbf{v}^T A^T \mathbf{w} = (A\mathbf{v})^T \mathbf{w} = \mathbf{0}^T \mathbf{w} = 0.
\]
For example, it can be confirmed directly that $\mathbf{v}$ is orthogonal to each of the columns of $A^T$. ✷
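The orthogonality between the null space of $A$ and the range of $A^T$ in the preceding example can also be checked numerically. The following is a minimal Python sketch using exact integer arithmetic; the helper names `matvec`, `transpose` and `dot`, as well as the test vector `w`, are our own choices for illustration and do not come from the notes.

```python
# Sketch: check that v lies in N(A) and is orthogonal to A^T w.
# Helper names and the vector w are illustrative assumptions.

def matvec(M, x):
    """Multiply the matrix M (a list of rows) by the vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

def transpose(M):
    """Return the transpose of M, so that result[i][j] == M[j][i]."""
    return [list(col) for col in zip(*M)]

def dot(u, v):
    """Inner product u^T v of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, -2, 4],
     [1, 3, -6],
     [1, -5, 10]]
v = [0, 2, 1]      # spans the null space of A in the example
w = [3, -1, 7]     # an arbitrary vector in R^3 (assumed for the test)

Av = matvec(A, v)              # should be the zero vector
u = matvec(transpose(A), w)    # a vector in the range of A^T

print("A v     =", Av)
print("u = A^T w =", u)
print("v . u   =", dot(v, u))
```

Any other choice of `w` should give the same zero inner product, since the argument in the text does not depend on `w`.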




Chapter 2

Multiple Integrals

2.1 Double Integrals over Rectangles

In single-variable calculus, the definite integral of a function $f(x)$ over an interval $[a, b]$ was defined to be
\[
\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^n f(x_i^*)\,\Delta x,
\]
where $\Delta x = (b - a)/n$, and, for each $i$, $x_{i-1} \le x_i^* \le x_i$, where $x_i = a + i\Delta x$.

The purpose of the definite integral is to compute the area of a region with a curved boundary, using the formula for the area of a rectangle. The summation used to define the integral is the sum of the areas of $n$ rectangles, each with width $\Delta x$, and height $f(x_i^*)$, for $i = 1, 2, \ldots, n$. By taking the limit as $n$, the number of rectangles, tends to infinity, we obtain the sum of the areas of infinitely many rectangles of infinitely small width. We define the area of the region bounded by the lines $x = a$, $y = 0$, $x = b$, and the curve $y = f(x)$, to be this limit, if it exists.

Unfortunately, it is too tedious to compute definite integrals using this definition. However, if we define the function $F(x)$ as the definite integral
\[
F(x) = \int_a^x f(s)\,ds,
\]
then we have
\[
F'(x) = \lim_{h \to 0} \frac{1}{h} \left[ \int_a^{x+h} f(s)\,ds - \int_a^x f(s)\,ds \right] = \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(s)\,ds.
\]


Intuitively, as $h \to 0$, this expression converges to the area of a rectangle of width $h$ and height $f(x)$, divided by the width, which is simply the height, $f(x)$. That is, $F'(x) = f(x)$. This leads to the Fundamental Theorem of Calculus, which states that
\[
\int_a^b f(x)\,dx = F(b) - F(a),
\]
where $F$ is an antiderivative of $f$; that is, $F' = f$. Therefore, definite integrals are typically evaluated by attempting to undo the differentiation process to find an antiderivative of the integrand $f(x)$, and then evaluating this antiderivative at $a$ and $b$, the limits of the integral.

Now, let $f(x, y)$ be a function of two variables. We consider the problem of computing the volume of the solid in 3-D space bounded by the surface $z = f(x, y)$, and the planes $x = a$, $x = b$, $y = c$, $y = d$, and $z = 0$, where $a$, $b$, $c$ and $d$ are constants. As before, we divide the interval $[a, b]$ into $n$ subintervals of width $\Delta x = (b - a)/n$, and we similarly divide the interval $[c, d]$ into $m$ subintervals of width $\Delta y = (d - c)/m$. For convenience, we also define $x_i = a + i\Delta x$, and $y_j = c + j\Delta y$.

Then, we can approximate the volume $V$ of this solid by the sum of the volumes of $mn$ boxes. The base of each box is a rectangle with dimensions $\Delta x$ and $\Delta y$, and the height is given by $f(x_i^*, y_j^*)$, where, for each $i$ and $j$, $x_{i-1} \le x_i^* \le x_i$ and $y_{j-1} \le y_j^* \le y_j$. That is,
\[
V \approx \sum_{i=1}^n \sum_{j=1}^m f(x_i^*, y_j^*)\,\Delta y\,\Delta x.
\]
We then obtain the exact volume of this solid by letting the numbers of subintervals, $m$ and $n$, tend to infinity. The result is the double integral of $f(x, y)$ over the rectangle $R = \{(x, y) \mid a \le x \le b,\ c \le y \le d\}$, which is also written as $R = [a, b] \times [c, d]$. The double integral is defined to be
\[
V = \iint_R f(x, y)\,dA = \lim_{m,n \to \infty} \sum_{i=1}^n \sum_{j=1}^m f(x_i^*, y_j^*)\,\Delta y\,\Delta x,
\]
which is equal to the volume of the given solid. The $dA$ corresponds to the quantity $\Delta A = \Delta x\,\Delta y$, and emphasizes the fact that the integral is defined to be the limit of the sum of volumes of boxes, each with a base of area $\Delta A$.

To evaluate double integrals of this form, we can proceed as in the single-variable case, by noting that if $f(x_0, y)$, a function of $y$, is integrable on $[c, d]$


for each $x_0 \in [a, b]$, then we have
\[
\begin{aligned}
\iint_R f(x, y)\,dA &= \lim_{m,n \to \infty} \sum_{i=1}^n \sum_{j=1}^m f(x_i^*, y_j^*)\,\Delta y\,\Delta x \\
&= \lim_{n \to \infty} \sum_{i=1}^n \left[ \lim_{m \to \infty} \sum_{j=1}^m f(x_i^*, y_j^*)\,\Delta y \right] \Delta x \\
&= \lim_{n \to \infty} \sum_{i=1}^n \left[ \int_c^d f(x_i^*, y)\,dy \right] \Delta x \\
&= \int_a^b \int_c^d f(x, y)\,dy\,dx.
\end{aligned}
\]
Similarly, if $f(x, y_0)$, a function of $x$, is integrable on $[a, b]$ for each $y_0 \in [c, d]$, we also have
\[
\iint_R f(x, y)\,dA = \int_c^d \int_a^b f(x, y)\,dx\,dy.
\]
This result is known as Fubini's Theorem, which states that a double integral of a function $f(x, y)$ can be evaluated as two iterated single integrals, provided that $f$ is integrable as a function of either variable when the other variable is held fixed. This is guaranteed if, for instance, $f(x, y)$ is continuous on the entire rectangle $R$.

That is, we can evaluate a double integral by performing partial integration with respect to either variable, $x$ or $y$, which entails applying the Fundamental Theorem of Calculus to integrate $f(x, y)$ with respect to only that variable, while treating the other variable as a constant. The result will be a function of only the other variable, to which the Fundamental Theorem of Calculus can be applied a second time to complete the evaluation of the double integral.

Example Let $R = [0, 1] \times [0, 2]$, and let $f(x, y) = x^2 y + x y^3$. We will use Fubini's Theorem to evaluate
\[
\iint_R f(x, y)\,dy\,dx.
\]
We have
\[
\iint_R f(x, y)\,dy\,dx = \int_0^1 \int_0^2 (x^2 y + x y^3)\,dy\,dx = \int_0^1 \left[ \int_0^2 (x^2 y + x y^3)\,dy \right] dx
\]


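\[
\begin{aligned}
&= \int_0^1 \left[ \int_0^2 x^2 y\,dy + \int_0^2 x y^3\,dy \right] dx \\
&= \int_0^1 \left[ x^2 \int_0^2 y\,dy + x \int_0^2 y^3\,dy \right] dx \\
&= \int_0^1 \left[ x^2 \, \frac{y^2}{2} \Big|_0^2 + x \, \frac{y^4}{4} \Big|_0^2 \right] dx \\
&= \int_0^1 (2x^2 + 4x)\,dx \\
&= \left( \frac{2x^3}{3} + 2x^2 \right) \Big|_0^1 \\
&= \frac{8}{3}.
\end{aligned}
\]
✷

In view of Fubini's Theorem, a double integral is often written as
\[
\iint_R f(x, y)\,dA = \iint_R f(x, y)\,dy\,dx = \iint_R f(x, y)\,dx\,dy.
\]

Example We wish to compute the volume $V$ of the solid bounded by the planes $x = 1$, $x = 4$, $y = 0$, $y = 2$, $z = 0$, and $x + y + z = 8$. The plane that defines the top of this solid is also the graph of the function $z = f(x, y) = 8 - x - y$. It follows that the volume of the solid is given by the double integral
\[
V = \iint_R (8 - x - y)\,dA, \qquad R = [1, 4] \times [0, 2].
\]
Using Fubini's Theorem, we obtain
\[
\begin{aligned}
V &= \iint_R (8 - x - y)\,dA \\
&= \int_1^4 \left[ \int_0^2 (8 - x - y)\,dy \right] dx \\
&= \int_1^4 \left( 8y - xy - \frac{y^2}{2} \right) \Big|_0^2\,dx \\
&= \int_1^4 (14 - 2x)\,dx \\
&= (14x - x^2) \Big|_1^4
\end{aligned}
\]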


\[
\begin{aligned}
&= (56 - 16) - (14 - 1) \\
&= 27.
\end{aligned}
\]
✷

We conclude by noting some useful properties of the double integral, that are direct generalizations of corresponding properties for single integrals:

• Linearity: If $f(x, y)$ and $g(x, y)$ are both integrable over $R$, then
\[
\iint_R [f(x, y) + g(x, y)]\,dA = \iint_R f(x, y)\,dA + \iint_R g(x, y)\,dA.
\]
• Homogeneity: If $c$ is a constant, then
\[
\iint_R c f(x, y)\,dA = c \iint_R f(x, y)\,dA.
\]
• Monotonicity: If $f(x, y) \ge 0$ on $R$, then
\[
\iint_R f(x, y)\,dA \ge 0.
\]
• Additivity: If $R_1$ and $R_2$ are disjoint rectangles and $Q = R_1 \cup R_2$ is a rectangle, then
\[
\iint_Q f(x, y)\,dA = \iint_{R_1} f(x, y)\,dA + \iint_{R_2} f(x, y)\,dA.
\]

2.2 Double Integrals over More General Regions

We have learned how to integrate a function $f(x, y)$ of two variables over a rectangle $R$. However, it is important to be able to integrate such functions over more general regions, in order to be able to compute the volume of a wider variety of solids.

To that end, given a region $D \subset \mathbb{R}^2$, contained within a rectangle $R$, we define the double integral of $f(x, y)$ over $D$ by
\[
\iint_D f(x, y)\,dA = \iint_R F(x, y)\,dA,
\]
where
\[
F(x, y) = \begin{cases} f(x, y) & (x, y) \in D, \\ 0 & (x, y) \in R,\ (x, y) \notin D. \end{cases}
\]
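The definition above can be tried out numerically: a Riemann sum over the enclosing rectangle $R$, applied to the extended function $F$, approximates the integral over $D$. The following Python sketch is only an illustration under stated assumptions. The function name `integrate_over_region`, the use of cell midpoints as the sample points $(x_i^*, y_j^*)$, the grid size, and the test problem (the integrand $1 - x^2 - y^2$ over the unit disk, whose integral is $\pi/2$) are all our own choices rather than part of the notes.

```python
import math

def integrate_over_region(f, in_region, a, b, c, d, n=200, m=200):
    """Midpoint Riemann sum of F over R = [a, b] x [c, d], where
    F = f inside the region D and F = 0 elsewhere in R.
    in_region(x, y) returns True exactly when (x, y) lies in D."""
    dx = (b - a) / n
    dy = (d - c) / m
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx          # sample point x_i^*
        for j in range(m):
            y = c + (j + 0.5) * dy      # sample point y_j^*
            if in_region(x, y):         # F(x, y) = f(x, y) on D
                total += f(x, y)
            # otherwise F(x, y) = 0 and nothing is added
    return total * dx * dy

# Assumed test problem: D is the unit disk, enclosed in R = [-1, 1] x [-1, 1].
approx = integrate_over_region(
    lambda x, y: 1.0 - x * x - y * y,
    lambda x, y: x * x + y * y <= 1.0,
    -1.0, 1.0, -1.0, 1.0,
)
print("Riemann sum:", approx)
print("pi / 2     :", math.pi / 2)
```

This brute-force approach is rarely the most efficient way to evaluate such an integral, but it follows the definition directly and is a convenient check on limits of integration derived by hand.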


It is possible to use Fubini's Theorem to compute integrals over certain types of general regions. We say that a region $D$ is of type I if it lies between the graphs of two continuous functions of $x$, and is also bounded by two vertical lines. Specifically,
\[
D = \{(x, y) \mid a \le x \le b,\ g_1(x) \le y \le g_2(x)\}.
\]
To integrate $f(x, y)$ over such a region, we can apply Fubini's Theorem. We let $R = [a, b] \times [c, d]$ be a rectangle that contains $D$. Then we have
\[
\begin{aligned}
\iint_D f(x, y)\,dA &= \iint_R F(x, y)\,dA \\
&= \int_a^b \int_c^d F(x, y)\,dy\,dx \\
&= \int_a^b \int_{g_1(x)}^{g_2(x)} F(x, y)\,dy\,dx \\
&= \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx.
\end{aligned}
\]
This is valid because $F(x, y) = 0$ when $y < g_1(x)$ or $y > g_2(x)$, because in these cases, $(x, y)$ lies outside of $D$. The resulting iterated integral can be evaluated in the same way as iterated integrals over rectangles; the only difference is that when the limits of the inner integral are substituted for $y$ in the antiderivative of $f(x, y)$ with respect to $y$, the limits are functions of $x$, rather than constants.

A similar approach can be applied to a region of type II, which is bounded on the left and right by the graphs of continuous functions of $y$, and bounded above and below by horizontal lines. Specifically, $D$ is a region of type II if
\[
D = \{(x, y) \mid h_1(y) \le x \le h_2(y),\ c \le y \le d\}.
\]
Using Fubini's Theorem, we obtain
\[
\iint_D f(x, y)\,dA = \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy.
\]

Example We wish to compute the volume of the solid under the plane $x + y + z = 8$, and bounded by the surfaces $y = x$ and $y = x^2$. These surfaces intersect along the lines $x = 0$, $y = 0$ and $x = 1$, $y = 1$. It follows that the volume $V$ of the solid is given by the double integral
\[
\int_0^1 \int_{x^2}^{x} (8 - x - y)\,dy\,dx.
\]


Note that $g_2(x) = x$ is the upper limit of integration, because $x^2 \le x$ when $0 \le x \le 1$. We have
\[
\begin{aligned}
V &= \int_0^1 \int_{x^2}^{x} (8 - x - y)\,dy\,dx \\
&= \int_0^1 \left( 8y - xy - \frac{y^2}{2} \right) \Big|_{x^2}^{x}\,dx \\
&= \int_0^1 \left[ \left( 8x - x^2 - \frac{x^2}{2} \right) - \left( 8x^2 - x^3 - \frac{x^4}{2} \right) \right] dx \\
&= \int_0^1 \left( \frac{x^4}{2} + x^3 - \frac{19x^2}{2} + 8x \right) dx \\
&= \left( \frac{x^5}{10} + \frac{x^4}{4} - \frac{19x^3}{6} + 4x^2 \right) \Big|_0^1 \\
&= \frac{1}{10} + \frac{1}{4} - \frac{19}{6} + 4 \\
&= \frac{71}{60}.
\end{aligned}
\]
✷

Note that it is sometimes necessary to determine the intersections of surfaces that define a solid, in order to obtain the limits of integration.

To compute the volume of a solid that is bounded above and below (along the $z$-direction) by two different surfaces, we can add the volume of the solid bounded by the top surface and the plane $z = 0$ to the volume of the solid bounded above by $z = 0$ and below by the lower surface, which is equivalent to subtracting the volume of the solid bounded above by the lower surface and below by $z = 0$. Either way, the integrand is the top surface minus the bottom surface.

Example We will compute the volume $V$ of the solid in the first octant bounded by the planes $z = 10 + x + y$, $z = 2 - x - y$, and $x = 0$, as well as the surfaces $y = \sin x$ and $y = \cos x$. As these surfaces intersect along the line $y = \sqrt{2}/2$, $x = \pi/4$, this volume is given by the double integral
\[
\begin{aligned}
V &= \int_0^{\pi/4} \int_{\sin x}^{\cos x} [(10 + x + y) - (2 - x - y)]\,dy\,dx \\
&= \int_0^{\pi/4} \int_{\sin x}^{\cos x} (8 + 2x + 2y)\,dy\,dx \\
&= \int_0^{\pi/4} (8y + 2xy + y^2) \Big|_{\sin x}^{\cos x}\,dx
\end{aligned}
\]


\[
\begin{aligned}
&= \int_0^{\pi/4} [(2x + 8)(\cos x - \sin x) + \cos^2 x - \sin^2 x]\,dx \\
&= \int_0^{\pi/4} [(2x + 8)(\cos x - \sin x) + \cos 2x]\,dx \\
&= \left( 2x \sin x + 2x \cos x + 6 \sin x + 10 \cos x + \frac{1}{2} \sin 2x \right) \Big|_0^{\pi/4} \\
&= \frac{\pi\sqrt{2}}{2} + 8\sqrt{2} - \frac{19}{2}.
\end{aligned}
\]
The final anti-differentiation requires integration by parts,
\[
\int u\,dv = uv - \int v\,du,
\]
with $u = x$ and $dv = (\cos x - \sin x)\,dx$. The function $z = 10 + x + y$ is the “top” plane because for $0 \le x \le \pi/4$, $\sin x \le y \le \cos x$, we have $10 + x + y \ge 2 - x - y$. ✷

By setting the integrand $f(x, y) \equiv 1$ on a region $D$, and integrating over $D$, we can obtain $A(D)$, the area of $D$.

Example We will compute the area of a half-circle by integrating $f(x, y) \equiv 1$ over a region $D$ that is bounded by the planes $z = 0$, $z = 1$, and $y = 0$, and the surface $y = \sqrt{1 - x^2}$. This surface intersects the plane $y = 0$ along the lines $y = 0$, $x = 1$ and $y = 0$, $x = -1$. Therefore the area is given by
\[
A(D) = \int_{-1}^{1} \int_0^{\sqrt{1 - x^2}} 1\,dy\,dx = \int_{-1}^{1} y \Big|_0^{\sqrt{1 - x^2}}\,dx = \int_{-1}^{1} \sqrt{1 - x^2}\,dx.
\]
To evaluate this integral, we use the trigonometric substitution $x = \sin\theta$, for which $dx = \cos\theta\,d\theta$, which yields
\[
A(D) = \int_{-\pi/2}^{\pi/2} \cos^2\theta\,d\theta = \int_{-\pi/2}^{\pi/2} \frac{1 + \cos 2\theta}{2}\,d\theta = \left( \frac{\theta}{2} + \frac{\sin 2\theta}{4} \right) \Big|_{-\pi/2}^{\pi/2} = \frac{\pi}{2}.
\]
✷

2.2.1 Changing the Order of Integration

In some cases, a region can be classified as being of either type I or type II, and therefore a function can be integrated over the region in two different ways. However, one approach or the other may be impractical, due to the


complexity, or even impossibility, of carrying out the anti-differentiation. Therefore, it is important to be able to change the order of integration if necessary.

Example Consider the double integral
\[
\iint_D e^{y^3}\,dA
\]
where $D = \{(x, y) \mid 0 \le x \le 1,\ \sqrt{x} \le y \le 1\}$. This region is defined as a region of type I, so it is natural to attempt to evaluate the iterated integral
\[
\int_0^1 \int_{\sqrt{x}}^{1} e^{y^3}\,dy\,dx.
\]
Unfortunately, $e^{y^3}$ has no antiderivative with respect to $y$ that can be expressed in terms of elementary functions. However, the region $D$ is also a region of type II, as it can be redefined as
\[
D = \{(x, y) \mid 0 \le y \le 1,\ 0 \le x \le y^2\}.
\]
We then have
\[
\begin{aligned}
\iint_D e^{y^3}\,dA &= \int_0^1 \int_0^{y^2} e^{y^3}\,dx\,dy \\
&= \int_0^1 x e^{y^3} \Big|_0^{y^2}\,dy \\
&= \int_0^1 y^2 e^{y^3}\,dy \\
&= \frac{1}{3} \int_0^1 e^u\,du, \qquad u = y^3, \\
&= \frac{1}{3} e^u \Big|_0^1 \\
&= \frac{1}{3}(e - 1).
\end{aligned}
\]
It should be noted that usually, when changing the order of integration, it is necessary to use the inverse functions of the functions that define the curved portions of the boundary, in order to obtain the limits of integration of the new inner integral.
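Although the type I form of the preceding integral cannot be evaluated in closed form, both orders of integration can be approximated numerically, which gives a useful check that the new limits are correct. The following Python sketch assumes a simple midpoint rule; the helper name `midpoint` and the number of subintervals are our own choices, not part of the notes.

```python
import math

def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def integrand(y):
    return math.exp(y ** 3)

# Type I order: for each x in [0, 1], integrate in y from sqrt(x) to 1.
type_one = midpoint(lambda x: midpoint(integrand, math.sqrt(x), 1.0), 0.0, 1.0)

# Type II order: the inner integral in x from 0 to y^2 equals y^2 e^{y^3},
# so only a single integral in y remains.
type_two = midpoint(lambda y: y ** 2 * math.exp(y ** 3), 0.0, 1.0)

exact = (math.e - 1.0) / 3.0

print("type I order :", type_one)
print("type II order:", type_two)
print("(e - 1) / 3  :", exact)
```

The two numerical values should agree with each other and with $(e - 1)/3$ to within the accuracy of the midpoint rule. The type I order is slightly less accurate here because its lower limit $\sqrt{x}$ has an unbounded derivative at $x = 0$.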


2.2.2 The Mean Value Theorem for Integrals

It is important to note that all of the properties of double integrals that have been previously discussed, including linearity, homogeneity, monotonicity, and additivity, apply to double integrals over non-rectangular regions as well. One additional property, that is a consequence of monotonicity, is that if $f(x, y) \ge m$ on a region $D$, and $f(x, y) \le M$ on $D$, then
\[
m A(D) \le \iint_D f(x, y)\,dA \le M A(D),
\]
where, as before, $A(D)$ is the area of $D$. Furthermore, if $f$ is continuous on $D$, then, by the Mean Value Theorem for Double Integrals, we have
\[
\iint_D f(x, y)\,dA = f(x_0, y_0) A(D),
\]
where $(x_0, y_0)$ is some point in $D$. This is a generalization of the Mean Value Theorem for Integrals, which is closely related to the Mean Value Theorem for derivatives.

Example Consider the double integral
\[
\iint_D e^y\,dA
\]
where $D$ is the triangle defined by $D = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 4x\}$. The area of this triangle is given by $A(D) = \frac{1}{2}bh$, where $b$, the base, is 1 and $h$, the height, is 4, which yields $A(D) = 2$. Because $1 \le e^y \le e^4$ when $0 \le y \le 4$, it follows that
\[
2 \le \iint_D e^y\,dA \le 2e^4 \approx 109.2.
\]
The exact value is $\frac{1}{4}(e^4 - 5) \approx 12.4$, which is between the above lower and upper bounds. ✷

2.3 Double Integrals in Polar Coordinates

We have learned how to integrate functions of two variables, $x$ and $y$, over various regions that have a simple form. The variables $x$ and $y$ correspond to Cartesian coordinates that are normally used to describe points in 2-D space. However, a region that may not be of type I or type II, when


described using Cartesian coordinates, may be of one of these types if it is instead described using polar coordinates $r$ and $\theta$.

We recall that polar coordinates are related to Cartesian coordinates by the equations
\[
x = r\cos\theta, \qquad y = r\sin\theta,
\]
or, alternatively,
\[
r^2 = x^2 + y^2, \qquad \tan\theta = \frac{y}{x}.
\]
In order to integrate a function over a region defined using polar coordinates, we must derive the double integral in these coordinates, as was previously done in Cartesian coordinates.

Let a solid be bounded by the surface $z = f(r, \theta)$, as well as the surfaces $r = a$, $r = b$, $\theta = \alpha$ and $\theta = \beta$, which define a polar rectangle. To compute the volume of this solid, we can approximate it by several solids for which the volume can easily be computed. This is accomplished by dividing the polar rectangle into several smaller polar rectangles of dimensions $\Delta r$ and $\Delta\theta$. The height of each solid is obtained from the value of the function at a point in the polar rectangle.

Specifically, we divide the interval $[a, b]$ into $n$ subintervals of width $\Delta r = (b - a)/n$. Each subinterval is of the form $[r_{i-1}, r_i]$, where $r_i = a + i\Delta r$, for $i = 1, 2, \ldots, n$. Similarly, $[\alpha, \beta]$ is divided into $m$ subintervals of width $\Delta\theta = (\beta - \alpha)/m$, and each subinterval is of the form $[\theta_{j-1}, \theta_j]$, where $\theta_j = \alpha + j\Delta\theta$. Then, the volume $V$ of the solid is approximated by
\[
V \approx \sum_{i=1}^n \sum_{j=1}^m \frac{1}{2} f(r_i^*, \theta_j^*)(r_i^2 - r_{i-1}^2)\,\Delta\theta,
\]
where, for each $i$, $r_{i-1} \le r_i^* \le r_i$, and for each $j$, $\theta_{j-1} \le \theta_j^* \le \theta_j$.

The quantity $\frac{1}{2}(r_i^2 - r_{i-1}^2)\,\Delta\theta$ is the area of a polar rectangle, for it is not truly a rectangle, but rather the difference between two circular sectors with angle $\Delta\theta$ and radii $r_{i-1}$ and $r_i$. However, from
\[
\frac{1}{2}(r_i^2 - r_{i-1}^2) = \frac{1}{2}(r_{i-1} + r_i)(r_i - r_{i-1}) = \frac{1}{2}(r_{i-1} + r_i)\,\Delta r,
\]
we see that as $m, n \to \infty$, this approximation of the volume converges to the exact volume, which is given by the double integral
\[
V = \int_\alpha^\beta \int_a^b f(r, \theta)\,r\,dr\,d\theta.
\]
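The approximating sum above can be evaluated directly to see the convergence to the double integral. The following Python sketch makes several assumptions of its own: the function name `polar_riemann_sum`, the choice of midpoints for $r_i^*$ and $\theta_j^*$, the grid size, and the test problem $f(r, \theta) = r^2$ on the polar rectangle $1 \le r \le 2$, $0 \le \theta \le \pi/2$, for which the integral of $r^2 \cdot r$ is $15\pi/8$.

```python
import math

def polar_riemann_sum(f, a, b, alpha, beta, n=200, m=200):
    """Sum of f(r*, theta*) times the exact area of each polar rectangle,
    where that area is (1/2)(r_i^2 - r_{i-1}^2) * dtheta."""
    dr = (b - a) / n
    dtheta = (beta - alpha) / m
    total = 0.0
    for i in range(1, n + 1):
        r_prev = a + (i - 1) * dr
        r_i = a + i * dr
        r_star = 0.5 * (r_prev + r_i)                    # midpoint in r
        cell_area = 0.5 * (r_i ** 2 - r_prev ** 2) * dtheta
        for j in range(1, m + 1):
            theta_star = alpha + (j - 0.5) * dtheta      # midpoint in theta
            total += f(r_star, theta_star) * cell_area
    return total

# Assumed test problem: f(r, theta) = r^2 on 1 <= r <= 2, 0 <= theta <= pi/2.
volume = polar_riemann_sum(lambda r, t: r ** 2, 1.0, 2.0, 0.0, math.pi / 2)
# With f = 1 the sum is the area of the polar rectangle itself.
area = polar_riemann_sum(lambda r, t: 1.0, 1.0, 2.0, 0.0, math.pi / 2)

print("volume sum :", volume, " exact:", 15 * math.pi / 8)
print("area sum   :", area, " exact:", 3 * math.pi / 4)
```

With $f \equiv 1$ the sum reproduces the area $\frac{1}{2}(b^2 - a^2)(\beta - \alpha)$ up to rounding, for any grid, because the areas of the small polar rectangles add up to the area of the whole one.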


Note the extra factor of $r$ in the integrand, which is the limit as $n \to \infty$ of $(r_{i-1} + r_i)/2$.

If the base of the solid can be represented by a polar region of type I,
\[
D = \{(r, \theta) \mid \alpha \le \theta \le \beta,\ h_1(\theta) \le r \le h_2(\theta)\},
\]
then the volume $V$ of the solid defined by the surface $z = f(r, \theta)$ and the surfaces that define $D$ is given by the iterated integral
\[
V = \int_\alpha^\beta \int_{h_1(\theta)}^{h_2(\theta)} f(r, \theta)\,r\,dr\,d\theta.
\]
As before, if $f(r, \theta) \equiv 1$, then this integral yields $A(D)$, the area of $D$.

Example To evaluate the double integral
\[
\iint_D (x + y)\,dA,
\]
where $D = \{(x, y) \mid 1 \le x^2 + y^2 \le 4,\ x \le 0\}$, we convert the integrand, and the description of $D$, to polar coordinates. We then have
\[
\iint_D (r\cos\theta + r\sin\theta)\,dA
\]
where $D = \{(r, \theta) \mid 1 \le r \le 2,\ \pi/2 \le \theta \le 3\pi/2\}$. This simplifies the integral considerably, because $D$ can be described as a polar rectangle. We then have
\[
\begin{aligned}
\iint_D (x + y)\,dA &= \int_{\pi/2}^{3\pi/2} \int_1^2 (r\cos\theta + r\sin\theta)\,r\,dr\,d\theta \\
&= \int_{\pi/2}^{3\pi/2} \int_1^2 r^2(\cos\theta + \sin\theta)\,dr\,d\theta \\
&= \int_{\pi/2}^{3\pi/2} (\cos\theta + \sin\theta)\,\frac{r^3}{3} \Big|_1^2\,d\theta \\
&= \frac{7}{3} \int_{\pi/2}^{3\pi/2} (\cos\theta + \sin\theta)\,d\theta \\
&= \frac{7}{3} (\sin\theta - \cos\theta) \Big|_{\pi/2}^{3\pi/2} \\
&= \frac{7}{3} [(-1 - 0) - (1 - 0)] \\
&= -\frac{14}{3}.
\end{aligned}
\]


✷

Example To compute the volume of the solid in the first octant bounded below by the cone $z = \sqrt{x^2 + y^2}$, and above by the sphere $x^2 + y^2 + z^2 = 8$, as well as the planes $y = x$ and $y = 0$, we first rewrite the equations of the bounding surfaces in polar coordinates. The solid is bounded below by the cone $z = r$, above by the sphere $r^2 + z^2 = 8$, and the surfaces $\theta = 0$ and $\theta = \pi/4$, since the solid lies in the first octant. The surfaces that bound the solid above and below intersect when $2r^2 = 8$, or $r = 2$. It follows that the volume is given by
\[
\begin{aligned}
V &= \int_0^{\pi/4} \int_0^2 \left[ \sqrt{8 - r^2} - r \right] r\,dr\,d\theta \\
&= \int_0^{\pi/4} \int_0^2 r\sqrt{8 - r^2}\,dr\,d\theta - \int_0^{\pi/4} \int_0^2 r^2\,dr\,d\theta \\
&= -\frac{1}{2} \int_0^{\pi/4} \int_8^4 u^{1/2}\,du\,d\theta - \int_0^{\pi/4} \frac{r^3}{3} \Big|_0^2\,d\theta \\
&= \frac{1}{2} \int_0^{\pi/4} \int_4^8 u^{1/2}\,du\,d\theta - \int_0^{\pi/4} \frac{8}{3}\,d\theta \\
&= \frac{1}{3} \int_0^{\pi/4} u^{3/2} \Big|_4^8\,d\theta - \frac{2\pi}{3} \\
&= \frac{1}{3} \int_0^{\pi/4} [16\sqrt{2} - 8]\,d\theta - \frac{2\pi}{3} \\
&= \frac{4\pi}{3} [\sqrt{2} - 1].
\end{aligned}
\]
In the third step, the substitution $u = 8 - r^2$ is used. Then, the limits of integration are interchanged in order to reverse the sign of the integral. ✷

Example The double integral
\[
\int_{-1}^{1} \int_0^{\sqrt{1 - x^2}} f(x, y)\,dy\,dx
\]
can be converted to polar coordinates by converting the equation that describes the top boundary of the domain of integration, $y = \sqrt{1 - x^2}$, into a polar equation. We substitute $x = r\cos\theta$ and $y = r\sin\theta$ into this equation to obtain
\[
r\sin\theta = \sqrt{1 - r^2\cos^2\theta}.
\]
Squaring both sides yields $r^2\sin^2\theta = 1 - r^2\cos^2\theta$, and, in view of the identity $\cos^2\theta + \sin^2\theta = 1$, we obtain the polar equation $r = 1$. Because the bottom


boundary, $y = 0$, corresponds to the rays $\theta = 0$ and $\theta = \pi$, the integral can be expressed in polar coordinates as
\[
\int_0^{\pi} \int_0^1 f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.
\]
✷

Example We evaluate the double integral
\[
\int_0^2 \int_0^{\sqrt{2x - x^2}} \sqrt{x^2 + y^2}\,dy\,dx
\]
by converting to polar coordinates. By completing the square, we obtain $2x - x^2 = 1 - (x - 1)^2$. It follows that the region $D$ over which the integral is to be evaluated,
\[
D = \{(x, y) \mid 0 \le x \le 2,\ 0 \le y \le \sqrt{2x - x^2}\},
\]
has its top boundary defined by the equation $y = \sqrt{2x - x^2}$, or
\[
(x - 1)^2 + y^2 = 1.
\]
That is, the top boundary is the upper half of the circle with radius 1 and center $(1, 0)$. In polar coordinates, the equation of the top boundary becomes
\[
(r\cos\theta - 1)^2 + r^2\sin^2\theta = 1,
\]
or, upon expanding and simplifying,
\[
r = 2\cos\theta.
\]
The region $D$ is contained between the rays $\theta = 0$ and $\theta = \pi/2$. It follows that in polar coordinates, $D$ is defined by
\[
D = \{(r, \theta) \mid 0 \le \theta \le \pi/2,\ 0 \le r \le 2\cos\theta\}.
\]
The lower limit $r = 0$ is obtained from the fact that $D$ contains the origin. We thus obtain the integral
\[
\int_0^2 \int_0^{\sqrt{2x - x^2}} \sqrt{x^2 + y^2}\,dy\,dx = \int_0^{\pi/2} \int_0^{2\cos\theta} r^2\,dr\,d\theta.
\]
The integrand of the original integral is $r$, but the additional factor of $r$ required by the change to polar coordinates yields an integrand of $r^2$.


Evaluating this integral, we obtain
\[
\begin{aligned}
\int_0^2 \int_0^{\sqrt{2x - x^2}} \sqrt{x^2 + y^2}\,dy\,dx &= \int_0^{\pi/2} \int_0^{2\cos\theta} r^2\,dr\,d\theta \\
&= \int_0^{\pi/2} \frac{r^3}{3} \Big|_0^{2\cos\theta}\,d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} \cos^3\theta\,d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} \cos^2\theta\cos\theta\,d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} (1 - \sin^2\theta)\cos\theta\,d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} \cos\theta\,d\theta - \frac{8}{3} \int_0^{\pi/2} \sin^2\theta\cos\theta\,d\theta \\
&= \frac{8}{3} \sin\theta \Big|_0^{\pi/2} - \frac{8}{3} \int_0^1 u^2\,du \\
&= \frac{8}{3}(1) - \frac{8}{3} \cdot \frac{u^3}{3} \Big|_0^1 \\
&= \frac{16}{9}.
\end{aligned}
\]
✷

2.4 Triple Integrals

The integral of a function of three variables over a region $D \subset \mathbb{R}^3$ can be defined in a similar way as the double integral. Let $D$ be the box defined by
\[
D = \{(x, y, z) \mid a \le x \le b,\ c \le y \le d,\ r \le z \le s\}.
\]
Then, as with the double integral, we divide $[a, b]$ into $n$ subintervals $[x_{i-1}, x_i]$ of width $\Delta x = (b - a)/n$, for $i = 1, 2, \ldots, n$. Similarly, we divide $[c, d]$ into $m$ subintervals $[y_{j-1}, y_j]$ of width $\Delta y = (d - c)/m$, for $j = 1, 2, \ldots, m$, and divide $[r, s]$ into $l$ subintervals $[z_{k-1}, z_k]$ of width $\Delta z = (s - r)/l$, for $k = 1, 2, \ldots, l$.

Then, we can define the triple integral of a function $f(x, y, z)$ over $D$ by
\[
\iiint_D f(x, y, z)\,dV = \lim_{m,n,l \to \infty} \sum_{i=1}^n \sum_{j=1}^m \sum_{k=1}^l f(x_i^*, y_j^*, z_k^*)\,\Delta V,
\]


where $\Delta V = \Delta x\,\Delta y\,\Delta z$. As with double integrals, the practical method of evaluating a triple integral is as an iterated integral, such as
\[
\iiint_D f(x, y, z)\,dV = \int_r^s \int_c^d \int_a^b f(x, y, z)\,dx\,dy\,dz.
\]
By Fubini's Theorem, which generalizes to three dimensions or more, the order of integration can be rearranged when $f$ is continuous on $D$.

A triple integral over a more general region can be defined in the same way as with double integrals. If $E$ is a bounded subset of $\mathbb{R}^3$, that is contained within a box $B$, then we can define
\[
\iiint_E f(x, y, z)\,dV = \iiint_B F(x, y, z)\,dV,
\]
where
\[
F(x, y, z) = \begin{cases} f(x, y, z) & (x, y, z) \in E, \\ 0 & (x, y, z) \notin E. \end{cases}
\]
All of the properties previously associated with the double integral, such as linearity and additivity, generalize to the triple integral as well.

Just as regions were classified as type I or type II for double integrals, they can be classified for the purpose of setting up triple integrals. A solid region $E$ is said to be of type 1 if it lies between the graphs of two continuous functions of $x$ and $y$ that are defined on a two-dimensional region $D$. Specifically,
\[
E = \{(x, y, z) \mid (x, y) \in D,\ u_1(x, y) \le z \le u_2(x, y)\}.
\]
Then, an integral of a function $f(x, y, z)$ over $E$ can be evaluated as
\[
\iiint_E f(x, y, z)\,dV = \iint_D \left[ \int_{u_1(x,y)}^{u_2(x,y)} f(x, y, z)\,dz \right] dA,
\]
where the double integral over $D$ can be evaluated in a manner that is appropriate for the type of $D$.

For example, if $D$ is of type I, then
\[
E = \{(x, y, z) \mid a \le x \le b,\ g_1(x) \le y \le g_2(x),\ u_1(x, y) \le z \le u_2(x, y)\},
\]
and therefore
\[
\iiint_E f(x, y, z)\,dV = \int_a^b \int_{g_1(x)}^{g_2(x)} \int_{u_1(x,y)}^{u_2(x,y)} f(x, y, z)\,dz\,dy\,dx.
\]
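The iterated integral for a type 1 solid whose base $D$ is of type I translates directly into three nested one-dimensional approximations. The following Python sketch is one way this might be done, under our own assumptions: the names `midpoint` and `triple_type1`, the midpoint rule, the number of subintervals, and the test problem (the integral of $z$ over the tetrahedron $x, y, z \ge 0$, $x + y + z \le 1$, which equals $1/24$) are all chosen only for illustration.

```python
def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def triple_type1(f, a, b, g1, g2, u1, u2, n=40):
    """Approximate the integral of f over
    E = {a <= x <= b, g1(x) <= y <= g2(x), u1(x, y) <= z <= u2(x, y)}
    as an iterated integral in the order dz dy dx."""
    def inner(x, y):
        # integrate in z with x and y held fixed
        return midpoint(lambda z: f(x, y, z), u1(x, y), u2(x, y), n)

    def middle(x):
        # integrate in y with x held fixed
        return midpoint(lambda y: inner(x, y), g1(x), g2(x), n)

    return midpoint(middle, a, b, n)

# Assumed test problem: f = z on the tetrahedron with vertices
# (0,0,0), (1,0,0), (0,1,0), (0,0,1).
approx = triple_type1(
    lambda x, y, z: z,
    0.0, 1.0,
    lambda x: 0.0, lambda x: 1.0 - x,
    lambda x, y: 0.0, lambda x, y: 1.0 - x - y,
)
print("approximation:", approx)
print("exact 1/24   :", 1.0 / 24.0)
```

The limits are passed as functions so that the same routine applies to any region of this form; constant limits are simply functions that ignore their arguments.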


On the other hand, if $E$ is of type 2, then it has a definition of the form
\[
E = \{(x, y, z) \mid (y, z) \in D,\ u_1(y, z) \le x \le u_2(y, z)\}.
\]
That is, $E$ lies between the graphs of two continuous functions of $y$ and $z$ that are defined on a two-dimensional region $D$. Finally, if $E$ is a region of type 3, then it lies between the graphs of two continuous functions of $x$ and $z$. That is,
\[
E = \{(x, y, z) \mid (x, z) \in D,\ u_1(x, z) \le y \le u_2(x, z)\}.
\]
If more than one type applies to a given region $E$, then the order of evaluation can be determined by which ordering leads to the integrands that are most easily anti-differentiated within each single integral that arises.

Example Let $E$ be a solid tetrahedron bounded by the planes $x = 0$, $y = 0$, $z = 0$ and $x + y + z = 1$. We wish to integrate the function $f(x, y, z) = xz$ over this tetrahedron. From the given bounding planes, we see that the tetrahedron is bounded below by the plane $z = 0$ and above by the plane $z = 1 - x - y$. Therefore, we surmise that $E$ can be viewed as a solid of type 1. This requires finding a region $D$ in the $xy$-plane such that $E$ is bounded by $z = 0$ and $z = 1 - x - y$ on $D$.

We first note that these planes intersect along the line $x + y = 1$. It follows that the base of $E$ is a 2-D region $D$ that can be described by the inequalities $x \ge 0$, $y \ge 0$, and $x + y \le 1$. This region is of type I or type II, so we choose type I and obtain the description
\[
D = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1 - x\}.
\]
Therefore, we can integrate $f(x, y, z)$ over $E$ as follows:
\[
\begin{aligned}
\iiint_E xz\,dV &= \int_0^1 \int_0^{1-x} \int_0^{1-x-y} xz\,dz\,dy\,dx \\
&= \int_0^1 x \int_0^{1-x} \int_0^{1-x-y} z\,dz\,dy\,dx \\
&= \int_0^1 x \int_0^{1-x} \frac{z^2}{2} \Big|_0^{1-x-y}\,dy\,dx \\
&= \frac{1}{2} \int_0^1 x \int_0^{1-x} (1 - x - y)^2\,dy\,dx \\
&= \frac{1}{2} \int_0^1 x \left( -\frac{(1 - x - y)^3}{3} \right) \Big|_0^{1-x}\,dx
\end{aligned}
\]


\[
\begin{aligned}
&= \frac{1}{6} \int_0^1 x(1 - x)^3\,dx \\
&= \frac{1}{6} \int_0^1 (1 - u)u^3\,du, \qquad u = 1 - x, \\
&= \frac{1}{6} \int_0^1 (u^3 - u^4)\,du \\
&= \frac{1}{6} \left( \frac{u^4}{4} - \frac{u^5}{5} \right) \Big|_0^1 \\
&= \frac{1}{6} \left( \frac{1}{4} - \frac{1}{5} \right) \\
&= \frac{1}{120}.
\end{aligned}
\]
✷

Example We will compute the volume of the solid $E$ bounded by the surfaces $y = x$, $y = x^2$, $z = x$, and $z = 0$. Because $E$ is bounded by two surfaces that define $z$ as a function of $x$ and $y$, we view $E$ as a solid of type 1. It is bounded by the graphs of the functions $z = 0$ and $z = x$ that are defined on a region $D$ in the $xy$-plane. This region is bounded by the curves $y = x$ and $y = x^2$. Because these curves intersect when $x = 0$ and $x = 1$, we can describe $D$ as a region of type I:
\[
D = \{(x, y) \mid 0 \le x \le 1,\ x^2 \le y \le x\}.
\]
It follows that the volume of $E$ is given by the iterated integral
\[
\begin{aligned}
\iiint_E 1\,dV &= \int_0^1 \int_{x^2}^{x} \int_0^{x} 1\,dz\,dy\,dx \\
&= \int_0^1 \int_{x^2}^{x} x\,dy\,dx \\
&= \int_0^1 x \int_{x^2}^{x} 1\,dy\,dx \\
&= \int_0^1 x(x - x^2)\,dx \\
&= \int_0^1 (x^2 - x^3)\,dx \\
&= \left( \frac{x^3}{3} - \frac{x^4}{4} \right) \Big|_0^1 \\
&= \frac{1}{12}.
\end{aligned}
\]


✷

Example We evaluate the triple integral
\[
\iiint_E x\,dV
\]
where $E$ is the solid bounded by the paraboloid $x = 4y^2 + 4z^2$ and the plane $x = 4$. The paraboloid and the plane intersect when $y^2 + z^2 = 1$. It follows that the right boundary of the solid $E$ is the unit disk $y^2 + z^2 \le 1$, contained within the plane $x = 4$. Because the paraboloid serves as the “left” boundary of $E$, we can define $E$ by the inequalities
\[
E = \{(x, y, z) \mid y^2 + z^2 \le 1,\ 4(y^2 + z^2) \le x \le 4\}.
\]
Therefore, the triple integral can be written as an iterated integral
\[
\iiint_E x\,dV = \iint_D \left[ \int_{4(y^2 + z^2)}^{4} x\,dx \right] dA,
\]
where $D$ is the unit disk in the $yz$-plane, $y^2 + z^2 \le 1$. If we convert $y$ and $z$ to polar coordinates $y = r\cos\theta$ and $z = r\sin\theta$, we can rewrite this integral as
\[
\iiint_E x\,dV = \int_0^{2\pi} \int_0^1 \int_{4r^2}^{4} x r\,dx\,dr\,d\theta.
\]
Evaluating this integral, we obtain
\[
\begin{aligned}
\iiint_E x\,dV &= \int_0^{2\pi} \int_0^1 \int_{4r^2}^{4} x r\,dx\,dr\,d\theta \\
&= \int_0^{2\pi} \int_0^1 r\,\frac{x^2}{2} \Big|_{4r^2}^{4}\,dr\,d\theta \\
&= 8 \int_0^{2\pi} \int_0^1 r(1 - r^4)\,dr\,d\theta \\
&= 8 \int_0^{2\pi} \int_0^1 (r - r^5)\,dr\,d\theta \\
&= 8 \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^6}{6} \right) \Big|_0^1\,d\theta \\
&= 8 \int_0^{2\pi} \frac{1}{3}\,d\theta \\
&= \frac{16\pi}{3}.
\end{aligned}
\]
✷
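A numerical approximation of the iterated integral in the preceding example gives an independent check of the value $16\pi/3 \approx 16.755$. The following Python sketch assumes a midpoint rule in each variable; the helper names `midpoint`, `inner_x` and `middle_r`, and the numbers of subintervals, are our own choices for illustration.

```python
import math

def midpoint(g, lo, hi, n):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def inner_x(r):
    # innermost integral: x from the paraboloid x = 4 r^2 to the plane x = 4
    return midpoint(lambda x: x, 4.0 * r ** 2, 4.0, 50)

def middle_r(theta):
    # integral in r over [0, 1], including the extra factor of r;
    # the integrand happens not to depend on theta in this example
    return midpoint(lambda r: inner_x(r) * r, 0.0, 1.0, 200)

approx = midpoint(middle_r, 0.0, 2.0 * math.pi, 16)
exact = 16.0 * math.pi / 3.0

print("approximation:", approx)
print("16 pi / 3    :", exact)
```

Omitting the factor of `r` in `middle_r` changes the result noticeably, which makes this kind of check a quick way to catch a forgotten Jacobian factor.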


2.5 Applications of Double and Triple Integrals

We now explore various applications of double and triple integrals arising from physics. When an object has constant density $\rho$, then it is known that its mass $m$ is equal to $\rho V$, where $V$ is its volume. Now, suppose that a flat plate, also known as a lamina, has a non-uniform density $\rho(x, y)$, for $(x, y) \in D$, where $D$ defines the shape of the lamina. Then, its mass is given by
\[
m = \iint_D \rho(x, y)\,dA.
\]
Similarly, if $E$ is a solid region in 3-D space, and $\rho(x, y, z)$ is the density of the solid at the point $(x, y, z) \in E$, then the mass of the solid is given by
\[
m = \iiint_E \rho(x, y, z)\,dV.
\]
We see that just as the integral allows simple “product” formulas for area and volume to be applied to more general problems, it allows similar formulas for quantities such as mass to be generalized as well.

The center of mass, also known as the center of gravity, of an object is the point at which the object behaves as if its entire mass is concentrated at that point. If the object is one- or two-dimensional, the center of mass is the point at which the object can be balanced horizontally (like a see-saw with riders at either end, in the one-dimensional case).

For a lamina with its shape defined by a bounded region $D \subset \mathbb{R}^2$, and with density given by $\rho(x, y)$, its center of mass $(\bar{x}, \bar{y})$ is located at
\[
\bar{x} = \frac{M_y}{m}, \qquad \bar{y} = \frac{M_x}{m},
\]
where $M_x$ and $M_y$ are the moments of the lamina about the $x$-axis and $y$-axis, respectively. These are given by
\[
M_x = \iint_D y\rho(x, y)\,dA, \qquad M_y = \iint_D x\rho(x, y)\,dA.
\]
These integrals are obtained from the formula for the moment of a point mass about an axis, which is given by the product of the mass and the distance from the axis.

Similarly, the moments about the $xy$-, $yz$- and $xz$-planes, $M_{xy}$, $M_{yz}$, and $M_{xz}$, of a solid $E \subset \mathbb{R}^3$ with density $\rho(x, y, z)$ are given by
\[
M_{xy} = \iiint_E z\rho(x, y, z)\,dV,
\]


\[
M_{yz} = \iiint_E x\rho(x, y, z)\,dV, \qquad M_{xz} = \iiint_E y\rho(x, y, z)\,dV.
\]
It follows that its center of mass $(\bar{x}, \bar{y}, \bar{z})$ is located at
\[
\bar{x} = \frac{M_{yz}}{m}, \qquad \bar{y} = \frac{M_{xz}}{m}, \qquad \bar{z} = \frac{M_{xy}}{m}.
\]
As in the 2-D case, each moment is defined using the distance of each point of $E$ from the coordinate plane about which the moment is being computed.

The moment of inertia, or second moment, of an object about an axis measures the object's resistance to changes in its rotation about that axis. For a lamina defined by a region $D \subset \mathbb{R}^2$ with density function $\rho(x, y)$, its moments of inertia about the $x$-axis and $y$-axis, $I_x$ and $I_y$ respectively, are given by
\[
I_x = \iint_D y^2 \rho(x, y)\,dA, \qquad I_y = \iint_D x^2 \rho(x, y)\,dA.
\]
On the other hand, for a solid defined by a region $E \subset \mathbb{R}^3$ with density $\rho(x, y, z)$, its moments of inertia about the coordinate axes are defined by
\[
I_x = \iiint_E (y^2 + z^2)\rho(x, y, z)\,dV, \qquad I_y = \iiint_E (x^2 + z^2)\rho(x, y, z)\,dV,
\]
\[
I_z = \iiint_E (x^2 + y^2)\rho(x, y, z)\,dV.
\]
The moment $I_z$ is also called the polar moment of inertia, or the moment of inertia about the origin, when $E$ reduces to a lamina with density $\rho(x, y)$.

2.6 Triple Integrals in Cylindrical Coordinates

We have seen that in some cases, it is convenient to evaluate double integrals by converting Cartesian coordinates $(x, y)$ to polar coordinates $(r, \theta)$. The same is true of triple integrals. When this is the case, Cartesian coordinates $(x, y, z)$ are converted to cylindrical coordinates $(r, \theta, z)$.

The relationships between $(x, y)$ and $(r, \theta)$ are exactly the same as in polar coordinates, and the $z$ coordinate is unchanged.
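The conversion between Cartesian and cylindrical coordinates is easy to sketch in code. In the following Python example, the function names `to_cylindrical` and `to_cartesian` and the sample point $(-1, -1, 2)$ are our own, introduced only for illustration. The sketch uses the standard library function `math.atan2`, which selects the correct quadrant for $\theta$ from the signs of $x$ and $y$, and then shifts the angle into $[0, 2\pi)$, which is one common convention and an assumption here.

```python
import math

def to_cylindrical(x, y, z):
    """Convert Cartesian (x, y, z) to cylindrical (r, theta, z),
    with theta normalized to the interval [0, 2*pi)."""
    r = math.hypot(x, y)                        # r = sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2.0 * math.pi)  # quadrant-aware angle
    return r, theta, z

def to_cartesian(r, theta, z):
    """Convert cylindrical (r, theta, z) back to Cartesian (x, y, z)."""
    return r * math.cos(theta), r * math.sin(theta), z

# Assumed sample point in the third quadrant of the xy-plane.
r, theta, z = to_cylindrical(-1.0, -1.0, 2.0)
print("r =", r, " theta =", theta, " z =", z)
print("back to Cartesian:", to_cartesian(r, theta, z))
```

For this point, $\tan\theta = 1$ alone would not distinguish $\theta = \pi/4$ from $\theta = 5\pi/4$; the signs of $x$ and $y$ settle the question, which is the same adjustment that is made by hand when $x < 0$.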


Example The point $(x, y, z) = (-3, 3, 4)$ can be converted to cylindrical coordinates $(r, \theta, z)$ using the relationships from polar coordinates,
\[
r = \sqrt{x^2 + y^2}, \qquad \tan\theta = \frac{y}{x}.
\]
These relationships yield
\[
r = \sqrt{(-3)^2 + 3^2} = \sqrt{18} = 3\sqrt{2}, \qquad \tan\theta = -1.
\]
Since $x = -3 < 0$, we have $\theta = \tan^{-1}(-1) + \pi = 3\pi/4$. We conclude that the cylindrical coordinates of the point $(-3, 3, 4)$ are $(3\sqrt{2}, 3\pi/4, 4)$. ✷

Furthermore, just as conversion to polar coordinates in double integrals introduces a factor of $r$ in the integrand, conversion to cylindrical coordinates in triple integrals also introduces a factor of $r$.

Example We evaluate the triple integral
\[
\iiint_E f(x, y, z)\,dV,
\]
where $E$ is the solid bounded below by the paraboloid $z = x^2 + y^2$, above by the plane $z = 4$, and the planes $y = 0$ and $y = 2$. This integral can be evaluated as an iterated integral
\[
\int_{-2}^{2} \int_0^{\sqrt{4 - x^2}} \int_{x^2 + y^2}^{4} f(x, y, z)\,dz\,dy\,dx,
\]
but if we instead describe the region using cylindrical coordinates, we find that the solid is bounded below by the paraboloid $z = r^2$, above by the plane $z = 4$, and contained within the polar “box” $0 \le r \le 2$, $0 \le \theta \le \pi$. We can therefore evaluate the iterated integral
\[
\int_0^2 \int_0^{\pi} \int_{r^2}^{4} f(r\cos\theta, r\sin\theta, z)\,r\,dz\,d\theta\,dr,
\]
that has much simpler limits. ✷

Example We use cylindrical coordinates to evaluate the triple integral
\[
\iiint_E x\,dV
\]
where $E$ is the solid bounded by the planes $z = 0$ and $z = x + y + 5$, and the cylindrical shells $x^2 + y^2 = 4$ and $x^2 + y^2 = 9$. In cylindrical coordinates, $E$


is bounded by the planes $z = 0$ and $z = r(\cos\theta + \sin\theta) + 5$, and the cylinders $r = 2$ and $r = 3$. It follows that the integral can be written as the iterated integral
\[
\iiint_E x\,dV = \int_0^{2\pi} \int_2^3 \int_0^{r(\cos\theta + \sin\theta) + 5} (r\cos\theta)\,r\,dz\,dr\,d\theta.
\]
Evaluating this integral, we obtain
\[
\begin{aligned}
\iiint_E x\,dV &= \int_0^{2\pi} \cos\theta \int_2^3 r^2 \int_0^{r(\cos\theta + \sin\theta) + 5} dz\,dr\,d\theta \\
&= \int_0^{2\pi} \cos\theta \int_2^3 [r^3(\cos\theta + \sin\theta) + 5r^2]\,dr\,d\theta \\
&= \int_0^{2\pi} \cos\theta(\cos\theta + \sin\theta) \int_2^3 r^3\,dr\,d\theta + \int_0^{2\pi} \cos\theta \int_2^3 5r^2\,dr\,d\theta \\
&= \int_0^{2\pi} \cos\theta(\cos\theta + \sin\theta)\,\frac{r^4}{4} \Big|_2^3\,d\theta + 5 \int_0^{2\pi} \cos\theta\,\frac{r^3}{3} \Big|_2^3\,d\theta \\
&= \frac{65}{4} \int_0^{2\pi} (\cos^2\theta + \sin\theta\cos\theta)\,d\theta + \frac{95}{3} \int_0^{2\pi} \cos\theta\,d\theta \\
&= \frac{65}{4} \int_0^{2\pi} \left[ \frac{1}{2}(1 + \cos 2\theta) + \frac{1}{2}\sin 2\theta \right] d\theta + \frac{95}{3} \sin\theta \Big|_0^{2\pi} \\
&= \frac{65}{4} \left[ \frac{1}{2}\left( \theta + \frac{1}{2}\sin 2\theta \right) - \frac{1}{4}\cos 2\theta \right] \Big|_0^{2\pi} \\
&= \frac{65\pi}{4}.
\end{aligned}
\]
✷

2.7 Triple Integrals in Spherical Coordinates

Another approach to evaluating triple integrals, that is especially useful when integrating over regions that are at least partially defined using spheres, is to use spherical coordinates. Consider a point $(x, y, z)$ that lies on a sphere of radius $\rho$. Then we know that $x^2 + y^2 + z^2 = \rho^2$. Furthermore, the points $(0, 0, 0)$, $(0, 0, z)$ and $(x, y, z)$ form a right triangle with hypotenuse $\rho$ and legs $|z|$ and $\sqrt{\rho^2 - z^2}$.

If we denote by $\phi$ the angle adjacent to the leg of length $|z|$, then $\phi$ can be interpreted as an angle of inclination of the point $(x, y, z)$. The angle $\phi = 0$ corresponds to the “north pole” of the sphere, while $\phi = \pi/2$ corresponds to


the “equator”, and $\phi = \pi$ corresponds to the “south pole”. By right triangle trigonometry, we have

$$z = \rho\cos\phi.$$

It follows that $x^2 + y^2 = \rho^2\sin^2\phi$. If we define the angle $\theta$ to have the same meaning as in polar coordinates, then we have

$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta.$$

We define the spherical coordinates of $(x, y, z)$ to be $(\rho, \theta, \phi)$.

Example To convert the point $(x, y, z) = (1, \sqrt{3}, -4)$ to spherical coordinates, we first compute

$$\rho = \sqrt{x^2 + y^2 + z^2} = \sqrt{1^2 + (\sqrt{3})^2 + (-4)^2} = \sqrt{20} = 2\sqrt{5}.$$

Next, we use the relation $\tan\theta = y/x$, and the fact that $x = 1 > 0$, to obtain

$$\theta = \tan^{-1}\frac{y}{x} = \tan^{-1}\sqrt{3} = \frac{\pi}{3}.$$

Finally, to obtain $\phi$, we use the relation $z = \rho\cos\phi$, which yields

$$\phi = \cos^{-1}\frac{z}{\rho} = \cos^{-1}\left(-\frac{4}{2\sqrt{5}}\right) \approx 2.6779 \text{ radians}.$$

✷

To evaluate integrals in spherical coordinates, it is important to note that the volume of a “spherical box” of dimensions $\Delta\rho$, $\Delta\theta$ and $\Delta\phi$, as $\Delta\rho, \Delta\theta, \Delta\phi \to 0$, converges to the infinitesimal

$$\rho^2\sin\phi\,d\rho\,d\theta\,d\phi,$$

where $(\rho, \theta, \phi)$ denotes the location of the box in the limit. Therefore, the integral of a function $f(x, y, z)$ over a solid $E$, when evaluated in spherical coordinates, becomes

$$\iiint_E f(x, y, z)\,dV = \iiint_E f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\theta\,d\phi.$$

Example We wish to compute the volume of the solid $E$ in the first octant bounded below by the plane $z = 0$ and the hemisphere $x^2 + y^2 + z^2 = 9$, bounded above by the hemisphere $x^2 + y^2 + z^2 = 16$, and by the planes $y = 0$ and $y = x$. This would be highly inconvenient to attempt to evaluate in


Cartesian coordinates; determining the limits in $z$ alone requires breaking up the integral with respect to $z$. However, in spherical coordinates, the solid $E$ is determined by the inequalities

$$3 \le \rho \le 4, \qquad 0 \le \theta \le \frac{\pi}{4}, \qquad 0 \le \phi \le \frac{\pi}{2}.$$

That is, the solid is actually a “spherical rectangle”. It follows that the volume $V$ is given by the iterated integral

$$\begin{aligned}
V &= \int_0^{\pi/2}\int_0^{\pi/4}\int_3^4 \rho^2\sin\phi\,d\rho\,d\theta\,d\phi \\
&= \frac{\pi}{4}\int_0^{\pi/2}\int_3^4 \rho^2\sin\phi\,d\rho\,d\phi \\
&= \frac{\pi}{4}\int_0^{\pi/2}\sin\phi\int_3^4 \rho^2\,d\rho\,d\phi \\
&= \frac{\pi}{4}\int_0^{\pi/2}\sin\phi\left.\frac{\rho^3}{3}\right|_3^4 d\phi \\
&= \frac{\pi}{4}\cdot\frac{37}{3}\int_0^{\pi/2}\sin\phi\,d\phi \\
&= -\frac{37\pi}{12}\cos\phi\Big|_0^{\pi/2} \\
&= \frac{37\pi}{12}.
\end{aligned}$$

✷

Example We use spherical coordinates to evaluate the triple integral

$$\iiint_H (x^2 + y^2)\,dV,$$

where $H$ is the solid that is bounded below by the $xy$-plane, and bounded above by the sphere $x^2 + y^2 + z^2 = 1$. In spherical coordinates, $H$ is defined by the inequalities

$$H = \{(\rho, \theta, \phi) \mid 0 \le \rho \le 1,\ 0 \le \theta \le 2\pi,\ 0 \le \phi \le \pi/2\}.$$

As the integrand $x^2 + y^2$ is equal to $(\rho\cos\theta\sin\phi)^2 + (\rho\sin\theta\sin\phi)^2 = \rho^2\sin^2\phi$ in spherical coordinates, we have

$$\iiint_H (x^2 + y^2)\,dV = \int_0^{2\pi}\int_0^{\pi/2}\int_0^1 (\rho^2\sin^2\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta.$$


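Evaluating this integral, we obtain

$$\begin{aligned}
\iiint_H (x^2 + y^2)\,dV &= \int_0^{2\pi}\int_0^{\pi/2}\sin^3\phi\int_0^1 \rho^4\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi/2}\sin^3\phi\left.\frac{\rho^5}{5}\right|_0^1 d\phi\,d\theta \\
&= \frac{1}{5}\int_0^{2\pi}\int_0^{\pi/2}\sin^3\phi\,d\phi\,d\theta \\
&= \frac{1}{5}\int_0^{2\pi}\int_0^{\pi/2}\sin^2\phi\,\sin\phi\,d\phi\,d\theta \\
&= \frac{1}{5}\int_0^{2\pi}\int_0^{\pi/2}(1 - \cos^2\phi)\sin\phi\,d\phi\,d\theta \\
&= \frac{1}{5}\int_0^{2\pi}\int_0^{\pi/2}\sin\phi\,d\phi\,d\theta - \frac{1}{5}\int_0^{2\pi}\int_0^{\pi/2}\cos^2\phi\,\sin\phi\,d\phi\,d\theta \\
&= \frac{1}{5}\int_0^{2\pi}(-\cos\phi)\Big|_0^{\pi/2}\,d\theta - \frac{1}{5}\int_0^{2\pi}\int_0^1 u^2\,du\,d\theta, \qquad u = \cos\phi, \\
&= \frac{1}{5}\int_0^{2\pi} 1\,d\theta - \frac{1}{5}\int_0^{2\pi}\left.\frac{u^3}{3}\right|_0^1 d\theta \\
&= \frac{1}{5}\left(2\pi - \frac{2\pi}{3}\right) \\
&= \frac{4\pi}{15}.
\end{aligned}$$

✷

2.8 Change of Variables in Multiple Integrals

Recall that in single-variable calculus, if the integral

$$\int_a^b f(u)\,du$$

is evaluated by making a change of variable $u = g(x)$, such that the interval $\alpha \le x \le \beta$ is mapped by $g$ to the interval $a \le u \le b$, then

$$\int_a^b f(u)\,du = \int_\alpha^\beta f(g(x))\,g'(x)\,dx.$$

The appearance of the factor $g'(x)$ in the integrand is due to the fact that if we divide $[a, b]$ into $n$ subintervals $[u_{i-1}, u_i]$ of equal width ∆u =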


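(b − a)/n, and if we divide $[\alpha, \beta]$ into $n$ subintervals $[x_{i-1}, x_i]$ of equal width $\Delta x = (\beta - \alpha)/n$, then

$$\Delta u = u_i - u_{i-1} = g(x_i) - g(x_{i-1}) = g'(x_i^*)\,\Delta x,$$

where $x_{i-1} \le x_i^* \le x_i$. We will now generalize this change of variable to multiple integrals.

For simplicity, suppose that we wish to evaluate the double integral

$$\iint_D f(x, y)\,dA$$

by making a change of variable

$$x = g(u, v), \qquad y = h(u, v), \qquad a \le u \le b, \quad c \le v \le d.$$

We divide the interval $[a, b]$ into $n$ subintervals $[u_{i-1}, u_i]$ of equal width $\Delta u = (b - a)/n$, and we divide $[c, d]$ into $m$ subintervals $[v_{j-1}, v_j]$ of equal width $\Delta v = (d - c)/m$. Then, the rectangle $[u_{i-1}, u_i] \times [v_{j-1}, v_j]$ is approximately mapped by $g$ and $h$ into a parallelogram with adjacent sides

$$\begin{aligned}
\mathbf{r}_u &= \langle g(u_i, v_{j-1}) - g(u_{i-1}, v_{j-1}),\ h(u_i, v_{j-1}) - h(u_{i-1}, v_{j-1})\rangle, \\
\mathbf{r}_v &= \langle g(u_{i-1}, v_j) - g(u_{i-1}, v_{j-1}),\ h(u_{i-1}, v_j) - h(u_{i-1}, v_{j-1})\rangle.
\end{aligned}$$

By the Mean Value Theorem, we have

$$\begin{aligned}
\mathbf{r}_u &\approx \langle g_u(u_{i-1}, v_{j-1}),\ h_u(u_{i-1}, v_{j-1})\rangle\,\Delta u, \\
\mathbf{r}_v &\approx \langle g_v(u_{i-1}, v_{j-1}),\ h_v(u_{i-1}, v_{j-1})\rangle\,\Delta v.
\end{aligned}$$

The area of this parallelogram is given by

$$|\mathbf{r}_u \times \mathbf{r}_v| = \left|\frac{\partial g}{\partial u}\frac{\partial h}{\partial v} - \frac{\partial g}{\partial v}\frac{\partial h}{\partial u}\right|\Delta u\,\Delta v.$$

It follows that

$$\iint_D f(x, y)\,dx\,dy = \iint_{\tilde{D}} f(g(u, v), h(u, v))\left|\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv,$$

where $\tilde{D} = [a, b] \times [c, d]$ is the domain of $g$ and $h$, and

$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}$$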


is the Jacobian of the transformation from $(u, v)$ to $(x, y)$. It is also the determinant of the Jacobian matrix of the vector-valued function that maps $(u, v)$ to $(x, y)$.

Example Let $D$ be the parallelogram with vertices $(0, 0)$, $(2, 4)$, $(6, 1)$, and $(8, 5)$. To integrate a function $f(x, y)$ over $D$, we can use a change of variable $(x, y) = (g(u, v), h(u, v))$ that maps a rectangle to this parallelogram, and then integrate over the rectangle.

Using the vertices, we find that the equations of the edges are

$$-x + 6y = 0, \qquad -x + 6y = 22, \qquad 2x - y = 0, \qquad 2x - y = 11.$$

Therefore, if we define the new variables $u$ and $v$ by the equations

$$u = -x + 6y, \qquad v = 2x - y,$$

then, for $(x, y) \in D$, we have $(u, v)$ belonging to the rectangle $0 \le u \le 22$, $0 \le v \le 11$.

To rewrite an integral over $D$ in terms of $u$ and $v$, it is much easier to express the original variables in terms of the new variables than the other way around. Therefore, we need to solve the equations defining $u$ and $v$ for $x$ and $y$. From the equation for $u$, we have $x = 6y - u$. Substituting into the equation for $v$, we obtain $v = 2(6y - u) - y$, which yields $y = h(u, v) = \frac{1}{11}(2u + v)$. Substituting this into the equation for $u$ yields $x = g(u, v) = \frac{1}{11}(u + 6v)$.

The Jacobian of this transformation is

$$\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} = \frac{1}{11^2}\left[1(1) - 6(2)\right] = -\frac{1}{11}.$$

We conclude that

$$\iint_D f(x, y)\,dx\,dy = \frac{1}{11}\iint_{\tilde{D}} f(g(u, v), h(u, v))\,du\,dv.$$

✷

In general, when integrating a function $f(x_1, x_2, \ldots, x_n)$ over a region $D \subset \mathbb{R}^n$, if the integral is evaluated using a change of variable $(x_1, x_2, \ldots, x_n) = g(u_1, u_2, \ldots, u_n)$ that maps a region $E \subset \mathbb{R}^n$ to $D$, then

$$\int_D f(x_1, \ldots, x_n)\,dx_1\cdots dx_n = \int_E (f \circ g)(u_1, \ldots, u_n)\,|\det(J_g(u_1, \ldots, u_n))|\,du_1\cdots du_n,$$


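where

$$J_g(u_1, u_2, \ldots, u_n) = \begin{bmatrix}
\dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_1}{\partial u_2} & \cdots & \dfrac{\partial x_1}{\partial u_n} \\[2ex]
\dfrac{\partial x_2}{\partial u_1} & \dfrac{\partial x_2}{\partial u_2} & \cdots & \dfrac{\partial x_2}{\partial u_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial x_n}{\partial u_1} & \dfrac{\partial x_n}{\partial u_2} & \cdots & \dfrac{\partial x_n}{\partial u_n}
\end{bmatrix}$$

is the Jacobian matrix of $g$ and $\det(J_g(u_1, u_2, \ldots, u_n))$ is its determinant, which is simply referred to as the Jacobian of the transformation $g$.

Example Consider the transformation from spherical to Cartesian coordinates,

$$x = \rho\sin\phi\cos\theta, \qquad y = \rho\sin\phi\sin\theta, \qquad z = \rho\cos\phi.$$

Then, the Jacobian matrix of this transformation is

$$\begin{bmatrix}
\dfrac{\partial x}{\partial\rho} & \dfrac{\partial x}{\partial\theta} & \dfrac{\partial x}{\partial\phi} \\[2ex]
\dfrac{\partial y}{\partial\rho} & \dfrac{\partial y}{\partial\theta} & \dfrac{\partial y}{\partial\phi} \\[2ex]
\dfrac{\partial z}{\partial\rho} & \dfrac{\partial z}{\partial\theta} & \dfrac{\partial z}{\partial\phi}
\end{bmatrix} = \begin{bmatrix}
\sin\phi\cos\theta & -\rho\sin\phi\sin\theta & \rho\cos\phi\cos\theta \\
\sin\phi\sin\theta & \rho\sin\phi\cos\theta & \rho\cos\phi\sin\theta \\
\cos\phi & 0 & -\rho\sin\phi
\end{bmatrix}.$$

It follows that the Jacobian of this transformation is given by the determinant of this matrix. Expanding along the third row, we obtain

$$\begin{aligned}
\begin{vmatrix}
\sin\phi\cos\theta & -\rho\sin\phi\sin\theta & \rho\cos\phi\cos\theta \\
\sin\phi\sin\theta & \rho\sin\phi\cos\theta & \rho\cos\phi\sin\theta \\
\cos\phi & 0 & -\rho\sin\phi
\end{vmatrix}
&= \cos\phi\begin{vmatrix} -\rho\sin\phi\sin\theta & \rho\cos\phi\cos\theta \\ \rho\sin\phi\cos\theta & \rho\cos\phi\sin\theta \end{vmatrix} - \rho\sin\phi\begin{vmatrix} \sin\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \sin\phi\sin\theta & \rho\sin\phi\cos\theta \end{vmatrix} \\
&= \cos\phi\left[-\rho^2\sin\phi\cos\phi\sin^2\theta - \rho^2\sin\phi\cos\phi\cos^2\theta\right] - \rho\sin\phi\left[\rho\sin^2\phi\cos^2\theta + \rho\sin^2\phi\sin^2\theta\right] \\
&= -\rho^2\cos^2\phi\,\sin\phi - \rho^2\sin^2\phi\,\sin\phi \\
&= -\rho^2\sin\phi.
\end{aligned}$$

The absolute value of the Jacobian is the factor that must be included in the integrand when converting a triple integral from Cartesian to spherical coordinates. ✷

Example We evaluate the double integral

$$\iint_R (x^2 - xy + y^2)\,dA,$$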


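where $R$ is the region bounded by the ellipse $x^2 - xy + y^2 = 2$, using the change of variables

$$x = \sqrt{2}\,u - \sqrt{2/3}\,v, \qquad y = \sqrt{2}\,u + \sqrt{2/3}\,v.$$

First, we compute the Jacobian of the change of variables,

$$\frac{\partial(x, y)}{\partial(u, v)} = \det\begin{bmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{bmatrix} = \det\begin{bmatrix} \sqrt{2} & -\sqrt{2/3} \\ \sqrt{2} & \sqrt{2/3} \end{bmatrix} = \sqrt{2}\sqrt{2/3} + \sqrt{2}\sqrt{2/3} = \frac{4}{\sqrt{3}}.$$

Next, we need to define the region $R$ in terms of $u$ and $v$. Rewriting the equation $x^2 - xy + y^2 = 2$ in terms of $u$ and $v$ yields the equation $2u^2 + 2v^2 = 2$. It follows that the change of variables maps the region $\tilde{R}$ to $R$, where $\tilde{R}$ is the unit disk. If we then use polar coordinates $u = r\cos\theta$ and $v = r\sin\theta$, we have

$$\iint_R (x^2 - xy + y^2)\,dA = \iint_{\tilde{R}} (2u^2 + 2v^2)\frac{4}{\sqrt{3}}\,du\,dv = \frac{4}{\sqrt{3}}\int_0^{2\pi}\int_0^1 (2r^2)\,r\,dr\,d\theta.$$

Evaluating this integral, we obtain

$$\begin{aligned}
\iint_R (x^2 - xy + y^2)\,dA &= \frac{8}{\sqrt{3}}\int_0^{2\pi}\int_0^1 r^3\,dr\,d\theta \\
&= \frac{8}{\sqrt{3}}\int_0^{2\pi}\left.\frac{r^4}{4}\right|_0^1 d\theta \\
&= \frac{2}{\sqrt{3}}\int_0^{2\pi} d\theta \\
&= \frac{4\pi}{\sqrt{3}}.
\end{aligned}$$

✷

Example We wish to use an appropriate change of variable to evaluate the double integral

$$\iint_R (x + y)e^{x^2 - y^2}\,dA,$$

where $R$ is the rectangle enclosed by the lines $x - y = 0$, $x - y = 2$, $x + y = 0$ and $x + y = 3$. If we define $u = x + y$ and $v = x - y$, then $R$ is mapped by this change of variables to the rectangle

$$\tilde{R} = \{(u, v) \mid 0 \le u \le 3,\ 0 \le v \le 2\}.$$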


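Solving for $x$ and $y$ in terms of $u$ and $v$, we obtain

$$x = \frac{1}{2}(u + v), \qquad y = \frac{1}{2}(u - v).$$

It follows that

$$\frac{\partial(x, y)}{\partial(u, v)} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} = \frac{1}{2}\left(-\frac{1}{2}\right) - \frac{1}{2}\cdot\frac{1}{2} = -\frac{1}{2}$$

and the integral becomes

$$\iint_R (x + y)e^{x^2 - y^2}\,dA = \iint_R (x + y)e^{(x + y)(x - y)}\,dA = \int_0^3\int_0^2 u e^{uv}\left|-\frac{1}{2}\right| dv\,du.$$

Evaluating this integral, we obtain

$$\begin{aligned}
\iint_R (x + y)e^{x^2 - y^2}\,dA &= \frac{1}{2}\int_0^3\int_0^2 u e^{uv}\,dv\,du \\
&= \frac{1}{2}\int_0^3 e^{uv}\Big|_0^2\,du \\
&= \frac{1}{2}\int_0^3 \left[e^{2u} - 1\right] du \\
&= \frac{1}{2}\left[\frac{e^{2u}}{2} - u\right]_0^3 \\
&= \frac{1}{2}\left(\frac{e^6}{2} - 3 - \frac{1}{2}\right) \\
&= \frac{1}{4}(e^6 - 7).
\end{aligned}$$

✷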
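The last example can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper `midpoint_2d` and the grid size `n` are illustrative assumptions, and the midpoint rule is just one convenient quadrature choice. It integrates the transformed integrand, including the factor $|\partial(x, y)/\partial(u, v)| = 1/2$, over the rectangle $0 \le u \le 3$, $0 \le v \le 2$ and compares the result with $\frac{1}{4}(e^6 - 7)$.

```python
import math


def midpoint_2d(f, a, b, c, d, n=400):
    """Midpoint-rule approximation of the integral of f(u, v) over [a, b] x [c, d].

    Hypothetical helper for illustration; n is the number of cells per axis.
    """
    du = (b - a) / n
    dv = (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(n):
            v = c + (j + 0.5) * dv
            total += f(u, v)
    return total * du * dv


def transformed_integrand(u, v):
    # Change of variables from the example: x = (u + v)/2, y = (u - v)/2.
    x = 0.5 * (u + v)
    y = 0.5 * (u - v)
    jacobian = -0.5  # value of d(x, y)/d(u, v) computed in the example
    return (x + y) * math.exp(x * x - y * y) * abs(jacobian)


approx = midpoint_2d(transformed_integrand, 0.0, 3.0, 0.0, 2.0)
exact = (math.exp(6) - 7) / 4
print(approx, exact)
```

With the default grid the two printed values should agree to several significant digits; refining `n` tightens the agreement at the cost of more function evaluations.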


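The Jacobian of the spherical-to-Cartesian transformation computed in Section 2.8 can also be sanity-checked with finite differences. This is a sketch under stated assumptions: `numerical_jacobian` and `det3` are hypothetical helpers written for this illustration, the step size `h` is an arbitrary small value, and the sample point is chosen arbitrarily. The columns are ordered $(\rho, \theta, \phi)$, as in the text, so the determinant should be close to $-\rho^2\sin\phi$.

```python
import math


def spherical_to_cartesian(rho, theta, phi):
    """Map (rho, theta, phi) to (x, y, z) using the formulas from the text."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))


def numerical_jacobian(g, point, h=1e-6):
    """Central-difference Jacobian of g at `point`; entry [i][j] ~ d g_i / d u_j."""
    n = len(point)
    columns = []
    for j in range(n):
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        g_plus = g(*plus)
        g_minus = g(*minus)
        columns.append([(g_plus[i] - g_minus[i]) / (2 * h) for i in range(n)])
    # columns[j][i] holds d g_i / d u_j, so transpose to get rows indexed by i.
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


rho, theta, phi = 2.0, 0.7, 1.1  # arbitrary sample point
J = numerical_jacobian(spherical_to_cartesian, (rho, theta, phi))
numeric_det = det3(J)
exact_det = -rho ** 2 * math.sin(phi)
print(numeric_det, exact_det)
```

The sign of the determinant depends on the ordering of the variables; only its absolute value, $\rho^2\sin\phi$, enters the integrand.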


Chapter 3
Vector Calculus

3.1 Vector Fields

To this point, we have mostly worked with scalar-valued functions of several variables, in the interest of computing quantities such as the maximum or minimum value of a function, or the volume or center of mass of a solid. Now, we will study applications involving vector-valued functions of several variables. The difficulty of visualizing such functions leads to the notion of a vector field.

A vector field $\mathbf{F} : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is a function that assigns to each point $\mathbf{x} \in U$ a vector

$$\mathbf{F}(\mathbf{x}) = \langle F_1(\mathbf{x}), F_2(\mathbf{x}), \ldots, F_n(\mathbf{x})\rangle$$

in $\mathbb{R}^n$. The functions $F_1, F_2, \ldots, F_n$ are the component functions, or component scalar fields, of $\mathbf{F}$. For our purposes, $n = 2$ or $3$. To visualize a vector field, one can plot the vector $\mathbf{F}(\mathbf{x})$ at any given point $\mathbf{x}$, using the component functions to obtain the components of the vector to be plotted at each point.

The following are certain vector fields of interest in applications:

• Given a fluid, for example, a velocity field is a vector field $\mathbf{V}(x, y, z)$ that indicates the velocity of the fluid at each point $(x, y, z)$. When plotting a velocity field, the speed of the fluid at each point is indicated by the length of the vector plotted at that point, and the direction of the fluid at that point is indicated by the direction of the vector.

A curve $\mathbf{c}(t)$ is said to be a flow line, or streamline, of a velocity field $\mathbf{V}$ if, for each value of the parameter $t$,

$$\mathbf{c}'(t) = \mathbf{V}(\mathbf{c}(t)).$$


That is, at each point along the curve, its tangent vector coincides with $\mathbf{V}$. A flow line can be approximated by first choosing an initial point $\mathbf{x}_0 = \mathbf{c}(t_0)$, then using the value of $\mathbf{V}$ at that point to approximate a second point $\mathbf{x}_1 = \mathbf{c}(t_1)$ as follows:

$$\frac{\mathbf{x}_1 - \mathbf{x}_0}{t_1 - t_0} = \frac{\mathbf{c}(t_1) - \mathbf{c}(t_0)}{t_1 - t_0} \approx \mathbf{V}(\mathbf{c}(t_0)) \implies \mathbf{x}_1 \approx \mathbf{x}_0 + (t_1 - t_0)\mathbf{V}(\mathbf{x}_0).$$

This can be continued to obtain the locations of any number of points along the flow line. The closer the times $t_0, t_1, \ldots$ are to one another, the more accurate the approximate flow line will be.

Figure 3.1: The vector field $\mathbf{V}(x, y) = \langle -y, x\rangle$

• Consider two objects with mass $m$ and $M$, with the object of mass $M$ located at the origin, and the vector field $\mathbf{F}$ defined by

$$\mathbf{F}(\mathbf{r}) = -\frac{mMG}{\|\mathbf{r}\|^3}\,\mathbf{r},$$


where $\mathbf{r}$ is a position vector of the object of mass $m$, and $G$ is the gravitational constant. This vector field indicates the gravitational force exerted by the object at the origin on the object at position $\mathbf{r}$, and is therefore an example of a gravitational field.

• Suppose an electric charge $Q$ is located at the origin, and a charge $q$ is located at the point with position vector $\mathbf{x}$. Then the electric force exerted by the first charge on the second is given by the vector field

$$\mathbf{F}(\mathbf{x}) = \frac{\varepsilon qQ}{\|\mathbf{x}\|^3}\,\mathbf{x},$$

where $\varepsilon$ is a constant. This field, and the gravitational field described above, are both examples of force fields.

Figure 3.2: The conservative vector field $\mathbf{F}(x, y) = \langle y, x\rangle$

• A vector field $\mathbf{F}$ is said to be conservative if $\mathbf{F} = \nabla f$ for some function $f$. We also say that $\mathbf{F}$ is a gradient field, and $f$ is a potential function for


$\mathbf{F}$. When we discuss line integrals, we will learn the physical meaning of a conservative vector field.

In upcoming sections we will learn how to integrate vector fields, as well as the physical interpretations of such integrals.

Example Consider the velocity field $\mathbf{V}(x, y) = \langle -y, x\rangle$. It is shown in Figure 3.1. It can be seen from the figure that the flow lines of this velocity field are circles centered at the origin. ✷

Example The vector field $\mathbf{F}(x, y) = \langle y, x\rangle$ is conservative, because $\mathbf{F} = \nabla f$, where $f(x, y) = xy$. The field is shown in Figure 3.2. It should be noted that conservative vector fields are also called irrotational; a fluid whose velocity field is conservative has no vorticity. ✷

3.2 Line Integrals

Recall from single-variable calculus that if a constant force $F$ is applied to an object to move it along a straight line from $x = a$ to $x = b$, then the amount of work done is the force times the distance, $W = F(b - a)$. More generally, if the force is not constant, but is instead dependent on $x$ so that the amount of force applied when the object is at the point $x$ is given by $F(x)$, then the work done is given by the integral

$$W = \int_a^b F(x)\,dx.$$

This result is obtained by applying the “basic” formula for work along each of $n$ subintervals of width $\Delta x = (b - a)/n$, and taking the limit as $\Delta x \to 0$.

Now, suppose that a force is applied to an object to move it along a path traced by a curve $C$, instead of moving it along a straight line. If the amount of force that is being applied to the object at any point $p$ on the curve $C$ is given by the value of a function $F(p)$, then the work can be approximated by, as before, applying the “basic” formula for work to each of $n$ line segments that approximate the curve and have lengths $\Delta s_1, \Delta s_2, \ldots, \Delta s_n$. The work done on the $i$th segment is approximately $F(p_i^*)\,\Delta s_i$, where $p_i^*$ is any point on the segment.
By taking the limit as $\max \Delta s_i \to 0$, we obtain the line integral

$$W = \int_C F(p)\,ds = \lim_{\max \Delta s_i \to 0}\sum_{i=1}^{n} F(p_i^*)\,\Delta s_i,$$

provided that this limit exists.
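The limit definition above can be imitated directly with a finite number of segments. The sketch below is illustrative only: the function `line_integral_scalar`, the choice of the upper unit semicircle as the curve, and the sample integrand $F(x, y) = y$ are all assumptions made for the example, and a parametrization is used merely to generate points on the curve. Each term is $F$ evaluated at the midpoint of a segment, times the length of that segment.

```python
import math


def line_integral_scalar(F, curve, a, b, n=2000):
    """Approximate the integral of F over C with respect to arc length.

    `curve(t)` returns a point of C for a <= t <= b (hypothetical interface).
    The curve is replaced by n line segments; F is sampled at each segment's
    midpoint and multiplied by the segment length (Delta s_i).
    """
    dt = (b - a) / n
    total = 0.0
    prev = curve(a)
    for i in range(1, n + 1):
        cur = curve(a + i * dt)
        midpoint = tuple(0.5 * (p + q) for p, q in zip(prev, cur))
        total += F(*midpoint) * math.dist(prev, cur)
        prev = cur
    return total


def semicircle(t):
    # Upper half of the unit circle, traversed from (1, 0) to (-1, 0).
    return (math.cos(t), math.sin(t))


# Integral of y ds over the semicircle; the exact value is 2.
work = line_integral_scalar(lambda x, y: y, semicircle, 0.0, math.pi)
# With F = 1 the same sum approximates the arc length, which is pi.
length = line_integral_scalar(lambda x, y: 1.0, semicircle, 0.0, math.pi)
print(work, length)
```

Increasing `n` shrinks $\max \Delta s_i$, so the sums approach the values of the line integrals.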


In order to actually evaluate a line integral, it is necessary to express the curve $C$ in terms of parametric equations. For concreteness, we assume that $C$ is a plane curve defined by the parametric equations

$$x = x(t), \qquad y = y(t), \qquad a \le t \le b.$$

Then, if we divide $[a, b]$ into subintervals $[t_{i-1}, t_i]$ of width $\Delta t = (b - a)/n$, where $t_i = a + i\Delta t$, we can approximate $C$ by $n$ line segments with endpoints $(x(t_{i-1}), y(t_{i-1}))$ and $(x(t_i), y(t_i))$, for $i = 1, 2, \ldots, n$. From the Pythagorean Theorem, it follows that the $i$th segment has length

$$\Delta s_i = \sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{\left(\frac{\Delta x_i}{\Delta t}\right)^2 + \left(\frac{\Delta y_i}{\Delta t}\right)^2}\,\Delta t,$$

where $\Delta x_i = x(t_i) - x(t_{i-1})$ and $\Delta y_i = y(t_i) - y(t_{i-1})$. Letting $\Delta t \to 0$, we obtain

$$\int_C F(p)\,ds = \int_a^b F(x(t), y(t))\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt.$$

We recall that if $F(x, y) \equiv 1$, then this integral yields the arc length of the curve $C$.

Example (Stewart, Section 13.2, Exercise 8) To evaluate the line integral

$$\int_C x^2 z\,ds$$

where $C$ is the line segment from $(0, 6, -1)$ to $(4, 1, 5)$, we first need parametric equations for the line segment. Using the vector between the endpoints,

$$\mathbf{v} = \langle 4 - 0, 1 - 6, 5 - (-1)\rangle = \langle 4, -5, 6\rangle,$$

we obtain the parametric equations

$$x = 4t, \qquad y = 6 - 5t, \qquad z = -1 + 6t, \qquad 0 \le t \le 1.$$

It follows that

$$\begin{aligned}
\int_C x^2 z\,ds &= \int_0^1 (x(t))^2 z(t)\sqrt{[x'(t)]^2 + [y'(t)]^2 + [z'(t)]^2}\,dt \\
&= \int_0^1 (4t)^2(6t - 1)\sqrt{4^2 + (-5)^2 + 6^2}\,dt \\
&= \int_0^1 16t^2(6t - 1)\sqrt{77}\,dt
\end{aligned}$$


$$\begin{aligned}
&= 16\sqrt{77}\int_0^1 (6t^3 - t^2)\,dt \\
&= 16\sqrt{77}\left.\left(6\frac{t^4}{4} - \frac{t^3}{3}\right)\right|_0^1 \\
&= 16\sqrt{77}\left(\frac{3}{2} - \frac{1}{3}\right) \\
&= \frac{56\sqrt{77}}{3}.
\end{aligned}$$

✷

Example (Stewart, Section 13.2, Exercise 10) We evaluate the line integral

$$\int_C (2x + 9z)\,ds$$

where $C$ is defined by the parametric equations

$$x = t, \qquad y = t^2, \qquad z = t^3, \qquad 0 \le t \le 1.$$

We have

$$\begin{aligned}
\int_C (2x + 9z)\,ds &= \int_0^1 (2x(t) + 9z(t))\sqrt{[x'(t)]^2 + [y'(t)]^2 + [z'(t)]^2}\,dt \\
&= \int_0^1 (2t + 9t^3)\sqrt{1^2 + (2t)^2 + (3t^2)^2}\,dt \\
&= \int_0^1 (2t + 9t^3)\sqrt{1 + 4t^2 + 9t^4}\,dt \\
&= \frac{1}{4}\int_1^{14} u^{1/2}\,du, \qquad u = 1 + 4t^2 + 9t^4, \\
&= \frac{1}{4}\cdot\frac{2}{3}u^{3/2}\Big|_1^{14} \\
&= \frac{1}{6}(14^{3/2} - 1).
\end{aligned}$$

✷

Although we have introduced line integrals in the context of computing work, this approach can be used to integrate any function along a curve. For example, to compute the mass of a wire that is shaped like a plane curve $C$, where the density of the wire is given by a function $\rho(x, y)$ defined at each


point $(x, y)$ on $C$, we can evaluate the line integral

$$m = \int_C \rho(x, y)\,ds.$$

It follows that the center of mass of the wire is the point $(\bar{x}, \bar{y})$ where

$$\bar{x} = \frac{1}{m}\int_C x\rho(x, y)\,ds, \qquad \bar{y} = \frac{1}{m}\int_C y\rho(x, y)\,ds.$$

Now, suppose that a vector-valued force $\mathbf{F}$ is applied to an object to move it along the path traced by a plane curve $C$. If we approximate the curve by line segments, as before, the work done along the $i$th segment is approximately given by

$$W_i = \mathbf{F}(p_i^*) \cdot [\mathbf{T}(p_i^*)\,\Delta s_i]$$

where $p_i^*$ is a point on the segment, and $\mathbf{T}(p_i^*)$ is the unit tangent vector to the curve at this point. That is, $\mathbf{F} \cdot \mathbf{T} = \|\mathbf{F}\|\cos\theta$ is the component of the force that acts in the direction of motion at each point on the curve, where $\theta$ is the angle between $\mathbf{F}$ and the direction of the curve, which is indicated by $\mathbf{T}$. In the limit as $\max \Delta s_i \to 0$, we obtain the line integral of $\mathbf{F}$ along $C$,

$$\int_C \mathbf{F} \cdot \mathbf{T}\,ds.$$

If the curve $C$ is parametrized by the vector equation $\mathbf{r}(t) = \langle x(t), y(t)\rangle$, where $a \le t \le b$, then the unit tangent vector is parametrized by

$$\mathbf{T}(t) = \mathbf{r}'(t)/\|\mathbf{r}'(t)\|,$$

and, as before, $ds = \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt = \|\mathbf{r}'(t)\|\,dt$. It follows that

$$\int_C \mathbf{F} \cdot \mathbf{T}\,ds = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}\,\|\mathbf{r}'(t)\|\,dt = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = \int_C \mathbf{F} \cdot d\mathbf{r}.$$

The last form of the line integral is merely an abbreviation that is used for convenience. As with line integrals of scalar-valued functions, the parametric representation of the curve is necessary for actual evaluation of a line integral.

Example (Stewart, Section 13.2, Exercise 20) We evaluate the line integral

$$\int_C \mathbf{F} \cdot d\mathbf{r}$$


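where $\mathbf{F}(x, y, z) = \langle z, y, -x\rangle$ and $C$ is the curve defined by the parametric vector equation

$$\mathbf{r}(t) = \langle x(t), y(t), z(t)\rangle = \langle t, \sin t, \cos t\rangle, \qquad 0 \le t \le \pi.$$

We have

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_0^\pi \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt \\
&= \int_0^\pi \langle z(t), y(t), -x(t)\rangle \cdot \langle x'(t), y'(t), z'(t)\rangle\,dt \\
&= \int_0^\pi \langle \cos t, \sin t, -t\rangle \cdot \langle 1, \cos t, -\sin t\rangle\,dt \\
&= \int_0^\pi [\cos t + \sin t\cos t + t\sin t]\,dt \\
&= \int_0^\pi \cos t\,dt + \int_0^\pi \sin t\cos t\,dt + \int_0^\pi t\sin t\,dt \\
&= \sin t\Big|_0^\pi + \frac{1}{2}\sin^2 t\Big|_0^\pi - t\cos t\Big|_0^\pi + \int_0^\pi \cos t\,dt \\
&= \pi.
\end{aligned}$$

✷

If we write $\mathbf{F}(x, y) = \langle P(x, y), Q(x, y)\rangle$, where $P$ and $Q$ are the component functions of $\mathbf{F}$, then we have

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt \\
&= \int_a^b \langle P(x(t), y(t)), Q(x(t), y(t))\rangle \cdot \langle x'(t), y'(t)\rangle\,dt \\
&= \int_a^b P(x(t), y(t))x'(t)\,dt + \int_a^b Q(x(t), y(t))y'(t)\,dt.
\end{aligned}$$

When the curve is approximated by $n$ line segments, as before, the difference in the $x$-coordinates of each segment is, by the Mean Value Theorem,

$$\Delta x_i = x(t_i) - x(t_{i-1}) \approx x'(t_i^*)\,\Delta t,$$

where $t_{i-1} \le t_i^* \le t_i$. For this reason, we write

$$\int_a^b P(x(t), y(t))x'(t)\,dt = \int_C P\,dx,$$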


$$\int_a^b Q(x(t), y(t))y'(t)\,dt = \int_C Q\,dy,$$

and conclude

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C P\,dx + Q\,dy.$$

These line integrals of scalar-valued functions can be evaluated individually to obtain the line integral of the vector field $\mathbf{F}$ over $C$. However, it is important to note that unlike line integrals with respect to the arc length $s$, the value of line integrals with respect to $x$ or $y$ (or $z$, in 3-D) depends on the orientation of $C$. If the curve is traced in reverse (that is, from the terminal point to the initial point), then the sign of the line integral is reversed as well. We denote by $-C$ the curve $C$ with its orientation reversed. We then have

$$\int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r},$$

and

$$\int_{-C} P\,dx = -\int_C P\,dx, \qquad \int_{-C} Q\,dy = -\int_C Q\,dy.$$

All of this discussion generalizes to space curves (that is, curves in 3-D) in a straightforward manner, as illustrated in the examples.

Example (Stewart, Section 13.2, Exercise 6) Let $\mathbf{F}(x, y) = \langle \sin x, \cos y\rangle$ and let $C$ be the curve that is the top half of the circle $x^2 + y^2 = 1$, traversed counterclockwise from $(1, 0)$ to $(-1, 0)$, and the line segment from $(-1, 0)$ to $(-2, 3)$. To evaluate the line integral

$$\int_C \mathbf{F} \cdot \mathbf{T}\,ds = \int_C \sin x\,dx + \cos y\,dy,$$

we consider the integrals over the semicircle, denoted by $C_1$, and the line segment, denoted by $C_2$, separately. We then have

$$\int_C \sin x\,dx + \cos y\,dy = \int_{C_1} \sin x\,dx + \cos y\,dy + \int_{C_2} \sin x\,dx + \cos y\,dy.$$

For the semicircle, we use the parametric equations

$$x = \cos t, \qquad y = \sin t, \qquad 0 \le t \le \pi.$$

This yields

$$\int_{C_1} \sin x\,dx + \cos y\,dy = \int_0^\pi \sin(\cos t)(-\sin t)\,dt + \cos(\sin t)\cos t\,dt$$


$$\begin{aligned}
&= -\cos(\cos t)\Big|_0^\pi + \sin(\sin t)\Big|_0^\pi \\
&= -\cos(-1) + \cos(1) \\
&= 0.
\end{aligned}$$

For the line segment, we use the parametric equations

$$x = -1 - t, \qquad y = 3t, \qquad 0 \le t \le 1.$$

This yields

$$\begin{aligned}
\int_{C_2} \sin x\,dx + \cos y\,dy &= \int_0^1 \sin(-1 - t)(-1)\,dt + \cos(3t)(3)\,dt \\
&= -\cos(-1 - t)\Big|_0^1 + \sin(3t)\Big|_0^1 \\
&= -\cos(-2) + \cos(-1) + \sin(3) - \sin(0) \\
&= -\cos(2) + \cos(1) + \sin(3).
\end{aligned}$$

We conclude

$$\int_C \sin x\,dx + \cos y\,dy = \cos(1) - \cos(2) + \sin(3).$$

In evaluating these integrals, we have taken advantage of the rule

$$\int_a^b f'(g(t))g'(t)\,dt = f(g(b)) - f(g(a)),$$

from the Fundamental Theorem of Calculus and the Chain Rule. However, this shortcut can only be applied when an integral involves only one of the independent variables. ✷

Example (Stewart, Section 13.2, Exercise 12) We evaluate the line integral

$$\int_C \mathbf{F} \cdot d\mathbf{r}$$

where

$$\mathbf{F}(x, y, z) = \langle P(x, y, z), Q(x, y, z), R(x, y, z)\rangle = \langle z, x, y\rangle,$$

and $C$ is defined by the parametric equations

$$x = t^2, \qquad y = t^3, \qquad z = t^2, \qquad 0 \le t \le 1.$$


We have

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_C P\,dx + Q\,dy + R\,dz \\
&= \int_0^1 z(t)x'(t)\,dt + x(t)y'(t)\,dt + y(t)z'(t)\,dt \\
&= \int_0^1 t^2(2t)\,dt + t^2(3t^2)\,dt + t^3(2t)\,dt \\
&= \int_0^1 2t^3\,dt + 3t^4\,dt + 2t^4\,dt \\
&= \int_0^1 (5t^4 + 2t^3)\,dt \\
&= \left.\left(5\frac{t^5}{5} + 2\frac{t^4}{4}\right)\right|_0^1 \\
&= \frac{3}{2}.
\end{aligned}$$

✷

3.3 The Fundamental Theorem for Line Integrals

We have learned that the line integral of a vector field $\mathbf{F}$ over a piecewise smooth curve $C$ that is parameterized by a vector-valued function $\mathbf{r}(t)$, $a \le t \le b$, is given by

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt.$$

Now, suppose that $\mathbf{F}$ is continuous, and is a conservative vector field; that is, $\mathbf{F} = \nabla f$ for some scalar-valued function $f$. Then, by the Chain Rule, we have

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = \int_a^b \frac{d}{dt}[(f \circ \mathbf{r})(t)]\,dt = (f \circ \mathbf{r})(t)\Big|_a^b = f(\mathbf{r}(b)) - f(\mathbf{r}(a)).$$

This is the Fundamental Theorem of Line Integrals, which is a generalization of the Fundamental Theorem of Calculus.

If the curve $C$ is a closed curve; that is, the initial and terminal points of $C$ are the same, then $\mathbf{r}(b) = \mathbf{r}(a)$, which yields

$$\int_C \mathbf{F} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) = 0.$$
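As a numerical illustration of the theorem, the sketch below uses the conservative field $\mathbf{F}(x, y) = \langle y, x\rangle = \nabla(xy)$ from Section 3.1. The helper `line_integral`, the two sample curves, and the midpoint rule are assumptions chosen for this illustration rather than anything prescribed by the notes. It compares $\int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt$ with $f(\mathbf{r}(b)) - f(\mathbf{r}(a))$ on an open curve, and evaluates the integral around a closed curve.

```python
import math


def line_integral(F, r, r_prime, a, b, n=2000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b].

    Hypothetical helper: F returns a tuple of components, r(t) a point,
    and r_prime(t) the derivative of r, supplied by the caller.
    """
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        field = F(*r(t))
        velocity = r_prime(t)
        total += sum(fk * vk for fk, vk in zip(field, velocity))
    return total * dt


def F(x, y):
    return (y, x)  # gradient of f(x, y) = x * y


def f(x, y):
    return x * y


# Open curve r(t) = (t, t^2), 0 <= t <= 1, from (0, 0) to (1, 1).
open_value = line_integral(F, lambda t: (t, t * t), lambda t: (1.0, 2 * t), 0.0, 1.0)
endpoint_difference = f(1.0, 1.0) - f(0.0, 0.0)

# Closed curve: the unit circle, traversed once.
closed_value = line_integral(
    F,
    lambda t: (math.cos(t), math.sin(t)),
    lambda t: (-math.sin(t), math.cos(t)),
    0.0,
    2 * math.pi,
)
print(open_value, endpoint_difference, closed_value)
```

The first two printed numbers should agree up to discretization error, and the third should be zero up to rounding.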


If we decompose $C$ into two curves $C_1$ and $C_2$, and use the fact that the sign of the line integral of a vector field over a curve depends on the orientation of the curve, then we have

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \int_{C_2} \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} - \int_{-C_2} \mathbf{F} \cdot d\mathbf{r} = 0.$$

That is,

$$\int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{-C_2} \mathbf{F} \cdot d\mathbf{r}.$$

However, $C_1$ and $-C_2$ have the same initial and terminal points. It follows that if $\mathbf{F}$ is conservative within an open, connected domain $D$ (so that any two points in $D$ can be connected by a path that lies within $D$), then the line integral of $\mathbf{F}$ is independent of path in $D$; that is, the value of the line integral of $\mathbf{F}$ over a path $C$ depends only on its initial and terminal points.

The converse of this statement is also true: if the line integral of a vector field $\mathbf{F}$ is independent of path within an open, connected domain $D$, then $\mathbf{F}$ is a conservative vector field on $D$. To see this, we consider the two-variable case and let $D$ be a region in $\mathbb{R}^2$. Furthermore, we let $\mathbf{F}(x, y) = \langle P(x, y), Q(x, y)\rangle$. We choose an arbitrary point $(a, b) \in D$, and define

$$f(x, y) = \int_{(a, b)}^{(x, y)} \mathbf{F} \cdot d\mathbf{r}.$$

Since this line integral is independent of path, we can define $f(x, y)$ using any path between $(a, b)$ and $(x, y)$ that we choose, knowing that its value at $(x, y)$ will be the same in any case.

By choosing a path that ends with a horizontal line segment from $(x_1, y)$ to $(x, y)$ contained entirely in $D$, parametrized by $x = t$, $y = y$, for $x_1 \le t \le x$, we can show that

$$\begin{aligned}
\frac{\partial f}{\partial x}(x, y) &= \frac{\partial}{\partial x}\left[\int_{(a, b)}^{(x_1, y)} \mathbf{F} \cdot d\mathbf{r}\right] + \frac{\partial}{\partial x}\left[\int_{(x_1, y)}^{(x, y)} \mathbf{F} \cdot d\mathbf{r}\right] \\
&= 0 + \frac{\partial}{\partial x}\left[\int_{x_1}^{x} P(x(t), y)x'(t)\,dt + Q(x(t), y)y'(t)\,dt\right] \\
&= \frac{\partial}{\partial x}\left[\int_{x_1}^{x} P(t, y)\,dt + 0\right] \\
&= P(x, y).
\end{aligned}$$

Using a similar argument, we can show that $\partial f/\partial y = Q$. We have thus shown that $\mathbf{F}$ is conservative, and conclude that $\mathbf{F}$ is a conservative vector field if and only if its line integral is independent of path.
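The construction used in this argument can be tried out numerically. In the sketch below, the field $\langle P, Q\rangle = \langle 2xy, x^2 + 3y^2\rangle$ is a hypothetical example chosen because it is the gradient of $x^2 y + y^3$, and the helpers `integrate`, `potential` and `partial` are illustrative names, not part of the notes. The function `potential` evaluates $f(x, y)$ as a line integral from the base point along a vertical segment followed by a horizontal segment, and finite differences of $f$ are then compared with $P$ and $Q$.

```python
def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def P(x, y):
    return 2 * x * y


def Q(x, y):
    return x * x + 3 * y * y


def potential(x, y, x0=0.0, y0=0.0):
    """f(x, y) = line integral of <P, Q> from (x0, y0) to (x, y).

    The path is vertical from (x0, y0) to (x0, y), then horizontal to (x, y),
    so the path ends with a horizontal segment as in the argument above.
    """
    vertical = integrate(lambda s: Q(x0, s), y0, y)    # dx = 0 on this leg
    horizontal = integrate(lambda t: P(t, y), x0, x)   # dy = 0 on this leg
    return vertical + horizontal


def partial(f, x, y, var, h=1e-4):
    """Central-difference approximation of a first partial derivative of f."""
    if var == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


x, y = 1.3, 0.8  # arbitrary test point
fx = partial(potential, x, y, "x")
fy = partial(potential, x, y, "y")
print(fx, P(x, y))
print(fy, Q(x, y))
```

For a field that is not conservative, the first comparison would still succeed, since it relies only on the final horizontal segment, but the second generally would not.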


However, in order to use the Fundamental Theorem of Line Integrals to evaluate the line integral of a conservative vector field, it is necessary to obtain the function $f$ such that $\nabla f = \mathbf{F}$. Furthermore, the theorem cannot be applied to a vector field that is not conservative, so we need to be able to confirm that a given vector field is conservative before we proceed.

Continuing to restrict ourselves to the two-variable case, suppose that $\mathbf{F} = \langle P, Q\rangle$ is a conservative vector field defined on a domain $D$, and that $P$ and $Q$ have continuous first partial derivatives. Then, we have

$$\frac{\partial f}{\partial x} = P, \qquad \frac{\partial f}{\partial y} = Q,$$

for some function $f$. It follows that

$$\frac{\partial P}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}, \qquad \frac{\partial Q}{\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}.$$

However, by Clairaut’s Theorem, these mixed second partial derivatives of $f$ are equal, so it follows that

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$

if $\mathbf{F} = \langle P, Q\rangle$ is conservative.

If the domain $D$ is simply connected, meaning that any region enclosed by a closed curve in $D$ contains only points in $D$ (informally, $D$ has “no holes”), then the converse is true: if

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$

in $D$, then $\mathbf{F} = \langle P, Q\rangle$ is a conservative vector field. Similarly, if $\mathbf{F} = \langle P, Q, R\rangle$ is a vector field defined on a simply connected domain $D \subseteq \mathbb{R}^3$, and

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}, \qquad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}, \qquad \frac{\partial Q}{\partial z} = \frac{\partial R}{\partial y},$$

then $\mathbf{F}$ is conservative.

It remains to be able to find the function $f$ such that $\nabla f = \mathbf{F}$ for a given vector field $\mathbf{F} = \langle P, Q\rangle$ that is known to be conservative. The general technique is as follows:

• Integrate $P(x, y)$ with respect to $x$ to obtain

$$f(x, y) = f_1(x, y) + g(y),$$


where $f_1(x, y)$ is obtained by anti-differentiation of $P(x, y)$, and $g(y)$ is an unknown function that plays the role of the constant of integration, since $f(x, y)$ is obtained by anti-differentiating with respect to $x$.

• Differentiate $f$ with respect to $y$ to obtain

$$\frac{\partial}{\partial y}[f_1(x, y)] + g'(y) = Q(x, y),$$

and solve for $g'(y)$.

• Integrate $g'(y)$ with respect to $y$ to complete the definition of $f(x, y)$, up to a constant of integration.

A similar procedure can be used for a vector field defined on $\mathbb{R}^3$, except that the function $g$ depends on both $y$ and $z$, and differentiation with respect to both $y$ and $z$ is needed to completely define the function $f(x, y, z)$ such that $\nabla f = \mathbf{F}$.

Example (Stewart, Section 13.3, Exercise 14) Let

$$\mathbf{F}(x, y, z) = \langle P(x, y, z), Q(x, y, z), R(x, y, z)\rangle = \langle 2xz + y^2, 2xy, x^2 + 3z^2\rangle.$$

To confirm that $\mathbf{F}$ is conservative, we check the appropriate first partial derivatives of $P$, $Q$ and $R$:

$$P_y = 2y = Q_x, \qquad P_z = 2x = R_x, \qquad Q_z = 0 = R_y.$$

Now, to find a function $f(x, y, z)$ such that $\nabla f = \mathbf{F}$, which must satisfy $f_x = P$, we integrate $P(x, y, z)$ with respect to $x$ and obtain

$$f(x, y, z) = x^2 z + y^2 x + g(y, z).$$

Differentiating with respect to $y$ and $z$ yields the equations

$$\begin{aligned}
f_y(x, y, z) &= 2xy + g_y(y, z) = Q(x, y, z) = 2xy, \\
f_z(x, y, z) &= x^2 + g_z(y, z) = R(x, y, z) = x^2 + 3z^2.
\end{aligned}$$

It follows that

$$g_y(y, z) = 0, \qquad g_z(y, z) = 3z^2,$$

which yields

$$g(y, z) = z^3 + K$$


for some constant $K$. We conclude that $\mathbf{F} = \nabla f$ where

$$f(x, y, z) = x^2 z + y^2 x + z^3 + K$$

where $K$ is an arbitrary constant.

To evaluate the line integral of $\mathbf{F}$ over the curve $C$ parametrized by

$$x = t^2, \qquad y = t + 1, \qquad z = 2t - 1, \qquad 0 \le t \le 1,$$

we apply the Fundamental Theorem of Line Integrals and obtain

$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= f(x(1), y(1), z(1)) - f(x(0), y(0), z(0)) \\
&= f(1, 2, 1) - f(0, 1, -1) \\
&= 1^2(1) + 2^2(1) + 1^3 + K - (0^2(-1) + 1^2(0) + (-1)^3 + K) \\
&= 1 + 4 + 1 + K - (0 + 0 - 1 + K) \\
&= 7.
\end{aligned}$$

✷

Let $\mathbf{F}$ represent a force field. Then, recall that the work done by the force field to move an object along a path $\mathbf{r}(t)$, $a \le t \le b$, is given by the line integral

$$W = \int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt.$$

From Newton’s Second Law of Motion, we have

$$\mathbf{F}(\mathbf{r}(t)) = m\mathbf{r}''(t),$$

where $m$ is the mass of the object, and $\mathbf{r}''(t) = \mathbf{a}(t)$ is its acceleration. We then have

$$\begin{aligned}
W &= \int_a^b m\mathbf{r}''(t) \cdot \mathbf{r}'(t)\,dt \\
&= \frac{1}{2}m\int_a^b \frac{d}{dt}[\mathbf{r}'(t) \cdot \mathbf{r}'(t)]\,dt \\
&= \frac{1}{2}m\int_a^b \frac{d}{dt}[\|\mathbf{r}'(t)\|^2]\,dt \\
&= \frac{1}{2}m\|\mathbf{v}(b)\|^2 - \frac{1}{2}m\|\mathbf{v}(a)\|^2
\end{aligned}$$

where $\mathbf{v}(t) = \mathbf{r}'(t)$ is the velocity of the object.


It follows that

$$W = K(B) - K(A),$$

where $A = \mathbf{r}(a)$ and $B = \mathbf{r}(b)$ are the initial and terminal points, respectively, and

$$K(P) = \frac{1}{2}m\|\mathbf{v}(t)\|^2, \qquad \mathbf{r}(t) = P,$$

is the kinetic energy of the object at the point $P$. That is, the work done by the force field along $C$ is the change in the kinetic energy from point $A$ to point $B$.

If $\mathbf{F}$ is also a conservative force field, then $\mathbf{F} = -\nabla P$, where $P$ is the potential energy. It follows from the Fundamental Theorem of Line Integrals that

$$W = \int_C \mathbf{F} \cdot d\mathbf{r} = -\int_C \nabla P \cdot d\mathbf{r} = -[P(B) - P(A)].$$

Equating the two expressions for $W$, we conclude that

$$P(A) + K(A) = P(B) + K(B).$$

That is, when an object is moved by a conservative force field, then its total energy remains constant. This is known as the Law of Conservation of Energy.

3.4 Green’s Theorem

We have learned that if a vector field is conservative, then its line integral over a closed curve $C$ is equal to zero. However, if this is not the case, then evaluation of a line integral using the formula

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt,$$

where $\mathbf{r}(t)$ is a parameterization of $C$, can be very difficult, even if $C$ is a relatively simple plane curve. Fortunately, in this case, there is an alternative approach, using a result known as Green’s Theorem.

We assume that $\mathbf{F} = \langle P, Q\rangle$, and consider the case where $C$ encloses a region $D$ that can be viewed as a region of either type I or type II. That is, $D$ has the definitions

$$D = \{(x, y) \mid a \le x \le b,\ g_1(x) \le y \le g_2(x)\}$$


and

$$D = \{(x, y) \mid c \le y \le d,\ h_1(y) \le x \le h_2(y)\}.$$

Using the first definition, we have $C = C_1 \cup C_2 \cup (-C_3) \cup (-C_4)$, where:

• $C_1$ is the curve with parameterization $x = t$, $y = g_1(t)$, for $a \le t \le b$

• $C_2$ is the vertical line segment with parameterization $x = b$, $y = t$, for $g_1(b) \le t \le g_2(b)$

• $C_3$ is the curve with parameterization $x = t$, $y = g_2(t)$, for $a \le t \le b$

• $C_4$ is the vertical line segment with parameterization $x = a$, $y = t$, for $g_1(a) \le t \le g_2(a)$

We use positive orientation to describe the curve $C$, which means that the curve is traversed counterclockwise. This means that as the curve is traversed, the region $D$ is “on the left”.

In view of

$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C P\,dx + Q\,dy,$$

we have

$$\begin{aligned}
\int_C P\,dx &= \int_{C_1} P\,dx + \int_{C_2} P\,dx + \int_{-C_3} P\,dx + \int_{-C_4} P\,dx \\
&= \int_{C_1} P\,dx + \int_{C_2} P\,dx - \int_{C_3} P\,dx - \int_{C_4} P\,dx \\
&= \int_a^b P(x(t), y(t))x'(t)\,dt + \int_{g_1(b)}^{g_2(b)} P(x(t), y(t))x'(t)\,dt \\
&\qquad - \int_a^b P(x(t), y(t))x'(t)\,dt - \int_{g_1(a)}^{g_2(a)} P(x(t), y(t))x'(t)\,dt \\
&= \int_a^b P(t, g_1(t))(1)\,dt + \int_{g_1(b)}^{g_2(b)} P(b, t)(0)\,dt \\
&\qquad - \int_a^b P(t, g_2(t))(1)\,dt - \int_{g_1(a)}^{g_2(a)} P(a, t)(0)\,dt \\
&= \int_a^b [P(t, g_1(t)) - P(t, g_2(t))]\,dt \\
&= -\int_a^b\int_{g_1(t)}^{g_2(t)} P_y(t, y)\,dy\,dt \\
&= -\iint_D \frac{\partial P}{\partial y}\,dA.
\end{aligned}$$
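The identity just derived can be tested on a concrete type I region. Everything specific in the sketch below is an assumption made for illustration: the region $0 \le x \le 1$, $x^2 \le y \le x + 1$, the function $P(x, y) = xy^2$, and the helper `integrate`. The boundary integral is assembled from the four pieces $C_1$, $C_2$, $-C_3$, $-C_4$ exactly as in the derivation, with the vertical pieces contributing nothing because $x'(t) = 0$ on them.

```python
def integrate(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def P(x, y):
    return x * y ** 2


def P_y(x, y):
    return 2 * x * y  # partial derivative of P with respect to y


def g1(x):
    return x ** 2     # lower boundary curve


def g2(x):
    return x + 1      # upper boundary curve


a, b = 0.0, 1.0

# Integral of P dx over the positively oriented boundary of D.
boundary = (
    integrate(lambda t: P(t, g1(t)), a, b)    # C1: x = t, y = g1(t), x'(t) = 1
    + 0.0                                     # C2: x = b is constant, so dx = 0
    - integrate(lambda t: P(t, g2(t)), a, b)  # -C3: C3 traversed in reverse
    - 0.0                                     # -C4: x = a is constant, so dx = 0
)

# Double integral of dP/dy over D, written as an iterated integral.
double = integrate(lambda x: integrate(lambda y: P_y(x, y), g1(x), g2(x)), a, b)

print(boundary, -double)
```

For this choice of region and $P$, both quantities can be computed by hand and equal $-5/4$, which the printed values should reproduce closely.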


Using a similar approach in which $D$ is viewed as a region of type II, we obtain

$$\int_C Q\,dy = \iint_D \frac{\partial Q}{\partial x}\,dA.$$

Putting these results together, we obtain Green’s Theorem, which states that if $C$ is a positively oriented, piecewise smooth, simple (that is, not self-intersecting) closed curve that encloses a region $D$, and $P$ and $Q$ are functions that have continuous first partial derivatives on $D$, then

$$\int_C P\,dx + Q\,dy = \iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA.$$

Another common statement of the theorem is

$$\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA = \int_{\partial D} P\,dx + Q\,dy,$$

where $\partial D$ denotes the positively oriented boundary of $D$.

This theorem can be used to find a simpler approach to evaluating a line integral of the vector field $\langle P, Q\rangle$ over $C$ by converting the integral to a double integral over $D$, or it can be used to find a simpler approach to evaluating a double integral over a region $D$ by converting it into an integral over its boundary.

To show that Green’s Theorem applies for more general regions than those that are of both type I and type II, we consider a region $D$ that is the union of two regions $D_1$ and $D_2$ that are of both type I and type II. Let $C$ be the positively oriented boundary of $D$, let $D_1$ have positively oriented boundary $C_1 \cup C_3$, and let $D_2$ have positively oriented boundary $C_2 \cup (-C_3)$, where $C_3$ is the boundary between $D_1$ and $D_2$. Then, $C = C_1 \cup C_2$. It follows that for functions $P$ and $Q$ that satisfy the assumptions of Green’s Theorem on $D$, we can apply the theorem to $D_1$ and $D_2$ individually to obtain

$$\begin{aligned}
\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA &= \iint_{D_1} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA + \iint_{D_2} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA \\
&= \int_{C_1 \cup C_3} P\,dx + Q\,dy + \int_{C_2 \cup (-C_3)} P\,dx + Q\,dy \\
&= \int_{C_1} P\,dx + Q\,dy + \int_{C_3} P\,dx + Q\,dy + \int_{C_2} P\,dx + Q\,dy + \int_{-C_3} P\,dx + Q\,dy
\end{aligned}$$


$$\begin{aligned}
&= \int_{C_1} P\,dx + Q\,dy + \int_{C_2} P\,dx + Q\,dy + \int_{C_3} P\,dx + Q\,dy - \int_{C_3} P\,dx + Q\,dy \\
&= \int_{C_1} P\,dx + Q\,dy + \int_{C_2} P\,dx + Q\,dy \\
&= \int_{C_1 \cup C_2} P\,dx + Q\,dy \\
&= \int_C P\,dx + Q\,dy.
\end{aligned}$$
We conclude that Green's Theorem holds on $D_1 \cup D_2$. The same argument can be used to easily show that Green's Theorem applies on any finite union of simple regions, which are regions of both type I and type II.

Green's Theorem can also be applied to regions with “holes”, that is, regions that are not simply connected. To see this, let $D$ be a region enclosed by two curves $C_1$ and $C_2$ that are both positively oriented with respect to $D$ (that is, $D$ is on the left as either $C_1$ or $C_2$ is traversed). Let $C_2$ be contained within the region enclosed by $C_1$; that is, let $C_2$ be the boundary of the “hole” in $D$. Then, we can decompose $D$ into two simply connected regions $D'$ and $D''$ by connecting $C_2$ to $C_1$ along two separate curves that lie within $D$. Applying Green's Theorem to $D'$ and $D''$ individually, we find that the line integrals along the common boundaries of $D'$ and $D''$ cancel, because they have opposite orientations with respect to these regions. Therefore, we have
$$\begin{aligned}
\iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA &= \iint_{D'} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA + \iint_{D''} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA \\
&= \int_{C_1} P\,dx + Q\,dy + \int_{C_2} P\,dx + Q\,dy \\
&= \int_{C_1 \cup C_2} P\,dx + Q\,dy.
\end{aligned}$$
Therefore, Green's Theorem applies to $D$ as well.

Example The vector field
$$\mathbf{F}(x, y) = \langle P(x, y), Q(x, y) \rangle = \left\langle \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right\rangle$$


satisfies $\partial Q/\partial x = \partial P/\partial y$ on all of $\mathbb{R}^2$ except at the origin, where it is not defined. Locally, wherever $x \neq 0$, $\mathbf{F} = \nabla f$ where
$$f(x, y) = \tan^{-1} \frac{y}{x}.$$
Now, consider a region $D$ that is enclosed by a positively oriented, piecewise smooth, simple closed curve $C$, and also has a “hole” that is a disk of radius $a$, centered at the origin, and contained entirely within $C$. Let $C'$ be the positively oriented boundary of this disk. Then, the boundary of $D$ is $C \cup (-C')$: as part of the boundary of $D$, rather than of the disk, $C'$ must be traversed with the opposite orientation so that $D$ remains on the left. Applying Green's Theorem to compute the line integral of $\mathbf{F}$ over the boundary of $D$ yields
$$\int_C P\,dx + Q\,dy + \int_{-C'} P\,dx + Q\,dy = \iint_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA = 0,$$
since $\partial Q/\partial x = \partial P/\partial y$ on $D$. It follows that
$$\int_C \mathbf{F} \cdot d\mathbf{r} = -\int_{-C'} \mathbf{F} \cdot d\mathbf{r} = \int_{C'} \mathbf{F} \cdot d\mathbf{r},$$
so we can compute the line integral of $\mathbf{F}$ over $C$, which we have not specified, by computing the line integral over the circle $C'$, which can be parameterized by $x = a \cos t$, $y = a \sin t$, for $0 \le t \le 2\pi$. This yields
$$\begin{aligned}
\int_{C'} \mathbf{F} \cdot d\mathbf{r} &= \int_0^{2\pi} P(x(t), y(t))\,x'(t)\,dt + Q(x(t), y(t))\,y'(t)\,dt \\
&= \int_0^{2\pi} \left(\frac{-a \sin t}{(a \cos t)^2 + (a \sin t)^2}\right)(-a \sin t)\,dt + \left(\frac{a \cos t}{(a \cos t)^2 + (a \sin t)^2}\right)(a \cos t)\,dt \\
&= \int_0^{2\pi} \frac{a^2 \sin^2 t}{a^2 \cos^2 t + a^2 \sin^2 t}\,dt + \frac{a^2 \cos^2 t}{a^2 \cos^2 t + a^2 \sin^2 t}\,dt \\
&= \int_0^{2\pi} 1\,dt \\
&= 2\pi.
\end{aligned}$$
We conclude that the line integral of $\mathbf{F}$ over any positively oriented, piecewise smooth, simple closed curve that encloses the origin is equal to $2\pi$. □

Example Consider an $n$-sided polygon $P$ with vertices $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. The area $A$ of the polygon is given by the double integral
$$A = \iint_P 1\,dA.$$


Let $P(x, y) = -y/2$ and $Q(x, y) = x/2$. Then
$$\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) = \left(\frac{1}{2} - \left(-\frac{1}{2}\right)\right) = 1.$$
It follows from Green's Theorem that if $\partial P$ is positively oriented, then
$$A = \int_{\partial P} Q\,dy + P\,dx = \frac{1}{2} \int_{\partial P} x\,dy - y\,dx.$$
To evaluate this line integral, we consider each edge of $P$ individually. Let $C$ be the line segment from $(x_1, y_1)$ to $(x_2, y_2)$, and assume, for convenience, that $C$ is not vertical. Then $C$ can be parameterized by $x = t$, $y = mt + b$, for $x_1 \le t \le x_2$, where
$$m = \frac{y_2 - y_1}{x_2 - x_1}, \quad b = y_1 - m x_1.$$
We then have
$$\begin{aligned}
\int_C x\,dy - y\,dx &= \int_{x_1}^{x_2} mt\,dt - (mt + b)\,dt \\
&= -\int_{x_1}^{x_2} b\,dt \\
&= b(x_1 - x_2) \\
&= y_1(x_1 - x_2) - m x_1 (x_1 - x_2) \\
&= y_1(x_1 - x_2) + (y_2 - y_1) x_1 \\
&= x_1 y_2 - x_2 y_1.
\end{aligned}$$
We conclude that
$$A = \frac{1}{2} \left[(x_1 y_2 - x_2 y_1) + (x_2 y_3 - x_3 y_2) + \cdots + (x_{n-1} y_n - x_n y_{n-1}) + (x_n y_1 - x_1 y_n)\right].$$
□

3.5 Curl and Divergence

We have seen two theorems in vector calculus, the Fundamental Theorem of Line Integrals and Green's Theorem, that relate an integral over a set to an integral over its boundary. Before establishing similar results that apply to


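surfaces and solids, it is helpful to introduce new operations on vector fields that will simplify exposition.

We have previously learned that a vector field $\mathbf{F} = \langle P, Q, R \rangle$ defined on $\mathbb{R}^3$ is conservative if
$$R_y - Q_z = 0, \quad P_z - R_x = 0, \quad Q_x - P_y = 0.$$
These equations are equivalent to the statement
$$\left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \times \langle P, Q, R \rangle = \langle 0, 0, 0 \rangle.$$
Therefore, we define the curl of a vector field $\mathbf{F} = \langle P, Q, R \rangle$ by
$$\operatorname{curl} \mathbf{F} = \nabla \times \mathbf{F},$$
where
$$\nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle.$$
From the definition of a conservative vector field, it follows that $\operatorname{curl} \mathbf{F} = \mathbf{0}$ if $\mathbf{F} = \nabla f$ where $f$ has continuous second partial derivatives, due to Clairaut's Theorem. That is, the curl of a gradient is zero.

This is equivalent to the statement that the curl of a conservative vector field is zero. The converse, that a vector field $\mathbf{F}$ for which $\operatorname{curl} \mathbf{F} = \mathbf{0}$ is conservative, is also true if $\mathbf{F}$ has continuous first partial derivatives and $\operatorname{curl} \mathbf{F} = \mathbf{0}$ within a simply connected domain. That is, the domain must not have “holes”.

When $\mathbf{F}$ represents the velocity field of a fluid, the fluid tends to rotate around the axis that is aligned with $\operatorname{curl} \mathbf{F}$, and the magnitude of $\operatorname{curl} \mathbf{F}$ indicates the speed of rotation. Therefore, when $\operatorname{curl} \mathbf{F} = \mathbf{0}$, we say that $\mathbf{F}$ is irrotational, which is a term that has previously been associated with the equivalent condition of $\mathbf{F}$ being conservative.

Another operation that is useful for discussing properties of vector fields is the divergence of a vector field $\mathbf{F}$, denoted by $\operatorname{div} \mathbf{F}$. It is defined by
$$\operatorname{div} \mathbf{F} = \nabla \cdot \mathbf{F}.$$
For example, if $\mathbf{F} = \langle P, Q, R \rangle$, then
$$\operatorname{div} \mathbf{F} = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \langle P, Q, R \rangle = P_x + Q_y + R_z.$$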


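Unlike the curl, the divergence is defined for vector fields with any number of variables, as long as the number of independent variables and the number of dependent variables are the same.

It can be verified directly that if $\mathbf{F}$ is the curl of a vector field $\mathbf{G}$, then $\operatorname{div} \mathbf{F} = 0$. That is, the divergence of any curl is zero, as long as $\mathbf{G}$ has continuous second partial derivatives. This is useful for determining whether a given vector field $\mathbf{F}$ is the curl of any other vector field $\mathbf{G}$, for if it is, its divergence must be zero.

Example (Stewart, Section 13.5, Exercise 18) The vector field $\mathbf{F}(x, y, z) = \langle yz, xyz, xy \rangle$ is not the curl of any vector field $\mathbf{G}$, because
$$\operatorname{div} \mathbf{F} = (yz)_x + (xyz)_y + (xy)_z = 0 + xz + 0 = xz,$$
whereas if $\mathbf{F} = \operatorname{curl} \mathbf{G}$, then
$$\operatorname{div} \mathbf{F} = \operatorname{div} \operatorname{curl} \mathbf{G} = 0.$$
□

If $\mathbf{F}$ represents the velocity field of a fluid, then, at each point within the fluid, $\operatorname{div} \mathbf{F}$ measures the tendency of the fluid to diverge away from that point. Specifically, the divergence is related to the rate of change, with respect to time, of the density of the fluid. Therefore, if $\operatorname{div} \mathbf{F} = 0$, then we say that $\mathbf{F}$, and therefore the fluid as well, is incompressible.

The divergence of a gradient is
$$\operatorname{div}(\nabla f) = \nabla \cdot \nabla f = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$
We denote this expression $\nabla \cdot \nabla f$ by $\nabla^2 f$, or $\Delta f$, which is called the Laplacian of $f$. The operator $\nabla^2$ is called the Laplace operator. Its name comes from Laplace's equation
$$\Delta f = 0.$$
The curl and divergence can be used to restate Green's Theorem in forms that are more directly generalizable to surfaces and solids in $\mathbb{R}^3$. Let $\mathbf{F} = \langle P, Q, 0 \rangle$, the embedding of a two-dimensional vector field in $\mathbb{R}^3$. Then
$$\operatorname{curl} \mathbf{F} = \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) \mathbf{k},$$
where, as before, $\mathbf{k} = \langle 0, 0, 1 \rangle$. It follows that
$$\operatorname{curl} \mathbf{F} \cdot \mathbf{k} = \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) \mathbf{k} \cdot \mathbf{k} = \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right).$$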


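This expression is called the scalar curl of the two-dimensional vector field $\langle P, Q \rangle$. We conclude that Green's Theorem can be rewritten as
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \iint_D (\operatorname{curl} \mathbf{F}) \cdot \mathbf{k}\,dA.$$
Another useful form of Green's Theorem involves the divergence. Let $\mathbf{F} = \langle P, Q \rangle$ have continuous first partial derivatives in a domain $D$ with a positively oriented, piecewise smooth boundary $C$ that has parametrization $\mathbf{r}(t) = \langle x(t), y(t) \rangle$, for $a \le t \le b$. Using the original form of Green's Theorem, we have
$$\begin{aligned}
\iint_D \operatorname{div} \mathbf{F}\,dA &= \iint_D \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) dA \\
&= \int_C P\,dy - Q\,dx \\
&= \int_a^b P(x(t), y(t))\,y'(t)\,dt - Q(x(t), y(t))\,x'(t)\,dt \\
&= \int_a^b \left[P(x(t), y(t)) \frac{y'(t)}{\|\mathbf{r}'(t)\|} + Q(x(t), y(t)) \frac{-x'(t)}{\|\mathbf{r}'(t)\|}\right] \|\mathbf{r}'(t)\|\,dt \\
&= \int_a^b (\mathbf{F} \cdot \mathbf{n})(t)\,\|\mathbf{r}'(t)\|\,dt \\
&= \int_C \mathbf{F} \cdot \mathbf{n}\,ds,
\end{aligned}$$
where
$$\mathbf{n}(t) = \frac{1}{\|\mathbf{r}'(t)\|} \langle y'(t), -x'(t) \rangle$$
is the outward unit normal vector to the curve $C$. Note that $\mathbf{n} \cdot \mathbf{T} = 0$, where $\mathbf{T}$ is the unit tangent vector
$$\mathbf{T}(t) = \frac{1}{\|\mathbf{r}'(t)\|} \langle x'(t), y'(t) \rangle.$$
We have established a third form of Green's Theorem,
$$\int_C \mathbf{F} \cdot \mathbf{n}\,ds = \iint_D \operatorname{div} \mathbf{F}\,dA.$$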
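The divergence form of Green's Theorem can be checked numerically on a simple case. The sketch below assumes the sample field $\mathbf{F} = \langle x^3, y^3 \rangle$ on the unit disk, with the boundary parameterized counterclockwise by $\mathbf{r}(t) = \langle \cos t, \sin t \rangle$; the field, the region, and the helper `midpoint` are illustrative choices rather than anything taken from the notes.

```python
import math

def midpoint(f, a, b, n=400):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Assumed sample field F = <x^3, y^3>, so div F = 3x^2 + 3y^2.
def F(x, y):
    return (x ** 3, y ** 3)

def div_F(x, y):
    return 3 * x ** 2 + 3 * y ** 2

# Boundary C: unit circle r(t) = <cos t, sin t>, traversed counterclockwise.
def flux_integrand(t):
    x, y = math.cos(t), math.sin(t)
    dx, dy = -math.sin(t), math.cos(t)      # components of r'(t)
    P, Q = F(x, y)
    return P * dy - Q * dx                  # equals (F . n) * ||r'(t)||

flux = midpoint(flux_integrand, 0.0, 2 * math.pi)

# Double integral of div F over the unit disk in polar coordinates (dA = r dr dtheta).
area_integral = midpoint(
    lambda r: midpoint(
        lambda th: div_F(r * math.cos(th), r * math.sin(th)) * r, 0.0, 2 * math.pi
    ),
    0.0, 1.0,
)

print(flux, area_integral)  # both close to 3*pi/2
```

By hand, $\mathbf{F} \cdot \mathbf{n} = \cos^4 t + \sin^4 t$ on the circle, whose integral over $[0, 2\pi]$ is $3\pi/2$, in agreement with $\iint_D 3(x^2 + y^2)\,dA = 3\pi/2$.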


3.6 Parametric Surfaces and Their Areas

We have learned that Green's Theorem can be used to relate a line integral of a two-dimensional vector field $\mathbf{F}$ over a closed plane curve $C$ to a double integral of a component of $\operatorname{curl} \mathbf{F}$ over the region $D$ that is enclosed by $C$. Our goal is to generalize this result in such a way as to relate the line integral of a three-dimensional vector field $\mathbf{F}$ over a closed space curve $C$ to the integral of a component of $\operatorname{curl} \mathbf{F}$ over a surface enclosed by $C$.

We have also learned that Green's Theorem relates the integral of the normal component of a two-dimensional vector field over a closed curve $C$ to the double integral of $\operatorname{div} \mathbf{F}$ over the region $D$ that it encloses. We wish to generalize this result in order to relate the integral of the normal component of a three-dimensional vector field $\mathbf{F}$ over a closed surface $S$ to the triple integral of $\operatorname{div} \mathbf{F}$ over the solid $E$ contained within $S$.

In order to realize either of these generalizations, we need to be able to integrate functions over piecewise smooth surfaces, just as we now know how to integrate functions over piecewise smooth curves. Whereas a smooth curve $C$, being a curved one-dimensional entity, is most conveniently described by a parameterization $\mathbf{r}(t)$, where $a \le t \le b$ and $\mathbf{r}(t)$ is a differentiable function of one variable, a smooth surface $S$, being a curved two-dimensional entity, is most conveniently described by a parametrization $\mathbf{r}(u, v)$, where $(u, v)$ lies within a 2-D region, and $\mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v) \rangle$ is a differentiable function of two variables. We say that $S$ is a parametric surface, and
$$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v)$$
are the parametric equations of $S$.

Example The Möbius strip is a surface that is famous for being a nonorientable surface; that is, it “has only one side”. It can be parameterized by
$$\begin{aligned}
x(u, v) &= \left(1 + \frac{v}{2} \cos \frac{u}{2}\right) \cos u, \\
y(u, v) &= \left(1 + \frac{v}{2} \cos \frac{u}{2}\right) \sin u, \\
z(u, v) &= \frac{v}{2} \sin \frac{u}{2},
\end{aligned}$$
where $0 \le u \le 2\pi$ and $-1 \le v \le 1$. It is shown in Figure 3.3. □

Example The paraboloid defined by the equation $x = 4y^2 + 4z^2$, $0 \le x \le 4$,


can also be defined by the parametric equations
$$x = x, \quad y = \frac{\sqrt{x}}{2} \cos \theta, \quad z = \frac{\sqrt{x}}{2} \sin \theta,$$
where $0 \le \theta \le 2\pi$, since for each $x$, a point $(x, y, z)$ on the paraboloid must lie on a circle centered at $(x, 0, 0)$ with radius $\sqrt{x}/2$, parallel to the $yz$-plane. This is an example of a surface of revolution, since the surface is obtained by revolving the curve $y = f(x) = \sqrt{x}/2$ around the $x$-axis. □

Figure 3.3: The Möbius strip

Let $P_0 = (x_0, y_0, z_0) = \mathbf{r}(u_0, v_0)$ be a point on a parametric surface $S$. A curve defined by $\mathbf{g}(v) = \mathbf{r}(u_0, v)$ that lies within $S$ and passes through $P_0$ has the tangent vector
$$\mathbf{r}_v = \mathbf{g}'(v_0) = \left\langle \frac{\partial x}{\partial v}(u_0, v_0), \frac{\partial y}{\partial v}(u_0, v_0), \frac{\partial z}{\partial v}(u_0, v_0) \right\rangle$$
at $P_0$. Similarly, the tangent vector at $P_0$ of the curve $\mathbf{h}(u) = \mathbf{r}(u, v_0)$, that also lies within $S$ and passes through $P_0$, is
$$\mathbf{r}_u = \mathbf{h}'(u_0) = \left\langle \frac{\partial x}{\partial u}(u_0, v_0), \frac{\partial y}{\partial u}(u_0, v_0), \frac{\partial z}{\partial u}(u_0, v_0) \right\rangle.$$


If these vectors are not parallel, then together they define the tangent plane of $S$ at $P_0$. Its normal vector is
$$\mathbf{r}_u \times \mathbf{r}_v = \langle a, b, c \rangle,$$
which yields the equation
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0$$
of the tangent plane.

Example (Stewart, Section 13.6, Exercise 30) Consider the surface defined by the parametric equations
$$x = u^2, \quad y = v^2, \quad z = uv, \quad 0 \le u, v \le 10.$$
At the point $(x_0, y_0, z_0) = (1, 1, 1)$, which corresponds to $u_0 = 1$, $v_0 = 1$, the equation of the tangent plane can be obtained by first computing the partial derivatives of the coordinate functions. We have
$$x_u = 2u, \quad y_u = 0, \quad z_u = v,$$
$$x_v = 0, \quad y_v = 2v, \quad z_v = u.$$
Evaluating at $(u_0, v_0)$ yields
$$\mathbf{r}_u = \langle x_u, y_u, z_u \rangle = \langle 2, 0, 1 \rangle, \quad \mathbf{r}_v = \langle x_v, y_v, z_v \rangle = \langle 0, 2, 1 \rangle.$$
It follows that the normal to the tangent plane is
$$\mathbf{n} = \mathbf{r}_u \times \mathbf{r}_v = \langle 2, 0, 1 \rangle \times \langle 0, 2, 1 \rangle = \langle -2, -2, 4 \rangle.$$
We conclude that the equation of the tangent plane is
$$-2(x - 1) - 2(y - 1) + 4(z - 1) = 0.$$
□

The vectors $\mathbf{r}_u$ and $\mathbf{r}_v$ are helpful for computing the area of a smooth surface $S$. For simplicity, we assume that $S$ is parametrized by a function $\mathbf{r}(u, v)$ with domain $D$, where $D = [a, b] \times [c, d]$ is a rectangle in the $uv$-plane. We divide $[a, b]$ into $n$ subintervals $[u_{i-1}, u_i]$ of width $\Delta u = (b - a)/n$, and divide $[c, d]$ into $m$ subintervals $[v_{j-1}, v_j]$ of width $\Delta v = (d - c)/m$.


Then, $\mathbf{r}$ approximately maps the rectangle $R_{ij}$ with lower left corner $(u_{i-1}, v_{j-1})$ into a parallelogram with adjacent edges defined by the vectors
$$\mathbf{r}(u_i, v_{j-1}) - \mathbf{r}(u_{i-1}, v_{j-1}) \approx \mathbf{r}_u \Delta u$$
and
$$\mathbf{r}(u_{i-1}, v_j) - \mathbf{r}(u_{i-1}, v_{j-1}) \approx \mathbf{r}_v \Delta v.$$
The area of this parallelogram is
$$A_{ij} = \|\mathbf{r}_u \times \mathbf{r}_v\| \Delta u \Delta v.$$
Adding all of these areas approximates the area of $S$, which we denote by $A(S)$. If we let $m, n \to \infty$, we obtain
$$A(S) = \lim_{m, n \to \infty} \sum_{i=1}^n \sum_{j=1}^m A_{ij} = \iint_D \|\mathbf{r}_u \times \mathbf{r}_v\|\,dA.$$
Example (Stewart, Section 13.6, Exercise 34) We wish to find the area of the surface $S$ that is the part of the plane $2x + 5y + z = 10$ that lies inside the cylinder $x^2 + y^2 = 9$. First, we must find parametric equations for this surface. Because $x$ and $y$ are restricted to the disk of radius 3 centered at the origin, it makes sense to use polar coordinates for $x$ and $y$. We then have the parametric equations
$$x = u \cos v, \quad y = u \sin v, \quad z = 10 - u(2 \cos v + 5 \sin v),$$
where $0 \le u \le 3$ and $0 \le v \le 2\pi$. We then have
$$\mathbf{r}_u = \langle x_u, y_u, z_u \rangle = \langle \cos v, \sin v, -2 \cos v - 5 \sin v \rangle,$$
$$\mathbf{r}_v = \langle x_v, y_v, z_v \rangle = \langle -u \sin v, u \cos v, u(2 \sin v - 5 \cos v) \rangle.$$
It follows that
$$\|\mathbf{r}_u \times \mathbf{r}_v\| = \|\langle 2u, 5u, u \rangle\| = |u| \sqrt{30}.$$
We then have
$$A(S) = \int_0^3 \int_0^{2\pi} u \sqrt{30}\,dv\,du = 2\pi \sqrt{30} \int_0^3 u\,du = 9\pi \sqrt{30}.$$
As expected, $\mathbf{r}_u \times \mathbf{r}_v$ is parallel to the normal vector $\langle 2, 5, 1 \rangle$ of the plane $2x + 5y + z = 10$, since it is normal to the surface at every point. □
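The area formula can be approximated directly as a Riemann sum of $\|\mathbf{r}_u \times \mathbf{r}_v\|$. The following is a minimal sketch, assuming central differences for the partial derivatives and a midpoint rule on the parameter rectangle; the helper names `cross`, `partials`, and `surface_area` are hypothetical. It is applied to the plane-inside-cylinder example above.

```python
import math

def r(u, v):
    """Parameterization from the example: plane 2x + 5y + z = 10 over the disk of radius 3."""
    x = u * math.cos(v)
    y = u * math.sin(v)
    return (x, y, 10 - 2 * x - 5 * y)

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def partials(r, u, v, h=1e-6):
    """Central-difference approximations of r_u and r_v at (u, v)."""
    ru = tuple((p - m) / (2 * h) for p, m in zip(r(u + h, v), r(u - h, v)))
    rv = tuple((p - m) / (2 * h) for p, m in zip(r(u, v + h), r(u, v - h)))
    return ru, rv

def surface_area(r, u_range, v_range, n=200, m=200):
    """Midpoint Riemann sum of ||r_u x r_v|| over the rectangle D."""
    (a, b), (c, d) = u_range, v_range
    du, dv = (b - a) / n, (d - c) / m
    total = 0.0
    for i in range(n):
        for j in range(m):
            ru, rv = partials(r, a + (i + 0.5) * du, c + (j + 0.5) * dv)
            total += math.sqrt(sum(k * k for k in cross(ru, rv))) * du * dv
    return total

area = surface_area(r, (0.0, 3.0), (0.0, 2 * math.pi))
print(area, 9 * math.pi * math.sqrt(30))  # the two values should nearly agree
```

Finite differences are used here only so that the same function works for any parameterization; when $\mathbf{r}_u$ and $\mathbf{r}_v$ are known in closed form, as in the example, they can be supplied directly instead.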


Often, a surface is defined to be the graph of a function $z = f(x, y)$. Such a surface can be parametrized by
$$x = u, \quad y = v, \quad z = f(u, v), \quad (u, v) \in D.$$
It follows that
$$\mathbf{r}_u = \langle 1, 0, f_u \rangle, \quad \mathbf{r}_v = \langle 0, 1, f_v \rangle.$$
We then have $\mathbf{r}_v \times \mathbf{r}_u = \langle f_u, f_v, -1 \rangle$, which yields the equation of the tangent plane
$$\frac{\partial f}{\partial u}(u_0, v_0)(x - x_0) + \frac{\partial f}{\partial v}(u_0, v_0)(y - y_0) = z - z_0,$$
which, using the relations $x = u$ and $y = v$, can be rewritten as
$$\frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0) = z - z_0.$$
This is the same equation obtained previously for the tangent plane of a surface defined by an equation of the form $z = f(x, y)$. It follows that the area of such a surface is given by the double integral
$$A(S) = \iint_D \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\,dA.$$
Example (Stewart, Section 13.6, Exercise 38) To find the area $A(S)$ of the surface $z = 1 + 3x + 2y^2$ that lies above the triangle with vertices $(0, 0)$, $(0, 1)$ and $(2, 1)$, we compute
$$\frac{\partial z}{\partial x} = 3, \quad \frac{\partial z}{\partial y} = 4y,$$
and then evaluate the double integral
$$\begin{aligned}
A(S) &= \int_0^1 \int_0^{2y} \sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\,dx\,dy \\
&= \int_0^1 \int_0^{2y} \sqrt{10 + 16y^2}\,dx\,dy \\
&= \int_0^1 2y \sqrt{10 + 16y^2}\,dy \\
&= \frac{1}{16} \int_{10}^{26} u^{1/2}\,du
\end{aligned}$$


$$\begin{aligned}
&= \frac{1}{24} u^{3/2} \Big|_{10}^{26} \\
&= \frac{1}{24} \left(26^{3/2} - 10^{3/2}\right) \\
&\approx 4.206.
\end{aligned}$$
□

A surface of revolution $S$ that is obtained by revolving the curve $y = f(x)$, $a \le x \le b$, around the $x$-axis has parametric equations
$$x = u, \quad y = f(u) \cos v, \quad z = f(u) \sin v,$$
where $a \le u \le b$ and $0 \le v \le 2\pi$. From these equations, we obtain
$$\|\mathbf{r}_u \times \mathbf{r}_v\| = |f(u)| \sqrt{1 + [f'(u)]^2},$$
which yields
$$A(S) = 2\pi \int_a^b |f(u)| \sqrt{1 + [f'(u)]^2}\,du.$$
If $y = f(x)$ is revolved around the $y$-axis instead, then the area is
$$A(S) = 2\pi \int_a^b |u| \sqrt{1 + [f'(u)]^2}\,du,$$
which can be obtained by considering the case of revolving $x = f^{-1}(y)$ around the $y$-axis and proceeding with a parametrization similar to the case of revolving around the $x$-axis.

3.7 Surface Integrals

3.7.1 Surface Integrals of Scalar-Valued Functions

Previously, we have learned how to integrate functions along curves. If a smooth space curve $C$ is parameterized by a function $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, $a \le t \le b$, then the arc length $L$ of $C$ is given by the integral
$$\int_a^b \|\mathbf{r}'(t)\|\,dt.$$
Similarly, the integral of a scalar-valued function $f(x, y, z)$ along $C$ is given by
$$\int_C f\,ds = \int_a^b f(x(t), y(t), z(t)) \|\mathbf{r}'(t)\|\,dt.$$


It follows that the integral of $f(x, y, z) \equiv 1$ along $C$ is equal to the arc length of $C$.

We now define integrals of scalar-valued functions on surfaces in an analogous manner. Recall that the area of a smooth surface $S$, parametrized by $\mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v) \rangle$ for $(u, v) \in D$, is given by the integral
$$A(S) = \iint_D \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv.$$
To integrate a scalar-valued function $f(x, y, z)$ over $S$, we assume for simplicity that $D$ is a rectangle, and divide it into sub-rectangles $\{R_{ij}\}$ of dimension $\Delta u$ and $\Delta v$, as we did when we derived the formula for $A(S)$. Then, the function $\mathbf{r}$ maps each sub-rectangle $R_{ij}$ into a surface patch $S_{ij}$ that has area $\Delta S_{ij}$. This area is then multiplied by $f(P_{ij}^*)$, where $P_{ij}^*$ is any point on $S_{ij}$.

Letting $\Delta u, \Delta v \to 0$, we obtain the surface integral of $f$ over $S$ to be
$$\begin{aligned}
\iint_S f(x, y, z)\,dS &= \lim_{\Delta u, \Delta v \to 0} \sum_{i=1}^n \sum_{j=1}^m f(P_{ij}^*)\,\Delta S_{ij} \\
&= \iint_D f(\mathbf{r}(u, v)) \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv,
\end{aligned}$$
since, as $\Delta u, \Delta v \to 0$, we have
$$\Delta S_{ij} \approx \|\mathbf{r}_u \times \mathbf{r}_v\|\,\Delta u\,\Delta v.$$
Note that if $f(x, y, z) \equiv 1$, then the surface integral of $f$ over $S$ yields the area of $S$, $A(S)$.

Example (Stewart, Section 13.7, Exercise 6) Let $S$ be the helicoid with parameterization
$$\mathbf{r}(u, v) = \langle u \cos v, u \sin v, v \rangle, \quad 0 \le u \le 1, \quad 0 \le v \le \pi.$$
Then we have
$$\mathbf{r}_u = \langle \cos v, \sin v, 0 \rangle, \quad \mathbf{r}_v = \langle -u \sin v, u \cos v, 1 \rangle,$$
which yields
$$\|\mathbf{r}_u \times \mathbf{r}_v\| = \|\langle \sin v, -\cos v, u \rangle\| = \sqrt{\sin^2 v + \cos^2 v + u^2} = \sqrt{1 + u^2}.$$


It follows that
$$\begin{aligned}
\iint_S \sqrt{1 + x^2 + y^2}\,dS &= \int_0^1 \int_0^\pi \sqrt{1 + (u \cos v)^2 + (u \sin v)^2}\,\|\mathbf{r}_u \times \mathbf{r}_v\|\,dv\,du \\
&= \int_0^1 \int_0^\pi \sqrt{1 + u^2} \sqrt{1 + u^2}\,dv\,du \\
&= \int_0^1 \int_0^\pi (1 + u^2)\,dv\,du \\
&= \pi \int_0^1 (1 + u^2)\,du \\
&= \pi \left(u + \frac{u^3}{3}\right) \Big|_0^1 \\
&= \frac{4\pi}{3}.
\end{aligned}$$
□

The surface integral of a scalar-valued function is useful for computing the mass and center of mass of a thin sheet. If the sheet is shaped like a surface $S$, and it has density $\rho(x, y, z)$, then the mass is given by the surface integral
$$m = \iint_S \rho(x, y, z)\,dS,$$
and the center of mass is the point $(\bar{x}, \bar{y}, \bar{z})$, where
$$\bar{x} = \frac{1}{m} \iint_S x \rho(x, y, z)\,dS, \quad \bar{y} = \frac{1}{m} \iint_S y \rho(x, y, z)\,dS, \quad \bar{z} = \frac{1}{m} \iint_S z \rho(x, y, z)\,dS.$$

3.7.2 Surface Integrals of Vector Fields

Let $\mathbf{v}$ be a vector field defined on $\mathbb{R}^3$ that represents the velocity field of a fluid, and let $\rho$ be the density of the fluid. Then, the rate of flow of the fluid, which is defined to be the rate of change with respect to time of the amount of fluid (mass), per unit area, is given by $\rho \mathbf{v}$.

To determine the total amount of fluid that is crossing $S$ per unit of time, called the flux across $S$, we divide $S$ into several small patches $S_{ij}$,


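as we did when we defined the surface integral of a scalar-valued function. Since each patch $S_{ij}$ is approximately planar (that is, it nearly lies in a plane), we can approximate the flux across $S_{ij}$ by
$$(\rho \mathbf{v} \cdot \mathbf{n}) A(S_{ij}),$$
where $\mathbf{n}$ is a unit vector that is normal (perpendicular) to $S_{ij}$. This is because if $\theta$ is the angle between $\mathbf{n}$ and the direction of $\mathbf{v}$, then the fluid directed at $S_{ij}$ is effectively passing through a region of area $A(S_{ij}) |\cos \theta|$.

If we sum the flux over each patch, and let the areas of the patches approach zero, then we obtain the total flux across $S$,
$$\iint_S \rho(x, y, z)\,\mathbf{v}(x, y, z) \cdot \mathbf{n}(x, y, z)\,dS,$$
where $\mathbf{n}(x, y, z)$ is a continuous function that describes a unit normal vector at each point $(x, y, z)$ on $S$. For a general vector field $\mathbf{F}$, we define the surface integral of $\mathbf{F}$ over $S$ by
$$\iint_S \mathbf{F} \cdot d\mathbf{S} = \iint_S \mathbf{F} \cdot \mathbf{n}\,dS.$$
When $\mathbf{F}$ represents an electric field, we call the surface integral of $\mathbf{F}$ over $S$ the electric flux of $\mathbf{F}$ through $S$. Alternatively, if $\mathbf{F} = -K \nabla u$, where $u$ is a function that represents temperature and $K$ is a constant that represents thermal conductivity, then the surface integral of $\mathbf{F}$ over a surface $S$ is called the heat flow or heat flux across $S$.

If $S$ is parameterized by a function $\mathbf{r}(u, v)$, where $(u, v) \in D$, then
$$\mathbf{n} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|},$$
and we then have
$$\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= \iint_S \mathbf{F} \cdot \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|}\,dS \\
&= \iint_D \mathbf{F}(\mathbf{r}(u, v)) \cdot \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|}\,\|\mathbf{r}_u \times \mathbf{r}_v\|\,dA \\
&= \iint_D \mathbf{F}(\mathbf{r}(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,dA.
\end{aligned}$$
This is analogous to the definition of the line integral of a vector field over a curve $C$,
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \int_C \mathbf{F} \cdot \mathbf{T}\,ds = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt.$$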
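The final formula, $\iint_D \mathbf{F}(\mathbf{r}(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,dA$, translates directly into a double Riemann sum. The sketch below is one way to do this, assuming the caller supplies $\mathbf{r}_u$ and $\mathbf{r}_v$ in closed form; the function name `flux` and the test case (the field $\langle x, y, z \rangle$ across the unit sphere, whose outward flux is $4\pi$) are illustrative assumptions.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))

def flux(F, r, r_u, r_v, u_range, v_range, n=200, m=200):
    """Midpoint Riemann sum for the integral of F(r(u, v)) . (r_u x r_v) over D."""
    (a, b), (c, d) = u_range, v_range
    du, dv = (b - a) / n, (d - c) / m
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(m):
            v = c + (j + 0.5) * dv
            total += dot(F(*r(u, v)), cross(r_u(u, v), r_v(u, v))) * du * dv
    return total

# Assumed test case: F = <x, y, z> across the unit sphere,
# with u the polar angle and v the azimuthal angle.
F = lambda x, y, z: (x, y, z)
r = lambda u, v: (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))
r_u = lambda u, v: (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), -math.sin(u))
r_v = lambda u, v: (-math.sin(u) * math.sin(v), math.sin(u) * math.cos(v), 0.0)

total_flux = flux(F, r, r_u, r_v, (0.0, math.pi), (0.0, 2 * math.pi))
print(total_flux, 4 * math.pi)  # the two values should nearly agree
```

With this ordering of the parameters, $\mathbf{r}_u \times \mathbf{r}_v = \sin u\,\mathbf{r}(u, v)$ points outward, so no sign change is needed; swapping the roles of $u$ and $v$ would negate the result.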


Just as the orientation of a curve was relevant to the line integral of a vector field over a curve, the orientation of a surface is relevant to the surface integral of a vector field. We say that a surface $S$ is orientable, or oriented, if, at each point $(x, y, z)$ in $S$, it is possible to choose a unique unit vector $\mathbf{n}(x, y, z)$ that is normal to the tangent plane of $S$ at $(x, y, z)$, in such a way that $\mathbf{n}(x, y, z)$ varies continuously over $S$. The particular choice of $\mathbf{n}$ is called an orientation.

An orientable surface has two orientations, or, informally, two “sides”, with normal vectors $\mathbf{n}$ and $-\mathbf{n}$. This definition of orientability excludes the Möbius strip, because for this surface, it is possible for a continuous variation of $(x, y, z)$ to yield two distinct normal vectors at every point of the surface, that are negatives of one another. Geometrically, the Möbius strip can be said to have only one “side”, because negating any choice of continuously varying $\mathbf{n}$ yields the same normal vectors.

For a surface that is the graph of a function $z = g(x, y)$, if we choose the parametrization
$$x = u, \quad y = v, \quad z = g(u, v),$$
then from
$$\mathbf{r}_u = \langle 1, 0, g_u \rangle, \quad \mathbf{r}_v = \langle 0, 1, g_v \rangle,$$
we obtain
$$\mathbf{r}_u \times \mathbf{r}_v = \langle -g_u, -g_v, 1 \rangle = \langle -g_x, -g_y, 1 \rangle,$$
which yields
$$\mathbf{n} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|} = \frac{\langle -g_x, -g_y, 1 \rangle}{\sqrt{1 + g_x^2 + g_y^2}}.$$
Because the $z$-component of this vector is positive, we call this choice of $\mathbf{n}$ an upward orientation of the surface, while $-\mathbf{n}$ is a downward orientation.

Example (Stewart, Section 13.7, Exercise 22) Let $S$ be the part of the cone $z = \sqrt{x^2 + y^2}$ that lies beneath the plane $z = 1$, with downward orientation. We wish to evaluate the surface integral
$$\iint_S \mathbf{F} \cdot d\mathbf{S}$$
where $\mathbf{F} = \langle x, y, z^4 \rangle$.

First, we must compute the unit normal vector for $S$. Using cylindrical coordinates yields the parameterization
$$x = u \cos v, \quad y = u \sin v, \quad z = u, \quad 0 \le u \le 1, \quad 0 \le v \le 2\pi.$$


We then have
$$\mathbf{r}_u = \langle \cos v, \sin v, 1 \rangle, \quad \mathbf{r}_v = \langle -u \sin v, u \cos v, 0 \rangle,$$
which yields
$$\mathbf{r}_u \times \mathbf{r}_v = \langle -u \cos v, -u \sin v, u \cos^2 v + u \sin^2 v \rangle = u \langle -\cos v, -\sin v, 1 \rangle.$$
Because we assume downward orientation, we must have the $z$-component of the normal vector be negative. Therefore, $\mathbf{r}_u \times \mathbf{r}_v$ must be negated, which yields
$$\iint_S \mathbf{F} \cdot d\mathbf{S} = -\iint_D \mathbf{F}(x(u, v), y(u, v), z(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,dA,$$
where $D$ is the domain of the parameters $u$ and $v$, the rectangle $[0, 1] \times [0, 2\pi]$. Evaluating this integral, we obtain
$$\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= -\iint_D \langle u \cos v, u \sin v, u^4 \rangle \cdot u \langle -\cos v, -\sin v, 1 \rangle\,dA \\
&= -\int_0^{2\pi} \int_0^1 (-u \cos^2 v - u \sin^2 v + u^4)\,u\,du\,dv \\
&= \int_0^{2\pi} \int_0^1 (u^2 - u^5)\,du\,dv \\
&= 2\pi \int_0^1 (u^2 - u^5)\,du \\
&= 2\pi \left(\frac{u^3}{3} - \frac{u^6}{6}\right) \Big|_0^1 \\
&= \frac{\pi}{3}.
\end{aligned}$$
An alternative approach is to retain Cartesian coordinates, and then use the formula for the unit normal for a downward orientation of a surface that is the graph of a function $z = g(x, y)$,
$$\mathbf{n} = -\frac{\langle -g_x, -g_y, 1 \rangle}{\sqrt{g_x^2 + g_y^2 + 1}} = \frac{1}{\sqrt{2}} \left\langle \frac{x}{\sqrt{x^2 + y^2}}, \frac{y}{\sqrt{x^2 + y^2}}, -1 \right\rangle.$$
This approach still requires a conversion to polar coordinates to integrate over the unit disk in the $xy$-plane. □
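The alternative Cartesian approach can be sketched numerically. For a graph $z = g(x, y)$ with downward orientation, $\mathbf{n}\,dS = \langle g_x, g_y, -1 \rangle\,dA$, because the factor $\sqrt{1 + g_x^2 + g_y^2}$ in $dS$ cancels the normalization of $\mathbf{n}$. The code below assumes this identity and integrates over the unit disk in polar coordinates with a midpoint rule; the grid sizes `n` and `m` and the names `rho`, `theta` are arbitrary illustrative choices.

```python
import math

def g(x, y):
    """The cone z = sqrt(x^2 + y^2)."""
    return math.hypot(x, y)

def F(x, y, z):
    """The vector field from the example, F = <x, y, z^4>."""
    return (x, y, z ** 4)

def integrand(rho, theta):
    """F . <g_x, g_y, -1> at (x, y) = (rho cos theta, rho sin theta), times the polar factor rho."""
    x, y = rho * math.cos(theta), rho * math.sin(theta)
    gx, gy = x / rho, y / rho          # partial derivatives of g, valid for rho > 0
    Fx, Fy, Fz = F(x, y, g(x, y))
    return (Fx * gx + Fy * gy - Fz) * rho

# Midpoint rule on the polar rectangle 0 <= rho <= 1, 0 <= theta <= 2*pi.
n, m = 400, 64
d_rho, d_theta = 1.0 / n, 2 * math.pi / m
flux = sum(
    integrand((i + 0.5) * d_rho, (j + 0.5) * d_theta)
    for i in range(n) for j in range(m)
) * d_rho * d_theta

print(flux, math.pi / 3)  # the two values should nearly agree
```

The midpoints never land on $\rho = 0$, where the cone is not differentiable, so the division by `rho` is safe in this sketch.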


For a closed surface $S$, which is the boundary of a solid region $E$, we define the positive orientation of $S$ to be the choice of $\mathbf{n}$ that consistently points outward from $E$, while the inward-pointing normals define the negative orientation.

Example (Stewart, Section 13.7, Exercise 26) To evaluate the surface integral
$$\iint_S \mathbf{F} \cdot d\mathbf{S}$$
where $\mathbf{F}(x, y, z) = \langle y, z - y, x \rangle$ and $S$ is the surface of the tetrahedron with vertices $(0, 0, 0)$, $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$, we must evaluate surface integrals over each of the four faces of the tetrahedron separately. We assume positive (outward) orientation.

For the first side, $S_1$, with vertices $(0, 0, 0)$, $(1, 0, 0)$ and $(0, 0, 1)$, we first parameterize the side using
$$x = u, \quad y = 0, \quad z = v, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u.$$
Then, from
$$\mathbf{r}_u = \langle 1, 0, 0 \rangle, \quad \mathbf{r}_v = \langle 0, 0, 1 \rangle,$$
we obtain
$$\mathbf{r}_u \times \mathbf{r}_v = \langle 0, -1, 0 \rangle.$$
This vector is pointing outside the tetrahedron, so it is the outward normal vector that we wish to use. Therefore, the surface integral of $\mathbf{F}$ over $S_1$ is
$$\begin{aligned}
\iint_{S_1} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle 0, v - 0, u \rangle \cdot \langle 0, -1, 0 \rangle\,dv\,du \\
&= -\int_0^1 \int_0^{1-u} v\,dv\,du \\
&= -\int_0^1 \frac{v^2}{2} \Big|_0^{1-u}\,du \\
&= -\frac{1}{2} \int_0^1 (1 - u)^2\,du \\
&= \frac{1}{2} \frac{(1 - u)^3}{3} \Big|_0^1 \\
&= -\frac{1}{6}.
\end{aligned}$$


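For the second side, $S_2$, with vertices $(0, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$, we parameterize using
$$x = 0, \quad y = u, \quad z = v, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u.$$
Then, from
$$\mathbf{r}_u = \langle 0, 1, 0 \rangle, \quad \mathbf{r}_v = \langle 0, 0, 1 \rangle,$$
we obtain
$$\mathbf{r}_u \times \mathbf{r}_v = \langle 1, 0, 0 \rangle.$$
This vector is pointing inside the tetrahedron, so we must negate it to obtain the outward normal vector. Therefore, the surface integral of $\mathbf{F}$ over $S_2$ is
$$\begin{aligned}
\iint_{S_2} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle u, v - u, 0 \rangle \cdot \langle -1, 0, 0 \rangle\,dv\,du \\
&= -\int_0^1 \int_0^{1-u} u\,dv\,du \\
&= \int_0^1 u(u - 1)\,du \\
&= \left(\frac{u^3}{3} - \frac{u^2}{2}\right) \Big|_0^1 \\
&= -\frac{1}{6}.
\end{aligned}$$
For the base $S_3$, with vertices $(0, 0, 0)$, $(1, 0, 0)$ and $(0, 1, 0)$, we parametrize using
$$x = u, \quad y = v, \quad z = 0, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u.$$
Then, from
$$\mathbf{r}_u = \langle 1, 0, 0 \rangle, \quad \mathbf{r}_v = \langle 0, 1, 0 \rangle,$$
we obtain
$$\mathbf{r}_u \times \mathbf{r}_v = \langle 0, 0, 1 \rangle.$$
This vector is pointing inside the tetrahedron, so we must negate it to obtain the outward normal vector. Therefore, the surface integral of $\mathbf{F}$ over $S_3$ is
$$\begin{aligned}
\iint_{S_3} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle v, 0 - v, u \rangle \cdot \langle 0, 0, -1 \rangle\,dv\,du \\
&= -\int_0^1 \int_0^{1-u} u\,dv\,du
\end{aligned}$$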


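$$\begin{aligned}
&= \int_0^1 u(u - 1)\,du \\
&= \left(\frac{u^3}{3} - \frac{u^2}{2}\right) \Big|_0^1 \\
&= -\frac{1}{6}.
\end{aligned}$$
Finally, for the “top” face $S_4$, with vertices $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$, we parametrize using
$$x = u, \quad y = v, \quad z = 1 - u - v, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u,$$
since the equation of the plane containing this face is $x + y + z - 1 = 0$. This can be determined by using the three vertices to obtain two vectors within the plane, and then computing their cross product to obtain the plane's normal vector.

Then, from
$$\mathbf{r}_u = \langle 1, 0, -1 \rangle, \quad \mathbf{r}_v = \langle 0, 1, -1 \rangle,$$
we obtain
$$\mathbf{r}_u \times \mathbf{r}_v = \langle 1, 1, 1 \rangle.$$
This vector is pointing outside the tetrahedron, so it is the outward normal vector that we wish to use. Therefore, the surface integral of $\mathbf{F}$ over $S_4$ is
$$\begin{aligned}
\iint_{S_4} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle v, 1 - u - 2v, u \rangle \cdot \langle 1, 1, 1 \rangle\,dv\,du \\
&= \int_0^1 \int_0^{1-u} (1 - v)\,dv\,du \\
&= \int_0^1 \left(v - \frac{v^2}{2}\right) \Big|_0^{1-u}\,du \\
&= \int_0^1 \left[1 - u - \frac{(1 - u)^2}{2}\right] du \\
&= \int_0^1 \left(\frac{1}{2} - \frac{1}{2} u^2\right) du \\
&= \left(\frac{u}{2} - \frac{u^3}{6}\right) \Big|_0^1 \\
&= \frac{1}{3}.
\end{aligned}$$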


Adding the four integrals together yields
$$\iint_S \mathbf{F} \cdot d\mathbf{S} = -\frac{1}{6} - \frac{1}{6} - \frac{1}{6} + \frac{1}{3} = -\frac{1}{6}.$$
□

3.8 Stokes' Theorem

Let $C$ be a simple, closed, positively oriented, piecewise smooth plane curve, and let $D$ be the region that it encloses. According to one of the forms of Green's Theorem, for a vector field $\mathbf{F}$ with continuous first partial derivatives on $D$, we have
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \iint_D (\operatorname{curl} \mathbf{F}) \cdot \mathbf{k}\,dA,$$
where $\mathbf{k} = \langle 0, 0, 1 \rangle$.

By noting that $\mathbf{k}$ is normal to the region $D$ when it is embedded in 3-D space, we can generalize this form of Green's Theorem to more general surfaces that are enclosed by simple, closed, piecewise smooth, positively oriented space curves. Let $S$ be an oriented, piecewise smooth surface that is enclosed by such a curve $C$. If we divide $S$ into several small patches $S_{ij}$, then these patches are approximately planar. We can apply Green's Theorem, approximately, to each patch by rotating it in space so that its unit normal vector is $\mathbf{k}$, and using the fact that rotating two vectors $\mathbf{u}$ and $\mathbf{v}$ in space does not change the value of $\mathbf{u} \cdot \mathbf{v}$.

Most of the line integrals along the boundary curves of each patch cancel with one another due to the positive orientation of all such boundary curves, and we are left with the line integral over $C$, the boundary of $S$. If we take the limit as the size of the patches approaches zero, we then obtain
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} = \iint_S \operatorname{curl} \mathbf{F} \cdot \mathbf{n}\,dS,$$
where $\mathbf{n}$ is the unit normal vector of $S$. This result is known as Stokes' Theorem.

Stokes' Theorem can be used either to evaluate a surface integral or an integral over the curve that encloses it, whichever is easier.

Example (Stewart, Section 13.8, Exercise 2) Let $\mathbf{F}(x, y, z) = \langle yz, xz, xy \rangle$ and let $S$ be the part of the paraboloid $z = 9 - x^2 - y^2$ that lies above the plane $z = 5$, with upward orientation. By Stokes' Theorem,
$$\iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} = \int_C \mathbf{F} \cdot d\mathbf{r}$$


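where $C$ is the boundary curve of $S$, which is a circle of radius 2 centered at $(0, 0, 5)$, and parallel to the $xy$-plane. It can therefore be parameterized by
$$x = 2 \cos t, \quad y = 2 \sin t, \quad z = 5, \quad 0 \le t \le 2\pi.$$
Its tangent vector is then
$$\mathbf{r}'(t) = \langle -2 \sin t, 2 \cos t, 0 \rangle.$$
We then have
$$\begin{aligned}
\iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} &= \int_0^{2\pi} \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt \\
&= \int_0^{2\pi} \langle 10 \sin t, 10 \cos t, 4 \cos t \sin t \rangle \cdot \langle -2 \sin t, 2 \cos t, 0 \rangle\,dt \\
&= \int_0^{2\pi} (-20 \sin^2 t + 20 \cos^2 t)\,dt \\
&= 20 \int_0^{2\pi} \cos 2t\,dt \\
&= 10 \sin 2t \Big|_0^{2\pi} \\
&= 0.
\end{aligned}$$
This result can also be obtained by noting that because $\mathbf{F} = \nabla f$, where $f(x, y, z) = xyz$, it follows that $\operatorname{curl} \mathbf{F} = \mathbf{0}$. □

Example (Stewart, Section 13.8, Exercise 8) We wish to evaluate the line integral of $\mathbf{F}(x, y, z) = \langle xy, 2z, 3y \rangle$ over the curve $C$ that is the intersection of the cylinder $x^2 + y^2 = 9$ with the plane $x + z = 5$.

To describe the surface $S$ enclosed by $C$, we use the parameterization
$$x = u \cos v, \quad y = u \sin v, \quad z = 5 - u \cos v, \quad 0 \le u \le 3, \quad 0 \le v \le 2\pi.$$
Using
$$\mathbf{r}_u = \langle \cos v, \sin v, -\cos v \rangle,$$
$$\mathbf{r}_v = \langle -u \sin v, u \cos v, u \sin v \rangle,$$
we obtain
$$\mathbf{r}_u \times \mathbf{r}_v = \langle u, 0, u \rangle.$$
We then compute
$$\operatorname{curl} \mathbf{F} = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \times \langle xy, 2z, 3y \rangle = \langle 1, 0, -x \rangle.$$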


Let $D$ be the domain of the parameters,
$$D = \{(u, v) \mid 0 \le u \le 3,\ 0 \le v \le 2\pi\}.$$
We then apply Stokes' Theorem and obtain
$$\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} \\
&= \iint_D \operatorname{curl} \mathbf{F}(\mathbf{r}(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\,dA \\
&= \int_0^3 \int_0^{2\pi} \langle 1, 0, -u \cos v \rangle \cdot \langle u, 0, u \rangle\,dv\,du \\
&= \int_0^3 \int_0^{2\pi} (u - u^2 \cos v)\,dv\,du \\
&= \int_0^3 (uv - u^2 \sin v) \Big|_0^{2\pi}\,du \\
&= 2\pi \int_0^3 u\,du \\
&= 2\pi \frac{u^2}{2} \Big|_0^3 \\
&= 9\pi.
\end{aligned}$$
□

Stokes' Theorem can also be used to provide insight into the physical interpretation of the curl of a vector field. Let $S_a$ be a disk of radius $a$ centered at a point $P_0$, and let $C_a$ be its boundary. Furthermore, let $\mathbf{v}$ be a velocity field for a fluid. Then the line integral
$$\int_{C_a} \mathbf{v} \cdot d\mathbf{r} = \int_{C_a} \mathbf{v} \cdot \mathbf{T}\,ds,$$
where $\mathbf{T}$ is the unit tangent vector of $C_a$, measures the tendency of the fluid to move around $C_a$. This is because this measure, called the circulation of $\mathbf{v}$ around $C_a$, is greatest when the fluid velocity vector is consistently parallel to the unit tangent vector. That is, the circulation around $C_a$ is maximized when the fluid follows the path of $C_a$.

Now, by Stokes' Theorem,
$$\int_{C_a} \mathbf{v} \cdot d\mathbf{r} = \iint_{S_a} \operatorname{curl} \mathbf{v} \cdot d\mathbf{S}$$


$$\begin{aligned}
&= \iint_{S_a} \operatorname{curl} \mathbf{v} \cdot \mathbf{n}\,dS \\
&\approx \operatorname{curl} \mathbf{v}(P_0) \cdot \mathbf{n}(P_0) \iint_{S_a} 1\,dS \\
&= \pi a^2 \operatorname{curl} \mathbf{v}(P_0) \cdot \mathbf{n}(P_0).
\end{aligned}$$
As $a \to 0$, and $S_a$ collapses to the point $P_0$, this approximation improves, and we obtain
$$\operatorname{curl} \mathbf{v}(P_0) \cdot \mathbf{n}(P_0) = \lim_{a \to 0} \frac{1}{\pi a^2} \int_{C_a} \mathbf{v} \cdot d\mathbf{r}.$$
This shows that circulation is maximized when the axis around which the fluid is circulating, $\mathbf{n}(P_0)$, is parallel to $\operatorname{curl} \mathbf{v}$. That is, the direction of $\operatorname{curl} \mathbf{v}$ indicates the axis around which the greatest circulation occurs.

3.8.1 A Note About Orientation

Recall Stokes' Theorem,
$$\int_C \mathbf{F} \cdot d\mathbf{r} = \iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S},$$
where $C$ is a simple, closed, positively oriented, piecewise smooth curve and $S$ is an oriented surface enclosed by $C$. If $C$ is parameterized by a function $\mathbf{r}(t)$, where $a \le t \le b$, and $S$ is parameterized by a function $\mathbf{g}(u, v)$, where $(u, v) \in D$, then Stokes' Theorem becomes
$$\int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = \iint_D \operatorname{curl} \mathbf{F}(\mathbf{g}(u, v)) \cdot (\mathbf{g}_u \times \mathbf{g}_v)\,du\,dv.$$
It is important that the parameterizations $\mathbf{r}$ and $\mathbf{g}$ have the proper orientation for Stokes' Theorem to apply. This is why it is required that $C$ have positive orientation. It means, informally, that if one were to “walk” along $C$ with one's head pointing in the direction of $\mathbf{n}$, the unit normal vector of $S$, then $S$ should always be “on the left” relative to the path traced along $C$.

It follows that the parameterizations of $C$ and $S$ must be consistent with one another, to ensure that they are oriented properly. Otherwise, one of the parameterizations must be reversed, so that the sign of the corresponding integral is corrected. The orientation of a curve can be reversed by changing the parameter to $s = a + b - t$. The orientation of a surface can be reversed by interchanging the variables $u$ and $v$.
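The effect of orientation on the sign of a line integral can be seen numerically. The sketch below revisits the earlier example with $\mathbf{F} = \langle xy, 2z, 3y \rangle$ and the curve where $x^2 + y^2 = 9$ meets $x + z = 5$, where Stokes' Theorem gave $9\pi$. It assumes the counterclockwise parameterization $\mathbf{r}(t) = \langle 3 \cos t, 3 \sin t, 5 - 3 \cos t \rangle$, which is consistent with the upward normal $\langle u, 0, u \rangle$, and approximates $\mathbf{r}'(t)$ by central differences; the helper name `line_integral` is hypothetical.

```python
import math

def F(x, y, z):
    """The vector field from the example, F = <xy, 2z, 3y>."""
    return (x * y, 2 * z, 3 * y)

def r(t):
    """Curve C: x^2 + y^2 = 9 meets x + z = 5, counterclockwise when seen from above."""
    return (3 * math.cos(t), 3 * math.sin(t), 5 - 3 * math.cos(t))

def line_integral(F, r, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        # r'(t) approximated by a central difference
        dr = [(p - q) / (2 * h) for p, q in zip(r(t + h), r(t - h))]
        total += sum(f * d for f, d in zip(F(*r(t)), dr)) * dt
    return total

a, b = 0.0, 2 * math.pi
forward = line_integral(F, r, a, b)
# Reverse the orientation of C with the substitution s = a + b - t.
backward = line_integral(F, lambda s: r(a + b - s), a, b)

print(forward, backward)  # approximately 9*pi and -9*pi
```

The forward value matches the surface integral computed with $\mathbf{r}_u \times \mathbf{r}_v = \langle u, 0, u \rangle$, while the reversed curve would pair correctly only with the downward-oriented surface.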


3.9 The Divergence Theorem

Let $\mathbf{F}$ be a vector field with continuous first partial derivatives. Recall a statement of Green’s Theorem,

\[ \int_C \mathbf{F} \cdot \mathbf{n}\, ds = \iint_D \operatorname{div} \mathbf{F}\, dA, \]

where $\mathbf{n}$ is the outward unit normal vector of $D$. Now, let $E$ be a three-dimensional solid whose boundary, denoted by $\partial E$, is a closed surface $S$ with positive orientation. If we consider two-dimensional slices of $E$, each one being parallel to the $xy$-plane, then each slice is a region $D$ with positively oriented boundary $C$, to which Green’s Theorem applies. If we multiply the integrals on both sides of Green’s Theorem, as applied to each slice, by $dz$, the infinitesimal “thickness” of each slice, then we obtain

\[ \iint_S \mathbf{F} \cdot \mathbf{n}\, dS = \iiint_E \operatorname{div} \mathbf{F}\, dV, \]

or, equivalently,

\[ \iint_S \mathbf{F} \cdot d\mathbf{S} = \iiint_E \nabla \cdot \mathbf{F}\, dV. \]

This result is known as the Gauss Divergence Theorem, or simply the Divergence Theorem.

As the Divergence Theorem relates the surface integral of a vector field, known as the flux of the vector field through the surface, to an integral of its divergence over a solid, it is quite useful for converting potentially difficult surface integrals into triple integrals that may be much easier to evaluate, as the following example demonstrates.

Example (Stewart, Section 13.9, Exercise 6) Let $S$ be the surface of the box with vertices $(\pm 1, \pm 2, \pm 3)$, and let $\mathbf{F}(x, y, z) = \langle x^2 z^3, 2xyz^3, xz^4 \rangle$. To compute the surface integral of $\mathbf{F}$ over $S$ directly is quite tedious, because $S$ has six faces that must be handled separately. Instead, we apply the Divergence Theorem to integrate $\operatorname{div} \mathbf{F}$ over $E$, the interior of the box. We then have

\[
\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= \iiint_E \operatorname{div} \mathbf{F}\, dV \\
&= \int_{-1}^{1} \int_{-2}^{2} \int_{-3}^{3} \left[ (x^2 z^3)_x + (2xyz^3)_y + (xz^4)_z \right] dz\, dy\, dx \\
&= \int_{-1}^{1} \int_{-2}^{2} \int_{-3}^{3} \left( 2xz^3 + 2xz^3 + 4xz^3 \right) dz\, dy\, dx
\end{aligned}
\]


\[
\begin{aligned}
&= \int_{-1}^{1} \int_{-2}^{2} \int_{-3}^{3} 8xz^3\, dz\, dy\, dx \\
&= 32 \int_{-1}^{1} \int_{-3}^{3} x z^3\, dz\, dx \\
&= 32 \int_{-1}^{1} x \left[ \frac{z^4}{4} \right]_{-3}^{3} dx \\
&= 0.
\end{aligned}
\]
✷

The Divergence Theorem can also be used to convert a difficult surface integral into an easier one.

Example (Stewart, Section 13.9, Exercise 17) Let $\mathbf{F}(x, y, z) = \langle z^2 x, \tfrac{1}{3} y^3 + \tan z, x^2 z + y^2 \rangle$. Let $S$ be the top half of the sphere $x^2 + y^2 + z^2 = 1$. To evaluate the surface integral of $\mathbf{F}$ over $S$, we combine $S$ with $S_1$, the disk $x^2 + y^2 \le 1$ in the plane $z = 0$, with downward orientation. We then obtain a new surface $S_2$ that is the boundary of the top half of the ball $x^2 + y^2 + z^2 \le 1$, which we denote by $E$. By the Divergence Theorem,

\[ \iint_S \mathbf{F} \cdot d\mathbf{S} + \iint_{S_1} \mathbf{F} \cdot d\mathbf{S} = \iint_{S_2} \mathbf{F} \cdot d\mathbf{S} = \iiint_E \operatorname{div} \mathbf{F}\, dV. \]

We parameterize $S_1$ by

\[ x = u \sin v, \quad y = u \cos v, \quad z = 0, \quad 0 \le u \le 1, \quad 0 \le v \le 2\pi. \]

This parameterization is used instead of the usual one arising from polar coordinates, due to the downward orientation. It follows from

\[ \mathbf{r}_u = \langle \sin v, \cos v, 0 \rangle, \quad \mathbf{r}_v = \langle u \cos v, -u \sin v, 0 \rangle \]

that

\[ \mathbf{r}_u \times \mathbf{r}_v = \langle 0, 0, -u \sin^2 v - u \cos^2 v \rangle = u \langle 0, 0, -1 \rangle, \]

which points downward, as desired. From

\[ \operatorname{div} \mathbf{F}(x, y, z) = (z^2 x)_x + \left( \tfrac{1}{3} y^3 + \tan z \right)_y + (x^2 z + y^2)_z = x^2 + y^2 + z^2, \]

which suggests the use of spherical coordinates for the integral over $E$, we obtain

\[ \iint_S \mathbf{F} \cdot d\mathbf{S} = \iiint_E \operatorname{div} \mathbf{F}\, dV - \iint_{S_1} \mathbf{F} \cdot d\mathbf{S} \]


\[
\begin{aligned}
&= \iiint_E (x^2 + y^2 + z^2)\, dV - \int_0^1 \int_0^{2\pi} \mathbf{F}(x(u, v), y(u, v), z(u, v)) \cdot u \langle 0, 0, -1 \rangle\, dv\, du \\
&= \int_0^1 \int_0^{2\pi} \int_0^{\pi/2} \rho^2\, \rho^2 \sin\phi\, d\phi\, d\theta\, d\rho + \int_0^1 \int_0^{2\pi} u (u^2 \cos^2 v)\, dv\, du \\
&= 2\pi \int_0^1 \int_0^{\pi/2} \rho^4 \sin\phi\, d\phi\, d\rho + \int_0^1 \int_0^{2\pi} u^3 \cos^2 v\, dv\, du \\
&= 2\pi \int_0^1 \rho^4 \left[ -\cos\phi \right]_0^{\pi/2} d\rho + \int_0^1 u^3 \int_0^{2\pi} \frac{1 + \cos 2v}{2}\, dv\, du \\
&= 2\pi \int_0^1 \rho^4\, d\rho + \int_0^1 u^3 \left[ \frac{v}{2} + \frac{\sin 2v}{4} \right]_0^{2\pi} du \\
&= 2\pi\, \frac{\rho^5}{5} \Big|_0^1 + \pi \int_0^1 u^3\, du \\
&= \frac{2\pi}{5} + \pi\, \frac{u^4}{4} \Big|_0^1 \\
&= \frac{2\pi}{5} + \frac{\pi}{4} \\
&= \frac{13\pi}{20}.
\end{aligned}
\]
✷

Suppose that $\mathbf{F}$ is a vector field that, at any point, represents the flow rate of heat energy, which is the rate of change, with respect to time, of the amount of heat energy flowing through that point. By Fourier’s Law, $\mathbf{F} = -K \nabla T$, where $K$ is a constant called thermal conductivity, and $T$ is a function that indicates temperature.

Now, let $E$ be a three-dimensional solid enclosed by a closed, positively oriented surface $S$ with outward unit normal vector $\mathbf{n}$. Then, by the law of conservation of energy, the rate of change, with respect to time, of the amount of heat energy inside $E$ is equal to the flow rate, or flux, of heat into $E$ through $S$. That is, if $\rho(x, y, z)$ is the density of heat energy, then

\[ \frac{\partial}{\partial t} \iiint_E \rho\, dV = \iint_S \mathbf{F} \cdot (-\mathbf{n})\, dS, \]

where we use $-\mathbf{n}$ because $\mathbf{n}$ is the outward unit normal vector, but we need to express the flux into $E$ through $S$.


From the definition of $\mathbf{F}$, and the fact that $\rho = c \rho_0 T$, where $c$ is the specific heat and $\rho_0$ is the mass density, which, for simplicity, we assume to be constant, we have

\[ \frac{\partial}{\partial t} \iiint_E c \rho_0 T\, dV = \iint_S K \nabla T \cdot \mathbf{n}\, dS. \]

Next, we note that because $c$, $\rho_0$, and $E$ do not depend on time, we can write

\[ \iiint_E c \rho_0 \frac{\partial T}{\partial t}\, dV = \iint_S K \nabla T \cdot d\mathbf{S}. \]

Now, we apply the Divergence Theorem, and obtain

\[ \iiint_E c \rho_0 \frac{\partial T}{\partial t}\, dV = \iiint_E K \operatorname{div} \nabla T\, dV = \iiint_E K \nabla^2 T\, dV. \]

That is,

\[ \iiint_E \left( c \rho_0 \frac{\partial T}{\partial t} - K \nabla^2 T \right) dV = 0. \]

Since the solid $E$ is arbitrary, it follows that

\[ \frac{\partial T}{\partial t} = \frac{K}{c \rho_0} \nabla^2 T. \]

This is known as the heat equation, which is one of the most important partial differential equations in all of applied mathematics.

3.10 Differential Forms

To date, we have learned the following theorems concerning the evaluation of integrals of derivatives:

• The Fundamental Theorem of Calculus:
\[ \int_a^b f'(x)\, dx = f(b) - f(a) \]

• The Fundamental Theorem of Line Integrals:
\[ \int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) \]


• Green’s Theorem:
\[ \iint_D (Q_x - P_y)\, dA = \int_C P\, dx + Q\, dy \]

• Stokes’ Theorem:
\[ \iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} = \int_C \mathbf{F} \cdot d\mathbf{r} \]

• Gauss’ Divergence Theorem:
\[ \iiint_E \operatorname{div} \mathbf{F}\, dV = \iint_S \mathbf{F} \cdot d\mathbf{S} \]

All of these theorems relate the integral of the derivative or gradient of a function, or partial derivatives of components of a vector field, over a higher-dimensional region to the integral or sum of the function or vector field over a lower-dimensional region. Now, we will see how the notation of differential forms can be used to combine all of these theorems into one. It is this notation, as opposed to vectors and operations such as the divergence and curl, that allows the Fundamental Theorem of Calculus to be generalized to functions of several variables.

A differential form is an expression consisting of a scalar-valued function $f : K \subseteq \mathbb{R}^n \to \mathbb{R}$ and zero or more infinitesimals of the form $dx_1, dx_2, \ldots, dx_n$, where $x_1, x_2, \ldots, x_n$ are the independent variables of $f$. The order of a differential form is defined to be the number of infinitesimals that it includes.

For simplicity, we set $n = 3$, so that we work with functions of three variables. With that in mind, a 0-form, or a differential form of order zero, is simply a scalar-valued function $f(x, y, z)$. A 1-form is a function $f(x, y, z)$ together with one of the expressions $dx$, $dy$ or $dz$. A 2-form is a function $f(x, y, z)$ together with a pair of distinct infinitesimals, which can be either $dx\, dy$, $dy\, dz$ or $dz\, dx$. Finally, a 3-form is an expression of the form $f(x, y, z)\, dx\, dy\, dz$.

Example The function $f(x, y, z) = x^2 y + y^3 z$ is a 0-form on $\mathbb{R}^3$, while $f\, dx = (x^2 y + y^3 z)\, dx$ and $f\, dy = (x^2 y + y^3 z)\, dy$ are both examples of a 1-form on $\mathbb{R}^3$. ✷

Example Let $f(x, y, z) = 1/(x^2 + y^2 + z^2)$. Then $f\, dx\, dy$ is a 2-form on $\mathbb{R}^3 \setminus \{(0, 0, 0)\}$, while $f\, dx\, dy\, dz$ is a 3-form on the same domain. ✷

Forms of the same order can be added and scaled by functions, as the following examples show.


Example Let $f(x, y, z) = e^{x-y} \sin z$ and let $g(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$. Then $f$, $g$ and $f + g$ are all 0-forms on $\mathbb{R}^3$, and

\[ f + g = e^{x-y} \sin z + (x^2 + y^2 + z^2)^{3/2}. \]

That is, addition of 0-forms is identical to addition of functions.

If we define $\omega_1 = f\, dx$ and $\omega_2 = g\, dy$, then $\omega_1$ and $\omega_2$ are both 1-forms on $\mathbb{R}^3$, and so is $\omega = \omega_1 + \omega_2$, where

\[ \omega = f\, dx + g\, dy = e^{x-y} \sin z\, dx + (x^2 + y^2 + z^2)^{3/2}\, dy. \]

Furthermore, if $h(x, y, z) = xy^2 z^3$, and

\[ \eta_1 = f\, dx\, dy, \quad \eta_2 = g\, dz\, dx \]

are 2-forms on $\mathbb{R}^3$, then

\[ \eta = h \eta_1 + \eta_2 = xy^2 z^3 e^{x-y} \sin z\, dx\, dy + (x^2 + y^2 + z^2)^{3/2}\, dz\, dx \]

is also a 2-form on $\mathbb{R}^3$. ✷

Example Let $f(x, y, z) = \cos x$, $g(x, y, z) = e^y$ and $h(x, y, z) = xyz^2$. Then $\nu_1 = f\, dx\, dy\, dz$ and $\nu_2 = g\, dx\, dy\, dz$ are 3-forms on $\mathbb{R}^3$, and so is

\[ \nu = \nu_1 + h \nu_2 = (\cos x + xyz^2 e^y)\, dx\, dy\, dz. \]
✷

It should be noted that, like addition of functions, addition of differential forms is commutative, associative, and distributive. Also, there is never any need to add forms of different order, such as adding a 0-form to a 1-form.

We now define two essential operations on differential forms. The first is called the wedge product, a multiplication operation for differential forms. Given a $k$-form $\omega$ and an $l$-form $\eta$, where $0 \le k + l \le 3$, the wedge product of $\omega$ and $\eta$, denoted by $\omega \wedge \eta$, is a $(k + l)$-form. It satisfies the following laws:

1. For each $k$ there is a $k$-form $0$ such that $\eta \wedge 0 = 0 \wedge \eta = 0$ for any $l$-form $\eta$.

2. Distributivity: If $f$ is a 0-form, then
\[ (f \omega_1 + \omega_2) \wedge \eta = f(\omega_1 \wedge \eta) + (\omega_2 \wedge \eta). \]


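3. Anticommutativity:
\[ \omega \wedge \eta = (-1)^{kl} (\eta \wedge \omega). \]

4. Associativity:
\[ \omega_1 \wedge (\omega_2 \wedge \omega_3) = (\omega_1 \wedge \omega_2) \wedge \omega_3 \]

5. Homogeneity: If $f$ is a 0-form, then
\[ \omega \wedge (f \eta) = (f \omega) \wedge \eta = f(\omega \wedge \eta). \]

6. If $dx_i$ is a basic 1-form, then $dx_i \wedge dx_i = 0$.

7. If $f$ is a 0-form, then $f \wedge \omega = f \omega$.

Example Let $\omega = f\, dx$ and $\eta = g\, dy$ be 1-forms. Then

\[ \omega \wedge \eta = (f\, dx \wedge g\, dy) = fg (dx \wedge dy) = fg\, dx\, dy, \]

by homogeneity, while

\[ \eta \wedge \omega = (-1)^{1(1)} (\omega \wedge \eta) = -fg\, dx\, dy. \]

On the other hand, if $\nu = h\, dy\, dz$ is a 2-form, then

\[ \nu \wedge \omega = fh (dy\, dz \wedge dx) = fh\, dy\, dz\, dx = -fh\, dy\, dx\, dz = fh\, dx\, dy\, dz \]

by homogeneity and anticommutativity, while

\[ \nu \wedge \eta = gh (dy\, dz \wedge dy) = gh\, dy\, dz\, dy = -gh\, dy\, dy\, dz = 0. \]
✷

Note that if any 3-form on $\mathbb{R}^3$ is multiplied by a $k$-form, where $k > 0$, then the result is zero, because the wedge product would contain more than three basic 1-forms, which cannot all be distinct.

Example Let $\omega = x\, dx - y\, dy$, and $\eta = z\, dy\, dz - x\, dz\, dx$. Then

\[
\begin{aligned}
\omega \wedge \eta &= (x\, dx - y\, dy) \wedge (z\, dy\, dz - x\, dz\, dx) \\
&= (x\, dx \wedge z\, dy\, dz) - (y\, dy \wedge z\, dy\, dz) - (x\, dx \wedge x\, dz\, dx) + (y\, dy \wedge x\, dz\, dx) \\
&= xz\, dx\, dy\, dz - yz\, dy\, dy\, dz - x^2\, dx\, dz\, dx + xy\, dy\, dz\, dx \\
&= xz\, dx\, dy\, dz - yz\, dy\, dy\, dz + x^2\, dx\, dx\, dz + xy\, dy\, dz\, dx \\
&= xz\, dx\, dy\, dz - 0 + 0 - xy\, dy\, dx\, dz \\
&= (xz + xy)\, dx\, dy\, dz.
\end{aligned}
\]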


✷

The second operation is differentiation. Given a $k$-form $\omega$, where $k < 3$, the derivative of $\omega$, denoted by $d\omega$, is a $(k + 1)$-form. It satisfies the following laws:

1. If $f$ is a 0-form, then
\[ df = f_x\, dx + f_y\, dy + f_z\, dz \]

2. Linearity: If $\omega_1$ and $\omega_2$ are $k$-forms, then
\[ d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2 \]

3. Product Rule: If $\omega$ is a $k$-form and $\eta$ is an $l$-form, then
\[ d(\omega \wedge \eta) = (d\omega \wedge \eta) + (-1)^k (\omega \wedge d\eta) \]

4. The second derivative of a form is zero; that is, for any $k$-form $\omega$,
\[ d(d\omega) = 0. \]

We now illustrate the use of these differentiation rules.

Example Let $\omega = x^2 y^3 z^4\, dx\, dy$ be a 2-form. Then, by Linearity and the Product Rule,

\[
\begin{aligned}
d\omega &= [d(x^2 y^3 z^4) \wedge dx\, dy] + (-1)^0 [x^2 y^3 z^4 \wedge d(dx\, dy)] \\
&= \left[ \left( (x^2 y^3 z^4)_x\, dx + (x^2 y^3 z^4)_y\, dy + (x^2 y^3 z^4)_z\, dz \right) \wedge dx\, dy \right] \\
&\qquad + \left[ x^2 y^3 z^4 \wedge \{ (d(dx) \wedge dy) + (-1)^1 (dx \wedge d(dy)) \} \right] \\
&= \left[ \left( 2xy^3 z^4\, dx + 3x^2 y^2 z^4\, dy + 4x^2 y^3 z^3\, dz \right) \wedge dx\, dy \right] + \left[ x^2 y^3 z^4 \wedge \{ (0 \wedge dy) - (dx \wedge 0) \} \right] \\
&= 2xy^3 z^4\, dx\, dx\, dy + 3x^2 y^2 z^4\, dy\, dx\, dy + 4x^2 y^3 z^3\, dz\, dx\, dy + 0 \\
&= -4x^2 y^3 z^3\, dx\, dz\, dy \\
&= 4x^2 y^3 z^3\, dx\, dy\, dz.
\end{aligned}
\]

In general, differentiating a $k$-form $\omega$, when $k > 0$, only requires differentiating the coefficient function with respect to the variables that are not among any basic 1-forms that are included in $\omega$. In this example, since $\omega = f\, dx\, dy$, we obtain $d\omega = f_z\, dz\, dx\, dy = f_z\, dx\, dy\, dz$. ✷

We now consider the kind of differential forms that appear in the theorems of vector calculus.


• Let $\omega = f(x, y, z)$ be a 0-form. Then, by the first law of differentiation,

\[ d\omega = \nabla f \cdot \langle dx, dy, dz \rangle. \]

If $C$ is a smooth curve with parameterization $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, $a \le t \le b$, then

\[
\begin{aligned}
\int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt &= \int_a^b \nabla f(\mathbf{r}(t)) \cdot \langle x'(t), y'(t), z'(t) \rangle\, dt \\
&= \int_a^b d\omega(\mathbf{r}(t)) \\
&= \int_C d\omega.
\end{aligned}
\]

It follows from the Fundamental Theorem of Line Integrals that

\[ \int_C d\omega = \omega(\mathbf{r}(b)) - \omega(\mathbf{r}(a)). \]

The boundary of $C$, $\partial C$, consists of its initial point $A$ and terminal point $B$. If we define the “integral” of a 0-form $\omega$ over this 0-dimensional region by

\[ \int_{\partial C} \omega = \omega(B) - \omega(A), \]

which makes sense considering that, intuitively, the numbers $1$ and $-1$ serve as an appropriate “outward unit normal vector” at the terminal and initial points, respectively, then we have

\[ \int_C d\omega = \int_{\partial C} \omega. \]

• Let $\omega = P(x, y)\, dx + Q(x, y)\, dy$ be a 1-form. Then

\[
\begin{aligned}
d\omega &= d[P(x, y)\, dx] + d[Q(x, y)\, dy] \\
&= dP(x, y) \wedge dx + P(x, y) \wedge d(dx) + dQ(x, y) \wedge dy + Q(x, y) \wedge d(dy) \\
&= (P_x\, dx + P_y\, dy) \wedge dx + 0 + (Q_x\, dx + Q_y\, dy) \wedge dy + 0 \\
&= P_x\, dx\, dx + P_y\, dy\, dx + Q_x\, dx\, dy + Q_y\, dy\, dy \\
&= (Q_x - P_y)\, dx\, dy.
\end{aligned}
\]

It follows from Green’s Theorem that

\[ \int_C \omega = \iint_D d\omega. \]


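• If we proceed similarly with a 1-form

\[ \omega = \mathbf{F} \cdot \langle dx, dy, dz \rangle = P(x, y, z)\, dx + Q(x, y, z)\, dy + R(x, y, z)\, dz, \]

then we obtain

\[
\begin{aligned}
d\omega &= \operatorname{curl} \mathbf{F} \cdot \langle dy\, dz, dz\, dx, dx\, dy \rangle \\
&= (R_y - Q_z)\, dy\, dz + (P_z - R_x)\, dz\, dx + (Q_x - P_y)\, dx\, dy.
\end{aligned}
\]

Let $S$ be a smooth surface parameterized by

\[ \mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v) \rangle, \quad (u, v) \in D. \]

Then the (unnormalized) normal vector $\mathbf{r}_u \times \mathbf{r}_v$ is given by

\[
\begin{aligned}
\mathbf{r}_u \times \mathbf{r}_v &= \langle x_u, y_u, z_u \rangle \times \langle x_v, y_v, z_v \rangle \\
&= \langle y_u z_v - z_u y_v,\ z_u x_v - x_u z_v,\ x_u y_v - y_u x_v \rangle \\
&= \left\langle \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right\rangle.
\end{aligned}
\]

We then have

\[
\begin{aligned}
\iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} &= \iint_S \operatorname{curl} \mathbf{F} \cdot \mathbf{n}\, dS \\
&= \iint_D \operatorname{curl} \mathbf{F}(\mathbf{r}(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\, du\, dv \\
&= \iint_D \Big\{ [R_y(\mathbf{r}(u, v)) - Q_z(\mathbf{r}(u, v))] \frac{\partial(y, z)}{\partial(u, v)} + [P_z(\mathbf{r}(u, v)) - R_x(\mathbf{r}(u, v))] \frac{\partial(z, x)}{\partial(u, v)} \\
&\qquad\quad + [Q_x(\mathbf{r}(u, v)) - P_y(\mathbf{r}(u, v))] \frac{\partial(x, y)}{\partial(u, v)} \Big\}\, du\, dv \\
&= \iint_S (R_y - Q_z)\, dy\, dz + (P_z - R_x)\, dz\, dx + (Q_x - P_y)\, dx\, dy \\
&= \iint_S d\omega.
\end{aligned}
\]

If $C$ is the boundary curve of $S$, and $C$ is parameterized by $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, $a \le t \le b$, then

\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\, dt \]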


\[
\begin{aligned}
&= \int_a^b \langle P(\mathbf{r}(t)), Q(\mathbf{r}(t)), R(\mathbf{r}(t)) \rangle \cdot \langle x'(t), y'(t), z'(t) \rangle\, dt \\
&= \int_a^b \omega(\mathbf{r}(t))\, dt \\
&= \int_C \omega.
\end{aligned}
\]

It follows from Stokes’ Theorem that

\[ \int_C \omega = \iint_S d\omega. \]

• Let $\mathbf{F} = \langle P, Q, R \rangle$. Let $\omega$ be the 2-form

\[ \omega = P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy. \]

Then

\[
\begin{aligned}
d\omega &= dP\, dy\, dz + dQ\, dz\, dx + dR\, dx\, dy \\
&= [P_x\, dx + P_y\, dy + P_z\, dz]\, dy\, dz + [Q_x\, dx + Q_y\, dy + Q_z\, dz]\, dz\, dx + [R_x\, dx + R_y\, dy + R_z\, dz]\, dx\, dy \\
&= P_x\, dx\, dy\, dz + Q_y\, dy\, dz\, dx + R_z\, dz\, dx\, dy \\
&= P_x\, dx\, dy\, dz - Q_y\, dy\, dx\, dz - R_z\, dx\, dz\, dy \\
&= P_x\, dx\, dy\, dz + Q_y\, dx\, dy\, dz + R_z\, dx\, dy\, dz \\
&= \operatorname{div} \mathbf{F}\, dx\, dy\, dz.
\end{aligned}
\]

Let $E$ be a solid enclosed by a smooth surface $S$ with positive orientation, and let $S$ be parameterized by

\[ \mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v) \rangle, \quad (u, v) \in D. \]

We then have

\[
\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= \iint_S \mathbf{F} \cdot \mathbf{n}\, dS \\
&= \iint_D \mathbf{F}(\mathbf{r}(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v)\, du\, dv \\
&= \iint_D \langle P(\mathbf{r}(u, v)), Q(\mathbf{r}(u, v)), R(\mathbf{r}(u, v)) \rangle \cdot \left\langle \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right\rangle du\, dv \\
&= \iint_D \left[ P(\mathbf{r}(u, v)) \frac{\partial(y, z)}{\partial(u, v)} + Q(\mathbf{r}(u, v)) \frac{\partial(z, x)}{\partial(u, v)} + R(\mathbf{r}(u, v)) \frac{\partial(x, y)}{\partial(u, v)} \right] du\, dv \\
&= \iint_S P\, dy\, dz + Q\, dz\, dx + R\, dx\, dy \\
&= \iint_S \omega.
\end{aligned}
\]

It follows from the Divergence Theorem that

\[ \iint_S \omega = \iiint_E d\omega. \]

Putting all of these results together, we obtain the following combined theorem, known as the General Stokes’ Theorem:

If $M$ is an oriented $k$-manifold with boundary $\partial M$, and $\omega$ is a $(k - 1)$-form defined on an open set containing $M$, then

\[ \int_{\partial M} \omega = \int_M d\omega. \]

The importance of this unified theorem is that, unlike the previously stated theorems of vector calculus, this theorem, through the language of differential forms, can be generalized to functions of any number of variables. This is because operations on differential forms are not defined in terms of other operations, such as the cross product, that are limited to three variables. For example, given a 3-form $\omega = f(x, y, z, w)\, dx\, dy\, dw$, its integral over a 3-dimensional, closed, positively oriented hypersurface $S$ embedded in $\mathbb{R}^4$ is equal to the integral of $d\omega$ over the 4-dimensional solid $E$ that is enclosed by $S$, where $d\omega$ is computed using the previously stated rules for differentiation and multiplication of differential forms.
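
Because the wedge product is defined purely by algebraic laws (distributivity, homogeneity, anticommutativity, and $dx_i \wedge dx_i = 0$), it can be computed mechanically in any number of variables. The following is a minimal sketch, not part of the original notes, that evaluates forms at a single point so that coefficients are plain numbers. The dictionary representation and the helper names `sort_sign`, `add_term`, `form` and `wedge` are assumptions made for illustration; indices 0, 1, 2 stand for $dx$, $dy$, $dz$. It rechecks the earlier example $\omega = x\, dx - y\, dy$, $\eta = z\, dy\, dz - x\, dz\, dx$ at the arbitrarily chosen point $(2, 3, 5)$.

```python
from itertools import product

def sort_sign(indices):
    """Sort a tuple of distinct indices; return (sorted tuple, sign).

    Every adjacent swap flips the sign, as anticommutativity requires.
    """
    idx = list(indices)
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return tuple(idx), sign

def add_term(out, indices, coeff):
    """Accumulate coeff * dx_{i1} ... dx_{ik} into out in sorted order."""
    if len(set(indices)) < len(indices):
        return  # a repeated basic 1-form makes the term zero
    key, sign = sort_sign(indices)
    out[key] = out.get(key, 0) + sign * coeff

def form(terms):
    """Build a form from {index tuple: coefficient}; (2, 0) means dz dx."""
    out = {}
    for indices, coeff in terms.items():
        add_term(out, indices, coeff)
    return out

def wedge(omega, eta):
    """Wedge product, expanded term by term using distributivity."""
    out = {}
    for (i, a), (j, b) in product(omega.items(), eta.items()):
        add_term(out, i + j, a * b)
    return out

# Hypothetical sample point; any point would do.
x, y, z = 2, 3, 5
omega = form({(0,): x, (1,): -y})        # x dx - y dy
eta = form({(1, 2): z, (2, 0): -x})      # z dy dz - x dz dx
print(wedge(omega, eta))                 # coefficient of dx dy dz
print(x * z + x * y)                     # value of (xz + xy) at the point
```

Nothing in `wedge` depends on there being only three variables, which mirrors the remark above: a fourth index for $dw$ could be used without changing the code.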
