ANALYSIS II

Classroom Notes
with an Appendix: German Translation of Section 7

H.-D. Alber

Last modified 05/20/2013


Contents

1 Sequences of functions, uniform convergence, power series
  1.1 Pointwise convergence
  1.2 Uniform convergence, continuity of the limit function
  1.3 Supremum norm
  1.4 Uniformly converging series of functions
  1.5 Differentiability of the limit function
  1.6 Power series
  1.7 Trigonometric functions continued

2 The Riemann integral
  2.1 Definition of the Riemann integral
  2.2 Criteria for Riemann integrable functions
  2.3 Simple properties of the integral
  2.4 Fundamental theorem of calculus

3 Continuous mappings on $\mathbb{R}^n$
  3.1 Norms on $\mathbb{R}^n$
  3.2 Topology of $\mathbb{R}^n$
  3.3 Continuous mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$
  3.4 Uniform convergence, the normed spaces of continuous and linear mappings

4 Differentiable mappings on $\mathbb{R}^n$
  4.1 Definition of the derivative
  4.2 Directional derivatives and partial derivatives
  4.3 Elementary properties of differentiable mappings
  4.4 Mean value theorem
  4.5 Continuously differentiable mappings, second derivative
  4.6 Higher derivatives, Taylor formula

5 Local extreme values, inverse function and implicit function
  5.1 Local extreme values
  5.2 Banach's fixed point theorem
  5.3 Local invertibility
  5.4 Implicit functions


6 Integration of functions of several variables
  6.1 Definition of the integral
  6.2 Limits of integrals, parameter dependent integrals
  6.3 The Theorem of Fubini
  6.4 The transformation formula

7 p-dimensional surfaces in $\mathbb{R}^m$, curve and surface integrals, Theorems of Gauß and Stokes
  7.1 p-dimensional patches of a surface, submanifolds
  7.2 Integration on patches of a surface
  7.3 Integration on submanifolds
  7.4 The Integral Theorem of Gauß
  7.5 Green's formulae
  7.6 The Integral Theorem of Stokes

Appendix

A p-dimensionale Flächen im $\mathbb{R}^m$, Flächenintegrale, Gaußscher und Stokescher Satz
  A.1 p-dimensionale Flächenstücke, Untermannigfaltigkeiten
  A.2 Integration auf Flächenstücken
  A.3 Integration auf Untermannigfaltigkeiten
  A.4 Der Gaußsche Integralsatz
  A.5 Greensche Formeln
  A.6 Der Stokesche Integralsatz


1 Sequences of functions, uniform convergence, power series

1.1 Pointwise convergence

In Section 4 of the lecture notes to the Analysis I course we introduced the exponential function
\[
  x \mapsto \exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!} \,.
\]
For every $n \in \mathbb{N}$ we define the polynomial function $f_n : \mathbb{R} \to \mathbb{R}$ by
\[
  f_n(x) := \sum_{k=0}^{n} \frac{x^k}{k!} \,.
\]
Then $\{f_n\}_{n=1}^{\infty}$ is a sequence of functions with the property that
\[
  \exp(x) = \lim_{n \to \infty} f_n(x)
\]
for every $x \in \mathbb{R}$. We say that the sequence $\{f_n\}_{n=1}^{\infty}$ converges pointwise to the exponential function.

Definition 1.1  Let $D$ be a set (not necessarily a set of real numbers), and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions $f_n : D \to \mathbb{R}$. This sequence is said to converge pointwise if there exists a function $f : D \to \mathbb{R}$ such that
\[
  f(x) = \lim_{n \to \infty} f_n(x)
\]
for all $x \in D$. We call $f$ the pointwise limit function of $\{f_n\}_{n=1}^{\infty}$.

The sequence $\{f_n\}_{n=1}^{\infty}$ of functions converges pointwise if and only if the numerical sequence $\{f_n(x)\}_{n=1}^{\infty}$ converges for every $x \in D$. For, if $\{f_n\}_{n=1}^{\infty}$ converges pointwise, then $\{f_n(x)\}_{n=1}^{\infty}$ converges by definition. On the other hand, if $\{f_n(x)\}_{n=1}^{\infty}$ converges for every $x \in D$, then a function $f : D \to \mathbb{R}$ is defined by
\[
  f(x) := \lim_{n \to \infty} f_n(x) \,,
\]
and so $\{f_n\}_{n=1}^{\infty}$ converges pointwise.

This shows that the limit function of a pointwise convergent sequence of functions is uniquely determined. Moreover, together with the Cauchy convergence criterion for numerical sequences it immediately yields the following


Theorem 1.2  A sequence $\{f_n\}_{n=1}^{\infty}$ of functions $f_n : D \to \mathbb{R}$ converges pointwise if and only if to every $x \in D$ and to every $\varepsilon > 0$ there is a number $n_0 \in \mathbb{N}$ such that
\[
  |f_n(x) - f_m(x)| < \varepsilon
\]
for all $n, m \geq n_0$.

With quantifiers this can be written as
\[
  \forall x \in D \;\; \forall \varepsilon > 0 \;\; \exists n_0 \in \mathbb{N} \;\; \forall n, m \geq n_0 : \; |f_n(x) - f_m(x)| < \varepsilon \,.
\]

Examples

1. Let $D = [0, 1]$ and $x \mapsto f_n(x) := x^n$. Since for $x \in [0, 1)$ we have $\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} x^n = 0$, and since $\lim_{n \to \infty} f_n(1) = \lim_{n \to \infty} 1^n = 1$, the function sequence $\{f_n\}_{n=1}^{\infty}$ converges pointwise to the limit function $f : [0, 1] \to \mathbb{R}$,
\[
  f(x) = \begin{cases} 0 \,, & 0 \leq x < 1 \\ 1 \,, & x = 1 \,. \end{cases}
\]

2. Above we considered the sequence of polynomial functions $\{f_n\}_{n=1}^{\infty}$ with $f_n(x) = \sum_{k=0}^{n} \frac{x^k}{k!}$, which converges pointwise to the exponential function. This sequence $\big\{ \sum_{k=0}^{n} \frac{x^k}{k!} \big\}_{n=1}^{\infty}$ can also be called a function series.

3. Let $D = [0, 2]$ and
\[
  f_n(x) = \begin{cases} nx \,, & 0 \leq x \leq \frac{1}{n} \\[2pt] 2 - nx \,, & \frac{1}{n} < x < \frac{2}{n} \\[2pt] 0 \,, & \frac{2}{n} \leq x \leq 2 \,. \end{cases}
\]

[Figure: graph of the tent function $f_n$, rising from $0$ to $1$ on $[0, \frac{1}{n}]$, falling back to $0$ on $[\frac{1}{n}, \frac{2}{n}]$, and vanishing on $[\frac{2}{n}, 2]$.]

This function sequence $\{f_n\}_{n=1}^{\infty}$ converges pointwise to the null function on $[0, 2]$.

Proof: It must be shown that for all $x \in D$
\[
  \lim_{n \to \infty} f_n(x) = 0 \,.
\]


For $x = 0$ we obviously have $\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} 0 = 0$. Thus, let $x > 0$. Then there is $n_0 \in \mathbb{N}$ such that $\frac{2}{n_0} \leq x$. Since $\frac{2}{n} \leq \frac{2}{n_0} \leq x$ for $n \geq n_0$, the definition of $f_n$ yields $f_n(x) = 0$ for all these $n$, whence
\[
  \lim_{n \to \infty} f_n(x) = 0 \,.
\]

4. Let $D = \mathbb{R}$ and $x \mapsto f_n(x) = \frac{1}{n} [nx]$. Here $[nx]$ denotes the greatest integer less than or equal to $nx$.

[Figure: graph of the step function $f_n$, a staircase approximating the identity from below.]

$\{f_n\}_{n=1}^{\infty}$ converges pointwise to the identity mapping $x \mapsto f(x) := x$.

Proof: Let $x \in \mathbb{R}$ and $n \in \mathbb{N}$. Then there is $k \in \mathbb{Z}$ with $x \in [\frac{k}{n}, \frac{k+1}{n})$, hence $nx \in [k, k+1)$, and therefore
\[
  f_n(x) = \frac{1}{n} [nx] = \frac{k}{n} \,.
\]
From $\frac{k}{n} \leq x < \frac{k+1}{n}$ it follows that $0 \leq x - \frac{k}{n} < \frac{1}{n}$, which yields $|x - f_n(x)| = |x - \frac{k}{n}| < \frac{1}{n}$. This implies
\[
  \lim_{n \to \infty} f_n(x) = x \,.
\]

1.2 Uniform convergence, continuity of the limit function

Suppose that $D \subseteq \mathbb{R}$ and that $\{f_n\}_{n=1}^{\infty}$ is a sequence of continuous functions $f_n : D \to \mathbb{R}$ which converges pointwise. It is natural to ask whether the limit function $f : D \to \mathbb{R}$ is


continuous. However, the first example considered above shows that this need not be the case, since
\[
  x \mapsto f_n(x) = x^n : [0, 1] \to \mathbb{R}
\]
is continuous, but the limit function
\[
  f(x) = \begin{cases} 0 \,, & x \in [0, 1) \\ 1 \,, & x = 1 \end{cases}
\]
is discontinuous. To be able to conclude that the limit function is continuous, a stronger type of convergence must be introduced:

Definition 1.3  Let $D$ be a set (not necessarily a set of real numbers), and let $\{f_n\}_{n=1}^{\infty}$ be a sequence of functions $f_n : D \to \mathbb{R}$. This sequence is said to be uniformly convergent if there exists a function $f : D \to \mathbb{R}$ such that to every $\varepsilon > 0$ there is a number $n_0 \in \mathbb{N}$ with
\[
  |f_n(x) - f(x)| < \varepsilon
\]
for all $n \geq n_0$ and all $x \in D$. The function $f$ is called the limit function.

With quantifiers this can be written as
\[
  \forall \varepsilon > 0 \;\; \exists n_0 \in \mathbb{N} \;\; \forall x \in D \;\; \forall n \geq n_0 : \; |f_n(x) - f(x)| < \varepsilon \,.
\]

Note that for pointwise convergence the number $n_0$ may depend on $x \in D$, but for uniform convergence it must be possible to choose the number $n_0$ independently of $x \in D$. It is obvious that if $\{f_n\}_{n=1}^{\infty}$ converges uniformly, then it also converges pointwise, and the limit functions of uniform convergence and pointwise convergence coincide.

Examples

1. Let $D = [0, 1]$ and $x \mapsto f_n(x) := x^n : D \to \mathbb{R}$. We have shown above that the sequence $\{f_n\}_{n=1}^{\infty}$ converges pointwise. However, this sequence is not uniformly convergent.

Proof: If this sequence converged uniformly, the limit function would have to be
\[
  f(x) = \begin{cases} 0 \,, & x \in [0, 1) \\ 1 \,, & x = 1 \,, \end{cases}
\]
since this is the pointwise limit function. We show that for this function the negation of the statement in the definition of uniform convergence is true:
\[
  \exists \varepsilon > 0 \;\; \forall n_0 \in \mathbb{N} \;\; \exists x \in D \;\; \exists n \geq n_0 : \; |f_n(x) - f(x)| \geq \varepsilon \,.
\]


Choose $\varepsilon = \frac{1}{2}$ and $n_0$ arbitrarily. The negation is true if $x \in (0, 1)$ can be found with
\[
  |f_{n_0}(x) - f(x)| = |f_{n_0}(x)| = x^{n_0} = \tfrac{1}{2} = \varepsilon \,.
\]
This is equivalent to
\[
  x = \Big( \tfrac{1}{2} \Big)^{\frac{1}{n_0}} = 2^{-\frac{1}{n_0}} = e^{-\frac{\log 2}{n_0}} \,.
\]
Now $\frac{\log 2}{n_0} > 0$ and the strict monotonicity of the exponential function imply $0 < e^{-\frac{\log 2}{n_0}} < e^0 = 1$, whence $0 < \big( \frac{1}{2} \big)^{\frac{1}{n_0}} < 1$, and so $x = \big( \frac{1}{2} \big)^{\frac{1}{n_0}}$ has the sought properties.

2. Let $\{f_n\}_{n=1}^{\infty}$ be the sequence of functions defined in Example 3 of Section 1.1. This sequence converges pointwise to the function $f = 0$, but it does not converge uniformly. Otherwise it would have to converge uniformly to $f = 0$. However, choose $\varepsilon = 1$, let $n_0 \in \mathbb{N}$ be arbitrary and set $x = \frac{1}{n_0}$. Then
\[
  \Big| f_{n_0}\Big( \frac{1}{n_0} \Big) - f\Big( \frac{1}{n_0} \Big) \Big| = \Big| f_{n_0}\Big( \frac{1}{n_0} \Big) \Big| = 1 \geq \varepsilon \,,
\]
which negates the statement in the definition of uniform convergence.

3. Let $D = \mathbb{R}$ and $x \mapsto f_n(x) = \frac{1}{n} [nx]$. The sequence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $x \mapsto f(x) = x$. To verify this, let $\varepsilon > 0$ and remember that in Example 4 of Section 1.1 we showed that
\[
  |f_n(x) - f(x)| = |f_n(x) - x| < \frac{1}{n}
\]
for all $x \in \mathbb{R}$ and all $n \in \mathbb{N}$. Hence, if we choose $n_0 \in \mathbb{N}$ such that $\frac{1}{n_0} < \varepsilon$, we obtain for all $n \geq n_0$ and all $x \in \mathbb{R}$
\[
  |f_n(x) - f(x)| < \frac{1}{n} \leq \frac{1}{n_0} < \varepsilon \,.
\]

Uniform convergence is important because of the following

Theorem 1.4  Let $D \subseteq \mathbb{R}$, let $a \in D$ and let all the functions $f_n : D \to \mathbb{R}$ be continuous at $a$. Suppose that the sequence of functions $\{f_n\}_{n=1}^{\infty}$ converges uniformly to the limit function $f : D \to \mathbb{R}$. Then $f$ is continuous at $a$.

Proof: Let $\varepsilon > 0$. We have to find $\delta > 0$ such that for all $x \in D$ with $|x - a| < \delta$
\[
  |f(x) - f(a)| < \varepsilon
\]
holds. To determine such a number $\delta$, note that for all $x \in D$ and all $n \in \mathbb{N}$
\[
  |f(x) - f(a)| = |f(x) - f_n(x) + f_n(x) - f_n(a) + f_n(a) - f(a)|
  \leq |f(x) - f_n(x)| + |f_n(x) - f_n(a)| + |f_n(a) - f(a)| \,.
\]


Since $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f$, there is $n_0 \in \mathbb{N}$ with $|f_n(y) - f(y)| < \frac{\varepsilon}{3}$ for all $n \geq n_0$ and all $y \in D$, whence
\[
  |f(x) - f(a)| \leq \frac{2}{3} \varepsilon + |f_{n_0}(x) - f_{n_0}(a)| \,.
\]
Since $f_{n_0}$ is continuous, there is $\delta > 0$ such that $|f_{n_0}(x) - f_{n_0}(a)| < \frac{\varepsilon}{3}$ for all $x \in D$ with $|x - a| < \delta$. Thus, if $|x - a| < \delta$,
\[
  |f(x) - f(a)| < \frac{2}{3} \varepsilon + \frac{1}{3} \varepsilon = \varepsilon \,,
\]
which proves that $f$ is continuous at $a$.

This theorem shows that
\[
  \lim_{x \to a} \lim_{n \to \infty} f_n(x) = \lim_{x \to a} f(x) = f(a) = \lim_{n \to \infty} f_n(a) = \lim_{n \to \infty} \lim_{x \to a} f_n(x) \,.
\]
Hence, for a uniformly convergent sequence of functions the limits $\lim_{x \to a}$ and $\lim_{n \to \infty}$ can be interchanged.

Corollary 1.5  The limit function of a uniformly convergent sequence of continuous functions is continuous.

Example 2 considered above shows that the limit function can be continuous even if the sequence $\{f_n\}_{n=1}^{\infty}$ does not converge uniformly. However, we have

Theorem 1.6 (of Dini)  Let $D \subseteq \mathbb{R}$ be compact, let $f_n : D \to \mathbb{R}$ and $f : D \to \mathbb{R}$ be continuous, and assume that the sequence of functions $\{f_n\}_{n=1}^{\infty}$ converges pointwise and monotonically to $f$, i.e. the sequence $\{|f_n(x) - f(x)|\}_{n=1}^{\infty}$ is a decreasing null sequence for every $x \in D$. Then $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f$. (Ulisse Dini, 1845-1918.)

Proof: Let $\varepsilon > 0$. To every $x \in D$ a neighborhood $U(x)$ is associated as follows: $\lim_{n \to \infty} f_n(x) = f(x)$ implies that a number $n_0 = n_0(x, \varepsilon)$ exists such that $|f_{n_0}(x) - f(x)| < \varepsilon$. By the continuity of $f_{n_0}$ and $f$ there is a neighborhood $U(x)$ of $x$ such that $|f_{n_0}(y) - f(y)| < \varepsilon$ for all $y \in U(x) \cap D$. Since $D$ is compact, finitely many of these neighborhoods $U(x_1), \ldots, U(x_m)$ cover $D$. Set $\tilde n = \max \{ n_0(x_1, \varepsilon), \ldots, n_0(x_m, \varepsilon) \}$ and let $x \in D$. Then $x \in U(x_i)$ for some $i$, hence $|f_{n_0(x_i, \varepsilon)}(x) - f(x)| < \varepsilon$,


whence, since $\{f_n(x)\}_{n=1}^{\infty}$ converges monotonically to $f(x)$,
\[
  |f_n(x) - f(x)| < \varepsilon
\]
for all $n \geq n_0(x_i, \varepsilon)$. In particular, this inequality holds for all $n \geq \tilde n$. Since $\tilde n$ is independent of $x$, this proves that $\{f_n\}_{n=1}^{\infty}$ converges uniformly to $f$.

1.3 Supremum norm

For the definition of convergence and limits of numerical sequences the absolute value, a tool to measure the distance of numbers, was of crucial importance. Up to now we have not introduced a tool to measure the distance of functions, but we were nevertheless able to define two different types of convergence of sequences of functions, pointwise convergence and uniform convergence. Since functions with domain $D$ and target set $\mathbb{R}$ are elements of the algebra $F(D, \mathbb{R})$, it is natural to ask whether a tool can be introduced which allows one to measure the distance of two elements of $F(D, \mathbb{R})$, and which can be used to define convergence on the set $F(D, \mathbb{R})$ just as the absolute value could be used to define convergence on the set $\mathbb{R}$. Here we shall show that this is indeed possible on the smaller algebra $B(D, \mathbb{R})$ of bounded real valued functions. The resulting type of convergence of sequences of functions from $B(D, \mathbb{R})$ is uniform convergence.

Definition 1.7  Let $D$ be a set (not necessarily a set of real numbers), and let $f : D \to \mathbb{R}$ be a bounded function. The nonnegative number
\[
  \|f\| := \sup_{x \in D} |f(x)|
\]
is called the supremum norm of $f$.

The norm has properties similar to the properties of the absolute value on $\mathbb{R}$. This is shown by the following

Theorem 1.8  Let $f, g : D \to \mathbb{R}$ be bounded functions and let $c$ be a real number. Then

(i) $\|f\| = 0 \iff f = 0$

(ii) $\|cf\| = |c| \, \|f\|$

(iii) $\|f + g\| \leq \|f\| + \|g\|$

(iv) $\|fg\| \leq \|f\| \, \|g\|$.


Proof: (i) and (ii) are obvious. To prove (iii), note that for $x \in D$
\[
  |(f + g)(x)| = |f(x) + g(x)| \leq |f(x)| + |g(x)| \leq \sup_{y \in D} |f(y)| + \sup_{y \in D} |g(y)| = \|f\| + \|g\| \,.
\]
Thus, $\|f\| + \|g\|$ is an upper bound for the set $\{ |(f + g)(x)| \mid x \in D \}$, whence for the least upper bound
\[
  \|f + g\| = \sup_{x \in D} |(f + g)(x)| \leq \|f\| + \|g\| \,.
\]
To prove (iv), we use that for $x \in D$
\[
  |(fg)(x)| = |f(x) g(x)| = |f(x)| \, |g(x)| \leq \|f\| \, \|g\| \,,
\]
whence
\[
  \|fg\| = \sup_{x \in D} |(fg)(x)| \leq \|f\| \, \|g\| \,.
\]

Definition 1.9  Let $V$ be a vector space. A mapping $\| \cdot \| : V \to [0, \infty)$ which has the properties

(i) $\|v\| = 0 \iff v = 0$

(ii) $\|cv\| = |c| \, \|v\|$ (positive homogeneity)

(iii) $\|v + u\| \leq \|v\| + \|u\|$ (triangle inequality)

is called a norm on $V$. If $V$ is an algebra, then $\| \cdot \| : V \to [0, \infty)$ is called an algebra norm, provided that (i) - (iii) and

(iv) $\|uv\| \leq \|u\| \, \|v\|$

are satisfied. A vector space or an algebra with a norm is called a normed vector space or a normed algebra.

Clearly, the absolute value $| \cdot | : \mathbb{R} \to [0, \infty)$ has the properties (i) - (iv) of the preceding definition, hence $| \cdot |$ is an algebra norm on $\mathbb{R}$ and $\mathbb{R}$ is a normed algebra. The preceding theorem shows that the supremum norm $\| \cdot \| : B(D, \mathbb{R}) \to [0, \infty)$ is an algebra norm on the set $B(D, \mathbb{R})$ of bounded real valued functions, and $B(D, \mathbb{R})$ is a normed algebra.
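The algebra-norm properties in Theorem 1.8 can also be checked numerically on sampled functions. The following sketch (plain Python; the helper name `sup_norm`, the grid, and the choice of $\sin$ and $\cos$ are illustrative assumptions, and the supremum is only approximated by a maximum over grid points) tests the triangle inequality (iii) and the submultiplicativity (iv):

```python
import math

def sup_norm(f, xs):
    """Approximate ||f|| = sup_{x in D} |f(x)| by a maximum over sample points xs."""
    return max(abs(f(x)) for x in xs)

# D = [0, 2*pi] sampled on a grid (a finite stand-in for the true supremum).
xs = [2 * math.pi * k / 1000 for k in range(1001)]
f, g = math.sin, math.cos

# Triangle inequality (iii) and submultiplicativity (iv) of Theorem 1.8:
assert sup_norm(lambda x: f(x) + g(x), xs) <= sup_norm(f, xs) + sup_norm(g, xs) + 1e-12
assert sup_norm(lambda x: f(x) * g(x), xs) <= sup_norm(f, xs) * sup_norm(g, xs) + 1e-12
```

A finite maximum only bounds the supremum from below, but this is harmless here because both sides of each inequality are evaluated on the same grid.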


Conversely, assume that $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence. To prove that this sequence converges, we first must identify the limit function. To this end we show that $\{f_n(x)\}_{n=1}^{\infty}$ is a Cauchy sequence of real numbers for every $x \in D$. For, since $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence, to $\varepsilon > 0$ there exists $n_0 \in \mathbb{N}$ such that for all $n, m \geq n_0$
\[
  |f_n(x) - f_m(x)| \leq \|f_n - f_m\| < \varepsilon \,,
\]
and so $\{f_n(x)\}_{n=1}^{\infty}$ is indeed a Cauchy sequence of real numbers. Since every Cauchy sequence of real numbers converges, we obtain that $\{f_n\}_{n=1}^{\infty}$ converges pointwise with limit function $f : D \to \mathbb{R}$ defined by
\[
  f(x) = \lim_{n \to \infty} f_n(x) \,.
\]
We show that $\{f_n\}_{n=1}^{\infty}$ even converges uniformly to $f$. For, using again that $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence, to $\varepsilon > 0$ there is $n_0 \in \mathbb{N}$ with $\|f_n - f_m\| < \varepsilon$ for $n, m \geq n_0$. Therefore we obtain for $x \in D$ and $n \geq n_0$
\[
  |f_n(x) - f(x)| = |f_n(x) - \lim_{m \to \infty} f_m(x)| = \lim_{m \to \infty} |f_n(x) - f_m(x)| \leq \varepsilon \,,
\]
whence
\[
  \|f_n - f\| = \sup_{x \in D} |f_n(x) - f(x)| \leq \varepsilon
\]
for $n \geq n_0$, since $\varepsilon$ is independent of $x$.

1.4 Uniformly converging series of functions

Let $D$ be a set and let $f_n : D \to \mathbb{R}$ be functions. The series of functions $\sum_{n=1}^{\infty} f_n$ is said to be uniformly convergent if the sequence $\{ \sum_{n=1}^{m} f_n \}_{m=1}^{\infty}$ of partial sums is uniformly convergent.

Theorem 1.14 (Criterion of Weierstraß)  Let $f_n : D \to \mathbb{R}$ be bounded functions satisfying $\|f_n\| \leq c_n$, and let $\sum_{n=1}^{\infty} c_n$ be convergent. Then the series of functions $\sum_{n=1}^{\infty} f_n$ converges uniformly.

Proof: It suffices to show that $\{ \sum_{n=1}^{m} f_n \}_{m=1}^{\infty}$ is a Cauchy sequence. Let $\varepsilon > 0$. Since $\sum_{k=1}^{\infty} c_k$ converges, there is $n_0 \in \mathbb{N}$ such that $\big| \sum_{k=n}^{m} c_k \big| = \sum_{k=n}^{m} c_k < \varepsilon$ for all $m \geq n \geq n_0$, whence
\[
  \Big\| \sum_{k=n}^{m} f_k \Big\| \leq \sum_{k=n}^{m} \|f_k\| \leq \sum_{k=n}^{m} c_k < \varepsilon
\]
for all $m \geq n \geq n_0$.
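As a numerical illustration of the Weierstraß criterion, one can take the series $\sum_{n=1}^{\infty} \sin(nx)/n^2$, whose terms satisfy $\|f_n\| \leq c_n = \frac{1}{n^2}$ with $\sum_{n=1}^{\infty} c_n$ convergent. The following sketch (plain Python; the example series, the grid, and the helper name are ad-hoc choices for illustration) checks the resulting uniform Cauchy estimate on sample points:

```python
import math

def partial_sum(x, m):
    """s_m(x) = sum_{n=1}^m sin(n x) / n^2; the terms satisfy ||f_n|| <= 1/n^2."""
    return sum(math.sin(n * x) / n ** 2 for n in range(1, m + 1))

# Uniform Cauchy estimate from the Weierstrass criterion: the sup distance
# between partial sums is bounded by the corresponding tail of sum 1/n^2.
xs = [2 * math.pi * k / 200 for k in range(201)]
for m in (10, 100):
    gap = max(abs(partial_sum(x, 2 * m) - partial_sum(x, m)) for x in xs)
    tail = sum(1.0 / n ** 2 for n in range(m + 1, 2 * m + 1))
    assert gap <= tail + 1e-12
```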


1.5 Differentiability of the limit function

Let $D$ be a subset of $\mathbb{R}$. We showed that a uniformly convergent sequence $\{f_n\}_{n=1}^{\infty}$ of continuous functions has a continuous limit function $f : D \to \mathbb{R}$. One can ask what type of convergence is needed to ensure that a sequence of differentiable functions has a differentiable limit function. Simple examples show that uniform convergence is not sufficient to ensure this. The following is a slightly different question: Assume that $\{f_n\}_{n=1}^{\infty}$ is a uniformly convergent sequence of differentiable functions with limit function $f$. If $f$ is differentiable, does this imply that the sequence of derivatives $\{f_n'\}_{n=1}^{\infty}$ converges pointwise to $f'$? Also this need not be true, as is shown by the following example: Let $D = [0, 1]$ and let $x \mapsto f_n(x) = \frac{1}{n} x^n : [0, 1] \to \mathbb{R}$. The sequence $\{f_n\}_{n=1}^{\infty}$ of differentiable functions converges uniformly to the differentiable limit function $f = 0$. The sequence of derivatives $\{f_n'\}_{n=1}^{\infty} = \{x^{n-1}\}_{n=1}^{\infty}$ does not converge uniformly on $[0, 1]$, but it converges pointwise to the limit function
\[
  g(x) = \begin{cases} 0 \,, & 0 \leq x < 1 \\ 1 \,, & x = 1 \,. \end{cases}
\]
However, $g \neq f' = 0$.

Our original question is answered by the following

Theorem 1.15  Let $-\infty < a < b < \infty$ and let $f_n : [a, b] \to \mathbb{R}$ be differentiable functions. If the sequence $\{f_n'\}_{n=1}^{\infty}$ of derivatives converges uniformly and the sequence $\{f_n\}_{n=1}^{\infty}$ converges in at least one point $x_0 \in [a, b]$, then the sequence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to a differentiable limit function $f : [a, b] \to \mathbb{R}$ and
\[
  f'(x) = \lim_{n \to \infty} f_n'(x)
\]
for all $x \in [a, b]$.

This means that under the convergence condition given in this theorem, differentiation (which is a limit process) can be interchanged with the limit with respect to $n$:
\[
  \Big( \lim_{n \to \infty} f_n \Big)' = \lim_{n \to \infty} f_n' \,.
\]

Proof: First we show that $\{f_n\}_{n=1}^{\infty}$ converges uniformly. Let $\varepsilon > 0$. For $x \in [a, b]$
\[
  |f_m(x) - f_n(x)| \leq \big| \big( f_m(x) - f_n(x) \big) - \big( f_m(x_0) - f_n(x_0) \big) \big| + |f_m(x_0) - f_n(x_0)| \,. \tag{$*$}
\]


Since $f_m - f_n$ is differentiable, the mean value theorem yields, for a suitable $z$ between $x_0$ and $x$,
\[
  \big| \big( f_m(x) - f_n(x) \big) - \big( f_m(x_0) - f_n(x_0) \big) \big| = |f_m'(z) - f_n'(z)| \, |x - x_0| \,.
\]
The sequence of derivatives converges uniformly. Therefore there is $n_0 \in \mathbb{N}$ such that for all $m, n \geq n_0$
\[
  |f_m'(z) - f_n'(z)| < \varepsilon \,.
\]
Together with $(*)$ and $|x - x_0| \leq b - a$ this shows that $\{f_n\}_{n=1}^{\infty}$ is a uniform Cauchy sequence, since $\{f_n(x_0)\}_{n=1}^{\infty}$ converges. Hence $\{f_n\}_{n=1}^{\infty}$ converges uniformly to a limit function $f : [a, b] \to \mathbb{R}$.


To prove that $f$ is differentiable, fix $c \in [a, b]$ and define $F_n : [a, b] \to \mathbb{R}$ as the difference quotient
\[
  F_n(x) = \begin{cases} \dfrac{f_n(x) - f_n(c)}{x - c} \,, & x \neq c \\[6pt] f_n'(c) \,, & x = c \,. \end{cases}
\]
Applying the mean value theorem to $f_m - f_n$ yields
\[
  |F_m(x) - F_n(x)| \leq |f_m'(z) - f_n'(z)| + |f_m'(c) - f_n'(c)|
\]
for a suitable $z$ between $x$ and $c$ if $x \neq c$, and for $z = c$ if $x = c$. By assumption $\{f_n'\}_{n=1}^{\infty}$ converges uniformly, consequently there is $n_0 \in \mathbb{N}$ such that for all $m, n \geq n_0$ and all $y \in [a, b]$
\[
  |f_m'(y) - f_n'(y)| < \varepsilon \,,
\]
whence
\[
  |F_m(x) - F_n(x)| \leq |f_m'(z) - f_n'(z)| + |f_m'(c) - f_n'(c)| < \varepsilon + \varepsilon = 2 \varepsilon
\]
for all $m, n \geq n_0$ and all $x \in [a, b]$. This shows that $\{F_n\}_{n=1}^{\infty}$ converges uniformly and completes the proof.

1.6 Power series

Let a numerical sequence $\{a_n\}_{n=0}^{\infty}$ and a real number $x_0$ be given. For arbitrary $x \in \mathbb{R}$ consider the series
\[
  \sum_{n=0}^{\infty} a_n (x - x_0)^n \,.
\]
This series is called a power series; $a_n$ is called the $n$-th coefficient, and $x_0$ is the center of expansion of the power series. The Taylor series and the series for $\exp$, $\sin$ and $\cos$ are power series. These examples show that power series are interesting mainly as function series
\[
  x \mapsto \sum_{n=0}^{\infty} f_n(x)
\]
with $f_n(x) = a_n (x - x_0)^n$. First the convergence of power series must be investigated:

Theorem 1.16  Let
\[
  \sum_{n=0}^{\infty} a_n (x - x_0)^n
\]
be a power series.

(i) Suppose first that
\[
  a = \lim_{n \to \infty} \sqrt[n]{|a_n|} < \infty \,.
\]
Then the power series is in case


$a = 0$: absolutely convergent for all $x \in \mathbb{R}$;

$a > 0$: absolutely convergent for $|x - x_0| < \frac{1}{a}$, convergent or divergent for $|x - x_0| = \frac{1}{a}$, and divergent for $|x - x_0| > \frac{1}{a}$.

(ii) If $\{ \sqrt[n]{|a_n|} \}_{n=1}^{\infty}$ is unbounded, then the power series converges only for $x = x_0$.

Proof: By the root test, the series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ converges absolutely if
\[
  \lim_{n \to \infty} \sqrt[n]{|a_n| \, |x - x_0|^n} = |x - x_0| \lim_{n \to \infty} \sqrt[n]{|a_n|} = |x - x_0| \, a < 1 \,,
\]
and diverges if
\[
  \lim_{n \to \infty} \sqrt[n]{|a_n| \, |x - x_0|^n} = |x - x_0| \, a > 1 \,.
\]
This proves (i). If $\{ \sqrt[n]{|a_n|} \}_{n=1}^{\infty}$ is unbounded, then for $x \neq x_0$ also $\{ |x - x_0| \sqrt[n]{|a_n|} \}_{n=1}^{\infty} = \{ \sqrt[n]{|a_n (x - x_0)^n|} \}_{n=1}^{\infty}$ is unbounded, hence $\{ a_n (x - x_0)^n \}_{n=1}^{\infty}$ is not a null sequence, and consequently $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ diverges. This proves (ii).

Definition 1.17  Let $a = \lim_{n \to \infty} \sqrt[n]{|a_n|}$. The number
\[
  r = \begin{cases} \frac{1}{a} \,, & \text{if } a \neq 0 \\[2pt] \infty \,, & \text{if } a = 0 \\[2pt] 0 \,, & \text{if } \{ \sqrt[n]{|a_n|} \}_{n=1}^{\infty} \text{ is unbounded} \end{cases}
\]
is called the radius of convergence, and the open interval
\[
  (x_0 - r \,,\; x_0 + r) = \{ x \in \mathbb{R} \mid |x - x_0| < r \}
\]
is called the interval of convergence of the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$.

Examples

1. The power series
\[
  \sum_{n=0}^{\infty} x^n \,, \qquad \sum_{n=1}^{\infty} \frac{1}{n} x^n
\]
both have radius of convergence equal to $1$. This is evident for the first series. To prove it for the second series, note that
\[
  \lim_{n \to \infty} \sqrt[n]{n} = \lim_{n \to \infty} e^{\frac{1}{n} \log n} = e^{\lim_{n \to \infty} \frac{1}{n} \log n} = e^0 = 1 \,,
\]


since $\lim_{x \to \infty} \frac{\log x}{x} = 0$ by the rule of de l'Hospital. Thus, the radius of convergence of the second series is given by
\[
  r = \frac{1}{\lim_{n \to \infty} \sqrt[n]{\frac{1}{n}}} = \frac{1}{\lim_{n \to \infty} \frac{1}{\sqrt[n]{n}}} = \lim_{n \to \infty} \sqrt[n]{n} = 1 \,.
\]
For $x = 1$ both power series diverge; for $x = -1$ the first one diverges, the second one converges.

2. In Analysis I it was proved that the exponential series
\[
  \sum_{n=0}^{\infty} \frac{x^n}{n!}
\]
converges absolutely for all $x \in \mathbb{R}$. (To verify this use the ratio test, for example.) Consequently, the radius of convergence $r$ must be infinite. For, if $r$ were finite, the exponential series would have to diverge for all $x$ with $|x| > r$, which is excluded. (This implies $\lim_{n \to \infty} \sqrt[n]{\frac{1}{n!}} = 0$, by the way.)

Theorem 1.18  Let $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ and $\sum_{n=0}^{\infty} b_n (x - x_0)^n$ be power series with radii of convergence $r_1$ and $r_2$, respectively. Then for all $x$ with $|x - x_0| < r = \min(r_1, r_2)$
\[
  \sum_{n=0}^{\infty} a_n (x - x_0)^n + \sum_{n=0}^{\infty} b_n (x - x_0)^n = \sum_{n=0}^{\infty} (a_n + b_n) (x - x_0)^n \,,
\]
\[
  \Big[ \sum_{n=0}^{\infty} a_n (x - x_0)^n \Big] \Big[ \sum_{n=0}^{\infty} b_n (x - x_0)^n \Big] = \sum_{n=0}^{\infty} \Big( \sum_{k=0}^{n} a_k b_{n-k} \Big) (x - x_0)^n \,.
\]

Proof: The statements follow immediately from the theorems about computing with series and about the Cauchy product of two series. (We note that the radii of convergence of both series on the right are at least equal to $r$, but can be larger.)

Theorem 1.19  Let $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ be a power series with radius of convergence $r$. Then this series converges uniformly in every compact interval $[x_0 - r_1 \,,\; x_0 + r_1]$ with $0 \leq r_1 < r$.

Proof: Let $c_n = |a_n| \, r_1^n$. Then
\[
  \lim_{n \to \infty} \sqrt[n]{c_n} = \lim_{n \to \infty} \sqrt[n]{|a_n|} \; r_1 = \frac{r_1}{r} < 1 \,,
\]
whence the root test implies that the series $\sum_{n=0}^{\infty} c_n$


converges. Because of $|a_n (x - x_0)^n| \leq |a_n| r_1^n = c_n$ for all $x$ with $|x - x_0| \leq r_1$, the Weierstraß criterion (Theorem 1.14) yields that the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ converges uniformly for $x \in [x_0 - r_1 \,,\; x_0 + r_1] = \{ y \mid |y - x_0| \leq r_1 \}$.

Corollary 1.20  Let $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ be a power series with radius of convergence $r > 0$. Then the function $f : (x_0 - r \,,\; x_0 + r) \to \mathbb{R}$ defined by
\[
  f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n
\]
is continuous.

Proof: Since $\{ x \mapsto \sum_{n=0}^{m} a_n (x - x_0)^n \}_{m=0}^{\infty}$ is a sequence of continuous functions which converges uniformly in every compact interval $[x_0 - r_1 \,,\; x_0 + r_1]$ with $r_1 < r$, the limit function $f$ is continuous in each of these intervals. Hence $f$ is continuous in the union
\[
  (x_0 - r \,,\; x_0 + r) = \bigcup_{0 \leq r_1 < r} [x_0 - r_1 \,,\; x_0 + r_1] \,.
\]

Let $f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$ be a power series with radius of convergence $r > 0$, and let $f_m(x) = \sum_{n=0}^{m} a_n (x - x_0)^n$ denote its partial sums. The derivatives $f_m'(x) = \sum_{n=1}^{m} n a_n (x - x_0)^{n-1}$ are the partial sums of the power series $\sum_{n=1}^{\infty} n a_n (x - x_0)^{n-1}$; let $r_1$ denote its radius of convergence. Since $\lim_{n \to \infty} \sqrt[n]{n} = 1$, we have $\lim_{n \to \infty} \sqrt[n]{n |a_n|} = \lim_{n \to \infty} \sqrt[n]{|a_n|}$ whenever one of these limits exists, and $\{ \sqrt[n]{n |a_n|} \}_{n=1}^{\infty}$ is unbounded if and only if $\{ \sqrt[n]{|a_n|} \}_{n=1}^{\infty}$ is. By Theorem 1.16 this can only be true if $r_1 = r$. Thus, Theorem 1.19 implies that the sequence $\{f_m'\}_{m=1}^{\infty}$ of derivatives converges uniformly in every compact subinterval of the interval of convergence $(x_0 - r, x_0 + r)$.


Consequently, we can use Theorem 1.15 to conclude that the limit function $f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$ is differentiable with derivative
\[
  f'(x) = \lim_{m \to \infty} f_m'(x) = \sum_{n=1}^{\infty} n a_n (x - x_0)^{n-1}
\]
in all these subintervals. Hence $f$ is differentiable with derivative given by this formula in the interval of convergence $(x_0 - r \,,\; x_0 + r)$, which is the union of these subintervals. Repeating these arguments we obtain

Theorem 1.21  Let $f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$ be a power series with radius of convergence $r > 0$. Then $f$ is infinitely differentiable in the interval of convergence. All the derivatives can be computed termwise:
\[
  f^{(k)}(x) = \sum_{n=k}^{\infty} n (n-1) \cdots (n-k+1) \, a_n (x - x_0)^{n-k} \,.
\]

Example: In the interval $(0, 2]$ the logarithm can be expanded into the power series
\[
  \log x = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} (x - 1)^n \,.
\]
In Section 7.4 of the lecture notes to Analysis I we proved that this equation holds true for $\frac{1}{2} \leq x \leq 2$. To verify that it also holds for $0 < x < \frac{1}{2}$, note that the radius of convergence of the power series on the right is
\[
  r = \frac{1}{\lim_{n \to \infty} \sqrt[n]{\big| \frac{(-1)^{n-1}}{n} \big|}} = \lim_{n \to \infty} \sqrt[n]{n} = 1 \,.
\]
Hence, this power series converges in the interval of convergence $\{ x \mid |x - 1| < 1 \} = (0, 2)$ and represents there an infinitely differentiable function. The derivative of this function is
\[
  \Big[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} (x - 1)^n \Big]' = \sum_{n=1}^{\infty} (-1)^{n-1} (x - 1)^{n-1} = \sum_{n=0}^{\infty} (1 - x)^n = \frac{1}{1 - (1 - x)} = \frac{1}{x} = (\log x)' \,.
\]
Consequently $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} (x - 1)^n$ and $\log x$ both are antiderivatives of $\frac{1}{x}$ in the interval $(0, 2)$, and therefore differ at most by a constant:
\[
  \log x = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} (x - 1)^n + C \,.
\]
To determine $C$, set $x = 1$. From $\log(1) = 0$ we obtain $C = 0$.
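The expansion of the logarithm can be checked numerically inside the interval of convergence; convergence is fast near the center $x_0 = 1$ and slows down toward the endpoints. A small sketch (plain Python; the function name and the term counts are illustrative choices) compares partial sums of the series with `math.log`:

```python
import math

def log_series(x, terms=200):
    """Partial sum of sum_{n>=1} (-1)^(n-1) (x-1)^n / n, valid for |x - 1| < 1."""
    return sum((-1) ** (n - 1) * (x - 1) ** n / n for n in range(1, terms + 1))

# Compare with math.log at points inside the interval of convergence (0, 2).
for x in (0.5, 1.0, 1.5):
    assert abs(log_series(x, 2000) - math.log(x)) < 1e-6
```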


Theorem 1.22 (Identity theorem for power series)  Let the radii of convergence $r_1$ and $r_2$ of the power series $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ and $\sum_{n=0}^{\infty} b_n (x - x_0)^n$ be greater than zero. Assume that these power series coincide in a neighborhood $U_r(x_0) = \{ x \in \mathbb{R} \mid |x - x_0| < r \}$ of $x_0$ with $r \leq \min(r_1, r_2)$:
\[
  \sum_{n=0}^{\infty} a_n (x - x_0)^n = \sum_{n=0}^{\infty} b_n (x - x_0)^n
\]
for all $x \in U_r(x_0)$. Then $a_n = b_n$ for all $n = 0, 1, 2, \ldots$.

Proof: First choose $x = x_0$, which immediately yields
\[
  a_0 = b_0 \,.
\]
Next let $n \in \mathbb{N} \cup \{0\}$ and assume that $a_k = b_k$ for $0 \leq k \leq n$. It must be shown that $a_{n+1} = b_{n+1}$ holds. From the assumptions of the theorem and from the induction hypothesis it follows that
\[
  \sum_{k=n+1}^{\infty} a_k (x - x_0)^k = \sum_{k=n+1}^{\infty} b_k (x - x_0)^k \,,
\]
hence
\[
  (x - x_0)^{n+1} \sum_{k=n+1}^{\infty} a_k (x - x_0)^{k-n-1} = (x - x_0)^{n+1} \sum_{k=n+1}^{\infty} b_k (x - x_0)^{k-n-1}
\]
for all $x \in U_r(x_0)$. For $x$ from this neighborhood with $x \neq x_0$ this implies
\[
  \sum_{k=n+1}^{\infty} a_k (x - x_0)^{k-n-1} = \sum_{k=n+1}^{\infty} b_k (x - x_0)^{k-n-1} \,.
\]
The continuity of power series thus implies
\[
  a_{n+1} = \sum_{k=n+1}^{\infty} a_k (x_0 - x_0)^{k-n-1} = \lim_{x \to x_0} \sum_{k=n+1}^{\infty} a_k (x - x_0)^{k-n-1}
  = \lim_{x \to x_0} \sum_{k=n+1}^{\infty} b_k (x - x_0)^{k-n-1} = \sum_{k=n+1}^{\infty} b_k (x_0 - x_0)^{k-n-1} = b_{n+1} \,.
\]

Every power series defines a continuous function in the interval of convergence. Information about continuity of the power series on the boundary of the interval of convergence is provided by the following


Theorem 1.23  Let $\sum_{n=0}^{\infty} a_n (x - x_0)^n$ be a power series with positive radius of convergence, let $z \in \mathbb{R}$ be a boundary point of the interval of convergence and assume that $\sum_{n=0}^{\infty} a_n (z - x_0)^n$ converges. Then the power series converges uniformly in the interval $[z, x_0]$ (if $z < x_0$), or in the interval $[x_0, z]$ (if $x_0 < z$), respectively.

A proof of this theorem can be found in the book M. Barner, F. Flohr: Analysis I, p. 317, 318 (in German).

Corollary 1.24 (Abel's limit theorem)  If a power series converges at a point on the boundary of the interval of convergence, then it is continuous at this point. (Niels Henrik Abel, 1802-1829.)

1.7 Trigonometric functions continued

Since sine is defined by a power series with interval of convergence equal to $\mathbb{R}$,
\[
  \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} \,,
\]
the derivative of $\sin$ can be computed by termwise differentiation of the power series, hence
\[
  \sin' x = \sum_{n=0}^{\infty} (-1)^n (2n+1) \frac{x^{2n}}{(2n+1)!} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = \cos x \,.
\]
This result was proved in Analysis I using the addition theorem for sine.

Tangent and cotangent. One defines
\[
  \tan x = \frac{\sin x}{\cos x} \,, \qquad \cot x = \frac{\cos x}{\sin x} = \frac{1}{\tan x} \,.
\]

[Figure: graphs of $\tan$ and $\cot$, with vertical asymptotes of $\tan$ at odd multiples of $\frac{\pi}{2}$ and of $\cot$ at integer multiples of $\pi$.]


From the addition theorems for sine and cosine, addition theorems for tangent and cotangent can be derived:
\[
  \tan(x + y) = \frac{\tan x + \tan y}{1 - \tan x \tan y} \,, \qquad
  \cot(x + y) = \frac{\cot x \cot y - 1}{\cot x + \cot y} \,.
\]
The derivatives are
\[
  \tan' x = \Big( \frac{\sin x}{\cos x} \Big)' = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} \,,
\]
\[
  \cot' x = \Big( \frac{\cos x}{\sin x} \Big)' = \frac{-\sin^2 x - \cos^2 x}{\sin^2 x} = \frac{-1}{\sin^2 x} \,.
\]

Inverse trigonometric functions. Sine and cosine are periodic, hence not injective, and consequently do not have inverse functions. However, if sine and cosine are restricted to suitable intervals, inverse functions do exist.

By definition of $\pi$, we have $\cos x > 0$ for $x \in (-\frac{\pi}{2}, \frac{\pi}{2})$, hence because of $\sin' x = \cos x$, the sine function is strictly increasing in the interval $[-\frac{\pi}{2}, \frac{\pi}{2}]$. Consequently, $\sin : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ has an inverse function. Moreover, inverse functions also exist to other restrictions of sine:
\[
  \sin : \big[ \pi \big( n + \tfrac{1}{2} \big) \,,\; \pi \big( n + \tfrac{3}{2} \big) \big] \to [-1, 1] \,, \qquad n \in \mathbb{Z} \,.
\]
If one speaks of the inverse function of sine, one has to specify which one of these infinitely many inverses is meant. If no specification is given, the inverse function
\[
  \arcsin : [-1, 1] \to \big[ -\tfrac{\pi}{2} \,,\; \tfrac{\pi}{2} \big]
\]
of $\sin : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ is meant. For reasons which have their origin in the theory of functions of a complex variable, the infinitely many inverse functions
\[
  x \mapsto (\arcsin x) + 2n\pi \,, \qquad n \in \mathbb{Z}
\]
and
\[
  x \mapsto -(\arcsin x) + (2n+1)\pi \,, \qquad n \in \mathbb{Z}
\]
are called branches of the inverse function of sine, or branches of arc sine ("Zweige des Arcussinus"). The function $\arcsin : [-1, 1] \to [-\frac{\pi}{2}, \frac{\pi}{2}]$ is called the principal branch of the inverse function ("Hauptwert der Umkehrfunktion").

Correspondingly, the inverse function
\[
  \arccos : [-1, 1] \to [0, \pi]
\]


to the function $\cos : [0, \pi] \to [-1, 1]$ is called the principal branch of the inverse function of cosine, but there exist the infinitely many other inverse functions
\[
  x \mapsto \pm(\arccos x) + 2n\pi \,, \qquad n \in \mathbb{Z} \,.
\]

[Figure: graphs of $\arcsin$ on $[-1, 1]$ with values in $[-\frac{\pi}{2}, \frac{\pi}{2}]$, and of $\arccos$ on $[-1, 1]$ with values in $[0, \pi]$.]

A similar situation arises with tangent and cotangent. The principal branch of the inverse function of tangent is the function
\[
  \arctan : \mathbb{R} \to \big( -\tfrac{\pi}{2} \,,\; \tfrac{\pi}{2} \big) \,.
\]
One calls this function arc tangent ("Arcustangens"), but there are infinitely many other branches of the inverse function
\[
  x \mapsto \arctan x + n\pi \,, \qquad n \in \mathbb{Z} \,.
\]

[Figure: graph of $\arctan$, increasing between the horizontal asymptotes $y = -\frac{\pi}{2}$ and $y = \frac{\pi}{2}$.]

In the following we consider the principal branches of the inverse functions. For the


derivatives one obtains
\[
  (\arcsin x)' = \frac{1}{\sin'(\arcsin x)} = \frac{1}{\cos(\arcsin x)} = \frac{1}{\sqrt{1 - (\sin(\arcsin x))^2}} = \frac{1}{\sqrt{1 - x^2}} \,,
\]
\[
  (\arccos x)' = \frac{1}{\cos'(\arccos x)} = \frac{-1}{\sin(\arccos x)} = \frac{-1}{\sqrt{1 - (\cos(\arccos x))^2}} = \frac{-1}{\sqrt{1 - x^2}} \,,
\]
\[
  (\arctan x)' = \frac{1}{\tan'(\arctan x)} = \big( \cos(\arctan x) \big)^2 = \frac{1}{1 + (\tan(\arctan x))^2} = \frac{1}{1 + x^2} \,.
\]
The functions $\arcsin$, $\arccos$ and $\arctan$ can be expanded into power series. For example,
\[
  \frac{d}{dx} (\arctan x) = \frac{1}{1 + x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n}
\]
if $|x| < 1$. Also the power series
\[
  \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} x^{2n+1}
\]
has radius of convergence equal to $1$, and it is an antiderivative of $\sum_{n=0}^{\infty} (-1)^n x^{2n}$, hence
\[
  \arctan x = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} x^{2n+1} + C
\]
for $|x| < 1$, with a suitable constant $C$. From $\arctan 0 = 0$ we obtain $C = 0$, thus
\[
  \arctan x = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} x^{2n+1}
\]
for all $x \in \mathbb{R}$ with $|x| < 1$. The convergence criterion of Leibniz shows that the power series on the right converges for $x = 1$, hence Abel's limit theorem implies that the function given by the power series is continuous at $1$. Since $\arctan$ is continuous, the power series and the function $\arctan$ define two continuous extensions of the function $\arctan$ from the interval $(-1, 1)$ to $(-1, 1]$. Since the continuous extension is unique, we must have
\[
  \arctan 1 = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1} \,.
\]


Because of

  cos(2x) = (cos x)^2 − (sin x)^2 = 2(cos x)^2 − 1,

it follows that

  0 = cos(π/2) = 2(cos(π/4))^2 − 1,

hence

  cos(π/4) = √(1/2),

and thus

  sin(π/4) = √(1 − (cos(π/4))^2) = √(1/2),

so that

  tan(π/4) = sin(π/4)/cos(π/4) = 1.

This yields

  arctan 1 = π/4,

whence

  π/4 = ∑_{n=0}^∞ (−1)^n/(2n + 1).

Theoretically this series allows one to compute π, but the convergence is slow.
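This slow convergence is easy to observe numerically. The following sketch (plain Python, assuming nothing beyond the series itself) compares partial sums of the Leibniz series with π/4:

```python
import math

def leibniz_partial_sum(n_terms):
    """Partial sum of the series sum_{n=0}^{N-1} (-1)^n / (2n + 1)."""
    return sum((-1) ** n / (2 * n + 1) for n in range(n_terms))

# The error after N terms is of order 1/(2N): even 100000 terms
# give only about five correct decimal digits of pi/4.
for n_terms in (10, 1000, 100000):
    s = leibniz_partial_sum(n_terms)
    print(n_terms, s, abs(s - math.pi / 4))
```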


2 The Riemann integral

For a class of real functions as large as possible one wants to determine the area of the surface bounded by the graph of the function and the abscissa. This area is called the integral of the function.

[Figure: the region between the graph of f and the x-axis.]

To determine this area might be a difficult task for functions as complicated as the Dirichlet function

  f(x) = 1 if x ∈ Q,  f(x) = 0 if x ∈ R\Q,

and in fact the Riemann integral, which we are going to discuss in this section, is not able to assign a surface area to this function. The Riemann integral was historically the first rigorous notion of an integral. It was introduced by Riemann in his Habilitation thesis of 1854. Today mathematicians use a more general and advanced integral, the Lebesgue integral, which can assign an area to the Dirichlet function. The value of the Lebesgue integral of the Dirichlet function is 0. (Bernhard Riemann 1826 – 1866, Henri Lebesgue 1875 – 1941)

2.1 Definition of the Riemann integral

Let −∞ < a < b < ∞ and let f : [a, b] → R be a given function. It suggests itself to compute the area below the graph of f by inscribing rectangles into this surface. If we refine the subdivision, the total area of these rectangles will converge to the area of the surface below the graph of f. It is also possible to cover the area below the graph of f by rectangles. Again, if the subdivision is refined, the total area of these rectangles will converge to the area of the surface below the graph of f.

Therefore one expects that in both approximating processes the total areas of the rectangles converge to the same number. The area of the surface below the graph of f is defined to be this number.
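This expectation can be observed numerically. The sketch below (with f(x) = x^2 on [0, 1], chosen only for illustration) computes the total areas of inscribed and covering rectangles on finer and finer equidistant subdivisions; both bracket the area 1/3 and approach it as the subdivision is refined:

```python
def rectangle_areas(f, a, b, n):
    """Total areas of inscribed and covering rectangles for an increasing f
    on the equidistant subdivision of [a, b] into n pieces: on each piece the
    inscribed height is f at the left endpoint, the covering height f at the right."""
    dx = (b - a) / n
    inscribed = sum(f(a + (i - 1) * dx) * dx for i in range(1, n + 1))
    covering = sum(f(a + i * dx) * dx for i in range(1, n + 1))
    return inscribed, covering

L, U = rectangle_areas(lambda x: x * x, 0.0, 1.0, 1000)
# Both totals bracket the area 1/3 below the graph; for this f they differ
# by exactly (f(b) - f(a)) * (b - a) / n.
print(L, U)
```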


Of course, the total areas of the inscribed rectangles and of the covering rectangles will not converge to the same number for all functions f. An example for this is the Dirichlet function. Those functions f for which these areas converge to the same number are called Riemann integrable, and the number is called the Riemann integral of f over the interval [a, b].

[Figures: inscribed rectangles and covering rectangles below the graph of f on [a, b].]

This program will now be carried through rigorously.

Definition 2.1 Let −∞ < a < b < ∞. A partition P of the interval [a, b] is a finite set {x_0, …, x_n} ⊆ R with

  a = x_0 < x_1 < … < x_{n−1} < x_n = b.

For brevity we set ∆x_i = x_i − x_{i−1} (i = 1, …, n).

Let f : [a, b] → R be a bounded real function and P = {x_0, …, x_n} a partition of [a, b]. For i = 1, …, n set

  M_i = sup {f(x) | x_{i−1} ≤ x ≤ x_i},
  m_i = inf {f(x) | x_{i−1} ≤ x ≤ x_i},

and define

  U(P, f) = ∑_{i=1}^n M_i ∆x_i,
  L(P, f) = ∑_{i=1}^n m_i ∆x_i.

Since f is bounded, there exist numbers m, M such that

  m ≤ f(x) ≤ M


for all x ∈ [a, b]. This implies m ≤ m_i ≤ M_i ≤ M for all i = 1, …, n, hence

  m(b − a) = ∑_{i=1}^n m ∆x_i ≤ ∑_{i=1}^n m_i ∆x_i = L(P, f)
    ≤ ∑_{i=1}^n M_i ∆x_i = U(P, f) ≤ ∑_{i=1}^n M ∆x_i = M(b − a).  (∗)

Consequently, the infimum and the supremum

  ∫̄_a^b f dx = inf {U(P, f) | P is a partition of [a, b]},
  ∫̲_a^b f dx = sup {L(P, f) | P is a partition of [a, b]}

exist. The numbers ∫̄_a^b f dx and ∫̲_a^b f dx are called the upper and lower Riemann integral of f.

Definition 2.2 A bounded function f : [a, b] → R is called Riemann integrable if the upper Riemann integral ∫̄_a^b f dx and the lower Riemann integral ∫̲_a^b f dx coincide. The common value of the upper and lower Riemann integral is denoted by

  ∫_a^b f dx  or  ∫_a^b f(x) dx

and called the Riemann integral of f. The set of Riemann integrable functions defined on the interval [a, b] is denoted by R([a, b]).

2.2 Criteria for Riemann integrable functions

To work with Riemann integrable functions, one needs simple criteria for a function to be Riemann integrable. In this section we derive such criteria.

Definition 2.3 Let P, P_1, P_2 and P∗ be partitions of [a, b]. The partition P∗ is called a refinement of P if P ⊆ P∗ holds. P∗ is called the common refinement of P_1 and P_2 if P∗ = P_1 ∪ P_2.

Theorem 2.4 Let f : [a, b] → R and let P∗ be a refinement of the partition P of [a, b]. Then

  L(P, f) ≤ L(P∗, f),  U(P∗, f) ≤ U(P, f).


Proof: Let P = {x_0, …, x_n} and assume first that P∗ contains exactly one point x∗ more than P. Then there are x_{j−1}, x_j ∈ P with x_{j−1} < x∗ < x_j. Let

  w_1 = inf {f(x) | x_{j−1} ≤ x ≤ x∗},
  w_2 = inf {f(x) | x∗ ≤ x ≤ x_j},

and for i = 1, …, n

  m_i = inf {f(x) | x_{i−1} ≤ x ≤ x_i}.

Then w_1, w_2 ≥ m_j, hence

  L(P, f) = ∑_{i=1}^n m_i ∆x_i
    = ∑_{i=1}^{j−1} m_i ∆x_i + m_j (x∗ − x_{j−1} + x_j − x∗) + ∑_{i=j+1}^n m_i ∆x_i
    ≤ ∑_{i=1}^{j−1} m_i ∆x_i + w_1 (x∗ − x_{j−1}) + w_2 (x_j − x∗) + ∑_{i=j+1}^n m_i ∆x_i
    = L(P∗, f).

By induction we conclude that L(P, f) ≤ L(P∗, f) holds if P∗ contains k points more than P, for any k. The second inequality stated in the theorem is proved analogously.

Theorem 2.5 Let f : [a, b] → R be bounded. Then

  ∫̲_a^b f dx ≤ ∫̄_a^b f dx.

Proof: Let P_1 and P_2 be partitions and let P∗ be their common refinement. Inequality (∗) proved above shows that

  L(P∗, f) ≤ U(P∗, f).

Combination of this inequality with the preceding theorem yields

  L(P_1, f) ≤ L(P∗, f) ≤ U(P∗, f) ≤ U(P_2, f),

whence

  L(P_1, f) ≤ U(P_2, f)

for all partitions P_1 and P_2 of [a, b]. Therefore U(P_2, f) is an upper bound of the set

  {L(P, f) | P is a partition of [a, b]},


hence the least upper bound ∫̲_a^b f dx of this set satisfies

  ∫̲_a^b f dx ≤ U(P_2, f).

Since this inequality holds for every partition P_2 of [a, b], it follows that ∫̲_a^b f dx is a lower bound of the set

  {U(P, f) | P is a partition of [a, b]},

hence the greatest lower bound of this set satisfies

  ∫̲_a^b f dx ≤ ∫̄_a^b f dx.

Theorem 2.6 Let f : [a, b] → R be bounded. The function f belongs to R([a, b]) if and only if to every ε > 0 there is a partition P of [a, b] such that

  U(P, f) − L(P, f) < ε.

Proof: First assume that to every ε > 0 there is a partition P with U(P, f) − L(P, f) < ε. Since

  L(P, f) ≤ ∫̲_a^b f dx ≤ ∫̄_a^b f dx ≤ U(P, f),

it follows that

  0 ≤ ∫̄_a^b f dx − ∫̲_a^b f dx ≤ U(P, f) − L(P, f) < ε

for every ε > 0. This implies

  ∫̄_a^b f dx = ∫̲_a^b f dx,

thus f ∈ R([a, b]).

Conversely, let f ∈ R([a, b]). By definition of the infimum and the supremum, to every ε > 0 there are partitions P_1 and P_2 with

  U(P_1, f) < ∫̄_a^b f dx + ε/2 = ∫_a^b f dx + ε/2,
  L(P_2, f) > ∫̲_a^b f dx − ε/2 = ∫_a^b f dx − ε/2.


Let P be the common refinement of P_1 and P_2. Then

  ∫_a^b f dx − ε/2 < L(P_2, f) ≤ L(P, f) ≤ U(P, f) ≤ U(P_1, f) < ∫_a^b f dx + ε/2,

hence

  U(P, f) − L(P, f) < ε.

Theorem 2.7 Let f : [a, b] → R be continuous. Then f is Riemann integrable, and to every ε > 0 there is δ > 0 such that

  |∑_{i=1}^n f(t_i)∆x_i − ∫_a^b f dx| < ε

for every partition P = {x_0, …, x_n} of [a, b] with

  max{∆x_1, …, ∆x_n} < δ

and for every choice of points t_1, …, t_n with t_i ∈ [x_{i−1}, x_i].

Note that if {P_j}_{j=1}^∞ is a sequence of partitions P_j = {x_0^{(j)} = a, x_1^{(j)}, …, x_{n_j}^{(j)} = b} of [a, b] with

  lim_{j→∞} max{∆x_1^{(j)}, …, ∆x_{n_j}^{(j)}} = 0,

and if t_i^{(j)} ∈ [x_{i−1}^{(j)}, x_i^{(j)}], then this theorem implies

  ∫_a^b f dx = lim_{j→∞} ∑_{i=1}^{n_j} f(t_i^{(j)}) ∆x_i^{(j)}.

The integral is the limit of the Riemann sums ∑_{i=1}^n f(t_i)∆x_i.

Proof: Let ε > 0. We set

  η = ε/(b − a).

As a continuous function on the compact interval [a, b], the function f is bounded and uniformly continuous (cf. Theorem 6.43 of the lecture notes to the Analysis I course). Therefore there exists δ > 0 such that for all x, t ∈ [a, b] with |x − t| < δ

  |f(x) − f(t)| < η.  (∗)

We choose a partition P = {x_0, …, x_n} of [a, b] with max{∆x_1, …, ∆x_n} < δ. Then (∗) implies for all x, t ∈ [x_{i−1}, x_i]

  f(x) − f(t) < η,


hence

  M_i − m_i = sup_{x_{i−1}≤x≤x_i} f(x) − inf_{x_{i−1}≤t≤x_i} f(t)
    = max_{x_{i−1}≤x≤x_i} f(x) − min_{x_{i−1}≤t≤x_i} f(t)
    = f(x_0) − f(t_0) < η

for suitable x_0, t_0 ∈ [x_{i−1}, x_i]. This yields

  U(P, f) − L(P, f) = ∑_{i=1}^n (M_i − m_i)∆x_i < η ∑_{i=1}^n ∆x_i = η(b − a) = ε.  (∗∗)

Since ε > 0 was arbitrary, the preceding theorem implies f ∈ R([a, b]). From (∗∗) and from the inequalities

  L(P, f) = ∑_{i=1}^n m_i ∆x_i ≤ ∑_{i=1}^n f(t_i)∆x_i ≤ ∑_{i=1}^n M_i ∆x_i ≤ U(P, f),
  L(P, f) ≤ ∫_a^b f dx ≤ U(P, f),

we infer that

  |∫_a^b f dx − ∑_{i=1}^n f(t_i)∆x_i| < ε.

Also the class of monotone functions is a subset of R([a, b]):

Theorem 2.8 Let f : [a, b] → R be monotone. Then f is Riemann integrable.

Proof: Assume that f is increasing. f is bounded because of f(a) ≤ f(x) ≤ f(b) for all x ∈ [a, b]. Let ε > 0. For arbitrary n ∈ N set

  x_i = a + ((b − a)/n) i

for i = 0, 1, …, n. Then P = {x_0, …, x_n} is a partition of [a, b], and since f is increasing we obtain

  m_i = inf_{x_{i−1}≤x≤x_i} f(x) = f(x_{i−1}),
  M_i = sup_{x_{i−1}≤x≤x_i} f(x) = f(x_i),

hence

  U(P, f) − L(P, f) = ∑_{i=1}^n (M_i − m_i)∆x_i
    = ∑_{i=1}^n (f(x_i) − f(x_{i−1})) (b − a)/n = (f(b) − f(a)) (b − a)/n < ε,


where the last inequality sign holds if n ∈ N is chosen sufficiently large. By Theorem 2.6, this inequality shows that f ∈ R([a, b]). For decreasing f the proof is analogous.

Example: Let −∞ < a < b < ∞. The function exp : [a, b] → R is continuous and therefore Riemann integrable. The value of the integral is

  ∫_a^b e^x dx = e^b − e^a.

To verify this equation we use Theorem 2.7. For every n ∈ N and all i = 0, 1, …, n we set x_i^{(n)} = a + (i/n)(b − a). Then {P_n}_{n=1}^∞ with P_n = {x_0^{(n)}, …, x_n^{(n)}} is a sequence of partitions of [a, b] satisfying

  lim_{n→∞} max{∆x_1^{(n)}, …, ∆x_n^{(n)}} = lim_{n→∞} (b − a)/n = 0.

Thus, with t_i^{(n)} = x_{i−1}^{(n)} we obtain

  ∫_a^b e^x dx = lim_{n→∞} ∑_{i=1}^n exp(t_i^{(n)})∆x_i^{(n)}
    = lim_{n→∞} ∑_{i=1}^n exp(a + ((i − 1)/n)(b − a)) · (b − a)/n
    = lim_{n→∞} e^a ((b − a)/n) ∑_{i=1}^n [e^{(b−a)/n}]^{i−1}
    = lim_{n→∞} e^a ((b − a)/n) ([e^{(b−a)/n}]^n − 1)/(e^{(b−a)/n} − 1)
    = e^a (e^{b−a} − 1) / lim_{n→∞} (e^{(b−a)/n} − 1)/((b − a)/n)
    = e^b − e^a,

since lim_{x→0} (e^x − 1)/x = 1, by the rule of de l'Hospital.

2.3 Simple properties of the integral

Theorem 2.9 (i) If f_1, f_2 ∈ R([a, b]), then f_1 + f_2 ∈ R([a, b]), and

  ∫_a^b (f_1 + f_2) dx = ∫_a^b f_1 dx + ∫_a^b f_2 dx.

If g ∈ R([a, b]) and c ∈ R, then cg ∈ R([a, b]) and

  ∫_a^b cg dx = c ∫_a^b g dx.


Hence R([a, b]) is a vector space.

(ii) If f_1, f_2 ∈ R([a, b]) and f_1(x) ≤ f_2(x) for all x ∈ [a, b], then

  ∫_a^b f_1 dx ≤ ∫_a^b f_2 dx.

(iii) If f ∈ R([a, b]) and a < c < b, then

  f|[a,c] ∈ R([a, c]),  f|[c,b] ∈ R([c, b]),

and

  ∫_a^c f dx + ∫_c^b f dx = ∫_a^b f dx.

(iv) If f ∈ R([a, b]) and |f(x)| ≤ M for all x ∈ [a, b], then

  |∫_a^b f dx| ≤ M(b − a).

Proof: (i) Let f = f_1 + f_2 and let P be a partition of [a, b]. Then

  inf_{x_{i−1}≤x≤x_i} f(x) = inf_{x_{i−1}≤x≤x_i} (f_1(x) + f_2(x)) ≥ inf_{x_{i−1}≤x≤x_i} f_1(x) + inf_{x_{i−1}≤x≤x_i} f_2(x),
  sup_{x_{i−1}≤x≤x_i} f(x) = sup_{x_{i−1}≤x≤x_i} (f_1(x) + f_2(x)) ≤ sup_{x_{i−1}≤x≤x_i} f_1(x) + sup_{x_{i−1}≤x≤x_i} f_2(x),

hence

  L(P, f_1) + L(P, f_2) ≤ L(P, f),  U(P, f) ≤ U(P, f_1) + U(P, f_2).  (∗)

Let ε > 0. Since f_1 and f_2 are Riemann integrable, there exist partitions P_1 and P_2 such that for j = 1, 2

  U(P_j, f_j) − L(P_j, f_j) < ε.

For the common refinement P of P_1 and P_2 we have L(P_j, f_j) ≤ L(P, f_j) and U(P, f_j) ≤ U(P_j, f_j), hence, for j = 1, 2,

  U(P, f_j) − L(P, f_j) < ε.  (∗∗)

From this inequality and from (∗) we obtain

  U(P, f) − L(P, f) ≤ U(P, f_1) + U(P, f_2) − L(P, f_1) − L(P, f_2) < 2ε.


Since ε > 0 was chosen arbitrarily, this inequality and Theorem 2.6 imply f = f_1 + f_2 ∈ R([a, b]).

From (∗∗) we also obtain

  U(P, f_j) < L(P, f_j) + ε ≤ ∫_a^b f_j dx + ε,

whence, observing (∗),

  ∫_a^b f dx ≤ U(P, f) ≤ U(P, f_1) + U(P, f_2) ≤ ∫_a^b f_1 dx + ∫_a^b f_2 dx + 2ε.

Since ε > 0 was arbitrary, this yields

  ∫_a^b f dx ≤ ∫_a^b f_1 dx + ∫_a^b f_2 dx.  (∗∗∗)

Similarly, (∗∗) yields

  L(P, f_j) > U(P, f_j) − ε ≥ ∫_a^b f_j dx − ε,

which together with (∗) results in

  ∫_a^b f dx ≥ L(P, f) ≥ L(P, f_1) + L(P, f_2) ≥ ∫_a^b f_1 dx + ∫_a^b f_2 dx − 2ε,

from which we conclude that

  ∫_a^b f dx ≥ ∫_a^b f_1 dx + ∫_a^b f_2 dx.

This inequality and (∗∗∗) yield

  ∫_a^b f dx = ∫_a^b f_1 dx + ∫_a^b f_2 dx.

To prove that cg ∈ R([a, b]) we note that the definition of L(P, cg) immediately yields, for every partition P of [a, b],

  L(P, cg) = cL(P, g) if c ≥ 0,  L(P, cg) = cU(P, g) if c < 0.


Thus, for c ≥ 0,

  ∫̲_a^b cg dx = sup {cL(P, g) | P is a partition of [a, b]}
    = c sup {L(P, g) | P is a partition of [a, b]} = c ∫̲_a^b g dx = c ∫_a^b g dx,

and for c < 0,

  ∫̲_a^b cg dx = sup {cU(P, g) | P is a partition of [a, b]}
    = c inf {U(P, g) | P is a partition of [a, b]} = c ∫̄_a^b g dx = c ∫_a^b g dx.

In the same manner

  ∫̄_a^b cg dx = c ∫_a^b g dx.

Therefore

  ∫̲_a^b cg dx = ∫̄_a^b cg dx = c ∫_a^b g dx,

which implies cg ∈ R([a, b]) and ∫_a^b cg dx = c ∫_a^b g dx.

This completes the proof of (i). The proof of (ii) is left as an exercise. To prove (iii), note first that from any partition P of [a, b] we can define a refinement P∗ by

  P∗ = P ∪ {c}.

Theorem 2.4 implies

  L(P, f) ≤ L(P∗, f),  U(P∗, f) ≤ U(P, f).  (∗)

From P∗ we obtain partitions P∗₋ of [a, c] and P∗₊ of [c, b] by setting P∗₋ = P∗ ∩ [a, c] and P∗₊ = P∗ ∩ [c, b], and if P∗ = {x_0, …, x_n} with x_j = c, then

  L(P∗, f) = ∑_{i=1}^n m_i ∆x_i = ∑_{i=1}^j m_i ∆x_i + ∑_{i=j+1}^n m_i ∆x_i = L(P∗₋, f) + L(P∗₊, f).

Here for simplicity we wrote L(P∗₋, f) instead of L(P∗₋, f|[a,c]) and U(P∗₊, f) instead of U(P∗₊, f|[c,b]). Similarly

  U(P∗, f) = U(P∗₋, f) + U(P∗₊, f).


From (∗) and from these equations we conclude

  L(P, f) ≤ L(P∗₋, f) + L(P∗₊, f) ≤ ∫̲_a^c f dx + ∫̲_c^b f dx,
  U(P, f) ≥ U(P∗₋, f) + U(P∗₊, f) ≥ ∫̄_a^c f dx + ∫̄_c^b f dx.

These estimates hold for any partition P of [a, b], whence

  ∫_a^b f dx = ∫̲_a^b f dx ≤ ∫̲_a^c f dx + ∫̲_c^b f dx,
  ∫_a^b f dx = ∫̄_a^b f dx ≥ ∫̄_a^c f dx + ∫̄_c^b f dx.

Since ∫̲_a^c f dx ≤ ∫̄_a^c f dx and ∫̲_c^b f dx ≤ ∫̄_c^b f dx, these inequalities can only hold if

  ∫̲_a^c f dx = ∫̄_a^c f dx,  ∫̲_c^b f dx = ∫̄_c^b f dx,

hence f|[a,c] ∈ R([a, c]), f|[c,b] ∈ R([c, b]), and

  ∫_a^c f dx + ∫_c^b f dx = ∫_a^b f dx.

This proves (iii). The obvious proof of (iv) is left as an exercise.

Theorem 2.10 Let −∞ < m < M < ∞ and f ∈ R([a, b]) with f : [a, b] → [m, M]. Let Φ : [m, M] → R be continuous and let h = Φ ∘ f. Then h ∈ R([a, b]).

Proof: Let ε > 0. Since Φ is uniformly continuous on [m, M], there is a number δ > 0 such that for all s, t ∈ [m, M] with |s − t| ≤ δ

  |Φ(s) − Φ(t)| < ε.

Moreover, since f ∈ R([a, b]) there is a partition P = {x_0, …, x_n} of [a, b] such that

  U(P, f) − L(P, f) < εδ.  (∗)

Let

  M_i = sup_{x_{i−1}≤x≤x_i} f(x),  m_i = inf_{x_{i−1}≤x≤x_i} f(x),
  M_i∗ = sup_{x_{i−1}≤x≤x_i} h(x),  m_i∗ = inf_{x_{i−1}≤x≤x_i} h(x),


and

  A = {i | i ∈ N, 1 ≤ i ≤ n, M_i − m_i < δ},
  B = {1, …, n} \ A.

If i ∈ A, then for all x, y with x_{i−1} ≤ x, y ≤ x_i

  |h(x) − h(y)| = |Φ(f(x)) − Φ(f(y))| < ε,

since |f(x) − f(y)| ≤ M_i − m_i < δ. This yields for i ∈ A

  M_i∗ − m_i∗ ≤ ε.

If i ∈ B, then

  M_i∗ − m_i∗ ≤ 2‖Φ‖,

with the supremum norm ‖Φ‖ = sup_{m≤t≤M} |Φ(t)|. Furthermore, (∗) yields

  δ ∑_{i∈B} ∆x_i ≤ ∑_{i∈B} (M_i − m_i)∆x_i ≤ ∑_{i=1}^n (M_i − m_i)∆x_i = U(P, f) − L(P, f) < εδ,

whence

  ∑_{i∈B} ∆x_i ≤ ε.

Together we obtain

  U(P, h) − L(P, h) = ∑_{i∈A} (M_i∗ − m_i∗)∆x_i + ∑_{i∈B} (M_i∗ − m_i∗)∆x_i
    ≤ ε ∑_{i∈A} ∆x_i + 2‖Φ‖ ∑_{i∈B} ∆x_i
    ≤ ε(b − a) + 2‖Φ‖ε = ε(b − a + 2‖Φ‖).

Since ε was chosen arbitrarily, we conclude from this inequality that h ∈ R([a, b]), using Theorem 2.6.

Corollary 2.11 Let f, g ∈ R([a, b]). Then

(i) fg ∈ R([a, b]),

(ii) |f| ∈ R([a, b]) and |∫_a^b f dx| ≤ ∫_a^b |f| dx.

Proof: (i) Setting Φ(t) = t^2 in the preceding theorem yields f^2 = Φ ∘ f ∈ R([a, b]). From

  fg = (1/4)[(f + g)^2 − (f − g)^2]


we conclude with this result that also fg ∈ R([a, b]).

(ii) Setting Φ(t) = |t| in the preceding theorem yields |f| = Φ ∘ f ∈ R([a, b]). Choose c = ±1 such that

  c ∫_a^b f dx ≥ 0.

Then

  |∫_a^b f dx| = c ∫_a^b f dx = ∫_a^b cf dx ≤ ∫_a^b |f| dx,

since cf(x) ≤ |f(x)| for all x ∈ [a, b].

2.4 Fundamental theorem of calculus

Let −∞ < a < b < ∞ and f ∈ R([a, b]). One defines

  ∫_b^a f dx := − ∫_a^b f dx.

Then

  ∫_u^v f dx + ∫_v^w f dx = ∫_u^w f dx,

if u, v, w are arbitrary points of [a, b].

Theorem 2.12 (Mean value theorem of integration) Let f : [a, b] → R be continuous. Then there is a point c with a ≤ c ≤ b such that

  ∫_a^b f dx = f(c)(b − a).

Proof: f is Riemann integrable, since f is continuous. Since the integral is monotone, we obtain

  (b − a) min_{x∈[a,b]} f(x) = ∫_a^b min_{y∈[a,b]} f(y) dx ≤ ∫_a^b f(x) dx
    ≤ ∫_a^b max_{y∈[a,b]} f(y) dx = max_{x∈[a,b]} f(x) (b − a).

Since f attains its minimum and maximum on [a, b], by the intermediate value theorem there exists a number c ∈ [a, b] such that

  f(c) = (1/(b − a)) ∫_a^b f dx.
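The mean value theorem of integration can be observed numerically: the average value of f over [a, b] is attained at some point c. The sketch below (f and the interval chosen only for illustration, using a simple left-endpoint Riemann sum and a bisection search) finds such a c for f(x) = x^2 on [0, 1]:

```python
def riemann(f, a, b, n=100000):
    """Left-endpoint Riemann sum approximating the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + i * dx) for i in range(n)) * dx

f = lambda x: x * x
a, b = 0.0, 1.0
avg = riemann(f, a, b) / (b - a)  # average value, approximately 1/3

# f is continuous, so by the intermediate value theorem some c in [a, b]
# satisfies f(c) = avg; since this f is increasing, bisection finds it.
lo, hi = a, b
for _ in range(60):
    c = (lo + hi) / 2
    if f(c) < avg:
        lo = c
    else:
        hi = c
print(c)  # close to 1/sqrt(3)
```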


Theorem 2.13 Let f ∈ R([a, b]). Then

  F(x) = ∫_a^x f(t) dt

defines a continuous function F : [a, b] → R.

Proof: There is M with |f(x)| ≤ M for all x ∈ [a, b]. Thus, for x, x_0 ∈ [a, b] with x_0 < x,

  |F(x) − F(x_0)| = |∫_a^x f(t) dt − ∫_a^{x_0} f(t) dt| = |∫_{x_0}^x f(t) dt| ≤ M(x − x_0).

This estimate implies that F is continuous on [a, b].

Theorem 2.14 Let f ∈ R([a, b]) be continuous. Then the function F : [a, b] → R defined by

  F(x) = ∫_a^x f(t) dt

is continuously differentiable with

  F′ = f.

Therefore F is an antiderivative of f.

Proof: Let x_0 ∈ [a, b]. The mean value theorem of integration implies

  lim_{x→x_0} (F(x) − F(x_0))/(x − x_0) = lim_{x→x_0} (1/(x − x_0)) [∫_a^x f(t) dt − ∫_a^{x_0} f(t) dt]
    = lim_{x→x_0} (1/(x − x_0)) ∫_{x_0}^x f(t) dt = lim_{x→x_0} (1/(x − x_0)) f(y)(x − x_0)
    = lim_{x→x_0} f(y) = f(x_0)

for suitable y between x_0 and x. Therefore F is differentiable with F′ = f. Since f is continuous by assumption, F is continuously differentiable.

Theorem 2.15 (Fundamental theorem of calculus) Let F be an antiderivative of the continuous function f : [a, b] → R. Then

  ∫_a^b f(t) dt = F(b) − F(a) = F(x)|_a^b.

Proof: The functions x ↦ ∫_a^x f(t) dt and F both are antiderivatives of f. Since two antiderivatives differ at most by a constant c, we obtain

  F(x) = ∫_a^x f(t) dt + c


for all x ∈ [a, b]. This implies c = F(a), whence F(b) − F(a) = ∫_a^b f(t) dt.

This theorem is so important because it simplifies the otherwise tedious computation of integrals.

Examples. 1.) Let 0 < a < b and c ∈ R, c ≠ −1. Then

  ∫_a^b x^c dx = (1/(c + 1)) x^{c+1} |_a^b.

For c < −1 one obtains

  lim_{m→∞} ∫_a^m x^c dx = lim_{m→∞} ((1/(c + 1)) m^{c+1} − (1/(c + 1)) a^{c+1}) = −(1/(c + 1)) a^{c+1}.

Therefore one defines for a > 0 and c < −1

  ∫_a^∞ x^c dx := lim_{m→∞} ∫_a^m x^c dx = −(1/(c + 1)) a^{c+1}.

The integral ∫_a^∞ x^c dx is called an improper Riemann integral, and one says that for c < −1 the function x ↦ x^c is improperly Riemann integrable over the interval [a, ∞) with a > 0. In particular, one obtains

  ∫_1^∞ x^{−2} dx = 1.

For c < 0 the function x ↦ x^c is not defined at x = 0 and is unbounded on every interval (0, b] with b > 0. Therefore the Riemann integral ∫_0^b x^c dx is not defined. However, for −1 < c < 0 one obtains

  lim_{ε→0, ε>0} ∫_ε^b x^c dx = (1/(c + 1)) b^{c+1} − lim_{ε→0, ε>0} (1/(c + 1)) ε^{c+1} = (1/(c + 1)) b^{c+1}.

Therefore the improper Riemann integral

  ∫_0^b x^c dx := lim_{ε→0, ε>0} ∫_ε^b x^c dx = (1/(c + 1)) b^{c+1}

is defined, and x^c is improperly Riemann integrable over (0, b] for −1 < c < 0 and b > 0. In particular, one obtains

  ∫_0^1 x^{−1/2} dx = 2.

2.) For 0 < a < b < ∞

  ∫_a^b (1/x) dx = log b − log a.


∫ b 1Neither of the limits lim b→∞ dx , lim ∫ b 1a x a→0 dx exists, so a x x−1 is not improperly Riemannintegrable over [a, ∞) or (0, b] .3.) Let −1 < a < b < 1 . ThenOne defines∫ 1−1∫ ba1√1 − x2b→1b−11√1 − x2 dx= lim arcsin b − lim arcsin a = π 2 − ( − π )= π .2√ 11−x 2is improperly Riemann integrable over the interval (−1, 1) .Theorem 2.16 (Substitution) Let f be continuous, let g : [a, b] → R be continuouslydifferentiable and let the composition f ◦ g be defined. Then∫ baf ( g(t) ) g ′ (t) dt =∫ g(b)g(a)f(x) dx .Proof: Since g is a continuous function defined on a compact interval, the range of g isa compact interval [c, d] . Therefore we can restrict f to this interval. As a continuousfunction, f : [c, d] → R is Riemann integrable, hence has an antiderivative F : [c, d] → R .The chain rule implieswhence(F ◦ g) ′ = (F ′ ◦ g) · g ′ = (f ◦ g) · g ′ ,F ( g(b) ) − F ( g(a) ) =Combination of this equation withyields the statement.∫ bF ( g(b) ) − F ( g(a) ) =af ( g(t) ) g ′ (t) dt .∫ g(b)g(a)f(x) dxRemark: If g −1 exists, the rule of substitution can be written in the form∫ baf(x) dx =∫ g −1 (b)g −1 (a)f ( g(t) ) g ′ (t) dt .40


Example. We want to compute ∫_0^1 √(1 − x^2) dx. With the substitution x = x(t) = cos t it follows, because of the invertibility of cosine on the interval [0, π/2], that

  ∫_0^1 √(1 − x^2) dx = ∫_{x⁻¹(0)}^{x⁻¹(1)} √(1 − x(t)^2) (dx(t)/dt) dt
    = ∫_{π/2}^0 √(1 − (cos t)^2) (−sin t) dt = ∫_0^{π/2} (sin t)^2 dt
    = ∫_0^{π/2} (1/2 − (1/2) cos(2t)) dt = π/4 − (1/4) sin(2t)|_0^{π/2} = π/4,

where we used the addition theorem for cosine:

  cos(2t) = cos(t + t) = (cos t)^2 − (sin t)^2 = 1 − (sin t)^2 − (sin t)^2 = 1 − 2(sin t)^2.

Theorem 2.17 (Product integration) Let f : [a, b] → R be continuous, let F be an antiderivative of f and let g : [a, b] → R be continuously differentiable. Then

  ∫_a^b f(x) g(x) dx = F(x) g(x)|_a^b − ∫_a^b F(x) g′(x) dx.

Proof: The product rule gives (F · g)′ = F′ · g + F · g′ = f · g + F · g′, thus

  F(x) g(x)|_a^b = ∫_a^b f(x) g(x) dx + ∫_a^b F(x) g′(x) dx.

Example. With f(x) = g(x) = sin x and F(x) = −cos x we obtain

  ∫_0^π (sin x)^2 dx = −cos x sin x|_0^π + ∫_0^π (cos x)^2 dx
    = −cos x sin x|_0^π + ∫_0^π (1 − (sin x)^2) dx = π − ∫_0^π (sin x)^2 dx,

hence

  ∫_0^π (sin x)^2 dx = π/2.
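The result ∫_0^π (sin x)^2 dx = π/2 can be confirmed with a simple Riemann sum (a sketch for checking only, using the same left-endpoint sums as in Theorem 2.7):

```python
import math

def riemann_sum(h, a, b, n=100000):
    """Left-endpoint Riemann sum approximating the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + i * dx) for i in range(n)) * dx

val = riemann_sum(lambda x: math.sin(x) ** 2, 0.0, math.pi)
print(val, math.pi / 2)
```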


3 Continuous mappings on R^n

3.1 Norms on R^n

Let n ∈ N. On the set of all n-tuples of real numbers

  {x = (x_1, x_2, …, x_n) | x_i ∈ R, i = 1, …, n}

the operations of addition and multiplication by real numbers are defined by

  x + y := (x_1 + y_1, …, x_n + y_n),
  cx := (cx_1, …, cx_n).

The set of n-tuples together with these operations is a vector space denoted by R^n. A basis of this vector space is for example given by

  e_1 = (1, 0, …, 0), e_2 = (0, 1, 0, …, 0), …, e_n = (0, …, 0, 1).

On R^n, norms can be defined in different ways. I consider three examples of norms:

1.) The maximum norm:

  ‖x‖_∞ := max {|x_1|, …, |x_n|}.

To prove that this is a norm, the properties

(i) ‖x‖_∞ = 0 ⟺ x = 0,
(ii) ‖cx‖_∞ = |c| ‖x‖_∞ (positive homogeneity),
(iii) ‖x + y‖_∞ ≤ ‖x‖_∞ + ‖y‖_∞ (triangle inequality)

must be verified. (i) and (ii) are obviously satisfied. To prove (iii), note that there exists i ∈ {1, …, n} such that ‖x + y‖_∞ = |x_i + y_i|. Then

  ‖x + y‖_∞ = |x_i + y_i| ≤ |x_i| + |y_i| ≤ ‖x‖_∞ + ‖y‖_∞.

2.) The Euclidean norm:

  |x| := √(x_1^2 + … + x_n^2).

[Figure: the point x = (x_1, x_2) in the plane at Euclidean distance √(x_1^2 + x_2^2) from the origin.]


Using the scalar product

  x · y := x_1 y_1 + x_2 y_2 + … + x_n y_n ∈ R

this can also be written as

  |x| = √(x · x).

It is obvious that |x| = 0 ⟺ x = 0 and |cx| = |c| |x| hold. To verify that | · | is a norm on R^n, it thus remains to verify the triangle inequality. To this end one first proves the Cauchy-Schwarz inequality

  |x · y| ≤ |x| |y|.

Proof: The quadratic polynomial in t

  |x|^2 t^2 + 2(x · y)t + |y|^2 = |tx + y|^2 ≥ 0

cannot have two different zeros, whence the discriminant must satisfy

  (x · y)^2 − |x|^2 |y|^2 ≤ 0.

Now the triangle inequality is obtained as follows:

  |x + y|^2 = (x + y) · (x + y) = |x|^2 + 2x · y + |y|^2
    ≤ |x|^2 + 2|x · y| + |y|^2
    ≤ |x|^2 + 2|x| |y| + |y|^2 = (|x| + |y|)^2,

whence

  |x + y| ≤ |x| + |y|.

3.) The p-norm:

Let p be a real number with p ≥ 1. Then the p-norm is defined by

  ‖x‖_p := (|x_1|^p + … + |x_n|^p)^{1/p}.

Note that the 2-norm is the Euclidean norm:

  ‖x‖_2 = |x|.

Here we only verify that ‖ · ‖_1 is a norm. Since ‖x‖_1 = 0 ⟺ x = 0 and ‖cx‖_1 = |c| ‖x‖_1 are evident, we have to show that the triangle inequality is satisfied:

  ‖x + y‖_1 = ∑_{i=1}^n |x_i + y_i| ≤ ∑_{i=1}^n (|x_i| + |y_i|) = ‖x‖_1 + ‖y‖_1.
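These norms are easy to compute directly. The sketch below (a minimal illustration, with test vectors chosen arbitrarily) evaluates the maximum norm and the p-norm of a vector and checks the triangle inequality for each:

```python
def max_norm(x):
    """Maximum norm: max of the absolute values of the components."""
    return max(abs(c) for c in x)

def p_norm(x, p):
    """p-norm (|x_1|^p + ... + |x_n|^p)^(1/p); p = 2 is the Euclidean norm."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

x = [3.0, -4.0]
y = [-1.0, 2.0]
z = [x[i] + y[i] for i in range(2)]  # the sum x + y

print(max_norm(x), p_norm(x, 2), p_norm(x, 1))  # 4.0 5.0 7.0
for norm in (max_norm, lambda v: p_norm(v, 2), lambda v: p_norm(v, 1)):
    assert norm(z) <= norm(x) + norm(y)  # triangle inequality
```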


Definition 3.1 Let ‖ · ‖ be a norm on R^n. A sequence {x_k}_{k=1}^∞ with x_k ∈ R^n is said to converge if a ∈ R^n exists such that

  lim_{k→∞} ‖x_k − a‖ = 0.

a is called the limit or limit element of the sequence {x_k}_{k=1}^∞.

Just as in R = R^1 one proves that a sequence cannot converge to two different limit elements. Hence the limit of a sequence is unique. This limit is denoted by

  a = lim_{k→∞} x_k.

In this definition of convergence on R^n a norm is used. Hence it seems that convergence of a sequence depends on the norm chosen. The following results show that this is not the case.

Lemma 3.2 A sequence {x_k}_{k=1}^∞ with x_k = (x_k^{(1)}, …, x_k^{(n)}) ∈ R^n converges to a = (a^{(1)}, …, a^{(n)}) with respect to the maximum norm if and only if every sequence of components {x_k^{(i)}}_{k=1}^∞ converges to a^{(i)}, i = 1, …, n.

Proof: The statement follows immediately from the inequalities

  |x_k^{(i)} − a^{(i)}| ≤ ‖x_k − a‖_∞ ≤ |x_k^{(1)} − a^{(1)}| + … + |x_k^{(n)} − a^{(n)}|.

Theorem 3.3 Let {x_k}_{k=1}^∞ with x_k ∈ R^n be a sequence bounded with respect to the maximum norm, i.e. there is a constant c > 0 with ‖x_k‖_∞ ≤ c for all k ∈ N. Then the sequence {x_k}_{k=1}^∞ possesses a subsequence which converges with respect to the maximum norm.

Proof: Since |x_k^{(i)}| ≤ ‖x_k‖_∞ for i = 1, …, n, all the component sequences are bounded. Therefore, by the Bolzano-Weierstraß theorem for sequences in R, the sequence {x_k^{(1)}}_{k=1}^∞ possesses a convergent subsequence {x_{k(j)}^{(1)}}_{j=1}^∞. Then {x_{k(j)}^{(2)}}_{j=1}^∞ is a bounded subsequence of {x_k^{(2)}}_{k=1}^∞, hence it has a convergent subsequence {x_{k(j(l))}^{(2)}}_{l=1}^∞. Also {x_{k(j(l))}^{(1)}}_{l=1}^∞ converges, as a subsequence of the converging sequence {x_{k(j)}^{(1)}}_{j=1}^∞. Thus, for the subsequence {x_{k(j(l))}}_{l=1}^∞ of {x_k}_{k=1}^∞ the first two component sequences converge. We proceed in the same way and obtain after n steps a subsequence {x_{k_s}}_{s=1}^∞ of {x_k}_{k=1}^∞ for which all component sequences converge.
By the preceding lemma this implies that {x_{k_s}}_{s=1}^∞ converges with respect to the maximum norm.
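Lemma 3.2 can be illustrated with a concrete sequence. For x_k = (1/k, 1 + 1/k^2) in R^2 both component sequences converge (to 0 and to 1), so ‖x_k − a‖_∞ → 0 with a = (0, 1). The sketch below (the sequence is chosen only for illustration) checks this:

```python
def x_k(k):
    """The k-th element of the sequence x_k = (1/k, 1 + 1/k**2) in R^2."""
    return (1.0 / k, 1.0 + 1.0 / k ** 2)

a = (0.0, 1.0)  # the componentwise limit

def max_norm_dist(u, v):
    """Maximum-norm distance between two points of R^n."""
    return max(abs(u[i] - v[i]) for i in range(len(u)))

# The maximum-norm distance to a is dominated by the slower component 1/k.
for k in (1, 10, 1000):
    print(k, max_norm_dist(x_k(k), a))
```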


Theorem 3.4 Let ‖ · ‖ and ||| · ||| be norms on R^n. Then there exist constants a, b > 0 such that for all x ∈ R^n

  a‖x‖ ≤ |||x||| ≤ b‖x‖.

Proof: Obviously it suffices to show that for any norm ‖ · ‖ on R^n there exist constants a, b > 0 such that for the maximum norm ‖ · ‖_∞

  ‖x‖ ≤ a‖x‖_∞,  ‖x‖_∞ ≤ b‖x‖

for all x ∈ R^n. The first of these estimates is obtained as follows:

  ‖x‖ = ‖x_1 e_1 + x_2 e_2 + … + x_n e_n‖
    ≤ ‖x_1 e_1‖ + … + ‖x_n e_n‖ = |x_1| ‖e_1‖ + … + |x_n| ‖e_n‖
    ≤ (‖e_1‖ + … + ‖e_n‖) ‖x‖_∞ = a‖x‖_∞,

where a = ‖e_1‖ + … + ‖e_n‖.

The second of these estimates is proved by contradiction: Suppose that such a constant b > 0 did not exist. Then for every k ∈ N we can choose an element x_k ∈ R^n such that

  ‖x_k‖_∞ > k ‖x_k‖.

Set y_k = x_k / ‖x_k‖_∞. The sequence {y_k}_{k=1}^∞ satisfies

  ‖y_k‖ = ‖x_k‖ / ‖x_k‖_∞ < 1/k

and

  ‖y_k‖_∞ = ‖x_k‖_∞ / ‖x_k‖_∞ = 1.

Therefore by Theorem 3.3 the sequence {y_k}_{k=1}^∞ has a subsequence {y_{k_j}}_{j=1}^∞ which converges with respect to the maximum norm. For brevity we set z_j = y_{k_j}. Let z be the limit of {z_j}_{j=1}^∞. Then

  lim_{j→∞} ‖z_j − z‖_∞ = 0,

hence, since ‖z_j‖_∞ = ‖y_{k_j}‖_∞ = 1,

  1 = lim_{j→∞} ‖z_j‖_∞ = lim_{j→∞} ‖z_j − z + z‖_∞ ≤ ‖z‖_∞ + lim_{j→∞} ‖z_j − z‖_∞ = ‖z‖_∞,

whence z ≠ 0. On the other hand, ‖z_j‖ = ‖y_{k_j}‖ < 1/k_j ≤ 1/j together with the estimate ‖x‖ ≤ a‖x‖_∞ proved above implies

  ‖z‖ = ‖z − z_j + z_j‖ = lim_{j→∞} ‖z − z_j + z_j‖
    ≤ lim_{j→∞} ‖z − z_j‖ + lim_{j→∞} ‖z_j‖ ≤ a lim_{j→∞} ‖z − z_j‖_∞ + lim_{j→∞} 1/j = 0,


hence z = 0. This is a contradiction, hence a constant b must exist such that ‖x‖_∞ ≤ b‖x‖ for all x ∈ R^n.

Definition 3.5 Let ‖ · ‖ and ||| · ||| be norms on a vector space V. If constants a, b > 0 exist such that

  a‖v‖ ≤ |||v||| ≤ b‖v‖

for all v ∈ V, then these norms are said to be equivalent.

The above theorem thus shows that on R^n all norms are equivalent. From the definition of convergence it immediately follows that a sequence converging with respect to a norm also converges with respect to an equivalent norm. Therefore on R^n the definition of convergence does not depend on the norm.

Moreover, since all norms on R^n are equivalent to the maximum norm, from Lemma 3.2 and Theorem 3.3 we immediately obtain

Lemma 3.6 A sequence in R^n converges to a ∈ R^n if and only if the component sequences all converge to the components of a.

Theorem 3.7 (Theorem of Bolzano-Weierstraß for R^n) Every bounded sequence in R^n possesses a convergent subsequence.

Lemma 3.8 (Cauchy convergence criterion) Let ‖ · ‖ be a norm on R^n. A sequence {x_k}_{k=1}^∞ in R^n converges if and only if to every ε > 0 there is a k_0 ∈ N such that for all k, l ≥ k_0

  ‖x_k − x_l‖ < ε.

Proof: {x_k}_{k=1}^∞ is a Cauchy sequence in R^n if and only if every component sequence {x_k^{(i)}}_{k=1}^∞, i = 1, …, n, is a Cauchy sequence in R. For, there are constants a, b > 0 such that for all i = 1, …, n

  a|x_k^{(i)} − x_l^{(i)}| ≤ a‖x_k − x_l‖_∞ ≤ ‖x_k − x_l‖
    ≤ b‖x_k − x_l‖_∞ ≤ b(|x_k^{(1)} − x_l^{(1)}| + … + |x_k^{(n)} − x_l^{(n)}|).

The statement of the lemma follows from this observation, from the fact that the component sequences converge in R if and only if they are Cauchy sequences, and from the fact that a sequence converges in R^n if and only if all the component sequences converge.
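The equivalence constants are easy to observe for concrete norms. For ‖ · ‖_1 and the maximum norm on R^n one has ‖x‖_∞ ≤ ‖x‖_1 ≤ n‖x‖_∞; the sketch below (with randomly chosen test vectors, an illustration only) checks these bounds:

```python
import random

def max_norm(x):
    return max(abs(c) for c in x)

def one_norm(x):
    return sum(abs(c) for c in x)

n = 5
random.seed(0)
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    # Equivalence of norms on R^n: max norm <= 1-norm <= n * max norm.
    assert max_norm(x) <= one_norm(x) <= n * max_norm(x)
print("bounds verified")
```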


Infinite series: Let {x_k}_{k=1}^∞ be a sequence in R^n. By the infinite series ∑_{k=1}^∞ x_k one means the sequence {s_l}_{l=1}^∞ of partial sums s_l = ∑_{k=1}^l x_k. If {s_l}_{l=1}^∞ converges, then s = lim_{l→∞} s_l is called the sum of the series ∑_{k=1}^∞ x_k. One writes

  s = ∑_{k=1}^∞ x_k.

A series is said to converge absolutely if

  ∑_{k=1}^∞ ‖x_k‖

converges, where ‖ · ‖ is a norm on R^n. From

  ‖∑_{k=l}^m x_k‖ ≤ ∑_{k=l}^m ‖x_k‖

and from the Cauchy convergence criterion it follows that an absolutely convergent series converges. The converse is in general not true.

A series converges absolutely if and only if every component series converges absolutely. This implies that every rearrangement of an absolutely convergent series in R^n converges to the same sum, since this holds for the component series.

3.2 Topology of R^n

In the following we denote by ‖ · ‖ a norm on R^n.

Definition 3.9 Let a ∈ R^n and ε > 0. The set

  U_ε(a) = {x ∈ R^n | ‖x − a‖ < ε}

is called the open ε-neighborhood of a with respect to the norm ‖ · ‖, or the ball with center a and radius ε.

A subset U of R^n is called a neighborhood of a if U contains an ε-neighborhood of a.

The set U_1(0) = {x ∈ R^n | ‖x‖ < 1} is called the open unit ball with respect to ‖ · ‖.

In R^2 the unit ball can be pictured for the different norms:


[Figures: the unit ball U_1(0) in R^2 for the maximum norm ‖ · ‖_∞ (a square), for the Euclidean norm | · | (a disk), for the 1-norm ‖ · ‖_1 (a diamond), and for the p-norms ‖ · ‖_p with 1 ≤ p ≤ ∞; as p increases, the ball grows from the diamond (p = 1) through the disk (p = 2) toward the square (p = ∞).]
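The inclusions visible in the pictures — the 1-norm ball inside the Euclidean ball inside the maximum-norm ball — follow from ‖x‖_∞ ≤ ‖x‖_2 ≤ ‖x‖_1. A small sketch (test points chosen arbitrarily) checks membership of points in the three unit balls:

```python
def norm_inf(x): return max(abs(c) for c in x)
def norm_2(x):   return sum(c * c for c in x) ** 0.5
def norm_1(x):   return sum(abs(c) for c in x)

# For every x: norm_inf(x) <= norm_2(x) <= norm_1(x), hence the set
# {norm_1 < 1} is contained in {norm_2 < 1}, which is contained in {norm_inf < 1}.
points = [(0.4, 0.4), (0.7, 0.6), (0.9, 0.9)]
for p in points:
    print(p, norm_1(p) < 1, norm_2(p) < 1, norm_inf(p) < 1)
```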


Whereas the ε-neighborhoods of a point a differ for different norms, the notion of a neighborhood is independent of the norm. For, let ‖ · ‖ and ||| · ||| be norms on R^n. We show that every ε-neighborhood with respect to ‖ · ‖ of a ∈ R^n contains a δ-neighborhood with respect to ||| · |||. To this end let

  U_ε(a) = {x ∈ R^n | ‖x − a‖ < ε},
  V_ε(a) = {x ∈ R^n | |||x − a||| < ε}.

Since all norms on R^n are equivalent, there is a constant c > 0 such that

  c‖x − a‖ ≤ |||x − a|||

for all x ∈ R^n. Therefore, if x ∈ V_{cε}(a), then |||x − a||| < cε, which implies ‖x − a‖ ≤ (1/c)|||x − a||| < ε, and this means x ∈ U_ε(a). Consequently, with δ = cε,

  V_δ(a) ⊆ U_ε(a).

This result implies that if U is a neighborhood of a with respect to ‖ · ‖, then it contains a neighborhood U_ε(a), and then also the neighborhood V_{cε}(a), hence U is a neighborhood of a with respect to the norm ||| · ||| as well. Consequently, a neighborhood of a with respect to one norm is a neighborhood of a with respect to every other norm on R^n. Therefore the definition of a neighborhood is independent of the norm.

Definition 3.10 Let M be a subset of R^n. A point x ∈ R^n is called an interior point of M if M contains an ε-neighborhood of x, hence if M is a neighborhood of x.

x ∈ R^n is called an accumulation point of M if every neighborhood of x contains a point of M different from x.

x ∈ R^n is called a boundary point of M if every neighborhood of x contains a point of M and a point of the complement R^n\M.

M is called open if it consists only of its interior points. M is called closed if it contains all its accumulation points.

The following statements are proved exactly as in R^1: The complement of an open set is closed, the complement of a closed set is open. The union of an arbitrary system of open sets is open, the intersection of finitely many open sets is open. The intersection of an arbitrary system of closed sets is closed, the union of finitely many closed sets is closed.
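The definition of an interior point can be made concrete. For the open unit disk M = {x ∈ R^2 : |x| < 1} and an interior point x, the ball U_ε(x) with ε = 1 − |x| lies in M, since |y| ≤ |y − x| + |x| < ε + |x| = 1 by the triangle inequality. The sketch below (point and sampling scheme chosen only for illustration) confirms this for randomly sampled points of U_ε(x):

```python
import random

def norm2(x):
    """Euclidean norm on R^2."""
    return (x[0] * x[0] + x[1] * x[1]) ** 0.5

x = (0.6, 0.3)            # a point of the open unit disk M
eps = 1.0 - norm2(x)      # radius of a ball around x contained in M

random.seed(1)
for _ in range(1000):
    # sample displacements d with |d| < eps and confirm x + d lies in M
    d = (random.uniform(-eps, eps), random.uniform(-eps, eps))
    if norm2(d) < eps:
        y = (x[0] + d[0], x[1] + d[1])
        assert norm2(y) < 1.0
print("every sampled point of the ball lies in M")
```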


A subset M of R^n is called bounded if there exists a positive constant C such that

  ‖x‖ ≤ C

for all x ∈ M. The number

  diam(M) := sup_{y,x∈M} ‖y − x‖

is called the diameter of the bounded set M.

Theorem 3.11 Let {A_k}_{k=1}^∞ be a sequence of bounded, closed, nonempty subsets A_k of R^n with A_{k+1} ⊆ A_k and with

  lim_{k→∞} diam(A_k) = 0.

Then there is x ∈ R^n such that

  ⋂_{k=1}^∞ A_k = {x}.

Proof: For every k ∈ N choose x_k ∈ A_k. Then the sequence {x_k}_{k=1}^∞ is a Cauchy sequence, since lim_{k→∞} diam(A_k) = 0 implies that to every ε > 0 there is k_0 such that diam A_k < ε for all k ≥ k_0.


Theorem 3.13 Let M ⊆ R^n. The following three statements are equivalent:

(i) M is bounded and closed.

(ii) Let U be an open covering of M. Then there are finitely many U_1, …, U_m ∈ U such that M ⊆ ⋃_{i=1}^m U_i.

(iii) Every infinite subset of M possesses an accumulation point in M.

Proof: (i) ⇒ (ii): Assume that M is bounded and closed, but that there is an open covering U of M for which (ii) is not satisfied. As a bounded set, M is contained in a sufficiently large closed cube W. Subdivide this cube into 2^n closed cubes with edge length halved. By assumption, there is at least one of the smaller cubes, denoted by W_1, such that W_1 ∩ M cannot be covered by finitely many sets from U. Now subdivide W_1 and select W_2 analogously. The sequence {M ∩ W_k}_{k=1}^∞ of closed sets thus constructed has the following properties:

1.) M ∩ W ⊇ M ∩ W_1 ⊇ M ∩ W_2 ⊇ …

2.) lim_{k→∞} diam(M ∩ W_k) = 0

3.) M ∩ W_k cannot be covered by finitely many sets from U.

3.) implies M ∩ W_k ≠ ∅. Therefore, by 1.) and 2.) the sequence {M ∩ W_k}_{k=1}^∞ satisfies the assumptions of Theorem 3.11, hence there is x ∈ R^n such that

x ∈ ⋂_{k=1}^∞ (M ∩ W_k).

Since x ∈ M, there is U ∈ U with x ∈ U. The set U is open, and therefore contains an ε-neighborhood of x, and then also a δ-neighborhood of x with respect to the maximum norm. Because lim_{k→∞} diam(W_k) = 0 and because x ∈ W_k for all k, this δ-neighborhood contains the cubes W_k for all sufficiently large k. Hence U contains M ∩ W_k for all sufficiently large k. Thus, M ∩ W_k can be covered by one set from U, contradicting 3.). We thus conclude that if (i) holds, then also (ii) must be satisfied.


(ii) ⇒ (iii): Assume that (ii) holds and let A be a subset of M which does not have accumulation points in M. Then none of the points of M is an accumulation point of A; consequently, to every x ∈ M there is an open neighborhood which does not contain a point from A different from x. The system of all these neighborhoods is an open covering of M, hence finitely many of these neighborhoods cover M. Since every one of these neighborhoods contains at most one point from A, we conclude that A must be finite. An infinite subset of M must thus have an accumulation point in M.

(iii) ⇒ (i): Assume that (iii) is satisfied. If M were not bounded, to every k ∈ N there would exist x_k ∈ M such that

‖x_k‖ ≥ k.

Let A denote the set of these points. A is an infinite subset of M, but it does not have an accumulation point. For, to an accumulation point y of A there would have to exist infinitely many x ∈ A satisfying ‖x − y‖ < 1, which implies

‖x‖ = ‖x − y + y‖ ≤ ‖x − y‖ + ‖y‖ < 1 + ‖y‖.

This is not possible, since A only contains finitely many points with norm smaller than 1 + ‖y‖. Thus, the infinite subset A of M does not have an accumulation point. Since this contradicts (iii), M must be bounded.

Let x be an accumulation point of M. For every k ∈ N we can select x_k ∈ M with 0 < ‖x_k − x‖ < 1/k. The sequence {x_k}_{k=1}^∞ converges to x, hence x is the only accumulation point of this sequence. Therefore x must belong to M by (iii); thus M contains all its accumulation points, whence M is closed.

Definition 3.14 A subset of R^n is called compact if it has one (and therefore all) of the three properties stated in the preceding theorem.

Theorem 3.15 A subset M of R^n is compact if and only if every sequence in M possesses a convergent subsequence with limit contained in M.

This theorem is proved as in R^1 (cf. Theorem 6.15 in the classroom notes to Analysis I). A set M with the property that every sequence in M has a subsequence converging in M is called sequentially compact.
Therefore, in R^n a set is compact if and only if it is sequentially compact. Finally, just as in R^1, from the Theorem of Bolzano-Weierstraß for sequences (Theorem 3.7) we obtain


Theorem 3.16 (Theorem of Bolzano-Weierstraß for sets in R^n) Every bounded infinite subset of R^n has an accumulation point.

The proof is the same as the proof of Theorem 6.11 in the classroom notes to Analysis I.

3.3 Continuous mappings from R^n to R^m

Let D be a subset of R^n. We consider mappings f : D → R^m. Such mappings are called functions of n variables. For x ∈ D let f_1(x), …, f_m(x) denote the components of the element f(x) ∈ R^m. This defines mappings

f_i : D → R,  i = 1, …, m.

Conversely, let m mappings f_1, …, f_m : D → R be given. Then a mapping f : D → R^m is defined by

f(x) := (f_1(x), …, f_m(x)).

Thus, every mapping f : D → R^m with D ⊆ R^n is specified by m equations

y_1 = f_1(x_1, …, x_n)
⋮
y_m = f_m(x_1, …, x_n).

Examples

1.) Let f : R^n → R^m be a mapping which satisfies for all x, y ∈ R^n and all c ∈ R

f(x + y) = f(x) + f(y),
f(cx) = c f(x).

Then f is called a linear mapping. The study of linear mappings from R^n to R^m is the topic of linear algebra. From linear algebra one knows that f : R^n → R^m is a linear mapping if and only if there exists an m×n-matrix A = (a_ij), i = 1, …, m, j = 1, …, n,


with a_ij ∈ R such that

f(x) = Ax = (a_11 x_1 + … + a_1n x_n, …, a_m1 x_1 + … + a_mn x_n).

2.) Let n = 2, m = 1 and D = {x ∈ R^2 | |x| < 1}. A mapping f : D → R is defined by

f(x) = f(x_1, x_2) = √(1 − x_1^2 − x_2^2).

The graph of a mapping from a subset D of R^2 to R is a surface in R^3. In the present example, graph f is the upper half of the unit sphere.

(Figure: the graph of f, the upper unit hemisphere, over the x_1-x_2-plane.)

3.) Every mapping f : R → R^m is called a path in R^m. For example, let for t ∈ R

f(t) = (f_1(t), f_2(t), f_3(t)) = (cos t, sin t, t).

The range of f is a helix.


(Figure: the helix in R^3.)

4.) Polar coordinates: Let

D = {(r, ϕ, ϑ) ∈ R^3 | 0 < r, 0 ≤ ϕ < 2π, 0 < ϑ < π} ⊆ R^3,

and let f : D → R^3,

f(r, ϕ, ϑ) = (r cos ϕ sin ϑ, r sin ϕ sin ϑ, r cos ϑ).

The range of this mapping is R^3 without the x_3-axis.

(Figure: the point x = f(r, ϕ, ϑ) with radius r, polar angle ϑ measured from the x_3-axis, and azimuthal angle ϕ in the x_1-x_2-plane.)

Definition 3.17 Let D be a subset of R^n. A mapping f : D → R^m is said to be continuous at a ∈ D if to every neighborhood V of f(a) there is a neighborhood U of a such that f(U ∩ D) ⊆ V.

Since every neighborhood of a point contains an ε-neighborhood of this point, irrespective of the norm we use to define ε-neighborhoods, we obtain an equivalent formulation if in


this definition we replace V by U_ε(f(a)) and U by U_δ(a). Thus, using the definition of ε-neighborhoods, we immediately get the following

Theorem 3.18 Let D ⊆ R^n. A mapping f : D → R^m is continuous at a ∈ D if and only if to every ε > 0 there is δ > 0 such that

‖f(x) − f(a)‖ < ε  for all x ∈ D with ‖x − a‖ < δ.

Note that in this theorem we denoted the norms in R^n and R^m with the same symbol ‖·‖.

Almost all results for continuous real functions transfer to continuous functions from R^n to R^m with the same proofs. An example is the following

Theorem 3.19 Let D ⊆ R^n. A function f : D → R^m is continuous at a ∈ D if and only if for every sequence {x_k}_{k=1}^∞ with x_k ∈ D and lim_{k→∞} x_k = a

lim_{k→∞} f(x_k) = f(a)

holds.

Proof: Cf. the proof of Theorem 6.21 of the classroom notes to Analysis I.

Definition 3.20 Let f : D → R^m and let a ∈ R^n be an accumulation point of D. Let b ∈ R^m. One says that f has the limit b at a and writes

lim_{x→a} f(x) = b

if to every ε > 0 there is δ > 0 such that

‖f(x) − b‖ < ε  for all x ∈ D \ {a} with ‖x − a‖ < δ.

Theorem 3.21 Let f : D → R^m and let a be an accumulation point of D. lim_{x→a} f(x) = b holds if and only if for every sequence {x_k}_{k=1}^∞ with x_k ∈ D \ {a} and lim_{k→∞} x_k = a

lim_{k→∞} f(x_k) = b

holds.


Proof: Cf. the proof of Theorem 6.39 of the classroom notes to Analysis I.

Example: Let f : R^2 → R be defined by

f(x, y) = 2xy / (x^2 + y^2)  for (x, y) ≠ 0,
f(x, y) = 0  for (x, y) = 0.

This function is continuous at every point (x, y) ∈ R^2 with (x, y) ≠ 0, but it is not continuous at (x, y) = 0. For,

f(x, 0) = f(0, y) = 0,

whence f vanishes identically on the lines y = 0 and x = 0. However, on the diagonal x = y,

f(x, y) = f(x, x) = 2x^2 / 2x^2 = 1.

For the two sequences {z_k}_{k=1}^∞ with z_k = (1/k, 0) and {z̃_k}_{k=1}^∞ with z̃_k = (1/k, 1/k) we therefore have lim_{k→∞} z_k = lim_{k→∞} z̃_k = 0, but

lim_{k→∞} f(z_k) = 0 = f(0) ≠ 1 = lim_{k→∞} f(z̃_k).

Therefore, by Theorem 3.19, f is not continuous at (0, 0), and by Theorem 3.21 it does not have a limit at (0, 0). Hence f cannot be made into a function continuous at (0, 0) by modifying the value f(0, 0).

Observe however, that the function

x ↦ f(x, y) : R → R

is continuous for every y ∈ R, and

y ↦ f(x, y) : R → R

is continuous for every x ∈ R. Therefore f is continuous in every variable, but as a function f : R^2 → R it is not continuous at (0, 0).

Theorem 3.22 Let D ⊆ R^n and let f : D → R^m. The function f is continuous at a point a ∈ D if and only if all the component functions f_1, …, f_m : D → R are continuous at a.

Proof: f is continuous at a if and only if for every sequence {x_k}_{k=1}^∞ with x_k ∈ D and lim_{k→∞} x_k = a the sequence {f(x_k)}_{k=1}^∞ converges to f(a). This holds if and only if every component sequence {f_i(x_k)}_{k=1}^∞ converges to f_i(a) for i = 1, …, m, and this is equivalent to the continuity of f_i at a for i = 1, …, m.
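The two sequences in the example f(x, y) = 2xy/(x^2 + y^2) can be checked numerically; a short Python sketch (the function name `f` is taken over from the example):

```python
def f(x, y):
    # the example function: 0 at the origin, 2xy/(x^2 + y^2) elsewhere
    if (x, y) == (0.0, 0.0):
        return 0.0
    return 2.0 * x * y / (x ** 2 + y ** 2)

# Along the coordinate axes f vanishes identically ...
axis_vals = [f(1.0 / k, 0.0) for k in range(1, 6)]
# ... but on the diagonal it is identically 1:
diag_vals = [f(1.0 / k, 1.0 / k) for k in range(1, 6)]

assert all(v == 0.0 for v in axis_vals)
assert all(v == 1.0 for v in diag_vals)
# Two null sequences with different limits of f(z_k): no limit at (0,0).
```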


Definition 3.23 Let D ⊆ R^n. A function f : D → R^m is said to be continuous if it is continuous at every point of D.

Definition 3.24 Let D be a subset of R^n. A subset D′ of D is said to be relatively open with respect to D if there exists an open subset O of R^n such that D′ = O ∩ D.

Thus, for example, every subset D of R^n is relatively open with respect to itself, since D = D ∩ R^n and R^n is open.

Lemma 3.25 A subset D′ of D is relatively open with respect to D if and only if for every x ∈ D′ there is a neighborhood U of x such that U ∩ D ⊆ D′.

Proof: If D′ is relatively open, there is an open subset O of R^n such that D′ = O ∩ D. For every x ∈ D′ the set O is the sought neighborhood.

Conversely, assume that to every x ∈ D′ there is a neighborhood U(x) with U(x) ∩ D ⊆ D′. Since every neighborhood contains an open neighborhood, we can assume that U(x) is open. Then

D′ ⊆ D ∩ ⋃_{x∈D′} U(x) = ⋃_{x∈D′} (D ∩ U(x)) ⊆ D′,

whence D′ = D ∩ O with the open set O = ⋃_{x∈D′} U(x). Consequently D′ is relatively open with respect to D.

Theorem 3.26 Let D ⊆ R^n. A function f : D → R^m is continuous if and only if for each open set O of R^m the inverse image f^{−1}(O) is relatively open with respect to D.

Proof: Let f be continuous and x ∈ f^{−1}(O). Then f(x) belongs to the open set O, whence O is a neighborhood of f(x). Therefore, by definition of continuity, there is a neighborhood V of x such that f(V ∩ D) ⊆ O, which implies V ∩ D ⊆ f^{−1}(O). Thus, f^{−1}(O) is relatively open with respect to D.

Assume conversely that the inverse image of every open set is relatively open in D. Let x ∈ D and let U be an open neighborhood of f(x). Then f^{−1}(U) is relatively open, whence there is an open set O ⊆ R^n such that f^{−1}(U) = O ∩ D. This implies x ∈ f^{−1}(U) ⊆ O, whence O is a neighborhood of x. For this neighborhood of x we have

f(O ∩ D) = f(f^{−1}(U)) ⊆ U,

hence f is continuous.

The following theorems and the corollary are proved as the corresponding theorems in R.


Theorem 3.27 (i) Let D ⊆ R^n and let f : D → R^m, g : D → R^m be continuous. Then also the mappings f + g : D → R^m and cf : D → R^m are continuous for every c ∈ R.

(ii) Let f : D → R and g : D → R be continuous. Then also f · g : D → R and

f/g : {x ∈ D | g(x) ≠ 0} → R

are continuous.

(iii) Let f : D → R^m and ϕ : D → R be continuous. Then also ϕf is continuous.

Theorem 3.28 Let D_1 ⊆ R^n and D_2 ⊆ R^p. Assume that f : D_1 → D_2 and g : D_2 → R^m are continuous. Then g ∘ f : D_1 → R^m is continuous.

This theorem is proved just as Theorem 6.25 in the classroom notes of Analysis I.

Definition 3.29 Let D be a subset of R^n. A mapping f : D → R^m is said to be uniformly continuous if to every ε > 0 there is δ > 0 such that

‖f(x) − f(y)‖ < ε  for all x, y ∈ D satisfying ‖x − y‖ < δ.

Theorem 3.30 Let D ⊆ R^n be compact and f : D → R^m be continuous. Then f is uniformly continuous and f(D) ⊆ R^m is compact.

Corollary 3.31 Let D ⊆ R^n be compact and f : D → R be continuous. Then f attains its maximum and minimum.

Definition 3.32 A subset M of R^n is said to be connected if it has the following property: Let U_1, U_2 be relatively open subsets of M such that U_1 ∩ U_2 = ∅ and U_1 ∪ U_2 = M. Then M = U_1 and U_2 = ∅, or M = U_2 and U_1 = ∅.

Example: Every interval in R is connected.

Theorem 3.33 Let D be a connected subset of R^n and f : D → R^m be continuous. Then f(D) is a connected subset of R^m.

Proof: Let U_1 and U_2 be relatively open subsets of f(D) with U_1 ∩ U_2 = ∅ and U_1 ∪ U_2 = f(D). With suitable open subsets O_1, O_2 of R^m we thus have U_1 = O_1 ∩ f(D) and U_2 = O_2 ∩ f(D), whence the continuity of f implies that f^{−1}(U_1) = f^{−1}(O_1) and f^{−1}(U_2) = f^{−1}(O_2) are relatively open subsets of D satisfying f^{−1}(U_1) ∩ f^{−1}(U_2) = ∅ and f^{−1}(U_1) ∪ f^{−1}(U_2) = D. Thus, since D is connected, it follows that f^{−1}(U_1) = ∅ or f^{−1}(U_2) = ∅, hence U_1 = ∅ or U_2 = ∅. Consequently, f(D) is connected.


Definition 3.34 Let [a, b] be an interval in R and let γ : [a, b] → R^m be continuous. Then γ is called a path in R^m.

Definition 3.35 A subset M of R^n is said to be pathwise connected if any two points in M can be connected by a path in M, i.e. if to x, y ∈ M there is an interval [a, b] and a continuous mapping γ : [a, b] → M such that γ(a) = x and γ(b) = y. γ(a) is called the starting point, γ(b) the end point of γ.

Theorem 3.36 Let D ⊆ R^n be pathwise connected and let f : D → R^m be continuous. Then f(D) is pathwise connected.

Proof: Let u, v ∈ f(D) and let x ∈ f^{−1}(u) and y ∈ f^{−1}(v). Then there is a path γ which connects x with y in D. Thus, f ∘ γ is a path which connects u with v in f(D).

Theorem 3.37 Let M ⊆ R^m be pathwise connected. Then M is connected.

Proof: Suppose that M is not connected. Then there are relatively open subsets U_1 ≠ ∅ and U_2 ≠ ∅ such that U_1 ∩ U_2 = ∅ and U_1 ∪ U_2 = M. Select x ∈ U_1 and y ∈ U_2 and let γ : [a, b] → M be a path connecting x with y. Since M is not connected, it follows that the set γ([a, b]) is not connected. To see this, set

V_1 = γ([a, b]) ∩ U_1,  V_2 = γ([a, b]) ∩ U_2.

Then V_1 and V_2 are relatively open subsets of γ([a, b]) satisfying V_1 ∩ V_2 = ∅ and V_1 ∪ V_2 = γ([a, b]). Therefore, since x ∈ V_1, y ∈ V_2 implies V_1 ≠ ∅, V_2 ≠ ∅, it follows that γ([a, b]) is not connected.

On the other hand, since [a, b] is connected and since γ is continuous, the set γ([a, b]) must be connected. Our assumption has thus led to a contradiction, hence M is connected.

Example: Consider the mapping f : [0, ∞) → R defined by

f(x) = sin(1/x)  for x > 0,
f(x) = 0  for x = 0.

Then M = graph(f) = {(x, f(x)) | x ∈ [0, ∞)} is a subset of R^2 which is connected, but not pathwise connected.


(Figure: the graph of f, oscillating between −1 and 1 as x → 0.)

To prove that M is not pathwise connected, assume the contrary. Then, since (0, 0) ∈ M and (x_0, 1) ∈ M with x_0 = 2/π, a path γ : [a, b] → M exists such that γ(a) = (0, 0) and γ(b) = (x_0, 1). The component functions γ_1 and γ_2 are continuous. Since to every x ≥ 0 a unique y ∈ R exists such that (x, y) ∈ M, namely y = f(x), these component functions satisfy for all c ∈ [a, b]

γ(c) = (γ_1(c), γ_2(c)) = (γ_1(c), f(γ_1(c))),

hence

γ_2 = f ∘ γ_1.

However, this is a contradiction, since f ∘ γ_1 is not continuous. To see this, set

x_n = 1 / (π/2 + 2nπ).

Then {x_n}_{n=1}^∞ is a null sequence with

γ_1(a) = 0 < x_n < x_0 = γ_1(b).

Therefore the intermediate value theorem implies that a sequence {c_n}_{n=1}^∞ exists with a ≤ c_n ≤ b such that

γ_1(c_n) = x_n.

The bounded sequence {c_n}_{n=1}^∞ has a convergent subsequence {c_{n_j}}_{j=1}^∞ with limit

c = lim_{j→∞} c_{n_j} ∈ [a, b].


From the continuity of γ_1 it follows that

γ_1(c) = lim_{j→∞} γ_1(c_{n_j}) = lim_{j→∞} x_{n_j} = lim_{n→∞} x_n = 0,

hence

(f ∘ γ_1)(c) = f(γ_1(c)) = f(0) = 0,

but

lim_{j→∞} (f ∘ γ_1)(c_{n_j}) = lim_{j→∞} f(γ_1(c_{n_j})) = lim_{j→∞} f(x_{n_j}) = lim_{j→∞} sin(π/2 + 2 n_j π) = lim_{j→∞} 1 = 1 ≠ (f ∘ γ_1)(c),

which proves that f ∘ γ_1 is not continuous at c.

To prove that M is connected, assume the contrary. Then there are relatively open subsets U_1, U_2 of M satisfying U_1 ≠ ∅, U_2 ≠ ∅, U_1 ∩ U_2 = ∅, and U_1 ∪ U_2 = M. The set

M′ = {(x, f(x)) | x > 0} ⊆ M

is connected as the image of the connected set (0, ∞) under the continuous map

x ↦ (x, f(x)) : (0, ∞) → R^2.

Consequently, U_1 ∩ M′ = ∅ or U_2 ∩ M′ = ∅. Without restriction of generality we assume that U_1 ∩ M′ = ∅. Then U_2 = M′ and U_1 = {(0, 0)}. However, this is a contradiction, since {(0, 0)} is not relatively open with respect to M. Otherwise an open set O ⊆ R^2 would exist such that {(0, 0)} = M ∩ O, hence (0, 0) ∈ O, and therefore O would contain an ε-neighborhood of (0, 0). Since sin(1/x) has infinitely many zeros in every neighborhood of x = 0, the ε-neighborhood of (0, 0) would contain besides (0, 0) infinitely many points of M on the positive real axis, hence M ∩ O ≠ {(0, 0)}. Consequently, M is connected. This example shows that the converse of the preceding theorem does not hold.

Theorem 3.38 Let D be a compact subset of R^n and f : D → R^m be continuous and injective. Then the inverse f^{−1} : f(D) → D is continuous.

The proof of this theorem is obtained by a slight modification of the proof of Theorem 6.28 in the classroom notes of Analysis I.

Definition 3.39 Let D ⊆ R^n and W ⊆ R^m. A mapping f : D → W is called a homeomorphism if f is bijective, continuous and has a continuous inverse.


3.4 Uniform convergence, the normed spaces of continuous and linear mappings

Definition 3.40 Let D be a nonempty set and let f : D → R^m be bounded. Then

‖f‖_∞ := sup_{x∈D} ‖f(x)‖

is called the supremum norm of f. Here ‖·‖ denotes a norm on R^m.

As for real valued mappings it follows that ‖·‖_∞ is a norm on the vector space B(D, R^m) of bounded mappings from D to R^m, cf. the proof of Theorem 1.8. Therefore, with this norm, B(D, R^m) is a normed space. Of course, the supremum norm on B(D, R^m) depends on the norm on R^m used to define it. However, from the equivalence of all norms on R^m it immediately follows that the supremum norms on B(D, R^m) obtained from different norms on R^m are equivalent. Therefore the following definition does not depend on the supremum norm chosen:

Definition 3.41 Let D be a nonempty set and let {f_k}_{k=1}^∞ be a sequence of functions f_k ∈ B(D, R^m). The sequence {f_k}_{k=1}^∞ is said to converge uniformly if f ∈ B(D, R^m) exists such that

lim_{k→∞} ‖f_k − f‖_∞ = 0.

Theorem 3.42 A sequence {f_k}_{k=1}^∞ with f_k ∈ B(D, R^m) converges uniformly if and only if to every ε > 0 there is k_0 ∈ N such that for all k, l ≥ k_0

‖f_k − f_l‖_∞ < ε.

(Cauchy convergence criterion.) This theorem is proved as Corollary 1.5.

Definition 3.43 A normed vector space with the property that every Cauchy sequence converges is called a complete normed space or a Banach space (Stefan Banach, 1892–1945).

Corollary 3.44 The space B(D, R^m) with the supremum norm is a Banach space.

Theorem 3.45 Let D ⊆ R^n and let {f_k}_{k=1}^∞ be a sequence of continuous functions f_k ∈ B(D, R^m) which converges uniformly to f ∈ B(D, R^m). Then f is continuous.
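Uniform convergence is ordinary convergence with respect to ‖·‖_∞. As a numerical sketch (the grid-based helper `sup_norm` is our own name and only approximates the true supremum from below), the sequence f_k(x) = sin(kx)/k converges uniformly to 0 on [0, 1], since ‖f_k‖_∞ ≤ 1/k:

```python
import math

def sup_norm(g, grid):
    # approximate ||g||_inf by the maximum of |g| over a finite grid
    return max(abs(g(x)) for x in grid)

grid = [i / 1000.0 for i in range(1001)]   # sample points in [0, 1]

# f_k(x) = sin(k x)/k: since |sin| <= 1, we get ||f_k||_inf <= 1/k -> 0,
# i.e. uniform convergence to the zero function.
for k in (1, 10, 100):
    fk = lambda x, k=k: math.sin(k * x) / k
    assert sup_norm(fk, grid) <= 1.0 / k + 1e-12
```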


This theorem is proved as Corollary 1.5. For a subset D of R^n we denote by C(D, R^m) the set of all continuous functions from D to R^m. This is a linear subspace of the vector space of all functions from D to R^m. Also the set of all bounded continuous functions C(D, R^m) ∩ B(D, R^m) is a vector space. As a subspace of B(D, R^m) it is a normed space with the supremum norm. From the preceding theorem we obtain the following important result:

Corollary 3.46 For D ⊆ R^n the normed space C(D, R^m) ∩ B(D, R^m) is complete, hence it is a Banach space.

Proof: Let {f_k}_{k=1}^∞ be a Cauchy sequence in C(D, R^m) ∩ B(D, R^m). Then this sequence converges with respect to the supremum norm to a function f ∈ B(D, R^m). The preceding theorem implies that f ∈ C(D, R^m), since f_k ∈ C(D, R^m) for all k. Thus, f ∈ C(D, R^m) ∩ B(D, R^m), and {f_k}_{k=1}^∞ converges with respect to the supremum norm to f. Therefore every Cauchy sequence converges in C(D, R^m) ∩ B(D, R^m), hence this space is complete.

By L(R^n, R^m) we denote the set of all linear mappings f : R^n → R^m. Since for linear mappings f, g and for a real number c the mappings f + g and cf are linear, L(R^n, R^m) is a vector space.

Theorem 3.47 Let f : R^n → R^m be linear. Then f is continuous. If f differs from zero, then f is unbounded.

Proof: To f there exists a unique m×n-matrix (a_ij), i = 1, …, m, j = 1, …, n, such that

f_1(x_1, …, x_n) = a_11 x_1 + … + a_1n x_n
⋮
f_m(x_1, …, x_n) = a_m1 x_1 + … + a_mn x_n.

Since every one of the expressions on the right depends continuously on x = (x_1, …, x_n), it follows that all component functions of f are continuous, hence f is continuous.

If f differs from 0, there is x ∈ R^n with f(x) ≠ 0. From the linearity we then obtain for λ ∈ R

‖f(λx)‖ = ‖λ f(x)‖ = |λ| ‖f(x)‖,

which can be made larger than any constant by choosing |λ| sufficiently large. Hence f is not bounded.


We want to define a norm on the linear space L(R^n, R^m). It is not possible to use the supremum norm, since every linear mapping f ≠ 0 is unbounded; hence the supremum of the set

{‖f(x)‖ | x ∈ R^n}

does not exist. Instead, on L(R^n, R^m) a norm can be defined as follows: Let B = {x ∈ R^n | ‖x‖ ≤ 1} be the closed unit ball in R^n. The set B is bounded and closed, hence compact. Thus, since f ∈ L(R^n, R^m) is continuous and since every continuous map is bounded on compact sets, the supremum

‖f‖ := sup_{x∈B} ‖f(x)‖

exists. The following lemma shows that the mapping ‖·‖ : L(R^n, R^m) → [0, ∞) thus defined is a norm:

Lemma 3.48 Let f, g : R^n → R^m be linear, let c ∈ R and x ∈ R^n. Then

(i) f = 0 ⟺ ‖f‖ = 0,
(ii) ‖cf‖ = |c| ‖f‖,
(iii) ‖f + g‖ ≤ ‖f‖ + ‖g‖,
(iv) ‖f(x)‖ ≤ ‖f‖ ‖x‖.

Proof: We first prove (iv). For x = 0 the linearity of f implies f(x) = 0, whence ‖f(x)‖ = 0 ≤ ‖f‖ ‖x‖. For x ≠ 0 we have ‖x/‖x‖‖ = 1, hence x/‖x‖ ∈ B. Therefore the linearity of f yields

‖f(x)‖ = ‖f(‖x‖ (x/‖x‖))‖ = ‖x‖ ‖f(x/‖x‖)‖ ≤ ‖x‖ sup_{y∈B} ‖f(y)‖ = ‖x‖ ‖f‖.

To prove (i), let f = 0. Then ‖f‖ = sup_{x∈B} ‖f(x)‖ = 0. On the other hand, if ‖f‖ = 0, we conclude from (iv) for all x ∈ R^n that

‖f(x)‖ ≤ ‖f‖ ‖x‖ = 0,

hence f(x) = 0, and therefore f = 0. (ii) and (iii) are proved just as the corresponding properties for the supremum norm in Theorem 1.8.


Definition 3.49 For f ∈ L(R^n, R^m),

‖f‖ = sup_{‖x‖≤1} ‖f(x)‖

is called the operator norm of f.

With this norm L(R^n, R^m) is a normed vector space. To every linear mapping A : R^n → R^m there is associated a unique m×n-matrix, which we also denote by A, such that A(x) = Ax. Here Ax denotes the matrix multiplication. The question arises whether the operator norm ‖A‖ can be computed from the elements of the matrix A. To give a partial answer, we define for A = (a_ij)

‖A‖_∞ = max_{i=1,…,m; j=1,…,n} |a_ij|.

Theorem 3.50 There exist constants c, C > 0 such that for every A ∈ L(R^n, R^m)

c ‖A‖_∞ ≤ ‖A‖ ≤ C ‖A‖_∞.

Proof: Define the mapping T : L(R^n, R^m) → R^{n·m} by

T(A) = (a_11, …, a_1n, a_21, …, a_2n, …, a_m1, …, a_mn) ∈ R^{n·m}

for A = (a_ij) ∈ L(R^n, R^m). We see immediately that T is a linear isomorphism. For a ∈ R^{n·m} we set

‖a‖_* = ‖T^{−1} a‖,

where the norm on the right hand side is the operator norm. Obviously ‖·‖_* is a norm on R^{n·m}. Since all norms on R^{n·m} are equivalent, there exist constants c, C > 0 such that

c ‖a‖_∞ ≤ ‖a‖_* ≤ C ‖a‖_∞  for all a ∈ R^{n·m},

which by definition of T implies for A ∈ L(R^n, R^m) that

c ‖A‖_∞ = c ‖T(A)‖_∞ ≤ ‖T(A)‖_* = ‖A‖ ≤ C ‖T(A)‖_∞ = C ‖A‖_∞.
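Theorem 3.50 can be probed numerically. When both R^n and R^m carry the Euclidean norm, one may take c = 1 and C = √(n·m): every entry satisfies |a_ij| ≤ ‖A e_j‖ ≤ ‖A‖, and ‖A‖ is bounded by the square root of the sum of all squared entries. The following Python sketch (helper names ours; the sampled maximum is only a lower bound for the true operator norm) verifies the sandwich for one example matrix:

```python
import math
import random

def op_norm_estimate(A, samples=2000):
    # Lower bound for sup_{|x|_2 <= 1} |Ax|_2 by sampling unit vectors.
    # The standard basis vectors are included, so the estimate is at
    # least max_j |A e_j|_2 and hence at least max_{i,j} |a_ij|.
    n = len(A[0])
    def image_norm(x):
        return math.sqrt(sum(sum(row[j] * x[j] for j in range(n)) ** 2
                             for row in A))
    unit_vectors = [[1.0 if j == i else 0.0 for j in range(n)]
                    for i in range(n)]
    rng = random.Random(0)
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = math.sqrt(sum(t * t for t in x))
        if r > 0.0:
            unit_vectors.append([t / r for t in x])
    return max(image_norm(x) for x in unit_vectors)

A = [[2.0, -1.0], [0.0, 3.0]]                       # example 2x2 matrix
max_entry = max(abs(a) for row in A for a in row)   # ||A||_inf = 3
est = op_norm_estimate(A)
assert max_entry <= est + 1e-9                 # c = 1
assert est <= math.sqrt(4) * max_entry + 1e-9  # C = sqrt(n*m) = 2
```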


4 Differentiable mappings on R^n

4.1 Definition of the derivative

The derivative of a real function f at a satisfies the equation

f(x) = f(a) + f′(a)(x − a) + r(x)(x − a),

where the function r is continuous at a and satisfies r(a) = 0. Since x ↦ f′(a)x is a linear map from R to R, the interpretation of this equation is that among all affine maps x ↦ f(a) + T(x − a), where T : R → R is linear, the one obtained by choosing T(x) = f′(a)x is the best approximation of the function f in a neighborhood of a.

Viewed in this way, the notion of the derivative can be generalized immediately to mappings f : D → R^m with D ⊆ R^n. Thus, the derivative of f at a ∈ D is the linear map T : R^n → R^m such that among all affine functions the mapping x ↦ f(a) + T(x − a) approximates f best in a neighborhood of a.

(Figure: the graph of a function f : R^2 → R with its tangential plane at the point (a, f(a)).)

For a mapping f : R^2 → R this means that the linear mapping T : R^2 → R, the derivative of f at a, must be chosen such that the graph of the mapping x ↦ f(a) + T(x − a) is equal to the tangential plane of the graph of f at (a, f(a)).

This idea leads to the following rigorous definition of a differentiable function:

Definition 4.1 Let U be an open subset of R^n. A function f : U → R^m is said to be differentiable at the point a ∈ U if there is a linear mapping T : R^n → R^m and a function r : U → R^m, which is continuous at a and satisfies r(a) = 0, such that for all x ∈ U

f(x) = f(a) + T(x − a) + r(x) ‖x − a‖.


Therefore, to verify that f is differentiable at a ∈ U, a linear mapping T : R^n → R^m must be found such that the function r defined by

r(x) := (f(x) − f(a) − T(x − a)) / ‖x − a‖

satisfies

lim_{x→a} r(x) = 0.

Later we show how T can be found. However, there is at most one such T:

Lemma 4.2 The linear mapping T is uniquely determined.

Proof: Let T_1, T_2 : R^n → R^m be linear mappings and r_1, r_2 : U → R^m be functions with lim_{x→a} r_1(x) = lim_{x→a} r_2(x) = 0, such that for x ∈ U

f(x) = f(a) + T_1(x − a) + r_1(x) ‖x − a‖,
f(x) = f(a) + T_2(x − a) + r_2(x) ‖x − a‖.

Then

(T_1 − T_2)(x − a) = (r_2(x) − r_1(x)) ‖x − a‖.

Let h ∈ R^n. Then x = a + th ∈ U for all sufficiently small t > 0 since U is open, whence

(T_1 − T_2)(th) = t (T_1 − T_2)(h) = (r_2(a + th) − r_1(a + th)) ‖th‖,

thus

(T_1 − T_2)(h) = lim_{t→0} (r_2(a + th) − r_1(a + th)) ‖h‖ = 0.

This implies T_1 = T_2, since h ∈ R^n was chosen arbitrarily.

Definition 4.3 Let U ⊆ R^n be open and let f : U → R^m be differentiable at a ∈ U. Then the unique linear mapping T : R^n → R^m, for which a function r : U → R^m satisfying lim_{x→a} r(x) = 0 exists such that

f(x) = f(a) + T(x − a) + r(x) ‖x − a‖

holds for all x ∈ U, is called the derivative of f at a. This linear mapping is denoted by

f′(a) = T.
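Definition 4.3 can be explored numerically for a sample map. In the Python sketch below, the quadratic map and its candidate linear map T (written out explicitly; one can check it is the derivative by computing the partial derivatives) are chosen for illustration; the remainder r(h) = (f(a + h) − f(a) − Th)/‖h‖ then shrinks linearly with ‖h‖:

```python
import math

def f(x1, x2):
    # sample quadratic map R^2 -> R^2, chosen for illustration
    return (x1 ** 2 - x2 ** 2, 2.0 * x1 * x2)

def T(a1, a2, h1, h2):
    # candidate derivative at a = (a1, a2), applied to h = (h1, h2):
    # the linear map with matrix [[2a1, -2a2], [2a2, 2a1]]
    return (2 * a1 * h1 - 2 * a2 * h2, 2 * a2 * h1 + 2 * a1 * h2)

def remainder(a1, a2, h1, h2):
    # r(h) = (f(a + h) - f(a) - T h) / ||h||  (Euclidean norm)
    norm_h = math.hypot(h1, h2)
    fa = f(a1, a2)
    fah = f(a1 + h1, a2 + h2)
    Th = T(a1, a2, h1, h2)
    return tuple((fah[i] - fa[i] - Th[i]) / norm_h for i in range(2))

a = (1.0, 2.0)
# ||r(h)|| = O(||h||): here r(h) = ((h1^2 - h2^2)/||h||, 2 h1 h2/||h||)
for s in (1e-1, 1e-2, 1e-3):
    r = remainder(*a, s, s)
    assert math.hypot(*r) <= 2.0 * s
```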


Mostly we drop the brackets around the argument and write T(h) = Th = f′(a)h.

For a real valued function f the derivative is a linear mapping f′(a) : R^n → R. Such linear mappings are also called linear forms. In this case f′(a) can be represented by a 1×n-matrix, and we normally identify f′(a) with this matrix. The transpose [f′(a)]^T of this 1×n-matrix is an n×1-matrix, a column vector. For this transpose one uses the notation

grad f(a) = [f′(a)]^T.

grad f(a) is called the gradient of f at a. With the scalar product on R^n the gradient can be used to represent the derivative of f: For h ∈ R^n we have

f′(a)h = (grad f(a)) · h.

If h ∈ R^n is a unit vector and if t runs through R, then the point th moves along the straight line through the origin with direction h. A differentiable real function is defined by

t ↦ (grad f(a)) · (th) = t (grad f(a) · h).

Its derivative is grad f(a) · h, and this derivative attains the maximum value

grad f(a) · h = |grad f(a)|

if h has the direction of grad f(a). Since f(a) + grad f(a) · (th) = f(a) + f′(a)(th) approximates the value f(a + th), it follows that the vector grad f(a) points into the direction of steepest ascent of the function f at a, and the length of grad f(a) determines the slope of f in this direction.

Lemma 4.4 Let U ⊆ R^n be an open set. The function f : U → R^m is differentiable at a ∈ U if and only if all component functions f_1, …, f_m : U → R are differentiable at a. The derivatives satisfy

(f_j)′(a) = (f′(a))_j,  j = 1, …, m.

Proof: If the derivative f′(a) exists, then the components satisfy

lim_{h→0} (f_j(a + h) − f_j(a) − (f′(a))_j h) / ‖h‖ = 0.

Since (f′(a))_j : R^n → R is linear, it follows that f_j is differentiable at a with derivative (f_j)′(a) = (f′(a))_j. Conversely, if the derivative (f_j)′(a) of f_j exists at a for all j =


1, …, m, then a linear mapping T : R^n → R^m is defined by

Th = ((f_1)′(a)h, …, (f_m)′(a)h),

for which

lim_{h→0} (f(a + h) − f(a) − Th) / ‖h‖ = 0.

Thus, f is differentiable at a with derivative f′(a) = T.

4.2 Directional derivatives and partial derivatives

Let U ⊆ R^n be an open set, let a ∈ U and let f : U → R^m. Let v ∈ R^n be a given vector. Since U is open, there is δ > 0 such that a + tv ∈ U for all t ∈ R with |t| < δ; hence f(a + tv) is defined for all such t. If t runs through the interval (−δ, δ), then a + tv runs through a line segment passing through a which has the direction of the vector v.

Definition 4.5 We call the limit

D_v f(a) = lim_{t→0} (f(a + tv) − f(a)) / t

the derivative of f at a in the direction of the vector v, if this limit exists.

It is possible that the directional derivative D_v f(a) exists even if f is not differentiable at a. Also, it can happen that the derivative of f at a exists in the direction of some vectors, and does not exist in the direction of other vectors. In any case, the directional derivative contains useful information about the function f. However, if f is differentiable at a, then all directional derivatives of f exist at a:

Lemma 4.6 Let U ⊆ R^n be open, let a ∈ U and let f : U → R^m be differentiable at a. Then the directional derivative D_v f(a) exists for every v ∈ R^n and satisfies

D_v f(a) = f′(a)v.

Proof: Set x = a + tv with t ∈ R, t ≠ 0. Then by definition of the derivative f′(a),

f(a + tv) = f(a) + f′(a)(tv) + r(a + tv) |t| ‖v‖,

hence

(f(a + tv) − f(a)) / t = f′(a)v + r(a + tv) (|t|/t) ‖v‖.


Since |t|/t = ±1 and since lim_{t→0} r(a + tv) = r(a) = 0, it follows that lim_{t→0} r(a + tv) (|t|/t) ‖v‖ = 0, hence

lim_{t→0} (f(a + tv) − f(a)) / t = f′(a)v.

This result can be used to compute f′(a): If v_1, …, v_n is a basis of R^n, then every vector v ∈ R^n can be represented as a linear combination v = Σ_{i=1}^n α_i v_i of the basis vectors with uniquely determined numbers α_i ∈ R. The linearity of f′(a) thus yields

f′(a)v = f′(a)(Σ_{i=1}^n α_i v_i) = Σ_{i=1}^n α_i f′(a)v_i = Σ_{i=1}^n α_i D_{v_i} f(a).

Therefore f′(a) is known if the directional derivatives D_{v_i} f(a) for the basis vectors are known. It suggests itself to use the standard basis e_1, …, e_n. The directional derivative D_{e_i} f(a) is called the i-th partial derivative of f at a. For the i-th partial derivative one uses the notations

D_i f,  ∂f/∂x_i,  f_{x_i},  f′_{x_i},  f_{|i}.

For i = 1, …, n and j = 1, …, m we have

∂f/∂x_i (a) = lim_{t→0} (f(a + t e_i) − f(a)) / t = lim_{x_i→a_i} (f(a_1, …, x_i, …, a_n) − f(a_1, …, a_i, …, a_n)) / (x_i − a_i),

∂f_j/∂x_i (a) = lim_{x_i→a_i} (f_j(a_1, …, x_i, …, a_n) − f_j(a_1, …, a_i, …, a_n)) / (x_i − a_i).

Consequently, to compute partial derivatives the differential calculus for functions of one real variable suffices.

To construct f′(a) from the partial derivatives one proceeds as follows: If f′(a) exists, then all the partial derivatives D_i f(a) = ∂f/∂x_i (a) exist. For arbitrary h ∈ R^n we have h = Σ_{i=1}^n h_i e_i, where h_i ∈ R are the components of h, hence

f′(a)h = f′(a)(Σ_{i=1}^n h_i e_i) = Σ_{i=1}^n (f′(a)e_i) h_i = Σ_{i=1}^n D_i f(a) h_i,

or, in matrix notation, f′(a)h is the product of the m×n-matrix with entries D_j f_i(a) and the column vector (h_1, …, h_n):

(f′(a)h)_i = D_1 f_i(a) h_1 + … + D_n f_i(a) h_n,  i = 1, …, m.


Thus,

f′(a) = (D_j f_i(a))_{i=1,…,m; j=1,…,n} = (∂f_i/∂x_j (a))_{i=1,…,m; j=1,…,n}

is the representation of f′(a) as an m×n-matrix belonging to the standard bases e_1, …, e_n of R^n and e_1, …, e_m of R^m. This matrix is called the Jacobi matrix of f at a (Carl Gustav Jacob Jacobi, 1804–1851).

It is possible that all partial derivatives exist at a without f being differentiable at a. Then the Jacobi matrix can be formed, but it does not represent the derivative f′(a), which does not exist. Therefore, to check whether f is differentiable at a, one first verifies that all partial derivatives exist at a. This is a necessary condition for the existence of f′(a). Then one forms the Jacobi matrix

T = (∂f_i/∂x_j (a))_{i=1,…,m; j=1,…,n}

and tests whether for this matrix

lim_{h→0} (f(a + h) − f(a) − Th) / ‖h‖ = 0

holds. If this holds, then f is differentiable at a with derivative f′(a) = T.

Examples

1.) Let f : R^2 → R^2 be defined by

f(x_1, x_2) = (f_1(x_1, x_2), f_2(x_1, x_2)) = (x_1^2 − x_2^2, 2 x_1 x_2).

At a = (a_1, a_2) ∈ R^2 the Jacobi matrix is

T = ( ∂f_1/∂x_1 (a)  ∂f_1/∂x_2 (a) ; ∂f_2/∂x_1 (a)  ∂f_2/∂x_2 (a) ) = ( 2a_1  −2a_2 ; 2a_2  2a_1 ),

where the semicolon separates the rows. To test the differentiability of f at a, set for h = (h_1, h_2) ∈ R^2 and i = 1, 2

r_i(h) = (f_i(a + h) − f_i(a) − T_i h) / ‖h‖,

hence

r_1(h) = ((a_1 + h_1)^2 − (a_2 + h_2)^2 − a_1^2 + a_2^2 − 2a_1 h_1 + 2a_2 h_2) / ‖h‖ = (h_1^2 − h_2^2) / ‖h‖,

r_2(h) = (2(a_1 + h_1)(a_2 + h_2) − 2a_1 a_2 − 2a_2 h_1 − 2a_1 h_2) / ‖h‖ = 2 h_1 h_2 / ‖h‖.


Using the maximum norm ‖·‖ = ‖·‖_∞, we obtain

    |r_1(h)| ≤ 2‖h‖_∞ ,   |r_2(h)| ≤ 2‖h‖_∞ ,

thus

    lim_{h→0} ‖r(h)‖_∞ = lim_{h→0} ‖( r_1(h), r_2(h) )‖_∞ ≤ lim_{h→0} 2‖h‖_∞ = 0 .

Therefore f is differentiable at a. Since a was arbitrary, f is everywhere differentiable, i.e. f is differentiable.

2.) Let the affine map f : R^n → R^m be defined by

    f(x) = Ax + c ,

where c ∈ R^m and A : R^n → R^m is linear. Then f is differentiable with derivative f′(a) = A for all a ∈ R^n. For,

    ( f(a + h) − f(a) − Ah ) / ‖h‖ = ( A(a + h) + c − Aa − c − Ah ) / ‖h‖ = 0 .

3.) Let f : R² → R be defined by

    f(x_1, x_2) = 0 ,                                for (x_1, x_2) = 0 ,
    f(x_1, x_2) = |x_1| x_2 / √(x_1² + x_2²) ,       for (x_1, x_2) ≠ 0 .

This function is not differentiable at a = 0, but it has all the directional derivatives at 0. To see that all directional derivatives exist, let v = (v_1, v_2) be a vector from R² different from zero. Then

    D_v f(0) = lim_{t→0} ( f(tv) − f(0) ) / t = lim_{t→0} ( t|t| |v_1| v_2 ) / ( t|t| √(v_1² + v_2²) ) = |v_1| v_2 / √(v_1² + v_2²) .

To see that f is not differentiable at 0, note that the partial derivatives satisfy

    ∂f/∂x_1 (0) = 0 ,   ∂f/∂x_2 (0) = 0 .

Therefore, if f were differentiable at 0, the derivative would have to be

    f′(0) = ( ∂f/∂x_1(0)  ∂f/∂x_2(0) ) = ( 0  0 ) .

Consequently, all directional derivatives would satisfy

    D_v f(0) = f′(0)v = 0 .


Yet, the preceding calculation yields for the derivative in the direction of the diagonal vector v = (1, 1) that

    D_v f(0) = 1/√2 .

Therefore f′(0) cannot exist.

We note that |f(x_1, x_2)| = |x_1 x_2| / |x| ≤ |x|, which implies that f is continuous at 0.

4.3 Elementary properties of differentiable mappings

In the preceding example f was not differentiable at 0, but had all the directional derivatives and was continuous at 0. Here is an example of a function f : R² → R which has all the directional derivatives at 0, yet is not continuous at 0: f is defined by

    f(x_1, x_2) = 0 ,                                for (x_1, x_2) = 0 ,
    f(x_1, x_2) = x_1 x_2² / (x_1² + x_2⁶) ,         for (x_1, x_2) ≠ 0 .

To see that all directional derivatives exist at 0, let v = (v_1, v_2) ∈ R² with v ≠ 0. Then

    D_v f(0) = lim_{t→0} ( f(tv) − f(0) ) / t = lim_{t→0} v_1 v_2² / ( v_1² + t⁴ v_2⁶ )
             = v_2² / v_1 ,   if v_1 ≠ 0 ,
             = 0 ,            if v_1 = 0 .

Yet, for h = (h_1, √h_1) with h_1 > 0 we have

    lim_{h_1→0} f(h) = lim_{h_1→0} h_1² / ( h_1² + h_1³ ) = lim_{h_1→0} 1 / (1 + h_1) = 1 ≠ f(0) .

Therefore f is not continuous at 0. Together with the next result we obtain as a consequence that f is not differentiable at 0:

Theorem 4.7 Let U be an open subset of R^n, let a ∈ U and let f : U → R^m be differentiable at a. Then there is a constant c > 0 such that for all x from a neighborhood of a

    ‖f(x) − f(a)‖ ≤ c‖x − a‖ .

In particular, f is continuous at a.

Proof: We have

    f(x) = f(a) + f′(a)(x − a) + r(x) ‖x − a‖ ,

whence, with the operator norm ‖f′(a)‖ of the linear mapping f′(a) : R^n → R^m,

    ‖f(x) − f(a)‖ ≤ ‖f′(a)‖ ‖x − a‖ + ‖r(x)‖ ‖x − a‖ .


Since lim_{x→a} r(x) = 0, there is δ > 0 such that

    ‖r(x)‖ ≤ 1

for all x ∈ U with ‖x − a‖ < δ, whence for these x

    ‖f(x) − f(a)‖ ≤ ( ‖f′(a)‖ + 1 ) ‖x − a‖ = c ‖x − a‖ ,

with c = ‖f′(a)‖ + 1. In particular, this implies

    lim_{x→a} ‖f(x) − f(a)‖ ≤ lim_{x→a} c‖x − a‖ = 0 ,

whence f is continuous at a.

Theorem 4.8 Let U ⊆ R^n be open and a ∈ U. If f : U → R^m and g : U → R^m are differentiable at a, then also f + g and cf are differentiable at a for all c ∈ R, and

    (f + g)′(a) = f′(a) + g′(a) ,   (cf)′(a) = c f′(a) .

Proof: We have for h ∈ R^n with a + h ∈ U

    f(a + h) = f(a) + f′(a)h + r_1(a + h) ‖h‖ ,   lim_{h→0} r_1(a + h) = 0 ,
    g(a + h) = g(a) + g′(a)h + r_2(a + h) ‖h‖ ,   lim_{h→0} r_2(a + h) = 0 .

Thus

    (f + g)(a + h) = (f + g)(a) + ( f′(a) + g′(a) ) h + (r_1 + r_2)(a + h) ‖h‖

with lim_{h→0} (r_1 + r_2)(a + h) = 0. Consequently f + g is differentiable at a with derivative (f + g)′(a) = f′(a) + g′(a). The statement for cf follows in the same way.

Theorem 4.9 (Product rule) Let U ⊆ R^n be open and let f, g : U → R be differentiable at a ∈ U. Then f · g : U → R is differentiable at a with derivative

    (f · g)′(a) = f(a) g′(a) + g(a) f′(a) .

Proof: We have for a + h ∈ U

    (f · g)(a + h) = ( f(a) + f′(a)h + r_1(a + h) ‖h‖ ) · ( g(a) + g′(a)h + r_2(a + h) ‖h‖ )
                  = (f · g)(a) + f(a) g′(a)h + g(a) f′(a)h + r(a + h) ‖h‖ ,


where

    r(a + h) ‖h‖ = ( f′(a)h ) ( g′(a) h/‖h‖ ) ‖h‖ + ( g(a) + g′(a)h ) r_1(a + h) ‖h‖
                   + ( f(a) + f′(a)h ) r_2(a + h) ‖h‖ + r_1(a + h) r_2(a + h) ‖h‖² .

The absolute value is a norm on R. Since r(a + h) ∈ R, we thus obtain with the operator norms ‖f′(a)‖, ‖g′(a)‖,

    lim_{h→0} |r(a + h)| ≤ lim_{h→0} [ ‖f′(a)‖ ‖h‖ ‖g′(a)‖
                            + ( |g(a)| + ‖g′(a)‖ ‖h‖ ) |r_1(a + h)|
                            + ( |f(a)| + ‖f′(a)‖ ‖h‖ ) |r_2(a + h)|
                            + |r_1(a + h)| |r_2(a + h)| ‖h‖ ] = 0 .

Since f(a) g′(a)h + g(a) f′(a)h = ( f(a) g′(a) + g(a) f′(a) ) h, it follows that f · g is differentiable at a with derivative given by this linear mapping.

Theorem 4.10 (Chain rule) Let U ⊆ R^p and V ⊆ R^n be open, let f : U → V and g : V → R^m. Suppose that a ∈ U, that f is differentiable at a and that g is differentiable at b = f(a). Then g ∘ f : U → R^m is differentiable at a with derivative

    (g ∘ f)′(a) = g′( f(a) ) ∘ f′(a) .

Remark: Since g′(b) and f′(a) can be represented by matrices, g′(b) ∘ f′(a) can also be written as g′(b) f′(a), employing matrix multiplication.

Proof: For brevity we set

    T_2 = g′(b) ,   T_1 = f′(a) ,

and for h ∈ R^p with a + h ∈ U

    R(h) = (g ∘ f)(a + h) − (g ∘ f)(a) − T_2 T_1 h .

The statement of the theorem follows if it can be shown that

    lim_{h→0} ‖R(h)‖ / ‖h‖ = 0 .

We have for x ∈ U and y ∈ V

    f(x) − f(a) − T_1(x − a) = r_1(x − a) ‖x − a‖ ,   lim_{h→0} r_1(h) = 0 ,
    g(y) − g(b) − T_2(y − b) = r_2(y − b) ‖y − b‖ ,   lim_{k→0} r_2(k) = 0 .


Since T_2 is linear, we thus obtain for x = a + h and y = f(a + h)

    R(h) = g( f(a + h) ) − g( f(a) ) − T_2( f(a + h) − f(a) ) + T_2( f(a + h) − f(a) − T_1 h )
         = r_2( f(a + h) − f(a) ) ‖f(a + h) − f(a)‖ + T_2( r_1(h) ‖h‖ ) ,

which yields

    lim_{h→0} ‖R(h)‖ / ‖h‖ ≤ lim_{h→0} (1/‖h‖) [ ‖r_2( f(a + h) − f(a) )‖ ‖f(a + h) − f(a)‖ + ‖T_2( r_1(h) ‖h‖ )‖ ] .

Since f is differentiable at a, for ‖h‖ sufficiently small the estimate ‖f(a + h) − f(a)‖ ≤ c‖h‖ holds, cf. Theorem 4.7. Therefore, with the operator norm ‖T_2‖ we conclude that

    lim_{h→0} ‖R(h)‖ / ‖h‖ ≤ lim_{h→0} [ ‖r_2( f(a + h) − f(a) )‖ c + ‖T_2‖ ‖r_1(h)‖ ] = 0 .

For the Jacobi matrices of f : U → R^n, g : V → R^m and h = g ∘ f : U → R^m we thus obtain

    ( ∂h_j/∂x_i (a) )_{j=1,…,m; i=1,…,p} = ( ∂g_j/∂y_k (b) )_{j=1,…,m; k=1,…,n} ( ∂f_k/∂x_i (a) )_{k=1,…,n; i=1,…,p} .

Thus,

    ∂h_j/∂x_i (a) = Σ_{k=1}^n ∂g_j/∂y_k (b) ∂f_k/∂x_i (a) ,   i = 1, …, p ,  j = 1, …, m .

Corollary 4.11 Let U be an open subset of R^n, let a ∈ U and let f : U → R be differentiable at a and satisfy f(a) ≠ 0. Then 1/f is differentiable at a with derivative

    (1/f)′(a) = −( 1/f(a)² ) f′(a) .

Proof: Consider the differentiable function g : R\{0} → R defined by g(x) = 1/x. Then

    1/f = g ∘ f : { x ∈ U | f(x) ≠ 0 } → R

is differentiable at a with derivative


    (1/f)′(a) = g′( f(a) ) f′(a) = −( 1/f(a)² ) f′(a) .

Assume that U and V are open subsets of R^n and that f : U → V is an invertible map with inverse f⁻¹ : V → U. If a ∈ U, if f is differentiable at a and if f⁻¹ is differentiable at b = f(a) ∈ V, then the derivative (f⁻¹)′(b) can be computed from f′(a) using the chain rule. To see this, note that

    f⁻¹ ∘ f = id_U .

The identity mapping id_U is obtained as the restriction of the identity mapping id_{R^n} to U. Since id_{R^n} is linear, it follows that id_U is differentiable at every x ∈ U with derivative (id_U)′(x) = id_{R^n}. Consequently

    id_{R^n} = (id_U)′(a) = (f⁻¹ ∘ f)′(a) = (f⁻¹)′(b) f′(a) .

From linear algebra we know that this equation implies that (f⁻¹)′(b) is the inverse of f′(a). Consequently, ( f′(a) )⁻¹ exists and

    (f⁻¹)′(b) = ( f′(a) )⁻¹ ,

or

    (f⁻¹)′(b) = [ f′( f⁻¹(b) ) ]⁻¹ .

Thus, if one assumes that f′(a) exists and that the inverse mapping is differentiable at f(a), one can conclude that the linear mapping f′(a) is invertible. On the other hand, if one assumes that f′(a) exists and is invertible and that the inverse mapping is continuous at f(a), one can conclude that the inverse mapping is differentiable at f(a). This is shown in the following theorem. We remark that the linear mapping f′(a) is invertible if and only if the determinant det f′(a) differs from zero, where f′(a) is identified with the n×n matrix representing the linear mapping f′(a).

Theorem 4.12 Let U ⊆ R^n be an open subset, let a ∈ U and let f : U → R^n be one-to-one. If f is differentiable at a with invertible derivative f′(a), if the range f(U) contains a neighborhood of b = f(a), and if the inverse mapping f⁻¹ : f(U) → U of f is continuous at b, then f⁻¹ is differentiable at b with derivative

    (f⁻¹)′(b) = ( f′(a) )⁻¹ = ( f′( f⁻¹(b) ) )⁻¹ .

Proof: For brevity we set g = f⁻¹. First it is shown that there is a neighborhood V ⊆ f(U) of b and a constant c > 0 such that

    ‖g(y) − g(b)‖ / ‖y − b‖ ≤ c    (∗)


for all y ∈ V. Since f is differentiable at a, we have for x ∈ U

    f(x) − f(a) = f′(a)(x − a) + r(x) ‖x − a‖ ,    (∗∗)

where r is continuous at a and satisfies r(a) = 0. Let y ∈ f(U). Employing (∗∗) with x = g(y) and noting that b = f(a), we obtain from the inverse triangle inequality that

    ‖g(y) − g(b)‖ / ‖y − b‖ = ‖g(y) − a‖ / ‖f( g(y) ) − f(a)‖
      = ‖g(y) − a‖ / ‖ f′(a)( g(y) − a ) + r( g(y) ) ‖g(y) − a‖ ‖
      ≤ ‖( f′(a) )⁻¹ f′(a)( g(y) − a )‖ / ( ‖f′(a)( g(y) − a )‖ − ‖r( g(y) )‖ ‖( f′(a) )⁻¹ f′(a)( g(y) − a )‖ )
      ≤ ‖( f′(a) )⁻¹‖ ‖f′(a)( g(y) − a )‖ / ( ‖f′(a)( g(y) − a )‖ ( 1 − ‖r( g(y) )‖ ‖( f′(a) )⁻¹‖ ) )
      = ‖( f′(a) )⁻¹‖ / ( 1 − ‖r( g(y) )‖ ‖( f′(a) )⁻¹‖ ) .

The inequality (∗) is obtained from this estimate. To see this, note that by assumption g is continuous at b and that r is continuous at a = g(b), hence r ∘ g is continuous at b. Thus,

    lim_{y→b} r( g(y) ) = r( g(b) ) = r(a) = 0 .

Using (∗) the theorem can be proved as follows: we have to show that

    lim_{y→b} ( g(y) − g(b) − ( f′(a) )⁻¹ (y − b) ) / ‖y − b‖ = 0 .

Employing (∗∗) again,

    ( g(y) − a − ( f′(a) )⁻¹ (y − b) ) / ‖y − b‖
      = ( g(y) − a − ( f′(a) )⁻¹ ( f( g(y) ) − f(a) ) ) / ‖y − b‖
      = ( g(y) − a − ( f′(a) )⁻¹ ( f′(a)( g(y) − a ) + r( g(y) ) ‖g(y) − a‖ ) ) / ‖y − b‖
      = −( f′(a) )⁻¹ ( r( g(y) ) ) ‖g(y) − a‖ / ‖y − b‖ .


With a = g(b) we thus obtain from (∗)

    lim_{y→b} ‖ g(y) − g(b) − ( f′(a) )⁻¹ (y − b) ‖ / ‖y − b‖
      ≤ lim_{y→b} ‖( f′(a) )⁻¹‖ ‖r( g(y) )‖ c = c ‖( f′(a) )⁻¹‖ lim_{y→b} ‖r( g(y) )‖ = 0 .

Example (Polar coordinates) Let

    U = { (r, ϕ) | r > 0 , 0 < ϕ < 2π } ⊆ R² ,

and let f = (f_1, f_2) : U → R² be defined by

    x = f_1(r, ϕ) = r cos ϕ ,
    y = f_2(r, ϕ) = r sin ϕ .

[Figure: a point (x, y) in the plane, with polar angle ϕ measured from the positive x-axis.]

This mapping is one-to-one with range

    f(U) = R² \ { (x, 0) | x ≥ 0 } ,

and has a continuous inverse. From a theorem proved in the next section it follows that f is differentiable. Thus,

    f′(r, ϕ) = ( ∂f_1/∂r (r, ϕ)  ∂f_1/∂ϕ (r, ϕ) )   ( cos ϕ  −r sin ϕ )
               ( ∂f_2/∂r (r, ϕ)  ∂f_2/∂ϕ (r, ϕ) ) = ( sin ϕ   r cos ϕ ) .

This matrix is invertible for (r, ϕ) ∈ U, hence the derivative (f⁻¹)′(x, y) exists for every (x, y) = f(r, ϕ) = (r cos ϕ, r sin ϕ) and can be computed without having to determine the inverse function f⁻¹:

    (f⁻¹)′(x, y) = ( f′(r, ϕ) )⁻¹ = ( cos ϕ  −r sin ϕ )⁻¹
                                    ( sin ϕ   r cos ϕ )
                 = (     cos ϕ         sin ϕ     )   (  x/√(x² + y²)   y/√(x² + y²) )
                   ( −(1/r) sin ϕ   (1/r) cos ϕ  ) = ( −y/(x² + y²)    x/(x² + y²)  ) .
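As a quick numerical cross-check of this example, the following sketch verifies that the closed-form matrix for (f⁻¹)′(x, y) agrees with the matrix inverse of f′(r, ϕ). The sample point (r, ϕ) = (2, 1) and all variable names are our own choices, not from the notes.

```python
import numpy as np

# sample point in U = {(r, phi) : r > 0, 0 < phi < 2*pi}
r, phi = 2.0, 1.0
x, y = r * np.cos(phi), r * np.sin(phi)

# Jacobi matrix f'(r, phi) of the polar-coordinate map
Df = np.array([[np.cos(phi), -r * np.sin(phi)],
               [np.sin(phi),  r * np.cos(phi)]])

# closed-form inverse from the text, expressed in x and y
s = x**2 + y**2
Dg = np.array([[ x / np.sqrt(s), y / np.sqrt(s)],
               [-y / s,          x / s]])

# Dg should coincide with the matrix inverse of Df
err = np.abs(Dg - np.linalg.inv(Df)).max()
```

Since the agreement is exact up to floating-point rounding, `err` should be at machine-precision level.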


4.4 Mean value theorem

The mean value theorem for real functions can be generalized to real-valued functions of several variables:

Theorem 4.13 (Mean value theorem) Let U be an open subset of R^n, let f : U → R be differentiable, and let a, b ∈ U be points such that the line segment connecting these points is contained in U. Then there is a point c from this line segment with

    f(b) − f(a) = f′(c)(b − a) .

Proof: Define a function γ : [0, 1] → U by t ↦ γ(t) := a + t(b − a). This function maps the interval [0, 1] onto the line segment connecting a and b. The affine function γ is differentiable with derivative

    γ′(t) = b − a .

Let F = f ∘ γ be the composition. Since f and γ are differentiable, F : [0, 1] → R is differentiable. Thus, the mean value theorem for real functions implies that there is ϑ ∈ (0, 1) such that

    f(b) − f(a) = F(1) − F(0) = F′(ϑ) = f′( γ(ϑ) ) γ′(ϑ) = f′(c)(b − a) ,

where we have set c = γ(ϑ).

Of course, the mean value theorem can also be formulated as follows: If U contains together with the points x and x + h also the line segment connecting these points, then there is a number ϑ with 0 < ϑ < 1 such that

    f(x + h) − f(x) = f′(x + ϑh)h .

The mean value theorem does not hold for functions f : U → R^m with m > 1, but the following weaker result can often be used as a replacement for the mean value theorem:

Corollary 4.14 Let U ⊆ R^n be open and let f : U → R^m be differentiable. Assume that x and x + h are points from U such that the line segment l = { x + th | 0 ≤ t ≤ 1 } connecting x and x + h is contained in U. If the derivative of f is bounded on l by a constant S ≥ 0, i.e. if for all 0 ≤ t ≤ 1 the operator norm of the derivative satisfies

    ‖f′(x + th)‖ ≤ S ,

then

    ‖f(x + h) − f(x)‖ ≤ S‖h‖ .


To prove this corollary we need the following lemma, which we do not prove:

Lemma 4.15 Let ‖·‖ be a norm on R^m. Then to every u ∈ R^m there is a linear mapping A_u : R^m → R such that ‖A_u‖ = 1 and A_u(u) = ‖u‖.

Example: For the Euclidean norm ‖·‖ = |·| define A_u by

    A_u(v) = ( u/|u| ) · v ,   v ∈ R^m .

Then A_u(u) = ( u · u ) / |u| = |u| and

    1 = | u/|u| | = (1/|u|) A_u(u) ≤ (1/|u|) ‖A_u‖ |u| = ‖A_u‖
      = sup_{|v|≤1} |A_u(v)| = sup_{|v|≤1} | ( u/|u| ) · v | ≤ sup_{|v|≤1} ( |u| |v| ) / |u| = 1 ,

hence ‖A_u‖ = 1.

Proof of the corollary: To f(x + h) − f(x) ∈ R^m choose the linear mapping A : R^m → R such that ‖A‖ = 1 and A( f(x + h) − f(x) ) = ‖f(x + h) − f(x)‖. As a linear mapping, A is differentiable with derivative A′(y) = A for all y ∈ R^m. Thus, from the mean value theorem applied to the differentiable function F = A ∘ f : U → R we conclude that a number ϑ with 0 < ϑ < 1 exists such that

    ‖f(x + h) − f(x)‖ = A( f(x + h) − f(x) )
      = A( f(x + h) ) − A( f(x) ) = F(x + h) − F(x) = F′(x + ϑh)h
      = A f′(x + ϑh)h ≤ ‖A‖ ‖f′(x + ϑh)‖ ‖h‖ ≤ S ‖h‖ .

Theorem 4.16 Let U be an open and pathwise connected subset of R^n, and let f : U → R^m be differentiable. Then f is constant if and only if f′(x) = 0 for all x ∈ U.

To prove this theorem, the following lemma is needed:

Lemma 4.17 Let U ⊆ R^n be open and pathwise connected. Then all points a, b ∈ U can be connected by a polygon in U, i.e. by a curve consisting of finitely many straight line segments.


A proof of this lemma can be found in the book of Barner–Flohr, Analysis II, p. 56.

Proof of the theorem: If f is constant, then evidently f′(x) = 0 for all x ∈ U. To prove the converse, assume that f′(x) = 0 for all x ∈ U. Let a, b be two arbitrary points in U. These points can be connected in U by a polygon with the corner points

    a_0 = a , a_1 , … , a_{k−1} , a_k = b .

We apply Corollary 4.14 to the line segment connecting a_j and a_{j+1} for j = 0, 1, …, k−1. Since f′(x) = 0 for all x ∈ U, the operator norm ‖f′(x)‖ is bounded on this line segment by 0. Therefore Corollary 4.14 yields ‖f(a_{j+1}) − f(a_j)‖ ≤ 0, hence f(a_{j+1}) = f(a_j) for all j = 0, 1, …, k−1, which implies

    f(b) = f(a) .

From the existence of all the partial derivatives ∂f/∂x_1(a), …, ∂f/∂x_n(a) at a, one cannot conclude that f is differentiable at a. However, we have the following useful criterion for differentiability of f at a:

Theorem 4.18 Let U be an open subset of R^n with a ∈ U and let f : U → R^m. If all partial derivatives ∂f_j/∂x_i exist in U for i = 1, …, n and j = 1, …, m, and if all the functions x ↦ ∂f_j/∂x_i(x) : U → R are continuous at a, then f is differentiable at a.

Proof: It suffices to prove that all the component functions f_1, …, f_m are differentiable at a. Thus, we can assume that f : U → R is real valued. We have to show that

    lim_{h→0} ( f(a + h) − f(a) − T h ) / ‖h‖_∞ = 0

for the linear mapping T with the matrix representation

    T = ( ∂f/∂x_1(a) , … , ∂f/∂x_n(a) ) .

For h = (h_1, …, h_n) ∈ R^n define

    a_0 := a ,
    a_1 := a_0 + h_1 e_1 ,
    a_2 := a_1 + h_2 e_2 ,
      ⋮
    a + h = a_n := a_{n−1} + h_n e_n ,


where e_1, …, e_n is the canonical basis of R^n. Then

    f(a + h) − f(a) = ( f(a + h) − f(a_{n−1}) ) + ( f(a_{n−1}) − f(a_{n−2}) ) + … + ( f(a_1) − f(a) ) .   (∗)

If x runs through the line segment connecting a_{j−1} to a_j, then only the component x_j of x is varying. Since by assumption the mapping x_j ↦ f(x_1, …, x_j, …, x_n) is differentiable, the mean value theorem can be applied to every term on the right hand side of (∗). Let c_j be the intermediate point on the line segment connecting a_{j−1} to a_j. Then

    f(a + h) − f(a) = Σ_{j=1}^n ( f(a_j) − f(a_{j−1}) ) = Σ_{j=1}^n ∂f/∂x_j(c_j) h_j ,

whence

    | f(a + h) − f(a) − T h | = | Σ_{j=1}^n ∂f/∂x_j(c_j) h_j − Σ_{j=1}^n ∂f/∂x_j(a) h_j |
      = | Σ_{j=1}^n ( ∂f/∂x_j(c_j) − ∂f/∂x_j(a) ) h_j |
      ≤ ‖h‖_∞ Σ_{j=1}^n | ∂f/∂x_j(c_j) − ∂f/∂x_j(a) | .

Because the intermediate points satisfy ‖c_j − a‖_∞ ≤ ‖h‖_∞ for all j = 1, …, n, it follows that lim_{h→0} c_j = a for all intermediate points. The continuity of the partial derivatives at a thus implies

    lim_{h→0} | f(a + h) − f(a) − T h | / ‖h‖_∞ ≤ lim_{h→0} Σ_{j=1}^n | ∂f/∂x_j(c_j) − ∂f/∂x_j(a) | = 0 .

Example: Let s ∈ R and let f : R^n\{0} → R be defined by

    f(x) = ( x_1² + … + x_n² )^s .

This mapping is differentiable, since the partial derivatives

    ∂f/∂x_j(x) = s ( x_1² + … + x_n² )^{s−1} 2 x_j

are continuous in R^n\{0}.
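The formula for ∂f/∂x_j in this example can be checked numerically. The sketch below (with the sample values s = 1.5 and a sample point x, both our own choices) compares the formula with central difference quotients:

```python
import numpy as np

s = 1.5
x = np.array([1.0, -0.5, 2.0])

def f(x):
    # f(x) = (x_1^2 + ... + x_n^2)^s
    return (x**2).sum()**s

# gradient from the closed-form expression in the text
grad_formula = s * (x**2).sum()**(s - 1) * 2 * x

# central finite differences: (f(x + h e_j) - f(x - h e_j)) / (2h)
h = 1e-6
grad_fd = np.empty_like(x)
for j in range(len(x)):
    e = np.zeros_like(x)
    e[j] = h
    grad_fd[j] = (f(x + e) - f(x - e)) / (2 * h)

err = np.abs(grad_formula - grad_fd).max()
```

The central difference quotient has an error of order h², so `err` should be far below the step size.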


4.5 Continuously differentiable mappings, second derivative

Let U ⊆ R^n be open and let f : U → R^m be differentiable at every x ∈ U. Then

    x ↦ f′(x) : U → L(R^n, R^m)

defines a mapping from U into the set of linear mappings from R^n to R^m. If one applies the linear mapping f′(x) to a vector h ∈ R^n, a vector of R^m is obtained. Thus, f′ can also be considered to be a mapping from U × R^n to R^m:

    (x, h) ↦ f′(x)h : U × R^n → R^m .

This mapping is linear with respect to the second argument. Which view one takes depends on the situation.

Since L(R^n, R^m) is a normed space, one can define continuity of the function f′ as follows:

Definition 4.19 Let U ⊆ R^n be an open set and let f : U → R^m be differentiable.

(i) f′ : U → L(R^n, R^m) is said to be continuous at a ∈ U if to every ε > 0 there is δ > 0 such that for all x ∈ U with ‖x − a‖ < δ

    ‖f′(x) − f′(a)‖ < ε .

(ii) f is said to be continuously differentiable if f′ : U → L(R^n, R^m) is continuous.

(iii) Let U, V ⊆ R^n be open and let f : U → V be continuously differentiable and invertible. If the inverse f⁻¹ : V → U is also continuously differentiable, then f is called a diffeomorphism.

Here ‖f′(x) − f′(a)‖ denotes the operator norm of the linear mapping f′(x) − f′(a) : R^n → R^m. The following result makes this definition less abstract:

Theorem 4.20 Let U ⊆ R^n be open and let f : U → R^m. Then the following statements are equivalent:

(i) f is continuously differentiable.

(ii) All partial derivatives ∂f_j/∂x_i with 1 ≤ i ≤ n, 1 ≤ j ≤ m exist in U and are continuous functions

    x ↦ ∂f_j/∂x_i(x) : U → R .


(iii) f is differentiable and the mapping x ↦ f′(x)h : U → R^m is continuous for every h ∈ R^n.

Proof: First we show that (i) and (ii) are equivalent. If f is differentiable, then all partial derivatives exist in U. Conversely, if all partial derivatives exist in U and are continuous, then by Theorem 4.18 the function f is differentiable. Hence, it remains to show that f′ is continuous if and only if all partial derivatives are continuous.

For a, x ∈ U let

    ‖f′(x) − f′(a)‖_∞ = max_{i=1,…,n; j=1,…,m} | ∂f_j/∂x_i(x) − ∂f_j/∂x_i(a) | .   (∗)

By Theorem 3.50 there exist constants c, C > 0, which are independent of x and a, such that c‖f′(x) − f′(a)‖_∞ ≤ ‖f′(x) − f′(a)‖ ≤ C‖f′(x) − f′(a)‖_∞. From this estimate and from (∗) we see that

    lim_{x→a} ‖f′(x) − f′(a)‖ = 0

holds if and only if

    lim_{x→a} ∂f_j/∂x_i(x) = ∂f_j/∂x_i(a)

for all 1 ≤ i ≤ n, 1 ≤ j ≤ m. By Definition 4.19 this means that f′ is continuous at a if and only if all partial derivatives are continuous at a.

To prove that (iii) is equivalent to the first two statements of the theorem it suffices to remark that if f is differentiable, then

    x ↦ f′(x)h = Σ_{i=1}^n ∂f/∂x_i(x) h_i : U → R^m .

By choosing for h vectors from the standard basis e_1, …, e_n of R^n, we immediately see from this equation that x ↦ f′(x)h is continuous for every h ∈ R^n if and only if all partial derivatives are continuous.

The derivative of f : U → R^m is a mapping f′ : U → L(R^n, R^m). Since L(R^n, R^m) is a normed space, it is possible to define the derivative of f′ at x, which is a linear mapping from R^n to L(R^n, R^m). One denotes this derivative by f′′(x) and calls it the second derivative of f at x. Thus, if f is two times differentiable, then

    f′′ : U → L( R^n, L(R^n, R^m) ) .

Less abstractly, I define the second derivative in the following equivalent way:


Definition 4.21 (i) Let U ⊆ R^n be open and let f : U → R^m be differentiable. f is said to be two times differentiable at a point x ∈ U if for every fixed h ∈ R^n the mapping g_h : U → R^m defined by

    g_h(x) = f′(x)h

is differentiable at x.

(ii) The function f′′(x) : R^n × R^n → R^m defined by

    f′′(x)(h, k) = g_h′(x)(k)

is called the second derivative of f at x. If f : U → R^m is two times differentiable (i.e., two times differentiable at every x ∈ U), then

    f′′ : U × R^n × R^n → R^m .

Theorem 4.22 Let U ⊆ R^n be open with x ∈ U and let f : U → R^m be differentiable.

(i) If f is two times differentiable at x, then all second partial derivatives of f at x exist, and for h = (h_1, …, h_n) ∈ R^n and k = (k_1, …, k_n) ∈ R^n

    f′′(x)(h, k) = Σ_{j=1}^n Σ_{i=1}^n ∂/∂x_j ∂/∂x_i f(x) h_i k_j .

(ii) f′′(x) is bilinear, i.e. (h, k) ↦ f′′(x)(h, k) is linear in both arguments.

Proof: If f is two times differentiable at x, then by definition the function

    y ↦ g_h(y) = f′(y)h = Σ_{i=1}^n ∂/∂x_i f(y) h_i

is differentiable at y = x, hence

    f′′(x)(h, k) = g_h′(x)k = Σ_{j=1}^n ∂/∂x_j g_h(x) k_j = Σ_{j=1}^n ∂/∂x_j ( Σ_{i=1}^n ∂/∂x_i f(x) h_i ) k_j .

With h = e_i and k = e_j, where e_i and e_j are vectors from the standard basis of R^n, this formula implies that the second partial derivative ∂/∂x_j ∂/∂x_i f(x) exists. Thus, in this formula the partial derivative and the summation can be interchanged, hence the stated representation formula for f′′(x)(h, k) results. The bilinearity of f′′(x) follows immediately from this representation formula.
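The representation of f′′(x)(h, k) through second partial derivatives can be illustrated numerically. The sketch below uses a sample polynomial f (our own choice, not from the notes) and compares h·Hk, where H is the matrix of second partial derivatives, with the second-order difference quotient ( f(x + th + tk) − f(x + th) − f(x + tk) + f(x) ) / t²:

```python
import numpy as np

def f(x):
    # sample function f(x1, x2) = x1^2 x2 + x1 + x2^3
    return x[0]**2 * x[1] + x[0] + x[1]**3

def hessian(x):
    # second partial derivatives of the sample function, computed by hand
    return np.array([[2 * x[1], 2 * x[0]],
                     [2 * x[0], 6 * x[1]]])

x = np.array([1.0, 2.0])
h = np.array([0.3, -0.7])
k = np.array([1.1, 0.5])

# bilinear form sum_{i,j} d2f/dx_j dx_i (x) h_i k_j = h . (H k)
bilinear = h @ hessian(x) @ k

# second-order difference quotient approximating f''(x)(h, k)
t = 1e-4
quotient = (f(x + t*h + t*k) - f(x + t*h) - f(x + t*k) + f(x)) / t**2

err = abs(bilinear - quotient)
```

The quotient agrees with the bilinear form up to terms that vanish with t, so `err` is small but not at machine precision.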


For the second partial derivatives ∂/∂x_j ∂/∂x_i f(x) of f one also uses the notation

    ∂²f/∂x_j ∂x_i = ∂/∂x_j ∂/∂x_i f ,   ∂²f/∂x_i² = ∂/∂x_i ∂/∂x_i f .

Note that

    ∂²f/∂x_j ∂x_i (x) = ( ∂²f_1/∂x_j ∂x_i (x) , … , ∂²f_m/∂x_j ∂x_i (x) ) ∈ R^m .

For m = 1, the second partial derivatives ∂²f/∂x_j ∂x_i (x) are real numbers. Thus, for f : U → R we obtain a matrix representation for f′′(x):

    f′′(x)(h, k) = Σ_{j=1}^n Σ_{i=1}^n ∂²f/∂x_j ∂x_i (x) h_i k_j

                 = (h_1, …, h_n) ( ∂²f/∂x_1² (x)       …  ∂²f/∂x_n ∂x_1 (x) ) ( k_1 )
                                 (        ⋮                      ⋮          ) (  ⋮  )
                                 ( ∂²f/∂x_1 ∂x_n (x)   …  ∂²f/∂x_n² (x)     ) ( k_n )

                 = h · Hk ,

with the Hessian matrix

    H = ( ∂²f/∂x_j ∂x_i )_{j,i=1,…,n}

(Ludwig Otto Hesse, 1811–1874). For f : U → R^m with m > 1 one obtains

    ( f′′(x) )_l (h, k) = h · H_l k ,

where H_l is the Hessian matrix of the component function f_l of f. In particular, this yields

    ( f′′(x) )_l (h, k) = (f_l)′′(x)(h, k) ,

i.e. the l-th component of f′′(x) is the second derivative of the component function f_l.

It is possible that all second partial derivatives of f at x exist, even if f is not two times differentiable at x. In this case the Hessian matrices H_l can be formed, but they do not represent the second derivative of f at x, which does not exist. If f is two times differentiable at x, then the Hessian matrices H_l are symmetric, i.e.

    ∂²/∂x_j ∂x_i f_l(x) = ∂²/∂x_i ∂x_j f_l(x)

for all 1 ≤ i, j ≤ n, hence the order of differentiation does not matter. This follows from the following theorem.


Theorem 4.23 (of H. A. Schwarz) Let U ⊆ R^n be open, let x ∈ U and let f be two times differentiable at x. Then for all h, k ∈ R^n

    f′′(x)(h, k) = f′′(x)(k, h) .

(Hermann Amandus Schwarz, 1843–1921)

Proof: Obviously the bilinear mapping f′′(x) is symmetric if and only if every component function (f′′(x))_l is symmetric. Therefore it suffices to show that every component is symmetric. Since (f′′(x))_l = (f_l)′′(x) and since f_l : U → R is real valued, it is sufficient to prove that for every real valued function f : U → R the second derivative f′′(x) is symmetric. We thus assume that f is real valued.

To prove symmetry, we show that for all h, k ∈ R^n

    lim_{s→0, s>0} ( f(x + sh + sk) − f(x + sh) − f(x + sk) + f(x) ) / s² = f′′(x)(h, k) .   (∗)

The statement of the theorem is a consequence of this formula, since the left hand side remains unchanged if h and k are interchanged.

By definition, f′′(x)(h, k) is the derivative of the function x ↦ f′(x)h. Thus, for all h, k ∈ R^n,

    f′(x + k)h − f′(x)h = f′′(x)(h, k) + R_x(h, k) ‖k‖   (∗∗)

with

    lim_{k→0} R_x(h, k) = 0 .

R_x(h, k) is linear with respect to h, since f′(x + k)h, f′(x)h and f′′(x)(h, k) are linear with respect to h. We show that a number ϑ with 0 < ϑ < 1 exists, which depends on h and k, such that

    f(x + h + k) − f(x + h) − f(x + k) + f(x)
      = f′′(x)(h, k) + R_x(h, ϑh + k) ‖ϑh + k‖ − R_x(h, ϑh) ‖ϑh‖ .   (+)

For, let F : [0, 1] → R be defined by

    F(t) = f(x + th + k) − f(x + th) .

F is differentiable, whence the mean value theorem implies that 0 < ϑ < 1 exists with

    F(1) − F(0) = F′(ϑ) .


Therefore, with the definition of F and with (∗∗),

    f(x + h + k) − f(x + h) − f(x + k) + f(x) = F(1) − F(0)
      = F′(ϑ) = f′(x + ϑh + k)h − f′(x + ϑh)h
      = ( f′(x + ϑh + k)h − f′(x)h ) − ( f′(x + ϑh)h − f′(x)h )
      = ( f′′(x)(h, ϑh + k) + R_x(h, ϑh + k) ‖ϑh + k‖ ) − ( f′′(x)(h, ϑh) + R_x(h, ϑh) ‖ϑh‖ )
      = f′′(x)(h, k) + R_x(h, ϑh + k) ‖ϑh + k‖ − R_x(h, ϑh) ‖ϑh‖ ,

which is (+). In the last step we used the linearity of f′′(x) in the second argument.

Let s > 0. If one replaces in (+) the vector k by sk and the vector h by sh, then on the right hand side the factor s² can be extracted, because of the bilinearity or linearity or the positive homogeneity of all the terms. The result is

    f(x + sh + sk) − f(x + sh) − f(x + sk) + f(x)
      = s² [ f′′(x)(h, k) + R_x( h, s(ϑh + k) ) ‖ϑh + k‖ − R_x(h, sϑh) ‖ϑh‖ ] .

Since

    lim_{s→0} R_x( h, s(ϑh + k) ) = 0 ,   lim_{s→0} R_x(h, sϑh) = 0 ,

this equation yields (∗).

Example: Let f : R² → R,

    f(x_1, x_2) = x_1² x_2 + x_1 + x_2³ .

The partial derivatives of every order exist and are continuous. This implies that f is continuously differentiable. We have

    grad f(x) = ( ∂f/∂x_1(x) , ∂f/∂x_2(x) ) = ( 2 x_1 x_2 + 1 , x_1² + 3 x_2² ) .

For h ∈ R² the partial derivatives of

    x ↦ f′(x)h = grad f(x) · h = ∂f/∂x_1(x) h_1 + ∂f/∂x_2(x) h_2

are

    ∂/∂x_i ( f′(x)h ) = ∂²f/∂x_i ∂x_1 (x) h_1 + ∂²f/∂x_i ∂x_2 (x) h_2 ,   i = 1, 2 ,


hence these partial derivatives are continuous, and so x ↦ f′(x)h is differentiable. Thus, by definition f is two times differentiable with the Hessian matrix

    f′′(x) = H = ( ∂²f/∂x_1² (x)       ∂²f/∂x_2 ∂x_1 (x) )   ( 2x_2  2x_1 )
                 ( ∂²f/∂x_1 ∂x_2 (x)  ∂²f/∂x_2² (x)      ) = ( 2x_1  6x_2 ) .

4.6 Higher derivatives, Taylor formula

Higher derivatives are defined by induction: Let U ⊆ R^n be open. The p-th derivative of f : U → R^m at x is a mapping

    f^(p)(x) : R^n × … × R^n → R^m   (p factors)

obtained as follows: If f is (p−1)-times differentiable and if for all h_1, …, h_{p−1} ∈ R^n the mapping

    x ↦ f^(p−1)(x)(h_1, …, h_{p−1}) : U → R^m

is differentiable at x, then f is said to be p-times differentiable at x with p-th derivative f^(p)(x) defined by

    f^(p)(x)(h_1, …, h_p) = [ f^(p−1)(·)(h_1, …, h_{p−1}) ]′(x) h_p

for h_1, …, h_p ∈ R^n.

The function (h_1, …, h_p) ↦ f^(p)(x)(h_1, …, h_p) is linear in all its arguments, and from the theorem of H. A. Schwarz one obtains by induction that it is totally symmetric: For 1 ≤ i ≤ j ≤ p

    f^(p)(x)(h_1, …, h_i, …, h_j, …, h_p) = f^(p)(x)(h_1, …, h_j, …, h_i, …, h_p) .

From the representation formula for the second derivatives one immediately obtains by induction, for h^(j) = ( h^(j)_1, …, h^(j)_n ) ∈ R^n,

    f^(p)(x)( h^(1), …, h^(p) ) = Σ_{i_1=1}^n … Σ_{i_p=1}^n ∂^p f / ( ∂x_{i_1} … ∂x_{i_p} ) (x) h^(1)_{i_1} … h^(p)_{i_p} .

In accordance with Theorem 4.20, one says that f is p-times continuously differentiable if f is p-times differentiable and the mapping x ↦ f^(p)(x)( h^(1), …, h^(p) ) : U → R^m is continuous for all h^(1), …, h^(p) ∈ R^n. By choosing in the above representation formula of


f^(p) for h^(1), …, h^(p) vectors from the standard basis e_1, …, e_n of R^n, it is immediately seen that f is p-times continuously differentiable if and only if all partial derivatives of f up to the order p exist and are continuous.

If f^(p) exists for all p ∈ N, then f is said to be infinitely differentiable. This happens if and only if all partial derivatives of any order exist in U.

Theorem 4.24 (Taylor formula) Let U be an open subset of R^n, let f : U → R be (p + 1)-times differentiable, and assume that the points x and x + h together with the line segment connecting these points belong to U. Then there is a number ϑ with 0 < ϑ < 1 such that

    f(x + h) = f(x) + f′(x)h + (1/2!) f′′(x)(h, h) + … + (1/p!) f^(p)(x)(h, …, h) + R_p(x, h) ,

where h is taken p times in the p-th term, and

    R_p(x, h) = ( 1/(p + 1)! ) f^(p+1)(x + ϑh)(h, …, h)   (h taken p+1 times).

Proof: Let γ : [0, 1] → U be defined by γ(t) = x + th. To F = f ∘ γ : [0, 1] → R apply the Taylor formula for real functions:

    F(1) = Σ_{j=0}^p F^(j)(0) / j! + F^(p+1)(ϑ) / (p + 1)! .

Insertion of the derivatives

    F′(t) = f′( γ(t) ) γ′(t) = f′( γ(t) ) h ,
    F′′(t) = f′′( γ(t) )( h, γ′(t) ) = f′′( γ(t) )(h, h) ,
      ⋮
    F^(p+1)(t) = f^(p+1)( γ(t) )( h, …, γ′(t) ) = f^(p+1)( γ(t) )(h, …, h)

into this formula yields the statement.

Using the representation of f^(k) by partial derivatives, the Taylor formula can also be written as

    f(x + h) = Σ_{j=0}^p (1/j!) [ Σ_{i_1,…,i_j=1}^n ∂^j f(x) / ( ∂x_{i_1} … ∂x_{i_j} ) h_{i_1} … h_{i_j} ]
               + ( 1/(p + 1)! ) Σ_{i_1,…,i_{p+1}=1}^n ∂^{p+1} f(x + ϑh) / ( ∂x_{i_1} … ∂x_{i_{p+1}} ) h_{i_1} … h_{i_{p+1}} .


In this formula the notation can be simplified using multi-indices. For a multi-index α = (α_1, …, α_n) ∈ N_0^n and for x = (x_1, …, x_n) ∈ R^n set

    |α| := α_1 + … + α_n   (length of α) ,
    α! := α_1! · … · α_n! ,
    x^α := x_1^{α_1} · … · x_n^{α_n} ,
    D^α f(x) := ∂^{|α|} f(x) / ( ∂x_1^{α_1} … ∂x_n^{α_n} ) .

If α is a fixed multi-index with length |α| = j, then the sum

    Σ_{i_1,…,i_j=1}^n ∂^j f(x) / ( ∂x_{i_1} … ∂x_{i_j} ) h_{i_1} … h_{i_j}

contains j!/(α_1! … α_n!) terms which are obtained from D^α f(x) h^α by interchanging the order in which the derivatives are taken. Using this, the Taylor formula can be written in the compact form

    f(x + h) = Σ_{j=0}^p Σ_{|α|=j} (1/α!) D^α f(x) h^α + Σ_{|α|=p+1} (1/α!) D^α f(x + ϑh) h^α
             = Σ_{|α|≤p} (1/α!) D^α f(x) h^α + Σ_{|α|=p+1} (1/α!) D^α f(x + ϑh) h^α .
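The Taylor formula can be illustrated numerically: for p = 2 the remainder is of third order in h, so shrinking h by a factor 10 should shrink the error of the second-order Taylor polynomial by roughly 10³. A sketch with a sample function (our own choice, not from the notes):

```python
import numpy as np

def f(x):
    # sample function f(x1, x2) = exp(x1) * sin(x2)
    return np.exp(x[0]) * np.sin(x[1])

def grad(x):
    return np.exp(x[0]) * np.array([np.sin(x[1]), np.cos(x[1])])

def hess(x):
    return np.exp(x[0]) * np.array([[np.sin(x[1]),  np.cos(x[1])],
                                    [np.cos(x[1]), -np.sin(x[1])]])

x = np.array([0.3, 0.8])
h = np.array([1.0, -0.6])

errors = []
for t in [1e-1, 1e-2, 1e-3]:
    th = t * h
    # second-order Taylor polynomial f(x) + f'(x)th + (1/2) f''(x)(th, th)
    taylor2 = f(x) + grad(x) @ th + 0.5 * th @ hess(x) @ th
    errors.append(abs(f(x + th) - taylor2))

# the error should shrink by about 10^3 when t shrinks by 10
ratio = errors[0] / errors[1]
```

Up to fourth-order corrections the ratio is close to 10³, in line with the remainder term R_2(x, h) being cubic in h.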


5 Local extreme values, inverse function and implicit function

5.1 Local extreme values

Definition 5.1 Let U ⊆ R^n be open, let f : U → R be differentiable and let a ∈ U. If f′(a) = 0, then a is called a critical point of f.

Theorem 5.2 Let U ⊆ R^n be open and let f : U → R be differentiable. If f has a local extreme value at a, then a is a critical point of f.

Proof: Without restriction of generality we assume that f has a local maximum at a. Then there is a neighborhood V of a such that f(x) ≤ f(a) for all x ∈ V. Let h ∈ R^n and choose δ > 0 small enough such that a + th ∈ V for all t ∈ R with |t| ≤ δ. Let F : [−δ, δ] → R be defined by

    F(t) = f(a + th) .

Then F has a local maximum at t = 0, hence

    0 = F′(0) = f′(a)h .

Since this holds for every h ∈ R^n, it follows that f′(a) = 0.

Thus, if f has a local extreme value at a, then a is necessarily a critical point. For example, the saddle point a in the following picture is a critical point, but f does not have an extreme value there.

[Figure: graph of a function f over the (x_1, x_2)-plane with a saddle point at a.]
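The saddle phenomenon can be made concrete with the standard example f(x_1, x_2) = x_1² − x_2² (our own choice of illustration): the gradient vanishes at 0, yet f takes both signs in every neighborhood of 0, so 0 is a critical point without being an extreme value. A sketch:

```python
import numpy as np

def f(x):
    # saddle: f(x1, x2) = x1^2 - x2^2
    return x[0]**2 - x[1]**2

def grad(x):
    # grad f = (2 x1, -2 x2)
    return np.array([2 * x[0], -2 * x[1]])

# the origin is a critical point
grad_at_0 = grad(np.array([0.0, 0.0]))

# but f changes sign arbitrarily close to 0
eps = 1e-3
along_x1 = f(np.array([eps, 0.0]))   # f > 0 along the x1-axis
along_x2 = f(np.array([0.0, eps]))   # f < 0 along the x2-axis
```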


This example shows that for functions of several variables the situation is more complicated than for functions of one variable. Still, also for functions of several variables the second derivative can be used to formulate a sufficient criterion for an extreme value. To this end some definitions and results for quadratic forms are needed, which we state without proof:

Definition 5.3 Let Q : R^n × R^n → R be a bilinear mapping. Then the mapping h ↦ Q(h, h) : R^n → R is called a quadratic form. A quadratic form is called

(i) positive definite, if Q(h, h) > 0 for all h ≠ 0,
(ii) positive semi-definite, if Q(h, h) ≥ 0 for all h,
(iii) negative definite, if Q(h, h) < 0 for all h ≠ 0,
(iv) negative semi-definite, if Q(h, h) ≤ 0 for all h,
(v) indefinite, if Q(h, h) has positive and negative values.

To a quadratic form one can always find a symmetric coefficient matrix

    C = ( c_11 … c_1n )
        (  ⋮        ⋮ )
        ( c_n1 … c_nn )

such that

    Q(h, h) = Σ_{i,j=1}^n c_ij h_i h_j = h · Ch .

From this representation it follows that for a quadratic form the mapping h ↦ Q(h, h) : R^n → R is continuous. The quadratic form Q(h, h) is positive definite if

    c_11 > 0 ,   det ( c_11 c_12 )         det ( c_11 c_12 c_13 )
                     ( c_21 c_22 ) > 0 ,       ( c_21 c_22 c_23 ) > 0 ,   … ,   det ( c_ij )_{i,j=1,…,n} > 0 .
                                               ( c_31 c_32 c_33 )

If f : U → R is two times differentiable at x ∈ U, then (h, k) ↦ f′′(x)(h, k) is bilinear, hence h ↦ f′′(x)(h, h) is a quadratic form. Since

    f′′(x)(h, h) = Σ_{i,j=1}^n ∂²f(x) / ( ∂x_i ∂x_j ) h_i h_j ,


the coefficient matrix of this quadratic form is the Hessian matrix
\[ H = \Bigl( \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \Bigr)_{i,j=1,\dots,n}. \]
By the theorem of H. A. Schwarz, this matrix is symmetric. Now we can formulate a sufficient criterion for extreme values:

Theorem 5.4 Let $U \subseteq \mathbb{R}^n$ be open, let $f : U \to \mathbb{R}$ be two times continuously differentiable, and let $a \in U$ be a critical point of $f$. If the quadratic form $f''(a)(h,h)$

(i) is positive definite, then $f$ has a local minimum at $a$,
(ii) is negative definite, then $f$ has a local maximum at $a$,
(iii) is indefinite, then $f$ does not have an extreme value at $a$.

Proof: The Taylor formula yields
\[ f(x) = f(a) + f'(a)(x-a) + \tfrac{1}{2} f''\bigl(a + \vartheta(x-a)\bigr)(x-a, x-a), \]
with a suitable $0 < \vartheta < 1$. Thus, since $f'(a) = 0$,
\[ f(x) = f(a) + \tfrac{1}{2} f''\bigl(a + \vartheta(x-a)\bigr)(x-a, x-a) \tag{$*$} \]
\[ \phantom{f(x)} = f(a) + \tfrac{1}{2} f''(a)(x-a, x-a) + R(x)(x-a, x-a), \]
with
\[ R(x)(h,k) = \tfrac{1}{2} f''\bigl(a + \vartheta(x-a)\bigr)(h,k) - \tfrac{1}{2} f''(a)(h,k) = \frac{1}{2} \sum_{i,j=1}^{n} \Bigl( \frac{\partial^2 f\bigl(a + \vartheta(x-a)\bigr)}{\partial x_i \partial x_j} - \frac{\partial^2 f(a)}{\partial x_i \partial x_j} \Bigr) h_j k_i. \]
Since by assumption $f$ is two times continuously differentiable, the second partial derivatives are continuous. Hence to every $\varepsilon > 0$ there is $\delta > 0$ such that for all $x \in U$ with $\|x - a\| < \delta$ and for all $1 \le i,j \le n$
\[ \Bigl| \frac{\partial^2 f\bigl(a + \vartheta(x-a)\bigr)}{\partial x_i \partial x_j} - \frac{\partial^2 f(a)}{\partial x_i \partial x_j} \Bigr| < \frac{2\varepsilon}{n^2}. \]
Consequently, for $x \in U$ with $\|x - a\| < \delta$ and all $h \in \mathbb{R}^n$,
\[ |R(x)(h,h)| \le \varepsilon\, \|h\|_\infty^2 \le \varepsilon\, c^2 \|h\|^2, \tag{$+$} \]


where in the last step we used that there is a constant $c > 0$ with $\|h\|_\infty \le c\,\|h\|$ for all $h \in \mathbb{R}^n$.

Assume now that $f''(a)(h,h)$ is a positive definite quadratic form. Then $f''(a)(h,h) > 0$ for all $h \in \mathbb{R}^n$ with $h \neq 0$, and since the continuous mapping $h \mapsto f''(a)(h,h) : \mathbb{R}^n \to \mathbb{R}$ attains its minimum on the closed and bounded, hence compact set $\{h \in \mathbb{R}^n \mid \|h\| = 1\}$ at a point $h_0$ of this set, it follows for all $h \in \mathbb{R}^n$ with $h \neq 0$ that
\[ f''(a)(h,h) = \|h\|^2 f''(a)\Bigl( \frac{h}{\|h\|}, \frac{h}{\|h\|} \Bigr) \ge \|h\|^2 \min_{\|\eta\|=1} f''(a)(\eta, \eta) = \kappa \|h\|^2 \]
with
\[ \kappa = f''(a)(h_0, h_0) > 0. \]
Now choose $\varepsilon = \frac{\kappa}{4c^2}$. Then this estimate and $(*)$, $(+)$ yield that there is $\delta > 0$ such that for all $x \in U$ with $\|x - a\| < \delta$
\[ f(x) - f(a) = \tfrac{1}{2} f''(a)(x-a, x-a) + R(x)(x-a, x-a) \ge \frac{\kappa}{2}\|x-a\|^2 - \frac{\kappa}{4}\|x-a\|^2 = \frac{\kappa}{4}\|x-a\|^2 \ge 0. \]
This means that $f$ attains a local minimum at $a$.

In the same way one proves that a local maximum is attained at $a$ if $f''(a)(h,h)$ is negative definite. If $f''(a)(h,h)$ is indefinite, there are $h_0, k_0 \in \mathbb{R}^n$ with $\|h_0\| = \|k_0\| = 1$ and with
\[ \kappa_1 := f''(a)(h_0, h_0) > 0, \qquad \kappa_2 := f''(a)(k_0, k_0) < 0. \]
From these relations we conclude, as above, that for all points $x$ sufficiently close to $a$ on the straight line through $a$ with direction vector $h_0$ the difference $f(x) - f(a)$ is positive, and for $x$ sufficiently close to $a$ on the straight line through $a$ with direction vector $k_0$ the difference $f(x) - f(a)$ is negative. Thus, $f$ does not attain an extreme value at $a$.

Example: Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x,y) = 6xy - 3y^2 - 2x^3$. All partial derivatives of all orders exist, hence $f$ is infinitely differentiable. Therefore the assumptions of Theorems 5.2 and 5.4 are satisfied. Thus, if $(x,y)$ is a critical point, then
\[ \operatorname{grad} f(x,y) = \begin{pmatrix} \frac{\partial f}{\partial x}(x,y) \\[2pt] \frac{\partial f}{\partial y}(x,y) \end{pmatrix} = \begin{pmatrix} 6y - 6x^2 \\ 6x - 6y \end{pmatrix} = 0, \]


which yields the critical points $(x,y) = (0,0)$ and $(x,y) = (1,1)$.

To determine whether these critical points are extremal points, the Hessian matrix must be computed at these points. The Hessian is
\[ H(x,y) = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2}(x,y) & \frac{\partial^2 f}{\partial y \partial x}(x,y) \\[2pt] \frac{\partial^2 f}{\partial x \partial y}(x,y) & \frac{\partial^2 f}{\partial y^2}(x,y) \end{pmatrix} = \begin{pmatrix} -12x & 6 \\ 6 & -6 \end{pmatrix}. \]
The quadratic form $f''(0,0)(h,h)$ defined by the Hessian matrix
\[ H(0,0) = \begin{pmatrix} 0 & 6 \\ 6 & -6 \end{pmatrix} \]
is indefinite. For, if $h = (1,1)$ then
\[ f''(0,0)(h,h) = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 0 & 6 \\ 6 & -6 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 6 \\ 0 \end{pmatrix} = 6, \]
and if $h = (0,1)$ then
\[ f''(0,0)(h,h) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 0 & 6 \\ 6 & -6 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 6 \\ -6 \end{pmatrix} = -6. \]
Therefore $(0,0)$ is not an extremal point of $f$. On the other hand, the quadratic form $f''(1,1)(h,h)$ defined by the matrix
\[ H(1,1) = \begin{pmatrix} -12 & 6 \\ 6 & -6 \end{pmatrix} \]
is negative definite. For, by the criterion given above the matrix $-H(1,1)$ is positive definite, since $12 > 0$ and
\[ \det\begin{pmatrix} 12 & -6 \\ -6 & 6 \end{pmatrix} = 72 - 36 > 0. \]
Consequently $H(1,1)$ is negative definite and $(1,1)$ is a local maximum of $f$.

5.2 Banach's fixed point theorem

In this section we state and prove the Banach fixed point theorem, a tool which we need in the later investigations and which has many important applications in mathematics.

Definition 5.5 Let $X$ be a set and let $d : X \times X \to \mathbb{R}$ be a mapping with the properties


(i) $d(x,y) \ge 0$, and $d(x,y) = 0 \Leftrightarrow x = y$,
(ii) $d(x,y) = d(y,x)$ (symmetry),
(iii) $d(x,y) \le d(x,z) + d(z,y)$ (triangle inequality).

Then $d$ is called a metric on $X$, and $(X,d)$ is called a metric space. $d(x,y)$ is called the distance of $x$ and $y$.

Examples: 1.) Let $X$ be a normed vector space with norm $\|\cdot\|$. Then a metric is defined by $d(x,y) := \|x - y\|$. With this definition, every normed space becomes a metric space. In particular, $\mathbb{R}^n$ is a metric space.

2.) Let $X$ be a nonempty set. We define a metric on $X$ by
\[ d(x,y) = \begin{cases} 1, & x \neq y \\ 0, & x = y. \end{cases} \]
This metric is called degenerate.

3.) On $\mathbb{R}$ a metric is defined by
\[ d(x,y) = \frac{|x-y|}{1 + |x-y|}. \]
To see that this is a metric, note that properties (i) and (ii) of Definition 5.5 are obviously satisfied. It remains to show that the triangle inequality holds. To this end note that $t \mapsto \frac{t}{1+t} : [0,\infty) \to [0,\infty)$ is strictly increasing, since
\[ \frac{d}{dt}\,\frac{t}{1+t} = \frac{1}{1+t}\Bigl( 1 - \frac{t}{1+t} \Bigr) > 0. \]
Thus, for $x, y, z \in \mathbb{R}$,
\[ d(x,y) = \frac{|x-y|}{1+|x-y|} \le \frac{|x-z| + |z-y|}{1 + |x-z| + |z-y|} \le \frac{|x-z|}{1+|x-z|} + \frac{|z-y|}{1+|z-y|} = d(x,z) + d(z,y). \]

On a metric space $X$ a topology can be defined. For example, the $\varepsilon$-neighborhood $B_\varepsilon(x)$ of the point $x \in X$ is defined by
\[ B_\varepsilon(x) = \{\, y \in X \mid d(x,y) < \varepsilon \,\}. \]
Based on this definition, open and closed sets and continuous functions between metric spaces can be defined. A subset of a metric space is called compact if it has the Heine-Borel covering property.
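The triangle-inequality argument in Example 3 can be spot-checked numerically. The following sketch (assuming Python; a random sample, of course, is evidence rather than proof) tests symmetry and the triangle inequality of the bounded metric on many random triples:

```python
import random

def d(x, y):
    # bounded metric from Example 3: d(x, y) = |x - y| / (1 + |x - y|)
    return abs(x - y) / (1.0 + abs(x - y))

random.seed(0)
for _ in range(10_000):
    x, y, z = (random.uniform(-100, 100) for _ in range(3))
    assert d(x, y) == d(y, x)                        # symmetry
    assert d(x, y) <= d(x, z) + d(z, y) + 1e-12      # triangle inequality

print("metric axioms hold on all samples")
```

Note that $d$ is bounded by $1$, so $(\mathbb{R}, d)$ has finite diameter although $|x - y|$ is unbounded.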


Definition 5.6 Let $(X,d)$ be a metric space.

(i) A sequence $\{x_n\}_{n=1}^\infty$ with $x_n \in X$ is said to converge, if there exists $x \in X$ such that to every $\varepsilon > 0$ there is $n_0 \in \mathbb{N}$ with
\[ d(x_n, x) < \varepsilon \]
for all $n \ge n_0$. The element $x$ is called the limit of $\{x_n\}_{n=1}^\infty$.

(ii) A sequence $\{x_n\}_{n=1}^\infty$ with $x_n \in X$ is said to be a Cauchy sequence, if to every $\varepsilon > 0$ there is $n_0$ such that for all $n, k \ge n_0$
\[ d(x_n, x_k) < \varepsilon. \]

Every converging sequence is a Cauchy sequence, but the converse is not necessarily true.

Definition 5.7 A metric space $(X,d)$ with the property that every Cauchy sequence converges is called a complete metric space.

Definition 5.8 Let $(X,d)$ be a metric space. A mapping $T : X \to X$ is said to be a contraction, if there is a number $\vartheta$ with $0 \le \vartheta < 1$ such that for all $x, y \in X$
\[ d(Tx, Ty) \le \vartheta\, d(x,y). \]

Theorem 5.9 (Banach fixed point theorem) Let $(X,d)$ be a complete metric space and let $T : X \to X$ be a contraction. Then $T$ possesses exactly one fixed point $x$, i.e. there is exactly one $x \in X$ such that
\[ Tx = x. \]
For arbitrary $x_0 \in X$ define the sequence $\{x_k\}_{k=1}^\infty$ by
\[ x_1 = Tx_0, \qquad x_{k+1} = Tx_k. \]
Then
\[ d(x, x_k) \le \frac{\vartheta^k}{1-\vartheta}\, d(x_1, x_0), \]
hence
\[ \lim_{k\to\infty} x_k = x. \]
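The iteration in Theorem 5.9 is directly executable, including the a priori error bound $d(x, x_k) \le \frac{\vartheta^k}{1-\vartheta}\, d(x_1, x_0)$ as a stopping criterion. A minimal sketch (assuming Python; the choice of $T = \cos$, which is a contraction on $[-1,1]$ with $\vartheta = \sin 1 < 1$, is an illustration not taken from the text):

```python
import math

def fixed_point(T, x0, theta, tol=1e-12, max_iter=10_000):
    """Banach iteration x_{k+1} = T(x_k), stopped when the a priori bound
    theta**k / (1 - theta) * d(x_1, x_0) guarantees accuracy tol."""
    x_prev, x = x0, T(x0)
    d10 = abs(x - x_prev)          # d(x_1, x_0)
    k = 1
    while theta**k / (1 - theta) * d10 > tol and k < max_iter:
        x = T(x)
        k += 1
    return x, k

# cos maps [-1, 1] into itself and |cos'| <= sin(1) < 1 there.
x, k = fixed_point(math.cos, 1.0, math.sin(1.0))
print(x)                           # ~0.7390851332 (fixed point of cos)
assert abs(math.cos(x) - x) < 1e-10
```

The bound is "a priori" because the required number of iterations $k$ can be computed from $\vartheta$ and $d(x_1, x_0)$ alone, before any further iterates are known.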


Proof: First we show that $T$ can have at most one fixed point. Let $x, y \in X$ be fixed points, hence $Tx = x$, $Ty = y$. Then
\[ d(x,y) = d(Tx, Ty) \le \vartheta\, d(x,y), \]
which implies $(1 - \vartheta)\, d(x,y) \le 0$, whence $d(x,y) = 0$, and so $x = y$.

Next we show that a fixed point exists. Let $\{x_k\}_{k=1}^\infty$ be the sequence defined above. Then for $k \ge 1$
\[ d(x_{k+1}, x_k) = d(Tx_k, Tx_{k-1}) \le \vartheta\, d(x_k, x_{k-1}). \]
The triangle inequality yields
\[ d(x_{k+l}, x_k) \le d(x_{k+l}, x_{k+l-1}) + d(x_{k+l-1}, x_{k+l-2}) + \dots + d(x_{k+1}, x_k), \]
thus
\[ d(x_{k+l}, x_k) \le (\vartheta^{l-1} + \vartheta^{l-2} + \dots + \vartheta + 1)\, d(x_{k+1}, x_k) \le \frac{1 - \vartheta^l}{1 - \vartheta}\, \vartheta^k\, d(x_1, x_0) \le \frac{\vartheta^k}{1 - \vartheta}\, d(x_1, x_0). \tag{$*$} \]
Since $\lim_{k\to\infty} \vartheta^k = 0$, it follows from this estimate that $\{x_k\}_{k=1}^\infty$ is a Cauchy sequence. Since the space $X$ is complete, the sequence has a limit $x$. For this limit we obtain
\[ d(Tx, x) = \lim_{k\to\infty} d(Tx, x) \le \lim_{k\to\infty} \bigl[ d(Tx, Tx_k) + d(Tx_k, x_{k+1}) + d(x_{k+1}, x) \bigr] \le \lim_{k\to\infty} \bigl[ \vartheta\, d(x, x_k) + d(x_{k+1}, x_{k+1}) + d(x_{k+1}, x) \bigr] = 0, \]
hence $Tx = x$, which shows that $x$ is the uniquely determined fixed point. Moreover, $(*)$ yields
\[ d(x, x_k) = \lim_{l\to\infty} d(x, x_k) \le \lim_{l\to\infty} \bigl[ d(x, x_{k+l}) + d(x_{k+l}, x_k) \bigr] \le \frac{\vartheta^k}{1 - \vartheta}\, d(x_1, x_0). \]

5.3 Local invertibility

Since $f'(a)$ is a linear approximation to $f$ in a neighborhood of $a$, one can ask whether invertibility of $f'(a)$ (i.e. $\det f'(a) \neq 0$) already suffices to conclude that $f$ is one-to-one in a


neighborhood of $a$. The following example shows that in general this is not true:

Example: Let $f : (-1,1) \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} x + 3x^2 \sin\frac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases} \]
$f$ is differentiable for all $|x| < 1$ with derivative
\[ f'(x) = \begin{cases} 1 + 6x \sin\frac{1}{x} - 3\cos\frac{1}{x}, & x \neq 0 \\ 1, & x = 0. \end{cases} \]
In every neighborhood of $0$ there are infinitely many intervals contained in $(0,\infty)$ on which $f'$ is continuous and has negative values. Thus, in such an interval one can find $0 < x_1 < x_2$ with $f(x_1) > f(x_2) > 0$. On the other hand, since $f$ is continuous and satisfies $f(0) = 0$, the intermediate value theorem implies that the interval $(0, x_1)$ contains a point $x_3$ with $f(x_3) = f(x_2)$. Hence $f$ is one-to-one in no neighborhood of $0$.

Since $f'(0) = 1$ and since in every neighborhood of $0$ there are points $x$ with $f'(x) < 0$, it follows that $f'$ is not continuous at $0$. Requiring that $f'$ be continuous changes the situation:

Theorem 5.10 Let $U \subseteq \mathbb{R}^n$ be open, let $a \in U$, let $f : U \to \mathbb{R}^n$ be continuously differentiable, and assume that the derivative $f'(a)$ is invertible. Let $b = f(a)$. Then there is a neighborhood $V$ of $a$ and a neighborhood $W$ of $b$ such that $f|_V : V \to W$ is bijective with a continuously differentiable inverse $g : W \to V$. (Clearly, $g'(y) = [f'(g(y))]^{-1}$.)

Proof: We first assume that $a = 0$, $f(0) = 0$, hence $b = 0$, and $f'(0) = I$, where $I : \mathbb{R}^n \to \mathbb{R}^n$ is the identity mapping. It suffices to show that there is an open neighborhood $W$ of $0$ and a neighborhood $W'$ of $0$ such that every $y \in W$ has a unique inverse image under $f$ in $W'$. Since $f$ is continuous, it follows that $f^{-1}(W)$ is open, hence $V = f^{-1}(W) \cap W'$ is a neighborhood of $0$, and $f : V \to W$ is invertible.

To construct $W$, we define for $y \in \mathbb{R}^n$ the mapping $\Phi_y : U \to \mathbb{R}^n$ by
\[ \Phi_y(x) = x - f(x) + y. \]
Every fixed point $x$ of this mapping is an inverse image of $y$ under $f$.
We choose $W = U_r(0)$ and show that if $r > 0$ is chosen sufficiently small, then for every $y \in U_r(0)$ the mapping $\Phi_y$ has a unique fixed point in the closed ball $W' = \overline{U}_{2r}(0)$.


This is guaranteed by the Banach fixed point theorem, if we can show that $\Phi_y$ maps $\overline{U}_{2r}(0)$ into itself and is a contraction on $\overline{U}_{2r}(0)$.

Note first that the continuity of $f'$ implies that there is $r > 0$ such that for all $x \in \overline{U}_{2r}(0)$, in the operator norm,
\[ \|I - f'(x)\| = \|f'(0) - f'(x)\| \le \tfrac{1}{2}, \]
whence
\[ \|\Phi_y'(x)\| = \|I - f'(x)\| \le \tfrac{1}{2}. \]
For $x \in \overline{U}_{2r}(0)$ the line segment connecting this point to $0$ is contained in $\overline{U}_{2r}(0)$, hence Corollary 4.14 yields for such $x$
\[ \|x - f(x)\| = \|\Phi_y(x) - \Phi_y(0)\| \le \tfrac{1}{2}\|x\| \le r. \]
Thus, for $y \in U_r(0)$ and $x \in \overline{U}_{2r}(0)$,
\[ \|\Phi_y(x)\| = \|x - f(x) + y\| \le \|x - f(x)\| + \|y\| \le 2r. \]
Consequently, $\Phi_y$ maps $\overline{U}_{2r}(0)$ into itself. To prove that $\Phi_y : \overline{U}_{2r}(0) \to \overline{U}_{2r}(0)$ is a contraction for every $y \in U_r(0)$, we use Corollary 4.14 again. Since for $x, z \in \overline{U}_{2r}(0)$ the line segment connecting these points is also contained in $\overline{U}_{2r}(0)$, it follows that
\[ \|\Phi_y(x) - \Phi_y(z)\| \le \tfrac{1}{2}\|x - z\|. \]
Consequently, for every $y \in U_r(0)$ the mapping $\Phi_y$ is a contraction on the complete metric space $\overline{U}_{2r}(0)$, whence it has a unique fixed point $x \in \overline{U}_{2r}(0)$. Since $x$ is an inverse image of $y$ under $f$, a local inverse $g : W \to V$ of $f$ is defined by
\[ g(y) = x. \]
We must show that $g$ is continuously differentiable. Note first that if $x_1$ is a fixed point of $\Phi_{y_1}$ and $x_2$ is a fixed point of $\Phi_{y_2}$, then
\[ \|x_1 - x_2\| = \|\Phi_{y_1}(x_1) - \Phi_{y_2}(x_2)\| \le \|\Phi_0(x_1) - \Phi_0(x_2)\| + \|y_1 - y_2\| \le \tfrac{1}{2}\|x_1 - x_2\| + \|y_1 - y_2\|, \]
which implies
\[ \|g(y_1) - g(y_2)\| = \|x_1 - x_2\| \le 2\|y_1 - y_2\|. \]


Hence, $g$ is continuous. To verify that $g$ is differentiable, note that $\det f'(x) \neq 0$ for all $x$ in a neighborhood of $0$, hence $f'(x)$ is invertible for all $x$ in this neighborhood. To see this, remember that $f'$ is continuous. By Theorem 4.20 this implies that the partial derivatives of $f$, which form the coefficients of the matrix $f'(x)$, depend continuously on $x$. Because $\det f'(x)$ consists of sums of products of these coefficients, it is a continuous function of $x$; hence it differs from zero in a neighborhood of $0$, since $\det f'(0) = 1$. In this neighborhood $f'(x)$ is invertible. Therefore, since the inverse $g$ is continuous, Theorem 4.12 implies that $g$ is differentiable. Finally, from the formula
\[ g'(y) = \bigl[ f'\bigl( g(y) \bigr) \bigr]^{-1} \]
it follows that $g'$ is continuous. Here we use that the coefficients of the inverse $(f'(x))^{-1}$ are determined via determinants (Cramer's rule), and thus depend continuously on the coefficients of $f'(x)$.

To prove the theorem for a function $f$ with the properties stated in the theorem, consider the two affine invertible mappings $A, B : \mathbb{R}^n \to \mathbb{R}^n$ defined by
\[ Ax = x + a, \qquad By = \bigl( f'(a) \bigr)^{-1}(y - b). \]
Then $H = B \circ f \circ A$ is defined in the open set $U - a = \{ x - a \mid x \in U \}$ containing $0$, $H(0) = (f'(a))^{-1}(f(a) - b) = 0$, and
\[ H'(0) = B' f'(a) A' = \bigl( f'(a) \bigr)^{-1} f'(a) = I. \]
The preceding considerations show that neighborhoods $V', W'$ of $0$ exist such that $H : V' \to W'$ is invertible. Since $f = B^{-1} \circ H \circ A^{-1}$, it thus follows that $f$ has the local inverse
\[ g = A \circ H^{-1} \circ B : W \to V \]
with the neighborhoods $W = B^{-1}(W')$ of $b$ and $V = A(V')$ of $a$. The local inverse $H^{-1}$ is continuously differentiable, hence also $g$ is continuously differentiable.

Example: Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by
\[ f_1(x_1, x_2, x_3) = x_1 + x_2 + x_3, \]
\[ f_2(x_1, x_2, x_3) = x_2 x_3 + x_3 x_1 + x_1 x_2, \]
\[ f_3(x_1, x_2, x_3) = x_1 x_2 x_3. \]


Since all partial derivatives exist and are continuous, it follows that $f$ is continuously differentiable with
\[ f'(x) = \begin{pmatrix} 1 & 1 & 1 \\ x_3 + x_2 & x_3 + x_1 & x_2 + x_1 \\ x_2 x_3 & x_1 x_3 & x_1 x_2 \end{pmatrix}, \]
hence (subtracting the first column from the second and third columns)
\[ \det f'(x) = \begin{vmatrix} 1 & 0 & 0 \\ x_3 + x_2 & x_1 - x_2 & x_1 - x_3 \\ x_2 x_3 & (x_1 - x_2)x_3 & (x_1 - x_3)x_2 \end{vmatrix} = (x_1 - x_2)(x_1 - x_3)x_2 - (x_1 - x_2)(x_1 - x_3)x_3 = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3). \]
Thus, let $b = f(a)$ with $(a_1 - a_2)(a_1 - a_3)(a_2 - a_3) \neq 0$. Then there are neighborhoods $V$ of $a$ and $W$ of $b$ such that the system of equations
\[ y_1 = x_1 + x_2 + x_3, \qquad y_2 = x_2 x_3 + x_3 x_1 + x_1 x_2, \qquad y_3 = x_1 x_2 x_3 \]
has a unique solution $x \in V$ for every $y \in W$.

We remark that local invertibility does not imply global invertibility, as the following example shows: Let $f : \{ (x,y) \in \mathbb{R}^2 \mid y > 0 \} \to \mathbb{R}^2$ be defined by
\[ f_1(x,y) = y \cos x, \qquad f_2(x,y) = y \sin x. \]
$f$ is continuously differentiable with
\[ \det f'(x,y) = \begin{vmatrix} -y \sin x & \cos x \\ y \cos x & \sin x \end{vmatrix} = -y \sin^2 x - y \cos^2 x = -y \neq 0 \]
for all $(x,y)$ in the domain of definition. Consequently $f$ is locally invertible at every point. Yet $f$ is not globally invertible, since $f$ is $2\pi$-periodic with respect to the variable $x$.
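The first example above can be explored numerically. The following sketch (assuming Python with NumPy; the test point $a = (1,2,4)$ and the Newton iteration are illustrative choices, not part of the text) checks the determinant formula and recovers the local inverse $g$ by Newton's method, which is practical here because the proof of Theorem 5.10 is not constructive in a directly computable way:

```python
import numpy as np

def f(x):
    x1, x2, x3 = x
    return np.array([x1 + x2 + x3,
                     x2*x3 + x3*x1 + x1*x2,
                     x1*x2*x3])

def jacobian(x):
    x1, x2, x3 = x
    return np.array([[1.0,     1.0,     1.0    ],
                     [x3 + x2, x3 + x1, x2 + x1],
                     [x2*x3,   x1*x3,   x1*x2  ]])

a = np.array([1.0, 2.0, 4.0])            # pairwise distinct coordinates
# det f'(a) = (a1 - a2)(a1 - a3)(a2 - a3) = (-1)(-3)(-2) = -6
assert np.isclose(np.linalg.det(jacobian(a)), -6.0)

# Newton's method solves f(x) = b, i.e. evaluates the local inverse at b.
b = f(a)
x = a + 0.1                              # start near a
for _ in range(20):
    x = x - np.linalg.solve(jacobian(x), f(x) - b)
print(x)                                 # expected to converge back to a
```

Note that $b$ has six preimages (the permutations of $a$); Newton's method started near $a$ selects the branch through $a$, exactly as the local inverse of Theorem 5.10 does.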


5.4 Implicit functions

Let a function $f : \mathbb{R}^{n+m} \to \mathbb{R}^n$ with components $f_1, \dots, f_n$ be given, and let $y = (y_1, \dots, y_m)$ be given. Can one determine $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ such that the equations
\[ f_1(x_1, \dots, x_n, y_1, \dots, y_m) = 0, \]
\[ \vdots \]
\[ f_n(x_1, \dots, x_n, y_1, \dots, y_m) = 0 \]
hold? These are $n$ equations for the $n$ unknowns $x_1, \dots, x_n$. First we study the situation for a linear function $f = A : \mathbb{R}^{n+m} \to \mathbb{R}^n$,
\[ A(x,y) = \begin{pmatrix} A_1(x,y) \\ \vdots \\ A_n(x,y) \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + \dots + a_{1n}x_n + b_{11}y_1 + \dots + b_{1m}y_m \\ \vdots \\ a_{n1}x_1 + \dots + a_{nn}x_n + b_{n1}y_1 + \dots + b_{nm}y_m \end{pmatrix}. \]
Suppose that $A$ has the property
\[ A(h, 0) = 0 \;\Rightarrow\; h = 0. \]
$A$ has this property if and only if the matrix
\[ \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{pmatrix} = \begin{pmatrix} \frac{\partial A_1}{\partial x_1} & \dots & \frac{\partial A_1}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial A_n}{\partial x_1} & \dots & \frac{\partial A_n}{\partial x_n} \end{pmatrix} \]
is invertible, hence if and only if
\[ \det\Bigl( \frac{\partial A_j}{\partial x_i} \Bigr)_{i,j=1,\dots,n} \neq 0. \]
Under this condition the mapping
\[ h \mapsto Ch := A(h, 0) : \mathbb{R}^n \to \mathbb{R}^n \]
is invertible; consequently the system of equations
\[ A(h,k) = A(h,0) + A(0,k) = Ch + A(0,k) = 0 \]
has for every $k \in \mathbb{R}^m$ the unique solution
\[ h = \varphi(k) := -C^{-1} A(0,k). \]
For $\varphi : \mathbb{R}^m \to \mathbb{R}^n$ one has
\[ A\bigl( \varphi(k), k \bigr) = 0 \]


for all $k \in \mathbb{R}^m$. One says that the function $\varphi$ is implicitly given by this equation.

The theorem about implicit functions concerns the same situation for continuously differentiable functions $f$ which are not necessarily linear:

Theorem 5.11 (about implicit functions) Let $D \subseteq \mathbb{R}^{n+m}$ be open and let $f : D \to \mathbb{R}^n$ be continuously differentiable. Suppose that there are $a \in \mathbb{R}^n$, $b \in \mathbb{R}^m$ with $(a,b) \in D$, such that $f(a,b) = 0$ and
\[ \det\begin{pmatrix} \frac{\partial f_1}{\partial x_1}(a,b) & \dots & \frac{\partial f_1}{\partial x_n}(a,b) \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1}(a,b) & \dots & \frac{\partial f_n}{\partial x_n}(a,b) \end{pmatrix} \neq 0. \tag{$*$} \]
Then there is a neighborhood $U \subseteq \mathbb{R}^m$ of $b$ and a uniquely determined continuously differentiable function $\varphi : U \to \mathbb{R}^n$ such that $\varphi(b) = a$ and for all $y \in U$
\[ f\bigl( \varphi(y), y \bigr) = 0. \]

Proof: Consider the mapping $F : D \to \mathbb{R}^{n+m}$,
\[ F(x,y) = \bigl( f(x,y), y \bigr) \in \mathbb{R}^{n+m}. \]
Then
\[ F(a,b) = \bigl( f(a,b), b \bigr) = (0, b). \]
Since $f$ is continuously differentiable, all the partial derivatives of $F$ exist and are continuous in $D$, hence $F$ is continuously differentiable in $D$. The derivative $F'(a,b)$ is given by
\[ F'(a,b) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n} & \frac{\partial f_1}{\partial y_1} & \dots & \frac{\partial f_1}{\partial y_m} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1} & \dots & \frac{\partial f_n}{\partial x_n} & \frac{\partial f_n}{\partial y_1} & \dots & \frac{\partial f_n}{\partial y_m} \\ 0 & \dots & 0 & 1 & & 0 \\ \vdots & & \vdots & & \ddots & \\ 0 & \dots & 0 & 0 & & 1 \end{pmatrix}, \]
where the partial derivatives are evaluated at $(a,b)$. Thus, for $h \in \mathbb{R}^n$, $k \in \mathbb{R}^m$,
\[ F'(a,b)(h,k) = \bigl( f'(a,b)(h,k), k \bigr). \]
This linear mapping is invertible. For, if
\[ F'(a,b)(h,k) = \bigl( f'(a,b)(h,k), k \bigr) = 0, \]


then $k = 0$, therefore $f'(a,b)(h,0) = 0$, which together with $(*)$ yields $h = 0$. Consequently the null space of the linear mapping $F'(a,b)$ is $\{0\}$, hence $F'(a,b)$ is invertible.

Therefore the assumptions of Theorem 5.10 are satisfied, and it follows that there are neighborhoods $V$ of $(a,b)$ and $W$ of $(0,b)$ in $\mathbb{R}^{n+m}$ such that
\[ F|_V : V \to W \]
is invertible. The inverse $F^{-1} : W \to V$ is of the form
\[ F^{-1}(z,w) = \bigl( \phi(z,w), w \bigr), \]
with a continuously differentiable function $\phi : W \to \mathbb{R}^n$. Now set
\[ U = \{ w \in \mathbb{R}^m \mid (0,w) \in W \} \subseteq \mathbb{R}^m \]
and define $\varphi : U \to \mathbb{R}^n$ by
\[ \varphi(w) = \phi(0,w). \]
$U$ is a neighborhood of $b$ since $W$ is a neighborhood of $(0,b)$, and for all $w \in U$
\[ (0,w) = F\bigl( F^{-1}(0,w) \bigr) = F\bigl( \phi(0,w), w \bigr) = F\bigl( \varphi(w), w \bigr) = \bigl( f\bigl( \varphi(w), w \bigr), w \bigr), \]
whence
\[ f\bigl( \varphi(w), w \bigr) = 0. \]
The derivative of the function $\varphi$ can be computed using the chain rule. For the derivative $\frac{d}{dy} f\bigl( \varphi(y), y \bigr)$ of the function $y \mapsto f\bigl( \varphi(y), y \bigr)$ we obtain
\[ 0 = \frac{d}{dy} f\bigl( \varphi(y), y \bigr) = \Bigl( \frac{\partial}{\partial x} f, \frac{\partial}{\partial y} f \Bigr)\bigl( \varphi(y), y \bigr) \begin{pmatrix} \varphi'(y) \\ \mathrm{id}_{\mathbb{R}^m} \end{pmatrix} = \Bigl( \frac{\partial}{\partial x} f \Bigr)\bigl( \varphi(y), y \bigr) \circ \varphi'(y) + \Bigl( \frac{\partial}{\partial y} f \Bigr)\bigl( \varphi(y), y \bigr). \]
Thus,
\[ \varphi'(y) = -\Bigl[ \Bigl( \frac{\partial}{\partial x} f \Bigr)\bigl( \varphi(y), y \bigr) \Bigr]^{-1} \circ \Bigl( \frac{\partial}{\partial y} f \Bigr)\bigl( \varphi(y), y \bigr). \]
Here we have set
\[ \frac{\partial}{\partial x} f(x,y) = \Bigl( \frac{\partial f_j}{\partial x_i}(x,y) \Bigr)_{i,j=1,\dots,n}, \qquad \frac{\partial}{\partial y} f(x,y) = \Bigl( \frac{\partial f_j}{\partial y_i}(x,y) \Bigr)_{j=1,\dots,n,\; i=1,\dots,m}. \]


Examples: 1.) Let an equation
\[ f(x_1, \dots, x_n) = 0 \]
be given with continuously differentiable $f : \mathbb{R}^n \to \mathbb{R}$. To given $x_1, \dots, x_{n-1}$ we seek $x_n$ such that this equation is satisfied, i.e. we want to solve this equation for $x_n$. Assume that $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ is given such that
\[ f(a_1, \dots, a_n) = 0 \]
and
\[ \frac{\partial f}{\partial x_n}(a_1, \dots, a_n) \neq 0. \]
Then the implicit function theorem implies that there is a neighborhood $U \subseteq \mathbb{R}^{n-1}$ of $(a_1, \dots, a_{n-1})$ such that to every $(x_1, \dots, x_{n-1}) \in U$ a unique $x_n = \varphi(x_1, \dots, x_{n-1})$ can be found which solves the equation
\[ f(x_1, \dots, x_{n-1}, x_n) = 0, \]
is a continuously differentiable function of $(x_1, \dots, x_{n-1})$, and satisfies $x_n = a_n$ for $(x_1, \dots, x_{n-1}) = (a_1, \dots, a_{n-1})$. For the derivative of the function $\varphi$ one obtains
\[ \operatorname{grad} \varphi(x_1, \dots, x_{n-1}) = \frac{-1}{\frac{\partial f}{\partial x_n}(x_1, \dots, x_n)} \operatorname{grad}_{n-1} f(x_1, \dots, x_n) = \frac{-1}{\frac{\partial f}{\partial x_n}} \begin{pmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_{n-1}} \end{pmatrix}, \]
where $x_n = \varphi(x_1, \dots, x_{n-1})$.

2.) Let $f : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by
\[ f_1(x,y,z) = 3x^2 + xy - z - 3, \]
\[ f_2(x,y,z) = 2xz + y^3 + xy. \]
We have $f(1,0,0) = 0$. To given $z \in \mathbb{R}$ from a neighborhood of $0$ we seek $(x,y) \in \mathbb{R}^2$ such that $f(x,y,z) = 0$. To this end we must test whether the matrix
\[ \begin{pmatrix} \frac{\partial f_1}{\partial x}(x,y,z) & \frac{\partial f_1}{\partial y}(x,y,z) \\[2pt] \frac{\partial f_2}{\partial x}(x,y,z) & \frac{\partial f_2}{\partial y}(x,y,z) \end{pmatrix} = \begin{pmatrix} 6x + y & x \\ 2z + y & 3y^2 + x \end{pmatrix} \]
is invertible at $(x,y,z) = (1,0,0)$. At this point the determinant of this matrix is
\[ \begin{vmatrix} 6 & 1 \\ 0 & 1 \end{vmatrix} = 6 \neq 0, \]


hence the matrix is invertible. Consequently, a sufficiently small number $\delta > 0$ and a continuously differentiable function $\varphi : (-\delta, \delta) \to \mathbb{R}^2$ with $\varphi(0) = (1,0)$ can be found such that $f\bigl( \varphi_1(z), \varphi_2(z), z \bigr) = 0$ for all $z$ with $|z| < \delta$. For the derivative of $\varphi$ we obtain, with $(x,y) = \varphi(z)$,
\[ \varphi'(z) = -\begin{pmatrix} 6x + y & x \\ 2z + y & 3y^2 + x \end{pmatrix}^{-1} \begin{pmatrix} \frac{\partial f_1}{\partial z} \\[2pt] \frac{\partial f_2}{\partial z} \end{pmatrix} = \frac{-1}{(6x+y)(3y^2+x) - x(2z+y)} \begin{pmatrix} 3y^2 + x & -x \\ -(2z+y) & 6x + y \end{pmatrix} \begin{pmatrix} -1 \\ 2x \end{pmatrix} = \frac{-1}{(6x+y)(3y^2+x) - x(2z+y)} \begin{pmatrix} -3y^2 - x - 2x^2 \\ 2z + y + 12x^2 + 2xy \end{pmatrix}. \]
Since $\varphi(0) = (1,0)$, we obtain in particular
\[ \varphi'(0) = -\frac{1}{6} \begin{pmatrix} -3 \\ 12 \end{pmatrix} = \begin{pmatrix} \tfrac{1}{2} \\ -2 \end{pmatrix}. \]
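The value $\varphi'(0) = (\tfrac12, -2)$ can be cross-checked numerically: solve $f(x,y,z) = 0$ for $(x,y)$ at $z = \pm h$ and form a difference quotient. A sketch (assuming Python with NumPy; the Newton solver `phi` is an illustrative stand-in for the abstract function $\varphi$):

```python
import numpy as np

def f(x, y, z):
    return np.array([3*x**2 + x*y - z - 3,
                     2*x*z + y**3 + x*y])

def jac_xy(x, y, z):
    # partial derivatives with respect to (x, y) only
    return np.array([[6*x + y, x        ],
                     [2*z + y, 3*y**2 + x]])

def phi(z, start=(1.0, 0.0)):
    """Solve f(x, y, z) = 0 for (x, y) by Newton's method near (1, 0)."""
    v = np.array(start)
    for _ in range(50):
        v = v - np.linalg.solve(jac_xy(v[0], v[1], z), f(v[0], v[1], z))
    return v

h = 1e-6
dphi = (phi(h) - phi(-h)) / (2*h)    # central difference quotient
print(dphi)                          # close to (1/2, -2)
```

The agreement of the difference quotient with the closed-form value is a useful sanity check on the sign conventions in the formula $\varphi'(y) = -\bigl[\partial_x f\bigr]^{-1} \partial_y f$.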


6 Integration of functions of several variables

6.1 Definition of the integral

Let $\Omega$ be a bounded subset of $\mathbb{R}^2$ and let $f : \Omega \to \mathbb{R}$ be a real valued function. If $f$ is continuous, then $\operatorname{graph} f$ is a surface in $\mathbb{R}^3$. We want to define the integral
\[ \int_\Omega f(x)\,dx \]
such that its value is equal to the volume of the subset $K$ of $\mathbb{R}^3$ which lies between the graph of $f$ and the $x_1,x_2$-plane. More generally, we want to define integrals for functions defined on subsets of $\mathbb{R}^n$, such that for $n = 2$ the integral has this property.

[Figure: the region $K$ between $\operatorname{graph} f$ and the set $\Omega$ in the $x_1,x_2$-plane.]

Definition 6.1 Let
\[ Q = \{ x \in \mathbb{R}^n \mid a_i \le x_i < b_i,\; i = 1, \dots, n \} \]
be a bounded, half open interval in $\mathbb{R}^n$. A partition $P$ of $Q$ is a cartesian product
\[ P = P_1 \times \dots \times P_n, \]
where $P_i = \{ x_0^{(i)}, \dots, x_{k_i}^{(i)} \}$ is a partition of $[a_i, b_i]$ for every $i = 1, \dots, n$. $Q$ is thereby partitioned into $k = k_1 k_2 \cdots k_n$ half open subintervals $Q_1, \dots, Q_k$ of the form
\[ Q_j = [x_{p_1}^{(1)}, x_{p_1+1}^{(1)}) \times \dots \times [x_{p_n}^{(n)}, x_{p_n+1}^{(n)}). \]
The number
\[ |Q_j| = (x_{p_1+1}^{(1)} - x_{p_1}^{(1)}) \cdots (x_{p_n+1}^{(n)} - x_{p_n}^{(n)}) \]


is called the measure of $Q_j$. For a bounded function $f : Q \to \mathbb{R}$ define
\[ M_j = \sup f(Q_j), \qquad m_j = \inf f(Q_j), \]
\[ U(P,f) = \sum_{j=1}^{k} M_j |Q_j|, \qquad L(P,f) = \sum_{j=1}^{k} m_j |Q_j|. \]
The upper and lower Darboux integrals are
\[ \overline{\int_Q} f\,dx = \inf\{ U(P,f) \mid P \text{ is a partition of } Q \}, \]
\[ \underline{\int_Q} f\,dx = \sup\{ L(P,f) \mid P \text{ is a partition of } Q \}. \]

Definition 6.2 A bounded function $f : Q \to \mathbb{R}$ is called Riemann integrable if the upper and lower Darboux integrals coincide. The common value is denoted by
\[ \int_Q f\,dx \quad \text{or} \quad \int_Q f(x)\,dx \]
and is called the Riemann integral of $f$.

To define the integral on more general domains, let $\Omega \subseteq \mathbb{R}^n$ be a bounded subset and let $f : \Omega \to \mathbb{R}$. Choose a bounded interval $Q$ such that $\Omega \subseteq Q$ and extend $f$ to a function $f_Q : Q \to \mathbb{R}$ by
\[ f_Q(x) = \begin{cases} f(x), & x \in \Omega, \\ 0, & x \in Q \setminus \Omega. \end{cases} \]

Definition 6.3 A bounded function $f : \Omega \to \mathbb{R}$ is called Riemann integrable over $\Omega$ if the extension $f_Q$ is integrable over $Q$. We set
\[ \int_\Omega f(x)\,dx = \int_Q f_Q(x)\,dx. \]

The multi-dimensional integral shares most of its properties with the one-dimensional integral. We do not repeat the proofs, since they are almost the same. Differences arise mainly from the more complicated structure of the domain of integration. Whether a function is integrable over a domain $\Omega$ depends not only on the properties of the function but also on the properties of $\Omega$.

Definition 6.4 A bounded set $\Omega \subseteq \mathbb{R}^n$ is called Jordan measurable if the characteristic function $\chi_\Omega : \mathbb{R}^n \to \mathbb{R}$ defined by
\[ \chi_\Omega(x) = \begin{cases} 1, & x \in \Omega \\ 0, & x \in \mathbb{R}^n \setminus \Omega \end{cases} \]
is integrable. In this case $|\Omega| = \int_\Omega 1\,dx$ is called the Jordan measure of $\Omega$.
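The Darboux sums of $\chi_\Omega$ for a concrete $\Omega$ can be computed directly. A sketch (assuming Python; the unit disk and the uniform partition are illustrative choices, not from the text) squeezes the Jordan measure of the disk, $\pi$, between lower and upper sums:

```python
import math

def darboux_measure_disk(k):
    """Lower and upper Darboux sums of the characteristic function of the
    unit disk on the k x k uniform partition of Q = [-1, 1) x [-1, 1)."""
    h = 2.0 / k
    lower = upper = 0.0
    for i in range(k):
        for j in range(k):
            x0, x1 = -1 + i*h, -1 + (i + 1)*h
            y0, y1 = -1 + j*h, -1 + (j + 1)*h
            # nearest and farthest points of the cell from the origin
            nx = min(max(0.0, x0), x1); ny = min(max(0.0, y0), y1)
            fx = max(abs(x0), abs(x1)); fy = max(abs(y0), abs(y1))
            if fx*fx + fy*fy < 1:        # cell entirely inside: m_j = 1
                lower += h*h
            if nx*nx + ny*ny < 1:        # cell meets the disk: M_j = 1
                upper += h*h
    return lower, upper

lo, up = darboux_measure_disk(400)
print(lo, up)     # both approach pi; only boundary cells separate them
```

As the partition is refined, $U(P, \chi_\Omega) - L(P, \chi_\Omega)$ is carried entirely by the cells meeting the boundary circle, which is the geometric content of Jordan measurability.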


Of course, a bounded interval $Q \subseteq \mathbb{R}^n$ is measurable, and the previously given definition of $|Q|$ coincides with the new definition.

Theorem 6.5 If the compact domain $\Omega \subseteq \mathbb{R}^n$ is Jordan measurable and if $f : \Omega \to \mathbb{R}$ is continuous, then $f$ is integrable.

A proof of this theorem can be found in the book "Lehrbuch der Analysis, Teil 2" by H. Heuser, p. 455.

6.2 Limits of integrals, parameter dependent integrals

Theorem 6.6 Let $\Omega \subseteq \mathbb{R}^n$ be a bounded set and let $\{f_k\}_{k=1}^\infty$ be a sequence of Riemann integrable functions $f_k : \Omega \to \mathbb{R}$ which converges uniformly to a Riemann integrable function $f : \Omega \to \mathbb{R}$. Then
\[ \lim_{k\to\infty} \int_\Omega f_k(x)\,dx = \int_\Omega f(x)\,dx. \]

Remark: It can be shown that the uniform limit $f$ of a sequence of integrable functions is automatically integrable.

Proof: Let $\varepsilon > 0$. Then there is $k_0 \in \mathbb{N}$ such that for all $k \ge k_0$ and all $x \in \Omega$ we have
\[ |f_k(x) - f(x)| < \varepsilon, \]
hence, with $Q$ a bounded interval containing $\Omega$,
\[ \Bigl| \int_\Omega \bigl( f_k(x) - f(x) \bigr)\,dx \Bigr| \le \int_Q |f_k(x) - f(x)|\,dx \le \varepsilon \int_Q dx = \varepsilon |Q|. \]
By definition, this means that $\lim_{k\to\infty} \int_\Omega f_k(x)\,dx = \int_\Omega f(x)\,dx$.

Corollary 6.7 Let $D \subseteq \mathbb{R}^k$ and let $Q \subseteq \mathbb{R}^m$ be a bounded interval. If $f : D \times Q \to \mathbb{R}$ is continuous, then the function $F : D \to \mathbb{R}$ defined by the parameter dependent integral
\[ F(x) = \int_Q f(x,t)\,dt \]
is continuous.

Proof: Let $x_0 \in D$ and let $\{x_k\}_{k=1}^\infty$ be a sequence with $x_k \in D$ and $\lim_{k\to\infty} x_k = x_0$. Then $x_0$ is the only accumulation point of the set $M = \{ x_k \mid k \in \mathbb{N} \} \cup \{x_0\}$, from which it is immediately seen that $M \times Q$ is closed and bounded, hence a compact subset of $D \times Q$. Therefore the continuous function $f$ is uniformly continuous on $M \times Q$. This


implies that to every $\varepsilon > 0$ there is $\delta > 0$ such that for all $y \in M$ with $|y - x_0| < \delta$ and all $t \in Q$ we have
\[ |f(y,t) - f(x_0,t)| < \varepsilon. \]
Choose $k_0 \in \mathbb{N}$ such that $|x_k - x_0| < \delta$ for all $k \ge k_0$. This implies for $k \ge k_0$ and for all $t \in Q$ that
\[ |f(x_k, t) - f(x_0, t)| < \varepsilon, \]
which shows that the sequence $\{f_k\}_{k=1}^\infty$ of continuous functions $f_k : Q \to \mathbb{R}$ defined by $f_k(t) = f(x_k, t)$ converges uniformly to the continuous function $f_\infty(t) = f(x_0, t)$. Theorem 6.6 implies
\[ \lim_{k\to\infty} F(x_k) = \lim_{k\to\infty} \int_Q f(x_k, t)\,dt = \int_Q f(x_0, t)\,dt = F(x_0). \]
Therefore $F$ is continuous.

6.3 The Theorem of Fubini

The computation of integrals by approximation of the integrand by step functions is impractical. For one-dimensional integrals the computation is greatly simplified by the Fundamental Theorem of Calculus. In this section we show that multidimensional integrals can be computed as iterated one-dimensional integrals, which also makes the computation of these integrals practical.

We first consider integrals of step functions. Let
\[ Q = \{ x \in \mathbb{R}^n \mid a_i \le x_i < b_i,\; i = 1, \dots, n \}, \]
\[ Q' = \{ x' \in \mathbb{R}^{n-1} \mid a_i \le x_i < b_i,\; i = 1, \dots, n-1 \} \]
be half open intervals. If
\[ P = P_1 \times P_2 \times \dots \times P_n \]
is a partition of $Q$, then $P' = P_1 \times \dots \times P_{n-1}$ is a partition of $Q'$. Let $Q_1', \dots, Q_k'$ be the subintervals of $Q'$ generated by $P'$ and let $I_1, \dots, I_{k'} \subseteq [a_n, b_n)$ be the half open subintervals generated by $P_n$. Then all the subintervals of $Q$ generated by $P$ are given by
\[ Q_j' \times I_l, \qquad 1 \le j \le k, \quad 1 \le l \le k'. \]
For the characteristic functions $\chi_{Q_j' \times I_l}$ and the measures $|Q_j' \times I_l|$ we have
\[ \chi_{Q_j' \times I_l}(x) = \chi_{Q_j'}(x')\, \chi_{I_l}(x_n) \quad \text{and} \quad |Q_j' \times I_l| = |Q_j'|\,|I_l|. \]


Let $s : Q \to \mathbb{R}$ be a step function of the form
\[ s(x) = \sum_{\substack{j=1,\dots,k \\ l=1,\dots,k'}} r_{jl}\, \chi_{Q_j' \times I_l}(x) = \sum_{l=1}^{k'} \Bigl( \sum_{j=1}^{k} r_{jl}\, \chi_{Q_j'}(x') \Bigr) \chi_{I_l}(x_n), \]
with given numbers $r_{jl} \in \mathbb{R}$. The last equality shows that for every fixed $x_n \in [a_n, b_n)$ the function $x' \mapsto s(x', x_n)$ is a step function on $Q'$ with integral
\[ \int_{Q'} s(x', x_n)\,dx' = \sum_{l=1}^{k'} \Bigl( \sum_{j=1}^{k} r_{jl}\, |Q_j'| \Bigr) \chi_{I_l}(x_n), \]
and this formula shows that $x_n \mapsto \int_{Q'} s(x', x_n)\,dx'$ is a step function on $[a_n, b_n)$. For the integral of this step function over the interval $[a_n, b_n)$ we thus find
\[ \int_{a_n}^{b_n} \int_{Q'} s(x', x_n)\,dx'\,dx_n = \sum_{\substack{j=1,\dots,k \\ l=1,\dots,k'}} r_{jl}\, |Q_j'|\,|I_l| = \sum_{\substack{j=1,\dots,k \\ l=1,\dots,k'}} r_{jl}\, |Q_j' \times I_l| = \int_Q s(x)\,dx. \]
For step functions the $n$-dimensional integral $\int_Q s(x)\,dx$ can thus be computed as an iterated integral. This is also true for continuous functions:

Theorem 6.8 (Guido Fubini, 1879-1943) Let
\[ Q = \{ x \in \mathbb{R}^n \mid a_i \le x_i < b_i,\; i = 1, \dots, n \}, \]
\[ Q' = \{ x' \in \mathbb{R}^{n-1} \mid a_i \le x_i < b_i,\; i = 1, \dots, n-1 \}. \]
Then for every continuous function $f : Q \to \mathbb{R}$ the function $F : [a_n, b_n] \to \mathbb{R}$ defined by
\[ F(x_n) = \int_{Q'} f(x', x_n)\,dx' \]
is integrable and
\[ \int_Q f(x)\,dx = \int_{a_n}^{b_n} F(x_n)\,dx_n = \int_{a_n}^{b_n} \int_{Q'} f(x', x_n)\,dx'\,dx_n. \tag{6.1} \]

Proof: By Corollary 6.7 the function $F$ is continuous, whence it is integrable. To verify (6.1) we approximate $f$ by step functions. Choose a sequence of partitions $\{P^{(l)}\}_{l=1}^\infty$ of $Q$ such that
\[ \sup_{j=1,\dots,j_l} \operatorname{diam}(Q_j^{(l)}) \le \frac{1}{l}, \tag{6.2} \]


where $Q_1^{(l)}, \dots, Q_{j_l}^{(l)}$ are the subintervals of $Q$ generated by the partition $P^{(l)}$. Choose $x_j^{(l)} \in Q_j^{(l)}$ and define step functions $s_l : Q \to \mathbb{R}$ by
\[ s_l(x) = \sum_{j=1}^{j_l} f(x_j^{(l)})\, \chi_{Q_j^{(l)}}(x). \]
The sequence $\{s_l\}_{l=1}^\infty$ converges uniformly to $f$. To verify this, note that the continuous function $f$ is uniformly continuous on the compact set $Q$. It thus follows that to given $\varepsilon > 0$ there is $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ for all $x, y \in Q$ satisfying $|x - y| < \delta$. Choose $l_0$ with $1/l_0 < \delta$. For every $l \ge l_0$ and every $x \in Q$ there is exactly one number $j$ such that $x \in Q_j^{(l)}$. From (6.2) we thus conclude that $|x - x_j^{(l)}| \le \operatorname{diam}(Q_j^{(l)}) \le 1/l \le 1/l_0 < \delta$, hence
\[ |f(x) - s_l(x)| = |f(x) - f(x_j^{(l)})| < \varepsilon. \]
This inequality shows that $\{s_l\}_{l=1}^\infty$ indeed converges uniformly to $f$, since $l_0$ is independent of $x \in Q$. Therefore Theorem 6.6 can be applied. We find that
\[ \lim_{l\to\infty} \int_Q s_l(x)\,dx = \int_Q f(x)\,dx. \tag{6.3} \]
Moreover, for the step function $S_l : [a_n, b_n] \to \mathbb{R}$ defined by
\[ S_l(x_n) = \int_{Q'} s_l(x', x_n)\,dx' \]
it follows that
\[ |F(x_n) - S_l(x_n)| \le \int_{Q'} |f(x', x_n) - s_l(x', x_n)|\,dx' \le \sup_{y \in Q} |f(y) - s_l(y)|\; |Q'|. \]
The right hand side is independent of $x_n$ and converges to zero for $l \to \infty$, hence $\{S_l\}_{l=1}^\infty$ converges to $F$ uniformly on $[a_n, b_n]$. Consequently, Theorem 6.6 implies
\[ \lim_{l\to\infty} \int_{a_n}^{b_n} S_l(x_n)\,dx_n = \int_{a_n}^{b_n} F(x_n)\,dx_n. \]
Since (6.1) holds for step functions, it follows from this equation and from (6.3) that
\[ \int_{a_n}^{b_n} F(x_n)\,dx_n = \lim_{l\to\infty} \int_{a_n}^{b_n} S_l(x_n)\,dx_n = \lim_{l\to\infty} \int_{a_n}^{b_n} \int_{Q'} s_l(x', x_n)\,dx'\,dx_n = \lim_{l\to\infty} \int_Q s_l(x)\,dx = \int_Q f(x)\,dx. \]

Remarks: By repeated application of this theorem we obtain
\[ \int_Q f(x)\,dx = \int_{a_n}^{b_n} \dots \int_{a_1}^{b_1} f(x_1, \dots, x_n)\,dx_1 \dots dx_n. \]
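Iterated integration is exactly what one does in numerical practice. A minimal sketch (assuming Python with NumPy; the integrand $f(x_1,x_2) = x_1 x_2^2$ on $Q = [0,1) \times [0,2)$ is an illustrative choice, with exact integral $\tfrac12 \cdot \tfrac83 = \tfrac43$) evaluates the iterated midpoint Riemann sums in both orders:

```python
import numpy as np

# f(x1, x2) = x1 * x2**2 on Q = [0, 1) x [0, 2); exact integral = 4/3.
n = 1000
x1 = (np.arange(n) + 0.5) / n            # midpoints in [0, 1)
x2 = 2.0 * (np.arange(n) + 0.5) / n      # midpoints in [0, 2)
h1, h2 = 1.0 / n, 2.0 / n

f = np.outer(x1, x2**2)                  # f sampled on the midpoint grid

# Integrate over x1 first, then x2 -- and in the opposite order.
order_12 = (f.sum(axis=0) * h1).sum() * h2
order_21 = (f.sum(axis=1) * h2).sum() * h1

print(order_12, order_21)                # both close to 4/3
```

That the two orders give the same value is the discrete shadow of the remark below: the order of integration in the iterated integral is irrelevant.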


It is obvious from the proof that in the Theorem of Fubini the coordinate $x_n$ can be replaced by any other coordinate. Therefore the order of integration in the iterated integral can be changed to any other order.

The Theorem of Fubini holds not only for continuous functions, but for any integrable function. In the general case both the formulation of the theorem and the proof are more complicated.

6.4 The transformation formula

The transformation formula generalizes the rule of substitution for one-dimensional integrals. We start with some preparations.

Definition 6.9 (i) Let $f : \mathbb{R}^n \to \mathbb{R}$ be continuous. The support of $f$ is defined by
\[ \operatorname{supp} f = \overline{\{ x \in \mathbb{R}^n \mid f(x) \neq 0 \}}. \]
(ii) Let $D \subseteq \mathbb{R}^n$ and let $\{U_i\}_{i=1}^\infty$ be an open covering of $D$. For every $i \in \mathbb{N}$ let $\varphi_i : \mathbb{R}^n \to \mathbb{R}$ be a continuous function with compact support contained in $U_i$, such that
\[ \sum_{i=1}^{\infty} \varphi_i(x) = 1 \quad \text{for all } x \in D. \]
Then $\{\varphi_i\}_{i=1}^\infty$ is called a partition of unity on $D$ subordinate to the covering $\{U_i\}_{i=1}^\infty$.

Theorem 6.10 Let $D \subseteq \mathbb{R}^n$ be a compact set and let $B_{r_1}(z_1), \dots, B_{r_m}(z_m)$ be open balls in $\mathbb{R}^n$ with $D \subseteq B_{r_1}(z_1) \cup \dots \cup B_{r_m}(z_m)$. Then there is a partition of unity $\{\varphi_i\}_{i=1}^m$ on $D$ subordinate to the covering $\{B_{r_i}(z_i)\}_{i=1}^m$.

Proof: Let $C = \mathbb{R}^n \setminus \bigcup_{i=1}^m B_{r_i}(z_i)$. The distance $\operatorname{dist}(D, C) = \inf\{ |x - y| \mid x \in D,\ y \in C \}$ is positive. Otherwise there would be sequences $\{x_j\}_{j=1}^\infty$, $\{y_j\}_{j=1}^\infty$, $x_j \in D$, $y_j \in C$, such that $\lim_{j\to\infty} |x_j - y_j| = 0$. Since $D$ is compact, $\{x_j\}_{j=1}^\infty$ would have an accumulation point $x_0 \in D$. Since $x_0$ would also be an accumulation point of $\{y_j\}_{j=1}^\infty$ and since $C$ is closed, it would follow that $x_0 \in C$, hence $D \cap C \neq \emptyset$, which contradicts the assumptions.

Therefore we can choose balls $B_i' = B_{r_i'}(z_i)$, $i = 1, \dots, m$, with $r_i' < r_i$, such that $D \subseteq \bigcup_{i=1}^m B_i'$. For $1 \le i \le m$, let $\beta_i$ be a continuous function on $\mathbb{R}^n$ with support in $B_{r_i}(z_i)$, such that $\beta_i(z) = 1$ for $z \in B_i'$. Put $\varphi_1 = \beta_1$ and set
\[ \varphi_j = (1 - \beta_1)(1 - \beta_2) \cdots (1 - \beta_{j-1})\, \beta_j, \qquad 2 \le j \le m. \]


Every $\varphi_j$ is a continuous function. By induction one obtains that for $1 \le l \le m$,
\[ \varphi_1 + \dots + \varphi_l = 1 - (1 - \beta_l)(1 - \beta_{l-1}) \cdots (1 - \beta_1). \]
Every $x \in D$ belongs to at least one $B_i'$, hence $1 - \beta_i(x) = 0$. For $l = m$ the product on the right hand side thus vanishes on $D$, so that $\sum_{i=1}^m \varphi_i(x) = 1$ for all $x \in D$.

Theorem 6.11 Let $U \subseteq \mathbb{R}^n$ be an open set with $0 \in U$ and let $T : U \to \mathbb{R}^n$ be continuously differentiable with $T(0) = 0$ and invertible derivative $T'(0) : \mathbb{R}^n \to \mathbb{R}^n$. Then there is a number $j \in \{1, \dots, n\}$ and there are neighborhoods $V, W$ of $0$ in $\mathbb{R}^n$ such that the decomposition
\[ T(x) = h\bigl( g(Bx) \bigr) \]
is valid for all $x \in V$, where the linear operator $B : \mathbb{R}^n \to \mathbb{R}^n$ merely interchanges the $x_j$- and $x_n$-coordinates, and where the functions $h : W \to \mathbb{R}^n$, $g : B(V) \to W$ are of the form
\[ g(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ g_n(x) \end{pmatrix}, \qquad h(x) = \begin{pmatrix} h_1(x) \\ \vdots \\ h_{n-1}(x) \\ x_n \end{pmatrix}, \tag{6.4} \]
and are continuously differentiable with $\det h' \neq 0$ in $W$ and $\det g' \neq 0$ in $B(V)$.

Proof: The last row of the Jacobi matrix $T'(0) = \bigl( \frac{\partial T_i}{\partial x_k}(0) \bigr)_{i,k=1,\dots,n}$ contains at least one non-zero element $\frac{\partial T_n}{\partial x_j}(0)$, since otherwise $T'(0)$ would not be invertible. Let $B$ be the mapping which interchanges the $x_j$- and $x_n$-coordinates, where $j$ is the index of this non-zero element, and let $B(U) \subseteq \mathbb{R}^n$ be the image of the set $U$ under $B$. We have $B(U) = B^{-1}(U)$, since $B^{-1} = B$. Define
\[ \tilde{g}(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ T_n(x_1, \dots, x_{j-1}, x_n, x_{j+1}, \dots, x_{n-1}, x_j) \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_{n-1} \\ T_n(Bx) \end{pmatrix}. \tag{6.5} \]
Then $\tilde{g} : B(U) \to \mathbb{R}^n$ is continuously differentiable with $\tilde{g}(0) = 0$ and
\[ \tilde{g}'(x) = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ \frac{\partial T_n}{\partial x_1} & \cdots & & \frac{\partial T_n}{\partial x_j} \end{pmatrix}, \]


whence det ˜g'(0) = ∂T_n/∂x_j(0) ≠ 0. Consequently the Inverse Function Theorem 5.10 implies that there are neighborhoods Ṽ ⊆ B(U) and W of 0 such that the restriction g : Ṽ → W of ˜g to Ṽ is one-to-one and such that the inverse g^{−1} : W → Ṽ is continuously differentiable with nonvanishing determinants det g' and det(g^{−1})'. Of course, we have g^{−1}(0) = 0. Now set h = T ∘ B ∘ g^{−1}. Since B ∘ g^{−1}(W) = B(Ṽ) ⊆ B(B(U)) = U, it follows that h is defined on W. Moreover, h is continuously differentiable and satisfies h(0) = 0. Since ˜g|_Ṽ = g, the definition of h and the definition of ˜g in (6.5) yield for y ∈ W that

    h_n(y) = (T_n ∘ B)(g^{−1}(y)) = ˜g_n(g^{−1}(y)) = g_n(g^{−1}(y)) = ((g ∘ g^{−1})(y))_n = y_n.

This equation and (6.5) show that h and g have the form required in (6.4). Set V = B(Ṽ). Then h ∘ g ∘ B : V ⊆ U → R^n, and we have

    h ∘ g ∘ B = T ∘ B ∘ g^{−1} ∘ g ∘ B = T ∘ B ∘ B = T,

which is the decomposition of T required in the theorem. The chain rule yields

    h' = (T ∘ B ∘ g^{−1})' = (T' ∘ (B ∘ g^{−1})) B (g^{−1})',

whence det h' = (det T' ∘ (B ∘ g^{−1})) det B det(g^{−1})'. We have det B = ±1. Moreover, det(g^{−1})' does not vanish by construction. Thus, because det T'(0) ≠ 0 and because det T' is continuous, we can reduce the sizes of V and W, if necessary, such that det h'(x) ≠ 0 for all x ∈ W.

With this theorem we can prove the transformation rule, which generalizes the rule of substitution:

Theorem 6.12 (Transformation rule) Let U ⊆ R^n be open and let T : U → R^n be a continuously differentiable transformation such that |det T'(x)| > 0 for all x ∈ U. Suppose that Ω is a compact Jordan-measurable subset of U and that f : T(Ω) → R is continuous. Then T(Ω) is a Jordan-measurable subset of R^n, the function f is integrable over T(Ω), and

    ∫_{T(Ω)} f(y) dy = ∫_Ω f(T(x)) |det T'(x)| dx.   (6.6)

Proof: For simplicity we prove this theorem only in the special case when Ω is connected and when f : R^n → R is a continuous function with supp f ⊆ T(Ω).
In this case f is defined outside of T(Ω) and vanishes there. Moreover, f ∘ T is defined in U with support contained in Ω. We can therefore extend f ∘ T by 0 to a continuous function on R^n. Hence we can extend the domain of integration on both sides of (6.6) to R^n.


Consider first the case n = 1. By assumption Ω is compact and connected, hence Ω is an interval [a, b]. Since det T'(x) = T'(x) vanishes nowhere, T'(x) is either everywhere positive in [a, b] or everywhere negative. In the first case we have T(a) < T(b), in the second case T(b) < T(a). If we take the plus sign in the first case and the minus sign in the second case, we obtain from the rule of substitution

    ∫_{T([a,b])} f(y) dy = ± ∫_{T(a)}^{T(b)} f(y) dy = ± ∫_a^b f(T(x)) T'(x) dx
                         = ∫_a^b f(T(x)) |T'(x)| dx = ∫_a^b f(T(x)) |det T'(x)| dx.

Therefore (6.6) holds for n = 1. Assume next that n ≥ 2 and that (6.6) holds for n − 1. We shall prove that this implies that (6.6) holds for n, from which the statement of the theorem follows by induction.

Assume first that the transformation is of the special form T(x) = T(x', x_n) = (x', T_n(x', x_n)). Then the Theorem of Fubini yields

    ∫_{R^n} f(y) dy = ∫_{R^{n−1}} ∫_R f(y', y_n) dy_n dy'
                    = ∫_{R^{n−1}} ∫_R f(x', T_n(x', x_n)) |∂/∂x_n T_n(x', x_n)| dx_n dx'
                    = ∫_{R^{n−1}} ∫_R f(T(x)) |det T'(x)| dx_n dx' = ∫_{R^n} f(T(x)) |det T'(x)| dx,

since det T'(x) = ∂/∂x_n T_n(x). The transformation formula thus holds in this case. Next, assume that the transformation is of the special form T(x) = (˜T(x', x_n), x_n) with ˜T(x', x_n) ∈ R^{n−1}. With the Jacobi matrix ∂_{x'}˜T(x) = (∂˜T_i/∂x_j(x))_{i,j=1,...,n−1} we have, expanding along the last row,

    det T'(x) = det ( ∂_{x'}˜T(x)   ∂_{x_n}˜T(x)
                      0             1            ) = det(∂_{x'}˜T(x)).

Since by assumption the transformation rule holds for n − 1, we thus have

    ∫_{R^n} f(y) dy = ∫_R ∫_{R^{n−1}} f(y', y_n) dy' dy_n
                    = ∫_R ∫_{R^{n−1}} f(˜T(x', x_n), x_n) |det ∂_{x'}˜T(x', x_n)| dx' dx_n
                    = ∫_{R^n} f(T(x)) |det T'(x)| dx.

The transformation formula (6.6) therefore holds also in this case. It also holds when the transformation T is a linear operator B, which merely interchanges coordinates, since this


amounts to a change of the order of integration when the integral is computed iteratively, and by the Theorem of Fubini the order of integration does not matter.

If (6.6) holds for the transformations R and S, then it also holds for the transformation T = R ∘ S. For,

    ∫_{R^n} f(z) dz = ∫_{R^n} f(R(y)) |det R'(y)| dy
                    = ∫_{R^n} f(R(S(x))) |det R'(S(x))| |det S'(x)| dx
                    = ∫_{R^n} f(T(x)) |det(R'(S(x)) S'(x))| dx = ∫_{R^n} f(T(x)) |det T'(x)| dx,

since by the determinant multiplication theorem for n × n matrices M_1 and M_2 we have det M_1 det M_2 = det(M_1 M_2).

If T has the properties stated in the theorem and if y ∈ U, then the transformation ˆT(z) := T(z + y) − T(y) satisfies all assumptions of Theorem 6.11, since ˆT(0) = 0. It follows by this theorem that there is a neighborhood ˆV of 0 in which ˆT has the decomposition ˆT = h ∘ g ∘ B with elementary transformations h, g and B, for which we showed above that (6.6) holds. For x from the neighborhood V = y + ˆV of y we thus have the decomposition

    T(x) = T(y) + ˆT(x − y) = T(y) + (h ∘ g ∘ B)(x − y).

Since (6.6) holds for the transformations which merely consist in subtraction of y or addition of T(y), (6.6) holds for every one of the elementary transformations on the right-hand side of this equation, whence it holds for the composition T of these elementary transformations. We thus proved that each point y ∈ U has a neighborhood V(y) such that (6.6) holds for all continuous f for which supp(f ∘ T) ⊆ V(y).

Since det T'(y) ≠ 0, the inverse function theorem implies that T is locally a diffeomorphism. Therefore T(V(y)) contains an open neighborhood of T(y). If supp f is a subset of this neighborhood, we have supp(f ∘ T) ⊆ V(y), whence (6.6) holds for all such f. We conclude that each point z ∈ T(Ω) has a neighborhood W(z), which we can choose to be an open ball, such that (6.6) holds for all continuous f whose support lies in W(z).

Since T(Ω) is compact, there are points z_1, ...
, z_p in T(Ω) such that the union of the open balls W(z_i) covers T(Ω). By Theorem 6.10 there is a partition of unity {ϕ_i}_{i=1}^p on T(Ω) subordinate to the covering {W(z_i)}_{i=1}^p. If f is a continuous function with supp f ⊆ T(Ω), we thus have for every x ∈ R^n

    f(x) = f(x) ∑_{i=1}^p ϕ_i(x) = ∑_{i=1}^p (ϕ_i(x) f(x)).


Since supp(ϕ_i f) ⊆ supp ϕ_i ⊆ W(z_i), the transformation equation (6.6) holds for every ϕ_i f, whence it holds for the sum of these functions, which is f.
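As a sanity check of the transformation formula (6.6) just proved (a numerical sketch, not part of the notes), one can take for T the polar-coordinate map on a quarter annulus, where |det T'(r, t)| = r, and compare a midpoint-rule approximation of the right-hand side of (6.6) with the exactly computable left-hand side. All step sizes and the test integrand below are illustrative choices.

```python
import numpy as np

# T(r, t) = (r cos t, r sin t) maps Omega = [0.5, 1] x [0, pi/2] onto a
# quarter annulus; |det T'(r, t)| = r
def T(r, t):
    return r * np.cos(t), r * np.sin(t)

f = lambda y1, y2: y1**2 + y2**2          # integrand on T(Omega)

# right-hand side of (6.6): midpoint rule for f(T(x)) |det T'(x)| over Omega
n = 1000
dr, dt = 0.5 / n, (np.pi / 2) / n
r = 0.5 + (np.arange(n) + 0.5) * dr       # midpoints in r
t = (np.arange(n) + 0.5) * dt             # midpoints in t
R, Th = np.meshgrid(r, t, indexing="ij")
y1, y2 = T(R, Th)
rhs = np.sum(f(y1, y2) * R) * dr * dt

# left-hand side, computed exactly: int over the quarter annulus of |y|^2
# = (pi/2) * int_{0.5}^{1} r^3 dr = (pi/2) * (1 - 0.5**4) / 4
lhs = (np.pi / 2) * (1 - 0.5**4) / 4
assert abs(rhs - lhs) < 1e-4
```

Here the exact value of the left-hand side is itself obtained by the very formula being checked, so the test is a consistency check of the numerics against the closed-form evaluation, not an independent proof.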


7 p-dimensional surfaces in R^m, curve and surface integrals, Theorems of Gauß and Stokes

7.1 p-dimensional patches of a surface, submanifolds

Let L(R^n, R^m) be the vector space of all linear mappings from R^n to R^m. For A ∈ L(R^n, R^m) the range A(R^n) is a linear subspace of R^m.

Definition 7.1 Let A ∈ L(R^n, R^m). The dimension of the subspace A(R^n) is called the rank of A.

From linear algebra we know that a linear mapping A : R^p → R^n with rank p is injective.

Definition 7.2 Let U ⊆ R^p be an open set and p < n. Let the transformation γ : U → R^n be continuously differentiable and assume that the derivative

    γ'(u) ∈ L(R^p, R^n)

has rank p for all u ∈ U. Then γ is called a parametric representation, or simply a parametrization, of a p-dimensional surface patch in R^n. If p = 1, then γ is called a parametric representation of a curve in R^n.

Note that γ need not be injective. The surface may have double points.

Example 1: Let U = {(u, v) ∈ R^2 | u^2 + v^2 < 1} and let γ : U → R^3 be defined by

    γ(u, v) = (γ_1(u, v), γ_2(u, v), γ_3(u, v)) = (u, v, √(1 − (u^2 + v^2))),

then γ is a parametric representation of the upper half of the unit sphere in R^3. To see this, observe that the two columns of the matrix

    γ'(u, v) = ( 1                         0
                 0                         1
                 −u/√(1 − (u^2 + v^2))     −v/√(1 − (u^2 + v^2)) )

are linearly independent for all (u, v) ∈ U, whence the rank is 2.

Example 2: In the preceding example the patch of a surface is given by the graph of a function. More generally, let U ⊆ R^p be an open set and f : U → R^{n−p} be continuously


differentiable. Then the graph of f is a p-dimensional patch of a surface which is embedded in R^n. The mapping γ : U → R^n,

    γ_1(u) := u_1
    γ_2(u) := u_2
    ⋮
    γ_p(u) := u_p
    γ_{p+1}(u) := f_1(u_1, ..., u_p)
    ⋮
    γ_n(u) := f_{n−p}(u_1, ..., u_p),

is a parametric representation of this surface, since the column vectors of the matrix

    γ'(u) = ( 1                    ...   0
              ⋮                     ⋱    ⋮
              0                    ...   1
              ∂_{u_1} f_1(u)       ...   ∂_{u_p} f_1(u)
              ⋮                          ⋮
              ∂_{u_1} f_{n−p}(u)   ...   ∂_{u_p} f_{n−p}(u) )

are linearly independent. Therefore the rank is p.

Example 3: By stereographic projection, the sphere with center in the origin, which is punctured at the south pole, can be mapped one-to-one onto the plane. The inverse γ of this projection maps the plane onto the punctured sphere:

    [Figure: stereographic projection from the south pole ("Südpol"); the point (u, v) in the plane is mapped to γ(u, v) ∈ R^3 on the sphere.]


From the figure we see that the components γ_1, ..., γ_3 of the mapping γ : R^2 → R^3 satisfy

    γ_1/γ_2 = u/v,   (√(u^2 + v^2) − √(γ_1^2 + γ_2^2)) / γ_3 = √(u^2 + v^2) / (−1),   γ_1^2 + γ_2^2 + γ_3^2 = 1.

Solution of these equations for γ_1, ..., γ_3 yields

    γ_1(u, v) = 2u/(1 + u^2 + v^2),
    γ_2(u, v) = 2v/(1 + u^2 + v^2),
    γ_3(u, v) = (1 − u^2 − v^2)/(1 + u^2 + v^2).

The derivative is

    γ'(u, v) = 2/(1 + u^2 + v^2)^2 · ( 1 − u^2 + v^2   −2uv
                                       −2uv            1 + u^2 − v^2
                                       −2u             −2v ).

Since the scalar factor 2/(1 + u^2 + v^2)^2 is positive, it suffices to check the 2 × 2 minors of the bracketed matrix. For u^2 + v^2 ≠ 1, the minor formed from the first two rows is

    (1 + (v^2 − u^2))(1 − (v^2 − u^2)) − 4u^2 v^2 = 1 − (v^2 − u^2)^2 − 4u^2 v^2 = 1 − (v^2 + u^2)^2 ≠ 0;

for u ≠ 0, the minor formed from the last two rows is

    4uv^2 + 2u(1 + u^2 − v^2) = 2u(1 + u^2 + v^2) ≠ 0;

and correspondingly, for v ≠ 0, the minor formed from the first and third rows is

    −2v(1 + u^2 + v^2) ≠ 0.

These relations show that γ'(u, v) has rank 2 for all (u, v) ∈ R^2, which shows that γ is a parametrization of the unit sphere with the south pole removed.

Example 4: Let ˜γ be the restriction of the parametrization γ from Example 3 to the unit disk U = {(u, v) ∈ R^2 | u^2 + v^2 < 1}. This restriction is a parametrization of the upper half of the unit sphere, which differs from the parametrization of Example 1.
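The computations of Example 3 lend themselves to a numerical spot check (illustrative only): the points γ(u, v) should lie on the unit sphere, and γ'(u, v) should have rank 2 at randomly chosen parameters.

```python
import numpy as np

def gamma(u, v):
    # inverse stereographic projection from the south pole, as in Example 3
    s = 1.0 + u**2 + v**2
    return np.array([2*u/s, 2*v/s, (1 - u**2 - v**2)/s])

def gamma_prime(u, v):
    # the 3 x 2 Jacobi matrix computed in the text
    s = 1.0 + u**2 + v**2
    return (2.0 / s**2) * np.array([[1 - u**2 + v**2, -2*u*v],
                                    [-2*u*v,          1 + u**2 - v**2],
                                    [-2*u,            -2*v]])

rng = np.random.default_rng(0)
for u, v in rng.normal(size=(100, 2)):
    p = gamma(u, v)
    assert abs(np.dot(p, p) - 1.0) < 1e-9                   # gamma(u,v) on the sphere
    assert np.linalg.matrix_rank(gamma_prime(u, v)) == 2    # rank 2 everywhere
```

The rank check is robust here because γ'(u, v)^T γ'(u, v) = (2/(1 + u^2 + v^2))^2 I, so both singular values of γ'(u, v) are bounded away from zero at any fixed parameter point.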


Definition 7.3 Let U, V ⊆ R^p be open sets and let γ : U → R^n, ˜γ : V → R^n be parametrizations of p-dimensional surface patches. γ and ˜γ are called equivalent if there exists a diffeomorphism ϕ : V → U with

    ˜γ = γ ∘ ϕ.

This is an equivalence relation on the set of parametric representations of surface patches.

Example 5: Let γ : U → R^3 be the parametrization of the upper half of the unit sphere in Example 1 and let ˜γ : U → R^3 be the corresponding parametrization in Example 4. These parametrizations are equivalent. For, a diffeomorphism ϕ : U → U is given by

    ϕ(u, v) = ( 2u/(1 + u^2 + v^2), 2v/(1 + u^2 + v^2) ).

We have

    (γ ∘ ϕ)(u, v) = ( 2u/(1 + u^2 + v^2), 2v/(1 + u^2 + v^2), √(1 − (4u^2 + 4v^2)/(1 + u^2 + v^2)^2) )
                  = 1/(1 + u^2 + v^2) · (2u, 2v, 1 − u^2 − v^2) = ˜γ(u, v).

In Example 3 a parametric representation for the punctured sphere is given. However, for topological reasons there exists no parametric representation γ : U → R^3 of the entire sphere. To parametrize the entire sphere we have to split it into at least two parts, which can be parametrized separately. Therefore we define:

Definition 7.4 Let U ⊆ R^p be an open set. A parametrization γ : U → R^n of a p-dimensional surface patch is called simple if γ is injective with continuous inverse γ^{−1}. In this case the range F = γ(U) is called a simple p-dimensional surface patch.

The figure below illustrates this definition with the example of a curve in R^2.


    [Figure: two examples of parametrizations γ : (a, b) → R^2 which are not simple. Left: γ is not injective; the two different parameter values u_1 and u_2 are mapped to the same double point of the curve. Right: γ^{−1} is not continuous; the image of every ball around y contains points whose distance to γ^{−1}(y) is greater than ε = (b − γ^{−1}(y))/2.]

Definition 7.5 A subset M ⊆ R^n is called a p-dimensional submanifold of R^n if for each x ∈ M there exist an open n-dimensional neighborhood V(x) of x and a mapping γ_x with the properties:

(i) V(x) ∩ M is a simple p-dimensional surface patch parametrized by γ_x.

(ii) If x and y are two points in M with

    N = (V(x) ∩ M) ∩ (V(y) ∩ M) ≠ ∅,

then γ_x : γ_x^{−1}(N) → M and γ_y : γ_y^{−1}(N) → M are equivalent parametrizations of N.

The inverse mapping κ_x = γ_x^{−1} : V(x) ∩ M → U ⊆ R^p is called a coordinate mapping or a chart. The set {κ_x | x ∈ M} of charts is called an atlas of M.

Observe that two charts κ_x and κ_y of the atlas of M need not be different. For, if y ∈ M belongs to the domain of definition V(x) ∩ M of the chart κ_x, then κ_y = κ_x is allowed by this definition.

Example 6: Let S = {x ∈ R^3 | |x| = 1} be the unit sphere in R^3. The stereographic projection of Example 3, which maps S \ {(0, 0, −1)} onto U = R^2, is a chart of S; a second chart is given by the stereographic projection of S \ {(0, 0, 1)} from the north pole onto R^2. Therefore the unit sphere is a two-dimensional submanifold of R^3 with an atlas consisting of two charts only.


Definition 7.6 Let M be a p-dimensional submanifold of R^n and x a point in M. If γ is a parametrization of M in a neighborhood of x with x = γ(u), then the range of the linear mapping γ'(u) is a p-dimensional subspace of R^n. This subspace is called the tangent space of M in x, written T_x(M) or simply T_x M.

The definition of T_x(M) is independent of the chosen parametrization. To see this, assume that ˜γ is a parametrization equivalent to γ with x = ˜γ(ũ) and that ϕ is a diffeomorphism with γ = ˜γ ∘ ϕ and ũ = ϕ(u). Then the chain rule gives

    γ'(u) = ˜γ'(ũ) ϕ'(u).

Since ϕ'(u) is an invertible linear mapping, this equation implies that γ'(u) and ˜γ'(ũ) have the same ranges.

7.2 Integration on patches of a surface

Let M ⊆ R^n be a simple p-dimensional surface patch parametrized by γ : U → M. For 1 ≤ i, j ≤ p let the continuous functions g_ij : U → R be defined by

    g_ij(u) = ∂γ/∂u_i (u) · ∂γ/∂u_j (u) = ∑_{k=1}^n ∂γ_k/∂u_i (u) ∂γ_k/∂u_j (u).

Definition 7.7 For u ∈ U let

    G(u) = ( g_11(u)   ...   g_1p(u)
             ⋮                ⋮
             g_p1(u)   ...   g_pp(u) ).

The function g : U → R defined by g(u) := det(G(u)) is called Gram's determinant of the parametrization γ.

To motivate this definition fix u ∈ U. Then

    h ↦ γ(u) + γ'(u) h : R^p → R^n

is the parametrization of a planar surface which is tangential to the surface patch M at the point x = γ(u). The partial derivatives ∂γ/∂u_1 (u), ..., ∂γ/∂u_p (u) are vectors lying in the tangent space T_x M of M at the point x, a p-dimensional linear subspace of R^n, and even generate this vector space, because by assumption the matrix γ'(u) has rank p. The


column vectors ∂γ/∂u_1 (u), ..., ∂γ/∂u_p (u) of this matrix are called tangent vectors of M in γ(u). The set

    P = { ∑_{i=1}^p r_i ∂γ/∂u_i (u) | r_i ∈ R, 0 ≤ r_i ≤ 1 }

is a subset of the tangent space, a parallelotope.

Theorem 7.8 We have g(u) > 0, and √g(u) is equal to the p-dimensional volume of the parallelotope P.

For simplicity we prove this theorem for p = 2 only. In this case P is the parallelogram shown in the figure.

    [Figure: the parallelogram P spanned by ∂γ/∂u_1 (u) and ∂γ/∂u_2 (u), with side lengths a and b, enclosed angle α, and height h.]

With a = |∂γ/∂u_1 (u)| and b = |∂γ/∂u_2 (u)| it follows that

    √g(u) = √det(G(u))
          = √det( ∂γ/∂u_1 (u) · ∂γ/∂u_1 (u)   ∂γ/∂u_1 (u) · ∂γ/∂u_2 (u)
                  ∂γ/∂u_2 (u) · ∂γ/∂u_1 (u)   ∂γ/∂u_2 (u) · ∂γ/∂u_2 (u) )
          = √det( a^2        ab cos α
                  ab cos α   b^2 )
          = √(a^2 b^2 − a^2 b^2 cos^2 α) = ab √(1 − cos^2 α) = ab sin α = a · h = area of P,

where h = b sin α is the height of the parallelogram over the base a.

Definition 7.9 Let f : M → R be a function. f is called integrable over the p-dimensional surface patch M if the function

    u ↦ f(γ(u)) √g(u)

is integrable over U. The integral of f over M is defined by

    ∫_M f(x) dS(x) := ∫_U f(γ(u)) √g(u) du.

dS(x) is called the p-dimensional surface element of M in x. Symbolically one writes

    dS(x) = √g(u) du,  x = γ(u).
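As an illustration of Definition 7.9 (a numerical sketch with illustrative step sizes): for the hemisphere parametrization of Example 1 one computes g(u, v) = 1/(1 − u^2 − v^2), and integrating √g over a disk of radius 0.9 should reproduce the exact area 2π(1 − √0.19) of the corresponding spherical cap.

```python
import numpy as np

# gamma(u, v) = (u, v, sqrt(1 - u^2 - v^2)) gives G with det G = 1/(1 - u^2 - v^2);
# we stay inside radius 0.9, away from the equator, where sqrt(g) blows up
def sqrt_g(u, v):
    return 1.0 / np.sqrt(1.0 - u**2 - v**2)

# midpoint rule for int of sqrt(g) over the disk u^2 + v^2 <= 0.81
n = 1000
h = 1.8 / n
u = -0.9 + (np.arange(n) + 0.5) * h
U, V = np.meshgrid(u, u, indexing="ij")
inside = U**2 + V**2 <= 0.81
area = np.sum(sqrt_g(U[inside], V[inside])) * h * h

# exact area of the spherical cap {x_3 >= sqrt(0.19)}: 2*pi*(1 - sqrt(0.19))
exact = 2 * np.pi * (1 - np.sqrt(0.19))
assert abs(area - exact) < 0.05
```

The tolerance is loose because the jagged grid approximation of the disk boundary dominates the error; refining `n` shrinks it.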


Next we show that the integral is well defined by verifying that the value of the integral ∫_U f(γ(u)) √g(u) du does not change if the parametrization γ is replaced with an equivalent one.

Theorem 7.10 Let U, Ũ ⊆ R^p be open sets, let γ : U → M and ˜γ : Ũ → M be equivalent parametrizations of the surface patch M and let ϕ : Ũ → U be a diffeomorphism with ˜γ = γ ∘ ϕ. The Gram determinants of the parametrizations γ and ˜γ are denoted by g : U → R and ˜g : Ũ → R, respectively. Then we have:

(i) For all u ∈ Ũ

    ˜g(u) = g(ϕ(u)) |det ϕ'(u)|^2.

(ii) If (f ∘ γ) √g is integrable over U, then (f ∘ ˜γ) √˜g is integrable over Ũ with

    ∫_U f(γ(u)) √g(u) du = ∫_Ũ f(˜γ(v)) √˜g(v) dv.

Proof: (i) From

    g_ij(u) = ∑_{k=1}^n ∂γ_k(u)/∂u_i ∂γ_k(u)/∂u_j

we obtain that

    G(u) = [γ'(u)]^T γ'(u).

From the chain rule and the rule of multiplication of determinants we thus conclude

    ˜g = det ˜G = det([˜γ']^T ˜γ')
       = det([(γ' ∘ ϕ) ϕ']^T (γ' ∘ ϕ) ϕ') = det(ϕ'^T [γ' ∘ ϕ]^T [γ' ∘ ϕ] ϕ')
       = (det ϕ') det([γ' ∘ ϕ]^T [γ' ∘ ϕ]) (det ϕ') = (det ϕ')^2 (g ∘ ϕ).

(ii) Using part (i) we obtain from the transformation rule (Theorem 6.12) that

    ∫_U f(γ(u)) √g(u) du = ∫_Ũ f((γ ∘ ϕ)(v)) √g(ϕ(v)) |det ϕ'(v)| dv = ∫_Ũ f(˜γ(v)) √˜g(v) dv.


7.3 Integration on submanifolds

Now the definition of the integral on patches of a surface has to be generalized to submanifolds. I restrict myself to p-dimensional submanifolds M of R^n which can be covered by finitely many simple surface patches V_1, ..., V_m. Thus, assume that M = ⋃_{j=1}^m V_j. For every 1 ≤ j ≤ m let U_j ⊆ R^p be an open set and κ_j : V_j ⊆ M → U_j a chart. The inverse mappings γ_j = κ_j^{−1} : U_j → V_j are simple parametrizations.

Definition 7.11 A family {α_j}_{j=1}^m of functions α_j : M → R is called a partition of unity of locally integrable functions subordinate to the covering {V_j}_{j=1}^m, if

(i) 0 ≤ α_j ≤ 1, α_j |_{M \ V_j} = 0,

(ii) ∑_{j=1}^m α_j(x) = 1, for all x ∈ M,

(iii) the function α_j ∘ γ_j : U_j → R is locally integrable, i.e. for all R > 0 there exists the integral

    ∫_{U_j ∩ {|u| < R}} α_j(γ_j(u)) du.


Theorem 7.13 Let M be a p-dimensional submanifold of R^n and let

    γ_k : U_k → V_k,  k = 1, ..., m,
    ˜γ_j : Ũ_j → Ṽ_j,  j = 1, ..., l,

be simple parametrizations with ⋃_{k=1}^m V_k = ⋃_{j=1}^l Ṽ_j = M. Assume that if

    D_jk = Ṽ_j ∩ V_k ≠ ∅

holds, then

    U_kj = γ_k^{−1}(D_jk),  Ũ_jk = ˜γ_j^{−1}(D_jk)

are Jordan-measurable subsets of R^p and

    γ_k : U_kj → D_jk,  ˜γ_j : Ũ_jk → D_jk

are equivalent parametrizations. Assume that the partitions of unity {α_k}_{k=1}^m and {β_j}_{j=1}^l are subordinate to the coverings {V_k}_{k=1}^m and {Ṽ_j}_{j=1}^l, respectively. Then

    ∑_{k=1}^m ∫_{V_k} α_k(x) f(x) dS(x) = ∑_{j=1}^l ∫_{Ṽ_j} β_j(x) f(x) dS(x).   (7.1)

Proof: First I show that β_j α_k f is integrable over V_k and over Ṽ_j with

    ∫_{Ṽ_j} β_j(x) α_k(x) f(x) dS(x) = ∫_{V_k} β_j(x) α_k(x) f(x) dS(x).   (7.2)

To see this, assume that g_k and ˜g_j are the Gram determinants of γ_k and ˜γ_j, respectively. If the function [(α_k f) ∘ γ_k] √g_k is integrable over U_k, then this function is also integrable over U_kj, since U_kj is a Jordan-measurable subset of U_k. According to Theorem 7.10, [(α_k f) ∘ ˜γ_j] √˜g_j is then integrable over Ũ_jk. By assumption, β_j ∘ ˜γ_j is locally integrable over Ũ_j, therefore this function is also locally integrable over Ũ_jk, because Ũ_jk is a Jordan-measurable subset of Ũ_j. From 0 ≤ β_j ∘ ˜γ_j ≤ 1 we thus conclude that the product

    [(β_j α_k f) ∘ ˜γ_j] √˜g_j = (β_j ∘ ˜γ_j) [(α_k f) ∘ ˜γ_j] √˜g_j

is integrable over Ũ_jk, and by the equivalence of the parametrizations γ_k : U_kj → D_jk and ˜γ_j : Ũ_jk → D_jk it thus follows that

    ∫_{Ũ_jk} [(β_j α_k f) ∘ ˜γ_j] √˜g_j du = ∫_{U_kj} [(β_j α_k f) ∘ γ_k] √g_k du.   (7.3)


From (β_j α_k)(x) = 0 for all x ∈ M \ D_jk we get [(β_j α_k f) ∘ ˜γ_j](u) = 0 for all u ∈ Ũ_j \ Ũ_jk and [(β_j α_k f) ∘ γ_k](u) = 0 for all u ∈ U_k \ U_kj. Therefore the domains of integration in (7.3) can be extended without modification of the values of the integrals. It follows that

    ∫_{Ũ_j} [(β_j α_k f) ∘ ˜γ_j] √˜g_j du = ∫_{U_k} [(β_j α_k f) ∘ γ_k] √g_k du.

Since ˜γ_j : Ũ_j → Ṽ_j and γ_k : U_k → V_k are parametrizations, this means that (7.2) is satisfied. Together with ∑_{j=1}^l β_j(x) = 1 and ∑_{k=1}^m α_k(x) = 1 it follows from (7.2) that

    ∑_{k=1}^m ∫_{V_k} α_k(x) f(x) dS(x) = ∑_{k=1}^m ∑_{j=1}^l ∫_{V_k} β_j(x) α_k(x) f(x) dS(x)
        = ∑_{j=1}^l ∑_{k=1}^m ∫_{V_k} β_j(x) α_k(x) f(x) dS(x)
        = ∑_{j=1}^l ∑_{k=1}^m ∫_{Ṽ_j} α_k(x) β_j(x) f(x) dS(x)
        = ∑_{j=1}^l ∫_{Ṽ_j} β_j(x) f(x) dS(x),

and this is (7.1).

7.4 The Integral Theorem of Gauß

To formulate the Gauß Theorem I need two definitions:

Definition 7.14 (Normal vector)

(i) Let A ⊆ R^n be a compact set. We say that A has a smooth boundary if ∂A is an (n − 1)-dimensional submanifold of R^n.

(ii) Let x ∈ ∂A. If the nonzero vector ν ∈ R^n is orthogonal to all vectors in the tangent space T_x(∂A) of ∂A at x, then ν is called a normal vector of ∂A at x. If |ν| = 1 holds, then ν is a unit normal vector. If ν points to the exterior of A, then ν is called an exterior normal vector.

Definition 7.15 (Divergence) Let U ⊆ R^n be an open set and let f : U → R^n be differentiable. Then the function div f : U → R, defined by

    div f(x) := ∑_{i=1}^n ∂/∂x_i f_i(x),

is called the divergence of f.


Theorem 7.16 (Theorem of Gauß) Let A ⊆ R^n be a compact set with smooth boundary, let U ⊆ R^n be an open set with A ⊆ U and let f : U → R^n be continuously differentiable. ν(x) denotes the exterior unit normal vector to ∂A at x. Then

    ∫_{∂A} ν(x) · f(x) dS(x) = ∫_A div f(x) dx.

For n = 1 the theorem says: Let a, b ∈ R, a < b. Then

    f(b) − f(a) = ∫_a^b d/dx f(x) dx,

and we see that the Theorem of Gauß is the generalization of the fundamental theorem of calculus to R^n.

Example for an application: A body A is submerged in a liquid with specific weight c. The surface of the liquid is given by the plane x_3 = 0. Then the pressure at a point x = (x_1, x_2, x_3) ∈ R^3 with x_3 < 0 is

    −c x_3.

If x ∈ ∂A, then this pressure acts on the body with the force per unit area

    −c x_3 (−ν(x)) = c x_3 ν(x),

where ν(x) is the exterior unit normal vector to ∂A at x. The total force on the body is thus equal to

    K = (K_1, K_2, K_3) = ∫_{∂A} c x_3 ν(x) dS(x).

Application of the Gauß Theorem to the functions f_1, f_2, f_3 : A → R^3 defined by

    f_1(x_1, x_2, x_3) = (x_3, 0, 0),  f_2(x_1, x_2, x_3) = (0, x_3, 0),  f_3(x_1, x_2, x_3) = (0, 0, x_3)

yields for i = 1, 2

    K_i = ∫_{∂A} c x_3 ν_i(x) dS(x) = c ∫_{∂A} ν(x) · f_i(x) dS(x) = c ∫_A (∂/∂x_i) x_3 dx = 0,

and for i = 3

    K_3 = ∫_{∂A} c x_3 ν_3(x) dS(x) = c ∫_{∂A} ν(x) · f_3(x) dS(x) = c ∫_A (∂/∂x_3) x_3 dx = c ∫_A dx = c Vol(A).

K has the direction of the positive x_3-axis. Therefore K is a buoyant force acting on A with the value c Vol(A). This is equal to the weight of the displaced liquid.
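The Gauß theorem can also be spot-checked numerically (an illustrative sketch, not part of the notes) for f(x) = x on the closed unit ball, where div f = 3 and ν(x) · f(x) = |x|^2 = 1 on the unit sphere, so both sides equal 4π. The grid resolutions below are illustrative choices.

```python
import numpy as np

n = 120

# volume side: int_A div f dx = 3 * Vol(A), via counting midpoint cells
h = 2.0 / n
grid = -1.0 + (np.arange(n) + 0.5) * h
X, Y, Z = np.meshgrid(grid, grid, grid, indexing="ij")
vol_side = 3.0 * np.count_nonzero(X**2 + Y**2 + Z**2 <= 1.0) * h**3

# surface side: int_{S^2} 1 dS in spherical coordinates, dS = sin(phi) dphi dtheta
phi = (np.arange(n) + 0.5) * (np.pi / n)          # polar angle midpoints
theta = (np.arange(n) + 0.5) * (2 * np.pi / n)    # azimuth midpoints
P, _ = np.meshgrid(phi, theta, indexing="ij")
surf_side = np.sum(np.sin(P)) * (np.pi / n) * (2 * np.pi / n)

# both sides approximate 4*pi
assert abs(surf_side - 4 * np.pi) < 1e-3
assert abs(vol_side - 4 * np.pi) < 0.05
```

The volume side is the cruder approximation, since the cubical grid only roughly resolves the spherical boundary; the surface side uses an exact parametrization and converges much faster.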


7.5 Green's formulae

Let U ⊆ R^n be an open set, let A ⊆ U be a compact set with smooth boundary, and for x ∈ ∂A let ν(x) be the exterior unit normal to ∂A at x. In the following we write ∇f(x) for differentiable f : U → R to denote the gradient grad f(x) ∈ R^n.

Definition 7.17 Let the function f : U → R be continuously differentiable. Then the normal derivative of f at the point x ∈ ∂A is defined by

    ∂f/∂ν (x) := f'(x) ν(x) = ν(x) · ∇f(x) = ∑_{i=1}^n ∂f(x)/∂x_i ν_i(x).

The normal derivative of f is the directional derivative of f in the direction of ν. For twice differentiable f : U → R set

    Δf(x) := ∑_{i=1}^n ∂^2/∂x_i^2 f(x).

Δ is called the Laplace operator.

Theorem 7.18 For f, g ∈ C^2(U, R) we have

(i) Green's first identity:

    ∫_{∂A} f(x) ∂g/∂ν (x) dS(x) = ∫_A (∇f(x) · ∇g(x) + f(x) Δg(x)) dx.

(ii) Green's second identity:

    ∫_{∂A} ( f(x) ∂g/∂ν (x) − g(x) ∂f/∂ν (x) ) dS(x) = ∫_A (f(x) Δg(x) − g(x) Δf(x)) dx.

Proof: To prove Green's first identity apply the Gauß Theorem to the continuously differentiable function

    f ∇g : U → R^n.

It follows that

    ∫_{∂A} f(x) ∂g/∂ν (x) dS(x) = ∫_{∂A} ν(x) · (f ∇g)(x) dS(x)
        = ∫_A div(f ∇g)(x) dx = ∫_A (∇f(x) · ∇g(x) + f(x) Δg(x)) dx.


To prove Green's second identity use Green's first identity. We obtain

    ∫_{∂A} ( f(x) ∂g/∂ν (x) − g(x) ∂f/∂ν (x) ) dS(x)
        = ∫_A (∇f(x) · ∇g(x) + f(x) Δg(x)) dx − ∫_A (∇f(x) · ∇g(x) + g(x) Δf(x)) dx
        = ∫_A (f(x) Δg(x) − g(x) Δf(x)) dx.

7.6 The Integral Theorem of Stokes

Let U ⊆ R^2 be an open set and let A ⊆ U be a compact set with smooth boundary. Then the boundary ∂A is a continuously differentiable curve. If g : U → R^2 is continuously differentiable, the Theorem of Gauß becomes

    ∫_A ( ∂g_1/∂x_1 (x) + ∂g_2/∂x_2 (x) ) dx = ∫_{∂A} (ν_1(x) g_1(x) + ν_2(x) g_2(x)) ds(x),   (7.4)

with the exterior unit normal vector ν(x) = (ν_1(x), ν_2(x)). If f : U → R^2 is another continuously differentiable function and if we choose for g in (7.4) the function

    g(x) := (f_2(x), −f_1(x)),

then we obtain

    ∫_A ( ∂f_2/∂x_1 (x) − ∂f_1/∂x_2 (x) ) dx = ∫_{∂A} (ν_1(x) f_2(x) − ν_2(x) f_1(x)) ds(x)
        = ∫_{∂A} τ(x) · f(x) ds(x),   (7.5)

where

    τ(x) = (−ν_2(x), ν_1(x)).

τ(x) is a unit vector perpendicular to the normal vector ν(x) and is obtained by rotating ν(x) by 90° in the mathematically positive sense (counterclockwise). Therefore τ(x) is a unit tangent vector to ∂A at x ∈ ∂A. If we define for differentiable f : U → R^2 the rotation of f by

    rot f(x) := ∂f_2/∂x_1 (x) − ∂f_1/∂x_2 (x),

then (7.5) can be written in the form

    ∫_A rot f(x) dx = ∫_{∂A} τ(x) · f(x) ds(x).
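A numerical check of this last identity (illustrative only, with illustrative grid sizes): take A the unit disk and f(x) = (−x_2, x_1), so that rot f = 2 and, on the unit circle, τ(x) · f(x) = 1; both sides then equal 2π.

```python
import numpy as np

# left-hand side: int_A rot f dx = 2 * area(A), by counting midpoint cells
n = 1500
h = 2.0 / n
grid = -1.0 + (np.arange(n) + 0.5) * h
X, Y = np.meshgrid(grid, grid, indexing="ij")
lhs = 2.0 * np.count_nonzero(X**2 + Y**2 <= 1.0) * h**2

# right-hand side: boundary x(s) = (cos s, sin s), tau = (-sin s, cos s),
# so tau . f = sin^2 s + cos^2 s = 1
s = (np.arange(n) + 0.5) * (2 * np.pi / n)
tau_dot_f = (-np.sin(s)) * (-np.sin(s)) + np.cos(s) * np.cos(s)
rhs = np.sum(tau_dot_f) * (2 * np.pi / n)

assert abs(lhs - 2 * np.pi) < 0.02
assert abs(rhs - 2 * np.pi) < 1e-9
```

The boundary integral is essentially exact; the area term carries the grid discretization error.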


This identity, ∫_A rot f(x) dx = ∫_{∂A} τ(x) · f(x) ds(x), is called Stokes' theorem in the plane. Note that A is not assumed to be "simply connected". This means that A can have "holes":

    [Figure: a region A with holes; the outer boundary curve and the boundary curves of the holes, each with unit tangent vector τ(x) and unit normal vector µ(x).]

We can identify the subset A ⊆ R^2 with the planar submanifold A × {0} of R^3 and the integral over A in the Stokes formula with the surface integral over this submanifold. This interpretation suggests that this formula can be generalized and that the Stokes formula is not only valid for planar submanifolds, but also for more general 2-dimensional submanifolds of R^3. As a matter of fact, the Stokes formula is valid for orientable submanifolds of R^3 with boundary. To define these, we need some preparations.

Definition 7.19 Let M ⊆ R^3 be a 2-dimensional submanifold. A unit normal vector field ν of M is a continuous mapping ν : M → R^3 such that every a ∈ M is mapped to a unit normal vector ν(a) to M at a. A 2-dimensional submanifold M of R^3 is called orientable if there exists a unit normal field on M.

Example: The unit sphere M = {x ∈ R^3 | |x| = 1} is orientable. A unit normal field is ν(a) = a/|a|, a ∈ M.

In contrast, the Möbius strip is not orientable:


    [Figure: the Möbius strip ("Möbiusband").]

Definition 7.20 Let V ⊆ R^p be a neighborhood of 0 and U = V ∩ (R^{p−1} × [0, ∞)). A function γ : U → R^n which is continuously differentiable up to the boundary and for which γ'(u) has rank p for all u ∈ U is called a parametrization of a surface patch with boundary. If γ is injective and has a continuous inverse, then γ is called a simple parametrization and F = γ(U) is called a simple p-dimensional surface patch with boundary. The set ∂F = γ(V ∩ (R^{p−1} × {0})) ⊆ R^n is called the boundary of F.

Note that ∂F is a simple (p−1)-dimensional surface patch with parametrization given by u' ↦ γ(u', 0). We generalize Definition 7.5 of a submanifold and call a set M ⊆ R^n a p-dimensional submanifold with boundary if the sets M ∩ V(x) in this definition are simple p-dimensional surface patches with or without boundary ∂(M ∩ V(x)), and if the boundary of M, defined by

    ∂M = ⋃_{x ∈ M} ∂(M ∩ V(x)),

is not empty. ∂M is a (p−1)-dimensional submanifold of R^n.

For all points x of a p-dimensional submanifold M with boundary, including the boundary points, the tangent space T_x M is given by Definition 7.6.

Let M be a two-dimensional orientable submanifold in R^3 with boundary. Then ∂M is a one-dimensional submanifold of R^3, a curve. At x ∈ ∂M the tangent space T_x(∂M) is one-dimensional, the tangent space T_x M is two-dimensional. Therefore T_x M contains exactly one unit vector µ(x) which is normal to T_x(∂M) and points out of M. With a


unit normal vector field ν on M we define a unit tangent vector field τ : ∂M → R^3 by setting

    τ(x) = ν(x) × µ(x),  x ∈ ∂M.

We say that the vector field τ orients ∂M positively with respect to ν.

Definition 7.21 Let U ⊆ R^3 be an open set and f : U → R^3 be a differentiable function. The rotation of f,

    rot f : U → R^3,

is defined by

    rot f(x) := ( ∂f_3/∂x_2 − ∂f_2/∂x_3,
                  ∂f_1/∂x_3 − ∂f_3/∂x_1,
                  ∂f_2/∂x_1 − ∂f_1/∂x_2 ).

Theorem 7.22 (Integral Theorem of Stokes) Let M be a compact two-dimensional orientable submanifold of R^3 with boundary, let ν : M → R^3 be a unit normal vector field and let τ : ∂M → R^3 be a unit tangent vector field which orients ∂M positively with respect to ν. Assume that U ⊆ R^3 is an open set with M ⊆ U and that f : U → R^3 is continuously differentiable. Then

    ∫_M ν(x) · rot f(x) dS(x) = ∫_{∂M} τ(x) · f(x) ds(x).

Example: Let Ω ⊆ R^3 be a domain in R^3. In Ω there exists an electric field E, which depends on the location x ∈ Ω and the time t ∈ R. Thus, E is a vector field

    E : Ω × R → R^3.

The corresponding magnetic induction is a vector field

    B : Ω × R → R^3.

We place a wire loop Γ in Ω. This wire loop is the boundary of a surface M ⊆ Ω:


    [Figure: the domain Ω containing the surface M with boundary wire loop Γ and unit tangent vector τ(x) at x ∈ Γ.]

If B varies in time, then an electric voltage is induced in Γ. We can calculate this voltage as follows: For all (x, t) ∈ Ω × R we have

    rot_x E(x, t) = − ∂/∂t B(x, t).

This is one of the Maxwell equations, which expresses Faraday's law of induction. Therefore it follows from Stokes' Theorem with a unit normal vector field ν : M → R^3 that

    U(t) = ∫_Γ τ(x) · E(x, t) ds(x) = ∫_M ν(x) · rot_x E(x, t) dS(x)
         = − ∫_M ν(x) · ∂/∂t B(x, t) dS(x) = − ∂/∂t ∫_M ν(x) · B(x, t) dS(x).

The integral ∫_M ν(x) · B(x, t) dS(x) is called the flux of the magnetic induction through M. Therefore U(t) is equal to the negative time variation of the flux of B through M.
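Theorem 7.22 itself can be spot-checked numerically (an illustrative sketch, not part of the notes): take M the upper unit hemisphere with ν(x) = x and f(x) = (−x_2, x_1, 0), so that rot f = (0, 0, 2); both integrals then equal 2π.

```python
import numpy as np

n = 1000

# surface side: nu . rot f = 2 x_3 = 2 cos(phi) on the sphere,
# dS = sin(phi) dphi dtheta, phi in (0, pi/2)
phi = (np.arange(n) + 0.5) * (np.pi / 2 / n)
theta = (np.arange(n) + 0.5) * (2 * np.pi / n)
P, _ = np.meshgrid(phi, theta, indexing="ij")
surf = np.sum(2 * np.cos(P) * np.sin(P)) * (np.pi / 2 / n) * (2 * np.pi / n)

# boundary side: x(s) = (cos s, sin s, 0), tau = (-sin s, cos s, 0)
# (positively oriented with respect to nu), so tau . f = sin^2 s + cos^2 s = 1
s = (np.arange(n) + 0.5) * (2 * np.pi / n)
bdry = np.sum(np.sin(s)**2 + np.cos(s)**2) * (2 * np.pi / n)

assert abs(surf - 2 * np.pi) < 1e-4
assert abs(bdry - 2 * np.pi) < 1e-9
```

At the boundary point (1, 0, 0) one checks the orientation directly: ν = (1, 0, 0), µ = (0, 0, −1) points out of M, and τ = ν × µ = (0, 1, 0), which matches the counterclockwise parametrization used above.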


Appendix: German Translation of Section 7


A p-dimensionale Flächen im R^m, Flächenintegrale, Gaußscher und Stokescher Satz

A.1 p-dimensionale Flächenstücke, Untermannigfaltigkeiten

Wie früher bezeichne L(R^n, R^m) den Vektorraum aller linearen Abbildungen von R^n nach R^m. Für A ∈ L(R^n, R^m) ist die Bildmenge A(R^n) ein linearer Unterraum von R^m.

Definition A.1 Sei A ∈ L(R^n, R^m). Als Rang von A bezeichnet man die Dimension des Unterraumes A(R^n).

Aus der Theorie der linearen Abbildungen ist bekannt, dass eine lineare Abbildung A : R^p → R^n mit Rang p injektiv ist.

Definition A.2 Sei U ⊆ R^p eine offene Menge und sei p < n. Die Abbildung γ : U → R^n sei stetig differenzierbar und die Ableitung

    γ'(u) ∈ L(R^p, R^n)

habe für alle u ∈ U den Rang p. Dann heißt γ Parameterdarstellung eines p-dimensionalen Flächenstückes im R^n. Ist p = 1, dann heißt γ Parameterdarstellung einer Kurve im R^n.

Man beachte, dass γ nicht injektiv zu sein braucht. Die Fläche kann "Doppelpunkte" haben.

Beispiel 1: Sei U = {(u, v) ∈ R^2 | u^2 + v^2 < 1} und sei γ : U → R^3 definiert durch

    γ(u, v) = (γ_1(u, v), γ_2(u, v), γ_3(u, v)) = (u, v, √(1 − (u^2 + v^2))).

Dann ist γ die Parameterdarstellung der oberen Hälfte der Einheitssphäre im R^3. Denn es gilt

    γ'(u, v) = ( 1                         0
                 0                         1
                 −u/√(1 − (u^2 + v^2))     −v/√(1 − (u^2 + v^2)) ).

Die beiden Spalten in dieser Matrix sind für alle (u, v) ∈ U linear unabhängig, also ist der Rang 2.

Beispiel 2: Im vorangehenden Beispiel ist das Flächenstück durch den Graphen einer Funktion gegeben. Allgemeiner sei U ⊆ R^p eine offene Menge und sei f : U → R^{n−p} stetig


differenzierbar. Dann ist der Graph von f ein in den R^n eingebettetes p-dimensionales Flächenstück. Die Abbildung γ : U → R^n,

    γ_1(u) := u_1
    γ_2(u) := u_2
    ⋮
    γ_p(u) := u_p
    γ_{p+1}(u) := f_1(u_1, ..., u_p)
    ⋮
    γ_n(u) := f_{n−p}(u_1, ..., u_p),

ist eine Parameterdarstellung dieser Fläche. Denn es gilt

    γ'(u) = ( 1                    ...   0
              ⋮                     ⋱    ⋮
              0                    ...   1
              ∂_{u_1} f_1(u)       ...   ∂_{u_p} f_1(u)
              ⋮                          ⋮
              ∂_{u_1} f_{n−p}(u)   ...   ∂_{u_p} f_{n−p}(u) ),

und alle Spalten dieser Matrix sind linear unabhängig, also ist der Rang p.

Beispiel 3: Durch stereographische Projektion kann die am Südpol gelochte Sphäre mit Mittelpunkt im Ursprung eineindeutig auf die Ebene abgebildet werden, also umgekehrt auch die Ebene auf die gelochte Sphäre:

    [Abbildung: stereographische Projektion vom Südpol; der Punkt (u, v) der Ebene wird auf γ(u, v) ∈ R^3 abgebildet.]

    γ_1/γ_2 = u/v,   (√(u^2 + v^2) − √(γ_1^2 + γ_2^2)) / γ_3 = √(u^2 + v^2) / (−1),   γ_1^2 + γ_2^2 + γ_3^2 = 1.


Aus den in der Abbildung angegebenen, aus den geometrischen Verhältnissen abgeleiteten Gleichungen erhält man für die Abbildung γ : R^2 → R^3 der stereographischen Projektion, dass

    γ_1(u, v) = 2u/(1 + u^2 + v^2),
    γ_2(u, v) = 2v/(1 + u^2 + v^2),
    γ_3(u, v) = (1 − u^2 − v^2)/(1 + u^2 + v^2).

Die Ableitung ist

    γ'(u, v) = 2/(1 + u^2 + v^2)^2 · ( 1 − u^2 + v^2   −2uv
                                       −2uv            1 + u^2 − v^2
                                       −2u             −2v ).

Da der skalare Faktor 2/(1 + u^2 + v^2)^2 positiv ist, genügt es, die 2×2-Minoren der Matrix in Klammern zu betrachten. Für u^2 + v^2 ≠ 1 ist der aus den ersten beiden Zeilen gebildete Minor

    (1 + (v^2 − u^2))(1 − (v^2 − u^2)) − 4u^2 v^2 = 1 − (v^2 − u^2)^2 − 4u^2 v^2 = 1 − (v^2 + u^2)^2 ≠ 0.

Für u ≠ 0 gilt für den aus den letzten beiden Zeilen gebildeten Minor

    4uv^2 + 2u(1 + u^2 − v^2) = 2u(1 + u^2 + v^2) ≠ 0,

und für v ≠ 0 entsprechend für den aus der ersten und dritten Zeile gebildeten Minor

    −2v(1 + u^2 + v^2) ≠ 0,

also hat γ' immer den Rang 2, und somit ist γ eine Parameterdarstellung der Einheitssphäre bei herausgenommenem Südpol.

Beispiel 4: Es sei ˜γ die Einschränkung der Parametrisierung γ aus Beispiel 3 auf die Einheitskreisscheibe U = {(u, v) ∈ R^2 | u^2 + v^2 < 1}. Dies liefert eine Parametrisierung der oberen Hälfte der Einheitssphäre, die sich von der Parametrisierung aus Beispiel 1 unterscheidet.


Definition A.3 Let $U, V \subseteq \mathbb{R}^p$ be open sets and let $\gamma: U \to \mathbb{R}^n$, $\tilde\gamma: V \to \mathbb{R}^n$ be parametrizations of $p$-dimensional surface patches. $\gamma$ and $\tilde\gamma$ are called equivalent if there exists a diffeomorphism $\varphi: V \to U$ with
\[
\tilde\gamma = \gamma \circ \varphi .
\]
This is an equivalence relation among the parametrizations of surface patches.

Example 5: Let $\gamma: U \to \mathbb{R}^3$ be the parametrization of the upper half of the unit sphere from Example 1 and let $\tilde\gamma: U \to \mathbb{R}^3$ be the corresponding parametrization from Example 4. These parametrizations are equivalent, for a diffeomorphism $\varphi: U \to U$ is given by
\[
\varphi(u,v) = \begin{pmatrix} \dfrac{2u}{1+u^2+v^2} \\[1.5ex] \dfrac{2v}{1+u^2+v^2} \end{pmatrix}.
\]
For this diffeomorphism,
\[
(\gamma \circ \varphi)(u,v) =
\begin{pmatrix}
\dfrac{2u}{1+u^2+v^2} \\[1.5ex]
\dfrac{2v}{1+u^2+v^2} \\[1.5ex]
\sqrt{1 - \dfrac{4u^2+4v^2}{(1+u^2+v^2)^2}}
\end{pmatrix}
= \frac{1}{1+u^2+v^2}
\begin{pmatrix}
2u \\ 2v \\ 1-u^2-v^2
\end{pmatrix}
= \tilde\gamma(u,v).
\]
In Example 3 a parametrization of the punctured sphere is given. For the whole sphere, however, there is no parametrization $\gamma: U \to \mathbb{R}^3$, for topological reasons. To parametrize it, the sphere must therefore be split into at least two pieces, which can be parametrized individually. One therefore defines:

Definition A.4 Let $U \subseteq \mathbb{R}^p$ be an open set. A parametrization $\gamma: U \to \mathbb{R}^n$ of a $p$-dimensional surface patch is called simple if $\gamma$ is injective and the inverse map $\gamma^{-1}$ is continuous. In this case the image set $F = \gamma(U)$ is called a simple $p$-dimensional surface patch.

The figure below illustrates this definition with the example of a curve in $\mathbb{R}^2$.

Definition A.5 A subset $M \subseteq \mathbb{R}^n$ is called a $p$-dimensional submanifold of $\mathbb{R}^n$ if for every $x \in M$ there exist an open $n$-dimensional neighborhood $V(x)$ and a map $\gamma_x$ with the following properties:

(i) $V(x) \cap M$ is a simple $p$-dimensional surface patch parametrized by $\gamma_x$.
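The identity $(\gamma\circ\varphi) = \tilde\gamma$ of Example 5 can be checked numerically. The sketch below assumes, consistent with the third component computed above, that the Example 1 parametrization is the graph map $\gamma(u,v) = (u, v, \sqrt{1-u^2-v^2})$:

```python
import math

def phi(u, v):
    # Diffeomorphism of the unit disk from Example 5
    d = 1.0 + u*u + v*v
    return (2*u/d, 2*v/d)

def gamma_graph(u, v):
    # Graph parametrization of the upper hemisphere (assumed form of Example 1)
    return (u, v, math.sqrt(1.0 - u*u - v*v))

def gamma_stereo(u, v):
    # Restricted stereographic parametrization (Example 4)
    d = 1.0 + u*u + v*v
    return (2*u/d, 2*v/d, (1.0 - u*u - v*v)/d)

for (u, v) in [(0.0, 0.0), (0.3, -0.4), (0.7, 0.1)]:
    left = gamma_graph(*phi(u, v))      # (gamma o phi)(u, v)
    right = gamma_stereo(u, v)          # tilde gamma(u, v)
    assert all(abs(a - b) < 1e-12 for a, b in zip(left, right))
```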


[Figure 1 shows two curves $\gamma: U \to \mathbb{R}^2$ with $U = (a,b)$. Left: $\gamma$ is not injective — the two distinct parameter values $u_1, u_2 \in (a,b)$ are mapped to the same double point $y = \gamma(u_1) = \gamma(u_2)$ of the curve. Right: $\gamma^{-1}$ is not continuous — the image of every ball around $y$ contains points whose distance from $\gamma^{-1}(y)$ is greater than $\varepsilon = \frac{1}{2}\bigl(b - \gamma^{-1}(y)\bigr)$.]

Figure 1: Examples of parametrizations $\gamma: (a,b) \to \mathbb{R}^2$ that are not simple.

(ii) If $x$ and $y$ are two points of $M$ with
\[
N = \bigl(V(x) \cap M\bigr) \cap \bigl(V(y) \cap M\bigr) \neq \emptyset,
\]
then $\gamma_x: \gamma_x^{-1}(N) \to M$ and $\gamma_y: \gamma_y^{-1}(N) \to M$ are equivalent parametrizations of $N$.

The inverse map $\kappa_x = \gamma_x^{-1}: V(x) \cap M \to U \subseteq \mathbb{R}^p$ is called a chart of the submanifold $M$. The set $\{\kappa_x \mid x \in M\}$ of charts is called an atlas of $M$. Note that two charts $\kappa_x$ and $\kappa_y$ from the atlas of $M$ need not be distinct: if $y \in M$ belongs to the domain $V(x) \cap M$ of the chart $\kappa_x$, then $\kappa_y = \kappa_x$ is permitted by this definition.

Example 6: Let $S = \{x \in \mathbb{R}^3 \mid |x| = 1\}$ be the unit sphere in $\mathbb{R}^3$. The stereographic projection from Example 3, which maps the set $S \setminus \{(0,0,-1)\}$ onto $U = \mathbb{R}^2$, is a chart of $S$; a second chart is obtained by projecting the set $S \setminus \{(0,0,1)\}$ stereographically from the north pole onto $\mathbb{R}^2$. The unit sphere is therefore a two-dimensional submanifold of $\mathbb{R}^3$ with an atlas consisting of only two charts.

Definition A.6 Let $M$ be a $p$-dimensional submanifold of $\mathbb{R}^n$ and let $x$ be a point of $M$. If $\gamma$ is a parametrization of $M$ in a neighborhood of $x$ with $x = \gamma(u)$, then the range of the linear map $\gamma'(u)$ is a $p$-dimensional subspace of $\mathbb{R}^n$. This range is called the tangent space of $M$ at the point $x$, written $T_x(M)$ or simply $T_xM$.


The definition of $T_x(M)$ does not depend on the chosen parametrization. For if $\tilde\gamma$ is a parametrization equivalent to $\gamma$ with $x = \tilde\gamma(\tilde u)$, and $\varphi$ is a diffeomorphism with $\gamma = \tilde\gamma \circ \varphi$ and $\tilde u = \varphi(u)$, then the chain rule yields
\[
\gamma'(u) = \tilde\gamma'(\tilde u)\,\varphi'(u).
\]
Since $\varphi'(u)$ is an invertible linear map, it follows that $\gamma'(u)$ and $\tilde\gamma'(\tilde u)$ have the same range.

A.2 Integration on surface patches

Let $M \subseteq \mathbb{R}^n$ be a simple $p$-dimensional surface patch parametrized by $\gamma: U \to M$. For $1 \leq i, j \leq p$ define the continuous functions $g_{ij}: U \to \mathbb{R}$ by
\[
g_{ij}(u) = \frac{\partial\gamma}{\partial u_i}(u) \cdot \frac{\partial\gamma}{\partial u_j}(u)
= \begin{pmatrix} \frac{\partial\gamma_1}{\partial u_i}(u) \\ \vdots \\ \frac{\partial\gamma_n}{\partial u_i}(u) \end{pmatrix}
\cdot
\begin{pmatrix} \frac{\partial\gamma_1}{\partial u_j}(u) \\ \vdots \\ \frac{\partial\gamma_n}{\partial u_j}(u) \end{pmatrix}
= \sum_{k=1}^{n} \frac{\partial\gamma_k}{\partial u_i}(u)\,\frac{\partial\gamma_k}{\partial u_j}(u).
\]

Definition A.7 For $u \in U$ let
\[
G(u) = \begin{pmatrix} g_{11}(u) & \dots & g_{1p}(u) \\ \vdots & & \vdots \\ g_{p1}(u) & \dots & g_{pp}(u) \end{pmatrix}.
\]
The function $g: U \to \mathbb{R}$ defined by $g(u) := \det(G(u))$ is called the Gram determinant of the parametrization $\gamma$.

To motivate this definition, fix $u \in U$. Then
\[
h \mapsto \gamma(u) + \gamma'(u)h : \mathbb{R}^p \to \mathbb{R}^n
\]
is a parametrization of a flat surface patch which is tangent to the surface patch $M$ at the point $x = \gamma(u)$. The partial derivatives $\frac{\partial\gamma}{\partial u_1}(u), \dots, \frac{\partial\gamma}{\partial u_p}(u)$ are vectors lying in the tangent space $T_xM$ of $M$ at $x$, a $p$-dimensional linear subspace of $\mathbb{R}^n$, and they even span this subspace, because by assumption the matrix $\gamma'(u)$ has rank $p$. The vectors $\frac{\partial\gamma}{\partial u_1}(u), \dots, \frac{\partial\gamma}{\partial u_p}(u)$ are called tangent vectors of $M$ at the point $x$. The set
\[
P = \Bigl\{\, \sum_{i=1}^{p} r_i \frac{\partial\gamma}{\partial u_i}(u) \;\Big|\; r_i \in \mathbb{R},\ 0 \leq r_i \leq 1 \,\Bigr\}
\]
is a subset of the tangent space, a parallelotope.
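For the stereographic parametrization of Example 3 the Gram matrix can be computed numerically. The sketch below (my addition, not from the notes) approximates the partial derivatives by central finite differences and checks that $G(u,v) = \frac{4}{(1+u^2+v^2)^2}\,I$, hence $g(u,v) = \frac{16}{(1+u^2+v^2)^4}$ — a side computation that is easily verified by hand from the matrix $\gamma'$ given above:

```python
def gamma(u, v):
    # Stereographic parametrization of the punctured unit sphere (Example 3)
    d = 1.0 + u*u + v*v
    return (2*u/d, 2*v/d, (1.0 - u*u - v*v)/d)

def gram(u, v, h=1e-6):
    # Gram matrix g_ij = (d gamma / d u_i) . (d gamma / d u_j),
    # with the partials approximated by central finite differences
    du = [(a - b)/(2*h) for a, b in zip(gamma(u + h, v), gamma(u - h, v))]
    dv = [(a - b)/(2*h) for a, b in zip(gamma(u, v + h), gamma(u, v - h))]
    dot = lambda x, y: sum(a*b for a, b in zip(x, y))
    return [[dot(du, du), dot(du, dv)],
            [dot(dv, du), dot(dv, dv)]]

for (u, v) in [(0.0, 0.0), (0.5, -0.2), (2.0, 1.0)]:
    G = gram(u, v)
    lam = 4.0/(1.0 + u*u + v*v)**2       # expected conformal factor
    assert abs(G[0][0] - lam) < 1e-8
    assert abs(G[1][1] - lam) < 1e-8
    assert abs(G[0][1]) < 1e-8
    # Gram determinant g = det G = 16/(1+u^2+v^2)^4
    g = G[0][0]*G[1][1] - G[0][1]*G[1][0]
    assert abs(g - lam*lam) < 1e-8
```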


Theorem A.8 We have $g(u) > 0$, and $\sqrt{g(u)}$ equals the $p$-dimensional volume of the parallelotope $P$.

For simplicity we prove this theorem only for $p = 2$. In this case $P$ is the parallelogram shown in the figure.

[Figure: the parallelogram $P$ spanned by $\frac{\partial\gamma}{\partial u_1}(u)$ and $\frac{\partial\gamma}{\partial u_2}(u)$, with side lengths $a$ and $b$, enclosed angle $\alpha$, and height $h$.]

With $a = \bigl|\frac{\partial\gamma}{\partial u_1}(u)\bigr|$ and $b = \bigl|\frac{\partial\gamma}{\partial u_2}(u)\bigr|$,
\[
\sqrt{g(u)} = \sqrt{\det(G(u))}
= \sqrt{\begin{vmatrix}
\frac{\partial\gamma}{\partial u_1}(u) \cdot \frac{\partial\gamma}{\partial u_1}(u) & \frac{\partial\gamma}{\partial u_1}(u) \cdot \frac{\partial\gamma}{\partial u_2}(u) \\[0.5ex]
\frac{\partial\gamma}{\partial u_2}(u) \cdot \frac{\partial\gamma}{\partial u_1}(u) & \frac{\partial\gamma}{\partial u_2}(u) \cdot \frac{\partial\gamma}{\partial u_2}(u)
\end{vmatrix}}
= \sqrt{\begin{vmatrix} a^2 & ab\cos\alpha \\ ab\cos\alpha & b^2 \end{vmatrix}}
\]
\[
= \sqrt{a^2b^2 - a^2b^2\cos^2\alpha} = ab\sqrt{1-\cos^2\alpha} = ab\sin\alpha = b \cdot h = \text{area of } P .
\]

Definition A.9 Let $M$ be a $p$-dimensional surface patch and $f: M \to \mathbb{R}$ a function. $f$ is called integrable over $M$ if the function
\[
u \mapsto f(\gamma(u))\sqrt{g(u)}
\]
is integrable over $U$. One then defines the integral of $f$ over $M$ by
\[
\int_M f(x)\,dS(x) := \int_U f(\gamma(u))\sqrt{g(u)}\,du.
\]
One calls $dS(x)$ the $p$-dimensional surface element of $M$ at the point $x$. Symbolically,
\[
dS(x) = \sqrt{g(u)}\,du , \qquad x = \gamma(u) .
\]
Next we show that this definition is meaningful, i.e. that the value of the integral $\int_U f(\gamma(u))\sqrt{g(u)}\,du$ does not change when the parametrization $\gamma$ is replaced by an equivalent one.
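Two quick numerical illustrations (a sketch I added, with assumptions noted in the comments): for $p=2$, $n=3$, the value $\sqrt{g}$ of Theorem A.8 should agree with the length of the cross product of the two tangent vectors, and integrating $\sqrt{g}$ over the unit disk for the Example 4 parametrization should give the area $2\pi$ of the upper hemisphere:

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def sqrt_gram(a, b):
    # sqrt(det G) for the parallelogram spanned by a, b in R^3
    dot = lambda x, y: sum(p*q for p, q in zip(x, y))
    return math.sqrt(dot(a, a)*dot(b, b) - dot(a, b)**2)

a, b = (1.0, 2.0, 0.5), (-0.3, 1.0, 2.0)
# Theorem A.8 for p = 2: sqrt(g) is the area of P, i.e. |a x b|
assert abs(sqrt_gram(a, b) - math.sqrt(sum(c*c for c in cross(a, b)))) < 1e-12

def hemisphere_area(n=400):
    # Midpoint-rule approximation of  int_U sqrt(g) du  for Example 4, using
    # the closed form sqrt(g) = 4/(1+u^2+v^2)^2 of the stereographic
    # parametrization (a side computation, not stated in the notes)
    h = 2.0/n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u = -1.0 + (i + 0.5)*h
            v = -1.0 + (j + 0.5)*h
            if u*u + v*v < 1.0:
                total += 4.0/(1.0 + u*u + v*v)**2 * h * h
    return total

# area of the upper unit hemisphere is 2*pi
assert abs(hemisphere_area() - 2*math.pi) < 0.1
```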


Theorem A.10 Let $U, \tilde U \subseteq \mathbb{R}^p$ be open sets, let $\gamma: U \to M$ and $\tilde\gamma: \tilde U \to M$ be equivalent parametrizations of the surface patch $M$, and let $\varphi: \tilde U \to U$ be a diffeomorphism with $\tilde\gamma = \gamma \circ \varphi$. Denote the Gram determinants of the parametrizations $\gamma$ and $\tilde\gamma$ by $g: U \to \mathbb{R}$ and $\tilde g: \tilde U \to \mathbb{R}$, respectively.

(i) Then
\[
\tilde g(u) = g(\varphi(u))\,|\det\varphi'(u)|^2
\]
for all $u \in \tilde U$.

(ii) If $(f \circ \gamma)\sqrt{g}$ is integrable over $U$, then so is $(f \circ \tilde\gamma)\sqrt{\tilde g}$ over $\tilde U$, and
\[
\int_U f(\gamma(u))\sqrt{g(u)}\,du = \int_{\tilde U} f(\tilde\gamma(v))\sqrt{\tilde g(v)}\,dv .
\]

Proof: (i) We have
\[
g_{ij}(u) = \sum_{k=1}^{n} \frac{\partial\gamma_k(u)}{\partial u_i}\,\frac{\partial\gamma_k(u)}{\partial u_j} ,
\]
hence
\[
G(u) = [\gamma'(u)]^T \gamma'(u) .
\]
By the chain rule and the multiplicativity of the determinant,
\[
\tilde g = \det\tilde G = \det([\tilde\gamma']^T\tilde\gamma')
= \det\bigl([(\gamma'\circ\varphi)\varphi']^T(\gamma'\circ\varphi)\varphi'\bigr)
= \det\bigl(\varphi'^T[\gamma'\circ\varphi]^T[\gamma'\circ\varphi]\varphi'\bigr)
\]
\[
= (\det\varphi')\,\det\bigl([\gamma'\circ\varphi]^T[\gamma'\circ\varphi]\bigr)\,(\det\varphi')
= (\det\varphi')^2\,(g\circ\varphi) .
\]

(ii) By the transformation theorem, $(f\circ\gamma)\sqrt{g}$ is integrable over $U$ if and only if $(f\circ\gamma\circ\varphi)\sqrt{g\circ\varphi}\,|\det\varphi'| = (f\circ\tilde\gamma)\sqrt{\tilde g}$ is integrable over $\tilde U$. Moreover, part (i) of the claim and the transformation theorem yield
\[
\int_U f(\gamma(u))\sqrt{g(u)}\,du
= \int_{\tilde U} f((\gamma\circ\varphi)(v))\sqrt{g(\varphi(v))}\,|\det\varphi'(v)|\,dv
= \int_{\tilde U} f(\tilde\gamma(v))\sqrt{\tilde g(v)}\,dv .
\]

A.3 Integration on submanifolds

The definition of the integral shall now be generalized from surface patches to submanifolds. I restrict myself here to $p$-dimensional submanifolds $M$ of $\mathbb{R}^n$ that can be covered by finitely many simple surface patches $V_1, \dots, V_m$.


Thus $M = \bigcup_{j=1}^{m} V_j$. For each $1 \leq j \leq m$ let $U_j \subseteq \mathbb{R}^p$ be an open set and $\kappa_j: V_j \subseteq M \to U_j$ a chart. The inverse maps $\gamma_j = \kappa_j^{-1}: U_j \to V_j$ are simple parametrizations.

Definition A.11 A family $\{\alpha_j\}_{j=1}^{m}$ of functions $\alpha_j: M \to \mathbb{R}$ is called a partition of unity of locally integrable functions subordinate to the cover $\{V_j\}_{j=1}^{m}$ of $M$ if

(i) $0 \leq \alpha_j \leq 1$, $\alpha_j|_{M \setminus V_j} = 0$,

(ii) $\displaystyle\sum_{j=1}^{m} \alpha_j(x) = 1$ for all $x \in M$,

(iii) $\alpha_j \circ \gamma_j: U_j \to \mathbb{R}$ is locally integrable, i.e. for all $R > 0$ the integral
\[
\int_{U_j \cap \{|u| < R\}} \alpha_j(\gamma_j(u))\,du
\]
exists.


… be simple parametrizations with $\bigcup_{k=1}^{m} V_k = \bigcup_{j=1}^{l} \tilde V_j = M$. If
\[
D_{jk} = \tilde V_j \cap V_k \neq \emptyset,
\]
then let
\[
U_{kj} = \gamma_k^{-1}(D_{jk}) , \qquad \tilde U_{jk} = \tilde\gamma_j^{-1}(D_{jk})
\]
be Jordan-measurable subsets of $\mathbb{R}^p$, and let
\[
\gamma_k: U_{kj} \to D_{jk} , \qquad \tilde\gamma_j: \tilde U_{jk} \to D_{jk}
\]
be equivalent parametrizations.

Let the system of functions $\{\alpha_k\}_{k=1}^{m}$ be a partition of unity subordinate to the cover $\{V_k\}_{k=1}^{m}$, and the system $\{\beta_j\}_{j=1}^{l}$ one subordinate to the cover $\{\tilde V_j\}_{j=1}^{l}$. Then
\[
\sum_{k=1}^{m} \int_{V_k} \alpha_k(x)f(x)\,dS(x) = \sum_{j=1}^{l} \int_{\tilde V_j} \beta_j(x)f(x)\,dS(x) . \tag{A.1}
\]

Proof: First I show that $\beta_j\alpha_k f$ is integrable over $V_k$ as well as over $\tilde V_j$, with
\[
\int_{\tilde V_j} \beta_j(x)\alpha_k(x)f(x)\,dS(x) = \int_{V_k} \beta_j(x)\alpha_k(x)f(x)\,dS(x) . \tag{A.2}
\]
To see this, let $g_k$ and $\tilde g_j$ be the Gram determinants of $\gamma_k$ and $\tilde\gamma_j$, respectively. If the function $[(\alpha_k f)\circ\gamma_k]\sqrt{g_k}$ is integrable over $U_k$, then it is also integrable over $U_{kj}$, because $U_{kj}$ is a Jordan-measurable subset of $U_k$. By Theorem A.10, $[(\alpha_k f)\circ\tilde\gamma_j]\sqrt{\tilde g_j}$ is then integrable over $\tilde U_{jk}$. By assumption $\beta_j\circ\tilde\gamma_j$ is locally integrable over $\tilde U_j$, hence also locally integrable over $\tilde U_{jk}$, because $\tilde U_{jk}$ is a Jordan-measurable subset of $\tilde U_j$. Since $0 \leq \beta_j\circ\tilde\gamma_j \leq 1$, it follows that the product
\[
[(\beta_j\alpha_k f)\circ\tilde\gamma_j]\sqrt{\tilde g_j} = (\beta_j\circ\tilde\gamma_j)\,[(\alpha_k f)\circ\tilde\gamma_j]\sqrt{\tilde g_j}
\]
is integrable over $\tilde U_{jk}$, and because of the equivalence of the parametrizations $\gamma_k: U_{kj} \to D_{jk}$ and $\tilde\gamma_j: \tilde U_{jk} \to D_{jk}$ it follows that
\[
\int_{\tilde U_{jk}} [(\beta_j\alpha_k f)\circ\tilde\gamma_j]\sqrt{\tilde g_j}\,du = \int_{U_{kj}} [(\beta_j\alpha_k f)\circ\gamma_k]\sqrt{g_k}\,du . \tag{A.3}
\]


Since $(\beta_j\alpha_k)(x) = 0$ for all $x \in M \setminus D_{jk}$, we have $[(\beta_j\alpha_k f)\circ\tilde\gamma_j](u) = 0$ for all $u \in \tilde U_j \setminus \tilde U_{jk}$ and $[(\beta_j\alpha_k f)\circ\gamma_k](u) = 0$ for all $u \in U_k \setminus U_{kj}$, so in (A.3) the domains of integration can be enlarged without changing the integrals. This means that
\[
\int_{\tilde U_j} [(\beta_j\alpha_k f)\circ\tilde\gamma_j]\sqrt{\tilde g_j}\,du = \int_{U_k} [(\beta_j\alpha_k f)\circ\gamma_k]\sqrt{g_k}\,du ,
\]
and this equation is equivalent to (A.2), because $\tilde\gamma_j: \tilde U_j \to \tilde V_j$ and $\gamma_k: U_k \to V_k$ are parametrizations.

Together with $\sum_{j=1}^{l} \beta_j(x) = 1$ and $\sum_{k=1}^{m} \alpha_k(x) = 1$, it follows from (A.2) that
\[
\sum_{k=1}^{m} \int_{V_k} \alpha_k(x)f(x)\,dS(x)
= \sum_{k=1}^{m} \int_{V_k} \sum_{j=1}^{l} \beta_j(x)\alpha_k(x)f(x)\,dS(x)
= \sum_{j=1}^{l}\sum_{k=1}^{m} \int_{V_k} \beta_j(x)\alpha_k(x)f(x)\,dS(x)
\]
\[
= \sum_{j=1}^{l}\sum_{k=1}^{m} \int_{\tilde V_j} \beta_j(x)\alpha_k(x)f(x)\,dS(x)
= \sum_{j=1}^{l} \int_{\tilde V_j} \sum_{k=1}^{m} \alpha_k(x)\beta_j(x)f(x)\,dS(x)
= \sum_{j=1}^{l} \int_{\tilde V_j} \beta_j(x)f(x)\,dS(x) ,
\]
which is equation (A.1).

A.4 The Gauss divergence theorem

To formulate the Gauss theorem I need two definitions:

Definition A.14

(i) Let $A \subseteq \mathbb{R}^n$ be a compact set. One says that $A$ has smooth boundary if $\partial A$ is an $(n-1)$-dimensional submanifold of $\mathbb{R}^n$.

(ii) Let $x \in \partial A$. If the nonzero vector $\nu \in \mathbb{R}^n$ is orthogonal to all vectors in the tangent space $T_x(\partial A)$ of $\partial A$ at the point $x$, then $\nu$ is called a normal vector to $\partial A$ at $x$. If $|\nu| = 1$, then $\nu$ is called a unit normal vector. If $\nu$ points into the exterior of $A$, then $\nu$ is called an outer normal vector.

Definition A.15 (Divergence) Let $U \subseteq \mathbb{R}^n$ be an open set and let $f: U \to \mathbb{R}^n$ be differentiable. The function $\operatorname{div} f: U \to \mathbb{R}$ is defined by
\[
\operatorname{div} f(x) := \sum_{i=1}^{n} \frac{\partial}{\partial x_i} f_i(x) .
\]
One calls $\operatorname{div} f$ the divergence of $f$.


Theorem A.16 (Gauss divergence theorem) Let $A \subseteq \mathbb{R}^n$ be a compact set with smooth boundary, let $U \subseteq \mathbb{R}^n$ be an open set with $A \subseteq U$, and let $f: U \to \mathbb{R}^n$ be continuously differentiable. Let $\nu(x)$ denote the outer unit normal vector to $\partial A$ at the point $x$. Then
\[
\int_{\partial A} \nu(x)\cdot f(x)\,dS(x) = \int_A \operatorname{div} f(x)\,dx .
\]

For $n = 1$ the theorem reads: let $a, b \in \mathbb{R}$, $a < b$; then
\[
f(b) - f(a) = \int_a^b \frac{d}{dx}f(x)\,dx ,
\]
and one sees that the Gauss theorem is the generalization of the fundamental theorem of calculus to $\mathbb{R}^n$.

Application example: Let a body $A$ be immersed in a fluid with specific weight $c$ whose surface coincides with the plane $x_3 = 0$. The pressure at a point $x = (x_1,x_2,x_3) \in \mathbb{R}^3$ with $x_3 < 0$ is then
\[
-c\,x_3 .
\]
If $x \in \partial A$, this pressure produces the force per unit area
\[
-c\,x_3\,(-\nu(x)) = c\,x_3\,\nu(x)
\]
on the body, in the direction of the outer unit normal vector $\nu(x)$ to $\partial A$ at $x$. For the total surface force one obtains
\[
K = \begin{pmatrix} K_1 \\ K_2 \\ K_3 \end{pmatrix} = \int_{\partial A} c\,x_3\,\nu(x)\,dS(x) .
\]
Applying the Gauss theorem to the functions $f_1, f_2, f_3: A \to \mathbb{R}^3$ with
\[
f_1(x_1,x_2,x_3) = (x_3,0,0), \qquad f_2(x_1,x_2,x_3) = (0,x_3,0), \qquad f_3(x_1,x_2,x_3) = (0,0,x_3)
\]
yields for $i = 1, 2$
\[
K_i = \int_{\partial A} c\,x_3\,\nu_i(x)\,dS(x) = c\int_{\partial A} \nu(x)\cdot f_i(x)\,dS(x) = c\int_A \frac{\partial}{\partial x_i}x_3\,dx = 0,
\]
and for $i = 3$
\[
K_3 = \int_{\partial A} c\,x_3\,\nu_3(x)\,dS(x) = c\int_{\partial A} \nu(x)\cdot f_3(x)\,dS(x) = c\int_A \frac{\partial}{\partial x_3}x_3\,dx = c\int_A dx = c\,\mathrm{Vol}(A) .
\]
Thus $K$ points in the direction of the positive $x_3$-axis, so $A$ experiences a buoyant force of magnitude $c\,\mathrm{Vol}(A)$. This equals the weight of the displaced fluid.
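The divergence theorem can also be illustrated numerically. The sketch below (an illustration I added, with the example field chosen by me) takes $A = [0,1]^3$ and $f(x) = (x_1^2 x_2,\ x_2 x_3,\ x_3 x_1)$, so $\operatorname{div} f = 2x_1x_2 + x_3 + x_1$, and compares the boundary flux with the volume integral, both computed by the midpoint rule:

```python
def f(x, y, z):
    # Example field (my choice): f = (x^2 y, y z, z x)
    return (x*x*y, y*z, z*x)

def div_f(x, y, z):
    return 2*x*y + z + x

def volume_integral(n=40):
    # Midpoint rule for  int_A div f dx  over the unit cube A = [0,1]^3
    h = 1.0/n
    s = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                s += div_f((i+0.5)*h, (j+0.5)*h, (k+0.5)*h)
    return s*h**3

def boundary_flux(n=40):
    # Midpoint rule for  int_{dA} nu . f dS  over the six faces of the cube
    h = 1.0/n
    s = 0.0
    for i in range(n):
        for j in range(n):
            a, b = (i+0.5)*h, (j+0.5)*h
            s += f(1.0, a, b)[0] - f(0.0, a, b)[0]   # faces x = 1 and x = 0
            s += f(a, 1.0, b)[1] - f(a, 0.0, b)[1]   # faces y = 1 and y = 0
            s += f(a, b, 1.0)[2] - f(a, b, 0.0)[2]   # faces z = 1 and z = 0
    return s*h*h

# Gauss: the two values agree (for this field both equal 3/2)
assert abs(volume_integral() - boundary_flux()) < 1e-9
assert abs(volume_integral() - 1.5) < 1e-9
```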


A.5 Green's formulas

Let $U \subseteq \mathbb{R}^n$ be an open set, let $A \subseteq U$ be a compact set with smooth boundary, and for $x \in \partial A$ let $\nu(x)$ be the outer unit normal to $\partial A$ at $x$. For differentiable $f: U \to \mathbb{R}$, in the following $\nabla f(x) \in \mathbb{R}^n$ denotes the gradient $\operatorname{grad} f(x)$. One calls $\nabla$ the nabla operator.

Definition A.17 Let the function $f: U \to \mathbb{R}$ be continuously differentiable. Then one defines the normal derivative of $f$ at the point $x \in \partial A$ by
\[
\frac{\partial f}{\partial\nu}(x) := f'(x)\nu(x) = \nu(x)\cdot\nabla f(x) = \sum_{i=1}^{n} \frac{\partial f(x)}{\partial x_i}\,\nu_i(x) .
\]
The normal derivative of $f$ is the directional derivative of $f$ in the direction of $\nu$. For twice differentiable $f: U \to \mathbb{R}$ let
\[
\Delta f(x) := \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2} f(x).
\]
$\Delta$ is called the Laplace operator.

Theorem A.18 For $f, g \in C^2(U,\mathbb{R})$:

(i) First Green formula:
\[
\int_{\partial A} f(x)\frac{\partial g}{\partial\nu}(x)\,dS(x) = \int_A \bigl(\nabla f(x)\cdot\nabla g(x) + f(x)\Delta g(x)\bigr)\,dx
\]

(ii) Second Green formula:
\[
\int_{\partial A} \Bigl(f(x)\frac{\partial g}{\partial\nu}(x) - g(x)\frac{\partial f}{\partial\nu}(x)\Bigr)\,dS(x) = \int_A \bigl(f(x)\Delta g(x) - g(x)\Delta f(x)\bigr)\,dx .
\]

Proof: To prove the first Green formula, apply the Gauss divergence theorem to the continuously differentiable function
\[
f\,\nabla g: U \to \mathbb{R}^n .
\]
It follows that
\[
\int_{\partial A} f(x)\frac{\partial g}{\partial\nu}(x)\,dS(x) = \int_{\partial A} \nu(x)\cdot(f\,\nabla g)(x)\,dS(x)
= \int_A \operatorname{div}(f\,\nabla g)(x)\,dx = \int_A \bigl(\nabla f(x)\cdot\nabla g(x) + f(x)\Delta g(x)\bigr)\,dx .
\]
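The first Green formula can be checked numerically on a simple domain. In the sketch below (my addition, with the functions chosen by me), $A$ is the unit disk in $\mathbb{R}^2$, $f(x) = x_1^2$ and $g(x) = x_2^2$; both sides then equal $\pi/2$:

```python
import math

def boundary_side(n=2000):
    # int_{dA} f dg/dnu ds on the unit circle, with f = x^2, g = y^2 and
    # dg/dnu = grad g . nu = 2*y*sin(t), since nu = (cos t, sin t)
    s = 0.0
    for k in range(n):
        t = 2*math.pi*(k + 0.5)/n
        x, y = math.cos(t), math.sin(t)
        s += (x*x) * (2*y*y)
    return s * 2*math.pi/n

def volume_side(n=400):
    # int_A (grad f . grad g + f*Lap g) dx over the unit disk;
    # here grad f . grad g = 0 and f*Lap g = 2*x^2
    h = 2.0/n
    s = 0.0
    for i in range(n):
        for j in range(n):
            x = -1.0 + (i+0.5)*h
            y = -1.0 + (j+0.5)*h
            if x*x + y*y < 1.0:
                s += 2*x*x
    return s*h*h

assert abs(boundary_side() - math.pi/2) < 1e-9
assert abs(volume_side() - math.pi/2) < 0.1
assert abs(boundary_side() - volume_side()) < 0.1
```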


For the proof of the second Green formula one uses the first Green formula. By it,
\[
\int_{\partial A} \Bigl(f(x)\frac{\partial g}{\partial\nu}(x) - g(x)\frac{\partial f}{\partial\nu}(x)\Bigr)\,dS(x)
\]
\[
= \int_A \bigl(\nabla f(x)\cdot\nabla g(x) + f(x)\Delta g(x)\bigr)\,dx - \int_A \bigl(\nabla f(x)\cdot\nabla g(x) + g(x)\Delta f(x)\bigr)\,dx
= \int_A \bigl(f(x)\Delta g(x) - g(x)\Delta f(x)\bigr)\,dx .
\]

A.6 The Stokes integral theorem

Let $U \subseteq \mathbb{R}^2$ be an open set and let $A \subseteq U$ be a compact set with smooth boundary. Then the boundary $\partial A$ is a continuously differentiable curve. For continuously differentiable $g: U \to \mathbb{R}^2$, the Gauss theorem takes the form
\[
\int_A \Bigl(\frac{\partial g_1}{\partial x_1}(x) + \frac{\partial g_2}{\partial x_2}(x)\Bigr)\,dx = \int_{\partial A} \bigl(\nu_1(x)g_1(x) + \nu_2(x)g_2(x)\bigr)\,ds(x) \tag{A.4}
\]
with the outer unit normal vector $\nu(x) = (\nu_1(x), \nu_2(x))$. If $f: U \to \mathbb{R}^2$ is another continuously differentiable function and one chooses for $g$ in (A.4) the function
\[
g(x) := \begin{pmatrix} f_2(x) \\ -f_1(x) \end{pmatrix},
\]
then one obtains
\[
\int_A \Bigl(\frac{\partial f_2}{\partial x_1}(x) - \frac{\partial f_1}{\partial x_2}(x)\Bigr)\,dx
= \int_{\partial A} \bigl(\nu_1(x)f_2(x) - \nu_2(x)f_1(x)\bigr)\,ds(x)
= \int_{\partial A} \tau(x)\cdot f(x)\,ds(x) \tag{A.5}
\]
with
\[
\tau(x) = \begin{pmatrix} -\nu_2(x) \\ \nu_1(x) \end{pmatrix}.
\]
$\tau(x)$ is a unit vector perpendicular to the normal vector $\nu(x)$, hence $\tau(x)$ is a unit tangent vector to $\partial A$ at the point $x \in \partial A$, namely the one obtained from $\nu(x)$ by a rotation through $90^\circ$ in the mathematically positive sense. If one defines, for differentiable $f: U \to \mathbb{R}^2$, the rotation of $f$ by
\[
\operatorname{rot} f(x) := \frac{\partial f_2}{\partial x_1}(x) - \frac{\partial f_1}{\partial x_2}(x) ,
\]


then (A.5) can be written in the form
\[
\int_A \operatorname{rot} f(x)\,dx = \int_{\partial A} \tau(x)\cdot f(x)\,ds(x) .
\]
This formula is called the Stokes theorem in the plane. Note that $A$ was not assumed to be "simply connected". That is, $A$ may have "holes":

[Figure: a region $A$ with two holes; on each of the three boundary curves constituting $\partial A$, the outward-pointing normal $\mu(x)$ and the unit tangent $\tau(x)$ are drawn.]

One can identify the subset $A \subseteq \mathbb{R}^2$ with the flat submanifold $A \times \{0\}$ of $\mathbb{R}^3$, and the integral over $A$ in the Stokes theorem with the surface integral over this submanifold. This interpretation suggests that the formula can be generalized, and that the Stokes theorem holds not only for flat submanifolds but for more general 2-dimensional submanifolds of $\mathbb{R}^3$. Indeed, the Stokes theorem holds for orientable submanifolds of $\mathbb{R}^3$, which are defined as follows:

Definition A.19 Let $M \subseteq \mathbb{R}^3$ be a 2-dimensional submanifold. A unit normal field $\nu$ of $M$ is a continuous map $\nu: M \to \mathbb{R}^3$ with the property that for every $a \in M$ the vector $\nu(a)$ is a unit normal vector of $M$ at $a$.

A 2-dimensional submanifold $M$ of $\mathbb{R}^3$ is called orientable if a unit normal field on $M$ exists.

Example: The unit sphere $M = \{x \in \mathbb{R}^3 \mid |x| = 1\}$ is orientable. A unit normal field is $\nu(a) = \frac{a}{|a|}$, $a \in M$.

By contrast, the Möbius band is not orientable:


[Figure: Möbius band]

Definition A.20 Let $U \subseteq \mathbb{R}^3$ be an open set and $f: U \to \mathbb{R}^3$ differentiable. The rotation of $f$,
\[
\operatorname{rot} f: U \to \mathbb{R}^3 ,
\]
is defined by
\[
\operatorname{rot} f(x) := \begin{pmatrix}
\dfrac{\partial f_3}{\partial x_2}(x) - \dfrac{\partial f_2}{\partial x_3}(x) \\[1.5ex]
\dfrac{\partial f_1}{\partial x_3}(x) - \dfrac{\partial f_3}{\partial x_1}(x) \\[1.5ex]
\dfrac{\partial f_2}{\partial x_1}(x) - \dfrac{\partial f_1}{\partial x_2}(x)
\end{pmatrix}.
\]

Theorem A.21 (Stokes integral theorem) Let $M$ be a 2-dimensional orientable submanifold of $\mathbb{R}^3$, and let $\nu: M \to \mathbb{R}^3$ be a unit normal field. Let $B \subseteq M$ be a compact set with smooth boundary (i.e. $\partial B$ is a differentiable curve). For $x \in \partial B$ let $\mu(x) \in T_xM$ be the unit normal vector pointing out of $B$. Moreover, let
\[
\tau(x) = \nu(x) \times \mu(x) , \qquad x \in \partial B .
\]
$\tau(x)$ is a unit tangent vector to $\partial B$. Finally, let $U \subseteq \mathbb{R}^3$ be an open set with $B \subseteq U$ and let $f: U \to \mathbb{R}^3$ be a continuously differentiable function. Then:
\[
\int_B \nu(x)\cdot\operatorname{rot} f(x)\,dS(x) = \int_{\partial B} \tau(x)\cdot f(x)\,ds(x) .
\]
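Both versions of the Stokes theorem can be illustrated numerically (a sketch I added, with the example fields chosen by me). In the plane, take $A$ the unit disk and $f(x) = (-x_2, x_1)$, so $\operatorname{rot} f = 2$ and both sides equal $2\pi$; in $\mathbb{R}^3$, take $B$ the upper unit hemisphere with normal field $\nu(a) = a$ and $f = (-x_2, x_1, 0)$, so $\operatorname{rot} f = (0,0,2)$ and both sides again equal $2\pi$:

```python
import math

# --- Stokes in the plane: A = unit disk, f = (-y, x), rot f = 2 ---

def area_side_plane(n=400):
    # midpoint rule for  int_A rot f dx = int_A 2 dx
    h = 2.0/n
    s = 0.0
    for i in range(n):
        for j in range(n):
            x = -1.0 + (i+0.5)*h
            y = -1.0 + (j+0.5)*h
            if x*x + y*y < 1.0:
                s += 2.0
    return s*h*h

def line_side_plane(n=1000):
    # oint tau . f ds with boundary (cos t, sin t) and tau = (-sin t, cos t)
    s = 0.0
    for k in range(n):
        t = 2*math.pi*(k+0.5)/n
        x, y = math.cos(t), math.sin(t)
        s += (-y)*(-y) + x*x            # tau . f, here identically 1
    return s*2*math.pi/n

# --- Stokes in R^3: B = upper hemisphere, f = (-y, x, 0), rot f = (0,0,2) ---

def surface_side_hemisphere(n=200):
    # int_B nu . rot f dS with nu(a) = a, in spherical angles;
    # dS = sin(phi) dphi dtheta on the unit sphere
    s = 0.0
    for i in range(n):
        for j in range(n):
            phi = (math.pi/2)*(i+0.5)/n      # polar angle, 0 .. pi/2
            nu3 = math.cos(phi)              # only rot f = (0,0,2) matters
            s += 2.0*nu3*math.sin(phi)
    return s * (math.pi/2/n) * (2*math.pi/n)

def line_side_equator(n=1000):
    # oint_{dB} tau . f ds; on the equator tau = nu x mu = (-sin t, cos t, 0)
    s = 0.0
    for k in range(n):
        t = 2*math.pi*(k+0.5)/n
        s += math.sin(t)**2 + math.cos(t)**2   # tau . f, again 1
    return s*2*math.pi/n

two_pi = 2*math.pi
assert abs(area_side_plane() - two_pi) < 0.1
assert abs(line_side_plane() - two_pi) < 1e-9
assert abs(surface_side_hemisphere() - two_pi) < 1e-3
assert abs(line_side_equator() - two_pi) < 1e-9
```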


Example: Let $\Omega \subseteq \mathbb{R}^3$ be a domain in $\mathbb{R}^3$. In $\Omega$ let there exist an electric field $E$ depending on the position $x \in \Omega$ and the time $t \in \mathbb{R}$, so
\[
E: \Omega \times \mathbb{R} \to \mathbb{R}^3 .
\]
Likewise let
\[
B: \Omega \times \mathbb{R} \to \mathbb{R}^3
\]
be the magnetic induction. Let $\Gamma \subseteq \Omega$ be a wire loop, and let this wire loop bound a surface $M \subseteq \Omega$.

[Figure: the domain $\Omega$ containing the surface $M$, bounded by the wire loop $\Gamma$ with tangent vector $\tau(x)$ at $x \in \Gamma$, and the induced voltage $U$.]

In $\Gamma$ an electric voltage $U$ is induced by the change of $B$. This voltage can be computed as follows: for all $(x,t) \in \Omega \times \mathbb{R}$,
\[
\operatorname{rot}_x E(x,t) = -\frac{\partial}{\partial t}B(x,t) .
\]
This is one of the Maxwell equations. Hence the Stokes theorem, with a unit normal field $\nu: M \to \mathbb{R}^3$, yields
\[
U(t) = \int_\Gamma \tau(x)\cdot E(x,t)\,ds(x) = \int_M \nu(x)\cdot\operatorname{rot}_x E(x,t)\,dS(x)
= -\int_M \nu(x)\cdot\frac{\partial}{\partial t}B(x,t)\,dS(x) = -\frac{\partial}{\partial t}\int_M \nu(x)\cdot B(x,t)\,dS(x) .
\]
The integral $\int_M \nu(x)\cdot B(x,t)\,dS(x)$ is called the flux of the magnetic induction through $M$. Thus $U(t)$ equals the negative rate of change of the flux of $B$ through $M$.
