NOTES ON ANALYSIS

STEPHEN ROWE

Contents

Banach Spaces
Problems
A Few Inequalities for $\ell^p$
From Finite to Infinite Dimension
An Introduction to Linear Operators
Problems
An Introduction to Linear Functionals and Dual Spaces
Problems
The Three Big Theorems: Open Mapping, Closed Graph, and Banach-Steinhaus
Problems
Topologies and Functionals: Weak and Weak-* Topologies
Problems
Hilbert Spaces
References

Date: June 14, 2011.

These are notes aimed at undergraduates with an interest in learning a bit about functional analysis without requiring measure theory. With this approach, functional analysis can be made accessible to undergraduates with just some basic analysis and linear algebra background. These notes are strongly inspired by Kreyszig’s Introductory Functional Analysis with Applications [3]. Some topics and problems were also inspired by Folland’s superb analysis textbook [2]. I try to provide some exercises and detailed proofs along with helpful (I hope!) exposition. If you see any parenthetical (Why?)’s anywhere, those are statements which the reader should ponder before moving on (this was inspired by N. L. Carothers’s excellent Real Analysis textbook). If you see any mistakes, let me know!

Banach Spaces

Definition 1. Let $X$ be a vector space. We say that $\|\cdot\|$ is a norm on $X$ if $\|\cdot\| : X \to \mathbb{R}$ is a mapping such that, for all $x,y \in X$ and all scalars $\alpha$,

• $\|x\| \ge 0$

• $\|x\| = 0$ iff $x = 0$

• $\|\alpha x\| = |\alpha|\,\|x\|$

• $\|x + y\| \le \|x\| + \|y\|$ (The Triangle Inequality)

If such a mapping exists for $X$, we say that $X$ is a normed vector space. Geometrically, norms generalize the notion of length to arbitrary vector spaces. Note that a normed vector space is automatically a metric space with the natural metric induced by the norm given by $d(x,y) = \|x - y\|$. Since the norm induces a metric, it gives rise to a topology naturally by considering the topology generated by open balls. Recall by the reverse triangle inequality that $\big|\|x\| - \|y\|\big| \le \|x - y\|$.


Consequently, if we consider the norm as a mapping, it is continuous. That is, if $x_n, x \in X$ with $x_n \to x$, we have $\|x_n\| \to \|x\|$ (where convergence $x_n \to x$ means $\|x_n - x\| \to 0$). Since a norm naturally induces a metric, is it true that every metric is induced by a norm? This is a neat question, and the answer is no (see exercises).

Definition 2. Let $X$ be a normed vector space. If $X$ is complete, we say that $X$ is a Banach space. Recall that completeness means that every Cauchy sequence converges.

It is worth noting that a vector space may be complete under one norm, but incomplete under other norms. It is quite helpful to have a few examples of different normed vector spaces and Banach spaces. The reader should know that $\mathbb{C}^n$ and $\mathbb{R}^n$ are both Banach spaces, but there are more exotic and interesting examples out there.

Example 1. We define $C[a,b]$ to be the set of continuous functions on the interval $[a,b]$. The first norm worth considering is the sup-norm. Define $\|f\| = \sup_{t \in [a,b]} |f(t)|$. Under this norm, $C[a,b]$ is a complete metric space, and hence a Banach space. To see why this is so, let $(f_n)$ be a Cauchy sequence in $C[a,b]$ and fix $\epsilon > 0$. Then, for $n,m$ large enough, $\|f_n - f_m\| < \epsilon$, that is, $\sup_{t \in [a,b]} |f_n(t) - f_m(t)| < \epsilon$. If we fix a $t$, then $(f_n(t))$ is a Cauchy sequence of real numbers, and hence converges. Hence, $f_n$ converges pointwise to a function $f$. Since $|f_n(t) - f_m(t)| < \epsilon$ for every $t$ and all $n,m$ large enough, we may let $n \to \infty$ (which is allowed since the absolute value is continuous) to obtain $|f(t) - f_m(t)| \le \epsilon$ for every $t$, and hence $\sup_{t \in [a,b]} |f(t) - f_m(t)| \le \epsilon$. This shows that $f_m$ converges uniformly to $f$, and hence $f$ is continuous, that is, $f \in C[a,b]$. Therefore, $C[a,b]$ is complete under the supremum norm.

On the other hand, we can give a metric to $C[a,b]$ by $d(f,g) = \int_a^b |f(t) - g(t)|\,dt$. Under this metric, $C[a,b]$ is not complete. To see why, take $[a,b] = [0,1]$ and, for $n \ge 2$, let $f_n(t) = 0$ for $t \in [0, \frac12]$ and $f_n(t) = 1$ for $t \in [\frac12 + \frac1n, 1]$. In the unspecified area, simply let $f_n$ be linear such that it makes $f_n$ continuous (start at 0 at $t = \frac12$ and go up to 1 at $t = \frac12 + \frac1n$). We have that $d(f_n,f_m) \le \epsilon$ for all $n,m > N$ with $N$ large. The easiest way to see this is graphically, by drawing $f_n$ and $f_m$ for large $n,m$: the area under $|f_n(t) - f_m(t)|$ becomes very small. However, the limit of this sequence of functions is a step function, which is discontinuous. Therefore, the limit is not in $C[a,b]$. Consequently, this metric makes the space incomplete.
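The ramp functions $f_n$ above can be explored numerically. The following is a minimal sketch in plain Python; the helper names `ramp`, `l1_distance` and `sup_distance`, and the grid size, are illustrative assumptions rather than part of the notes. It approximates the integral distance and the sup distance between $f_n$ and $f_{2n}$.

```python
def ramp(n):
    """Return f_n: 0 on [0, 1/2], 1 on [1/2 + 1/n, 1], linear in between."""
    def f(t):
        if t <= 0.5:
            return 0.0
        if t >= 0.5 + 1.0 / n:
            return 1.0
        return n * (t - 0.5)
    return f


def l1_distance(f, g, steps=20_000):
    """Midpoint-rule approximation of the integral of |f - g| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((k + 0.5) * h) - g((k + 0.5) * h)) for k in range(steps))


def sup_distance(f, g, steps=20_000):
    """Grid approximation of sup |f - g| over [0, 1]."""
    return max(abs(f(k / steps) - g(k / steps)) for k in range(steps + 1))


for n, m in [(10, 20), (100, 200), (1000, 2000)]:
    print(n, m, l1_distance(ramp(n), ramp(m)), sup_distance(ramp(n), ramp(m)))
```

The integral distance shrinks as $n$ grows, while the sup distance between $f_n$ and $f_{2n}$ stays near $\frac12$, so the same sequence is Cauchy for the integral metric but not for the supremum norm.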

Example 2. We can consider the set of all polynomials defined on an<br />

interval [a,b] to be a subset of C[a,b] with the supremum norm. Is this<br />

set closed? Why or why not?<br />

Amongst the most important spaces in analysis are the so-called $L^p$ function spaces (with norm given by integration) and $\ell^p$ sequence spaces (with norm given by summation). We will focus on $\ell^p$ for now (due to our sketchy avoidance of measure theory). Let $x$ denote a sequence of real (or complex) scalars and let $x(n)$ denote the $n$th term in the sequence.

Definition 3. Let $1 \le p < \infty$. We define $\ell^p$ to be the set of all sequences $x$ such that $\sum_{n=1}^\infty |x(n)|^p < \infty$. On such a space, we can induce a norm by $\|x\|_p = \left(\sum_{n=1}^\infty |x(n)|^p\right)^{1/p}$. If $p = \infty$, we define $\ell^\infty$ to be the set of all sequences of scalars that are bounded, that is, $\sup_n |x(n)| < \infty$, with norm $\|x\|_\infty = \sup_n |x(n)|$.

The most important $\ell^p$ spaces occur for $p = 1, 2, \infty$. We will see later that $\ell^2$ has extraordinarily nice structure. To get a better feel for these spaces, it helps to have some examples of elements in them. Consider the harmonic sequence given by $x(n) = \frac1n$. Since the harmonic series is divergent, we know that $x \notin \ell^1$. However, we have that $\sum_{n=1}^\infty x(n)^2 = \frac{\pi^2}{6}$, and hence $x \in \ell^2$. These spaces have some convenient properties. For $1 \le p \le \infty$, $\ell^p$ is actually a complete normed space, and hence a Banach space. Additionally, for $1 \le p < \infty$, $\ell^p$ is separable ($\ell^\infty$ is not, however!). Recall that a space is separable if there exists a countable dense set.

Before we move on, a note on notation. Since the elements of $\ell^p$ are sequences, it can be quite confusing dealing with sequences in $\ell^p$ (that is, sequences of sequences!). Therefore, we will use the notation that $x(n)$ refers to the $n$th entry of an element $x \in \ell^p$. If $(x_n)$ is a sequence in $\ell^p$, then $x_n$ is the $n$th term in the sequence (regarding each $x_n$ as a point in the space).
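The harmonic example above, $x(n) = \frac1n$, can be checked numerically. This is only an illustrative sketch (the function name `partial_sum` is an assumption): finite partial sums cannot prove divergence or convergence, but they show the two behaviours.

```python
import math

def partial_sum(p, terms):
    """Partial sum of |x(n)|^p for the sequence x(n) = 1/n, n = 1..terms."""
    return sum((1.0 / n) ** p for n in range(1, terms + 1))

for terms in (10, 1_000, 100_000):
    print(terms, partial_sum(1, terms), partial_sum(2, terms))

print("pi^2 / 6 =", math.pi ** 2 / 6)
```

The $p = 1$ column keeps growing (roughly like $\log n$), while the $p = 2$ column settles near $\frac{\pi^2}{6}$.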

Theorem 1. The normed space $\ell^2$ is a Banach space with the norm given by $\|x\| = \left(\sum_{n=1}^\infty |x(n)|^2\right)^{1/2}$.

Proof. Consider a Cauchy sequence $(x_n)$ in $\ell^2$ and fix $\epsilon > 0$. Then there is an $N$ such that $\|x_n - x_m\| \le \epsilon$ for all $n,m \ge N$. Consequently, $\sum_{i=1}^\infty |x_n(i) - x_m(i)|^2 \le \epsilon^2$. In particular, we have (for each $i$) $|x_n(i) - x_m(i)| \le \epsilon$. Hence, for fixed $i$, $(x_n(i))$ (with $n$ varying) is a Cauchy sequence of scalars, and hence converges. Therefore, for each $i$, we may define $x(i) = \lim_{n \to \infty} x_n(i)$. So far all we have produced is a candidate limit for the Cauchy sequence $(x_n)$. Now, for every finite $K$ and all $n,m \ge N$ we have $\sum_{i=1}^K |x_n(i) - x_m(i)|^2 \le \epsilon^2$; taking a limit on $n$ and then letting $K \to \infty$ gives $\sum_{i=1}^\infty |x(i) - x_m(i)|^2 \le \epsilon^2$ for all $m \ge N$. Consequently the vector $x - x_m \in \ell^2$. Since $\ell^2$ is a vector space, $(x - x_m) + x_m = x \in \ell^2$. Therefore, our candidate limit is in $\ell^2$, and $\|x_m - x\| \le \epsilon$ for all $m \ge N$, that is, $\|x_m - x\| \to 0$. Therefore $\ell^2$ is a Banach space. □

Exercise: Do this for $\ell^p$, $1 \le p \le \infty$.

Problems.

Problem 1. Show that $\ell^\infty$ is complete.

Problem 2. Consider $M \subset \ell^\infty$ to be the set of all sequences such that at most finitely many terms are non-zero. First, show that this set is a subspace of $\ell^\infty$. Next, show that this set is not closed and hence not complete.

Problem 3. Let $Y$ be a Banach space and $M \subset Y$ be a closed subspace. Show that $M$ is a Banach space with the norm inherited from $Y$. (You probably have done a question like this before: Show that a closed subset of a complete space is complete.)

Problem 4. Show that $\ell^p$ is separable for $1 \le p < \infty$. Hint: You need to construct a countable dense set. Recall that the rationals are dense in the reals.


Problem 5. Show that $\ell^\infty$ is not separable. Hint: Given any real number in $[0,1]$, one can write a binary representation as a string of ones and zeros. Consider strings of this form (justify that they are in $\ell^\infty$). How many such strings are there? What is the minimum distance between two such strings?

Problem 6. Let $d(x,y) = \|x - y\|$ be a metric induced by a norm. Prove that this metric is translation invariant. That is, show that $d(x+z, y+z) = d(x,y)$. Furthermore, for any $\alpha \in \mathbb{C}$ (or $\mathbb{R}$, depending on the field of scalars), show that $d(\alpha x, \alpha y) = |\alpha|\,d(x,y)$.

Let $S$ be the space of all sequences of scalars. That is, $x \in S$ if $x = (x(1), x(2), \ldots, x(j), \ldots)$. Consider the metric given by $d(x,y) = \sum_{j=1}^\infty \frac{1}{2^j}\,\frac{|x(j) - y(j)|}{1 + |x(j) - y(j)|}$. Is this metric induced by a norm?

A Few Inequalities for $\ell^p$

It was mentioned above that the $\ell^p$ spaces are Banach spaces. However, we glossed over the actual step of showing that they even form normed vector spaces. It isn’t too difficult to verify that the $\ell^p$ norms satisfy all of the norm properties, save for the triangle inequality. This requires something known as Minkowski’s Inequality. However, this relies on an inequality of vast importance in the theory of Lebesgue integrals in $L^p$ spaces known as Hölder’s inequality. Before that, we require a lemma. When working with $\ell^p$ spaces, one often is interested in the space $\ell^q$ where $\frac1p + \frac1q = 1$. We say that $p$ and $q$ are conjugate exponents.

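Lemma 1. Let $a,b \ge 0$ and $\lambda \in (0,1)$. Then $a^\lambda b^{1-\lambda} \le \lambda a + (1-\lambda)b$.

Proof. If $b = 0$ the inequality is immediate, so assume $b > 0$. Let $t = \frac{a}{b}$, and divide both sides by $b$. Then, we aim to show that $t^\lambda \le \lambda t + (1 - \lambda)$ for $t \ge 0$. Consider the function $t^\lambda - \lambda t$. Differentiating this expression gives $\lambda(t^{\lambda-1} - 1)$, which is positive for $t < 1$ and negative for $t > 1$, so the function is maximized with the choice $t = 1$. At $t = 1$, we have $1^\lambda - \lambda = 1 - \lambda$. Hence, $t^\lambda - \lambda t \le 1 - \lambda$, with equality if $t = 1$. □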
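As a sanity check on Lemma 1, here is a small sketch that evaluates the gap $\lambda a + (1-\lambda)b - a^\lambda b^{1-\lambda}$ on random inputs. The function name `young_gap`, the sampling ranges and the seed are arbitrary choices made for illustration.

```python
import random

def young_gap(a, b, lam):
    """Return lam*a + (1-lam)*b - a**lam * b**(1-lam), nonnegative by Lemma 1."""
    return lam * a + (1.0 - lam) * b - (a ** lam) * (b ** (1.0 - lam))

random.seed(0)
samples = [(random.uniform(0, 10), random.uniform(0, 10), random.uniform(0.01, 0.99))
           for _ in range(10_000)]
worst = min(young_gap(a, b, lam) for a, b, lam in samples)
print("smallest gap over samples:", worst)
print("gap at a = b = 3:", young_gap(3.0, 3.0, 0.4))
```

Up to floating-point rounding, the gap never goes negative, and it vanishes when $a = b$, which matches the equality case $t = 1$ in the proof.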

Theorem 2. Hölder’s Inequality. Let $1 < p < \infty$ and let $p,q$ be conjugate exponents. Then, $\sum_{n=1}^\infty |x(n)y(n)| \le \left(\sum_{n=1}^\infty |x(n)|^p\right)^{1/p} \left(\sum_{n=1}^\infty |y(n)|^q\right)^{1/q}$.

Proof. Let $x = (x(n)) \in \ell^p$ and let $y = (y(n)) \in \ell^q$, and write $xy$ for the sequence $(x(n)y(n))$. The inequality is equivalent to showing that $\|xy\|_1 \le \|x\|_p\|y\|_q$. This follows obviously if $x(n) = 0$ for all $n$ (or if $y(n) = 0$ for all $n$), so assume neither of these is identically zero. For now, let’s simplify the problem and assume that $\|x\|_p = \|y\|_q = 1$. Fix $n$ and apply Lemma 1 with $a = |x(n)|^p$, $b = |y(n)|^q$, and $\lambda = \frac1p$ (so that $1 - \lambda = \frac1q$). We have $|x(n)y(n)| \le \frac1p|x(n)|^p + \frac1q|y(n)|^q$. If we sum both sides over $n$, we arrive at $\|xy\|_1 \le \frac1p\|x\|_p^p + \frac1q\|y\|_q^q = \frac1p + \frac1q = 1$. This holds for all normalized $x \in \ell^p$, $y \in \ell^q$. The normalization is equivalent to dividing each term $x(n)$ by $\|x\|_p$. So, to arrive at the inequality for non-normalized $a \in \ell^p$, $b \in \ell^q$, we note that this inequality holds for $x = \frac{a}{\|a\|_p}$ and $y = \frac{b}{\|b\|_q}$. Then, $\|xy\|_1 \le 1$ implies $\frac{\|ab\|_1}{\|a\|_p\|b\|_q} \le 1$. Multiplying by the denominator yields the desired result. □
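Hölder’s inequality can be tried out on finite sequences, which sit inside every $\ell^p$ once padded with zeros. The sketch below is illustrative only; `p_norm` and `holder_sides` are hypothetical helper names, and the random test vectors are arbitrary.

```python
import random

def p_norm(x, p):
    """The l^p norm of a finite sequence x (a list of floats), 1 <= p < infinity."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def holder_sides(x, y, p):
    """Return (sum |x(n) y(n)|, ||x||_p * ||y||_q) with q the conjugate exponent of p."""
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return lhs, p_norm(x, p) * p_norm(y, q)

random.seed(1)
x = [random.uniform(-1, 1) for _ in range(50)]
y = [random.uniform(-1, 1) for _ in range(50)]
for p in (1.5, 2.0, 3.0, 10.0):
    lhs, rhs = holder_sides(x, y, p)
    print(p, lhs, rhs, lhs <= rhs)
```

For $p = 2$ this is the Cauchy-Schwarz inequality, and taking $y = x$ there makes the two sides agree.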

Now that we have this inequality, we can prove the triangle inequality for $\ell^p$ spaces. This is known as Minkowski’s inequality.

Theorem 3. Let $1 \le p \le \infty$ and let $x,y \in \ell^p$. Then, $\|x + y\|_p \le \|x\|_p + \|y\|_p$.

Proof. The cases $p = 1$ and $p = \infty$ follow directly from the triangle inequality for scalars, so let $1 < p < \infty$ and assume $x + y \ne 0$. To start, we notice that $|x(n) + y(n)|^p = |x(n) + y(n)|\,|x(n) + y(n)|^{p-1} \le (|x(n)| + |y(n)|)\,|x(n) + y(n)|^{p-1}$; all we did was utilize the regular triangle inequality for scalars. Now, we may sum both sides to get $\sum_{n=1}^\infty |x(n) + y(n)|^p \le \sum_{n=1}^\infty |x(n)|\,|x(n) + y(n)|^{p-1} + \sum_{n=1}^\infty |y(n)|\,|x(n) + y(n)|^{p-1}$. We can now apply Hölder’s inequality to each sum on the right, choosing $q$ to be the conjugate exponent to $p$. Therefore, we arrive at $\sum_{n=1}^\infty |x(n) + y(n)|^p \le \|x\|_p\,\big\||x + y|^{p-1}\big\|_q + \|y\|_p\,\big\||x + y|^{p-1}\big\|_q$. Factoring yields $\|x + y\|_p^p \le (\|x\|_p + \|y\|_p)\left(\sum_{n=1}^\infty |x(n) + y(n)|^{(p-1)q}\right)^{1/q}$. Notice that $\frac1p = 1 - \frac1q$, which tells us that $(p-1)q = p$. Consequently, we have $\sum_{n=1}^\infty |x(n) + y(n)|^p \le (\|x\|_p + \|y\|_p)\left(\sum_{n=1}^\infty |x(n) + y(n)|^p\right)^{1/q}$. Division by the summation term yields $\|x + y\|_p = \left(\sum_{n=1}^\infty |x(n) + y(n)|^p\right)^{1 - \frac1q} \le \|x\|_p + \|y\|_p$. □

With this, we have the triangle inequality for $\ell^p$ spaces and we can conclude that the $\ell^p$ spaces are normed spaces and, being complete, Banach spaces.
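The triangle inequality just proved can be spot-checked on finite sequences. This is a minimal sketch with assumed helper names (`p_norm`, `minkowski_gap`); the last line uses an exponent below 1, which lies outside the theorem, to show why the restriction $p \ge 1$ matters.

```python
import random

def p_norm(x, p):
    """The l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def minkowski_gap(x, y, p):
    """Return ||x||_p + ||y||_p - ||x + y||_p, nonnegative whenever p >= 1."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)

random.seed(2)
x = [random.uniform(-1, 1) for _ in range(50)]
y = [random.uniform(-1, 1) for _ in range(50)]
for p in (1.0, 1.5, 2.0, 4.0, float("inf")):
    print(p, minkowski_gap(x, y, p))

# For 0 < p < 1 the same formula is no longer a norm: the gap can be negative.
print(0.5, minkowski_gap([1.0, 0.0], [0.0, 1.0], 0.5))
```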

From Finite to Infinite Dimension<br />

Recall from linear algebra that a vector space $X$ is finite dimensional if the largest linearly independent set has at most $n$ vectors for some finite $n$. If $X$ is a vector space where this does not occur, then we say that $X$ is infinite dimensional. All spaces (besides $\mathbb{C}^n$ and $\mathbb{R}^n$) introduced in the previous sections are infinite dimensional. (Why?) We know from linear algebra that if the dimension of a space is $n$, then a set of $n$ linearly independent vectors serves as a basis. In infinite dimensions, such a basis would necessarily be infinite. However, it is imperative that the reader note that if we say a set $Y = \{x_1, x_2, \ldots\}$ is a basis for $X$ (of infinite dimension), then this means for every $x \in X$, there exists a finite (!) set of $x_{n_i}$ and scalars $c_i$ such that $x = \sum_{i=1}^k c_i x_{n_i}$. The linear algebraic definition of linear combinations only permits finite linear combinations, not infinite series of such. That does not mean one should disregard the possibility of generalizing such a notion.

We know that if $X$ is a Banach space and $x_n \to x$, then we mean $\|x_n - x\| \to 0$. With this notion of convergence, we can generalize our notion of infinite series from calculus. Let $(x_n)$ be a sequence in $X$ and define the partial sums $S_n = \sum_{i=1}^n x_i$. We say that this series converges if $(S_n)$ is a Cauchy sequence (equivalently, since $X$ is complete, if there exists $S \in X$ such that $\|S_n - S\| \to 0$), and we say that $S$ is the sum of the infinite series.

Definition 4. Let $X$ be a Banach space and let $(e_n)$ be a sequence in $X$. We say that $(e_n)$ is a Schauder basis for $X$ if given any $x \in X$, there exists a unique sequence of scalars $(\alpha_n)$ such that $x = \sum_{n=1}^\infty \alpha_n e_n$, that is, $\left\|x - \sum_{n=1}^k \alpha_n e_n\right\| \to 0$ as $k \to \infty$.

This is a generalization of a basis for infinite dimensional spaces. It is quite handy for $\ell^p$ with $1 \le p < \infty$. Any $x \in \ell^p$ can be thought of as a $p$-summable sequence, $x = (x(1), x(2), \ldots, x(n), \ldots)$. If we define the basis vectors $e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, 0, \ldots)$, ..., $e_n = (0, 0, \ldots, 1, 0, 0, \ldots)$, we obtain a Schauder basis and $x = \sum_{i=1}^\infty x(i)e_i$. Notice that if a Banach space $X$ has a Schauder basis, it is automatically separable (Why?). On the other hand, given a separable Banach space, does there exist a Schauder basis? Intuitively, one might guess yes. However, this is not true. This was a big open problem, which was solved in the negative by Per Enflo in 1973.


On a more concrete level, let’s take a look at $\ell^\infty$. We know that $\ell^\infty$ is not separable (see exercises), so it cannot have a Schauder basis. However, since the canonical Schauder basis $\{e_n\}$ works so well for $\ell^p$, what’s going wrong in $\ell^\infty$? Consider $x \in \ell^\infty$ given by $x = (1, 1, 1, 1, \ldots)$. Then $x_k = \sum_{n=1}^k e_n$ is the appropriate approximation, but $\left\|x - \sum_{n=1}^k e_n\right\|_\infty = 1$ for every $k$. Consequently, this is not converging in norm to $x$!
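The contrast between $\ell^2$ and $\ell^\infty$ can be made concrete with truncated sequences. In this sketch the truncation length `N` and the helper names are assumptions made for illustration; a finite list only stands in for the infinite sequence. The error of the $k$-term expansion in $e_1, \ldots, e_k$ is exactly the norm of the tail.

```python
def tail_norm_l2(x, k):
    """l^2 norm of x minus its k-term expansion in e_1, ..., e_k (the tail)."""
    return sum(v * v for v in x[k:]) ** 0.5

def tail_norm_sup(x, k):
    """Sup norm of the same tail; returns 0.0 when nothing is left."""
    return max((abs(v) for v in x[k:]), default=0.0)

N = 100_000                                    # truncation length (an assumption)
harmonic = [1.0 / n for n in range(1, N + 1)]  # stands in for x(n) = 1/n in l^2
ones = [1.0] * N                               # stands in for (1, 1, 1, ...) in l^infinity

for k in (10, 100, 1_000, 10_000):
    print(k, tail_norm_l2(harmonic, k), tail_norm_sup(ones, k))
```

The $\ell^2$ tail shrinks as $k$ grows, while the sup-norm tail of the constant sequence stays equal to 1.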

So far, it appears that we can naturally extend familiar notions from<br />

finite dimensions to infinite. However, the big jump between finite<br />

dimension and infinite is the change in topology. Recall that sequential<br />

compactness and compactness are equivalent notions on metric spaces,<br />

and since we are interested solely in normed (hence metric) spaces, we<br />

may take as a definition that compactness is sequential compactness.<br />

Definition 5. Let X be a normed space and M ⊂ X. We say that M<br />

is compact if given any sequence xn ∈ M, there exists a convergent (in<br />

M) subsequence.<br />

It follows that if $M$ is compact, then $M$ is necessarily closed and bounded. (Why? To see boundedness, choose a sequence such that $\|x_n\|$ grows monotonically arbitrarily large. Why does this not have a convergent subsequence?) We know by Heine-Borel that in $\mathbb{R}^n$ compactness is equivalent to a set being closed and bounded. It also is true that in finite dimensional normed spaces, a set is compact iff it is closed and bounded.


Theorem 4. Let $X$ be a finite dimensional normed space and $M$ a subset of $X$. Then, $M$ is compact iff $M$ is closed and bounded.

Proof. If $M$ is compact, it follows from above that $M$ is closed and bounded. Conversely, let $M$ be closed and bounded and consider an arbitrary sequence $(x_m)$ in $M$. Since this space is finite dimensional, we can choose a convenient basis $e_1, e_2, \ldots, e_n$, and write $x_m = \alpha_m(1)e_1 + \alpha_m(2)e_2 + \cdots + \alpha_m(n)e_n$. Then, for each $i$, $(\alpha_m(i))$ is a bounded sequence of scalars (since $M$ is bounded), and hence has a convergent subsequence by Bolzano-Weierstrass. So, $\alpha_m(i) \to \alpha(i)$ along some subsequence. Define $x = \alpha(1)e_1 + \alpha(2)e_2 + \cdots + \alpha(n)e_n$. Then, we can find a single subsequence $(x_{m_k})$ such that $x_{m_k} \to x$. (To do this, one uses a finite form of a ’diagonalization’ argument: pass to a subsequence along which the first coordinate converges, then to a further subsequence along which the second coordinate converges, and so on for all $n$ coordinates.) Since $(x_{m_k})$ is convergent and $M$ is closed, its limit $x$ lies in $M$. Hence, we have a subsequence converging in $M$. □

However, in infinite dimensions, a closed and bounded set is not nec-<br />

essarily compact. In fact, the closed unit ball in an infinite dimensional<br />

space is necessarily not compact! To show this, we need a technical<br />

lemma first. The following proof is from Kreyszig’s excellent textbook.<br />

Lemma 2. Riesz’s Lemma: [3] Let $X$ be a normed space and let $Y,Z$ be subspaces such that $Y$ is closed and $Y$ is strictly contained in $Z$. Given any $\alpha \in (0,1)$, there exists $z \in Z$ with $\|z\| = 1$ such that $\|z - y\| \ge \alpha$ for all $y \in Y$.

Proof. Let $v \in Z - Y$. We can define the distance from $v$ to $Y$ by $d = \inf_{y \in Y}\|v - y\|$. Since $Y$ is closed and $v \notin Y$, we have $d > 0$. By the definition of the infimum, we can find a sequence $(y_n)$ in $Y$ such that $\|v - y_n\| \to d$. Consequently, we can find $y_0 \in Y$ such that $d \le \|v - y_0\| \le d + \epsilon$ for arbitrary $\epsilon > 0$. Since $\frac{d}{\alpha} > d$, we can in particular find $y_0 \in Y$ such that $d \le \|v - y_0\| \le \frac{d}{\alpha}$. If we define $z = \frac{v - y_0}{\|v - y_0\|}$, then clearly $z \in Z$ and $\|z\| = 1$. Additionally, for any $y \in Y$, $\|z - y\| = \frac{1}{\|v - y_0\|}\,\big\|v - y_0 - \|v - y_0\|\,y\big\|$. Since $Y$ is a subspace, $y_0 + \|v - y_0\|\,y \in Y$ (call it $y_1$), and hence $\|z - y\| = \frac{1}{\|v - y_0\|}\|v - y_1\| \ge \frac{d}{\|v - y_0\|} \ge \frac{d}{d/\alpha} = \alpha$. So, $\|z - y\| \ge \alpha$. □

In the following proof, we will be using the fact that finite dimen-<br />

sional subspaces are closed. This fact I have not proved in this section,<br />

but this will be proven once we know the Hahn-<strong>Banach</strong> theorem. There<br />

are ways of proving it without resorting to such drastic measures, but<br />

I prefer it my way. For now, accept on faith that finite dimensional<br />

subspaces are closed.<br />

Theorem 5. Let X be an infinite dimensional normed space. Then,<br />

the closed unit ball B is not compact.<br />

Proof. We can start by picking a point $x_1 \in B$ such that $\|x_1\| = 1$. Then, this generates a one dimensional subspace, which is closed. Since $X$ is infinite dimensional, this subspace is strictly contained in $X$. Therefore, by Riesz’s Lemma with $\alpha = \frac12$, we can choose an $x_2$ such that $\|x_2 - x_1\| \ge \frac12$ and $\|x_2\| = 1$. Now, consider the subspace spanned by $x_1$ and $x_2$. This is still finite dimensional, and hence closed. Therefore, using our previous lemma, we can find an $x_3$ of norm one such that $\|x_3 - x_2\| \ge \frac12$ and $\|x_3 - x_1\| \ge \frac12$. Iterating this procedure we get a sequence $(x_n)$ in $B$ which has no Cauchy subsequence because $\|x_n - x_m\| \ge \frac12$ whenever $n \ne m$. So, this sequence can’t possibly have a convergent subsequence. Therefore, the closed unit ball is not compact. □
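In $\ell^2$ one does not even need Riesz’s Lemma to exhibit such a sequence: the unit vectors $e_n$ already sit in the closed unit ball and stay a fixed distance apart. The following sketch (with illustrative helper names, and vectors truncated to finitely many coordinates) computes the pairwise distances; it illustrates the conclusion for $\ell^2$ only, not the general construction in the proof.

```python
def unit_vector(n, length):
    """e_n truncated to `length` coordinates (1-based index n)."""
    return [1.0 if i == n else 0.0 for i in range(1, length + 1)]

def l2_distance(x, y):
    """Euclidean (l^2) distance between two finite sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

length = 8
vectors = [unit_vector(n, length) for n in range(1, length + 1)]
distances = {l2_distance(vectors[i], vectors[j])
             for i in range(length) for j in range(i + 1, length)}
print(distances)  # a single value, the square root of 2
```

Since every pair is $\sqrt2$ apart, no subsequence of $(e_n)$ can be Cauchy.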



An Introduction to Linear Operators<br />

In the setting of linear algebra and finite dimensions, we are familiar<br />

with mappings between two finite dimensional spaces (say C n → C m ).<br />

With a chosen basis, we can represent linear mappings as matrices.<br />

Our goal here is to generalize the notion of linear mappings between<br />

two vector spaces of arbitrary dimension. We call a mapping between<br />

two vector spaces an operator. From here on out, assume (unless explicitly stated otherwise) that X and Y refer to normed vector spaces.

Definition 6. Let X,Y be vector spaces. We say that an operator T<br />

is a linear operator if T : X → Y is a linear mapping. That is, for any<br />

scalars α,β, and any x,y ∈ X, T(αx + βy) = αTx + βTy.<br />

It is worth noting that this immediately implies that a linear operator<br />

takes zero to zero. The domain of an operator need not be the whole<br />

space X and the range is not necessarily all of Y . We can extend the<br />

notion of a kernel from linear algebra by saying the kernel (also called<br />

null space) of an operator is given by {x ∈ X : Tx = 0}. If we let<br />

X,Y be finite dimensional, then the linear operators are exactly the<br />

matrices mapping between them. We have more interesting operators<br />

on possibly infinite dimensional spaces. Our goal is to extend notions<br />

from analysis and topology to the infinite dimensional case. Since we<br />

have mappings between spaces, we can try to extend the notion of<br />

continuity and boundedness.


Definition 7. Let $X,Y$ be normed vector spaces with norms given by $\|\cdot\|_1$ and $\|\cdot\|_2$ respectively. We say that a linear operator $T : X \to Y$ is bounded if there exists a constant $C \ge 0$ such that $\|Tx\|_2 \le C\|x\|_1$ for all $x \in X$.

We often may drop the subscripts, which we hypothesize will not<br />

cause too much confusion. However it is important to keep in mind<br />

that the norm on Tx is the Y norm and the norm on x is the X norm.<br />

With this definition, we can define a new vector space given by the<br />

collection of bounded operators between two normed spaces.<br />

Definition 8. Let $X,Y$ be normed vector spaces. Then, define $B(X,Y) = \{T \mid T : X \to Y,\ T \text{ linear and bounded}\}$. Then, $B(X,Y)$ is a vector space.

Given $T,S \in B(X,Y)$, we have $\|(\alpha T + \beta S)x\| \le |\alpha|\|Tx\| + |\beta|\|Sx\| \le |\alpha|C_T\|x\| + |\beta|C_S\|x\| = C\|x\|$, where $C = |\alpha|C_T + |\beta|C_S$. Consequently, for arbitrary scalars $\alpha,\beta$, we have that $\alpha T + \beta S$ is also a bounded operator. Hence, $B(X,Y)$ is a vector space. Now that we have a vector space, can we extend other notions that we have introduced? Can we make a normed space out of $B(X,Y)$?

Definition 9. We define the operator norm $\|T\| = \sup\{\|Tx\| : x \in X,\ \|x\| = 1\}$.

The operator norm is a well-defined norm (see problems) and with<br />

this, we have that B(X,Y ) is a normed vector space. Before moving<br />

onto more complex topics, it may be worth exploring some examples<br />

of bounded and unbounded operators.
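Before the infinite dimensional examples, the definition of the operator norm can be tried on a matrix acting on $\mathbb{R}^2$ with the Euclidean norm. This is a rough sketch under stated assumptions: the helper names are made up for illustration, and sampling the unit circle only approximates the supremum from below.

```python
import math

def apply(matrix, x):
    """Multiply a 2x2 matrix (list of rows) by a vector x in R^2."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]

def euclid(x):
    """Euclidean norm on R^2."""
    return math.sqrt(sum(v * v for v in x))

def operator_norm_estimate(matrix, samples=3600):
    """Approximate sup{ ||Tx|| : ||x|| = 1 } by sampling the unit circle."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        best = max(best, euclid(apply(matrix, [math.cos(theta), math.sin(theta)])))
    return best

print(operator_norm_estimate([[3.0, 0.0], [0.0, 1.0]]))   # stretches e_1 by 3
print(operator_norm_estimate([[0.0, 1.0], [-1.0, 0.0]]))  # a rotation
```

The diagonal matrix has norm 3 (attained at $x = e_1$), and the rotation has norm 1 since it preserves lengths.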


Example 3. Consider $C[a,b]$ with the supremum norm and define $T : C[a,b] \to C[a,b]$ by $(Tf)(t) = \int_a^t f(x)\,dx$. This operator is certainly linear (since integration is linear), and we know that the function defined by $Tf$ is still continuous (hence in $C[a,b]$). Note that $\|Tf\| = \sup_{t \in [a,b]}\left|\int_a^t f(x)\,dx\right| \le \sup_{t \in [a,b]}\int_a^t \|f\|\,dx \le \int_a^b \|f\|\,dx = (b - a)\|f\|$. Hence, $T$ is a bounded operator with norm at most $(b - a)$.
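The bound $\|Tf\| \le (b - a)\|f\|$ can be checked numerically on a grid. This sketch uses the trapezoid rule and assumed helper names (`integral_operator`, `sup_norm_on_grid`); the interval $[0,2]$ and the test functions are arbitrary choices.

```python
import math

def integral_operator(f, a, b, steps=10_000):
    """Return grid values of (Tf)(t) = integral of f from a to t (trapezoid rule)."""
    h = (b - a) / steps
    values = [0.0]
    for k in range(steps):
        left, right = f(a + k * h), f(a + (k + 1) * h)
        values.append(values[-1] + 0.5 * h * (left + right))
    return values

def sup_norm_on_grid(f, a, b, steps=10_000):
    """Grid approximation of sup |f| on [a, b]."""
    h = (b - a) / steps
    return max(abs(f(a + k * h)) for k in range(steps + 1))

a, b = 0.0, 2.0
for name, f in [("cos", math.cos), ("constant 1", lambda t: 1.0)]:
    norm_Tf = max(abs(v) for v in integral_operator(f, a, b))
    bound = (b - a) * sup_norm_on_grid(f, a, b)
    print(name, norm_Tf, bound)
```

For the constant function $f = 1$ we get $(Tf)(t) = t - a$, so the bound is attained and the norm of $T$ is in fact exactly $b - a$.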

Example 4. Let $P \subset C[0,1]$ be the set of all polynomials and let $P$ inherit the supremum norm. This gives a normed vector space. We can define the differentiation operator $T$, which takes the polynomial $t^n$ to $nt^{n-1}$. Define the sequence $p_n(t) = t^n$. Then $Tp_n(t) = nt^{n-1}$, so $\|Tp_n\| = n$. Since $\|p_n\| = 1$, we have $\|Tp_n\| = n\|p_n\|$. Therefore, we can’t find a $C$ such that $\|Tp\| \le C\|p\|$ for all $p \in P$.
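The growth of $\|Tp_n\| / \|p_n\|$ can be observed directly. In this sketch polynomials are stored as coefficient lists, and the helper names and grid size are illustrative assumptions.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (c_k multiplies t^k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, t):
    """Horner evaluation of the polynomial at t."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result

def sup_norm(coeffs, steps=1_000):
    """Grid approximation of the supremum norm on [0, 1]."""
    return max(abs(evaluate(coeffs, k / steps)) for k in range(steps + 1))

for n in (1, 5, 25, 125):
    p_n = [0.0] * n + [1.0]          # coefficients of t^n
    print(n, sup_norm(derivative(p_n)) / sup_norm(p_n))
```

The printed ratio equals $n$, so no single constant $C$ can bound it for every $n$.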

The previous two operators are very important; the study of differential equations and integral equations naturally relies on these two operators. The unboundedness of the differentiation operator can make it rather unwieldy, but it also leads to a very interesting theory of unbounded operators.

Now that we have a concept of boundedness of a linear mapping, can we extend the notion of continuity? Generalizing continuity from functions, we can say that a linear operator $T$ is continuous if given any convergent sequence $x_n \to x$, we have $Tx_n \to Tx$. With linear operators, a remarkable equivalence occurs: boundedness and continuity are equivalent notions (which is not the case for functions!).


Theorem 6. Let $T : X \to Y$ be linear. Then, $T$ is continuous iff $T$ is bounded.

Proof. Let $T$ be bounded. Then, if $x_n \to x$, we have $\|T(x_n - x)\| \le \|T\|\|x_n - x\| \to 0$, and hence $Tx_n \to Tx$. Hence, $T$ is continuous. Conversely, let $T$ be continuous at some point $x_0$ and fix $\epsilon > 0$. Then there is a $\delta > 0$ such that $\|Tx - Tx_0\| \le \epsilon$ provided that $\|x - x_0\| \le \delta$. Choose an arbitrary nonzero $y \in X$, and let $x = x_0 + \frac{\delta}{\|y\|}y$. Then, $\|x - x_0\| = \delta$, so $\|T(x - x_0)\| \le \epsilon$, but $\|T(x - x_0)\| = \frac{\delta}{\|y\|}\|Ty\|$ by linearity. So, $\|Ty\| \le \frac{\epsilon}{\delta}\|y\|$, which also holds trivially for $y = 0$. Hence, $T$ is bounded. □

Notice that the proof above actually only used continuity at some point $x_0$, and we showed that $T$ was bounded for arbitrary $y$. It follows that continuity at a single point is equivalent to continuity everywhere, another bizarre feature of linear operators. Now that we have some grasp on bounded linear operators, what can we say about the space of bounded linear operators between two normed spaces? Is this space ever complete? This is actually possible, and the only requirement is that $Y$ be complete (the domain space $X$ need not be complete!).

Theorem 7. Let $X$ be a normed space and $Y$ a Banach space. Then $B(X,Y)$ is a Banach space.

Proof. Let $(T_n)$ be a Cauchy sequence in $B(X,Y)$ and fix $\epsilon > 0$. Hence, $\|T_n - T_m\| \le \epsilon$ for $n,m$ large enough. Then, for any $x \in X$, $\|T_n(x) - T_m(x)\| \le \|T_n - T_m\|\,\|x\| \le \epsilon\|x\|$. If we define $y_n = T_n(x)$, then $\|y_n - y_m\| \le \epsilon\|x\|$, and hence $(y_n)$ is a Cauchy sequence in $Y$. But $Y$ is a Banach space, and hence $y_n$ converges to some $y$. We can define an operator $Tx = y$ in this manner for each $x$. It follows that $T$ is linear since $T(x + z) = \lim T_n(x + z) = \lim T_n x + \lim T_n z = Tx + Tz$, and similarly $T(\alpha x) = \alpha Tx$. Then, for any $x$, we have $\|T_m x - T_n x\| \le \epsilon\|x\|$ for $n,m$ large enough. Taking limits on $m$ allows us to arrive at $\|Tx - T_n x\| \le \epsilon\|x\|$, and hence $T - T_n$ is a bounded operator with $\|T - T_n\| \le \epsilon$. Consequently, $T = (T - T_n) + T_n \in B(X,Y)$. Also, note that this shows $\|T_n - T\| \to 0$, and hence $T_n \to T$. Therefore $B(X,Y)$ is complete. □

<strong>Problems</strong>.<br />

Problem 7. Show that the kernel of a linear operator is a vector space.<br />

Show that the kernel of a bounded linear operator is closed. Also show<br />

that the range of a linear operator is a vector space.<br />

Problem 8. Show that B(X,Y ) is a normed vector space (assume<br />

X,Y normed vector spaces).<br />

Problem 9. Let $k(x,y)$ be a continuous function on $[0,1] \times [0,1]$. Define $T : C[0,1] \to C[0,1]$ by $(Tf)(y) = \int_0^1 k(x,y)f(x)\,dx$. Show that $T$ is a bounded linear operator.

Problem 10. Let $X$ be a Banach space, $T \in B(X,X)$ and $\|T\| < 1$. Show that $(I - T)$ is an invertible operator and that $(I - T)^{-1} = \sum_{n=0}^\infty T^n$.

Problem 11. Let $x \in \ell^\infty$. Define $T : \ell^\infty \to \ell^\infty$ by $Tx = y$ where $y = (0, x(1), x(2), \ldots)$. Is $T$ linear? Bounded? If so, what is the norm? Consider $T : \ell^\infty \to \ell^\infty$ defined by $Tx = y$ where $y(n) = \frac{x(n)}{n}$. Show that this is a bounded linear operator. What is the norm of $T$?


Problem 12. Is the range of a bounded linear operator necessarily closed? Why or why not? Hint: Consider an operator from the previous problem.

Problem 13. Let $T$ be a linear operator with the condition that $\|Tx\| \ge b\|x\|$ for all $x \in X$, for some constant $b > 0$. Show that $T^{-1}$ exists on the range of $T$ and that $T^{-1}$ is a bounded linear operator.

Problem 14. Let $S,T \in B(X,X)$. Show that $\|ST\| \le \|S\|\,\|T\|$.

An Introduction to Linear Functionals and Dual Spaces

Now that we know a few things about bounded linear operators, we can consider a special class of bounded linear operators: those between a normed space and the field of scalars. Let $X$ be a normed space and consider the set of bounded operators $B(X,\mathbb{R})$ (or $B(X,\mathbb{C})$). If $f \in B(X,\mathbb{C})$, then $f(x)$ is a scalar, and we say that $f$ is a bounded linear functional. We call the set $B(X,\mathbb{C})$ the dual space of $X$, written $X^*$. The study of normed spaces $X$ relies heavily on exploring the nature of its dual space. Indeed, we will later see that using $X^*$, we can build a new topological structure for $X$ called the weak topology. Before we get too complicated, we should note some basic facts about dual spaces. The first thing to note is that since $\mathbb{C}$ is a Banach space, we know that $X^*$ is always a Banach space, regardless of whether or not $X$ is. This follows from the last theorem from the previous section.

Corollary 1. Let $X$ be a normed space. Then, the set of bounded linear functionals $X^*$ is a Banach space.



Since functionals are linear operators, we can define a norm on them<br />

using the operator norm previously defined. That is, ||f|| = sup{|f(x)| :<br />

||x|| = 1}. Note that ||f(x)|| is actually just |f(x)| since f(x) is a scalar<br />

value. Consequently, |f(x)| ≤ ||f|| ||x||. Also, since linear functionals<br />

are operators, if the functional is bounded, it is continuous (and vice<br />

versa). Let’s familiarize ourselves with some common examples:<br />

Example 5. Let X be a normed space. Let f(x) = ||x||. Then, f<br />

is a functional, as it maps a normed space to the field of real num-<br />

bers. However, we do not have linearity: in general we only have ||x + y|| ≤ ||x|| + ||y||,<br />

and for x ≠ 0 we get f(x + (−x)) = 0 while f(x) + f(−x) = 2||x|| > 0.<br />

Consequently, this is not a linear functional.<br />

Example 6. Consider C[0, 1] with $Tf = \int_0^1 f(t)\,dt$. Since integration<br />

is a linear operation, T is linear. T : C[0, 1] → R, and hence T is a<br />

linear functional and it is bounded. (Why?)<br />
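One possible answer to the (Why?) above, given only as a sketch and assuming the supremum norm $\|f\|_\infty = \sup_{t \in [0,1]} |f(t)|$ on C[0, 1]:<br />

```latex
\[
|Tf| = \left| \int_0^1 f(t)\,dt \right|
\le \int_0^1 |f(t)|\,dt
\le \|f\|_\infty \int_0^1 dt
= \|f\|_\infty .
\]
```

So ||T|| ≤ 1, and the constant function f = 1 gives |Tf| = 1 = ||f||∞, so in fact ||T|| = 1.<br />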

Example 7. Let x ∈ $\mathbb{R}^n$. If we fix y ∈ $\mathbb{R}^n$ and consider its transpose $y^T$, then f(x) =<br />

$y^T x = y \cdot x$ is a bounded linear functional.<br />

In the case of ℓ p for 1 < p < ∞, ℓ p has its dual given by ℓ q where<br />

$\frac{1}{p} + \frac{1}{q} = 1$. We say that p,q are conjugate exponents. For the case p = 1,<br />

the dual space is given by ℓ ∞ . However, the dual of ℓ ∞ is strictly<br />

larger than ℓ 1 . We will see both an explicit example of a functional on<br />

ℓ ∞ which is not in ℓ 1 and solve the problem with a quick application of<br />

a clever theorem. Let us demonstrate that the dual space of ℓ 1 is ℓ ∞ .<br />

Theorem 8. The dual space of ℓ 1 is ℓ ∞



Proof. Let $f \in (\ell^1)^*$. From our discussion of bases, we know there is<br />

a canonical basis to choose, given by the vectors $(e_n)$. Then, any<br />

$x \in \ell^1$ can be written as $x = \sum_{n=1}^{\infty} x(n)e_n$. Consequently, we have<br />

$f(x) = \sum_{n=1}^{\infty} x(n)f(e_n)$, since f is linear and bounded (Why is this allowed?).<br />

Let $y(n) = f(e_n)$. This defines a sequence y. Then, $|y(n)| = |f(e_n)| \le \|f\|\,\|e_n\| = \|f\|$.<br />

Consequently, $\sup_n |y(n)| \le \|f\|$. So, $y \in \ell^\infty$. So,<br />

we can identify with any linear functional $f \in (\ell^1)^*$ a sequence in $\ell^\infty$.<br />

Now we need to show that every member of $\ell^\infty$ defines a linear func-<br />

tional in the manner above. Let $y \in \ell^\infty$. Then, $x \mapsto \sum_{n=1}^{\infty} x(n)y(n)$ de-<br />

fines a linear functional on $\ell^1$. Boundedness follows since<br />

$\left|\sum_{n=1}^{\infty} x(n)y(n)\right| \le \sum_{n=1}^{\infty} |x(n)| \sup_n |y(n)| \le \sup_n |y(n)| \sum_{n=1}^{\infty} |x(n)| = \|y\|_\infty \|x\|_1$. So, this<br />

shows that every element of $\ell^\infty$ defines a bounded linear functional.<br />

Therefore, we can associate the space $\ell^\infty$ with $(\ell^1)^*$. Lastly, we have<br />

$\|f\| = \|y\|_\infty$. This follows since we earlier showed $|y(n)| = |f(e_n)| \le \|f\|$.<br />

We also have $\left|\sum_{n=1}^{\infty} y(n)x(n)\right| \le \sup_n |y(n)|\,\|x\|_1$. Consequently,<br />

$\frac{|f(x)|}{\|x\|_1} \le \sup_n |y(n)|$ for $x \ne 0$. So, $\|f\| = \sup_n |y(n)| = \|y\|_\infty$. Therefore, we have<br />

an isometric isomorphism between $\ell^\infty$ and $(\ell^1)^*$.<br />
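To make the identification concrete, here is a small worked instance. The particular sequences x and y below are chosen purely for illustration and do not come from the notes.<br />

```latex
% Illustrative choice (an assumption): y(n) = (-1)^{n+1} in \ell^\infty, x(n) = 2^{-n} in \ell^1.
\[
f(x) = \sum_{n=1}^{\infty} x(n)\,y(n)
     = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{2^{n}}
     = \frac{1/2}{1 + 1/2}
     = \frac{1}{3},
\qquad
\|y\|_\infty \, \|x\|_1 = 1 \cdot \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
```

So |f(x)| = 1/3 ≤ 1 = ||y||∞ ||x||1, which is consistent with the boundedness estimate used in the proof.<br />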

Now that we have seen some examples of linear functionals, a ques-<br />

tion arises: what can we say about how many linear functionals exist?<br />

Does a normed space have a rich supply of such functionals? One of<br />

the most important theorems in functional analysis, the Hahn-<strong>Banach</strong><br />

theorem, addresses this question. First, we need to know a few terms<br />

before we can prove this important theorem.



Definition 10. We say that p is a sublinear functional if p : X → R<br />

satisfies p(x + y) ≤ p(x) + p(y) and p(λx) = λp(x) for λ ≥ 0. We say<br />

that p is a semi-norm if p(x+y) ≤ p(x)+p(y) and p(λx) = |λ|p(x) for all scalars λ.<br />

The above statement should look a bit familiar: a norm (or any<br />

semi-norm for that matter) is a sublinear functional. The following is<br />

inspired largely by Folland’s proof and Kreyszig’s proof in their respec-<br />

tive textbooks.<br />

Theorem 9. Hahn-<strong>Banach</strong> Theorem [2] [3] Let X be a real vector<br />

space and let M be a subspace of X, and let f be a linear functional<br />

on M. Let p be a sublinear functional such that f(x) ≤ p(x) ∀x ∈ M.<br />

Then, there exists a linear functional F on X such that F(x) ≤ p(x)<br />

for all x ∈ X and F |M = f.<br />

Proof. Our first step will be to extend f to a functional defined on a<br />

subspace whose dimension is larger by just one. That is, we will define a<br />

functional g on M + Rx, where x ∉ M. Once this is done, we will know that a<br />

one dimensional extension is always possible.<br />

To begin, let y1,y2 ∈ M. Then, f(y1) + f(y2) = f(y1 + y2) ≤<br />

p(y1 + y2) ≤ p(y1 − x) + p(x + y2) by invoking the triangle inequality<br />

property of sublinear functionals. This implies that f(y1) − p(y1 − x) ≤<br />

p(x + y2) − f(y2). Consequently, sup{f(y) − p(y − x) : y ∈ M} ≤<br />

inf{p(x + y) − f(y) : y ∈ M}. Then, there exists some number α such<br />

that sup{f(y)−p(y −x) : y ∈ M} ≤ α ≤ inf{p(x+y)−f(y) : y ∈ M}.<br />

With this, we may define g : M + Rx → R by g(y+λx) = f(y)+λα.<br />

Then, g is linear since g(y1+λ1x+y2+λ2x) = f(y1+y2)+α(λ1+λ2) =



f(y1) + αλ1 + f(y2) + αλ2 = g(y1 + λ1x) + g(y2 + λ2x). On the set<br />

M, g(y) = g(y + 0 · x) = f(y) + α · 0 = f(y). Consequently, on M,<br />

g(y) ≤ p(y). However, we need to show that g(y + λx) ≤ p(y + λx).<br />

Note that the definition of a sublinear functional requires λ > 0, but<br />

in M + Rx, λ ∈ R could be negative, and hence we must account for<br />

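two cases. First, let λ > 0.<br />

Then, $g(y+\lambda x) = \lambda\left[f\left(\frac{y}{\lambda}\right)+\alpha\right] \le \lambda\left[f\left(\frac{y}{\lambda}\right)+p\left(\frac{y}{\lambda}+x\right)-f\left(\frac{y}{\lambda}\right)\right] = p(y+\lambda x)$.<br />

Now, let λ = −µ < 0. Then, $g(y + \lambda x) = \mu\left(f\left(\frac{y}{\mu}\right) - \alpha\right) \le \mu\left(f\left(\frac{y}{\mu}\right) - f\left(\frac{y}{\mu}\right) + p\left(\frac{y}{\mu} - x\right)\right) = p(y+\lambda x)$.<br />

Therefore, we have g(y+λx) ≤ p(y+λx).<br />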

Therefore, we have proven there exists a one dimensional extension of<br />

f.<br />
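Both cases above come from the single defining property of α. As a sketch, with y′ denoting an arbitrary element of M (a symbol introduced here only for this display):<br />

```latex
\[
f(y') - p(y' - x) \;\le\; \alpha \;\le\; p(x + y') - f(y')
\qquad \text{for all } y' \in M .
\]
% Case \lambda > 0: use the right-hand inequality with y' = y/\lambda.
% Case \lambda = -\mu < 0: use the left-hand inequality with y' = y/\mu.
\[
\alpha \le p\!\left(\tfrac{y}{\lambda} + x\right) - f\!\left(\tfrac{y}{\lambda}\right),
\qquad
-\alpha \le p\!\left(\tfrac{y}{\mu} - x\right) - f\!\left(\tfrac{y}{\mu}\right).
\]
```

Multiplying by λ > 0 (respectively µ > 0) and using p(λz) = λp(z) for λ ≥ 0 gives the two estimates.<br />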

Now, consider the family of all linear extensions F of f, defined on subspaces<br />

containing M, that satisfy F ≤ p on their domains. We can give this set a partial<br />

ordering by extension. That is, if F1,F2 are extensions such that the domain of F1 is contained<br />

in the domain of F2 and if F1 = F2 on the domain of F1, then<br />

F1 ≤ F2. Now, consider a chain {Fα}. Then, we have an increasing family<br />

of domains (which are subspaces), and if we take their union, we arrive<br />

at a functional G by defining G(x) = Fα(x) if x is in the domain of<br />

Fα. This is well defined because the members of a chain agree wherever their<br />

domains overlap. Then Fα ≤ G for every α, since the domain<br />

of G is the union over all the domains, and G(x) = Fα(x) if x is in the<br />

domain of Fα. So, G is an upper bound for this arbitrary chain from<br />

our partially ordered set. Therefore, we know our partially ordered set<br />

(by Zorn’s Lemma) has at least one maximal element, call it F. It must<br />

be that the domain of F is the whole space. If not, we could do a one<br />

dimensional extension (as above), which would give an F ′ ≥ F with F ′ ≠ F, which



would contradict the maximality of F. Therefore, F is an extension of<br />

f to the whole space which still satisfies F(x) ≤ p(x).<br />

It is important to note that this proof works only for vector spaces<br />

over R. The Hahn-<strong>Banach</strong> theorem can also be formulated in the case of a<br />

vector space over C. This merely requires a technical lemma (which we<br />

shall omit), and the proof is then a consequence of the real version of the Hahn-<br />

<strong>Banach</strong> theorem. However, since we often assume our field of scalars<br />

is complex, it is worth stating the theorem:<br />

Theorem 10. The Complex Hahn-<strong>Banach</strong> Theorem Let X be<br />

a complex vector space, p a semi-norm on X, M a subspace, and f a<br />

complex linear functional such that |f(x)| ≤ p(x) ∀x ∈ M. Then, there<br />

exists a complex linear functional F such that |F(x)| ≤ p(x) ∀x ∈ X<br />

and F |M = f.<br />

Now that we have this powerful theorem, several useful results in-<br />

stantly emerge.<br />

Theorem 11. Let X be a normed vector space.<br />

a: If M is a closed subspace of X and x ∈ X \ M, there exists<br />

f ∈ X ∗ such that f(x) ≠ 0 and f|M = 0. We may choose<br />

||f|| = 1 and $f(x) = d(x,M) = \inf_{y \in M} \|x - y\| = \delta$.<br />

b: If x ≠ 0 ∈ X, there exists f ∈ X ∗ such that f(x) = ||x||,<br />

||f|| = 1.<br />

c: The bounded linear functionals separate points.



d: If x ∈ X, we can define x ′ : X ∗ → C by x ′ (f) = f(x). We have<br />

that x ′ is a linear functional on X ∗ , hence x ′ ∈ (X ∗ ) ∗ . We can<br />

isometrically embed X ⊂ X ∗∗ .<br />

Proof. For part (a), we may define f on M + Cx by f(y + λx) = λδ.<br />

Then, f(x) = δ as desired and f|M = 0. Note that δ ≤ ||y + x|| for<br />

any y ∈ M, and hence for λ ≠ 0, $|f(y + \lambda x)| = |\lambda|\delta \le |\lambda|\,\|\lambda^{-1} y + x\| = \|y + \lambda x\|$.<br />

So, |f(z)| ≤ ||z|| for z ∈ M + Cx. If we assign p(z) = ||z||, this is a<br />

semi-norm, and hence we may apply Hahn-<strong>Banach</strong> to get a functional F<br />

defined on X such that |F(z)| ≤ ||z|| and F |M = 0, F(x) = δ.<br />

For part (b), simply use the functional from part (a) with M = {0}.<br />

For part (c), given two points x,y with x ≠ y, there exists a func-<br />

tional f such that f(x−y) = ||x−y|| > 0, and hence X ∗ separates points<br />

in X.<br />

For part (d), if f,g ∈ X ∗ , x ∈ X, then x ′ (αf +βg) = (αf +βg)(x) =<br />

αf(x) + βg(x) = αx ′ (f) + βx ′ (g), and hence x ′ is a linear functional<br />

on X ∗ . We have |x ′ (f)| ≤ ||f||||x||, so ||x ′ || ≤ ||x||. But, we also have<br />

that there exists f such that ||f|| = 1 and f(x) = ||x||, so |x ′ (f)| =<br />

||x|| ≤ ||x ′ ||||f|| = ||x ′ ||. So, ||x ′ || = ||x||. <br />
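Part (b) is a direct specialization of part (a). As a brief sketch, taking M = {0} and x ≠ 0:<br />

```latex
\[
\delta = d(x, \{0\}) = \inf_{y \in \{0\}} \|x - y\| = \|x\| ,
\qquad \text{so part (a) yields } f \in X^{*} \text{ with } f(x) = \delta = \|x\| \text{ and } \|f\| = 1 .
\]
```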

An interesting question arises from part (d). When does (if ever)<br />

X = X ∗∗ ? We see that we can isometrically embed X as a subset of<br />

X ∗∗ . We say that a space is reflexive if X = X ∗∗ . Do not confuse<br />

this notion of reflexivity with the notion of the Alg Lat of an algebra<br />

equaling itself! If we recall that $(\ell^p)^* = \ell^q$ where $\frac{1}{p} + \frac{1}{q} = 1$ and 1 < p < ∞, then it<br />

follows that (ℓ p ) ∗∗ = (ℓ q ) ∗ = ℓ p . Consequently, ℓ p is reflexive. However,



for ℓ 1 , we have that its dual is ℓ ∞ . But, the dual of ℓ ∞ is vastly larger<br />

than ℓ 1 , and hence ℓ 1 is not reflexive.<br />

The following theorem is a neat application of our previous theorem.<br />

I recommend trying it yourself before reading the proof! Note that $\overline{M}$<br />

refers to the closure of M.<br />

Theorem 12. Let M be a subspace of normed space X. Then, $\overline{M}$ =<br />

∩{ker f : f ∈ X ∗ ,M ⊂ ker f}. [1]<br />

Proof. Let N = ∩{ker f : f ∈ X ∗ , $\overline{M}$ ⊂ ker f}, and let’s show first that $\overline{M}$ ⊂<br />

N. Since each f ∈ X ∗ is continuous, its kernel is always closed. Consequently, we<br />

are considering an arbitrary intersection of closed sets that contain M.<br />

Since one can define $\overline{M}$ to be the intersection over all closed sets that<br />

contain M, it follows that $\overline{M}$ ⊂ N. Assume that the containment is<br />

proper; that is, there exists x0 ∈ N but not in $\overline{M}$. Then, since $\overline{M}$<br />

is a closed subspace, we can find f ∈ X ∗ such that f vanishes on $\overline{M}$ and<br />

f(x0) = δ = dist(x0, $\overline{M}$) > 0. Since f annihilates $\overline{M}$, the kernel of f is<br />

included in the intersection that generates N. Hence, x0 cannot be in<br />

N since f(x0) ≠ 0, a contradiction. Consequently, $\overline{M}$ = N.<br />

<strong>Problems</strong>.<br />

Problem 15. Let X be a normed vector space.<br />

a. Let M be a closed subspace of X and let x ∈ X \ M. Show that<br />

M + Cx is closed.<br />

b. Let M be a finite dimensional subspace of X. Show that M is<br />

closed.



Problem 16. If X is a <strong>Banach</strong> space and X ∗ is separable, show that<br />

X is separable.[2] Hint: This problem is quite tricky. By the definition<br />

of separability, there exists {fn} that is countable and dense in X ∗ .<br />

For each n, try to find an xn ∈ X with ||xn|| = 1 such that $|f_n(x_n)| \ge \frac{1}{2}\|f_n\|$.<br />

Argue that one can use these countable xn to obtain a countable<br />

dense subset of X.<br />

Problem 17. Without providing a counterexample, prove that (ℓ ∞ ) ∗ ≠<br />

ℓ 1 . Hint: Consider the previous question. Also, do note that we know<br />

ℓ 1 ⊂ (ℓ ∞ ) ∗ .<br />

The Three Big Theorems: Open Mapping, Closed Graph,<br />

and <strong>Banach</strong>-Steinhaus<br />

The Hahn-<strong>Banach</strong> theorem is one of the cornerstones of functional<br />

analysis because it gives us information about the existence of function-<br />

als on normed spaces. However, there are a few other major theorems<br />

we will need to cover. First, we will need a helpful theorem from topol-<br />

ogy.<br />

Theorem 13. The Baire Category Theorem Let X be a complete<br />

metric space. If {Un} is a sequence of open, dense sets in X, then ∩Un<br />

is also dense in X.<br />

Proof. A set is dense in X if it intersects every nonempty open set in X.<br />

Let W be an open set, W ≠ ∅. Our goal is to show that (∩Un) ∩ W ≠ ∅.<br />

Since U1 is dense and open, we certainly have that U1 ∩ W is nonempty and open,<br />

and so it contains a closed ball centered about some point x0. Consequently,



there exists a closed ball B(x0,r0) ⊂ W ∩ U1. As one might suspect, we can iterate<br />

this procedure, intersecting each time with the next Uj and finding xj,rj such<br />

that the closed ball $B(x_j,r_j) \subset U_{j+1} \cap B(x_{j-1},r_{j-1})$, and we may choose $r_j < 2^{-j}$ at<br />

each turn. Then, the sequence of centers xn forms a Cauchy sequence.<br />

Since X is complete, xn converges to some x ∈ X. Because the balls are closed and<br />

nested, x lies in every $B(x_j,r_j)$, and so x is contained<br />

in the intersection $W \cap \left(\bigcap_{n=1}^{\infty} U_n\right)$.<br />
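The Cauchy claim for the centers can be sketched as follows, assuming (as in the construction) that the closed balls are nested and that $r_n < 2^{-n}$. Here d denotes the metric on X, a symbol introduced only for this display.<br />

```latex
% For m > n the nesting gives x_m \in B(x_n, r_n), hence
\[
d(x_m, x_n) \le r_n < 2^{-n} \qquad (m > n),
\]
% so (x_n) is Cauchy. Letting m \to \infty and using that each ball is closed,
\[
d(x, x_n) \le r_n \quad \text{for every } n,
\qquad \text{i.e. } x \in B(x_n, r_n) \text{ for every } n .
\]
```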

Corollary 2. Let X be a complete metric space. Then, X is not a<br />

countable union of nowhere dense sets.<br />

Proof. Exercise. <br />

This theorem is a purely topological result which depends on com-<br />

pleteness. We know that <strong>Banach</strong> spaces are by definition complete, so<br />

we will utilize the Baire Category Theorem to prove results for <strong>Banach</strong><br />

spaces. This moves us in a more specific direction towards <strong>Banach</strong><br />

spaces, as opposed to the general work we did with normed spaces<br />

before. <strong>Banach</strong> spaces have wonderful properties due to their com-<br />

pleteness. As we discussed before, we can consider series in normed<br />

vector spaces. <strong>Banach</strong> spaces provide a familiar result from calculus:<br />

if a series is absolutely convergent in a <strong>Banach</strong> space, then the series<br />

itself is convergent. In fact, completeness is equivalent to the previous<br />

statement.<br />

Lemma 3. Let X be a normed vector space. X is complete iff every<br />

absolutely convergent series is convergent.



Proof. Let X be complete and let $\sum_{n=1}^{\infty} x_n$ be a series that is abso-<br />

lutely convergent. That is, $\sum_{n=1}^{\infty} \|x_n\|$ converges. Consider the par-<br />

tial sums $S_n = \sum_{j=1}^{n} x_j$. Then, for m > n, $\|S_m - S_n\| \le \sum_{j=n+1}^{m} \|x_j\| < \epsilon$ for<br />

n,m large enough, since the series is absolutely convergent. Hence, $S_n$ is<br />

a Cauchy sequence, and since X is complete, it converges. On the<br />

other hand, let X have the property that every absolutely convergent<br />

series converges. Let $x_n$ be a Cauchy sequence in X. With the convention $x_0 = 0$, we may write<br />

$x_n = \sum_{j=1}^{n} (x_j - x_{j-1})$. Our goal here is to express the sequence $x_n$<br />

as a series using the above technique. If we can show that this series<br />

is absolutely convergent, we are done. Therefore, if we can choose a<br />

subsequence such that $\|x_{n_j} - x_{n_{j-1}}\| < 2^{-j}$, then<br />

we will have an absolutely convergent series. Since $x_n$ is Cauchy, we<br />

may choose $x_{n_k}$ as a subsequence such that the difference between suc-<br />

ceeding terms has norm less than $2^{-k}$. Let $y_1 = x_{n_1}$ and $y_k = x_{n_k} - x_{n_{k-1}}$ for k ≥ 2. Then,<br />

$\sum_{j=1}^{\infty} \|y_j\| \le \|y_1\| + \sum_{j=1}^{\infty} 2^{-j} = \|y_1\| + 1$. Hence, the partial sums of this series are bounded<br />

above and monotonic, and hence convergent. Since $\sum y_j$ is absolutely con-<br />

vergent, by assumption it is itself convergent. But this sum converging<br />

amounts to saying that $x_{n_k}$ is a convergent sequence, since the partial sums telescope: $\sum_{j=1}^{k} y_j = x_{n_k}$.<br />

Since $x_{n_k}$ is a<br />

subsequence that converges, we have that $x_n$ converges to the same<br />

limit (since $x_n$ is a Cauchy sequence).<br />
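The last step uses the fact that a Cauchy sequence with a convergent subsequence converges to the same limit. A brief sketch, writing x for the limit of the subsequence:<br />

```latex
\[
\|x_n - x\| \le \|x_n - x_{n_k}\| + \|x_{n_k} - x\| .
\]
% Given \epsilon > 0: the first term is < \epsilon/2 once n and n_k are both large (Cauchy),
% and the second term is < \epsilon/2 once k is large (the subsequence converges),
% so \|x_n - x\| < \epsilon for all sufficiently large n.
```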

Definition 11. Let X, Y be <strong>Banach</strong> spaces and let T : X → Y. We say that T is<br />

open if T maps open sets to open sets. Equivalently, every image T(B(x,r)) contains a<br />

ball centered about Tx in the space Y .<br />

Another way of looking at this is to consider the action of T on a ball.<br />

Let U be an open set and let B(x,r) ⊂ U be a ball about x with radius



r. Let’s say we require that the image of every ball around a point x<br />

contains a ball around a point Tx. Then, if we consider an arbitrary<br />

open set U ⊂ X, we know that U can be written as a union of open<br />

balls around every point x. That is, $U = \bigcup_{x \in U} B(x,r_x)$. Then, with our<br />

requirement, B(Tx,ry) = B(y,ry) ⊂ T(B(x,rx)) ⊂ T(U). What this is<br />

saying is that every point y ∈ T(U) has an open ball contained in T(U).<br />

Consequently, T(U) is an open set if U is an open set. Therefore, if we<br />

want to show that a map is open, we need only show that given any<br />

open ball B(x,r) in X that T(B(x,r)) contains an open ball in Y about<br />

Tx. Additionally, if we consider X,Y to be normed spaces and let T be<br />

linear, then to show that a map is open, we merely need to show that T<br />

maps the open unit ball in X to a set that contains a ball about 0 in Y .<br />

To see why this is so, note that since T is a linear map, T(αx) = αTx<br />

for all x ∈ X and T(x + y) = T(x) + T(y) by linearity, and hence we<br />

can conclude that T commutes with dilations and translations. That<br />

way, instead of showing that T maps every open ball about x to a set<br />

containing an open ball about Tx, we can translate and dilate the ball<br />

in X to the open unit ball. Therefore, all we need to do is show that<br />

the open unit ball in X gets mapped to a set that contains an open<br />

ball about 0 (Recall that T(0) = 0).<br />
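The reduction to the unit ball can be written out explicitly. This is a sketch in which ε > 0 is a hypothetical radius with B(0, ε) ⊂ T(B(0, 1)):<br />

```latex
\[
B(x, r) = x + r\,B(0,1)
\quad\Longrightarrow\quad
T(B(x,r)) = Tx + r\,T(B(0,1)) \supset Tx + r\,B(0,\varepsilon) = B(Tx, r\varepsilon) .
\]
```

So a single ball about 0 inside T(B(0, 1)) produces a ball about Tx inside T(B(x, r)) for every x and r.<br />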

Theorem 14. Open Mapping Theorem Let X,Y be <strong>Banach</strong> spaces<br />

and let T : X → Y be a surjective, bounded linear operator. Then, T<br />

is open. [2]<br />

Proof. We know that T(X) = Y and we also know that X,Y are<br />

complete spaces. Our goal here is to show that T(B(0, 1)) contains



an open ball about 0 in Y . If one considers the sequence of balls<br />

Bn = B(0,n), then one can see that every x ∈ X is going to eventually<br />

be in one of the balls Bn, and hence we may write $X = \bigcup_{n=1}^{\infty} B_n$. Then,<br />

we have $T(X) = T\left(\bigcup B_n\right) = \bigcup T(B_n) = Y$. Since Y is a <strong>Banach</strong><br />

space, Y is complete. Consequently, by the Baire Category Theorem,<br />

Y cannot be a countable union of nowhere dense sets. Consequently, there is at<br />

least one set T(Bn) such that $\overline{T(B_n)}$ has non-empty interior. But, this<br />

implies that $\overline{T(B(0,n))} = n\,\overline{T(B(0,1))}$ has non-empty interior (note<br />

the use of linearity). This tells us that T(B(0, 1)) cannot be nowhere<br />

dense, so there exist a $y_0 \in Y$ and some<br />

radius r > 0 such that $B(y_0, 4r) \subset \overline{T(B(0,1))}$. Then, we may choose<br />

a y1 ∈ T(B(0, 1)) such that ||y1 − y0|| < 2r. Since ||y1 − y0|| < 2r,<br />

we have that $B(y_1, 2r) \subset B(y_0, 4r) \subset \overline{T(B(0,1))}$. Additionally, since we<br />

know y1 ∈ T(B(0, 1)) we know there exists an x1 ∈ B(0, 1) with Tx1 =<br />

y1. Let y ∈ Y be arbitrary with ||y|| < 2r. Then, y = y + (y1 − Tx1)<br />

by the definition of x1. However, since ||y|| < 2r, $y_1 + y \in \overline{T(B(0,1))}$ and we<br />

then have that $y = -Tx_1 + (y + y_1) \in \overline{T(-x_1 + B(0,1))} \subset \overline{T(B(0,2))}$,<br />

and hence $y \in \overline{T(B(0,2))}$ whenever ||y|| < 2r. Dividing by 2 and noting<br />

the linearity of T, we have that if ||y|| < r, then $y \in \overline{T(B(0,1))}$. So<br />

far, what we’ve shown is that we found an r (using the Baire Category<br />

Theorem) such that if y ∈ Y with ||y|| < r, then $y \in \overline{T(B(0,1))}$. We’re<br />

very close to showing that T(B(0, 1)) contains an open ball about 0.<br />

Our problem is that we can only do this for the closure $\overline{T(B(0,1))}$. We need to discard<br />

the closure part, and then we will have our result.



Using our dilation trick some more, we see that if $\|y\| < 2^{-n} r$, we<br />

have that $y \in \overline{T(B(0, 2^{-n}))}$. Now, let $\|y\| < \frac{r}{2}$. Then, $y \in \overline{T(B(0, \frac{1}{2}))}$,<br />

and hence we can find an $x_1 \in B(0, \frac{1}{2})$ such that $\|y - Tx_1\| < \frac{r}{4}$.<br />

Now, since $\|y - Tx_1\| < \frac{r}{4}$, we know that $y - Tx_1$ is in $\overline{T(B(0, \frac{1}{4}))}$. So, we<br />

can find an $x_2 \in B(0, \frac{1}{4})$ such that $\|(y - Tx_1) - Tx_2\| < \frac{r}{8}$.<br />

Now, we can proceed inductively to find an $x_n \in B(0, 2^{-n})$ such that<br />

$\left\|y - \sum_{j=1}^{n} Tx_j\right\| < 2^{-n-1} r$. Consider the series $\sum_{n=1}^{\infty} x_n$. Since $\|x_n\| < \frac{1}{2^n}$,<br />

we have that $\sum_{n=1}^{\infty} \|x_n\| < \sum_{n=1}^{\infty} 2^{-n} = 1$. Therefore, we have that this<br />

series is absolutely convergent. By our previous lemma, we have that<br />

since X is a complete space, any absolutely convergent series converges,<br />

and hence $\sum_{n=1}^{\infty} x_n$ converges in X. Let the series sum be denoted by<br />

x. Then, by the continuity of T, ||y − Tx|| = 0, so y = Tx. So, y ∈ T(B(0, 1)) since ||x|| < 1.<br />

Consequently, T(B(0, 1)) contains all y such that $\|y\| < \frac{r}{2}$.<br />

This implies<br />

that we have a ball about 0 contained in T(B(0, 1)) and hence T is an<br />

open map.<br />
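The step ||y − Tx|| = 0 relies on the continuity of T and of the norm. As a sketch, writing $s_n = x_1 + \dots + x_n$ for the partial sums (notation introduced here), and using only that the residual after n steps is at most $2^{-n} r$, a bound that need not be sharp but suffices:<br />

```latex
\[
\|y - Tx\|
= \lim_{n \to \infty} \|y - T s_n\|
= \lim_{n \to \infty} \Big\| y - \sum_{j=1}^{n} T x_j \Big\|
\le \lim_{n \to \infty} 2^{-n} r
= 0 .
\]
```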

Recall that a function f : X → Y between two topological spaces<br />

is continuous if given any open V ⊂ Y , we have f −1 (V ) is open. This<br />

is equivalent to our definition for continuous linear operators between<br />

normed spaces. Now let T be a bijection, so that $T^{-1}$ exists. The preimage of an open set U under $T^{-1}$<br />

is exactly T(U), so $T^{-1}$ is continuous if and only if T maps open sets to open sets.<br />

Therefore, when T is a bijection, showing that $T^{-1}$ is bounded<br />

is equivalent to showing that T is open. With these considerations, we<br />

have the very useful corollary to the Open Mapping theorem:



Theorem 15. The Bounded Inverse Theorem Let X,Y be <strong>Banach</strong><br />

spaces and let T ∈ B(X,Y ) be a bijection. Then, $T^{-1}$ exists and is<br />

a bounded linear operator.<br />

We are about to approach a terminology disaster: by definition,<br />

an open linear operator maps open sets to open sets.<br />

One might suspect that a closed linear operator maps closed sets to<br />

closed sets. Unfortunately, that is not the definition.<br />

Definition 12. Let T : X → Y be a linear operator between two<br />

normed spaces. Let the graph of T be defined as G(T) = {(x,y) ∈<br />

X × Y : Tx = y}. We say that T is a closed operator if G(T) is a<br />

closed subset of X × Y in the product topology.<br />

It’s a bit confusing at first to see what exactly this means: a com-<br />

parison between continuity and closedness is best. If T is continuous,<br />

then given any convergent sequence xn ∈ X, we have that Txn is a<br />

convergent sequence in Y . If T is a closed operator, it does not follow<br />

that xn → x implies Txn → Tx. However, let’s say xn → x and that<br />

Txn does converge to something in Y , say Txn → y. Then, if T is<br />

closed, it follows that y = Tx. Therefore, to show that Txn → Tx,<br />

we first need to know that Txn is a convergent sequence in Y . We see<br />

that if T is bounded (continuous), then, automatically T is closed. So,<br />

closed linear operators generalize the notion of bounded linear oper-<br />

ators. Why are they worth the trouble? Well, it turns out that our<br />

favorite unbounded operator, $\frac{d}{dx}$, is a closed linear operator on certain<br />

<strong>Banach</strong> spaces. Due to the importance of this operator in differential



equations, it seems fair that closed operators deserve a bit of atten-<br />

tion. In the applied world, physics, especially quantum mechanics,<br />

deals with unbounded linear operators that are closed (such as the dif-<br />

ferentiation operator). Although not bounded, closed linear operators<br />

still have some acceptable behavior, notably that there are many pos-<br />

itive results about them in spectral theory. So far, we see that being<br />

bounded implies being closed. Fortunately, a closed linear operator is<br />

bounded if T : X → Y and X and Y are <strong>Banach</strong> spaces.<br />
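To see concretely why differentiation is unbounded, one can test it on monomials. This is a sketch that assumes the supremum norm on both the domain of continuously differentiable functions and on C[0, 1]; the functions $f_n$ are an illustrative choice, not taken from the notes.<br />

```latex
% Illustrative choice (an assumption): f_n(x) = x^n on [0,1], n \ge 1.
\[
\|f_n\|_\infty = \sup_{x \in [0,1]} x^{n} = 1,
\qquad
\Big\| \tfrac{d}{dx} f_n \Big\|_\infty = \sup_{x \in [0,1]} n\,x^{n-1} = n ,
\]
% so no constant C can satisfy \|f'\|_\infty \le C\,\|f\|_\infty for all such f.
```

This does not contradict the Closed Graph Theorem, since the domain with the supremum norm is not complete.<br />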

Theorem 16. Closed Graph Theorem Let X,Y be <strong>Banach</strong> spaces<br />

and let T : X → Y be a closed linear operator. Then, T is bounded. [2]<br />

Proof. By the definition of the product topology, the projection op-<br />

erator π1 : X × Y → X given by π1(x,y) = x is a continuous mapping.<br />

The same holds for π2 : X × Y → Y , π2(x,y) = y. Consequently, restricting to the graph,<br />

π1 ∈ B(G(T),X) and π2 ∈ B(G(T),Y ). We know that X and Y are<br />

<strong>Banach</strong> spaces, and hence X,Y are both complete spaces. The prod-<br />

uct of two complete spaces is complete. By assumption, T is a closed<br />

linear operator, and hence G(T) is a closed subspace of X × Y , so G(T) is itself complete. Notice that<br />

$Tx = \pi_2(\pi_1^{-1}(x))$, where π1 is restricted to G(T). Consequently, $T = \pi_2 \circ \pi_1^{-1}$. We have that π1 is<br />

one to one and onto, and hence a bijection of G(T) to X. Since it<br />

is also bounded, we have that $\pi_1^{-1}$ is bounded by the Bounded Inverse<br />

Theorem. Then, $T = \pi_2 \circ \pi_1^{-1}$ is a bounded operator.<br />
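The proof can be made quantitative once a norm on X × Y is fixed. The sketch below assumes the norm ||(x, y)|| = ||x|| + ||y||, which is one common choice that generates the product topology; the notes do not fix a particular norm.<br />

```latex
% Assumed norm on the product: \|(x,y)\| = \|x\| + \|y\|.
\[
\|\pi_1(x, Tx)\| = \|x\| \le \|(x, Tx)\|,
\qquad
\|\pi_2(x, Tx)\| = \|Tx\| \le \|(x, Tx)\| ,
\]
\[
\|Tx\| = \|\pi_2(\pi_1^{-1} x)\|
\le \|\pi_2\|\,\|\pi_1^{-1}\|\,\|x\|
\le \|\pi_1^{-1}\|\,\|x\| .
\]
```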

So far we have hit 2 of the big theorems in functional analysis, and<br />

one more remains. The <strong>Banach</strong>-Steinhaus theorem, also known as the Principle<br />

of Uniform Boundedness, is an extraordinarily powerful theorem that



allows one to jump from pointwise estimates on the norm of an operator<br />

to a uniform estimate on the value of the operator norm. As you will see<br />

from doing the problems, this theorem makes short work of otherwise<br />

daunting exercises.<br />

Theorem 17. The <strong>Banach</strong>-Steinhaus Theorem Let X be a Banach<br />

space and Y a normed vector space and let A ⊂ B(X,Y ). If<br />

$\sup_{T \in A} \|Tx\| < \infty$ for all x ∈ X, then $\sup_{T \in A} \|T\| < \infty$.<br />

Proof. Let $E_n = \{x \in X : \sup_{T \in A} \|Tx\| \le n\} = \bigcap_{T \in A} \{x \in X : \|Tx\| \le n\}$.<br />

Then En is a closed set since it is the intersection of closed sets, and<br />

X = ∪En. Since X is a <strong>Banach</strong> space, X is complete, and by the Baire<br />

Category theorem, at least one set En is not nowhere dense. Conse-<br />

quently, since En is closed, we can find an open ball in En, and hence we may<br />

find a closed ball inside of it. So, let’s denote this ball by B(x0,r) ⊂ En<br />

for some r > 0. Let x ∈ X satisfy ||x|| < r, so x + x0 ∈ En. We have<br />

||Tx|| = ||T(x + x0) − Tx0|| ≤ ||T(x + x0)|| + ||Tx0|| ≤ n + n = 2n.<br />

This holds for all T ∈ A and x ∈ X with ||x|| < r since x + x0 and<br />

x0 ∈ B(x0,r) ⊂ En. So, B(0,r) ⊂ E2n. So, $\sup_{T \in A} \|T\| \le \frac{2n}{r}$,<br />

since rescaling any x ≠ 0 to have norm s < r gives $\frac{\|Tx\|}{\|x\|} \le \frac{2n}{s}$ for every s < r,<br />

and hence $\|T\| = \sup_{x \ne 0} \frac{\|Tx\|}{\|x\|} \le \frac{2n}{r}$.<br />

<strong>Problems</strong>.<br />

Problem 18. Consider the <strong>Banach</strong> space C[0, 1] with the supremum<br />

norm. Consider the subset of C[0, 1] of once continuously differentiable<br />

functions, $C^1[0, 1]$. [2]<br />

a. Show that $C^1[0, 1]$ is not a closed subset of C[0, 1] and hence not<br />

complete.



b. Consider the operator $\frac{d}{dx} : C^1[0, 1] \to C[0, 1]$. Show that this is<br />

a closed linear operator.<br />

Problem 19. Let X be a <strong>Banach</strong> space with respect to two different<br />

norms, || · ||1 and || · ||2, with the property that ||x||1 ≤ ||x||2. Show<br />

that these norms are equivalent norms. That is, there exist constants<br />

A,B such that A||x||1 ≤ ||x||2 ≤ B||x||1. [2]<br />

Problem 20. Let X,Y be <strong>Banach</strong> spaces. Let T : X → Y be a linear<br />

map such that given any f ∈ Y ∗ , f ◦ T ∈ X ∗ . Show that T is a bounded<br />

operator. [2]<br />

Problem 21. Let X,Y be <strong>Banach</strong> spaces and let Tn be a sequence of<br />

bounded operators such that limTnx exists for all x ∈ X. Show that<br />

the operator defined by the pointwise limit is both linear and bounded.<br />

[2]<br />

Problem 22. Let X be a vector space of countably infinite dimension.<br />

Show that there is no norm such that this space is complete. Hint:<br />

Remember my warning from before: linear algebraic bases only allow<br />

finite combinations! [2]<br />

Problem 23. Let X be a Banach space and let {xn} be a sequence such<br />

that the set {f(xn)} is bounded for all f ∈ X ∗ . Show that {||xn||} is<br />

bounded. Big Hint: Look back to the consequence of the Hahn-<strong>Banach</strong><br />

theorem. Remember that we can isometrically embed X ⊂ X ∗∗ . [3]



Problem 24. Define $T_n = S^n$ where S : ℓ 2 → ℓ 2 is given by Sx =<br />

S(x(1),x(2),......) = (x(2),x(3),.....). Bound ||Tnx|| and calculate<br />

lim ||Tn||. [3]<br />

Topologies and Functionals: Weak and Weak-*<br />

Topologies<br />

As we’ve seen in our previous analysis courses, occasionally we want<br />

to be a bit more flexible with convergence. For example, although<br />

uniform convergence of a sequence of functions is wonderful, sometimes<br />

this is too restrictive and we can use a weaker form of convergence, such<br />

as pointwise convergence. When you study integration theory, you will<br />

see several types of convergence, such as convergence in measure, convergence in L1,<br />

pointwise convergence, and pointwise convergence almost everywhere. Analysis is full of different<br />

types of convergence, and each has its uses. We will explore a<br />

topology built by linear functionals known as the weak topology.<br />

Definition 13. We say that a sequence (xn) ∈ X converges weakly to<br />

x ∈ X if f(xn) → f(x) for all f ∈ X ∗ .<br />

It follows then that if xn → x in the usual sense (in the norm-<br />

topology), then xn is weakly convergent (Why?). From a topological<br />

viewpoint, the norm topology generates a collection of open sets from<br />

the open balls; call this τN. The weak topology is a weaker topology<br />

τW. Being weaker implies that τW ⊂ τN. Another way of viewing<br />

this topology is that it is the weakest topology on X such that the<br />

functionals in X ∗ remain continuous. That is, τW is generated by<br />

the sets $f^{-1}(U)$ for all open U ⊂ C and f ∈ X ∗ . If this is confusing,



don’t worry: the main thing to understand is that weak convergence<br />

means that f(xn) → f(x) for all f ∈ X ∗ .<br />
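A standard illustration that weak convergence is genuinely weaker than norm convergence can be sketched in ℓ 2. It assumes the identification of the dual of ℓ 2 with ℓ 2 itself recalled earlier (the case p = q = 2), so that every bounded functional has the form $f(x) = \sum_k x(k)y(k)$ with $y \in \ell^2$; $e_n$ denotes the n-th canonical basis vector.<br />

```latex
\[
f(e_n) = \sum_{k=1}^{\infty} e_n(k)\,y(k) = y(n) \longrightarrow 0 = f(0)
\qquad \text{since } \sum_{k=1}^{\infty} |y(k)|^{2} < \infty ,
\]
\[
\text{while } \|e_n - 0\|_2 = 1 \text{ for every } n .
\]
```

So the sequence $e_n$ converges weakly to 0 but does not converge to 0 in norm.<br />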

Now that we have a new form of convergence, we should get some<br />

basic properties down to familiarize ourselves with it. Since we are dealing<br />

with functionals, one should expect to see the Hahn-<strong>Banach</strong> theorem<br />

(in the guise of one of its many corollaries) or one of the three big<br />

theorems to pop up often. Evidence for the previous sentence is in the<br />

following proof:<br />

Lemma 4. Let xn be a weakly convergent sequence in a normed space<br />

X with weak limit x. Then:<br />

a. The weak limit x of (xn) is unique.<br />

b. Every subsequence of xn converges weakly to x.<br />

c. The sequence ||xn|| is bounded (Exercise from previous section).<br />

Proof. For (a), assume xn converges weakly to both x and y, with<br />

x ≠ y. Then ||x − y|| > 0, and hence there exists a functional f (Why?) such<br />

that f(x − y) = ||x − y||. But f(x − y) = f(x) − f(y) = lim f(xn) − lim f(xn) = 0.<br />

So, ||x − y|| = 0, a contradiction.<br />

For (b), given a subsequence xnk , the scalars f(xnk ) form a<br />

subsequence of f(xn). Since f(xn) converges, f(xnk ) converges to the<br />

same limit, and this holds for all f. Hence xnk converges weakly to x.<br />

For (c), see the previous set of problems.<br />

We often refer to convergence in norm (i.e., xn → x means ||xn −<br />

x|| → 0) as strong convergence (to contrast it with weak convergence).



We know that every strongly convergent sequence is weakly convergent;<br />

is the converse ever true? In finite dimensions,<br />

the answer is yes. The answer is also yes for some infinite dimensional<br />

spaces, as the following example suggests.<br />

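Example 8. Consider $\ell^1$, which has dual given by $\ell^\infty$. Let xn → x<br />

weakly. That is, for every $y \in \ell^\infty$, $\sum_{k=1}^{\infty} x_n(k) y(k) \to \sum_{k=1}^{\infty} x(k) y(k)$.<br />

We may choose $y = e_k$, which tells us that coordinate-wise, $x_n(k) \to x(k)$ for<br />

all k. Now, let’s try showing that $\|x_n - x\| \to 0$, that is, $\sum_{k=1}^{\infty} |x_n(k) - x(k)| \to 0$.<br />

Let ε > 0 be given. First, note that since $x \in \ell^1$, we have $x_n - x \in \ell^1$.<br />

That is, $\sum_{k=1}^{\infty} |x_n(k) - x(k)| < \infty$. Therefore,<br />

the tail of this series must tend to zero. So, there exists N such that<br />

$\sum_{k=N+1}^{\infty} |x_n(k) - x(k)| < \epsilon/2$.<br />

However, by our pointwise convergence,<br />

there exists an M such that for n > M, we have<br />

$\sum_{k=1}^{N} |x_n(k) - x(k)| < \epsilon/2$. Putting the two together, we have for<br />

n > M, $\sum_{k=1}^{\infty} |x_n(k) - x(k)| < \epsilon$, and hence ||xn − x|| < ε for n > M.<br />

Consequently, xn strongly converges to x.<br />

A word of caution: as written, the index N in the tail estimate depends on n, and<br />

making it uniform in n is exactly where weak convergence (rather than mere<br />

coordinate-wise convergence) has to be used. The conclusion is true (it is a classical<br />

theorem of Schur), but a complete proof needs a more careful argument than this sketch.<br />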
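A quick sanity check on what weak convergence in $\ell^1$ demands (an illustrative sketch; the<br />

truncation length and helper names are assumptions): the unit vectors $e_n$ converge to 0 in<br />

every coordinate, but pairing them with the bounded sequence y = (1, 1, 1, ...) in $\ell^\infty$ always<br />

gives 1, so $e_n$ does not converge weakly to 0 in $\ell^1$. Coordinate-wise convergence alone is<br />

therefore weaker than weak convergence in $\ell^1$.<br />

```python
DIM = 500  # truncation length (assumption)


def e(n, dim=DIM):
    """n-th standard unit vector (1-indexed), truncated to `dim` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]


def pair(x, y):
    """Pairing sum_k x(k) * y(k) of an l^1 sequence with a bounded sequence."""
    return sum(a * b for a, b in zip(x, y))


def norm1(x):
    """The l^1 norm of a truncated sequence."""
    return sum(abs(a) for a in x)


ones = [1.0] * DIM  # the bounded sequence y = (1, 1, 1, ...), truncated

for n in (1, 10, 100):
    x = e(n)
    # First coordinate, pairing with y, and l^1 norm of e_n.
    print(n, x[0], pair(x, ones), norm1(x))
```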

Now that we’ve seen that weak and strong convergence can coincide<br />

in infinite dimensions, let’s justify that in<br />

finite dimensions, the two are always the same.<br />

Theorem 18. Let X be a finite dimensional normed vector space and<br />

suppose xn converges weakly to x. Then, xn converges strongly to x.<br />

Proof. Since X is finite dimensional, there exists a basis {e1, e2, ..., ek}<br />

such that $x_n = \sum_{j=1}^{k} \alpha_n(j) e_j$, where αn(j) is the j-th coordinate<br />

of xn; similarly, write $x = \sum_{j=1}^{k} \alpha(j) e_j$. We may choose a set of functionals such that fj(ej) = 1 and<br />

fj(em) = 0 for m ≠ j. Then, since fj(xn) → fj(x) by assumption, this<br />

tells us that fj(xn) = αn(j) → fj(x) = α(j). Consequently, we have<br />

a convergent sequence of scalars in each coordinate. Then, we have<br />

$\|x_n - x\| = \| \sum_{j=1}^{k} (\alpha_n(j) - \alpha(j)) e_j \| \le \sum_{j=1}^{k} |\alpha_n(j) - \alpha(j)| \, \|e_j\|$.<br />

Since each sequence of scalars goes to zero, the finite sum tends to<br />

zero.<br />
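The estimate at the heart of this proof can be checked numerically. The sketch below works in<br />

R 2 with the Euclidean norm and a deliberately non-orthogonal basis; the basis, the sequence<br />

xn, and the helper names are all hypothetical choices made for illustration.<br />

```python
import math

BASIS = [(1.0, 0.0), (1.0, 1.0)]  # a non-orthogonal basis e_1, e_2 of R^2 (assumption)


def coords(v):
    """Coordinates (alpha(1), alpha(2)) of v in BASIS, i.e. f_1(v) and f_2(v)."""
    # v = a*e_1 + b*e_2 = (a + b, b), so b = v[1] and a = v[0] - v[1].
    return (v[0] - v[1], v[1])


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def bound(v, w):
    """Right-hand side of the estimate: sum_j |alpha_v(j) - alpha_w(j)| * ||e_j||."""
    cv, cw = coords(v), coords(w)
    return sum(abs(a - b) * norm(ej) for a, b, ej in zip(cv, cw, BASIS))


x = (1.0, 2.0)
for n in (1, 10, 100, 1000):
    xn = (1.0 + 1.0 / n, 2.0 - 1.0 / n)
    diff = (xn[0] - x[0], xn[1] - x[1])
    # The norm of x_n - x is dominated by the coordinate-wise bound; both shrink.
    print(n, norm(diff), bound(xn, x))
```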

This explains partly why you likely haven’t heard about topics like<br />

weak convergence in your calculus or linear algebra classes: it’s all the<br />

same in finite dimensions! Now that we know a thing or two about<br />

weak convergence, is there an equivalent or simpler way of describing if<br />

a sequence will weakly converge? Yes, there is a very handy way where<br />

we only need to show that f(xn) → f(x) on a total subset of X ∗ .<br />

Definition 14. Let X be a normed vector space and let M ⊂ X. We<br />

say that M is a total subset if the span of M is dense in X.<br />

Theorem 19. Let X be a normed vector space. Then, xn converges<br />

weakly to x iff ||xn|| is a bounded sequence and if for every linear func-<br />

tional in a total subset M ⊂ X ∗ , we have f(xn) → f(x). [3]<br />

Proof. Let xn converge weakly to x. Then, by a previous problem,<br />

||xn|| is a bounded sequence. Additionally, since f(xn) → f(x) for<br />

all f ∈ X ∗ , we of course have that f(xn) → f(x) for a total subset<br />

M ⊂ X ∗ .<br />

The converse is a bit trickier. By assumption, ||xn|| ≤ c and we have<br />

some total subset M ⊂ X ∗ . We need to show that |f(xn) − f(x)| → 0<br />

for all f ∈ X ∗ . We know this holds true for our total set. The trick<br />

here is to use an ε/3 argument. Let C > 0 be such that ||x|| ≤ C and<br />

||xn|| ≤ C for all n (for example, C = max(c, ||x||) + 1). We can choose an fj ∈ span M such that<br />

||f − fj|| < ε/(3C), since M is total. Since fj is a linear combination of<br />

functionals from M, we have |fj(xn) − fj(x)| < ε/3 for all sufficiently large n.<br />

Then, for such n, we have<br />

$|f(x) - f(x_n)| \le |f(x) - f_j(x)| + |f_j(x) - f_j(x_n)| + |f_j(x_n) - f(x_n)| \le \|f - f_j\| \, \|x\| + \epsilon/3 + \|f - f_j\| \, \|x_n\| < \epsilon$.<br />

Note that it was imperative that ||xn|| was bounded for this trick to<br />

work.<br />

It may not be immediately obvious why this previous theorem is so<br />

helpful: we still have to find a total subset of X ∗ and then show that<br />

f(xn) → f(x) for all of those. Well, it turns out in some spaces, working<br />

with a total subset is extremely easy! Consider ℓ p for 1 < p < ∞.<br />

Then ℓ q is the dual space, where q is the conjugate to p. Then {en}<br />

is a Schauder basis, which is a total subset. From this, we can show<br />

that xn converges weakly to x iff ||xn|| is bounded and xn(k) → x(k).<br />

(Why?) That is, a sequence is weakly convergent if it is norm bounded<br />

and coordinate-wise convergent.<br />
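The boundedness hypothesis cannot be dropped. As a sketch in $\ell^2$ (truncated to finitely many<br />

coordinates; the sequence, the test functional, and the helper names are assumptions for<br />

illustration): the vectors $x_n = n e_n$ converge to 0 in every coordinate, but their norms are<br />

unbounded, and pairing with y(k) = 1/k gives 1 for every n rather than 0.<br />

```python
DIM = 2000  # truncation length (assumption)


def scaled_unit(n, dim=DIM):
    """x_n = n * e_n: the value n in position n (1-indexed), zeros elsewhere."""
    return [float(n) if k == n else 0.0 for k in range(1, dim + 1)]


def pair(x, y):
    """Pairing sum_k x(k) * y(k)."""
    return sum(a * b for a, b in zip(x, y))


# Square-summable test sequence y(k) = 1/k (hypothetical choice).
y = [1.0 / k for k in range(1, DIM + 1)]

for n in (2, 10, 100, 1000):
    xn = scaled_unit(n)
    # First coordinate (already 0), largest entry (the norm), and the pairing with y.
    print(n, xn[0], max(abs(a) for a in xn), pair(xn, y))
```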

So far, if X is a normed vector space, we have given it a new<br />

topology. We know that if X is a normed vector space, X ∗ is a <strong>Banach</strong><br />

space (even if X is not!), and hence we can consider giving it a weak<br />

topology. One can do this by considering the weak topology generated<br />

by X ∗∗ . However, the more important topology on X ∗ is the weak-<br />

* topology generated by X regarded as a space of linear functionals<br />

acting on X ∗ . That is, we look at the weak topology on X ∗ generated<br />

by X ⊂ X ∗∗ . More concretely, if fn is a sequence of functionals in X ∗ ,



we say that fn is weak-* convergent to f if for all x ∈ X, x(fn) → x(f)<br />

(where x is acting as a linear functional on X ∗ ). But, we know that this<br />

just means for all x ∈ X, fn(x) → f(x). That is, convergence in the weak-* topology<br />

on X ∗ is just pointwise convergence!<br />

Definition 15. Let X be a normed vector space and let X ∗ be the<br />

dual. We say that a sequence fn converges weak-* to f ∈ X ∗ if for<br />

every x ∈ X, fn(x) → f(x).<br />

It may not seem immediately obvious why we even bother using the<br />

weak-* topology on X ∗ . We see that the weak topology on X is ben-<br />

eficial because it is more flexible in letting sequences converge. There<br />

is a topological reason which makes the weak-* topology extremely<br />

convenient. Recall that we showed that the closed unit ball in an in-<br />

finite dimensional space is necessarily not compact. Well, it turns out<br />

the weak-* topology makes the closed unit ball in X ∗ compact (in the<br />

weak-* topology). Note: if you’re not familiar with Tychonoff’s theo-<br />

rem, feel free to skip this proof. Make sure to familiarize yourself with<br />

Tychonoff’s theorem at some point, as it is a very useful theorem from<br />

topology. For the benefit of the reader, I will restate it here:<br />

Theorem 20. Tychonoff’s Theorem. Let {Xα} be a family of compact<br />

topological spaces. Then, X = ΠαXα is compact in the product<br />

topology.<br />

On the other hand, if X = ΠαXα is a compact space, then since each<br />

πα is a continuous map, each Xα is also compact (continuous functions<br />

map compact sets to compact sets).



Theorem 21. Alaoglu’s Theorem. Let X be a normed vector space.<br />

Then, the closed unit ball B ∗ = {f ∈ X ∗ : ||f|| ≤ 1} is compact in the<br />

weak-* topology. [2]<br />

Proof. For every x ∈ X, we can define Dx = {z ∈ C : |z| ≤ ||x||}<br />

and define D = Πx∈XDx. Note that each Dx is compact (Why?), and<br />

hence by Tychonoff’s theorem, D is compact. Here’s the trick to this<br />

theorem: What does it mean if φ ∈ D? If φ ∈ D, then φ associates<br />

with each x ∈ X a complex scalar in the x th coordinate. Therefore, we<br />

may identify φ as a functional acting on X. This is not necessarily a<br />

collection of linear functionals though! All we know so far is that D is<br />

compact. We have B ∗ is a subset of D. The topology that B ∗ inherits<br />

from D is the product topology, which you may recall is the topology<br />

of pointwise convergence. But, we know that the topology of pointwise<br />

convergence is exactly the weak-* topology. That is, B ∗ as a subset<br />

of D has the weak-* topology. Since D is compact, we need to just<br />

show that B ∗ is closed. (Why?) Let fα ∈ B ∗ be a net that converges<br />

to f ∈ D. We need to show that f ∈ B ∗ . First, is f linear? Well,<br />

f(ax + by) = lim fα(ax + by) = a lim fα(x) + b lim fα(y) = af(x) + bf(y). So, f is<br />

linear. Moreover, since f ∈ D, we have |f(x)| ≤ ||x|| for every x ∈ X, so ||f|| ≤ 1.<br />

So, f ∈ B ∗ , and we have that B ∗ is closed. Consequently, B ∗ is<br />

compact in the weak-* topology. <br />

To summarize, we’ve given a normed space X two topologies: the<br />

usual norm topology and a new topology generated by the functionals<br />

in X ∗ . On X ∗ , we have the usual norm topology and the topology<br />

of pointwise convergence induced by X. What about the space of<br />

bounded operators, B(X,Y )? This has convergence given by the norm.<br />



That is, if Tn → T, we mean ||Tn − T || → 0. That instantly implies<br />

||Tnx − Tx|| → 0 for all x ∈ X. What about the other way around? If<br />

||Tnx − Tx|| → 0 for all x ∈ X, does ||Tn − T || → 0? This is not the<br />

case. However, we can define a pointwise topology on B(X,Y ) using<br />

these pointwise norm estimates. To be precise,<br />

Definition 16. We say that Tn → T strongly if ||Tnx − Tx|| → 0 for<br />

every x ∈ X.<br />

The topology associated with this is called the strong operator topology.<br />

<strong>Problems</strong><br />

On these problems, I strongly suggest taking a glance back at the<br />

Hahn-<strong>Banach</strong> theorem and its useful consequences.<br />

Problem 25. Let xn,yn weakly converge to x and y respectively. Show<br />

that αxn + βyn → αx + βy weakly.<br />

Problem 26. Let $T_n : \ell^2 \to \ell^2$ be given by $T_n x = (0, 0, \ldots, 0, x(n), x(n+1), \ldots)$.<br />

Consider the sequence Tn. First show that each Tn is a bounded linear<br />

operator. Then show that ||Tnx − Tx|| → 0 for every x, for some appropriate T.<br />

Does ||Tn − T || → 0?<br />

Problem 27. Let X,Y be normed spaces. Let xn → x weakly and let<br />

T ∈ L(X,Y ). Show that Txn → Tx weakly. Note that Txn ∈ Y .<br />

Problem 28. Let xn converge weakly in a normed space X to x. Show<br />

that x lies in the norm closure of span{x1,x2,......}.<br />



Problem 29. Let Y be a closed subspace in X. Show that Y contains<br />

all of the limits of its weakly convergent sequences.<br />

Problem 30. Let X be a <strong>Banach</strong> space and let E ⊂ X be a norm-<br />

bounded set. Consider the weak closure of E (that is, the closure of E<br />

in the weak topology). Show that the weak closure of E is still norm<br />

bounded.<br />

Problem 31. Let X be a normed vector space and Y a subspace. Show that<br />

Y is norm closed iff Y is weakly closed.<br />

Hilbert <strong>Spaces</strong><br />

In linear algebra, we learned to generalize the algebraic structure of<br />

R n by considering vector spaces of dimension n. Then, to acquire some<br />

of the topological structure, we generalized the metric nature of R n to<br />

get normed vector spaces and <strong>Banach</strong> spaces. However, even in these<br />

spaces which have similar topological and algebraic structures to R n ,<br />

there is still something missing, and this missing piece is the familiar<br />

geometry of R n . We know how to compare vectors and see if they are<br />

perpendicular. To do this, we have the dot product in R n . The dot<br />

product naturally induces a norm (which in turn gives us a metric).<br />

If we generalize the notion of a dot product, we get what is called an<br />

inner product.<br />

Definition 17. An inner product is a map from X × X → C such<br />

that:<br />

• 〈ax + by,z〉 = a〈x,z〉 + b〈y,z〉<br />

• $\langle y,x\rangle = \overline{\langle x,y\rangle}$<br />

• 〈x,x〉 ∈ (0, ∞) for all x ≠ 0<br />

This map is linear in the first term and conjugate-linear in the second<br />

term, as 〈x,ay〉 = ā〈x,y〉. Note that in physics, the opposite<br />

convention is used (conjugate linear in the first term). Note that if X is<br />

a real vector space, then the inner product is bilinear and conjugation<br />

plays no role. We often call such a space an inner product space, or in<br />

more fancy terms, a pre-Hilbert space. With an inner product, we can<br />

induce a norm by $\|x\| = \sqrt{\langle x,x\rangle}$. That this is a norm is not immediately<br />

obvious. Although it follows quickly from the definitions that<br />

||x|| = 0 iff x = 0 and ||x|| ≥ 0, and ||αx|| = |α| ||x||, the triangle<br />

inequality is a bit tricky and we’ll need something to deal with that.<br />

Inner product spaces give all the structure a normed space has, plus<br />

some new tricks. One of the most valuable inequalities that I have ever<br />

used is an inequality that relates the magnitude of an inner product of<br />

two vectors and the product of their norms.<br />

Theorem 22. The Cauchy-Schwarz Inequality. Let x,y ∈ X.<br />

Then |〈x,y〉| ≤ ||x|| ||y||.<br />

Proof. Consider x, y ≠ 0 (since if either of them is zero, the inner<br />

product is zero and the result follows). For every scalar α,<br />

$\|x - \alpha y\|^2 = \langle x - \alpha y, x - \alpha y\rangle = \langle x,x\rangle - \bar\alpha\langle x,y\rangle - \alpha\langle y,x\rangle + \alpha\bar\alpha\langle y,y\rangle$.<br />

That is, we have $\|x - \alpha y\|^2 = \|x\|^2 - \bar\alpha\langle x,y\rangle - \alpha[\langle y,x\rangle - \bar\alpha\langle y,y\rangle]$. We can<br />

zero out the bracketed term if we choose $\bar\alpha = \langle y,x\rangle / \langle y,y\rangle$. Consequently,<br />

$0 \le \|x - \alpha y\|^2 = \|x\|^2 - \bar\alpha\langle x,y\rangle = \|x\|^2 - \frac{\langle y,x\rangle \langle x,y\rangle}{\langle y,y\rangle}$. Rewriting yields<br />

$0 \le \|x\|^2 - \frac{|\langle x,y\rangle|^2}{\|y\|^2}$. Moving terms and multiplying by the denominator<br />

yields $|\langle x,y\rangle|^2 \le \|x\|^2 \|y\|^2$. Taking square roots finishes the proof.<br />

Note that $z\bar z = |z|^2$, which was used.<br />
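The choice of α in this proof has a geometric meaning: αy is the component of x along y, and<br />

what remains, x − αy, is orthogonal to y. The following sketch checks this, and the inequality<br />

itself, on random complex vectors. The dimension, the random seed, and the helper names are<br />

assumptions for illustration; the inner product is linear in the first slot, as in these notes.<br />

```python
import math
import random


def inner(x, y):
    """<x, y> = sum_j x_j * conj(y_j): linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x).real)


def random_vector(dim, rng):
    """A random vector in C^dim with entries in the square [-1, 1] x [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(dim)]


rng = random.Random(0)  # fixed seed so the run is reproducible
x, y = random_vector(5, rng), random_vector(5, rng)

# The scalar from the proof: conj(alpha) = <y,x>/<y,y>, i.e. alpha = <x,y>/<y,y>.
alpha = inner(x, y) / inner(y, y)
residual = [a - alpha * b for a, b in zip(x, y)]

print(abs(inner(x, y)), norm(x) * norm(y))  # left side never exceeds right side
print(abs(inner(residual, y)))              # numerically zero: residual is orthogonal to y
```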

Theorem 23. If $\|x\| = \sqrt{\langle x,x\rangle}$, then ||x|| is a norm on X.<br />

Proof. By the previous remarks, ||x|| satisfies all of the norm properties<br />

automatically from the definition, save for possibly the triangle inequality.<br />

We have $\|x + y\|^2 = \langle x + y, x + y\rangle = \|x\|^2 + \langle x,y\rangle + \langle y,x\rangle + \|y\|^2$.<br />

Now, the two middle terms sum to 2 Re〈x,y〉 ≤ 2|〈x,y〉|, and we may apply the<br />

Cauchy-Schwarz inequality to bound this by 2||x|| ||y||. Therefore,<br />

$\|x+y\|^2 \le \|x\|^2 + 2\|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2$. Taking square roots gives<br />

the triangle inequality.<br />

We know that the norm is continuous from our initial study of<br />

normed vector spaces, so it follows that the norm induced by the in-<br />

ner product is continuous. More can be said: the inner product is a<br />

continuous mapping from X × X → C.<br />

Lemma 5. Let X be an inner product space and let xn,yn be convergent<br />

sequences to x,y respectively. Then lim〈xn,yn〉 = 〈x,y〉.<br />

Proof. We have |〈xn,yn〉−〈x,y〉| = |〈xn,yn〉−〈x,yn〉+〈x,yn〉−〈x,y〉| ≤<br />

|〈xn − x,yn〉| + |〈x,yn − y〉| ≤ ||x − xn|| ||yn|| + ||x|| ||y − yn|| → 0. <br />

So far we have generalized the idea of a dot product to an arbitrary<br />

vector space. An inner product space instantly gives us a norm topol-<br />

ogy and convergence in norm, analogous to R n . But, analytically, R n<br />

has the wonderful property of being complete. If we could define a



complete inner product space, we would have a great generalization of<br />

our familiar Euclidean spaces.<br />

Definition 18. A Hilbert Space is a complete, inner product space.<br />

Another way of phrasing it is that a Hilbert space is a <strong>Banach</strong> space<br />

whose norm is induced by an inner product. We know so far that $\mathbb{R}^n$ and $\mathbb{C}^n$ are Hilbert<br />

spaces with the usual dot product. We’ve run into an example of a<br />

Hilbert space already: $\ell^2$. If we let $\langle x,y\rangle = \sum_{j=1}^{\infty} x(j)\overline{y(j)}$, for $x,y \in \ell^2$,<br />

we have a well defined inner product. (Why?) This is what makes $\ell^2$ so<br />

much more special than $\ell^1$ or $\ell^\infty$, which can have bizarre, pathological<br />

problems (especially $L^1$ and $L^\infty$). However, $\ell^2$ is a very nice space<br />

with great properties. We already know that $(\ell^p)^* = \ell^q$ where p and q<br />

are conjugates, and 2 is conjugate with itself, so we know $\ell^2 = (\ell^2)^*$.<br />

Soon, we will be able to show that for a general Hilbert space H, there<br />

is a bijection between H and H ∗ . More concretely, there are several<br />

familiar geometric properties in a Hilbert space.<br />

Theorem 24. Let x,y ∈ H. Then ||x+y|| 2 +||x−y|| 2 = 2(||x|| 2 +||y|| 2 )<br />

Proof. Note that ||x + y|| 2 = ||x|| 2 + 2ℜ〈x,y〉 + ||y|| 2 and ||x − y|| 2 =<br />

||x|| 2 − 2ℜ〈x,y〉 + ||y|| 2 . Summing the two formulas gives the desired<br />

result. <br />
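This identity (the parallelogram law) is a genuine test for whether a norm can come from an<br />

inner product. As a small sketch on R 2 (the two test vectors and the helper names are<br />

assumptions for illustration), the Euclidean norm satisfies the law up to rounding, while the<br />

ℓ 1 norm fails it, so the ℓ 1 norm on R 2 is not induced by any inner product.<br />

```python
import math


def norm2(v):
    """Euclidean norm."""
    return math.sqrt(sum(a * a for a in v))


def norm1(v):
    """The l^1 norm: sum of absolute values."""
    return sum(abs(a) for a in v)


def parallelogram_gap(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2); zero exactly when the law holds."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)


x, y = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_gap(norm2, x, y))  # zero up to floating-point rounding
print(parallelogram_gap(norm1, x, y))  # nonzero: the law fails for the l^1 norm
```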

The importance of the inner product is only realized when we gener-<br />

alize the notion of orthogonality. We say that x ⊥ y or x is orthogonal<br />

to y if 〈x,y〉 = 0. One of the most familiar rules from geometry is the<br />

Pythagorean theorem for a right triangle. We can generalize this to<br />

arbitrary Hilbert spaces.



Theorem 25. The Pythagorean Theorem. Let x1, ..., xn ∈ H and<br />

let xj ⊥ xk for j ≠ k. Then $\| \sum_{j=1}^{n} x_j \|^2 = \sum_{j=1}^{n} \|x_j\|^2$.<br />

Proof. Exercise <br />

Given a set M ⊂ H, we can define M ⊥ = {x ∈ H : x ⊥ y ∀y ∈ M}.<br />

We call M ⊥ the orthogonal complement of M. With the inner product,<br />

given any closed subspace M, we can decompose a Hilbert space into the direct sum of<br />

M and its orthogonal complement. For example, in R 2 , the x-axis and<br />

y-axis are orthogonal one dimensional subspaces such that R 2 can be<br />

viewed as the direct sum of the two subspaces. We can generalize this<br />

notion to arbitrary Hilbert spaces by first considering the following<br />

question: if M ⊂ H is a closed subspace and y ∈ H, does there exist<br />

a unique x ∈ M such that ||x − y|| is minimized? Can we always<br />

find a closest vector in the subspace? The answer is that for a closed<br />

subspace, we can do this; furthermore, with this unique x, we can<br />

actually express y as a sum of an element from M and M ⊥ . Since this<br />

can be done for arbitrary y ∈ H, we can decompose H into M ⊕ M ⊥ .<br />

Theorem 26. Let M be a closed subspace of H. Then H = M ⊕ M ⊥ .<br />

In other words, if x ∈ H, then we can uniquely write x = y + z where<br />

y ∈ M and z ∈ M ⊥ . These unique elements y and z are the unique<br />

elements of M and M ⊥ that minimize the distance to x. [2]<br />

Proof. Let x ∈ H and define δ = inf{||x − y|| : y ∈ M}. By the<br />

definition of infimum, we may find a sequence yn such that ||x −yn|| →<br />

δ. Since H is a Hilbert space, we may use the parallelogram law, which<br />

tells us that:



$2(\|y_n - x\|^2 + \|y_m - x\|^2) = \|y_n - y_m\|^2 + \|y_n + y_m - 2x\|^2$<br />

We know that M is a subspace, so $\tfrac{1}{2}(y_n + y_m) \in M$, and hence<br />

$\|\tfrac{1}{2}(y_n + y_m) - x\| \ge \delta$. If we solve for<br />

$\|y_n - y_m\|^2$, and factor out the 2 from $\|y_n + y_m - 2x\|^2$, we arrive at:<br />

$\|y_n - y_m\|^2 = 2(\|y_n - x\|^2 + \|y_m - x\|^2) - 4\|\tfrac{1}{2}(y_n + y_m) - x\|^2$<br />

$\|y_n - y_m\|^2 \le 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\delta^2$<br />

Now, we know that $\|y_n - x\|^2$ and $\|y_m - x\|^2$ tend to $\delta^2$,<br />

so the right hand side tends to zero. This tells us that yn is a Cauchy<br />

sequence in M. Consequently, there exists a y ∈ M such that yn → y.<br />

(Why?) Define z = x − y, and hence ||z|| = ||x − y|| = δ. So far, we<br />

have shown that there is a y ∈ M that minimizes the distance to x.<br />

Notice that x = y +(x −y) = y +z. If we can show that z ∈ M ⊥ , then<br />

we will be mostly done, save for uniqueness.<br />

What we need to do now is show that for any u ∈ M, u ⊥ z. So,<br />

consider 〈z,u〉. This quantity may be a complex number, but if we<br />

multiply u by an appropriate scalar, we can turn this into a real valued<br />

quantity (Note: this trick is often used, where one multiplies by a<br />

scalar to either normalize or make a quantity real). Consider f(t) =<br />

||z + tu|| 2 = ||z|| 2 + 2t〈z,u〉 + t 2 ||u|| 2 . Differentiating this real valued<br />

function gives f ′ (t) = 2〈z,u〉+2t||u|| 2 . We know that f(t) = ||z +tu|| 2<br />

is minimized at t = 0 because z + tu = x − y + tu = x − (y − tu).<br />

Since y − tu ∈ M, we know ||x − (y − tu)|| ≥ δ = ||x − y|| = ||z||. Therefore,<br />

the minimization occurs when t = 0. Looking at our derivative, we<br />

have f ′ (0) = 0 = 2〈z,u〉. Hence z ⊥ u. Since this holds for arbitrary<br />

u ∈ M, we have z ∈ M ⊥ . Now, we must argue uniqueness: Let y ′ ∈ M.<br />

Then ||x − y ′ || 2 = ||x − y|| 2 + ||y − y ′ || 2 ≥ ||x − y|| 2 . Here, we used the<br />

Pythagorean theorem, which is valid since x − y ⊥ y − y ′ ∈ M, since<br />

x − y = z ∈ M ⊥ . One can similarly show that given another z ′ ∈ M ⊥ ,<br />

$\|x - z'\|^2 = \|x - z\|^2 + \|z - z'\|^2 \ge \|x - z\|^2$, and we have equality iff<br />

z = z ′ . This solves uniqueness of y and z as the closest elements to x<br />

from M and M ⊥ respectively.<br />

Therefore, given any x ∈ H, we can write x = y + z where y ∈ M<br />

and z ∈ M ⊥ . Assume that there is another decomposition. Then,<br />

y ′ + z ′ = x = y + z implies y ′ − y = z − z ′ . But, y ′ − y ∈ M and<br />

z − z ′ ∈ M ⊥ , and hence y ′ − y = z − z ′ ∈ M ⊥ ∩ M = {0}. Therefore,<br />

we have a unique decomposition of H as M ⊕ M ⊥ . <br />
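To see the decomposition in a case where everything can be computed by hand, here is a<br />

sketch in R 3 . It assumes M is the plane of vectors orthogonal to a fixed unit vector N, so<br />

that M ⊥ is the line spanned by N (a standard finite-dimensional fact); the vector N, the<br />

point x, and the helper names are hypothetical choices for illustration.<br />

```python
import math

N = (1 / 3, 2 / 3, 2 / 3)  # unit normal; M = {v : <v, N> = 0} is a closed subspace of R^3


def dot(a, b):
    """Standard dot product on R^3."""
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def decompose(x):
    """Return (y, z) with y in M, z in M-perp (the span of N), and x = y + z."""
    c = dot(x, N)
    z = tuple(c * n for n in N)
    return sub(x, z), z


x = (3.0, 0.0, 3.0)
y, z = decompose(x)
print(y, z, dot(y, z))  # y + z = x and y is orthogonal to z (up to rounding)

# Any other point of M is at least as far from x as y is.
other = (2.0, -1.0, 0.0)  # lies in M, since its dot product with N vanishes
print(norm(sub(x, y)), norm(sub(x, other)))
```

In this example the distance from x to M is ||z||, and the second printed pair shows a<br />

competing point of M that is farther away.<br />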

If we look back at the beginning of this proof where we were es-<br />

tablishing the existence of a minimizing distance vector y to x, notice<br />

that the only properties of M we used were that M was closed (hence<br />

complete) and that $\tfrac{1}{2}(y_n + y_m) \in M$. The second property is far less<br />

demanding than being a subspace; in fact, a convex set would do just<br />

fine. That is, if K is a closed, convex set in a Hilbert space, then we<br />

can find a unique minimizing vector. The rest of the proof utilizes<br />

subspace properties however.<br />

Let x ∈ H and let M be a closed subspace. Then by the previous<br />

theorem, there exist unique y ∈ M and z ∈ M ⊥ with x = y + z. We call y the<br />

orthogonal projection of x onto M. This is motivated by the calculus<br />

and geometry you are likely familiar with. With this information, we<br />

can define a mapping P : H → M by Px = y. From this, we see that P<br />

is a linear operator. Furthermore, P is continuous, and hence bounded<br />

(if xn → x, then ǫ ≥ ||xn − x|| 2 = ||yn − y|| 2 + ||zn − z|| 2 , so yn → y;<br />

from this, Pxn = yn → y = Px). We have some nice properties: P is<br />

an onto bounded linear mapping from H to M, and it is the identity<br />

on M, and hence P 2 = P (Why?). Additionally, P(M ⊥ ) = {0}.<br />

With this newfound structure in a Hilbert space, we can learn some-<br />

thing very important about the dual space of H. If y ∈ H, then we<br />

may define f(x) = 〈x,y〉, which is a bounded linear functional. (Why?)<br />

The surprising thing is, every bounded linear functional can be written<br />

in this way! Therefore, H ∗ can be identified naturally with H itself.<br />

This instantly tells us that we may view H ∗∗ = H also.<br />

Theorem 27. The Riesz Representation Theorem. Let H be a Hilbert<br />

space and f ∈ H ∗ . Then, there exists a unique y ∈ H such that f(x) =<br />

〈x,y〉 for all x ∈ H.<br />

Proof. If f = 0, then it is certainly true that f(x) = 〈x, 0〉 = 0. Let<br />

f not be the zero functional. Let M = {x ∈ H : f(x) = 0}. We<br />

know that the kernel of a bounded operator gives a closed subspace,<br />

so M is closed. Since f ≠ 0, M is a proper subspace of H. Then H =<br />

M ⊕ M ⊥ and M ⊥ is non-trivial, so we may choose z ∈ M ⊥ such<br />

that ||z|| = 1 (since M ⊥ is a subspace, we can normalize any nonzero element). Then, for x ∈ H,<br />

define u = f(x)z − f(z)x. So, f(u) = 0, and u ∈ M. So,<br />

$0 = \langle u,z\rangle = f(x)\|z\|^2 - f(z)\langle x,z\rangle = f(x) - \langle x, \overline{f(z)}\,z\rangle$. Solving for f(x)<br />

gives f(x) = 〈x,y〉 with $y = \overline{f(z)}\,z$.<br />

Now, we must show uniqueness. Assume there exists y,y ′ such that<br />

f(x) = 〈x,y〉 = 〈x,y ′ 〉. Then, 0 = 〈x,y − y ′ 〉. Choosing x = y − y ′ , we<br />

get ||y − y ′ || 2 = 0, and hence y = y ′ . <br />
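In C n the representing vector can be written down directly. As a sketch (the coefficient<br />

list A, the test vector, and the helper names are hypothetical): a functional<br />

$f(x) = \sum_j a_j x_j$ is represented by the vector y whose entries are the conjugates of<br />

the $a_j$, because the inner product conjugates its second argument.<br />

```python
def inner(x, y):
    """<x, y> = sum_j x_j * conj(y_j): linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


A = [1 + 2j, -1j, 3 + 0j]  # coefficients of a hypothetical functional on C^3


def f(x):
    """The bounded linear functional f(x) = sum_j a_j * x_j."""
    return sum(a * xj for a, xj in zip(A, x))


# Riesz representer: conjugate the coefficients so that <x, y> reproduces f(x).
y = [a.conjugate() for a in A]

x = [2 - 1j, 0.5 + 0.5j, -1 + 4j]
print(f(x), inner(x, y))  # the two values agree
```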

This amazing result tells us that the functionals acting on H can be<br />

identified with H itself through a conjugate linear isomorphism. With<br />

this new knowledge of functionals, we can build new operators from<br />

already existing bounded linear operators.<br />

Definition 19. Let H be a Hilbert space and let T : H → H be a<br />

bounded linear operator. Then, we define the adjoint T ∗ : H → H<br />

such that 〈Tx,y〉 = 〈x,T ∗ y〉 for all x,y ∈ H.<br />

It is not obvious that such an operator even exists. However, with the<br />

Riesz representation theorem, we can actually build it rather quickly<br />

since we know a bit about functionals.<br />

Theorem 28. The adjoint T ∗ of a bounded linear operator exists and<br />

is a unique, bounded linear operator with norm equal to ||T ||.<br />

Proof. Consider the functional defined by fy(x) = 〈Tx,y〉 for all x ∈ H.<br />

Then, this is a bounded linear functional, as |fy(x)| ≤ ||T || ||x|| ||y||<br />

(Why?) and hence by the Riesz representation theorem, there ex-<br />

ists a unique z ∈ H such that 〈Tx,y〉 = 〈x,z〉. Consider the map-<br />

ping of H → H given by y → z. We may call this mapping T ∗ .<br />

With this, we have a well defined linear (show linearity) operator T ∗



such that 〈Tx,y〉 = 〈x,T ∗ y〉 for all x,y ∈ H. Given an operator, we<br />

have (with all suprema taken over nonzero vectors)<br />

$\|T\| = \sup_{x \in H} \frac{\|Tx\|}{\|x\|} = \sup_{x,y \in H} \frac{|\langle Tx,y\rangle|}{\|x\| \, \|y\|}$. Then, we have<br />

$\|T^*\| = \sup_{x,y \in H} \frac{|\langle T^*x,y\rangle|}{\|x\| \, \|y\|} = \sup_{x,y \in H} \frac{|\langle x,Ty\rangle|}{\|x\| \, \|y\|} \le \sup_{x,y \in H} \frac{\|x\| \, \|Ty\|}{\|x\| \, \|y\|} = \sup_{y \in H} \frac{\|Ty\|}{\|y\|} = \|T\|$. On the<br />

other hand, $\|T^*\| = \sup_{x,y \in H} \frac{|\langle x,Ty\rangle|}{\|x\| \, \|y\|} \ge \sup_{Tx \ne 0} \frac{\langle Tx,Tx\rangle}{\|Tx\| \, \|x\|} = \sup_{x \in H} \frac{\|Tx\|}{\|x\|} = \|T\|$.<br />

So, ||T || = ||T ∗ ||.<br />
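In C n with the standard inner product, the adjoint of a matrix is its conjugate transpose (a<br />

standard fact, stated here without proof). The sketch below checks the defining identity<br />

〈Tx,y〉 = 〈x,T ∗ y〉 on one example; the matrix, the vectors, and the helper names are<br />

hypothetical choices for illustration.<br />

```python
def inner(x, y):
    """<x, y> = sum_j x_j * conj(y_j): linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(T, x):
    """Matrix-vector product T x, with T given as a list of rows."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in T]


def adjoint(T):
    """Conjugate transpose: entry (i, j) of the result is conj(T[j][i])."""
    return [[T[j][i].conjugate() for j in range(len(T))] for i in range(len(T[0]))]


T = [[1 + 1j, 2 + 0j],
     [0j, 3 - 2j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, -1 + 1j]

lhs = inner(apply(T, x), y)           # <Tx, y>
rhs = inner(x, apply(adjoint(T), y))  # <x, T*y>
print(lhs, rhs)  # the two values agree
```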

Note that it is possible to generalize this concept to an adjoint map-<br />

ping between two different Hilbert spaces H1 and H2. That is, we<br />

can define, for a given T ∈ B(H1,H2), T ∗ ∈ B(H2,H1) such that<br />

〈Tx,y〉2 = 〈x,T ∗ y〉1 for all x ∈ H1, y ∈ H2. However, this requires<br />

some extra machinery (sesquilinear forms), which I decided wasn’t<br />

worth pursuing and can easily be found in any textbook or on the<br />

internet. Before we start proving some properties about adjoints, there is<br />

a useful trick for showing that an operator is actually the zero operator:<br />

Lemma 6. Let X,Y be inner product spaces and let T ∈ B(X,Y ). [3]<br />

Then:<br />

a. T = 0 iff 〈Tx,y〉 = 0 for all x ∈ X and y ∈ Y<br />

b. If T : X → X and X is a complex inner product space and if<br />

〈Tx,x〉 = 0 for all x ∈ X, then T = 0<br />

Proof. Part (a) is an exercise. For part (b), consider 〈T(αx + y), αx + y〉.<br />

Consider the two cases of α = 1 and α = i.<br />

Note that the statement in part (b) of the previous lemma requires<br />

that X be a complex space. It is false in the real case. Consider a<br />

rotation operator in R 2 that rotates by 90° [3].<br />
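The rotation counterexample is easy to check directly. In the sketch below (the function<br />

names and sample vectors are assumptions for illustration), T rotates the plane by a<br />

quarter turn, so Tv is always perpendicular to v even though T is not the zero operator.<br />

```python
def rotate90(v):
    """Rotate v in R^2 counterclockwise by 90 degrees: (a, b) -> (-b, a)."""
    return (-v[1], v[0])


def dot(u, v):
    """Standard dot product on R^2."""
    return u[0] * v[0] + u[1] * v[1]


for v in [(1.0, 0.0), (2.0, -3.0), (0.5, 4.0)]:
    # <Tv, v> vanishes for every v, yet Tv itself is nonzero.
    print(v, rotate90(v), dot(rotate90(v), v))
```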



Theorem 29. Let T,S : H → H be bounded linear operators. Then,<br />

a. (S + T) ∗ = S ∗ + T ∗<br />

b. (αT) ∗ = ¯αT ∗<br />

c. (T ∗ ) ∗ = T<br />

d. ||T ∗ T || = ||TT ∗ || = ||T || 2<br />

e. T ∗ T = 0 iff T = 0<br />

f. (ST) ∗ = T ∗ S ∗<br />

The proofs of these are computational exercises which hopefully<br />

shouldn’t prove to be too strenuous. In the exercises, we will explore<br />

operators known as self-adjoint operators, which satisfy T = T ∗ and<br />

unitary operators, which are invertible operators such that T ∗ = T −1 .<br />

So far, we’ve learned a bit about functionals and operators on a Hilbert<br />

space. It’s time we learn about one of the most useful properties about<br />

Hilbert spaces: orthonormal bases.<br />

A set {eα} ⊂ H is said to be orthonormal if ||eα|| = 1 and 〈eα,eβ〉 =<br />

δαβ, where δαβ = 1 if α = β and zero otherwise. That is, every vector<br />

in an orthonormal set is orthogonal to every other one, and every vec-<br />

tor has norm one. Recall from linear algebra that given any linearly<br />

independent set {xn}, one could transform this into an orthonormal<br />

set using the Gram-Schmidt orthogonalization procedure. One defines<br />

$u_1 = x_1 / \|x_1\|$ and then, for n ≥ 2,<br />

$z_n = x_n - \sum_{j=1}^{n-1} \langle x_n,u_j\rangle u_j$ and $u_n = z_n / \|z_n\|$. Repeating this for all xn<br />

gives an orthonormal set {un}. Orthonormal sets satisfy a very important<br />

inequality which relates the inner products of a vector against an<br />

orthonormal set with the norm of the vector.<br />
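The Gram-Schmidt procedure translates almost line by line into code. This is a minimal<br />

sketch for real vectors, assuming the input list is linearly independent (otherwise a<br />

division by zero occurs); the function names and the sample vectors are hypothetical.<br />

```python
import math


def inner(x, y):
    """Real dot product (the real case, so no conjugation is needed)."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of real vectors."""
    ortho = []
    for x in vectors:
        z = list(x)
        # Subtract the components of x along the vectors found so far.
        for u in ortho:
            c = inner(x, u)
            z = [zi - c * ui for zi, ui in zip(z, u)]
        length = math.sqrt(inner(z, z))  # nonzero by linear independence
        ortho.append([zi / length for zi in z])
    return ortho


us = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for u in us:
    # Each row of inner products should look like a row of the identity matrix.
    print([round(inner(u, w), 12) for w in us])
```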



Theorem 30. Bessel’s Inequality. If {eα}α∈A is an orthonormal set<br />

in H, then for any x ∈ H, we have $\sum_{\alpha \in A} |\langle x,e_\alpha\rangle|^2 \le \|x\|^2$.<br />

Proof. It is possible that this is an uncountable sum; to deal with an<br />

uncountable sum, one takes the supremum over all finite subsets of A.<br />

Therefore, if we can prove this for an arbitrary finite subset of A, we<br />

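will be done. Let B ⊂ A be a finite subset. Then<br />

$0 \le \| x - \sum_{\alpha \in B} \langle x,e_\alpha\rangle e_\alpha \|^2$<br />

$= \|x\|^2 - 2\,\mathrm{Re}\,\langle x, \sum_{\alpha \in B} \langle x,e_\alpha\rangle e_\alpha \rangle + \| \sum_{\alpha \in B} \langle x,e_\alpha\rangle e_\alpha \|^2$ (use the Pythagorean Theorem on the rightmost piece)<br />

$= \|x\|^2 - 2 \sum_{\alpha \in B} |\langle x,e_\alpha\rangle|^2 + \sum_{\alpha \in B} |\langle x,e_\alpha\rangle|^2$<br />

$= \|x\|^2 - \sum_{\alpha \in B} |\langle x,e_\alpha\rangle|^2$<br />

Moving the last sum to the other side finishes the proof.<br />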
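A small numerical check in R 3 shows both the inequality and when it becomes an equality.<br />

In this sketch (the orthonormal vectors, the test vector x, and the helper names are<br />

assumptions for illustration), summing over two orthonormal vectors gives strictly less<br />

than ||x|| 2 , while summing over all three, which form a basis of R 3 , recovers ||x|| 2 .<br />

```python
import math

s = 1 / math.sqrt(2)
# An orthonormal basis of R^3; its first two vectors form an orthonormal set
# that is not a basis.
U = [(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]


def inner(x, y):
    """Real dot product."""
    return sum(a * b for a, b in zip(x, y))


def bessel_sum(x, vectors):
    """Sum of |<x, u>|^2 over the given orthonormal vectors."""
    return sum(inner(x, u) ** 2 for u in vectors)


x = (1.0, 2.0, 3.0)
# Partial sum, full sum, and ||x||^2.
print(bessel_sum(x, U[:2]), bessel_sum(x, U), inner(x, x))
```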

Can equality happen in Bessel’s inequality? The answer is yes, and<br />

something very nice happens in that case. If one has an orthonormal<br />

set such that $\sum_{\alpha \in A} |\langle x,e_\alpha\rangle|^2 = \|x\|^2$ for all x ∈ H, it turns out that the<br />

set {eα} actually is a sort of orthonormal basis. That is, we can express<br />

$x = \sum_{\alpha \in A} \langle x,e_\alpha\rangle e_\alpha$. We call the coefficients 〈x,eα〉 Fourier coefficients.<br />

Theorem 31. Let {eα} be an orthonormal set in H. The following<br />

are equivalent: [2]<br />

a. If 〈x,eα〉 = 0 for all α, then x = 0.<br />

b. Parseval’s Identity: $\|x\|^2 = \sum_{\alpha \in A} |\langle x,e_\alpha\rangle|^2$ for all x ∈ H.<br />

c. For each x ∈ H, $x = \sum_{\alpha \in A} \langle x,e_\alpha\rangle e_\alpha$. This sum converges in<br />

the norm topology no matter the ordering.<br />

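Proof. Assume (a) and let's show (c). Fix $x \in H$. By Bessel's inequality, for each $k$ only finitely many $\alpha$ satisfy $|\langle x, e_\alpha\rangle| \ge 1/k$, so only countably many coefficients are nonzero. Discarding all $\alpha$ such that $\langle x, e_\alpha\rangle = 0$, we may enumerate the remaining indices as $\alpha_1, \alpha_2, \dots$ in any order. By Bessel's inequality, the sum $\sum_j |\langle x, e_{\alpha_j}\rangle|^2$ converges. By the Pythagorean theorem, we have that
\[ \Big\| \sum_{j=n}^{m} \langle x, e_{\alpha_j}\rangle e_{\alpha_j} \Big\|^2 = \sum_{j=n}^{m} |\langle x, e_{\alpha_j}\rangle|^2 \to 0 \]
as we let $m, n$ get arbitrarily large. By the completeness of $H$, $\sum_j \langle x, e_{\alpha_j}\rangle e_{\alpha_j}$ converges. Then $\big\langle x - \sum_j \langle x, e_{\alpha_j}\rangle e_{\alpha_j}, e_\alpha \big\rangle = 0$ for all $\alpha$, and by assumption of (a), this tells us that $x - \sum_j \langle x, e_{\alpha_j}\rangle e_{\alpha_j} = 0$.

Let's assume (c) and show (b). We have
\[ \|x\|^2 - \sum_{j=1}^{n} |\langle x, e_{\alpha_j}\rangle|^2 = \Big\| x - \sum_{j=1}^{n} \langle x, e_{\alpha_j}\rangle e_{\alpha_j} \Big\|^2 \]
by calculation. We have by assumption that the term on the right goes to zero as $n \to \infty$. Hence, we have $\|x\|^2 = \sum_{j=1}^{\infty} |\langle x, e_{\alpha_j}\rangle|^2 = \sum_{\alpha\in A} |\langle x, e_\alpha\rangle|^2$.

If we assume (b), then (a) follows: if $\langle x, e_\alpha\rangle = 0$ for all $\alpha$, then we have $\|x\|^2 = \sum_{\alpha\in A} |\langle x, e_\alpha\rangle|^2 = 0$, and hence $x = 0$. $\square$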
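The equivalence can be checked by hand in a finite-dimensional example. The sketch below works in $\mathbb{C}^2$ with an orthonormal basis chosen only for illustration (the names `basis`, `coeffs`, and `recon` are not from the notes), and assumes the inner product is linear in the first argument. It verifies Parseval's identity and the reconstruction formula of part (c).

```python
import math

def inner(x, y):
    """<x, y> on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

s = 1 / math.sqrt(2)

# An orthonormal basis of C^2: (1, i)/sqrt(2) and (1, -i)/sqrt(2).
basis = [[s + 0j, s * 1j], [s + 0j, -s * 1j]]

x = [2 + 0j, 1 + 1j]

# Fourier coefficients <x, e_k>.
coeffs = [inner(x, e) for e in basis]

# Parseval: the squared moduli of the coefficients should sum to ||x||^2.
parseval_sum = sum(abs(c) ** 2 for c in coeffs)
norm_sq = inner(x, x).real

# Part (c): rebuild x as sum_k <x, e_k> e_k, coordinate by coordinate.
recon = [sum(c * e[k] for c, e in zip(coeffs, basis)) for k in range(len(x))]
error = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, recon)))
```

Here $\|x\|^2 = 6$, and the two coefficients have squared moduli 5 and 1, so Parseval's identity holds and the reconstruction error is zero up to rounding.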

This theorem illustrates the desirable nature of an orthonormal set in a Hilbert space: when the equivalent conditions hold, every vector can be written as a simple (possibly infinite) linear combination of the orthonormal vectors, weighted by the vector's Fourier coefficients. This is why a Hilbert space can be such an ideal space to work in. Generalizing from linear algebra, we call a set that satisfies one (and hence all) properties of the previous theorem an orthonormal basis. We know from our previous work that the set $\{e_n\} \subset \ell^2$, the canonical basis, is a Schauder basis for $\ell^2$. With inner product defined as $\langle x, y\rangle = \sum_{n=1}^{\infty} x(n)\overline{y(n)}$, we see that the $e_n$ form an orthonormal set, and it isn't too hard to show that if $\langle x, e_n\rangle = 0$ for all $n$, then $x = 0$ (indeed, $\langle x, e_n\rangle = x(n)$); hence, this sequence forms an orthonormal basis. It should be clear that a Hilbert space with an orthonormal basis is an ideal setting to work in. A question remains: given a Hilbert space, does there exist an orthonormal basis? Fortunately, the answer is yes!

Theorem 32. Let H be a Hilbert space. Then, H has an orthonormal<br />

basis.<br />

The proof of this, much like the proof of the Hahn-Banach theorem, requires a powerful set theoretic lemma: Zorn's lemma. Our first step is to consider a partially ordered set $X$ whose elements are the orthonormal subsets of $H$. (Note to the student: it is imperative that you first show $X$ is non-empty. Why is $X$ non-empty?) To give a partial ordering on $X$, we say $U_1 \le U_2$ if $U_1 \subset U_2$. To use Zorn's lemma, we must argue that every chain has an upper bound, where a chain is a linearly ordered subset of $X$. Let $C$ be a chain (which need not be countable) and define $U = \bigcup_{V \in C} V$. Any two elements of $U$ lie in a common member of $C$, because $C$ is linearly ordered, so $U$ is an orthonormal set, and clearly $V \le U$ for all $V \in C$. Therefore, every chain has an upper bound, and hence there is a maximal element in $X$ (that is, an orthonormal set not properly contained in any other orthonormal set). Let this set be $\{e_\alpha\}$. We do not yet know that $\{e_\alpha\}$ is an orthonormal basis. Being a maximal orthonormal set implies there exists no $x$ such that $x \perp e_\alpha$ for all $\alpha$, save for $x = 0$: otherwise $\{e_\alpha\} \cup \{x/\|x\|\}$ would be a strictly larger orthonormal set. But that is exactly part (a) of the previous theorem. Consequently, $\{e_\alpha\}$ is an orthonormal basis.

Hilbert spaces keep getting better and better; they generalize the geometry and completeness of $\mathbb{R}^n$, and they always admit orthonormal bases. Hilbert spaces are reflexive, and $H^*$ can be identified with $H$ itself. Given a closed subspace, one can decompose $H$ into the direct sum of the subspace and its orthogonal complement. Additionally, any vector can be reconstructed from its Fourier coefficients, given an orthonormal basis, which is guaranteed to exist. To make things even better, if one can find a countable orthonormal basis, the Hilbert space is automatically separable (and the converse is true too!).

Theorem 33. Let H be a Hilbert space. Then, H is separable iff H<br />

has a countable orthonormal basis. Additionally, if H has a countable<br />

orthonormal basis, then all orthonormal bases are countable. [2]<br />

Proof. The assertion in the first sentence will be left as an exercise. Let's show the last assertion. Let $\{u_n\}$ be a countable orthonormal basis and $\{v_\beta\}_{\beta\in B}$ be an arbitrary orthonormal basis. Then, consider the sets $A_n = \{\beta \in B : \langle v_\beta, u_n\rangle \neq 0\}$. Each set $A_n$ must be countable, by Bessel's inequality applied to the vector $u_n$ and the orthonormal set $\{v_\beta\}$ (alternatively, by part (c) of the previous theorem). Then, $\bigcup_n A_n$ is a countable set. Since $\{u_n\}$ forms an orthonormal basis, every $\beta$ is in at least one $A_n$: otherwise $v_\beta$ would be orthogonal to every $u_n$, forcing $v_\beta = 0$ by part (a). So $\bigcup_n A_n = B$ is countable. $\square$
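The step "each $A_n$ is countable" rests on a standard counting argument, which can be written out as follows. The auxiliary sets $A_{n,k}$ are introduced only for this sketch and do not appear in the original notes.

```latex
\[
A_{n,k} = \{\beta \in B : |\langle u_n, v_\beta\rangle| \ge 1/k\}, \qquad k = 1, 2, 3, \dots
\]
For every finite subset $F \subset A_{n,k}$, Bessel's inequality gives
\[
\frac{|F|}{k^2} \;\le\; \sum_{\beta \in F} |\langle u_n, v_\beta\rangle|^2 \;\le\; \|u_n\|^2 = 1,
\]
so $|F| \le k^2$, and hence $A_{n,k}$ itself has at most $k^2$ elements.
Since $|\langle v_\beta, u_n\rangle| = |\langle u_n, v_\beta\rangle|$, we get
\[
A_n = \bigcup_{k \ge 1} A_{n,k},
\]
a countable union of finite sets, which is countable.
```

The same argument shows, more generally, that any vector in a Hilbert space has at most countably many nonzero Fourier coefficients against an orthonormal set.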

References<br />

[1] John B. Conway. A Course in Functional Analysis. Springer, 2007.

[2] Gerald B. Folland. Real Analysis. John Wiley and Sons Inc., 1999.

[3] Erwin Kreyszig. Introductory Functional Analysis with Applications. John Wiley and Sons Inc., 1989.
