<strong>NOTES</strong> <strong>ON</strong> <strong>ANALYSIS</strong><br />
STEPHEN ROWE<br />
<strong>Contents</strong><br />
Banach Spaces<br />
Problems<br />
A Few Inequalities for $\ell^p$<br />
From Finite to Infinite Dimension<br />
An Introduction to Linear Operators<br />
Problems<br />
An Introduction to Linear Functionals and Dual Spaces<br />
Problems<br />
The Three Big Theorems: Open Mapping, Closed Graph, and Banach-Steinhaus<br />
Problems<br />
Topologies and Functionals: Weak and Weak-* Topologies<br />
Problems<br />
Hilbert Spaces<br />
References<br />
Date: June 14, 2011.<br />
These are notes aimed at undergraduates with an interest in learning<br />
a bit about functional analysis without requiring measure theory. With<br />
this approach, functional analysis can be made accessible to undergraduates<br />
with just some basic analysis and linear algebra background. These notes<br />
are strongly inspired by Kreyszig’s textbook Introductory Functional Analysis<br />
with Applications [3]. Some topics and problems were also inspired by<br />
Folland’s superb analysis textbook [2]. I try to provide some<br />
exercises and detailed proofs along with helpful (I hope!) exposition.<br />
If you see any parenthetical (Why?)’s anywhere, those are statements<br />
which the reader should ponder before moving on (this was inspired<br />
by N. L. Carothers’s excellent Real Analysis textbook). If you see any<br />
mistakes, let me know!<br />
<strong>Banach</strong> <strong>Spaces</strong><br />
Definition 1. Let X be a vector space over R or C. We say that || · || is a norm<br />
on X if || · || : X → R is a mapping such that, for all x, y ∈ X and all scalars α,<br />
• ||x|| ≥ 0<br />
• ||x|| = 0 iff x = 0<br />
• ||αx|| = |α| ||x||<br />
• ||x + y|| ≤ ||x|| + ||y|| (The Triangle Inequality)<br />
If such a mapping exists for X, we say that X is a normed vector<br />
space. Geometrically, norms generalize the notion of length to arbitrary<br />
vector spaces. Note that a normed vector space is automatically a<br />
metric space with the natural metric induced by the norm given by<br />
d(x,y) = ||x − y||. Since the norm induces a metric, it gives rise to a<br />
topology naturally by considering the topology generated by open balls.<br />
Recall by the reverse triangle inequality that |||x|| − ||y||| ≤ ||x − y||.
Consequently, if we consider the norm as a mapping, it is continuous.<br />
That is, if xn,x ∈ X with xn → x, we have ||xn|| → ||x|| (where<br />
convergence of xn → x is given by ||xn − x|| → 0). Since a norm<br />
naturally induces a metric, is it true that every metric is induced by a<br />
norm? This is a neat question, and the answer is no (see exercises).<br />
Definition 2. Let X be a normed vector space. If X is complete, we<br />
say that X is a <strong>Banach</strong> space. Recall that completeness means that<br />
every Cauchy sequence converges.<br />
It is worth noting that a normed vector space may be complete under<br />
one norm, but incomplete under other norms. It is quite helpful to have<br />
a few examples of different normed vector spaces and <strong>Banach</strong> spaces.<br />
The reader should know that $\mathbb{C}^n$ and $\mathbb{R}^n$ are both Banach spaces, but<br />
there are more exotic and interesting examples out there.<br />
Example 1. We define C[a,b] to be the set of continuous functions on<br />
the interval [a,b]. The first norm worth considering is the sup-norm,<br />
$\|f\| = \sup_{t \in [a,b]} |f(t)|$. Under this norm, C[a,b] is<br />
a complete metric space, and hence a Banach space. To see why this<br />
is so, let $(f_n)$ be a Cauchy sequence in C[a,b]. Then, given $\epsilon > 0$, we have for n, m<br />
large enough $\|f_n - f_m\| < \epsilon$, that is, $\sup_{t \in [a,b]} |f_n(t) - f_m(t)| < \epsilon$. If<br />
we fix a t, then $(f_n(t))$ is a Cauchy sequence of<br />
scalars, and hence converges. Hence, $f_n$ converges pointwise to<br />
a function f. Now fix t and m and let $n \to \infty$ in $|f_n(t) - f_m(t)| < \epsilon$<br />
(which is allowed since the absolute value is continuous) to get $|f(t) - f_m(t)| \le \epsilon$.<br />
Since this holds for every t, $\sup_{t \in [a,b]} |f(t) - f_m(t)| \le \epsilon$ for m large enough.<br />
This shows that $f_m$ converges uniformly to f, and hence<br />
f is continuous, and hence f ∈ C[a,b]. Therefore, C[a,b] is complete<br />
under the supremum norm.<br />
On the other hand, we can give a metric to C[a,b] by<br />
$d(f,g) = \int_a^b |f(t) - g(t)| \, dt$. Under this metric, C[a,b] is not complete. To see<br />
why, take [a,b] = [0,1] and, for $n \ge 2$, let $f_n(t) = 0$ for $t \in [0, \tfrac{1}{2}]$ and $f_n(t) = 1$ for $t \in [\tfrac{1}{2} + \tfrac{1}{n}, 1]$.<br />
In the unspecified area, simply let $f_n$ be linear such that it makes $f_n$<br />
continuous (start at 0 at $t = \tfrac{1}{2}$ and go up to 1 at $t = \tfrac{1}{2} + \tfrac{1}{n}$). We<br />
have that $d(f_n, f_m) \le \epsilon$ for all n, m > N with N large. The easiest way to see this is graphically, by<br />
drawing $f_n$ and $f_m$ for large n, m: the area under $|f_n(t) - f_m(t)|$<br />
becomes very small. However, the pointwise limit of<br />
this sequence of functions is a step function, which is discontinuous,<br />
and no continuous function can be the limit in this metric, since such a limit would have to<br />
agree with the step function on $[0, \tfrac{1}{2}]$ and on $(\tfrac{1}{2}, 1]$.<br />
Therefore, the sequence has no limit in C[a,b]. Consequently, this metric makes<br />
the space incomplete.<br />
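The Cauchy claim for the ramp functions can also be checked numerically. The following is a minimal sketch, assuming the interval [0, 1] and n ≥ 2; the helper names `ramp` and `integral_distance` are mine and not part of the notes. Comparing the two triangular regions between each ramp and the step function suggests the exact value |1/n − 1/m|/2, which the midpoint rule should reproduce.<br />

```python
def ramp(n):
    """f_n from the example: 0 on [0, 1/2], linear up to 1 at 1/2 + 1/n, then 1."""
    def f(t):
        if t <= 0.5:
            return 0.0
        if t >= 0.5 + 1.0 / n:
            return 1.0
        return n * (t - 0.5)
    return f


def integral_distance(f, g, steps=100_000):
    """Midpoint-rule approximation of d(f, g), the integral of |f - g| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((k + 0.5) * h) - g((k + 0.5) * h)) for k in range(steps))


for n, m in [(2, 4), (10, 20), (100, 200)]:
    approx = integral_distance(ramp(n), ramp(m))
    exact = abs(1.0 / n - 1.0 / m) / 2  # difference of the two triangle areas
    print(n, m, approx, exact)
```

The printed distances shrink as n and m grow, which is the Cauchy property; the code says nothing about the limit, which is where completeness fails.<br />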
Example 2. We can consider the set of all polynomials defined on an<br />
interval [a,b] to be a subset of C[a,b] with the supremum norm. Is this<br />
set closed? Why or why not?<br />
Amongst the most important spaces in analysis are the so-called<br />
$L^p$ function spaces (with norm given by integration) and $\ell^p$ sequence<br />
spaces (with norm given by summation). We will focus on $\ell^p$ for now<br />
(since we are avoiding measure theory). Let x denote<br />
a sequence of real (or complex) scalars and let x(n) denote the $n$th term<br />
in the sequence.<br />
Definition 3. Let $1 \le p < \infty$. We define $\ell^p$ to be the set of all<br />
sequences x such that $\sum_{n=1}^{\infty} |x(n)|^p < \infty$. On such a space, we can<br />
define a norm by $\|x\|_p = \left( \sum_{n=1}^{\infty} |x(n)|^p \right)^{1/p}$.<br />
If $p = \infty$, we define $\ell^\infty$ to be the set of all sequences of scalars that are bounded, that is,<br />
$\sup_n |x(n)| < \infty$, with norm $\|x\|_\infty = \sup_n |x(n)|$.<br />
The most important $\ell^p$ spaces occur for p = 1, 2, ∞. We will see later<br />
that $\ell^2$ has extraordinarily nice structure. To get a better feel for these<br />
spaces, it helps to have some examples of elements in them. Consider<br />
the harmonic sequence given by $x(n) = \tfrac{1}{n}$. Since the harmonic series $\sum_{n=1}^{\infty} \tfrac{1}{n}$ is<br />
divergent, we know that $x \notin \ell^1$. However, $\sum_{n=1}^{\infty} x(n)^2 = \tfrac{\pi^2}{6}$,<br />
and hence $x \in \ell^2$. These spaces have some convenient properties. For<br />
$1 \le p \le \infty$, $\ell^p$ is actually a complete normed space, and hence a<br />
Banach space. Additionally, for $1 \le p < \infty$, $\ell^p$ is separable ($\ell^\infty$ is not,<br />
however!). Recall that a space is separable if it contains a countable<br />
dense set. Before we move on, a note on notation. Since the elements of<br />
$\ell^p$ are sequences, it can be quite confusing dealing with sequences in $\ell^p$<br />
(that is, sequences of sequences!). Therefore, we will use the notation<br />
that x(n) refers to the $n$th entry of an element $x \in \ell^p$. If<br />
$(x_n)$ is a sequence in $\ell^p$, then $x_n$ is the $n$th term of that sequence<br />
(regarding each $x_n$ as a point in the space), and $x_n(i)$ is its $i$th entry.<br />
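To get a feel for the harmonic example, one can look at partial sums. This is only a numerical illustration (finitely many terms prove nothing about convergence), and the function name `partial_sum` is my own.<br />

```python
import math


def partial_sum(p, terms):
    """Sum of (1/n)**p for n = 1, ..., terms, for the sequence x(n) = 1/n."""
    return sum((1.0 / n) ** p for n in range(1, terms + 1))


for terms in (10, 1_000, 100_000):
    # p = 1 keeps growing (roughly like log(terms)); p = 2 settles near pi^2 / 6.
    print(terms, partial_sum(1, terms), partial_sum(2, terms), math.pi ** 2 / 6)
```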
Theorem 1. The normed space $\ell^2$ is a Banach space with the norm<br />
given by $\|x\| = \left( \sum_{n=1}^{\infty} |x(n)|^2 \right)^{1/2}$.<br />
Proof. Consider a Cauchy sequence $(x_n)$ in $\ell^2$. Then, given $\epsilon > 0$, there is an N such that<br />
$\|x_n - x_m\| \le \epsilon$ for all n, m ≥ N. Consequently, $\sum_{i=1}^{\infty} |x_n(i) - x_m(i)|^2 \le \epsilon^2$. In<br />
particular, we have (for each i) $|x_n(i) - x_m(i)| \le \epsilon$. Hence, for fixed<br />
i, $(x_n(i))$ (with n varying) is a Cauchy sequence of scalars, and hence<br />
converges. Therefore, for each i, we may define $x(i) = \lim_{n \to \infty} x_n(i)$.<br />
So far all we have produced is a candidate limit for the Cauchy sequence<br />
$(x_n)$. Now, for every finite K and all n, m ≥ N, $\sum_{i=1}^{K} |x_n(i) - x_m(i)|^2 \le \epsilon^2$. Taking the limit in n and then letting $K \to \infty$ gives<br />
$\sum_{i=1}^{\infty} |x(i) - x_m(i)|^2 \le \epsilon^2$ for all m ≥ N. Consequently the vector $x - x_m \in \ell^2$. Since<br />
$\ell^2$ is a vector space, $(x - x_m) + x_m = x \in \ell^2$. Therefore, our candidate<br />
limit is in $\ell^2$, and $\|x_m - x\| \to 0$. Therefore $\ell^2$ is a Banach space.<br />
Exercise: Do this for $\ell^p$, $1 \le p \le \infty$.<br />
<strong>Problems</strong>.<br />
Problem 1. Show that $\ell^\infty$ is complete.<br />
Problem 2. Consider $M \subset \ell^\infty$ to be the set of all sequences such that<br />
at most finitely many terms are non-zero. First, show that this set is<br />
a subspace of $\ell^\infty$. Next, show that this set is not closed and hence not<br />
complete.<br />
Problem 3. Let Y be a Banach space and M ⊂ Y be a closed subspace. Show<br />
that M is a Banach space with the norm inherited from Y. (You<br />
probably have done a question like this before: Show that a closed<br />
subset of a complete space is complete.)<br />
Problem 4. Show that $\ell^p$ is separable for $1 \le p < \infty$. Hint: You need<br />
to construct a countable dense set. Recall that the rationals are dense<br />
in the reals.<br />
Problem 5. Show that $\ell^\infty$ is not separable. Hint: Given any real<br />
number in [0, 1], one can write a binary representation as a string of<br />
ones and zeros. Consider strings of this form (justify that they are in<br />
$\ell^\infty$). How many such strings are there? What is the minimum distance<br />
between two such strings?<br />
Problem 6. Let d(x,y) = ||x − y|| be a metric induced by a norm.<br />
Prove that this metric is translation invariant. That is, show that<br />
d(x + z, y + z) = d(x,y). Furthermore, for any α ∈ C (or R, depending<br />
on the field of scalars), show that d(αx,αy) = |α|d(x,y).<br />
Let S be the space of all sequences of scalars. That is, x ∈ S if<br />
x = (x(1), x(2), ..., x(j), ...). Consider the metric given by<br />
$d(x,y) = \sum_{j=1}^{\infty} \frac{1}{2^j} \frac{|x(j) - y(j)|}{1 + |x(j) - y(j)|}$.<br />
Is this metric induced by a norm?<br />
A Few Inequalities for $\ell^p$<br />
It was mentioned above that the $\ell^p$ spaces are Banach spaces. However,<br />
we glossed over the actual step of showing that they even form<br />
normed vector spaces. It isn’t too difficult to verify that the $\ell^p$ norms<br />
satisfy all of the norm properties, save for the triangle inequality. This<br />
requires something known as Minkowski’s Inequality. However, this<br />
relies on an inequality of vast importance in the theory of Lebesgue<br />
integrals in $L^p$ spaces known as Hölder’s inequality. Before that, we<br />
require a lemma. When working with $\ell^p$ spaces, one often is interested<br />
in the space $\ell^q$ where $\tfrac{1}{p} + \tfrac{1}{q} = 1$. We say that p and q are conjugate<br />
exponents.<br />
Lemma 1. Let a, b ≥ 0 and λ ∈ (0, 1). Then, $a^{\lambda} b^{1-\lambda} \le \lambda a + (1 - \lambda) b$.<br />
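Proof. If b = 0 the inequality is immediate, so assume b > 0. Let $t = \tfrac{a}{b}$ and divide both sides by b. Then, we aim to show<br />
that $t^{\lambda} \le \lambda t + (1 - \lambda)$ for $t \ge 0$. Consider the function $t^{\lambda} - \lambda t$. Differentiating<br />
this expression gives $\lambda (t^{\lambda - 1} - 1)$, which is positive for t < 1 and negative for t > 1, so the function is maximized by the choice<br />
t = 1. At t = 1, we have $1^{\lambda} - \lambda = 1 - \lambda$. Hence, $t^{\lambda} - \lambda t \le 1 - \lambda$, with<br />
equality if t = 1. □<br />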
Theorem 2. Hölder’s Inequality. Let $1 < p < \infty$ and let p, q be conjugate<br />
exponents. Then, $\sum_{n=1}^{\infty} |x(n) y(n)| \le \left( \sum_{n=1}^{\infty} |x(n)|^p \right)^{1/p} \left( \sum_{n=1}^{\infty} |y(n)|^q \right)^{1/q}$.<br />
Proof. Let $x = (x(n)) \in \ell^p$ and let $y = (y(n)) \in \ell^q$. The inequality is<br />
equivalent to showing that $\|xy\|_1 \le \|x\|_p \|y\|_q$, where xy denotes the termwise product. It holds trivially<br />
if x(n) = 0 for all n (or if y(n) = 0 for all n), so assume neither of these<br />
is identically zero. For now, let’s simplify<br />
the problem and assume that $\|x\|_p = \|y\|_q = 1$. Let $a = |x(n)|^p$ and $b = |y(n)|^q$. Then, Lemma 1 with $\lambda = \tfrac{1}{p}$ (so that $1 - \lambda = \tfrac{1}{q}$) gives<br />
$|x(n) y(n)| \le \tfrac{1}{p} |x(n)|^p + \tfrac{1}{q} |y(n)|^q$. If we sum both sides over n, we<br />
arrive at $\|xy\|_1 \le \tfrac{1}{p} \|x\|_p^p + \tfrac{1}{q} \|y\|_q^q = \tfrac{1}{p} + \tfrac{1}{q} = 1$. This holds for all normalized<br />
$x \in \ell^p$, $y \in \ell^q$. The normalization is equivalent to dividing each term<br />
x(n) by $\|x\|_p$. So, to arrive at the inequality for non-normalized nonzero $a \in \ell^p$,<br />
$b \in \ell^q$, we note that this inequality holds for $x = \tfrac{a}{\|a\|_p}$ and $y = \tfrac{b}{\|b\|_q}$.<br />
Then, $\|xy\|_1 \le 1$ implies $\tfrac{\|ab\|_1}{\|a\|_p \|b\|_q} \le 1$. Multiplying by the denominator<br />
yields the desired result. □<br />
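Hölder’s inequality is easy to sanity-check on finite sequences, which sit inside every $\ell^p$ once padded with zeros. The sketch below is an illustration only; the names `p_norm` and `holder_gap` and the random test data are my own choices.<br />

```python
import random


def p_norm(x, p):
    """The l^p norm of a finite sequence x, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_gap(x, y, p):
    """Return ||x||_p * ||y||_q - ||xy||_1, where q is conjugate to p (p > 1)."""
    q = p / (p - 1.0)
    product_norm = sum(abs(a * b) for a, b in zip(x, y))
    return p_norm(x, p) * p_norm(y, q) - product_norm


random.seed(0)
x = [random.uniform(-1, 1) for _ in range(50)]
y = [random.uniform(-1, 1) for _ in range(50)]
for p in (1.5, 2.0, 3.0):
    print(p, holder_gap(x, y, p))  # should never be negative
```

For p = 2 and y = x the gap is zero up to rounding, which matches the equality case t = 1 in Lemma 1.<br />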
Now that we have this inequality, we can prove the triangle inequality<br />
for $\ell^p$ spaces. This is known as Minkowski’s inequality.<br />
Theorem 3. Let $1 \le p \le \infty$ and let $x, y \in \ell^p$. Then, $\|x + y\|_p \le \|x\|_p + \|y\|_p$.<br />
Proof. For p = 1 and p = ∞ the claim follows directly from the triangle inequality for scalars, so let $1 < p < \infty$. To start, we notice that $|x(n) + y(n)|^p = |x(n) + y(n)| \, |x(n) + y(n)|^{p-1}$<br />
$\le (|x(n)| + |y(n)|) \, |x(n) + y(n)|^{p-1}$; all we did was utilize the<br />
regular triangle inequality for scalars. Now, we may sum both sides to<br />
get $\sum_{n=1}^{\infty} |x(n) + y(n)|^p \le \sum_{n=1}^{\infty} |x(n)| \, |x(n) + y(n)|^{p-1} + \sum_{n=1}^{\infty} |y(n)| \, |x(n) + y(n)|^{p-1}$.<br />
We can now apply Hölder’s inequality to each sum on the right, choosing<br />
q to be the conjugate exponent to p. Therefore, we arrive at<br />
$\sum_{n=1}^{\infty} |x(n) + y(n)|^p \le \|x\|_p \, \| |x + y|^{p-1} \|_q + \|y\|_p \, \| |x + y|^{p-1} \|_q$. Factoring<br />
yields $\|x + y\|_p^p \le (\|x\|_p + \|y\|_p) \left( \sum_{n=1}^{\infty} |x(n) + y(n)|^{(p-1)q} \right)^{1/q}$. Notice<br />
that $\tfrac{1}{p} = 1 - \tfrac{1}{q}$, which tells us that (p − 1)q = p. Consequently, we have<br />
$\sum_{n=1}^{\infty} |x(n) + y(n)|^p \le (\|x\|_p + \|y\|_p) \left( \sum_{n=1}^{\infty} |x(n) + y(n)|^p \right)^{1/q}$. If x + y = 0 there is nothing to prove; otherwise, division<br />
by the summation term yields $\|x + y\|_p = \left( \sum_{n=1}^{\infty} |x(n) + y(n)|^p \right)^{1 - 1/q} \le \|x\|_p + \|y\|_p$.<br />
With this, we have the triangle inequality for $\ell^p$ spaces, so they are normed spaces; combined with<br />
completeness, we can conclude that $\ell^p$ spaces are Banach spaces.<br />
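As with Hölder, Minkowski’s inequality can be checked on finite sequences. This is a small sketch with made-up vectors; `p_norm` and `minkowski_gap` are hypothetical helper names, and p = ∞ is handled with the sup norm.<br />

```python
def p_norm(x, p):
    """The l^p norm of a finite sequence x, for 1 <= p <= infinity."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def minkowski_gap(x, y, p):
    """Return ||x||_p + ||y||_p - ||x + y||_p, which should be nonnegative."""
    total = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(total, p)


x = [1.0, -2.0, 0.5, 3.0]
y = [0.5, 4.0, -1.0, -3.0]
for p in (1.0, 1.5, 2.0, 4.0, float("inf")):
    print(p, minkowski_gap(x, y, p))
```

When y is a positive multiple of x the gap vanishes, so the inequality cannot be improved in general.<br />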
From Finite to Infinite Dimension<br />
Recall from linear algebra that a vector space X is finite dimensional<br />
if there is a finite n such that every linearly independent set has at most n vectors.<br />
If X is a vector space where this does not occur, then we say<br />
that X is infinite dimensional. All spaces (besides $\mathbb{C}^n$ and $\mathbb{R}^n$) introduced in the<br />
previous section are infinite dimensional. (Why?) We know from linear<br />
algebra that if the dimension of a space is n, then any set of n linearly<br />
independent vectors serves as a basis. In infinite dimensions, such a basis<br />
would necessarily be infinite. However, it is imperative that the reader<br />
note that if we say a set $Y = \{x_1, x_2, \ldots\}$ is a basis for X (of infinite<br />
dimension), then this means that for every x ∈ X, there exist finitely (!) many<br />
$x_{n_1}, \ldots, x_{n_k} \in Y$ and scalars $c_1, \ldots, c_k$ such that $x = \sum_{i=1}^{k} c_i x_{n_i}$. The linear algebraic definition of<br />
linear combinations only permits finite linear combinations, not infinite<br />
series of such. That does not mean one should disregard the possibility<br />
of generalizing such a notion.<br />
We know that if X is a Banach space and $x_n \to x$, then we mean<br />
$\|x_n - x\| \to 0$. With this notion of convergence, we can generalize our<br />
notion of infinite series from calculus. Let $(x_n)$ be a sequence in X and<br />
define the partial sums $S_n = \sum_{i=1}^{n} x_i$. We say that this series converges if $(S_n)$ is<br />
a Cauchy sequence (equivalently, since X is complete, if there exists S ∈ X such that $\|S_n - S\| \to 0$),<br />
and we say that S is the sum of the infinite series.<br />
Definition 4. Let X be a Banach space and let $(e_n)$ be a sequence in<br />
X. We say that $(e_n)$ is a Schauder basis for X if given any x ∈ X, there<br />
exists a unique sequence of scalars $(\alpha_n)$ such that $x = \sum_{n=1}^{\infty} \alpha_n e_n$, that is,<br />
$\| x - \sum_{n=1}^{k} \alpha_n e_n \| \to 0$ as $k \to \infty$.<br />
This is a generalization of a basis to infinite dimensional spaces. It is<br />
quite handy for $\ell^p$ with $1 \le p < \infty$. Any $x \in \ell^p$ can be thought of as a p-summable<br />
sequence, x = (x(1), x(2), ..., x(n), ...). If we define the basis vectors<br />
$e_1 = (1, 0, 0, \ldots)$, $e_2 = (0, 1, 0, 0, \ldots)$, ..., $e_n = (0, \ldots, 0, 1, 0, \ldots)$ (with the 1 in the $n$th entry), we<br />
obtain a Schauder basis and $x = \sum_{i=1}^{\infty} x(i) e_i$. Notice that if a Banach<br />
space X has a Schauder basis, it is automatically separable (Why?).<br />
On the other hand, given a separable Banach space, does there exist a<br />
Schauder basis? Intuitively, one might guess yes. However, this is not<br />
true. This was a big open problem, which was solved by Per Enflo in<br />
1973 in the negative.<br />
On a more concrete level, let’s take a look at $\ell^\infty$. We know that<br />
$\ell^\infty$ is not separable (see exercises), so it cannot have a Schauder basis.<br />
However, since the canonical Schauder basis $\{e_n\}$ works<br />
so well for $\ell^p$, what’s going wrong in $\ell^\infty$? Consider $x \in \ell^\infty$ given by<br />
x = (1, 1, 1, 1, ...). The natural approximations are the partial sums $\sum_{n=1}^{k} e_n$,<br />
but $\| x - \sum_{n=1}^{k} e_n \|_\infty = 1$ for every k. Consequently, these partial sums do not converge in norm<br />
to x!<br />
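The contrast between the sup norm and the $\ell^2$ norm can be seen on truncated sequences. This is a rough sketch: the truncation length `N` stands in for an infinite sequence and is an assumption of the illustration, as are the helper names. Subtracting the first k terms of the canonical expansion leaves exactly the tail of the sequence.<br />

```python
import math

N = 10_000  # truncation length; a finite stand-in for an infinite sequence

ones = [1.0] * N                               # x = (1, 1, 1, ...)
harmonic = [1.0 / n for n in range(1, N + 1)]  # x(n) = 1/n


def tail_sup(x, k):
    """Sup norm of x minus its first k canonical terms (requires k < len(x))."""
    return max(abs(t) for t in x[k:])


def tail_l2(x, k):
    """l^2 norm of x minus its first k canonical terms."""
    return math.sqrt(sum(t * t for t in x[k:]))


for k in (10, 100, 1_000):
    # The sup-norm error for (1, 1, ...) stays at 1; the l^2 error for 1/n shrinks.
    print(k, tail_sup(ones, k), tail_l2(harmonic, k))
```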
So far, it appears that we can naturally extend familiar notions from<br />
finite dimensions to infinite. However, the big jump between finite<br />
dimension and infinite is the change in topology. Recall that sequential<br />
compactness and compactness are equivalent notions on metric spaces,<br />
and since we are interested solely in normed (hence metric) spaces, we<br />
may take as a definition that compactness is sequential compactness.<br />
Definition 5. Let X be a normed space and M ⊂ X. We say that M<br />
is compact if given any sequence xn ∈ M, there exists a convergent (in<br />
M) subsequence.<br />
It follows that if M is compact, then M is necessarily closed and<br />
bounded. (Why? To see boundedness, choose a sequence such that<br />
||xn|| grows monotonically arbitrarily large. Why does this not have<br />
a convergent subsequence?) We know by Heine-Borel that in $\mathbb{R}^n$ compactness<br />
is equivalent to a set being closed and bounded. It also is true<br />
that in finite dimensional spaces, a set is compact iff it is closed and<br />
bounded.
Theorem 4. Let X be a finite dimensional normed space and M a subset of X.<br />
Then, M is compact iff M is closed and bounded.<br />
Proof. If M is compact, it follows from above that M is closed and<br />
bounded. Conversely, let M be closed and bounded and consider an arbitrary<br />
sequence $(x_m)$ in M. Since this space is finite dimensional, we can choose<br />
a convenient basis $e_1, e_2, \ldots, e_n$, and write $x_m = \alpha_m(1) e_1 + \alpha_m(2) e_2 + \cdots + \alpha_m(n) e_n$.<br />
Then, each $(\alpha_m(i))$ is a bounded sequence of scalars (since M<br />
is bounded; here we use that in finite dimensions the coordinates are controlled by the norm), and hence has a convergent subsequence by Bolzano-Weierstrass.<br />
Extracting subsequences one coordinate at a time (a finite form of a ’diagonalization’ argument), we obtain a single subsequence $(x_{m_k})$ with<br />
$\alpha_{m_k}(i) \to \alpha(i)$ for every i. Define $x = \alpha(1) e_1 + \alpha(2) e_2 + \cdots + \alpha(n) e_n$. Then<br />
$x_{m_k} \to x$. Since $(x_{m_k})$ is convergent, and M is closed, $x_{m_k}$ converges in<br />
M. Hence, we have a convergent subsequence. □<br />
However, in infinite dimensions, a closed and bounded set is not nec-<br />
essarily compact. In fact, the closed unit ball in an infinite dimensional<br />
space is necessarily not compact! To show this, we need a technical<br />
lemma first. The following proof is from Kreyszig’s excellent textbook.<br />
Lemma 2. Riesz’s Lemma: [3] Let X be a normed space and let Y, Z be<br />
subspaces such that Y is closed and Y is strictly contained in Z. Given<br />
any α ∈ (0, 1), there exists z ∈ Z with ||z|| = 1 such that ||z − y|| ≥ α<br />
for all y ∈ Y.<br />
Proof. Let v ∈ Z − Y. We can define the distance from v to Y by<br />
$d = \inf_{y \in Y} \|v - y\|$. Since Y is closed and $v \notin Y$, we have d > 0. Since $\tfrac{d}{\alpha} > d$, by the definition of the infimum we can find<br />
$y_0 \in Y$ such that $d \le \|v - y_0\| \le \tfrac{d}{\alpha}$. If we define $z = \tfrac{v - y_0}{\|v - y_0\|}$, then<br />
z ∈ Z and clearly ||z|| = 1. Additionally, for any y ∈ Y, $\|z - y\| = \tfrac{1}{\|v - y_0\|} \, \| v - y_0 - \|v - y_0\| \, y \|$.<br />
Since Y is a subspace, $y_0 + \|v - y_0\| \, y \in Y$ (call it $y_1$), and hence<br />
$\|z - y\| = \tfrac{1}{\|v - y_0\|} \|v - y_1\| \ge \tfrac{d}{\|v - y_0\|} \ge \tfrac{\alpha}{d} \, d = \alpha$. So, ||z − y|| ≥ α. □<br />
In the following proof, we will be using the fact that finite dimen-<br />
sional subspaces are closed. This fact I have not proved in this section,<br />
but this will be proven once we know the Hahn-<strong>Banach</strong> theorem. There<br />
are ways of proving it without resorting to such drastic measures, but<br />
I prefer it my way. For now, accept on faith that finite dimensional<br />
subspaces are closed.<br />
Theorem 5. Let X be an infinite dimensional normed space. Then,<br />
the closed unit ball B is not compact.<br />
Proof. We can start by picking a point $x_1 \in B$ such that $\|x_1\| = 1$.<br />
Then, this generates a one dimensional subspace, which is closed and, since X is infinite dimensional, proper. Therefore, by Riesz’s Lemma with $\alpha = \tfrac{1}{2}$, we can<br />
choose an $x_2$ such that $\|x_2 - x_1\| \ge \tfrac{1}{2}$ and $\|x_2\| = 1$. Now, consider<br />
the subspace spanned by $x_1$ and $x_2$. This is still finite dimensional,<br />
and hence closed and proper. Therefore, using our previous lemma, we can find<br />
an $x_3$ of norm one such that $\|x_3 - x_2\| \ge \tfrac{1}{2}$ and $\|x_3 - x_1\| \ge \tfrac{1}{2}$.<br />
Iterating this procedure we get a sequence $(x_n)$ in B which has no Cauchy<br />
subsequence because $\|x_n - x_m\| \ge \tfrac{1}{2}$ whenever n ≠ m. So, this sequence can’t<br />
possibly have a convergent subsequence. Therefore, the closed unit ball<br />
is not compact.<br />
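In $\ell^2$ one does not even need Riesz’s Lemma to see such a sequence: the canonical vectors $e_n$ all have norm one and sit at distance $\sqrt{2}$ from each other. The sketch below checks this on a truncation; the dimension `dim` and the helper names are assumptions made for the illustration.<br />

```python
import math
from itertools import combinations


def unit_vector(i, dim):
    """The i-th canonical vector (0-based), truncated to dim coordinates."""
    v = [0.0] * dim
    v[i] = 1.0
    return v


def l2_distance(x, y):
    """Euclidean distance between two finite sequences of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


dim = 8  # hypothetical truncation of l^2
basis = [unit_vector(i, dim) for i in range(dim)]
distances = [l2_distance(u, v) for u, v in combinations(basis, 2)]
print(min(distances), max(distances), math.sqrt(2))
```

Since the pairwise distance does not depend on the truncation, no subsequence of $(e_n)$ can be Cauchy.<br />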
An Introduction to Linear Operators<br />
In the setting of linear algebra and finite dimensions, we are familiar<br />
with mappings between two finite dimensional spaces (say $\mathbb{C}^n \to \mathbb{C}^m$).<br />
With a chosen basis, we can represent linear mappings as matrices.<br />
Our goal here is to generalize the notion of linear mappings between<br />
two vector spaces of arbitrary dimension. We call a mapping between<br />
two vector spaces an operator. From here on out, assume (unless<br />
explicitly stated otherwise) that X and Y refer to normed vector spaces.<br />
Definition 6. Let X,Y be vector spaces. We say that an operator T<br />
is a linear operator if T : X → Y is a linear mapping. That is, for any<br />
scalars α,β, and any x,y ∈ X, T(αx + βy) = αTx + βTy.<br />
It is worth noting that this immediately implies that a linear operator<br />
takes zero to zero. The domain of an operator need not be the whole<br />
space X and the range is not necessarily all of Y . We can extend the<br />
notion of a kernel from linear algebra by saying the kernel (also called<br />
null space) of an operator is given by {x ∈ X : Tx = 0}. If we let<br />
X,Y be finite dimensional, then the linear operators are exactly the<br />
matrices mapping between them. We have more interesting operators<br />
on possibly infinite dimensional spaces. Our goal is to extend notions<br />
from analysis and topology to the infinite dimensional case. Since we<br />
have mappings between spaces, we can try to extend the notion of<br />
continuity and boundedness.<br />
Definition 7. Let X, Y be normed vector spaces with norms given by<br />
$\| \cdot \|_1$ and $\| \cdot \|_2$ respectively. We say that a linear operator T : X → Y<br />
is bounded if there exists a constant C ≥ 0 such that $\|Tx\|_2 \le C \|x\|_1$ for all x ∈ X.<br />
We often may drop the subscripts, which we hypothesize will not<br />
cause too much confusion. However it is important to keep in mind<br />
that the norm on Tx is the Y norm and the norm on x is the X norm.<br />
With this definition, we can define a new vector space given by the<br />
collection of bounded operators between two normed spaces.<br />
Definition 8. Let X, Y be normed vector spaces. Then, define<br />
B(X,Y) = {T | T : X → Y, T linear and bounded}. Then, B(X,Y) is a vector space.<br />
Given T, S ∈ B(X,Y) with bounds $C_T$ and $C_S$, we have $\|(\alpha T + \beta S)x\| \le |\alpha| \|Tx\| + |\beta| \|Sx\| \le$<br />
$|\alpha| C_T \|x\| + |\beta| C_S \|x\| = C \|x\|$, where $C = |\alpha| C_T + |\beta| C_S$. Consequently, for arbitrary scalars α, β,<br />
we have that αT + βS is also a bounded operator. Hence, B(X,Y) is a vector<br />
space. Now that we have a vector space, can we extend other<br />
notions that we have introduced? Can we make a normed space out of<br />
B(X,Y)?<br />
Definition 9. We define the operator norm $\|T\| = \sup \{ \|Tx\| : x \in X, \ \|x\| = 1 \}$.<br />
The operator norm is a well-defined norm (see problems) and with<br />
this, we have that B(X,Y ) is a normed vector space. Before moving<br />
onto more complex topics, it may be worth exploring some examples<br />
of bounded and unbounded operators.
Example 3. Consider C[a,b] with the supremum norm and define<br />
T : C[a,b] → C[a,b] by $(Tf)(t) = \int_a^t f(x) \, dx$. This operator is certainly<br />
linear (since integration is linear), and we know that the function<br />
defined by Tf is still continuous (hence in C[a,b]). Note that, for every t ∈ [a,b],<br />
$|(Tf)(t)| = \left| \int_a^t f(x) \, dx \right| \le \int_a^t \|f\| \, dx \le \int_a^b \|f\| \, dx = (b - a) \|f\|$, so $\|Tf\| \le (b - a) \|f\|$. Hence, T<br />
is a bounded operator with norm at most (b − a).<br />
Example 4. Let P ⊂ C[0, 1] be the set of all polynomials and let P<br />
inherit the supremum norm. This gives a normed vector space. We can<br />
define the differentiation operator T : P → P, which takes the polynomial $t^n$ to $n t^{n-1}$.<br />
Define the sequence $p_n(t) = t^n$. Then $(T p_n)(t) = n t^{n-1}$, so $\|T p_n\| = n$.<br />
Since $\|p_n\| = 1$, we have $\|T p_n\| = n \|p_n\|$. Therefore, we can’t find a<br />
C such that ||Tp|| ≤ C||p|| for all p ∈ P.<br />
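The growth of the ratio ||Tp|| / ||p|| along the monomials can be seen numerically. This is a sketch only: the sup norm is approximated by sampling a grid on [0, 1] (exact here because the maximum sits at t = 1), and the helper names are my own.<br />

```python
def sup_norm(f, samples=1001):
    """Approximate the sup norm on [0, 1] by sampling an even grid with both endpoints."""
    return max(abs(f(k / (samples - 1))) for k in range(samples))


def monomial(n):
    """p_n(t) = t^n."""
    return lambda t: t ** n


def monomial_derivative(n):
    """(T p_n)(t) = n t^(n-1)."""
    return lambda t: n * t ** (n - 1)


for n in (1, 5, 25, 125):
    ratio = sup_norm(monomial_derivative(n)) / sup_norm(monomial(n))
    print(n, ratio)  # the ratio grows with n, so no single constant C can work
```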
The previous two operators are very important; the study of dif-<br />
ferential equations and integral equations naturally relies on these two<br />
operators. The unboundedness of the differentiation operator can make<br />
it rather unwieldy, but it also leads to a very interesting theory of unbounded<br />
operators.<br />
Now that we have a concept of boundedness of a linear mapping,<br />
can we extend the notion of continuity? Generalizing continuity from<br />
functions, we can say that a linear operator T is continuous if given<br />
any convergent sequence xn → x, we have Txn → Tx. With linear<br />
operators, a remarkable equality occurs: boundedness and continuity<br />
are equivalent notions (which is not the case for functions!).<br />
Theorem 6. Let T : X → Y be linear. Then, T is continuous iff T is<br />
bounded.<br />
Proof. Let T be bounded. Then, if $x_n \to x$, we have $\|T(x_n - x)\| \le$<br />
$\|T\| \, \|x_n - x\| \to 0$, and hence $T x_n \to T x$. Hence, T is continuous. Conversely, let<br />
T be continuous at a point $x_0$. Then, for any $\epsilon > 0$ there is a $\delta > 0$ such that $\|Tx - Tx_0\| \le \epsilon$ provided that<br />
$\|x - x_0\| \le \delta$. Choose arbitrary nonzero y ∈ X, and let $x = x_0 + \tfrac{\delta}{\|y\|} y$. Then,<br />
$\|x - x_0\| = \delta$, so $\|T(x - x_0)\| \le \epsilon$, but $\|T(x - x_0)\| = \tfrac{\delta}{\|y\|} \|Ty\|$.<br />
So, $\|Ty\| \le \tfrac{\epsilon}{\delta} \|y\|$. Hence, T is bounded.<br />
Notice that the proof above actually only used continuity at some<br />
point x0, and we showed that T was bounded for arbitrary y. It follows<br />
that continuity at a single point is equivalent to continuity everywhere,<br />
another bizarre feature of linear operators. Now that we have some<br />
grasp on bounded linear operators, what can we say about the space<br />
of bounded linear operators between two normed spaces? Is this space<br />
ever complete? This is actually possible, and the only requirement is<br />
that Y be complete (the domain space X need not be complete!).<br />
Theorem 7. Let X be a normed space and Y a <strong>Banach</strong> space. Then<br />
B(X,Y ) is a <strong>Banach</strong> space.<br />
Proof. Let Tn be a Cauchy sequence in B(X,Y ). Hence, ||Tn−Tm|| ≤ ǫ<br />
for n,m large enough. Then, for any x ∈ X, ||Tn(x) − Tm(x)|| ≤<br />
||Tn−Tm|| ||x|| ≤ ǫ||x||. If we define yn = Tn(x), then ||yn−ym|| ≤ ǫ||x||,<br />
and hence yn is a Cauchy sequence in Y . But, Y is a <strong>Banach</strong> space,<br />
and hence yn converges to y. We can define an operator Tx = y in
this manner for each x. It follows that T is linear since T(x + z) =<br />
lim Tn(x + z) = limTnx + lim Tnz = Tx + Tz. Then, for any x, we<br />
have ||Tmx − Tnx|| ≤ ǫ||x||. Taking limits on the n allows us to arrive<br />
at ||Tx − Tnx|| ≤ ǫ||x||, and hence T − Tn is a bounded operator.<br />
Consequently, T = (T − Tn) + Tn ∈ B(X,Y ). Therefore B(X,Y ) is<br />
complete. Also, note that ||Tn − T || → 0 , and hence Tn → T.<br />
<strong>Problems</strong>.<br />
Problem 7. Show that the kernel of a linear operator is a vector space.<br />
Show that the kernel of a bounded linear operator is closed. Also show<br />
that the range of a linear operator is a vector space.<br />
Problem 8. Show that B(X,Y ) is a normed vector space (assume<br />
X,Y normed vector spaces).<br />
Problem 9. Let k(x,y) be a continuous function on [0, 1] × [0, 1].<br />
Define T : C[0, 1] → C[0, 1] by $(Tf)(y) = \int_0^1 k(x,y) f(x) \, dx$. Show that T is<br />
a bounded linear operator.<br />
Problem 10. Let X be a Banach space, T ∈ B(X,X) and ||T|| < 1. Show that (I − T) is<br />
an invertible operator and that $(I - T)^{-1} = \sum_{n=0}^{\infty} T^n$.<br />
Problem 11. Let $x \in \ell^\infty$. Define $T : \ell^\infty \to \ell^\infty$ by Tx = y where<br />
y = (0, x(1), x(2), ...). Is T linear? Bounded? If so, what is the<br />
norm? Consider $T : \ell^\infty \to \ell^\infty$ defined by Tx = y where $y(n) = \tfrac{x(n)}{n}$.<br />
Show that this is a bounded linear operator. What is the norm of T?<br />
Problem 12. Is the range of a bounded linear operator necessarily<br />
closed? Why or why not? Hint: Consider an operator from the previous<br />
problem.<br />
Problem 13. Let T be a linear operator with the condition that, for some constant b > 0,<br />
||Tx|| ≥ b||x|| for all x ∈ X. Show that $T^{-1}$ exists on the range of T and that $T^{-1}$ is<br />
a bounded linear operator.<br />
Problem 14. Let S, T ∈ B(X,X). Show that ||ST|| ≤ ||S|| ||T||.<br />
An Introduction to Linear Functionals and Dual <strong>Spaces</strong><br />
Now that we know a few things about bounded linear operators, we<br />
can consider a special class of bounded linear operators between normed<br />
spaces and the field of scalars. Let X be a normed space and consider<br />
the set of bounded operators B(X, R) (or B(X, C)). If f ∈ B(X, C),<br />
then f(x) is a scalar, and we say that f is a bounded linear functional.<br />
We call the set B(X, C) the dual space of X, written $X^*$. The study<br />
of normed spaces X relies heavily on exploring the nature of its dual<br />
space. Indeed, we will later see that using $X^*$, we can build a new<br />
topological structure for X called the weak topology. Before we get<br />
too complicated, we should note some basic facts about dual spaces.<br />
The first thing to note is that since C is a Banach space, we know that<br />
$X^*$ is always a Banach space, regardless of whether or not X is. This<br />
follows from the last theorem from the previous section.<br />
Corollary 1. Let X be a normed space. Then, the set of bounded<br />
linear functionals $X^*$ is a Banach space.<br />
Since functionals are linear operators, we can define a norm on them<br />
using the operator norm previously defined. That is, ||f|| = sup{|f(x)| :<br />
||x|| = 1}; the norm of f(x) is just |f(x)| since f(x) is a scalar<br />
value. Consequently, |f(x)| ≤ ||f|| ||x||. Also, since linear functionals<br />
are operators, if the functional is bounded, it is continuous (and vice<br />
versa). Let’s familiarize ourselves with some common examples:<br />
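Example 5. Let X be a normed space. Let f(x) = ||x||. Then, f<br />
is a functional, as it maps a normed space to the field of real numbers.<br />
However, f is not linear: the triangle inequality only gives<br />
||x + y|| ≤ ||x|| + ||y||, and equality can fail. For instance, if x ≠ 0 then<br />
f(x) + f(−x) = 2||x|| > 0, while f(x + (−x)) = f(0) = 0.<br />
Consequently, this is not a linear functional.<br />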
Example 6. Consider C[0, 1] with $Tf = \int_0^1 f(t)\,dt$. Since integration<br />
is a linear operation, T is linear. T : C[0, 1] → R, and hence T is a<br />
linear functional and it is bounded. (Why?)<br />
Example 7. Fix $y \in \mathbb{R}^n$. Then $f(x) = y^T x = y \cdot x$, for $x \in \mathbb{R}^n$,<br />
is a bounded linear functional on $\mathbb{R}^n$.<br />
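As a small numerical illustration of Example 7 and of the bound $|f(x)| \le \|f\|\,\|x\|$, here is a minimal sketch. It assumes the Euclidean norm on $\mathbb{R}^n$, for which the bound is the Cauchy-Schwarz inequality and $\|f\| = \|y\|_2$; the vector y and the helper names are hypothetical choices made only for this illustration.<br />

```python
import math
import random

def dot(u, v):
    """Euclidean inner product y . x on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm2(u):
    """Euclidean norm on R^n (the norm assumed in this sketch)."""
    return math.sqrt(dot(u, u))

y = [3.0, -4.0, 12.0]          # hypothetical fixed vector defining f

def f(x):
    """The functional f(x) = y^T x from Example 7."""
    return dot(y, x)

# |f(x)| <= ||y|| ||x|| for every x (Cauchy-Schwarz), so ||f|| <= ||y||.
random.seed(0)
samples = [[random.uniform(-1, 1) for _ in y] for _ in range(1000)]
bound_holds = all(abs(f(x)) <= norm2(y) * norm2(x) + 1e-12 for x in samples)

# The bound is attained at the unit vector x = y / ||y||, so ||f|| = ||y||.
unit = [c / norm2(y) for c in y]
attained = f(unit)
```

Random sampling only tests the inequality, of course; it is the single vector y/||y|| that shows the supremum in the definition of ||f|| is actually reached.<br />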
In the case of $\ell^p$ for $1 < p < \infty$, $\ell^p$ has its dual given by $\ell^q$ where<br />
$\frac{1}{p} + \frac{1}{q} = 1$. We say that p, q are conjugate exponents. For the case p = 1,<br />
the dual space is given by $\ell^\infty$. However, $\ell^\infty$ has a dual strictly<br />
larger than $\ell^1$. We will both see an explicit example of a functional on<br />
$\ell^\infty$ which is not in $\ell^1$ and settle the question with a quick application of<br />
a clever theorem. Let us demonstrate that the dual space of $\ell^1$ is $\ell^\infty$.<br />
Theorem 8. The dual space of $\ell^1$ is $\ell^\infty$.<br />
Proof. Let $f \in (\ell^1)^*$. From our discussion of bases, we know there is<br />
a canonical basis to choose, given by the vectors $(e_n)$. Then, any<br />
$x \in \ell^1$ can be written as $x = \sum_{n=1}^{\infty} x(n) e_n$. Consequently, we have<br />
$f(x) = \sum_{n=1}^{\infty} x(n) f(e_n)$, since f is linear and bounded (Why is this allowed?).<br />
Let $y(n) = f(e_n)$. This defines a sequence y. Then, $|y(n)| = |f(e_n)| \le \|f\|\,\|e_n\|_1 = \|f\|$.<br />
Consequently, $\sup_n |y(n)| \le \|f\|$. So, $y \in \ell^\infty$. So,<br />
we can identify with any linear functional $f \in (\ell^1)^*$ a sequence in $\ell^\infty$.<br />
Now we need to show that every member of $\ell^\infty$ defines a linear functional<br />
in the manner above. Let $y \in \ell^\infty$. Then, $x \mapsto \sum_{n=1}^{\infty} x(n) y(n)$ defines<br />
a linear functional on $\ell^1$. Boundedness follows since<br />
$\left|\sum_{n=1}^{\infty} x(n) y(n)\right| \le \sum_{n=1}^{\infty} |x(n)|\,|y(n)| \le \sup_n |y(n)| \sum_{n=1}^{\infty} |x(n)| = \|y\|_\infty \|x\|_1$.<br />
So, this shows that every element of $\ell^\infty$ defines a bounded linear functional.<br />
Therefore, we can associate the space $\ell^\infty$ with $(\ell^1)^*$. Lastly, we have<br />
$\|f\| = \|y\|_\infty$. This follows since we earlier showed $|y(n)| = |f(e_n)| \le \|f\|$<br />
for every n, so $\|y\|_\infty \le \|f\|$. We also have $\left|\sum_{n=1}^{\infty} y(n) x(n)\right| \le \sup_n |y(n)|\,\|x\|_1$. Consequently,<br />
$\frac{|f(x)|}{\|x\|_1} \le \sup_n |y(n)|$ for $x \ne 0$, so $\|f\| \le \|y\|_\infty$. So, $\|f\| = \sup_n |y(n)| = \|y\|_\infty$. Therefore, we have<br />
an isometric isomorphism between $\ell^\infty$ and $(\ell^1)^*$.<br />
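The two halves of this proof can be watched in action on finite truncations. The sketch below is only an illustration under that assumption (finitely many coordinates, with hypothetical sample vectors x and y): it checks the bound $|f_y(x)| \le \|y\|_\infty \|x\|_1$ and recovers y from the functional through $y(n) = f_y(e_n)$.<br />

```python
def norm1(x):
    """The l^1 norm of a finitely supported sequence."""
    return sum(abs(t) for t in x)

def norm_inf(y):
    """The sup norm of a (truncated) bounded sequence."""
    return max(abs(t) for t in y)

def pairing(x, y):
    """f_y(x) = sum_n x(n) y(n), truncated to finitely many terms."""
    return sum(a * b for a, b in zip(x, y))

y = [0.5, -2.0, 1.0, 0.25]      # hypothetical bounded sequence (truncated)
x = [1.0, -1.0, 0.5, 4.0]       # hypothetical element of l^1 (truncated)

# Boundedness: |f_y(x)| <= ||y||_inf * ||x||_1.
lhs = abs(pairing(x, y))
rhs = norm_inf(y) * norm1(x)

# y is recovered from f_y by y(n) = f_y(e_n); testing f_y on the basis
# vectors already reaches sup |y(n)|, which is why ||f_y|| = ||y||_inf.
n = len(y)
basis = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
recovered = [pairing(e, y) for e in basis]
best = max(abs(pairing(e, y)) for e in basis)
```

In the genuinely infinite case the supremum of |y(n)| need not be attained at any single index, but it is still approached along the basis vectors, which is all the proof uses.<br />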
Now that we have seen some examples of linear functionals, a ques-<br />
tion arises: what can we say about how many linear functionals exist?<br />
Does a normed space have a rich supply of such functionals? One of<br />
the most important theorems in functional analysis, the Hahn-<strong>Banach</strong><br />
theorem, addresses this question. First, we need to know a few terms<br />
before we can prove this important theorem.
Definition 10. We say that p is a sublinear functional if p : X → R<br />
satisfies p(x + y) ≤ p(x) + p(y) and p(λx) = λp(x) for λ ≥ 0. We say<br />
that p is a semi-norm if p(x + y) ≤ p(x) + p(y) and p(λx) = |λ|p(x) for all scalars λ.<br />
The above statement should look a bit familiar: a norm (or any<br />
semi-norm for that matter) is a sublinear functional. The following is<br />
inspired largely by Folland’s proof and Kreyszig’s proof in their respec-<br />
tive textbooks.<br />
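Not every sublinear functional is a semi-norm, and it may help to see one that is not. The following sketch uses the positive part p(x) = max(x, 0) on R, an example chosen here for illustration (it does not appear in the references above), and checks the two defining properties on a small grid of sample points.<br />

```python
def p(x):
    """Positive part on R: sublinear, but not a semi-norm."""
    return max(x, 0.0)

grid = [k / 4 for k in range(-20, 21)]       # sample points in [-5, 5]
scalars = [0.0, 0.5, 1.0, 3.0]               # nonnegative scalars only

# p(a + b) <= p(a) + p(b) on the grid.
subadditive = all(p(a + b) <= p(a) + p(b) for a in grid for b in grid)

# p(t a) = t p(a) for t >= 0 on the grid.
pos_homogeneous = all(p(t * a) == t * p(a) for t in scalars for a in grid)

# The semi-norm condition p(-x) = |-1| p(x) fails: p(-1) = 0 but p(1) = 1.
seminorm_fails = p(-1.0) != abs(-1.0) * p(1.0)
```

A grid check is of course not a proof, but both properties follow in one line from max(a + b, 0) ≤ max(a, 0) + max(b, 0).<br />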
Theorem 9. (Hahn-<strong>Banach</strong> Theorem) [2] [3] Let X be a real vector<br />
space and let M be a subspace of X, and let f be a linear functional<br />
on M. Let p be a sublinear functional on X such that f(x) ≤ p(x) for all x ∈ M.<br />
Then, there exists a linear functional F on X such that F(x) ≤ p(x)<br />
for all x ∈ X and $F|_M = f$.<br />
Proof. Our first step will be to extend f to a functional defined on a<br />
subspace whose dimension is larger by just one. That is, we will define a<br />
g on M + Rx, where x ∉ M. Once this is done, we will know that a<br />
one-dimensional extension is always possible.<br />
To begin, let $y_1, y_2 \in M$. Then, $f(y_1) + f(y_2) = f(y_1 + y_2) \le$<br />
$p(y_1 + y_2) \le p(y_1 - x) + p(x + y_2)$ by invoking the triangle inequality<br />
property of sublinear functionals. This implies that $f(y_1) - p(y_1 - x) \le$<br />
$p(x + y_2) - f(y_2)$. Consequently, $\sup\{f(y) - p(y - x) : y \in M\} \le$<br />
$\inf\{p(x + y) - f(y) : y \in M\}$. Then, there exists some number α such<br />
that $\sup\{f(y) - p(y - x) : y \in M\} \le \alpha \le \inf\{p(x + y) - f(y) : y \in M\}$.<br />
With this, we may define $g : M + \mathbb{R}x \to \mathbb{R}$ by $g(y + \lambda x) = f(y) + \lambda\alpha$.<br />
Then, g is additive since $g(y_1 + \lambda_1 x + y_2 + \lambda_2 x) = f(y_1 + y_2) + \alpha(\lambda_1 + \lambda_2) =$<br />
$f(y_1) + \alpha\lambda_1 + f(y_2) + \alpha\lambda_2 = g(y_1 + \lambda_1 x) + g(y_2 + \lambda_2 x)$, and homogeneity<br />
is checked in the same way. On the set<br />
M, $g(y) = g(y + 0 \cdot x) = f(y) + \alpha \cdot 0 = f(y)$. Consequently, on M,<br />
g(y) ≤ p(y). However, we need to show that $g(y + \lambda x) \le p(y + \lambda x)$ for all λ.<br />
Note that the definition of a sublinear functional only covers scalars λ ≥ 0, but<br />
in M + Rx, λ ∈ R could be negative, and hence we must account for<br />
two cases (the case λ = 0 is the statement on M). First, let λ > 0.<br />
Then, $g(y + \lambda x) = \lambda\left[f\left(\tfrac{y}{\lambda}\right) + \alpha\right] \le \lambda\left[f\left(\tfrac{y}{\lambda}\right) + p\left(\tfrac{y}{\lambda} + x\right) - f\left(\tfrac{y}{\lambda}\right)\right] = p(y + \lambda x)$.<br />
Now, let λ = −µ < 0. Then, $g(y + \lambda x) = \mu\left[f\left(\tfrac{y}{\mu}\right) - \alpha\right] \le \mu\left[f\left(\tfrac{y}{\mu}\right) - f\left(\tfrac{y}{\mu}\right) + p\left(\tfrac{y}{\mu} - x\right)\right] = p(y - \mu x) = p(y + \lambda x)$.<br />
Therefore, we have $g(y + \lambda x) \le p(y + \lambda x)$ in every case, and<br />
we have proven there exists a one dimensional extension of f.<br />
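To make the choice of α concrete, here is a sketch of the one-dimensional extension step in a toy setting that is an assumption of this illustration, not part of the proof: X = R², p the $\ell^1$ norm, M = {(t, 0)}, f(t, 0) = t, and x = (0, 1). The supremum and infimum are only sampled on a grid, but in this example they are attained on it.<br />

```python
def p(v):
    """Sublinear functional used here: the l^1 norm on R^2."""
    return abs(v[0]) + abs(v[1])

def f(t):
    """f on M = {(t, 0)}: f(t, 0) = t, which satisfies f <= p on M."""
    return t

x = (0.0, 1.0)                                  # direction outside M
ts = [k / 2 for k in range(-40, 41)]            # sample of M (and of scalars)

# sup{ f(y) - p(y - x) } and inf{ p(x + y) - f(y) } over the sampled y in M.
lo = max(f(t) - p((t - x[0], 0.0 - x[1])) for t in ts)
hi = min(p((t + x[0], 0.0 + x[1])) - f(t) for t in ts)

def g(alpha, t, s):
    """Candidate extension g((t, 0) + s x) = f(t) + s * alpha."""
    return f(t) + alpha * s

def dominated(alpha):
    """Check g <= p on the sampled part of M + Rx."""
    return all(g(alpha, t, s) <= p((t, s)) for t in ts for s in ts)
```

Here any α between lo = −1 and hi = 1 gives an extension dominated by p, while a value such as α = 1.5 fails already at the point x itself, since g(x) = 1.5 > 1 = p(x).<br />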
Now, consider the family of all linear extensions F of f satisfying F ≤<br />
p on their domains. We can give this set a partial ordering by set inclusion (of graphs). That<br />
is, if $F_1, F_2$ are extensions such that the domain of $F_1$ is contained<br />
in the domain of $F_2$ and if $F_1 = F_2$ on the domain of $F_1$, then<br />
$F_1 \le F_2$. Now, consider a chain $\{F_\alpha\}$. Then, we have an increasing family<br />
of domains (which are subspaces), and if we take the union, we arrive<br />
at a functional F by defining $F(x) = F_\alpha(x)$ if x is in the domain of<br />
$F_\alpha$. This is well defined because any two members of the chain agree on their common<br />
domain, and F is linear and satisfies F ≤ p because each $F_\alpha$ does. We then have $F_\alpha \le F$ for every α, since the domain<br />
of F is the union over all domains, and $F(x) = F_\alpha(x)$ if x is in the<br />
domain of $F_\alpha$. So, F is an upper bound for this arbitrary chain from<br />
our partially ordered set. Therefore, we know our partially ordered set<br />
(by Zorn’s Lemma) has at least one maximal element, call it F. It must<br />
be that the domain of F is the whole space. If not, we could do a one<br />
dimensional extension (as above), which would give an F′ ≥ F with F′ ≠ F, which<br />
would contradict the maximality of F. Therefore, F is an extension of<br />
f to the whole space which still satisfies F(x) ≤ p(x).<br />
It is important to note that this proof works only for vector spaces<br />
over R. The Hahn-<strong>Banach</strong> theorem can be formulated in the case of a<br />
vector space over C. This merely requires a technical lemma (which we<br />
shall omit) and the proof is then a corollary of the real version of the Hahn-<br />
<strong>Banach</strong> theorem. However, since we often assume our field of scalars<br />
is complex, it is worth stating the theorem:<br />
Theorem 10. (The Complex Hahn-<strong>Banach</strong> Theorem) Let X be<br />
a complex vector space, p a semi-norm on X, M a subspace, and f a<br />
complex linear functional on M such that |f(x)| ≤ p(x) for all x ∈ M. Then, there<br />
exists a complex linear functional F on X such that |F(x)| ≤ p(x) for all x ∈ X<br />
and $F|_M = f$.<br />
Now that we have this powerful theorem, several useful results in-<br />
stantly emerge.<br />
Theorem 11. Let X be a normed vector space.<br />
a: If M is a closed subspace of X and $x \in X \setminus M$, there exists<br />
$f \in X^*$ such that $f(x) \ne 0$ and $f|_M = 0$. We may choose<br />
$\|f\| = 1$ and $f(x) = d(x, M) = \inf_{y \in M} \|x - y\| = \delta$.<br />
b: If $x \ne 0$ in X, there exists $f \in X^*$ such that $f(x) = \|x\|$,<br />
$\|f\| = 1$.<br />
c: The bounded linear functionals separate points.<br />
d: If x ∈ X, we can define $x' : X^* \to \mathbb{C}$ by $x'(f) = f(x)$. We have<br />
that $x'$ is a bounded linear functional on $X^*$, hence $x' \in (X^*)^*$. The map $x \mapsto x'$<br />
isometrically embeds X in $X^{**}$.<br />
Proof. For part (a), we may define f on M + Cx by $f(y + \lambda x) = \lambda\delta$.<br />
Then, f(x) = δ as desired and $f|_M = 0$. Note that $\delta \le \|y + x\|$ for<br />
any y ∈ M, and hence for λ ≠ 0, $|f(y + \lambda x)| = |\lambda|\delta \le |\lambda|\,\|\lambda^{-1} y + x\| = \|y + \lambda x\|$.<br />
So, $|f(z)| \le \|z\|$ for z ∈ M + Cx. If we set p(z) = ||z||, this is a<br />
semi-norm, and hence we may apply Hahn-<strong>Banach</strong> to get a functional F<br />
defined on X such that $|F(z)| \le \|z\|$, $F|_M = 0$, and F(x) = δ.<br />
For part (b), simply use the functional from part (a) with M = {0}, for which δ = ||x||.<br />
For part (c), given two points x, y with x ≠ y, there exists a functional<br />
f such that $f(x - y) = \|x - y\| > 0$, so f(x) ≠ f(y), and hence $X^*$ separates points<br />
in X.<br />
For part (d), if $f, g \in X^*$ and x ∈ X, then $x'(\alpha f + \beta g) = (\alpha f + \beta g)(x) =$<br />
$\alpha f(x) + \beta g(x) = \alpha x'(f) + \beta x'(g)$, and hence $x'$ is a linear functional<br />
on $X^*$. We have $|x'(f)| \le \|f\|\,\|x\|$, so $\|x'\| \le \|x\|$. But, we also have<br />
that (for x ≠ 0) there exists f such that $\|f\| = 1$ and $f(x) = \|x\|$, so $\|x\| = |x'(f)|$<br />
$\le \|x'\|\,\|f\| = \|x'\|$. So, $\|x'\| = \|x\|$.<br />
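In concrete spaces the functional promised by part (b) can often be written down by hand. As a sketch, assume X is $\mathbb{R}^n$ with the p-norm, whose dual norm is the q-norm with $\frac{1}{p} + \frac{1}{q} = 1$ (the finite dimensional version of the duality quoted earlier). The weights below, and the names `pnorm` and `norming_weights`, are hypothetical helpers for this illustration.<br />

```python
import math

def pnorm(v, p):
    """The p-norm of a vector in R^n."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def norming_weights(x, p):
    """Weights w with f(z) = sum w_i z_i, f(x) = ||x||_p and ||w||_q = 1."""
    nx = pnorm(x, p)
    return [math.copysign(abs(t) ** (p - 1), t) / nx ** (p - 1) for t in x]

p = 3.0
q = p / (p - 1.0)               # conjugate exponent, 1/p + 1/q = 1
x = [1.0, -2.0, 0.5, 3.0]       # hypothetical nonzero vector

w = norming_weights(x, p)
f_of_x = sum(a * b for a, b in zip(w, x))   # should equal ||x||_p
norm_of_f = pnorm(w, q)                     # should equal 1
```

The Hahn-<strong>Banach</strong> theorem is what guarantees such a functional in a general normed space, where no formula of this kind is available.<br />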
An interesting question arises from part (d). When (if ever) does<br />
$X = X^{**}$? We see that we can isometrically embed X as a subset of<br />
$X^{**}$. We say that a space is reflexive if this embedding is onto, which we write as $X = X^{**}$. Do not confuse<br />
this notion of reflexivity with the notion of the Alg Lat of an algebra<br />
equaling itself! If we recall that $(\ell^p)^* = \ell^q$ where $\frac{1}{p} + \frac{1}{q} = 1$ and $1 < p < \infty$, then it<br />
follows that $(\ell^p)^{**} = (\ell^q)^* = \ell^p$. Consequently, $\ell^p$ is reflexive for $1 < p < \infty$. However,<br />
for $\ell^1$, we have that its dual is $\ell^\infty$. But, the dual of $\ell^\infty$ is strictly larger<br />
than $\ell^1$, and hence $\ell^1$ is not reflexive.<br />
The following theorem is a neat application of our previous theorem.<br />
I recommend trying it yourself before reading the proof! Note that $\overline{M}$<br />
refers to the closure of M.<br />
Theorem 12. Let M be a subspace of a normed space X. Then, $\overline{M} =$<br />
$\bigcap\{\ker f : f \in X^*, M \subset \ker f\}$. [1]<br />
Proof. Let $N = \bigcap\{\ker f : f \in X^*, M \subset \ker f\}$, and let’s show first that $\overline{M} \subset N$.<br />
Since each $f \in X^*$ is continuous, its kernel is always closed. Consequently, we<br />
are considering an intersection of closed sets that contain M.<br />
Since one can define $\overline{M}$ to be the intersection over all closed sets that<br />
contain M, it follows that $\overline{M} \subset N$. Assume that the containment is<br />
proper; that is, there exists $x_0 \in N$ but not in $\overline{M}$. Then, since $\overline{M}$<br />
is a closed subspace, we can find $f \in X^*$ such that $f|_{\overline{M}} = 0$ and<br />
$f(x_0) = \delta = \operatorname{dist}(x_0, \overline{M}) > 0$. Since f annihilates M, the kernel of f is<br />
included in the intersection that generates N. Hence, $x_0$ cannot be in<br />
N since $f(x_0) \ne 0$, a contradiction. Consequently, $\overline{M} = N$.<br />
<strong>Problems</strong>.<br />
Problem 15. Let X be a normed vector space.<br />
a. Let M be a closed subspace of X and let $x \in X \setminus M$. Show that<br />
M + Cx is closed.<br />
b. Let M be a finite dimensional subspace of X. Show that M is<br />
closed.
Problem 16. If X is a <strong>Banach</strong> space and $X^*$ is separable, show that<br />
X is separable. [2] Hint: This problem is quite tricky. By the definition<br />
of separability, there exists $\{f_n\}$ that is countable and dense in $X^*$.<br />
For each n, try to find an $x_n \in X$ with $\|x_n\| = 1$ such that $|f_n(x_n)| \ge \frac{1}{2}\|f_n\|$.<br />
Argue that one can use these countably many $x_n$ to obtain a countable<br />
dense subset of X.<br />
Problem 17. Without providing a counterexample, prove that $(\ell^\infty)^* \ne \ell^1$.<br />
Hint: Consider the previous question. Also, do note that we know<br />
$\ell^1 \subset (\ell^\infty)^*$.<br />
The Three Big Theorems: Open Mapping, Closed Graph,<br />
and <strong>Banach</strong>-Steinhaus<br />
The Hahn-<strong>Banach</strong> theorem is one of the cornerstones of functional<br />
analysis because it gives us information about the existence of function-<br />
als on normed spaces. However, there are a few other major theorems<br />
we will need to cover. First, we will need a helpful theorem from topol-<br />
ogy.<br />
Theorem 13. The Baire Category Theorem Let X be a complete<br />
metric space. If {Un} is a sequence of open, dense sets in X, then ∩Un<br />
is also dense in X.<br />
Proof. A set is dense in X if it intersects every nonempty open set in X.<br />
Let W be an open set, $W \ne \emptyset$. Our goal is to show that $(\bigcap U_n) \cap W \ne \emptyset$.<br />
Since $U_1$ is open and dense, $U_1 \cap W$ is open and nonempty,<br />
and so contains a closed ball centered about some point $x_0$. Consequently,<br />
there exists a closed ball $B(x_0, r_0) \subset W \cap U_1$. As one might suspect, we can iterate<br />
this procedure, intersecting each time with $U_j$ and finding $x_j, r_j$ such<br />
that the closed ball $B(x_j, r_j) \subset U_j \cap B(x_{j-1}, r_{j-1})$, and we may choose $r_j < 2^{-j}$ at<br />
each turn. Then, the sequence of centers $(x_n)$ forms a Cauchy sequence, since $x_m \in B(x_n, r_n)$ for all m ≥ n.<br />
Since X is complete, $x_n$ converges to some x ∈ X. Each ball $B(x_j, r_j)$ is closed and contains $x_n$ for all n ≥ j, so it contains x. Hence x is contained<br />
in $W \cap (\bigcap_{n=1}^{\infty} U_n)$.<br />
Corollary 2. Let X be a complete metric space. Then, X is not a<br />
countable union of nowhere dense sets.<br />
Proof. Exercise. <br />
This theorem is a purely topological result which depends on com-<br />
pleteness. We know that <strong>Banach</strong> spaces are by definition complete, so<br />
we will utilize the Baire Category Theorem to prove results for <strong>Banach</strong><br />
spaces. This moves us in a more specific direction towards <strong>Banach</strong><br />
spaces, as opposed to the general work we did with normed spaces<br />
before. <strong>Banach</strong> spaces have wonderful properties due to their completeness.<br />
As we discussed before, we can consider series in normed<br />
vector spaces. <strong>Banach</strong> spaces provide a familiar result from calculus:<br />
if a series is absolutely convergent in a <strong>Banach</strong> space, then the series<br />
itself is convergent. In fact, completeness is equivalent to the previous<br />
statement.<br />
Lemma 3. Let X be a normed vector space. X is complete iff every<br />
absolutely convergent series is convergent.
Proof. Let X be complete and let $\sum_{n=1}^{\infty} x_n$ be a series that is absolutely<br />
convergent. That is, $\sum_{n=1}^{\infty} \|x_n\|$ converges. Consider the partial<br />
sums $S_n = \sum_{j=1}^{n} x_j$. Then, for m > n, $\|S_m - S_n\| \le \sum_{j=n+1}^{m} \|x_j\| < \epsilon$ for<br />
n, m large enough, since the series is absolutely convergent. Hence, $(S_n)$ is<br />
a Cauchy sequence, and since X is complete, it converges. On the<br />
other hand, let X have the property that every absolutely convergent<br />
series converges. Let $(x_n)$ be a Cauchy sequence in X. Setting $x_0 = 0$, we can write<br />
$x_n = \sum_{j=1}^{n} (x_j - x_{j-1})$. Our goal here is to express the sequence $(x_n)$<br />
as a series using the above technique. If we can show that such a series<br />
is absolutely convergent, we are done. The full series need not be absolutely convergent, but if we can choose a<br />
subsequence such that $\|x_{n_j} - x_{n_{j-1}}\| < 2^{-j}$, then<br />
we will have an absolutely convergent series. Since $(x_n)$ is Cauchy, we<br />
may choose a subsequence $(x_{n_k})$ such that the difference between succeeding<br />
terms has norm less than $2^{-k}$. Let $y_1 = x_{n_1}$ and $y_k = x_{n_k} - x_{n_{k-1}}$ for k ≥ 2. Then,<br />
$\sum_{j=1}^{\infty} \|y_j\| \le \|y_1\| + \sum_{j=1}^{\infty} 2^{-j} = \|y_1\| + 1$. Hence, the partial sums of this series of norms are bounded<br />
above and monotonic, and hence convergent. Since $\sum y_j$ is absolutely convergent,<br />
by assumption it is itself convergent. But the partial sums telescope, $\sum_{j=1}^{k} y_j = x_{n_k}$, so this sum converging<br />
amounts to saying that $(x_{n_k})$ is a convergent sequence. Since $(x_{n_k})$ is a<br />
subsequence that converges, we have that $(x_n)$ converges to the same<br />
limit (since $(x_n)$ is a Cauchy sequence).<br />
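To see that completeness really is needed, it can help to look at a space that is not complete. The sketch below assumes the space of finitely supported sequences with the sup norm (often written $c_{00}$, a choice made for this illustration) and the series with terms $x_n = 2^{-n} e_n$. Exact fractions are used so that no rounding enters.<br />

```python
from fractions import Fraction

def term(n):
    """x_n = 2^{-n} e_n, stored as a dict {index: value} of finite support."""
    return {n: Fraction(1, 2 ** n)}

def partial_sum(n):
    """S_n = x_1 + ... + x_n, again finitely supported."""
    s = {}
    for j in range(1, n + 1):
        for k, v in term(j).items():
            s[k] = s.get(k, 0) + v
    return s

def sup_norm(v):
    """Sup norm of a finitely supported sequence."""
    return max((abs(t) for t in v.values()), default=Fraction(0))

def diff(u, v):
    """Coordinate-wise difference u - v."""
    keys = set(u) | set(v)
    return {k: u.get(k, 0) - v.get(k, 0) for k in keys}

norm_sum = sum(sup_norm(term(n)) for n in range(1, 31))      # stays below 1
gap = sup_norm(diff(partial_sum(30), partial_sum(10)))       # equals 2^{-11}
support_sizes = [len(partial_sum(n)) for n in (5, 10, 20)]
```

The series is absolutely convergent and its partial sums are Cauchy, yet $S_n$ has exactly n nonzero coordinates, so the only possible limit has infinitely many nonzero entries and does not lie in the space. This is consistent with the lemma, since this space is not complete.<br />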
Definition 11. Let X, Y be <strong>Banach</strong> spaces and let T : X → Y. We say that T is<br />
open if T maps open sets to open sets. Equivalently, every T(B(x, r)) contains a<br />
ball centered about Tx in the space Y.<br />
Another way of looking at this is to consider the action of T on a ball.<br />
Let U be an open set and let B(x, r) ⊂ U be a ball about x with radius<br />
r. Let’s say we require that the image of every ball around a point x<br />
contains a ball around the point Tx. Then, if we consider an arbitrary<br />
open set U ⊂ X, we know that U can be written as a union of open<br />
balls around its points. That is, $U = \bigcup_{x \in U} B(x, r_x)$. Then, with our<br />
requirement, for y = Tx there is a radius $r_y$ with $B(Tx, r_y) = B(y, r_y) \subset T(B(x, r_x)) \subset T(U)$. What this is<br />
saying is that every point y ∈ T(U) has an open ball about it contained in T(U).<br />
Consequently, T(U) is an open set if U is an open set. Therefore, if we<br />
want to show that a map is open, we need only show that given any<br />
open ball B(x, r) in X, T(B(x, r)) contains an open ball in Y about<br />
Tx.<br />
Additionally, if we consider X, Y to be normed spaces and let T be<br />
linear, then to show that a map is open, we merely need to show that T<br />
maps the open unit ball in X to a set that contains a ball about 0 in Y.<br />
To see why this is so, note that since T is a linear map, T(αx) = αTx<br />
and T(x + y) = T(x) + T(y) for all x, y ∈ X, and hence<br />
T commutes with dilations and translations: $T(B(x, r)) = Tx + rT(B(0, 1))$. That<br />
way, instead of showing that T maps every open ball about x to a set<br />
containing an open ball about Tx, we can translate and dilate the ball<br />
in X to the open unit ball. Therefore, all we need to do is show that<br />
the open unit ball in X gets mapped to a set that contains an open<br />
ball about 0 (recall that T(0) = 0).<br />
Theorem 14. Open Mapping Theorem Let X,Y be <strong>Banach</strong> spaces<br />
and let T : X → Y be a surjective, bounded linear operator. Then, T<br />
is open. [2]<br />
Proof. We know that T(X) = Y and we also know that X, Y are<br />
complete spaces. Our goal here is to show that T(B(0, 1)) contains<br />
an open ball about 0 in Y. If one considers the sequence of balls<br />
$B_n = B(0, n)$, then one can see that every x ∈ X is going to eventually<br />
be in one of the balls $B_n$, and hence we may write $X = \bigcup_{n=1}^{\infty} B_n$. Then,<br />
we have $Y = T(X) = T(\bigcup B_n) = \bigcup T(B_n)$. Since Y is a <strong>Banach</strong><br />
space, Y is complete. Consequently, by the Baire Category Theorem,<br />
Y cannot be a countable union of nowhere dense sets. Consequently, there is at<br />
least one set $T(B_n)$ such that $\overline{T(B_n)}$ has non-empty interior. But, this<br />
implies that $\overline{T(B(0, 1))}$ has non-empty interior, since $\overline{T(B(0, n))} = n\,\overline{T(B(0, 1))}$ (note<br />
the use of linearity). This tells us that T(B(0, 1)) cannot be nowhere<br />
dense, so there exist a $y_0 \in Y$ and some<br />
radius r > 0 such that $B(y_0, 4r) \subset \overline{T(B(0, 1))}$. Then, we may choose<br />
a $y_1 \in T(B(0, 1))$ such that $\|y_1 - y_0\| < 2r$. Since $y_1 \in$<br />
$B(y_0, 2r)$, we have that $B(y_1, 2r) \subset B(y_0, 4r) \subset \overline{T(B(0, 1))}$. Additionally, since we<br />
know $y_1 \in T(B(0, 1))$, we know there exists an $x_1 \in B(0, 1)$ with $Tx_1 =$<br />
$y_1$. Let y ∈ Y be arbitrary with ||y|| < 2r. Then, $y = y + (y_1 - Tx_1)$<br />
by the definition of $x_1$. However, since ||y|| < 2r, $y_1 + y \in \overline{T(B(0, 1))}$ and we<br />
then have that $y = -Tx_1 + (y + y_1) \in \overline{T(-x_1 + B(0, 1))} \subset \overline{T(B(0, 2))}$.<br />
Hence $y \in \overline{T(B(0, 2))}$ whenever ||y|| < 2r. Dividing by 2 and noting<br />
the linearity of T, we have that if ||y|| < r, then $y \in \overline{T(B(0, 1))}$. So<br />
far, what we’ve shown is that we found an r (using the Baire Category<br />
Theorem) such that if y ∈ Y with ||y|| < r, then $y \in \overline{T(B(0, 1))}$. We’re<br />
very close to showing that T(B(0, 1)) contains an open ball about 0.<br />
Our problem is that we can only do this for the closure $\overline{T(B(0, 1))}$. We need to discard<br />
the closure part, and then we will have our result.<br />
Using our dilation trick some more, we see that if $\|y\| < 2^{-n} r$, we<br />
have that $y \in \overline{T(B(0, 2^{-n}))}$. Now, let $\|y\| < \frac{r}{2}$. Then, $y \in \overline{T(B(0, \frac{1}{2}))}$,<br />
and hence we can find an $x_1 \in B(0, \frac{1}{2})$ such that $\|y - Tx_1\| < \frac{r}{4}$.<br />
Now, since $\|y - Tx_1\| < \frac{r}{4}$, we know $y - Tx_1$ is in $\overline{T(B(0, \frac{1}{4}))}$. So, we<br />
can find an $x_2 \in B(0, \frac{1}{4})$ such that $\|(y - Tx_1) - Tx_2\| < \frac{r}{8}$.<br />
Now, we can proceed inductively to find an $x_n \in B(0, 2^{-n})$ such that<br />
$\|y - \sum_{j=1}^{n} Tx_j\| < 2^{-n-1} r$. Consider the series $\sum_{n=1}^{\infty} x_n$. Since $\|x_n\| < 2^{-n}$,<br />
we have that $\sum_{n=1}^{\infty} \|x_n\| < \sum_{n=1}^{\infty} 2^{-n} = 1$. Therefore, we have that this<br />
series is absolutely convergent. By our previous lemma, we have that<br />
since X is a complete space, any absolutely convergent series converges,<br />
and hence $\sum_{n=1}^{\infty} x_n$ converges in X. Let the series sum be denoted by<br />
x. Then, since T is continuous, $\|y - Tx\| = \lim_n \|y - \sum_{j=1}^{n} Tx_j\| = 0$, so y = Tx. So, y ∈ T(B(0, 1)) since ||x|| < 1.<br />
Consequently, T(B(0, 1)) contains all y such that $\|y\| < \frac{r}{2}$. This implies<br />
that we have a ball about 0 contained in T(B(0, 1)) and hence T is an<br />
open map.<br />
Recall that a function f : X → Y between two topological spaces<br />
is continuous if given any open V ⊂ Y, we have that $f^{-1}(V)$ is open. This<br />
is equivalent to our definition for continuous linear operators between<br />
normed spaces. Now let T be a bijection, so that $T^{-1}$ exists. By definition, $T^{-1}$ is continuous exactly when<br />
the preimage $(T^{-1})^{-1}(U) = T(U)$ is open for every open U ⊂ X, that is, exactly when T is open.<br />
Therefore, if T is a bijection, then showing that $T^{-1}$ is bounded<br />
is equivalent to showing that T is open. With these considerations, we<br />
have the very useful corollary to the Open Mapping theorem:<br />
Theorem 15. (The Bounded Inverse Theorem) Let X, Y be <strong>Banach</strong><br />
spaces and let T ∈ B(X, Y) be a bijection. Then, $T^{-1}$ exists and is<br />
a bounded linear operator.<br />
We are about to approach a terminology disaster: by definition,<br />
an open linear operator maps open sets to open sets.<br />
One might suspect that a closed linear operator maps closed sets to<br />
closed sets. Unfortunately, that is not the definition.<br />
Definition 12. Let T : X → Y be a linear operator between two<br />
normed spaces. Let the graph of T be defined as G(T) = {(x,y) ∈<br />
X × Y : Tx = y}. We say that T is a closed operator if G(T) is a<br />
closed subset of X × Y in the product topology.<br />
It’s a bit confusing at first to see what exactly this means: a comparison<br />
between continuity and closedness is best. If T is continuous,<br />
then given any convergent sequence $x_n \to x$ in X, we have that $Tx_n$ is a<br />
convergent sequence in Y with $Tx_n \to Tx$. If T is a closed operator, it does not follow<br />
that $x_n \to x$ implies $Tx_n \to Tx$. However, let’s say $x_n \to x$ and that<br />
$Tx_n$ does converge to something in Y, say $Tx_n \to y$. Then, if T is<br />
closed, it follows that y = Tx. Therefore, to show that $Tx_n \to Tx$,<br />
we first need to know that $Tx_n$ is a convergent sequence in Y. We see<br />
that if T is bounded (continuous), then automatically T is closed. So,<br />
closed linear operators generalize the notion of bounded linear operators.<br />
Why are they worth the trouble? Well, it turns out that our<br />
favorite unbounded operator, $\frac{d}{dx}$, is a closed linear operator on certain<br />
<strong>Banach</strong> spaces. Due to the importance of this operator in differential<br />
equations, it seems fair that closed operators deserve a bit of attention.<br />
In the applied world, physics, especially quantum mechanics,<br />
deals with unbounded linear operators that are closed (such as the differentiation<br />
operator). Although not bounded, closed linear operators<br />
still have some acceptable behavior, notably that there are many positive<br />
results about them in spectral theory. So far, we see that being<br />
bounded implies being closed. Fortunately, a closed linear operator is<br />
bounded if T : X → Y is defined on all of X and X and Y are <strong>Banach</strong> spaces.<br />
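The claim that differentiation is unbounded with respect to the supremum norm can be checked on the functions $f_n(t) = t^n$ on [0, 1]. The sketch below is only a numerical illustration: it approximates the sup norm by a maximum over a grid (exact here, because both $t^n$ and its derivative are largest at the endpoint t = 1, which is a grid point), and the derivative is computed by hand rather than numerically.<br />

```python
def sup_norm(g, points):
    """Supremum norm approximated by a maximum over sample points."""
    return max(abs(g(t)) for t in points)

grid = [k / 1000 for k in range(1001)]      # sample of [0, 1], endpoints included

def f(n):
    """The function t -> t^n on [0, 1]."""
    return lambda t: t ** n

def df(n):
    """Derivative of t^n, computed by hand: n t^(n-1)."""
    return lambda t: n * t ** (n - 1)

# ||f_n'|| / ||f_n|| in the sup norm, for a few values of n.
ratios = [sup_norm(df(n), grid) / sup_norm(f(n), grid) for n in (1, 5, 25, 125)]
```

Each $f_n$ has norm 1 while its derivative has norm n, so no constant c can satisfy $\|f'\| \le c\|f\|$ for all f. This shows only the unboundedness; closedness of the operator is a separate statement.<br />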
Theorem 16. Closed Graph Theorem Let X,Y be <strong>Banach</strong> spaces<br />
and let T : X → Y be a closed linear operator. Then, T is bounded. [2]<br />
Proof. By the definition of the product topology, the projection operator<br />
$\pi_1 : X \times Y \to X$ given by $\pi_1(x, y) = x$ is a continuous mapping.<br />
The same holds for $\pi_2 : X \times Y \to Y$, $\pi_2(x, y) = y$. Consequently, restricting to the graph,<br />
$\pi_1 \in B(G(T), X)$ and $\pi_2 \in B(G(T), Y)$. We know that X and Y are<br />
<strong>Banach</strong> spaces, and hence X, Y are both complete spaces. The product<br />
of two complete spaces is complete. By assumption, T is a closed<br />
linear operator, and hence G(T) is a closed subspace of X × Y, so G(T) is itself a <strong>Banach</strong> space. Notice that<br />
$Tx = \pi_2(\pi_1^{-1}(x))$, where $\pi_1$ is restricted to G(T). Consequently, $T = \pi_2 \circ \pi_1^{-1}$. We have that $\pi_1$ is<br />
one to one and onto, and hence a bijection of G(T) onto X. Since it<br />
is also bounded, we have that $\pi_1^{-1}$ is bounded by the Bounded Inverse<br />
Theorem. Then, $T = \pi_2 \circ \pi_1^{-1}$ is a bounded operator.<br />
So far we have hit two of the big theorems in functional analysis, and<br />
one more remains. The <strong>Banach</strong>-Steinhaus theorem, also known as the Principle<br />
of Uniform Boundedness, is an extraordinarily powerful theorem that<br />
allows one to jump from pointwise estimates on a family of operators<br />
to a uniform estimate on their operator norms. As you will see<br />
from doing the problems, this theorem makes short work of otherwise<br />
daunting exercises.<br />
Theorem 17. (The <strong>Banach</strong>-Steinhaus Theorem) Let X be a Banach<br />
space and Y a normed vector space and let A ⊂ B(X, Y). If<br />
$\sup_{T \in A} \|Tx\| < \infty$ for all x ∈ X, then $\sup_{T \in A} \|T\| < \infty$.<br />
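Before the proof, it is worth seeing that the completeness of X cannot simply be dropped. The sketch below assumes the (incomplete) space of finitely supported sequences with the sup norm and the functionals $T_n x = n\,x(n)$; both are choices made for this illustration and are not taken from the references. Sequences are stored as Python lists and treated as zero beyond their length.<br />

```python
def T(n, x):
    """T_n x = n * x(n) for a finitely supported sequence x (1-indexed)."""
    return n * (x[n - 1] if n <= len(x) else 0.0)

def e(n):
    """Basis vector e_n, stored as a list of length n; its sup norm is 1."""
    return [0.0] * (n - 1) + [1.0]

x = [1.0, -0.5, 0.25, 2.0]                 # hypothetical finitely supported sequence

# For a fixed x, T_n x vanishes once n passes the support, so the sup is finite.
pointwise_sup = max(abs(T(n, x)) for n in range(1, 1000))

# But |T_n e_n| = n with ||e_n|| = 1, so ||T_n|| >= n is unbounded in n.
norm_lower_bounds = [abs(T(n, e(n))) for n in (1, 10, 100)]
```

So the family is pointwise bounded but not uniformly bounded, which does not contradict the theorem because this space is not a <strong>Banach</strong> space.<br />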
Proof. Let $E_n = \{x \in X : \sup_{T \in A} \|Tx\| \le n\} = \bigcap_{T \in A} \{x \in X : \|Tx\| \le n\}$.<br />
Then $E_n$ is a closed set since it is the intersection of closed sets, and<br />
$X = \bigcup E_n$ by hypothesis. Since X is a <strong>Banach</strong> space, X is complete, and by the Baire<br />
Category theorem, at least one set $E_n$ is not nowhere dense. Consequently,<br />
since $E_n$ is closed, we can find an open ball in $E_n$, and hence we may<br />
find a closed ball inside of it. So, let’s denote this closed ball by $B(x_0, r) \subset E_n$<br />
for some r > 0. Let x ∈ X satisfy ||x|| ≤ r, so $x + x_0 \in E_n$. We have<br />
$\|Tx\| = \|T(x + x_0) - Tx_0\| \le \|T(x + x_0)\| + \|Tx_0\| \le n + n = 2n$.<br />
This holds for all T ∈ A and x ∈ X with ||x|| ≤ r since $x + x_0$ and<br />
$x_0$ lie in $B(x_0, r) \subset E_n$. So, the closed ball of radius r about 0 is contained in $E_{2n}$. Now any nonzero z ∈ X can be written as<br />
$z = \frac{\|z\|}{r} x$ with ||x|| = r, so $\|Tz\| = \frac{\|z\|}{r}\|Tx\| \le \frac{2n}{r}\|z\|$. So,<br />
$\|T\| = \sup_{z \ne 0} \frac{\|Tz\|}{\|z\|} \le \frac{2n}{r}$ for every T ∈ A, and hence $\sup_{T \in A} \|T\| \le \frac{2n}{r} < \infty$.<br />
<strong>Problems</strong>.<br />
Problem 18. Consider the <strong>Banach</strong> space C[0, 1] with the supremum<br />
norm. Consider the subset of C[0, 1] of once continuously differentiable<br />
functions, $C^1[0, 1]$. [2]<br />
a. Show that $C^1[0, 1]$ is not a closed subset of C[0, 1] and hence not<br />
complete.<br />
b. Consider the operator $\frac{d}{dx} : C^1[0, 1] \to C[0, 1]$. Show that this is<br />
a closed linear operator.<br />
Problem 19. Let X be a <strong>Banach</strong> space with respect to two different<br />
norms, $\|\cdot\|_1$ and $\|\cdot\|_2$, with the property that $\|x\|_1 \le \|x\|_2$ for all x ∈ X. Show<br />
that these norms are equivalent norms. That is, there exist constants<br />
A, B > 0 such that $A\|x\|_1 \le \|x\|_2 \le B\|x\|_1$. [2]<br />
Problem 20. Let X, Y be <strong>Banach</strong> spaces. Let T : X → Y be a linear<br />
map such that given any $f \in Y^*$, $f \circ T \in X^*$. Show that T is a bounded<br />
operator. [2]<br />
Problem 21. Let X,Y be <strong>Banach</strong> spaces and let Tn be a sequence of<br />
bounded operators such that limTnx exists for all x ∈ X. Show that<br />
the operator defined by the pointwise limit is both linear and bounded.<br />
[2]<br />
Problem 22. Let X be a vector space of countably infinite dimension.<br />
Show that there is no norm such that this space is complete. Hint:<br />
Remember my warning from before: linear algebraic bases only allow<br />
finite combinations! [2]<br />
Problem 23. Let X be a Banach space and let $\{x_n\}$ be a sequence such<br />
that the set $\{f(x_n)\}$ is bounded for each $f \in X^*$. Show that $\{\|x_n\|\}$ is<br />
bounded. Big Hint: Look back to the consequence of the Hahn-<strong>Banach</strong><br />
theorem. Remember that we can isometrically embed $X \subset X^{**}$. [3]<br />
Problem 24. Define $T_n = S^n$ where $S : \ell^2 \to \ell^2$ is given by $Sx =$<br />
$S(x(1), x(2), \ldots) = (x(2), x(3), \ldots)$. Bound $\|T_n x\|$ and calculate<br />
$\lim \|T_n\|$. [3]<br />
Topologies and Functionals: Weak and Weak-*<br />
Topologies<br />
As we’ve seen in our previous analysis courses, occasionally we want<br />
to be a bit more flexible with convergence. For example, although<br />
uniform convergence of a sequence of functions is wonderful, sometimes<br />
this is too restrictive and we can use a weaker form of convergence, such<br />
as pointwise convergence. When you study integration theory, you will<br />
see several types of convergence, such as convergence in measure, convergence in $L^1$,<br />
pointwise convergence, and pointwise convergence almost everywhere. Analysis is full of different<br />
types of convergence, and each has its uses. We will explore a<br />
topology built by linear functionals known as the weak topology.<br />
Definition 13. We say that a sequence $(x_n)$ in X converges weakly to<br />
x ∈ X if $f(x_n) \to f(x)$ for all $f \in X^*$.<br />
It follows then that if $x_n \to x$ in the usual sense (in the norm<br />
topology), then $x_n$ is weakly convergent to x (Why?). From a topological<br />
viewpoint, the norm topology generates a collection of open sets from<br />
the open balls; call this $\tau_N$. The weak topology is a weaker topology<br />
$\tau_W$. Being weaker means that $\tau_W \subset \tau_N$. Another way of viewing<br />
this topology is that it is the weakest topology on X such that the<br />
functionals in $X^*$ remain continuous. That is, $\tau_W$ is generated by<br />
the sets $f^{-1}(U)$ for all open sets U in the scalar field and all $f \in X^*$. If this is confusing,<br />
don’t worry: the main thing to understand is that weak convergence<br />
means that $f(x_n) \to f(x)$ for all $f \in X^*$.<br />
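A standard example to keep in mind is the sequence of basis vectors $e_n$. The sketch below uses the duality $(\ell^p)^* = \ell^q$ quoted earlier, so that a functional is given by a sequence y and $f_y(e_n) = y(n)$. The two particular sequences y are hypothetical choices for illustration, and testing a single functional is of course not a proof of weak convergence.<br />

```python
def pair_with_basis(y, n):
    """f_y(e_n) = sum_k e_n(k) y(k) = y(n), with y given as a function of k."""
    return y(n)

def y_square_summable(k):
    """y(k) = 1/k is square-summable, so it defines a functional on l^2."""
    return 1.0 / k

def y_constant(k):
    """y(k) = 1 is bounded, so it defines a functional on l^1."""
    return 1.0

indices = (1, 10, 100, 1000)
values_on_l2 = [pair_with_basis(y_square_summable, n) for n in indices]
values_on_l1 = [pair_with_basis(y_constant, n) for n in indices]
```

In $\ell^2$ every square-summable y has y(n) → 0, so $e_n$ converges weakly to 0 even though $\|e_n\| = 1$ for all n, and weak convergence is strictly weaker than norm convergence there. In $\ell^1$ the constant sequence gives $f_y(e_n) = 1$ for every n, so $e_n$ does not converge weakly to 0 in that space.<br />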
Now that we have a new form of convergence, we should get some<br />
basic properties down to familiarize ourselves with it. Since we are dealing<br />
with functionals, one should expect to see the Hahn-<strong>Banach</strong> theorem<br />
(in the guise of one of its many corollaries) or one of the three big<br />
theorems pop up often. Evidence for the previous sentence is in the<br />
following proof:<br />
Lemma 4. Let $(x_n)$ be a weakly convergent sequence in a normed space<br />
X with weak limit x. Then:<br />
a. The weak limit x of $(x_n)$ is unique.<br />
b. Every subsequence of $(x_n)$ converges weakly to x.<br />
c. The sequence $\|x_n\|$ is bounded (Exercise from previous section).<br />
Proof. For (a), assume $x_n$ converges to both x and y weakly, and suppose<br />
x ≠ y. Then ||x − y|| > 0, and hence there exists a functional f (Why?) such<br />
that $\|x - y\| = f(x - y) = f(x) - f(y) = \lim f(x_n) - \lim f(x_n) = 0$.<br />
So, ||x − y|| = 0, a contradiction.<br />
For (b), given a subsequence $(x_{n_k})$ and $f \in X^*$, $f(x_{n_k})$ is a<br />
subsequence of the scalar sequence $f(x_n)$. Since $f(x_n)$ converges to f(x), $f(x_{n_k})$ converges to the<br />
same limit, and this holds for all f. Hence $(x_{n_k})$ converges weakly to x.<br />
For (c), see the previous set of problems.<br />
We often refer to convergence in norm (i.e., xn → x means ||xn −<br />
x|| → 0) as strong convergence (to contrast it with weak convergence).
We know that every strongly convergent sequence is weakly convergent;<br />
does the converse ever hold? In finite dimensions,<br />
the answer is yes. The answer is also yes in some infinite dimensional<br />
spaces, as the following example shows.<br />
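Example 8. Consider ℓ 1 , which has dual given by ℓ ∞ . Let xn → x<br />
weakly. That is, for every y ∈ ℓ ∞ , $\sum_{k=1}^{\infty} x_n(k) y(k) \to \sum_{k=1}^{\infty} x(k) y(k)$.<br />
We may choose y = ek, which tells us that coordinate-wise, xn(k) → x(k) for<br />
all k. Now, let’s try showing that ||xn − x|| → 0. That is, $\sum_{k=1}^{\infty} |x_n(k) - x(k)| \to 0$.<br />
Let ε > 0 be given. First, note that since x ∈ ℓ 1 , we have xn − x ∈ ℓ 1 .<br />
That is, $\sum_{k=1}^{\infty} |x_n(k) - x(k)| < \infty$. Therefore,<br />
the tail of this series must tend to zero. So, there exists N such that<br />
$\sum_{k=N+1}^{\infty} |x_n(k) - x(k)| < \frac{\epsilon}{2}$.<br />
However, by our pointwise convergence,<br />
there exists an M such that for n > M, we have<br />
$\sum_{k=1}^{N} |x_n(k) - x(k)| < \frac{\epsilon}{2}$. Putting the two together, we have for<br />
n > M, $\sum_{k=1}^{\infty} |x_n(k) - x(k)| < \epsilon$, and hence ||xn − x|| < ε for n > M.<br />
Consequently, xn strongly converges to x.<br />
A word of caution: for this argument to work, the same N must serve for every n,<br />
and coordinate-wise convergence alone does not provide this (consider xn = en, whose<br />
coordinates tend to zero although ||en|| = 1). Producing such a uniform tail estimate<br />
is where the full strength of weak convergence is needed; the resulting statement is<br />
known as Schur’s theorem.<br />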
Now that we’ve seen that weak and strong convergence<br />
can coincide in infinite dimensions, let’s justify that in<br />
finite dimensions, the two are always the same.<br />
Theorem 18. Let X be a finite dimensional normed vector space such<br />
that xn weakly converges to x. Then, xn strongly converges to x.<br />
Proof. Since X is finite dimensional, there exists a basis {e1, e2, ..., ek}<br />
such that $x_n = \sum_{j=1}^{k} \alpha_n(j) e_j$, where αn(j) is the j-th coordinate<br />
of xn; similarly, write $x = \sum_{j=1}^{k} \alpha(j) e_j$. We may choose a set of functionals such that fj(ej) = 1 and<br />
fj(em) = 0 for m ≠ j (these are bounded, since X is finite dimensional). Then, since fj(xn) → fj(x) by assumption, this<br />
tells us that fj(xn) = αn(j) → fj(x) = α(j). Consequently, we have<br />
a convergent sequence of scalars in each coordinate. Then, we have<br />
$||x_n - x|| = ||\sum_{j=1}^{k} (\alpha_n(j) - \alpha(j)) e_j|| \le \sum_{j=1}^{k} |\alpha_n(j) - \alpha(j)| \, ||e_j||$.<br />
Since each sequence of scalars αn(j) − α(j) goes to zero, the finite sum tends to<br />
zero. □<br />
This explains partly why you likely haven’t heard about topics like<br />
weak convergence in your calculus or linear algebra classes: it’s all the<br />
same in finite dimensions! Now that we know a thing or two about<br />
weak convergence, is there an equivalent or simpler way of describing if<br />
a sequence will weakly converge? Yes, there is a very handy way where<br />
we only need to show that f(xn) → f(x) on a total subset of X ∗ .<br />
Definition 14. Let X be a normed vector space and let M ⊂ X. We<br />
say that M is a total subset if the span of M is dense in X.<br />
Theorem 19. Let X be a normed vector space. Then, xn converges<br />
weakly to x iff ||xn|| is a bounded sequence and f(xn) → f(x) for every<br />
functional f in some total subset M ⊂ X ∗ . [3]<br />
Proof. Let xn converge weakly to x. Then, by a previous problem,<br />
||xn|| is a bounded sequence. Additionally, since f(xn) → f(x) for<br />
all f ∈ X ∗ , we of course have that f(xn) → f(x) for a total subset<br />
M ⊂ X ∗ .<br />
The converse is a bit trickier. By assumption, ||xn|| ≤ c for all n (enlarging c<br />
if necessary, we may also assume ||x|| ≤ c and c > 0), and we have<br />
some total subset M ⊂ X ∗ on which f(xn) → f(x). We need to show that |f(xn) − f(x)| → 0<br />
for all f ∈ X ∗ . We know this holds true for our total set. The trick<br />
here is to use an ε/3 argument. Let f ∈ X ∗ and ε > 0 be given. We can choose an fj ∈ span M such that<br />
$||f - f_j|| < \frac{\epsilon}{3c}$, since M is total. Since fj is a finite linear combination of<br />
functionals from M, we have $|f_j(x_n) - f_j(x)| < \frac{\epsilon}{3}$ for all sufficiently large n.<br />
Then, for such n, we have<br />
$|f(x) - f(x_n)| \le |f(x) - f_j(x)| + |f_j(x) - f_j(x_n)| + |f_j(x_n) - f(x_n)| \le ||f - f_j|| \, ||x|| + \frac{\epsilon}{3} + ||f - f_j|| \, ||x_n|| < \epsilon$.<br />
Note that it was imperative that ||xn|| was bounded for this trick to<br />
work. □<br />
It may not be immediately obvious why this previous theorem is so<br />
helpful: we still have to find a total subset of X ∗ and then show that<br />
f(xn) → f(x) for all of those. Well, it turns out in some spaces, working<br />
with a total subset is extremely easy! Consider ℓ p for 1 < p < ∞.<br />
Then ℓ q is the dual space, where q is the conjugate to p. Then {en}<br />
is a Schauder basis, which is a total subset. From this, we can show<br />
that xn converges weakly to x iff ||xn|| is bounded and xn(k) → x(k) for every k.<br />
(Why?) That is, a sequence is weakly convergent iff it is norm bounded<br />
and coordinate-wise convergent.<br />
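A quick numerical sketch of this criterion in ℓ 2 (the case p = 2), not taken from the notes: the unit vectors en are norm bounded and their coordinates tend to zero, so they should converge weakly to 0, yet ||en|| = 1 shows they do not converge strongly.<br />
The truncation length LENGTH and the test vector y(k) = 1/k are assumptions made only for illustration; pairing against a single y is of course not a proof of weak convergence.<br />

```python
import math

def unit_vector(n, length):
    """Return the first `length` coordinates of e_n (1-based index n)."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

def pairing(x, y):
    """Finite version of the pairing sum_k x(k) y(k) (real scalars)."""
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    """Euclidean (l^2) norm of a finite real sequence."""
    return math.sqrt(sum(a * a for a in x))

LENGTH = 1000  # truncation length, an assumption of this sketch
y = [1.0 / k for k in range(1, LENGTH + 1)]  # a fixed element of l^2

for n in (1, 10, 100, 1000):
    e_n = unit_vector(n, LENGTH)
    # the pairing equals 1/n and tends to 0, while the norm stays equal to 1
    print(n, pairing(e_n, y), norm2(e_n))
```

The same loop with any other square-summable y would show the pairing tending to zero, since it simply reads off the n-th coordinate of y.<br />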
So far, if X is a normed vector space, we have given it a new<br />
topology. We know that if X is a normed vector space, X ∗ is a <strong>Banach</strong><br />
space (even if X is not!), and hence we can consider giving it a weak<br />
topology. One can do this by considering the weak topology generated<br />
by X ∗∗ . However, the more important topology on X ∗ is the weak-<br />
* topology generated by X regarded as a space of linear functionals<br />
acting on X ∗ . That is, we look at the weak topology on X ∗ generated<br />
by X ⊂ X ∗∗ . More concretely, if fn is a sequence of functionals in X ∗ ,
we say that fn is weak-* convergent to f if for all x ∈ X, x(fn) → x(f)<br />
(where x is acting as a linear functional on X ∗ ). But, we know that this<br />
just means for all x ∈ X, fn(x) → f(x). That is, convergence in the weak-* topology<br />
on X ∗ is just pointwise convergence!<br />
Definition 15. Let X be a normed vector space and let X ∗ be the<br />
dual. We say that a sequence fn converges weak-* to f ∈ X ∗ if for<br />
every x ∈ X, fn(x) → f(x).<br />
It may not seem immediately obvious why we even bother using the<br />
weak-* topology on X ∗ . We see that the weak topology on X is ben-<br />
eficial because it is more flexible in letting sequences converge. There<br />
is a topological reason which makes the weak-* topology extremely<br />
convenient. Recall that we showed that the closed unit ball in an infinite<br />
dimensional normed space is never compact in the norm topology. Well, it turns out<br />
that the closed unit ball in X ∗ is compact in the<br />
weak-* topology. Note: if you’re not familiar with Tychonoff’s theorem,<br />
feel free to skip this proof. Make sure to familiarize yourself with<br />
Tychonoff’s theorem at some point, as it is a very useful theorem from<br />
topology. For the benefit of the reader, I will restate it here:<br />
Theorem 20. Tychonoff’s Theorem. Let {Xα} be a family of compact<br />
topological spaces. Then, X = ΠαXα is compact in the product<br />
topology.<br />
On the other hand, if X = ΠαXα is a compact space, then since each<br />
πα is a continuous map, each Xα is also compact (continuous functions<br />
map compact sets to compact sets).
Theorem 21. Alaoglu’s Theorem. Let X be a normed vector space.<br />
Then, the closed unit ball B ∗ = {f ∈ X ∗ : ||f|| ≤ 1} is compact in the<br />
weak-* topology. [2]<br />
Proof. For every x ∈ X, we can define Dx = {z ∈ C : |z| ≤ ||x||}<br />
and define D = Πx∈XDx. Note that each Dx is compact (Why?), and<br />
hence by Tychonoff’s theorem, D is compact. Here’s the trick to this<br />
theorem: What does it mean if φ ∈ D? If φ ∈ D, then φ associates<br />
with each x ∈ X a complex scalar in the x th coordinate. Therefore, we<br />
may identify φ with a scalar-valued function on X satisfying |φ(x)| ≤ ||x||. Such a<br />
function is not necessarily linear though! All we know so far is that D is<br />
compact. We have that B ∗ is a subset of D (Why?). The topology that B ∗ inherits<br />
from D is the product topology, which you may recall is the topology<br />
of pointwise convergence. But, we know that the topology of pointwise<br />
convergence is exactly the weak-* topology. That is, B ∗ as a subset<br />
of D has the weak-* topology. Since D is compact, we need to just<br />
show that B ∗ is closed. (Why?) Let fα ∈ B ∗ be a net that converges<br />
to f ∈ D. We need to show that f ∈ B ∗ . First, is f linear? Well,<br />
f(ax + by) = lim fα(ax + by) = a lim fα(x) + b lim fα(y) = af(x) + bf(y). So, f is<br />
linear. Moreover, since f ∈ D, we have |f(x)| ≤ ||x|| for every x, so ||f|| ≤ 1. So, f ∈ B ∗ , and we have that B ∗ is closed. Consequently, B ∗ is<br />
compact in the weak-* topology. □<br />
To summarize, we’ve given a normed space X two topologies: the<br />
usual norm topology and a new topology generated by the functionals<br />
in X ∗ . On X ∗ , we have the usual norm topology and the topology<br />
of pointwise convergence induced by X. What about the space of<br />
bounded operators, B(X,Y )? This has convergence given by the operator norm.
That is, if Tn → T, we mean ||Tn − T || → 0. That instantly implies<br />
||Tnx − Tx|| → 0 for all x ∈ X. What about the other way around? If<br />
||Tnx − Tx|| → 0 for all x ∈ X, does ||Tn − T || → 0? This is not the<br />
case. However, we can define a topology on B(X,Y ) using<br />
these pointwise norm estimates. To be precise,<br />
Definition 16. We say that Tn → T strongly if ||Tnx − Tx|| → 0 for<br />
every x ∈ X.<br />
The topology associated with this is called the strong operator topology.<br />
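To see numerically how strong convergence can hold without norm convergence, here is a small sketch using powers of the left shift on ℓ 2 , truncated to finite lists.<br />
The left shift is my choice of example and does not appear in the notes; the vector x(k) = 1/k and the truncation length are likewise assumptions for illustration.<br />

```python
import math

def left_shift_power(x, n):
    """Apply the n-th power of the left shift to a finite sequence x:
    (S^n x)(k) = x(k + n), padding with zeros at the end."""
    return x[n:] + [0.0] * min(n, len(x))

def norm2(x):
    return math.sqrt(sum(a * a for a in x))

def unit_vector(n, length):
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

LENGTH = 2000  # truncation length, an assumption of this sketch
x = [1.0 / k for k in range(1, LENGTH + 1)]

for n in (1, 10, 100, 1000):
    # ||S^n x|| is the norm of a tail of x, so it becomes small
    tail_norm = norm2(left_shift_power(x, n))
    # S^n e_{n+1} = e_1 has norm 1, so the operator norm of S^n is at least 1
    witness = norm2(left_shift_power(unit_vector(n + 1, LENGTH), n))
    print(n, tail_norm, witness)
```

The first printed column shrinks for the fixed x, while the second stays at 1 for every n, which is the pattern one expects when Tn → 0 strongly but not in norm.<br />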
<strong>Problems</strong><br />
On these problems, I strongly suggest taking a glance back at the<br />
Hahn-<strong>Banach</strong> theorem and its useful consequences.<br />
Problem 25. Let xn,yn weakly converge to x and y respectively. Show<br />
that αxn + βyn → αx + βy weakly.<br />
Problem 26. Let Tn : ℓ 2 → ℓ 2 be given by Tnx = (0, 0, ..., 0, x(n), x(n+1), ...),<br />
where x(n) sits in the n-th coordinate. Consider the sequence Tn. Show first that each Tn is a linear,<br />
bounded operator. Show that ||Tnx − Tx|| → 0 for every x ∈ ℓ 2 , for some appropriate<br />
T. Does ||Tn − T || → 0?<br />
Problem 27. Let X,Y be normed spaces. Let xn → x weakly and let<br />
T ∈ L(X,Y ). Show that Txn → Tx weakly. Note that Txn ∈ Y .<br />
Problem 28. Let xn converge weakly in a normed space X to x. Show<br />
that x lies in the closure of span{x1, x2, ...}.
Problem 29. Let Y be a closed subspace in X. Show that Y contains<br />
all of the limits of its weakly convergent sequences.<br />
Problem 30. Let X be a <strong>Banach</strong> space and let E ⊂ X be a norm-<br />
bounded set. Consider the weak closure of E (that is, the closure of E<br />
in the weak topology). Show that the weak closure of E is still norm<br />
bounded.<br />
Problem 31. Let X be a normed vector space and Y a subspace. Show that<br />
Y is norm closed iff Y is weakly closed.<br />
Hilbert <strong>Spaces</strong><br />
In linear algebra, we learned to generalize the algebraic structure of<br />
R n by considering vector spaces of dimension n. Then, to acquire some<br />
of the topological structure, we generalized the metric nature of R n to<br />
get normed vector spaces and <strong>Banach</strong> spaces. However, even in these<br />
spaces which have similar topological and algebraic structures to R n ,<br />
there is still something missing, and this missing piece is the familiar<br />
geometry of R n . We know how to compare vectors and see if they are<br />
perpendicular. To do this, we have the dot product in R n . The dot<br />
product naturally induces a norm (which in turn gives us a metric).<br />
If we generalize the notion of a dot product, we get what is called an<br />
inner product.<br />
Definition 17. An inner product on a complex vector space X is a map 〈·,·〉 : X × X → C such<br />
that:<br />
• 〈ax + by,z〉 = a〈x,z〉 + b〈y,z〉<br />
• $\langle y,x \rangle = \overline{\langle x,y \rangle}$<br />
• 〈x,x〉 ∈ (0, ∞) for all x ≠ 0<br />
This mapping is linear in the first argument and conjugate-linear in the second<br />
argument, as 〈x,ay〉 = ā〈x,y〉. Note that in physics, the opposite<br />
convention is used (conjugate linear in the first argument). Note that if X is<br />
a real vector space, then the inner product is bilinear and conjugation<br />
plays no role. We often call a space equipped with an inner product an inner product space, or in<br />
more fancy terms, a pre-Hilbert space. With an inner product, we can<br />
induce a norm by $||x|| = \sqrt{\langle x,x \rangle}$. That this is a norm is not immediately<br />
obvious. Although it follows quickly from the definitions that<br />
||x|| = 0 iff x = 0, ||x|| ≥ 0, and ||αx|| = |α| ||x||, the triangle<br />
inequality is a bit tricky and we’ll need something to deal with that.<br />
Inner product spaces give all the structure a normed space has, plus<br />
some new tricks. One of the most valuable inequalities that I have ever<br />
used is an inequality that relates the magnitude of an inner product of<br />
two vectors and the product of their norms.<br />
Theorem 22. The Cauchy-Schwarz Inequality. Let x,y ∈ X.<br />
Then |〈x,y〉| ≤ ||x|| ||y||.<br />
Proof. Consider x,y ≠ 0 (since if either of them is zero, the inner<br />
product is zero and the result follows). For every scalar α,<br />
$||x - \alpha y||^2 = \langle x - \alpha y, x - \alpha y \rangle = \langle x,x \rangle - \bar{\alpha} \langle x,y \rangle - \alpha \langle y,x \rangle + \alpha \bar{\alpha} \langle y,y \rangle$.<br />
That is, we have $||x - \alpha y||^2 = ||x||^2 - \bar{\alpha} \langle x,y \rangle - \alpha [\langle y,x \rangle - \bar{\alpha} \langle y,y \rangle]$. We can<br />
zero out the bracketed term if we choose $\bar{\alpha} = \frac{\langle y,x \rangle}{\langle y,y \rangle}$. Consequently,<br />
$0 \le ||x - \alpha y||^2 = ||x||^2 - \bar{\alpha} \langle x,y \rangle = ||x||^2 - \frac{\langle y,x \rangle}{\langle y,y \rangle} \langle x,y \rangle$. Rewriting yields<br />
$0 \le ||x||^2 - \frac{|\langle x,y \rangle|^2}{||y||^2}$. Moving terms and multiplying by the denominator<br />
yields |〈x,y〉| 2 ≤ ||x|| 2 ||y|| 2 . Taking square roots finishes the proof.<br />
Note that $z\bar{z} = |z|^2$, which was used. □<br />
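The key step of the proof, the choice of α, can be checked numerically. The following is only a sketch on C 3 with made-up vectors x and y (both are assumptions for illustration); the inner product is taken linear in the first slot, as in Definition 17.<br />

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in the first slot and
    conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 4 + 0j]

# The choice from the proof: conj(alpha) = <y,x> / <y,y>
alpha = (inner(y, x) / inner(y, y)).conjugate()
residual = [a - alpha * b for a, b in zip(x, y)]

print(abs(inner(x, y)), norm(x) * norm(y))   # left value does not exceed the right
print(abs(inner(residual, y)))               # close to 0: x - alpha*y is orthogonal to y
```

Geometrically, this α makes αy the component of x along y, which is why the leftover piece is orthogonal to y.<br />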
Theorem 23. If $||x|| = \sqrt{\langle x,x \rangle}$, then ||x|| is a norm on X.<br />
Proof. By the previous remarks, ||x|| satisfies all of the norm properties<br />
automatically from the definition, save for possibly the triangle inequality.<br />
We have ||x + y|| 2 = 〈x + y,x + y〉 = ||x|| 2 + 〈x,y〉 + 〈y,x〉 + ||y|| 2 .<br />
Now, the two middle terms sum to 2ℜ〈x,y〉 ≤ 2|〈x,y〉|, and the Cauchy-Schwarz<br />
inequality bounds this by 2||x|| ||y||. Therefore, ||x + y|| 2 ≤<br />
||x|| 2 + 2||x|| ||y|| + ||y|| 2 = (||x|| + ||y||) 2 . Taking square roots gives<br />
the triangle inequality. □<br />
We know that the norm is continuous from our initial study of<br />
normed vector spaces, so it follows that the norm induced by the in-<br />
ner product is continuous. More can be said: the inner product is a<br />
continuous mapping from X × X → C.<br />
Lemma 5. Let X be an inner product space and let xn,yn be sequences converging<br />
to x,y respectively. Then lim〈xn,yn〉 = 〈x,y〉.<br />
Proof. We have |〈xn,yn〉−〈x,y〉| = |〈xn,yn〉−〈x,yn〉+〈x,yn〉−〈x,y〉| ≤<br />
|〈xn − x,yn〉| + |〈x,yn − y〉| ≤ ||x − xn|| ||yn|| + ||x|| ||y − yn|| → 0, since the convergent sequence yn is bounded. □<br />
So far we have generalized the idea of a dot product to an arbitrary<br />
vector space. An inner product space instantly gives us a norm topol-<br />
ogy and convergence in norm, analogous to R n . But, analytically, R n<br />
has the wonderful property of being complete. If we could define a
complete inner product space, we would have a great generalization of<br />
our familiar Euclidean spaces.<br />
Definition 18. A Hilbert Space is a complete, inner product space.<br />
Another way of phrasing it is that a Hilbert space is a <strong>Banach</strong> space<br />
whose norm is induced by an inner product. We know so far that R n and C n are Hilbert<br />
spaces with the usual dot product. We’ve run into an example of an infinite dimensional<br />
Hilbert space already: ℓ 2 . If we let $\langle x,y \rangle = \sum_{j=1}^{\infty} x(j) \overline{y(j)}$ for x,y ∈ ℓ 2 ,<br />
we have a well defined inner product. (Why?) This is what makes ℓ 2 so<br />
much more special than ℓ 1 or ℓ ∞ , which can have bizarre, pathological<br />
problems (especially L 1 and L ∞ ). However, ℓ 2 is a very nice space<br />
with great properties. We already know that (ℓ p ) ∗ = ℓ q where p and q<br />
are conjugates, and 2 is conjugate with itself, so we know ℓ 2 = (ℓ 2 ) ∗ .<br />
Soon, we will be able to show that for a general Hilbert space H, there<br />
is a bijection between H and H ∗ . More concretely, there are several<br />
familiar geometric properties in a Hilbert space.<br />
Theorem 24. The Parallelogram Law. Let x,y ∈ H. Then ||x+y|| 2 +||x−y|| 2 = 2(||x|| 2 +||y|| 2 ).<br />
Proof. Note that ||x + y|| 2 = ||x|| 2 + 2ℜ〈x,y〉 + ||y|| 2 and ||x − y|| 2 =<br />
||x|| 2 − 2ℜ〈x,y〉 + ||y|| 2 . Summing the two formulas gives the desired<br />
result. □<br />
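This identity is a feature of norms that come from an inner product. As a hedged illustration (the vectors and the comparison with the ℓ 1 norm on R 2 are my own choices, not from the notes), the sketch below evaluates the difference between the two sides for the Euclidean norm and for the 1-norm.<br />

```python
import math

def norm2(v):
    """Euclidean norm, which comes from the dot product."""
    return math.sqrt(sum(a * a for a in v))

def norm1(v):
    """The l^1 norm (sum of absolute values)."""
    return sum(abs(a) for a in v)

def parallelogram_gap(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2) for the given norm."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

x, y = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_gap(norm2, x, y))  # zero up to rounding
print(parallelogram_gap(norm1, x, y))  # nonzero, so the identity fails for the 1-norm
```

A nonzero gap for the 1-norm is consistent with the earlier remark that ℓ 2 is special among the ℓ p spaces.<br />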
The importance of the inner product is only realized when we gener-<br />
alize the notion of orthogonality. We say that x ⊥ y or x is orthogonal<br />
to y if 〈x,y〉 = 0. One of the most familiar rules from geometry is the<br />
Pythagorean theorem for a right triangle. We can generalize this to<br />
arbitrary Hilbert spaces.
Theorem 25. The Pythagorean Theorem. Let x1, ..., xn ∈ H and<br />
let xj ⊥ xk for j ≠ k. Then $||\sum_{j=1}^{n} x_j||^2 = \sum_{j=1}^{n} ||x_j||^2$.<br />
Proof. Exercise. □<br />
Given a set M ⊂ H, we can define M ⊥ = {x ∈ H : x ⊥ y ∀y ∈ M}.<br />
We call M ⊥ the orthogonal complement of M. With the inner product,<br />
given any closed subspace M, we can decompose a Hilbert space into the direct sum of<br />
M and its orthogonal complement. For example, in R 2 , the x-axis and<br />
y-axis are orthogonal one dimensional subspaces such that R 2 can be<br />
viewed as the direct sum of the two subspaces. We can generalize this<br />
notion to arbitrary Hilbert spaces by first considering the following<br />
question: if M ⊂ H is a closed subspace and y ∈ H, does there exist<br />
a unique x ∈ M such that ||x − y|| is minimized? Can we always<br />
find a closest vector in the subspace? The answer is that for a closed<br />
subspace, we can do this; furthermore, with this unique x, we can<br />
actually express y as a sum of an element from M and M ⊥ . Since this<br />
can be done for arbitrary y ∈ H, we can decompose H into M ⊕ M ⊥ .<br />
Theorem 26. Let M be a closed subspace of H. Then H = M ⊕ M ⊥ .<br />
In other words, if x ∈ H, then we can uniquely write x = y + z where<br />
y ∈ M and z ∈ M ⊥ . These unique elements y and z are the unique<br />
elements of M and M ⊥ that minimize the distance to x. [2]<br />
Proof. Let x ∈ H and define δ = inf{||x − y|| : y ∈ M}. By the<br />
definition of infimum, we may find a sequence yn in M such that ||x −yn|| →<br />
δ. Since H is a Hilbert space, we may use the parallelogram law (applied to yn − x and ym − x), which<br />
tells us that:<br />
$2(||y_n - x||^2 + ||y_m - x||^2) = ||y_n - y_m||^2 + ||y_n + y_m - 2x||^2$<br />
We know that M is a subspace, so $\frac{1}{2}(y_n + y_m) \in M$, and hence $||\frac{1}{2}(y_n + y_m) - x|| \ge \delta$. If we solve for<br />
||yn − ym|| 2 , and factor out the 2 from ||yn + ym − 2x|| 2 , we arrive at:<br />
$||y_n - y_m||^2 = 2(||y_n - x||^2 + ||y_m - x||^2) - 4||\frac{1}{2}(y_n + y_m) - x||^2$<br />
$||y_n - y_m||^2 \le 2||y_n - x||^2 + 2||y_m - x||^2 - 4\delta^2$<br />
Now, we know that ||yn − x|| 2 and ||ym − x|| 2 tend to δ 2 ,<br />
so the right hand side tends to zero. This tells us that yn is a Cauchy<br />
sequence in M. Consequently, there exists a y ∈ M such that yn → y.<br />
(Why?) Define z = x − y; by continuity of the norm, ||z|| = ||x − y|| = δ. So far, we<br />
have shown that there is a y ∈ M that minimizes the distance to x.<br />
Notice that x = y +(x −y) = y +z. If we can show that z ∈ M ⊥ , then<br />
we will be mostly done, save for uniqueness.<br />
What we need to do now is show that for any u ∈ M, u ⊥ z. So,<br />
consider 〈z,u〉. This quantity may be a complex number, but if we<br />
multiply u by an appropriate scalar, we can turn this into a real valued<br />
quantity (Note: this trick is often used, where one multiplies by a<br />
scalar to either normalize or make a quantity real). Consider f(t) =<br />
||z + tu|| 2 = ||z|| 2 + 2t〈z,u〉 + t 2 ||u|| 2 . Differentiating this real valued<br />
function gives f ′ (t) = 2〈z,u〉+2t||u|| 2 . We know that f(t) = ||z +tu|| 2<br />
is minimized at t = 0 because z + tu = x − y + tu = x − (y − tu).<br />
Since y − tu ∈ M, we know ||x − (y − tu)|| ≥ δ = ||x − y|| = ||z||. Therefore,<br />
the minimization occurs when t = 0. Looking at our derivative, we<br />
have f ′ (0) = 0 = 2〈z,u〉. Hence z ⊥ u. Since this holds for arbitrary<br />
u ∈ M, we have z ∈ M ⊥ . Now, we must argue uniqueness: Let y ′ ∈ M.<br />
Then ||x − y ′ || 2 = ||x − y|| 2 + ||y − y ′ || 2 ≥ ||x − y|| 2 , with equality iff y = y ′ . Here, we used the<br />
Pythagorean theorem, which is valid since y − y ′ ∈ M and<br />
x − y = z ∈ M ⊥ . One can similarly show that given another z ′ ∈ M ⊥ ,<br />
||x − z ′ || 2 = ||x − z|| 2 + ||z − z ′ || 2 ≥ ||x − z|| 2 , and we have equality iff<br />
z = z ′ . This solves uniqueness of y and z as the closest elements to x<br />
from M and M ⊥ respectively.<br />
Therefore, given any x ∈ H, we can write x = y + z where y ∈ M<br />
and z ∈ M ⊥ . Assume that there is another decomposition x = y ′ + z ′ with y ′ ∈ M and z ′ ∈ M ⊥ . Then,<br />
y ′ + z ′ = x = y + z implies y ′ − y = z − z ′ . But, y ′ − y ∈ M and<br />
z − z ′ ∈ M ⊥ , and hence y ′ − y = z − z ′ ∈ M ⊥ ∩ M = {0}. Therefore,<br />
we have a unique decomposition of H as M ⊕ M ⊥ . □<br />
If we look back at the beginning of this proof where we were es-<br />
tablishing the existence of a minimizing distance vector y to x, notice<br />
that the only properties of M we used were that M was closed (hence<br />
complete) and that $\frac{1}{2}(y_n + y_m) \in M$. The second property is far less<br />
demanding than being a subspace; in fact, a convex set would do just<br />
fine. That is, if K is a closed, convex set in a Hilbert space, then we<br />
can find a unique minimizing vector. The rest of the proof utilizes<br />
subspace properties however.<br />
Let x ∈ H and let M be a closed subspace. Then by the previous<br />
theorem, there exist unique y,z in M and M ⊥ respectively with x = y + z. We call y the<br />
orthogonal projection of x onto M. This is motivated by the calculus<br />
and geometry you are likely familiar with. With this information, we<br />
can define a mapping P : H → M by Px = y. From the uniqueness of the decomposition, we see that P<br />
is a linear operator. Furthermore, P is continuous, and hence bounded<br />
(if xn → x, write xn = yn + zn; then ||xn − x|| 2 = ||yn − y|| 2 + ||zn − z|| 2 , so yn → y;<br />
from this, Pxn = yn → y = Px). We have some nice properties: P is<br />
an onto bounded linear mapping from H to M, and it is the identity<br />
on M, and hence P 2 = P (Why?). Additionally, P(M ⊥ ) = {0}.<br />
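For a concrete feel, here is a minimal sketch of an orthogonal projection in R 3 . It assumes M is a plane given by an orthonormal basis and uses the standard finite dimensional formula Px = Σ 〈x,u〉u over that basis; the formula, the plane, and the vector x are illustrative assumptions rather than anything derived in the notes so far.<br />

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project(x, onb):
    """Orthogonal projection of x onto the span of the orthonormal vectors in `onb`."""
    y = [0.0] * len(x)
    for u in onb:
        c = dot(x, u)
        y = [yi + c * ui for yi, ui in zip(y, u)]
    return y

s = 1.0 / math.sqrt(2.0)
M_basis = [[1.0, 0.0, 0.0], [0.0, s, s]]   # orthonormal basis of a plane M in R^3

x = [2.0, 3.0, -1.0]
y = project(x, M_basis)                 # component in M
z = [a - b for a, b in zip(x, y)]       # component in the orthogonal complement

print(y, z)
print([dot(z, u) for u in M_basis])     # both close to 0, so z is orthogonal to M
print(project(y, M_basis))              # agrees with y up to rounding, reflecting P^2 = P
```

One can also compare dot(x, x) with dot(y, y) + dot(z, z) to see the Pythagorean theorem at work in the decomposition x = y + z.<br />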
With this newfound structure in a Hilbert space, we can learn some-<br />
thing very important about the dual space of H. If y ∈ H, then we<br />
may define f(x) = 〈x,y〉, which is a bounded linear functional. (Why?)<br />
The surprising thing is, every bounded linear functional can be written<br />
in this way! Therefore, H ∗ can be identified naturally with H itself.<br />
This instantly tells us that we may view H ∗∗ = H also.<br />
Theorem 27. The Riesz Representation Theorem. Let H be a Hilbert<br />
space and f ∈ H ∗ . Then, there exists a unique y ∈ H such that f(x) =<br />
〈x,y〉 for all x ∈ H.<br />
Proof. If f = 0, then it is certainly true that f(x) = 〈x, 0〉 = 0. Let<br />
f not be the zero functional. Let M = {x ∈ H : f(x) = 0}. We<br />
know that the kernel of a bounded operator gives a closed subspace,<br />
so M is closed. Since f ≠ 0, M is a proper subspace of H. Then H =<br />
M ⊕ M ⊥ and M ⊥ is non-trivial, so we may choose z ∈ M ⊥ such<br />
that ||z|| = 1 (pick any nonzero element of M ⊥ and normalize it). Then, for x ∈ H,<br />
define u = f(x)z − f(z)x. So, f(u) = f(x)f(z) − f(z)f(x) = 0, and u ∈ M. So, 0 =<br />
$\langle u,z \rangle = f(x)||z||^2 - f(z) \langle x,z \rangle = f(x) - \langle x, \overline{f(z)} z \rangle$. Solving for f(x)<br />
gives f(x) = 〈x,y〉 with $y = \overline{f(z)} z$.<br />
Now, we must show uniqueness. Assume there exists y,y ′ such that<br />
f(x) = 〈x,y〉 = 〈x,y ′ 〉. Then, 0 = 〈x,y − y ′ 〉. Choosing x = y − y ′ , we<br />
get ||y − y ′ || 2 = 0, and hence y = y ′ . □<br />
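The construction in the proof can be traced by hand in C 3 . The sketch below is only an illustration under stated assumptions: the coefficient list c defining f and the test vector x are hypothetical, and the inner product is linear in the first slot as in Definition 17.<br />

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

c = [2 + 1j, -1j, 3 + 0j]          # hypothetical coefficients defining f

def f(x):
    """A bounded linear functional on C^3: f(x) = sum_k c[k] * x[k]."""
    return sum(ck * xk for ck, xk in zip(c, x))

# ker f is the set of x with <x, conj(c)> = 0, so conj(c) spans its orthogonal complement.
w = [ck.conjugate() for ck in c]
w_norm = math.sqrt(inner(w, w).real)
z = [wk / w_norm for wk in w]       # unit vector orthogonal to ker f

# The representing vector from the proof: y = conj(f(z)) z
y = [f(z).conjugate() * zk for zk in z]

x = [1 - 2j, 0.5 + 0j, 4j]
print(f(x), inner(x, y))            # the two values agree up to rounding
```

In this finite dimensional setting the representing vector is simply the entrywise conjugate of c, which is a useful sanity check on the conjugate in y.<br />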
This amazing result tells us that the functionals acting on H can be<br />
identified with H itself through a conjugate linear isomorphism. With<br />
this new knowledge of functionals, we can build new operators from<br />
already existing bounded linear operators.<br />
Definition 19. Let H be a Hilbert space and let T : H → H be a<br />
bounded linear operator. Then, we define the adjoint T ∗ : H → H<br />
to be the operator such that 〈Tx,y〉 = 〈x,T ∗ y〉 for all x,y ∈ H.<br />
It is not obvious that such an operator even exists. However, with the<br />
Riesz representation theorem, we can actually build it rather quickly<br />
since we know a bit about functionals.<br />
Theorem 28. The adjoint T ∗ of a bounded linear operator exists and<br />
is a unique, bounded linear operator with norm equal to ||T ||.<br />
Proof. Fix y ∈ H and consider the functional defined by fy(x) = 〈Tx,y〉 for all x ∈ H.<br />
Then, this is a bounded linear functional, as |fy(x)| ≤ ||T || ||x|| ||y||<br />
(Why?) and hence by the Riesz representation theorem, there exists<br />
a unique z ∈ H such that 〈Tx,y〉 = 〈x,z〉 for all x ∈ H. Consider the mapping<br />
of H → H given by y → z. We may call this mapping T ∗ .<br />
With this, we have a well defined linear (show linearity) operator T ∗<br />
such that 〈Tx,y〉 = 〈x,T ∗ y〉 for all x,y ∈ H. Given an operator, we<br />
have (with all suprema taken over nonzero vectors)<br />
$||T|| = \sup_{x \in H} \frac{||Tx||}{||x||} = \sup_{x,y \in H} \frac{|\langle Tx,y \rangle|}{||x|| \, ||y||}$.<br />
Then, we have<br />
$||T^*|| = \sup_{x,y \in H} \frac{|\langle T^* x,y \rangle|}{||x|| \, ||y||} = \sup_{x,y \in H} \frac{|\langle x,Ty \rangle|}{||x|| \, ||y||} \le \sup_{x,y \in H} \frac{||x|| \, ||Ty||}{||x|| \, ||y||} = \sup_{y \in H} \frac{||Ty||}{||y||} = ||T||$.<br />
On the other hand, choosing the first vector to be Tx and the second to be x (for Tx ≠ 0),<br />
$||T^*|| = \sup_{x,y \in H} \frac{|\langle x,Ty \rangle|}{||x|| \, ||y||} \ge \sup_{x \in H} \frac{\langle Tx,Tx \rangle}{||Tx|| \, ||x||} = \sup_{x \in H} \frac{||Tx||}{||x||} = ||T||$.<br />
So, ||T || = ||T ∗ ||.<br />
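In finite dimensions the adjoint is a familiar object. As a sketch (assuming H = C 2 with the standard inner product, and with a made-up matrix T and vectors x, y), the conjugate transpose satisfies the defining relation 〈Tx,y〉 = 〈x,T ∗ y〉:<br />

```python
def inner(x, y):
    """Inner product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]

def adjoint(T):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]

T = [[1 + 1j, 2 + 0j], [-1j, 3 - 2j]]
x = [1 - 1j, 2 + 3j]
y = [0.5j, -1 + 0j]

lhs = inner(apply(T, x), y)            # <Tx, y>
rhs = inner(x, apply(adjoint(T), y))   # <x, T*y>
print(lhs, rhs)                        # equal up to rounding
```

Applying adjoint twice returns the original matrix, in line with the property (T ∗ ) ∗ = T listed in Theorem 29.<br />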
Note that it is possible to generalize this concept to an adjoint mapping<br />
between two different Hilbert spaces H1 and H2. That is, we<br />
can define, for a given T ∈ B(H1,H2), T ∗ ∈ B(H2,H1) such that<br />
〈Tx,y〉2 = 〈x,T ∗ y〉1 for all x ∈ H1, y ∈ H2. However, this requires<br />
some extra machinery (sesquilinear forms), which I decided wasn’t<br />
worth pursuing here and can easily be found in any textbook or on the internet.<br />
Before we start proving some properties about adjoints, there is<br />
a useful trick for showing that an operator is actually the zero operator:<br />
Lemma 6. Let X,Y be inner product spaces and let T ∈ B(X,Y ). [3]<br />
Then:<br />
a. T = 0 iff 〈Tx,y〉 = 0 for all x ∈ X and y ∈ Y<br />
b. If T : X → X and X is a complex inner product space and if<br />
〈Tx,x〉 = 0 for all x ∈ X, then T = 0<br />
Proof. Part (a) is an exercise. For part (b), consider 〈T(αx + y),αx + y〉.<br />
Consider the two cases of α = 1 and α = i. □<br />
Note that the statement in part (b) of the previous lemma requires<br />
that X be a complex space. It is false in the real case. Consider a<br />
rotation operator in R 2 that rotates by 90 degrees [3].
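<br />
The rotation counterexample is easy to check directly. In this sketch (the sample vectors are arbitrary choices of mine), the same formula (a, b) → (−b, a) is applied first to real vectors and then to a vector in C 2 .<br />

```python
def rotate90(v):
    """Rotate a vector in the plane by 90 degrees counterclockwise: (a, b) -> (-b, a)."""
    a, b = v
    return (-b, a)

def inner(x, y):
    """Inner product, conjugate-linear in the second slot (real or complex entries)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

# Real vectors: <Tx, x> vanishes for every x although T is not the zero operator.
for v in [(1.0, 0.0), (2.0, -3.0), (0.5, 4.0)]:
    print(v, inner(rotate90(v), v))

# The same formula on C^2: <Tx, x> can be nonzero, consistent with part (b).
w = (1 + 0j, 1j)
print(inner(rotate90(w), w))
```

The complex vector is what detects T here, which matches the role played by α = i in the proof of part (b).<br />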
Theorem 29. Let T,S : H → H be bounded linear operators. Then,<br />
a. (S + T) ∗ = S ∗ + T ∗<br />
b. (αT) ∗ = ¯αT ∗<br />
c. (T ∗ ) ∗ = T<br />
d. ||T ∗ T || = ||TT ∗ || = ||T || 2<br />
e. T ∗ T = 0 iff T = 0<br />
f. (ST) ∗ = T ∗ S ∗<br />
The proofs of these are computational exercises which hopefully<br />
shouldn’t prove to be too strenuous. In the exercises, we will explore<br />
operators known as self-adjoint operators, which satisfy T = T ∗ and<br />
unitary operators, which are invertible operators such that T ∗ = T −1 .<br />
So far, we’ve learned a bit about functionals and operators on a Hilbert<br />
space. It’s time we learn about one of the most useful properties about<br />
Hilbert spaces: orthonormal bases.<br />
A set {eα} ⊂ H is said to be orthonormal if 〈eα,eβ〉 =<br />
δαβ, where δαβ = 1 if α = β and zero otherwise. That is, every vector<br />
in an orthonormal set is orthogonal to every other one, and every vector<br />
has norm one. Recall from linear algebra that given any linearly<br />
independent set {xn}, one could transform this into an orthonormal<br />
set {un} using the Gram-Schmidt orthogonalization procedure. One defines<br />
$u_1 = \frac{x_1}{||x_1||}$ and then, once u1, ..., un−1 have been constructed, one sets<br />
$z_n = x_n - \sum_{j=1}^{n-1} \langle x_n,u_j \rangle u_j$ and $u_n = \frac{z_n}{||z_n||}$, repeating this for all xn. Orthonormal sets satisfy a very important<br />
inequality which relates the inner products of a vector against an<br />
orthonormal set with the norm of the vector.
Theorem 30. Bessel’s Inequality. If {eα}α∈A is an orthonormal set<br />
in H, then for any x ∈ H, we have $\sum_{\alpha \in A} |\langle x,e_\alpha \rangle|^2 \le ||x||^2$.<br />
Proof. It is possible that this is an uncountable sum; to deal with an<br />
uncountable sum, one takes the supremum over all finite subsets of A.<br />
Therefore, if we can prove this for an arbitrary finite subset B of A, we<br />
will be done. We have<br />
$0 \le ||x - \sum_{\alpha \in B} \langle x,e_\alpha \rangle e_\alpha||^2$<br />
$= ||x||^2 - 2\,\mathrm{Re} \langle x, \sum_{\alpha \in B} \langle x,e_\alpha \rangle e_\alpha \rangle + ||\sum_{\alpha \in B} \langle x,e_\alpha \rangle e_\alpha||^2$ (use the Pythagorean theorem on the rightmost piece)<br />
$= ||x||^2 - 2 \sum_{\alpha \in B} |\langle x,e_\alpha \rangle|^2 + \sum_{\alpha \in B} |\langle x,e_\alpha \rangle|^2$<br />
$= ||x||^2 - \sum_{\alpha \in B} |\langle x,e_\alpha \rangle|^2$<br />
Moving the last sum to the other side finishes the proof. □<br />
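Here is a small sketch tying together the Gram-Schmidt procedure and Bessel’s inequality in R 3 . The input vectors and the test vector x are made-up data; the orthonormal set produced spans only a plane, so one expects a strict inequality for an x outside that plane.<br />

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of real vectors."""
    basis = []
    for x in vectors:
        # subtract the components of x along the vectors found so far
        z = list(x)
        for u in basis:
            c = dot(x, u)
            z = [zi - c * ui for zi, ui in zip(z, u)]
        length = math.sqrt(dot(z, z))
        basis.append([zi / length for zi in z])
    return basis

# Two independent vectors in R^3 (hypothetical data); their span is a plane.
u = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
x = [3.0, -2.0, 4.0]

bessel_sum = sum(dot(x, e) ** 2 for e in u)
print(dot(u[0], u[1]))          # close to 0
print(bessel_sum, dot(x, x))    # the first value does not exceed the second
```

Adding a third orthonormal vector to complete a basis of R 3 would close the gap between the two printed numbers.<br />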
Can equality happen in Bessel’s inequality? The answer is yes, and<br />
something very nice happens in that case. If one has an orthonormal<br />
set such that $\sum_{\alpha \in A} |\langle x,e_\alpha \rangle|^2 = ||x||^2$ for all x ∈ H, it turns out that the<br />
set {eα} actually is a sort of orthonormal basis. That is, we can express<br />
$x = \sum_{\alpha \in A} \langle x,e_\alpha \rangle e_\alpha$. We call the coefficients 〈x,eα〉 Fourier coefficients.<br />
Theorem 31. Let {eα}α∈A be an orthonormal set in H. The following<br />
are equivalent: [2]<br />
a. If 〈x,eα〉 = 0 for all α, then x = 0.<br />
b. Parseval’s Identity: $||x||^2 = \sum_{\alpha \in A} |\langle x,e_\alpha \rangle|^2$ for all x ∈ H.<br />
c. For each x ∈ H, $x = \sum_{\alpha \in A} \langle x,e_\alpha \rangle e_\alpha$. This sum converges in<br />
the norm topology no matter the ordering.<br />
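Proof. Assume (a) and let’s show (c). We may discard all α<br />
such that 〈x,eα〉 = 0; by Bessel’s inequality, only countably many α remain (Why?), say α1, α2, ..., and<br />
the sum $\sum_{j} |\langle x,e_{\alpha_j} \rangle|^2$ converges. By the Pythagorean theorem, we have that<br />
$||\sum_{j=n}^{m} \langle x,e_{\alpha_j} \rangle e_{\alpha_j}||^2 = \sum_{j=n}^{m} |\langle x,e_{\alpha_j} \rangle|^2 \to 0$ as we let m,n get arbitrarily large. By the<br />
completeness of H, $\sum_{j} \langle x,e_{\alpha_j} \rangle e_{\alpha_j}$ converges. Then, $\langle x - \sum_{j} \langle x,e_{\alpha_j} \rangle e_{\alpha_j}, e_\beta \rangle = 0$<br />
for all β ∈ A, and by assumption of (a), this tells us that $x - \sum_{j} \langle x,e_{\alpha_j} \rangle e_{\alpha_j} = 0$.<br />
Let’s assume (c) and show (b). We have<br />
$||x||^2 - \sum_{j=1}^{n} |\langle x,e_{\alpha_j} \rangle|^2 = ||x - \sum_{j=1}^{n} \langle x,e_{\alpha_j} \rangle e_{\alpha_j}||^2$ by calculation. We have by assumption that the<br />
term on the right goes to zero as n → ∞. Hence, we have $||x||^2 = \sum_{j=1}^{\infty} |\langle x,e_{\alpha_j} \rangle|^2$.<br />
If we assume (b), then (a) follows: if 〈x,eα〉 = 0 for all α, then we<br />
have $||x||^2 = \sum_{\alpha \in A} |\langle x,e_\alpha \rangle|^2 = 0$, and hence x = 0. □<br />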
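As a finite dimensional sanity check of Parseval’s identity and the expansion in part (c), consider a rotated orthonormal basis of R 2 . The angle theta and the vector x below are arbitrary assumptions chosen for illustration.<br />

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

theta = 0.7  # hypothetical rotation angle; any value gives an orthonormal basis
e1 = [math.cos(theta), math.sin(theta)]
e2 = [-math.sin(theta), math.cos(theta)]

x = [3.0, -4.0]
coeffs = [dot(x, e1), dot(x, e2)]    # Fourier coefficients of x

# Parseval: the squared coefficients add up to ||x||^2
print(sum(c * c for c in coeffs), dot(x, x))

# Reconstruction: x = <x,e1> e1 + <x,e2> e2
rebuilt = [coeffs[0] * a + coeffs[1] * b for a, b in zip(e1, e2)]
print(rebuilt)
```

Dropping e2 from the set would turn the equality into the strict inequality of Bessel, since x would no longer be recovered from a single coefficient.<br />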
This theorem illustrates the desirable nature of an orthonormal set<br />
with these properties in a Hilbert space: it allows every vector to be written as a<br />
(possibly infinite) linear combination of the orthonormal vectors, with the vector’s Fourier<br />
coefficients as the weights. This is why a Hilbert space can be such an ideal space to<br />
work in. Generalizing from linear algebra, we call a set that satisfies<br />
one (and hence all) properties of the previous theorem an orthonormal<br />
basis. We know from our previous work that the set $\{e_n\} \subset \ell^2$, the<br />
canonical basis, is a Schauder basis for $\ell^2$. With inner product defined<br />
as $\langle x,y\rangle = \sum_{n=1}^{\infty} x(n)y(n)$ (with $y(n)$ conjugated in the complex case),<br />
we see that the $(e_n)$ form an orthonormal<br />
set, and it isn’t too hard to show that if $\langle x,e_n\rangle = 0$ for all $n$, then<br />
$x = 0$ (indeed, $\langle x,e_n\rangle = x(n)$); hence, this sequence forms an orthonormal basis. It should be<br />
clear that a Hilbert space with an orthonormal basis is an ideal setting
to work in. A question remains: given a Hilbert space, does there exist<br />
an orthonormal basis? Fortunately, the answer is yes!<br />
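As a concrete instance of the $\ell^2$ example above, here is a short worked computation.<br />
The particular vector $x = (1, 1/2, 1/3, \ldots)$ is chosen here for illustration and does not<br />
come from the text; the value $\sum 1/n^2 = \pi^2/6$ is the classical Basel sum.<br />

```latex
% Illustrative choice (not from the notes): x(n) = 1/n, which lies in \ell^2.
\[
x = \left(1, \tfrac12, \tfrac13, \ldots\right), \qquad
\langle x, e_n\rangle = x(n) = \frac1n .
\]
% Parseval's identity, part (b) of the theorem:
\[
\|x\|^2 = \sum_{n=1}^{\infty} |\langle x, e_n\rangle|^2
        = \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} .
\]
% Norm convergence, part (c): the squared error after N terms is a tail sum,
% bounded by a telescoping series.
\[
\Bigl\| x - \sum_{n=1}^{N} \tfrac1n\, e_n \Bigr\|^2
  = \sum_{n=N+1}^{\infty} \frac{1}{n^2}
  < \sum_{n=N+1}^{\infty} \frac{1}{n(n-1)}
  = \frac1N \longrightarrow 0 \quad (N \to \infty).
\]
```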
Theorem 32. Let H be a Hilbert space. Then, H has an orthonormal<br />
basis.<br />
The proof of this, much like the proof of the Hahn-Banach theorem,<br />
requires a powerful set theoretic lemma: Zorn’s lemma. Our first step<br />
is to consider the set $X$ whose elements are the<br />
orthonormal subsets of $H$. (Note to the student: it is imperative that<br />
you first show $X$ is non-empty. Why is $X$ non-empty?) To give a partial<br />
ordering on $X$, we say $U_1 \le U_2$ if $U_1 \subset U_2$. To use Zorn’s lemma, we<br />
must argue that every chain has an upper bound, where a chain is a<br />
linearly ordered subset of $X$. Let $C$ be a chain in $X$; note that $C$ need not be countable. If<br />
we define $U = \bigcup_{V \in C} V$, then any two elements of $U$ lie in a common member of $C$<br />
(because $C$ is linearly ordered), so $U$ is an orthonormal set, and clearly $V \le U$<br />
for all $V \in C$. Therefore, every chain has an upper bound, and hence there is a<br />
maximal element in $X$ (that is, an orthonormal set not properly contained in any other<br />
orthonormal set). Let this set<br />
be $\{e_\alpha\}$. We do not yet know that $\{e_\alpha\}$ is an orthonormal basis. Being a<br />
maximal orthonormal set implies there exists no $x$ such that $x \perp e_\alpha$ for<br />
all $\alpha$, save for $x = 0$: otherwise $\{e_\alpha\} \cup \{x/\|x\|\}$ would be a strictly larger<br />
orthonormal set. But that is exactly part (a) of the previous<br />
theorem. Consequently, $\{e_\alpha\}$ is an orthonormal basis.<br />
Hilbert spaces keep getting better and better; they generalize the<br />
geometry and completeness of $\mathbb{R}^n$, and they always admit orthonormal<br />
bases. Hilbert spaces are reflexive, and $H^*$ can be identified with $H$ itself. Given a<br />
closed subspace, one can decompose H into the direct sum of the sub-<br />
space and its orthogonal complement. Additionally, any vector can be<br />
reconstructed from its Fourier coefficients, given an orthonormal basis,
which is guaranteed to exist. To make things even better, if one can<br />
find a countable orthonormal basis, the Hilbert space is automatically<br />
separable (and the converse is true too!).<br />
Theorem 33. Let H be a Hilbert space. Then, H is separable iff H<br />
has a countable orthonormal basis. Additionally, if H has a countable<br />
orthonormal basis, then all orthonormal bases are countable. [2]<br />
Proof. The equivalence in the first sentence will be left as an exercise.<br />
Let’s show the last assertion. Let $\{u_n\}$ be a countable orthonormal<br />
basis and $\{v_\beta\}_{\beta\in B}$ be an arbitrary orthonormal basis. Then, consider<br />
the sets $A_n = \{\beta \in B : \langle v_\beta,u_n\rangle \ne 0\}$. Each set $A_n$ must be countable,<br />
by Bessel’s inequality applied to $u_n$ and the orthonormal set $\{v_\beta\}$. Then,<br />
$\bigcup_n A_n$ is a countable union of countable sets, hence countable. Since $\{u_n\}$ forms an orthonormal<br />
basis, every $\beta \in B$ is in at least one $A_n$ (otherwise $v_\beta \perp u_n$ for all $n$, which<br />
forces $v_\beta = 0$ by part (a) of Parseval’s theorem), so $B = \bigcup_n A_n$ is countable. $\square$<br />
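The countability of each $A_n$ used in the proof above can be spelled out as follows, reading<br />
$A_n$ as the set of $\beta$ with $\langle v_\beta,u_n\rangle \ne 0$. The auxiliary sets $A_{n,k}$ are<br />
notation introduced here for the sketch and do not appear in the text.<br />

```latex
% Auxiliary sets (hypothetical notation): split A_n by the size of the coefficient.
\[
A_{n,k} = \Bigl\{ \beta \in B : |\langle u_n, v_\beta\rangle| \ge \tfrac1k \Bigr\},
\qquad A_n = \bigcup_{k=1}^{\infty} A_{n,k} .
\]
% Bessel's inequality for u_n against any finite subset F of A_{n,k}:
\[
\frac{|F|}{k^2}
  \le \sum_{\beta \in F} |\langle u_n, v_\beta\rangle|^2
  \le \|u_n\|^2 = 1
  \quad\Longrightarrow\quad |F| \le k^2 .
\]
% Hence each A_{n,k} has at most k^2 elements, and A_n is a countable
% union of finite sets.
```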
References<br />
[1] John B. Conway. A Course in Functional Analysis. Springer, 2007.<br />
[2] Gerald B. Folland. Real Analysis. John Wiley and Sons Inc., 1999.<br />
[3] Erwin Kreyszig. Introductory Functional Analysis with Applications. John Wiley<br />
and Sons Inc., 1989.