
NOTES ON ANALYSIS 

STEPHEN ROWE 

Contents 

Banach Spaces

Problems

A Few Inequalities for ℓ^p

From Finite to Infinite Dimension

An Introduction to Linear Operators

An Introduction to Linear Functionals and Dual Spaces

The Three Big Theorems: Open Mapping, Closed Graph, and Banach-Steinhaus

Topologies and Functionals: Weak and Weak-* Topologies

Hilbert Spaces

References

Date: June 14, 2011.

These are notes aimed at undergraduates with an interest in learning a bit about functional analysis without requiring measure theory. With this approach, functional analysis can be made accessible to undergraduates with just some basic analysis and linear algebra background. These notes are strongly inspired by Kreyszig’s textbook Introductory Functional Analysis with Applications [3]. Some topics and problems were also inspired by Folland’s superb analysis textbook [2]. I try to provide some exercises and detailed proofs along with helpful (I hope!) exposition. If you see any parenthetical (Why?)’s anywhere, those are statements which the reader should ponder before moving on (this was inspired by N. L. Carothers’ excellent Real Analysis textbook). If you see any mistakes, let me know!

Banach Spaces 

Definition 1. Let X be a vector space. We say that || · || is a norm on X if || · || : X → R is a mapping such that

• ||x|| ≥ 0

• ||x|| = 0 iff x = 0

• ||αx|| = |α| ||x|| for any α ∈ C

• ||x + y|| ≤ ||x|| + ||y|| (The Triangle Inequality)

If such a mapping exists for X, we say that X is a normed vector 

space. Geometrically, norms generalize the notion of length to arbitrary 

vector spaces. Note that a normed vector space is automatically a 

metric space with the natural metric induced by the norm given by 

d(x,y) = ||x − y||. Since the norm induces a metric, it gives rise to a 

topology naturally by considering the topology generated by open balls. 

Recall by the reverse triangle inequality that |||x|| − ||y||| ≤ ||x − y||.


Consequently, if we consider the norm as a mapping, it is continuous. 

That is, if xn,x ∈ X with xn → x, we have ||xn|| → ||x|| (where 

convergence of xn → x is given by ||xn − x|| → 0). Since a norm 

naturally induces a metric, is it true that every metric is induced by a 

norm? This is a neat question, and the answer is no (see exercises). 

Definition 2. Let X be a normed vector space. If X is complete, we 

say that X is a Banach space. Recall that completeness means that 

every Cauchy sequence converges. 

It is worth noting that a normed vector space may be complete under 

one norm, but incomplete under other norms. It is quite helpful to have 

a few examples of different normed vector spaces and Banach spaces. 

The reader should know that C^n and R^n are both Banach spaces, but

there are more exotic and interesting examples out there. 

Example 1. We define C[a,b] to be the set of continuous functions on the interval [a,b]. The first norm worth considering is the sup-norm. Define ||f|| = sup_{t∈[a,b]} |f(t)|. Under this norm, C[a,b] is a complete metric space, and hence a Banach space. To see why this is so, let (f_n) be a Cauchy sequence in C[a,b] and fix ε > 0. Then, for n,m large enough, ||f_n − f_m|| < ε; that is, sup_{t∈[a,b]} |f_n(t) − f_m(t)| < ε. If we fix a t, then (f_n(t)) is a Cauchy sequence of scalars, and hence converges. Hence, f_n converges pointwise to a function f. Now, for n,m large enough and every t ∈ [a,b], we have |f_n(t) − f_m(t)| < ε. Letting n → ∞ with m and t fixed (which is allowed since the absolute value is continuous) gives |f(t) − f_m(t)| ≤ ε for every t, and so sup_{t∈[a,b]} |f(t) − f_m(t)| ≤ ε. This shows that f_m converges uniformly to f. A uniform limit of continuous functions is continuous, and hence f ∈ C[a,b]. Therefore, C[a,b] is complete under the supremum norm.

On the other hand, we can give a metric to C[a,b] by d(f,g) = ∫_a^b |f(t) − g(t)| dt. Under this metric, C[a,b] is not complete. To see why, take [a,b] = [0,1] and, for n ≥ 2, let f_n(t) = 0 for t ∈ [0, 1/2] and f_n(t) = 1 for t ∈ [1/2 + 1/n, 1]. In the unspecified area, simply let f_n be linear such that it makes f_n continuous (start at 0 at t = 1/2 and go up to 1 at t = 1/2 + 1/n). Given ε > 0, we have d(f_n,f_m) ≤ ε for all n,m > N when N is large enough. The easiest way to see this is graphically, by drawing f_n and f_m for large n,m: the area under |f_n(t) − f_m(t)| becomes very small. However, the limit of this sequence of functions in this metric is a step function, which is discontinuous. Therefore, the limit is not in C[a,b]. Consequently, this metric makes the space incomplete.
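For readers who like to see numbers, here is a small numerical sketch of the ramp functions from Example 1. It is only an illustration under stated assumptions: the interval is taken to be [0, 1], the integral is approximated by a midpoint rule, the supremum by a maximum over a grid, and the function names are hypothetical. For n < m the integral distance works out to 1/(2n) − 1/(2m), which shrinks, while the sup distance between f_n and f_2n stays near 1/2, so the same sequence is Cauchy in one metric and not in the other.

```python
def f(n, t):
    """Ramp f_n on [0, 1]: 0 on [0, 1/2], linear on [1/2, 1/2 + 1/n], then 1."""
    if t <= 0.5:
        return 0.0
    return min(1.0, n * (t - 0.5))


def l1_distance(n, m, steps=20_000):
    """Midpoint-rule approximation of the integral of |f_n - f_m| over [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f(n, (k + 0.5) * h) - f(m, (k + 0.5) * h)) for k in range(steps))


def sup_distance(n, m, steps=20_000):
    """Grid approximation of sup |f_n - f_m| over [0, 1]."""
    h = 1.0 / steps
    return max(abs(f(n, k * h) - f(m, k * h)) for k in range(steps + 1))


# Compare f_n with f_{2n}: the area between them shrinks, the sup gap does not.
for n in (10, 100, 1000):
    print(n, 2 * n, l1_distance(n, 2 * n), sup_distance(n, 2 * n))
```

The grid resolution is a choice made for this sketch; a finer grid is needed once the ramp becomes narrower than a few grid cells.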

Example 2. We can consider the set of all polynomials defined on an 

interval [a,b] to be a subset of C[a,b] with the supremum norm. Is this 

set closed? Why or why not? 

Amongst the most important spaces in analysis are the so-called L^p function spaces (with norm given by integration) and ℓ^p sequence spaces (with norm given by summation). We will focus on ℓ^p (due to our sketchy avoidance of measure theory for now). Let x denote a sequence of real (or complex) scalars and let x(n) denote the nth term in the sequence.


Definition 3. Let 1 ≤ p < ∞. We define ℓ^p to be the set of all sequences x such that Σ_{n=1}^∞ |x(n)|^p < ∞. On such a space, we can induce a norm by ||x||_p = (Σ_{n=1}^∞ |x(n)|^p)^{1/p}. If p = ∞, we define ℓ^∞ to be the set of all sequences of scalars that are bounded. That is, sup_n |x(n)| < ∞, and we set ||x||_∞ = sup_n |x(n)|.

The most important ℓ^p spaces occur for p = 1, 2, ∞. We will see later that ℓ^2 has extraordinarily nice structure. To get a better feel for these spaces, it helps to have some examples of elements in them. Consider the harmonic sequence given by x(n) = 1/n. Since the harmonic series is divergent, we know that x ∉ ℓ^1. However, we have that Σ_{n=1}^∞ x(n)^2 = π^2/6, and hence x ∈ ℓ^2. These spaces have some convenient properties. For 1 ≤ p ≤ ∞, ℓ^p is actually a complete normed space, and hence a Banach space. Additionally, for 1 ≤ p < ∞, ℓ^p is separable (ℓ^∞ is not, however!). Recall that a space is separable if it contains a countable dense set. Before we move on, a note on notation. Since the elements of ℓ^p are sequences, it can be quite confusing dealing with sequences in ℓ^p (that is, sequences of sequences!). Therefore, we will use the notation that x(n) refers to the nth entry of an element x ∈ ℓ^p. If {x_n} ⊂ ℓ^p is a sequence in ℓ^p, then x_n is the nth term in the sequence (regarding each x_n as a point in the space), and x_n(i) is the ith entry of x_n.
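The harmonic sequence x(n) = 1/n is a handy test case for telling ℓ^1 and ℓ^2 apart. The sketch below simply prints partial sums; the helper name is hypothetical, and of course finitely many terms prove nothing about convergence. It only shows the behaviour one expects: the sums of 1/n keep growing, while the sums of 1/n^2 settle near π^2/6.

```python
import math


def partial_sum(power, terms):
    """Partial sum of 1/n**power for n = 1, ..., terms."""
    return sum(1.0 / n**power for n in range(1, terms + 1))


# Columns: number of terms, partial sum of 1/n, partial sum of 1/n^2, pi^2/6.
for terms in (10, 1_000, 100_000):
    print(terms, partial_sum(1, terms), partial_sum(2, terms), math.pi**2 / 6)
```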

Theorem 1. The normed space ℓ^2 is a Banach space with the norm given by ||x|| = (Σ_{n=1}^∞ |x(n)|^2)^{1/2}.

Proof. Consider a Cauchy sequence (x_n) in ℓ^2 and fix ε > 0. Then, there exists N such that ||x_n − x_m|| ≤ ε for all n,m ≥ N. Consequently, Σ_{i=1}^∞ |x_n(i) − x_m(i)|^2 ≤ ε^2. In particular, we have (for each i) |x_n(i) − x_m(i)| ≤ ε. Hence, for fixed i, x_n(i) (with n varying) is a Cauchy sequence of scalars, and hence converges. Therefore, for each i, we may define x(i) = lim_{n→∞} x_n(i). So far all we have produced is a candidate limit for the Cauchy sequence x_n. Next, for any finite K and n,m ≥ N, we have Σ_{i=1}^K |x_n(i) − x_m(i)|^2 ≤ ε^2. Taking a limit on n and then letting K → ∞ gives Σ_{i=1}^∞ |x(i) − x_m(i)|^2 ≤ ε^2 for all m ≥ N. Consequently the vector x − x_m ∈ ℓ^2. Since ℓ^2 is a vector space, (x − x_m) + x_m = x ∈ ℓ^2. Therefore, our candidate limit is in ℓ^2, and ||x_m − x|| ≤ ε for all m ≥ N, so ||x_m − x|| → 0. Therefore ℓ^2 is a Banach space.

Exercise: Do this for ℓ^p, 1 ≤ p ≤ ∞.

Problems. 

Problem 1. Show that ℓ ∞ is complete. 

Problem 2. Consider M ⊂ ℓ ∞ to be the set of all sequences such that 

at most finitely many terms are non-zero. First, show that this set is 

a subspace of ℓ ∞ . Next, show that this set is not closed and hence not 

complete. 

Problem 3. Let Y be a Banach space and M ⊂ Y be a closed subspace. Show that M is a Banach space with the norm inherited from Y. (You probably have done a question like this before: Show that a closed subset of a complete space is complete.)

Problem 4. Show that ℓ p is separable for 1 ≤ p < ∞. Hint: You need 

to construct a countable dense set. Recall that the rationals are dense 

in the reals.


Problem 5. Show that ℓ ∞ is not separable. Hint: Given any real 

number in [0, 1], one can write a binary representation as a string of 

ones and zeros. Consider strings of this form (justify that they are in 

ℓ ∞ ). How many such strings are there? What is the minimum distance 

between two such strings? 

Problem 6. Let d(x,y) = ||x − y|| be a metric induced by a norm. 

Prove that this metric is translation invariant. That is, show that 

d(x+z,y +z) = d(x,y). Furthermore, for any α ∈ C (or R, depending 

on the field of scalars), show that d(αx,αy) = |α|d(x,y). 

Let S be the space of all sequences of scalars. That is, x ∈ S if x = (x(1), x(2), ..., x(j), ...). Consider the metric given by d(x,y) = Σ_{j=1}^∞ (1/2^j) · |x(j) − y(j)| / (1 + |x(j) − y(j)|). Is this metric induced by a norm?

A Few Inequalities for ℓ p 

It was mentioned above that the ℓ p spaces are Banach spaces. How- 

ever, we glossed over the actual step of showing that they even form 

normed vector spaces. It isn’t too difficult to verify that the ℓ p norms 

satisfy all of the norm properties, save for the triangle inequality. This 

requires something known as Minkowski’s Inequality. However, this 

relies on an inequality of vast importance in the theory of Lebesgue 

integrals in L p spaces known as Hölder’s inequality. Before that, we 

require a lemma. When working with ℓ^p spaces, one often is interested in the space ℓ^q where 1/p + 1/q = 1. We say that p and q are conjugate exponents.

Lemma 1. Let a,b ≥ 0 and λ ∈ (0, 1). Then, a^λ b^{1−λ} ≤ λa + (1 − λ)b.


Proof. If b = 0 the inequality is immediate, so assume b > 0. Let t = a/b, and divide both sides by b. Then, we aim to show that t^λ ≤ λt + (1 − λ) for t ≥ 0. Consider the function t^λ − λt. Differentiating this expression gives λ(t^{λ−1} − 1), which is positive for t < 1 and negative for t > 1, so the function is maximized with the choice t = 1. At t = 1, we have 1^λ − λ = 1 − λ. Hence, t^λ − λt ≤ 1 − λ, with equality if t = 1.
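As a sanity check on Lemma 1 (not a proof!), one can evaluate the gap λa + (1 − λ)b − a^λ b^{1−λ} on a grid of values and confirm it never goes meaningfully negative. The grid and the function name below are arbitrary choices made for this sketch.

```python
def young_gap(a, b, lam):
    """Right side minus left side of Lemma 1; the lemma says this is >= 0."""
    return lam * a + (1 - lam) * b - (a**lam) * (b ** (1 - lam))


grid = [k / 4 for k in range(0, 21)]   # a, b in {0, 0.25, ..., 5}
lams = [k / 10 for k in range(1, 10)]  # lambda in {0.1, ..., 0.9}

# Smallest gap over the grid; up to rounding it should be 0, attained when a = b.
worst = min(young_gap(a, b, lam) for a in grid for b in grid for lam in lams)
print(worst)
```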

Theorem 2. Hölder’s Inequality. Let 1 < p < ∞ and let p and q be conjugate exponents. Then, for x ∈ ℓ^p and y ∈ ℓ^q, Σ_{n=1}^∞ |x(n)y(n)| ≤ (Σ_{n=1}^∞ |x(n)|^p)^{1/p} (Σ_{n=1}^∞ |y(n)|^q)^{1/q}.

Proof. Let x = (x(n)) ∈ ℓ^p and let y = (y(n)) ∈ ℓ^q. The inequality is equivalent to showing that ||xy||_1 ≤ ||x||_p ||y||_q, where xy denotes the sequence (x(n)y(n)). The inequality holds trivially if x(n) = 0 for all n (or if y(n) = 0 for all n), so assume neither sequence is identically zero. For now, let’s simplify the problem and assume that ||x||_p = ||y||_q = 1. For each n, let a = |x(n)|^p and b = |y(n)|^q. Then, Lemma 1 with λ = 1/p (so that 1 − λ = 1/q) gives |x(n)y(n)| ≤ (1/p)|x(n)|^p + (1/q)|y(n)|^q. If we sum both sides over n, we arrive at ||xy||_1 ≤ (1/p)||x||_p^p + (1/q)||y||_q^q = 1/p + 1/q = 1. This holds for all normalized x ∈ ℓ^p, y ∈ ℓ^q. The normalization is equivalent to dividing each term x(n) by ||x||_p. So, to arrive at the inequality for non-normalized a ∈ ℓ^p, b ∈ ℓ^q, we note that this inequality holds for x = a/||a||_p and y = b/||b||_q. Then, ||xy||_1 ≤ 1 implies ||ab||_1 / (||a||_p ||b||_q) ≤ 1. Multiplying by the denominator yields the desired result.
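To get a feel for Hölder’s inequality, here is a minimal numerical sketch on finite truncations of two sequences. The particular sequences, the truncation length, and the helper names are assumptions made for illustration only; the point is just to compare the two sides for a few conjugate pairs p, q.

```python
def p_norm(x, p):
    """The l^p norm of a finite list, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_sides(x, y, p):
    """Return (sum |x(n) y(n)|, ||x||_p * ||y||_q) with q the conjugate exponent of p."""
    q = p / (p - 1.0)
    lhs = sum(abs(s * t) for s, t in zip(x, y))
    rhs = p_norm(x, p) * p_norm(y, q)
    return lhs, rhs


terms = 50
x = [1.0 / n for n in range(1, terms + 1)]
y = [(-1) ** n / n**0.5 for n in range(1, terms + 1)]

for p in (1.5, 2.0, 3.0):
    print(p, holder_sides(x, y, p))
```

With p = 2 and x = y the two sides agree, which is the familiar equality case of Cauchy-Schwarz.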

Now that we have this inequality, we can prove the triangle inequality 

for ℓ p spaces. This is known as Minkowski’s inequality. 

Theorem 3. Let 1 ≤ p ≤ ∞ and let x,y ∈ ℓ^p. Then, ||x + y||_p ≤ ||x||_p + ||y||_p.

Proof. The cases p = 1 and p = ∞ follow directly from the triangle inequality for scalars, so let 1 < p < ∞. To start, we notice that |x(n) + y(n)|^p = |x(n) + y(n)| |x(n) + y(n)|^{p−1} ≤ (|x(n)| + |y(n)|)|x(n) + y(n)|^{p−1}; all we did was utilize the regular triangle inequality for scalars. Now, we may sum both sides to get Σ_{n=1}^∞ |x(n) + y(n)|^p ≤ Σ_{n=1}^∞ |x(n)||x(n) + y(n)|^{p−1} + Σ_{n=1}^∞ |y(n)||x(n) + y(n)|^{p−1}. We can now apply Hölder’s inequality to each sum on the right, choosing q to be the conjugate exponent to p. Therefore, we arrive at Σ_{n=1}^∞ |x(n) + y(n)|^p ≤ ||x||_p || |x + y|^{p−1} ||_q + ||y||_p || |x + y|^{p−1} ||_q. Factoring yields ||x + y||_p^p ≤ (||x||_p + ||y||_p)(Σ_{n=1}^∞ |x(n) + y(n)|^{(p−1)q})^{1/q}. Notice that 1/p = 1 − 1/q, which tells us that (p − 1)q = p. Consequently, we have Σ_{n=1}^∞ |x(n) + y(n)|^p ≤ (||x||_p + ||y||_p)(Σ_{n=1}^∞ |x(n) + y(n)|^p)^{1/q}. If the sum on the left is zero there is nothing to prove; otherwise, division by the summation term on the right yields ||x + y||_p = (Σ_{n=1}^∞ |x(n) + y(n)|^p)^{1 − 1/q} ≤ ||x||_p + ||y||_p.

With this, we have the triangle inequality for ℓ p spaces and we can 

conclude that ℓ p spaces are Banach spaces. 
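It is worth seeing why the restriction p ≥ 1 matters for the triangle inequality. The sketch below (helper names are hypothetical) evaluates ||x||_p + ||y||_p − ||x + y||_p for the two vectors (1, 0) and (0, 1): the gap is nonnegative for p ≥ 1, but the same formula with p = 1/2 gives a negative gap, so it does not define a norm.

```python
def p_norm(x, p):
    """The expression (sum |x(n)|^p)^(1/p) for a finite list; a norm only when p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def minkowski_gap(x, y, p):
    """||x||_p + ||y||_p - ||x + y||_p; Minkowski says this is >= 0 when p >= 1."""
    total = [s + t for s, t in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(total, p)


x = [1.0, 0.0]
y = [0.0, 1.0]
for p in (0.5, 1.0, 1.5, 2.0, 3.0):
    print(p, minkowski_gap(x, y, p))
```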

From Finite to Infinite Dimension 

Recall from linear algebra that a vector space X is finite dimensional if the largest linearly independent set has at most n vectors for some finite n. If X is a vector space where this does not occur, then we say that X is infinite dimensional. All spaces (besides C^n) introduced in the previous section are infinite dimensional. (Why?) We know from linear algebra that if the dimension of a space is n, then a set of n linearly independent vectors serves as a basis. In infinite dimensions, such a basis would necessarily be infinite. However, it is imperative that the reader note that if we say a set Y = {x_1, x_2, ...} is a basis for X (of infinite dimension), then this means for every x ∈ X, there exists a finite (!) set of x_{n_i} and scalars c_i such that x = Σ_{i=1}^k c_i x_{n_i}. The linear algebraic definition of linear combinations only permits finite linear combinations, not infinite series of such. That does not mean one should disregard the possibility of generalizing such a notion.

We know that if X is a Banach space and x_n → x, then we mean ||x_n − x|| → 0. With this notion of convergence, we can generalize our notion of infinite series from calculus. Let (x_n) be a sequence in X and define the partial sums S_n = Σ_{i=1}^n x_i. We say that this series converges if (S_n) is a Cauchy sequence (equivalently, since X is complete, if there exists S ∈ X such that ||S_n − S|| → 0), and we say that S is the sum of the infinite series.

Definition 4. Let X be a Banach space and let (e_n) be a sequence in X. We say that (e_n) is a Schauder basis for X if given any x ∈ X, there exists a unique sequence of scalars (α_n) such that x = Σ_{n=1}^∞ α_n e_n; that is, ||x − Σ_{n=1}^k α_n e_n|| → 0 as k → ∞.

This is a generalization of a basis for infinite dimensional spaces. It is quite handy for ℓ^p with 1 ≤ p < ∞. Any x ∈ ℓ^p can be thought of as a p-summable sequence, x = (x(1), x(2), ..., x(n), ...). If we define the basis vectors e_1 = (1, 0, 0, ...), e_2 = (0, 1, 0, 0, ...), ..., e_n = (0, 0, ..., 1, 0, 0, ...), we obtain a Schauder basis and x = Σ_{i=1}^∞ x(i)e_i. Notice that if a Banach space X has a Schauder basis, it is automatically separable (Why?). On the other hand, given a separable Banach space, does there exist a Schauder basis? Intuitively, one might guess yes. However, this is not true. This was a big open problem, which was solved in the negative by Per Enflo in 1973.


On a more concrete level, let’s take a look at ℓ^∞. We know that ℓ^∞ is not separable (see exercises), so it cannot have a Schauder basis. However, since the canonical Schauder basis {e_n} works so well for ℓ^p, what’s going wrong in ℓ^∞? Consider x ∈ ℓ^∞ given by x = (1, 1, 1, 1, ...). Then, x_k = Σ_{n=1}^k e_n is the appropriate approximation, but ||x − Σ_{n=1}^k e_n||_∞ = 1 for every k. Consequently, this is not converging in norm to x!
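The contrast between ℓ^2 and ℓ^∞ can be seen numerically by looking at what is left of x after subtracting its kth partial sum Σ_{i≤k} x(i)e_i. In the sketch below, infinite sequences are replaced by long finite lists (an assumption of the illustration, as are the helper names): for x(n) = 1/n the leftover shrinks in the 2-norm, while for x = (1, 1, 1, ...) the leftover has sup-norm 1 no matter how large k is.

```python
import math


def norm_2(v):
    return math.sqrt(sum(t * t for t in v))


def norm_sup(v):
    return max(abs(t) for t in v)


def truncation_error(x, k, norm):
    """Norm of x minus its k-th partial sum, i.e. x with its first k entries zeroed."""
    residual = [0.0] * k + x[k:]
    return norm(residual)


terms = 10_000  # finite stand-in for an infinite sequence
harmonic = [1.0 / n for n in range(1, terms + 1)]
ones = [1.0] * terms

for k in (10, 100, 1_000):
    print(k, truncation_error(harmonic, k, norm_2), truncation_error(ones, k, norm_sup))
```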

So far, it appears that we can naturally extend familiar notions from 

finite dimensions to infinite. However, the big jump between finite 

dimension and infinite is the change in topology. Recall that sequential 

compactness and compactness are equivalent notions on metric spaces, 

and since we are interested solely in normed (hence metric) spaces, we 

may take as a definition that compactness is sequential compactness. 

Definition 5. Let X be a normed space and M ⊂ X. We say that M 

is compact if given any sequence xn ∈ M, there exists a convergent (in 

M) subsequence. 

It follows that if M is compact, then M is necessarily closed and 

bounded. (Why? To see boundedness, choose a sequence such that 

||xn|| grows monotonically arbitrarily large. Why does this not have 

a convergent subsequence?) We know by Heine-Borel that in R^n compactness is equivalent to a set being closed and bounded. It is also true that in finite dimensional spaces, a set is compact iff it is closed and bounded.


Theorem 4. Let X be a finite dimensional normed space and M a subset of X. Then, M is compact iff M is closed and bounded.

Proof. If M is compact, it follows from above that M is closed and bounded. Conversely, let M be closed and bounded and consider an arbitrary sequence (x_m) in M. Since this space is finite dimensional, we can choose a convenient basis e_1, e_2, ..., e_n, and write x_m = α_m(1)e_1 + α_m(2)e_2 + ... + α_m(n)e_n. Then, for each i, (α_m(i)) is a bounded sequence of scalars (since M is bounded), and hence has a convergent subsequence by Bolzano-Weierstrass. Passing to a subsequence for i = 1, then to a further subsequence for i = 2, and so on up to i = n (a finite form of a ’diagonalization’ argument), we obtain a single subsequence (x_{m_k}) along which α_{m_k}(i) → α(i) for every i. Define x = α(1)e_1 + α(2)e_2 + ... + α(n)e_n. Then x_{m_k} → x. Since (x_{m_k}) is convergent and M is closed, the limit x lies in M. Hence, we have a subsequence converging in M.

However, in infinite dimensions, a closed and bounded set is not nec- 

essarily compact. In fact, the closed unit ball in an infinite dimensional 

space is necessarily not compact! To show this, we need a technical 

lemma first. The following proof is from Kreyszig’s excellent textbook. 

Lemma 2. Riesz’s Lemma: [3] Let X be a normed space and let Y,Z be subspaces such that Y is closed and Y is strictly contained in Z. Given any α ∈ (0, 1), there exists z ∈ Z with ||z|| = 1 such that ||z − y|| ≥ α for all y ∈ Y.

Proof. Let v ∈ Z − Y. We can define the distance from v to Y by d = inf_{y∈Y} ||v − y||. Since Y is closed and v ∉ Y, we have d > 0. By the definition of the infimum, we can find a sequence y_n ∈ Y such that ||v − y_n|| → d. Consequently, we can find y_0 ∈ Y such that d ≤ ||v − y_0|| ≤ d + ε for arbitrary ε > 0. Since α < 1, we have d/α > d, so we can find y_0 ∈ Y such that d ≤ ||v − y_0|| ≤ d/α. If we define z = (v − y_0)/||v − y_0||, then clearly z ∈ Z and ||z|| = 1. Additionally, for any y ∈ Y, ||z − y|| = (1/||v − y_0||) ||v − y_0 − ||v − y_0|| y||. Since Y is a subspace, y_0 + ||v − y_0|| y ∈ Y (call it y_1), and hence ||z − y|| = (1/||v − y_0||) ||v − y_1|| ≥ d/||v − y_0|| ≥ d/(d/α) = α. So, ||z − y|| ≥ α.

In the following proof, we will be using the fact that finite dimen- 

sional subspaces are closed. This fact I have not proved in this section, 

but this will be proven once we know the Hahn-Banach theorem. There 

are ways of proving it without resorting to such drastic measures, but 

I prefer it my way. For now, accept on faith that finite dimensional 

subspaces are closed. 

Theorem 5. Let X be an infinite dimensional normed space. Then, 

the closed unit ball B is not compact. 

Proof. We can start by picking a point x_1 ∈ B such that ||x_1|| = 1. Then, this generates a one dimensional subspace, which is closed. Since X is infinite dimensional, this subspace is proper, so by Riesz’s Lemma (with α = 1/2) we can choose an x_2 such that ||x_2 − x_1|| ≥ 1/2 and ||x_2|| = 1. Now, consider the subspace spanned by x_1 and x_2. This is still finite dimensional, and hence closed. Therefore, using our previous lemma, we can find an x_3 of norm one such that ||x_3 − x_2|| ≥ 1/2 and ||x_3 − x_1|| ≥ 1/2. Iterating this procedure we get a sequence x_n in B which has no Cauchy subsequence, because ||x_n − x_m|| ≥ 1/2 whenever n ≠ m. So, this sequence can’t possibly have a convergent subsequence. Therefore, the closed unit ball is not compact.
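In ℓ^2 one does not even need Riesz’s Lemma to see this: the unit vectors e_n already form a sequence in the closed unit ball whose members are all at distance √2 from one another. The sketch below checks this on the first few e_n, viewed as vectors in a finite dimensional slice of ℓ^2 (the truncation and the helper names are assumptions of the illustration).

```python
import math
from itertools import combinations


def e(n, dim):
    """The n-th standard unit vector (0-indexed), truncated to `dim` entries."""
    return [1.0 if i == n else 0.0 for i in range(dim)]


def dist_2(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


dim = 8
# Collect the distinct pairwise distances between e_0, ..., e_7.
distances = {round(dist_2(e(i, dim), e(j, dim)), 12) for i, j in combinations(range(dim), 2)}
print(distances)
```

Since every pair is the same distance apart, no subsequence can be Cauchy, which is exactly the obstruction used in the proof of Theorem 5.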


An Introduction to Linear Operators 

In the setting of linear algebra and finite dimensions, we are familiar with mappings between two finite dimensional spaces (say C^n → C^m). With a chosen basis, we can represent linear mappings as matrices. Our goal here is to generalize the notion of linear mappings between two vector spaces of arbitrary dimension. We call a mapping between two vector spaces an operator. From here on out, assume (unless explicitly stated otherwise) that X and Y refer to normed vector spaces.

Definition 6. Let X,Y be vector spaces. We say that an operator T 

is a linear operator if T : X → Y is a linear mapping. That is, for any 

scalars α,β, and any x,y ∈ X, T(αx + βy) = αTx + βTy. 

It is worth noting that this immediately implies that a linear operator 

takes zero to zero. The domain of an operator need not be the whole 

space X and the range is not necessarily all of Y . We can extend the 

notion of a kernel from linear algebra by saying the kernel (also called 

null space) of an operator is given by {x ∈ X : Tx = 0}. If we let 

X,Y be finite dimensional, then the linear operators are exactly the 

matrices mapping between them. We have more interesting operators 

on possibly infinite dimensional spaces. Our goal is to extend notions 

from analysis and topology to the infinite dimensional case. Since we 

have mappings between spaces, we can try to extend the notion of 

continuity and boundedness.


Definition 7. Let X,Y be normed vector spaces with norms given by || · ||_1 and || · ||_2 respectively. We say that a linear operator T : X → Y is bounded if there exists a constant C ≥ 0 such that ||Tx||_2 ≤ C||x||_1 for all x ∈ X.

We often may drop the subscripts, which we hypothesize will not 

cause too much confusion. However it is important to keep in mind 

that the norm on Tx is the Y norm and the norm on x is the X norm. 

With this definition, we can define a new vector space given by the 

collection of bounded operators between two normed spaces. 

Definition 8. Let X,Y be normed vector spaces. Then, define B(X,Y) = {T | T : X → Y, T linear and bounded}. Then, B(X,Y) is a vector space.

Given T,S ∈ B(X,Y), we have ||(αT + βS)x|| ≤ |α| ||Tx|| + |β| ||Sx|| ≤ |α| C_T ||x|| + |β| C_S ||x|| = C||x||, where C = |α| C_T + |β| C_S. Consequently, for arbitrary scalars α,β, we have that αT + βS is also a bounded operator. Hence, B(X,Y) is a vector space. Now that we have a vector space, can we extend other notions that we have introduced? Can we make a normed space out of B(X,Y)?

Definition 9. We define the operator norm ||T|| = sup{||Tx|| : x ∈ X, ||x|| = 1}.

The operator norm is a well-defined norm (see problems) and with 

this, we have that B(X,Y ) is a normed vector space. Before moving 

onto more complex topics, it may be worth exploring some examples 

of bounded and unbounded operators.


Example 3. Consider C[a,b] with the supremum norm and define T : C[a,b] → C[a,b] by (Tf)(t) = ∫_a^t f(x) dx. This operator is certainly linear (since integration is linear), and we know that the function defined by Tf is still continuous (hence in C[a,b]). Note that ||Tf|| = sup_{t∈[a,b]} |∫_a^t f(x) dx| ≤ sup_{t∈[a,b]} ∫_a^t ||f|| dx ≤ ∫_a^b ||f|| dx = (b − a)||f||. Hence, T is a bounded operator with norm at most (b − a).
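The bound ||Tf|| ≤ (b − a)||f|| for the integration operator can be checked numerically. The following is a rough sketch under stated assumptions: the integral is approximated by a cumulative trapezoid rule, the sup norm by a maximum over a grid, and the interval [0, 2], the test function cos(3x), and all names are chosen only for illustration. The constant function f = 1 shows that the bound (b − a) is actually attained.

```python
import math


def integral_operator(f, a, b, steps=2_000):
    """Grid values of (Tf)(t) = integral of f from a to t, via the trapezoid rule."""
    h = (b - a) / steps
    values = [0.0]
    for k in range(steps):
        left, right = a + k * h, a + (k + 1) * h
        values.append(values[-1] + 0.5 * h * (f(left) + f(right)))
    return values


def sup_norm_on_grid(f, a, b, steps=2_000):
    """Grid approximation of sup |f(t)| over [a, b]."""
    h = (b - a) / steps
    return max(abs(f(a + k * h)) for k in range(steps + 1))


a, b = 0.0, 2.0


def g(x):
    return math.cos(3.0 * x)


norm_g = sup_norm_on_grid(g, a, b)
norm_Tg = max(abs(v) for v in integral_operator(g, a, b))
print(norm_Tg, (b - a) * norm_g)
```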

Example 4. Let P ⊂ C[0, 1] be the set of all polynomials and let P inherit the supremum norm. This gives a normed vector space. We can define the differentiation operator T, which takes the polynomial t^n to nt^{n−1}. Define the sequence p_n(t) = t^n. Then (Tp_n)(t) = nt^{n−1}, so ||Tp_n|| = n. Since ||p_n|| = 1, we have ||Tp_n|| = n||p_n||. Therefore, we can’t find a C such that ||Tp|| ≤ C||p|| for all p ∈ P.
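The unboundedness in Example 4 is easy to see in numbers. In the sketch below (grid-based sup norms and hypothetical helper names, for illustration only), the ratio ||Tp_n|| / ||p_n|| for p_n(t) = t^n on [0, 1] grows with n, so no single constant C can work.

```python
def sup_norm_on_grid(g, steps=1_000):
    """Grid approximation of sup |g(t)| over [0, 1]; the grid includes t = 1."""
    return max(abs(g(k / steps)) for k in range(steps + 1))


def ratio(n):
    """||T p_n|| / ||p_n|| for p_n(t) = t^n, where T is differentiation."""
    def p(t):
        return t**n

    def dp(t):
        return n * t ** (n - 1)

    return sup_norm_on_grid(dp) / sup_norm_on_grid(p)


for n in (1, 5, 50, 500):
    print(n, ratio(n))
```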

The previous two operators are very important; the study of dif- 

ferential equations and integral equations naturally relies on these two 

operators. The unboundedness of the differentiation operator can make it rather unwieldy, but it also leads to a very interesting theory of unbounded operators.

Now that we have a concept of boundedness of a linear mapping, 

can we extend the notion of continuity? Generalizing continuity from 

functions, we can say that a linear operator T is continuous if given 

any convergent sequence x_n → x, we have Tx_n → Tx. With linear operators, a remarkable fact emerges: boundedness and continuity are equivalent notions (which is not the case for general functions!).


Theorem 6. Let T : X → Y be linear. Then, T is continuous iff T is 

bounded. 

Proof. Let T be bounded. Then, if x_n → x, we have ||T(x_n − x)|| ≤ ||T|| ||x_n − x|| → 0, and hence Tx_n → Tx. Hence, T is continuous. Conversely, let T be continuous at some point x_0. Then, for any ε > 0, there is a δ > 0 such that ||Tx − Tx_0|| ≤ ε provided that ||x − x_0|| ≤ δ. Choose an arbitrary nonzero y ∈ X, and let x = x_0 + δ y/||y||. Then, ||x − x_0|| = δ, so ||T(x − x_0)|| ≤ ε, but ||T(x − x_0)|| = (δ/||y||) ||Ty||. So, ||Ty|| ≤ (ε/δ) ||y||. Hence, T is bounded.

δ 

Notice that the proof above actually only used continuity at some 

point x0, and we showed that T was bounded for arbitrary y. It follows 

that continuity at a single point is equivalent to continuity everywhere, 

another bizarre feature of linear operators. Now that we have some 

grasp on bounded linear operators, what can we say about the space 

of bounded linear operators between two normed spaces? Is this space ever complete? This is actually possible, and the only requirement is that Y be complete (the domain space X need not be complete!).

Theorem 7. Let X be a normed space and Y a Banach space. Then 

B(X,Y ) is a Banach space. 

Proof. Let (T_n) be a Cauchy sequence in B(X,Y) and fix ε > 0. Then ||T_n − T_m|| ≤ ε for n,m large enough. Then, for any x ∈ X, ||T_n(x) − T_m(x)|| ≤ ||T_n − T_m|| ||x|| ≤ ε||x||. If we define y_n = T_n(x), then ||y_n − y_m|| ≤ ε||x||, and hence (y_n) is a Cauchy sequence in Y. But, Y is a Banach space, and hence y_n converges to some y. We can define an operator Tx = y in this manner for each x. It follows that T is linear since T(x + z) = lim T_n(x + z) = lim T_n x + lim T_n z = Tx + Tz, and similarly T(αx) = αTx. Then, for any x and n,m large enough, we have ||T_m x − T_n x|| ≤ ε||x||. Taking limits on m allows us to arrive at ||Tx − T_n x|| ≤ ε||x||, and hence T − T_n is a bounded operator with ||T − T_n|| ≤ ε. Consequently, T = (T − T_n) + T_n ∈ B(X,Y). Also, since ε was arbitrary, ||T_n − T|| → 0, and hence T_n → T. Therefore B(X,Y) is complete.


Problem 7. Show that the kernel of a linear operator is a vector space. 

Show that the kernel of a bounded linear operator is closed. Also show 

that the range of a linear operator is a vector space. 

Problem 8. Show that B(X,Y ) is a normed vector space (assume 

X,Y normed vector spaces). 

Problem 9. Let k(x,y) be a continuous function on [0, 1] × [0, 1]. Define T : C[0, 1] → C[0, 1] by (Tf)(y) = ∫_0^1 k(x,y)f(x) dx. Show that T is a bounded linear operator.

Problem 10. Let X be a Banach space, and let T ∈ B(X,X) with ||T|| < 1. Show that (I − T) is an invertible operator and that (I − T)^{−1} = Σ_{n=0}^∞ T^n.

Problem 11. Let x ∈ ℓ^∞. Define T : ℓ^∞ → ℓ^∞ by Tx = y where y = (0, x(1), x(2), ...). Is T linear? Bounded? If so, what is the norm? Consider T : ℓ^∞ → ℓ^∞ defined by Tx = y where y(n) = x(n)/n. Show that this is a bounded linear operator. What is the norm of T?


Problem 12. Is the range of a bounded linear operator necessarily 

closed? Why or why not? Hint: Consider an operator from the previous 

problem. 

Problem 13. Let T be a linear operator with the condition that ||Tx|| ≥ b||x|| for all x ∈ X, for some constant b > 0. Show that T^{−1} exists and that T^{−1} is bounded.


Problem 14. Let S,T ∈ B(X,X). Show that ||ST || ≤ ||S||||T ||. 

An Introduction to Linear Functionals and Dual Spaces 

Now that we know a few things about bounded linear operators, we 

can consider a special class of bounded linear operators between normed 

spaces and the field of scalars. Let X be a normed space and consider 

the set of bounded operators B(X, R) (or B(X, C)). If f ∈ B(X, C), 

then f(x) is a scalar, and we say that f is a bounded linear functional. 

We call the set B(X, C) the dual space of X, written X ∗ . The study 

of normed spaces X relies heavily on exploring the nature of its dual 

space. Indeed, we will later see that using X ∗ , we can build a new 

topological structure for X called the weak topology. Before we get 

too complicated, we should note some basic facts about dual spaces. 

The first thing to note is that since C is a Banach space, we know that 

X ∗ is always a Banach space, regardless of whether or not X is. This 

follows from the last theorem from the previous section. 

Corollary 1. Let X be a normed space. Then, the set of bounded 

linear functionals X ∗ is a Banach space.


Since functionals are linear operators, we can define a norm on them using the operator norm previously defined. That is, ||f|| = sup{||f(x)|| : ||x|| = 1}. Note that ||f(x)|| is actually just |f(x)| since f(x) is a scalar

value. Consequently, |f(x)| ≤ ||f|| ||x||. Also, since linear functionals 

are operators, if the functional is bounded, it is continuous (and vice 

versa). Let’s familiarize ourselves with some common examples: 

Example 5. Let X be a normed space. Let f(x) = ||x||. Then, f is a functional, as it maps a normed space to the field of real numbers. However, we do not have linearity: the triangle inequality ||x + y|| ≤ ||x|| + ||y|| can be strict, and for x ≠ 0 we have f(−x) = ||x|| ≠ −f(x). Consequently, this is not a linear functional.

Example 6. Consider C[0, 1] with Tf = ∫_0^1 f(t) dt. Since integration is a linear operation, T is linear. T : C[0, 1] → R, and hence T is a linear functional and it is bounded. (Why?)

Example 7. Let x ∈ R^n. If we consider y^T, y ∈ R^n, then f(x) = y^T x = y · x is a bounded linear functional.

In the case of ℓ^p for 1 < p < ∞, the dual space is given by ℓ^q where 1/p + 1/q = 1. We say that p,q are conjugate exponents. For the case p = 1, the dual space is given by ℓ^∞. However, ℓ^∞ has a dual possibly much larger than ℓ^1. We will see both an explicit example of a functional on ℓ^∞ which is not in ℓ^1 and solve the problem with a quick application of a clever theorem. Let us demonstrate that the dual space of ℓ^1 is ℓ^∞.

Theorem 8. The dual space of ℓ^1 is ℓ^∞.

Proof. Let f ∈ (ℓ^1)*. From our discussion of bases, we know there is a canonical basis to choose, given by the (e_n) vectors. Then, any x ∈ ℓ^1 can be written as Σ_{n=1}^∞ x(n)e_n. Consequently, we have f(x) = Σ_{n=1}^∞ x(n)f(e_n), since f is linear and bounded (Why is this allowed?). Let y(n) = f(e_n). This defines a sequence y. Then, |y(n)| = |f(e_n)| ≤ ||f|| ||e_n|| = ||f||. Consequently, sup_n |y(n)| ≤ ||f||. So, y ∈ ℓ^∞. So, we can identify with any linear functional f ∈ (ℓ^1)* a sequence in ℓ^∞.

Now we need to show that every member of ℓ^∞ defines a linear functional in a manner as above. Let y ∈ ℓ^∞. Then, f(x) = Σ_{n=1}^∞ x(n)y(n) defines a linear functional on ℓ^1. Boundedness follows since |Σ_{n=1}^∞ x(n)y(n)| ≤ Σ_{n=1}^∞ |x(n)| |y(n)| ≤ sup_n |y(n)| Σ_{n=1}^∞ |x(n)| = ||y||_∞ ||x||_1. So, this shows that every element of ℓ^∞ defines a bounded linear functional. Therefore, we can associate the space ℓ^∞ with (ℓ^1)*. Lastly, we have ||f|| = ||y||_∞. This follows since we earlier showed |y(n)| = |f(e_n)| ≤ ||f||, so ||y||_∞ ≤ ||f||. We also have |f(x)| = |Σ_{n=1}^∞ y(n)x(n)| ≤ sup_n |y(n)| ||x||_1. Consequently, |f(x)|/||x||_1 ≤ sup_n |y(n)| for x ≠ 0, so ||f|| ≤ ||y||_∞. So, ||f|| = sup_n |y(n)| = ||y||_∞. Therefore, we have an isometric isomorphism between ℓ^∞ and (ℓ^1)*.
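The two halves of the proof can be mirrored in a small computation on finite truncations. In this sketch (the sequences, the truncation length, and the helper names are all assumptions made for illustration), a bounded sequence y acts on x by f(x) = Σ x(n)y(n); we check the bound |f(x)| ≤ ||y||_∞ ||x||_1 and recover y from the values f(e_n), as in the first half of the proof.

```python
import math


def pairing(x, y):
    """f(x) = sum of x(n) y(n), the functional on l^1 defined by y."""
    return sum(s * t for s, t in zip(x, y))


def norm_1(x):
    return sum(abs(s) for s in x)


def norm_sup(y):
    return max(abs(t) for t in y)


terms = 200  # finite stand-in for infinite sequences
x = [(-1) ** n / n**2 for n in range(1, terms + 1)]  # a summable sequence
y = [math.sin(n) for n in range(1, terms + 1)]       # a bounded sequence


def e(k):
    """The k-th standard unit vector (0-indexed), truncated to `terms` entries."""
    return [1.0 if i == k else 0.0 for i in range(terms)]


# Evaluating f on the unit vectors returns the entries of y.
recovered = [pairing(e(k), y) for k in range(terms)]
print(abs(pairing(x, y)), norm_sup(y) * norm_1(x))
```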

Now that we have seen some examples of linear functionals, a ques- 

tion arises: what can we say about how many linear functionals exist? 

Does a normed space have a rich supply of such functionals? One of 

the most important theorems in functional analysis, the Hahn-Banach 

theorem, addresses this question. First, we need to know a few terms 

before we can prove this important theorem.


Definition 10. We say that p is a sublinear functional if p : X → R satisfies p(x + y) ≤ p(x) + p(y) and p(λx) = λp(x) for λ ≥ 0. We say that p is a semi-norm if p(x + y) ≤ p(x) + p(y) and p(λx) = |λ|p(x) for all scalars λ.

The above statement should look a bit familiar: a norm (or any 

semi-norm for that matter) is a sublinear functional. The following is 

inspired largely by Folland’s proof and Kreyszig’s proof in their respec- 

tive textbooks. 

Theorem 9. Hahn-Banach Theorem: [2] [3] Let X be a real vector

space and let M be a subspace of X, and let f be a linear functional 

on M. Let p be a sublinear functional such that f(x) ≤ p(x) ∀x ∈ M. 

Then, there exists a linear functional F on X such that F(x) ≤ p(x) 

for all x ∈ X and F |M = f. 

Proof. Our first step will be to extend f to a functional defined on a
subspace whose dimension is larger by one. That is, we will define a
functional g on M + Rx, where x ∉ M, that extends f and still satisfies
g ≤ p. Once this is done, we will know that a one dimensional
extension is always possible.

To begin, let y1,y2 ∈ M. Then, f(y1) + f(y2) = f(y1 + y2) ≤
p(y1 + y2) ≤ p(y1 − x) + p(x + y2) by the subadditivity of p (its
triangle inequality property). This implies that f(y1) − p(y1 − x) ≤
p(x + y2) − f(y2) for all y1,y2 ∈ M. Consequently, sup{f(y) − p(y − x) : y ∈ M} ≤

inf{p(x + y) − f(y) : y ∈ M}. Then, there exists some number α such 

that sup{f(y)−p(y −x) : y ∈ M} ≤ α ≤ inf{p(x+y)−f(y) : y ∈ M}. 

With this, we may define g : M + Rx → R by g(y + λx) = f(y) + λα.
This is well defined since x ∉ M, so each element of M + Rx has a unique
representation y + λx. Then, g is additive since
g(y1 + λ1x + y2 + λ2x) = f(y1 + y2) + α(λ1 + λ2) =
f(y1) + αλ1 + f(y2) + αλ2 = g(y1 + λ1x) + g(y2 + λ2x),
and homogeneity follows in the same way. On the set M we have
g(y) = g(y + 0 · x) = f(y) + α · 0 = f(y), so g extends f. Consequently, on M,
g(y) ≤ p(y). However, we need to show that g(y + λx) ≤ p(y + λx) for every λ.
Note that the definition of a sublinear functional only gives p(λz) = λp(z)
for λ ≥ 0, but in M + Rx, λ ∈ R could be negative, and hence we must
account for two cases. First, let λ > 0. Since α ≤ p(x + y/λ) − f(y/λ),

g(y + λx) = λ[f(y/λ) + α] ≤ λ[f(y/λ) + p(y/λ + x) − f(y/λ)] = λ p(y/λ + x) = p(y + λx).

Now, let λ = −µ < 0. Since α ≥ f(y/µ) − p(y/µ − x),

g(y + λx) = µ[f(y/µ) − α] ≤ µ[f(y/µ) − f(y/µ) + p(y/µ − x)] = µ p(y/µ − x) = p(y + λx).

Therefore, we have g(y + λx) ≤ p(y + λx) on all of M + Rx, and
we have proven there exists a one dimensional extension of f.
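To make the choice of α concrete, here is a small sketch in R^2. Everything in it is an illustrative assumption rather than part of the proof: p is taken to be the ℓ1 norm, M = {(t, 0)}, f(t, 0) = t, the new direction is x = (0, 1), and the supremum and infimum are only sampled on a finite grid of rationals.

```python
from fractions import Fraction

def p(v):
    """Sublinear functional on R^2: the l^1 norm (an illustrative choice)."""
    return abs(v[0]) + abs(v[1])

def f(t):
    """Linear functional on M = {(t, 0)}: f(t, 0) = t, so f <= p on M."""
    return t

x = (Fraction(0), Fraction(1))                    # the new direction, not in M
grid = [Fraction(k, 4) for k in range(-40, 41)]   # sample values of t and lambda

# Sampled versions of sup{f(y) - p(y - x)} and inf{p(x + y) - f(y)}, y = (t, 0).
lower = max(f(t) - p((t - x[0], -x[1])) for t in grid)
upper = min(p((t + x[0], x[1])) - f(t) for t in grid)

def g(t, lam, alpha):
    """The extension g(y + lam*x) = f(y) + lam*alpha on M + Rx."""
    return f(t) + lam * alpha

def dominated(alpha):
    """Check g <= p at every sampled point y + lam*x of M + Rx."""
    return all(g(t, lam, alpha) <= p((t + lam * x[0], lam * x[1]))
               for t in grid for lam in grid)
```

In this example the admissible values are −1 ≤ α ≤ 1, so the extension is far from unique; any α in that interval passes the `dominated` check, while a value such as α = 2 fails it.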

Now, consider the family of all linear extensions F of f, defined on
subspaces of X containing M, that satisfy F ≤ p on their domains. We can
give this set a partial ordering by inclusion of graphs. That
is, if F1,F2 are extensions such that the domain of F1 is contained
in the domain of F2 and F1 = F2 on the domain of F1, then
F1 ≤ F2. Now, consider a chain {Fα}. Then, we have an increasing family
of domains (which are subspaces), and their union is again a subspace (Why?).
We arrive at a functional F on this union by defining F(x) = Fα(x) if x is
in the domain of Fα; this does not depend on the choice of α, because any
two members of a chain agree on the smaller of their domains. One checks
that F is linear and satisfies F ≤ p, and by construction Fα ≤ F for every α.
So, F is an upper bound for this arbitrary chain from
our partially ordered set. Therefore, we know our partially ordered set
(by Zorn’s Lemma) has at least one maximal element, call it F. It must
be that the domain of F is the whole space. If not, we could do a one
dimensional extension (as above), which would give an F′ ≥ F with F′ ≠ F,
which would contradict the maximality of F. Therefore, F is an extension of
f to the whole space which still satisfies F(x) ≤ p(x).

It is important to note that this proof works only for vector spaces
over R. The Hahn-Banach theorem can also be formulated for a
vector space over C. This merely requires a technical lemma (which we
shall omit), and the proof is then a corollary of the real version of the
Hahn-Banach theorem. However, since we often assume our field of scalars
is complex, it is worth stating the theorem:

Theorem 10. The Complex Hahn-Banach Theorem Let X be

a complex vector space, p a semi-norm on X, M a subspace, and f a 

complex linear functional such that |f(x)| ≤ p(x) ∀x ∈ M. Then, there 

exists a complex linear functional F such that |F(x)| ≤ p(x) ∀x ∈ X 

and F |M = f. 

Now that we have this powerful theorem, several useful results in- 

stantly emerge. 

Theorem 11. Let X be a normed vector space. 

a: If M is a closed subspace of X and x ∈ X \ M, there exists
f ∈ X∗ such that f(x) ≠ 0 and f|M = 0. We may choose
||f|| = 1 and f(x) = d(x,M) = inf_{y∈M} ||x − y|| = δ.

b: If x ≠ 0 in X, there exists f ∈ X∗ such that f(x) = ||x|| and
||f|| = 1.

c: The bounded linear functionals separate points.


d: If x ∈ X, we can define x ′ : X ∗ → C by x ′ (f) = f(x). We have 

that x ′ is a linear functional on X ∗ , hence x ′ ∈ (X ∗ ) ∗ . We can 

isometrically embed X ⊂ X ∗∗ . 

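Proof. For part (a), we may define f on M + Cx by f(y + λx) = λδ, where y ∈ M.
Then, f(x) = δ as desired and f|M = 0. Note that δ ≤ ||y + x|| for
any y ∈ M (since −y ∈ M), and hence for λ ≠ 0,
|f(y + λx)| = |λ|δ ≤ |λ| ||λ^{−1}y + x|| = ||y + λx||.
So, |f(z)| ≤ ||z|| for z ∈ M + Cx. If we set p(z) = ||z||, this is a
semi-norm, and hence we may apply Hahn-Banach to get a functional F
defined on X such that |F(z)| ≤ ||z||, F|M = 0, and F(x) = δ. In particular
||F|| ≤ 1. Conversely, choosing yn ∈ M with ||x − yn|| → δ gives
δ = F(x − yn) ≤ ||F|| ||x − yn|| → ||F||δ, and δ > 0 because M is closed,
so ||F|| ≥ 1. Hence ||F|| = 1.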

For part (b), simply use the functional from part (a) with M = {0}, for which δ = ||x||.

For part (c), given two points x,y with x ≠ y, there exists a functional
f such that f(x − y) = ||x − y|| > 0, so f(x) ≠ f(y), and hence X∗ separates
points in X.

For part (d), if f,g ∈ X∗, x ∈ X, then x′(αf + βg) = (αf + βg)(x) =
αf(x) + βg(x) = αx′(f) + βx′(g), and hence x′ is a linear functional
on X∗. We have |x′(f)| = |f(x)| ≤ ||f|| ||x||, so ||x′|| ≤ ||x||. But, by part (b),
there exists f such that ||f|| = 1 and f(x) = ||x||, so ||x|| = |x′(f)| ≤
||x′|| ||f|| = ||x′||. So, ||x′|| = ||x||.
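Part (b) can be made explicit in a finite dimensional case, where no appeal to Hahn-Banach is needed. The sketch below assumes the space is R^4 with the ℓ1 norm and uses a made-up vector x; the functional f(z) = ∑ sign(x(i)) z(i) then satisfies f(x) = ||x||1, and |f(z)| ≤ ||z||1 is spot-checked on a few sample vectors (this is a check on samples, not a proof that ||f|| = 1).

```python
def norm_1(v):
    return sum(abs(a) for a in v)

def sign(a):
    """Sign of a real number: -1, 0, or 1."""
    return (a > 0) - (a < 0)

def norming_functional(x):
    """Return f(z) = sum_i sign(x_i) z_i, which satisfies f(x) = ||x||_1."""
    s = [sign(a) for a in x]
    return lambda z: sum(si * zi for si, zi in zip(s, z))

x = [3, -4, 0, 2]            # hypothetical vector in (R^4, ||.||_1)
f = norming_functional(x)
value_at_x = f(x)            # 3 + 4 + 0 + 2

# |f(z)| <= ||z||_1 on a few sample vectors, consistent with ||f|| <= 1.
samples = [[1, 1, 1, 1], [-2, 5, 7, -1], [0, 0, 9, 0], [1, -1, 0, 1]]
bounded_by_norm = all(abs(f(z)) <= norm_1(z) for z in samples)
```

The coefficient vector of f has supremum norm 1, which matches the identification of the dual of ℓ1 with ℓ∞.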

An interesting question arises from part (d). When does (if ever) 

X = X ∗∗ ? We see that we can isometrically embed X as a subset of 

X ∗∗ . We say that a space is reflexive if X = X ∗∗ . Do not confuse 

this notion of reflexivity with the notion of the Alg Lat of an algebra 

equaling itself! If we recall that (ℓp)∗ = ℓq for 1 < p < ∞, where
1/p + 1/q = 1, then it
follows that (ℓp)∗∗ = (ℓq)∗ = ℓp. Consequently, ℓp is reflexive for
1 < p < ∞. However, for ℓ1, we have that its dual is ℓ∞. But, the dual
of ℓ∞ is vastly larger than ℓ1, and hence ℓ1 is not reflexive.

The following theorem is a neat application of our previous theorem. 

I recommend trying it yourself before reading the proof! Note that M̄
refers to the closure of M.

Theorem 12. Let M be a subspace of a normed space X. Then, M̄ =
∩{ker f : f ∈ X∗, M ⊂ ker f}. [1]

Proof. Let N = ∩{ker f : f ∈ X∗, M ⊂ ker f}, and let’s show first that M̄ ⊂
N. Since each f ∈ X∗ is continuous, its kernel is always closed. Consequently,
N is an intersection of closed sets that contain M.
Since one can define M̄ to be the intersection of all closed sets that
contain M, it follows that M̄ ⊂ N. Assume that the containment is
proper; that is, there exists x0 ∈ N that is not in M̄. Then, since M̄
is a closed subspace, we can find f ∈ X∗ such that f vanishes on M̄ and
f(x0) = δ = dist(x0,M̄) > 0. Since f annihilates M, the kernel of f is
one of the sets in the intersection that generates N. Hence, x0 cannot be in
N since f(x0) ≠ 0, a contradiction. Consequently, M̄ = N.


Problem 15. Let X be a normed vector space. 

a. Let M be a closed subspace of X and let x ∈ X \ M. Show that

M + Cx is closed. 

b. Let M be a finite dimensional subspace of X. Show that M is 

closed.


Problem 16. If X is a Banach space and X ∗ is separable, show that 

X is separable.[2] Hint: This problem is quite tricky. By the definition 

of separability, there exists {fn} that is countable and dense in X ∗ . 

For each n, try to find an xn ∈ X with ||xn|| = 1 such that
|fn(xn)| ≥ (1/2)||fn||. Argue that one can use these countable xn to obtain a countable

dense subset of X. 

Problem 17. Without providing a counterexample, prove that (ℓ∞)∗ ≠
ℓ1. Hint: Consider the previous question. Also, do note that we know
ℓ1 ⊂ (ℓ∞)∗.

The Three Big Theorems: Open Mapping, Closed Graph, 

and Banach-Steinhaus 

The Hahn-Banach theorem is one of the cornerstones of functional 

analysis because it gives us information about the existence of function- 

als on normed spaces. However, there are a few other major theorems 

we will need to cover. First, we will need a helpful theorem from topol- 

ogy. 

Theorem 13. The Baire Category Theorem Let X be a complete 

metric space. If {Un} is a sequence of open, dense sets in X, then ∩Un 

is also dense in X. 

Proof. A set is dense in a space if it intersects every nonempty open set in X.
Let W be an open set, W ≠ ∅. Our goal is to show that (∩Un) ∩ W ≠ ∅.
Since U1 is open and dense, U1 ∩ W is open and nonempty,
and so contains a closed ball centered about some point x0. Consequently,
there exist x0 and r0 > 0 such that the closed ball B̄(x0,r0) ⊂ W ∩ U1.
As one might suspect, we can iterate this procedure, intersecting each time
with Uj and finding xj,rj such that B̄(xj,rj) ⊂ Uj ∩ B(xj−1,rj−1), and we
may choose rj < 2^{−j} at each turn. Then, the sequence of centers xn forms
a Cauchy sequence, since xm ∈ B(xn,rn) for all m ≥ n.
Since X is complete, xn converges to some x ∈ X. For each n ≥ 1, the limit x
lies in the closed ball B̄(xn,rn) ⊂ Un, and also in B̄(x0,r0) ⊂ W, so x is
contained in W ∩ (∩_{n=1}^∞ Un).

Corollary 2. Let X be a complete metric space. Then, X is not a 

countable union of nowhere dense sets. 

Proof. Exercise. 

This theorem is a purely topological result which depends on com- 

pleteness. We know that Banach spaces are by definition complete, so 

we will utilize the Baire Category Theorem to prove results for Banach 

spaces. This moves us in a more specific direction towards Banach 

spaces, as opposed to the general work we did with normed spaces 

before. Banach spaces have wonderful properties due to their com- 

pleteness. As we discussed before, we can consider series in normed 

vector spaces. Banach spaces provide a familiar result from calculus: 

if a series is absolutely convergent in a Banach space, then the series

itself is convergent. In fact, completeness is equivalent to the previous 

statement. 

Lemma 3. Let X be a normed vector space. X is complete iff every 

absolutely convergent series is convergent.


Proof. Let X be complete and let ∑_{n=1}^∞ xn be a series that is
absolutely convergent. That is, ∑_{n=1}^∞ ||xn|| converges. Consider the
partial sums Sn = ∑_{j=1}^n xj. Then, for m > n,
||Sm − Sn|| ≤ ∑_{j=n+1}^m ||xj|| < ε for n,m large enough, since the series
is absolutely convergent. Hence, Sn is
a Cauchy sequence, and since X is complete, it converges.

On the other hand, let X have the property that every absolutely convergent
series converges. Let xn be a Cauchy sequence in X. Our goal here is to
express (a subsequence of) xn as the partial sums of a series, using the
telescoping identity xn = x1 + ∑_{j=2}^n (xj − xj−1). If we can show that
such a series is absolutely convergent, we are done. Since xn is Cauchy, we
may choose a subsequence x_{n_k} such that ||x_{n_k} − x_{n_{k−1}}|| < 2^{−k}
for k ≥ 2. Let y1 = x_{n_1} and yk = x_{n_k} − x_{n_{k−1}} for k ≥ 2. Then,
∑_{j=1}^∞ ||yj|| ≤ ||y1|| + ∑_{j=2}^∞ 2^{−j} ≤ ||y1|| + 1. Hence, the partial
sums of ∑ ||yj|| are monotonic and bounded above, and hence convergent. Since
∑ yj is absolutely convergent, by assumption it is itself convergent. But
∑_{j=1}^k yj = x_{n_k}, so this amounts to saying that x_{n_k}
is a convergent sequence. Since x_{n_k} is a
subsequence that converges, we have that xn converges to the same
limit (since xn is a Cauchy sequence).

Definition 11. Let X,Y be Banach spaces and let T : X → Y. We say that T is
open if T maps open sets to open sets. Equivalently, for every x ∈ X and
r > 0, T(B(x,r)) contains a ball centered about Tx in the space Y.

Another way of looking at this is to consider the action of T on a ball. 

Let U be an open set and let B(x,r) ⊂ U be a ball about x with radius


r. Let’s say we require that the image of every ball around a point x 

contains a ball around a point Tx. Then, if we consider an arbitrary 

open set U ⊂ X, we know that U can be written as a union of open 

balls around every point x. That is, U = ∪_{x∈U} B(x,rx). Then, with our
requirement, for each y = Tx ∈ T(U) there is a radius ry > 0 such that
B(y,ry) = B(Tx,ry) ⊂ T(B(x,rx)) ⊂ T(U). What this is
saying is that every point y ∈ T(U) has an open ball about it contained in T(U).
Consequently, T(U) is an open set if U is an open set. Therefore, if we
want to show that a map is open, we need only show that given any
open ball B(x,r) in X, the image T(B(x,r)) contains an open ball in Y about

Tx. Additionally, if we consider X,Y to be normed spaces and let T be 

linear, then to show that a map is open, we merely need to show that T 

maps the open unit ball in X to a set that contains a ball about 0 in Y . 

To see why this is so, note that since T is a linear map, T(αx) = αTx 

for all x ∈ X and T(x + y) = T(x) + T(y) by linearity, and hence we 

can conclude that T commutes with dilations and translations. That 

way, instead of showing that T maps every open ball about x to a set 

containing an open ball about Tx, we can translate and dilate the ball 

in X to the open unit ball. Therefore, all we need to do is show that 

the open unit ball in X gets mapped to a set that contains an open 

ball about 0 (Recall that T(0) = 0). 

Theorem 14. Open Mapping Theorem Let X,Y be Banach spaces 

and let T : X → Y be a surjective, bounded linear operator. Then, T 

is open. [2] 

Proof. We know that T(X) = Y and we also know that X,Y are
complete spaces. Our goal here is to show that T(B(0,1)) contains
an open ball about 0 in Y. Throughout, cl A denotes the closure of a set A.
If one considers the sequence of balls
Bn = B(0,n), then one can see that every x ∈ X is eventually
in one of the balls Bn, and hence we may write X = ∪_{n=1}^∞ Bn. Then,
we have Y = T(X) = T(∪Bn) = ∪T(Bn). Since Y is a Banach
space, Y is complete. Consequently, by the Baire Category Theorem,
Y cannot be a countable union of nowhere dense sets. Consequently, there is at
least one n such that cl T(Bn) has non-empty interior. But
T(B(0,n)) = nT(B(0,1)) (note the use of linearity), and y ↦ ny is a
homeomorphism of Y, so cl T(B(0,1)) has non-empty interior as well.
So there exist y0 ∈ Y and some
radius r > 0 such that B(y0,4r) ⊂ cl T(B(0,1)). Then, we may choose
a y1 ∈ T(B(0,1)) such that ||y1 − y0|| < 2r. Since y1 ∈
B(y0,2r), we have that B(y1,2r) ⊂ B(y0,4r) ⊂ cl T(B(0,1)). Additionally,
since y1 ∈ T(B(0,1)), there exists an x1 ∈ B(0,1) with Tx1 =
y1. Let y ∈ Y be arbitrary with ||y|| < 2r. Then, y = −Tx1 + (y + y1)
since Tx1 = y1. Since ||y|| < 2r, we have y + y1 ∈ B(y1,2r) ⊂ cl T(B(0,1)),
and so y = −Tx1 + (y + y1) ∈ cl T(−x1 + B(0,1)) ⊂ cl T(B(0,2)).
Dividing by 2 and noting
the linearity of T, we have that if ||y|| < r, then y ∈ cl T(B(0,1)). So
far, what we’ve shown is that we found an r (using the Baire Category
Theorem) such that if y ∈ Y with ||y|| < r, then y ∈ cl T(B(0,1)). We’re
very close to showing that T(B(0,1)) contains an open ball about 0.
Our problem is that so far we can only do this for the closure cl T(B(0,1)).
We need to discard the closure, and then we will have our result.


Using our dilation trick some more, we see that if ||y|| < 2^{−n}r, we
have that y ∈ cl T(B(0,2^{−n})), where cl denotes closure. Now, let
||y|| < r/2. Then, y ∈ cl T(B(0,1/2)),
and hence we can find an x1 ∈ B(0,1/2) such that ||y − Tx1|| < r/4.
Now, since ||y − Tx1|| < r/4, we know y − Tx1 is in cl T(B(0,1/4)). So, we
can find an x2 ∈ B(0,1/4) such that ||(y − Tx1) − Tx2|| < r/8.
Now, we can proceed inductively to find an xn ∈ B(0,2^{−n}) such that
||y − ∑_{j=1}^n Txj|| < 2^{−n−1}r. Consider the series ∑_{n=1}^∞ xn.
Since ||xn|| < 2^{−n},
we have that ∑_{n=1}^∞ ||xn|| < ∑_{n=1}^∞ 2^{−n} = 1. Therefore, this
series is absolutely convergent. By our previous lemma, we have that
since X is a complete space, any absolutely convergent series converges,
and hence ∑_{n=1}^∞ xn converges in X. Let the sum of the series be denoted by
x. Then, since T is continuous, ||y − Tx|| = lim ||y − ∑_{j=1}^n Txj|| = 0,
so y = Tx. Moreover ||x|| ≤ ∑ ||xn|| < 1, so y ∈ T(B(0,1)).
Consequently, T(B(0,1)) contains all y such that ||y|| < r/2.
This implies
that we have a ball about 0 contained in T(B(0,1)) and hence T is an
open map.

Recall that a function f : X → Y between two topological spaces 

is continuous if given any open V ⊂ Y , we have f −1 (V ) is open. This 

is equivalent to our definition for continuous linear operators between 

normed spaces. Now suppose T is a bijection, so that T^{−1} exists. Then
T^{−1} is continuous iff the preimage under T^{−1} of every open set
U ⊂ X is open; but this preimage is exactly T(U). Therefore, showing that
T^{−1} is bounded
is equivalent to showing that T is open. With these considerations, we
have the very useful corollary to the Open Mapping theorem:


Theorem 15. The Bounded Inverse Theorem Let X,Y be Banach
spaces and let T ∈ B(X,Y ) be a bijection. Then, T^{−1} exists and is
bounded; that is, T^{−1} ∈ B(Y,X).


We are about to approach a terminology disaster: it seems clear that 

the definition of an open linear operator maps open sets to open sets. 

One might suspect that a closed linear operator maps closed sets to 

closed sets. Unfortunately, that is not the definition. 

Definition 12. Let T : X → Y be a linear operator between two 

normed spaces. Let the graph of T be defined as G(T) = {(x,y) ∈ 

X × Y : Tx = y}. We say that T is a closed operator if G(T) is a 

closed subset of X × Y in the product topology. 

It’s a bit confusing at first to see what exactly this means: a com- 

parison between continuity and closedness is best. If T is continuous, 

then given any convergent sequence xn ∈ X, we have that Txn is a 

convergent sequence in Y . If T is a closed operator, it does not follow 

that xn → x implies Txn → Tx. However, let’s say xn → x and that 

Txn does converge to something in Y , say Txn → y. Then, if T is 

closed, it follows that y = Tx. Therefore, to show that Txn → Tx, 

we first need to know that Txn is a convergent sequence in Y . We see 

that if T is bounded (continuous), then, automatically T is closed. So, 

closed linear operators generalize the notion of bounded linear operators.
Why are they worth the trouble? Well, it turns out that our
favorite unbounded operator, d/dx, is a closed linear operator on certain
Banach spaces. Due to the importance of this operator in differential


equations, it seems fair that closed operators deserve a bit of attention.
In the applied world, physics, especially quantum mechanics,
deals with unbounded linear operators that are closed (such as the
differentiation operator). Although not bounded, closed linear operators
still have some acceptable behavior, notably that there are many
positive results about them in spectral theory. So far, we see that being
bounded implies being closed. Fortunately, a closed linear operator
T : X → Y is bounded if it is defined on all of X and X and Y are Banach spaces.
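To see why d/dx deserves to be called unbounded with respect to the supremum norm, here is a rough numerical sketch. The test functions x^n and the sampling grid are my own illustrative choices, and the supremum over [0, 1] is only approximated by a maximum over sample points (which happens to be exact here because both maxima occur at the endpoint 1).

```python
def sup_norm(h, points):
    """Approximate the supremum norm on [0, 1] by a maximum over sample points."""
    return max(abs(h(t)) for t in points)

points = [k / 1000 for k in range(1001)]   # grid on [0, 1], includes the endpoint 1

def monomial(n):
    """The function t -> t^n."""
    return lambda t: t ** n

def monomial_derivative(n):
    """Its derivative t -> n t^(n-1)."""
    return lambda t: n * t ** (n - 1)

ratios = []
for n in (1, 5, 25, 125):
    size = sup_norm(monomial(n), points)               # equals 1, attained at t = 1
    slope = sup_norm(monomial_derivative(n), points)   # equals n, attained at t = 1
    ratios.append(slope / size)                        # grows without bound in n
```

Since the ratio ||f′||∞ / ||f||∞ can be made as large as we like, no constant C can satisfy ||f′||∞ ≤ C||f||∞ for all continuously differentiable f.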

Theorem 16. Closed Graph Theorem Let X,Y be Banach spaces 

and let T : X → Y be a closed linear operator. Then, T is bounded. [2] 

Proof. By the definition of the product topology, the projection
π1 : X × Y → X given by π1(x,y) = x is a continuous mapping.
The same holds for π2 : X × Y → Y, π2(x,y) = y. We know that X and Y are
Banach spaces, and hence X,Y are both complete spaces. The product
of two complete spaces is complete. By assumption, T is a closed
linear operator, and hence G(T) is a closed subspace of X × Y; therefore
G(T) is itself a Banach space. Restricting the projections to the graph,
π1 ∈ B(G(T),X) and π2 ∈ B(G(T),Y ). We have that π1 is
one to one and onto as a map from G(T) to X, with π1^{−1}(x) = (x,Tx).
Notice that Tx = π2(π1^{−1}(x)). Consequently, T = π2 ◦ π1^{−1}. Since π1
is a bounded bijection between Banach spaces, we have that π1^{−1} is
bounded by the Bounded Inverse
Theorem. Then, T = π2 ◦ π1^{−1} is a bounded operator.

So far we have hit two of the big theorems in functional analysis, and
one more remains. The Banach-Steinhaus theorem, also known as the Principle

of Uniform Boundedness, is an extraordinarily powerful theorem that


allows one to jump from pointwise estimates on the norm of an operator 

to a uniform estimate on the value of the operator norm. As you will see 

from doing the problems, this theorem makes short work of otherwise 

daunting exercises. 

Theorem 17. The Banach-Steinhaus Theorem Let X be a Banach
space and Y a normed vector space and let A ⊂ B(X,Y ). If
sup_{T∈A} ||Tx|| < ∞ for all x ∈ X, then sup_{T∈A} ||T|| < ∞.

Proof. Let En = {x ∈ X : sup_{T∈A} ||Tx|| ≤ n} = ∩_{T∈A} {x ∈ X : ||Tx|| ≤ n}.
Then En is a closed set since it is an intersection of closed sets, and
X = ∪En by hypothesis. Since X is a Banach space, X is complete, and by the Baire
Category theorem, at least one set En is not nowhere dense. Consequently,
since En is closed, it contains an open ball, and hence we may
find a closed ball inside of it. So, let’s denote this ball by B̄(x0,r) ⊂ En
for some r > 0. Let x ∈ X satisfy ||x|| ≤ r, so x + x0 ∈ En. We have
||Tx|| = ||T(x + x0) − Tx0|| ≤ ||T(x + x0)|| + ||Tx0|| ≤ n + n = 2n.
This holds for all T ∈ A and all x ∈ X with ||x|| ≤ r, since x + x0 and
x0 both lie in B̄(x0,r) ⊂ En. So, B̄(0,r) ⊂ E2n. For any x ≠ 0, the vector
rx/||x|| has norm r, so by linearity

||Tx|| = (||x||/r) ||T(rx/||x||)|| ≤ (2n/r)||x||.

Hence ||T|| = sup_{x≠0} ||Tx||/||x|| ≤ 2n/r for every T ∈ A, and so
sup_{T∈A} ||T|| ≤ 2n/r < ∞.

Problem 18. Consider the Banach space C[0,1] with the supremum
norm. Consider the subset of C[0,1] of once continuously differentiable
functions, C^1[0,1]. [2]

a. Show that C^1[0,1] is not a closed subset of C[0,1] and hence not
complete.

b. Consider the operator d/dx : C^1[0,1] → C[0,1]. Show that this is
a closed linear operator.

Problem 19. Let X be a Banach space with respect to two different 

norms, || · ||1 and || · ||2, with the property that ||x||1 ≤ ||x||2. Show 

that these norms are equivalent norms. That is, there exists constants 

A,B such that A||x||1 ≤ ||x||2 ≤ B||x||1. [2] 

Problem 20. Let X,Y be Banach spaces. Let T : X → Y be a linear
map such that given any f ∈ Y∗, f ◦ T ∈ X∗. Show that T is a bounded
operator. [2]

Problem 21. Let X,Y be Banach spaces and let Tn be a sequence of 

bounded operators such that limTnx exists for all x ∈ X. Show that 

the operator defined by the pointwise limit is both linear and bounded. 

[2] 

Problem 22. Let X be a vector space of countably infinite dimension. 

Show that there is no norm such that this space is complete. Hint: 

Remember my warning from before: linear algebraic bases only allow 

finite combinations!. [2] 

Problem 23. Let X be a Banach space and let {xn} be a sequence such

that the set {f(xn)} is bounded for all f ∈ X ∗ . Show that {||xn||} is 

bounded. Big Hint: Look back to the consequence of the Hahn-Banach 

theorem. Remember that we can isometrically embed X ⊂ X ∗∗ . [3]


Problem 24. Define Tn = S^n where S : ℓ2 → ℓ2 is given by Sx =
S(x(1),x(2),...) = (x(2),x(3),...). Bound ||Tnx|| and calculate
lim ||Tn||. [3]

Topologies and Functionals: Weak and Weak-* 

Topologies 

As we’ve seen in our previous analysis courses, occasionally we want 

to be a bit more flexible with convergence. For example, although 

uniform convergence of a sequence of functions is wonderful, sometimes 

this is too restrictive and we can use a weaker form of convergence, such 

as pointwise convergence. When you study integration theory, you will
see several types of convergence, such as convergence in measure,
convergence in L1, pointwise convergence, and pointwise convergence almost
everywhere. Analysis is full of different
types of convergence, and each has its uses. We will explore a

topology built by linear functionals known as the weak topology. 

Definition 13. We say that a sequence (xn) in X converges weakly to
x ∈ X if f(xn) → f(x) for all f ∈ X∗.

It follows then that if xn → x in the usual sense (in the norm- 

topology), then xn is weakly convergent (Why?). From a topological 

viewpoint, the norm topology generates a collection of open sets from 

the open balls; call this τN. The weak topology is a weaker topology 

τW. Being weaker implies that τW ⊂ τN. Another way of viewing 

this topology is that it is the weakest topology on X such that the 

functionals in X∗ remain continuous. That is, τW is generated by
the sets f^{−1}(U) for all f ∈ X∗ and all open sets U of scalars. If this is confusing,


don’t worry: the main thing to understand is that weak convergence 

means that f(xn) → f(x) for all f ∈ X ∗ . 
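A standard illustration of the gap between the two notions is the sequence of unit vectors en in ℓ2. The sketch below rests on several assumptions of mine: it uses the identification of the dual of ℓ2 with ℓ2, truncates everything to 200 coordinates, and tests only the single functional given by y(k) = 1/k. It shows f(en) = y(n) shrinking toward 0 while ||en|| stays equal to 1 and the en stay a fixed distance apart.

```python
import math

def e(n, length):
    """n-th standard unit vector (1-indexed), truncated to `length` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

def pair(x, y):
    """The functional f_y(x) = sum_k x(k) y(k)."""
    return sum(a * b for a, b in zip(x, y))

def norm_2(x):
    return math.sqrt(pair(x, x))

length = 200
y = [1.0 / k for k in range(1, length + 1)]   # y(k) = 1/k is square summable

# f_y(e_n) = y(n), which tends to 0 as n grows.
values = [pair(e(n, length), y) for n in (1, 10, 100)]

# The norms do not tend to 0, so e_n does not converge to 0 in norm.
norms = [norm_2(e(n, length)) for n in (1, 10, 100)]

# Distinct unit vectors are sqrt(2) apart, so (e_n) is not even Cauchy.
gaps = [norm_2([a - b for a, b in zip(e(1, length), e(n, length))])
        for n in (10, 100)]
```

The same computation works for any square summable y, since y(n) → 0, which is the reason en converges weakly to 0 in ℓ2.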

Now that we have a new form of convergence, we should get some 

basic properties down to familiarize ourselves with it. Since we are dealing

with functionals, one should expect to see the Hahn-Banach theorem 

(in the guise of one of its many corollaries) or one of the three big 

theorems to pop up often. Evidence for the previous sentence is in the 

following proof: 

Lemma 4. Let xn be a weakly convergent sequence in a normed space 

X with weak limit x. Then: 

a. The weak limit of xn is unique.

b. Every subsequence of xn converges weakly to x. 

c. The sequence ||xn|| is bounded (Exercise from previous section). 

Proof. For (a), assume xn converges weakly to both x and y, and suppose
x ≠ y. Then ||x − y|| > 0, and hence there exists a functional f (Why?) such
that f(x − y) = ||x − y||. But f(x − y) = f(x) − f(y) = lim f(xn) − lim f(xn) = 0.
So, ||x − y|| = 0, a contradiction.

For (b), given a subsequence x_{n_k}, f(x_{n_k}) is a
subsequence of the convergent sequence of scalars f(xn). Since f(xn)
converges to f(x), f(x_{n_k}) converges to the
same limit, and this holds for all f. Hence x_{n_k} converges weakly to x.

For (c), see the previous set of problems.

We often refer to convergence in norm (i.e., xn → x means ||xn − 

x|| → 0) as strong convergence (to contrast it with weak convergence).


We know that every strongly convergent sequence is weakly convergent;
does the converse ever hold? In finite dimensions,
the answer is yes. The answer is also yes in some infinite dimensional
spaces, as we will see in the following example.

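Example 8. Consider ℓ1, which has dual given by ℓ∞. Let xn → x
weakly. That is, for every y ∈ ℓ∞, ∑_{k=1}^∞ xn(k)y(k) → ∑_{k=1}^∞ x(k)y(k).
We may choose y = ek, which tells us that coordinate-wise, xn(k) → x(k) for
all k. Now, let’s try showing that ||xn − x||1 → 0. That is,
∑_{k=1}^∞ |xn(k) − x(k)| → 0. Let ε > 0 be given. First, note that
x ∈ ℓ1, so xn − x ∈ ℓ1. That is, ∑_{k=1}^∞ |xn(k) − x(k)| < ∞. Therefore,
the tail of this series tends to zero. Suppose we can find a single N such that
∑_{k=N+1}^∞ |xn(k) − x(k)| < ε/2 for all n.
By our pointwise convergence,
there exists an M such that for n > M, we have
∑_{k=1}^N |xn(k) − x(k)| < ε/2. Putting the two together, we have for
n > M, ∑_{k=1}^∞ |xn(k) − x(k)| < ε, and hence ||xn − x||1 < ε for n > M.
Consequently, xn strongly converges to x. The delicate step is the choice of
N: the tail estimate for a fixed n gives an N that depends on n, and
coordinate-wise convergence alone does not make it uniform. Making N
independent of n requires the full strength of weak convergence, tested
against suitably chosen y ∈ ℓ∞; we omit this argument. The resulting fact,
that weak and strong convergence of sequences coincide in ℓ1, is known as
Schur’s theorem.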

Now that we’ve seen that weak and strong convergence of sequences
can coincide in infinite dimensions, let’s justify that in
finite dimensions, the two are the same.

Theorem 18. Let X be a finite dimensional normed vector space such 

that xn weakly converges to x. Then, xn strongly converges to x. 

Proof. Since X is finite dimensional, there exists a basis {e1,e2,...,ek}
such that xn = ∑_{j=1}^k αn(j)ej, where αn(j) is the j-th coordinate
of xn; similarly write x = ∑_{j=1}^k α(j)ej. We may choose a set of
functionals such that fj(ej) = 1 and
fj(em) = 0 for m ≠ j; these are bounded since X is finite dimensional.
Then, since fj(xn) → fj(x) by assumption, this
tells us that fj(xn) = αn(j) → fj(x) = α(j). Consequently, we have
a convergent sequence of scalars in each coordinate. Then, we have

||xn − x|| = ||∑_{j=1}^k (αn(j) − α(j))ej|| ≤ ∑_{j=1}^k |αn(j) − α(j)| ||ej||.

Since each of the k sequences of scalars |αn(j) − α(j)| goes to zero, the
finite sum tends to
zero.
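The estimate in this proof can be tried out by hand in R^2. The following sketch is only an illustration under my own assumptions: the basis e1 = (1, 1), e2 = (1, −1), the supremum norm on R^2, and the sequence xn = (1 + 1/n, 2 − 1/n) are all made up for the example. The coordinate functionals f1, f2 are written out explicitly and the inequality ||xn − x|| ≤ ∑ |αn(j) − α(j)| ||ej|| is checked with exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical basis of R^2; f1, f2 below satisfy f_j(e_m) = 1 if j == m else 0.
e1, e2 = (1, 1), (1, -1)

def f1(v):
    """Coordinate of v along e1."""
    return Fraction(v[0] + v[1], 2)

def f2(v):
    """Coordinate of v along e2."""
    return Fraction(v[0] - v[1], 2)

def sup_norm(v):
    return max(abs(a) for a in v)

def x_n(n):
    """A sequence converging to x = (1, 2)."""
    return (1 + Fraction(1, n), 2 - Fraction(1, n))

x = (Fraction(1), Fraction(2))

def coordinate_bound(n):
    """sum_j |alpha_n(j) - alpha(j)| ||e_j||, the right side of the estimate."""
    v = x_n(n)
    return (abs(f1(v) - f1(x)) * sup_norm(e1)
            + abs(f2(v) - f2(x)) * sup_norm(e2))

def distance(n):
    """||x_n - x|| in the supremum norm."""
    v = x_n(n)
    return sup_norm((v[0] - x[0], v[1] - x[1]))
```

For this particular sequence both sides equal 1/n, so the bound is attained; in general the right side can be larger, but it still tends to zero.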

This explains partly why you likely haven’t heard about topics like 

weak convergence in your calculus or linear algebra classes: it’s all the 

same in finite dimensions! Now that we know a thing or two about 

weak convergence, is there an equivalent or simpler way of describing if 

a sequence will weakly converge? Yes, there is a very handy way where 

we only need to show that f(xn) → f(x) on a total subset of X ∗ . 

Definition 14. Let X be a normed vector space and let M ⊂ X. We 

say that M is a total subset if the span of M is dense in X. 

Theorem 19. Let X be a normed vector space. Then, xn converges 

weakly to x iff ||xn|| is a bounded sequence and if for every linear func- 

tional in a total subset M ⊂ X ∗ , we have f(xn) → f(x). [3] 

Proof. Let xn converge weakly to x. Then, by a previous problem, 

||xn|| is a bounded sequence. Additionally, since f(xn) → f(x) for 

all f ∈ X ∗ , we of course have that f(xn) → f(x) for a total subset 

M ⊂ X ∗ . 

The converse is a bit trickier. By assumption, there is a constant c with
||xn|| ≤ c for all n, and we may enlarge c so that ||x|| ≤ c as well. We also
have some total subset M ⊂ X∗ on which f(xn) → f(x). We need to show that
|f(xn) − f(x)| → 0 for all f ∈ X∗. By linearity, this holds for every
element of span M. The trick
here is to use an ε/3 argument. Let f ∈ X∗ and ε > 0. We can choose an
fj ∈ span M such that ||f − fj|| < ε/(3c), since M is total. Since fj is a
linear combination of functionals from M, we have |fj(xn) − fj(x)| < ε/3
for n large enough. Then, we have

|f(x) − f(xn)| ≤ |f(x) − fj(x)| + |fj(x) − fj(xn)| + |fj(xn) − f(xn)| ≤ ||f − fj|| ||x|| + ε/3 + ||f − fj|| ||xn|| < ε.

Note that it was imperative that ||xn|| was bounded for this trick to
work.

It may not be immediately obvious why this previous theorem is so 

helpful: we still have to find a total subset of X ∗ and then show that 

f(xn) → f(x) for all of those. Well, it turns out in some spaces, working 

with a total subset is extremely easy! Consider ℓp for 1 < p < ∞.
Then ℓq is the dual space, where q is the conjugate to p. Then {en}
is a Schauder basis of ℓq, which is a total subset. From this, we can show
that xn converges weakly to x iff ||xn|| is bounded and xn(k) → x(k) for every k.
(Why?) That is, a sequence is weakly convergent iff it is norm bounded
and converges coordinate-wise.

So far, if X is a normed vector space, we have given it a new

topology. We know that if X is a normed vector space, X ∗ is a Banach 

space (even if X is not!), and hence we can consider giving it a weak 

topology. One can do this by considering the weak topology generated 

by X ∗∗ . However, the more important topology on X ∗ is the weak- 

* topology generated by X regarded as a space of linear functionals 

acting on X ∗ . That is, we look at the weak topology on X ∗ generated 

by X ⊂ X ∗∗ . More concretely, if fn is a sequence of functionals in X ∗ ,


we say that fn is weak-* convergent to f if for all x ∈ X, x′(fn) → x′(f)
(where x′ is x acting as a linear functional on X∗). But, we know that this
just means for all x ∈ X, fn(x) → f(x). That is, convergence in the weak-*
topology on X∗ is just pointwise convergence!

Definition 15. Let X be a normed vector space and let X ∗ be the 

dual. We say that a sequence fn converges weak-* to f ∈ X ∗ if for 

every x ∈ X, fn(x) → f(x). 

It may not seem immediately obvious why we even bother using the 

weak-* topology on X ∗ . We see that the weak topology on X is ben- 

eficial because it is more flexible in letting sequences converge. There 

is a topological reason which makes the weak-* topology extremely 

convenient. Recall that we showed that the closed unit ball in an in- 

finite dimensional space is necessarily not compact. Well, it turns out 

the weak-* topology makes the closed unit ball in X ∗ compact (in the 

weak-* topology). Note: if you’re not familiar with Tychonoff’s theo- 

rem, feel free to skip this proof. Make sure to familiarize yourself with 

Tychonoff’s theorem at some point, as it is a very useful theorem from

topology. For the benefit of the reader, I will restate it here: 

Theorem 20. Tychonoff’s Theorem Let {Xα} be a family of compact
topological spaces. Then, X = ΠαXα is compact in the product
topology.

On the other hand, if X = ΠαXα is a compact space, then since each 

πα is a continuous map, each Xα is also compact (continuous functions 

map compact sets to compact sets).


Theorem 21. Alaoglu’s Theorem Let X be a normed vector space.

Then, the closed unit ball B ∗ = {f ∈ X ∗ : ||f|| ≤ 1} is compact in the 

weak-* topology. [2] 

Proof. For every x ∈ X, we can define Dx = {z ∈ C : |z| ≤ ||x||} 

and define D = Πx∈XDx. Note that each Dx is compact (Why?), and 

hence by Tychonoff’s theorem, D is compact. Here’s the trick to this 

theorem: What does it mean if φ ∈ D? If φ ∈ D, then φ associates 

with each x ∈ X a complex scalar in the x th coordinate. Therefore, we 

may identify φ as a functional acting on X. This is not necessarily a 

collection of linear functionals though! All we know is so far that D is 

compact. We have B ∗ is a subset of D. The topology that B ∗ inherits 

from D is the product topology, which you may recall is the topology 

of pointwise convergence. But, we know that the topology of pointwise 

convergence is exactly the weak-* topology. That is, B ∗ as a subset 

of D has the weak-* topology. Since D is compact, we need to just 

show that B ∗ is closed. (Why?) Let fα ∈ B ∗ be a net that converges 

to f ∈ D. We need to show that f ∈ B ∗ . First, is f linear? Well, f(ax + by) = lim fα(ax + by) = a lim fα(x) + b lim fα(y) = af(x) + bf(y). So, f is linear. Moreover, since f ∈ D, we have |f(x)| ≤ ||x|| for every x ∈ X, so ||f|| ≤ 1. So, f ∈ B ∗ , and we have that B ∗ is closed in D. Consequently, B ∗ is compact in the weak-* topology.

To summarize, we’ve given a normed space X two topologies: the 

usual norm topology and a new topology generated by the functionals 

in X ∗ . On X ∗ , we have the usual norm topology and the topology 

of pointwise convergence induced by X. What about the space of bounded operators, B(X,Y )? This has convergence given by the norm.


That is, if Tn → T, we mean ||Tn − T || → 0. That instantly implies 

||Tnx − Tx|| → 0 for all x ∈ X. What about the other way around? If 

||Tnx − Tx|| → 0 for all x ∈ X, does ||Tn − T || → 0? This is not the 

case. However, we can define a pointwise topology on B(X,Y ) using these pointwise norm estimates. To be precise,

Definition 16. We say that Tn → T strongly if ||Tnx − Tx|| → 0 for 

every x ∈ X. 

The topology associated with this is called the strong operator topology.

Problems 

On these problems, I strongly suggest taking a glance back at the 

Hahn-Banach theorem and its useful consequences. 

Problem 25. Let xn,yn weakly converge to x and y respectively. Show 

that αxn + βyn → αx + βy weakly. 

Problem 26. For each n, let Tn : ℓ 2 → ℓ 2 be given by Tnx = (0, 0, 0, ..., x(n), x(n+1), ...). Consider the sequence Tn. First, show that each Tn is a bounded linear operator. Show that ||Tnx − Tx|| → 0 for every x ∈ ℓ 2 , for some appropriate T. Does ||Tn − T || → 0?

Problem 27. Let X,Y be normed spaces. Let xn → x weakly and let 

T ∈ L(X,Y ). Show that Txn → Tx weakly. Note that Txn ∈ Y . 

Problem 28. Let xn converge weakly in a normed space X to x. Show that x lies in the norm closure of span{x1, x2, ...}.


Problem 29. Let Y be a closed subspace in X. Show that Y contains 

all of the limits of its weakly convergent sequences. 

Problem 30. Let X be a Banach space and let E ⊂ X be a norm- 

bounded set. Consider the weak closure of E (that is, the closure of E 

in the weak topology). Show that the weak closure of E is still norm 

bounded. 

Problem 31. Let X be a normed vector space and Y a subspace. Then

Y is norm closed iff Y is weakly closed. 

Hilbert Spaces 

In linear algebra, we learned to generalize the algebraic structure of 

R n by considering vector spaces of dimension n. Then, to acquire some 

of the topological structure, we generalized the metric nature of R n to 

get normed vector spaces and Banach spaces. However, even in these 

spaces which have similar topological and algebraic structures to R n , 

there is still something missing, and this missing piece is the familiar 

geometry of R n . We know how to compare vectors and see if they are 

perpendicular. To do this, we have the dot product in R n . The dot 

product naturally induced a norm (which in turn gives us a metric). 

If we generalize the notion of a dot product, we get what is called an 

inner product. 

Definition 17. An inner product on a complex vector space X is a map 〈·,·〉 : X × X → C such that:

• 〈ax + by,z〉 = a〈x,z〉 + b〈y,z〉

• $\langle y, x \rangle = \overline{\langle x, y \rangle}$

• 〈x,x〉 ∈ (0, ∞) for all x ≠ 0

This mapping is linear in the first term and conjugate-linear in the second term, as $\langle x, ay \rangle = \bar{a} \langle x, y \rangle$. Note that in physics, the opposite

convention is used (conjugate linear in the first term). Note that if X is 

a real vector space, then the inner product is bilinear and conjugation 

is no problem. We often call such a space an inner product space, or in 

more fancy terms, a pre-Hilbert space. With an inner product, we can 

induce a norm by $\|x\| = \sqrt{\langle x, x \rangle}$. That this is a norm is not immediately obvious. Although it follows quickly from the definitions that

||x|| = 0 iff x = 0 and ||x|| ≥ 0, and ||αx|| = |α|||x||, the triangle 

inequality is a bit tricky and we’ll need something to deal with that. 

Inner product spaces give all the structure a normed space has, plus 

some new tricks. One of the most valuable inequalities that I have ever 

used is an inequality that relates the magnitude of an inner product of 

two vectors and the product of their norms. 

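Theorem 22 (The Cauchy-Schwarz Inequality). Let x,y ∈ X. Then |〈x,y〉| ≤ ||x|| ||y||.

Proof. We may assume x, y ≠ 0 (if either of them is zero, the inner product is zero and the result follows). For every scalar α,

$$\|x - \alpha y\|^2 = \langle x - \alpha y, x - \alpha y \rangle = \langle x, x \rangle - \bar{\alpha} \langle x, y \rangle - \alpha \langle y, x \rangle + \alpha \bar{\alpha} \langle y, y \rangle.$$

That is, we have

$$\|x - \alpha y\|^2 = \|x\|^2 - \bar{\alpha} \langle x, y \rangle - \alpha \left[ \langle y, x \rangle - \bar{\alpha} \langle y, y \rangle \right].$$

We can zero out the bracketed term if we choose $\bar{\alpha} = \langle y, x \rangle / \langle y, y \rangle$. Consequently,

$$0 \le \|x - \alpha y\|^2 = \|x\|^2 - \bar{\alpha} \langle x, y \rangle = \|x\|^2 - \frac{\langle y, x \rangle \langle x, y \rangle}{\langle y, y \rangle}.$$

Rewriting yields

$$0 \le \|x\|^2 - \frac{|\langle x, y \rangle|^2}{\|y\|^2}.$$

Moving terms and multiplying by the denominator yields $|\langle x, y \rangle|^2 \le \|x\|^2 \|y\|^2$. Taking square roots finishes the proof. Note that $z \bar{z} = |z|^2$, which was used.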
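For readers who like to experiment, here is a small numerical sanity check of the Cauchy-Schwarz inequality in C^3 with the usual dot product. It is only a sketch in plain Python: the helper names inner, norm, and cauchy_schwarz_gap are hypothetical names chosen for illustration, and checking a couple of vectors in floating point is of course no substitute for the proof.

```python
import math

def inner(x, y):
    """Dot product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product, ||x|| = sqrt(<x, x>)."""
    return math.sqrt(inner(x, x).real)

def cauchy_schwarz_gap(x, y):
    """Return ||x|| ||y|| - |<x, y>|, which the inequality says is nonnegative."""
    return norm(x) * norm(y) - abs(inner(x, y))

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 4 + 0j]

print(cauchy_schwarz_gap(x, y))                     # a nonnegative number
print(cauchy_schwarz_gap(x, [2j * a for a in x]))   # essentially 0 (up to rounding)
```

The second call takes y to be a scalar multiple of x, which is the situation where the gap closes up.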

Theorem 23. If $\|x\| = \sqrt{\langle x, x \rangle}$, then ||·|| is a norm on X.

Proof. By the previous remarks, ||·|| satisfies all of the norm properties directly from the definition, save for possibly the triangle inequality. We have $\|x + y\|^2 = \langle x + y, x + y \rangle = \|x\|^2 + \langle x, y \rangle + \langle y, x \rangle + \|y\|^2$. Now, each of the two middle terms has absolute value at most ||x|| ||y|| by the Cauchy-Schwarz inequality. Therefore, $\|x + y\|^2 \le \|x\|^2 + 2\|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2$. Taking square roots gives the triangle inequality.

We know that the norm is continuous from our initial study of 

normed vector spaces, so it follows that the norm induced by the in- 

ner product is continuous. More can be said: the inner product is a 

continuous mapping from X × X → C. 

Lemma 5. Let X be an inner product space and let xn, yn be sequences converging to x, y respectively. Then lim〈xn,yn〉 = 〈x,y〉.

Proof. We have |〈xn,yn〉 − 〈x,y〉| = |〈xn,yn〉 − 〈x,yn〉 + 〈x,yn〉 − 〈x,y〉| ≤ |〈xn − x,yn〉| + |〈x,yn − y〉| ≤ ||x − xn|| ||yn|| + ||x|| ||y − yn|| → 0, where the last step uses that the convergent sequence yn is bounded in norm.

So far we have generalized the idea of a dot product to an arbitrary 

vector space. An inner product space instantly gives us a norm topol- 

ogy and convergence in norm, analogous to R n . But, analytically, R n 

has the wonderful property of being complete. If we could define a


complete inner product space, we would have a great generalization of 

our familiar Euclidean spaces. 

Definition 18. A Hilbert Space is a complete, inner product space. 

Another way of phrasing it is that a Hilbert space is a Banach space whose norm is induced by an inner product. We know so far that $\mathbb{R}^n$ and $\mathbb{C}^n$ are Hilbert spaces with the usual dot product. We’ve run into an example of a Hilbert space already: $\ell^2$. If we let $\langle x, y \rangle = \sum_{j=1}^{\infty} x(j) \overline{y(j)}$ for $x, y \in \ell^2$, we have a well defined inner product. (Why?) This is what makes $\ell^2$ so much more special than $\ell^1$ or $\ell^\infty$, which can have bizarre, pathological problems (especially $L^1$ and $L^\infty$). However, $\ell^2$ is a very nice space with great properties. We already know that $(\ell^p)^* = \ell^q$ where p and q are conjugates, and 2 is conjugate with itself, so we know $\ell^2 = (\ell^2)^*$.

Soon, we will be able to show that for a general Hilbert space H, there 

is a bijection between H and H ∗ . More concretely, there are several 

familiar geometric properties in a Hilbert space. 

Theorem 24 (The Parallelogram Law). Let x,y ∈ H. Then $\|x + y\|^2 + \|x - y\|^2 = 2(\|x\|^2 + \|y\|^2)$.

Proof. Note that $\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}\langle x, y \rangle + \|y\|^2$ and $\|x - y\|^2 = \|x\|^2 - 2\operatorname{Re}\langle x, y \rangle + \|y\|^2$. Summing the two formulas gives the desired result.
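Since the proof above only uses the inner product, the identity holds for any norm induced by an inner product, so a norm that violates it cannot come from one. The following sketch (plain Python, with hypothetical helper names norm2, norm1, and parallelogram_defect) compares the Euclidean norm with the ℓ1-style norm on R^2 for one pair of vectors. It is meant only as an illustration of why ℓ2 is singled out among the ℓp spaces.

```python
import math

def norm2(x):
    """Euclidean (l^2-type) norm on R^n or C^n."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def norm1(x):
    """l^1-type norm on R^n or C^n."""
    return sum(abs(a) for a in x)

def parallelogram_defect(norm, x, y):
    """Return ||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2) for the given norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)

x, y = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_defect(norm2, x, y))   # approximately 0
print(parallelogram_defect(norm1, x, y))   # 4.0, so the identity fails for this norm
```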

The importance of the inner product is only realized when we gener- 

alize the notion of orthogonality. We say that x ⊥ y or x is orthogonal 

to y if 〈x,y〉 = 0. One of the most familiar rules from geometry is the 

Pythagorean theorem for a right triangle. We can generalize this to 

arbitrary Hilbert spaces.


Theorem 25 (The Pythagorean Theorem). Let x1, ..., xn ∈ H and let xj ⊥ xk for j ≠ k. Then $\left\| \sum_{j=1}^{n} x_j \right\|^2 = \sum_{j=1}^{n} \|x_j\|^2$.

Proof. Exercise.

Given a set M ⊂ H, we can define M ⊥ = {x ∈ H : x ⊥ y ∀y ∈ M}. We call M ⊥ the orthogonal complement of M. With the inner product, given any closed subspace M, we can decompose a Hilbert space into the direct sum of M and its orthogonal complement. For example, in $\mathbb{R}^2$, the x-axis and y-axis are orthogonal one dimensional subspaces such that $\mathbb{R}^2$ can be viewed as the direct sum of the two subspaces. We can generalize this notion to arbitrary Hilbert spaces by first considering the following question: if M ⊂ H is a closed subspace and y ∈ H, does there exist a unique x ∈ M such that ||x − y|| is minimized? Can we always

find a closest vector in the subspace? The answer is that for a closed 

subspace, we can do this; furthermore, with this unique x, we can 

actually express y as a sum of an element from M and M ⊥ . Since this 

can be done for arbitrary y ∈ H, we can decompose H into M ⊕ M ⊥ . 

Theorem 26. Let M be a closed subspace of H. Then H = M ⊕ M ⊥ . 

In other words, if x ∈ H, then we can uniquely write x = y + z where 

y ∈ M and z ∈ M ⊥ . These unique elements y and z are the unique 

elements of M and M ⊥ that minimize the distance to x. [2] 

Proof. Let x ∈ H and define δ = inf{||x − y|| : y ∈ M}. By the definition of infimum, we may find a sequence yn ∈ M such that ||x − yn|| → δ. Since H is a Hilbert space, we may use the parallelogram law, which tells us that:

$$2(\|y_n - x\|^2 + \|y_m - x\|^2) = \|y_n - y_m\|^2 + \|y_n + y_m - 2x\|^2.$$

We know that M is a subspace, so $\tfrac{1}{2}(y_n + y_m) \in M$, and hence $\|\tfrac{1}{2}(y_n + y_m) - x\| \ge \delta$. If we solve for $\|y_n - y_m\|^2$ and factor out the 2 from $\|y_n + y_m - 2x\|^2$, we arrive at:

$$\|y_n - y_m\|^2 = 2(\|y_n - x\|^2 + \|y_m - x\|^2) - 4\left\| \tfrac{1}{2}(y_n + y_m) - x \right\|^2 \le 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\delta^2.$$

Now, we know that $\|y_n - x\|^2$ and $\|y_m - x\|^2$ fall down towards $\delta^2$, so the right hand side falls to zero. This tells us that yn is a Cauchy sequence in M. Consequently, there exists a y ∈ M such that yn → y.

(Why?) Define z = x − y, and hence ||z|| = ||x − y|| = δ. So far, we 

have shown that there is a y ∈ M that minimizes the distance to x. 

Notice that x = y +(x −y) = y +z. If we can show that z ∈ M ⊥ , then 

we will be mostly done, save for uniqueness. 

What we need to do now is show that for any u ∈ M, u ⊥ z. So, consider 〈z,u〉. This quantity may be a complex number, but if we multiply u by an appropriate scalar, we can turn this into a real valued quantity (Note: this trick is often used, where one multiplies by a scalar to either normalize or make a quantity real). So assume 〈z,u〉 is real, and for real t consider $f(t) = \|z + tu\|^2 = \|z\|^2 + 2t\langle z, u \rangle + t^2 \|u\|^2$. Differentiating this real valued function gives $f'(t) = 2\langle z, u \rangle + 2t\|u\|^2$. We know that f is minimized at t = 0 because z + tu = x − y + tu = x − (y − tu). Since y − tu ∈ M, we know ||x − (y − tu)|| ≥ δ = ||x − y|| = ||z||. Therefore, the minimization occurs when t = 0. Looking at our derivative, we have f′(0) = 0 = 2〈z,u〉. Hence z ⊥ u. Since this holds for arbitrary u ∈ M, we have z ∈ M ⊥ . Now, we must argue uniqueness: Let y′ ∈ M. Then $\|x - y'\|^2 = \|x - y\|^2 + \|y - y'\|^2 \ge \|x - y\|^2$, with equality iff y = y′. Here, we used the Pythagorean theorem, which is valid since x − y = z ∈ M ⊥ and y − y′ ∈ M. One can similarly show that given another z′ ∈ M ⊥ , $\|x - z'\|^2 = \|x - z\|^2 + \|z - z'\|^2 \ge \|x - z\|^2$, and we have equality iff z = z′. This solves uniqueness of y and z as the closest elements to x from M and M ⊥ respectively.

Therefore, given any x ∈ H, we can write x = y + z where y ∈ M 

and z ∈ M ⊥ . Assume that there is another decomposition x = y′ + z′ with y′ ∈ M and z′ ∈ M ⊥ . Then, y′ + z′ = x = y + z implies y′ − y = z − z′. But, y′ − y ∈ M and z − z′ ∈ M ⊥ , and hence y′ − y = z − z′ ∈ M ⊥ ∩ M = {0}. Therefore,

we have a unique decomposition of H as M ⊕ M ⊥ . 

If we look back at the beginning of this proof where we were es- 

tablishing the existence of a minimizing distance vector y to x, notice 

that the only properties of M we used were that M was closed (hence complete) and that $\tfrac{1}{2}(y_n + y_m) \in M$. The second property is far less

demanding than being a subspace; in fact, a convex set would do just 

fine. That is, if K is a closed, convex set in a Hilbert space, then we 

can find a unique minimizing vector. The rest of the proof utilizes 

subspace properties however. 

Let x ∈ H and let M be a closed subspace. Then by the previous theorem, there exist unique y ∈ M and z ∈ M ⊥ such that x = y + z. We call y the orthogonal projection of x onto M. This is motivated by the calculus and geometry you are likely familiar with. With this information, we can define a mapping P : H → M by Px = y. From the uniqueness of the decomposition, we see that P is a linear operator. Furthermore, P is continuous, and hence bounded: if xn = yn + zn → x = y + z, then by the Pythagorean theorem $\|x_n - x\|^2 = \|y_n - y\|^2 + \|z_n - z\|^2 \ge \|y_n - y\|^2$, so yn → y; that is, Pxn = yn → y = Px. We have some nice properties: P is an onto bounded linear mapping from H to M, and it is the identity on M, and hence P² = P (Why?). Additionally, P(M ⊥ ) = {0}.
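To make the projection concrete, here is a sketch in R^3 written in plain Python. It assumes that M is given by an orthonormal basis and uses the standard formula Px = Σ 〈x,e_j〉e_j for that case, which has not been derived in these notes at this point; the names inner, project, and M_basis are hypothetical and chosen only for this illustration.

```python
import math

def inner(x, y):
    """Dot product, linear in the first slot and conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def project(x, onb):
    """Orthogonal projection of x onto span(onb); onb is assumed orthonormal."""
    y = [0.0] * len(x)
    for e in onb:
        c = inner(x, e)                              # coefficient <x, e>
        y = [yi + c * ei for yi, ei in zip(y, e)]    # add <x, e> e
    return y

s = 1 / math.sqrt(2)
M_basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]   # orthonormal basis of a plane M in R^3
x = [3.0, 1.0, 2.0]

y = project(x, M_basis)                    # component in M, approximately [2, 2, 2]
z = [a - b for a, b in zip(x, y)]          # component in M-perp, approximately [1, -1, 0]

print(y, z)
print([inner(z, e) for e in M_basis])      # both approximately 0, so z is orthogonal to M
print(project(y, M_basis))                 # approximately y again, reflecting P^2 = P
```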

With this newfound structure in a Hilbert space, we can learn some- 

thing very important about the dual space of H. If y ∈ H, then we 

may define f(x) = 〈x,y〉, which is a bounded linear functional. (Why?) 

The surprising thing is, every bounded linear functional can be written 

in this way! Therefore, H ∗ can be identified naturally with H itself. 

This instantly tells us that we may view H ∗∗ = H also. 

Theorem 27 (The Riesz Representation Theorem). Let H be a Hilbert space and f ∈ H ∗ . Then, there exists a unique y ∈ H such that f(x) = 〈x,y〉 for all x ∈ H.

Proof. If f = 0, then it is certainly true that f(x) = 〈x, 0〉 = 0. Let f not be the zero functional. Let M = {x ∈ H : f(x) = 0}. We know that the kernel of a bounded operator gives a closed subspace, so M is closed. Since f ≠ 0, M is a proper subspace of H. Then H = M ⊕ M ⊥ and M ⊥ is non-trivial, so we may choose z ∈ M ⊥ such that ||z|| = 1 (since M ⊥ is a subspace as well). Then, for any x ∈ H, define u = f(x)z − f(z)x. So, f(u) = f(x)f(z) − f(z)f(x) = 0, and u ∈ M. So, $0 = \langle u, z \rangle = f(x)\|z\|^2 - f(z)\langle x, z \rangle = f(x) - \langle x, \overline{f(z)}\, z \rangle$. Solving for f(x) gives f(x) = 〈x,y〉 with $y = \overline{f(z)}\, z$.

Now, we must show uniqueness. Assume there exists y,y ′ such that 

f(x) = 〈x,y〉 = 〈x,y ′ 〉. Then, 0 = 〈x,y − y ′ 〉. Choosing x = y − y ′ , we 

get ||y − y ′ || 2 = 0, and hence y = y ′ . 

This amazing result tells us that the functionals acting on H can be 

identified with H itself through a conjugate linear isomorphism. With 

this new knowledge of functionals, we can build new operators from 

already existing bounded linear operators. 

Definition 19. Let H be a Hilbert space and let T : H → H be a

bounded linear operator. Then, we define the adjoint T ∗ : H → H 

such that 〈Tx,y〉 = 〈x,T ∗ y〉 for all x,y ∈ H. 

It is not obvious that such an operator even exists. However, with the 

Riesz representation theorem, we can actually build it rather quickly 

since we know a bit about functionals. 

Theorem 28. The adjoint T ∗ of a bounded linear operator exists and 

is a unique, bounded linear operator with norm equal to ||T ||. 

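Proof. Fix y ∈ H and consider the functional defined by fy(x) = 〈Tx,y〉 for all x ∈ H. Then, this is a bounded linear functional, as |fy(x)| ≤ ||T || ||x|| ||y|| (Why?), and hence by the Riesz representation theorem, there exists a unique z ∈ H such that 〈Tx,y〉 = 〈x,z〉 for all x ∈ H. Consider the mapping of H → H given by y → z. We may call this mapping T ∗ . With this, we have a well defined linear (show linearity) operator T ∗ such that 〈Tx,y〉 = 〈x,T ∗ y〉 for all x,y ∈ H, and the uniqueness of z makes T ∗ unique. Given an operator, we have (with all suprema taken over nonzero vectors)

$$\|T\| = \sup_{x \in H} \frac{\|Tx\|}{\|x\|} = \sup_{x, y \in H} \frac{|\langle Tx, y \rangle|}{\|x\| \, \|y\|}.$$

Then, we have

$$\|T^*\| = \sup_{x, y \in H} \frac{|\langle T^* x, y \rangle|}{\|x\| \, \|y\|} = \sup_{x, y \in H} \frac{|\langle x, Ty \rangle|}{\|x\| \, \|y\|} \le \sup_{x, y \in H} \frac{\|x\| \, \|Ty\|}{\|x\| \, \|y\|} = \sup_{y \in H} \frac{\|Ty\|}{\|y\|} = \|T\|,$$

so T ∗ is bounded. On the other hand, restricting to pairs of the form (Tx, x) with Tx ≠ 0,

$$\|T^*\| = \sup_{x, y \in H} \frac{|\langle x, Ty \rangle|}{\|x\| \, \|y\|} \ge \sup_{x \in H} \frac{\langle Tx, Tx \rangle}{\|Tx\| \, \|x\|} = \sup_{x \in H} \frac{\|Tx\|}{\|x\|} = \|T\|.$$

So, ||T || = ||T ∗ ||.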
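In finite dimensions the adjoint is something familiar: for an operator on C^n given by a matrix, with the usual dot product, the adjoint is given by the conjugate transpose. The sketch below checks the defining identity 〈Tx,y〉 = 〈x,T ∗ y〉 on one example in C^2. It is plain Python, and the names inner, apply, and adjoint are hypothetical helpers introduced only for this illustration.

```python
def inner(x, y):
    """Dot product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * a for t, a in zip(row, x)) for row in T]

def adjoint(T):
    """Conjugate transpose of the matrix T."""
    rows, cols = len(T), len(T[0])
    return [[T[i][j].conjugate() for i in range(rows)] for j in range(cols)]

T = [[1 + 1j, 2], [0, 3 - 1j]]
x = [1j, 2]
y = [1, 1 - 1j]

lhs = inner(apply(T, x), y)            # <Tx, y>
rhs = inner(x, apply(adjoint(T), y))   # <x, T*y>
print(lhs, rhs)                        # the two values agree
print(adjoint(adjoint(T)) == T)        # True, matching (T*)* = T
```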

Note that it is possible to generalize this concept to an adjoint map- 

ping between two different Hilbert spaces H1 and H2. That is, we 

can define, for a given T ∈ B(H1,H2), T ∗ ∈ B(H2,H1) such that

〈Tx,y〉2 = 〈x,T ∗ y〉1 for all x ∈ H1, y ∈ H2. However, this requires 

some extra machinery (sesquilinear forms), which I decided weren’t 

worth pursuing and can easily be found in any textbook or on the in- 

ternet. Before we start proving some properties about adjoints, there is 

a useful trick for showing that an operator is actually the zero operator: 

Lemma 6. Let X,Y be inner product spaces and let T ∈ B(X,Y ). [3] 

Then: 

a. T = 0 iff 〈Tx,y〉 = 0 for all x ∈ X and y ∈ Y 

b. If T : X → X and X is a complex inner product space and if 

〈Tx,x〉 = 0 for all x ∈ X, then T = 0 

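Proof. Part (a) is an exercise. For part (b), expand 〈T(αx + y), αx + y〉 = 0 using the hypothesis, which leaves $\alpha \langle Tx, y \rangle + \bar{\alpha} \langle Ty, x \rangle = 0$. Consider the two cases of α = 1 and α = i, and then apply part (a).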

Note that the statement in part (b) of the previous lemma requires 

that X be a complex space. It is false in the real case. Consider a 

rotation operator in R 2 that rotates by 90 deg [3].
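Here is the rotation counterexample worked out in a few lines of plain Python, as a sketch. The helper names inner and rotate90 are hypothetical. The last line feeds a complex vector to the same matrix, which illustrates (but does not prove) why the hypothesis of part (b) is much stronger over the complex numbers.

```python
def inner(x, y):
    """Dot product, linear in the first slot and conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def rotate90(x):
    """Rotate (x1, x2) counterclockwise by 90 degrees: (x1, x2) -> (-x2, x1)."""
    return [-x[1], x[0]]

# On real vectors, <Tx, x> = -x2*x1 + x1*x2 = 0, although T is not the zero operator.
for v in ([1, 0], [2, 5], [-3, 7]):
    print(inner(rotate90(v), v))       # 0 each time

# The same matrix acting on C^2 no longer satisfies <Tx, x> = 0 for all x.
w = [1, 1j]
print(inner(rotate90(w), w))           # the nonzero complex number -2j
```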


Theorem 29. Let T,S : H → H be bounded linear operators. Then, 

a. (S + T) ∗ = S ∗ + T ∗ 

b. (αT) ∗ = ¯αT ∗ 

c. (T ∗ ) ∗ = T 

d. ||T ∗ T || = ||TT ∗ || = ||T || 2 

e. T ∗ T = 0 iff T = 0 

f. (ST) ∗ = T ∗ S ∗ 

The proofs of these are computational exercises which hopefully 

shouldn’t prove to be too strenuous. In the exercises, we will explore 

operators known as self-adjoint operators, which satisfy T = T ∗ and 

unitary operators, which are invertible operators such that T ∗ = T −1 . 

So far, we’ve learned a bit about functionals and operators on a Hilbert 

space. It’s time we learn about one of the most useful properties about 

Hilbert spaces: orthonormal bases. 

A set {eα} ⊂ H is said to be orthonormal if ||eα|| = 1 and 〈eα,eβ〉 = δαβ, where δαβ = 1 if α = β and zero otherwise. That is, every vector in an orthonormal set is orthogonal to every other one, and every vector has norm one. Recall from linear algebra that given any linearly independent set {xn}, one could transform this into an orthonormal set {un} using the Gram-Schmidt orthogonalization procedure. One defines $u_1 = x_1 / \|x_1\|$, and then, for n ≥ 2, $z_n = x_n - \sum_{j=1}^{n-1} \langle x_n, u_j \rangle u_j$ and $u_n = z_n / \|z_n\|$. Orthonormal sets satisfy a very important inequality which relates the inner products of a vector against an orthonormal set with the norm of the vector.
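The Gram-Schmidt procedure is easy to try out by hand or by machine. The following is a minimal sketch in plain Python for finitely many vectors in R^n or C^n with the usual dot product. The function names are hypothetical, the input is assumed to be linearly independent (otherwise some z_n would be zero and the division would fail), and no attention is paid to numerical stability.

```python
import math

def inner(x, y):
    """Dot product, linear in the first slot and conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors):
    """Turn a linearly independent list x_1, ..., x_k into an orthonormal list u_1, ..., u_k."""
    basis = []
    for x in vectors:
        z = list(x)
        for u in basis:
            c = inner(x, u)                            # coefficient <x_n, u_j>
            z = [zi - c * ui for zi, ui in zip(z, u)]  # subtract <x_n, u_j> u_j
        n = norm(z)                                    # nonzero by linear independence
        basis.append([zi / n for zi in z])
    return basis

us = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for i in range(3):
    # Row i of the Gram matrix: approximately 1 on the diagonal and 0 elsewhere.
    print([round(abs(inner(us[i], us[j])), 10) for j in range(3)])
```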


Theorem 30 (Bessel’s Inequality). If {eα}α∈A is an orthonormal set in H, then for any x ∈ H, we have $\sum_{\alpha \in A} |\langle x, e_\alpha \rangle|^2 \le \|x\|^2$.

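Proof. It is possible that this is an uncountable sum; to deal with an uncountable sum, one takes the supremum over all finite subsets of A. Therefore, if we can prove this for an arbitrary finite subset B of A, we will be done. Using the Pythagorean theorem on the rightmost term of the second line,

$$\begin{aligned}
0 &\le \Big\| x - \sum_{\alpha \in B} \langle x, e_\alpha \rangle e_\alpha \Big\|^2 \\
&= \|x\|^2 - 2\operatorname{Re}\Big\langle x, \sum_{\alpha \in B} \langle x, e_\alpha \rangle e_\alpha \Big\rangle + \Big\| \sum_{\alpha \in B} \langle x, e_\alpha \rangle e_\alpha \Big\|^2 \\
&= \|x\|^2 - 2\sum_{\alpha \in B} |\langle x, e_\alpha \rangle|^2 + \sum_{\alpha \in B} |\langle x, e_\alpha \rangle|^2 \\
&= \|x\|^2 - \sum_{\alpha \in B} |\langle x, e_\alpha \rangle|^2.
\end{aligned}$$

Moving the last sum to the other side of the inequality finishes the proof.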

Can equality happen in Bessel’s inequality? The answer is yes, and something very nice happens in that case. If one has an orthonormal set such that $\sum_{\alpha \in A} |\langle x, e_\alpha \rangle|^2 = \|x\|^2$ for all x ∈ H, it turns out that the set {eα} actually is a sort of orthonormal basis. That is, we can express $x = \sum_{\alpha \in A} \langle x, e_\alpha \rangle e_\alpha$. We call the coefficients 〈x,eα〉 Fourier coefficients.

Theorem 31. Let {eα}α∈A be an orthonormal set in H. The following are equivalent: [2]

a. If 〈x,eα〉 = 0 for all α, then x = 0.

b. (Parseval’s Identity) $\|x\|^2 = \sum_{\alpha \in A} |\langle x, e_\alpha \rangle|^2$ for all x ∈ H.

c. For each x ∈ H, $x = \sum_{\alpha \in A} \langle x, e_\alpha \rangle e_\alpha$. This sum converges in the norm topology no matter the ordering.

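Proof. Assume (a) and let’s show (c). Fix x ∈ H. By Bessel’s inequality, for each k there are only finitely many α with |〈x,eα〉| > 1/k, so the set of α such that 〈x,eα〉 ≠ 0 is countable. We may discard all α such that 〈x,eα〉 = 0 and enumerate the remaining ones as α1, α2, .... By Bessel’s inequality, the sum $\sum_j |\langle x, e_{\alpha_j} \rangle|^2$ converges. By the Pythagorean theorem, we have that

$$\Big\| \sum_{j=n}^{m} \langle x, e_{\alpha_j} \rangle e_{\alpha_j} \Big\|^2 = \sum_{j=n}^{m} |\langle x, e_{\alpha_j} \rangle|^2 \to 0$$

as we let m,n get arbitrarily large. By the completeness of H, $\sum_j \langle x, e_{\alpha_j} \rangle e_{\alpha_j}$ converges. Then, $\big\langle x - \sum_j \langle x, e_{\alpha_j} \rangle e_{\alpha_j}, e_\alpha \big\rangle = 0$ for all α, and by assumption of (a), this tells us that $x - \sum_j \langle x, e_{\alpha_j} \rangle e_{\alpha_j} = 0$.

Let’s assume (c) and show (b). We have $\|x\|^2 - \sum_{j=1}^{n} |\langle x, e_{\alpha_j} \rangle|^2 = \big\| x - \sum_{j=1}^{n} \langle x, e_{\alpha_j} \rangle e_{\alpha_j} \big\|^2$ by calculation. We have by assumption that the term on the right goes to zero as n → ∞. Hence, we have $\|x\|^2 = \sum_{j=1}^{\infty} |\langle x, e_{\alpha_j} \rangle|^2$.

If we assume (b), then (a) follows: if 〈x,eα〉 = 0 for all α, then we have $\|x\|^2 = \sum_{\alpha \in A} |\langle x, e_\alpha \rangle|^2 = 0$, and hence x = 0.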
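A quick experiment in R^3 shows the difference between Bessel’s inequality and Parseval’s identity. In the sketch below (plain Python, with the hypothetical helper names inner and coefficient_energy), the orthonormal set {e1, e2} misses a direction, so the inequality is strict for the chosen x, while the full orthonormal basis {e1, e2, e3} recovers all of ||x||².

```python
import math

def inner(x, y):
    """Dot product, linear in the first slot and conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def coefficient_energy(x, vectors):
    """Sum of |<x, e>|^2 over the given orthonormal vectors e."""
    return sum(abs(inner(x, e)) ** 2 for e in vectors)

s = 1 / math.sqrt(2)
e1, e2, e3 = [s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]
x = [3.0, 1.0, 2.0]
norm_sq = inner(x, x)                                  # ||x||^2 = 14

print(coefficient_energy(x, [e1, e2]), norm_sq)        # about 10 versus 14 (Bessel, strict)
print(coefficient_energy(x, [e1, e2, e3]), norm_sq)    # about 14 versus 14 (Parseval)
```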

This theorem illustrates the desirable nature of such an orthonormal set in a Hilbert space: it allows every vector to be written as an easy linear combination of the orthonormal vectors and the vector’s Fourier coefficients. This is why a Hilbert space can be such an ideal space to work in. Generalizing from linear algebra, we call a set that satisfies one (and hence all) properties of the previous theorem an orthonormal basis. We know from our previous work that the set $\{e_n\} \subset \ell^2$, the canonical basis, is a Schauder basis for $\ell^2$. With inner product defined as $\langle x, y \rangle = \sum_{n=1}^{\infty} x(n) \overline{y(n)}$, we see that the (en) form an orthonormal set, and it isn’t too hard to show that if 〈x,en〉 = 0 for all n, then x = 0; hence, this sequence forms an orthonormal basis. It should be

clear that a Hilbert space with an orthonormal basis is an ideal setting


to work in. A question remains: given a Hilbert space, does there exist 

an orthonormal basis? Fortunately, the answer is yes! 

Theorem 32. Let H be a Hilbert space. Then, H has an orthonormal 

basis. 

The proof of this, much like the proof of the Hahn-Banach theorem, 

requires a powerful set theoretic lemma: Zorn’s lemma. Our first step 

is to consider a partially ordered set X where the elements of X are 

orthonormal subsets of H. (Note to the student: it is imperative that 

you first show X is non-empty. Why is X non-empty?) To give a partial 

ordering on X, we say U1 ≤ U2 if U1 ⊂ U2. To use Zorn’s lemma, we 

must argue that every chain has an upper bound, where a chain is a 

linearly ordered set. Let C be a chain in X and define U to be the union of all the sets in C. Any two vectors in U lie in a common member of C (since C is linearly ordered), so U is an orthonormal set, and clearly V ≤ U for all V ∈ C. Therefore, every chain has an upper bound, and hence there is a maximal element in X (that is, an orthonormal set not properly contained in any other orthonormal set). Let this set be {eα}. We do not yet know that {eα} is an orthonormal basis. Being a

maximal orthonormal set implies there exists no x such that x ⊥ eα for 

all α, save for x = 0. But, that is equivalent to part (a) of the previous 

theorem. Consequently, {eα} is an orthonormal basis. 

Hilbert spaces keep getting better and better; they generalize the 

geometry and completeness of R n , and they always admit orthonormal 

bases. Hilbert spaces are reflexive and H ∗ is exactly H itself. Given a 

closed subspace, one can decompose H into the direct sum of the sub- 

space and its orthogonal complement. Additionally, any vector can be 

reconstructed from its Fourier coefficients, given an orthonormal basis,


which is guaranteed to exist. To make things even better, if one can 

find a countable orthonormal basis, the Hilbert space is automatically 

separable (and the converse is true too!). 

Theorem 33. Let H be a Hilbert space. Then, H is separable iff H 

has a countable orthonormal basis. Additionally, if H has a countable 

orthonormal basis, then all orthonormal bases are countable. [2] 

Proof. The assertion in the first sentence will be left as an exercise. Let’s show the last assertion. Let {un} be a countable orthonormal basis and {vβ}β∈B be an arbitrary orthonormal basis. Then, consider the sets An = {β ∈ B : 〈vβ,un〉 ≠ 0}. Each set An must be countable, by Bessel’s inequality applied to un and the orthonormal set {vβ}. Then, ∪An is a countable set. Since {un} forms an orthonormal basis and each vβ is nonzero, part (a) of the previous theorem shows that every β is in at least one An, so ∪An = B is countable.

References 

[1] John B. Conway. A Course in Functional Analysis. Springer, 2007. 

[2] Gerald B. Folland. Real Analysis. John Wiley and Sons Inc., 1999. 

[3] Erwin Kreyszig. Introductory Functional Analysis with Applications. John Wiley 

and Sons Inc., 1989.
