
A First Course in Linear Algebra

Robert A. Beezer

University of Puget Sound

Version 3.50

Congruent Press


Robert A. Beezer is a Professor of Mathematics at the University of Puget Sound, where he has been on the faculty since 1984. He received a B.S. in Mathematics (with an Emphasis in Computer Science) from the University of Santa Clara in 1978, an M.S. in Statistics from the University of Illinois at Urbana-Champaign in 1982 and a Ph.D. in Mathematics from the University of Illinois at Urbana-Champaign in 1984.

In addition to his teaching at the University of Puget Sound, he has made sabbatical visits to the University of the West Indies (Trinidad campus) and the University of Western Australia. He has also given several courses in the Master's program at the African Institute for Mathematical Sciences, South Africa. He has been a Sage developer since 2008.

He teaches calculus, linear algebra and abstract algebra regularly, while his research interests include the applications of linear algebra to graph theory. His professional website is at http://buzzard.ups.edu.

Edition

Version 3.50

ISBN: 978-0-9844175-5-1

Cover Design

Aidan Meacham

Publisher

Robert A. Beezer
Congruent Press
Gig Harbor, Washington, USA

© 2004–2015 Robert A. Beezer

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the appendix entitled "GNU Free Documentation License".

The most recent version can always be found at http://linear.pugetsound.edu.


To my wife, Pat.


Contents

Preface
Acknowledgements

Systems of Linear Equations
    What is Linear Algebra?
    Solving Systems of Linear Equations
    Reduced Row-Echelon Form
    Types of Solution Sets
    Homogeneous Systems of Equations
    Nonsingular Matrices
Vectors
    Vector Operations
    Linear Combinations
    Spanning Sets
    Linear Independence
    Linear Dependence and Spans
    Orthogonality
Matrices
    Matrix Operations
    Matrix Multiplication
    Matrix Inverses and Systems of Linear Equations
    Matrix Inverses and Nonsingular Matrices
    Column and Row Spaces
    Four Subsets
Vector Spaces
    Vector Spaces
    Subspaces
    Linear Independence and Spanning Sets
    Bases
    Dimension
    Properties of Dimension
Determinants
    Determinant of a Matrix
    Properties of Determinants of Matrices
Eigenvalues
    Eigenvalues and Eigenvectors
    Properties of Eigenvalues and Eigenvectors
    Similarity and Diagonalization
Linear Transformations
    Linear Transformations
    Injective Linear Transformations
    Surjective Linear Transformations
    Invertible Linear Transformations
Representations
    Vector Representations
    Matrix Representations
    Change of Basis
    Orthonormal Diagonalization
Preliminaries
    Complex Number Operations
    Sets
Reference
    Proof Techniques
    Archetypes
    Definitions
    Theorems
    Notation
    GNU Free Documentation License


Preface

This text is designed to teach the concepts and techniques of basic linear algebra as a rigorous mathematical subject. Besides computational proficiency, there is an emphasis on understanding definitions and theorems, as well as reading, understanding and creating proofs. A strictly logical organization, complete and exceedingly detailed proofs of every theorem, advice on techniques for reading and writing proofs, and a selection of challenging theoretical exercises will slowly provide the novice with the tools and confidence to be able to study other mathematical topics in a rigorous fashion.

Most students taking a course in linear algebra will have completed courses in differential and integral calculus, and maybe also multivariate calculus, and will typically be second-year students in university. This level of mathematical maturity is expected; however, there is little or no requirement to know calculus itself to use this book successfully. With complete details for every proof, for nearly every example, and for solutions to a majority of the exercises, the book is ideal for self-study, for those of any age.

While there is an abundance of guidance in the use of the software system, Sage, there is no attempt to address the problems of numerical linear algebra, which are arguably continuous in nature. Similarly, there is little emphasis on a geometric approach to problems of linear algebra. While this may contradict the experience of many experienced mathematicians, the approach here is consciously algebraic. As a result, the student should be well-prepared to encounter groups, rings and fields in future courses in algebra, or other areas of discrete mathematics.

How to Use This Book

While the book is divided into chapters, the main organizational unit is the thirty-seven sections. Each contains a selection of definitions, theorems, and examples interspersed with commentary. If you are enrolled in a course, read the section before class and then answer the section's reading questions as preparation for class.

The version available for viewing in a web browser is the most complete, integrating all of the components of the book. Consider acquainting yourself with this version. Knowls are indicated by a dashed underline and will allow you to seamlessly remind yourself of the content of definitions, theorems, examples, exercises, subsections and more. Use them liberally.

Historically, mathematics texts have numbered definitions and theorems. We have instead adopted a strategy more appropriate to the heavy cross-referencing, linking and knowling afforded by modern media. Mimicking an approach taken by Donald Knuth, we have given items short titles and associated acronyms. You will become comfortable with this scheme after a short time, and might even come to appreciate its inherent advantages. In the web version, each chapter has a list of ten or so important items from that chapter, and you will find yourself recognizing some of these acronyms with no extra effort beyond the normal amount of study. Bruno Mello suggests that some say an acronym should be pronounceable as a word (such as "radar"), and otherwise is an abbreviation. We will not be so strict in our use of the term.

Exercises come in three flavors, indicated by the first letter of their label. "C" indicates a problem that is essentially computational. "T" represents a problem that is more theoretical, usually requiring a solution that is as rigorous as a proof. "M" stands for problems that are "medium", "moderate", "midway", "mediate" or "median", but never "mediocre." Their statements could feel computational, but their solutions require a more thorough understanding of the concepts or theory, while perhaps not being as rigorous as a proof. Of course, such a tripartite division will be subject to interpretation. Otherwise, larger numerical values indicate greater perceived difficulty, with gaps allowing for the contribution of new problems from readers. Many, but not all, exercises have complete solutions. These are indicated by daggers in the PDF and print versions, with solutions available in an online supplement, while in the web version a solution is indicated by a knowl right after the problem statement. Resist the urge to peek early. Working the exercises diligently is the best way to master the material.

The Archetypes are a collection of twenty-four archetypical examples. The open source lexical database, WordNet, defines an archetype as "something that serves as a model or a basis for making copies." We employ the word in the first sense here. By carefully choosing the examples we hope to provide at least one example that is interesting and appropriate for many of the theorems and definitions, and also provide counterexamples to conjectures (and especially counterexamples to converses of theorems). Each archetype has numerous computational results which you could strive to duplicate as you encounter new definitions and theorems. There are some exercises which will help guide you in this quest.

Supplements

Print versions of the book (either a physical copy or a PDF version) have significant material available as supplements. Solutions are contained in the Exercise Manual. Advice on the use of the open source mathematical software system, Sage, is contained in another supplement. (Look for a linear algebra "Quick Reference" sheet at the Sage website.) The Archetypes are available in a PDF form which could be used as a workbook. Flashcards, with the statement of every definition and theorem, in order of appearance, are also available.

Freedom

This book is copyrighted by its author. Some would say it is his "intellectual property," a distasteful phrase if there ever was one. Rather than exercise all the restrictions provided by the government-granted monopoly that is copyright, the author has granted you a license, the GNU Free Documentation License (GFDL). In summary it says you may receive an electronic copy at no cost via electronic networks and you may make copies forever. So your copy of the book never has to go "out-of-print." You may redistribute copies and you may make changes to your copy for your own use. However, you have one major responsibility in accepting this license. If you make changes and distribute the changed version, then you must offer the same license for the new version, you must acknowledge the original author's work, and you must indicate where you have made changes.

In practice, if you see a change that needs to be made (like correcting an error, or adding a particularly nice theoretical exercise), you may just wish to donate the change to the author rather than create and maintain a new version. Such donations are highly encouraged and gratefully accepted. You may notice the large number of small mistakes that have been corrected by readers that have come before you. Pay it forward.

So, in one word, the book really is "free" (as in "no cost"). But the open license employed is vastly different from "free to download, all rights reserved." Most importantly, you know that this book, and its ideas, are not the property of anyone. Or they are the property of everyone. Either way, this book has its own inherent "freedom," separate from those who contribute to it. Much of this philosophy is embodied in the following quote:

    If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation.

    Thomas Jefferson
    Letter to Isaac McPherson
    August 13, 1813

To the Instructor

The first half of this text (through Chapter M) is a course in matrix algebra, though the foundation of some more advanced ideas is also being formed in these early sections (such as Theorem NMUS, which presages invertible linear transformations). Vectors are presented exclusively as column vectors (not transposes of row vectors), and linear combinations are presented very early. Spans, null spaces, column spaces and row spaces are also presented early, simply as sets, saving most of their vector space properties for later, so they are familiar objects before being scrutinized carefully.

You cannot do everything early, so in particular matrix multiplication comes later than usual. However, with a definition built on linear combinations of column vectors, it should seem more natural than the more frequent definition using dot products of rows with columns. And this delay emphasizes that linear algebra is built upon vector addition and scalar multiplication. Of course, matrix inverses must wait for matrix multiplication, but this does not prevent nonsingular matrices from occurring sooner. Vector space properties are hinted at when vector and matrix operations are first defined, but the notion of a vector space is saved for a more axiomatic treatment later (Chapter VS). Once bases and dimension have been explored in the context of vector spaces, linear transformations and their matrix representation follow. The predominant purpose of the book is the four sections of Chapter R, which introduces the student to representations of vectors and matrices, change-of-basis, and orthonormal diagonalization (the spectral theorem). This final chapter pulls together all the important ideas of the previous chapters.

Our vector spaces use the complex numbers as the field of scalars. This avoids the fiction of complex eigenvalues being used to form scalar multiples of eigenvectors. The presence of the complex numbers in the earliest sections should not frighten students who need a review, since they will not be used heavily until much later, and Section CNO provides a quick review.

Linear algebra is an ideal subject for the novice mathematics student to learn how to develop a subject precisely, with all the rigor mathematics requires. Unfortunately, much of this rigor seems to have escaped the standard calculus curriculum, so for many university students this is their first exposure to careful definitions and theorems, and the expectation that they fully understand them, to say nothing of the expectation that they become proficient in formulating their own proofs. We have tried to make this text as helpful as possible with this transition. Every definition is stated carefully, set apart from the text. Likewise, every theorem is carefully stated, and almost every one has a complete proof. Theorems usually have just one conclusion, so they can be referenced precisely later. Definitions and theorems are cataloged in order of their appearance (Definitions and Theorems in the Reference chapter at the end of the book). Along the way, there are discussions of some more important ideas relating to formulating proofs (Proof Techniques), which is partly advice and partly a primer on logic.

Collecting responses to the Reading Questions prior to covering material in class will require students to learn how to read the material. Sections are designed to be covered in a fifty-minute lecture. Later sections are longer, but as students become more proficient at reading the text, it is possible to survey these longer sections at the same pace. With solutions to many of the exercises, students may be given the freedom to work homework at their own pace and style (individually, in groups, with an instructor's help, etc.). To compensate and keep students from falling behind, I give an examination on each chapter.

Sage is a powerful open source program for advanced mathematics. It is especially robust for linear algebra. We have included an abundance of material which will help the student (and instructor) learn how to use Sage for the study of linear algebra and how to understand linear algebra better with Sage. This material is tightly integrated with the web version of the book and will become even easier to use since the technology for interfaces to Sage continues to rapidly evolve. Sage is highly capable for mathematical research as well, and so should be a tool that students can use in subsequent courses and careers.
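For instructors who have not seen Sage before, here is a minimal sketch of the flavor of these computations (assuming a working Sage session; the matrix is purely illustrative and not an example from the text):

    # Build a matrix over the rationals and inspect it.
    A = matrix(QQ, [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    A.rref()    # reduced row-echelon form of A
    A.rank()    # rank of A; this particular matrix has rank 2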

Conclusion

Linear algebra is a beautiful subject. I have enjoyed preparing this exposition and making it widely available. Much of my motivation for writing this book is captured by the sentiments expressed by H.M. Cundy and A.P. Rollet in their Preface to the First Edition of Mathematical Models (1952), especially the final sentence,

    This book was born in the classroom, and arose from the spontaneous interest of a Mathematical Sixth in the construction of simple models. A desire to show that even in mathematics one could have fun led to an exhibition of the results and attracted considerable attention throughout the school. Since then the Sherborne collection has grown, ideas have come from many sources, and widespread interest has been shown. It seems therefore desirable to give permanent form to the lessons of experience so that others can benefit by them and be encouraged to undertake similar work.

Foremost, I hope that students find their time spent with this book profitable. I hope that instructors find it flexible enough to fit the needs of their course. You can always find the latest version, and keep current with any changes, at the book's website (http://linear.pugetsound.edu). I appreciate receiving suggestions, corrections, and other comments, so please do contact me.

Robert A. Beezer
Tacoma, Washington
December 2012


Acknowledgements

Many people have helped to make this book, and its freedoms, possible.

First, the time to create, edit and distribute the book has been provided implicitly and explicitly by the University of Puget Sound. A sabbatical leave in Spring 2004, a course release in Spring 2007 and a Lantz Senior Fellowship for the 2010-11 academic year are three obvious examples of explicit support. The course release was provided by support from the Lind-VanEnkevort Fund. The university has also provided clerical support, computer hardware, network servers and bandwidth. Thanks to Dean Kris Bartanen and the chairs of the Mathematics and Computer Science Department, Professors Martin Jackson, Sigrun Bodine and Bryan Smith, for their support, encouragement and flexibility.

My colleagues in the Mathematics and Computer Science Department have graciously taught our introductory linear algebra course using earlier versions and have provided valuable suggestions that have improved the book immeasurably. Thanks to Professor Martin Jackson (v0.30), Professor David Scott (v0.70), Professor Bryan Smith (v0.70, 0.80, v1.00, v2.00, v2.20), Professor Manley Perkel (v2.10), and Professor Cynthia Gibson (v2.20).

University of Puget Sound librarians Lori Ricigliano, Elizabeth Knight and Jeanne Kimura provided valuable advice on production, and interesting conversations about copyrights.

Many aspects of the book have been influenced by insightful questions and creative suggestions from the students who have labored through the book in our courses. For example, the flashcards with theorems and definitions are a direct result of a student suggestion. I will single out a handful of students at the University of Puget Sound who have been especially adept at finding and reporting mathematically significant typographical errors: Jake Linenthal, Christie Su, Kim Le, Sarah McQuate, Andy Zimmer, Travis Osborne, Andrew Tapay, Mark Shoemaker, Tasha Underhill, Tim Zitzer, Elizabeth Million, Steve Canfield, Jinshil Yi, Cliff Berger, Preston Van Buren, Duncan Bennett, Dan Messenger, Caden Robinson, Glenna Toomey, Tyler Ueltschi, Kyle Whitcomb, Anna Dovzhik, Chris Spalding and Jenna Fontaine. All the students of the Fall 2012 Math 290 sections were very helpful and patient through the major changes required in making Version 3.00.


I have tried to be as original as possible in the organization and presentation of this beautiful subject. However, I have been influenced by many years of teaching from another excellent textbook, Introduction to Linear Algebra by L.W. Johnson, R.D. Reiss and J.T. Arnold. When I have needed inspiration for the correct approach to particularly important proofs, I have learned to eventually consult two other textbooks. Sheldon Axler's Linear Algebra Done Right is a highly original exposition, while Ben Noble's Applied Linear Algebra frequently strikes just the right note between rigor and intuition. Noble's excellent book is highly recommended, even though its publication dates to 1969.

Conversion to various electronic formats has greatly depended on assistance from: Eitan Gurari, author of the powerful LaTeX translator, tex4ht; Davide Cervone, author of jsMath and MathJax; and Carl Witty, who advised and tested the Sony Reader format. Thanks to these individuals for their critical assistance.

Incorporation of Sage code is made possible by the entire community of Sage developers and users, who create and refine the mathematical routines, the user interfaces and applications in educational settings. Technical and logistical aspects of incorporating Sage code in open textbooks were supported by a grant from the United States National Science Foundation (DUE-1022574), which has been administered by the American Institute of Mathematics, and in particular, David Farmer. The support and assistance of my fellow Principal Investigators, Jason Grout, Tom Judson, Kiran Kedlaya, Sandra Laursen, Susan Lynds, and William Stein is especially appreciated.

David Farmer and Sally Koutsoliotas are responsible for the vision and initial experiments which led to the knowl-enabled web version, as part of the Version 3 project.

General support and encouragement of free and affordable textbooks, in addition to specific promotion of this text, was provided by Nicole Allen, Textbook Advocate at Student Public Interest Research Groups. Nicole was an early consumer of this material, back when it looked more like lecture notes than a textbook.

Finally, in every respect, the production and distribution of this book has been accomplished with open source software. The range of individuals and projects is far too great to pretend to list them all. This project is an attempt to pay it forward.


Contributors

Name                  Location               Contact
Beezer, David         Santa Clara U.
Beezer, Robert        U. of Puget Sound      buzzard.ups.edu/
Black, Chris
Braithwaite, David    Chicago, Illinois
Bucht, Sara           U. of Puget Sound
Canfield, Steve       U. of Puget Sound
Hubert, Dupont        Creteil, France
Fellez, Sarah         U. of Puget Sound
Fickenscher, Eric     U. of Puget Sound
Jackson, Martin       U. of Puget Sound      www.math.ups.edu/~martinj
Johnson, Chili        U. of Puget Sound
Kessler, Ivan         U. of Puget Sound
Kreher, Don           Michigan Tech. U.      www.math.mtu.edu/~kreher
Hamrick, Mark         St. Louis U.
Linenthal, Jacob      U. of Puget Sound
Million, Elizabeth    U. of Puget Sound
Osborne, Travis       U. of Puget Sound
Riegsecker, Joe       Middlebury, Indiana    joepye(at)pobox(dot)com
Perkel, Manley        U. of Puget Sound
Phelps, Douglas       U. of Puget Sound
Shoemaker, Mark       U. of Puget Sound
Toth, Zoltan                                 zoli.web.elte.hu
Ueltschi, Tyler       U. of Puget Sound
Zimmer, Andy          U. of Puget Sound


Chapter SLE
Systems of Linear Equations

We will motivate our study of linear algebra by studying solutions to systems of linear equations. While the focus of this chapter is on the practical matter of how to find, and describe, these solutions, we will also be setting ourselves up for more theoretical ideas that will appear later.

Section WILA
What is Linear Algebra?

We begin our study of linear algebra with an introduction and a motivational example.

Subsection LA
Linear + Algebra

The subject of linear algebra can be partially explained by the meaning of the two terms comprising the title. "Linear" is a term you will appreciate better at the end of this course, and indeed, attaining this appreciation could be taken as one of the primary goals of this course. However for now, you can understand it to mean anything that is "straight" or "flat." For example, in the xy-plane you might be accustomed to describing straight lines (is there any other kind?) as the set of solutions to an equation of the form y = mx + b, where the slope m and the y-intercept b are constants that together describe the line. If you have studied multivariate calculus, then you will have encountered planes. Living in three dimensions, with coordinates described by triples (x, y, z), they can be described as the set of solutions to equations of the form ax + by + cz = d, where a, b, c, d are constants that together determine the plane. While we might describe planes as "flat," lines in three dimensions might be described as "straight." From a multivariate calculus course you will recall that lines are sets of points described by equations such as x = 3t - 4, y = -7t + 2, z = 9t, where t is a parameter that can take on any value.

Another view of this notion of "flatness" is to recognize that the sets of points just described are solutions to equations of a relatively simple form. These equations involve addition and multiplication only. We will have a need for subtraction, and occasionally we will divide, but mostly you can describe "linear" equations as involving only addition and multiplication. Here are some examples of typical equations we will see in the next few sections:

    2x + 3y - 4z = 13        4x_1 + 5x_2 - x_3 + x_4 + x_5 = 0        9a - 2b + 7c + 2d = -7

What we will not see are equations like:

    xy + 5yz = 13        x_1 + x_2^3/x_4 - x_3 x_4 x_5^2 = 0        tan(ab) + log(c - d) = -7

The exception will be that we will on occasion need to take a square root.

You have probably heard the word "algebra" frequently in your mathematical preparation for this course. Most likely, you have spent a good ten to fifteen years learning the algebra of the real numbers, along with some introduction to the very similar algebra of complex numbers (see Section CNO). However, there are many new algebras to learn and use, and likely linear algebra will be your second algebra. Like learning a second language, the necessary adjustments can be challenging at times, but the rewards are many. And it will make learning your third and fourth algebras even easier. Perhaps you have heard of "groups" and "rings" (or maybe you have studied them already), which are excellent examples of other algebras with very interesting properties and applications. In any event, prepare yourself to learn a new algebra and realize that some of the old rules you used for the real numbers may no longer apply to this new algebra you will be learning!

The brief discussion above about lines and planes suggests that linear algebra has an inherently geometric nature, and this is true. Examples in two and three dimensions can be used to provide valuable insight into important concepts of this course. However, much of the power of linear algebra will be the ability to work with "flat" or "straight" objects in higher dimensions, without concerning ourselves with visualizing the situation. While much of our intuition will come from examples in two and three dimensions, we will maintain an algebraic approach to the subject, with the geometry being secondary. Others may wish to switch this emphasis around, and that can lead to a very fruitful and beneficial course, but here and now we are laying our bias bare.

Subsection AA
An Application

We conclude this section with a rather involved example that will highlight some of the power and techniques of linear algebra. Work through all of the details with pencil and paper, until you believe all the assertions made. However, in this introductory example, do not concern yourself with how some of the results are obtained or how you might be expected to solve a similar problem. We will come back to this example later and expose some of the techniques used and properties exploited. For now, use your background in mathematics to convince yourself that everything said here really is correct.

Example TMP  Trail Mix Packaging
Suppose you are the production manager at a food-packaging plant and one of your product lines is trail mix, a healthy snack popular with hikers and backpackers, containing raisins, peanuts and hard-shelled chocolate pieces. By adjusting the mix of these three ingredients, you are able to sell three varieties of this item. The fancy version is sold in half-kilogram packages at outdoor supply stores and has more chocolate and fewer raisins, thus commanding a higher price. The standard version is sold in one kilogram packages in grocery stores and gas station mini-markets. Since the standard version has roughly equal amounts of each ingredient, it is not as expensive as the fancy version. Finally, a bulk version is sold in bins at grocery stores for consumers to load into plastic bags in amounts of their choosing. To appeal to the shoppers that like bulk items for their economy and healthfulness, this mix has many more raisins (at the expense of chocolate) and therefore sells for less.

Your production facilities have limited storage space and early each morning you are able to receive and store 380 kilograms of raisins, 500 kilograms of peanuts and 620 kilograms of chocolate pieces. As production manager, one of your most important duties is to decide how much of each version of trail mix to make every day. Clearly, you can have up to 1500 kilograms of raw ingredients available each day, so to be the most productive you will likely produce 1500 kilograms of trail mix each day. Also, you would prefer not to have any ingredients leftover each day, so that your final product is as fresh as possible and so that you can receive the maximum delivery the next morning. But how should these ingredients be allocated to the mixing of the bulk, standard and fancy versions? First, we need a little more information about the mixes. Workers mix the ingredients in 15 kilogram batches, and each row of the table below gives a recipe for a 15 kilogram batch. There is some additional information on the costs of the ingredients and the price the manufacturer can charge for the different versions of the trail mix.

                    Raisins     Peanuts    Chocolate    Cost    Sale Price
                   (kg/batch)  (kg/batch)  (kg/batch)  ($/kg)    ($/kg)
    Bulk                7           6           2       3.69      4.99
    Standard            6           4           5       3.86      5.50
    Fancy               2           5           8       4.45      6.50
    Storage (kg)      380         500         620
    Cost ($/kg)      2.55        4.65        4.80

As production manager, it is important to realize that you only have three decisions to make: the amount of bulk mix to make, the amount of standard mix to make and the amount of fancy mix to make. Everything else is beyond your control or is handled by another department within the company. Principally, you are also limited by the amount of raw ingredients you can store each day. Let us denote the amount of each mix to produce each day, measured in kilograms, by the variable quantities b, s and f. Your production schedule can be described as values of b, s and f that do several things. First, we cannot make negative quantities of each mix, so

    b ≥ 0        s ≥ 0        f ≥ 0

Second, if we want to consume all of our ingredients each day, the storage capacities lead to three (linear) equations, one for each ingredient,

    (7/15)b + (6/15)s + (2/15)f = 380    (raisins)
    (6/15)b + (4/15)s + (5/15)f = 500    (peanuts)
    (2/15)b + (5/15)s + (8/15)f = 620    (chocolate)

It happens that this system of three equations has just one solution. In other words, as production manager, your job is easy, since there is but one way to use up all of your raw ingredients making trail mix. This single solution is

    b = 300 kg        s = 300 kg        f = 900 kg.

We do not yet have the tools to explain why this solution is the only one, but it should be simple for you to verify that this is indeed a solution. (Go ahead, we will wait.) Determining solutions such as this, and establishing that they are unique, will be the main motivation for our initial study of linear algebra.
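If you would like to check this claim with software before we develop the techniques by hand, here is a minimal Sage sketch (assuming a working Sage session; nothing in the text depends on it):

    # Declare symbolic variables and state the three storage equations.
    b, s, f = var('b s f')
    eqns = [7/15*b + 6/15*s + 2/15*f == 380,
            6/15*b + 4/15*s + 5/15*f == 500,
            2/15*b + 5/15*s + 8/15*f == 620]
    # Ask Sage for every simultaneous solution of the system.
    solve(eqns, b, s, f)
    # expected output: [[b == 300, s == 300, f == 900]]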

So we have solved the problem of making sure that we make the best use of our limited storage space, and each day use up all of the raw ingredients that are shipped to us. Additionally, as production manager, you must report weekly to the CEO of the company, and you know he will be more interested in the profit derived from your decisions than in the actual production levels. So you compute,

    300(4.99 - 3.69) + 300(5.50 - 3.86) + 900(6.50 - 4.45) = 2727.00

for a daily profit of $2,727 from this production schedule. The computation of the daily profit is also beyond our control, though it is definitely of interest, and it too looks like a "linear" computation.

As often happens, things do not stay the same for long, and now the marketing department has suggested that your company's trail mix products standardize on every mix being one-third peanuts. Adjusting the peanut portion of each recipe by also adjusting the chocolate portion leads to revised recipes, and slightly different costs for the bulk and standard mixes, as given in the following table.


                    Raisins     Peanuts    Chocolate    Cost    Sale Price
                   (kg/batch)  (kg/batch)  (kg/batch)  ($/kg)    ($/kg)
    Bulk                7           5           3       3.70      4.99
    Standard            6           5           4       3.85      5.50
    Fancy               2           5           8       4.45      6.50
    Storage (kg)      380         500         620
    Cost ($/kg)      2.55        4.65        4.80

In a similar fashion as before, we desire values of b, s and f so that

    b ≥ 0        s ≥ 0        f ≥ 0

and

    (7/15)b + (6/15)s + (2/15)f = 380    (raisins)
    (5/15)b + (5/15)s + (5/15)f = 500    (peanuts)
    (3/15)b + (4/15)s + (8/15)f = 620    (chocolate)

It now happens that this system of equations has infinitely many solutions, as we will now demonstrate. Let f remain a variable quantity. Then suppose we make f kilograms of the fancy mix, b = 4f - 3300 kilograms of the bulk mix, and s = -5f + 4800 kilograms of the standard mix. We now show that these choices, for any value of f, will yield a production schedule that exhausts all of the day's supply of raw ingredients. (We will very soon learn how to solve systems of equations with infinitely many solutions and then determine expressions like these for b and s.) Grab your pencil and paper and play along by substituting these choices for the production schedule into the storage limits for each raw ingredient and simplifying the algebra.

    (7/15)(4f - 3300) + (6/15)(-5f + 4800) + (2/15)f = 0f + 5700/15 = 380
    (5/15)(4f - 3300) + (5/15)(-5f + 4800) + (5/15)f = 0f + 7500/15 = 500
    (3/15)(4f - 3300) + (4/15)(-5f + 4800) + (8/15)f = 0f + 9300/15 = 620

Convince yourself that these expressions for b and s allow us to vary f and obtain an infinite number of possibilities for solutions to the three equations that describe our storage capacities. As a practical matter, there really are not an infinite number of solutions, since we are unlikely to want to end the day with a fractional number of bags of fancy mix, so our allowable values of f should probably be integers. More importantly, we need to remember that we cannot make negative amounts of each mix! Where does this lead us? Positive quantities of the bulk mix require that

    b ≥ 0  ⇒  4f - 3300 ≥ 0  ⇒  f ≥ 825


Similarly for the standard mix,

    s ≥ 0  ⇒  -5f + 4800 ≥ 0  ⇒  f ≤ 960

So, as production manager, you really have to choose a value of f from the finite set

    {825, 826, ..., 960}

leaving you with 136 choices, each of which will exhaust the day's supply of raw ingredients. Pause now and think about which you would choose.
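(For readers experimenting alongside, here is a minimal Sage sketch of the revised system; Sage expresses the one-parameter family of solutions with a free parameter it typically prints as r1:)

    # The revised storage equations, with one-third peanuts in every mix.
    b, s, f = var('b s f')
    eqns = [7/15*b + 6/15*s + 2/15*f == 380,
            5/15*b + 5/15*s + 5/15*f == 500,
            3/15*b + 4/15*s + 8/15*f == 620]
    solve(eqns, b, s, f)
    # expected: a one-parameter family equivalent to
    #   b == 4*f - 3300,  s == -5*f + 4800,  with f free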

Recalling your weekly meeting with the CEO suggests that you might want to choose a production schedule that yields the biggest possible profit for the company. So you compute an expression for the profit based on your as yet undetermined decision for the value of f,

    (4f - 3300)(4.99 - 3.70) + (-5f + 4800)(5.50 - 3.85) + (f)(6.50 - 4.45) = -1.04f + 3663

Since f has a negative coefficient it would appear that mixing fancy mix is detrimental to your profit and should be avoided. So you will make the decision to set daily fancy mix production at f = 825. This has the effect of setting b = 4(825) - 3300 = 0 and we stop producing bulk mix entirely. So the remainder of your daily production is standard mix at the level of s = -5(825) + 4800 = 675 kilograms and the resulting daily profit is (-1.04)(825) + 3663 = 2805. It is a pleasant surprise that daily profit has risen to $2,805, but this is not the most important part of the story. What is important here is that there are a large number of ways to produce trail mix that use all of the day's worth of raw ingredients and you were able to easily choose the one that netted the largest profit. Notice too how all of the above computations look "linear."
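(As a quick cross-check of this reasoning, here is a short sketch in Sage's Python syntax that scans all 136 feasible integer values of f; the helper name profit is ours, not the book's:)

    # Daily profit as a function of f, using the per-kilogram
    # margins from the revised table.
    profit = lambda f: ((4*f - 3300)*(4.99 - 3.70)
                        + (-5*f + 4800)*(5.50 - 3.85)
                        + f*(6.50 - 4.45))
    best = max(range(825, 961), key=profit)
    print(best, profit(best))   # expected: 825 and roughly 2805.00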

In the food industry, things do not stay the same for long, and now the sales department says that increased competition has led to the decision to stay competitive and charge just $5.25 for a kilogram of the standard mix, rather than the previous $5.50 per kilogram. This decision has no effect on the possibilities for the production schedule, but will affect the decision based on profit considerations. So you revisit just the profit computation, suitably adjusted for the new selling price of standard mix,

    (4f - 3300)(4.99 - 3.70) + (-5f + 4800)(5.25 - 3.85) + (f)(6.50 - 4.45) = 0.21f + 2463

Now it would appear that fancy mix is beneficial to the company's profit since the value of f has a positive coefficient. So you take the decision to make as much fancy mix as possible, setting f = 960. This leads to s = -5(960) + 4800 = 0 and the increased competition has driven you out of the standard mix market altogether. The remainder of production is therefore bulk mix at a daily level of b = 4(960) - 3300 = 540 kilograms and the resulting daily profit is 0.21(960) + 2463 = 2664.60. A daily profit of $2,664.60 is less than it used to be, but as production manager, you have made the best of a difficult situation and shown the sales department that the best course is to pull out of the highly competitive standard mix market completely. △

This example is taken from a field of mathematics variously known by names such as operations research, systems science, or management science. More specifically, this is a prototypical example of problems that are solved by the techniques of "linear programming."

There is a lot going on under the hood in this example. The heart of the matter is the solution to systems of linear equations, which is the topic of the next few sections, and a recurrent theme throughout this course. We will return to this example on several occasions to reveal some of the reasons for its behavior.

Reading Questions

1. Is the equation x^2 + xy + tan(y^3) = 0 linear or not? Why or why not?

2. Find all solutions to the system of two linear equations 2x + 3y = -8, x - y = 6.

3. Describe how the production manager might explain the importance of the procedures described in the trail mix application (Subsection WILA.AA).

Exercises

C10  In Example TMP the first table lists the cost (per kilogram) to manufacture each of the three varieties of trail mix (bulk, standard, fancy). For example, it costs $3.69 to make one kilogram of the bulk variety. Re-compute each of these three costs and notice that the computations are linear in character.

M70†  In Example TMP two different prices were considered for marketing standard mix with the revised recipes (one-third peanuts in each recipe). Selling standard mix at $5.50 resulted in selling the minimum amount of the fancy mix and no bulk mix. At $5.25 it was best for profits to sell the maximum amount of fancy mix and then sell no standard mix. Determine a selling price for standard mix that allows for maximum profits while still selling some of each type of mix.


Section SSLE
Solving Systems of Linear Equations

We will motivate our study of linear algebra by considering the problem of solving several linear equations simultaneously. The word "solve" tends to get abused somewhat, as in "solve this problem." When talking about equations we understand a more precise meaning: find all of the values of some variable quantities that make an equation, or several equations, simultaneously true.

Subsection SLE
Systems of Linear Equations

Our first example is of a type we will not pursue further. While it has two equations, the first is not linear. So this is a good example to come back to later, especially after you have seen Theorem PSSLS.

Example STNE Solving two (nonlinear) equations
Suppose we desire the simultaneous solutions of the two equations,
\begin{align*}
x^2 + y^2 &= 1\\
-x + \sqrt{3}y &= 0
\end{align*}
You can easily check by substitution that $x = \frac{\sqrt{3}}{2}, y = \frac{1}{2}$ and $x = -\frac{\sqrt{3}}{2}, y = -\frac{1}{2}$ are both solutions. We need to also convince ourselves that these are the only solutions. To see this, plot each equation on the $xy$-plane, which means to plot $(x, y)$ pairs that make an individual equation true. In this case we get a circle centered at the origin with radius 1 and a straight line through the origin with slope $\frac{1}{\sqrt{3}}$. The intersections of these two curves are our desired simultaneous solutions, and so we believe from our plot that the two solutions we know already are indeed the only ones. We like to write solutions as sets, so in this case we write the set of solutions as
\[
S = \left\{\left(\frac{\sqrt{3}}{2}, \frac{1}{2}\right), \left(-\frac{\sqrt{3}}{2}, -\frac{1}{2}\right)\right\}
\]
△
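A quick numerical check of the two claimed solutions, as a Python sketch (our own illustration, not part of the text):

    import math

    for x, y in [(math.sqrt(3)/2, 1/2), (-math.sqrt(3)/2, -1/2)]:
        # Both equations should hold, up to floating-point roundoff.
        print(abs(x**2 + y**2 - 1) < 1e-12,
              abs(-x + math.sqrt(3) * y) < 1e-12)   # True True, twice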

In order to discuss systems of linear equations carefully, we need a precise definition. And before we do that, we will introduce our periodic discussions about "Proof Techniques." Linear algebra is an excellent setting for learning how to read, understand and formulate proofs. But this is a difficult step in your development as a mathematician, so we have included a series of short essays containing advice and explanations to help you along. These will be referenced in the text as needed, and are also collected as a list you can consult when you want to return to re-read them. (Which is strongly encouraged!)

With a definition next, now is the time for the first of our proof techniques. So study Proof Technique D. We'll be right here when you get back. See you in a bit.

Definition SLE System of Linear Equations
A system of linear equations is a collection of $m$ equations in the variable quantities $x_1, x_2, x_3, \ldots, x_n$ of the form,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3\\
&\vdots\\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
where the values of $a_{ij}$, $b_i$ and $x_j$, $1 \le i \le m$, $1 \le j \le n$, are from the set of complex numbers, $\mathbb{C}$. □

Do not let the mention of the complex numbers, $\mathbb{C}$, rattle you. We will stick with real numbers exclusively for many more sections, and it will sometimes seem like we only work with integers! However, we want to leave the possibility of complex numbers open, and there will be occasions in subsequent sections where they are necessary. You can review the basic properties of complex numbers in Section CNO, but these facts will not be critical until we reach Section O.

Now we make the notion of a solution to a linear system precise.

Definition SSLE Solution of a System of Linear Equations
A solution of a system of linear equations in $n$ variables, $x_1, x_2, x_3, \ldots, x_n$ (such as the system given in Definition SLE), is an ordered list of $n$ complex numbers, $s_1, s_2, s_3, \ldots, s_n$ such that if we substitute $s_1$ for $x_1$, $s_2$ for $x_2$, $s_3$ for $x_3$, ..., $s_n$ for $x_n$, then for every equation of the system the left side will equal the right side, i.e. each equation is true simultaneously. □

More typically, we will write a solution in a form like $x_1 = 12$, $x_2 = -7$, $x_3 = 2$ to mean that $s_1 = 12$, $s_2 = -7$, $s_3 = 2$ in the notation of Definition SSLE. To discuss all of the possible solutions to a system of linear equations, we now define the set of all solutions. (So Section SET is now applicable, and you may want to go and familiarize yourself with what is there.)

Definition SSSLE Solution Set of a System of Linear Equations
The solution set of a linear system of equations is the set which contains every solution to the system, and nothing more. □

Be aware that a solution set can be infinite, or there can be no solutions, in which case we write the solution set as the empty set, $\emptyset = \{\}$ (Definition ES). Here is an example to illustrate using the notation introduced in Definition SLE and the notion of a solution (Definition SSLE).


Example NSE Notation for a system of equations
Given the system of linear equations,
\begin{align*}
x_1 + 2x_2 + x_4 &= 7\\
x_1 + x_2 + x_3 - x_4 &= 3\\
3x_1 + x_2 + 5x_3 - 7x_4 &= 1
\end{align*}
we have $n = 4$ variables and $m = 3$ equations. Also,
\begin{align*}
a_{11} &= 1 & a_{12} &= 2 & a_{13} &= 0 & a_{14} &= 1 & b_1 &= 7\\
a_{21} &= 1 & a_{22} &= 1 & a_{23} &= 1 & a_{24} &= -1 & b_2 &= 3\\
a_{31} &= 3 & a_{32} &= 1 & a_{33} &= 5 & a_{34} &= -7 & b_3 &= 1
\end{align*}
Additionally, convince yourself that $x_1 = -2$, $x_2 = 4$, $x_3 = 2$, $x_4 = 1$ is one solution (Definition SSLE), but it is not the only one! For example, another solution is $x_1 = -12$, $x_2 = 11$, $x_3 = 1$, $x_4 = -3$, and there are more to be found. So the solution set contains at least two elements. △
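The substitution check suggested here is easy to automate. The following Python sketch (our own illustration) verifies both listed solutions against all three equations:

    A = [[1, 2, 0, 1], [1, 1, 1, -1], [3, 1, 5, -7]]   # coefficients a_ij
    b = [7, 3, 1]                                      # constants b_i

    for x in [(-2, 4, 2, 1), (-12, 11, 1, -3)]:
        # Each left side must equal the corresponding right side.
        print(all(sum(a * v for a, v in zip(row, x)) == bi
                  for row, bi in zip(A, b)))           # True, then True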

We will often shorten the term "system of linear equations" to "system of equations" leaving the linear aspect implied. After all, this is a book about linear algebra.

Subsection PSS
Possibilities for Solution Sets

The next example illustrates the possibilities for the solution set of a system of linear equations. We will not be too formal here, and the necessary theorems to back up our claims will come in subsequent sections. So read for feeling and come back later to revisit this example.

Example TTS Three typical systems
Consider the system of two equations with two variables,
\begin{align*}
2x_1 + 3x_2 &= 3\\
x_1 - x_2 &= 4
\end{align*}
If we plot the solutions to each of these equations separately on the $x_1x_2$-plane, we get two lines, one with negative slope, the other with positive slope. They have exactly one point in common, $(x_1, x_2) = (3, -1)$, which is the solution $x_1 = 3$, $x_2 = -1$. From the geometry, we believe that this is the only solution to the system of equations, and so we say it is unique.

Now adjust the system with a different second equation,
\begin{align*}
2x_1 + 3x_2 &= 3\\
4x_1 + 6x_2 &= 6
\end{align*}
A plot of the solutions to these equations individually results in two lines, one on top of the other! There are infinitely many pairs of points that make both equations true. We will learn shortly how to describe this infinite solution set precisely (see Example SAA, Theorem VFSLS). Notice now how the second equation is just a multiple of the first.

One more minor adjustment provides a third system of linear equations,
\begin{align*}
2x_1 + 3x_2 &= 3\\
4x_1 + 6x_2 &= 10
\end{align*}
A plot now reveals two lines with identical slopes, i.e. parallel lines. They have no points in common, and so the system has a solution set that is empty, $S = \emptyset$. △
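These three behaviors can also be detected computationally. The following Python sketch (our own illustration) uses NumPy's matrix_rank as a black box; rank is a notion this text develops much later, so treat this purely as a preview, not as a tool we have justified:

    import numpy as np

    systems = [
        ([[2, 3], [1, -1]], [3, 4]),    # unique solution
        ([[2, 3], [4, 6]],  [3, 6]),    # infinitely many solutions
        ([[2, 3], [4, 6]],  [3, 10]),   # no solutions
    ]

    for A, b in systems:
        A = np.array(A, dtype=float)
        aug = np.column_stack([A, b])
        rA, rAug = np.linalg.matrix_rank(A), np.linalg.matrix_rank(aug)
        if rA < rAug:
            print("empty solution set")
        elif rA == A.shape[1]:
            print("unique solution:", np.linalg.solve(A, b))
        else:
            print("infinitely many solutions")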

This example exhibits all of the typical behaviors of a system of equations. A subsequent theorem will tell us that every system of linear equations has a solution set that is empty, contains a single solution or contains infinitely many solutions (Theorem PSSLS). Example STNE yielded exactly two solutions, but this does not contradict the forthcoming theorem. The equations in Example STNE are not linear because they do not match the form of Definition SLE, and so we cannot apply Theorem PSSLS in this case.

Subsection ESEO
Equivalent Systems and Equation Operations

With all this talk about finding solution sets for systems of linear equations, you might be ready to begin learning how to find these solution sets yourself. We begin with our first definition that takes a common word and gives it a very precise meaning in the context of systems of linear equations.

Definition ESYS Equivalent Systems
Two systems of linear equations are equivalent if their solution sets are equal. □

Notice here that the two systems of equations could look very different (i.e. not be equal), but still have equal solution sets, and we would then call the systems equivalent. Two linear equations in two variables might be plotted as two lines that intersect in a single point. A different system, with three equations in two variables, might have a plot that is three lines, all intersecting at a common point, with this common point identical to the intersection point for the first system. By our definition, we could then say these two very different looking systems of equations are equivalent, since they have identical solution sets. It is really like a weaker form of equality, where we allow the systems to be different in some respects, but we use the term equivalent to highlight the situation when their solution sets are equal.

With this definition, we can begin to describe our strategy for solving linear systems. Given a system of linear equations that looks difficult to solve, we would like to have an equivalent system that is easy to solve. Since the systems will have equal solution sets, we can solve the "easy" system and get the solution set to the "difficult" system. Here come the tools for making this strategy viable.

Definition EO Equation Operations
Given a system of linear equations, the following three operations will transform the system into a different one, and each operation is known as an equation operation.

1. Swap the locations of two equations in the list of equations.

2. Multiply each term of an equation by a nonzero quantity.

3. Multiply each term of one equation by some quantity, and add these terms to a second equation, on both sides of the equality. Leave the first equation the same after this operation, but replace the second equation by the new one.

□

These descriptions might seem a bit vague, but the proof or the examples that follow should make it clear what is meant by each. We will shortly prove a key theorem about equation operations and solutions to linear systems of equations.

We are about to give a rather involved proof, so a discussion about just what a theorem really is would be timely. Stop and read Proof Technique T first.

In the theorem we are about to prove, the conclusion is that two systems are equivalent. By Definition ESYS this translates to requiring that solution sets be equal for the two systems. So we are being asked to show that two sets are equal. How do we do this? Well, there is a very standard technique, and we will use it repeatedly through the course. If you have not done so already, head to Section SET and familiarize yourself with sets, their operations, and especially the notion of set equality, Definition SE, and the nearby discussion about its use.

The following theorem has a rather long proof. This chapter contains a few very necessary theorems like this, with proofs that you can safely skip on a first reading. You might come back to them later, when you are more comfortable with reading and studying proofs.

Theorem EOPSS Equation Operations Preserve Solution Sets
If we apply one of the three equation operations of Definition EO to a system of linear equations (Definition SLE), then the original system and the transformed system are equivalent.

Proof. We take each equation operation in turn and show that the solution sets of the two systems are equal, using the definition of set equality (Definition SE).

1. It will not be our habit in proofs to resort to saying statements are "obvious," but in this case, it should be. There is nothing about the order in which we write linear equations that affects their solutions, so the solution set will be equal if the systems only differ by a rearrangement of the order of the equations.
equal if the systems only differ by a rearrangement of the order of the equations.


2. Suppose $\alpha \neq 0$ is a number. Let us choose to multiply the terms of equation $i$ by $\alpha$ to build the new system of equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3\\
&\vdots\\
\alpha a_{i1}x_1 + \alpha a_{i2}x_2 + \alpha a_{i3}x_3 + \cdots + \alpha a_{in}x_n &= \alpha b_i\\
&\vdots\\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
Let $S$ denote the solutions to the system in the statement of the theorem, and let $T$ denote the solutions to the transformed system.

(a) Show $S \subseteq T$. Suppose $(x_1, x_2, x_3, \ldots, x_n) = (\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in S$ is a solution to the original system. Ignoring the $i$-th equation for a moment, we know it makes all the other equations of the transformed system true. We also know that
\[
a_{i1}\beta_1 + a_{i2}\beta_2 + a_{i3}\beta_3 + \cdots + a_{in}\beta_n = b_i
\]
which we can multiply by $\alpha$ to get
\[
\alpha a_{i1}\beta_1 + \alpha a_{i2}\beta_2 + \alpha a_{i3}\beta_3 + \cdots + \alpha a_{in}\beta_n = \alpha b_i
\]
This says that the $i$-th equation of the transformed system is also true, so we have established that $(\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in T$, and therefore $S \subseteq T$.

(b) Now show $T \subseteq S$. Suppose $(x_1, x_2, x_3, \ldots, x_n) = (\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in T$ is a solution to the transformed system. Ignoring the $i$-th equation for a moment, we know it makes all the other equations of the original system true. We also know that
\[
\alpha a_{i1}\beta_1 + \alpha a_{i2}\beta_2 + \alpha a_{i3}\beta_3 + \cdots + \alpha a_{in}\beta_n = \alpha b_i
\]
which we can multiply by $\frac{1}{\alpha}$, since $\alpha \neq 0$, to get
\[
a_{i1}\beta_1 + a_{i2}\beta_2 + a_{i3}\beta_3 + \cdots + a_{in}\beta_n = b_i
\]
This says that the $i$-th equation of the original system is also true, so we have established that $(\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in S$, and therefore $T \subseteq S$. Locate the key point where we required that $\alpha \neq 0$, and consider what would happen if $\alpha = 0$.


3. Suppose $\alpha$ is a number. Let us choose to multiply the terms of equation $i$ by $\alpha$ and add them to equation $j$ in order to build the new system of equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3\\
&\vdots\\
(\alpha a_{i1} + a_{j1})x_1 + (\alpha a_{i2} + a_{j2})x_2 + \cdots + (\alpha a_{in} + a_{jn})x_n &= \alpha b_i + b_j\\
&\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{align*}
Let $S$ denote the solutions to the system in the statement of the theorem, and let $T$ denote the solutions to the transformed system.

(a) Show $S \subseteq T$. Suppose $(x_1, x_2, x_3, \ldots, x_n) = (\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in S$ is a solution to the original system. Ignoring the $j$-th equation for a moment, we know this solution makes all the other equations of the transformed system true. Using the fact that the solution makes the $i$-th and $j$-th equations of the original system true, we find
\begin{align*}
&(\alpha a_{i1} + a_{j1})\beta_1 + (\alpha a_{i2} + a_{j2})\beta_2 + \cdots + (\alpha a_{in} + a_{jn})\beta_n\\
&\quad= (\alpha a_{i1}\beta_1 + \alpha a_{i2}\beta_2 + \cdots + \alpha a_{in}\beta_n) + (a_{j1}\beta_1 + a_{j2}\beta_2 + \cdots + a_{jn}\beta_n)\\
&\quad= \alpha(a_{i1}\beta_1 + a_{i2}\beta_2 + \cdots + a_{in}\beta_n) + (a_{j1}\beta_1 + a_{j2}\beta_2 + \cdots + a_{jn}\beta_n)\\
&\quad= \alpha b_i + b_j.
\end{align*}
This says that the $j$-th equation of the transformed system is also true, so we have established that $(\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in T$, and therefore $S \subseteq T$.

(b) Now show $T \subseteq S$. Suppose $(x_1, x_2, x_3, \ldots, x_n) = (\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in T$ is a solution to the transformed system. Ignoring the $j$-th equation for a moment, we know it makes all the other equations of the original system true. We then find
\begin{align*}
a_{j1}\beta_1 + \cdots + a_{jn}\beta_n &= a_{j1}\beta_1 + \cdots + a_{jn}\beta_n + \alpha b_i - \alpha b_i\\
&= a_{j1}\beta_1 + \cdots + a_{jn}\beta_n + (\alpha a_{i1}\beta_1 + \cdots + \alpha a_{in}\beta_n) - \alpha b_i\\
&= a_{j1}\beta_1 + \alpha a_{i1}\beta_1 + \cdots + a_{jn}\beta_n + \alpha a_{in}\beta_n - \alpha b_i\\
&= (\alpha a_{i1} + a_{j1})\beta_1 + \cdots + (\alpha a_{in} + a_{jn})\beta_n - \alpha b_i\\
&= \alpha b_i + b_j - \alpha b_i\\
&= b_j
\end{align*}
This says that the $j$-th equation of the original system is also true, so we have established that $(\beta_1, \beta_2, \beta_3, \ldots, \beta_n) \in S$, and therefore $T \subseteq S$.

Why did we not need to require that $\alpha \neq 0$ for this row operation? In other words, how does the third statement of the theorem read when $\alpha = 0$? Does our proof require some extra care when $\alpha = 0$? Compare your answers with the similar situation for the second row operation. (See Exercise SSLE.T20.) ■

Theorem EOPSS is the necessary tool to complete our strategy for solving systems of equations. We will use equation operations to move from one system to another, all the while keeping the solution set the same. With the right sequence of operations, we will arrive at a simpler equation to solve. The next two examples illustrate this idea, while saving some of the details for later.

Example US Three equations, one solution
We solve the following system by a sequence of equation operations.
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
x_1 + 3x_2 + 3x_3 &= 5\\
2x_1 + 6x_2 + 5x_3 &= 6
\end{align*}
$\alpha = -1$ times equation 1, add to equation 2:
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
0x_1 + 1x_2 + 1x_3 &= 1\\
2x_1 + 6x_2 + 5x_3 &= 6
\end{align*}
$\alpha = -2$ times equation 1, add to equation 3:
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
0x_1 + 1x_2 + 1x_3 &= 1\\
0x_1 + 2x_2 + 1x_3 &= -2
\end{align*}
$\alpha = -2$ times equation 2, add to equation 3:
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
0x_1 + 1x_2 + 1x_3 &= 1\\
0x_1 + 0x_2 - 1x_3 &= -4
\end{align*}
$\alpha = -1$ times equation 3:
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
0x_1 + 1x_2 + 1x_3 &= 1\\
0x_1 + 0x_2 + 1x_3 &= 4
\end{align*}
which can be written more clearly as
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
x_2 + x_3 &= 1\\
x_3 &= 4
\end{align*}
This is now a very easy system of equations to solve. The third equation requires that $x_3 = 4$ to be true. Making this substitution into equation 2 we arrive at $x_2 = -3$, and finally, substituting these values of $x_2$ and $x_3$ into the first equation, we find that $x_1 = 2$. Note too that this is the only solution to this final system of equations, since we were forced to choose these values to make the equations true. Since we performed equation operations on each system to obtain the next one in the list, all of the systems listed here are equivalent to each other by Theorem EOPSS. Thus $(x_1, x_2, x_3) = (2, -3, 4)$ is the unique solution to the original system of equations (and all of the other intermediate systems of equations listed as we transformed one into another). △
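As an independent cross-check of this example (our own sketch; the text develops its own systematic method in the next section), a numerical solver reproduces the same unique solution:

    import numpy as np

    A = np.array([[1, 2, 2], [1, 3, 3], [2, 6, 5]], dtype=float)
    b = np.array([4, 5, 6], dtype=float)
    print(np.linalg.solve(A, b))   # [ 2. -3.  4.]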

Example IS Three equations, infinitely many solutions
The following system of equations made an appearance earlier in this section (Example NSE), where we listed one of its solutions. Now, we will try to find all of the solutions to this system. Do not concern yourself too much about why we choose this particular sequence of equation operations, just believe that the work we do is all correct.
\begin{align*}
x_1 + 2x_2 + 0x_3 + x_4 &= 7\\
x_1 + x_2 + x_3 - x_4 &= 3\\
3x_1 + x_2 + 5x_3 - 7x_4 &= 1
\end{align*}
$\alpha = -1$ times equation 1, add to equation 2:
\begin{align*}
x_1 + 2x_2 + 0x_3 + x_4 &= 7\\
0x_1 - x_2 + x_3 - 2x_4 &= -4\\
3x_1 + x_2 + 5x_3 - 7x_4 &= 1
\end{align*}
$\alpha = -3$ times equation 1, add to equation 3:
\begin{align*}
x_1 + 2x_2 + 0x_3 + x_4 &= 7\\
0x_1 - x_2 + x_3 - 2x_4 &= -4\\
0x_1 - 5x_2 + 5x_3 - 10x_4 &= -20
\end{align*}
$\alpha = -5$ times equation 2, add to equation 3:
\begin{align*}
x_1 + 2x_2 + 0x_3 + x_4 &= 7\\
0x_1 - x_2 + x_3 - 2x_4 &= -4\\
0x_1 + 0x_2 + 0x_3 + 0x_4 &= 0
\end{align*}
$\alpha = -1$ times equation 2:
\begin{align*}
x_1 + 2x_2 + 0x_3 + x_4 &= 7\\
0x_1 + x_2 - x_3 + 2x_4 &= 4\\
0x_1 + 0x_2 + 0x_3 + 0x_4 &= 0
\end{align*}
$\alpha = -2$ times equation 2, add to equation 1:
\begin{align*}
x_1 + 0x_2 + 2x_3 - 3x_4 &= -1\\
0x_1 + x_2 - x_3 + 2x_4 &= 4\\
0x_1 + 0x_2 + 0x_3 + 0x_4 &= 0
\end{align*}
which can be written more clearly as
\begin{align*}
x_1 + 2x_3 - 3x_4 &= -1\\
x_2 - x_3 + 2x_4 &= 4\\
0 &= 0
\end{align*}
What does the equation $0 = 0$ mean? We can choose any values for $x_1$, $x_2$, $x_3$, $x_4$ and this equation will be true, so we only need to consider further the first two equations, since the third is true no matter what. We can analyze the second equation without consideration of the variable $x_1$. It would appear that there is considerable latitude in how we can choose $x_2$, $x_3$, $x_4$ and make this equation true. Let us choose $x_3$ and $x_4$ to be anything we please, say $x_3 = a$ and $x_4 = b$.

Now we can take these arbitrary values for $x_3$ and $x_4$, substitute them in equation 1, to obtain
\begin{align*}
x_1 + 2a - 3b &= -1\\
x_1 &= -1 - 2a + 3b
\end{align*}
Similarly, equation 2 becomes
\begin{align*}
x_2 - a + 2b &= 4\\
x_2 &= 4 + a - 2b
\end{align*}
So our arbitrary choices of values for $x_3$ and $x_4$ ($a$ and $b$) translate into specific values of $x_1$ and $x_2$. The lone solution given in Example NSE was obtained by choosing $a = 2$ and $b = 1$. Now we can easily and quickly find many more (infinitely more). Suppose we choose $a = 5$ and $b = -2$, then we compute
\begin{align*}
x_1 &= -1 - 2(5) + 3(-2) = -17\\
x_2 &= 4 + 5 - 2(-2) = 13
\end{align*}
and you can verify that $(x_1, x_2, x_3, x_4) = (-17, 13, 5, -2)$ makes all three equations true. The entire solution set is written as
\[
S = \{(-1 - 2a + 3b,\ 4 + a - 2b,\ a,\ b) \mid a \in \mathbb{C},\ b \in \mathbb{C}\}
\]
It would be instructive to finish off your study of this example by taking the general form of the solutions given in this set and substituting them into each of the three equations and verify that they are true in each case (Exercise SSLE.M40). △
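Exercise SSLE.M40 asks for a symbolic verification of this set; a quick numerical spot-check is also easy. Here is a sketch in Python (our own illustration) that tries several random choices of a and b:

    import random

    for _ in range(5):
        a, b = random.randint(-9, 9), random.randint(-9, 9)
        x1, x2, x3, x4 = -1 - 2*a + 3*b, 4 + a - 2*b, a, b
        print(x1 + 2*x2 + x4 == 7,
              x1 + x2 + x3 - x4 == 3,
              3*x1 + x2 + 5*x3 - 7*x4 == 1)   # True True True, every time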

In the next section we will describe how to use equation operations to systematically solve any system of linear equations. But first, read one of our more important pieces of advice about speaking and writing mathematics. See Proof Technique L.

Before attacking the exercises in this section, it will be helpful to read some advice on getting started on the construction of a proof. See Proof Technique GS.

Reading Questions

1. How many solutions does the system of equations $3x + 2y = 4$, $6x + 4y = 8$ have? Explain your answer.

2. How many solutions does the system of equations $3x + 2y = 4$, $6x + 4y = -2$ have? Explain your answer.

3. What do we mean when we say mathematics is a language?

Exercises

C10 Find a solution to the system in Example IS where $x_3 = 6$ and $x_4 = 2$. Find two other solutions to the system. Find a solution where $x_1 = -17$ and $x_2 = 14$. How many possible answers are there to each of these questions?

C20 Each archetype (Archetypes) that is a system of equations begins by listing some specific solutions. Verify the specific solutions listed in the following archetypes by evaluating the system of equations with the solutions listed.

Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J

C30† Find all solutions to the linear system:
\begin{align*}
x + y &= 5\\
2x - y &= 3
\end{align*}

C31 Find all solutions to the linear system:
\begin{align*}
3x + 2y &= 1\\
x - y &= 2\\
4x + 2y &= 2
\end{align*}

C32 Find all solutions to the linear system:
\begin{align*}
x + 2y &= 8\\
x - y &= 2\\
x + y &= 4
\end{align*}

C33 Find all solutions to the linear system:
\begin{align*}
x + y - z &= -1\\
x - y - z &= -1\\
z &= 2
\end{align*}

C34 Find all solutions to the linear system:
\begin{align*}
x + y - z &= -5\\
x - y - z &= -3\\
x + y - z &= 0
\end{align*}

C50† A three-digit number has two properties. The tens-digit and the ones-digit add up to 5. If the number is written with the digits in the reverse order, and then subtracted from the original number, the result is 792. Use a system of equations to find all of the three-digit numbers with these properties.

C51† Find all of the six-digit numbers in which the first digit is one less than the second, the third digit is half the second, the fourth digit is three times the third and the last two digits form a number that equals the sum of the fourth and fifth. The sum of all the digits is 24. (From The MENSA Puzzle Calendar for January 9, 2006.)

C52† Driving along, Terry notices that the last four digits on his car's odometer are palindromic. A mile later, the last five digits are palindromic. After driving another mile, the middle four digits are palindromic. One more mile, and all six are palindromic. What was the odometer reading when Terry first looked at it? Form a linear system of equations that expresses the requirements of this puzzle. (Car Talk Puzzler, National Public Radio, Week of January 21, 2008) (A car odometer displays six digits and a sequence is a palindrome if it reads the same left-to-right as right-to-left.)

C53† An article in The Economist ("Free Exchange", December 6, 2014) quotes the following problem as an illustration that some of the "underlying assumptions of classical economics" about people's behavior are incorrect and "the mind plays tricks." A bat and ball cost $1.10 between them, and the bat costs $1.00 more than the ball. How much does each cost? Answer this quickly with no writing, then construct a system of linear equations and solve the problem carefully.

M10† Each sentence below has at least two meanings. Identify the source of the double meaning, and rewrite the sentence (at least twice) to clearly convey each meaning.

1. They are baking potatoes.

2. He bought many ripe pears and apricots.

3. She likes his sculpture.

4. I decided on the bus.

M11† Discuss the difference in meaning of each of the following three almost identical sentences, which all have the same grammatical structure. (These are due to Keith Devlin.)

1. She saw him in the park with a dog.

2. She saw him in the park with a fountain.

3. She saw him in the park with a telescope.

M12† The following sentence, due to Noam Chomsky, has a correct grammatical structure, but is meaningless. Critique its faults. "Colorless green ideas sleep furiously." (Chomsky, Noam. Syntactic Structures, The Hague/Paris: Mouton, 1957. p. 15.)

M13† Read the following sentence and form a mental picture of the situation.

The baby cried and the mother picked it up.

What assumptions did you make about the situation?

M14 Discuss the difference in meaning of the following two almost identical sentences, which have nearly identical grammatical structure. (This antanaclasis is often attributed to the comedian Groucho Marx, but has earlier roots.)

1. Time flies like an arrow.

2. Fruit flies like a banana.

M30† This problem appears in a middle-school mathematics textbook: Together Dan and Diane have $20. Together Diane and Donna have $15. How much do the three of them have in total? (Transition Mathematics, Second Edition, Scott Foresman Addison Wesley, 1998. Problem 5–1.19.)

M40 Solutions to the system in Example IS are given as
\[
(x_1, x_2, x_3, x_4) = (-1 - 2a + 3b,\ 4 + a - 2b,\ a,\ b)
\]
Evaluate the three equations of the original system with these expressions in $a$ and $b$ and verify that each equation is true, no matter what values are chosen for $a$ and $b$.

M70† We have seen in this section that systems of linear equations have limited possibilities for solution sets, and we will shortly prove Theorem PSSLS that describes these possibilities exactly. This exercise will show that if we relax the requirement that our equations be linear, then the possibilities expand greatly. Consider a system of two equations in the two variables $x$ and $y$, where the departure from linearity involves simply squaring the variables.
\begin{align*}
x^2 - y^2 &= 1\\
x^2 + y^2 &= 4
\end{align*}
After solving this system of nonlinear equations, replace the second equation in turn by $x^2 + 2x + y^2 = 3$, $x^2 + y^2 = 1$, $x^2 - 4x + y^2 = -3$, $-x^2 + y^2 = 1$ and solve each resulting system of two equations in two variables. (This exercise includes suggestions from Don Kreher.)

T10† Proof Technique D asks you to formulate a definition of what it means for a whole number to be odd. What is your definition? (Do not say "the opposite of even.") Is 6 odd? Is 11 odd? Justify your answers by using your definition.

T20† Explain why the second equation operation in Definition EO requires that the scalar be nonzero, while in the third equation operation this restriction on the scalar is not present.


Section RREF
Reduced Row-Echelon Form

After solving a few systems of equations, you will recognize that it does not matter so much what we call our variables, as opposed to what numbers act as their coefficients. A system in the variables $x_1, x_2, x_3$ would behave the same if we changed the names of the variables to $a, b, c$ and kept all the constants the same and in the same places. In this section, we will isolate the key bits of information about a system of equations into something called a matrix, and then use this matrix to systematically solve the equations. Along the way we will obtain one of our most important and useful computational tools.

Subsection MVNSE
Matrix and Vector Notation for Systems of Equations

Definition M Matrix
An $m \times n$ matrix is a rectangular layout of numbers from $\mathbb{C}$ having $m$ rows and $n$ columns. We will use upper-case Latin letters from the start of the alphabet ($A, B, C, \ldots$) to denote matrices and squared-off brackets to delimit the layout. Many use large parentheses instead of brackets; the distinction is not important. Rows of a matrix will be referenced starting at the top and working down (i.e. row 1 is at the top) and columns will be referenced starting from the left (i.e. column 1 is at the left). For a matrix $A$, the notation $[A]_{ij}$ will refer to the complex number in row $i$ and column $j$ of $A$. □

Be careful with this notation for individual entries, since it is easy to think that $[A]_{ij}$ refers to the whole matrix. It does not. It is just a number, but is a convenient way to talk about the individual entries simultaneously. This notation will get a heavy workout once we get to Chapter M.

Example AM A matrix
\[
B = \begin{bmatrix}
-1 & 2 & 5 & 3\\
1 & 0 & -6 & 1\\
-4 & 2 & 2 & -2
\end{bmatrix}
\]
is a matrix with $m = 3$ rows and $n = 4$ columns. We can say that $[B]_{2,3} = -6$ while $[B]_{3,4} = -2$. △

When we do equation operations on systems of equations, the names of the variables really are not very important. Use $x_1, x_2, x_3$, or $a, b, c$, or $x, y, z$, it really does not matter. In this subsection we will describe some notation that will make it easier to describe linear systems, solve the systems and describe the solution sets. Here is a list of definitions, laden with notation.

Definition CV Column Vector
A column vector of size $m$ is an ordered list of $m$ numbers, which is written in order vertically, starting at the top and proceeding to the bottom. At times, we will refer to a column vector as simply a vector. Column vectors will be written in bold, usually with lower case Latin letters from the end of the alphabet such as $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$. Some books like to write vectors with arrows, such as $\vec{u}$. Writing by hand, some like to put arrows on top of the symbol, or a tilde underneath the symbol, as in $\underset{\sim}{u}$. To refer to the entry or component of vector $\mathbf{v}$ in location $i$ of the list, we write $[\mathbf{v}]_i$. □

Be careful with this notation. While the symbols $[\mathbf{v}]_i$ might look somewhat substantial, as an object this represents just one entry of a vector, which is just a single complex number.

Definition ZCV Zero Column Vector
The zero vector of size $m$ is the column vector of size $m$ where each entry is the number zero,
\[
\mathbf{0} = \begin{bmatrix} 0\\ 0\\ 0\\ \vdots\\ 0 \end{bmatrix}
\]
or defined much more compactly, $[\mathbf{0}]_i = 0$ for $1 \le i \le m$. □

Definition CM Coefficient Matrix
For a system of linear equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3\\
&\vdots\\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
the coefficient matrix is the $m \times n$ matrix
\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n}\\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n}\\
a_{31} & a_{32} & a_{33} & \cdots & a_{3n}\\
\vdots & & & & \vdots\\
a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn}
\end{bmatrix}
\]
□

Definition VOC Vector of Constants
For a system of linear equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3\\
&\vdots\\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
the vector of constants is the column vector of size $m$
\[
\mathbf{b} = \begin{bmatrix} b_1\\ b_2\\ b_3\\ \vdots\\ b_m \end{bmatrix}
\]
□

Definition SOLV Solution Vector
For a system of linear equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3\\
&\vdots\\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
the solution vector is the column vector of size $n$
\[
\mathbf{x} = \begin{bmatrix} x_1\\ x_2\\ x_3\\ \vdots\\ x_n \end{bmatrix}
\]
□

The solution vector may do double-duty on occasion. It might refer to a list of variable quantities at one point, and subsequently refer to values of those variables that actually form a particular solution to that system.

Definition MRLS Matrix Representation of a Linear System
If $A$ is the coefficient matrix of a system of linear equations and $\mathbf{b}$ is the vector of constants, then we will write $\mathcal{LS}(A, \mathbf{b})$ as a shorthand expression for the system of linear equations, which we will refer to as the matrix representation of the linear system. □

Example NSLE Notation for systems of linear equations
The system of linear equations
\begin{align*}
2x_1 + 4x_2 - 3x_3 + 5x_4 + x_5 &= 9\\
3x_1 + x_2 + x_4 - 3x_5 &= 0\\
-2x_1 + 7x_2 - 5x_3 + 2x_4 + 2x_5 &= -3
\end{align*}
has coefficient matrix
\[
A = \begin{bmatrix}
2 & 4 & -3 & 5 & 1\\
3 & 1 & 0 & 1 & -3\\
-2 & 7 & -5 & 2 & 2
\end{bmatrix}
\]
and vector of constants
\[
\mathbf{b} = \begin{bmatrix} 9\\ 0\\ -3 \end{bmatrix}
\]
and so will be referenced as $\mathcal{LS}(A, \mathbf{b})$. △

Definition AM Augmented Matrix
Suppose we have a system of $m$ equations in $n$ variables, with coefficient matrix $A$ and vector of constants $\mathbf{b}$. Then the augmented matrix of the system of equations is the $m \times (n + 1)$ matrix whose first $n$ columns are the columns of $A$ and whose last column ($n + 1$) is the column vector $\mathbf{b}$. This matrix will be written as $[A \mid \mathbf{b}]$. □

The augmented matrix represents all the important information in the system of equations, since the names of the variables have been ignored, and the only connection with the variables is the location of their coefficients in the matrix. It is important to realize that the augmented matrix is just that, a matrix, and not a system of equations. In particular, the augmented matrix does not have any "solutions," though it will be useful for finding solutions to the system of equations that it is associated with. (Think about your objects, and review Proof Technique L.) However, notice that an augmented matrix always belongs to some system of equations, and vice versa, so it is tempting to try and blur the distinction between the two. Here is a quick example.

Example AMAA Augmented matrix for Archetype A
Archetype A is the following system of 3 equations in 3 variables.
\begin{align*}
x_1 - x_2 + 2x_3 &= 1\\
2x_1 + x_2 + x_3 &= 8\\
x_1 + x_2 &= 5
\end{align*}
Here is its augmented matrix.
\[
\begin{bmatrix}
1 & -1 & 2 & 1\\
2 & 1 & 1 & 8\\
1 & 1 & 0 & 5
\end{bmatrix}
\]
△
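Assembling an augmented matrix is mechanical enough to hand to a computer. Here is a small sketch in Python with NumPy (our own illustration; NumPy is an assumption, not a tool introduced by the text) building the augmented matrix of Archetype A:

    import numpy as np

    A = np.array([[1, -1, 2], [2, 1, 1], [1, 1, 0]])   # coefficient matrix
    b = np.array([1, 8, 5])                            # vector of constants
    aug = np.column_stack([A, b])                      # the augmented matrix [A | b]
    print(aug)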

Subsection RO
Row Operations

An augmented matrix for a system of equations will save us the tedium of continually writing down the names of the variables as we solve the system. It will also release us from any dependence on the actual names of the variables. We have seen how certain operations we can perform on equations (Definition EO) will preserve their solutions (Theorem EOPSS). The next two definitions and the following theorem carry over these ideas to augmented matrices.

Definition RO Row Operations
The following three operations will transform an $m \times n$ matrix into a different matrix of the same size, and each is known as a row operation.

1. Swap the locations of two rows.

2. Multiply each entry of a single row by a nonzero quantity.

3. Multiply each entry of one row by some quantity, and add these values to the entries in the same columns of a second row. Leave the first row the same after this operation, but replace the second row by the new values.

We will use a symbolic shorthand to describe these row operations:

1. $R_i \leftrightarrow R_j$: Swap the location of rows $i$ and $j$.

2. $\alpha R_i$: Multiply row $i$ by the nonzero scalar $\alpha$.

3. $\alpha R_i + R_j$: Multiply row $i$ by the scalar $\alpha$ and add to row $j$.

□
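Each row operation is a one-line transformation of an array. The following Python sketch (our own illustration, with hypothetical helper names swap, scale and combine; NumPy is assumed) implements the three operations in place on a float matrix, using 1-based row indices to match the shorthand above:

    import numpy as np

    def swap(A, i, j):            # R_i <-> R_j
        A[[i - 1, j - 1]] = A[[j - 1, i - 1]]

    def scale(A, alpha, i):       # alpha R_i, alpha nonzero
        A[i - 1] = alpha * A[i - 1]

    def combine(A, alpha, i, j):  # alpha R_i + R_j, replacing row j
        A[j - 1] = alpha * A[i - 1] + A[j - 1]

    A = np.array([[2, -1, 3, 4], [5, 2, -2, 3], [1, 1, 0, 6]], dtype=float)
    swap(A, 1, 3)          # R1 <-> R3
    combine(A, -2, 1, 2)   # -2 R1 + R2
    print(A)               # the matrix B of Example TREM below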

Definition REM Row-Equivalent Matrices
Two matrices, $A$ and $B$, are row-equivalent if one can be obtained from the other by a sequence of row operations. □

Example TREM Two row-equivalent matrices
The matrices
\[
A = \begin{bmatrix}
2 & -1 & 3 & 4\\
5 & 2 & -2 & 3\\
1 & 1 & 0 & 6
\end{bmatrix}
\qquad
B = \begin{bmatrix}
1 & 1 & 0 & 6\\
3 & 0 & -2 & -9\\
2 & -1 & 3 & 4
\end{bmatrix}
\]
are row-equivalent as can be seen from
\[
\begin{bmatrix}
2 & -1 & 3 & 4\\
5 & 2 & -2 & 3\\
1 & 1 & 0 & 6
\end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix}
1 & 1 & 0 & 6\\
5 & 2 & -2 & 3\\
2 & -1 & 3 & 4
\end{bmatrix}
\xrightarrow{-2R_1 + R_2}
\begin{bmatrix}
1 & 1 & 0 & 6\\
3 & 0 & -2 & -9\\
2 & -1 & 3 & 4
\end{bmatrix}
\]
We can also say that any pair of these three matrices are row-equivalent. △

Notice that each of the three row operations is reversible (Exercise RREF.T10), so we do not have to be careful about the distinction between "A is row-equivalent to B" and "B is row-equivalent to A." (Exercise RREF.T11)

The preceding definitions are designed to make the following theorem possible. It says that row-equivalent matrices represent systems of linear equations that have identical solution sets.

Theorem REMES Row-Equivalent Matrices represent Equivalent Systems
Suppose that $A$ and $B$ are row-equivalent augmented matrices. Then the systems of linear equations that they represent are equivalent systems.

Proof. If we perform a single row operation on an augmented matrix, it will have the same effect as if we did the analogous equation operation on the system of equations the matrix represents. By exactly the same methods as we used in the proof of Theorem EOPSS we can see that each of these row operations will preserve the set of solutions for the system of equations the matrix represents. ■

So at this point, our strategy is to begin with a system of equations, represent the system by an augmented matrix, perform row operations (which will preserve solutions for the system) to get a "simpler" augmented matrix, convert back to a "simpler" system of equations and then solve that system, knowing that its solutions are those of the original system. Here is a rehash of Example US as an exercise in using our new tools.

Example USR Three equations, one solution, reprised
We solve the following system using augmented matrices and row operations. This is the same system of equations solved in Example US using equation operations.
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
x_1 + 3x_2 + 3x_3 &= 5\\
2x_1 + 6x_2 + 5x_3 &= 6
\end{align*}
Form the augmented matrix,
\[
A = \begin{bmatrix}
1 & 2 & 2 & 4\\
1 & 3 & 3 & 5\\
2 & 6 & 5 & 6
\end{bmatrix}
\]
and apply row operations,
\[
\xrightarrow{-1R_1 + R_2}
\begin{bmatrix}
1 & 2 & 2 & 4\\
0 & 1 & 1 & 1\\
2 & 6 & 5 & 6
\end{bmatrix}
\xrightarrow{-2R_1 + R_3}
\begin{bmatrix}
1 & 2 & 2 & 4\\
0 & 1 & 1 & 1\\
0 & 2 & 1 & -2
\end{bmatrix}
\]
\[
\xrightarrow{-2R_2 + R_3}
\begin{bmatrix}
1 & 2 & 2 & 4\\
0 & 1 & 1 & 1\\
0 & 0 & -1 & -4
\end{bmatrix}
\xrightarrow{-1R_3}
\begin{bmatrix}
1 & 2 & 2 & 4\\
0 & 1 & 1 & 1\\
0 & 0 & 1 & 4
\end{bmatrix}
\]
So the matrix
\[
B = \begin{bmatrix}
1 & 2 & 2 & 4\\
0 & 1 & 1 & 1\\
0 & 0 & 1 & 4
\end{bmatrix}
\]
is row-equivalent to $A$ and by Theorem REMES the system of equations below has the same solution set as the original system of equations.
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4\\
x_2 + x_3 &= 1\\
x_3 &= 4
\end{align*}
Solving this "simpler" system is straightforward and is identical to the process in Example US. △
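For readers who want to experiment, the four row operations of this example can be replayed mechanically. The following self-contained Python sketch (our own illustration; NumPy is an assumption) transforms the augmented matrix A into B:

    import numpy as np

    A = np.array([[1, 2, 2, 4], [1, 3, 3, 5], [2, 6, 5, 6]], dtype=float)

    A[1] = -1 * A[0] + A[1]   # -1 R1 + R2
    A[2] = -2 * A[0] + A[2]   # -2 R1 + R3
    A[2] = -2 * A[1] + A[2]   # -2 R2 + R3
    A[2] = -1 * A[2]          # -1 R3
    print(A)                  # the matrix B of Example USR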

Subsection RREF
Reduced Row-Echelon Form

The preceding example amply illustrates the definitions and theorems we have seen so far. But it still leaves two questions unanswered. Exactly what is this "simpler" form for a matrix, and just how do we get it? Here is the answer to the first question, a definition of reduced row-echelon form.

Definition RREF Reduced Row-Echelon Form
A matrix is in reduced row-echelon form if it meets all of the following conditions:

1. If there is a row where every entry is zero, then this row lies below any other row that contains a nonzero entry.

2. The leftmost nonzero entry of a row is equal to 1.

3. The leftmost nonzero entry of a row is the only nonzero entry in its column.

4. Consider any two different leftmost nonzero entries, one located in row $i$, column $j$ and the other located in row $s$, column $t$. If $s > i$, then $t > j$.

A row of only zero entries is called a zero row and the leftmost nonzero entry of a nonzero row is a leading 1. A column containing a leading 1 will be called a pivot column. The number of nonzero rows will be denoted by $r$, which is also equal to the number of leading 1's and the number of pivot columns.

The set of column indices for the pivot columns will be denoted by $D = \{d_1, d_2, d_3, \ldots, d_r\}$ where $d_1 < d_2 < d_3 < \cdots < d_r$, while the columns that are not pivot columns will be denoted as $F = \{f_1, f_2, f_3, \ldots, f_{n-r}\}$ where $f_1 < f_2 < f_3 < \cdots < f_{n-r}$. □
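The four conditions of this definition are mechanical enough to test by computer. Here is a Python sketch (our own illustration, with the hypothetical helper name is_rref) that checks each condition on a matrix given as a list of rows:

    def is_rref(A):
        """Check the four conditions of Definition RREF."""
        pivots = []                      # column of each leading 1, top to bottom
        seen_zero_row = False
        for i, row in enumerate(A):
            nz = [j for j, e in enumerate(row) if e != 0]
            if not nz:
                seen_zero_row = True
                continue
            if seen_zero_row:            # condition 1: zero rows lie at the bottom
                return False
            j = nz[0]
            if row[j] != 1:              # condition 2: leftmost nonzero entry is 1
                return False
            if any(A[k][j] != 0 for k in range(len(A)) if k != i):
                return False             # condition 3: alone in its pivot column
            if pivots and j <= pivots[-1]:
                return False             # condition 4: leading 1's move rightward
            pivots.append(j)
        return True

    print(is_rref([[1, 0, 3], [0, 1, 2], [0, 0, 0]]))   # True
    print(is_rref([[0, 1, 2], [1, 0, 3]]))              # False (condition 4 fails)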


Theorem REMEF Row-Equivalent Matrix in Echelon Form
Suppose $A$ is a matrix. Then there is a matrix $B$ so that

1. $A$ and $B$ are row-equivalent.

2. $B$ is in reduced row-echelon form.

Proof. Suppose that $A$ has $m$ rows and $n$ columns. We will describe a process for converting $A$ into $B$ via row operations. This procedure is known as Gauss-Jordan elimination. Tracing through this procedure will be easier if you recognize that $i$ refers to a row that is being converted, $j$ refers to a column that is being converted, and $r$ keeps track of the number of nonzero rows. Here we go.

1. Set $j = 0$ and $r = 0$.

2. Increase $j$ by 1. If $j$ now equals $n + 1$, then stop.

3. Examine the entries of $A$ in column $j$ located in rows $r + 1$ through $m$. If all of these entries are zero, then go to Step 2.

4. Choose a row from rows $r + 1$ through $m$ with a nonzero entry in column $j$. Let $i$ denote the index for this row.

5. Increase $r$ by 1.

6. Use the first row operation to swap rows $i$ and $r$.

7. Use the second row operation to convert the entry in row $r$ and column $j$ to a 1.

8. Use the third row operation with row $r$ to convert every other entry of column $j$ to zero.

9. Go to Step 2.

The result of this procedure is that the matrix $A$ is converted to a matrix in reduced row-echelon form, which we will refer to as $B$. We need to now prove this claim by showing that the converted matrix has the requisite properties of Definition RREF. First, the matrix is only converted through row operations (Steps 6, 7, 8), so $A$ and $B$ are row-equivalent (Definition REM).

It is a bit more work to be certain that $B$ is in reduced row-echelon form. We claim that as we begin Step 2, the first $j$ columns of the matrix are in reduced row-echelon form with $r$ nonzero rows. Certainly this is true at the start when $j = 0$, since the matrix has no columns and so vacuously meets the conditions of Definition RREF with $r = 0$ nonzero rows.

In Step 2 we increase $j$ by 1 and begin to work with the next column. There are two possible outcomes for Step 3. Suppose that every entry of column $j$ in rows $r + 1$ through $m$ is zero. Then with no changes we recognize that the first $j$ columns of the matrix have their first $r$ rows still in reduced row-echelon form, with the final $m - r$ rows still all zero.

Suppose <strong>in</strong>stead that the entry <strong>in</strong> row i of column j is nonzero. Notice that s<strong>in</strong>ce<br />

r +1≤ i ≤ m, we know the first j − 1 entries of this row are all zero. Now, <strong>in</strong> Step<br />

5 we <strong>in</strong>crease r by 1, and then embark on build<strong>in</strong>g a new nonzero row. In Step 6 we<br />

swap row r and row i. In the first j columns, the first r − 1 rows rema<strong>in</strong> <strong>in</strong> reduced<br />

row-echelon form after the swap. In Step 7 we multiply row r by a nonzero scalar,<br />

creat<strong>in</strong>ga1<strong>in</strong>theentry<strong>in</strong>columnj of row i, and not chang<strong>in</strong>g any other rows.<br />

This new lead<strong>in</strong>g 1 is the first nonzero entry <strong>in</strong> its row, and is located to the right of<br />

all the lead<strong>in</strong>g 1’s <strong>in</strong> the preced<strong>in</strong>g r − 1 rows. With Step 8 we <strong>in</strong>sure that every<br />

entry <strong>in</strong> the column with this new lead<strong>in</strong>g 1 is now zero, as required for reduced<br />

row-echelon form. Also, rows r +1 through m are now all zeros <strong>in</strong> the first j columns,<br />

so we now only have one new nonzero row, consistent with our <strong>in</strong>crease of r by one.<br />

Furthermore, s<strong>in</strong>ce the first j − 1 entries of row r are zero, the employment of the<br />

third row operation does not destroy any of the necessary features of rows 1 through<br />

r − 1androwsr +1throughm, <strong>in</strong> columns 1 through j − 1.<br />

So at this stage, the first j columns of the matrix are <strong>in</strong> reduced row-echelon<br />

form. When Step 2 f<strong>in</strong>ally <strong>in</strong>creases j to n + 1, then the procedure is completed and<br />

the full n columns of the matrix are <strong>in</strong> reduced row-echelon form, with the value of<br />

r correctly record<strong>in</strong>g the number of nonzero rows.<br />

<br />

The procedure given in the proof of Theorem REMEF can be more precisely described using a pseudo-code version of a computer program. Single-letter variables, like m, n, i, j, r have the same meanings as above. := is assignment of the value on the right to the variable on the left, A[i,j] is the equivalent of the matrix entry $[A]_{ij}$, while == is an equality test and <> is a "not equals" test.

input m, n and A
r := 0
for j := 1 to n
    i := r+1
    while i <= m and A[i,j] == 0
        i := i+1
    if i < m+1
        r := r+1
        swap rows i and r of A (first row operation)
        scale the entry A[r,j] to a leading 1 (second row operation)
        for k := 1 to m, k <> r
            make A[k,j] zero using row r (third row operation)
output r and A

Notice that as a practical matter the "and" used in the conditional statement of the while loop should be of the "short-circuit" variety, so that the array access that follows is not out-of-bounds.
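For readers who want to experiment, here is a minimal Python rendering of the same procedure. This sketch is our own illustration, not part of the text: the function name rref and the use of exact rational arithmetic through the standard fractions module are choices made for clarity. The column index j sweeps left to right, r counts the nonzero rows built so far, and the three row operations appear as a swap, a scaling, and repeated row additions.

from fractions import Fraction

def rref(A):
    """Row-reduce a matrix, given as a list of lists, following the
    procedure from the proof of Theorem REMEF.  Returns (B, r): the
    reduced row-echelon form and the number of nonzero rows."""
    B = [[Fraction(e) for e in row] for row in A]    # exact arithmetic
    m, n = len(B), len(B[0])
    r = 0
    for j in range(n):                               # Step 2: next column
        # Steps 3-4: look for a nonzero entry in rows r+1..m of column j
        i = next((k for k in range(r, m) if B[k][j] != 0), None)
        if i is None:
            continue                                 # column j has no pivot
        r += 1                                       # Step 5
        B[r - 1], B[i] = B[i], B[r - 1]              # Step 6: swap rows
        B[r - 1] = [e / B[r - 1][j] for e in B[r - 1]]   # Step 7: leading 1
        for k in range(m):                           # Step 8: clear column j
            if k != r - 1 and B[k][j] != 0:
                B[k] = [a - B[k][j] * b for a, b in zip(B[k], B[r - 1])]
    return B, r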

So now we can put it all together. Begin with a system of linear equations (Definition SLE), and represent the system by its augmented matrix (Definition AM). Use row operations (Definition RO) to convert this matrix into reduced row-echelon form (Definition RREF), using the procedure outlined in the proof of Theorem REMEF. Theorem REMEF also tells us we can always accomplish this, and that the result is row-equivalent (Definition REM) to the original augmented matrix. Since the matrix in reduced row-echelon form has the same solution set, we can analyze the row-reduced version instead of the original matrix, viewing it as the augmented matrix of a different system of equations. The beauty of augmented matrices in reduced row-echelon form is that the solution sets to the systems they represent can be easily determined, as we will see in the next few examples and in the next section.

We will see through the course that almost every interesting property of a matrix can be discerned by looking at a row-equivalent matrix in reduced row-echelon form. For this reason it is important to know that the matrix B guaranteed to exist by Theorem REMEF is also unique.

Two proof techniques are applicable to the proof. First, head out and read two proof techniques: Proof Technique CD and Proof Technique U.

Theorem RREFU Reduced Row-Echelon Form is Unique
Suppose that A is an m × n matrix and that B and C are m × n matrices that are row-equivalent to A and in reduced row-echelon form. Then B = C.

Proof. We need to begin with no assumptions about any relationships between B and C, other than they are both in reduced row-echelon form, and they are both row-equivalent to A.

If B and C are both row-equivalent to A, then they are row-equivalent to each other. Repeated row operations on a matrix combine the rows with each other using operations that are linear, and are identical in each column. A key observation for this proof is that each individual row of B is linearly related to the rows of C. This relationship is different for each row of B, but once we fix a row, the relationship is the same across columns. More precisely, there are scalars $\delta_{ik}$, $1 \le i, k \le m$ such that for any $1 \le i \le m$, $1 \le j \le n$,
\[ [B]_{ij} = \sum_{k=1}^{m} \delta_{ik} [C]_{kj} \]
You should read this as saying that an entry of row i of B (in column j) is a linear function of the entries of all the rows of C that are also in column j, and the scalars ($\delta_{ik}$) depend on which row of B we are considering (the i subscript on $\delta_{ik}$), but are the same for every column (no dependence on j in $\delta_{ik}$). This idea may be complicated now, but will feel more familiar once we discuss "linear combinations" (Definition LCCV) and more so when we discuss "row spaces" (Definition RSM). For now, spend some time carefully working Exercise RREF.M40, which is designed to illustrate the origins of this expression. This completes our exploitation of the row-equivalence of B and C.

We now repeatedly exploit the fact that B and C are in reduced row-echelon form. Recall that a pivot column is all zeros, except a single one. More carefully, if R is a matrix in reduced row-echelon form, and $d_\ell$ is the index of a pivot column, then $[R]_{k d_\ell} = 1$ precisely when $k = \ell$ and is otherwise zero. Notice also that any entry of R that is both below the entry in row $\ell$ and to the left of column $d_\ell$ is also zero (with below and left understood to include equality). In other words, look at examples of matrices in reduced row-echelon form and choose a leading 1 (with a box around it). The rest of the column is also zeros, and the lower left "quadrant" of the matrix that begins here is totally zeros.

Assuming no relationship about the form of B and C, let B have r nonzero rows and denote the pivot columns as $D = \{d_1, d_2, d_3, \ldots, d_r\}$. For C let $r'$ denote the number of nonzero rows and denote the pivot columns as $D' = \{d_1', d_2', d_3', \ldots, d_{r'}'\}$ (Definition RREF). There are four steps in the proof, and the first three are about showing that B and C have the same number of pivot columns, in the same places. In other words, the "primed" symbols are a necessary fiction.

First Step. Suppose that $d_1 < d_1'$. Then every entry of column $d_1$ of C is zero, since the leading 1 of each nonzero row of C lies in column $d_1'$ or beyond. But then
\begin{align*}
1 &= [B]_{1 d_1} && \text{Definition RREF}\\
&= \sum_{k=1}^{m} \delta_{1k} [C]_{k d_1}\\
&= \sum_{k=1}^{m} \delta_{1k} (0) && d_1 < d_1'\\
&= 0
\end{align*}
a contradiction. Reversing the roles of B and C, an entirely similar argument rules out $d_1' < d_1$, so $d_1 = d_1'$.

Second Step. Suppose that we have determined that $d_1 = d_1'$, $d_2 = d_2'$, \ldots, $d_p = d_p'$. Working towards a contradiction, suppose that $d_{p+1} < d_{p+1}'$. For $1 \le \ell \le p$,
\begin{align*}
0 &= [B]_{p+1, d_\ell} && \text{Definition RREF}\\
&= \sum_{k=1}^{m} \delta_{p+1,k} [C]_{k d_\ell'}\\
&= \delta_{p+1,\ell} [C]_{\ell d_\ell'} + \sum_{\substack{k=1\\ k \ne \ell}}^{m} \delta_{p+1,k} [C]_{k d_\ell'} && \text{Property CACN}\\
&= \delta_{p+1,\ell}(1) + \sum_{\substack{k=1\\ k \ne \ell}}^{m} \delta_{p+1,k}(0) && \text{Definition RREF}\\
&= \delta_{p+1,\ell}
\end{align*}
Now,
\begin{align*}
1 &= [B]_{p+1, d_{p+1}} && \text{Definition RREF}\\
&= \sum_{k=1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}}\\
&= \sum_{k=1}^{p} \delta_{p+1,k} [C]_{k d_{p+1}} + \sum_{k=p+1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}} && \text{Property AACN}\\
&= \sum_{k=1}^{p} (0) [C]_{k d_{p+1}} + \sum_{k=p+1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}}\\
&= \sum_{k=p+1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}}\\
&= \sum_{k=p+1}^{m} \delta_{p+1,k} (0) && d_{p+1} < d_{p+1}'\\
&= 0
\end{align*}
This contradiction shows that $d_{p+1} < d_{p+1}'$ is impossible. Reversing the roles of B and C, an entirely similar argument shows $d_{p+1}' < d_{p+1}$ is impossible as well, so $d_{p+1} = d_{p+1}'$.

Third Step. Now we establish that $r = r'$. Suppose that $r' < r$. By the arguments above, we know that $d_1 = d_1'$, $d_2 = d_2'$, \ldots, $d_{r'} = d_{r'}'$. For $1 \le \ell \le r' < r$,
\begin{align*}
0 &= [B]_{r d_\ell} && \text{Definition RREF}\\
&= \sum_{k=1}^{m} \delta_{rk} [C]_{k d_\ell}\\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d_\ell} + \sum_{k=r'+1}^{m} \delta_{rk} [C]_{k d_\ell} && \text{Property AACN}\\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d_\ell} + \sum_{k=r'+1}^{m} \delta_{rk}(0) && \text{Definition RREF}\\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d_\ell'}\\
&= \delta_{r\ell} [C]_{\ell d_\ell'} + \sum_{\substack{k=1\\ k \ne \ell}}^{r'} \delta_{rk} [C]_{k d_\ell'} && \text{Property CACN}\\
&= \delta_{r\ell}(1) + \sum_{\substack{k=1\\ k \ne \ell}}^{r'} \delta_{rk}(0) && \text{Definition RREF}\\
&= \delta_{r\ell}
\end{align*}
Now examine the entries of row r of B,
\begin{align*}
[B]_{rj} &= \sum_{k=1}^{m} \delta_{rk} [C]_{kj}\\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{kj} + \sum_{k=r'+1}^{m} \delta_{rk} [C]_{kj} && \text{Property CACN}\\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{kj} + \sum_{k=r'+1}^{m} \delta_{rk}(0) && \text{Definition RREF}\\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{kj}\\
&= \sum_{k=1}^{r'} (0) [C]_{kj}\\
&= 0
\end{align*}
So row r is a totally zero row, contradicting that this should be the bottommost nonzero row of B. So $r' \ge r$. By an entirely similar argument, reversing the roles of B and C, we would conclude that $r' \le r$ and therefore $r = r'$. Thus, combining the first three steps we can say that $D = D'$. In other words, B and C have the same pivot columns, in the same locations.

Fourth Step. In this final step, we will not argue by contradiction. Our intent is to determine the values of the $\delta_{ij}$. Notice that we can use the values of the $d_i$ interchangeably for B and C. Here we go,
\begin{align*}
1 &= [B]_{i d_i} && \text{Definition RREF}\\
&= \sum_{k=1}^{m} \delta_{ik} [C]_{k d_i}\\
&= \delta_{ii} [C]_{i d_i} + \sum_{\substack{k=1\\ k \ne i}}^{m} \delta_{ik} [C]_{k d_i} && \text{Property CACN}\\
&= \delta_{ii}(1) + \sum_{\substack{k=1\\ k \ne i}}^{m} \delta_{ik}(0) && \text{Definition RREF}\\
&= \delta_{ii}
\end{align*}
and for $\ell \ne i$
\begin{align*}
0 &= [B]_{i d_\ell} && \text{Definition RREF}\\
&= \sum_{k=1}^{m} \delta_{ik} [C]_{k d_\ell}\\
&= \delta_{i\ell} [C]_{\ell d_\ell} + \sum_{\substack{k=1\\ k \ne \ell}}^{m} \delta_{ik} [C]_{k d_\ell} && \text{Property CACN}\\
&= \delta_{i\ell}(1) + \sum_{\substack{k=1\\ k \ne \ell}}^{m} \delta_{ik}(0) && \text{Definition RREF}\\
&= \delta_{i\ell}
\end{align*}
Finally, having determined the values of the $\delta_{ij}$, we can show that B = C. For $1 \le i \le m$, $1 \le j \le n$,
\begin{align*}
[B]_{ij} &= \sum_{k=1}^{m} \delta_{ik} [C]_{kj}\\
&= \delta_{ii} [C]_{ij} + \sum_{\substack{k=1\\ k \ne i}}^{m} \delta_{ik} [C]_{kj} && \text{Property CACN}\\
&= (1)[C]_{ij} + \sum_{\substack{k=1\\ k \ne i}}^{m} (0)[C]_{kj}\\
&= [C]_{ij}
\end{align*}
So B and C have equal values in every entry, and so are the same matrix. ∎

We will now run through some examples of using these definitions and theorems to solve some systems of equations. From now on, when we have a matrix in reduced row-echelon form, we will mark the leading 1's with a small box. This will help you count, and identify, the pivot columns. In your work, you can box 'em, circle 'em or write 'em in a different color; just identify 'em somehow. This device will prove very useful later and is a very good habit to start developing right now.

Example SAB Solutions for Archetype B
Let us find the solutions to the following system of equations,
\begin{align*}
-7x_1 - 6x_2 - 12x_3 &= -33\\
5x_1 + 5x_2 + 7x_3 &= 24\\
x_1 + 4x_3 &= 5
\end{align*}
First, form the augmented matrix,
\[\begin{bmatrix} -7 & -6 & -12 & -33\\ 5 & 5 & 7 & 24\\ 1 & 0 & 4 & 5 \end{bmatrix}\]
and work to reduced row-echelon form, first with $j = 1$,
\[\xrightarrow{R_1 \leftrightarrow R_3} \begin{bmatrix} 1 & 0 & 4 & 5\\ 5 & 5 & 7 & 24\\ -7 & -6 & -12 & -33 \end{bmatrix} \xrightarrow{-5R_1 + R_2} \begin{bmatrix} 1 & 0 & 4 & 5\\ 0 & 5 & -13 & -1\\ -7 & -6 & -12 & -33 \end{bmatrix}\]
\[\xrightarrow{7R_1 + R_3} \begin{bmatrix} 1 & 0 & 4 & 5\\ 0 & 5 & -13 & -1\\ 0 & -6 & 16 & 2 \end{bmatrix}\]
Now, with $j = 2$,
\[\xrightarrow{\frac{1}{5}R_2} \begin{bmatrix} 1 & 0 & 4 & 5\\ 0 & 1 & -\frac{13}{5} & -\frac{1}{5}\\ 0 & -6 & 16 & 2 \end{bmatrix} \xrightarrow{6R_2 + R_3} \begin{bmatrix} 1 & 0 & 4 & 5\\ 0 & 1 & -\frac{13}{5} & -\frac{1}{5}\\ 0 & 0 & \frac{2}{5} & \frac{4}{5} \end{bmatrix}\]
And finally, with $j = 3$,
\[\xrightarrow{\frac{5}{2}R_3} \begin{bmatrix} 1 & 0 & 4 & 5\\ 0 & 1 & -\frac{13}{5} & -\frac{1}{5}\\ 0 & 0 & 1 & 2 \end{bmatrix} \xrightarrow{\frac{13}{5}R_3 + R_2} \begin{bmatrix} 1 & 0 & 4 & 5\\ 0 & 1 & 0 & 5\\ 0 & 0 & 1 & 2 \end{bmatrix} \xrightarrow{-4R_3 + R_1} \begin{bmatrix} 1 & 0 & 0 & -3\\ 0 & 1 & 0 & 5\\ 0 & 0 & 1 & 2 \end{bmatrix}\]
This is now the augmented matrix of a very simple system of equations, namely $x_1 = -3$, $x_2 = 5$, $x_3 = 2$, which has an obvious solution. Furthermore, we can see that this is the only solution to this system, so we have determined the entire solution set,
\[ S = \left\{ \begin{bmatrix} -3\\ 5\\ 2 \end{bmatrix} \right\} \]
You might compare this example with the procedure we used in Example US. △
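If you typed in the Python rref sketch that followed the pseudo-code earlier in this section, you can use it to confirm the computation in this example (the code and its output format are our illustrative choices, not part of the text):

# Augmented matrix of Archetype B, as in Example SAB
A = [[-7, -6, -12, -33],
     [ 5,  5,   7,  24],
     [ 1,  0,   4,   5]]

B, r = rref(A)
print(r)                             # 3
for row in B:
    print([str(e) for e in row])
# ['1', '0', '0', '-3']
# ['0', '1', '0', '5']
# ['0', '0', '1', '2']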

Archetypes A and B are meant to contrast each other in many respects. So let us solve Archetype A now.

Example SAA Solutions for Archetype A
Let us find the solutions to the following system of equations,
\begin{align*}
x_1 - x_2 + 2x_3 &= 1\\
2x_1 + x_2 + x_3 &= 8\\
x_1 + x_2 &= 5
\end{align*}
First, form the augmented matrix,
\[\begin{bmatrix} 1 & -1 & 2 & 1\\ 2 & 1 & 1 & 8\\ 1 & 1 & 0 & 5 \end{bmatrix}\]
and work to reduced row-echelon form, first with $j = 1$,
\[\xrightarrow{-2R_1 + R_2} \begin{bmatrix} 1 & -1 & 2 & 1\\ 0 & 3 & -3 & 6\\ 1 & 1 & 0 & 5 \end{bmatrix} \xrightarrow{-1R_1 + R_3} \begin{bmatrix} 1 & -1 & 2 & 1\\ 0 & 3 & -3 & 6\\ 0 & 2 & -2 & 4 \end{bmatrix}\]
Now, with $j = 2$,
\[\xrightarrow{\frac{1}{3}R_2} \begin{bmatrix} 1 & -1 & 2 & 1\\ 0 & 1 & -1 & 2\\ 0 & 2 & -2 & 4 \end{bmatrix} \xrightarrow{1R_2 + R_1} \begin{bmatrix} 1 & 0 & 1 & 3\\ 0 & 1 & -1 & 2\\ 0 & 2 & -2 & 4 \end{bmatrix} \xrightarrow{-2R_2 + R_3} \begin{bmatrix} 1 & 0 & 1 & 3\\ 0 & 1 & -1 & 2\\ 0 & 0 & 0 & 0 \end{bmatrix}\]
The system of equations represented by this augmented matrix needs to be considered a bit differently than that for Archetype B. First, the last row of the matrix is the equation 0 = 0, which is always true, so it imposes no restrictions on our possible solutions and therefore we can safely ignore it as we analyze the other two equations. These equations are,
\begin{align*}
x_1 + x_3 &= 3\\
x_2 - x_3 &= 2.
\end{align*}
While this system is fairly easy to solve, it also appears to have a multitude of solutions. For example, choose $x_3 = 1$ and see that then $x_1 = 2$ and $x_2 = 3$ will together form a solution. Or choose $x_3 = 0$, and then discover that $x_1 = 3$ and $x_2 = 2$ lead to a solution. Try it yourself: pick any value of $x_3$ you please, and figure out what $x_1$ and $x_2$ should be to make the first and second equations (respectively) true. We'll wait while you do that. Because of this behavior, we say that $x_3$ is a "free" or "independent" variable. But why do we vary $x_3$ and not some other variable? For now, notice that the third column of the augmented matrix is not a pivot column. With this idea, we can rearrange the two equations, solving each for the variable whose index is the same as the column index of a pivot column.
\begin{align*}
x_1 &= 3 - x_3\\
x_2 &= 2 + x_3
\end{align*}
To write the set of solution vectors in set notation, we have
\[ S = \left\{ \begin{bmatrix} 3 - x_3\\ 2 + x_3\\ x_3 \end{bmatrix} \,\middle|\, x_3 \in \mathbb{C} \right\} \]
We will learn more in the next section about systems with infinitely many solutions and how to express their solution sets. Right now, you might look back at Example IS. △

Example SAE Solutions for Archetype E
Let us find the solutions to the following system of equations,
\begin{align*}
2x_1 + x_2 + 7x_3 - 7x_4 &= 2\\
-3x_1 + 4x_2 - 5x_3 - 6x_4 &= 3\\
x_1 + x_2 + 4x_3 - 5x_4 &= 2
\end{align*}
First, form the augmented matrix,
\[\begin{bmatrix} 2 & 1 & 7 & -7 & 2\\ -3 & 4 & -5 & -6 & 3\\ 1 & 1 & 4 & -5 & 2 \end{bmatrix}\]
and work to reduced row-echelon form, first with $j = 1$,
\[\xrightarrow{R_1 \leftrightarrow R_3} \begin{bmatrix} 1 & 1 & 4 & -5 & 2\\ -3 & 4 & -5 & -6 & 3\\ 2 & 1 & 7 & -7 & 2 \end{bmatrix} \xrightarrow{3R_1 + R_2} \begin{bmatrix} 1 & 1 & 4 & -5 & 2\\ 0 & 7 & 7 & -21 & 9\\ 2 & 1 & 7 & -7 & 2 \end{bmatrix}\]
\[\xrightarrow{-2R_1 + R_3} \begin{bmatrix} 1 & 1 & 4 & -5 & 2\\ 0 & 7 & 7 & -21 & 9\\ 0 & -1 & -1 & 3 & -2 \end{bmatrix}\]
Now, with $j = 2$,
\[\xrightarrow{R_2 \leftrightarrow R_3} \begin{bmatrix} 1 & 1 & 4 & -5 & 2\\ 0 & -1 & -1 & 3 & -2\\ 0 & 7 & 7 & -21 & 9 \end{bmatrix} \xrightarrow{-1R_2} \begin{bmatrix} 1 & 1 & 4 & -5 & 2\\ 0 & 1 & 1 & -3 & 2\\ 0 & 7 & 7 & -21 & 9 \end{bmatrix}\]
\[\xrightarrow{-1R_2 + R_1} \begin{bmatrix} 1 & 0 & 3 & -2 & 0\\ 0 & 1 & 1 & -3 & 2\\ 0 & 7 & 7 & -21 & 9 \end{bmatrix} \xrightarrow{-7R_2 + R_3} \begin{bmatrix} 1 & 0 & 3 & -2 & 0\\ 0 & 1 & 1 & -3 & 2\\ 0 & 0 & 0 & 0 & -5 \end{bmatrix}\]
And finally, with $j = 5$,
\[\xrightarrow{-\frac{1}{5}R_3} \begin{bmatrix} 1 & 0 & 3 & -2 & 0\\ 0 & 1 & 1 & -3 & 2\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \xrightarrow{-2R_3 + R_2} \begin{bmatrix} 1 & 0 & 3 & -2 & 0\\ 0 & 1 & 1 & -3 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}\]
Let us analyze the equations in the system represented by this augmented matrix. The third equation will read 0 = 1. This is patently false, all the time. No choice of values for our variables will ever make it true. We are done. Since we cannot even make the last equation true, we have no hope of making all of the equations simultaneously true. So this system has no solutions, and its solution set is the empty set, $\emptyset = \{\ \}$ (Definition ES).

Notice that we could have reached this conclusion sooner. After performing the row operation $-7R_2 + R_3$, we can see that the third equation reads 0 = −5, a false statement. Since the system represented by this matrix has no solutions, none of the systems represented has any solutions. However, for this example, we have chosen to bring the matrix all the way to reduced row-echelon form as practice. △

These three examples (Example SAB, Example SAA, Example SAE) illustrate the full range of possibilities for a system of linear equations: no solutions, one solution, or infinitely many solutions. In the next section we will examine these three scenarios more closely.

We (and everybody else) will often speak of "row-reducing" a matrix. This is an informal way of saying we begin with a matrix A and then analyze the matrix B that is row-equivalent to A and in reduced row-echelon form. So the term row-reduce is used as a verb, but describes something a bit more complicated, since we do not really change A. Theorem REMEF tells us that this process will always be successful and Theorem RREFU tells us that B will be unambiguous. Typically, an investigation of A will proceed by analyzing B and applying theorems whose hypotheses include the row-equivalence of A and B, and usually the hypothesis that B is in reduced row-echelon form.

Reading Questions

1. Is the matrix below in reduced row-echelon form? Why or why not?
\[\begin{bmatrix} 1 & 5 & 0 & 6 & 8\\ 0 & 0 & 1 & 2 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}\]

2. Use row operations to convert the matrix below to reduced row-echelon form and report the final matrix.
\[\begin{bmatrix} 2 & 1 & 8\\ -1 & 1 & -1\\ -2 & 5 & 4 \end{bmatrix}\]

3. Find all the solutions to the system below by using an augmented matrix and row operations. Report your final matrix in reduced row-echelon form and the set of solutions.
\begin{align*}
2x_1 + 3x_2 - x_3 &= 0\\
x_1 + 2x_2 + x_3 &= 3\\
x_1 + 3x_2 + 3x_3 &= 7
\end{align*}

Exercises

C05 Each archetype below is a system of equations. Form the augmented matrix of the system of equations, convert the matrix to reduced row-echelon form by using equation operations and then describe the solution set of the original system of equations.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J

For problems C10–C19, find all solutions to the system of linear equations. Use your favorite computing device to row-reduce the augmented matrices for the systems, and write the solutions as a set, using correct set notation.

C10†
\begin{align*} 2x_1 - 3x_2 + x_3 + 7x_4 &= 14\\ 2x_1 + 8x_2 - 4x_3 + 5x_4 &= -1\\ x_1 + 3x_2 - 3x_3 &= 4\\ -5x_1 + 2x_2 + 3x_3 + 4x_4 &= -19 \end{align*}

C11†
\begin{align*} 3x_1 + 4x_2 - x_3 + 2x_4 &= 6\\ x_1 - 2x_2 + 3x_3 + x_4 &= 2\\ 10x_2 - 10x_3 - x_4 &= 1 \end{align*}

C12†
\begin{align*} 2x_1 + 4x_2 + 5x_3 + 7x_4 &= -26\\ x_1 + 2x_2 + x_3 - x_4 &= -4\\ -2x_1 - 4x_2 + x_3 + 11x_4 &= -10 \end{align*}

C13†
\begin{align*} x_1 + 2x_2 + 8x_3 - 7x_4 &= -2\\ 3x_1 + 2x_2 + 12x_3 - 5x_4 &= 6\\ -x_1 + x_2 + x_3 - 5x_4 &= -10 \end{align*}

C14†
\begin{align*} 2x_1 + x_2 + 7x_3 - 2x_4 &= 4\\ 3x_1 - 2x_2 + 11x_4 &= 13\\ x_1 + x_2 + 5x_3 - 3x_4 &= 1 \end{align*}

C15†
\begin{align*} 2x_1 + 3x_2 - x_3 - 9x_4 &= -16\\ x_1 + 2x_2 + x_3 &= 0\\ -x_1 + 2x_2 + 3x_3 + 4x_4 &= 8 \end{align*}

C16†
\begin{align*} 2x_1 + 3x_2 + 19x_3 - 4x_4 &= 2\\ x_1 + 2x_2 + 12x_3 - 3x_4 &= 1\\ -x_1 + 2x_2 + 8x_3 - 5x_4 &= 1 \end{align*}

C17†
\begin{align*} -x_1 + 5x_2 &= -8\\ -2x_1 + 5x_2 + 5x_3 + 2x_4 &= 9\\ -3x_1 - x_2 + 3x_3 + x_4 &= 3\\ 7x_1 + 6x_2 + 5x_3 + x_4 &= 30 \end{align*}

C18†
\begin{align*} x_1 + 2x_2 - 4x_3 - x_4 &= 32\\ x_1 + 3x_2 - 7x_3 - x_5 &= 33\\ x_1 + 2x_3 - 2x_4 + 3x_5 &= 22 \end{align*}

C19†
\begin{align*} 2x_1 + x_2 &= 6\\ -x_1 - x_2 &= -2\\ 3x_1 + 4x_2 &= 4\\ 3x_1 + 5x_2 &= 2 \end{align*}

For problems C30–C33, row-reduce the matrix without the aid of a calculator, indicating the row operations you are using at each step using the notation of Definition RO.

C30†
\[\begin{bmatrix} 2 & 1 & 5 & 10\\ 1 & -3 & -1 & -2\\ 4 & -2 & 6 & 12 \end{bmatrix}\]

C31†
\[\begin{bmatrix} 1 & 2 & -4\\ -3 & -1 & -3\\ -2 & 1 & -7 \end{bmatrix}\]

C32†
\[\begin{bmatrix} 1 & 1 & 1\\ -4 & -3 & -2\\ 3 & 2 & 1 \end{bmatrix}\]

C33†
\[\begin{bmatrix} 1 & 2 & -1 & -1\\ 2 & 4 & -1 & 4\\ -1 & -2 & 3 & 5 \end{bmatrix}\]

M40† Consider the two 3 × 4 matrices below
\[ B = \begin{bmatrix} 1 & 3 & -2 & 2\\ -1 & -2 & -1 & -1\\ -1 & -5 & 8 & -3 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 2 & 1 & 2\\ 1 & 1 & 4 & 0\\ -1 & -1 & -4 & 1 \end{bmatrix} \]

1. Row-reduce each matrix and determine that the reduced row-echelon forms of B and C are identical. From this argue that B and C are row-equivalent.

2. In the proof of Theorem RREFU, we begin by arguing that entries of row-equivalent matrices are related by way of certain scalars and sums. In this example, we would write that entries of B from row i that are in column j are linearly related to the entries of C in column j from all three rows
\[ [B]_{ij} = \delta_{i1}[C]_{1j} + \delta_{i2}[C]_{2j} + \delta_{i3}[C]_{3j} \qquad 1 \le j \le 4 \]
For each 1 ≤ i ≤ 3 find the corresponding three scalars in this relationship. So your answer will be nine scalars, determined three at a time.

M45† You keep a number of lizards, mice and peacocks as pets. There are a total of 108 legs and 30 tails in your menagerie. You have twice as many mice as lizards. How many of each creature do you have?

M50† A parking lot has 66 vehicles (cars, trucks, motorcycles and bicycles) in it. There are four times as many cars as trucks. The total number of tires (4 per car or truck, 2 per motorcycle or bicycle) is 252. How many cars are there? How many bicycles?

T10† Prove that each of the three row operations (Definition RO) is reversible. More precisely, if the matrix B is obtained from A by application of a single row operation, show that there is a single row operation that will transform B back into A.

T11 Suppose that A, B and C are m × n matrices. Use the definition of row-equivalence (Definition REM) to prove the following three facts.

1. A is row-equivalent to A.

2. If A is row-equivalent to B, then B is row-equivalent to A.

3. If A is row-equivalent to B, and B is row-equivalent to C, then A is row-equivalent to C.

A relationship that satisfies these three properties is known as an equivalence relation, an important idea in the study of various algebras. This is a formal way of saying that a relationship behaves like equality, without requiring the relationship to be as strict as equality itself. We will see it again in Theorem SER.

T12 Suppose that B is an m × n matrix in reduced row-echelon form. Build a new, likely smaller, k × l matrix C as follows. Keep any collection of k adjacent rows, k ≤ m. From these rows, keep columns 1 through l, l ≤ n. Prove that C is in reduced row-echelon form.

T13 Generalize Exercise RREF.T12 by just keeping any k rows, and not requiring the rows to be adjacent. Prove that any such matrix C is in reduced row-echelon form.


Section TSS
Types of Solution Sets

We will now be more careful about analyzing the reduced row-echelon form derived from the augmented matrix of a system of linear equations. In particular, we will see how to systematically handle the situation when we have infinitely many solutions to a system, and we will prove that every system of linear equations has either zero, one or infinitely many solutions. With these tools, we will be able to routinely solve any linear system.

Subsection CS
Consistent Systems

The computer scientist Donald Knuth said, "Science is what we understand well enough to explain to a computer. Art is everything else." In this section we will remove solving systems of equations from the realm of art, and into the realm of science. We begin with a definition.

Definition CS Consistent System
A system of linear equations is consistent if it has at least one solution. Otherwise, the system is called inconsistent. □

We will want to first recognize when a system is inconsistent or consistent, and in the case of consistent systems we will be able to further refine the types of solutions possible. We will do this by analyzing the reduced row-echelon form of a matrix, using the value of r, and the sets of column indices, D and F, first defined back in Definition RREF.

Use of the notation for the elements of D and F can be a bit confusing, since we have subscripted variables that are in turn equal to integers used to index the matrix. However, many questions about matrices and systems of equations can be answered once we know r, D and F. The choice of the letters D and F refer to our upcoming definition of dependent and free variables (Definition IDV). An example will help us begin to get comfortable with this aspect of reduced row-echelon form.

Example RREFN Reduced row-echelon form notation
For the 5 × 9 matrix
\[ B = \begin{bmatrix}
1 & 5 & 0 & 0 & 2 & 8 & 0 & 5 & -1\\
0 & 0 & 1 & 0 & 4 & 7 & 0 & 2 & 0\\
0 & 0 & 0 & 1 & 3 & 9 & 0 & 3 & -6\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 4 & 2\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \]
in reduced row-echelon form we have
\begin{align*}
r &= 4\\
d_1 = 1 \quad d_2 = 3 \quad d_3 &= 4 \quad d_4 = 7\\
f_1 = 2 \quad f_2 = 5 \quad f_3 = 6 \quad f_4 &= 8 \quad f_5 = 9
\end{align*}
Notice that the sets
\[ D = \{d_1, d_2, d_3, d_4\} = \{1, 3, 4, 7\} \qquad F = \{f_1, f_2, f_3, f_4, f_5\} = \{2, 5, 6, 8, 9\} \]
have nothing in common and together account for all of the columns of B (we say it is a partition of the set of column indices). △
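The bookkeeping in this example is easy to automate. The short sketch below is our own illustration, not part of the text (the helper name pivot_data is hypothetical); it extracts r, D and F from a matrix already in reduced row-echelon form, reporting 1-based column indices to match the text.

def pivot_data(B):
    """Given B in reduced row-echelon form (a list of lists), return
    (r, D, F): the number of nonzero rows, the pivot column indices,
    and the non-pivot column indices (1-based, as in Definition RREF)."""
    n = len(B[0])
    # the leading 1 of each nonzero row sits in a pivot column
    D = [next(j for j, e in enumerate(row) if e != 0) + 1
         for row in B if any(e != 0 for e in row)]
    F = [j for j in range(1, n + 1) if j not in D]
    return len(D), D, F

B = [[1, 5, 0, 0, 2, 8, 0, 5, -1],
     [0, 0, 1, 0, 4, 7, 0, 2,  0],
     [0, 0, 0, 1, 3, 9, 0, 3, -6],
     [0, 0, 0, 0, 0, 0, 1, 4,  2],
     [0, 0, 0, 0, 0, 0, 0, 0,  0]]
print(pivot_data(B))   # (4, [1, 3, 4, 7], [2, 5, 6, 8, 9])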

The number r is the single most important piece of information we can get from the reduced row-echelon form of a matrix. It is defined as the number of nonzero rows, but since each nonzero row has a leading 1, it is also the number of leading 1's present. For each leading 1, we have a pivot column, so r is also the number of pivot columns. Repeating ourselves, r is the number of nonzero rows, the number of leading 1's and the number of pivot columns. Across different situations, each of these interpretations of the meaning of r will be useful, though it may be most helpful to think in terms of pivot columns.

Before proving some theorems about the possibilities for solution sets to systems of equations, let us analyze one particular system with an infinite solution set very carefully as an example. We will use this technique frequently, and shortly we will refine it slightly.

Archetypes I and J are both fairly large for doing computations by hand (though not impossibly large). Their properties are very similar, so we will frequently analyze the situation in Archetype I, and leave you the joy of analyzing Archetype J yourself. So work through Archetype I with the text, by hand and/or with a computer, and then tackle Archetype J yourself (and check your results with those listed). Notice too that the archetypes describing systems of equations each lists the values of r, D and F. Here we go...

Example ISSI Describing infinite solution sets, Archetype I
Archetype I is the system of m = 4 equations in n = 7 variables.
\begin{align*}
x_1 + 4x_2 - x_4 + 7x_6 - 9x_7 &= 3\\
2x_1 + 8x_2 - x_3 + 3x_4 + 9x_5 - 13x_6 + 7x_7 &= 9\\
2x_3 - 3x_4 - 4x_5 + 12x_6 - 8x_7 &= 1\\
-x_1 - 4x_2 + 2x_3 + 4x_4 + 8x_5 - 31x_6 + 37x_7 &= 4
\end{align*}
This system has a 4 × 8 augmented matrix that is row-equivalent to the following matrix (check this!), and which is in reduced row-echelon form (the existence of this matrix is guaranteed by Theorem REMEF and its uniqueness is guaranteed by Theorem RREFU),
\[\begin{bmatrix}
1 & 4 & 0 & 0 & 2 & 1 & -3 & 4\\
0 & 0 & 1 & 0 & 1 & -3 & 5 & 2\\
0 & 0 & 0 & 1 & 2 & -6 & 6 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}\]
So we find that r = 3 and
\[ D = \{d_1, d_2, d_3\} = \{1, 3, 4\} \qquad F = \{f_1, f_2, f_3, f_4, f_5\} = \{2, 5, 6, 7, 8\} \]
Let i denote any one of the r = 3 nonzero rows. Then the index $d_i$ is a pivot column. It will be easy in this case to use the equation represented by row i to write an expression for the variable $x_{d_i}$. It will be a linear function of the variables $x_{f_1}, x_{f_2}, x_{f_3}, x_{f_4}$ (notice that $f_5 = 8$ does not reference a variable, but does tell us the final column is not a pivot column). We will now construct these three expressions. Notice that using subscripts upon subscripts takes some getting used to.
\begin{align*}
(i = 1) \qquad x_{d_1} = x_1 &= 4 - 4x_2 - 2x_5 - x_6 + 3x_7\\
(i = 2) \qquad x_{d_2} = x_3 &= 2 - x_5 + 3x_6 - 5x_7\\
(i = 3) \qquad x_{d_3} = x_4 &= 1 - 2x_5 + 6x_6 - 6x_7
\end{align*}
Each element of the set $F = \{f_1, f_2, f_3, f_4, f_5\} = \{2, 5, 6, 7, 8\}$ is the index of a variable, except for $f_5 = 8$. We refer to $x_{f_1} = x_2$, $x_{f_2} = x_5$, $x_{f_3} = x_6$ and $x_{f_4} = x_7$ as "free" (or "independent") variables since they are allowed to assume any possible combination of values that we can imagine and we can continue on to build a solution to the system by solving individual equations for the values of the other ("dependent") variables.

Each element of the set $D = \{d_1, d_2, d_3\} = \{1, 3, 4\}$ is the index of a variable. We refer to the variables $x_{d_1} = x_1$, $x_{d_2} = x_3$ and $x_{d_3} = x_4$ as "dependent" variables since they depend on the independent variables. More precisely, for each possible choice of values for the independent variables we get exactly one set of values for the dependent variables that combine to form a solution of the system.

To express the solutions as a set, we write
\[\left\{ \begin{bmatrix}
4 - 4x_2 - 2x_5 - x_6 + 3x_7\\
x_2\\
2 - x_5 + 3x_6 - 5x_7\\
1 - 2x_5 + 6x_6 - 6x_7\\
x_5\\
x_6\\
x_7
\end{bmatrix} \,\middle|\, x_2, x_5, x_6, x_7 \in \mathbb{C} \right\}\]
The condition that $x_2, x_5, x_6, x_7 \in \mathbb{C}$ is how we specify that the variables $x_2, x_5, x_6, x_7$ are "free" to assume any possible values.

This systematic approach to solving a system of equations will allow us to create a precise description of the solution set for any consistent system once we have found the reduced row-echelon form of the augmented matrix. It will work just as well when the set of free variables is empty and we get just a single solution. And we could program a computer to do it! Now have a whack at Archetype J (Exercise TSS.C10), mimicking the discussion in this example. We'll still be here when you get back. △
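The parametric description in Example ISSI can itself be turned into a tiny program, which makes vivid the sense in which the free variables are free. The sketch below is our illustration, not part of the text (the function name is hypothetical); it produces one solution of Archetype I for each choice of the free variables $x_2, x_5, x_6, x_7$, using the three dependent-variable expressions derived above. Each output can be checked by substitution into the original four equations.

def archetype_i_solution(x2, x5, x6, x7):
    """One solution of Archetype I for each choice of the free
    variables, using the expressions for the dependent variables
    x1, x3, x4 from Example ISSI."""
    x1 = 4 - 4*x2 - 2*x5 - x6 + 3*x7
    x3 = 2 - x5 + 3*x6 - 5*x7
    x4 = 1 - 2*x5 + 6*x6 - 6*x7
    return [x1, x2, x3, x4, x5, x6, x7]

print(archetype_i_solution(0, 0, 0, 0))     # [4, 0, 2, 1, 0, 0, 0]
print(archetype_i_solution(1, -1, 2, 3))    # [9, 1, -6, -3, -1, 2, 3]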

Using the reduced row-echelon form of the augmented matrix of a system of equations to determine the nature of the solution set of the system is a very key idea. So let us look at one more example like the last one. But first a definition, and then the example. We mix our metaphors a bit when we call variables free versus dependent. Maybe we should call dependent variables "enslaved"?

Definition IDV Independent and Dependent Variables
Suppose A is the augmented matrix of a consistent system of linear equations and B is a row-equivalent matrix in reduced row-echelon form. Suppose j is the index of a pivot column of B. Then the variable $x_j$ is dependent. A variable that is not dependent is called independent or free. □

If you studied this definition carefully, you might wonder what to do if the system has n variables and column n + 1 is a pivot column? We will see shortly, by Theorem RCLS, that this never happens for a consistent system.

Example FDV Free and dependent variables
Consider the system of five equations in five variables,
\begin{align*}
x_1 - x_2 - 2x_3 + x_4 + 11x_5 &= 13\\
x_1 - x_2 + x_3 + x_4 + 5x_5 &= 16\\
2x_1 - 2x_2 + x_4 + 10x_5 &= 21\\
2x_1 - 2x_2 - x_3 + 3x_4 + 20x_5 &= 38\\
2x_1 - 2x_2 + x_3 + x_4 + 8x_5 &= 22
\end{align*}
whose augmented matrix row-reduces to
\[\begin{bmatrix}
1 & -1 & 0 & 0 & 3 & 6\\
0 & 0 & 1 & 0 & -2 & 1\\
0 & 0 & 0 & 1 & 4 & 9\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}\]
Columns 1, 3 and 4 are pivot columns, so $D = \{1, 3, 4\}$. From this we know that the variables $x_1$, $x_3$ and $x_4$ will be dependent variables, and each of the r = 3 nonzero rows of the row-reduced matrix will yield an expression for one of these three variables. The set F is all the remaining column indices, $F = \{2, 5, 6\}$. The column index 6 in F means that the final column is not a pivot column, and thus the system is consistent (Theorem RCLS). The remaining indices in F indicate free variables, so $x_2$ and $x_5$ (the remaining variables) are our free variables. The resulting three equations that describe our solution set are then,
\begin{align*}
(x_{d_1} = x_1) \qquad x_1 &= 6 + x_2 - 3x_5\\
(x_{d_2} = x_3) \qquad x_3 &= 1 + 2x_5\\
(x_{d_3} = x_4) \qquad x_4 &= 9 - 4x_5
\end{align*}
Make sure you understand where these three equations came from, and notice how the location of the pivot columns determined the variables on the left-hand side of each equation. We can compactly describe the solution set as,
\[ S = \left\{ \begin{bmatrix}
6 + x_2 - 3x_5\\
x_2\\
1 + 2x_5\\
9 - 4x_5\\
x_5
\end{bmatrix} \,\middle|\, x_2, x_5 \in \mathbb{C} \right\} \]
Notice how we express the freedom for $x_2$ and $x_5$: $x_2, x_5 \in \mathbb{C}$. △

Sets are an important part of algebra, and we have seen a few already. Being comfortable with sets is important for understanding and writing proofs. If you have not already, pay a visit now to Section SET.

We can now use the values of m, n, r, and the independent and dependent variables to categorize the solution sets for linear systems through a sequence of theorems.

Through the following sequence of proofs, you will want to consult three proof techniques. See Proof Technique E, Proof Technique N, Proof Technique CP.

First we have an important theorem that explores the distinction between consistent and inconsistent linear systems.

Theorem RCLS Recognizing Consistency of a Linear System
Suppose A is the augmented matrix of a system of linear equations with n variables. Suppose also that B is a row-equivalent matrix in reduced row-echelon form with r nonzero rows. Then the system of equations is inconsistent if and only if column n + 1 of B is a pivot column.

Proof. (⇐) The first half of the proof begins with the assumption that column n + 1 of B is a pivot column. Then the leading 1 of row r is located in column n + 1 of B and so row r of B begins with n consecutive zeros, finishing with the leading 1. This is a representation of the equation 0 = 1, which is false. Since this equation is false for any collection of values we might choose for the variables, there are no solutions for the system of equations, and the system is inconsistent.

(⇒) For the second half of the proof, we wish to show that if we assume the system is inconsistent, then column n + 1 of B is a pivot column. But instead of proving this directly, we will form the logically equivalent statement that is the contrapositive, and prove that instead (see Proof Technique CP). Turning the implication around, and negating each portion, we arrive at the logically equivalent statement: if column n + 1 of B is not a pivot column, then the system of equations is consistent.

If column n + 1 of B is not a pivot column, the leading 1 for row r is located somewhere in columns 1 through n. Then every preceding row's leading 1 is also located in columns 1 through n. In other words, since the last leading 1 is not in the last column, no leading 1 for any row is in the last column, due to the echelon layout of the leading 1's (Definition RREF). We will now construct a solution to the system by setting each dependent variable to the entry of the final column in the row with the corresponding leading 1, and setting each free variable to zero. That sentence is pretty vague, so let us be more precise. Using our notation for the sets D and F from the reduced row-echelon form (Definition RREF):
\[ x_{d_i} = [B]_{i, n+1}, \quad 1 \le i \le r \qquad\qquad x_{f_i} = 0, \quad 1 \le i \le n - r \]
These values for the variables make the equations represented by the first r rows of B all true (convince yourself of this). Rows numbered greater than r (if any) are all zero rows, hence represent the equation 0 = 0 and are also all true. We have now identified one solution to the system represented by B, and hence a solution to the system represented by A (Theorem REMES). So we can say the system is consistent (Definition CS). ∎

The beauty of this theorem being an equivalence is that we can unequivocally test to see if a system is consistent or inconsistent by looking at just a single entry of the reduced row-echelon form matrix. We could program a computer to do it!
of the reduced row-echelon form matrix. We could program a computer to do it!<br />

Notice that for a consistent system the row-reduced augmented matrix has<br />

n +1∈ F , so the largest element of F does not refer to a variable. Also, for an<br />

<strong>in</strong>consistent system, n +1∈ D, and it then does not make much sense to discuss<br />

whether or not variables are free or dependent s<strong>in</strong>ce there is no solution. Take a look<br />

back at Def<strong>in</strong>ition IDV and see why we did not need to consider the possibility of<br />

referenc<strong>in</strong>g x n+1 as a dependent variable.<br />

With the characterization of Theorem RCLS, we can explore the relationships<br />

between r and n for a consistent system. We can dist<strong>in</strong>guish between the case of a<br />

unique solution and <strong>in</strong>f<strong>in</strong>itely many solutions, and furthermore, we recognize that<br />

these are the only two possibilities.<br />

Theorem CSRN Consistent Systems, r and n<br />

Suppose A is the augmented matrix of a consistent system of l<strong>in</strong>ear equations with<br />

n variables. Suppose also that B is a row-equivalent matrix <strong>in</strong> reduced row-echelon<br />

form with r pivot columns. Then r ≤ n. Ifr = n, then the system has a unique<br />

solution, and if r


§TSS Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 51<br />

column is a pivot column. So Theorem RCLS tells us that the system is <strong>in</strong>consistent,<br />

contrary to our hypothesis. We are left with r ≤ n.<br />

When r = n, we f<strong>in</strong>d n − r = 0 free variables (i.e. F = {n +1}) andtheonly<br />

solution is given by sett<strong>in</strong>g the n variables to the the first n entries of column n +1<br />

of B.<br />

When r0 free variables. Choose one free variable and set<br />

all the other free variables to zero. Now, set the chosen free variable to any fixed<br />

value. It is possible to then determ<strong>in</strong>e the values of the dependent variables to create<br />

a solution to the system. By sett<strong>in</strong>g the chosen free variable to different values, <strong>in</strong><br />

this manner we can create <strong>in</strong>f<strong>in</strong>itely many solutions.<br />

<br />

Subsection FV
Free Variables

The next theorem simply states a conclusion from the final paragraph of the previous proof, allowing us to state explicitly the number of free variables for a consistent system.

Theorem FVCS Free Variables for Consistent Systems
Suppose A is the augmented matrix of a consistent system of linear equations with n variables. Suppose also that B is a row-equivalent matrix in reduced row-echelon form with r rows that are not completely zeros. Then the solution set can be described with n − r free variables.

Proof. See the proof of Theorem CSRN. ∎

Example CFV Counting free variables
For each archetype that is a system of equations, the values of n and r are listed. Many also contain a few sample solutions. We can use this information profitably, as illustrated by four examples.

1. Archetype A has n = 3 and r = 2. It can be seen to be consistent by the sample solutions given. Its solution set then has n − r = 1 free variable, and therefore will be infinite.

2. Archetype B has n = 3 and r = 3. It can be seen to be consistent by the single sample solution given. Its solution set can then be described with n − r = 0 free variables, and therefore will have just the single solution.

3. Archetype H has n = 2 and r = 3. In this case, column 3 must be a pivot column, so by Theorem RCLS, the system is inconsistent. We should not try to apply Theorem FVCS to count free variables, since the theorem only applies to consistent systems. (What would happen if you did try to incorrectly apply Theorem FVCS?)

4. Archetype E has n = 4 and r = 3. However, by looking at the reduced row-echelon form of the augmented matrix, we find that column 5 is a pivot column. By Theorem RCLS we recognize the system as inconsistent. △

We have accomplished a lot so far, but our main goal has been the following theorem, which is now very simple to prove. The proof is so simple that we ought to call it a corollary, but the result is important enough that it deserves to be called a theorem. (See Proof Technique LC.) Notice that this theorem was presaged first by Example TTS and further foreshadowed by other examples.

Theorem PSSLS Possible Solution Sets for Linear Systems
A system of linear equations has no solutions, a unique solution or infinitely many solutions.

Proof. By its definition, a system is either inconsistent or consistent (Definition CS). The first case describes systems with no solutions. For consistent systems, we have the remaining two possibilities as guaranteed by, and described in, Theorem CSRN. ∎

Here is a diagram that consolidates several of our theorems from this section, and which is of practical use when you analyze systems of equations. Note this presumes we have the reduced row-echelon form of the augmented matrix of the system to analyze.

[Decision tree: by Theorem RCLS, no pivot column in column n + 1 means the system is consistent, while a pivot column in column n + 1 means it is inconsistent. For a consistent system, Theorem CSRN then splits the possibilities: r < n gives infinitely many solutions, r = n gives a unique solution.]

We have one more theorem to round out our set of tools for determining solution sets to systems of linear equations.

Theorem CMVEI Consistent, More Variables than Equations, Infinite solutions
Suppose a consistent system of linear equations has m equations in n variables. If n > m, then the system has infinitely many solutions.

Proof. Suppose that the augmented matrix of the system of equations is row-equivalent to B, a matrix in reduced row-echelon form with r nonzero rows. Because B has m rows in total, the number of nonzero rows is less than or equal to m. In other words, $r \le m$. Follow this with the hypothesis that $n > m$ and we find that the system has a solution set described by at least one free variable because
\[ n - r \ge n - m > 0. \]
A consistent system with free variables will have an infinite number of solutions, as given by Theorem CSRN. ∎

Notice that to use this theorem we need only know that the system is consistent, together with the values of m and n. We do not necessarily have to compute a row-equivalent reduced row-echelon form matrix, even though we discussed such a matrix in the proof. This is the substance of the following example.

Example OSGMD One solution gives many, Archetype D<br />

Archetype D is the system of m = 3 equations <strong>in</strong> n =4variables,<br />

2x 1 + x 2 +7x 3 − 7x 4 =8<br />

−3x 1 +4x 2 − 5x 3 − 6x 4 = −12<br />

x 1 + x 2 +4x 3 − 5x 4 =4<br />

and the solution x 1 =0,x 2 =1,x 3 =2,x 4 = 1 can be checked easily by<br />

substitution. Hav<strong>in</strong>g been handed this solution, we know the system is consistent.<br />

This, together with n>m, allows us to apply Theorem CMVEI and conclude that<br />

the system has <strong>in</strong>f<strong>in</strong>itely many solutions.<br />

△<br />
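Such a substitution check is purely mechanical; for instance, in plain Python:

```python
# Check the proposed solution of Archetype D by direct substitution.
x1, x2, x3, x4 = 0, 1, 2, 1
print(2*x1 + x2 + 7*x3 - 7*x4)     # 8, matches the first constant
print(-3*x1 + 4*x2 - 5*x3 - 6*x4)  # -12, matches the second
print(x1 + x2 + 4*x3 - 5*x4)       # 4, matches the third
```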

These theorems give us the procedures and implications that allow us to completely solve any system of linear equations. The main computational tool is using row operations to convert an augmented matrix into reduced row-echelon form. Here is a broad outline of how we would instruct a computer to solve a system of linear equations (a sketch of these steps in code follows the list).

1. Represent a system of linear equations in $n$ variables by an augmented matrix (an array is the appropriate data structure in most computer languages).

2. Convert the matrix to a row-equivalent matrix in reduced row-echelon form using the procedure from the proof of Theorem REMEF. Identify the location of the pivot columns, and their number $r$.

3. If column $n + 1$ is a pivot column, output the statement that the system is inconsistent and halt.

4. If column $n + 1$ is not a pivot column, there are two possibilities:

(a) $r = n$ and the solution is unique. It can be read off directly from the entries in rows 1 through $n$ of column $n + 1$.

(b) $r < n$ and the system has infinitely many solutions, with $n - r$ free variables available to describe the solution set.
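A minimal sketch of this outline in Python, using SymPy for exact row reduction (the function below and its return values are our own illustration, not a library routine):

```python
from sympy import Matrix

def classify_and_solve(aug, n):
    """Follow the four-step outline: row-reduce [A | b], then branch.

    aug is the m x (n + 1) augmented matrix; n is the number of variables.
    """
    B, pivots = aug.rref()        # Step 2: RREF and pivot column indices
    r = len(pivots)               # number of pivot columns (nonzero rows)
    if n in pivots:               # Step 3: pivot in column n + 1 (index n)
        return "inconsistent"
    if r == n:                    # Step 4(a): read solution from last column
        return ("unique", [B[i, n] for i in range(n)])
    return ("infinitely many", n - r)   # Step 4(b): n - r free variables

# Archetype D: consistent, n = 4 > 3 = m, so infinitely many solutions.
aug = Matrix([[2, 1, 7, -7, 8], [-3, 4, -5, -6, -12], [1, 1, 4, -5, 4]])
print(classify_and_solve(aug, 4))   # ('infinitely many', 2)
```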


For Exercises C21–C28, find the solution set of the system of linear equations.

C21†
\begin{align*}
x_1 + 4x_2 + 3x_3 - x_4 &= 5\\
x_1 - x_2 + x_3 + 2x_4 &= 6\\
4x_1 + x_2 + 6x_3 + 5x_4 &= 9
\end{align*}

C22†
\begin{align*}
x_1 - 2x_2 + x_3 - x_4 &= 3\\
2x_1 - 4x_2 + x_3 + x_4 &= 2\\
x_1 - 2x_2 - 2x_3 + 3x_4 &= 1
\end{align*}

C23†
\begin{align*}
x_1 - 2x_2 + x_3 - x_4 &= 3\\
x_1 + x_2 + x_3 - x_4 &= 1\\
x_1 + x_3 - x_4 &= 2
\end{align*}

C24†
\begin{align*}
x_1 - 2x_2 + x_3 - x_4 &= 2\\
x_1 + x_2 + x_3 - x_4 &= 2\\
x_1 + x_3 - x_4 &= 2
\end{align*}

C25†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 1\\
2x_1 - x_2 + x_3 &= 2\\
3x_1 + x_2 + x_3 &= 4\\
x_2 + 2x_3 &= 6
\end{align*}

C26†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 1\\
2x_1 - x_2 + x_3 &= 2\\
3x_1 + x_2 + x_3 &= 4\\
5x_2 + 2x_3 &= 1
\end{align*}

C27†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 0\\
2x_1 - x_2 + x_3 &= 2\\
x_1 - 8x_2 - 7x_3 &= 1\\
x_2 + x_3 &= 0
\end{align*}

C28†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 1\\
2x_1 - x_2 + x_3 &= 2\\
x_1 - 8x_2 - 7x_3 &= 1\\
x_2 + x_3 &= 0
\end{align*}

M45†  The details for Archetype J include several sample solutions. Verify that one of these solutions is correct (any one, but just one). Based only on this evidence, and especially without doing any row operations, explain how you know this system of linear equations has infinitely many solutions.

M46  Consider Archetype J, and specifically the row-reduced version of the augmented matrix of the system of equations, denoted as $B$ here, and the values of $r$, $D$ and $F$ immediately following. Determine the values of the entries
$$[B]_{1,d_1} \quad [B]_{3,d_3} \quad [B]_{1,d_3} \quad [B]_{3,d_1} \quad [B]_{d_1,1} \quad [B]_{d_3,3} \quad [B]_{d_1,3} \quad [B]_{d_3,1} \quad [B]_{1,f_1} \quad [B]_{3,f_1}$$
(See Exercise TSS.M70 for a generalization.)

For Exercises M51–M57, say as much as possible about each system's solution set. Be sure to make it clear which theorems you are using to reach your conclusions.

M51†  A consistent system of 8 equations in 6 variables.
M52†  A consistent system of 6 equations in 8 variables.
M53†  A system of 5 equations in 9 variables.
M54†  A system with 12 equations in 35 variables.
M56†  A system with 6 equations in 12 variables.
M57†  A system with 8 equations and 6 variables. The reduced row-echelon form of the augmented matrix of the system has 7 pivot columns.

M60  Without doing any computations, and without examining any solutions, say as much as possible about the form of the solution set for each archetype that is a system of equations.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J

M70  Suppose that $B$ is a matrix in reduced row-echelon form that is equivalent to the augmented matrix of a system of equations with $m$ equations in $n$ variables. Let $r$, $D$ and $F$ be as defined in Definition RREF. What can you conclude, in general, about the following entries?
$$[B]_{1,d_1} \quad [B]_{3,d_3} \quad [B]_{1,d_3} \quad [B]_{3,d_1} \quad [B]_{d_1,1} \quad [B]_{d_3,3} \quad [B]_{d_1,3} \quad [B]_{d_3,1} \quad [B]_{1,f_1} \quad [B]_{3,f_1}$$
If you cannot conclude anything about an entry, then say so. (See Exercise TSS.M46.)

T10†  An inconsistent system may have $r > n$. If we try (incorrectly!) to apply Theorem FVCS to such a system, how many free variables would we discover?



T11†  Suppose $A$ is the augmented matrix of a system of linear equations in $n$ variables, and that $B$ is a row-equivalent matrix in reduced row-echelon form with $r$ pivot columns. If $r = n + 1$, prove that the system of equations is inconsistent.

T20  Suppose that $B$ is a matrix in reduced row-echelon form that is equivalent to the augmented matrix of a system of equations with $m$ equations in $n$ variables. Let $r$, $D$ and $F$ be as defined in Definition RREF. Prove that $d_k \geq k$ for all $1 \leq k \leq r$. Then suppose that $r \geq 2$ and $1 \leq k < \ell \leq r$, and determine what you can conclude, in general, about the entries $[B]_{k,d_\ell}$ and $[B]_{\ell,d_k}$.


Section HSE
Homogeneous Systems of Equations

In this section we specialize to systems of linear equations where every equation has a zero as its constant term. Along the way, we will begin to express more and more ideas in the language of matrices and begin a move away from writing out whole systems of equations. The ideas initiated in this section will carry through the remainder of the course.

Subsection SHS
Solutions of Homogeneous Systems

As usual, we begin with a definition.

Definition HS  Homogeneous System
A system of linear equations, $\mathcal{LS}(A, \mathbf{b})$, is homogeneous if the vector of constants is the zero vector, in other words, if $\mathbf{b} = \mathbf{0}$. □

Example AHSAC  Archetype C as a homogeneous system
For each archetype that is a system of equations, we have formulated a similar, yet different, homogeneous system of equations by replacing each equation's constant term with a zero. To wit, for Archetype C, we can convert the original system of equations into the homogeneous system,
\begin{align*}
2x_1 - 3x_2 + x_3 - 6x_4 &= 0\\
4x_1 + x_2 + 2x_3 + 9x_4 &= 0\\
3x_1 + x_2 + x_3 + 8x_4 &= 0
\end{align*}
Can you quickly find a solution to this system without row-reducing the augmented matrix? △

As you might have discovered by studying Example AHSAC, setting each variable to zero will always be a solution of a homogeneous system. This is the substance of the following theorem.

Theorem HSC  Homogeneous Systems are Consistent
Suppose that a system of linear equations is homogeneous. Then the system is consistent and one solution is found by setting each variable to zero.

Proof. Set each variable of the system to zero. When substituting these values into each equation, the left-hand side evaluates to zero, no matter what the coefficients are. Since a homogeneous system has zero on the right-hand side of each equation as the constant term, each equation is true. With one demonstrated solution, we can call the system consistent. ∎

Since this solution is so obvious, we now define it as the trivial solution.

Definition TSHSE  Trivial Solution to Homogeneous Systems of Equations
Suppose a homogeneous system of linear equations has $n$ variables. The solution $x_1 = 0$, $x_2 = 0$, \dots, $x_n = 0$ (i.e. $\mathbf{x} = \mathbf{0}$) is called the trivial solution. □

Here are three typical examples, which we will reference throughout this section. Work through the row operations as we bring each to reduced row-echelon form. Also notice what is similar in each example, and what differs.

Example HUSAB  Homogeneous, unique solution, Archetype B
Archetype B can be converted to the homogeneous system,
\begin{align*}
-7x_1 - 6x_2 - 12x_3 &= 0\\
5x_1 + 5x_2 + 7x_3 &= 0\\
x_1 + 4x_3 &= 0
\end{align*}
whose augmented matrix row-reduces to
$$\begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 \end{bmatrix}$$
By Theorem HSC, the system is consistent, and so the computation $n - r = 3 - 3 = 0$ means the solution set contains just a single solution. Then, this lone solution must be the trivial solution. △

Example HISAA  Homogeneous, infinite solutions, Archetype A
Archetype A can be converted to the homogeneous system,
\begin{align*}
x_1 - x_2 + 2x_3 &= 0\\
2x_1 + x_2 + x_3 &= 0\\
x_1 + x_2 &= 0
\end{align*}
whose augmented matrix row-reduces to
$$\begin{bmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & -1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}$$
By Theorem HSC, the system is consistent, and so the computation $n - r = 3 - 2 = 1$ means the solution set contains one free variable by Theorem FVCS, and hence has infinitely many solutions. We can describe this solution set using the free variable $x_3$,
$$S = \left\{ \begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix} \,\middle|\, x_1 = -x_3,\ x_2 = x_3 \right\} = \left\{ \begin{bmatrix} -x_3\\ x_3\\ x_3 \end{bmatrix} \,\middle|\, x_3 \in \mathbb{C} \right\}$$
Geometrically, these are points in three dimensions that lie on a line through the origin. △

Example HISAD  Homogeneous, infinite solutions, Archetype D
Archetype D (and identically, Archetype E) can be converted to the homogeneous system,
\begin{align*}
2x_1 + x_2 + 7x_3 - 7x_4 &= 0\\
-3x_1 + 4x_2 - 5x_3 - 6x_4 &= 0\\
x_1 + x_2 + 4x_3 - 5x_4 &= 0
\end{align*}
whose augmented matrix row-reduces to
$$\begin{bmatrix} 1 & 0 & 3 & -2 & 0\\ 0 & 1 & 1 & -3 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
By Theorem HSC, the system is consistent, and so the computation $n - r = 4 - 2 = 2$ means the solution set contains two free variables by Theorem FVCS, and hence has infinitely many solutions. We can describe this solution set using the free variables $x_3$ and $x_4$,
$$S = \left\{ \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix} \,\middle|\, x_1 = -3x_3 + 2x_4,\ x_2 = -x_3 + 3x_4 \right\} = \left\{ \begin{bmatrix} -3x_3 + 2x_4\\ -x_3 + 3x_4\\ x_3\\ x_4 \end{bmatrix} \,\middle|\, x_3, x_4 \in \mathbb{C} \right\}$$
△

After working through these examples, you might perform the same computations for the slightly larger example, Archetype J.

Notice that when we do row operations on the augmented matrix of a homogeneous system of linear equations, the last column of the matrix is all zeros. Any one of the three allowable row operations will convert zeros to zeros and thus the final column of the matrix in reduced row-echelon form will also be all zeros. So in this case, we may be as likely to reference only the coefficient matrix and presume that we remember that the final column begins with zeros, and after any number of row operations is still zero.
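In software we can exploit the same economy, row-reducing only the coefficient matrix and counting pivot columns. A short SymPy sketch over the three examples above:

```python
from sympy import Matrix

# Coefficient matrices only; the final column of zeros never changes.
examples = {
    "Archetype B (HUSAB)": Matrix([[-7, -6, -12], [5, 5, 7], [1, 0, 4]]),
    "Archetype A (HISAA)": Matrix([[1, -1, 2], [2, 1, 1], [1, 1, 0]]),
    "Archetype D (HISAD)": Matrix([[2, 1, 7, -7], [-3, 4, -5, -6], [1, 1, 4, -5]]),
}
for name, A in examples.items():
    r = len(A.rref()[1])            # number of pivot columns
    print(name, "n - r =", A.cols - r)
# prints 0, 1 and 2 free variables respectively, matching the examples
```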

Example HISAD suggests the following theorem.

Theorem HMVEI  Homogeneous, More Variables than Equations, Infinite solutions
Suppose that a homogeneous system of linear equations has $m$ equations and $n$ variables with $n > m$. Then the system has infinitely many solutions.

Proof. We are assuming the system is homogeneous, so Theorem HSC says it is consistent. Then the hypothesis that $n > m$, together with Theorem CMVEI, gives infinitely many solutions. ∎

Example HUSAB and Example HISAA are concerned with homogeneous systems where $n = m$ and expose a fundamental distinction between the two examples. One has a unique solution, while the other has infinitely many. These are exactly the only two possibilities for a homogeneous system and illustrate that each is possible (unlike the case when $n > m$, where Theorem HMVEI tells us that there is only one possibility for a homogeneous system).

Subsection NSM
Null Space of a Matrix

The set of solutions to a homogeneous system (which by Theorem HSC is never empty) is of enough interest to warrant its own name. However, we define it as a property of the coefficient matrix, not as a property of some system of equations.

Definition NSM  Null Space of a Matrix
The null space of a matrix $A$, denoted $\mathcal{N}(A)$, is the set of all the vectors that are solutions to the homogeneous system $\mathcal{LS}(A, \mathbf{0})$. □

In the Archetypes (Archetypes) each example that is a system of equations also has a corresponding homogeneous system of equations listed, and several sample solutions are given. These solutions will be elements of the null space of the coefficient matrix. We will look at one example.

Example NSEAI  Null space elements of Archetype I
The write-up for Archetype I lists several solutions of the corresponding homogeneous system. Here are two, written as solution vectors. We can say that they are in the null space of the coefficient matrix for the system of equations in Archetype I.
$$\mathbf{x} = \begin{bmatrix} 3\\ 0\\ -5\\ -6\\ 0\\ 0\\ 1 \end{bmatrix} \qquad \mathbf{y} = \begin{bmatrix} -4\\ 1\\ -3\\ -2\\ 1\\ 1\\ 1 \end{bmatrix}$$
However, the vector
$$\mathbf{z} = \begin{bmatrix} 1\\ 0\\ 0\\ 0\\ 0\\ 0\\ 2 \end{bmatrix}$$
is not in the null space, since it is not a solution to the homogeneous system. For example, it fails to even make the first equation true. △

Here are two (prototypical) examples of the computation of the null space of a matrix.

Example CNS1  Computing a null space, no. 1
Let us compute the null space of
$$A = \begin{bmatrix} 2 & -1 & 7 & -3 & -8\\ 1 & 0 & 2 & 4 & 9\\ 2 & 2 & -2 & -1 & 8 \end{bmatrix}$$
which we write as $\mathcal{N}(A)$. Translating Definition NSM, we simply desire to solve the homogeneous system $\mathcal{LS}(A, \mathbf{0})$. So we row-reduce the augmented matrix to obtain
$$\begin{bmatrix} 1 & 0 & 2 & 0 & 1 & 0\\ 0 & 1 & -3 & 0 & 4 & 0\\ 0 & 0 & 0 & 1 & 2 & 0 \end{bmatrix}$$
The variables (of the homogeneous system) $x_3$ and $x_5$ are free (since columns 1, 2 and 4 are pivot columns), so we arrange the equations represented by the matrix in reduced row-echelon form to
\begin{align*}
x_1 &= -2x_3 - x_5\\
x_2 &= 3x_3 - 4x_5\\
x_4 &= -2x_5
\end{align*}
So we can write the infinite solution set as a set of column vectors,
$$\mathcal{N}(A) = \left\{ \begin{bmatrix} -2x_3 - x_5\\ 3x_3 - 4x_5\\ x_3\\ -2x_5\\ x_5 \end{bmatrix} \,\middle|\, x_3, x_5 \in \mathbb{C} \right\}$$
△

Example CNS2  Computing a null space, no. 2
Let us compute the null space of
$$C = \begin{bmatrix} -4 & 6 & 1\\ -1 & 4 & 1\\ 5 & 6 & 7\\ 4 & 7 & 1 \end{bmatrix}$$
which we write as $\mathcal{N}(C)$. Translating Definition NSM, we simply desire to solve the homogeneous system $\mathcal{LS}(C, \mathbf{0})$. So we row-reduce the augmented matrix to obtain
$$\begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}$$
There are no free variables in the homogeneous system represented by the row-reduced matrix, so there is only the trivial solution, the zero vector, $\mathbf{0}$. So we can write the (trivial) solution set as
$$\mathcal{N}(C) = \{\mathbf{0}\} = \left\{ \begin{bmatrix} 0\\ 0\\ 0 \end{bmatrix} \right\}$$
△
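Both computations can be reproduced with SymPy, whose nullspace method returns a list of basis vectors for the null space (an empty list when the null space is trivial). A sketch:

```python
from sympy import Matrix

A = Matrix([[2, -1, 7, -3, -8],
            [1, 0, 2, 4, 9],
            [2, 2, -2, -1, 8]])
C = Matrix([[-4, 6, 1], [-1, 4, 1], [5, 6, 7], [4, 7, 1]])

# Basis vectors correspond to setting one free variable to 1, the rest to 0.
print(A.nullspace())   # two vectors: x3 = 1, x5 = 0 and x3 = 0, x5 = 1
print(C.nullspace())   # [] -- only the trivial solution, so N(C) = {0}
```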

Reading Questions

1. What is always true of the solution set for a homogeneous system of equations?
2. Suppose a homogeneous system of equations has 13 variables and 8 equations. How many solutions will it have? Why?
3. Describe, using only words, the null space of a matrix. (So in particular, do not use any symbols.)

Exercises

C10  Each Archetype (Archetypes) that is a system of equations has a corresponding homogeneous system with the same coefficient matrix. Compute the set of solutions for each. Notice that these solution sets are the null spaces of the coefficient matrices.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J

C20  Archetype K and Archetype L are simply $5 \times 5$ matrices (i.e. they are not systems of equations). Compute the null space of each matrix.

For Exercises C21–C23, solve the given homogeneous linear system. Compare your results to the results of the corresponding exercise in Section TSS.


C21†
\begin{align*}
x_1 + 4x_2 + 3x_3 - x_4 &= 0\\
x_1 - x_2 + x_3 + 2x_4 &= 0\\
4x_1 + x_2 + 6x_3 + 5x_4 &= 0
\end{align*}

C22†
\begin{align*}
x_1 - 2x_2 + x_3 - x_4 &= 0\\
2x_1 - 4x_2 + x_3 + x_4 &= 0\\
x_1 - 2x_2 - 2x_3 + 3x_4 &= 0
\end{align*}

C23†
\begin{align*}
x_1 - 2x_2 + x_3 - x_4 &= 0\\
x_1 + x_2 + x_3 - x_4 &= 0\\
x_1 + x_3 - x_4 &= 0
\end{align*}

For Exercises C25–C27, solve the given homogeneous linear system. Compare your results to the results of the corresponding exercise in Section TSS.

C25†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 0\\
2x_1 - x_2 + x_3 &= 0\\
3x_1 + x_2 + x_3 &= 0\\
x_2 + 2x_3 &= 0
\end{align*}

C26†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 0\\
2x_1 - x_2 + x_3 &= 0\\
3x_1 + x_2 + x_3 &= 0\\
5x_2 + 2x_3 &= 0
\end{align*}

C27†
\begin{align*}
x_1 + 2x_2 + 3x_3 &= 0\\
2x_1 - x_2 + x_3 &= 0\\
x_1 - 8x_2 - 7x_3 &= 0\\
x_2 + x_3 &= 0
\end{align*}


C30†  Compute the null space of the matrix $A$, $\mathcal{N}(A)$.
$$A = \begin{bmatrix} 2 & 4 & 1 & 3 & 8\\ -1 & -2 & -1 & -1 & 1\\ 2 & 4 & 0 & -3 & 4\\ 2 & 4 & -1 & -7 & 4 \end{bmatrix}$$

C31†  Find the null space of the matrix $B$, $\mathcal{N}(B)$.
$$B = \begin{bmatrix} -6 & 4 & -36 & 6\\ 2 & -1 & 10 & -1\\ -3 & 2 & -18 & 3 \end{bmatrix}$$

M45  Without doing any computations, and without examining any solutions, say as much as possible about the form of the solution set for the corresponding homogeneous system of equations of each archetype that is a system of equations.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J

For Exercises M50–M52, say as much as possible about each system's solution set. Be sure to make it clear which theorems you are using to reach your conclusions.

M50†  A homogeneous system of 8 equations in 8 variables.
M51†  A homogeneous system of 8 equations in 9 variables.
M52†  A homogeneous system of 8 equations in 7 variables.

T10†  Prove or disprove: A system of linear equations is homogeneous if and only if the system has the zero vector as a solution.

T11†  Suppose that two systems of linear equations are equivalent. Prove that if the first system is homogeneous, then the second system is homogeneous. Notice that this will allow us to conclude that two equivalent systems are either both homogeneous or both not homogeneous.

T12  Give an alternate proof of Theorem HSC that uses Theorem RCLS.

T20†  Consider the homogeneous system of linear equations $\mathcal{LS}(A, \mathbf{0})$, and suppose that
$$\mathbf{u} = \begin{bmatrix} u_1\\ u_2\\ u_3\\ \vdots\\ u_n \end{bmatrix}$$
is one solution to the system of equations. Prove that
$$\mathbf{v} = \begin{bmatrix} 4u_1\\ 4u_2\\ 4u_3\\ \vdots\\ 4u_n \end{bmatrix}$$
is also a solution to $\mathcal{LS}(A, \mathbf{0})$.


Section NM
Nonsingular Matrices

In this section we specialize further and consider matrices with equal numbers of rows and columns, which when considered as coefficient matrices lead to systems with equal numbers of equations and variables. We will see in the second half of the course (Chapter D, Chapter E, Chapter LT, Chapter R) that these matrices are especially important.

Subsection NM
Nonsingular Matrices

Our theorems will now establish connections between systems of equations (homogeneous or otherwise), augmented matrices representing those systems, coefficient matrices, constant vectors, the reduced row-echelon form of matrices (augmented and coefficient) and solution sets. Be very careful in your reading, writing and speaking about systems of equations, matrices and sets of vectors. A system of equations is not a matrix, a matrix is not a solution set, and a solution set is not a system of equations. Now would be a great time to review the discussion about speaking and writing mathematics in Proof Technique L.

Definition SQM  Square Matrix
A matrix with $m$ rows and $n$ columns is square if $m = n$. In this case, we say the matrix has size $n$. To emphasize the situation when a matrix is not square, we will call it rectangular. □

We can now present one of the central definitions of linear algebra.

Definition NM  Nonsingular Matrix
Suppose $A$ is a square matrix. Suppose further that the solution set to the homogeneous linear system of equations $\mathcal{LS}(A, \mathbf{0})$ is $\{\mathbf{0}\}$, in other words, the system has only the trivial solution. Then we say that $A$ is a nonsingular matrix. Otherwise we say $A$ is a singular matrix. □

We can investigate whether any square matrix is nonsingular or not, no matter if the matrix is derived somehow from a system of equations or if it is simply a matrix. The definition says that to perform this investigation we must construct a very specific system of equations (homogeneous, with the matrix as the coefficient matrix) and look at its solution set. We will have theorems in this section that connect nonsingular matrices with systems of equations, creating more opportunities for confusion. Convince yourself now of two observations, (1) we can decide nonsingularity for any square matrix, and (2) the determination of nonsingularity involves the solution set for a certain homogeneous system of equations.

Notice that it makes no sense to call a system of equations nonsingular (the term does not apply to a system of equations), nor does it make any sense to call a $5 \times 7$ matrix singular (the matrix is not square).

Example S  A singular matrix, Archetype A
Example HISAA shows that the coefficient matrix derived from Archetype A, specifically the $3 \times 3$ matrix,
$$A = \begin{bmatrix} 1 & -1 & 2\\ 2 & 1 & 1\\ 1 & 1 & 0 \end{bmatrix}$$
is a singular matrix since there are nontrivial solutions to the homogeneous system $\mathcal{LS}(A, \mathbf{0})$. △

Example NM  A nonsingular matrix, Archetype B
Example HUSAB shows that the coefficient matrix derived from Archetype B, specifically the $3 \times 3$ matrix,
$$B = \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix}$$
is a nonsingular matrix since the homogeneous system, $\mathcal{LS}(B, \mathbf{0})$, has only the trivial solution. △

Notice that we will not discuss Example HISAD as being a singular or nonsingular coefficient matrix since the matrix is not square.

The next theorem combines with our main computational technique (row-reducing a matrix) to make it easy to recognize a nonsingular matrix. But first a definition.

Definition IM  Identity Matrix
The $m \times m$ identity matrix, $I_m$, is defined by
$$[I_m]_{ij} = \begin{cases} 1 & i = j\\ 0 & i \neq j \end{cases} \qquad 1 \leq i, j \leq m$$
□

Example IM  An identity matrix
The $4 \times 4$ identity matrix is
$$I_4 = \begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
Notice that an identity matrix is square, and in reduced row-echelon form. Also, every column is a pivot column, and every possible pivot column appears once. △


Theorem NMRRI  Nonsingular Matrices Row Reduce to the Identity matrix
Suppose that $A$ is a square matrix and $B$ is a row-equivalent matrix in reduced row-echelon form. Then $A$ is nonsingular if and only if $B$ is the identity matrix.

Proof. (⇐) Suppose $B$ is the identity matrix. When the augmented matrix $[A \mid \mathbf{0}]$ is row-reduced, the result is $[B \mid \mathbf{0}] = [I_n \mid \mathbf{0}]$. The number of nonzero rows is equal to the number of variables in the linear system of equations $\mathcal{LS}(A, \mathbf{0})$, so $n = r$ and Theorem FVCS gives $n - r = 0$ free variables. Thus, the homogeneous system $\mathcal{LS}(A, \mathbf{0})$ has just one solution, which must be the trivial solution. This is exactly the definition of a nonsingular matrix (Definition NM).

(⇒) If $A$ is nonsingular, then the homogeneous system $\mathcal{LS}(A, \mathbf{0})$ has a unique solution, and has no free variables in the description of the solution set. The homogeneous system is consistent (Theorem HSC) so Theorem FVCS applies and tells us there are $n - r$ free variables. Thus, $n - r = 0$, and so $n = r$. So $B$ has $n$ pivot columns among its total of $n$ columns. This is enough to force $B$ to be the $n \times n$ identity matrix $I_n$ (see Exercise NM.T12). ∎

Notice that since this theorem is an equivalence it will always allow us to determine if a matrix is either nonsingular or singular. Here are two examples of this, continuing our study of Archetype A and Archetype B.

Example SRR  Singular matrix, row-reduced
We have the coefficient matrix for Archetype A and a row-equivalent matrix $B$ in reduced row-echelon form,
$$A = \begin{bmatrix} 1 & -1 & 2\\ 2 & 1 & 1\\ 1 & 1 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 1\\ 0 & 1 & -1\\ 0 & 0 & 0 \end{bmatrix} = B$$
Since $B$ is not the $3 \times 3$ identity matrix, Theorem NMRRI tells us that $A$ is a singular matrix. △

Example NSR  Nonsingular matrix, row-reduced
We have the coefficient matrix for Archetype B and a row-equivalent matrix $B$ in reduced row-echelon form,
$$A = \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} = B$$
Since $B$ is the $3 \times 3$ identity matrix, Theorem NMRRI tells us that $A$ is a nonsingular matrix. △
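Theorem NMRRI translates directly into a computational test; a minimal sketch with SymPy (the helper name is our own):

```python
from sympy import Matrix, eye

def is_nonsingular(A):
    """Theorem NMRRI: square A is nonsingular iff its RREF is the identity."""
    assert A.rows == A.cols, "nonsingularity applies only to square matrices"
    return A.rref()[0] == eye(A.rows)

archetype_A = Matrix([[1, -1, 2], [2, 1, 1], [1, 1, 0]])
archetype_B = Matrix([[-7, -6, -12], [5, 5, 7], [1, 0, 4]])
print(is_nonsingular(archetype_A))   # False, as in Example SRR
print(is_nonsingular(archetype_B))   # True, as in Example NSR
```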


Subsection NSNM
Null Space of a Nonsingular Matrix

Nonsingular matrices and their null spaces are intimately related, as the next two examples illustrate.

Example NSS  Null space of a singular matrix
Given the singular coefficient matrix from Archetype A, the null space is the set of solutions to the homogeneous system of equations $\mathcal{LS}(A, \mathbf{0})$, which has a solution set and null space constructed in Example HISAA as an infinite set of vectors.
$$A = \begin{bmatrix} 1 & -1 & 2\\ 2 & 1 & 1\\ 1 & 1 & 0 \end{bmatrix} \qquad \mathcal{N}(A) = \left\{ \begin{bmatrix} -x_3\\ x_3\\ x_3 \end{bmatrix} \,\middle|\, x_3 \in \mathbb{C} \right\}$$
△

Example NSNM  Null space of a nonsingular matrix
Given the nonsingular coefficient matrix from Archetype B, the solution set to the homogeneous system $\mathcal{LS}(A, \mathbf{0})$ is constructed in Example HUSAB and contains only the trivial solution, so the null space of $A$ has only a single element,
$$A = \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix} \qquad \mathcal{N}(A) = \left\{ \begin{bmatrix} 0\\ 0\\ 0 \end{bmatrix} \right\}$$
△

These two examples illustrate the next theorem, which is another equivalence.

Theorem NMTNS  Nonsingular Matrices have Trivial Null Spaces
Suppose that $A$ is a square matrix. Then $A$ is nonsingular if and only if the null space of $A$ is the set containing only the zero vector, i.e. $\mathcal{N}(A) = \{\mathbf{0}\}$.

Proof. The null space of a square matrix, $A$, is equal to the set of solutions to the homogeneous system, $\mathcal{LS}(A, \mathbf{0})$. A matrix is nonsingular if and only if the set of solutions to the homogeneous system, $\mathcal{LS}(A, \mathbf{0})$, has only a trivial solution. These two observations may be chained together to construct the two proofs necessary for each half of this theorem. ∎

The next theorem pulls a lot of big ideas together. Theorem NMUS tells us that we can learn much about solutions to a system of linear equations with a square coefficient matrix by just examining a similar homogeneous system.

Theorem NMUS  Nonsingular Matrices and Unique Solutions
Suppose that $A$ is a square matrix. $A$ is a nonsingular matrix if and only if the system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every choice of the constant vector $\mathbf{b}$.


Proof. (⇐) The hypothesis for this half of the proof is that the system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every choice of the constant vector $\mathbf{b}$. We will make a very specific choice for $\mathbf{b}$: $\mathbf{b} = \mathbf{0}$. Then we know that the system $\mathcal{LS}(A, \mathbf{0})$ has a unique solution. But this is precisely the definition of what it means for $A$ to be nonsingular (Definition NM). That almost seems too easy! Notice that we have not used the full power of our hypothesis, but there is nothing that says we must use a hypothesis to its fullest.

(⇒) We assume that $A$ is nonsingular of size $n \times n$, so we know there is a sequence of row operations that will convert $A$ into the identity matrix $I_n$ (Theorem NMRRI). Form the augmented matrix $A' = [A \mid \mathbf{b}]$ and apply this same sequence of row operations to $A'$. The result will be the matrix $B' = [I_n \mid \mathbf{c}]$, which is in reduced row-echelon form with $r = n$. Then the augmented matrix $B'$ represents the (extremely simple) system of equations $x_i = [\mathbf{c}]_i$, $1 \leq i \leq n$. The vector $\mathbf{c}$ is clearly a solution, so the system is consistent (Definition CS). With a consistent system, we use Theorem FVCS to count free variables. We find that there are $n - r = n - n = 0$ free variables, and so we therefore know that the solution is unique. (This half of the proof was suggested by Asa Scherer.) ∎

This theorem helps to explain part of our interest in nonsingular matrices. If a matrix is nonsingular, then no matter what vector of constants we pair it with, using the matrix as the coefficient matrix will always yield a linear system of equations with a solution, and the solution is unique. To determine if a matrix has this property (nonsingularity) it is enough to just solve one linear system, the homogeneous system with the matrix as coefficient matrix and the zero vector as the vector of constants (or any other vector of constants, see Exercise MM.T10).
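To see the "unique solution for every choice of b" behavior concretely, here is a brief sketch using SymPy's linsolve with the nonsingular matrix of Archetype B and a few different constant vectors:

```python
from sympy import Matrix, symbols, linsolve

B = Matrix([[-7, -6, -12], [5, 5, 7], [1, 0, 4]])   # nonsingular (Example NSR)
x1, x2, x3 = symbols("x1 x2 x3")
for b in (Matrix([-33, 24, 5]), Matrix([0, 0, 0]), Matrix([1, 1, 1])):
    print(linsolve(Matrix.hstack(B, b), x1, x2, x3))
# Each call prints a set containing exactly one solution triple;
# the first is {(-3, 5, 2)}, the solution of the original Archetype B.
```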

Formulating the negation of the second part of this theorem is a good exercise. A singular matrix has the property that for some value of the vector $\mathbf{b}$, the system $\mathcal{LS}(A, \mathbf{b})$ does not have a unique solution (which means that it has no solution or infinitely many solutions). We will be able to say more about this case later (see the discussion following Theorem PSPHS).

Square matrices that are nonsingular have a long list of interesting properties, which we will start to catalog in the following, recurring, theorem. Of course, singular matrices will then have all of the opposite properties. The following theorem is a list of equivalences.

We want to understand just what is involved with understanding and proving a theorem that says several conditions are equivalent. So have a look at Proof Technique ME before studying the first in this series of theorems.

Theorem NME1  Nonsingular Matrix Equivalences, Round 1
Suppose that $A$ is a square matrix. The following are equivalent.

1. $A$ is nonsingular.
2. $A$ row-reduces to the identity matrix.
3. The null space of $A$ contains only the zero vector, $\mathcal{N}(A) = \{\mathbf{0}\}$.
4. The linear system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every possible choice of $\mathbf{b}$.

Proof. The statement that $A$ is nonsingular is equivalent to each of the subsequent statements by, in turn, Theorem NMRRI, Theorem NMTNS and Theorem NMUS. So the statement of this theorem is just a convenient way to organize all these results. ∎

Finally, you may have wondered why we refer to a matrix as nonsingular when it creates systems of equations with single solutions (Theorem NMUS)! I have wondered the same thing. We will have an opportunity to address this when we get to Theorem SMZD. Can you wait that long?

Reading Questions

1. In your own words state the definition of a nonsingular matrix.
2. What is the easiest way to recognize if a square matrix is nonsingular or not?
3. Suppose we have a system of equations and its coefficient matrix is nonsingular. What can you say about the solution set for this system?

Exercises

In Exercises C30–C33 determine if the matrix is nonsingular or singular. Give reasons for your answer.

C30†
$$\begin{bmatrix} -3 & 1 & 2 & 8\\ 2 & 0 & 3 & 4\\ 1 & 2 & 7 & -4\\ 5 & -1 & 2 & 0 \end{bmatrix}$$

C31†
$$\begin{bmatrix} 2 & 3 & 1 & 4\\ 1 & 1 & 1 & 0\\ -1 & 2 & 3 & 5\\ 1 & 2 & 1 & 3 \end{bmatrix}$$

C32†
$$\begin{bmatrix} 9 & 3 & 2 & 4\\ 5 & -6 & 1 & 3\\ 4 & 1 & 3 & -5 \end{bmatrix}$$

C33†
$$\begin{bmatrix} -1 & 2 & 0 & 3\\ 1 & -3 & -2 & 4\\ -2 & 0 & 4 & 3\\ -3 & 1 & -2 & 3 \end{bmatrix}$$

C40  Each of the archetypes below is a system of equations with a square coefficient matrix, or is itself a square matrix. Determine if these matrices are nonsingular, or singular. Comment on the null space of each matrix.
Archetype A, Archetype B, Archetype F, Archetype K, Archetype L

C50†  Find the null space of the matrix $E$ below.
$$E = \begin{bmatrix} 2 & 1 & -1 & -9\\ 2 & 2 & -6 & -6\\ 1 & 2 & -8 & 0\\ -1 & 2 & -12 & 12 \end{bmatrix}$$

M30†  Let $A$ be the coefficient matrix of the system of equations below. Is $A$ nonsingular or singular? Explain what you could infer about the solution set for the system based only on what you have learned about $A$ being singular or nonsingular.
\begin{align*}
-x_1 + 5x_2 &= -8\\
-2x_1 + 5x_2 + 5x_3 + 2x_4 &= 9\\
-3x_1 - x_2 + 3x_3 + x_4 &= 3\\
7x_1 + 6x_2 + 5x_3 + x_4 &= 30
\end{align*}

For Exercises M51–M52, say as much as possible about each system's solution set. Be sure to make it clear which theorems you are using to reach your conclusions.

M51†  6 equations in 6 variables, singular coefficient matrix.
M52†  A system with a nonsingular coefficient matrix, not homogeneous.

T10†  Suppose that $A$ is a square matrix, and $B$ is a matrix in reduced row-echelon form that is row-equivalent to $A$. Prove that if $A$ is singular, then the last row of $B$ is a zero row.

T12  Using Definition RREF and Definition IM carefully, give a proof of the following equivalence: $A$ is a square matrix in reduced row-echelon form where every column is a pivot column if and only if $A$ is the identity matrix.

T30†  Suppose that $A$ is a nonsingular matrix and $A$ is row-equivalent to the matrix $B$. Prove that $B$ is nonsingular.

T31†  Suppose that $A$ is a square matrix of size $n \times n$ and that we know there is a single vector $\mathbf{b} \in \mathbb{C}^n$ such that the system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution. Prove that $A$ is a nonsingular matrix. (Notice that this is very similar to Theorem NMUS, but is not exactly the same.)


T90†  Provide an alternative for the second half of the proof of Theorem NMUS, without appealing to properties of the reduced row-echelon form of the coefficient matrix. In other words, prove that if $A$ is nonsingular, then $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every choice of the constant vector $\mathbf{b}$. Construct this proof without using Theorem REMEF or Theorem RREFU.


Chapter V
Vectors

We have worked extensively in the last chapter with matrices, and some with vectors. In this chapter we will develop the properties of vectors, while preparing to study vector spaces (Chapter VS). Initially we will depart from our study of systems of linear equations, but in Section LC we will forge a connection between linear combinations and systems of linear equations in Theorem SLSLC. This connection will allow us to understand systems of linear equations at a higher level, while consequently discussing them less frequently.

Section VO
Vector Operations

In this section we define some new operations involving vectors, and collect some basic properties of these operations. Begin by recalling our definition of a column vector as an ordered list of complex numbers, written vertically (Definition CV). The collection of all possible vectors of a fixed size is a commonly used set, so we start with its definition.

Subsection CV
Column Vectors

Definition VSCV  Vector Space of Column Vectors
The vector space $\mathbb{C}^m$ is the set of all column vectors (Definition CV) of size $m$ with entries from the set of complex numbers, $\mathbb{C}$. □

When a set similar to this is defined using only column vectors where all the entries are from the real numbers, it is written as $\mathbb{R}^m$ and is known as Euclidean $m$-space.


The term vector is used in a variety of different ways. We have defined it as an ordered list written vertically. It could simply be an ordered list of numbers, and perhaps written as $\langle 2, 3, -1, 6 \rangle$. Or it could be interpreted as a point in $m$ dimensions, such as $(3, 4, -2)$ representing a point in three dimensions relative to $x$, $y$ and $z$ axes. With an interpretation as a point, we can construct an arrow from the origin to the point, which is consistent with the notion that a vector has direction and magnitude.

All of these ideas can be shown to be related and equivalent, so keep that in mind as you connect the ideas of this course with ideas from other disciplines. For now, we will stick with the idea that a vector is just a list of numbers, in some particular order.

Subsection VEASM
Vector Equality, Addition, Scalar Multiplication

We start our study of this set by first defining what it means for two vectors to be the same.

Definition CVE  Column Vector Equality
Suppose that $\mathbf{u}, \mathbf{v} \in \mathbb{C}^m$. Then $\mathbf{u}$ and $\mathbf{v}$ are equal, written $\mathbf{u} = \mathbf{v}$, if
$$[\mathbf{u}]_i = [\mathbf{v}]_i \qquad 1 \leq i \leq m$$
□

Now this may seem like a silly (or even stupid) thing to say so carefully. Of course two vectors are equal if they are equal for each corresponding entry! Well, this is not as silly as it appears. We will see a few occasions later where the obvious definition is not the right one. And besides, in doing mathematics we need to be very careful about making all the necessary definitions and making them unambiguous. And we have done that here.

Notice that the symbol "=" is now doing triple-duty. We know from our earlier education what it means for two numbers (real or complex) to be equal, and we take this for granted. In Definition SE we defined what it meant for two sets to be equal. Now we have defined what it means for two vectors to be equal, and that definition builds on our definition for when two numbers are equal when we use the condition $[\mathbf{u}]_i = [\mathbf{v}]_i$ for all $1 \leq i \leq m$. So think carefully about your objects when you see an equal sign and think about just which notion of equality you have encountered. This will be especially important when you are asked to construct proofs whose conclusion states that two objects are equal. If you have an electronic copy of the book, such as the PDF version, searching on "Definition CVE" can be an instructive exercise. See how often, and where, the definition is employed.

OK, let us do an example of vector equality that begins to hint at the utility of this definition.


Example VESE  Vector equality for a system of equations
Consider the system of linear equations in Archetype B,
\begin{align*}
-7x_1 - 6x_2 - 12x_3 &= -33\\
5x_1 + 5x_2 + 7x_3 &= 24\\
x_1 + 4x_3 &= 5
\end{align*}
Note the use of three equals signs; each indicates an equality of numbers (the linear expressions are numbers when we evaluate them with fixed values of the variable quantities). Now write the vector equality,
$$\begin{bmatrix} -7x_1 - 6x_2 - 12x_3\\ 5x_1 + 5x_2 + 7x_3\\ x_1 + 4x_3 \end{bmatrix} = \begin{bmatrix} -33\\ 24\\ 5 \end{bmatrix}.$$
By Definition CVE, this single equality (of two column vectors) translates into three simultaneous equalities of numbers that form the system of equations. So with this new notion of vector equality we can become less reliant on referring to systems of simultaneous equations. There is more to vector equality than just this, but this is a good example for starters and we will develop it further. △

We will now define two operations on the set $\mathbb{C}^m$. By this we mean well-defined procedures that somehow convert vectors into other vectors. Here are two of the most basic definitions of the entire course.

Definition CVA  Column Vector Addition
Suppose that $\mathbf{u}, \mathbf{v} \in \mathbb{C}^m$. The sum of $\mathbf{u}$ and $\mathbf{v}$ is the vector $\mathbf{u} + \mathbf{v}$ defined by
$$[\mathbf{u} + \mathbf{v}]_i = [\mathbf{u}]_i + [\mathbf{v}]_i \qquad 1 \leq i \leq m$$
□

So vector addition takes two vectors of the same size and combines them (in a natural way!) to create a new vector of the same size. Notice that this definition is required, even if we agree that this is the obvious, right, natural or correct way to do it. Notice too that the symbol "+" is being recycled. We all know how to add numbers, but now we have the same symbol extended to double-duty and we use it to indicate how to add two new objects, vectors. And this definition of our new meaning is built on our previous meaning of addition via the expressions $[\mathbf{u}]_i + [\mathbf{v}]_i$. Think about your objects, especially when doing proofs. Vector addition is easy; here is an example from $\mathbb{C}^4$.

Example VA  Addition of two vectors in $\mathbb{C}^4$
If
$$\mathbf{u} = \begin{bmatrix} 2\\ -3\\ 4\\ 2 \end{bmatrix} \qquad \mathbf{v} = \begin{bmatrix} -1\\ 5\\ 2\\ -7 \end{bmatrix}$$
then
$$\mathbf{u} + \mathbf{v} = \begin{bmatrix} 2\\ -3\\ 4\\ 2 \end{bmatrix} + \begin{bmatrix} -1\\ 5\\ 2\\ -7 \end{bmatrix} = \begin{bmatrix} 2 + (-1)\\ -3 + 5\\ 4 + 2\\ 2 + (-7) \end{bmatrix} = \begin{bmatrix} 1\\ 2\\ 6\\ -5 \end{bmatrix}.$$
△

Our second operation takes two objects of different types, specifically a number and a vector, and combines them to create another vector. In this context we call a number a scalar in order to emphasize that it is not a vector.

Definition CVSM  Column Vector Scalar Multiplication
Suppose $\mathbf{u} \in \mathbb{C}^m$ and $\alpha \in \mathbb{C}$, then the scalar multiple of $\mathbf{u}$ by $\alpha$ is the vector $\alpha\mathbf{u}$ defined by
$$[\alpha\mathbf{u}]_i = \alpha\,[\mathbf{u}]_i \qquad 1 \leq i \leq m$$
□

Notice that we are doing a kind of multiplication here, but we are defining a new type, perhaps in what appears to be a natural way. We use juxtaposition (smashing two symbols together side-by-side) to denote this operation rather than using a symbol like we did with vector addition. So this can be another source of confusion. When two symbols are next to each other, are we doing regular old multiplication, the kind we have done for years, or are we doing scalar vector multiplication, the operation we just defined? Think about your objects: if the first object is a scalar, and the second is a vector, then it must be that we are doing our new operation, and the result of this operation will be another vector.

Notice how consistency in notation can be an aid here. If we write scalars as lower case Greek letters from the start of the alphabet (such as $\alpha$, $\beta$, ...) and write vectors in bold Latin letters from the end of the alphabet ($\mathbf{u}$, $\mathbf{v}$, ...), then we have some hints about what type of objects we are working with. This can be a blessing and a curse, since when we go read another book about linear algebra, or read an application in another discipline (physics, economics, ...), the types of notation employed may be very different and hence unfamiliar.

Again, computationally, vector scalar multiplication is very easy.

Example CVSM  Scalar multiplication in $\mathbb{C}^5$
If
$$\mathbf{u} = \begin{bmatrix} 3\\ 1\\ -2\\ 4\\ -1 \end{bmatrix}$$
and $\alpha = 6$, then
$$\alpha\mathbf{u} = 6 \begin{bmatrix} 3\\ 1\\ -2\\ 4\\ -1 \end{bmatrix} = \begin{bmatrix} 6(3)\\ 6(1)\\ 6(-2)\\ 6(4)\\ 6(-1) \end{bmatrix} = \begin{bmatrix} 18\\ 6\\ -12\\ 24\\ -6 \end{bmatrix}.$$
△
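Definition CVA and Definition CVSM are entry-by-entry recipes, so they translate into code almost verbatim. A small sketch in plain Python, with list-based vectors and helper names of our own choosing:

```python
def vec_add(u, v):
    """Definition CVA: [u + v]_i = [u]_i + [v]_i, entry by entry."""
    assert len(u) == len(v), "vector addition requires equal sizes"
    return [ui + vi for ui, vi in zip(u, v)]

def scalar_mult(alpha, u):
    """Definition CVSM: [alpha u]_i = alpha [u]_i, entry by entry."""
    return [alpha * ui for ui in u]

print(vec_add([2, -3, 4, 2], [-1, 5, 2, -7]))   # [1, 2, 6, -5], Example VA
print(scalar_mult(6, [3, 1, -2, 4, -1]))        # [18, 6, -12, 24, -6], Example CVSM
```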

Subsection VSP
Vector Space Properties

With definitions of vector addition and scalar multiplication we can state, and prove, several properties of each operation, and some properties that involve their interplay. We now collect ten of them here for later reference.

Theorem VSPCV  Vector Space Properties of Column Vectors
Suppose that $\mathbb{C}^m$ is the set of column vectors of size $m$ (Definition VSCV) with addition and scalar multiplication as defined in Definition CVA and Definition CVSM. Then

• ACC Additive Closure, Column Vectors: If $\mathbf{u}, \mathbf{v} \in \mathbb{C}^m$, then $\mathbf{u} + \mathbf{v} \in \mathbb{C}^m$.

• SCC Scalar Closure, Column Vectors: If $\alpha \in \mathbb{C}$ and $\mathbf{u} \in \mathbb{C}^m$, then $\alpha\mathbf{u} \in \mathbb{C}^m$.

• CC Commutativity, Column Vectors: If $\mathbf{u}, \mathbf{v} \in \mathbb{C}^m$, then $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$.

• AAC Additive Associativity, Column Vectors: If $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathbb{C}^m$, then $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$.

• ZC Zero Vector, Column Vectors: There is a vector, $\mathbf{0}$, called the zero vector, such that $\mathbf{u} + \mathbf{0} = \mathbf{u}$ for all $\mathbf{u} \in \mathbb{C}^m$.

• AIC Additive Inverses, Column Vectors: If $\mathbf{u} \in \mathbb{C}^m$, then there exists a vector $-\mathbf{u} \in \mathbb{C}^m$ so that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$.

• SMAC Scalar Multiplication Associativity, Column Vectors: If $\alpha, \beta \in \mathbb{C}$ and $\mathbf{u} \in \mathbb{C}^m$, then $\alpha(\beta\mathbf{u}) = (\alpha\beta)\mathbf{u}$.

• DVAC Distributivity across Vector Addition, Column Vectors: If $\alpha \in \mathbb{C}$ and $\mathbf{u}, \mathbf{v} \in \mathbb{C}^m$, then $\alpha(\mathbf{u} + \mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}$.

• DSAC Distributivity across Scalar Addition, Column Vectors: If $\alpha, \beta \in \mathbb{C}$ and $\mathbf{u} \in \mathbb{C}^m$, then $(\alpha + \beta)\mathbf{u} = \alpha\mathbf{u} + \beta\mathbf{u}$.

• OC One, Column Vectors: If $\mathbf{u} \in \mathbb{C}^m$, then $1\mathbf{u} = \mathbf{u}$.

Proof. While some of these properties seem very obvious, they all require proof. However, the proofs are not very interesting, and border on tedious. We will prove one version of distributivity very carefully, and you can test your proof-building skills on some of the others. We need to establish an equality, so we will do so by beginning with one side of the equality, and applying various definitions and theorems (listed to the right of each step) to massage the expression from the left into the expression on the right. Here we go with a proof of Property DSAC.

For $1 \le i \le m$,
$$\begin{aligned}
[(\alpha + \beta)u]_i &= (\alpha + \beta)[u]_i && \text{Definition CVSM} \\
&= \alpha[u]_i + \beta[u]_i && \text{Property DCN} \\
&= [\alpha u]_i + [\beta u]_i && \text{Definition CVSM} \\
&= [\alpha u + \beta u]_i && \text{Definition CVA}
\end{aligned}$$
Since the individual components of the vectors $(\alpha + \beta)u$ and $\alpha u + \beta u$ are equal for all $i$, $1 \le i \le m$, Definition CVE tells us the vectors are equal. ■

Many of the conclusions of our theorems can be characterized as “identities,”<br />

especially when we are establish<strong>in</strong>g basic properties of operations such as those <strong>in</strong><br />

this section. Most of the properties listed <strong>in</strong> Theorem VSPCV are examples. So<br />

some advice about the style we use for prov<strong>in</strong>g identities is appropriate right now.<br />

Have a look at Proof Technique PI.<br />
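As a complement to formal proof, an identity like Property DSAC can also be spot-checked numerically. The sketch below (Python with NumPy, our choice of tooling) tests it on randomly generated data; such a check illustrates the identity, but of course proves nothing:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = rng.standard_normal(2)   # two random scalars
u = rng.standard_normal(5)             # a random vector of size 5

lhs = (alpha + beta) * u               # (alpha + beta)u
rhs = alpha * u + beta * u             # alpha u + beta u
print(np.allclose(lhs, rhs))           # True, as Property DSAC asserts
```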

Be careful with the notion of the vector −u. This is a vector that we add to u<br />

so that the result is the particular vector 0. This is basically a property of vector<br />

addition. It happens that we can compute −u us<strong>in</strong>g the other operation, scalar<br />

multiplication. We can prove this directly by writ<strong>in</strong>g that<br />

$$[-u]_i = -[u]_i = (-1)[u]_i = [(-1)u]_i$$

We will see later how to derive this property as a consequence of several of the ten<br />

properties listed <strong>in</strong> Theorem VSPCV.<br />

Similarly, we will often write someth<strong>in</strong>g you would immediately recognize as<br />

“vector subtraction.” This could be placed on a firm theoretical foundation — as you<br />

can do yourself with Exercise VO.T30.

A f<strong>in</strong>al note. Property AAC implies that we do not have to be careful about how<br />

we “parenthesize” the addition of vectors. In other words, there is noth<strong>in</strong>g to be<br />

gained by writing $(u + v) + (w + (x + y))$ rather than $u + v + w + x + y$, since we



get the same result no matter which order we choose to perform the four additions.<br />

So we will not be careful about us<strong>in</strong>g parentheses this way.<br />

Read<strong>in</strong>g Questions<br />

1. Where have you seen vectors used before <strong>in</strong> other courses? How were they different?<br />

2. In words only, when are two vectors equal?<br />

3. Perform the following computation with vector operations
$$2\begin{bmatrix} 1 \\ 5 \\ 0 \end{bmatrix} + (-3)\begin{bmatrix} 7 \\ 6 \\ 5 \end{bmatrix}$$

Exercises<br />

C10 † Compute
$$4\begin{bmatrix} 2 \\ -3 \\ 4 \\ 1 \\ 0 \end{bmatrix} + (-2)\begin{bmatrix} 1 \\ 2 \\ -5 \\ 2 \\ 4 \end{bmatrix} + \begin{bmatrix} -1 \\ 3 \\ 0 \\ 1 \\ 2 \end{bmatrix}$$

C11 † Solve the given vector equation for $x$, or explain why no solution exists:
$$3\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + 4\begin{bmatrix} 2 \\ 0 \\ x \end{bmatrix} = \begin{bmatrix} 11 \\ 6 \\ 17 \end{bmatrix}$$

C12 † Solve the given vector equation for $\alpha$, or explain why no solution exists:
$$\alpha\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + 4\begin{bmatrix} 3 \\ 4 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 4 \end{bmatrix}$$

C13 † Solve the given vector equation for $\alpha$, or explain why no solution exists:
$$\alpha\begin{bmatrix} 3 \\ 2 \\ -2 \end{bmatrix} + \begin{bmatrix} 6 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ -3 \\ 6 \end{bmatrix}$$

C14 † Find $\alpha$ and $\beta$ that solve the vector equation.
$$\alpha\begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$$

C15 † Find $\alpha$ and $\beta$ that solve the vector equation.
$$\alpha\begin{bmatrix} 2 \\ 1 \end{bmatrix} + \beta\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 0 \end{bmatrix}$$



T05 † Provide reasons (mostly vector space properties) as justification for each of the<br />

seven steps of the follow<strong>in</strong>g proof.<br />

Theorem For any vectors $u, v, w \in \mathbb{C}^m$, if $u + v = u + w$, then $v = w$.
Proof: Let $u, v, w \in \mathbb{C}^m$, and suppose $u + v = u + w$.
$$\begin{aligned}
v &= \mathbf{0} + v \\
&= (-u + u) + v \\
&= -u + (u + v) \\
&= -u + (u + w) \\
&= (-u + u) + w \\
&= \mathbf{0} + w \\
&= w
\end{aligned}$$

T06 † Provide reasons (mostly vector space properties) as justification for each of the six<br />

steps of the follow<strong>in</strong>g proof.<br />

Theorem For any vector $u \in \mathbb{C}^m$, $0u = \mathbf{0}$.
Proof: Let $u \in \mathbb{C}^m$.
$$\begin{aligned}
\mathbf{0} &= 0u + (-0u) \\
&= (0 + 0)u + (-0u) \\
&= (0u + 0u) + (-0u) \\
&= 0u + (0u + (-0u)) \\
&= 0u + \mathbf{0} \\
&= 0u
\end{aligned}$$

T07 † Provide reasons (mostly vector space properties) as justification for each of the six<br />

steps of the follow<strong>in</strong>g proof.<br />

Theorem For any scalar $c$, $c\mathbf{0} = \mathbf{0}$.
Proof: Let $c$ be an arbitrary scalar.
$$\begin{aligned}
\mathbf{0} &= c\mathbf{0} + (-c\mathbf{0}) \\
&= c(\mathbf{0} + \mathbf{0}) + (-c\mathbf{0}) \\
&= (c\mathbf{0} + c\mathbf{0}) + (-c\mathbf{0}) \\
&= c\mathbf{0} + (c\mathbf{0} + (-c\mathbf{0})) \\
&= c\mathbf{0} + \mathbf{0} \\
&= c\mathbf{0}
\end{aligned}$$

T13 † Prove Property CC of Theorem VSPCV. Write your proof <strong>in</strong> the style of the proof<br />

of Property DSAC given <strong>in</strong> this section.<br />

T17 Prove Property SMAC of Theorem VSPCV. Write your proof <strong>in</strong> the style of the<br />

proof of Property DSAC given <strong>in</strong> this section.<br />

T18 Prove Property DVAC of Theorem VSPCV. Write your proof <strong>in</strong> the style of the<br />

proof of Property DSAC given <strong>in</strong> this section.<br />

Exercises T30, T31 and T32 are about mak<strong>in</strong>g a careful def<strong>in</strong>ition of “vector subtraction”.<br />

T30 Suppose u and v are two vectors <strong>in</strong> C m . Def<strong>in</strong>e a new operation, called “subtraction,”<br />

as the new vector denoted u − v and def<strong>in</strong>ed by<br />

$$[u - v]_i = [u]_i - [v]_i \qquad 1 \le i \le m$$

Prove that we can express the subtraction of two vectors <strong>in</strong> terms of our two basic<br />

operations. More precisely, prove that $u - v = u + (-1)v$. So in a sense, subtraction is

not someth<strong>in</strong>g new and different, but is just a convenience. Mimic the style of similar<br />

proofs <strong>in</strong> this section.<br />

T31 Prove, by giv<strong>in</strong>g counterexamples, that vector subtraction is not commutative<br />

and not associative.<br />

T32 Prove that vector subtraction obeys a distributive property. Specifically, prove<br />

that $\alpha(u - v) = \alpha u - \alpha v$.

Can you give two different proofs? Dist<strong>in</strong>guish your two proofs by us<strong>in</strong>g the alternate<br />

descriptions of vector subtraction provided by Exercise VO.T30.


Section LC<br />

L<strong>in</strong>ear Comb<strong>in</strong>ations<br />

In Section VO we def<strong>in</strong>ed vector addition and scalar multiplication. These two<br />

operations comb<strong>in</strong>e nicely to give us a construction known as a l<strong>in</strong>ear comb<strong>in</strong>ation,<br />

a construct that we will work with throughout this course.<br />

Subsection LC<br />

L<strong>in</strong>ear Comb<strong>in</strong>ations<br />

Def<strong>in</strong>ition LCCV L<strong>in</strong>ear Comb<strong>in</strong>ation of Column Vectors<br />

Given $n$ vectors $u_1, u_2, u_3, \ldots, u_n$ from $\mathbb{C}^m$ and $n$ scalars $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_n$, their linear combination is the vector
$$\alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n$$

So this def<strong>in</strong>ition takes an equal number of scalars and vectors, comb<strong>in</strong>es them<br />

us<strong>in</strong>g our two new operations (scalar multiplication and vector addition) and creates<br />

a s<strong>in</strong>gle brand-new vector, of the same size as the orig<strong>in</strong>al vectors. When a def<strong>in</strong>ition<br />

or theorem employs a l<strong>in</strong>ear comb<strong>in</strong>ation, th<strong>in</strong>k about the nature of the objects that<br />

go <strong>in</strong>to its creation (lists of scalars and vectors), and the type of object that results<br />

(a s<strong>in</strong>gle vector). Computationally, a l<strong>in</strong>ear comb<strong>in</strong>ation is pretty easy.<br />

Example TLC Two linear combinations in $\mathbb{C}^6$
Suppose that
$$\alpha_1 = 1 \qquad \alpha_2 = -4 \qquad \alpha_3 = 2 \qquad \alpha_4 = -1$$
and
$$u_1 = \begin{bmatrix} 2 \\ 4 \\ -3 \\ 1 \\ 2 \\ 9 \end{bmatrix} \qquad
u_2 = \begin{bmatrix} 6 \\ 3 \\ 0 \\ -2 \\ 1 \\ 4 \end{bmatrix} \qquad
u_3 = \begin{bmatrix} -5 \\ 2 \\ 1 \\ 1 \\ -3 \\ 0 \end{bmatrix} \qquad
u_4 = \begin{bmatrix} 3 \\ 2 \\ -5 \\ 7 \\ 1 \\ 3 \end{bmatrix}$$
then their linear combination is
$$\alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \alpha_4 u_4
= (1)\begin{bmatrix} 2 \\ 4 \\ -3 \\ 1 \\ 2 \\ 9 \end{bmatrix}
+ (-4)\begin{bmatrix} 6 \\ 3 \\ 0 \\ -2 \\ 1 \\ 4 \end{bmatrix}
+ (2)\begin{bmatrix} -5 \\ 2 \\ 1 \\ 1 \\ -3 \\ 0 \end{bmatrix}
+ (-1)\begin{bmatrix} 3 \\ 2 \\ -5 \\ 7 \\ 1 \\ 3 \end{bmatrix}$$
$$= \begin{bmatrix} 2 \\ 4 \\ -3 \\ 1 \\ 2 \\ 9 \end{bmatrix}
+ \begin{bmatrix} -24 \\ -12 \\ 0 \\ 8 \\ -4 \\ -16 \end{bmatrix}
+ \begin{bmatrix} -10 \\ 4 \\ 2 \\ 2 \\ -6 \\ 0 \end{bmatrix}
+ \begin{bmatrix} -3 \\ -2 \\ 5 \\ -7 \\ -1 \\ -3 \end{bmatrix}
= \begin{bmatrix} -35 \\ -6 \\ 4 \\ 4 \\ -9 \\ -10 \end{bmatrix}$$
A different linear combination, of the same set of vectors, can be formed with different scalars. Take
$$\beta_1 = 3 \qquad \beta_2 = 0 \qquad \beta_3 = 5 \qquad \beta_4 = -1$$
and form the linear combination
$$\beta_1 u_1 + \beta_2 u_2 + \beta_3 u_3 + \beta_4 u_4
= (3)\begin{bmatrix} 2 \\ 4 \\ -3 \\ 1 \\ 2 \\ 9 \end{bmatrix}
+ (0)\begin{bmatrix} 6 \\ 3 \\ 0 \\ -2 \\ 1 \\ 4 \end{bmatrix}
+ (5)\begin{bmatrix} -5 \\ 2 \\ 1 \\ 1 \\ -3 \\ 0 \end{bmatrix}
+ (-1)\begin{bmatrix} 3 \\ 2 \\ -5 \\ 7 \\ 1 \\ 3 \end{bmatrix}$$
$$= \begin{bmatrix} 6 \\ 12 \\ -9 \\ 3 \\ 6 \\ 27 \end{bmatrix}
+ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ \begin{bmatrix} -25 \\ 10 \\ 5 \\ 5 \\ -15 \\ 0 \end{bmatrix}
+ \begin{bmatrix} -3 \\ -2 \\ 5 \\ -7 \\ -1 \\ -3 \end{bmatrix}
= \begin{bmatrix} -22 \\ 20 \\ 1 \\ 1 \\ -10 \\ 24 \end{bmatrix}$$
Notice how we could keep our set of vectors fixed, and use different sets of scalars to construct different vectors. You might build a few new linear combinations of $u_1, u_2, u_3, u_4$ right now. We will be right here when you get back. What vectors were you able to create? Do you think you could create the vector $w$ with a "suitable" choice of four scalars?
$$w = \begin{bmatrix} 13 \\ 15 \\ 5 \\ -17 \\ 2 \\ 25 \end{bmatrix}$$
Do you think you could create any possible vector from $\mathbb{C}^6$ by choosing the proper scalars? These last two questions are very fundamental, and time spent considering them now will prove beneficial later.
△
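If you would like to reproduce these computations, here is a short Python/NumPy sketch (our illustration, not part of the text). Gathering $u_1, \ldots, u_4$ as the columns of a matrix turns each linear combination into a single matrix-vector product, a connection the next examples begin to develop:

```python
import numpy as np

# The four vectors of Example TLC, stored as the columns of a matrix.
U = np.array([[ 2,  6, -5,  3],
              [ 4,  3,  2,  2],
              [-3,  0,  1, -5],
              [ 1, -2,  1,  7],
              [ 2,  1, -3,  1],
              [ 9,  4,  0,  3]])

alphas = np.array([1, -4, 2, -1])   # scalars of the first combination
betas  = np.array([3,  0, 5, -1])   # scalars of the second combination

print(U @ alphas)   # [-35  -6   4   4  -9 -10]
print(U @ betas)    # [-22  20   1   1 -10  24]
```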

Our next two examples are key ones, and a discussion about decompositions is<br />

timely. Have a look at Proof Technique DC before study<strong>in</strong>g the next two examples.<br />

Example ABLC Archetype B as a linear combination
In this example we will rewrite Archetype B in the language of vectors, vector equality and linear combinations. In Example VESE we wrote the system of $m = 3$ equations as the vector equality
$$\begin{bmatrix} -7x_1 - 6x_2 - 12x_3 \\ 5x_1 + 5x_2 + 7x_3 \\ x_1 + 4x_3 \end{bmatrix} = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}$$
Now we will bust up the linear expressions on the left, first using vector addition,
$$\begin{bmatrix} -7x_1 \\ 5x_1 \\ x_1 \end{bmatrix} + \begin{bmatrix} -6x_2 \\ 5x_2 \\ 0x_2 \end{bmatrix} + \begin{bmatrix} -12x_3 \\ 7x_3 \\ 4x_3 \end{bmatrix} = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}$$
Now we can rewrite each of these vectors as a scalar multiple of a fixed vector, where the scalar is one of the unknown variables, converting the left-hand side into a linear combination
$$x_1\begin{bmatrix} -7 \\ 5 \\ 1 \end{bmatrix} + x_2\begin{bmatrix} -6 \\ 5 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} -12 \\ 7 \\ 4 \end{bmatrix} = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}$$
We can now interpret the problem of solving the system of equations as determining values for the scalar multiples that make the vector equation true. In the analysis of Archetype B, we were able to determine that it had only one solution. A quick way to see this is to row-reduce the coefficient matrix to the $3 \times 3$ identity matrix and apply Theorem NMRRI to determine that the coefficient matrix is nonsingular. Then Theorem NMUS tells us that the system of equations has a unique solution. This solution is
$$x_1 = -3 \qquad x_2 = 5 \qquad x_3 = 2$$
So, in the context of this example, we can express the fact that these values of the variables are a solution by writing the linear combination,
$$(-3)\begin{bmatrix} -7 \\ 5 \\ 1 \end{bmatrix} + (5)\begin{bmatrix} -6 \\ 5 \\ 0 \end{bmatrix} + (2)\begin{bmatrix} -12 \\ 7 \\ 4 \end{bmatrix} = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}$$
Furthermore, these are the only three scalars that will accomplish this equality, since they come from a unique solution.
Notice how the three vectors in this example are the columns of the coefficient matrix of the system of equations. This is our first hint of the important interplay between the vectors that form the columns of a matrix, and the matrix itself. △

With any discussion of Archetype A or Archetype B we should be sure to contrast<br />

with the other.<br />

Example AALC Archetype A as a linear combination
As a vector equality, Archetype A can be written as
$$\begin{bmatrix} x_1 - x_2 + 2x_3 \\ 2x_1 + x_2 + x_3 \\ x_1 + x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ 5 \end{bmatrix}$$
Now bust up the linear expressions on the left, first using vector addition,
$$\begin{bmatrix} x_1 \\ 2x_1 \\ x_1 \end{bmatrix} + \begin{bmatrix} -x_2 \\ x_2 \\ x_2 \end{bmatrix} + \begin{bmatrix} 2x_3 \\ x_3 \\ 0x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ 5 \end{bmatrix}$$
Rewrite each of these vectors as a scalar multiple of a fixed vector, where the scalar is one of the unknown variables, converting the left-hand side into a linear combination
$$x_1\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + x_2\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + x_3\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ 5 \end{bmatrix}$$
Row-reducing the augmented matrix for Archetype A leads to the conclusion that the system is consistent and has free variables, hence infinitely many solutions. So for example, the two solutions
$$x_1 = 2 \quad x_2 = 3 \quad x_3 = 1 \qquad\qquad x_1 = 3 \quad x_2 = 2 \quad x_3 = 0$$
can be used together to say that,
$$(2)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (3)\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + (1)\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ 5 \end{bmatrix} = (3)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (2)\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + (0)\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}$$
Ignore the middle of this equation, and move all the terms to the left-hand side,
$$(2)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (3)\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + (1)\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} + (-3)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (-2)\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + (-0)\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
Regrouping gives
$$(-1)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (1)\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + (1)\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
Notice that these three vectors are the columns of the coefficient matrix for the system of equations in Archetype A. This equality says there is a linear combination of those columns that equals the vector of all zeros. Give it some thought, but this says that
$$x_1 = -1 \qquad x_2 = 1 \qquad x_3 = 1$$
is a nontrivial solution to the homogeneous system of equations with the coefficient matrix for the original system in Archetype A. In particular, this demonstrates that this coefficient matrix is singular.
△

There is a lot go<strong>in</strong>g on <strong>in</strong> the last two examples. Come back to them <strong>in</strong> a while and<br />

make some connections with the <strong>in</strong>terven<strong>in</strong>g material. For now, we will summarize<br />

and expla<strong>in</strong> some of this behavior with a theorem.<br />

Theorem SLSLC Solutions to L<strong>in</strong>ear Systems are L<strong>in</strong>ear Comb<strong>in</strong>ations<br />

Denote the columns of the $m \times n$ matrix $A$ as the vectors $A_1, A_2, A_3, \ldots, A_n$. Then $x \in \mathbb{C}^n$ is a solution to the linear system of equations LS(A, b) if and only if $b$ equals the linear combination of the columns of $A$ formed with the entries of $x$,
$$[x]_1 A_1 + [x]_2 A_2 + [x]_3 A_3 + \cdots + [x]_n A_n = b$$

Proof. The proof of this theorem is as much about a change in notation as it is about making logical deductions. Write the system of equations LS(A, b) as
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$
Notice then that the entry of the coefficient matrix $A$ in row $i$ and column $j$ has two names: $a_{ij}$ as the coefficient of $x_j$ in equation $i$ of the system, and $[A_j]_i$ as the $i$-th entry of the column vector in column $j$ of the coefficient matrix $A$. Likewise, entry $i$ of $b$ has two names: $b_i$ from the linear system and $[b]_i$ as an entry of a vector. Our theorem is an equivalence (Proof Technique E) so we need to prove both "directions."

(⇐) Suppose we have the vector equality between $b$ and the linear combination of the columns of $A$. Then for $1 \le i \le m$,
$$\begin{aligned}
b_i &= [b]_i && \text{Definition CV} \\
&= [[x]_1 A_1 + [x]_2 A_2 + [x]_3 A_3 + \cdots + [x]_n A_n]_i && \text{Hypothesis} \\
&= [[x]_1 A_1]_i + [[x]_2 A_2]_i + [[x]_3 A_3]_i + \cdots + [[x]_n A_n]_i && \text{Definition CVA} \\
&= [x]_1 [A_1]_i + [x]_2 [A_2]_i + [x]_3 [A_3]_i + \cdots + [x]_n [A_n]_i && \text{Definition CVSM} \\
&= [x]_1 a_{i1} + [x]_2 a_{i2} + [x]_3 a_{i3} + \cdots + [x]_n a_{in} && \text{Definition CV} \\
&= a_{i1}[x]_1 + a_{i2}[x]_2 + a_{i3}[x]_3 + \cdots + a_{in}[x]_n && \text{Property CMCN}
\end{aligned}$$
This says that the entries of $x$ form a solution to equation $i$ of LS(A, b) for all $1 \le i \le m$; in other words, $x$ is a solution to LS(A, b).

(⇒) Suppose now that $x$ is a solution to the linear system LS(A, b). Then for all $1 \le i \le m$,
$$\begin{aligned}
[b]_i &= b_i && \text{Definition CV} \\
&= a_{i1}[x]_1 + a_{i2}[x]_2 + a_{i3}[x]_3 + \cdots + a_{in}[x]_n && \text{Hypothesis} \\
&= [x]_1 a_{i1} + [x]_2 a_{i2} + [x]_3 a_{i3} + \cdots + [x]_n a_{in} && \text{Property CMCN} \\
&= [x]_1 [A_1]_i + [x]_2 [A_2]_i + [x]_3 [A_3]_i + \cdots + [x]_n [A_n]_i && \text{Definition CV} \\
&= [[x]_1 A_1]_i + [[x]_2 A_2]_i + [[x]_3 A_3]_i + \cdots + [[x]_n A_n]_i && \text{Definition CVSM} \\
&= [[x]_1 A_1 + [x]_2 A_2 + [x]_3 A_3 + \cdots + [x]_n A_n]_i && \text{Definition CVA}
\end{aligned}$$
So the entries of the vector $b$, and the entries of the vector that is the linear combination of the columns of $A$, agree for all $1 \le i \le m$. By Definition CVE we see that the two vectors are equal, as desired. ■

In other words, this theorem tells us that solutions to systems of equations are<br />

linear combinations of the $n$ column vectors of the coefficient matrix ($A_j$) which

yield the constant vector b. Or said another way, a solution to a system of equations<br />

LS(A, b) is an answer to the question “How can I form the vector b as a l<strong>in</strong>ear<br />

comb<strong>in</strong>ation of the columns of A?” Look through the archetypes that are systems<br />

of equations and exam<strong>in</strong>e a few of the advertised solutions. In each case use the<br />

solution to form a l<strong>in</strong>ear comb<strong>in</strong>ation of the columns of the coefficient matrix and<br />

verify that the result equals the constant vector (see Exercise LC.C21).<br />
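A few lines of Python/NumPy (our illustration, not part of the text) carry out this verification for Archetype B:

```python
import numpy as np

# Coefficient matrix and constant vector of Archetype B.
A = np.array([[-7, -6, -12],
              [ 5,  5,   7],
              [ 1,  0,   4]])
b = np.array([-33, 24, 5])
x = np.array([-3, 5, 2])          # the unique solution found above

# Theorem SLSLC: b equals the linear combination of the columns of A
# formed with the entries of x.
combo = x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2]
print(np.array_equal(combo, b))   # True
```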

Subsection VFSS<br />

Vector Form of Solution Sets<br />

We have written solutions to systems of equations as column vectors. For example, Archetype B has the solution $x_1 = -3$, $x_2 = 5$, $x_3 = 2$, which we write as
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -3 \\ 5 \\ 2 \end{bmatrix}$$

Now, we will use column vectors and l<strong>in</strong>ear comb<strong>in</strong>ations to express all of the<br />

solutions to a l<strong>in</strong>ear system of equations <strong>in</strong> a compact and understandable way.<br />

<strong>First</strong>, here are two examples that will motivate our next theorem. This is a valuable<br />

technique, almost the equal of row-reduc<strong>in</strong>g a matrix, so be sure you get comfortable<br />

with it over the course of this section.<br />

Example VFSAD Vector form of solutions for Archetype D
Archetype D is a linear system of 3 equations in 4 variables. Row-reducing the augmented matrix yields
$$\begin{bmatrix} 1 & 0 & 3 & -2 & 4 \\ 0 & 1 & 1 & -3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
and we see $r = 2$ pivot columns. Also, $D = \{1, 2\}$ so the dependent variables are then $x_1$ and $x_2$. $F = \{3, 4, 5\}$ so the two free variables are $x_3$ and $x_4$. We will express a generic solution for the system by two slightly different methods, though both arrive at the same conclusion.

First, we will decompose (Proof Technique DC) a solution vector. Rearranging each equation represented in the row-reduced form of the augmented matrix by solving for the dependent variable in each row yields the vector equality,
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 - 3x_3 + 2x_4 \\ -x_3 + 3x_4 \\ x_3 \\ x_4 \end{bmatrix}$$
Now we will use the definitions of column vector addition and scalar multiplication to express this vector as a linear combination,
$$= \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -3x_3 \\ -x_3 \\ x_3 \\ 0 \end{bmatrix} + \begin{bmatrix} 2x_4 \\ 3x_4 \\ 0 \\ x_4 \end{bmatrix} \qquad \text{Definition CVA}$$
$$= \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + x_4\begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix} \qquad \text{Definition CVSM}$$
We will develop the same linear combination a bit quicker, using three steps. While the method above is instructive, the method below will be our preferred approach.

Step 1. Write the vector of variables as a fixed vector, plus a linear combination of $n - r$ vectors, using the free variables as the scalars.
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix} + x_3\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix} + x_4\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}$$
Step 2. Use 0's and 1's to ensure equality for the entries of the vectors with indices in $F$ (corresponding to the free variables).
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 1 \\ 0 \end{bmatrix} + x_4\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 1 \end{bmatrix}$$
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. Convert this equation into entries of the vectors that ensure equality for each dependent variable, one at a time.
$$x_1 = 4 - 3x_3 + 2x_4 \;\Rightarrow\; x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 \\ \phantom{0} \\ 0 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} -3 \\ \phantom{0} \\ 1 \\ 0 \end{bmatrix} + x_4\begin{bmatrix} 2 \\ \phantom{0} \\ 0 \\ 1 \end{bmatrix}$$
$$x_2 = 0 - 1x_3 + 3x_4 \;\Rightarrow\; x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + x_4\begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix}$$
This final form of a typical solution is especially pleasing and useful. For example, we can build solutions quickly by choosing values for our free variables, and then compute a linear combination. Such as
$$x_3 = 2,\ x_4 = -5 \;\Rightarrow\; x = \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix} + (2)\begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + (-5)\begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -12 \\ -17 \\ 2 \\ -5 \end{bmatrix}$$
or,
$$x_3 = 1,\ x_4 = 3 \;\Rightarrow\; x = \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix} + (1)\begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + (3)\begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 7 \\ 8 \\ 1 \\ 3 \end{bmatrix}$$
You will find the second solution listed in the write-up for Archetype D, and you might check the first solution by substituting it back into the original equations.

While this form is useful for quickly creating solutions, it is even better because it tells us exactly what every solution looks like. We know the solution set is infinite, which is pretty big, but now we can say that a solution is some multiple of $\begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix}$ plus a multiple of $\begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix}$ plus the fixed vector $\begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix}$. Period. So it only takes us three vectors to describe the entire infinite solution set, provided we also agree on how to combine the three vectors into a linear combination.
△
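As a sanity check, a short Python/NumPy sketch (our illustration, not part of the text) confirms that the vectors built from this vector form really do satisfy the row-reduced system given above:

```python
import numpy as np

# Vector form of solutions for Archetype D, as derived in this example.
c  = np.array([4, 0, 0, 0])       # fixed vector
u1 = np.array([-3, -1, 1, 0])     # attached to free variable x3
u2 = np.array([2, 3, 0, 1])       # attached to free variable x4

# The row-reduced augmented matrix from this example.
B = np.array([[1, 0, 3, -2, 4],
              [0, 1, 1, -3, 0],
              [0, 0, 0, 0, 0]])

# Any choice of free variables gives a solution; try the two from the text.
for x3, x4 in [(2, -5), (1, 3)]:
    x = c + x3*u1 + x4*u2
    print(x, np.array_equal(B[:, :4] @ x, B[:, 4]))
# [-12 -17   2  -5] True
# [  7   8   1   3] True
```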

This is such an important and fundamental technique, we will do another example.<br />

Example VFS Vector form of solutions
Consider a linear system of $m = 5$ equations in $n = 7$ variables, having the augmented matrix $A$.
$$A = \begin{bmatrix}
2 & 1 & -1 & -2 & 2 & 1 & 5 & 21 \\
1 & 1 & -3 & 1 & 1 & 1 & 2 & -5 \\
1 & 2 & -8 & 5 & 1 & 1 & -6 & -15 \\
3 & 3 & -9 & 3 & 6 & 5 & 2 & -24 \\
-2 & -1 & 1 & 2 & 1 & 1 & -9 & -30
\end{bmatrix}$$
Row-reducing we obtain the matrix
$$B = \begin{bmatrix}
1 & 0 & 2 & -3 & 0 & 0 & 9 & 15 \\
0 & 1 & -5 & 4 & 0 & 0 & -8 & -10 \\
0 & 0 & 0 & 0 & 1 & 0 & -6 & 11 \\
0 & 0 & 0 & 0 & 0 & 1 & 7 & -21 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
and we see $r = 4$ pivot columns. Also, $D = \{1, 2, 5, 6\}$ so the dependent variables are then $x_1, x_2, x_5,$ and $x_6$. $F = \{3, 4, 7, 8\}$ so the $n - r = 3$ free variables are $x_3, x_4$ and $x_7$. We will express a generic solution for the system by two different methods: both a decomposition and a construction.

First, we will decompose (Proof Technique DC) a solution vector. Rearranging each equation represented in the row-reduced form of the augmented matrix by solving for the dependent variable in each row yields the vector equality,
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} 15 - 2x_3 + 3x_4 - 9x_7 \\ -10 + 5x_3 - 4x_4 + 8x_7 \\ x_3 \\ x_4 \\ 11 + 6x_7 \\ -21 - 7x_7 \\ x_7 \end{bmatrix}$$
Now we will use the definitions of column vector addition and scalar multiplication to decompose this generic solution vector as a linear combination,
$$= \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}
+ \begin{bmatrix} -2x_3 \\ 5x_3 \\ x_3 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ \begin{bmatrix} 3x_4 \\ -4x_4 \\ 0 \\ x_4 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ \begin{bmatrix} -9x_7 \\ 8x_7 \\ 0 \\ 0 \\ 6x_7 \\ -7x_7 \\ x_7 \end{bmatrix} \qquad \text{Definition CVA}$$
$$= \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}
+ x_3\begin{bmatrix} -2 \\ 5 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_4\begin{bmatrix} 3 \\ -4 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_7\begin{bmatrix} -9 \\ 8 \\ 0 \\ 0 \\ 6 \\ -7 \\ 1 \end{bmatrix} \qquad \text{Definition CVSM}$$
We will now develop the same linear combination a bit quicker, using three steps. While the method above is instructive, the method below will be our preferred approach.

Step 1. Write the vector of variables as a fixed vector, plus a linear combination of $n - r$ vectors, using the free variables as the scalars.
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_3\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_4\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_7\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}$$
Step 2. Use 0's and 1's to ensure equality for the entries of the vectors with indices in $F$ (corresponding to the free variables).
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 0 \end{bmatrix}
+ x_3\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 1 \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 0 \end{bmatrix}
+ x_4\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 1 \\ \phantom{0} \\ \phantom{0} \\ 0 \end{bmatrix}
+ x_7\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 1 \end{bmatrix}$$
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. From the row-reduced matrix we read
$$x_1 = 15 - 2x_3 + 3x_4 - 9x_7 \qquad x_2 = -10 + 5x_3 - 4x_4 + 8x_7 \qquad x_5 = 11 + 6x_7 \qquad x_6 = -21 - 7x_7$$
Converting each equation into entries of the vectors, one dependent variable at a time, completes the fixed vector and the three others:
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}
+ x_3\begin{bmatrix} -2 \\ 5 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_4\begin{bmatrix} 3 \\ -4 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_7\begin{bmatrix} -9 \\ 8 \\ 0 \\ 0 \\ 6 \\ -7 \\ 1 \end{bmatrix}$$
This final form of a typical solution is especially pleasing and useful. For example, we can build solutions quickly by choosing values for our free variables, and then compute a linear combination. For example
$$x_3 = 2,\ x_4 = -4,\ x_7 = 3 \;\Rightarrow\;
x = \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}
+ (2)\begin{bmatrix} -2 \\ 5 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ (-4)\begin{bmatrix} 3 \\ -4 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ (3)\begin{bmatrix} -9 \\ 8 \\ 0 \\ 0 \\ 6 \\ -7 \\ 1 \end{bmatrix}
= \begin{bmatrix} -28 \\ 40 \\ 2 \\ -4 \\ 29 \\ -42 \\ 3 \end{bmatrix}$$
or perhaps,
$$x_3 = 5,\ x_4 = 2,\ x_7 = 1 \;\Rightarrow\;
x = \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}
+ (5)\begin{bmatrix} -2 \\ 5 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ (2)\begin{bmatrix} 3 \\ -4 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ (1)\begin{bmatrix} -9 \\ 8 \\ 0 \\ 0 \\ 6 \\ -7 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2 \\ 15 \\ 5 \\ 2 \\ 17 \\ -28 \\ 1 \end{bmatrix}$$
or even,
$$x_3 = 0,\ x_4 = 0,\ x_7 = 0 \;\Rightarrow\;
x = \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}
+ (0)\begin{bmatrix} -2 \\ 5 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ (0)\begin{bmatrix} 3 \\ -4 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ (0)\begin{bmatrix} -9 \\ 8 \\ 0 \\ 0 \\ 6 \\ -7 \\ 1 \end{bmatrix}
= \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix}$$
So we can compactly express all of the solutions to this linear system with just 4 fixed vectors, provided we agree how to combine them in a linear combination to create solution vectors.

Suppose you were told that the vector $w$ below was a solution to this system of equations. Could you turn the problem around and write $w$ as a linear combination of the four vectors $c$, $u_1$, $u_2$, $u_3$? (See Exercise LC.M11.)
$$w = \begin{bmatrix} 100 \\ -75 \\ 7 \\ 9 \\ -37 \\ 35 \\ -8 \end{bmatrix} \qquad
c = \begin{bmatrix} 15 \\ -10 \\ 0 \\ 0 \\ 11 \\ -21 \\ 0 \end{bmatrix} \qquad
u_1 = \begin{bmatrix} -2 \\ 5 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \qquad
u_2 = \begin{bmatrix} 3 \\ -4 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \qquad
u_3 = \begin{bmatrix} -9 \\ 8 \\ 0 \\ 0 \\ 6 \\ -7 \\ 1 \end{bmatrix}$$
Did you think a few weeks ago that you could so quickly and easily list all the solutions to a linear system of 5 equations in 7 variables?
△

We will now formalize the last two (important) examples as a theorem. The<br />

statement of this theorem is a bit scary, and the proof is scarier. For now, be sure to<br />

convince yourself, by working through the examples and exercises, that the statement

just describes the procedure of the two immediately previous examples.<br />

Theorem VFSLS Vector Form of Solutions to Linear Systems
Suppose that $[A \mid b]$ is the augmented matrix for a consistent linear system LS(A, b) of $m$ equations in $n$ variables. Let $B$ be a row-equivalent $m \times (n + 1)$ matrix in reduced row-echelon form. Suppose that $B$ has $r$ pivot columns, with indices $D = \{d_1, d_2, d_3, \ldots, d_r\}$, while the $n - r$ non-pivot columns have indices in $F = \{f_1, f_2, f_3, \ldots, f_{n-r}, n + 1\}$. Define vectors $c$, $u_j$, $1 \le j \le n - r$ of size $n$ by
$$[c]_i = \begin{cases} 0 & \text{if } i \in F \\ [B]_{k, n+1} & \text{if } i \in D,\ i = d_k \end{cases}$$
$$[u_j]_i = \begin{cases} 1 & \text{if } i \in F,\ i = f_j \\ 0 & \text{if } i \in F,\ i \ne f_j \\ -[B]_{k, f_j} & \text{if } i \in D,\ i = d_k \end{cases}$$
Then the set of solutions to the system of equations LS(A, b) is
$$S = \{c + \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_{n-r} u_{n-r} \mid \alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{n-r} \in \mathbb{C}\}$$

Proof. First, LS(A, b) is equivalent to the linear system of equations that has the matrix $B$ as its augmented matrix (Theorem REMES), so we need only show that $S$ is the solution set for the system with $B$ as its augmented matrix. The conclusion of this theorem is that the solution set is equal to the set $S$, so we will apply Definition SE.

We begin by showing that every element of $S$ is indeed a solution to the system. Let $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{n-r}$ be one choice of the scalars used to describe elements of $S$. So an arbitrary element of $S$, which we will consider as a proposed solution, is
$$x = c + \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_{n-r} u_{n-r}$$
When $r + 1 \le \ell \le m$, row $\ell$ of the matrix $B$ is a zero row, so the equation represented by that row is always true, no matter which solution vector we propose. So concentrate on rows representing equations $1 \le \ell \le r$. We evaluate equation $\ell$ of the system represented by $B$ with the proposed solution vector $x$ and refer to the value of the left-hand side of the equation as $\beta_\ell$,
$$\beta_\ell = [B]_{\ell 1}[x]_1 + [B]_{\ell 2}[x]_2 + [B]_{\ell 3}[x]_3 + \cdots + [B]_{\ell n}[x]_n$$
Since $[B]_{\ell d_i} = 0$ for all $1 \le i \le r$, except that $[B]_{\ell d_\ell} = 1$, we see that $\beta_\ell$ simplifies to
$$\beta_\ell = [x]_{d_\ell} + [B]_{\ell f_1}[x]_{f_1} + [B]_{\ell f_2}[x]_{f_2} + [B]_{\ell f_3}[x]_{f_3} + \cdots + [B]_{\ell f_{n-r}}[x]_{f_{n-r}}$$
Notice that for $1 \le i \le n - r$,
$$\begin{aligned}
[x]_{f_i} &= [c]_{f_i} + \alpha_1[u_1]_{f_i} + \alpha_2[u_2]_{f_i} + \cdots + \alpha_i[u_i]_{f_i} + \cdots + \alpha_{n-r}[u_{n-r}]_{f_i} \\
&= 0 + \alpha_1(0) + \alpha_2(0) + \cdots + \alpha_i(1) + \cdots + \alpha_{n-r}(0) \\
&= \alpha_i
\end{aligned}$$
So $\beta_\ell$ simplifies further, and we expand the first term,
$$\begin{aligned}
\beta_\ell &= [x]_{d_\ell} + [B]_{\ell f_1}\alpha_1 + [B]_{\ell f_2}\alpha_2 + [B]_{\ell f_3}\alpha_3 + \cdots + [B]_{\ell f_{n-r}}\alpha_{n-r} \\
&= [c + \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_{n-r} u_{n-r}]_{d_\ell} + [B]_{\ell f_1}\alpha_1 + [B]_{\ell f_2}\alpha_2 + [B]_{\ell f_3}\alpha_3 + \cdots + [B]_{\ell f_{n-r}}\alpha_{n-r} \\
&= [c]_{d_\ell} + \alpha_1[u_1]_{d_\ell} + \alpha_2[u_2]_{d_\ell} + \alpha_3[u_3]_{d_\ell} + \cdots + \alpha_{n-r}[u_{n-r}]_{d_\ell} + [B]_{\ell f_1}\alpha_1 + [B]_{\ell f_2}\alpha_2 + [B]_{\ell f_3}\alpha_3 + \cdots + [B]_{\ell f_{n-r}}\alpha_{n-r} \\
&= [B]_{\ell, n+1} + \alpha_1(-[B]_{\ell f_1}) + \alpha_2(-[B]_{\ell f_2}) + \alpha_3(-[B]_{\ell f_3}) + \cdots + \alpha_{n-r}(-[B]_{\ell f_{n-r}}) + [B]_{\ell f_1}\alpha_1 + [B]_{\ell f_2}\alpha_2 + [B]_{\ell f_3}\alpha_3 + \cdots + [B]_{\ell f_{n-r}}\alpha_{n-r} \\
&= [B]_{\ell, n+1}
\end{aligned}$$
So $\beta_\ell$ began as the left-hand side of equation $\ell$ of the system represented by $B$ and we now know it equals $[B]_{\ell, n+1}$, the constant term for equation $\ell$ of this system. So the arbitrarily chosen vector from $S$ makes every equation of the system true, and therefore is a solution to the system. So all the elements of $S$ are solutions to the system.

For the second half of the proof, assume that $x$ is a solution vector for the system having $B$ as its augmented matrix. For convenience and clarity, denote the entries of $x$ by $x_i$; in other words, $x_i = [x]_i$. We desire to show that this solution vector is also an element of the set $S$. Begin with the observation that a solution vector's entries make equation $\ell$ of the system true for all $1 \le \ell \le m$,
$$[B]_{\ell, 1}x_1 + [B]_{\ell, 2}x_2 + [B]_{\ell, 3}x_3 + \cdots + [B]_{\ell, n}x_n = [B]_{\ell, n+1}$$
When $\ell \le r$, the pivot columns of $B$ have zero entries in row $\ell$ with the exception of column $d_\ell$, which will contain a 1. So for $1 \le \ell \le r$, equation $\ell$ simplifies to
$$1x_{d_\ell} + [B]_{\ell, f_1}x_{f_1} + [B]_{\ell, f_2}x_{f_2} + [B]_{\ell, f_3}x_{f_3} + \cdots + [B]_{\ell, f_{n-r}}x_{f_{n-r}} = [B]_{\ell, n+1}$$
This allows us to write,
$$\begin{aligned}
[x]_{d_\ell} &= x_{d_\ell} \\
&= [B]_{\ell, n+1} - [B]_{\ell, f_1}x_{f_1} - [B]_{\ell, f_2}x_{f_2} - [B]_{\ell, f_3}x_{f_3} - \cdots - [B]_{\ell, f_{n-r}}x_{f_{n-r}} \\
&= [c]_{d_\ell} + x_{f_1}[u_1]_{d_\ell} + x_{f_2}[u_2]_{d_\ell} + x_{f_3}[u_3]_{d_\ell} + \cdots + x_{f_{n-r}}[u_{n-r}]_{d_\ell} \\
&= [c + x_{f_1}u_1 + x_{f_2}u_2 + x_{f_3}u_3 + \cdots + x_{f_{n-r}}u_{n-r}]_{d_\ell}
\end{aligned}$$
This tells us that the entries of the solution vector $x$ corresponding to dependent variables (indices in $D$) are equal to those of a vector in the set $S$. We still need to check the other entries of the solution vector $x$ corresponding to the free variables (indices in $F$) to see if they are equal to the entries of the same vector in the set $S$. To this end, suppose $i \in F$ and $i = f_j$. Then
$$\begin{aligned}
[x]_i &= x_i = x_{f_j} \\
&= 0 + 0x_{f_1} + 0x_{f_2} + 0x_{f_3} + \cdots + 0x_{f_{j-1}} + 1x_{f_j} + 0x_{f_{j+1}} + \cdots + 0x_{f_{n-r}} \\
&= [c]_i + x_{f_1}[u_1]_i + x_{f_2}[u_2]_i + x_{f_3}[u_3]_i + \cdots + x_{f_j}[u_j]_i + \cdots + x_{f_{n-r}}[u_{n-r}]_i \\
&= [c + x_{f_1}u_1 + x_{f_2}u_2 + \cdots + x_{f_{n-r}}u_{n-r}]_i
\end{aligned}$$
So entries of $x$ and $c + x_{f_1}u_1 + x_{f_2}u_2 + \cdots + x_{f_{n-r}}u_{n-r}$ are equal and therefore by Definition CVE they are equal vectors. Since $x_{f_1}, x_{f_2}, x_{f_3}, \ldots, x_{f_{n-r}}$ are scalars, this shows us that $x$ qualifies for membership in $S$. So the set $S$ contains all of the solutions to the system. ■

Note that both halves of the proof of Theorem VFSLS indicate that $\alpha_i = [x]_{f_i}$. In other words, the arbitrary scalars, $\alpha_i$, in the description of the set $S$ actually have more meaning — they are the values of the free variables $[x]_{f_i}$, $1 \le i \le n - r$. So we will often exploit this observation in our descriptions of solution sets.

Theorem VFSLS formalizes what happened <strong>in</strong> the three steps of Example VFSAD.<br />

The theorem will be useful in proving other theorems, and it is useful since it tells

us an exact procedure for simply describ<strong>in</strong>g an <strong>in</strong>f<strong>in</strong>ite solution set. We could program<br />

a computer to implement it, once we have the augmented matrix row-reduced and<br />

have checked that the system is consistent. By Knuth’s def<strong>in</strong>ition, this completes<br />

our conversion of l<strong>in</strong>ear equation solv<strong>in</strong>g from art <strong>in</strong>to science. Notice that it even<br />

applies (but is overkill) <strong>in</strong> the case of a unique solution. However, as a practical<br />

matter, I prefer the three-step process of Example VFSAD when I need to describe<br />

an <strong>in</strong>f<strong>in</strong>ite solution set. So let us practice some more, but with a bigger example.<br />
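Before we practice, here is one possible sketch of the computer implementation just mentioned (Python with NumPy; the function name `vector_form` and all implementation details are our own, and we assume the input is the reduced row-echelon form of a consistent system):

```python
import numpy as np

def vector_form(B):
    """Vector form of solutions (Theorem VFSLS).

    B is the augmented matrix of a consistent system, already in reduced
    row-echelon form.  Returns (c, U): every solution is c plus a linear
    combination of the columns of U, and every such vector is a solution.
    """
    m, cols = B.shape
    n = cols - 1                                  # number of variables
    # Pivot columns D (first nonzero entry of each nonzero row) and
    # non-pivot variable columns F.
    D = [int(np.flatnonzero(B[k])[0]) for k in range(m) if B[k].any()]
    F = [j for j in range(n) if j not in D]

    c = np.zeros(n)
    for k, d in enumerate(D):
        c[d] = B[k, n]                            # [c]_{d_k} = [B]_{k, n+1}

    U = np.zeros((n, len(F)))
    for j, f in enumerate(F):
        U[f, j] = 1                               # 1 in free slot f_j
        for k, d in enumerate(D):
            U[d, j] = -B[k, f]                    # -[B]_{k, f_j} in slot d_k
    return c, U

# Archetype D's row-reduced augmented matrix, from Example VFSAD.
B = np.array([[1., 0., 3., -2., 4.],
              [0., 1., 1., -3., 0.],
              [0., 0., 0., 0., 0.]])
c, U = vector_form(B)
print(c)      # [4. 0. 0. 0.]
print(U.T)    # rows here are u_1 = [-3 -1 1 0] and u_2 = [2 3 0 1]
```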

Example VFSAI Vector form of solutions for Archetype I
Archetype I is a linear system of $m = 4$ equations in $n = 7$ variables. Row-reducing the augmented matrix yields
$$\begin{bmatrix}
1 & 4 & 0 & 0 & 2 & 1 & -3 & 4 \\
0 & 0 & 1 & 0 & 1 & -3 & 5 & 2 \\
0 & 0 & 0 & 1 & 2 & -6 & 6 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}$$
and we see $r = 3$ pivot columns, with indices $D = \{1, 3, 4\}$. So the $r = 3$ dependent variables are $x_1, x_3, x_4$. The non-pivot columns have indices in $F = \{2, 5, 6, 7, 8\}$, so the $n - r = 4$ free variables are $x_2, x_5, x_6, x_7$.

Step 1. Write the vector of variables ($x$) as a fixed vector ($c$), plus a linear combination of $n - r = 4$ vectors ($u_1, u_2, u_3, u_4$), using the free variables as the scalars.
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_2\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_5\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_6\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}
+ x_7\begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ \phantom{0} \end{bmatrix}$$
Step 2. For each free variable, use 0's and 1's to ensure equality for the corresponding entry of the vectors. Take note of the pattern of 0's and 1's at this stage, because this is the best look you will have at it. We will state an important theorem in the next section and the proof will essentially rely on this observation.
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} \phantom{0} \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_2\begin{bmatrix} \phantom{0} \\ 1 \\ \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_5\begin{bmatrix} \phantom{0} \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ x_6\begin{bmatrix} \phantom{0} \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ x_7\begin{bmatrix} \phantom{0} \\ 0 \\ \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ 1 \end{bmatrix}$$
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. From the row-reduced matrix we read
$$x_1 = 4 - 4x_2 - 2x_5 - 1x_6 + 3x_7 \qquad x_3 = 2 + 0x_2 - x_5 + 3x_6 - 5x_7 \qquad x_4 = 1 + 0x_2 - 2x_5 + 6x_6 - 6x_7$$
Converting each equation into entries of the vectors, one dependent variable at a time, completes the vectors:
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \\ x_7 \end{bmatrix}
= \begin{bmatrix} 4 \\ 0 \\ 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_2\begin{bmatrix} -4 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ x_5\begin{bmatrix} -2 \\ 0 \\ -1 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ x_6\begin{bmatrix} -1 \\ 0 \\ 3 \\ 6 \\ 0 \\ 1 \\ 0 \end{bmatrix}
+ x_7\begin{bmatrix} 3 \\ 0 \\ -5 \\ -6 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$
We can now use this final expression to quickly build solutions to the system. You might try to recreate each of the solutions listed in the write-up for Archetype I. (Hint: look at the values of the free variables in each solution, and notice that the vector $c$ has 0's in these locations.)

Even better, we have a description of the infinite solution set, based on just 5 vectors, which we combine in linear combinations to produce solutions.

Whenever we discuss Archetype I you know that is your cue to go work through Archetype J by yourself. Remember to take note of the 0/1 pattern at the conclusion of Step 2. Have fun — we won't go anywhere while you're away.
△

This technique is so important, that we will do one more example. However, an<br />

important dist<strong>in</strong>ction will be that this system is homogeneous.<br />

Example VFSAL Vector form of solutions for Archetype L<br />

Archetype L is presented simply as the 5 × 5 matrix<br />

⎡<br />

⎤<br />

−2 −1 −2 −4 4<br />

−6 −5 −4 −4 6<br />

L = ⎢ 10 7 7 10 −13⎥<br />

⎣−7 −5 −6 −9 10 ⎦<br />

−4 −3 −4 −6 6<br />

We will employ this matrix here as the coefficient matrix of a homogeneous<br />

system and reference this matrix as L. So we are solv<strong>in</strong>g the homogeneous system<br />

LS(L, 0) hav<strong>in</strong>g m = 5 equations <strong>in</strong> n = 5 variables. If we built the augmented<br />

matrix, we would add a sixth column to L conta<strong>in</strong><strong>in</strong>g all zeros. As we did row<br />

operations, this sixth column would rema<strong>in</strong> all zeros. So <strong>in</strong>stead we will row-reduce<br />

the coefficient matrix, and mentally remember the miss<strong>in</strong>g sixth column of zeros.<br />

This row-reduced matrix is ⎡<br />

⎢<br />

1 0 0 1 −2<br />

⎤<br />

⎢⎢⎢⎣ 0 1 0 −2 2<br />

0 0 1 2 −1⎥<br />

0 0 0 0 0<br />

⎦<br />

0 0 0 0 0<br />

and we see r = 3 pivot columns, with <strong>in</strong>dices D = {1, 2, 3}. Sother =3dependent<br />

variables are x 1 ,x 2 ,x 3 . The non-pivot columns have <strong>in</strong>dices F = {4, 5}, sothe<br />

n − r = 2 free variables are x 4 ,x 5 . Notice that if we had <strong>in</strong>cluded the all-zero vector<br />

of constants to form the augmented matrix for the system, then the <strong>in</strong>dex 6 would<br />

have appeared <strong>in</strong> the set F , and subsequently would have been ignored when list<strong>in</strong>g<br />

the free variables. So noth<strong>in</strong>g is lost by not creat<strong>in</strong>g an augmented matrix (<strong>in</strong> the<br />

case of a homogenous system). And maybe it is an improvement, s<strong>in</strong>ce now every<br />

<strong>in</strong>dex <strong>in</strong> F can be used to reference a variable of the l<strong>in</strong>ear system.<br />

Step 1. Write the vector of variables (x) as a fixed vector (c), plus a linear combination of n − r = 2 vectors (u_1, u_2), using the free variables as the scalars.

x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
  = \begin{bmatrix} \ \\ \ \\ \ \\ \ \\ \ \end{bmatrix}
  + x_4 \begin{bmatrix} \ \\ \ \\ \ \\ \ \\ \ \end{bmatrix}
  + x_5 \begin{bmatrix} \ \\ \ \\ \ \\ \ \\ \ \end{bmatrix}

Step 2. For each free variable, use 0's and 1's to ensure equality for the corresponding entry of the vectors. Take note of the pattern of 0's and 1's at this stage, even if it is not as illuminating as in other examples.

x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
  = \begin{bmatrix} \ \\ \ \\ \ \\ 0 \\ 0 \end{bmatrix}
  + x_4 \begin{bmatrix} \ \\ \ \\ \ \\ 1 \\ 0 \end{bmatrix}
  + x_5 \begin{bmatrix} \ \\ \ \\ \ \\ 0 \\ 1 \end{bmatrix}

Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. Do not forget about the "missing" sixth column being full of zeros. Convert this equation into entries of the vectors that ensure equality for each dependent variable, one at a time.

x_1 = 0 − 1x_4 + 2x_5 ⇒
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
  = \begin{bmatrix} 0 \\ \ \\ \ \\ 0 \\ 0 \end{bmatrix}
  + x_4 \begin{bmatrix} -1 \\ \ \\ \ \\ 1 \\ 0 \end{bmatrix}
  + x_5 \begin{bmatrix} 2 \\ \ \\ \ \\ 0 \\ 1 \end{bmatrix}

x_2 = 0 + 2x_4 − 2x_5 ⇒
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
  = \begin{bmatrix} 0 \\ 0 \\ \ \\ 0 \\ 0 \end{bmatrix}
  + x_4 \begin{bmatrix} -1 \\ 2 \\ \ \\ 1 \\ 0 \end{bmatrix}
  + x_5 \begin{bmatrix} 2 \\ -2 \\ \ \\ 0 \\ 1 \end{bmatrix}

x_3 = 0 − 2x_4 + 1x_5 ⇒
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
  = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
  + x_4 \begin{bmatrix} -1 \\ 2 \\ -2 \\ 1 \\ 0 \end{bmatrix}
  + x_5 \begin{bmatrix} 2 \\ -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}

The vector c will always have 0's in the entries corresponding to free variables. However, since we are solving a homogeneous system, the row-reduced augmented matrix has zeros in column n + 1 = 6, and hence all the entries of c are zero. So we can write

x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
  = 0 + x_4 \begin{bmatrix} -1 \\ 2 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} 2 \\ -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}
  = x_4 \begin{bmatrix} -1 \\ 2 \\ -2 \\ 1 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} 2 \\ -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}

It will always happen that the solutions to a homogeneous system have c = 0 (even in the case of a unique solution?). So our expression for the solutions is a bit more pleasing. In this example it says that the solutions are all possible linear combinations of the two vectors

u_1 = \begin{bmatrix} -1 \\ 2 \\ -2 \\ 1 \\ 0 \end{bmatrix}   u_2 = \begin{bmatrix} 2 \\ -2 \\ 1 \\ 0 \\ 1 \end{bmatrix}

with no mention of any fixed vector entering into the linear combination.

This observation will motivate our next section and the main definition of that section, and after that we will conclude the section by formalizing this situation. △
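If you are following along with software, this example is easy to check mechanically. Below is a minimal sketch of ours (an illustration, not part of Archetype L's write-up), assuming SymPy is available: it row-reduces L and confirms that u_1 and u_2 really solve LS(L, 0).

from sympy import Matrix

# Archetype L, as given above.
L = Matrix([
    [-2, -1, -2, -4,   4],
    [-6, -5, -4, -4,   6],
    [10,  7,  7, 10, -13],
    [-7, -5, -6, -9,  10],
    [-4, -3, -4, -6,   6],
])

# Row-reduce; SymPy reports pivot columns 0-based, so (0, 1, 2)
# corresponds to D = {1, 2, 3} and F = {4, 5}.
R, pivots = L.rref()
print(pivots)                             # (0, 1, 2)

# The two vectors from the final linear combination above.
u1 = Matrix([-1, 2, -2, 1, 0])
u2 = Matrix([2, -2, 1, 0, 1])
print(L * u1 == Matrix.zeros(5, 1))       # True
print(L * u2 == Matrix.zeros(5, 1))       # True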

Subsection PSHS
Particular Solutions, Homogeneous Solutions

The next theorem tells us that in order to find all of the solutions to a linear system of equations, it is sufficient to find just one solution, and then find all of the solutions to the corresponding homogeneous system. This explains part of our interest in the null space, the set of all solutions to a homogeneous system.

Theorem PSPHS Particular Solution Plus Homogeneous Solutions
Suppose that w is one solution to the linear system of equations LS(A, b). Then y is a solution to LS(A, b) if and only if y = w + z for some vector z ∈ N(A).

Proof. Let A_1, A_2, A_3, ..., A_n be the columns of the coefficient matrix A.
(⇐) Suppose y = w + z and z ∈ N(A). Then

\begin{align*}
b &= [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n && \text{Theorem SLSLC} \\
  &= [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n + 0 && \text{Property ZC} \\
  &= [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n \\
  &\quad + [z]_1 A_1 + [z]_2 A_2 + [z]_3 A_3 + \cdots + [z]_n A_n && \text{Theorem SLSLC} \\
  &= ([w]_1 + [z]_1) A_1 + ([w]_2 + [z]_2) A_2 \\
  &\quad + ([w]_3 + [z]_3) A_3 + \cdots + ([w]_n + [z]_n) A_n && \text{Theorem VSPCV} \\
  &= [w + z]_1 A_1 + [w + z]_2 A_2 + \cdots + [w + z]_n A_n && \text{Definition CVA} \\
  &= [y]_1 A_1 + [y]_2 A_2 + [y]_3 A_3 + \cdots + [y]_n A_n && \text{Definition of } y
\end{align*}

Applying Theorem SLSLC we see that the vector y is a solution to LS(A, b).


(⇒) Suppose y is a solution to LS(A, b). Then

\begin{align*}
0 &= b - b \\
  &= [y]_1 A_1 + [y]_2 A_2 + [y]_3 A_3 + \cdots + [y]_n A_n \\
  &\quad - ([w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n) && \text{Theorem SLSLC} \\
  &= ([y]_1 - [w]_1) A_1 + ([y]_2 - [w]_2) A_2 \\
  &\quad + ([y]_3 - [w]_3) A_3 + \cdots + ([y]_n - [w]_n) A_n && \text{Theorem VSPCV} \\
  &= [y - w]_1 A_1 + [y - w]_2 A_2 \\
  &\quad + [y - w]_3 A_3 + \cdots + [y - w]_n A_n && \text{Definition CVA}
\end{align*}

By Theorem SLSLC we see that the vector y − w is a solution to the homogeneous system LS(A, 0) and by Definition NSM, y − w ∈ N(A). In other words, y − w = z for some vector z ∈ N(A). Rewritten, this is y = w + z, as desired. ∎

After proving Theorem NMUS we commented (insufficiently) on the negation of one half of the theorem. Nonsingular coefficient matrices lead to unique solutions for every choice of the vector of constants. What does this say about singular matrices? A singular matrix A has a nontrivial null space (Theorem NMTNS). For a given vector of constants, b, the system LS(A, b) could be inconsistent, meaning there are no solutions. But if there is at least one solution (w), then Theorem PSPHS tells us there will be infinitely many solutions because of the role of the infinite null space for a singular matrix. So a system of equations with a singular coefficient matrix never has a unique solution. Notice that this is the contrapositive of the statement in Exercise NM.T31. With a singular coefficient matrix, either there are no solutions, or infinitely many solutions, depending on the choice of the vector of constants (b).

Example PSHS Particular solutions, homogeneous solutions, Archetype D
Archetype D is a consistent system of equations with a nontrivial null space. Let A denote the coefficient matrix of this system. The write-up for this system begins with three solutions,

y_1 = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix}   y_2 = \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix}   y_3 = \begin{bmatrix} 7 \\ 8 \\ 1 \\ 3 \end{bmatrix}

We will choose to have y_1 play the role of w in the statement of Theorem PSPHS; any one of the three vectors listed here (or others) could have been chosen. To illustrate the theorem, we should be able to write each of these three solutions as the vector w plus a solution to the corresponding homogeneous system of equations. Since 0 is always a solution to a homogeneous system we can easily write

y_1 = w = w + 0.

The vectors y_2 and y_3 will require a bit more effort. Solutions to the homogeneous system LS(A, 0) are exactly the elements of the null space of the coefficient matrix, which by an application of Theorem VFSLS is

N(A) = \left\{ x_3 \begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix} \,\middle|\, x_3, x_4 \in \mathbb{C} \right\}

Then

y_2 = \begin{bmatrix} 4 \\ 0 \\ 0 \\ 0 \end{bmatrix}
    = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 4 \\ -1 \\ -2 \\ -1 \end{bmatrix}
    = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix} + \left( (-2) \begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + (-1) \begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix} \right)
    = w + z_2

where

z_2 = \begin{bmatrix} 4 \\ -1 \\ -2 \\ -1 \end{bmatrix} = (-2) \begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + (-1) \begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix}

is obviously a solution of the homogeneous system since it is written as a linear combination of the vectors describing the null space of the coefficient matrix (or as a check, you could just evaluate the equations in the homogeneous system with z_2).

Again,

y_3 = \begin{bmatrix} 7 \\ 8 \\ 1 \\ 3 \end{bmatrix}
    = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 7 \\ 7 \\ -1 \\ 2 \end{bmatrix}
    = \begin{bmatrix} 0 \\ 1 \\ 2 \\ 1 \end{bmatrix} + \left( (-1) \begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + 2 \begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix} \right)
    = w + z_3

where

z_3 = \begin{bmatrix} 7 \\ 7 \\ -1 \\ 2 \end{bmatrix} = (-1) \begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix} + 2 \begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix}

is obviously a solution of the homogeneous system since it is written as a linear combination of the vectors describing the null space of the coefficient matrix (or as a check, you could just evaluate the equations in the homogeneous system with z_3).

Here is another view of this theorem, in the context of this example. Grab two new solutions of the original system of equations, say

y_4 = \begin{bmatrix} 11 \\ 0 \\ -3 \\ -1 \end{bmatrix}   y_5 = \begin{bmatrix} -4 \\ 2 \\ 4 \\ 2 \end{bmatrix}

and form their difference,

u = \begin{bmatrix} 11 \\ 0 \\ -3 \\ -1 \end{bmatrix} - \begin{bmatrix} -4 \\ 2 \\ 4 \\ 2 \end{bmatrix} = \begin{bmatrix} 15 \\ -2 \\ -7 \\ -3 \end{bmatrix}.

It is no accident that u is a solution to the homogeneous system (check this!). In other words, the difference between any two solutions to a linear system of equations is an element of the null space of the coefficient matrix. This is an equivalent way to state Theorem PSPHS. (See Exercise MM.T50.) △
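The "check this!" above can be automated. The coefficient matrix of Archetype D is not reproduced in this example, so the sketch below assumes it, taking its four columns from the set T in Example SCAD later in this chapter; it then verifies with SymPy that u = y_4 − y_5 solves the homogeneous system, just as Theorem PSPHS predicts.

from sympy import Matrix

# Coefficient matrix of Archetype D (columns w1, w2, w3, w4 of Example SCAD).
A = Matrix([
    [ 2, 1,  7, -7],
    [-3, 4, -5, -6],
    [ 1, 1,  4, -5],
])

y4 = Matrix([11, 0, -3, -1])
y5 = Matrix([-4, 2, 4, 2])
u = y4 - y5                           # difference of two solutions

print(u.T)                            # Matrix([[15, -2, -7, -3]])
print(A * u == Matrix.zeros(3, 1))    # True: u is in N(A)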

The ideas of this subsection will appear again in Chapter LT when we discuss pre-images of linear transformations (Definition PI).

Reading Questions

1. Earlier, a reading question asked you to solve the system of equations

2x_1 + 3x_2 − x_3 = 0
x_1 + 2x_2 + x_3 = 3
x_1 + 3x_2 + 3x_3 = 7

Use a linear combination to rewrite this system of equations as a vector equality.

2. Find a linear combination of the vectors

S = \left\{ \begin{bmatrix} 1 \\ 3 \\ -1 \end{bmatrix}, \begin{bmatrix} 2 \\ 0 \\ 4 \end{bmatrix}, \begin{bmatrix} -1 \\ 3 \\ -5 \end{bmatrix} \right\}

that equals the vector \begin{bmatrix} 1 \\ -9 \\ 11 \end{bmatrix}.

3. The matrix below is the augmented matrix of a system of equations, row-reduced to reduced row-echelon form. Write the vector form of the solutions to the system.

\begin{bmatrix} 1 & 3 & 0 & 6 & 0 & 9 \\ 0 & 0 & 1 & -2 & 0 & -8 \\ 0 & 0 & 0 & 0 & 1 & 3 \end{bmatrix}

Exercises

C21† Consider each archetype that is a system of equations. For individual solutions listed (both for the original system and the corresponding homogeneous system) express the vector of constants as a linear combination of the columns of the coefficient matrix, as guaranteed by Theorem SLSLC. Verify this equality by computing the linear combination. For systems with no solutions, recognize that it is then impossible to write the vector of constants as a linear combination of the columns of the coefficient matrix. Note too, for homogeneous systems, that the solutions give rise to linear combinations that equal the zero vector.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J

C22† Consider each archetype that is a system of equations. Write elements of the solution set in vector form, as guaranteed by Theorem VFSLS.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J

C40† Find the vector form of the solutions to the system of equations below.

2x_1 − 4x_2 + 3x_3 + x_5 = 6
x_1 − 2x_2 − 2x_3 + 14x_4 − 4x_5 = 15
x_1 − 2x_2 + x_3 + 2x_4 + x_5 = −1
−2x_1 + 4x_2 − 12x_4 + x_5 = −7

C41† Find the vector form of the solutions to the system of equations below.

−2x_1 − 1x_2 − 8x_3 + 8x_4 + 4x_5 − 9x_6 − 1x_7 − 1x_8 − 18x_9 = 3
3x_1 − 2x_2 + 5x_3 + 2x_4 − 2x_5 − 5x_6 + 1x_7 + 2x_8 + 15x_9 = 10
4x_1 − 2x_2 + 8x_3 + 2x_5 − 14x_6 − 2x_8 + 2x_9 = 36
−1x_1 + 2x_2 + 1x_3 − 6x_4 + 7x_6 − 1x_7 − 3x_9 = −8
3x_1 + 2x_2 + 13x_3 − 14x_4 − 1x_5 + 5x_6 − 1x_8 + 12x_9 = 15
−2x_1 + 2x_2 − 2x_3 − 4x_4 + 1x_5 + 6x_6 − 2x_7 − 2x_8 − 15x_9 = −7

M10† Example TLC asks if the vector

w = \begin{bmatrix} 13 \\ 15 \\ 5 \\ -17 \\ 2 \\ 25 \end{bmatrix}

can be written as a linear combination of the four vectors

u_1 = \begin{bmatrix} 2 \\ 4 \\ -3 \\ 1 \\ 2 \\ 9 \end{bmatrix}   u_2 = \begin{bmatrix} 6 \\ 3 \\ 0 \\ -2 \\ 1 \\ 4 \end{bmatrix}   u_3 = \begin{bmatrix} -5 \\ 2 \\ 1 \\ 1 \\ -3 \\ 0 \end{bmatrix}   u_4 = \begin{bmatrix} 3 \\ 2 \\ -5 \\ 7 \\ 1 \\ 3 \end{bmatrix}

Can it? Can any vector in C^6 be written as a linear combination of the four vectors u_1, u_2, u_3, u_4?

M11† At the end of Example VFS, the vector w is claimed to be a solution to the linear system under discussion. Verify that w really is a solution. Then determine the four scalars that express w as a linear combination of c, u_1, u_2, u_3.


Section SS
Spanning Sets

In this section we will provide an extremely compact way to describe an infinite set of vectors, making use of linear combinations. This will give us a convenient way to describe the solution set of a linear system, the null space of a matrix, and many other sets of vectors.

Subsection SSV
Span of a Set of Vectors

In Example VFSAL we saw the solution set of a homogeneous system described as all possible linear combinations of two particular vectors. This is a useful way to construct or describe infinite sets of vectors, so we encapsulate the idea in a definition.

Definition SSCV Span of a Set of Column Vectors
Given a set of vectors S = {u_1, u_2, u_3, ..., u_p}, their span, 〈S〉, is the set of all possible linear combinations of u_1, u_2, u_3, ..., u_p. Symbolically,

〈S〉 = \{ \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_p u_p \mid \alpha_i \in \mathbb{C},\ 1 \le i \le p \}
    = \left\{ \sum_{i=1}^{p} \alpha_i u_i \,\middle|\, \alpha_i \in \mathbb{C},\ 1 \le i \le p \right\}  □

The span is just a set of vectors, though in all but one situation it is an infinite set. (Just when is it not infinite?) So we start with a finite collection of vectors S (p of them to be precise), and use this finite set to describe an infinite set of vectors, 〈S〉. Confusing the finite set S with the infinite set 〈S〉 is one of the most persistent problems in understanding introductory linear algebra. We will see this construction repeatedly, so let us work through some examples to get comfortable with it. The most obvious question about a set is if a particular item of the correct type is in the set, or not in the set.

Example ABS A basic span
Consider the set of 5 vectors, S, from C^4,

S = \left\{ \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} \right\}

and consider the infinite set of vectors 〈S〉 formed from all possible linear combinations of the elements of S. Here are four vectors we definitely know are elements of 〈S〉, since we will construct them in accordance with Definition SSCV,

w = (2) \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + (1) \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + (-1) \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + (2) \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + (3) \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = \begin{bmatrix} -4 \\ 2 \\ 28 \\ 10 \end{bmatrix}

x = (5) \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + (-6) \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + (-3) \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + (4) \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + (2) \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = \begin{bmatrix} -26 \\ -6 \\ 2 \\ 34 \end{bmatrix}

y = (1) \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + (0) \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + (1) \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + (0) \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + (1) \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = \begin{bmatrix} 7 \\ 4 \\ 17 \\ -4 \end{bmatrix}

z = (0) \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + (0) \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + (0) \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + (0) \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + (0) \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}

The purpose of a set is to collect objects with some common property, and to exclude objects without that property. So the most fundamental question about a set is if a given object is an element of the set or not. Let us learn more about 〈S〉 by investigating which vectors are elements of the set, and which are not.

First, is u = \begin{bmatrix} -15 \\ -6 \\ 19 \\ 5 \end{bmatrix} an element of 〈S〉? We are asking if there are scalars α_1, α_2, α_3, α_4, α_5 such that

α_1 \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + α_2 \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + α_3 \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + α_4 \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + α_5 \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = u = \begin{bmatrix} -15 \\ -6 \\ 19 \\ 5 \end{bmatrix}

Applying Theorem SLSLC we recognize the search for these scalars as a solution to a linear system of equations with augmented matrix

\begin{bmatrix}
1 & 2 & 7 & 1 & -1 & -15 \\
1 & 1 & 3 & 1 & 0 & -6 \\
3 & 2 & 5 & -1 & 9 & 19 \\
1 & -1 & -5 & 2 & 0 & 5
\end{bmatrix}

which row-reduces to

\begin{bmatrix}
1 & 0 & -1 & 0 & 3 & 10 \\
0 & 1 & 4 & 0 & -1 & -9 \\
0 & 0 & 0 & 1 & -2 & -7 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}

At this point, we see that the system is consistent (Theorem RCLS), so we know there is a solution for the five scalars α_1, α_2, α_3, α_4, α_5. This is enough evidence for us to say that u ∈ 〈S〉. If we wished further evidence, we could compute an actual solution, say

α_1 = 2   α_2 = 1   α_3 = −2   α_4 = −3   α_5 = 2

This particular solution allows us to write

(2) \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + (1) \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + (-2) \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + (-3) \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + (2) \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = u = \begin{bmatrix} -15 \\ -6 \\ 19 \\ 5 \end{bmatrix}

making it even more obvious that u ∈ 〈S〉.

Let us do it again. Is v = \begin{bmatrix} 3 \\ 1 \\ 2 \\ -1 \end{bmatrix} an element of 〈S〉? We are asking if there are scalars α_1, α_2, α_3, α_4, α_5 such that

α_1 \begin{bmatrix} 1 \\ 1 \\ 3 \\ 1 \end{bmatrix} + α_2 \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \end{bmatrix} + α_3 \begin{bmatrix} 7 \\ 3 \\ 5 \\ -5 \end{bmatrix} + α_4 \begin{bmatrix} 1 \\ 1 \\ -1 \\ 2 \end{bmatrix} + α_5 \begin{bmatrix} -1 \\ 0 \\ 9 \\ 0 \end{bmatrix} = v = \begin{bmatrix} 3 \\ 1 \\ 2 \\ -1 \end{bmatrix}

Applying Theorem SLSLC we recognize the search for these scalars as a solution to a linear system of equations with augmented matrix

\begin{bmatrix}
1 & 2 & 7 & 1 & -1 & 3 \\
1 & 1 & 3 & 1 & 0 & 1 \\
3 & 2 & 5 & -1 & 9 & 2 \\
1 & -1 & -5 & 2 & 0 & -1
\end{bmatrix}

which row-reduces to

\begin{bmatrix}
1 & 0 & -1 & 0 & 3 & 0 \\
0 & 1 & 4 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & -2 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}

At this point, we see that the system is inconsistent by Theorem RCLS, so we know there is not a solution for the five scalars α_1, α_2, α_3, α_4, α_5. This is enough evidence for us to say that v ∉ 〈S〉. End of story. △

Example SCAA Span of the columns of Archetype A
Begin with the finite set of three vectors of size 3

S = {u_1, u_2, u_3} = \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \right\}

and consider the infinite set 〈S〉. The vectors of S could have been chosen to be anything, but for reasons that will become clear later, we have chosen the three columns of the coefficient matrix in Archetype A.

First, as an example, note that

v = (5) \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (-3) \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} + (7) \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 22 \\ 14 \\ 2 \end{bmatrix}

is in 〈S〉, since it is a linear combination of u_1, u_2, u_3. We write this succinctly as v ∈ 〈S〉. There is nothing magical about the scalars α_1 = 5, α_2 = −3, α_3 = 7; they could have been chosen to be anything. So repeat this part of the example yourself, using different values of α_1, α_2, α_3. What happens if you choose all three scalars to be zero?

So we know how to quickly construct sample elements of the set 〈S〉. A slightly different question arises when you are handed a vector of the correct size and asked if it is an element of 〈S〉. For example, is w = \begin{bmatrix} 1 \\ 8 \\ 5 \end{bmatrix} in 〈S〉? More succinctly, w ∈ 〈S〉?

To answer this question, we will look for scalars α_1, α_2, α_3 so that

α_1 u_1 + α_2 u_2 + α_3 u_3 = w

By Theorem SLSLC solutions to this vector equation are solutions to the system of equations

α_1 − α_2 + 2α_3 = 1
2α_1 + α_2 + α_3 = 8
α_1 + α_2 = 5

Building the augmented matrix for this linear system, and row-reducing, gives

\begin{bmatrix} 1 & 0 & 1 & 3 \\ 0 & 1 & -1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}

This system has infinitely many solutions (there is a free variable in x_3), but all we need is one solution vector. The solution,

α_1 = 2   α_2 = 3   α_3 = 1

tells us that

(2)u_1 + (3)u_2 + (1)u_3 = w

so we are convinced that w really is in 〈S〉. Notice that there are an infinite number of ways to answer this question affirmatively. We could choose a different solution, this time choosing the free variable to be zero,

α_1 = 3   α_2 = 2   α_3 = 0

shows us that

(3)u_1 + (2)u_2 + (0)u_3 = w

Verifying the arithmetic in this second solution will make it obvious that w is in this span. And of course, we now realize that there are an infinite number of ways to realize w as an element of 〈S〉.

Let us ask the same type of question again, but this time with y = \begin{bmatrix} 2 \\ 4 \\ 3 \end{bmatrix}, i.e. is y ∈ 〈S〉?

So we will look for scalars α_1, α_2, α_3 so that

α_1 u_1 + α_2 u_2 + α_3 u_3 = y

By Theorem SLSLC solutions to this vector equation are the solutions to the system of equations

α_1 − α_2 + 2α_3 = 2
2α_1 + α_2 + α_3 = 4
α_1 + α_2 = 3

Building the augmented matrix for this linear system, and row-reducing, gives

\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

This system is inconsistent (there is a pivot column in the last column, Theorem RCLS), so there are no scalars α_1, α_2, α_3 that will create a linear combination of u_1, u_2, u_3 that equals y. More precisely, y ∉ 〈S〉.

There are three things to observe in this example. (1) It is easy to construct vectors in 〈S〉. (2) It is possible that some vectors are in 〈S〉 (e.g. w), while others are not (e.g. y). (3) Deciding if a given vector is in 〈S〉 leads to solving a linear system of equations and asking if the system is consistent.

With a computer program in hand to solve systems of linear equations, could you create a program to decide if a vector was, or was not, in the span of a given set of vectors? Is this art or science?

This example was built on vectors from the columns of the coefficient matrix of Archetype A. Study the determination that v ∈ 〈S〉 and see if you can connect it with some of the other properties of Archetype A. △
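The question posed just above, whether a program could decide span membership, deserves a concrete answer. Here is one hedged sketch in Python with SymPy (the function name in_span is ours, not the text's): assemble the augmented matrix whose columns are the vectors of the set followed by the candidate vector, row-reduce, and apply the consistency test of Theorem RCLS.

from sympy import Matrix

def in_span(vectors, w):
    """Is w in the span of `vectors`? By Theorem SLSLC this is a
    consistency question, and by Theorem RCLS the system is consistent
    exactly when the final column is NOT a pivot column."""
    augmented = Matrix.hstack(*vectors, w)
    _, pivots = augmented.rref()
    return len(vectors) not in pivots     # 0-based index of w's column

# The columns of the coefficient matrix of Archetype A, as above.
u1 = Matrix([1, 2, 1])
u2 = Matrix([-1, 1, 1])
u3 = Matrix([2, 1, 0])

print(in_span([u1, u2, u3], Matrix([1, 8, 5])))   # True, the vector w
print(in_span([u1, u2, u3], Matrix([2, 4, 3])))   # False, the vector y

So it is science: the procedure is exactly the one carried out by hand in the example.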

Having analyzed Archetype A in Example SCAA, we will of course subject Archetype B to a similar investigation.

Example SCAB Span of the columns of Archetype B
Begin with the finite set of three vectors of size 3 that are the columns of the coefficient matrix in Archetype B,

R = {v_1, v_2, v_3} = \left\{ \begin{bmatrix} -7 \\ 5 \\ 1 \end{bmatrix}, \begin{bmatrix} -6 \\ 5 \\ 0 \end{bmatrix}, \begin{bmatrix} -12 \\ 7 \\ 4 \end{bmatrix} \right\}

and consider the infinite set 〈R〉.

First, as an example, note that

x = (2) \begin{bmatrix} -7 \\ 5 \\ 1 \end{bmatrix} + (4) \begin{bmatrix} -6 \\ 5 \\ 0 \end{bmatrix} + (-3) \begin{bmatrix} -12 \\ 7 \\ 4 \end{bmatrix} = \begin{bmatrix} -2 \\ 9 \\ -10 \end{bmatrix}

is in 〈R〉, since it is a linear combination of v_1, v_2, v_3. In other words, x ∈ 〈R〉. Try some different values of α_1, α_2, α_3 yourself, and see what vectors you can create as elements of 〈R〉.

Now ask if a given vector is an element of 〈R〉. For example, is z = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix} in 〈R〉? Is z ∈ 〈R〉?

To answer this question, we will look for scalars α_1, α_2, α_3 so that

α_1 v_1 + α_2 v_2 + α_3 v_3 = z

By Theorem SLSLC solutions to this vector equation are the solutions to the system of equations

−7α_1 − 6α_2 − 12α_3 = −33
5α_1 + 5α_2 + 7α_3 = 24
α_1 + 4α_3 = 5

Building the augmented matrix for this linear system, and row-reducing, gives

\begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 2 \end{bmatrix}

This system has a unique solution,

α_1 = −3   α_2 = 5   α_3 = 2

telling us that

(−3)v_1 + (5)v_2 + (2)v_3 = z

so we are convinced that z really is in 〈R〉. Notice that in this case we have only one way to answer the question affirmatively since the solution is unique.

Let us ask about another vector, say is x = \begin{bmatrix} -7 \\ 8 \\ -3 \end{bmatrix} in 〈R〉? Is x ∈ 〈R〉?

We desire scalars α_1, α_2, α_3 so that

α_1 v_1 + α_2 v_2 + α_3 v_3 = x

By Theorem SLSLC solutions to this vector equation are the solutions to the system of equations

−7α_1 − 6α_2 − 12α_3 = −7
5α_1 + 5α_2 + 7α_3 = 8
α_1 + 4α_3 = −3

Building the augmented matrix for this linear system, and row-reducing, gives

\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & -1 \end{bmatrix}

This system has a unique solution,

α_1 = 1   α_2 = 2   α_3 = −1

telling us that

(1)v_1 + (2)v_2 + (−1)v_3 = x

so we are convinced that x really is in 〈R〉. Notice that in this case we again have only one way to answer the question affirmatively since the solution is again unique.

We could continue to test other vectors for membership in 〈R〉, but there is no point. A question about membership in 〈R〉 inevitably leads to a system of three equations in the three variables α_1, α_2, α_3 with a coefficient matrix whose columns are the vectors v_1, v_2, v_3. This particular coefficient matrix is nonsingular, so by Theorem NMUS, the system is guaranteed to have a solution. (This solution is unique, but that is not critical here.) So no matter which vector we might have chosen for z, we would have been certain to discover that it was an element of 〈R〉. Stated differently, every vector of size 3 is in 〈R〉, or 〈R〉 = C^3.

Compare this example with Example SCAA, and see if you can connect z with some aspects of the write-up for Archetype B. △
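The decisive fact in this example is that the coefficient matrix is nonsingular. A brief hedged check of ours with SymPy (Theorem NMRRI says a square matrix is nonsingular exactly when it row-reduces to the identity matrix):

from sympy import Matrix, eye

# Coefficient matrix of Archetype B, columns v1, v2, v3.
B = Matrix([
    [-7, -6, -12],
    [ 5,  5,   7],
    [ 1,  0,   4],
])

R, _ = B.rref()
print(R == eye(3))        # True, so B is nonsingular and <R> = C^3

# Nonsingularity means B has an inverse, so the scalars for any
# vector of constants can be recovered directly; for z = [-33, 24, 5]:
z = Matrix([-33, 24, 5])
print((B.inv() * z).T)    # Matrix([[-3, 5, 2]])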

Subsection SSNS
Spanning Sets of Null Spaces

We saw in Example VFSAL that when a system of equations is homogeneous the solution set can be expressed in the form described by Theorem VFSLS where the vector c is the zero vector. We can essentially ignore this vector, so that the remainder of the typical expression for a solution looks like an arbitrary linear combination, where the scalars are the free variables and the vectors are u_1, u_2, u_3, ..., u_{n−r}. Which sounds a lot like a span. This is the substance of the next theorem.

Theorem SSNS Spanning Sets for Null Spaces
Suppose that A is an m × n matrix, and B is a row-equivalent matrix in reduced row-echelon form. Suppose that B has r pivot columns, with indices given by D = {d_1, d_2, d_3, ..., d_r}, while the n − r non-pivot columns have indices F = {f_1, f_2, f_3, ..., f_{n−r}, n+1}. Construct the n − r vectors z_j, 1 ≤ j ≤ n − r of size n,

[z_j]_i = \begin{cases} 1 & \text{if } i \in F,\ i = f_j \\ 0 & \text{if } i \in F,\ i \ne f_j \\ -[B]_{k,f_j} & \text{if } i \in D,\ i = d_k \end{cases}

Then the null space of A is given by

N(A) = 〈{z_1, z_2, z_3, ..., z_{n−r}}〉

Proof. Consider the homogeneous system with A as a coefficient matrix, LS(A, 0). Its set of solutions, S, is by Definition NSM, the null space of A, N(A). Let B′ denote the result of row-reducing the augmented matrix of this homogeneous system. Since the system is homogeneous, the final column of the augmented matrix will be all zeros, and after any number of row operations (Definition RO), the column will still be all zeros. So B′ has a final column that is totally zeros.

Now apply Theorem VFSLS to B′, after noting that our homogeneous system must be consistent (Theorem HSC). The vector c has zeros for each entry that has an index in F. For entries with their index in D, the value is −[B′]_{k,n+1}, but for B′ any entry in the final column (index n + 1) is zero. So c = 0. The vectors z_j, 1 ≤ j ≤ n − r are identical to the vectors u_j, 1 ≤ j ≤ n − r described in Theorem VFSLS. Putting it all together and applying Definition SSCV in the final step,

N(A) = S
     = \{ c + α_1 u_1 + α_2 u_2 + α_3 u_3 + \cdots + α_{n−r} u_{n−r} \mid α_1, α_2, α_3, ..., α_{n−r} \in \mathbb{C} \}
     = \{ α_1 u_1 + α_2 u_2 + α_3 u_3 + \cdots + α_{n−r} u_{n−r} \mid α_1, α_2, α_3, ..., α_{n−r} \in \mathbb{C} \}
     = 〈{z_1, z_2, z_3, ..., z_{n−r}}〉  ∎


Notice that the hypotheses of Theorem VFSLS and Theorem SSNS are slightly different. In the former, B is the row-reduced version of an augmented matrix of a linear system, while in the latter, B is the row-reduced version of an arbitrary matrix. Understanding this subtlety now will avoid confusion later.
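The three-case construction in Theorem SSNS is a finite recipe, so it translates almost line-for-line into code. The sketch below is ours (the name null_space_basis is not from the text) and assumes SymPy; it row-reduces A and assembles the vectors z_j exactly as the theorem prescribes. Compare its output with the two examples that follow.

from sympy import Matrix, zeros

def null_space_basis(A):
    """Build the spanning vectors z_j of N(A) per Theorem SSNS."""
    n = A.cols
    B, pivots = A.rref()
    D = list(pivots)                          # pivot columns (0-based)
    F = [i for i in range(n) if i not in D]   # non-pivot columns
    Z = []
    for fj in F:
        z = zeros(n, 1)
        z[fj] = 1                 # case 1: entry for "its own" free index
        # case 2: entries for the other indices in F remain 0
        for k, dk in enumerate(D):
            z[dk] = -B[k, fj]     # case 3: negated entry of column f_j
        Z.append(z)
    return Z

# The matrix of Example SSNS just below:
A = Matrix([
    [ 1,  3,  3, -1, -5],
    [ 2,  5,  7,  1,  1],
    [ 1,  1,  5,  1,  5],
    [-1, -4, -2,  0,  4],
])
for z in null_space_basis(A):
    print(z.T)    # [-6, 1, 1, 0, 0] and [-4, 2, 0, -3, 1]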

Example SSNS Spanning set of a null space
Find a set of vectors, S, so that the null space of the matrix A below is the span of S, that is, 〈S〉 = N(A).

A = \begin{bmatrix}
1 & 3 & 3 & -1 & -5 \\
2 & 5 & 7 & 1 & 1 \\
1 & 1 & 5 & 1 & 5 \\
-1 & -4 & -2 & 0 & 4
\end{bmatrix}

The null space of A is the set of all solutions to the homogeneous system LS(A, 0). If we find the vector form of the solutions to this homogeneous system (Theorem VFSLS) then the vectors u_j, 1 ≤ j ≤ n − r in the linear combination are exactly the vectors z_j, 1 ≤ j ≤ n − r described in Theorem SSNS. So we can mimic Example VFSAL to arrive at these vectors (rather than being a slave to the formulas in the statement of the theorem).

Begin by row-reducing A. The result is

\begin{bmatrix}
1 & 0 & 6 & 0 & 4 \\
0 & 1 & -1 & 0 & -2 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}

With D = {1, 2, 4} and F = {3, 5} we recognize that x_3 and x_5 are free variables and we can interpret each nonzero row as an expression for the dependent variables x_1, x_2, x_4 (respectively) in the free variables x_3 and x_5. With this we can write the vector form of a solution vector as

\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix}
= \begin{bmatrix} -6x_3 - 4x_5 \\ x_3 + 2x_5 \\ x_3 \\ -3x_5 \\ x_5 \end{bmatrix}
= x_3 \begin{bmatrix} -6 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_5 \begin{bmatrix} -4 \\ 2 \\ 0 \\ -3 \\ 1 \end{bmatrix}

Then in the notation of Theorem SSNS,

z_1 = \begin{bmatrix} -6 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}   z_2 = \begin{bmatrix} -4 \\ 2 \\ 0 \\ -3 \\ 1 \end{bmatrix}

and

N(A) = 〈{z_1, z_2}〉 = \left\langle \left\{ \begin{bmatrix} -6 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -4 \\ 2 \\ 0 \\ -3 \\ 1 \end{bmatrix} \right\} \right\rangle  △

Example NSDS Null space directly as a span
Let us express the null space of A as the span of a set of vectors, applying Theorem SSNS as economically as possible, without reference to the underlying homogeneous system of equations (in contrast to Example SSNS).

A = \begin{bmatrix}
2 & 1 & 5 & 1 & 5 & 1 \\
1 & 1 & 3 & 1 & 6 & -1 \\
-1 & 1 & -1 & 0 & 4 & -3 \\
-3 & 2 & -4 & -4 & -7 & 0 \\
3 & -1 & 5 & 2 & 2 & 3
\end{bmatrix}

Theorem SSNS creates vectors for the span by first row-reducing the matrix in question. The row-reduced version of A is

B = \begin{bmatrix}
1 & 0 & 2 & 0 & -1 & 2 \\
0 & 1 & 1 & 0 & 3 & -1 \\
0 & 0 & 0 & 1 & 4 & -2 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}

We will mechanically follow the prescription of Theorem SSNS. Here we go, in two big steps.

First, the non-pivot columns have indices F = {3, 5, 6}, so we will construct the n − r = 6 − 3 = 3 vectors with a pattern of zeros and ones dictated by the indices in F. This is the realization of the first two lines of the three-case definition of the vectors z_j, 1 ≤ j ≤ n − r.

z_1 = \begin{bmatrix} \ \\ \ \\ 1 \\ \ \\ 0 \\ 0 \end{bmatrix}   z_2 = \begin{bmatrix} \ \\ \ \\ 0 \\ \ \\ 1 \\ 0 \end{bmatrix}   z_3 = \begin{bmatrix} \ \\ \ \\ 0 \\ \ \\ 0 \\ 1 \end{bmatrix}

Each of these vectors arises due to the presence of a column that is not a pivot column. The remaining entries of each vector are the entries of the non-pivot column, negated, and distributed into the empty slots in order (these slots have indices in the set D, so also refer to pivot columns). This is the realization of the third line of the three-case definition of the vectors z_j, 1 ≤ j ≤ n − r.

z_1 = \begin{bmatrix} -2 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}   z_2 = \begin{bmatrix} 1 \\ -3 \\ 0 \\ -4 \\ 1 \\ 0 \end{bmatrix}   z_3 = \begin{bmatrix} -2 \\ 1 \\ 0 \\ 2 \\ 0 \\ 1 \end{bmatrix}

So, by Theorem SSNS, we have

N(A) = 〈{z_1, z_2, z_3}〉 = \left\langle \left\{ \begin{bmatrix} -2 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ -3 \\ 0 \\ -4 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 0 \\ 2 \\ 0 \\ 1 \end{bmatrix} \right\} \right\rangle

We know that the null space of A is the solution set of the homogeneous system LS(A, 0), but nowhere in this application of Theorem SSNS have we found occasion to reference the variables or equations of this system. These details are all buried in the proof of Theorem SSNS. △
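As a hedged cross-check (again ours, with SymPy), each z_j can be confirmed with a single matrix-vector product, and SymPy's built-in nullspace() uses the same normalization as Theorem SSNS, so it should return the very same vectors.

from sympy import Matrix

A = Matrix([
    [ 2,  1,  5,  1,  5,  1],
    [ 1,  1,  3,  1,  6, -1],
    [-1,  1, -1,  0,  4, -3],
    [-3,  2, -4, -4, -7,  0],
    [ 3, -1,  5,  2,  2,  3],
])

Z = [Matrix([-2, -1, 1,  0, 0, 0]),
     Matrix([ 1, -3, 0, -4, 1, 0]),
     Matrix([-2,  1, 0,  2, 0, 1])]

print(all(A * z == Matrix.zeros(5, 1) for z in Z))   # True
print(A.nullspace() == Z)                            # True (same basis)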

Here is an example that will simultaneously exercise the span construction and Theorem SSNS, while also pointing the way to the next section.

Example SCAD Span of the columns of Archetype D
Begin with the set of four vectors of size 3

T = {w_1, w_2, w_3, w_4} = \left\{ \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix}, \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix}, \begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix} \right\}

and consider the infinite set W = 〈T〉. The vectors of T have been chosen as the four columns of the coefficient matrix in Archetype D. Check that the vector

z_2 = \begin{bmatrix} 2 \\ 3 \\ 0 \\ 1 \end{bmatrix}

is a solution to the homogeneous system LS(D, 0) (it is the vector z_2 provided by the description of the null space of the coefficient matrix D from Theorem SSNS). Applying Theorem SLSLC, we can write the linear combination,

2w_1 + 3w_2 + 0w_3 + 1w_4 = 0

which we can solve for w_4,

w_4 = (−2)w_1 + (−3)w_2.

This equation says that whenever we encounter the vector w_4, we can replace it with a specific linear combination of the vectors w_1 and w_2. So using w_4 in the set T, along with w_1 and w_2, is excessive. An example of what we mean here can be illustrated by the computation,

5w_1 + (−4)w_2 + 6w_3 + (−3)w_4
= 5w_1 + (−4)w_2 + 6w_3 + (−3)((−2)w_1 + (−3)w_2)
= 5w_1 + (−4)w_2 + 6w_3 + (6w_1 + 9w_2)
= 11w_1 + 5w_2 + 6w_3

So what began as a linear combination of the vectors w_1, w_2, w_3, w_4 has been reduced to a linear combination of the vectors w_1, w_2, w_3. A careful proof using our definition of set equality (Definition SE) would now allow us to conclude that this reduction is possible for any vector in W, so

W = 〈{w_1, w_2, w_3}〉

So the span of our set of vectors, W, has not changed, but we have described it by the span of a set of three vectors, rather than four. Furthermore, we can achieve yet another, similar, reduction.

Check that the vector

z_1 = \begin{bmatrix} -3 \\ -1 \\ 1 \\ 0 \end{bmatrix}

is a solution to the homogeneous system LS(D, 0) (it is the vector z_1 provided by the description of the null space of the coefficient matrix D from Theorem SSNS). Applying Theorem SLSLC, we can write the linear combination,

(−3)w_1 + (−1)w_2 + 1w_3 = 0

which we can solve for w_3,

w_3 = 3w_1 + 1w_2

This equation says that whenever we encounter the vector w_3, we can replace it with a specific linear combination of the vectors w_1 and w_2. So, as before, the vector w_3 is not needed in the description of W, provided we have w_1 and w_2 available. In particular, a careful proof (such as is done in Example RSC5) would show that

W = 〈{w_1, w_2}〉

So W began life as the span of a set of four vectors, and we have now shown (utilizing solutions to a homogeneous system) that W can also be described as the span of a set of just two vectors. Convince yourself that we cannot go any further. In other words, it is not possible to dismiss either w_1 or w_2 in a similar fashion and winnow the set down to just one vector.

What was it about the original set of four vectors that allowed us to declare certain vectors as surplus? And just which vectors were we able to dismiss? And why did we have to stop once we had two vectors remaining? The answers to these questions motivate "linear independence," our next section and next definition, and so are worth considering carefully now. △
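Both replacement equations in this example are one-line checks, and ranks confirm that two vectors is as far as the winnowing can go. A short hedged sketch of ours with SymPy (rank is a tool we have not formally introduced yet):

from sympy import Matrix

w1 = Matrix([ 2, -3,  1])
w2 = Matrix([ 1,  4,  1])
w3 = Matrix([ 7, -5,  4])
w4 = Matrix([-7, -6, -5])

print(w4 == -2*w1 - 3*w2)   # True: w4 is surplus in the span
print(w3 ==  3*w1 + 1*w2)   # True: w3 is surplus as well

# The span of all four columns already has dimension 2, and the span
# of {w1, w2} alone is just as big, so neither w1 nor w2 can go.
print(Matrix.hstack(w1, w2, w3, w4).rank())   # 2
print(Matrix.hstack(w1, w2).rank())           # 2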

Reading Questions

1. Let S be the set of three vectors below.

S = \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 \\ -4 \\ 2 \end{bmatrix}, \begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix} \right\}

Let W = 〈S〉 be the span of S. Is the vector \begin{bmatrix} -1 \\ 8 \\ -4 \end{bmatrix} in W? Give an explanation of the reason for your answer.

2. Use S and W from the previous question. Is the vector \begin{bmatrix} 6 \\ 5 \\ -1 \end{bmatrix} in W? Give an explanation of the reason for your answer.

3. For the matrix A below, find a set S so that 〈S〉 = N(A), where N(A) is the null space of A. (See Theorem SSNS.)

A = \begin{bmatrix} 1 & 3 & 1 & 9 \\ 2 & 1 & -3 & 8 \\ 1 & 1 & -1 & 5 \end{bmatrix}
Exercises

C22† For each archetype that is a system of equations, consider the corresponding homogeneous system of equations. Write elements of the solution set to these homogeneous systems in vector form, as guaranteed by Theorem VFSLS. Then write the null space of the coefficient matrix of each system as the span of a set of vectors, as described in Theorem SSNS.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J

C23† Archetype K and Archetype L are defined as matrices. Use Theorem SSNS directly to find a set S so that 〈S〉 is the null space of the matrix. Do not make any reference to the associated homogeneous system of equations in your solution.

C40† Suppose that S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \end{bmatrix}, \begin{bmatrix} 3 \\ 2 \\ -2 \\ 1 \end{bmatrix} \right\}. Let W = 〈S〉 and let x = \begin{bmatrix} 5 \\ 8 \\ -12 \\ -5 \end{bmatrix}. Is x ∈ W? If so, provide an explicit linear combination that demonstrates this.


C41† Suppose that S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \end{bmatrix}, \begin{bmatrix} 3 \\ 2 \\ -2 \\ 1 \end{bmatrix} \right\}. Let W = 〈S〉 and let y = \begin{bmatrix} 5 \\ 1 \\ 3 \\ 5 \end{bmatrix}. Is y ∈ W? If so, provide an explicit linear combination that demonstrates this.

C42† Suppose R = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ 0 \\ 3 \\ -2 \end{bmatrix} \right\}. Is y = \begin{bmatrix} 1 \\ -1 \\ -8 \\ -4 \\ -3 \end{bmatrix} in 〈R〉?

C43† Suppose R = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 2 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ 0 \\ 3 \\ -2 \end{bmatrix} \right\}. Is z = \begin{bmatrix} 1 \\ 1 \\ 5 \\ 3 \\ 1 \end{bmatrix} in 〈R〉?

C44† Suppose that S = \left\{ \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ 4 \end{bmatrix}, \begin{bmatrix} -6 \\ 5 \\ 1 \end{bmatrix} \right\}. Let W = 〈S〉 and let y = \begin{bmatrix} -5 \\ 3 \\ 0 \end{bmatrix}. Is y ∈ W? If so, provide an explicit linear combination that demonstrates this.

C45† Suppose that S = \left\{ \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 5 \\ 4 \end{bmatrix}, \begin{bmatrix} -6 \\ 5 \\ 1 \end{bmatrix} \right\}. Let W = 〈S〉 and let w = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}. Is w ∈ W? If so, provide an explicit linear combination that demonstrates this.

C50† Let A be the matrix below.

1. Find a set S so that N(A) = 〈S〉.
2. If z = \begin{bmatrix} 3 \\ -5 \\ 1 \\ 2 \end{bmatrix}, then show directly that z ∈ N(A).
3. Write z as a linear combination of the vectors in S.

A = \begin{bmatrix} 2 & 3 & 1 & 4 \\ 1 & 2 & 1 & 3 \\ -1 & 0 & 1 & 1 \end{bmatrix}

C60† For the matrix A below, find a set of vectors S so that the span of S equals the null space of A, 〈S〉 = N(A).

A = \begin{bmatrix} 1 & 1 & 6 & -8 \\ 1 & -2 & 0 & 1 \\ -2 & 1 & -6 & 7 \end{bmatrix}

M10† Consider the set of all size 2 vectors in the Cartesian plane R^2.

1. Give a geometric description of the span of a single vector.
2. How can you tell if two vectors span the entire plane, without doing any row reduction or calculation?

M11† Consider the set of all size 3 vectors in Cartesian 3-space R^3.

1. Give a geometric description of the span of a single vector.
2. Describe the possibilities for the span of two vectors.
3. Describe the possibilities for the span of three vectors.

M12† Let u = \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix} and v = \begin{bmatrix} 2 \\ -2 \\ 1 \end{bmatrix}.

1. Find a vector w_1, different from u and v, so that 〈{u, v, w_1}〉 = 〈{u, v}〉.
2. Find a vector w_2 so that 〈{u, v, w_2}〉 ≠ 〈{u, v}〉.

M20 In Example SCAD we began with the four columns of the coefficient matrix of Archetype D, and used these columns in a span construction. Then we methodically argued that we could remove the last column, then the third column, and create the same set by just doing a span construction with the first two columns. We claimed we could not go any further, and had removed as many vectors as possible. Provide a convincing argument for why a third vector cannot be removed.

M21† In the spirit of Example SCAD, begin with the four columns of the coefficient matrix of Archetype C, and use these columns in a span construction to build the set S. Argue that S can be expressed as the span of just three of the columns of the coefficient matrix (saying exactly which three) and in the spirit of Exercise SS.M20 argue that no one of these three vectors can be removed and still have a span construction create S.

T10† Suppose that v_1, v_2 ∈ C^m. Prove that

〈{v_1, v_2}〉 = 〈{v_1, v_2, 5v_1 + 3v_2}〉

T20† Suppose that S is a set of vectors from C^m. Prove that the zero vector, 0, is an element of 〈S〉.

T21 Suppose that S is a set of vectors from C^m and x, y ∈ 〈S〉. Prove that x + y ∈ 〈S〉.

T22 Suppose that S is a set of vectors from C^m, α ∈ C, and x ∈ 〈S〉. Prove that αx ∈ 〈S〉.


Section LI
Linear Independence

"Linear independence" is one of the most fundamental conceptual ideas in linear algebra, along with the notion of a span. So this section, and the subsequent Section LDS, will explore this new idea.

Subsection LISV
Linearly Independent Sets of Vectors

Theorem SLSLC tells us that a solution to a homogeneous system of equations is a linear combination of the columns of the coefficient matrix that equals the zero vector. We used just this situation to our advantage (twice!) in Example SCAD where we reduced the set of vectors used in a span construction from four down to two, by declaring certain vectors as surplus. The next two definitions will allow us to formalize this situation.

Definition RLDCV Relation of Linear Dependence for Column Vectors
Given a set of vectors S = {u_1, u_2, u_3, ..., u_n}, a true statement of the form

α_1 u_1 + α_2 u_2 + α_3 u_3 + \cdots + α_n u_n = 0

is a relation of linear dependence on S. If this statement is formed in a trivial fashion, i.e. α_i = 0, 1 ≤ i ≤ n, then we say it is the trivial relation of linear dependence on S. □

Definition LICV Linear Independence of Column Vectors
The set of vectors S = {u_1, u_2, u_3, ..., u_n} is linearly dependent if there is a relation of linear dependence on S that is not trivial. In the case where the only relation of linear dependence on S is the trivial one, then S is a linearly independent set of vectors. □

Notice that a relation of linear dependence is an equation. Though most of it is a linear combination, it is not a linear combination (that would be a vector). Linear independence is a property of a set of vectors. It is easy to take a set of vectors, and an equal number of scalars, all zero, and form a linear combination that equals the zero vector. When the easy way is the only way, then we say the set is linearly independent. Here are a couple of examples.

Example LDS Linearly dependent set in C^5
Consider the set of n = 4 vectors from C^5,

\[ S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ -1 \\ 5 \\ 2 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ -3 \\ 6 \\ 1 \end{bmatrix},\ \begin{bmatrix} -6 \\ 7 \\ -1 \\ 0 \\ 1 \end{bmatrix} \right\} \]

To determine linear independence we first form a relation of linear dependence,

\[ \alpha_1 \begin{bmatrix} 2 \\ -1 \\ 3 \\ 1 \\ 2 \end{bmatrix} + \alpha_2 \begin{bmatrix} 1 \\ 2 \\ -1 \\ 5 \\ 2 \end{bmatrix} + \alpha_3 \begin{bmatrix} 2 \\ 1 \\ -3 \\ 6 \\ 1 \end{bmatrix} + \alpha_4 \begin{bmatrix} -6 \\ 7 \\ -1 \\ 0 \\ 1 \end{bmatrix} = 0 \]

We know that α_1 = α_2 = α_3 = α_4 = 0 is a solution to this equation, but that is of no interest whatsoever. That is always the case, no matter what four vectors we might have chosen. We are curious to know if there are other, nontrivial, solutions. Theorem SLSLC tells us that we can find such solutions as solutions to the homogeneous system LS(A, 0) where the coefficient matrix has these four vectors as columns, which we then row-reduce,

\[ A = \begin{bmatrix} 2 & 1 & 2 & -6 \\ -1 & 2 & 1 & 7 \\ 3 & -1 & -3 & -1 \\ 1 & 5 & 6 & 0 \\ 2 & 2 & 1 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -2 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & -3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

We could solve this homogeneous system completely, but for this example all we need is one nontrivial solution. Setting the lone free variable to any nonzero value, such as x_4 = 1, yields the nontrivial solution

\[ x = \begin{bmatrix} 2 \\ -4 \\ 3 \\ 1 \end{bmatrix} \]

completing our application of Theorem SLSLC, we have

\[ 2 \begin{bmatrix} 2 \\ -1 \\ 3 \\ 1 \\ 2 \end{bmatrix} + (-4) \begin{bmatrix} 1 \\ 2 \\ -1 \\ 5 \\ 2 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ 1 \\ -3 \\ 6 \\ 1 \end{bmatrix} + 1 \begin{bmatrix} -6 \\ 7 \\ -1 \\ 0 \\ 1 \end{bmatrix} = 0 \]

This is a relation of linear dependence on S that is not trivial, so we conclude that S is linearly dependent. △
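The row reduction and the extraction of a nontrivial solution can be checked mechanically. Here is a minimal sketch in SymPy (our choice of tool here, not one the text prescribes), whose exact rational arithmetic matches hand row-reduction:

```python
# Verifying Example LDS: row-reduce A and extract a nontrivial relation
# of linear dependence from its null space (SymPy is an assumed tool).
from sympy import Matrix

A = Matrix([[ 2,  1,  2, -6],
            [-1,  2,  1,  7],
            [ 3, -1, -3, -1],
            [ 1,  5,  6,  0],
            [ 2,  2,  1,  1]])

R, pivots = A.rref()   # reduced row-echelon form and pivot column indices
print(pivots)          # (0, 1, 2): the fourth column is free
print(A.nullspace())   # [Matrix([[2], [-4], [3], [1]])]
```

The single null space basis vector is exactly the nontrivial solution x found above, so the scalars of the relation of linear dependence can be read off its entries.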

Example LIS Linearly independent set in C^5
Consider the set of n = 4 vectors from C^5,

\[ T = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ -1 \\ 5 \\ 2 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ -3 \\ 6 \\ 1 \end{bmatrix},\ \begin{bmatrix} -6 \\ 7 \\ -1 \\ 1 \\ 1 \end{bmatrix} \right\} \]

To determine linear independence we first form a relation of linear dependence,

\[ \alpha_1 \begin{bmatrix} 2 \\ -1 \\ 3 \\ 1 \\ 2 \end{bmatrix} + \alpha_2 \begin{bmatrix} 1 \\ 2 \\ -1 \\ 5 \\ 2 \end{bmatrix} + \alpha_3 \begin{bmatrix} 2 \\ 1 \\ -3 \\ 6 \\ 1 \end{bmatrix} + \alpha_4 \begin{bmatrix} -6 \\ 7 \\ -1 \\ 1 \\ 1 \end{bmatrix} = 0 \]

We know that α_1 = α_2 = α_3 = α_4 = 0 is a solution to this equation, but that is of no interest whatsoever. That is always the case, no matter what four vectors we might have chosen. We are curious to know if there are other, nontrivial, solutions. Theorem SLSLC tells us that we can find such solutions as solutions to the homogeneous system LS(B, 0) where the coefficient matrix has these four vectors as columns. Row-reducing this coefficient matrix yields

\[ B = \begin{bmatrix} 2 & 1 & 2 & -6 \\ -1 & 2 & 1 & 7 \\ 3 & -1 & -3 & -1 \\ 1 & 5 & 6 & 1 \\ 2 & 2 & 1 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

From the form of this matrix, we see that there are no free variables, so the solution is unique, and because the system is homogeneous, this unique solution is the trivial solution. So we now know that there is but one way to combine the four vectors of T into a relation of linear dependence, and that one way is the easy and obvious way. In this situation we say that the set, T, is linearly independent. △
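The companion check for Example LIS is even shorter; changing that single entry empties the null space. A sketch under the same SymPy assumption:

```python
# Verifying Example LIS: the matrix B has a trivial null space,
# so the only relation of linear dependence on T is the trivial one.
from sympy import Matrix

B = Matrix([[ 2,  1,  2, -6],
            [-1,  2,  1,  7],
            [ 3, -1, -3, -1],
            [ 1,  5,  6,  1],
            [ 2,  2,  1,  1]])

print(B.nullspace())   # []: no nontrivial solutions, T is linearly independent
```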

Example LDS and Example LIS relied on solving a homogeneous system of equations to determine linear independence. We can codify this process in a time-saving theorem.

Theorem LIVHS Linearly Independent Vectors and Homogeneous Systems
Suppose that S = {v_1, v_2, v_3, ..., v_n} ⊆ C^m is a set of vectors and A is the m × n matrix whose columns are the vectors in S. Then S is a linearly independent set if and only if the homogeneous system LS(A, 0) has a unique solution.

Proof. (⇐) Suppose that LS(A, 0) has a unique solution. Since it is a homogeneous system, this solution must be the trivial solution x = 0. By Theorem SLSLC, this means that the only relation of linear dependence on S is the trivial one. So S is linearly independent.

(⇒) We will prove the contrapositive. Suppose that LS(A, 0) does not have a unique solution. Since it is a homogeneous system, it is consistent (Theorem HSC), and so must have infinitely many solutions (Theorem PSSLS). One of these infinitely many solutions must be nontrivial (in fact, almost all of them are), so choose one. By Theorem SLSLC this nontrivial solution will give a nontrivial relation of linear dependence on S, so we can conclude that S is a linearly dependent set. ∎

Since Theorem LIVHS is an equivalence, we can use it to determine the linear independence or dependence of any set of column vectors, just by creating a matrix and analyzing the row-reduced form. Let us illustrate this with two more examples.

Example LIHS Linearly independent, homogeneous system
Is the set of vectors

\[ S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \\ 2 \end{bmatrix},\ \begin{bmatrix} 6 \\ 2 \\ -1 \\ 3 \\ 4 \end{bmatrix},\ \begin{bmatrix} 4 \\ 3 \\ -4 \\ 5 \\ 1 \end{bmatrix} \right\} \]

linearly independent or linearly dependent?

Theorem LIVHS suggests we study the matrix, A, whose columns are the vectors in S. Specifically, we are interested in the size of the solution set for the homogeneous system LS(A, 0), so we row-reduce A.

\[ A = \begin{bmatrix} 2 & 6 & 4 \\ -1 & 2 & 3 \\ 3 & -1 & -4 \\ 4 & 3 & 5 \\ 2 & 4 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Now, r = 3, so there are n − r = 3 − 3 = 0 free variables and we see that LS(A, 0) has a unique solution (Theorem HSC, Theorem FVCS). By Theorem LIVHS, the set S is linearly independent. △

Example LDHS Linearly dependent, homogeneous system
Is the set of vectors

\[ S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \\ 2 \end{bmatrix},\ \begin{bmatrix} 6 \\ 2 \\ -1 \\ 3 \\ 4 \end{bmatrix},\ \begin{bmatrix} 4 \\ 3 \\ -4 \\ -1 \\ 2 \end{bmatrix} \right\} \]

linearly independent or linearly dependent?

Theorem LIVHS suggests we study the matrix, A, whose columns are the vectors in S. Specifically, we are interested in the size of the solution set for the homogeneous system LS(A, 0), so we row-reduce A.

\[ A = \begin{bmatrix} 2 & 6 & 4 \\ -1 & 2 & 3 \\ 3 & -1 & -4 \\ 4 & 3 & -1 \\ 2 & 4 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Now, r = 2, so there is n − r = 3 − 2 = 1 free variable and we see that LS(A, 0) has infinitely many solutions (Theorem HSC, Theorem FVCS). By Theorem LIVHS, the set S is linearly dependent. △

As an equivalence, Theorem LIVHS gives us a straightforward way to determine if a set of vectors is linearly independent or dependent.

Review Example LIHS and Example LDHS. They are very similar, differing only in the last two slots of the third vector. This resulted in slightly different matrices when row-reduced, and slightly different values of r, the number of nonzero rows. Notice, too, that we are less interested in the actual solution set, and more interested in its form or size. These observations allow us to make a slight improvement in Theorem LIVHS.

Theorem LIVRN Linearly Independent Vectors, r and n
Suppose that S = {v_1, v_2, v_3, ..., v_n} ⊆ C^m is a set of vectors and A is the m × n matrix whose columns are the vectors in S. Let B be a matrix in reduced row-echelon form that is row-equivalent to A and let r denote the number of pivot columns in B. Then S is linearly independent if and only if n = r.

Proof. Theorem LIVHS says the linear independence of S is equivalent to the homogeneous linear system LS(A, 0) having a unique solution. Since LS(A, 0) is consistent (Theorem HSC) we can apply Theorem CSRN to see that the solution is unique exactly when n = r. ∎
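Theorem LIVRN is short enough to transcribe directly into code. A sketch (again assuming SymPy; the helper name is ours) that decides independence for the sets of Example LIHS and Example LDHS:

```python
# Theorem LIVRN as a predicate: columns are independent iff the number of
# pivot columns r equals the number of vectors n. (SymPy sketch.)
from sympy import Matrix

def is_linearly_independent(vectors):
    A = Matrix.hstack(*vectors)      # vectors become the columns of A
    _, pivots = A.rref()             # r = len(pivots)
    return len(pivots) == A.cols     # the n = r test

lihs = [Matrix([2, -1, 3, 4, 2]), Matrix([6, 2, -1, 3, 4]), Matrix([4, 3, -4, 5, 1])]
ldhs = [Matrix([2, -1, 3, 4, 2]), Matrix([6, 2, -1, 3, 4]), Matrix([4, 3, -4, -1, 2])]
print(is_linearly_independent(lihs))   # True:  r = 3 = n
print(is_linearly_independent(ldhs))   # False: r = 2 < n = 3
```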

So now here is an example of the most straightforward way to determine if a set of column vectors is linearly independent or linearly dependent. While this method can be quick and easy, do not forget the logical progression from the definition of linear independence through homogeneous systems of equations which makes it possible.

Example LDRN Linearly dependent, r and n
Is the set of vectors

\[ S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \\ 1 \\ 0 \\ 3 \end{bmatrix},\ \begin{bmatrix} 9 \\ -6 \\ -2 \\ 3 \\ 2 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} -3 \\ 1 \\ 4 \\ 2 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 6 \\ -2 \\ 1 \\ 4 \\ 3 \\ 2 \end{bmatrix} \right\} \]

linearly independent or linearly dependent?

Theorem LIVRN suggests we place these vectors into a matrix as columns and analyze the row-reduced version of the matrix,

\[ \begin{bmatrix} 2 & 9 & 1 & -3 & 6 \\ -1 & -6 & 1 & 1 & -2 \\ 3 & -2 & 1 & 4 & 1 \\ 1 & 3 & 0 & 2 & 4 \\ 0 & 2 & 0 & 1 & 3 \\ 3 & 1 & 1 & 2 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]

Now we need only compute that r = 4 < 5 = n to recognize, via Theorem LIVRN, that S is a linearly dependent set. Boom! △

Example LLDS Large linearly dependent set in C^4
Consider the set of n = 9 vectors from C^4,

\[ R = \left\{ \begin{bmatrix} -1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 7 \\ 1 \\ -3 \\ 6 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ -1 \\ -2 \end{bmatrix},\ \begin{bmatrix} 0 \\ 4 \\ 2 \\ 9 \end{bmatrix},\ \begin{bmatrix} 5 \\ -2 \\ 4 \\ 3 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ -6 \\ 4 \end{bmatrix},\ \begin{bmatrix} 3 \\ 0 \\ -3 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 1 \\ 5 \\ 3 \end{bmatrix},\ \begin{bmatrix} -6 \\ -1 \\ 1 \\ 1 \end{bmatrix} \right\}. \]

To employ Theorem LIVHS, we form a 4 × 9 matrix, C, whose columns are the vectors in R,

\[ C = \begin{bmatrix} -1 & 7 & 1 & 0 & 5 & 2 & 3 & 1 & -6 \\ 3 & 1 & 2 & 4 & -2 & 1 & 0 & 1 & -1 \\ 1 & -3 & -1 & 2 & 4 & -6 & -3 & 5 & 1 \\ 2 & 6 & -2 & 9 & 3 & 4 & 1 & 3 & 1 \end{bmatrix}. \]

To determine if the homogeneous system LS(C, 0) has a unique solution or not, we would normally row-reduce this matrix. But in this particular example, we can do better. Theorem HMVEI tells us that since the system is homogeneous with n = 9 variables in m = 4 equations, and n > m, there must be infinitely many solutions. Since there is not a unique solution, Theorem LIVHS says the set is linearly dependent. △

The situation in Example LLDS is slick enough to warrant formulating as a theorem.

Theorem MVSLD More Vectors than Size implies Linear Dependence
Suppose that S = {u_1, u_2, u_3, ..., u_n} ⊆ C^m and n > m. Then S is a linearly dependent set.

Proof. Form the m × n matrix A whose columns are u_i, 1 ≤ i ≤ n. Consider the homogeneous system LS(A, 0). By Theorem HMVEI this system has infinitely many solutions. Since the system does not have a unique solution, Theorem LIVHS says the columns of A form a linearly dependent set, as desired. ∎
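Theorem MVSLD requires no arithmetic at all; only the shape of the matrix matters. A sketch of the shape test applied to the matrix C of Example LLDS (SymPy assumed, as before):

```python
# Theorem MVSLD needs only the shape: more columns (n) than rows (m)
# forces free variables, hence linear dependence.
from sympy import Matrix

C = Matrix([[-1,  7,  1, 0,  5,  2,  3, 1, -6],
            [ 3,  1,  2, 4, -2,  1,  0, 1, -1],
            [ 1, -3, -1, 2,  4, -6, -3, 5,  1],
            [ 2,  6, -2, 9,  3,  4,  1, 3,  1]])

m, n = C.shape
print(n > m)   # True, so the columns form a linearly dependent set
```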


Subsection LINM
Linear Independence and Nonsingular Matrices

We will now specialize to sets of n vectors from C^n. This will put Theorem MVSLD off-limits, while Theorem LIVHS will involve square matrices. Let us begin by contrasting Archetype A and Archetype B.

Example LDCAA Linearly dependent columns in Archetype A
Archetype A is a system of linear equations with coefficient matrix,

\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} \]

Do the columns of this matrix form a linearly independent or dependent set? By Example S we know that A is singular. According to the definition of nonsingular matrices, Definition NM, the homogeneous system LS(A, 0) has infinitely many solutions. So by Theorem LIVHS, the columns of A form a linearly dependent set. △

Example LICAB Linearly independent columns in Archetype B
Archetype B is a system of linear equations with coefficient matrix,

\[ B = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix} \]

Do the columns of this matrix form a linearly independent or dependent set? By Example NM we know that B is nonsingular. According to the definition of nonsingular matrices, Definition NM, the homogeneous system LS(B, 0) has a unique solution. So by Theorem LIVHS, the columns of B form a linearly independent set. △

That Archetype A and Archetype B have opposite properties for the columns of their coefficient matrices is no accident. Here is the theorem, and then we will update our equivalences for nonsingular matrices, Theorem NME1.

Theorem NMLIC Nonsingular Matrices have Linearly Independent Columns
Suppose that A is a square matrix. Then A is nonsingular if and only if the columns of A form a linearly independent set.

Proof. This is a proof where we can chain together equivalences, rather than proving the two halves separately.

\[
\begin{aligned}
A \text{ nonsingular} &\iff LS(A, 0) \text{ has a unique solution} && \text{Definition NM} \\
&\iff \text{columns of } A \text{ are linearly independent} && \text{Theorem LIVHS}
\end{aligned}
\]
∎

Here is the update to Theorem NME1.

Theorem NME2 Nonsingular Matrix Equivalences, Round 2
Suppose that A is a square matrix. The following are equivalent.

1. A is nonsingular.

2. A row-reduces to the identity matrix.

3. The null space of A contains only the zero vector, N(A) = {0}.

4. The linear system LS(A, b) has a unique solution for every possible choice of b.

5. The columns of A form a linearly independent set.

Proof. Theorem NMLIC is yet another equivalence for a nonsingular matrix, so we can add it to the list in Theorem NME1. ∎
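Each condition of Theorem NME2 can be tested independently, and on a nonsingular matrix they all agree. A sketch checking three of the five conditions on the coefficient matrix of Archetype B (SymPy assumed):

```python
# Theorem NME2 on Archetype B's coefficient matrix: the equivalent
# conditions stand or fall together.
from sympy import Matrix, eye

B = Matrix([[-7, -6, -12],
            [ 5,  5,   7],
            [ 1,  0,   4]])

R, pivots = B.rref()
print(R == eye(3))            # True: B row-reduces to the identity (2.)
print(B.nullspace() == [])    # True: N(B) = {0} (3.)
print(len(pivots) == B.cols)  # True: columns linearly independent (5.)
```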

Subsection NSSLI
Null Spaces, Spans, Linear Independence

In Subsection SS.SSNS we proved Theorem SSNS, which provided n − r vectors that could be used with the span construction to build the entire null space of a matrix. As we have hinted in Example SCAD, and as we will see again going forward, linearly dependent sets carry redundant vectors with them when used in building a set as a span. Our aim now is to show that the vectors provided by Theorem SSNS form a linearly independent set, so in one sense they are as efficient as possible a way to describe the null space. Notice that the vectors z_j, 1 ≤ j ≤ n − r first appear in the vector form of solutions to arbitrary linear systems (Theorem VFSLS). The exact same vectors appear again in the span construction in the conclusion of Theorem SSNS. Since this second theorem specializes to homogeneous systems, the only real difference is that the vector c in Theorem VFSLS is the zero vector for a homogeneous system. Finally, Theorem BNS will now show that these same vectors are a linearly independent set. We will set the stage for the proof of this theorem with a moderately large example. Study the example carefully, as it will make it easier to understand the proof.

Example LINSB Linear independence of null space basis
Suppose that we are interested in the null space of a 3 × 7 matrix, A, which row-reduces to

\[ B = \begin{bmatrix} 1 & 0 & -2 & 4 & 0 & 3 & 9 \\ 0 & 1 & 5 & 6 & 0 & 7 & 1 \\ 0 & 0 & 0 & 0 & 1 & 8 & -5 \end{bmatrix} \]


The set F = {3, 4, 6, 7} is the set of indices for our four free variables that would be used in a description of the solution set for the homogeneous system LS(A, 0). Applying Theorem SSNS we can begin to construct a set of four vectors whose span is the null space of A, a set of vectors we will reference as T.

\[ N(A) = \langle T\rangle = \langle\{z_1, z_2, z_3, z_4\}\rangle = \left\langle\left\{ \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 1 \\ 0 \\ \phantom{0} \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 1 \\ \phantom{0} \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ \phantom{0} \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ \phantom{0} \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle \]

So far, we have constructed as much of these individual vectors as we can, based just on the knowledge of the contents of the set F. This has allowed us to determine the entries in slots 3, 4, 6 and 7, while we have left slots 1, 2 and 5 blank. Without doing any more, let us ask if T is linearly independent. Begin with a relation of linear dependence on T, and see what we can learn about the scalars,

\[ 0 = \alpha_1 z_1 + \alpha_2 z_2 + \alpha_3 z_3 + \alpha_4 z_4 \]

\[ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \alpha_1 \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 1 \\ 0 \\ \phantom{0} \\ 0 \\ 0 \end{bmatrix} + \alpha_2 \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 1 \\ \phantom{0} \\ 0 \\ 0 \end{bmatrix} + \alpha_3 \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ \phantom{0} \\ 1 \\ 0 \end{bmatrix} + \alpha_4 \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ 0 \\ 0 \\ \phantom{0} \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} \phantom{0} \\ \phantom{0} \\ \alpha_1 \\ \alpha_2 \\ \phantom{0} \\ \alpha_3 \\ \alpha_4 \end{bmatrix} \]

Applying Definition CVE to the two ends of this chain of equalities, we see that α_1 = α_2 = α_3 = α_4 = 0. So the only relation of linear dependence on the set T is a trivial one. By Definition LICV the set T is linearly independent. The important feature of this example is how the "pattern of zeros and ones" in the four vectors led to the conclusion of linear independence. △

The proof of Theorem BNS is really quite straightforward, and relies on the "pattern of zeros and ones" that the vectors z_i, 1 ≤ i ≤ n − r exhibit in the entries corresponding to the locations of the non-pivot columns. Play along with Example LINSB as you study the proof. Also, take a look at Example VFSAD, Example VFSAI and Example VFSAL, especially at the conclusion of Step 2 (temporarily ignore the construction of the constant vector, c). This proof is also a good first example of how to prove a conclusion that states a set is linearly independent.

Theorem BNS Basis for Null Spaces
Suppose that A is an m × n matrix, and B is a row-equivalent matrix in reduced row-echelon form with r pivot columns. Let D = {d_1, d_2, d_3, ..., d_r} and F = {f_1, f_2, f_3, ..., f_{n−r}} be the sets of column indices where B does and does not (respectively) have pivot columns. Construct the n − r vectors z_j, 1 ≤ j ≤ n − r of size n as

\[ [z_j]_i = \begin{cases} 1 & \text{if } i \in F,\ i = f_j \\ 0 & \text{if } i \in F,\ i \neq f_j \\ -[B]_{k,f_j} & \text{if } i \in D,\ i = d_k \end{cases} \]

Define the set S = {z_1, z_2, z_3, ..., z_{n−r}}. Then

1. N(A) = 〈S〉.

2. S is a linearly independent set.

Proof. Notice first that the vectors z_j, 1 ≤ j ≤ n − r are exactly the same as the n − r vectors defined in Theorem SSNS. Also, the hypotheses of Theorem SSNS are the same as the hypotheses of the theorem we are currently proving. So it is then simply the conclusion of Theorem SSNS that tells us that N(A) = 〈S〉. That was the easy half, but the second part is not much harder. What is new here is the claim that S is a linearly independent set.

To prove the linear independence of a set, we need to start with a relation of linear dependence and somehow conclude that the scalars involved must all be zero, i.e. that the relation of linear dependence only happens in the trivial fashion. So to establish the linear independence of S, we start with

\[ \alpha_1 z_1 + \alpha_2 z_2 + \alpha_3 z_3 + \cdots + \alpha_{n-r} z_{n-r} = 0. \]

For each j, 1 ≤ j ≤ n − r, consider the equality of the individual entries of the vectors on both sides of this equality in position f_j,

\[
\begin{aligned}
0 &= [0]_{f_j} \\
  &= [\alpha_1 z_1 + \alpha_2 z_2 + \alpha_3 z_3 + \cdots + \alpha_{n-r} z_{n-r}]_{f_j} && \text{Definition CVE} \\
  &= [\alpha_1 z_1]_{f_j} + [\alpha_2 z_2]_{f_j} + [\alpha_3 z_3]_{f_j} + \cdots + [\alpha_{n-r} z_{n-r}]_{f_j} && \text{Definition CVA} \\
  &= \alpha_1 [z_1]_{f_j} + \cdots + \alpha_{j-1} [z_{j-1}]_{f_j} + \alpha_j [z_j]_{f_j} + \alpha_{j+1} [z_{j+1}]_{f_j} + \cdots + \alpha_{n-r} [z_{n-r}]_{f_j} && \text{Definition CVSM} \\
  &= \alpha_1(0) + \cdots + \alpha_{j-1}(0) + \alpha_j(1) + \alpha_{j+1}(0) + \cdots + \alpha_{n-r}(0) && \text{Definition of } z_j \\
  &= \alpha_j
\end{aligned}
\]

So for all j, 1 ≤ j ≤ n − r, we have α_j = 0, which is the conclusion that tells us that the only relation of linear dependence on S = {z_1, z_2, z_3, ..., z_{n−r}} is the trivial one. Hence, by Definition LICV the set is linearly independent, as desired. ∎
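The construction in Theorem BNS is exactly what a null space routine computes. A sketch applying SymPy's nullspace() to the row-reduced matrix B of Example LINSB; the four basis vectors it returns are the z_j, each with a 1 in one free slot and 0 in the other free slots (the listed values come from the hand construction above):

```python
# Theorem BNS in action: nullspace() builds the vectors z_1, ..., z_4
# for the matrix B of Example LINSB. Free columns are 3, 4, 6, 7.
from sympy import Matrix

B = Matrix([[1, 0, -2, 4, 0, 3,  9],
            [0, 1,  5, 6, 0, 7,  1],
            [0, 0,  0, 0, 1, 8, -5]])

for z in B.nullspace():
    print(list(z))
# [2, -5, 1, 0, 0, 0, 0]    <- z_1: 1 in slot 3, 0 in slots 4, 6, 7
# [-4, -6, 0, 1, 0, 0, 0]   <- z_2
# [-3, -7, 0, 0, -8, 1, 0]  <- z_3
# [-9, -1, 0, 0, 5, 0, 1]   <- z_4
```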

Example NSLIL Null space spanned by linearly independent set, Archetype L
In Example VFSAL we previewed Theorem SSNS by finding a set of two vectors such that their span was the null space for the matrix in Archetype L. Writing the matrix as L, we have

\[ N(L) = \left\langle\left\{ \begin{bmatrix} -1 \\ 2 \\ -2 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 2 \\ -2 \\ 1 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle \]

Solving the homogeneous system LS(L, 0) resulted in recognizing x_4 and x_5 as the free variables. So look in entries 4 and 5 of the two vectors above and notice the pattern of zeros and ones that provides the linear independence of the set. △

Reading Questions

1. Let S be the set of three vectors below.

\[ S = \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix},\ \begin{bmatrix} 3 \\ -4 \\ 2 \end{bmatrix},\ \begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix} \right\} \]

Is S linearly independent or linearly dependent? Explain why.

2. Let S be the set of three vectors below.

\[ S = \left\{ \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 3 \\ 2 \\ 2 \end{bmatrix},\ \begin{bmatrix} 4 \\ 3 \\ -4 \end{bmatrix} \right\} \]

Is S linearly independent or linearly dependent? Explain why.

3. Is the matrix below singular or nonsingular? Explain your answer using only the final conclusion you reached in the previous question, along with one new theorem.

\[ \begin{bmatrix} 1 & 3 & 4 \\ -1 & 2 & 3 \\ 0 & 2 & -4 \end{bmatrix} \]

Exercises

Determine if the sets of vectors in Exercises C20–C25 are linearly independent or linearly dependent. When the set is linearly dependent, exhibit a nontrivial relation of linear dependence.

C20 † \[ \left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix},\ \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix},\ \begin{bmatrix} 1 \\ 5 \\ 0 \end{bmatrix} \right\} \]

C21 † \[ \left\{ \begin{bmatrix} -1 \\ 2 \\ 4 \\ 2 \end{bmatrix},\ \begin{bmatrix} 3 \\ 3 \\ -1 \\ 3 \end{bmatrix},\ \begin{bmatrix} 7 \\ 3 \\ -6 \\ 4 \end{bmatrix} \right\} \]

C22 † \[ \left\{ \begin{bmatrix} -2 \\ -1 \\ -1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix},\ \begin{bmatrix} 3 \\ 3 \\ 6 \end{bmatrix},\ \begin{bmatrix} -5 \\ -4 \\ -6 \end{bmatrix},\ \begin{bmatrix} 4 \\ 4 \\ 7 \end{bmatrix} \right\} \]

C23 † \[ \left\{ \begin{bmatrix} 1 \\ -2 \\ 2 \\ 5 \\ 3 \end{bmatrix},\ \begin{bmatrix} 3 \\ 3 \\ 1 \\ 2 \\ -4 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 2 \\ -1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \\ 2 \end{bmatrix} \right\} \]

C24 † \[ \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} 3 \\ 2 \\ -1 \\ 2 \\ 2 \end{bmatrix},\ \begin{bmatrix} 4 \\ 4 \\ -2 \\ 2 \\ 3 \end{bmatrix},\ \begin{bmatrix} -1 \\ 2 \\ -1 \\ -2 \\ 0 \end{bmatrix} \right\} \]

C25 † \[ \left\{ \begin{bmatrix} 2 \\ 1 \\ 3 \\ -1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 4 \\ -2 \\ 1 \\ 3 \\ 2 \end{bmatrix},\ \begin{bmatrix} 10 \\ -7 \\ 0 \\ 10 \\ 4 \end{bmatrix} \right\} \]

C30 † For the matrix B below, find a set S that is linearly independent and spans the null space of B, that is, N(B) = 〈S〉.

\[ B = \begin{bmatrix} -3 & 1 & -2 & 7 \\ -1 & 2 & 1 & 4 \\ 1 & 1 & 2 & -1 \end{bmatrix} \]

C31 † For the matrix A below, find a linearly independent set S so that the null space of A is spanned by S, that is, N(A) = 〈S〉.

\[ A = \begin{bmatrix} -1 & -2 & 2 & 1 & 5 \\ 1 & 2 & 1 & 1 & 5 \\ 3 & 6 & 1 & 2 & 7 \\ 2 & 4 & 0 & 1 & 2 \end{bmatrix} \]

C32 † Find a set of column vectors, T, such that (1) the span of T is the null space of B, 〈T〉 = N(B), and (2) T is a linearly independent set.

\[ B = \begin{bmatrix} 2 & 1 & 1 & 1 \\ -4 & -3 & 1 & -7 \\ 1 & 1 & -1 & 3 \end{bmatrix} \]

C33 † Find a set S so that S is linearly independent and N(A) = 〈S〉, where N(A) is the null space of the matrix A below.

\[ A = \begin{bmatrix} 2 & 3 & 3 & 1 & 4 \\ 1 & 1 & -1 & -1 & -3 \\ 3 & 2 & -8 & -1 & 1 \end{bmatrix} \]

C50 Consider each archetype that is a system of equations and consider the solutions listed for the homogeneous version of the archetype. (If only the trivial solution is listed, then assume this is the only solution to the system.) From the solution set, determine if the columns of the coefficient matrix form a linearly independent or linearly dependent set. In the case of a linearly dependent set, use one of the sample solutions to provide a nontrivial relation of linear dependence on the set of columns of the coefficient matrix (Definition RLD). Indicate when Theorem MVSLD applies and connect this with the number of variables and equations in the system of equations.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J

C51 For each archetype that is a system of equations consider the homogeneous version. Write elements of the solution set in vector form (Theorem VFSLS) and from this extract the vectors z_j described in Theorem BNS. These vectors are used in a span construction to describe the null space of the coefficient matrix for each archetype. What does it mean when we write a null space as 〈{ }〉?

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J

C52 For each archetype that is a system of equations consider the homogeneous version. Sample solutions are given and a linearly independent spanning set is given for the null space of the coefficient matrix. Write each of the sample solutions individually as a linear combination of the vectors in the spanning set for the null space of the coefficient matrix.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J

C60 † For the matrix A below, find a set of vectors S so that (1) S is linearly independent, and (2) the span of S equals the null space of A, 〈S〉 = N(A). (See Exercise SS.C60.)

\[ A = \begin{bmatrix} 1 & 1 & 6 & -8 \\ 1 & -2 & 0 & 1 \\ -2 & 1 & -6 & 7 \end{bmatrix} \]

M20 † Suppose that S = {v_1, v_2, v_3} is a set of three vectors from C^873. Prove that the set

\[ T = \{2v_1 + 3v_2 + v_3,\ v_1 - v_2 - 2v_3,\ 2v_1 + v_2 - v_3\} \]

is linearly dependent.


M21 † Suppose that S = {v_1, v_2, v_3} is a linearly independent set of three vectors from C^873. Prove that the set

\[ T = \{2v_1 + 3v_2 + v_3,\ v_1 - v_2 + 2v_3,\ 2v_1 + v_2 - v_3\} \]

is linearly independent.

M50 † Consider the set of vectors from C^3, W, given below. Find a set T that contains three vectors from W and such that W = 〈T〉.

\[ W = \langle\{v_1, v_2, v_3, v_4, v_5\}\rangle = \left\langle\left\{ \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} \phantom{0} \\ -1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix},\ \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ -3 \end{bmatrix} \right\}\right\rangle \]

M51 † Consider the subspace W = 〈{v_1, v_2, v_3, v_4}〉. Find a set S so that (1) S is a subset of W, (2) S is linearly independent, and (3) W = 〈S〉. Write each vector not included in S as a linear combination of the vectors that are in S.

\[ v_1 = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \quad v_2 = \begin{bmatrix} 4 \\ -4 \\ 8 \end{bmatrix} \quad v_3 = \begin{bmatrix} -3 \\ 2 \\ -7 \end{bmatrix} \quad v_4 = \begin{bmatrix} 2 \\ 1 \\ 7 \end{bmatrix} \]

T10 Prove that if a set of vectors contains the zero vector, then the set is linearly dependent. (Ed. "The zero vector is death to linearly independent sets.")

T12 Suppose that S is a linearly independent set of vectors, and T is a subset of S, T ⊆ S (Definition SSET). Prove that T is linearly independent.

T13 Suppose that T is a linearly dependent set of vectors, and T is a subset of S, T ⊆ S (Definition SSET). Prove that S is linearly dependent.

T15 † Suppose that {v_1, v_2, v_3, ..., v_n} is a set of vectors. Prove that

\[ \{v_1 - v_2,\ v_2 - v_3,\ v_3 - v_4,\ \ldots,\ v_n - v_1\} \]

is a linearly dependent set.

T20 † Suppose that {v_1, v_2, v_3, v_4} is a linearly independent set in C^35. Prove that

\[ \{v_1,\ v_1 + v_2,\ v_1 + v_2 + v_3,\ v_1 + v_2 + v_3 + v_4\} \]

is a linearly independent set.

T50 † Suppose that A is an m × n matrix with linearly independent columns and the linear system LS(A, b) is consistent. Show that this system has a unique solution. (Notice that we are not requiring A to be square.)
that we are not requir<strong>in</strong>g A to be square.)


Section LDS
Linear Dependence and Spans

In any linearly dependent set there is always one vector that can be written as a linear combination of the others. This is the substance of the upcoming Theorem DLDS. Perhaps this will explain the use of the word "dependent." In a linearly dependent set, at least one vector "depends" on the others (via a linear combination).

Indeed, because Theorem DLDS is an equivalence (Proof Technique E), some authors use this condition as a definition (Proof Technique D) of linear dependence. Then linear independence is defined as the logical opposite of linear dependence. Of course, we have chosen to take Definition LICV as our definition, and then follow with Theorem DLDS as a theorem.

Subsection LDSS
Linearly Dependent Sets and Spans

If we use a linearly dependent set to construct a span, then we can always create the same infinite set with a starting set that is one vector smaller in size. We will illustrate this behavior in Example RSC5. However, this will not be possible if we build a span from a linearly independent set. So in a certain sense, using a linearly independent set to formulate a span is the best possible way: there are not any extra vectors being used to build up all the necessary linear combinations. OK, here is the theorem, and then the example.

Theorem DLDS Dependency in Linearly Dependent Sets
Suppose that S = {u_1, u_2, u_3, ..., u_n} is a set of vectors. Then S is a linearly dependent set if and only if there is an index t, 1 ≤ t ≤ n, such that u_t is a linear combination of the vectors u_1, u_2, u_3, ..., u_{t−1}, u_{t+1}, ..., u_n.

Proof. (⇒) Suppose that S is linearly dependent, so there exists a nontrivial relation of linear dependence by Definition LICV. That is, there are scalars, α_i, 1 ≤ i ≤ n, which are not all zero, such that

\[ \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n = 0. \]

Since the α_i cannot all be zero, choose one, say α_t, that is nonzero. Then,

\[
\begin{aligned}
u_t &= \frac{-1}{\alpha_t}\left(-\alpha_t u_t\right) && \text{Property MICN} \\
    &= \frac{-1}{\alpha_t}\left(\alpha_1 u_1 + \cdots + \alpha_{t-1} u_{t-1} + \alpha_{t+1} u_{t+1} + \cdots + \alpha_n u_n\right) && \text{Theorem VSPCV} \\
    &= \frac{-\alpha_1}{\alpha_t} u_1 + \cdots + \frac{-\alpha_{t-1}}{\alpha_t} u_{t-1} + \frac{-\alpha_{t+1}}{\alpha_t} u_{t+1} + \cdots + \frac{-\alpha_n}{\alpha_t} u_n && \text{Theorem VSPCV}
\end{aligned}
\]

Since the values of α_i/α_t are again scalars, we have expressed u_t as a linear combination of the other elements of S.

(⇐) Assume that the vector u_t is a linear combination of the other vectors in S. Write this linear combination, denoting the relevant scalars as β_1, β_2, ..., β_{t−1}, β_{t+1}, ..., β_n, as

\[ u_t = \beta_1 u_1 + \beta_2 u_2 + \cdots + \beta_{t-1} u_{t-1} + \beta_{t+1} u_{t+1} + \cdots + \beta_n u_n \]

Then we have

\[
\begin{aligned}
\beta_1 u_1 + \cdots + \beta_{t-1} u_{t-1} &+ (-1)u_t + \beta_{t+1} u_{t+1} + \cdots + \beta_n u_n \\
  &= u_t + (-1)u_t && \text{Theorem VSPCV} \\
  &= (1 + (-1)) u_t && \text{Property DSAC} \\
  &= 0\,u_t && \text{Property AICN} \\
  &= 0 && \text{Definition CVSM}
\end{aligned}
\]

So the scalars β_1, β_2, β_3, ..., β_{t−1}, β_t = −1, β_{t+1}, ..., β_n provide a nontrivial relation of linear dependence on the vectors in S, thus establishing that S is a linearly dependent set (Definition LICV). ∎

This theorem can be used, sometimes repeatedly, to whittle down the size of a set of vectors used in a span construction. We have seen some of this already in Example SCAD, but in the next example we will detail some of the subtleties.

Example RSC5 Reducing a span in C^5
Consider the set of n = 4 vectors from C^5,

\[ R = \{v_1, v_2, v_3, v_4\} = \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \\ 3 \\ 2 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\ \begin{bmatrix} 0 \\ -7 \\ 6 \\ -11 \\ -2 \end{bmatrix},\ \begin{bmatrix} 4 \\ 1 \\ 2 \\ 1 \\ 6 \end{bmatrix} \right\} \]

and define V = 〈R〉.

To employ Theorem LIVHS, we form a 5 × 4 matrix, D, and row-reduce to understand solutions to the homogeneous system LS(D, 0),

\[ D = \begin{bmatrix} 1 & 2 & 0 & 4 \\ 2 & 1 & -7 & 1 \\ -1 & 3 & 6 & 2 \\ 3 & 1 & -11 & 1 \\ 2 & 2 & -2 & 6 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

We can find infinitely many solutions to this system, most of them nontrivial, and we choose any one we like to build a relation of linear dependence on R. Let us begin with x_4 = 1, to find the solution

\[ x = \begin{bmatrix} -4 \\ 0 \\ -1 \\ 1 \end{bmatrix} \]

So we can write the relation of linear dependence,

\[ (-4)v_1 + 0v_2 + (-1)v_3 + 1v_4 = 0 \]

Theorem DLDS guarantees that we can solve this relation of linear dependence for some vector in R, but the choice of which one is up to us. Notice however that v_2 has a zero coefficient. In this case, we cannot choose to solve for v_2. Maybe some other relation of linear dependence would produce a nonzero coefficient for v_2 if we just had to solve for this vector. Unfortunately, this example has been engineered to always produce a zero coefficient here, as you can see from solving the homogeneous system. Every solution has x_2 = 0!

OK, if we are convinced that we cannot solve for v_2, let us instead solve for v_3,

\[ v_3 = (-4)v_1 + 0v_2 + 1v_4 = (-4)v_1 + 1v_4 \]

We now claim that this particular equation will allow us to write

\[ V = \langle R\rangle = \langle\{v_1, v_2, v_3, v_4\}\rangle = \langle\{v_1, v_2, v_4\}\rangle \]

in essence declaring v_3 as surplus for the task of building V as a span. This claim is an equality of two sets, so we will use Definition SE to establish it carefully. Let R′ = {v_1, v_2, v_4} and V′ = 〈R′〉. We want to show that V = V′.

First show that V′ ⊆ V. Since every vector of R′ is in R, any vector we can construct in V′ as a linear combination of vectors from R′ can also be constructed as a vector in V by the same linear combination of the same vectors in R. That was easy, now turn it around.

Next show that V ⊆ V′. Choose any v from V. So there are scalars α_1, α_2, α_3, α_4 such that

\[
\begin{aligned}
v &= \alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3 + \alpha_4 v_4 \\
  &= \alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 \left((-4)v_1 + 1v_4\right) + \alpha_4 v_4 \\
  &= \alpha_1 v_1 + \alpha_2 v_2 + \left((-4\alpha_3)v_1 + \alpha_3 v_4\right) + \alpha_4 v_4 \\
  &= (\alpha_1 - 4\alpha_3)\,v_1 + \alpha_2 v_2 + (\alpha_3 + \alpha_4)\,v_4.
\end{aligned}
\]

This equation says that v can then be written as a linear combination of the vectors in R′ and hence qualifies for membership in V′. So V ⊆ V′ and we have established that V = V′.

If R′ were also linearly dependent (it is not), we could reduce the set even further. Notice that we could have chosen to eliminate any one of v_1, v_3 or v_4, but somehow v_2 is essential to the creation of V since it cannot be replaced by any linear combination of v_1, v_3 or v_4. △
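Everything in Example RSC5 can be confirmed in a few lines. A sketch (SymPy assumed, as in earlier sections) that recovers the relation of linear dependence and checks that v_3 really is surplus:

```python
# Example RSC5, checked: the null space of D gives the casting-out
# relation, and v3 = (-4)v1 + v4, so v3 is surplus in building <R>.
from sympy import Matrix

v1 = Matrix([1, 2, -1, 3, 2])
v2 = Matrix([2, 1, 3, 1, 2])
v3 = Matrix([0, -7, 6, -11, -2])
v4 = Matrix([4, 1, 2, 1, 6])

D = Matrix.hstack(v1, v2, v3, v4)
print(list(D.nullspace()[0]))   # [-4, 0, -1, 1]: note x2 = 0, always
print(v3 == -4*v1 + v4)         # True
```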

Subsection COV
Casting Out Vectors

In Example RSC5 we used four vectors to create a span. With a relation of linear dependence in hand, we were able to "toss out" one of these four vectors and create the same span from a subset of just three vectors from the original set of four. We did have to take some care as to just which vector we tossed out. In the next example, we will be more methodical about just how we choose to eliminate vectors from a linearly dependent set while preserving a span.

Example COV Casting out vectors
We begin with a set S containing seven vectors from C^4,

\[ S = \left\{ \begin{bmatrix} 1 \\ 2 \\ 0 \\ -1 \end{bmatrix},\ \begin{bmatrix} 4 \\ 8 \\ 0 \\ -4 \end{bmatrix},\ \begin{bmatrix} 0 \\ -1 \\ 2 \\ 2 \end{bmatrix},\ \begin{bmatrix} -1 \\ 3 \\ -3 \\ 4 \end{bmatrix},\ \begin{bmatrix} 0 \\ 9 \\ -4 \\ 8 \end{bmatrix},\ \begin{bmatrix} 7 \\ -13 \\ 12 \\ -31 \end{bmatrix},\ \begin{bmatrix} -9 \\ 7 \\ -8 \\ 37 \end{bmatrix} \right\} \]

and define W = 〈S〉.

The set S is obviously linearly dependent by Theorem MVSLD, since we have n = 7 vectors from C^4. So we can slim down S some, and still create W as the span of a smaller set of vectors.

As a device for identifying relations of linear dependence among the vectors of S, we place the seven column vectors of S into a matrix as columns,

\[ A = [A_1|A_2|A_3|\ldots|A_7] = \begin{bmatrix} 1 & 4 & 0 & -1 & 0 & 7 & -9 \\ 2 & 8 & -1 & 3 & 9 & -13 & 7 \\ 0 & 0 & 2 & -3 & -4 & 12 & -8 \\ -1 & -4 & 2 & 4 & 8 & -31 & 37 \end{bmatrix} \]

By Theorem SLSLC a nontrivial solution to LS(A, 0) will give us a nontrivial relation of linear dependence (Definition RLDCV) on the columns of A (which are the elements of the set S). The row-reduced form for A is the matrix

\[ B = \begin{bmatrix} 1 & 4 & 0 & 0 & 2 & 1 & -3 \\ 0 & 0 & 1 & 0 & 1 & -3 & 5 \\ 0 & 0 & 0 & 1 & 2 & -6 & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]

so we can easily create solutions to the homogeneous system LS(A, 0) using the free variables x_2, x_5, x_6, x_7. Any such solution will provide a relation of linear dependence on the columns of A. These solutions will allow us to solve for one column vector as a linear combination of some others, in the spirit of Theorem DLDS, and remove that vector from the set. We will set about forming these linear combinations methodically.

Set the free variable x_2 = 1, and set the other free variables to zero. Then a solution to LS(A, 0) is

\[ x = \begin{bmatrix} -4 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]

which can be used to create the linear combination

\[ (-4)A_1 + 1A_2 + 0A_3 + 0A_4 + 0A_5 + 0A_6 + 0A_7 = 0 \]

This can then be arranged and solved for A_2, resulting in A_2 expressed as a linear combination of {A_1, A_3, A_4},

\[ A_2 = 4A_1 + 0A_3 + 0A_4 \]

This means that A_2 is surplus, and we can create W just as well with a smaller set with this vector removed,

\[ W = \langle\{A_1, A_3, A_4, A_5, A_6, A_7\}\rangle \]

Technically, this set equality for W requires a proof, in the spirit of Example RSC5, but we will bypass this requirement here, and in the next few paragraphs.

Now, set the free variable x_5 = 1, and set the other free variables to zero. Then a solution to LS(A, 0) is

\[ x = \begin{bmatrix} -2 \\ 0 \\ -1 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} \]

which can be used to create the linear combination

\[ (-2)A_1 + 0A_2 + (-1)A_3 + (-2)A_4 + 1A_5 + 0A_6 + 0A_7 = 0 \]

This can then be arranged and solved for A_5, resulting in A_5 expressed as a linear combination of {A_1, A_3, A_4},

\[ A_5 = 2A_1 + 1A_3 + 2A_4 \]

This means that A_5 is surplus, and we can create W just as well with a smaller set with this vector removed,

\[ W = \langle\{A_1, A_3, A_4, A_6, A_7\}\rangle \]

Do it again; set the free variable x_6 = 1, and set the other free variables to zero. Then a solution to LS(A, 0) is

\[ x = \begin{bmatrix} -1 \\ 0 \\ 3 \\ 6 \\ 0 \\ 1 \\ 0 \end{bmatrix} \]

which can be used to create the linear combination

\[ (-1)A_1 + 0A_2 + 3A_3 + 6A_4 + 0A_5 + 1A_6 + 0A_7 = 0 \]

This can then be arranged and solved for A_6, resulting in A_6 expressed as a linear combination of {A_1, A_3, A_4},

\[ A_6 = 1A_1 + (-3)A_3 + (-6)A_4 \]

This means that A_6 is surplus, and we can create W just as well with a smaller set with this vector removed,

\[ W = \langle\{A_1, A_3, A_4, A_7\}\rangle \]

Set the free variable x_7 = 1, and set the other free variables to zero. Then a solution to LS(A, 0) is

\[ x = \begin{bmatrix} 3 \\ 0 \\ -5 \\ -6 \\ 0 \\ 0 \\ 1 \end{bmatrix} \]

which can be used to create the linear combination

\[ 3A_1 + 0A_2 + (-5)A_3 + (-6)A_4 + 0A_5 + 0A_6 + 1A_7 = 0 \]

This can then be arranged and solved for A_7, resulting in A_7 expressed as a linear combination of {A_1, A_3, A_4},

\[ A_7 = (-3)A_1 + 5A_3 + 6A_4 \]

This means that A_7 is surplus, and we can create W just as well with a smaller set with this vector removed,

\[ W = \langle\{A_1, A_3, A_4\}\rangle \]

You might think we could keep this up, but we have run out of free variables. And not coincidentally, the set {A_1, A_3, A_4} is linearly independent (check this!). It should be clear how each free variable was used to eliminate a column from the set used to span the column space, as this will be the essence of the proof of the next theorem. The column vectors in S were not chosen entirely at random; they are the columns of Archetype I. See if you can mimic this example using the columns of Archetype J. Go ahead, we'll go grab a cup of coffee and be back before you finish up.

For extra credit, notice that the vector

\[ b = \begin{bmatrix} 3 \\ 9 \\ 1 \\ 4 \end{bmatrix} \]

is the vector of constants in the definition of Archetype I. Since the system LS(A, b) is consistent, we know by Theorem SLSLC that b is a linear combination of the columns of A, or stated equivalently, b ∈ W. This means that b must also be a linear combination of just the three columns A_1, A_3, A_4. Can you find such a linear combination? Did you notice that there is just a single (unique) answer? Hmmmm. △

Example COV deserves your careful attention, since this important example motivates the following very fundamental theorem.

Theorem BS Basis of a Span
Suppose that $S = \{v_1, v_2, v_3, \ldots, v_n\}$ is a set of column vectors. Define $W = \langle S\rangle$ and let A be the matrix whose columns are the vectors from S. Let B be the reduced row-echelon form of A, with $D = \{d_1, d_2, d_3, \ldots, d_r\}$ the set of indices for the pivot columns of B. Then

1. $T = \{v_{d_1}, v_{d_2}, v_{d_3}, \ldots, v_{d_r}\}$ is a linearly independent set.

2. $W = \langle T\rangle$.

Proof. To prove that T is linearly independent, begin with a relation of linear dependence on T,
$$0 = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \alpha_3 v_{d_3} + \cdots + \alpha_r v_{d_r}$$
and we will try to conclude that the only possibility for the scalars $\alpha_i$ is that they are all zero. Denote the non-pivot columns of B by $F = \{f_1, f_2, f_3, \ldots, f_{n-r}\}$. Then we can preserve the equality by adding a big fat zero to the linear combination,
$$0 = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \alpha_3 v_{d_3} + \cdots + \alpha_r v_{d_r} + 0v_{f_1} + 0v_{f_2} + 0v_{f_3} + \cdots + 0v_{f_{n-r}}$$
By Theorem SLSLC, the scalars in this linear combination (suitably reordered) are a solution to the homogeneous system $\mathcal{LS}(A, \mathbf{0})$. But notice that this is the solution obtained by setting each free variable to zero. If we consider the description of a solution vector in the conclusion of Theorem VFSLS, in the case of a homogeneous system, then we see that if all the free variables are set to zero the resulting solution vector is trivial (all zeros). So it must be that $\alpha_i = 0$, $1 \le i \le r$. This implies by Definition LICV that T is a linearly independent set.

The second conclusion of this theorem is an equality of sets (Definition SE). Since T is a subset of S, any linear combination of elements of the set T can also be viewed as a linear combination of elements of the set S. So $\langle T\rangle \subseteq \langle S\rangle = W$. It remains to prove that $W = \langle S\rangle \subseteq \langle T\rangle$.

For each k, $1 \le k \le n-r$, form a solution x to $\mathcal{LS}(A, \mathbf{0})$ by setting the free variables as follows:
$$x_{f_1} = 0 \quad x_{f_2} = 0 \quad x_{f_3} = 0 \quad \ldots \quad x_{f_k} = 1 \quad \ldots \quad x_{f_{n-r}} = 0$$
By Theorem VFSLS, the remainder of this solution vector is given by,
$$x_{d_1} = -[B]_{1,f_k} \quad x_{d_2} = -[B]_{2,f_k} \quad x_{d_3} = -[B]_{3,f_k} \quad \ldots \quad x_{d_r} = -[B]_{r,f_k}$$
From this solution, we obtain a relation of linear dependence on the columns of A,
$$-[B]_{1,f_k}v_{d_1} - [B]_{2,f_k}v_{d_2} - [B]_{3,f_k}v_{d_3} - \cdots - [B]_{r,f_k}v_{d_r} + 1v_{f_k} = 0$$
which can be arranged as the equality
$$v_{f_k} = [B]_{1,f_k}v_{d_1} + [B]_{2,f_k}v_{d_2} + [B]_{3,f_k}v_{d_3} + \cdots + [B]_{r,f_k}v_{d_r}$$
Now, suppose we take an arbitrary element, w, of $W = \langle S\rangle$ and write it as a linear combination of the elements of S, but with the terms organized according to the indices in D and F,
$$w = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \cdots + \alpha_r v_{d_r} + \beta_1 v_{f_1} + \beta_2 v_{f_2} + \cdots + \beta_{n-r} v_{f_{n-r}}$$
From the above, we can replace each $v_{f_j}$ by a linear combination of the $v_{d_i}$,
$$\begin{aligned}
w = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \cdots + \alpha_r v_{d_r}
&+ \beta_1\left([B]_{1,f_1}v_{d_1} + [B]_{2,f_1}v_{d_2} + [B]_{3,f_1}v_{d_3} + \cdots + [B]_{r,f_1}v_{d_r}\right) \\
&+ \beta_2\left([B]_{1,f_2}v_{d_1} + [B]_{2,f_2}v_{d_2} + [B]_{3,f_2}v_{d_3} + \cdots + [B]_{r,f_2}v_{d_r}\right) \\
&\;\;\vdots \\
&+ \beta_{n-r}\left([B]_{1,f_{n-r}}v_{d_1} + [B]_{2,f_{n-r}}v_{d_2} + [B]_{3,f_{n-r}}v_{d_3} + \cdots + [B]_{r,f_{n-r}}v_{d_r}\right)
\end{aligned}$$
With repeated applications of several of the properties of Theorem VSPCV we can rearrange this expression as,
$$\begin{aligned}
= &\left(\alpha_1 + \beta_1[B]_{1,f_1} + \beta_2[B]_{1,f_2} + \beta_3[B]_{1,f_3} + \cdots + \beta_{n-r}[B]_{1,f_{n-r}}\right)v_{d_1} \\
+ &\left(\alpha_2 + \beta_1[B]_{2,f_1} + \beta_2[B]_{2,f_2} + \beta_3[B]_{2,f_3} + \cdots + \beta_{n-r}[B]_{2,f_{n-r}}\right)v_{d_2} \\
&\;\;\vdots \\
+ &\left(\alpha_r + \beta_1[B]_{r,f_1} + \beta_2[B]_{r,f_2} + \beta_3[B]_{r,f_3} + \cdots + \beta_{n-r}[B]_{r,f_{n-r}}\right)v_{d_r}
\end{aligned}$$
This mess expresses the vector w as a linear combination of the vectors in
$$T = \{v_{d_1}, v_{d_2}, v_{d_3}, \ldots, v_{d_r}\}$$
thus saying that $w \in \langle T\rangle$. Therefore, $W = \langle S\rangle \subseteq \langle T\rangle$. ■

In Example COV, we tossed out vectors one at a time. But in each instance, we rewrote the offending vector as a linear combination of those vectors with the column indices of the pivot columns of the reduced row-echelon form of the matrix of columns. In the proof of Theorem BS, we accomplish this reduction in one big step. In Example COV we arrived at a linearly independent set at exactly the same moment that we ran out of free variables to exploit. This was not a coincidence; it is the substance of our conclusion of linear independence in Theorem BS.

Here is a straightforward application of Theorem BS.

Example RSC4 Reducing a span in $\mathbb{C}^4$
Begin with a set of five vectors from $\mathbb{C}^4$,
$$S = \left\{\begin{bmatrix}1\\1\\2\\1\end{bmatrix},\ \begin{bmatrix}2\\2\\4\\2\end{bmatrix},\ \begin{bmatrix}2\\0\\-1\\1\end{bmatrix},\ \begin{bmatrix}7\\1\\-1\\4\end{bmatrix},\ \begin{bmatrix}0\\2\\5\\1\end{bmatrix}\right\}$$
and let $W = \langle S\rangle$. To arrive at a (smaller) linearly independent set, follow the procedure described in Theorem BS. Place the vectors from S into a matrix as columns, and row-reduce,
$$\begin{bmatrix}1&2&2&7&0\\1&2&0&1&2\\2&4&-1&-1&5\\1&2&1&4&1\end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix}1&2&0&1&2\\0&0&1&3&-1\\0&0&0&0&0\\0&0&0&0&0\end{bmatrix}$$
Columns 1 and 3 are the pivot columns ($D = \{1, 3\}$) so the set
$$T = \left\{\begin{bmatrix}1\\1\\2\\1\end{bmatrix},\ \begin{bmatrix}2\\0\\-1\\1\end{bmatrix}\right\}$$
is linearly independent and $\langle T\rangle = \langle S\rangle = W$. Boom!
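This row reduction is exactly the kind of computation a computer algebra system automates. As an illustration we add here (not an exercise from the text), a minimal sketch assuming Python with SymPy is available; Matrix.rref() returns both the reduced row-echelon form and the indices of the pivot columns (zero-indexed), which is all Theorem BS needs.

    from sympy import Matrix

    # Columns are the five vectors of S from Example RSC4.
    A = Matrix([[1, 2, 2, 7, 0],
                [1, 2, 0, 1, 2],
                [2, 4, -1, -1, 5],
                [1, 2, 1, 4, 1]])

    B, pivots = A.rref()        # B is the RREF, pivots the pivot column indices
    print(pivots)               # (0, 2): columns 1 and 3 in the book's 1-indexing
    T = [A.col(j) for j in pivots]   # the linearly independent set T with <T> = W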

Since the reduced row-echelon form of a matrix is unique (Theorem RREFU), the procedure of Theorem BS leads us to a unique set T. However, there is a wide variety of possibilities for sets T that are linearly independent and which can be employed in a span to create W. Without proof, we list two other possibilities:
$$T' = \left\{\begin{bmatrix}2\\2\\4\\2\end{bmatrix},\ \begin{bmatrix}2\\0\\-1\\1\end{bmatrix}\right\} \qquad\qquad T^* = \left\{\begin{bmatrix}3\\1\\1\\2\end{bmatrix},\ \begin{bmatrix}-1\\1\\3\\0\end{bmatrix}\right\}$$
Can you prove that $T'$ and $T^*$ are linearly independent sets and $W = \langle S\rangle = \langle T'\rangle = \langle T^*\rangle$? △

Example RES Reworking elements of a span
Begin with a set of five vectors from $\mathbb{C}^4$,
$$R = \left\{\begin{bmatrix}2\\1\\3\\2\end{bmatrix},\ \begin{bmatrix}-1\\1\\0\\1\end{bmatrix},\ \begin{bmatrix}-8\\-1\\-9\\-4\end{bmatrix},\ \begin{bmatrix}3\\1\\-1\\-2\end{bmatrix},\ \begin{bmatrix}-10\\-1\\-1\\4\end{bmatrix}\right\}$$
It is easy to create elements of $X = \langle R\rangle$; we will create one at random,
$$y = 6\begin{bmatrix}2\\1\\3\\2\end{bmatrix} + (-7)\begin{bmatrix}-1\\1\\0\\1\end{bmatrix} + 1\begin{bmatrix}-8\\-1\\-9\\-4\end{bmatrix} + 6\begin{bmatrix}3\\1\\-1\\-2\end{bmatrix} + 2\begin{bmatrix}-10\\-1\\-1\\4\end{bmatrix} = \begin{bmatrix}9\\2\\1\\-3\end{bmatrix}$$
We know we can replace R by a smaller set (since it is obviously linearly dependent by Theorem MVSLD) that will create the same span. Here goes,
$$\begin{bmatrix}2&-1&-8&3&-10\\1&1&-1&1&-1\\3&0&-9&-1&-1\\2&1&-4&-2&4\end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix}1&0&-3&0&-1\\0&1&2&0&2\\0&0&0&1&-2\\0&0&0&0&0\end{bmatrix}$$
So, if we collect the first, second and fourth vectors from R,
$$P = \left\{\begin{bmatrix}2\\1\\3\\2\end{bmatrix},\ \begin{bmatrix}-1\\1\\0\\1\end{bmatrix},\ \begin{bmatrix}3\\1\\-1\\-2\end{bmatrix}\right\}$$
then P is linearly independent and $\langle P\rangle = \langle R\rangle = X$ by Theorem BS. Since we built y as an element of ⟨R⟩ it must also be an element of ⟨P⟩. Can we write y as a linear combination of just the three vectors in P? The answer is, of course, yes. But let us compute an explicit linear combination just for fun. By Theorem SLSLC we can get such a linear combination by solving a system of equations with the column vectors of P as the columns of a coefficient matrix, and y as the vector of constants.

Employing an augmented matrix to solve this system,
$$\begin{bmatrix}2&-1&3&9\\1&1&1&2\\3&0&-1&1\\2&1&-2&-3\end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix}1&0&0&1\\0&1&0&-1\\0&0&1&2\\0&0&0&0\end{bmatrix}$$
So we see, as expected, that
$$1\begin{bmatrix}2\\1\\3\\2\end{bmatrix} + (-1)\begin{bmatrix}-1\\1\\0\\1\end{bmatrix} + 2\begin{bmatrix}3\\1\\-1\\-2\end{bmatrix} = \begin{bmatrix}9\\2\\1\\-3\end{bmatrix} = y$$
A key feature of this example is that the linear combination that expresses y as a linear combination of the vectors in P is unique. This is a consequence of the linear independence of P. The linearly independent set P is smaller than R, but still just (barely) big enough to create elements of the set $X = \langle R\rangle$. There are many, many ways to write y as a linear combination of the five vectors in R (the appropriate system of equations to verify this claim yields two free variables in the description of the solution set), yet there is precisely one way to write y as a linear combination of the three vectors in P. △
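As a sketch of how the final computation might be checked by machine (an illustration we add here, assuming SymPy is available), row-reducing the augmented matrix [P | y] recovers the unique scalars:

    from sympy import Matrix

    # Columns of P, the reduced spanning set from Example RES.
    P = Matrix([[2, -1, 3],
                [1,  1, 1],
                [3,  0, -1],
                [2,  1, -2]])
    y = Matrix([9, 2, 1, -3])

    # Row-reduce the augmented matrix [P | y]; since the columns of P are
    # linearly independent, the solution is unique and sits in the last column.
    aug, _ = P.row_join(y).rref()
    print(aug[:3, -1])   # Matrix([[1], [-1], [2]]): y = 1*p1 + (-1)*p2 + 2*p3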

Reading Questions

1. Let S be the linearly dependent set of three vectors below.
$$S = \left\{\begin{bmatrix}1\\10\\100\\1000\end{bmatrix},\ \begin{bmatrix}1\\1\\1\\1\end{bmatrix},\ \begin{bmatrix}5\\23\\203\\2003\end{bmatrix}\right\}$$
Write one vector from S as a linear combination of the other two and include this vector equality in your response. (You should be able to do this on sight, rather than doing some computations.) Convert this expression into a nontrivial relation of linear dependence on S.

2. Explain why the word "dependent" is used in the definition of linear dependence.

3. Suppose that $Y = \langle P\rangle = \langle Q\rangle$, where P is a linearly dependent set and Q is linearly independent. Would you rather use P or Q to describe Y? Why?

Exercises

C20† Let T be the set of columns of the matrix B below. Define $W = \langle T\rangle$. Find a set R so that (1) R has 3 vectors, (2) R is a subset of T, and (3) $W = \langle R\rangle$.
$$B = \begin{bmatrix}-3&1&-2&7\\-1&2&1&4\\1&1&2&-1\end{bmatrix}$$

C40 Verify that the set $R' = \{v_1, v_2, v_4\}$ at the end of Example RSC5 is linearly independent.

C50† Consider the set of vectors from $\mathbb{C}^3$, W, given below. Find a linearly independent set T that contains three vectors from W and such that $\langle W\rangle = \langle T\rangle$.
$$W = \{v_1, v_2, v_3, v_4, v_5\} = \left\{\begin{bmatrix}2\\1\\1\end{bmatrix},\ \begin{bmatrix}-1\\-1\\1\end{bmatrix},\ \begin{bmatrix}1\\2\\3\end{bmatrix},\ \begin{bmatrix}3\\1\\3\end{bmatrix},\ \begin{bmatrix}0\\1\\-3\end{bmatrix}\right\}$$

C51† Given the set S below, find a linearly independent set T so that $\langle T\rangle = \langle S\rangle$.
$$S = \left\{\begin{bmatrix}2\\-1\\2\end{bmatrix},\ \begin{bmatrix}3\\0\\1\end{bmatrix},\ \begin{bmatrix}1\\1\\-1\end{bmatrix},\ \begin{bmatrix}5\\-1\\3\end{bmatrix}\right\}$$

C52† Let W be the span of the set of vectors S below, $W = \langle S\rangle$. Find a set T so that (1) the span of T is W, $\langle T\rangle = W$, (2) T is a linearly independent set, and (3) T is a subset of S.
$$S = \left\{\begin{bmatrix}1\\2\\-1\end{bmatrix},\ \begin{bmatrix}2\\-3\\1\end{bmatrix},\ \begin{bmatrix}4\\1\\-1\end{bmatrix},\ \begin{bmatrix}3\\1\\1\end{bmatrix},\ \begin{bmatrix}3\\-1\\0\end{bmatrix}\right\}$$

C55† Let T be the set of vectors
$$T = \left\{\begin{bmatrix}1\\-1\\2\end{bmatrix},\ \begin{bmatrix}3\\0\\1\end{bmatrix},\ \begin{bmatrix}4\\2\\3\end{bmatrix},\ \begin{bmatrix}3\\0\\6\end{bmatrix}\right\}$$
Find two different subsets of T, named R and S, so that R and S each contain three vectors, and so that $\langle R\rangle = \langle T\rangle$ and $\langle S\rangle = \langle T\rangle$. Prove that both R and S are linearly independent.

C70 Reprise Example RES by creating a new version of the vector y. In other words, form a new, different linear combination of the vectors in R to create a new vector y (but do not simplify the problem too much by choosing any of the five new scalars to be zero). Then express this new y as a combination of the vectors in P.

M10 At the conclusion of Example RSC4 two alternative solutions, sets T′ and T∗, are proposed. Verify these claims by proving that $\langle T\rangle = \langle T'\rangle$ and $\langle T\rangle = \langle T^*\rangle$.

T40† Suppose that $v_1$ and $v_2$ are any two vectors from $\mathbb{C}^m$. Prove the following set equality.
$$\langle\{v_1, v_2\}\rangle = \langle\{v_1 + v_2,\ v_1 - v_2\}\rangle$$


Section O
Orthogonality

In this section we define a couple more operations with vectors, and prove a few theorems. At first blush these definitions and results will not appear central to what follows, but we will make use of them at key points in the remainder of the course (such as Section MINM, Section OD). Because we have chosen to use $\mathbb{C}$ as our set of scalars, this subsection is a bit more, uh, ... complex than it would be for the real numbers. We will explain as we go along how things get easier for the real numbers $\mathbb{R}$. If you have not already, now would be a good time to review some of the basic properties of arithmetic with complex numbers described in Section CNO. With that done, we can extend the basics of complex number arithmetic to our study of vectors in $\mathbb{C}^m$.

Subsection CAV
Complex Arithmetic and Vectors

We know how the addition and multiplication of complex numbers is employed in defining the operations for vectors in $\mathbb{C}^m$ (Definition CVA and Definition CVSM). We can also extend the idea of the conjugate to vectors.

Definition CCCV Complex Conjugate of a Column Vector
Suppose that u is a vector from $\mathbb{C}^m$. Then the conjugate of the vector, $\overline{u}$, is defined by
$$\left[\overline{u}\right]_i = \overline{[u]_i} \qquad 1 \le i \le m$$
□

With this definition we can show that the conjugate of a column vector behaves as we would expect with regard to vector addition and scalar multiplication.

Theorem CRVA Conjugation Respects Vector Addition
Suppose x and y are two vectors from $\mathbb{C}^m$. Then
$$\overline{x + y} = \overline{x} + \overline{y}$$

Proof. For each $1 \le i \le m$,
$$\begin{aligned}
\left[\overline{x+y}\right]_i &= \overline{[x+y]_i} && \text{Definition CCCV} \\
&= \overline{[x]_i + [y]_i} && \text{Definition CVA} \\
&= \overline{[x]_i} + \overline{[y]_i} && \text{Theorem CCRA} \\
&= \left[\overline{x}\right]_i + \left[\overline{y}\right]_i && \text{Definition CCCV} \\
&= \left[\overline{x} + \overline{y}\right]_i && \text{Definition CVA}
\end{aligned}$$
Then by Definition CVE we have $\overline{x+y} = \overline{x} + \overline{y}$. ■

Theorem CRSM Conjugation Respects Vector Scalar Multiplication
Suppose x is a vector from $\mathbb{C}^m$, and $\alpha \in \mathbb{C}$ is a scalar. Then
$$\overline{\alpha x} = \overline{\alpha}\,\overline{x}$$

Proof. For $1 \le i \le m$,
$$\begin{aligned}
\left[\overline{\alpha x}\right]_i &= \overline{[\alpha x]_i} && \text{Definition CCCV} \\
&= \overline{\alpha [x]_i} && \text{Definition CVSM} \\
&= \overline{\alpha}\;\overline{[x]_i} && \text{Theorem CCRM} \\
&= \overline{\alpha}\left[\overline{x}\right]_i && \text{Definition CCCV} \\
&= \left[\overline{\alpha}\,\overline{x}\right]_i && \text{Definition CVSM}
\end{aligned}$$
Then by Definition CVE we have $\overline{\alpha x} = \overline{\alpha}\,\overline{x}$. ■

These two theorems together tell us how we can "push" complex conjugation through linear combinations.
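For instance, combining Theorem CRVA and Theorem CRSM gives $\overline{\alpha x + y} = \overline{\alpha}\,\overline{x} + \overline{y}$. A quick numerical spot-check, an illustration we add here assuming NumPy is available:

    import numpy as np

    x = np.array([1 + 2j, -3j])
    y = np.array([4 - 1j, 2 + 2j])
    alpha = 2 - 5j

    # Conjugation "pushed through" a linear combination (Theorems CRVA, CRSM).
    lhs = np.conj(alpha * x + y)
    rhs = np.conj(alpha) * np.conj(x) + np.conj(y)
    print(np.allclose(lhs, rhs))   # True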

Subsection IP
Inner products

Definition IP Inner Product
Given the vectors $u, v \in \mathbb{C}^m$ the inner product of u and v is the scalar quantity in $\mathbb{C}$,
$$\langle u, v\rangle = \overline{[u]_1}\,[v]_1 + \overline{[u]_2}\,[v]_2 + \overline{[u]_3}\,[v]_3 + \cdots + \overline{[u]_m}\,[v]_m = \sum_{i=1}^{m}\overline{[u]_i}\,[v]_i$$
□

This operation is a bit different in that we begin with two vectors but produce a scalar. Computing one is straightforward.

Example CSIP Computing some inner products
The inner product of
$$u = \begin{bmatrix}2+3i\\5+2i\\-3+i\end{bmatrix} \qquad\text{and}\qquad v = \begin{bmatrix}1+2i\\-4+5i\\0+5i\end{bmatrix}$$
is
$$\begin{aligned}
\langle u, v\rangle &= \overline{(2+3i)}(1+2i) + \overline{(5+2i)}(-4+5i) + \overline{(-3+i)}(0+5i) \\
&= (2-3i)(1+2i) + (5-2i)(-4+5i) + (-3-i)(0+5i) \\
&= (8+i) + (-10+33i) + (5-15i) \\
&= 3+19i
\end{aligned}$$
The inner product of
$$w = \begin{bmatrix}2\\4\\-3\\2\\8\end{bmatrix} \qquad\text{and}\qquad x = \begin{bmatrix}3\\1\\0\\-1\\-2\end{bmatrix}$$
is
$$\langle w, x\rangle = \overline{2}(3) + \overline{4}(1) + \overline{(-3)}(0) + \overline{2}(-1) + \overline{8}(-2) = 2(3) + 4(1) + (-3)0 + 2(-1) + 8(-2) = -8$$ △

In the case where the entries of our vectors are all real numbers (as in the second part of Example CSIP), the computation of the inner product may look familiar and be known to you as a dot product or scalar product. So you can view the inner product as a generalization of the scalar product to vectors from $\mathbb{C}^m$ (rather than $\mathbb{R}^m$).

Note that we have chosen to conjugate the entries of the first vector listed in the inner product, while it is almost equally feasible to conjugate entries from the second vector instead. In particular, prior to Version 2.90, we did use the latter definition, and this has now changed to the former, with resulting adjustments propagated up through Section CB (only). However, conjugating the first vector leads to much nicer formulas for certain matrix decompositions and also shortens some proofs.
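As an aside we add here (not in the original text), NumPy's vdot follows the same convention, conjugating its first argument, so it can be used to spot-check inner products such as the first one in Example CSIP; a minimal sketch, assuming NumPy is available:

    import numpy as np

    u = np.array([2 + 3j, 5 + 2j, -3 + 1j])
    v = np.array([1 + 2j, -4 + 5j, 0 + 5j])

    # np.vdot conjugates its first argument, matching Definition IP.
    print(np.vdot(u, v))   # (3+19j), agreeing with Example CSIP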

There are several quick theorems we can now prove, and they will each be useful later.

Theorem IPVA Inner Product and Vector Addition
Suppose $u, v, w \in \mathbb{C}^m$. Then

1. $\langle u + v, w\rangle = \langle u, w\rangle + \langle v, w\rangle$

2. $\langle u, v + w\rangle = \langle u, v\rangle + \langle u, w\rangle$

Proof. The proofs of the two parts are very similar, with the second one requiring just a bit more effort due to the conjugation that occurs. We will prove part 1 and you can prove part 2 (Exercise O.T10).
$$\begin{aligned}
\langle u+v, w\rangle &= \sum_{i=1}^{m}\overline{[u+v]_i}\,[w]_i && \text{Definition IP} \\
&= \sum_{i=1}^{m}\overline{([u]_i + [v]_i)}\,[w]_i && \text{Definition CVA} \\
&= \sum_{i=1}^{m}\left(\overline{[u]_i} + \overline{[v]_i}\right)[w]_i && \text{Theorem CCRA} \\
&= \sum_{i=1}^{m}\left(\overline{[u]_i}\,[w]_i + \overline{[v]_i}\,[w]_i\right) && \text{Property DCN} \\
&= \sum_{i=1}^{m}\overline{[u]_i}\,[w]_i + \sum_{i=1}^{m}\overline{[v]_i}\,[w]_i && \text{Property CACN} \\
&= \langle u, w\rangle + \langle v, w\rangle && \text{Definition IP}
\end{aligned}$$ ■

Theorem IPSM Inner Product and Scalar Multiplication
Suppose $u, v \in \mathbb{C}^m$ and $\alpha \in \mathbb{C}$. Then

1. $\langle \alpha u, v\rangle = \overline{\alpha}\,\langle u, v\rangle$

2. $\langle u, \alpha v\rangle = \alpha\,\langle u, v\rangle$

Proof. The proofs of the two parts are very similar, with the second one requiring just a bit more effort due to the conjugation that occurs. We will prove part 1 and you can prove part 2 (Exercise O.T11).
$$\begin{aligned}
\langle \alpha u, v\rangle &= \sum_{i=1}^{m}\overline{[\alpha u]_i}\,[v]_i && \text{Definition IP} \\
&= \sum_{i=1}^{m}\overline{\alpha [u]_i}\,[v]_i && \text{Definition CVSM} \\
&= \sum_{i=1}^{m}\overline{\alpha}\;\overline{[u]_i}\,[v]_i && \text{Theorem CCRM} \\
&= \overline{\alpha}\sum_{i=1}^{m}\overline{[u]_i}\,[v]_i && \text{Property DCN} \\
&= \overline{\alpha}\,\langle u, v\rangle && \text{Definition IP}
\end{aligned}$$ ■

Theorem IPAC Inner Product is Anti-Commutative
Suppose that u and v are vectors in $\mathbb{C}^m$. Then $\langle u, v\rangle = \overline{\langle v, u\rangle}$.

Proof.
$$\begin{aligned}
\langle u, v\rangle &= \sum_{i=1}^{m}\overline{[u]_i}\,[v]_i && \text{Definition IP} \\
&= \sum_{i=1}^{m}\overline{[u]_i}\;\overline{\overline{[v]_i}} && \text{Theorem CCT} \\
&= \sum_{i=1}^{m}\overline{[u]_i\,\overline{[v]_i}} && \text{Theorem CCRM} \\
&= \overline{\left(\sum_{i=1}^{m}[u]_i\,\overline{[v]_i}\right)} && \text{Theorem CCRA} \\
&= \overline{\left(\sum_{i=1}^{m}\overline{[v]_i}\,[u]_i\right)} && \text{Property CMCN} \\
&= \overline{\langle v, u\rangle} && \text{Definition IP}
\end{aligned}$$ ■

Subsection N
Norm

If treating linear algebra in a more geometric fashion, the length of a vector occurs naturally, and is what you would expect from its name. With complex numbers, we will define a similar function. Recall that if c is a complex number, then |c| denotes its modulus (Definition MCN).

Definition NV Norm of a Vector
The norm of the vector u is the scalar quantity in $\mathbb{C}$
$$\|u\| = \sqrt{|[u]_1|^2 + |[u]_2|^2 + |[u]_3|^2 + \cdots + |[u]_m|^2} = \sqrt{\sum_{i=1}^{m}|[u]_i|^2}$$
□

Computing a norm is also easy to do.

Example CNSV Computing the norm of some vectors
The norm of
$$u = \begin{bmatrix}3+2i\\1-6i\\2+4i\\2+i\end{bmatrix}$$
is
$$\|u\| = \sqrt{|3+2i|^2 + |1-6i|^2 + |2+4i|^2 + |2+i|^2} = \sqrt{13+37+20+5} = \sqrt{75} = 5\sqrt{3}$$
The norm of
$$v = \begin{bmatrix}3\\-1\\2\\4\\-3\end{bmatrix}$$
is
$$\|v\| = \sqrt{|3|^2 + |{-1}|^2 + |2|^2 + |4|^2 + |{-3}|^2} = \sqrt{3^2 + 1^2 + 2^2 + 4^2 + 3^2} = \sqrt{39}$$ △

Notice how the norm of a vector with real number entries is just the length of the vector. Inner products and norms are related by the following theorem.

Theorem IPN Inner Products and Norms
Suppose that u is a vector in $\mathbb{C}^m$. Then $\|u\|^2 = \langle u, u\rangle$.

Proof.
$$\begin{aligned}
\|u\|^2 &= \left(\sqrt{\sum_{i=1}^{m}|[u]_i|^2}\right)^{2} && \text{Definition NV} \\
&= \sum_{i=1}^{m}|[u]_i|^2 && \text{Inverse functions} \\
&= \sum_{i=1}^{m}\overline{[u]_i}\,[u]_i && \text{Definition MCN} \\
&= \langle u, u\rangle && \text{Definition IP}
\end{aligned}$$ ■

When our vectors have entries only from the real numbers Theorem IPN says that the dot product of a vector with itself is equal to the length of the vector squared.
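A quick numerical check of Theorem IPN on the vector u of Example CNSV, an illustration we add assuming NumPy is available (np.linalg.norm computes the norm of Definition NV for complex vectors):

    import numpy as np

    u = np.array([3 + 2j, 1 - 6j, 2 + 4j, 2 + 1j])

    norm = np.linalg.norm(u)               # 5*sqrt(3), about 8.6603
    ip = np.vdot(u, u)                     # <u, u>, real up to roundoff
    print(np.isclose(norm**2, ip.real))    # True: ||u||^2 = <u, u>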

Theorem PIP Positive Inner Products
Suppose that u is a vector in $\mathbb{C}^m$. Then $\langle u, u\rangle \ge 0$ with equality if and only if $u = 0$.

Proof. From the proof of Theorem IPN we see that
$$\langle u, u\rangle = |[u]_1|^2 + |[u]_2|^2 + |[u]_3|^2 + \cdots + |[u]_m|^2$$
Since each modulus is squared, every term is nonnegative, and the sum must also be nonnegative. (Notice that in general the inner product is a complex number and cannot be compared with zero, but in the special case of ⟨u, u⟩ the result is a real number.)

The phrase, "with equality if and only if" means that we want to show that the statement ⟨u, u⟩ = 0 (i.e. with equality) is equivalent ("if and only if") to the statement u = 0.

If u = 0, then it is a straightforward computation to see that ⟨u, u⟩ = 0. In the other direction, assume that ⟨u, u⟩ = 0. As before, ⟨u, u⟩ is a sum of moduli. So we have
$$0 = \langle u, u\rangle = |[u]_1|^2 + |[u]_2|^2 + |[u]_3|^2 + \cdots + |[u]_m|^2$$
Now we have a sum of squares equaling zero, so each term must be zero. Then by similar logic, $|[u]_i| = 0$ will imply that $[u]_i = 0$, since $0 + 0i$ is the only complex number with zero modulus. Thus every entry of u is zero and so u = 0, as desired. ■

Notice that Theorem PIP contains three implications:
$$u \in \mathbb{C}^m \Rightarrow \langle u, u\rangle \ge 0 \qquad\quad u = 0 \Rightarrow \langle u, u\rangle = 0 \qquad\quad \langle u, u\rangle = 0 \Rightarrow u = 0$$
The results contained in Theorem PIP are summarized by saying "the inner product is positive definite."

Subsection OV
Orthogonal Vectors

"Orthogonal" is a generalization of "perpendicular." You may have used mutually perpendicular vectors in a physics class, or you may recall from a calculus class that perpendicular vectors have a zero dot product. We will now extend these ideas into the realm of higher dimensions and complex scalars.

Definition OV Orthogonal Vectors
A pair of vectors, u and v, from $\mathbb{C}^m$ are orthogonal if their inner product is zero, that is, $\langle u, v\rangle = 0$. □

Example TOV Two orthogonal vectors
The vectors
$$u = \begin{bmatrix}2+3i\\4-2i\\1+i\\1+i\end{bmatrix} \qquad v = \begin{bmatrix}1-i\\2+3i\\4-6i\\1\end{bmatrix}$$
are orthogonal since
$$\begin{aligned}
\langle u, v\rangle &= (2-3i)(1-i) + (4+2i)(2+3i) + (1-i)(4-6i) + (1-i)(1) \\
&= (-1-5i) + (2+16i) + (-2-10i) + (1-i) \\
&= 0 + 0i
\end{aligned}$$ △


We extend this definition to whole sets by requiring vectors to be pairwise orthogonal. Despite using the same word, careful thought about what objects you are using will eliminate any source of confusion.

Definition OSV Orthogonal Set of Vectors
Suppose that $S = \{u_1, u_2, u_3, \ldots, u_n\}$ is a set of vectors from $\mathbb{C}^m$. Then S is an orthogonal set if every pair of different vectors from S is orthogonal, that is $\langle u_i, u_j\rangle = 0$ whenever $i \ne j$. □

We now define the prototypical orthogonal set, which we will reference repeatedly.

Definition SUV Standard Unit Vectors
Let $e_j \in \mathbb{C}^m$, $1 \le j \le m$ denote the column vectors defined by
$$[e_j]_i = \begin{cases} 0 & \text{if } i \ne j \\ 1 & \text{if } i = j \end{cases}$$
Then the set
$$\{e_1, e_2, e_3, \ldots, e_m\} = \{e_j \mid 1 \le j \le m\}$$
is the set of standard unit vectors in $\mathbb{C}^m$. □

Notice that $e_j$ is identical to column j of the $m \times m$ identity matrix $I_m$ (Definition IM) and is a pivot column for $I_m$, since the identity matrix is in reduced row-echelon form. These observations will often be useful. We will reserve the notation $e_i$ for these vectors. It is not hard to see that the set of standard unit vectors is an orthogonal set.

Example SUVOS Standard Unit Vectors are an Orthogonal Set
Compute the inner product of two distinct vectors from the set of standard unit vectors (Definition SUV), say $e_i$, $e_j$, where $i \ne j$,
$$\begin{aligned}
\langle e_i, e_j\rangle &= \overline{0}\,0 + \overline{0}\,0 + \cdots + \overline{1}\,0 + \cdots + \overline{0}\,1 + \cdots + \overline{0}\,0 + \overline{0}\,0 \\
&= 0(0) + 0(0) + \cdots + 1(0) + \cdots + 0(1) + \cdots + 0(0) + 0(0) \\
&= 0
\end{aligned}$$
So the set $\{e_1, e_2, e_3, \ldots, e_m\}$ is an orthogonal set. △
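Since the standard unit vectors are the columns of the identity matrix, they are easy to build in code; a small sketch we add for illustration, assuming NumPy is available:

    import numpy as np

    m = 4
    e = [np.eye(m)[:, j] for j in range(m)]   # e_1, ..., e_m as columns of I_m
    print(np.vdot(e[0], e[2]))                # 0.0: distinct pairs are orthogonal
    print(np.vdot(e[1], e[1]))                # 1.0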

Example AOS An orthogonal set
The set
$$\{x_1, x_2, x_3, x_4\} = \left\{\begin{bmatrix}1+i\\1\\1-i\\i\end{bmatrix},\ \begin{bmatrix}1+5i\\6+5i\\-7-i\\1-6i\end{bmatrix},\ \begin{bmatrix}-7+34i\\-8-23i\\-10+22i\\30+13i\end{bmatrix},\ \begin{bmatrix}-2-4i\\6+i\\4+3i\\6-i\end{bmatrix}\right\}$$
is an orthogonal set.

Since the inner product is anti-commutative (Theorem IPAC) we can test pairs of different vectors in any order. If the result is zero, then it will also be zero if the inner product is computed in the opposite order. This means there are six different pairs of vectors to use in an inner product computation. We will do two and you can practice your inner products on the other four.
$$\begin{aligned}
\langle x_1, x_3\rangle &= (1-i)(-7+34i) + (1)(-8-23i) + (1+i)(-10+22i) + (-i)(30+13i) \\
&= (27+41i) + (-8-23i) + (-32+12i) + (13-30i) \\
&= 0 + 0i
\end{aligned}$$
and
$$\begin{aligned}
\langle x_2, x_4\rangle &= (1-5i)(-2-4i) + (6-5i)(6+i) + (-7+i)(4+3i) + (1+6i)(6-i) \\
&= (-22+6i) + (41-24i) + (-31-17i) + (12+35i) \\
&= 0 + 0i
\end{aligned}$$ △
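The remaining four inner products (Exercise O.C20 below) can also be checked mechanically; a sketch we add for illustration, assuming NumPy is available:

    import numpy as np

    x = [np.array([1 + 1j, 1, 1 - 1j, 1j]),
         np.array([1 + 5j, 6 + 5j, -7 - 1j, 1 - 6j]),
         np.array([-7 + 34j, -8 - 23j, -10 + 22j, 30 + 13j]),
         np.array([-2 - 4j, 6 + 1j, 4 + 3j, 6 - 1j])]

    # Every one of the six distinct pairs should report (0+0j).
    for i in range(4):
        for j in range(i + 1, 4):
            print(i + 1, j + 1, np.vdot(x[i], x[j]))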

So far, this section has seen lots of definitions, and lots of theorems establishing unsurprising consequences of those definitions. But here is our first theorem that suggests that inner products and orthogonal vectors have some utility. It is also one of our first illustrations of how to arrive at linear independence as the conclusion of a theorem.

Theorem OSLI Orthogonal Sets are Linearly Independent
Suppose that S is an orthogonal set of nonzero vectors. Then S is linearly independent.

Proof. Let $S = \{u_1, u_2, u_3, \ldots, u_n\}$ be an orthogonal set of nonzero vectors. To prove the linear independence of S, we can appeal to the definition (Definition LICV) and begin with an arbitrary relation of linear dependence (Definition RLDCV),
$$\alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n = 0.$$
Then, for every $1 \le i \le n$, we have
$$\begin{aligned}
\alpha_i\,\langle u_i, u_i\rangle &= \alpha_1(0) + \alpha_2(0) + \cdots + \alpha_i\,\langle u_i, u_i\rangle + \cdots + \alpha_n(0) && \text{Property ZCN} \\
&= \alpha_1\langle u_i, u_1\rangle + \cdots + \alpha_i\langle u_i, u_i\rangle + \cdots + \alpha_n\langle u_i, u_n\rangle && \text{Definition OSV} \\
&= \langle u_i, \alpha_1 u_1\rangle + \langle u_i, \alpha_2 u_2\rangle + \cdots + \langle u_i, \alpha_n u_n\rangle && \text{Theorem IPSM} \\
&= \langle u_i,\ \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n\rangle && \text{Theorem IPVA} \\
&= \langle u_i, 0\rangle && \text{Definition RLDCV} \\
&= 0 && \text{Definition IP}
\end{aligned}$$
Because $u_i$ was assumed to be nonzero, Theorem PIP says $\langle u_i, u_i\rangle$ is nonzero and thus $\alpha_i$ must be zero. So we conclude that $\alpha_i = 0$ for all $1 \le i \le n$ in any relation of linear dependence on S. But this says that S is a linearly independent set since the only way to form a relation of linear dependence is the trivial way (Definition LICV). Boom! ■


Subsection GSP
Gram-Schmidt Procedure

The Gram-Schmidt Procedure is really a theorem. It says that if we begin with a linearly independent set of p vectors, S, then we can do a number of calculations with these vectors and produce an orthogonal set of p vectors, T, so that $\langle S\rangle = \langle T\rangle$. Given the large number of computations involved, it is indeed a procedure to do all the necessary computations, and it is best employed on a computer. However, it also has value in proofs where we may on occasion wish to replace a linearly independent set by an orthogonal set.

This is our first occasion to use the technique of "mathematical induction" for a proof, a technique we will see again several times, especially in Chapter D. So study the simple example described in Proof Technique I first.

Theorem GSP Gram-Schmidt Procedure
Suppose that $S = \{v_1, v_2, v_3, \ldots, v_p\}$ is a linearly independent set of vectors in $\mathbb{C}^m$. Define the vectors $u_i$, $1 \le i \le p$ by
$$u_i = v_i - \frac{\langle u_1, v_i\rangle}{\langle u_1, u_1\rangle}\,u_1 - \frac{\langle u_2, v_i\rangle}{\langle u_2, u_2\rangle}\,u_2 - \frac{\langle u_3, v_i\rangle}{\langle u_3, u_3\rangle}\,u_3 - \cdots - \frac{\langle u_{i-1}, v_i\rangle}{\langle u_{i-1}, u_{i-1}\rangle}\,u_{i-1}$$
Let $T = \{u_1, u_2, u_3, \ldots, u_p\}$. Then T is an orthogonal set of nonzero vectors, and $\langle T\rangle = \langle S\rangle$.

Proof. We will prove the result by using induction on p (Proof Technique I). To begin, we prove that T has the desired properties when p = 1. In this case $u_1 = v_1$ and $T = \{u_1\} = \{v_1\} = S$. Because S and T are equal, ⟨S⟩ = ⟨T⟩. Equally trivial, T is an orthogonal set. If $u_1 = 0$, then S would be a linearly dependent set, a contradiction.

Suppose that the theorem is true for any set of p − 1 linearly independent vectors. Let $S = \{v_1, v_2, v_3, \ldots, v_p\}$ be a linearly independent set of p vectors. Then $S' = \{v_1, v_2, v_3, \ldots, v_{p-1}\}$ is also linearly independent. So we can apply the theorem to S′ and construct the vectors $T' = \{u_1, u_2, u_3, \ldots, u_{p-1}\}$. T′ is therefore an orthogonal set of nonzero vectors and ⟨S′⟩ = ⟨T′⟩. Define
$$u_p = v_p - \frac{\langle u_1, v_p\rangle}{\langle u_1, u_1\rangle}\,u_1 - \frac{\langle u_2, v_p\rangle}{\langle u_2, u_2\rangle}\,u_2 - \frac{\langle u_3, v_p\rangle}{\langle u_3, u_3\rangle}\,u_3 - \cdots - \frac{\langle u_{p-1}, v_p\rangle}{\langle u_{p-1}, u_{p-1}\rangle}\,u_{p-1}$$
and let $T = T' \cup \{u_p\}$. We need to now show that T has several properties by building on what we know about T′. But first notice that the above equation has no problems with the denominators ($\langle u_i, u_i\rangle$) being zero, since the $u_i$ are from T′, which is composed of nonzero vectors.

We show that ⟨T⟩ = ⟨S⟩, by first establishing that ⟨T⟩ ⊆ ⟨S⟩. Suppose x ∈ ⟨T⟩, so
$$x = a_1 u_1 + a_2 u_2 + a_3 u_3 + \cdots + a_p u_p$$
The term $a_p u_p$ is a linear combination of vectors from T′ and the vector $v_p$, while the remaining terms are a linear combination of vectors from T′. Since ⟨T′⟩ = ⟨S′⟩, any term that is a multiple of a vector from T′ can be rewritten as a linear combination of vectors from S′. The remaining term $a_p v_p$ is a multiple of a vector in S. So we see that x can be rewritten as a linear combination of vectors from S, i.e. x ∈ ⟨S⟩.

To show that ⟨S⟩ ⊆ ⟨T⟩, begin with y ∈ ⟨S⟩, so
$$y = a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_p v_p$$
Rearrange our defining equation for $u_p$ by solving for $v_p$. Then the term $a_p v_p$ is a multiple of a linear combination of elements of T. The remaining terms are a linear combination of $v_1, v_2, v_3, \ldots, v_{p-1}$, hence an element of ⟨S′⟩ = ⟨T′⟩. Thus these remaining terms can be written as a linear combination of the vectors in T′. So y is a linear combination of vectors from T, i.e. y ∈ ⟨T⟩.

The elements of T′ are nonzero, but what about $u_p$? Suppose to the contrary that $u_p = 0$,
$$0 = u_p = v_p - \frac{\langle u_1, v_p\rangle}{\langle u_1, u_1\rangle}\,u_1 - \frac{\langle u_2, v_p\rangle}{\langle u_2, u_2\rangle}\,u_2 - \frac{\langle u_3, v_p\rangle}{\langle u_3, u_3\rangle}\,u_3 - \cdots - \frac{\langle u_{p-1}, v_p\rangle}{\langle u_{p-1}, u_{p-1}\rangle}\,u_{p-1}$$
$$v_p = \frac{\langle u_1, v_p\rangle}{\langle u_1, u_1\rangle}\,u_1 + \frac{\langle u_2, v_p\rangle}{\langle u_2, u_2\rangle}\,u_2 + \frac{\langle u_3, v_p\rangle}{\langle u_3, u_3\rangle}\,u_3 + \cdots + \frac{\langle u_{p-1}, v_p\rangle}{\langle u_{p-1}, u_{p-1}\rangle}\,u_{p-1}$$
Since ⟨S′⟩ = ⟨T′⟩ we can write the vectors $u_1, u_2, u_3, \ldots, u_{p-1}$ on the right side of this equation in terms of the vectors $v_1, v_2, v_3, \ldots, v_{p-1}$ and we then have the vector $v_p$ expressed as a linear combination of the other p − 1 vectors in S, implying that S is a linearly dependent set (Theorem DLDS), contrary to our lone hypothesis about S.

Finally, it is a simple matter to establish that T is an orthogonal set, though it will not appear so simple looking. Think about your objects as you work through the following: what is a vector and what is a scalar. Since T′ is an orthogonal set by induction, most pairs of elements in T are already known to be orthogonal. We just need to test "new" inner products, between $u_p$ and $u_i$, for $1 \le i \le p-1$. Here we go, using summation notation,
$$\begin{aligned}
\langle u_i, u_p\rangle &= \left\langle u_i,\; v_p - \sum_{k=1}^{p-1}\frac{\langle u_k, v_p\rangle}{\langle u_k, u_k\rangle}\,u_k\right\rangle \\
&= \langle u_i, v_p\rangle - \left\langle u_i,\; \sum_{k=1}^{p-1}\frac{\langle u_k, v_p\rangle}{\langle u_k, u_k\rangle}\,u_k\right\rangle && \text{Theorem IPVA} \\
&= \langle u_i, v_p\rangle - \sum_{k=1}^{p-1}\left\langle u_i,\; \frac{\langle u_k, v_p\rangle}{\langle u_k, u_k\rangle}\,u_k\right\rangle && \text{Theorem IPVA} \\
&= \langle u_i, v_p\rangle - \sum_{k=1}^{p-1}\frac{\langle u_k, v_p\rangle}{\langle u_k, u_k\rangle}\,\langle u_i, u_k\rangle && \text{Theorem IPSM} \\
&= \langle u_i, v_p\rangle - \frac{\langle u_i, v_p\rangle}{\langle u_i, u_i\rangle}\,\langle u_i, u_i\rangle - \sum_{k\ne i}\frac{\langle u_k, v_p\rangle}{\langle u_k, u_k\rangle}\,(0) && \text{Induction Hypothesis} \\
&= \langle u_i, v_p\rangle - \langle u_i, v_p\rangle - \sum_{k\ne i} 0 \\
&= 0
\end{aligned}$$ ■

Example GSTV Gram-Schmidt of three vectors
We will illustrate the Gram-Schmidt process with three vectors. Begin with the linearly independent (check this!) set
$$S = \{v_1, v_2, v_3\} = \left\{\begin{bmatrix}1\\1+i\\1\end{bmatrix},\ \begin{bmatrix}-i\\1\\1+i\end{bmatrix},\ \begin{bmatrix}0\\i\\i\end{bmatrix}\right\}$$
Then
$$u_1 = v_1 = \begin{bmatrix}1\\1+i\\1\end{bmatrix}$$
$$u_2 = v_2 - \frac{\langle u_1, v_2\rangle}{\langle u_1, u_1\rangle}\,u_1 = \frac{1}{4}\begin{bmatrix}-2-3i\\1-i\\2+5i\end{bmatrix}$$
and
$$u_3 = v_3 - \frac{\langle u_1, v_3\rangle}{\langle u_1, u_1\rangle}\,u_1 - \frac{\langle u_2, v_3\rangle}{\langle u_2, u_2\rangle}\,u_2 = \frac{1}{11}\begin{bmatrix}-3-i\\1+3i\\-1-i\end{bmatrix}$$
Then
$$T = \{u_1, u_2, u_3\} = \left\{\begin{bmatrix}1\\1+i\\1\end{bmatrix},\ \frac{1}{4}\begin{bmatrix}-2-3i\\1-i\\2+5i\end{bmatrix},\ \frac{1}{11}\begin{bmatrix}-3-i\\1+3i\\-1-i\end{bmatrix}\right\}$$
is an orthogonal set (which you can check) of nonzero vectors and ⟨T⟩ = ⟨S⟩ (all by Theorem GSP). Of course, as a by-product of orthogonality, the set T is also linearly independent (Theorem OSLI). △
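A direct transcription of the formula in Theorem GSP into code reproduces these vectors; our sketch, assuming NumPy is available (np.vdot conjugates its first argument, matching the book's inner product):

    import numpy as np

    def gram_schmidt(vectors):
        """Orthogonalize a linearly independent list of complex vectors,
        following the formula of Theorem GSP."""
        us = []
        for v in vectors:
            u = v.astype(complex)
            for w in us:
                u = u - (np.vdot(w, v) / np.vdot(w, w)) * w
            us.append(u)
        return us

    S = [np.array([1, 1 + 1j, 1]),
         np.array([-1j, 1, 1 + 1j]),
         np.array([0, 1j, 1j])]
    T = gram_schmidt(S)
    print(np.round(4 * T[1], 10))              # [-2.-3.j  1.-1.j  2.+5.j], i.e. 4*u_2
    print(np.round(np.vdot(T[0], T[1]), 10))   # 0j, as Theorem GSP promises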

One final definition related to orthogonal vectors.

Definition ONS OrthoNormal Set
Suppose $S = \{u_1, u_2, u_3, \ldots, u_n\}$ is an orthogonal set of vectors such that $\|u_i\| = 1$ for all $1 \le i \le n$. Then S is an orthonormal set of vectors. □

Once you have an orthogonal set, it is easy to convert it to an orthonormal set: multiply each vector by the reciprocal of its norm, and the resulting vector will have norm 1. This scaling of each vector will not affect the orthogonality properties (apply Theorem IPSM).

Example ONTV Orthonormal set, three vectors
The set
$$T = \{u_1, u_2, u_3\} = \left\{\begin{bmatrix}1\\1+i\\1\end{bmatrix},\ \frac{1}{4}\begin{bmatrix}-2-3i\\1-i\\2+5i\end{bmatrix},\ \frac{1}{11}\begin{bmatrix}-3-i\\1+3i\\-1-i\end{bmatrix}\right\}$$
from Example GSTV is an orthogonal set.

We compute the norm of each vector,
$$\|u_1\| = 2 \qquad\quad \|u_2\| = \frac{\sqrt{11}}{2} \qquad\quad \|u_3\| = \frac{\sqrt{2}}{\sqrt{11}}$$
Converting each vector to a norm of 1, yields an orthonormal set,
$$w_1 = \frac{1}{2}\begin{bmatrix}1\\1+i\\1\end{bmatrix}$$
$$w_2 = \frac{1}{\frac{\sqrt{11}}{2}}\left(\frac{1}{4}\begin{bmatrix}-2-3i\\1-i\\2+5i\end{bmatrix}\right) = \frac{1}{2\sqrt{11}}\begin{bmatrix}-2-3i\\1-i\\2+5i\end{bmatrix}$$
$$w_3 = \frac{1}{\frac{\sqrt{2}}{\sqrt{11}}}\left(\frac{1}{11}\begin{bmatrix}-3-i\\1+3i\\-1-i\end{bmatrix}\right) = \frac{1}{\sqrt{22}}\begin{bmatrix}-3-i\\1+3i\\-1-i\end{bmatrix}$$ △
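The same normalization is one line per vector in code; a sketch we add, assuming NumPy is available, starting from the orthogonal set T written out above:

    import numpy as np

    T = [np.array([1, 1 + 1j, 1]),
         np.array([-2 - 3j, 1 - 1j, 2 + 5j]) / 4,
         np.array([-3 - 1j, 1 + 3j, -1 - 1j]) / 11]

    W = [u / np.linalg.norm(u) for u in T]             # scale each vector to norm 1
    print([round(np.linalg.norm(w), 10) for w in W])   # [1.0, 1.0, 1.0]
    print(np.round(np.vdot(W[0], W[1]), 10))           # 0j: scaling keeps orthogonality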

Example ONFV Orthonormal set, four vectors
As an exercise convert the linearly independent set
$$S = \left\{\begin{bmatrix}1+i\\1\\1-i\\i\end{bmatrix},\ \begin{bmatrix}i\\1+i\\-1\\-i\end{bmatrix},\ \begin{bmatrix}i\\-i\\-1+i\\1\end{bmatrix},\ \begin{bmatrix}-1-i\\i\\1\\-1\end{bmatrix}\right\}$$
to an orthogonal set via the Gram-Schmidt Process (Theorem GSP) and then scale the vectors to norm 1 to create an orthonormal set. You should get the same set you would if you scaled the orthogonal set of Example AOS to become an orthonormal set. △

We will see orthonormal sets again in Subsection MINM.UM. They are intimately related to unitary matrices (Definition UM) through Theorem CUMOS. Some of the utility of orthonormal sets is captured by Theorem COB in Subsection B.OBC. Orthonormal sets appear once again in Section OD where they are key in orthonormal diagonalization.


Reading Questions

1. Is the set
$$\left\{\begin{bmatrix}1\\-1\\2\end{bmatrix},\ \begin{bmatrix}5\\3\\-1\end{bmatrix},\ \begin{bmatrix}8\\4\\-2\end{bmatrix}\right\}$$
an orthogonal set? Why?

2. What is the distinction between an orthogonal set and an orthonormal set?

3. What is nice about the output of the Gram-Schmidt process?

Exercises

C20 Complete Example AOS by verifying that the four remaining inner products are zero.

C21 Verify that the set T created in Example GSTV by the Gram-Schmidt Procedure is an orthogonal set.

M60 Suppose that $\{u, v, w\} \subseteq \mathbb{C}^n$ is an orthonormal set. Prove that u + v is not orthogonal to v + w.

T10 Prove part 2 of the conclusion of Theorem IPVA.

T11 Prove part 2 of the conclusion of Theorem IPSM.

T20† Suppose that $u, v, w \in \mathbb{C}^n$, $\alpha, \beta \in \mathbb{C}$ and u is orthogonal to both v and w. Prove that u is orthogonal to $\alpha v + \beta w$.

T30 Suppose that the set S in the hypothesis of Theorem GSP is not just linearly independent, but is also orthogonal. Prove that the set T created by the Gram-Schmidt procedure is equal to S. (Note that we are getting a stronger conclusion than ⟨T⟩ = ⟨S⟩; the conclusion is that T = S.) In other words, it is pointless to apply the Gram-Schmidt procedure to a set that is already orthogonal.

T31 Suppose that the set S is linearly independent. Apply the Gram-Schmidt procedure (Theorem GSP) twice, creating first the linearly independent set $T_1$ from S, and then creating $T_2$ from $T_1$. As a consequence of Exercise O.T30, prove that $T_1 = T_2$. In other words, it is pointless to apply the Gram-Schmidt procedure twice.
words, it is po<strong>in</strong>tless to apply the Gram-Schmidt procedure twice.


Chapter M
Matrices

We have made frequent use of matrices for solving systems of equations, and we have begun to investigate a few of their properties, such as the null space and nonsingularity. In this chapter, we will take a more systematic approach to the study of matrices.

Section MO
Matrix Operations

In this section we will back up and start simple. We begin with a definition of a totally general set of matrices, and see where that takes us.

Subsection MEASM
Matrix Equality, Addition, Scalar Multiplication

Definition VSM Vector Space of m × n Matrices
The vector space $M_{mn}$ is the set of all m × n matrices with entries from the set of complex numbers. □

Just as we made, and used, a careful definition of equality for column vectors, so too, we have precise definitions for matrices.

Definition ME Matrix Equality
The m × n matrices A and B are equal, written A = B, provided $[A]_{ij} = [B]_{ij}$ for all $1 \le i \le m$, $1 \le j \le n$. □

So equality of matrices translates to the equality of complex numbers, on an entry-by-entry basis. Notice that we now have yet another definition that uses the symbol "=" for shorthand. Whenever a theorem has a conclusion saying two matrices are equal (think about your objects), we will consider appealing to this definition as a way of formulating the top-level structure of the proof.

We will now define two operations on the set $M_{mn}$. Again, we will overload a symbol ('+') and a convention (juxtaposition for scalar multiplication).

Definition MA Matrix Addition
Given the m × n matrices A and B, define the sum of A and B as an m × n matrix, written A + B, according to
$$[A+B]_{ij} = [A]_{ij} + [B]_{ij} \qquad 1 \le i \le m,\ 1 \le j \le n$$
□

So matrix addition takes two matrices of the same size and combines them (in a natural way!) to create a new matrix of the same size. Perhaps this is the "obvious" thing to do, but it does not relieve us from the obligation to state it carefully.

Example MA Addition of two matrices in $M_{23}$
If
$$A = \begin{bmatrix}2&-3&4\\1&0&-7\end{bmatrix} \qquad B = \begin{bmatrix}6&2&-4\\3&5&2\end{bmatrix}$$
then
$$A + B = \begin{bmatrix}2&-3&4\\1&0&-7\end{bmatrix} + \begin{bmatrix}6&2&-4\\3&5&2\end{bmatrix} = \begin{bmatrix}2+6&-3+2&4+(-4)\\1+3&0+5&-7+2\end{bmatrix} = \begin{bmatrix}8&-1&0\\4&5&-5\end{bmatrix}$$ △

Our second operation takes two objects of different types, specifically a number and a matrix, and combines them to create another matrix. As with vectors, in this context we call a number a scalar in order to emphasize that it is not a matrix.

Definition MSM Matrix Scalar Multiplication
Given the m × n matrix A and the scalar $\alpha \in \mathbb{C}$, the scalar multiple of A is an m × n matrix, written αA and defined according to
$$[\alpha A]_{ij} = \alpha\,[A]_{ij} \qquad 1 \le i \le m,\ 1 \le j \le n$$
□

Notice again that we have yet another kind of multiplication, and it is again written putting two symbols side-by-side. Computationally, scalar matrix multiplication is very easy.

Example MSM Scalar multiplication in $M_{32}$
If
$$A = \begin{bmatrix}2&8\\-3&5\\0&1\end{bmatrix}$$
and α = 7, then
$$\alpha A = 7\begin{bmatrix}2&8\\-3&5\\0&1\end{bmatrix} = \begin{bmatrix}7(2)&7(8)\\7(-3)&7(5)\\7(0)&7(1)\end{bmatrix} = \begin{bmatrix}14&56\\-21&35\\0&7\end{bmatrix}$$ △
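Both operations are entrywise, so they map directly onto array arithmetic; a small sketch we add for illustration, assuming NumPy is available, reproducing Example MA and Example MSM:

    import numpy as np

    A = np.array([[2, -3, 4],
                  [1,  0, -7]])
    B = np.array([[6, 2, -4],
                  [3, 5,  2]])
    print(A + B)        # [[ 8 -1  0], [ 4  5 -5]], as in Example MA

    C = np.array([[2, 8],
                  [-3, 5],
                  [0, 1]])
    print(7 * C)        # [[ 14  56], [-21  35], [  0   7]], as in Example MSM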

Subsection VSP
Vector Space Properties

With definitions of matrix addition and scalar multiplication we can now state, and prove, several properties of each operation, and some properties that involve their interplay. We now collect ten of them here for later reference.

Theorem VSPM Vector Space Properties of Matrices
Suppose that $M_{mn}$ is the set of all m × n matrices (Definition VSM) with addition and scalar multiplication as defined in Definition MA and Definition MSM. Then

• ACM Additive Closure, Matrices
If $A, B \in M_{mn}$, then $A + B \in M_{mn}$.

• SCM Scalar Closure, Matrices
If $\alpha \in \mathbb{C}$ and $A \in M_{mn}$, then $\alpha A \in M_{mn}$.

• CM Commutativity, Matrices
If $A, B \in M_{mn}$, then $A + B = B + A$.

• AAM Additive Associativity, Matrices
If $A, B, C \in M_{mn}$, then $A + (B + C) = (A + B) + C$.

• ZM Zero Matrix, Matrices
There is a matrix, O, called the zero matrix, such that $A + O = A$ for all $A \in M_{mn}$.

• AIM Additive Inverses, Matrices
If $A \in M_{mn}$, then there exists a matrix $-A \in M_{mn}$ so that $A + (-A) = O$.

• SMAM Scalar Multiplication Associativity, Matrices
If $\alpha, \beta \in \mathbb{C}$ and $A \in M_{mn}$, then $\alpha(\beta A) = (\alpha\beta)A$.

• DMAM Distributivity across Matrix Addition, Matrices
If $\alpha \in \mathbb{C}$ and $A, B \in M_{mn}$, then $\alpha(A + B) = \alpha A + \alpha B$.

• DSAM Distributivity across Scalar Addition, Matrices
If $\alpha, \beta \in \mathbb{C}$ and $A \in M_{mn}$, then $(\alpha + \beta)A = \alpha A + \beta A$.

• OM One, Matrices
If $A \in M_{mn}$, then $1A = A$.

Proof. While some of these properties seem very obvious, they all require proof. However, the proofs are not very interesting, and border on tedious. We will prove one version of distributivity very carefully, and you can test your proof-building skills on some of the others. We will give our new notation for matrix entries a workout here. Compare the style of the proofs here with those given for vectors in Theorem VSPCV — while the objects here are more complicated, our notation makes the proofs cleaner.

To prove Property DSAM, (α + β)A = αA + βA, we need to establish the equality of two matrices (see Proof Technique GS). Definition ME says we need to establish the equality of their entries, one-by-one. How do we do this, when we do not even know how many entries the two matrices might have? This is where the notation for matrix entries, given in Definition M, comes into play. Ready? Here we go.

For any i and j, 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
[(\alpha + \beta)A]_{ij} &= (\alpha + \beta)[A]_{ij} && \text{Definition MSM}\\
&= \alpha [A]_{ij} + \beta [A]_{ij} && \text{Distributivity in } \mathbb{C}\\
&= [\alpha A]_{ij} + [\beta A]_{ij} && \text{Definition MSM}\\
&= [\alpha A + \beta A]_{ij} && \text{Definition MA}
\end{aligned}\]
There are several things to notice here. (1) Each equals sign is an equality of scalars (numbers). (2) The two ends of the equation, being true for any i and j, allow us to conclude the equality of the matrices by Definition ME. (3) There are several plus signs, and several instances of juxtaposition. Identify each one, and state exactly what operation is being represented by each. ■
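Properties like DSAM are also easy to spot-check numerically. Here is a brief sketch in Python with NumPy (an illustration we add, not part of the text's development); of course a computation on one example is evidence, not a proof.

```python
import numpy as np

alpha, beta = 4, -2
A = np.array([[2, -3, 4],
              [1,  0, -7]])

lhs = (alpha + beta) * A          # (alpha + beta)A
rhs = alpha * A + beta * A        # alpha A + beta A  (Property DSAM)
print(np.array_equal(lhs, rhs))   # True
```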

For now, note the similarities between Theorem VSPM about matrices and Theorem VSPCV about vectors.

The zero matrix described in this theorem, O, is what you would expect — a matrix full of zeros.

Definition ZM Zero Matrix
The m × n zero matrix is written as O = O_{m×n} and defined by [O]_{ij} = 0, for all 1 ≤ i ≤ m, 1 ≤ j ≤ n. □



Subsection TSM
Transposes and Symmetric Matrices

We describe one more common operation we can perform on matrices. Informally, to transpose a matrix is to build a new matrix by swapping its rows and columns.

Definition TM Transpose of a Matrix
Given an m × n matrix A, its transpose is the n × m matrix A^t given by
\[ [A^t]_{ij} = [A]_{ji}, \qquad 1 \le i \le n,\ 1 \le j \le m. \]
□

Example TM Transpose of a 3 × 4 matrix
Suppose
\[ D = \begin{bmatrix} 3 & 7 & 2 & -3 \\ -1 & 4 & 2 & 8 \\ 0 & 3 & -2 & 5 \end{bmatrix}. \]
We could formulate the transpose, entry-by-entry, using the definition. But it is easier to just systematically rewrite rows as columns (or vice-versa). The form of the definition given will be more useful in proofs. So we have
\[ D^t = \begin{bmatrix} 3 & -1 & 0 \\ 7 & 4 & 3 \\ 2 & 2 & -2 \\ -3 & 8 & 5 \end{bmatrix} \]
△

It will sometimes happen that a matrix is equal to its transpose. In this case, we will call a matrix symmetric. These matrices occur naturally in certain situations, and also have some nice properties, so it is worth stating the definition carefully. Informally a matrix is symmetric if we can "flip" it about the main diagonal (upper-left corner, running down to the lower-right corner) and have it look unchanged.

Definition SYM Symmetric Matrix
The matrix A is symmetric if A = A^t. □

Example SYM A symmetric 5 × 5 matrix
The matrix
\[ E = \begin{bmatrix} 2 & 3 & -9 & 5 & 7 \\ 3 & 1 & 6 & -2 & -3 \\ -9 & 6 & 0 & -1 & 9 \\ 5 & -2 & -1 & 4 & -8 \\ 7 & -3 & 9 & -8 & -3 \end{bmatrix} \]
is symmetric. △
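Checking symmetry amounts to comparing a matrix with its transpose, entry-by-entry. A short Python/NumPy sketch (again ours, purely for illustration) does this for the matrices D and E above.

```python
import numpy as np

# The matrix D from Example TM and its transpose
D = np.array([[3, 7, 2, -3],
              [-1, 4, 2, 8],
              [0, 3, -2, 5]])
print(D.T)          # the 4 x 3 transpose: rows rewritten as columns

# The matrix E from Example SYM: symmetric means E equals its transpose
E = np.array([[2, 3, -9, 5, 7],
              [3, 1, 6, -2, -3],
              [-9, 6, 0, -1, 9],
              [5, -2, -1, 4, -8],
              [7, -3, 9, -8, -3]])
print(np.array_equal(E, E.T))   # True, so E is symmetric (Definition SYM)
```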

You might have noticed that Definition SYM did not specify the size of the matrix A, as has been our custom. That is because it was not necessary. An alternative would have been to state the definition just for square matrices, but this is the substance of the next proof.

Before reading the next proof, we want to offer you some advice about how to become more proficient at constructing proofs. Perhaps you can apply this advice to the next theorem. Have a peek at Proof Technique P now.

Theorem SMS Symmetric Matrices are Square
Suppose that A is a symmetric matrix. Then A is square.

Proof. We start by specifying A's size, without assuming it is square, since we are trying to prove that, so we cannot also assume it. Suppose A is an m × n matrix. Because A is symmetric, we know by Definition SYM that A = A^t. So, in particular, Definition ME requires that A and A^t must have the same size. The size of A^t is n × m. Because A has m rows and A^t has n rows, we conclude that m = n, and hence A must be square by Definition SQM. ■

We finish this section with three easy theorems, but they illustrate the interplay of our three new operations, our new notation, and the techniques used to prove matrix equalities.

Theorem TMA Transpose and Matrix Addition
Suppose that A and B are m × n matrices. Then (A + B)^t = A^t + B^t.

Proof. The statement to be proved is an equality of matrices, so we work entry-by-entry and use Definition ME. Think carefully about the objects involved here, and the many uses of the plus sign. Realize too that while A and B are m × n matrices, the conclusion is a statement about the equality of two n × m matrices. So we begin with: for 1 ≤ i ≤ n, 1 ≤ j ≤ m,
\[\begin{aligned}
[(A+B)^t]_{ij} &= [A+B]_{ji} && \text{Definition TM}\\
&= [A]_{ji} + [B]_{ji} && \text{Definition MA}\\
&= [A^t]_{ij} + [B^t]_{ij} && \text{Definition TM}\\
&= [A^t + B^t]_{ij} && \text{Definition MA}
\end{aligned}\]
Since the matrices (A + B)^t and A^t + B^t agree at each entry, Definition ME tells us the two matrices are equal. ■

Theorem TMSM Transpose and Matrix Scalar Multiplication
Suppose that α ∈ C and A is an m × n matrix. Then (αA)^t = αA^t.

Proof. The statement to be proved is an equality of matrices, so we work entry-by-entry and use Definition ME. Notice that the desired equality is of n × m matrices, and think carefully about the objects involved here, plus the many uses of juxtaposition. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
[(\alpha A)^t]_{ji} &= [\alpha A]_{ij} && \text{Definition TM}\\
&= \alpha [A]_{ij} && \text{Definition MSM}\\
&= \alpha [A^t]_{ji} && \text{Definition TM}\\
&= [\alpha A^t]_{ji} && \text{Definition MSM}
\end{aligned}\]
Since the matrices (αA)^t and αA^t agree at each entry, Definition ME tells us the two matrices are equal. ■

Theorem TT Transpose of a Transpose
Suppose that A is an m × n matrix. Then (A^t)^t = A.

Proof. We again want to prove an equality of matrices, so we work entry-by-entry and use Definition ME. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
[(A^t)^t]_{ij} &= [A^t]_{ji} && \text{Definition TM}\\
&= [A]_{ij} && \text{Definition TM}
\end{aligned}\]
Since the matrices (A^t)^t and A agree at each entry, Definition ME tells us the two matrices are equal. ■

Subsection MCC
Matrices and Complex Conjugation

As we did with vectors (Definition CCCV), we can define what it means to take the conjugate of a matrix.

Definition CCM Complex Conjugate of a Matrix
Suppose A is an m × n matrix. Then the conjugate of A, written $\overline{A}$, is an m × n matrix defined by
\[ \left[\overline{A}\right]_{ij} = \overline{[A]_{ij}} \]
□

Example CCM Complex conjugate of a matrix
If
\[ A = \begin{bmatrix} 2-i & 3 & 5+4i \\ -3+6i & 2-3i & 0 \end{bmatrix} \]
then
\[ \overline{A} = \begin{bmatrix} 2+i & 3 & 5-4i \\ -3-6i & 2+3i & 0 \end{bmatrix} \]
△
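In NumPy the entry-wise conjugate of Definition CCM is `np.conj` (a sketch we add for illustration; Python writes the imaginary unit as `j`).

```python
import numpy as np

# The matrix A from Example CCM
A = np.array([[2 - 1j, 3, 5 + 4j],
              [-3 + 6j, 2 - 3j, 0]])
print(np.conj(A))   # entry-wise conjugate, matching Definition CCM
```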

The <strong>in</strong>terplay between the conjugate of a matrix and the two operations on<br />

matrices is what you might expect.<br />

Theorem CRMA Conjugation Respects Matrix Addition
Suppose that A and B are m × n matrices. Then $\overline{A+B} = \overline{A} + \overline{B}$.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
\left[\overline{A+B}\right]_{ij} &= \overline{[A+B]_{ij}} && \text{Definition CCM}\\
&= \overline{[A]_{ij} + [B]_{ij}} && \text{Definition MA}\\
&= \overline{[A]_{ij}} + \overline{[B]_{ij}} && \text{Theorem CCRA}\\
&= \left[\overline{A}\right]_{ij} + \left[\overline{B}\right]_{ij} && \text{Definition CCM}\\
&= \left[\overline{A} + \overline{B}\right]_{ij} && \text{Definition MA}
\end{aligned}\]
Since the matrices $\overline{A+B}$ and $\overline{A} + \overline{B}$ are equal in each entry, Definition ME says that $\overline{A+B} = \overline{A} + \overline{B}$. ■

Theorem CRMSM Conjugation Respects Matrix Scalar Multiplication
Suppose that α ∈ C and A is an m × n matrix. Then $\overline{\alpha A} = \overline{\alpha}\,\overline{A}$.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
\left[\overline{\alpha A}\right]_{ij} &= \overline{[\alpha A]_{ij}} && \text{Definition CCM}\\
&= \overline{\alpha [A]_{ij}} && \text{Definition MSM}\\
&= \overline{\alpha}\,\overline{[A]_{ij}} && \text{Theorem CCRM}\\
&= \overline{\alpha}\left[\overline{A}\right]_{ij} && \text{Definition CCM}\\
&= \left[\overline{\alpha}\,\overline{A}\right]_{ij} && \text{Definition MSM}
\end{aligned}\]
Since the matrices $\overline{\alpha A}$ and $\overline{\alpha}\,\overline{A}$ are equal in each entry, Definition ME says that $\overline{\alpha A} = \overline{\alpha}\,\overline{A}$. ■

Theorem CCM Conjugate of the Conjugate of a Matrix
Suppose that A is an m × n matrix. Then $\overline{\left(\overline{A}\right)} = A$.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
\left[\overline{\left(\overline{A}\right)}\right]_{ij} &= \overline{\left[\overline{A}\right]_{ij}} && \text{Definition CCM}\\
&= \overline{\overline{[A]_{ij}}} && \text{Definition CCM}\\
&= [A]_{ij} && \text{Theorem CCT}
\end{aligned}\]
Since the matrices $\overline{\left(\overline{A}\right)}$ and A are equal in each entry, Definition ME says that $\overline{\left(\overline{A}\right)} = A$. ■

F<strong>in</strong>ally, we will need the follow<strong>in</strong>g result about matrix conjugation and transposes<br />

later.<br />

Theorem MCT Matrix Conjugation and Transposes<br />

Suppose that A is an m × n matrix. Then (A t )= ( A ) t<br />

.<br />

Proof. For 1 ≤ i ≤ m, 1≤ j ≤ n,<br />

[ ]<br />

(A t ) = ji [At ] ji<br />

Def<strong>in</strong>ition CCM<br />

= [A] ij<br />

Def<strong>in</strong>ition TM<br />

= [ A ] Def<strong>in</strong>ition CCM<br />

ij<br />

[ (A ) t<br />

]<br />

=<br />

Def<strong>in</strong>ition TM<br />

ji<br />

S<strong>in</strong>ce the matrices (A t ) and ( A ) t<br />

are equal <strong>in</strong> each entry, Def<strong>in</strong>ition ME says<br />

that (A t )= ( A ) t<br />

.<br />

Subsection AM
Adjoint of a Matrix

The combination of transposing and conjugating a matrix will be important in subsequent sections, such as Subsection MINM.UM and Section OD. We make a key definition here and prove some basic results in the same spirit as those above.

Definition A Adjoint
If A is a matrix, then its adjoint is $A^* = \left(\overline{A}\right)^t$. □

You will see the adjoint written elsewhere variously as A^H, A^∗ or A^†. Notice that Theorem MCT says it does not really matter if we conjugate and then transpose, or transpose and then conjugate.
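As a quick numerical illustration (ours, in Python/NumPy), the adjoint is conjugate-then-transpose, and Theorem MCT guarantees that the two orders agree:

```python
import numpy as np

A = np.array([[2 - 1j, 3, 5 + 4j],
              [-3 + 6j, 2 - 3j, 0]])

adjoint1 = np.conj(A).T   # conjugate, then transpose (Definition A)
adjoint2 = np.conj(A.T)   # transpose, then conjugate (Theorem MCT)
print(np.array_equal(adjoint1, adjoint2))   # True
```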

Theorem AMA Adjoint and Matrix Addition
Suppose A and B are matrices of the same size. Then (A + B)^* = A^* + B^*.

Proof.
\[\begin{aligned}
(A+B)^* &= \left(\overline{A+B}\right)^t && \text{Definition A}\\
&= \left(\overline{A} + \overline{B}\right)^t && \text{Theorem CRMA}\\
&= \left(\overline{A}\right)^t + \left(\overline{B}\right)^t && \text{Theorem TMA}\\
&= A^* + B^* && \text{Definition A}
\end{aligned}\]
■

Theorem AMSM Adjoint and Matrix Scalar Multiplication
Suppose α ∈ C is a scalar and A is a matrix. Then $(\alpha A)^* = \overline{\alpha} A^*$.

Proof.
\[\begin{aligned}
(\alpha A)^* &= \left(\overline{\alpha A}\right)^t && \text{Definition A}\\
&= \left(\overline{\alpha}\,\overline{A}\right)^t && \text{Theorem CRMSM}\\
&= \overline{\alpha}\left(\overline{A}\right)^t && \text{Theorem TMSM}\\
&= \overline{\alpha} A^* && \text{Definition A}
\end{aligned}\]
■

Theorem AA Adjoint of an Adjoint
Suppose that A is a matrix. Then (A^*)^* = A.

Proof.
\[\begin{aligned}
(A^*)^* &= \left(\overline{(A^*)}\right)^t && \text{Definition A}\\
&= \overline{\left((A^*)^t\right)} && \text{Theorem MCT}\\
&= \overline{\left(\left(\overline{A}\right)^t\right)^t} && \text{Definition A}\\
&= \overline{\left(\overline{A}\right)} && \text{Theorem TT}\\
&= A && \text{Theorem CCM}
\end{aligned}\]
■

Take note of how the theorems in this section, while simple, build on earlier theorems and definitions and never descend to the level of entry-by-entry proofs based on Definition ME. In other words, the equal signs that appear in the previous proofs are equalities of matrices, not scalars (which is the opposite of a proof like that of Theorem TMA).

Reading Questions

1. Perform the following matrix computation.
\[ (6) \begin{bmatrix} 2 & -2 & 8 & 1 \\ 4 & 5 & -1 & 3 \\ 7 & -3 & 0 & 2 \end{bmatrix} + (-2) \begin{bmatrix} 2 & 7 & 1 & 2 \\ 3 & -1 & 0 & 5 \\ 1 & 7 & 3 & 3 \end{bmatrix} \]

2. Theorem VSPM reminds you of what previous theorem? How strong is the similarity?

3. Compute the transpose of the matrix below.
\[ \begin{bmatrix} 6 & 8 & 4 \\ -2 & 1 & 0 \\ 9 & -5 & 6 \end{bmatrix} \]

Exercises

C10† Let
\[ A = \begin{bmatrix} 1 & 4 & -3 \\ 6 & 3 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 3 & 2 & 1 \\ -2 & -6 & 5 \end{bmatrix} \qquad C = \begin{bmatrix} 2 & 4 \\ 4 & 0 \\ -2 & 2 \end{bmatrix} \]
Let α = 4 and β = 1/2. Perform the following calculations: (1) A + B, (2) A + C, (3) B^t + C, (4) A + B^t, (5) βC, (6) 4A − 3B, (7) A^t + αC, (8) A + B − C^t, (9) 4A + 2B − 5C^t.

C11† Solve the given vector equation for x, or explain why no solution exists:
\[ 2\begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 2 \end{bmatrix} - 3\begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & x \end{bmatrix} = \begin{bmatrix} -1 & 1 & 0 \\ 0 & 5 & -2 \end{bmatrix} \]

C12† Solve the given matrix equation for α, or explain why no solution exists:
\[ \alpha\begin{bmatrix} 1 & 3 & 4 \\ 2 & 1 & -1 \end{bmatrix} + \begin{bmatrix} 4 & 3 & -6 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 12 & 6 \\ 6 & 4 & -2 \end{bmatrix} \]

C13† Solve the given matrix equation for α, or explain why no solution exists:
\[ \alpha\begin{bmatrix} 3 & 1 \\ 2 & 0 \\ 1 & 4 \end{bmatrix} - \begin{bmatrix} 4 & 1 \\ 3 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & -2 \\ 2 & 6 \end{bmatrix} \]

C14† Find α and β that solve the following equation:
\[ \alpha\begin{bmatrix} 1 & 2 \\ 4 & 1 \end{bmatrix} + \beta\begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 4 \\ 6 & 1 \end{bmatrix} \]

In Chapter V we defined the operations of vector addition and vector scalar multiplication in Definition CVA and Definition CVSM. These two operations formed the underpinnings of the remainder of the chapter. We have now defined similar operations for matrices in Definition MA and Definition MSM. You will have noticed the resulting similarities between Theorem VSPCV and Theorem VSPM.

In Exercises M20–M25, you will be asked to extend these similarities to other fundamental definitions and concepts we first saw in Chapter V. This sequence of problems was suggested by Martin Jackson.

M20 Suppose S = {B_1, B_2, B_3, ..., B_p} is a set of matrices from M_{mn}. Formulate appropriate definitions for the following terms and give an example of the use of each.

1. A linear combination of elements of S.
2. A relation of linear dependence on S, both trivial and nontrivial.
3. S is a linearly independent set.
4. 〈S〉.

M21† Show that the set S is linearly independent in M_{22}.
\[ S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \]

M22† Determine if the set S below is linearly independent in M_{23}.
\[ S = \left\{ \begin{bmatrix} -2 & 3 & 4 \\ -1 & 3 & -2 \end{bmatrix}, \begin{bmatrix} 4 & -2 & 2 \\ 0 & -1 & 1 \end{bmatrix}, \begin{bmatrix} -1 & -2 & -2 \\ 2 & 2 & 2 \end{bmatrix}, \begin{bmatrix} -1 & 1 & 0 \\ -1 & 0 & -2 \end{bmatrix}, \begin{bmatrix} -1 & 2 & -2 \\ 0 & -1 & -2 \end{bmatrix} \right\} \]

M23† Determine if the matrix A is in the span of S. In other words, is A ∈ 〈S〉? If so write A as a linear combination of the elements of S.
\[ A = \begin{bmatrix} -13 & 24 & 2 \\ -8 & -2 & -20 \end{bmatrix} \]
\[ S = \left\{ \begin{bmatrix} -2 & 3 & 4 \\ -1 & 3 & -2 \end{bmatrix}, \begin{bmatrix} 4 & -2 & 2 \\ 0 & -1 & 1 \end{bmatrix}, \begin{bmatrix} -1 & -2 & -2 \\ 2 & 2 & 2 \end{bmatrix}, \begin{bmatrix} -1 & 1 & 0 \\ -1 & 0 & -2 \end{bmatrix}, \begin{bmatrix} -1 & 2 & -2 \\ 0 & -1 & -2 \end{bmatrix} \right\} \]

M24† Suppose Y is the set of all 3 × 3 symmetric matrices (Definition SYM). Find a set T so that T is linearly independent and 〈T〉 = Y.

M25 Define a subset of M_{33} by
\[ U_{33} = \left\{ A \in M_{33} \,\middle|\, [A]_{ij} = 0 \text{ whenever } i > j \right\} \]
Find a set R so that R is linearly independent and 〈R〉 = U_{33}.

T13† Prove Property CM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

T14 Prove Property AAM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

T17 Prove Property SMAM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

T18 Prove Property DMAM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

A matrix A is skew-symmetric if A^t = −A. Exercises T30–T37 employ this definition.

T30 Prove that a skew-symmetric matrix is square. (Hint: study the proof of Theorem SMS.)

T31 Prove that a skew-symmetric matrix must have zeros for its diagonal elements. In other words, if A is skew-symmetric of size n, then [A]_{ii} = 0 for 1 ≤ i ≤ n. (Hint: carefully construct an example of a 3 × 3 skew-symmetric matrix before attempting a proof.)

T32 Prove that a matrix A is both skew-symmetric and symmetric if and only if A is the zero matrix. (Hint: one half of this proof is very easy, the other half takes a little more work.)

T33 Suppose A and B are both skew-symmetric matrices of the same size and α, β ∈ C. Prove that αA + βB is a skew-symmetric matrix.

T34 Suppose A is a square matrix. Prove that A + A^t is a symmetric matrix.

T35 Suppose A is a square matrix. Prove that A − A^t is a skew-symmetric matrix.

T36 Suppose A is a square matrix. Prove that there is a symmetric matrix B and a skew-symmetric matrix C such that A = B + C. In other words, any square matrix can be decomposed into a symmetric matrix and a skew-symmetric matrix (Proof Technique DC). (Hint: consider building a proof on Exercise MO.T34 and Exercise MO.T35.)

T37 Prove that the decomposition in Exercise MO.T36 is unique (see Proof Technique U). (Hint: a proof can turn on Exercise MO.T31.)


Section MM
Matrix Multiplication

We know how to add vectors and how to multiply them by scalars. Together, these operations give us the possibility of making linear combinations. Similarly, we know how to add matrices and how to multiply matrices by scalars. In this section we mix all these ideas together and produce an operation known as "matrix multiplication." This will lead to some results that are both surprising and central. We begin with a definition of how to multiply a vector by a matrix.

Subsection MVP
Matrix-Vector Product

We have repeatedly seen the importance of forming linear combinations of the columns of a matrix. As one example of this, the oft-used Theorem SLSLC said that every solution to a system of linear equations gives rise to a linear combination of the column vectors of the coefficient matrix that equals the vector of constants. This theorem, and others, motivate the following central definition.

Definition MVP Matrix-Vector Product
Suppose A is an m × n matrix with columns A_1, A_2, A_3, ..., A_n and u is a vector of size n. Then the matrix-vector product of A with u is the linear combination
\[ A\mathbf{u} = [\mathbf{u}]_1 A_1 + [\mathbf{u}]_2 A_2 + [\mathbf{u}]_3 A_3 + \cdots + [\mathbf{u}]_n A_n \]
□

So, the matrix-vector product is yet another version of "multiplication," at least in the sense that we have yet again overloaded juxtaposition of two symbols as our notation. Remember your objects: an m × n matrix times a vector of size n will create a vector of size m. So if A is rectangular, then the size of the vector changes. With all the linear combinations we have performed so far, this computation should now seem second nature.

Example MTV A matrix times a vector
Consider
\[ A = \begin{bmatrix} 1 & 4 & 2 & 3 & 4 \\ -3 & 2 & 0 & 1 & -2 \\ 1 & 6 & -3 & -1 & 5 \end{bmatrix} \qquad \mathbf{u} = \begin{bmatrix} 2 \\ 1 \\ -2 \\ 3 \\ -1 \end{bmatrix} \]
Then
\[ A\mathbf{u} = 2\begin{bmatrix} 1 \\ -3 \\ 1 \end{bmatrix} + 1\begin{bmatrix} 4 \\ 2 \\ 6 \end{bmatrix} + (-2)\begin{bmatrix} 2 \\ 0 \\ -3 \end{bmatrix} + 3\begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix} + (-1)\begin{bmatrix} 4 \\ -2 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \\ 6 \end{bmatrix}. \]
△
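Definition MVP is a linear combination of columns, and it is worth computing the product both ways at least once. Here is a sketch in Python/NumPy (our illustration, using the data of Example MTV):

```python
import numpy as np

A = np.array([[1, 4, 2, 3, 4],
              [-3, 2, 0, 1, -2],
              [1, 6, -3, -1, 5]])
u = np.array([2, 1, -2, 3, -1])

# Definition MVP: Au as a linear combination of the columns of A
lincombo = sum(u[j] * A[:, j] for j in range(A.shape[1]))
print(lincombo)      # [7 1 6]
print(A @ u)         # the built-in matrix-vector product agrees: [7 1 6]
```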

We can now represent systems of linear equations compactly with a matrix-vector product (Definition MVP) and column vector equality (Definition CVE). This finally yields a very popular alternative to our unconventional LS(A, b) notation.

Theorem SLEMM Systems of Linear Equations as Matrix Multiplication
The set of solutions to the linear system LS(A, b) equals the set of solutions for x in the vector equation Ax = b.

Proof. This theorem says that two sets (of solutions) are equal. So we need to show that one set of solutions is a subset of the other, and vice versa (Definition SE). Let A_1, A_2, A_3, ..., A_n be the columns of A. Both of these set inclusions then follow from the following chain of equivalences (Proof Technique E),
\[\begin{aligned}
&\text{x is a solution to } LS(A, \mathbf{b})\\
&\iff [\mathbf{x}]_1 A_1 + [\mathbf{x}]_2 A_2 + [\mathbf{x}]_3 A_3 + \cdots + [\mathbf{x}]_n A_n = \mathbf{b} && \text{Theorem SLSLC}\\
&\iff \text{x is a solution to } A\mathbf{x} = \mathbf{b} && \text{Definition MVP}
\end{aligned}\]
■

Example MNSLE Matrix notation for systems of linear equations
Consider the system of linear equations from Example NSLE,
\[\begin{aligned}
2x_1 + 4x_2 - 3x_3 + 5x_4 + x_5 &= 9\\
3x_1 + x_2 + x_4 - 3x_5 &= 0\\
-2x_1 + 7x_2 - 5x_3 + 2x_4 + 2x_5 &= -3
\end{aligned}\]
which has coefficient matrix and vector of constants
\[ A = \begin{bmatrix} 2 & 4 & -3 & 5 & 1 \\ 3 & 1 & 0 & 1 & -3 \\ -2 & 7 & -5 & 2 & 2 \end{bmatrix} \qquad \mathbf{b} = \begin{bmatrix} 9 \\ 0 \\ -3 \end{bmatrix} \]
and so will be described compactly by the vector equation Ax = b. △

The matrix-vector product is a very natural computation. We have motivated it by its connections with systems of equations, but here is another example.

Example MBC Money's best cities
Every year Money magazine selects several cities in the United States as the "best" cities to live in, based on a wide array of statistics about each city. This is an example of how the editors of Money might arrive at a single number that consolidates the statistics about a city. We will analyze Los Angeles, Chicago and New York City, based on four criteria: average high temperature in July (Fahrenheit), number of colleges and universities in a 30-mile radius, number of toxic waste sites in the Superfund environmental clean-up program and a personal crime index based on FBI statistics (average = 100, smaller is safer). It should be apparent how to generalize the example to a greater number of cities and a greater number of statistics.

We begin by building a table of statistics. The rows will be labeled with the cities, and the columns with statistical categories. These values are from Money's website in early 2005.

    City          Temp  Colleges  Superfund  Crime
    Los Angeles    77      28        93       254
    Chicago        84      38        85       363
    New York       84      99         1       193

Conceivably these data might reside in a spreadsheet. Now we must combine the statistics for each city. We could accomplish this by weighting each category, scaling the values and summing them. The sizes of the weights would depend upon the numerical size of each statistic generally, but more importantly, they would reflect the editors' opinions or beliefs about which statistics were most important to their readers. Is the crime index more important than the number of colleges and universities? Of course, there is no right answer to this question.

Suppose the editors finally decide on the following weights to employ: temperature, 0.23; colleges, 0.46; Superfund, −0.05; crime, −0.20. Notice how negative weights are used for undesirable statistics. Then, for example, the editors would compute for Los Angeles,
\[ (0.23)(77) + (0.46)(28) + (-0.05)(93) + (-0.20)(254) = -24.86 \]
This computation might remind you of an inner product, but we will produce the computations for all of the cities as a matrix-vector product. Write the table of raw statistics as a matrix
\[ T = \begin{bmatrix} 77 & 28 & 93 & 254 \\ 84 & 38 & 85 & 363 \\ 84 & 99 & 1 & 193 \end{bmatrix} \]
and the weights as a vector
\[ \mathbf{w} = \begin{bmatrix} 0.23 \\ 0.46 \\ -0.05 \\ -0.20 \end{bmatrix} \]
then the matrix-vector product (Definition MVP) yields
\[ T\mathbf{w} = (0.23)\begin{bmatrix} 77 \\ 84 \\ 84 \end{bmatrix} + (0.46)\begin{bmatrix} 28 \\ 38 \\ 99 \end{bmatrix} + (-0.05)\begin{bmatrix} 93 \\ 85 \\ 1 \end{bmatrix} + (-0.20)\begin{bmatrix} 254 \\ 363 \\ 193 \end{bmatrix} = \begin{bmatrix} -24.86 \\ -40.05 \\ 26.21 \end{bmatrix} \]
This vector contains a single number for each of the cities being studied, so the editors would rank New York best (26.21), Los Angeles next (−24.86), and Chicago third (−40.05). Of course, the mayor's offices in Chicago and Los Angeles are free to counter with a different set of weights that cause their city to be ranked best. These alternative weights would be chosen to play to each city's strengths, and minimize their problem areas.

If a spreadsheet were used to make these computations, a row of weights would be entered somewhere near the table of data and the formulas in the spreadsheet would effect a matrix-vector product. This example is meant to illustrate how "linear" computations (addition, multiplication) can be organized as a matrix-vector product.

Another example would be the matrix of numerical scores on examinations and exercises for students in a class. The rows would be indexed by students and the columns would be indexed by exams and assignments. The instructor could then assign weights to the different exams and assignments, and via a matrix-vector product, compute a single score for each student. △
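The whole consolidation in Example MBC is a one-line matrix-vector product on a computer. A Python/NumPy sketch (ours, for illustration):

```python
import numpy as np

T = np.array([[77, 28, 93, 254],    # Los Angeles
              [84, 38, 85, 363],    # Chicago
              [84, 99, 1, 193]])    # New York
w = np.array([0.23, 0.46, -0.05, -0.20])

# One score per city; approximately [-24.86, -40.05, 26.21]
# (floating point may introduce tiny rounding differences)
print(T @ w)
```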

Later (much later) we will need the following theorem, which is really a technical lemma (see Proof Technique LC). Since we are in a position to prove it now, we will. But you can safely skip it for the moment, if you promise to come back later to study the proof when the theorem is employed. At that point you will also be able to understand the comments in the paragraph following the proof.

Theorem EMMVP Equal Matrices and Matrix-Vector Products
Suppose that A and B are m × n matrices such that Ax = Bx for every x ∈ C^n. Then A = B.

Proof. We are assuming Ax = Bx for all x ∈ C^n, so we can employ this equality for any choice of the vector x. However, we will limit our use of this equality to the standard unit vectors, e_j, 1 ≤ j ≤ n (Definition SUV). For all 1 ≤ j ≤ n, 1 ≤ i ≤ m,
\[\begin{aligned}
[A]_{ij} &= 0[A]_{i1} + \cdots + 0[A]_{i,j-1} + 1[A]_{ij} + 0[A]_{i,j+1} + \cdots + 0[A]_{in} && \text{Theorem PCNA}\\
&= [\mathbf{e}_j]_1 [A]_{i1} + [\mathbf{e}_j]_2 [A]_{i2} + [\mathbf{e}_j]_3 [A]_{i3} + \cdots + [\mathbf{e}_j]_n [A]_{in} && \text{Definition SUV}\\
&= [A]_{i1} [\mathbf{e}_j]_1 + [A]_{i2} [\mathbf{e}_j]_2 + [A]_{i3} [\mathbf{e}_j]_3 + \cdots + [A]_{in} [\mathbf{e}_j]_n && \text{Property CMCN}\\
&= [A\mathbf{e}_j]_i && \text{Definition MVP}\\
&= [B\mathbf{e}_j]_i && \text{Hypothesis}\\
&= [B]_{i1} [\mathbf{e}_j]_1 + [B]_{i2} [\mathbf{e}_j]_2 + [B]_{i3} [\mathbf{e}_j]_3 + \cdots + [B]_{in} [\mathbf{e}_j]_n && \text{Definition MVP}\\
&= [\mathbf{e}_j]_1 [B]_{i1} + [\mathbf{e}_j]_2 [B]_{i2} + [\mathbf{e}_j]_3 [B]_{i3} + \cdots + [\mathbf{e}_j]_n [B]_{in} && \text{Property CMCN}\\
&= 0[B]_{i1} + \cdots + 0[B]_{i,j-1} + 1[B]_{ij} + 0[B]_{i,j+1} + \cdots + 0[B]_{in} && \text{Definition SUV}\\
&= [B]_{ij} && \text{Theorem PCNA}
\end{aligned}\]
So by Definition ME the matrices A and B are equal, as desired. ■
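The key mechanism in this proof — multiplying by a standard unit vector e_j isolates column j — is easy to see numerically. A Python/NumPy sketch (ours):

```python
import numpy as np

A = np.array([[1, 4, 2, 3, 4],
              [-3, 2, 0, 1, -2],
              [1, 6, -3, -1, 5]])

e2 = np.zeros(5)
e2[1] = 1               # the standard unit vector e_2 (Python indexes from 0)
print(A @ e2)           # [4. 2. 6.], which is column 2 of A
print(np.array_equal(A @ e2, A[:, 1]))   # True
```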

You might notice from studying the proof that the hypotheses of this theorem could be "weakened" (i.e. made less restrictive). We need only suppose the equality of the matrix-vector products for just the standard unit vectors (Definition SUV) or any other spanning set (Definition SSVS) of C^n (Exercise LISS.T40). However, in practice, when we apply this theorem the stronger hypothesis will be in effect so this version of the theorem will suffice for our purposes. (If we changed the statement of the theorem to have the less restrictive hypothesis, then we would call the theorem "stronger.")

Subsection MM
Matrix Multiplication

We now define how to multiply two matrices together. Stop for a minute and think about how you might define this new operation.

Many books would present this definition much earlier in the course. However, we have taken great care to delay it as long as possible and to present as many ideas as practical based mostly on the notion of linear combinations. Towards the conclusion of the course, or when you perhaps take a second course in linear algebra, you may be in a position to appreciate the reasons for this. For now, understand that matrix multiplication is a central definition and perhaps you will appreciate its importance more by having saved it for later.

Definition MM Matrix Multiplication
Suppose A is an m × n matrix and B_1, B_2, B_3, ..., B_p are the columns of an n × p matrix B. Then the matrix product of A with B is the m × p matrix where column i is the matrix-vector product AB_i. Symbolically,
\[ AB = A[B_1|B_2|B_3|\ldots|B_p] = [AB_1|AB_2|AB_3|\ldots|AB_p]. \]
□

Example PTM Product of two matrices
Set
\[ A = \begin{bmatrix} 1 & 2 & -1 & 4 & 6 \\ 0 & -4 & 1 & 2 & 3 \\ -5 & 1 & 2 & -3 & 4 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 6 & 2 & 1 \\ -1 & 4 & 3 & 2 \\ 1 & 1 & 2 & 3 \\ 6 & 4 & -1 & 2 \\ 1 & -2 & 3 & 0 \end{bmatrix} \]
Then
\[ AB = \left[ A\begin{bmatrix} 1 \\ -1 \\ 1 \\ 6 \\ 1 \end{bmatrix} \,\middle|\, A\begin{bmatrix} 6 \\ 4 \\ 1 \\ 4 \\ -2 \end{bmatrix} \,\middle|\, A\begin{bmatrix} 2 \\ 3 \\ 2 \\ -1 \\ 3 \end{bmatrix} \,\middle|\, A\begin{bmatrix} 1 \\ 2 \\ 3 \\ 2 \\ 0 \end{bmatrix} \right] = \begin{bmatrix} 28 & 17 & 20 & 10 \\ 20 & -13 & -3 & -1 \\ -18 & -44 & 12 & -3 \end{bmatrix}. \]
△

Is this the definition of matrix multiplication you expected? Perhaps our previous operations for matrices caused you to think that we might multiply two matrices of the same size, entry-by-entry? Notice that our current definition uses matrices of different sizes (though the number of columns in the first must equal the number of rows in the second), and the result is of a third size. Notice too in the previous example that we cannot even consider the product BA, since the sizes of the two matrices in this order are not right.

But it gets weirder than that. Many of your old ideas about "multiplication" will not apply to matrix multiplication, but some still will. So make no assumptions, and do not do anything until you have a theorem that says you can. Even if the sizes are right, matrix multiplication is not commutative — order matters.

Example MMNC Matrix multiplication is not commutative
Set
\[ A = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix} \qquad B = \begin{bmatrix} 4 & 0 \\ 5 & 1 \end{bmatrix}. \]
Then we have two square, 2 × 2 matrices, so Definition MM allows us to multiply them in either order. We find
\[ AB = \begin{bmatrix} 19 & 3 \\ 6 & 2 \end{bmatrix} \qquad BA = \begin{bmatrix} 4 & 12 \\ 4 & 17 \end{bmatrix} \]
and AB ≠ BA. Not even close. It should not be hard for you to construct other pairs of matrices that do not commute (try a couple of 3 × 3's). Can you find a pair of non-identical matrices that do commute? △
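A three-line check of Example MMNC (Python/NumPy, ours):

```python
import numpy as np

A = np.array([[1, 3], [-1, 2]])
B = np.array([[4, 0], [5, 1]])
print(A @ B)                          # [[19  3] [ 6  2]]
print(B @ A)                          # [[ 4 12] [ 4 17]]
print(np.array_equal(A @ B, B @ A))   # False: order matters
```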

Subsection MMEE
Matrix Multiplication, Entry-by-Entry

While certain "natural" properties of multiplication do not hold, many more do. In the next subsection, we will state and prove the relevant theorems. But first, we need a theorem that provides an alternate means of multiplying two matrices. In many texts, this would be given as the definition of matrix multiplication. We prefer to turn it around and have the following formula as a consequence of our definition. It will prove useful for proofs of matrix equality, where we need to examine products of matrices, entry-by-entry.

Theorem EMP Entries of Matrix Products
Suppose A is an m × n matrix and B is an n × p matrix. Then for 1 ≤ i ≤ m, 1 ≤ j ≤ p, the individual entries of AB are given by
\[ [AB]_{ij} = [A]_{i1}[B]_{1j} + [A]_{i2}[B]_{2j} + [A]_{i3}[B]_{3j} + \cdots + [A]_{in}[B]_{nj} = \sum_{k=1}^{n} [A]_{ik}[B]_{kj} \]


Proof. Let the vectors A_1, A_2, A_3, ..., A_n denote the columns of A and let the vectors B_1, B_2, B_3, ..., B_p denote the columns of B. Then for 1 ≤ i ≤ m, 1 ≤ j ≤ p,
\[\begin{aligned}
[AB]_{ij} &= [AB_j]_i && \text{Definition MM}\\
&= \left[[B_j]_1 A_1 + [B_j]_2 A_2 + \cdots + [B_j]_n A_n\right]_i && \text{Definition MVP}\\
&= \left[[B_j]_1 A_1\right]_i + \left[[B_j]_2 A_2\right]_i + \cdots + \left[[B_j]_n A_n\right]_i && \text{Definition CVA}\\
&= [B_j]_1 [A_1]_i + [B_j]_2 [A_2]_i + \cdots + [B_j]_n [A_n]_i && \text{Definition CVSM}\\
&= [B]_{1j} [A]_{i1} + [B]_{2j} [A]_{i2} + \cdots + [B]_{nj} [A]_{in} && \text{Definition M}\\
&= [A]_{i1} [B]_{1j} + [A]_{i2} [B]_{2j} + \cdots + [A]_{in} [B]_{nj} && \text{Property CMCN}\\
&= \sum_{k=1}^{n} [A]_{ik} [B]_{kj}
\end{aligned}\]
■

Example PTMEE Product of two matrices, entry-by-entry
Consider again the two matrices from Example PTM
\[ A = \begin{bmatrix} 1 & 2 & -1 & 4 & 6 \\ 0 & -4 & 1 & 2 & 3 \\ -5 & 1 & 2 & -3 & 4 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 6 & 2 & 1 \\ -1 & 4 & 3 & 2 \\ 1 & 1 & 2 & 3 \\ 6 & 4 & -1 & 2 \\ 1 & -2 & 3 & 0 \end{bmatrix} \]
Then suppose we just wanted the entry of AB in the second row, third column:
\[\begin{aligned}
[AB]_{23} &= [A]_{21}[B]_{13} + [A]_{22}[B]_{23} + [A]_{23}[B]_{33} + [A]_{24}[B]_{43} + [A]_{25}[B]_{53}\\
&= (0)(2) + (-4)(3) + (1)(2) + (2)(-1) + (3)(3) = -3
\end{aligned}\]
Notice how there are 5 terms in the sum, since 5 is the common dimension of the two matrices (column count for A, row count for B). In the conclusion of Theorem EMP, it would be the index k that would run from 1 to 5 in this computation. Here is a bit more practice.

The entry of third row, first column:
\[\begin{aligned}
[AB]_{31} &= [A]_{31}[B]_{11} + [A]_{32}[B]_{21} + [A]_{33}[B]_{31} + [A]_{34}[B]_{41} + [A]_{35}[B]_{51}\\
&= (-5)(1) + (1)(-1) + (2)(1) + (-3)(6) + (4)(1) = -18
\end{aligned}\]
To get some more practice on your own, complete the computation of the other 10 entries of this product. Construct some other pairs of matrices (of compatible sizes) and compute their product two ways. First use Definition MM. Since linear combinations are straightforward for you now, this should be easy to do and to do correctly. Then do it again, using Theorem EMP. Since this process may take some practice, use your first computation to check your work. △
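Theorem EMP translates directly into a triple loop, which is, in effect, how many texts define the product. A Python sketch (ours), checked against NumPy's built-in product:

```python
import numpy as np

def matmul_emp(A, B):
    """Entry-by-entry product via Theorem EMP: [AB]_ij = sum_k [A]_ik [B]_kj."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "column count of A must equal row count of B"
    AB = np.zeros((m, p), dtype=A.dtype)
    for i in range(m):
        for j in range(p):
            AB[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return AB

A = np.array([[1, 2, -1, 4, 6], [0, -4, 1, 2, 3], [-5, 1, 2, -3, 4]])
B = np.array([[1, 6, 2, 1], [-1, 4, 3, 2], [1, 1, 2, 3], [6, 4, -1, 2], [1, -2, 3, 0]])
print(matmul_emp(A, B)[1, 2])                      # -3, as computed in Example PTMEE
print(np.array_equal(matmul_emp(A, B), A @ B))     # True
```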

Theorem EMP is the way many people compute matrix products by hand. It will also be very useful for the theorems we are going to prove shortly. However, the definition (Definition MM) is frequently the most useful for its connections with deeper ideas like the null space and the upcoming column space.

Subsection PMM
Properties of Matrix Multiplication

In this subsection, we collect properties of matrix multiplication and its interaction with the zero matrix (Definition ZM), the identity matrix (Definition IM), matrix addition (Definition MA), scalar matrix multiplication (Definition MSM), the inner product (Definition IP), conjugation (Theorem MMCC), and the transpose (Definition TM). Whew! Here we go. These are great proofs to practice with, so try to concoct the proofs before reading them; they will get progressively more complicated as we go.

Theorem MMZM Matrix Multiplication and the Zero Matrix
Suppose A is an m × n matrix. Then

1. AO_{n×p} = O_{m×p}
2. O_{p×m}A = O_{p×n}

Proof. We will prove (1) and leave (2) to you. Entry-by-entry, for 1 ≤ i ≤ m, 1 ≤ j ≤ p,
\[\begin{aligned}
[AO_{n\times p}]_{ij} &= \sum_{k=1}^{n} [A]_{ik}[O_{n\times p}]_{kj} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} [A]_{ik}\,0 && \text{Definition ZM}\\
&= \sum_{k=1}^{n} 0\\
&= 0 && \text{Property ZCN}\\
&= [O_{m\times p}]_{ij} && \text{Definition ZM}
\end{aligned}\]
So by the definition of matrix equality (Definition ME), the matrices AO_{n×p} and O_{m×p} are equal. ■

Theorem MMIM Matrix Multiplication and Identity Matrix
Suppose A is an m × n matrix. Then

1. AI_n = A
2. I_m A = A

Proof. Again, we will prove (1) and leave (2) to you. Entry-by-entry, for 1 ≤ i ≤ m, 1 ≤ j ≤ n,
\[\begin{aligned}
[AI_n]_{ij} &= \sum_{k=1}^{n} [A]_{ik}[I_n]_{kj} && \text{Theorem EMP}\\
&= [A]_{ij}[I_n]_{jj} + \sum_{\substack{k=1\\ k\ne j}}^{n} [A]_{ik}[I_n]_{kj} && \text{Property CACN}\\
&= [A]_{ij}(1) + \sum_{k=1,\,k\ne j}^{n} [A]_{ik}(0) && \text{Definition IM}\\
&= [A]_{ij} + \sum_{k=1,\,k\ne j}^{n} 0\\
&= [A]_{ij}
\end{aligned}\]
So the matrices A and AI_n are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices. ■

It is this theorem that gives the identity matrix its name. It is a matrix that behaves with matrix multiplication like the scalar 1 does with scalar multiplication. To multiply by the identity matrix is to have no effect on the other matrix.

Theorem MMDAA Matrix Multiplication Distributes Across Addition
Suppose A is an m × n matrix and B and C are n × p matrices and D is a p × s matrix. Then

1. A(B + C) = AB + AC
2. (B + C)D = BD + CD

Proof. We will do (1), you do (2). Entry-by-entry, for 1 ≤ i ≤ m, 1 ≤ j ≤ p,
\[\begin{aligned}
[A(B+C)]_{ij} &= \sum_{k=1}^{n} [A]_{ik}[B+C]_{kj} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} [A]_{ik}([B]_{kj} + [C]_{kj}) && \text{Definition MA}\\
&= \sum_{k=1}^{n} [A]_{ik}[B]_{kj} + [A]_{ik}[C]_{kj} && \text{Property DCN}\\
&= \sum_{k=1}^{n} [A]_{ik}[B]_{kj} + \sum_{k=1}^{n} [A]_{ik}[C]_{kj} && \text{Property CACN}\\
&= [AB]_{ij} + [AC]_{ij} && \text{Theorem EMP}\\
&= [AB + AC]_{ij} && \text{Definition MA}
\end{aligned}\]
So the matrices A(B + C) and AB + AC are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices. ■

Theorem MMSMM Matrix Multiplication and Scalar Matrix Multiplication
Suppose A is an m × n matrix and B is an n × p matrix. Let α be a scalar. Then α(AB) = (αA)B = A(αB).

Proof. These are equalities of matrices. We will do the first one; the second is similar and will be good practice for you. For 1 ≤ i ≤ m, 1 ≤ j ≤ p,
\[\begin{aligned}
[\alpha(AB)]_{ij} &= \alpha[AB]_{ij} && \text{Definition MSM}\\
&= \alpha\sum_{k=1}^{n} [A]_{ik}[B]_{kj} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} \alpha[A]_{ik}[B]_{kj} && \text{Property DCN}\\
&= \sum_{k=1}^{n} [\alpha A]_{ik}[B]_{kj} && \text{Definition MSM}\\
&= [(\alpha A)B]_{ij} && \text{Theorem EMP}
\end{aligned}\]
So the matrices α(AB) and (αA)B are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices. ■

Theorem MMA Matrix Multiplication is Associative
Suppose A is an m × n matrix, B is an n × p matrix and D is a p × s matrix. Then A(BD) = (AB)D.

Proof. A matrix equality, so we will go entry-by-entry, no surprise there. For 1 ≤ i ≤ m, 1 ≤ j ≤ s,
\[\begin{aligned}
[A(BD)]_{ij} &= \sum_{k=1}^{n} [A]_{ik}[BD]_{kj} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} [A]_{ik}\left(\sum_{l=1}^{p} [B]_{kl}[D]_{lj}\right) && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} \sum_{l=1}^{p} [A]_{ik}[B]_{kl}[D]_{lj} && \text{Property DCN}
\end{aligned}\]
We can switch the order of the summation since these are finite sums,
\[\begin{aligned}
&= \sum_{l=1}^{p} \sum_{k=1}^{n} [A]_{ik}[B]_{kl}[D]_{lj} && \text{Property CACN}
\end{aligned}\]
As [D]_{lj} does not depend on the index k, we can use distributivity to move it outside of the inner sum,
\[\begin{aligned}
&= \sum_{l=1}^{p} [D]_{lj}\left(\sum_{k=1}^{n} [A]_{ik}[B]_{kl}\right) && \text{Property DCN}\\
&= \sum_{l=1}^{p} [D]_{lj}[AB]_{il} && \text{Theorem EMP}\\
&= \sum_{l=1}^{p} [AB]_{il}[D]_{lj} && \text{Property CMCN}\\
&= [(AB)D]_{ij} && \text{Theorem EMP}
\end{aligned}\]
So the matrices (AB)D and A(BD) are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices. ■

S<strong>in</strong>ce Theorem MMA says matrix multipication is associative, it means we do<br />

not have to be careful about the order <strong>in</strong> which we perform matrix multiplication,<br />

nor how we parenthesize an expression with just several matrices multiplied togther.<br />

So this is where we draw the l<strong>in</strong>e on expla<strong>in</strong><strong>in</strong>g every last detail <strong>in</strong> a proof. We will<br />

frequently add, remove, or rearrange parentheses with no comment. Indeed, I only<br />

see about a dozen places where Theorem MMA is cited <strong>in</strong> a proof. You could try to<br />

count how many times we avoid mak<strong>in</strong>g a reference to this theorem.<br />

The statement of our next theorem is technically inaccurate. If we upgrade the vectors u, v to matrices with a single column, then the expression $\overline{\mathbf{u}}^t \mathbf{v}$ is a 1 × 1 matrix, though we will treat this small matrix as if it was simply the scalar quantity in its lone entry. When we apply Theorem MMIP there should not be any confusion. Notice that if we treat a column vector as a matrix with a single column, then we can also construct the adjoint of a vector, though we will not make this a common practice.

Theorem MMIP Matrix Multiplication and Inner Products
If we consider the vectors u, v ∈ C^m as m × 1 matrices then
\[ \langle \mathbf{u}, \mathbf{v}\rangle = \overline{\mathbf{u}}^t \mathbf{v} = \mathbf{u}^* \mathbf{v} \]

Proof.
\[\begin{aligned}
\langle \mathbf{u}, \mathbf{v}\rangle &= \sum_{k=1}^{m} \overline{[\mathbf{u}]_k}\,[\mathbf{v}]_k && \text{Definition IP}\\
&= \sum_{k=1}^{m} \overline{[\mathbf{u}]_{k1}}\,[\mathbf{v}]_{k1} && \text{Column vectors as matrices}\\
&= \sum_{k=1}^{m} \left[\overline{\mathbf{u}}\right]_{k1}\,[\mathbf{v}]_{k1} && \text{Definition CCM}\\
&= \sum_{k=1}^{m} \left[\overline{\mathbf{u}}^t\right]_{1k}\,[\mathbf{v}]_{k1} && \text{Definition TM}\\
&= \left[\overline{\mathbf{u}}^t \mathbf{v}\right]_{11} && \text{Theorem EMP}
\end{aligned}\]
To finish we just blur the distinction between a 1 × 1 matrix ($\overline{\mathbf{u}}^t \mathbf{v}$) and its lone entry. ■
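Note the conjugate on the first argument. In NumPy (our illustrative tool) this is exactly what `np.vdot` computes, and we can also form $\overline{\mathbf{u}}^t \mathbf{v}$ by hand:

```python
import numpy as np

u = np.array([1 + 2j, 3, -1j])
v = np.array([2, 1 - 1j, 4])

ip1 = np.vdot(u, v)          # conjugates its first argument, like Definition IP
ip2 = np.conj(u) @ v         # u-bar transpose times v, as in Theorem MMIP
print(ip1, ip2, np.isclose(ip1, ip2))   # the two computations agree
```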
<br />

Theorem MMCC Matrix Multiplication and Complex Conjugation
Suppose A is an m × n matrix and B is an n × p matrix. Then $\overline{AB} = \overline{A}\,\overline{B}$.

Proof. To obtain this matrix equality, we will work entry-by-entry. For 1 ≤ i ≤ m, 1 ≤ j ≤ p,
\[\begin{aligned}
\left[\overline{AB}\right]_{ij} &= \overline{[AB]_{ij}} && \text{Definition CCM}\\
&= \overline{\sum_{k=1}^{n} [A]_{ik}[B]_{kj}} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} \overline{[A]_{ik}[B]_{kj}} && \text{Theorem CCRA}\\
&= \sum_{k=1}^{n} \overline{[A]_{ik}}\;\overline{[B]_{kj}} && \text{Theorem CCRM}\\
&= \sum_{k=1}^{n} \left[\overline{A}\right]_{ik}\left[\overline{B}\right]_{kj} && \text{Definition CCM}\\
&= \left[\overline{A}\,\overline{B}\right]_{ij} && \text{Theorem EMP}
\end{aligned}\]
So the matrices $\overline{AB}$ and $\overline{A}\,\overline{B}$ are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices. ■

Another theorem in this style, and it is a good one. If you have been practicing with the previous proofs you should be able to do this one yourself.

Theorem MMT Matrix Multiplication and Transposes
Suppose A is an m × n matrix and B is an n × p matrix. Then (AB)^t = B^t A^t.

Proof. This theorem may be surprising but if we check the sizes of the matrices involved, then maybe it will not seem so far-fetched. First, AB has size m × p, so its transpose has size p × m. The product of B^t with A^t is a p × n matrix times an n × m matrix, also resulting in a p × m matrix. So at least our objects are compatible for equality (and would not be, in general, if we did not reverse the order of the matrix multiplication).

Here we go again, entry-by-entry. For 1 ≤ i ≤ m, 1 ≤ j ≤ p,
\[\begin{aligned}
[(AB)^t]_{ji} &= [AB]_{ij} && \text{Definition TM}\\
&= \sum_{k=1}^{n} [A]_{ik}[B]_{kj} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n} [B]_{kj}[A]_{ik} && \text{Property CMCN}\\
&= \sum_{k=1}^{n} [B^t]_{jk}[A^t]_{ki} && \text{Definition TM}\\
&= [B^t A^t]_{ji} && \text{Theorem EMP}
\end{aligned}\]
So the matrices (AB)^t and B^t A^t are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices. ■

This theorem seems odd at first glance, since we have to switch the order of A and B. But if we simply consider the sizes of the matrices involved, we can see that the switch is necessary for this reason alone. That the individual entries of the products then come along to be equal is a bonus.
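A quick numerical confirmation of the order switch (Python/NumPy, ours):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)      # a 2 x 3 matrix
B = np.arange(12).reshape(3, 4)     # a 3 x 4 matrix
print(np.array_equal((A @ B).T, B.T @ A.T))   # True: (AB)^t = B^t A^t
```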

As the adjoint of a matrix is a composition of a conjugate and a transpose, its interaction with matrix multiplication is similar to that of a transpose. Here is the last of our long list of basic properties of matrix multiplication.

Theorem MMAD Matrix Multiplication and Adjoints
Suppose A is an m × n matrix and B is an n × p matrix. Then (AB)^* = B^* A^*.

Proof.
\[\begin{aligned}
(AB)^* &= \left(\overline{AB}\right)^t && \text{Definition A}\\
&= \left(\overline{A}\,\overline{B}\right)^t && \text{Theorem MMCC}\\
&= \left(\overline{B}\right)^t\left(\overline{A}\right)^t && \text{Theorem MMT}\\
&= B^* A^* && \text{Definition A}
\end{aligned}\]
■

Notice how none of these proofs above relied on writing out huge general matrices with lots of ellipses (“…”) and trying to formulate the equalities a whole matrix at a time. This messy business is a “proof technique” to be avoided at all costs. Notice too how the proof of Theorem MMAD does not use an entry-by-entry approach, but simply builds on previous results about matrix multiplication's interaction with conjugation and transposes.
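The same kind of numerical spot-check works for adjoints; as a supplement of ours (not the book's), NumPy spells the conjugate-transpose as .conj().T, and random complex matrices are enough to exercise Theorem MMAD.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))

adj = lambda M: M.conj().T       # the adjoint, Definition A

# Theorem MMAD: adjoints reverse the order of a product, like transposes.
assert np.allclose(adj(A @ B), adj(B) @ adj(A))
```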

These theorems, along with Theorem VSPM and the other results in Section MO, give you the “rules” for how matrices interact with the various operations we have defined on matrices (addition, scalar multiplication, matrix multiplication, conjugation, transposes and adjoints). Use them and use them often. But do not try to do anything with a matrix that you do not have a rule for. Together, we would informally call all these operations, and the attendant theorems, “the algebra of matrices.” Notice, too, that every column vector is just an n × 1 matrix, so these theorems apply to column vectors also. Finally, these results, taken as a whole, may make us feel that the definition of matrix multiplication is not so unnatural.

Subsection HM
Hermitian Matrices

The adjoint of a matrix has a basic property when employed in a matrix-vector product as part of an inner product. At this point, you could even use the following result as a motivation for the definition of an adjoint.

Theorem AIP Adjoint and Inner Product
Suppose that A is an m × n matrix and x ∈ C^n, y ∈ C^m. Then 〈Ax, y〉 = 〈x, A^*y〉.

Proof.

〈Ax, y〉 = (\overline{Ax})^t y                  Theorem MMIP
        = (\overline{A}\,\overline{x})^t y     Theorem MMCC
        = \overline{x}^t\,\overline{A}^t y     Theorem MMT
        = \overline{x}^t (A^*y)                Definition A
        = 〈x, A^*y〉                            Theorem MMIP
∎
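A numerical spot-check of Theorem AIP (our illustration, with randomly chosen complex data): NumPy's vdot conjugates its first argument, matching the inner product used in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = np.vdot(A @ x, y)            # <Ax, y>
rhs = np.vdot(x, A.conj().T @ y)   # <x, A* y>
assert np.isclose(lhs, rhs)        # Theorem AIP
```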

Sometimes a matrix is equal to its adjoint (Definition A), and these matrices have interesting properties. One of the most common situations where this occurs is when a matrix has only real number entries. Then we are simply talking about symmetric matrices (Definition SYM), so you can view this as a generalization of a symmetric matrix.

Definition HM Hermitian Matrix
The square matrix A is Hermitian (or self-adjoint) if A = A^*. □
Again, the set of real matrices that are Hermitian is exactly the set of symmetric matrices. In Section PEE we will uncover some amazing properties of Hermitian matrices, so when you get there, run back here to remind yourself of this definition. Further properties will also appear in Section OD. Right now we prove a fundamental result about Hermitian matrices, matrix-vector products and inner products. As a characterization, this could be employed as a definition of a Hermitian matrix and some authors take this approach.

Theorem HMIP Hermitian Matrices and Inner Products
Suppose that A is a square matrix of size n. Then A is Hermitian if and only if 〈Ax, y〉 = 〈x, Ay〉 for all x, y ∈ C^n.

Proof. (⇒) This is the “easy half” of the proof, and makes the rationale for a definition of Hermitian matrices most obvious. Assume A is Hermitian,

〈Ax, y〉 = 〈x, A^*y〉     Theorem AIP
        = 〈x, Ay〉       Definition HM

(⇐) This “half” will take a bit more work. Assume that 〈Ax, y〉 = 〈x, Ay〉 for all x, y ∈ C^n. We show that A = A^* by establishing that Ax = A^*x for all x, so we can then apply Theorem EMMVP. With only this much motivation, consider the inner product for any x ∈ C^n.

〈Ax − A^*x, Ax − A^*x〉 = 〈Ax − A^*x, Ax〉 − 〈Ax − A^*x, A^*x〉     Theorem IPVA
                       = 〈Ax − A^*x, Ax〉 − 〈A(Ax − A^*x), x〉     Theorem AIP
                       = 〈Ax − A^*x, Ax〉 − 〈Ax − A^*x, Ax〉       Hypothesis
                       = 0                                        Property AICN

Because this first inner product equals zero, and has the same vector in each argument (Ax − A^*x), Theorem PIP gives the conclusion that Ax − A^*x = 0. With Ax = A^*x for all x ∈ C^n, Theorem EMMVP says A = A^*, which is the defining property of a Hermitian matrix (Definition HM). ∎

So, informally, Hermitian matrices are those that can be tossed around from one side of an inner product to the other with reckless abandon. We will see later what this buys us.
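To see Theorem HMIP in action numerically (a sketch of ours, not part of the text), note that M + M^* is always Hermitian, so it makes a convenient test matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = M + M.conj().T                  # Hermitian by construction
assert np.allclose(H, H.conj().T)   # Definition HM: H = H*

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)
# Theorem HMIP: H slides freely across the inner product.
assert np.isclose(np.vdot(H @ x, y), np.vdot(x, H @ y))
```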

Reading Questions

1. Form the matrix-vector product of
\begin{bmatrix} 2 & 3 & -1 & 0 \\ 1 & -2 & 7 & 3 \\ 1 & 5 & 3 & 2 \end{bmatrix}
with
\begin{bmatrix} 2 \\ -3 \\ 0 \\ 5 \end{bmatrix}


2. Multiply together the two matrices below (in the order given).
\begin{bmatrix} 2 & 3 & -1 & 0 \\ 1 & -2 & 7 & 3 \\ 1 & 5 & 3 & 2 \end{bmatrix} \begin{bmatrix} 2 & 6 \\ -3 & -4 \\ 0 & 2 \\ 3 & -1 \end{bmatrix}

3. Rewrite the system of linear equations below as a vector equality and using a matrix-vector product. (This question does not ask for a solution to the system. But it does ask you to express the system of equations in a new form using tools from this section.)

2x_1 + 3x_2 − x_3 = 0
x_1 + 2x_2 + x_3 = 3
x_1 + 3x_2 + 3x_3 = 7

Exercises

C20† Compute the product of the two matrices below, AB. Do this using the definitions of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).
A = \begin{bmatrix} 2 & 5 \\ -1 & 3 \\ 2 & -2 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 5 & -3 & 4 \\ 2 & 0 & 2 & -3 \end{bmatrix}

C21† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).
A = \begin{bmatrix} 1 & 3 & 2 \\ -1 & 2 & 1 \\ 0 & 1 & 0 \end{bmatrix} \qquad B = \begin{bmatrix} 4 & 1 & 2 \\ 1 & 0 & 1 \\ 3 & 1 & 5 \end{bmatrix}

C22† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).
A = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}

C23† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).
A = \begin{bmatrix} 3 & 1 \\ 2 & 4 \\ 6 & 5 \\ 1 & 2 \end{bmatrix} \qquad B = \begin{bmatrix} -3 & 1 \\ 4 & 2 \end{bmatrix}


C24† Compute the product AB of the two matrices below.
A = \begin{bmatrix} 1 & 2 & 3 & -2 \\ 0 & 1 & -2 & -1 \\ 1 & 1 & 3 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} 3 \\ 4 \\ 0 \\ 2 \end{bmatrix}

C25† Compute the product AB of the two matrices below.
A = \begin{bmatrix} 1 & 2 & 3 & -2 \\ 0 & 1 & -2 & -1 \\ 1 & 1 & 3 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} -7 \\ 3 \\ 1 \\ 1 \end{bmatrix}

C26† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).
A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{bmatrix} \qquad B = \begin{bmatrix} 2 & -5 & -1 \\ 0 & 1 & 0 \\ -1 & 2 & 1 \end{bmatrix}

C30† For the matrix A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

C31† For the matrix A = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

C32† For the matrix A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

C33† For the matrix A = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

T10† Suppose that A is a square matrix and there is a vector, b, such that LS(A, b) has a unique solution. Prove that A is nonsingular. Give a direct proof (perhaps appealing to Theorem PSPHS) rather than just negating a sentence from the text discussing a similar situation.

T12 The conclusion of Theorem AIP is 〈Ax, y〉 = 〈x, A^*y〉. Use the same hypotheses, and prove the similar conclusion: 〈x, Ay〉 = 〈A^*x, y〉. Two different approaches can be based on an application of Theorem AIP. The first uses Theorem AA, while a second uses Theorem IPAC. Can you provide two proofs?

T20 Prove the second part of Theorem MMZM.

T21 Prove the second part of Theorem MMIM.

T22 Prove the second part of Theorem MMDAA.

T23† Prove the second part of Theorem MMSMM.

T31 Suppose that A is an m × n matrix and x, y ∈ N(A). Prove that x + y ∈ N(A).

T32 Suppose that A is an m × n matrix, α ∈ C, and x ∈ N(A). Prove that αx ∈ N(A).

T35 Suppose that A is an n × n matrix. Prove that A^*A and AA^* are Hermitian matrices.

T40† Suppose that A is an m × n matrix and B is an n × p matrix. Prove that the null space of B is a subset of the null space of AB, that is N(B) ⊆ N(AB). Provide an example where the opposite is false, in other words give an example where N(AB) ⊈ N(B).

T41† Suppose that A is an n × n nonsingular matrix and B is an n × p matrix. Prove that the null space of B is equal to the null space of AB, that is N(B) = N(AB). (Compare with Exercise MM.T40.)

T50 Suppose u and v are any two solutions of the linear system LS(A, b). Prove that u − v is an element of the null space of A, that is, u − v ∈ N(A).

T51† Give a new proof of Theorem PSPHS replacing applications of Theorem SLSLC with matrix-vector products (Theorem SLEMM).

T52† Suppose that x, y ∈ C^n, b ∈ C^m and A is an m × n matrix. If x, y and x + y are each a solution to the linear system LS(A, b), what can you say that is interesting about b? Form an implication with the existence of the three solutions as the hypothesis and an interesting statement about LS(A, b) as the conclusion, and then give a proof.


Section MISLE
Matrix Inverses and Systems of Linear Equations

The inverse of a square matrix, and solutions to linear systems with square coefficient matrices, are intimately connected.

Subsection SI
Solutions and Inverses

We begin with a familiar example, performed in a novel way.

Example SABMI Solutions to Archetype B with a matrix inverse
Archetype B is the system of m = 3 linear equations in n = 3 variables,

−7x_1 − 6x_2 − 12x_3 = −33
5x_1 + 5x_2 + 7x_3 = 24
x_1 + 4x_3 = 5

By Theorem SLEMM we can represent this system of equations as Ax = b where

A = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix} \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \qquad b = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}

Now, entirely unmotivated, we define the 3 × 3 matrix B,

B = \begin{bmatrix} -10 & -12 & -9 \\ \frac{13}{2} & 8 & \frac{11}{2} \\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix}

and note the remarkable fact that

BA = \begin{bmatrix} -10 & -12 & -9 \\ \frac{13}{2} & 8 & \frac{11}{2} \\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix} \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

Now apply this computation to the problem of solving the system of equations,

x = I_3x       Theorem MMIM
  = (BA)x      Substitution
  = B(Ax)      Theorem MMA
  = Bb         Substitution

So we have

x = Bb = \begin{bmatrix} -10 & -12 & -9 \\ \frac{13}{2} & 8 & \frac{11}{2} \\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix} \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix} = \begin{bmatrix} -3 \\ 5 \\ 2 \end{bmatrix}

So with the help and assistance of B we have been able to determine a solution to the system represented by Ax = b through judicious use of matrix multiplication. We know by Theorem NMUS that since the coefficient matrix in this example is nonsingular, there would be a unique solution, no matter what the choice of b. The derivation above amplifies this result, since we were forced to conclude that x = Bb and the solution could not be anything else. You should notice that this argument would hold for any particular choice of b. △
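The whole computation is easy to replay by machine. The following sketch (ours, in Python with NumPy; not part of the text) reproduces BA = I_3 and the solution x = Bb, with the fractions written as decimals.

```python
import numpy as np

A = np.array([[-7., -6., -12.],
              [ 5.,  5.,   7.],
              [ 1.,  0.,   4.]])
B = np.array([[-10., -12., -9. ],
              [ 6.5,   8.,  5.5],   # 13/2, 8, 11/2
              [ 2.5,   3.,  2.5]])  # 5/2, 3, 5/2
b = np.array([-33., 24., 5.])

print(B @ A)   # the 3x3 identity matrix (up to rounding)
print(B @ b)   # the unique solution: [-3.  5.  2.]
```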

The matrix B of the previous example is called the inverse of A. When A and B are combined via matrix multiplication, the result is the identity matrix, which can be inserted “in front” of x as the first step in finding the solution. This is entirely analogous to how we might solve a single linear equation like 3x = 12.

x = 1x = \left(\frac{1}{3}(3)\right)x = \frac{1}{3}(3x) = \frac{1}{3}(12) = 4

Here we have obtained a solution by employing the “multiplicative inverse” of 3, 3^{-1} = \frac{1}{3}. This works fine for any scalar multiple of x, except for zero, since zero does not have a multiplicative inverse. Consider separately the two linear equations,

0x = 12        0x = 0

The first has no solutions, while the second has infinitely many solutions. For matrices, it is all just a little more complicated. Some matrices have inverses, some do not. And when a matrix does have an inverse, just how would we compute it? In other words, just where did that matrix B in the last example come from? Are there other matrices that might have worked just as well?

Subsection IM
Inverse of a Matrix

Definition MI Matrix Inverse
Suppose A and B are square matrices of size n such that AB = I_n and BA = I_n. Then A is invertible and B is the inverse of A. In this situation, we write B = A^{-1}. □

Notice that if B is the inverse of A, then we can just as easily say A is the inverse of B, or A and B are inverses of each other.

Not every square matrix has an inverse. In Example SABMI the matrix B is the inverse of the coefficient matrix of Archetype B. To see this it only remains to check that AB = I_3. What about Archetype A? It is an example of a square matrix


without an inverse.

Example MWIAA A matrix without an inverse, Archetype A
Consider the coefficient matrix from Archetype A,

A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}

Suppose that A is invertible and does have an inverse, say B. Choose the vector of constants

b = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}

and consider the system of equations LS(A, b). Just as in Example SABMI, this vector equation would have the unique solution x = Bb.

However, the system LS(A, b) is inconsistent. Form the augmented matrix [A | b] and row-reduce to

\left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]

which allows us to recognize the inconsistency by Theorem RCLS.

So the assumption of A's inverse leads to a logical inconsistency (the system cannot be both consistent and inconsistent), so our assumption is false. A is not invertible.

It is possible this example is less than satisfying. Just where did that particular choice of the vector b come from anyway? Stay tuned for an application of the future Theorem CSCS in Example CSAA. △

Let us look at one more matrix inverse before we embark on a more systematic study.

Example MI Matrix inverse
Consider the matrices,

A = \begin{bmatrix} 1 & 2 & 1 & 2 & 1 \\ -2 & -3 & 0 & -5 & -1 \\ 1 & 1 & 0 & 2 & 1 \\ -2 & -3 & -1 & -3 & -2 \\ -1 & -3 & -1 & -3 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} -3 & 3 & 6 & -1 & -2 \\ 0 & -2 & -5 & -1 & 1 \\ 1 & 2 & 4 & 1 & -1 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & -1 & -2 & 0 & 1 \end{bmatrix}

Then

AB = \begin{bmatrix} 1 & 2 & 1 & 2 & 1 \\ -2 & -3 & 0 & -5 & -1 \\ 1 & 1 & 0 & 2 & 1 \\ -2 & -3 & -1 & -3 & -2 \\ -1 & -3 & -1 & -3 & 1 \end{bmatrix} \begin{bmatrix} -3 & 3 & 6 & -1 & -2 \\ 0 & -2 & -5 & -1 & 1 \\ 1 & 2 & 4 & 1 & -1 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & -1 & -2 & 0 & 1 \end{bmatrix} = I_5

and

BA = \begin{bmatrix} -3 & 3 & 6 & -1 & -2 \\ 0 & -2 & -5 & -1 & 1 \\ 1 & 2 & 4 & 1 & -1 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & -1 & -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 & 2 & 1 \\ -2 & -3 & 0 & -5 & -1 \\ 1 & 1 & 0 & 2 & 1 \\ -2 & -3 & -1 & -3 & -2 \\ -1 & -3 & -1 & -3 & 1 \end{bmatrix} = I_5

so by Definition MI, we can say that A is invertible and write B = A^{-1}. △

We will now concern ourselves less with whether or not an inverse of a matrix exists, but instead with how you can find one when it does exist. In Section MINM we will have some theorems that allow us to more quickly and easily determine just when a matrix is invertible.

Subsection CIM
Computing the Inverse of a Matrix

We have seen that the matrices from Archetype B and Archetype K both have inverses, but these inverse matrices have just dropped from the sky. How would we compute an inverse? And just when is a matrix invertible, and when is it not? Writing a putative inverse with n^2 unknowns and solving the resultant n^2 equations is one approach. Applying this approach to 2 × 2 matrices can get us somewhere, so just for fun, let us do it.

Theorem TTMI Two-by-Two Matrix Inverse
Suppose

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

Then A is invertible if and only if ad − bc ≠ 0. When A is invertible, then

A^{-1} = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}

Proof. (⇐) Assume that ad − bc ≠ 0. We will use the definition of the inverse of a matrix to establish that A has an inverse (Definition MI). Note that if ad − bc ≠ 0 then the displayed formula for A^{-1} is legitimate since we are not dividing by zero.


Using this proposed formula for the inverse of A, we compute

AA^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\left(\frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\right) = \frac{1}{ad-bc}\begin{bmatrix} ad-bc & 0 \\ 0 & ad-bc \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

and

A^{-1}A = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}\begin{bmatrix} a & b \\ c & d \end{bmatrix} = \frac{1}{ad-bc}\begin{bmatrix} ad-bc & 0 \\ 0 & ad-bc \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

By Definition MI this is sufficient to establish that A is invertible, and that the expression for A^{-1} is correct.

(⇒) Assume that A is invertible, and proceed with a proof by contradiction (Proof Technique CD), by assuming also that ad − bc = 0. This translates to ad = bc. Let

B = \begin{bmatrix} e & f \\ g & h \end{bmatrix}

be a putative inverse of A. This means that

I_2 = AB = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} e & f \\ g & h \end{bmatrix} = \begin{bmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{bmatrix}

Working on the matrices on two ends of this equation, we will multiply the top row by c and the bottom row by a.

\begin{bmatrix} c & 0 \\ 0 & a \end{bmatrix} = \begin{bmatrix} ace+bcg & acf+bch \\ ace+adg & acf+adh \end{bmatrix}

We are assuming that ad = bc, so we can replace two occurrences of ad by bc in the bottom row of the right matrix.

\begin{bmatrix} c & 0 \\ 0 & a \end{bmatrix} = \begin{bmatrix} ace+bcg & acf+bch \\ ace+bcg & acf+bch \end{bmatrix}

The matrix on the right now has two rows that are identical, and therefore the same must be true of the matrix on the left. Identical rows for the matrix on the left implies that a = 0 and c = 0. With this information, the product AB becomes

\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2 = AB = \begin{bmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{bmatrix} = \begin{bmatrix} bg & bh \\ dg & dh \end{bmatrix}

So bg = dh = 1 and thus b, g, d, h are all nonzero. But then bh and dg (the “other corners”) must also be nonzero, so this is (finally) a contradiction. So our assumption was false and we see that ad − bc ≠ 0 whenever A has an inverse. ∎

There are several ways one could try to prove this theorem, but there is a continual temptation to divide by one of the eight entries involved (a through h), but we can never be sure if these numbers are zero or not. This could lead to an analysis by cases, which is messy, messy, messy. Note how the above proof never divides, but always multiplies, and how zero/nonzero considerations are handled. Pay attention to the expression ad − bc, as we will see it again in a while (Chapter D).
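Theorem TTMI translates directly into a few lines of code. The sketch below is our illustration (the function name is invented); it uses exact rational arithmetic so the zero test on ad − bc is trustworthy.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via Theorem TTMI; None when ad - bc = 0."""
    det = Fraction(a * d - b * c)
    if det == 0:
        return None                 # no inverse exists (Theorem TTMI)
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

print(inverse_2x2(1, 2, 0, 1))      # the inverse, as Fractions: [[1, -2], [0, 1]]
print(inverse_2x2(6, 3, 4, 2))      # None, since 6*2 - 3*4 = 0
```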

This theorem is cute, and it is nice to have a formula for the inverse, and a condition that tells us when we can use it. However, this approach becomes impractical for larger matrices, even though it is possible to demonstrate that, in theory, there is a general formula. (Think for a minute about extending this result to just 3 × 3 matrices. For starters, we need 18 letters!) Instead, we will work column-by-column. Let us first work an example that will motivate the main theorem and remove some of the previous mystery.

Example CMI Computing a matrix inverse
Consider the matrix defined in Example MI as,

A = \begin{bmatrix} 1 & 2 & 1 & 2 & 1 \\ -2 & -3 & 0 & -5 & -1 \\ 1 & 1 & 0 & 2 & 1 \\ -2 & -3 & -1 & -3 & -2 \\ -1 & -3 & -1 & -3 & 1 \end{bmatrix}

For its inverse, we desire a matrix B so that AB = I_5. Emphasizing the structure of the columns and employing the definition of matrix multiplication Definition MM,

AB = I_5
A[B_1|B_2|B_3|B_4|B_5] = [e_1|e_2|e_3|e_4|e_5]
[AB_1|AB_2|AB_3|AB_4|AB_5] = [e_1|e_2|e_3|e_4|e_5]

Equating the matrices column-by-column we have

AB_1 = e_1   AB_2 = e_2   AB_3 = e_3   AB_4 = e_4   AB_5 = e_5.

Since the matrix B is what we are trying to compute, we can view each column, B_i, as a column vector of unknowns. Then we have five systems of equations to solve, each with 5 equations in 5 variables. Notice that all 5 of these systems have the same coefficient matrix. We will now solve each system in turn.


Row-reduce the augmented matrix of the linear system LS(A, e_1),

\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 1 \\ -2 & -3 & 0 & -5 & -1 & 0 \\ 1 & 1 & 0 & 2 & 1 & 0 \\ -2 & -3 & -1 & -3 & -2 & 0 \\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & -3 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{array}\right]; \qquad B_1 = \begin{bmatrix} -3 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix}

Row-reduce the augmented matrix of the linear system LS(A, e_2),

\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0 \\ -2 & -3 & 0 & -5 & -1 & 1 \\ 1 & 1 & 0 & 2 & 1 & 0 \\ -2 & -3 & -1 & -3 & -2 & 0 \\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & 3 \\ 0 & 1 & 0 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 \end{array}\right]; \qquad B_2 = \begin{bmatrix} 3 \\ -2 \\ 2 \\ 0 \\ -1 \end{bmatrix}

Row-reduce the augmented matrix of the linear system LS(A, e_3),

\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0 \\ -2 & -3 & 0 & -5 & -1 & 0 \\ 1 & 1 & 0 & 2 & 1 & 1 \\ -2 & -3 & -1 & -3 & -2 & 0 \\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & 6 \\ 0 & 1 & 0 & 0 & 0 & -5 \\ 0 & 0 & 1 & 0 & 0 & 4 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & -2 \end{array}\right]; \qquad B_3 = \begin{bmatrix} 6 \\ -5 \\ 4 \\ 1 \\ -2 \end{bmatrix}

Row-reduce the augmented matrix of the linear system LS(A, e_4),

\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0 \\ -2 & -3 & 0 & -5 & -1 & 0 \\ 1 & 1 & 0 & 2 & 1 & 0 \\ -2 & -3 & -1 & -3 & -2 & 1 \\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{array}\right]; \qquad B_4 = \begin{bmatrix} -1 \\ -1 \\ 1 \\ 1 \\ 0 \end{bmatrix}

Row-reduce the augmented matrix of the linear system LS(A, e_5),

\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0 \\ -2 & -3 & 0 & -5 & -1 & 0 \\ 1 & 1 & 0 & 2 & 1 & 0 \\ -2 & -3 & -1 & -3 & -2 & 0 \\ -1 & -3 & -1 & -3 & 1 & 1 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & -2 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{array}\right]; \qquad B_5 = \begin{bmatrix} -2 \\ 1 \\ -1 \\ 0 \\ 1 \end{bmatrix}


We can now collect our 5 solution vectors into the matrix B,

B = [B_1|B_2|B_3|B_4|B_5] = \begin{bmatrix} -3 & 3 & 6 & -1 & -2 \\ 0 & -2 & -5 & -1 & 1 \\ 1 & 2 & 4 & 1 & -1 \\ 1 & 0 & 1 & 1 & 0 \\ 1 & -1 & -2 & 0 & 1 \end{bmatrix}

By this method, we know that AB = I_5. Check that BA = I_5, and then we will know that we have the inverse of A. △

Notice how the five systems of equations in the preceding example were all solved by exactly the same sequence of row operations. Would it not be nice to avoid this obvious duplication of effort? Our main theorem for this section follows, and it mimics this previous example, while also avoiding all the overhead.

Theorem CINM Computing the Inverse of a Nonsingular Matrix
Suppose A is a nonsingular square matrix of size n. Create the n × 2n matrix M by placing the n × n identity matrix I_n to the right of the matrix A. Let N be a matrix that is row-equivalent to M and in reduced row-echelon form. Finally, let J be the matrix formed from the final n columns of N. Then AJ = I_n.

Proof. A is nonsingular, so by Theorem NMRRI there is a sequence of row operations that will convert A into I_n. It is this same sequence of row operations that will convert M into N, since having the identity matrix in the first n columns of N is sufficient to guarantee that N is in reduced row-echelon form.

If we consider the systems of linear equations, LS(A, e_i), 1 ≤ i ≤ n, we see that the aforementioned sequence of row operations will also bring the augmented matrix of each of these systems into reduced row-echelon form. Furthermore, the unique solution to LS(A, e_i) appears in column n + 1 of the row-reduced augmented matrix of the system and is identical to column n + i of N. Let N_1, N_2, N_3, ..., N_{2n} denote the columns of N. So we find,

AJ = A[N_{n+1}|N_{n+2}|N_{n+3}| ... |N_{n+n}]
   = [AN_{n+1}|AN_{n+2}|AN_{n+3}| ... |AN_{n+n}]    Definition MM
   = [e_1|e_2|e_3| ... |e_n]
   = I_n                                             Definition IM

as desired. ∎


We have to be just a bit careful here about both what this theorem says and what it does not say. If A is a nonsingular matrix, then we are guaranteed a matrix B such that AB = I_n, and the proof gives us a process for constructing B. However, the definition of the inverse of a matrix (Definition MI) requires that BA = I_n also. So at this juncture we must compute the matrix product in the “opposite” order before we claim B as the inverse of A. However, we will soon see that this is always the case, in Theorem OSIS, so the title of this theorem is not inaccurate.

What if A is singular? At this point we only know that Theorem CINM cannot be applied. The question of A's inverse is still open. (But see Theorem NI in the next section.)
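The proof of Theorem CINM is essentially an algorithm, and it is short to code. Below is a minimal sketch of ours (the function name is invented; exact arithmetic via Fraction): form M = [A | I_n], row-reduce, and read J from the final n columns, returning None when the first n columns fail to reach the identity, which is exactly the singular case just discussed.

```python
from fractions import Fraction

def inverse_by_cinm(A):
    """Row-reduce M = [A | I_n] (Theorem CINM); return J with AJ = I_n, or None."""
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                              # A is singular
        M[col], M[pivot] = M[pivot], M[col]          # swap rows
        M[col] = [x / M[col][col] for x in M[col]]   # scale to a leading 1
        for r in range(n):                           # clear the rest of the column
            if r != col and M[r][col] != 0:
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]                    # the final n columns of N

print(inverse_by_cinm([[-7, -6, -12], [5, 5, 7], [1, 0, 4]]))
# rows -10,-12,-9 / 13/2,8,11/2 / 5/2,3,5/2 (as Fractions) -- the B of Example SABMI
```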

We will finish by computing the inverse for the coefficient matrix of Archetype B, the one we just pulled from a hat in Example SABMI. There are more examples in the Archetypes (Archetypes) to practice with, though notice that it is silly to ask for the inverse of a rectangular matrix (the sizes are not right) and not every square matrix has an inverse (remember Example MWIAA?).

Example CMIAB Computing a matrix inverse, Archetype B
Archetype B has a coefficient matrix given as

B = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix}

Exercising Theorem CINM we set

M = \left[\begin{array}{ccc|ccc} -7 & -6 & -12 & 1 & 0 & 0 \\ 5 & 5 & 7 & 0 & 1 & 0 \\ 1 & 0 & 4 & 0 & 0 & 1 \end{array}\right]

which row reduces to

N = \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -10 & -12 & -9 \\ 0 & 1 & 0 & \frac{13}{2} & 8 & \frac{11}{2} \\ 0 & 0 & 1 & \frac{5}{2} & 3 & \frac{5}{2} \end{array}\right]

So

B^{-1} = \begin{bmatrix} -10 & -12 & -9 \\ \frac{13}{2} & 8 & \frac{11}{2} \\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix}

once we check that B^{-1}B = I_3 (the product in the opposite order is a consequence of the theorem). △


Subsection PMI
Properties of Matrix Inverses

The inverse of a matrix enjoys some nice properties. We collect a few here. First, a matrix can have but one inverse.

Theorem MIU Matrix Inverse is Unique
Suppose the square matrix A has an inverse. Then A^{-1} is unique.

Proof. As described in Proof Technique U, we will assume that A has two inverses. The hypothesis tells us there is at least one. Suppose then that B and C are both inverses for A, so we know by Definition MI that AB = BA = I_n and AC = CA = I_n. Then we have,

B = BI_n       Theorem MMIM
  = B(AC)      Definition MI
  = (BA)C      Theorem MMA
  = I_nC       Definition MI
  = C          Theorem MMIM

So we conclude that B and C are the same, and cannot be different. So any matrix that acts like an inverse must be the inverse. ∎

When most of us dress in the morning, we put on our socks first, followed by our shoes. In the evening we must then first remove our shoes, followed by our socks. Try to connect the conclusion of the following theorem with this everyday example.

Theorem SS Socks and Shoes
Suppose A and B are invertible matrices of size n. Then AB is an invertible matrix and (AB)^{-1} = B^{-1}A^{-1}.

Proof. At the risk of carrying our everyday analogies too far, the proof of this theorem is quite easy when we compare it to the workings of a dating service. We have a statement about the inverse of the matrix AB, which for all we know right now might not even exist. Suppose AB was to sign up for a dating service with two requirements for a compatible date. Upon multiplication on the left, and on the right, the result should be the identity matrix. In other words, AB's ideal date would be its inverse.

Now along comes the matrix B^{-1}A^{-1} (which we know exists because our hypothesis says both A and B are invertible and we can form the product of these two matrices), also looking for a date. Let us see if B^{-1}A^{-1} is a good match for AB. First they meet at a noncommittal neutral location, say a coffee shop, for quiet conversation:

(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B     Theorem MMA


                   = B^{-1}I_nB           Definition MI
                   = B^{-1}B              Theorem MMIM
                   = I_n                  Definition MI

The first date having gone smoothly, a second, more serious, date is arranged, say dinner and a show:

(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1}     Theorem MMA
                   = AI_nA^{-1}           Definition MI
                   = AA^{-1}              Theorem MMIM
                   = I_n                  Definition MI

So the matrix B^{-1}A^{-1} has met all of the requirements to be AB's inverse (date) and with the ensuing marriage proposal we can announce that (AB)^{-1} = B^{-1}A^{-1}. ∎
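A one-line numerical confirmation of Socks and Shoes (our sketch, not the book's; random square matrices are invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))   # almost surely invertible
B = rng.standard_normal((4, 4))

inv = np.linalg.inv
# Theorem SS: shoes come off before socks.
assert np.allclose(inv(A @ B), inv(B) @ inv(A))
```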

Theorem MIMI Matrix Inverse of a Matrix Inverse
Suppose A is an invertible matrix. Then A^{-1} is invertible and (A^{-1})^{-1} = A.

Proof. As with the proof of Theorem SS, we examine if A is a suitable inverse for A^{-1} (by definition, the opposite is true).

AA^{-1} = I_n       Definition MI

and

A^{-1}A = I_n       Definition MI

The matrix A has met all the requirements to be the inverse of A^{-1}, and so is invertible and we can write A = (A^{-1})^{-1}. ∎

Theorem MIT Matrix Inverse of a Transpose
Suppose A is an invertible matrix. Then A^t is invertible and (A^t)^{-1} = (A^{-1})^t.

Proof. As with the proof of Theorem SS, we see if (A^{-1})^t is a suitable inverse for A^t. Apply Theorem MMT to see that

(A^{-1})^tA^t = (AA^{-1})^t     Theorem MMT
              = I_n^t           Definition MI
              = I_n             Definition SYM

and

A^t(A^{-1})^t = (A^{-1}A)^t     Theorem MMT
              = I_n^t           Definition MI


              = I_n             Definition SYM

The matrix (A^{-1})^t has met all the requirements to be the inverse of A^t, and so is invertible and we can write (A^t)^{-1} = (A^{-1})^t. ∎

Theorem MISM Matrix Inverse of a Scalar Multiple
Suppose A is an invertible matrix and α is a nonzero scalar. Then (αA)^{-1} = \frac{1}{α}A^{-1} and αA is invertible.

Proof. As with the proof of Theorem SS, we see if \frac{1}{α}A^{-1} is a suitable inverse for αA.

\left(\frac{1}{α}A^{-1}\right)(αA) = \left(\frac{1}{α}α\right)\left(A^{-1}A\right)     Theorem MMSMM
                                   = 1I_n                                              Scalar multiplicative inverses
                                   = I_n                                               Property OM

and

(αA)\left(\frac{1}{α}A^{-1}\right) = \left(α\frac{1}{α}\right)\left(AA^{-1}\right)     Theorem MMSMM
                                   = 1I_n                                              Scalar multiplicative inverses
                                   = I_n                                               Property OM

The matrix \frac{1}{α}A^{-1} has met all the requirements to be the inverse of αA, so we can write (αA)^{-1} = \frac{1}{α}A^{-1}. ∎

Notice that there are some likely theorems that are missing here. For example, it would be tempting to think that (A + B)^{-1} = A^{-1} + B^{-1}, but this is false. Can you find a counterexample? (See Exercise MISLE.T10.)

Reading Questions

1. Compute the inverse of the matrix below.
\begin{bmatrix} -2 & 3 \\ -3 & 4 \end{bmatrix}

2. Compute the inverse of the matrix below.
\begin{bmatrix} 2 & 3 & 1 \\ 1 & -2 & -3 \\ -2 & 4 & 6 \end{bmatrix}

3. Explain why Theorem SS has the title it does. (Do not just state the theorem, explain the choice of the title making reference to the theorem itself.)


Exercises

C16† If it exists, find the inverse of A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 2 & -1 & 1 \end{bmatrix}, and check your answer.

C17† If it exists, find the inverse of A = \begin{bmatrix} 2 & -1 & 1 \\ 1 & 2 & 1 \\ 3 & 1 & 2 \end{bmatrix}, and check your answer.

C18† If it exists, find the inverse of A = \begin{bmatrix} 1 & 3 & 1 \\ 1 & 2 & 1 \\ 2 & 2 & 1 \end{bmatrix}, and check your answer.

C19† If it exists, find the inverse of A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 2 & 1 \\ 2 & 2 & 1 \end{bmatrix}, and check your answer.

C21† Verify that B is the inverse of A.
A = \begin{bmatrix} 1 & 1 & -1 & 2 \\ -2 & -1 & 2 & -3 \\ 1 & 1 & 0 & 2 \\ -1 & 2 & 0 & 2 \end{bmatrix} \qquad B = \begin{bmatrix} 4 & 2 & 0 & -1 \\ 8 & 4 & -1 & -1 \\ -1 & 0 & 1 & 0 \\ -6 & -3 & 1 & 1 \end{bmatrix}

C22† Recycle the matrices A and B from Exercise MISLE.C21 and set
c = \begin{bmatrix} 2 \\ 1 \\ -3 \\ 2 \end{bmatrix} \qquad d = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
Employ the matrix B to solve the two linear systems LS(A, c) and LS(A, d).

C23 If it exists, find the inverse of the 2 × 2 matrix
A = \begin{bmatrix} 7 & 3 \\ 5 & 2 \end{bmatrix}
and check your answer. (See Theorem TTMI.)

C24 If it exists, find the inverse of the 2 × 2 matrix
A = \begin{bmatrix} 6 & 3 \\ 4 & 2 \end{bmatrix}
and check your answer. (See Theorem TTMI.)

C25 At the conclusion of Example CMI, verify that BA = I_5 by computing the matrix product.

C26† Let
D = \begin{bmatrix} 1 & -1 & 3 & -2 & 1 \\ -2 & 3 & -5 & 3 & 0 \\ 1 & -1 & 4 & -2 & 2 \\ -1 & 4 & -1 & 0 & 4 \\ 1 & 0 & 5 & -2 & 5 \end{bmatrix}
Compute the inverse of D, D^{-1}, by forming the 5 × 10 matrix [D | I_5] and row-reducing (Theorem CINM). Then use a calculator to compute D^{-1} directly.

C27† Let
E = \begin{bmatrix} 1 & -1 & 3 & -2 & 1 \\ -2 & 3 & -5 & 3 & -1 \\ 1 & -1 & 4 & -2 & 2 \\ -1 & 4 & -1 & 0 & 2 \\ 1 & 0 & 5 & -2 & 4 \end{bmatrix}
Compute the inverse of E, E^{-1}, by forming the 5 × 10 matrix [E | I_5] and row-reducing (Theorem CINM). Then use a calculator to compute E^{-1} directly.

C28† Let
C = \begin{bmatrix} 1 & 1 & 3 & 1 \\ -2 & -1 & -4 & -1 \\ 1 & 4 & 10 & 2 \\ -2 & 0 & -4 & 5 \end{bmatrix}
Compute the inverse of C, C^{-1}, by forming the 4 × 8 matrix [C | I_4] and row-reducing (Theorem CINM). Then use a calculator to compute C^{-1} directly.

C40† Find all solutions to the system of equations below, making use of the matrix inverse found in Exercise MISLE.C28.

x_1 + x_2 + 3x_3 + x_4 = −4
−2x_1 − x_2 − 4x_3 − x_4 = 4
x_1 + 4x_2 + 10x_3 + 2x_4 = −20
−2x_1 − 4x_3 + 5x_4 = 9

C41† Use the inverse of a matrix to find all the solutions to the following system of equations.

x_1 + 2x_2 − x_3 = −3
2x_1 + 5x_2 − x_3 = −4
−x_1 − 4x_2 = 2

C42† Use a matrix inverse to solve the linear system of equations.

x_1 − x_2 + 2x_3 = 5
x_1 − 2x_3 = −8
2x_1 − x_2 − x_3 = −6

T10† Construct an example to demonstrate that (A + B)^{-1} = A^{-1} + B^{-1} is not true for all square matrices A and B of the same size.


Section MINM
Matrix Inverses and Nonsingular Matrices

We saw in Theorem CINM that if a square matrix A is nonsingular, then there is a matrix B so that AB = I_n. In other words, B is halfway to being an inverse of A. We will see in this section that B automatically fulfills the second condition (BA = I_n). Example MWIAA showed us that the coefficient matrix from Archetype A had no inverse. Not coincidentally, this coefficient matrix is singular. We will make all these connections precise now. Not many examples or definitions in this section, just theorems.

Subsection NMI
Nonsingular Matrices are Invertible

We need a couple of technical results for starters. Some books would call these minor, but essential, results “lemmas.” We'll just call 'em theorems. See Proof Technique LC for more on the distinction.

The first of these technical results is interesting in that the hypothesis says something about a product of two square matrices and the conclusion then says the same thing about each individual matrix in the product. This result has an analogy in the algebra of complex numbers: suppose α, β ∈ C, then αβ ≠ 0 if and only if α ≠ 0 and β ≠ 0. We can view this result as suggesting that the term “nonsingular” for matrices is like the term “nonzero” for scalars. Consider too that we know singular matrices, as coefficient matrices for systems of equations, will sometimes lead to systems with no solutions, or systems with infinitely many solutions (Theorem NMUS). What do linear equations with zero look like? Consider 0x = 5, which has no solution, and 0x = 0, which has infinitely many solutions. In the algebra of scalars, zero is exceptional (meaning different, not better), and in the algebra of matrices, singular matrices are also the exception. While there is only one zero scalar, and there are infinitely many singular matrices, we will see that singular matrices are a distinct minority.

Theorem NPNT Nonsingular Product has Nonsingular Terms
Suppose that A and B are square matrices of size n. The product AB is nonsingular if and only if A and B are both nonsingular.

Proof. (⇒) For this portion of the proof we will form the logically-equivalent contrapositive and prove that statement using two cases. “AB is nonsingular implies A and B are both nonsingular” becomes “A or B is singular implies AB is singular.” (Be sure to understand why the “and” became an “or”, see Proof Technique CP.)

Case 1. Suppose B is singular. Then there is a nonzero vector z that is a solution


to LS(B, 0). So

(AB)z = A(Bz)     Theorem MMA
      = A0        Theorem SLEMM
      = 0         Theorem MMZM

With Theorem SLEMM we can translate this vector equality to the statement that z is a nonzero solution to LS(AB, 0). Thus AB is singular (Definition NM), as desired.

Case 2. Suppose A is singular, and B is not singular. In other words, with Case 1 complete, we can be more precise about this remaining case and assume that B is nonsingular. Because A is singular, there is a nonzero vector y that is a solution to LS(A, 0). Now consider the linear system LS(B, y). Since B is nonsingular, the system has a unique solution (Theorem NMUS), which we will denote as w. We first claim w is not the zero vector either. Assuming the opposite, suppose that w = 0 (Proof Technique CD). Then

y = Bw     Theorem SLEMM
  = B0     Hypothesis
  = 0      Theorem MMZM

contrary to y being nonzero. So w ≠ 0. The pieces are in place, so here we go,

(AB)w = A(Bw)     Theorem MMA
      = Ay        Theorem SLEMM
      = 0         Theorem SLEMM

With Theorem SLEMM we can translate this vector equality to the statement that w is a nonzero solution to LS(AB, 0). Thus AB is singular (Definition NM), as desired. And this conclusion holds for both cases.

(⇐) Now assume that both A and B are nonsingular. Suppose that x ∈ C^n is a solution to LS(AB, 0). Then

0 = (AB)x     Theorem SLEMM
  = A(Bx)     Theorem MMA

By Theorem SLEMM, Bx is a solution to LS(A, 0), and by the definition of a nonsingular matrix (Definition NM), we conclude that Bx = 0. Now, by an entirely similar argument, the nonsingularity of B forces us to conclude that x = 0. So the only solution to LS(AB, 0) is the zero vector and we conclude that AB is nonsingular by Definition NM. ∎


This is a powerful result in the “forward” direction, because it allows us to begin with a hypothesis that something complicated (the matrix product AB) has the property of being nonsingular, and we can then conclude that the simpler constituents (A and B individually) then also have the property of being nonsingular. If we had thought that the matrix product was an artificial construction, results like this would make us begin to think twice.

The contrapositive of this entire result is equally interesting. It says that A or B (or both) is a singular matrix if and only if the product AB is singular. (See Proof Technique CP.)
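Here is a small numerical illustration of the contrapositive (our example, not from the text): a singular factor drags the product down with it.

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])   # nonsingular (det = -2)
B = np.array([[1., 2.], [2., 4.]])   # singular: rows are dependent

print(np.linalg.matrix_rank(A @ B))  # 1, so AB is singular (Theorem NPNT)
z = np.array([-2., 1.])              # a nonzero solution of LS(B, 0)
print((A @ B) @ z)                   # [0. 0.], mirroring Case 1 of the proof
```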

Theorem OSIS One-Sided Inverse is Sufficient
Suppose A and B are square matrices of size n such that AB = I_n. Then BA = I_n.

Proof. The matrix I_n is nonsingular (since it row-reduces easily to I_n, Theorem NMRRI). So A and B are nonsingular by Theorem NPNT, so in particular B is nonsingular. We can therefore apply Theorem CINM to assert the existence of a matrix C so that BC = I_n. This application of Theorem CINM could be a bit confusing, mostly because of the names of the matrices involved. B is nonsingular, so there must be a “right-inverse” for B, and we are calling it C.

Now

BA = (BA)I_n        Theorem MMIM
   = (BA)(BC)       Theorem CINM
   = B(AB)C         Theorem MMA
   = BI_nC          Hypothesis
   = BC             Theorem MMIM
   = I_n            Theorem CINM

which is the desired conclusion. ∎

So Theorem OSIS tells us that if A is nonsingular, then the matrix B guaranteed by Theorem CINM will be both a “right-inverse” and a “left-inverse” for A, so A is invertible and A^{-1} = B.

So if you have a nonsingular matrix, A, you can use the procedure described in Theorem CINM to find an inverse for A. If A is singular, then the procedure in Theorem CINM will fail as the first n columns of M will not row-reduce to the identity matrix. However, we can say a bit more. When A is singular, then A does not have an inverse (which is very different from saying that the procedure in Theorem CINM fails to find an inverse). This may feel like we are splitting hairs, but it is important that we do not make unfounded assumptions. These observations motivate the next theorem.

Theorem NI Nonsingularity is Invertibility
Suppose that A is a square matrix. Then A is nonsingular if and only if A is invertible.


Proof. (⇐) Since A is invertible, we can write I_n = AA^{-1} (Definition MI). Notice that I_n is nonsingular (Theorem NMRRI) so Theorem NPNT implies that A (and A^{-1}) is nonsingular.

(⇒) Suppose now that A is nonsingular. By Theorem CINM we find B so that AB = I_n. Then Theorem OSIS tells us that BA = I_n. So B is A's inverse, and by construction, A is invertible. ∎

So for a square matrix, the properties of hav<strong>in</strong>g an <strong>in</strong>verse and of hav<strong>in</strong>g a trivial<br />

null space are one and the same. Cannot have one without the other.<br />

Theorem NME3 Nonsingular Matrix Equivalences, Round 3
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.

2. A row-reduces to the identity matrix.

3. The null space of A contains only the zero vector, N(A) = {0}.

4. The linear system LS(A, b) has a unique solution for every possible choice of b.

5. The columns of A are a linearly independent set.

6. A is invertible.

Proof. We can update our list of equivalences for nonsingular matrices (Theorem NME2) with the equivalent condition from Theorem NI. ∎

In the case that A is a nonsingular coefficient matrix of a system of equations, the inverse allows us to very quickly compute the unique solution, for any vector of constants.

Theorem SNCM Solution with Nonsingular Coefficient Matrix
Suppose that A is nonsingular. Then the unique solution to LS(A, b) is A^{-1}b.

Proof. By Theorem NMUS we know already that LS(A, b) has a unique solution for every choice of b. We need to show that the expression stated is indeed a solution (the solution). That is easy, just "plug it in" to the vector equation representation of the system (Theorem SLEMM),

    A(A^{-1}b) = (AA^{-1})b    Theorem MMA
               = I_n b         Definition MI
               = b             Theorem MMIM

Since Ax = b is true when we substitute A^{-1}b for x, A^{-1}b is a (the!) solution to LS(A, b). ∎
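A quick illustration of this economy (a sketch with a hypothetical 2 × 2 system, SymPy doing the arithmetic): compute A^{-1} once, then each new vector of constants costs only a matrix-vector product.

    from sympy import Matrix

    A = Matrix([[2, 1], [1, 1]])   # hypothetical nonsingular coefficient matrix
    Ainv = A.inv()
    for b in (Matrix([3, 2]), Matrix([0, 1])):
        x = Ainv * b               # the unique solution to LS(A, b), by Theorem SNCM
        assert A * x == b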



Subsection UM
Unitary Matrices

Recall that the adjoint of a matrix is A^* = (\overline{A})^t (Definition A).

Definition UM Unitary Matrices
Suppose that U is a square matrix of size n such that U^*U = I_n. Then we say U is unitary. □

This condition may seem rather far-fetched at first glance. Would there be any<br />

matrix that behaved this way? Well, yes, here is one.<br />

Example UM3 Unitary matrix of size 3

    U = \begin{bmatrix}
          \frac{1+i}{\sqrt{5}} & \frac{3+2i}{\sqrt{55}} & \frac{2+2i}{\sqrt{22}} \\
          \frac{1-i}{\sqrt{5}} & \frac{2+2i}{\sqrt{55}} & \frac{-3+i}{\sqrt{22}} \\
          \frac{i}{\sqrt{5}}   & \frac{3-5i}{\sqrt{55}} & \frac{-2}{\sqrt{22}}
        \end{bmatrix}

The computations get a bit tiresome, but if you work your way through the computation of U^*U, you will arrive at the 3 × 3 identity matrix I_3. △
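That tiresome computation is a natural job to delegate. Here is a sketch with SymPy (an arbitrary choice of software on our part) that builds U from this example and expands the entries of U^*U:

    from sympy import Matrix, I, sqrt, eye

    U = Matrix([[(1 + I)/sqrt(5), (3 + 2*I)/sqrt(55), (2 + 2*I)/sqrt(22)],
                [(1 - I)/sqrt(5), (2 + 2*I)/sqrt(55), (-3 + I)/sqrt(22)],
                [I/sqrt(5),       (3 - 5*I)/sqrt(55), -2/sqrt(22)]])
    # U.H is the adjoint U^*; expanding each entry of U^* U gives I_3 exactly.
    assert (U.H * U).expand() == eye(3)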

Unitary matrices do not have to look quite so gruesome. Here is a larger one<br />

that is a bit more pleas<strong>in</strong>g.<br />

Example UPM Unitary permutation matrix
The matrix

    P = \begin{bmatrix}
          0 & 1 & 0 & 0 & 0 \\
          0 & 0 & 0 & 1 & 0 \\
          1 & 0 & 0 & 0 & 0 \\
          0 & 0 & 0 & 0 & 1 \\
          0 & 0 & 1 & 0 & 0
        \end{bmatrix}

is unitary as can be easily checked. Notice that it is just a rearrangement of the columns of the 5 × 5 identity matrix, I_5 (Definition IM).

An interesting exercise is to build another 5 × 5 unitary matrix, R, using a different rearrangement of the columns of I_5. Then form the product PR. This will be another unitary matrix (Exercise MINM.T10). If you were to build all 5! = 5 × 4 × 3 × 2 × 1 = 120 matrices of this type you would have a set that remains closed under matrix multiplication. It is an example of another algebraic structure known as a group since together the set and the one operation (matrix multiplication here) is closed, associative, has an identity (I_5), and inverses (Theorem UMI). Notice though that the operation in this group is not commutative! △
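A small experiment in this direction (a sketch; the rearrangement chosen for R is one arbitrary possibility among the 120):

    from sympy import Matrix, eye

    P = Matrix([[0, 1, 0, 0, 0],
                [0, 0, 0, 1, 0],
                [1, 0, 0, 0, 0],
                [0, 0, 0, 0, 1],
                [0, 0, 1, 0, 0]])
    # R: a different rearrangement of the columns of I_5 (our arbitrary choice).
    R = Matrix.hstack(*[eye(5).col(j) for j in (4, 0, 1, 3, 2)])
    assert P.T * P == eye(5) and R.T * R == eye(5)   # real entries, so U^* = U^t
    assert (P * R).T * (P * R) == eye(5)             # the product is unitary too
    assert P * R != R * P                            # the group is not commutative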

If a matrix A has only real number entries (we say it is a real matrix) then the defining property of being unitary simplifies to A^tA = I_n. In this case we, and everybody else, call the matrix orthogonal, so you may often encounter this term



<strong>in</strong> your other read<strong>in</strong>g when the complex numbers are not under consideration.<br />

Unitary matrices have easily computed inverses. They also have columns that form orthonormal sets. Here are the theorems that show us that unitary matrices are not as strange as they might initially appear.

Theorem UMI Unitary Matrices are Invertible
Suppose that U is a unitary matrix of size n. Then U is nonsingular, and U^{-1} = U^*.

Proof. By Definition UM, we know that U^*U = I_n. The matrix I_n is nonsingular (since it row-reduces easily to I_n, Theorem NMRRI). So by Theorem NPNT, U and U^* are both nonsingular matrices.
The equation U^*U = I_n gets us halfway to an inverse of U, and Theorem OSIS tells us that then UU^* = I_n also. So U and U^* are inverses of each other (Definition MI). ∎

Theorem CUMOS Columns of Unitary Matrices are Orthonormal Sets
Suppose that S = {A_1, A_2, A_3, ..., A_n} is the set of columns of a square matrix A of size n. Then A is a unitary matrix if and only if S is an orthonormal set.

Proof. The proof revolves around recognizing that a typical entry of the product A^*A is an inner product of columns of A. Here are the details to support this claim.

    [A^*A]_{ij} = \sum_{k=1}^{n} [A^*]_{ik}\,[A]_{kj}                        Theorem EMP
                = \sum_{k=1}^{n} \left[\overline{A}^{\,t}\right]_{ik}\,[A]_{kj}   Definition A
                = \sum_{k=1}^{n} \left[\overline{A}\right]_{ki}\,[A]_{kj}         Definition TM
                = \sum_{k=1}^{n} \overline{[A]_{ki}}\,[A]_{kj}                    Definition CCM
                = \sum_{k=1}^{n} \overline{[A_i]_{k}}\,[A_j]_{k}
                = \langle A_i, A_j \rangle                                        Definition IP

We now employ this equality in a chain of equivalences,

    S = {A_1, A_2, A_3, ..., A_n} is an orthonormal set
    \iff \langle A_i, A_j \rangle = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}          Definition ONS
    \iff [A^*A]_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases},  1 ≤ i ≤ n, 1 ≤ j ≤ n
    \iff [A^*A]_{ij} = [I_n]_{ij},  1 ≤ i ≤ n, 1 ≤ j ≤ n                   Definition IM
    \iff A^*A = I_n                                                         Definition ME
    \iff A is a unitary matrix                                              Definition UM
∎

Example OSMC Orthonormal set from matrix columns
The matrix

    U = \begin{bmatrix}
          \frac{1+i}{\sqrt{5}} & \frac{3+2i}{\sqrt{55}} & \frac{2+2i}{\sqrt{22}} \\
          \frac{1-i}{\sqrt{5}} & \frac{2+2i}{\sqrt{55}} & \frac{-3+i}{\sqrt{22}} \\
          \frac{i}{\sqrt{5}}   & \frac{3-5i}{\sqrt{55}} & \frac{-2}{\sqrt{22}}
        \end{bmatrix}

from Example UM3 is a unitary matrix. By Theorem CUMOS, its columns

    \left\{
      \begin{bmatrix} \frac{1+i}{\sqrt{5}} \\ \frac{1-i}{\sqrt{5}} \\ \frac{i}{\sqrt{5}} \end{bmatrix},
      \begin{bmatrix} \frac{3+2i}{\sqrt{55}} \\ \frac{2+2i}{\sqrt{55}} \\ \frac{3-5i}{\sqrt{55}} \end{bmatrix},
      \begin{bmatrix} \frac{2+2i}{\sqrt{22}} \\ \frac{-3+i}{\sqrt{22}} \\ \frac{-2}{\sqrt{22}} \end{bmatrix}
    \right\}

form an orthonormal set. You might find checking the six inner products of pairs of these vectors easier than doing the matrix product U^*U. Or, because the inner product is anti-commutative (Theorem IPAC) you only need check three inner products (see Exercise MINM.T12). △
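Here is a sketch of that reduced check, where ip is our helper name and conjugates the first argument, matching the entry computation in the proof of Theorem CUMOS (SymPy again, by our choice):

    from itertools import combinations
    from sympy import Matrix, I, sqrt, expand

    U = Matrix([[(1 + I)/sqrt(5), (3 + 2*I)/sqrt(55), (2 + 2*I)/sqrt(22)],
                [(1 - I)/sqrt(5), (2 + 2*I)/sqrt(55), (-3 + I)/sqrt(22)],
                [I/sqrt(5),       (3 - 5*I)/sqrt(55), -2/sqrt(22)]])
    cols = [U.col(j) for j in range(3)]

    def ip(u, v):
        # <u, v>: conjugate-transpose the first vector, multiply, extract the scalar
        return expand((u.H * v)[0])

    assert all(ip(u, u) == 1 for u in cols)                      # the three norms
    assert all(ip(u, v) == 0 for u, v in combinations(cols, 2))  # three inner products suffice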

When using vectors and matrices that only have real number entries, orthogonal matrices are those matrices with inverses that equal their transpose. Similarly, the inner product is the familiar dot product. Keep this special case in mind as you read the next theorem.

Theorem UMPIP Unitary Matrices Preserve Inner Products
Suppose that U is a unitary matrix of size n and u and v are two vectors from C^n. Then

    \langle Uu, Uv \rangle = \langle u, v \rangle    and    \|Uv\| = \|v\|

Proof.

    \langle Uu, Uv \rangle = \overline{(Uu)}^{\,t}\,Uv                          Theorem MMIP
                           = \left(\overline{U}\,\overline{u}\right)^{t}\,Uv    Theorem MMCC
                           = \overline{u}^{\,t}\,\overline{U}^{\,t}\,Uv         Theorem MMT
                           = \overline{u}^{\,t}\,U^*Uv                          Definition A
                           = \overline{u}^{\,t}\,I_n v                          Definition UM
                           = \overline{u}^{\,t}\,v                              Theorem MMIM
                           = \langle u, v \rangle                               Theorem MMIP

The second conclusion is just a specialization of the first conclusion.

    \|Uv\| = \sqrt{\|Uv\|^2}
           = \sqrt{\langle Uv, Uv \rangle}    Theorem IPN
           = \sqrt{\langle v, v \rangle}
           = \sqrt{\|v\|^2}                   Theorem IPN
           = \|v\|
∎

Aside from the inherent interest in this theorem, it makes a bigger statement about unitary matrices. When we view vectors geometrically as directions or forces, then the norm equates to a notion of length. If we transform a vector by multiplication with a unitary matrix, then the length (norm) of that vector stays the same. If we consider column vectors with two or three slots containing only real numbers, then the inner product of two such vectors is just the dot product, and this quantity can be used to compute the angle between two vectors. When two vectors are multiplied (transformed) by the same unitary matrix, their dot product is unchanged and their individual lengths are unchanged. This results in the angle between the two vectors remaining unchanged.

A "unitary transformation" (matrix-vector products with unitary matrices) thus preserves geometrical relationships among vectors representing directions, forces, or other physical quantities. In the case of a two-slot vector with real entries, this is simply a rotation. These sorts of computations are exceedingly important in computer graphics such as games and real-time simulations, especially when increased realism is achieved by performing many such computations quickly. We will see unitary matrices again in subsequent sections (especially Theorem OD) and in each instance, consider the interpretation of the unitary matrix as a sort of geometry-preserving transformation. Some authors use the term isometry to highlight this behavior. We will speak loosely of a unitary matrix as being a sort of generalized rotation.
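To see the two-slot real case concretely, here is a sketch (the angle pi/6 and the two vectors are arbitrary choices of ours): a plane rotation is an orthogonal matrix, and both dot products and lengths survive multiplication by it.

    from sympy import Matrix, cos, sin, pi, eye, simplify

    t = pi / 6
    Q = Matrix([[cos(t), -sin(t)],
                [sin(t),  cos(t)]])                  # rotation of the plane by t
    assert (Q.T * Q).applyfunc(simplify) == eye(2)   # real and unitary, i.e. orthogonal
    u, v = Matrix([3, 4]), Matrix([1, 0])
    assert simplify((Q * u).dot(Q * v) - u.dot(v)) == 0   # dot product unchanged
    assert simplify((Q * u).dot(Q * u) - u.dot(u)) == 0   # so the length sqrt(<u, u>) is too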

A final reminder: the terms "dot product," "symmetric matrix" and "orthogonal matrix" used in reference to vectors or matrices with real number entries are special cases of the terms "inner product," "Hermitian matrix" and "unitary matrix" that we use for vectors or matrices with complex number entries, so keep that in mind as you read elsewhere.

Reading Questions

1. Compute the inverse of the coefficient matrix of the system of equations below and use the inverse to solve the system.

    4x_1 + 10x_2 = 12
    2x_1 + 6x_2 = 4

2. In the reading questions for Section MISLE you were asked to find the inverse of the 3 × 3 matrix below.

    \begin{bmatrix} 2 & 3 & 1 \\ 1 & -2 & -3 \\ -2 & 4 & 6 \end{bmatrix}

Because the matrix was not nonsingular, you had no theorems at that point that would allow you to compute the inverse. Explain why you now know that the inverse does not exist (which is different than not being able to compute it) by quoting the relevant theorem's acronym.

3. Is the matrix A unitary? Why?

    A = \begin{bmatrix}
          \frac{1}{\sqrt{22}}(4+2i)  & \frac{1}{\sqrt{374}}(5+3i) \\
          \frac{1}{\sqrt{22}}(-1-i)  & \frac{1}{\sqrt{374}}(12+14i)
        \end{bmatrix}

Exercises

C20 Let A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 2 \end{bmatrix} and B = \begin{bmatrix} -1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{bmatrix}. Verify that AB is nonsingular.

C40† Solve the system of equations below using the inverse of a matrix.

    x_1 + x_2 + 3x_3 + x_4 = 5
    -2x_1 - x_2 - 4x_3 - x_4 = -7
    x_1 + 4x_2 + 10x_3 + 2x_4 = 9
    -2x_1 - 4x_3 + 5x_4 = 9

M10† Find values of x, y, z so that matrix A = \begin{bmatrix} 1 & 2 & x \\ 3 & 0 & y \\ 1 & 1 & z \end{bmatrix} is invertible.

M11† Find values of x, y, z so that matrix A = \begin{bmatrix} 1 & x & 1 \\ 1 & y & 4 \\ 0 & z & 5 \end{bmatrix} is singular.

M15† If A and B are n × n matrices, A is nonsingular, and B is singular, show directly that AB is singular, without using Theorem NPNT.

M20 † Construct an example of a 4 × 4 unitary matrix.<br />

M80† Matrix multiplication interacts nicely with many operations. But not always with transforming a matrix to reduced row-echelon form. Suppose that A is an m × n matrix and B is an n × p matrix. Let P be a matrix that is row-equivalent to A and in reduced row-echelon form, Q be a matrix that is row-equivalent to B and in reduced row-echelon form, and let R be a matrix that is row-equivalent to AB and in reduced row-echelon form. Is PQ = R? (In other words, with nonstandard notation, is rref(A)rref(B) = rref(AB)?)
Construct a counterexample to show that, in general, this statement is false. Then find a large class of matrices where if A and B are in the class, then the statement is true.

T10 Suppose that Q and P are unitary matrices of size n. Prove that QP is a unitary matrix.

T11 Prove that Hermitian matrices (Definition HM) have real entries on the diagonal. More precisely, suppose that A is a Hermitian matrix of size n. Then [A]_{ii} ∈ ℝ, 1 ≤ i ≤ n.

T12 Suppose that we are checking if a square matrix of size n is unitary. Show that a straightforward application of Theorem CUMOS requires the computation of n^2 inner products when the matrix is unitary, and fewer when the matrix is not unitary. Then show that this maximum number of inner products can be reduced to \frac{1}{2}n(n+1) in light of Theorem IPAC.

T25 The notation A^k means a repeated matrix product between k copies of the square matrix A.

1. Assume A is an n × n matrix where A^2 = O (which does not imply that A = O). Prove that I_n − A is invertible by showing that I_n + A is an inverse of I_n − A.

2. Assume that A is an n × n matrix where A^3 = O. Prove that I_n − A is invertible.

3. Form a general theorem based on your observations from parts (1) and (2) and provide a proof.


Section CRS<br />

Column and Row Spaces<br />

A matrix-vector product (Definition MVP) is a linear combination of the columns of the matrix and this allows us to connect matrix multiplication with systems of equations via Theorem SLSLC. Row operations are linear combinations of the rows of a matrix, and of course, reduced row-echelon form (Definition RREF) is also intimately related to solving systems of equations. In this section we will formalize these ideas with two key definitions of sets of vectors derived from a matrix.

Subsection CSSE
Column Spaces and Systems of Equations

Theorem SLSLC showed us that there is a natural correspondence between solutions to linear systems and linear combinations of the columns of the coefficient matrix. This idea motivates the following important definition.

Definition CSM Column Space of a Matrix
Suppose that A is an m × n matrix with columns A_1, A_2, A_3, ..., A_n. Then the column space of A, written C(A), is the subset of C^m containing all linear combinations of the columns of A,

    C(A) = ⟨{A_1, A_2, A_3, ..., A_n}⟩
□

Some authors refer to the column space of a matrix as the range, but we will reserve this term for use with linear transformations (Definition RLT).

Upon encountering any new set, the first question we ask is what objects are in the set, and which objects are not? Here is an example of one way to answer this question, and it will motivate a theorem that will then answer the question precisely.

Example CSMCS Column space of a matrix and consistent systems
Archetype D and Archetype E are linear systems of equations, with an identical 3 × 4 coefficient matrix, which we call A here. However, Archetype D is consistent, while Archetype E is not. We can explain this difference by employing the column space of the matrix A.

The column vector of constants, b, in Archetype D is given below, and one solution listed for LS(A, b) is x,

    b = \begin{bmatrix} 8 \\ -12 \\ 4 \end{bmatrix}        x = \begin{bmatrix} 7 \\ 8 \\ 1 \\ 3 \end{bmatrix}

By Theorem SLSLC, we can summarize this solution as a linear combination of the columns of A that equals b,

    7\begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix} + 8\begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix} + 1\begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix} + 3\begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix} = \begin{bmatrix} 8 \\ -12 \\ 4 \end{bmatrix} = b.

This equation says that b is a linear combination of the columns of A, and then by Definition CSM, we can say that b ∈ C(A).

On the other hand, Archetype E is the linear system LS(A, c), where the vector of constants is

    c = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}

and this system of equations is inconsistent. This means c ∉ C(A), for if it were, then it would equal a linear combination of the columns of A and Theorem SLSLC would lead us to a solution of the system LS(A, c). △

So if we fix the coefficient matrix, and vary the vector of constants, we can sometimes find consistent systems, and sometimes inconsistent systems. The vectors of constants that lead to consistent systems are exactly the elements of the column space. This is the content of the next theorem, and since it is an equivalence, it provides an alternate view of the column space.

Theorem CSCS Column Spaces and Consistent Systems
Suppose A is an m × n matrix and b is a vector of size m. Then b ∈ C(A) if and only if LS(A, b) is consistent.

Proof. (⇒) Suppose b ∈ C(A). Then we can write b as some linear combination of the columns of A. By Theorem SLSLC we can use the scalars from this linear combination to form a solution to LS(A, b), so this system is consistent.
(⇐) If LS(A, b) is consistent, there is a solution that may be used with Theorem SLSLC to write b as a linear combination of the columns of A. This qualifies b for membership in C(A). ∎

This theorem tells us that asking if the system LS(A, b) is consistent is exactly the same question as asking if b is in the column space of A. Or equivalently, it tells us that the column space of the matrix A is precisely those vectors of constants, b, that can be paired with A to create a system of linear equations LS(A, b) that is consistent.

Employing Theorem SLEMM we can form the chain of equivalences

    b ∈ C(A) \iff LS(A, b) is consistent \iff Ax = b for some x

Thus, an alternative (and popular) definition of the column space of an m × n matrix A is

    C(A) = \{ y ∈ C^m \mid y = Ax \text{ for some } x ∈ C^n \} = \{ Ax \mid x ∈ C^n \} ⊆ C^m

We recognize this as saying: create all the matrix-vector products possible with the matrix A by letting x range over all of the possibilities. By Definition MVP we see that this means take all possible linear combinations of the columns of A, precisely the definition of the column space (Definition CSM) we have chosen.

Notice how this formulation of the column space looks very much like the definition of the null space of a matrix (Definition NSM), but for a rectangular matrix the column vectors of C(A) and N(A) have different sizes, so the sets are very different.

Given a vector b and a matrix A it is now very mechanical to test if b ∈ C(A). Form the linear system LS(A, b), row-reduce the augmented matrix, [A | b], and test for consistency with Theorem RCLS. Here is an example of this procedure.

Example MCSM Membership in the column space of a matrix
Consider the column space of the 3 × 4 matrix A,

    A = \begin{bmatrix} 3 & 2 & 1 & -4 \\ -1 & 1 & -2 & 3 \\ 2 & -4 & 6 & -8 \end{bmatrix}

We first show that v = \begin{bmatrix} 18 \\ -6 \\ 12 \end{bmatrix} is in the column space of A, v ∈ C(A). Theorem CSCS says we need only check the consistency of LS(A, v). Form the augmented matrix and row-reduce,

    \begin{bmatrix} 3 & 2 & 1 & -4 & 18 \\ -1 & 1 & -2 & 3 & -6 \\ 2 & -4 & 6 & -8 & 12 \end{bmatrix}
    \xrightarrow{\text{RREF}}
    \begin{bmatrix} 1 & 0 & 1 & -2 & 6 \\ 0 & 1 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

Since the final column is not a pivot column, Theorem RCLS tells us the system is consistent and therefore by Theorem CSCS, v ∈ C(A).

If we wished to demonstrate explicitly that v is a linear combination of the columns of A, we can find a solution (any solution) of LS(A, v) and use Theorem SLSLC to construct the desired linear combination. For example, set the free variables to x_3 = 2 and x_4 = 1. Then a solution has x_2 = 1 and x_1 = 6. Then by Theorem SLSLC,

    v = \begin{bmatrix} 18 \\ -6 \\ 12 \end{bmatrix} = 6\begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix} + 1\begin{bmatrix} 2 \\ 1 \\ -4 \end{bmatrix} + 2\begin{bmatrix} 1 \\ -2 \\ 6 \end{bmatrix} + 1\begin{bmatrix} -4 \\ 3 \\ -8 \end{bmatrix}

Now we show that w = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix} is not in the column space of A, w ∉ C(A). Theorem CSCS says we need only check the consistency of LS(A, w). Form the augmented matrix and row-reduce,

    \begin{bmatrix} 3 & 2 & 1 & -4 & 2 \\ -1 & 1 & -2 & 3 & 1 \\ 2 & -4 & 6 & -8 & -3 \end{bmatrix}
    \xrightarrow{\text{RREF}}
    \begin{bmatrix} 1 & 0 & 1 & -2 & 0 \\ 0 & 1 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}

Since the final column is a pivot column, Theorem RCLS tells us the system is inconsistent and therefore by Theorem CSCS, w ∉ C(A). △
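This procedure scripts directly (a sketch; in_column_space is our helper name). With SymPy, rref() reports the pivot columns, so Theorem RCLS's test becomes: is the final column's index among them?

    from sympy import Matrix

    def in_column_space(A, b):
        # b is in C(A) iff LS(A, b) is consistent (Theorem CSCS), iff the last
        # column of the row-reduced augmented matrix is not a pivot column.
        _, pivots = A.row_join(b).rref()
        return A.cols not in pivots

    A = Matrix([[3, 2, 1, -4], [-1, 1, -2, 3], [2, -4, 6, -8]])
    assert in_column_space(A, Matrix([18, -6, 12]))       # v above
    assert not in_column_space(A, Matrix([2, 1, -3]))     # w above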

Theorem CSCS completes a collection of three theorems, and one definition, that deserve comment. Many questions about spans, linear independence, null space, column spaces and similar objects can be converted to questions about systems of equations (homogeneous or not), which we understand well from our previous results, especially those in Chapter SLE. These previous results include theorems like Theorem RCLS which allows us to quickly decide consistency of a system, and Theorem BNS which allows us to describe solution sets for homogeneous systems compactly as the span of a linearly independent set of column vectors.

The table below lists these four definitions and theorems along with a brief reminder of the statement and an example of how the statement is used.

Definition NSM
    Synopsis: Null space is solution set of homogeneous system
    Example:  General solution sets described by Theorem PSPHS

Theorem SLSLC
    Synopsis: Solutions for linear combinations with unknown scalars
    Example:  Deciding membership in spans

Theorem SLEMM
    Synopsis: System of equations represented by matrix-vector product
    Example:  Solution to LS(A, b) is A^{-1}b when A is nonsingular

Theorem CSCS
    Synopsis: Column space vectors create consistent systems
    Example:  Deciding membership in column spaces

Subsection CSSOC
Column Space Spanned by Original Columns

So we have a foolproof, automated procedure for determining membership in C(A). While this works just fine a vector at a time, we would like to have a more useful description of the set C(A) as a whole. The next example will preview the first of two fundamental results about the column space of a matrix.

Example CSTW Column space, two ways
Consider the 5 × 7 matrix A,

    \begin{bmatrix}
       2 &  4 &  1 & -1 &  1 &  4 &  4 \\
       1 &  2 &  1 &  0 &  2 &  4 &  7 \\
       0 &  0 &  1 &  4 &  1 &  8 &  7 \\
       1 &  2 & -1 &  2 &  1 &  9 &  6 \\
      -2 & -4 &  1 &  3 & -1 & -2 & -2
    \end{bmatrix}

According to the definition (Definition CSM), the column space of A is the span of its seven columns,

    C(A) = \left\langle \left\{
      \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \\ -2 \end{bmatrix},
      \begin{bmatrix} 4 \\ 2 \\ 0 \\ 2 \\ -4 \end{bmatrix},
      \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \\ 1 \end{bmatrix},
      \begin{bmatrix} -1 \\ 0 \\ 4 \\ 2 \\ 3 \end{bmatrix},
      \begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \\ -1 \end{bmatrix},
      \begin{bmatrix} 4 \\ 4 \\ 8 \\ 9 \\ -2 \end{bmatrix},
      \begin{bmatrix} 4 \\ 7 \\ 7 \\ 6 \\ -2 \end{bmatrix}
    \right\} \right\rangle

While this is a concise description of an infinite set, we might be able to describe the span with fewer than seven vectors. This is the substance of Theorem BS. So we take these seven vectors and make them the columns of a matrix, which is simply the original matrix A again. Now we row-reduce,

    \begin{bmatrix}
       2 &  4 &  1 & -1 &  1 &  4 &  4 \\
       1 &  2 &  1 &  0 &  2 &  4 &  7 \\
       0 &  0 &  1 &  4 &  1 &  8 &  7 \\
       1 &  2 & -1 &  2 &  1 &  9 &  6 \\
      -2 & -4 &  1 &  3 & -1 & -2 & -2
    \end{bmatrix}
    \xrightarrow{\text{RREF}}
    \begin{bmatrix}
      1 & 2 & 0 & 0 & 0 &  3 & 1 \\
      0 & 0 & 1 & 0 & 0 & -1 & 0 \\
      0 & 0 & 0 & 1 & 0 &  2 & 1 \\
      0 & 0 & 0 & 0 & 1 &  1 & 3 \\
      0 & 0 & 0 & 0 & 0 &  0 & 0
    \end{bmatrix}

The pivot columns are D = {1, 3, 4, 5}, so we can create the set

    T = \left\{
      \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \\ -2 \end{bmatrix},
      \begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \\ 1 \end{bmatrix},
      \begin{bmatrix} -1 \\ 0 \\ 4 \\ 2 \\ 3 \end{bmatrix},
      \begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \\ -1 \end{bmatrix}
    \right\}

and know that C(A) = ⟨T⟩ and T is a linearly independent set of columns from the set of columns of A. △

We will now formalize the previous example, which will make it trivial to determine a linearly independent set of vectors that will span the column space of a matrix, and that consists of just columns of A.

Theorem BCS Basis of the Column Space
Suppose that A is an m × n matrix with columns A_1, A_2, A_3, ..., A_n, and B is a row-equivalent matrix in reduced row-echelon form with r pivot columns. Let D = {d_1, d_2, d_3, ..., d_r} be the set of indices for the pivot columns of B. Let T = {A_{d_1}, A_{d_2}, A_{d_3}, ..., A_{d_r}}. Then

1. T is a linearly independent set.

2. C(A) = ⟨T⟩.

Proof. Definition CSM describes the column space as the span of the set of columns of A. Theorem BS tells us that we can reduce the set of vectors used in a span. If we apply Theorem BS to C(A), we would collect the columns of A into a matrix (which would just be A again) and bring the matrix to reduced row-echelon form, which is the matrix B in the statement of the theorem. In this case, the conclusions of Theorem BS applied to A, B and C(A) are exactly the conclusions we desire. ∎

This is a nice result since it gives us a handful of vectors that describe the entire column space (through the span), and we believe this set is as small as possible because we cannot create any more relations of linear dependence to trim it down further. Furthermore, we defined the column space (Definition CSM) as all linear combinations of the columns of the matrix, and the elements of the set T are still columns of the matrix (we will not be so lucky in the next two constructions of the column space).

Procedurally this theorem is extremely easy to apply. Row-reduce the original matrix, identify the r pivot columns of the reduced matrix, and grab the columns of the original matrix with the same indices as the pivot columns, as in the sketch below. But it is still important to study the proof of Theorem BS and its motivation in Example COV which lie at the root of this theorem. We will trot through an example all the same.
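The sketch promised above (SymPy's rref() returns the pivot indices, 0-based, so the bookkeeping is automatic); we run it on the matrix of Example CSTW:

    from sympy import Matrix

    A = Matrix([[ 2,  4,  1, -1,  1,  4,  4],
                [ 1,  2,  1,  0,  2,  4,  7],
                [ 0,  0,  1,  4,  1,  8,  7],
                [ 1,  2, -1,  2,  1,  9,  6],
                [-2, -4,  1,  3, -1, -2, -2]])
    _, pivots = A.rref()
    T = [A.col(d) for d in pivots]   # the original columns with pivot indices span C(A)
    assert pivots == (0, 2, 3, 4)    # D = {1, 3, 4, 5} in the text's 1-based numbering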

Example CSOCD Column space, original columns, Archetype D
Let us determine a compact expression for the entire column space of the coefficient matrix of the system of equations that is Archetype D. Notice that in Example CSMCS we were only determining if individual vectors were in the column space or not, now we are describing the entire column space.

To start with the application of Theorem BCS, call the coefficient matrix A and row-reduce it to reduced row-echelon form B,

    A = \begin{bmatrix} 2 & 1 & 7 & -7 \\ -3 & 4 & -5 & -6 \\ 1 & 1 & 4 & -5 \end{bmatrix}
    B = \begin{bmatrix} 1 & 0 & 3 & -2 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & 0 & 0 \end{bmatrix}

Since columns 1 and 2 are pivot columns, D = {1, 2}. To construct a set that spans C(A), just grab the columns of A with indices in D, so

    C(A) = \left\langle \left\{ \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix} \right\} \right\rangle.

That's it.

In Example CSMCS we determined that the vector

    c = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}

was not in the column space of A. Try to write c as a linear combination of the first two columns of A. What happens?

Also in Example CSMCS we determined that the vector

    b = \begin{bmatrix} 8 \\ -12 \\ 4 \end{bmatrix}

was in the column space of A. Try to write b as a linear combination of the first two columns of A. What happens? Did you find a unique solution to this question? Hmmmm. △

Subsection CSNM
Column Space of a Nonsingular Matrix

Let us specialize to square matrices and contrast the column spaces of the coefficient matrices in Archetype A and Archetype B.

Example CSAA Column space of Archetype A
The coefficient matrix in Archetype A is A, which row-reduces to B,

    A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
    B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}

Columns 1 and 2 are pivot columns, so by Theorem BCS we can write

    C(A) = ⟨{A_1, A_2}⟩ = \left\langle \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \right\} \right\rangle.

We want to show in this example that C(A) ≠ C^3. So take, for example, the vector b = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}. Then there is no solution to the system LS(A, b), or equivalently, it is not possible to write b as a linear combination of A_1 and A_2. Try one of these two computations yourself. (Or try both!) Since b ∉ C(A), the column space of A cannot be all of C^3. So by varying the vector of constants, it is possible to create inconsistent systems of equations with this coefficient matrix (the vector b being one such example).

In Example MWIAA we wished to show that the coefficient matrix from Archetype A was not invertible as a first example of a matrix without an inverse. Our device there was to find an inconsistent linear system with A as the coefficient matrix. The vector of constants in that example was b, deliberately chosen outside the column space of A. △



Example CSAB Column space of Archetype B
The coefficient matrix in Archetype B, call it B here, is known to be nonsingular (see Example NM). By Theorem NMUS, the linear system LS(B, b) has a (unique) solution for every choice of b. Theorem CSCS then says that b ∈ C(B) for all b ∈ C^3.
Stated differently, there is no way to build an inconsistent system with the coefficient matrix B, but then we knew that already from Theorem NMUS. △

Example CSAA and Example CSAB together motivate the following equivalence, which says that nonsingular matrices have column spaces that are as big as possible.

Theorem CSNM Column Space of a Nonsingular Matrix
Suppose A is a square matrix of size n. Then A is nonsingular if and only if C(A) = C^n.

Proof. (⇒) Suppose A is nonsingular. We wish to establish the set equality C(A) = C^n. By Definition CSM, C(A) ⊆ C^n. To show that C^n ⊆ C(A) choose b ∈ C^n. By Theorem NMUS, we know the linear system LS(A, b) has a (unique) solution and therefore is consistent. Theorem CSCS then says that b ∈ C(A). So by Definition SE, C(A) = C^n.
(⇐) If e_i is column i of the n × n identity matrix (Definition SUV) and by hypothesis C(A) = C^n, then e_i ∈ C(A) for 1 ≤ i ≤ n. By Theorem CSCS, the system LS(A, e_i) is consistent for 1 ≤ i ≤ n. Let b_i denote any one particular solution to LS(A, e_i), 1 ≤ i ≤ n.
Define the n × n matrix B = [b_1 | b_2 | b_3 | ... | b_n]. Then

    AB = A[b_1 | b_2 | b_3 | ... | b_n]
       = [Ab_1 | Ab_2 | Ab_3 | ... | Ab_n]    Definition MM
       = [e_1 | e_2 | e_3 | ... | e_n]
       = I_n                                  Definition SUV

So the matrix B is a "right-inverse" for A. By Theorem NMRRI, I_n is a nonsingular matrix, so by Theorem NPNT both A and B are nonsingular. Thus, in particular, A is nonsingular. (Travis Osborne contributed to this proof.) ∎

With this equivalence for nonsingular matrices we can update our list, Theorem NME3.

Theorem NME4 Nonsingular Matrix Equivalences, Round 4
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.

2. A row-reduces to the identity matrix.

3. The null space of A contains only the zero vector, N(A) = {0}.

4. The linear system LS(A, b) has a unique solution for every possible choice of b.

5. The columns of A are a linearly independent set.

6. A is invertible.

7. The column space of A is C^n, C(A) = C^n.

Proof. Since Theorem CSNM is an equivalence, we can add it to the list in Theorem NME3. ∎

Subsection RSM
Row Space of a Matrix

The rows of a matrix can be viewed as vectors, since they are just lists of numbers, arranged horizontally. So we will transpose a matrix, turning rows into columns, so we can then manipulate rows as column vectors. As a result we will be able to make some new connections between row operations and solutions to systems of equations. OK, here is the second primary definition of this section.

Definition RSM Row Space of a Matrix
Suppose A is an m × n matrix. Then the row space of A, R(A), is the column space of A^t, i.e. R(A) = C(A^t). □

Informally, the row space is the set of all linear combinations of the rows of A. However, we write the rows as column vectors, thus the necessity of using the transpose to make the rows into columns. Additionally, with the row space defined in terms of the column space, all of the previous results of this section can be applied to row spaces.

Notice that if A is a rectangular m × n matrix, then C(A) ⊆ C^m, while R(A) ⊆ C^n and the two sets are not comparable since they do not even hold objects of the same type. However, when A is square of size n, both C(A) and R(A) are subsets of C^n, though usually the sets will not be equal (but see Exercise CRS.M20).

Example RSAI Row space of Archetype I
The coefficient matrix in Archetype I is

    I = \begin{bmatrix}
           1 &  4 &  0 & -1 &  0 &   7 &  -9 \\
           2 &  8 & -1 &  3 &  9 & -13 &   7 \\
           0 &  0 &  2 & -3 & -4 &  12 &  -8 \\
          -1 & -4 &  2 &  4 &  8 & -31 &  37
        \end{bmatrix}

To build the row space, we transpose the matrix,

    I^t = \begin{bmatrix}
             1 &   2 &   0 &  -1 \\
             4 &   8 &   0 &  -4 \\
             0 &  -1 &   2 &   2 \\
            -1 &   3 &  -3 &   4 \\
             0 &   9 &  -4 &   8 \\
             7 & -13 &  12 & -31 \\
            -9 &   7 &  -8 &  37
          \end{bmatrix}

Then the columns of this matrix are used in a span to build the row space,

    R(I) = C(I^t) = \left\langle \left\{
      \begin{bmatrix} 1 \\ 4 \\ 0 \\ -1 \\ 0 \\ 7 \\ -9 \end{bmatrix},
      \begin{bmatrix} 2 \\ 8 \\ -1 \\ 3 \\ 9 \\ -13 \\ 7 \end{bmatrix},
      \begin{bmatrix} 0 \\ 0 \\ 2 \\ -3 \\ -4 \\ 12 \\ -8 \end{bmatrix},
      \begin{bmatrix} -1 \\ -4 \\ 2 \\ 4 \\ 8 \\ -31 \\ 37 \end{bmatrix}
    \right\} \right\rangle.

However, we can use Theorem BCS to get a slightly better description. First, row-reduce I^t,

    \begin{bmatrix}
      1 & 0 & 0 & -\frac{31}{7} \\
      0 & 1 & 0 &  \frac{12}{7} \\
      0 & 0 & 1 &  \frac{13}{7} \\
      0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0
    \end{bmatrix}.

Since the pivot columns have indices D = {1, 2, 3}, the column space of I^t can be spanned by just the first three columns of I^t,

    R(I) = C(I^t) = \left\langle \left\{
      \begin{bmatrix} 1 \\ 4 \\ 0 \\ -1 \\ 0 \\ 7 \\ -9 \end{bmatrix},
      \begin{bmatrix} 2 \\ 8 \\ -1 \\ 3 \\ 9 \\ -13 \\ 7 \end{bmatrix},
      \begin{bmatrix} 0 \\ 0 \\ 2 \\ -3 \\ -4 \\ 12 \\ -8 \end{bmatrix}
    \right\} \right\rangle.
△

The row space would not be too interesting if it were simply the column space of the transpose. However, when we do row operations on a matrix we have no effect on the many linear combinations that can be formed with the rows of the matrix. This is stated more carefully in the following theorem.

Theorem REMRS Row-Equivalent Matrices have equal Row Spaces
Suppose A and B are row-equivalent matrices. Then R(A) = R(B).

Proof. Two matrices are row-equivalent (Definition REM) if one can be obtained from the other by a sequence of (possibly many) row operations. We will prove the theorem for two matrices that differ by a single row operation, and then this result can be applied repeatedly to get the full statement of the theorem. The row spaces of A and B are spans of the columns of their transposes. For each row operation we perform on a matrix, we can define an analogous operation on the columns. Perhaps we should call these column operations. Instead, we will still call them row operations, but we will apply them to the columns of the transposes.

Refer to the columns of A^t and B^t as A_i and B_i, 1 ≤ i ≤ m. The row operation that switches rows will just switch columns of the transposed matrices. This will have no effect on the possible linear combinations formed by the columns.

Suppose that B^t is formed from A^t by multiplying column A_t by α ≠ 0. In other words, B_t = αA_t, and B_i = A_i for all i ≠ t. We need to establish that two sets are equal, C(A^t) = C(B^t). We will take a generic element of one and show that it is contained in the other.

    β_1B_1 + β_2B_2 + β_3B_3 + ··· + β_tB_t + ··· + β_mB_m
      = β_1A_1 + β_2A_2 + β_3A_3 + ··· + β_t(αA_t) + ··· + β_mA_m
      = β_1A_1 + β_2A_2 + β_3A_3 + ··· + (αβ_t)A_t + ··· + β_mA_m

says that C(B^t) ⊆ C(A^t). Similarly,

    γ_1A_1 + γ_2A_2 + γ_3A_3 + ··· + γ_tA_t + ··· + γ_mA_m
      = γ_1A_1 + γ_2A_2 + γ_3A_3 + ··· + (γ_t/α)(αA_t) + ··· + γ_mA_m
      = γ_1B_1 + γ_2B_2 + γ_3B_3 + ··· + (γ_t/α)B_t + ··· + γ_mB_m

says that C(A^t) ⊆ C(B^t). So R(A) = C(A^t) = C(B^t) = R(B) when a single row operation of the second type is performed.

Suppose now that B^t is formed from A^t by replacing A_t with αA_s + A_t for some α ∈ C and s ≠ t. In other words, B_t = αA_s + A_t, and B_i = A_i for i ≠ t.

    β_1B_1 + β_2B_2 + ··· + β_sB_s + ··· + β_tB_t + ··· + β_mB_m
      = β_1A_1 + β_2A_2 + ··· + β_sA_s + ··· + β_t(αA_s + A_t) + ··· + β_mA_m
      = β_1A_1 + β_2A_2 + ··· + β_sA_s + ··· + (β_tα)A_s + β_tA_t + ··· + β_mA_m
      = β_1A_1 + β_2A_2 + ··· + β_sA_s + (β_tα)A_s + ··· + β_tA_t + ··· + β_mA_m
      = β_1A_1 + β_2A_2 + ··· + (β_s + β_tα)A_s + ··· + β_tA_t + ··· + β_mA_m

says that C(B^t) ⊆ C(A^t). Similarly,

    γ_1A_1 + γ_2A_2 + ··· + γ_sA_s + ··· + γ_tA_t + ··· + γ_mA_m
      = γ_1A_1 + γ_2A_2 + ··· + γ_sA_s + ··· + (−αγ_tA_s + αγ_tA_s) + γ_tA_t + ··· + γ_mA_m
      = γ_1A_1 + γ_2A_2 + ··· + (−αγ_t + γ_s)A_s + ··· + γ_t(αA_s + A_t) + ··· + γ_mA_m
      = γ_1B_1 + γ_2B_2 + ··· + (−αγ_t + γ_s)B_s + ··· + γ_tB_t + ··· + γ_mB_m

says that C(A^t) ⊆ C(B^t). So R(A) = C(A^t) = C(B^t) = R(B) when a single row operation of the third type is performed.

So the row space of a matrix is preserved by each row operation, and hence row spaces of row-equivalent matrices are equal sets. ∎

Example RSREM Row spaces of two row-equivalent matrices
In Example TREM we saw that the matrices

    A = \begin{bmatrix} 2 & -1 & 3 & 4 \\ 5 & 2 & -2 & 3 \\ 1 & 1 & 0 & 6 \end{bmatrix}
    B = \begin{bmatrix} 1 & 1 & 0 & 6 \\ 3 & 0 & -2 & -9 \\ 2 & -1 & 3 & 4 \end{bmatrix}

are row-equivalent by demonstrating a sequence of two row operations that converted A into B. Applying Theorem REMRS we can say

    R(A) = \left\langle \left\{
      \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \end{bmatrix},
      \begin{bmatrix} 5 \\ 2 \\ -2 \\ 3 \end{bmatrix},
      \begin{bmatrix} 1 \\ 1 \\ 0 \\ 6 \end{bmatrix}
    \right\} \right\rangle
    = \left\langle \left\{
      \begin{bmatrix} 1 \\ 1 \\ 0 \\ 6 \end{bmatrix},
      \begin{bmatrix} 3 \\ 0 \\ -2 \\ -9 \end{bmatrix},
      \begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \end{bmatrix}
    \right\} \right\rangle = R(B)
△
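Theorem REMRS also underwrites a computational test: the nonzero rows of the reduced row-echelon form describe the row space canonically (Theorem BRS, just ahead), so two matrices of the same size have equal row spaces exactly when they share a reduced row-echelon form. A sketch with the matrices of this example:

    from sympy import Matrix

    A = Matrix([[2, -1, 3, 4], [5, 2, -2, 3], [1, 1, 0, 6]])
    B = Matrix([[1, 1, 0, 6], [3, 0, -2, -9], [2, -1, 3, 4]])
    assert A.rref()[0] == B.rref()[0]   # identical rref, hence R(A) = R(B)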

Theorem REMRS is at its best when one of the row-equivalent matrices is in reduced row-echelon form. The vectors that are zero rows can be ignored. (Who needs the zero vector when building a span? See Exercise LI.T10.) The echelon pattern ensures that the nonzero rows yield vectors that are linearly independent. Here is the theorem.

Theorem BRS Basis for the Row Space
Suppose that A is a matrix and B is a row-equivalent matrix in reduced row-echelon form. Let S be the set of nonzero columns of B^t. Then

1. R(A) = ⟨S⟩.

2. S is a linearly independent set.

Proof. From Theorem REMRS we know that R(A) = R(B). If B has any zero rows, these are columns of B^t that are the zero vector. We can safely toss out the zero vector in the span construction, since it can be recreated from the nonzero vectors by a linear combination where all the scalars are zero. So R(A) = ⟨S⟩.

Suppose B has r nonzero rows and let D = {d_1, d_2, d_3, ..., d_r} denote the indices of the pivot columns of B. Denote the r column vectors of B^t, the vectors in S, as B_1, B_2, B_3, ..., B_r. To show that S is linearly independent, start with a relation of linear dependence

    α_1B_1 + α_2B_2 + α_3B_3 + ··· + α_rB_r = 0

Now consider this vector equality in location d_i. Since B is in reduced row-echelon form, the entries of column d_i of B are all zero, except for a leading 1 in row i. Thus, in B^t, row d_i is all zeros, excepting a 1 in column i. So, for 1 ≤ i ≤ r,

    0 = [0]_{d_i}                                                                  Definition ZCV
      = [α_1B_1 + α_2B_2 + α_3B_3 + ··· + α_rB_r]_{d_i}                            Definition RLDCV
      = [α_1B_1]_{d_i} + [α_2B_2]_{d_i} + [α_3B_3]_{d_i} + ··· + [α_rB_r]_{d_i}    Definition MA
      = α_1[B_1]_{d_i} + α_2[B_2]_{d_i} + α_3[B_3]_{d_i} + ··· + α_r[B_r]_{d_i}    Definition MSM
      = α_1(0) + α_2(0) + α_3(0) + ··· + α_i(1) + ··· + α_r(0)                     Definition RREF
      = α_i

So we conclude that α_i = 0 for all 1 ≤ i ≤ r, establishing the linear independence of S (Definition LICV). ∎

Example IAS Improving a span
Suppose in the course of analyzing a matrix (its column space, its null space, its ...) we encounter the following set of vectors, described by a span

    X = \left\langle \left\{
      \begin{bmatrix} 1 \\ 2 \\ 1 \\ 6 \\ 6 \end{bmatrix},
      \begin{bmatrix} 3 \\ -1 \\ 2 \\ -1 \\ 6 \end{bmatrix},
      \begin{bmatrix} 1 \\ -1 \\ 0 \\ -1 \\ -2 \end{bmatrix},
      \begin{bmatrix} -3 \\ 2 \\ -3 \\ 6 \\ -10 \end{bmatrix}
    \right\} \right\rangle

Let A be the matrix whose rows are the vectors in X, so by design X = R(A),

    A = \begin{bmatrix}
           1 &  2 &  1 &  6 &   6 \\
           3 & -1 &  2 & -1 &   6 \\
           1 & -1 &  0 & -1 &  -2 \\
          -3 &  2 & -3 &  6 & -10
        \end{bmatrix}

Row-reduce A to form a row-equivalent matrix in reduced row-echelon form,

    B = \begin{bmatrix}
          1 & 0 & 0 &  2 & -1 \\
          0 & 1 & 0 &  3 &  1 \\
          0 & 0 & 1 & -2 &  5 \\
          0 & 0 & 0 &  0 &  0
        \end{bmatrix}

Then Theorem BRS says we can grab the nonzero columns of B^t and write

    X = R(A) = R(B) = \left\langle \left\{
      \begin{bmatrix} 1 \\ 0 \\ 0 \\ 2 \\ -1 \end{bmatrix},
      \begin{bmatrix} 0 \\ 1 \\ 0 \\ 3 \\ 1 \end{bmatrix},
      \begin{bmatrix} 0 \\ 0 \\ 1 \\ -2 \\ 5 \end{bmatrix}
    \right\} \right\rangle

These three vectors provide a much-improved description of X. There are fewer vectors, and the pattern of zeros and ones in the first three entries makes it easier to determine membership in X. △
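The recipe of this example packages neatly as a routine (a sketch; improve_span is our name for it, and SymPy our tool):

    from sympy import Matrix

    def improve_span(vectors):
        # Stack the vectors as rows, row-reduce, and keep the nonzero rows:
        # by Theorem BRS they are linearly independent and span the same set.
        B, _ = Matrix.vstack(*[v.T for v in vectors]).rref()
        return [B.row(i).T for i in range(B.rows) if any(e != 0 for e in B.row(i))]

    X = [Matrix([1, 2, 1, 6, 6]), Matrix([3, -1, 2, -1, 6]),
         Matrix([1, -1, 0, -1, -2]), Matrix([-3, 2, -3, 6, -10])]
    S = improve_span(X)
    assert len(S) == 3 and S[0] == Matrix([1, 0, 0, 2, -1])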

Notice that in Example IAS all we had to do was row-reduce the right matrix and toss out a zero row. Next to row operations themselves, Theorem BRS is probably the most powerful computational technique at your disposal as it quickly provides a much improved description of a span, any span (row space, column space, ...).

Theorem BRS and the techniques of Example IAS will provide yet another description of the column space of a matrix. First we state a triviality as a theorem, so we can reference it later.

Theorem CSRST Column Space, Row Space, Transpose
Suppose A is a matrix. Then C(A) = R(A^t).

Proof.

    C(A) = C((A^t)^t)    Theorem TT
         = R(A^t)        Definition RSM
∎

So to find another expression for the column space of a matrix, build its transpose, row-reduce it, toss out the zero rows, and convert the nonzero rows to column vectors to yield an improved set for the span construction. We will do Archetype I, then you do Archetype J.

Example CSROI Column space from row operations, Archetype I
To find the column space of the coefficient matrix of Archetype I, we proceed as follows. The matrix is

    I = \begin{bmatrix}
           1 &  4 &  0 & -1 &  0 &   7 &  -9 \\
           2 &  8 & -1 &  3 &  9 & -13 &   7 \\
           0 &  0 &  2 & -3 & -4 &  12 &  -8 \\
          -1 & -4 &  2 &  4 &  8 & -31 &  37
        \end{bmatrix}

The transpose is

    I^t = \begin{bmatrix}
             1 &   2 &   0 &  -1 \\
             4 &   8 &   0 &  -4 \\
             0 &  -1 &   2 &   2 \\
            -1 &   3 &  -3 &   4 \\
             0 &   9 &  -4 &   8 \\
             7 & -13 &  12 & -31 \\
            -9 &   7 &  -8 &  37
          \end{bmatrix}.

Row-reduced this becomes,

    \begin{bmatrix}
      1 & 0 & 0 & -\frac{31}{7} \\
      0 & 1 & 0 &  \frac{12}{7} \\
      0 & 0 & 1 &  \frac{13}{7} \\
      0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0 \\
      0 & 0 & 0 & 0
    \end{bmatrix}.

Now, using Theorem CSRST and Theorem BRS,

    C(I) = R(I^t) = \left\langle \left\{
      \begin{bmatrix} 1 \\ 0 \\ 0 \\ -\frac{31}{7} \end{bmatrix},
      \begin{bmatrix} 0 \\ 1 \\ 0 \\ \frac{12}{7} \end{bmatrix},
      \begin{bmatrix} 0 \\ 0 \\ 1 \\ \frac{13}{7} \end{bmatrix}
    \right\} \right\rangle.

This is a very nice description of the column space. Fewer vectors than the 7 involved in the definition, and the pattern of the zeros and ones in the first 3 slots can be used to advantage. For example, Archetype I is presented as a consistent system of equations with a vector of constants

    b = \begin{bmatrix} 3 \\ 9 \\ 1 \\ 4 \end{bmatrix}.

Since LS(I, b) is consistent, Theorem CSCS tells us that b ∈ C(I). But we could see this quickly with the following computation, which really only involves any work in the 4th entry of the vectors as the scalars in the linear combination are dictated by the first three entries of b.

    b = \begin{bmatrix} 3 \\ 9 \\ 1 \\ 4 \end{bmatrix}
      = 3\begin{bmatrix} 1 \\ 0 \\ 0 \\ -\frac{31}{7} \end{bmatrix}
      + 9\begin{bmatrix} 0 \\ 1 \\ 0 \\ \frac{12}{7} \end{bmatrix}
      + 1\begin{bmatrix} 0 \\ 0 \\ 1 \\ \frac{13}{7} \end{bmatrix}

Can you now rapidly construct several vectors, b, so that LS(I, b) is consistent, and several more so that the system is inconsistent? △
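The transpose trick of Theorem CSRST automates the same way (a sketch, reusing the row-space routine idea on I^t and finishing with the membership computation for b):

    from sympy import Matrix

    I_mat = Matrix([[ 1,  4,  0, -1,  0,   7,  -9],
                    [ 2,  8, -1,  3,  9, -13,   7],
                    [ 0,  0,  2, -3, -4,  12,  -8],
                    [-1, -4,  2,  4,  8, -31,  37]])   # Archetype I's coefficient matrix
    B, _ = I_mat.T.rref()
    basis = [B.row(i).T for i in range(B.rows) if any(e != 0 for e in B.row(i))]
    assert len(basis) == 3                     # the three vectors displayed above
    b = Matrix([3, 9, 1, 4])
    assert 3*basis[0] + 9*basis[1] + 1*basis[2] == b   # so b is in C(I)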



Reading Questions

1. Write the column space of the matrix below as the span of a set of three vectors and explain your choice of method.

    \begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 0 & 1 & 1 \\ -1 & 2 & 1 & 0 \end{bmatrix}

2. Suppose that A is an n × n nonsingular matrix. What can you say about its column space?

3. Is the vector \begin{bmatrix} 0 \\ 5 \\ 2 \\ 3 \end{bmatrix} in the row space of the following matrix? Why or why not?

    \begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 0 & 1 & 1 \\ -1 & 2 & 1 & 0 \end{bmatrix}

Exercises

C20 For each matrix below, find a set of linearly independent vectors X so that $\langle X\rangle$ equals the column space of the matrix, and a set of linearly independent vectors Y so that $\langle Y\rangle$ equals the row space of the matrix.
$$A = \begin{bmatrix}1&2&3&1\\0&1&1&2\\1&-1&2&3\\1&1&2&-1\end{bmatrix}\qquad
B = \begin{bmatrix}1&2&1&1&1\\3&2&-1&4&5\\0&1&1&1&2\end{bmatrix}\qquad
C = \begin{bmatrix}2&1&0\\3&0&3\\1&2&-3\\1&1&-1\\1&1&-1\end{bmatrix}$$
From your results for these three matrices, can you formulate a conjecture about the sets X and Y?
C30† Example CSOCD expresses the column space of the coefficient matrix from Archetype D (call the matrix A here) as the span of the first two columns of A. In Example CSMCS we determined that the vector
$$c = \begin{bmatrix}2\\3\\2\end{bmatrix}$$
was not in the column space of A and that the vector
$$b = \begin{bmatrix}8\\-12\\4\end{bmatrix}$$
was in the column space of A. Attempt to write c and b as linear combinations of the two vectors in the span construction for the column space in Example CSOCD and record your observations.
C31† For the matrix A below find a set of vectors T meeting the following requirements: (1) the span of T is the column space of A, that is, $\langle T\rangle = \mathcal{C}(A)$, (2) T is linearly independent, and (3) the elements of T are columns of A.
$$A = \begin{bmatrix}2&1&4&-1&2\\1&-1&5&1&1\\-1&2&-7&0&1\\2&-1&8&-1&2\end{bmatrix}$$
C32 In Example CSAA, verify that the vector b is not in the column space of the coefficient matrix.
C33† Find a linearly independent set S so that the span of S, $\langle S\rangle$, is the row space of the matrix B.
$$B = \begin{bmatrix}2&3&1&1\\1&1&0&1\\-1&2&3&-4\end{bmatrix}$$
C34† For the 3 × 4 matrix A and the column vector $y \in \mathbb{C}^4$ given below, determine if y is in the row space of A. In other words, answer the question: is $y \in \mathcal{R}(A)$?
$$A = \begin{bmatrix}-2&6&7&-1\\7&-3&0&-3\\8&0&7&6\end{bmatrix}\qquad
y = \begin{bmatrix}2\\1\\3\\-2\end{bmatrix}$$
C35† For the matrix A below, find two different linearly independent sets whose spans equal the column space of A, $\mathcal{C}(A)$, such that
1. the elements are each columns of A.
2. the set is obtained by a procedure that is substantially different from the procedure you use in part (1).
$$A = \begin{bmatrix}3&5&1&-2\\1&2&3&3\\-3&-4&7&13\end{bmatrix}$$
C40 The following archetypes are systems of equations. For each system, write the vector of constants as a linear combination of the vectors in the span construction for the column space provided by Theorem BCS (these vectors are listed for each of these archetypes).
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
C42 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute a set of column vectors such that (1) the vectors are columns of the matrix, (2) the set is linearly independent, and (3) the span of the set is the column space of the matrix. See Theorem BCS.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C50 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute a set of column vectors such that (1) the set is linearly independent, and (2) the span of the set is the row space of the matrix. See Theorem BRS.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C51 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute the column space as the span of a linearly independent set as follows: transpose the matrix, row-reduce, toss out zero rows, convert rows into column vectors. See Example CSROI.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C52 The following archetypes are systems of equations. For each different coefficient matrix build two new vectors of constants. The first should lead to a consistent system and the second should lead to an inconsistent system. Descriptions of the column space as spans of linearly independent sets of vectors with "nice patterns" of zeros and ones might be most useful and instructive in connection with this exercise. (See the end of Example CSROI.)
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J
M10† For the matrix E below, find vectors b and c so that the system LS(E, b) is consistent and LS(E, c) is inconsistent.
$$E = \begin{bmatrix}-2&1&1&0\\3&-1&0&2\\4&1&1&6\end{bmatrix}$$
M20† Usually the column space and null space of a matrix contain vectors of different sizes. For a square matrix, though, the vectors in these two sets are the same size. Usually the two sets will be different. Construct an example of a square matrix where the column space and null space are equal.
M21† We have a variety of theorems about how to create column spaces and row spaces and they frequently involve row-reducing a matrix. Here is a procedure that some try to use to get a column space. Begin with an m × n matrix A and row-reduce to a matrix B with columns $B_1,\, B_2,\, B_3,\, \dots,\, B_n$. Then form the column space of A as
$$\mathcal{C}(A) = \langle\{B_1,\, B_2,\, B_3,\, \dots,\, B_n\}\rangle = \mathcal{C}(B)$$
This is not a legitimate procedure, and therefore is not a theorem. Construct an example to show that the procedure will not in general create the column space of A.
T40† Suppose that A is an m × n matrix and B is an n × p matrix. Prove that the column space of AB is a subset of the column space of A, that is, $\mathcal{C}(AB) \subseteq \mathcal{C}(A)$. Provide an example where the opposite is false, in other words give an example where $\mathcal{C}(A) \not\subseteq \mathcal{C}(AB)$. (Compare with Exercise MM.T40.)
T41† Suppose that A is an m × n matrix and B is an n × n nonsingular matrix. Prove that the column space of A is equal to the column space of AB, that is, $\mathcal{C}(A) = \mathcal{C}(AB)$. (Compare with Exercise MM.T41 and Exercise CRS.T40.)
T45† Suppose that A is an m × n matrix and B is an n × m matrix where AB is a nonsingular matrix. Prove that
1. $\mathcal{N}(B) = \{0\}$
2. $\mathcal{C}(B) \cap \mathcal{N}(A) = \{0\}$
Discuss the case when m = n in connection with Theorem NPNT.
Section FS
Four Subsets

There are four natural subsets associated with a matrix. We have met three already: the null space, the column space and the row space. In this section we will introduce a fourth, the left null space. The objective of this section is to describe one procedure that will allow us to find linearly independent sets that span each of these four sets of column vectors. Along the way, we will make a connection with the inverse of a matrix, so Theorem FS will tie together most all of this chapter (and the entire course so far).

Subsection LNS
Left Null Space

Definition LNS Left Null Space
Suppose A is an m × n matrix. Then the left null space is defined as $\mathcal{L}(A) = \mathcal{N}\left(A^t\right) \subseteq \mathbb{C}^m$. □
The left null space will not feature prominently in the sequel, but we can explain its name and connect it to row operations. Suppose $y \in \mathcal{L}(A)$. Then by Definition LNS, $A^t y = 0$. We can then write
\begin{align*}
0^t &= \left(A^t y\right)^t && \text{Definition LNS}\\
&= y^t \left(A^t\right)^t && \text{Theorem MMT}\\
&= y^t A && \text{Theorem TT}
\end{align*}
The product $y^t A$ can be viewed as the components of y acting as the scalars in a linear combination of the rows of A. And the result is a "row vector", $0^t$, that is totally zeros. When we apply a sequence of row operations to a matrix, each row of the resulting matrix is some linear combination of the rows. These observations tell us that the vectors in the left null space are scalars that record a sequence of row operations that result in a row of zeros in the row-reduced version of the matrix. We will see this idea more explicitly in the course of proving Theorem FS.
Example LNS Left null space
We will find the left null space of
$$A = \begin{bmatrix}1&-3&1\\-2&1&1\\1&5&1\\9&-4&0\end{bmatrix}$$
We transpose A and row-reduce,
$$A^t = \begin{bmatrix}1&-2&1&9\\-3&1&5&-4\\1&1&1&0\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}1&0&0&2\\0&1&0&-3\\0&0&1&1\end{bmatrix}$$
Applying Definition LNS and Theorem BNS we have
$$\mathcal{L}(A) = \mathcal{N}\left(A^t\right) = \left\langle\left\{\begin{bmatrix}-2\\3\\-1\\1\end{bmatrix}\right\}\right\rangle$$
If you row-reduce A you will discover one zero row in the reduced row-echelon form. This zero row is created by a sequence of row operations, which in total amounts to a linear combination, with scalars $a_1 = -2$, $a_2 = 3$, $a_3 = -1$ and $a_4 = 1$, on the rows of A and which results in the zero vector (check this!). So the components of the vector describing the left null space of A provide a relation of linear dependence on the rows of A. △
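Since $\mathcal{L}(A) = \mathcal{N}(A^t)$ by definition, any null space routine computes left null spaces once you transpose. A quick sketch in the same sympy style (again an assumption, not part of the text):

```python
from sympy import Matrix

A = Matrix([[ 1, -3, 1],
            [-2,  1, 1],
            [ 1,  5, 1],
            [ 9, -4, 0]])

# Definition LNS: L(A) = N(A^t)
left_null_basis = A.T.nullspace()
y = left_null_basis[0]
print(y.T)        # [-2, 3, -1, 1]

# the entries of y are the scalars in a relation of linear
# dependence on the rows of A: y^t A = 0^t
print(y.T * A)    # [0, 0, 0]
```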
Subsection CCS
Computing Column Spaces

We have three ways to build the column space of a matrix. First, we can use just the definition, Definition CSM, and express the column space as a span of the columns of the matrix. A second approach gives us the column space as the span of some of the columns of the matrix, and additionally, this set is linearly independent (Theorem BCS). Finally, we can transpose the matrix, row-reduce the transpose, kick out zero rows, and write the remaining rows as column vectors. Theorem CSRST and Theorem BRS tell us that the resulting vectors are linearly independent and their span is the column space of the original matrix.

We will now demonstrate a fourth method by way of a rather complicated example. Study this example carefully, but realize that its main purpose is to motivate a theorem that simplifies much of the apparent complexity. So other than an instructive exercise or two, the procedure we are about to describe will not be a usual approach to computing a column space.

Example CSANS Column space as null space
Let us find the column space of the matrix A below with a new approach.
$$A = \begin{bmatrix}
10&0&3&8&7\\
-16&-1&-4&-10&-13\\
-6&1&-3&-6&-6\\
0&2&-2&-3&-2\\
3&0&1&2&3\\
-1&-1&1&1&0
\end{bmatrix}$$
By Theorem CSCS we know that the column vector b is in the column space of A if and only if the linear system LS(A, b) is consistent. So let us try to solve this system in full generality, using a vector of variables for the vector of constants. In other words, which vectors b lead to consistent systems? Begin by forming the augmented matrix [A | b] with a general version of b,
$$[A \mid b] = \begin{bmatrix}
10&0&3&8&7&b_1\\
-16&-1&-4&-10&-13&b_2\\
-6&1&-3&-6&-6&b_3\\
0&2&-2&-3&-2&b_4\\
3&0&1&2&3&b_5\\
-1&-1&1&1&0&b_6
\end{bmatrix}$$
To identify solutions we will bring this matrix to reduced row-echelon form. Despite the presence of variables in the last column, there is nothing to stop us from doing this, except that numerical computational routines cannot be used, and even some of the symbolic algebra routines do some unexpected maneuvers with this computation. So do it by hand. Yes, it is a bit of work. But worth it. We will still be here when you get back. Notice along the way that the row operations are exactly the same ones you would do if you were just row-reducing the coefficient matrix alone, say in connection with a homogeneous system of equations. The column with the $b_i$ acts as a sort of bookkeeping device. There are many different possibilities for the result, depending on what order you choose to perform the row operations, but shortly we will all be on the same page. If you want to match our work right now, use row 5 to remove any occurrence of $b_1$ from the other entries of the last column, and use row 6 to remove any occurrence of $b_2$ from the last column. We have
$$\begin{bmatrix}
1&0&0&0&2&b_3-b_4+2b_5-b_6\\
0&1&0&0&-3&-2b_3+3b_4-3b_5+3b_6\\
0&0&1&0&1&b_3+b_4+3b_5+3b_6\\
0&0&0&1&-2&-2b_3+b_4-4b_5\\
0&0&0&0&0&b_1+3b_3-b_4+3b_5+b_6\\
0&0&0&0&0&b_2-2b_3+b_4+b_5-b_6
\end{bmatrix}$$
Our goal is to identify those vectors b which make LS(A, b) consistent. By Theorem RCLS we know that the consistent systems are precisely those without a pivot column in the last column. Are the expressions in the last column of rows 5 and 6 equal to zero, or are they leading 1's? The answer is: maybe. It depends on b. With a nonzero value for either of these expressions, we would scale the row and produce a leading 1. So we get a consistent system, and b is in the column space, if and only if these two expressions are both simultaneously zero. In other words, members of the column space of A are exactly those vectors b that satisfy
\begin{align*}
b_1 + 3b_3 - b_4 + 3b_5 + b_6 &= 0\\
b_2 - 2b_3 + b_4 + b_5 - b_6 &= 0
\end{align*}
Hmmm. Looks suspiciously like a homogeneous system of two equations with six variables. If you have been playing along (and we hope you have) then you may have a slightly different system, but you should have just two equations. Form the coefficient matrix and row-reduce (notice that the system above has a coefficient matrix that is already in reduced row-echelon form). We should all be together now with the same matrix,
$$L = \begin{bmatrix}1&0&3&-1&3&1\\0&1&-2&1&1&-1\end{bmatrix}$$
So $\mathcal{C}(A) = \mathcal{N}(L)$ and we can apply Theorem BNS to obtain a linearly independent set to use in a span construction,
$$\mathcal{C}(A) = \mathcal{N}(L) = \left\langle\left\{
\begin{bmatrix}-3\\2\\1\\0\\0\\0\end{bmatrix},
\begin{bmatrix}1\\-1\\0\\1\\0\\0\end{bmatrix},
\begin{bmatrix}-3\\-1\\0\\0\\1\\0\end{bmatrix},
\begin{bmatrix}-1\\1\\0\\0\\0\\1\end{bmatrix}
\right\}\right\rangle$$
Whew! As a postscript to this central example, you may wish to convince yourself that the four vectors above really are elements of the column space. Do they create consistent systems with A as coefficient matrix? Can you recognize the constant vector in your description of these solution sets?

OK, that was so much fun, let us do it again. But simpler this time. And we will all get the same results all the way through. Doing row operations by hand with variables can be a bit error prone, so let us see if we can improve the process some. Rather than row-reduce a column vector b full of variables, let us write $b = I_6 b$; we will row-reduce the matrix $I_6$ and when we finish row-reducing, then we will compute the matrix-vector product. You should first convince yourself that we can operate like this (this is the subject of a future homework exercise).
Rather than augmenting A with b, we will instead augment it with $I_6$ (does this feel familiar?),
$$M = \begin{bmatrix}
10&0&3&8&7&1&0&0&0&0&0\\
-16&-1&-4&-10&-13&0&1&0&0&0&0\\
-6&1&-3&-6&-6&0&0&1&0&0&0\\
0&2&-2&-3&-2&0&0&0&1&0&0\\
3&0&1&2&3&0&0&0&0&1&0\\
-1&-1&1&1&0&0&0&0&0&0&1
\end{bmatrix}$$
We want to row-reduce the left-hand side of this matrix, but we will apply the same row operations to the right-hand side as well. And once we get the left-hand side in reduced row-echelon form, we will continue on to put leading 1's in the final two rows, as well as making pivot columns that contain these two additional leading 1's. It is these additional row operations that will ensure that we all get to the same place, since the reduced row-echelon form is unique (Theorem RREFU),
$$N = \begin{bmatrix}
1&0&0&0&2&0&0&1&-1&2&-1\\
0&1&0&0&-3&0&0&-2&3&-3&3\\
0&0&1&0&1&0&0&1&1&3&3\\
0&0&0&1&-2&0&0&-2&1&-4&0\\
0&0&0&0&0&1&0&3&-1&3&1\\
0&0&0&0&0&0&1&-2&1&1&-1
\end{bmatrix}$$
We are after the final six columns of this matrix, which we will multiply by b, so
$$J = \begin{bmatrix}
0&0&1&-1&2&-1\\
0&0&-2&3&-3&3\\
0&0&1&1&3&3\\
0&0&-2&1&-4&0\\
1&0&3&-1&3&1\\
0&1&-2&1&1&-1
\end{bmatrix}$$
and
$$Jb = \begin{bmatrix}
0&0&1&-1&2&-1\\
0&0&-2&3&-3&3\\
0&0&1&1&3&3\\
0&0&-2&1&-4&0\\
1&0&3&-1&3&1\\
0&1&-2&1&1&-1
\end{bmatrix}
\begin{bmatrix}b_1\\b_2\\b_3\\b_4\\b_5\\b_6\end{bmatrix}
= \begin{bmatrix}
b_3-b_4+2b_5-b_6\\
-2b_3+3b_4-3b_5+3b_6\\
b_3+b_4+3b_5+3b_6\\
-2b_3+b_4-4b_5\\
b_1+3b_3-b_4+3b_5+b_6\\
b_2-2b_3+b_4+b_5-b_6
\end{bmatrix}$$
So by applying to the identity matrix the same row operations that row-reduce A (which we could do computationally once $I_6$ is placed alongside of A), we can then arrive at the result of row-reducing a column of symbols where the vector of constants usually resides. Since the row-reduced version of A has two zero rows, for a consistent system we require that
\begin{align*}
b_1 + 3b_3 - b_4 + 3b_5 + b_6 &= 0\\
b_2 - 2b_3 + b_4 + b_5 - b_6 &= 0
\end{align*}
Now we are exactly back where we were on the first go-round. Notice that we obtain the matrix L as simply the last two rows and last six columns of N. △

This example motivates the remainder of this section, so it is worth careful study. You might attempt to mimic the second approach with the coefficient matrices of Archetype I and Archetype J. We will see shortly that the matrix L contains more information about A than just the column space.
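If you do mimic the second approach by machine, the bookkeeping becomes painless. A hedged sketch (sympy assumed; the names M, N, L are chosen to match the example, and the slicing is the only real work):

```python
from sympy import Matrix, eye

A = Matrix([
    [ 10,  0,  3,   8,   7],
    [-16, -1, -4, -10, -13],
    [ -6,  1, -3,  -6,  -6],
    [  0,  2, -2,  -3,  -2],
    [  3,  0,  1,   2,   3],
    [ -1, -1,  1,   1,   0],
])
m, n = A.shape

M = A.row_join(eye(m))     # augment with I_6
N, _ = M.rref()            # exact reduced row-echelon form
r = A.rank()               # number of nonzero rows in the rref of A

L = N[r:, n:]              # last m - r rows, last m columns of N
print(L)                   # [[1, 0, 3, -1, 3, 1], [0, 1, -2, 1, 1, -1]]

# C(A) = N(L): Theorem BNS yields the four-vector spanning set
for v in L.nullspace():
    print(v.T)
```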
Subsection EEF
Extended Echelon Form

The final matrix that we row-reduced in Example CSANS should look familiar in most respects to the procedure we used to compute the inverse of a nonsingular matrix, Theorem CINM. We will now generalize that procedure to matrices that are not necessarily nonsingular, or even square. First a definition.

Definition EEF Extended Echelon Form
Suppose A is an m × n matrix. Extend A on its right side with the addition of an m × m identity matrix to form an m × (n + m) matrix M. Use row operations to bring M to reduced row-echelon form and call the result N. N is the extended reduced row-echelon form of A, and we will standardize on names for five submatrices (B, C, J, K, L) of N.

Let B denote the m × n matrix formed from the first n columns of N and let J denote the m × m matrix formed from the last m columns of N. Suppose that B has r nonzero rows. Further partition N by letting C denote the r × n matrix formed from all of the nonzero rows of B. Let K be the r × m matrix formed from the first r rows of J, while L will be the (m − r) × m matrix formed from the bottom m − r rows of J. Pictorially,
$$M = [A \mid I_m] \xrightarrow{\text{RREF}} N = [B \mid J] = \begin{bmatrix}C & K\\0 & L\end{bmatrix}$$ □
Example SEEF Submatrices of extended echelon form
We illustrate Definition EEF with the matrix A,
$$A = \begin{bmatrix}
1&-1&-2&7&1&6\\
-6&2&-4&-18&-3&-26\\
4&-1&4&10&2&17\\
3&-1&2&9&1&12
\end{bmatrix}$$
Augmenting with the 4 × 4 identity matrix,
$$M = \begin{bmatrix}
1&-1&-2&7&1&6&1&0&0&0\\
-6&2&-4&-18&-3&-26&0&1&0&0\\
4&-1&4&10&2&17&0&0&1&0\\
3&-1&2&9&1&12&0&0&0&1
\end{bmatrix}$$
and row-reducing, we obtain
$$N = \begin{bmatrix}
1&0&2&1&0&3&0&1&1&1\\
0&1&4&-6&0&-1&0&2&3&0\\
0&0&0&0&1&2&0&-1&0&-2\\
0&0&0&0&0&0&1&2&2&1
\end{bmatrix}$$
So we then obtain
$$B = \begin{bmatrix}
1&0&2&1&0&3\\
0&1&4&-6&0&-1\\
0&0&0&0&1&2\\
0&0&0&0&0&0
\end{bmatrix}\qquad
C = \begin{bmatrix}
1&0&2&1&0&3\\
0&1&4&-6&0&-1\\
0&0&0&0&1&2
\end{bmatrix}$$
$$J = \begin{bmatrix}
0&1&1&1\\
0&2&3&0\\
0&-1&0&-2\\
1&2&2&1
\end{bmatrix}\qquad
K = \begin{bmatrix}
0&1&1&1\\
0&2&3&0\\
0&-1&0&-2
\end{bmatrix}\qquad
L = \begin{bmatrix}1&2&2&1\end{bmatrix}$$
You can observe (or verify) the properties of the following theorem with this example. △
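Extracting the five submatrices is pure slicing once N is computed. A minimal sketch (sympy assumed) that reproduces Example SEEF:

```python
from sympy import Matrix, eye

A = Matrix([
    [ 1, -1, -2,   7,  1,   6],
    [-6,  2, -4, -18, -3, -26],
    [ 4, -1,  4,  10,  2,  17],
    [ 3, -1,  2,   9,  1,  12],
])
m, n = A.shape

N, _ = A.row_join(eye(m)).rref()   # extended reduced row-echelon form
r = A.rank()                        # number of nonzero rows of B

B, J = N[:, :n], N[:, n:]           # Definition EEF partitions
C, K, L = B[:r, :], J[:r, :], J[r:, :]
print(C)   # 3 x 6, no zero rows
print(L)   # [[1, 2, 2, 1]]
```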
Theorem PEEF Properties of Extended Echelon Form
Suppose that A is an m × n matrix and that N is its extended echelon form. Then
1. J is nonsingular.
2. B = JA.
3. If $x \in \mathbb{C}^n$ and $y \in \mathbb{C}^m$, then Ax = y if and only if Bx = Jy.
4. C is in reduced row-echelon form, has no zero rows and has r pivot columns.
5. L is in reduced row-echelon form, has no zero rows and has m − r pivot columns.
Proof. J is the result of applying a sequence of row operations to $I_m$, and therefore J and $I_m$ are row-equivalent. $LS(I_m,\, 0)$ has only the zero solution, since $I_m$ is nonsingular (Theorem NMRRI). Thus, LS(J, 0) also has only the zero solution (Theorem REMES, Definition ESYS) and J is therefore nonsingular (Definition NSM).

To prove the second part of this conclusion, first convince yourself that row operations and the matrix-vector product are associative operations. By this we mean the following. Suppose that F is an m × n matrix that is row-equivalent to the matrix G. Apply to the column vector Fw the same sequence of row operations that converts F to G. Then the result is Gw. So we can do row operations on the matrix, then do a matrix-vector product, or do a matrix-vector product and then do row operations on a column vector, and the result will be the same either way. Since matrix multiplication is defined by a collection of matrix-vector products (Definition MM), the matrix product FH will become GH if we apply the same sequence of row operations to FH that converts F to G. (This argument can be made more rigorous using elementary matrices from the upcoming Subsection DM.EM and the associative property of matrix multiplication established in Theorem MMA.) Now apply these observations to A.

Write $AI_n = I_m A$ and apply the row operations that convert M to N. A is converted to B, while $I_m$ is converted to J, so we have $BI_n = JA$. Simplifying the left side gives the desired conclusion.

For the third conclusion, we now establish the two equivalences
$$Ax = y \iff JAx = Jy \iff Bx = Jy$$
The forward direction of the first equivalence is accomplished by multiplying both sides of the matrix equality by J, while the backward direction is accomplished by multiplying by the inverse of J (which we know exists by Theorem NI since J is nonsingular). The second equivalence is obtained simply by the substitution given by JA = B.

The first r rows of N are in reduced row-echelon form, since any contiguous collection of rows taken from a matrix in reduced row-echelon form will form a matrix that is again in reduced row-echelon form (Exercise RREF.T12). Since the matrix C is formed by removing the last m entries of each of these rows, the remainder is still in reduced row-echelon form. By its construction, C has no zero rows. C has r rows and each contains a leading 1, so there are r pivot columns in C.

The final m − r rows of N are in reduced row-echelon form, since any contiguous collection of rows taken from a matrix in reduced row-echelon form will form a matrix that is again in reduced row-echelon form. Since the matrix L is formed by removing the first n entries of each of these rows, and these entries are all zero (they form the zero rows of B), the remainder is still in reduced row-echelon form. L is the final m − r rows of the nonsingular matrix J, so none of these rows can be totally zero, or J would not row-reduce to the identity matrix. L has m − r rows and each contains a leading 1, so there are m − r pivot columns in L. $\blacksquare$
Notice that in the case where A is a nonsingular matrix we know that the reduced row-echelon form of A is the identity matrix (Theorem NMRRI), so $B = I_n$. Then the second conclusion above says $JA = B = I_n$, so J is the inverse of A. Thus this theorem generalizes Theorem CINM, though the result is a "left-inverse" of A rather than a "right-inverse."

The third conclusion of Theorem PEEF is the most telling. It says that x is a solution to the linear system LS(A, y) if and only if x is a solution to the linear system LS(B, Jy). Or said differently, if we row-reduce the augmented matrix [A | y] we will get the augmented matrix [B | Jy]. The matrix J tracks the cumulative effect of the row operations that converts A to reduced row-echelon form, here effectively applying them to the vector of constants in a system of equations having A as a coefficient matrix. When A row-reduces to a matrix with zero rows, then Jy should also have zero entries in the same rows if the system is to be consistent.
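The first two conclusions of Theorem PEEF are easy to spot-check numerically. Continuing in the same assumed sympy style, with the matrix of Example SEEF (this sketch recomputes everything it needs, so it stands alone):

```python
from sympy import Matrix, eye

A = Matrix([
    [ 1, -1, -2,   7,  1,   6],
    [-6,  2, -4, -18, -3, -26],
    [ 4, -1,  4,  10,  2,  17],
    [ 3, -1,  2,   9,  1,  12],
])
m, n = A.shape
N, _ = A.row_join(eye(m)).rref()
B, J = N[:, :n], N[:, n:]

assert J.det() != 0      # conclusion 1: J is nonsingular
assert B == J * A        # conclusion 2: B = JA

# conclusion 3, checked on one concrete pair: if Ax = y then Bx = Jy
x = Matrix([1, 2, 0, -1, 1, 3])   # an arbitrary test vector
y = A * x
assert B * x == J * y
```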
Subsection FS
Four Subsets

With all the preliminaries in place we can state our main result for this section. In essence this result will allow us to say that we can find linearly independent sets to use in span constructions for all four subsets (null space, column space, row space, left null space) by analyzing only the extended echelon form of the matrix, and specifically, just the two submatrices C and L, which will be ripe for analysis since they are already in reduced row-echelon form (Theorem PEEF).
Theorem FS Four Subsets
Suppose A is an m × n matrix with extended echelon form N. Suppose the reduced row-echelon form of A has r nonzero rows. Then C is the submatrix of N formed from the first r rows and the first n columns and L is the submatrix of N formed from the last m columns and the last m − r rows. Then
1. The null space of A is the null space of C, $\mathcal{N}(A) = \mathcal{N}(C)$.
2. The row space of A is the row space of C, $\mathcal{R}(A) = \mathcal{R}(C)$.
3. The column space of A is the null space of L, $\mathcal{C}(A) = \mathcal{N}(L)$.
4. The left null space of A is the row space of L, $\mathcal{L}(A) = \mathcal{R}(L)$.
Proof. First, $\mathcal{N}(A) = \mathcal{N}(B)$ since B is row-equivalent to A (Theorem REMES). The zero rows of B represent equations that are always true in the homogeneous system LS(B, 0), so the removal of these equations will not change the solution set. Thus, in turn, $\mathcal{N}(B) = \mathcal{N}(C)$.

Second, $\mathcal{R}(A) = \mathcal{R}(B)$ since B is row-equivalent to A (Theorem REMRS). The zero rows of B contribute nothing to the span that is the row space of B, so the removal of these rows will not change the row space. Thus, in turn, $\mathcal{R}(B) = \mathcal{R}(C)$.

Third, we prove the set equality $\mathcal{C}(A) = \mathcal{N}(L)$ with Definition SE. Begin by showing that $\mathcal{C}(A) \subseteq \mathcal{N}(L)$. Choose $y \in \mathcal{C}(A) \subseteq \mathbb{C}^m$. Then there exists a vector $x \in \mathbb{C}^n$ such that Ax = y (Theorem CSCS). Then for $1 \le k \le m - r$,
\begin{align*}
[Ly]_k &= [Jy]_{r+k} && \text{L a submatrix of J}\\
&= [Bx]_{r+k} && \text{Theorem PEEF}\\
&= [Ox]_k && \text{Zero matrix a submatrix of B}\\
&= [0]_k && \text{Theorem MMZM}
\end{align*}
So, for all $1 \le k \le m - r$, $[Ly]_k = [0]_k$. So by Definition CVE we have Ly = 0 and thus $y \in \mathcal{N}(L)$.

Now, show that $\mathcal{N}(L) \subseteq \mathcal{C}(A)$. Choose $y \in \mathcal{N}(L) \subseteq \mathbb{C}^m$. Form the vector $Ky \in \mathbb{C}^r$. The linear system LS(C, Ky) is consistent since C is in reduced row-echelon form and has no zero rows (Theorem PEEF). Let $x \in \mathbb{C}^n$ denote a solution to LS(C, Ky). Then for $1 \le j \le r$,
\begin{align*}
[Bx]_j &= [Cx]_j && \text{C a submatrix of B}\\
&= [Ky]_j && \text{x a solution to LS(C, Ky)}\\
&= [Jy]_j && \text{K a submatrix of J}
\end{align*}
And for $r + 1 \le k \le m$,
\begin{align*}
[Bx]_k &= [Ox]_{k-r} && \text{Zero matrix a submatrix of B}\\
&= [0]_{k-r} && \text{Theorem MMZM}\\
&= [Ly]_{k-r} && \text{y in } \mathcal{N}(L)\\
&= [Jy]_k && \text{L a submatrix of J}
\end{align*}
So for all $1 \le i \le m$, $[Bx]_i = [Jy]_i$ and by Definition CVE we have Bx = Jy. From Theorem PEEF we know then that Ax = y, and therefore $y \in \mathcal{C}(A)$ (Theorem CSCS). By Definition SE we now have $\mathcal{C}(A) = \mathcal{N}(L)$.
Fourth, we prove the set equality $\mathcal{L}(A) = \mathcal{R}(L)$ with Definition SE. Begin by showing that $\mathcal{R}(L) \subseteq \mathcal{L}(A)$. Choose $y \in \mathcal{R}(L) \subseteq \mathbb{C}^m$. Then there exists a vector $w \in \mathbb{C}^{m-r}$ such that $y = L^t w$ (Definition RSM, Theorem CSCS). Then for $1 \le i \le n$,
\begin{align*}
\left[A^t y\right]_i &= \sum_{k=1}^{m} \left[A^t\right]_{ik} [y]_k && \text{Theorem EMP}\\
&= \sum_{k=1}^{m} \left[A^t\right]_{ik} \left[L^t w\right]_k && \text{Definition of w}\\
&= \sum_{k=1}^{m} \left[A^t\right]_{ik} \sum_{l=1}^{m-r} \left[L^t\right]_{kl} [w]_l && \text{Theorem EMP}\\
&= \sum_{k=1}^{m} \sum_{l=1}^{m-r} \left[A^t\right]_{ik} \left[L^t\right]_{kl} [w]_l && \text{Property DCN}\\
&= \sum_{l=1}^{m-r} \sum_{k=1}^{m} \left[A^t\right]_{ik} \left[L^t\right]_{kl} [w]_l && \text{Property CACN}\\
&= \sum_{l=1}^{m-r} \left(\sum_{k=1}^{m} \left[A^t\right]_{ik} \left[L^t\right]_{kl}\right) [w]_l && \text{Property DCN}\\
&= \sum_{l=1}^{m-r} \left(\sum_{k=1}^{m} \left[A^t\right]_{ik} \left[J^t\right]_{k,r+l}\right) [w]_l && \text{L a submatrix of J}\\
&= \sum_{l=1}^{m-r} \left[A^t J^t\right]_{i,r+l} [w]_l && \text{Theorem EMP}\\
&= \sum_{l=1}^{m-r} \left[(JA)^t\right]_{i,r+l} [w]_l && \text{Theorem MMT}\\
&= \sum_{l=1}^{m-r} \left[B^t\right]_{i,r+l} [w]_l && \text{Theorem PEEF}\\
&= \sum_{l=1}^{m-r} 0\,[w]_l && \text{Zero rows in B}\\
&= 0 && \text{Property ZCN}\\
&= [0]_i && \text{Definition ZCV}
\end{align*}
Since $\left[A^t y\right]_i = [0]_i$ for $1 \le i \le n$, Definition CVE implies that $A^t y = 0$. This means that $y \in \mathcal{N}\left(A^t\right)$, that is, $y \in \mathcal{L}(A)$ (Definition LNS).
Now, show that $\mathcal{L}(A) \subseteq \mathcal{R}(L)$. Choose $y \in \mathcal{L}(A) \subseteq \mathbb{C}^m$. The matrix J is nonsingular (Theorem PEEF), so $J^t$ is also nonsingular (Theorem MIT) and therefore the linear system $LS(J^t,\, y)$ has a unique solution. Denote this solution as $x \in \mathbb{C}^m$. We will need to work with two "halves" of x, which we will denote as z and w, with formal definitions given by
$$[z]_j = [x]_j \quad 1 \le j \le r, \qquad [w]_k = [x]_{r+k} \quad 1 \le k \le m - r$$
Now, for $1 \le j \le r$,
\begin{align*}
\left[C^t z\right]_j &= \sum_{k=1}^{r} \left[C^t\right]_{jk} [z]_k && \text{Theorem EMP}\\
&= \sum_{k=1}^{r} \left[C^t\right]_{jk} [z]_k + \sum_{l=1}^{m-r} [O]_{jl} [w]_l && \text{Definition ZM}\\
&= \sum_{k=1}^{r} \left[B^t\right]_{jk} [z]_k + \sum_{l=1}^{m-r} \left[B^t\right]_{j,r+l} [w]_l && \text{C, O submatrices of B}\\
&= \sum_{k=1}^{r} \left[B^t\right]_{jk} [x]_k + \sum_{l=1}^{m-r} \left[B^t\right]_{j,r+l} [x]_{r+l} && \text{Definitions of z and w}\\
&= \sum_{k=1}^{r} \left[B^t\right]_{jk} [x]_k + \sum_{k=r+1}^{m} \left[B^t\right]_{jk} [x]_k && \text{Re-index second sum}\\
&= \sum_{k=1}^{m} \left[B^t\right]_{jk} [x]_k && \text{Combine sums}\\
&= \sum_{k=1}^{m} \left[(JA)^t\right]_{jk} [x]_k && \text{Theorem PEEF}\\
&= \sum_{k=1}^{m} \left[A^t J^t\right]_{jk} [x]_k && \text{Theorem MMT}\\
&= \sum_{k=1}^{m} \sum_{l=1}^{m} \left[A^t\right]_{jl} \left[J^t\right]_{lk} [x]_k && \text{Theorem EMP}\\
&= \sum_{l=1}^{m} \sum_{k=1}^{m} \left[A^t\right]_{jl} \left[J^t\right]_{lk} [x]_k && \text{Property CACN}\\
&= \sum_{l=1}^{m} \left[A^t\right]_{jl} \left(\sum_{k=1}^{m} \left[J^t\right]_{lk} [x]_k\right) && \text{Property DCN}\\
&= \sum_{l=1}^{m} \left[A^t\right]_{jl} \left[J^t x\right]_l && \text{Theorem EMP}\\
&= \sum_{l=1}^{m} \left[A^t\right]_{jl} [y]_l && \text{Definition of x}\\
&= \left[A^t y\right]_j && \text{Theorem EMP}\\
&= [0]_j && y \in \mathcal{L}(A)
\end{align*}
So, by Definition CVE, $C^t z = 0$ and the vector z gives us a linear combination of the columns of $C^t$ that equals the zero vector. In other words, z gives a relation of linear dependence on the rows of C. However, the rows of C are a linearly independent set by Theorem BRS. According to Definition LICV we must conclude that the entries of z are all zero, i.e., z = 0.
Now, for $1 \le i \le m$, we have
\begin{align*}
[y]_i &= \left[J^t x\right]_i && \text{Definition of x}\\
&= \sum_{k=1}^{m} \left[J^t\right]_{ik} [x]_k && \text{Theorem EMP}\\
&= \sum_{k=1}^{r} \left[J^t\right]_{ik} [x]_k + \sum_{k=r+1}^{m} \left[J^t\right]_{ik} [x]_k && \text{Break apart sum}\\
&= \sum_{k=1}^{r} \left[J^t\right]_{ik} [z]_k + \sum_{k=r+1}^{m} \left[J^t\right]_{ik} [w]_{k-r} && \text{Definition of z and w}\\
&= \sum_{k=1}^{r} \left[J^t\right]_{ik}\, 0 + \sum_{l=1}^{m-r} \left[J^t\right]_{i,r+l} [w]_l && \text{z = 0, re-index}\\
&= 0 + \sum_{l=1}^{m-r} \left[L^t\right]_{i,l} [w]_l && \text{L a submatrix of J}\\
&= \left[L^t w\right]_i && \text{Theorem EMP}
\end{align*}
So by Definition CVE, $y = L^t w$. The existence of w implies that $y \in \mathcal{R}(L)$, and therefore $\mathcal{L}(A) \subseteq \mathcal{R}(L)$. So by Definition SE we have $\mathcal{L}(A) = \mathcal{R}(L)$. $\blacksquare$
The first two conclusions of this theorem are nearly trivial. But they set up a pattern of results for C that is reflected in the latter two conclusions about L. In total, they tell us that we can compute all four subsets just by finding null spaces and row spaces. This theorem does not tell us exactly how to compute these subsets, but instead simply expresses them as null spaces and row spaces of matrices in reduced row-echelon form without any zero rows (C and L). A linearly independent set that spans the null space of a matrix in reduced row-echelon form can be found easily with Theorem BNS. It is an even easier matter to find a linearly independent set that spans the row space of a matrix in reduced row-echelon form with Theorem BRS, especially when there are no zero rows present. So an application of Theorem FS is typically followed by two applications each of Theorem BNS and Theorem BRS.
The situation when r = m deserves comment, since now the matrix L has no rows. What is $\mathcal{C}(A)$ when we try to apply Theorem FS and encounter $\mathcal{N}(L)$? One interpretation of this situation is that L is the coefficient matrix of a homogeneous system that has no equations. How hard is it to find a solution vector to this system? Some thought will convince you that any proposed vector will qualify as a solution, since it makes all of the equations true. So every possible vector is in the null space of L and therefore $\mathcal{C}(A) = \mathcal{N}(L) = \mathbb{C}^m$. OK, perhaps this sounds like some twisted argument from Alice in Wonderland. Let us try another argument that might solidly convince you of this logic.

If r = m, when we row-reduce the augmented matrix of LS(A, b) the result will have no zero rows, and every pivot column will fall among the first n columns, leaving none for the final column, so by Theorem RCLS the system will be consistent. By Theorem CSCS, $b \in \mathcal{C}(A)$. Since b was arbitrary, every possible vector is in the column space of A, so we again have $\mathcal{C}(A) = \mathbb{C}^m$. The situation when a matrix has r = m is known by the term full rank, and in the case of a square matrix coincides with nonsingularity (see Exercise FS.M50).
The properties of the matrix L described by this theorem can be explained informally as follows. A column vector $y \in \mathbb{C}^m$ is in the column space of A if the linear system LS(A, y) is consistent (Theorem CSCS). By Theorem RCLS, the reduced row-echelon form of the augmented matrix [A | y] of a consistent system will have zeros in the bottom m − r locations of the last column. By Theorem PEEF this final column is the vector Jy and so should then have zeros in the final m − r locations. But since L comprises the final m − r rows of J, this condition is expressed by saying $y \in \mathcal{N}(L)$.

Additionally, the rows of J are the scalars in linear combinations of the rows of A that create the rows of B. That is, the rows of J record the net effect of the sequence of row operations that takes A to its reduced row-echelon form, B. This can be seen in the equation JA = B (Theorem PEEF). As such, the rows of L are scalars for linear combinations of the rows of A that yield zero rows. But such linear combinations are precisely the elements of the left null space. So any element of the row space of L is also an element of the left null space of A.

We will now illustrate Theorem FS with a few examples.
Example FS1 Four subsets, no. 1
In Example SEEF we found the five relevant submatrices of the extended echelon form for the matrix
$$A = \begin{bmatrix}
1&-1&-2&7&1&6\\
-6&2&-4&-18&-3&-26\\
4&-1&4&10&2&17\\
3&-1&2&9&1&12
\end{bmatrix}$$
To apply Theorem FS we only need C and L,
$$C = \begin{bmatrix}
1&0&2&1&0&3\\
0&1&4&-6&0&-1\\
0&0&0&0&1&2
\end{bmatrix}\qquad
L = \begin{bmatrix}1&2&2&1\end{bmatrix}$$
Then we use Theorem FS to obtain
\begin{align*}
\mathcal{N}(A) &= \mathcal{N}(C) = \left\langle\left\{
\begin{bmatrix}-2\\-4\\1\\0\\0\\0\end{bmatrix},
\begin{bmatrix}-1\\6\\0\\1\\0\\0\end{bmatrix},
\begin{bmatrix}-3\\1\\0\\0\\-2\\1\end{bmatrix}
\right\}\right\rangle && \text{Theorem BNS}\\
\mathcal{R}(A) &= \mathcal{R}(C) = \left\langle\left\{
\begin{bmatrix}1\\0\\2\\1\\0\\3\end{bmatrix},
\begin{bmatrix}0\\1\\4\\-6\\0\\-1\end{bmatrix},
\begin{bmatrix}0\\0\\0\\0\\1\\2\end{bmatrix}
\right\}\right\rangle && \text{Theorem BRS}\\
\mathcal{C}(A) &= \mathcal{N}(L) = \left\langle\left\{
\begin{bmatrix}-2\\1\\0\\0\end{bmatrix},
\begin{bmatrix}-2\\0\\1\\0\end{bmatrix},
\begin{bmatrix}-1\\0\\0\\1\end{bmatrix}
\right\}\right\rangle && \text{Theorem BNS}\\
\mathcal{L}(A) &= \mathcal{R}(L) = \left\langle\left\{
\begin{bmatrix}1\\2\\2\\1\end{bmatrix}
\right\}\right\rangle && \text{Theorem BRS}
\end{align*}
Boom! △
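All four spanning sets of Example FS1 can be generated in a few lines. A sketch (sympy assumed; the null space bases it returns happen to follow the same free-variable pattern as Theorem BNS, though a library could in principle scale them differently):

```python
from sympy import Matrix, eye

A = Matrix([
    [ 1, -1, -2,   7,  1,   6],
    [-6,  2, -4, -18, -3, -26],
    [ 4, -1,  4,  10,  2,  17],
    [ 3, -1,  2,   9,  1,  12],
])
m, n = A.shape
N, _ = A.row_join(eye(m)).rref()
r = A.rank()
C = N[:r, :n]
L = N[r:, n:]

null_A  = C.nullspace()                       # N(A) = N(C), Theorem BNS
row_A   = [C.row(i).T for i in range(r)]      # R(A) = R(C), Theorem BRS
col_A   = L.nullspace()                       # C(A) = N(L), Theorem BNS
lnull_A = [L.row(i).T for i in range(m - r)]  # L(A) = R(L), Theorem BRS
```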
Example FS2 Four subsets, no. 2
Now let us return to the matrix A that we used to motivate this section in Example CSANS,
$$A = \begin{bmatrix}
10&0&3&8&7\\
-16&-1&-4&-10&-13\\
-6&1&-3&-6&-6\\
0&2&-2&-3&-2\\
3&0&1&2&3\\
-1&-1&1&1&0
\end{bmatrix}$$
We form the matrix M by adjoining the 6 × 6 identity matrix $I_6$,
$$M = \begin{bmatrix}
10&0&3&8&7&1&0&0&0&0&0\\
-16&-1&-4&-10&-13&0&1&0&0&0&0\\
-6&1&-3&-6&-6&0&0&1&0&0&0\\
0&2&-2&-3&-2&0&0&0&1&0&0\\
3&0&1&2&3&0&0&0&0&1&0\\
-1&-1&1&1&0&0&0&0&0&0&1
\end{bmatrix}$$
and row-reduce to obtain N,
$$N = \begin{bmatrix}
1&0&0&0&2&0&0&1&-1&2&-1\\
0&1&0&0&-3&0&0&-2&3&-3&3\\
0&0&1&0&1&0&0&1&1&3&3\\
0&0&0&1&-2&0&0&-2&1&-4&0\\
0&0&0&0&0&1&0&3&-1&3&1\\
0&0&0&0&0&0&1&-2&1&1&-1
\end{bmatrix}$$
To find the four subsets for A, we only need identify the 4 × 5 matrix C and the 2 × 6 matrix L,
$$C = \begin{bmatrix}
1&0&0&0&2\\
0&1&0&0&-3\\
0&0&1&0&1\\
0&0&0&1&-2
\end{bmatrix}\qquad
L = \begin{bmatrix}
1&0&3&-1&3&1\\
0&1&-2&1&1&-1
\end{bmatrix}$$

Then we apply Theorem FS,<br />

⎧⎡<br />

⎤⎫<br />

−2<br />

〈 〉<br />

⎪⎨<br />

3<br />

⎪⎬<br />

N (A) =N (C) = ⎢−1⎥<br />

⎣<br />

⎪⎩<br />

2 ⎦<br />

⎪⎭<br />

1<br />

⎧⎡<br />

⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤⎫<br />

1 0 0 0<br />

〈<br />

〉<br />

⎪⎨<br />

0<br />

1<br />

0<br />

0<br />

⎪⎬<br />

R(A) =R(C) = ⎢0⎥<br />

⎣<br />

⎪⎩<br />

0⎦ , ⎢ 0 ⎥<br />

⎣ 0 ⎦ , ⎢1⎥<br />

⎣0⎦ , ⎢ 0 ⎥<br />

⎣ 1 ⎦<br />

⎪⎭<br />

2 −3 1 −2<br />

⎧⎡<br />

⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤⎫<br />

−3 1 −3 −1<br />

〈<br />

2<br />

−1<br />

−1<br />

1<br />

〉<br />

⎪⎨<br />

⎪⎬<br />

1<br />

0<br />

0<br />

0<br />

C(A) =N (L) =<br />

⎢ 0<br />

,<br />

⎥ ⎢ 1<br />

,<br />

⎥ ⎢ 0<br />

,<br />

⎥ ⎢ 0<br />

⎥<br />

⎣ 0 ⎦ ⎣ 0 ⎦ ⎣ 1 ⎦ ⎣ 0 ⎦<br />

⎪⎩<br />

⎪⎭<br />

0 0 0 1<br />

⎧⎡<br />

⎤ ⎡ ⎤⎫<br />

1 0<br />

〈<br />

0<br />

1<br />

〉<br />

⎪⎨<br />

⎪⎬<br />

3<br />

−2<br />

L(A) =R(L) =<br />

⎢−1<br />

,<br />

⎥ ⎢ 1<br />

⎥<br />

⎣ 3 ⎦ ⎣ 1 ⎦<br />

⎪⎩<br />

⎪⎭<br />

1 −1<br />

Theorem BNS<br />

Theorem BRS<br />

Theorem BNS<br />

Theorem BRS<br />

The next example is just a bit different since the matrix has more rows than columns, and a trivial null space.

Example FSAG Four subsets, Archetype G
Archetype G and Archetype H are both systems of m = 5 equations in n = 2 variables. They have identical coefficient matrices, which we will denote here as the matrix G,
$$G = \begin{bmatrix}2&3\\-1&4\\3&10\\3&-1\\6&9\end{bmatrix}$$
Adjoin the 5 × 5 identity matrix, $I_5$, to form
$$M = \begin{bmatrix}
2&3&1&0&0&0&0\\
-1&4&0&1&0&0&0\\
3&10&0&0&1&0&0\\
3&-1&0&0&0&1&0\\
6&9&0&0&0&0&1
\end{bmatrix}$$
This row-reduces to
$$N = \begin{bmatrix}
1&0&0&0&0&\frac{3}{11}&\frac{1}{33}\\
0&1&0&0&0&-\frac{2}{11}&\frac{1}{11}\\
0&0&1&0&0&0&-\frac{1}{3}\\
0&0&0&1&0&1&-\frac{1}{3}\\
0&0&0&0&1&1&-1
\end{bmatrix}$$
The first n = 2 columns contain r = 2 leading 1's, so we obtain C as the 2 × 2 identity matrix and extract L from the final m − r = 3 rows in the final m = 5 columns,
$$C = \begin{bmatrix}1&0\\0&1\end{bmatrix}\qquad
L = \begin{bmatrix}
1&0&0&0&-\frac{1}{3}\\
0&1&0&1&-\frac{1}{3}\\
0&0&1&1&-1
\end{bmatrix}$$
Then we apply Theorem FS,
\begin{align*}
\mathcal{N}(G) &= \mathcal{N}(C) = \langle\emptyset\rangle = \{0\} && \text{Theorem BNS}\\
\mathcal{R}(G) &= \mathcal{R}(C) = \left\langle\left\{
\begin{bmatrix}1\\0\end{bmatrix},
\begin{bmatrix}0\\1\end{bmatrix}
\right\}\right\rangle = \mathbb{C}^2 && \text{Theorem BRS}\\
\mathcal{C}(G) &= \mathcal{N}(L) = \left\langle\left\{
\begin{bmatrix}0\\-1\\-1\\1\\0\end{bmatrix},
\begin{bmatrix}\frac{1}{3}\\\frac{1}{3}\\1\\0\\1\end{bmatrix}
\right\}\right\rangle && \text{Theorem BNS}\\
&= \left\langle\left\{
\begin{bmatrix}0\\-1\\-1\\1\\0\end{bmatrix},
\begin{bmatrix}1\\1\\3\\0\\3\end{bmatrix}
\right\}\right\rangle\\
\mathcal{L}(G) &= \mathcal{R}(L) = \left\langle\left\{
\begin{bmatrix}1\\0\\0\\0\\-\frac{1}{3}\end{bmatrix},
\begin{bmatrix}0\\1\\0\\1\\-\frac{1}{3}\end{bmatrix},
\begin{bmatrix}0\\0\\1\\1\\-1\end{bmatrix}
\right\}\right\rangle && \text{Theorem BRS}\\
&= \left\langle\left\{
\begin{bmatrix}3\\0\\0\\0\\-1\end{bmatrix},
\begin{bmatrix}0\\3\\0\\3\\-1\end{bmatrix},
\begin{bmatrix}0\\0\\1\\1\\-1\end{bmatrix}
\right\}\right\rangle
\end{align*}
As mentioned earlier, Archetype G is consistent, while Archetype H is inconsistent. See if you can write the two different vectors of constants from these two archetypes as linear combinations of the two vectors that form the spanning set for $\mathcal{C}(G)$. How about the two columns of G? Can you write each individually as a linear combination of the two vectors that form the spanning set for $\mathcal{C}(G)$? They must be in the column space of G also. Are your answers unique? Do you notice anything about the scalars that appear in the linear combinations you are forming? △
Example COV and Example CSROI each describes the column space of the coefficient matrix from Archetype I as the span of a set of r = 3 linearly independent vectors. It is no accident that these two different sets both have the same size. If we (you?) were to calculate the column space of this matrix using the null space of the matrix L from Theorem FS then we would again find a set of 3 linearly independent vectors that span the range. More on this later.

So we have three different methods to obtain a description of the column space of a matrix as the span of a linearly independent set. Theorem BCS is sometimes useful since the vectors it specifies are equal to actual columns of the matrix. Theorem BRS and Theorem CSRST combine to create vectors with lots of zeros, and strategically placed 1's near the top of the vector. Theorem FS and the matrix L from the extended echelon form give us a third method, which tends to create vectors with lots of zeros, and strategically placed 1's near the bottom of the vector. If we do not care about linear independence we can also appeal to Definition CSM and simply express the column space as the span of all the columns of the matrix, giving us a fourth description.

With Theorem CSRST and Definition RSM, we can compute column spaces with theorems about row spaces, and we can compute row spaces with theorems about column spaces, but in each case we must transpose the matrix first. At this point you may be overwhelmed by all the possibilities for computing column and row spaces. Diagram CSRST is meant to help. For both the column space and row space, it suggests four techniques. One is to appeal to the definition, another yields a span of a linearly independent set, and a third uses Theorem FS. A fourth suggests transposing the matrix, and the dashed line implies that then the companion set of techniques can be applied. This can lead to a bit of silliness, since if you were to follow the dashed lines twice you would transpose the matrix twice, and by Theorem TT would accomplish nothing productive.


[Diagram: two families of techniques joined by a dashed "transpose" line. For the column space C(A): Definition CSM; Theorem BCS; Theorem FS, N(L); and Theorem CSRST, R(A^t). For the row space R(A): Definition RSM; Theorem BRS; Theorem FS, R(C); and Definition RSM, C(A^t).]

Diagram CSRST: Column Space and Row Space Techniques

Although we have many ways to describe a column space, notice that one tempting strategy will usually fail. It is not possible to simply row-reduce a matrix directly and then use the columns of the row-reduced matrix as a set whose span equals the column space. In other words, row operations do not preserve column spaces (however row operations do preserve row spaces, Theorem REMRS). See Exercise CRS.M21.
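For readers who want to experiment, the sketch below (our own illustration, not part of the text) uses the SymPy library to produce two of these descriptions of C(A) for an arbitrary small matrix, and then demonstrates the failure of the tempting strategy above: the columns of the row-reduced matrix need not span C(A).

```python
# A hedged sketch (assumes SymPy): two descriptions of C(A), and a
# demonstration that row operations do not preserve column spaces.
# The matrix A is an arbitrary illustration, not one of the archetypes.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 1, 1]])

# Pivot columns of A span C(A) (the idea behind Theorem BCS).
R, pivots = A.rref()
bcs_basis = [A.col(j) for j in pivots]

# C(A) as the row space of the transpose (the idea behind Theorem CSRST).
csrst_basis = [r.T for r in A.T.rowspace()]

# Both describe the same space: adjoining either basis to A's columns
# does not increase the rank.
for basis in (bcs_basis, csrst_basis):
    S = A.row_join(Matrix.hstack(*basis))
    assert S.rank() == A.rank()

# But the columns of the row-reduced matrix R do NOT span C(A):
v = A.col(2)                                   # a vector certainly in C(A)
M = Matrix.hstack(*[R.col(j) for j in range(R.cols)])
assert M.row_join(v).rank() > M.rank()         # v is outside the span of R's columns
```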

Reading Questions

1. Find a nontrivial element of the left null space of A.
$$A = \begin{bmatrix} 2 & 1 & -3 & 4 \\ -1 & -1 & 2 & -1 \\ 0 & -1 & 1 & 2 \end{bmatrix}$$

2. Find the matrices C and L in the extended echelon form of A.
$$A = \begin{bmatrix} -9 & 5 & -3 \\ 2 & -1 & 1 \\ -5 & 3 & -1 \end{bmatrix}$$
3. Why is Theorem FS a great conclusion to Chapter M?
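As a computational aside, the extended echelon form in Question 2 can be found mechanically: row-reduce A adjoined to an identity matrix, then slice out C and L following the block structure of Theorem FS. A hedged sketch, assuming the SymPy library:

```python
# A sketch (assumes SymPy): extended echelon form N = rref([A | I]),
# split into the submatrices C and L as in Theorem FS.
from sympy import Matrix, eye

A = Matrix([[-9, 5, -3],
            [ 2, -1, 1],
            [-5, 3, -1]])

m = A.rows
N, _ = A.row_join(eye(m)).rref()   # the extended echelon form
r = A.rank()

C = N[:r, :A.cols]     # first r rows, columns from the A-part
L = N[r:, A.cols:]     # remaining m - r rows, columns from the I-part
print(C)
print(L)
```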

Exercises

C20 Example FSAG concludes with several questions. Perform the analysis suggested by these questions.


C25† Given the matrix A below, use the extended echelon form of A to answer each part of this problem. In each part, find a linearly independent set of vectors, S, so that the span of S, ⟨S⟩, equals the specified set of vectors.
$$A = \begin{bmatrix} -5 & 3 & -1 \\ -1 & 1 & 1 \\ -8 & 5 & -1 \\ 3 & -2 & 0 \end{bmatrix}$$

1. The row space of A, R(A).

2. The column space of A, C(A).

3. The null space of A, N(A).

4. The left null space of A, L(A).

C26† For the matrix D below use the extended echelon form to find:

1. A linearly independent set whose span is the column space of D.

2. A linearly independent set whose span is the left null space of D.

$$D = \begin{bmatrix} -7 & -11 & -19 & -15 \\ 6 & 10 & 18 & 14 \\ 3 & 5 & 9 & 7 \\ -1 & -2 & -4 & -3 \end{bmatrix}$$

C41 The following archetypes are systems of equations. For each system, write the vector of constants as a linear combination of the vectors in the span construction for the column space provided by Theorem FS and Theorem BNS (these vectors are listed for each of these archetypes).

Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J

C43 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute the extended echelon form N and identify the matrices C and L. Using Theorem FS, Theorem BNS and Theorem BRS express the null space, the row space, the column space and left null space of each coefficient matrix as a span of a linearly independent set.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L

C60† For the matrix B below, find sets of vectors whose span equals the column space of B (C(B)) and which individually meet the following extra requirements.

1. The set illustrates the definition of the column space.

2. The set is linearly independent and the members of the set are columns of B.

3. The set is linearly independent with a "nice pattern of zeros and ones" at the top of each vector.

4. The set is linearly independent with a "nice pattern of zeros and ones" at the bottom of each vector.

$$B = \begin{bmatrix} 2 & 3 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ -1 & 2 & 3 & -4 \end{bmatrix}$$

C61† Let A be the matrix below, and find the indicated sets with the requested properties.
$$A = \begin{bmatrix} 2 & -1 & 5 & -3 \\ -5 & 3 & -12 & 7 \\ 1 & 1 & 4 & -3 \end{bmatrix}$$

1. A linearly independent set S so that C(A) = ⟨S⟩ and S is composed of columns of A.

2. A linearly independent set S so that C(A) = ⟨S⟩ and the vectors in S have a nice pattern of zeros and ones at the top of the vectors.

3. A linearly independent set S so that C(A) = ⟨S⟩ and the vectors in S have a nice pattern of zeros and ones at the bottom of the vectors.

4. A linearly independent set S so that R(A) = ⟨S⟩.

M50 Suppose that A is a nonsingular matrix. Extend the four conclusions of Theorem FS in this special case and discuss connections with previous results (such as Theorem NME4).

M51 Suppose that A is a singular matrix. Extend the four conclusions of Theorem FS in this special case and discuss connections with previous results (such as Theorem NME4).


Chapter VS
Vector Spaces

We now have a computational toolkit in place and so we can begin our study of linear algebra at a more theoretical level.

Linear algebra is the study of two fundamental objects, vector spaces and linear transformations (see Chapter LT). This chapter will focus on the former. The power of mathematics is often derived from generalizing many different situations into one abstract formulation, and that is exactly what we will be doing throughout this chapter.

Section VS
Vector Spaces

In this section we present a formal definition of a vector space, which will lead to an extra increment of abstraction. Once defined, we study its most basic properties.

Subsection VS
Vector Spaces

Here is one of the two most important definitions in the entire course.

Definition VS Vector Space
Suppose that V is a set upon which we have defined two operations: (1) vector addition, which combines two elements of V and is denoted by "+", and (2) scalar multiplication, which combines a complex number with an element of V and is denoted by juxtaposition. Then V, along with the two operations, is a vector space over C if the following ten properties hold.

• AC Additive Closure
If u, v ∈ V, then u + v ∈ V.

• SC Scalar Closure
If α ∈ C and u ∈ V, then αu ∈ V.

• C Commutativity
If u, v ∈ V, then u + v = v + u.

• AA Additive Associativity
If u, v, w ∈ V, then u + (v + w) = (u + v) + w.

• Z Zero Vector
There is a vector, 0, called the zero vector, such that u + 0 = u for all u ∈ V.

• AI Additive Inverses
If u ∈ V, then there exists a vector −u ∈ V so that u + (−u) = 0.

• SMA Scalar Multiplication Associativity
If α, β ∈ C and u ∈ V, then α(βu) = (αβ)u.

• DVA Distributivity across Vector Addition
If α ∈ C and u, v ∈ V, then α(u + v) = αu + αv.

• DSA Distributivity across Scalar Addition
If α, β ∈ C and u ∈ V, then (α + β)u = αu + βu.

• O One
If u ∈ V, then 1u = u.

The objects in V are called vectors, no matter what else they might really be, simply by virtue of being elements of a vector space. □
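Before we examine the definition further, it may help to see the ten properties operationally. The sketch below is our own illustration, not part of the text: given candidate operations, it spot-checks a few of the properties on sample elements. Passing such a check proves nothing (the properties quantify over all vectors and scalars), but a failure exhibits a concrete violation. All function names here are invented for the illustration.

```python
# A hedged sketch: numerically spot-check a few vector space properties
# for candidate operations add(u, v) and smult(alpha, u).

def spot_check(add, smult, zero, samples, scalars):
    """Test Properties Z, O, C, AA and DSA on the given samples."""
    ok = True
    for u in samples:
        ok &= add(u, zero) == u                                # Property Z
        ok &= smult(1, u) == u                                 # Property O
        for v in samples:
            ok &= add(u, v) == add(v, u)                       # Property C
            for w in samples:
                ok &= add(u, add(v, w)) == add(add(u, v), w)   # Property AA
        for a in scalars:
            for b in scalars:
                ok &= smult(a + b, u) == add(smult(a, u), smult(b, u))  # Property DSA
    return ok

# The usual operations on C^2, with vectors represented as tuples:
add = lambda u, v: (u[0] + v[0], u[1] + v[1])
smult = lambda a, u: (a * u[0], a * u[1])
print(spot_check(add, smult, (0, 0), [(1, 2), (-3, 5)], [2, -1]))  # True
```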

Now, there are several important observations to make. Many of these will be easier to understand on a second or third reading, and especially after carefully studying the examples in Subsection VS.EVS.

An axiom is often a "self-evident" truth. Something so fundamental that we all agree it is true and accept it without proof. Typically, it would be the logical underpinning that we would begin to build theorems upon. Some might refer to the ten properties of Definition VS as axioms, implying that a vector space is a very natural object and the ten properties are the essence of a vector space. We will instead emphasize that we will begin with a definition of a vector space. After studying the remainder of this chapter, you might return here and remind yourself how all our forthcoming theorems and definitions rest on this foundation.

As we will see shortly, the objects in V can be anything, even though we will call them vectors. We have been working with vectors frequently, but we should stress here that these have so far just been column vectors — scalars arranged in a columnar list of fixed length. In a similar vein, you have used the symbol "+" for many years to represent the addition of numbers (scalars). We have extended its use to the addition of column vectors and to the addition of matrices, and now we are going to recycle it even further and let it denote vector addition in any possible vector space. So when describing a new vector space, we will have to define exactly what "+" is. Similar comments apply to scalar multiplication. Conversely, we can define our operations any way we like, so long as the ten properties are fulfilled (see Example CVS).

In Definition VS, the scalars do not have to be complex numbers. They can come from what are called in more advanced mathematics, "fields". Examples of fields are the set of complex numbers, the set of real numbers, the set of rational numbers, and even the finite set of "binary numbers", {0, 1}. There are many, many others. In this case we would call V a vector space over (the field) F.

A vector space is composed of three objects, a set and two operations. Some would explicitly state in the definition that V must be a nonempty set, but we can infer this from Property Z, since the set cannot be empty and contain a vector that behaves as the zero vector. Also, we usually use the same symbol for both the set and the vector space itself. Do not let this convenience fool you into thinking the operations are secondary!

This discussion has either convinced you that we are really embarking on a new level of abstraction, or it has seemed cryptic, mysterious or nonsensical. You might want to return to this section in a few days and give it another read then. In any case, let us look at some concrete examples now.

Subsection EVS
Examples of Vector Spaces

Our aim in this subsection is to give you a storehouse of examples to work with, to become comfortable with the ten vector space properties and to convince you that the multitude of examples justifies (at least initially) making such a broad definition as Definition VS. Some of our claims will be justified by reference to previous theorems, we will prove some facts from scratch, and we will do one nontrivial example completely. In other places, our usual thoroughness will be neglected, so grab paper and pencil and play along.

Example VSCV The vector space C^m
Set: C^m, all column vectors of size m, Definition VSCV.
Equality: Entry-wise, Definition CVE.
Vector Addition: The "usual" addition, given in Definition CVA.
Scalar Multiplication: The "usual" scalar multiplication, given in Definition CVSM.

Does this set with these operations fulfill the ten properties? Yes. And by design all we need to do is quote Theorem VSPCV. That was easy.

△

Example VSM The vector space of matrices, M_mn
Set: M_mn, the set of all matrices of size m × n and entries from C, Definition VSM.
Equality: Entry-wise, Definition ME.
Vector Addition: The "usual" addition, given in Definition MA.
Scalar Multiplication: The "usual" scalar multiplication, given in Definition MSM.

Does this set with these operations fulfill the ten properties? Yes. And all we need to do is quote Theorem VSPM. Another easy one (by design).

△

So, the set of all matrices of a fixed size forms a vector space. That entitles us to call a matrix a vector, since a matrix is an element of a vector space. For example, if A, B ∈ M_34 then we call A and B "vectors," and we even use our previous notation for column vectors to refer to A and B. So we could legitimately write expressions like
$$u + v = A + B = B + A = v + u$$
This could lead to some confusion, but it is not too great a danger. But it is worth comment.

The previous two examples may be less than satisfying. We made all the relevant definitions long ago. And the required verifications were all handled by quoting old theorems. However, it is important to consider these two examples first. We have been studying vectors and matrices carefully (Chapter V, Chapter M), and both objects, along with their operations, have certain properties in common, as you may have noticed in comparing Theorem VSPCV with Theorem VSPM. Indeed, it is these two theorems that motivate us to formulate the abstract definition of a vector space, Definition VS. Now, if we prove some general theorems about vector spaces (as we will shortly in Subsection VS.VSP), we can then instantly apply the conclusions to both C^m and M_mn. Notice too, how we have taken six definitions and two theorems and reduced them down to two examples. With greater generalization and abstraction our old ideas get downgraded in stature.

Let us look at some more examples, now considering some new vector spaces.

Example VSP The vector space of polynomials, P_n
Set: P_n, the set of all polynomials of degree n or less in the variable x with coefficients from C.
Equality:
$$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = b_0 + b_1x + b_2x^2 + \cdots + b_nx^n \text{ if and only if } a_i = b_i \text{ for } 0 \le i \le n$$
Vector Addition:
$$(a_0 + a_1x + a_2x^2 + \cdots + a_nx^n) + (b_0 + b_1x + b_2x^2 + \cdots + b_nx^n) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots + (a_n + b_n)x^n$$
Scalar Multiplication:
$$\alpha(a_0 + a_1x + a_2x^2 + \cdots + a_nx^n) = (\alpha a_0) + (\alpha a_1)x + (\alpha a_2)x^2 + \cdots + (\alpha a_n)x^n$$

This set, with these operations, will fulfill the ten properties, though we will not work all the details here. However, we will make a few comments and prove one of the properties. First, the zero vector (Property Z) is what you might expect, and you can check that it has the required property.
$$0 = 0 + 0x + 0x^2 + \cdots + 0x^n$$
The additive inverse (Property AI) is also no surprise, though consider how we have chosen to write it.
$$-\left(a_0 + a_1x + a_2x^2 + \cdots + a_nx^n\right) = (-a_0) + (-a_1)x + (-a_2)x^2 + \cdots + (-a_n)x^n$$
Now let us prove the associativity of vector addition (Property AA). This is a bit tedious, though necessary. Throughout, the plus sign ("+") does triple-duty. You might ask yourself what each plus sign represents as you work through this proof.
$$\begin{aligned}
u + (v + w) &= (a_0 + \cdots + a_nx^n) + ((b_0 + \cdots + b_nx^n) + (c_0 + \cdots + c_nx^n))\\
&= (a_0 + \cdots + a_nx^n) + ((b_0 + c_0) + (b_1 + c_1)x + \cdots + (b_n + c_n)x^n)\\
&= (a_0 + (b_0 + c_0)) + (a_1 + (b_1 + c_1))x + \cdots + (a_n + (b_n + c_n))x^n\\
&= ((a_0 + b_0) + c_0) + ((a_1 + b_1) + c_1)x + \cdots + ((a_n + b_n) + c_n)x^n\\
&= ((a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n) + (c_0 + c_1x + \cdots + c_nx^n)\\
&= ((a_0 + \cdots + a_nx^n) + (b_0 + \cdots + b_nx^n)) + (c_0 + c_1x + \cdots + c_nx^n)\\
&= (u + v) + w
\end{aligned}$$
Notice how it is the application of the associativity of the (old) addition of complex numbers in the middle of this chain of equalities that makes the whole proof happen. The remainder is successive applications of our (new) definition of vector (polynomial) addition. Proving the remainder of the ten properties is similar in style and tedium. You might try proving the commutativity of vector addition (Property C), or one of the distributivity properties (Property DVA, Property DSA). △
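The operations just defined translate directly into code if we represent an element of P_n by its list of n + 1 coefficients. A minimal sketch (our own illustration, under that representation):

```python
# A sketch: elements of P_n as coefficient lists [a0, a1, ..., an].
def poly_add(p, q):
    # (a0 + b0) + (a1 + b1)x + ... + (an + bn)x^n
    return [a + b for a, b in zip(p, q)]

def poly_smult(alpha, p):
    # (alpha*a0) + (alpha*a1)x + ... + (alpha*an)x^n
    return [alpha * a for a in p]

# Associativity holds coefficient-by-coefficient, inherited from C:
u, v, w = [1, 2, 0], [0, -1, 3], [5, 5, 5]
assert poly_add(u, poly_add(v, w)) == poly_add(poly_add(u, v), w)
```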

Example VSIS The vector space of infinite sequences
Set: C^∞ = {(c_0, c_1, c_2, c_3, ...) | c_i ∈ C, i ∈ N}.
Equality:
$$(c_0, c_1, c_2, \ldots) = (d_0, d_1, d_2, \ldots) \text{ if and only if } c_i = d_i \text{ for all } i \ge 0$$
Vector Addition:
$$(c_0, c_1, c_2, \ldots) + (d_0, d_1, d_2, \ldots) = (c_0 + d_0,\, c_1 + d_1,\, c_2 + d_2,\, \ldots)$$
Scalar Multiplication:
$$\alpha(c_0, c_1, c_2, c_3, \ldots) = (\alpha c_0,\, \alpha c_1,\, \alpha c_2,\, \alpha c_3,\, \ldots)$$

This should remind you of the vector space C^m, though now our lists of scalars are written horizontally with commas as delimiters and they are allowed to be infinite in length. What does the zero vector look like (Property Z)? Additive inverses (Property AI)? Can you prove the associativity of vector addition (Property AA)? △

Example VSF The vector space of functions
Let X be any set.
Set: F = {f | f : X → C}.
Equality: f = g if and only if f(x) = g(x) for all x ∈ X.
Vector Addition: f + g is the function with outputs defined by (f + g)(x) = f(x) + g(x).
Scalar Multiplication: αf is the function with outputs defined by (αf)(x) = αf(x).

So this is the set of all functions of one variable that take elements of the set X to a complex number. You might have studied functions of one variable that take a real number to a real number, and that might be a more natural set to use as X. But since we are allowing our scalars to be complex numbers, we need to specify that the range of our functions is the complex numbers. Study carefully how the definitions of the operations are made, and think about the different uses of "+" and juxtaposition. As an example of what is required when verifying that this is a vector space, consider that the zero vector (Property Z) is the function z whose definition is z(x) = 0 for every input x ∈ X.

Vector spaces of functions are very important in mathematics and physics, where the field of scalars may be the real numbers, so the ranges of the functions can in turn also be the set of real numbers.

△
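In code, the vectors of this example are themselves functions, and the two operations manufacture new functions. A brief sketch (our own illustration, with X modeled as the real numbers):

```python
# A sketch: vectors are functions; "+" and scalar multiplication return
# new functions, exactly as in the definitions above.
def f_add(f, g):
    return lambda x: f(x) + g(x)      # (f + g)(x) = f(x) + g(x)

def f_smult(alpha, f):
    return lambda x: alpha * f(x)     # (alpha f)(x) = alpha f(x)

zero = lambda x: 0                    # the zero vector z, with z(x) = 0

f = lambda x: x**2
g = lambda x: 3*x + 1
h = f_add(f, f_smult(2, g))           # the linear combination f + 2g
assert h(5) == 25 + 2*(3*5 + 1)       # operations act pointwise, as expected
```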

Here is a unique example.

Example VSS The singleton vector space
Set: Z = {z}.
Equality: Huh?
Vector Addition: z + z = z.
Scalar Multiplication: αz = z.

This should look pretty wild. First, just what is z? Column vector, matrix, polynomial, sequence, function? Mineral, plant, or animal? We aren't saying! z just is. And we have definitions of vector addition and scalar multiplication that are sufficient for an occurrence of either that may come along.

Our only concern is if this set, along with the definitions of two operations, fulfills the ten properties of Definition VS. Let us check associativity of vector addition (Property AA). For all u, v, w ∈ Z,
$$u + (v + w) = z + (z + z) = z + z = (z + z) + z = (u + v) + w$$
What is the zero vector in this vector space (Property Z)? With only one element in the set, we do not have much choice. Is z = 0? It appears that z behaves like the zero vector should, so it gets the title. Maybe now the definition of this vector space does not seem so bizarre. It is a set whose only element is the element that behaves like the zero vector, so that lone element is the zero vector.

△

Perhaps some of the above definitions and verifications seem obvious or like splitting hairs, but the next example should convince you that they are necessary. We will study this one carefully. Ready? Check your preconceptions at the door.

Example CVS The crazy vector space
Set: C = {(x_1, x_2) | x_1, x_2 ∈ C}.
Vector Addition: (x_1, x_2) + (y_1, y_2) = (x_1 + y_1 + 1, x_2 + y_2 + 1).
Scalar Multiplication: α(x_1, x_2) = (αx_1 + α − 1, αx_2 + α − 1).

Now, the first thing I hear you say is "You can't do that!" And my response is, "Oh yes, I can!" I am free to define my set and my operations any way I please. They may not look natural, or even useful, but we will now verify that they provide us with another example of a vector space. And that is enough. If you are adventurous, you might try first checking some of the properties yourself. What is the zero vector? Additive inverses? Can you prove associativity? Ready, here we go.

Property AC, Property SC: The result of each operation is a pair of complex numbers, so these two closure properties are fulfilled.

Property C:
$$\begin{aligned}
u + v &= (x_1, x_2) + (y_1, y_2) = (x_1 + y_1 + 1,\; x_2 + y_2 + 1)\\
&= (y_1 + x_1 + 1,\; y_2 + x_2 + 1) = (y_1, y_2) + (x_1, x_2)\\
&= v + u
\end{aligned}$$

Property AA:
$$\begin{aligned}
u + (v + w) &= (x_1, x_2) + ((y_1, y_2) + (z_1, z_2))\\
&= (x_1, x_2) + (y_1 + z_1 + 1,\; y_2 + z_2 + 1)\\
&= (x_1 + (y_1 + z_1 + 1) + 1,\; x_2 + (y_2 + z_2 + 1) + 1)\\
&= (x_1 + y_1 + z_1 + 2,\; x_2 + y_2 + z_2 + 2)\\
&= ((x_1 + y_1 + 1) + z_1 + 1,\; (x_2 + y_2 + 1) + z_2 + 1)\\
&= (x_1 + y_1 + 1,\; x_2 + y_2 + 1) + (z_1, z_2)\\
&= ((x_1, x_2) + (y_1, y_2)) + (z_1, z_2)\\
&= (u + v) + w
\end{aligned}$$

Property Z: The zero vector is . . . 0 = (−1, −1). Now I hear you say, "No, no, that can't be, it must be (0, 0)!". Indulge me for a moment and let us check my proposal.
$$u + 0 = (x_1, x_2) + (-1, -1) = (x_1 + (-1) + 1,\; x_2 + (-1) + 1) = (x_1, x_2) = u$$
Feeling better? Or worse?

Property AI: For each vector, u, we must locate an additive inverse, −u. Here it is, −(x_1, x_2) = (−x_1 − 2, −x_2 − 2). As odd as it may look, I hope you are withholding judgment. Check:
$$\begin{aligned}
u + (-u) &= (x_1, x_2) + (-x_1 - 2,\; -x_2 - 2)\\
&= (x_1 + (-x_1 - 2) + 1,\; x_2 + (-x_2 - 2) + 1) = (-1, -1) = 0
\end{aligned}$$

Property SMA:
$$\begin{aligned}
\alpha(\beta u) &= \alpha(\beta(x_1, x_2))\\
&= \alpha(\beta x_1 + \beta - 1,\; \beta x_2 + \beta - 1)\\
&= (\alpha(\beta x_1 + \beta - 1) + \alpha - 1,\; \alpha(\beta x_2 + \beta - 1) + \alpha - 1)\\
&= ((\alpha\beta x_1 + \alpha\beta - \alpha) + \alpha - 1,\; (\alpha\beta x_2 + \alpha\beta - \alpha) + \alpha - 1)\\
&= (\alpha\beta x_1 + \alpha\beta - 1,\; \alpha\beta x_2 + \alpha\beta - 1)\\
&= (\alpha\beta)(x_1, x_2)\\
&= (\alpha\beta)u
\end{aligned}$$

Property DVA: If you have hung on so far, here is where it gets even wilder. In the next two properties we mix and mash the two operations.
$$\begin{aligned}
\alpha(u + v) &= \alpha((x_1, x_2) + (y_1, y_2))\\
&= \alpha(x_1 + y_1 + 1,\; x_2 + y_2 + 1)\\
&= (\alpha(x_1 + y_1 + 1) + \alpha - 1,\; \alpha(x_2 + y_2 + 1) + \alpha - 1)\\
&= (\alpha x_1 + \alpha y_1 + \alpha + \alpha - 1,\; \alpha x_2 + \alpha y_2 + \alpha + \alpha - 1)\\
&= (\alpha x_1 + \alpha - 1 + \alpha y_1 + \alpha - 1 + 1,\; \alpha x_2 + \alpha - 1 + \alpha y_2 + \alpha - 1 + 1)\\
&= ((\alpha x_1 + \alpha - 1) + (\alpha y_1 + \alpha - 1) + 1,\; (\alpha x_2 + \alpha - 1) + (\alpha y_2 + \alpha - 1) + 1)\\
&= (\alpha x_1 + \alpha - 1,\; \alpha x_2 + \alpha - 1) + (\alpha y_1 + \alpha - 1,\; \alpha y_2 + \alpha - 1)\\
&= \alpha(x_1, x_2) + \alpha(y_1, y_2)\\
&= \alpha u + \alpha v
\end{aligned}$$

Property DSA:
$$\begin{aligned}
(\alpha + \beta)u &= (\alpha + \beta)(x_1, x_2)\\
&= ((\alpha + \beta)x_1 + (\alpha + \beta) - 1,\; (\alpha + \beta)x_2 + (\alpha + \beta) - 1)\\
&= (\alpha x_1 + \beta x_1 + \alpha + \beta - 1,\; \alpha x_2 + \beta x_2 + \alpha + \beta - 1)\\
&= (\alpha x_1 + \alpha - 1 + \beta x_1 + \beta - 1 + 1,\; \alpha x_2 + \alpha - 1 + \beta x_2 + \beta - 1 + 1)\\
&= ((\alpha x_1 + \alpha - 1) + (\beta x_1 + \beta - 1) + 1,\; (\alpha x_2 + \alpha - 1) + (\beta x_2 + \beta - 1) + 1)\\
&= (\alpha x_1 + \alpha - 1,\; \alpha x_2 + \alpha - 1) + (\beta x_1 + \beta - 1,\; \beta x_2 + \beta - 1)\\
&= \alpha(x_1, x_2) + \beta(x_1, x_2)\\
&= \alpha u + \beta u
\end{aligned}$$

Property O: After all that, this one is easy, but no less pleasing.
$$1u = 1(x_1, x_2) = (x_1 + 1 - 1,\; x_2 + 1 - 1) = (x_1, x_2) = u$$

That is it, C is a vector space, as crazy as that may seem.

Notice that in the case of the zero vector and additive inverses, we only had to propose possibilities and then verify that they were the correct choices. You might try to discover how you would arrive at these choices, though you should understand why the process of discovering them is not a necessary component of the proof itself.

△
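A quick computational aside (ours, not the text's): the crazy operations can be spot-checked numerically, and the proposed zero vector and additive inverses behave exactly as verified above.

```python
# A sketch: the crazy vector space operations from Example CVS, spot-checked.
c_add = lambda u, v: (u[0] + v[0] + 1, u[1] + v[1] + 1)
c_smult = lambda a, u: (a*u[0] + a - 1, a*u[1] + a - 1)

zero = (-1, -1)                             # the crazy zero vector
neg = lambda u: (-u[0] - 2, -u[1] - 2)      # the crazy additive inverse

u = (3, 4)
assert c_add(u, zero) == u                           # Property Z
assert c_add(u, neg(u)) == zero                      # Property AI
assert c_smult(1, u) == u                            # Property O
assert c_smult(2, c_smult(3, u)) == c_smult(6, u)    # Property SMA
```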

Subsection VSP
Vector Space Properties

Subsection VS.EVS has provided us with an abundance of examples of vector spaces, most of them containing useful and interesting mathematical objects along with natural operations. In this subsection we will prove some general properties of vector spaces. Some of these results will again seem obvious, but it is important to understand why it is necessary to state and prove them. A typical hypothesis will be "Let V be a vector space." From this we may assume the ten properties of Definition VS, and nothing more. It is like starting over, as we learn about what can happen in this new algebra we are learning. But the power of this careful approach is that we can apply these theorems to any vector space we encounter — those in the previous examples, or new ones we have not yet contemplated. Or perhaps new ones that nobody has ever contemplated. We will illustrate some of these results with examples from the crazy vector space (Example CVS), but mostly we are stating theorems and doing proofs. These proofs do not get too involved, but are not trivial either, so these are good theorems to try proving yourself before you study the proof given here. (See Proof Technique P.)

First we show that there is just one zero vector. Notice that the properties only require there to be at least one, and say nothing about there possibly being more. That is because we can use the ten properties of a vector space (Definition VS) to learn that there can never be more than one. To require that this extra condition be stated as an eleventh property would make the definition of a vector space more complicated than it needs to be.

Theorem ZVU Zero Vector is Unique
Suppose that V is a vector space. The zero vector, 0, is unique.

Proof. To prove uniqueness, a standard technique is to suppose the existence of two objects (Proof Technique U). So let 0_1 and 0_2 be two zero vectors in V. Then
$$\begin{aligned}
0_1 &= 0_1 + 0_2 && \text{Property Z for } 0_2\\
&= 0_2 + 0_1 && \text{Property C}\\
&= 0_2 && \text{Property Z for } 0_1
\end{aligned}$$
This proves the uniqueness since the two zero vectors are really the same. ■

Theorem AIU Additive Inverses are Unique
Suppose that V is a vector space. For each u ∈ V, the additive inverse, −u, is unique.

Proof. To prove uniqueness, a standard technique is to suppose the existence of two objects (Proof Technique U). So let −u_1 and −u_2 be two additive inverses for u. Then
$$\begin{aligned}
-u_1 &= -u_1 + 0 && \text{Property Z}\\
&= -u_1 + (u + (-u_2)) && \text{Property AI}\\
&= (-u_1 + u) + (-u_2) && \text{Property AA}\\
&= 0 + (-u_2) && \text{Property AI}\\
&= -u_2 && \text{Property Z}
\end{aligned}$$
So the two additive inverses are really the same. ■

As obvious as the next three theorems appear, nowhere have we guaranteed that the zero scalar, scalar multiplication and the zero vector all interact this way. Until we have proved it, anyway.

Theorem ZSSM Zero Scalar in Scalar Multiplication
Suppose that V is a vector space and u ∈ V. Then 0u = 0.

Proof. Notice that 0 is a scalar, u is a vector, so Property SC says 0u is again a vector. As such, 0u has an additive inverse, −(0u), by Property AI.
$$\begin{aligned}
0u &= 0 + 0u && \text{Property Z}\\
&= (-(0u) + 0u) + 0u && \text{Property AI}\\
&= -(0u) + (0u + 0u) && \text{Property AA}\\
&= -(0u) + (0 + 0)u && \text{Property DSA}\\
&= -(0u) + 0u && \text{Property ZCN}\\
&= 0 && \text{Property AI}
\end{aligned}$$
■

Here is another theorem that looks like it should be obvious, but is still in need of a proof.

Theorem ZVSM Zero Vector in Scalar Multiplication
Suppose that V is a vector space and α ∈ C. Then α0 = 0.

Proof. Notice that α is a scalar, 0 is a vector, so Property SC means α0 is again a vector. As such, α0 has an additive inverse, −(α0), by Property AI.
$$\begin{aligned}
\alpha 0 &= 0 + \alpha 0 && \text{Property Z}\\
&= (-(\alpha 0) + \alpha 0) + \alpha 0 && \text{Property AI}\\
&= -(\alpha 0) + (\alpha 0 + \alpha 0) && \text{Property AA}\\
&= -(\alpha 0) + \alpha(0 + 0) && \text{Property DVA}\\
&= -(\alpha 0) + \alpha 0 && \text{Property Z}\\
&= 0 && \text{Property AI}
\end{aligned}$$
■

Here is another one that sure looks obvious. But understand that we have chosen to use certain notation because it makes the theorem's conclusion look so nice. The theorem is not true because the notation looks so good; it still needs a proof. If we had really wanted to make this point, we might have used notation like u♯ for the additive inverse of u. Then we would have written the defining property, Property AI, as u + u♯ = 0. This theorem would become u♯ = (−1)u. Not really quite as pretty, is it?

Theorem AISM Additive Inverses from Scalar Multiplication
Suppose that V is a vector space and u ∈ V. Then −u = (−1)u.

Proof.
$$\begin{aligned}
-u &= -u + 0 && \text{Property Z}\\
&= -u + 0u && \text{Theorem ZSSM}\\
&= -u + (1 + (-1))u && \text{Property AICN}\\
&= -u + (1u + (-1)u) && \text{Property DSA}\\
&= -u + (u + (-1)u) && \text{Property O}\\
&= (-u + u) + (-1)u && \text{Property AA}\\
&= 0 + (-1)u && \text{Property AI}\\
&= (-1)u && \text{Property Z}
\end{aligned}$$
■


Because of this theorem, we can now write linear combinations like 6u_1 + (−4)u_2 as 6u_1 − 4u_2, even though we have not formally defined an operation called vector subtraction.

Our next theorem is a bit different from several of the others in the list. Rather than making a declaration ("the zero vector is unique") it is an implication ("if . . . , then . . . ") and so can be used in proofs to convert a vector equality into two possibilities, one a scalar equality and the other a vector equality. It should remind you of the situation for complex numbers. If α, β ∈ C and αβ = 0, then α = 0 or β = 0. This critical property is the driving force behind using a factorization to solve a polynomial equation.

Theorem SMEZV Scalar Multiplication Equals the Zero Vector
Suppose that V is a vector space and α ∈ C. If αu = 0, then either α = 0 or u = 0.

Proof. We prove this theorem by breaking up the analysis into two cases. The first seems too trivial, and it is, but the logic of the argument is still legitimate.

Case 1. Suppose α = 0. In this case our conclusion is true (the first part of the either/or is true) and we are done. That was easy.

Case 2. Suppose α ≠ 0.
$$\begin{aligned}
u &= 1u && \text{Property O}\\
&= \left(\frac{1}{\alpha}\,\alpha\right)u && \alpha \ne 0,\ \text{Property MICN}\\
&= \frac{1}{\alpha}(\alpha u) && \text{Property SMA}\\
&= \frac{1}{\alpha}(0) && \text{Hypothesis}\\
&= 0 && \text{Theorem ZVSM}
\end{aligned}$$
So in this case, the conclusion is true (the second part of the either/or is true) and we are done since the conclusion was true in each of the two cases. ■

Example PCVS Properties for the Crazy Vector Space
Several of the above theorems have interesting demonstrations when applied to the crazy vector space, C (Example CVS). We are not proving anything new here, or learning anything we did not know already about C. It is just plain fun to see how these general theorems apply in a specific instance. For most of our examples, the applications are obvious or trivial, but not with C.

Suppose u ∈ C. Then, as given by Theorem ZSSM,
$$0u = 0(x_1, x_2) = (0x_1 + 0 - 1,\; 0x_2 + 0 - 1) = (-1, -1) = 0$$
And as given by Theorem ZVSM,
$$\begin{aligned}
\alpha 0 &= \alpha(-1, -1) = (\alpha(-1) + \alpha - 1,\; \alpha(-1) + \alpha - 1)\\
&= (-\alpha + \alpha - 1,\; -\alpha + \alpha - 1) = (-1, -1) = 0
\end{aligned}$$
Finally, as given by Theorem AISM,
$$\begin{aligned}
(-1)u &= (-1)(x_1, x_2) = ((-1)x_1 + (-1) - 1,\; (-1)x_2 + (-1) - 1)\\
&= (-x_1 - 2,\; -x_2 - 2) = -u
\end{aligned}$$
△

Subsection RD
Recycling Definitions

When we say that V is a vector space, we then know we have a set of objects (the "vectors"), but we also know we have been provided with two operations ("vector addition" and "scalar multiplication") and these operations behave with these objects according to the ten properties of Definition VS. One combines two vectors and produces a vector, the other takes a scalar and a vector, producing a vector as the result. So if u_1, u_2, u_3 ∈ V then an expression like
$$5u_1 + 7u_2 - 13u_3$$
would be unambiguous in any of the vector spaces we have discussed in this section. And the resulting object would be another vector in the vector space. If you were tempted to call the above expression a linear combination, you would be right. Four of the definitions that were central to our discussions in Chapter V were stated in the context of vectors being column vectors, but were purposely kept broad enough that they could be applied in the context of any vector space. They only rely on the presence of scalars, vectors, vector addition and scalar multiplication to make sense. We will restate them shortly, unchanged, except that their titles and acronyms no longer refer to column vectors, and the hypothesis of being in a vector space has been added. Take the time now to look forward and review each one, and begin to form some connections to what we have done earlier and what we will be doing in subsequent sections and chapters. Specifically, compare the following pairs of definitions:

• Definition LCCV and Definition LC
• Definition SSCV and Definition SS
• Definition RLDCV and Definition RLD
• Definition LICV and Definition LI

Reading Questions

1. Comment on how the vector space C^m went from a theorem (Theorem VSPCV) to an example (Example VSCV).

2. In the crazy vector space, C, (Example CVS) compute the linear combination 2(3, 4) + (−6)(1, 2).

3. Suppose that α is a scalar and 0 is the zero vector. Why should we prove anything as obvious as α0 = 0 such as we did in Theorem ZVSM?

Exercises

M10 Define a possibly new vector space by beginning with the set and vector addition from C^2 (Example VSCV) but change the definition of scalar multiplication to
$$\alpha x = 0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \alpha \in \mathbb{C},\ x \in \mathbb{C}^2$$
Prove that the first nine properties required for a vector space hold, but Property O does not hold.

This example shows us that we cannot expect to be able to derive Property O as a consequence of assuming the first nine properties. In other words, we cannot slim down our list of properties by jettisoning the last one, and still have the same collection of objects qualify as vector spaces.

M11† Let V be the set C^2 with the usual vector addition, but with scalar multiplication defined by
$$\alpha \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \alpha y \\ \alpha x \end{bmatrix}$$
Determine whether or not V is a vector space with these operations.

M12† Let V be the set C^2 with the usual scalar multiplication, but with vector addition defined by
$$\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} z \\ w \end{bmatrix} = \begin{bmatrix} y + w \\ x + z \end{bmatrix}$$
Determine whether or not V is a vector space with these operations.

M13† Let V be the set M_22 with the usual scalar multiplication, but with addition defined by A + B = O_{2,2} for all 2 × 2 matrices A and B. Determine whether or not V is a vector space with these operations.

M14† Let V be the set M_22 with the usual addition, but with scalar multiplication defined by αA = O_{2,2} for all 2 × 2 matrices A and scalars α. Determine whether or not V is a vector space with these operations.

M15† Consider the following sets of 3 × 3 matrices, where the symbol ∗ indicates the position of an arbitrary complex number. Determine whether or not these sets form vector spaces with the usual operations of addition and scalar multiplication for matrices.
spaces with the usual operations of addition and scalar multiplication for matrices.


1. All matrices of the form $\begin{bmatrix} * & * & 1 \\ * & 1 & * \\ 1 & * & * \end{bmatrix}$

2. All matrices of the form $\begin{bmatrix} * & 0 & * \\ 0 & * & 0 \\ * & 0 & * \end{bmatrix}$

3. All matrices of the form $\begin{bmatrix} * & 0 & 0 \\ 0 & * & 0 \\ 0 & 0 & * \end{bmatrix}$ (These are the diagonal matrices.)

4. All matrices of the form $\begin{bmatrix} * & * & * \\ 0 & * & * \\ 0 & 0 & * \end{bmatrix}$ (These are the upper triangular matrices.)

M20† Explain why we need to define the vector space P_n as the set of all polynomials with degree up to and including n instead of the more obvious set of all polynomials of degree exactly n.

M21† The set of integers is denoted Z. Does the set $\mathbb{Z}^2 = \left\{ \begin{bmatrix} m \\ n \end{bmatrix} \,\middle|\, m, n \in \mathbb{Z} \right\}$ with the operations of standard addition and scalar multiplication of vectors form a vector space?

T10 Prove each of the ten properties of Definition VS for each of the following examples of a vector space: Example VSP, Example VSIS, Example VSF, Example VSS.

The next three problems suggest that under the right situations we can "cancel." In practice, these techniques should be avoided in other proofs. Prove each of the following statements.

T21† Suppose that V is a vector space, and u, v, w ∈ V. If w + u = w + v, then u = v.

T22† Suppose V is a vector space, u, v ∈ V and α is a nonzero scalar from C. If αu = αv, then u = v.

T23† Suppose V is a vector space, u ≠ 0 is a vector in V and α, β ∈ C. If αu = βu, then α = β.

T30† Suppose that V is a vector space and α ∈ C is a scalar such that αx = x for every x ∈ V. Prove that α = 1. In other words, Property O is not duplicated for any other scalar but the "special" scalar, 1. (This question was suggested by James Gallagher.)


Section S
Subspaces

A subspace is a vector space that is contained within another vector space. So every subspace is a vector space in its own right, but it is also defined relative to some other (larger) vector space. We will discover shortly that we are already familiar with a wide variety of subspaces from previous sections.

Subsection S
Subspaces

Here is the principal definition for this section.

Definition S Subspace
Suppose that V and W are two vector spaces that have identical definitions of vector addition and scalar multiplication, and that W is a subset of V, W ⊆ V. Then W is a subspace of V. □

Let us look at an example of a vector space inside another vector space.

Example SC3 A subspace of C^3
We know that C^3 is a vector space (Example VSCV). Consider the subset
$$W = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \,\middle|\, 2x_1 - 5x_2 + 7x_3 = 0 \right\}$$
It is clear that W ⊆ C^3, since the objects in W are column vectors of size 3. But is W a vector space? Does it satisfy the ten properties of Definition VS when we use the same operations? That is the main question.

Suppose $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$ are vectors from W. Then we know that these vectors cannot be totally arbitrary, they must have gained membership in W by virtue of meeting the membership test. For example, we know that x must satisfy 2x_1 − 5x_2 + 7x_3 = 0 while y must satisfy 2y_1 − 5y_2 + 7y_3 = 0. Our first property (Property AC) asks the question, is x + y ∈ W? When our set of vectors was C^3, this was an easy question to answer. Now it is not so obvious. Notice first that
$$x + y = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \end{bmatrix}$$
and we can test this vector for membership in W as follows. Because x ∈ W we know 2x_1 − 5x_2 + 7x_3 = 0 and because y ∈ W we know 2y_1 − 5y_2 + 7y_3 = 0. Therefore,
$$\begin{aligned}
2(x_1 + y_1) - 5(x_2 + y_2) + 7(x_3 + y_3) &= 2x_1 + 2y_1 - 5x_2 - 5y_2 + 7x_3 + 7y_3\\
&= (2x_1 - 5x_2 + 7x_3) + (2y_1 - 5y_2 + 7y_3)\\
&= 0 + 0\\
&= 0
\end{aligned}$$
and by this computation we see that x + y ∈ W. One property down, nine to go.

If α is a scalar and x ∈ W, is it always true that αx ∈ W? This is what we need to establish Property SC. Again, the answer is not as obvious as it was when our set of vectors was all of C^3. Let us see. First, note that
$$\alpha x = \alpha \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \alpha x_3 \end{bmatrix}$$
and we can test this vector for membership in W. First, note that because x ∈ W we know 2x_1 − 5x_2 + 7x_3 = 0. Therefore,
$$\begin{aligned}
2(\alpha x_1) - 5(\alpha x_2) + 7(\alpha x_3) &= \alpha(2x_1 - 5x_2 + 7x_3)\\
&= \alpha 0\\
&= 0
\end{aligned}$$
and we see that indeed αx ∈ W. Always.

If W has a zero vector, it will be unique (Theorem ZVU). The zero vector for C^3 should also perform the required duties when added to elements of W. So the likely candidate for a zero vector in W is the same zero vector that we know C^3 has. You can check that $0 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ is a zero vector in W too (Property Z).

With a zero vector, we can now ask about additive inverses (Property AI). As you might suspect, the natural candidate for an additive inverse in W is the same as the additive inverse from C^3. However, we must insure that these additive inverses actually are elements of W. Given x ∈ W, is −x ∈ W?
$$-x = \begin{bmatrix} -x_1 \\ -x_2 \\ -x_3 \end{bmatrix}$$
and we can test this vector for membership in W. As before, because x ∈ W we know 2x_1 − 5x_2 + 7x_3 = 0.
$$\begin{aligned}
2(-x_1) - 5(-x_2) + 7(-x_3) &= -(2x_1 - 5x_2 + 7x_3)\\
&= -0\\
&= 0
\end{aligned}$$
and we now believe that −x ∈ W.

Is the vector addition in W commutative (Property C)? Is x + y = y + x? Of course! Nothing about restricting the scope of our set of vectors will prevent the operation from still being commutative. Indeed, the remaining five properties are unaffected by the transition to a smaller set of vectors, and so remain true. That was convenient.

So W satisfies all ten properties, is therefore a vector space, and thus earns the title of being a subspace of C^3.

△

Subsection TS
Testing Subspaces

In Example SC3 we proceeded through all ten of the vector space properties before believing that a subset was a subspace. But six of the properties were easy to prove, and we can lean on some of the properties of the vector space (the superset) to make the other four easier. Here is a theorem that will make it easier to test if a subset is a vector space. A shortcut if there ever was one.

Theorem TSS Testing Subsets for Subspaces
Suppose that V is a vector space and W is a subset of V, W ⊆ V. Endow W with the same operations as V. Then W is a subspace if and only if three conditions are met:

1. W is nonempty, W ≠ ∅.

2. If x ∈ W and y ∈ W, then x + y ∈ W.

3. If α ∈ C and x ∈ W, then αx ∈ W.

Proof. (⇒) We have the hypothesis that W is a subspace, so by Property Z we know that W contains a zero vector. This is enough to show that W ≠ ∅. Also, since W is a vector space it satisfies the additive and scalar multiplication closure properties (Property AC, Property SC), and so exactly meets the second and third conditions. If that was easy, the other direction might require a bit more work.

(⇐) We have three properties for our hypothesis, and from this we should conclude that W has the ten defining properties of a vector space. The second and third conditions of our hypothesis are exactly Property AC and Property SC. Our hypothesis that V is a vector space implies that Property C, Property AA, Property SMA, Property DVA, Property DSA and Property O all hold. They continue to be true for vectors from W since passing to a subset, and keeping the operation the same, leaves their statements unchanged. Eight down, two to go.

Suppose x ∈ W. Then by the third part of our hypothesis (scalar closure), we know that (−1)x ∈ W. By Theorem AISM (−1)x = −x, so together these statements show us that −x ∈ W. −x is the additive inverse of x in V, but will continue in this role when viewed as an element of the subset W. So every element of W has an additive inverse that is an element of W and Property AI is completed. Just one property left.


While we have implicitly discussed the zero vector in the previous paragraph, we need to be certain that the zero vector (of V) really lives in W. Since W is nonempty, we can choose some vector z ∈ W. Then by the argument in the previous paragraph, we know −z ∈ W. Now by Property AI for V and then by the second part of our hypothesis (additive closure) we see that
$$0 = z + (-z) \in W$$
So W contains the zero vector from V. Since this vector performs the required duties of a zero vector in V, it will continue in that role as an element of W. This gives us Property Z, the final property of the ten required. (Sarah Fellez contributed to this proof.) ■

So just three conditions, plus being a subset of a known vector space, gets us all ten properties. Fabulous! This theorem can be paraphrased by saying that a subspace is "a nonempty subset (of a vector space) that is closed under vector addition and scalar multiplication."

You might want to go back and rework Example SC3 in light of this result, perhaps seeing where we can now economize or where the work done in the example mirrored the proof and where it did not. We will press on and apply this theorem in a slightly more abstract setting.
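As a computational aside (our own illustration, not part of the text), the three conditions of Theorem TSS suggest a quick sanity check: pick sample members and scalars and test closure. Here is a sketch using the set W of Example SC3; a finite check like this is evidence, not proof.

```python
# A sketch: spot-check the three conditions of Theorem TSS for the set
# W = { x in C^3 : 2*x1 - 5*x2 + 7*x3 = 0 } from Example SC3.
in_W = lambda x: 2*x[0] - 5*x[1] + 7*x[2] == 0
add = lambda x, y: tuple(a + b for a, b in zip(x, y))
smult = lambda a, x: tuple(a * v for v in x)

samples = [(-1, 1, 1), (5, 2, 0), (-7, 0, 2)]   # each satisfies the condition

assert any(in_W(x) for x in samples)                                 # nonempty
assert all(in_W(add(x, y)) for x in samples for y in samples)        # additive closure
assert all(in_W(smult(a, x)) for a in (2, -3, 0) for x in samples)   # scalar closure
```

The membership condition being linear in the entries is exactly what makes the closure arguments of Example SC3 work for every vector, not just these samples.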

Example SP4 A subspace of P₄
P₄ is the vector space of polynomials with degree at most 4 (Example VSP). Define a subset W as

\[
W = \{\, p(x) \mid p \in P_4,\ p(2) = 0 \,\}
\]

so W is the collection of those polynomials (with degree 4 or less) whose graphs cross the x-axis at x = 2. Whenever we encounter a new set it is a good idea to gain a better understanding of the set by finding a few elements in the set, and a few outside it. For example x² − x − 2 ∈ W, while x⁴ + x³ − 7 ∉ W.

Is W nonempty? Yes, x − 2 ∈ W.

Additive closure? Suppose p ∈ W and q ∈ W. Is p + q ∈ W? The polynomials p and q are not totally arbitrary: we know that p(2) = 0 and q(2) = 0. Then we can check p + q for membership in W,

\[
\begin{aligned}
(p + q)(2) &= p(2) + q(2) && \text{Addition in } P_4\\
&= 0 + 0 && p \in W,\ q \in W\\
&= 0
\end{aligned}
\]

so we see that p + q qualifies for membership in W.

Scalar multiplication closure? Suppose that α ∈ C and p ∈ W. Then we know that p(2) = 0. Testing αp for membership,

\[
\begin{aligned}
(\alpha p)(2) &= \alpha\, p(2) && \text{Scalar multiplication in } P_4\\
&= \alpha\, 0 && p \in W\\
&= 0
\end{aligned}
\]

so αp ∈ W.

We have shown that W meets the three conditions of Theorem TSS and so qualifies as a subspace of P₄. Notice that by Definition S we now know that W is also a vector space. So all the properties of a vector space (Definition VS) and the theorems of Section VS apply in full. △
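These closure checks are easy to automate. The following is a minimal sketch, assuming Python with the SymPy library (neither appears in the text; any computer algebra system would serve):

```python
from sympy import Rational, symbols

x = symbols('x')

# Two sample polynomials from W: each has degree at most 4 and p(2) = 0
p = x**2 - x - 2
q = x - 2

# Additive closure: (p + q)(2) should be 0
print((p + q).subs(x, 2))        # 0

# Scalar multiplication closure: (alpha*p)(2) should be 0
alpha = Rational(5, 3)           # an arbitrary scalar, chosen for illustration
print((alpha * p).subs(x, 2))    # 0
```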

Much of the power of Theorem TSS is that we can easily establish new vector spaces if we can locate them as subsets of other vector spaces, such as the vector spaces presented in Subsection VS.EVS.

It can be as instructive to consider some subsets that are not subspaces. Since Theorem TSS is an equivalence (see Proof Technique E) we can be assured that a subset is not a subspace if it violates one of the three conditions, and in any example of interest this will not be the "nonempty" condition. However, since a subspace has to be a vector space in its own right, we can also search for a violation of any one of the ten defining properties in Definition VS or any inherent property of a vector space, such as those given by the basic theorems of Subsection VS.VSP. Notice also that a violation need only be for a specific vector or pair of vectors.

Example NSC2Z A non-subspace in C², zero vector
Consider the subset W below as a candidate for being a subspace of C²,

\[
W = \left\{ \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \,\middle|\, 3x_1 - 5x_2 = 12 \right\}
\]

The zero vector of C², \(0 = \begin{bmatrix} 0\\ 0 \end{bmatrix}\), will need to be the zero vector in W also. However, 0 ∉ W since 3(0) − 5(0) = 0 ≠ 12. So W has no zero vector and fails Property Z of Definition VS. This subset also fails to be closed under addition and scalar multiplication. Can you find examples of this? △

Example NSC2A A non-subspace in C², additive closure
Consider the subset X below as a candidate for being a subspace of C²,

\[
X = \left\{ \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \,\middle|\, x_1 x_2 = 0 \right\}
\]

You can check that 0 ∈ X, so the approach of the last example will not get us anywhere. However, notice that \(x = \begin{bmatrix} 1\\ 0 \end{bmatrix} \in X\) and \(y = \begin{bmatrix} 0\\ 1 \end{bmatrix} \in X\). Yet

\[
x + y = \begin{bmatrix} 1\\ 0 \end{bmatrix} + \begin{bmatrix} 0\\ 1 \end{bmatrix} = \begin{bmatrix} 1\\ 1 \end{bmatrix} \notin X
\]

So X fails the additive closure requirement of either Property AC or Theorem TSS, and is therefore not a subspace. △

Example NSC2S A non-subspace in C², scalar multiplication closure
Consider the subset Y below as a candidate for being a subspace of C²,

\[
Y = \left\{ \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \,\middle|\, x_1 \in \mathbb{Z},\ x_2 \in \mathbb{Z} \right\}
\]

Here ℤ is the set of integers, so we are only allowing "whole numbers" as the constituents of our vectors. Now, 0 ∈ Y, and additive closure also holds (can you prove these claims?). So we will have to try something different. Note that \(\alpha = \frac{1}{2} \in \mathbb{C}\) and \(x = \begin{bmatrix} 2\\ 3 \end{bmatrix} \in Y\), but

\[
\alpha x = \frac{1}{2}\begin{bmatrix} 2\\ 3 \end{bmatrix} = \begin{bmatrix} 1\\[2pt] \tfrac{3}{2} \end{bmatrix} \notin Y
\]

So Y fails the scalar multiplication closure requirement of either Property SC or Theorem TSS, and is therefore not a subspace. △

There are two examples of subspaces that are trivial. Suppose that V is any vector space. Then V is a subset of itself and is a vector space. By Definition S, V qualifies as a subspace of itself. The set containing just the zero vector, Z = {0}, is also a subspace, as can be seen by applying Theorem TSS or by simple modifications of the techniques hinted at in Example VSS. Since these subspaces are so obvious (and therefore not too interesting) we will refer to them as being trivial.

Definition TS Trivial Subspaces
Given the vector space V, the subspaces V and {0} are each called a trivial subspace. □

We can also use Theorem TSS to prove more general statements about subspaces, as illustrated in the next theorem.

Theorem NSMS Null Space of a Matrix is a Subspace
Suppose that A is an m × n matrix. Then the null space of A, N(A), is a subspace of Cⁿ.

Proof. We will examine the three requirements of Theorem TSS. Recall that Definition NSM can be formulated as N(A) = { x ∈ Cⁿ | Ax = 0 }.

First, 0 ∈ N(A), which can be inferred as a consequence of Theorem HSC. So N(A) ≠ ∅.

Second, check additive closure by supposing that x ∈ N(A) and y ∈ N(A). So we know a little something about x and y: Ax = 0 and Ay = 0, and that is all we know. Question: Is x + y ∈ N(A)? Let us check.

\[
\begin{aligned}
A(x + y) &= Ax + Ay && \text{Theorem MMDAA}\\
&= 0 + 0 && x \in N(A),\ y \in N(A)\\
&= 0 && \text{Theorem VSPCV}
\end{aligned}
\]

So, yes, x + y qualifies for membership in N(A).

Third, check scalar multiplication closure by supposing that α ∈ C and x ∈ N(A). So we know a little something about x: Ax = 0, and that is all we know. Question: Is αx ∈ N(A)? Let us check.

\[
\begin{aligned}
A(\alpha x) &= \alpha(Ax) && \text{Theorem MMSMM}\\
&= \alpha\, 0 && x \in N(A)\\
&= 0 && \text{Theorem ZVSM}
\end{aligned}
\]

So, yes, αx qualifies for membership in N(A).

Having met the three conditions in Theorem TSS we can now say that the null space of a matrix is a subspace (and hence a vector space in its own right!). ■

Here is an example where we can exercise Theorem NSMS.

Example RSNS Recasting a subspace as a null space
Consider the subset of C⁵ defined as

\[
W = \left\{ \begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix} \,\middle|\,
\begin{array}{l}
3x_1 + x_2 - 5x_3 + 7x_4 + x_5 = 0,\\
4x_1 + 6x_2 + 3x_3 - 6x_4 - 5x_5 = 0,\\
-2x_1 + 4x_2 + 7x_4 + x_5 = 0
\end{array}
\right\}
\]

It is possible to show that W is a subspace of C⁵ by checking the three conditions of Theorem TSS directly, but it will get tedious rather quickly. Instead, give W a fresh look and notice that it is a set of solutions to a homogeneous system of equations. Define the matrix

\[
A = \begin{bmatrix}
3 & 1 & -5 & 7 & 1\\
4 & 6 & 3 & -6 & -5\\
-2 & 4 & 0 & 7 & 1
\end{bmatrix}
\]

and then recognize that W = N(A). By Theorem NSMS we can immediately see that W is a subspace. Boom! △
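For a concrete feel for W, a computer algebra system can produce a set of vectors whose span is this null space. A minimal sketch, again assuming SymPy:

```python
from sympy import Matrix

A = Matrix([[ 3, 1, -5,  7,  1],
            [ 4, 6,  3, -6, -5],
            [-2, 4,  0,  7,  1]])

# nullspace() returns vectors spanning N(A), so W = N(A) is exactly
# the span of these vectors -- a subspace of C^5 by Theorem NSMS
for v in A.nullspace():
    print(v.T)
```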

Subsection TSS
The Span of a Set

The span of a set of column vectors got a heavy workout in Chapter V and Chapter M. The definition of the span depended only on being able to formulate linear combinations. In any of our more general vector spaces we always have a definition of vector addition and of scalar multiplication. So we can build linear combinations and manufacture spans. This subsection contains two definitions that are just mild variants of definitions we have seen earlier for column vectors. If you have not already, compare them with Definition LCCV and Definition SSCV.

Definition LC Linear Combination
Suppose that V is a vector space. Given n vectors u₁, u₂, u₃, ..., uₙ and n scalars α₁, α₂, α₃, ..., αₙ, their linear combination is the vector

\[
\alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n
\]

□

Example LCM A l<strong>in</strong>ear comb<strong>in</strong>ation of matrices<br />

In the vector space M 23 of 2 × 3 matrices, we have the vectors<br />

[ ]<br />

[ ]<br />

[ ]<br />

1 3 −2<br />

3 −1 2<br />

4 2 −4<br />

x =<br />

y =<br />

z =<br />

2 0 7<br />

5 5 1<br />

1 1 1<br />

and we can form l<strong>in</strong>ear comb<strong>in</strong>ations such as<br />

[ ] [ ] [ ]<br />

1 3 −2 3 −1 2 4 2 −4<br />

2x +4y +(−1)z =2<br />

+4<br />

+(−1)<br />

2 0 7 5 5 1 1 1 1<br />

[ ] [ ] [ ]<br />

2 6 −4 12 −4 8 −4 −2 4<br />

=<br />

+<br />

+<br />

4 0 14 20 20 4 −1 −1 −1<br />

[ ]<br />

10 0 8<br />

=<br />

23 19 17<br />

or,<br />

[ ] [ ] [ ]<br />

1 3 −2 3 −1 2 4 2 −4<br />

4x − 2y +3z =4<br />

− 2<br />

+3<br />

2 0 7 5 5 1 1 1 1<br />

[ ] [ ] [<br />

4 12 −8 −6 2 −4 12 6 −12<br />

=<br />

+<br />

+<br />

8 0 28 −10 −10 −2 3 3 3<br />

[ ]<br />

10 20 −24<br />

=<br />

1 −7 29<br />
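These are exactly the entry-by-entry computations a numerical library performs. A minimal sketch, assuming NumPy (an assumption; nested lists and loops would do as well):

```python
import numpy as np

x = np.array([[1, 3, -2], [2, 0, 7]])
y = np.array([[3, -1, 2], [5, 5, 1]])
z = np.array([[4, 2, -4], [1, 1, 1]])

# The two linear combinations from Example LCM
print(2*x + 4*y + (-1)*z)   # [[10  0  8], [23 19 17]]
print(4*x - 2*y + 3*z)      # [[10 20 -24], [ 1 -7 29]]
```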

When we realize that we can form linear combinations in any vector space, then it is natural to revisit our definition of the span of a set, since it is the set of all possible linear combinations of a set of vectors.

Def<strong>in</strong>ition SS Span of a Set<br />

Suppose that V is a vector space. Given a set of vectors S = {u 1 , u 2 , u 3 , ..., u t },<br />

their span, 〈S〉, is the set of all possible l<strong>in</strong>ear comb<strong>in</strong>ations of u 1 , u 2 , u 3 , ..., u t .<br />

Symbolically,<br />

〈S〉 = {α 1 u 1 + α 2 u 2 + α 3 u 3 + ···+ α t u t | α i ∈ C, 1 ≤ i ≤ t}<br />

{ t∑<br />

∣ }<br />

∣∣∣∣<br />

= α i u i α i ∈ C, 1 ≤ i ≤ t<br />

i=1<br />

□<br />

]<br />


§S Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 280<br />

Theorem SSS Span of a Set is a Subspace<br />

Suppose V is a vector space. Given a set of vectors S = {u 1 , u 2 , u 3 , ..., u t }⊆V ,<br />

their span, 〈S〉, is a subspace.<br />

Proof. By Def<strong>in</strong>ition SS, the span conta<strong>in</strong>s l<strong>in</strong>ear comb<strong>in</strong>ations of vectors from the<br />

vector space V , so by repeated use of the closure properties, Property AC and<br />

Property SC, 〈S〉 can be seen to be a subset of V .<br />

We will then verify the three conditions of Theorem TSS. <strong>First</strong>,<br />

0 = 0 + 0 + 0 + ...+ 0 Property Z<br />

=0u 1 +0u 2 +0u 3 + ···+0u t Theorem ZSSM<br />

So we have written 0 as a l<strong>in</strong>ear comb<strong>in</strong>ation of the vectors <strong>in</strong> S and by Def<strong>in</strong>ition<br />

SS, 0 ∈〈S〉 and therefore 〈S〉 ≠ ∅.<br />

Second, suppose x ∈〈S〉 and y ∈〈S〉. Can we conclude that x + y ∈〈S〉? What<br />

do we know about x and y by virtue of their membership <strong>in</strong> 〈S〉? Theremustbe<br />

scalars from C, α 1 ,α 2 ,α 3 , ..., α t and β 1 ,β 2 ,β 3 , ..., β t so that<br />

x = α 1 u 1 + α 2 u 2 + α 3 u 3 + ···+ α t u t<br />

y = β 1 u 1 + β 2 u 2 + β 3 u 3 + ···+ β t u t<br />

Then<br />

x + y = α 1 u 1 + α 2 u 2 + α 3 u 3 + ···+ α t u t<br />

+ β 1 u 1 + β 2 u 2 + β 3 u 3 + ···+ β t u t<br />

= α 1 u 1 + β 1 u 1 + α 2 u 2 + β 2 u 2<br />

+ α 3 u 3 + β 3 u 3 + ···+ α t u t + β t u t Property AA, Property C<br />

=(α 1 + β 1 )u 1 +(α 2 + β 2 )u 2<br />

+(α 3 + β 3 )u 3 + ···+(α t + β t )u t<br />

Property DSA<br />

S<strong>in</strong>ce each α i + β i is aga<strong>in</strong> a scalar from C we have expressed the vector sum x + y<br />

as a l<strong>in</strong>ear comb<strong>in</strong>ation of the vectors from S, and therefore by Def<strong>in</strong>ition SS we can<br />

say that x + y ∈〈S〉.<br />

Third, suppose α ∈ C and x ∈〈S〉. Can we conclude that αx ∈〈S〉? Whatdo<br />

we know about x by virtue of its membership <strong>in</strong> 〈S〉? There must be scalars from C,<br />

α 1 ,α 2 ,α 3 , ..., α t so that<br />

x = α 1 u 1 + α 2 u 2 + α 3 u 3 + ···+ α t u t<br />

Then<br />

αx = α (α 1 u 1 + α 2 u 2 + α 3 u 3 + ···+ α t u t )<br />

= α(α 1 u 1 )+α(α 2 u 2 )+α(α 3 u 3 )+···+ α(α t u t ) Property DVA<br />

=(αα 1 )u 1 +(αα 2 )u 2 +(αα 3 )u 3 + ···+(αα t )u t<br />

Property SMA


§S Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 281<br />

S<strong>in</strong>ce each αα i is aga<strong>in</strong> a scalar from C we have expressed the scalar multiple αx as<br />

a l<strong>in</strong>ear comb<strong>in</strong>ation of the vectors from S, and therefore by Def<strong>in</strong>ition SS we can<br />

say that αx ∈〈S〉.<br />

With the three conditions of Theorem TSS met, we can say that 〈S〉 is a subspace<br />

(and so is also a vector space, Def<strong>in</strong>ition VS). (See Exercise SS.T20, Exercise SS.T21,<br />

Exercise SS.T22.)<br />

<br />

Example SSP Span of a set of polynomials
In Example SP4 we proved that

\[
W = \{\, p(x) \mid p \in P_4,\ p(2) = 0 \,\}
\]

is a subspace of P₄, the vector space of polynomials of degree at most 4. Since W is a vector space itself, let us construct a span within W. First let

\[
S = \{\, x^4 - 4x^3 + 5x^2 - x - 2,\ 2x^4 - 3x^3 - 6x^2 + 6x + 4 \,\}
\]

and verify that S is a subset of W by checking that each of these two polynomials has x = 2 as a root. Now, if we define U = ⟨S⟩, then Theorem SSS tells us that U is a subspace of W. So quite quickly we have built a chain of subspaces, U inside W, and W inside P₄.

Rather than dwell on how quickly we can build subspaces, let us try to gain a better understanding of just how the span construction creates subspaces, in the context of this example. We can quickly build representative elements of U,

\[
3(x^4 - 4x^3 + 5x^2 - x - 2) + 5(2x^4 - 3x^3 - 6x^2 + 6x + 4) = 13x^4 - 27x^3 - 15x^2 + 27x + 14
\]

and

\[
(-2)(x^4 - 4x^3 + 5x^2 - x - 2) + 8(2x^4 - 3x^3 - 6x^2 + 6x + 4) = 14x^4 - 16x^3 - 58x^2 + 50x + 36
\]

and each of these polynomials must be in W since W is closed under addition and scalar multiplication. But you might check for yourself that both of these polynomials have x = 2 as a root.

I can tell you that y = 3x⁴ − 7x³ − x² + 7x − 2 is not in U, but would you believe me? A first check shows that y does have x = 2 as a root, but that only shows that y ∈ W. What does y have to do to gain membership in U = ⟨S⟩? It must be a linear combination of the vectors in S, x⁴ − 4x³ + 5x² − x − 2 and 2x⁴ − 3x³ − 6x² + 6x + 4. So let us suppose that y is such a linear combination,

\[
\begin{aligned}
y &= 3x^4 - 7x^3 - x^2 + 7x - 2\\
&= \alpha_1(x^4 - 4x^3 + 5x^2 - x - 2) + \alpha_2(2x^4 - 3x^3 - 6x^2 + 6x + 4)\\
&= (\alpha_1 + 2\alpha_2)x^4 + (-4\alpha_1 - 3\alpha_2)x^3 + (5\alpha_1 - 6\alpha_2)x^2 + (-\alpha_1 + 6\alpha_2)x + (-2\alpha_1 + 4\alpha_2)
\end{aligned}
\]

Notice that the operations above are done in accordance with the definition of the vector space of polynomials (Example VSP). Now, if we equate coefficients, which is the definition of equality for polynomials, then we obtain the system of five linear equations in two variables

\[
\begin{aligned}
\alpha_1 + 2\alpha_2 &= 3\\
-4\alpha_1 - 3\alpha_2 &= -7\\
5\alpha_1 - 6\alpha_2 &= -1\\
-\alpha_1 + 6\alpha_2 &= 7\\
-2\alpha_1 + 4\alpha_2 &= -2
\end{aligned}
\]

Build an augmented matrix from the system and row-reduce,

\[
\begin{bmatrix}
1 & 2 & 3\\
-4 & -3 & -7\\
5 & -6 & -1\\
-1 & 6 & 7\\
-2 & 4 & -2
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 0\\
0 & 0 & 0
\end{bmatrix}
\]

Since the final column of the row-reduced augmented matrix is a pivot column, Theorem RCLS tells us the system of equations is inconsistent. Therefore, there are no scalars, α₁ and α₂, to establish y as a linear combination of the elements in S. So y ∉ U. △
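The membership test reduces to the small linear system above, which a computer algebra system can dispatch immediately. A minimal sketch, assuming SymPy:

```python
from sympy import Matrix, linsolve, symbols

a1, a2 = symbols('alpha1 alpha2')

# Coefficient matrix and vector of constants from equating coefficients
A = Matrix([[1, 2], [-4, -3], [5, -6], [-1, 6], [-2, 4]])
b = Matrix([3, -7, -1, 7, -2])

# An empty solution set confirms the system is inconsistent, so y is not in U
print(linsolve((A, b), [a1, a2]))   # EmptySet
```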

Let us aga<strong>in</strong> exam<strong>in</strong>e membership <strong>in</strong> a span.<br />

Example SM32 A subspace of M 32<br />

The set of all 3 × 2 matrices forms a vector space when we use the operations of<br />

matrix addition (Def<strong>in</strong>ition MA) and scalar matrix multiplication (Def<strong>in</strong>ition MSM),<br />

as was shown <strong>in</strong> Example VSM. Consider the subset<br />

S =<br />

{[ 3 1<br />

4 2<br />

5 −5<br />

]<br />

,<br />

[ 1 1<br />

2 −1<br />

14 −1<br />

]<br />

,<br />

[ 3<br />

] −1<br />

−1 2 ,<br />

−19 −11<br />

⎤<br />

⎥<br />

⎦<br />

[ 4 2<br />

1 −2<br />

14 −2<br />

]<br />

,<br />

[ 3<br />

]} 1<br />

−4 0<br />

−17 7<br />

and def<strong>in</strong>e a new subset of vectors W <strong>in</strong> M 32 us<strong>in</strong>g the span (Def<strong>in</strong>ition SS), W = 〈S〉.<br />

So by Theorem SSS we know that W is a subspace of M 32 . While W is an <strong>in</strong>f<strong>in</strong>ite set,<br />

and this is a precise description, it would still be worthwhile to <strong>in</strong>vestigate whether<br />

or not W conta<strong>in</strong>s certa<strong>in</strong> elements.<br />

<strong>First</strong>, is<br />

[ 9 3<br />

]<br />

y = 7 3<br />

10 −11<br />

<strong>in</strong> W ? To answer this, we want to determ<strong>in</strong>e if y can be written as a l<strong>in</strong>ear comb<strong>in</strong>ation


§S Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 283<br />

of the five matrices <strong>in</strong> S. Can we f<strong>in</strong>d scalars, α 1 ,α 2 ,α 3 ,α 4 ,α 5 so that<br />

[ ] 9 3<br />

7 3<br />

10 −11<br />

[ 3 1<br />

= α 1 4 2<br />

5 −5<br />

=<br />

] [ ] [ ] [ ]<br />

1 1 3 −1 4 2<br />

+ α 2 2 −1 + α 3 −1 2 + α 4 1 −2<br />

14 −1 −19 −11 14 −2<br />

]<br />

3α 1 + α 2 +3α 3 +4α 4 +3α 5 α 1 + α 2 − α 3 +2α 4 + α 5<br />

4α 1 +2α 2 − α 3 + α 4 − 4α 5 2α 1 − α 2 +2α 3 − 2α 4<br />

5α 1 +14α 2 − 19α 3 +14α 4 − 17α 5 −5α 1 − α 2 − 11α 3 − 2α 4 +7α 5<br />

[<br />

[ ] 3 1<br />

+ α 5 −4 0<br />

−17 7<br />

Us<strong>in</strong>g our def<strong>in</strong>ition of matrix equality (Def<strong>in</strong>ition ME) we can translate this<br />

statement <strong>in</strong>to six equations <strong>in</strong> the five unknowns,<br />

3α 1 + α 2 +3α 3 +4α 4 +3α 5 =9<br />

α 1 + α 2 − α 3 +2α 4 + α 5 =3<br />

4α 1 +2α 2 − α 3 + α 4 − 4α 5 =7<br />

2α 1 − α 2 +2α 3 − 2α 4 =3<br />

5α 1 +14α 2 − 19α 3 +14α 4 − 17α 5 =10<br />

−5α 1 − α 2 − 11α 3 − 2α 4 +7α 5 = −11<br />

This is a l<strong>in</strong>ear system of equations, which we can represent with an augmented<br />

matrix and row-reduce <strong>in</strong> search of solutions. The matrix that is row-equivalent to<br />

the augmented matrix is<br />

⎡ ⎤<br />

5<br />

1 0 0 0<br />

8<br />

2<br />

⎢ −19<br />

⎢⎢⎢⎢⎢⎣ 0 1 0 0<br />

4<br />

−1<br />

−7<br />

0 0 1 0<br />

8<br />

0<br />

17<br />

0 0 0 1<br />

8<br />

1 ⎥<br />

0 0 0 0 0 0<br />

⎦<br />

0 0 0 0 0 0<br />

So we recognize that the system is consistent s<strong>in</strong>ce the f<strong>in</strong>al column is not a pivot<br />

column (Theorem RCLS), and compute n − r =5− 4 = 1 free variables (Theorem<br />

FVCS). While there are <strong>in</strong>f<strong>in</strong>itely many solutions, we are only <strong>in</strong> pursuit of a s<strong>in</strong>gle<br />

solution, so let us choose the free variable α 5 = 0 for simplicity’s sake. Then we<br />

easily see that α 1 =2,α 2 = −1, α 3 =0,α 4 = 1. So the scalars α 1 =2,α 2 = −1,<br />

α 3 =0,α 4 =1,α 5 = 0 will provide a l<strong>in</strong>ear comb<strong>in</strong>ation of the elements of S that<br />

equals y, as we can verify by check<strong>in</strong>g,<br />

[ 9 3<br />

7 3<br />

10 −11<br />

]<br />

=2<br />

[ 3 1<br />

4 2<br />

5 −5<br />

]<br />

+(−1)<br />

[ 1<br />

] 1<br />

[ 4<br />

] 2<br />

2 −1 +(1) 1 −2<br />

14 −1 14 −2<br />

So with one particular l<strong>in</strong>ear comb<strong>in</strong>ation <strong>in</strong> hand, we are conv<strong>in</strong>ced that y deserves


§S Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 284<br />

to be a member of W = 〈S〉.<br />

Second, is<br />

x =<br />

[ ] 2 1<br />

3 1<br />

4 −2<br />

<strong>in</strong> W ? To answer this, we want to determ<strong>in</strong>e if x canbewrittenasal<strong>in</strong>earcomb<strong>in</strong>ation<br />

of the five matrices <strong>in</strong> S. Can we f<strong>in</strong>d scalars, α 1 ,α 2 ,α 3 ,α 4 ,α 5 so that<br />

[ ] 2 1<br />

3 1<br />

4 −2<br />

[ ] [ ] [ ] [ ] [ ]<br />

3 1 1 1 3 −1 4 2 3 1<br />

= α 1 4 2 + α 2 2 −1 + α 3 −1 2 + α 4 1 −2 + α 5 −4 0<br />

5 −5 14 −1 −19 −11 14 −2 −17 7<br />

[<br />

]<br />

3α 1 + α 2 +3α 3 +4α 4 +3α 5 α 1 + α 2 − α 3 +2α 4 + α 5<br />

= 4α 1 +2α 2 − α 3 + α 4 − 4α 5 2α 1 − α 2 +2α 3 − 2α 4<br />

5α 1 +14α 2 − 19α 3 +14α 4 − 17α 5 −5α 1 − α 2 − 11α 3 − 2α 4 +7α 5<br />

Us<strong>in</strong>g our def<strong>in</strong>ition of matrix equality (Def<strong>in</strong>ition ME) we can translate this statement<br />

<strong>in</strong>to six equations <strong>in</strong> the five unknowns,<br />

3α 1 + α 2 +3α 3 +4α 4 +3α 5 =2<br />

α 1 + α 2 − α 3 +2α 4 + α 5 =1<br />

4α 1 +2α 2 − α 3 + α 4 − 4α 5 =3<br />

2α 1 − α 2 +2α 3 − 2α 4 =1<br />

5α 1 +14α 2 − 19α 3 +14α 4 − 17α 5 =4<br />

−5α 1 − α 2 − 11α 3 − 2α 4 +7α 5 = −2<br />

This is a l<strong>in</strong>ear system of equations, which we can represent with an augmented<br />

matrix and row-reduce <strong>in</strong> search of solutions. The matrix that is row-equivalent to<br />

the augmented matrix is<br />

⎡ ⎤<br />

5<br />

1 0 0 0<br />

8<br />

0<br />

⎢ ⎢⎢⎢⎢⎢⎢⎣ 0 1 0 0 − 38<br />

8<br />

0<br />

0 0 1 0 − 7 8<br />

0<br />

0 0 0 1 − 17<br />

8<br />

0<br />

⎥<br />

0 0 0 0 0 1 ⎦<br />

0 0 0 0 0 0<br />

S<strong>in</strong>ce the last column is a pivot column, Theorem RCLS tells us that the system is<br />

<strong>in</strong>consistent. Therefore, there are no values for the scalars that will place x <strong>in</strong> W ,<br />

and so we conclude that x ∉ W .<br />

△<br />
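Both membership questions follow the same recipe: flatten each matrix of S into a column vector, build an augmented matrix, and row-reduce. A minimal sketch, assuming SymPy:

```python
from sympy import Matrix

# Columns are the five matrices of S, flattened row by row into C^6
S_cols = Matrix([[ 3,  1,   3,  4,   3],
                 [ 1,  1,  -1,  2,   1],
                 [ 4,  2,  -1,  1,  -4],
                 [ 2, -1,   2, -2,   0],
                 [ 5, 14, -19, 14, -17],
                 [-5, -1, -11, -2,   7]])

y = Matrix([9, 3, 7, 3, 10, -11])   # flattened y, a member of W
x = Matrix([2, 1, 3, 1, 4, -2])     # flattened x, not a member of W

for b in (y, x):
    rref, pivots = S_cols.row_join(b).rref()
    # b lies in the span exactly when the final column is NOT a pivot column
    print(5 in pivots)   # False for y (member), True for x (non-member)
```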

Notice how Example SSP and Example SM32 contained questions about membership in a span, but these questions quickly became questions about solutions to a system of linear equations. This will be a common theme going forward.

Subsection SC
Subspace Constructions

Several of the subsets of vector spaces that we worked with in Chapter M are also subspaces — they are closed under vector addition and scalar multiplication in Cᵐ.

Theorem CSMS Column Space of a Matrix is a Subspace
Suppose that A is an m × n matrix. Then C(A) is a subspace of Cᵐ.

Proof. Definition CSM shows us that C(A) is a subset of Cᵐ, and that it is defined as the span of a set of vectors from Cᵐ (the columns of the matrix). Since C(A) is a span, Theorem SSS says it is a subspace. ■

That was easy! Notice that we could have used this same approach to prove that the null space is a subspace, since Theorem SSNS provided a description of the null space of a matrix as the span of a set of vectors. However, I much prefer the current proof of Theorem NSMS. Speaking of easy, here is a very easy theorem that exposes another of our constructions as creating subspaces.

Theorem RSMS Row Space of a Matrix is a Subspace
Suppose that A is an m × n matrix. Then R(A) is a subspace of Cⁿ.

Proof. Definition RSM says R(A) = C(Aᵗ), so the row space of a matrix is a column space, and every column space is a subspace by Theorem CSMS. That's enough. ■

One more.

Theorem LNSMS Left Null Space of a Matrix is a Subspace
Suppose that A is an m × n matrix. Then L(A) is a subspace of Cᵐ.

Proof. Definition LNS says L(A) = N(Aᵗ), so the left null space is a null space, and every null space is a subspace by Theorem NSMS. Done. ■

So the span of a set of vectors, and the null space, column space, row space and left null space of a matrix are all subspaces, and hence are all vector spaces, meaning they have all the properties detailed in Definition VS and in the basic theorems presented in Section VS. We have worked with these objects as just sets in Chapter V and Chapter M, but now we understand that they have much more structure. In particular, being closed under vector addition and scalar multiplication means a subspace is also closed under linear combinations.
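All four of these constructions are available directly in a computer algebra system, which makes them easy to experiment with. A minimal sketch, assuming SymPy and reusing the matrix A of Example RSNS:

```python
from sympy import Matrix

A = Matrix([[ 3, 1, -5,  7,  1],
            [ 4, 6,  3, -6, -5],
            [-2, 4,  0,  7,  1]])

print(A.columnspace())    # spanning set for C(A), a subspace of C^3
print(A.T.columnspace())  # spanning set for R(A) = C(A^t), a subspace of C^5
print(A.nullspace())      # spanning set for N(A), a subspace of C^5
print(A.T.nullspace())    # spanning set for L(A) = N(A^t), a subspace of C^3
```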



Read<strong>in</strong>g Questions<br />

1. Summarize the three conditions that allow us to quickly test if a set is a subspace.<br />

2. Consider the set of vectors<br />

⎧ ⎡<br />

⎨<br />

W = ⎣ a ⎤<br />

⎫ ⎬<br />

b⎦<br />

⎩<br />

c ∣ 3a − 2b + c =5 ⎭<br />

Is the set W a subspace of C 3 ? Expla<strong>in</strong> your answer.<br />

3. Name five general constructions of sets of column vectors (subsets of C m )thatwenow<br />

know as subspaces.<br />

Exercises

C15† Working within the vector space C³, determine if \(b = \begin{bmatrix} 4\\ 3\\ 1 \end{bmatrix}\) is in the subspace W,

\[
W = \left\langle \left\{
\begin{bmatrix} 3\\ 2\\ 3 \end{bmatrix},\
\begin{bmatrix} 1\\ 0\\ 3 \end{bmatrix},\
\begin{bmatrix} 1\\ 1\\ 0 \end{bmatrix},\
\begin{bmatrix} 2\\ 1\\ 3 \end{bmatrix}
\right\} \right\rangle
\]

C16† Working within the vector space C⁴, determine if \(b = \begin{bmatrix} 1\\ 1\\ 0\\ 1 \end{bmatrix}\) is in the subspace W,

\[
W = \left\langle \left\{
\begin{bmatrix} 1\\ 2\\ -1\\ 1 \end{bmatrix},\
\begin{bmatrix} 1\\ 0\\ 3\\ 1 \end{bmatrix},\
\begin{bmatrix} 2\\ 1\\ 1\\ 2 \end{bmatrix}
\right\} \right\rangle
\]

C17† Working within the vector space C⁴, determine if \(b = \begin{bmatrix} 2\\ 1\\ 2\\ 1 \end{bmatrix}\) is in the subspace W,

\[
W = \left\langle \left\{
\begin{bmatrix} 1\\ 2\\ 0\\ 2 \end{bmatrix},\
\begin{bmatrix} 1\\ 0\\ 3\\ 1 \end{bmatrix},\
\begin{bmatrix} 0\\ 1\\ 0\\ 2 \end{bmatrix},\
\begin{bmatrix} 1\\ 1\\ 2\\ 0 \end{bmatrix}
\right\} \right\rangle
\]

C20† Working within the vector space P₃ of polynomials of degree 3 or less, determine if p(x) = x³ + 6x + 4 is in the subspace W below.

\[
W = \left\langle \{\, x^3 + x^2 + x,\ x^3 + 2x - 6,\ x^2 - 5 \,\} \right\rangle
\]

C21† Consider the subspace

\[
W = \left\langle \left\{
\begin{bmatrix} 2 & 1\\ 3 & -1 \end{bmatrix},\
\begin{bmatrix} 4 & 0\\ 2 & 3 \end{bmatrix},\
\begin{bmatrix} -3 & 1\\ 2 & 1 \end{bmatrix}
\right\} \right\rangle
\]

of the vector space of 2 × 2 matrices, M₂₂. Is \(C = \begin{bmatrix} -3 & 3\\ 6 & -4 \end{bmatrix}\) an element of W?

C25 Show that the set \(W = \left\{ \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \,\middle|\, 3x_1 - 5x_2 = 12 \right\}\) from Example NSC2Z fails Property AC and Property SC.

C26 Show that the set \(Y = \left\{ \begin{bmatrix} x_1\\ x_2 \end{bmatrix} \,\middle|\, x_1 \in \mathbb{Z},\ x_2 \in \mathbb{Z} \right\}\) from Example NSC2S has Property AC.

M20† In C³, the vector space of column vectors of size 3, prove that the set Z is a subspace.

\[
Z = \left\{ \begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix} \,\middle|\, 4x_1 - x_2 + 5x_3 = 0 \right\}
\]

T20† A square matrix A of size n is upper triangular if [A]ᵢⱼ = 0 whenever i > j. Let UTₙ be the set of all upper triangular matrices of size n. Prove that UTₙ is a subspace of the vector space of all square matrices of size n, Mₙₙ.

T30† Let P be the set of all polynomials, of any degree. The set P is a vector space. Let E be the subset of P consisting of all polynomials with only terms of even degree. Prove or disprove: the set E is a subspace of P.

T31† Let P be the set of all polynomials, of any degree. The set P is a vector space. Let F be the subset of P consisting of all polynomials with only terms of odd degree. Prove or disprove: the set F is a subspace of P.


Section LISS
Linear Independence and Spanning Sets

A vector space is defined as a set with two operations, meeting ten properties (Definition VS). Just as the definition of span of a set of vectors only required knowing how to add vectors and how to multiply vectors by scalars, so it is with linear independence. A definition of a linearly independent set of vectors in an arbitrary vector space only requires knowing how to form linear combinations and equating these with the zero vector. Since every vector space must have a zero vector (Property Z), we always have a zero vector at our disposal.

In this section we will also put a twist on the notion of the span of a set of vectors. Rather than beginning with a set of vectors and creating a subspace that is the span, we will instead begin with a subspace and look for a set of vectors whose span equals the subspace.

The combination of linear independence and spanning will be very important going forward.

Subsection LI
Linear Independence

Our previous definition of linear independence (Definition LICV) employed a relation of linear dependence that was a linear combination on one side of an equality and a zero vector on the other side. As a linear combination in a vector space (Definition LC) depends only on vector addition and scalar multiplication, and every vector space must have a zero vector (Property Z), we can extend our definition of linear independence from the setting of Cᵐ to the setting of a general vector space V with almost no changes. Compare these next two definitions with Definition RLDCV and Definition LICV.

Definition RLD Relation of Linear Dependence
Suppose that V is a vector space. Given a set of vectors S = {u₁, u₂, u₃, ..., uₙ}, an equation of the form

\[
\alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n = 0
\]

is a relation of linear dependence on S. If this equation is formed in a trivial fashion, i.e. αᵢ = 0, 1 ≤ i ≤ n, then we say it is a trivial relation of linear dependence on S. □

Definition LI Linear Independence
Suppose that V is a vector space. The set of vectors S = {u₁, u₂, u₃, ..., uₙ} from V is linearly dependent if there is a relation of linear dependence on S that is not trivial. In the case where the only relation of linear dependence on S is the trivial one, then S is a linearly independent set of vectors. □

Notice the emphasis on the word "only." This might remind you of the definition of a nonsingular matrix, where if the matrix is employed as the coefficient matrix of a homogeneous system then the only solution is the trivial one.

Example LIP4 Linear independence in P₄
In the vector space of polynomials with degree 4 or less, P₄ (Example VSP), consider the set S below.

\[
S = \{\, 2x^4 + 3x^3 + 2x^2 - x + 10,\ -x^4 - 2x^3 + x^2 + 5x - 8,\ 2x^4 + x^3 + 10x^2 + 17x - 2 \,\}
\]

Is this set of vectors linearly independent or dependent? Consider that

\[
3\,(2x^4 + 3x^3 + 2x^2 - x + 10) + 4\,(-x^4 - 2x^3 + x^2 + 5x - 8) + (-1)\,(2x^4 + x^3 + 10x^2 + 17x - 2) = 0x^4 + 0x^3 + 0x^2 + 0x + 0 = 0
\]

This is a nontrivial relation of linear dependence (Definition RLD) on the set S and so convinces us that S is linearly dependent (Definition LI).

Now, I hear you say, "Where did those scalars come from?" Do not worry about that right now; just be sure you understand why the above explanation is sufficient to prove that S is linearly dependent. The remainder of the example will demonstrate how we might find these scalars if they had not been provided so readily.

Let us look at another set of vectors (polynomials) from P₄. Let

\[
T = \{\, 3x^4 - 2x^3 + 4x^2 + 6x - 1,\ -3x^4 + 1x^3 + 0x^2 + 4x + 2,\ 4x^4 + 5x^3 - 2x^2 + 3x + 1,\ 2x^4 - 7x^3 + 4x^2 + 2x + 1 \,\}
\]

Suppose we have a relation of linear dependence on this set,

\[
\begin{aligned}
0 &= 0x^4 + 0x^3 + 0x^2 + 0x + 0\\
&= \alpha_1(3x^4 - 2x^3 + 4x^2 + 6x - 1) + \alpha_2(-3x^4 + 1x^3 + 0x^2 + 4x + 2)\\
&\quad + \alpha_3(4x^4 + 5x^3 - 2x^2 + 3x + 1) + \alpha_4(2x^4 - 7x^3 + 4x^2 + 2x + 1)
\end{aligned}
\]

Using our definitions of vector addition and scalar multiplication in P₄ (Example VSP), we arrive at

\[
\begin{aligned}
0x^4 + 0x^3 + 0x^2 + 0x + 0 ={}& (3\alpha_1 - 3\alpha_2 + 4\alpha_3 + 2\alpha_4)\,x^4 + (-2\alpha_1 + \alpha_2 + 5\alpha_3 - 7\alpha_4)\,x^3\\
&+ (4\alpha_1 - 2\alpha_3 + 4\alpha_4)\,x^2 + (6\alpha_1 + 4\alpha_2 + 3\alpha_3 + 2\alpha_4)\,x\\
&+ (-\alpha_1 + 2\alpha_2 + \alpha_3 + \alpha_4)
\end{aligned}
\]

Equating coefficients, we arrive at the homogeneous system of equations,

\[
\begin{aligned}
3\alpha_1 - 3\alpha_2 + 4\alpha_3 + 2\alpha_4 &= 0\\
-2\alpha_1 + \alpha_2 + 5\alpha_3 - 7\alpha_4 &= 0\\
4\alpha_1 - 2\alpha_3 + 4\alpha_4 &= 0\\
6\alpha_1 + 4\alpha_2 + 3\alpha_3 + 2\alpha_4 &= 0\\
-\alpha_1 + 2\alpha_2 + \alpha_3 + \alpha_4 &= 0
\end{aligned}
\]

We form the coefficient matrix of this homogeneous system of equations and row-reduce to find

\[
\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 0
\end{bmatrix}
\]

We expected the system to be consistent (Theorem HSC) and so can compute n − r = 4 − 4 = 0, and Theorem CSRN tells us that the solution is unique. Since this is a homogeneous system, this unique solution is the trivial solution (Definition TSHSE), α₁ = 0, α₂ = 0, α₃ = 0, α₄ = 0. So by Definition LI the set T is linearly independent.

A few observations. If we had discovered infinitely many solutions, then we could have used one of the nontrivial solutions to provide a linear combination in the manner we used to show that S was linearly dependent. It is important to realize that it is not interesting that we can create a relation of linear dependence with zero scalars — we can always do that — but for T, this is the only way to create a relation of linear dependence. It was no accident that we arrived at a homogeneous system of equations in this example; it is related to our use of the zero vector in defining a relation of linear dependence. It is easy to present a convincing statement that a set is linearly dependent (just exhibit a nontrivial relation of linear dependence), but a convincing statement of linear independence requires demonstrating that there is no relation of linear dependence other than the trivial one. Notice how we relied on theorems from Chapter SLE to provide this demonstration. Whew! There is a lot going on in this example. Spend some time with it; we will be waiting patiently right here when you get back. △
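The coefficient matrix of this homogeneous system tells the whole story: its null space holds every relation of linear dependence on T. A minimal sketch, assuming SymPy:

```python
from sympy import Matrix

# Columns carry the coefficients (x^4 down to the constant term)
# of the four polynomials in T
T_cols = Matrix([[ 3, -3,  4,  2],
                 [-2,  1,  5, -7],
                 [ 4,  0, -2,  4],
                 [ 6,  4,  3,  2],
                 [-1,  2,  1,  1]])

# An empty null space means the only relation of linear dependence
# on T is the trivial one, so T is linearly independent
print(T_cols.nullspace())   # []
```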

Example LIM32 Linear independence in M₃₂
Consider the two sets of vectors R and S from the vector space of all 3 × 2 matrices, M₃₂ (Example VSM),

\[
R = \left\{
\begin{bmatrix} 3 & -1\\ 1 & 4\\ 6 & -6 \end{bmatrix},\
\begin{bmatrix} -2 & 3\\ 1 & -3\\ -2 & -6 \end{bmatrix},\
\begin{bmatrix} 6 & -6\\ -1 & 0\\ 7 & -9 \end{bmatrix},\
\begin{bmatrix} 7 & 9\\ -4 & -5\\ 2 & 5 \end{bmatrix}
\right\}
\qquad
S = \left\{
\begin{bmatrix} 2 & 0\\ 1 & -1\\ 1 & 3 \end{bmatrix},\
\begin{bmatrix} -4 & 0\\ -2 & 2\\ -2 & -6 \end{bmatrix},\
\begin{bmatrix} 1 & 1\\ -2 & 1\\ 2 & 4 \end{bmatrix},\
\begin{bmatrix} -5 & 3\\ -10 & 7\\ 2 & 0 \end{bmatrix}
\right\}
\]

One set is linearly independent, the other is not. Which is which? Let us examine R first. Build a generic relation of linear dependence (Definition RLD),

\[
\alpha_1\begin{bmatrix} 3 & -1\\ 1 & 4\\ 6 & -6 \end{bmatrix}
+ \alpha_2\begin{bmatrix} -2 & 3\\ 1 & -3\\ -2 & -6 \end{bmatrix}
+ \alpha_3\begin{bmatrix} 6 & -6\\ -1 & 0\\ 7 & -9 \end{bmatrix}
+ \alpha_4\begin{bmatrix} 7 & 9\\ -4 & -5\\ 2 & 5 \end{bmatrix}
= 0
\]

Massaging the left-hand side with our definitions of vector addition and scalar multiplication in M₃₂ (Example VSM) we obtain,

\[
\begin{bmatrix}
3\alpha_1 - 2\alpha_2 + 6\alpha_3 + 7\alpha_4 & -\alpha_1 + 3\alpha_2 - 6\alpha_3 + 9\alpha_4\\
\alpha_1 + \alpha_2 - \alpha_3 - 4\alpha_4 & 4\alpha_1 - 3\alpha_2 - 5\alpha_4\\
6\alpha_1 - 2\alpha_2 + 7\alpha_3 + 2\alpha_4 & -6\alpha_1 - 6\alpha_2 - 9\alpha_3 + 5\alpha_4
\end{bmatrix}
= \begin{bmatrix} 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}
\]

Using our definition of matrix equality (Definition ME) to equate entries we get the homogeneous system of six equations in four variables,

\[
\begin{aligned}
3\alpha_1 - 2\alpha_2 + 6\alpha_3 + 7\alpha_4 &= 0\\
-\alpha_1 + 3\alpha_2 - 6\alpha_3 + 9\alpha_4 &= 0\\
\alpha_1 + \alpha_2 - \alpha_3 - 4\alpha_4 &= 0\\
4\alpha_1 - 3\alpha_2 - 5\alpha_4 &= 0\\
6\alpha_1 - 2\alpha_2 + 7\alpha_3 + 2\alpha_4 &= 0\\
-6\alpha_1 - 6\alpha_2 - 9\alpha_3 + 5\alpha_4 &= 0
\end{aligned}
\]

Form the coefficient matrix of this homogeneous system and row-reduce to obtain

\[
\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}
\]

Analyzing this matrix we are led to conclude that α₁ = 0, α₂ = 0, α₃ = 0, α₄ = 0. This means there is only a trivial relation of linear dependence on the vectors of R and so we call R a linearly independent set (Definition LI).

So it must be that S is linearly dependent. Let us see if we can find a nontrivial relation of linear dependence on S. We will begin as with R, by constructing a relation of linear dependence (Definition RLD) with unknown scalars,

\[
\alpha_1\begin{bmatrix} 2 & 0\\ 1 & -1\\ 1 & 3 \end{bmatrix}
+ \alpha_2\begin{bmatrix} -4 & 0\\ -2 & 2\\ -2 & -6 \end{bmatrix}
+ \alpha_3\begin{bmatrix} 1 & 1\\ -2 & 1\\ 2 & 4 \end{bmatrix}
+ \alpha_4\begin{bmatrix} -5 & 3\\ -10 & 7\\ 2 & 0 \end{bmatrix}
= 0
\]

Massaging the left-hand side with our definitions of vector addition and scalar multiplication in M₃₂ (Example VSM) we obtain,

\[
\begin{bmatrix}
2\alpha_1 - 4\alpha_2 + \alpha_3 - 5\alpha_4 & \alpha_3 + 3\alpha_4\\
\alpha_1 - 2\alpha_2 - 2\alpha_3 - 10\alpha_4 & -\alpha_1 + 2\alpha_2 + \alpha_3 + 7\alpha_4\\
\alpha_1 - 2\alpha_2 + 2\alpha_3 + 2\alpha_4 & 3\alpha_1 - 6\alpha_2 + 4\alpha_3
\end{bmatrix}
= \begin{bmatrix} 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}
\]

Using our definition of matrix equality (Definition ME) to equate entries we get the homogeneous system of six equations in four variables,

\[
\begin{aligned}
2\alpha_1 - 4\alpha_2 + \alpha_3 - 5\alpha_4 &= 0\\
\alpha_3 + 3\alpha_4 &= 0\\
\alpha_1 - 2\alpha_2 - 2\alpha_3 - 10\alpha_4 &= 0\\
-\alpha_1 + 2\alpha_2 + \alpha_3 + 7\alpha_4 &= 0\\
\alpha_1 - 2\alpha_2 + 2\alpha_3 + 2\alpha_4 &= 0\\
3\alpha_1 - 6\alpha_2 + 4\alpha_3 &= 0
\end{aligned}
\]

Form the coefficient matrix of this homogeneous system and row-reduce to obtain

\[
\begin{bmatrix}
1 & -2 & 0 & -4\\
0 & 0 & 1 & 3\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}
\]

Analyzing this we see that the system is consistent (we expected this since the system is homogeneous, Theorem HSC) and has n − r = 4 − 2 = 2 free variables, namely α₂ and α₄. This means there are infinitely many solutions, and in particular, we can find a nontrivial solution, so long as we do not pick all of our free variables to be zero. The mere presence of a nontrivial solution for these scalars is enough to conclude that S is a linearly dependent set (Definition LI). But let us go ahead and explicitly construct a nontrivial relation of linear dependence.

Choose α₂ = 1 and α₄ = −1. There is nothing special about this choice; there are infinitely many possibilities, some "easier" than this one, just avoid picking both variables to be zero. (Why not?) Then we find the dependent variables to have values α₁ = −2 and α₃ = 3. So the relation of linear dependence,

\[
(-2)\begin{bmatrix} 2 & 0\\ 1 & -1\\ 1 & 3 \end{bmatrix}
+ (1)\begin{bmatrix} -4 & 0\\ -2 & 2\\ -2 & -6 \end{bmatrix}
+ (3)\begin{bmatrix} 1 & 1\\ -2 & 1\\ 2 & 4 \end{bmatrix}
+ (-1)\begin{bmatrix} -5 & 3\\ -10 & 7\\ 2 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}
\]

is an iron-clad demonstration that S is linearly dependent. Can you construct another such demonstration? △
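Verifying a claimed relation of linear dependence is a one-line computation. A minimal sketch, assuming NumPy:

```python
import numpy as np

S = [np.array([[ 2, 0], [  1, -1], [ 1,  3]]),
     np.array([[-4, 0], [ -2,  2], [-2, -6]]),
     np.array([[ 1, 1], [ -2,  1], [ 2,  4]]),
     np.array([[-5, 3], [-10,  7], [ 2,  0]])]

scalars = [-2, 1, 3, -1]   # the nontrivial relation found above

# The sum of the scalar multiples should be the 3 x 2 zero matrix
print(sum(c * M for c, M in zip(scalars, S)))
```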

Example LIC Linearly independent set in the crazy vector space
Is the set R = {(1, 0), (6, 3)} linearly independent in the crazy vector space C (Example CVS)?

We begin with an arbitrary relation of linear dependence on R,

\[
0 = a_1(1, 0) + a_2(6, 3) \qquad \text{Definition RLD}
\]

and then massage it to a point where we can apply the definition of equality in C. Recall that the definitions of vector addition and scalar multiplication in C are not what you would expect.

\[
\begin{aligned}
(-1, -1) &= 0 && \text{Example CVS}\\
&= a_1(1, 0) + a_2(6, 3) && \text{Definition RLD}\\
&= (1a_1 + a_1 - 1,\ 0a_1 + a_1 - 1) + (6a_2 + a_2 - 1,\ 3a_2 + a_2 - 1) && \text{Example CVS}\\
&= (2a_1 - 1,\ a_1 - 1) + (7a_2 - 1,\ 4a_2 - 1)\\
&= (2a_1 - 1 + 7a_2 - 1 + 1,\ a_1 - 1 + 4a_2 - 1 + 1) && \text{Example CVS}\\
&= (2a_1 + 7a_2 - 1,\ a_1 + 4a_2 - 1)
\end{aligned}
\]

Equality in C (Example CVS) then yields the two equations,

\[
\begin{aligned}
2a_1 + 7a_2 - 1 &= -1\\
a_1 + 4a_2 - 1 &= -1
\end{aligned}
\]

which becomes the homogeneous system

\[
\begin{aligned}
2a_1 + 7a_2 &= 0\\
a_1 + 4a_2 &= 0
\end{aligned}
\]

Since the coefficient matrix of this system is nonsingular (check this!), the system has only the trivial solution a₁ = a₂ = 0. By Definition LI the set R is linearly independent. Notice that even though the zero vector of C is not what we might have first suspected, a question about linear independence still concludes with a question about a homogeneous system of equations. Hmmm. △
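Because the operations in C are so easy to mistype, a tiny computational model is a useful safety net. A minimal sketch in Python, with the operations transcribed from Example CVS (the function names are ours, not the text's):

```python
def cvs_add(u, v):
    # Vector addition in the crazy vector space C (Example CVS)
    return (u[0] + v[0] + 1, u[1] + v[1] + 1)

def cvs_smult(a, u):
    # Scalar multiplication in the crazy vector space C (Example CVS)
    return (a * u[0] + a - 1, a * u[1] + a - 1)

zero = (-1, -1)   # the zero vector of C

# The trivial relation of linear dependence on R = {(1,0), (6,3)}
# must land on the zero vector -- and it does
lhs = cvs_add(cvs_smult(0, (1, 0)), cvs_smult(0, (6, 3)))
print(lhs == zero)   # True
```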

Subsection SS
Spanning Sets

In a vector space V, suppose we are given a set of vectors S ⊆ V. Then we can immediately construct a subspace, ⟨S⟩, using Definition SS, and then be assured by Theorem SSS that the construction does provide a subspace. We now turn the situation upside-down. Suppose we are first given a subspace W ⊆ V. Can we find a set S so that ⟨S⟩ = W? Typically W is infinite and we are searching for a finite set of vectors S that we can combine in linear combinations and "build" all of W.

I like to think of S as the raw materials that are sufficient for the construction of W. If you have nails, lumber, wire, copper pipe, drywall, plywood, carpet, shingles, paint (and a few other things), then you can combine them in many different ways to create a house (or infinitely many different houses for that matter). A fast-food restaurant may have beef, chicken, beans, cheese, tortillas, taco shells and hot sauce, and from this small list of ingredients build a wide variety of items for sale. Or maybe a better analogy comes from Ben Cordes — the additive primary colors (red, green and blue) can be combined to create many different colors by varying the intensity of each. The intensity is like a scalar multiple, and the combination of the three intensities is like vector addition. The three individual colors, red, green and blue, are the elements of the spanning set.

Because we will use terms like "spanned by" and "spanning set," there is the potential for confusion with "the span." Come back and reread the first paragraph of this subsection whenever you are uncertain about the difference. Here is the working definition.

Definition SSVS Spanning Set of a Vector Space
Suppose V is a vector space. A subset S of V is a spanning set of V if ⟨S⟩ = V. In this case, we also frequently say S spans V. □

The definition of a spanning set requires that two sets (subspaces actually) be equal. If S is a subset of V, then ⟨S⟩ ⊆ V, always. Thus it is usually only necessary to prove that V ⊆ ⟨S⟩. Now would be a good time to review Definition SE.

Example SSP4 Spanning set in P₄
In Example SP4 we showed that

\[
W = \{\, p(x) \mid p \in P_4,\ p(2) = 0 \,\}
\]

is a subspace of P₄, the vector space of polynomials with degree at most 4 (Example VSP). In this example, we will show that the set

\[
S = \{\, x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16 \,\}
\]

is a spanning set for W. To do this, we require that W = ⟨S⟩. This is an equality of sets. We can check that every polynomial in S has x = 2 as a root and therefore S ⊆ W. Since W is closed under addition and scalar multiplication, ⟨S⟩ ⊆ W also.

So it remains to show that W ⊆ ⟨S⟩ (Definition SE). To do this, begin by choosing an arbitrary polynomial in W, say r(x) = ax⁴ + bx³ + cx² + dx + e ∈ W. This polynomial is not as arbitrary as it would appear, since we also know it must have x = 2 as a root. This translates to

\[
0 = a(2)^4 + b(2)^3 + c(2)^2 + d(2) + e = 16a + 8b + 4c + 2d + e
\]

as a condition on r.

We wish to show that r is a polynomial in ⟨S⟩, that is, we want to show that r can be written as a linear combination of the vectors (polynomials) in S. So let us try.

\[
\begin{aligned}
r(x) &= ax^4 + bx^3 + cx^2 + dx + e\\
&= \alpha_1(x - 2) + \alpha_2(x^2 - 4x + 4) + \alpha_3(x^3 - 6x^2 + 12x - 8) + \alpha_4(x^4 - 8x^3 + 24x^2 - 32x + 16)\\
&= \alpha_4 x^4 + (\alpha_3 - 8\alpha_4)\,x^3 + (\alpha_2 - 6\alpha_3 + 24\alpha_4)\,x^2\\
&\quad + (\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4)\,x + (-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4)
\end{aligned}
\]

Equating coefficients (vector equality in P₄) gives the system of five equations in four variables,

\[
\begin{aligned}
\alpha_4 &= a\\
\alpha_3 - 8\alpha_4 &= b\\
\alpha_2 - 6\alpha_3 + 24\alpha_4 &= c\\
\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4 &= d\\
-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4 &= e
\end{aligned}
\]

Any solution to this system of equations will provide the linear combination we need to determine if r ∈ ⟨S⟩, but we need to be convinced there is a solution for any values of a, b, c, d, e that qualify r to be a member of W. So the question is: is this system of equations consistent? We will form the augmented matrix, and row-reduce. (We probably need to do this by hand, since the matrix is symbolic — reversing the order of the first four rows is the best way to start.) We obtain a matrix in reduced row-echelon form

\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 32a + 12b + 4c + d\\
0 & 1 & 0 & 0 & 24a + 6b + c\\
0 & 0 & 1 & 0 & 8a + b\\
0 & 0 & 0 & 1 & a\\
0 & 0 & 0 & 0 & 16a + 8b + 4c + 2d + e
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 32a + 12b + 4c + d\\
0 & 1 & 0 & 0 & 24a + 6b + c\\
0 & 0 & 1 & 0 & 8a + b\\
0 & 0 & 0 & 1 & a\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]

For your results to match our first matrix, you may find it necessary to multiply the final row of your row-reduced matrix by the appropriate scalar, and/or add multiples of this row to some of the other rows. To obtain the second version of the matrix, the last entry of the last column has been simplified to zero according to the one condition we were able to impose on an arbitrary polynomial from W. Since the last column is not a pivot column, Theorem RCLS tells us this system is consistent. Therefore, any polynomial from W can be written as a linear combination of the polynomials in S, so W ⊆ ⟨S⟩. Therefore, W = ⟨S⟩ and S is a spanning set for W by Definition SSVS.

Notice that an alternative to row-reducing the augmented matrix by hand would be to appeal to Theorem FS by expressing the column space of the coefficient matrix as a null space, and then verifying that the condition on r guarantees that r is in the column space, thus implying that the system is always consistent. Give it a try; we will wait. This has been a complicated example, but worth studying carefully. △
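The symbolic row reduction above can be cross-checked by machine, though some care is needed with symbolic entries. A minimal sketch, assuming SymPy, with the membership condition used to eliminate e first:

```python
from sympy import Matrix, symbols

a, b, c, d = symbols('a b c d')
e = -16*a - 8*b - 4*c - 2*d    # the condition p(2) = 0, solved for e

# Augmented matrix for expressing r(x) = ax^4 + bx^3 + cx^2 + dx + e
# as a linear combination of the four polynomials in S
M = Matrix([[ 0,  0,  0,   1, a],
            [ 0,  0,  1,  -8, b],
            [ 0,  1, -6,  24, c],
            [ 1, -4, 12, -32, d],
            [-2,  4, -8,  16, e]])

rref, pivots = M.rref()
print(pivots)   # (0, 1, 2, 3): the final column is not a pivot column
```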

Given a subspace and a set of vectors, as in Example SSP4, it can take some work to determine that the set actually is a spanning set. An even harder problem is to be confronted with a subspace and required to construct a spanning set with no guidance. We will now work an example of this flavor, but some of the steps will be unmotivated. Fortunately, we will have some better tools for this type of problem later on.

Example SSM22 Spann<strong>in</strong>g set <strong>in</strong> M 22<br />

In the space of all 2 × 2 matrices, M 22 consider the subspace<br />

{[ ]∣<br />

a b ∣∣∣<br />

Z = a +3b − c − 5d =0, −2a − 6b +3c +14d =0}<br />

c d<br />

and f<strong>in</strong>d a spann<strong>in</strong>g set for Z.<br />

We need to construct a limited number of matrices <strong>in</strong> Z so that every matrix<br />

<strong>in</strong> Z can be expressed [ as]<br />

a l<strong>in</strong>ear comb<strong>in</strong>ation of this limited number of matrices.<br />

a b<br />

Suppose that B = is a matrix <strong>in</strong> Z. Then we can form a column vector with<br />

c d<br />

the entries of B and write ⎡ ⎤<br />

a<br />

⎢ ⎣<br />

b⎥<br />

c<br />

⎦ ∈N<br />

d<br />

([<br />

1 3 −1<br />

])<br />

−5<br />

−2 −6 3 14<br />

Row-reducing this matrix and applying Theorem REMES we obtain the equivalent statement,

\[
\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} \in \mathcal{N}\!\left( \begin{bmatrix} 1 & 3 & 0 & -1 \\ 0 & 0 & 1 & 4 \end{bmatrix} \right)
\]

We can then express the subspace Z in the following equal forms,

\[
\begin{aligned}
Z &= \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a+3b-c-5d = 0,\ -2a-6b+3c+14d = 0 \right\} \\
  &= \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a+3b-d = 0,\ c+4d = 0 \right\} \\
  &= \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a = -3b+d,\ c = -4d \right\}
\end{aligned}
\]
c d



Continuing,

\[
\begin{aligned}
Z &= \left\{ \begin{bmatrix} -3b+d & b \\ -4d & d \end{bmatrix} \,\middle|\, b, d \in \mathbb{C} \right\} \\
  &= \left\{ \begin{bmatrix} -3b & b \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} d & 0 \\ -4d & d \end{bmatrix} \,\middle|\, b, d \in \mathbb{C} \right\} \\
  &= \left\{ b\begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix} + d\begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \,\middle|\, b, d \in \mathbb{C} \right\} \\
  &= \left\langle \left\{ \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \right\} \right\rangle
\end{aligned}
\]

So the set

\[
Q = \left\{ \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \right\}
\]

spans Z by Definition SSVS. △
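If you would like a machine check of this computation, here is a minimal sketch in Python using SymPy (our choice of tool, not the text's): it solves the two defining conditions of Z and confirms that the general solution is exactly b times the first matrix of Q plus d times the second.

```python
# A minimal sketch (not from the text) double-checking Example SSM22.
import sympy as sp

a, b, c, d = sp.symbols('a b c d')

# The two defining conditions for membership in Z.
conditions = [a + 3*b - c - 5*d, -2*a - 6*b + 3*c + 14*d]

# Solving the conditions reproduces a = -3b + d and c = -4d.
sol = sp.solve(conditions, [a, c])
print(sol)   # {a: -3*b + d, c: -4*d}

# Substitute back: every matrix in Z is b*Q1 + d*Q2.
Q1 = sp.Matrix([[-3, 1], [0, 0]])
Q2 = sp.Matrix([[1, 0], [-4, 1]])
general = sp.Matrix([[sol[a], b], [sol[c], d]])
print(sp.simplify(general - (b*Q1 + d*Q2)))  # the zero matrix
```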

Example SSC Spanning set in the crazy vector space
In Example LIC we determined that the set R = {(1, 0), (6, 3)} is linearly independent in the crazy vector space C (Example CVS). We now show that R is a spanning set for C.

Given an arbitrary vector (x, y) ∈ C we desire to show that it can be written as a linear combination of the elements of R. In other words, are there scalars a_1 and a_2 so that

\[
(x, y) = a_1(1, 0) + a_2(6, 3)
\]

We will act as if this equation is true and try to determine just what a_1 and a_2 would be (as functions of x and y). Recall that our vector space operations are unconventional and are defined in Example CVS.

\[
\begin{aligned}
(x, y) &= a_1(1, 0) + a_2(6, 3) \\
&= (1a_1 + a_1 - 1,\ 0a_1 + a_1 - 1) + (6a_2 + a_2 - 1,\ 3a_2 + a_2 - 1) \\
&= (2a_1 - 1,\ a_1 - 1) + (7a_2 - 1,\ 4a_2 - 1) \\
&= (2a_1 - 1 + 7a_2 - 1 + 1,\ a_1 - 1 + 4a_2 - 1 + 1) \\
&= (2a_1 + 7a_2 - 1,\ a_1 + 4a_2 - 1)
\end{aligned}
\]

Equality in C then yields the two equations,

\[
\begin{aligned}
2a_1 + 7a_2 - 1 &= x \\
a_1 + 4a_2 - 1 &= y
\end{aligned}
\]

which becomes the linear system with a matrix representation

\[
\begin{bmatrix} 2 & 7 \\ 1 & 4 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} x+1 \\ y+1 \end{bmatrix}
\]

The coefficient matrix of this system is nonsingular, hence invertible (Theorem NI), and we can employ its inverse to find a solution (Theorem TTMI, Theorem SNCM),

\[
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= \begin{bmatrix} 2 & 7 \\ 1 & 4 \end{bmatrix}^{-1} \begin{bmatrix} x+1 \\ y+1 \end{bmatrix}
= \begin{bmatrix} 4 & -7 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x+1 \\ y+1 \end{bmatrix}
= \begin{bmatrix} 4x - 7y - 3 \\ -x + 2y + 1 \end{bmatrix}
\]

We could chase through the above implications backwards and take the existence of these solutions as sufficient evidence for R being a spanning set for C. Instead, let us view the above as simply scratchwork and now get serious with a simple direct proof that R is a spanning set. Ready? Suppose (x, y) is any vector from C, then compute the following linear combination using the definitions of the operations in C,

\[
\begin{aligned}
&(4x - 7y - 3)(1, 0) + (-x + 2y + 1)(6, 3) \\
&\quad = (1(4x - 7y - 3) + (4x - 7y - 3) - 1,\ 0(4x - 7y - 3) + (4x - 7y - 3) - 1) \\
&\qquad + (6(-x + 2y + 1) + (-x + 2y + 1) - 1,\ 3(-x + 2y + 1) + (-x + 2y + 1) - 1) \\
&\quad = (8x - 14y - 7,\ 4x - 7y - 4) + (-7x + 14y + 6,\ -4x + 8y + 3) \\
&\quad = ((8x - 14y - 7) + (-7x + 14y + 6) + 1,\ (4x - 7y - 4) + (-4x + 8y + 3) + 1) \\
&\quad = (x, y)
\end{aligned}
\]

This f<strong>in</strong>al sequence of computations <strong>in</strong> C is sufficient to demonstrate that any<br />

element of C can be written (or expressed) as a l<strong>in</strong>ear comb<strong>in</strong>ation of the two vectors<br />

<strong>in</strong> R, soC ⊆〈R〉. S<strong>in</strong>ce the reverse <strong>in</strong>clusion 〈R〉 ⊆C is trivially true, C = 〈R〉 and<br />

we say R spans C (Def<strong>in</strong>ition SSVS). Notice that this demonstration is no more or<br />

less valid if we hide from the reader our scratchwork that suggested a 1 =4x − 7y − 3<br />

and a 2 = −x +2y +1.<br />

△<br />
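For readers who like to experiment, here is a minimal sketch in Python (an illustration, not part of the text) that implements the unconventional operations of C from Example CVS and spot-checks that the scalars a_1 = 4x − 7y − 3 and a_2 = −x + 2y + 1 recombine to give back (x, y).

```python
# A minimal sketch (not from the text): the operations of the crazy
# vector space C, with a spot-check of the span computation above.
def cvs_add(u, v):
    # "Addition" in C: (x1, y1) + (x2, y2) = (x1 + x2 + 1, y1 + y2 + 1)
    return (u[0] + v[0] + 1, u[1] + v[1] + 1)

def cvs_smult(alpha, v):
    # "Scalar multiplication" in C: alpha(x, y) = (alpha*x + alpha - 1, alpha*y + alpha - 1)
    return (alpha * v[0] + alpha - 1, alpha * v[1] + alpha - 1)

# Pick an arbitrary (x, y) and use a1 = 4x - 7y - 3, a2 = -x + 2y + 1.
x, y = 9.0, -4.0
a1, a2 = 4*x - 7*y - 3, -x + 2*y + 1
result = cvs_add(cvs_smult(a1, (1, 0)), cvs_smult(a2, (6, 3)))
print(result)   # (9.0, -4.0), recovering (x, y) exactly
```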

Subsection VR
Vector Representation

In Chapter R we will take up the matter of representations fully, where Theorem VRRB will be critical for Definition VR. We will now motivate and prove a critical theorem that tells us how to "represent" a vector. This theorem could wait, but working with it now will provide some extra insight into the nature of linearly independent spanning sets. First an example, then the theorem.

Example AVR A vector representation
Consider the set

\[
S = \left\{ \begin{bmatrix} -7 \\ 5 \\ 1 \end{bmatrix},\ \begin{bmatrix} -6 \\ 5 \\ 0 \end{bmatrix},\ \begin{bmatrix} -12 \\ 7 \\ 4 \end{bmatrix} \right\}
\]

from the vector space C^3. Let A be the matrix whose columns are the set S, and verify that A is nonsingular. By Theorem NMLIC the elements of S form a linearly independent set. Suppose that b ∈ C^3. Then LS(A, b) has a (unique) solution (Theorem NMUS) and hence is consistent. By Theorem SLSLC, b ∈ ⟨S⟩. Since b is arbitrary, this is enough to show that ⟨S⟩ = C^3, and therefore S is a spanning set for C^3 (Definition SSVS). (This set comes from the columns of the coefficient matrix of Archetype B.)

Now examine the situation for a particular choice of b, say

\[
b = \begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}.
\]

Because S is a spanning set for C^3, we know we can write b as a linear combination of the vectors in S,

\[
\begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}
= (-3)\begin{bmatrix} -7 \\ 5 \\ 1 \end{bmatrix}
+ (5)\begin{bmatrix} -6 \\ 5 \\ 0 \end{bmatrix}
+ (2)\begin{bmatrix} -12 \\ 7 \\ 4 \end{bmatrix}.
\]

The nonsingularity of the matrix A tells us that the scalars in this linear combination are unique. More precisely, it is the linear independence of S that provides the uniqueness. We will refer to the scalars a_1 = −3, a_2 = 5, a_3 = 2 as a "representation of b relative to S." In other words, once we settle on S as a linearly independent set that spans C^3, the vector b is recoverable just by knowing the scalars a_1 = −3, a_2 = 5, a_3 = 2 (use these scalars in a linear combination of the vectors in S). This is all an illustration of the following important theorem, which we prove in the setting of a general vector space. △
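As a quick computational aside (a minimal sketch assuming SymPy is available, not the text's own verification), the representation of b relative to S is just the solution of a nonsingular linear system whose coefficient matrix has the vectors of S as columns.

```python
# A minimal sketch (not from the text) reproducing Example AVR.
import sympy as sp

# Columns of A are the three vectors of S.
A = sp.Matrix([[-7, -6, -12],
               [ 5,  5,   7],
               [ 1,  0,   4]])
b = sp.Matrix([-33, 24, 5])

print(A.det() != 0)   # True: A is nonsingular, so the scalars are unique
print(A.solve(b))     # Matrix([[-3], [5], [2]]), the representation of b
```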

Theorem VRRB Vector Representation Relative to a Basis
Suppose that V is a vector space and B = {v_1, v_2, v_3, ..., v_m} is a linearly independent set that spans V. Let w be any vector in V. Then there exist unique scalars a_1, a_2, a_3, ..., a_m such that

\[
w = a_1v_1 + a_2v_2 + a_3v_3 + \cdots + a_mv_m.
\]

Proof. That w can be written as a linear combination of the vectors in B follows from the spanning property of the set (Definition SSVS). This is good, but not the meat of this theorem. We now know that for any choice of the vector w there exist some scalars that will create w as a linear combination of the basis vectors. The real question is: Is there more than one way to write w as a linear combination of {v_1, v_2, v_3, ..., v_m}? Are the scalars a_1, a_2, a_3, ..., a_m unique? (Proof Technique U)

Assume there are two different linear combinations of {v_1, v_2, v_3, ..., v_m} that equal the vector w. In other words there exist scalars a_1, a_2, a_3, ..., a_m and b_1, b_2, b_3, ..., b_m so that

\[
\begin{aligned}
w &= a_1v_1 + a_2v_2 + a_3v_3 + \cdots + a_mv_m \\
w &= b_1v_1 + b_2v_2 + b_3v_3 + \cdots + b_mv_m.
\end{aligned}
\]

Then notice that

\[
\begin{aligned}
0 &= w + (-w) && \text{Property AI} \\
  &= w + (-1)w && \text{Theorem AISM} \\
  &= (a_1v_1 + a_2v_2 + a_3v_3 + \cdots + a_mv_m) + (-1)(b_1v_1 + b_2v_2 + b_3v_3 + \cdots + b_mv_m) \\
  &= (a_1v_1 + a_2v_2 + a_3v_3 + \cdots + a_mv_m) + (-b_1v_1 - b_2v_2 - b_3v_3 - \cdots - b_mv_m) && \text{Property DVA} \\
  &= (a_1 - b_1)v_1 + (a_2 - b_2)v_2 + (a_3 - b_3)v_3 + \cdots + (a_m - b_m)v_m && \text{Property C, Property DSA}
\end{aligned}
\]

But this is a relation of linear dependence on a linearly independent set of vectors (Definition RLD)! Now we are using the other assumption about B, that {v_1, v_2, v_3, ..., v_m} is a linearly independent set. So by Definition LI it must happen that the scalars are all zero. That is,

\[
\begin{gathered}
(a_1 - b_1) = 0 \quad (a_2 - b_2) = 0 \quad (a_3 - b_3) = 0 \quad \cdots \quad (a_m - b_m) = 0 \\
a_1 = b_1 \quad a_2 = b_2 \quad a_3 = b_3 \quad \cdots \quad a_m = b_m.
\end{gathered}
\]

And so we find that the scalars are unique. ∎

The converse of Theorem VRRB is true as well, but is not important enough to rise beyond an exercise (see Exercise LISS.T51).

This is a very typical use of the hypothesis that a set is linearly independent — obtain a relation of linear dependence and then conclude that the scalars must all be zero. The result of this theorem tells us that we can write any vector in a vector space as a linear combination of the vectors in a linearly independent spanning set, but only just. There is only enough raw material in the spanning set to write each vector one way as a linear combination. So in this sense, we could call a linearly independent spanning set a "minimal spanning set." These sets are so important that we will give them a simpler name ("basis") and explore their properties further in the next section.

Reading Questions

1. Is the set of matrices below linearly independent or linearly dependent in the vector space M_{22}? Why or why not?

\[
\left\{ \begin{bmatrix} 1 & 3 \\ -2 & 4 \end{bmatrix},\ \begin{bmatrix} -2 & 3 \\ 3 & -5 \end{bmatrix},\ \begin{bmatrix} 0 & 9 \\ -1 & 3 \end{bmatrix} \right\}
\]

2. Explain the difference between the following two uses of the term "span":

1. S is a subset of the vector space V and the span of S is a subspace of V.

2. W is a subspace of the vector space Y and T spans W.

3. The set

\[
S = \left\{ \begin{bmatrix} 6 \\ 2 \\ 1 \end{bmatrix},\ \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix},\ \begin{bmatrix} 5 \\ 8 \\ 2 \end{bmatrix} \right\}
\]

is linearly independent and spans C^3. Write the vector

\[
x = \begin{bmatrix} -6 \\ 2 \\ 2 \end{bmatrix}
\]

as a linear combination of the elements of S. How many ways are there to answer this question, and which theorem allows you to say so?

Exercises

C20† In the vector space of 2 × 2 matrices, M_{22}, determine if the set S below is linearly independent.

\[
S = \left\{ \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix},\ \begin{bmatrix} 0 & 4 \\ -1 & 2 \end{bmatrix},\ \begin{bmatrix} 4 & 2 \\ 1 & 3 \end{bmatrix} \right\}
\]

C21† In the crazy vector space C (Example CVS), is the set S = {(0, 2), (2, 8)} linearly independent?

C22† In the vector space of polynomials P_3, determine if the set S is linearly independent or linearly dependent.

\[
S = \left\{ 2 + x - 3x^2 - 8x^3,\ 1 + x + x^2 + 5x^3,\ 3 - 4x^2 - 7x^3 \right\}
\]

C23† Determine if the set S = {(3, 1), (7, 3)} is linearly independent in the crazy vector space C (Example CVS).

C24† In the vector space of real-valued functions F = { f | f : R → R }, determine if the following set S is linearly independent.

\[
S = \left\{ \sin^2 x,\ \cos^2 x,\ 2 \right\}
\]

C25† Let

\[
S = \left\{ \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix},\ \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix} \right\}
\]

1. Determine if S spans M_{22}.

2. Determine if S is linearly independent.

C26† Let

\[
S = \left\{ \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix},\ \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & 4 \\ 0 & 3 \end{bmatrix} \right\}
\]

1. Determine if S spans M_{22}.

2. Determine if S is linearly independent.

C30 In Example LIM32, find another nontrivial relation of linear dependence on the linearly dependent set of 3 × 2 matrices, S.

C40† Determine if the set T = { x^2 − x + 5, 4x^3 − x^2 + 5x, 3x + 2 } spans the vector space of polynomials with degree 4 or less, P_4.

C41† The set W is a subspace of M_{22}, the vector space of all 2 × 2 matrices. Prove that S is a spanning set for W.

\[
W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, 2a - 3b + 4c - d = 0 \right\}
\qquad
S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & -3 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 4 \end{bmatrix} \right\}
\]

C42† Determine if the set S = {(3, 1), (7, 3)} spans the crazy vector space C (Example CVS).

M10† Halfway through Example SSP4, we need to show that the system of equations

\[
\mathcal{LS}\!\left( \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & -8 \\ 0 & 1 & -6 & 24 \\ 1 & -4 & 12 & -32 \\ -2 & 4 & -8 & 16 \end{bmatrix},\ \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix} \right)
\]

is consistent for every choice of the vector of constants satisfying 16a + 8b + 4c + 2d + e = 0. Express the column space of the coefficient matrix of this system as a null space, using Theorem FS. From this use Theorem CSCS to establish that the system is always consistent. Notice that this approach removes from Example SSP4 the need to row-reduce a symbolic matrix.

T20† Suppose that S is a finite linearly independent set of vectors from the vector space V. Let T be any subset of S. Prove that T is linearly independent.

T40 Prove the following variant of Theorem EMMVP that has a weaker hypothesis: Suppose that C = {u_1, u_2, u_3, ..., u_p} is a linearly independent spanning set for C^n. Suppose also that A and B are m × n matrices such that Au_i = Bu_i for every 1 ≤ i ≤ p. Then A = B.

Can you weaken the hypothesis even further while still preserving the conclusion?

T50† Suppose that V is a vector space and u, v ∈ V are two vectors in V. Use the definition of linear independence to prove that S = {u, v} is a linearly dependent set if and only if one of the two vectors is a scalar multiple of the other. Prove this directly in the context of an abstract vector space (V), without simply giving an upgraded version of Theorem DLDS for the special case of just two vectors.

T51† Carefully formulate the converse of Theorem VRRB and provide a proof.


Section B
Bases

A basis of a vector space is one of the most useful concepts in linear algebra. It often provides a concise, finite description of an infinite vector space.

Subsection B
Bases

We now have all the tools in place to define a basis of a vector space.

Definition B Basis
Suppose V is a vector space. Then a subset S ⊆ V is a basis of V if it is linearly independent and spans V. □

So, a basis is a linearly independent spanning set for a vector space. The requirement that the set spans V ensures that S has enough raw material to build V, while the linear independence requirement ensures that we do not have any more raw material than we need. As we shall see soon in Section D, a basis is a minimal spanning set.

You may have noticed that we used the term basis for some of the titles of previous theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each of these theorems you will see that their conclusions provide linearly independent spanning sets for sets that we now recognize as subspaces of C^m. Examples associated with these theorems include Example NSLIL, Example CSOCD and Example IAS. As we will see, these three theorems will continue to be powerful tools, even in the setting of more general vector spaces.

Furthermore, the archetypes contain an abundance of bases. For each coefficient matrix of a system of equations, and for each archetype defined simply as a matrix, there is a basis for the null space, three bases for the column space, and a basis for the row space. For this reason, our subsequent examples will concentrate on bases for vector spaces other than C^m.

Notice that Definition B does not preclude a vector space from having many bases, and this is the case, as hinted above by the statement that the archetypes contain three bases for the column space of a matrix. More generally, we can grab any basis for a vector space, multiply any one basis vector by a nonzero scalar and create a slightly different set that is still a basis. For "important" vector spaces, it will be convenient to have a collection of "nice" bases. When a vector space has a single particularly nice basis, it is sometimes called the standard basis, though there is nothing precise enough about this term to allow us to define it formally — it is a question of style. Here are some nice bases for important vector spaces.

Theorem SUVB Standard Unit Vectors are a Basis
The set of standard unit vectors for C^m (Definition SUV), B = { e_i | 1 ≤ i ≤ m }, is a basis for the vector space C^m.

Proof. We must show that the set B is both linearly independent and a spanning set for C^m. First, the vectors in B are, by Definition SUV, the columns of the identity matrix, which we know is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly independent by Theorem NMLIC.

Suppose we grab an arbitrary vector from C^m, say

\[
v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_m \end{bmatrix}.
\]

Can we write v as a linear combination of the vectors in B? Yes, and quite simply.

\[
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_m \end{bmatrix}
= v_1\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
+ v_2\begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
+ v_3\begin{bmatrix} 0 \\ 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}
+ \cdots
+ v_m\begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

\[
v = v_1e_1 + v_2e_2 + v_3e_3 + \cdots + v_me_m
\]

This shows that C^m ⊆ ⟨B⟩, which is sufficient to show that B is a spanning set for C^m. ∎

Example BP Bases for P_n
The vector space of polynomials with degree at most n, P_n, has the basis

\[
B = \left\{ 1,\ x,\ x^2,\ x^3,\ \ldots,\ x^n \right\}.
\]

Another nice basis for P_n is

\[
C = \left\{ 1,\ 1+x,\ 1+x+x^2,\ 1+x+x^2+x^3,\ \ldots,\ 1+x+x^2+x^3+\cdots+x^n \right\}.
\]

Checking that each of B and C is a linearly independent spanning set is a good exercise. △

Example BM A basis for the vector space of matrices
In the vector space M_{mn} of matrices (Example VSM) define the matrices B_{kl}, 1 ≤ k ≤ m, 1 ≤ l ≤ n by

\[
[B_{kl}]_{ij} =
\begin{cases}
1 & \text{if } k = i,\ l = j \\
0 & \text{otherwise}
\end{cases}
\]

So these matrices have entries that are all zeros, with the exception of a lone entry that is one. The set of all mn of them,

\[
B = \{ B_{kl} \mid 1 \le k \le m,\ 1 \le l \le n \}
\]

forms a basis for M_{mn}. See Exercise B.M20. △
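A minimal sketch in Python/NumPy (an illustration, not the text's own verification) that builds the matrices B_{kl} and confirms the observation behind the spanning argument: entry [k, l] of a matrix is its coefficient on B_{kl}.

```python
# A minimal sketch (not from the text) of the basis in Example BM.
import numpy as np

def standard_basis(m, n):
    """Return the mn matrices B_kl for M_mn, in row-major order."""
    basis = []
    for k in range(m):
        for l in range(n):
            B = np.zeros((m, n))
            B[k, l] = 1.0   # a lone 1 in row k, column l; zeros elsewhere
            basis.append(B)
    return basis

M = np.array([[2.0, -1.0, 3.0],
              [0.0,  4.0, 5.0]])
combo = np.zeros((2, 3))
for B in standard_basis(2, 3):
    k, l = np.argwhere(B == 1.0)[0]   # locate the lone 1
    combo += M[k, l] * B              # coefficient of B_kl is entry [k, l]
print(np.array_equal(M, combo))       # True
```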

The bases described above will often be convenient ones to work with. However, a basis does not have to look obviously like a basis.

Example BSP4 A basis for a subspace of P_4
In Example SSP4 we showed that

\[
S = \left\{ x-2,\ x^2-4x+4,\ x^3-6x^2+12x-8,\ x^4-8x^3+24x^2-32x+16 \right\}
\]

is a spanning set for W = { p(x) | p ∈ P_4, p(2) = 0 }. We will now show that S is also linearly independent in W. Begin with a relation of linear dependence,

\[
\begin{aligned}
0 + 0x + 0x^2 + 0x^3 + 0x^4
&= \alpha_1(x-2) + \alpha_2\left(x^2-4x+4\right) + \alpha_3\left(x^3-6x^2+12x-8\right) \\
&\qquad + \alpha_4\left(x^4-8x^3+24x^2-32x+16\right) \\
&= \alpha_4x^4 + (\alpha_3 - 8\alpha_4)x^3 + (\alpha_2 - 6\alpha_3 + 24\alpha_4)x^2 \\
&\qquad + (\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4)x + (-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4)
\end{aligned}
\]

Equating coefficients (vector equality in P_4) gives the homogeneous system of five equations in four variables,

\[
\begin{aligned}
\alpha_4 &= 0 \\
\alpha_3 - 8\alpha_4 &= 0 \\
\alpha_2 - 6\alpha_3 + 24\alpha_4 &= 0 \\
\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4 &= 0 \\
-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4 &= 0
\end{aligned}
\]

We form the coefficient matrix, and row-reduce to obtain a matrix in reduced row-echelon form

\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]

With only the trivial solution to this homogeneous system, we conclude that the only scalars that will form a relation of linear dependence are the trivial ones, and therefore the set S is linearly independent (Definition LI). Finally, S has earned the right to be called a basis for W (Definition B). △
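Here is a minimal SymPy sketch (ours, not the text's) of the row-reduction in Example BSP4; every column being a pivot column confirms that only the trivial relation of linear dependence exists.

```python
# A minimal sketch (not from the text) checking Example BSP4.
import sympy as sp

# Columns are the coefficient vectors of the four polynomials in S,
# reading the coefficients of 1, x, x^2, x^3, x^4 down each column.
M = sp.Matrix([[-2,   4,  -8,  16],
               [ 1,  -4,  12, -32],
               [ 0,   1,  -6,  24],
               [ 0,   0,   1,  -8],
               [ 0,   0,   0,   1]])
rref, pivots = M.rref()
print(pivots)          # (0, 1, 2, 3): every column is a pivot column
print(M.rank() == 4)   # True, so S is linearly independent
```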

Example BSM22 A basis for a subspace of M_{22}
In Example SSM22 we discovered that

\[
Q = \left\{ \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \right\}
\]

is a spanning set for the subspace

\[
Z = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a+3b-c-5d = 0,\ -2a-6b+3c+14d = 0 \right\}
\]

of the vector space of all 2 × 2 matrices, M_{22}. If we can also determine that Q is linearly independent in Z (or in M_{22}), then it will qualify as a basis for Z.

Let us begin with a relation of linear dependence.

\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
= \alpha_1\begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix} + \alpha_2\begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix}
= \begin{bmatrix} -3\alpha_1 + \alpha_2 & \alpha_1 \\ -4\alpha_2 & \alpha_2 \end{bmatrix}
\]

Using our definition of matrix equality (Definition ME) we equate entries and get a homogeneous system of four equations in two variables,

\[
\begin{aligned}
-3\alpha_1 + \alpha_2 &= 0 \\
\alpha_1 &= 0 \\
-4\alpha_2 &= 0 \\
\alpha_2 &= 0
\end{aligned}
\]

We could row-reduce the coefficient matrix of this homogeneous system, but it is not necessary. The second and fourth equations tell us that α_1 = 0, α_2 = 0 is the only solution to this homogeneous system. This qualifies the set Q as being linearly independent, since the only relation of linear dependence is trivial (Definition LI). Therefore Q is a basis for Z (Definition B). △

Example BC Basis for the crazy vector space
In Example LIC and Example SSC we determined that the set R = {(1, 0), (6, 3)} from the crazy vector space, C (Example CVS), is linearly independent and is a spanning set for C. By Definition B we see that R is a basis for C. △

We have seen that several of the sets associated with a matrix are subspaces of vector spaces of column vectors. Specifically these are the null space (Theorem NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS, Theorem BRS each have conclusions that provide linearly independent spanning sets for (respectively) the null space, column space, and row space. Notice that each of these theorems contains the word "basis" in its title, even though we did not know the precise meaning of the word at the time. To find a basis for a left null space we can use the definition of this subspace as a null space (Definition LNS) and apply Theorem BNS. Or Theorem FS tells us that the left null space can be expressed as a row space and we can then use Theorem BRS.

Theorem BS is another early result that provides a linearly independent spanning set (i.e. a basis) as its conclusion. If a vector space of column vectors can be expressed as a span of a set of column vectors, then Theorem BS can be employed in a straightforward manner to quickly yield a basis.

Subsection BSCV
Bases for Spans of Column Vectors

We have seen several examples of bases in different vector spaces. In this subsection, and the next (Subsection B.BNM), we will consider building bases for C^m and its subspaces.

Suppose we have a subspace of C^m that is expressed as the span of a set of vectors, S, and S is not necessarily linearly independent, or perhaps not very attractive. Theorem REMRS says that row-equivalent matrices have identical row spaces, while Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a basis for the row space. These theorems together give us a great computational tool for quickly finding a basis for a subspace that is expressed originally as a span.

Example RSB Row space basis
When we first defined the span of a set of column vectors, in Example SCAD we looked at the set

\[
W = \left\langle \left\{ \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix},\ \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix},\ \begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix} \right\} \right\rangle
\]

with an eye towards realizing W as the span of a smaller set. By building relations of linear dependence (though we did not know them by that name then) we were able to remove two vectors and write W as the span of the other two vectors. These two remaining vectors formed a linearly independent set, even though we did not know that at the time.

Now we know that W is a subspace and must have a basis. Consider the matrix, C, whose rows are the vectors in the spanning set for W,

\[
C = \begin{bmatrix}
2 & -3 & 1 \\
1 & 4 & 1 \\
7 & -5 & 4 \\
-7 & -6 & -5
\end{bmatrix}
\]

Then, by Definition RSM, the row space of C will be W, R(C) = W. Theorem BRS tells us that if we row-reduce C, the nonzero rows of the row-equivalent matrix in reduced row-echelon form will be a basis for R(C), and hence a basis for W. Let us do it — C row-reduces to

\[
\begin{bmatrix}
1 & 0 & \frac{7}{11} \\
0 & 1 & \frac{1}{11} \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}
\]

If we convert the two nonzero rows to column vectors then we have a basis,

\[
B = \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\}
\]

and

\[
W = \left\langle \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\} \right\rangle
\]

For aesthetic reasons, we might wish to multiply each vector in B by 11, which will not change the spanning or linear independence properties of B as a basis. Then we can also write

\[
W = \left\langle \left\{ \begin{bmatrix} 11 \\ 0 \\ 7 \end{bmatrix},\ \begin{bmatrix} 0 \\ 11 \\ 1 \end{bmatrix} \right\} \right\rangle
\]

△
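The entire technique of Example RSB is one call to a row-reduction routine. Here is a minimal SymPy sketch (an illustration, not part of the text):

```python
# A minimal sketch (not from the text) of Example RSB: the vectors of
# the spanning set become the rows of C, and rref does the rest.
import sympy as sp

C = sp.Matrix([[ 2, -3,  1],
               [ 1,  4,  1],
               [ 7, -5,  4],
               [-7, -6, -5]])
rref, pivots = C.rref()
print(rref)
# Matrix([[1, 0, 7/11], [0, 1, 1/11], [0, 0, 0], [0, 0, 0]])
# The two nonzero rows, read as column vectors, are the basis B for W.
```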

Example IAS provides another example of this flavor, though now we can notice that X is a subspace, and that the resulting set of three vectors is a basis. This is such a powerful technique that we should do one more example.

Example RS Reducing a span
In Example RSC5 we began with a set of n = 4 vectors from C^5,

\[
R = \{ v_1, v_2, v_3, v_4 \} = \left\{
\begin{bmatrix} 1 \\ 2 \\ -1 \\ 3 \\ 2 \end{bmatrix},\
\begin{bmatrix} 2 \\ 1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\
\begin{bmatrix} 0 \\ -7 \\ 6 \\ -11 \\ -2 \end{bmatrix},\
\begin{bmatrix} 4 \\ 1 \\ 2 \\ 1 \\ 6 \end{bmatrix}
\right\}
\]

and defined V = ⟨R⟩. Our goal in that problem was to find a relation of linear dependence on the vectors in R, solve the resulting equation for one of the vectors, and re-express V as the span of a set of three vectors.

Here is another way to accomplish something similar. The row space of the matrix

\[
A = \begin{bmatrix}
1 & 2 & -1 & 3 & 2 \\
2 & 1 & 3 & 1 & 2 \\
0 & -7 & 6 & -11 & -2 \\
4 & 1 & 2 & 1 & 6
\end{bmatrix}
\]


§B Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 309<br />

is equal to 〈R〉. ByTheoremBRS we can row-reduce this matrix, ignore any zero<br />

rows, and use the nonzero rows as column vectors that are a basis for the row space<br />

of A. Row-reduc<strong>in</strong>g A creates the matrix<br />

So<br />

⎧⎡<br />

⎪⎨<br />

⎢<br />

⎣<br />

⎪⎩<br />

⎡<br />

1 0 0 − 1<br />

17<br />

25<br />

⎢0 1 0<br />

⎣0 0 1 − 2<br />

0 0 0 0 0<br />

1<br />

0<br />

0<br />

− 1<br />

17<br />

30<br />

17<br />

⎤ ⎡<br />

⎥<br />

⎦ , ⎢<br />

⎣<br />

0<br />

1<br />

0<br />

25<br />

17<br />

− 2<br />

17<br />

30<br />

17<br />

17<br />

− 2<br />

17<br />

17<br />

− 8<br />

17<br />

⎤ ⎡<br />

⎥<br />

⎦ , ⎢<br />

⎣<br />

0<br />

0<br />

1<br />

⎤<br />

⎥<br />

⎦<br />

− 2<br />

17<br />

− 8<br />

17<br />

⎤⎫<br />

⎪⎬<br />

⎥<br />

⎦<br />

⎪⎭<br />

is a basis for V . Our theorem tells us this is a basis, there is no need to verify that<br />

the subspace spanned by three vectors (rather than four) is the identical subspace,<br />

and there is no need to verify that we have reached the limit <strong>in</strong> reduc<strong>in</strong>g the set,<br />

s<strong>in</strong>ce the set of three vectors is guaranteed to be l<strong>in</strong>early <strong>in</strong>dependent. △<br />

Subsection BNM
Bases and Nonsingular Matrices

A quick source of diverse bases for C^m is the set of columns of a nonsingular matrix.

Theorem CNMB Columns of Nonsingular Matrix are a Basis
Suppose that A is a square matrix of size m. Then the columns of A are a basis of C^m if and only if A is nonsingular.

Proof. (⇒) Suppose that the columns of A are a basis for C^m. Then Definition B says the set of columns is linearly independent. Theorem NMLIC then says that A is nonsingular.

(⇐) Suppose that A is nonsingular. Then by Theorem NMLIC this set of columns is linearly independent. Theorem CSNM says that for a nonsingular matrix, C(A) = C^m. This is equivalent to saying that the columns of A are a spanning set for the vector space C^m. As a linearly independent spanning set, the columns of A qualify as a basis for C^m (Definition B). ∎

Example CABAK Columns as Basis, Archetype K
Archetype K is the 5 × 5 matrix

\[
K = \begin{bmatrix}
10 & 18 & 24 & 24 & -12 \\
12 & -2 & -6 & 0 & -18 \\
-30 & -21 & -23 & -30 & 39 \\
27 & 30 & 36 & 37 & -30 \\
18 & 24 & 30 & 30 & -20
\end{bmatrix}
\]

which is row-equivalent to the 5 × 5 identity matrix I_5. So by Theorem NMRRI, K is nonsingular. Then Theorem CNMB says the set

\[
\left\{
\begin{bmatrix} 10 \\ 12 \\ -30 \\ 27 \\ 18 \end{bmatrix},\
\begin{bmatrix} 18 \\ -2 \\ -21 \\ 30 \\ 24 \end{bmatrix},\
\begin{bmatrix} 24 \\ -6 \\ -23 \\ 36 \\ 30 \end{bmatrix},\
\begin{bmatrix} 24 \\ 0 \\ -30 \\ 37 \\ 30 \end{bmatrix},\
\begin{bmatrix} -12 \\ -18 \\ 39 \\ -30 \\ -20 \end{bmatrix}
\right\}
\]

is a (novel) basis of C^5. △
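A minimal SymPy sketch (ours, not the text's) confirming that Archetype K row-reduces to the identity, which by Theorem CNMB is all that is needed for its columns to be a basis of C^5:

```python
# A minimal sketch (not from the text) checking Example CABAK.
import sympy as sp

K = sp.Matrix([[ 10,  18,  24,  24, -12],
               [ 12,  -2,  -6,   0, -18],
               [-30, -21, -23, -30,  39],
               [ 27,  30,  36,  37, -30],
               [ 18,  24,  30,  30, -20]])
rref, _ = K.rref()
print(rref == sp.eye(5))   # True: K is nonsingular, so its columns
                           # are a basis of C^5 by Theorem CNMB
```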

Perhaps we should view the fact that the standard unit vectors are a basis (Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Proof Technique LC.)

With a new equivalence for a nonsingular matrix, we can update our list of equivalences.

Theorem NME5 Nonsingular Matrix Equivalences, Round 5
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.

2. A row-reduces to the identity matrix.

3. The null space of A contains only the zero vector, N(A) = {0}.

4. The linear system LS(A, b) has a unique solution for every possible choice of b.

5. The columns of A are a linearly independent set.

6. A is invertible.

7. The column space of A is C^n, C(A) = C^n.

8. The columns of A are a basis for C^n.

Proof. With a new equivalence for a nonsingular matrix in Theorem CNMB we can expand Theorem NME4. ∎


Subsection OBC
Orthonormal Bases and Coordinates

We learned about orthogonal sets of vectors in C^m back in Section O, and we also learned that orthogonal sets are automatically linearly independent (Theorem OSLI). When an orthogonal set also spans a subspace of C^m, then the set is a basis. And when the set is orthonormal, then the set is an incredibly nice basis. We will back up this claim with a theorem, but first consider how you might manufacture such a set.

Suppose that W is a subspace of C^m with basis B. Then B spans W and is a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt Procedure (Theorem GSP) and obtain a linearly independent set T such that ⟨T⟩ = ⟨B⟩ = W and T is orthogonal. In other words, T is a basis for W, and is an orthogonal set. By scaling each vector of T to norm 1, we can convert T into an orthonormal set, without destroying the properties that make it a basis of W. In short, we can convert any basis into an orthonormal basis. Example GSTV, followed by Example ONTV, illustrates this process, and a computational sketch follows below.
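Here is a minimal NumPy sketch of the Gram-Schmidt idea just described (an illustration, not the text's Theorem GSP verbatim): orthogonalize, then scale to norm 1. The conjugation built into np.vdot matches the convention that our inner product conjugates the first argument.

```python
# A minimal sketch (not from the text): orthogonalize a basis, then
# normalize, producing an orthonormal basis for the same subspace.
import numpy as np

def gram_schmidt(vectors):
    ortho = []
    for v in vectors:
        w = v.astype(complex)
        for u in ortho:
            # subtract the component of v in the direction of u
            w = w - (np.vdot(u, w) / np.vdot(u, u)) * u
        ortho.append(w)
    return [u / np.linalg.norm(u) for u in ortho]   # scale to norm 1

basis = [np.array([1.0, 2.0, 1.0]),
         np.array([-1.0, 0.0, 1.0]),
         np.array([2.0, 1.0, 1.0])]
Q = gram_schmidt(basis)
# The Gram matrix of inner products should be the identity.
print(np.round([[np.vdot(u, v) for v in Q] for u in Q], 10))
```

Run on the three vectors above (the starting set of Example CROB3 below), this reproduces the orthonormal basis given there.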

Unitary matrices (Definition UM) are another good source of orthonormal bases (and vice versa). Suppose that Q is a unitary matrix of size n. Then the n columns of Q form an orthonormal set (Theorem CUMOS) that is therefore linearly independent (Theorem OSLI). Since Q is invertible (Theorem UMI), we know Q is nonsingular (Theorem NI), and then the columns of Q span C^n (Theorem CSNM). So the columns of a unitary matrix of size n are an orthonormal basis for C^n.

Why all the fuss about orthonormal bases? Theorem VRRB told us that any vector in a vector space could be written, uniquely, as a linear combination of basis vectors. For an orthonormal basis, finding the scalars for this linear combination is extremely easy, and this is the content of the next theorem. Furthermore, with vectors written this way (as linear combinations of the elements of an orthonormal set) certain computations and analysis become much easier. Here is the promised theorem.
Theorem COB Coordinates and Orthonormal Bases
Suppose that B = {v_1, v_2, v_3, ..., v_p} is an orthonormal basis of the subspace W of C^m. For any w ∈ W,

\[
w = \langle v_1, w\rangle v_1 + \langle v_2, w\rangle v_2 + \langle v_3, w\rangle v_3 + \cdots + \langle v_p, w\rangle v_p
\]

Proof. Because B is a basis of W, Theorem VRRB tells us that we can write w uniquely as a linear combination of the vectors in B. So it is not this aspect of the conclusion that makes this theorem interesting. What is interesting is that the particular scalars are so easy to compute. No need to solve big systems of equations — just do an inner product of w with v_i to arrive at the coefficient of v_i in the linear combination.

So begin the proof by writing w as a linear combination of the vectors in B, using unknown scalars,

\[
w = a_1v_1 + a_2v_2 + a_3v_3 + \cdots + a_pv_p
\]

and compute,

\[
\begin{aligned}
\langle v_i, w\rangle
&= \left\langle v_i, \sum_{k=1}^{p} a_kv_k \right\rangle && \text{Theorem VRRB} \\
&= \sum_{k=1}^{p} \langle v_i, a_kv_k\rangle && \text{Theorem IPVA} \\
&= \sum_{k=1}^{p} a_k\langle v_i, v_k\rangle && \text{Theorem IPSM} \\
&= a_i\langle v_i, v_i\rangle + \sum_{\substack{k=1 \\ k\ne i}}^{p} a_k\langle v_i, v_k\rangle && \text{Property C} \\
&= a_i(1) + \sum_{\substack{k=1 \\ k\ne i}}^{p} a_k(0) && \text{Definition ONS} \\
&= a_i
\end{aligned}
\]

So the (unique) scalars for the linear combination are indeed the inner products advertised in the conclusion of the theorem's statement. ∎

Example CROB4 Coordinatization relative to an orthonormal basis, C^4
The set

\[
\{ x_1, x_2, x_3, x_4 \} = \left\{
\begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix},\
\begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix},\
\begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix},\
\begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix}
\right\}
\]

was proposed, and partially verified, as an orthogonal set in Example AOS. Let us scale each vector to norm 1, so as to form an orthonormal set in C^4. Then by Theorem OSLI the set will be linearly independent, and by Theorem NME5 the set will be a basis for C^4. So, once scaled to norm 1, the adjusted set will be an orthonormal basis of C^4. The norms are,

\[
\|x_1\| = \sqrt{6} \qquad \|x_2\| = \sqrt{174} \qquad \|x_3\| = \sqrt{3451} \qquad \|x_4\| = \sqrt{119}
\]

So an orthonormal basis is

\[
B = \{ v_1, v_2, v_3, v_4 \}
= \left\{
\frac{1}{\sqrt{6}}\begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix},\
\frac{1}{\sqrt{174}}\begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix},\
\frac{1}{\sqrt{3451}}\begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix},\
\frac{1}{\sqrt{119}}\begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix}
\right\}
\]

Now, to illustrate Theorem COB, choose any vector from C^4, say

\[
w = \begin{bmatrix} 2 \\ -3 \\ 1 \\ 4 \end{bmatrix},
\]

and compute

\[
\langle v_1, w\rangle = \frac{-5i}{\sqrt{6}} \qquad
\langle v_2, w\rangle = \frac{-19+30i}{\sqrt{174}} \qquad
\langle v_3, w\rangle = \frac{120-211i}{\sqrt{3451}} \qquad
\langle v_4, w\rangle = \frac{6+12i}{\sqrt{119}}
\]

Then Theorem COB guarantees that

\[
\begin{bmatrix} 2 \\ -3 \\ 1 \\ 4 \end{bmatrix}
= \frac{-5i}{\sqrt{6}}\left(\frac{1}{\sqrt{6}}\begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix}\right)
+ \frac{-19+30i}{\sqrt{174}}\left(\frac{1}{\sqrt{174}}\begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix}\right)
+ \frac{120-211i}{\sqrt{3451}}\left(\frac{1}{\sqrt{3451}}\begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix}\right)
+ \frac{6+12i}{\sqrt{119}}\left(\frac{1}{\sqrt{119}}\begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix}\right)
\]

as you might want to check (if you have unlimited patience). △

A slightly less intimidating example follows, in three dimensions and with just real numbers.

Example CROB3 Coordinatization relative to an orthonormal basis, C^3
The set

\[
\{ x_1, x_2, x_3 \} = \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \right\}
\]

is a linearly independent set, which the Gram-Schmidt Process (Theorem GSP) converts to an orthogonal set, and which can then be converted to the orthonormal set,

\[
B = \{ v_1, v_2, v_3 \} = \left\{ \frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\ \frac{1}{\sqrt{2}}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},\ \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right\}
\]

which is therefore an orthonormal basis of C^3. With three vectors in C^3, all with real number entries, the inner product (Definition IP) reduces to the usual "dot product" (or scalar product) and the orthogonal pairs of vectors can be interpreted as perpendicular pairs of directions. So the vectors in B serve as replacements for our usual 3-D axes, or the usual 3-D unit vectors i, j and k. We would like to decompose arbitrary vectors into "components" in the directions of each of these basis vectors. It is Theorem COB that tells us how to do this.

Suppose that we choose

\[
w = \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}.
\]

Compute

\[
\langle v_1, w\rangle = \frac{5}{\sqrt{6}} \qquad
\langle v_2, w\rangle = \frac{3}{\sqrt{2}} \qquad
\langle v_3, w\rangle = \frac{8}{\sqrt{3}}
\]

then Theorem COB guarantees that

\[
\begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}
= \frac{5}{\sqrt{6}}\left(\frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\right)
+ \frac{3}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right)
+ \frac{8}{\sqrt{3}}\left(\frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}\right)
\]

which you should be able to check easily, even if you do not have much patience. △
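A minimal NumPy sketch (ours, not the text's) of the Theorem COB computation in Example CROB3: the coordinates are plain inner products, and they rebuild w exactly.

```python
# A minimal sketch (not from the text) checking Example CROB3.
import numpy as np

v1 = np.array([1, 2, 1]) / np.sqrt(6)
v2 = np.array([-1, 0, 1]) / np.sqrt(2)
v3 = np.array([1, -1, 1]) / np.sqrt(3)
w = np.array([2.0, -1.0, 5.0])

# With real entries, <v_i, w> is just the dot product.
coords = [np.dot(v, w) for v in (v1, v2, v3)]
rebuilt = sum(c * v for c, v in zip(coords, (v1, v2, v3)))
print(np.allclose(rebuilt, w))   # True
print(coords)                    # 5/sqrt(6), 3/sqrt(2), 8/sqrt(3)
```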

Not only do the columns of a unitary matrix form an orthonormal basis, but there is a deeper connection between orthonormal bases and unitary matrices. Informally, the next theorem says that if we transform each vector of an orthonormal basis by multiplying it by a unitary matrix, then the resulting set will be another orthonormal basis. And more remarkably, any matrix with this property must be unitary! As an equivalence (Proof Technique E) we could take this as our defining property of a unitary matrix, though it might not have the same utility as Definition UM.

Theorem UMCOB Unitary Matrices Convert Orthonormal Bases
Let A be an n × n matrix and B = {x_1, x_2, x_3, ..., x_n} be an orthonormal basis of C^n. Define

\[
C = \{ Ax_1,\ Ax_2,\ Ax_3,\ \ldots,\ Ax_n \}
\]

Then A is a unitary matrix if and only if C is an orthonormal basis of C^n.

Proof. (⇒) Assume A is a unitary matrix and establish several facts about C. First we check that C is an orthonormal set (Definition ONS). By Theorem UMPIP, for i ≠ j,

\[
\langle Ax_i, Ax_j\rangle = \langle x_i, x_j\rangle = 0
\]

Similarly, Theorem UMPIP also gives, for 1 ≤ i ≤ n,

\[
\|Ax_i\| = \|x_i\| = 1
\]

As C is an orthogonal set (Definition OSV), Theorem OSLI yields the linear independence of C. Having established that the column vectors in C form a linearly independent set, a matrix whose columns are the vectors of C is nonsingular (Theorem NMLIC), and hence these vectors form a basis of C^n by Theorem CNMB.

(⇐) Now assume that C is an orthonormal set. Let y be an arbitrary vector from C^n. Since B spans C^n, there are scalars, a_1, a_2, a_3, ..., a_n, such that

\[
y = a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n
\]

Now

\[
\begin{aligned}
A^*Ay &= \sum_{i=1}^{n} \langle x_i, A^*Ay\rangle\, x_i && \text{Theorem COB} \\
&= \sum_{i=1}^{n} \left\langle x_i, A^*A\sum_{j=1}^{n} a_jx_j \right\rangle x_i && \text{Definition SSVS} \\
&= \sum_{i=1}^{n} \left\langle x_i, \sum_{j=1}^{n} A^*Aa_jx_j \right\rangle x_i && \text{Theorem MMDAA} \\
&= \sum_{i=1}^{n} \left\langle x_i, \sum_{j=1}^{n} a_jA^*Ax_j \right\rangle x_i && \text{Theorem MMSMM} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \langle x_i, a_jA^*Ax_j\rangle\, x_i && \text{Theorem IPVA} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} a_j\langle x_i, A^*Ax_j\rangle\, x_i && \text{Theorem IPSM} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} a_j\langle Ax_i, Ax_j\rangle\, x_i && \text{Theorem AIP} \\
&= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j\ne i}}^{n} a_j\langle Ax_i, Ax_j\rangle\, x_i + \sum_{\ell=1}^{n} a_\ell\langle Ax_\ell, Ax_\ell\rangle\, x_\ell && \text{Property C} \\
&= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j\ne i}}^{n} a_j(0)x_i + \sum_{\ell=1}^{n} a_\ell(1)x_\ell && \text{Definition ONS} \\
&= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j\ne i}}^{n} 0 + \sum_{\ell=1}^{n} a_\ell x_\ell && \text{Theorem ZSSM} \\
&= \sum_{\ell=1}^{n} a_\ell x_\ell && \text{Property Z} \\
&= y \\
&= I_ny && \text{Theorem MMIM}
\end{aligned}
\]

Since the choice of y was arbitrary, Theorem EMMVP tells us that A^*A = I_n, so A is unitary (Definition UM). ∎
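To see Theorem UMCOB in action numerically, here is a minimal NumPy sketch (an illustration with a simple rotation matrix of our choosing, not from the text): a unitary Q carries the standard orthonormal basis to another orthonormal set.

```python
# A minimal sketch (not from the text) illustrating Theorem UMCOB.
import numpy as np

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a real unitary matrix

print(np.allclose(Q.conj().T @ Q, np.eye(2)))     # True: Q is unitary

basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
image = [Q @ x for x in basis]
gram = np.array([[np.vdot(u, v) for v in image] for u in image])
print(np.allclose(gram, np.eye(2)))               # True: image is orthonormal
```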



Reading Questions

1. The matrix below is nonsingular. What can you now say about its columns?

\[
A = \begin{bmatrix} -3 & 0 & 1 \\ 1 & 2 & 1 \\ 5 & 1 & 6 \end{bmatrix}
\]

2. Write the vector

\[
w = \begin{bmatrix} 6 \\ 6 \\ 15 \end{bmatrix}
\]

as a linear combination of the columns of the matrix A above. How many ways are there to answer this question?

3. Why is an orthonormal basis desirable?

Exercises

C10† Find a basis for ⟨S⟩, where

\[
S = \left\{
\begin{bmatrix} 1 \\ 3 \\ 2 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 2 \\ 2 \\ 1 \end{bmatrix},\
\begin{bmatrix} 3 \\ 4 \\ 1 \\ 3 \end{bmatrix}
\right\}.
\]

C11† Find a basis for the subspace W of C^4,

\[
W = \left\{ \begin{bmatrix} a+b-2c \\ a+b-2c+d \\ -2a+2b+4c-d \\ b+d \end{bmatrix} \,\middle|\, a, b, c, d \in \mathbb{C} \right\}
\]

C12† Find a basis for the vector space T of lower triangular 3 × 3 matrices; that is, matrices of the form

\[
\begin{bmatrix} * & 0 & 0 \\ * & * & 0 \\ * & * & * \end{bmatrix}
\]

where an asterisk represents any complex number.

C13† Find a basis for the subspace Q of P_2, Q = { p(x) = a + bx + cx^2 | p(0) = 0 }.

C14† Find a basis for the subspace R of P_2, R = { p(x) = a + bx + cx^2 | p'(0) = 0 }, where p' denotes the derivative.

C40† From Example RSB, form an arbitrary (and nontrivial) linear combination of the four vectors in the original spanning set for W. So the result of this computation is of course an element of W. As such, this vector should be a linear combination of the basis vectors in B. Find the (unique) scalars that provide this linear combination. Repeat with another linear combination of the original four vectors.

C80 Prove that {(1, 2), (2, 3)} is a basis for the crazy vector space C (Example CVS).

M20† In Example BM provide the verifications (linear independence and spanning) to show that B is a basis of M_{mn}.

T50† Theorem UMCOB says that unitary matrices are characterized as those matrices that "carry" orthonormal bases to orthonormal bases. This problem asks you to prove a similar result: nonsingular matrices are characterized as those matrices that "carry" bases to bases.

More precisely, suppose that A is a square matrix of size n and B = {x_1, x_2, x_3, ..., x_n} is a basis of C^n. Prove that A is nonsingular if and only if C = {Ax_1, Ax_2, Ax_3, ..., Ax_n} is a basis of C^n. (See also Exercise PD.T33, Exercise MR.T20.)

T51† Use the result of Exercise B.T50 to build a very concise proof of Theorem CNMB. (Hint: make a judicious choice for the basis B.)


Section D
Dimension

Almost every vector space we have encountered has been infinite in size (an exception is Example VSS). But some are bigger and richer than others. Dimension, once suitably defined, will be a measure of the size of a vector space, and a useful tool for studying its properties. You probably already have a rough notion of what a mathematical definition of dimension might be — try to forget these imprecise ideas and go with the new ones given here.

Subsection D
Dimension

Definition D Dimension
Suppose that V is a vector space and {v_1, v_2, v_3, ..., v_t} is a basis of V. Then the dimension of V is defined by dim(V) = t. If V has no finite bases, we say V has infinite dimension. □

This is a very simple definition, which belies its power. Grab a basis, any basis, and count up the number of vectors it contains. That is the dimension. However, this simplicity causes a problem. Given a vector space, you and I could each construct different bases — remember that a vector space might have many bases. And what if your basis and my basis had different sizes? Applying Definition D we would arrive at different numbers! With our current knowledge about vector spaces, we would have to say that dimension is not "well-defined." Fortunately, there is a theorem that will correct this problem.

In a strictly logical progression, the next two theorems would precede the definition of dimension. Many subsequent theorems will trace their lineage back to the following fundamental result.

Theorem SSLD  Spanning Sets and Linear Dependence
Suppose that $S = \{v_1,\,v_2,\,v_3,\,\dots,\,v_t\}$ is a finite set of vectors which spans the vector space $V$. Then any set of $t+1$ or more vectors from $V$ is linearly dependent.

Proof. We want to prove that any set of $t+1$ or more vectors from $V$ is linearly dependent. So we will begin with a totally arbitrary set of vectors from $V$, $R = \{u_1,\,u_2,\,u_3,\,\dots,\,u_m\}$, where $m > t$. We will now construct a nontrivial relation of linear dependence on $R$.

Each vector $u_1,\,u_2,\,u_3,\,\dots,\,u_m$ can be written as a linear combination of the vectors $v_1,\,v_2,\,v_3,\,\dots,\,v_t$ since $S$ is a spanning set of $V$. This means there exist scalars $a_{ij}$, $1 \le i \le t$, $1 \le j \le m$, so that
\begin{align*}
u_1 &= a_{11}v_1 + a_{21}v_2 + a_{31}v_3 + \cdots + a_{t1}v_t \\
u_2 &= a_{12}v_1 + a_{22}v_2 + a_{32}v_3 + \cdots + a_{t2}v_t \\
u_3 &= a_{13}v_1 + a_{23}v_2 + a_{33}v_3 + \cdots + a_{t3}v_t \\
&\ \ \vdots \\
u_m &= a_{1m}v_1 + a_{2m}v_2 + a_{3m}v_3 + \cdots + a_{tm}v_t
\end{align*}

Now we form, unmotivated, the homogeneous system of $t$ equations in the $m$ variables, $x_1,\,x_2,\,x_3,\,\dots,\,x_m$, where the coefficients are the just-discovered scalars $a_{ij}$,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1m}x_m &= 0 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2m}x_m &= 0 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3m}x_m &= 0 \\
&\ \ \vdots \\
a_{t1}x_1 + a_{t2}x_2 + a_{t3}x_3 + \cdots + a_{tm}x_m &= 0
\end{align*}
This is a homogeneous system with more variables than equations (our hypothesis is expressed as $m > t$), so by Theorem HMVEI there are infinitely many solutions. Choose a nontrivial solution and denote it by $x_1 = c_1,\ x_2 = c_2,\ x_3 = c_3,\ \dots,\ x_m = c_m$. As a solution to the homogeneous system, we then have
\begin{align*}
a_{11}c_1 + a_{12}c_2 + a_{13}c_3 + \cdots + a_{1m}c_m &= 0 \\
a_{21}c_1 + a_{22}c_2 + a_{23}c_3 + \cdots + a_{2m}c_m &= 0 \\
a_{31}c_1 + a_{32}c_2 + a_{33}c_3 + \cdots + a_{3m}c_m &= 0 \\
&\ \ \vdots \\
a_{t1}c_1 + a_{t2}c_2 + a_{t3}c_3 + \cdots + a_{tm}c_m &= 0
\end{align*}

As a collection of nontrivial scalars, $c_1,\,c_2,\,c_3,\,\dots,\,c_m$ will provide the nontrivial relation of linear dependence we desire,
\begin{align*}
&c_1u_1 + c_2u_2 + c_3u_3 + \cdots + c_mu_m \\
&\quad= c_1\left(a_{11}v_1 + a_{21}v_2 + a_{31}v_3 + \cdots + a_{t1}v_t\right) && \text{Definition SSVS} \\
&\qquad+ c_2\left(a_{12}v_1 + a_{22}v_2 + a_{32}v_3 + \cdots + a_{t2}v_t\right) \\
&\qquad+ c_3\left(a_{13}v_1 + a_{23}v_2 + a_{33}v_3 + \cdots + a_{t3}v_t\right) \\
&\qquad\ \ \vdots \\
&\qquad+ c_m\left(a_{1m}v_1 + a_{2m}v_2 + a_{3m}v_3 + \cdots + a_{tm}v_t\right) \\
&\quad= c_1a_{11}v_1 + c_1a_{21}v_2 + c_1a_{31}v_3 + \cdots + c_1a_{t1}v_t && \text{Property DVA} \\
&\qquad+ c_2a_{12}v_1 + c_2a_{22}v_2 + c_2a_{32}v_3 + \cdots + c_2a_{t2}v_t \\
&\qquad+ c_3a_{13}v_1 + c_3a_{23}v_2 + c_3a_{33}v_3 + \cdots + c_3a_{t3}v_t \\
&\qquad\ \ \vdots \\
&\qquad+ c_ma_{1m}v_1 + c_ma_{2m}v_2 + c_ma_{3m}v_3 + \cdots + c_ma_{tm}v_t \\
&\quad= \left(c_1a_{11} + c_2a_{12} + c_3a_{13} + \cdots + c_ma_{1m}\right)v_1 && \text{Property DSA} \\
&\qquad+ \left(c_1a_{21} + c_2a_{22} + c_3a_{23} + \cdots + c_ma_{2m}\right)v_2 \\
&\qquad+ \left(c_1a_{31} + c_2a_{32} + c_3a_{33} + \cdots + c_ma_{3m}\right)v_3 \\
&\qquad\ \ \vdots \\
&\qquad+ \left(c_1a_{t1} + c_2a_{t2} + c_3a_{t3} + \cdots + c_ma_{tm}\right)v_t \\
&\quad= \left(a_{11}c_1 + a_{12}c_2 + a_{13}c_3 + \cdots + a_{1m}c_m\right)v_1 && \text{Property CMCN} \\
&\qquad+ \left(a_{21}c_1 + a_{22}c_2 + a_{23}c_3 + \cdots + a_{2m}c_m\right)v_2 \\
&\qquad+ \left(a_{31}c_1 + a_{32}c_2 + a_{33}c_3 + \cdots + a_{3m}c_m\right)v_3 \\
&\qquad\ \ \vdots \\
&\qquad+ \left(a_{t1}c_1 + a_{t2}c_2 + a_{t3}c_3 + \cdots + a_{tm}c_m\right)v_t \\
&\quad= 0v_1 + 0v_2 + 0v_3 + \cdots + 0v_t && c_j \text{ as solution} \\
&\quad= \mathbf{0} + \mathbf{0} + \mathbf{0} + \cdots + \mathbf{0} && \text{Theorem ZSSM} \\
&\quad= \mathbf{0} && \text{Property Z}
\end{align*}
That does it. $R$ has been undeniably shown to be a linearly dependent set. ■

The proof just given has some monstrous expressions in it, mostly owing to the double subscripts present. Now is a great opportunity to show the value of a more compact notation. We will rewrite the key steps of the previous proof using summation notation, resulting in a more economical presentation, and even greater insight into the key aspects of the proof. So here is an alternate proof — study it carefully.

Alternate Proof: We want to prove that any set of $t+1$ or more vectors from $V$ is linearly dependent. So we will begin with a totally arbitrary set of vectors from $V$, $R = \{u_j \mid 1 \le j \le m\}$, where $m > t$. We will now construct a nontrivial relation of linear dependence on $R$.

Each vector $u_j$, $1 \le j \le m$, can be written as a linear combination of $v_i$, $1 \le i \le t$, since $S$ is a spanning set of $V$. This means there are scalars $a_{ij}$, $1 \le i \le t$, $1 \le j \le m$, so that
\[
u_j = \sum_{i=1}^{t} a_{ij} v_i \qquad 1 \le j \le m
\]

Now we form, unmotivated, the homogeneous system of $t$ equations in the $m$ variables, $x_j$, $1 \le j \le m$, where the coefficients are the just-discovered scalars $a_{ij}$,
\[
\sum_{j=1}^{m} a_{ij} x_j = 0 \qquad 1 \le i \le t
\]
This is a homogeneous system with more variables than equations (our hypothesis is expressed as $m > t$), so by Theorem HMVEI there are infinitely many solutions. Choose one of these solutions that is not trivial and denote it by $x_j = c_j$, $1 \le j \le m$. As a solution to the homogeneous system, we then have $\sum_{j=1}^{m} a_{ij} c_j = 0$ for $1 \le i \le t$. As a collection of nontrivial scalars, $c_j$, $1 \le j \le m$, will provide the nontrivial relation of linear dependence we desire,

\begin{align*}
\sum_{j=1}^{m} c_j u_j &= \sum_{j=1}^{m} c_j \left( \sum_{i=1}^{t} a_{ij} v_i \right) && \text{Definition SSVS} \\
&= \sum_{j=1}^{m} \sum_{i=1}^{t} c_j a_{ij} v_i && \text{Property DVA} \\
&= \sum_{i=1}^{t} \sum_{j=1}^{m} c_j a_{ij} v_i && \text{Property C} \\
&= \sum_{i=1}^{t} \sum_{j=1}^{m} a_{ij} c_j v_i && \text{Property CMCN} \\
&= \sum_{i=1}^{t} \left( \sum_{j=1}^{m} a_{ij} c_j \right) v_i && \text{Property DSA} \\
&= \sum_{i=1}^{t} 0 v_i && c_j \text{ as solution} \\
&= \sum_{i=1}^{t} \mathbf{0} && \text{Theorem ZSSM} \\
&= \mathbf{0} && \text{Property Z}
\end{align*}
That does it. $R$ has been undeniably shown to be a linearly dependent set. ■

Notice how the swap of the two summations is so much easier in the third step above, as opposed to all the rearranging and regrouping that takes place in the previous proof. And using only about half the space. And there are no ellipses (…).

Theorem SSLD can be viewed as a generalization of Theorem MVSLD. We know that $\mathbb{C}^m$ has a basis with $m$ vectors in it (Theorem SUVB), so it is a set of $m$ vectors that spans $\mathbb{C}^m$. By Theorem SSLD, any set of more than $m$ vectors from $\mathbb{C}^m$ will be linearly dependent. But this is exactly the conclusion we have in Theorem MVSLD. Maybe this is not a total shock, as the proofs of both theorems rely heavily on Theorem HMVEI. The beauty of Theorem SSLD is that it applies in any vector space. We illustrate the generality of this theorem, and hint at its power, in the next example.

Example LDP4  Linearly dependent set in $P_4$
In Example SSP4 we showed that
\[
S = \left\{ x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16 \right\}
\]
is a spanning set for $W = \{p(x) \mid p \in P_4,\ p(2) = 0\}$. So we can apply Theorem SSLD to $W$ with $t = 4$. Here is a set of five vectors from $W$, as you may check by verifying that each is a polynomial of degree 4 or less and has $x = 2$ as a root,
\begin{align*}
T &= \{p_1,\,p_2,\,p_3,\,p_4,\,p_5\} \subseteq W \\
p_1 &= x^4 - 2x^3 + 2x^2 - 8x + 8 \\
p_2 &= -x^3 + 6x^2 - 5x - 6 \\
p_3 &= 2x^4 - 5x^3 + 5x^2 - 7x + 2 \\
p_4 &= -x^4 + 4x^3 - 7x^2 + 6x \\
p_5 &= 4x^3 - 9x^2 + 5x - 6
\end{align*}
By Theorem SSLD we conclude that $T$ is linearly dependent, with no further computations. △
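As a computational aside (not part of the original example), an explicit nontrivial relation of linear dependence on $T$ can be produced by collecting the coefficients of each $p_k$, with respect to $\{1,\,x,\,x^2,\,x^3,\,x^4\}$, as the columns of a matrix and computing its null space. A minimal sketch in Python with SymPy; the coefficient encoding is our own convention:

```python
from sympy import Matrix

# Columns are the coefficient vectors of p1..p5 in the basis {1, x, x^2, x^3, x^4}
T = Matrix([
    [ 8, -6,  2,  0, -6],   # constant terms
    [-8, -5, -7,  6,  5],   # coefficients of x
    [ 2,  6,  5, -7, -9],   # coefficients of x^2
    [-2, -1, -5,  4,  4],   # coefficients of x^3
    [ 1,  0,  2, -1,  0],   # coefficients of x^4
])

# A nonzero null space vector c gives c1*p1 + ... + c5*p5 = 0.
# Since all five polynomials lie in the 4-dimensional W, the rank is
# at most 4 and the null space must be nonzero, as Theorem SSLD predicts.
print(T.nullspace())
```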

Theorem SSLD is indeed powerful, but our main purpose in proving it right now was to make sure that our definition of dimension (Definition D) is well-defined. Here is the theorem.

Theorem BIS  Bases have Identical Sizes
Suppose that $V$ is a vector space with a finite basis $B$ and a second basis $C$. Then $B$ and $C$ have the same size.

Proof. Suppose that $C$ has more vectors than $B$. (Allowing for the possibility that $C$ is infinite, we can replace $C$ by a subset that has more vectors than $B$.) As a basis, $B$ is a spanning set for $V$ (Definition B), so Theorem SSLD says that $C$ is linearly dependent. However, this contradicts the fact that as a basis $C$ is linearly independent (Definition B). So $C$ must also be a finite set, with size less than, or equal to, that of $B$.

Suppose that $B$ has more vectors than $C$. As a basis, $C$ is a spanning set for $V$ (Definition B), so Theorem SSLD says that $B$ is linearly dependent. However, this contradicts the fact that as a basis $B$ is linearly independent (Definition B). So $C$ cannot be strictly smaller than $B$.

The only possibility left for the sizes of $B$ and $C$ is for them to be equal. ■


Theorem BIS tells us that if we find one finite basis in a vector space, then they all have the same size. This (finally) makes Definition D unambiguous.

Subsection DVS
Dimension of Vector Spaces

We can now collect the dimension of some common, and not so common, vector spaces.

Theorem DCM  Dimension of $\mathbb{C}^m$
The dimension of $\mathbb{C}^m$ (Example VSCV) is $m$.

Proof. Theorem SUVB provides a basis with $m$ vectors. ■

Theorem DP  Dimension of $P_n$
The dimension of $P_n$ (Example VSP) is $n+1$.

Proof. Example BP provides two bases with $n+1$ vectors. Take your pick. ■

Theorem DM  Dimension of $M_{mn}$
The dimension of $M_{mn}$ (Example VSM) is $mn$.

Proof. Example BM provides a basis with $mn$ vectors. ■

Example DSM22  Dimension of a subspace of $M_{22}$
It should now be plausible that
\[
Z = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, 2a + b + 3c + 4d = 0,\ -a + 3b - 5c - 2d = 0 \right\}
\]
is a subspace of the vector space $M_{22}$ (Example VSM). (It is.) To find the dimension of $Z$ we must first find a basis, though any old basis will do.

First concentrate on the conditions relating $a$, $b$, $c$ and $d$. They form a homogeneous system of two equations in four variables with coefficient matrix
\[
\begin{bmatrix} 2 & 1 & 3 & 4 \\ -1 & 3 & -5 & -2 \end{bmatrix}
\]
We can row-reduce this matrix to obtain
\[
\begin{bmatrix} 1 & 0 & 2 & 2 \\ 0 & 1 & -1 & 0 \end{bmatrix}
\]
Rewrite the two equations represented by each row of this matrix, expressing the dependent variables ($a$ and $b$) in terms of the free variables ($c$ and $d$), and we obtain,
\begin{align*}
a &= -2c - 2d \\
b &= c
\end{align*}
We can now write a typical entry of $Z$ strictly in terms of $c$ and $d$, and we can decompose the result,
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} -2c - 2d & c \\ c & d \end{bmatrix}
= \begin{bmatrix} -2c & c \\ c & 0 \end{bmatrix} + \begin{bmatrix} -2d & 0 \\ 0 & d \end{bmatrix}
= c \begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix} + d \begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix}
\]
This equation says that an arbitrary matrix in $Z$ can be written as a linear combination of the two vectors in
\[
S = \left\{ \begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} \right\}
\]
so we know that
\[
Z = \langle S \rangle = \left\langle \left\{ \begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} \right\} \right\rangle
\]
Are these two matrices (vectors) also linearly independent? Begin with a relation of linear dependence on $S$,
\begin{align*}
a_1 \begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix} + a_2 \begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} &= \mathcal{O} \\
\begin{bmatrix} -2a_1 - 2a_2 & a_1 \\ a_1 & a_2 \end{bmatrix} &= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\end{align*}
From the equality of the two entries in the last row, we conclude that $a_1 = 0$, $a_2 = 0$. Thus the only possible relation of linear dependence is the trivial one, and therefore $S$ is linearly independent (Definition LI). So $S$ is a basis for $Z$ (Definition B). Finally, we can conclude that $\dim(Z) = 2$ (Definition D) since $S$ has two elements. △
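As a quick check (an aside, not part of the original text), the same basis can be produced mechanically: the null space of the coefficient matrix of the two conditions, with variable order $(a,\,b,\,c,\,d)$, has a basis whose vectors repackage as the matrices in $S$. A sketch in Python with SymPy:

```python
from sympy import Matrix

# Conditions 2a + b + 3c + 4d = 0 and -a + 3b - 5c - 2d = 0, variables (a, b, c, d)
coeff = Matrix([[ 2, 1,  3,  4],
                [-1, 3, -5, -2]])

basis = coeff.nullspace()      # one vector per free variable (c and d)
print(len(basis))              # 2, so dim(Z) = 2
for v in basis:
    print(v.reshape(2, 2))     # repackage (a, b, c, d) as a 2x2 matrix
```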

Example DSP4  Dimension of a subspace of $P_4$
In Example BSP4 we showed that
\[
S = \left\{ x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16 \right\}
\]
is a basis for $W = \{p(x) \mid p \in P_4,\ p(2) = 0\}$. Thus, the dimension of $W$ is four, $\dim(W) = 4$.

Note that $\dim(P_4) = 5$ by Theorem DP, so $W$ is a subspace of dimension 4 within the vector space $P_4$ of dimension 5, illustrating the upcoming Theorem PSSD. △

Example DC  Dimension of the crazy vector space
In Example BC we determined that the set $R = \{(1,\,0),\ (6,\,3)\}$ from the crazy vector space, $C$ (Example CVS), is a basis for $C$. By Definition D we see that $C$ has dimension 2, $\dim(C) = 2$. △

It is possible for a vector space to have no finite bases, in which case we say it has infinite dimension. Many of the best examples of this are vector spaces of functions, which lead to constructions like Hilbert spaces. We will focus exclusively on finite-dimensional vector spaces. OK, one infinite-dimensional example, and then we will focus exclusively on finite-dimensional vector spaces.

Example VSPUD  Vector space of polynomials with unbounded degree
Define the set $P$ by
\[
P = \{p \mid p(x) \text{ is a polynomial in } x\}
\]
Our operations will be the same as those defined for $P_n$ (Example VSP).

With no restrictions on the possible degrees of our polynomials, any finite set that is a candidate for spanning $P$ will come up short. We will give a proof by contradiction (Proof Technique CD). To this end, suppose that the dimension of $P$ is finite, say $\dim(P) = n$.

The set $T = \left\{ 1,\,x,\,x^2,\,\dots,\,x^n \right\}$ is a linearly independent set (check this!) containing $n+1$ polynomials from $P$. However, a basis of $P$ will be a spanning set of $P$ containing $n$ vectors. This situation is a contradiction of Theorem SSLD, so our assumption that $P$ has finite dimension is false. Thus, we say $\dim(P) = \infty$. △

Subsection RNM
Rank and Nullity of a Matrix

For any matrix, we have seen that we can associate several subspaces — the null space (Theorem NSMS), the column space (Theorem CSMS), row space (Theorem RSMS) and the left null space (Theorem LNSMS). As vector spaces, each of these has a dimension, and for the null space and column space, they are important enough to warrant names.

Definition NOM  Nullity Of a Matrix
Suppose that $A$ is an $m \times n$ matrix. Then the nullity of $A$ is the dimension of the null space of $A$, $n(A) = \dim(\mathcal{N}(A))$. □

Definition ROM  Rank Of a Matrix
Suppose that $A$ is an $m \times n$ matrix. Then the rank of $A$ is the dimension of the column space of $A$, $r(A) = \dim(\mathcal{C}(A))$. □

Example RNM  Rank and nullity of a matrix
Let us compute the rank and nullity of
\[
A = \begin{bmatrix}
2 & -4 & -1 & 3 & 2 & 1 & -4 \\
1 & -2 & 0 & 0 & 4 & 0 & 1 \\
-2 & 4 & 1 & 0 & -5 & -4 & -8 \\
1 & -2 & 1 & 1 & 6 & 1 & -3 \\
2 & -4 & -1 & 1 & 4 & -2 & -1 \\
-1 & 2 & 3 & -1 & 6 & 3 & -1
\end{bmatrix}
\]
To do this, we will first row-reduce the matrix since that will help us determine bases for the null space and column space.
\[
\begin{bmatrix}
1 & -2 & 0 & 0 & 4 & 0 & 1 \\
0 & 0 & 1 & 0 & 3 & 0 & -2 \\
0 & 0 & 0 & 1 & -1 & 0 & -3 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
From this row-equivalent matrix in reduced row-echelon form we record $D = \{1,\,3,\,4,\,6\}$ and $F = \{2,\,5,\,7\}$.

For each index in $D$, Theorem BCS creates a single basis vector. In total the basis will have 4 vectors, so the column space of $A$ will have dimension 4 and we write $r(A) = 4$.

For each index in $F$, Theorem BNS creates a single basis vector. In total the basis will have 3 vectors, so the null space of $A$ will have dimension 3 and we write $n(A) = 3$. △
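In practice these computations are a one-liner. Here is a quick verification, an aside in Python with SymPy; note how the two answers also sum to the number of columns, anticipating Theorem RPNC below.

```python
from sympy import Matrix

A = Matrix([
    [ 2, -4, -1,  3,  2,  1, -4],
    [ 1, -2,  0,  0,  4,  0,  1],
    [-2,  4,  1,  0, -5, -4, -8],
    [ 1, -2,  1,  1,  6,  1, -3],
    [ 2, -4, -1,  1,  4, -2, -1],
    [-1,  2,  3, -1,  6,  3, -1],
])

rank = A.rank()             # dimension of the column space: 4
nullity = A.cols - rank     # dimension of the null space: 3
rref, pivots = A.rref()     # pivots are D, 0-indexed: (0, 2, 3, 5)
print(rank, nullity, pivots)
```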

There were no accidents or coincidences in the previous example — with the row-reduced version of a matrix in hand, the rank and nullity are easy to compute.

Theorem CRN  Computing Rank and Nullity
Suppose that $A$ is an $m \times n$ matrix and $B$ is a row-equivalent matrix in reduced row-echelon form. Let $r$ denote the number of pivot columns (or the number of nonzero rows). Then $r(A) = r$ and $n(A) = n - r$.

Proof. Theorem BCS provides a basis for the column space by choosing columns of $A$ that have the same indices as the pivot columns of $B$. In the analysis of $B$, each leading 1 provides one nonzero row and one pivot column. So there are $r$ column vectors in a basis for $\mathcal{C}(A)$.

Theorem BNS provides a basis for the null space by creating basis vectors of the null space of $A$ from entries of $B$, one basis vector for each column that is not a pivot column. So there are $n - r$ column vectors in a basis for $\mathcal{N}(A)$. ■

Every archetype (Archetypes) that involves a matrix lists its rank and nullity. You may have noticed as you studied the archetypes that the larger the column space is the smaller the null space is. A simple corollary states this trade-off succinctly. (See Proof Technique LC.)

Theorem RPNC  Rank Plus Nullity is Columns
Suppose that $A$ is an $m \times n$ matrix. Then $r(A) + n(A) = n$.

Proof. Let $r$ be the number of nonzero rows in a row-equivalent matrix in reduced row-echelon form. By Theorem CRN,
\[
r(A) + n(A) = r + (n - r) = n \qquad \blacksquare
\]


When we first introduced $r$ as our standard notation for the number of nonzero rows in a matrix in reduced row-echelon form you might have thought $r$ stood for “rows.” Not really — it stands for “rank”!

Subsection RNNM
Rank and Nullity of a Nonsingular Matrix

Let us take a look at the rank and nullity of a square matrix.

Example RNSM  Rank and nullity of a square matrix
The matrix
\[
E = \begin{bmatrix}
0 & 4 & -1 & 2 & 2 & 3 & 1 \\
2 & -2 & 1 & -1 & 0 & -4 & -3 \\
-2 & -3 & 9 & -3 & 9 & -1 & 9 \\
-3 & -4 & 9 & 4 & -1 & 6 & -2 \\
-3 & -4 & 6 & -2 & 5 & 9 & -4 \\
9 & -3 & 8 & -2 & -4 & 2 & 4 \\
8 & 2 & 2 & 9 & 3 & 0 & 9
\end{bmatrix}
\]
is row-equivalent to the matrix in reduced row-echelon form,
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
With $n = 7$ columns and $r = 7$ nonzero rows Theorem CRN tells us the rank is $r(E) = 7$ and the nullity is $n(E) = 7 - 7 = 0$. △
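Again, an optional computational aside (Python with SymPy): for a square matrix the rank computation doubles as a nonsingularity test, which is exactly the content of Theorem RNNM below.

```python
from sympy import Matrix

E = Matrix([
    [ 0,  4, -1,  2,  2,  3,  1],
    [ 2, -2,  1, -1,  0, -4, -3],
    [-2, -3,  9, -3,  9, -1,  9],
    [-3, -4,  9,  4, -1,  6, -2],
    [-3, -4,  6, -2,  5,  9, -4],
    [ 9, -3,  8, -2, -4,  2,  4],
    [ 8,  2,  2,  9,  3,  0,  9],
])

print(E.rank() == E.rows)     # True: full rank, so E is nonsingular
print(E.nullspace() == [])    # True: nullity 0, the equivalent condition
```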

The value of either the nullity or the rank is enough to characterize a nonsingular matrix.

Theorem RNNM  Rank and Nullity of a Nonsingular Matrix
Suppose that $A$ is a square matrix of size $n$. The following are equivalent.

1. $A$ is nonsingular.
2. The rank of $A$ is $n$, $r(A) = n$.
3. The nullity of $A$ is zero, $n(A) = 0$.

Proof. (1 ⇒ 2) Theorem CSNM says that if $A$ is nonsingular then $\mathcal{C}(A) = \mathbb{C}^n$. If $\mathcal{C}(A) = \mathbb{C}^n$, then the column space has dimension $n$ by Theorem DCM, so the rank of $A$ is $n$.

(2 ⇒ 3) Suppose $r(A) = n$. Then Theorem RPNC gives
\begin{align*}
n(A) &= n - r(A) && \text{Theorem RPNC} \\
&= n - n && \text{Hypothesis} \\
&= 0
\end{align*}

(3 ⇒ 1) Suppose $n(A) = 0$, so a basis for the null space of $A$ is the empty set. This implies that $\mathcal{N}(A) = \{\mathbf{0}\}$ and Theorem NMTNS says $A$ is nonsingular. ■

With a new equivalence for a nonsingular matrix, we can update our list of equivalences (Theorem NME5) which now becomes a list requiring double digits to number.

Theorem NME6  Nonsingular Matrix Equivalences, Round 6
Suppose that $A$ is a square matrix of size $n$. The following are equivalent.

1. $A$ is nonsingular.
2. $A$ row-reduces to the identity matrix.
3. The null space of $A$ contains only the zero vector, $\mathcal{N}(A) = \{\mathbf{0}\}$.
4. The linear system $\mathcal{LS}(A,\,b)$ has a unique solution for every possible choice of $b$.
5. The columns of $A$ are a linearly independent set.
6. $A$ is invertible.
7. The column space of $A$ is $\mathbb{C}^n$, $\mathcal{C}(A) = \mathbb{C}^n$.
8. The columns of $A$ are a basis for $\mathbb{C}^n$.
9. The rank of $A$ is $n$, $r(A) = n$.
10. The nullity of $A$ is zero, $n(A) = 0$.

Proof. Building on Theorem NME5 we can add two of the statements from Theorem RNNM. ■

Reading Questions

1. What is the dimension of the vector space $P_6$, the set of all polynomials of degree 6 or less?
2. How are the rank and nullity of a matrix related?
3. Explain why we might say that a nonsingular matrix has “full rank.”


Exercises

C20  The archetypes listed below are matrices, or systems of equations with coefficient matrices. For each, compute the nullity and rank of the matrix. This information is listed for each archetype (along with the number of columns in the matrix, so as to illustrate Theorem RPNC), and notice how it could have been computed immediately after the determination of the sets $D$ and $F$ associated with the reduced row-echelon form of the matrix.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L

C21†  Find the dimension of the subspace $W = \left\{ \begin{bmatrix} a + b \\ a + c \\ a + d \\ d \end{bmatrix} \,\middle|\, a,\,b,\,c,\,d \in \mathbb{C} \right\}$ of $\mathbb{C}^4$.

C22†  Find the dimension of the subspace $W = \left\{ a + bx + cx^2 + dx^3 \,\middle|\, a + b + c + d = 0 \right\}$ of $P_3$.

C23†  Find the dimension of the subspace $W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a + b = c,\ b + c = d,\ c + d = a \right\}$ of $M_{22}$.

C30†  For the matrix $A$ below, compute the dimension of the null space of $A$, $\dim(\mathcal{N}(A))$.
\[
A = \begin{bmatrix}
2 & -1 & -3 & 11 & 9 \\
1 & 2 & 1 & -7 & -3 \\
3 & 1 & -3 & 6 & 8 \\
2 & 1 & 2 & -5 & -3
\end{bmatrix}
\]

C31†  The set $W$ below is a subspace of $\mathbb{C}^4$. Find the dimension of $W$.
\[
W = \left\langle \left\{
\begin{bmatrix} 2 \\ -3 \\ 4 \\ 1 \end{bmatrix},\
\begin{bmatrix} 3 \\ 0 \\ 1 \\ -2 \end{bmatrix},\
\begin{bmatrix} -4 \\ -3 \\ 2 \\ 5 \end{bmatrix}
\right\} \right\rangle
\]

C35†  Find the rank and nullity of the matrix $A = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 2 \\ 2 & 1 & 1 \\ -1 & 0 & 1 \\ 1 & 1 & 2 \end{bmatrix}$.

C36†  Find the rank and nullity of the matrix $A = \begin{bmatrix} 1 & 2 & 1 & 1 & 1 \\ 1 & 3 & 2 & 0 & 4 \\ 1 & 2 & 1 & 1 & 1 \end{bmatrix}$.

C37†  Find the rank and nullity of the matrix $A = \begin{bmatrix} 3 & 2 & 1 & 1 & 1 \\ 2 & 3 & 0 & 1 & 1 \\ -1 & 1 & 2 & 1 & 0 \\ 1 & 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 2 & -1 \end{bmatrix}$.

C40  In Example LDP4 we determined that the set of five polynomials, $T$, is linearly dependent by a simple invocation of Theorem SSLD. Prove that $T$ is linearly dependent from scratch, beginning with Definition LI.

M20†  $M_{22}$ is the vector space of $2 \times 2$ matrices. Let $S_{22}$ denote the set of all $2 \times 2$ symmetric matrices. That is
\[
S_{22} = \left\{ A \in M_{22} \mid A^t = A \right\}
\]
1. Show that $S_{22}$ is a subspace of $M_{22}$.
2. Exhibit a basis for $S_{22}$ and prove that it has the required properties.
3. What is the dimension of $S_{22}$?

M21†  A $2 \times 2$ matrix $B$ is upper triangular if $[B]_{21} = 0$. Let $UT_2$ be the set of all $2 \times 2$ upper triangular matrices. Then $UT_2$ is a subspace of the vector space of all $2 \times 2$ matrices, $M_{22}$ (you may assume this). Determine the dimension of $UT_2$, providing all of the necessary justifications for your answer.


Section PD
Properties of Dimension

Once the dimension of a vector space is known, then the determination of whether or not a set of vectors is linearly independent, or if it spans the vector space, can often be much easier. In this section we will state a workhorse theorem and then apply it to the column space and row space of a matrix. It will also help us describe a super-basis for $\mathbb{C}^m$.

Subsection GT
Goldilocks’ Theorem

We begin with a useful theorem that we will need later, and in the proof of the main theorem in this subsection. This theorem says that we can extend linearly independent sets, one vector at a time, by adding vectors from outside the span of the linearly independent set, all the while preserving the linear independence of the set.
set.<br />

Theorem ELIS Extend<strong>in</strong>g L<strong>in</strong>early Independent Sets<br />

Suppose V is a vector space and S is a l<strong>in</strong>early <strong>in</strong>dependent set of vectors from V .<br />

Suppose w is a vector such that w ∉〈S〉. Then the set S ′ = S ∪{w} is l<strong>in</strong>early<br />

<strong>in</strong>dependent.<br />

Proof. Suppose S = {v 1 , v 2 , v 3 , ..., v m } and beg<strong>in</strong> with a relation of l<strong>in</strong>ear dependence<br />

on S ′ ,<br />

a 1 v 1 + a 2 v 2 + a 3 v 3 + ···+ a m v m + a m+1 w = 0.<br />

There are two cases to consider. <strong>First</strong> suppose that a m+1 = 0. Then the relation<br />

of l<strong>in</strong>ear dependence on S ′ becomes<br />

a 1 v 1 + a 2 v 2 + a 3 v 3 + ···+ a m v m = 0.<br />

and by the l<strong>in</strong>ear <strong>in</strong>dependence of the set S, we conclude that a 1 = a 2 = a 3 = ···=<br />

a m = 0. So all of the scalars <strong>in</strong> the relation of l<strong>in</strong>ear dependence on S ′ are zero.<br />

In the second case, suppose that a m+1 ≠ 0. Then the relation of l<strong>in</strong>ear dependence<br />

on S ′ becomes<br />

a m+1 w = −a 1 v 1 − a 2 v 2 − a 3 v 3 −···−a m v m<br />

w = − a 1<br />

a m+1<br />

v 1 − a 2<br />

a m+1<br />

v 2 − a 3<br />

a m+1<br />

v 3 −···−<br />

a m<br />

a m+1<br />

v m<br />

This equation expresses w as a l<strong>in</strong>ear comb<strong>in</strong>ation of the vectors <strong>in</strong> S, contrary<br />

to the assumption that w ∉〈S〉, so this case leads to a contradiction.<br />

The first case yielded only a trivial relation of l<strong>in</strong>ear dependence on S ′ and the<br />

second case led to a contradiction. So S ′ is a l<strong>in</strong>early <strong>in</strong>dependent set s<strong>in</strong>ce any<br />

331


§PD Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 332<br />

relation of l<strong>in</strong>ear dependence is trivial.<br />

<br />
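Computationally, the membership test $w \notin \langle S \rangle$ in Theorem ELIS amounts to a rank comparison: appending $w$ to a matrix whose columns are the vectors of $S$ increases the rank exactly when $w$ lies outside the span. A brief illustration in Python with SymPy, where the specific vectors are just an assumed example:

```python
from sympy import Matrix

# Columns of M form a linearly independent set S in C^3
M = Matrix([[1, 0],
            [0, 1],
            [0, 0]])
w = Matrix([0, 0, 1])

# w lies outside <S> exactly when adjoining it increases the rank,
# so by Theorem ELIS the enlarged set stays linearly independent
extended = M.row_join(w)
print(M.rank(), extended.rank())          # 2 3
print(extended.rank() == M.rank() + 1)    # True: S U {w} is independent
```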

In the story Goldilocks and the Three Bears, the young girl Goldilocks visits the empty house of the three bears while out walking in the woods. One bowl of porridge is too hot, the other too cold, the third is just right. One chair is too hard, one too soft, the third is just right. So it is with sets of vectors — some are too big (linearly dependent), some are too small (they do not span), and some are just right (bases). Here is Goldilocks’ Theorem.

Theorem G  Goldilocks
Suppose that $V$ is a vector space of dimension $t$. Let $S = \{v_1,\,v_2,\,v_3,\,\dots,\,v_m\}$ be a set of vectors from $V$. Then

1. If $m > t$, then $S$ is linearly dependent.
2. If $m < t$, then $S$ does not span $V$.
3. If $m = t$ and $S$ is linearly independent, then $S$ spans $V$.
4. If $m = t$ and $S$ spans $V$, then $S$ is linearly independent.

There is a tension in the construction of a basis. Make a set too big and you will end up with relations of linear dependence among the vectors. Make a set too small and you will not have enough raw material to span the entire vector space. Make a set just the right size (the dimension) and you only need to have linear independence or spanning, and you get the other property for free. These roughly-stated ideas are made precise by Theorem G.

The structure and proof of this theorem also deserve comment. The hypotheses seem innocuous. We presume we know the dimension of the vector space in hand, then we mostly just look at the size of the set $S$. From this we get big conclusions about spanning and linear independence. Each of the four proofs relies on ultimately contradicting Theorem SSLD, so in a way we could think of this entire theorem as a corollary of Theorem SSLD. (See Proof Technique LC.) The proofs of the third and fourth parts parallel each other in style: introduce $w$ using Theorem ELIS or toss $v_k$ using Theorem DLDS. Then obtain a contradiction to Theorem SSLD.

Theorem G is useful in both concrete examples and as a tool in other proofs. We will use it often to bypass verifying linear independence or spanning.

Example BPR  Bases for $P_n$, reprised
In Example BP we claimed that
\begin{align*}
B &= \left\{ 1,\,x,\,x^2,\,x^3,\,\dots,\,x^n \right\} \\
C &= \left\{ 1,\ 1+x,\ 1+x+x^2,\ 1+x+x^2+x^3,\ \dots,\ 1+x+x^2+x^3+\cdots+x^n \right\}
\end{align*}
were both bases for $P_n$ (Example VSP). Suppose we had first verified that $B$ was a basis, so we would then know that $\dim(P_n) = n + 1$. The size of $C$ is $n + 1$, the right size to be a basis. We could then verify that $C$ is linearly independent. We would not have to make any special efforts to prove that $C$ spans $P_n$, since Theorem G would allow us to conclude this property of $C$ directly. Then we would be able to say that $C$ is a basis of $P_n$ also. △

Example BDM22  Basis by dimension in $M_{22}$
In Example DSM22 we showed that
\[
B = \left\{ \begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} \right\}
\]
is a basis for the subspace $Z$ of $M_{22}$ (Example VSM) given by
\[
Z = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, 2a + b + 3c + 4d = 0,\ -a + 3b - 5c - 2d = 0 \right\}
\]
This tells us that $\dim(Z) = 2$. In this example we will find another basis. We can construct two new matrices in $Z$ by forming linear combinations of the matrices in $B$.
\begin{align*}
2\begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix} + (-3)\begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} &= \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix} \\
3\begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix} + 1\begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} &= \begin{bmatrix} -8 & 3 \\ 3 & 1 \end{bmatrix}
\end{align*}
Then the set
\[
C = \left\{ \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix},\ \begin{bmatrix} -8 & 3 \\ 3 & 1 \end{bmatrix} \right\}
\]
has the right size to be a basis of $Z$. Let us see if it is a linearly independent set. The relation of linear dependence
\begin{align*}
a_1 \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix} + a_2 \begin{bmatrix} -8 & 3 \\ 3 & 1 \end{bmatrix} &= \mathcal{O} \\
\begin{bmatrix} 2a_1 - 8a_2 & 2a_1 + 3a_2 \\ 2a_1 + 3a_2 & -3a_1 + a_2 \end{bmatrix} &= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\end{align*}
leads to the homogeneous system of equations whose coefficient matrix
\[
\begin{bmatrix} 2 & -8 \\ 2 & 3 \\ 2 & 3 \\ -3 & 1 \end{bmatrix}
\]
row-reduces to
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
So with $a_1 = a_2 = 0$ as the only solution, the set is linearly independent. Now we can apply Theorem G to see that $C$ also spans $Z$ and therefore is a second basis for $Z$. △

Example SVP4  Sets of vectors in $P_4$
In Example BSP4 we showed that
\[
B = \left\{ x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16 \right\}
\]
is a basis for $W = \{p(x) \mid p \in P_4,\ p(2) = 0\}$. So $\dim(W) = 4$.

The set
\[
\left\{ 3x^2 - 5x - 2,\ 2x^2 - 7x + 6,\ x^3 - 2x^2 + x - 2 \right\}
\]
is a subset of $W$ (check this) and it happens to be linearly independent (check this, too). However, by Theorem G it cannot span $W$.

The set
\[
\left\{ 3x^2 - 5x - 2,\ 2x^2 - 7x + 6,\ x^3 - 2x^2 + x - 2,\ -x^4 + 2x^3 + 5x^2 - 10x,\ x^4 - 16 \right\}
\]
is another subset of $W$ (check this) and Theorem G tells us that it must be linearly dependent.

The set
\[
\left\{ x - 2,\ x^2 - 2x,\ x^3 - 2x^2,\ x^4 - 2x^3 \right\}
\]
is a third subset of $W$ (check this) and is linearly independent (check this). Since it has the right size to be a basis, and is linearly independent, Theorem G tells us that it also spans $W$, and therefore is a basis of $W$. △
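The independence checks that Theorem G leaves to us are routine on a computer. As an aside (Python with SymPy; the encoding of polynomials as coefficient columns is our own convention), the third subset can be confirmed to be a basis of $W$ in the style of Theorem G: it has size $4 = \dim(W)$ and full column rank.

```python
from sympy import Matrix

# Columns: x - 2, x^2 - 2x, x^3 - 2x^2, x^4 - 2x^3 as coefficient
# vectors with respect to {1, x, x^2, x^3, x^4}
M = Matrix([
    [-2,  0,  0,  0],
    [ 1, -2,  0,  0],
    [ 0,  1, -2,  0],
    [ 0,  0,  1, -2],
    [ 0,  0,  0,  1],
])

# Full column rank means the set is linearly independent; since it has
# dim(W) = 4 vectors and lies inside W, Theorem G upgrades it to a basis of W
print(M.rank() == 4)   # True
```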

A simple consequence of Theorem G is the observation that a proper subspace has strictly smaller dimension than its parent vector space. Hopefully this may seem intuitively obvious, but it still requires proof, and we will cite this result later.

Theorem PSSD  Proper Subspaces have Smaller Dimension
Suppose that $U$ and $V$ are subspaces of the vector space $W$, such that $U \subsetneq V$. Then $\dim(U) < \dim(V)$.

Proof. Suppose that $\dim(U) = m$ and $\dim(V) = t$. Then $U$ has a basis $B$ of size $m$. If $m > t$, then by Theorem G, $B$ is linearly dependent, which is a contradiction. If $m = t$, then by Theorem G, $B$ spans $V$. Then $U = \langle B \rangle = V$, also a contradiction. All that remains is that $m < t$, which is the desired conclusion. ■

Theorem RMRT  Rank of a Matrix is the Rank of the Transpose
Suppose $A$ is an $m \times n$ matrix. Then $r(A) = r\!\left(A^t\right)$.

Proof. Suppose we row-reduce $A$ to the matrix $B$ in reduced row-echelon form, and $B$ has $r$ nonzero rows. The quantity $r$ tells us three things about $B$: the number of leading 1’s, the number of nonzero rows and the number of pivot columns. For this proof we will be interested in the latter two.

Theorem BRS and Theorem BCS each has a conclusion that provides a basis, for the row space and the column space, respectively. In each case, these bases contain $r$ vectors. This observation makes the following go.
\begin{align*}
r(A) &= \dim(\mathcal{C}(A)) && \text{Definition ROM} \\
&= r && \text{Theorem BCS} \\
&= \dim(\mathcal{R}(A)) && \text{Theorem BRS} \\
&= \dim\!\left(\mathcal{C}\!\left(A^t\right)\right) && \text{Theorem CSRST} \\
&= r\!\left(A^t\right) && \text{Definition ROM}
\end{align*}
Jacob Linenthal helped with this proof. ■

This says that the row space and the column space of a matrix have the same dimension, which should be very surprising. It does not say that the column space and the row space are identical. Indeed, if the matrix is not square, then the sizes (number of slots) of the vectors in each space are different, so the sets are not even comparable.

It is not hard to construct by yourself examples of matrices that illustrate Theorem RMRT, since it applies equally well to any matrix. Grab a matrix, row-reduce it, count the nonzero rows or the number of pivot columns. That is the rank. Transpose the matrix, row-reduce that, count the nonzero rows or the pivot columns. That is the rank of the transpose. The theorem says the two will be equal. Every time. Here is an example anyway.

Example RRTI  Rank, rank of transpose, Archetype I
Archetype I has a $4 \times 7$ coefficient matrix which row-reduces to
\[
\begin{bmatrix}
1 & 4 & 0 & 0 & 2 & 1 & -3 \\
0 & 0 & 1 & 0 & 1 & -3 & 5 \\
0 & 0 & 0 & 1 & 2 & -6 & 6 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
so the rank is 3. Row-reducing the transpose yields
\[
\begin{bmatrix}
1 & 0 & 0 & -\tfrac{31}{7} \\
0 & 1 & 0 & \tfrac{12}{7} \\
0 & 0 & 1 & \tfrac{13}{7} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]
demonstrating that the rank of the transpose is also 3. △
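A one-line check of Theorem RMRT on any matrix you like, sketched in Python with SymPy (the random matrix is just a stand-in for "grab a matrix"):

```python
from sympy import randMatrix

A = randMatrix(4, 7, min=-5, max=5)   # any matrix will do
print(A.rank() == A.T.rank())         # True, every time (Theorem RMRT)
```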

Subsection DFS
Dimension of Four Subspaces

That the rank of a matrix equals the rank of its transpose is a fundamental and surprising result. However, applying Theorem FS we can easily determine the dimension of all four fundamental subspaces associated with a matrix.

Theorem DFS  Dimensions of Four Subspaces
Suppose that $A$ is an $m \times n$ matrix, and $B$ is a row-equivalent matrix in reduced row-echelon form with $r$ nonzero rows. Then

1. $\dim(\mathcal{N}(A)) = n - r$
2. $\dim(\mathcal{C}(A)) = r$
3. $\dim(\mathcal{R}(A)) = r$
4. $\dim(\mathcal{L}(A)) = m - r$

Proof. If $A$ row-reduces to a matrix in reduced row-echelon form with $r$ nonzero rows, then the matrix $C$ of extended echelon form (Definition EEF) will be an $r \times n$ matrix in reduced row-echelon form with no zero rows and $r$ pivot columns (Theorem PEEF). Similarly, the matrix $L$ of extended echelon form (Definition EEF) will be an $(m - r) \times m$ matrix in reduced row-echelon form with no zero rows and $m - r$ pivot columns (Theorem PEEF).
\begin{align*}
\dim(\mathcal{N}(A)) &= \dim(\mathcal{N}(C)) && \text{Theorem FS} \\
&= n - r && \text{Theorem BNS} \\[1ex]
\dim(\mathcal{C}(A)) &= \dim(\mathcal{N}(L)) && \text{Theorem FS} \\
&= m - (m - r) && \text{Theorem BNS} \\
&= r \\[1ex]
\dim(\mathcal{R}(A)) &= \dim(\mathcal{R}(C)) && \text{Theorem FS} \\
&= r && \text{Theorem BRS} \\[1ex]
\dim(\mathcal{L}(A)) &= \dim(\mathcal{R}(L)) && \text{Theorem FS} \\
&= m - r && \text{Theorem BRS}
\end{align*}
■

There are many different ways to state and prove this result, and indeed, the equality of the dimensions of the column space and row space is just a slight expansion of Theorem RMRT. However, we have restricted our techniques to applying Theorem FS and then determining dimensions with bases provided by Theorem BNS and Theorem BRS. This provides an appealing symmetry to the results and the proof.
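In computational terms, all four dimensions fall out of a single rank computation. A sketch, assuming SymPy and an arbitrary example matrix of our own choosing:

```python
from sympy import Matrix

# An arbitrary example matrix (any m x n matrix works the same way)
A = Matrix([[1, 2, 0, 3],
            [2, 4, 1, 7],
            [1, 2, 1, 4]])
m, n = A.rows, A.cols
r = A.rank()

print("dim N(A) =", n - r)   # null space
print("dim C(A) =", r)       # column space
print("dim R(A) =", r)       # row space
print("dim L(A) =", m - r)   # left null space
```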

Reading Questions

1. Why does Theorem G have the title it does?
2. Why is Theorem RMRT so surprising?
3. Row-reduce the matrix $A$ to reduced row-echelon form. Without any further computations, compute the dimensions of the four subspaces, (a) $\mathcal{N}(A)$, (b) $\mathcal{C}(A)$, (c) $\mathcal{R}(A)$ and (d) $\mathcal{L}(A)$.
\[
A = \begin{bmatrix}
1 & -1 & 2 & 8 & 5 \\
1 & 1 & 1 & 4 & -1 \\
0 & 2 & -3 & -8 & -6 \\
2 & 0 & 1 & 8 & 4
\end{bmatrix}
\]

Exercises

C10  Example SVP4 leaves several details for the reader to check. Verify these five claims.

C40†  Determine if the set $T = \left\{ x^2 - x + 5,\ 4x^3 - x^2 + 5x,\ 3x + 2 \right\}$ spans the vector space of polynomials with degree 4 or less, $P_4$. (Compare the solution to this exercise with Solution LISS.C40.)

T05  Trivially, if $U$ and $V$ are two subspaces of $W$ with $U = V$, then $\dim(U) = \dim(V)$. Combine this fact, Theorem PSSD, and Theorem EDYES all into one grand combined theorem. You might look to Theorem PIP for stylistic inspiration. (Notice this problem does not ask you to prove anything. It just asks you to roll up three theorems into one compact, logically equivalent statement.)

T10  Prove the following theorem, which could be viewed as a reformulation of parts (3) and (4) of Theorem G, or more appropriately as a corollary of Theorem G (Proof Technique LC). Suppose $V$ is a vector space and $S$ is a subset of $V$ such that the number of vectors in $S$ equals the dimension of $V$. Then $S$ is linearly independent if and only if $S$ spans $V$.

T15  Suppose that $A$ is an $m \times n$ matrix and let $\min(m,\,n)$ denote the minimum of $m$ and $n$. Prove that $r(A) \le \min(m,\,n)$. (If $m$ and $n$ are two numbers, then $\min(m,\,n)$ stands for the number that is the smaller of the two. For example $\min(4,\,6) = 4$.)

T20†  Suppose that $A$ is an $m \times n$ matrix and $b \in \mathbb{C}^m$. Prove that the linear system $\mathcal{LS}(A,\,b)$ is consistent if and only if $r(A) = r([\,A \mid b\,])$.

T25  Suppose that $V$ is a vector space with finite dimension. Let $W$ be any subspace of $V$. Prove that $W$ has finite dimension.

T33†  Part of Exercise B.T50 is the half of the proof where we assume the matrix $A$ is nonsingular and prove that a set is a basis. In Solution B.T50 we proved directly that the set was both linearly independent and a spanning set. Shorten this part of the proof by applying Theorem G. Be careful, there is one subtlety.

T60†  Suppose that $W$ is a vector space with dimension 5, and $U$ and $V$ are subspaces of $W$, each of dimension 3. Prove that $U \cap V$ contains a nonzero vector. State a more general result.


Chapter D
Determinants

The determinant is a function that takes a square matrix as an input and produces a scalar as an output. So unlike a vector space, it is not an algebraic structure. However, it has many beneficial properties for studying vector spaces, matrices and systems of equations, so it is hard to ignore (though some have tried). While the properties of a determinant can be very useful, they are also complicated to prove.

Section DM
Determinant of a Matrix

Before we define the determinant of a matrix, we take a slight detour to introduce elementary matrices. These will bring us back to the beginning of the course and our old friend, row operations.

Subsection EM
Elementary Matrices

Elementary matrices are very simple, as you might have suspected from their name. Their purpose is to effect row operations (Definition RO) on a matrix through matrix multiplication (Definition MM). Their definitions look much more complicated than they really are, so be sure to skip over them on your first reading and head right for the explanation that follows and the first example.

Definition ELEM  Elementary Matrices

1. For $i \ne j$, $E_{i,j}$ is the square matrix of size $n$ with
\[
[E_{i,j}]_{kl} =
\begin{cases}
0 & k \ne i, k \ne j, l \ne k \\
1 & k \ne i, k \ne j, l = k \\
0 & k = i, l \ne j \\
1 & k = i, l = j \\
0 & k = j, l \ne i \\
1 & k = j, l = i
\end{cases}
\]

2. For $\alpha \ne 0$, $E_i(\alpha)$ is the square matrix of size $n$ with
\[
[E_i(\alpha)]_{kl} =
\begin{cases}
0 & l \ne k \\
1 & k \ne i, l = k \\
\alpha & k = i, l = i
\end{cases}
\]

3. For $i \ne j$, $E_{i,j}(\alpha)$ is the square matrix of size $n$ with
\[
[E_{i,j}(\alpha)]_{kl} =
\begin{cases}
0 & k \ne j, l \ne k \\
1 & k \ne j, l = k \\
0 & k = j, l \ne i, l \ne j \\
1 & k = j, l = j \\
\alpha & k = j, l = i
\end{cases}
\]
□

Again, these matrices are not as complicated as their definitions suggest, since they are just small perturbations of the $n \times n$ identity matrix (Definition IM). $E_{i,j}$ is the identity matrix with rows (or columns) $i$ and $j$ trading places, $E_i(\alpha)$ is the identity matrix where the diagonal entry in row $i$ and column $i$ has been replaced by $\alpha$, and $E_{i,j}(\alpha)$ is the identity matrix where the entry in row $j$ and column $i$ has been replaced by $\alpha$. (Yes, those subscripts look backwards in the description of $E_{i,j}(\alpha)$.) Notice that our notation makes no reference to the size of the elementary matrix, since this will always be apparent from the context, or unimportant.

The raison d’etre for elementary matrices is to “do” row operations on matrices with matrix multiplication. So here is an example where we will both see some elementary matrices and see how they accomplish row operations when used with matrix multiplication.

Example EMRO  Elementary matrices and row operations
We will perform a sequence of row operations (Definition RO) on the $3 \times 4$ matrix $A$, while also multiplying the matrix on the left by the appropriate $3 \times 3$ elementary matrix.
\[
A = \begin{bmatrix} 2 & 1 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 5 & 0 & 3 & 1 \end{bmatrix}
\]
\begin{align*}
R_1 \leftrightarrow R_3,\ E_{1,3}:&&
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 1 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 5 & 0 & 3 & 1 \end{bmatrix}
&= \begin{bmatrix} 5 & 0 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 2 & 1 & 3 & 1 \end{bmatrix} \\
2R_2,\ E_2(2):&&
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 5 & 0 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 2 & 1 & 3 & 1 \end{bmatrix}
&= \begin{bmatrix} 5 & 0 & 3 & 1 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix} \\
2R_3 + R_1,\ E_{3,1}(2):&&
\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 5 & 0 & 3 & 1 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix}
&= \begin{bmatrix} 9 & 2 & 9 & 3 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix}
\end{align*}
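A computational aside (Python with SymPy; `eye` builds an identity matrix): each elementary matrix can be constructed by performing its row operation on the identity, and left-multiplication then reproduces the row operations of this example.

```python
from sympy import Matrix, eye

A = Matrix([[2, 1, 3, 1],
            [1, 3, 2, 4],
            [5, 0, 3, 1]])

# Build each elementary matrix by doing the row operation to the identity
E13 = eye(3)                 # swap rows 1 and 3
E13[0, 0] = E13[2, 2] = 0
E13[0, 2] = E13[2, 0] = 1

E2 = eye(3)                  # multiply row 2 by 2
E2[1, 1] = 2

E31 = eye(3)                 # 2 times row 3 added to row 1
E31[0, 2] = 2

# Left-multiplication performs the row operations, as in Theorem EMDRO
print(E31 * E2 * E13 * A)    # [[9, 2, 9, 3], [2, 6, 4, 8], [2, 1, 3, 1]]
```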

The next three theorems establish that each elementary matrix effects a row operation via matrix multiplication.

Theorem EMDRO  Elementary Matrices Do Row Operations
Suppose that $A$ is an $m \times n$ matrix, and $B$ is a matrix of the same size that is obtained from $A$ by a single row operation (Definition RO). Then there is an elementary matrix of size $m$ that will convert $A$ to $B$ via matrix multiplication on the left. More precisely,

1. If the row operation swaps rows $i$ and $j$, then $B = E_{i,j}A$.
2. If the row operation multiplies row $i$ by $\alpha$, then $B = E_i(\alpha)\,A$.
3. If the row operation multiplies row $i$ by $\alpha$ and adds the result to row $j$, then $B = E_{i,j}(\alpha)\,A$.

Proof. In each of the three conclusions, performing the row operation on $A$ will create the matrix $B$ where only one or two rows will have changed. So we will establish the equality of the matrix entries row by row, first for the unchanged rows, then for the changed rows, showing in each case that the result of the matrix product is the same as the result of the row operation. Here we go.

Row $k$ of the product $E_{i,j}A$, where $k \ne i$, $k \ne j$, is unchanged from $A$,
\begin{align*}
[E_{i,j}A]_{kl} &= \sum_{p=1}^{n} [E_{i,j}]_{kp} [A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}]_{kk} [A]_{kl} + \sum_{\substack{p=1 \\ p \ne k}}^{n} [E_{i,j}]_{kp} [A]_{pl} && \text{Property CACN} \\
&= 1[A]_{kl} + \sum_{\substack{p=1 \\ p \ne k}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{kl}
\end{align*}
Row $i$ of the product $E_{i,j}A$ is row $j$ of $A$,
\begin{align*}
[E_{i,j}A]_{il} &= \sum_{p=1}^{n} [E_{i,j}]_{ip} [A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}]_{ij} [A]_{jl} + \sum_{\substack{p=1 \\ p \ne j}}^{n} [E_{i,j}]_{ip} [A]_{pl} && \text{Property CACN} \\
&= 1[A]_{jl} + \sum_{\substack{p=1 \\ p \ne j}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{jl}
\end{align*}
Row $j$ of the product $E_{i,j}A$ is row $i$ of $A$,
\begin{align*}
[E_{i,j}A]_{jl} &= \sum_{p=1}^{n} [E_{i,j}]_{jp} [A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}]_{ji} [A]_{il} + \sum_{\substack{p=1 \\ p \ne i}}^{n} [E_{i,j}]_{jp} [A]_{pl} && \text{Property CACN} \\
&= 1[A]_{il} + \sum_{\substack{p=1 \\ p \ne i}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{il}
\end{align*}
So the matrix product $E_{i,j}A$ is the same as the row operation that swaps rows $i$ and $j$.

Row $k$ of the product $E_i(\alpha)A$, where $k \ne i$, is unchanged from $A$,
\begin{align*}
[E_i(\alpha)A]_{kl} &= \sum_{p=1}^{n} [E_i(\alpha)]_{kp} [A]_{pl} && \text{Theorem EMP} \\
&= [E_i(\alpha)]_{kk} [A]_{kl} + \sum_{\substack{p=1 \\ p \ne k}}^{n} [E_i(\alpha)]_{kp} [A]_{pl} && \text{Property CACN} \\
&= 1[A]_{kl} + \sum_{\substack{p=1 \\ p \ne k}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{kl}
\end{align*}
Row $i$ of the product $E_i(\alpha)A$ is $\alpha$ times row $i$ of $A$,
\begin{align*}
[E_i(\alpha)A]_{il} &= \sum_{p=1}^{n} [E_i(\alpha)]_{ip} [A]_{pl} && \text{Theorem EMP} \\
&= [E_i(\alpha)]_{ii} [A]_{il} + \sum_{\substack{p=1 \\ p \ne i}}^{n} [E_i(\alpha)]_{ip} [A]_{pl} && \text{Property CACN} \\
&= \alpha[A]_{il} + \sum_{\substack{p=1 \\ p \ne i}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= \alpha[A]_{il}
\end{align*}
So the matrix product $E_i(\alpha)A$ is the same as the row operation that multiplies row $i$ by $\alpha$.

Row $k$ of the product $E_{i,j}(\alpha)A$, where $k \ne j$, is unchanged from $A$,
\begin{align*}
[E_{i,j}(\alpha)A]_{kl} &= \sum_{p=1}^{n} [E_{i,j}(\alpha)]_{kp} [A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}(\alpha)]_{kk} [A]_{kl} + \sum_{\substack{p=1 \\ p \ne k}}^{n} [E_{i,j}(\alpha)]_{kp} [A]_{pl} && \text{Property CACN} \\
&= 1[A]_{kl} + \sum_{\substack{p=1 \\ p \ne k}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{kl}
\end{align*}
Row $j$ of the product $E_{i,j}(\alpha)A$ is $\alpha$ times row $i$ of $A$ added to row $j$ of $A$,
\begin{align*}
[E_{i,j}(\alpha)A]_{jl} &= \sum_{p=1}^{n} [E_{i,j}(\alpha)]_{jp} [A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}(\alpha)]_{jj} [A]_{jl} + [E_{i,j}(\alpha)]_{ji} [A]_{il} + \sum_{\substack{p=1 \\ p \ne j,i}}^{n} [E_{i,j}(\alpha)]_{jp} [A]_{pl} && \text{Property CACN} \\
&= 1[A]_{jl} + \alpha[A]_{il} + \sum_{\substack{p=1 \\ p \ne j,i}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{jl} + \alpha[A]_{il}
\end{align*}
So the matrix product $E_{i,j}(\alpha)A$ is the same as the row operation that multiplies row $i$ by $\alpha$ and adds the result to row $j$. ■

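Theorem EMDRO is also easy to check by machine. Here is a minimal sketch, our own illustration in Python with NumPy (the helper names are ours, and rows are indexed from 0 as Python requires): each helper builds an elementary matrix by performing the corresponding row operation on an identity matrix, and each assertion confirms that left-multiplication reproduces the row operation.

```python
import numpy as np

def elem_swap(n, i, j):
    # E_{i,j}: identity matrix with rows i and j swapped
    E = np.eye(n)
    E[[i, j]] = E[[j, i]]
    return E

def elem_scale(n, i, alpha):
    # E_i(alpha): identity matrix with row i multiplied by alpha != 0
    E = np.eye(n)
    E[i] *= alpha
    return E

def elem_add(n, i, j, alpha):
    # E_{i,j}(alpha): identity with alpha times row i added to row j
    E = np.eye(n)
    E[j, i] += alpha
    return E

A = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])

B = A.copy(); B[[0, 2]] = B[[2, 0]]       # swap rows 1 and 3
assert np.allclose(elem_swap(3, 0, 2) @ A, B)

B = A.copy(); B[1] *= 5                   # multiply row 2 by 5
assert np.allclose(elem_scale(3, 1, 5) @ A, B)

B = A.copy(); B[2] += -2 * A[0]           # add -2 times row 1 to row 3
assert np.allclose(elem_add(3, 0, 2, -2) @ A, B)
```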
Later in this section we will need two facts about elementary matrices.

Theorem EMN Elementary Matrices are Nonsingular
If $E$ is an elementary matrix, then $E$ is nonsingular.

Proof. We show that we can row-reduce each elementary matrix to the identity matrix. Given an elementary matrix of the form $E_{i,j}$, perform the row operation that swaps row $j$ with row $i$. Given an elementary matrix of the form $E_i(\alpha)$, with $\alpha \neq 0$, perform the row operation that multiplies row $i$ by $1/\alpha$. Given an elementary matrix of the form $E_{i,j}(\alpha)$, with $\alpha \neq 0$, perform the row operation that multiplies row $i$ by $-\alpha$ and adds it to row $j$. In each case, the result of the single row operation is the identity matrix. So each elementary matrix is row-equivalent to the identity matrix, and by Theorem NMRRI is nonsingular. ■

Notice that we have now made use of the nonzero restriction on $\alpha$ in the definition of $E_i(\alpha)$. One more key property of elementary matrices.

Theorem NMPEM Nonsingular Matrices are Products of Elementary Matrices
Suppose that $A$ is a nonsingular matrix. Then there exist elementary matrices $E_1, E_2, E_3, \dots, E_t$ so that $A = E_1 E_2 E_3 \cdots E_t$.

Proof. Since $A$ is nonsingular, it is row-equivalent to the identity matrix by Theorem NMRRI, so there is a sequence of $t$ row operations that converts $I$ to $A$. For each of these row operations, form the associated elementary matrix from Theorem EMDRO and denote these matrices by $E_1, E_2, E_3, \dots, E_t$. Applying the first row operation to $I$ yields the matrix $E_1 I$. The second row operation yields $E_2(E_1 I)$, and the third row operation creates $E_3 E_2 E_1 I$. The result of the full sequence of $t$ row operations will yield $A$, so
$$A = E_t \cdots E_3 E_2 E_1 I = E_t \cdots E_3 E_2 E_1$$
Other than the cosmetic matter of re-indexing these elementary matrices in the opposite order, this is the desired result. ■
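Theorem NMPEM can be made concrete computationally. The sketch below is our own Python/NumPy code, not part of the text (`elementary_factors` is a name we made up): it row-reduces $A$ to the identity while recording the inverse of each elementary matrix used, and since the inverse of an elementary matrix is again elementary, the recorded list is a factorization of $A$ into elementary matrices.

```python
import numpy as np

def elementary_factors(A, tol=1e-12):
    # Row-reduce A to I, recording the inverse of each elementary matrix
    # applied; the returned list F satisfies A = F[0] @ F[1] @ ... @ F[-1],
    # and each F[k] is elementary.
    A = A.astype(float).copy()
    n = A.shape[0]
    factors = []
    def apply(E, Einv):
        nonlocal A
        A = E @ A
        factors.append(Einv)
    for c in range(n):
        # swap a row with a nonzero entry into the pivot position
        p = max(range(c, n), key=lambda r: abs(A[r, c]))
        if abs(A[p, c]) < tol:
            raise ValueError("matrix is singular")
        if p != c:
            E = np.eye(n); E[[p, c]] = E[[c, p]]
            apply(E, E)                     # a swap is its own inverse
        # scale the pivot row to get a leading 1
        alpha = A[c, c]
        E = np.eye(n); E[c, c] = 1.0 / alpha
        Einv = np.eye(n); Einv[c, c] = alpha
        apply(E, Einv)
        # clear the remainder of the pivot column
        for r in range(n):
            if r != c and abs(A[r, c]) > tol:
                beta = A[r, c]
                E = np.eye(n); E[r, c] = -beta
                Einv = np.eye(n); Einv[r, c] = beta
                apply(E, Einv)
    return factors

A = np.array([[2., 1.], [1., 1.]])
P = np.eye(2)
for E in elementary_factors(A):
    P = P @ E
assert np.allclose(P, A)    # A is indeed the product of elementary matrices
```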


Subsection DD
Definition of the Determinant

We will now turn to the definition of a determinant and do some sample computations. The definition of the determinant function is recursive, that is, the determinant of a large matrix is defined in terms of the determinant of smaller matrices. To this end, we will make a few definitions.

Definition SM SubMatrix
Suppose that $A$ is an $m \times n$ matrix. Then the submatrix $A(i|j)$ is the $(m-1) \times (n-1)$ matrix obtained from $A$ by removing row $i$ and column $j$. □

Example SS Some submatrices
For the matrix
$$A = \begin{bmatrix} 1 & -2 & 3 & 9\\ 4 & -2 & 0 & 1\\ 3 & 5 & 2 & 1 \end{bmatrix}$$
we have the submatrices
$$A(2|3) = \begin{bmatrix} 1 & -2 & 9\\ 3 & 5 & 1 \end{bmatrix} \qquad A(3|1) = \begin{bmatrix} -2 & 3 & 9\\ -2 & 0 & 1 \end{bmatrix}$$
△

Definition DM Determinant of a Matrix
Suppose $A$ is a square matrix. Then its determinant, $\det(A) = |A|$, is an element of $\mathbb{C}$ defined recursively by:

1. If $A$ is a $1 \times 1$ matrix, then $\det(A) = [A]_{11}$.

2. If $A$ is a matrix of size $n$ with $n \geq 2$, then
$$\det(A) = [A]_{11}\det(A(1|1)) - [A]_{12}\det(A(1|2)) + [A]_{13}\det(A(1|3)) - [A]_{14}\det(A(1|4)) + \cdots + (-1)^{n+1}[A]_{1n}\det(A(1|n))$$
□

So to compute the determinant of a $5 \times 5$ matrix we must build 5 submatrices, each of size 4. To compute the determinants of each of the $4 \times 4$ matrices we need to create 4 submatrices each, these now of size 3, and so on. To compute the determinant of a $10 \times 10$ matrix would require computing the determinant of $10! = 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 = 3{,}628{,}800$ different $1 \times 1$ matrices. Fortunately there are better ways. However this does suggest an excellent computer programming exercise to write a recursive procedure to compute a determinant.
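Here is one way to carry out that exercise: a direct, and deliberately inefficient, transcription of the recursive definition into Python. This is our own sketch, not part of the text, with indices shifted since Python counts from 0.

```python
def submatrix(A, i, j):
    # A(i|j): remove row i and column j (1-indexed, as in Definition SM)
    return [[A[r][c] for c in range(len(A[0])) if c != j - 1]
            for r in range(len(A)) if r != i - 1]

def det(A):
    # Definition DM, literally: expand about the first row
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (1 + j) * A[0][j - 1] * det(submatrix(A, 1, j))
               for j in range(1, n + 1))
```

Since each call spawns $n$ recursive calls of size $n - 1$, the run time grows like $n!$, exactly the explosion described above.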

Let us compute the determinant of a reasonably sized matrix by hand.


Example D33M Determinant of a $3 \times 3$ matrix
Suppose that we have the $3 \times 3$ matrix
$$A = \begin{bmatrix} 3 & 2 & -1\\ 4 & 1 & 6\\ -3 & -1 & 2 \end{bmatrix}$$
Then
$$\begin{aligned}
\det(A) = |A| &= \begin{vmatrix} 3 & 2 & -1\\ 4 & 1 & 6\\ -3 & -1 & 2 \end{vmatrix}\\
&= 3\begin{vmatrix} 1 & 6\\ -1 & 2 \end{vmatrix} - 2\begin{vmatrix} 4 & 6\\ -3 & 2 \end{vmatrix} + (-1)\begin{vmatrix} 4 & 1\\ -3 & -1 \end{vmatrix}\\
&= 3\left(1\,|2| - 6\,|{-1}|\right) - 2\left(4\,|2| - 6\,|{-3}|\right) - \left(4\,|{-1}| - 1\,|{-3}|\right)\\
&= 3\left(1(2) - 6(-1)\right) - 2\left(4(2) - 6(-3)\right) - \left(4(-1) - 1(-3)\right)\\
&= 24 - 52 + 1\\
&= -27
\end{aligned}$$
△

In practice it is a bit silly to decompose a $2 \times 2$ matrix down into a couple of $1 \times 1$ matrices and then compute the exceedingly easy determinants of these puny matrices. So here is a simple theorem.

Theorem DMST Determinant of Matrices of Size Two
Suppose that $A = \begin{bmatrix} a & b\\ c & d \end{bmatrix}$. Then $\det(A) = ad - bc$.

Proof. Applying Definition DM,
$$\begin{vmatrix} a & b\\ c & d \end{vmatrix} = a\,|d| - b\,|c| = ad - bc$$
■

Do you recall seeing the expression $ad - bc$ before? (Hint: Theorem TTMI)

Subsection CD
Computing Determinants

There are a variety of ways to compute the determinant. We will establish first that we can choose to mimic our definition of the determinant, but by using matrix entries and submatrices based on a row other than the first one.

Theorem DER Determinant Expansion about Rows
Suppose that $A$ is a square matrix of size $n$. Then for $1 \leq i \leq n$
$$\det(A) = (-1)^{i+1}[A]_{i1}\det(A(i|1)) + (-1)^{i+2}[A]_{i2}\det(A(i|2)) + (-1)^{i+3}[A]_{i3}\det(A(i|3)) + \cdots + (-1)^{i+n}[A]_{in}\det(A(i|n))$$
which is known as expansion about row $i$.

Proof. First, the statement of the theorem coincides with Definition DM when $i = 1$, so throughout, we need only consider $i > 1$.

Given the recursive definition of the determinant, it should be no surprise that we will use induction for this proof (Proof Technique I). When $n = 1$, there is nothing to prove since there is but one row. When $n = 2$, we just examine expansion about the second row,
$$\begin{aligned}
(-1)^{2+1}&[A]_{21}\det(A(2|1)) + (-1)^{2+2}[A]_{22}\det(A(2|2))\\
&= -[A]_{21}[A]_{12} + [A]_{22}[A]_{11} && \text{Definition DM}\\
&= [A]_{11}[A]_{22} - [A]_{12}[A]_{21}\\
&= \det(A) && \text{Theorem DMST}
\end{aligned}$$
So the theorem is true for matrices of size $n = 1$ and $n = 2$. Now assume the result is true for all matrices of size $n - 1$ as we derive an expression for expansion about row $i$ for a matrix of size $n$. We will abuse our notation for a submatrix slightly, so $A(i_1, i_2|j_1, j_2)$ will denote the matrix formed by removing rows $i_1$ and $i_2$, along with removing columns $j_1$ and $j_2$. Also, as we take a determinant of a submatrix, we will need to "jump up" the index of summation partway through as we "skip over" a missing column. To do this smoothly we will set
$$\epsilon_{lj} = \begin{cases} 0 & l < j\\ 1 & l > j \end{cases}$$
Now,
$$\begin{aligned}
\det(A) &= \sum_{j=1}^{n}(-1)^{1+j}[A]_{1j}\det(A(1|j)) && \text{Definition DM}\\
&= \sum_{j=1}^{n}(-1)^{1+j}[A]_{1j}\sum_{\substack{1\leq l\leq n\\ l\neq j}}(-1)^{i-1+l-\epsilon_{lj}}[A]_{il}\det(A(1,i|j,l)) && \text{Induction}\\
&= \sum_{j=1}^{n}\sum_{\substack{1\leq l\leq n\\ l\neq j}}(-1)^{j+i+l-\epsilon_{lj}}[A]_{1j}[A]_{il}\det(A(1,i|j,l)) && \text{Property DCN}\\
&= \sum_{l=1}^{n}\sum_{\substack{1\leq j\leq n\\ j\neq l}}(-1)^{j+i+l-\epsilon_{lj}}[A]_{1j}[A]_{il}\det(A(1,i|j,l)) && \text{Property CACN}\\
&= \sum_{l=1}^{n}(-1)^{i+l}[A]_{il}\sum_{\substack{1\leq j\leq n\\ j\neq l}}(-1)^{j-\epsilon_{lj}}[A]_{1j}\det(A(1,i|j,l)) && \text{Property DCN}\\
&= \sum_{l=1}^{n}(-1)^{i+l}[A]_{il}\sum_{\substack{1\leq j\leq n\\ j\neq l}}(-1)^{\epsilon_{lj}+j}[A]_{1j}\det(A(i,1|l,j)) && \text{$2\epsilon_{lj}$ is even}\\
&= \sum_{l=1}^{n}(-1)^{i+l}[A]_{il}\det(A(i|l)) && \text{Definition DM}
\end{aligned}$$
■

We can also obtain a formula that computes a determinant by expansion about a column, but this will be simpler if we first prove a result about the interplay of determinants and transposes. Notice how the following proof makes use of the ability to compute a determinant by expanding about any row.

Theorem DT Determinant of the Transpose
Suppose that $A$ is a square matrix. Then $\det(A^t) = \det(A)$.

Proof. With our definition of the determinant (Definition DM) and theorems like Theorem DER, using induction (Proof Technique I) is a natural approach to proving properties of determinants. And so it is here. Let $n$ be the size of the matrix $A$, and we will use induction on $n$.

For $n = 1$, the transpose of a matrix is identical to the original matrix, so vacuously, the determinants are equal.

Now assume the result is true for matrices of size $n - 1$. Then,
$$\begin{aligned}
\det(A^t) &= \frac{1}{n}\sum_{i=1}^{n}\det(A^t)\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}(-1)^{i+j}[A^t]_{ij}\det(A^t(i|j)) && \text{Theorem DER}\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}(-1)^{i+j}[A]_{ji}\det(A^t(i|j)) && \text{Definition TM}\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}(-1)^{i+j}[A]_{ji}\det\left((A(j|i))^t\right) && \text{Definition TM}\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n}(-1)^{i+j}[A]_{ji}\det(A(j|i)) && \text{Induction Hypothesis}\\
&= \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{n}(-1)^{j+i}[A]_{ji}\det(A(j|i)) && \text{Property CACN}\\
&= \frac{1}{n}\sum_{j=1}^{n}\det(A) && \text{Theorem DER}\\
&= \det(A)
\end{aligned}$$
■

Now we can easily get the result that a determinant can be computed by expansion about any column as well.

Theorem DEC Determinant Expansion about Columns
Suppose that $A$ is a square matrix of size $n$. Then for $1 \leq j \leq n$
$$\det(A) = (-1)^{1+j}[A]_{1j}\det(A(1|j)) + (-1)^{2+j}[A]_{2j}\det(A(2|j)) + (-1)^{3+j}[A]_{3j}\det(A(3|j)) + \cdots + (-1)^{n+j}[A]_{nj}\det(A(n|j))$$
which is known as expansion about column $j$.

Proof.
$$\begin{aligned}
\det(A) &= \det(A^t) && \text{Theorem DT}\\
&= \sum_{i=1}^{n}(-1)^{j+i}[A^t]_{ji}\det(A^t(j|i)) && \text{Theorem DER}\\
&= \sum_{i=1}^{n}(-1)^{j+i}[A^t]_{ji}\det\left((A(i|j))^t\right) && \text{Definition TM}\\
&= \sum_{i=1}^{n}(-1)^{j+i}[A^t]_{ji}\det(A(i|j)) && \text{Theorem DT}\\
&= \sum_{i=1}^{n}(-1)^{i+j}[A]_{ij}\det(A(i|j)) && \text{Definition TM}
\end{aligned}$$
■

That the determinant of an $n \times n$ matrix can be computed in $2n$ different (albeit similar) ways is nothing short of remarkable. For the doubters among us, we will do an example, computing the determinant of a $4 \times 4$ matrix in two different ways.


Example TCSD Two computations, same determinant
Let
$$A = \begin{bmatrix} -2 & 3 & 0 & 1\\ 9 & -2 & 0 & 1\\ 1 & 3 & -2 & -1\\ 4 & 1 & 2 & 6 \end{bmatrix}$$
Then expanding about the fourth row (Theorem DER with $i = 4$) yields,
$$\begin{aligned}
|A| &= (4)(-1)^{4+1}\begin{vmatrix} 3 & 0 & 1\\ -2 & 0 & 1\\ 3 & -2 & -1 \end{vmatrix} + (1)(-1)^{4+2}\begin{vmatrix} -2 & 0 & 1\\ 9 & 0 & 1\\ 1 & -2 & -1 \end{vmatrix}\\
&\quad + (2)(-1)^{4+3}\begin{vmatrix} -2 & 3 & 1\\ 9 & -2 & 1\\ 1 & 3 & -1 \end{vmatrix} + (6)(-1)^{4+4}\begin{vmatrix} -2 & 3 & 0\\ 9 & -2 & 0\\ 1 & 3 & -2 \end{vmatrix}\\
&= (-4)(10) + (1)(-22) + (-2)(61) + 6(46) = 92
\end{aligned}$$
Expanding about column 3 (Theorem DEC with $j = 3$) gives
$$\begin{aligned}
|A| &= (0)(-1)^{1+3}\begin{vmatrix} 9 & -2 & 1\\ 1 & 3 & -1\\ 4 & 1 & 6 \end{vmatrix} + (0)(-1)^{2+3}\begin{vmatrix} -2 & 3 & 1\\ 1 & 3 & -1\\ 4 & 1 & 6 \end{vmatrix}\\
&\quad + (-2)(-1)^{3+3}\begin{vmatrix} -2 & 3 & 1\\ 9 & -2 & 1\\ 4 & 1 & 6 \end{vmatrix} + (2)(-1)^{4+3}\begin{vmatrix} -2 & 3 & 1\\ 9 & -2 & 1\\ 1 & 3 & -1 \end{vmatrix}\\
&= 0 + 0 + (-2)(-107) + (-2)(61) = 92
\end{aligned}$$
Notice how much easier the second computation was. By choosing to expand about the third column, we have two entries that are zero, so two $3 \times 3$ determinants need not be computed at all!
△
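The recursive `det` procedure we sketched earlier in this section (our own code, not the text's) agrees with both hand computations:

```python
A = [[-2, 3, 0, 1], [9, -2, 0, 1], [1, 3, -2, -1], [4, 1, 2, 6]]
print(det(A))   # 92
```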

When a matrix has all zeros above (or below) the diagonal, exploiting the zeros by expanding about the proper row or column makes computing a determinant insanely easy.

Example DUTM Determinant of an upper triangular matrix
Suppose that
$$T = \begin{bmatrix} 2 & 3 & -1 & 3 & 3\\ 0 & -1 & 5 & 2 & -1\\ 0 & 0 & 3 & 9 & 2\\ 0 & 0 & 0 & -1 & 3\\ 0 & 0 & 0 & 0 & 5 \end{bmatrix}$$
We will compute the determinant of this $5 \times 5$ matrix by consistently expanding about the first column for each submatrix that arises and does not have a zero entry multiplying it.
$$\begin{aligned}
\det(T) &= \begin{vmatrix} 2 & 3 & -1 & 3 & 3\\ 0 & -1 & 5 & 2 & -1\\ 0 & 0 & 3 & 9 & 2\\ 0 & 0 & 0 & -1 & 3\\ 0 & 0 & 0 & 0 & 5 \end{vmatrix}\\
&= 2(-1)^{1+1}\begin{vmatrix} -1 & 5 & 2 & -1\\ 0 & 3 & 9 & 2\\ 0 & 0 & -1 & 3\\ 0 & 0 & 0 & 5 \end{vmatrix}\\
&= 2(-1)(-1)^{1+1}\begin{vmatrix} 3 & 9 & 2\\ 0 & -1 & 3\\ 0 & 0 & 5 \end{vmatrix}\\
&= 2(-1)(3)(-1)^{1+1}\begin{vmatrix} -1 & 3\\ 0 & 5 \end{vmatrix}\\
&= 2(-1)(3)(-1)(-1)^{1+1}\,|5|\\
&= 2(-1)(3)(-1)(5) = 30
\end{aligned}$$
△
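The pattern here holds in general: the determinant of a triangular matrix is the product of its diagonal entries, since each expansion about the first column contributes only the diagonal entry. A quick check using the recursive `det` sketch from earlier (again, our own code):

```python
T = [[2, 3, -1, 3, 3],
     [0, -1, 5, 2, -1],
     [0, 0, 3, 9, 2],
     [0, 0, 0, -1, 3],
     [0, 0, 0, 0, 5]]
prod = 1
for k in range(5):
    prod *= T[k][k]        # 2 * (-1) * 3 * (-1) * 5
print(det(T), prod)        # 30 30
```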

When you consult other texts in your study of determinants, you may run into the terms "minor" and "cofactor," especially in a discussion centered on expansion about rows and columns. We have chosen not to make these definitions formally since we have been able to get along without them. However, informally, a minor is a determinant of a submatrix, specifically $\det(A(i|j))$, and is usually referenced as the minor of $[A]_{ij}$. A cofactor is a signed minor, specifically the cofactor of $[A]_{ij}$ is $(-1)^{i+j}\det(A(i|j))$.

Reading Questions

1. Construct the elementary matrix that will effect the row operation $-6R_2 + R_3$ on a $4 \times 7$ matrix.

2. Compute the determinant of the matrix
$$\begin{bmatrix} 2 & 3 & -1\\ 3 & 8 & 2\\ 4 & -1 & -3 \end{bmatrix}$$

3. Compute the determinant of the matrix
$$\begin{bmatrix} 3 & 9 & -2 & 4 & 2\\ 0 & 1 & 4 & -2 & 7\\ 0 & 0 & -2 & 5 & 2\\ 0 & 0 & 0 & -1 & 6\\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}$$


Exercises

C21† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 1 & 3\\ 6 & 2 \end{bmatrix}$$

C22† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 1 & 3\\ 2 & 6 \end{bmatrix}$$

C23† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 1 & 3 & 2\\ 4 & 1 & 3\\ 1 & 0 & 1 \end{bmatrix}$$

C24† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} -2 & 3 & -2\\ -4 & -2 & 1\\ 2 & 4 & 2 \end{bmatrix}$$

C25† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 3 & -1 & 4\\ 2 & 5 & 1\\ 2 & 0 & 6 \end{bmatrix}$$

C26† Doing the computations by hand, find the determinant of the matrix $A$.
$$A = \begin{bmatrix} 2 & 0 & 3 & 2\\ 5 & 1 & 2 & 4\\ 3 & 0 & 1 & 2\\ 5 & 3 & 2 & 1 \end{bmatrix}$$

C27† Doing the computations by hand, find the determinant of the matrix $A$.
$$A = \begin{bmatrix} 1 & 0 & 1 & 1\\ 2 & 2 & -1 & 1\\ 2 & 1 & 3 & 0\\ 1 & 1 & 0 & 1 \end{bmatrix}$$

C28† Doing the computations by hand, find the determinant of the matrix $A$.
$$A = \begin{bmatrix} 1 & 0 & 1 & 1\\ 2 & -1 & -1 & 1\\ 2 & 5 & 3 & 0\\ 1 & -1 & 0 & 1 \end{bmatrix}$$

C29† Doing the computations by hand, find the determinant of the matrix $A$.
$$A = \begin{bmatrix} 2 & 3 & 0 & 2 & 1\\ 0 & 1 & 1 & 1 & 2\\ 0 & 0 & 1 & 2 & 3\\ 0 & 1 & 2 & 1 & 0\\ 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$

C30† Doing the computations by hand, find the determinant of the matrix $A$.
$$A = \begin{bmatrix} 2 & 1 & 1 & 0 & 1\\ 2 & 1 & 2 & -1 & 1\\ 0 & 0 & 1 & 2 & 0\\ 1 & 0 & 3 & 1 & 1\\ 2 & 1 & 1 & 2 & 1 \end{bmatrix}$$

M10† Find a value of $k$ so that the matrix $A = \begin{bmatrix} 2 & 4\\ 3 & k \end{bmatrix}$ has $\det(A) = 0$, or explain why it is not possible.

M11† Find a value of $k$ so that the matrix $A = \begin{bmatrix} 1 & 2 & 1\\ 2 & 0 & 1\\ 2 & 3 & k \end{bmatrix}$ has $\det(A) = 0$, or explain why it is not possible.

M15† Given the matrix $B = \begin{bmatrix} 2-x & 1\\ 4 & 2-x \end{bmatrix}$, find all values of $x$ that are solutions of $\det(B) = 0$.

M16† Given the matrix $B = \begin{bmatrix} 4-x & -4 & -4\\ 2 & -2-x & -4\\ 3 & -3 & -4-x \end{bmatrix}$, find all values of $x$ that are solutions of $\det(B) = 0$.

M30 The two matrices below are row-equivalent. How would you confirm this? Since the matrices are row-equivalent, there is a sequence of row operations that converts $X$ into $Y$, which would be a product of elementary matrices, $M$, such that $MX = Y$. Find $M$. (This approach could be used to find the "9 scalars" of the very early Exercise RREF.M40.)
Hint: Compute the extended echelon form for both matrices, and then use the property from Theorem PEEF that reads $B = JA$, where $A$ is the original matrix, $B$ is the echelon form of the matrix and $J$ is a nonsingular matrix obtained from the extended echelon form. Combine the two square matrices in the right way to obtain $M$.
$$X = \begin{bmatrix} -1 & 3 & 1 & -2 & 8\\ -1 & 3 & 2 & -1 & 4\\ 2 & -4 & -3 & 2 & -7\\ -2 & 5 & 3 & -2 & 8 \end{bmatrix} \qquad Y = \begin{bmatrix} -1 & 2 & 2 & 0 & 0\\ -3 & 6 & 8 & -1 & 1\\ 0 & 1 & -2 & -2 & 9\\ -1 & 4 & -3 & -3 & 16 \end{bmatrix}$$


Section PDM
Properties of Determinants of Matrices

We have seen how to compute the determinant of a matrix, and the incredible fact that we can perform expansion about any row or column to make this computation. In this largely theoretical section, we will state and prove several more intriguing properties of determinants. Our main goal will be the two results in Theorem SMZD and Theorem DRMM, but more specifically, we will see how the value of a determinant will allow us to gain insight into the various properties of a square matrix.

Subsection DRO
Determinants and Row Operations

We start easy with a straightforward theorem whose proof presages the style of subsequent proofs in this subsection.

Theorem DZRC Determinant with Zero Row or Column
Suppose that $A$ is a square matrix with a row where every entry is zero, or a column where every entry is zero. Then $\det(A) = 0$.

Proof. Suppose that $A$ is a square matrix of size $n$ and row $i$ has every entry equal to zero. We compute $\det(A)$ via expansion about row $i$.
$$\begin{aligned}
\det(A) &= \sum_{j=1}^{n}(-1)^{i+j}[A]_{ij}\det(A(i|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n}(-1)^{i+j}\,0\,\det(A(i|j)) && \text{Row $i$ is zeros}\\
&= \sum_{j=1}^{n}0 = 0
\end{aligned}$$
The proof for the case of a zero column is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. ■

Theorem DRCS Determinant for Row or Column Swap
Suppose that $A$ is a square matrix. Let $B$ be the square matrix obtained from $A$ by interchanging the location of two rows, or interchanging the location of two columns. Then $\det(B) = -\det(A)$.

Proof. Begin with the special case where $A$ is a square matrix of size $n$ and we form $B$ by swapping adjacent rows $i$ and $i+1$ for some $1 \leq i \leq n-1$. Notice that the assumption about swapping adjacent rows means that $B(i+1|j) = A(i|j)$ for all $1 \leq j \leq n$, and $[B]_{i+1,j} = [A]_{ij}$ for all $1 \leq j \leq n$. We compute $\det(B)$ via expansion about row $i+1$.
$$\begin{aligned}
\det(B) &= \sum_{j=1}^{n}(-1)^{(i+1)+j}[B]_{i+1,j}\det(B(i+1|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n}(-1)^{(i+1)+j}[A]_{ij}\det(A(i|j)) && \text{Hypothesis}\\
&= \sum_{j=1}^{n}(-1)^{1}(-1)^{i+j}[A]_{ij}\det(A(i|j))\\
&= (-1)\sum_{j=1}^{n}(-1)^{i+j}[A]_{ij}\det(A(i|j))\\
&= -\det(A) && \text{Theorem DER}
\end{aligned}$$
So the result holds for the special case where we swap adjacent rows of the matrix. As any computer scientist knows, we can accomplish any rearrangement of an ordered list by swapping adjacent elements. This principle can be demonstrated by naïve sorting algorithms such as "bubble sort." In any event, we do not need to discuss every possible reordering, we just need to consider a swap of two rows, say rows $s$ and $t$ with $1 \leq s < t \leq n$. Row $s$ can be moved to the location of row $t$ by $t - s$ swaps of adjacent rows, and then the row that began as row $t$ (which now lies in row $t - 1$) can be moved up to the location of row $s$ by another $t - s - 1$ adjacent swaps. In total the interchange of rows $s$ and $t$ is accomplished by $2(t - s) - 1$ adjacent swaps, an odd number, so the determinant changes sign an odd number of times and $\det(B) = -\det(A)$.

The proof for the case of swapped columns is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. ■

Theorem DRCM Determinant for Row or Column Multiples
Suppose that $A$ is a square matrix. Let $B$ be the square matrix obtained from $A$ by multiplying a single row by the scalar $\alpha$, or by multiplying a single column by the scalar $\alpha$. Then $\det(B) = \alpha\det(A)$.

Proof. Suppose that $A$ is a square matrix of size $n$ and we form the square matrix $B$ by multiplying each entry of row $i$ of $A$ by $\alpha$. Notice that the other rows of $A$ and $B$ are equal, so $A(i|j) = B(i|j)$, for all $1 \leq j \leq n$. We compute $\det(B)$ via expansion about row $i$.
$$\begin{aligned}
\det(B) &= \sum_{j=1}^{n}(-1)^{i+j}[B]_{ij}\det(B(i|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n}(-1)^{i+j}[B]_{ij}\det(A(i|j)) && \text{Hypothesis}\\
&= \sum_{j=1}^{n}(-1)^{i+j}\alpha[A]_{ij}\det(A(i|j)) && \text{Hypothesis}\\
&= \alpha\sum_{j=1}^{n}(-1)^{i+j}[A]_{ij}\det(A(i|j))\\
&= \alpha\det(A) && \text{Theorem DER}
\end{aligned}$$
The proof for the case of a multiple of a column is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. ■

Let us go for understanding the effect of all three row operations. But first we need an intermediate result, though it is an easy one.

Theorem DERC Determinant with Equal Rows or Columns
Suppose that $A$ is a square matrix with two equal rows, or two equal columns. Then $\det(A) = 0$.

Proof. Suppose that $A$ is a square matrix of size $n$ where the two rows $s$ and $t$ are equal. Form the matrix $B$ by swapping rows $s$ and $t$. Notice that as a consequence of our hypothesis, $A = B$. Then
$$\begin{aligned}
\det(A) &= \frac{1}{2}\left(\det(A) + \det(A)\right)\\
&= \frac{1}{2}\left(\det(A) - \det(B)\right) && \text{Theorem DRCS}\\
&= \frac{1}{2}\left(\det(A) - \det(A)\right) && \text{Hypothesis, $A = B$}\\
&= \frac{1}{2}(0) = 0
\end{aligned}$$
The proof for the case of two equal columns is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. ■

Now we can explain the effect of the third row operation. Here we go.


Theorem DRCMA Determinant for Row or Column Multiples and Addition
Suppose that $A$ is a square matrix. Let $B$ be the square matrix obtained from $A$ by multiplying a row by the scalar $\alpha$ and then adding it to another row, or by multiplying a column by the scalar $\alpha$ and then adding it to another column. Then $\det(B) = \det(A)$.

Proof. Suppose that $A$ is a square matrix of size $n$. Form the matrix $B$ by multiplying row $s$ by $\alpha$ and adding it to row $t$. Let $C$ be the auxiliary matrix where we replace row $t$ of $A$ by row $s$ of $A$. Notice that $A(t|j) = B(t|j) = C(t|j)$ for all $1 \leq j \leq n$. We compute the determinant of $B$ by expansion about row $t$.
$$\begin{aligned}
\det(B) &= \sum_{j=1}^{n}(-1)^{t+j}[B]_{tj}\det(B(t|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n}(-1)^{t+j}\left(\alpha[A]_{sj} + [A]_{tj}\right)\det(B(t|j)) && \text{Hypothesis}\\
&= \sum_{j=1}^{n}(-1)^{t+j}\alpha[A]_{sj}\det(B(t|j)) + \sum_{j=1}^{n}(-1)^{t+j}[A]_{tj}\det(B(t|j))\\
&= \alpha\sum_{j=1}^{n}(-1)^{t+j}[A]_{sj}\det(B(t|j)) + \sum_{j=1}^{n}(-1)^{t+j}[A]_{tj}\det(B(t|j))\\
&= \alpha\sum_{j=1}^{n}(-1)^{t+j}[C]_{tj}\det(C(t|j)) + \sum_{j=1}^{n}(-1)^{t+j}[A]_{tj}\det(A(t|j))\\
&= \alpha\det(C) + \det(A) && \text{Theorem DER}\\
&= \alpha\,0 + \det(A) = \det(A) && \text{Theorem DERC}
\end{aligned}$$
The proof for the case of adding a multiple of a column is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. ■

Is this what you expected? We could argue that the third row operation is the most popular, and yet it has no effect whatsoever on the determinant of a matrix! We can exploit this, along with our understanding of the other two row operations, to provide another approach to computing a determinant. We will explain this in the context of an example.

Example DRO Determinant by row operations
Suppose we desire the determinant of the $4 \times 4$ matrix
$$A = \begin{bmatrix} 2 & 0 & 2 & 3\\ 1 & 3 & -1 & 1\\ -1 & 1 & -1 & 2\\ 3 & 5 & 4 & 0 \end{bmatrix}$$
We will perform a sequence of row operations on this matrix, shooting for an upper triangular matrix, whose determinant will be simply the product of its diagonal entries. For each row operation, we will track the effect on the determinant via Theorem DRCS, Theorem DRCM, Theorem DRCMA.

$$A = \begin{bmatrix} 2 & 0 & 2 & 3\\ 1 & 3 & -1 & 1\\ -1 & 1 & -1 & 2\\ 3 & 5 & 4 & 0 \end{bmatrix} \qquad \det(A)$$
$$\xrightarrow{R_1 \leftrightarrow R_2} A_1 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 2 & 0 & 2 & 3\\ -1 & 1 & -1 & 2\\ 3 & 5 & 4 & 0 \end{bmatrix} \qquad = -\det(A_1) \quad \text{Theorem DRCS}$$
$$\xrightarrow{-2R_1 + R_2} A_2 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & -6 & 4 & 1\\ -1 & 1 & -1 & 2\\ 3 & 5 & 4 & 0 \end{bmatrix} \qquad = -\det(A_2) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{1R_1 + R_3} A_3 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & -6 & 4 & 1\\ 0 & 4 & -2 & 3\\ 3 & 5 & 4 & 0 \end{bmatrix} \qquad = -\det(A_3) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{-3R_1 + R_4} A_4 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & -6 & 4 & 1\\ 0 & 4 & -2 & 3\\ 0 & -4 & 7 & -3 \end{bmatrix} \qquad = -\det(A_4) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{1R_3 + R_2} A_5 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & -2 & 2 & 4\\ 0 & 4 & -2 & 3\\ 0 & -4 & 7 & -3 \end{bmatrix} \qquad = -\det(A_5) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{-\frac{1}{2}R_2} A_6 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 4 & -2 & 3\\ 0 & -4 & 7 & -3 \end{bmatrix} \qquad = 2\det(A_6) \quad \text{Theorem DRCM}$$
$$\xrightarrow{-4R_2 + R_3} A_7 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 0 & 2 & 11\\ 0 & -4 & 7 & -3 \end{bmatrix} \qquad = 2\det(A_7) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{4R_2 + R_4} A_8 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 0 & 2 & 11\\ 0 & 0 & 3 & -11 \end{bmatrix} \qquad = 2\det(A_8) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{-1R_3 + R_4} A_9 = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 0 & 2 & 11\\ 0 & 0 & 1 & -22 \end{bmatrix} \qquad = 2\det(A_9) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{-2R_4 + R_3} A_{10} = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 0 & 0 & 55\\ 0 & 0 & 1 & -22 \end{bmatrix} \qquad = 2\det(A_{10}) \quad \text{Theorem DRCMA}$$
$$\xrightarrow{R_3 \leftrightarrow R_4} A_{11} = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 0 & 1 & -22\\ 0 & 0 & 0 & 55 \end{bmatrix} \qquad = -2\det(A_{11}) \quad \text{Theorem DRCS}$$
$$\xrightarrow{\frac{1}{55}R_4} A_{12} = \begin{bmatrix} 1 & 3 & -1 & 1\\ 0 & 1 & -1 & -2\\ 0 & 0 & 1 & -22\\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad = -110\det(A_{12}) \quad \text{Theorem DRCM}$$

The matrix $A_{12}$ is upper triangular, so expansion about the first column (repeatedly) will result in $\det(A_{12}) = (1)(1)(1)(1) = 1$ (see Example DUTM) and thus, $\det(A) = -110(1) = -110$.

Notice that our sequence of row operations was somewhat ad hoc, such as the transformation to $A_5$. We could have been even more methodical, and strictly followed the process that converts a matrix to reduced row-echelon form (Theorem REMEF), eventually achieving the same numerical result with a final matrix that equaled the $4 \times 4$ identity matrix. Notice too that we could have stopped with $A_8$, since at this point we could compute $\det(A_8)$ by two expansions about first columns, followed by a simple determinant of a $2 \times 2$ matrix (Theorem DMST).

The beauty of this approach is that computationally we should already have written a procedure to convert matrices to reduced row-echelon form, so all we need to do is track the multiplicative changes to the determinant as the algorithm proceeds. Further, for a square matrix of size $n$ this approach requires on the order of $n^3$ multiplications, while a recursive application of expansion about a row or column (Theorem DER, Theorem DEC) will require in the vicinity of $(n-1)(n!)$ multiplications. So even for very small matrices, a computational approach utilizing row operations will have superior run-time. Tracking, and controlling, the effects of round-off errors is another story, best saved for a numerical linear algebra course. △
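Here is a sketch of that computational approach (our own Python, with a function name we made up): ordinary Gaussian elimination to triangular form, tracking only the sign flips from row swaps. By Theorem DRCMA the elimination steps leave the determinant alone, by Theorem DRCS each swap negates it, and so the determinant is the accumulated sign times the product of the diagonal entries.

```python
def det_by_row_ops(A, tol=1e-12):
    # Determinant via row operations: on the order of n^3 multiplications
    # rather than (n-1)(n!).
    A = [list(map(float, row)) for row in A]   # work on a copy
    n = len(A)
    sign = 1.0
    for c in range(n):
        # choose a pivot, swapping rows if needed (Theorem DRCS: sign flips)
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < tol:
            return 0.0                         # zero pivot column: singular
        if p != c:
            A[c], A[p] = A[p], A[c]
            sign = -sign
        # clear below the pivot (Theorem DRCMA: determinant unchanged)
        for r in range(c + 1, n):
            m = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= m * A[c][k]
    prod = sign
    for k in range(n):
        prod *= A[k][k]
    return prod

A = [[2, 0, 2, 3], [1, 3, -1, 1], [-1, 1, -1, 2], [3, 5, 4, 0]]
print(det_by_row_ops(A))                       # -110.0, matching this example
```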

Subsection DROEM
Determinants, Row Operations, Elementary Matrices

As a final preparation for our two most important theorems about determinants, we prove a handful of facts about the interplay of row operations and matrix multiplication with elementary matrices with regard to the determinant. But first, a simple, but crucial, fact about the identity matrix.

Theorem DIM Determinant of the Identity Matrix
For every $n \geq 1$, $\det(I_n) = 1$.

Proof. It may be overkill, but this is a good situation to run through a proof by induction on $n$ (Proof Technique I). Is the result true when $n = 1$? Yes,
$$\begin{aligned}
\det(I_1) &= [I_1]_{11} && \text{Definition DM}\\
&= 1 && \text{Definition IM}
\end{aligned}$$
Now assume the theorem is true for the identity matrix of size $n-1$ and investigate the determinant of the identity matrix of size $n$ with expansion about row 1,
$$\begin{aligned}
\det(I_n) &= \sum_{j=1}^{n}(-1)^{1+j}[I_n]_{1j}\det(I_n(1|j)) && \text{Definition DM}\\
&= (-1)^{1+1}[I_n]_{11}\det(I_n(1|1)) + \sum_{j=2}^{n}(-1)^{1+j}[I_n]_{1j}\det(I_n(1|j))\\
&= 1\det(I_{n-1}) + \sum_{j=2}^{n}(-1)^{1+j}\,0\,\det(I_n(1|j)) && \text{Definition IM}\\
&= 1(1) + \sum_{j=2}^{n}0 = 1 && \text{Induction Hypothesis}
\end{aligned}$$
■

Theorem DEM Determinants of Elementary Matrices
For the three possible versions of an elementary matrix (Definition ELEM) we have the determinants,

1. $\det(E_{i,j}) = -1$

2. $\det(E_i(\alpha)) = \alpha$

3. $\det(E_{i,j}(\alpha)) = 1$

Proof. Swapping rows $i$ and $j$ of the identity matrix will create $E_{i,j}$ (Definition ELEM), so
$$\begin{aligned}
\det(E_{i,j}) &= -\det(I_n) && \text{Theorem DRCS}\\
&= -1 && \text{Theorem DIM}
\end{aligned}$$
Multiplying row $i$ of the identity matrix by $\alpha$ will create $E_i(\alpha)$ (Definition ELEM), so
$$\begin{aligned}
\det(E_i(\alpha)) &= \alpha\det(I_n) && \text{Theorem DRCM}\\
&= \alpha(1) = \alpha && \text{Theorem DIM}
\end{aligned}$$
Multiplying row $i$ of the identity matrix by $\alpha$ and adding to row $j$ will create $E_{i,j}(\alpha)$ (Definition ELEM), so
$$\begin{aligned}
\det(E_{i,j}(\alpha)) &= \det(I_n) && \text{Theorem DRCMA}\\
&= 1 && \text{Theorem DIM}
\end{aligned}$$
■
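A quick numerical spot-check of these three values, reusing the elementary-matrix helpers sketched after Theorem EMDRO (our own code, not the text's):

```python
import numpy as np

assert np.isclose(np.linalg.det(elem_swap(4, 1, 3)), -1.0)
assert np.isclose(np.linalg.det(elem_scale(4, 2, 7.0)), 7.0)
assert np.isclose(np.linalg.det(elem_add(4, 0, 2, -6.0)), 1.0)
```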

Theorem DEMMM Determinants, Elementary Matrices, Matrix Multiplication
Suppose that $A$ is a square matrix of size $n$ and $E$ is any elementary matrix of size $n$. Then
$$\det(EA) = \det(E)\det(A)$$

Proof. The proof proceeds in three parts, one for each type of elementary matrix, with each part very similar to the other two.

First, let $B$ be the matrix obtained from $A$ by swapping rows $i$ and $j$,
$$\begin{aligned}
\det(E_{i,j}A) &= \det(B) && \text{Theorem EMDRO}\\
&= -\det(A) && \text{Theorem DRCS}\\
&= \det(E_{i,j})\det(A) && \text{Theorem DEM}
\end{aligned}$$
Second, let $B$ be the matrix obtained from $A$ by multiplying row $i$ by $\alpha$,
$$\begin{aligned}
\det(E_i(\alpha)A) &= \det(B) && \text{Theorem EMDRO}\\
&= \alpha\det(A) && \text{Theorem DRCM}\\
&= \det(E_i(\alpha))\det(A) && \text{Theorem DEM}
\end{aligned}$$
Third, let $B$ be the matrix obtained from $A$ by multiplying row $i$ by $\alpha$ and adding to row $j$,
$$\begin{aligned}
\det(E_{i,j}(\alpha)A) &= \det(B) && \text{Theorem EMDRO}\\
&= \det(A) && \text{Theorem DRCMA}\\
&= \det(E_{i,j}(\alpha))\det(A) && \text{Theorem DEM}
\end{aligned}$$
Since the desired result holds for each variety of elementary matrix individually, we are done. ■

Subsection DNMMM
Determinants, Nonsingular Matrices, Matrix Multiplication

If you asked someone with substantial experience working with matrices about the value of the determinant, they'd be likely to quote the following theorem as the first thing to come to mind.

Theorem SMZD Singular Matrices have Zero Determinants
Let $A$ be a square matrix. Then $A$ is singular if and only if $\det(A) = 0$.

Proof. Rather than jumping into the two halves of the equivalence, we first establish a few items. Let $B$ be the unique square matrix that is row-equivalent to $A$ and in reduced row-echelon form (Theorem REMEF, Theorem RREFU). For each of the row operations that converts $B$ into $A$, there is an elementary matrix $E_i$ which effects the row operation by matrix multiplication (Theorem EMDRO). Repeated applications of Theorem EMDRO allow us to write
$$A = E_s E_{s-1} \cdots E_2 E_1 B$$
Then
$$\begin{aligned}
\det(A) &= \det(E_s E_{s-1} \cdots E_2 E_1 B)\\
&= \det(E_s)\det(E_{s-1}) \cdots \det(E_2)\det(E_1)\det(B) && \text{Theorem DEMMM}
\end{aligned}$$
From Theorem DEM we can infer that the determinant of an elementary matrix is never zero (note the ban on $\alpha = 0$ for $E_i(\alpha)$ in Definition ELEM). So the product on the right is composed of nonzero scalars, with the possible exception of $\det(B)$. More precisely, we can argue that $\det(A) = 0$ if and only if $\det(B) = 0$. With this established, we can take up the two halves of the equivalence.

($\Rightarrow$) If $A$ is singular, then by Theorem NMRRI, $B$ cannot be the identity matrix. Because (1) the number of pivot columns is equal to the number of nonzero rows, (2) not every column is a pivot column, and (3) $B$ is square, we see that $B$ must have a zero row. By Theorem DZRC the determinant of $B$ is zero, and by the above, we conclude that the determinant of $A$ is zero.

($\Leftarrow$) We will prove the contrapositive (Proof Technique CP). So assume $A$ is nonsingular; then by Theorem NMRRI, $B$ is the identity matrix and Theorem DIM tells us that $\det(B) = 1 \neq 0$. With the argument above, we conclude that the determinant of $A$ is nonzero as well. ■

For the case of $2 \times 2$ matrices you might compare the application of Theorem SMZD with the combination of the results stated in Theorem DMST and Theorem TTMI.

Example ZNDAB Zero and nonzero determinant, Archetypes A and B
The coefficient matrix in Archetype A has a zero determinant (check this!) while the coefficient matrix of Archetype B has a nonzero determinant (check this, too). These matrices are singular and nonsingular, respectively. This is exactly what Theorem SMZD says, and continues our list of contrasts between these two archetypes. △

In Section MINM we said "singular matrices are a distinct minority." If you built a random matrix and took its determinant, how likely would it be that you got zero?

Since Theorem SMZD is an equivalence (Proof Technique E) we can expand on our growing list of equivalences about nonsingular matrices. The addition of the condition $\det(A) \neq 0$ is one of the best motivations for learning about determinants.

Theorem NME7 Nonsingular Matrix Equivalences, Round 7
Suppose that $A$ is a square matrix of size $n$. The following are equivalent.

1. $A$ is nonsingular.

2. $A$ row-reduces to the identity matrix.

3. The null space of $A$ contains only the zero vector, $\mathcal{N}(A) = \{\mathbf{0}\}$.

4. The linear system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every possible choice of $\mathbf{b}$.

5. The columns of $A$ are a linearly independent set.

6. $A$ is invertible.

7. The column space of $A$ is $\mathbb{C}^n$, $\mathcal{C}(A) = \mathbb{C}^n$.

8. The columns of $A$ are a basis for $\mathbb{C}^n$.

9. The rank of $A$ is $n$, $r(A) = n$.

10. The nullity of $A$ is zero, $n(A) = 0$.

11. The determinant of $A$ is nonzero, $\det(A) \neq 0$.

11. The determ<strong>in</strong>ant of A is nonzero, det (A) ≠0.


§PDM Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 365<br />

Proof. Theorem SMZD says A is s<strong>in</strong>gular if and only if det (A) = 0. If we negate<br />

each of these statements, we arrive at two contrapositives that we can comb<strong>in</strong>e as<br />

the equivalence, A is nons<strong>in</strong>gular if and only if det (A) ≠ 0. This allows us to add a<br />

new statement to the list found <strong>in</strong> Theorem NME6.<br />

<br />

Computationally, row-reducing a matrix is the most efficient way to determine if a matrix is nonsingular, though the effect of using division in a computer can lead to round-off errors that confuse small quantities with critical zero quantities. Conceptually, the determinant may seem the most efficient way to determine if a matrix is nonsingular. The definition of a determinant uses just addition, subtraction and multiplication, so division is never a problem. And the final test is easy: is the determinant zero or not? However, the number of operations involved in computing a determinant by the definition very quickly becomes so excessive as to be impractical.

Now for the coup de grâce. We will generalize Theorem DEMMM to the case of any two square matrices. You may recall thinking that matrix multiplication was defined in a needlessly complicated manner. For sure, the definition of a determinant seems even stranger. (Though Theorem SMZD might be forcing you to reconsider.) Read the statement of the next theorem and contemplate how nicely matrix multiplication and determinants play with each other.

Theorem DRMM Determinant Respects Matrix Multiplication
Suppose that $A$ and $B$ are square matrices of the same size. Then $\det(AB) = \det(A)\det(B)$.

Proof. This proof is constructed in two cases. First, suppose that $A$ is singular. Then $\det(A) = 0$ by Theorem SMZD. By the contrapositive of Theorem NPNT, $AB$ is singular as well. So by a second application of Theorem SMZD, $\det(AB) = 0$. Putting it all together
$$\det(AB) = 0 = 0\det(B) = \det(A)\det(B)$$
as desired.

For the second case, suppose that $A$ is nonsingular. By Theorem NMPEM there are elementary matrices $E_1, E_2, E_3, \dots, E_s$ such that $A = E_1 E_2 E_3 \cdots E_s$. Then
$$\begin{aligned}
\det(AB) &= \det(E_1 E_2 E_3 \cdots E_s B)\\
&= \det(E_1)\det(E_2)\det(E_3) \cdots \det(E_s)\det(B) && \text{Theorem DEMMM}\\
&= \det(E_1 E_2 E_3 \cdots E_s)\det(B) && \text{Theorem DEMMM}\\
&= \det(A)\det(B)
\end{aligned}$$
■

It is amazing that matrix multiplication and the determinant interact this way. Might it also be true that $\det(A + B) = \det(A) + \det(B)$? (Exercise PDM.M30)
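Theorem DRMM is easy to test numerically; a minimal sketch, ours, in Python with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(4, 4)).astype(float)
B = rng.integers(-5, 6, size=(4, 4)).astype(float)

# det(AB) agrees with det(A)det(B), up to floating-point error
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))   # True
```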


Reading Questions

1. Consider the two matrices below, and suppose you already have computed $\det(A) = -120$. What is $\det(B)$? Why?
$$A = \begin{bmatrix} 0 & 8 & 3 & -4\\ -1 & 2 & -2 & 5\\ -2 & 8 & 4 & 3\\ 0 & -4 & 2 & -3 \end{bmatrix} \qquad B = \begin{bmatrix} 0 & 8 & 3 & -4\\ 0 & -4 & 2 & -3\\ -2 & 8 & 4 & 3\\ -1 & 2 & -2 & 5 \end{bmatrix}$$

2. State the theorem that allows us to make yet another extension to our NMEx series of theorems.

3. What is amazing about the interaction between matrix multiplication and the determinant?

Exercises

C30 Each of the archetypes below is a system of equations with a square coefficient matrix, or is a square matrix itself. Compute the determinant of each matrix, noting how Theorem SMZD indicates when the matrix is singular or nonsingular.
Archetype A, Archetype B, Archetype F, Archetype K, Archetype L

M20† Construct a $3 \times 3$ nonsingular matrix and call it $A$. Then, for each entry of the matrix, compute the corresponding cofactor, and create a new $3 \times 3$ matrix full of these cofactors by placing the cofactor of an entry in the same location as the entry it was based on. Once complete, call this matrix $C$. Compute $AC^t$. Any observations? Repeat with a new matrix, or perhaps with a $4 \times 4$ matrix.

M30 Construct an example to show that the following statement is not true for all square matrices $A$ and $B$ of the same size: $\det(A + B) = \det(A) + \det(B)$.

T10 Theorem NPNT says that if the product of square matrices $AB$ is nonsingular, then the individual matrices $A$ and $B$ are nonsingular also. Construct a new proof of this result making use of theorems about determinants of matrices.

T15 Use Theorem DRCM to prove Theorem DZRC as a corollary. (See Proof Technique LC.)

T20 Suppose that $A$ is a square matrix of size $n$ and $\alpha \in \mathbb{C}$ is a scalar. Prove that $\det(\alpha A) = \alpha^n \det(A)$.

T25 Employ Theorem DT to construct the second half of the proof of Theorem DRCM (the portion about a multiple of a column).
(the portion about a multiple of a column).


Chapter E
Eigenvalues

When we have a square matrix of size $n$, $A$, and we multiply it by a vector $\mathbf{x}$ from $\mathbb{C}^n$ to form the matrix-vector product (Definition MVP), the result is another vector in $\mathbb{C}^n$. So we can adopt a functional view of this computation — the act of multiplying by a square matrix is a function that converts one vector ($\mathbf{x}$) into another one ($A\mathbf{x}$) of the same size. For some vectors, this seemingly complicated computation is really no more complicated than scalar multiplication. The vectors vary according to the choice of $A$, so the question is to determine, for an individual choice of $A$, if there are any such vectors, and if so, which ones. It happens in a variety of situations that these vectors (and the scalars that go along with them) are of special interest.

We will be solving polynomial equations in this chapter, which raises the specter of complex numbers as roots. This distinct possibility is our main reason for entertaining the complex numbers throughout the course. You might be moved to revisit Section CNO and Section O.

Section EE
Eigenvalues and Eigenvectors

In this section, we will define the eigenvalues and eigenvectors of a matrix, and see how to compute them. More theoretical properties will be taken up in the next section.

Subsection EEM
Eigenvalues and Eigenvectors of a Matrix

We start with the principal definition for this chapter.

Definition EEM Eigenvalues and Eigenvectors of a Matrix
Suppose that $A$ is a square matrix of size $n$, $\mathbf{x} \neq \mathbf{0}$ is a vector in $\mathbb{C}^n$, and $\lambda$ is a scalar in $\mathbb{C}$. Then we say $\mathbf{x}$ is an eigenvector of $A$ with eigenvalue $\lambda$ if
$$A\mathbf{x} = \lambda\mathbf{x}$$
□

Before going any further, perhaps we should convince you that such things ever happen at all. Understand the next example, but do not concern yourself with where the pieces come from. We will have methods soon enough to be able to discover these eigenvectors ourselves.

Example SEE Some eigenvalues and eigenvectors
Consider the matrix
$$A = \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix}$$
and the vectors
$$\mathbf{x} = \begin{bmatrix} 1\\ -1\\ 2\\ 5 \end{bmatrix} \qquad \mathbf{y} = \begin{bmatrix} -3\\ 4\\ -10\\ 4 \end{bmatrix} \qquad \mathbf{z} = \begin{bmatrix} -3\\ 7\\ 0\\ 8 \end{bmatrix} \qquad \mathbf{w} = \begin{bmatrix} 1\\ -1\\ 4\\ 0 \end{bmatrix}$$
Then
$$A\mathbf{x} = \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix}\begin{bmatrix} 1\\ -1\\ 2\\ 5 \end{bmatrix} = \begin{bmatrix} 4\\ -4\\ 8\\ 20 \end{bmatrix} = 4\begin{bmatrix} 1\\ -1\\ 2\\ 5 \end{bmatrix} = 4\mathbf{x}$$
so $\mathbf{x}$ is an eigenvector of $A$ with eigenvalue $\lambda = 4$.

Also,
$$A\mathbf{y} = \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix}\begin{bmatrix} -3\\ 4\\ -10\\ 4 \end{bmatrix} = \begin{bmatrix} 0\\ 0\\ 0\\ 0 \end{bmatrix} = 0\begin{bmatrix} -3\\ 4\\ -10\\ 4 \end{bmatrix} = 0\mathbf{y}$$
so $\mathbf{y}$ is an eigenvector of $A$ with eigenvalue $\lambda = 0$.

Also,
$$A\mathbf{z} = \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix}\begin{bmatrix} -3\\ 7\\ 0\\ 8 \end{bmatrix} = \begin{bmatrix} -6\\ 14\\ 0\\ 16 \end{bmatrix} = 2\begin{bmatrix} -3\\ 7\\ 0\\ 8 \end{bmatrix} = 2\mathbf{z}$$
so $\mathbf{z}$ is an eigenvector of $A$ with eigenvalue $\lambda = 2$.

Also,
$$A\mathbf{w} = \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix}\begin{bmatrix} 1\\ -1\\ 4\\ 0 \end{bmatrix} = \begin{bmatrix} 2\\ -2\\ 8\\ 0 \end{bmatrix} = 2\begin{bmatrix} 1\\ -1\\ 4\\ 0 \end{bmatrix} = 2\mathbf{w}$$
so $\mathbf{w}$ is an eigenvector of $A$ with eigenvalue $\lambda = 2$.

So we have demonstrated four eigenvectors of $A$. Are there more? Yes, any nonzero scalar multiple of an eigenvector is again an eigenvector. In this example, set $\mathbf{u} = 30\mathbf{x}$. Then
$$\begin{aligned}
A\mathbf{u} &= A(30\mathbf{x})\\
&= 30A\mathbf{x} && \text{Theorem MMSMM}\\
&= 30(4\mathbf{x}) && \text{Definition EEM}\\
&= 4(30\mathbf{x}) && \text{Property SMAM}\\
&= 4\mathbf{u}
\end{aligned}$$
so that $\mathbf{u}$ is also an eigenvector of $A$ for the same eigenvalue, $\lambda = 4$.

The vectors $\mathbf{z}$ and $\mathbf{w}$ are both eigenvectors of $A$ for the same eigenvalue $\lambda = 2$, yet this is not as simple as the two vectors just being scalar multiples of each other (they are not). Look what happens when we add them together, to form $\mathbf{v} = \mathbf{z} + \mathbf{w}$, and multiply by $A$,
$$\begin{aligned}
A\mathbf{v} &= A(\mathbf{z} + \mathbf{w})\\
&= A\mathbf{z} + A\mathbf{w} && \text{Theorem MMDAA}\\
&= 2\mathbf{z} + 2\mathbf{w} && \text{Definition EEM}\\
&= 2(\mathbf{z} + \mathbf{w}) && \text{Property DVAC}\\
&= 2\mathbf{v}
\end{aligned}$$
so that $\mathbf{v}$ is also an eigenvector of $A$ for the eigenvalue $\lambda = 2$. So it would appear that the set of eigenvectors that are associated with a fixed eigenvalue is closed under the vector space operations of $\mathbb{C}^n$. Hmmm.

The vector $\mathbf{y}$ is an eigenvector of $A$ for the eigenvalue $\lambda = 0$, so we can use Theorem ZSSM to write $A\mathbf{y} = 0\mathbf{y} = \mathbf{0}$. But this also means that $\mathbf{y} \in \mathcal{N}(A)$. There would appear to be a connection here also. △
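Checks like those in this example are easy to automate. Here is a minimal sketch, our own Python/NumPy code, that verifies $A\mathbf{x} = \lambda\mathbf{x}$ for the first vector; the candidate $\lambda$ is read off from the first entries, which happen to be nonzero here.

```python
import numpy as np

A = np.array([[ 204,   98, -26, -10],
              [-280, -134,  36,  14],
              [ 716,  348, -90, -36],
              [-472, -232,  60,  28]], dtype=float)
x = np.array([1., -1., 2., 5.])

Ax = A @ x
lam = Ax[0] / x[0]                     # candidate eigenvalue
print(np.allclose(Ax, lam * x), lam)   # True 4.0
```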

Example SEE hints at a number of intriguing properties, and there are many more. We will explore the general properties of eigenvalues and eigenvectors in Section PEE, but in this section we will concern ourselves with the question of actually computing eigenvalues and eigenvectors. First we need a bit of background material on polynomials and matrices.
material on polynomials and matrices.



Subsection PM
Polynomials and Matrices

A polynomial is a combination of powers, multiplication by scalar coefficients, and addition (with subtraction just being the inverse of addition). We never have occasion to divide when computing the value of a polynomial. So it is with matrices. We can add and subtract matrices, we can multiply matrices by scalars, and we can form powers of square matrices by repeated applications of matrix multiplication. We do not normally divide matrices (though sometimes we can multiply by an inverse). If a matrix is square, all the operations constituting a polynomial will preserve the size of the matrix. So it is natural to consider evaluating a polynomial with a matrix, effectively replacing the variable of the polynomial by a matrix. We will demonstrate with an example.

Example PM Polynomial of a matrix
Let
\[
p(x) = 14 + 19x - 3x^2 - 7x^3 + x^4
\qquad
D = \begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix}
\]
and we will compute p(D).

First, the necessary powers of D. Notice that D^0 is defined to be the multiplicative identity, I_3, as will be the case in general.
\begin{align*}
D^0 &= I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
D^1 &= D = \begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix} \\
D^2 &= DD^1 = \begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix}\begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix} = \begin{bmatrix} -2 & -1 & -6 \\ 5 & 1 & 0 \\ 1 & -8 & -7 \end{bmatrix} \\
D^3 &= DD^2 = \begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix}\begin{bmatrix} -2 & -1 & -6 \\ 5 & 1 & 0 \\ 1 & -8 & -7 \end{bmatrix} = \begin{bmatrix} 19 & -12 & -8 \\ -4 & 15 & 8 \\ 12 & -4 & 11 \end{bmatrix} \\
D^4 &= DD^3 = \begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix}\begin{bmatrix} 19 & -12 & -8 \\ -4 & 15 & 8 \\ 12 & -4 & 11 \end{bmatrix} = \begin{bmatrix} -7 & 49 & 54 \\ -5 & -4 & -30 \\ -49 & 47 & 43 \end{bmatrix}
\end{align*}
Then
\begin{align*}
p(D) &= 14 + 19D - 3D^2 - 7D^3 + D^4 \\
&= 14\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
+ 19\begin{bmatrix} -1 & 3 & 2 \\ 1 & 0 & -2 \\ -3 & 1 & 1 \end{bmatrix}
- 3\begin{bmatrix} -2 & -1 & -6 \\ 5 & 1 & 0 \\ 1 & -8 & -7 \end{bmatrix}
- 7\begin{bmatrix} 19 & -12 & -8 \\ -4 & 15 & 8 \\ 12 & -4 & 11 \end{bmatrix}
+ \begin{bmatrix} -7 & 49 & 54 \\ -5 & -4 & -30 \\ -49 & 47 & 43 \end{bmatrix} \\
&= \begin{bmatrix} -139 & 193 & 166 \\ 27 & -98 & -124 \\ -193 & 118 & 20 \end{bmatrix}
\end{align*}
Notice that p(x) factors as
\[
p(x) = 14 + 19x - 3x^2 - 7x^3 + x^4 = (x - 2)(x - 7)(x + 1)^2
\]
Because D commutes with itself (DD = DD), we can use distributivity of matrix multiplication across matrix addition (Theorem MMDAA) without being careful with any of the matrix products, and just as easily evaluate p(D) using the factored form of p(x),
\begin{align*}
p(D) &= 14 + 19D - 3D^2 - 7D^3 + D^4 = (D - 2I_3)(D - 7I_3)(D + I_3)^2 \\
&= \begin{bmatrix} -3 & 3 & 2 \\ 1 & -2 & -2 \\ -3 & 1 & -1 \end{bmatrix}
\begin{bmatrix} -8 & 3 & 2 \\ 1 & -7 & -2 \\ -3 & 1 & -6 \end{bmatrix}
\begin{bmatrix} 0 & 3 & 2 \\ 1 & 1 & -2 \\ -3 & 1 & 2 \end{bmatrix}^2 \\
&= \begin{bmatrix} -139 & 193 & 166 \\ 27 & -98 & -124 \\ -193 & 118 & 20 \end{bmatrix}
\end{align*}
This example is not meant to be too profound. It is meant to show you that it is natural to evaluate a polynomial with a matrix, and that the factored form of the polynomial is as good as (or maybe better than) the expanded form. And do not forget that constant terms in polynomials are really multiples of the identity matrix when we are evaluating the polynomial with a matrix.
△
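For experimentation, matrix polynomial evaluation is a one-liner in most systems. Here is a small sketch (our illustration, assuming NumPy) that checks the expanded and factored forms of p(D) agree:

```python
import numpy as np

D = np.array([[-1, 3,  2],
              [ 1, 0, -2],
              [-3, 1,  1]])
I3 = np.eye(3, dtype=int)

# expanded form: the constant term becomes a multiple of the identity
expanded = (14 * I3 + 19 * D
            - 3 * np.linalg.matrix_power(D, 2)
            - 7 * np.linalg.matrix_power(D, 3)
            + np.linalg.matrix_power(D, 4))

# factored form: p(x) = (x - 2)(x - 7)(x + 1)^2
factored = (D - 2 * I3) @ (D - 7 * I3) @ np.linalg.matrix_power(D + I3, 2)

assert np.array_equal(expanded, factored)
print(expanded)   # [[-139, 193, 166], [27, -98, -124], [-193, 118, 20]]
```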

Subsection EEE
Existence of Eigenvalues and Eigenvectors

Before we embark on computing eigenvalues and eigenvectors, we will prove that every matrix has at least one eigenvalue (and an eigenvector to go with it). Later, in Theorem MNEM, we will determine the maximum number of eigenvalues a matrix may have.

The determinant (Definition DM) will be a powerful tool in Subsection EE.CEE when it comes time to compute eigenvalues. However, it is possible, with some more advanced machinery, to compute eigenvalues without ever making use of the determinant. Sheldon Axler does just that in his book, Linear Algebra Done Right. Here and now, we give Axler's "determinant-free" proof that every matrix has an eigenvalue. The result is not too startling, but the proof is most enjoyable.

Theorem EMHE Every Matrix Has an Eigenvalue
Suppose A is a square matrix. Then A has at least one eigenvalue.

Proof. Suppose that A has size n, and choose x as any nonzero vector from C^n. (Notice how much latitude we have in our choice of x. Only the zero vector is off-limits.) Consider the set
\[
S = \{ x,\, Ax,\, A^2x,\, A^3x,\, \ldots,\, A^nx \}
\]
This is a set of n + 1 vectors from C^n, so by Theorem MVSLD, S is linearly dependent. Let a_0, a_1, a_2, ..., a_n be a collection of n + 1 scalars from C, not all zero, that provide a relation of linear dependence on S. In other words,
\[
a_0x + a_1Ax + a_2A^2x + a_3A^3x + \cdots + a_nA^nx = 0
\]
Some of the a_i are nonzero. Suppose that just a_0 ≠ 0, and a_1 = a_2 = a_3 = ··· = a_n = 0. Then a_0 x = 0 and by Theorem SMEZV, either a_0 = 0 or x = 0, which are both contradictions. So a_i ≠ 0 for some i ≥ 1. Let m be the largest integer such that a_m ≠ 0. From this discussion we know that m ≥ 1. We can also assume that a_m = 1, for if not, replace each a_i by a_i/a_m to obtain scalars that serve equally well in providing a relation of linear dependence on S.

Define the polynomial
\[
p(x) = a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots + a_mx^m
\]
Because we have consistently used C as our set of scalars (rather than R), we know that we can factor p(x) into linear factors of the form (x − b_i), where b_i ∈ C. So there are scalars, b_1, b_2, b_3, ..., b_m, from C so that,
\[
p(x) = (x - b_m)(x - b_{m-1})\cdots(x - b_3)(x - b_2)(x - b_1)
\]
Put it all together and
\begin{align*}
0 &= a_0x + a_1Ax + a_2A^2x + \cdots + a_nA^nx \\
&= a_0x + a_1Ax + a_2A^2x + \cdots + a_mA^mx && a_i = 0 \text{ for } i > m \\
&= \left(a_0I_n + a_1A + a_2A^2 + \cdots + a_mA^m\right)x && \text{Theorem MMDAA} \\
&= p(A)x && \text{Definition of } p(x) \\
&= (A - b_mI_n)(A - b_{m-1}I_n)\cdots(A - b_2I_n)(A - b_1I_n)x
\end{align*}
Let k be the smallest integer such that
\[
(A - b_kI_n)(A - b_{k-1}I_n)\cdots(A - b_2I_n)(A - b_1I_n)x = 0.
\]
From the preceding equation, we know that k ≤ m. Define the vector z by
\[
z = (A - b_{k-1}I_n)\cdots(A - b_2I_n)(A - b_1I_n)x
\]
Notice that by the definition of k, the vector z must be nonzero. In the case where k = 1, we understand that z is defined by z = x, and z is still nonzero. Now
\[
(A - b_kI_n)z = (A - b_kI_n)(A - b_{k-1}I_n)\cdots(A - b_3I_n)(A - b_2I_n)(A - b_1I_n)x = 0
\]
which allows us to write
\begin{align*}
Az &= (A + O)z && \text{Property ZM} \\
&= (A - b_kI_n + b_kI_n)z && \text{Property AIM} \\
&= (A - b_kI_n)z + b_kI_nz && \text{Theorem MMDAA} \\
&= 0 + b_kI_nz && \text{Defining property of } z \\
&= b_kI_nz && \text{Property ZM} \\
&= b_kz && \text{Theorem MMIM}
\end{align*}
Since z ≠ 0, this equation says that z is an eigenvector of A for the eigenvalue λ = b_k (Definition EEM), so we have shown that any square matrix A does have at least one eigenvalue. ■

The proof of Theorem EMHE is constructive (it contains an unambiguous procedure that leads to an eigenvalue), but it is not meant to be practical. We will illustrate the theorem with an example, the purpose being to provide a companion for studying the proof and not to suggest this is the best procedure for computing an eigenvalue.

Example CAEHW Computing an eigenvalue the hard way
This example illustrates the proof of Theorem EMHE, and so will employ the same notation as the proof — look there for full explanations. It is not meant to be an example of a reasonable computational approach to finding eigenvalues and eigenvectors. OK, warnings in place, here we go.

Consider the matrix A, and choose the vector x,
\[
A = \begin{bmatrix}
-7 & -1 & 11 & 0 & -4 \\
4 & 1 & 0 & 2 & 0 \\
-10 & -1 & 14 & 0 & -4 \\
8 & 2 & -15 & -1 & 5 \\
-10 & -1 & 16 & 0 & -6
\end{bmatrix}
\qquad
x = \begin{bmatrix} 3 \\ 0 \\ 3 \\ -5 \\ 4 \end{bmatrix}
\]
It is important to notice that the choice of x could be anything, so long as it is not the zero vector. We have not chosen x totally at random, but so as to make our illustration of the theorem as general as possible. You could replicate this example with your own choice and the computations are guaranteed to be reasonable, provided you have a computational tool that will factor a fifth degree polynomial for you.

The set
\[
S = \{ x,\, Ax,\, A^2x,\, A^3x,\, A^4x,\, A^5x \}
= \left\{
\begin{bmatrix} 3 \\ 0 \\ 3 \\ -5 \\ 4 \end{bmatrix},
\begin{bmatrix} -4 \\ 2 \\ -4 \\ 4 \\ -6 \end{bmatrix},
\begin{bmatrix} 6 \\ -6 \\ 6 \\ -2 \\ 10 \end{bmatrix},
\begin{bmatrix} -10 \\ 14 \\ -10 \\ -2 \\ -18 \end{bmatrix},
\begin{bmatrix} 18 \\ -30 \\ 18 \\ 10 \\ 34 \end{bmatrix},
\begin{bmatrix} -34 \\ 62 \\ -34 \\ -26 \\ -66 \end{bmatrix}
\right\}
\]
is guaranteed to be linearly dependent, as it has six vectors from C^5 (Theorem MVSLD).

We will search for a nontrivial relation of linear dependence by solving a homogeneous system of equations whose coefficient matrix has the vectors of S as columns through row operations,
\[
\begin{bmatrix}
3 & -4 & 6 & -10 & 18 & -34 \\
0 & 2 & -6 & 14 & -30 & 62 \\
3 & -4 & 6 & -10 & 18 & -34 \\
-5 & 4 & -2 & -2 & 10 & -26 \\
4 & -6 & 10 & -18 & 34 & -66
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & -2 & 6 & -14 & 30 \\
0 & 1 & -3 & 7 & -15 & 31 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
There are four free variables for describing solutions to this homogeneous system, so we have our pick of solutions. The most expedient choice would be to set x_3 = 1 and x_4 = x_5 = x_6 = 0. However, we will again opt to maximize the generality of our illustration of Theorem EMHE and choose x_3 = −8, x_4 = −3, x_5 = 1 and x_6 = 0. This leads to a solution with x_1 = 16 and x_2 = 12.

This relation of linear dependence then says that
\begin{align*}
0 &= 16x + 12Ax - 8A^2x - 3A^3x + A^4x + 0A^5x \\
0 &= \left(16 + 12A - 8A^2 - 3A^3 + A^4\right)x
\end{align*}
So we define p(x) = 16 + 12x − 8x^2 − 3x^3 + x^4, and as advertised in the proof of Theorem EMHE, we have a polynomial of degree m = 4 > 1 such that p(A)x = 0. Now we need to factor p(x) over C. If you made your own choice of x at the start, this is where you might have a fifth degree polynomial, and where you might need to use a computational tool to find roots and factors. We have
\[
p(x) = 16 + 12x - 8x^2 - 3x^3 + x^4 = (x - 4)(x + 2)(x - 2)(x + 1)
\]
So we know that
\[
0 = p(A)x = (A - 4I_5)(A + 2I_5)(A - 2I_5)(A + 1I_5)x
\]
We apply one factor at a time, until we get the zero vector, so as to determine the value of k described in the proof of Theorem EMHE,
\begin{align*}
(A + 1I_5)x &=
\begin{bmatrix}
-6 & -1 & 11 & 0 & -4 \\
4 & 2 & 0 & 2 & 0 \\
-10 & -1 & 15 & 0 & -4 \\
8 & 2 & -15 & 0 & 5 \\
-10 & -1 & 16 & 0 & -5
\end{bmatrix}
\begin{bmatrix} 3 \\ 0 \\ 3 \\ -5 \\ 4 \end{bmatrix}
= \begin{bmatrix} -1 \\ 2 \\ -1 \\ -1 \\ -2 \end{bmatrix} \\
(A - 2I_5)(A + 1I_5)x &=
\begin{bmatrix}
-9 & -1 & 11 & 0 & -4 \\
4 & -1 & 0 & 2 & 0 \\
-10 & -1 & 12 & 0 & -4 \\
8 & 2 & -15 & -3 & 5 \\
-10 & -1 & 16 & 0 & -8
\end{bmatrix}
\begin{bmatrix} -1 \\ 2 \\ -1 \\ -1 \\ -2 \end{bmatrix}
= \begin{bmatrix} 4 \\ -8 \\ 4 \\ 4 \\ 8 \end{bmatrix} \\
(A + 2I_5)(A - 2I_5)(A + 1I_5)x &=
\begin{bmatrix}
-5 & -1 & 11 & 0 & -4 \\
4 & 3 & 0 & 2 & 0 \\
-10 & -1 & 16 & 0 & -4 \\
8 & 2 & -15 & 1 & 5 \\
-10 & -1 & 16 & 0 & -4
\end{bmatrix}
\begin{bmatrix} 4 \\ -8 \\ 4 \\ 4 \\ 8 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\end{align*}
So k = 3 and
\[
z = (A - 2I_5)(A + 1I_5)x = \begin{bmatrix} 4 \\ -8 \\ 4 \\ 4 \\ 8 \end{bmatrix}
\]
is an eigenvector of A for the eigenvalue λ = −2, as you can check by doing the computation Az. If you work through this example with your own choice of the vector x (strongly recommended) then the eigenvalue you will find may be different, but will be in the set {3, 0, 1, −1, −2}. See Exercise EE.M60 for a suggested starting vector.
△
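If you want to replicate this example with your own starting vector, the whole procedure can be automated. The sketch below is a rough floating-point rendition of the proof's recipe, assuming NumPy (the variable names and tolerance are ours; in exact arithmetic you would row-reduce and factor as the text does):

```python
import numpy as np

A = np.array([[ -7, -1,  11,  0, -4],
              [  4,  1,   0,  2,  0],
              [-10, -1,  14,  0, -4],
              [  8,  2, -15, -1,  5],
              [-10, -1,  16,  0, -6]], dtype=float)
x = np.array([3., 0., 3., -5., 4.])
n = A.shape[0]

# columns x, Ax, ..., A^n x: n + 1 vectors in C^n must be dependent (Theorem MVSLD)
K = np.column_stack([np.linalg.matrix_power(A, j) @ x for j in range(n + 1)])

# a null space vector of K supplies coefficients a_0, ..., a_n with p(A)x = 0
a = np.linalg.svd(K)[2][-1]
roots = np.roots(a[::-1])       # np.roots wants the highest-degree coefficient first

# apply factors (A - b I) one at a time; the step that first produces the
# zero vector exposes an eigenvalue b and an eigenvector z, as in the proof
z = x.astype(complex)
for b in roots:
    nxt = (A - b * np.eye(n)) @ z
    if np.linalg.norm(nxt) < 1e-8 * max(1.0, np.linalg.norm(z)):
        print("eigenvalue", np.round(b, 6), "eigenvector", np.round(z, 6))
        break
    z = nxt
```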

Subsection CEE
Computing Eigenvalues and Eigenvectors

Fortunately, we need not rely on the procedure of Theorem EMHE each time we need an eigenvalue. It is the determinant, and specifically Theorem SMZD, that provides the main tool for computing eigenvalues. Here is an informal sequence of equivalences that is the key to determining the eigenvalues and eigenvectors of a matrix,
\[
Ax = \lambda x \iff Ax - \lambda I_n x = 0 \iff (A - \lambda I_n)x = 0
\]
So, for an eigenvalue λ and associated eigenvector x ≠ 0, the vector x will be a nonzero element of the null space of A − λI_n, while the matrix A − λI_n will be singular and therefore have zero determinant. These ideas are made precise in Theorem EMRCP and Theorem EMNS, but for now this brief discussion should suffice as motivation for the following definition and example.

Definition CP Characteristic Polynomial
Suppose that A is a square matrix of size n. Then the characteristic polynomial of A is the polynomial p_A(x) defined by
\[
p_A(x) = \det(A - xI_n)
\]
□

Example CPMS3 Characteristic polynomial of a matrix, size 3
Consider
\[
F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix}
\]
Then
\begin{align*}
p_F(x) &= \det(F - xI_3) && \text{Definition CP} \\
&= \begin{vmatrix} -13 - x & -8 & -4 \\ 12 & 7 - x & 4 \\ 24 & 16 & 7 - x \end{vmatrix} && \text{Definition DM} \\
&= (-13 - x)\begin{vmatrix} 7 - x & 4 \\ 16 & 7 - x \end{vmatrix}
+ (-8)(-1)\begin{vmatrix} 12 & 4 \\ 24 & 7 - x \end{vmatrix}
+ (-4)\begin{vmatrix} 12 & 7 - x \\ 24 & 16 \end{vmatrix} && \text{Theorem DMST} \\
&= (-13 - x)((7 - x)(7 - x) - 4(16)) \\
&\quad + (-8)(-1)(12(7 - x) - 4(24)) \\
&\quad + (-4)(12(16) - (7 - x)(24)) \\
&= 3 + 5x + x^2 - x^3 \\
&= -(x - 3)(x + 1)^2
\end{align*}
△

The characteristic polynomial is our main computational tool for finding eigenvalues, and will sometimes be used to aid us in determining the properties of eigenvalues.

Theorem EMRCP Eigenvalues of a Matrix are Roots of Characteristic Polynomials
Suppose A is a square matrix. Then λ is an eigenvalue of A if and only if p_A(λ) = 0.

Proof. Suppose A has size n.
\begin{align*}
&\lambda \text{ is an eigenvalue of } A \\
&\iff \text{there exists } x \neq 0 \text{ so that } Ax = \lambda x && \text{Definition EEM} \\
&\iff \text{there exists } x \neq 0 \text{ so that } Ax - \lambda x = 0 && \text{Property AIC} \\
&\iff \text{there exists } x \neq 0 \text{ so that } Ax - \lambda I_n x = 0 && \text{Theorem MMIM} \\
&\iff \text{there exists } x \neq 0 \text{ so that } (A - \lambda I_n)x = 0 && \text{Theorem MMDAA} \\
&\iff A - \lambda I_n \text{ is singular} && \text{Definition NM} \\
&\iff \det(A - \lambda I_n) = 0 && \text{Theorem SMZD} \\
&\iff p_A(\lambda) = 0 && \text{Definition CP}
\end{align*}
■

Example EMS3 Eigenvalues of a matrix, size 3
In Example CPMS3 we found the characteristic polynomial of
\[
F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix}
\]
to be p_F(x) = −(x − 3)(x + 1)^2. Factored, we can find all of its roots easily; they are x = 3 and x = −1. By Theorem EMRCP, λ = 3 and λ = −1 are both eigenvalues of F, and these are the only eigenvalues of F. We have found them all.
△
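A quick cross-check with software is possible here. The sketch below is our illustration, assuming NumPy; note that np.poly uses the det(xI_n − A) convention, which differs from Definition CP by a factor of (−1)^n (compare Exercise T15):

```python
import numpy as np

F = np.array([[-13, -8, -4],
              [ 12,  7,  4],
              [ 24, 16,  7]])

# np.poly gives det(xI - F) = x^3 - x^2 - 5x - 3, highest degree first;
# this is (-1)^3 times p_F(x) = 3 + 5x + x^2 - x^3 from Example CPMS3
print(np.poly(F))            # approximately [ 1. -1. -5. -3.]

# its roots are the eigenvalues: 3 and -1 (the latter repeated)
print(np.linalg.eigvals(F))  # approximately [ 3., -1., -1.]
```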

Let us now turn our attention to the computation of eigenvectors.

Definition EM Eigenspace of a Matrix
Suppose that A is a square matrix and λ is an eigenvalue of A. Then the eigenspace of A for λ, E_A(λ), is the set of all the eigenvectors of A for λ, together with the inclusion of the zero vector.
□

Example SEE hinted that the set of eigenvectors for a single eigenvalue might have some closure properties, and with the addition of the one vector that is never an eigenvector, the zero vector 0, we indeed get a whole subspace.

Theorem EMS Eigenspace for a Matrix is a Subspace
Suppose A is a square matrix of size n and λ is an eigenvalue of A. Then the eigenspace E_A(λ) is a subspace of the vector space C^n.

Proof. We will check the three conditions of Theorem TSS. First, Definition EM explicitly includes the zero vector in E_A(λ), so the set is nonempty.

Suppose that x, y ∈ E_A(λ), that is, x and y are two eigenvectors of A for λ. Then
\begin{align*}
A(x + y) &= Ax + Ay && \text{Theorem MMDAA} \\
&= \lambda x + \lambda y && \text{Definition EEM} \\
&= \lambda(x + y) && \text{Property DVAC}
\end{align*}
So either x + y = 0, or x + y is an eigenvector of A for λ (Definition EEM). So, in either event, x + y ∈ E_A(λ), and we have additive closure.

Suppose that α ∈ C, and that x ∈ E_A(λ), that is, x is an eigenvector of A for λ. Then
\begin{align*}
A(\alpha x) &= \alpha(Ax) && \text{Theorem MMSMM} \\
&= \alpha\lambda x && \text{Definition EEM} \\
&= \lambda(\alpha x) && \text{Property SMAC}
\end{align*}
So either αx = 0, or αx is an eigenvector of A for λ (Definition EEM). So, in either event, αx ∈ E_A(λ), and we have scalar closure.

With the three conditions of Theorem TSS met, we know E_A(λ) is a subspace. ■

Theorem EMS tells us that an eigenspace is a subspace (and hence a vector space in its own right). Our next theorem tells us how to quickly construct this subspace.

Theorem EMNS Eigenspace of a Matrix is a Null Space
Suppose A is a square matrix of size n and λ is an eigenvalue of A. Then
\[
E_A(\lambda) = N(A - \lambda I_n)
\]
Proof. The conclusion of this theorem is an equality of sets, so normally we would follow the advice of Definition SE. However, in this case we can construct a sequence of equivalences which will together provide the two subset inclusions we need. First, notice that 0 ∈ E_A(λ) by Definition EM and 0 ∈ N(A − λI_n) by Theorem HSC.

Now consider any nonzero vector x ∈ C^n,
\begin{align*}
x \in E_A(\lambda) &\iff Ax = \lambda x && \text{Definition EM} \\
&\iff Ax - \lambda x = 0 && \text{Property AIC} \\
&\iff Ax - \lambda I_n x = 0 && \text{Theorem MMIM} \\
&\iff (A - \lambda I_n)x = 0 && \text{Theorem MMDAA} \\
&\iff x \in N(A - \lambda I_n) && \text{Definition NSM}
\end{align*}
■

You might notice the close parallels (and differences) between the proofs of Theorem EMRCP and Theorem EMNS. Since Theorem EMNS describes the set of all the eigenvectors of A as a null space we can use techniques such as Theorem BNS to provide concise descriptions of eigenspaces. Theorem EMNS also provides a trivial proof for Theorem EMS.

Example ESMS3 Eigenspaces of a matrix, size 3
Example CPMS3 and Example EMS3 describe the characteristic polynomial and eigenvalues of the 3 × 3 matrix
\[
F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix}
\]
We will now take each eigenvalue in turn and compute its eigenspace. To do this, we row-reduce the matrix F − λI_3 in order to determine solutions to the homogeneous system LS(F − λI_3, 0) and then express the eigenspace as the null space of F − λI_3 (Theorem EMNS). Theorem BNS then tells us how to write the null space as the span of a basis.

λ = 3:
\[
F - 3I_3 = \begin{bmatrix} -16 & -8 & -4 \\ 12 & 4 & 4 \\ 24 & 16 & 4 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & \frac{1}{2} \\ 0 & 1 & -\frac{1}{2} \\ 0 & 0 & 0 \end{bmatrix}
\]
\[
E_F(3) = N(F - 3I_3)
= \left\langle\left\{ \begin{bmatrix} -\frac{1}{2} \\ \frac{1}{2} \\ 1 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} \right\}\right\rangle
\]
λ = −1:
\[
F + 1I_3 = \begin{bmatrix} -12 & -8 & -4 \\ 12 & 8 & 4 \\ 24 & 16 & 8 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & \frac{2}{3} & \frac{1}{3} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
\[
E_F(-1) = N(F + 1I_3)
= \left\langle\left\{ \begin{bmatrix} -\frac{2}{3} \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -\frac{1}{3} \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -2 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 3 \end{bmatrix} \right\}\right\rangle
\]
Eigenspaces in hand, we can easily compute eigenvectors by forming nontrivial linear combinations of the basis vectors describing each eigenspace. In particular, notice that we can "pretty up" our basis vectors by using scalar multiples to clear out fractions.
△
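Theorem EMNS also gives a direct computational recipe: form F − λI_3 and compute a null space basis. A minimal exact-arithmetic sketch (our illustration, assuming SymPy is installed):

```python
from sympy import Matrix, eye

F = Matrix([[-13, -8, -4],
            [ 12,  7,  4],
            [ 24, 16,  7]])

# eigenspace = null space of F - lambda*I (Theorem EMNS)
for lam in (3, -1):
    basis = (F - lam * eye(3)).nullspace()
    print(lam, [list(v) for v in basis])
# lambda = 3 gives span{(-1/2, 1/2, 1)}; lambda = -1 gives a 2-dimensional eigenspace
```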

Subsection ECEE
Examples of Computing Eigenvalues and Eigenvectors

There are no theorems in this section, just a selection of examples meant to illustrate the range of possibilities for the eigenvalues and eigenvectors of a matrix. These examples can all be done by hand, though the computation of the characteristic polynomial would be very time-consuming and error-prone. It can also be difficult to factor an arbitrary polynomial, though if we were to suggest that most of our eigenvalues are going to be integers, then it can be easier to hunt for roots. These examples are meant to look similar to a concatenation of Example CPMS3, Example EMS3 and Example ESMS3. First, we will sneak in a pair of definitions so we can illustrate them throughout this sequence of examples.

Definition AME Algebraic Multiplicity of an Eigenvalue
Suppose that A is a square matrix and λ is an eigenvalue of A. Then the algebraic multiplicity of λ, α_A(λ), is the highest power of (x − λ) that divides the characteristic polynomial, p_A(x).
□

Since an eigenvalue λ is a root of the characteristic polynomial, there is always a factor of (x − λ), and the algebraic multiplicity is just the power of this factor in a factorization of p_A(x). So in particular, α_A(λ) ≥ 1. Compare the definition of algebraic multiplicity with the next definition.

Definition GME Geometric Multiplicity of an Eigenvalue
Suppose that A is a square matrix and λ is an eigenvalue of A. Then the geometric multiplicity of λ, γ_A(λ), is the dimension of the eigenspace E_A(λ).
□

Every eigenvalue must have at least one eigenvector, so the associated eigenspace cannot be trivial, and so γ_A(λ) ≥ 1.
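Computer algebra systems report both multiplicities at once. The sketch below is our illustration, assuming SymPy, using the matrix B of Example EMMS4 just below; eigenvects() returns each eigenvalue with its algebraic multiplicity and an eigenspace basis, whose length is the geometric multiplicity:

```python
from sympy import Matrix

# the matrix B from the following Example EMMS4
B = Matrix([[-2,  1, -2, -4],
            [12,  1,  4,  9],
            [ 6,  5, -2, -4],
            [ 3, -4,  5, 10]])

# triples (eigenvalue, algebraic multiplicity, eigenspace basis)
for lam, alg_mult, basis in B.eigenvects():
    print(lam, alg_mult, len(basis))  # len(basis) is the geometric multiplicity
# expected output pairs up with Example EMMS4: (1, 1, 1) and (2, 3, 1)
```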

Example EMMS4 Eigenvalue multiplicities, matrix of size 4
Consider the matrix
\[
B = \begin{bmatrix} -2 & 1 & -2 & -4 \\ 12 & 1 & 4 & 9 \\ 6 & 5 & -2 & -4 \\ 3 & -4 & 5 & 10 \end{bmatrix}
\]
then
\[
p_B(x) = 8 - 20x + 18x^2 - 7x^3 + x^4 = (x - 1)(x - 2)^3
\]
So the eigenvalues are λ = 1, 2 with algebraic multiplicities α_B(1) = 1 and α_B(2) = 3.

Computing eigenvectors,

λ = 1:
\[
B - 1I_4 = \begin{bmatrix} -3 & 1 & -2 & -4 \\ 12 & 0 & 4 & 9 \\ 6 & 5 & -3 & -4 \\ 3 & -4 & 5 & 9 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & \frac{1}{3} & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
\[
E_B(1) = N(B - 1I_4)
= \left\langle\left\{ \begin{bmatrix} -\frac{1}{3} \\ 1 \\ 1 \\ 0 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -1 \\ 3 \\ 3 \\ 0 \end{bmatrix} \right\}\right\rangle
\]
λ = 2:
\[
B - 2I_4 = \begin{bmatrix} -4 & 1 & -2 & -4 \\ 12 & -1 & 4 & 9 \\ 6 & 5 & -4 & -4 \\ 3 & -4 & 5 & 8 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 0 & \frac{1}{2} \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & \frac{1}{2} \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
\[
E_B(2) = N(B - 2I_4)
= \left\langle\left\{ \begin{bmatrix} -\frac{1}{2} \\ 1 \\ -\frac{1}{2} \\ 1 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -1 \\ 2 \\ -1 \\ 2 \end{bmatrix} \right\}\right\rangle
\]
So each eigenspace has dimension 1 and so γ_B(1) = 1 and γ_B(2) = 1. This example is of interest because of the discrepancy between the two multiplicities for λ = 2. In many of our examples the algebraic and geometric multiplicities will be equal for all of the eigenvalues (as it was for λ = 1 in this example), so keep this example in mind. We will have some explanations for this phenomenon later (see Example NDMS4).
△



Example ESMS4 Eigenvalues, symmetric matrix of size 4
Consider the matrix
\[
C = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}
\]
then
\[
p_C(x) = -3 + 4x + 2x^2 - 4x^3 + x^4 = (x - 3)(x - 1)^2(x + 1)
\]
So the eigenvalues are λ = 3, 1, −1 with algebraic multiplicities α_C(3) = 1, α_C(1) = 2 and α_C(−1) = 1.

Computing eigenvectors,

λ = 3:
\[
C - 3I_4 = \begin{bmatrix} -2 & 0 & 1 & 1 \\ 0 & -2 & 1 & 1 \\ 1 & 1 & -2 & 0 \\ 1 & 1 & 0 & -2 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
E_C(3) = N(C - 3I_4) = \left\langle\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right\}\right\rangle
\]
λ = 1:
\[
C - 1I_4 = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
E_C(1) = N(C - 1I_4) = \left\langle\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle
\]
λ = −1:
\[
C + 1I_4 = \begin{bmatrix} 2 & 0 & 1 & 1 \\ 0 & 2 & 1 & 1 \\ 1 & 1 & 2 & 0 \\ 1 & 1 & 0 & 2 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
E_C(-1) = N(C + 1I_4) = \left\langle\left\{ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 1 \end{bmatrix} \right\}\right\rangle
\]
So the eigenspace dimensions yield geometric multiplicities γ_C(3) = 1, γ_C(1) = 2 and γ_C(−1) = 1, the same as for the algebraic multiplicities. This example is of interest because C is a symmetric matrix, and will be the subject of Theorem HMRE.
△



Example HMEM5 High multiplicity eigenvalues, matrix of size 5
Consider the matrix
\[
E = \begin{bmatrix}
29 & 14 & 2 & 6 & -9 \\
-47 & -22 & -1 & -11 & 13 \\
19 & 10 & 5 & 4 & -8 \\
-19 & -10 & -3 & -2 & 8 \\
7 & 4 & 3 & 1 & -3
\end{bmatrix}
\]
then
\[
p_E(x) = -16 + 16x + 8x^2 - 16x^3 + 7x^4 - x^5 = -(x - 2)^4(x + 1)
\]
So the eigenvalues are λ = 2, −1 with algebraic multiplicities α_E(2) = 4 and α_E(−1) = 1.

Computing eigenvectors,

λ = 2:
\[
E - 2I_5 = \begin{bmatrix}
27 & 14 & 2 & 6 & -9 \\
-47 & -24 & -1 & -11 & 13 \\
19 & 10 & 3 & 4 & -8 \\
-19 & -10 & -3 & -4 & 8 \\
7 & 4 & 3 & 1 & -5
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & -\frac{3}{2} & -\frac{1}{2} \\
0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
E_E(2) = N(E - 2I_5)
= \left\langle\left\{ \begin{bmatrix} -1 \\ \frac{3}{2} \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ \frac{1}{2} \\ 1 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -2 \\ 3 \\ 0 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \\ 0 \\ 2 \end{bmatrix} \right\}\right\rangle
\]
λ = −1:
\[
E + 1I_5 = \begin{bmatrix}
30 & 14 & 2 & 6 & -9 \\
-47 & -21 & -1 & -11 & 13 \\
19 & 10 & 6 & 4 & -8 \\
-19 & -10 & -3 & -1 & 8 \\
7 & 4 & 3 & 1 & -2
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 2 & 0 \\
0 & 1 & 0 & -4 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
E_E(-1) = N(E + 1I_5)
= \left\langle\left\{ \begin{bmatrix} -2 \\ 4 \\ -1 \\ 1 \\ 0 \end{bmatrix} \right\}\right\rangle
\]
So the eigenspace dimensions yield geometric multiplicities γ_E(2) = 2 and γ_E(−1) = 1. This example is of interest because λ = 2 has such a large algebraic multiplicity, which is also not equal to its geometric multiplicity.
△

Example CEMS6 Complex eigenvalues, matrix of size 6
Consider the matrix
\[
F = \begin{bmatrix}
-59 & -34 & 41 & 12 & 25 & 30 \\
1 & 7 & -46 & -36 & -11 & -29 \\
-233 & -119 & 58 & -35 & 75 & 54 \\
157 & 81 & -43 & 21 & -51 & -39 \\
-91 & -48 & 32 & -5 & 32 & 26 \\
209 & 107 & -55 & 28 & -69 & -50
\end{bmatrix}
\]
then
\begin{align*}
p_F(x) &= -50 + 55x + 13x^2 - 50x^3 + 32x^4 - 9x^5 + x^6 \\
&= (x - 2)(x + 1)(x^2 - 4x + 5)^2 \\
&= (x - 2)(x + 1)((x - (2 + i))(x - (2 - i)))^2 \\
&= (x - 2)(x + 1)(x - (2 + i))^2(x - (2 - i))^2
\end{align*}
So the eigenvalues are λ = 2, −1, 2 + i, 2 − i with algebraic multiplicities α_F(2) = 1, α_F(−1) = 1, α_F(2 + i) = 2 and α_F(2 − i) = 2.

We compute eigenvectors, noting that the last two basis vectors are each a scalar multiple of what Theorem BNS will provide,

λ = 2:
\[
F - 2I_6 = \begin{bmatrix}
-61 & -34 & 41 & 12 & 25 & 30 \\
1 & 5 & -46 & -36 & -11 & -29 \\
-233 & -119 & 56 & -35 & 75 & 54 \\
157 & 81 & -43 & 19 & -51 & -39 \\
-91 & -48 & 32 & -5 & 30 & 26 \\
209 & 107 & -55 & 28 & -69 & -52
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & \frac{1}{5} \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & \frac{3}{5} \\
0 & 0 & 0 & 1 & 0 & -\frac{1}{5} \\
0 & 0 & 0 & 0 & 1 & \frac{4}{5} \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
E_F(2) = N(F - 2I_6)
= \left\langle\left\{ \begin{bmatrix} -\frac{1}{5} \\ 0 \\ -\frac{3}{5} \\ \frac{1}{5} \\ -\frac{4}{5} \\ 1 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ -4 \\ 5 \end{bmatrix} \right\}\right\rangle
\]
λ = −1:
\[
F + 1I_6 = \begin{bmatrix}
-58 & -34 & 41 & 12 & 25 & 30 \\
1 & 8 & -46 & -36 & -11 & -29 \\
-233 & -119 & 59 & -35 & 75 & 54 \\
157 & 81 & -43 & 22 & -51 & -39 \\
-91 & -48 & 32 & -5 & 33 & 26 \\
209 & 107 & -55 & 28 & -69 & -49
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & \frac{1}{2} \\
0 & 1 & 0 & 0 & 0 & -\frac{3}{2} \\
0 & 0 & 1 & 0 & 0 & \frac{1}{2} \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -\frac{1}{2} \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
E_F(-1) = N(F + I_6)
= \left\langle\left\{ \begin{bmatrix} -\frac{1}{2} \\ \frac{3}{2} \\ -\frac{1}{2} \\ 0 \\ \frac{1}{2} \\ 1 \end{bmatrix} \right\}\right\rangle
= \left\langle\left\{ \begin{bmatrix} -1 \\ 3 \\ -1 \\ 0 \\ 1 \\ 2 \end{bmatrix} \right\}\right\rangle
\]
λ = 2 + i:
\[
F - (2 + i)I_6 = \begin{bmatrix}
-61 - i & -34 & 41 & 12 & 25 & 30 \\
1 & 5 - i & -46 & -36 & -11 & -29 \\
-233 & -119 & 56 - i & -35 & 75 & 54 \\
157 & 81 & -43 & 19 - i & -51 & -39 \\
-91 & -48 & 32 & -5 & 30 - i & 26 \\
209 & 107 & -55 & 28 & -69 & -52 - i
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & \frac{1}{5}(7 + i) \\
0 & 1 & 0 & 0 & 0 & \frac{1}{5}(-9 - 2i) \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
E_F(2 + i) = N(F - (2 + i)I_6)
= \left\langle\left\{ \begin{bmatrix} -7 - i \\ 9 + 2i \\ -5 \\ 5 \\ -5 \\ 5 \end{bmatrix} \right\}\right\rangle
\]
λ = 2 − i:
\[
F - (2 - i)I_6 = \begin{bmatrix}
-61 + i & -34 & 41 & 12 & 25 & 30 \\
1 & 5 + i & -46 & -36 & -11 & -29 \\
-233 & -119 & 56 + i & -35 & 75 & 54 \\
157 & 81 & -43 & 19 + i & -51 & -39 \\
-91 & -48 & 32 & -5 & 30 + i & 26 \\
209 & 107 & -55 & 28 & -69 & -52 + i
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & \frac{1}{5}(7 - i) \\
0 & 1 & 0 & 0 & 0 & \frac{1}{5}(-9 + 2i) \\
0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
E_F(2 - i) = N(F - (2 - i)I_6)
= \left\langle\left\{ \begin{bmatrix} -7 + i \\ 9 - 2i \\ -5 \\ 5 \\ -5 \\ 5 \end{bmatrix} \right\}\right\rangle
\]
Eigenspace dimensions yield geometric multiplicities of γ_F(2) = 1, γ_F(−1) = 1, γ_F(2 + i) = 1 and γ_F(2 − i) = 1. This example demonstrates some of the possibilities for the appearance of complex eigenvalues, even when all the entries of the matrix are real. Notice how all the numbers in the analysis of λ = 2 − i are conjugates of the corresponding number in the analysis of λ = 2 + i. This is the content of the upcoming Theorem ERMCP.
△
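The conjugate pairing is easy to observe numerically. A minimal check, assuming NumPy (our illustration):

```python
import numpy as np

F = np.array([[ -59,  -34,  41,  12,  25,  30],
              [   1,    7, -46, -36, -11, -29],
              [-233, -119,  58, -35,  75,  54],
              [ 157,   81, -43,  21, -51, -39],
              [ -91,  -48,  32,  -5,  32,  26],
              [ 209,  107, -55,  28, -69, -50]])

# complex eigenvalues of a real matrix occur in conjugate pairs
print(np.round(np.linalg.eigvals(F), 6))
# approximately: 2, -1, 2+1j (twice), 2-1j (twice)
```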

Example DEMS5 Distinct eigenvalues, matrix of size 5
Consider the matrix
\[
H = \begin{bmatrix}
15 & 18 & -8 & 6 & -5 \\
5 & 3 & 1 & -1 & -3 \\
0 & -4 & 5 & -4 & -2 \\
-43 & -46 & 17 & -14 & 15 \\
26 & 30 & -12 & 8 & -10
\end{bmatrix}
\]
then
\[
p_H(x) = -6x + x^2 + 7x^3 - x^4 - x^5 = -x(x - 2)(x - 1)(x + 1)(x + 3)
\]
So the eigenvalues are λ = 2, 1, 0, −1, −3 with algebraic multiplicities α_H(2) = 1, α_H(1) = 1, α_H(0) = 1, α_H(−1) = 1 and α_H(−3) = 1.

Computing eigenvectors,

λ = 2:
\[
H - 2I_5 = \begin{bmatrix}
13 & 18 & -8 & 6 & -5 \\
5 & 1 & 1 & -1 & -3 \\
0 & -4 & 3 & -4 & -2 \\
-43 & -46 & 17 & -16 & 15 \\
26 & 30 & -12 & 8 & -12
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & -1 \\
0 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 2 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\qquad
E_H(2) = N(H - 2I_5) = \left\langle\left\{ \begin{bmatrix} 1 \\ -1 \\ -2 \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle
\]
λ = 1:
\[
H - 1I_5 = \begin{bmatrix}
14 & 18 & -8 & 6 & -5 \\
5 & 2 & 1 & -1 & -3 \\
0 & -4 & 4 & -4 & -2 \\
-43 & -46 & 17 & -15 & 15 \\
26 & 30 & -12 & 8 & -11
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & -\frac{1}{2} \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & \frac{1}{2} \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\qquad
E_H(1) = N(H - 1I_5) = \left\langle\left\{ \begin{bmatrix} \frac{1}{2} \\ 0 \\ -\frac{1}{2} \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \\ -2 \\ 2 \end{bmatrix} \right\}\right\rangle
\]
λ = 0:
\[
H - 0I_5 = \begin{bmatrix}
15 & 18 & -8 & 6 & -5 \\
5 & 3 & 1 & -1 & -3 \\
0 & -4 & 5 & -4 & -2 \\
-43 & -46 & 17 & -14 & 15 \\
26 & 30 & -12 & 8 & -10
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & -2 \\
0 & 0 & 1 & 0 & -2 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\qquad
E_H(0) = N(H - 0I_5) = \left\langle\left\{ \begin{bmatrix} -1 \\ 2 \\ 2 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle
\]
λ = −1:
\[
H + 1I_5 = \begin{bmatrix}
16 & 18 & -8 & 6 & -5 \\
5 & 4 & 1 & -1 & -3 \\
0 & -4 & 6 & -4 & -2 \\
-43 & -46 & 17 & -13 & 15 \\
26 & 30 & -12 & 8 & -9
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & -\frac{1}{2} \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & \frac{1}{2} \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\qquad
E_H(-1) = N(H + 1I_5) = \left\langle\left\{ \begin{bmatrix} \frac{1}{2} \\ 0 \\ 0 \\ -\frac{1}{2} \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ -1 \\ 2 \end{bmatrix} \right\}\right\rangle
\]
λ = −3:
\[
H + 3I_5 = \begin{bmatrix}
18 & 18 & -8 & 6 & -5 \\
5 & 6 & 1 & -1 & -3 \\
0 & -4 & 8 & -4 & -2 \\
-43 & -46 & 17 & -11 & 15 \\
26 & 30 & -12 & 8 & -7
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & -1 \\
0 & 1 & 0 & 0 & \frac{1}{2} \\
0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\qquad
E_H(-3) = N(H + 3I_5) = \left\langle\left\{ \begin{bmatrix} 1 \\ -\frac{1}{2} \\ -1 \\ -2 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -2 \\ 1 \\ 2 \\ 4 \\ -2 \end{bmatrix} \right\}\right\rangle
\]
So the eigenspace dimensions yield geometric multiplicities γ_H(2) = 1, γ_H(1) = 1, γ_H(0) = 1, γ_H(−1) = 1 and γ_H(−3) = 1, identical to the algebraic multiplicities. This example is of interest for two reasons. First, λ = 0 is an eigenvalue, illustrating the upcoming Theorem SMZE. Second, all the eigenvalues are distinct, yielding algebraic and geometric multiplicities of 1 for each eigenvalue, illustrating Theorem DED.
△
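Again, a quick numerical confirmation is possible; the sketch below (ours, assuming NumPy) recovers the five distinct eigenvalues and the zero determinant promised by the presence of λ = 0:

```python
import numpy as np

H = np.array([[ 15,  18,  -8,   6,  -5],
              [  5,   3,   1,  -1,  -3],
              [  0,  -4,   5,  -4,  -2],
              [-43, -46,  17, -14,  15],
              [ 26,  30, -12,   8, -10]])

# five distinct (real) eigenvalues, one of them zero, so H is singular
print(sorted(np.round(np.linalg.eigvals(H).real, 6)))  # [-3, -1, 0, 1, 2]
print(np.round(np.linalg.det(H), 6))                   # approximately 0
```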

Reading Questions

1. Suppose A is the 2 × 2 matrix
\[
A = \begin{bmatrix} -5 & 8 \\ -4 & 7 \end{bmatrix}
\]
Find the eigenvalues of A.

2. For each eigenvalue of A, find the corresponding eigenspace.

3. For the polynomial p(x) = 3x^2 − x + 2 and A from above, compute p(A).



Exercises

C10† Find the characteristic polynomial of the matrix A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}.

C11† Find the characteristic polynomial of the matrix A = \begin{bmatrix} 3 & 2 & 1 \\ 0 & 1 & 1 \\ 1 & 2 & 0 \end{bmatrix}.

C12† Find the characteristic polynomial of the matrix A = \begin{bmatrix} 1 & 2 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 2 & 1 & 1 & 0 \\ 3 & 1 & 0 & 1 \end{bmatrix}.

C19† Find the eigenvalues, eigenspaces, algebraic multiplicities and geometric multiplicities for the matrix below. It is possible to do all these computations by hand, and it would be instructive to do so.
\[
C = \begin{bmatrix} -1 & 2 \\ -6 & 6 \end{bmatrix}
\]
C20† Find the eigenvalues, eigenspaces, algebraic multiplicities and geometric multiplicities for the matrix below. It is possible to do all these computations by hand, and it would be instructive to do so.
\[
B = \begin{bmatrix} -12 & 30 \\ -5 & 13 \end{bmatrix}
\]
C21† The matrix A below has λ = 2 as an eigenvalue. Find the geometric multiplicity of λ = 2 using your calculator only for row-reducing matrices.
\[
A = \begin{bmatrix} 18 & -15 & 33 & -15 \\ -4 & 8 & -6 & 6 \\ -9 & 9 & -16 & 9 \\ 5 & -6 & 9 & -4 \end{bmatrix}
\]
C22† Without using a calculator, find the eigenvalues of the matrix B.
\[
B = \begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix}
\]
C23† Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.

C24† Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for A = \begin{bmatrix} 1 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & -1 & 1 \end{bmatrix}.

C25† Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for the 3 × 3 identity matrix I_3. Do your results make sense?

C26† For matrix A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}, the characteristic polynomial of A is p_A(x) = (4 − x)(1 − x)^2. Find the eigenvalues and corresponding eigenspaces of A.

C27† For matrix A = \begin{bmatrix} 0 & 4 & -1 & 1 \\ -2 & 6 & -1 & 1 \\ -2 & 8 & -1 & -1 \\ -2 & 8 & -3 & 1 \end{bmatrix}, the characteristic polynomial of A is p_A(x) = (x + 2)(x − 2)^2(x − 4). Find the eigenvalues and corresponding eigenspaces of A.

M60† Repeat Example CAEHW by choosing x = \begin{bmatrix} 0 \\ 8 \\ 2 \\ 1 \\ 2 \end{bmatrix} and then arrive at an eigenvalue and eigenvector of the matrix A. The hard way.

T10† A matrix A is idempotent if A^2 = A. Show that the only possible eigenvalues of an idempotent matrix are λ = 0 and λ = 1. Then give an example of a matrix that is idempotent and has both of these two values as eigenvalues.

T15† The characteristic polynomial of the square matrix A is usually defined as r_A(x) = det(xI_n − A). Find a specific relationship between our characteristic polynomial, p_A(x), and r_A(x), give a proof of your relationship, and use this to explain why Theorem EMRCP can remain essentially unchanged with either definition. Explain the advantages of each definition over the other. (Computing with both definitions, for a 2 × 2 and a 3 × 3 matrix, might be a good way to start.)

T20† Suppose that λ and ρ are two different eigenvalues of the square matrix A. Prove that the intersection of the eigenspaces for these two eigenvalues is trivial. That is, E_A(λ) ∩ E_A(ρ) = {0}.


Section PEE
Properties of Eigenvalues and Eigenvectors

The previous section introduced eigenvalues and eigenvectors, and concentrated on their existence and determination. This section will be more about theorems, and the various properties eigenvalues and eigenvectors enjoy. Like a good 4 × 100 meter relay, we will lead off with one of our better theorems and save the very best for the anchor leg.

Subsection BPE
Basic Properties of Eigenvalues

Theorem EDELI Eigenvectors with Distinct Eigenvalues are Linearly Independent
Suppose that A is an n × n square matrix and S = {x_1, x_2, x_3, ..., x_p} is a set of eigenvectors with eigenvalues λ_1, λ_2, λ_3, ..., λ_p such that λ_i ≠ λ_j whenever i ≠ j. Then S is a linearly independent set.

Proof. If p = 1, then the set S = {x_1} is linearly independent since eigenvectors are nonzero (Definition EEM), so assume for the remainder that p ≥ 2.

We will prove this result by contradiction (Proof Technique CD). Suppose to the contrary that S is a linearly dependent set. Define S_i = {x_1, x_2, x_3, ..., x_i} and let k be an integer such that S_{k−1} = {x_1, x_2, x_3, ..., x_{k−1}} is linearly independent and S_k = {x_1, x_2, x_3, ..., x_k} is linearly dependent. Is there even such an integer k? First, since eigenvectors are nonzero, the set {x_1} is linearly independent. Since we are assuming that S = S_p is linearly dependent, there must be an integer k, 2 ≤ k ≤ p, where the sets S_i transition from linear independence to linear dependence (and stay that way). In other words, x_k is the vector with the smallest index that is a linear combination of just vectors with smaller indices.

Since {x_1, x_2, x_3, ..., x_k} is a linearly dependent set there must be scalars, a_1, a_2, a_3, ..., a_k, not all zero (Definition LI), so that
\[
0 = a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_kx_k
\]
Then,
\begin{align*}
0 &= (A - \lambda_kI_n)0 && \text{Theorem ZVSM} \\
&= (A - \lambda_kI_n)(a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_kx_k) && \text{Definition RLD} \\
&= (A - \lambda_kI_n)a_1x_1 + \cdots + (A - \lambda_kI_n)a_kx_k && \text{Theorem MMDAA} \\
&= a_1(A - \lambda_kI_n)x_1 + \cdots + a_k(A - \lambda_kI_n)x_k && \text{Theorem MMSMM} \\
&= a_1(Ax_1 - \lambda_kI_nx_1) + \cdots + a_k(Ax_k - \lambda_kI_nx_k) && \text{Theorem MMDAA} \\
&= a_1(Ax_1 - \lambda_kx_1) + \cdots + a_k(Ax_k - \lambda_kx_k) && \text{Theorem MMIM} \\
&= a_1(\lambda_1x_1 - \lambda_kx_1) + \cdots + a_k(\lambda_kx_k - \lambda_kx_k) && \text{Definition EEM} \\
&= a_1(\lambda_1 - \lambda_k)x_1 + \cdots + a_k(\lambda_k - \lambda_k)x_k && \text{Theorem MMDAA} \\
&= a_1(\lambda_1 - \lambda_k)x_1 + \cdots + a_{k-1}(\lambda_{k-1} - \lambda_k)x_{k-1} + a_k(0)x_k && \text{Property AICN} \\
&= a_1(\lambda_1 - \lambda_k)x_1 + \cdots + a_{k-1}(\lambda_{k-1} - \lambda_k)x_{k-1} + 0 && \text{Theorem ZSSM} \\
&= a_1(\lambda_1 - \lambda_k)x_1 + \cdots + a_{k-1}(\lambda_{k-1} - \lambda_k)x_{k-1} && \text{Property Z}
\end{align*}
This equation is a relation of linear dependence on the linearly independent set {x_1, x_2, x_3, ..., x_{k−1}}, so the scalars must all be zero. That is, a_i(λ_i − λ_k) = 0 for 1 ≤ i ≤ k − 1. However, we have the hypothesis that the eigenvalues are distinct, so λ_i ≠ λ_k for 1 ≤ i ≤ k − 1. Thus a_i = 0 for 1 ≤ i ≤ k − 1.

This reduces the original relation of linear dependence on {x_1, x_2, x_3, ..., x_k} to the simpler equation a_kx_k = 0. By Theorem SMEZV we conclude that a_k = 0 or x_k = 0. Eigenvectors are never the zero vector (Definition EEM), so a_k = 0. So all of the scalars a_i, 1 ≤ i ≤ k, are zero, contradicting their introduction as the scalars creating a nontrivial relation of linear dependence on the set {x_1, x_2, x_3, ..., x_k}. With a contradiction in hand, we conclude that S must be linearly independent. ■

There is a simple connection between the eigenvalues of a matrix and whether or not the matrix is nonsingular.

Theorem SMZE Singular Matrices have Zero Eigenvalues
Suppose A is a square matrix. Then A is singular if and only if λ = 0 is an eigenvalue of A.

Proof. We have the following equivalences:
\begin{align*}
A \text{ is singular} &\iff \text{there exists } x \neq 0,\ Ax = 0 && \text{Definition NM} \\
&\iff \text{there exists } x \neq 0,\ Ax = 0x && \text{Theorem ZSSM} \\
&\iff \lambda = 0 \text{ is an eigenvalue of } A && \text{Definition EEM}
\end{align*}
■

With an equivalence about singular matrices we can update our list of equivalences about nonsingular matrices.

Theorem NME8 Nonsingular Matrix Equivalences, Round 8
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A are a linearly independent set.
6. A is invertible.
7. The column space of A is C^n, C(A) = C^n.
8. The columns of A are a basis for C^n.
9. The rank of A is n, r(A) = n.
10. The nullity of A is zero, n(A) = 0.
11. The determinant of A is nonzero, det(A) ≠ 0.
12. λ = 0 is not an eigenvalue of A.

Proof. The equivalence of the first and last statements is Theorem SMZE, reformulated by negating each statement in the equivalence. So we are able to improve on Theorem NME7 with this addition. ■

Certain changes to a matrix change its eigenvalues in a predictable way.

Theorem ESMM Eigenvalues of a Scalar Multiple of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Then αλ is an eigenvalue of αA.

Proof. Let x ≠ 0 be one eigenvector of A for λ. Then
\begin{align*}
(\alpha A)x &= \alpha(Ax) && \text{Theorem MMSMM} \\
&= \alpha(\lambda x) && \text{Definition EEM} \\
&= (\alpha\lambda)x && \text{Property SMAC}
\end{align*}
So x ≠ 0 is an eigenvector of αA for the eigenvalue αλ. ■

Unfortunately, there are not parallel theorems about the sum or product of arbitrary matrices. But we can prove a similar result for powers of a matrix.

Theorem EOMP Eigenvalues Of Matrix Powers
Suppose A is a square matrix, λ is an eigenvalue of A, and s ≥ 0 is an integer. Then λ^s is an eigenvalue of A^s.

Proof. Let x ≠ 0 be one eigenvector of A for λ. Suppose A has size n. Then we proceed by induction on s (Proof Technique I). First, for s = 0,
\begin{align*}
A^sx &= A^0x \\
&= I_nx \\
&= x && \text{Theorem MMIM} \\
&= 1x && \text{Property OC} \\
&= \lambda^0x \\
&= \lambda^sx
\end{align*}
so λ^s is an eigenvalue of A^s in this special case. If we assume the theorem is true for s, then we find
\begin{align*}
A^{s+1}x &= A^sAx \\
&= A^s(\lambda x) && \text{Definition EEM} \\
&= \lambda(A^sx) && \text{Theorem MMSMM} \\
&= \lambda(\lambda^sx) && \text{Induction hypothesis} \\
&= (\lambda\lambda^s)x && \text{Property SMAC} \\
&= \lambda^{s+1}x
\end{align*}
So x ≠ 0 is an eigenvector of A^{s+1} for λ^{s+1}, and induction tells us the theorem is true for all s ≥ 0. ■

While we cannot prove that the sum of two arbitrary matrices behaves in any reasonable way with regard to eigenvalues, we can work with the sum of dissimilar powers of the same matrix. We have already seen two connections between eigenvalues and polynomials, in the proof of Theorem EMHE and the characteristic polynomial (Definition CP). Our next theorem strengthens this connection.

Theorem EPM Eigenvalues of the Polynomial of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Let q(x) be a polynomial in the variable x. Then q(λ) is an eigenvalue of the matrix q(A).

Proof. Let x ≠ 0 be one eigenvector of A for λ, and write q(x) = a_0 + a_1x + a_2x^2 + ··· + a_mx^m. Then
\begin{align*}
q(A)x &= \left(a_0A^0 + a_1A^1 + a_2A^2 + \cdots + a_mA^m\right)x \\
&= (a_0A^0)x + (a_1A^1)x + (a_2A^2)x + \cdots + (a_mA^m)x && \text{Theorem MMDAA} \\
&= a_0(A^0x) + a_1(A^1x) + a_2(A^2x) + \cdots + a_m(A^mx) && \text{Theorem MMSMM} \\
&= a_0(\lambda^0x) + a_1(\lambda^1x) + a_2(\lambda^2x) + \cdots + a_m(\lambda^mx) && \text{Theorem EOMP} \\
&= (a_0\lambda^0)x + (a_1\lambda^1)x + (a_2\lambda^2)x + \cdots + (a_m\lambda^m)x && \text{Property SMAC} \\
&= \left(a_0\lambda^0 + a_1\lambda^1 + a_2\lambda^2 + \cdots + a_m\lambda^m\right)x && \text{Property DSAC} \\
&= q(\lambda)x
\end{align*}
So x ≠ 0 is an eigenvector of q(A) for the eigenvalue q(λ). ■

Example BDE Building desired eigenvalues
In Example ESMS4 the $4 \times 4$ symmetric matrix
\[
C = \begin{bmatrix} 1 & 0 & 1 & 1\\ 0 & 1 & 1 & 1\\ 1 & 1 & 1 & 0\\ 1 & 1 & 0 & 1 \end{bmatrix}
\]
is shown to have the three eigenvalues $\lambda = 3,\,1,\,-1$. Suppose we wanted a $4 \times 4$ matrix that has the three eigenvalues $\lambda = 4,\,0,\,-2$. We can employ Theorem EPM by finding a polynomial that converts 3 to 4, 1 to 0, and $-1$ to $-2$. Such a polynomial is called an interpolating polynomial, and in this example we can use
\[
r(x) = \frac{1}{4}x^2 + x - \frac{5}{4}
\]
We will not discuss how to concoct this polynomial, but a text on numerical analysis should provide the details. For now, simply verify that $r(3) = 4$, $r(1) = 0$ and $r(-1) = -2$.
Now compute
\begin{align*}
r(C) &= \frac{1}{4}C^2 + C - \frac{5}{4}I_4\\
&= \frac{1}{4}\begin{bmatrix} 3 & 2 & 2 & 2\\ 2 & 3 & 2 & 2\\ 2 & 2 & 3 & 2\\ 2 & 2 & 2 & 3 \end{bmatrix} + \begin{bmatrix} 1 & 0 & 1 & 1\\ 0 & 1 & 1 & 1\\ 1 & 1 & 1 & 0\\ 1 & 1 & 0 & 1 \end{bmatrix} - \frac{5}{4}\begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 1 & 1 & 3 & 3\\ 1 & 1 & 3 & 3\\ 3 & 3 & 1 & 1\\ 3 & 3 & 1 & 1 \end{bmatrix}
\end{align*}
Theorem EPM tells us that if $r(x)$ transforms the eigenvalues in the desired manner, then $r(C)$ will have the desired eigenvalues. You can check this by computing the eigenvalues of $r(C)$ directly. Furthermore, notice that the multiplicities are the same, and the eigenspaces of $C$ and $r(C)$ are identical.
△
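For readers who like to verify examples by machine, here is a sketch of Example BDE in Python with NumPy (an assumption of this aside, not a tool the text relies on). It builds $r(C)$ and confirms that its eigenvalues are $r$ applied to the eigenvalues of $C$.

```python
import numpy as np

C = np.array([[1, 0, 1, 1],
              [0, 1, 1, 1],
              [1, 1, 1, 0],
              [1, 1, 0, 1]], dtype=float)

def r(X):
    # The interpolating polynomial r(x) = (1/4)x^2 + x - 5/4,
    # evaluated at a square matrix X.
    return 0.25 * (X @ X) + X - 1.25 * np.eye(X.shape[0])

# eigvalsh is appropriate since C (and hence r(C)) is symmetric.
print(np.round(np.linalg.eigvalsh(C), 10))     # [-1.  1.  1.  3.]
print(np.round(np.linalg.eigvalsh(r(C)), 10))  # [-2.  0.  0.  4.]
```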

Inverses and transposes also behave predictably with regard to their eigenvalues.

Theorem EIM Eigenvalues of the Inverse of a Matrix
Suppose $A$ is a square nonsingular matrix and $\lambda$ is an eigenvalue of $A$. Then $\lambda^{-1}$ is an eigenvalue of the matrix $A^{-1}$.

Proof. Notice that since $A$ is assumed nonsingular, $A^{-1}$ exists by Theorem NI, but more importantly, $\lambda^{-1} = 1/\lambda$ does not involve division by zero since Theorem SMZE prohibits this possibility.
Let $\mathbf{x} \neq \mathbf{0}$ be one eigenvector of $A$ for $\lambda$. Suppose $A$ has size $n$. Then
\begin{align*}
A^{-1}\mathbf{x} &= A^{-1}(1\mathbf{x}) &&\text{Property OC}\\
&= A^{-1}\left(\frac{1}{\lambda}\lambda\mathbf{x}\right) &&\text{Property MICN}\\
&= \frac{1}{\lambda}A^{-1}(\lambda\mathbf{x}) &&\text{Theorem MMSMM}\\
&= \frac{1}{\lambda}A^{-1}(A\mathbf{x}) &&\text{Definition EEM}\\
&= \frac{1}{\lambda}(A^{-1}A)\mathbf{x} &&\text{Theorem MMA}\\
&= \frac{1}{\lambda}I_n\mathbf{x} &&\text{Definition MI}\\
&= \frac{1}{\lambda}\mathbf{x} &&\text{Theorem MMIM}
\end{align*}
So $\mathbf{x} \neq \mathbf{0}$ is an eigenvector of $A^{-1}$ for the eigenvalue $\frac{1}{\lambda}$. $\blacksquare$

The proofs of the theorems above have a similar style to them. They all begin by grabbing an eigenvalue-eigenvector pair and adjusting it in some way to reach the desired conclusion. You should add this to your toolkit as a general approach to proving theorems about eigenvalues.
So far we have been able to reserve the characteristic polynomial for strictly computational purposes. However, sometimes a theorem about eigenvalues can be proved easily by employing the characteristic polynomial (rather than using an eigenvalue-eigenvector pair). The next theorem is an example of this.

Theorem ETM Eigenvalues of the Transpose of a Matrix
Suppose $A$ is a square matrix and $\lambda$ is an eigenvalue of $A$. Then $\lambda$ is an eigenvalue of the matrix $A^t$.

Proof. Suppose $A$ has size $n$. Then
\begin{align*}
p_A(x) &= \det(A - xI_n) &&\text{Definition CP}\\
&= \det\left((A - xI_n)^t\right) &&\text{Theorem DT}\\
&= \det\left(A^t - (xI_n)^t\right) &&\text{Theorem TMA}\\
&= \det\left(A^t - xI_n^t\right) &&\text{Theorem TMSM}\\
&= \det\left(A^t - xI_n\right) &&\text{Definition IM}\\
&= p_{A^t}(x) &&\text{Definition CP}
\end{align*}
So $A$ and $A^t$ have the same characteristic polynomial, and by Theorem EMRCP, their eigenvalues are identical and have equal algebraic multiplicities. Notice that what we have proved here is a bit stronger than the stated conclusion in the theorem. $\blacksquare$

If a matrix has only real entries, then the computation of the characteristic polynomial (Definition CP) will result in a polynomial with coefficients that are real numbers. Complex numbers could result as roots of this polynomial, but they are roots of quadratic factors with real coefficients, and as such, come in conjugate pairs. The next theorem proves this, and a bit more, without mentioning the characteristic polynomial.

Theorem ERMCP Eigenvalues of Real Matrices come in Conjugate Pairs
Suppose $A$ is a square matrix with real entries and $\mathbf{x}$ is an eigenvector of $A$ for the eigenvalue $\lambda$. Then $\overline{\mathbf{x}}$ is an eigenvector of $A$ for the eigenvalue $\overline{\lambda}$.

Proof.
\begin{align*}
A\overline{\mathbf{x}} &= \overline{A}\,\overline{\mathbf{x}} &&\text{$A$ has real entries}\\
&= \overline{A\mathbf{x}} &&\text{Theorem MMCC}\\
&= \overline{\lambda\mathbf{x}} &&\text{Definition EEM}\\
&= \overline{\lambda}\,\overline{\mathbf{x}} &&\text{Theorem CRSM}
\end{align*}
So $\overline{\mathbf{x}}$ is an eigenvector of $A$ for the eigenvalue $\overline{\lambda}$. $\blacksquare$

This phenomenon is amply illustrated in Example CEMS6, where the four complex eigenvalues come in two pairs, and the two basis vectors of the eigenspaces are complex conjugates of each other. Theorem ERMCP can be a time-saver for computing eigenvalues and eigenvectors of real matrices with complex eigenvalues, since the conjugate eigenvalue and eigenspace can be inferred from the theorem rather than computed.

Subsection ME
Multiplicities of Eigenvalues

A polynomial of degree $n$ will have exactly $n$ roots (counting multiplicities). From this fact about polynomial equations we can say more about the algebraic multiplicities of eigenvalues.

Theorem DCP Degree of the Characteristic Polynomial
Suppose that $A$ is a square matrix of size $n$. Then the characteristic polynomial of $A$, $p_A(x)$, has degree $n$.

Proof. We will prove a more general result by induction (Proof Technique I). Then the theorem will be true as a special case. We will carefully state this result as a proposition indexed by $m$, $m \geq 1$.

$P(m)$: Suppose that $A$ is an $m \times m$ matrix whose entries are complex numbers or linear polynomials in the variable $x$ of the form $c - x$, where $c$ is a complex number. Suppose further that there are exactly $k$ entries that contain $x$ and that no row or column contains more than one such entry. Then, when $k = m$, $\det(A)$ is a polynomial in $x$ of degree $m$, with leading coefficient $\pm 1$, and when $k < m$, $\det(A)$ is a polynomial in $x$ of degree $k$ or less.

For the base case, suppose $A$ is a $1 \times 1$ matrix, so $\det(A)$ equals its lone entry. When $k = m = 1$ the entry has the form $c - x$, a polynomial in $x$ of degree $m = 1$ with leading coefficient $-1$. When $k < m = 1$, then $k = 0$, the entry is a complex number, and $\det(A)$ is a polynomial of degree $0 = k$.

For the induction step, suppose $A$ is an $m \times m$ matrix of the type described, assume the proposition holds for matrices of size $m - 1$, and expand $\det(A)$ about the first row. Each term of this expansion is an entry of the first row times (plus or minus) the determinant of an $(m-1) \times (m-1)$ submatrix that again satisfies the hypotheses of the proposition. When $k = m$, every row and every column of $A$ contains exactly one entry with $x$. The term arising from the entry $c - x$ in the first row multiplies a submatrix determinant with $m - 1$ such entries, which by the induction hypothesis has degree $m - 1$ and leading coefficient $\pm 1$; this term therefore has degree $m$ and leading coefficient $\pm 1$. Every other term multiplies a submatrix determinant with only $m - 2$ such entries, and so has degree at most $m - 2$. Thus $\det(A)$ has degree $m$ with leading coefficient $\pm 1$. When $k < m$, each term is either a complex number times a submatrix determinant with at most $k$ entries containing $x$, or a polynomial $c - x$ times a submatrix determinant with at most $k - 1$ such entries; in either case the induction hypothesis bounds the degree of the term, and hence of $\det(A)$, by $k$.

The characteristic polynomial $p_A(x) = \det(A - xI_n)$ (Definition CP) satisfies the hypotheses of the proposition with $k = m = n$, so $p_A(x)$ is a polynomial of degree $n$. $\blacksquare$


Theorem NEM Number of Eigenvalues of a Matrix
Suppose that $A$ is a square matrix of size $n$ with distinct eigenvalues $\lambda_1,\,\lambda_2,\,\lambda_3,\,\ldots,\,\lambda_k$. Then
\[
\sum_{i=1}^{k}\alpha_A(\lambda_i) = n
\]

Proof. By the definition of the algebraic multiplicity (Definition AME), we can factor the characteristic polynomial as
\[
p_A(x) = c(x - \lambda_1)^{\alpha_A(\lambda_1)}(x - \lambda_2)^{\alpha_A(\lambda_2)}(x - \lambda_3)^{\alpha_A(\lambda_3)}\cdots(x - \lambda_k)^{\alpha_A(\lambda_k)}
\]
where $c$ is a nonzero constant. (We could prove that $c = (-1)^n$, but we do not need that specificity right now. See Exercise PEE.T30.) The left-hand side is a polynomial of degree $n$ by Theorem DCP and the right-hand side is a polynomial of degree $\sum_{i=1}^{k}\alpha_A(\lambda_i)$. So the equality of the polynomials' degrees gives the equality $\sum_{i=1}^{k}\alpha_A(\lambda_i) = n$. $\blacksquare$

Theorem ME Multiplicities of an Eigenvalue
Suppose that $A$ is a square matrix of size $n$ and $\lambda$ is an eigenvalue. Then
\[
1 \leq \gamma_A(\lambda) \leq \alpha_A(\lambda) \leq n
\]

Proof. Since $\lambda$ is an eigenvalue of $A$, there is an eigenvector of $A$ for $\lambda$, $\mathbf{x}$. Then $\mathbf{x} \in \mathcal{E}_A(\lambda)$, so $\gamma_A(\lambda) \geq 1$, since we can extend $\{\mathbf{x}\}$ into a basis of $\mathcal{E}_A(\lambda)$ (Theorem ELIS).
To show $\gamma_A(\lambda) \leq \alpha_A(\lambda)$ is the most involved portion of this proof. To this end, let $g = \gamma_A(\lambda)$ and let $\mathbf{x}_1,\,\mathbf{x}_2,\,\mathbf{x}_3,\,\ldots,\,\mathbf{x}_g$ be a basis for the eigenspace of $\lambda$, $\mathcal{E}_A(\lambda)$. Construct another $n - g$ vectors, $\mathbf{y}_1,\,\mathbf{y}_2,\,\mathbf{y}_3,\,\ldots,\,\mathbf{y}_{n-g}$, so that
\[
\{\mathbf{x}_1,\,\mathbf{x}_2,\,\mathbf{x}_3,\,\ldots,\,\mathbf{x}_g,\,\mathbf{y}_1,\,\mathbf{y}_2,\,\mathbf{y}_3,\,\ldots,\,\mathbf{y}_{n-g}\}
\]
is a basis of $\mathbb{C}^n$. This can be done by repeated applications of Theorem ELIS.
Finally, define a matrix $S$ by
\[
S = [\mathbf{x}_1|\mathbf{x}_2|\mathbf{x}_3|\ldots|\mathbf{x}_g|\mathbf{y}_1|\mathbf{y}_2|\mathbf{y}_3|\ldots|\mathbf{y}_{n-g}] = [\mathbf{x}_1|\mathbf{x}_2|\mathbf{x}_3|\ldots|\mathbf{x}_g|R]
\]
where $R$ is an $n \times (n - g)$ matrix whose columns are $\mathbf{y}_1,\,\mathbf{y}_2,\,\mathbf{y}_3,\,\ldots,\,\mathbf{y}_{n-g}$. The columns of $S$ are linearly independent by design, so $S$ is nonsingular (Theorem NMLIC) and therefore invertible (Theorem NI).
Then,
\begin{align*}
[\mathbf{e}_1|\mathbf{e}_2|\mathbf{e}_3|\ldots|\mathbf{e}_n] &= I_n\\
&= S^{-1}S\\
&= S^{-1}[\mathbf{x}_1|\mathbf{x}_2|\mathbf{x}_3|\ldots|\mathbf{x}_g|R]\\
&= [S^{-1}\mathbf{x}_1|S^{-1}\mathbf{x}_2|S^{-1}\mathbf{x}_3|\ldots|S^{-1}\mathbf{x}_g|S^{-1}R]
\end{align*}
So
\[
S^{-1}\mathbf{x}_i = \mathbf{e}_i \qquad 1 \leq i \leq g \qquad (\ast)
\]
Preparations in place, we compute the characteristic polynomial of $A$,
\begin{align*}
p_A(x) &= \det(A - xI_n) &&\text{Definition CP}\\
&= 1\det(A - xI_n) &&\text{Property OCN}\\
&= \det(I_n)\det(A - xI_n) &&\text{Definition DM}\\
&= \det\left(S^{-1}S\right)\det(A - xI_n) &&\text{Definition MI}\\
&= \det\left(S^{-1}\right)\det(S)\det(A - xI_n) &&\text{Theorem DRMM}\\
&= \det\left(S^{-1}\right)\det(A - xI_n)\det(S) &&\text{Property CMCN}\\
&= \det\left(S^{-1}(A - xI_n)S\right) &&\text{Theorem DRMM}\\
&= \det\left(S^{-1}AS - S^{-1}xI_nS\right) &&\text{Theorem MMDAA}\\
&= \det\left(S^{-1}AS - xS^{-1}I_nS\right) &&\text{Theorem MMSMM}\\
&= \det\left(S^{-1}AS - xS^{-1}S\right) &&\text{Theorem MMIM}\\
&= \det\left(S^{-1}AS - xI_n\right) &&\text{Definition MI}\\
&= p_{S^{-1}AS}(x) &&\text{Definition CP}
\end{align*}
What can we learn then about the matrix $S^{-1}AS$?
\begin{align*}
S^{-1}AS &= S^{-1}A[\mathbf{x}_1|\mathbf{x}_2|\mathbf{x}_3|\ldots|\mathbf{x}_g|R]\\
&= S^{-1}[A\mathbf{x}_1|A\mathbf{x}_2|A\mathbf{x}_3|\ldots|A\mathbf{x}_g|AR] &&\text{Definition MM}\\
&= S^{-1}[\lambda\mathbf{x}_1|\lambda\mathbf{x}_2|\lambda\mathbf{x}_3|\ldots|\lambda\mathbf{x}_g|AR] &&\text{Definition EEM}\\
&= [S^{-1}\lambda\mathbf{x}_1|S^{-1}\lambda\mathbf{x}_2|S^{-1}\lambda\mathbf{x}_3|\ldots|S^{-1}\lambda\mathbf{x}_g|S^{-1}AR] &&\text{Definition MM}\\
&= [\lambda S^{-1}\mathbf{x}_1|\lambda S^{-1}\mathbf{x}_2|\lambda S^{-1}\mathbf{x}_3|\ldots|\lambda S^{-1}\mathbf{x}_g|S^{-1}AR] &&\text{Theorem MMSMM}\\
&= [\lambda\mathbf{e}_1|\lambda\mathbf{e}_2|\lambda\mathbf{e}_3|\ldots|\lambda\mathbf{e}_g|S^{-1}AR] &&\text{$S^{-1}S = I_n$, $(\ast)$ above}
\end{align*}
Now imagine computing the characteristic polynomial of $A$ by computing the characteristic polynomial of $S^{-1}AS$ using the form just obtained. The first $g$ columns of $S^{-1}AS$ are all zero, save for a $\lambda$ on the diagonal. So if we compute the determinant by expanding about the first column, successively, we will get successive factors of $(\lambda - x)$. More precisely, let $T$ be the square matrix of size $n - g$ that is formed from the last $n - g$ rows and last $n - g$ columns of $S^{-1}AR$. Then
\[
p_A(x) = p_{S^{-1}AS}(x) = (\lambda - x)^g\,p_T(x).
\]
This says that $(x - \lambda)$ is a factor of the characteristic polynomial at least $g$ times, so the algebraic multiplicity of $\lambda$ as an eigenvalue of $A$ is greater than or equal to $g$ (Definition AME). In other words,
\[
\gamma_A(\lambda) = g \leq \alpha_A(\lambda)
\]
as desired.
Theorem NEM says that the sum of the algebraic multiplicities for all the eigenvalues of $A$ is equal to $n$. Since the algebraic multiplicity is a positive quantity, no single algebraic multiplicity can exceed $n$ without the sum of all of the algebraic multiplicities doing the same. $\blacksquare$

Theorem MNEM Maximum Number of Eigenvalues of a Matrix
Suppose that $A$ is a square matrix of size $n$. Then $A$ cannot have more than $n$ distinct eigenvalues.

Proof. Suppose that $A$ has $k$ distinct eigenvalues, $\lambda_1,\,\lambda_2,\,\lambda_3,\,\ldots,\,\lambda_k$. Then
\begin{align*}
k &= \sum_{i=1}^{k}1\\
&\leq \sum_{i=1}^{k}\alpha_A(\lambda_i) &&\text{Theorem ME}\\
&= n &&\text{Theorem NEM}
\end{align*}
$\blacksquare$

Subsection EHM
Eigenvalues of Hermitian Matrices

Recall that a matrix is Hermitian (or self-adjoint) if $A = A^\ast$ (Definition HM). In the case where $A$ is a matrix whose entries are all real numbers, being Hermitian is identical to being symmetric (Definition SYM). Keep this in mind as you read the next two theorems. Their hypotheses could be changed to "suppose $A$ is a real symmetric matrix."

Theorem HMRE Hermitian Matrices have Real Eigenvalues
Suppose that $A$ is a Hermitian matrix and $\lambda$ is an eigenvalue of $A$. Then $\lambda \in \mathbb{R}$.

Proof. Let $\mathbf{x} \neq \mathbf{0}$ be one eigenvector of $A$ for the eigenvalue $\lambda$. Then by Theorem PIP we know $\langle\mathbf{x},\,\mathbf{x}\rangle \neq 0$. So
\begin{align*}
\lambda &= \frac{1}{\langle\mathbf{x},\,\mathbf{x}\rangle}\lambda\langle\mathbf{x},\,\mathbf{x}\rangle &&\text{Property MICN}\\
&= \frac{1}{\langle\mathbf{x},\,\mathbf{x}\rangle}\langle\mathbf{x},\,\lambda\mathbf{x}\rangle &&\text{Theorem IPSM}\\
&= \frac{1}{\langle\mathbf{x},\,\mathbf{x}\rangle}\langle\mathbf{x},\,A\mathbf{x}\rangle &&\text{Definition EEM}\\
&= \frac{1}{\langle\mathbf{x},\,\mathbf{x}\rangle}\langle A\mathbf{x},\,\mathbf{x}\rangle &&\text{Theorem HMIP}\\
&= \frac{1}{\langle\mathbf{x},\,\mathbf{x}\rangle}\langle\lambda\mathbf{x},\,\mathbf{x}\rangle &&\text{Definition EEM}\\
&= \frac{1}{\langle\mathbf{x},\,\mathbf{x}\rangle}\overline{\lambda}\langle\mathbf{x},\,\mathbf{x}\rangle &&\text{Theorem IPSM}\\
&= \overline{\lambda} &&\text{Property MICN}
\end{align*}
If a complex number is equal to its conjugate, then it has an imaginary part equal to zero, and therefore is a real number. $\blacksquare$

Notice the appealing symmetry to the justifications given for the steps of this proof. In the center is the ability to pitch a Hermitian matrix from one side of the inner product to the other.
Look back and compare Example ESMS4 and Example CEMS6. In Example CEMS6 the matrix has only real entries, yet the characteristic polynomial has roots that are complex numbers, and so the matrix has complex eigenvalues. However, in Example ESMS4, the matrix has only real entries, but is also symmetric, and hence Hermitian. So by Theorem HMRE, we were guaranteed eigenvalues that are real numbers.
In many physical problems, a matrix of interest will be real and symmetric, or Hermitian. Then if the eigenvalues are to represent physical quantities of interest, Theorem HMRE guarantees that these values will not be complex numbers.
The eigenvectors of a Hermitian matrix also enjoy a pleasing property that we will exploit later.

Theorem HMOE Hermitian Matrices have Orthogonal Eigenvectors
Suppose that $A$ is a Hermitian matrix and $\mathbf{x}$ and $\mathbf{y}$ are two eigenvectors of $A$ for different eigenvalues. Then $\mathbf{x}$ and $\mathbf{y}$ are orthogonal vectors.

Proof. Let $\mathbf{x}$ be an eigenvector of $A$ for $\lambda$ and let $\mathbf{y}$ be an eigenvector of $A$ for a different eigenvalue $\rho$. So we have $\lambda - \rho \neq 0$. Then
\begin{align*}
\langle\mathbf{x},\,\mathbf{y}\rangle &= \frac{1}{\lambda - \rho}(\lambda - \rho)\langle\mathbf{x},\,\mathbf{y}\rangle &&\text{Property MICN}\\
&= \frac{1}{\lambda - \rho}\left(\lambda\langle\mathbf{x},\,\mathbf{y}\rangle - \rho\langle\mathbf{x},\,\mathbf{y}\rangle\right) &&\text{Property DCN}\\
&= \frac{1}{\lambda - \rho}\left(\langle\overline{\lambda}\mathbf{x},\,\mathbf{y}\rangle - \langle\mathbf{x},\,\rho\mathbf{y}\rangle\right) &&\text{Theorem IPSM}\\
&= \frac{1}{\lambda - \rho}\left(\langle\lambda\mathbf{x},\,\mathbf{y}\rangle - \langle\mathbf{x},\,\rho\mathbf{y}\rangle\right) &&\text{Theorem HMRE}\\
&= \frac{1}{\lambda - \rho}\left(\langle A\mathbf{x},\,\mathbf{y}\rangle - \langle\mathbf{x},\,A\mathbf{y}\rangle\right) &&\text{Definition EEM}\\
&= \frac{1}{\lambda - \rho}\left(\langle A\mathbf{x},\,\mathbf{y}\rangle - \langle A\mathbf{x},\,\mathbf{y}\rangle\right) &&\text{Theorem HMIP}\\
&= \frac{1}{\lambda - \rho}(0) &&\text{Property AICN}\\
&= 0
\end{align*}
This equality says that $\mathbf{x}$ and $\mathbf{y}$ are orthogonal vectors (Definition OV). $\blacksquare$

Notice again how the key step in this proof is the fundamental property of a Hermitian matrix (Theorem HMIP): the ability to swap $A$ across the two arguments of the inner product. We will build on these results and continue to see some more interesting properties in Section OD.
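Theorem HMOE can be checked the same way. The sketch below (Python with NumPy, assumed for illustration) uses `eigh`, which is tailored to Hermitian matrices, and the conjugate-linear inner product `vdot`.

```python
import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])       # Hermitian, eigenvalues 1 and 4

evals, evecs = np.linalg.eigh(A)
x, y = evecs[:, 0], evecs[:, 1]     # eigenvectors for the two eigenvalues

# vdot conjugates its first argument, matching <x, y>.
print(np.vdot(x, y))                # approximately 0: x and y are orthogonal
```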

Reading Questions

1. How can you identify a nonsingular matrix just by looking at its eigenvalues?
2. How many different eigenvalues may a square matrix of size $n$ have?
3. What is amazing about the eigenvalues of a Hermitian matrix and why is it amazing?

Exercises

T10† Suppose that $A$ is a square matrix. Prove that the constant term of the characteristic polynomial of $A$ is equal to the determinant of $A$.

T20† Suppose that $A$ is a square matrix. Prove that a single vector may not be an eigenvector of $A$ for two different eigenvalues.

T22 Suppose that $U$ is a unitary matrix with eigenvalue $\lambda$. Prove that $\lambda$ has modulus 1, i.e. $|\lambda| = 1$. This says that all of the eigenvalues of a unitary matrix lie on the unit circle of the complex plane.

T30 Theorem DCP tells us that the characteristic polynomial of a square matrix of size $n$ has degree $n$. By suitably augmenting the proof of Theorem DCP prove that the coefficient of $x^n$ in the characteristic polynomial is $(-1)^n$.

T50† Theorem EIM says that if $\lambda$ is an eigenvalue of the nonsingular matrix $A$, then $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$. Write an alternate proof of this theorem using the characteristic polynomial and without making reference to an eigenvector of $A$ for $\lambda$.


Section SD
Similarity and Diagonalization

This section's topic will perhaps seem out of place at first, but we will make the connection soon with eigenvalues and eigenvectors. This is also our first look at one of the central ideas of Chapter R.

Subsection SM
Similar Matrices

The notion of matrices being "similar" is a lot like saying two matrices are row-equivalent. Two similar matrices are not equal, but they share many important properties. This section, and later sections in Chapter R, will be devoted in part to discovering just what these common properties are.
First, the main definition for this section.

Definition SIM Similar Matrices
Suppose $A$ and $B$ are two square matrices of size $n$. Then $A$ and $B$ are similar if there exists a nonsingular matrix of size $n$, $S$, such that $A = S^{-1}BS$. □

We will say "$A$ is similar to $B$ via $S$" when we want to emphasize the role of $S$ in the relationship between $A$ and $B$. Also, it does not matter if we say $A$ is similar to $B$, or $B$ is similar to $A$. If one statement is true then so is the other, as can be seen by using $S^{-1}$ in place of $S$ (see Theorem SER for the careful proof). Finally, we will refer to $S^{-1}BS$ as a similarity transformation when we want to emphasize the way $S$ changes $B$. OK, enough about language, let us build a few examples.

Example SMS5 Similar matrices of size 5
If you wondered if there are examples of similar matrices, then it will not be hard to convince you they exist. Define
\[
B = \begin{bmatrix} -4 & 1 & -3 & -2 & 2\\ 1 & 2 & -1 & 3 & -2\\ -4 & 1 & 3 & 2 & 2\\ -3 & 4 & -2 & -1 & -3\\ 3 & 1 & -1 & 1 & -4 \end{bmatrix}
\qquad
S = \begin{bmatrix} 1 & 2 & -1 & 1 & 1\\ 0 & 1 & -1 & -2 & -1\\ 1 & 3 & -1 & 1 & 1\\ -2 & -3 & 3 & 1 & -2\\ 1 & 3 & -1 & 2 & 1 \end{bmatrix}
\]
Check that $S$ is nonsingular and then compute
\begin{align*}
A &= S^{-1}BS\\
&= \begin{bmatrix} 10 & 1 & 0 & 2 & -5\\ -1 & 0 & 1 & 0 & 0\\ 3 & 0 & 2 & 1 & -3\\ 0 & 0 & -1 & 0 & 1\\ -4 & -1 & 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} -4 & 1 & -3 & -2 & 2\\ 1 & 2 & -1 & 3 & -2\\ -4 & 1 & 3 & 2 & 2\\ -3 & 4 & -2 & -1 & -3\\ 3 & 1 & -1 & 1 & -4 \end{bmatrix}
\begin{bmatrix} 1 & 2 & -1 & 1 & 1\\ 0 & 1 & -1 & -2 & -1\\ 1 & 3 & -1 & 1 & 1\\ -2 & -3 & 3 & 1 & -2\\ 1 & 3 & -1 & 2 & 1 \end{bmatrix}\\
&= \begin{bmatrix} -10 & -27 & -29 & -80 & -25\\ -2 & 6 & 6 & 10 & -2\\ -3 & 11 & -9 & -14 & -9\\ -1 & -13 & 0 & -10 & -1\\ 11 & 35 & 6 & 49 & 19 \end{bmatrix}
\end{align*}
So by this construction, we know that $A$ and $B$ are similar.
△

Let us do that again.

Example SMS3 Similar matrices of size 3
Define
\[
B = \begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix}
\qquad
S = \begin{bmatrix} 1 & 1 & 2\\ -2 & -1 & -3\\ 1 & -2 & 0 \end{bmatrix}
\]
Check that $S$ is nonsingular and then compute
\begin{align*}
A &= S^{-1}BS\\
&= \begin{bmatrix} -6 & -4 & -1\\ -3 & -2 & -1\\ 5 & 3 & 1 \end{bmatrix}
\begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 2\\ -2 & -1 & -3\\ 1 & -2 & 0 \end{bmatrix}\\
&= \begin{bmatrix} -1 & 0 & 0\\ 0 & 3 & 0\\ 0 & 0 & -1 \end{bmatrix}
\end{align*}
So by this construction, we know that $A$ and $B$ are similar. But before we move on, look at how pleasing the form of $A$ is. Not convinced? Then consider that several computations related to $A$ are especially easy. For example, in the spirit of Example DUTM, $\det(A) = (-1)(3)(-1) = 3$. Similarly, the characteristic polynomial is straightforward to compute by hand, $p_A(x) = (-1 - x)(3 - x)(-1 - x) = -(x - 3)(x + 1)^2$ and since the result is already factored, the eigenvalues are transparently $\lambda = 3,\,-1$. Finally, the eigenvectors of $A$ are just the standard unit vectors (Definition SUV).
△
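If you would rather let a machine do the arithmetic of Example SMS3, here is a sketch in Python with NumPy (our assumption; any matrix-capable system would do):

```python
import numpy as np

B = np.array([[-13, -8, -4],
              [ 12,  7,  4],
              [ 24, 16,  7]], dtype=float)
S = np.array([[ 1,  1,  2],
              [-2, -1, -3],
              [ 1, -2,  0]], dtype=float)

A = np.linalg.inv(S) @ B @ S        # the similarity transformation
print(np.round(A))                  # diagonal entries -1, 3, -1
```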

Subsection PSM
Properties of Similar Matrices

Similar matrices share many properties and it is these theorems that justify the choice of the word "similar." First we will show that similarity is an equivalence relation. Equivalence relations are important in the study of various algebras and can always be regarded as a kind of weak version of equality. Sort of alike, but not quite equal. The notion of two matrices being row-equivalent is an example of an equivalence relation we have been working with since the beginning of the course (see Exercise RREF.T11). Row-equivalent matrices are not equal, but they are a lot alike. For example, row-equivalent matrices have the same rank. Formally, an equivalence relation requires three conditions hold: reflexive, symmetric and transitive. We will illustrate these as we prove that similarity is an equivalence relation.

Theorem SER Similarity is an Equivalence Relation
Suppose $A$, $B$ and $C$ are square matrices of size $n$. Then

1. $A$ is similar to $A$. (Reflexive)
2. If $A$ is similar to $B$, then $B$ is similar to $A$. (Symmetric)
3. If $A$ is similar to $B$ and $B$ is similar to $C$, then $A$ is similar to $C$. (Transitive)

Proof. To see that $A$ is similar to $A$, we need only demonstrate a nonsingular matrix that effects a similarity transformation of $A$ to $A$. $I_n$ is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI), and
\[
I_n^{-1}AI_n = I_nAI_n = A
\]
If we assume that $A$ is similar to $B$, then we know there is a nonsingular matrix $S$ so that $A = S^{-1}BS$ by Definition SIM. By Theorem MIMI, $S^{-1}$ is invertible, and by Theorem NI is therefore nonsingular. So
\begin{align*}
(S^{-1})^{-1}A(S^{-1}) &= SAS^{-1} &&\text{Theorem MIMI}\\
&= SS^{-1}BSS^{-1} &&\text{Definition SIM}\\
&= \left(SS^{-1}\right)B\left(SS^{-1}\right) &&\text{Theorem MMA}\\
&= I_nBI_n &&\text{Definition MI}\\
&= B &&\text{Theorem MMIM}
\end{align*}
and we see that $B$ is similar to $A$.
Assume that $A$ is similar to $B$, and $B$ is similar to $C$. This gives us the existence of two nonsingular matrices, $S$ and $R$, such that $A = S^{-1}BS$ and $B = R^{-1}CR$, by Definition SIM. (Notice how we have to assume $S \neq R$, as will usually be the case.) Since $S$ and $R$ are invertible, so too $RS$ is invertible by Theorem SS and then nonsingular by Theorem NI. Now
\begin{align*}
(RS)^{-1}C(RS) &= S^{-1}R^{-1}CRS &&\text{Theorem SS}\\
&= S^{-1}\left(R^{-1}CR\right)S &&\text{Theorem MMA}\\
&= S^{-1}BS &&\text{Definition SIM}\\
&= A
\end{align*}
so $A$ is similar to $C$ via the nonsingular matrix $RS$. $\blacksquare$

Here is another theorem that tells us exactly what sorts of properties similar matrices share.

Theorem SMEE Similar Matrices have Equal Eigenvalues
Suppose $A$ and $B$ are similar matrices. Then the characteristic polynomials of $A$ and $B$ are equal, that is, $p_A(x) = p_B(x)$.

Proof. Let $n$ denote the size of $A$ and $B$. Since $A$ and $B$ are similar, there exists a nonsingular matrix $S$, such that $A = S^{-1}BS$ (Definition SIM). Then
\begin{align*}
p_A(x) &= \det(A - xI_n) &&\text{Definition CP}\\
&= \det\left(S^{-1}BS - xI_n\right) &&\text{Definition SIM}\\
&= \det\left(S^{-1}BS - xS^{-1}I_nS\right) &&\text{Theorem MMIM}\\
&= \det\left(S^{-1}BS - S^{-1}xI_nS\right) &&\text{Theorem MMSMM}\\
&= \det\left(S^{-1}(B - xI_n)S\right) &&\text{Theorem MMDAA}\\
&= \det\left(S^{-1}\right)\det(B - xI_n)\det(S) &&\text{Theorem DRMM}\\
&= \det\left(S^{-1}\right)\det(S)\det(B - xI_n) &&\text{Property CMCN}\\
&= \det\left(S^{-1}S\right)\det(B - xI_n) &&\text{Theorem DRMM}\\
&= \det(I_n)\det(B - xI_n) &&\text{Definition MI}\\
&= 1\det(B - xI_n) &&\text{Definition DM}\\
&= p_B(x) &&\text{Definition CP}
\end{align*}
$\blacksquare$

So similar matrices not only have the same set of eigenvalues, the algebraic multiplicities of these eigenvalues will also be the same. However, be careful with this theorem. It is tempting to think the converse is true, and argue that if two matrices have the same eigenvalues, then they are similar. Not so, as the following example illustrates.

Example EENS Equal eigenvalues, not similar
Define
\[
A = \begin{bmatrix} 1 & 1\\ 0 & 1 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}
\]
and check that
\[
p_A(x) = p_B(x) = 1 - 2x + x^2 = (x - 1)^2
\]
and so $A$ and $B$ have equal characteristic polynomials. If the converse of Theorem SMEE were true, then $A$ and $B$ would be similar. Suppose this is the case. More precisely, suppose there is a nonsingular matrix $S$ so that $A = S^{-1}BS$.
Then
\[
A = S^{-1}BS = S^{-1}I_2S = S^{-1}S = I_2
\]
Clearly $A \neq I_2$ and this contradiction tells us that the converse of Theorem SMEE is false.
△

Subsection D
Diagonalization

Good things happen when a matrix is similar to a diagonal matrix. For example, the eigenvalues of the matrix are the entries on the diagonal of the diagonal matrix. And it can be a much simpler matter to compute high powers of the matrix. Diagonalizable matrices are also of interest in more abstract settings. Here are the relevant definitions, then our main theorem for this section.

Definition DIM Diagonal Matrix
Suppose that $A$ is a square matrix. Then $A$ is a diagonal matrix if $[A]_{ij} = 0$ whenever $i \neq j$. □

Definition DZM Diagonalizable Matrix
Suppose $A$ is a square matrix. Then $A$ is diagonalizable if $A$ is similar to a diagonal matrix. □

Example DAB Diagonalization of Archetype B
Archetype B has a $3 \times 3$ coefficient matrix
\[
B = \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix}
\]
and is similar to a diagonal matrix, as can be seen by the following computation with the nonsingular matrix $S$,
\begin{align*}
S^{-1}BS &= \begin{bmatrix} -5 & -3 & -2\\ 3 & 2 & 1\\ 1 & 1 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix}
\begin{bmatrix} -5 & -3 & -2\\ 3 & 2 & 1\\ 1 & 1 & 1 \end{bmatrix}\\
&= \begin{bmatrix} -1 & -1 & -1\\ 2 & 3 & 1\\ -1 & -2 & 1 \end{bmatrix}
\begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix}
\begin{bmatrix} -5 & -3 & -2\\ 3 & 2 & 1\\ 1 & 1 & 1 \end{bmatrix}\\
&= \begin{bmatrix} -1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2 \end{bmatrix}
\end{align*}
△

Example SMS3 provides yet another example of a matrix that is subjected to a similarity transformation and the result is a diagonal matrix. Alright, just how would we find the magic matrix $S$ that can be used in a similarity transformation to produce a diagonal matrix? Before you read the statement of the next theorem, you might study the eigenvalues and eigenvectors of Archetype B and compute the eigenvalues and eigenvectors of the matrix in Example SMS3.

Theorem DC Diagonalization Characterization
Suppose $A$ is a square matrix of size $n$. Then $A$ is diagonalizable if and only if there exists a linearly independent set $S$ that contains $n$ eigenvectors of $A$.

Proof. ($\Leftarrow$) Let $S = \{\mathbf{x}_1,\,\mathbf{x}_2,\,\mathbf{x}_3,\,\ldots,\,\mathbf{x}_n\}$ be a linearly independent set of eigenvectors of $A$ for the eigenvalues $\lambda_1,\,\lambda_2,\,\lambda_3,\,\ldots,\,\lambda_n$. Recall Definition SUV and define
\begin{align*}
R &= [\mathbf{x}_1|\mathbf{x}_2|\mathbf{x}_3|\ldots|\mathbf{x}_n]\\
D &= \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0\\ 0 & \lambda_2 & 0 & \cdots & 0\\ 0 & 0 & \lambda_3 & \cdots & 0\\ \vdots & \vdots & \vdots & & \vdots\\ 0 & 0 & 0 & \cdots & \lambda_n \end{bmatrix} = [\lambda_1\mathbf{e}_1|\lambda_2\mathbf{e}_2|\lambda_3\mathbf{e}_3|\ldots|\lambda_n\mathbf{e}_n]
\end{align*}
The columns of $R$ are the vectors of the linearly independent set $S$ and so by Theorem NMLIC the matrix $R$ is nonsingular. By Theorem NI we know $R^{-1}$ exists.
\begin{align*}
R^{-1}AR &= R^{-1}A[\mathbf{x}_1|\mathbf{x}_2|\mathbf{x}_3|\ldots|\mathbf{x}_n]\\
&= R^{-1}[A\mathbf{x}_1|A\mathbf{x}_2|A\mathbf{x}_3|\ldots|A\mathbf{x}_n] &&\text{Definition MM}\\
&= R^{-1}[\lambda_1\mathbf{x}_1|\lambda_2\mathbf{x}_2|\lambda_3\mathbf{x}_3|\ldots|\lambda_n\mathbf{x}_n] &&\text{Definition EEM}\\
&= R^{-1}[\lambda_1R\mathbf{e}_1|\lambda_2R\mathbf{e}_2|\lambda_3R\mathbf{e}_3|\ldots|\lambda_nR\mathbf{e}_n] &&\text{Definition MVP}\\
&= R^{-1}[R(\lambda_1\mathbf{e}_1)|R(\lambda_2\mathbf{e}_2)|R(\lambda_3\mathbf{e}_3)|\ldots|R(\lambda_n\mathbf{e}_n)] &&\text{Theorem MMSMM}\\
&= R^{-1}R[\lambda_1\mathbf{e}_1|\lambda_2\mathbf{e}_2|\lambda_3\mathbf{e}_3|\ldots|\lambda_n\mathbf{e}_n] &&\text{Definition MM}\\
&= I_nD &&\text{Definition MI}\\
&= D &&\text{Theorem MMIM}
\end{align*}
This says that $A$ is similar to the diagonal matrix $D$ via the nonsingular matrix $R$. Thus $A$ is diagonalizable (Definition DZM).
($\Rightarrow$) Suppose that $A$ is diagonalizable, so there is a nonsingular matrix of size $n$
\[
T = [\mathbf{y}_1|\mathbf{y}_2|\mathbf{y}_3|\ldots|\mathbf{y}_n]
\]
and a diagonal matrix (recall Definition SUV)
\[
E = \begin{bmatrix} d_1 & 0 & 0 & \cdots & 0\\ 0 & d_2 & 0 & \cdots & 0\\ 0 & 0 & d_3 & \cdots & 0\\ \vdots & \vdots & \vdots & & \vdots\\ 0 & 0 & 0 & \cdots & d_n \end{bmatrix} = [d_1\mathbf{e}_1|d_2\mathbf{e}_2|d_3\mathbf{e}_3|\ldots|d_n\mathbf{e}_n]
\]
such that $T^{-1}AT = E$.
Then consider,
\begin{align*}
[A\mathbf{y}_1|A\mathbf{y}_2|A\mathbf{y}_3|\ldots|A\mathbf{y}_n] &= A[\mathbf{y}_1|\mathbf{y}_2|\mathbf{y}_3|\ldots|\mathbf{y}_n] &&\text{Definition MM}\\
&= AT\\
&= I_nAT &&\text{Theorem MMIM}\\
&= TT^{-1}AT &&\text{Definition MI}\\
&= TE\\
&= T[d_1\mathbf{e}_1|d_2\mathbf{e}_2|d_3\mathbf{e}_3|\ldots|d_n\mathbf{e}_n]\\
&= [T(d_1\mathbf{e}_1)|T(d_2\mathbf{e}_2)|T(d_3\mathbf{e}_3)|\ldots|T(d_n\mathbf{e}_n)] &&\text{Definition MM}\\
&= [d_1T\mathbf{e}_1|d_2T\mathbf{e}_2|d_3T\mathbf{e}_3|\ldots|d_nT\mathbf{e}_n] &&\text{Theorem MMSMM}\\
&= [d_1\mathbf{y}_1|d_2\mathbf{y}_2|d_3\mathbf{y}_3|\ldots|d_n\mathbf{y}_n] &&\text{Definition MVP}
\end{align*}
This equality of matrices (Definition ME) allows us to conclude that the individual columns are equal vectors (Definition CVE). That is, $A\mathbf{y}_i = d_i\mathbf{y}_i$ for $1 \leq i \leq n$. In other words, $\mathbf{y}_i$ is an eigenvector of $A$ for the eigenvalue $d_i$, $1 \leq i \leq n$. (Why does $\mathbf{y}_i \neq \mathbf{0}$?) Because $T$ is nonsingular, the set containing $T$'s columns, $S = \{\mathbf{y}_1,\,\mathbf{y}_2,\,\mathbf{y}_3,\,\ldots,\,\mathbf{y}_n\}$, is a linearly independent set (Theorem NMLIC). So the set $S$ has all the required properties. $\blacksquare$

Notice that the proof of Theorem DC is constructive. To diagonalize a matrix, we need only locate $n$ linearly independent eigenvectors. Then we can construct a nonsingular matrix using the eigenvectors as columns ($R$) so that $R^{-1}AR$ is a diagonal matrix ($D$). The entries on the diagonal of $D$ will be the eigenvalues of the eigenvectors used to create $R$, in the same order as the eigenvectors appear in $R$. We illustrate this by diagonalizing some matrices.

Example DMS3 Diagonalizing a matrix of size 3
Consider the matrix
\[
F = \begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix}
\]
of Example CPMS3, Example EMS3 and Example ESMS3. $F$'s eigenvalues and eigenspaces are
\[
\lambda = 3 \qquad \mathcal{E}_F(3) = \left\langle\left\{\begin{bmatrix} -\frac{1}{2}\\ \frac{1}{2}\\ 1 \end{bmatrix}\right\}\right\rangle
\qquad\qquad
\lambda = -1 \qquad \mathcal{E}_F(-1) = \left\langle\left\{\begin{bmatrix} -\frac{2}{3}\\ 1\\ 0 \end{bmatrix},\;\begin{bmatrix} -\frac{1}{3}\\ 0\\ 1 \end{bmatrix}\right\}\right\rangle
\]
Define the matrix $S$ to be the $3 \times 3$ matrix whose columns are the three basis vectors in the eigenspaces for $F$,
\[
S = \begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3}\\ \frac{1}{2} & 1 & 0\\ 1 & 0 & 1 \end{bmatrix}
\]
Check that $S$ is nonsingular (row-reduces to the identity matrix, Theorem NMRRI, or has a nonzero determinant, Theorem SMZD). Then the three columns of $S$ are a linearly independent set (Theorem NMLIC). By Theorem DC we now know that $F$ is diagonalizable. Furthermore, the construction in the proof of Theorem DC tells us that if we apply the matrix $S$ to $F$ in a similarity transformation, the result will be a diagonal matrix with the eigenvalues of $F$ on the diagonal. The eigenvalues appear on the diagonal of the matrix in the same order as the eigenvectors appear in $S$. So,
\begin{align*}
S^{-1}FS &= \begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3}\\ \frac{1}{2} & 1 & 0\\ 1 & 0 & 1 \end{bmatrix}^{-1}
\begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix}
\begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3}\\ \frac{1}{2} & 1 & 0\\ 1 & 0 & 1 \end{bmatrix}\\
&= \begin{bmatrix} 6 & 4 & 2\\ -3 & -1 & -1\\ -6 & -4 & -1 \end{bmatrix}
\begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix}
\begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3}\\ \frac{1}{2} & 1 & 0\\ 1 & 0 & 1 \end{bmatrix}\\
&= \begin{bmatrix} 3 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1 \end{bmatrix}
\end{align*}
Note that the above computations can be viewed two ways. The proof of Theorem DC tells us that the four matrices ($F$, $S$, $S^{-1}$ and the diagonal matrix) will interact the way we have written the equation. Or as an example, we can actually perform the computations to verify what the theorem predicts.
△
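The construction in the proof of Theorem DC is short enough to carry out by machine. This sketch (Python with NumPy, our assumption) repeats Example DMS3 by stacking the basis eigenvectors into $S$ and conjugating.

```python
import numpy as np

F = np.array([[-13, -8, -4],
              [ 12,  7,  4],
              [ 24, 16,  7]], dtype=float)

# Columns: the eigenspace bases from Example DMS3.
S = np.array([[-1/2, -2/3, -1/3],
              [ 1/2,    1,    0],
              [   1,    0,    1]])

D = np.linalg.inv(S) @ F @ S
print(np.round(D, 10))              # diagonal entries 3, -1, -1
```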

The dimension of an eigenspace can be no larger than the algebraic multiplicity of the eigenvalue by Theorem ME. When every eigenvalue's eigenspace is this large, then we can diagonalize the matrix, and only then. Three examples we have seen so far in this section, Example SMS5, Example DAB and Example DMS3, illustrate the diagonalization of a matrix, with varying degrees of detail about just how the diagonalization is achieved. However, in each case, you can verify that the geometric and algebraic multiplicities are equal for every eigenvalue. This is the substance of the next theorem.

Theorem DMFE Diagonalizable Matrices have Full Eigenspaces
Suppose $A$ is a square matrix. Then $A$ is diagonalizable if and only if $\gamma_A(\lambda) = \alpha_A(\lambda)$ for every eigenvalue $\lambda$ of $A$.

Proof. Suppose $A$ has size $n$ and $k$ distinct eigenvalues, $\lambda_1,\,\lambda_2,\,\lambda_3,\,\ldots,\,\lambda_k$. Let $S_i = \left\{\mathbf{x}_{i1},\,\mathbf{x}_{i2},\,\mathbf{x}_{i3},\,\ldots,\,\mathbf{x}_{i\gamma_A(\lambda_i)}\right\}$ denote a basis for the eigenspace of $\lambda_i$, $\mathcal{E}_A(\lambda_i)$, for $1 \leq i \leq k$. Then
\[
S = S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_k
\]
is a set of eigenvectors for $A$. A vector cannot be an eigenvector for two different eigenvalues (see Exercise EE.T20) so $S_i \cap S_j = \emptyset$ whenever $i \neq j$. In other words, $S$ is a disjoint union of $S_i$, $1 \leq i \leq k$.
($\Leftarrow$) The size of $S$ is
\begin{align*}
|S| &= \sum_{i=1}^{k}\gamma_A(\lambda_i) &&\text{$S$ disjoint union of $S_i$}\\
&= \sum_{i=1}^{k}\alpha_A(\lambda_i) &&\text{Hypothesis}\\
&= n &&\text{Theorem NEM}
\end{align*}
We next show that $S$ is a linearly independent set. So we will begin with a relation of linear dependence on $S$, using doubly-subscripted scalars and eigenvectors,
\begin{align*}
\mathbf{0} &= \left(a_{11}\mathbf{x}_{11} + a_{12}\mathbf{x}_{12} + \cdots + a_{1\gamma_A(\lambda_1)}\mathbf{x}_{1\gamma_A(\lambda_1)}\right)\\
&\quad+ \left(a_{21}\mathbf{x}_{21} + a_{22}\mathbf{x}_{22} + \cdots + a_{2\gamma_A(\lambda_2)}\mathbf{x}_{2\gamma_A(\lambda_2)}\right)\\
&\quad+ \left(a_{31}\mathbf{x}_{31} + a_{32}\mathbf{x}_{32} + \cdots + a_{3\gamma_A(\lambda_3)}\mathbf{x}_{3\gamma_A(\lambda_3)}\right)\\
&\quad\;\;\vdots\\
&\quad+ \left(a_{k1}\mathbf{x}_{k1} + a_{k2}\mathbf{x}_{k2} + \cdots + a_{k\gamma_A(\lambda_k)}\mathbf{x}_{k\gamma_A(\lambda_k)}\right)
\end{align*}
Define the vectors $\mathbf{y}_i$, $1 \leq i \leq k$ by
\begin{align*}
\mathbf{y}_1 &= a_{11}\mathbf{x}_{11} + a_{12}\mathbf{x}_{12} + a_{13}\mathbf{x}_{13} + \cdots + a_{1\gamma_A(\lambda_1)}\mathbf{x}_{1\gamma_A(\lambda_1)}\\
\mathbf{y}_2 &= a_{21}\mathbf{x}_{21} + a_{22}\mathbf{x}_{22} + a_{23}\mathbf{x}_{23} + \cdots + a_{2\gamma_A(\lambda_2)}\mathbf{x}_{2\gamma_A(\lambda_2)}\\
\mathbf{y}_3 &= a_{31}\mathbf{x}_{31} + a_{32}\mathbf{x}_{32} + a_{33}\mathbf{x}_{33} + \cdots + a_{3\gamma_A(\lambda_3)}\mathbf{x}_{3\gamma_A(\lambda_3)}\\
&\;\;\vdots\\
\mathbf{y}_k &= a_{k1}\mathbf{x}_{k1} + a_{k2}\mathbf{x}_{k2} + a_{k3}\mathbf{x}_{k3} + \cdots + a_{k\gamma_A(\lambda_k)}\mathbf{x}_{k\gamma_A(\lambda_k)}
\end{align*}
Then the relation of linear dependence becomes
\[
\mathbf{0} = \mathbf{y}_1 + \mathbf{y}_2 + \mathbf{y}_3 + \cdots + \mathbf{y}_k
\]
Since the eigenspace $\mathcal{E}_A(\lambda_i)$ is closed under vector addition and scalar multiplication, $\mathbf{y}_i \in \mathcal{E}_A(\lambda_i)$, $1 \leq i \leq k$. Thus, for each $i$, the vector $\mathbf{y}_i$ is an eigenvector of $A$ for $\lambda_i$, or is the zero vector. Recall that sets of eigenvectors whose eigenvalues are distinct form a linearly independent set by Theorem EDELI. Should any (or some) $\mathbf{y}_i$ be nonzero, the previous equation would provide a nontrivial relation of linear dependence on a set of eigenvectors with distinct eigenvalues, contradicting Theorem EDELI. Thus $\mathbf{y}_i = \mathbf{0}$, $1 \leq i \leq k$.
Each of the $k$ equations, $\mathbf{y}_i = \mathbf{0}$, is a relation of linear dependence on the corresponding set $S_i$, a set of basis vectors for the eigenspace $\mathcal{E}_A(\lambda_i)$, which is therefore linearly independent. From these relations of linear dependence on linearly independent sets we conclude that the scalars are all zero, more precisely, $a_{ij} = 0$, $1 \leq j \leq \gamma_A(\lambda_i)$ for $1 \leq i \leq k$. This establishes that our original relation of linear dependence on $S$ has only the trivial relation of linear dependence, and hence $S$ is a linearly independent set.
We have determined that $S$ is a set of $n$ linearly independent eigenvectors for $A$, and so by Theorem DC is diagonalizable.
($\Rightarrow$) Now we assume that $A$ is diagonalizable. Aiming for a contradiction (Proof Technique CD), suppose that there is at least one eigenvalue, say $\lambda_t$, such that $\gamma_A(\lambda_t) \neq \alpha_A(\lambda_t)$. By Theorem ME we must have $\gamma_A(\lambda_t) < \alpha_A(\lambda_t)$. Since $A$ is diagonalizable, Theorem DC provides a set of $n$ linearly independent eigenvectors of $A$. Grouping these eigenvectors by eigenvalue, those belonging to $\lambda_i$ form a linearly independent subset of $\mathcal{E}_A(\lambda_i)$, and so number at most $\gamma_A(\lambda_i)$. Then
\[
n \leq \sum_{i=1}^{k}\gamma_A(\lambda_i) < \sum_{i=1}^{k}\alpha_A(\lambda_i) = n
\]
where the strict inequality uses $\gamma_A(\lambda_t) < \alpha_A(\lambda_t)$ together with Theorem ME for the remaining eigenvalues, and the final equality is Theorem NEM. This contradiction establishes that $\gamma_A(\lambda) = \alpha_A(\lambda)$ for every eigenvalue $\lambda$ of $A$. $\blacksquare$

Example NDMS4 A non-diagonalizable matrix of size 4
In Example EMMS4 the matrix
\[
B = \begin{bmatrix} -2 & 1 & -2 & -4\\ 12 & 1 & 4 & 9\\ 6 & 5 & -2 & -4\\ 3 & -4 & 5 & 10 \end{bmatrix}
\]
was determined to have characteristic polynomial
\[
p_B(x) = (x - 1)(x - 2)^3
\]
and an eigenspace for $\lambda = 2$ of
\[
\mathcal{E}_B(2) = \left\langle\left\{\begin{bmatrix} -\frac{1}{2}\\ 1\\ -\frac{1}{2}\\ 1 \end{bmatrix}\right\}\right\rangle
\]
So the geometric multiplicity of $\lambda = 2$ is $\gamma_B(2) = 1$, while the algebraic multiplicity is $\alpha_B(2) = 3$. By Theorem DMFE, the matrix $B$ is not diagonalizable.
△
△<br />

Archetype A is the lone archetype with a square matrix that is not diagonalizable,<br />

as the algebraic and geometric multiplicities of the eigenvalue λ = 0 differ. Example<br />

HMEM5 is another example of a matrix that cannot be diagonalized due to the<br />

difference between the geometric and algebraic multiplicities of λ =2,asisExample<br />

CEMS6 which has two complex eigenvalues, each with differ<strong>in</strong>g multiplicities.<br />

Likewise, Example EMMS4 has an eigenvalue with different algebraic and geometric<br />

multiplicities and so cannot be diagonalized.<br />

Theorem DED Distinct Eigenvalues implies Diagonalizable
Suppose $A$ is a square matrix of size $n$ with $n$ distinct eigenvalues. Then $A$ is diagonalizable.

Proof. Let $\lambda_1,\,\lambda_2,\,\lambda_3,\,\ldots,\,\lambda_n$ denote the $n$ distinct eigenvalues of $A$. Then by Theorem NEM we have $n = \sum_{i=1}^{n}\alpha_A(\lambda_i)$, which implies that $\alpha_A(\lambda_i) = 1$, $1 \leq i \leq n$. From Theorem ME it follows that $\gamma_A(\lambda_i) = 1$, $1 \leq i \leq n$. So $\gamma_A(\lambda_i) = \alpha_A(\lambda_i)$, $1 \leq i \leq n$ and Theorem DMFE says $A$ is diagonalizable. $\blacksquare$

Example DEHD Distinct eigenvalues, hence diagonalizable
In Example DEMS5 the matrix
\[
H = \begin{bmatrix} 15 & 18 & -8 & 6 & -5\\ 5 & 3 & 1 & -1 & -3\\ 0 & -4 & 5 & -4 & -2\\ -43 & -46 & 17 & -14 & 15\\ 26 & 30 & -12 & 8 & -10 \end{bmatrix}
\]
has characteristic polynomial
\[
p_H(x) = x(x - 2)(x - 1)(x + 1)(x + 3)
\]
and so is a $5 \times 5$ matrix with 5 distinct eigenvalues.
By Theorem DED we know $H$ must be diagonalizable. But just for practice, we exhibit a diagonalization. The matrix $S$ contains eigenvectors of $H$ as columns, one from each eigenspace, guaranteeing linearly independent columns and thus the nonsingularity of $S$. Notice that we are using the versions of the eigenvectors from Example DEMS5 that have integer entries. The diagonal matrix has the eigenvalues of $H$ in the same order that their respective eigenvectors appear as the columns of $S$. With these matrices, verify computationally that $S^{-1}HS = D$.
\[
S = \begin{bmatrix} 2 & 1 & -1 & 1 & 1\\ -1 & 0 & 2 & 0 & -1\\ -2 & 0 & 2 & -1 & -2\\ -4 & -1 & 0 & -2 & -1\\ 2 & 2 & 1 & 2 & 1 \end{bmatrix}
\qquad
D = \begin{bmatrix} -3 & 0 & 0 & 0 & 0\\ 0 & -1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}
\]
Note that there are many different ways to diagonalize $H$. We could replace eigenvectors by nonzero scalar multiples, or we could rearrange the order of the eigenvectors as the columns of $S$ (which would subsequently reorder the eigenvalues along the diagonal of $D$).
△

Archetype B is another example of a matrix that has as many distinct eigenvalues as its size, and is hence diagonalizable by Theorem DED.
Powers of a diagonal matrix are easy to compute, and when a matrix is diagonalizable, it is almost as easy. We could state a theorem here perhaps, but we will settle instead for an example that makes the point just as well.

Example HPDM High power of a diagonalizable matrix
Suppose that
\[
A = \begin{bmatrix} 19 & 0 & 6 & 13\\ -33 & -1 & -9 & -21\\ 21 & -4 & 12 & 21\\ -36 & 2 & -14 & -28 \end{bmatrix}
\]
and we wish to compute $A^{20}$. Normally this would require 19 matrix multiplications, but since $A$ is diagonalizable, we can simplify the computations substantially.
First, we diagonalize $A$. With
\[
S = \begin{bmatrix} 1 & -1 & 2 & -1\\ -2 & 3 & -3 & 3\\ 1 & 1 & 3 & 3\\ -2 & 1 & -4 & 0 \end{bmatrix}
\]
we find
\begin{align*}
D &= S^{-1}AS\\
&= \begin{bmatrix} -6 & 1 & -3 & -6\\ 0 & 2 & -2 & -3\\ 3 & 0 & 1 & 2\\ -1 & -1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 19 & 0 & 6 & 13\\ -33 & -1 & -9 & -21\\ 21 & -4 & 12 & 21\\ -36 & 2 & -14 & -28 \end{bmatrix}
\begin{bmatrix} 1 & -1 & 2 & -1\\ -2 & 3 & -3 & 3\\ 1 & 1 & 3 & 3\\ -2 & 1 & -4 & 0 \end{bmatrix}\\
&= \begin{bmatrix} -1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}
\end{align*}
Now we find an alternate expression for $A^{20}$,
\begin{align*}
A^{20} &= AAA\ldots A\\
&= I_nAI_nAI_nA\ldots I_nAI_n\\
&= \left(SS^{-1}\right)A\left(SS^{-1}\right)A\left(SS^{-1}\right)A\ldots\left(SS^{-1}\right)A\left(SS^{-1}\right)\\
&= S\left(S^{-1}AS\right)\left(S^{-1}AS\right)\left(S^{-1}AS\right)\ldots\left(S^{-1}AS\right)S^{-1}\\
&= SDDD\ldots DS^{-1}\\
&= SD^{20}S^{-1}
\end{align*}
and since $D$ is a diagonal matrix, powers are much easier to compute,
\begin{align*}
&= S\begin{bmatrix} -1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}^{20}S^{-1}\\
&= S\begin{bmatrix} (-1)^{20} & 0 & 0 & 0\\ 0 & (0)^{20} & 0 & 0\\ 0 & 0 & (2)^{20} & 0\\ 0 & 0 & 0 & (1)^{20} \end{bmatrix}S^{-1}\\
&= \begin{bmatrix} 1 & -1 & 2 & -1\\ -2 & 3 & -3 & 3\\ 1 & 1 & 3 & 3\\ -2 & 1 & -4 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 1048576 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} -6 & 1 & -3 & -6\\ 0 & 2 & -2 & -3\\ 3 & 0 & 1 & 2\\ -1 & -1 & 1 & 1 \end{bmatrix}\\
&= \begin{bmatrix} 6291451 & 2 & 2097148 & 4194297\\ -9437175 & -5 & -3145719 & -6291441\\ 9437175 & -2 & 3145728 & 6291453\\ -12582900 & -2 & -4194298 & -8388596 \end{bmatrix}
\end{align*}
Notice how we effectively replaced the twentieth power of $A$ by the twentieth power of $D$, and how a high power of a diagonal matrix is just a collection of powers of scalars on the diagonal. The price we pay for this simplification is the need to diagonalize the matrix (by computing eigenvalues and eigenvectors) and finding the inverse of the matrix of eigenvectors. And we still need to do two matrix products. But the higher the power, the greater the savings.
△
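Here is Example HPDM's shortcut written out as a sketch in Python with NumPy (an assumption of this aside): only the four diagonal entries of $D$ get raised to the twentieth power, and the result agrees with the direct matrix power.

```python
import numpy as np

A = np.array([[ 19,  0,   6,  13],
              [-33, -1,  -9, -21],
              [ 21, -4,  12,  21],
              [-36,  2, -14, -28]], dtype=float)
S = np.array([[ 1, -1,  2, -1],
              [-2,  3, -3,  3],
              [ 1,  1,  3,  3],
              [-2,  1, -4,  0]], dtype=float)

Sinv = np.linalg.inv(S)
D = Sinv @ A @ S                     # numerically diagonal: -1, 0, 2, 1

# A^20 = S D^20 S^{-1}, where D^20 only needs scalar powers.
A20 = S @ np.diag(np.diag(D) ** 20) @ Sinv
print(np.allclose(A20, np.linalg.matrix_power(A, 20)))   # True
```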


Subsection FS
Fibonacci Sequences

Example FSCF Fibonacci sequence, closed form
The Fibonacci sequence is a sequence of integers defined recursively by
\[
a_0 = 0 \qquad a_1 = 1 \qquad a_{n+1} = a_n + a_{n-1},\quad n \geq 1
\]
So the initial portion of the sequence is $0,\,1,\,1,\,2,\,3,\,5,\,8,\,13,\,21,\,\ldots$. In this subsection we will illustrate an application of eigenvalues and diagonalization through the determination of a closed-form expression for an arbitrary term of this sequence.
To begin, verify that for any $n \geq 1$ the recursive statement above establishes the truth of the statement
\[
\begin{bmatrix} a_n\\ a_{n+1} \end{bmatrix} = \begin{bmatrix} 0 & 1\\ 1 & 1 \end{bmatrix}\begin{bmatrix} a_{n-1}\\ a_n \end{bmatrix}
\]
Let $A$ denote this $2 \times 2$ matrix. Through repeated applications of the statement above we have
\[
\begin{bmatrix} a_n\\ a_{n+1} \end{bmatrix} = A\begin{bmatrix} a_{n-1}\\ a_n \end{bmatrix} = A^2\begin{bmatrix} a_{n-2}\\ a_{n-1} \end{bmatrix} = A^3\begin{bmatrix} a_{n-3}\\ a_{n-2} \end{bmatrix} = \cdots = A^n\begin{bmatrix} a_0\\ a_1 \end{bmatrix}
\]
In preparation for working with this high power of $A$, not unlike in Example HPDM, we will diagonalize $A$. The characteristic polynomial of $A$ is $p_A(x) = x^2 - x - 1$, with roots (the eigenvalues of $A$ by Theorem EMRCP)
\[
\rho = \frac{1 + \sqrt{5}}{2} \qquad \delta = \frac{1 - \sqrt{5}}{2}
\]
With two distinct eigenvalues, Theorem DED implies that $A$ is diagonalizable. It will be easier to compute with these eigenvalues once you confirm the following properties (all but the last can be derived from the fact that $\rho$ and $\delta$ are roots of the characteristic polynomial, in a factored or unfactored form)
\[
\rho + \delta = 1 \qquad \rho\delta = -1 \qquad 1 + \rho = \rho^2 \qquad 1 + \delta = \delta^2 \qquad \rho - \delta = \sqrt{5}
\]
Then eigenvectors of $A$ (for $\rho$ and $\delta$, respectively) are
\[
\begin{bmatrix} 1\\ \rho \end{bmatrix} \qquad \begin{bmatrix} 1\\ \delta \end{bmatrix}
\]
which can be easily confirmed, as we demonstrate for the eigenvector for $\rho$,
\[
\begin{bmatrix} 0 & 1\\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1\\ \rho \end{bmatrix} = \begin{bmatrix} \rho\\ 1 + \rho \end{bmatrix} = \begin{bmatrix} \rho\\ \rho^2 \end{bmatrix} = \rho\begin{bmatrix} 1\\ \rho \end{bmatrix}
\]
From the proof of Theorem DC we know $A$ can be diagonalized by a matrix $S$ with these eigenvectors as columns, giving $D = S^{-1}AS$. We list $S$, $S^{-1}$ and the diagonal matrix $D$,
\[
S = \begin{bmatrix} 1 & 1\\ \rho & \delta \end{bmatrix}
\qquad
S^{-1} = \frac{1}{\rho - \delta}\begin{bmatrix} -\delta & 1\\ \rho & -1 \end{bmatrix}
\qquad
D = \begin{bmatrix} \rho & 0\\ 0 & \delta \end{bmatrix}
\]
OK, we have everything in place now. The main step in the following is to replace $A$ by $SDS^{-1}$. Here we go,
\begin{align*}
\begin{bmatrix} a_n\\ a_{n+1} \end{bmatrix} &= A^n\begin{bmatrix} a_0\\ a_1 \end{bmatrix}\\
&= \left(SDS^{-1}\right)^n\begin{bmatrix} a_0\\ a_1 \end{bmatrix}\\
&= SDS^{-1}SDS^{-1}SDS^{-1}\cdots SDS^{-1}\begin{bmatrix} a_0\\ a_1 \end{bmatrix}\\
&= SDDD\cdots DS^{-1}\begin{bmatrix} a_0\\ a_1 \end{bmatrix}\\
&= SD^nS^{-1}\begin{bmatrix} a_0\\ a_1 \end{bmatrix}\\
&= \begin{bmatrix} 1 & 1\\ \rho & \delta \end{bmatrix}\begin{bmatrix} \rho & 0\\ 0 & \delta \end{bmatrix}^n\frac{1}{\rho - \delta}\begin{bmatrix} -\delta & 1\\ \rho & -1 \end{bmatrix}\begin{bmatrix} a_0\\ a_1 \end{bmatrix}\\
&= \frac{1}{\rho - \delta}\begin{bmatrix} 1 & 1\\ \rho & \delta \end{bmatrix}\begin{bmatrix} \rho^n & 0\\ 0 & \delta^n \end{bmatrix}\begin{bmatrix} -\delta & 1\\ \rho & -1 \end{bmatrix}\begin{bmatrix} 0\\ 1 \end{bmatrix}\\
&= \frac{1}{\rho - \delta}\begin{bmatrix} 1 & 1\\ \rho & \delta \end{bmatrix}\begin{bmatrix} \rho^n & 0\\ 0 & \delta^n \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix}\\
&= \frac{1}{\rho - \delta}\begin{bmatrix} 1 & 1\\ \rho & \delta \end{bmatrix}\begin{bmatrix} \rho^n\\ -\delta^n \end{bmatrix}\\
&= \frac{1}{\rho - \delta}\begin{bmatrix} \rho^n - \delta^n\\ \rho^{n+1} - \delta^{n+1} \end{bmatrix}
\end{align*}
Performing the scalar multiplication and equating the first entries of the two vectors, we arrive at the closed form expression
\begin{align*}
a_n &= \frac{1}{\rho - \delta}\left(\rho^n - \delta^n\right)\\
&= \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right)\\
&= \frac{1}{2^n\sqrt{5}}\left(\left(1 + \sqrt{5}\right)^n - \left(1 - \sqrt{5}\right)^n\right)
\end{align*}
Notice that it does not matter whether we use the equality of the first or second entries of the vectors, we will arrive at the same formula, once in terms of $n$ and again in terms of $n + 1$. Also, our definition clearly describes a sequence that will only contain integers, yet the presence of the irrational number $\sqrt{5}$ might make us suspicious. But no, our expression for $a_n$ will always yield an integer!
The Fibonacci sequence, and generalizations of it, have been extensively studied (Fibonacci lived in the 12th and 13th centuries). There are many ways to derive the closed-form expression we just found, and our approach may not be the most efficient route. But it is a nice demonstration of how diagonalization can be used to solve a problem outside the field of linear algebra.
△
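The closed form is pleasant to test against the recursion itself. A sketch in plain Python (the text does not call for any programming; this is purely illustrative):

```python
from math import sqrt

rho = (1 + sqrt(5)) / 2
delta = (1 - sqrt(5)) / 2

def fib_closed(n):
    # a_n = (rho^n - delta^n) / (rho - delta); round() clears the
    # tiny floating-point error, since the true value is an integer.
    return round((rho**n - delta**n) / (rho - delta))

a, b = 0, 1                          # a_0 and a_1 from the recursion
for n in range(20):
    assert fib_closed(n) == a
    a, b = b, a + b
print("closed form matches the recursion for n = 0, ..., 19")
```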

We close this section with a comment about an important upcoming theorem that we prove in Chapter R. A consequence of Theorem OD is that every Hermitian matrix (Definition HM) is diagonalizable (Definition DZM), and the similarity transformation that accomplishes the diagonalization uses a unitary matrix (Definition UM). This means that for every Hermitian matrix of size $n$ there is a basis of $\mathbb{C}^n$ that is composed entirely of eigenvectors for the matrix and also forms an orthonormal set (Definition ONS). Notice that for matrices with only real entries, we only need the hypothesis that the matrix is symmetric (Definition SYM) to reach this conclusion (Example ESMS4). Can you imagine a prettier basis for use with a matrix? I cannot.
These results in Section OD explain much of our recurring interest in orthogonality, and make the section a high point in your study of linear algebra. A precise statement of this diagonalization result applies to a slightly broader class of matrices, known as "normal" matrices (Definition NRML), which are matrices that commute with their adjoints. With this expanded category of matrices, the result becomes an equivalence (Proof Technique E). See Theorem OD and Theorem OBNM in Section OD for all the details.

Reading Questions

1. What is an equivalence relation?
2. State a condition that is equivalent to a matrix being diagonalizable, but is not the definition.
3. Find a diagonal matrix similar to
\[
A = \begin{bmatrix} -5 & 8\\ -4 & 7 \end{bmatrix}
\]

Exercises<br />

C20 † Consider the matrix A below. <strong>First</strong>, show that A is diagonalizable by comput<strong>in</strong>g<br />

the geometric multiplicities of the eigenvalues and quot<strong>in</strong>g the relevant theorem. Second,<br />

f<strong>in</strong>d a diagonal matrix D and a nons<strong>in</strong>gular matrix S so that S −1 AS = D. (See Exercise


§SD Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 419<br />

EE.C20 for some of the necessary computations.)<br />

⎡<br />

⎤<br />

18 −15 33 −15<br />

⎢−4 8 −6 6 ⎥<br />

A = ⎣<br />

−9 9 −16 9<br />

⎦<br />

5 −6 9 −4<br />

C21† Determine if the matrix A below is diagonalizable. If the matrix is diagonalizable, then find a diagonal matrix D that is similar to A, and provide the invertible matrix S that performs the similarity transformation. You should use your calculator to find the eigenvalues of the matrix, but try only using the row-reducing function of your calculator to assist with finding eigenvectors.

   A = [  1    9    9   24 ]
       [ -3  -27  -29  -68 ]
       [  1   11   13   26 ]
       [  1    7    7   18 ]

C22† Consider the matrix A below. Find the eigenvalues of A using a calculator and use these to construct the characteristic polynomial of A, p_A(x). State the algebraic multiplicity of each eigenvalue. Find all of the eigenspaces for A by computing expressions for null spaces, only using your calculator to row-reduce matrices. State the geometric multiplicity of each eigenvalue. Is A diagonalizable? If not, explain why. If so, find a diagonal matrix D that is similar to A.

   A = [  19   25   30   5 ]
       [ -23  -30  -35  -5 ]
       [   7    9   10   1 ]
       [  -3   -4   -5  -1 ]

T15† Suppose that A and B are similar matrices of size n. Prove that A^3 and B^3 are similar matrices. Generalize.

T16† Suppose that A and B are similar matrices, with A nonsingular. Prove that B is nonsingular, and that A^{-1} is similar to B^{-1}.

T17† Suppose that B is a nonsingular matrix. Prove that AB is similar to BA.


Chapter LT

Linear Transformations

In the next linear algebra course you take, the first lecture might be a reminder about what a vector space is (Definition VS), their ten properties, basic theorems and then some examples. The second lecture would likely be all about linear transformations. While it may seem we have waited a long time to present what must be a central topic, in truth we have already been working with linear transformations for some time.

Functions are important objects in the study of calculus, but have been absent from this course until now (well, not really, it just seems that way). In your study of more advanced mathematics it is nearly impossible to escape the use of functions; they are as fundamental as sets are.

Section LT

Linear Transformations

Early in Chapter VS we prefaced the definition of a vector space with the comment that it was "one of the two most important definitions in the entire course." Here comes the other. Any capsule summary of linear algebra would have to describe the subject as the interplay of linear transformations and vector spaces. Here we go.

Subsection LT

Linear Transformations

Definition LT Linear Transformation
A linear transformation, T: U → V, is a function that carries elements of the vector space U (called the domain) to the vector space V (called the codomain), and which has two additional properties

1. T(u1 + u2) = T(u1) + T(u2) for all u1, u2 ∈ U


2. T(αu) = αT(u) for all u ∈ U and all α ∈ C

□

The two defining conditions in the definition of a linear transformation should "feel linear," whatever that means. Conversely, these two conditions could be taken as exactly what it means to be linear. As every vector space property derives from vector addition and scalar multiplication, so too, every property of a linear transformation derives from these two defining properties. While these conditions may be reminiscent of how we test subspaces, they really are quite different, so do not confuse the two.

Here are two diagrams that convey the essence of the two defining properties of a linear transformation. In each case, begin in the upper left-hand corner, and follow the arrows around the rectangle to the lower right-hand corner, taking two different routes and doing the indicated operations labeled on the arrows. There are two results there. For a linear transformation these two expressions are always equal.

   u1, u2 ----T----> T(u1), T(u2)
     |                   |
     +                   +
     |                   |
     v                   v
   u1 + u2 ---T----> T(u1 + u2) = T(u1) + T(u2)

Diagram DLTA: Definition of Linear Transformation, Additive

   u ------T------> T(u)
   |                  |
   α                  α
   |                  |
   v                  v
   αu -----T------> T(αu) = αT(u)

Diagram DLTM: Definition of Linear Transformation, Multiplicative

A couple of words about notation. T is the name of the linear transformation, and should be used when we want to discuss the function as a whole. T(u) is how we talk about the output of the function; it is a vector in the vector space V. When we write T(x + y) = T(x) + T(y), the plus sign on the left is the operation of vector addition in the vector space U, since x and y are elements of U. The plus sign on the right is the operation of vector addition in the vector space V, since T(x) and T(y) are elements of the vector space V. These two instances of vector addition might be wildly different.

Let us examine several examples and begin to form a catalog of known linear transformations to work with.

Example ALT A linear transformation
Define T: C^3 → C^2 by describing the output of the function for a generic input with the formula

   T([x1; x2; x3]) = [2x1 + x3; -4x2]

(here [a; b; c] denotes the column vector with entries a, b, c, listed top to bottom) and check the two defining properties.

T(x + y) = T([x1; x2; x3] + [y1; y2; y3])
         = T([x1 + y1; x2 + y2; x3 + y3])
         = [2(x1 + y1) + (x3 + y3); -4(x2 + y2)]
         = [(2x1 + x3) + (2y1 + y3); -4x2 + (-4)y2]
         = [2x1 + x3; -4x2] + [2y1 + y3; -4y2]
         = T([x1; x2; x3]) + T([y1; y2; y3])
         = T(x) + T(y)

and

T(αx) = T(α[x1; x2; x3])
      = T([αx1; αx2; αx3])
      = [2(αx1) + (αx3); -4(αx2)]
      = [α(2x1 + x3); α(-4x2)]
      = α[2x1 + x3; -4x2]
      = αT([x1; x2; x3])
      = αT(x)

So by Definition LT, T is a linear transformation.
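For readers who like to experiment, the two defining properties can also be spot-checked numerically. Here is a short sketch in Python with NumPy (an illustration only, not a substitute for the algebraic verification above; random tests can fail to find a counterexample but can never prove linearity).

   import numpy as np

   def T(x):
       # The transformation of Example ALT: T([x1; x2; x3]) = [2x1 + x3; -4x2]
       return np.array([2*x[0] + x[2], -4*x[1]])

   rng = np.random.default_rng(1)
   x, y = rng.standard_normal(3), rng.standard_normal(3)
   alpha = rng.standard_normal()

   print(np.allclose(T(x + y), T(x) + T(y)))       # True: additive property holds here
   print(np.allclose(T(alpha * x), alpha * T(x)))  # True: scalar property holds here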

It can be just as instructive to look at functions that are not linear transformations. Since the defining conditions must be true for all vectors and scalars, it is enough to find just one situation where the properties fail.

Example NLT Not a linear transformation
Define S: C^3 → C^3 by

   S([x1; x2; x3]) = [4x1 + 2x2; 0; x1 + 3x3 - 2]

This function "looks" linear, but consider

   3 S([1; 2; 3]) = 3 [8; 0; 8] = [24; 0; 24]

while

   S(3 [1; 2; 3]) = S([3; 6; 9]) = [24; 0; 28]

So the second required property fails for the choice of α = 3 and x = [1; 2; 3], and by Definition LT, S is not a linear transformation. It is just about as easy to find an example where the first defining property fails (try it!). Notice that it is the "-2" in the third component of the definition of S that prevents the function from being a linear transformation.

△

Example LTPM Linear transformation, polynomials to matrices
Define a linear transformation T: P3 → M22 by

   T(a + bx + cx^2 + dx^3) = [ a + b   a - 2c ]
                             [   d      b - d ]

We verify the two defining conditions of a linear transformation.

T(x + y) = T((a1 + b1 x + c1 x^2 + d1 x^3) + (a2 + b2 x + c2 x^2 + d2 x^3))
         = T((a1 + a2) + (b1 + b2)x + (c1 + c2)x^2 + (d1 + d2)x^3)
         = [ (a1 + a2) + (b1 + b2)   (a1 + a2) - 2(c1 + c2) ]
           [       d1 + d2           (b1 + b2) - (d1 + d2)  ]
         = [ (a1 + b1) + (a2 + b2)   (a1 - 2c1) + (a2 - 2c2) ]
           [       d1 + d2           (b1 - d1) + (b2 - d2)   ]
         = [ a1 + b1   a1 - 2c1 ] + [ a2 + b2   a2 - 2c2 ]
           [    d1      b1 - d1 ]   [    d2      b2 - d2 ]
         = T(a1 + b1 x + c1 x^2 + d1 x^3) + T(a2 + b2 x + c2 x^2 + d2 x^3)
         = T(x) + T(y)

and

T(αx) = T(α(a + bx + cx^2 + dx^3))
      = T((αa) + (αb)x + (αc)x^2 + (αd)x^3)
      = [ (αa) + (αb)   (αa) - 2(αc) ]
        [     αd        (αb) - (αd)  ]
      = [ α(a + b)   α(a - 2c) ]
        [    αd       α(b - d) ]
      = α [ a + b   a - 2c ]
          [   d      b - d ]
      = αT(a + bx + cx^2 + dx^3)
      = αT(x)

So by Definition LT, T is a linear transformation.

Example LTPP Linear transformation, polynomials to polynomials
Define a function S: P4 → P5 by

   S(p(x)) = (x - 2)p(x)

Then

   S(p(x) + q(x)) = (x - 2)(p(x) + q(x)) = (x - 2)p(x) + (x - 2)q(x) = S(p(x)) + S(q(x))

   S(αp(x)) = (x - 2)(αp(x)) = (x - 2)αp(x) = α(x - 2)p(x) = αS(p(x))

So by Definition LT, S is a linear transformation.

L<strong>in</strong>ear transformations have many amaz<strong>in</strong>g properties, which we will <strong>in</strong>vestigate<br />

through the next few sections. However, as a taste of th<strong>in</strong>gs to come, here is a<br />

theorem we can prove now and put to use immediately.<br />

Theorem LTTZZ Linear Transformations Take Zero to Zero
Suppose T: U → V is a linear transformation. Then T(0) = 0.

Proof. The two zero vectors in the conclusion of the theorem are different. The first is from U while the second is from V. We will subscript the zero vectors in this proof to highlight the distinction. Think about your objects. (This proof is contributed by Mark Shoemaker.)

T(0_U) = T(0 0_U)     Theorem ZSSM in U
       = 0 T(0_U)     Definition LT
       = 0_V          Theorem ZSSM in V

■

Return to Example NLT and compute

   S([0; 0; 0]) = [0; 0; -2]

to quickly see again that S is not a linear transformation, while in Example LTPM compute

   T(0 + 0x + 0x^2 + 0x^3) = [ 0  0 ]
                             [ 0  0 ]

as an example of Theorem LTTZZ at work.

Subsection LTC
Linear Transformation Cartoons

Throughout this chapter, and Chapter R, we will include drawings of linear transformations. We will call them "cartoons," not because they are humorous, but because they will only expose a portion of the truth. A Bugs Bunny cartoon might give us some insights on human nature, but the rules of physics and biology are routinely (and grossly) violated. So it will be with our linear transformation cartoons.

Here is our first, followed by a guide to help you understand how these are meant to describe fundamental truths about linear transformations, while simultaneously violating other truths.


[Diagram: two ovals, the domain U on the left and the codomain V on the right, joined by arrows for T. The vectors u, w, x and 0_U sit in U; the vectors v, y, t and 0_V sit in V. Arrows indicate T(u) = v = T(w), T(x) = y, and T(0_U) = 0_V; no arrow points to t.]

Diagram GLT: General Linear Transformation

Here we picture a linear transformation T: U → V, where this information will be consistently displayed along the bottom edge. The ovals are meant to represent the vector spaces, in this case U, the domain, on the left and V, the codomain, on the right. Of course, vector spaces are typically infinite sets, so you will have to imagine that characteristic of these sets. A small dot inside of an oval will represent a vector within that vector space, sometimes with a name, sometimes not (in this case every vector has a name). The sizes of the ovals are meant to be proportional to the dimensions of the vector spaces. However, when we make no assumptions about the dimensions, we will draw the ovals as the same size, as we have done here (which is not meant to suggest that the dimensions have to be equal).

To convey that the linear transformation associates a certain input with a certain output, we will draw an arrow from the input to the output. So, for example, in this cartoon we suggest that T(x) = y. Nothing in the definition of a linear transformation prevents two different inputs being sent to the same output, and we see this in T(u) = v = T(w). Similarly, an output may not have any input being sent its way, as illustrated by no arrow pointing at t. In this cartoon, we have captured the essence of our one general theorem about linear transformations, Theorem LTTZZ, T(0_U) = 0_V. On occasion we might include this basic fact when it is relevant, at other times maybe not. Note that the definition of a linear transformation requires that it be a function, so every element of the domain should be associated with some element of the codomain. This will be reflected by never having an element of the domain without an arrow originating there.

These cartoons are of course no substitute for careful definitions and proofs, but they can be a handy way to think about the various properties we will be studying.

Subsection MLT
Matrices and Linear Transformations

If you give me a matrix, then I can quickly build you a linear transformation. Always. First a motivating example and then the theorem.

Example LTM Linear transformation from a matrix
Let

   A = [ 3  -1  8   1 ]
       [ 2   0  5  -2 ]
       [ 1   1  3  -7 ]

and define a function P: C^4 → C^3 by

   P(x) = Ax

So we are using an old friend, the matrix-vector product (Definition MVP) as a way to convert a vector with 4 components into a vector with 3 components. Applying Definition MVP allows us to write the defining formula for P in a slightly different form,

   P(x) = Ax = [ 3  -1  8   1 ]
               [ 2   0  5  -2 ] [x1; x2; x3; x4]
               [ 1   1  3  -7 ]
             = x1 [3; 2; 1] + x2 [-1; 0; 1] + x3 [8; 5; 3] + x4 [1; -2; -7]

So we recognize the action of the function P as using the components of the vector (x1, x2, x3, x4) as scalars to form the output of P as a linear combination of the four columns of the matrix A, which are all members of C^3, so the result is a vector in C^3. We can rearrange this expression further, using our definitions of operations in C^3 (Section VO).

P(x) = Ax                                                                   Definition of P
     = x1 [3; 2; 1] + x2 [-1; 0; 1] + x3 [8; 5; 3] + x4 [1; -2; -7]         Definition MVP
     = [3x1; 2x1; x1] + [-x2; 0; x2] + [8x3; 5x3; 3x3] + [x4; -2x4; -7x4]   Definition CVSM
     = [3x1 - x2 + 8x3 + x4; 2x1 + 5x3 - 2x4; x1 + x2 + 3x3 - 7x4]          Definition CVA

You might recognize this final expression as being similar in style to some previous examples (Example ALT) and some linear transformations defined in the archetypes (Archetype M through Archetype R). But the expression that says the output of this linear transformation is a linear combination of the columns of A is probably the most powerful way of thinking about examples of this type.
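The column-combination view is easy to experiment with. The following sketch (Python with NumPy; an illustration only, using the matrix A of Example LTM) confirms that the matrix-vector product agrees with the linear combination of the columns of A whose scalars are the entries of x.

   import numpy as np

   # The matrix A of Example LTM.
   A = np.array([[3, -1, 8,  1],
                 [2,  0, 5, -2],
                 [1,  1, 3, -7]])

   x = np.array([1.0, 2.0, -1.0, 3.0])   # an arbitrary input from C^4 (real entries for simplicity)

   # P(x) = Ax, computed two ways: as a product, and as a combination of the columns of A.
   combo = sum(x[i] * A[:, i] for i in range(4))
   print(np.allclose(A @ x, combo))      # True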

Almost forgot: we should verify that P is indeed a linear transformation. This is easy with two matrix properties from Section MM.

P(x + y) = A(x + y)        Definition of P
         = Ax + Ay         Theorem MMDAA
         = P(x) + P(y)     Definition of P

and

P(αx) = A(αx)              Definition of P
      = α(Ax)              Theorem MMSMM
      = αP(x)              Definition of P

So by Definition LT, P is a linear transformation.

△

So the multiplication of a vector by a matrix "transforms" the input vector into an output vector, possibly of a different size, by performing a linear combination. And this transformation happens in a "linear" fashion. This "functional" view of the matrix-vector product is the most important shift you can make right now in how you think about linear algebra. Here is the theorem, whose proof is very nearly an exact copy of the verification in the last example.

Theorem MBLT Matrices Build Linear Transformations
Suppose that A is an m × n matrix. Define a function T: C^n → C^m by T(x) = Ax. Then T is a linear transformation.

Proof.

T(x + y) = A(x + y)        Definition of T
         = Ax + Ay         Theorem MMDAA
         = T(x) + T(y)     Definition of T

and

T(αx) = A(αx)              Definition of T
      = α(Ax)              Theorem MMSMM
      = αT(x)              Definition of T

So by Definition LT, T is a linear transformation.

■

So Theorem MBLT gives us a rapid way to construct linear transformations. Grab an m × n matrix A, define T(x) = Ax, and Theorem MBLT tells us that T is a linear transformation from C^n to C^m, without any further checking.

We can turn Theorem MBLT around. You give me a linear transformation and I will give you a matrix.

Example MFLT Matrix from a linear transformation
Define the function R: C^3 → C^4 by

   R([x1; x2; x3]) = [2x1 - 3x2 + 4x3; x1 + x2 + x3; -x1 + 5x2 - 3x3; x2 - 4x3]

You could verify that R is a linear transformation by applying the definition, but we will instead massage the expression defining a typical output until we recognize the form of a known class of linear transformations.

R([x1; x2; x3]) = [2x1 - 3x2 + 4x3; x1 + x2 + x3; -x1 + 5x2 - 3x3; x2 - 4x3]
                = [2x1; x1; -x1; 0] + [-3x2; x2; 5x2; x2] + [4x3; x3; -3x3; -4x3]   Definition CVA
                = x1 [2; 1; -1; 0] + x2 [-3; 1; 5; 1] + x3 [4; 1; -3; -4]           Definition CVSM

                  [  2  -3   4 ]
                = [  1   1   1 ] [x1; x2; x3]                                       Definition MVP
                  [ -1   5  -3 ]
                  [  0   1  -4 ]

So if we define the matrix

   B = [  2  -3   4 ]
       [  1   1   1 ]
       [ -1   5  -3 ]
       [  0   1  -4 ]

then R(x) = Bx. By Theorem MBLT, we can easily recognize R as a linear transformation since it has the form described in the hypothesis of the theorem. △

Example MFLT was not an accident. Consider any one of the archetypes where both the domain and codomain are sets of column vectors (Archetype M through Archetype R) and you should be able to mimic the previous example. Here is the theorem, which is notable since it is our first occasion to use the full power of the defining properties of a linear transformation when our hypothesis includes a linear transformation.

Theorem MLTCV Matrix of a Linear Transformation, Column Vectors
Suppose that T: C^n → C^m is a linear transformation. Then there is an m × n matrix A such that T(x) = Ax.

Proof. The conclusion says a certain matrix exists. What better way to prove something exists than to actually build it? So our proof will be constructive (Proof Technique C), and the procedure that we will use abstractly in the proof can be used concretely in specific examples.

Let e1, e2, e3, ..., en be the columns of the identity matrix of size n, I_n (Definition SUV). Evaluate the linear transformation T with each of these standard unit vectors as an input, and record the result. In other words, define n vectors in C^m, A_i, 1 ≤ i ≤ n, by

   A_i = T(e_i)

Then package up these vectors as the columns of a matrix

   A = [A1 | A2 | A3 | ... | An]

Does A have the desired properties? First, A is clearly an m × n matrix. Then

T(x) = T(I_n x)                                                       Theorem MMIM
     = T([e1 | e2 | e3 | ... | en] x)                                 Definition SUV
     = T([x]_1 e1 + [x]_2 e2 + [x]_3 e3 + ... + [x]_n en)             Definition MVP
     = T([x]_1 e1) + T([x]_2 e2) + T([x]_3 e3) + ... + T([x]_n en)    Definition LT
     = [x]_1 T(e1) + [x]_2 T(e2) + [x]_3 T(e3) + ... + [x]_n T(en)    Definition LT
     = [x]_1 A1 + [x]_2 A2 + [x]_3 A3 + ... + [x]_n An                Definition of A_i
     = Ax                                                             Definition MVP

as desired.

■

So if we were to restrict our study of linear transformations to those where the domain and codomain are both vector spaces of column vectors (Definition VSCV), every matrix leads to a linear transformation of this type (Theorem MBLT), while every such linear transformation leads to a matrix (Theorem MLTCV). So matrices and linear transformations are fundamentally the same. We call the matrix A of Theorem MLTCV the matrix representation of T.

We have defined linear transformations for more general vector spaces than just C^m. Can we extend this correspondence between linear transformations and matrices to more general linear transformations (more general domains and codomains)? Yes, and this is the main theme of Chapter R. Stay tuned. For now, let us illustrate Theorem MLTCV with an example.


Example MOLT Matrix of a linear transformation
Suppose S: C^3 → C^4 is defined by

   S([x1; x2; x3]) = [3x1 - 2x2 + 5x3; x1 + x2 + x3; 9x1 - 2x2 + 5x3; 4x2]

Then

   C1 = S(e1) = S([1; 0; 0]) = [3; 1; 9; 0]
   C2 = S(e2) = S([0; 1; 0]) = [-2; 1; -2; 4]
   C3 = S(e3) = S([0; 0; 1]) = [5; 1; 5; 0]

so define

   C = [C1 | C2 | C3] = [ 3  -2  5 ]
                        [ 1   1  1 ]
                        [ 9  -2  5 ]
                        [ 0   4  0 ]

and Theorem MLTCV guarantees that S(x) = Cx.

As an illuminating exercise, let z = [2; -3; 3] and compute S(z) two different ways. First, return to the definition of S and evaluate S(z) directly. Then do the matrix-vector product Cz. In both cases you should obtain the vector S(z) = [27; 2; 39; -12].

△
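The construction in the proof of Theorem MLTCV is mechanical enough to automate. The sketch below (Python with NumPy; an illustration of the idea, with S taken from Example MOLT) builds the matrix representation by evaluating S on the standard unit vectors, then checks that S(z) = Cz.

   import numpy as np

   def S(x):
       # The transformation of Example MOLT.
       x1, x2, x3 = x
       return np.array([3*x1 - 2*x2 + 5*x3,
                        x1 + x2 + x3,
                        9*x1 - 2*x2 + 5*x3,
                        4*x2])

   # Columns of the matrix representation are the outputs on the standard unit vectors.
   C = np.column_stack([S(e) for e in np.eye(3)])
   z = np.array([2, -3, 3])
   print(np.allclose(S(z), C @ z))       # True: both give [27, 2, 39, -12]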

Subsection LTLC
Linear Transformations and Linear Combinations

It is the interaction between linear transformations and linear combinations that lies at the heart of many of the important theorems of linear algebra. The next theorem distills the essence of this. The proof is not deep, the result is hardly startling, but it will be referenced frequently. We have already passed by one occasion to employ it, in the proof of Theorem MLTCV. Paraphrasing, this theorem says that we can "push" linear transformations "down into" linear combinations, or "pull" linear transformations "up out" of linear combinations. We will have opportunities to both push and pull.

Theorem LTLC Linear Transformations and Linear Combinations
Suppose that T: U → V is a linear transformation, u1, u2, u3, ..., ut are vectors from U and a1, a2, a3, ..., at are scalars from C. Then

   T(a1 u1 + a2 u2 + a3 u3 + ... + at ut) = a1 T(u1) + a2 T(u2) + a3 T(u3) + ... + at T(ut)

Proof.

T(a1 u1 + a2 u2 + a3 u3 + ... + at ut)
   = T(a1 u1) + T(a2 u2) + T(a3 u3) + ... + T(at ut)    Definition LT
   = a1 T(u1) + a2 T(u2) + a3 T(u3) + ... + at T(ut)    Definition LT

■

Some authors, especially in more advanced texts, take the conclusion of Theorem LTLC as the defining condition of a linear transformation. This has the appeal of being a single condition, rather than the two-part condition of Definition LT. (See Exercise LT.T20.)

Our next theorem says, informally, that it is enough to know how a linear transformation behaves for inputs from any basis of the domain, and all the other outputs are described by a linear combination of these few values. Again, the statement of the theorem, and its proof, are not remarkable, but the insight that goes along with it is very fundamental.

Theorem LTDB Linear Transformation Defined on a Basis
Suppose U is a vector space with basis B = {u1, u2, u3, ..., un} and the vector space V contains the vectors v1, v2, v3, ..., vn (which may not be distinct). Then there is a unique linear transformation, T: U → V, such that T(u_i) = v_i, 1 ≤ i ≤ n.

Proof. To prove the existence of T, we construct a function and show that it is a linear transformation (Proof Technique C). Suppose w ∈ U is an arbitrary element of the domain. Then by Theorem VRRB there are unique scalars a1, a2, a3, ..., an such that

   w = a1 u1 + a2 u2 + a3 u3 + ... + an un

Then define the function T by

   T(w) = a1 v1 + a2 v2 + a3 v3 + ... + an vn

It should be clear that T behaves as required for n inputs from B. Since the scalars provided by Theorem VRRB are unique, there is no ambiguity in this definition, and T qualifies as a function with domain U and codomain V (i.e. T is well-defined). But is T a linear transformation as well?

Let x ∈ U be a second element of the domain, and suppose the scalars provided by Theorem VRRB (relative to B) are b1, b2, b3, ..., bn. Then

T(w + x) = T(a1 u1 + ... + an un + b1 u1 + ... + bn un)
         = T((a1 + b1) u1 + ... + (an + bn) un)            Definition VS
         = (a1 + b1) v1 + ... + (an + bn) vn               Definition of T
         = a1 v1 + ... + an vn + b1 v1 + ... + bn vn       Definition VS
         = T(w) + T(x)

Let α ∈ C be any scalar. Then

T(αw) = T(α(a1 u1 + a2 u2 + a3 u3 + ... + an un))
      = T(αa1 u1 + αa2 u2 + αa3 u3 + ... + αan un)         Definition VS
      = αa1 v1 + αa2 v2 + αa3 v3 + ... + αan vn            Definition of T
      = α(a1 v1 + a2 v2 + a3 v3 + ... + an vn)             Definition VS
      = αT(w)

So by Definition LT, T is a linear transformation.

Is T unique (among all linear transformations that take the u_i to the v_i)? Applying Proof Technique U, we posit the existence of a second linear transformation, S: U → V, such that S(u_i) = v_i, 1 ≤ i ≤ n. Again, let w ∈ U represent an arbitrary element of U and let a1, a2, a3, ..., an be the scalars provided by Theorem VRRB (relative to B). We have

T(w) = T(a1 u1 + a2 u2 + a3 u3 + ... + an un)              Theorem VRRB
     = a1 T(u1) + a2 T(u2) + a3 T(u3) + ... + an T(un)     Theorem LTLC
     = a1 v1 + a2 v2 + a3 v3 + ... + an vn                 Definition of T
     = a1 S(u1) + a2 S(u2) + a3 S(u3) + ... + an S(un)     Definition of S
     = S(a1 u1 + a2 u2 + a3 u3 + ... + an un)              Theorem LTLC
     = S(w)                                                Theorem VRRB

So the outputs of T and S agree on every input, which means they are equal as functions, T = S. So T is unique.

■

You might recall facts from analytic geometry, such as "any two points determine a line" and "any three non-collinear points determine a parabola." Theorem LTDB has much of the same feel. By specifying the n outputs for inputs from a basis, an entire linear transformation is determined. The analogy is not perfect, but the style of these facts is not very dissimilar from Theorem LTDB.

Notice that the statement of Theorem LTDB asserts the existence of a linear transformation with certain properties, while the proof shows us exactly how to define the desired linear transformation. The next two examples show how to compute values of linear transformations that we create this way.

Example LTDB1 Linear transformation defined on a basis
Consider the linear transformation T: C^3 → C^2 that is required to have the following three values,

   T([1; 0; 0]) = [2; 1]     T([0; 1; 0]) = [-1; 4]     T([0; 0; 1]) = [6; 0]

Because

   B = { [1; 0; 0], [0; 1; 0], [0; 0; 1] }

is a basis for C^3 (Theorem SUVB), Theorem LTDB says there is a unique linear transformation T that behaves this way.

How do we compute other values of T? Consider the input

   w = [2; -3; 1] = (2)[1; 0; 0] + (-3)[0; 1; 0] + (1)[0; 0; 1]

Then

   T(w) = (2)[2; 1] + (-3)[-1; 4] + (1)[6; 0] = [13; -10]

Doing it again,

   x = [5; 2; -3] = (5)[1; 0; 0] + (2)[0; 1; 0] + (-3)[0; 0; 1]

so

   T(x) = (5)[2; 1] + (2)[-1; 4] + (-3)[6; 0] = [-10; 13]

Any other value of T could be computed in a similar manner. So rather than being given a formula for the outputs of T, the requirement that T behave in a certain way for the inputs chosen from a basis of the domain is as sufficient as a formula for computing any value of the function. You might notice some parallels between this example and Example MOLT or Theorem MLTCV.

△

Example LTDB2 Linear transformation defined on a basis
Consider the linear transformation R: C^3 → C^2 with the three values,

   R([1; 2; 1]) = [5; -1]     R([-1; 5; 1]) = [0; 4]     R([3; 1; 4]) = [2; 3]


You can check that

   D = { [1; 2; 1], [-1; 5; 1], [3; 1; 4] }

is a basis for C^3 (make the vectors the columns of a square matrix and check that the matrix is nonsingular, Theorem CNMB). By Theorem LTDB we know there is a unique linear transformation R with the three specified outputs. However, we have to work just a bit harder to take an input vector and express it as a linear combination of the vectors in D.

For example, consider,

   y = [8; -3; 5]

Then we must first write y as a linear combination of the vectors in D and solve for the unknown scalars, to arrive at

   y = [8; -3; 5] = (3)[1; 2; 1] + (-2)[-1; 5; 1] + (1)[3; 1; 4]

Then the proof of Theorem LTDB gives us

   R(y) = (3)[5; -1] + (-2)[0; 4] + (1)[2; 3] = [17; -8]

Any other value of R could be computed in a similar manner.
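Finding the scalars is itself a linear-system computation. This sketch (Python with NumPy; an illustration only, with all numbers taken from this example) solves for the coordinates of y relative to D, then forms the same combination of the three prescribed outputs.

   import numpy as np

   D = np.array([[1, -1, 3],
                 [2,  5, 1],
                 [1,  1, 4]])          # the basis vectors of D, as columns
   V = np.array([[5,  0, 2],
                 [-1, 4, 3]])          # the three prescribed outputs of R, as columns
   y = np.array([8, -3, 5])

   a = np.linalg.solve(D, y)           # coordinates of y relative to D: [3, -2, 1]
   print(V @ a)                        # R(y) = [17, -8]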

Here is a third example of a linear transformation defined by its action on a basis, only with more abstract vector spaces involved.

Example LTDB3 Linear transformation defined on a basis
The set W = {p(x) ∈ P3 | p(1) = 0, p(3) = 0} ⊆ P3 is a subspace of the vector space of polynomials P3. This subspace has C = {3 - 4x + x^2, 12 - 13x + x^3} as a basis (check this!). Suppose we consider the linear transformation S: W → M22 with values

   S(3 - 4x + x^2) = [ 1  -3 ]     S(12 - 13x + x^3) = [ 0  1 ]
                     [ 2   0 ]                         [ 1  0 ]

By Theorem LTDB we know there is a unique linear transformation with these two values. To illustrate a sample computation of S, consider q(x) = 9 - 6x - 5x^2 + 2x^3. Verify that q(x) is an element of W (does it have roots at x = 1 and x = 3?), then find the scalars needed to write it as a linear combination of the basis vectors in C. Because

   q(x) = 9 - 6x - 5x^2 + 2x^3 = (-5)(3 - 4x + x^2) + (2)(12 - 13x + x^3)

the proof of Theorem LTDB gives us

   S(q) = (-5) [ 1  -3 ] + (2) [ 0  1 ] = [ -5  17 ]
               [ 2   0 ]       [ 1  0 ]   [ -8   0 ]

And all the other outputs of S could be computed in the same manner. Every output of S will have a zero in the second row, second column. Can you see why this is so?

△

Informally, we can describe Theorem LTDB by saying "it is enough to know what a linear transformation does to a basis (of the domain)."

Subsection PI
Pre-Images

The definition of a function requires that for each input in the domain there is exactly one output in the codomain. However, the correspondence does not have to behave the other way around. An output from the codomain could have many different inputs from the domain which the transformation sends to that output, or there could be no inputs at all which the transformation sends to that output. To formalize our discussion of this aspect of linear transformations, we define the pre-image.

Def<strong>in</strong>ition PI Pre-Image<br />

Suppose that T : U → V is a l<strong>in</strong>ear transformation. For each v, def<strong>in</strong>e the pre-image<br />

of v to be the subset of U given by<br />

T −1 (v) ={u ∈ U| T (u) =v}<br />

□<br />

In other words, T −1 (v) is the set of all those vectors <strong>in</strong> the doma<strong>in</strong> U that get<br />

“sent” to the vector v.<br />

Example SPIAS Sample pre-images, Archetype S
Archetype S is the linear transformation defined by

   T: C^3 → M22,   T([a; b; c]) = [    a - b       2a + 2b + c  ]
                                  [ 3a + b + c   -2a - 6b - 2c  ]

We could compute a pre-image for every element of the codomain M22. However, even in a free textbook, we do not have the room to do that, so we will compute just two.

Choose

   v = [ 2  1 ] ∈ M22
       [ 3  2 ]


for no particular reason. What is T^{-1}(v)? Suppose u = [u1; u2; u3] ∈ T^{-1}(v). The condition that T(u) = v becomes

   [ 2  1 ]  =  v  =  T(u)  =  T([u1; u2; u3])  =  [    u1 - u2        2u1 + 2u2 + u3  ]
   [ 3  2 ]                                        [ 3u1 + u2 + u3    -2u1 - 6u2 - 2u3 ]

Using matrix equality (Definition ME), we arrive at a system of four equations in the three unknowns u1, u2, u3 with an augmented matrix that we can row-reduce in the hunt for solutions,

   [  1  -1   0   2 ]           [ 1  0  1/4   5/4 ]
   [  2   2   1   1 ]  -RREF->  [ 0  1  1/4  -3/4 ]
   [  3   1   1   3 ]           [ 0  0   0     0  ]
   [ -2  -6  -2   2 ]           [ 0  0   0     0  ]

We recognize this system as having infinitely many solutions described by the single free variable u3. Eventually obtaining the vector form of the solutions (Theorem VFSLS), we can describe the preimage precisely as,

   T^{-1}(v) = { u ∈ C^3 | T(u) = v }
             = { [u1; u2; u3] | u1 = 5/4 - (1/4)u3, u2 = -3/4 - (1/4)u3 }
             = { [5/4 - (1/4)u3; -3/4 - (1/4)u3; u3] | u3 ∈ C }
             = { [5/4; -3/4; 0] + u3 [-1/4; -1/4; 1] | u3 ∈ C }
             = [5/4; -3/4; 0] + <{ [-1/4; -1/4; 1] }>

This last line is merely a suggestive way of describing the set on the previous line. You might create three or four vectors in the preimage, and evaluate T with each. Was the result what you expected? For a hint of things to come, you might try evaluating T with just the lone vector in the spanning set above. What was the result? Now take a look back at Theorem PSPHS. Hmmmm.

OK, let us compute another preimage, but with a different outcome this time. Choose

   v = [ 1  1 ] ∈ M22
       [ 2  4 ]

What is T^{-1}(v)? Suppose u = [u1; u2; u3] ∈ T^{-1}(v). That T(u) = v becomes

   [ 1  1 ]  =  v  =  T(u)  =  T([u1; u2; u3])  =  [    u1 - u2        2u1 + 2u2 + u3  ]
   [ 2  4 ]                                        [ 3u1 + u2 + u3    -2u1 - 6u2 - 2u3 ]

Using matrix equality (Definition ME), we arrive at a system of four equations in the three unknowns u1, u2, u3 with an augmented matrix that we can row-reduce in the hunt for solutions,

   [  1  -1   0   1 ]           [ 1  0  1/4  0 ]
   [  2   2   1   1 ]  -RREF->  [ 0  1  1/4  0 ]
   [  3   1   1   2 ]           [ 0  0   0   1 ]
   [ -2  -6  -2   4 ]           [ 0  0   0   0 ]

By Theorem RCLS we recognize this system as inconsistent. So no vector u is a member of T^{-1}(v) and so

   T^{-1}(v) = ∅

△
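The row reductions above can be cross-checked with a computer algebra system. Here is a brief sketch (Python with SymPy; an illustration only, reproducing the two augmented matrices of this example).

   from sympy import Matrix

   # First choice of v: the reduced form has pivots in the first two columns,
   # so the system is consistent with one free variable.
   M1 = Matrix([[1, -1, 0, 2], [2, 2, 1, 1], [3, 1, 1, 3], [-2, -6, -2, 2]])
   print(M1.rref()[0])   # rows [1, 0, 1/4, 5/4] and [0, 1, 1/4, -3/4], then zero rows

   # Second choice of v: a leading 1 lands in the final column, so the system is inconsistent.
   M2 = Matrix([[1, -1, 0, 1], [2, 2, 1, 1], [3, 1, 1, 2], [-2, -6, -2, 4]])
   print(M2.rref()[0])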

The preimage is just a set; it is almost never a subspace of U (you might think about just when T^{-1}(v) is a subspace, see Exercise ILT.T10). We will describe its properties going forward, and it will be central to the main ideas of this chapter.

Subsection NLTFO
New Linear Transformations From Old

We can combine linear transformations in natural ways to create new linear transformations. So we will define these combinations and then prove that the results really are still linear transformations. First, the sum of two linear transformations.

Def<strong>in</strong>ition LTA L<strong>in</strong>ear Transformation Addition<br />

Suppose that T : U → V and S : U → V are two l<strong>in</strong>ear transformations with the<br />

same doma<strong>in</strong> and codoma<strong>in</strong>. Then their sum is the function T + S : U → V whose<br />

outputs are def<strong>in</strong>ed by<br />

(T + S)(u) =T (u)+S (u)<br />

□<br />

Notice that the first plus sign in the definition is the operation being defined, while the second one is the vector addition in V. (Vector addition in U will appear just now in the proof that T + S is a linear transformation.) Definition LTA only provides a function. It would be nice to know that when the constituents (T, S) are linear transformations, then so too is T + S.

Theorem SLTLT Sum of Linear Transformations is a Linear Transformation
Suppose that T: U → V and S: U → V are two linear transformations with the same domain and codomain. Then T + S: U → V is a linear transformation.

Proof. We simply check the defining properties of a linear transformation (Definition LT). This is a good place to consistently ask yourself which objects are being combined with which operations.

(T + S)(x + y) = T(x + y) + S(x + y)          Definition LTA
               = T(x) + T(y) + S(x) + S(y)    Definition LT
               = T(x) + S(x) + T(y) + S(y)    Property C in V
               = (T + S)(x) + (T + S)(y)      Definition LTA

and

(T + S)(αx) = T(αx) + S(αx)                   Definition LTA
            = αT(x) + αS(x)                   Definition LT
            = α(T(x) + S(x))                  Property DVA in V
            = α(T + S)(x)                     Definition LTA

■

Example STLT Sum of two linear transformations
Suppose that T: C^2 → C^3 and S: C^2 → C^3 are defined by

   T([x1; x2]) = [x1 + 2x2; 3x1 - 4x2; 5x1 + 2x2]      S([x1; x2]) = [4x1 - x2; x1 + 3x2; -7x1 + 5x2]

Then by Definition LTA, we have

   (T + S)([x1; x2]) = T([x1; x2]) + S([x1; x2])
                     = [x1 + 2x2; 3x1 - 4x2; 5x1 + 2x2] + [4x1 - x2; x1 + 3x2; -7x1 + 5x2]
                     = [5x1 + x2; 4x1 - x2; -2x1 + 7x2]

and by Theorem SLTLT we know T + S is also a linear transformation from C^2 to C^3.

△

Def<strong>in</strong>ition LTSM L<strong>in</strong>ear Transformation Scalar Multiplication<br />

Suppose that T : U → V is a l<strong>in</strong>ear transformation and α ∈ C. Then the scalar<br />

multiple is the function αT : U → V whose outputs are def<strong>in</strong>ed by<br />

(αT )(u) =αT (u)


§LT Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 440<br />

□<br />

Given that T is a linear transformation, it would be nice to know that αT is also a linear transformation.

Theorem MLTLT Multiple of a Linear Transformation is a Linear Transformation
Suppose that T: U → V is a linear transformation and α ∈ C. Then (αT): U → V is a linear transformation.

Proof. We simply check the defining properties of a linear transformation (Definition LT). This is another good place to consistently ask yourself which objects are being combined with which operations.

(αT)(x + y) = α(T(x + y))         Definition LTSM
            = α(T(x) + T(y))      Definition LT
            = αT(x) + αT(y)       Property DVA in V
            = (αT)(x) + (αT)(y)   Definition LTSM

and

(αT)(βx) = αT(βx)                 Definition LTSM
         = α(βT(x))               Definition LT
         = (αβ)T(x)               Property SMA in V
         = (βα)T(x)               Commutativity in C
         = β(αT(x))               Property SMA in V
         = β((αT)(x))             Definition LTSM

■

Example SMLT Scalar multiple of a linear transformation
Suppose that T: C^4 → C^3 is defined by

   T([x1; x2; x3; x4]) = [x1 + 2x2 - x3 + 2x4; x1 + 5x2 - 3x3 + x4; -2x1 + 3x2 - 4x3 + 2x4]

For the sake of an example, choose α = 2, so by Definition LTSM, we have

   αT([x1; x2; x3; x4]) = 2 T([x1; x2; x3; x4])
                        = 2 [x1 + 2x2 - x3 + 2x4; x1 + 5x2 - 3x3 + x4; -2x1 + 3x2 - 4x3 + 2x4]
                        = [2x1 + 4x2 - 2x3 + 4x4; 2x1 + 10x2 - 6x3 + 2x4; -4x1 + 6x2 - 8x3 + 4x4]

and by Theorem MLTLT we know 2T is also a linear transformation from C^4 to C^3.

△

Now, let us imagine we have two vector spaces, U and V, and we collect every possible linear transformation from U to V into one big set, and call it LT(U, V). Definition LTA and Definition LTSM tell us how we can "add" and "scalar multiply" two elements of LT(U, V). Theorem SLTLT and Theorem MLTLT tell us that if we do these operations, then the resulting functions are linear transformations that are also in LT(U, V). Hmmmm, sounds like a vector space to me! A set of objects, an addition and a scalar multiplication. Why not?

Theorem VSLT Vector Space of Linear Transformations
Suppose that U and V are vector spaces. Then the set of all linear transformations from U to V, LT(U, V), is a vector space when the operations are those given in Definition LTA and Definition LTSM.

Proof. Theorem SLTLT and Theorem MLTLT provide two of the ten properties <strong>in</strong><br />

Def<strong>in</strong>ition VS. However, we still need to verify the rema<strong>in</strong><strong>in</strong>g eight properties. By<br />

and large, the proofs are straightforward and rely on concoct<strong>in</strong>g the obvious object,<br />

or by reduc<strong>in</strong>g the question to the same vector space property <strong>in</strong> the vector space V .<br />

The zero vector is of some <strong>in</strong>terest, though. What l<strong>in</strong>ear transformation would<br />

we add to any other l<strong>in</strong>ear transformation, so as to keep the second one unchanged?<br />

The answer is Z : U → V def<strong>in</strong>ed by Z (u) = 0 V for every u ∈ U. Notice how we do<br />

not need to know any of the specifics about U and V to make this def<strong>in</strong>ition of Z.<br />
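These operations are easy to model in software. Below is a minimal Python sketch of my own (not from the text): elements of LT(C², C³) are represented as callables, the operations of Definition LTA and Definition LTSM are defined pointwise, and Z is the zero transformation from the proof.

    import numpy as np

    def add_lt(T, S):
        return lambda u: T(u) + S(u)      # (T + S)(u) = T(u) + S(u), Definition LTA

    def scale_lt(alpha, T):
        return lambda u: alpha * T(u)     # (alpha T)(u) = alpha T(u), Definition LTSM

    def Z(u):
        return np.zeros(3)                # Z(u) = 0_V, here with V = C^3

    T = lambda u: np.array([u[0] + 2*u[1], 3*u[0], u[0] - u[1]])
    u = np.array([1.0, 2.0])
    print(add_lt(T, Z)(u))                # equals T(u): Z acts as the zero vector of LT(U, V)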

Definition LTC Linear Transformation Composition
Suppose that T : U → V and S : V → W are linear transformations. Then the composition of S and T is the function (S ∘ T) : U → W whose outputs are defined by
\[
(S \circ T)(u) = S(T(u))
\]
□

Given that T and S are linear transformations, it would be nice to know that S ∘ T is also a linear transformation.

Theorem CLTLT Composition of Linear Transformations is a Linear Transformation
Suppose that T : U → V and S : V → W are linear transformations. Then (S ∘ T) : U → W is a linear transformation.

Proof. We simply check the defining properties of a linear transformation (Definition LT).
\begin{align*}
(S \circ T)(x + y) &= S(T(x + y)) && \text{Definition LTC}\\
&= S(T(x) + T(y)) && \text{Definition LT for } T\\
&= S(T(x)) + S(T(y)) && \text{Definition LT for } S\\
&= (S \circ T)(x) + (S \circ T)(y) && \text{Definition LTC}
\end{align*}
and
\begin{align*}
(S \circ T)(\alpha x) &= S(T(\alpha x)) && \text{Definition LTC}\\
&= S(\alpha T(x)) && \text{Definition LT for } T\\
&= \alpha S(T(x)) && \text{Definition LT for } S\\
&= \alpha (S \circ T)(x) && \text{Definition LTC}
\end{align*}
■

Example CTLT Composition of two linear transformations
Suppose that T : C² → C⁴ and S : C⁴ → C³ are defined by
\[
T\left(\begin{bmatrix}x_1\\x_2\end{bmatrix}\right)
= \begin{bmatrix} x_1+2x_2\\ 3x_1-4x_2\\ 5x_1+2x_2\\ 6x_1-3x_2 \end{bmatrix}
\qquad
S\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}\right)
= \begin{bmatrix} 2x_1-x_2+x_3-x_4\\ 5x_1-3x_2+8x_3-2x_4\\ -4x_1+3x_2-4x_3+5x_4 \end{bmatrix}
\]
Then by Definition LTC
\begin{align*}
(S \circ T)\left(\begin{bmatrix}x_1\\x_2\end{bmatrix}\right)
&= S\left(T\left(\begin{bmatrix}x_1\\x_2\end{bmatrix}\right)\right)
= S\left(\begin{bmatrix} x_1+2x_2\\ 3x_1-4x_2\\ 5x_1+2x_2\\ 6x_1-3x_2 \end{bmatrix}\right)\\
&= \begin{bmatrix}
2(x_1+2x_2)-(3x_1-4x_2)+(5x_1+2x_2)-(6x_1-3x_2)\\
5(x_1+2x_2)-3(3x_1-4x_2)+8(5x_1+2x_2)-2(6x_1-3x_2)\\
-4(x_1+2x_2)+3(3x_1-4x_2)-4(5x_1+2x_2)+5(6x_1-3x_2)
\end{bmatrix}\\
&= \begin{bmatrix} -2x_1+13x_2\\ 24x_1+44x_2\\ 15x_1-43x_2 \end{bmatrix}
\end{align*}
and by Theorem CLTLT, S ∘ T is a linear transformation from C² to C³. △

Here is an interesting exercise that will presage an important result later. In Example STLT compute (via Theorem MLTCV) the matrix of T, S and T + S. Do you see a relationship between these three matrices?

In Example SMLT compute (via Theorem MLTCV) the matrix of T and 2T. Do you see a relationship between these two matrices?

Here is the tough one. In Example CTLT compute (via Theorem MLTCV) the matrix of T, S and S ∘ T. Do you see a relationship between these three matrices???
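If you want to experiment before committing to an answer, here is a small NumPy sketch of my own (not part of the text) that builds each matrix for Example CTLT the way the proof of Theorem MLTCV does: column j of the matrix is the image of the j-th standard unit vector. The relationship is left for you to discover.

    import numpy as np

    # Example CTLT: T : C^2 -> C^4 and S : C^4 -> C^3
    def T(x):
        x1, x2 = x
        return np.array([x1 + 2*x2, 3*x1 - 4*x2, 5*x1 + 2*x2, 6*x1 - 3*x2])

    def S(x):
        x1, x2, x3, x4 = x
        return np.array([2*x1 - x2 + x3 - x4,
                         5*x1 - 3*x2 + 8*x3 - 2*x4,
                         -4*x1 + 3*x2 - 4*x3 + 5*x4])

    def matrix_of(L, n):
        # Proof technique of Theorem MLTCV: column j is L(e_j)
        return np.column_stack([L(np.eye(n)[:, j]) for j in range(n)])

    MT = matrix_of(T, 2)                   # 4 x 2
    MS = matrix_of(S, 4)                   # 3 x 4
    MST = matrix_of(lambda x: S(T(x)), 2)  # 3 x 2, the matrix of S o T
    print(MT, MS, MST, sep="\n\n")         # now compare MST against MS and MT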

Reading Questions

1. Is the function below a linear transformation? Why or why not?
\[
T : \mathbb{C}^3 \to \mathbb{C}^2,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)
= \begin{bmatrix} 3x_1-x_2+x_3\\ 8x_2-6 \end{bmatrix}
\]

2. Determine the matrix representation of the linear transformation S below.
\[
S : \mathbb{C}^2 \to \mathbb{C}^3,\quad
S\left(\begin{bmatrix}x_1\\x_2\end{bmatrix}\right)
= \begin{bmatrix} 3x_1+5x_2\\ 8x_1-3x_2\\ -4x_1 \end{bmatrix}
\]

3. Theorem LTLC has a fairly simple proof. Yet the result itself is very powerful. Comment on why we might say this.

Exercises

C15 The archetypes below are all linear transformations whose domains and codomains are vector spaces of column vectors (Definition VSCV). For each one, compute the matrix representation described in the proof of Theorem MLTCV.
Archetype M, Archetype N, Archetype O, Archetype P, Archetype Q, Archetype R

C16† Find the matrix representation of
\[
T : \mathbb{C}^3 \to \mathbb{C}^4,\quad
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)
= \begin{bmatrix} 3x+2y+z\\ x+y+z\\ x-3y\\ 2x+3y+z \end{bmatrix}
\]

C20† Let
\[
w = \begin{bmatrix}-3\\1\\4\end{bmatrix}
\]
Referring to Example MOLT, compute S(w) two different ways. First use the definition of S, then compute the matrix-vector product Cw (Definition MVP).
MVP).<br />

C25 † Def<strong>in</strong>e the l<strong>in</strong>ear transformation<br />

⎛⎡<br />

T : C 3 → C 2 , T ⎝⎣ x ⎤⎞<br />

[ ]<br />

1<br />

x 2<br />

⎦⎠ 2x1 − x<br />

=<br />

2 +5x 3<br />

−4x<br />

x 1 +2x 2 − 10x 3<br />

3<br />

Verify that T is a l<strong>in</strong>ear transformation.<br />

C26 † Verify that the function below is a l<strong>in</strong>ear transformation.<br />

T : P 2 → C 2 , T ( a + bx + cx 2) [ ]<br />

2a − b<br />

=<br />

b + c


C30† Define the linear transformation
\[
T : \mathbb{C}^3 \to \mathbb{C}^2,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)
= \begin{bmatrix} 2x_1-x_2+5x_3\\ -4x_1+2x_2-10x_3 \end{bmatrix}
\]
Compute the preimages, $T^{-1}\left(\begin{bmatrix}2\\3\end{bmatrix}\right)$ and $T^{-1}\left(\begin{bmatrix}4\\-8\end{bmatrix}\right)$.

C31† For the linear transformation S compute the pre-images.
\[
S : \mathbb{C}^3 \to \mathbb{C}^3,\quad
S\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right)
= \begin{bmatrix} a-2b-c\\ 3a-b+2c\\ a+b+2c \end{bmatrix}
\]
\[
S^{-1}\left(\begin{bmatrix}-2\\5\\3\end{bmatrix}\right)
\qquad
S^{-1}\left(\begin{bmatrix}-5\\5\\7\end{bmatrix}\right)
\]

C40† If T : C² → C² satisfies
\[
T\left(\begin{bmatrix}2\\1\end{bmatrix}\right) = \begin{bmatrix}3\\4\end{bmatrix}
\qquad\text{and}\qquad
T\left(\begin{bmatrix}-1\\1\end{bmatrix}\right) = \begin{bmatrix}1\\2\end{bmatrix}
\]
find $T\left(\begin{bmatrix}4\\3\end{bmatrix}\right)$.

C41† If T : C² → C³ satisfies
\[
T\left(\begin{bmatrix}2\\3\end{bmatrix}\right) = \begin{bmatrix}2\\2\\1\end{bmatrix}
\qquad\text{and}\qquad
T\left(\begin{bmatrix}3\\4\end{bmatrix}\right) = \begin{bmatrix}-1\\0\\2\end{bmatrix}
\]
find the matrix representation of T.

C42† Define T : M₂₂ → C¹ by
\[
T\left(\begin{bmatrix}a & b\\ c & d\end{bmatrix}\right) = a+b+c-d
\]
Find the pre-image T⁻¹(3).

C43† Define T : P₃ → P₂ by T(a+bx+cx²+dx³) = b+2cx+3dx². Find the pre-image of 0. Does this linear transformation seem familiar?

M10† Define two linear transformations, T : C⁴ → C³ and S : C³ → C², by
\[
S\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)
= \begin{bmatrix} x_1-2x_2+3x_3\\ 5x_1+4x_2+2x_3 \end{bmatrix}
\qquad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}\right)
= \begin{bmatrix} -x_1+3x_2+x_3+9x_4\\ 2x_1+x_3+7x_4\\ 4x_1+2x_2+x_3+2x_4 \end{bmatrix}
\]
Using the proof of Theorem MLTCV compute the matrix representations of the three linear transformations T, S and S ∘ T. Discover and comment on the relationship between these three matrices.

M60 Suppose U and V are vector spaces and define a function Z : U → V by Z(u) = 0_V for every u ∈ U. Prove that Z is a (stupid) linear transformation. (See Exercise ILT.M60, Exercise SLT.M60, Exercise IVLT.M60.)

T20 Use the conclusion of Theorem LTLC to motivate a new definition of a linear transformation. Then prove that your new definition is equivalent to Definition LT. (Proof Technique D and Proof Technique E might be helpful if you are not sure what you are being asked to prove here.)

Theorem SER established three properties of matrix similarity that are collectively known as the defining properties of an "equivalence relation". Exercises T30 and T31 extend this idea to linear transformations.

T30 Suppose that T : U → V is a linear transformation. Say that two vectors from U, x and y, are related exactly when T(x) = T(y) in V. Prove the three properties of an equivalence relation on U: (a) for any x ∈ U, x is related to x, (b) if x is related to y, then y is related to x, and (c) if x is related to y and y is related to z, then x is related to z.

T31† Equivalence relations always create a partition of the set they are defined on, via a construction called equivalence classes. For the relation in the previous problem, the equivalence classes are the pre-images. Prove directly that the collection of pre-images partitions U by showing that (a) every x ∈ U is contained in some pre-image, and that (b) any two different pre-images do not have any elements in common.


Section ILT
Injective Linear Transformations

Some linear transformations possess one, or both, of two key properties, which go by the names injective and surjective. We will see that they are closely related to ideas like linear independence and spanning, and subspaces like the null space and the column space. In this section we will define an injective linear transformation and analyze the resulting consequences. The next section will do the same for the surjective property. In the final section of this chapter we will see what happens when we have the two properties simultaneously.

Subsection ILT
Injective Linear Transformations

As usual, we lead with a definition.

Definition ILT Injective Linear Transformation
Suppose T : U → V is a linear transformation. Then T is injective if whenever T(x) = T(y), then x = y. □

Given an arbitrary function, it is possible for two different inputs to yield the same output (think about the function f(x) = x² and the inputs x = 3 and x = −3). For an injective function, this never happens. If we have equal outputs (T(x) = T(y)) then we must have achieved those equal outputs by employing equal inputs (x = y). Some authors prefer the term one-to-one where we use injective, and we will sometimes refer to an injective linear transformation as an injection.

Subsection EILT
Examples of Injective Linear Transformations

It is perhaps most instructive to examine a linear transformation that is not injective first.

Example NIAQ Not injective, Archetype Q
Archetype Q is the linear transformation
\[
T : \mathbb{C}^5 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)
= \begin{bmatrix}
-2x_1+3x_2+3x_3-6x_4+3x_5\\
-16x_1+9x_2+12x_3-28x_4+28x_5\\
-19x_1+7x_2+14x_3-32x_4+37x_5\\
-21x_1+9x_2+15x_3-35x_4+39x_5\\
-9x_1+5x_2+7x_3-16x_4+16x_5
\end{bmatrix}
\]


Notice that for
\[
x = \begin{bmatrix}1\\3\\-1\\2\\4\end{bmatrix}
\qquad
y = \begin{bmatrix}4\\7\\0\\5\\7\end{bmatrix}
\]
we have
\[
T\left(\begin{bmatrix}1\\3\\-1\\2\\4\end{bmatrix}\right)
= \begin{bmatrix}4\\55\\72\\77\\31\end{bmatrix}
\qquad
T\left(\begin{bmatrix}4\\7\\0\\5\\7\end{bmatrix}\right)
= \begin{bmatrix}4\\55\\72\\77\\31\end{bmatrix}
\]
So we have two vectors from the domain, x ≠ y, yet T(x) = T(y), in violation of Definition ILT. This is another example where you should not concern yourself with how x and y were selected, as this will be explained shortly. However, do understand why these two vectors provide enough evidence to conclude that T is not injective. △
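For readers following along with software, this collision is easy to confirm. The sketch below is my own illustration, not the book's; it encodes Archetype Q by its coefficient matrix.

    import numpy as np

    # Archetype Q as a matrix-vector product (coefficients read off the definition)
    Q = np.array([[ -2,  3,  3,  -6,  3],
                  [-16,  9, 12, -28, 28],
                  [-19,  7, 14, -32, 37],
                  [-21,  9, 15, -35, 39],
                  [ -9,  5,  7, -16, 16]])
    x = np.array([1, 3, -1, 2, 4])
    y = np.array([4, 7, 0, 5, 7])
    print(Q @ x)                          # [ 4 55 72 77 31]
    print(np.array_equal(Q @ x, Q @ y))   # True: distinct inputs, equal outputs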

Here is a cartoon of a non-injective linear transformation. Notice that the central feature of this cartoon is that T(u) = v = T(w). Even though this happens again with some unnamed vectors, it only takes one occurrence to destroy the possibility of injectivity. Note also that the two vectors displayed in the bottom of V have no bearing, either way, on the injectivity of T.

[Diagram: two vectors u and w in U, both sent by T to the same vector v in V.]

Diagram NILT: Non-Injective Linear Transformation

To show that a linear transformation is not injective, it is enough to find a single pair of inputs that get sent to the identical output, as in Example NIAQ. However, to show that a linear transformation is injective we must establish that this coincidence of outputs never occurs. Here is an example that shows how to establish this.

Example IAR Injective, Archetype R
Archetype R is the linear transformation
\[
T : \mathbb{C}^5 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)
= \begin{bmatrix}
-65x_1+128x_2+10x_3-262x_4+40x_5\\
36x_1-73x_2-x_3+151x_4-16x_5\\
-44x_1+88x_2+5x_3-180x_4+24x_5\\
34x_1-68x_2-3x_3+140x_4-18x_5\\
12x_1-24x_2-x_3+49x_4-5x_5
\end{bmatrix}
\]
To establish that R is injective we must begin with the assumption that T(x) = T(y) and somehow arrive at the conclusion that x = y. Here we go,
\begin{align*}
\begin{bmatrix}0\\0\\0\\0\\0\end{bmatrix}
&= T(x) - T(y)
= T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)
- T\left(\begin{bmatrix}y_1\\y_2\\y_3\\y_4\\y_5\end{bmatrix}\right)\\
&= \begin{bmatrix}
-65x_1+128x_2+10x_3-262x_4+40x_5\\
36x_1-73x_2-x_3+151x_4-16x_5\\
-44x_1+88x_2+5x_3-180x_4+24x_5\\
34x_1-68x_2-3x_3+140x_4-18x_5\\
12x_1-24x_2-x_3+49x_4-5x_5
\end{bmatrix}
-
\begin{bmatrix}
-65y_1+128y_2+10y_3-262y_4+40y_5\\
36y_1-73y_2-y_3+151y_4-16y_5\\
-44y_1+88y_2+5y_3-180y_4+24y_5\\
34y_1-68y_2-3y_3+140y_4-18y_5\\
12y_1-24y_2-y_3+49y_4-5y_5
\end{bmatrix}\\
&= \begin{bmatrix}
-65(x_1-y_1)+128(x_2-y_2)+10(x_3-y_3)-262(x_4-y_4)+40(x_5-y_5)\\
36(x_1-y_1)-73(x_2-y_2)-(x_3-y_3)+151(x_4-y_4)-16(x_5-y_5)\\
-44(x_1-y_1)+88(x_2-y_2)+5(x_3-y_3)-180(x_4-y_4)+24(x_5-y_5)\\
34(x_1-y_1)-68(x_2-y_2)-3(x_3-y_3)+140(x_4-y_4)-18(x_5-y_5)\\
12(x_1-y_1)-24(x_2-y_2)-(x_3-y_3)+49(x_4-y_4)-5(x_5-y_5)
\end{bmatrix}\\
&= \begin{bmatrix}
-65 & 128 & 10 & -262 & 40\\
36 & -73 & -1 & 151 & -16\\
-44 & 88 & 5 & -180 & 24\\
34 & -68 & -3 & 140 & -18\\
12 & -24 & -1 & 49 & -5
\end{bmatrix}
\begin{bmatrix}x_1-y_1\\x_2-y_2\\x_3-y_3\\x_4-y_4\\x_5-y_5\end{bmatrix}
\end{align*}
Now we recognize that we have a homogeneous system of 5 equations in 5 variables (the terms x_i − y_i are the variables), so we row-reduce the coefficient matrix to
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
So the only solution is the trivial solution
\[
x_1-y_1=0 \quad x_2-y_2=0 \quad x_3-y_3=0 \quad x_4-y_4=0 \quad x_5-y_5=0
\]
and we conclude that indeed x = y. By Definition ILT, T is injective. △
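The row-reduction at the heart of this example can be delegated to a computer algebra system. A minimal sympy sketch of my own (an illustration, assuming the coefficient matrix above):

    import sympy as sp

    # Coefficient matrix of Archetype R, taken from the definition above
    R = sp.Matrix([[-65, 128, 10, -262,  40],
                   [ 36, -73, -1,  151, -16],
                   [-44,  88,  5, -180,  24],
                   [ 34, -68, -3,  140, -18],
                   [ 12, -24, -1,   49,  -5]])
    rref, pivots = R.rref()
    print(rref)    # the 5 x 5 identity matrix
    print(pivots)  # (0, 1, 2, 3, 4): every column is a pivot column, so x = y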

Here is the cartoon for an injective linear transformation. It is meant to suggest that we never have two inputs associated with a single output. Again, the two lonely vectors at the bottom of V have no bearing either way on the injectivity of T.

[Diagram: each vector of U is sent by T to a distinct vector of V.]

Diagram ILT: Injective Linear Transformation


Let us now examine an injective linear transformation between abstract vector spaces.

Example IAV Injective, Archetype V
Archetype V is defined by
\[
T : P_3 \to M_{22},\quad
T\left(a+bx+cx^2+dx^3\right)
= \begin{bmatrix} a+b & a-2c\\ d & b-d \end{bmatrix}
\]
To establish that the linear transformation is injective, begin by supposing that two polynomial inputs yield the same output matrix,
\[
T\left(a_1+b_1x+c_1x^2+d_1x^3\right) = T\left(a_2+b_2x+c_2x^2+d_2x^3\right)
\]
Then
\begin{align*}
\mathcal{O} = \begin{bmatrix}0&0\\0&0\end{bmatrix}
&= T\left(a_1+b_1x+c_1x^2+d_1x^3\right) - T\left(a_2+b_2x+c_2x^2+d_2x^3\right) && \text{Hypothesis}\\
&= T\left((a_1+b_1x+c_1x^2+d_1x^3) - (a_2+b_2x+c_2x^2+d_2x^3)\right) && \text{Definition LT}\\
&= T\left((a_1-a_2)+(b_1-b_2)x+(c_1-c_2)x^2+(d_1-d_2)x^3\right) && \text{Operations in } P_3\\
&= \begin{bmatrix}
(a_1-a_2)+(b_1-b_2) & (a_1-a_2)-2(c_1-c_2)\\
(d_1-d_2) & (b_1-b_2)-(d_1-d_2)
\end{bmatrix} && \text{Definition of } T
\end{align*}
This single matrix equality translates to the homogeneous system of equations in the variables a₁ − a₂, b₁ − b₂, c₁ − c₂, d₁ − d₂,
\begin{align*}
(a_1-a_2)+(b_1-b_2) &= 0\\
(a_1-a_2)-2(c_1-c_2) &= 0\\
(d_1-d_2) &= 0\\
(b_1-b_2)-(d_1-d_2) &= 0
\end{align*}
This system of equations can be rewritten as the matrix equation
\[
\begin{bmatrix}
1 & 1 & 0 & 0\\
1 & 0 & -2 & 0\\
0 & 0 & 0 & 1\\
0 & 1 & 0 & -1
\end{bmatrix}
\begin{bmatrix}(a_1-a_2)\\(b_1-b_2)\\(c_1-c_2)\\(d_1-d_2)\end{bmatrix}
= \begin{bmatrix}0\\0\\0\\0\end{bmatrix}
\]
Since the coefficient matrix is nonsingular (check this) the only solution is trivial, i.e.
\[
a_1-a_2=0 \quad b_1-b_2=0 \quad c_1-c_2=0 \quad d_1-d_2=0
\]
so that
\[
a_1=a_2 \quad b_1=b_2 \quad c_1=c_2 \quad d_1=d_2
\]
so the two inputs must be equal polynomials. By Definition ILT, T is injective. △


Subsection KLT
Kernel of a Linear Transformation

For a linear transformation T : U → V, the kernel is a subset of the domain U. Informally, it is the set of all inputs that the transformation sends to the zero vector of the codomain. It will have some natural connections with the null space of a matrix, so we will keep the same notation, and if you think about your objects, then there should be little confusion. Here is the careful definition.

Definition KLT Kernel of a Linear Transformation
Suppose T : U → V is a linear transformation. Then the kernel of T is the set
\[
\mathcal{K}(T) = \{\, u \in U \mid T(u) = 0 \,\}
\]
□

Notice that the kernel of T is just the preimage of 0, T⁻¹(0) (Definition PI).

Here is an example.

Example NKAO Nontrivial kernel, Archetype O
Archetype O is the linear transformation
\[
T : \mathbb{C}^3 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)
= \begin{bmatrix}
-x_1+x_2-3x_3\\
-x_1+2x_2-4x_3\\
x_1+x_2+x_3\\
2x_1+3x_2+x_3\\
x_1+2x_3
\end{bmatrix}
\]
To determine the elements of C³ in K(T), find those vectors u such that T(u) = 0, that is,
\[
T(u) = 0
\qquad
\begin{bmatrix}
-u_1+u_2-3u_3\\
-u_1+2u_2-4u_3\\
u_1+u_2+u_3\\
2u_1+3u_2+u_3\\
u_1+2u_3
\end{bmatrix}
= \begin{bmatrix}0\\0\\0\\0\\0\end{bmatrix}
\]
Vector equality (Definition CVE) leads us to a homogeneous system of 5 equations in the variables u_i,
\begin{align*}
-u_1+u_2-3u_3 &= 0\\
-u_1+2u_2-4u_3 &= 0\\
u_1+u_2+u_3 &= 0\\
2u_1+3u_2+u_3 &= 0\\
u_1+2u_3 &= 0
\end{align*}
Row-reducing the coefficient matrix gives
\[
\begin{bmatrix}
1 & 0 & 2\\
0 & 1 & -1\\
0 & 0 & 0\\
0 & 0 & 0\\
0 & 0 & 0
\end{bmatrix}
\]
The kernel of T is the set of solutions to this homogeneous system of equations, which by Theorem BNS can be expressed as
\[
\mathcal{K}(T) = \left\langle\left\{\begin{bmatrix}-2\\1\\1\end{bmatrix}\right\}\right\rangle
\]
△

We know that the span of a set of vectors is always a subspace (Theorem SSS), so the kernel computed in Example NKAO is also a subspace. This is no accident; the kernel of a linear transformation is always a subspace.
the kernel of a l<strong>in</strong>ear transformation is always a subspace.<br />

Theorem KLTS Kernel of a Linear Transformation is a Subspace
Suppose that T : U → V is a linear transformation. Then the kernel of T, K(T), is a subspace of U.

Proof. We can apply the three-part test of Theorem TSS. First T(0_U) = 0_V by Theorem LTTZZ, so 0_U ∈ K(T) and we know that the kernel is nonempty.

Suppose we assume that x, y ∈ K(T). Is x + y ∈ K(T)?
\begin{align*}
T(x + y) &= T(x) + T(y) && \text{Definition LT}\\
&= 0 + 0 && x, y \in \mathcal{K}(T)\\
&= 0 && \text{Property Z}
\end{align*}
This qualifies x + y for membership in K(T). So we have additive closure.

Suppose we assume that α ∈ ℂ and x ∈ K(T). Is αx ∈ K(T)?
\begin{align*}
T(\alpha x) &= \alpha T(x) && \text{Definition LT}\\
&= \alpha 0 && x \in \mathcal{K}(T)\\
&= 0 && \text{Theorem ZVSM}
\end{align*}
This qualifies αx for membership in K(T). So we have scalar closure and Theorem TSS tells us that K(T) is a subspace of U. ■

Let us compute another kernel, now that we know in advance that it will be a subspace.

Example TKAP Trivial kernel, Archetype P


Archetype P is the linear transformation
\[
T : \mathbb{C}^3 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)
= \begin{bmatrix}
-x_1+x_2+x_3\\
-x_1+2x_2+2x_3\\
x_1+x_2+3x_3\\
2x_1+3x_2+x_3\\
-2x_1+x_2+3x_3
\end{bmatrix}
\]
To determine the elements of C³ in K(T), find those vectors u such that T(u) = 0, that is,
\[
T(u) = 0
\qquad
\begin{bmatrix}
-u_1+u_2+u_3\\
-u_1+2u_2+2u_3\\
u_1+u_2+3u_3\\
2u_1+3u_2+u_3\\
-2u_1+u_2+3u_3
\end{bmatrix}
= \begin{bmatrix}0\\0\\0\\0\\0\end{bmatrix}
\]
Vector equality (Definition CVE) leads us to a homogeneous system of 5 equations in the variables u_i,
\begin{align*}
-u_1+u_2+u_3 &= 0\\
-u_1+2u_2+2u_3 &= 0\\
u_1+u_2+3u_3 &= 0\\
2u_1+3u_2+u_3 &= 0\\
-2u_1+u_2+3u_3 &= 0
\end{align*}
Row-reducing the coefficient matrix gives
\[
\begin{bmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 0\\
0 & 0 & 0
\end{bmatrix}
\]
The kernel of T is the set of solutions to this homogeneous system of equations, which is simply the trivial solution u = 0, so
\[
\mathcal{K}(T) = \{0\} = \langle\{\ \}\rangle
\]
△

Our next theorem says that if a preimage is a nonempty set then we can construct it by picking any one element and adding on elements of the kernel.

Theorem KPI Kernel and Pre-Image
Suppose T : U → V is a linear transformation and v ∈ V. If the preimage T⁻¹(v) is nonempty, and u ∈ T⁻¹(v), then
\[
T^{-1}(v) = \{\, u + z \mid z \in \mathcal{K}(T) \,\} = u + \mathcal{K}(T)
\]

Proof. Let M = {u + z | z ∈ K(T)}. First, we show that M ⊆ T⁻¹(v). Suppose that w ∈ M, so w has the form w = u + z, where z ∈ K(T). Then
\begin{align*}
T(w) &= T(u + z)\\
&= T(u) + T(z) && \text{Definition LT}\\
&= v + 0 && u \in T^{-1}(v),\ z \in \mathcal{K}(T)\\
&= v && \text{Property Z}
\end{align*}
which qualifies w for membership in the preimage of v, w ∈ T⁻¹(v).

For the opposite inclusion, suppose x ∈ T⁻¹(v). Then,
\begin{align*}
T(x - u) &= T(x) - T(u) && \text{Definition LT}\\
&= v - v && x, u \in T^{-1}(v)\\
&= 0
\end{align*}
This qualifies x − u for membership in the kernel of T, K(T). So there is a vector z ∈ K(T) such that x − u = z. Rearranging this equation gives x = u + z and so x ∈ M. So T⁻¹(v) ⊆ M and we see that M = T⁻¹(v), as desired. ■
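To see Theorem KPI in action computationally, here is a small sketch of my own (not from the text) using Archetype O from Example NKAO: every solution of T(x) = v is one particular preimage element shifted by kernel elements.

    import sympy as sp

    # Archetype O as a matrix, with kernel spanned by (-2, 1, 1) (Example NKAO)
    O = sp.Matrix([[-1, 1, -3],
                   [-1, 2, -4],
                   [ 1, 1,  1],
                   [ 2, 3,  1],
                   [ 1, 0,  2]])
    u = sp.Matrix([1, 2, 1])           # an arbitrary domain vector
    v = O * u                          # so u lies in the preimage of v
    x1, x2, x3 = sp.symbols('x1 x2 x3')
    print(sp.linsolve((O, v), x1, x2, x3))  # a line of solutions: u + t(-2, 1, 1)
    print(O.nullspace())                    # kernel basis: [(-2, 1, 1)]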

This theorem, and its proof, should remind you very much of Theorem PSPHS. Additionally, you might go back and review Example SPIAS. Can you tell now which is the only preimage to be a subspace?

Here is the cartoon which describes the "many-to-one" behavior of a typical linear transformation. Presume that T(u_i) = v_i, for i = 1, 2, 3, and as guaranteed by Theorem LTTZZ, T(0_U) = 0_V. Then four pre-images are depicted, each labeled slightly differently. T⁻¹(v₂) is the most general, employing Theorem KPI to provide two equal descriptions of the set. The most unusual is T⁻¹(0_V), which is equal to the kernel, K(T), and hence is a subspace (by Theorem KLTS). The subdivisions of the domain, U, are meant to suggest the partitioning of the domain by the collection of pre-images. It also suggests that each pre-image is of similar size or structure, since each is a "shifted" copy of the kernel. Notice that we cannot speak of the dimension of a pre-image, since it is almost never a subspace. Also notice that x, y ∈ V are elements of the codomain with empty pre-images.
elements of the codoma<strong>in</strong> with empty pre-images.


[Diagram: the domain U partitioned into pre-images u₁ + K(T), T⁻¹(v₂) = u₂ + K(T), T⁻¹(0_V) = K(T), and T⁻¹(v₃), mapped by T to v₁, v₂, 0_V, v₃ in V; the codomain vectors x and y receive no arrows.]

Diagram KPI: Kernel and Pre-Image

The next theorem is one we will cite frequently, as it characterizes injections by the size of the kernel.

Theorem KILT Kernel of an Injective Linear Transformation
Suppose that T : U → V is a linear transformation. Then T is injective if and only if the kernel of T is trivial, K(T) = {0}.

Proof. (⇒) We assume T is injective and we need to establish that two sets are equal (Definition SE). Since the kernel is a subspace (Theorem KLTS), {0} ⊆ K(T). To establish the opposite inclusion, suppose x ∈ K(T).
\begin{align*}
T(x) &= 0 && \text{Definition KLT}\\
&= T(0) && \text{Theorem LTTZZ}
\end{align*}
We can apply Definition ILT to conclude that x = 0. Therefore K(T) ⊆ {0} and by Definition SE, K(T) = {0}.

(⇐) To establish that T is injective, appeal to Definition ILT and begin with the assumption that T(x) = T(y). Then
\begin{align*}
T(x - y) &= T(x) - T(y) && \text{Definition LT}\\
&= 0 && \text{Hypothesis}
\end{align*}
So x − y ∈ K(T) by Definition KLT and with the hypothesis that the kernel is trivial we conclude that x − y = 0. Then
\[
y = y + 0 = y + (x - y) = x
\]
thus establishing that T is injective by Definition ILT. ■
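Theorem KILT also yields a practical test: for a transformation defined by a matrix, injectivity is equivalent to a trivial null space (compare Exercise ILT.T20 below). A hedged sympy sketch of my own, illustrative only:

    import sympy as sp

    def is_injective(A):
        # By Theorem KILT, T(x) = Ax is injective exactly when
        # the null space of A is trivial
        return len(sp.Matrix(A).nullspace()) == 0

    print(is_injective([[-1, 1, -3], [-1, 2, -4], [1, 1, 1],
                        [2, 3, 1], [1, 0, 2]]))    # False: Archetype O
    print(is_injective([[-1, 1, 1], [-1, 2, 2], [1, 1, 3],
                        [2, 3, 1], [-2, 1, 3]]))   # True: Archetype P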

You might begin to think about how Diagram KPI would change if the linear transformation is injective, which would make the kernel trivial by Theorem KILT.

Example NIAQR Not injective, Archetype Q, revisited
We are now in a position to revisit our first example in this section, Example NIAQ. In that example, we showed that Archetype Q is not injective by constructing two vectors, which when used to evaluate the linear transformation provided the same output, thus violating Definition ILT. Just where did those two vectors come from?

The key is the vector
\[
z = \begin{bmatrix}3\\4\\1\\3\\3\end{bmatrix}
\]
which you can check is an element of K(T) for Archetype Q. Choose a vector x at random, and then compute y = x + z (verify this computation back in Example NIAQ). Then
\begin{align*}
T(y) &= T(x + z)\\
&= T(x) + T(z) && \text{Definition LT}\\
&= T(x) + 0 && z \in \mathcal{K}(T)\\
&= T(x) && \text{Property Z}
\end{align*}
Whenever the kernel of a linear transformation is nontrivial, we can employ this device and conclude that the linear transformation is not injective. This is another way of viewing Theorem KILT. For an injective linear transformation, the kernel is trivial and our only choice for z is the zero vector, which will not help us create two different inputs for T that yield identical outputs. For every one of the archetypes that is not injective, there is an example presented of exactly this form. △
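This recipe is easy to automate. The following sketch is an illustration of mine, not the book's: it manufactures a collision for any matrix-defined transformation with a nontrivial kernel, exactly as in the example above.

    import sympy as sp

    def collision(A):
        # Returns x and y = x + z with Ax = Ay, where z is a kernel vector
        A = sp.Matrix(A)
        kernel = A.nullspace()
        if not kernel:
            return None                 # trivial kernel: T is injective
        x = sp.Matrix([1] * A.cols)     # any vector will do
        y = x + kernel[0]
        assert A * x == A * y           # distinct inputs, identical outputs
        return x.T, y.T

    Q = [[-2, 3, 3, -6, 3], [-16, 9, 12, -28, 28], [-19, 7, 14, -32, 37],
         [-21, 9, 15, -35, 39], [-9, 5, 7, -16, 16]]
    print(collision(Q))   # a collision for Archetype Q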

Example NIAO Not injective, Archetype O
In Example NKAO the kernel of Archetype O was determined to be
\[
\left\langle\left\{\begin{bmatrix}-2\\1\\1\end{bmatrix}\right\}\right\rangle
\]
a subspace of C³ with dimension 1. Since the kernel is not trivial, Theorem KILT tells us that T is not injective. △

Example IAP Injective, Archetype P
In Example TKAP it was shown that the linear transformation in Archetype P has a trivial kernel. So by Theorem KILT, T is injective. △

Subsection ILTLI
Injective Linear Transformations and Linear Independence

There is a connection between injective linear transformations and linearly independent sets that we will make precise in the next two theorems. However, more informally, we can get a feel for this connection when we think about how each property is defined. A set of vectors is linearly independent if the only relation of linear dependence is the trivial one. A linear transformation is injective if the only way two input vectors can produce the same output is in the trivial way, when both input vectors are equal.

Theorem ILTLI Injective Linear Transformations and Linear Independence
Suppose that T : U → V is an injective linear transformation and
\[
S = \{u_1,\, u_2,\, u_3,\, \dots,\, u_t\}
\]
is a linearly independent subset of U. Then
\[
R = \{T(u_1),\, T(u_2),\, T(u_3),\, \dots,\, T(u_t)\}
\]
is a linearly independent subset of V.

Proof. Begin with a relation of linear dependence on R (Definition RLD, Definition LI),
\begin{align*}
a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + \dots + a_tT(u_t) &= 0\\
T(a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_tu_t) &= 0 && \text{Theorem LTLC}\\
a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_tu_t &\in \mathcal{K}(T) && \text{Definition KLT}\\
a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_tu_t &\in \{0\} && \text{Theorem KILT}\\
a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_tu_t &= 0 && \text{Definition SET}
\end{align*}
Since this is a relation of linear dependence on the linearly independent set S, we can conclude that
\[
a_1 = 0 \quad a_2 = 0 \quad a_3 = 0 \quad \dots \quad a_t = 0
\]
and this establishes that R is a linearly independent set. ■

Theorem ILTB Injective Linear Transformations and Bases
Suppose that T : U → V is a linear transformation and
\[
B = \{u_1,\, u_2,\, u_3,\, \dots,\, u_m\}
\]
is a basis of U. Then T is injective if and only if
\[
C = \{T(u_1),\, T(u_2),\, T(u_3),\, \dots,\, T(u_m)\}
\]
is a linearly independent subset of V.

Proof. (⇒) Assume T is injective. Since B is a basis, we know B is linearly independent (Definition B). Then Theorem ILTLI says that C is a linearly independent subset of V.

(⇐) Assume that C is linearly independent. To establish that T is injective, we will show that the kernel of T is trivial (Theorem KILT). Suppose that u ∈ K(T). As an element of U, we can write u as a linear combination of the basis vectors in B (uniquely). So there are scalars, a₁, a₂, a₃, ..., a_m, such that
\[
u = a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_mu_m
\]
Then,
\begin{align*}
0 &= T(u) && \text{Definition KLT}\\
&= T(a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_mu_m) && \text{Definition SSVS}\\
&= a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + \dots + a_mT(u_m) && \text{Theorem LTLC}
\end{align*}
This is a relation of linear dependence (Definition RLD) on the linearly independent set C, so the scalars are all zero: a₁ = a₂ = a₃ = ⋯ = a_m = 0. Then
\begin{align*}
u &= a_1u_1 + a_2u_2 + a_3u_3 + \dots + a_mu_m\\
&= 0u_1 + 0u_2 + 0u_3 + \dots + 0u_m && \text{Theorem ZSSM}\\
&= 0 + 0 + 0 + \dots + 0 && \text{Theorem ZSSM}\\
&= 0 && \text{Property Z}
\end{align*}
Since u was chosen as an arbitrary vector from K(T), we have K(T) = {0} and Theorem KILT tells us that T is injective. ■

Subsection ILTD
Injective Linear Transformations and Dimension

Theorem ILTD Injective Linear Transformations and Dimension
Suppose that T : U → V is an injective linear transformation. Then dim(U) ≤ dim(V).

Proof. Suppose to the contrary that m = dim(U) > dim(V) = t. Let B be a basis of U, which will then contain m vectors. Apply T to each element of B to form a set C that is a subset of V. By Theorem ILTB, C is linearly independent and therefore must contain m distinct vectors. So we have found a set of m linearly independent vectors in V, a vector space of dimension t, with m > t. However, this contradicts Theorem G, so our assumption is false and dim(U) ≤ dim(V). ■


Example NIDAU Not injective by dimension, Archetype U
The linear transformation in Archetype U is
\[
T : M_{23} \to \mathbb{C}^4,\quad
T\left(\begin{bmatrix}a & b & c\\ d & e & f\end{bmatrix}\right)
= \begin{bmatrix}
a+2b+12c-3d+e+6f\\
2a-b-c+d-11f\\
a+b+7c+2d+e-3f\\
a+2b+12c+5e-5f
\end{bmatrix}
\]
Since dim(M₂₃) = 6 > 4 = dim(C⁴), T cannot be injective, for then T would violate Theorem ILTD. △

Notice that the previous example made no use of the actual formula defining the function. Merely a comparison of the dimensions of the domain and codomain is enough to conclude that the linear transformation is not injective. Archetype M and Archetype N are two more examples of linear transformations that have "big" domains and "small" codomains, resulting in "collisions" of outputs, and thus are non-injective linear transformations.
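For a concrete check, one can identify M₂₃ with C⁶ by stacking the entries in the order a, b, c, d, e, f (that ordering is my own convention, not the book's) and inspect the resulting 4 × 6 matrix; with more columns than rows, its null space must be nontrivial. A sympy sketch under that identification:

    import sympy as sp

    # Archetype U with M_{23} flattened to (a, b, c, d, e, f)
    U = sp.Matrix([[1,  2, 12, -3, 1,   6],
                   [2, -1, -1,  1, 0, -11],
                   [1,  1,  7,  2, 1,  -3],
                   [1,  2, 12,  0, 5,  -5]])
    print(U.nullspace())  # a nontrivial kernel (two basis vectors), so not injective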

Subsection CILT
Composition of Injective Linear Transformations

In Subsection LT.NLTFO we saw how to combine linear transformations to build new linear transformations, specifically, how to build the composition of two linear transformations (Definition LTC). It will be useful later to know that the composition of injective linear transformations is again injective, so we prove that here.

Theorem CILTI Composition of Injective Linear Transformations is Injective
Suppose that T : U → V and S : V → W are injective linear transformations. Then (S ∘ T) : U → W is an injective linear transformation.

Proof. That the composition is a linear transformation was established in Theorem CLTLT, so we need only establish that the composition is injective. Applying Definition ILT, choose x, y from U. Then if (S ∘ T)(x) = (S ∘ T)(y),
\begin{align*}
&\Rightarrow S(T(x)) = S(T(y)) && \text{Definition LTC}\\
&\Rightarrow T(x) = T(y) && \text{Definition ILT for } S\\
&\Rightarrow x = y && \text{Definition ILT for } T
\end{align*}
■

Reading Questions

1. Suppose T : C⁸ → C⁵ is a linear transformation. Why is T not injective?
2. Describe the kernel of an injective linear transformation.
3. Theorem KPI should remind you of Theorem PSPHS. Why do we say this?


Exercises

C10 Each archetype below is a linear transformation. Compute the kernel for each.
Archetype M, Archetype N, Archetype O, Archetype P, Archetype Q, Archetype R, Archetype S, Archetype T, Archetype U, Archetype V, Archetype W, Archetype X

C20† The linear transformation T : C⁴ → C³ is not injective. Find two inputs x, y ∈ C⁴ that yield the same output (that is T(x) = T(y)).
\[
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}\right)
= \begin{bmatrix}
2x_1+x_2+x_3\\
-x_1+3x_2+x_3-x_4\\
3x_1+x_2+2x_3-2x_4
\end{bmatrix}
\]

C25† Define the linear transformation
\[
T : \mathbb{C}^3 \to \mathbb{C}^2,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)
= \begin{bmatrix} 2x_1-x_2+5x_3\\ -4x_1+2x_2-10x_3 \end{bmatrix}
\]
Find a basis for the kernel of T, K(T). Is T injective?

C26† Let
\[
A = \begin{bmatrix}
1 & 2 & 3 & 1 & 0\\
2 & -1 & 1 & 0 & 1\\
1 & 2 & -1 & -2 & 1\\
1 & 3 & 2 & 1 & 2
\end{bmatrix}
\]
and let T : C⁵ → C⁴ be given by T(x) = Ax. Is T injective? (Hint: No calculation is required.)

C27† Let T : C³ → C³ be given by
\[
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)
= \begin{bmatrix} 2x+y+z\\ x-y+2z\\ x+2y-z \end{bmatrix}
\]
Find K(T). Is T injective?

C28† Let
\[
A = \begin{bmatrix}
1 & 2 & 3 & 1\\
2 & -1 & 1 & 0\\
1 & 2 & -1 & -2\\
1 & 3 & 2 & 1
\end{bmatrix}
\]
and let T : C⁴ → C⁴ be given by T(x) = Ax. Find K(T). Is T injective?

C29† Let
\[
A = \begin{bmatrix}
1 & 2 & 1 & 1\\
2 & 1 & 1 & 0\\
1 & 2 & 1 & 2\\
1 & 2 & 1 & 1
\end{bmatrix}
\]
and let T : C⁴ → C⁴ be given by T(x) = Ax. Find K(T). Is T injective?

C30† Let T : M₂₂ → P₂ be given by
\[
T\left(\begin{bmatrix}a & b\\ c & d\end{bmatrix}\right) = (a+b) + (a+c)x + (a+d)x^2
\]
Is T injective? Find K(T).

C31† Given that the linear transformation
\[
T : \mathbb{C}^3 \to \mathbb{C}^3,\quad
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)
= \begin{bmatrix} 2x+y\\ 2y+z\\ x+2z \end{bmatrix}
\]
is injective, show directly that {T(e₁), T(e₂), T(e₃)} is a linearly independent set.


C32† Given that the linear transformation
\[
T : \mathbb{C}^2 \to \mathbb{C}^3,\quad
T\left(\begin{bmatrix}x\\y\end{bmatrix}\right)
= \begin{bmatrix} x+y\\ 2x+y\\ x+2y \end{bmatrix}
\]
is injective, show directly that {T(e₁), T(e₂)} is a linearly independent set.

C33† Given that the linear transformation
\[
T : \mathbb{C}^3 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)
= \begin{bmatrix}
1 & 3 & 2\\
0 & 1 & 1\\
1 & 2 & 1\\
1 & 0 & 1\\
3 & 1 & 2
\end{bmatrix}
\begin{bmatrix}x\\y\\z\end{bmatrix}
\]
is injective, show directly that {T(e₁), T(e₂), T(e₃)} is a linearly independent set.

C40† Show that the linear transformation R is not injective by finding two different elements of the domain, x and y, such that R(x) = R(y). (S₂₂ is the vector space of symmetric 2 × 2 matrices.)
\[
R : S_{22} \to P_1,\quad
R\left(\begin{bmatrix}a & b\\ b & c\end{bmatrix}\right) = (2a-b+c) + (a+b+2c)x
\]

M60 Suppose U and V are vector spaces. Define the function Z : U → V by Z(u) = 0_V for every u ∈ U. Then by Exercise LT.M60, Z is a linear transformation. Formulate a condition on U that is equivalent to Z being an injective linear transformation. In other words, fill in the blank to complete the following statement (and then give a proof): Z is injective if and only if U is ______. (See Exercise SLT.M60, Exercise IVLT.M60.)

T10† Suppose T : U → V is a linear transformation. For which vectors v ∈ V is T⁻¹(v) a subspace of U?

T15† Suppose that T : U → V and S : V → W are linear transformations. Prove the following relationship between kernels.
\[
\mathcal{K}(T) \subseteq \mathcal{K}(S \circ T)
\]

T20† Suppose that A is an m × n matrix. Define the linear transformation T by
\[
T : \mathbb{C}^n \to \mathbb{C}^m,\quad T(x) = Ax
\]
Prove that the kernel of T equals the null space of A, K(T) = N(A).
Prove that the kernel of T equals the null space of A, K(T )=N (A).


Section SLT
Surjective Linear Transformations

The companion to an injection is a surjection. Surjective linear transformations are closely related to spanning sets and ranges. So as you read this section reflect back on Section ILT and note the parallels and the contrasts. In the next section, Section IVLT, we will combine the two properties.

Subsection SLT
Surjective Linear Transformations

As usual, we lead with a definition.

Definition SLT Surjective Linear Transformation
Suppose T : U → V is a linear transformation. Then T is surjective if for every v ∈ V there exists a u ∈ U so that T(u) = v. □

Given an arbitrary function, it is possible for there to be an element of the codomain that is not an output of the function (think about the function y = f(x) = x² and the codomain element y = −3). For a surjective function, this never happens. If we choose any element of the codomain (v ∈ V) then there must be an input from the domain (u ∈ U) which will create the output when used to evaluate the linear transformation (T(u) = v). Some authors prefer the term onto where we use surjective, and we will sometimes refer to a surjective linear transformation as a surjection.

Subsection ESLT
Examples of Surjective Linear Transformations

It is perhaps most instructive to examine a linear transformation that is not surjective first.

Example NSAQ Not surjective, Archetype Q
Archetype Q is the linear transformation
\[
T : \mathbb{C}^5 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)
= \begin{bmatrix}
-2x_1+3x_2+3x_3-6x_4+3x_5\\
-16x_1+9x_2+12x_3-28x_4+28x_5\\
-19x_1+7x_2+14x_3-32x_4+37x_5\\
-21x_1+9x_2+15x_3-35x_4+39x_5\\
-9x_1+5x_2+7x_3-16x_4+16x_5
\end{bmatrix}
\]


We will demonstrate that
\[
v = \begin{bmatrix}-1\\2\\3\\-1\\4\end{bmatrix}
\]
is an unobtainable element of the codomain. Suppose to the contrary that u is an element of the domain such that T(u) = v. Then
\begin{align*}
\begin{bmatrix}-1\\2\\3\\-1\\4\end{bmatrix}
= v = T(u)
&= T\left(\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}\right)\\
&= \begin{bmatrix}
-2u_1+3u_2+3u_3-6u_4+3u_5\\
-16u_1+9u_2+12u_3-28u_4+28u_5\\
-19u_1+7u_2+14u_3-32u_4+37u_5\\
-21u_1+9u_2+15u_3-35u_4+39u_5\\
-9u_1+5u_2+7u_3-16u_4+16u_5
\end{bmatrix}\\
&= \begin{bmatrix}
-2 & 3 & 3 & -6 & 3\\
-16 & 9 & 12 & -28 & 28\\
-19 & 7 & 14 & -32 & 37\\
-21 & 9 & 15 & -35 & 39\\
-9 & 5 & 7 & -16 & 16
\end{bmatrix}
\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}
\end{align*}
Now we recognize the appropriate input vector u as a solution to a linear system of equations. Form the augmented matrix of the system, and row-reduce to
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & -1 & 0\\
0 & 1 & 0 & 0 & -\frac{4}{3} & 0\\
0 & 0 & 1 & 0 & -\frac{1}{3} & 0\\
0 & 0 & 0 & 1 & -1 & 0\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
With a leading 1 in the last column, Theorem RCLS tells us the system is inconsistent. From the absence of any solutions we conclude that no such vector u exists, and by Definition SLT, T is not surjective.

Again, do not concern yourself with how v was selected, as this will be explained shortly. However, do understand why this vector provides enough evidence to conclude that T is not surjective. △
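A quick sympy check of this inconsistency (an illustrative sketch of mine, not from the text):

    import sympy as sp

    Q = sp.Matrix([[-2, 3, 3, -6, 3], [-16, 9, 12, -28, 28],
                   [-19, 7, 14, -32, 37], [-21, 9, 15, -35, 39],
                   [-9, 5, 7, -16, 16]])
    v = sp.Matrix([-1, 2, 3, -1, 4])
    aug, pivots = Q.row_join(v).rref()
    print(pivots)   # (0, 1, 2, 3, 5): a pivot in the final column, so no solution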

Here is a cartoon of a non-surjective linear transformation. Notice that the central feature of this cartoon is that the vector v ∈ V does not have an arrow pointing to it, implying there is no u ∈ U such that T(u) = v. Even though this happens again with a second unnamed vector in V, it only takes one occurrence to destroy the possibility of surjectivity.

[Diagram: a vector v in V with no arrow from U pointing to it.]

Diagram NSLT: Non-Surjective Linear Transformation

To show that a linear transformation is not surjective, it is enough to find a single element of the codomain that is never created by any input, as in Example NSAQ. However, to show that a linear transformation is surjective we must establish that every element of the codomain occurs as an output of the linear transformation for some appropriate input.

Example SAR Surjective, Archetype R
Archetype R is the linear transformation
\[
T : \mathbb{C}^5 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)
= \begin{bmatrix}
-65x_1+128x_2+10x_3-262x_4+40x_5\\
36x_1-73x_2-x_3+151x_4-16x_5\\
-44x_1+88x_2+5x_3-180x_4+24x_5\\
34x_1-68x_2-3x_3+140x_4-18x_5\\
12x_1-24x_2-x_3+49x_4-5x_5
\end{bmatrix}
\]
To establish that R is surjective we must begin with a totally arbitrary element of the codomain, v, and somehow find an input vector u such that T(u) = v. We desire,
\begin{align*}
T(u) &= v\\
\begin{bmatrix}
-65u_1+128u_2+10u_3-262u_4+40u_5\\
36u_1-73u_2-u_3+151u_4-16u_5\\
-44u_1+88u_2+5u_3-180u_4+24u_5\\
34u_1-68u_2-3u_3+140u_4-18u_5\\
12u_1-24u_2-u_3+49u_4-5u_5
\end{bmatrix}
&= \begin{bmatrix}v_1\\v_2\\v_3\\v_4\\v_5\end{bmatrix}\\
\begin{bmatrix}
-65 & 128 & 10 & -262 & 40\\
36 & -73 & -1 & 151 & -16\\
-44 & 88 & 5 & -180 & 24\\
34 & -68 & -3 & 140 & -18\\
12 & -24 & -1 & 49 & -5
\end{bmatrix}
\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}
&= \begin{bmatrix}v_1\\v_2\\v_3\\v_4\\v_5\end{bmatrix}
\end{align*}
We recognize this equation as a system of equations in the variables u_i, but our vector of constants contains symbols. In general, we would have to row-reduce the augmented matrix by hand, due to the symbolic final column. However, in this particular example, the 5 × 5 coefficient matrix is nonsingular and so has an inverse (Theorem NI, Definition MI).
\[
\begin{bmatrix}
-65 & 128 & 10 & -262 & 40\\
36 & -73 & -1 & 151 & -16\\
-44 & 88 & 5 & -180 & 24\\
34 & -68 & -3 & 140 & -18\\
12 & -24 & -1 & 49 & -5
\end{bmatrix}^{-1}
=
\begin{bmatrix}
-47 & 92 & 1 & -181 & -14\\
27 & -55 & \frac{7}{2} & \frac{221}{2} & 11\\
-32 & 64 & -1 & -126 & -12\\
25 & -50 & \frac{3}{2} & \frac{199}{2} & 9\\
9 & -18 & \frac{1}{2} & \frac{71}{2} & 4
\end{bmatrix}
\]
so we find that
\begin{align*}
\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}
&=
\begin{bmatrix}
-47 & 92 & 1 & -181 & -14\\
27 & -55 & \frac{7}{2} & \frac{221}{2} & 11\\
-32 & 64 & -1 & -126 & -12\\
25 & -50 & \frac{3}{2} & \frac{199}{2} & 9\\
9 & -18 & \frac{1}{2} & \frac{71}{2} & 4
\end{bmatrix}
\begin{bmatrix}v_1\\v_2\\v_3\\v_4\\v_5\end{bmatrix}
=
\begin{bmatrix}
-47v_1+92v_2+v_3-181v_4-14v_5\\
27v_1-55v_2+\frac{7}{2}v_3+\frac{221}{2}v_4+11v_5\\
-32v_1+64v_2-v_3-126v_4-12v_5\\
25v_1-50v_2+\frac{3}{2}v_3+\frac{199}{2}v_4+9v_5\\
9v_1-18v_2+\frac{1}{2}v_3+\frac{71}{2}v_4+4v_5
\end{bmatrix}
\end{align*}
This establishes that if we are given any output vector v, we can use its components in this final expression to formulate a vector u such that T(u) = v. So by Definition SLT we now know that T is surjective. You might try to verify this condition in its full generality (i.e. evaluate T with this final expression and see if you get v as the result), or test it more specifically for some numerical vector v (see Exercise SLT.C20). △
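That fully general verification is painless to delegate to a computer algebra system. A hedged sympy sketch of my own (not the book's):

    import sympy as sp

    R = sp.Matrix([[-65, 128, 10, -262,  40],
                   [ 36, -73, -1,  151, -16],
                   [-44,  88,  5, -180,  24],
                   [ 34, -68, -3,  140, -18],
                   [ 12, -24, -1,   49,  -5]])
    v = sp.Matrix(sp.symbols('v1:6'))   # a totally arbitrary codomain vector
    u = R.inv() * v                     # the preimage formula from the example
    print(sp.simplify(R * u - v))       # the zero vector: T(u) = v for every v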

Here is the cartoon for a surjective linear transformation. It is meant to suggest that for every output in V there is at least one input in U that is sent to the output. (Even though we have depicted several inputs sent to each output.) The key feature of this cartoon is that there are no vectors in V without an arrow pointing to them.

[Diagram: every vector of V receives at least one arrow from U.]

Diagram SLT: Surjective Linear Transformation

Let us now examine a surjective linear transformation between abstract vector spaces.

Example SAV Surjective, Archetype V
Archetype V is defined by

T\colon P_3\to M_{22},\quad T\left(a+bx+cx^2+dx^3\right)=\begin{bmatrix}a+b&a-2c\\d&b-d\end{bmatrix}

To establish that the linear transformation is surjective, begin by choosing an arbitrary output. In this example, we need to choose an arbitrary 2 × 2 matrix, say

v=\begin{bmatrix}x&y\\z&w\end{bmatrix}

and we would like to find an input polynomial

u=a+bx+cx^2+dx^3

so that T(u) = v. So we have,

\begin{bmatrix}x&y\\z&w\end{bmatrix}=v=T(u)=T\left(a+bx+cx^2+dx^3\right)=\begin{bmatrix}a+b&a-2c\\d&b-d\end{bmatrix}

Matrix equality leads us to the system of four equations in the four unknowns, a, b, c, d,

a + b = x
a − 2c = y
d = z
b − d = w

which can be rewritten as a matrix equation,

\begin{bmatrix}1&1&0&0\\1&0&-2&0\\0&0&0&1\\0&1&0&-1\end{bmatrix}\begin{bmatrix}a\\b\\c\\d\end{bmatrix}=\begin{bmatrix}x\\y\\z\\w\end{bmatrix}

The coefficient matrix is nonsingular, hence it has an inverse,

\begin{bmatrix}1&1&0&0\\1&0&-2&0\\0&0&0&1\\0&1&0&-1\end{bmatrix}^{-1}=\begin{bmatrix}1&0&-1&-1\\0&0&1&1\\\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}\\0&0&1&0\end{bmatrix}

so we have

\begin{bmatrix}a\\b\\c\\d\end{bmatrix}=\begin{bmatrix}1&0&-1&-1\\0&0&1&1\\\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}\\0&0&1&0\end{bmatrix}\begin{bmatrix}x\\y\\z\\w\end{bmatrix}=\begin{bmatrix}x-z-w\\z+w\\\frac{1}{2}(x-y-z-w)\\z\end{bmatrix}

So the input polynomial u = (x − z − w) + (z + w)x + (1/2)(x − y − z − w)x² + zx³ will yield the output matrix v, no matter what form v takes. This means by Definition SLT that T is surjective. All the same, let us do a concrete demonstration and evaluate T with u,

T(u) = T\left((x-z-w)+(z+w)x+\tfrac{1}{2}(x-y-z-w)x^2+zx^3\right)
     = \begin{bmatrix}(x-z-w)+(z+w)&(x-z-w)-2\left(\tfrac{1}{2}(x-y-z-w)\right)\\z&(z+w)-z\end{bmatrix}
     = \begin{bmatrix}x&y\\z&w\end{bmatrix}
     = v

△
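A sketch of the same verification with SymPy, under the assumption (a convention of this sketch, not part of the example) that we encode the polynomial a + bx + cx² + dx³ by its coefficient vector (a, b, c, d) and the 2 × 2 matrix by reading its entries across the rows:

    from sympy import Matrix, symbols

    x, y, z, w = symbols('x y z w')
    A = Matrix([[1, 1,  0,  0],    # a + b
                [1, 0, -2,  0],    # a - 2c
                [0, 0,  0,  1],    # d
                [0, 1,  0, -1]])   # b - d
    abcd = A.inv() * Matrix([x, y, z, w])
    # entries: x - z - w, z + w, x/2 - y/2 - z/2 - w/2, z -- matching the text
    print(abcd)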

Subsection RLT
Range of a Linear Transformation

For a linear transformation T: U → V, the range is a subset of the codomain V. Informally, it is the set of all outputs that the transformation creates when fed every possible input from the domain. It will have some natural connections with the column space of a matrix, so we will keep the same notation, and if you think about your objects, then there should be little confusion. Here is the careful definition.

Definition RLT Range of a Linear Transformation
Suppose T: U → V is a linear transformation. Then the range of T is the set

R(T) = {T(u) | u ∈ U}    □

Example RAO Range, Archetype O
Archetype O is the linear transformation

T\colon \mathbb{C}^3\to\mathbb{C}^5,\quad T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)=\begin{bmatrix}-x_1+x_2-3x_3\\-x_1+2x_2-4x_3\\x_1+x_2+x_3\\2x_1+3x_2+x_3\\x_1+2x_3\end{bmatrix}

To determine the elements of C^5 in R(T), find those vectors v such that T(u) = v for some u ∈ C^3,

v = T(u)
  = \begin{bmatrix}-u_1+u_2-3u_3\\-u_1+2u_2-4u_3\\u_1+u_2+u_3\\2u_1+3u_2+u_3\\u_1+2u_3\end{bmatrix}
  = \begin{bmatrix}-u_1\\-u_1\\u_1\\2u_1\\u_1\end{bmatrix}+\begin{bmatrix}u_2\\2u_2\\u_2\\3u_2\\0\end{bmatrix}+\begin{bmatrix}-3u_3\\-4u_3\\u_3\\u_3\\2u_3\end{bmatrix}
  = u_1\begin{bmatrix}-1\\-1\\1\\2\\1\end{bmatrix}+u_2\begin{bmatrix}1\\2\\1\\3\\0\end{bmatrix}+u_3\begin{bmatrix}-3\\-4\\1\\1\\2\end{bmatrix}

This says that every output of T (in other words, the vector v) can be written as a linear combination of the three vectors

\begin{bmatrix}-1\\-1\\1\\2\\1\end{bmatrix}\qquad\begin{bmatrix}1\\2\\1\\3\\0\end{bmatrix}\qquad\begin{bmatrix}-3\\-4\\1\\1\\2\end{bmatrix}

using the scalars u_1, u_2, u_3. Furthermore, since u can be any element of C^3, every such linear combination is an output. This means that

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}-1\\-1\\1\\2\\1\end{bmatrix},\ \begin{bmatrix}1\\2\\1\\3\\0\end{bmatrix},\ \begin{bmatrix}-3\\-4\\1\\1\\2\end{bmatrix}\right\}\right\rangle

The three vectors in this spanning set for R(T) form a linearly dependent set (check this!). So we can find a more economical presentation by any of the various methods from Section CRS and Section FS. We will place the vectors into a matrix as rows, row-reduce, toss out zero rows and appeal to Theorem BRS, so we can describe the range of T with a basis,

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}1\\0\\-3\\-7\\-2\end{bmatrix},\ \begin{bmatrix}0\\1\\2\\5\\1\end{bmatrix}\right\}\right\rangle

△
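The row-reduction step is mechanical, so here is a sketch of it with SymPy (an assumption of this aside, not part of the example): place the three spanning vectors as the rows of a matrix and row-reduce, as Theorem BRS prescribes.

    from sympy import Matrix

    rows = Matrix([[-1, -1, 1, 2, 1],
                   [ 1,  2, 1, 3, 0],
                   [-3, -4, 1, 1, 2]])
    rref, pivots = rows.rref()
    print(rref)
    # The nonzero rows are (1, 0, -3, -7, -2) and (0, 1, 2, 5, 1),
    # the two basis vectors for R(T) displayed above.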

We know that the span of a set of vectors is always a subspace (Theorem SSS), so the range computed in Example RAO is also a subspace. This is no accident; the range of a linear transformation is always a subspace.

Theorem RLTS Range of a Linear Transformation is a Subspace
Suppose that T: U → V is a linear transformation. Then the range of T, R(T), is a subspace of V.

Proof. We can apply the three-part test of Theorem TSS. First, 0_U ∈ U and T(0_U) = 0_V by Theorem LTTZZ, so 0_V ∈ R(T) and we know that the range is nonempty.

Next, suppose that x, y ∈ R(T). Is x + y ∈ R(T)? If x, y ∈ R(T) then we know there are vectors w, z ∈ U such that T(w) = x and T(z) = y. Because U is a vector space, additive closure (Property AC) implies that w + z ∈ U. Then

T(w + z) = T(w) + T(z)    Definition LT
         = x + y          Definition of w and z

So we have found an input, w + z, which when fed into T creates x + y as an output. This qualifies x + y for membership in R(T). So we have additive closure.

Finally, suppose that α ∈ C and x ∈ R(T). Is αx ∈ R(T)? If x ∈ R(T), then there is a vector w ∈ U such that T(w) = x. Because U is a vector space, scalar closure implies that αw ∈ U. Then

T(αw) = αT(w)    Definition LT
      = αx       Definition of w

So we have found an input (αw) which when fed into T creates αx as an output. This qualifies αx for membership in R(T). So we have scalar closure and Theorem TSS tells us that R(T) is a subspace of V. ■

Let us compute another range, now that we know in advance that it will be a subspace.

Example FRAN Full range, Archetype N
Archetype N is the linear transformation

T\colon \mathbb{C}^5\to\mathbb{C}^3,\quad T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)=\begin{bmatrix}2x_1+x_2+3x_3-4x_4+5x_5\\x_1-2x_2+3x_3-9x_4+3x_5\\3x_1+4x_3-6x_4+5x_5\end{bmatrix}

To determine the elements of C^3 in R(T), find those vectors v such that T(u) = v for some u ∈ C^5,

v = T(u)
  = \begin{bmatrix}2u_1+u_2+3u_3-4u_4+5u_5\\u_1-2u_2+3u_3-9u_4+3u_5\\3u_1+4u_3-6u_4+5u_5\end{bmatrix}
  = \begin{bmatrix}2u_1\\u_1\\3u_1\end{bmatrix}+\begin{bmatrix}u_2\\-2u_2\\0\end{bmatrix}+\begin{bmatrix}3u_3\\3u_3\\4u_3\end{bmatrix}+\begin{bmatrix}-4u_4\\-9u_4\\-6u_4\end{bmatrix}+\begin{bmatrix}5u_5\\3u_5\\5u_5\end{bmatrix}
  = u_1\begin{bmatrix}2\\1\\3\end{bmatrix}+u_2\begin{bmatrix}1\\-2\\0\end{bmatrix}+u_3\begin{bmatrix}3\\3\\4\end{bmatrix}+u_4\begin{bmatrix}-4\\-9\\-6\end{bmatrix}+u_5\begin{bmatrix}5\\3\\5\end{bmatrix}


This says that every output of T (in other words, the vector v) can be written as a linear combination of the five vectors

\begin{bmatrix}2\\1\\3\end{bmatrix}\qquad\begin{bmatrix}1\\-2\\0\end{bmatrix}\qquad\begin{bmatrix}3\\3\\4\end{bmatrix}\qquad\begin{bmatrix}-4\\-9\\-6\end{bmatrix}\qquad\begin{bmatrix}5\\3\\5\end{bmatrix}

using the scalars u_1, u_2, u_3, u_4, u_5. Furthermore, since u can be any element of C^5, every such linear combination is an output. This means that

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}2\\1\\3\end{bmatrix},\ \begin{bmatrix}1\\-2\\0\end{bmatrix},\ \begin{bmatrix}3\\3\\4\end{bmatrix},\ \begin{bmatrix}-4\\-9\\-6\end{bmatrix},\ \begin{bmatrix}5\\3\\5\end{bmatrix}\right\}\right\rangle

The five vectors in this spanning set for R(T) form a linearly dependent set (Theorem MVSLD). So we can find a more economical presentation by any of the various methods from Section CRS and Section FS. We will place the vectors into a matrix as rows, row-reduce, toss out zero rows and appeal to Theorem BRS, so we can describe the range of T with a (nice) basis,

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}1\\0\\0\end{bmatrix},\ \begin{bmatrix}0\\1\\0\end{bmatrix},\ \begin{bmatrix}0\\0\\1\end{bmatrix}\right\}\right\rangle=\mathbb{C}^3

△
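Surjectivity here can also be confirmed by a one-line rank computation: T is surjective exactly when the rank of its matrix equals the dimension of the codomain. A sketch with SymPy (assumed available):

    from sympy import Matrix

    N = Matrix([[2,  1, 3, -4, 5],
                [1, -2, 3, -9, 3],
                [3,  0, 4, -6, 5]])
    print(N.rank() == 3)   # True, so R(T) = C^3 and T is surjective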

In contrast to injective linear transformations having small (trivial) kernels (Theorem KILT), surjective linear transformations have large ranges, as indicated in the next theorem.

Theorem RSLT Range of a Surjective Linear Transformation
Suppose that T: U → V is a linear transformation. Then T is surjective if and only if the range of T equals the codomain, R(T) = V.

Proof. (⇒) By Definition RLT, we know that R(T) ⊆ V. To establish the reverse inclusion, assume v ∈ V. Then since T is surjective (Definition SLT), there exists a vector u ∈ U so that T(u) = v. However, the existence of u gains v membership in R(T), so V ⊆ R(T). Thus, R(T) = V.

(⇐) To establish that T is surjective, choose v ∈ V. Since we are assuming that R(T) = V, v ∈ R(T). This says there is a vector u ∈ U so that T(u) = v, i.e. T is surjective. ■

Example NSAQR Not surjective, Archetype Q, revisited
We are now in a position to revisit our first example in this section, Example NSAQ. In that example, we showed that Archetype Q is not surjective by constructing a vector in the codomain where no element of the domain could be used to evaluate the linear transformation to create the output, thus violating Definition SLT. Just where did this vector come from?

The short answer is that the vector

v=\begin{bmatrix}-1\\2\\3\\-1\\4\end{bmatrix}

was constructed to lie outside of the range of T. How was this accomplished? First, the range of T is given by

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}1\\0\\0\\0\\1\end{bmatrix},\ \begin{bmatrix}0\\1\\0\\0\\-1\end{bmatrix},\ \begin{bmatrix}0\\0\\1\\0\\-1\end{bmatrix},\ \begin{bmatrix}0\\0\\0\\1\\2\end{bmatrix}\right\}\right\rangle

Suppose an element of the range v* has its first 4 components equal to −1, 2, 3, −1, in that order. Then to be an element of R(T), we would have

v^*=(-1)\begin{bmatrix}1\\0\\0\\0\\1\end{bmatrix}+(2)\begin{bmatrix}0\\1\\0\\0\\-1\end{bmatrix}+(3)\begin{bmatrix}0\\0\\1\\0\\-1\end{bmatrix}+(-1)\begin{bmatrix}0\\0\\0\\1\\2\end{bmatrix}=\begin{bmatrix}-1\\2\\3\\-1\\-8\end{bmatrix}

So the only vector in the range with these first four components specified must have −8 in the fifth component. To set the fifth component to any other value (say, 4) will result in a vector (v in Example NSAQ) outside of the range. Any attempt to find an input for T that will produce v as an output will be doomed to failure.

Whenever the range of a linear transformation is not the whole codomain, we can employ this device and conclude that the linear transformation is not surjective. This is another way of viewing Theorem RSLT. For a surjective linear transformation, the range is all of the codomain and there is no choice for a vector v that lies in V, yet not in the range. For every one of the archetypes that is not surjective, there is an example presented of exactly this form.

△
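To test membership in the range computationally, ask whether the linear system with the spanning vectors as columns and v as the vector of constants is consistent. A sketch with SymPy (assumed), comparing the ranks of the coefficient and augmented matrices:

    from sympy import Matrix

    S = Matrix([[1,  0,  0, 0],    # the four spanning vectors of R(T), as columns
                [0,  1,  0, 0],
                [0,  0,  1, 0],
                [0,  0,  0, 1],
                [1, -1, -1, 2]])
    v = Matrix([-1, 2, 3, -1, 4])
    print(S.rank(), S.row_join(v).rank())   # 4 versus 5: inconsistent, v not in R(T)

Changing the last entry of v to −8 makes the two ranks agree, recovering the observation above.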

Example NSAO Not surjective, Archetype O
In Example RAO the range of Archetype O was determined to be

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}1\\0\\-3\\-7\\-2\end{bmatrix},\ \begin{bmatrix}0\\1\\2\\5\\1\end{bmatrix}\right\}\right\rangle

a subspace of dimension 2 in C^5. Since R(T) ≠ C^5, Theorem RSLT says T is not surjective. △

Example SAN Surjective, Archetype N
The range of Archetype N was computed in Example FRAN to be

\mathcal{R}(T)=\left\langle\left\{\begin{bmatrix}1\\0\\0\end{bmatrix},\ \begin{bmatrix}0\\1\\0\end{bmatrix},\ \begin{bmatrix}0\\0\\1\end{bmatrix}\right\}\right\rangle

Since the basis for this subspace is the set of standard unit vectors for C^3 (Theorem SUVB), we have R(T) = C^3 and by Theorem RSLT, T is surjective. △

Subsection SSSLT
Spanning Sets and Surjective Linear Transformations

Just as injective linear transformations are allied with linear independence (Theorem ILTLI, Theorem ILTB), surjective linear transformations are allied with spanning sets.

Theorem SSRLT Spanning Set for Range of a Linear Transformation
Suppose that T: U → V is a linear transformation and

S = {u_1, u_2, u_3, ..., u_t}

spans U. Then

R = {T(u_1), T(u_2), T(u_3), ..., T(u_t)}

spans R(T).

Proof. We need to establish that R(T) = 〈R〉, a set equality. First we establish that R(T) ⊆ 〈R〉. To this end, choose v ∈ R(T). Then there exists a vector u ∈ U, such that T(u) = v (Definition RLT). Because S spans U there are scalars, a_1, a_2, a_3, ..., a_t, such that

u = a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_tu_t

Then

v = T(u)                                                    Definition RLT
  = T(a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_tu_t)              Definition SSVS
  = a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + ··· + a_tT(u_t)     Theorem LTLC

which establishes that v ∈ 〈R〉 (Definition SS). So R(T) ⊆ 〈R〉.

To establish the opposite inclusion, choose an element of the span of R, say v ∈ 〈R〉. Then there are scalars b_1, b_2, b_3, ..., b_t so that

v = b_1T(u_1) + b_2T(u_2) + b_3T(u_3) + ··· + b_tT(u_t)     Definition SS
  = T(b_1u_1 + b_2u_2 + b_3u_3 + ··· + b_tu_t)              Theorem LTLC

This demonstrates that v is an output of the linear transformation T, so v ∈ R(T). Therefore 〈R〉 ⊆ R(T), so we have the set equality R(T) = 〈R〉 (Definition SE). In other words, R spans R(T) (Definition SSVS). ■

Theorem SSRLT provides an easy way to begin the construction of a basis for the range of a linear transformation, since the construction of a spanning set requires simply evaluating the linear transformation on a spanning set of the domain. In practice the best choice for a spanning set of the domain would be as small as possible, in other words, a basis. The resulting spanning set for the range may not be linearly independent, so to find a basis for the range might require tossing out redundant vectors from the spanning set. Here is an example.

Example BRLT A basis for the range of a linear transformation
Define the linear transformation T: M_22 → P_2 by

T\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=(a+2b+8c+d)+(-3a+2b+5d)x+(a+b+5c)x^2

A convenient spanning set for M_22 is the basis

S=\left\{\begin{bmatrix}1&0\\0&0\end{bmatrix},\ \begin{bmatrix}0&1\\0&0\end{bmatrix},\ \begin{bmatrix}0&0\\1&0\end{bmatrix},\ \begin{bmatrix}0&0\\0&1\end{bmatrix}\right\}

So by Theorem SSRLT, a spanning set for R(T) is

R=\left\{T\left(\begin{bmatrix}1&0\\0&0\end{bmatrix}\right),\ T\left(\begin{bmatrix}0&1\\0&0\end{bmatrix}\right),\ T\left(\begin{bmatrix}0&0\\1&0\end{bmatrix}\right),\ T\left(\begin{bmatrix}0&0\\0&1\end{bmatrix}\right)\right\}
 =\left\{1-3x+x^2,\ 2+2x+x^2,\ 8+5x^2,\ 1+5x\right\}

The set R is not linearly independent, so if we desire a basis for R(T), we need to eliminate some redundant vectors. Two particular relations of linear dependence on R are

(-2)(1-3x+x^2)+(-3)(2+2x+x^2)+(8+5x^2)=0+0x+0x^2=0
(1-3x+x^2)+(-1)(2+2x+x^2)+(1+5x)=0+0x+0x^2=0

These, individually, allow us to remove 8 + 5x² and 1 + 5x from R without destroying the property that R spans R(T). The two remaining vectors are linearly independent (check this!), so we can write

\mathcal{R}(T)=\left\langle\left\{1-3x+x^2,\ 2+2x+x^2\right\}\right\rangle

and see that dim(R(T)) = 2. △
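The two relations of linear dependence can be discovered, rather than guessed, by row-reducing the coefficient vectors of the four output polynomials. A sketch with SymPy (assumed available), with one column per polynomial:

    from sympy import Matrix

    # columns: 1-3x+x^2, 2+2x+x^2, 8+5x^2, 1+5x
    # (rows hold the constant, x, and x^2 coefficients)
    R = Matrix([[ 1, 2, 8, 1],
                [-3, 2, 0, 5],
                [ 1, 1, 5, 0]])
    rref, pivots = R.rref()
    print(pivots)   # (0, 1): the first two polynomials are a basis for R(T)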

Elements of the range are precisely those elements of the codomain with nonempty pre-images.

Theorem RPI Range and Pre-Image
Suppose that T: U → V is a linear transformation. Then

v ∈ R(T) if and only if T⁻¹(v) ≠ ∅

Proof. (⇒) If v ∈ R(T), then there is a vector u ∈ U such that T(u) = v. This qualifies u for membership in T⁻¹(v), and thus the pre-image of v is not empty.

(⇐) Suppose the pre-image of v is not empty, so we can choose a vector u ∈ U such that T(u) = v. Then v ∈ R(T). ■

Now would be a good time to return to Diagram KPI which depicted the pre-images of a non-surjective linear transformation. The vectors x, y ∈ V were elements of the codomain whose pre-images were empty, as we expect for a non-surjective linear transformation from the characterization in Theorem RPI.

Theorem SLTB Surjective Linear Transformations and Bases
Suppose that T: U → V is a linear transformation and

B = {u_1, u_2, u_3, ..., u_m}

is a basis of U. Then T is surjective if and only if

C = {T(u_1), T(u_2), T(u_3), ..., T(u_m)}

is a spanning set for V.

Proof. (⇒) Assume T is surjective. Since B is a basis, we know B is a spanning set of U (Definition B). Then Theorem SSRLT says that C spans R(T). But the hypothesis that T is surjective means V = R(T) (Theorem RSLT), so C spans V.

(⇐) Assume that C spans V. To establish that T is surjective, we will show that every element of V is an output of T for some input (Definition SLT). Suppose that v ∈ V. As an element of V, we can write v as a linear combination of the spanning set C. So there are scalars, b_1, b_2, b_3, ..., b_m, such that

v = b_1T(u_1) + b_2T(u_2) + b_3T(u_3) + ··· + b_mT(u_m)

Now define the vector u ∈ U by

u = b_1u_1 + b_2u_2 + b_3u_3 + ··· + b_mu_m

Then

T(u) = T(b_1u_1 + b_2u_2 + b_3u_3 + ··· + b_mu_m)
     = b_1T(u_1) + b_2T(u_2) + b_3T(u_3) + ··· + b_mT(u_m)    Theorem LTLC
     = v

So, given any choice of a vector v ∈ V, we can design an input u ∈ U to produce v as an output of T. Thus, by Definition SLT, T is surjective. ■


Subsection SLTD
Surjective Linear Transformations and Dimension

Theorem SLTD Surjective Linear Transformations and Dimension
Suppose that T: U → V is a surjective linear transformation. Then dim(U) ≥ dim(V).

Proof. Suppose to the contrary that m = dim(U) < dim(V) = t. Let B be a basis of U, which will then contain m vectors. Apply T to each element of B to form a set C that is a subset of V. By Theorem SLTB, C is a spanning set of V with m or fewer vectors. So we have a set of m or fewer vectors that span V, a vector space of dimension t, with m < t. This is a contradiction, since a vector space of dimension t cannot be spanned by fewer than t vectors. So our assumption is false and dim(U) ≥ dim(V). ■

Theorem CSLTS Composition of Surjective Linear Transformations is Surjective
Suppose that T: U → V and S: V → W are surjective linear transformations. Then (S ◦ T): U → W is a surjective linear transformation.

Proof. To establish that the composition is surjective, choose an arbitrary element of the codomain, w ∈ W. Because S is surjective, there must be a vector v ∈ V, such that S(v) = w. With the existence of v established, that T is surjective guarantees a vector u ∈ U such that T(u) = v. Now,

(S ◦ T)(u) = S(T(u))    Definition LTC
           = S(v)       Definition of u
           = w          Definition of v

This establishes that any element of the codomain (w) can be created by evaluating S ◦ T with the right input (u). Thus, by Definition SLT, S ◦ T is surjective. ■

Reading Questions

1. Suppose T: C^5 → C^8 is a linear transformation. Why is T not surjective?

2. What is the relationship between a surjective linear transformation and its range?

3. There are many similarities and differences between injective and surjective linear transformations. Compare and contrast these two different types of linear transformations. (This means going well beyond just stating their definitions.)

Exercises

C10 Each archetype below is a linear transformation. Compute the range for each.
Archetype M, Archetype N, Archetype O, Archetype P, Archetype Q, Archetype R, Archetype S, Archetype T, Archetype U, Archetype V, Archetype W, Archetype X

C20 Example SAR concludes with an expression for a vector u ∈ C^5 that we believe will create the vector v ∈ C^5 when used to evaluate T. That is, T(u) = v. Verify this assertion by actually evaluating T with u. If you do not have the patience to push around all these symbols, try choosing a numerical instance of v, compute u, and then compute T(u), which should result in v.

C22† The linear transformation S: C^4 → C^3 is not surjective. Find an output w ∈ C^3 that has an empty pre-image (that is, S⁻¹(w) = ∅).

S\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}\right)=\begin{bmatrix}2x_1+x_2+3x_3-4x_4\\x_1+3x_2+4x_3+3x_4\\-x_1+2x_2+x_3+7x_4\end{bmatrix}

C23† Determine whether or not the following linear transformation T: C^5 → P_3 is surjective:

T\left(\begin{bmatrix}a\\b\\c\\d\\e\end{bmatrix}\right)=a+(b+c)x+(c+d)x^2+(d+e)x^3


C24† Determine whether or not the linear transformation T: P_3 → C^5 below is surjective:

T\left(a+bx+cx^2+dx^3\right)=\begin{bmatrix}a+b\\b+c\\c+d\\a+c\\b+d\end{bmatrix}.

C25† Define the linear transformation

T\colon \mathbb{C}^3\to\mathbb{C}^2,\quad T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)=\begin{bmatrix}2x_1-x_2+5x_3\\-4x_1+2x_2-10x_3\end{bmatrix}

Find a basis for the range of T, R(T). Is T surjective?

C26† Let T: C^3 → C^3 be given by

T\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right)=\begin{bmatrix}a+b+2c\\2c\\a+b+c\end{bmatrix}

Find a basis of R(T). Is T surjective?

C27† Let T: C^3 → C^4 be given by

T\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right)=\begin{bmatrix}a+b-c\\a-b+c\\-a+b+c\\a+b+c\end{bmatrix}

Find a basis of R(T). Is T surjective?

C28† Let T: C^4 → M_22 be given by

T\left(\begin{bmatrix}a\\b\\c\\d\end{bmatrix}\right)=\begin{bmatrix}a+b&a+b+c\\a+b+c&a+d\end{bmatrix}

Find a basis of R(T). Is T surjective?

C29† Let T: P_2 → P_4 be given by T(p(x)) = x²p(x). Find a basis of R(T). Is T surjective?

C30† Let T: P_4 → P_3 be given by T(p(x)) = p′(x), where p′(x) is the derivative. Find a basis of R(T). Is T surjective?

C40† Show that the linear transformation T is not surjective by finding an element of the codomain, v, such that there is no vector u with T(u) = v.

T\colon \mathbb{C}^3\to\mathbb{C}^3,\quad T\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right)=\begin{bmatrix}2a+3b-c\\2b-2c\\a-b+2c\end{bmatrix}

M60 Suppose U and V are vector spaces. Define the function Z: U → V by Z(u) = 0_V for every u ∈ U. Then by Exercise LT.M60, Z is a linear transformation. Formulate a condition on V that is equivalent to Z being a surjective linear transformation. In other words, fill in the blank to complete the following statement (and then give a proof): Z is surjective if and only if V is ________. (See Exercise ILT.M60, Exercise IVLT.M60.)

T15† Suppose that T: U → V and S: V → W are linear transformations. Prove the following relationship between ranges.

R(S ◦ T) ⊆ R(S)

T20† Suppose that A is an m × n matrix. Define the linear transformation T by

T\colon \mathbb{C}^n\to\mathbb{C}^m,\quad T(x)=Ax

Prove that the range of T equals the column space of A, R(T) = C(A).


Section IVLT
Invertible Linear Transformations

In this section we will conclude our introduction to linear transformations by bringing together the twin properties of injectivity and surjectivity and considering linear transformations with both of these properties.

Subsection IVLT
Invertible Linear Transformations

One preliminary definition, and then we will have our main definition for this section.

Definition IDLT Identity Linear Transformation
The identity linear transformation on the vector space W is defined as

I_W: W → W,   I_W(w) = w    □

Informally, I_W is the "do-nothing" function. You should check that I_W is really a linear transformation, as claimed, and then compute its kernel and range to see that it is both injective and surjective. All of these facts should be straightforward to verify (Exercise IVLT.T05). With this in hand we can make our main definition.

Definition IVLT Invertible Linear Transformations
Suppose that T: U → V is a linear transformation. If there is a function S: V → U such that

S ◦ T = I_U        T ◦ S = I_V

then T is invertible. In this case, we call S the inverse of T and write S = T⁻¹. □

Informally, a linear transformation T is invertible if there is a companion linear transformation, S, which "undoes" the action of T. When the two linear transformations are applied consecutively (composition), in either order, the result is to have no real effect. It is entirely analogous to squaring a positive number and then taking its (positive) square root.

Here is an example of a linear transformation that is invertible. As usual at the beginning of a section, do not be concerned with where S came from, just understand how it illustrates Definition IVLT.

Example AIVLT An invertible linear transformation
Archetype V is the linear transformation

T\colon P_3\to M_{22},\quad T\left(a+bx+cx^2+dx^3\right)=\begin{bmatrix}a+b&a-2c\\d&b-d\end{bmatrix}


Define the function S: M_22 → P_3 by

S\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=(a-c-d)+(c+d)x+\tfrac{1}{2}(a-b-c-d)x^2+cx^3

Then

(T ◦ S)\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right) = T\left(S\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)\right)
  = T\left((a-c-d)+(c+d)x+\tfrac{1}{2}(a-b-c-d)x^2+cx^3\right)
  = \begin{bmatrix}(a-c-d)+(c+d)&(a-c-d)-2\left(\tfrac{1}{2}(a-b-c-d)\right)\\c&(c+d)-c\end{bmatrix}
  = \begin{bmatrix}a&b\\c&d\end{bmatrix}
  = I_{M_{22}}\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)

and

(S ◦ T)\left(a+bx+cx^2+dx^3\right) = S\left(T\left(a+bx+cx^2+dx^3\right)\right)
  = S\left(\begin{bmatrix}a+b&a-2c\\d&b-d\end{bmatrix}\right)
  = ((a+b)-d-(b-d))+(d+(b-d))x+\left(\tfrac{1}{2}((a+b)-(a-2c)-d-(b-d))\right)x^2+(d)x^3
  = a+bx+cx^2+dx^3
  = I_{P_3}\left(a+bx+cx^2+dx^3\right)

For now, understand why these computations show that T is invertible, and that S = T⁻¹. Maybe even be amazed by how S works so perfectly in concert with T! We will see later just how to arrive at the correct form of S (when it is possible). △
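With the coefficient-vector encoding used earlier for P_3 and M_22 (an assumption of this sketch, not part of the example), checking that S inverts T collapses to checking that two matrices multiply to the identity, in either order. In SymPy:

    from sympy import Matrix, Rational, eye

    # T: (a, b, c, d) -> (a+b, a-2c, d, b-d), reading the 2x2 output across its rows
    T = Matrix([[1, 1,  0,  0],
                [1, 0, -2,  0],
                [0, 0,  0,  1],
                [0, 1,  0, -1]])
    # S: (a, b, c, d) -> (a-c-d, c+d, (a-b-c-d)/2, c)
    h = Rational(1, 2)
    S = Matrix([[1,  0, -1, -1],
                [0,  0,  1,  1],
                [h, -h, -h, -h],
                [0,  0,  1,  0]])
    print(S * T == eye(4), T * S == eye(4))   # True True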

It can be as instructive to study a linear transformation that is not invertible.

Example ANILT A non-invertible linear transformation
Consider the linear transformation T: C^3 → M_22 defined by

T\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right)=\begin{bmatrix}a-b&2a+2b+c\\3a+b+c&-2a-6b-2c\end{bmatrix}

Suppose we were to search for an inverse function S: M_22 → C^3.

First verify that the 2 × 2 matrix

A=\begin{bmatrix}5&3\\8&2\end{bmatrix}

is not in the range of T. This will amount to finding an input to T, a column vector with components a, b, c, such that

a − b = 5
2a + 2b + c = 3
3a + b + c = 8
−2a − 6b − 2c = 2

As this system of equations is inconsistent, there is no input column vector, and A ∉ R(T). How should we define S(A)? Note that

T(S(A)) = (T ◦ S)(A) = I_{M_{22}}(A) = A

So any definition we would provide for S(A) must then be a column vector that T sends to A and we would have A ∈ R(T), contrary to the definition of T. This is enough to see that there is no function S that will allow us to conclude that T is invertible, since we cannot provide a consistent definition for S(A) if we assume T is invertible.

Even though we now know that T is not invertible, let us not leave this example just yet. Check that

T\left(\begin{bmatrix}1\\-2\\4\end{bmatrix}\right)=\begin{bmatrix}3&2\\5&2\end{bmatrix}=B\qquad T\left(\begin{bmatrix}0\\-3\\8\end{bmatrix}\right)=\begin{bmatrix}3&2\\5&2\end{bmatrix}=B

How would we define S(B)?

S(B)=S\left(T\left(\begin{bmatrix}1\\-2\\4\end{bmatrix}\right)\right)=(S ◦ T)\left(\begin{bmatrix}1\\-2\\4\end{bmatrix}\right)=I_{\mathbb{C}^3}\left(\begin{bmatrix}1\\-2\\4\end{bmatrix}\right)=\begin{bmatrix}1\\-2\\4\end{bmatrix}

or

S(B)=S\left(T\left(\begin{bmatrix}0\\-3\\8\end{bmatrix}\right)\right)=(S ◦ T)\left(\begin{bmatrix}0\\-3\\8\end{bmatrix}\right)=I_{\mathbb{C}^3}\left(\begin{bmatrix}0\\-3\\8\end{bmatrix}\right)=\begin{bmatrix}0\\-3\\8\end{bmatrix}

Which definition should we provide for S(B)? Both are necessary. But then S is not a function. So we have a second reason to know that there is no function S that will allow us to conclude that T is invertible. It happens that there are infinitely many column vectors that S would have to take to B. Construct the kernel of T,

\mathcal{K}(T)=\left\langle\left\{\begin{bmatrix}-1\\-1\\4\end{bmatrix}\right\}\right\rangle

Now choose either of the two inputs used above for T and add to it a scalar multiple of the basis vector for the kernel of T. For example,

x=\begin{bmatrix}1\\-2\\4\end{bmatrix}+(-2)\begin{bmatrix}-1\\-1\\4\end{bmatrix}=\begin{bmatrix}3\\0\\-4\end{bmatrix}

then verify that T(x) = B. Practice creating a few more inputs for T that would be sent to B, and see why it is hopeless to think that we could ever provide a reasonable definition for S(B)! There is a "whole subspace's worth" of values that S(B) would have to take on.

△
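Both obstructions are visible computationally. A sketch with SymPy (assumed): a rank test shows A has no preimage, while B has (at least) two different preimages, and the kernel explains why.

    from sympy import Matrix

    M = Matrix([[ 1, -1,  0],   # the four entries of T's output, read across the rows
                [ 2,  2,  1],
                [ 3,  1,  1],
                [-2, -6, -2]])
    A = Matrix([5, 3, 8, 2])
    print(M.rank(), M.row_join(A).rank())   # 2 versus 3: no preimage for A
    B = Matrix([3, 2, 5, 2])
    print(M * Matrix([1, -2, 4]) == B)      # True
    print(M * Matrix([3, 0, -4]) == B)      # True: two preimages for B
    print(M.nullspace())                    # a multiple of (-1, -1, 4): the kernel basis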

In Example ANILT you may have noticed that T is not surjective, since the matrix A was not in the range of T. And T is not injective since there are two different input column vectors that T sends to the matrix B. Linear transformations T that are not surjective lead to putative inverse functions S that are undefined on inputs outside of the range of T. Linear transformations T that are not injective lead to putative inverse functions S that are multiply-defined on each of their inputs. We will formalize these ideas in Theorem ILTIS.

But first notice in Definition IVLT that we only require the inverse (when it exists) to be a function. When it does exist, it too is a linear transformation.

Theorem ILTLT Inverse of a Linear Transformation is a Linear Transformation
Suppose that T: U → V is an invertible linear transformation. Then the function T⁻¹: V → U is a linear transformation.

Proof. We work through verifying Definition LT for T⁻¹, using the fact that T is a linear transformation to obtain the second equality in each half of the proof. To this end, suppose x, y ∈ V and α ∈ C.

T⁻¹(x + y) = T⁻¹(T(T⁻¹(x)) + T(T⁻¹(y)))    Definition IVLT
           = T⁻¹(T(T⁻¹(x) + T⁻¹(y)))       Definition LT
           = T⁻¹(x) + T⁻¹(y)               Definition IVLT

Now check the second defining property of a linear transformation for T⁻¹,

T⁻¹(αx) = T⁻¹(αT(T⁻¹(x)))    Definition IVLT
        = T⁻¹(T(αT⁻¹(x)))    Definition LT
        = αT⁻¹(x)            Definition IVLT

So T⁻¹ fulfills the requirements of Definition LT and is therefore a linear transformation. ■

So when T has an inverse, T⁻¹ is also a linear transformation. Furthermore, T⁻¹ is an invertible linear transformation and its inverse is what you might expect.


Theorem IILT Inverse of an Invertible Linear Transformation
Suppose that T: U → V is an invertible linear transformation. Then T⁻¹ is an invertible linear transformation and (T⁻¹)⁻¹ = T.

Proof. Because T is invertible, Definition IVLT tells us there is a function T⁻¹: V → U such that

T⁻¹ ◦ T = I_U        T ◦ T⁻¹ = I_V

Additionally, Theorem ILTLT tells us that T⁻¹ is more than just a function, it is a linear transformation. Now view these two statements as properties of the linear transformation T⁻¹. In light of Definition IVLT, they together say that T⁻¹ is invertible (let T play the role of S in the statement of the definition). Furthermore, the inverse of T⁻¹ is then T, i.e. (T⁻¹)⁻¹ = T. ■

Subsection IV
Invertibility

We now know what an inverse linear transformation is, but just which linear transformations have inverses? Here is a theorem we have been preparing for all chapter long.

Theorem ILTIS Invertible Linear Transformations are Injective and Surjective
Suppose T: U → V is a linear transformation. Then T is invertible if and only if T is injective and surjective.

Proof. (⇒) Since T is presumed invertible, we can employ its inverse, T⁻¹ (Definition IVLT). To see that T is injective, suppose x, y ∈ U and assume that T(x) = T(y),

x = I_U(x)           Definition IDLT
  = (T⁻¹ ◦ T)(x)     Definition IVLT
  = T⁻¹(T(x))        Definition LTC
  = T⁻¹(T(y))        Definition ILT
  = (T⁻¹ ◦ T)(y)     Definition LTC
  = I_U(y)           Definition IVLT
  = y                Definition IDLT

So by Definition ILT, T is injective.

To check that T is surjective, suppose v ∈ V. Then T⁻¹(v) is a vector in U. Compute

T(T⁻¹(v)) = (T ◦ T⁻¹)(v)    Definition LTC
          = I_V(v)          Definition IVLT
          = v               Definition IDLT

So there is an element from U, when used as an input to T (namely T⁻¹(v)) that produces the desired output, v, and hence T is surjective by Definition SLT.

(⇐) Now assume that T is both injective and surjective. We will build a function S: V → U that will establish that T is invertible. To this end, choose any v ∈ V. Since T is surjective, Theorem RSLT says R(T) = V, so we have v ∈ R(T). Theorem RPI says that the pre-image of v, T⁻¹(v), is nonempty. So we can choose a vector from the pre-image of v, say u. In other words, there exists u ∈ T⁻¹(v).

Since T⁻¹(v) is nonempty, Theorem KPI then says that

T⁻¹(v) = {u + z | z ∈ K(T)}

However, because T is injective, by Theorem KILT the kernel is trivial, K(T) = {0}. So the pre-image is a set with just one element, T⁻¹(v) = {u}. Now we can define S by S(v) = u. This is the key to this half of this proof. Normally the pre-image of a vector from the codomain might be an empty set, or an infinite set. But surjectivity requires that the pre-image not be empty, and then injectivity limits the pre-image to a singleton. Since our choice of v was arbitrary, we know that every pre-image for T is a set with a single element. This allows us to construct S as a function. Now that it is defined, verifying that it is the inverse of T will be easy. Here we go.

Choose u ∈ U. Define v = T(u). Then T⁻¹(v) = {u}, so that S(v) = u and,

(S ◦ T)(u) = S(T(u)) = S(v) = u = I_U(u)

and since our choice of u was arbitrary we have function equality, S ◦ T = I_U.

Now choose v ∈ V. Define u to be the single vector in the set T⁻¹(v), in other words, u = S(v). Then T(u) = v, so

(T ◦ S)(v) = T(S(v)) = T(u) = v = I_V(v)

and since our choice of v was arbitrary we have function equality, T ◦ S = I_V. ■

When a linear transformation is both injective and surjective, the pre-image of any element of the codomain is a set of size one (a "singleton"). This fact allowed us to construct the inverse linear transformation in one half of the proof of Theorem ILTIS (see Proof Technique C) and is illustrated in the following cartoon. This should remind you of the very general Diagram KPI which was used to illustrate Theorem KPI about pre-images, only now we have an invertible linear transformation which is therefore surjective and injective (Theorem ILTIS). As a surjective linear transformation, there are no vectors depicted in the codomain, V, that have empty pre-images. More importantly, as an injective linear transformation, the kernel is trivial (Theorem KILT), so each pre-image is a single vector. This makes it possible to "turn around" all the arrows to create the inverse linear transformation T⁻¹.


§IVLT Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 486<br />

Diagram IVLT: Invertible Linear Transformation

Many will call an injective and surjective function a bijective function or just a bijection. Theorem ILTIS tells us that this is just a synonym for the term invertible (which we will use exclusively).

We can follow the constructive approach of the proof of Theorem ILTIS to construct the inverse of a specific linear transformation, as the next example shows.

Example CIVLT Computing the Inverse of a Linear Transformation
Consider the linear transformation T: S_22 → P_2 defined by

T\left(\begin{bmatrix}a&b\\b&c\end{bmatrix}\right)=(a+b+c)+(-a+2c)x+(2a+3b+6c)x^2

T is invertible, which you are able to verify, perhaps by determining that the kernel of T is trivial and the range of T is all of P_2. This will be easier once we have Theorem RPNDD, which appears later in this section.

By Theorem ILTIS we know T⁻¹ exists, and it will be critical shortly to realize that T⁻¹ is automatically known to be a linear transformation as well (Theorem ILTLT). To determine the complete behavior of T⁻¹: P_2 → S_22 we can simply determine its action on a basis for the domain, P_2. This is the substance of Theorem LTDB, and an excellent example of its application. Choose any basis of P_2, the simpler the better, such as B = {1, x, x²}. Values of T⁻¹ for these three basis elements will be the single elements of their pre-images. In turn, we have

T⁻¹(1):

T\left(\begin{bmatrix}a&b\\b&c\end{bmatrix}\right)=1+0x+0x^2

\begin{bmatrix}1&1&1&1\\-1&0&2&0\\2&3&6&0\end{bmatrix}\xrightarrow{\text{RREF}}\begin{bmatrix}1&0&0&-6\\0&1&0&10\\0&0&1&-3\end{bmatrix}

(preimage) T⁻¹(1) = \left\{\begin{bmatrix}-6&10\\10&-3\end{bmatrix}\right\}
(function) T⁻¹(1) = \begin{bmatrix}-6&10\\10&-3\end{bmatrix}

T⁻¹(x):

T\left(\begin{bmatrix}a&b\\b&c\end{bmatrix}\right)=0+1x+0x^2

\begin{bmatrix}1&1&1&0\\-1&0&2&1\\2&3&6&0\end{bmatrix}\xrightarrow{\text{RREF}}\begin{bmatrix}1&0&0&-3\\0&1&0&4\\0&0&1&-1\end{bmatrix}

(preimage) T⁻¹(x) = \left\{\begin{bmatrix}-3&4\\4&-1\end{bmatrix}\right\}
(function) T⁻¹(x) = \begin{bmatrix}-3&4\\4&-1\end{bmatrix}

T⁻¹(x²):

T\left(\begin{bmatrix}a&b\\b&c\end{bmatrix}\right)=0+0x+1x^2

\begin{bmatrix}1&1&1&0\\-1&0&2&0\\2&3&6&1\end{bmatrix}\xrightarrow{\text{RREF}}\begin{bmatrix}1&0&0&2\\0&1&0&-3\\0&0&1&1\end{bmatrix}

(preimage) T⁻¹(x²) = \left\{\begin{bmatrix}2&-3\\-3&1\end{bmatrix}\right\}
(function) T⁻¹(x²) = \begin{bmatrix}2&-3\\-3&1\end{bmatrix}

Theorem LTDB says, informally, "it is enough to know what a linear transformation does to a basis." Formally, we have the outputs of T⁻¹ for a basis, so by Theorem LTDB there is a unique linear transformation with these outputs. So we put this information to work. The key step here is that we can convert any element of P_2 into a linear combination of the elements of the basis B (Theorem VRRB). We are after a "formula" for the value of T⁻¹ on a generic element of P_2, say p + qx + rx².

T⁻¹(p + qx + rx²) = T⁻¹(p(1) + q(x) + r(x²))          Theorem VRRB
                  = pT⁻¹(1) + qT⁻¹(x) + rT⁻¹(x²)      Theorem LTLC
                  = p\begin{bmatrix}-6&10\\10&-3\end{bmatrix}+q\begin{bmatrix}-3&4\\4&-1\end{bmatrix}+r\begin{bmatrix}2&-3\\-3&1\end{bmatrix}
                  = \begin{bmatrix}-6p-3q+2r&10p+4q-3r\\10p+4q-3r&-3p-q+r\end{bmatrix}

Notice how a linear combination in the domain of T⁻¹ has been translated into a linear combination in the codomain of T⁻¹ since we know T⁻¹ is a linear transformation by Theorem ILTLT.

Also, notice how the augmented matrices used to determine the three pre-images could be combined into one calculation of a matrix in extended echelon form, reminiscent of a procedure we know for computing the inverse of a matrix (see Example CMI). Hmmmm. △
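The "Hmmmm" can be made concrete: inverting the 3 × 3 matrix of T performs all three pre-image computations at once. A sketch with SymPy (assumed), encoding an element of the domain S_22 by the triple (a, b, c):

    from sympy import Matrix, symbols

    p, q, r = symbols('p q r')
    T = Matrix([[ 1, 1, 1],    # a + b + c
                [-1, 0, 2],    # -a + 2c
                [ 2, 3, 6]])   # 2a + 3b + 6c
    print(T.inv())   # columns are T^{-1}(1), T^{-1}(x), T^{-1}(x^2) as (a, b, c) triples
    print(T.inv() * Matrix([p, q, r]))   # (-6p-3q+2r, 10p+4q-3r, -3p-q+r), as in the text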

We will make frequent use of the characterization of invertible linear transformations provided by Theorem ILTIS. The next theorem is a good example of this, and we will use it often, too.

Theorem CIVLT Composition of Invertible Linear Transformations
Suppose that T: U → V and S: V → W are invertible linear transformations. Then the composition, (S ◦ T): U → W is an invertible linear transformation.

Proof. Since S and T are both linear transformations, S ◦ T is also a linear transformation by Theorem CLTLT. Since S and T are both invertible, Theorem ILTIS says that S and T are both injective and surjective. Then Theorem CILTI says S ◦ T is injective, and Theorem CSLTS says S ◦ T is surjective. Now apply the "other half" of Theorem ILTIS and conclude that S ◦ T is invertible. ■

When a composition is invertible, the inverse is easy to construct.

Theorem ICLT Inverse of a Composition of Linear Transformations
Suppose that T: U → V and S: V → W are invertible linear transformations. Then S ◦ T is invertible and (S ◦ T)⁻¹ = T⁻¹ ◦ S⁻¹.

Proof. Compute, for all w ∈ W,

((S ◦ T) ◦ (T⁻¹ ◦ S⁻¹))(w) = S(T(T⁻¹(S⁻¹(w))))
                           = S(I_V(S⁻¹(w)))    Definition IVLT
                           = S(S⁻¹(w))         Definition IDLT
                           = w                 Definition IVLT
                           = I_W(w)            Definition IDLT

So (S ◦ T) ◦ (T⁻¹ ◦ S⁻¹) = I_W, and also

((T⁻¹ ◦ S⁻¹) ◦ (S ◦ T))(u) = T⁻¹(S⁻¹(S(T(u))))
                           = T⁻¹(I_V(T(u)))    Definition IVLT
                           = T⁻¹(T(u))         Definition IDLT
                           = u                 Definition IVLT
                           = I_U(u)            Definition IDLT

so (T⁻¹ ◦ S⁻¹) ◦ (S ◦ T) = I_U.

By Definition IVLT, S ◦ T is invertible and (S ◦ T)⁻¹ = T⁻¹ ◦ S⁻¹. ■

Notice that this theorem not only establishes what the inverse of S ◦ T is, it also duplicates the conclusion of Theorem CIVLT and so establishes the invertibility of S ◦ T. But somehow, the proof of Theorem CIVLT is a nicer way to get this property. Does Theorem ICLT remind you of the flavor of any theorem we have seen about matrices? (Hint: Think about getting dressed.) Hmmmm.

Subsection SI
Structure and Isomorphism

A vector space is defined (Definition VS) as a set of objects ("vectors") endowed with a definition of vector addition (+) and a definition of scalar multiplication (written with juxtaposition). Many of our definitions about vector spaces involve linear combinations (Definition LC), such as the span of a set (Definition SS) and linear independence (Definition LI). Other definitions are built up from these ideas, such as bases (Definition B) and dimension (Definition D). The defining properties of a linear transformation require that a function "respect" the operations of the two vector spaces that are the domain and the codomain (Definition LT). Finally, an invertible linear transformation is one that can be "undone": it has a companion that reverses its effect. In this subsection we are going to begin to roll all these ideas into one.

A vector space has "structure" derived from definitions of the two operations and the requirement that these operations interact in ways that satisfy the ten properties of Definition VS. When two different vector spaces have an invertible linear transformation defined between them, then we can translate questions about linear combinations (spans, linear independence, bases, dimension) from the first vector space to the second. The answers obtained in the second vector space can then be translated back, via the inverse linear transformation, and interpreted in the setting of the first vector space. We say that these invertible linear transformations "preserve structure." And we say that the two vector spaces are "structurally the same." The precise term is "isomorphic," from Greek meaning "of the same form." Let us begin to try to understand this important concept.


Definition IVS Isomorphic Vector Spaces
Two vector spaces U and V are isomorphic if there exists an invertible linear transformation T with domain U and codomain V, T : U → V. In this case, we write U ≅ V, and the linear transformation T is known as an isomorphism between U and V. □

A few comments on this definition. First, be careful with your language (Proof Technique L). Two vector spaces are isomorphic, or they are not; it is a yes/no situation and the term only applies to a pair of vector spaces. Any invertible linear transformation can be called an isomorphism; it is a term that applies to functions. Second, given a pair of vector spaces there might be several different isomorphisms between the two vector spaces. But it only takes the existence of one to call the pair isomorphic. Third, is U isomorphic to V, or V isomorphic to U? It does not matter, since the inverse linear transformation will provide the needed isomorphism in the “opposite” direction. Being “isomorphic to” is an equivalence relation on the set of all vector spaces (see Theorem SER for a reminder about equivalence relations).

Example IVSAV Isomorphic vector spaces, Archetype V
Archetype V is a linear transformation from P₃ to M₂₂,

    T : P₃ → M₂₂,  T(a + bx + cx² + dx³) = [a + b   a − 2c]
                                           [  d      b − d]

Since it is injective and surjective, Theorem ILTIS tells us that it is an invertible linear transformation. By Definition IVS we say P₃ and M₂₂ are isomorphic.

At a basic level, the term “isomorphic” is nothing more than a codeword for the presence of an invertible linear transformation. However, it is also a description of a powerful idea, and this power only becomes apparent in the course of studying examples and related theorems. In this example, we are led to believe that there is nothing “structurally” different about P₃ and M₂₂. In a certain sense they are the same. Not equal, but the same. One is as good as the other. One is just as interesting as the other.

Here is an extremely basic application of this idea. Suppose we want to compute the following linear combination of polynomials in P₃,

    5(2 + 3x − 4x² + 5x³) + (−3)(3 − 5x + 3x² + x³)

Rather than doing it straight-away (which is very easy), we will apply the transformation T to convert into a linear combination of matrices, and then compute in M₂₂ according to the definitions of the vector space operations there (Example VSM),

    T(5(2 + 3x − 4x² + 5x³) + (−3)(3 − 5x + 3x² + x³))
      = 5T(2 + 3x − 4x² + 5x³) + (−3)T(3 − 5x + 3x² + x³)    Theorem LTLC

      = 5 [5  10] + (−3) [−2  −3]                            Definition of T
          [5  −2]        [ 1  −6]

      = [31  59]                                             Operations in M₂₂
        [22   8]

Now we will translate our answer back to P₃ by applying T⁻¹, which we demonstrated in Example AIVLT,

    T⁻¹ : M₂₂ → P₃,
    T⁻¹([a  b]) = (a − c − d) + (c + d)x + ½(a − b − c − d)x² + cx³
       ([c  d])

We compute,

    T⁻¹([31  59]) = 1 + 30x − 29x² + 22x³
       ([22   8])

which is, as expected, exactly what we would have computed for the original linear combination had we just used the definitions of the operations in P₃ (Example VSP). Notice this is meant only as an illustration and not a suggested route for doing this particular computation. △
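For readers who like to check such computations mechanically, here is a small sketch in Python with SymPy that transcribes T and T⁻¹ from the example and replays the round trip; the function names T and Tinv are labels invented for this illustration.

    import sympy as sp

    x = sp.symbols('x')

    def T(p):
        # Archetype V: a + bx + cx^2 + dx^3 -> [[a+b, a-2c], [d, b-d]]
        cs = sp.Poly(p, x).all_coeffs()[::-1]   # [a, b, c, d], low degree first
        cs += [0] * (4 - len(cs))
        a, b, c, d = cs
        return sp.Matrix([[a + b, a - 2*c], [d, b - d]])

    def Tinv(M):
        # Inverse from Example AIVLT, transcribed from the formula above.
        a, b, c, d = M[0, 0], M[0, 1], M[1, 0], M[1, 1]
        return (a - c - d) + (c + d)*x \
               + sp.Rational(1, 2)*(a - b - c - d)*x**2 + c*x**3

    p1 = 2 + 3*x - 4*x**2 + 5*x**3
    p2 = 3 - 5*x + 3*x**2 + x**3
    M = 5*T(p1) + (-3)*T(p2)        # compute in M22
    print(M)                        # Matrix([[31, 59], [22, 8]])
    print(sp.expand(Tinv(M)))       # 22*x**3 - 29*x**2 + 30*x + 1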

In Example IVSAV we avoided a computation in P₃ by a conversion of the computation to a new vector space, M₂₂, via an invertible linear transformation (also known as an isomorphism). Here is a diagram meant to illustrate the more general situation of two vector spaces, U and V, and an invertible linear transformation, T. The diagram is simply about a sum of two vectors from U, rather than a more involved linear combination. It should remind you of Diagram DLTA.

                      T
    u₁, u₂   ─────────────────→   T(u₁), T(u₂)
      │ +                               │ +
      ↓                                 ↓
    u₁ + u₂  ←─────────────────   T(u₁ + u₂) = T(u₁) + T(u₂)
                     T⁻¹

    Diagram AIVS: Addition in Isomorphic Vector Spaces

To understand this diagram, begin in the upper-left corner, and by going straight down we can compute the sum of the two vectors using the addition for the vector space U. The more circuitous alternative, in the spirit of Example IVSAV, is to begin in the upper-left corner and then proceed clockwise around the other three sides of the rectangle. Notice that the vector addition is accomplished using the addition in the vector space V. Then, because T is a linear transformation, we can say that the result of T(u₁) + T(u₂) is equal to T(u₁ + u₂). Then the key feature is to recognize that applying T⁻¹ converts this second version of the result into the sum in the lower-left corner. So there are two routes to the sum u₁ + u₂, each employing an addition from a different vector space, but one is “direct” and the other is “roundabout”. You might try designing a similar diagram for the case of scalar multiplication (see Diagram DLTM) or for a full linear combination.

Checking the dimensions of two vector spaces can be a quick way to establish that they are not isomorphic. Here is the theorem.

Theorem IVSED Isomorphic Vector Spaces have Equal Dimension
Suppose U and V are isomorphic vector spaces. Then dim(U) = dim(V).

Proof. If U and V are isomorphic, there is an invertible linear transformation T : U → V (Definition IVS). T is injective by Theorem ILTIS and so by Theorem ILTD, dim(U) ≤ dim(V). Similarly, T is surjective by Theorem ILTIS and so by Theorem SLTD, dim(U) ≥ dim(V). The net effect of these two inequalities is that dim(U) = dim(V). ∎

The contrapositive of Theorem IVSED says that if U and V have different dimensions, then they are not isomorphic. Dimension is the simplest “structural” characteristic that will allow you to distinguish non-isomorphic vector spaces. For example, P₆ is not isomorphic to M₃₄ since their dimensions (7 and 12, respectively) are not equal. With tools developed in Section VR we will be able to establish that the converse of Theorem IVSED is true. Think about that one for a moment.

Subsection RNLT
Rank and Nullity of a Linear Transformation

Just as a matrix has a rank and a nullity, so too do linear transformations. And just like the rank and nullity of a matrix are related (they sum to the number of columns, Theorem RPNC), the rank and nullity of a linear transformation are related. Here are the definitions and theorems; see the Archetypes (Archetypes) for loads of examples.

Definition ROLT Rank Of a Linear Transformation
Suppose that T : U → V is a linear transformation. Then the rank of T, r(T), is the dimension of the range of T,

    r(T) = dim(R(T))                                                      □

Definition NOLT Nullity Of a Linear Transformation
Suppose that T : U → V is a linear transformation. Then the nullity of T, n(T), is the dimension of the kernel of T,

    n(T) = dim(K(T))                                                      □

Here are two quick theorems.

Theorem ROSLT Rank Of a Surjective Linear Transformation
Suppose that T : U → V is a linear transformation. Then the rank of T is the dimension of V, r(T) = dim(V), if and only if T is surjective.

Proof. By Theorem RSLT, T is surjective if and only if R(T) = V. Applying Definition ROLT, R(T) = V if and only if r(T) = dim(R(T)) = dim(V). ∎

Theorem NOILT Nullity Of an Injective Linear Transformation
Suppose that T : U → V is a linear transformation. Then the nullity of T is zero, n(T) = 0, if and only if T is injective.

Proof. By Theorem KILT, T is injective if and only if K(T) = {0}. Applying Definition NOLT, K(T) = {0} if and only if n(T) = 0. ∎

Just as injectivity and surjectivity come together in invertible linear transformations, there is a clear relationship between the rank and nullity of a linear transformation. If one is big, the other is small.

Theorem RPNDD Rank Plus Nullity is Domain Dimension
Suppose that T : U → V is a linear transformation. Then

    r(T) + n(T) = dim(U)

Proof. Let r = r(T) and s = n(T). Suppose that R = {v₁, v₂, v₃, ..., vᵣ} ⊆ V is a basis of the range of T, R(T), and S = {u₁, u₂, u₃, ..., uₛ} ⊆ U is a basis of the kernel of T, K(T). Note that R and S are possibly empty, which means that some of the sums in this proof are “empty” and are equal to the zero vector.

Because the elements of R are all in the range of T, each must have a nonempty pre-image by Theorem RPI. Choose vectors wᵢ ∈ U, 1 ≤ i ≤ r, such that wᵢ ∈ T⁻¹(vᵢ). So T(wᵢ) = vᵢ, 1 ≤ i ≤ r. Consider the set

    B = {u₁, u₂, u₃, ..., uₛ, w₁, w₂, w₃, ..., wᵣ}

We claim that B is a basis for U.

To establish linear independence for B, begin with a relation of linear dependence on B. So suppose there are scalars a₁, a₂, a₃, ..., aₛ and b₁, b₂, b₃, ..., bᵣ such that

    0 = a₁u₁ + a₂u₂ + a₃u₃ + ··· + aₛuₛ + b₁w₁ + b₂w₂ + b₃w₃ + ··· + bᵣwᵣ

Then

    0 = T(0)                                                  Theorem LTTZZ
      = T(a₁u₁ + a₂u₂ + a₃u₃ + ··· + aₛuₛ +
          b₁w₁ + b₂w₂ + b₃w₃ + ··· + bᵣwᵣ)                    Definition LI
      = a₁T(u₁) + a₂T(u₂) + a₃T(u₃) + ··· + aₛT(uₛ) +
          b₁T(w₁) + b₂T(w₂) + b₃T(w₃) + ··· + bᵣT(wᵣ)         Theorem LTLC
      = a₁0 + a₂0 + a₃0 + ··· + aₛ0 +
          b₁T(w₁) + b₂T(w₂) + b₃T(w₃) + ··· + bᵣT(wᵣ)         Definition KLT
      = 0 + 0 + 0 + ··· + 0 +
          b₁T(w₁) + b₂T(w₂) + b₃T(w₃) + ··· + bᵣT(wᵣ)         Theorem ZVSM
      = b₁T(w₁) + b₂T(w₂) + b₃T(w₃) + ··· + bᵣT(wᵣ)           Property Z
      = b₁v₁ + b₂v₂ + b₃v₃ + ··· + bᵣvᵣ                       Definition PI

This is a relation of linear dependence on R (Definition RLD), and since R is a linearly independent set (Definition LI), we see that b₁ = b₂ = b₃ = ··· = bᵣ = 0. Then the original relation of linear dependence on B becomes

    0 = a₁u₁ + a₂u₂ + a₃u₃ + ··· + aₛuₛ + 0w₁ + 0w₂ + ··· + 0wᵣ
      = a₁u₁ + a₂u₂ + a₃u₃ + ··· + aₛuₛ + 0 + 0 + ··· + 0     Theorem ZSSM
      = a₁u₁ + a₂u₂ + a₃u₃ + ··· + aₛuₛ                       Property Z

But this is again a relation of linear dependence (Definition RLD), now on the set S. Since S is linearly independent (Definition LI), we have a₁ = a₂ = a₃ = ··· = aₛ = 0. Since we now know that all the scalars in the relation of linear dependence on B must be zero, we have established the linear independence of B through Definition LI.

To now establish that B spans U, choose an arbitrary vector u ∈ U. Then T(u) ∈ R(T), so there are scalars c₁, c₂, c₃, ..., cᵣ such that

    T(u) = c₁v₁ + c₂v₂ + c₃v₃ + ··· + cᵣvᵣ

Use the scalars c₁, c₂, c₃, ..., cᵣ to define a vector y ∈ U,

    y = c₁w₁ + c₂w₂ + c₃w₃ + ··· + cᵣwᵣ

Then

    T(u − y) = T(u) − T(y)                                    Theorem LTLC
             = T(u) − T(c₁w₁ + c₂w₂ + c₃w₃ + ··· + cᵣwᵣ)      Substitution
             = T(u) − (c₁T(w₁) + c₂T(w₂) + ··· + cᵣT(wᵣ))     Theorem LTLC
             = T(u) − (c₁v₁ + c₂v₂ + c₃v₃ + ··· + cᵣvᵣ)       wᵢ ∈ T⁻¹(vᵢ)
             = T(u) − T(u)                                    Substitution
             = 0                                              Property AI

So the vector u − y is sent to the zero vector by T and hence is an element of the kernel of T. As such it can be written as a linear combination of the basis vectors for K(T), the elements of the set S. So there are scalars d₁, d₂, d₃, ..., dₛ such that

    u − y = d₁u₁ + d₂u₂ + d₃u₃ + ··· + dₛuₛ

Then

    u = (u − y) + y
      = d₁u₁ + d₂u₂ + d₃u₃ + ··· + dₛuₛ + c₁w₁ + c₂w₂ + c₃w₃ + ··· + cᵣwᵣ

This says that for any vector, u, from U, there exist scalars (d₁, d₂, d₃, ..., dₛ, c₁, c₂, c₃, ..., cᵣ) that form u as a linear combination of the vectors in the set B. In other words, B spans U (Definition SS).

So B is a basis (Definition B) of U with s + r vectors, and thus

    dim(U) = s + r = n(T) + r(T)

as desired. ∎

Theorem RPNC said that the rank and nullity of a matrix sum to the number of columns of the matrix. This result is now an easy consequence of Theorem RPNDD when we consider the linear transformation T : Cⁿ → Cᵐ defined with the m × n matrix A by T(x) = Ax. The range and kernel of T are identical to the column space and null space of the matrix A (Exercise ILT.T20, Exercise SLT.T20), so the rank and nullity of the matrix A are identical to the rank and nullity of the linear transformation T. The dimension of the domain of T is the dimension of Cⁿ, exactly the number of columns of the matrix A.
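Here is a quick numerical sketch of this observation in Python with NumPy: for T(x) = Ax, the rank plus the nullity of A equals n, the number of columns (the dimension of the domain). The matrix below is an arbitrary illustration, not one of the archetypes.

    import numpy as np

    A = np.array([[1., 2., 3., 4.],
                  [2., 4., 6., 8.],    # twice the first row
                  [1., 0., 1., 0.]])
    m, n = A.shape
    rank = np.linalg.matrix_rank(A)
    nullity = n - rank                       # dimension of the null space of A
    print(rank, nullity, rank + nullity == n)   # 2 2 True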

This theorem can be especially useful in determining basic properties of linear transformations. For example, suppose that T : C⁶ → C⁶ is a linear transformation and you are able to quickly establish that the kernel is trivial. Then n(T) = 0. First, this means that T is injective by Theorem NOILT. Also, Theorem RPNDD becomes

    6 = dim(C⁶) = r(T) + n(T) = r(T) + 0 = r(T)

So the rank of T is equal to the dimension of the codomain, and by Theorem ROSLT we know T is surjective. Finally, we know T is invertible by Theorem ILTIS. So from the determination that the kernel is trivial, and consideration of various dimensions, the theorems of this section allow us to conclude the existence of an inverse linear transformation for T. Similarly, Theorem RPNDD can be used to provide alternative proofs for Theorem ILTD, Theorem SLTD and Theorem IVSED. It would be an interesting exercise to construct these proofs.

It would be instructive to study the archetypes that are linear transformations and see how many of their properties can be deduced just from considering only the dimensions of the domain and codomain. Then add in just knowledge of either the nullity or rank, and see how much more you can learn about the linear transformation. The table preceding all of the archetypes (Archetypes) could be a good place to start this analysis.


Subsection SLELT
Systems of Linear Equations and Linear Transformations

This subsection does not really belong in this section, or any other section, for that matter. It is just the right time to have a discussion about the connections between the central topic of linear algebra, linear transformations, and our motivating topic from Chapter SLE, systems of linear equations. We will discuss several theorems we have seen already, but we will also make some forward-looking statements that will be justified in Chapter R.

Archetype D and Archetype E are ideal examples to illustrate connections with linear transformations. Both have the same coefficient matrix,

        [ 2  1  7  −7]
    D = [−3  4  −5  −6]
        [ 1  1  4  −5]

To apply the theory of linear transformations to these two archetypes, employ the matrix-vector product (Definition MVP) and define the linear transformation,

    T : C⁴ → C³,  T(x) = Dx = x₁ [ 2] + x₂ [1] + x₃ [ 7] + x₄ [−7]
                                 [−3]      [4]      [−5]      [−6]
                                 [ 1]      [1]      [ 4]      [−5]

Theorem MBLT tells us that T is indeed a linear transformation. Archetype D asks for solutions to LS(D, b), where

        [  8]
    b = [−12]
        [  4]

In the language of linear transformations this is equivalent to asking for T⁻¹(b). In the language of vectors and matrices it asks for a linear combination of the four columns of D that will equal b. One solution listed is

        [7]
    w = [8]
        [1]
        [3]

With a nonempty preimage, Theorem KPI tells us that the complete solution set of the linear system is the preimage of b,

    w + K(T) = {w + z | z ∈ K(T)}

The kernel of the linear transformation T is exactly the null space of the matrix D (see Exercise ILT.T20), so this approach to the solution set should be reminiscent of Theorem PSPHS. The kernel of the linear transformation is the preimage of the zero vector, exactly equal to the solution set of the homogeneous system LS(D, 0). Since D has a null space of dimension two, every preimage (and in particular the preimage of b) is as “big” as a subspace of dimension two (but is not a subspace).

Archetype E is identical to Archetype D but with a different vector of constants,

        [2]
    d = [3]
        [2]

We can use the same linear transformation T to discuss this system of equations since the coefficient matrix is identical. Now the set of solutions to LS(D, d) is the preimage of d, T⁻¹(d). However, the vector d is not in the range of the linear transformation (nor is it in the column space of the matrix, since these two sets are equal by Exercise SLT.T20). So the empty preimage is equivalent to the inconsistency of the linear system.

These two archetypes each have three equations in four variables, so either the resulting linear systems are inconsistent, or they are consistent and application of Theorem CMVEI tells us that the system has infinitely many solutions. Considering these same parameters for the linear transformation, the dimension of the domain, C⁴, is four, while the codomain, C³, has dimension three. Then

    n(T) = dim(C⁴) − r(T)      Theorem RPNDD
         = 4 − dim(R(T))       Definition ROLT
         ≥ 4 − 3               R(T) subspace of C³
         = 1

So the kernel of T is nontrivial simply by considering the dimensions of the domain (number of variables) and the codomain (number of equations). Preimages of elements of the codomain that are not in the range of T are empty (inconsistent systems). For elements of the codomain that are in the range of T (consistent systems), Theorem KPI tells us that the preimages are built from the kernel, and with a nontrivial kernel, these preimages are infinite (infinitely many solutions).
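These observations can be sketched numerically, assuming the transcriptions of D, b, and d given above; the helper name in_range is invented for this illustration.

    import numpy as np

    D = np.array([[ 2., 1.,  7., -7.],
                  [-3., 4., -5., -6.],
                  [ 1., 1.,  4., -5.]])
    b = np.array([8., -12., 4.])   # Archetype D constants (consistent)
    d = np.array([2., 3., 2.])     # Archetype E constants (inconsistent)

    def in_range(M, v):
        # v lies in the column space of M exactly when appending it as an
        # extra column does not increase the rank.
        return np.linalg.matrix_rank(np.column_stack([M, v])) \
               == np.linalg.matrix_rank(M)

    print(in_range(D, b))   # True:  nonempty preimage, consistent system
    print(in_range(D, d))   # False: empty preimage, inconsistent system
    print(D.shape[1] - np.linalg.matrix_rank(D))   # nullity 2: kernel nontrivial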

When do systems of equations have unique solutions? Consider the system of linear equations LS(C, f) and the linear transformation S(x) = Cx. If S has a trivial kernel, then preimages will either be empty or be finite sets with single elements. Correspondingly, the coefficient matrix C will have a trivial null space and solution sets will either be empty (inconsistent) or contain a single solution (unique solution). Should the matrix be square and have a trivial null space then we recognize the matrix as being nonsingular. A square matrix means that the corresponding linear transformation, S, has equal-sized domain and codomain. With a nullity of zero, S is injective, and also Theorem RPNDD tells us that the rank of S is equal to the dimension of the domain, which in turn is equal to the dimension of the codomain. In other words, S is surjective. Injective and surjective, and Theorem ILTIS tells us that S is invertible. Just as we can use the inverse of the coefficient matrix to find the unique solution of any linear system with a nonsingular coefficient matrix (Theorem SNCM), we can use the inverse of the linear transformation to construct the unique element of any preimage (proof of Theorem ILTIS).

The executive summary of this discussion is that to every coefficient matrix of a system of linear equations we can associate a natural linear transformation. Solution sets for systems with this coefficient matrix are preimages of elements of the codomain of the linear transformation. For every theorem about systems of linear equations there is an analogue about linear transformations. The theory of linear transformations provides all the tools to recreate the theory of solutions to linear systems of equations.

We will continue this adventure in Chapter R.

Reading Questions

1. What conditions allow us to easily determine if a linear transformation is invertible?

2. What does it mean to say two vector spaces are isomorphic? Both technically, and informally?

3. How do linear transformations relate to systems of linear equations?

Exercises

C10 The archetypes below are linear transformations of the form T : U → V that are invertible. For each, the inverse linear transformation is given explicitly as part of the archetype’s description. Verify for each linear transformation that

    T⁻¹ ◦ T = I_U        T ◦ T⁻¹ = I_V

Archetype R, Archetype V, Archetype W

C20† Determine if the linear transformation T : P₂ → M₂₂ is (a) injective, (b) surjective, (c) invertible.

    T(a + bx + cx²) = [a + 2b − 2c    2a + 2b     ]
                      [−a + b − 4c    3a + 2b + 2c]

C21† Determine if the linear transformation S : P₃ → M₂₂ is (a) injective, (b) surjective, (c) invertible.

    S(a + bx + cx² + dx³) = [−a + 4b + c + 2d    4a − b + 6c − d]
                            [a + 5b − 2c + 2d    a + 2c + 5d    ]

C25 For each linear transformation below: (a) Find the matrix representation of T, (b) Calculate n(T), (c) Calculate r(T), (d) Graph the image in either R² or R³ as appropriate, (e) How many dimensions are lost?, and (f) How many dimensions are preserved? (Vectors are written here as transposed rows to save space.)

1. T : C³ → C³ given by T([x y z]ᵗ) = [x x x]ᵗ

2. T : C³ → C³ given by T([x y z]ᵗ) = [x y 0]ᵗ

3. T : C³ → C² given by T([x y z]ᵗ) = [x x]ᵗ

4. T : C³ → C² given by T([x y z]ᵗ) = [x y]ᵗ

5. T : C² → C³ given by T([x y]ᵗ) = [x y 0]ᵗ

6. T : C² → C³ given by T([x y]ᵗ) = [x y x+y]ᵗ
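As a sanity check for this kind of exercise (not a solution), here is a sketch computing the rank and nullity for the first transformation via its standard matrix; the matrix transcription is my own.

    import numpy as np

    # Part (1) of C25: T([x, y, z]) = [x, x, x], standard matrix below.
    A = np.array([[1., 0., 0.],
                  [1., 0., 0.],
                  [1., 0., 0.]])
    r = np.linalg.matrix_rank(A)    # r(T) = 1
    n = A.shape[1] - r              # n(T) = 2
    print(r, n)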

C50† Consider the linear transformation S : M₁₂ → P₁ from the set of 1 × 2 matrices to the set of polynomials of degree at most 1, defined by

    S([a  b]) = (3a + b) + (5a + 2b)x

Prove that S is invertible. Then show that the linear transformation

    R : P₁ → M₁₂,  R(r + sx) = [(2r − s)  (−5r + 3s)]

is the inverse of S, that is, S⁻¹ = R.
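One direction of this claim (that R ◦ S is the identity on M₁₂) can be sketched symbolically as follows; checking S ◦ R is similar.

    import sympy as sp

    # R(S([a b])) should simplify back to [a b].
    a, b = sp.symbols('a b')
    r, s = 3*a + b, 5*a + 2*b                          # S([a b]) = r + s*x
    print(sp.expand(2*r - s), sp.expand(-5*r + 3*s))   # a b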

M30† The linear transformation S below is invertible. Find a formula for the inverse linear transformation, S⁻¹.

    S : P₁ → M₁₂,  S(a + bx) = [3a + b  2a + b]

M31† The linear transformation R : M₁₂ → M₂₁ is invertible. Determine a formula for the inverse linear transformation R⁻¹ : M₂₁ → M₁₂.

    R([a  b]) = [a + 3b  ]
                [4a + 11b]
4a +11b<br />

M50 Rework Example CIVLT, only <strong>in</strong> place of the basis B for P 2 , choose <strong>in</strong>stead to use<br />

the basis C = { 1, 1+x, 1+x + x 2} . This will complicate writ<strong>in</strong>g a generic element of the<br />

doma<strong>in</strong> of T −1 as a l<strong>in</strong>ear comb<strong>in</strong>ation of the basis elements, and the algebra will be a bit<br />

messier, but <strong>in</strong> the end you should obta<strong>in</strong> the same formula for T −1 . The <strong>in</strong>verse l<strong>in</strong>ear<br />

transformation is what it is, and the choice of a particular basis should not <strong>in</strong>fluence the<br />

outcome.<br />

M60 Suppose U and V are vector spaces. Define the function Z : U → V by Z(u) = 0_V for every u ∈ U. Then by Exercise LT.M60, Z is a linear transformation. Formulate a condition on U and V that is equivalent to Z being an invertible linear transformation. In other words, fill in the blank to complete the following statement (and then give a proof): Z is invertible if and only if U and V are ______. (See Exercise ILT.M60, Exercise SLT.M60, Exercise MR.M60.)

T05 Prove that the identity linear transformation (Definition IDLT) is both injective and surjective, and hence invertible.

T15† Suppose that T : U → V is a surjective linear transformation and dim(U) = dim(V). Prove that T is injective.

T16 Suppose that T : U → V is an injective linear transformation and dim(U) = dim(V). Prove that T is surjective.

T30† Suppose that U and V are isomorphic vector spaces, not of dimension zero. Prove that there are infinitely many isomorphisms between U and V.

T40† Suppose T : U → V and S : V → W are linear transformations and dim(U) = dim(V) = dim(W). Suppose that S ◦ T is invertible. Prove that S and T are individually invertible (this could be construed as a converse of Theorem CIVLT).


Chapter R
Representations

Previous work with linear transformations may have convinced you that we can convert most questions about linear transformations into questions about systems of equations or properties of subspaces of Cᵐ. In this section we begin to make these vague notions precise. We have used the word “representation” prior, but it will get a heavy workout in this chapter. In many ways, everything we have studied so far was in preparation for this chapter.

Section VR
Vector Representations

You may have noticed that many questions about elements of abstract vector spaces eventually become questions about column vectors or systems of equations. Example SM32 would be an example of this. We will make this vague idea more precise in this section.

Subsection VR
Vector Representation

We begin by establishing an invertible linear transformation between any vector space V of dimension n and Cⁿ. This will allow us to “go back and forth” between the two vector spaces, no matter how abstract the definition of V might be.

Definition VR Vector Representation
Suppose that V is a vector space with a basis B = {v₁, v₂, v₃, ..., vₙ}. Define a function ρ_B : V → Cⁿ as follows. For w ∈ V define the column vector ρ_B(w) ∈ Cⁿ by

    w = [ρ_B(w)]₁v₁ + [ρ_B(w)]₂v₂ + [ρ_B(w)]₃v₃ + ··· + [ρ_B(w)]ₙvₙ                □

This definition looks more complicated than it really is, though the form above will be useful in proofs. Simply stated, given w ∈ V, we write w as a linear combination of the basis elements of B. It is key to realize that Theorem VRRB guarantees that we can do this for every w, and furthermore this expression as a linear combination is unique. The resulting scalars are just the entries of the vector ρ_B(w). This discussion should convince you that ρ_B is “well-defined” as a function. We can determine a precise output for any input. Now we want to establish that ρ_B is a function with additional properties — it is a linear transformation.

Theorem VRLT Vector Representation is a Linear Transformation
The function ρ_B (Definition VR) is a linear transformation.

Proof. We will take a novel approach in this proof. We will construct another function, which we will easily determine is a linear transformation, and then show that this second function is really ρ_B in disguise. Here we go.

Since B is a basis, we can define T : V → Cⁿ to be the unique linear transformation such that T(vᵢ) = eᵢ, 1 ≤ i ≤ n, as guaranteed by Theorem LTDB, and where the eᵢ are the standard unit vectors (Definition SUV). Then suppose for an arbitrary w ∈ V we have,

    [T(w)]ᵢ = [T(Σ_{j=1}^{n} [ρ_B(w)]ⱼ vⱼ)]ᵢ                       Definition VR
            = [Σ_{j=1}^{n} [ρ_B(w)]ⱼ T(vⱼ)]ᵢ                       Theorem LTLC
            = [Σ_{j=1}^{n} [ρ_B(w)]ⱼ eⱼ]ᵢ                          (T(vⱼ) = eⱼ)
            = Σ_{j=1}^{n} [ρ_B(w)]ⱼ [eⱼ]ᵢ                          Definition CVA, Definition CVSM
            = [ρ_B(w)]ᵢ[eᵢ]ᵢ + Σ_{j=1, j≠i}^{n} [ρ_B(w)]ⱼ[eⱼ]ᵢ     Property CC
            = [ρ_B(w)]ᵢ(1) + Σ_{j=1, j≠i}^{n} [ρ_B(w)]ⱼ(0)         Definition SUV
            = [ρ_B(w)]ᵢ

As column vectors, Definition CVE implies that T(w) = ρ_B(w). Since w was an arbitrary element of V, as functions T = ρ_B. Now, since T is known to be a linear transformation, it must follow that ρ_B is also a linear transformation. ∎

The proof of Theorem VRLT provides an alternate definition of vector representation relative to a basis B that we could state as a corollary (Proof Technique LC): ρ_B is the unique linear transformation that takes B to the standard unit basis.

Example VRC4 Vector representation in C⁴
Consider the vector y ∈ C⁴,

        [ 6]
    y = [14]
        [ 6]
        [ 7]

We will find several vector representations of y in this example. Notice that y never changes, but the representations of y do change. One basis for C⁴ is

    B = {u₁, u₂, u₃, u₄} = { [−2]   [ 3]   [1]   [4]
                             [ 1]   [−6]   [2]   [3]
                             [ 2] , [ 2] , [0] , [1] }
                             [−3]   [−4]   [5]   [6]

as can be seen by making these vectors the columns of a matrix, checking that the matrix is nonsingular and applying Theorem CNMB. To find ρ_B(y), we need to find scalars, a₁, a₂, a₃, a₄ such that

    y = a₁u₁ + a₂u₂ + a₃u₃ + a₄u₄

By Theorem SLSLC the desired scalars are a solution to the linear system of equations with a coefficient matrix whose columns are the vectors in B and with a vector of constants y. With a nonsingular coefficient matrix, the solution is unique, but this is no surprise as this is the content of Theorem VRRB. This unique solution is

    a₁ = 2    a₂ = −1    a₃ = −3    a₄ = 4

Then by Definition VR, we have

             [ 2]
    ρ_B(y) = [−1]
             [−3]
             [ 4]

Suppose now that we construct a representation of y relative to another basis of C⁴,

    C = { [−15]   [ 16]   [−26]   [ 14]
          [  9]   [−14]   [ 14]   [−13]
          [ −4] , [  5] , [ −6] , [  4] }
          [ −2]   [  2]   [ −3]   [  6]

As with B, it is easy to check that C is a basis. Writing y as a linear combination of the vectors in C leads to solving a system of four equations in the four unknown scalars with a nonsingular coefficient matrix. The unique solution can be expressed as

        [ 6]         [−15]        [ 16]       [−26]      [ 14]
    y = [14] = (−28) [  9] + (−8) [−14] + 11  [ 14] + 0  [−13]
        [ 6]         [ −4]        [  5]       [ −6]      [  4]
        [ 7]         [ −2]        [  2]       [ −3]      [  6]

so that Definition VR gives

             [−28]
    ρ_C(y) = [ −8]
             [ 11]
             [  0]

We often perform representations relative to standard bases, but for vectors in Cᵐ this is a little silly. Let us find the vector representation of y relative to the standard basis (Theorem SUVB),

    D = {e₁, e₂, e₃, e₄}

Then, without any computation, we can check that

        [ 6]
    y = [14] = 6e₁ + 14e₂ + 6e₃ + 7e₄
        [ 6]
        [ 7]

so by Definition VR,

             [ 6]
    ρ_D(y) = [14]
             [ 6]
             [ 7]

which is not very exciting. Notice however that the order in which we place the vectors in the basis is critical to the representation. Let us keep the standard unit vectors as our basis, but rearrange the order we place them in the basis. So a fourth basis is

    E = {e₃, e₄, e₂, e₁}

Then,

        [ 6]
    y = [14] = 6e₃ + 7e₄ + 14e₂ + 6e₁
        [ 6]
        [ 7]

so by Definition VR,

             [ 6]
    ρ_E(y) = [ 7]
             [14]
             [ 6]

So for every possible basis of C⁴ we could construct a different representation of y. △
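Finding ρ_B(y) is exactly the kind of computation Theorem SLSLC reduces to a linear solve, so a numerical sketch is short; the matrix below has the basis vectors of B from the example as its columns.

    import numpy as np

    # Columns are u1, u2, u3, u4 from Example VRC4.
    B = np.array([[-2.,  3., 1., 4.],
                  [ 1., -6., 2., 3.],
                  [ 2.,  2., 0., 1.],
                  [-3., -4., 5., 6.]])
    y = np.array([6., 14., 6., 7.])
    print(np.linalg.solve(B, y))   # [ 2. -1. -3.  4.], i.e. rho_B(y)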

Vector representations are most interesting for vector spaces that are not Cᵐ.

Example VRP2 Vector representations in P₂
Consider the vector u = 15 + 10x − 6x² ∈ P₂ from the vector space of polynomials with degree at most 2 (Example VSP). A nice basis for P₂ is

    B = {1, x, x²}

so that

    u = 15 + 10x − 6x² = 15(1) + 10(x) + (−6)(x²)

so by Definition VR

             [15]
    ρ_B(u) = [10]
             [−6]

Another nice basis for P₂ is

    C = {1, 1 + x, 1 + x + x²}

so that now it takes a bit of computation to determine the scalars for the representation. We want a₁, a₂, a₃ so that

    15 + 10x − 6x² = a₁(1) + a₂(1 + x) + a₃(1 + x + x²)

Performing the operations in P₂ on the right-hand side, and equating coefficients, gives the three equations in the three unknown scalars,

    15 = a₁ + a₂ + a₃
    10 = a₂ + a₃
    −6 = a₃

The coefficient matrix of this system is nonsingular, leading to a unique solution (no surprise there, see Theorem VRRB),

    a₁ = 5    a₂ = 16    a₃ = −6

so by Definition VR

             [ 5]
    ρ_C(u) = [16]
             [−6]

While we often form vector representations relative to “nice” bases, nothing prevents us from forming representations relative to “nasty” bases. For example, the set

    D = {−2 − x + 3x², 1 − 2x², 5 + 4x + x²}

can be verified as a basis of P₂ by checking linear independence with Definition LI and then arguing that 3 vectors from P₂, a vector space of dimension 3 (Theorem DP), must also be a spanning set (Theorem G).

Now we desire scalars a₁, a₂, a₃ so that

    15 + 10x − 6x² = a₁(−2 − x + 3x²) + a₂(1 − 2x²) + a₃(5 + 4x + x²)

Performing the operations in P₂ on the right-hand side, and equating coefficients, gives the three equations in the three unknown scalars,

    15 = −2a₁ + a₂ + 5a₃
    10 = −a₁ + 4a₃
    −6 = 3a₁ − 2a₂ + a₃

The coefficient matrix of this system is nonsingular, leading to a unique solution (no surprise there, see Theorem VRRB),

    a₁ = −2    a₂ = 1    a₃ = 2

so by Definition VR

             [−2]
    ρ_D(u) = [ 1]
             [ 2]

△
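The coefficient-matching step in this example mechanizes nicely; here is a sketch in Python with SymPy for the representation relative to C (the representations relative to B and D work the same way).

    import sympy as sp

    # Coordinates of u = 15 + 10x - 6x^2 relative to C = {1, 1+x, 1+x+x^2}:
    # equate coefficients of the two sides and solve the resulting system.
    x, a1, a2, a3 = sp.symbols('x a1 a2 a3')
    u = 15 + 10*x - 6*x**2
    combo = a1*1 + a2*(1 + x) + a3*(1 + x + x**2)
    sol = sp.solve(sp.Poly(combo - u, x).all_coeffs(), [a1, a2, a3])
    print(sol)   # {a1: 5, a2: 16, a3: -6}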

Theorem VRI Vector Representation is Injective
The function ρ_B (Definition VR) is an injective linear transformation.

Proof. We will appeal to Theorem KILT. Suppose U is a vector space of dimension n, so vector representation is ρ_B : U → Cⁿ. Let B = {u₁, u₂, u₃, ..., uₙ} be the basis of U used in the definition of ρ_B. Suppose u ∈ K(ρ_B). We write u as a linear combination of the vectors in the basis B where the scalars are the components of the vector representation, ρ_B(u).

    u = [ρ_B(u)]₁u₁ + [ρ_B(u)]₂u₂ + ··· + [ρ_B(u)]ₙuₙ      Definition VR
      = [0]₁u₁ + [0]₂u₂ + ··· + [0]ₙuₙ                      Definition KLT
      = 0u₁ + 0u₂ + ··· + 0uₙ                               Definition ZCV
      = 0 + 0 + ··· + 0                                     Theorem ZSSM
      = 0                                                   Property Z

Thus an arbitrary vector, u, from the kernel, K(ρ_B), must equal the zero vector of U. So K(ρ_B) = {0} and by Theorem KILT, ρ_B is injective. ∎

Theorem VRS Vector Representation is Surjective
The function ρ_B (Definition VR) is a surjective linear transformation.

Proof. We will appeal to Theorem RSLT. Suppose U is a vector space of dimension n, so vector representation is ρ_B : U → Cⁿ. Let B = {u₁, u₂, u₃, ..., uₙ} be the basis of U used in the definition of ρ_B. Suppose v ∈ Cⁿ. Define the vector u by

    u = [v]₁u₁ + [v]₂u₂ + [v]₃u₃ + ··· + [v]ₙuₙ

Then for 1 ≤ i ≤ n, by Definition VR,

    [ρ_B(u)]ᵢ = [ρ_B([v]₁u₁ + [v]₂u₂ + [v]₃u₃ + ··· + [v]ₙuₙ)]ᵢ = [v]ᵢ

so the entries of the vectors ρ_B(u) and v are equal and Definition CVE yields the vector equality ρ_B(u) = v. This demonstrates that v ∈ R(ρ_B), so Cⁿ ⊆ R(ρ_B). Since R(ρ_B) ⊆ Cⁿ by Definition RLT, we have R(ρ_B) = Cⁿ and Theorem RSLT says ρ_B is surjective. ∎

We will have many occasions later to employ the inverse of vector representation, so we will record the fact that vector representation is an invertible linear transformation.

Theorem VRILT Vector Representation is an Invertible Linear Transformation
The function ρ_B (Definition VR) is an invertible linear transformation.

Proof. The function ρ_B (Definition VR) is a linear transformation (Theorem VRLT) that is injective (Theorem VRI) and surjective (Theorem VRS) with domain V and codomain Cⁿ. By Theorem ILTIS we then know that ρ_B is an invertible linear transformation. ∎

Informally, we will refer to the application of ρ_B as coordinatizing a vector, while the application of ρ_B⁻¹ will be referred to as un-coordinatizing a vector.


Subsection CVS
Characterization of Vector Spaces

Limiting our attention to vector spaces with finite dimension, we now describe every possible vector space. All of them. Really.

Theorem CFDVS Characterization of Finite Dimensional Vector Spaces
Suppose that V is a vector space with dimension n. Then V is isomorphic to Cⁿ.

Proof. Since V has dimension n we can find a basis of V of size n (Definition D) which we will call B. The linear transformation ρ_B is an invertible linear transformation from V to Cⁿ, so by Definition IVS, we have that V and Cⁿ are isomorphic. ∎

Theorem CFDVS is the first of several surprises in this chapter, though it might be a bit demoralizing too. It says that there really are not all that many different (finite dimensional) vector spaces, and none are really any more complicated than Cⁿ. Hmmm. The following examples should make this point.

Example TIVS Two isomorphic vector spaces
The vector space of polynomials with degree 8 or less, P₈, has dimension 9 (Theorem DP). By Theorem CFDVS, P₈ is isomorphic to C⁹. △

Example CVSR Crazy vector space revealed
The crazy vector space, C of Example CVS, has dimension 2 by Example DC. By Theorem CFDVS, C is isomorphic to C². Hmmmm. Not really so crazy after all? △

Example ASC A subspace characterized
In Example DSP4 we determined that a certain subspace W of P₄ has dimension 4. By Theorem CFDVS, W is isomorphic to C⁴. △

Theorem IFDVS Isomorphism of Finite Dimensional Vector Spaces
Suppose U and V are both finite-dimensional vector spaces. Then U and V are isomorphic if and only if dim(U) = dim(V).

Proof. (⇒) This is just the statement proved in Theorem IVSED.

(⇐) This is the advertised converse of Theorem IVSED. We will assume U and V have equal dimension and discover that they are isomorphic vector spaces. Let n be the common dimension of U and V. Then by Theorem CFDVS there are isomorphisms T : U → Cⁿ and S : V → Cⁿ.

T is therefore an invertible linear transformation by Definition IVS. Similarly, S is an invertible linear transformation, and so S⁻¹ is an invertible linear transformation (Theorem IILT). The composition of invertible linear transformations is again invertible (Theorem CIVLT) so the composition of S⁻¹ with T is invertible. Then (S⁻¹ ◦ T) : U → V is an invertible linear transformation from U to V and Definition IVS says U and V are isomorphic. ∎


Example MIVS Multiple isomorphic vector spaces
C¹⁰, P₉, M₂₅ and M₅₂ are all vector spaces and each has dimension 10. By Theorem IFDVS each is isomorphic to any other.

The subspace of M₄₄ that contains all the symmetric matrices (Definition SYM) has dimension 10, so this subspace is also isomorphic to each of the four vector spaces above. △

Subsection CP
Coordinatization Principle

With ρ_B available as an invertible linear transformation, we can translate between vectors in a vector space U of dimension m and Cᵐ. Furthermore, as a linear transformation, ρ_B respects the addition and scalar multiplication in U, while ρ_B⁻¹ respects the addition and scalar multiplication in Cᵐ. Since our definitions of linear independence, spans, bases and dimension are all built up from linear combinations, we will finally be able to translate fundamental properties between abstract vector spaces (U) and concrete vector spaces (Cᵐ).

Theorem CLI Coordinatization and Linear Independence
Suppose that U is a vector space with a basis B of size n. Then

    S = {u₁, u₂, u₃, ..., uₖ}

is a linearly independent subset of U if and only if

    R = {ρ_B(u₁), ρ_B(u₂), ρ_B(u₃), ..., ρ_B(uₖ)}

is a linearly independent subset of Cⁿ.

Proof. The linear transformation ρ_B is an isomorphism between U and Cⁿ (Theorem VRILT). As an invertible linear transformation, ρ_B is an injective linear transformation (Theorem ILTIS), and ρ_B⁻¹ is also an injective linear transformation (Theorem IILT, Theorem ILTIS).

(⇒) Since ρ_B is an injective linear transformation and S is linearly independent, Theorem ILTLI says that R is linearly independent.

(⇐) If we apply ρ_B⁻¹ to each element of R, we will create the set S. Since we are assuming R is linearly independent and ρ_B⁻¹ is injective, Theorem ILTLI says that S is linearly independent. ∎

Theorem CSS Coordinatization and Spanning Sets
Suppose that U is a vector space with a basis B of size n. Then

    u ∈ 〈{u_1, u_2, u_3, ..., u_k}〉

if and only if

    ρ_B(u) ∈ 〈{ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}〉

Proof. (⇒) Suppose u ∈ 〈{u_1, u_2, u_3, ..., u_k}〉. Then we know there are scalars, a_1, a_2, a_3, ..., a_k, such that

    u = a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_k u_k

Then, by Theorem LTLC,

    ρ_B(u) = ρ_B(a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_k u_k)
           = a_1 ρ_B(u_1) + a_2 ρ_B(u_2) + a_3 ρ_B(u_3) + ··· + a_k ρ_B(u_k)

which says that ρ_B(u) ∈ 〈{ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}〉.

(⇐) Suppose that ρ_B(u) ∈ 〈{ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}〉. Then there are scalars b_1, b_2, b_3, ..., b_k such that

    ρ_B(u) = b_1 ρ_B(u_1) + b_2 ρ_B(u_2) + b_3 ρ_B(u_3) + ··· + b_k ρ_B(u_k)

Recall that ρ_B is invertible (Theorem VRILT), so

    u = I_U(u)                                                       Definition IDLT
      = (ρ_B^{-1} ∘ ρ_B)(u)                                          Definition IVLT
      = ρ_B^{-1}(ρ_B(u))                                             Definition LTC
      = ρ_B^{-1}(b_1 ρ_B(u_1) + b_2 ρ_B(u_2) + ··· + b_k ρ_B(u_k))
      = b_1 ρ_B^{-1}(ρ_B(u_1)) + b_2 ρ_B^{-1}(ρ_B(u_2)) + ··· + b_k ρ_B^{-1}(ρ_B(u_k))   Theorem LTLC
      = b_1 I_U(u_1) + b_2 I_U(u_2) + ··· + b_k I_U(u_k)             Definition IVLT
      = b_1 u_1 + b_2 u_2 + b_3 u_3 + ··· + b_k u_k                  Definition IDLT

which says that u ∈ 〈{u_1, u_2, u_3, ..., u_k}〉. ■

Here is a fairly simple example that illustrates a very, very important idea.

Example CP2 Coordinatizing in P_2
In Example VRP2 we needed to know that

    D = {−2 − x + 3x^2, 1 − 2x^2, 5 + 4x + x^2}

is a basis for P_2. With Theorem CLI and Theorem CSS this task is much easier.

First, choose a known basis for P_2, a basis that forms vector representations easily. We will choose

    B = {1, x, x^2}

Now, form the subset of C^3 that is the result of applying ρ_B to each element of D,

    F = {ρ_B(−2 − x + 3x^2), ρ_B(1 − 2x^2), ρ_B(5 + 4x + x^2)}

        ⎧⎡−2⎤  ⎡ 1⎤  ⎡5⎤⎫
      = ⎨⎢−1⎥, ⎢ 0⎥, ⎢4⎥⎬
        ⎩⎣ 3⎦  ⎣−2⎦  ⎣1⎦⎭

and ask if F is a linearly independent spanning set for C^3. This is easily seen to be the case by forming a matrix A whose columns are the vectors of F, row-reducing A to the identity matrix I_3, and then using the nonsingularity of A to assert that F is a basis for C^3 (Theorem CNMB). Now, since F is a basis for C^3, Theorem CLI and Theorem CSS tell us that D is also a basis for P_2. △
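The nonsingularity check at the heart of Example CP2 is a one-liner on a machine. A minimal sketch in Python with sympy (the library is an assumption of this illustration):

    from sympy import Matrix

    # Columns are the coordinate vectors of D relative to B = {1, x, x^2}.
    A = Matrix([[-2, 1, 5],
                [-1, 0, 4],
                [ 3, -2, 1]])

    print(A.rref()[0])   # the 3x3 identity, so A is nonsingular
    print(A.det())       # 7, nonzero, which says the same thing

Either output certifies that F is a basis for C^3, and hence, by Theorem CLI and Theorem CSS, that D is a basis for P_2.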

Example CP2 illustrates the broad notion that computations in abstract vector spaces can be reduced to computations in C^m. You may have noticed this phenomenon as you worked through examples in Chapter VS or Chapter LT employing vector spaces of matrices or polynomials. These computations seemed to invariably result in systems of equations or the like from Chapter SLE, Chapter V and Chapter M. It is vector representation, ρ_B, that allows us to make this connection formal and precise.

Knowing that vector representation allows us to translate questions about linear combinations, linear independence and spans from general vector spaces to C^m allows us to prove a great many theorems about how to translate other properties. Rather than prove these theorems, each of the same style as the other, we will offer some general guidance about how to best employ Theorem VRLT, Theorem CLI and Theorem CSS. This comes in the form of a "principle": a basic truth, but most definitely not a theorem (hence, no proof).

The Coordinatization Principle
Suppose that U is a vector space with a basis B of size n. Then any question about U, or its elements, which ultimately depends on the vector addition or scalar multiplication in U, or depends on linear independence or spanning, may be translated into the same question in C^n by application of the linear transformation ρ_B to the relevant vectors. Once the question is answered in C^n, the answer may be translated back to U through application of the inverse linear transformation ρ_B^{-1} (if necessary).
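As a small illustration of the principle applied to a spanning question (Theorem CSS), here is a hedged Python/sympy sketch; the polynomials are hypothetical, chosen only for this illustration. We test whether 4 + 3x − x^2 lies in 〈{1 + x, 1 − x^2}〉 by answering the same question with coordinate vectors in C^3.

    from sympy import Matrix, linsolve, symbols

    # Coordinates relative to the standard basis {1, x, x^2} of P_2.
    u  = Matrix([4, 3, -1])    # rho_B(4 + 3x - x^2)
    v1 = Matrix([1, 1, 0])     # rho_B(1 + x)
    v2 = Matrix([1, 0, -1])    # rho_B(1 - x^2)

    a, b = symbols('a b')
    sol = linsolve((Matrix.hstack(v1, v2), u), [a, b])
    print(sol)   # {(3, 1)}: nonempty, so u is in the span

A nonempty solution set in C^3 translates back, via ρ_B^{-1}, to the statement 4 + 3x − x^2 = 3(1 + x) + 1(1 − x^2) in P_2.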

Example CM32 Coordinatization in M_{32}
This is a simple example of the Coordinatization Principle, depending only on the fact that coordinatizing is an invertible linear transformation (Theorem VRILT). Suppose we have a linear combination to perform in M_{32}, the vector space of 3 × 2 matrices, but we are averse to doing the operations of M_{32} (Definition MA, Definition MSM). More specifically, suppose we are faced with the computation

      ⎡ 3  7⎤     ⎡−1  3⎤
    6 ⎢−2  4⎥ + 2 ⎢ 4  8⎥
      ⎣ 0 −3⎦     ⎣−2  5⎦

We choose a nice basis for M_{32} (or a nasty basis if we are so inclined),

        ⎧⎡1 0⎤  ⎡0 0⎤  ⎡0 0⎤  ⎡0 1⎤  ⎡0 0⎤  ⎡0 0⎤⎫
    B = ⎨⎢0 0⎥, ⎢1 0⎥, ⎢0 0⎥, ⎢0 0⎥, ⎢0 1⎥, ⎢0 0⎥⎬
        ⎩⎣0 0⎦  ⎣0 0⎦  ⎣1 0⎦  ⎣0 0⎦  ⎣0 0⎦  ⎣0 1⎦⎭

and apply ρ_B to each vector in the linear combination. This gives us a new computation, now in the vector space C^6, which we can compute with operations in C^6 (Definition CVA, Definition CVSM),

      ⎡ 3⎤     ⎡−1⎤   ⎡16⎤
      ⎢−2⎥     ⎢ 4⎥   ⎢−4⎥
    6 ⎢ 0⎥ + 2 ⎢−2⎥ = ⎢−4⎥
      ⎢ 7⎥     ⎢ 3⎥   ⎢48⎥
      ⎢ 4⎥     ⎢ 8⎥   ⎢40⎥
      ⎣−3⎦     ⎣ 5⎦   ⎣−8⎦

We are after the result of a computation in M_{32}, so we now can apply ρ_B^{-1} to obtain a 3 × 2 matrix,

       ⎡1 0⎤        ⎡0 0⎤        ⎡0 0⎤      ⎡0 1⎤      ⎡0 0⎤        ⎡0 0⎤   ⎡16 48⎤
    16 ⎢0 0⎥ + (−4) ⎢1 0⎥ + (−4) ⎢0 0⎥ + 48 ⎢0 0⎥ + 40 ⎢0 1⎥ + (−8) ⎢0 0⎥ = ⎢−4 40⎥
       ⎣0 0⎦        ⎣0 0⎦        ⎣1 0⎦      ⎣0 0⎦      ⎣0 0⎦        ⎣0 1⎦   ⎣−4 −8⎦

which is exactly the matrix we would have computed had we just performed the matrix operations in the first place. So this was not meant to be an easier way to compute a linear combination of two matrices, just a different way. △
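The same round trip is easy to script. Below is a minimal sketch in Python with sympy (an assumed tool, not part of the text), with ρ_B and ρ_B^{-1} written for the column-by-column basis ordering used above.

    from sympy import Matrix

    M1 = Matrix([[3, 7], [-2, 4], [0, -3]])
    M2 = Matrix([[-1, 3], [4, 8], [-2, 5]])

    def rho(M):
        # coordinatize: stack column 1 over column 2
        return Matrix.vstack(M[:, 0], M[:, 1])

    def rho_inv(v):
        # un-coordinatize: rebuild the two columns
        return Matrix.hstack(v[0:3, 0], v[3:6, 0])

    w = 6 * rho(M1) + 2 * rho(M2)   # the computation, moved to C^6
    print(w.T)                      # [16, -4, -4, 48, 40, -8]
    print(rho_inv(w))               # Matrix([[16, 48], [-4, 40], [-4, -8]])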

Reading Questions

1. The vector space of 3 × 5 matrices, M_{35}, is isomorphic to what fundamental vector space?

2. A basis for C^3 is

       ⎧⎡ 1⎤  ⎡ 3⎤  ⎡1⎤⎫
   B = ⎨⎢ 2⎥, ⎢−1⎥, ⎢1⎥⎬
       ⎩⎣−1⎦  ⎣ 2⎦  ⎣1⎦⎭

   Compute ρ_B([5, 8, −1]^t).

3. What is the first "surprise," and why is it surprising?

Exercises

C10† In the vector space C^3, compute the vector representation ρ_B(v) for the basis B and vector v below.

       ⎧⎡ 2⎤  ⎡1⎤  ⎡3⎤⎫        ⎡11⎤
   B = ⎨⎢−2⎥, ⎢3⎥, ⎢5⎥⎬    v = ⎢ 5⎥
       ⎩⎣ 2⎦  ⎣1⎦  ⎣2⎦⎭        ⎣ 8⎦

C20† Rework Example CM32 replacing the basis B by the basis

       ⎧⎡−14 −9⎤  ⎡−7 −4⎤  ⎡−3 −1⎤  ⎡−7 −4⎤  ⎡ 4  2⎤  ⎡ 0  0⎤⎫
   C = ⎨⎢ 10 10⎥, ⎢ 5  5⎥, ⎢ 0 −2⎥, ⎢ 3  2⎥, ⎢−3 −3⎥, ⎢−1 −2⎥⎬
       ⎩⎣ −6 −2⎦  ⎣−3 −1⎦  ⎣ 1  1⎦  ⎣−1  0⎦  ⎣ 2  1⎦  ⎣ 1  1⎦⎭

M10 Prove that the set S below is a basis for the vector space of 2 × 2 matrices, M_{22}. Do this by choosing a natural basis for M_{22} and coordinatizing the elements of S with respect to this basis. Examine the resulting set of column vectors from C^4 and apply the Coordinatization Principle.

   S = { [33 99; 78 −9], [−16 −47; −36 2], [10 27; 17 3], [−2 −7; −6 4] }

M20† The set B = {v_1, v_2, v_3, v_4} is a basis of the vector space P_3, polynomials with degree 3 or less. Therefore ρ_B is a linear transformation, according to Theorem VRLT. Find a "formula" for ρ_B. In other words, find an expression for ρ_B(a + bx + cx^2 + dx^3).

   v_1 = 1 − 5x − 22x^2 + 3x^3      v_2 = −2 + 11x + 49x^2 − 8x^3
   v_3 = −1 + 7x + 33x^2 − 8x^3     v_4 = −1 + 4x + 16x^2 + x^3
v 3 = −1+7x +33x 2 − 8x 3 v 4 = −1+4x +16x 2 + x 3


Section MR
Matrix Representations

We have seen that linear transformations whose domain and codomain are vector spaces of column vectors have a close relationship with matrices (Theorem MBLT, Theorem MLTCV). In this section, we will extend the relationship between matrices and linear transformations to the setting of linear transformations between abstract vector spaces.

Subsection MR
Matrix Representations

This is a fundamental definition.

Definition MR Matrix Representation
Suppose that T: U → V is a linear transformation, B = {u_1, u_2, u_3, ..., u_n} is a basis for U of size n, and C is a basis for V of size m. Then the matrix representation of T relative to B and C is the m × n matrix,

    M^T_{B,C} = [ ρ_C(T(u_1)) | ρ_C(T(u_2)) | ρ_C(T(u_3)) | ... | ρ_C(T(u_n)) ]   □
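Definition MR is a recipe, and it can be transcribed almost verbatim into code. The sketch below is a minimal, hypothetical helper in Python with sympy: it assumes you supply the transformation T as a function, the basis B as a list of domain objects, and a coordinatization function coord_C implementing ρ_C. None of these names come from the text; they are illustrative only.

    from sympy import Matrix

    def matrix_rep(T, B, coord_C):
        """Build M^T_{B,C} column by column, per Definition MR:
        column j is coord_C(T(u_j)) for the j-th basis vector u_j."""
        return Matrix.hstack(*[coord_C(T(u)) for u in B])

A concrete use of this recipe appears after Example OLTTR below.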

Example OLTTR One linear transformation, three representations
Consider the linear transformation, S: P_3 → M_{22}, given by

    S(a + bx + cx^2 + dx^3) = ⎡ 3a + 7b − 2c − 5d    8a + 14b − 2c − 11d⎤
                              ⎣−4a − 8b + 2c + 6d   12a + 22b − 4c − 17d⎦

First, we build a representation relative to the bases,

    B = {1 + 2x + x^2 − x^3, 1 + 3x + x^2 + x^3, −1 − 2x + 2x^3, 2 + 3x + 2x^2 − 5x^3}
    C = { [1 1; 1 2], [2 3; 2 5], [−1 −1; 0 −2], [−1 −4; −2 −4] }

We evaluate S with each element of the basis for the domain, B, and coordinatize the result relative to the vectors in the basis for the codomain, C. Notice here how we take elements of vector spaces and decompose them into linear combinations of basis elements as the key step in constructing coordinatizations of vectors. There is a system of equations involved almost every time, but we will omit these details since this should be a routine exercise at this stage.

    ρ_C(S(1 + 2x + x^2 − x^3)) = ρ_C([20 45; −24 69])
      = ρ_C((−90)[1 1; 1 2] + 37[2 3; 2 5] + (−40)[−1 −1; 0 −2] + 4[−1 −4; −2 −4])
      = [−90, 37, −40, 4]^t

    ρ_C(S(1 + 3x + x^2 + x^3)) = ρ_C([17 37; −20 57])
      = ρ_C((−72)[1 1; 1 2] + 29[2 3; 2 5] + (−34)[−1 −1; 0 −2] + 3[−1 −4; −2 −4])
      = [−72, 29, −34, 3]^t

    ρ_C(S(−1 − 2x + 2x^3)) = ρ_C([−27 −58; 32 −90])
      = ρ_C(114[1 1; 1 2] + (−46)[2 3; 2 5] + 54[−1 −1; 0 −2] + (−5)[−1 −4; −2 −4])
      = [114, −46, 54, −5]^t

    ρ_C(S(2 + 3x + 2x^2 − 5x^3)) = ρ_C([48 109; −58 167])
      = ρ_C((−220)[1 1; 1 2] + 91[2 3; 2 5] + (−96)[−1 −1; 0 −2] + 10[−1 −4; −2 −4])
      = [−220, 91, −96, 10]^t

Thus, employing Definition MR,

                 ⎡−90 −72  114 −220⎤
    M^S_{B,C} =  ⎢ 37  29  −46   91⎥
                 ⎢−40 −34   54  −96⎥
                 ⎣  4   3   −5   10⎦

Often we use "nice" bases to build matrix representations and the work involved is much easier. Suppose we take bases

    D = {1, x, x^2, x^3}
    E = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }

The evaluation of S at the elements of D is easy and coordinatization relative to E can be done on sight,

    ρ_E(S(1)) = ρ_E([3 8; −4 12]) = [3, 8, −4, 12]^t
    ρ_E(S(x)) = ρ_E([7 14; −8 22]) = [7, 14, −8, 22]^t
    ρ_E(S(x^2)) = ρ_E([−2 −2; 2 −4]) = [−2, −2, 2, −4]^t
    ρ_E(S(x^3)) = ρ_E([−5 −11; 6 −17]) = [−5, −11, 6, −17]^t

So the matrix representation of S relative to D and E is

                 ⎡ 3   7  −2   −5⎤
    M^S_{D,E} =  ⎢ 8  14  −2  −11⎥
                 ⎢−4  −8   2    6⎥
                 ⎣12  22  −4  −17⎦

One more time, but now let us use bases

    F = {1 + x − x^2 + 2x^3, −1 + 2x + 2x^3, 2 + x − 2x^2 + 3x^3, 1 + x + 2x^3}
    G = { [1 1; −1 2], [−1 2; 0 2], [2 1; −2 3], [1 1; 0 2] }

and evaluate S with the elements of F, then coordinatize the results relative to G,

    ρ_G(S(1 + x − x^2 + 2x^3)) = ρ_G([2 2; −2 4]) = ρ_G(2[1 1; −1 2]) = [2, 0, 0, 0]^t
    ρ_G(S(−1 + 2x + 2x^3)) = ρ_G([1 −2; 0 −2]) = ρ_G((−1)[−1 2; 0 2]) = [0, −1, 0, 0]^t
    ρ_G(S(2 + x − 2x^2 + 3x^3)) = ρ_G([2 1; −2 3]) = [0, 0, 1, 0]^t
    ρ_G(S(1 + x + 2x^3)) = ρ_G([0 0; 0 0]) = [0, 0, 0, 0]^t

So we arrive at an especially economical matrix representation,

                 ⎡2  0  0  0⎤
    M^S_{F,G} =  ⎢0 −1  0  0⎥
                 ⎢0  0  1  0⎥
                 ⎣0  0  0  0⎦

△
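With the "nice" bases D and E, the column-by-column recipe of Definition MR is almost mechanical, since coordinatizing relative to a standard basis just reads off entries. Here is a self-contained Python/sympy sketch (an assumed tool, not the text's own; S and the bases are transcribed from this example):

    from sympy import Matrix

    def S(p):
        a, b, c, d = p          # coefficients of a + bx + cx^2 + dx^3
        return Matrix([[3*a + 7*b - 2*c - 5*d,   8*a + 14*b - 2*c - 11*d],
                       [-4*a - 8*b + 2*c + 6*d, 12*a + 22*b - 4*c - 17*d]])

    D = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

    def rho_E(M):               # read off entries, row by row
        return Matrix([M[0, 0], M[0, 1], M[1, 0], M[1, 1]])

    M_S_DE = Matrix.hstack(*[rho_E(S(p)) for p in D])
    print(M_S_DE)  # [[3, 7, -2, -5], [8, 14, -2, -11], [-4, -8, 2, 6], [12, 22, -4, -17]]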

We may choose to use whatever terms we want when we make a definition. Some are arbitrary, while others make sense, but only in light of subsequent theorems. Matrix representation is in the latter category. We begin with a linear transformation and produce a matrix. So what? Here is the theorem that justifies the term "matrix representation."

Theorem FTMR Fundamental Theorem of Matrix Representation
Suppose that T: U → V is a linear transformation, B is a basis for U, C is a basis for V and M^T_{B,C} is the matrix representation of T relative to B and C. Then, for any u ∈ U,

    ρ_C(T(u)) = M^T_{B,C} ρ_B(u)

or equivalently

    T(u) = ρ_C^{-1}(M^T_{B,C} ρ_B(u))

Proof. Let B = {u_1, u_2, u_3, ..., u_n} be the basis of U. Since u ∈ U, there are scalars a_1, a_2, a_3, ..., a_n such that

    u = a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_n u_n

Then,

    M^T_{B,C} ρ_B(u)
      = [ρ_C(T(u_1)) | ρ_C(T(u_2)) | ... | ρ_C(T(u_n))] ρ_B(u)                     Definition MR
      = [ρ_C(T(u_1)) | ρ_C(T(u_2)) | ... | ρ_C(T(u_n))] [a_1, a_2, a_3, ..., a_n]^t   Definition VR
      = a_1 ρ_C(T(u_1)) + a_2 ρ_C(T(u_2)) + ··· + a_n ρ_C(T(u_n))                  Definition MVP
      = ρ_C(a_1 T(u_1) + a_2 T(u_2) + a_3 T(u_3) + ··· + a_n T(u_n))               Theorem LTLC
      = ρ_C(T(a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_n u_n))                        Theorem LTLC
      = ρ_C(T(u))

The alternative conclusion is obtained as

    T(u) = I_V(T(u))                          Definition IDLT
         = (ρ_C^{-1} ∘ ρ_C)(T(u))             Definition IVLT
         = ρ_C^{-1}(ρ_C(T(u)))                Definition LTC
         = ρ_C^{-1}(M^T_{B,C} ρ_B(u))         ■

This theorem says that we can apply T to u and coordinatize the result relative to C in V, or we can first coordinatize u relative to B in U, then multiply by the matrix representation. Either way, the result is the same. So the effect of a linear transformation can always be accomplished by a matrix-vector product (Definition MVP). That is important enough to say again. The effect of a linear transformation is a matrix-vector product.

                T
        u ------------> T(u)
        |                 |
       ρ_B               ρ_C
        ↓                 ↓
     ρ_B(u) --------> M^T_{B,C} ρ_B(u) = ρ_C(T(u))
            M^T_{B,C}

Diagram FTMR: Fundamental Theorem of Matrix Representations

The alternative conclusion of this result might be even more striking. It says that to effect a linear transformation (T) of a vector (u), coordinatize the input (with ρ_B), do a matrix-vector product (with M^T_{B,C}), and un-coordinatize the result (with ρ_C^{-1}). So, absent some bookkeeping about vector representations, a linear transformation is a matrix. To adjust the diagram, we "reverse" the arrow on the right, which means inverting the vector representation ρ_C on V. Now we can go directly across the top of the diagram, computing the linear transformation between the abstract vector spaces. Or, we can go around the other three sides, using vector representation, a matrix-vector product, followed by un-coordinatization.

                T
        u ------------> T(u) = ρ_C^{-1}(M^T_{B,C} ρ_B(u))
        |                 ↑
       ρ_B            ρ_C^{-1}
        ↓                 |
     ρ_B(u) --------> M^T_{B,C} ρ_B(u)
            M^T_{B,C}

Diagram FTMRA: Fundamental Theorem of Matrix Representations (Alternate)

Here is an example to illustrate how the "action" of a linear transformation can be effected by matrix multiplication.

Example ALTMM A linear transformation as matrix multiplication
In Example OLTTR we found three representations of the linear transformation S. In this example, we will compute a single output of S in four different ways. First "normally," then three times over using Theorem FTMR.

Choose p(x) = 3 − x + 2x^2 − 5x^3, for no particular reason. Then the straightforward application of S to p(x) yields

    S(p(x)) = S(3 − x + 2x^2 − 5x^3)
            = ⎡3(3) + 7(−1) − 2(2) − 5(−5)      8(3) + 14(−1) − 2(2) − 11(−5)⎤
              ⎣−4(3) − 8(−1) + 2(2) + 6(−5)    12(3) + 22(−1) − 4(2) − 17(−5)⎦
            = ⎡ 23  61⎤
              ⎣−30  91⎦

Now use the representation of S relative to the bases B and C and Theorem FTMR. Note that we will employ the following linear combination in moving from the second line to the third,

    3 − x + 2x^2 − 5x^3 = 48(1 + 2x + x^2 − x^3) + (−20)(1 + 3x + x^2 + x^3)
                          + (−1)(−1 − 2x + 2x^3) + (−13)(2 + 3x + 2x^2 − 5x^3)

Then, with M^S_{B,C} as computed in Example OLTTR,

    S(p(x)) = ρ_C^{-1}(M^S_{B,C} ρ_B(p(x)))
            = ρ_C^{-1}(M^S_{B,C} ρ_B(3 − x + 2x^2 − 5x^3))
            = ρ_C^{-1}(M^S_{B,C} [48, −20, −1, −13]^t)
            = ρ_C^{-1}([−134, 59, −46, 7]^t)
            = (−134)[1 1; 1 2] + 59[2 3; 2 5] + (−46)[−1 −1; 0 −2] + 7[−1 −4; −2 −4]
            = ⎡ 23  61⎤
              ⎣−30  91⎦

Again, but now with "nice" bases like D and E, the computations are more transparent.

    S(p(x)) = ρ_E^{-1}(M^S_{D,E} ρ_D(p(x)))
            = ρ_E^{-1}(M^S_{D,E} ρ_D(3(1) + (−1)(x) + 2(x^2) + (−5)(x^3)))
            = ρ_E^{-1}(M^S_{D,E} [3, −1, 2, −5]^t)
            = ρ_E^{-1}([23, 61, −30, 91]^t)
            = 23[1 0; 0 0] + 61[0 1; 0 0] + (−30)[0 0; 1 0] + 91[0 0; 0 1]
            = ⎡ 23  61⎤
              ⎣−30  91⎦

OK, last time, now with the bases F and G. The coordinatizations will take some work this time, but the matrix-vector product (Definition MVP) (which is the actual action of the linear transformation) will be especially easy, given the diagonal nature of the matrix representation, M^S_{F,G}. Here we go,

    S(p(x)) = ρ_G^{-1}(M^S_{F,G} ρ_F(p(x)))
            = ρ_G^{-1}(M^S_{F,G} ρ_F(32(1 + x − x^2 + 2x^3) − 7(−1 + 2x + 2x^3)
                                     − 17(2 + x − 2x^2 + 3x^3) − 2(1 + x + 2x^3)))
            = ρ_G^{-1}(M^S_{F,G} [32, −7, −17, −2]^t)
            = ρ_G^{-1}([64, 7, −17, 0]^t)
            = 64[1 1; −1 2] + 7[−1 2; 0 2] + (−17)[2 1; −2 3] + 0[1 1; 0 2]
            = ⎡ 23  61⎤
              ⎣−30  91⎦

This example is not meant to necessarily illustrate that any one of these four computations is simpler than the others. Instead, it is meant to illustrate the many different ways we can arrive at the same result, with the last three all employing a matrix representation to effect the linear transformation. △
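Theorem FTMR is also easy to check by machine for this example. A minimal Python/sympy sketch (an assumed tool) using the nice-basis representation M^S_{D,E}:

    from sympy import Matrix

    M_S_DE = Matrix([[3, 7, -2, -5],
                     [8, 14, -2, -11],
                     [-4, -8, 2, 6],
                     [12, 22, -4, -17]])
    rho_D_p = Matrix([3, -1, 2, -5])   # rho_D(3 - x + 2x^2 - 5x^3)

    y = M_S_DE * rho_D_p               # the matrix-vector product
    print(y.T)                         # [23, 61, -30, 91]
    print(Matrix([[y[0], y[1]],        # un-coordinatize relative to E
                  [y[2], y[3]]]))      # Matrix([[23, 61], [-30, 91]])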

We will use Theorem FTMR frequently in the next few sections. A typical application will feel like the linear transformation T "commutes" with a vector representation, ρ_C, and as it does the transformation morphs into a matrix, M^T_{B,C}, while the vector representation changes to a new basis, ρ_B. Or vice-versa.

Subsection NRFO
New Representations from Old

In Subsection LT.NLTFO we built new linear transformations from other linear transformations: sums, scalar multiples and compositions. These new linear transformations will have matrix representations as well. How do the new matrix representations relate to the old matrix representations? Here are the three theorems.

Theorem MRSLT Matrix Representation of a Sum of Linear Transformations
Suppose that T: U → V and S: U → V are linear transformations, B is a basis of U and C is a basis of V. Then

    M^{T+S}_{B,C} = M^T_{B,C} + M^S_{B,C}

Proof. Let x be any vector in C^n. Define u ∈ U by u = ρ_B^{-1}(x), so x = ρ_B(u). Then,

    M^{T+S}_{B,C} x
      = M^{T+S}_{B,C} ρ_B(u)                Substitution
      = ρ_C((T + S)(u))                     Theorem FTMR
      = ρ_C(T(u) + S(u))                    Definition LTA
      = ρ_C(T(u)) + ρ_C(S(u))               Definition LT
      = M^T_{B,C} ρ_B(u) + M^S_{B,C} ρ_B(u) Theorem FTMR
      = (M^T_{B,C} + M^S_{B,C}) ρ_B(u)      Theorem MMDAA
      = (M^T_{B,C} + M^S_{B,C}) x           Substitution

Since the matrices M^{T+S}_{B,C} and M^T_{B,C} + M^S_{B,C} have equal matrix-vector products for every vector in C^n, by Theorem EMMVP they are equal matrices. (Now would be a good time to double-back and study the proof of Theorem EMMVP. You did promise to come back to this theorem sometime, didn't you?) ■

Theorem MRMLT Matrix Representation of a Multiple of a Linear Transformation
Suppose that T: U → V is a linear transformation, α ∈ C, B is a basis of U and C is a basis of V. Then

    M^{αT}_{B,C} = α M^T_{B,C}

Proof. Let x be any vector in C^n. Define u ∈ U by u = ρ_B^{-1}(x), so x = ρ_B(u). Then,

    M^{αT}_{B,C} x
      = M^{αT}_{B,C} ρ_B(u)       Substitution
      = ρ_C((αT)(u))              Theorem FTMR
      = ρ_C(α T(u))               Definition LTSM
      = α ρ_C(T(u))               Definition LT
      = α(M^T_{B,C} ρ_B(u))       Theorem FTMR
      = (α M^T_{B,C}) ρ_B(u)      Theorem MMSMM
      = (α M^T_{B,C}) x           Substitution

Since the matrices M^{αT}_{B,C} and α M^T_{B,C} have equal matrix-vector products for every vector in C^n, by Theorem EMMVP they are equal matrices. ■

The vector space of all linear transformations from U to V is now isomorphic to the vector space of all m × n matrices.
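Theorem MRSLT and Theorem MRMLT can be spot-checked on any small example. Here is a hedged Python/sympy sketch; the transformations T(a + bx) = (a + b, a − b) and S(a + bx) = (2a, b), and the basis B = {1 + x, 1 − x} for P_1, are hypothetical choices made only for this illustration. The basis C for C^2 is the standard one, so coordinatization in the codomain is the identity.

    from sympy import Matrix

    T = lambda a, b: Matrix([a + b, a - b])   # T(a + bx)
    S = lambda a, b: Matrix([2*a, b])         # S(a + bx)
    B = [(1, 1), (1, -1)]                     # coefficient pairs for 1+x, 1-x

    # Representation: columns are the (standard-basis) coordinates of
    # the transformation applied to each basis vector (Definition MR).
    rep = lambda L: Matrix.hstack(*[L(a, b) for (a, b) in B])

    M_T, M_S = rep(T), rep(S)
    M_sum = rep(lambda a, b: T(a, b) + S(a, b))
    M_mult = rep(lambda a, b: 5 * T(a, b))

    print(M_sum == M_T + M_S)    # True  (Theorem MRSLT)
    print(M_mult == 5 * M_T)     # True  (Theorem MRMLT)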

Theorem MRCLT Matrix Representation of a Composition of Linear Transformations
Suppose that T: U → V and S: V → W are linear transformations, B is a basis of U, C is a basis of V, and D is a basis of W. Then

    M^{S∘T}_{B,D} = M^S_{C,D} M^T_{B,C}

Proof. Let x be any vector in C^n. Define u ∈ U by u = ρ_B^{-1}(x), so x = ρ_B(u). Then,

    M^{S∘T}_{B,D} x
      = M^{S∘T}_{B,D} ρ_B(u)           Substitution
      = ρ_D((S ∘ T)(u))                Theorem FTMR
      = ρ_D(S(T(u)))                   Definition LTC
      = M^S_{C,D} ρ_C(T(u))            Theorem FTMR
      = M^S_{C,D}(M^T_{B,C} ρ_B(u))    Theorem FTMR
      = (M^S_{C,D} M^T_{B,C}) ρ_B(u)   Theorem MMA
      = (M^S_{C,D} M^T_{B,C}) x        Substitution

Since the matrices M^{S∘T}_{B,D} and M^S_{C,D} M^T_{B,C} have equal matrix-vector products for every vector in C^n, by Theorem EMMVP they are equal matrices. ■

This is the second great surprise of introductory linear algebra. Matrices are linear transformations (functions, really), and matrix multiplication is function composition! We can form the composition of two linear transformations, then form the matrix representation of the result. Or we can form the matrix representation of each linear transformation separately, then multiply the two representations together via Definition MM. In either case, we arrive at the same result.

Example MPMR Matrix product of matrix representations
Consider the two linear transformations,

    T: C^2 → P_2      T([a; b]) = (−a + 3b) + (2a + 4b)x + (a − 2b)x^2
    S: P_2 → M_{22}   S(a + bx + cx^2) = ⎡2a + b + 2c    a + 4b − c ⎤
                                         ⎣ −a + 3c      3a + b + 2c ⎦

and bases for C^2, P_2 and M_{22} (respectively),

    B = { [3; 1], [2; 1] }
    C = {1 − 2x + x^2, −1 + 3x, 2x + 3x^2}
    D = { [1 −2; 1 −1], [1 −1; 1 2], [−1 2; 0 0], [2 −3; 2 2] }

Begin by computing the new linear transformation that is the composition of T and S (Definition LTC, Theorem CLTLT), (S ∘ T): C^2 → M_{22},

    (S ∘ T)([a; b]) = S(T([a; b]))
      = S((−a + 3b) + (2a + 4b)x + (a − 2b)x^2)
      = ⎡2(−a + 3b) + (2a + 4b) + 2(a − 2b)    (−a + 3b) + 4(2a + 4b) − (a − 2b)⎤
        ⎣    −(−a + 3b) + 3(a − 2b)          3(−a + 3b) + (2a + 4b) + 2(a − 2b) ⎦
      = ⎡2a + 6b   6a + 21b⎤
        ⎣4a − 9b    a + 9b ⎦

Now compute the matrix representations (Definition MR) for each of these three linear transformations (T, S, S ∘ T), relative to the appropriate bases. First for T,

    ρ_C(T([3; 1])) = ρ_C(10x + x^2)
      = ρ_C(28(1 − 2x + x^2) + 28(−1 + 3x) + (−9)(2x + 3x^2)) = [28, 28, −9]^t

    ρ_C(T([2; 1])) = ρ_C(1 + 8x)
      = ρ_C(33(1 − 2x + x^2) + 32(−1 + 3x) + (−11)(2x + 3x^2)) = [33, 32, −11]^t

So we have the matrix representation of T,

                 ⎡28  33⎤
    M^T_{B,C} =  ⎢28  32⎥
                 ⎣−9 −11⎦

Now, a representation of S,

    ρ_D(S(1 − 2x + x^2)) = ρ_D([2 −8; 2 3])
      = ρ_D((−11)[1 −2; 1 −1] + (−21)[1 −1; 1 2] + 0[−1 2; 0 0] + 17[2 −3; 2 2])
      = [−11, −21, 0, 17]^t

    ρ_D(S(−1 + 3x)) = ρ_D([1 11; 1 0])
      = ρ_D(26[1 −2; 1 −1] + 51[1 −1; 1 2] + 0[−1 2; 0 0] + (−38)[2 −3; 2 2])
      = [26, 51, 0, −38]^t

    ρ_D(S(2x + 3x^2)) = ρ_D([8 5; 9 8])
      = ρ_D(34[1 −2; 1 −1] + 67[1 −1; 1 2] + 1[−1 2; 0 0] + (−46)[2 −3; 2 2])
      = [34, 67, 1, −46]^t

So we have the matrix representation of S,

                 ⎡−11  26  34⎤
    M^S_{C,D} =  ⎢−21  51  67⎥
                 ⎢  0   0   1⎥
                 ⎣ 17 −38 −46⎦

Finally, a representation of S ∘ T,

    ρ_D((S ∘ T)([3; 1])) = ρ_D([12 39; 3 12])
      = ρ_D(114[1 −2; 1 −1] + 237[1 −1; 1 2] + (−9)[−1 2; 0 0] + (−174)[2 −3; 2 2])
      = [114, 237, −9, −174]^t

    ρ_D((S ∘ T)([2; 1])) = ρ_D([10 33; −1 11])
      = ρ_D(95[1 −2; 1 −1] + 202[1 −1; 1 2] + (−11)[−1 2; 0 0] + (−149)[2 −3; 2 2])
      = [95, 202, −11, −149]^t

So we have the matrix representation of S ∘ T,

                     ⎡ 114   95⎤
    M^{S∘T}_{B,D} =  ⎢ 237  202⎥
                     ⎢  −9  −11⎥
                     ⎣−174 −149⎦

Now, we are all set to verify the conclusion of Theorem MRCLT,

                           ⎡−11  26  34⎤ ⎡28  33⎤   ⎡ 114   95⎤
    M^S_{C,D} M^T_{B,C} =  ⎢−21  51  67⎥ ⎢28  32⎥ = ⎢ 237  202⎥ = M^{S∘T}_{B,D}
                           ⎢  0   0   1⎥ ⎣−9 −11⎦   ⎢  −9  −11⎥
                           ⎣ 17 −38 −46⎦            ⎣−174 −149⎦

We have intentionally used nonstandard bases. If you were to choose "nice" bases for the three vector spaces, then the result of the theorem might be rather transparent. But this would still be a worthwhile exercise; give it a go. △
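The final verification in Example MPMR is a one-line matrix product, shown here as a minimal Python/sympy sketch (the library is an assumption of the illustration; the three matrices are copied from the example):

    from sympy import Matrix

    M_T_BC = Matrix([[28, 33], [28, 32], [-9, -11]])
    M_S_CD = Matrix([[-11, 26, 34],
                     [-21, 51, 67],
                     [0, 0, 1],
                     [17, -38, -46]])
    M_ST_BD = Matrix([[114, 95], [237, 202], [-9, -11], [-174, -149]])

    # Theorem MRCLT: composing transformations is a matrix product.
    print(M_S_CD * M_T_BC == M_ST_BD)   # True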

A diagram, similar to ones we have seen earlier, might make the importance of this theorem clearer,

                  Definition MR
        S, T  ------------------>  M^S_{C,D}, M^T_{B,C}
          |                              |
     Definition LTC                Definition MM
          ↓                              ↓
        S ∘ T ------------------>  M^{S∘T}_{B,D} = M^S_{C,D} M^T_{B,C}
                  Definition MR

Diagram MRCLT: Matrix Representation and Composition of Linear Transformations

One of our goals in the first part of this book is to make the definition of matrix multiplication (Definition MVP, Definition MM) seem as natural as possible. However, many of us are brought up with an entry-by-entry description of matrix multiplication (Theorem EMP) as the definition of matrix multiplication, and then theorems about columns of matrices and linear combinations follow from that definition. With this unmotivated definition, the realization that matrix multiplication is function composition is quite remarkable. It is an interesting exercise to begin with the question, "What is the matrix representation of the composition of two linear transformations?" and then, without using any theorems about matrix multiplication, finally arrive at the entry-by-entry description of matrix multiplication. Try it yourself (Exercise MR.T80).

Subsection PMR
Properties of Matrix Representations

It will not be a surprise to discover that the kernel and range of a linear transformation are closely related to the null space and column space of the transformation's matrix representation. Perhaps this idea has been bouncing around in your head already, even before seeing the definition of a matrix representation. However, with a formal definition of a matrix representation (Definition MR), and a fundamental theorem to go with it (Theorem FTMR), we can be formal about the relationship, using the idea of isomorphic vector spaces (Definition IVS). Here are the twin theorems.

Theorem KNSI Kernel and Null Space Isomorphism
Suppose that T: U → V is a linear transformation, B is a basis for U of size n, and C is a basis for V. Then the kernel of T is isomorphic to the null space of M^T_{B,C},

    K(T) ≅ N(M^T_{B,C})

Proof. To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The kernel of the linear transformation T, K(T), is a subspace of U, while the null space of the matrix representation, N(M^T_{B,C}), is a subspace of C^n. The function ρ_B is defined as a function from U to C^n, but we can just as well employ the definition of ρ_B as a function from K(T) to N(M^T_{B,C}).

We must first ensure that if we choose an input for ρ_B from K(T), then the output will be an element of N(M^T_{B,C}). So suppose that u ∈ K(T). Then

    M^T_{B,C} ρ_B(u) = ρ_C(T(u))   Theorem FTMR
      = ρ_C(0)                     Definition KLT
      = 0                          Theorem LTTZZ

This says that ρ_B(u) ∈ N(M^T_{B,C}), as desired.

The restriction in the size of the domain and codomain of ρ_B will not affect the fact that ρ_B is a linear transformation (Theorem VRLT), nor will it affect the fact that ρ_B is injective (Theorem VRI). Something must be done though to verify that ρ_B is surjective. To this end, appeal to the definition of surjective (Definition SLT), and suppose that we have an element of the codomain, x ∈ N(M^T_{B,C}) ⊆ C^n, and we wish to find an element of the domain with x as its image. We now show that the desired element of the domain is u = ρ_B^{-1}(x). First, verify that u ∈ K(T),

    T(u) = T(ρ_B^{-1}(x))                      Substitution
      = ρ_C^{-1}(M^T_{B,C} ρ_B(ρ_B^{-1}(x)))   Theorem FTMR
      = ρ_C^{-1}(M^T_{B,C}(I_{C^n}(x)))        Definition IVLT
      = ρ_C^{-1}(M^T_{B,C} x)                  Definition IDLT
      = ρ_C^{-1}(0)                            Definition KLT
      = 0_V                                    Theorem LTTZZ

Second, verify that the proposed isomorphism, ρ_B, takes u to x,

    ρ_B(u) = ρ_B(ρ_B^{-1}(x))   Substitution
      = I_{C^n}(x)              Definition IVLT
      = x                       Definition IDLT

With ρ_B demonstrated to be an injective and surjective linear transformation from K(T) to N(M^T_{B,C}), Theorem ILTIS tells us ρ_B is invertible, and so by Definition IVS, we say K(T) and N(M^T_{B,C}) are isomorphic. ■

Example KVMR Kernel via matrix representation
Consider the kernel of the linear transformation, T: M_{22} → P_2, given by

    T([a b; c d]) = (2a − b + c − 5d) + (a + 4b + 5c + 2d)x + (3a − 2b + c − 8d)x^2

We will begin with a matrix representation of T relative to the bases for M_{22} and P_2 (respectively),

    B = { [1 2; −1 −1], [1 3; −1 −4], [1 2; 0 −2], [2 5; −2 −4] }
    C = {1 + x + x^2, 2 + 3x, −1 − 2x^2}

Then,

    ρ_C(T([1 2; −1 −1])) = ρ_C(4 + 2x + 6x^2)
      = ρ_C(2(1 + x + x^2) + 0(2 + 3x) + (−2)(−1 − 2x^2)) = [2, 0, −2]^t

    ρ_C(T([1 3; −1 −4])) = ρ_C(18 + 28x^2)
      = ρ_C((−24)(1 + x + x^2) + 8(2 + 3x) + (−26)(−1 − 2x^2)) = [−24, 8, −26]^t

    ρ_C(T([1 2; 0 −2])) = ρ_C(10 + 5x + 15x^2)
      = ρ_C(5(1 + x + x^2) + 0(2 + 3x) + (−5)(−1 − 2x^2)) = [5, 0, −5]^t

    ρ_C(T([2 5; −2 −4])) = ρ_C(17 + 4x + 26x^2)
      = ρ_C((−8)(1 + x + x^2) + 4(2 + 3x) + (−17)(−1 − 2x^2)) = [−8, 4, −17]^t

So the matrix representation of T (relative to B and C) is

                 ⎡ 2 −24  5  −8⎤
    M^T_{B,C} =  ⎢ 0   8  0   4⎥
                 ⎣−2 −26 −5 −17⎦

We know from Theorem KNSI that the kernel of the linear transformation T is isomorphic to the null space of the matrix representation M^T_{B,C} and by studying the proof of Theorem KNSI we learn that ρ_B is an isomorphism between these null spaces. Rather than trying to compute the kernel of T using definitions and techniques from Chapter LT we will instead analyze the null space of M^T_{B,C} using techniques from way back in Chapter V. First row-reduce M^T_{B,C},

    ⎡ 2 −24  5  −8⎤   RREF   ⎡1 0 5/2  2 ⎤
    ⎢ 0   8  0   4⎥  ----->  ⎢0 1  0  1/2⎥
    ⎣−2 −26 −5 −17⎦          ⎣0 0  0   0 ⎦

So, by Theorem BNS, a basis for N(M^T_{B,C}) is

    ⎧⎡−5/2⎤  ⎡ −2 ⎤⎫
    ⎪⎢  0 ⎥  ⎢−1/2⎥⎪
    ⎨⎢  1 ⎥, ⎢  0 ⎥⎬
    ⎩⎣  0 ⎦  ⎣  1 ⎦⎭

We can now convert this basis of N(M^T_{B,C}) into a basis of K(T) by applying ρ_B^{-1} to each element of the basis,

    ρ_B^{-1}([−5/2, 0, 1, 0]^t)
      = (−5/2)[1 2; −1 −1] + 0[1 3; −1 −4] + 1[1 2; 0 −2] + 0[2 5; −2 −4]
      = [−3/2 −3; 5/2 1/2]

    ρ_B^{-1}([−2, −1/2, 0, 1]^t)
      = (−2)[1 2; −1 −1] + (−1/2)[1 3; −1 −4] + 0[1 2; 0 −2] + 1[2 5; −2 −4]
      = [−1/2 −1/2; 1/2 0]

So the set

    { [−3/2 −3; 5/2 1/2], [−1/2 −1/2; 1/2 0] }

is a basis for K(T). Just for fun, you might evaluate T with each of these two basis vectors and verify that the output is the zero polynomial (Exercise MR.C10). △
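The null space computation in Example KVMR can be reproduced with a couple of library calls. A minimal Python/sympy sketch (the library is an assumption of this illustration); note that sympy's nullspace() returns exactly the Theorem BNS basis vectors.

    from sympy import Matrix

    M = Matrix([[2, -24, 5, -8],
                [0, 8, 0, 4],
                [-2, -26, -5, -17]])
    Z = M.nullspace()
    print([z.T for z in Z])   # [-5/2, 0, 1, 0] and [-2, -1/2, 0, 1]

    # Un-coordinatize the first one with rho_B^{-1}: a kernel element of T.
    B = [Matrix([[1, 2], [-1, -1]]), Matrix([[1, 3], [-1, -4]]),
         Matrix([[1, 2], [0, -2]]), Matrix([[2, 5], [-2, -4]])]
    K = sum((Z[0][i] * B[i] for i in range(4)), Matrix.zeros(2, 2))
    print(K)   # Matrix([[-3/2, -3], [5/2, 1/2]])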

An entirely similar result applies to the range of a linear transformation and the column space of a matrix representation of the linear transformation.

Theorem RCSI Range and Column Space Isomorphism
Suppose that T: U → V is a linear transformation, B is a basis for U of size n, and C is a basis for V of size m. Then the range of T is isomorphic to the column space of M^T_{B,C},

    R(T) ≅ C(M^T_{B,C})

Proof. To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The range of the linear transformation T, R(T), is a subspace of V, while the column space of the matrix representation, C(M^T_{B,C}), is a subspace of C^m. The function ρ_C is defined as a function from V to C^m, but we can just as well employ the definition of ρ_C as a function from R(T) to C(M^T_{B,C}).

We must first ensure that if we choose an input for ρ_C from R(T), then the output will be an element of C(M^T_{B,C}). So suppose that v ∈ R(T). Then there is a vector u ∈ U, such that T(u) = v. Consider

    M^T_{B,C} ρ_B(u) = ρ_C(T(u))   Theorem FTMR
      = ρ_C(v)                     Definition RLT

This says that ρ_C(v) ∈ C(M^T_{B,C}), as desired.

The restriction in the size of the domain and codomain will not affect the fact that ρ_C is a linear transformation (Theorem VRLT), nor will it affect the fact that ρ_C is injective (Theorem VRI). Something must be done though to verify that ρ_C is surjective. This all gets a bit confusing, since the domain of our isomorphism is the range of the linear transformation, so think about your objects as you go. To establish that ρ_C is surjective, appeal to the definition of a surjective linear transformation (Definition SLT), and suppose that we have an element of the codomain, y ∈ C(M^T_{B,C}) ⊆ C^m, and we wish to find an element of the domain with y as its image. Since y ∈ C(M^T_{B,C}), there exists a vector, x ∈ C^n, with M^T_{B,C} x = y. We now show that the desired element of the domain is v = ρ_C^{-1}(y). First, verify that v ∈ R(T) by applying T to u = ρ_B^{-1}(x),

    T(u) = T(ρ_B^{-1}(x))                      Substitution
      = ρ_C^{-1}(M^T_{B,C} ρ_B(ρ_B^{-1}(x)))   Theorem FTMR
      = ρ_C^{-1}(M^T_{B,C}(I_{C^n}(x)))        Definition IVLT
      = ρ_C^{-1}(M^T_{B,C} x)                  Definition IDLT
      = ρ_C^{-1}(y)                            Definition CSM
      = v                                      Substitution

Second, verify that the proposed isomorphism, ρ_C, takes v to y,

    ρ_C(v) = ρ_C(ρ_C^{-1}(y))   Substitution
      = I_{C^m}(y)              Definition IVLT
      = y                       Definition IDLT

With ρ_C demonstrated to be an injective and surjective linear transformation from R(T) to C(M^T_{B,C}), Theorem ILTIS tells us ρ_C is invertible, and so by Definition IVS, we say R(T) and C(M^T_{B,C}) are isomorphic. ■

Example RVMR Range via matrix representation<br />

In this example, we will recycle the l<strong>in</strong>ear transformation T and the bases B and C<br />

of Example KVMR but now we will compute the range of T : M 22 → P 2 ,givenby<br />

([ ])<br />

a b<br />

T<br />

=(2a − b + c − 5d)+(a +4b +5c +2d)x +(3a − 2b + c − 8d)x<br />

c d<br />

2<br />

With bases B and C,<br />

{[ ] [ ] [ ] [ ]}<br />

1 2 1 3 1 2 2 5<br />

B =<br />

,<br />

, ,<br />

−1 −1 −1 −4 0 −2 −2 −4<br />

C = { 1+x + x 2 , 2+3x, −1 − 2x 2}<br />

we obta<strong>in</strong> the matrix representation<br />

[ ]<br />

2 −24 5 −8<br />

MB,C T = 0 8 0 4<br />

−2 −26 −5 −17<br />

We know from Theorem RCSI that the range of the l<strong>in</strong>ear transformation T is<br />

isomorphic to the column space of the matrix representation MB,C T and by study<strong>in</strong>g<br />

the proof of Theorem RCSI we learn that ρ C is an isomorphism between these<br />

subspaces. Notice that s<strong>in</strong>ce the range is a subspace of the codoma<strong>in</strong>, we will<br />

employ ρ C as the isomorphism, rather than ρ B , which was the correct choice for an<br />

isomorphism between the null spaces of Example KVMR.<br />

Rather than try<strong>in</strong>g to compute the range of T us<strong>in</strong>g def<strong>in</strong>itions and techniques<br />

from Chapter LT we will <strong>in</strong>stead analyze the column space of MB,C T us<strong>in</strong>g techniques<br />

from way back <strong>in</strong> Chapter M. <strong>First</strong> row-reduce ( t,<br />

MB,C) T<br />

⎡<br />

⎤ ⎡<br />

⎤<br />

2 0 −2<br />

⎢−24 8 −26⎥<br />

⎣<br />

5 0 −5<br />

⎦ −−−−→<br />

RREF<br />

−8 4 −17<br />

⎢<br />

⎣<br />

1 0 −1<br />

0 1 − 25<br />

4 ⎥<br />

0 0 0 ⎦<br />

0 0 0<br />

Now employ Theorem CSRST and Theorem BRS (there are other methods we<br />

could choose here to compute the column space, such as Theorem BCS) to obta<strong>in</strong><br />

the basis for C ( )<br />

MB,C<br />

T ,<br />

⎧[ ] ⎡<br />

⎨ 1<br />

0 , ⎣ 0 ⎤⎫<br />

⎬<br />

1 ⎦<br />

⎩<br />

−1 − 25 ⎭<br />

4<br />

We can now convert this basis of C ( )<br />

MB,C<br />

T <strong>in</strong>to a basis of R(T ) by apply<strong>in</strong>g ρ<br />

−1<br />

C


§MR Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 533<br />

to each element of the basis,<br />

([ ]) 1<br />

ρ −1<br />

C<br />

0 =(1+x + x 2 ) − (−1 − 2x 2 )=2+x +3x 2<br />

−1<br />

⎛⎡<br />

⎤⎞<br />

ρ −1<br />

C<br />

So the set<br />

⎝⎣ 0 1<br />

− 25<br />

4<br />

⎦⎠ =(2+3x) − 25 4 (−1 − 2x2 )= 33 4<br />

{2+3x +3x 2 , 33 4 +3x + 31 2 x2 }<br />

+3x +<br />

31<br />

2 x2<br />

is a basis for R(T ).<br />

Theorem KNSI and Theorem RCSI can be viewed as further formal evidence for the Coordinatization Principle, though they are not direct consequences.

Diagram KRI is meant to suggest Theorem KNSI and Theorem RCSI, in addition to their proofs (and so carries the same notation as the statements of these two theorems). The dashed lines indicate a subspace relationship, with the smaller vector space lower down in the diagram. The central square is highly reminiscent of Diagram FTMR. Each of the four vector representations is an isomorphism, so the inverse linear transformation could be depicted with an arrow pointing in the other direction. The four vector spaces across the bottom are familiar from the earliest days of the course, while the four vector spaces across the top are completely abstract. The vector representations that are restrictions (far left and far right) are the functions shown to be invertible representations as the key technique in the proofs of Theorem KNSI and Theorem RCSI. So this diagram could be helpful as you study those two proofs.

[Diagram KRI is a commutative square with $T: U \to V$ across the top and $M^T_{B,C}: \mathbb{C}^n \to \mathbb{C}^m$ across the bottom, joined by the vertical isomorphisms $\rho_B$ and $\rho_C$. Dashed lines attach the subspaces $K(T) \subseteq U$ and $R(T) \subseteq V$ along the top, and $N\left(M^T_{B,C}\right) \subseteq \mathbb{C}^n$ and $C\left(M^T_{B,C}\right) \subseteq \mathbb{C}^m$ along the bottom, connected by the restricted representations $\rho_B|_{K(T)}$ and $\rho_C|_{R(T)}$.]

Diagram KRI: Kernel and Range Isomorphisms


Subsection IVLT
Invertible Linear Transformations

We have seen, both in theorems and in examples, that questions about linear transformations are often equivalent to questions about matrices. It is the matrix representation of a linear transformation that makes this idea precise. Here is our final theorem that solidifies this connection.

Theorem IMR Invertible Matrix Representations
Suppose that $T: U \to V$ is a linear transformation, $B$ is a basis for $U$ and $C$ is a basis for $V$. Then $T$ is an invertible linear transformation if and only if the matrix representation of $T$ relative to $B$ and $C$, $M^T_{B,C}$, is an invertible matrix. When $T$ is invertible,

$$M^{T^{-1}}_{C,B} = \left(M^T_{B,C}\right)^{-1}$$

Proof. (⇒) Suppose $T$ is invertible, so the inverse linear transformation $T^{-1}: V \to U$ exists (Definition IVLT). Both linear transformations have matrix representations relative to the bases of $U$ and $V$, namely $M^T_{B,C}$ and $M^{T^{-1}}_{C,B}$ (Definition MR). Then

$$\begin{aligned}
M^{T^{-1}}_{C,B} M^T_{B,C} &= M^{T^{-1} \circ T}_{B,B} && \text{Theorem MRCLT} \\
&= M^{I_U}_{B,B} && \text{Definition IVLT} \\
&= \left[\rho_B(I_U(\mathbf{u}_1))\middle|\rho_B(I_U(\mathbf{u}_2))\middle|\cdots\middle|\rho_B(I_U(\mathbf{u}_n))\right] && \text{Definition MR} \\
&= \left[\rho_B(\mathbf{u}_1)\middle|\rho_B(\mathbf{u}_2)\middle|\rho_B(\mathbf{u}_3)\middle|\cdots\middle|\rho_B(\mathbf{u}_n)\right] && \text{Definition IDLT} \\
&= \left[\mathbf{e}_1|\mathbf{e}_2|\mathbf{e}_3|\cdots|\mathbf{e}_n\right] && \text{Definition VR} \\
&= I_n && \text{Definition IM}
\end{aligned}$$

And

$$\begin{aligned}
M^T_{B,C} M^{T^{-1}}_{C,B} &= M^{T \circ T^{-1}}_{C,C} && \text{Theorem MRCLT} \\
&= M^{I_V}_{C,C} && \text{Definition IVLT} \\
&= \left[\rho_C(I_V(\mathbf{v}_1))\middle|\rho_C(I_V(\mathbf{v}_2))\middle|\cdots\middle|\rho_C(I_V(\mathbf{v}_n))\right] && \text{Definition MR} \\
&= \left[\rho_C(\mathbf{v}_1)\middle|\rho_C(\mathbf{v}_2)\middle|\rho_C(\mathbf{v}_3)\middle|\cdots\middle|\rho_C(\mathbf{v}_n)\right] && \text{Definition IDLT} \\
&= \left[\mathbf{e}_1|\mathbf{e}_2|\mathbf{e}_3|\cdots|\mathbf{e}_n\right] && \text{Definition VR} \\
&= I_n && \text{Definition IM}
\end{aligned}$$

These two equations show that $M^T_{B,C}$ and $M^{T^{-1}}_{C,B}$ are inverse matrices (Definition MI) and establish that when $T$ is invertible, then $M^{T^{-1}}_{C,B} = \left(M^T_{B,C}\right)^{-1}$.

(⇐) Suppose now that $M^T_{B,C}$ is an invertible matrix and hence nonsingular (Theorem NI). We compute the nullity of $T$,

$$\begin{aligned}
n(T) &= \dim(K(T)) && \text{Definition KLT} \\
&= \dim\left(N\left(M^T_{B,C}\right)\right) && \text{Theorem KNSI} \\
&= n\left(M^T_{B,C}\right) && \text{Definition NOM} \\
&= 0 && \text{Theorem RNNM}
\end{aligned}$$

So the kernel of $T$ is trivial, and by Theorem KILT, $T$ is injective.

We now compute the rank of $T$,

$$\begin{aligned}
r(T) &= \dim(R(T)) && \text{Definition RLT} \\
&= \dim\left(C\left(M^T_{B,C}\right)\right) && \text{Theorem RCSI} \\
&= r\left(M^T_{B,C}\right) && \text{Definition ROM} \\
&= \dim(V) && \text{Theorem RNNM}
\end{aligned}$$

Since the dimension of the range of $T$ equals the dimension of the codomain $V$, by Theorem EDYES, $R(T) = V$. Which says that $T$ is surjective by Theorem RSLT. Because $T$ is both injective and surjective, by Theorem ILTIS, $T$ is invertible. ∎

By now, the connections between matrices and linear transformations should be starting to become more transparent, and you may have already recognized the invertibility of a matrix as being tantamount to the invertibility of the associated linear transformation. The next example shows how to apply this theorem to the problem of actually building a formula for the inverse of an invertible linear transformation.

Example ILTVR Inverse of a linear transformation via a representation
Consider the linear transformation

$$R: P_3 \to M_{22},\quad R\left(a + bx + cx^2 + dx^3\right) = \begin{bmatrix} a+b-c+2d & 2a+3b-2c+3d \\ a+b+2d & -a+b+2c-5d \end{bmatrix}$$

If we wish to quickly find a formula for the inverse of $R$ (presuming it exists), then choosing "nice" bases will work best. So build a matrix representation of $R$ relative to the bases $B$ and $C$,

$$B = \left\{1, x, x^2, x^3\right\} \qquad C = \left\{ \begin{bmatrix} 1&0\\0&0 \end{bmatrix}, \begin{bmatrix} 0&1\\0&0 \end{bmatrix}, \begin{bmatrix} 0&0\\1&0 \end{bmatrix}, \begin{bmatrix} 0&0\\0&1 \end{bmatrix} \right\}$$

Then,

$$\rho_C(R(1)) = \rho_C\left(\begin{bmatrix} 1&2\\1&-1 \end{bmatrix}\right) = \begin{bmatrix} 1\\2\\1\\-1 \end{bmatrix} \qquad \rho_C(R(x)) = \rho_C\left(\begin{bmatrix} 1&3\\1&1 \end{bmatrix}\right) = \begin{bmatrix} 1\\3\\1\\1 \end{bmatrix}$$

$$\rho_C\left(R\left(x^2\right)\right) = \rho_C\left(\begin{bmatrix} -1&-2\\0&2 \end{bmatrix}\right) = \begin{bmatrix} -1\\-2\\0\\2 \end{bmatrix} \qquad \rho_C\left(R\left(x^3\right)\right) = \rho_C\left(\begin{bmatrix} 2&3\\2&-5 \end{bmatrix}\right) = \begin{bmatrix} 2\\3\\2\\-5 \end{bmatrix}$$

So a representation of $R$ is

$$M^R_{B,C} = \begin{bmatrix} 1&1&-1&2 \\ 2&3&-2&3 \\ 1&1&0&2 \\ -1&1&2&-5 \end{bmatrix}$$

The matrix $M^R_{B,C}$ is invertible (as you can check) so we know for sure that $R$ is invertible by Theorem IMR. Furthermore,

$$M^{R^{-1}}_{C,B} = \left(M^R_{B,C}\right)^{-1} = \begin{bmatrix} 1&1&-1&2 \\ 2&3&-2&3 \\ 1&1&0&2 \\ -1&1&2&-5 \end{bmatrix}^{-1} = \begin{bmatrix} 20&-7&-2&3 \\ -8&3&1&-1 \\ -1&0&1&0 \\ -6&2&1&-1 \end{bmatrix}$$

We can use this representation of the inverse linear transformation, in concert with Theorem FTMR, to determine an explicit formula for the inverse itself,

$$\begin{aligned}
R^{-1}\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right) &= \rho_B^{-1}\left(M^{R^{-1}}_{C,B}\,\rho_C\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right)\right) && \text{Theorem FTMR} \\
&= \rho_B^{-1}\left(\left(M^R_{B,C}\right)^{-1}\rho_C\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right)\right) && \text{Theorem IMR} \\
&= \rho_B^{-1}\left(\left(M^R_{B,C}\right)^{-1}\begin{bmatrix} a\\b\\c\\d \end{bmatrix}\right) && \text{Definition VR} \\
&= \rho_B^{-1}\left(\begin{bmatrix} 20&-7&-2&3 \\ -8&3&1&-1 \\ -1&0&1&0 \\ -6&2&1&-1 \end{bmatrix}\begin{bmatrix} a\\b\\c\\d \end{bmatrix}\right) && \text{Definition MI} \\
&= \rho_B^{-1}\left(\begin{bmatrix} 20a-7b-2c+3d \\ -8a+3b+c-d \\ -a+c \\ -6a+2b+c-d \end{bmatrix}\right) && \text{Definition MVP} \\
&= (20a-7b-2c+3d) + (-8a+3b+c-d)x \\
&\qquad + (-a+c)x^2 + (-6a+2b+c-d)x^3 && \text{Definition VR}
\end{aligned}$$

△

You might look back at Example AIVLT, where we first witnessed the inverse of a linear transformation, and recognize that the inverse ($S$) was built using the method of Example ILTVR with a matrix representation of $T$.
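As a computational aside (not part of the text), the heart of this example is one matrix inversion and one matrix-vector product, so it is easy to confirm with SymPy; the matrix below is $M^R_{B,C}$ as built above.

```python
from sympy import Matrix, symbols

a, b, c, d = symbols('a b c d')

MR = Matrix([[ 1, 1, -1,  2],
             [ 2, 3, -2,  3],
             [ 1, 1,  0,  2],
             [-1, 1,  2, -5]])

# Coefficients of R^{-1}([[a, b], [c, d]]) relative to B = {1, x, x^2, x^3}
coeffs = MR.inv() * Matrix([a, b, c, d])
print(coeffs)  # (20a-7b-2c+3d, -8a+3b+c-d, -a+c, -6a+2b+c-d)
```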

Theorem IMILT Invertible Matrices, Invertible Linear Transformation
Suppose that $A$ is a square matrix of size $n$ and $T: \mathbb{C}^n \to \mathbb{C}^n$ is the linear transformation defined by $T(\mathbf{x}) = A\mathbf{x}$. Then $A$ is an invertible matrix if and only if $T$ is an invertible linear transformation.

Proof. Choose bases $B = C = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \ldots, \mathbf{e}_n\}$ consisting of the standard unit vectors as a basis of $\mathbb{C}^n$ (Theorem SUVB) and build a matrix representation of $T$ relative to $B$ and $C$. Then

$$\rho_C(T(\mathbf{e}_i)) = \rho_C(A\mathbf{e}_i) = \rho_C(\mathbf{A}_i) = \mathbf{A}_i$$

So the matrix representation of $T$, relative to $B$ and $C$, is simply $M^T_{B,C} = A$. With this observation, the proof becomes a specialization of Theorem IMR,

$$T \text{ is invertible} \iff M^T_{B,C} \text{ is invertible} \iff A \text{ is invertible}$$

∎

This theorem may seem gratuitous. Why state such a special case of Theorem IMR? Because it adds another condition to our NMEx series of theorems, and in some ways it is the most fundamental expression of what it means for a matrix to be nonsingular — the associated linear transformation is invertible. This is our final update.

Theorem NME9 Nonsingular Matrix Equivalences, Round 9
Suppose that $A$ is a square matrix of size $n$. The following are equivalent.

1. $A$ is nonsingular.

2. $A$ row-reduces to the identity matrix.

3. The null space of $A$ contains only the zero vector, $N(A) = \{\mathbf{0}\}$.

4. The linear system $\mathcal{LS}(A, \mathbf{b})$ has a unique solution for every possible choice of $\mathbf{b}$.

5. The columns of $A$ are a linearly independent set.

6. $A$ is invertible.

7. The column space of $A$ is $\mathbb{C}^n$, $C(A) = \mathbb{C}^n$.

8. The columns of $A$ are a basis for $\mathbb{C}^n$.

9. The rank of $A$ is $n$, $r(A) = n$.

10. The nullity of $A$ is zero, $n(A) = 0$.

11. The determinant of $A$ is nonzero, $\det(A) \neq 0$.

12. $\lambda = 0$ is not an eigenvalue of $A$.

13. The linear transformation $T: \mathbb{C}^n \to \mathbb{C}^n$ defined by $T(\mathbf{x}) = A\mathbf{x}$ is invertible.

Proof. By Theorem IMILT, the new addition to this list is equivalent to the statement that $A$ is invertible, so we can expand Theorem NME8. ∎

Reading Questions

1. Why does Theorem FTMR deserve the moniker "fundamental"?

2. Find the matrix representation, $M^T_{B,C}$, of the linear transformation

$$T: \mathbb{C}^2 \to \mathbb{C}^2,\quad T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 - x_2 \\ 3x_1 + 2x_2 \end{bmatrix}$$

relative to the bases

$$B = \left\{ \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \begin{bmatrix} -1 \\ 2 \end{bmatrix} \right\} \qquad C = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}$$

3. What is the second "surprise," and why is it surprising?

Exercises

C10 Example KVMR concludes with a basis for the kernel of the linear transformation $T$. Compute the value of $T$ for each of these two basis vectors. Did you get what you expected?

C20 † Compute the matrix representation of $T$ relative to the bases $B$ and $C$.

$$T: P_3 \to \mathbb{C}^3,\quad T\left(a + bx + cx^2 + dx^3\right) = \begin{bmatrix} 2a - 3b + 4c - 2d \\ a + b - c + d \\ 3a + 2c - 3d \end{bmatrix}$$

$$B = \left\{1, x, x^2, x^3\right\} \qquad C = \left\{ \begin{bmatrix} 1\\0\\0 \end{bmatrix}, \begin{bmatrix} 1\\1\\0 \end{bmatrix}, \begin{bmatrix} 1\\1\\1 \end{bmatrix} \right\}$$

C21 † Find a matrix representation of the linear transformation $T$ relative to the bases $B$ and $C$.

$$T: P_2 \to \mathbb{C}^2,\quad T(p(x)) = \begin{bmatrix} p(1) \\ p(3) \end{bmatrix}$$

$$B = \left\{2 - 5x + x^2,\ 1 + x - x^2,\ x^2\right\} \qquad C = \left\{ \begin{bmatrix} 3\\4 \end{bmatrix}, \begin{bmatrix} 2\\3 \end{bmatrix} \right\}$$

C22 † Let $S_{22}$ be the vector space of $2 \times 2$ symmetric matrices. Build the matrix representation of the linear transformation $T: P_2 \to S_{22}$ relative to the bases $B$ and $C$ and then use this matrix representation to compute $T\left(3 + 5x - 2x^2\right)$.

$$B = \left\{1,\ 1+x,\ 1+x+x^2\right\} \qquad C = \left\{ \begin{bmatrix} 1&0\\0&0 \end{bmatrix}, \begin{bmatrix} 0&1\\1&0 \end{bmatrix}, \begin{bmatrix} 0&0\\0&1 \end{bmatrix} \right\}$$

$$T\left(a + bx + cx^2\right) = \begin{bmatrix} 2a - b + c & a + 3b - c \\ a + 3b - c & a - c \end{bmatrix}$$

C25 † Use a matrix representation to determine if the linear transformation $T: P_3 \to M_{22}$ is surjective.

$$T\left(a + bx + cx^2 + dx^3\right) = \begin{bmatrix} -a + 4b + c + 2d & 4a - b + 6c - d \\ a + 5b - 2c + 2d & a + 2c + 5d \end{bmatrix}$$

C30 † Find bases for the kernel and range of the linear transformation $S$ below.

$$S: M_{22} \to P_2,\quad S\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right) = (a + 2b + 5c - 4d) + (3a - b + 8c + 2d)x + (a + b + 4c - 2d)x^2$$

C40 † Let $S_{22}$ be the set of $2 \times 2$ symmetric matrices. Verify that the linear transformation $R$ is invertible and find $R^{-1}$.

$$R: S_{22} \to P_2,\quad R\left(\begin{bmatrix} a&b\\b&c \end{bmatrix}\right) = (a - b) + (2a - 3b - 2c)x + (a - b + c)x^2$$

C41 † Prove that the linear transformation $S$ is invertible. Then find a formula for the inverse linear transformation, $S^{-1}$, by employing a matrix inverse.

$$S: P_1 \to M_{12},\quad S(a + bx) = \begin{bmatrix} 3a + b & 2a + b \end{bmatrix}$$

C42 † The linear transformation $R: M_{12} \to M_{21}$ is invertible. Use a matrix representation to determine a formula for the inverse linear transformation $R^{-1}: M_{21} \to M_{12}$.

$$R\left(\begin{bmatrix} a & b \end{bmatrix}\right) = \begin{bmatrix} a + 3b \\ 4a + 11b \end{bmatrix}$$

C50 † Use a matrix representation to find a basis for the range of the linear transformation $L$.

$$L: M_{22} \to P_2,\quad L\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right) = (a + 2b + 4c + d) + (3a + c - 2d)x + (-a + b + 3c + 3d)x^2$$

C51 Use a matrix representation to find a basis for the kernel of the linear transformation $L$.

$$L: M_{22} \to P_2,\quad L\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right) = (a + 2b + 4c + d) + (3a + c - 2d)x + (-a + b + 3c + 3d)x^2$$

C52 † Find a basis for the kernel of the linear transformation $T: P_2 \to M_{22}$.

$$T\left(a + bx + cx^2\right) = \begin{bmatrix} a + 2b - 2c & 2a + 2b \\ -a + b - 4c & 3a + 2b + 2c \end{bmatrix}$$

M20 † The linear transformation $D$ performs differentiation on polynomials. Use a matrix representation of $D$ to find the rank and nullity of $D$.

$$D: P_n \to P_n,\quad D(p(x)) = p'(x)$$

M60 Suppose $U$ and $V$ are vector spaces and define a function $Z: U \to V$ by $Z(\mathbf{u}) = \mathbf{0}_V$ for every $\mathbf{u} \in U$. Then Exercise IVLT.M60 asks you to formulate the theorem: $Z$ is invertible if and only if $U = \{\mathbf{0}_U\}$ and $V = \{\mathbf{0}_V\}$. What would a matrix representation of $Z$ look like in this case? How does Theorem IMR read in this case?

M80 In light of Theorem KNSI and Theorem MRCLT, write a short comparison of Exercise MM.T40 with Exercise ILT.T15.

M81 In light of Theorem RCSI and Theorem MRCLT, write a short comparison of Exercise CRS.T40 with Exercise SLT.T15.

M82 In light of Theorem MRCLT and Theorem IMR, write a short comparison of Theorem SS and Theorem ICLT.

M83 In light of Theorem MRCLT and Theorem IMR, write a short comparison of Theorem NPNT and Exercise IVLT.T40.

T20 † Construct a new solution to Exercise B.T50 along the following outline. From the $n \times n$ matrix $A$, construct the linear transformation $T: \mathbb{C}^n \to \mathbb{C}^n$, $T(\mathbf{x}) = A\mathbf{x}$. Use Theorem NI, Theorem IMILT and Theorem ILTIS to translate between the nonsingularity of $A$ and the surjectivity/injectivity of $T$. Then apply Theorem ILTB and Theorem SLTB to connect these properties with bases.

T40 Theorem VSLT defines the vector space $LT(U, V)$ containing all linear transformations with domain $U$ and codomain $V$. Suppose $\dim(U) = n$ and $\dim(V) = m$. Prove that $LT(U, V)$ is isomorphic to $M_{mn}$, the vector space of all $m \times n$ matrices (Example VSM). (Hint: we could have suggested this exercise in Chapter LT, but have postponed it to this section. Why?)

T41 Theorem VSLT defines the vector space $LT(U, V)$ containing all linear transformations with domain $U$ and codomain $V$. Determine a basis for $LT(U, V)$. (Hint: study Exercise MR.T40 first.)

T60 Create an entirely different proof of Theorem IMILT that relies on Definition IVLT to establish the invertibility of $T$, and that relies on Definition MI to establish the invertibility of $A$.

T80 † Suppose that $T: U \to V$ and $S: V \to W$ are linear transformations, and that $B$, $C$ and $D$ are bases for $U$, $V$, and $W$. Using only Definition MR define matrix representations for $T$ and $S$. Using these two definitions, and Definition MR, derive a matrix representation for the composition $S \circ T$ in terms of the entries of the matrices $M^T_{B,C}$ and $M^S_{C,D}$. Explain how you would use this result to motivate a definition for matrix multiplication that is strikingly similar to Theorem EMP.


Section CB
Change of Basis

We have seen in Section MR that a linear transformation can be represented by a matrix, once we pick bases for the domain and codomain. How does the matrix representation change if we choose different bases? Which bases lead to especially nice representations? From the infinite possibilities, what is the best possible representation? This section will begin to answer these questions. But first we need to define eigenvalues for linear transformations and the change-of-basis matrix.

Subsection EELT
Eigenvalues and Eigenvectors of Linear Transformations

We now define the notion of an eigenvalue and eigenvector of a linear transformation. It should not be too surprising, especially if you remind yourself of the close relationship between matrices and linear transformations.

Definition EELT Eigenvalue and Eigenvector of a Linear Transformation
Suppose that $T: V \to V$ is a linear transformation. Then a nonzero vector $\mathbf{v} \in V$ is an eigenvector of $T$ for the eigenvalue $\lambda$ if $T(\mathbf{v}) = \lambda\mathbf{v}$. □

We will see shortly the best method for computing the eigenvalues and eigenvectors of a linear transformation, but for now, here are some examples to verify that such things really do exist.

Example ELTBM Eigenvectors of linear transformation between matrices
Consider the linear transformation $T: M_{22} \to M_{22}$ defined by

$$T\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right) = \begin{bmatrix} -17a + 11b + 8c - 11d & -57a + 35b + 24c - 33d \\ -14a + 10b + 6c - 10d & -41a + 25b + 16c - 23d \end{bmatrix}$$

and the vectors

$$\mathbf{x}_1 = \begin{bmatrix} 0&1\\0&1 \end{bmatrix} \qquad \mathbf{x}_2 = \begin{bmatrix} 1&1\\1&0 \end{bmatrix} \qquad \mathbf{x}_3 = \begin{bmatrix} 1&3\\2&3 \end{bmatrix} \qquad \mathbf{x}_4 = \begin{bmatrix} 2&6\\1&4 \end{bmatrix}$$

Then compute

$$\begin{aligned}
T(\mathbf{x}_1) &= T\left(\begin{bmatrix} 0&1\\0&1 \end{bmatrix}\right) = \begin{bmatrix} 0&2\\0&2 \end{bmatrix} = 2\mathbf{x}_1 \\
T(\mathbf{x}_2) &= T\left(\begin{bmatrix} 1&1\\1&0 \end{bmatrix}\right) = \begin{bmatrix} 2&2\\2&0 \end{bmatrix} = 2\mathbf{x}_2 \\
T(\mathbf{x}_3) &= T\left(\begin{bmatrix} 1&3\\2&3 \end{bmatrix}\right) = \begin{bmatrix} -1&-3\\-2&-3 \end{bmatrix} = (-1)\mathbf{x}_3 \\
T(\mathbf{x}_4) &= T\left(\begin{bmatrix} 2&6\\1&4 \end{bmatrix}\right) = \begin{bmatrix} -4&-12\\-2&-8 \end{bmatrix} = (-2)\mathbf{x}_4
\end{aligned}$$

So $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$, $\mathbf{x}_4$ are eigenvectors of $T$ with eigenvalues (respectively) $\lambda_1 = 2$, $\lambda_2 = 2$, $\lambda_3 = -1$, $\lambda_4 = -2$. △

Here is another.

Example ELTBP Eigenvectors of linear transformation between polynomials
Consider the linear transformation $R: P_2 \to P_2$ defined by

$$R\left(a + bx + cx^2\right) = (15a + 8b - 4c) + (-12a - 6b + 3c)x + (24a + 14b - 7c)x^2$$

and the vectors

$$\mathbf{w}_1 = 1 - x + x^2 \qquad \mathbf{w}_2 = x + 2x^2 \qquad \mathbf{w}_3 = 1 + 4x^2$$

Then compute

$$\begin{aligned}
R(\mathbf{w}_1) &= R\left(1 - x + x^2\right) = 3 - 3x + 3x^2 = 3\mathbf{w}_1 \\
R(\mathbf{w}_2) &= R\left(x + 2x^2\right) = 0 + 0x + 0x^2 = 0\mathbf{w}_2 \\
R(\mathbf{w}_3) &= R\left(1 + 4x^2\right) = -1 - 4x^2 = (-1)\mathbf{w}_3
\end{aligned}$$

So $\mathbf{w}_1$, $\mathbf{w}_2$, $\mathbf{w}_3$ are eigenvectors of $R$ with eigenvalues (respectively) $\lambda_1 = 3$, $\lambda_2 = 0$, $\lambda_3 = -1$. Notice how the eigenvalue $\lambda_2 = 0$ indicates that the eigenvector $\mathbf{w}_2$ is a nontrivial element of the kernel of $R$, and therefore $R$ is not injective (Exercise CB.T15). △
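The general method for finding such eigenvectors arrives later in this section, but a small SymPy aside (not part of the text) can already confirm these three claims, working with coordinate vectors relative to the basis $\{1, x, x^2\}$.

```python
from sympy import Matrix

# Coordinates of R relative to {1, x, x^2}: columns are the
# coefficient vectors of R(1), R(x), R(x^2)
M = Matrix([[ 15,  8, -4],
            [-12, -6,  3],
            [ 24, 14, -7]])

w1, w2, w3 = Matrix([1, -1, 1]), Matrix([0, 1, 2]), Matrix([1, 0, 4])
for lam, w in [(3, w1), (0, w2), (-1, w3)]:
    assert M * w == lam * w   # R(w) = lambda w, expressed in coordinates
```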

Of course, these examples are meant only to illustrate the definition of eigenvectors and eigenvalues for linear transformations, and therefore beg the question, "How would I find eigenvectors?" We will have an answer before we finish this section. We need one more construction first.

Subsection CBM
Change-of-Basis Matrix

Given a vector space, we know we can usually find many different bases for the vector space, some nice, some nasty. If we choose a single vector from this vector space, we can build many different representations of the vector by constructing the representations relative to different bases. How are these different representations related to each other? A change-of-basis matrix answers this question.

Definition CBM Change-of-Basis Matrix
Suppose that $V$ is a vector space, and $I_V: V \to V$ is the identity linear transformation on $V$. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_n\}$ and $C$ be two bases of $V$. Then the change-of-basis matrix from $B$ to $C$ is the matrix representation of $I_V$ relative to $B$ and $C$,

$$C_{B,C} = M^{I_V}_{B,C} = \left[\rho_C(I_V(\mathbf{v}_1))\middle|\rho_C(I_V(\mathbf{v}_2))\middle|\rho_C(I_V(\mathbf{v}_3))\middle|\cdots\middle|\rho_C(I_V(\mathbf{v}_n))\right] = \left[\rho_C(\mathbf{v}_1)\middle|\rho_C(\mathbf{v}_2)\middle|\rho_C(\mathbf{v}_3)\middle|\cdots\middle|\rho_C(\mathbf{v}_n)\right]$$

□

Notice that this definition is primarily about a single vector space ($V$) and two bases of $V$ ($B$, $C$). The linear transformation ($I_V$) is necessary but not critical. As you might expect, this matrix has something to do with changing bases. Here is the theorem that gives the matrix its name (not the other way around).

Theorem CB Change-of-Basis
Suppose that $\mathbf{v}$ is a vector in the vector space $V$ and $B$ and $C$ are bases of $V$. Then

$$\rho_C(\mathbf{v}) = C_{B,C}\,\rho_B(\mathbf{v})$$

Proof.

$$\begin{aligned}
\rho_C(\mathbf{v}) &= \rho_C(I_V(\mathbf{v})) && \text{Definition IDLT} \\
&= M^{I_V}_{B,C}\,\rho_B(\mathbf{v}) && \text{Theorem FTMR} \\
&= C_{B,C}\,\rho_B(\mathbf{v}) && \text{Definition CBM}
\end{aligned}$$

∎

So the change-of-basis matrix can be used with matrix multiplication to convert a vector representation of a vector ($\mathbf{v}$) relative to one basis ($\rho_B(\mathbf{v})$) to a representation of the same vector relative to a second basis ($\rho_C(\mathbf{v})$).

Theorem ICBM Inverse of Change-of-Basis Matrix
Suppose that $V$ is a vector space, and $B$ and $C$ are bases of $V$. Then the change-of-basis matrix $C_{B,C}$ is nonsingular and

$$C_{B,C}^{-1} = C_{C,B}$$

Proof. The linear transformation $I_V: V \to V$ is invertible, and its inverse is itself, $I_V$ (check this!). So by Theorem IMR, the matrix $M^{I_V}_{B,C} = C_{B,C}$ is invertible. Theorem NI says an invertible matrix is nonsingular.

Then

$$\begin{aligned}
C_{B,C}^{-1} &= \left(M^{I_V}_{B,C}\right)^{-1} && \text{Definition CBM} \\
&= M^{I_V^{-1}}_{C,B} && \text{Theorem IMR} \\
&= M^{I_V}_{C,B} && \text{Definition IDLT} \\
&= C_{C,B} && \text{Definition CBM}
\end{aligned}$$

∎
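For column vectors there is a convenient computational shortcut worth noting here (an observation, not a numbered result of the text): if the vectors of bases $B$ and $C$ of $\mathbb{C}^n$ are packed as the columns of matrices $N_B$ and $N_C$, then $\rho_B(\mathbf{v}) = N_B^{-1}\mathbf{v}$, and therefore $C_{B,C} = N_C^{-1}N_B$. A short SymPy sketch with made-up bases:

```python
from sympy import Matrix

NB = Matrix([[1, 2], [1, 3]])   # basis B of C^2, vectors as columns (illustrative)
NC = Matrix([[1, 1], [0, 1]])   # basis C of C^2, vectors as columns (illustrative)

CBC = NC.inv() * NB             # change-of-basis matrix C_{B,C}
v = Matrix([3, 5])              # an arbitrary vector
assert CBC * (NB.inv() * v) == NC.inv() * v   # Theorem CB
assert CBC.inv() == NB.inv() * NC             # Theorem ICBM: C_{B,C}^{-1} = C_{C,B}
```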


Example CBP Change of basis with polynomials
The vector space $P_4$ (Example VSP) has two nice bases (Example BP),

$$B = \left\{1, x, x^2, x^3, x^4\right\}$$
$$C = \left\{1,\ 1+x,\ 1+x+x^2,\ 1+x+x^2+x^3,\ 1+x+x^2+x^3+x^4\right\}$$

To build the change-of-basis matrix between $B$ and $C$, we must first build a vector representation of each vector in $B$ relative to $C$,

$$\rho_C(1) = \rho_C((1)(1)) = \begin{bmatrix} 1\\0\\0\\0\\0 \end{bmatrix} \qquad \rho_C(x) = \rho_C((-1)(1) + (1)(1+x)) = \begin{bmatrix} -1\\1\\0\\0\\0 \end{bmatrix}$$

$$\rho_C\left(x^2\right) = \rho_C\left((-1)(1+x) + (1)\left(1+x+x^2\right)\right) = \begin{bmatrix} 0\\-1\\1\\0\\0 \end{bmatrix}$$

$$\rho_C\left(x^3\right) = \rho_C\left((-1)\left(1+x+x^2\right) + (1)\left(1+x+x^2+x^3\right)\right) = \begin{bmatrix} 0\\0\\-1\\1\\0 \end{bmatrix}$$

$$\rho_C\left(x^4\right) = \rho_C\left((-1)\left(1+x+x^2+x^3\right) + (1)\left(1+x+x^2+x^3+x^4\right)\right) = \begin{bmatrix} 0\\0\\0\\-1\\1 \end{bmatrix}$$

Then we package up these vectors as the columns of a matrix,

$$C_{B,C} = \begin{bmatrix} 1&-1&0&0&0 \\ 0&1&-1&0&0 \\ 0&0&1&-1&0 \\ 0&0&0&1&-1 \\ 0&0&0&0&1 \end{bmatrix}$$

Now, to illustrate Theorem CB, consider the vector $\mathbf{u} = 5 - 3x + 2x^2 + 8x^3 - 3x^4$. We can build the representation of $\mathbf{u}$ relative to $B$ easily,

$$\rho_B(\mathbf{u}) = \rho_B\left(5 - 3x + 2x^2 + 8x^3 - 3x^4\right) = \begin{bmatrix} 5\\-3\\2\\8\\-3 \end{bmatrix}$$

Applying Theorem CB, we obtain a second representation of $\mathbf{u}$, but now relative to $C$,

$$\begin{aligned}
\rho_C(\mathbf{u}) &= C_{B,C}\,\rho_B(\mathbf{u}) && \text{Theorem CB} \\
&= \begin{bmatrix} 1&-1&0&0&0 \\ 0&1&-1&0&0 \\ 0&0&1&-1&0 \\ 0&0&0&1&-1 \\ 0&0&0&0&1 \end{bmatrix} \begin{bmatrix} 5\\-3\\2\\8\\-3 \end{bmatrix} \\
&= \begin{bmatrix} 8\\-5\\-6\\11\\-3 \end{bmatrix} && \text{Definition MVP}
\end{aligned}$$

We can check our work by unraveling this second representation,

$$\begin{aligned}
\mathbf{u} &= \rho_C^{-1}(\rho_C(\mathbf{u})) && \text{Definition IVLT} \\
&= \rho_C^{-1}\left(\begin{bmatrix} 8\\-5\\-6\\11\\-3 \end{bmatrix}\right) \\
&= 8(1) + (-5)(1+x) + (-6)(1+x+x^2) \\
&\qquad + (11)(1+x+x^2+x^3) + (-3)(1+x+x^2+x^3+x^4) && \text{Definition VR} \\
&= 5 - 3x + 2x^2 + 8x^3 - 3x^4
\end{aligned}$$

The change-of-basis matrix from $C$ to $B$ is actually easier to build. Grab each vector in the basis $C$ and form its representation relative to $B$,

$$\rho_B(1) = \begin{bmatrix} 1\\0\\0\\0\\0 \end{bmatrix} \quad \rho_B(1+x) = \begin{bmatrix} 1\\1\\0\\0\\0 \end{bmatrix} \quad \rho_B\left(1+x+x^2\right) = \begin{bmatrix} 1\\1\\1\\0\\0 \end{bmatrix} \quad \rho_B\left(1+x+x^2+x^3\right) = \begin{bmatrix} 1\\1\\1\\1\\0 \end{bmatrix} \quad \rho_B\left(1+x+x^2+x^3+x^4\right) = \begin{bmatrix} 1\\1\\1\\1\\1 \end{bmatrix}$$

Then we package up these vectors as the columns of a matrix,

$$C_{C,B} = \begin{bmatrix} 1&1&1&1&1 \\ 0&1&1&1&1 \\ 0&0&1&1&1 \\ 0&0&0&1&1 \\ 0&0&0&0&1 \end{bmatrix}$$

We formed two representations of the vector $\mathbf{u}$ above, so we can again provide a check on our computations by converting from the representation of $\mathbf{u}$ relative to $C$ to the representation of $\mathbf{u}$ relative to $B$,

$$\begin{aligned}
\rho_B(\mathbf{u}) &= C_{C,B}\,\rho_C(\mathbf{u}) && \text{Theorem CB} \\
&= \begin{bmatrix} 1&1&1&1&1 \\ 0&1&1&1&1 \\ 0&0&1&1&1 \\ 0&0&0&1&1 \\ 0&0&0&0&1 \end{bmatrix} \begin{bmatrix} 8\\-5\\-6\\11\\-3 \end{bmatrix} \\
&= \begin{bmatrix} 5\\-3\\2\\8\\-3 \end{bmatrix} && \text{Definition MVP}
\end{aligned}$$

One more computation that is either a check on our work, or an illustration of a theorem. The two change-of-basis matrices, $C_{B,C}$ and $C_{C,B}$, should be inverses of each other, according to Theorem ICBM. Here we go,

$$C_{B,C}\,C_{C,B} = \begin{bmatrix} 1&-1&0&0&0 \\ 0&1&-1&0&0 \\ 0&0&1&-1&0 \\ 0&0&0&1&-1 \\ 0&0&0&0&1 \end{bmatrix} \begin{bmatrix} 1&1&1&1&1 \\ 0&1&1&1&1 \\ 0&0&1&1&1 \\ 0&0&0&1&1 \\ 0&0&0&0&1 \end{bmatrix} = \begin{bmatrix} 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0 \\ 0&0&0&0&1 \end{bmatrix}$$

△
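A brief SymPy aside (not part of the text) repeats the two checks that close this example.

```python
from sympy import Matrix, eye

CBC = Matrix([[1, -1, 0, 0, 0], [0, 1, -1, 0, 0], [0, 0, 1, -1, 0],
              [0, 0, 0, 1, -1], [0, 0, 0, 0, 1]])
CCB = Matrix([[1, 1, 1, 1, 1], [0, 1, 1, 1, 1], [0, 0, 1, 1, 1],
              [0, 0, 0, 1, 1], [0, 0, 0, 0, 1]])

rhoB_u = Matrix([5, -3, 2, 8, -3])  # u = 5 - 3x + 2x^2 + 8x^3 - 3x^4, relative to B
assert CBC * rhoB_u == Matrix([8, -5, -6, 11, -3])  # rho_C(u), as computed above
assert CBC * CCB == eye(5)                          # Theorem ICBM
```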

The computations of the previous example are not meant to present any labor-saving devices, but instead are meant to illustrate the utility of the change-of-basis matrix. However, you might have noticed that $C_{C,B}$ was easier to compute than $C_{B,C}$. If you needed $C_{B,C}$, then you could first compute $C_{C,B}$ and then compute its inverse, which by Theorem ICBM, would equal $C_{B,C}$.

Here is another illustrative example. We have been concentrating on working with abstract vector spaces, but all of our theorems and techniques apply just as well to $\mathbb{C}^m$, the vector space of column vectors. We only need to use more complicated bases than the standard unit vectors (Theorem SUVB) to make things interesting.

Example CBCV Change of basis with column vectors
For the vector space $\mathbb{C}^4$ we have the two bases,

$$B = \left\{ \begin{bmatrix} 1\\-2\\1\\-2 \end{bmatrix}, \begin{bmatrix} -1\\3\\1\\1 \end{bmatrix}, \begin{bmatrix} 2\\-3\\3\\-4 \end{bmatrix}, \begin{bmatrix} -1\\3\\3\\0 \end{bmatrix} \right\} \qquad C = \left\{ \begin{bmatrix} 1\\-6\\-4\\-1 \end{bmatrix}, \begin{bmatrix} -4\\8\\-5\\8 \end{bmatrix}, \begin{bmatrix} -5\\13\\-2\\9 \end{bmatrix}, \begin{bmatrix} 3\\-7\\3\\-6 \end{bmatrix} \right\}$$

The change-of-basis matrix from $B$ to $C$ requires writing each vector of $B$ as a linear combination of the vectors in $C$,

$$\rho_C\left(\begin{bmatrix} 1\\-2\\1\\-2 \end{bmatrix}\right) = \rho_C\left((1)\begin{bmatrix} 1\\-6\\-4\\-1 \end{bmatrix} + (-2)\begin{bmatrix} -4\\8\\-5\\8 \end{bmatrix} + (1)\begin{bmatrix} -5\\13\\-2\\9 \end{bmatrix} + (-1)\begin{bmatrix} 3\\-7\\3\\-6 \end{bmatrix}\right) = \begin{bmatrix} 1\\-2\\1\\-1 \end{bmatrix}$$

$$\rho_C\left(\begin{bmatrix} -1\\3\\1\\1 \end{bmatrix}\right) = \rho_C\left((2)\begin{bmatrix} 1\\-6\\-4\\-1 \end{bmatrix} + (-3)\begin{bmatrix} -4\\8\\-5\\8 \end{bmatrix} + (3)\begin{bmatrix} -5\\13\\-2\\9 \end{bmatrix} + (0)\begin{bmatrix} 3\\-7\\3\\-6 \end{bmatrix}\right) = \begin{bmatrix} 2\\-3\\3\\0 \end{bmatrix}$$

$$\rho_C\left(\begin{bmatrix} 2\\-3\\3\\-4 \end{bmatrix}\right) = \rho_C\left((1)\begin{bmatrix} 1\\-6\\-4\\-1 \end{bmatrix} + (-3)\begin{bmatrix} -4\\8\\-5\\8 \end{bmatrix} + (1)\begin{bmatrix} -5\\13\\-2\\9 \end{bmatrix} + (-2)\begin{bmatrix} 3\\-7\\3\\-6 \end{bmatrix}\right) = \begin{bmatrix} 1\\-3\\1\\-2 \end{bmatrix}$$

$$\rho_C\left(\begin{bmatrix} -1\\3\\3\\0 \end{bmatrix}\right) = \rho_C\left((2)\begin{bmatrix} 1\\-6\\-4\\-1 \end{bmatrix} + (-2)\begin{bmatrix} -4\\8\\-5\\8 \end{bmatrix} + (4)\begin{bmatrix} -5\\13\\-2\\9 \end{bmatrix} + (3)\begin{bmatrix} 3\\-7\\3\\-6 \end{bmatrix}\right) = \begin{bmatrix} 2\\-2\\4\\3 \end{bmatrix}$$

Then we package these vectors up as the change-of-basis matrix,

$$C_{B,C} = \begin{bmatrix} 1&2&1&2 \\ -2&-3&-3&-2 \\ 1&3&1&4 \\ -1&0&-2&3 \end{bmatrix}$$

Now consider a single (arbitrary) vector $\mathbf{y} = \begin{bmatrix} 2\\6\\-3\\4 \end{bmatrix}$. First, build the vector representation of $\mathbf{y}$ relative to $B$. This will require writing $\mathbf{y}$ as a linear combination of the vectors in $B$,

$$\rho_B(\mathbf{y}) = \rho_B\left(\begin{bmatrix} 2\\6\\-3\\4 \end{bmatrix}\right) = \rho_B\left((-21)\begin{bmatrix} 1\\-2\\1\\-2 \end{bmatrix} + (6)\begin{bmatrix} -1\\3\\1\\1 \end{bmatrix} + (11)\begin{bmatrix} 2\\-3\\3\\-4 \end{bmatrix} + (-7)\begin{bmatrix} -1\\3\\3\\0 \end{bmatrix}\right) = \begin{bmatrix} -21\\6\\11\\-7 \end{bmatrix}$$

Now, applying Theorem CB we can convert the representation of $\mathbf{y}$ relative to $B$ into a representation relative to $C$,

$$\begin{aligned}
\rho_C(\mathbf{y}) &= C_{B,C}\,\rho_B(\mathbf{y}) && \text{Theorem CB} \\
&= \begin{bmatrix} 1&2&1&2 \\ -2&-3&-3&-2 \\ 1&3&1&4 \\ -1&0&-2&3 \end{bmatrix} \begin{bmatrix} -21\\6\\11\\-7 \end{bmatrix} \\
&= \begin{bmatrix} -12\\5\\-20\\-22 \end{bmatrix} && \text{Definition MVP}
\end{aligned}$$

We could continue further with this example, perhaps by computing the representation of $\mathbf{y}$ relative to the basis $C$ directly as a check on our work (Exercise CB.C20). Or we could choose another vector to play the role of $\mathbf{y}$ and compute two different representations of this vector relative to the two bases $B$ and $C$. △
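Each representation above required solving a linear system; using the matrix-inverse observation from the earlier sketch, the whole example compresses to a few SymPy lines (an aside, with the bases of this example as columns).

```python
from sympy import Matrix

NB = Matrix([[ 1, -1,  2, -1], [-2,  3, -3,  3],
             [ 1,  1,  3,  3], [-2,  1, -4,  0]])
NC = Matrix([[ 1, -4, -5,  3], [-6,  8, 13, -7],
             [-4, -5, -2,  3], [-1,  8,  9, -6]])

CBC = NC.inv() * NB                   # C_{B,C}
y = Matrix([2, 6, -3, 4])
print(CBC)                            # matches the matrix above
print(NB.inv() * y, NC.inv() * y)     # rho_B(y) and rho_C(y)
```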


Subsection MRS
Matrix Representations and Similarity

Here is the main theorem of this section. It looks a bit involved at first glance, but the proof should make you realize it is not all that complicated. In any event, we are more interested in a special case.

Theorem MRCB Matrix Representation and Change of Basis
Suppose that $T: U \to V$ is a linear transformation, $B$ and $C$ are bases for $U$, and $D$ and $E$ are bases for $V$. Then

$$M^T_{B,D} = C_{E,D}\,M^T_{C,E}\,C_{B,C}$$

Proof.

$$\begin{aligned}
C_{E,D}\,M^T_{C,E}\,C_{B,C} &= M^{I_V}_{E,D}\,M^T_{C,E}\,M^{I_U}_{B,C} && \text{Definition CBM} \\
&= M^{I_V}_{E,D}\,M^{T \circ I_U}_{B,E} && \text{Theorem MRCLT} \\
&= M^{I_V}_{E,D}\,M^T_{B,E} && \text{Definition IDLT} \\
&= M^{I_V \circ T}_{B,D} && \text{Theorem MRCLT} \\
&= M^T_{B,D} && \text{Definition IDLT}
\end{aligned}$$

∎

We will be most interested in a special case of this theorem (Theorem SCB), but here is an example that illustrates the full generality of Theorem MRCB.

Example MRCM Matrix representations and change-of-basis matrices
Begin with two vector spaces, $S_2$, the subspace of $M_{22}$ containing all $2 \times 2$ symmetric matrices, and $P_3$ (Example VSP), the vector space of all polynomials of degree 3 or less. Then define the linear transformation $Q: S_2 \to P_3$ by

$$Q\left(\begin{bmatrix} a&b\\b&c \end{bmatrix}\right) = (5a - 2b + 6c) + (3a - b + 2c)x + (a + 3b - c)x^2 + (-4a + 2b + c)x^3$$

Here are two bases for each vector space, one nice, one nasty. First for $S_2$,

$$B = \left\{ \begin{bmatrix} 5&-3\\-3&-2 \end{bmatrix}, \begin{bmatrix} 2&-3\\-3&0 \end{bmatrix}, \begin{bmatrix} 1&2\\2&4 \end{bmatrix} \right\} \qquad C = \left\{ \begin{bmatrix} 1&0\\0&0 \end{bmatrix}, \begin{bmatrix} 0&1\\1&0 \end{bmatrix}, \begin{bmatrix} 0&0\\0&1 \end{bmatrix} \right\}$$

and then for $P_3$,

$$D = \left\{ 2 + x - 2x^2 + 3x^3,\ -1 - 2x^2 + 3x^3,\ -3 - x + x^3,\ -x^2 + x^3 \right\}$$
$$E = \left\{ 1, x, x^2, x^3 \right\}$$

We will begin with a matrix representation of $Q$ relative to $C$ and $E$. We first find vector representations of the elements of $C$ relative to $E$,

$$\rho_E\left(Q\left(\begin{bmatrix} 1&0\\0&0 \end{bmatrix}\right)\right) = \rho_E\left(5 + 3x + x^2 - 4x^3\right) = \begin{bmatrix} 5\\3\\1\\-4 \end{bmatrix}$$

$$\rho_E\left(Q\left(\begin{bmatrix} 0&1\\1&0 \end{bmatrix}\right)\right) = \rho_E\left(-2 - x + 3x^2 + 2x^3\right) = \begin{bmatrix} -2\\-1\\3\\2 \end{bmatrix}$$

$$\rho_E\left(Q\left(\begin{bmatrix} 0&0\\0&1 \end{bmatrix}\right)\right) = \rho_E\left(6 + 2x - x^2 + x^3\right) = \begin{bmatrix} 6\\2\\-1\\1 \end{bmatrix}$$

So

$$M^Q_{C,E} = \begin{bmatrix} 5&-2&6 \\ 3&-1&2 \\ 1&3&-1 \\ -4&2&1 \end{bmatrix}$$

Now we construct two change-of-basis matrices. First, $C_{B,C}$ requires vector representations of the elements of $B$, relative to $C$. Since $C$ is a nice basis, this is straightforward,

$$\rho_C\left(\begin{bmatrix} 5&-3\\-3&-2 \end{bmatrix}\right) = \rho_C\left((5)\begin{bmatrix} 1&0\\0&0 \end{bmatrix} + (-3)\begin{bmatrix} 0&1\\1&0 \end{bmatrix} + (-2)\begin{bmatrix} 0&0\\0&1 \end{bmatrix}\right) = \begin{bmatrix} 5\\-3\\-2 \end{bmatrix}$$

$$\rho_C\left(\begin{bmatrix} 2&-3\\-3&0 \end{bmatrix}\right) = \rho_C\left((2)\begin{bmatrix} 1&0\\0&0 \end{bmatrix} + (-3)\begin{bmatrix} 0&1\\1&0 \end{bmatrix} + (0)\begin{bmatrix} 0&0\\0&1 \end{bmatrix}\right) = \begin{bmatrix} 2\\-3\\0 \end{bmatrix}$$

$$\rho_C\left(\begin{bmatrix} 1&2\\2&4 \end{bmatrix}\right) = \rho_C\left((1)\begin{bmatrix} 1&0\\0&0 \end{bmatrix} + (2)\begin{bmatrix} 0&1\\1&0 \end{bmatrix} + (4)\begin{bmatrix} 0&0\\0&1 \end{bmatrix}\right) = \begin{bmatrix} 1\\2\\4 \end{bmatrix}$$

So

$$C_{B,C} = \begin{bmatrix} 5&2&1 \\ -3&-3&2 \\ -2&0&4 \end{bmatrix}$$

The other change-of-basis matrix we will compute is $C_{E,D}$. However, since $E$ is a nice basis (and $D$ is not) we will turn it around and instead compute $C_{D,E}$, and apply Theorem ICBM to use an inverse to compute $C_{E,D}$.

$$\rho_E\left(2 + x - 2x^2 + 3x^3\right) = \rho_E\left((2)1 + (1)x + (-2)x^2 + (3)x^3\right) = \begin{bmatrix} 2\\1\\-2\\3 \end{bmatrix}$$

$$\rho_E\left(-1 - 2x^2 + 3x^3\right) = \rho_E\left((-1)1 + (0)x + (-2)x^2 + (3)x^3\right) = \begin{bmatrix} -1\\0\\-2\\3 \end{bmatrix}$$

$$\rho_E\left(-3 - x + x^3\right) = \rho_E\left((-3)1 + (-1)x + (0)x^2 + (1)x^3\right) = \begin{bmatrix} -3\\-1\\0\\1 \end{bmatrix}$$

$$\rho_E\left(-x^2 + x^3\right) = \rho_E\left((0)1 + (0)x + (-1)x^2 + (1)x^3\right) = \begin{bmatrix} 0\\0\\-1\\1 \end{bmatrix}$$

So, we can package these column vectors up as a matrix to obtain $C_{D,E}$ and then,

$$C_{E,D} = \left(C_{D,E}\right)^{-1} = \begin{bmatrix} 2&-1&-3&0 \\ 1&0&-1&0 \\ -2&-2&0&-1 \\ 3&3&1&1 \end{bmatrix}^{-1} = \begin{bmatrix} 1&-2&1&1 \\ -2&5&-1&-1 \\ 1&-3&1&1 \\ 2&-6&-1&0 \end{bmatrix} \qquad \text{(Theorem ICBM)}$$

We are now in a position to apply Theorem MRCB. The matrix representation of $Q$ relative to $B$ and $D$ can be obtained as follows,

$$\begin{aligned}
M^Q_{B,D} &= C_{E,D}\,M^Q_{C,E}\,C_{B,C} && \text{Theorem MRCB} \\
&= \begin{bmatrix} 1&-2&1&1 \\ -2&5&-1&-1 \\ 1&-3&1&1 \\ 2&-6&-1&0 \end{bmatrix} \begin{bmatrix} 5&-2&6 \\ 3&-1&2 \\ 1&3&-1 \\ -4&2&1 \end{bmatrix} \begin{bmatrix} 5&2&1 \\ -3&-3&2 \\ -2&0&4 \end{bmatrix} \\
&= \begin{bmatrix} 1&-2&1&1 \\ -2&5&-1&-1 \\ 1&-3&1&1 \\ 2&-6&-1&0 \end{bmatrix} \begin{bmatrix} 19&16&25 \\ 14&9&9 \\ -2&-7&3 \\ -28&-14&4 \end{bmatrix} \\
&= \begin{bmatrix} -39&-23&14 \\ 62&34&-12 \\ -53&-32&5 \\ -44&-15&-7 \end{bmatrix}
\end{aligned}$$

Now check our work by computing $M^Q_{B,D}$ directly (Exercise CB.C21). △
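If you would rather let software grind through the triple product, here is a SymPy aside with the three matrices copied from this example.

```python
from sympy import Matrix

CED  = Matrix([[1, -2, 1, 1], [-2, 5, -1, -1], [1, -3, 1, 1], [2, -6, -1, 0]])
MQCE = Matrix([[5, -2, 6], [3, -1, 2], [1, 3, -1], [-4, 2, 1]])
CBC  = Matrix([[5, 2, 1], [-3, -3, 2], [-2, 0, 4]])

# Theorem MRCB: M^Q_{B,D} = C_{E,D} M^Q_{C,E} C_{B,C}
print(CED * MQCE * CBC)
# Matrix([[-39, -23, 14], [62, 34, -12], [-53, -32, 5], [-44, -15, -7]])
```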

Here is a special case of the previous theorem, where we choose $U$ and $V$ to be the same vector space, so the matrix representations and the change-of-basis matrices are all square of the same size.

Theorem SCB Similarity and Change of Basis
Suppose that $T: V \to V$ is a linear transformation and $B$ and $C$ are bases of $V$. Then

$$M^T_{B,B} = C_{B,C}^{-1}\,M^T_{C,C}\,C_{B,C}$$

Proof. In the conclusion of Theorem MRCB, replace $D$ by $B$, and replace $E$ by $C$,

$$\begin{aligned}
M^T_{B,B} &= C_{C,B}\,M^T_{C,C}\,C_{B,C} && \text{Theorem MRCB} \\
&= C_{B,C}^{-1}\,M^T_{C,C}\,C_{B,C} && \text{Theorem ICBM}
\end{aligned}$$

∎

This is the third surprise of this chapter. Theorem SCB considers the special case where a linear transformation has the same vector space for the domain and codomain ($V$). We build a matrix representation of $T$ using the basis $B$ simultaneously for both the domain and codomain ($M^T_{B,B}$), and then we build a second matrix representation of $T$, now using the basis $C$ for both the domain and codomain ($M^T_{C,C}$). Then these two representations are related via a similarity transformation (Definition SIM) using a change-of-basis matrix ($C_{B,C}$)!

Example MRBE Matrix representation with basis of eigenvectors
We return to the linear transformation $T: M_{22} \to M_{22}$ of Example ELTBM defined by

$$T\left(\begin{bmatrix} a&b\\c&d \end{bmatrix}\right) = \begin{bmatrix} -17a + 11b + 8c - 11d & -57a + 35b + 24c - 33d \\ -14a + 10b + 6c - 10d & -41a + 25b + 16c - 23d \end{bmatrix}$$

In Example ELTBM we showcased four eigenvectors of $T$. We will now put these four vectors in a set,

$$B = \{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4\} = \left\{ \begin{bmatrix} 0&1\\0&1 \end{bmatrix}, \begin{bmatrix} 1&1\\1&0 \end{bmatrix}, \begin{bmatrix} 1&3\\2&3 \end{bmatrix}, \begin{bmatrix} 2&6\\1&4 \end{bmatrix} \right\}$$

Check that $B$ is a basis of $M_{22}$ by first establishing the linear independence of $B$ and then employing Theorem G to get the spanning property easily. Here is a second set of $2 \times 2$ matrices, which also forms a basis of $M_{22}$ (Example BM),

$$C = \{\mathbf{y}_1, \mathbf{y}_2, \mathbf{y}_3, \mathbf{y}_4\} = \left\{ \begin{bmatrix} 1&0\\0&0 \end{bmatrix}, \begin{bmatrix} 0&1\\0&0 \end{bmatrix}, \begin{bmatrix} 0&0\\1&0 \end{bmatrix}, \begin{bmatrix} 0&0\\0&1 \end{bmatrix} \right\}$$

We can build two matrix representations of $T$, one relative to $B$ and one relative to $C$. Each is easy, but for wildly different reasons. In our computation of the matrix representation relative to $B$ we borrow some of our work in Example ELTBM. Here are the representations, then the explanation.

$$\begin{aligned}
\rho_B(T(\mathbf{x}_1)) &= \rho_B(2\mathbf{x}_1) = \rho_B(2\mathbf{x}_1 + 0\mathbf{x}_2 + 0\mathbf{x}_3 + 0\mathbf{x}_4) = \begin{bmatrix} 2\\0\\0\\0 \end{bmatrix} \\
\rho_B(T(\mathbf{x}_2)) &= \rho_B(2\mathbf{x}_2) = \rho_B(0\mathbf{x}_1 + 2\mathbf{x}_2 + 0\mathbf{x}_3 + 0\mathbf{x}_4) = \begin{bmatrix} 0\\2\\0\\0 \end{bmatrix} \\
\rho_B(T(\mathbf{x}_3)) &= \rho_B((-1)\mathbf{x}_3) = \rho_B(0\mathbf{x}_1 + 0\mathbf{x}_2 + (-1)\mathbf{x}_3 + 0\mathbf{x}_4) = \begin{bmatrix} 0\\0\\-1\\0 \end{bmatrix} \\
\rho_B(T(\mathbf{x}_4)) &= \rho_B((-2)\mathbf{x}_4) = \rho_B(0\mathbf{x}_1 + 0\mathbf{x}_2 + 0\mathbf{x}_3 + (-2)\mathbf{x}_4) = \begin{bmatrix} 0\\0\\0\\-2 \end{bmatrix}
\end{aligned}$$

So the resulting representation is

$$M^T_{B,B} = \begin{bmatrix} 2&0&0&0 \\ 0&2&0&0 \\ 0&0&-1&0 \\ 0&0&0&-2 \end{bmatrix}$$

Very pretty.

Now for the matrix representation relative to $C$, first compute,

$$\begin{aligned}
\rho_C(T(\mathbf{y}_1)) &= \rho_C\left(\begin{bmatrix} -17&-57\\-14&-41 \end{bmatrix}\right) = \begin{bmatrix} -17\\-57\\-14\\-41 \end{bmatrix} \\
\rho_C(T(\mathbf{y}_2)) &= \rho_C\left(\begin{bmatrix} 11&35\\10&25 \end{bmatrix}\right) = \begin{bmatrix} 11\\35\\10\\25 \end{bmatrix} \\
\rho_C(T(\mathbf{y}_3)) &= \rho_C\left(\begin{bmatrix} 8&24\\6&16 \end{bmatrix}\right) = \begin{bmatrix} 8\\24\\6\\16 \end{bmatrix} \\
\rho_C(T(\mathbf{y}_4)) &= \rho_C\left(\begin{bmatrix} -11&-33\\-10&-23 \end{bmatrix}\right) = \begin{bmatrix} -11\\-33\\-10\\-23 \end{bmatrix}
\end{aligned}$$

So the resulting representation is

$$M^T_{C,C} = \begin{bmatrix} -17&11&8&-11 \\ -57&35&24&-33 \\ -14&10&6&-10 \\ -41&25&16&-23 \end{bmatrix}$$

Not quite as pretty.

The purpose of this example is to illustrate Theorem SCB. This theorem says that the two matrix representations, $M^T_{B,B}$ and $M^T_{C,C}$, of the one linear transformation, $T$, are related by a similarity transformation using the change-of-basis matrix $C_{B,C}$. Let us compute this change-of-basis matrix. Notice that since $C$ is such a nice basis, this is fairly straightforward,

$$\begin{aligned}
\rho_C(\mathbf{x}_1) &= \rho_C\left(\begin{bmatrix} 0&1\\0&1 \end{bmatrix}\right) = \rho_C(0\mathbf{y}_1 + 1\mathbf{y}_2 + 0\mathbf{y}_3 + 1\mathbf{y}_4) = \begin{bmatrix} 0\\1\\0\\1 \end{bmatrix} \\
\rho_C(\mathbf{x}_2) &= \rho_C\left(\begin{bmatrix} 1&1\\1&0 \end{bmatrix}\right) = \rho_C(1\mathbf{y}_1 + 1\mathbf{y}_2 + 1\mathbf{y}_3 + 0\mathbf{y}_4) = \begin{bmatrix} 1\\1\\1\\0 \end{bmatrix} \\
\rho_C(\mathbf{x}_3) &= \rho_C\left(\begin{bmatrix} 1&3\\2&3 \end{bmatrix}\right) = \rho_C(1\mathbf{y}_1 + 3\mathbf{y}_2 + 2\mathbf{y}_3 + 3\mathbf{y}_4) = \begin{bmatrix} 1\\3\\2\\3 \end{bmatrix} \\
\rho_C(\mathbf{x}_4) &= \rho_C\left(\begin{bmatrix} 2&6\\1&4 \end{bmatrix}\right) = \rho_C(2\mathbf{y}_1 + 6\mathbf{y}_2 + 1\mathbf{y}_3 + 4\mathbf{y}_4) = \begin{bmatrix} 2\\6\\1\\4 \end{bmatrix}
\end{aligned}$$

So we have,

$$C_{B,C} = \begin{bmatrix} 0&1&1&2 \\ 1&1&3&6 \\ 0&1&2&1 \\ 1&0&3&4 \end{bmatrix}$$

Now, according to Theorem SCB we can write,

$$M^T_{B,B} = C_{B,C}^{-1}\,M^T_{C,C}\,C_{B,C}$$

$$\begin{bmatrix} 2&0&0&0 \\ 0&2&0&0 \\ 0&0&-1&0 \\ 0&0&0&-2 \end{bmatrix} = \begin{bmatrix} 0&1&1&2 \\ 1&1&3&6 \\ 0&1&2&1 \\ 1&0&3&4 \end{bmatrix}^{-1} \begin{bmatrix} -17&11&8&-11 \\ -57&35&24&-33 \\ -14&10&6&-10 \\ -41&25&16&-23 \end{bmatrix} \begin{bmatrix} 0&1&1&2 \\ 1&1&3&6 \\ 0&1&2&1 \\ 1&0&3&4 \end{bmatrix}$$

This should look and feel exactly like the process for diagonalizing a matrix, as was described in Section SD. And it is. △
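The similarity claimed by Theorem SCB is a one-line check in SymPy (an aside, with the two matrices of this example).

```python
from sympy import Matrix, diag

MTCC = Matrix([[-17, 11,  8, -11], [-57, 35, 24, -33],
               [-14, 10,  6, -10], [-41, 25, 16, -23]])
CBC = Matrix([[0, 1, 1, 2], [1, 1, 3, 6], [0, 1, 2, 1], [1, 0, 3, 4]])

# Conjugating M^T_{C,C} by C_{B,C} recovers the diagonal M^T_{B,B}
assert CBC.inv() * MTCC * CBC == diag(2, 2, -1, -2)
```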

We can now return to the question of computing an eigenvalue or eigenvector of a linear transformation. For a linear transformation of the form $T: V \to V$, we know that representations relative to different bases are similar matrices. We also know that similar matrices have equal characteristic polynomials by Theorem SMEE. We will now show that the eigenvalues of a linear transformation $T$ are precisely the eigenvalues of any matrix representation of $T$. Since the choice of a different matrix representation leads to a similar matrix, there will be no "new" eigenvalues obtained from this second representation. Similarly, the change-of-basis matrix can be used to show that eigenvectors obtained from one matrix representation will be precisely those obtained from any other representation. So we can determine the eigenvalues and eigenvectors of a linear transformation by forming one matrix representation, using any basis we please, and analyzing the matrix in the manner of Chapter E.

Theorem EER Eigenvalues, Eigenvectors, Representations
Suppose that $T: V \to V$ is a linear transformation and $B$ is a basis of $V$. Then $\mathbf{v} \in V$ is an eigenvector of $T$ for the eigenvalue $\lambda$ if and only if $\rho_B(\mathbf{v})$ is an eigenvector of $M^T_{B,B}$ for the eigenvalue $\lambda$.

Proof. (⇒) Assume that $\mathbf{v} \in V$ is an eigenvector of $T$ for the eigenvalue $\lambda$. Then

$$\begin{aligned}
M^T_{B,B}\,\rho_B(\mathbf{v}) &= \rho_B(T(\mathbf{v})) && \text{Theorem FTMR} \\
&= \rho_B(\lambda\mathbf{v}) && \text{Definition EELT} \\
&= \lambda\,\rho_B(\mathbf{v}) && \text{Theorem VRLT}
\end{aligned}$$

which by Definition EEM says that $\rho_B(\mathbf{v})$ is an eigenvector of the matrix $M^T_{B,B}$ for the eigenvalue $\lambda$.

(⇐) Assume that $\rho_B(\mathbf{v})$ is an eigenvector of $M^T_{B,B}$ for the eigenvalue $\lambda$. Then

$$\begin{aligned}
T(\mathbf{v}) &= \rho_B^{-1}(\rho_B(T(\mathbf{v}))) && \text{Definition IVLT} \\
&= \rho_B^{-1}\left(M^T_{B,B}\,\rho_B(\mathbf{v})\right) && \text{Theorem FTMR} \\
&= \rho_B^{-1}(\lambda\,\rho_B(\mathbf{v})) && \text{Definition EEM} \\
&= \lambda\,\rho_B^{-1}(\rho_B(\mathbf{v})) && \text{Theorem ILTLT} \\
&= \lambda\mathbf{v} && \text{Definition IVLT}
\end{aligned}$$

which by Definition EELT says $\mathbf{v}$ is an eigenvector of $T$ for the eigenvalue $\lambda$. ∎

Subsection CELT
Computing Eigenvectors of Linear Transformations

Theorem EER tells us that the eigenvalues of a linear transformation are the eigenvalues of any representation, no matter what the choice of the basis $B$ might be. So we could now unambiguously define items such as the characteristic polynomial of a linear transformation, which we would define as the characteristic polynomial of any matrix representation. We will say that again — eigenvalues, eigenvectors, and characteristic polynomials are intrinsic properties of a linear transformation, independent of the choice of a basis used to construct a matrix representation.

As a practical matter, how does one compute the eigenvalues and eigenvectors of a linear transformation of the form $T: V \to V$? Choose a nice basis $B$ for $V$, one where the vector representations of the values of the linear transformation necessary for the matrix representation are easy to compute. Construct the matrix representation relative to this basis, and find the eigenvalues and eigenvectors of this matrix using the techniques of Chapter E. The resulting eigenvalues of the matrix are precisely the eigenvalues of the linear transformation. The eigenvectors of the matrix are column vectors that need to be converted to vectors in $V$ through application of $\rho_B^{-1}$ (this is part of the content of Theorem EER); a computational sketch of this recipe follows.

Now consider the case where the matrix representation of a linear transformation is diagonalizable. The $n$ linearly independent eigenvectors that must exist for the matrix (Theorem DC) can be converted (via $\rho_B^{-1}$) into eigenvectors of the linear transformation. A matrix representation of the linear transformation relative to a basis of eigenvectors will be a diagonal matrix — an especially nice representation! Though we did not know it at the time, the diagonalizations of Section SD were really about finding especially pleasing matrix representations of linear transformations.
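This recipe is exactly what a computer algebra system automates. The SymPy aside below (not part of the text) runs it on the transformation $R$ of Example ELTBP, whose matrix representation relative to the nice basis $B = \{1, x, x^2\}$ we used in an earlier aside.

```python
from sympy import Matrix

M = Matrix([[ 15,  8, -4],
            [-12, -6,  3],
            [ 24, 14, -7]])    # M^R_{B,B} relative to B = {1, x, x^2}

for eigenvalue, multiplicity, vectors in M.eigenvects():
    # rho_B^{-1} turns each column vector v into the polynomial
    # v[0] + v[1]*x + v[2]*x^2, an eigenvector of R (Theorem EER)
    print(eigenvalue, [list(v) for v in vectors])
# eigenvalues 3, 0, -1, agreeing with Example ELTBP up to scalar multiples
```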

Here are some examples.<br />

Example ELTT Eigenvectors of a linear transformation, twice
Consider the linear transformation S : M_{22} → M_{22} defined by

S\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} -b - c - 3d & -14a - 15b - 13c + d \\ 18a + 21b + 19c + 3d & -6a - 7b - 7c - 3d \end{bmatrix}

To find the eigenvalues and eigenvectors of S we will build a matrix representation and analyze the matrix. Since Theorem EER places no restriction on the choice of the basis B, we may as well use a basis that is easy to work with. So set

B = \{x_1,\ x_2,\ x_3,\ x_4\} = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}

Then to build the matrix representation of S relative to B compute,

\rho_B(S(x_1)) = \rho_B\left(\begin{bmatrix} 0 & -14 \\ 18 & -6 \end{bmatrix}\right) = \rho_B(0x_1 + (-14)x_2 + 18x_3 + (-6)x_4) = \begin{bmatrix} 0 \\ -14 \\ 18 \\ -6 \end{bmatrix}

\rho_B(S(x_2)) = \rho_B\left(\begin{bmatrix} -1 & -15 \\ 21 & -7 \end{bmatrix}\right) = \rho_B((-1)x_1 + (-15)x_2 + 21x_3 + (-7)x_4) = \begin{bmatrix} -1 \\ -15 \\ 21 \\ -7 \end{bmatrix}

\rho_B(S(x_3)) = \rho_B\left(\begin{bmatrix} -1 & -13 \\ 19 & -7 \end{bmatrix}\right) = \rho_B((-1)x_1 + (-13)x_2 + 19x_3 + (-7)x_4) = \begin{bmatrix} -1 \\ -13 \\ 19 \\ -7 \end{bmatrix}

\rho_B(S(x_4)) = \rho_B\left(\begin{bmatrix} -3 & 1 \\ 3 & -3 \end{bmatrix}\right) = \rho_B((-3)x_1 + 1x_2 + 3x_3 + (-3)x_4) = \begin{bmatrix} -3 \\ 1 \\ 3 \\ -3 \end{bmatrix}

So by Definition MR we have

M = M^S_{B,B} = \begin{bmatrix} 0 & -1 & -1 & -3 \\ -14 & -15 & -13 & 1 \\ 18 & 21 & 19 & 3 \\ -6 & -7 & -7 & -3 \end{bmatrix}
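As a hedged aside, the representation M is easy to double-check with sympy; the function name S below is just an illustration of the definition above.

    import sympy as sp

    def S(a, b, c, d):
        return (-b - c - 3*d, -14*a - 15*b - 13*c + d,
                18*a + 21*b + 19*c + 3*d, -6*a - 7*b - 7*c - 3*d)

    # For this basis B, rho_B just lists the entries (a, b, c, d), so the
    # columns of M are the values of S on the standard basis matrices.
    E = sp.eye(4)
    M = sp.Matrix([S(*E.row(i)) for i in range(4)]).T
    x = sp.symbols('x')
    print(M)                                   # the matrix displayed above
    print(sp.factor(M.charpoly(x).as_expr()))  # (x - 3)*(x - 2)*(x + 2)**2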


Now compute eigenvalues and eigenvectors of the matrix representation M with the techniques of Section EE. First the characteristic polynomial,

p_M(x) = \det(M - xI_4) = x^4 - x^3 - 10x^2 + 4x + 24 = (x-3)(x-2)(x+2)^2

We could now make statements about the eigenvalues of M, but in light of Theorem EER we can refer to the eigenvalues of S and mildly abuse (or extend) our notation for multiplicities to write

\alpha_S(3) = 1 \qquad \alpha_S(2) = 1 \qquad \alpha_S(-2) = 2

Now compute the eigenvectors of M,

λ = 3:
M - 3I_4 = \begin{bmatrix} -3 & -1 & -1 & -3 \\ -14 & -18 & -13 & 1 \\ 18 & 21 & 16 & 3 \\ -6 & -7 & -7 & -6 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}

E_M(3) = \mathcal{N}(M - 3I_4) = \left\langle\left\{ \begin{bmatrix} -1 \\ 3 \\ -3 \\ 1 \end{bmatrix} \right\}\right\rangle

λ = 2:
M - 2I_4 = \begin{bmatrix} -2 & -1 & -1 & -3 \\ -14 & -17 & -13 & 1 \\ 18 & 21 & 17 & 3 \\ -6 & -7 & -7 & -5 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -4 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}

E_M(2) = \mathcal{N}(M - 2I_4) = \left\langle\left\{ \begin{bmatrix} -2 \\ 4 \\ -3 \\ 1 \end{bmatrix} \right\}\right\rangle

λ = -2:
M - (-2)I_4 = \begin{bmatrix} 2 & -1 & -1 & -3 \\ -14 & -13 & -13 & 1 \\ 18 & 21 & 21 & 3 \\ -6 & -7 & -7 & -1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}

E_M(-2) = \mathcal{N}(M - (-2)I_4) = \left\langle\left\{ \begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle

According to Theorem EER the eigenvectors just listed as basis vectors for the eigenspaces of M are vector representations (relative to B) of eigenvectors for S. So the application of the inverse function ρ_B^{-1} will convert these column vectors into elements of the vector space M_{22} (2 × 2 matrices) that are eigenvectors of S. Since ρ_B is an isomorphism (Theorem VRILT), so is ρ_B^{-1}.


Applying the inverse function will then preserve linear independence and spanning properties, so with a sweeping application of the Coordinatization Principle and some extensions of our previous notation for eigenspaces and geometric multiplicities, we can write,

\rho_B^{-1}\left(\begin{bmatrix} -1 \\ 3 \\ -3 \\ 1 \end{bmatrix}\right) = (-1)x_1 + 3x_2 + (-3)x_3 + 1x_4 = \begin{bmatrix} -1 & 3 \\ -3 & 1 \end{bmatrix}

\rho_B^{-1}\left(\begin{bmatrix} -2 \\ 4 \\ -3 \\ 1 \end{bmatrix}\right) = (-2)x_1 + 4x_2 + (-3)x_3 + 1x_4 = \begin{bmatrix} -2 & 4 \\ -3 & 1 \end{bmatrix}

\rho_B^{-1}\left(\begin{bmatrix} 0 \\ -1 \\ 1 \\ 0 \end{bmatrix}\right) = 0x_1 + (-1)x_2 + 1x_3 + 0x_4 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}

\rho_B^{-1}\left(\begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \end{bmatrix}\right) = 1x_1 + (-1)x_2 + 0x_3 + 1x_4 = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}

So

E_S(3) = \left\langle\left\{ \begin{bmatrix} -1 & 3 \\ -3 & 1 \end{bmatrix} \right\}\right\rangle \qquad
E_S(2) = \left\langle\left\{ \begin{bmatrix} -2 & 4 \\ -3 & 1 \end{bmatrix} \right\}\right\rangle \qquad
E_S(-2) = \left\langle\left\{ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \right\}\right\rangle

with geometric multiplicities given by

\gamma_S(3) = 1 \qquad \gamma_S(2) = 1 \qquad \gamma_S(-2) = 2

Suppose we now decided to build another matrix representation of S, only now relative to a linearly independent set of eigenvectors of S, such as

C = \left\{ \begin{bmatrix} -1 & 3 \\ -3 & 1 \end{bmatrix},\ \begin{bmatrix} -2 & 4 \\ -3 & 1 \end{bmatrix},\ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \right\}

At this point you should have computed enough matrix representations to predict that the result of representing S relative to C will be a diagonal matrix. Computing this representation is an example of how Theorem SCB generalizes the


diagonalizations from Section SD. For the record, here is the diagonal representation,

M^S_{C,C} = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & -2 & 0 \\ 0 & 0 & 0 & -2 \end{bmatrix}

Our interest in this example is not necessarily building nice representations, but instead we want to demonstrate how eigenvalues and eigenvectors are an intrinsic property of a linear transformation, independent of any particular representation. To this end, we will repeat the foregoing, but replace B by another basis. We will make this basis different, but not extremely so,

D = \{y_1,\ y_2,\ y_3,\ y_4\} = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \right\}

Then to build the matrix representation of S relative to D compute,

\rho_D(S(y_1)) = \rho_D\left(\begin{bmatrix} 0 & -14 \\ 18 & -6 \end{bmatrix}\right) = \rho_D(14y_1 + (-32)y_2 + 24y_3 + (-6)y_4) = \begin{bmatrix} 14 \\ -32 \\ 24 \\ -6 \end{bmatrix}

\rho_D(S(y_2)) = \rho_D\left(\begin{bmatrix} -1 & -29 \\ 39 & -13 \end{bmatrix}\right) = \rho_D(28y_1 + (-68)y_2 + 52y_3 + (-13)y_4) = \begin{bmatrix} 28 \\ -68 \\ 52 \\ -13 \end{bmatrix}

\rho_D(S(y_3)) = \rho_D\left(\begin{bmatrix} -2 & -42 \\ 58 & -20 \end{bmatrix}\right) = \rho_D(40y_1 + (-100)y_2 + 78y_3 + (-20)y_4) = \begin{bmatrix} 40 \\ -100 \\ 78 \\ -20 \end{bmatrix}

\rho_D(S(y_4)) = \rho_D\left(\begin{bmatrix} -5 & -41 \\ 61 & -23 \end{bmatrix}\right) = \rho_D(36y_1 + (-102)y_2 + 84y_3 + (-23)y_4) = \begin{bmatrix} 36 \\ -102 \\ 84 \\ -23 \end{bmatrix}

So by Definition MR we have

N = M^S_{D,D} = \begin{bmatrix} 14 & 28 & 40 & 36 \\ -32 & -68 & -100 & -102 \\ 24 & 52 & 78 & 84 \\ -6 & -13 & -20 & -23 \end{bmatrix}
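Since M and N represent the same linear transformation, they are similar (Theorem SCB) and must share a characteristic polynomial. A quick sympy check of this claim, offered as a sketch:

    import sympy as sp

    x = sp.symbols('x')
    M = sp.Matrix([[0, -1, -1, -3], [-14, -15, -13, 1],
                   [18, 21, 19, 3], [-6, -7, -7, -3]])
    N = sp.Matrix([[14, 28, 40, 36], [-32, -68, -100, -102],
                   [24, 52, 78, 84], [-6, -13, -20, -23]])
    print(M.charpoly(x) == N.charpoly(x))   # True, as Theorem SMEE predicts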

Now compute eigenvalues and eigenvectors of the matrix representation N with the techniques of Section EE. First the characteristic polynomial,

p_N(x) = \det(N - xI_4) = x^4 - x^3 - 10x^2 + 4x + 24 = (x-3)(x-2)(x+2)^2

Of course this is not news. We now know that M = M^S_{B,B} and N = M^S_{D,D} are similar matrices (Theorem SCB). But Theorem SMEE told us long ago that similar matrices have identical characteristic polynomials. Now compute eigenvectors for the matrix representation, which will be different than what we found for M,

λ = 3:
N - 3I_4 = \begin{bmatrix} 11 & 28 & 40 & 36 \\ -32 & -71 & -100 & -102 \\ 24 & 52 & 75 & 84 \\ -6 & -13 & -20 & -26 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}

E_N(3) = \mathcal{N}(N - 3I_4) = \left\langle\left\{ \begin{bmatrix} -4 \\ 6 \\ -4 \\ 1 \end{bmatrix} \right\}\right\rangle

λ = 2:
N - 2I_4 = \begin{bmatrix} 12 & 28 & 40 & 36 \\ -32 & -70 & -100 & -102 \\ 24 & 52 & 76 & 84 \\ -6 & -13 & -20 & -25 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 6 \\ 0 & 1 & 0 & -7 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 \end{bmatrix}

E_N(2) = \mathcal{N}(N - 2I_4) = \left\langle\left\{ \begin{bmatrix} -6 \\ 7 \\ -4 \\ 1 \end{bmatrix} \right\}\right\rangle

λ = -2:
N - (-2)I_4 = \begin{bmatrix} 16 & 28 & 40 & 36 \\ -32 & -66 & -100 & -102 \\ 24 & 52 & 80 & 84 \\ -6 & -13 & -20 & -21 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & -1 & -3 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}

E_N(-2) = \mathcal{N}(N - (-2)I_4) = \left\langle\left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 3 \\ -3 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle

Employing Theorem EER we can apply ρ_D^{-1} to each of the basis vectors of the eigenspaces of N to obtain eigenvectors for S that also form bases for eigenspaces of S,

\rho_D^{-1}\left(\begin{bmatrix} -4 \\ 6 \\ -4 \\ 1 \end{bmatrix}\right) = (-4)y_1 + 6y_2 + (-4)y_3 + 1y_4 = \begin{bmatrix} -1 & 3 \\ -3 & 1 \end{bmatrix}

\rho_D^{-1}\left(\begin{bmatrix} -6 \\ 7 \\ -4 \\ 1 \end{bmatrix}\right) = (-6)y_1 + 7y_2 + (-4)y_3 + 1y_4 = \begin{bmatrix} -2 & 4 \\ -3 & 1 \end{bmatrix}

\rho_D^{-1}\left(\begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix}\right) = 1y_1 + (-2)y_2 + 1y_3 + 0y_4 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}

\rho_D^{-1}\left(\begin{bmatrix} 3 \\ -3 \\ 0 \\ 1 \end{bmatrix}\right) = 3y_1 + (-3)y_2 + 0y_3 + 1y_4 = \begin{bmatrix} 1 & -2 \\ 1 & 1 \end{bmatrix}

The eigenspaces for the eigenvalues of algebraic multiplicity 1 are exactly as before,

E_S(3) = \left\langle\left\{ \begin{bmatrix} -1 & 3 \\ -3 & 1 \end{bmatrix} \right\}\right\rangle \qquad
E_S(2) = \left\langle\left\{ \begin{bmatrix} -2 & 4 \\ -3 & 1 \end{bmatrix} \right\}\right\rangle

However, the eigenspace for λ = -2 would at first glance appear to be different. Here are the two eigenspaces for λ = -2, first the eigenspace obtained from M = M^S_{B,B}, then the eigenspace obtained from N = M^S_{D,D}.

E_S(-2) = \left\langle\left\{ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \right\}\right\rangle \qquad
E_S(-2) = \left\langle\left\{ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & -2 \\ 1 & 1 \end{bmatrix} \right\}\right\rangle

Subspaces generally have many bases, and that is the situation here. With a careful proof of set equality, you can show that these two eigenspaces are equal sets. The key observation to make such a proof go is that

\begin{bmatrix} 1 & -2 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}

which will establish that the second set is a subset of the first. With equal dimensions, Theorem EDYES will finish the task.

So the eigenvalues of a linear transformation are independent of the matrix representation employed to compute them! △
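The set-equality argument can also be spot-checked computationally. In this sketch each 2 × 2 eigenvector is flattened to a row vector; two bases span the same subspace exactly when the stacked rows have the same reduced row-echelon form.

    import sympy as sp

    first  = sp.Matrix([[0, -1, 1, 0], [1, -1, 0, 1]])   # basis obtained via B
    second = sp.Matrix([[0, -1, 1, 0], [1, -2, 1, 1]])   # basis obtained via D
    print(first.rref()[0] == second.rref()[0])           # True: equal row spaces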


Another example, this time a bit larger and with complex eigenvalues.

Example CELT Complex eigenvectors of a linear transformation
Consider the linear transformation Q : P_4 → P_4 defined by

Q(a + bx + cx^2 + dx^3 + ex^4)
  = (-46a - 22b + 13c + 5d + e) + (117a + 57b - 32c - 15d - 4e)x
  + (-69a - 29b + 21c - 7e)x^2 + (159a + 73b - 44c - 13d + 2e)x^3
  + (-195a - 87b + 55c + 10d - 13e)x^4

Choose a simple basis to compute with, say

B = \{1,\ x,\ x^2,\ x^3,\ x^4\}

Then it should be apparent that the matrix representation of Q relative to B is

M = M^Q_{B,B} = \begin{bmatrix} -46 & -22 & 13 & 5 & 1 \\ 117 & 57 & -32 & -15 & -4 \\ -69 & -29 & 21 & 0 & -7 \\ 159 & 73 & -44 & -13 & 2 \\ -195 & -87 & 55 & 10 & -13 \end{bmatrix}

Compute the characteristic polynomial, eigenvalues and eigenvectors according to the techniques of Section EE,

p_Q(x) = -x^5 + 6x^4 - x^3 - 88x^2 + 252x - 208
       = -(x-2)^2(x+4)(x^2 - 6x + 13)
       = -(x-2)^2(x+4)(x - (3+2i))(x - (3-2i))

\alpha_Q(2) = 2 \qquad \alpha_Q(-4) = 1 \qquad \alpha_Q(3+2i) = 1 \qquad \alpha_Q(3-2i) = 1

λ = 2:
M - 2I_5 = \begin{bmatrix} -48 & -22 & 13 & 5 & 1 \\ 117 & 55 & -32 & -15 & -4 \\ -69 & -29 & 19 & 0 & -7 \\ 159 & 73 & -44 & -15 & 2 \\ -195 & -87 & 55 & 10 & -15 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & \frac{1}{2} & -\frac{1}{2} \\ 0 & 1 & 0 & -\frac{5}{2} & -\frac{5}{2} \\ 0 & 0 & 1 & -2 & -6 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

E_M(2) = \mathcal{N}(M - 2I_5) = \left\langle\left\{ \begin{bmatrix} -\frac{1}{2} \\ \frac{5}{2} \\ 2 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} \frac{1}{2} \\ \frac{5}{2} \\ 6 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -1 \\ 5 \\ 4 \\ 2 \\ 0 \end{bmatrix},\ \begin{bmatrix} 1 \\ 5 \\ 12 \\ 0 \\ 2 \end{bmatrix} \right\}\right\rangle

λ = -4:
M - (-4)I_5 = \begin{bmatrix} -42 & -22 & 13 & 5 & 1 \\ 117 & 61 & -32 & -15 & -4 \\ -69 & -29 & 25 & 0 & -7 \\ 159 & 73 & -44 & -9 & 2 \\ -195 & -87 & 55 & 10 & -9 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & -3 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

E_M(-4) = \mathcal{N}(M - (-4)I_5) = \left\langle\left\{ \begin{bmatrix} -1 \\ 3 \\ 1 \\ 2 \\ 1 \end{bmatrix} \right\}\right\rangle

λ = 3 + 2i:
M - (3+2i)I_5 = \begin{bmatrix} -49-2i & -22 & 13 & 5 & 1 \\ 117 & 54-2i & -32 & -15 & -4 \\ -69 & -29 & 18-2i & 0 & -7 \\ 159 & 73 & -44 & -16-2i & 2 \\ -195 & -87 & 55 & 10 & -16-2i \end{bmatrix}
\xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -\frac{3}{4}+\frac{i}{4} \\ 0 & 1 & 0 & 0 & \frac{7}{4}-\frac{i}{4} \\ 0 & 0 & 1 & 0 & -\frac{1}{2}+\frac{i}{2} \\ 0 & 0 & 0 & 1 & \frac{7}{4}-\frac{i}{4} \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

E_M(3+2i) = \mathcal{N}(M - (3+2i)I_5) = \left\langle\left\{ \begin{bmatrix} \frac{3}{4}-\frac{i}{4} \\ -\frac{7}{4}+\frac{i}{4} \\ \frac{1}{2}-\frac{i}{2} \\ -\frac{7}{4}+\frac{i}{4} \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} 3-i \\ -7+i \\ 2-2i \\ -7+i \\ 4 \end{bmatrix} \right\}\right\rangle

λ = 3 - 2i:
M - (3-2i)I_5 = \begin{bmatrix} -49+2i & -22 & 13 & 5 & 1 \\ 117 & 54+2i & -32 & -15 & -4 \\ -69 & -29 & 18+2i & 0 & -7 \\ 159 & 73 & -44 & -16+2i & 2 \\ -195 & -87 & 55 & 10 & -16+2i \end{bmatrix}
\xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -\frac{3}{4}-\frac{i}{4} \\ 0 & 1 & 0 & 0 & \frac{7}{4}+\frac{i}{4} \\ 0 & 0 & 1 & 0 & -\frac{1}{2}-\frac{i}{2} \\ 0 & 0 & 0 & 1 & \frac{7}{4}+\frac{i}{4} \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}

E_M(3-2i) = \mathcal{N}(M - (3-2i)I_5) = \left\langle\left\{ \begin{bmatrix} \frac{3}{4}+\frac{i}{4} \\ -\frac{7}{4}-\frac{i}{4} \\ \frac{1}{2}+\frac{i}{2} \\ -\frac{7}{4}-\frac{i}{4} \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} 3+i \\ -7-i \\ 2+2i \\ -7-i \\ 4 \end{bmatrix} \right\}\right\rangle

It is straightforward to convert each of these basis vectors for eigenspaces of M back to elements of P_4 by applying the isomorphism ρ_B^{-1},

\rho_B^{-1}\left(\begin{bmatrix} -1 \\ 5 \\ 4 \\ 2 \\ 0 \end{bmatrix}\right) = -1 + 5x + 4x^2 + 2x^3

\rho_B^{-1}\left(\begin{bmatrix} 1 \\ 5 \\ 12 \\ 0 \\ 2 \end{bmatrix}\right) = 1 + 5x + 12x^2 + 2x^4

\rho_B^{-1}\left(\begin{bmatrix} -1 \\ 3 \\ 1 \\ 2 \\ 1 \end{bmatrix}\right) = -1 + 3x + x^2 + 2x^3 + x^4

\rho_B^{-1}\left(\begin{bmatrix} 3-i \\ -7+i \\ 2-2i \\ -7+i \\ 4 \end{bmatrix}\right) = (3-i) + (-7+i)x + (2-2i)x^2 + (-7+i)x^3 + 4x^4

\rho_B^{-1}\left(\begin{bmatrix} 3+i \\ -7-i \\ 2+2i \\ -7-i \\ 4 \end{bmatrix}\right) = (3+i) + (-7-i)x + (2+2i)x^2 + (-7-i)x^3 + 4x^4

So we apply Theorem EER and the Coordinatization Principle to get the eigenspaces for Q,

E_Q(2) = \left\langle\left\{ -1 + 5x + 4x^2 + 2x^3,\ 1 + 5x + 12x^2 + 2x^4 \right\}\right\rangle
E_Q(-4) = \left\langle\left\{ -1 + 3x + x^2 + 2x^3 + x^4 \right\}\right\rangle
E_Q(3+2i) = \left\langle\left\{ (3-i) + (-7+i)x + (2-2i)x^2 + (-7+i)x^3 + 4x^4 \right\}\right\rangle
E_Q(3-2i) = \left\langle\left\{ (3+i) + (-7-i)x + (2+2i)x^2 + (-7-i)x^3 + 4x^4 \right\}\right\rangle

with geometric multiplicities

\gamma_Q(2) = 2 \qquad \gamma_Q(-4) = 1 \qquad \gamma_Q(3+2i) = 1 \qquad \gamma_Q(3-2i) = 1 △
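The eigenvalue computation in this example is easy to reproduce exactly with sympy (a sketch; sympy keeps the complex pair 3 ± 2i exact rather than approximating):

    import sympy as sp

    M = sp.Matrix([[-46, -22, 13, 5, 1], [117, 57, -32, -15, -4],
                   [-69, -29, 21, 0, -7], [159, 73, -44, -13, 2],
                   [-195, -87, 55, 10, -13]])
    # Eigenvalues with algebraic multiplicities (dictionary order may vary):
    print(M.eigenvals())   # {2: 2, -4: 1, 3 + 2*I: 1, 3 - 2*I: 1}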

Reading Questions

1. The change-of-basis matrix is a matrix representation of which linear transformation?

2. Find the change-of-basis matrix, C_{B,C}, for the two bases of C^2

B = \left\{ \begin{bmatrix} 2 \\ 3 \end{bmatrix},\ \begin{bmatrix} -1 \\ 2 \end{bmatrix} \right\} \qquad C = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}

3. What is the third "surprise," and why is it surprising?

Exercises

C20 In Example CBCV we computed the vector representation of y relative to C, ρ_C(y), as an example of Theorem CB. Compute this same representation directly. In other words, apply Definition VR rather than Theorem CB.

C21† Perform a check on Example MRCM by computing M^Q_{B,D} directly. In other words, apply Definition MR rather than Theorem MRCB.

C30† Find a basis for the vector space P_3 composed of eigenvectors of the linear transformation T. Then find a matrix representation of T relative to this basis.

T : P_3 → P_3,  T(a + bx + cx^2 + dx^3) = (a+c+d) + (b+c+d)x + (a+b+c)x^2 + (a+b+d)x^3

C40† Let S_{22} be the vector space of 2 × 2 symmetric matrices. Find a basis C for S_{22} that yields a diagonal matrix representation of the linear transformation R.

R : S_{22} → S_{22},  R\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = \begin{bmatrix} -5a + 2b - 3c & -12a + 5b - 6c \\ -12a + 5b - 6c & 6a - 2b + 4c \end{bmatrix}

C41† Let S_{22} be the vector space of 2 × 2 symmetric matrices. Find a basis for S_{22} composed of eigenvectors of the linear transformation Q : S_{22} → S_{22}.

Q\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = \begin{bmatrix} 25a + 18b + 30c & -16a - 11b - 20c \\ -16a - 11b - 20c & -11a - 9b - 12c \end{bmatrix}

T10† Suppose that T : V → V is an invertible linear transformation with a nonzero eigenvalue λ. Prove that 1/λ is an eigenvalue of T^{-1}.

T15 Suppose that V is a vector space and T : V → V is a linear transformation. Prove that T is injective if and only if λ = 0 is not an eigenvalue of T.


Section OD
Orthonormal Diagonalization

We have seen in Section SD that under the right conditions a square matrix is similar to a diagonal matrix. We recognize now, via Theorem SCB, that a similarity transformation is a change of basis on a matrix representation. So we can now discuss the choice of a basis used to build a matrix representation, and decide if some bases are better than others for this purpose. This will be the tone of this section. We will also see that every matrix has a reasonably useful matrix representation, and we will discover a new class of diagonalizable linear transformations. First we need some basic facts about triangular matrices.

Subsection TM
Triangular Matrices

An upper, or lower, triangular matrix is exactly what it sounds like it should be, but here are the two relevant definitions.

Definition UTM Upper Triangular Matrix
The n × n square matrix A is upper triangular if [A]_{ij} = 0 whenever i > j. □

Definition LTM Lower Triangular Matrix
The n × n square matrix A is lower triangular if [A]_{ij} = 0 whenever i < j. □
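Both definitions translate directly into executable predicates; a small sketch with illustrative names:

    import sympy as sp

    def is_upper_triangular(A):
        n = A.rows
        return all(A[i, j] == 0 for i in range(n) for j in range(n) if i > j)

    def is_lower_triangular(A):
        n = A.rows
        return all(A[i, j] == 0 for i in range(n) for j in range(n) if i < j)

    A = sp.Matrix([[1, 2], [0, 3]])
    print(is_upper_triangular(A), is_lower_triangular(A))   # True False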


For two lower triangular matrices A and B, and for i < j, the (i, j)-entry of the product is

\begin{aligned}
[AB]_{ij} &= \sum_{k=1}^{n}[A]_{ik}[B]_{kj}\\
&= \sum_{k=1}^{j-1}[A]_{ik}[B]_{kj} + \sum_{k=j}^{n}[A]_{ik}[B]_{kj}\\
&= \sum_{k=1}^{j-1}[A]_{ik}\,0 + \sum_{k=j}^{n}0\,[B]_{kj}\\
&= 0
\end{aligned}

since [B]_{kj} = 0 for k < j and [A]_{ik} = 0 for i < k. So [AB]_{ij} = 0 whenever i < j, and by Definition LTM the product AB is lower triangular; this is the lower triangular half of Theorem PTMT (Exercise OD.T30 treats the upper triangular case). ∎

What happens in columns n + 1 through 2n of M′? These columns began in M as the identity matrix, and in M′ each diagonal entry was scaled to a reciprocal of the corresponding diagonal entry of A. Notice that trivially, these final n columns of M′ form a lower triangular matrix. Just as we argued for the first n columns, the row operations that convert M_{j−1} into M_j will preserve the lower triangular form in the final n columns and preserve the exact values of the diagonal entries. By Theorem CINM, the final n columns of M_n form the inverse of A, and this matrix has the necessary properties advertised in the conclusion of this theorem. ∎

Subsection UTMR
Upper Triangular Matrix Representation

Not every matrix is diagonalizable, but every linear transformation has a matrix representation that is an upper triangular matrix, and the basis that achieves this representation is especially pleasing. Here is the theorem.

Theorem UTMR Upper Triangular Matrix Representation
Suppose that T : V → V is a linear transformation. Then there is a basis B for V such that the matrix representation of T relative to B, M^T_{B,B}, is an upper triangular matrix. Each diagonal entry is an eigenvalue of T, and if λ is an eigenvalue of T, then λ occurs α_T(λ) times on the diagonal.

Proof. We begin with a proof by induction (Proof Technique I) of the first statement in the conclusion of the theorem. We use induction on the dimension of V to show that if T : V → V is a linear transformation, then there is a basis B for V such that the matrix representation of T relative to B, M^T_{B,B}, is an upper triangular matrix.

To start suppose that dim(V) = 1. Choose any nonzero vector v ∈ V and realize that V = ⟨{v}⟩. Then T(v) = βv for some β ∈ C, which determines T uniquely (Theorem LTDB). This description of T also gives us a matrix representation relative to the basis B = {v} as the 1 × 1 matrix with lone entry equal to β. And this matrix representation is upper triangular (Definition UTM).

For the induction step let dim(V) = m, and assume the theorem is true for every linear transformation defined on a vector space of dimension less than m. By Theorem EMHE (suitably converted to the setting of a linear transformation), T has at least one eigenvalue, and we denote this eigenvalue as λ. (We will remark later about how critical this step is.) We now consider properties of the linear transformation T − λI_V : V → V.

Let x be an eigenvector of T for λ. By definition x ≠ 0. Then

\begin{aligned}
(T - \lambda I_V)(x) &= T(x) - \lambda I_V(x) && \text{Theorem VSLT}\\
&= T(x) - \lambda x && \text{Definition IDLT}\\
&= \lambda x - \lambda x && \text{Definition EELT}\\
&= 0 && \text{Property AI}
\end{aligned}


So T − λI_V is not injective, as it has a nontrivial kernel (Theorem KILT). With an application of Theorem RPNDD we bound the rank of T − λI_V,

r(T - \lambda I_V) = \dim(V) - n(T - \lambda I_V) \le m - 1

Let W be the subspace of V that is the range of T − λI_V, W = R(T − λI_V), and define k = dim(W) ≤ m − 1. We define a new linear transformation S, on W,

S : W → W,  S(w) = T(w)

This does not look like we have accomplished much, since the action of S is identical to the action of T. For our purposes this will be a good thing. What is different is the domain and codomain. S is defined on W, a vector space with dimension less than m, and so is susceptible to our induction hypothesis. Verifying that S is really a linear transformation is almost entirely routine, with one exception. Employing T in our definition of S raises the possibility that the outputs of S will not be contained within W (but instead will lie inside V, but outside W). To examine this possibility, suppose that w ∈ W.

\begin{aligned}
S(w) &= T(w)\\
&= T(w) + 0 && \text{Property Z}\\
&= T(w) + (\lambda I_V(w) - \lambda I_V(w)) && \text{Property AI}\\
&= (T(w) - \lambda I_V(w)) + \lambda I_V(w) && \text{Property AA}\\
&= (T(w) - \lambda I_V(w)) + \lambda w && \text{Definition IDLT}\\
&= (T - \lambda I_V)(w) + \lambda w && \text{Theorem VSLT}
\end{aligned}

Since W is the range of T − λI_V, (T − λI_V)(w) ∈ W. And by Property SC, λw ∈ W. Finally, applying Property AC we see by closure that the sum is in W and so we conclude that S(w) ∈ W. This argument convinces us that it is legitimate to define S as we did with W as the codomain.

S is a linear transformation defined on a vector space with dimension k, less than m, so we can apply the induction hypothesis and conclude that W has a basis, C = {w_1, w_2, w_3, ..., w_k}, such that the matrix representation of S relative to C is an upper triangular matrix.

Beginning with the linearly independent set C, repeatedly apply Theorem ELIS to add vectors to C, maintaining a linearly independent set and spanning ever larger subspaces of V. This process will end with the addition of m − k vectors, which together with C will span all of V. Denote these vectors as D = {u_1, u_2, u_3, ..., u_{m−k}}. Then B = C ∪ D is a basis for V, and is the basis we desire for the conclusion of the theorem. So we now consider the matrix representation of T relative to B.

Since the definitions of T and S agree on W, the first k columns of M^T_{B,B} will have the upper triangular matrix representation of S in the first k rows. The remaining m − k rows of these first k columns will be all zeros since the outputs of T for basis vectors from C are all contained in W and hence are linear combinations of the basis


vectors in C. The situation for T on the basis vectors in D is not quite as pretty, but it is close.

For 1 ≤ i ≤ m − k, consider

\begin{aligned}
\rho_B(T(u_i)) &= \rho_B(T(u_i) + 0) && \text{Property Z}\\
&= \rho_B(T(u_i) + (\lambda I_V(u_i) - \lambda I_V(u_i))) && \text{Property AI}\\
&= \rho_B((T(u_i) - \lambda I_V(u_i)) + \lambda I_V(u_i)) && \text{Property AA}\\
&= \rho_B((T(u_i) - \lambda I_V(u_i)) + \lambda u_i) && \text{Definition IDLT}\\
&= \rho_B((T - \lambda I_V)(u_i) + \lambda u_i) && \text{Theorem VSLT}\\
&= \rho_B(a_1w_1 + a_2w_2 + a_3w_3 + \cdots + a_kw_k + \lambda u_i) && \text{Definition RLT}\\
&= \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_k \\ 0 \\ \vdots \\ 0 \\ \lambda \\ 0 \\ \vdots \\ 0 \end{bmatrix} && \text{Definition VR}
\end{aligned}

In the penultimate equality, we have rewritten an element of the range of T − λI_V as a linear combination of the basis vectors, C, for the range of T − λI_V, W, using the scalars a_1, a_2, a_3, ..., a_k. If we incorporate these m − k column vectors into the matrix representation M^T_{B,B} we find m − k occurrences of λ on the diagonal, and any nonzero entries lying only in the first k rows. Together with the k × k upper triangular representation in the upper left-hand corner, the entire matrix representation for T is clearly upper triangular. This completes the induction step. So for any linear transformation there is a basis that creates an upper triangular matrix representation.

We have one more statement in the conclusion of the theorem to verify. The eigenvalues of T, and their multiplicities, can be computed with the techniques of Chapter E relative to any matrix representation (Theorem EER). We take this approach with our upper triangular matrix representation M^T_{B,B}. Let d_i be the diagonal entry of M^T_{B,B} in row i and column i. Then the characteristic polynomial, computed as a determinant (Definition CP) with repeated expansions about the first column, is

p_{M^T_{B,B}}(x) = (d_1 - x)(d_2 - x)(d_3 - x)\cdots(d_m - x)


The roots of the polynomial equation p_{M^T_{B,B}}(x) = 0 are the eigenvalues of the linear transformation (Theorem EMRCP). So each diagonal entry is an eigenvalue, and is repeated on the diagonal exactly α_T(λ) times (Definition AME). ∎

A key step in this proof was the construction of the subspace W with dimension strictly less than that of V. This required an eigenvalue/eigenvector pair, which was guaranteed to us by Theorem EMHE. Digging deeper, the proof of Theorem EMHE requires that we can factor polynomials completely, into linear factors. This will not always happen if our set of scalars is the reals, R. So this is our final explanation of our choice of the complex numbers, C, as our set of scalars. In C polynomials factor completely, so every matrix has at least one eigenvalue, and an inductive argument will get us to upper triangular matrix representations.

In the case of linear transformations defined on C^m, we can use the inner product (Definition IP) profitably to fine-tune the basis that yields an upper triangular matrix representation. Recall that the adjoint of the matrix A (Definition A) is written as A^*.

Theorem OBUTR Orthonormal Basis for Upper Triangular Representation
Suppose that A is a square matrix. Then there is a unitary matrix U, and an upper triangular matrix T, such that

U^*AU = T

and T has the eigenvalues of A as the entries of the diagonal.

Proof. This theorem is a statement about matrices and similarity. We can convert it to a statement about linear transformations, matrix representations and bases (Theorem SCB). Suppose that A is an n × n matrix, and define the linear transformation S : C^n → C^n by S(x) = Ax. Then Theorem UTMR gives us a basis B = {v_1, v_2, v_3, ..., v_n} for C^n such that a matrix representation of S relative to B, M^S_{B,B}, is upper triangular.

Now convert the basis B into an orthogonal basis, C, by an application of the Gram-Schmidt procedure (Theorem GSP). This is a messy business computationally, but here we have an excellent illustration of the power of the Gram-Schmidt procedure. We need only be sure that B is linearly independent and spans C^n, and then we know that C is linearly independent, spans C^n and is also an orthogonal set. We will now consider the matrix representation of S relative to C (rather than B). Write the new basis as C = {y_1, y_2, y_3, ..., y_n}. The application of the Gram-Schmidt procedure creates each vector of C, say y_j, as the difference of v_j and a linear combination of y_1, y_2, y_3, ..., y_{j−1}. We are not concerned here with the actual values of the scalars in this linear combination, so we will write

y_j = v_j - \sum_{k=1}^{j-1} b_{jk}y_k


where the b_{jk} are shorthand for the scalars. The equation above is in a form useful for creating the basis C from B. To better understand the relationship between B and C convert it to read

v_j = y_j + \sum_{k=1}^{j-1} b_{jk}y_k

In this form, we recognize that the change-of-basis matrix C_{B,C} = M^{I_{C^n}}_{B,C} (Definition CBM) is an upper triangular matrix. By Theorem SCB we have

M^S_{C,C} = C_{B,C}M^S_{B,B}C_{B,C}^{-1}

The inverse of an upper triangular matrix is upper triangular (Theorem ITMT), and the product of two upper triangular matrices is again upper triangular (Theorem PTMT). So M^S_{C,C} is an upper triangular matrix.

Now, multiply each vector of C by a nonzero scalar, so that the result has norm 1. In this way we create a new basis D which is an orthonormal set (Definition ONS). Note that the change-of-basis matrix C_{C,D} is a diagonal matrix with nonzero entries equal to the norms of the vectors in C.

Now we can convert our results into the language of matrices. Let E be the basis of C^n formed with the standard unit vectors (Definition SUV). Then the matrix representation of S relative to E is simply A, A = M^S_{E,E}. The change-of-basis matrix C_{D,E} has columns that are simply the vectors in D, the orthonormal basis. As such, Theorem CUMOS tells us that C_{D,E} is a unitary matrix, and by Definition UM has an inverse equal to its adjoint. Write U = C_{D,E}. We have

\begin{aligned}
U^*AU &= U^{-1}AU && \text{Theorem UMI}\\
&= C_{D,E}^{-1}M^S_{E,E}C_{D,E}\\
&= M^S_{D,D} && \text{Theorem SCB}\\
&= C_{C,D}M^S_{C,C}C_{C,D}^{-1} && \text{Theorem SCB}
\end{aligned}

The inverse of a diagonal matrix is also a diagonal matrix, and so this final expression is the product of three upper triangular matrices, and so is again upper triangular (Theorem PTMT). Thus the desired upper triangular matrix, T, is the matrix representation of S relative to the orthonormal basis D, M^S_{D,D}. ∎
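Numerically, Theorem OBUTR is the (complex) Schur decomposition. A hedged illustration with scipy, on a randomly generated test matrix:

    import numpy as np
    from scipy.linalg import schur

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    T, U = schur(A, output='complex')        # A = U @ T @ U.conj().T
    print(np.allclose(U.conj().T @ A @ U, T))                  # True
    print(np.allclose(np.sort_complex(np.diag(T)),
                      np.sort_complex(np.linalg.eigvals(A))))  # True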

Subsection NM
Normal Matrices

Normal matrices comprise a broad class of interesting matrices, many of which we have met already. But they are most interesting since they define exactly which matrices we can diagonalize via a unitary matrix. This is the upcoming Theorem OD. Here is the definition.
OD. Here is the def<strong>in</strong>ition.


Definition NRML Normal Matrix
The square matrix A is normal if A^*A = AA^*. □

So a normal matrix commutes with its adjoint. Part of the beauty of this definition is that it includes many other types of matrices. A diagonal matrix will commute with its adjoint, since the adjoint is again diagonal and the entries are just conjugates of the entries of the original diagonal matrix. A Hermitian (self-adjoint) matrix (Definition HM) will trivially commute with its adjoint, since the two matrices are the same. A real, symmetric matrix is Hermitian, so these matrices are also normal. A unitary matrix (Definition UM) has its adjoint as its inverse, and inverses commute (Theorem OSIS), so unitary matrices are normal. Another class of normal matrices is the skew-symmetric matrices. However, these broad descriptions still do not capture all of the normal matrices, as the next example shows.

Example ANM A normal matrix
Let

A = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}

Then

\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}

so we see by Definition NRML that A is normal. However, A is not symmetric (hence, as a real matrix, not Hermitian), not unitary, and not skew-symmetric. △
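Definition NRML is a one-line check in sympy, where .H is the adjoint; a sketch with one normal and one non-normal matrix:

    import sympy as sp

    A = sp.Matrix([[1, -1], [1, 1]])        # the matrix of Example ANM
    print(A.H * A == A * A.H)               # True: A is normal
    B = sp.Matrix([[1, 1], [0, 1]])
    print(B.H * B == B * B.H)               # False: not every matrix is normal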

Subsection OD
Orthonormal Diagonalization

A diagonal matrix is very easy to work with in matrix multiplication (Example HPDM) and an orthonormal basis also has many advantages (Theorem COB). How about converting a matrix to a diagonal matrix through a similarity transformation using a unitary matrix (i.e. build a diagonal matrix representation with an orthonormal matrix)? That'd be fantastic! When can we do this? We can always accomplish this feat when the matrix is normal, and normal matrices are the only ones that behave this way. Here is the theorem.

Theorem OD Orthonormal Diagonalization
Suppose that A is a square matrix. Then there is a unitary matrix U and a diagonal matrix D, with diagonal entries equal to the eigenvalues of A, such that U^*AU = D if and only if A is a normal matrix.

Proof. (⇒) Suppose there is a unitary matrix U that diagonalizes A. We would usually write this condition as U^*AU = D, but we will find it convenient in this part of the proof to use our hypothesis in the equivalent form, A = UDU^*. Recall that a


diagonal matrix is normal, and notice that this observation is at the center of the next sequence of equalities. We check the normality of A,

\begin{aligned}
A^*A &= (UDU^*)^*(UDU^*) && \text{Hypothesis}\\
&= (U^*)^*D^*U^*UDU^* && \text{Theorem MMAD}\\
&= UD^*U^*UDU^* && \text{Theorem AA}\\
&= UD^*I_nDU^* && \text{Definition UM}\\
&= UD^*DU^* && \text{Theorem MMIM}\\
&= UDD^*U^* && \text{Definition NRML}\\
&= UDI_nD^*U^* && \text{Theorem MMIM}\\
&= UDU^*UD^*U^* && \text{Definition UM}\\
&= UDU^*(U^*)^*D^*U^* && \text{Theorem AA}\\
&= (UDU^*)(UDU^*)^* && \text{Theorem MMAD}\\
&= AA^* && \text{Hypothesis}
\end{aligned}

So by Definition NRML, A is a normal matrix.

(⇐) For the converse, suppose that A is a normal matrix. Whether or not A is normal, Theorem OBUTR provides a unitary matrix U and an upper triangular matrix T, whose diagonal entries are the eigenvalues of A, and such that U^*AU = T. With the added condition that A is normal, we will determine that the entries of T above the diagonal must be all zero. Here we go.

First notice that Definition UM implies that the inverse of a unitary matrix U is the adjoint, U^*, so the product of these two matrices, in either order, is the identity matrix (Theorem OSIS). We begin by showing that T is normal.

\begin{aligned}
T^*T &= (U^*AU)^*(U^*AU) && \text{Theorem OBUTR}\\
&= U^*A^*(U^*)^*U^*AU && \text{Theorem MMAD}\\
&= U^*A^*UU^*AU && \text{Theorem AA}\\
&= U^*A^*I_nAU && \text{Definition UM}\\
&= U^*A^*AU && \text{Theorem MMIM}\\
&= U^*AA^*U && \text{Definition NRML}\\
&= U^*AI_nA^*U && \text{Theorem MMIM}\\
&= U^*AUU^*A^*U && \text{Definition UM}\\
&= U^*AUU^*A^*(U^*)^* && \text{Theorem AA}\\
&= (U^*AU)(U^*AU)^* && \text{Theorem MMAD}\\
&= TT^* && \text{Theorem OBUTR}
\end{aligned}

So by Definition NRML, T is a normal matrix.


We can translate the normality of T into the statement TT^* − T^*T = O. We now establish an equality we will use repeatedly. For 1 ≤ i ≤ n,

\begin{aligned}
0 &= [O]_{ii} && \text{Definition ZM}\\
&= [TT^* - T^*T]_{ii} && \text{Definition NRML}\\
&= [TT^*]_{ii} - [T^*T]_{ii} && \text{Definition MA}\\
&= \sum_{k=1}^{n}[T]_{ik}[T^*]_{ki} - \sum_{k=1}^{n}[T^*]_{ik}[T]_{ki} && \text{Theorem EMP}\\
&= \sum_{k=1}^{n}[T]_{ik}\overline{[T]_{ik}} - \sum_{k=1}^{n}\overline{[T]_{ki}}[T]_{ki} && \text{Definition A}\\
&= \sum_{k=i}^{n}[T]_{ik}\overline{[T]_{ik}} - \sum_{k=1}^{i}\overline{[T]_{ki}}[T]_{ki} && \text{Definition UTM}\\
&= \sum_{k=i}^{n}|[T]_{ik}|^2 - \sum_{k=1}^{i}|[T]_{ki}|^2 && \text{Definition MCN}
\end{aligned}

To conclude, we use the above equality repeatedly, beginning with i = 1, and discover, row by row, that the entries above the diagonal of T are all zero. The key observation is that a sum of squares can only equal zero when each term of the sum is zero. For i = 1 we have

0 = \sum_{k=1}^{n}|[T]_{1k}|^2 - \sum_{k=1}^{1}|[T]_{k1}|^2 = \sum_{k=2}^{n}|[T]_{1k}|^2

which forces the conclusions

[T]_{12} = 0 \qquad [T]_{13} = 0 \qquad [T]_{14} = 0 \qquad \cdots \qquad [T]_{1n} = 0

For i = 2 we use the same equality, but also incorporate the portion of the above conclusions that says [T]_{12} = 0,

0 = \sum_{k=2}^{n}|[T]_{2k}|^2 - \sum_{k=1}^{2}|[T]_{k2}|^2 = \sum_{k=2}^{n}|[T]_{2k}|^2 - \sum_{k=2}^{2}|[T]_{k2}|^2 = \sum_{k=3}^{n}|[T]_{2k}|^2

which forces the conclusions

[T]_{23} = 0 \qquad [T]_{24} = 0 \qquad [T]_{25} = 0 \qquad \cdots \qquad [T]_{2n} = 0

We can repeat this process for the subsequent values of i = 3, 4, 5, ..., n − 1. Notice that it is critical we do this in order, since we need to employ portions of each of the previous conclusions about rows having zero entries in order to successfully get the same conclusion for later rows. Eventually, we conclude that all of the nondiagonal entries of T are zero, so the extra assumption of normality forces T to be diagonal. ∎
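A hedged numerical illustration of the theorem: for the normal matrix of Example ANM, the Schur triangularization promised by Theorem OBUTR comes out diagonal, with the eigenvalues 1 ± i on the diagonal.

    import numpy as np
    from scipy.linalg import schur

    A = np.array([[1.0, -1.0], [1.0, 1.0]])         # normal, from Example ANM
    T, U = schur(A, output='complex')
    print(np.round(T, 10))                          # diagonal, entries 1 ± i
    print(np.allclose(T - np.diag(np.diag(T)), 0))  # True: off-diagonal is zero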


We can rearrange the conclusion of this theorem to read A = UDU^*. Recall that a unitary matrix can be viewed as a geometry-preserving transformation (isometry), or more loosely as a rotation of sorts. Then a matrix-vector product, Ax, can be viewed instead as a sequence of three transformations. U^* is unitary, and so is a rotation. Since D is diagonal, it just multiplies each entry of a vector by a scalar. Diagonal entries that are positive or negative, with absolute values bigger or smaller than 1, evoke descriptions like reflection, expansion and contraction. Generally we can say that D "stretches" a vector in each component. Final multiplication by U undoes (inverts) the rotation performed by U^*. So a normal matrix is a rotation-stretch-rotation transformation.

The orthonormal basis formed from the columns of U can be viewed as a system of mutually perpendicular axes. The rotation by U^* allows the transformation by A to be replaced by the simple transformation D along these axes, and then U brings the result back to the original coordinate system. For this reason Theorem OD is known as the Principal Axis Theorem.

The columns of the unitary matrix in Theorem OD create an especially nice basis for use with the normal matrix. We record this observation as a theorem.

Theorem OBNM Orthonormal Bases and Normal Matrices
Suppose that A is a normal matrix of size n. Then there is an orthonormal basis of C^n composed of eigenvectors of A.

Proof. Let U be the unitary matrix promised by Theorem OD and let D be the resulting diagonal matrix. The desired set of vectors is formed by collecting the columns of U into a set. Theorem CUMOS says this set of columns is orthonormal. Since U is nonsingular (Theorem UMI), Theorem CNMB says the set is a basis.

Since A is diagonalized by U, the diagonal entries of the matrix D are the eigenvalues of A. An argument exactly like the second half of the proof of Theorem DC shows that each vector of the basis is an eigenvector of A. ∎

In a vague way Theorem OBNM is an improvement on Theorem HMOE which said that eigenvectors of a Hermitian matrix for different eigenvalues are always orthogonal. Hermitian matrices are normal and we see that we can find at least one basis where every pair of eigenvectors is orthogonal. Notice that this is not a generalization, since Theorem HMOE states a weak result which applies to many (but not all) pairs of eigenvectors, while Theorem OBNM is a seemingly stronger result, but only asserts that there is one collection of eigenvectors with the stronger property.

Given an n × n matrix A, an orthonormal basis for C^n comprised of eigenvectors of A is an extremely useful basis to have at the service of the matrix A. Why do we say this? We can consider the vectors of a basis as a preferred set of directions, known as "axes," which taken together might also be called a "coordinate system." The standard basis of Definition SUV could be considered the default, or prototype,
The standard basis of Def<strong>in</strong>ition SUV could be considered the default, or prototype,


coordinate system. When a basis is orthonormal, we can consider the directions to be standardized to have unit length, and we can consider the axes as being mutually perpendicular. But there is more — let us be a bit more formal.

Suppose U is a matrix whose columns are an orthonormal basis of eigenvectors of the n × n matrix A. So, in particular, U is a unitary matrix (Theorem CUMOS). For a vector x ∈ C^n, use the notation x̂ for the vector representation of x relative to the orthonormal basis. So the entries of x̂, used in a linear combination of the columns of U, will create x. With Definition MVP, we can write this relationship as

U x̂ = x

Since U^* is the inverse of U (Definition UM), we can rearrange this equation as

x̂ = U^*x

This says we can easily create the vector representation relative to the orthonormal basis with a matrix-vector product of the adjoint of U. Note that the adjoint is much easier to compute than a matrix inverse, which would be one general way to obtain a vector representation. This is our first observation about coordinatization relative to an orthonormal basis. However, we already knew this, as we just have Theorem COB in disguise (see Exercise OD.T20).

We also know that orthonormal bases play nicely with inner products. Theorem UMPIP says unitary matrices preserve inner products (and hence norms). More geometrically, lengths and angles are preserved by multiplication by a unitary matrix. Using our notation, this becomes

⟨x, y⟩ = ⟨U x̂, U ŷ⟩ = ⟨x̂, ŷ⟩

So we can compute inner products with the original vectors, or with their representations, and obtain the same result. It follows that norms, lengths, and angles can all be computed with the original vectors or with the representations in the new coordinate system based on an orthonormal basis.

So far we have not really said anything new, nor has the matrix A, or its eigenvectors, come into play. We know that a matrix is really a linear transformation, so we express this view of a matrix as a function by writing generically that Ax = y. The matrix U will diagonalize A, creating the diagonal matrix D with diagonal entries equal to the eigenvalues of A. We can write this as U^*AU = D and convert to U^*A = DU^*. Then we have

ŷ = U^*y = U^*Ax = DU^*x = D x̂

So with the coordinatized vectors, the transformation by the matrix A can be accomplished with multiplication by a diagonal matrix D. A moment's thought should convince you that a matrix-vector product with a diagonal matrix is exceedingly simple computationally. Geometrically, this is simply stretching, contracting and/or reflecting in the direction of each basis vector ("axis"). And the multiples used for these changes are the diagonal entries of the diagonal matrix, the eigenvalues of A.


So the new coordinate system (provided by the orthonormal basis of eigenvectors) is a collection of mutually perpendicular unit vectors where inner products are preserved, and the action of the matrix A is described by multiples (eigenvalues) of the entries of the coordinatized versions of the vectors. Nice.
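A short numerical sketch of this coordinate-system view, reusing the normal matrix of Example ANM (variable names are illustrative):

    import numpy as np
    from scipy.linalg import schur

    A = np.array([[1.0, -1.0], [1.0, 1.0]])   # normal
    T, U = schur(A, output='complex')         # here T is diagonal
    x = np.array([2.0, -3.0])
    xhat = U.conj().T @ x                     # representation relative to columns of U
    yhat = np.diag(T) * xhat                  # D "stretches" each coordinate
    print(np.allclose(U @ yhat, A @ x))       # True: the same transformation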

Reading Questions

1. Name three broad classes of normal matrices that we have studied previously. No set that you give should be a subset of another on your list.

2. Compare and contrast Theorem UTMR with Theorem OD.

3. Given an n × n matrix A, why would you desire an orthonormal basis of C^n composed entirely of eigenvectors of A?

Exercises

T10 Exercise MM.T35 asked you to show that AA^* is Hermitian. Prove directly that AA^* is a normal matrix.

T20 In the discussion following Theorem OBNM we comment that the equation x̂ = U^*x is just Theorem COB in disguise. Formulate this observation more formally and prove the equivalence.

T30 For the proof of Theorem PTMT we only show that the product of two lower triangular matrices is again lower triangular. Provide a proof that the product of two upper triangular matrices is again upper triangular. Look to the proof of Theorem PTMT for guidance if you need a hint.


Chapter P
Preliminaries

"Preliminaries" are basic mathematical concepts you are likely to have seen before. So we have collected them near the end as reference material (despite the name). Head back here when you need a refresher, or when a theorem or exercise builds on some of this basic material.

Section CNO
Complex Number Operations

In this section we review some of the basics of working with complex numbers.

Subsection CNA
Arithmetic with complex numbers

A complex number is a linear combination of 1 and i = √−1, typically written in the form a + bi. Complex numbers can be added, subtracted, multiplied and divided, just like we are used to doing with real numbers, including the restriction on division by zero. We will not define these operations carefully immediately, but instead first illustrate with examples.

Example ACN Arithmetic of complex numbers

(2 + 5i) + (6 − 4i) = (2 + 6) + (5 + (−4))i = 8 + i
(2 + 5i) − (6 − 4i) = (2 − 6) + (5 − (−4))i = −4 + 9i
(2 + 5i)(6 − 4i) = (2)(6) + (5i)(6) + (2)(−4i) + (5i)(−4i) = 12 + 30i − 8i − 20i²
                = 12 + 22i − 20(−1) = 32 + 22i




Division takes just a bit more care. We multiply the numerator and denominator by a complex number chosen so that the denominator becomes a real number, and then we can produce a complex number as a result.

(2 + 5i)/(6 − 4i) = ((2 + 5i)/(6 − 4i)) · ((6 + 4i)/(6 + 4i)) = (−8 + 38i)/52 = −8/52 + (38/52)i = −2/13 + (19/26)i  △

In this example, we used 6 + 4i to convert the denominator in the fraction to a real number. This number is known as the conjugate, which we define in the next section.
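
Python's built-in complex type can replay this entire example; here is an optional sketch (the choice of language is ours, not the text's), including the conjugate trick for division.

    alpha = 2 + 5j      # Python writes i as j
    beta = 6 - 4j

    print(alpha + beta)     # (8+1j)
    print(alpha - beta)     # (-4+9j)
    print(alpha * beta)     # (32+22j)

    # Division by the conjugate trick: multiply top and bottom by 6 + 4i.
    real_denominator = (beta * beta.conjugate()).real        # 52.0
    print((alpha * beta.conjugate()) / real_denominator)     # -2/13 + (19/26)i
    print(alpha / beta)     # the built-in division agrees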

We will often exploit the basic properties of complex number addition, subtraction, multiplication and division, so we will carefully define the two basic operations, together with a definition of equality, and then collect nine basic properties in a theorem.

Definition CNE Complex Number Equality
The complex numbers α = a + bi and β = c + di are equal, denoted α = β, if a = c and b = d. □

Definition CNA Complex Number Addition
The sum of the complex numbers α = a + bi and β = c + di, denoted α + β, is (a + c) + (b + d)i. □

Definition CNM Complex Number Multiplication
The product of the complex numbers α = a + bi and β = c + di, denoted αβ, is (ac − bd) + (ad + bc)i. □

Theorem PCNA Properties of Complex Number Arithmetic
The operations of addition and multiplication of complex numbers have the following properties.

• ACCN Additive Closure, Complex Numbers
If α, β ∈ C, then α + β ∈ C.

• MCCN Multiplicative Closure, Complex Numbers
If α, β ∈ C, then αβ ∈ C.

• CACN Commutativity of Addition, Complex Numbers
For any α, β ∈ C, α + β = β + α.

• CMCN Commutativity of Multiplication, Complex Numbers
For any α, β ∈ C, αβ = βα.

• AACN Additive Associativity, Complex Numbers
For any α, β, γ ∈ C, α + (β + γ) = (α + β) + γ.



• MACN Multiplicative Associativity, Complex Numbers
For any α, β, γ ∈ C, α(βγ) = (αβ)γ.

• DCN Distributivity, Complex Numbers
For any α, β, γ ∈ C, α(β + γ) = αβ + αγ.

• ZCN Zero, Complex Numbers
There is a complex number 0 = 0 + 0i so that for any α ∈ C, 0 + α = α.

• OCN One, Complex Numbers
There is a complex number 1 = 1 + 0i so that for any α ∈ C, 1α = α.

• AICN Additive Inverse, Complex Numbers
For every α ∈ C there exists −α ∈ C so that α + (−α) = 0.

• MICN Multiplicative Inverse, Complex Numbers
For every α ∈ C, α ≠ 0, there exists 1/α ∈ C so that α(1/α) = 1.

Proof. We could derive each of these properties of complex numbers with a proof that builds on the identical properties of the real numbers. The only proof that might be at all interesting would be to show Property MICN since we would need to trot out a conjugate. For this property, and especially for the others, we might be tempted to construct proofs of the identical properties for the reals. This would take us way too far afield, so we will draw a line in the sand right here and just agree that these nine fundamental behaviors are true. OK?

Mostly we have stated these nine properties carefully so that we can make reference to them later in other proofs. So we will be linking back here often. ■

Zero and one play special roles, of course, and especially zero. Our first result is one we take for granted, but it requires a proof, derived from our nine properties. You can compare it to its vector space counterparts, Theorem ZSSM and Theorem ZVSM.

Theorem ZPCN Zero Product, Complex Numbers
Suppose α ∈ C. Then 0α = 0.

Proof.

0α = 0α + 0                  Property ZCN
   = 0α + (0α − (0α))        Property AICN
   = (0α + 0α) − (0α)        Property AACN
   = (0 + 0)α − (0α)         Property DCN
   = 0α − (0α)               Property ZCN



   = 0                       Property AICN  ■

Our next theorem could be called "cancellation", since it will make that possible. Though you will never see us drawing slashes through parts of products. We will also make very limited use of this result, or its vector space counterpart, Theorem SMEZV.

Theorem ZPZT Zero Product, Zero Terms
Suppose α, β ∈ C. Then αβ = 0 if and only if at least one of α = 0 or β = 0.

Proof. (⇒) We conduct the forward argument in two cases. First suppose that α = 0. Then we are done. (That was easy.)

For the second case, suppose now that α ≠ 0. Then

β = 1β                Property OCN
  = ((1/α)α)β         Property MICN
  = (1/α)(αβ)         Property MACN
  = (1/α)0            Hypothesis
  = 0                 Theorem ZPCN

(⇐) With two applications of Theorem ZPCN it is easy to see that if one of the scalars is zero, then so is the product. ■

As an equivalence (Proof Technique E), we could restate this result as the contrapositive (Proof Technique CP) by negating each statement, so it would read "αβ ≠ 0 if and only if α ≠ 0 and β ≠ 0." After you have learned more about nonsingular matrices and matrix multiplication, you should compare this result with Theorem NPNT.

Subsection CCN
Conjugates of Complex Numbers

Definition CCN Conjugate of a Complex Number
The conjugate of the complex number α = a + bi ∈ C is the complex number \overline{α} = a − bi. □

Example CSCN Conjugate of some complex numbers

\overline{2 + 3i} = 2 − 3i    \overline{5 − 4i} = 5 + 4i    \overline{−3 + 0i} = −3 + 0i    \overline{0 + 0i} = 0 + 0i  △



Notice how the conjugate of a real number leaves the number unchanged. The conjugate enjoys some basic properties that are useful when we work with linear expressions involving addition and multiplication.

Theorem CCRA Complex Conjugation Respects Addition
Suppose that α and β are complex numbers. Then \overline{α + β} = \overline{α} + \overline{β}.

Proof. Let α = a + bi and β = r + si. Then

\overline{α + β} = \overline{(a + r) + (b + s)i} = (a + r) − (b + s)i = (a − bi) + (r − si) = \overline{α} + \overline{β}  ■

Theorem CCRM Complex Conjugation Respects Multiplication
Suppose that α and β are complex numbers. Then \overline{αβ} = \overline{α}\,\overline{β}.

Proof. Let α = a + bi and β = r + si. Then

\overline{αβ} = \overline{(ar − bs) + (as + br)i} = (ar − bs) − (as + br)i
             = (ar − (−b)(−s)) + (a(−s) + (−b)r)i = (a − bi)(r − si) = \overline{α}\,\overline{β}  ■

Theorem CCT Complex Conjugation Twice
Suppose that α is a complex number. Then \overline{\overline{α}} = α.

Proof. Let α = a + bi. Then

\overline{\overline{α}} = \overline{a − bi} = a − (−bi) = a + bi = α  ■

Subsection MCN
Modulus of a Complex Number

We define one more operation with complex numbers that may be new to you.

Definition MCN Modulus of a Complex Number
The modulus of the complex number α = a + bi ∈ C, is the nonnegative real number

|α| = √(α\overline{α}) = √(a² + b²).  □

Example MSCN Modulus of some complex numbers

|2 + 3i| = √13    |5 − 4i| = √41    |−3 + 0i| = 3    |0 + 0i| = 0  △



The modulus can be interpreted as a version of the absolute value for complex numbers, as is suggested by the notation employed. You can see this in how |−3| = |−3 + 0i| = 3. Notice too how the modulus of the complex zero, 0 + 0i, has value 0.
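
As an optional check, Python's complex type can verify all three conjugation theorems, and the modulus, on sample numbers (the particular numbers, and the language choice, are ours).

    import math

    alpha, beta = 2 + 3j, 5 - 4j

    # Conjugation respects addition and multiplication (Theorems CCRA, CCRM),
    # and conjugating twice returns the original number (Theorem CCT).
    print((alpha + beta).conjugate() == alpha.conjugate() + beta.conjugate())  # True
    print((alpha * beta).conjugate() == alpha.conjugate() * beta.conjugate())  # True
    print(alpha.conjugate().conjugate() == alpha)                              # True

    # The modulus: |alpha| = sqrt(alpha * conj(alpha)) = sqrt(a^2 + b^2).
    print(abs(alpha))                                      # 3.6055... = sqrt(13)
    print(math.sqrt((alpha * alpha.conjugate()).real))     # the same value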


Section SET
Sets

We will frequently work carefully with sets, so the material in this review section is very important. If these topics are new to you, study this section carefully and consider consulting another text for a more comprehensive introduction.

Subsection SET
Sets

Definition SET Set
A set is an unordered collection of objects. If S is a set and x is an object that is in the set S, we write x ∈ S. If x is not in S, then we write x ∉ S. We refer to the objects in a set as its elements. □

Hard to get much more basic than that. Notice that the objects in a set can be anything, and there is no notion of order among the elements of the set. A set can be finite as well as infinite. A set can contain other sets as its objects. At a primitive level, a set is just a way to break up some class of objects into two groupings: those objects in the set, and those objects not in the set.

Example SETM Set membership
From the set of all possible symbols, construct the following set of three symbols,

S = {▲, ●, ⋆}

Then the statement ▲ ∈ S is true, while the statement ■ ∈ S is false. However, then the statement ■ ∉ S is true.  △

A portion of a set is known as a subset. Notice how the following definition uses an implication (if ... then ...). Note too how the definition of a subset relies on the definition of a set through the idea of set membership.

Definition SSET Subset
If S and T are two sets, then S is a subset of T, written S ⊆ T, if whenever x ∈ S then x ∈ T. □

If we want to disallow the possibility that S is the same as T, we use the notation S ⊂ T and we say that S is a proper subset of T. We will do an example, but first we will define a special set.

Definition ES Empty Set
The empty set is the set with no elements. It is denoted by ∅. □

Example SSET Subset




If S = {▲, ●, ⋆}, T = {⋆, ●}, R = {■, ⋆}, then

T ⊆ S    R ⊈ T    ∅ ⊆ S
T ⊂ S    S ⊆ S    S ⊄ S  △

What does it mean for two sets to be equal? They must be the same. Well, that explanation is not really too helpful, is it? How about: if A ⊆ B and B ⊆ A, then A equals B. This gives us something to work with: if A is a subset of B, and vice versa, then they must really be the same set. We will now make the symbol "=" do double-duty and extend its use to statements like A = B, where A and B are sets. Here is the definition, which we will reference often.

Definition SE Set Equality
Two sets, S and T, are equal, if S ⊆ T and T ⊆ S. In this case, we write S = T. □

Sets are typically written inside of braces, as { }, as we have seen above. However, when sets have more than a few elements, a description will typically have two components. The first is a description of the general type of objects contained in a set, while the second is some sort of restriction on the properties the objects have. Every object in the set must be of the type described in the first part and it must satisfy the restrictions in the second part. Conversely, any object of the proper type for the first part, that also meets the conditions of the second part, will be in the set. These two parts are set off from each other somehow, often with a vertical bar (|) or a colon (:).

I like to think of sets as clubs. The first part is some description of the type of people who might belong to the club, the basic objects. For example, a bicycle club would describe its members as being people who like to ride bicycles. The second part is like a membership committee, it restricts the people who are allowed in the club. Continuing with our bicycle club analogy, we might decide to limit ourselves to "serious" riders and only have members who can document having ridden 100 kilometers or more in a single day at least one time.

The restrictions on membership can migrate around some between the first and second part, and there may be several ways to describe the same set of objects. Here is a more mathematical example, employing the set of all integers, Z, to describe the set of even integers.

E = {x ∈ Z | x is an even number}
  = {x ∈ Z | 2 divides x evenly}
  = {2k | k ∈ Z}

Notice how this set tells us that its objects are integer numbers (not, say, matrices or functions, for example) and just those that are even. So we can write that 10 ∈ E, while 17 ∉ E once we check the membership criteria. We also recognize the question

[ 1  −3  5 ]
[ 2   0  3 ]  ∈ E?

as being simply ridiculous.
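
In a programming language, the two-part description of a set becomes a membership test; here is an optional Python sketch for the set E above (the function name in_E is ours).

    # The infinite set E cannot be stored, but its membership rule can:
    def in_E(x):
        return isinstance(x, int) and x % 2 == 0

    print(in_E(10))    # True
    print(in_E(17))    # False

    # The equivalent description {2k | k in Z}, sampled over a finite range.
    E_sample = {2 * k for k in range(-10, 11)}
    print(10 in E_sample)    # True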

Subsection SC
Set Cardinality

On occasion, we will be interested in the number of elements in a finite set. Here is the definition and the associated notation.

Definition C Cardinality
Suppose S is a finite set. Then the number of elements in S is called the cardinality or size of S, and is denoted |S|. □

Example CS Cardinality and Size
If S = {▲, ⋆, ●}, then |S| = 3.  △

Subsection SO
Set Operations

In this subsection we define and illustrate the three most common basic ways to manipulate sets to create other sets. Since much of linear algebra is about sets, we will use these often.

Definition SU Set Union
Suppose S and T are sets. Then the union of S and T, denoted S ∪ T, is the set whose elements are those that are elements of S or of T, or both. More formally,

x ∈ S ∪ T if and only if x ∈ S or x ∈ T  □

Notice that the use of the word "or" in this definition is meant to be non-exclusive. That is, it allows for x to be an element of both S and T and still qualify for membership in S ∪ T.

Example SU Set union
If S = {▲, ⋆, ●} and T = {●, ⋆, ■} then S ∪ T = {▲, ⋆, ●, ■}.  △

Definition SI Set Intersection
Suppose S and T are sets. Then the intersection of S and T, denoted S ∩ T, is the set whose elements are only those that are elements of S and of T. More formally,

x ∈ S ∩ T if and only if x ∈ S and x ∈ T  □



Example SI Set intersection
If S = {▲, ⋆, ●} and T = {●, ⋆, ■} then S ∩ T = {●, ⋆}.  △

The union and intersection of sets are operations that begin with two sets and produce a third, new, set. Our final operation is the set complement, which we usually think of as an operation that takes a single set and creates a second, new, set. However, if you study the definition carefully, you will see that it needs to be computed relative to some "universal" set.

Definition SC Set Complement
Suppose S is a set that is a subset of a universal set U. Then the complement of S, denoted \overline{S}, is the set whose elements are those that are elements of U and not elements of S. More formally,

x ∈ \overline{S} if and only if x ∈ U and x ∉ S  □

Notice that there is nothing at all special about the universal set. This is simply a term that suggests that U contains all of the possible objects we are considering. Often this set will be clear from the context, and we will not think much about it, nor reference it in our notation. In other cases (rarely in our work in this course) the exact nature of the universal set must be made explicit, and reference to it will possibly be carried through in our choice of notation.

Example SC Set complement
If U = {▲, ⋆, ●, ■} and S = {▲, ⋆, ●} then \overline{S} = {■}.  △

There are many more natural operations that can be performed on sets, such as an exclusive-or and the symmetric difference. Many of these can be defined in terms of the union, intersection and complement. We will not have much need of them in this course, and so we will not give precise descriptions here in this preliminary section.

There is also an interesting variety of basic results that describe the interplay of these operations with each other. We mention just two as an example, these are known as DeMorgan's Laws.

\overline{S ∪ T} = \overline{S} ∩ \overline{T}
\overline{S ∩ T} = \overline{S} ∪ \overline{T}

Besides having an appealing symmetry, we mention these two facts since constructing the proofs of each is a useful exercise that will require a solid understanding of all but one of the definitions presented in this section. Give it a try.
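
Both laws are easy to test on small finite sets; here is an optional Python sketch (the particular sets are arbitrary choices of ours). A finite check is no proof, but it can build confidence before you attempt one.

    U = {1, 2, 3, 4, 5, 6}    # a small universal set
    S = {1, 2, 3}
    T = {3, 4}

    def complement(X):
        return U - X          # elements of U not in X

    # DeMorgan's Laws, checked on this particular choice of S and T.
    print(complement(S | T) == complement(S) & complement(T))    # True
    print(complement(S & T) == complement(S) | complement(T))    # True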


Reference

Proof Techniques

In this section we collect many short essays designed to help you understand how to read, understand and construct proofs. Some are very factual, while others consist of advice. They appear in the order that they are first needed (or advisable) in the text, and are meant to be self-contained. So you should not think of reading through this section in one sitting as you begin this course. But be sure to head back here for a first reading whenever the text suggests it. Also think about returning to browse at various points during the course, and especially as you struggle with becoming an accomplished mathematician who is comfortable with the difficult process of designing new proofs.

Proof Technique D
Definitions

A definition is a made-up term, used as a kind of shortcut for some typically more complicated idea. For example, we say a whole number is even as a shortcut for saying that when we divide the number by two we get a remainder of zero. With a precise definition, we can answer certain questions unambiguously. For example, did you ever wonder if zero was an even number? Now the answer should be clear since we have a precise definition of what we mean by the term even.

A single term might have several possible definitions. For example, we could say that the whole number n is even if there is some whole number k such that n = 2k. We say this is an equivalent definition since it categorizes even numbers the same way our first definition does.

Definitions are like two-way streets — we can use a definition to replace something rather complicated by its definition (if it fits) and we can replace a definition by its more complicated description. A definition is usually written as some form of an implication, such as "If something-nice-happens, then blatzo." However, this also means that "If blatzo, then something-nice-happens," even though this may not be formally stated. This is what we mean when we say a definition is a two-way street — it is really two implications, going in opposite "directions."

Anybody (including you) can make up a definition, so long as it is unambiguous, but the real test of a definition's utility is whether or not it is useful for concisely describing interesting or frequent situations. We will try to restrict our definitions to parts of speech that are nouns (e.g. "matrix") or adjectives (e.g. "nonsingular" matrix), and so avoid definitions that are verbs or adverbs. Therefore our definitions will describe an object (noun) or a property of an object (adjective).

We will talk about theorems later (and especially equivalences). For now, be sure not to confuse the notion of a definition with that of a theorem.

In this book, we will display every new definition carefully set-off from the text, and the term being defined will be written thus: definition. Additionally, there is a full list of all the definitions, in order of their appearance, located in the reference section of the same name (Definitions). Definitions are critical to doing mathematics and proving theorems, so we have given you lots of ways to locate a definition should you forget its... uh, well, ... definition.

Can you formulate a precise definition for what it means for a number to be odd? (Do not just say it is the opposite of even. Act as if you do not have a definition for even yet.) Can you formulate your definition a second, equivalent, way? Can you employ your definition to test an odd and an even number for "odd-ness"?
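
If you want to experiment, here is an optional Python sketch (the function names are ours) that encodes two equivalent definitions of "even" and confirms they agree on a range of integers; formulating "odd" the same way is a good exercise.

    # Two candidate definitions of "even", checked against each other.
    def even_by_remainder(n):
        return n % 2 == 0     # remainder zero upon division by two

    def even_by_witness(n):
        # n = 2k for some integer k, searched over a wide enough range
        return any(n == 2 * k for k in range(-abs(n) - 1, abs(n) + 2))

    print(all(even_by_remainder(n) == even_by_witness(n)
              for n in range(-50, 51)))    # True: the definitions agree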

Proof Technique T
Theorems

Higher mathematics is about understanding theorems. Reading them, understanding them, applying them, proving them. Every theorem is a shortcut — we prove something in general, and then whenever we find a specific instance covered by the theorem we can immediately say that we know something else about the situation by applying the theorem. In many cases, this new information can be gained with much less effort than if we did not know the theorem.

The first step in understanding a theorem is to realize that the statement of every theorem can be rewritten using statements of the form "If something-happens, then something-else-happens." The "something-happens" part is the hypothesis and the "something-else-happens" is the conclusion. To understand a theorem, it helps to rewrite its statement using this construction. To apply a theorem, we verify that "something-happens" in a particular instance and immediately conclude that "something-else-happens." To prove a theorem, we must argue based on the assumption that the hypothesis is true, and arrive through the process of logic that the conclusion must then also be true.



Proof Technique L
Language

Like any science, the language of math must be understood before further study can continue.

Erin Wilson, Student
September, 2004

Mathematics is a language. It is a way to express complicated ideas clearly, precisely, and unambiguously. Because of this, it can be difficult to read. Read slowly, and have pencil and paper at hand. It will usually be necessary to read something several times. While reading can be difficult, it is even harder to speak mathematics, and so that is the topic of this technique.

"Natural" language, in the present case English, is fraught with ambiguity. Consider the possible meanings of the sentence: The fish is ready to eat. One fish, or two fish? Are the fish hungry, or will the fish be eaten? (See Exercise SSLE.M10, Exercise SSLE.M11, Exercise SSLE.M12, Exercise SSLE.M13.) In your daily interactions with others, give some thought to how many misunderstandings arise from the ambiguity of pronouns, modifiers and objects.

I am going to suggest a simple modification to the way you use language that will make it much, much easier to become proficient at speaking mathematics and eventually it will become second nature. Think of it as a training aid or practice drill you might use when learning to become skilled at a sport.

First, eliminate pronouns from your vocabulary when discussing linear algebra, in class or with your colleagues. Do not use: it, that, those, their or similar sources of confusion. This is the single easiest step you can take to make your oral expression of mathematics clearer to others, and in turn, it will greatly help your own understanding.

Now rid yourself of the word "thing" (or variants like "something"). When you are tempted to use this word realize that there is some object you want to discuss, and we likely have a definition for that object (see the discussion at Proof Technique D). Always "think about your objects" and many aspects of the study of mathematics will get easier. Ask yourself: "Am I working with a set, a number, a function, an operation, a differential equation, or what?" Knowing what an object is will allow you to narrow down the procedures you may apply to it. If you have studied an object-oriented computer programming language, then you will already have experience identifying objects and thinking carefully about what procedures are allowed to be applied to them.

Third, eliminate the verb "works" (as in "the equation works") from your vocabulary. This term is used as a substitute when we are not sure just what we are trying to accomplish. Usually we are trying to say that some object fulfills some condition. The condition might even have a definition associated with it, making it even easier to describe.

Last, speak slooooowly and thoughtfully as you try to get by without all these lazy words. It is hard at first, but you will get better with practice. Especially in class, when the pressure is on and all eyes are on you, do not succumb to the temptation to use these weak words. Slow down, we'd all rather wait for a slow, well-formed question or answer than a fast, sloppy, incomprehensible one.

You will find the improvement in your ability to speak clearly about complicated ideas will greatly improve your ability to think clearly about complicated ideas. And I believe that you cannot think clearly about complicated ideas if you cannot formulate questions or answers clearly in the correct language. This is as applicable to the study of law, economics or philosophy as it is to the study of science or mathematics.

In this spirit, Dupont Hubert has contributed the following quotation, which is widely used in French mathematics courses (and which might be construed as the contrapositive of Proof Technique CP)

Ce que l'on conçoit bien s'énonce clairement,
Et les mots pour le dire arrivent aisément.

which translates as

Whatever is well conceived is clearly said,
And the words to say it flow with ease.

Nicolas Boileau
L'art poétique
Chant I, 1674

So when you come to class, check your pronouns at the door, along with other weak words. And when studying with friends, you might make a game of catching one another using pronouns, "thing," or "works." I know I'll be calling you on it!

Proof Technique GS
Getting Started

"I don't know how to get started!" is often the lament of the novice proof-builder. Here are a few pieces of advice.

1. As mentioned in Proof Technique T, rewrite the statement of the theorem in an "if-then" form. This will simplify identifying the hypothesis and conclusion, which are referenced in the next few items.

2. Ask yourself what kind of statement you are trying to prove. This is always part of your conclusion. Are you being asked to conclude that two numbers are equal, that a function is differentiable or a set is a subset of another? You cannot bring other techniques to bear if you do not know what type of conclusion you have.

3. Write down reformulations of your hypotheses. Interpret and translate each definition properly.

4. Write your hypothesis at the top of a sheet of paper and your conclusion at the bottom. See if you can formulate a statement that precedes the conclusion and also implies it. Work down from your hypothesis, and up from your conclusion, and see if you can meet in the middle. When you are finished, rewrite the proof nicely, from hypothesis to conclusion, with verifiable implications giving each subsequent statement.

5. As you work through your proof, think about what kinds of objects your symbols represent. For example, suppose A is a set and f(x) is a real-valued function. Then the expression A + f might make no sense if we have not defined what it means to "add" a set to a function, so we can stop at that point and adjust accordingly. On the other hand we might understand 2f to be the function whose rule is described by (2f)(x) = 2f(x). "Think about your objects" means to always verify that your objects and operations are compatible.
compatible.<br />

Proof Technique C
Constructive Proofs

Conclusions of proofs come in a variety of types. Often a theorem will simply assert that something exists. The best way, but not the only way, to show something exists is to actually build it. Such a proof is called constructive. The thing to realize about constructive proofs is that the proof itself will contain a procedure that might be used computationally to construct the desired object. If the procedure is not too cumbersome, then the proof itself is as useful as the statement of the theorem.

Proof Technique E
Equivalences

When a theorem uses the phrase "if and only if" (or the abbreviation "iff") it is a shorthand way of saying that two if-then statements are true. So if a theorem says "P if and only if Q," then it is true that "if P, then Q" while it is also true that "if Q, then P." For example, it may be a theorem that "I wear bright yellow knee-high plastic boots if and only if it is raining." This means that I never forget to wear my super-duper yellow boots when it is raining and I would not be seen in such silly boots unless it was raining. You never have one without the other. I have my boots on and it is raining or I do not have my boots on and it is dry.

The upshot for proving such theorems is that it is like a 2-for-1 sale, we get to do two proofs. Assume P and conclude Q, then start over and assume Q and conclude P. For this reason, "if and only if" is sometimes abbreviated by ⇐⇒, while proofs indicate which of the two implications is being proved by prefacing each with ⇒ or ⇐. A carefully written proof will remind the reader which statement is being used as the hypothesis, a quicker version will let the reader deduce it from the direction of the arrow. Tradition dictates we do the "easy" half first, but that is hard for a student to know until you have finished doing both halves! Oh well, if you rewrite your proofs (a good habit), you can then choose to put the easy half first.

Theorems of this type are called "equivalences" or "characterizations," and they are some of the most pleasing results in mathematics. They say that two objects, or two situations, are really the same. You do not have one without the other, like rain and my yellow boots. The more different P and Q seem to be, the more pleasing it is to discover they are really equivalent. And if P describes a very mysterious solution or involves a tough computation, while Q is transparent or involves easy computations, then we have found a great shortcut for better understanding or faster computation. Remember that every theorem really is a shortcut in some form. You will also discover that if proving P ⇒ Q is very easy, then proving Q ⇒ P is likely to be proportionately harder. Sometimes the two halves are about equally hard. And in rare cases, you can string together a whole sequence of other equivalences to form the one you are after and you do not even need to do two halves. In this case, the argument of one half is just the argument of the other half, but in reverse.

One last thing about equivalences. If you see a statement of a theorem that says two things are "equivalent," translate it first into an "if and only if" statement.

Proof Technique N
Negation

When we construct the contrapositive of a theorem (Proof Technique CP), we need to negate the two statements in the implication. And when we construct a proof by contradiction (Proof Technique CD), we need to negate the conclusion of the theorem. One way to construct a converse (Proof Technique CV) is to simultaneously negate the hypothesis and conclusion of an implication (but remember that this is not guaranteed to be a true statement). So we often have the need to negate statements, and in some situations it can be tricky.

If a statement says that a set is empty, then its negation is the statement that the set is nonempty. That is straightforward. Suppose a statement says "something-happens" for all i, or every i, or any i. Then the negation is that "something-does-not-happen" for at least one value of i. If a statement says that there exists at least one "thing," then the negation is the statement that there is no "thing." If a statement says that a "thing" is unique, then the negation is that there is zero, or more than one, of the "thing."

We are not covering all of the possibilities, but we wish to make the point that logical qualifiers like "there exists" or "for every" must be handled with care when negating statements. Studying the proofs which employ contradiction (as listed in Proof Technique CD) is a good first step towards understanding the range of possibilities.
possibilities.<br />

Proof Technique CP
Contrapositives

The contrapositive of an implication P ⇒ Q is the implication not(Q) ⇒ not(P), where "not" means the logical negation, or opposite. An implication is true if and only if its contrapositive is true. In symbols, (P ⇒ Q) ⇐⇒ (not(Q) ⇒ not(P)) is a theorem. Such statements about logic, that are always true, are known as tautologies.

For example, it is a theorem that "if a vehicle is a fire truck, then it has big tires and has a siren." (Yes, I'm sure you can conjure up a counterexample, but play along with me anyway.) The contrapositive is "if a vehicle does not have big tires or does not have a siren, then it is not a fire truck." Notice how the "and" became an "or" when we negated the conclusion of the original theorem.

It will frequently happen that it is easier to construct a proof of the contrapositive than of the original implication. If you are having difficulty formulating a proof of some implication, see if the contrapositive is easier for you. The trick is to construct the negation of complicated statements accurately. More on that later.

Proof Technique CV
Converses

The converse of the implication P ⇒ Q is the implication Q ⇒ P. There is no guarantee that the truth of these two statements are related. In particular, if an implication has been proven to be a theorem, then do not try to use its converse too, as if it were a theorem. Sometimes the converse is true (and we have an equivalence, see Proof Technique E). But more likely the converse is false, especially if it was not included in the statement of the original theorem.

For example, we have the theorem, "if a vehicle is a fire truck, then it has big tires and has a siren." The converse is false. The statement that "if a vehicle has big tires and a siren, then it is a fire truck" is false. A police vehicle for use on a sandy public beach would have big tires and a siren, yet is not equipped to fight fires.

We bring this up now, because Theorem CSRN has a tempting converse. Does this theorem say that if r < n, then the system is consistent? Definitely not, as Archetype E has r = 3 < 4 = n, yet is inconsistent. This example is then said to be a counterexample to the converse. Whenever you think a theorem that is an implication might actually be an equivalence, it is good to hunt around for a counterexample that shows the converse to be false (the archetypes, Archetypes, can be a good hunting ground).

Proof Technique CD
Contradiction

Another proof technique is known as "proof by contradiction" and it can be a powerful (and satisfying) approach. Simply put, suppose you wish to prove the implication, "If A, then B." As usual, we assume that A is true, but we also make the additional assumption that B is false. If our original implication is true, then these twin assumptions should lead us to a logical inconsistency. In practice we assume the negation of B to be true (see Proof Technique N). So we argue from the assumptions A and not(B) looking for some obviously false conclusion such as 1 = 6, or a set is simultaneously empty and nonempty, or a matrix is both nonsingular and singular.

You should be careful about formulating proofs that look like proofs by contradiction, but really are not. This happens when you assume A and not(B) and proceed to give a "normal" and direct proof that B is true by only using the assumption that A is true. Your last step is to then claim that B is true and you then appeal to the assumption that not(B) is true, thus getting the desired contradiction. Instead, you could have avoided the overhead of a proof by contradiction and just run with the direct proof. This stylistic flaw is known, quite graphically, as "setting up the strawman to knock him down."

Here is a simple example of a proof by contradiction. There are direct proofs that are just about as easy, but this will demonstrate the point, while narrowly avoiding knocking down the straw man.

Theorem: If a and b are odd integers, then their product, ab, is odd.

Proof: To begin a proof by contradiction, assume the hypothesis, that a and b are odd. Also assume the negation of the conclusion, in this case, that ab is even. Then there are integers, j, k, l so that a = 2j + 1, b = 2k + 1, ab = 2l. Then

0 = ab − ab
  = (2j + 1)(2k + 1) − (2l)
  = 4jk + 2j + 2k − 2l + 1
  = 2(2jk + j + k − l) + 1

The right-hand side is an odd integer, while the left-hand side, 0, is even, and this is the desired contradiction.

Again, we do not offer this example as the best proof of this fact about even and odd numbers, but rather it is a simple illustration of a proof by contradiction. You can find examples of proofs by contradiction in Theorem RREFU, Theorem NMUS, Theorem NPNT, Theorem TTMI, Theorem GSP, Theorem ELIS, Theorem EDYES, Theorem EMHE, Theorem EDELI, and Theorem DMFE, in addition to several examples and solutions to exercises.

Proof Technique U
Uniqueness

A theorem will sometimes claim that some object, having some desirable property, is unique. In other words, there should be only one such object. To prove this, a standard technique is to assume there are two such objects and proceed to analyze the consequences. The end result may be a contradiction (Proof Technique CD), or the conclusion that the two allegedly different objects really are equal.

Proof Technique ME
Multiple Equivalences

A very specialized form of a theorem begins with the statement "The following are equivalent...," which is then followed by a list of statements. Informally, this lead-in sometimes gets abbreviated by "TFAE." This formulation means that any two of the statements on the list can be connected with an "if and only if" to form a theorem. So if the list has n statements then there are n(n−1)/2 possible equivalences that can be constructed (and are claimed to be true).

Suppose a theorem of this form has statements denoted as A, B, C, ..., Z. To prove the entire theorem, we can prove A ⇒ B, B ⇒ C, C ⇒ D, ..., Y ⇒ Z and finally, Z ⇒ A. This circular chain of n implications would allow us, logically, if not practically, to form any one of the n(n−1)/2 possible equivalences by chasing the implications around the circle as far as required.

Proof Technique PI
Proving Identities

Many theorems have conclusions that say two objects are equal. Perhaps one object is hard to compute or understand, while the other is easy to compute or understand. This would make for a pleasing theorem. Whether the result is pleasing or not, we take the same approach to formulate a proof. Sometimes we need to employ specialized notions of equality, such as Definition SE or Definition CVE, but in other cases we can string together a list of equalities.

The wrong way to prove an identity is to begin by writing it down and then beating on it until it reduces to an obvious identity. The first flaw is that you would be writing down the statement you wish to prove, as if you already believed it to be true. But more dangerous is the possibility that some of your maneuvers are not reversible. Here is an example. Let us prove that 3 = −3.

3 = −3          (This is a bad start)
3² = (−3)²      Square both sides
9 = 9
0 = 0           Subtract 9 from both sides

So because 0 = 0 is a true statement, does it follow that 3 = −3 is a true statement? Nope. Of course, we did not really expect a legitimate proof of 3 = −3, but this attempt should illustrate the dangers of this (incorrect) approach.

What you have just seen in the proof of Theorem VSPCV, and what you will see consistently throughout this text, is proofs of the following form. To prove that A = D we write

A = B    Theorem, Definition or Hypothesis justifying A = B
  = C    Theorem, Definition or Hypothesis justifying B = C
  = D    Theorem, Definition or Hypothesis justifying C = D

In your scratch work exploring possible approaches to proving a theorem you may massage a variety of expressions, sometimes making connections to various bits and pieces, while some parts get abandoned. Once you see a line of attack, rewrite your proof carefully mimicking this style.

Proof Technique DC
Decompositions

Much of your mathematical upbringing, especially once you began a study of algebra, revolved around simplifying expressions — combining like terms, obtaining common denominators so as to add fractions, factoring in order to solve polynomial equations. However, as often as not, we will do the opposite. Many theorems and techniques will revolve around taking some object and "decomposing" it into some combination of other objects, ostensibly in a more complicated fashion. When we say something can "be written as" something else, we mean that the one object can be decomposed into some combination of other objects. This may seem unnatural at first, but results of this type will give us insight into the structure of the original object by exposing its inner workings. An appropriate analogy might be stripping the wallboards away from the interior of a building to expose the structural members supporting the whole building.

Perhaps you have studied integral calculus, or a pre-calculus course, where you learned about partial fractions. This is a technique where a fraction of two polynomials is decomposed (written as, expressed as) a sum of simpler fractions. The purpose in calculus is to make finding an antiderivative simpler. For example, you can verify the truth of the expression

(12x⁵ + 2x⁴ − 20x³ + 66x² − 294x + 308) / (x⁶ + x⁵ − 3x⁴ + 21x³ − 52x² + 20x − 48)
    = (5x + 2)/(x² − x + 6) + (3x − 7)/(x² + 1) + 3/(x + 4) + 1/(x − 2)

In an early course in algebra, you might be expected to combine the four terms on the right over a common denominator to create the "simpler" expression on the left. Going the other way, the partial fraction technique would allow you to systematically decompose the fraction of polynomials on the left into the sum of the four (arguably) simpler fractions of polynomials on the right.
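
Computer algebra systems automate this decomposition. As an optional illustration (assuming the SymPy library is available; the variable names are ours), SymPy's apart function recovers the right-hand side from the left.

    from sympy import symbols, apart

    x = symbols('x')
    f = (12*x**5 + 2*x**4 - 20*x**3 + 66*x**2 - 294*x + 308) / \
        (x**6 + x**5 - 3*x**4 + 21*x**3 - 52*x**2 + 20*x - 48)

    # apart() produces the partial fraction decomposition (term order may vary):
    # (5*x + 2)/(x**2 - x + 6) + (3*x - 7)/(x**2 + 1) + 3/(x + 4) + 1/(x - 2)
    print(apart(f, x))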

This is a major shift in thinking, so come back here often, especially when we say "can be written as", or "can be expressed as," or "can be decomposed as."

Proof Technique I
Induction

"Induction" or "mathematical induction" is a framework for proving statements that are indexed by integers. In other words, suppose you have a statement to prove that is really multiple statements, one for n = 1, another for n = 2, a third for n = 3, and so on. If there is enough similarity between the statements, then you can use a script (the framework) to prove them all at once.

For example, consider the theorem: 1 + 2 + 3 + ··· + n = n(n + 1)/2 for n ≥ 1. This is shorthand for the many statements 1 = 1(1 + 1)/2, 1 + 2 = 2(2 + 1)/2, 1 + 2 + 3 = 3(3 + 1)/2, 1 + 2 + 3 + 4 = 4(4 + 1)/2, and so on. Forever. You can do the calculations in each of these statements and verify that all four are true. We might not be surprised to learn that the fifth statement is true as well (go ahead and check). However, do we think the theorem is true for n = 872? Or n = 1,234,529?

To see that these questions are not so ridiculous, consider the following example from Rotman's Journey into Mathematics. The statement "n² − n + 41 is prime" is true for integers 1 ≤ n ≤ 40 (check a few). However, when we check n = 41 we find 41² − 41 + 41 = 41², which is not prime.
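
A short computation confirms Rotman's example; here is an optional Python sketch (the helper is_prime is ours, a simple trial-division test).

    def is_prime(m):
        if m < 2:
            return False
        return all(m % d != 0 for d in range(2, int(m ** 0.5) + 1))

    print(all(is_prime(n * n - n + 41) for n in range(1, 41)))    # True
    print(is_prime(41 * 41 - 41 + 41))    # False: 41^2 - 41 + 41 = 41 * 41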

So how do we prove infinitely many statements all at once? More formally, let us denote our statements as P(n). Then, if we can prove the two assertions

1. P(1) is true.

2. If P(k) is true, then P(k + 1) is true.

then it follows that P(n) is true for all n ≥ 1. To understand this, I liken the process to climbing an infinitely long ladder with equally spaced rungs. Confronted with such a ladder, suppose I tell you that you are able to step up onto the first rung, and if you are on any particular rung, then you are capable of stepping up to the next rung. It follows that you can climb the ladder as far up as you wish. The first formal assertion above is akin to stepping onto the first rung, and the second formal assertion is akin to assuming that if you are on any one rung then you can always reach the next rung.

In practice, establishing that \(P(1)\) is true is called the "base case" and in
most cases is straightforward. Establishing that \(P(k) \Rightarrow P(k+1)\) is referred to
as the "induction step," or in this book (and elsewhere) we will typically refer to
the assumption of \(P(k)\) as the "induction hypothesis." This is perhaps the most
mysterious part of a proof by induction, since we are eventually trying to prove that
\(P(n)\) is true and it appears we do this by assuming what we are trying to prove
(when we assume \(P(k)\)). We are trying to prove the truth of \(P(n)\) (for all \(n\)), but in
the induction step we establish the truth of an implication, \(P(k) \Rightarrow P(k+1)\), an
"if-then" statement. Sometimes it is even worse, since as you get more comfortable
with induction, we often do not bother to use a different letter (\(k\)) for the index (\(n\))
in the induction step. Notice that the second formal assertion never says that \(P(k)\)
is true, it simply says that if \(P(k)\) were true, what might logically follow. We can
establish statements like "If I lived on the moon, then I could pole-vault over a bar
12 meters high." This may be a true statement, but it does not say we live on the
moon, and indeed we may never live there.

Enough generalities. Let us work an example and prove the theorem above about
sums of integers. Formally, our statement is \(P(n):\ 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}\).

Proof: Base Case. \(P(1)\) is the statement \(1 = \frac{1(1+1)}{2}\), which we see simplifies to
the true statement \(1 = 1\).

Induction Step: We will assume \(P(k)\) is true, and will try to prove \(P(k+1)\).
Given what we want to accomplish, it is natural to begin by examining the sum of
the first \(k + 1\) integers.
\begin{align*}
1 + 2 + 3 + \cdots + (k+1) &= (1 + 2 + 3 + \cdots + k) + (k+1)\\
&= \frac{k(k+1)}{2} + (k+1) && \text{Induction Hypothesis}\\
&= \frac{k^2 + k}{2} + \frac{2k + 2}{2}\\
&= \frac{k^2 + 3k + 2}{2}\\
&= \frac{(k+1)(k+2)}{2}\\
&= \frac{(k+1)((k+1)+1)}{2}
\end{align*}
We then recognize the two ends of this chain of equalities as \(P(k+1)\). So, by
mathematical induction, the theorem is true for all \(n\).
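A brief supplement of ours: a loop can test many instances of \(P(n)\) at once. This
illustrates the theorem, but unlike the induction argument it proves nothing about all \(n\).

    # Empirical check of P(n): 1 + 2 + ... + n == n(n+1)/2 for many n.
    for n in range(1, 1001):
        assert sum(range(1, n + 1)) == n * (n + 1) // 2
    print("P(n) holds for n = 1 .. 1000")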

How do you recognize when to use induction? The first clue is a statement that
is really many statements, one for each integer. The second clue would be that you
begin a more standard proof and you find yourself using words like "and so on" (as
above!) or lots of ellipses (dots) to establish patterns that you are convinced continue
on and on forever. However, there are many minor instances where induction might
be warranted but we do not bother.

Induction is important enough, and used often enough, that it appears in various
variations. The base case sometimes begins with \(n = 0\), or perhaps with an integer
greater than 1. Some formulate the induction step as \(P(k-1) \Rightarrow P(k)\). There
is also a "strong form" of induction where we assume all of \(P(1),\ P(2),\ P(3),\ \ldots,\ P(k)\)
as a hypothesis for showing the conclusion \(P(k+1)\).

You can find examples of induction in the proofs of Theorem GSP, Theorem DER,
Theorem DT, Theorem DIM, Theorem EOMP, Theorem DCP, and Theorem UTMR.

Proof Technique P
Practice

Here is a technique used by many practicing mathematicians when they are teaching
themselves new mathematics. As they read a textbook, monograph or research article,
they attempt to prove each new theorem themselves, before reading the proof. Often
the proofs can be very difficult, so it is wise not to spend too much time on each.
Maybe limit your losses and try each proof for 10 or 15 minutes. Even if the proof
is not found, it is time well-spent. You become more familiar with the definitions
involved, and the hypothesis and conclusion of the theorem. When you do work
through the proof, it might make more sense, and you will gain added insight about
just how to construct a proof.

Proof Technique LC
Lemmas and Corollaries

Theorems often go by different titles, two of the most popular being "lemma"
and "corollary." Before we describe the fine distinctions, be aware that lemmas,
corollaries, propositions, claims and facts are all just theorems. And every theorem
can be rephrased as an "if-then" statement, or perhaps a pair of "if-then" statements
expressed as an equivalence (Proof Technique E).

A lemma is a theorem that is not too interesting in its own right, but is important
for proving other theorems. It might be a generalization or abstraction of a key
step of several different proofs. For this reason you often hear the phrase "technical
lemma," though some might argue that the adjective "technical" is redundant.

A corollary is a theorem that follows very easily from another theorem. For this
reason, corollaries frequently do not have proofs. You are expected to easily and
quickly see how a previous theorem implies the corollary.

A proposition or fact is really just a codeword for a theorem. A claim might be
similar, but some authors like to use claims within a proof to organize key steps. In
a similar manner, some long proofs are organized as a series of lemmas.

In order to not confuse the novice, we have just called all our theorems theorems.
It is also an organizational convenience. With only theorems and definitions, the
theoretical backbone of the course is laid bare in the two lists of Definitions and
Theorems.


Archetypes

This section contains definitions and capsule summaries for each archetypical example.
Comprehensive and detailed analysis of each can be found in the online supplement.

Archetype A Linear system of three equations, three unknowns. Singular coefficient
matrix with dimension 1 null space. Integer eigenvalues and a degenerate eigenspace
for coefficient matrix.
\begin{align*}
x_1 - x_2 + 2x_3 &= 1\\
2x_1 + x_2 + x_3 &= 8\\
x_1 + x_2 &= 5
\end{align*}

Archetype B System with three equations, three unknowns. Nonsingular coefficient
matrix. Distinct integer eigenvalues for coefficient matrix.
\begin{align*}
-7x_1 - 6x_2 - 12x_3 &= -33\\
5x_1 + 5x_2 + 7x_3 &= 24\\
x_1 + 4x_3 &= 5
\end{align*}
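Because the coefficient matrix here is nonsingular, the system has a unique solution
(Theorem NMUS). As a supplement we add, NumPy can confirm this numerically; the
library and the floating-point check are our assumptions, not part of the archetype.

    import numpy as np

    # Coefficient matrix and vector of constants for Archetype B.
    A = np.array([[-7.0, -6.0, -12.0],
                  [ 5.0,  5.0,   7.0],
                  [ 1.0,  0.0,   4.0]])
    b = np.array([-33.0, 24.0, 5.0])
    print(np.linalg.solve(A, b))  # [-3.  5.  2.], the unique solution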

Archetype C System with three equations, four variables. Consistent. Null space
of coefficient matrix has dimension 1.
\begin{align*}
2x_1 - 3x_2 + x_3 - 6x_4 &= -7\\
4x_1 + x_2 + 2x_3 + 9x_4 &= -7\\
3x_1 + x_2 + x_3 + 8x_4 &= -8
\end{align*}

Archetype D System with three equations, four variables. Consistent. Null space
of coefficient matrix has dimension 2. Coefficient matrix identical to that of Archetype
E, vector of constants is different.
\begin{align*}
2x_1 + x_2 + 7x_3 - 7x_4 &= 8\\
-3x_1 + 4x_2 - 5x_3 - 6x_4 &= -12\\
x_1 + x_2 + 4x_3 - 5x_4 &= 4
\end{align*}

Archetype E System with three equations, four variables. Inconsistent. Null space
of coefficient matrix has dimension 2. Coefficient matrix identical to that of Archetype
D, constant vector is different.
\begin{align*}
2x_1 + x_2 + 7x_3 - 7x_4 &= 2\\
-3x_1 + 4x_2 - 5x_3 - 6x_4 &= 3\\
x_1 + x_2 + 4x_3 - 5x_4 &= 2
\end{align*}

Archetype F System with four equations, four variables. Nonsingular coefficient
matrix. Integer eigenvalues, one has "high" multiplicity.
\begin{align*}
33x_1 - 16x_2 + 10x_3 - 2x_4 &= -27\\
99x_1 - 47x_2 + 27x_3 - 7x_4 &= -77\\
78x_1 - 36x_2 + 17x_3 - 6x_4 &= -52\\
-9x_1 + 2x_2 + 3x_3 + 4x_4 &= 5
\end{align*}

Archetype G System with five equations, two variables. Consistent. Null space of
coefficient matrix has dimension 0. Coefficient matrix identical to that of Archetype
H, constant vector is different.
\begin{align*}
2x_1 + 3x_2 &= 6\\
-x_1 + 4x_2 &= -14\\
3x_1 + 10x_2 &= -2\\
3x_1 - x_2 &= 20\\
6x_1 + 9x_2 &= 18
\end{align*}

Archetype H System with five equations, two variables. Inconsistent, overdetermined.
Null space of coefficient matrix has dimension 0. Coefficient matrix identical
to that of Archetype G, constant vector is different.
\begin{align*}
2x_1 + 3x_2 &= 5\\
-x_1 + 4x_2 &= 6\\
3x_1 + 10x_2 &= 2\\
3x_1 - x_2 &= -1\\
6x_1 + 9x_2 &= 3
\end{align*}
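Inconsistency can be detected by comparing the ranks of the coefficient and augmented
matrices, in the spirit of Theorem RCLS. A quick numerical sketch of ours, assuming
NumPy is available:

    import numpy as np

    # Archetype H: the augmented matrix has larger rank than the
    # coefficient matrix, so the system is inconsistent.
    A = np.array([[ 2.0,  3.0],
                  [-1.0,  4.0],
                  [ 3.0, 10.0],
                  [ 3.0, -1.0],
                  [ 6.0,  9.0]])
    b = np.array([[5.0], [6.0], [2.0], [-1.0], [3.0]])
    print(np.linalg.matrix_rank(A))                  # 2
    print(np.linalg.matrix_rank(np.hstack([A, b])))  # 3, so no solution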

Archetype I System with four equations, seven variables. Consistent. Null space
of coefficient matrix has dimension 4.
\begin{align*}
x_1 + 4x_2 - x_4 + 7x_6 - 9x_7 &= 3\\
2x_1 + 8x_2 - x_3 + 3x_4 + 9x_5 - 13x_6 + 7x_7 &= 9\\
2x_3 - 3x_4 - 4x_5 + 12x_6 - 8x_7 &= 1\\
-x_1 - 4x_2 + 2x_3 + 4x_4 + 8x_5 - 31x_6 + 37x_7 &= 4
\end{align*}

Archetype J System with six equations, nine variables. Consistent. Null space of
coefficient matrix has dimension 5.
\begin{align*}
x_1 + 2x_2 - 2x_3 + 9x_4 + 3x_5 - 5x_6 - 2x_7 + x_8 + 27x_9 &= -5\\
2x_1 + 4x_2 + 3x_3 + 4x_4 - x_5 + 4x_6 + 10x_7 + 2x_8 - 23x_9 &= 18\\
x_1 + 2x_2 + x_3 + 3x_4 + x_5 + x_6 + 5x_7 + 2x_8 - 7x_9 &= 6\\
2x_1 + 4x_2 + 3x_3 + 4x_4 - 7x_5 + 2x_6 + 4x_7 - 11x_9 &= 20\\
x_1 + 2x_2 + 5x_4 + 2x_5 - 4x_6 + 3x_7 + 8x_8 + 13x_9 &= -4\\
-3x_1 - 6x_2 - x_3 - 13x_4 + 2x_5 - 5x_6 - 4x_7 + 13x_8 + 10x_9 &= -29
\end{align*}

Archetype K Square matrix of size 5. Nonsingular. 3 distinct eigenvalues, 2 of
multiplicity 2.
\[
\begin{bmatrix}
10 & 18 & 24 & 24 & -12\\
12 & -2 & -6 & 0 & -18\\
-30 & -21 & -23 & -30 & 39\\
27 & 30 & 36 & 37 & -30\\
18 & 24 & 30 & 30 & -20
\end{bmatrix}
\]

Archetype L Square matrix of size 5. Singular, nullity 2. 2 distinct eigenvalues,
each of "high" multiplicity.
\[
\begin{bmatrix}
-2 & -1 & -2 & -4 & 4\\
-6 & -5 & -4 & -4 & 6\\
10 & 7 & 7 & 10 & -13\\
-7 & -5 & -6 & -9 & 10\\
-4 & -3 & -4 & -6 & 6
\end{bmatrix}
\]
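As another added illustration (NumPy again being our assumption), the stated nullity
of Archetype L can be checked numerically through the rank of its matrix:

    import numpy as np

    # Archetype L is singular with nullity 2, so its rank should be 5 - 2 = 3.
    L = np.array([[-2, -1, -2, -4,  4],
                  [-6, -5, -4, -4,  6],
                  [10,  7,  7, 10, -13],
                  [-7, -5, -6, -9, 10],
                  [-4, -3, -4, -6,  6]], dtype=float)
    print(np.linalg.matrix_rank(L))  # 3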

Archetype M Linear transformation with bigger domain than codomain, so it is
guaranteed to not be injective. Happens to not be surjective.
\[
T\colon \mathbb{C}^5 \to \mathbb{C}^3,\quad
T\left(\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix}\right) =
\begin{bmatrix}
x_1 + 2x_2 + 3x_3 + 4x_4 + 4x_5\\
3x_1 + x_2 + 4x_3 - 3x_4 + 7x_5\\
x_1 - x_2 - 5x_4 + x_5
\end{bmatrix}
\]

Archetype N Linear transformation with domain larger than its codomain, so it
is guaranteed to not be injective. Happens to be onto.
\[
T\colon \mathbb{C}^5 \to \mathbb{C}^3,\quad
T\left(\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix}\right) =
\begin{bmatrix}
2x_1 + x_2 + 3x_3 - 4x_4 + 5x_5\\
x_1 - 2x_2 + 3x_3 - 9x_4 + 3x_5\\
3x_1 + 4x_3 - 6x_4 + 5x_5
\end{bmatrix}
\]


Archetype O Linear transformation with a domain smaller than the codomain,
so it is guaranteed to not be onto. Happens to not be one-to-one.
\[
T\colon \mathbb{C}^3 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}\right) =
\begin{bmatrix}
-x_1 + x_2 - 3x_3\\
-x_1 + 2x_2 - 4x_3\\
x_1 + x_2 + x_3\\
2x_1 + 3x_2 + x_3\\
x_1 + 2x_3
\end{bmatrix}
\]

Archetype P Linear transformation with a domain smaller than its codomain, so
it is guaranteed to not be surjective. Happens to be injective.
\[
T\colon \mathbb{C}^3 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix}\right) =
\begin{bmatrix}
-x_1 + x_2 + x_3\\
-x_1 + 2x_2 + 2x_3\\
x_1 + x_2 + 3x_3\\
2x_1 + 3x_2 + x_3\\
-2x_1 + x_2 + 3x_3
\end{bmatrix}
\]

Archetype Q Linear transformation with equal-sized domain and codomain, so
it has the potential to be invertible, but in this case is not. Neither injective nor
surjective. Diagonalizable, though.
\[
T\colon \mathbb{C}^5 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix}\right) =
\begin{bmatrix}
-2x_1 + 3x_2 + 3x_3 - 6x_4 + 3x_5\\
-16x_1 + 9x_2 + 12x_3 - 28x_4 + 28x_5\\
-19x_1 + 7x_2 + 14x_3 - 32x_4 + 37x_5\\
-21x_1 + 9x_2 + 15x_3 - 35x_4 + 39x_5\\
-9x_1 + 5x_2 + 7x_3 - 16x_4 + 16x_5
\end{bmatrix}
\]

Archetype R Linear transformation with equal-sized domain and codomain. Injective,
surjective, invertible, diagonalizable, the works.
\[
T\colon \mathbb{C}^5 \to \mathbb{C}^5,\quad
T\left(\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix}\right) =
\begin{bmatrix}
-65x_1 + 128x_2 + 10x_3 - 262x_4 + 40x_5\\
36x_1 - 73x_2 - x_3 + 151x_4 - 16x_5\\
-44x_1 + 88x_2 + 5x_3 - 180x_4 + 24x_5\\
34x_1 - 68x_2 - 3x_3 + 140x_4 - 18x_5\\
12x_1 - 24x_2 - x_3 + 49x_4 - 5x_5
\end{bmatrix}
\]

Archetype S Domain is column vectors, codomain is matrices. Domain is dimension
3 and codomain is dimension 4. Not injective, not surjective.
\[
T\colon \mathbb{C}^3 \to M_{22},\quad
T\left(\begin{bmatrix} a\\ b\\ c \end{bmatrix}\right) =
\begin{bmatrix}
a - b & 2a + 2b + c\\
3a + b + c & -2a - 6b - 2c
\end{bmatrix}
\]

Archetype T Domain and codomain are polynomials. Domain has dimension 5,
while codomain has dimension 6. Is injective, can not be surjective.
\[
T\colon P_4 \to P_5,\quad T(p(x)) = (x - 2)p(x)
\]

Archetype U Domain is matrices, codomain is column vectors. Domain has
dimension 6, while codomain has dimension 4. Cannot be injective, is surjective.
\[
T\colon M_{23} \to \mathbb{C}^4,\quad
T\left(\begin{bmatrix} a & b & c\\ d & e & f \end{bmatrix}\right) =
\begin{bmatrix}
a + 2b + 12c - 3d + e + 6f\\
2a - b - c + d - 11f\\
a + b + 7c + 2d + e - 3f\\
a + 2b + 12c + 5e - 5f
\end{bmatrix}
\]

Archetype V Domain is polynomials, codomain is matrices. Both domain and
codomain have dimension 4. Injective, surjective, invertible. Square matrix representation,
but domain and codomain are unequal, so no eigenvalue information.
\[
T\colon P_3 \to M_{22},\quad
T\left(a + bx + cx^2 + dx^3\right) =
\begin{bmatrix}
a + b & a - 2c\\
d & b - d
\end{bmatrix}
\]

Archetype W Domain is polynomials, codomain is polynomials. Domain and
codomain both have dimension 3. Injective, surjective, invertible, 3 distinct eigenvalues,
diagonalizable.
\[
T\colon P_2 \to P_2,\quad
T\left(a + bx + cx^2\right) = (19a + 6b - 4c) + (-24a - 7b + 4c)x + (36a + 12b - 9c)x^2
\]

Archetype X Domain and codomain are square matrices. Domain and codomain
both have dimension 4. Not injective, not surjective, not invertible, 3 distinct eigenvalues,
diagonalizable.
\[
T\colon M_{22} \to M_{22},\quad
T\left(\begin{bmatrix} a & b\\ c & d \end{bmatrix}\right) =
\begin{bmatrix}
-2a + 15b + 3c + 27d & 10b + 6c + 18d\\
a - 5b - 9d & -a - 4b - 5c - 8d
\end{bmatrix}
\]


Definitions

Section WILA What is Linear Algebra?

Section SSLE Solving Systems of Linear Equations
SLE System of Linear Equations . . . 9
SSLE Solution of a System of Linear Equations . . . 9
SSSLE Solution Set of a System of Linear Equations . . . 9
ESYS Equivalent Systems . . . 11
EO Equation Operations . . . 12

Section RREF Reduced Row-Echelon Form
M Matrix . . . 22
CV Column Vector . . . 23
ZCV Zero Column Vector . . . 23
CM Coefficient Matrix . . . 23
VOC Vector of Constants . . . 24
SOLV Solution Vector . . . 24
MRLS Matrix Representation of a Linear System . . . 24
AM Augmented Matrix . . . 25
RO Row Operations . . . 26
REM Row-Equivalent Matrices . . . 26
RREF Reduced Row-Echelon Form . . . 28

Section TSS Types of Solution Sets
CS Consistent System . . . 45
IDV Independent and Dependent Variables . . . 48

Section HSE Homogeneous Systems of Equations
HS Homogeneous System . . . 58
TSHSE Trivial Solution to Homogeneous Systems of Equations . . . 59
NSM Null Space of a Matrix . . . 61

Section NM Nonsingular Matrices
SQM Square Matrix . . . 66
NM Nonsingular Matrix . . . 66
IM Identity Matrix . . . 67

Section VO Vector Operations
VSCV Vector Space of Column Vectors . . . 74
CVE Column Vector Equality . . . 75
CVA Column Vector Addition . . . 76
CVSM Column Vector Scalar Multiplication . . . 77

Section LC Linear Combinations
LCCV Linear Combination of Column Vectors . . . 83

Section SS Spanning Sets
SSCV Span of a Set of Column Vectors . . . 107

Section LI Linear Independence
RLDCV Relation of Linear Dependence for Column Vectors . . . 122
LICV Linear Independence of Column Vectors . . . 122

Section LDS Linear Dependence and Spans

Section O Orthogonality
CCCV Complex Conjugate of a Column Vector . . . 148
IP Inner Product . . . 149
NV Norm of a Vector . . . 152
OV Orthogonal Vectors . . . 154
OSV Orthogonal Set of Vectors . . . 155
SUV Standard Unit Vectors . . . 155
ONS OrthoNormal Set . . . 159

Section MO Matrix Operations
VSM Vector Space of m × n Matrices . . . 162
ME Matrix Equality . . . 162
MA Matrix Addition . . . 163
MSM Matrix Scalar Multiplication . . . 163
ZM Zero Matrix . . . 165
TM Transpose of a Matrix . . . 166
SYM Symmetric Matrix . . . 166
CCM Complex Conjugate of a Matrix . . . 168
A Adjoint . . . 170

Section MM Matrix Multiplication
MVP Matrix-Vector Product . . . 175
MM Matrix Multiplication . . . 179
HM Hermitian Matrix . . . 188

Section MISLE Matrix Inverses and Systems of Linear Equations
MI Matrix Inverse . . . 194

Section MINM Matrix Inverses and Nonsingular Matrices
UM Unitary Matrices . . . 211

Section CRS Column and Row Spaces
CSM Column Space of a Matrix . . . 217
RSM Row Space of a Matrix . . . 225

Section FS Four Subsets
LNS Left Null Space . . . 236
EEF Extended Echelon Form . . . 241

Section VS Vector Spaces
VS Vector Space . . . 257

Section S Subspaces
S Subspace . . . 272
TS Trivial Subspaces . . . 277
LC Linear Combination . . . 279
SS Span of a Set . . . 279

Section LISS Linear Independence and Spanning Sets
RLD Relation of Linear Dependence . . . 288
LI Linear Independence . . . 288
SSVS Spanning Set of a Vector Space . . . 294

Section B Bases
B Basis . . . 303

Section D Dimension
D Dimension . . . 318
NOM Nullity Of a Matrix . . . 325
ROM Rank Of a Matrix . . . 325

Section PD Properties of Dimension

Section DM Determinant of a Matrix
ELEM Elementary Matrices . . . 340
SM SubMatrix . . . 346
DM Determinant of a Matrix . . . 346

Section PDM Properties of Determinants of Matrices

Section EE Eigenvalues and Eigenvectors
EEM Eigenvalues and Eigenvectors of a Matrix . . . 367
CP Characteristic Polynomial . . . 375
EM Eigenspace of a Matrix . . . 377
AME Algebraic Multiplicity of an Eigenvalue . . . 379
GME Geometric Multiplicity of an Eigenvalue . . . 380

Section PEE Properties of Eigenvalues and Eigenvectors

Section SD Similarity and Diagonalization
SIM Similar Matrices . . . 403
DIM Diagonal Matrix . . . 407
DZM Diagonalizable Matrix . . . 407

Section LT Linear Transformations
LT Linear Transformation . . . 420
PI Pre-Image . . . 436
LTA Linear Transformation Addition . . . 438
LTSM Linear Transformation Scalar Multiplication . . . 439
LTC Linear Transformation Composition . . . 441

Section ILT Injective Linear Transformations
ILT Injective Linear Transformation . . . 446
KLT Kernel of a Linear Transformation . . . 451

Section SLT Surjective Linear Transformations
SLT Surjective Linear Transformation . . . 462
RLT Range of a Linear Transformation . . . 468

Section IVLT Invertible Linear Transformations
IDLT Identity Linear Transformation . . . 480
IVLT Invertible Linear Transformations . . . 480
IVS Isomorphic Vector Spaces . . . 490
ROLT Rank Of a Linear Transformation . . . 492
NOLT Nullity Of a Linear Transformation . . . 492

Section VR Vector Representations
VR Vector Representation . . . 501

Section MR Matrix Representations
MR Matrix Representation . . . 514

Section CB Change of Basis
EELT Eigenvalue and Eigenvector of a Linear Transformation . . . 542
CBM Change-of-Basis Matrix . . . 543

Section OD Orthonormal Diagonalization
UTM Upper Triangular Matrix . . . 568
LTM Lower Triangular Matrix . . . 568
NRML Normal Matrix . . . 575

Section CNO Complex Number Operations
CNE Complex Number Equality . . . 582
CNA Complex Number Addition . . . 582
CNM Complex Number Multiplication . . . 582
CCN Conjugate of a Complex Number . . . 584
MCN Modulus of a Complex Number . . . 585

Section SET Sets
SET Set . . . 587
SSET Subset . . . 587
ES Empty Set . . . 587
SE Set Equality . . . 588
C Cardinality . . . 589
SU Set Union . . . 589
SI Set Intersection . . . 589
SC Set Complement . . . 590


Theorems

Section WILA What is Linear Algebra?

Section SSLE Solving Systems of Linear Equations
EOPSS Equation Operations Preserve Solution Sets . . . 12

Section RREF Reduced Row-Echelon Form
REMES Row-Equivalent Matrices represent Equivalent Systems . . . 27
REMEF Row-Equivalent Matrix in Echelon Form . . . 29
RREFU Reduced Row-Echelon Form is Unique . . . 32

Section TSS Types of Solution Sets
RCLS Recognizing Consistency of a Linear System . . . 49
CSRN Consistent Systems, r and n . . . 50
FVCS Free Variables for Consistent Systems . . . 51
PSSLS Possible Solution Sets for Linear Systems . . . 52
CMVEI Consistent, More Variables than Equations, Infinite solutions . . . 53

Section HSE Homogeneous Systems of Equations
HSC Homogeneous Systems are Consistent . . . 58
HMVEI Homogeneous, More Variables than Equations, Infinite solutions . . . 60

Section NM Nonsingular Matrices
NMRRI Nonsingular Matrices Row Reduce to the Identity matrix . . . 68
NMTNS Nonsingular Matrices have Trivial Null Spaces . . . 69
NMUS Nonsingular Matrices and Unique Solutions . . . 69
NME1 Nonsingular Matrix Equivalences, Round 1 . . . 70

Section VO Vector Operations
VSPCV Vector Space Properties of Column Vectors . . . 78

Section LC Linear Combinations
SLSLC Solutions to Linear Systems are Linear Combinations . . . 87
VFSLS Vector Form of Solutions to Linear Systems . . . 94
PSPHS Particular Solution Plus Homogeneous Solutions . . . 101

Section SS Spanning Sets
SSNS Spanning Sets for Null Spaces . . . 114

Section LI Linear Independence
LIVHS Linearly Independent Vectors and Homogeneous Systems . . . 124
LIVRN Linearly Independent Vectors, r and n . . . 126
MVSLD More Vectors than Size implies Linear Dependence . . . 127
NMLIC Nonsingular Matrices have Linearly Independent Columns . . . 128
NME2 Nonsingular Matrix Equivalences, Round 2 . . . 129
BNS Basis for Null Spaces . . . 131

Section LDS Linear Dependence and Spans
DLDS Dependency in Linearly Dependent Sets . . . 136
BS Basis of a Span . . . 142

Section O Orthogonality
CRVA Conjugation Respects Vector Addition . . . 148
CRSM Conjugation Respects Vector Scalar Multiplication . . . 149
IPVA Inner Product and Vector Addition . . . 150
IPSM Inner Product and Scalar Multiplication . . . 151
IPAC Inner Product is Anti-Commutative . . . 151
IPN Inner Products and Norms . . . 153
PIP Positive Inner Products . . . 153
OSLI Orthogonal Sets are Linearly Independent . . . 156
GSP Gram-Schmidt Procedure . . . 157

Section MO Matrix Operations
VSPM Vector Space Properties of Matrices . . . 164
SMS Symmetric Matrices are Square . . . 167
TMA Transpose and Matrix Addition . . . 167
TMSM Transpose and Matrix Scalar Multiplication . . . 167
TT Transpose of a Transpose . . . 168
CRMA Conjugation Respects Matrix Addition . . . 169
CRMSM Conjugation Respects Matrix Scalar Multiplication . . . 169
CCM Conjugate of the Conjugate of a Matrix . . . 169
MCT Matrix Conjugation and Transposes . . . 170
AMA Adjoint and Matrix Addition . . . 170
AMSM Adjoint and Matrix Scalar Multiplication . . . 171
AA Adjoint of an Adjoint . . . 171

Section MM Matrix Multiplication
SLEMM Systems of Linear Equations as Matrix Multiplication . . . 176
EMMVP Equal Matrices and Matrix-Vector Products . . . 178
EMP Entries of Matrix Products . . . 180
MMZM Matrix Multiplication and the Zero Matrix . . . 182
MMIM Matrix Multiplication and Identity Matrix . . . 182
MMDAA Matrix Multiplication Distributes Across Addition . . . 183
MMSMM Matrix Multiplication and Scalar Matrix Multiplication . . . 184
MMA Matrix Multiplication is Associative . . . 184
MMIP Matrix Multiplication and Inner Products . . . 185
MMCC Matrix Multiplication and Complex Conjugation . . . 186
MMT Matrix Multiplication and Transposes . . . 186
MMAD Matrix Multiplication and Adjoints . . . 187
AIP Adjoint and Inner Product . . . 188
HMIP Hermitian Matrices and Inner Products . . . 189

Section MISLE Matrix Inverses and Systems of Linear Equations
TTMI Two-by-Two Matrix Inverse . . . 196
CINM Computing the Inverse of a Nonsingular Matrix . . . 200
MIU Matrix Inverse is Unique . . . 202
SS Socks and Shoes . . . 202
MIMI Matrix Inverse of a Matrix Inverse . . . 203
MIT Matrix Inverse of a Transpose . . . 203
MISM Matrix Inverse of a Scalar Multiple . . . 204

Section MINM Matrix Inverses and Nonsingular Matrices
NPNT Nonsingular Product has Nonsingular Terms . . . 207
OSIS One-Sided Inverse is Sufficient . . . 209
NI Nonsingularity is Invertibility . . . 209
NME3 Nonsingular Matrix Equivalences, Round 3 . . . 210
SNCM Solution with Nonsingular Coefficient Matrix . . . 210
UMI Unitary Matrices are Invertible . . . 212
CUMOS Columns of Unitary Matrices are Orthonormal Sets . . . 212
UMPIP Unitary Matrices Preserve Inner Products . . . 213

Section CRS Column and Row Spaces
CSCS Column Spaces and Consistent Systems . . . 218
BCS Basis of the Column Space . . . 221
CSNM Column Space of a Nonsingular Matrix . . . 224
NME4 Nonsingular Matrix Equivalences, Round 4 . . . 224
REMRS Row-Equivalent Matrices have equal Row Spaces . . . 227
BRS Basis for the Row Space . . . 228
CSRST Column Space, Row Space, Transpose . . . 230

Section FS Four Subsets
PEEF Properties of Extended Echelon Form . . . 242
FS Four Subsets . . . 244

Section VS Vector Spaces
ZVU Zero Vector is Unique . . . 266
AIU Additive Inverses are Unique . . . 266
ZSSM Zero Scalar in Scalar Multiplication . . . 266
ZVSM Zero Vector in Scalar Multiplication . . . 267
AISM Additive Inverses from Scalar Multiplication . . . 267
SMEZV Scalar Multiplication Equals the Zero Vector . . . 268

Section S Subspaces
TSS Testing Subsets for Subspaces . . . 274
NSMS Null Space of a Matrix is a Subspace . . . 277
SSS Span of a Set is a Subspace . . . 280
CSMS Column Space of a Matrix is a Subspace . . . 285
RSMS Row Space of a Matrix is a Subspace . . . 285
LNSMS Left Null Space of a Matrix is a Subspace . . . 285

Section LISS Linear Independence and Spanning Sets
VRRB Vector Representation Relative to a Basis . . . 299

Section B Bases
SUVB Standard Unit Vectors are a Basis . . . 304
CNMB Columns of Nonsingular Matrix are a Basis . . . 309
NME5 Nonsingular Matrix Equivalences, Round 5 . . . 310
COB Coordinates and Orthonormal Bases . . . 311
UMCOB Unitary Matrices Convert Orthonormal Bases . . . 314

Section D Dimension
SSLD Spanning Sets and Linear Dependence . . . 318
BIS Bases have Identical Sizes . . . 322
DCM Dimension of C^m . . . 323
DP Dimension of P_n . . . 323
DM Dimension of M_mn . . . 323
CRN Computing Rank and Nullity . . . 326
RPNC Rank Plus Nullity is Columns . . . 326
RNNM Rank and Nullity of a Nonsingular Matrix . . . 327
NME6 Nonsingular Matrix Equivalences, Round 6 . . . 328

Section PD Properties of Dimension
ELIS Extending Linearly Independent Sets . . . 331
G Goldilocks . . . 332
PSSD Proper Subspaces have Smaller Dimension . . . 335
EDYES Equal Dimensions Yields Equal Subspaces . . . 335
RMRT Rank of a Matrix is the Rank of the Transpose . . . 335
DFS Dimensions of Four Subspaces . . . 337

Section DM Determinant of a Matrix
EMDRO Elementary Matrices Do Row Operations . . . 342
EMN Elementary Matrices are Nonsingular . . . 345
NMPEM Nonsingular Matrices are Products of Elementary Matrices . . . 345
DMST Determinant of Matrices of Size Two . . . 347
DER Determinant Expansion about Rows . . . 348
DT Determinant of the Transpose . . . 349
DEC Determinant Expansion about Columns . . . 350

Section PDM Properties of Determinants of Matrices
DZRC Determinant with Zero Row or Column . . . 355
DRCS Determinant for Row or Column Swap . . . 355
DRCM Determinant for Row or Column Multiples . . . 356
DERC Determinant with Equal Rows or Columns . . . 357
DRCMA Determinant for Row or Column Multiples and Addition . . . 358
DIM Determinant of the Identity Matrix . . . 361
DEM Determinants of Elementary Matrices . . . 361
DEMMM Determinants, Elementary Matrices, Matrix Multiplication . . . 362
SMZD Singular Matrices have Zero Determinants . . . 363
NME7 Nonsingular Matrix Equivalences, Round 7 . . . 364
DRMM Determinant Respects Matrix Multiplication . . . 365

Section EE Eigenvalues and Eigenvectors
EMHE Every Matrix Has an Eigenvalue . . . 372
EMRCP Eigenvalues of a Matrix are Roots of Characteristic Polynomials . . . 376
EMS Eigenspace for a Matrix is a Subspace . . . 377
EMNS Eigenspace of a Matrix is a Null Space . . . 378

Section PEE Properties of Eigenvalues and Eigenvectors
EDELI Eigenvectors with Distinct Eigenvalues are Linearly Independent . . . 390
SMZE Singular Matrices have Zero Eigenvalues . . . 391
NME8 Nonsingular Matrix Equivalences, Round 8 . . . 391
ESMM Eigenvalues of a Scalar Multiple of a Matrix . . . 392
EOMP Eigenvalues Of Matrix Powers . . . 392
EPM Eigenvalues of the Polynomial of a Matrix . . . 393
EIM Eigenvalues of the Inverse of a Matrix . . . 394
ETM Eigenvalues of the Transpose of a Matrix . . . 395
ERMCP Eigenvalues of Real Matrices come in Conjugate Pairs . . . 396
DCP Degree of the Characteristic Polynomial . . . 396
NEM Number of Eigenvalues of a Matrix . . . 397
ME Multiplicities of an Eigenvalue . . . 398
MNEM Maximum Number of Eigenvalues of a Matrix . . . 400
HMRE Hermitian Matrices have Real Eigenvalues . . . 400
HMOE Hermitian Matrices have Orthogonal Eigenvectors . . . 401

Section SD Similarity and Diagonalization
SER Similarity is an Equivalence Relation . . . 405
SMEE Similar Matrices have Equal Eigenvalues . . . 406
DC Diagonalization Characterization . . . 408
DMFE Diagonalizable Matrices have Full Eigenspaces . . . 410
DED Distinct Eigenvalues implies Diagonalizable . . . 413

Section LT Linear Transformations
LTTZZ Linear Transformations Take Zero to Zero . . . 425
MBLT Matrices Build Linear Transformations . . . 428
MLTCV Matrix of a Linear Transformation, Column Vectors . . . 430
LTLC Linear Transformations and Linear Combinations . . . 432
LTDB Linear Transformation Defined on a Basis . . . 432
SLTLT Sum of Linear Transformations is a Linear Transformation . . . 439
MLTLT Multiple of a Linear Transformation is a Linear Transformation . . . 440
VSLT Vector Space of Linear Transformations . . . 441
CLTLT Composition of Linear Transformations is a Linear Transformation . . . 441

Section ILT Injective Linear Transformations
KLTS Kernel of a Linear Transformation is a Subspace . . . 452
KPI Kernel and Pre-Image . . . 453
KILT Kernel of an Injective Linear Transformation . . . 455
ILTLI Injective Linear Transformations and Linear Independence . . . 457
ILTB Injective Linear Transformations and Bases . . . 457
ILTD Injective Linear Transformations and Dimension . . . 458
CILTI Composition of Injective Linear Transformations is Injective . . . 459

Section SLT Surjective Linear Transformations
RLTS Range of a Linear Transformation is a Subspace . . . 469
RSLT Range of a Surjective Linear Transformation . . . 471
SSRLT Spanning Set for Range of a Linear Transformation . . . 473
RPI Range and Pre-Image . . . 475
SLTB Surjective Linear Transformations and Bases . . . 475
SLTD Surjective Linear Transformations and Dimension . . . 476
CSLTS Composition of Surjective Linear Transformations is Surjective . . . 476

Section IVLT Invertible Linear Transformations
ILTLT Inverse of a Linear Transformation is a Linear Transformation . . . 483
IILT Inverse of an Invertible Linear Transformation . . . 484
ILTIS Invertible Linear Transformations are Injective and Surjective . . . 484
CIVLT Composition of Invertible Linear Transformations . . . 488
ICLT Inverse of a Composition of Linear Transformations . . . 488
IVSED Isomorphic Vector Spaces have Equal Dimension . . . 492
ROSLT Rank Of a Surjective Linear Transformation . . . 493
NOILT Nullity Of an Injective Linear Transformation . . . 493
RPNDD Rank Plus Nullity is Domain Dimension . . . 493

Section VR Vector Representations
VRLT Vector Representation is a Linear Transformation . . . 502
VRI Vector Representation is Injective . . . 506
VRS Vector Representation is Surjective . . . 507
VRILT Vector Representation is an Invertible Linear Transformation . . . 507
CFDVS Characterization of Finite Dimensional Vector Spaces . . . 508
IFDVS Isomorphism of Finite Dimensional Vector Spaces . . . 508
CLI Coordinatization and Linear Independence . . . 509
CSS Coordinatization and Spanning Sets . . . 509

Section MR Matrix Representations
FTMR Fundamental Theorem of Matrix Representation . . . 517
MRSLT Matrix Representation of a Sum of Linear Transformations . . . 522
MRMLT Matrix Representation of a Multiple of a Linear Transformation . . . 522
MRCLT Matrix Representation of a Composition of Linear Transformations . . . 523
KNSI Kernel and Null Space Isomorphism . . . 527
RCSI Range and Column Space Isomorphism . . . 530
IMR Invertible Matrix Representations . . . 534
IMILT Invertible Matrices, Invertible Linear Transformation . . . 537
NME9 Nonsingular Matrix Equivalences, Round 9 . . . 537

Section CB Change of Basis
CB Change-of-Basis . . . 544
ICBM Inverse of Change-of-Basis Matrix . . . 544
MRCB Matrix Representation and Change of Basis . . . 550
SCB Similarity and Change of Basis . . . 553
EER Eigenvalues, Eigenvectors, Representations . . . 556

Section OD Orthonormal Diagonalization
PTMT Product of Triangular Matrices is Triangular . . . 568
ITMT Inverse of a Triangular Matrix is Triangular . . . 569
UTMR Upper Triangular Matrix Representation . . . 570
OBUTR Orthonormal Basis for Upper Triangular Representation . . . 573
OD Orthonormal Diagonalization . . . 575
OBNM Orthonormal Bases and Normal Matrices . . . 578

Section CNO Complex Number Operations
PCNA Properties of Complex Number Arithmetic . . . 582
ZPCN Zero Product, Complex Numbers . . . 583
ZPZT Zero Product, Zero Terms . . . 584
CCRA Complex Conjugation Respects Addition . . . 585
CCRM Complex Conjugation Respects Multiplication . . . 585
CCT Complex Conjugation Twice . . . 585

Section SET Sets


Notation<br />

A Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />

[A] ij<br />

Matrix Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />

v Column Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23<br />

[v] i<br />

Column Vector Entries . . . . . . . . . . . . . . . . . . . . . . . 23<br />

0 Zero Column Vector . . . . . . . . . . . . . . . . . . . . . . . . . 23<br />

LS(A, b) Matrix Representation of a L<strong>in</strong>ear System . . . . . . . . . . . . 24<br />

[A | b] Augmented Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 25<br />

R i ↔ R j Row Operation, Swap . . . . . . . . . . . . . . . . . . . . . . . . 26<br />

αR i Row Operation, Multiply . . . . . . . . . . . . . . . . . . . . . . 26<br />

αR i + R j Row Operation, Add . . . . . . . . . . . . . . . . . . . . . . . . 26<br />

r, D, F Reduced Row-Echelon Form Analysis . . . . . . . . . . . . . . . 28<br />

N (A) Null Space of a Matrix . . . . . . . . . . . . . . . . . . . . . . . 61<br />

I m Identity Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 67<br />

C m Vector Space of Column Vectors . . . . . . . . . . . . . . . . . . 74<br />

u = v Column Vector Equality . . . . . . . . . . . . . . . . . . . . . . 75<br />

u + v Column Vector Addition . . . . . . . . . . . . . . . . . . . . . . 76<br />

αu Column Vector Scalar Multiplication . . . . . . . . . . . . . . . 77<br />

〈S〉 Span of a Set of Vectors . . . . . . . . . . . . . . . . . . . . . . 107<br />

u Complex Conjugate of a Column Vector . . . . . . . . . . . . . 148<br />

〈u, v〉 Inner Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149<br />

‖v‖ Norm of a Vector . . . . . . . . . . . . . . . . . . . . . . . . . . 152<br />

e i Standard Unit Vectors . . . . . . . . . . . . . . . . . . . . . . . 155<br />

M mn Vector Space of Matrices . . . . . . . . . . . . . . . . . . . . . . 162<br />

A = B Matrix Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . 162<br />

A + B Matrix Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . 163<br />

αA Matrix Scalar Multiplication . . . . . . . . . . . . . . . . . . . . 163<br />

O Zero Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165<br />

A t Transpose of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . 166<br />

A Complex Conjugate of a Matrix . . . . . . . . . . . . . . . . . . 168<br />

A ∗ Adjo<strong>in</strong>t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170<br />

Au Matrix-Vector Product . . . . . . . . . . . . . . . . . . . . . . . 175<br />

AB Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . 179<br />

A −1 Matrix Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194<br />

C(A) Column Space of a Matrix . . . . . . . . . . . . . . . . . . . . . 217<br />

R(A) Row Space of a Matrix . . . . . . . . . . . . . . . . . . . . . . . 225<br />

L(A) Left Null Space . . . . . . . . . . . . . . . . . . . . . . . . . . . 236<br />

dim (V ) Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318<br />

n (A) Nullity of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 325<br />

r (A) Rank of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 325<br />

623


Notation Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 624<br />

E i,j Elementary Matrix, Swap . . . . . . . . . . . . . . . . . . . . . . 340<br />

E i (α) Elementary Matrix, Multiply . . . . . . . . . . . . . . . . . . . . 340<br />

E i,j (α) Elementary Matrix, Add . . . . . . . . . . . . . . . . . . . . . . 340<br />

A (i|j) SubMatrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346<br />

|A| Determ<strong>in</strong>ant of a Matrix, Bars . . . . . . . . . . . . . . . . . . . 346<br />

det (A) Determ<strong>in</strong>ant of a Matrix, Functional . . . . . . . . . . . . . . . 346<br />

α A (λ) <strong>Algebra</strong>ic Multiplicity of an Eigenvalue . . . . . . . . . . . . . . 379<br />

γ A (λ) Geometric Multiplicity of an Eigenvalue . . . . . . . . . . . . . . 380<br />

T : U → V L<strong>in</strong>ear Transformation . . . . . . . . . . . . . . . . . . . . . . . 420<br />

K(T ) Kernel of a L<strong>in</strong>ear Transformation . . . . . . . . . . . . . . . . . 451<br />

R(T ) Range of a L<strong>in</strong>ear Transformation . . . . . . . . . . . . . . . . . 468<br />

r (T ) Rank of a L<strong>in</strong>ear Transformation . . . . . . . . . . . . . . . . . 492<br />

n (T ) Nullity of a L<strong>in</strong>ear Transformation . . . . . . . . . . . . . . . . . 492<br />

ρ B (w) Vector Representation . . . . . . . . . . . . . . . . . . . . . . . . 501<br />

MB,C T Matrix Representation . . . . . . . . . . . . . . . . . . . . . . . 514<br />

α = β Complex Number Equality . . . . . . . . . . . . . . . . . . . . . 582<br />

α + β Complex Number Addition . . . . . . . . . . . . . . . . . . . . . 582<br />

αβ Complex Number Multiplication . . . . . . . . . . . . . . . . . . 582<br />

α Conjugate of a Complex Number . . . . . . . . . . . . . . . . . 584<br />

x ∈ S Set Membership . . . . . . . . . . . . . . . . . . . . . . . . . . . 587<br />

S ⊆ T Subset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587<br />

∅ Empty Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587<br />

S = T Set Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588<br />

|S| Card<strong>in</strong>ality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589<br />

S ∪ T Set Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589<br />

S ∩ T Set Intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . 589<br />

S Set Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
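
For readers who wish to typeset some of these symbols themselves, the following is a minimal, compilable LaTeX sketch of a few representative entries from the table above. The macro choices here (for example \mathbb{C} for the vector space of column vectors and \mathcal{N} for the null space) are plausible reconstructions, not necessarily the macros used in the book's own source.

    % Sketch only: assumes amsmath and amssymb; the book's source may differ.
    \documentclass{article}
    \usepackage{amsmath,amssymb}
    \begin{document}
    $[A]_{ij}$, $\mathbb{C}^{m}$, $\mathcal{N}(A)$, $\mathcal{C}(A)$,
    $\langle \mathbf{u},\,\mathbf{v}\rangle$, $\lVert \mathbf{v}\rVert$,
    $\overline{\mathbf{u}}$, $A^{t}$, $A^{\ast}$, $A^{-1}$,
    $\alpha_{A}(\lambda)$, $\gamma_{A}(\lambda)$, $\rho_{B}(\mathbf{w})$,
    $M^{T}_{B,C}$, $T\colon U\to V$
    \end{document}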


GNU Free Documentation License

Version 1.3, 3 November 2008

Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc. <http://fsf.org/>

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional and useful document “free” in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A “Modified Version” of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.


The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not “Transparent” is called “Opaque”.

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, “Title Page” means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.

The “publisher” means any person or entity that distributes copies of the Document to the public.

A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the Document means that it remains a section “Entitled XYZ” according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.


2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:


A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled “History” in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the “History” section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.


M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Modified Version.

N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.

You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements”.


6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, or distribute it is void, and will automatically terminate your rights under this License.


However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, receipt of a copy of some or all of the same material does not give you any rights to use it.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. If the Document specifies that a proxy can decide which future versions of this License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Document.

11. RELICENSING

“Massive Multiauthor Collaboration Site” (or “MMC Site”) means any World Wide Web server that publishes copyrightable works and also provides prominent facilities for anybody to edit those works. A public wiki that anybody can edit is an example of such a server. A “Massive Multiauthor Collaboration” (or “MMC”) contained in the site means any set of copyrightable works thus published on the MMC site.

“CC-BY-SA” means the Creative Commons Attribution-Share Alike 3.0 license published by Creative Commons Corporation, a not-for-profit corporation with a principal place of business in San Francisco, California, as well as future copyleft versions of that license published by that same organization.

“Incorporate” means to publish or republish a Document, in whole or in part, as part of another Document.

An MMC is “eligible for relicensing” if it is licensed under this License, and if all works that were first published under this License somewhere other than this MMC, and subsequently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant sections, and (2) were thus incorporated prior to November 1, 2008.


The operator of an MMC Site may republish an MMC contained in the site under CC-BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible for relicensing.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

    Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute
    and/or modify this document under the terms of the GNU Free Documentation
    License, Version 1.3 or any later version published by the Free Software
    Foundation; with no Invariant Sections, no Front-Cover Texts, and no
    Back-Cover Texts. A copy of the license is included in the section
    entitled “GNU Free Documentation License”.

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with . . . Texts.” line with this:

    with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover
    Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.
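
As a concrete illustration of the Addendum, here is a minimal LaTeX sketch that places the recommended notice just after the title page. The document class and the placeholders (YOUR TITLE, YOUR NAME, YEAR) are assumptions for illustration; the notice text itself is copied verbatim from above.

    % Sketch only: front matter of a hypothetical GFDL-licensed document.
    \documentclass{book}
    \begin{document}
    \title{YOUR TITLE}
    \author{YOUR NAME}
    \maketitle
    % The license notice, placed just after the title page as the
    % Addendum recommends (placeholders YEAR and YOUR NAME to be filled in).
    \noindent Copyright \copyright\ YEAR YOUR NAME. Permission is granted to
    copy, distribute and/or modify this document under the terms of the GNU
    Free Documentation License, Version 1.3 or any later version published
    by the Free Software Foundation; with no Invariant Sections, no
    Front-Cover Texts, and no Back-Cover Texts. A copy of the license is
    included in the section entitled ``GNU Free Documentation License''.
    \end{document}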
