A First Course in Linear Algebra, 2015a
A First Course in Linear Algebra
Robert A. Beezer
University of Puget Sound
Version 3.50
Congruent Press
Robert A. Beezer is a Professor of Mathematics at the University of Puget Sound, where he has been on the faculty since 1984. He received a B.S. in Mathematics (with an Emphasis in Computer Science) from the University of Santa Clara in 1978, an M.S. in Statistics from the University of Illinois at Urbana-Champaign in 1982, and a Ph.D. in Mathematics from the University of Illinois at Urbana-Champaign in 1984.

In addition to his teaching at the University of Puget Sound, he has made sabbatical visits to the University of the West Indies (Trinidad campus) and the University of Western Australia. He has also given several courses in the Master’s program at the African Institute for Mathematical Sciences, South Africa. He has been a Sage developer since 2008.

He teaches calculus, linear algebra and abstract algebra regularly, while his research interests include the applications of linear algebra to graph theory. His professional website is at http://buzzard.ups.edu.
Edition
Version 3.50
ISBN: 978-0-9844175-5-1

Cover Design
Aidan Meacham

Publisher
Robert A. Beezer
Congruent Press
Gig Harbor, Washington, USA

© 2004–2015
Robert A. Beezer
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the appendix entitled “GNU Free Documentation License”.

The most recent version can always be found at http://linear.pugetsound.edu.
To my wife, Pat.
Contents

Preface
Acknowledgements

Systems of Linear Equations
  What is Linear Algebra?
  Solving Systems of Linear Equations
  Reduced Row-Echelon Form
  Types of Solution Sets
  Homogeneous Systems of Equations
  Nonsingular Matrices

Vectors
  Vector Operations
  Linear Combinations
  Spanning Sets
  Linear Independence
  Linear Dependence and Spans
  Orthogonality

Matrices
  Matrix Operations
  Matrix Multiplication
  Matrix Inverses and Systems of Linear Equations
  Matrix Inverses and Nonsingular Matrices
  Column and Row Spaces
  Four Subsets

Vector Spaces
  Vector Spaces
  Subspaces
  Linear Independence and Spanning Sets
  Bases
  Dimension
  Properties of Dimension

Determinants
  Determinant of a Matrix
  Properties of Determinants of Matrices

Eigenvalues
  Eigenvalues and Eigenvectors
  Properties of Eigenvalues and Eigenvectors
  Similarity and Diagonalization

Linear Transformations
  Linear Transformations
  Injective Linear Transformations
  Surjective Linear Transformations
  Invertible Linear Transformations

Representations
  Vector Representations
  Matrix Representations
  Change of Basis
  Orthonormal Diagonalization

Preliminaries
  Complex Number Operations
  Sets

Reference
  Proof Techniques
  Archetypes
  Definitions
  Theorems
  Notation
  GNU Free Documentation License
Preface

This text is designed to teach the concepts and techniques of basic linear algebra as a rigorous mathematical subject. Besides computational proficiency, there is an emphasis on understanding definitions and theorems, as well as reading, understanding and creating proofs. A strictly logical organization, complete and exceedingly detailed proofs of every theorem, advice on techniques for reading and writing proofs, and a selection of challenging theoretical exercises will slowly provide the novice with the tools and confidence to be able to study other mathematical topics in a rigorous fashion.
Most students taking a course in linear algebra will have completed courses in differential and integral calculus, and maybe also multivariate calculus, and will typically be second-year students in university. This level of mathematical maturity is expected; however, there is little or no requirement to know calculus itself to use this book successfully. With complete details for every proof, for nearly every example, and for solutions to a majority of the exercises, the book is ideal for self-study, for those of any age.
While there is an abundance of guidance in the use of the software system, Sage, there is no attempt to address the problems of numerical linear algebra, which are arguably continuous in nature. Similarly, there is little emphasis on a geometric approach to problems of linear algebra. While this may contradict the experience of many experienced mathematicians, the approach here is consciously algebraic. As a result, the student should be well-prepared to encounter groups, rings and fields in future courses in algebra, or other areas of discrete mathematics.
How to Use This Book

While the book is divided into chapters, the main organizational unit is the thirty-seven sections. Each contains a selection of definitions, theorems, and examples interspersed with commentary. If you are enrolled in a course, read the section before class and then answer the section’s reading questions as preparation for class.

The version available for viewing in a web browser is the most complete, integrating all of the components of the book. Consider acquainting yourself with this version.
vi
Knowls are indicated by dashed underlines and will allow you to seamlessly remind yourself of the content of definitions, theorems, examples, exercises, subsections and more. Use them liberally.
Historically, mathematics texts have numbered definitions and theorems. We have instead adopted a strategy more appropriate to the heavy cross-referencing, linking and knowling afforded by modern media. Mimicking an approach taken by Donald Knuth, we have given items short titles and associated acronyms. You will become comfortable with this scheme after a short time, and might even come to appreciate its inherent advantages. In the web version, each chapter has a list of ten or so important items from that chapter, and you will find yourself recognizing some of these acronyms with no extra effort beyond the normal amount of study. Bruno Mello suggests that some say an acronym should be pronounceable as a word (such as “radar”), and otherwise is an abbreviation. We will not be so strict in our use of the term.
Exercises come in three flavors, indicated by the first letter of their label. “C” indicates a problem that is essentially computational. “T” represents a problem that is more theoretical, usually requiring a solution that is as rigorous as a proof. “M” stands for problems that are “medium”, “moderate”, “midway”, “mediate” or “median”, but never “mediocre.” Their statements could feel computational, but their solutions require a more thorough understanding of the concepts or theory, while perhaps not being as rigorous as a proof. Of course, such a tripartite division will be subject to interpretation. Otherwise, larger numerical values indicate greater perceived difficulty, with gaps allowing for the contribution of new problems from readers. Many, but not all, exercises have complete solutions. These are indicated by daggers in the PDF and print versions, with solutions available in an online supplement, while in the web version a solution is indicated by a knowl right after the problem statement. Resist the urge to peek early. Working the exercises diligently is the best way to master the material.
The Archetypes are a collection of twenty-four archetypical examples. The open source lexical database, WordNet, defines an archetype as “something that serves as a model or a basis for making copies.” We employ the word in the first sense here. By carefully choosing the examples we hope to provide at least one example that is interesting and appropriate for many of the theorems and definitions, and also provide counterexamples to conjectures (and especially counterexamples to converses of theorems). Each archetype has numerous computational results which you could strive to duplicate as you encounter new definitions and theorems. There are some exercises which will help guide you in this quest.
Supplements

Print versions of the book (either a physical copy or a PDF version) have significant material available as supplements. Solutions are contained in the Exercise Manual. Advice on the use of the open source mathematical software system, Sage, is contained in another supplement. (Look for a linear algebra “Quick Reference” sheet at the Sage website.) The Archetypes are available in a PDF form which could be used as a workbook. Flashcards, with the statement of every definition and theorem, in order of appearance, are also available.
Freedom

This book is copyrighted by its author. Some would say it is his “intellectual property,” a distasteful phrase if there ever was one. Rather than exercise all the restrictions provided by the government-granted monopoly that is copyright, the author has granted you a license, the GNU Free Documentation License (GFDL). In summary it says you may receive an electronic copy at no cost via electronic networks and you may make copies forever. So your copy of the book never has to go “out-of-print.” You may redistribute copies and you may make changes to your copy for your own use. However, you have one major responsibility in accepting this license. If you make changes and distribute the changed version, then you must offer the same license for the new version, you must acknowledge the original author’s work, and you must indicate where you have made changes.

In practice, if you see a change that needs to be made (like correcting an error, or adding a particularly nice theoretical exercise), you may just wish to donate the change to the author rather than create and maintain a new version. Such donations are highly encouraged and gratefully accepted. You may notice the large number of small mistakes that have been corrected by readers that have come before you. Pay it forward.
So, in one word, the book really is “free” (as in “no cost”). But the open license employed is vastly different from “free to download, all rights reserved.” Most importantly, you know that this book, and its ideas, are not the property of anyone. Or they are the property of everyone. Either way, this book has its own inherent “freedom,” separate from those who contribute to it. Much of this philosophy is embodied in the following quote:
If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it. Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it. He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me. That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation.
Thomas Jefferson
Letter to Isaac McPherson
August 13, 1813

To the Instructor
The first half of this text (through Chapter M) is a course in matrix algebra, though the foundation of some more advanced ideas is also being formed in these early sections (such as Theorem NMUS, which presages invertible linear transformations). Vectors are presented exclusively as column vectors (not transposes of row vectors), and linear combinations are presented very early. Spans, null spaces, column spaces and row spaces are also presented early, simply as sets, saving most of their vector space properties for later, so they are familiar objects before being scrutinized carefully.
You cannot do everything early, so in particular matrix multiplication comes later than usual. However, with a definition built on linear combinations of column vectors, it should seem more natural than the more frequent definition using dot products of rows with columns. And this delay emphasizes that linear algebra is built upon vector addition and scalar multiplication. Of course, matrix inverses must wait for matrix multiplication, but this does not prevent nonsingular matrices from occurring sooner. Vector space properties are hinted at when vector and matrix operations are first defined, but the notion of a vector space is saved for a more axiomatic treatment later (Chapter VS). Once bases and dimension have been explored in the context of vector spaces, linear transformations and their matrix representation follow. The predominant purpose of the book is the four sections of Chapter R, which introduces the student to representations of vectors and matrices, change-of-basis, and orthonormal diagonalization (the spectral theorem). This final chapter pulls together all the important ideas of the previous chapters.
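The column-oriented definition of the matrix-vector product described above can be sketched in a few lines of plain Python (the helper name `matvec` is illustrative, not code from the book):

```python
def matvec(A, x):
    """Compute A*x as x_1*(column 1) + x_2*(column 2) + ...,
    the linear-combination definition, rather than as dot
    products of rows with columns."""
    cols = list(zip(*A))             # the columns of A, as tuples
    out = [0] * len(A)               # start with the zero vector
    for xj, col in zip(x, cols):     # accumulate x_j * (column j)
        out = [o + xj * c for o, c in zip(out, col)]
    return out

# [[1,2],[3,4],[5,6]] times [10, 100] is 10*[1,3,5] + 100*[2,4,6]
v = matvec([[1, 2], [3, 4], [5, 6]], [10, 100])
```

The point of the definition is visible in the loop: the product is assembled column by column from scalar multiples, using only vector addition and scalar multiplication.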
Our vector spaces use the complex numbers as the field of scalars. This avoids the fiction of complex eigenvalues being used to form scalar multiples of eigenvectors. The presence of the complex numbers in the earliest sections should not frighten students who need a review, since they will not be used heavily until much later, and Section CNO provides a quick review.
Linear algebra is an ideal subject for the novice mathematics student to learn how to develop a subject precisely, with all the rigor mathematics requires. Unfortunately, much of this rigor seems to have escaped the standard calculus curriculum, so for many university students this is their first exposure to careful definitions and theorems, and the expectation that they fully understand them, to say nothing of the expectation that they become proficient in formulating their own proofs. We have tried to make this text as helpful as possible with this transition. Every definition is stated carefully, set apart from the text. Likewise, every theorem is carefully stated, and almost every one has a complete proof. Theorems usually have just one conclusion, so they can be referenced precisely later. Definitions and theorems are cataloged in order of their appearance (Definitions and Theorems in the Reference chapter at the end of the book). Along the way, there are discussions of some of the more important ideas relating to formulating proofs (Proof Techniques), which is partly advice and partly a primer on logic.
Collecting responses to the Reading Questions prior to covering material in class will require students to learn how to read the material. Sections are designed to be covered in a fifty-minute lecture. Later sections are longer, but as students become more proficient at reading the text, it is possible to survey these longer sections at the same pace. With solutions to many of the exercises, students may be given the freedom to work homework at their own pace and style (individually, in groups, with an instructor’s help, etc.). To compensate and keep students from falling behind, I give an examination on each chapter.
Sage is a powerful open source program for advanced mathematics. It is especially robust for linear algebra. We have included an abundance of material which will help the student (and instructor) learn how to use Sage for the study of linear algebra and how to understand linear algebra better with Sage. This material is tightly integrated with the web version of the book and will become even easier to use as the technology for interfaces to Sage continues to evolve rapidly. Sage is highly capable for mathematical research as well, and so should be a tool that students can use in subsequent courses and careers.
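As an illustration of the kind of computation Sage automates: in Sage, `matrix(QQ, [[1, 2, 5], [3, 4, 6]]).rref()` returns the reduced row-echelon form of an augmented matrix. A rough, self-contained sketch of the same row-reduction process in plain Python, using exact rational arithmetic (the helper name `rref` is illustrative, not the book's code), might look like:

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row-echelon form of a matrix,
    given as a list of rows, using exact Fraction arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(m), len(m[0])
    lead = 0                          # current pivot column
    for r in range(nrows):
        if lead >= ncols:
            break
        i = r                         # find a row with a nonzero pivot
        while m[i][lead] == 0:
            i += 1
            if i == nrows:
                i, lead = r, lead + 1
                if lead == ncols:
                    return m
        m[i], m[r] = m[r], m[i]       # swap it into position
        pivot = m[r][lead]
        m[r] = [x / pivot for x in m[r]]   # scale pivot to 1
        for j in range(nrows):        # clear the rest of the column
            if j != r:
                f = m[j][lead]
                m[j] = [a - f * b for a, b in zip(m[j], m[r])]
        lead += 1
    return m

# Augmented matrix for the system x + 2y = 5, 3x + 4y = 6
result = rref([[1, 2, 5], [3, 4, 6]])
```

The exact arithmetic matters: floating-point row reduction can misclassify a near-zero entry as a pivot, which is one of the numerical issues the Preface sets aside.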
Conclusion

Linear algebra is a beautiful subject. I have enjoyed preparing this exposition and making it widely available. Much of my motivation for writing this book is captured by the sentiments expressed by H.M. Cundy and A.P. Rollet in their Preface to the First Edition of Mathematical Models (1952), especially the final sentence:

This book was born in the classroom, and arose from the spontaneous interest of a Mathematical Sixth in the construction of simple models. A desire to show that even in mathematics one could have fun led to an exhibition of the results and attracted considerable attention throughout the school. Since then the Sherborne collection has grown, ideas have come from many sources, and widespread interest has been shown. It seems therefore desirable to give permanent form to the lessons of experience so that others can benefit by them and be encouraged to undertake similar work.
Foremost, I hope that students find their time spent with this book profitable. I hope that instructors find it flexible enough to fit the needs of their course. You can always find the latest version, and keep current with any changes, at the book’s website (http://linear.pugetsound.edu). I appreciate receiving suggestions, corrections, and other comments, so please do contact me.

Robert A. Beezer
Tacoma, Washington
December 2012
Acknowledgements

Many people have helped to make this book, and its freedoms, possible.
First, the time to create, edit and distribute the book has been provided implicitly and explicitly by the University of Puget Sound. A sabbatical leave in Spring 2004, a course release in Spring 2007, and a Lantz Senior Fellowship for the 2010–11 academic year are three obvious examples of explicit support. The course release was provided by support from the Lind-VanEnkevort Fund. The university has also provided clerical support, computer hardware, network servers and bandwidth. Thanks to Dean Kris Bartanen and the chairs of the Mathematics and Computer Science Department, Professors Martin Jackson, Sigrun Bodine and Bryan Smith, for their support, encouragement and flexibility.
My colleagues in the Mathematics and Computer Science Department have graciously taught our introductory linear algebra course using earlier versions and have provided valuable suggestions that have improved the book immeasurably. Thanks to Professor Martin Jackson (v0.30), Professor David Scott (v0.70), Professor Bryan Smith (v0.70, v0.80, v1.00, v2.00, v2.20), Professor Manley Perkel (v2.10), and Professor Cynthia Gibson (v2.20).
University of Puget Sound librarians Lori Ricigliano, Elizabeth Knight and Jeanne Kimura provided valuable advice on production, and interesting conversations about copyrights.
Many aspects of the book have been influenced by insightful questions and creative suggestions from the students who have labored through the book in our courses. For example, the flashcards with theorems and definitions are a direct result of a student suggestion. I will single out a handful of students at the University of Puget Sound who have been especially adept at finding and reporting mathematically significant typographical errors: Jake Linenthal, Christie Su, Kim Le, Sarah McQuate, Andy Zimmer, Travis Osborne, Andrew Tapay, Mark Shoemaker, Tasha Underhill, Tim Zitzer, Elizabeth Million, Steve Canfield, Jinshil Yi, Cliff Berger, Preston Van Buren, Duncan Bennett, Dan Messenger, Caden Robinson, Glenna Toomey, Tyler Ueltschi, Kyle Whitcomb, Anna Dovzhik, Chris Spalding and Jenna Fontaine. All the students of the Fall 2012 Math 290 sections were very helpful and patient through the major changes required in making Version 3.00.
I have tried to be as original as possible in the organization and presentation of this beautiful subject. However, I have been influenced by many years of teaching from another excellent textbook, Introduction to Linear Algebra by L.W. Johnson, R.D. Reiss and J.T. Arnold. When I have needed inspiration for the correct approach to particularly important proofs, I have learned to eventually consult two other textbooks. Sheldon Axler’s Linear Algebra Done Right is a highly original exposition, while Ben Noble’s Applied Linear Algebra frequently strikes just the right note between rigor and intuition. Noble’s excellent book is highly recommended, even though its publication dates to 1969.
Conversion to various electronic formats has greatly depended on assistance from: Eitan Gurari, author of the powerful LaTeX translator, tex4ht; Davide Cervone, author of jsMath and MathJax; and Carl Witty, who advised on and tested the Sony Reader format. Thanks to these individuals for their critical assistance.
Incorporation of Sage code is made possible by the entire community of Sage developers and users, who create and refine the mathematical routines, the user interfaces and applications in educational settings. Technical and logistical aspects of incorporating Sage code in open textbooks were supported by a grant from the United States National Science Foundation (DUE-1022574), which has been administered by the American Institute of Mathematics, and in particular, David Farmer. The support and assistance of my fellow Principal Investigators, Jason Grout, Tom Judson, Kiran Kedlaya, Sandra Laursen, Susan Lynds, and William Stein are especially appreciated.

David Farmer and Sally Koutsoliotas are responsible for the vision and initial experiments which led to the knowl-enabled web version, as part of the Version 3 project.
General support and encouragement of free and affordable textbooks, in addition to specific promotion of this text, was provided by Nicole Allen, Textbook Advocate at Student Public Interest Research Groups. Nicole was an early consumer of this material, back when it looked more like lecture notes than a textbook.
Finally, in every respect, the production and distribution of this book has been accomplished with open source software. The range of individuals and projects is far too great to pretend to list them all. This project is an attempt to pay it forward.
Contributors

Name                  Location              Contact
Beezer, David         Santa Clara U.
Beezer, Robert        U. of Puget Sound     buzzard.ups.edu/
Black, Chris
Braithwaite, David    Chicago, Illinois
Bucht, Sara           U. of Puget Sound
Canfield, Steve       U. of Puget Sound
Hubert, Dupont        Creteil, France
Fellez, Sarah         U. of Puget Sound
Fickenscher, Eric     U. of Puget Sound
Jackson, Martin       U. of Puget Sound     www.math.ups.edu/~martinj
Johnson, Chili        U. of Puget Sound
Kessler, Ivan         U. of Puget Sound
Kreher, Don           Michigan Tech. U.     www.math.mtu.edu/~kreher
Hamrick, Mark         St. Louis U.
Linenthal, Jacob      U. of Puget Sound
Million, Elizabeth    U. of Puget Sound
Osborne, Travis       U. of Puget Sound
Riegsecker, Joe       Middlebury, Indiana   joepye(at)pobox(dot)com
Perkel, Manley        U. of Puget Sound
Phelps, Douglas       U. of Puget Sound
Shoemaker, Mark       U. of Puget Sound
Toth, Zoltan                                zoli.web.elte.hu
Ueltschi, Tyler       U. of Puget Sound
Zimmer, Andy          U. of Puget Sound
Chapter SLE
Systems of Linear Equations

We will motivate our study of linear algebra by studying solutions to systems of linear equations. While the focus of this chapter is on the practical matter of how to find, and describe, these solutions, we will also be setting ourselves up for more theoretical ideas that will appear later.
Section WILA
What is Linear Algebra?

We begin our study of linear algebra with an introduction and a motivational example.
Subsection LA
Linear + Algebra
The subject of linear algebra can be partially explained by the meaning of the two terms comprising the title. “Linear” is a term you will appreciate better at the end of this course, and indeed, attaining this appreciation could be taken as one of the primary goals of this course. However, for now you can understand it to mean anything that is “straight” or “flat.” For example, in the xy-plane you might be accustomed to describing straight lines (is there any other kind?) as the set of solutions to an equation of the form y = mx + b, where the slope m and the y-intercept b are constants that together describe the line. If you have studied multivariate calculus, then you will have encountered planes. Living in three dimensions, with coordinates described by triples (x, y, z), they can be described as the set of solutions to equations of the form ax + by + cz = d, where a, b, c, d are constants that together determine the plane. While we might describe planes as “flat,” lines in three dimensions might be described as “straight.” From a multivariate calculus course you will recall that lines are sets of points described by equations
§WILA Beezer: A First Course in Linear Algebra
such as x = 3t − 4, y = −7t + 2, z = 9t, where t is a parameter that can take on any value.
Another view of this notion of “flatness” is to recognize that the sets of points just described are solutions to equations of a relatively simple form. These equations involve addition and multiplication only. We will have a need for subtraction, and occasionally we will divide, but mostly you can describe “linear” equations as involving only addition and multiplication. Here are some examples of typical equations we will see in the next few sections:
2x + 3y − 4z = 13        4x_1 + 5x_2 − x_3 + x_4 + x_5 = 0        9a − 2b + 7c + 2d = −7

What we will not see are equations like:

xy + 5yz = 13        x_1 + x_2^3/x_4 − x_3 x_4 x_5^2 = 0        tan(ab) + log(c − d) = −7

The exception will be that we will on occasion need to take a square root.
You have probably heard the word “algebra” frequently in your mathematical preparation for this course. Most likely, you have spent a good ten to fifteen years learning the algebra of the real numbers, along with some introduction to the very similar algebra of complex numbers (see Section CNO). However, there are many new algebras to learn and use, and likely linear algebra will be your second algebra. Like learning a second language, the necessary adjustments can be challenging at times, but the rewards are many. And it will make learning your third and fourth algebras even easier. Perhaps you have heard of “groups” and “rings” (or maybe you have studied them already), which are excellent examples of other algebras with very interesting properties and applications. In any event, prepare yourself to learn a new algebra and realize that some of the old rules you used for the real numbers may no longer apply to this new algebra you will be learning!
The brief discussion above about lines and planes suggests that linear algebra has an inherently geometric nature, and this is true. Examples in two and three dimensions can be used to provide valuable insight into important concepts of this course. However, much of the power of linear algebra will be the ability to work with “flat” or “straight” objects in higher dimensions, without concerning ourselves with visualizing the situation. While much of our intuition will come from examples in two and three dimensions, we will maintain an algebraic approach to the subject, with the geometry being secondary. Others may wish to switch this emphasis around, and that can lead to a very fruitful and beneficial course, but here and now we are laying our bias bare.
Subsection AA
An Application
We conclude this section with a rather involved example that will highlight some of the power and techniques of linear algebra. Work through all of the details with pencil
and paper, until you believe all the assertions made. However, in this introductory example, do not concern yourself with how some of the results are obtained or how you might be expected to solve a similar problem. We will come back to this example later and expose some of the techniques used and properties exploited. For now, use your background in mathematics to convince yourself that everything said here really is correct.
Example TMP Trail Mix Packaging
Suppose you are the production manager at a food-packaging plant and one of your product lines is trail mix, a healthy snack popular with hikers and backpackers, containing raisins, peanuts and hard-shelled chocolate pieces. By adjusting the mix of these three ingredients, you are able to sell three varieties of this item. The fancy version is sold in half-kilogram packages at outdoor supply stores and has more chocolate and fewer raisins, thus commanding a higher price. The standard version is sold in one-kilogram packages in grocery stores and gas station mini-markets. Since the standard version has roughly equal amounts of each ingredient, it is not as expensive as the fancy version. Finally, a bulk version is sold in bins at grocery stores for consumers to load into plastic bags in amounts of their choosing. To appeal to the shoppers who like bulk items for their economy and healthfulness, this mix has many more raisins (at the expense of chocolate) and therefore sells for less.
Your production facilities have limited storage space and early each morning you are able to receive and store 380 kilograms of raisins, 500 kilograms of peanuts and 620 kilograms of chocolate pieces. As production manager, one of your most important duties is to decide how much of each version of trail mix to make every day. Clearly, you can have up to 1500 kilograms of raw ingredients available each day, so to be the most productive you will likely produce 1500 kilograms of trail mix each day. Also, you would prefer not to have any ingredients left over each day, so that your final product is as fresh as possible and so that you can receive the maximum delivery the next morning. But how should these ingredients be allocated to the mixing of the bulk, standard and fancy versions? First, we need a little more information about the mixes. Workers mix the ingredients in 15-kilogram batches, and each row of the table below gives a recipe for a 15-kilogram batch. There is some additional information on the costs of the ingredients and the price the manufacturer can charge for the different versions of the trail mix.
              Raisins      Peanuts      Chocolate    Cost     Sale Price
              (kg/batch)   (kg/batch)   (kg/batch)   ($/kg)   ($/kg)
Bulk               7            6            2        3.69      4.99
Standard           6            4            5        3.86      5.50
Fancy              2            5            8        4.45      6.50
Storage (kg)     380          500          620
Cost ($/kg)     2.55         4.65         4.80
As production manager, it is important to realize that you only have three decisions to make: the amount of bulk mix to make, the amount of standard mix to make and the amount of fancy mix to make. Everything else is beyond your control or is handled by another department within the company. Principally, you are limited by the amount of raw ingredients you can store each day. Let us denote the amount of each mix to produce each day, measured in kilograms, by the variable quantities b, s and f. Your production schedule can be described as values of b, s and f that do several things. First, we cannot make negative quantities of each mix, so

b ≥ 0        s ≥ 0        f ≥ 0
Second, if we want to consume all of our ingredients each day, the storage capacities lead to three (linear) equations, one for each ingredient,

7/15 b + 6/15 s + 2/15 f = 380   (raisins)
6/15 b + 4/15 s + 5/15 f = 500   (peanuts)
2/15 b + 5/15 s + 8/15 f = 620   (chocolate)
It happens that this system of three equations has just one solution. In other words, as production manager, your job is easy, since there is but one way to use up all of your raw ingredients making trail mix. This single solution is

b = 300 kg        s = 300 kg        f = 900 kg.

We do not yet have the tools to explain why this solution is the only one, but it should be simple for you to verify that this is indeed a solution. (Go ahead, we will wait.) Determining solutions such as this, and establishing that they are unique, will be the main motivation for our initial study of linear algebra.
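If you would rather let a machine do the checking, a few lines of plain Python suffice; this sketch is our own (the book itself incorporates Sage for such work), using exact rational arithmetic so no rounding can cloud the verification.

```python
from fractions import Fraction as F

# Claimed production schedule, in kilograms of each mix.
b, s, f = 300, 300, 900

# Left sides of the three ingredient equations, computed exactly.
raisins   = F(7, 15) * b + F(6, 15) * s + F(2, 15) * f
raisins   = F(7, 15) * b + F(6, 15) * s + F(2, 15) * f
peanuts   = F(6, 15) * b + F(4, 15) * s + F(5, 15) * f
chocolate = F(2, 15) * b + F(5, 15) * s + F(8, 15) * f

print(raisins, peanuts, chocolate)  # 380 500 620, matching the storage limits
```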
So we have solved the problem of making sure that we make the best use of our limited storage space, and each day use up all of the raw ingredients that are shipped to us. Additionally, as production manager, you must report weekly to the CEO of the company, and you know he will be more interested in the profit derived from your decisions than in the actual production levels. So you compute,

300(4.99 − 3.69) + 300(5.50 − 3.86) + 900(6.50 − 4.45) = 2727.00

for a daily profit of $2,727 from this production schedule. The computation of the daily profit is also beyond our control, though it is definitely of interest, and it too looks like a “linear” computation.
As often happens, things do not stay the same for long, and now the marketing department has suggested that your company’s trail mix products standardize on every mix being one-third peanuts. Adjusting the peanut portion of each recipe by also adjusting the chocolate portion leads to revised recipes, and slightly different costs for the bulk and standard mixes, as given in the following table.
              Raisins      Peanuts      Chocolate    Cost     Sale Price
              (kg/batch)   (kg/batch)   (kg/batch)   ($/kg)   ($/kg)
Bulk               7            5            3        3.70      4.99
Standard           6            5            4        3.85      5.50
Fancy              2            5            8        4.45      6.50
Storage (kg)     380          500          620
Cost ($/kg)     2.55         4.65         4.80
In a similar fashion as before, we desire values of b, s and f so that

b ≥ 0        s ≥ 0        f ≥ 0

and

7/15 b + 6/15 s + 2/15 f = 380   (raisins)
5/15 b + 5/15 s + 5/15 f = 500   (peanuts)
3/15 b + 4/15 s + 8/15 f = 620   (chocolate)
It now happens that this system of equations has infinitely many solutions, as we will now demonstrate. Let f remain a variable quantity. Then suppose we make f kilograms of the fancy mix, b = 4f − 3300 kilograms of the bulk mix, and s = −5f + 4800 kilograms of the standard mix. We now show that these choices, for any value of f, will yield a production schedule that exhausts all of the day’s supply of raw ingredients. (We will very soon learn how to solve systems of equations with infinitely many solutions and then determine expressions like these for b and s.) Grab your pencil and paper and play along by substituting these choices for the production schedule into the storage limits for each raw ingredient and simplifying the algebra.
7/15 (4f − 3300) + 6/15 (−5f + 4800) + 2/15 f = 0f + 5700/15 = 380
5/15 (4f − 3300) + 5/15 (−5f + 4800) + 5/15 f = 0f + 7500/15 = 500
3/15 (4f − 3300) + 4/15 (−5f + 4800) + 8/15 f = 0f + 9300/15 = 620
Convince yourself that these expressions for b and s allow us to vary f and obtain an infinite number of possibilities for solutions to the three equations that describe our storage capacities. As a practical matter, there really are not an infinite number of solutions, since we are unlikely to want to end the day with a fractional number of bags of fancy mix, so our allowable values of f should probably be integers. More importantly, we need to remember that we cannot make negative amounts of each mix! Where does this lead us? Nonnegative quantities of the bulk mix require that

b ≥ 0  ⇒  4f − 3300 ≥ 0  ⇒  f ≥ 825
Similarly for the standard mix,

s ≥ 0  ⇒  −5f + 4800 ≥ 0  ⇒  f ≤ 960
So, as production manager, you really have to choose a value of f from the finite set

{825, 826, ..., 960}

leaving you with 136 choices, each of which will exhaust the day’s supply of raw ingredients. Pause now and think about which you would choose.
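A short script can play the pencil-and-paper role here. This sketch of our own (the function name `schedule` is ours, not the book’s) checks that the expressions b = 4f − 3300 and s = −5f + 4800 exhaust all three ingredients for a few sample values of f, and that the nonnegativity conditions leave exactly the 136 integer choices just described.

```python
from fractions import Fraction as F

def schedule(f):
    # Bulk and standard amounts implied by a choice of fancy mix f.
    return 4 * f - 3300, -5 * f + 4800

# Any choice of f exhausts all three ingredients (revised recipes).
for f in (825, 900, 960):
    b, s = schedule(f)
    assert F(7, 15) * b + F(6, 15) * s + F(2, 15) * f == 380  # raisins
    assert F(5, 15) * b + F(5, 15) * s + F(5, 15) * f == 500  # peanuts
    assert F(3, 15) * b + F(4, 15) * s + F(8, 15) * f == 620  # chocolate

# Requiring b >= 0 and s >= 0 leaves the integers 825 through 960.
feasible = [f for f in range(2000) if min(schedule(f)) >= 0]
print(min(feasible), max(feasible), len(feasible))  # 825 960 136
```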
Recalling your weekly meeting with the CEO suggests that you might want to choose a production schedule that yields the biggest possible profit for the company. So you compute an expression for the profit based on your as yet undetermined decision for the value of f,

(4f − 3300)(4.99 − 3.70) + (−5f + 4800)(5.50 − 3.85) + (f)(6.50 − 4.45) = −1.04f + 3663
Since f has a negative coefficient it would appear that mixing fancy mix is detrimental to your profit and should be avoided. So you will make the decision to set daily fancy mix production at f = 825. This has the effect of setting b = 4(825) − 3300 = 0 and we stop producing bulk mix entirely. So the remainder of your daily production is standard mix at the level of s = −5(825) + 4800 = 675 kilograms and the resulting daily profit is (−1.04)(825) + 3663 = 2805. It is a pleasant surprise that daily profit has risen to $2,805, but this is not the most important part of the story. What is important here is that there are a large number of ways to produce trail mix that use all of the day’s worth of raw ingredients and you were able to easily choose the one that netted the largest profit. Notice too how all of the above computations look “linear.”
In the food industry, things do not stay the same for long, and now the sales department says that increased competition has led to the decision to stay competitive and charge just $5.25 for a kilogram of the standard mix, rather than the previous $5.50 per kilogram. This decision has no effect on the possibilities for the production schedule, but will affect the decision based on profit considerations. So you revisit just the profit computation, suitably adjusted for the new selling price of standard mix,

(4f − 3300)(4.99 − 3.70) + (−5f + 4800)(5.25 − 3.85) + (f)(6.50 − 4.45) = 0.21f + 2463
Now it would appear that fancy mix is beneficial to the company’s profit since the value of f has a positive coefficient. So you take the decision to make as much fancy mix as possible, setting f = 960. This leads to s = −5(960) + 4800 = 0 and the increased competition has driven you out of the standard mix market altogether. The remainder of production is therefore bulk mix at a daily level of b = 4(960) − 3300 = 540 kilograms and the resulting daily profit is 0.21(960) + 2463 = 2664.60. A daily profit of $2,664.60 is less than it used to be, but as production manager, you have made the best of a difficult situation and shown the sales department that the best course is to pull out of the highly competitive standard mix market completely. △
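Both profit decisions in the example can be reproduced by simple brute force over the 136 feasible choices. The sketch below is our own illustration (`profit` is a name we introduce, not one from the text), evaluating the daily profit at every feasible integer value of f under each selling price for standard mix.

```python
def profit(f, standard_price):
    # Daily profit for fancy-mix level f, using the revised recipes.
    b, s = 4 * f - 3300, -5 * f + 4800
    return (b * (4.99 - 3.70)
            + s * (standard_price - 3.85)
            + f * (6.50 - 4.45))

for price in (5.50, 5.25):
    best = max(range(825, 961), key=lambda f: profit(f, price))
    print(price, best, round(profit(best, price), 2))
# 5.5 825 2805.0   -- all standard mix, as in the $5.50 scenario
# 5.25 960 2664.6  -- fancy and bulk only, as in the $5.25 scenario
```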
This example is taken from a field of mathematics variously known by names such as operations research, systems science, or management science. More specifically, this is a prototypical example of problems that are solved by the techniques of “linear programming.”

There is a lot going on under the hood in this example. The heart of the matter is the solution to systems of linear equations, which is the topic of the next few sections, and a recurrent theme throughout this course. We will return to this example on several occasions to reveal some of the reasons for its behavior.
Reading Questions

1. Is the equation x^2 + xy + tan(y^3) = 0 linear or not? Why or why not?
2. Find all solutions to the system of two linear equations 2x + 3y = −8, x − y = 6.
3. Describe how the production manager might explain the importance of the procedures described in the trail mix application (Subsection WILA.AA).
Exercises

C10 In Example TMP the first table lists the cost (per kilogram) to manufacture each of the three varieties of trail mix (bulk, standard, fancy). For example, it costs $3.69 to make one kilogram of the bulk variety. Re-compute each of these three costs and notice that the computations are linear in character.
M70† In Example TMP two different prices were considered for marketing standard mix with the revised recipes (one-third peanuts in each recipe). Selling standard mix at $5.50 resulted in selling the minimum amount of the fancy mix and no bulk mix. At $5.25 it was best for profits to sell the maximum amount of fancy mix and then sell no standard mix. Determine a selling price for standard mix that allows for maximum profits while still selling some of each type of mix.
Section SSLE
Solving Systems of Linear Equations

We will motivate our study of linear algebra by considering the problem of solving several linear equations simultaneously. The word “solve” tends to get abused somewhat, as in “solve this problem.” When talking about equations we understand a more precise meaning: find all of the values of some variable quantities that make an equation, or several equations, simultaneously true.
Subsection SLE
Systems of Linear Equations

Our first example is of a type we will not pursue further. While it has two equations, the first is not linear. So this is a good example to come back to later, especially after you have seen Theorem PSSLS.
Example STNE Solving two (nonlinear) equations
Suppose we desire the simultaneous solutions of the two equations,

x^2 + y^2 = 1
−x + √3 y = 0

You can easily check by substitution that x = √3/2, y = 1/2 and x = −√3/2, y = −1/2 are both solutions. We need to also convince ourselves that these are the only solutions. To see this, plot each equation on the xy-plane, which means to plot (x, y) pairs that make an individual equation true. In this case we get a circle centered at the origin with radius 1 and a straight line through the origin with slope 1/√3. The intersections of these two curves are our desired simultaneous solutions, and so we believe from our plot that the two solutions we know already are indeed the only ones. We like to write solutions as sets, so in this case we write the set of solutions as

S = { (√3/2, 1/2), (−√3/2, −1/2) }   △
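The example argues geometrically; as a sketch of our own, the two claimed solutions can also be checked numerically with a few lines of Python.

```python
import math

# The two candidate solutions of x^2 + y^2 = 1 and -x + sqrt(3) y = 0.
candidates = [(math.sqrt(3) / 2, 1 / 2), (-math.sqrt(3) / 2, -1 / 2)]

for x, y in candidates:
    assert abs(x**2 + y**2 - 1) < 1e-12        # lies on the unit circle
    assert abs(-x + math.sqrt(3) * y) < 1e-12  # lies on the line
print("both points satisfy both equations")
```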
In order to discuss systems of linear equations carefully, we need a precise definition. And before we do that, we will introduce our periodic discussions about “Proof Techniques.” Linear algebra is an excellent setting for learning how to read, understand and formulate proofs. But this is a difficult step in your development as a mathematician, so we have included a series of short essays containing advice and explanations to help you along. These will be referenced in the text as needed, and are also collected as a list you can consult when you want to return to re-read them. (Which is strongly encouraged!)
§SSLE Beezer: A First Course in Linear Algebra
With a definition next, now is the time for the first of our proof techniques. So study Proof Technique D. We’ll be right here when you get back. See you in a bit.
Definition SLE System of Linear Equations
A system of linear equations is a collection of m equations in the variable quantities x_1, x_2, x_3, ..., x_n of the form,

a_11 x_1 + a_12 x_2 + a_13 x_3 + ··· + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + a_23 x_3 + ··· + a_2n x_n = b_2
a_31 x_1 + a_32 x_2 + a_33 x_3 + ··· + a_3n x_n = b_3
    ⋮
a_m1 x_1 + a_m2 x_2 + a_m3 x_3 + ··· + a_mn x_n = b_m

where the values of a_ij, b_i and x_j, 1 ≤ i ≤ m, 1 ≤ j ≤ n, are from the set of complex numbers, C.   □
Do not let the mention of the complex numbers, C, rattle you. We will stick with real numbers exclusively for many more sections, and it will sometimes seem like we only work with integers! However, we want to leave the possibility of complex numbers open, and there will be occasions in subsequent sections where they are necessary. You can review the basic properties of complex numbers in Section CNO, but these facts will not be critical until we reach Section O.
Now we make the notion of a solution to a linear system precise.

Definition SSLE Solution of a System of Linear Equations
A solution of a system of linear equations in n variables, x_1, x_2, x_3, ..., x_n (such as the system given in Definition SLE), is an ordered list of n complex numbers, s_1, s_2, s_3, ..., s_n such that if we substitute s_1 for x_1, s_2 for x_2, s_3 for x_3, ..., s_n for x_n, then for every equation of the system the left side will equal the right side, i.e. each equation is true simultaneously.   □
More typically, we will write a solution in a form like x_1 = 12, x_2 = −7, x_3 = 2 to mean that s_1 = 12, s_2 = −7, s_3 = 2 in the notation of Definition SSLE. To discuss all of the possible solutions to a system of linear equations, we now define the set of all solutions. (So Section SET is now applicable, and you may want to go and familiarize yourself with what is there.)
Definition SSSLE Solution Set of a System of Linear Equations
The solution set of a linear system of equations is the set which contains every solution to the system, and nothing more.   □
Be aware that a solution set can be infinite, or there can be no solutions, in which case we write the solution set as the empty set, ∅ = { } (Definition ES). Here is an example to illustrate using the notation introduced in Definition SLE and the notion of a solution (Definition SSLE).
Example NSE Notation for a system of equations
Given the system of linear equations,

x_1 + 2x_2 + x_4 = 7
x_1 + x_2 + x_3 − x_4 = 3
3x_1 + x_2 + 5x_3 − 7x_4 = 1

we have n = 4 variables and m = 3 equations. Also,

a_11 = 1   a_12 = 2   a_13 = 0   a_14 = 1    b_1 = 7
a_21 = 1   a_22 = 1   a_23 = 1   a_24 = −1   b_2 = 3
a_31 = 3   a_32 = 1   a_33 = 5   a_34 = −7   b_3 = 1
Additionally, convince yourself that x_1 = −2, x_2 = 4, x_3 = 2, x_4 = 1 is one solution (Definition SSLE), but it is not the only one! For example, another solution is x_1 = −12, x_2 = 11, x_3 = 1, x_4 = −3, and there are more to be found. So the solution set contains at least two elements.   △
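With the coefficients a_ij laid out as rows, checking a proposed solution is exactly the substitution described in Definition SSLE. Here is a small sketch of our own in Python (`is_solution` is our name, not the book’s):

```python
# Coefficient rows a_ij and right-hand sides b_i from Example NSE.
A = [[1, 2, 0, 1],
     [1, 1, 1, -1],
     [3, 1, 5, -7]]
b = [7, 3, 1]

def is_solution(x):
    # Substitute x into every equation and compare both sides.
    return all(sum(a * xj for a, xj in zip(row, x)) == rhs
               for row, rhs in zip(A, b))

print(is_solution([-2, 4, 2, 1]), is_solution([-12, 11, 1, -3]))  # True True
print(is_solution([0, 0, 0, 0]))  # False
```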
We will often shorten the term “system of linear equations” to “system of equations” leaving the linear aspect implied. After all, this is a book about linear algebra.
Subsection PSS
Possibilities for Solution Sets

The next example illustrates the possibilities for the solution set of a system of linear equations. We will not be too formal here, and the necessary theorems to back up our claims will come in subsequent sections. So read for feeling and come back later to revisit this example.
Example TTS Three typical systems
Consider the system of two equations with two variables,

    2x_1 + 3x_2 = 3
     x_1 −  x_2 = 4

If we plot the solutions to each of these equations separately on the x_1x_2-plane, we get two lines, one with negative slope, the other with positive slope. They have exactly one point in common, (x_1, x_2) = (3, −1), which is the solution x_1 = 3, x_2 = −1. From the geometry, we believe that this is the only solution to the system of equations, and so we say it is unique.
Now adjust the system with a different second equation,

    2x_1 + 3x_2 = 3
    4x_1 + 6x_2 = 6
A plot of the solutions to these equations individually results in two lines, one on top of the other! There are infinitely many pairs of points that make both equations true. We will learn shortly how to describe this infinite solution set precisely (see Example SAA, Theorem VFSLS). Notice now how the second equation is just a multiple of the first.
One more minor adjustment provides a third system of linear equations,

    2x_1 + 3x_2 = 3
    4x_1 + 6x_2 = 10

A plot now reveals two lines with identical slopes, i.e. parallel lines. They have no points in common, and so the system has a solution set that is empty, S = ∅.
△
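For two equations in two variables, the three behaviors can be detected from the coefficients alone: the determinant a_1b_2 − a_2b_1 decides whether the lines meet in exactly one point. The following Python sketch (our own, with hypothetical names; it ignores degenerate inputs such as an all-zero equation) classifies the three systems above:

```python
def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify solutions of a1*x + b1*y = c1 and a2*x + b2*y = c2."""
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Nonzero determinant: the lines cross once (Cramer's rule).
        return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
    # Zero determinant: the lines are parallel.  They coincide exactly
    # when the constants are in the same proportion as the coefficients.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "infinitely many"
    return "no solutions"

print(classify_2x2(2, 3, 3, 1, -1, 4))   # (3.0, -1.0), the unique solution
print(classify_2x2(2, 3, 3, 4, 6, 6))    # infinitely many
print(classify_2x2(2, 3, 3, 4, 6, 10))   # no solutions
```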
This example exhibits all of the typical behaviors of a system of equations. A subsequent theorem will tell us that every system of linear equations has a solution set that is empty, contains a single solution or contains infinitely many solutions (Theorem PSSLS). Example STNE yielded exactly two solutions, but this does not contradict the forthcoming theorem. The equations in Example STNE are not linear because they do not match the form of Definition SLE, and so we cannot apply Theorem PSSLS in this case.
Subsection ESEO
Equivalent Systems and Equation Operations

With all this talk about finding solution sets for systems of linear equations, you might be ready to begin learning how to find these solution sets yourself. We begin with our first definition that takes a common word and gives it a very precise meaning in the context of systems of linear equations.
Definition ESYS Equivalent Systems
Two systems of linear equations are equivalent if their solution sets are equal. □
Notice here that the two systems of equations could look very different (i.e. not be equal), but still have equal solution sets, and we would then call the systems equivalent. Two linear equations in two variables might be plotted as two lines that intersect in a single point. A different system, with three equations in two variables, might have a plot that is three lines, all intersecting at a common point, with this common point identical to the intersection point for the first system. By our definition, we could then say these two very different looking systems of equations are equivalent, since they have identical solution sets. It is really like a weaker form of equality, where we allow the systems to be different in some respects, but we use the term equivalent to highlight the situation when their solution sets are equal.
With this definition, we can begin to describe our strategy for solving linear systems. Given a system of linear equations that looks difficult to solve, we would like to have an equivalent system that is easy to solve. Since the systems will have
equal solution sets, we can solve the “easy” system and get the solution set to the “difficult” system. Here come the tools for making this strategy viable.
Definition EO Equation Operations
Given a system of linear equations, the following three operations will transform the system into a different one, and each operation is known as an equation operation.

1. Swap the locations of two equations in the list of equations.

2. Multiply each term of an equation by a nonzero quantity.

3. Multiply each term of one equation by some quantity, and add these terms to a second equation, on both sides of the equality. Leave the first equation the same after this operation, but replace the second equation by the new one.

□
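Each operation is easy to express on a concrete representation of the system. In this Python sketch (our own encoding, not the book's), an equation is a list of its coefficients followed by its constant, and each operation returns a new system rather than modifying the old one:

```python
# Represent each equation as its coefficients with the constant last:
# x1 + 2x2 + 2x3 = 4  ->  [1, 2, 2, 4].  Indices below are 0-based.

def swap(system, i, j):
    """Equation operation 1: swap equations i and j."""
    s = [row[:] for row in system]
    s[i], s[j] = s[j], s[i]
    return s

def scale(system, i, alpha):
    """Equation operation 2: multiply equation i by a nonzero alpha."""
    assert alpha != 0
    s = [row[:] for row in system]
    s[i] = [alpha * t for t in s[i]]
    return s

def combine(system, alpha, i, j):
    """Equation operation 3: replace equation j by alpha*(eq i) + (eq j)."""
    s = [row[:] for row in system]
    s[j] = [alpha * ti + tj for ti, tj in zip(s[i], s[j])]
    return s

system = [[1, 2, 2, 4], [1, 3, 3, 5], [2, 6, 5, 6]]
print(combine(system, -1, 0, 1)[1])  # [0, 1, 1, 1]
```

The last line performs “α = −1 times equation 1, add to equation 2” on the system of Example US later in this section.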
These descriptions might seem a bit vague, but the proof or the examples that follow should make it clear what is meant by each. We will shortly prove a key theorem about equation operations and solutions to linear systems of equations.
We are about to give a rather involved proof, so a discussion about just what a theorem really is would be timely. Stop and read Proof Technique T first.
In the theorem we are about to prove, the conclusion is that two systems are equivalent. By Definition ESYS this translates to requiring that solution sets be equal for the two systems. So we are being asked to show that two sets are equal. How do we do this? Well, there is a very standard technique, and we will use it repeatedly through the course. If you have not done so already, head to Section SET and familiarize yourself with sets, their operations, and especially the notion of set equality, Definition SE, and the nearby discussion about its use.
The following theorem has a rather long proof. This chapter contains a few very necessary theorems like this, with proofs that you can safely skip on a first reading. You might come back to them later, when you are more comfortable with reading and studying proofs.
Theorem EOPSS Equation Operations Preserve Solution Sets
If we apply one of the three equation operations of Definition EO to a system of linear equations (Definition SLE), then the original system and the transformed system are equivalent.

Proof. We take each equation operation in turn and show that the solution sets of the two systems are equal, using the definition of set equality (Definition SE).
1. It will not be our habit in proofs to resort to saying statements are “obvious,” but in this case, it should be. There is nothing about the order in which we write linear equations that affects their solutions, so the solution set will be equal if the systems only differ by a rearrangement of the order of the equations.
2. Suppose α ≠ 0 is a number. Let us choose to multiply the terms of equation i by α to build the new system of equations,

       a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + ··· + a_{1n}x_n = b_1
       a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + ··· + a_{2n}x_n = b_2
       a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + ··· + a_{3n}x_n = b_3
           ⋮
       αa_{i1}x_1 + αa_{i2}x_2 + αa_{i3}x_3 + ··· + αa_{in}x_n = αb_i
           ⋮
       a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + ··· + a_{mn}x_n = b_m

   Let S denote the solutions to the system in the statement of the theorem, and let T denote the solutions to the transformed system.

   (a) Show S ⊆ T. Suppose (x_1, x_2, x_3, ..., x_n) = (β_1, β_2, β_3, ..., β_n) ∈ S is a solution to the original system. Ignoring the i-th equation for a moment, we know it makes all the other equations of the transformed system true. We also know that

           a_{i1}β_1 + a_{i2}β_2 + a_{i3}β_3 + ··· + a_{in}β_n = b_i

       which we can multiply by α to get

           αa_{i1}β_1 + αa_{i2}β_2 + αa_{i3}β_3 + ··· + αa_{in}β_n = αb_i

       This says that the i-th equation of the transformed system is also true, so we have established that (β_1, β_2, β_3, ..., β_n) ∈ T, and therefore S ⊆ T.
   (b) Now show T ⊆ S. Suppose (x_1, x_2, x_3, ..., x_n) = (β_1, β_2, β_3, ..., β_n) ∈ T is a solution to the transformed system. Ignoring the i-th equation for a moment, we know it makes all the other equations of the original system true. We also know that

           αa_{i1}β_1 + αa_{i2}β_2 + αa_{i3}β_3 + ··· + αa_{in}β_n = αb_i

       which we can multiply by 1/α, since α ≠ 0, to get

           a_{i1}β_1 + a_{i2}β_2 + a_{i3}β_3 + ··· + a_{in}β_n = b_i

       This says that the i-th equation of the original system is also true, so we have established that (β_1, β_2, β_3, ..., β_n) ∈ S, and therefore T ⊆ S. Locate the key point where we required that α ≠ 0, and consider what would happen if α = 0.
3. Suppose α is a number. Let us choose to multiply the terms of equation i by α and add them to equation j in order to build the new system of equations,

       a_{11}x_1 + a_{12}x_2 + ··· + a_{1n}x_n = b_1
       a_{21}x_1 + a_{22}x_2 + ··· + a_{2n}x_n = b_2
       a_{31}x_1 + a_{32}x_2 + ··· + a_{3n}x_n = b_3
           ⋮
       (αa_{i1} + a_{j1})x_1 + (αa_{i2} + a_{j2})x_2 + ··· + (αa_{in} + a_{jn})x_n = αb_i + b_j
           ⋮
       a_{m1}x_1 + a_{m2}x_2 + ··· + a_{mn}x_n = b_m

   Let S denote the solutions to the system in the statement of the theorem, and let T denote the solutions to the transformed system.

   (a) Show S ⊆ T. Suppose (x_1, x_2, x_3, ..., x_n) = (β_1, β_2, β_3, ..., β_n) ∈ S is a solution to the original system. Ignoring the j-th equation for a moment, we know this solution makes all the other equations of the transformed system true. Using the fact that the solution makes the i-th and j-th equations of the original system true, we find

           (αa_{i1} + a_{j1})β_1 + (αa_{i2} + a_{j2})β_2 + ··· + (αa_{in} + a_{jn})β_n
             = (αa_{i1}β_1 + αa_{i2}β_2 + ··· + αa_{in}β_n) + (a_{j1}β_1 + a_{j2}β_2 + ··· + a_{jn}β_n)
             = α(a_{i1}β_1 + a_{i2}β_2 + ··· + a_{in}β_n) + (a_{j1}β_1 + a_{j2}β_2 + ··· + a_{jn}β_n)
             = αb_i + b_j

       This says that the j-th equation of the transformed system is also true, so we have established that (β_1, β_2, β_3, ..., β_n) ∈ T, and therefore S ⊆ T.
   (b) Now show T ⊆ S. Suppose (x_1, x_2, x_3, ..., x_n) = (β_1, β_2, β_3, ..., β_n) ∈ T is a solution to the transformed system. Ignoring the j-th equation for a moment, we know it makes all the other equations of the original system true. We then find

           a_{j1}β_1 + ··· + a_{jn}β_n
             = a_{j1}β_1 + ··· + a_{jn}β_n + αb_i − αb_i
             = a_{j1}β_1 + ··· + a_{jn}β_n + (αa_{i1}β_1 + ··· + αa_{in}β_n) − αb_i
             = a_{j1}β_1 + αa_{i1}β_1 + ··· + a_{jn}β_n + αa_{in}β_n − αb_i
             = (αa_{i1} + a_{j1})β_1 + ··· + (αa_{in} + a_{jn})β_n − αb_i
             = αb_i + b_j − αb_i
             = b_j
       This says that the j-th equation of the original system is also true, so we have established that (β_1, β_2, β_3, ..., β_n) ∈ S, and therefore T ⊆ S.

   Why did we not need to require that α ≠ 0 for this row operation? In other words, how does the third statement of the theorem read when α = 0? Does our proof require some extra care when α = 0? Compare your answers with the similar situation for the second row operation. (See Exercise SSLE.T20.)
Theorem EOPSS is the necessary tool to complete our strategy for solving systems of equations. We will use equation operations to move from one system to another, all the while keeping the solution set the same. With the right sequence of operations, we will arrive at a simpler equation to solve. The next two examples illustrate this idea, while saving some of the details for later.
Example US Three equations, one solution
We solve the following system by a sequence of equation operations.

    x_1 + 2x_2 + 2x_3 = 4
    x_1 + 3x_2 + 3x_3 = 5
    2x_1 + 6x_2 + 5x_3 = 6

α = −1 times equation 1, add to equation 2:

    x_1 + 2x_2 + 2x_3 = 4
    0x_1 + 1x_2 + 1x_3 = 1
    2x_1 + 6x_2 + 5x_3 = 6

α = −2 times equation 1, add to equation 3:

    x_1 + 2x_2 + 2x_3 = 4
    0x_1 + 1x_2 + 1x_3 = 1
    0x_1 + 2x_2 + 1x_3 = −2

α = −2 times equation 2, add to equation 3:

    x_1 + 2x_2 + 2x_3 = 4
    0x_1 + 1x_2 + 1x_3 = 1
    0x_1 + 0x_2 − 1x_3 = −4

α = −1 times equation 3:

    x_1 + 2x_2 + 2x_3 = 4
    0x_1 + 1x_2 + 1x_3 = 1
    0x_1 + 0x_2 + 1x_3 = 4

which can be written more clearly as

    x_1 + 2x_2 + 2x_3 = 4
          x_2 +  x_3 = 1
                 x_3 = 4

This is now a very easy system of equations to solve. The third equation requires that x_3 = 4 to be true. Making this substitution into equation 2 we arrive at x_2 = −3, and finally, substituting these values of x_2 and x_3 into the first equation, we find that x_1 = 2. Note too that this is the only solution to this final system of equations, since we were forced to choose these values to make the equations true. Since we performed equation operations on each system to obtain the next one in the list, all of the systems listed here are equivalent to each other by Theorem EOPSS. Thus (x_1, x_2, x_3) = (2, −3, 4) is the unique solution to the original system of equations (and all of the other intermediate systems of equations listed as we transformed one into another).
△
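A direct substitution confirms the conclusion of the example. This short Python check (ours, not from the text) evaluates each original equation at (2, −3, 4):

```python
# Original system of Example US, one equation per row: coefficients, then constant.
system = [[1, 2, 2, 4],   # x1 + 2x2 + 2x3 = 4
          [1, 3, 3, 5],   # x1 + 3x2 + 3x3 = 5
          [2, 6, 5, 6]]   # 2x1 + 6x2 + 5x3 = 6
x = (2, -3, 4)

for *coeffs, rhs in system:
    # Evaluate the left-hand side at x and compare with the right-hand side.
    assert sum(c * xi for c, xi in zip(coeffs, x)) == rhs
print("(2, -3, 4) satisfies all three equations")
```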
Example IS Three equations, infinitely many solutions
The following system of equations made an appearance earlier in this section (Example NSE), where we listed one of its solutions. Now, we will try to find all of the solutions to this system. Do not concern yourself too much about why we choose this particular sequence of equation operations, just believe that the work we do is all correct.

    x_1 + 2x_2 + 0x_3 + x_4 = 7
    x_1 + x_2 + x_3 − x_4 = 3
    3x_1 + x_2 + 5x_3 − 7x_4 = 1

α = −1 times equation 1, add to equation 2:

    x_1 + 2x_2 + 0x_3 + x_4 = 7
    0x_1 − x_2 + x_3 − 2x_4 = −4
    3x_1 + x_2 + 5x_3 − 7x_4 = 1

α = −3 times equation 1, add to equation 3:

    x_1 + 2x_2 + 0x_3 + x_4 = 7
    0x_1 − x_2 + x_3 − 2x_4 = −4
    0x_1 − 5x_2 + 5x_3 − 10x_4 = −20

α = −5 times equation 2, add to equation 3:

    x_1 + 2x_2 + 0x_3 + x_4 = 7
    0x_1 − x_2 + x_3 − 2x_4 = −4
    0x_1 + 0x_2 + 0x_3 + 0x_4 = 0

α = −1 times equation 2:

    x_1 + 2x_2 + 0x_3 + x_4 = 7
    0x_1 + x_2 − x_3 + 2x_4 = 4
    0x_1 + 0x_2 + 0x_3 + 0x_4 = 0

α = −2 times equation 2, add to equation 1:

    x_1 + 0x_2 + 2x_3 − 3x_4 = −1
    0x_1 + x_2 − x_3 + 2x_4 = 4
    0x_1 + 0x_2 + 0x_3 + 0x_4 = 0

which can be written more clearly as

    x_1 + 2x_3 − 3x_4 = −1
    x_2 − x_3 + 2x_4 = 4
    0 = 0
What does the equation 0 = 0 mean? We can choose any values for x_1, x_2, x_3, x_4 and this equation will be true, so we only need to consider further the first two equations, since the third is true no matter what. We can analyze the second equation without consideration of the variable x_1. It would appear that there is considerable latitude in how we can choose x_2, x_3, x_4 and make this equation true. Let us choose x_3 and x_4 to be anything we please, say x_3 = a and x_4 = b.
Now we can take these arbitrary values for x_3 and x_4, substitute them in equation 1, to obtain

    x_1 + 2a − 3b = −1
    x_1 = −1 − 2a + 3b

Similarly, equation 2 becomes

    x_2 − a + 2b = 4
    x_2 = 4 + a − 2b

So our arbitrary choices of values for x_3 and x_4 (a and b) translate into specific values of x_1 and x_2. The lone solution given in Example NSE was obtained by choosing a = 2 and b = 1. Now we can easily and quickly find many more (infinitely more). Suppose we choose a = 5 and b = −2, then we compute

    x_1 = −1 − 2(5) + 3(−2) = −17
    x_2 = 4 + 5 − 2(−2) = 13

and you can verify that (x_1, x_2, x_3, x_4) = (−17, 13, 5, −2) makes all three equations true. The entire solution set is written as

    S = { (−1 − 2a + 3b, 4 + a − 2b, a, b) | a ∈ C, b ∈ C }

It would be instructive to finish off your study of this example by taking the general form of the solutions given in this set, substituting them into each of the three equations, and verifying that they are true in each case (Exercise SSLE.M40).
△
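Because the solution set is described by formulas in a and b, we can spot-check it for many parameter values at once. A Python sketch (ours; it uses exact rational arithmetic via `fractions` so there is no floating-point noise):

```python
from fractions import Fraction
import random

def solution(a, b):
    """General solution of Example IS, parametrized by x3 = a, x4 = b."""
    return (-1 - 2 * a + 3 * b, 4 + a - 2 * b, a, b)

# The original system: coefficient rows and right-hand sides.
system = [([1, 2, 0, 1], 7),
          ([1, 1, 1, -1], 3),
          ([3, 1, 5, -7], 1)]

for _ in range(100):
    a = Fraction(random.randint(-50, 50), random.randint(1, 9))
    b = Fraction(random.randint(-50, 50), random.randint(1, 9))
    x = solution(a, b)
    assert all(sum(c * xi for c, xi in zip(coeffs, x)) == rhs
               for coeffs, rhs in system)
print("general solution verified for 100 random choices of (a, b)")
```

A spot check is not a proof, of course; Exercise SSLE.M40 asks for the symbolic verification.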
In the next section we will describe how to use equation operations to systematically solve any system of linear equations. But first, read one of our more important pieces of advice about speaking and writing mathematics. See Proof Technique L.
Before attacking the exercises in this section, it will be helpful to read some advice on getting started on the construction of a proof. See Proof Technique GS.
Reading Questions

1. How many solutions does the system of equations 3x + 2y = 4, 6x + 4y = 8 have? Explain your answer.

2. How many solutions does the system of equations 3x + 2y = 4, 6x + 4y = −2 have? Explain your answer.

3. What do we mean when we say mathematics is a language?
Exercises

C10 Find a solution to the system in Example IS where x_3 = 6 and x_4 = 2. Find two other solutions to the system. Find a solution where x_1 = −17 and x_2 = 14. How many possible answers are there to each of these questions?

C20 Each archetype (Archetypes) that is a system of equations begins by listing some specific solutions. Verify the specific solutions listed in the following archetypes by evaluating the system of equations with the solutions listed.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
C30† Find all solutions to the linear system:

    x + y = 5
    2x − y = 3

C31 Find all solutions to the linear system:

    3x + 2y = 1
    x − y = 2
    4x + 2y = 2
C32 Find all solutions to the linear system:

    x + 2y = 8
    x − y = 2
    x + y = 4

C33 Find all solutions to the linear system:

    x + y − z = −1
    x − y − z = −1
    z = 2

C34 Find all solutions to the linear system:

    x + y − z = −5
    x − y − z = −3
    x + y − z = 0
C50† A three-digit number has two properties. The tens-digit and the ones-digit add up to 5. If the number is written with the digits in the reverse order, and then subtracted from the original number, the result is 792. Use a system of equations to find all of the three-digit numbers with these properties.

C51† Find all of the six-digit numbers in which the first digit is one less than the second, the third digit is half the second, the fourth digit is three times the third and the last two digits form a number that equals the sum of the fourth and fifth. The sum of all the digits is 24. (From The MENSA Puzzle Calendar for January 9, 2006.)

C52† Driving along, Terry notices that the last four digits on his car’s odometer are palindromic. A mile later, the last five digits are palindromic. After driving another mile, the middle four digits are palindromic. One more mile, and all six are palindromic. What was the odometer reading when Terry first looked at it? Form a linear system of equations that expresses the requirements of this puzzle. (Car Talk Puzzler, National Public Radio, Week of January 21, 2008) (A car odometer displays six digits and a sequence is a palindrome if it reads the same left-to-right as right-to-left.)
C53† An article in The Economist (“Free Exchange”, December 6, 2014) quotes the following problem as an illustration that some of the “underlying assumptions of classical economics” about people’s behavior are incorrect and “the mind plays tricks.” A bat and ball cost $1.10 between them. The bat costs $1 more than the ball. How much does each cost? Answer this quickly with no writing, then construct a system of linear equations and solve the problem carefully.
M10† Each sentence below has at least two meanings. Identify the source of the double meaning, and rewrite the sentence (at least twice) to clearly convey each meaning.

1. They are baking potatoes.
2. He bought many ripe pears and apricots.

3. She likes his sculpture.

4. I decided on the bus.
M11† Discuss the difference in meaning of each of the following three almost identical sentences, which all have the same grammatical structure. (These are due to Keith Devlin.)

1. She saw him in the park with a dog.

2. She saw him in the park with a fountain.

3. She saw him in the park with a telescope.

M12† The following sentence, due to Noam Chomsky, has a correct grammatical structure, but is meaningless. Critique its faults. “Colorless green ideas sleep furiously.” (Chomsky, Noam. Syntactic Structures, The Hague/Paris: Mouton, 1957. p. 15.)

M13† Read the following sentence and form a mental picture of the situation.
The baby cried and the mother picked it up.
What assumptions did you make about the situation?

M14 Discuss the difference in meaning of the following two almost identical sentences, which have nearly identical grammatical structure. (This antanaclasis is often attributed to the comedian Groucho Marx, but has earlier roots.)

1. Time flies like an arrow.

2. Fruit flies like a banana.

M30† This problem appears in a middle-school mathematics textbook: Together Dan and Diane have $20. Together Diane and Donna have $15. How much do the three of them have in total? (Transition Mathematics, Second Edition, Scott Foresman Addison Wesley, 1998. Problem 5-1.19.)
M40 Solutions to the system in Example IS are given as

    (x_1, x_2, x_3, x_4) = (−1 − 2a + 3b, 4 + a − 2b, a, b)

Evaluate the three equations of the original system with these expressions in a and b and verify that each equation is true, no matter what values are chosen for a and b.
M70† We have seen in this section that systems of linear equations have limited possibilities for solution sets, and we will shortly prove Theorem PSSLS that describes these possibilities exactly. This exercise will show that if we relax the requirement that our equations be linear, then the possibilities expand greatly. Consider a system of two equations in the two variables x and y, where the departure from linearity involves simply squaring the variables.

    x^2 − y^2 = 1
    x^2 + y^2 = 4

After solving this system of nonlinear equations, replace the second equation in turn by x^2 + 2x + y^2 = 3, x^2 + y^2 = 1, x^2 − 4x + y^2 = −3, −x^2 + y^2 = 1 and solve each resulting system of two equations in two variables. (This exercise includes suggestions from Don Kreher.)
T10† Proof Technique D asks you to formulate a definition of what it means for a whole number to be odd. What is your definition? (Do not say “the opposite of even.”) Is 6 odd? Is 11 odd? Justify your answers by using your definition.

T20† Explain why the second equation operation in Definition EO requires that the scalar be nonzero, while in the third equation operation this restriction on the scalar is not present.
Section RREF
Reduced Row-Echelon Form

After solving a few systems of equations, you will recognize that it does not matter so much what we call our variables, as opposed to what numbers act as their coefficients. A system in the variables x_1, x_2, x_3 would behave the same if we changed the names of the variables to a, b, c and kept all the constants the same and in the same places. In this section, we will isolate the key bits of information about a system of equations into something called a matrix, and then use this matrix to systematically solve the equations. Along the way we will obtain one of our most important and useful computational tools.
Subsection MVNSE
Matrix and Vector Notation for Systems of Equations

Definition M Matrix
An m × n matrix is a rectangular layout of numbers from C having m rows and n columns. We will use upper-case Latin letters from the start of the alphabet (A, B, C, ...) to denote matrices and squared-off brackets to delimit the layout. Many use large parentheses instead of brackets; the distinction is not important. Rows of a matrix will be referenced starting at the top and working down (i.e. row 1 is at the top) and columns will be referenced starting from the left (i.e. column 1 is at the left). For a matrix A, the notation [A]_{ij} will refer to the complex number in row i and column j of A. □

Be careful with this notation for individual entries, since it is easy to think that [A]_{ij} refers to the whole matrix. It does not. It is just a number, but is a convenient way to talk about the individual entries simultaneously. This notation will get a heavy workout once we get to Chapter M.
Example AM A matrix

    B = [ −1   2   5   3 ]
        [  1   0  −6   1 ]
        [ −4   2   2  −2 ]

is a matrix with m = 3 rows and n = 4 columns. We can say that [B]_{2,3} = −6 while [B]_{3,4} = −2.
△
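In code a matrix is naturally a list of rows, and the 1-based [B]_{ij} of the text maps to 0-based indices. A small Python illustration (ours, not the book's notation):

```python
# The matrix B of Example AM as a list of rows.  [B]_{ij} in the text is
# 1-based, so entry (i, j) is B[i - 1][j - 1] here.
B = [[-1, 2, 5, 3],
     [1, 0, -6, 1],
     [-4, 2, 2, -2]]

def entry(M, i, j):
    """Return [M]_{ij}, using the text's 1-based row/column indexing."""
    return M[i - 1][j - 1]

print(entry(B, 2, 3))  # -6
print(entry(B, 3, 4))  # -2
```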
When we do equation operations on a system of equations, the names of the variables really are not very important. Use x_1, x_2, x_3, or a, b, c, or x, y, z, it really does not matter. In this subsection we will describe some notation that will make it easier to describe linear systems, solve the systems and describe the solution sets. Here is a list of definitions, laden with notation.
Definition CV Column Vector
A column vector of size m is an ordered list of m numbers, which is written in order vertically, starting at the top and proceeding to the bottom. At times, we will refer to a column vector as simply a vector. Column vectors will be written in bold, usually with lower case Latin letters from the end of the alphabet such as u, v, w, x, y, z. Some books like to write vectors with arrows, such as ⃗u. Writing by hand, some like to put arrows on top of the symbol, or a tilde underneath the symbol, as in u~. To refer to the entry or component of vector v in location i of the list, we write [v]_i. □

Be careful with this notation. While the symbols [v]_i might look somewhat substantial, as an object this represents just one entry of a vector, which is just a single complex number.
Definition ZCV Zero Column Vector
The zero vector of size $m$ is the column vector of size $m$ where each entry is the number zero,
\[
\mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]
or defined much more compactly, $[\mathbf{0}]_i = 0$ for $1 \le i \le m$. □
Definition CM Coefficient Matrix
For a system of linear equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
the coefficient matrix is the $m \times n$ matrix
\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\
a_{31} & a_{32} & a_{33} & \dots & a_{3n} \\
\vdots \\
a_{m1} & a_{m2} & a_{m3} & \dots & a_{mn}
\end{bmatrix}
\]
□

Definition VOC Vector of Constants
For a system of linear equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
the vector of constants is the column vector of size $m$
\[
\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_m \end{bmatrix}
\]
□

Definition SOLV Solution Vector
For a system of linear equations,
\begin{align*}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n &= b_m
\end{align*}
the solution vector is the column vector of size $n$
\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}
\]
□

The solution vector may do double-duty on occasion. It might refer to a list of variable quantities at one point, and subsequently refer to values of those variables that actually form a particular solution to that system.

Definition MRLS Matrix Representation of a Linear System
If $A$ is the coefficient matrix of a system of linear equations and $\mathbf{b}$ is the vector of constants, then we will write $\mathcal{LS}(A, \mathbf{b})$ as a shorthand expression for the system of linear equations, which we will refer to as the matrix representation of the linear
system. □
Example NSLE Notation for systems of linear equations
The system of linear equations
\begin{align*}
2x_1 + 4x_2 - 3x_3 + 5x_4 + x_5 &= 9 \\
3x_1 + x_2 + x_4 - 3x_5 &= 0 \\
-2x_1 + 7x_2 - 5x_3 + 2x_4 + 2x_5 &= -3
\end{align*}
has coefficient matrix
\[
A = \begin{bmatrix}
2 & 4 & -3 & 5 & 1 \\
3 & 1 & 0 & 1 & -3 \\
-2 & 7 & -5 & 2 & 2
\end{bmatrix}
\]
and vector of constants
\[
\mathbf{b} = \begin{bmatrix} 9 \\ 0 \\ -3 \end{bmatrix}
\]
and so will be referenced as $\mathcal{LS}(A, \mathbf{b})$. △
Definition AM Augmented Matrix
Suppose we have a system of $m$ equations in $n$ variables, with coefficient matrix $A$ and vector of constants $\mathbf{b}$. Then the augmented matrix of the system of equations is the $m \times (n + 1)$ matrix whose first $n$ columns are the columns of $A$ and whose last column ($n + 1$) is the column vector $\mathbf{b}$. This matrix will be written as $[A \mid \mathbf{b}]$. □

The augmented matrix represents all the important information in the system of equations, since the names of the variables have been ignored, and the only connection with the variables is the location of their coefficients in the matrix. It is important to realize that the augmented matrix is just that, a matrix, and not a system of equations. In particular, the augmented matrix does not have any "solutions," though it will be useful for finding solutions to the system of equations that it is associated with. (Think about your objects, and review Proof Technique L.) However, notice that an augmented matrix always belongs to some system of equations, and vice versa, so it is tempting to try and blur the distinction between the two. Here is a quick example.
Example AMAA Augmented matrix for Archetype A
Archetype A is the following system of 3 equations in 3 variables.
\begin{align*}
x_1 - x_2 + 2x_3 &= 1 \\
2x_1 + x_2 + x_3 &= 8 \\
x_1 + x_2 &= 5
\end{align*}
Here is its augmented matrix.
\[
\begin{bmatrix}
1 & -1 & 2 & 1 \\
2 & 1 & 1 & 8 \\
1 & 1 & 0 & 5
\end{bmatrix}
\]
△
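In code, forming the augmented matrix of Definition AM is just appending the constant $b_i$ to the end of row $i$ of the coefficient matrix. A minimal sketch, using Archetype A from this example (the function name `augment` is our own, not the book's):

```python
def augment(A, b):
    """Form [A | b]: append entry i of b to row i of A."""
    return [row + [bi] for row, bi in zip(A, b)]

# Coefficient matrix and vector of constants for Archetype A.
A = [[1, -1, 2],
     [2, 1, 1],
     [1, 1, 0]]
b = [1, 8, 5]

print(augment(A, b))
# The m x (n+1) augmented matrix shown above:
# [[1, -1, 2, 1], [2, 1, 1, 8], [1, 1, 0, 5]]
```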
Subsection RO
Row Operations

An augmented matrix for a system of equations will save us the tedium of continually writing down the names of the variables as we solve the system. It will also release us from any dependence on the actual names of the variables. We have seen how certain operations we can perform on equations (Definition EO) will preserve their solutions (Theorem EOPSS). The next two definitions and the following theorem carry over these ideas to augmented matrices.
Definition RO Row Operations
The following three operations will transform an $m \times n$ matrix into a different matrix of the same size, and each is known as a row operation.

1. Swap the locations of two rows.

2. Multiply each entry of a single row by a nonzero quantity.

3. Multiply each entry of one row by some quantity, and add these values to the entries in the same columns of a second row. Leave the first row the same after this operation, but replace the second row by the new values.

We will use a symbolic shorthand to describe these row operations:

1. $R_i \leftrightarrow R_j$: Swap the location of rows $i$ and $j$.

2. $\alpha R_i$: Multiply row $i$ by the nonzero scalar $\alpha$.

3. $\alpha R_i + R_j$: Multiply row $i$ by the scalar $\alpha$ and add to row $j$.

□
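The three row operations are easy to express in code, which also makes their effect concrete. A hedged sketch in Python, operating in place on a matrix stored as a list of rows (the function names are ours; the book defines only the mathematical operations):

```python
def swap_rows(M, i, j):
    """R_i <-> R_j (1-based indices, matching the text)."""
    M[i - 1], M[j - 1] = M[j - 1], M[i - 1]

def scale_row(M, alpha, i):
    """alpha R_i: multiply each entry of row i by the nonzero scalar alpha."""
    M[i - 1] = [alpha * x for x in M[i - 1]]

def add_multiple(M, alpha, i, j):
    """alpha R_i + R_j: add alpha times row i to row j; row i is unchanged."""
    M[j - 1] = [alpha * x + y for x, y in zip(M[i - 1], M[j - 1])]

# The two operations used in Example TREM:
M = [[2, -1, 3, 4], [5, 2, -2, 3], [1, 1, 0, 6]]
swap_rows(M, 1, 3)        # R_1 <-> R_3
add_multiple(M, -2, 1, 2) # -2 R_1 + R_2
print(M)  # [[1, 1, 0, 6], [3, 0, -2, -9], [2, -1, 3, 4]]
```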
Definition REM Row-Equivalent Matrices
Two matrices, $A$ and $B$, are row-equivalent if one can be obtained from the other by a sequence of row operations. □

Example TREM Two row-equivalent matrices
The matrices
\[
A = \begin{bmatrix}
2 & -1 & 3 & 4 \\
5 & 2 & -2 & 3 \\
1 & 1 & 0 & 6
\end{bmatrix}
\qquad
B = \begin{bmatrix}
1 & 1 & 0 & 6 \\
3 & 0 & -2 & -9 \\
2 & -1 & 3 & 4
\end{bmatrix}
\]
are row-equivalent as can be seen from
\[
\begin{bmatrix}
2 & -1 & 3 & 4 \\
5 & 2 & -2 & 3 \\
1 & 1 & 0 & 6
\end{bmatrix}
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix}
1 & 1 & 0 & 6 \\
5 & 2 & -2 & 3 \\
2 & -1 & 3 & 4
\end{bmatrix}
\xrightarrow{-2R_1 + R_2}
\begin{bmatrix}
1 & 1 & 0 & 6 \\
3 & 0 & -2 & -9 \\
2 & -1 & 3 & 4
\end{bmatrix}
\]
We can also say that any pair of these three matrices are row-equivalent. △

Notice that each of the three row operations is reversible (Exercise RREF.T10), so we do not have to be careful about the distinction between "$A$ is row-equivalent to $B$" and "$B$ is row-equivalent to $A$." (Exercise RREF.T11)
The preceding definitions are designed to make the following theorem possible. It says that row-equivalent matrices represent systems of linear equations that have identical solution sets.

Theorem REMES Row-Equivalent Matrices represent Equivalent Systems
Suppose that $A$ and $B$ are row-equivalent augmented matrices. Then the systems of linear equations that they represent are equivalent systems.

Proof. If we perform a single row operation on an augmented matrix, it will have the same effect as if we did the analogous equation operation on the system of equations the matrix represents. By exactly the same methods as we used in the proof of Theorem EOPSS we can see that each of these row operations will preserve the set of solutions for the system of equations the matrix represents. ■

So at this point, our strategy is to begin with a system of equations, represent the system by an augmented matrix, perform row operations (which will preserve solutions for the system) to get a "simpler" augmented matrix, convert back to a "simpler" system of equations and then solve that system, knowing that its solutions are those of the original system. Here is a rehash of Example US as an exercise in using our new tools.
Example USR Three equations, one solution, reprised
We solve the following system using augmented matrices and row operations. This is the same system of equations solved in Example US using equation operations.
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4 \\
x_1 + 3x_2 + 3x_3 &= 5 \\
2x_1 + 6x_2 + 5x_3 &= 6
\end{align*}
Form the augmented matrix,
\[
A = \begin{bmatrix}
1 & 2 & 2 & 4 \\
1 & 3 & 3 & 5 \\
2 & 6 & 5 & 6
\end{bmatrix}
\]
and apply row operations,
\[
\xrightarrow{-1R_1 + R_2}
\begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 1 & 1 & 1 \\ 2 & 6 & 5 & 6 \end{bmatrix}
\xrightarrow{-2R_1 + R_3}
\begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 2 & 1 & -2 \end{bmatrix}
\]
\[
\xrightarrow{-2R_2 + R_3}
\begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & -1 & -4 \end{bmatrix}
\xrightarrow{-1R_3}
\begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 4 \end{bmatrix}
\]
So the matrix
\[
B = \begin{bmatrix} 1 & 2 & 2 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 4 \end{bmatrix}
\]
is row-equivalent to $A$ and by Theorem REMES the system of equations below has the same solution set as the original system of equations.
\begin{align*}
x_1 + 2x_2 + 2x_3 &= 4 \\
x_2 + x_3 &= 1 \\
x_3 &= 4
\end{align*}
Solving this "simpler" system is straightforward and is identical to the process in Example US. △
Subsection RREF
Reduced Row-Echelon Form

The preceding example amply illustrates the definitions and theorems we have seen so far. But it still leaves two questions unanswered. Exactly what is this "simpler" form for a matrix, and just how do we get it? Here is the answer to the first question, a definition of reduced row-echelon form.

Definition RREF Reduced Row-Echelon Form
A matrix is in reduced row-echelon form if it meets all of the following conditions:

1. If there is a row where every entry is zero, then this row lies below any other row that contains a nonzero entry.

2. The leftmost nonzero entry of a row is equal to 1.

3. The leftmost nonzero entry of a row is the only nonzero entry in its column.

4. Consider any two different leftmost nonzero entries, one located in row $i$, column $j$ and the other located in row $s$, column $t$. If $s > i$, then $t > j$.

A row of only zero entries is called a zero row and the leftmost nonzero entry of a nonzero row is a leading 1. A column containing a leading 1 will be called a pivot column. The number of nonzero rows will be denoted by $r$, which is also equal to the number of leading 1's and the number of pivot columns.

The set of column indices for the pivot columns will be denoted by $D = \{d_1, d_2, d_3, \ldots, d_r\}$ where $d_1 < d_2 < d_3 < \cdots < d_r$, while the columns that are not pivot columns will be denoted as $F = \{f_1, f_2, f_3, \ldots, f_{n-r}\}$ where $f_1 < f_2 < f_3 < \cdots < f_{n-r}$. □
Theorem REMEF Row-Equivalent Matrix in Echelon Form
Suppose $A$ is a matrix. Then there is a matrix $B$ so that:

1. $A$ and $B$ are row-equivalent.

2. $B$ is in reduced row-echelon form.
Proof. Suppose that $A$ has $m$ rows and $n$ columns. We will describe a process for converting $A$ into $B$ via row operations. This procedure is known as Gauss-Jordan elimination. Tracing through this procedure will be easier if you recognize that $i$ refers to a row that is being converted, $j$ refers to a column that is being converted, and $r$ keeps track of the number of nonzero rows. Here we go.

1. Set $j = 0$ and $r = 0$.

2. Increase $j$ by 1. If $j$ now equals $n + 1$, then stop.

3. Examine the entries of $A$ in column $j$ located in rows $r + 1$ through $m$. If all of these entries are zero, then go to Step 2.

4. Choose a row from rows $r + 1$ through $m$ with a nonzero entry in column $j$. Let $i$ denote the index for this row.

5. Increase $r$ by 1.

6. Use the first row operation to swap rows $i$ and $r$.

7. Use the second row operation to convert the entry in row $r$ and column $j$ to a 1.

8. Use the third row operation with row $r$ to convert every other entry of column $j$ to zero.

9. Go to Step 2.

The result of this procedure is that the matrix $A$ is converted to a matrix in reduced row-echelon form, which we will refer to as $B$. We need to now prove this claim by showing that the converted matrix has the requisite properties of Definition RREF. First, the matrix is only converted through row operations (Steps 6, 7, 8), so $A$ and $B$ are row-equivalent (Definition REM).

It is a bit more work to be certain that $B$ is in reduced row-echelon form. We claim that as we begin Step 2, the first $j$ columns of the matrix are in reduced row-echelon form with $r$ nonzero rows. Certainly this is true at the start when $j = 0$, since the matrix has no columns and so vacuously meets the conditions of Definition RREF with $r = 0$ nonzero rows.
In Step 2 we increase $j$ by 1 and begin to work with the next column. There are two possible outcomes for Step 3. Suppose that every entry of column $j$ in rows $r + 1$ through $m$ is zero. Then with no changes we recognize that the first $j$ columns of the matrix have their first $r$ rows still in reduced row-echelon form, with the final $m - r$ rows still all zero.

Suppose instead that the entry in row $i$ of column $j$ is nonzero. Notice that since $r + 1 \le i \le m$, we know the first $j - 1$ entries of this row are all zero. Now, in Step 5 we increase $r$ by 1, and then embark on building a new nonzero row. In Step 6 we swap row $r$ and row $i$. In the first $j$ columns, the first $r - 1$ rows remain in reduced row-echelon form after the swap. In Step 7 we multiply row $r$ by a nonzero scalar, creating a 1 in the entry in column $j$ of row $r$, and not changing any other rows. This new leading 1 is the first nonzero entry in its row, and is located to the right of all the leading 1's in the preceding $r - 1$ rows. With Step 8 we ensure that every entry in the column with this new leading 1 is now zero, as required for reduced row-echelon form. Also, rows $r + 1$ through $m$ are now all zeros in the first $j$ columns, so we now have only one new nonzero row, consistent with our increase of $r$ by one. Furthermore, since the first $j - 1$ entries of row $r$ are zero, the employment of the third row operation does not destroy any of the necessary features of rows 1 through $r - 1$ and rows $r + 1$ through $m$, in columns 1 through $j - 1$.

So at this stage, the first $j$ columns of the matrix are in reduced row-echelon form. When Step 2 finally increases $j$ to $n + 1$, then the procedure is completed and the full $n$ columns of the matrix are in reduced row-echelon form, with the value of $r$ correctly recording the number of nonzero rows. ■
The procedure given in the proof of Theorem REMEF can be more precisely described using a pseudo-code version of a computer program. Single-letter variables, like m, n, i, j, r have the same meanings as above. := is assignment of the value on the right to the variable on the left, A[i,j] is the equivalent of the matrix entry $[A]_{ij}$, while == is an equality test and <> is a "not equals" test.

input m, n and A
r := 0
for j := 1 to n
    i := r+1
    while i <= m and A[i,j] == 0
        i := i+1
    if i < m+1
        r := r+1
        swap rows i and r of A (row op 1)
        scale entry in row r, column j of A to a leading 1 (row op 2)
        for k := 1 to m, k <> r
            make entry in row k, column j of A zero (row op 3, employing row r)
output r and A

Notice that as a practical matter, the "and" used in the conditional statement of the while statement should be of the "short-circuit" variety so that the array access that follows is not out-of-bounds.
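The pseudo-code translates almost line-for-line into a working program. Here is one possible Python rendering; the 0-based indexing and the use of exact rational arithmetic via `fractions` are our implementation choices, not part of the text:

```python
from fractions import Fraction

def gauss_jordan(A):
    """Row-reduce A to reduced row-echelon form.
    Returns (r, B): the number of nonzero rows and the reduced matrix."""
    B = [[Fraction(x) for x in row] for row in A]  # work on an exact copy
    m, n = len(B), len(B[0])
    r = 0
    for j in range(n):                  # for j := 1 to n
        i = r                           # i := r+1 (row index r, 0-based)
        while i < m and B[i][j] == 0:   # short-circuit "and", as noted above
            i += 1
        if i < m:                       # found a nonzero entry in column j
            B[r], B[i] = B[i], B[r]     # row op 1: swap rows i and r
            B[r] = [x / B[r][j] for x in B[r]]  # row op 2: create leading 1
            for k in range(m):          # row op 3: clear the rest of column j
                if k != r and B[k][j] != 0:
                    B[k] = [x - B[k][j] * y for x, y in zip(B[k], B[r])]
            r += 1
    return r, B

# Example USR's augmented matrix, fully reduced; the last column is
# the unique solution of Example US: x1 = 2, x2 = -3, x3 = 4.
r, B = gauss_jordan([[1, 2, 2, 4], [1, 3, 3, 5], [2, 6, 5, 6]])
print(r, B)
```

Note that this routine carries the reduction further than Example USR does on the page: it continues all the way to reduced row-echelon form, clearing entries above each leading 1 as well as below.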
So now we can put it all together. Begin with a system of linear equations (Definition SLE), and represent the system by its augmented matrix (Definition AM). Use row operations (Definition RO) to convert this matrix into reduced row-echelon form (Definition RREF), using the procedure outlined in the proof of Theorem REMEF. Theorem REMEF also tells us we can always accomplish this, and that the result is row-equivalent (Definition REM) to the original augmented matrix. Since the matrix in reduced row-echelon form has the same solution set, we can analyze the row-reduced version instead of the original matrix, viewing it as the augmented matrix of a different system of equations. The beauty of augmented matrices in reduced row-echelon form is that the solution sets to the systems they represent can be easily determined, as we will see in the next few examples and in the next section.

We will see through the course that almost every interesting property of a matrix can be discerned by looking at a row-equivalent matrix in reduced row-echelon form. For this reason it is important to know that the matrix $B$ guaranteed to exist by Theorem REMEF is also unique.

Two proof techniques are applicable to the proof. First, head out and read two proof techniques: Proof Technique CD and Proof Technique U.
Theorem RREFU Reduced Row-Echelon Form is Unique
Suppose that $A$ is an $m \times n$ matrix and that $B$ and $C$ are $m \times n$ matrices that are row-equivalent to $A$ and in reduced row-echelon form. Then $B = C$.

Proof. We need to begin with no assumptions about any relationships between $B$ and $C$, other than they are both in reduced row-echelon form, and they are both row-equivalent to $A$.

If $B$ and $C$ are both row-equivalent to $A$, then they are row-equivalent to each other. Repeated row operations on a matrix combine the rows with each other using operations that are linear, and are identical in each column. A key observation for this proof is that each individual row of $B$ is linearly related to the rows of $C$. This relationship is different for each row of $B$, but once we fix a row, the relationship is the same across columns. More precisely, there are scalars $\delta_{ik}$, $1 \le i, k \le m$ such that for any $1 \le i \le m$, $1 \le j \le n$,
\[
[B]_{ij} = \sum_{k=1}^{m} \delta_{ik} [C]_{kj}
\]
You should read this as saying that an entry of row $i$ of $B$ (in column $j$) is a linear function of the entries of all the rows of $C$ that are also in column $j$, and the scalars ($\delta_{ik}$) depend on which row of $B$ we are considering (the $i$ subscript on $\delta_{ik}$), but are the same for every column (no dependence on $j$ in $\delta_{ik}$). This idea may be complicated now, but will feel more familiar once we discuss "linear combinations" (Definition LCCV) and more so when we discuss "row spaces" (Definition RSM). For now, spend some time carefully working Exercise RREF.M40, which is designed
to illustrate the origins of this expression. This completes our exploitation of the row-equivalence of $B$ and $C$.

We now repeatedly exploit the fact that $B$ and $C$ are in reduced row-echelon form. Recall that a pivot column is all zeros, except a single one. More carefully, if $R$ is a matrix in reduced row-echelon form, and $d_l$ is the index of a pivot column, then $[R]_{k d_l} = 1$ precisely when $k = l$ and is otherwise zero. Notice also that any entry of $R$ that is both below the entry in row $l$ and to the left of column $d_l$ is also zero (with below and left understood to include equality). In other words, look at examples of matrices in reduced row-echelon form and choose a leading 1 (with a box around it). The rest of the column is also zeros, and the lower left "quadrant" of the matrix that begins here is totally zeros.

Assuming no relationship about the form of $B$ and $C$, let $B$ have $r$ nonzero rows and denote the pivot columns as $D = \{d_1, d_2, d_3, \ldots, d_r\}$. For $C$ let $r'$ denote the number of nonzero rows and denote the pivot columns as $D' = \{d'_1, d'_2, d'_3, \ldots, d'_{r'}\}$ (Definition RREF). There are four steps in the proof, and the first three are about showing that $B$ and $C$ have the same number of pivot columns, in the same places. In other words, the "primed" symbols are a necessary fiction.
First Step. Suppose that $d_1 < d'_1$. Then
\begin{align*}
1 &= [B]_{1 d_1} && \text{Definition RREF} \\
&= \sum_{k=1}^{m} \delta_{1k} [C]_{k d_1} \\
&= \sum_{k=1}^{m} \delta_{1k} (0) && d_1 < d'_1 \\
&= 0
\end{align*}
The entries of $C$ in column $d_1$ are all zero, since $d_1 < d'_1$ means this column lies strictly to the left of the first pivot column of $C$. This is a contradiction, so we know that $d_1 \ge d'_1$. By an entirely similar argument, reversing the roles of $B$ and $C$, we could conclude that $d_1 \le d'_1$. Together this means that $d_1 = d'_1$.

Second Step. Suppose that we have determined that $d_1 = d'_1$, $d_2 = d'_2$, $d_3 = d'_3$, \ldots, $d_p = d'_p$. Let us now show that $d_{p+1} = d'_{p+1}$. Working towards a contradiction, suppose that $d_{p+1} < d'_{p+1}$. For $1 \le l \le p$,
\begin{align*}
0 &= [B]_{p+1, d_l} && \text{Definition RREF} \\
&= \sum_{k=1}^{m} \delta_{p+1,k} [C]_{k d_l} \\
&= \sum_{k=1}^{m} \delta_{p+1,k} [C]_{k d'_l} \\
&= \delta_{p+1,l} [C]_{l d'_l} + \sum_{\substack{k=1 \\ k \ne l}}^{m} \delta_{p+1,k} [C]_{k d'_l} && \text{Property CACN} \\
&= \delta_{p+1,l} (1) + \sum_{\substack{k=1 \\ k \ne l}}^{m} \delta_{p+1,k} (0) && \text{Definition RREF} \\
&= \delta_{p+1,l}
\end{align*}
Now,
\begin{align*}
1 &= [B]_{p+1, d_{p+1}} && \text{Definition RREF} \\
&= \sum_{k=1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}} \\
&= \sum_{k=1}^{p} \delta_{p+1,k} [C]_{k d_{p+1}} + \sum_{k=p+1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}} && \text{Property AACN} \\
&= \sum_{k=1}^{p} (0) [C]_{k d_{p+1}} + \sum_{k=p+1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}} \\
&= \sum_{k=p+1}^{m} \delta_{p+1,k} [C]_{k d_{p+1}} \\
&= \sum_{k=p+1}^{m} \delta_{p+1,k} (0) && d_{p+1} < d'_{p+1} \\
&= 0
\end{align*}
This contradiction shows that $d_{p+1} \ge d'_{p+1}$. By an entirely similar argument, reversing the roles of $B$ and $C$, we could conclude that $d_{p+1} \le d'_{p+1}$, and therefore $d_{p+1} = d'_{p+1}$.
Third Step. Now we establish that $r = r'$. Working towards a contradiction, suppose that $r' < r$. By the arguments above, we know that $d_1 = d'_1$, $d_2 = d'_2$, \ldots, $d_{r'} = d'_{r'}$. For $1 \le l \le r' < r$,
\begin{align*}
0 &= [B]_{r d_l} && \text{Definition RREF} \\
&= \sum_{k=1}^{m} \delta_{rk} [C]_{k d_l} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d_l} + \sum_{k=r'+1}^{m} \delta_{rk} [C]_{k d_l} && \text{Property AACN} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d_l} + \sum_{k=r'+1}^{m} \delta_{rk} (0) && \text{Definition RREF} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d_l} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{k d'_l} \\
&= \delta_{rl} [C]_{l d'_l} + \sum_{\substack{k=1 \\ k \ne l}}^{r'} \delta_{rk} [C]_{k d'_l} && \text{Property CACN} \\
&= \delta_{rl} (1) + \sum_{\substack{k=1 \\ k \ne l}}^{r'} \delta_{rk} (0) && \text{Definition RREF} \\
&= \delta_{rl}
\end{align*}
Now examine the entries of row $r$ of $B$,
\begin{align*}
[B]_{rj} &= \sum_{k=1}^{m} \delta_{rk} [C]_{kj} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{kj} + \sum_{k=r'+1}^{m} \delta_{rk} [C]_{kj} && \text{Property AACN} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{kj} + \sum_{k=r'+1}^{m} \delta_{rk} (0) && \text{Definition RREF} \\
&= \sum_{k=1}^{r'} \delta_{rk} [C]_{kj} \\
&= \sum_{k=1}^{r'} (0) [C]_{kj} \\
&= 0
\end{align*}
So row $r$ is a totally zero row, contradicting that this should be the bottommost nonzero row of $B$. So $r' \ge r$. By an entirely similar argument, reversing the roles of $B$ and $C$, we would conclude that $r' \le r$ and therefore $r = r'$. Thus, combining the first three steps we can say that $D = D'$. In other words, $B$ and $C$ have the same pivot columns, in the same locations.
Fourth Step. In this final step, we will not argue by contradiction. Our intent is to determine the values of the $\delta_{ij}$. Notice that we can use the values of the $d_i$ interchangeably for $B$ and $C$. Here we go,
\begin{align*}
1 &= [B]_{i d_i} && \text{Definition RREF} \\
&= \sum_{k=1}^{m} \delta_{ik} [C]_{k d_i} \\
&= \delta_{ii} [C]_{i d_i} + \sum_{\substack{k=1 \\ k \ne i}}^{m} \delta_{ik} [C]_{k d_i} && \text{Property CACN} \\
&= \delta_{ii} (1) + \sum_{\substack{k=1 \\ k \ne i}}^{m} \delta_{ik} (0) && \text{Definition RREF} \\
&= \delta_{ii}
\end{align*}
and for $l \ne i$
\begin{align*}
0 &= [B]_{i d_l} && \text{Definition RREF} \\
&= \sum_{k=1}^{m} \delta_{ik} [C]_{k d_l} \\
&= \delta_{il} [C]_{l d_l} + \sum_{\substack{k=1 \\ k \ne l}}^{m} \delta_{ik} [C]_{k d_l} && \text{Property CACN} \\
&= \delta_{il} (1) + \sum_{\substack{k=1 \\ k \ne l}}^{m} \delta_{ik} (0) && \text{Definition RREF} \\
&= \delta_{il}
\end{align*}
Finally, having determined the values of the $\delta_{ij}$, we can show that $B = C$. For $1 \le i \le m$, $1 \le j \le n$,
\begin{align*}
[B]_{ij} &= \sum_{k=1}^{m} \delta_{ik} [C]_{kj} \\
&= \delta_{ii} [C]_{ij} + \sum_{\substack{k=1 \\ k \ne i}}^{m} \delta_{ik} [C]_{kj} && \text{Property CACN} \\
&= (1) [C]_{ij} + \sum_{\substack{k=1 \\ k \ne i}}^{m} (0) [C]_{kj} \\
&= [C]_{ij}
\end{align*}
So $B$ and $C$ have equal values in every entry, and so are the same matrix. ■
We will now run through some examples of using these definitions and theorems to solve some systems of equations. From now on, when we have a matrix in reduced row-echelon form, we will mark the leading 1's with a small box. This will help you count, and identify, the pivot columns. In your work, you can box 'em, circle 'em or write 'em in a different color; just identify 'em somehow. This device will prove very useful later and is a very good habit to start developing right now.
Example SAB Solutions for Archetype B
Let us find the solutions to the following system of equations,
\begin{align*}
-7x_1 - 6x_2 - 12x_3 &= -33 \\
5x_1 + 5x_2 + 7x_3 &= 24 \\
x_1 + 4x_3 &= 5
\end{align*}
First, form the augmented matrix,
\[
\begin{bmatrix}
-7 & -6 & -12 & -33 \\
5 & 5 & 7 & 24 \\
1 & 0 & 4 & 5
\end{bmatrix}
\]
and work to reduced row-echelon form, first with $j = 1$,
\[
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 5 & 5 & 7 & 24 \\ -7 & -6 & -12 & -33 \end{bmatrix}
\xrightarrow{-5R_1 + R_2}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 0 & 5 & -13 & -1 \\ -7 & -6 & -12 & -33 \end{bmatrix}
\xrightarrow{7R_1 + R_3}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 0 & 5 & -13 & -1 \\ 0 & -6 & 16 & 2 \end{bmatrix}
\]
Now, with $j = 2$,
\[
\xrightarrow{\frac{1}{5} R_2}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 0 & 1 & -\frac{13}{5} & -\frac{1}{5} \\ 0 & -6 & 16 & 2 \end{bmatrix}
\xrightarrow{6R_2 + R_3}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 0 & 1 & -\frac{13}{5} & -\frac{1}{5} \\ 0 & 0 & \frac{2}{5} & \frac{4}{5} \end{bmatrix}
\]
And finally, with $j = 3$,
\[
\xrightarrow{\frac{5}{2} R_3}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 0 & 1 & -\frac{13}{5} & -\frac{1}{5} \\ 0 & 0 & 1 & 2 \end{bmatrix}
\xrightarrow{\frac{13}{5} R_3 + R_2}
\begin{bmatrix} 1 & 0 & 4 & 5 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 2 \end{bmatrix}
\xrightarrow{-4R_3 + R_1}
\begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 5 \\ 0 & 0 & 1 & 2 \end{bmatrix}
\]
This is now the augmented matrix of a very simple system of equations, namely $x_1 = -3$, $x_2 = 5$, $x_3 = 2$, which has an obvious solution. Furthermore, we can see that this is the only solution to this system, so we have determined the entire
solution set,
\[
S = \left\{ \begin{bmatrix} -3 \\ 5 \\ 2 \end{bmatrix} \right\}
\]
You might compare this example with the procedure we used in Example US. △
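As a quick sanity check (our own, not part of the text), the single solution can be substituted back into the original system of Archetype B to confirm that every equation is satisfied:

```python
# Coefficients and constants of Archetype B, and the claimed solution.
A = [[-7, -6, -12], [5, 5, 7], [1, 0, 4]]
b = [-33, 24, 5]
x = [-3, 5, 2]

# Each equation's left-hand side should reproduce the right-hand side.
lhs = [sum(a * xi for a, xi in zip(row, x)) for row in A]
print(lhs == b)  # True
```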
Archetypes A and B are meant to contrast each other <strong>in</strong> many respects. So let<br />
us solve Archetype A now.<br />
Example SAA Solutions for Archetype A<br />
Let us f<strong>in</strong>d the solutions to the follow<strong>in</strong>g system of equations,<br />
\begin{align*}
x_1 - x_2 + 2x_3 &= 1 \\
2x_1 + x_2 + x_3 &= 8 \\
x_1 + x_2 &= 5
\end{align*}
<strong>First</strong>, form the augmented matrix,<br />
\[
\begin{bmatrix}
1 & -1 & 2 & 1 \\
2 & 1 & 1 & 8 \\
1 & 1 & 0 & 5
\end{bmatrix}
\]
and work to reduced row-echelon form, first with $j = 1$,
\[
\xrightarrow{-2R_1 + R_2}
\begin{bmatrix}
1 & -1 & 2 & 1 \\
0 & 3 & -3 & 6 \\
1 & 1 & 0 & 5
\end{bmatrix}
\xrightarrow{-1R_1 + R_3}
\begin{bmatrix}
1 & -1 & 2 & 1 \\
0 & 3 & -3 & 6 \\
0 & 2 & -2 & 4
\end{bmatrix}
\]
Now, with $j = 2$,
\[
\xrightarrow{\frac{1}{3}R_2}
\begin{bmatrix}
1 & -1 & 2 & 1 \\
0 & 1 & -1 & 2 \\
0 & 2 & -2 & 4
\end{bmatrix}
\xrightarrow{1R_2 + R_1}
\begin{bmatrix}
1 & 0 & 1 & 3 \\
0 & 1 & -1 & 2 \\
0 & 2 & -2 & 4
\end{bmatrix}
\xrightarrow{-2R_2 + R_3}
\begin{bmatrix}
1 & 0 & 1 & 3 \\
0 & 1 & -1 & 2 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]
The system of equations represented by this augmented matrix needs to be<br />
considered a bit differently than that for Archetype B. <strong>First</strong>, the last row of the<br />
matrix is the equation 0 = 0, which is always true, so it imposes no restrictions on<br />
our possible solutions and therefore we can safely ignore it as we analyze the other<br />
two equations. These equations are,<br />
\begin{align*}
x_1 + x_3 &= 3 \\
x_2 - x_3 &= 2.
\end{align*}
While this system is fairly easy to solve, it also appears to have a multitude of
solutions. For example, choose $x_3 = 1$ and see that then $x_1 = 2$ and $x_2 = 3$ will together form a solution. Or choose $x_3 = 0$, and then discover that $x_1 = 3$ and $x_2 = 2$ lead to a solution. Try it yourself: pick any value of $x_3$ you please, and figure out what $x_1$ and $x_2$ should be to make the first and second equations (respectively) true. We'll wait while you do that. Because of this behavior, we say that $x_3$ is a "free" or "independent" variable. But why do we vary $x_3$ and not some other variable? For now, notice that the third column of the augmented matrix is not a pivot column. With this idea, we can rearrange the two equations, solving each for the variable whose index is the same as the column index of a pivot column.
\begin{align*}
x_1 &= 3 - x_3 \\
x_2 &= 2 + x_3
\end{align*}
To write the set of solution vectors <strong>in</strong> set notation, we have<br />
\[
S = \left\{ \begin{bmatrix} 3 - x_3 \\ 2 + x_3 \\ x_3 \end{bmatrix} \,\middle|\, x_3 \in \mathbb{C} \right\}
\]
We will learn more <strong>in</strong> the next section about systems with <strong>in</strong>f<strong>in</strong>itely many<br />
solutions and how to express their solution sets. Right now, you might look back at<br />
Example IS.<br />
△<br />
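The claim that every choice of $x_3$ yields a solution can be spot-checked mechanically. A minimal sketch in Python (the helper names are ours, not the text's):

```python
# Parametric solution of Archetype A: for any choice of x3 = t,
# (x1, x2, x3) = (3 - t, 2 + t, t) should satisfy all three equations.
def solution(t):
    return (3 - t, 2 + t, t)

def satisfies(x1, x2, x3):
    return (x1 - x2 + 2*x3 == 1 and
            2*x1 + x2 + x3 == 8 and
            x1 + x2 == 5)

# Sample several values of the free variable, including the two from the text
assert all(satisfies(*solution(t)) for t in [-3, 0, 1, 2, 100])
print("all checks passed")
```

No finite sample proves the set description, of course, but a failure here would immediately expose an arithmetic slip in the row reduction.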
Example SAE Solutions for Archetype E<br />
Let us f<strong>in</strong>d the solutions to the follow<strong>in</strong>g system of equations,<br />
\begin{align*}
2x_1 + x_2 + 7x_3 - 7x_4 &= 2 \\
-3x_1 + 4x_2 - 5x_3 - 6x_4 &= 3 \\
x_1 + x_2 + 4x_3 - 5x_4 &= 2
\end{align*}
<strong>First</strong>, form the augmented matrix,<br />
\[
\begin{bmatrix}
2 & 1 & 7 & -7 & 2 \\
-3 & 4 & -5 & -6 & 3 \\
1 & 1 & 4 & -5 & 2
\end{bmatrix}
\]
and work to reduced row-echelon form, first with $j = 1$,
\[
\xrightarrow{R_1 \leftrightarrow R_3}
\begin{bmatrix}
1 & 1 & 4 & -5 & 2 \\
-3 & 4 & -5 & -6 & 3 \\
2 & 1 & 7 & -7 & 2
\end{bmatrix}
\xrightarrow{3R_1 + R_2}
\begin{bmatrix}
1 & 1 & 4 & -5 & 2 \\
0 & 7 & 7 & -21 & 9 \\
2 & 1 & 7 & -7 & 2
\end{bmatrix}
\xrightarrow{-2R_1 + R_3}
\begin{bmatrix}
1 & 1 & 4 & -5 & 2 \\
0 & 7 & 7 & -21 & 9 \\
0 & -1 & -1 & 3 & -2
\end{bmatrix}
\]
Now, with $j = 2$,
\[
\xrightarrow{R_2 \leftrightarrow R_3}
\begin{bmatrix}
1 & 1 & 4 & -5 & 2 \\
0 & -1 & -1 & 3 & -2 \\
0 & 7 & 7 & -21 & 9
\end{bmatrix}
\xrightarrow{-1R_2}
\begin{bmatrix}
1 & 1 & 4 & -5 & 2 \\
0 & 1 & 1 & -3 & 2 \\
0 & 7 & 7 & -21 & 9
\end{bmatrix}
\]
\[
\xrightarrow{-1R_2 + R_1}
\begin{bmatrix}
1 & 0 & 3 & -2 & 0 \\
0 & 1 & 1 & -3 & 2 \\
0 & 7 & 7 & -21 & 9
\end{bmatrix}
\xrightarrow{-7R_2 + R_3}
\begin{bmatrix}
1 & 0 & 3 & -2 & 0 \\
0 & 1 & 1 & -3 & 2 \\
0 & 0 & 0 & 0 & -5
\end{bmatrix}
\]
And finally, with $j = 4$,
\[
\xrightarrow{-\frac{1}{5}R_3}
\begin{bmatrix}
1 & 0 & 3 & -2 & 0 \\
0 & 1 & 1 & -3 & 2 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\xrightarrow{-2R_3 + R_2}
\begin{bmatrix}
1 & 0 & 3 & -2 & 0 \\
0 & 1 & 1 & -3 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
Let us analyze the equations <strong>in</strong> the system represented by this augmented matrix.<br />
The third equation will read 0 = 1. This is patently false, all the time. No choice<br />
of values for our variables will ever make it true. We are done. S<strong>in</strong>ce we cannot<br />
even make the last equation true, we have no hope of mak<strong>in</strong>g all of the equations<br />
simultaneously true. So this system has no solutions, and its solution set is the empty<br />
set, $\emptyset = \{\,\}$ (Definition ES).
Notice that we could have reached this conclusion sooner. After performing the row operation $-7R_2 + R_3$, we can see that the third equation reads $0 = -5$, a false statement. Since the system represented by this matrix has no solutions, none of the systems represented has any solutions. However, for this example, we have chosen to bring the matrix all the way to reduced row-echelon form as practice. △
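A pivot in the final column is precisely what signals inconsistency, and this condition is easy to test by machine. The sympy-based sketch below (ours, not the text's) applies it to Archetype E:

```python
from sympy import Matrix

# Augmented matrix of Archetype E
A = Matrix([[ 2, 1,  7, -7, 2],
            [-3, 4, -5, -6, 3],
            [ 1, 1,  4, -5, 2]])

R, pivots = A.rref()
n = A.cols - 1              # number of variables (last column holds constants)
inconsistent = n in pivots  # a pivot in the final column represents 0 = 1
print(R)
print(inconsistent)  # True
```

The next section's Theorem RCLS makes this criterion precise for any linear system.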
These three examples (Example SAB, Example SAA, Example SAE) illustrate
the full range of possibilities for a system of l<strong>in</strong>ear equations — no solutions, one<br />
solution, or <strong>in</strong>f<strong>in</strong>itely many solutions. In the next section we will exam<strong>in</strong>e these three<br />
scenarios more closely.<br />
We (and everybody else) will often speak of “row-reduc<strong>in</strong>g” a matrix. This is an<br />
<strong>in</strong>formal way of say<strong>in</strong>g we beg<strong>in</strong> with a matrix A and then analyze the matrix B that<br />
is row-equivalent to A and <strong>in</strong> reduced row-echelon form. So the term row-reduce is<br />
used as a verb, but describes someth<strong>in</strong>g a bit more complicated, s<strong>in</strong>ce we do not really<br />
change A. Theorem REMEF tells us that this process will always be successful and
Theorem RREFU tells us that B will be unambiguous. Typically, an <strong>in</strong>vestigation<br />
of A will proceed by analyz<strong>in</strong>g B and apply<strong>in</strong>g theorems whose hypotheses <strong>in</strong>clude<br />
the row-equivalence of A and B, and usually the hypothesis that B is <strong>in</strong> reduced<br />
row-echelon form.<br />
Read<strong>in</strong>g Questions<br />
1. Is the matrix below in reduced row-echelon form? Why or why not?
\[
\begin{bmatrix}
1 & 5 & 0 & 6 & 8 \\
0 & 0 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
2. Use row operations to convert the matrix below to reduced row-echelon form and report the final matrix.
\[
\begin{bmatrix}
2 & 1 & 8 \\
-1 & 1 & -1 \\
-2 & 5 & 4
\end{bmatrix}
\]
3. Find all the solutions to the system below by using an augmented matrix and row operations. Report your final matrix in reduced row-echelon form and the set of solutions.
\begin{align*}
2x_1 + 3x_2 - x_3 &= 0 \\
x_1 + 2x_2 + x_3 &= 3 \\
x_1 + 3x_2 + 3x_3 &= 7
\end{align*}
Exercises
C05 Each archetype below is a system of equations. Form the augmented matrix of the system of equations, convert the matrix to reduced row-echelon form by using equation operations and then describe the solution set of the original system of equations.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
For problems C10–C19, find all solutions to the system of linear equations. Use your favorite computing device to row-reduce the augmented matrices for the systems, and write the solutions as a set, using correct set notation.
C10 †
\begin{align*}
2x_1 - 3x_2 + x_3 + 7x_4 &= 14 \\
2x_1 + 8x_2 - 4x_3 + 5x_4 &= -1 \\
x_1 + 3x_2 - 3x_3 &= 4 \\
-5x_1 + 2x_2 + 3x_3 + 4x_4 &= -19
\end{align*}
C11 †
\begin{align*}
3x_1 + 4x_2 - x_3 + 2x_4 &= 6 \\
x_1 - 2x_2 + 3x_3 + x_4 &= 2 \\
10x_2 - 10x_3 - x_4 &= 1
\end{align*}
C12 †
\begin{align*}
2x_1 + 4x_2 + 5x_3 + 7x_4 &= -26 \\
x_1 + 2x_2 + x_3 - x_4 &= -4 \\
-2x_1 - 4x_2 + x_3 + 11x_4 &= -10
\end{align*}
C13 †
\begin{align*}
x_1 + 2x_2 + 8x_3 - 7x_4 &= -2 \\
3x_1 + 2x_2 + 12x_3 - 5x_4 &= 6 \\
-x_1 + x_2 + x_3 - 5x_4 &= -10
\end{align*}
C14 †
\begin{align*}
2x_1 + x_2 + 7x_3 - 2x_4 &= 4 \\
3x_1 - 2x_2 + 11x_4 &= 13 \\
x_1 + x_2 + 5x_3 - 3x_4 &= 1
\end{align*}
C15 †
\begin{align*}
2x_1 + 3x_2 - x_3 - 9x_4 &= -16 \\
x_1 + 2x_2 + x_3 &= 0 \\
-x_1 + 2x_2 + 3x_3 + 4x_4 &= 8
\end{align*}
C16 †
\begin{align*}
2x_1 + 3x_2 + 19x_3 - 4x_4 &= 2 \\
x_1 + 2x_2 + 12x_3 - 3x_4 &= 1 \\
-x_1 + 2x_2 + 8x_3 - 5x_4 &= 1
\end{align*}
C17 †
\begin{align*}
-x_1 + 5x_2 &= -8 \\
-2x_1 + 5x_2 + 5x_3 + 2x_4 &= 9 \\
-3x_1 - x_2 + 3x_3 + x_4 &= 3 \\
7x_1 + 6x_2 + 5x_3 + x_4 &= 30
\end{align*}
C18 †
\begin{align*}
x_1 + 2x_2 - 4x_3 - x_4 &= 32 \\
x_1 + 3x_2 - 7x_3 - x_5 &= 33 \\
x_1 + 2x_3 - 2x_4 + 3x_5 &= 22
\end{align*}
C19 †
\begin{align*}
2x_1 + x_2 &= 6 \\
-x_1 - x_2 &= -2 \\
3x_1 + 4x_2 &= 4 \\
3x_1 + 5x_2 &= 2
\end{align*}
For problems C30–C33, row-reduce the matrix without the aid of a calculator, <strong>in</strong>dicat<strong>in</strong>g<br />
the row operations you are us<strong>in</strong>g at each step us<strong>in</strong>g the notation of Def<strong>in</strong>ition RO.<br />
C30 †
\[
\begin{bmatrix}
2 & 1 & 5 & 10 \\
1 & -3 & -1 & -2 \\
4 & -2 & 6 & 12
\end{bmatrix}
\]
C31 †
\[
\begin{bmatrix}
1 & 2 & -4 \\
-3 & -1 & -3 \\
-2 & 1 & -7
\end{bmatrix}
\]
C32 †
\[
\begin{bmatrix}
1 & 1 & 1 \\
-4 & -3 & -2 \\
3 & 2 & 1
\end{bmatrix}
\]
C33 †
\[
\begin{bmatrix}
1 & 2 & -1 & -1 \\
2 & 4 & -1 & 4 \\
-1 & -2 & 3 & 5
\end{bmatrix}
\]
M40 †<br />
Consider the two $3 \times 4$ matrices below
\[
B = \begin{bmatrix}
1 & 3 & -2 & 2 \\
-1 & -2 & -1 & -1 \\
-1 & -5 & 8 & -3
\end{bmatrix}
\qquad
C = \begin{bmatrix}
1 & 2 & 1 & 2 \\
1 & 1 & 4 & 0 \\
-1 & -1 & -4 & 1
\end{bmatrix}
\]
1. Row-reduce each matrix and determ<strong>in</strong>e that the reduced row-echelon forms of B and<br />
C are identical. From this argue that B and C are row-equivalent.<br />
2. In the proof of Theorem RREFU, we beg<strong>in</strong> by argu<strong>in</strong>g that entries of row-equivalent<br />
matrices are related by way of certa<strong>in</strong> scalars and sums. In this example, we would<br />
write that entries of B from row i that are <strong>in</strong> column j are l<strong>in</strong>early related to the<br />
entries of C <strong>in</strong> column j from all three rows<br />
\[
[B]_{ij} = \delta_{i1}[C]_{1j} + \delta_{i2}[C]_{2j} + \delta_{i3}[C]_{3j} \qquad 1 \le j \le 4
\]
For each 1 ≤ i ≤ 3 f<strong>in</strong>d the correspond<strong>in</strong>g three scalars <strong>in</strong> this relationship. So your<br />
answer will be n<strong>in</strong>e scalars, determ<strong>in</strong>ed three at a time.
M45 † You keep a number of lizards, mice and peacocks as pets. There are a total of 108<br />
legs and 30 tails <strong>in</strong> your menagerie. You have twice as many mice as lizards. How many of<br />
each creature do you have?<br />
M50 † A park<strong>in</strong>g lot has 66 vehicles (cars, trucks, motorcycles and bicycles) <strong>in</strong> it. There<br />
are four times as many cars as trucks. The total number of tires (4 per car or truck, 2 per<br />
motorcycle or bicycle) is 252. How many cars are there? How many bicycles?<br />
T10 † Prove that each of the three row operations (Def<strong>in</strong>ition RO) is reversible. More<br />
precisely, if the matrix B is obta<strong>in</strong>ed from A by application of a s<strong>in</strong>gle row operation, show<br />
that there is a s<strong>in</strong>gle row operation that will transform B back <strong>in</strong>to A.<br />
T11 Suppose that A, B and C are m × n matrices. Use the def<strong>in</strong>ition of row-equivalence<br />
(Def<strong>in</strong>ition REM) to prove the follow<strong>in</strong>g three facts.<br />
1. A is row-equivalent to A.<br />
2. If A is row-equivalent to B, then B is row-equivalent to A.
3. If A is row-equivalent to B, and B is row-equivalent to C, then A is row-equivalent to C.
A relationship that satisfies these three properties is known as an equivalence relation,<br />
an important idea <strong>in</strong> the study of various algebras. This is a formal way of say<strong>in</strong>g that<br />
a relationship behaves like equality, without requir<strong>in</strong>g the relationship to be as strict as<br />
equality itself. We will see it aga<strong>in</strong> <strong>in</strong> Theorem SER.<br />
T12 Suppose that B is an m × n matrix <strong>in</strong> reduced row-echelon form. Build a new, likely<br />
smaller, k × l matrix C as follows. Keep any collection of k adjacent rows, k ≤ m. From<br />
these rows, keep columns 1 through l, l ≤ n. Prove that C is in reduced row-echelon form.
T13 Generalize Exercise RREF.T12 by just keep<strong>in</strong>g any k rows, and not requir<strong>in</strong>g the<br />
rows to be adjacent. Prove that any such matrix C is <strong>in</strong> reduced row-echelon form.
Section TSS<br />
Types of Solution Sets<br />
We will now be more careful about analyz<strong>in</strong>g the reduced row-echelon form derived<br />
from the augmented matrix of a system of l<strong>in</strong>ear equations. In particular, we will see<br />
how to systematically handle the situation when we have <strong>in</strong>f<strong>in</strong>itely many solutions<br />
to a system, and we will prove that every system of l<strong>in</strong>ear equations has either zero,<br />
one or <strong>in</strong>f<strong>in</strong>itely many solutions. With these tools, we will be able to rout<strong>in</strong>ely solve<br />
any l<strong>in</strong>ear system.<br />
Subsection CS<br />
Consistent Systems<br />
The computer scientist Donald Knuth said, “Science is what we understand well<br />
enough to expla<strong>in</strong> to a computer. Art is everyth<strong>in</strong>g else.” In this section we will<br />
remove solv<strong>in</strong>g systems of equations from the realm of art, and <strong>in</strong>to the realm of<br />
science. We beg<strong>in</strong> with a def<strong>in</strong>ition.<br />
Def<strong>in</strong>ition CS Consistent System<br />
A system of l<strong>in</strong>ear equations is consistent if it has at least one solution. Otherwise,<br />
the system is called <strong>in</strong>consistent.<br />
□<br />
We will want to first recognize when a system is <strong>in</strong>consistent or consistent, and <strong>in</strong><br />
the case of consistent systems we will be able to further ref<strong>in</strong>e the types of solutions<br />
possible. We will do this by analyz<strong>in</strong>g the reduced row-echelon form of a matrix,<br />
using the value of r, and the sets of column indices, D and F, first defined back in
Def<strong>in</strong>ition RREF.<br />
Use of the notation for the elements of D and F can be a bit confus<strong>in</strong>g, s<strong>in</strong>ce<br />
we have subscripted variables that are <strong>in</strong> turn equal to <strong>in</strong>tegers used to <strong>in</strong>dex the<br />
matrix. However, many questions about matrices and systems of equations can be<br />
answered once we know r, D and F. The choice of the letters D and F refers to our
upcom<strong>in</strong>g def<strong>in</strong>ition of dependent and free variables (Def<strong>in</strong>ition IDV). An example<br />
will help us beg<strong>in</strong> to get comfortable with this aspect of reduced row-echelon form.<br />
Example RREFN Reduced row-echelon form notation<br />
For the 5 × 9 matrix<br />
\[
B = \begin{bmatrix}
1 & 5 & 0 & 0 & 2 & 8 & 0 & 5 & -1 \\
0 & 0 & 1 & 0 & 4 & 7 & 0 & 2 & 0 \\
0 & 0 & 0 & 1 & 3 & 9 & 0 & 3 & -6 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 4 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
§TSS Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 46<br />
in reduced row-echelon form we have
\[
r = 4
\]
\[
d_1 = 1 \quad d_2 = 3 \quad d_3 = 4 \quad d_4 = 7
\]
\[
f_1 = 2 \quad f_2 = 5 \quad f_3 = 6 \quad f_4 = 8 \quad f_5 = 9
\]
Notice that the sets
\[
D = \{d_1, d_2, d_3, d_4\} = \{1, 3, 4, 7\} \qquad F = \{f_1, f_2, f_3, f_4, f_5\} = \{2, 5, 6, 8, 9\}
\]
have noth<strong>in</strong>g <strong>in</strong> common and together account for all of the columns of B (we say it<br />
is a partition of the set of column <strong>in</strong>dices).<br />
△<br />
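Extracting r, D and F from a matrix in reduced row-echelon form is purely mechanical, as the following sympy sketch illustrates (the variable names mirror the text's notation but the code itself is ours):

```python
from sympy import Matrix

# The 5 x 9 matrix B of Example RREFN
B = Matrix([[1, 5, 0, 0, 2, 8, 0, 5, -1],
            [0, 0, 1, 0, 4, 7, 0, 2,  0],
            [0, 0, 0, 1, 3, 9, 0, 3, -6],
            [0, 0, 0, 0, 0, 0, 1, 4,  2],
            [0, 0, 0, 0, 0, 0, 0, 0,  0]])

R, pivots = B.rref()
assert R == B                 # B is already in reduced row-echelon form
r = len(pivots)               # number of pivot columns = nonzero rows
D = [j + 1 for j in pivots]   # shift to the text's 1-based column indices
F = [j for j in range(1, B.cols + 1) if j not in D]
print(r, D, F)  # 4 [1, 3, 4, 7] [2, 5, 6, 8, 9]
```

Note that D and F together account for every column index exactly once, confirming the partition described above.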
The number r is the s<strong>in</strong>gle most important piece of <strong>in</strong>formation we can get from<br />
the reduced row-echelon form of a matrix. It is def<strong>in</strong>ed as the number of nonzero<br />
rows, but s<strong>in</strong>ce each nonzero row has a lead<strong>in</strong>g 1, it is also the number of lead<strong>in</strong>g<br />
1’s present. For each lead<strong>in</strong>g 1, we have a pivot column, so r is also the number of<br />
pivot columns. Repeat<strong>in</strong>g ourselves, r is the number of nonzero rows, the number<br />
of lead<strong>in</strong>g 1’s and the number of pivot columns. Across different situations, each<br />
of these <strong>in</strong>terpretations of the mean<strong>in</strong>g of r will be useful, though it may be most<br />
helpful to th<strong>in</strong>k <strong>in</strong> terms of pivot columns.<br />
Before prov<strong>in</strong>g some theorems about the possibilities for solution sets to systems<br />
of equations, let us analyze one particular system with an <strong>in</strong>f<strong>in</strong>ite solution set very<br />
carefully as an example. We will use this technique frequently, and shortly we will<br />
ref<strong>in</strong>e it slightly.<br />
Archetypes I and J are both fairly large for do<strong>in</strong>g computations by hand (though<br />
not impossibly large). Their properties are very similar, so we will frequently analyze<br />
the situation <strong>in</strong> Archetype I, and leave you the joy of analyz<strong>in</strong>g Archetype J yourself.<br />
So work through Archetype I with the text, by hand and/or with a computer, and<br />
then tackle Archetype J yourself (and check your results with those listed). Notice<br />
too that the archetypes describ<strong>in</strong>g systems of equations each lists the values of r, D<br />
and F . Here we go...<br />
Example ISSI Describ<strong>in</strong>g <strong>in</strong>f<strong>in</strong>ite solution sets, Archetype I<br />
Archetype I is the system of m = 4 equations <strong>in</strong> n = 7 variables.<br />
\begin{align*}
x_1 + 4x_2 - x_4 + 7x_6 - 9x_7 &= 3 \\
2x_1 + 8x_2 - x_3 + 3x_4 + 9x_5 - 13x_6 + 7x_7 &= 9 \\
2x_3 - 3x_4 - 4x_5 + 12x_6 - 8x_7 &= 1 \\
-x_1 - 4x_2 + 2x_3 + 4x_4 + 8x_5 - 31x_6 + 37x_7 &= 4
\end{align*}
This system has a 4 × 8 augmented matrix that is row-equivalent to the follow<strong>in</strong>g<br />
matrix (check this!), and which is <strong>in</strong> reduced row-echelon form (the existence of<br />
this matrix is guaranteed by Theorem REMEF and its uniqueness is guaranteed by
Theorem RREFU),<br />
\[
\begin{bmatrix}
1 & 4 & 0 & 0 & 2 & 1 & -3 & 4 \\
0 & 0 & 1 & 0 & 1 & -3 & 5 & 2 \\
0 & 0 & 0 & 1 & 2 & -6 & 6 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
So we find that r = 3 and
\[
D = \{d_1, d_2, d_3\} = \{1, 3, 4\} \qquad F = \{f_1, f_2, f_3, f_4, f_5\} = \{2, 5, 6, 7, 8\}
\]
Let i denote any one of the r = 3 nonzero rows. Then the index $d_i$ is a pivot column. It will be easy in this case to use the equation represented by row i to write an expression for the variable $x_{d_i}$. It will be a linear function of the variables $x_{f_1}, x_{f_2}, x_{f_3}, x_{f_4}$ (notice that $f_5 = 8$ does not reference a variable, but does tell us the final column is not a pivot column). We will now construct these three expressions. Notice that using subscripts upon subscripts takes some getting used to.
\begin{align*}
(i = 1) \quad x_{d_1} &= x_1 = 4 - 4x_2 - 2x_5 - x_6 + 3x_7 \\
(i = 2) \quad x_{d_2} &= x_3 = 2 - x_5 + 3x_6 - 5x_7 \\
(i = 3) \quad x_{d_3} &= x_4 = 1 - 2x_5 + 6x_6 - 6x_7
\end{align*}
Each element of the set $F = \{f_1, f_2, f_3, f_4, f_5\} = \{2, 5, 6, 7, 8\}$ is the index of a variable, except for $f_5 = 8$. We refer to $x_{f_1} = x_2$, $x_{f_2} = x_5$, $x_{f_3} = x_6$ and $x_{f_4} = x_7$ as "free" (or "independent") variables since they are allowed to assume any possible combination of values that we can imagine and we can continue on to build a solution to the system by solving individual equations for the values of the other ("dependent") variables.
Each element of the set $D = \{d_1, d_2, d_3\} = \{1, 3, 4\}$ is the index of a variable. We refer to the variables $x_{d_1} = x_1$, $x_{d_2} = x_3$ and $x_{d_3} = x_4$ as "dependent" variables since they depend on the independent variables. More precisely, for each possible
choice of values for the <strong>in</strong>dependent variables we get exactly one set of values for<br />
the dependent variables that comb<strong>in</strong>e to form a solution of the system.<br />
To express the solutions as a set, we write<br />
\[
\left\{
\begin{bmatrix}
4 - 4x_2 - 2x_5 - x_6 + 3x_7 \\
x_2 \\
2 - x_5 + 3x_6 - 5x_7 \\
1 - 2x_5 + 6x_6 - 6x_7 \\
x_5 \\
x_6 \\
x_7
\end{bmatrix}
\,\middle|\,
x_2, x_5, x_6, x_7 \in \mathbb{C}
\right\}
\]
The condition that $x_2, x_5, x_6, x_7 \in \mathbb{C}$ is how we specify that the variables $x_2, x_5, x_6, x_7$ are "free" to assume any possible values.
This systematic approach to solv<strong>in</strong>g a system of equations will allow us to create<br />
a precise description of the solution set for any consistent system once we have found
the reduced row-echelon form of the augmented matrix. It will work just as well<br />
when the set of free variables is empty and we get just a s<strong>in</strong>gle solution. And we<br />
could program a computer to do it! Now have a whack at Archetype J (Exercise<br />
TSS.C10), mimick<strong>in</strong>g the discussion <strong>in</strong> this example. We’ll still be here when you<br />
get back.<br />
△<br />
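To convince yourself that the set really does consist of solutions, substitute a few free-variable choices back into the original system. A quick sketch (helper names ours):

```python
# Dependent variables of Archetype I, expressed in the free variables,
# exactly as derived in Example ISSI
def solution(x2, x5, x6, x7):
    x1 = 4 - 4*x2 - 2*x5 - x6 + 3*x7
    x3 = 2 - x5 + 3*x6 - 5*x7
    x4 = 1 - 2*x5 + 6*x6 - 6*x7
    return (x1, x2, x3, x4, x5, x6, x7)

# The original four equations of Archetype I
def satisfies(x1, x2, x3, x4, x5, x6, x7):
    return (x1 + 4*x2 - x4 + 7*x6 - 9*x7 == 3 and
            2*x1 + 8*x2 - x3 + 3*x4 + 9*x5 - 13*x6 + 7*x7 == 9 and
            2*x3 - 3*x4 - 4*x5 + 12*x6 - 8*x7 == 1 and
            -x1 - 4*x2 + 2*x3 + 4*x4 + 8*x5 - 31*x6 + 37*x7 == 4)

assert all(satisfies(*solution(a, b, c, d))
           for (a, b, c, d) in [(0, 0, 0, 0), (1, 2, 3, 4), (-5, 7, 0, 2)])
print("every sampled choice of free variables gives a solution")
```

Setting all four free variables to zero, for instance, produces the particular solution $(4, 0, 2, 1, 0, 0, 0)$.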
Us<strong>in</strong>g the reduced row-echelon form of the augmented matrix of a system of<br />
equations to determ<strong>in</strong>e the nature of the solution set of the system is a very key<br />
idea. So let us look at one more example like the last one. But first a def<strong>in</strong>ition, and<br />
then the example. We mix our metaphors a bit when we call variables free versus<br />
dependent. Maybe we should call dependent variables “enslaved”?<br />
Def<strong>in</strong>ition IDV Independent and Dependent Variables<br />
Suppose A is the augmented matrix of a consistent system of l<strong>in</strong>ear equations and<br />
B is a row-equivalent matrix <strong>in</strong> reduced row-echelon form. Suppose j is the <strong>in</strong>dex<br />
of a pivot column of B. Then the variable x j is dependent. A variable that is not<br />
dependent is called <strong>in</strong>dependent or free.<br />
□<br />
If you studied this def<strong>in</strong>ition carefully, you might wonder what to do if the system<br />
has n variables and column n + 1 is a pivot column? We will see shortly, by Theorem<br />
RCLS, that this never happens for a consistent system.<br />
Example FDV Free and dependent variables<br />
Consider the system of five equations <strong>in</strong> five variables,<br />
\begin{align*}
x_1 - x_2 - 2x_3 + x_4 + 11x_5 &= 13 \\
x_1 - x_2 + x_3 + x_4 + 5x_5 &= 16 \\
2x_1 - 2x_2 + x_4 + 10x_5 &= 21 \\
2x_1 - 2x_2 - x_3 + 3x_4 + 20x_5 &= 38 \\
2x_1 - 2x_2 + x_3 + x_4 + 8x_5 &= 22
\end{align*}
whose augmented matrix row-reduces to<br />
\[
\begin{bmatrix}
1 & -1 & 0 & 0 & 3 & 6 \\
0 & 0 & 1 & 0 & -2 & 1 \\
0 & 0 & 0 & 1 & 4 & 9 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Columns 1, 3 and 4 are pivot columns, so D = {1, 3, 4}. From this we know that the variables $x_1$, $x_3$ and $x_4$ will be dependent variables, and each of the r = 3
nonzero rows of the row-reduced matrix will yield an expression for one of these<br />
three variables. The set F is all the rema<strong>in</strong><strong>in</strong>g column <strong>in</strong>dices, F = {2, 5, 6}. The<br />
column <strong>in</strong>dex 6 <strong>in</strong> F means that the f<strong>in</strong>al column is not a pivot column, and thus<br />
the system is consistent (Theorem RCLS). The rema<strong>in</strong><strong>in</strong>g <strong>in</strong>dices <strong>in</strong> F <strong>in</strong>dicate free
variables, so x 2 and x 5 (the rema<strong>in</strong><strong>in</strong>g variables) are our free variables. The result<strong>in</strong>g<br />
three equations that describe our solution set are then,<br />
\begin{align*}
(x_{d_1} = x_1) \quad x_1 &= 6 + x_2 - 3x_5 \\
(x_{d_2} = x_3) \quad x_3 &= 1 + 2x_5 \\
(x_{d_3} = x_4) \quad x_4 &= 9 - 4x_5
\end{align*}
Make sure you understand where these three equations came from, and notice<br />
how the location of the pivot columns determ<strong>in</strong>ed the variables on the left-hand side<br />
of each equation. We can compactly describe the solution set as,<br />
\[
S = \left\{
\begin{bmatrix}
6 + x_2 - 3x_5 \\
x_2 \\
1 + 2x_5 \\
9 - 4x_5 \\
x_5
\end{bmatrix}
\,\middle|\,
x_2, x_5 \in \mathbb{C}
\right\}
\]
Notice how we express the freedom for $x_2$ and $x_5$: $x_2, x_5 \in \mathbb{C}$.
△<br />
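The pivot columns, and hence the dependent/free split of Definition IDV, can also be read off computationally. A sympy check of this example (the code is our supplement, not the text's):

```python
from sympy import Matrix

# Augmented matrix of Example FDV
A = Matrix([[1, -1, -2, 1, 11, 13],
            [1, -1,  1, 1,  5, 16],
            [2, -2,  0, 1, 10, 21],
            [2, -2, -1, 3, 20, 38],
            [2, -2,  1, 1,  8, 22]])

R, pivots = A.rref()
D = [j + 1 for j in pivots]  # 1-based: the dependent variables x1, x3, x4
print(D)  # [1, 3, 4]
```

Since column 6 is absent from the pivot list, the system is consistent, and the remaining indices 2 and 5 mark the free variables.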
Sets are an important part of algebra, and we have seen a few already. Be<strong>in</strong>g<br />
comfortable with sets is important for understand<strong>in</strong>g and writ<strong>in</strong>g proofs. If you have<br />
not already, pay a visit now to Section SET.<br />
We can now use the values of m, n, r, and the <strong>in</strong>dependent and dependent<br />
variables to categorize the solution sets for l<strong>in</strong>ear systems through a sequence of<br />
theorems.<br />
Through the follow<strong>in</strong>g sequence of proofs, you will want to consult three proof<br />
techniques. See Proof Technique E, Proof Technique N, Proof Technique CP.<br />
<strong>First</strong> we have an important theorem that explores the dist<strong>in</strong>ction between<br />
consistent and <strong>in</strong>consistent l<strong>in</strong>ear systems.<br />
Theorem RCLS Recogniz<strong>in</strong>g Consistency of a L<strong>in</strong>ear System<br />
Suppose A is the augmented matrix of a system of l<strong>in</strong>ear equations with n variables.<br />
Suppose also that B is a row-equivalent matrix <strong>in</strong> reduced row-echelon form with r<br />
nonzero rows. Then the system of equations is inconsistent if and only if column n + 1 of B is a pivot column.
Proof. (⇐) The first half of the proof begins with the assumption that column n + 1 of B is a pivot column. Then the leading 1 of row r is located in column n + 1 of B and so row r of B begins with n consecutive zeros, finishing with the leading 1.
This is a representation of the equation 0 = 1, which is false. S<strong>in</strong>ce this equation<br />
is false for any collection of values we might choose for the variables, there are no<br />
solutions for the system of equations, and the system is <strong>in</strong>consistent.<br />
(⇒) For the second half of the proof, we wish to show that if we assume the system<br />
is <strong>in</strong>consistent, then column n +1 of B is a pivot column. But <strong>in</strong>stead of prov<strong>in</strong>g this<br />
directly, we will form the logically equivalent statement that is the contrapositive,<br />
and prove that <strong>in</strong>stead (see Proof Technique CP). Turn<strong>in</strong>g the implication around,
and negating each portion, we arrive at the logically equivalent statement: if column n + 1 of B is not a pivot column, then the system of equations is consistent.
If column n + 1 of B is not a pivot column, the leading 1 for row r is located somewhere in columns 1 through n. Then every preceding row's leading 1 is also located in columns 1 through n. In other words, since the last leading 1 is not in
the last column, no lead<strong>in</strong>g 1 for any row is <strong>in</strong> the last column, due to the echelon<br />
layout of the lead<strong>in</strong>g 1’s (Def<strong>in</strong>ition RREF). We will now construct a solution to the<br />
system by sett<strong>in</strong>g each dependent variable to the entry of the f<strong>in</strong>al column <strong>in</strong> the<br />
row with the correspond<strong>in</strong>g lead<strong>in</strong>g 1, and sett<strong>in</strong>g each free variable to zero. That<br />
sentence is pretty vague, so let us be more precise. Us<strong>in</strong>g our notation for the sets<br />
D and F from the reduced row-echelon form (Def<strong>in</strong>ition RREF):<br />
\[
x_{d_i} = [B]_{i,\,n+1}, \quad 1 \le i \le r \qquad x_{f_i} = 0, \quad 1 \le i \le n - r
\]
These values for the variables make the equations represented by the first r rows<br />
of B all true (conv<strong>in</strong>ce yourself of this). Rows numbered greater than r (if any) are<br />
all zero rows, hence represent the equation 0 = 0 and are also all true. We have now<br />
identified one solution to the system represented by B, and hence a solution to the<br />
system represented by A (Theorem REMES). So we can say the system is consistent<br />
(Def<strong>in</strong>ition CS).<br />
<br />
The beauty of this theorem be<strong>in</strong>g an equivalence is that we can unequivocally<br />
test to see if a system is consistent or <strong>in</strong>consistent by look<strong>in</strong>g at just a s<strong>in</strong>gle entry<br />
of the reduced row-echelon form matrix. We could program a computer to do it!<br />
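Taking that invitation literally, Theorem RCLS is essentially a one-line program. The sketch below uses sympy; the function name is ours.

```python
from sympy import Matrix

def is_consistent(aug):
    """Theorem RCLS: the system is inconsistent iff the last column of the
    reduced row-echelon form of the augmented matrix is a pivot column."""
    _, pivots = aug.rref()
    return (aug.cols - 1) not in pivots  # 0-based index of column n + 1

# Archetype B (one solution) versus Archetype E (no solutions)
print(is_consistent(Matrix([[-7, -6, -12, -33],
                            [ 5,  5,   7,  24],
                            [ 1,  0,   4,   5]])))  # True
print(is_consistent(Matrix([[ 2, 1,  7, -7, 2],
                            [-3, 4, -5, -6, 3],
                            [ 1, 1,  4, -5, 2]])))  # False
```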
Notice that for a consistent system the row-reduced augmented matrix has $n + 1 \in F$, so the largest element of F does not refer to a variable. Also, for an inconsistent system, $n + 1 \in D$, and it then does not make much sense to discuss
whether or not variables are free or dependent s<strong>in</strong>ce there is no solution. Take a look<br />
back at Def<strong>in</strong>ition IDV and see why we did not need to consider the possibility of<br />
referenc<strong>in</strong>g x n+1 as a dependent variable.<br />
With the characterization of Theorem RCLS, we can explore the relationships
between r and n for a consistent system. We can distinguish between the case of a
unique solution and infinitely many solutions, and furthermore, we recognize that
these are the only two possibilities.
Theorem CSRN Consistent Systems, r and n
Suppose A is the augmented matrix of a consistent system of linear equations with
n variables. Suppose also that B is a row-equivalent matrix in reduced row-echelon
form with r pivot columns. Then r ≤ n. If r = n, then the system has a unique
solution, and if r < n, then the system has infinitely many solutions.
§TSS Beezer: A First Course in Linear Algebra
Proof. First, r ≤ n + 1, since B has n + 1 columns and can have at most n + 1
pivot columns. If r = n + 1, then the final column of B is a pivot column. So
Theorem RCLS tells us that the system is inconsistent,
contrary to our hypothesis. We are left with r ≤ n.
When r = n, we find n − r = 0 free variables (i.e. F = {n + 1}) and the only
solution is given by setting the n variables to the first n entries of column n + 1
of B.

When r < n, we have n − r > 0 free variables. Choose one free variable and set
all the other free variables to zero. Now, set the chosen free variable to any fixed
value. It is possible to then determine the values of the dependent variables to create
a solution to the system. By setting the chosen free variable to different values, we
can create infinitely many solutions in this manner. ∎
Subsection FV
Free Variables

The next theorem simply states a conclusion from the final paragraph of the previous
proof, allowing us to state explicitly the number of free variables for a consistent
system.
Theorem FVCS Free Variables for Consistent Systems
Suppose A is the augmented matrix of a consistent system of linear equations with
n variables. Suppose also that B is a row-equivalent matrix in reduced row-echelon
form with r rows that are not completely zero. Then the solution set can be described
with n − r free variables.

Proof. See the proof of Theorem CSRN. ∎
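As a machine check of Theorem FVCS, the sketch below (plain Python, not part of the text; the small exact-arithmetic row-reducer is written inline for this illustration) row-reduces the homogeneous version of Archetype A, whose coefficients appear later in this chapter, and counts n − r.

```python
from fractions import Fraction

def rref(M):
    """Reduce M (a list of rows) to reduced row-echelon form.
    Returns (R, pivot_columns), with pivot columns 0-indexed."""
    R = [[Fraction(x) for x in row] for row in M]
    m, n = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(n):
        pr = next((i for i in range(r, m) if R[i][c] != 0), None)
        if pr is None:
            continue                      # no pivot in this column
        R[r], R[pr] = R[pr], R[r]         # swap the pivot row into place
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(m):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

# Homogeneous version of Archetype A, as an augmented matrix.
aug = [[1, -1, 2, 0], [2, 1, 1, 0], [1, 1, 0, 0]]
n = len(aug[0]) - 1
_, pivots = rref(aug)
r = len(pivots)
print(n - r)  # one free variable, matching n = 3, r = 2
```

The count n − r = 1 agrees with the values listed for Archetype A.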
Example CFV Counting free variables
For each archetype that is a system of equations, the values of n and r are listed.
Many also contain a few sample solutions. We can use this information profitably,
as illustrated by four examples.

1. Archetype A has n = 3 and r = 2. It can be seen to be consistent by the
sample solutions given. Its solution set then has n − r = 1 free variable, and
therefore will be infinite.

2. Archetype B has n = 3 and r = 3. It can be seen to be consistent by the single
sample solution given. Its solution set can then be described with n − r = 0
free variables, and therefore will have just the single solution.

3. Archetype H has n = 2 and r = 3. In this case, column 3 must be a pivot
column, so by Theorem RCLS, the system is inconsistent. We should not try
to apply Theorem FVCS to count free variables, since the theorem only applies
to consistent systems. (What would happen if you did try to incorrectly apply
Theorem FVCS?)
4. Archetype E has n = 4 and r = 3. However, by looking at the reduced
row-echelon form of the augmented matrix, we find that column 5 is a pivot column.
By Theorem RCLS we recognize the system as inconsistent.

△
We have accomplished a lot so far, but our main goal has been the following
theorem, which is now very simple to prove. The proof is so simple that we ought to
call it a corollary, but the result is important enough that it deserves to be called a
theorem. (See Proof Technique LC.) Notice that this theorem was presaged first by
Example TTS and further foreshadowed by other examples.
Theorem PSSLS Possible Solution Sets for Linear Systems
A system of linear equations has no solutions, a unique solution or infinitely many
solutions.

Proof. By its definition, a system is either inconsistent or consistent (Definition CS).
The first case describes systems with no solutions. For consistent systems, we have
the remaining two possibilities as guaranteed by, and described in, Theorem CSRN. ∎
Here is a diagram that consolidates several of our theorems from this section, and
which is of practical use when you analyze systems of equations. Note this presumes
we have the reduced row-echelon form of the augmented matrix of the system to
analyze.
[Decision tree for solution sets: Theorem RCLS first splits on column n + 1 of the
reduced row-echelon form. A pivot column in column n + 1 means the system is
inconsistent; no pivot column in column n + 1 means it is consistent. For a consistent
system, Theorem CSRN then splits on r: r < n gives infinitely many solutions, and
r = n gives a unique solution.]
We have one more theorem to round out our set of tools for determining solution
sets to systems of linear equations.

Theorem CMVEI Consistent, More Variables than Equations, Infinite solutions
Suppose a consistent system of linear equations has m equations in n variables. If
n > m, then the system has infinitely many solutions.
Proof. Suppose that the augmented matrix of the system of equations is row-equivalent
to B, a matrix in reduced row-echelon form with r nonzero rows. Because
B has m rows in total, the number of nonzero rows is less than or equal to m. In
other words, r ≤ m. Follow this with the hypothesis that n > m and we find that
the system has a solution set described by at least one free variable because

n − r ≥ n − m > 0.

A consistent system with free variables will have an infinite number of solutions,
as given by Theorem CSRN. ∎
Notice that to use this theorem we need only know that the system is consistent,
together with the values of m and n. We do not necessarily have to compute a
row-equivalent reduced row-echelon form matrix, even though we discussed such a
matrix in the proof. This is the substance of the following example.
Example OSGMD One solution gives many, Archetype D
Archetype D is the system of m = 3 equations in n = 4 variables,

2x_1 + x_2 + 7x_3 − 7x_4 = 8
−3x_1 + 4x_2 − 5x_3 − 6x_4 = −12
x_1 + x_2 + 4x_3 − 5x_4 = 4

and the solution x_1 = 0, x_2 = 1, x_3 = 2, x_4 = 1 can be checked easily by
substitution. Having been handed this solution, we know the system is consistent.
This, together with n > m, allows us to apply Theorem CMVEI and conclude that
the system has infinitely many solutions.

△
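The substitution check is easy to script. This is a small plain-Python sketch (not from the text), with the coefficients, constants, and proposed solution taken directly from Archetype D above.

```python
# Coefficients, constants, and the given solution for Archetype D.
A = [[2, 1, 7, -7], [-3, 4, -5, -6], [1, 1, 4, -5]]
b = [8, -12, 4]
x = [0, 1, 2, 1]

# Each equation holds when the dot product of its row with x equals its constant.
ok = all(sum(a * v for a, v in zip(row, x)) == c for row, c in zip(A, b))
print(ok)  # True: the system is consistent, and n = 4 > 3 = m
```

With consistency confirmed and n > m, Theorem CMVEI applies with no row operations at all.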
These theorems give us the procedures and implications that allow us to completely
solve any system of linear equations. The main computational tool is using
row operations to convert an augmented matrix into reduced row-echelon form. Here
is a broad outline of how we would instruct a computer to solve a system of linear
equations.
1. Represent a system of linear equations in n variables by an augmented matrix
(an array is the appropriate data structure in most computer languages).

2. Convert the matrix to a row-equivalent matrix in reduced row-echelon form
using the procedure from the proof of Theorem REMEF. Identify the locations
of the pivot columns, and their number r.
3. If column n + 1 is a pivot column, output the statement that the system is
inconsistent and halt.

4. If column n + 1 is not a pivot column, there are two possibilities:

(a) r = n and the solution is unique. It can be read off directly from the
entries in rows 1 through n of column n + 1.

(b) r < n and there are infinitely many solutions. If only a single solution is
needed, set each free variable to zero and read off the dependent variable
values from column n + 1.
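One possible rendering of this outline is the sketch below, in plain Python with exact rational arithmetic. It is not production code and not part of the text; the rref helper is written inline for this illustration.

```python
from fractions import Fraction

def rref(M):
    """Reduce M (a list of rows) to reduced row-echelon form.
    Returns (R, pivot_columns), with pivot columns 0-indexed."""
    R = [[Fraction(x) for x in row] for row in M]
    m, n = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(n):
        pr = next((i for i in range(r, m) if R[i][c] != 0), None)
        if pr is None:
            continue                      # no pivot in this column
        R[r], R[pr] = R[pr], R[r]         # swap the pivot row into place
        R[r] = [x / R[r][c] for x in R[r]]
        for i in range(m):
            if i != r and R[i][c] != 0:
                R[i] = [a - R[i][c] * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

def solve(aug):
    """Classify a system from its augmented matrix, following the outline."""
    R, pivots = rref(aug)                 # step 2
    n = len(aug[0]) - 1
    if n in pivots:                       # step 3: column n + 1 is a pivot column
        return ("inconsistent", None)
    r = len(pivots)
    if r == n:                            # step 4(a): unique solution
        return ("unique", [R[i][n] for i in range(n)])
    return ("infinitely many", n - r)     # step 4(b): count of free variables

print(solve([[1, 1, 2], [1, -1, 0]]))   # unique solution x = y = 1
print(solve([[1, 1, 1], [1, 1, 2]]))    # inconsistent
print(solve([[1, 1, 2], [2, 2, 4]]))    # infinitely many, 1 free variable
```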
C21† x_1 + 4x_2 + 3x_3 − x_4 = 5
     x_1 − x_2 + x_3 + 2x_4 = 6
     4x_1 + x_2 + 6x_3 + 5x_4 = 9

C22† x_1 − 2x_2 + x_3 − x_4 = 3
     2x_1 − 4x_2 + x_3 + x_4 = 2
     x_1 − 2x_2 − 2x_3 + 3x_4 = 1

C23† x_1 − 2x_2 + x_3 − x_4 = 3
     x_1 + x_2 + x_3 − x_4 = 1
     x_1 + x_3 − x_4 = 2

C24† x_1 − 2x_2 + x_3 − x_4 = 2
     x_1 + x_2 + x_3 − x_4 = 2
     x_1 + x_3 − x_4 = 2

C25† x_1 + 2x_2 + 3x_3 = 1
     2x_1 − x_2 + x_3 = 2
     3x_1 + x_2 + x_3 = 4
     x_2 + 2x_3 = 6

C26† x_1 + 2x_2 + 3x_3 = 1
     2x_1 − x_2 + x_3 = 2
     3x_1 + x_2 + x_3 = 4
     5x_2 + 2x_3 = 1

C27† x_1 + 2x_2 + 3x_3 = 0
     2x_1 − x_2 + x_3 = 2
     x_1 − 8x_2 − 7x_3 = 1
     x_2 + x_3 = 0

C28† x_1 + 2x_2 + 3x_3 = 1
     2x_1 − x_2 + x_3 = 2
     x_1 − 8x_2 − 7x_3 = 1
     x_2 + x_3 = 0
M45† The details for Archetype J include several sample solutions. Verify that one of
these solutions is correct (any one, but just one). Based only on this evidence, and especially
without doing any row operations, explain how you know this system of linear equations
has infinitely many solutions.
M46 Consider Archetype J, and specifically the row-reduced version of the augmented
matrix of the system of equations, denoted as B here, and the values of r, D and F
immediately following. Determine the values of the entries

[B]_{1,d_1}, [B]_{3,d_3}, [B]_{1,d_3}, [B]_{3,d_1}, [B]_{d_1,1}, [B]_{d_3,3}, [B]_{d_1,3}, [B]_{d_3,1}, [B]_{1,f_1}, [B]_{3,f_1}

(See Exercise TSS.M70 for a generalization.)
For Exercises M51–M57 say as much as possible about each system's solution set. Be
sure to make it clear which theorems you are using to reach your conclusions.

M51† A consistent system of 8 equations in 6 variables.

M52† A consistent system of 6 equations in 8 variables.

M53† A system of 5 equations in 9 variables.

M54† A system with 12 equations in 35 variables.

M56† A system with 6 equations in 12 variables.

M57† A system with 8 equations and 6 variables. The reduced row-echelon form of
the augmented matrix of the system has 7 pivot columns.
M60 Without doing any computations, and without examining any solutions, say as much
as possible about the form of the solution set for each archetype that is a system of equations.

Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F,
Archetype G, Archetype H, Archetype I, Archetype J
M70 Suppose that B is a matrix in reduced row-echelon form that is equivalent to the
augmented matrix of a system of equations with m equations in n variables. Let r, D and F
be as defined in Definition RREF. What can you conclude, in general, about the following
entries?

[B]_{1,d_1}, [B]_{3,d_3}, [B]_{1,d_3}, [B]_{3,d_1}, [B]_{d_1,1}, [B]_{d_3,3}, [B]_{d_1,3}, [B]_{d_3,1}, [B]_{1,f_1}, [B]_{3,f_1}

If you cannot conclude anything about an entry, then say so. (See Exercise TSS.M46.)
T10† An inconsistent system may have r > n. If we try (incorrectly!) to apply Theorem
FVCS to such a system, how many free variables would we discover?
T11† Suppose A is the augmented matrix of a system of linear equations in n variables,
and that B is a row-equivalent matrix in reduced row-echelon form with r pivot columns.
If r = n + 1, prove that the system of equations is inconsistent.
T20 Suppose that B is a matrix in reduced row-echelon form that is equivalent to the
augmented matrix of a system of equations with m equations in n variables. Let r, D and
F be as defined in Definition RREF. Prove that d_k ≥ k for all 1 ≤ k ≤ r. Then suppose
that r ≥ 2 and 1 ≤ k < ℓ ≤ r, and determine what you can conclude, in general, about
the entries [B]_{k,d_ℓ} and [B]_{ℓ,d_k}.
Section HSE
Homogeneous Systems of Equations

In this section we specialize to systems of linear equations where every equation
has a zero as its constant term. Along the way, we will begin to express more and
more ideas in the language of matrices and begin a move away from writing out
whole systems of equations. The ideas initiated in this section will carry through
the remainder of the course.
Subsection SHS
Solutions of Homogeneous Systems

As usual, we begin with a definition.

Definition HS Homogeneous System
A system of linear equations, LS(A, b), is homogeneous if the vector of constants
is the zero vector; in other words, if b = 0. □
Example AHSAC Archetype C as a homogeneous system
For each archetype that is a system of equations, we have formulated a similar, yet
different, homogeneous system of equations by replacing each equation's constant
term with a zero. To wit, for Archetype C, we can convert the original system of
equations into the homogeneous system,

2x_1 − 3x_2 + x_3 − 6x_4 = 0
4x_1 + x_2 + 2x_3 + 9x_4 = 0
3x_1 + x_2 + x_3 + 8x_4 = 0

Can you quickly find a solution to this system without row-reducing the augmented
matrix?

△
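One candidate answer can be checked in a single line. The snippet below is a plain-Python sketch (not from the text), with the coefficients taken from the homogeneous system above.

```python
# Coefficient rows of the homogeneous version of Archetype C.
A = [[2, -3, 1, -6], [4, 1, 2, 9], [3, 1, 1, 8]]
x = [0, 0, 0, 0]   # set every variable to zero

# Every equation reduces to 0 = 0, so all of them hold.
print(all(sum(a * v for a, v in zip(row, x)) == 0 for row in A))  # True
```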
As you might have discovered by studying Example AHSAC, setting each variable
to zero will always be a solution of a homogeneous system. This is the substance of
the following theorem.

Theorem HSC Homogeneous Systems are Consistent
Suppose that a system of linear equations is homogeneous. Then the system is
consistent and one solution is found by setting each variable to zero.

Proof. Set each variable of the system to zero. When substituting these values into
each equation, the left-hand side evaluates to zero, no matter what the coefficients
are. Since a homogeneous system has zero on the right-hand side of each equation
as the constant term, each equation is true. With one demonstrated solution, we
can call the system consistent. ∎
Since this solution is so obvious, we now define it as the trivial solution.

Definition TSHSE Trivial Solution to Homogeneous Systems of Equations
Suppose a homogeneous system of linear equations has n variables. The solution
x_1 = 0, x_2 = 0, …, x_n = 0 (i.e. x = 0) is called the trivial solution. □

Here are three typical examples, which we will reference throughout this section.
Work through the row operations as we bring each to reduced row-echelon form.
Also notice what is similar in each example, and what differs.
Example HUSAB Homogeneous, unique solution, Archetype B
Archetype B can be converted to the homogeneous system,

−7x_1 − 6x_2 − 12x_3 = 0
5x_1 + 5x_2 + 7x_3 = 0
x_1 + 4x_3 = 0

whose augmented matrix row-reduces to

[ 1  0  0  0 ]
[ 0  1  0  0 ]
[ 0  0  1  0 ]

By Theorem HSC, the system is consistent, and so the computation n − r =
3 − 3 = 0 means the solution set contains just a single solution. Then, this lone
solution must be the trivial solution.

△
Example HISAA Homogeneous, infinite solutions, Archetype A
Archetype A can be converted to the homogeneous system,

x_1 − x_2 + 2x_3 = 0
2x_1 + x_2 + x_3 = 0
x_1 + x_2 = 0

whose augmented matrix row-reduces to

[ 1  0  1  0 ]
[ 0  1  −1  0 ]
[ 0  0  0  0 ]

By Theorem HSC, the system is consistent, and so the computation n − r =
3 − 2 = 1 means the solution set contains one free variable by Theorem FVCS, and
hence has infinitely many solutions. We can describe this solution set using the free
variable x_3,

S = { [x_1, x_2, x_3]^t | x_1 = −x_3, x_2 = x_3 } = { [−x_3, x_3, x_3]^t | x_3 ∈ C }
Geometrically, these are points in three dimensions that lie on a line through the
origin.

△
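We might spot-check this description of S by substitution. The sketch below is plain Python (not from the text), with the coefficients taken from the homogeneous Archetype A system above, testing several values of x_3.

```python
# Coefficient rows of the homogeneous version of Archetype A.
A = [[1, -1, 2], [2, 1, 1], [1, 1, 0]]

def solves(x):
    """True when x satisfies every equation of the homogeneous system."""
    return all(sum(a * v for a, v in zip(row, x)) == 0 for row in A)

# Vectors of the form (-x3, x3, x3) for several choices of x3.
print(all(solves([-t, t, t]) for t in range(-3, 4)))  # True
```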
Example HISAD Homogeneous, infinite solutions, Archetype D
Archetype D (and identically, Archetype E) can be converted to the homogeneous
system,

2x_1 + x_2 + 7x_3 − 7x_4 = 0
−3x_1 + 4x_2 − 5x_3 − 6x_4 = 0
x_1 + x_2 + 4x_3 − 5x_4 = 0

whose augmented matrix row-reduces to

[ 1  0  3  −2  0 ]
[ 0  1  1  −3  0 ]
[ 0  0  0  0  0 ]

By Theorem HSC, the system is consistent, and so the computation n − r =
4 − 2 = 2 means the solution set contains two free variables by Theorem FVCS, and
hence has infinitely many solutions. We can describe this solution set using the free
variables x_3 and x_4,
S = { [x_1, x_2, x_3, x_4]^t | x_1 = −3x_3 + 2x_4, x_2 = −x_3 + 3x_4 }
  = { [−3x_3 + 2x_4, −x_3 + 3x_4, x_3, x_4]^t | x_3, x_4 ∈ C }

△
After working through these examples, you might perform the same computations
for the slightly larger example, Archetype J.

Notice that when we do row operations on the augmented matrix of a homogeneous
system of linear equations the last column of the matrix is all zeros. Any one of
the three allowable row operations will convert zeros to zeros and thus, the final
column of the matrix in reduced row-echelon form will also be all zeros. So in this
case, we will often reference only the coefficient matrix, remembering that the final
column begins as all zeros and remains all zeros after any number of row operations.

Example HISAD suggests the following theorem.
Theorem HMVEI Homogeneous, More Variables than Equations, Infinite solutions
Suppose that a homogeneous system of linear equations has m equations and n
variables with n > m. Then the system has infinitely many solutions.
Proof. We are assuming the system is homogeneous, so Theorem HSC says it is
consistent. Then the hypothesis that n > m, together with Theorem CMVEI, gives
infinitely many solutions. ∎
Example HUSAB and Example HISAA are concerned with homogeneous systems
where n = m and expose a fundamental distinction between the two examples. One
has a unique solution, while the other has infinitely many. These are exactly the
only two possibilities for a homogeneous system and illustrate that each is possible
(unlike the case when n > m, where Theorem HMVEI tells us that there is only one
possibility for a homogeneous system).
Subsection NSM
Null Space of a Matrix

The set of solutions to a homogeneous system (which by Theorem HSC is never
empty) is of enough interest to warrant its own name. However, we define it as a
property of the coefficient matrix, not as a property of some system of equations.

Definition NSM Null Space of a Matrix
The null space of a matrix A, denoted N(A), is the set of all the vectors that are
solutions to the homogeneous system LS(A, 0). □
In the Archetypes (Archetypes) each example that is a system of equations also
has a corresponding homogeneous system of equations listed, and several sample
solutions are given. These solutions will be elements of the null space of the coefficient
matrix. We will look at one example.

Example NSEAI Null space elements of Archetype I
The write-up for Archetype I lists several solutions of the corresponding homogeneous
system. Here are two, written as solution vectors. We can say that they are in the
null space of the coefficient matrix for the system of equations in Archetype I.
x = [3, 0, −5, −6, 0, 0, 1]^t        y = [−4, 1, −3, −2, 1, 1, 1]^t
However, the vector

z = [1, 0, 0, 0, 0, 0, 2]^t

is not in the null space, since it is not a solution to the homogeneous system. For
example, it fails to even make the first equation true.

△
Here are two (prototypical) examples of the computation of the null space of a
matrix.

Example CNS1 Computing a null space, no. 1
Let us compute the null space of

A = [ 2  −1   7  −3  −8 ]
    [ 1   0   2   4   9 ]
    [ 2   2  −2  −1   8 ]

which we write as N(A). Translating Definition NSM, we simply desire to solve the
homogeneous system LS(A, 0). So we row-reduce the augmented matrix to obtain

[ 1  0   2  0  1  0 ]
[ 0  1  −3  0  4  0 ]
[ 0  0   0  1  2  0 ]

The variables (of the homogeneous system) x_3 and x_5 are free (since columns 1,
2 and 4 are pivot columns), so we arrange the equations represented by the matrix
in reduced row-echelon form to

x_1 = −2x_3 − x_5
x_2 = 3x_3 − 4x_5
x_4 = −2x_5

So we can write the infinite solution set as a set of column vectors,

N(A) = { [−2x_3 − x_5, 3x_3 − 4x_5, x_3, −2x_5, x_5]^t | x_3, x_5 ∈ C }

△
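Two sample choices of the free variables give vectors we can check against Definition NSM directly. The snippet below is a plain-Python sketch (not from the text), with A as above.

```python
# The matrix A from Example CNS1.
A = [[2, -1, 7, -3, -8], [1, 0, 2, 4, 9], [2, 2, -2, -1, 8]]

def in_null_space(x):
    """True when A x = 0, i.e. x solves the homogeneous system LS(A, 0)."""
    return all(sum(a * v for a, v in zip(row, x)) == 0 for row in A)

print(in_null_space([-2, 3, 1, 0, 0]))    # x3 = 1, x5 = 0 -> True
print(in_null_space([-1, -4, 0, -2, 1]))  # x3 = 0, x5 = 1 -> True
```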
Example CNS2 Computing a null space, no. 2
Let us compute the null space of

C = [ −4  6  1 ]
    [ −1  4  1 ]
    [  5  6  7 ]
    [  4  7  1 ]

which we write as N(C). Translating Definition NSM, we simply desire to solve the
homogeneous system LS(C, 0). So we row-reduce the augmented matrix to obtain

[ 1  0  0  0 ]
[ 0  1  0  0 ]
[ 0  0  1  0 ]
[ 0  0  0  0 ]

There are no free variables in the homogeneous system represented by the row-reduced
matrix, so there is only the trivial solution, the zero vector, 0. So we can
write the (trivial) solution set as

N(C) = {0} = { [0, 0, 0]^t }

△
Reading Questions

1. What is always true of the solution set for a homogeneous system of equations?

2. Suppose a homogeneous system of equations has 13 variables and 8 equations. How
many solutions will it have? Why?

3. Describe, using only words, the null space of a matrix. (So in particular, do not use any
symbols.)
Exercises

C10 Each Archetype (Archetypes) that is a system of equations has a corresponding
homogeneous system with the same coefficient matrix. Compute the set of solutions for
each. Notice that these solution sets are the null spaces of the coefficient matrices.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F,
Archetype G/Archetype H, Archetype I, Archetype J
C20 Archetype K and Archetype L are simply 5 × 5 matrices (i.e. they are not systems
of equations). Compute the null space of each matrix.

For Exercises C21–C23, solve the given homogeneous linear system. Compare your results
to the results of the corresponding exercise in Section TSS.
C21† x_1 + 4x_2 + 3x_3 − x_4 = 0
     x_1 − x_2 + x_3 + 2x_4 = 0
     4x_1 + x_2 + 6x_3 + 5x_4 = 0

C22† x_1 − 2x_2 + x_3 − x_4 = 0
     2x_1 − 4x_2 + x_3 + x_4 = 0
     x_1 − 2x_2 − 2x_3 + 3x_4 = 0

C23† x_1 − 2x_2 + x_3 − x_4 = 0
     x_1 + x_2 + x_3 − x_4 = 0
     x_1 + x_3 − x_4 = 0
For Exercises C25–C27, solve the given homogeneous linear system. Compare your results
to the results of the corresponding exercise in Section TSS.

C25† x_1 + 2x_2 + 3x_3 = 0
     2x_1 − x_2 + x_3 = 0
     3x_1 + x_2 + x_3 = 0
     x_2 + 2x_3 = 0

C26† x_1 + 2x_2 + 3x_3 = 0
     2x_1 − x_2 + x_3 = 0
     3x_1 + x_2 + x_3 = 0
     5x_2 + 2x_3 = 0

C27† x_1 + 2x_2 + 3x_3 = 0
     2x_1 − x_2 + x_3 = 0
     x_1 − 8x_2 − 7x_3 = 0
     x_2 + x_3 = 0
C30† Compute the null space of the matrix A, N(A).

A = [  2   4   1   3  8 ]
    [ −1  −2  −1  −1  1 ]
    [  2   4   0  −3  4 ]
    [  2   4  −1  −7  4 ]

C31† Find the null space of the matrix B, N(B).

B = [ −6   4  −36   6 ]
    [  2  −1   10  −1 ]
    [ −3   2  −18   3 ]
M45 Without doing any computations, and without examining any solutions, say as
much as possible about the form of the solution set for the corresponding homogeneous
system of equations of each archetype that is a system of equations.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F,
Archetype G/Archetype H, Archetype I, Archetype J
For Exercises M50–M52 say as much as possible about each system's solution set. Be
sure to make it clear which theorems you are using to reach your conclusions.

M50† A homogeneous system of 8 equations in 8 variables.

M51† A homogeneous system of 8 equations in 9 variables.

M52† A homogeneous system of 8 equations in 7 variables.
T10† Prove or disprove: A system of linear equations is homogeneous if and only if the
system has the zero vector as a solution.

T11† Suppose that two systems of linear equations are equivalent. Prove that if the first
system is homogeneous, then the second system is homogeneous. Notice that this will
allow us to conclude that two equivalent systems are either both homogeneous or both not
homogeneous.

T12 Give an alternate proof of Theorem HSC that uses Theorem RCLS.
T20† Consider the homogeneous system of linear equations LS(A, 0), and suppose that
u = [u_1, u_2, u_3, …, u_n]^t is one solution to the system of equations. Prove that
v = [4u_1, 4u_2, 4u_3, …, 4u_n]^t is also a solution to LS(A, 0).
Section NM
Nonsingular Matrices

In this section we specialize further and consider matrices with equal numbers of
rows and columns, which when considered as coefficient matrices lead to systems
with equal numbers of equations and variables. We will see in the second half of
the course (Chapter D, Chapter E, Chapter LT, Chapter R) that these matrices are
especially important.
Subsection NM
Nonsingular Matrices

Our theorems will now establish connections between systems of equations (homogeneous
or otherwise), augmented matrices representing those systems, coefficient
matrices, constant vectors, the reduced row-echelon form of matrices (augmented and
coefficient) and solution sets. Be very careful in your reading, writing and speaking
about systems of equations, matrices and sets of vectors. A system of equations is
not a matrix, a matrix is not a solution set, and a solution set is not a system of
equations. Now would be a great time to review the discussion about speaking and
writing mathematics in Proof Technique L.
Definition SQM Square Matrix
A matrix with m rows and n columns is square if m = n. In this case, we say the matrix has size n. To emphasize the situation when a matrix is not square, we will call it rectangular. □

We can now present one of the central definitions of linear algebra.

Definition NM Nonsingular Matrix
Suppose A is a square matrix. Suppose further that the solution set to the homogeneous linear system of equations LS(A, 0) is {0}, in other words, the system has only the trivial solution. Then we say that A is a nonsingular matrix. Otherwise we say A is a singular matrix. □
We can investigate whether any square matrix is nonsingular or not, no matter if the matrix is derived somehow from a system of equations or if it is simply a matrix. The definition says that to perform this investigation we must construct a very specific system of equations (homogeneous, with the matrix as the coefficient matrix) and look at its solution set. We will have theorems in this section that connect nonsingular matrices with systems of equations, creating more opportunities for confusion. Convince yourself now of two observations, (1) we can decide nonsingularity for any square matrix, and (2) the determination of nonsingularity involves the solution set for a certain homogeneous system of equations.

Notice that it makes no sense to call a system of equations nonsingular (the term
§NM Beezer: A First Course in Linear Algebra
does not apply to a system of equations), nor does it make any sense to call a 5 × 7 matrix singular (the matrix is not square).
Example S A singular matrix, Archetype A
Example HISAA shows that the coefficient matrix derived from Archetype A, specifically the 3 × 3 matrix,
\[
A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\]
is a singular matrix since there are nontrivial solutions to the homogeneous system LS(A, 0). △
Example NM A nonsingular matrix, Archetype B
Example HUSAB shows that the coefficient matrix derived from Archetype B, specifically the 3 × 3 matrix,
\[
B = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix}
\]
is a nonsingular matrix since the homogeneous system, LS(B, 0), has only the trivial solution. △
Notice that we will not discuss Example HISAD as being a singular or nonsingular coefficient matrix since the matrix is not square.

The next theorem combines with our main computational technique (row reducing a matrix) to make it easy to recognize a nonsingular matrix. But first a definition.
Definition IM Identity Matrix
The m × m identity matrix, I_m, is defined by
\[
[I_m]_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} \qquad 1 \le i, j \le m
\]
□

Example IM An identity matrix
The 4 × 4 identity matrix is
\[
I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]
Notice that an identity matrix is square, and in reduced row-echelon form. Also, every column is a pivot column, and every possible pivot column appears once. △
Theorem NMRRI Nonsingular Matrices Row Reduce to the Identity matrix
Suppose that A is a square matrix and B is a row-equivalent matrix in reduced row-echelon form. Then A is nonsingular if and only if B is the identity matrix.

Proof. (⇐) Suppose B is the identity matrix. When the augmented matrix [A | 0] is row-reduced, the result is [B | 0] = [I_n | 0]. The number of nonzero rows is equal to the number of variables in the linear system of equations LS(A, 0), so n = r and Theorem FVCS gives n − r = 0 free variables. Thus, the homogeneous system LS(A, 0) has just one solution, which must be the trivial solution. This is exactly the definition of a nonsingular matrix (Definition NM).

(⇒) If A is nonsingular, then the homogeneous system LS(A, 0) has a unique solution, and has no free variables in the description of the solution set. The homogeneous system is consistent (Theorem HSC) so Theorem FVCS applies and tells us there are n − r free variables. Thus, n − r = 0, and so n = r. So B has n pivot columns among its total of n columns. This is enough to force B to be the n × n identity matrix I_n (see Exercise NM.T12). ■
Notice that since this theorem is an equivalence it will always allow us to determine if a matrix is either nonsingular or singular. Here are two examples of this, continuing our study of Archetype A and Archetype B.
Example SRR Singular matrix, row-reduced
We have the coefficient matrix for Archetype A and a row-equivalent matrix B in reduced row-echelon form,
\[
A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix} = B
\]
Since B is not the 3 × 3 identity matrix, Theorem NMRRI tells us that A is a singular matrix. △
Example NSR Nonsingular matrix, row-reduced
We have the coefficient matrix for Archetype B and a row-equivalent matrix B in reduced row-echelon form,
\[
A = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = B
\]
Since B is the 3 × 3 identity matrix, Theorem NMRRI tells us that A is a nonsingular matrix. △
Subsection NSNM
Null Space of a Nonsingular Matrix

Nonsingular matrices and their null spaces are intimately related, as the next two examples illustrate.

Example NSS Null space of a singular matrix
Given the singular coefficient matrix from Archetype A, the null space is the set of solutions to the homogeneous system of equations LS(A, 0), which has a solution set and null space constructed in Example HISAA as an infinite set of vectors.
\[
A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\qquad
\mathcal{N}(A) = \left\{ \begin{bmatrix} -x_3 \\ x_3 \\ x_3 \end{bmatrix} \,\middle|\, x_3 \in \mathbb{C} \right\}
\]
△
Example NSNM Null space of a nonsingular matrix
Given the nonsingular coefficient matrix from Archetype B, the solution set to the homogeneous system LS(A, 0) is constructed in Example HUSAB and contains only the trivial solution, so the null space of A has only a single element,
\[
A = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix}
\qquad
\mathcal{N}(A) = \left\{ \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right\}
\]
△
These two examples illustrate the next theorem, which is another equivalence.

Theorem NMTNS Nonsingular Matrices have Trivial Null Spaces
Suppose that A is a square matrix. Then A is nonsingular if and only if the null space of A is the set containing only the zero vector, i.e. N(A) = {0}.

Proof. The null space of a square matrix, A, is equal to the set of solutions to the homogeneous system, LS(A, 0). A matrix is nonsingular if and only if the set of solutions to the homogeneous system, LS(A, 0), has only a trivial solution. These two observations may be chained together to construct the two proofs necessary for each half of this theorem. ■
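Theorem NMTNS gives a quick way to certify that a matrix is singular: exhibit a single nonzero vector in its null space. Here is a small sketch, not part of the text, checking the Archetype A coefficient matrix against the null space description of Example NSS (taking x_3 = 1).

```python
def matvec(a, x):
    """Multiply matrix a (a list of rows) by the column vector x."""
    return [sum(r * v for r, v in zip(row, x)) for row in a]

# Archetype A's coefficient matrix
A = [[1, -1, 2], [2, 1, 1], [1, 1, 0]]

# a nonzero vector taken from Example NSS's description of N(A), with x3 = 1
u = [-1, 1, 1]

print(matvec(A, u))  # → [0, 0, 0], a nontrivial solution, so A is singular
```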
The next theorem pulls a lot of big ideas together. Theorem NMUS tells us that we can learn much about solutions to a system of linear equations with a square coefficient matrix by just examining a similar homogeneous system.

Theorem NMUS Nonsingular Matrices and Unique Solutions
Suppose that A is a square matrix. A is a nonsingular matrix if and only if the system LS(A, b) has a unique solution for every choice of the constant vector b.
Proof. (⇐) The hypothesis for this half of the proof is that the system LS(A, b) has a unique solution for every choice of the constant vector b. We will make a very specific choice for b: b = 0. Then we know that the system LS(A, 0) has a unique solution. But this is precisely the definition of what it means for A to be nonsingular (Definition NM). That almost seems too easy! Notice that we have not used the full power of our hypothesis, but there is nothing that says we must use a hypothesis to its fullest.

(⇒) We assume that A is nonsingular of size n × n, so we know there is a sequence of row operations that will convert A into the identity matrix I_n (Theorem NMRRI). Form the augmented matrix A′ = [A | b] and apply this same sequence of row operations to A′. The result will be the matrix B′ = [I_n | c], which is in reduced row-echelon form with r = n. Then the augmented matrix B′ represents the (extremely simple) system of equations x_i = [c]_i, 1 ≤ i ≤ n. The vector c is clearly a solution, so the system is consistent (Definition CS). With a consistent system, we use Theorem FVCS to count free variables. We find that there are n − r = n − n = 0 free variables, and so we therefore know that the solution is unique. (This half of the proof was suggested by Asa Scherer.) ■
This theorem helps to explain part of our interest in nonsingular matrices. If a matrix is nonsingular, then no matter what vector of constants we pair it with, using the matrix as the coefficient matrix will always yield a linear system of equations with a solution, and the solution is unique. To determine if a matrix has this property (nonsingularity) it is enough to just solve one linear system, the homogeneous system with the matrix as coefficient matrix and the zero vector as the vector of constants (or any other vector of constants, see Exercise MM.T10).

Formulating the negation of the second part of this theorem is a good exercise. A singular matrix has the property that for some value of the vector b, the system LS(A, b) does not have a unique solution (which means that it has no solution or infinitely many solutions). We will be able to say more about this case later (see the discussion following Theorem PSPHS).
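Theorem NMUS can be illustrated concretely: with a nonsingular coefficient matrix, Gauss-Jordan reduction of the augmented matrix [A | b] always terminates at [I_n | c], and c is the unique solution. The sketch below is not from the text; it uses exact rational arithmetic and the constant vector of Archetype B, whose unique solution is x = (−3, 5, 2).

```python
from fractions import Fraction

def solve_unique(a, b):
    """Gauss-Jordan elimination on [A | b]; assumes A is square and nonsingular."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(a, b)]
    for col in range(n):
        # a nonzero pivot must exist in this column since A is nonsingular
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]   # leading 1
        for r in range(n):
            if r != col and aug[r][col] != 0:              # clear the column
                aug[r] = [x - aug[r][col] * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]                         # the column c

B = [[-7, -6, -12], [5, 5, 7], [1, 0, 4]]   # Archetype B, nonsingular
print(solve_unique(B, [-33, 24, 5]))        # the unique solution (-3, 5, 2)
```

For a singular matrix this routine would fail to find a pivot in some column, which is exactly the case Theorem NMUS excludes.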
Square matrices that are nonsingular have a long list of interesting properties, which we will start to catalog in the following, recurring, theorem. Of course, singular matrices will then have all of the opposite properties. The following theorem is a list of equivalences.

We want to understand just what is involved with understanding and proving a theorem that says several conditions are equivalent. So have a look at Proof Technique ME before studying the first in this series of theorems.

Theorem NME1 Nonsingular Matrix Equivalences, Round 1
Suppose that A is a square matrix. The following are equivalent.

1. A is nonsingular.

2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.

4. The linear system LS(A, b) has a unique solution for every possible choice of b.

Proof. The statement that A is nonsingular is equivalent to each of the subsequent statements by, in turn, Theorem NMRRI, Theorem NMTNS and Theorem NMUS. So the statement of this theorem is just a convenient way to organize all these results. ■
Finally, you may have wondered why we refer to a matrix as nonsingular when it creates systems of equations with single solutions (Theorem NMUS)! I have wondered the same thing. We will have an opportunity to address this when we get to Theorem SMZD. Can you wait that long?

Reading Questions

1. In your own words state the definition of a nonsingular matrix.

2. What is the easiest way to recognize if a square matrix is nonsingular or not?

3. Suppose we have a system of equations and its coefficient matrix is nonsingular. What can you say about the solution set for this system?
Exercises

In Exercises C30–C33 determine if the matrix is nonsingular or singular. Give reasons for your answer.

C30 †
\[
\begin{bmatrix} -3 & 1 & 2 & 8 \\ 2 & 0 & 3 & 4 \\ 1 & 2 & 7 & -4 \\ 5 & -1 & 2 & 0 \end{bmatrix}
\]
C31 †
\[
\begin{bmatrix} 2 & 3 & 1 & 4 \\ 1 & 1 & 1 & 0 \\ -1 & 2 & 3 & 5 \\ 1 & 2 & 1 & 3 \end{bmatrix}
\]

C32 †
\[
\begin{bmatrix} 9 & 3 & 2 & 4 \\ 5 & -6 & 1 & 3 \\ 4 & 1 & 3 & -5 \end{bmatrix}
\]
C33 †
\[
\begin{bmatrix} -1 & 2 & 0 & 3 \\ 1 & -3 & -2 & 4 \\ -2 & 0 & 4 & 3 \\ -3 & 1 & -2 & 3 \end{bmatrix}
\]
C40 Each of the archetypes below is a system of equations with a square coefficient matrix, or is itself a square matrix. Determine if these matrices are nonsingular, or singular. Comment on the null space of each matrix.
Archetype A, Archetype B, Archetype F, Archetype K, Archetype L
C50 † Find the null space of the matrix E below.
\[
E = \begin{bmatrix} 2 & 1 & -1 & -9 \\ 2 & 2 & -6 & -6 \\ 1 & 2 & -8 & 0 \\ -1 & 2 & -12 & 12 \end{bmatrix}
\]
M30 † Let A be the coefficient matrix of the system of equations below. Is A nonsingular or singular? Explain what you could infer about the solution set for the system based only on what you have learned about A being singular or nonsingular.
\[
\begin{aligned}
-x_1 + 5x_2 &= -8 \\
-2x_1 + 5x_2 + 5x_3 + 2x_4 &= 9 \\
-3x_1 - x_2 + 3x_3 + x_4 &= 3 \\
7x_1 + 6x_2 + 5x_3 + x_4 &= 30
\end{aligned}
\]
For Exercises M51–M52 say as much as possible about each system's solution set. Be sure to make it clear which theorems you are using to reach your conclusions.

M51 † 6 equations in 6 variables, singular coefficient matrix.

M52 † A system with a nonsingular coefficient matrix, not homogeneous.

T10 † Suppose that A is a square matrix, and B is a matrix in reduced row-echelon form that is row-equivalent to A. Prove that if A is singular, then the last row of B is a zero row.

T12 Using Definition RREF and Definition IM carefully, give a proof of the following equivalence: A is a square matrix in reduced row-echelon form where every column is a pivot column if and only if A is the identity matrix.

T30 † Suppose that A is a nonsingular matrix and A is row-equivalent to the matrix B. Prove that B is nonsingular.

T31 † Suppose that A is a square matrix of size n × n and that we know there is a single vector b ∈ C^n such that the system LS(A, b) has a unique solution. Prove that A is a nonsingular matrix. (Notice that this is very similar to Theorem NMUS, but is not exactly the same.)
T90 † Provide an alternative for the second half of the proof of Theorem NMUS, without appealing to properties of the reduced row-echelon form of the coefficient matrix. In other words, prove that if A is nonsingular, then LS(A, b) has a unique solution for every choice of the constant vector b. Construct this proof without using Theorem REMEF or Theorem RREFU.
Chapter V
Vectors

We have worked extensively in the last chapter with matrices, and some with vectors. In this chapter we will develop the properties of vectors, while preparing to study vector spaces (Chapter VS). Initially we will depart from our study of systems of linear equations, but in Section LC we will forge a connection between linear combinations and systems of linear equations in Theorem SLSLC. This connection will allow us to understand systems of linear equations at a higher level, while consequently discussing them less frequently.
Section VO
Vector Operations

In this section we define some new operations involving vectors, and collect some basic properties of these operations. Begin by recalling our definition of a column vector as an ordered list of complex numbers, written vertically (Definition CV). The collection of all possible vectors of a fixed size is a commonly used set, so we start with its definition.

Subsection CV
Column Vectors

Definition VSCV Vector Space of Column Vectors
The vector space C^m is the set of all column vectors (Definition CV) of size m with entries from the set of complex numbers, C. □

When a set similar to this is defined using only column vectors where all the entries are from the real numbers, it is written as R^m and is known as Euclidean m-space.
The term vector is used in a variety of different ways. We have defined it as an ordered list written vertically. It could simply be an ordered list of numbers, and perhaps written as ⟨2, 3, −1, 6⟩. Or it could be interpreted as a point in m dimensions, such as (3, 4, −2) representing a point in three dimensions relative to x, y and z axes. With an interpretation as a point, we can construct an arrow from the origin to the point which is consistent with the notion that a vector has direction and magnitude.

All of these ideas can be shown to be related and equivalent, so keep that in mind as you connect the ideas of this course with ideas from other disciplines. For now, we will stick with the idea that a vector is just a list of numbers, in some particular order.
Subsection VEASM
Vector Equality, Addition, Scalar Multiplication

We start our study of this set by first defining what it means for two vectors to be the same.

Definition CVE Column Vector Equality
Suppose that u, v ∈ C^m. Then u and v are equal, written u = v, if
\[
[\mathbf{u}]_i = [\mathbf{v}]_i \qquad 1 \le i \le m
\]
□

Now this may seem like a silly (or even stupid) thing to say so carefully. Of course two vectors are equal if they are equal for each corresponding entry! Well, this is not as silly as it appears. We will see a few occasions later where the obvious definition is not the right one. And besides, in doing mathematics we need to be very careful about making all the necessary definitions and making them unambiguous. And we have done that here.

Notice that the symbol "=" is now doing triple-duty. We know from our earlier education what it means for two numbers (real or complex) to be equal, and we take this for granted. In Definition SE we defined what it meant for two sets to be equal. Now we have defined what it means for two vectors to be equal, and that definition builds on our definition for when two numbers are equal when we use the condition u_i = v_i for all 1 ≤ i ≤ m. So think carefully about your objects when you see an equal sign and think about just which notion of equality you have encountered. This will be especially important when you are asked to construct proofs whose conclusion states that two objects are equal. If you have an electronic copy of the book, such as the PDF version, searching on "Definition CVE" can be an instructive exercise. See how often, and where, the definition is employed.

OK, let us do an example of vector equality that begins to hint at the utility of this definition.
Example VESE Vector equality for a system of equations
Consider the system of linear equations in Archetype B,
\[
\begin{aligned}
-7x_1 - 6x_2 - 12x_3 &= -33 \\
5x_1 + 5x_2 + 7x_3 &= 24 \\
x_1 + 4x_3 &= 5
\end{aligned}
\]
Note the use of three equals signs — each indicates an equality of numbers (the linear expressions are numbers when we evaluate them with fixed values of the variable quantities). Now write the vector equality,
\[
\begin{bmatrix} -7x_1 - 6x_2 - 12x_3 \\ 5x_1 + 5x_2 + 7x_3 \\ x_1 + 4x_3 \end{bmatrix}
=
\begin{bmatrix} -33 \\ 24 \\ 5 \end{bmatrix}.
\]
By Definition CVE, this single equality (of two column vectors) translates into three simultaneous equalities of numbers that form the system of equations. So with this new notion of vector equality we can become less reliant on referring to systems of simultaneous equations. There is more to vector equality than just this, but this is a good example for starters and we will develop it further. △
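Definition CVE is easy to mechanize: two vectors are equal exactly when they have the same size and agree entry by entry. The sketch below is not part of the text; it evaluates the left-hand vector of Example VESE at the values x_1 = −3, x_2 = 5, x_3 = 2 (assumed here to be the solution of Archetype B from earlier in the book) and compares with the constant vector.

```python
def equal_vectors(u, v):
    """Definition CVE: same size, and entry-by-entry equality."""
    return len(u) == len(v) and all(a == b for a, b in zip(u, v))

x1, x2, x3 = -3, 5, 2   # assumed solution to Archetype B
lhs = [-7*x1 - 6*x2 - 12*x3, 5*x1 + 5*x2 + 7*x3, x1 + 4*x3]
print(equal_vectors(lhs, [-33, 24, 5]))  # → True
```

One vector equality standing in for three numerical equalities is exactly the economy the example is pointing at.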
We will now define two operations on the set C^m. By this we mean well-defined procedures that somehow convert vectors into other vectors. Here are two of the most basic definitions of the entire course.

Definition CVA Column Vector Addition
Suppose that u, v ∈ C^m. The sum of u and v is the vector u + v defined by
\[
[\mathbf{u} + \mathbf{v}]_i = [\mathbf{u}]_i + [\mathbf{v}]_i \qquad 1 \le i \le m
\]
□

So vector addition takes two vectors of the same size and combines them (in a natural way!) to create a new vector of the same size. Notice that this definition is required, even if we agree that this is the obvious, right, natural or correct way to do it. Notice too that the symbol '+' is being recycled. We all know how to add numbers, but now we have the same symbol extended to double-duty and we use it to indicate how to add two new objects, vectors. And this definition of our new meaning is built on our previous meaning of addition via the expressions u_i + v_i. Think about your objects, especially when doing proofs. Vector addition is easy; here is an example from C^4.
Example VA Addition of two vectors in C^4
If
\[
\mathbf{u} = \begin{bmatrix} 2 \\ -3 \\ 4 \\ 2 \end{bmatrix}
\qquad
\mathbf{v} = \begin{bmatrix} -1 \\ 5 \\ 2 \\ -7 \end{bmatrix}
\]
then
\[
\mathbf{u} + \mathbf{v}
= \begin{bmatrix} 2 \\ -3 \\ 4 \\ 2 \end{bmatrix} + \begin{bmatrix} -1 \\ 5 \\ 2 \\ -7 \end{bmatrix}
= \begin{bmatrix} 2 + (-1) \\ -3 + 5 \\ 4 + 2 \\ 2 + (-7) \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \\ 6 \\ -5 \end{bmatrix}.
\]
△

Our second operation takes two objects of different types, specifically a number and a vector, and combines them to create another vector. In this context we call a number a scalar in order to emphasize that it is not a vector.

Definition CVSM Column Vector Scalar Multiplication
Suppose u ∈ C^m and α ∈ C, then the scalar multiple of u by α is the vector αu defined by
\[
[\alpha\mathbf{u}]_i = \alpha[\mathbf{u}]_i \qquad 1 \le i \le m
\]
□

Notice that we are doing a kind of multiplication here, but we are defining a new type, perhaps in what appears to be a natural way. We use juxtaposition (smashing two symbols together side-by-side) to denote this operation rather than using a symbol like we did with vector addition. So this can be another source of confusion. When two symbols are next to each other, are we doing regular old multiplication, the kind we have done for years, or are we doing scalar vector multiplication, the operation we just defined? Think about your objects — if the first object is a scalar, and the second is a vector, then it must be that we are doing our new operation, and the result of this operation will be another vector.

Notice how consistency in notation can be an aid here. If we write scalars as lower case Greek letters from the start of the alphabet (such as α, β, . . . ) and write vectors in bold Latin letters from the end of the alphabet (u, v, . . . ), then we have some hints about what type of objects we are working with. This can be a blessing and a curse, since when we go read another book about linear algebra, or read an application in another discipline (physics, economics, . . . ) the types of notation employed may be very different and hence unfamiliar.

Again, computationally, vector scalar multiplication is very easy.

Example CVSM Scalar multiplication in C^5
If
\[
\mathbf{u} = \begin{bmatrix} 3 \\ 1 \\ -2 \\ 4 \\ -1 \end{bmatrix}
\]
and α = 6, then
\[
\alpha\mathbf{u} = 6 \begin{bmatrix} 3 \\ 1 \\ -2 \\ 4 \\ -1 \end{bmatrix}
= \begin{bmatrix} 6(3) \\ 6(1) \\ 6(-2) \\ 6(4) \\ 6(-1) \end{bmatrix}
= \begin{bmatrix} 18 \\ 6 \\ -12 \\ 24 \\ -6 \end{bmatrix}.
\]
△
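Definitions CVA and CVSM are entry-by-entry recipes, so they translate directly into code. The sketch below is not part of the text (Python lists stand in for column vectors); it reproduces the arithmetic of Example VA and Example CVSM.

```python
def cv_add(u, v):
    """Definition CVA: entry-by-entry sum of two vectors of the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same size")
    return [a + b for a, b in zip(u, v)]

def cv_scalar_mult(alpha, u):
    """Definition CVSM: multiply every entry of u by the scalar alpha."""
    return [alpha * x for x in u]

print(cv_add([2, -3, 4, 2], [-1, 5, 2, -7]))   # → [1, 2, 6, -5]
print(cv_scalar_mult(6, [3, 1, -2, 4, -1]))    # → [18, 6, -12, 24, -6]
```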
Subsection VSP
Vector Space Properties

With definitions of vector addition and scalar multiplication we can state, and prove, several properties of each operation, and some properties that involve their interplay. We now collect ten of them here for later reference.

Theorem VSPCV Vector Space Properties of Column Vectors
Suppose that C^m is the set of column vectors of size m (Definition VSCV) with addition and scalar multiplication as defined in Definition CVA and Definition CVSM. Then

• ACC Additive Closure, Column Vectors
If u, v ∈ C^m, then u + v ∈ C^m.

• SCC Scalar Closure, Column Vectors
If α ∈ C and u ∈ C^m, then αu ∈ C^m.

• CC Commutativity, Column Vectors
If u, v ∈ C^m, then u + v = v + u.

• AAC Additive Associativity, Column Vectors
If u, v, w ∈ C^m, then u + (v + w) = (u + v) + w.

• ZC Zero Vector, Column Vectors
There is a vector, 0, called the zero vector, such that u + 0 = u for all u ∈ C^m.

• AIC Additive Inverses, Column Vectors
If u ∈ C^m, then there exists a vector −u ∈ C^m so that u + (−u) = 0.

• SMAC Scalar Multiplication Associativity, Column Vectors
If α, β ∈ C and u ∈ C^m, then α(βu) = (αβ)u.

• DVAC Distributivity across Vector Addition, Column Vectors
If α ∈ C and u, v ∈ C^m, then α(u + v) = αu + αv.
• DSAC Distributivity across Scalar Addition, Column Vectors<br />
If α, β ∈ C and u ∈ C m ,then(α + β)u = αu + βu.<br />
• OC One, Column Vectors<br />
If u ∈ C m ,then1u = u.<br />
Proof. While some of these properties seem very obvious, they all require proof. However, the proofs are not very interesting, and border on tedious. We will prove one version of distributivity very carefully, and you can test your proof-building skills on some of the others. We need to establish an equality, so we will do so by beginning with one side of the equality, and apply various definitions and theorems (listed to the right of each step) to massage the expression from the left into the expression on the right. Here we go with a proof of Property DSAC.

For 1 ≤ i ≤ m,
\[
\begin{aligned}
[(\alpha+\beta)\mathbf{u}]_i
&=(\alpha+\beta)[\mathbf{u}]_i && \text{Definition CVSM}\\
&=\alpha[\mathbf{u}]_i+\beta[\mathbf{u}]_i && \text{Property DCN}\\
&=[\alpha\mathbf{u}]_i+[\beta\mathbf{u}]_i && \text{Definition CVSM}\\
&=[\alpha\mathbf{u}+\beta\mathbf{u}]_i && \text{Definition CVA}
\end{aligned}
\]
Since the individual components of the vectors (α + β)u and αu + βu are equal for all i, 1 ≤ i ≤ m, Definition CVE tells us the vectors are equal. ∎
Many of the conclusions of our theorems can be characterized as “identities,” especially when we are establishing basic properties of operations such as those in this section. Most of the properties listed in Theorem VSPCV are examples. So some advice about the style we use for proving identities is appropriate right now. Have a look at Proof Technique PI.
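As an informal spot-check (not a proof), the componentwise argument behind Property DSAC can be replayed numerically. Vectors are modeled here as plain Python lists, and the scalars and vector are an arbitrary choice of ours, not data from the text.

```python
# Check Property DSAC entry by entry: [(alpha + beta)u]_i = alpha[u]_i + beta[u]_i.
alpha, beta = 2, -3          # arbitrary sample scalars
u = [1, -2, 4]               # an arbitrary sample vector in C^3

lhs = [(alpha + beta) * ui for ui in u]      # entries of (alpha + beta)u
rhs = [alpha * ui + beta * ui for ui in u]   # entries of alpha*u + beta*u

assert lhs == rhs            # equal in every component, as the proof shows
print(lhs)                   # -> [-1, 2, -4]
```

A check like this cannot replace the proof, since it only tests one choice of data, but it is a quick way to catch a misstated identity.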
Be careful with the notion of the vector −u. This is a vector that we add to u so that the result is the particular vector 0. This is basically a property of vector addition. It happens that we can compute −u using the other operation, scalar multiplication. We can prove this directly by writing that, for 1 ≤ i ≤ m,
\[
[-\mathbf{u}]_i = -[\mathbf{u}]_i = (-1)[\mathbf{u}]_i = [(-1)\mathbf{u}]_i
\]
We will see later how to derive this property as a consequence of several of the ten properties listed in Theorem VSPCV.
Similarly, we will often write something you would immediately recognize as “vector subtraction.” This could be placed on a firm theoretical foundation — as you can do yourself with Exercise VO.T30.
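The two claims just made — that (−1)u behaves as the additive inverse −u, and that subtraction is nothing more than u + (−1)v — can be spot-checked with a few lines of code. The helper names below are our own, chosen to mirror Definition CVA and Definition CVSM.

```python
def add(u, v):
    """Vector addition, entry by entry (in the spirit of Definition CVA)."""
    return [ui + vi for ui, vi in zip(u, v)]

def smult(a, u):
    """Scalar multiplication, entry by entry (in the spirit of Definition CVSM)."""
    return [a * ui for ui in u]

u = [3, 1, -2]   # arbitrary sample vectors in C^3
v = [7, 0, 5]

# (-1)u acts as the additive inverse of u ...
assert add(u, smult(-1, u)) == [0, 0, 0]

# ... and entrywise subtraction agrees with u + (-1)v (compare Exercise VO.T30).
sub = [ui - vi for ui, vi in zip(u, v)]
assert sub == add(u, smult(-1, v))
```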
A final note. Property AAC implies that we do not have to be careful about how we “parenthesize” the addition of vectors. In other words, there is nothing to be gained by writing (u + v) + (w + (x + y)) rather than u + v + w + x + y, since we get the same result no matter which order we choose to perform the four additions. So we will not be careful about using parentheses this way.
Reading Questions

1. Where have you seen vectors used before in other courses? How were they different?
2. In words only, when are two vectors equal?
3. Perform the following computation with vector operations
\[
2\begin{bmatrix}1\\5\\0\end{bmatrix}+(-3)\begin{bmatrix}7\\6\\5\end{bmatrix}
\]
Exercises

C10† Compute
\[
4\begin{bmatrix}2\\-3\\4\\1\\0\end{bmatrix}
+(-2)\begin{bmatrix}1\\2\\-5\\2\\4\end{bmatrix}
+\begin{bmatrix}-1\\3\\0\\1\\2\end{bmatrix}
\]
C11† Solve the given vector equation for x, or explain why no solution exists:
\[
3\begin{bmatrix}1\\2\\-1\end{bmatrix}
+4\begin{bmatrix}2\\0\\x\end{bmatrix}
=\begin{bmatrix}11\\6\\17\end{bmatrix}
\]

C12† Solve the given vector equation for α, or explain why no solution exists:
\[
\alpha\begin{bmatrix}1\\2\\-1\end{bmatrix}
+4\begin{bmatrix}3\\4\\2\end{bmatrix}
=\begin{bmatrix}-1\\0\\4\end{bmatrix}
\]

C13† Solve the given vector equation for α, or explain why no solution exists:
\[
\alpha\begin{bmatrix}3\\2\\-2\end{bmatrix}
+\begin{bmatrix}6\\1\\2\end{bmatrix}
=\begin{bmatrix}0\\-3\\6\end{bmatrix}
\]

C14† Find α and β that solve the vector equation.
\[
\alpha\begin{bmatrix}1\\0\end{bmatrix}
+\beta\begin{bmatrix}0\\1\end{bmatrix}
=\begin{bmatrix}3\\2\end{bmatrix}
\]

C15† Find α and β that solve the vector equation.
\[
\alpha\begin{bmatrix}2\\1\end{bmatrix}
+\beta\begin{bmatrix}1\\3\end{bmatrix}
=\begin{bmatrix}5\\0\end{bmatrix}
\]
T05† Provide reasons (mostly vector space properties) as justification for each of the seven steps of the following proof.

Theorem For any vectors u, v, w ∈ C^m, if u + v = u + w, then v = w.

Proof: Let u, v, w ∈ C^m, and suppose u + v = u + w.
\[
\begin{aligned}
\mathbf{v}&=\mathbf{0}+\mathbf{v}\\
&=(-\mathbf{u}+\mathbf{u})+\mathbf{v}\\
&=-\mathbf{u}+(\mathbf{u}+\mathbf{v})\\
&=-\mathbf{u}+(\mathbf{u}+\mathbf{w})\\
&=(-\mathbf{u}+\mathbf{u})+\mathbf{w}\\
&=\mathbf{0}+\mathbf{w}\\
&=\mathbf{w}
\end{aligned}
\]
T06† Provide reasons (mostly vector space properties) as justification for each of the six steps of the following proof.

Theorem For any vector u ∈ C^m, 0u = 0.

Proof: Let u ∈ C^m.
\[
\begin{aligned}
\mathbf{0}&=0\mathbf{u}+(-0\mathbf{u})\\
&=(0+0)\mathbf{u}+(-0\mathbf{u})\\
&=(0\mathbf{u}+0\mathbf{u})+(-0\mathbf{u})\\
&=0\mathbf{u}+(0\mathbf{u}+(-0\mathbf{u}))\\
&=0\mathbf{u}+\mathbf{0}\\
&=0\mathbf{u}
\end{aligned}
\]
T07† Provide reasons (mostly vector space properties) as justification for each of the six steps of the following proof.

Theorem For any scalar c, c0 = 0.

Proof: Let c be an arbitrary scalar.
\[
\begin{aligned}
\mathbf{0}&=c\mathbf{0}+(-c\mathbf{0})\\
&=c(\mathbf{0}+\mathbf{0})+(-c\mathbf{0})\\
&=(c\mathbf{0}+c\mathbf{0})+(-c\mathbf{0})\\
&=c\mathbf{0}+(c\mathbf{0}+(-c\mathbf{0}))\\
&=c\mathbf{0}+\mathbf{0}\\
&=c\mathbf{0}
\end{aligned}
\]
T13† Prove Property CC of Theorem VSPCV. Write your proof in the style of the proof of Property DSAC given in this section.

T17 Prove Property SMAC of Theorem VSPCV. Write your proof in the style of the proof of Property DSAC given in this section.

T18 Prove Property DVAC of Theorem VSPCV. Write your proof in the style of the proof of Property DSAC given in this section.
Exercises T30, T31 and T32 are about making a careful definition of “vector subtraction”.

T30 Suppose u and v are two vectors in C^m. Define a new operation, called “subtraction,” as the new vector denoted u − v and defined by
\[
[\mathbf{u}-\mathbf{v}]_i = [\mathbf{u}]_i - [\mathbf{v}]_i \qquad 1\le i\le m
\]
Prove that we can express the subtraction of two vectors in terms of our two basic operations. More precisely, prove that u − v = u + (−1)v. So in a sense, subtraction is not something new and different, but is just a convenience. Mimic the style of similar proofs in this section.

T31 Prove, by giving counterexamples, that vector subtraction is not commutative and not associative.

T32 Prove that vector subtraction obeys a distributive property. Specifically, prove that α(u − v) = αu − αv.
Can you give two different proofs? Distinguish your two proofs by using the alternate descriptions of vector subtraction provided by Exercise VO.T30.
Section LC
Linear Combinations

In Section VO we defined vector addition and scalar multiplication. These two operations combine nicely to give us a construction known as a linear combination, a construct that we will work with throughout this course.
Subsection LC
Linear Combinations

Definition LCCV Linear Combination of Column Vectors
Given n vectors u_1, u_2, u_3, ..., u_n from C^m and n scalars α_1, α_2, α_3, ..., α_n, their linear combination is the vector
\[
\alpha_1\mathbf{u}_1+\alpha_2\mathbf{u}_2+\alpha_3\mathbf{u}_3+\cdots+\alpha_n\mathbf{u}_n
\]
□
So this definition takes an equal number of scalars and vectors, combines them using our two new operations (scalar multiplication and vector addition) and creates a single brand-new vector, of the same size as the original vectors. When a definition or theorem employs a linear combination, think about the nature of the objects that go into its creation (lists of scalars and vectors), and the type of object that results (a single vector). Computationally, a linear combination is pretty easy.
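Indeed, the definition translates directly into a short routine. The sketch below is our own (the function name is not from the text); vectors are modeled as plain Python lists, and the data is the set of scalars and vectors used in Example TLC below.

```python
def linear_combination(scalars, vectors):
    """Form a_1*u_1 + a_2*u_2 + ... + a_n*u_n (Definition LCCV).

    Takes an equal number of scalars and same-size vectors and
    returns a single vector of that common size.
    """
    assert len(scalars) == len(vectors)
    m = len(vectors[0])
    result = [0] * m
    for a, u in zip(scalars, vectors):
        for i in range(m):
            result[i] += a * u[i]   # scalar multiple of entry i, then add
    return result

# The data of Example TLC (four vectors from C^6, four scalars):
u1 = [2, 4, -3, 1, 2, 9]
u2 = [6, 3, 0, -2, 1, 4]
u3 = [-5, 2, 1, 1, -3, 0]
u4 = [3, 2, -5, 7, 1, 3]
print(linear_combination([1, -4, 2, -1], [u1, u2, u3, u4]))
# -> [-35, -6, 4, 4, -9, -10]
```

Note how the inputs (a list of scalars and a list of vectors) and the output (one vector) match the shape of the definition exactly.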
Example TLC Two linear combinations in C^6
Suppose that
\[
\alpha_1=1 \qquad \alpha_2=-4 \qquad \alpha_3=2 \qquad \alpha_4=-1
\]
and
\[
\mathbf{u}_1=\begin{bmatrix}2\\4\\-3\\1\\2\\9\end{bmatrix}\qquad
\mathbf{u}_2=\begin{bmatrix}6\\3\\0\\-2\\1\\4\end{bmatrix}\qquad
\mathbf{u}_3=\begin{bmatrix}-5\\2\\1\\1\\-3\\0\end{bmatrix}\qquad
\mathbf{u}_4=\begin{bmatrix}3\\2\\-5\\7\\1\\3\end{bmatrix}
\]
then their linear combination is
\[
\alpha_1\mathbf{u}_1+\alpha_2\mathbf{u}_2+\alpha_3\mathbf{u}_3+\alpha_4\mathbf{u}_4
=(1)\begin{bmatrix}2\\4\\-3\\1\\2\\9\end{bmatrix}
+(-4)\begin{bmatrix}6\\3\\0\\-2\\1\\4\end{bmatrix}
+(2)\begin{bmatrix}-5\\2\\1\\1\\-3\\0\end{bmatrix}
+(-1)\begin{bmatrix}3\\2\\-5\\7\\1\\3\end{bmatrix}
\]
\[
=\begin{bmatrix}2\\4\\-3\\1\\2\\9\end{bmatrix}
+\begin{bmatrix}-24\\-12\\0\\8\\-4\\-16\end{bmatrix}
+\begin{bmatrix}-10\\4\\2\\2\\-6\\0\end{bmatrix}
+\begin{bmatrix}-3\\-2\\5\\-7\\-1\\-3\end{bmatrix}
=\begin{bmatrix}-35\\-6\\4\\4\\-9\\-10\end{bmatrix}
\]
A different linear combination, of the same set of vectors, can be formed with different scalars. Take
\[
\beta_1=3 \qquad \beta_2=0 \qquad \beta_3=5 \qquad \beta_4=-1
\]
and form the linear combination
\[
\beta_1\mathbf{u}_1+\beta_2\mathbf{u}_2+\beta_3\mathbf{u}_3+\beta_4\mathbf{u}_4
=(3)\begin{bmatrix}2\\4\\-3\\1\\2\\9\end{bmatrix}
+(0)\begin{bmatrix}6\\3\\0\\-2\\1\\4\end{bmatrix}
+(5)\begin{bmatrix}-5\\2\\1\\1\\-3\\0\end{bmatrix}
+(-1)\begin{bmatrix}3\\2\\-5\\7\\1\\3\end{bmatrix}
\]
\[
=\begin{bmatrix}6\\12\\-9\\3\\6\\27\end{bmatrix}
+\begin{bmatrix}0\\0\\0\\0\\0\\0\end{bmatrix}
+\begin{bmatrix}-25\\10\\5\\5\\-15\\0\end{bmatrix}
+\begin{bmatrix}-3\\-2\\5\\-7\\-1\\-3\end{bmatrix}
=\begin{bmatrix}-22\\20\\1\\1\\-10\\24\end{bmatrix}
\]
Notice how we could keep our set of vectors fixed, and use different sets of scalars to construct different vectors. You might build a few new linear combinations of u_1, u_2, u_3, u_4 right now. We will be right here when you get back. What vectors were you able to create? Do you think you could create the vector w with a “suitable” choice of four scalars?
\[
\mathbf{w}=\begin{bmatrix}13\\15\\5\\-17\\2\\25\end{bmatrix}
\]
Do you think you could create any possible vector from C^6 by choosing the proper scalars? These last two questions are very fundamental, and time spent considering them now will prove beneficial later. △
Our next two examples are key ones, and a discussion about decompositions is timely. Have a look at Proof Technique DC before studying the next two examples.

Example ABLC Archetype B as a linear combination
In this example we will rewrite Archetype B in the language of vectors, vector equality and linear combinations. In Example VESE we wrote the system of m = 3 equations as the vector equality
\[
\begin{bmatrix}-7x_1-6x_2-12x_3\\ 5x_1+5x_2+7x_3\\ x_1+4x_3\end{bmatrix}
=\begin{bmatrix}-33\\24\\5\end{bmatrix}
\]
Now we will bust up the linear expressions on the left, first using vector addition,
\[
\begin{bmatrix}-7x_1\\5x_1\\x_1\end{bmatrix}
+\begin{bmatrix}-6x_2\\5x_2\\0x_2\end{bmatrix}
+\begin{bmatrix}-12x_3\\7x_3\\4x_3\end{bmatrix}
=\begin{bmatrix}-33\\24\\5\end{bmatrix}
\]
Now we can rewrite each of these vectors as a scalar multiple of a fixed vector, where the scalar is one of the unknown variables, converting the left-hand side into a linear combination
\[
x_1\begin{bmatrix}-7\\5\\1\end{bmatrix}
+x_2\begin{bmatrix}-6\\5\\0\end{bmatrix}
+x_3\begin{bmatrix}-12\\7\\4\end{bmatrix}
=\begin{bmatrix}-33\\24\\5\end{bmatrix}
\]
We can now interpret the problem of solving the system of equations as determining values for the scalar multiples that make the vector equation true. In the analysis of Archetype B, we were able to determine that it had only one solution. A quick way to see this is to row-reduce the coefficient matrix to the 3 × 3 identity matrix and apply Theorem NMRRI to determine that the coefficient matrix is nonsingular. Then Theorem NMUS tells us that the system of equations has a unique solution. This solution is
\[
x_1=-3 \qquad x_2=5 \qquad x_3=2
\]
So, in the context of this example, we can express the fact that these values of the variables are a solution by writing the linear combination,
\[
(-3)\begin{bmatrix}-7\\5\\1\end{bmatrix}
+(5)\begin{bmatrix}-6\\5\\0\end{bmatrix}
+(2)\begin{bmatrix}-12\\7\\4\end{bmatrix}
=\begin{bmatrix}-33\\24\\5\end{bmatrix}
\]
Furthermore, these are the only three scalars that will accomplish this equality, since they come from a unique solution.
Notice how the three vectors in this example are the columns of the coefficient matrix of the system of equations. This is our first hint of the important interplay between the vectors that form the columns of a matrix, and the matrix itself. △
With any discussion of Archetype A or Archetype B we should be sure to contrast with the other.
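The unique solution of Archetype B is easy to confirm numerically. A minimal sketch (our own code, with the columns of the coefficient matrix stored as plain lists):

```python
# Columns of the Archetype B coefficient matrix, the unique solution,
# and the vector of constants, all taken from the example above.
cols = [[-7, 5, 1], [-6, 5, 0], [-12, 7, 4]]
x = [-3, 5, 2]
b = [-33, 24, 5]

# Form (-3)A_1 + (5)A_2 + (2)A_3, entry by entry.
combo = [sum(xj * col[i] for xj, col in zip(x, cols)) for i in range(3)]
assert combo == b   # the combination of columns reproduces the constants
```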
Example AALC Archetype A as a linear combination
As a vector equality, Archetype A can be written as
\[
\begin{bmatrix}x_1-x_2+2x_3\\ 2x_1+x_2+x_3\\ x_1+x_2\end{bmatrix}
=\begin{bmatrix}1\\8\\5\end{bmatrix}
\]
Now bust up the linear expressions on the left, first using vector addition,
\[
\begin{bmatrix}x_1\\2x_1\\x_1\end{bmatrix}
+\begin{bmatrix}-x_2\\x_2\\x_2\end{bmatrix}
+\begin{bmatrix}2x_3\\x_3\\0x_3\end{bmatrix}
=\begin{bmatrix}1\\8\\5\end{bmatrix}
\]
Rewrite each of these vectors as a scalar multiple of a fixed vector, where the scalar is one of the unknown variables, converting the left-hand side into a linear combination
\[
x_1\begin{bmatrix}1\\2\\1\end{bmatrix}
+x_2\begin{bmatrix}-1\\1\\1\end{bmatrix}
+x_3\begin{bmatrix}2\\1\\0\end{bmatrix}
=\begin{bmatrix}1\\8\\5\end{bmatrix}
\]
Row-reducing the augmented matrix for Archetype A leads to the conclusion that the system is consistent and has free variables, hence infinitely many solutions. So for example, the two solutions
\[
x_1=2 \quad x_2=3 \quad x_3=1
\qquad\qquad
x_1=3 \quad x_2=2 \quad x_3=0
\]
can be used together to say that,
\[
(2)\begin{bmatrix}1\\2\\1\end{bmatrix}
+(3)\begin{bmatrix}-1\\1\\1\end{bmatrix}
+(1)\begin{bmatrix}2\\1\\0\end{bmatrix}
=\begin{bmatrix}1\\8\\5\end{bmatrix}
=(3)\begin{bmatrix}1\\2\\1\end{bmatrix}
+(2)\begin{bmatrix}-1\\1\\1\end{bmatrix}
+(0)\begin{bmatrix}2\\1\\0\end{bmatrix}
\]
Ignore the middle of this equation, and move all the terms to the left-hand side,
\[
(2)\begin{bmatrix}1\\2\\1\end{bmatrix}
+(3)\begin{bmatrix}-1\\1\\1\end{bmatrix}
+(1)\begin{bmatrix}2\\1\\0\end{bmatrix}
+(-3)\begin{bmatrix}1\\2\\1\end{bmatrix}
+(-2)\begin{bmatrix}-1\\1\\1\end{bmatrix}
+(-0)\begin{bmatrix}2\\1\\0\end{bmatrix}
=\begin{bmatrix}0\\0\\0\end{bmatrix}
\]
Regrouping gives
\[
(-1)\begin{bmatrix}1\\2\\1\end{bmatrix}
+(1)\begin{bmatrix}-1\\1\\1\end{bmatrix}
+(1)\begin{bmatrix}2\\1\\0\end{bmatrix}
=\begin{bmatrix}0\\0\\0\end{bmatrix}
\]
Notice that these three vectors are the columns of the coefficient matrix for the system of equations in Archetype A. This equality says there is a linear combination of those columns that equals the vector of all zeros. Give it some thought, but this says that
\[
x_1=-1 \qquad x_2=1 \qquad x_3=1
\]
is a nontrivial solution to the homogeneous system of equations with the coefficient matrix for the original system in Archetype A. In particular, this demonstrates that this coefficient matrix is singular. △
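The nontrivial combination found in this example can be confirmed in a couple of lines (our own sketch, columns as plain lists):

```python
# Columns of the Archetype A coefficient matrix, from the example above.
cols = [[1, 2, 1], [-1, 1, 1], [2, 1, 0]]
x = [-1, 1, 1]   # the nontrivial scalars derived above

combo = [sum(xj * col[i] for xj, col in zip(x, cols)) for i in range(3)]
assert combo == [0, 0, 0]   # a nontrivial combination of the columns is zero,
                            # which is what makes the matrix singular
```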
There is a lot going on in the last two examples. Come back to them in a while and make some connections with the intervening material. For now, we will summarize and explain some of this behavior with a theorem.
Theorem SLSLC Solutions to Linear Systems are Linear Combinations
Denote the columns of the m × n matrix A as the vectors A_1, A_2, A_3, ..., A_n. Then x ∈ C^n is a solution to the linear system of equations LS(A, b) if and only if b equals the linear combination of the columns of A formed with the entries of x,
\[
[\mathbf{x}]_1\mathbf{A}_1+[\mathbf{x}]_2\mathbf{A}_2+[\mathbf{x}]_3\mathbf{A}_3+\cdots+[\mathbf{x}]_n\mathbf{A}_n=\mathbf{b}
\]
Proof. The proof of this theorem is as much about a change in notation as it is about making logical deductions. Write the system of equations LS(A, b) as
\[
\begin{aligned}
a_{11}x_1+a_{12}x_2+a_{13}x_3+\cdots+a_{1n}x_n&=b_1\\
a_{21}x_1+a_{22}x_2+a_{23}x_3+\cdots+a_{2n}x_n&=b_2\\
a_{31}x_1+a_{32}x_2+a_{33}x_3+\cdots+a_{3n}x_n&=b_3\\
&\ \ \vdots\\
a_{m1}x_1+a_{m2}x_2+a_{m3}x_3+\cdots+a_{mn}x_n&=b_m
\end{aligned}
\]
Notice then that the entry of the coefficient matrix A in row i and column j has two names: a_{ij} as the coefficient of x_j in equation i of the system, and [A_j]_i as the i-th entry of the column vector in column j of the coefficient matrix A. Likewise, entry i of b has two names: b_i from the linear system and [b]_i as an entry of a vector. Our theorem is an equivalence (Proof Technique E) so we need to prove both “directions.”
(⇐) Suppose we have the vector equality between b and the linear combination of the columns of A. Then for 1 ≤ i ≤ m,
\[
\begin{aligned}
b_i&=[\mathbf{b}]_i && \text{Definition CV}\\
&=[[\mathbf{x}]_1\mathbf{A}_1+[\mathbf{x}]_2\mathbf{A}_2+[\mathbf{x}]_3\mathbf{A}_3+\cdots+[\mathbf{x}]_n\mathbf{A}_n]_i && \text{Hypothesis}\\
&=[[\mathbf{x}]_1\mathbf{A}_1]_i+[[\mathbf{x}]_2\mathbf{A}_2]_i+[[\mathbf{x}]_3\mathbf{A}_3]_i+\cdots+[[\mathbf{x}]_n\mathbf{A}_n]_i && \text{Definition CVA}\\
&=[\mathbf{x}]_1[\mathbf{A}_1]_i+[\mathbf{x}]_2[\mathbf{A}_2]_i+[\mathbf{x}]_3[\mathbf{A}_3]_i+\cdots+[\mathbf{x}]_n[\mathbf{A}_n]_i && \text{Definition CVSM}\\
&=[\mathbf{x}]_1a_{i1}+[\mathbf{x}]_2a_{i2}+[\mathbf{x}]_3a_{i3}+\cdots+[\mathbf{x}]_na_{in} && \text{Definition CV}\\
&=a_{i1}[\mathbf{x}]_1+a_{i2}[\mathbf{x}]_2+a_{i3}[\mathbf{x}]_3+\cdots+a_{in}[\mathbf{x}]_n && \text{Property CMCN}
\end{aligned}
\]
This says that the entries of x form a solution to equation i of LS(A, b) for all 1 ≤ i ≤ m, in other words, x is a solution to LS(A, b).
(⇒) Suppose now that x is a solution to the linear system LS(A, b). Then for all 1 ≤ i ≤ m,
\[
\begin{aligned}
[\mathbf{b}]_i&=b_i && \text{Definition CV}\\
&=a_{i1}[\mathbf{x}]_1+a_{i2}[\mathbf{x}]_2+a_{i3}[\mathbf{x}]_3+\cdots+a_{in}[\mathbf{x}]_n && \text{Hypothesis}\\
&=[\mathbf{x}]_1a_{i1}+[\mathbf{x}]_2a_{i2}+[\mathbf{x}]_3a_{i3}+\cdots+[\mathbf{x}]_na_{in} && \text{Property CMCN}\\
&=[\mathbf{x}]_1[\mathbf{A}_1]_i+[\mathbf{x}]_2[\mathbf{A}_2]_i+[\mathbf{x}]_3[\mathbf{A}_3]_i+\cdots+[\mathbf{x}]_n[\mathbf{A}_n]_i && \text{Definition CV}\\
&=[[\mathbf{x}]_1\mathbf{A}_1]_i+[[\mathbf{x}]_2\mathbf{A}_2]_i+[[\mathbf{x}]_3\mathbf{A}_3]_i+\cdots+[[\mathbf{x}]_n\mathbf{A}_n]_i && \text{Definition CVSM}\\
&=[[\mathbf{x}]_1\mathbf{A}_1+[\mathbf{x}]_2\mathbf{A}_2+[\mathbf{x}]_3\mathbf{A}_3+\cdots+[\mathbf{x}]_n\mathbf{A}_n]_i && \text{Definition CVA}
\end{aligned}
\]
So the entries of the vector b, and the entries of the vector that is the linear combination of the columns of A, agree for all 1 ≤ i ≤ m. By Definition CVE we see that the two vectors are equal, as desired. ∎
In other words, this theorem tells us that solutions to systems of equations are linear combinations of the n column vectors of the coefficient matrix (the A_j) which yield the constant vector b. Or said another way, a solution to a system of equations LS(A, b) is an answer to the question “How can I form the vector b as a linear combination of the columns of A?” Look through the archetypes that are systems of equations and examine a few of the advertised solutions. In each case use the solution to form a linear combination of the columns of the coefficient matrix and verify that the result equals the constant vector (see Exercise LC.C21).
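Theorem SLSLC can be read as saying that the “row view” of a system (evaluate each equation at x) and the “column view” (combine the columns of A with the entries of x) always agree. A small sketch of that agreement, using the Archetype B data from this section (our own code, vectors as plain lists):

```python
A = [[-7, -6, -12],
     [5, 5, 7],
     [1, 0, 4]]
x = [-3, 5, 2]          # the unique solution of Archetype B

# Row view: evaluate each equation of LS(A, b) at x.
row_view = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

# Column view: the linear combination [x]_1 A_1 + ... + [x]_n A_n.
m, n = len(A), len(A[0])
col_view = [sum(x[j] * A[i][j] for j in range(n)) for i in range(m)]

assert row_view == col_view == [-33, 24, 5]
```

The two computations touch the same products a_{ij}[x]_j, just grouped differently, which is exactly the bookkeeping in the proof above.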
Subsection VFSS
Vector Form of Solution Sets

We have written solutions to systems of equations as column vectors. For example Archetype B has the solution x_1 = −3, x_2 = 5, x_3 = 2, which we write as
\[
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}=\begin{bmatrix}-3\\5\\2\end{bmatrix}
\]
Now, we will use column vectors and linear combinations to express all of the solutions to a linear system of equations in a compact and understandable way. First, here are two examples that will motivate our next theorem. This is a valuable technique, almost the equal of row-reducing a matrix, so be sure you get comfortable with it over the course of this section.

Example VFSAD Vector form of solutions for Archetype D
Archetype D is a linear system of 3 equations in 4 variables. Row-reducing the augmented matrix yields
\[
\begin{bmatrix}
1&0&3&-2&4\\
0&1&1&-3&0\\
0&0&0&0&0
\end{bmatrix}
\]
and we see r = 2 pivot columns. Also, D = {1, 2} so the dependent variables are then x_1 and x_2. F = {3, 4, 5} so the two free variables are x_3 and x_4. We will express a generic solution for the system by two slightly different methods, though both arrive at the same conclusion.
First, we will decompose (Proof Technique DC) a solution vector. Rearranging each equation represented in the row-reduced form of the augmented matrix by solving for the dependent variable in each row yields the vector equality,
\[
\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}4-3x_3+2x_4\\-x_3+3x_4\\x_3\\x_4\end{bmatrix}
\]
Now we will use the definitions of column vector addition and scalar multiplication to express this vector as a linear combination,
\[
=\begin{bmatrix}4\\0\\0\\0\end{bmatrix}
+\begin{bmatrix}-3x_3\\-x_3\\x_3\\0\end{bmatrix}
+\begin{bmatrix}2x_4\\3x_4\\0\\x_4\end{bmatrix}
\qquad\text{Definition CVA}
\]
\[
=\begin{bmatrix}4\\0\\0\\0\end{bmatrix}
+x_3\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}
+x_4\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
\qquad\text{Definition CVSM}
\]
We will develop the same linear combination a bit quicker, using three steps. While the method above is instructive, the method below will be our preferred approach.

Step 1. Write the vector of variables as a fixed vector, plus a linear combination of n − r vectors, using the free variables as the scalars.
\[
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}\ \\ \ \\ \ \\ \ \end{bmatrix}
+x_3\begin{bmatrix}\ \\ \ \\ \ \\ \ \end{bmatrix}
+x_4\begin{bmatrix}\ \\ \ \\ \ \\ \ \end{bmatrix}
\]
Step 2. Use 0's and 1's to ensure equality for the entries of the vectors with indices in F (corresponding to the free variables).
\[
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}\ \\ \ \\0\\0\end{bmatrix}
+x_3\begin{bmatrix}\ \\ \ \\1\\0\end{bmatrix}
+x_4\begin{bmatrix}\ \\ \ \\0\\1\end{bmatrix}
\]
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. Convert this equation into entries of the vectors that ensure equality for each dependent variable, one at a time.
\[
x_1=4-3x_3+2x_4 \quad\Rightarrow\quad
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}4\\ \ \\0\\0\end{bmatrix}
+x_3\begin{bmatrix}-3\\ \ \\1\\0\end{bmatrix}
+x_4\begin{bmatrix}2\\ \ \\0\\1\end{bmatrix}
\]
\[
x_2=0-1x_3+3x_4 \quad\Rightarrow\quad
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}4\\0\\0\\0\end{bmatrix}
+x_3\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}
+x_4\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
\]
This final form of a typical solution is especially pleasing and useful. For example, we can build solutions quickly by choosing values for our free variables, and then compute a linear combination. Such as
\[
x_3=2,\ x_4=-5 \quad\Rightarrow\quad
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}4\\0\\0\\0\end{bmatrix}
+(2)\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}
+(-5)\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
=\begin{bmatrix}-12\\-17\\2\\-5\end{bmatrix}
\]
or,
\[
x_3=1,\ x_4=3 \quad\Rightarrow\quad
\mathbf{x}=\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}
=\begin{bmatrix}4\\0\\0\\0\end{bmatrix}
+(1)\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}
+(3)\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
=\begin{bmatrix}7\\8\\1\\3\end{bmatrix}
\]
You will find the second solution listed in the write-up for Archetype D, and you might check the first solution by substituting it back into the original equations.
While this form is useful for quickly creating solutions, it is even better because it tells us exactly what every solution looks like. We know the solution set is infinite, which is pretty big, but now we can say that a solution is some multiple of
\[
\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}
\]
plus a multiple of
\[
\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
\]
plus the fixed vector
\[
\begin{bmatrix}4\\0\\0\\0\end{bmatrix}.
\]
Period. So it only takes us three vectors to describe the entire infinite solution set, provided we also agree on how to combine the three vectors into a linear combination. △
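The vector form of Example VFSAD turns directly into a small solution generator. The sketch below is our own (the function name is not from the text); each generated vector is checked against the two nontrivial equations of the row-reduced system, x_1 + 3x_3 − 2x_4 = 4 and x_2 + x_3 − 3x_4 = 0.

```python
def solution(x3, x4):
    """Return the Archetype D solution determined by the free variables
    x3 and x4, built from the vector form: fixed vector plus multiples."""
    fixed = [4, 0, 0, 0]
    v3 = [-3, -1, 1, 0]
    v4 = [2, 3, 0, 1]
    return [f + x3 * a + x4 * b for f, a, b in zip(fixed, v3, v4)]

for x3, x4 in [(2, -5), (1, 3), (0, 0)]:
    x1, x2, a, b = solution(x3, x4)
    # Equations of the row-reduced augmented matrix:
    assert x1 + 3 * a - 2 * b == 4
    assert x2 + a - 3 * b == 0

print(solution(2, -5))   # -> [-12, -17, 2, -5], matching the example
```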
This is such an important and fundamental technique, we will do another example.

Example VFS Vector form of solutions
Consider a linear system of m = 5 equations in n = 7 variables, having the augmented matrix A.
\[
A=\begin{bmatrix}
2&1&-1&-2&2&1&5&21\\
1&1&-3&1&1&1&2&-5\\
1&2&-8&5&1&1&-6&-15\\
3&3&-9&3&6&5&2&-24\\
-2&-1&1&2&1&1&-9&-30
\end{bmatrix}
\]
Row-reducing we obtain the matrix
\[
B=\begin{bmatrix}
1&0&2&-3&0&0&9&15\\
0&1&-5&4&0&0&-8&-10\\
0&0&0&0&1&0&-6&11\\
0&0&0&0&0&1&7&-21\\
0&0&0&0&0&0&0&0
\end{bmatrix}
\]
and we see r = 4 pivot columns. Also, D = {1, 2, 5, 6} so the dependent variables are then x_1, x_2, x_5, and x_6. F = {3, 4, 7, 8} so the n − r = 3 free variables are x_3, x_4 and x_7. We will express a generic solution for the system by two different methods: both a decomposition and a construction.
First, we will decompose (Proof Technique DC) a solution vector. Rearranging each equation represented in the row-reduced form of the augmented matrix by solving for the dependent variable in each row yields the vector equality,
\[
\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7\end{bmatrix}
=\begin{bmatrix}15-2x_3+3x_4-9x_7\\-10+5x_3-4x_4+8x_7\\x_3\\x_4\\11+6x_7\\-21-7x_7\\x_7\end{bmatrix}
\]
Now we will use the definitions of column vector addition and scalar multiplication to decompose this generic solution vector as a linear combination,
\[
=\begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
+\begin{bmatrix}-2x_3\\5x_3\\x_3\\0\\0\\0\\0\end{bmatrix}
+\begin{bmatrix}3x_4\\-4x_4\\0\\x_4\\0\\0\\0\end{bmatrix}
+\begin{bmatrix}-9x_7\\8x_7\\0\\0\\6x_7\\-7x_7\\x_7\end{bmatrix}
\qquad\text{Definition CVA}
\]
\[
=\begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
+x_3\begin{bmatrix}-2\\5\\1\\0\\0\\0\\0\end{bmatrix}
+x_4\begin{bmatrix}3\\-4\\0\\1\\0\\0\\0\end{bmatrix}
+x_7\begin{bmatrix}-9\\8\\0\\0\\6\\-7\\1\end{bmatrix}
\qquad\text{Definition CVSM}
\]
We will now develop the same linear combination more quickly, using three steps. While the method above is instructive, the method below will be our preferred approach.

Step 1. Write the vector of variables as a fixed vector, plus a linear combination of $n - r$ vectors, using the free variables as the scalars.
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_3\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_4\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_7\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
\]
Step 2. Use 0's and 1's to ensure equality for the entries of the vectors with indices in $F$ (corresponding to the free variables).
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}\phantom{0}\\\phantom{0}\\0\\0\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_3\begin{bmatrix}\phantom{0}\\\phantom{0}\\1\\0\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_4\begin{bmatrix}\phantom{0}\\\phantom{0}\\0\\1\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_7\begin{bmatrix}\phantom{0}\\\phantom{0}\\0\\0\\\phantom{0}\\\phantom{0}\\1\end{bmatrix}
\]
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. Convert this equation into entries of the vectors that ensure equality for each dependent variable, one at a time.
\[
x_1 = 15 - 2x_3 + 3x_4 - 9x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\\phantom{0}\\0\\0\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_3\begin{bmatrix}-2\\\phantom{0}\\1\\0\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_4\begin{bmatrix}3\\\phantom{0}\\0\\1\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_7\begin{bmatrix}-9\\\phantom{0}\\0\\0\\\phantom{0}\\\phantom{0}\\1\end{bmatrix}
\]
\[
x_2 = -10 + 5x_3 - 4x_4 + 8x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_3\begin{bmatrix}-2\\5\\1\\0\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_4\begin{bmatrix}3\\-4\\0\\1\\\phantom{0}\\\phantom{0}\\0\end{bmatrix}
+ x_7\begin{bmatrix}-9\\8\\0\\0\\\phantom{0}\\\phantom{0}\\1\end{bmatrix}
\]
\[
x_5 = 11 + 6x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\11\\\phantom{0}\\0\end{bmatrix}
+ x_3\begin{bmatrix}-2\\5\\1\\0\\0\\\phantom{0}\\0\end{bmatrix}
+ x_4\begin{bmatrix}3\\-4\\0\\1\\0\\\phantom{0}\\0\end{bmatrix}
+ x_7\begin{bmatrix}-9\\8\\0\\0\\6\\\phantom{0}\\1\end{bmatrix}
\]
\[
x_6 = -21 - 7x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
+ x_3\begin{bmatrix}-2\\5\\1\\0\\0\\0\\0\end{bmatrix}
+ x_4\begin{bmatrix}3\\-4\\0\\1\\0\\0\\0\end{bmatrix}
+ x_7\begin{bmatrix}-9\\8\\0\\0\\6\\-7\\1\end{bmatrix}
\]
This final form of a typical solution is especially pleasing and useful. For example, we can build solutions quickly by choosing values for our free variables and then computing a linear combination. For instance,
\[
x_3 = 2,\ x_4 = -4,\ x_7 = 3 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
+ (2)\begin{bmatrix}-2\\5\\1\\0\\0\\0\\0\end{bmatrix}
+ (-4)\begin{bmatrix}3\\-4\\0\\1\\0\\0\\0\end{bmatrix}
+ (3)\begin{bmatrix}-9\\8\\0\\0\\6\\-7\\1\end{bmatrix}
= \begin{bmatrix}-28\\40\\2\\-4\\29\\-42\\3\end{bmatrix}
\]
or perhaps,
\[
x_3 = 5,\ x_4 = 2,\ x_7 = 1 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
+ (5)\begin{bmatrix}-2\\5\\1\\0\\0\\0\\0\end{bmatrix}
+ (2)\begin{bmatrix}3\\-4\\0\\1\\0\\0\\0\end{bmatrix}
+ (1)\begin{bmatrix}-9\\8\\0\\0\\6\\-7\\1\end{bmatrix}
= \begin{bmatrix}2\\15\\5\\2\\17\\-28\\1\end{bmatrix}
\]
or even,
\[
x_3 = 0,\ x_4 = 0,\ x_7 = 0 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
+ (0)\begin{bmatrix}-2\\5\\1\\0\\0\\0\\0\end{bmatrix}
+ (0)\begin{bmatrix}3\\-4\\0\\1\\0\\0\\0\end{bmatrix}
+ (0)\begin{bmatrix}-9\\8\\0\\0\\6\\-7\\1\end{bmatrix}
= \begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}
\]
So we can compactly express all of the solutions to this linear system with just 4 fixed vectors, provided we agree on how to combine them in a linear combination to create solution vectors.
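That claim can be verified symbolically: every linear combination of this form really does solve the original system. A sketch (SymPy again, which is an assumption, not the book's tooling):

```python
from sympy import Matrix, symbols

# Coefficient matrix and constants of the original system
A = Matrix([
    [ 2,  1, -1, -2, 2, 1,  5],
    [ 1,  1, -3,  1, 1, 1,  2],
    [ 1,  2, -8,  5, 1, 1, -6],
    [ 3,  3, -9,  3, 6, 5,  2],
    [-2, -1,  1,  2, 1, 1, -9],
])
b = Matrix([21, -5, -15, -24, -30])

# The four fixed vectors of the compact description
c  = Matrix([15, -10, 0, 0, 11, -21, 0])
u1 = Matrix([-2,   5, 1, 0,  0,   0, 0])
u2 = Matrix([ 3,  -4, 0, 1,  0,   0, 0])
u3 = Matrix([-9,   8, 0, 0,  6,  -7, 1])

a1, a2, a3 = symbols("a1 a2 a3")   # arbitrary values of the free variables
x = c + a1*u1 + a2*u2 + a3*u3
residual = (A*x - b).expand()      # identically the zero vector
print(residual.T)
```

The residual vanishes for every choice of the three scalars, which is exactly the content of the compact description.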
Suppose you were told that the vector $w$ below was a solution to this system of equations. Could you turn the problem around and write $w$ as a linear combination of the four vectors $c$, $u_1$, $u_2$, $u_3$? (See Exercise LC.M11.)
\[
w = \begin{bmatrix}100\\-75\\7\\9\\-37\\35\\-8\end{bmatrix}\qquad
c = \begin{bmatrix}15\\-10\\0\\0\\11\\-21\\0\end{bmatrix}\qquad
u_1 = \begin{bmatrix}-2\\5\\1\\0\\0\\0\\0\end{bmatrix}\qquad
u_2 = \begin{bmatrix}3\\-4\\0\\1\\0\\0\\0\end{bmatrix}\qquad
u_3 = \begin{bmatrix}-9\\8\\0\\0\\6\\-7\\1\end{bmatrix}
\]
Did you think a few weeks ago that you could so quickly and easily list all the solutions to a linear system of 5 equations in 7 variables?

We will now formalize the last two (important) examples as a theorem. The statement of this theorem is a bit scary, and the proof is scarier. For now, be sure to convince yourself, by working through the examples and exercises, that the statement just describes the procedure of the two immediately previous examples.
Theorem VFSLS Vector Form of Solutions to Linear Systems
Suppose that $[A \mid b]$ is the augmented matrix for a consistent linear system $\mathcal{LS}(A, b)$ of $m$ equations in $n$ variables. Let $B$ be a row-equivalent $m \times (n+1)$ matrix in reduced row-echelon form. Suppose that $B$ has $r$ pivot columns, with indices $D = \{d_1, d_2, d_3, \ldots, d_r\}$, while the $n - r$ non-pivot columns have indices in $F = \{f_1, f_2, f_3, \ldots, f_{n-r}, n+1\}$. Define vectors $c$, $u_j$, $1 \le j \le n-r$ of size $n$ by
\[
[c]_i = \begin{cases} 0 & \text{if } i \in F\\ [B]_{k,n+1} & \text{if } i \in D,\ i = d_k \end{cases}
\qquad
[u_j]_i = \begin{cases} 1 & \text{if } i \in F,\ i = f_j\\ 0 & \text{if } i \in F,\ i \ne f_j\\ -[B]_{k,f_j} & \text{if } i \in D,\ i = d_k \end{cases}
\]
Then the set of solutions to the system of equations $\mathcal{LS}(A, b)$ is
\[
S = \left\{\, c + \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_{n-r} u_{n-r} \,\middle|\, \alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{n-r} \in \mathbb{C} \,\right\}
\]
Proof. First, $\mathcal{LS}(A, b)$ is equivalent to the linear system of equations that has the matrix $B$ as its augmented matrix (Theorem REMES), so we need only show that $S$ is the solution set for the system with $B$ as its augmented matrix. The conclusion of this theorem is that the solution set is equal to the set $S$, so we will apply Definition SE.
We begin by showing that every element of $S$ is indeed a solution to the system. Let $\alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{n-r}$ be one choice of the scalars used to describe elements of $S$. So an arbitrary element of $S$, which we will consider as a proposed solution, is
\[
x = c + \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_{n-r} u_{n-r}
\]
When $r + 1 \le l \le m$, row $l$ of the matrix $B$ is a zero row, so the equation represented by that row is always true, no matter which solution vector we propose. So concentrate on rows representing equations $1 \le l \le r$. We evaluate equation $l$ of the system represented by $B$ with the proposed solution vector $x$ and refer to the value of the left-hand side of the equation as $\beta_l$,
\[
\beta_l = [B]_{l1}[x]_1 + [B]_{l2}[x]_2 + [B]_{l3}[x]_3 + \cdots + [B]_{ln}[x]_n
\]
Since $[B]_{l d_i} = 0$ for all $1 \le i \le r$, except that $[B]_{l d_l} = 1$, we see that $\beta_l$ simplifies to
\[
\beta_l = [x]_{d_l} + [B]_{l f_1}[x]_{f_1} + [B]_{l f_2}[x]_{f_2} + [B]_{l f_3}[x]_{f_3} + \cdots + [B]_{l f_{n-r}}[x]_{f_{n-r}}
\]
Notice that for $1 \le i \le n - r$
\[
\begin{aligned}
[x]_{f_i} &= [c]_{f_i} + \alpha_1 [u_1]_{f_i} + \alpha_2 [u_2]_{f_i} + \cdots + \alpha_i [u_i]_{f_i} + \cdots + \alpha_{n-r} [u_{n-r}]_{f_i}\\
&= 0 + \alpha_1(0) + \alpha_2(0) + \cdots + \alpha_i(1) + \cdots + \alpha_{n-r}(0)\\
&= \alpha_i
\end{aligned}
\]
So $\beta_l$ simplifies further, and we expand the first term
\[
\begin{aligned}
\beta_l &= [x]_{d_l} + [B]_{l f_1}\alpha_1 + [B]_{l f_2}\alpha_2 + [B]_{l f_3}\alpha_3 + \cdots + [B]_{l f_{n-r}}\alpha_{n-r}\\
&= [c + \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_{n-r} u_{n-r}]_{d_l}\\
&\qquad + [B]_{l f_1}\alpha_1 + [B]_{l f_2}\alpha_2 + [B]_{l f_3}\alpha_3 + \cdots + [B]_{l f_{n-r}}\alpha_{n-r}\\
&= [c]_{d_l} + \alpha_1 [u_1]_{d_l} + \alpha_2 [u_2]_{d_l} + \alpha_3 [u_3]_{d_l} + \cdots + \alpha_{n-r} [u_{n-r}]_{d_l}\\
&\qquad + [B]_{l f_1}\alpha_1 + [B]_{l f_2}\alpha_2 + [B]_{l f_3}\alpha_3 + \cdots + [B]_{l f_{n-r}}\alpha_{n-r}\\
&= [B]_{l,n+1} + \alpha_1(-[B]_{l f_1}) + \alpha_2(-[B]_{l f_2}) + \alpha_3(-[B]_{l f_3}) + \cdots + \alpha_{n-r}(-[B]_{l f_{n-r}})\\
&\qquad + [B]_{l f_1}\alpha_1 + [B]_{l f_2}\alpha_2 + [B]_{l f_3}\alpha_3 + \cdots + [B]_{l f_{n-r}}\alpha_{n-r}\\
&= [B]_{l,n+1}
\end{aligned}
\]
So $\beta_l$ began as the left-hand side of equation $l$ of the system represented by $B$ and we now know it equals $[B]_{l,n+1}$, the constant term for equation $l$ of this system. So the arbitrarily chosen vector from $S$ makes every equation of the system true, and therefore is a solution to the system. So all the elements of $S$ are solutions to the system.
For the second half of the proof, assume that $x$ is a solution vector for the system having $B$ as its augmented matrix. For convenience and clarity, denote the entries of $x$ by $x_i$; in other words, $x_i = [x]_i$. We desire to show that this solution vector is also an element of the set $S$. Begin with the observation that a solution vector's entries make equation $l$ of the system true for all $1 \le l \le m$,
\[
[B]_{l,1}x_1 + [B]_{l,2}x_2 + [B]_{l,3}x_3 + \cdots + [B]_{l,n}x_n = [B]_{l,n+1}
\]
When $l \le r$, the pivot columns of $B$ have zero entries in row $l$ with the exception of column $d_l$, which will contain a 1. So for $1 \le l \le r$, equation $l$ simplifies to
\[
1x_{d_l} + [B]_{l,f_1}x_{f_1} + [B]_{l,f_2}x_{f_2} + [B]_{l,f_3}x_{f_3} + \cdots + [B]_{l,f_{n-r}}x_{f_{n-r}} = [B]_{l,n+1}
\]
This allows us to write,
\[
\begin{aligned}
[x]_{d_l} = x_{d_l} &= [B]_{l,n+1} - [B]_{l,f_1}x_{f_1} - [B]_{l,f_2}x_{f_2} - [B]_{l,f_3}x_{f_3} - \cdots - [B]_{l,f_{n-r}}x_{f_{n-r}}\\
&= [c]_{d_l} + x_{f_1}[u_1]_{d_l} + x_{f_2}[u_2]_{d_l} + x_{f_3}[u_3]_{d_l} + \cdots + x_{f_{n-r}}[u_{n-r}]_{d_l}\\
&= \left[ c + x_{f_1}u_1 + x_{f_2}u_2 + x_{f_3}u_3 + \cdots + x_{f_{n-r}}u_{n-r} \right]_{d_l}
\end{aligned}
\]
This tells us that the entries of the solution vector $x$ corresponding to dependent variables (indices in $D$) are equal to those of a vector in the set $S$. We still need to check the other entries of the solution vector $x$ corresponding to the free variables (indices in $F$) to see if they are equal to the entries of the same vector in the set $S$.
To this end, suppose $i \in F$ and $i = f_j$. Then
\[
\begin{aligned}
[x]_i = x_i = x_{f_j} &= 0 + 0x_{f_1} + 0x_{f_2} + 0x_{f_3} + \cdots + 0x_{f_{j-1}} + 1x_{f_j} + 0x_{f_{j+1}} + \cdots + 0x_{f_{n-r}}\\
&= [c]_i + x_{f_1}[u_1]_i + x_{f_2}[u_2]_i + x_{f_3}[u_3]_i + \cdots + x_{f_j}[u_j]_i + \cdots + x_{f_{n-r}}[u_{n-r}]_i\\
&= \left[ c + x_{f_1}u_1 + x_{f_2}u_2 + \cdots + x_{f_{n-r}}u_{n-r} \right]_i
\end{aligned}
\]
So the entries of $x$ and $c + x_{f_1}u_1 + x_{f_2}u_2 + \cdots + x_{f_{n-r}}u_{n-r}$ are equal and therefore by Definition CVE they are equal vectors. Since $x_{f_1}, x_{f_2}, x_{f_3}, \ldots, x_{f_{n-r}}$ are scalars, this shows us that $x$ qualifies for membership in $S$. So the set $S$ contains all of the solutions to the system. $\blacksquare$
Note that both halves of the proof of Theorem VFSLS indicate that $\alpha_i = [x]_{f_i}$. In other words, the arbitrary scalars, $\alpha_i$, in the description of the set $S$ actually have more meaning: they are the values of the free variables $[x]_{f_i}$, $1 \le i \le n - r$. So we will often exploit this observation in our descriptions of solution sets.

Theorem VFSLS formalizes what happened in the three steps of Example VFSAD. The theorem will be useful in proving other theorems, and it is useful since it tells us an exact procedure for simply describing an infinite solution set. We could program a computer to implement it, once we have the augmented matrix row-reduced and have checked that the system is consistent. By Knuth's definition, this completes our conversion of linear equation solving from art into science. Notice that it even applies (but is overkill) in the case of a unique solution. However, as a practical matter, I prefer the three-step process of Example VFSAD when I need to describe an infinite solution set. So let us practice some more, but with a bigger example.
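The remark that we could program a computer to implement Theorem VFSLS can be made concrete. Here is a sketch in SymPy (the function name and 0-based indexing are my own choices, not the book's); it consumes the augmented matrix of a system assumed consistent, row-reduces it, and builds the vectors $c$ and $u_j$ exactly as the theorem prescribes:

```python
from sympy import Matrix

def vector_form_of_solutions(aug):
    """Return (c, [u_1, ..., u_{n-r}]) as in Theorem VFSLS.

    `aug` is the augmented matrix of a consistent system. Indices here
    are 0-based, while the theorem's sets D and F are 1-based."""
    B, pivots = aug.rref()
    n = aug.cols - 1                            # number of variables
    D = list(pivots)                            # dependent (pivot) indices
    F = [j for j in range(n) if j not in D]     # free indices, excluding column n
    c = Matrix.zeros(n, 1)
    for k, d in enumerate(D):
        c[d] = B[k, n]                          # [c]_{d_k} = [B]_{k, n+1}
    us = []
    for f in F:
        u = Matrix.zeros(n, 1)
        u[f] = 1                                # 1 in its own free slot
        for k, d in enumerate(D):
            u[d] = -B[k, f]                     # -[B]_{k, f_j} in dependent slots
        us.append(u)
    return c, us

# The 5-equation, 7-variable system from the example above
aug = Matrix([
    [ 2,  1, -1, -2, 2, 1,  5,  21],
    [ 1,  1, -3,  1, 1, 1,  2,  -5],
    [ 1,  2, -8,  5, 1, 1, -6, -15],
    [ 3,  3, -9,  3, 6, 5,  2, -24],
    [-2, -1,  1,  2, 1, 1, -9, -30],
])
c, us = vector_form_of_solutions(aug)
print(list(c), [list(u) for u in us])
```

Run on the earlier example, it reproduces the fixed vector and the three free-variable vectors found by the three-step process.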
Example VFSAI Vector form of solutions for Archetype I
Archetype I is a linear system of $m = 4$ equations in $n = 7$ variables. Row-reducing the augmented matrix yields
\[
\begin{bmatrix}
1 & 4 & 0 & 0 & 2 & 1 & -3 & 4\\
0 & 0 & 1 & 0 & 1 & -3 & 5 & 2\\
0 & 0 & 0 & 1 & 2 & -6 & 6 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
and we see $r = 3$ pivot columns, with indices $D = \{1, 3, 4\}$. So the $r = 3$ dependent variables are $x_1, x_3, x_4$. The non-pivot columns have indices in $F = \{2, 5, 6, 7, 8\}$, so the $n - r = 4$ free variables are $x_2, x_5, x_6, x_7$.
Step 1. Write the vector of variables ($x$) as a fixed vector ($c$), plus a linear combination of $n - r = 4$ vectors ($u_1, u_2, u_3, u_4$), using the free variables as the scalars.
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_2\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_5\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_6\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_7\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
\]
Step 2. For each free variable, use 0's and 1's to ensure equality for the corresponding entry of the vectors. Take note of the pattern of 0's and 1's at this stage, because this is the best look you will have at it. We will state an important theorem in the next section and the proof will essentially rely on this observation.
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}\phantom{0}\\0\\\phantom{0}\\\phantom{0}\\0\\0\\0\end{bmatrix}
+ x_2\begin{bmatrix}\phantom{0}\\1\\\phantom{0}\\\phantom{0}\\0\\0\\0\end{bmatrix}
+ x_5\begin{bmatrix}\phantom{0}\\0\\\phantom{0}\\\phantom{0}\\1\\0\\0\end{bmatrix}
+ x_6\begin{bmatrix}\phantom{0}\\0\\\phantom{0}\\\phantom{0}\\0\\1\\0\end{bmatrix}
+ x_7\begin{bmatrix}\phantom{0}\\0\\\phantom{0}\\\phantom{0}\\0\\0\\1\end{bmatrix}
\]
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. Convert this equation into entries of the vectors that ensure equality for each dependent variable, one at a time.
\[
x_1 = 4 - 4x_2 - 2x_5 - 1x_6 + 3x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}4\\0\\\phantom{0}\\\phantom{0}\\0\\0\\0\end{bmatrix}
+ x_2\begin{bmatrix}-4\\1\\\phantom{0}\\\phantom{0}\\0\\0\\0\end{bmatrix}
+ x_5\begin{bmatrix}-2\\0\\\phantom{0}\\\phantom{0}\\1\\0\\0\end{bmatrix}
+ x_6\begin{bmatrix}-1\\0\\\phantom{0}\\\phantom{0}\\0\\1\\0\end{bmatrix}
+ x_7\begin{bmatrix}3\\0\\\phantom{0}\\\phantom{0}\\0\\0\\1\end{bmatrix}
\]
\[
x_3 = 2 + 0x_2 - x_5 + 3x_6 - 5x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}4\\0\\2\\\phantom{0}\\0\\0\\0\end{bmatrix}
+ x_2\begin{bmatrix}-4\\1\\0\\\phantom{0}\\0\\0\\0\end{bmatrix}
+ x_5\begin{bmatrix}-2\\0\\-1\\\phantom{0}\\1\\0\\0\end{bmatrix}
+ x_6\begin{bmatrix}-1\\0\\3\\\phantom{0}\\0\\1\\0\end{bmatrix}
+ x_7\begin{bmatrix}3\\0\\-5\\\phantom{0}\\0\\0\\1\end{bmatrix}
\]
\[
x_4 = 1 + 0x_2 - 2x_5 + 6x_6 - 6x_7 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5\\x_6\\x_7 \end{bmatrix}
= \begin{bmatrix}4\\0\\2\\1\\0\\0\\0\end{bmatrix}
+ x_2\begin{bmatrix}-4\\1\\0\\0\\0\\0\\0\end{bmatrix}
+ x_5\begin{bmatrix}-2\\0\\-1\\-2\\1\\0\\0\end{bmatrix}
+ x_6\begin{bmatrix}-1\\0\\3\\6\\0\\1\\0\end{bmatrix}
+ x_7\begin{bmatrix}3\\0\\-5\\-6\\0\\0\\1\end{bmatrix}
\]
We can now use this final expression to quickly build solutions to the system. You might try to recreate each of the solutions listed in the write-up for Archetype I. (Hint: look at the values of the free variables in each solution, and notice that the vector $c$ has 0's in these locations.)

Even better, we have a description of the infinite solution set, based on just 5 vectors, which we combine in linear combinations to produce solutions.

Whenever we discuss Archetype I you know that is your cue to go work through Archetype J by yourself. Remember to take note of the 0/1 pattern at the conclusion of Step 2. Have fun; we won't go anywhere while you're away.
△
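As a check on the final expression for Archetype I, one can confirm that $c$ solves the row-reduced system while each $u_i$ solves the associated homogeneous system. A sketch (SymPy once more, as an assumption):

```python
from sympy import Matrix

# Coefficient part and constants of the row-reduced Archetype I system
# (the all-zero fourth row is omitted, as it is satisfied automatically)
R = Matrix([
    [1, 4, 0, 0, 2,  1, -3],
    [0, 0, 1, 0, 1, -3,  5],
    [0, 0, 0, 1, 2, -6,  6],
])
rhs = Matrix([4, 2, 1])

c  = Matrix([ 4, 0,  2,  1, 0, 0, 0])
u1 = Matrix([-4, 1,  0,  0, 0, 0, 0])
u2 = Matrix([-2, 0, -1, -2, 1, 0, 0])
u3 = Matrix([-1, 0,  3,  6, 0, 1, 0])
u4 = Matrix([ 3, 0, -5, -6, 0, 0, 1])

assert R*c == rhs                        # c is a particular solution
for u in (u1, u2, u3, u4):
    assert R*u == Matrix.zeros(3, 1)     # each u_i solves the homogeneous system
print("vector form of Archetype I checks out")
```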
This technique is so important that we will do one more example. However, an important distinction will be that this system is homogeneous.
Example VFSAL Vector form of solutions for Archetype L
Archetype L is presented simply as the $5 \times 5$ matrix
\[
L = \begin{bmatrix}
-2 & -1 & -2 & -4 & 4\\
-6 & -5 & -4 & -4 & 6\\
10 & 7 & 7 & 10 & -13\\
-7 & -5 & -6 & -9 & 10\\
-4 & -3 & -4 & -6 & 6
\end{bmatrix}
\]
We will employ this matrix here as the coefficient matrix of a homogeneous system and reference this matrix as $L$. So we are solving the homogeneous system $\mathcal{LS}(L, 0)$ having $m = 5$ equations in $n = 5$ variables. If we built the augmented matrix, we would add a sixth column to $L$ containing all zeros. As we did row operations, this sixth column would remain all zeros. So instead we will row-reduce the coefficient matrix, and mentally remember the missing sixth column of zeros. This row-reduced matrix is
\[
\begin{bmatrix}
1 & 0 & 0 & 1 & -2\\
0 & 1 & 0 & -2 & 2\\
0 & 0 & 1 & 2 & -1\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
and we see $r = 3$ pivot columns, with indices $D = \{1, 2, 3\}$. So the $r = 3$ dependent variables are $x_1, x_2, x_3$. The non-pivot columns have indices $F = \{4, 5\}$, so the $n - r = 2$ free variables are $x_4, x_5$. Notice that if we had included the all-zero vector of constants to form the augmented matrix for the system, then the index 6 would have appeared in the set $F$, and subsequently would have been ignored when listing the free variables. So nothing is lost by not creating an augmented matrix (in the case of a homogeneous system). And maybe it is an improvement, since now every index in $F$ can be used to reference a variable of the linear system.
Step 1. Write the vector of variables ($x$) as a fixed vector ($c$), plus a linear combination of $n - r = 2$ vectors ($u_1, u_2$), using the free variables as the scalars.
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5 \end{bmatrix}
= \begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_4\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
+ x_5\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\\\phantom{0}\end{bmatrix}
\]
Step 2. For each free variable, use 0's and 1's to ensure equality for the corresponding entry of the vectors. Take note of the pattern of 0's and 1's at this stage, even if it is not as illuminating as in other examples.
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5 \end{bmatrix}
= \begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\0\\0\end{bmatrix}
+ x_4\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\1\\0\end{bmatrix}
+ x_5\begin{bmatrix}\phantom{0}\\\phantom{0}\\\phantom{0}\\0\\1\end{bmatrix}
\]
Step 3. For each dependent variable, use the augmented matrix to formulate an equation expressing the dependent variable as a constant plus multiples of the free variables. Do not forget about the "missing" sixth column being full of zeros. Convert this equation into entries of the vectors that ensure equality for each dependent variable, one at a time.
\[
x_1 = 0 - 1x_4 + 2x_5 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5 \end{bmatrix}
= \begin{bmatrix}0\\\phantom{0}\\\phantom{0}\\0\\0\end{bmatrix}
+ x_4\begin{bmatrix}-1\\\phantom{0}\\\phantom{0}\\1\\0\end{bmatrix}
+ x_5\begin{bmatrix}2\\\phantom{0}\\\phantom{0}\\0\\1\end{bmatrix}
\]
\[
x_2 = 0 + 2x_4 - 2x_5 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5 \end{bmatrix}
= \begin{bmatrix}0\\0\\\phantom{0}\\0\\0\end{bmatrix}
+ x_4\begin{bmatrix}-1\\2\\\phantom{0}\\1\\0\end{bmatrix}
+ x_5\begin{bmatrix}2\\-2\\\phantom{0}\\0\\1\end{bmatrix}
\]
\[
x_3 = 0 - 2x_4 + 1x_5 \quad\Rightarrow\quad
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5 \end{bmatrix}
= \begin{bmatrix}0\\0\\0\\0\\0\end{bmatrix}
+ x_4\begin{bmatrix}-1\\2\\-2\\1\\0\end{bmatrix}
+ x_5\begin{bmatrix}2\\-2\\1\\0\\1\end{bmatrix}
\]
The vector $c$ will always have 0's in the entries corresponding to free variables. However, since we are solving a homogeneous system, the row-reduced augmented matrix has zeros in column $n + 1 = 6$, and hence all the entries of $c$ are zero. So we can write
\[
x = \begin{bmatrix} x_1\\x_2\\x_3\\x_4\\x_5 \end{bmatrix}
= 0 + x_4\begin{bmatrix}-1\\2\\-2\\1\\0\end{bmatrix}
+ x_5\begin{bmatrix}2\\-2\\1\\0\\1\end{bmatrix}
= x_4\begin{bmatrix}-1\\2\\-2\\1\\0\end{bmatrix}
+ x_5\begin{bmatrix}2\\-2\\1\\0\\1\end{bmatrix}
\]
It will always happen that the solutions to a homogeneous system have $c = 0$ (even in the case of a unique solution?). So our expression for the solutions is a bit more pleasing. In this example it says that the solutions are all possible linear combinations of the two vectors
\[
u_1 = \begin{bmatrix}-1\\2\\-2\\1\\0\end{bmatrix}
\qquad\text{and}\qquad
u_2 = \begin{bmatrix}2\\-2\\1\\0\\1\end{bmatrix}
\]
with no mention of any fixed vector entering into the linear combination.

This observation will motivate our next section and the main definition of that section, and after that we will conclude the section by formalizing this situation. △
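Since every solution of Archetype L is a combination of $u_1$ and $u_2$, these two vectors form a basis for the null space, which a computer algebra system can produce directly. A sketch (SymPy's `nullspace` happens to follow the same free-variable construction, though that is an implementation detail, not something the book relies on):

```python
from sympy import Matrix

L = Matrix([
    [-2, -1, -2, -4,   4],
    [-6, -5, -4, -4,   6],
    [10,  7,  7, 10, -13],
    [-7, -5, -6, -9,  10],
    [-4, -3, -4, -6,   6],
])

basis = L.nullspace()                  # basis vectors for N(L)
assert len(basis) == 2                 # n - r = 2 free variables
for v in basis:
    assert L*v == Matrix.zeros(5, 1)   # each basis vector solves LS(L, 0)
print([list(v) for v in basis])
```

The vectors $u_1$ and $u_2$ from the example each satisfy $Lu = 0$, so they lie in the span just computed.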
Subsection PSHS
Particular Solutions, Homogeneous Solutions

The next theorem tells us that in order to find all of the solutions to a linear system of equations, it is sufficient to find just one solution, and then find all of the solutions to the corresponding homogeneous system. This explains part of our interest in the null space, the set of all solutions to a homogeneous system.

Theorem PSPHS Particular Solution Plus Homogeneous Solutions
Suppose that $w$ is one solution to the linear system of equations $\mathcal{LS}(A, b)$. Then $y$ is a solution to $\mathcal{LS}(A, b)$ if and only if $y = w + z$ for some vector $z \in \mathcal{N}(A)$.
Proof. Let $A_1, A_2, A_3, \ldots, A_n$ be the columns of the coefficient matrix $A$.

($\Leftarrow$) Suppose $y = w + z$ and $z \in \mathcal{N}(A)$. Then
\[
\begin{aligned}
b &= [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n &&\text{Theorem SLSLC}\\
&= [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n + 0 &&\text{Property ZC}\\
&= [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n &&\text{Theorem SLSLC}\\
&\qquad + [z]_1 A_1 + [z]_2 A_2 + [z]_3 A_3 + \cdots + [z]_n A_n\\
&= ([w]_1 + [z]_1) A_1 + ([w]_2 + [z]_2) A_2 &&\text{Theorem VSPCV}\\
&\qquad + ([w]_3 + [z]_3) A_3 + \cdots + ([w]_n + [z]_n) A_n\\
&= [w + z]_1 A_1 + [w + z]_2 A_2 + \cdots + [w + z]_n A_n &&\text{Definition CVA}\\
&= [y]_1 A_1 + [y]_2 A_2 + [y]_3 A_3 + \cdots + [y]_n A_n &&\text{Definition of } y
\end{aligned}
\]
Applying Theorem SLSLC we see that the vector $y$ is a solution to $\mathcal{LS}(A, b)$.
($\Rightarrow$) Suppose $y$ is a solution to $\mathcal{LS}(A, b)$. Then
\[
\begin{aligned}
0 = b - b &= [y]_1 A_1 + [y]_2 A_2 + [y]_3 A_3 + \cdots + [y]_n A_n &&\text{Theorem SLSLC}\\
&\qquad - \left( [w]_1 A_1 + [w]_2 A_2 + [w]_3 A_3 + \cdots + [w]_n A_n \right)\\
&= ([y]_1 - [w]_1) A_1 + ([y]_2 - [w]_2) A_2 &&\text{Theorem VSPCV}\\
&\qquad + ([y]_3 - [w]_3) A_3 + \cdots + ([y]_n - [w]_n) A_n\\
&= [y - w]_1 A_1 + [y - w]_2 A_2 &&\text{Definition CVA}\\
&\qquad + [y - w]_3 A_3 + \cdots + [y - w]_n A_n
\end{aligned}
\]
By Theorem SLSLC we see that the vector $y - w$ is a solution to the homogeneous system $\mathcal{LS}(A, 0)$ and by Definition NSM, $y - w \in \mathcal{N}(A)$. In other words, $y - w = z$ for some vector $z \in \mathcal{N}(A)$. Rewritten, this is $y = w + z$, as desired. $\blacksquare$
After proving Theorem NMUS we commented (insufficiently) on the negation of one half of the theorem. Nonsingular coefficient matrices lead to unique solutions for every choice of the vector of constants. What does this say about singular matrices? A singular matrix $A$ has a nontrivial null space (Theorem NMTNS). For a given vector of constants, $b$, the system $\mathcal{LS}(A, b)$ could be inconsistent, meaning there are no solutions. But if there is at least one solution ($w$), then Theorem PSPHS tells us there will be infinitely many solutions because of the role of the infinite null space for a singular matrix. So a system of equations with a singular coefficient matrix never has a unique solution. Notice that this is the contrapositive of the statement in Exercise NM.T31. With a singular coefficient matrix, either there are no solutions, or infinitely many solutions, depending on the choice of the vector of constants ($b$).
Example PSHS Particular solutions, homogeneous solutions, Archetype D
Archetype D is a consistent system of equations with a nontrivial null space. Let $A$ denote the coefficient matrix of this system. The write-up for this system begins with three solutions,
\[
y_1 = \begin{bmatrix}0\\1\\2\\1\end{bmatrix}\qquad
y_2 = \begin{bmatrix}4\\0\\0\\0\end{bmatrix}\qquad
y_3 = \begin{bmatrix}7\\8\\1\\3\end{bmatrix}
\]
We will choose to have $y_1$ play the role of $w$ in the statement of Theorem PSPHS; any one of the three vectors listed here (or others) could have been chosen. To illustrate the theorem, we should be able to write each of these three solutions as the vector $w$ plus a solution to the corresponding homogeneous system of equations. Since $0$ is always a solution to a homogeneous system we can easily write $y_1 = w = w + 0$.
The vectors $y_2$ and $y_3$ will require a bit more effort. Solutions to the homogeneous system $\mathcal{LS}(A, 0)$ are exactly the elements of the null space of the coefficient matrix, which by an application of Theorem VFSLS is
\[
\mathcal{N}(A) = \left\{\, x_3\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix} + x_4\begin{bmatrix}2\\3\\0\\1\end{bmatrix} \,\middle|\, x_3, x_4 \in \mathbb{C} \,\right\}
\]
Then
\[
y_2 = \begin{bmatrix}4\\0\\0\\0\end{bmatrix}
= \begin{bmatrix}0\\1\\2\\1\end{bmatrix} + \begin{bmatrix}4\\-1\\-2\\-1\end{bmatrix}
= \begin{bmatrix}0\\1\\2\\1\end{bmatrix}
+ \left( (-2)\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix} + (-1)\begin{bmatrix}2\\3\\0\\1\end{bmatrix} \right)
= w + z_2
\]
where
\[
z_2 = \begin{bmatrix}4\\-1\\-2\\-1\end{bmatrix}
= (-2)\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix} + (-1)\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
\]
is obviously a solution of the homogeneous system since it is written as a linear combination of the vectors describing the null space of the coefficient matrix (or as a check, you could just evaluate the equations in the homogeneous system with $z_2$).
Again,
\[
y_3 = \begin{bmatrix}7\\8\\1\\3\end{bmatrix}
= \begin{bmatrix}0\\1\\2\\1\end{bmatrix} + \begin{bmatrix}7\\7\\-1\\2\end{bmatrix}
= \begin{bmatrix}0\\1\\2\\1\end{bmatrix}
+ \left( (-1)\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix} + 2\begin{bmatrix}2\\3\\0\\1\end{bmatrix} \right)
= w + z_3
\]
where
\[
z_3 = \begin{bmatrix}7\\7\\-1\\2\end{bmatrix}
= (-1)\begin{bmatrix}-3\\-1\\1\\0\end{bmatrix} + 2\begin{bmatrix}2\\3\\0\\1\end{bmatrix}
\]
is obviously a solution of the homogeneous system since it is written as a linear combination of the vectors describing the null space of the coefficient matrix (or as a check, you could just evaluate the equations in the homogeneous system with $z_3$).
Here is another view of this theorem, in the context of this example. Grab two new solutions of the original system of equations, say
\[
\mathbf{y}_4 = \begin{bmatrix}11\\0\\-3\\-1\end{bmatrix} \qquad \mathbf{y}_5 = \begin{bmatrix}-4\\2\\4\\2\end{bmatrix}
\]
and form their difference,
\[
\mathbf{u} = \begin{bmatrix}11\\0\\-3\\-1\end{bmatrix} - \begin{bmatrix}-4\\2\\4\\2\end{bmatrix} = \begin{bmatrix}15\\-2\\-7\\-3\end{bmatrix}.
\]
It is no accident that $\mathbf{u}$ is a solution to the homogeneous system (check this!). In other words, the difference between any two solutions to a linear system of equations is an element of the null space of the coefficient matrix. This is an equivalent way to state Theorem PSPHS. (See Exercise MM.T50.)
△<br />
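The observation above, that the difference of any two solutions to $\mathcal{LS}(A,\mathbf{b})$ lies in the null space of $A$, is easy to check numerically. The sketch below uses a small hypothetical system (not the coefficient matrix of the example, which appears earlier in the text), chosen only to illustrate the mechanics.

```python
def mat_vec(A, x):
    """Multiply matrix A (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# A hypothetical consistent system LS(A, b) with two known solutions.
A = [[1, 2, 1], [2, 5, 0]]
b = [4, 7]
y1 = [1, 1, 1]
y2 = [-4, 3, 2]

u = [p - q for p, q in zip(y1, y2)]   # difference of the two solutions
assert mat_vec(A, y1) == b
assert mat_vec(A, y2) == b
assert mat_vec(A, u) == [0, 0]        # u solves the homogeneous system LS(A, 0)
```

The final assertion is exactly Theorem PSPHS read backwards: subtracting solutions lands you in $\mathcal{N}(A)$.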
The ideas of this subsection will appear again in Chapter LT when we discuss pre-images of linear transformations (Definition PI).
Reading Questions

1. Earlier, a reading question asked you to solve the system of equations
\begin{align*}
2x_1 + 3x_2 - x_3 &= 0\\
x_1 + 2x_2 + x_3 &= 3\\
x_1 + 3x_2 + 3x_3 &= 7
\end{align*}
Use a linear combination to rewrite this system of equations as a vector equality.
2. Find a linear combination of the vectors
\[
S = \left\{ \begin{bmatrix}1\\3\\-1\end{bmatrix},\ \begin{bmatrix}2\\0\\4\end{bmatrix},\ \begin{bmatrix}-1\\3\\-5\end{bmatrix} \right\}
\]
that equals the vector $\begin{bmatrix}1\\-9\\11\end{bmatrix}$.
3. The matrix below is the augmented matrix of a system of equations, row-reduced to reduced row-echelon form. Write the vector form of the solutions to the system.
\[
\begin{bmatrix}
1 & 3 & 0 & 6 & 0 & 9\\
0 & 0 & 1 & -2 & 0 & -8\\
0 & 0 & 0 & 0 & 1 & 3
\end{bmatrix}
\]
Exercises

C21† Consider each archetype that is a system of equations. For individual solutions listed (both for the original system and the corresponding homogeneous system) express the vector of constants as a linear combination of the columns of the coefficient matrix, as guaranteed by Theorem SLSLC. Verify this equality by computing the linear combination. For systems with no solutions, recognize that it is then impossible to write the vector of constants as a linear combination of the columns of the coefficient matrix. Note too, for homogeneous systems, that the solutions give rise to linear combinations that equal the zero vector.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
C22† Consider each archetype that is a system of equations. Write elements of the solution set in vector form, as guaranteed by Theorem VFSLS.
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
C40† Find the vector form of the solutions to the system of equations below.
\begin{align*}
2x_1 - 4x_2 + 3x_3 + x_5 &= 6\\
x_1 - 2x_2 - 2x_3 + 14x_4 - 4x_5 &= 15\\
x_1 - 2x_2 + x_3 + 2x_4 + x_5 &= -1\\
-2x_1 + 4x_2 - 12x_4 + x_5 &= -7
\end{align*}
C41† Find the vector form of the solutions to the system of equations below.
\begin{align*}
-2x_1 - 1x_2 - 8x_3 + 8x_4 + 4x_5 - 9x_6 - 1x_7 - 1x_8 - 18x_9 &= 3\\
3x_1 - 2x_2 + 5x_3 + 2x_4 - 2x_5 - 5x_6 + 1x_7 + 2x_8 + 15x_9 &= 10\\
4x_1 - 2x_2 + 8x_3 + 2x_5 - 14x_6 - 2x_8 + 2x_9 &= 36\\
-1x_1 + 2x_2 + 1x_3 - 6x_4 + 7x_6 - 1x_7 - 3x_9 &= -8\\
3x_1 + 2x_2 + 13x_3 - 14x_4 - 1x_5 + 5x_6 - 1x_8 + 12x_9 &= 15\\
-2x_1 + 2x_2 - 2x_3 - 4x_4 + 1x_5 + 6x_6 - 2x_7 - 2x_8 - 15x_9 &= -7
\end{align*}
M10† Example TLC asks if the vector
\[
\mathbf{w} = \begin{bmatrix}13\\15\\5\\-17\\2\\25\end{bmatrix}
\]
can be written as a linear combination of the four vectors
\[
\mathbf{u}_1 = \begin{bmatrix}2\\4\\-3\\1\\2\\9\end{bmatrix} \qquad
\mathbf{u}_2 = \begin{bmatrix}6\\3\\0\\-2\\1\\4\end{bmatrix} \qquad
\mathbf{u}_3 = \begin{bmatrix}-5\\2\\1\\1\\-3\\0\end{bmatrix} \qquad
\mathbf{u}_4 = \begin{bmatrix}3\\2\\-5\\7\\1\\3\end{bmatrix}
\]
Can it? Can any vector in $\mathbb{C}^6$ be written as a linear combination of the four vectors $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4$?
M11† At the end of Example VFS, the vector $\mathbf{w}$ is claimed to be a solution to the linear system under discussion. Verify that $\mathbf{w}$ really is a solution. Then determine the four scalars that express $\mathbf{w}$ as a linear combination of $\mathbf{c}, \mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$.
Section SS
Spanning Sets

In this section we will provide an extremely compact way to describe an infinite set of vectors, making use of linear combinations. This will give us a convenient way to describe the solution set of a linear system, the null space of a matrix, and many other sets of vectors.

Subsection SSV
Span of a Set of Vectors

In Example VFSAL we saw the solution set of a homogeneous system described as all possible linear combinations of two particular vectors. This is a useful way to construct or describe infinite sets of vectors, so we encapsulate the idea in a definition.
Definition SSCV  Span of a Set of Column Vectors
Given a set of vectors $S = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_p\}$, their span, $\langle S\rangle$, is the set of all possible linear combinations of $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_p$. Symbolically,
\begin{align*}
\langle S\rangle &= \{\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 + \cdots + \alpha_p\mathbf{u}_p \mid \alpha_i \in \mathbb{C},\ 1 \le i \le p\}\\
&= \left\{ \sum_{i=1}^{p} \alpha_i\mathbf{u}_i \,\middle|\, \alpha_i \in \mathbb{C},\ 1 \le i \le p \right\}
\end{align*}
□
The span is just a set of vectors, though in all but one situation it is an infinite set. (Just when is it not infinite?) So we start with a finite collection of vectors $S$ ($p$ of them to be precise), and use this finite set to describe an infinite set of vectors, $\langle S\rangle$. Confusing the finite set $S$ with the infinite set $\langle S\rangle$ is one of the most persistent problems in understanding introductory linear algebra. We will see this construction repeatedly, so let us work through some examples to get comfortable with it. The most obvious question about a set is if a particular item of the correct type is in the set, or not in the set.
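Definition SSCV is computational at heart: every element of $\langle S\rangle$ arises from one choice of the scalars $\alpha_1, \ldots, \alpha_p$. A minimal sketch in Python, using small hypothetical vectors (any vectors of a common size would do):

```python
def lin_comb(scalars, vectors):
    """Return alpha_1*u_1 + ... + alpha_p*u_p, with each u_i stored as a list."""
    n = len(vectors[0])
    return [sum(a * v[i] for a, v in zip(scalars, vectors)) for i in range(n)]

# Two hypothetical vectors from C^3; every choice of scalars lands in their span.
vecs = [[1, 2, 1], [-1, 1, 1]]
one_element = lin_comb([5, -3], vecs)   # an element of the span
zero_vector = lin_comb([0, 0], vecs)    # the zero vector is always in a span
assert zero_vector == [0, 0, 0]
```

Choosing all scalars zero always yields the zero vector, which partly answers the aside above: the span fails to be infinite exactly when it contains only the zero vector.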
Example ABS  A basic span
Consider the set of 5 vectors, $S$, from $\mathbb{C}^4$,
\[
S = \left\{ \begin{bmatrix}1\\1\\3\\1\end{bmatrix},\ \begin{bmatrix}2\\1\\2\\-1\end{bmatrix},\ \begin{bmatrix}7\\3\\5\\-5\end{bmatrix},\ \begin{bmatrix}1\\1\\-1\\2\end{bmatrix},\ \begin{bmatrix}-1\\0\\9\\0\end{bmatrix} \right\}
\]
and consider the infinite set of vectors $\langle S\rangle$ formed from all possible linear combinations of the elements of $S$. Here are four vectors we definitely know are elements of $\langle S\rangle$,
since we will construct them in accordance with Definition SSCV,
\begin{align*}
\mathbf{w} &= (2)\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + (1)\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + (-1)\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + (2)\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + (3)\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \begin{bmatrix}-4\\2\\28\\10\end{bmatrix}\\
\mathbf{x} &= (5)\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + (-6)\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + (-3)\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + (4)\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + (2)\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \begin{bmatrix}-26\\-6\\2\\34\end{bmatrix}\\
\mathbf{y} &= (1)\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + (0)\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + (1)\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + (0)\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + (1)\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \begin{bmatrix}7\\4\\17\\-4\end{bmatrix}\\
\mathbf{z} &= (0)\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + (0)\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + (0)\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + (0)\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + (0)\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \begin{bmatrix}0\\0\\0\\0\end{bmatrix}
\end{align*}
The purpose of a set is to collect objects with some common property, and to exclude objects without that property. So the most fundamental question about a set is if a given object is an element of the set or not. Let us learn more about $\langle S\rangle$ by investigating which vectors are elements of the set, and which are not.

First, is $\mathbf{u} = \begin{bmatrix}-15\\-6\\19\\5\end{bmatrix}$ an element of $\langle S\rangle$? We are asking if there are scalars $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$ such that
\[
\alpha_1\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + \alpha_2\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + \alpha_3\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + \alpha_4\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + \alpha_5\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \mathbf{u} = \begin{bmatrix}-15\\-6\\19\\5\end{bmatrix}
\]
Applying Theorem SLSLC we recognize the search for these scalars as a solution to a linear system of equations with augmented matrix
\[
\begin{bmatrix}
1 & 2 & 7 & 1 & -1 & -15\\
1 & 1 & 3 & 1 & 0 & -6\\
3 & 2 & 5 & -1 & 9 & 19\\
1 & -1 & -5 & 2 & 0 & 5
\end{bmatrix}
\]
which row-reduces to
\[
\begin{bmatrix}
1 & 0 & -1 & 0 & 3 & 10\\
0 & 1 & 4 & 0 & -1 & -9\\
0 & 0 & 0 & 1 & -2 & -7\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
At this point, we see that the system is consistent (Theorem RCLS), so we know there is a solution for the five scalars $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$. This is enough evidence for us to say that $\mathbf{u}\in\langle S\rangle$. If we wished further evidence, we could compute an actual solution, say
\[
\alpha_1 = 2 \qquad \alpha_2 = 1 \qquad \alpha_3 = -2 \qquad \alpha_4 = -3 \qquad \alpha_5 = 2
\]
This particular solution allows us to write
\[
(2)\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + (1)\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + (-2)\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + (-3)\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + (2)\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \mathbf{u} = \begin{bmatrix}-15\\-6\\19\\5\end{bmatrix}
\]
making it even more obvious that $\mathbf{u}\in\langle S\rangle$.

Let us do it again. Is $\mathbf{v} = \begin{bmatrix}3\\1\\2\\-1\end{bmatrix}$ an element of $\langle S\rangle$? We are asking if there are scalars $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$ such that
\[
\alpha_1\begin{bmatrix}1\\1\\3\\1\end{bmatrix} + \alpha_2\begin{bmatrix}2\\1\\2\\-1\end{bmatrix} + \alpha_3\begin{bmatrix}7\\3\\5\\-5\end{bmatrix} + \alpha_4\begin{bmatrix}1\\1\\-1\\2\end{bmatrix} + \alpha_5\begin{bmatrix}-1\\0\\9\\0\end{bmatrix} = \mathbf{v} = \begin{bmatrix}3\\1\\2\\-1\end{bmatrix}
\]
Applying Theorem SLSLC we recognize the search for these scalars as a solution to a linear system of equations with augmented matrix
\[
\begin{bmatrix}
1 & 2 & 7 & 1 & -1 & 3\\
1 & 1 & 3 & 1 & 0 & 1\\
3 & 2 & 5 & -1 & 9 & 2\\
1 & -1 & -5 & 2 & 0 & -1
\end{bmatrix}
\]
which row-reduces to
\[
\begin{bmatrix}
1 & 0 & -1 & 0 & 3 & 0\\
0 & 1 & 4 & 0 & -1 & 0\\
0 & 0 & 0 & 1 & -2 & 0\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
At this point, we see that the system is inconsistent by Theorem RCLS, so we know there is not a solution for the five scalars $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$. This is enough
evidence for us to say that $\mathbf{v}\notin\langle S\rangle$. End of story.
△
Example SCAA  Span of the columns of Archetype A
Begin with the finite set of three vectors of size 3
\[
S = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\} = \left\{ \begin{bmatrix}1\\2\\1\end{bmatrix},\ \begin{bmatrix}-1\\1\\1\end{bmatrix},\ \begin{bmatrix}2\\1\\0\end{bmatrix} \right\}
\]
and consider the infinite set $\langle S\rangle$. The vectors of $S$ could have been chosen to be anything, but for reasons that will become clear later, we have chosen the three columns of the coefficient matrix in Archetype A.

First, as an example, note that
\[
\mathbf{v} = (5)\begin{bmatrix}1\\2\\1\end{bmatrix} + (-3)\begin{bmatrix}-1\\1\\1\end{bmatrix} + (7)\begin{bmatrix}2\\1\\0\end{bmatrix} = \begin{bmatrix}22\\14\\2\end{bmatrix}
\]
is in $\langle S\rangle$, since it is a linear combination of $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$. We write this succinctly as $\mathbf{v}\in\langle S\rangle$. There is nothing magical about the scalars $\alpha_1 = 5$, $\alpha_2 = -3$, $\alpha_3 = 7$; they could have been chosen to be anything. So repeat this part of the example yourself, using different values of $\alpha_1, \alpha_2, \alpha_3$. What happens if you choose all three scalars to be zero?

So we know how to quickly construct sample elements of the set $\langle S\rangle$. A slightly different question arises when you are handed a vector of the correct size and asked if it is an element of $\langle S\rangle$. For example, is $\mathbf{w} = \begin{bmatrix}1\\8\\5\end{bmatrix}$ in $\langle S\rangle$? More succinctly, $\mathbf{w}\in\langle S\rangle$?

To answer this question, we will look for scalars $\alpha_1, \alpha_2, \alpha_3$ so that
\[
\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 = \mathbf{w}
\]
By Theorem SLSLC solutions to this vector equation are solutions to the system of equations
\begin{align*}
\alpha_1 - \alpha_2 + 2\alpha_3 &= 1\\
2\alpha_1 + \alpha_2 + \alpha_3 &= 8\\
\alpha_1 + \alpha_2 &= 5
\end{align*}
Building the augmented matrix for this linear system, and row-reducing, gives
\[
\begin{bmatrix}
1 & 0 & 1 & 3\\
0 & 1 & -1 & 2\\
0 & 0 & 0 & 0
\end{bmatrix}
\]
This system has infinitely many solutions (there is a free variable in $x_3$), but all
we need is one solution vector. The solution
\[
\alpha_1 = 2 \qquad \alpha_2 = 3 \qquad \alpha_3 = 1
\]
tells us that
\[
(2)\mathbf{u}_1 + (3)\mathbf{u}_2 + (1)\mathbf{u}_3 = \mathbf{w}
\]
so we are convinced that $\mathbf{w}$ really is in $\langle S\rangle$. Notice that there are an infinite number of ways to answer this question affirmatively. We could choose a different solution, this time choosing the free variable to be zero,
\[
\alpha_1 = 3 \qquad \alpha_2 = 2 \qquad \alpha_3 = 0
\]
which shows us that
\[
(3)\mathbf{u}_1 + (2)\mathbf{u}_2 + (0)\mathbf{u}_3 = \mathbf{w}
\]
Verifying the arithmetic in this second solution will make it obvious that $\mathbf{w}$ is in this span. And of course, we now realize that there are an infinite number of ways to realize $\mathbf{w}$ as an element of $\langle S\rangle$.

Let us ask the same type of question again, but this time with $\mathbf{y} = \begin{bmatrix}2\\4\\3\end{bmatrix}$, i.e. is $\mathbf{y}\in\langle S\rangle$?

So we will look for scalars $\alpha_1, \alpha_2, \alpha_3$ so that
\[
\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 = \mathbf{y}
\]
By Theorem SLSLC solutions to this vector equation are the solutions to the system of equations
\begin{align*}
\alpha_1 - \alpha_2 + 2\alpha_3 &= 2\\
2\alpha_1 + \alpha_2 + \alpha_3 &= 4\\
\alpha_1 + \alpha_2 &= 3
\end{align*}
Building the augmented matrix for this linear system, and row-reducing, gives
\[
\begin{bmatrix}
1 & 0 & 1 & 0\\
0 & 1 & -1 & 0\\
0 & 0 & 0 & 1
\end{bmatrix}
\]
This system is inconsistent (there is a pivot column in the last column, Theorem RCLS), so there are no scalars $\alpha_1, \alpha_2, \alpha_3$ that will create a linear combination of $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ that equals $\mathbf{y}$. More precisely, $\mathbf{y}\notin\langle S\rangle$.

There are three things to observe in this example. (1) It is easy to construct vectors in $\langle S\rangle$. (2) It is possible that some vectors are in $\langle S\rangle$ (e.g. $\mathbf{w}$), while others are not (e.g. $\mathbf{y}$). (3) Deciding if a given vector is in $\langle S\rangle$ leads to solving a linear system of equations and asking if the system is consistent.
With a computer program in hand to solve systems of linear equations, could you create a program to decide if a vector was, or was not, in the span of a given set of vectors? Is this art or science?

This example was built on vectors from the columns of the coefficient matrix of Archetype A. Study the determination that $\mathbf{v}\in\langle S\rangle$ and see if you can connect it with some of the other properties of Archetype A.
△<br />
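The question just raised has a clean answer: deciding membership in a span is science, not art, since by Theorem SLSLC it reduces to a consistency check on a linear system. The sketch below (function names are our own) uses exact rational arithmetic and is tested on the vectors $\mathbf{u}$ and $\mathbf{v}$ of Example ABS.

```python
from fractions import Fraction

def rref(M):
    """Reduced row-echelon form, computed with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]       # move a nonzero entry up
        M[r] = [x / M[r][c] for x in M[r]]    # scale the pivot to 1
        for i in range(rows):
            if i != r and M[i][c] != 0:       # clear the rest of the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

def in_span(vectors, target):
    """Membership via Theorem SLSLC: augment the matrix whose columns are the
    vectors with the target, and test consistency (Theorem RCLS)."""
    aug = [[v[i] for v in vectors] + [target[i]] for i in range(len(target))]
    # Inconsistent exactly when some row reads (0 ... 0 | nonzero).
    return not any(all(x == 0 for x in row[:-1]) and row[-1] != 0
                   for row in rref(aug))

# The five vectors of S from Example ABS.
S = [[1, 1, 3, 1], [2, 1, 2, -1], [7, 3, 5, -5], [1, 1, -1, 2], [-1, 0, 9, 0]]
assert in_span(S, [-15, -6, 19, 5])       # u is in <S>
assert not in_span(S, [3, 1, 2, -1])      # v is not
```

The two assertions reproduce the row-reduction verdicts of Example ABS.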
Having analyzed Archetype A in Example SCAA, we will of course subject Archetype B to a similar investigation.
Example SCAB  Span of the columns of Archetype B
Begin with the finite set of three vectors of size 3 that are the columns of the coefficient matrix in Archetype B,
\[
R = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} = \left\{ \begin{bmatrix}-7\\5\\1\end{bmatrix},\ \begin{bmatrix}-6\\5\\0\end{bmatrix},\ \begin{bmatrix}-12\\7\\4\end{bmatrix} \right\}
\]
and consider the infinite set $\langle R\rangle$.

First, as an example, note that
\[
\mathbf{x} = (2)\begin{bmatrix}-7\\5\\1\end{bmatrix} + (4)\begin{bmatrix}-6\\5\\0\end{bmatrix} + (-3)\begin{bmatrix}-12\\7\\4\end{bmatrix} = \begin{bmatrix}-2\\9\\-10\end{bmatrix}
\]
is in $\langle R\rangle$, since it is a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$. In other words, $\mathbf{x}\in\langle R\rangle$. Try some different values of $\alpha_1, \alpha_2, \alpha_3$ yourself, and see what vectors you can create as elements of $\langle R\rangle$.
Now ask if a given vector is an element of $\langle R\rangle$. For example, is $\mathbf{z} = \begin{bmatrix}-33\\24\\5\end{bmatrix}$ in $\langle R\rangle$? Is $\mathbf{z}\in\langle R\rangle$?

To answer this question, we will look for scalars $\alpha_1, \alpha_2, \alpha_3$ so that
\[
\alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \alpha_3\mathbf{v}_3 = \mathbf{z}
\]
By Theorem SLSLC solutions to this vector equation are the solutions to the system of equations
\begin{align*}
-7\alpha_1 - 6\alpha_2 - 12\alpha_3 &= -33\\
5\alpha_1 + 5\alpha_2 + 7\alpha_3 &= 24\\
\alpha_1 + 4\alpha_3 &= 5
\end{align*}
Building the augmented matrix for this linear system, and row-reducing, gives
\[
\begin{bmatrix}
1 & 0 & 0 & -3\\
0 & 1 & 0 & 5\\
0 & 0 & 1 & 2
\end{bmatrix}
\]
This system has a unique solution,
\[
\alpha_1 = -3 \qquad \alpha_2 = 5 \qquad \alpha_3 = 2
\]
telling us that
\[
(-3)\mathbf{v}_1 + (5)\mathbf{v}_2 + (2)\mathbf{v}_3 = \mathbf{z}
\]
so we are convinced that $\mathbf{z}$ really is in $\langle R\rangle$. Notice that in this case we have only one way to answer the question affirmatively since the solution is unique.

Let us ask about another vector, say is $\mathbf{x} = \begin{bmatrix}-7\\8\\-3\end{bmatrix}$ in $\langle R\rangle$? Is $\mathbf{x}\in\langle R\rangle$?

We desire scalars $\alpha_1, \alpha_2, \alpha_3$ so that
\[
\alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \alpha_3\mathbf{v}_3 = \mathbf{x}
\]
By Theorem SLSLC solutions to this vector equation are the solutions to the system of equations
\begin{align*}
-7\alpha_1 - 6\alpha_2 - 12\alpha_3 &= -7\\
5\alpha_1 + 5\alpha_2 + 7\alpha_3 &= 8\\
\alpha_1 + 4\alpha_3 &= -3
\end{align*}
Building the augmented matrix for this linear system, and row-reducing, gives
\[
\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & 2\\
0 & 0 & 1 & -1
\end{bmatrix}
\]
This system has a unique solution,
\[
\alpha_1 = 1 \qquad \alpha_2 = 2 \qquad \alpha_3 = -1
\]
telling us that
\[
(1)\mathbf{v}_1 + (2)\mathbf{v}_2 + (-1)\mathbf{v}_3 = \mathbf{x}
\]
so we are convinced that $\mathbf{x}$ really is in $\langle R\rangle$. Notice that in this case we again have only one way to answer the question affirmatively since the solution is again unique.

We could continue to test other vectors for membership in $\langle R\rangle$, but there is no point. A question about membership in $\langle R\rangle$ inevitably leads to a system of three equations in the three variables $\alpha_1, \alpha_2, \alpha_3$ with a coefficient matrix whose columns are the vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$. This particular coefficient matrix is nonsingular, so by Theorem NMUS, the system is guaranteed to have a solution. (This solution is unique, but that is not critical here.) So no matter which vector we might have chosen for $\mathbf{z}$, we would have been certain to discover that it was an element of $\langle R\rangle$. Stated differently, every vector of size 3 is in $\langle R\rangle$, or $\langle R\rangle = \mathbb{C}^3$.

Compare this example with Example SCAA, and see if you can connect $\mathbf{z}$ with some aspects of the write-up for Archetype B.
△<br />
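The closing argument can be checked mechanically: the coefficient matrix built from $R$ is nonsingular (its determinant is nonzero), so every membership question has exactly one answer. A sketch using Cramer's rule, with helper names of our own, reproducing both unique solutions from the example:

```python
from fractions import Fraction

def det3(M):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)

def solve3(M, rhs):
    """Solve a 3x3 system by Cramer's rule (valid because det(M) != 0)."""
    d = det3(M)
    sol = []
    for j in range(3):
        Mj = [row[:] for row in M]
        for i in range(3):
            Mj[i][j] = rhs[i]          # replace column j with the right-hand side
        sol.append(Fraction(det3(Mj), d))
    return sol

B = [[-7, -6, -12], [5, 5, 7], [1, 0, 4]]     # columns of R as a matrix
assert det3(B) == -2                           # nonzero: B is nonsingular
assert solve3(B, [-33, 24, 5]) == [-3, 5, 2]   # the unique scalars for z
assert solve3(B, [-7, 8, -3]) == [1, 2, -1]    # the unique scalars for x
```

Since the determinant is nonzero, `solve3` succeeds for any right-hand side, which is the computational face of $\langle R\rangle = \mathbb{C}^3$.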
Subsection SSNS
Spanning Sets of Null Spaces

We saw in Example VFSAL that when a system of equations is homogeneous the solution set can be expressed in the form described by Theorem VFSLS where the vector $\mathbf{c}$ is the zero vector. We can essentially ignore this vector, so that the remainder of the typical expression for a solution looks like an arbitrary linear combination, where the scalars are the free variables and the vectors are $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_{n-r}$. Which sounds a lot like a span. This is the substance of the next theorem.
Theorem SSNS  Spanning Sets for Null Spaces
Suppose that $A$ is an $m\times n$ matrix, and $B$ is a row-equivalent matrix in reduced row-echelon form. Suppose that $B$ has $r$ pivot columns, with indices given by $D = \{d_1, d_2, d_3, \ldots, d_r\}$, while the $n - r$ non-pivot columns have indices $F = \{f_1, f_2, f_3, \ldots, f_{n-r}, n+1\}$. Construct the $n - r$ vectors $\mathbf{z}_j$, $1\le j\le n-r$ of size $n$,
\[
[\mathbf{z}_j]_i = \begin{cases} 1 & \text{if } i\in F,\ i = f_j\\ 0 & \text{if } i\in F,\ i \ne f_j\\ -[B]_{k,f_j} & \text{if } i\in D,\ i = d_k \end{cases}
\]
Then the null space of $A$ is given by
\[
\mathcal{N}(A) = \langle\{\mathbf{z}_1, \mathbf{z}_2, \mathbf{z}_3, \ldots, \mathbf{z}_{n-r}\}\rangle
\]
Proof. Consider the homogeneous system with $A$ as a coefficient matrix, $\mathcal{LS}(A,\mathbf{0})$. Its set of solutions, $S$, is by Definition NSM the null space of $A$, $\mathcal{N}(A)$. Let $B'$ denote the result of row-reducing the augmented matrix of this homogeneous system. Since the system is homogeneous, the final column of the augmented matrix will be all zeros, and after any number of row operations (Definition RO), the column will still be all zeros. So $B'$ has a final column that is totally zeros.

Now apply Theorem VFSLS to $B'$, after noting that our homogeneous system must be consistent (Theorem HSC). The vector $\mathbf{c}$ has zeros for each entry that has an index in $F$. For entries with their index in $D$, the value is $-[B']_{k,n+1}$, but for $B'$ any entry in the final column (index $n+1$) is zero. So $\mathbf{c} = \mathbf{0}$. The vectors $\mathbf{z}_j$, $1\le j\le n-r$ are identical to the vectors $\mathbf{u}_j$, $1\le j\le n-r$ described in Theorem VFSLS. Putting it all together and applying Definition SSCV in the final step,
\begin{align*}
\mathcal{N}(A) &= S\\
&= \{\mathbf{c} + \alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 + \cdots + \alpha_{n-r}\mathbf{u}_{n-r} \mid \alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{n-r}\in\mathbb{C}\}\\
&= \{\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 + \cdots + \alpha_{n-r}\mathbf{u}_{n-r} \mid \alpha_1, \alpha_2, \alpha_3, \ldots, \alpha_{n-r}\in\mathbb{C}\}\\
&= \langle\{\mathbf{z}_1, \mathbf{z}_2, \mathbf{z}_3, \ldots, \mathbf{z}_{n-r}\}\rangle
\end{align*}
■
Notice that the hypotheses of Theorem VFSLS and Theorem SSNS are slightly different. In the former, $B$ is the row-reduced version of an augmented matrix of a linear system, while in the latter, $B$ is the row-reduced version of an arbitrary matrix. Understanding this subtlety now will avoid confusion later.
Example SSNS  Spanning set of a null space
Find a set of vectors, $S$, so that the null space of the matrix $A$ below is the span of $S$, that is, $\langle S\rangle = \mathcal{N}(A)$.
\[
A = \begin{bmatrix}
1 & 3 & 3 & -1 & -5\\
2 & 5 & 7 & 1 & 1\\
1 & 1 & 5 & 1 & 5\\
-1 & -4 & -2 & 0 & 4
\end{bmatrix}
\]
The null space of $A$ is the set of all solutions to the homogeneous system $\mathcal{LS}(A,\mathbf{0})$. If we find the vector form of the solutions to this homogeneous system (Theorem VFSLS) then the vectors $\mathbf{u}_j$, $1\le j\le n-r$ in the linear combination are exactly the vectors $\mathbf{z}_j$, $1\le j\le n-r$ described in Theorem SSNS. So we can mimic Example VFSAL to arrive at these vectors (rather than being a slave to the formulas in the statement of the theorem).

Begin by row-reducing $A$. The result is
\[
\begin{bmatrix}
1 & 0 & 6 & 0 & 4\\
0 & 1 & -1 & 0 & -2\\
0 & 0 & 0 & 1 & 3\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
With $D = \{1, 2, 4\}$ and $F = \{3, 5\}$ we recognize that $x_3$ and $x_5$ are free variables and we can interpret each nonzero row as an expression for the dependent variables $x_1$, $x_2$, $x_4$ (respectively) in the free variables $x_3$ and $x_5$. With this we can write the vector form of a solution vector as
\[
\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix} = \begin{bmatrix}-6x_3 - 4x_5\\x_3 + 2x_5\\x_3\\-3x_5\\x_5\end{bmatrix} = x_3\begin{bmatrix}-6\\1\\1\\0\\0\end{bmatrix} + x_5\begin{bmatrix}-4\\2\\0\\-3\\1\end{bmatrix}
\]
Then in the notation of Theorem SSNS,
\[
\mathbf{z}_1 = \begin{bmatrix}-6\\1\\1\\0\\0\end{bmatrix} \qquad \mathbf{z}_2 = \begin{bmatrix}-4\\2\\0\\-3\\1\end{bmatrix}
\]
and
\[
\mathcal{N}(A) = \langle\{\mathbf{z}_1, \mathbf{z}_2\}\rangle = \left\langle\left\{ \begin{bmatrix}-6\\1\\1\\0\\0\end{bmatrix},\ \begin{bmatrix}-4\\2\\0\\-3\\1\end{bmatrix} \right\}\right\rangle
\]
△
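As a quick check on the example, both spanning vectors really do solve $\mathcal{LS}(A,\mathbf{0})$, as Theorem SSNS promises. A few lines of Python confirm it (the helper name is our own):

```python
def mat_vec(A, x):
    """Matrix-vector product, with A stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# The matrix A of Example SSNS and the two vectors spanning its null space.
A = [[1, 3, 3, -1, -5],
     [2, 5, 7, 1, 1],
     [1, 1, 5, 1, 5],
     [-1, -4, -2, 0, 4]]
z1 = [-6, 1, 1, 0, 0]
z2 = [-4, 2, 0, -3, 1]

assert mat_vec(A, z1) == [0, 0, 0, 0]
assert mat_vec(A, z2) == [0, 0, 0, 0]
```

By linearity, every linear combination of $\mathbf{z}_1$ and $\mathbf{z}_2$ then also lies in $\mathcal{N}(A)$.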
Example NSDS  Null space directly as a span
Let us express the null space of $A$ as the span of a set of vectors, applying Theorem SSNS as economically as possible, without reference to the underlying homogeneous system of equations (in contrast to Example SSNS).
\[
A = \begin{bmatrix}
2 & 1 & 5 & 1 & 5 & 1\\
1 & 1 & 3 & 1 & 6 & -1\\
-1 & 1 & -1 & 0 & 4 & -3\\
-3 & 2 & -4 & -4 & -7 & 0\\
3 & -1 & 5 & 2 & 2 & 3
\end{bmatrix}
\]
Theorem SSNS creates vectors for the span by first row-reducing the matrix in question. The row-reduced version of $A$ is
\[
B = \begin{bmatrix}
1 & 0 & 2 & 0 & -1 & 2\\
0 & 1 & 1 & 0 & 3 & -1\\
0 & 0 & 0 & 1 & 4 & -2\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
We will mechanically follow the prescription of Theorem SSNS. Here we go, in two big steps.

First, the non-pivot columns have indices $F = \{3, 5, 6\}$, so we will construct the $n - r = 6 - 3 = 3$ vectors with a pattern of zeros and ones dictated by the indices in $F$. This is the realization of the first two lines of the three-case definition of the vectors $\mathbf{z}_j$, $1\le j\le n-r$.
\[
\mathbf{z}_1 = \begin{bmatrix}\ \\ \ \\ 1\\ \ \\ 0\\ 0\end{bmatrix} \qquad
\mathbf{z}_2 = \begin{bmatrix}\ \\ \ \\ 0\\ \ \\ 1\\ 0\end{bmatrix} \qquad
\mathbf{z}_3 = \begin{bmatrix}\ \\ \ \\ 0\\ \ \\ 0\\ 1\end{bmatrix}
\]
Each of these vectors arises due to the presence of a column that is not a pivot column. The remaining entries of each vector are the entries of the non-pivot column, negated, and distributed into the empty slots in order (these slots have indices in the set $D$, so also refer to pivot columns). This is the realization of the third line of the three-case definition of the vectors $\mathbf{z}_j$, $1\le j\le n-r$.
\[
\mathbf{z}_1 = \begin{bmatrix}-2\\-1\\1\\0\\0\\0\end{bmatrix}\qquad
\mathbf{z}_2 = \begin{bmatrix}1\\-3\\0\\-4\\1\\0\end{bmatrix}\qquad
\mathbf{z}_3 = \begin{bmatrix}-2\\1\\0\\2\\0\\1\end{bmatrix}
\]
So, by Theorem SSNS, we have
\[
\mathcal{N}(A) = \langle\{\mathbf{z}_1,\,\mathbf{z}_2,\,\mathbf{z}_3\}\rangle = \left\langle\left\{
\begin{bmatrix}-2\\-1\\1\\0\\0\\0\end{bmatrix},\;
\begin{bmatrix}1\\-3\\0\\-4\\1\\0\end{bmatrix},\;
\begin{bmatrix}-2\\1\\0\\2\\0\\1\end{bmatrix}
\right\}\right\rangle
\]
We know that the null space of A is the solution set of the homogeneous system LS(A, 0), but nowhere in this application of Theorem SSNS have we found occasion to reference the variables or equations of this system. These details are all buried in the proof of Theorem SSNS. △
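The computation in Example NSDS can be checked mechanically. The following sketch is an illustration we add here, not part of the original text; it assumes SymPy is available, whose `nullspace` method builds basis vectors by exactly the free-variable pattern of Theorem SSNS.

```python
from sympy import Matrix

# The matrix A from Example NSDS.
A = Matrix([
    [ 2,  1,  5,  1,  5,  1],
    [ 1,  1,  3,  1,  6, -1],
    [-1,  1, -1,  0,  4, -3],
    [-3,  2, -4, -4, -7,  0],
    [ 3, -1,  5,  2,  2,  3],
])

# Row-reduce: pivots fall in columns 1, 2, 4 (0-indexed: 0, 1, 3),
# so the non-pivot columns are F = {3, 5, 6} as in the example.
B, pivots = A.rref()
print(pivots)  # (0, 1, 3)

# nullspace() returns one basis vector per non-pivot column, with the
# 0/1 pattern in the free slots and negated pivot-column entries
# distributed into the remaining slots -- the vectors z1, z2, z3.
for z in A.nullspace():
    assert A * z == Matrix.zeros(5, 1)   # each z solves LS(A, 0)
```

Running this reproduces the three vectors z1, z2, z3 of the example without ever writing down the underlying system of equations.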
Here is an example that will simultaneously exercise the span construction and Theorem SSNS, while also pointing the way to the next section.
Example SCAD Span of the columns of Archetype D
Begin with the set of four vectors of size 3
\[
T = \{\mathbf{w}_1,\,\mathbf{w}_2,\,\mathbf{w}_3,\,\mathbf{w}_4\} = \left\{
\begin{bmatrix}2\\-3\\1\end{bmatrix},\;
\begin{bmatrix}1\\4\\1\end{bmatrix},\;
\begin{bmatrix}7\\-5\\4\end{bmatrix},\;
\begin{bmatrix}-7\\-6\\-5\end{bmatrix}
\right\}
\]
and consider the infinite set W = 〈T〉. The vectors of T have been chosen as the four columns of the coefficient matrix in Archetype D. Check that the vector
\[
\mathbf{z}_2 = \begin{bmatrix}2\\3\\0\\1\end{bmatrix}
\]
is a solution to the homogeneous system LS(D, 0) (it is the vector z2 provided by the description of the null space of the coefficient matrix D from Theorem SSNS).
Applying Theorem SLSLC, we can write the linear combination,
\[
2\mathbf{w}_1 + 3\mathbf{w}_2 + 0\mathbf{w}_3 + 1\mathbf{w}_4 = \mathbf{0}
\]
which we can solve for w4,
\[
\mathbf{w}_4 = (-2)\mathbf{w}_1 + (-3)\mathbf{w}_2.
\]
This equation says that whenever we encounter the vector w4, we can replace it with a specific linear combination of the vectors w1 and w2. So using w4 in the set T, along with w1 and w2, is excessive. An example of what we mean here can be illustrated by the computation,
\begin{align*}
5\mathbf{w}_1 + (-4)\mathbf{w}_2 + 6\mathbf{w}_3 + (-3)\mathbf{w}_4
&= 5\mathbf{w}_1 + (-4)\mathbf{w}_2 + 6\mathbf{w}_3 + (-3)\left((-2)\mathbf{w}_1 + (-3)\mathbf{w}_2\right)\\
&= 5\mathbf{w}_1 + (-4)\mathbf{w}_2 + 6\mathbf{w}_3 + (6\mathbf{w}_1 + 9\mathbf{w}_2)\\
&= 11\mathbf{w}_1 + 5\mathbf{w}_2 + 6\mathbf{w}_3
\end{align*}
So what began as a linear combination of the vectors w1, w2, w3, w4 has been reduced to a linear combination of the vectors w1, w2, w3. A careful proof using our definition of set equality (Definition SE) would now allow us to conclude that this reduction is possible for any vector in W, so
\[
W = \langle\{\mathbf{w}_1,\,\mathbf{w}_2,\,\mathbf{w}_3\}\rangle
\]
So the span of our set of vectors, W, has not changed, but we have described it by the span of a set of three vectors, rather than four. Furthermore, we can achieve yet another, similar, reduction.
Check that the vector
\[
\mathbf{z}_1 = \begin{bmatrix}-3\\-1\\1\\0\end{bmatrix}
\]
is a solution to the homogeneous system LS(D, 0) (it is the vector z1 provided by the description of the null space of the coefficient matrix D from Theorem SSNS).
Applying Theorem SLSLC, we can write the linear combination,
\[
(-3)\mathbf{w}_1 + (-1)\mathbf{w}_2 + 1\mathbf{w}_3 = \mathbf{0}
\]
which we can solve for w3,
\[
\mathbf{w}_3 = 3\mathbf{w}_1 + 1\mathbf{w}_2
\]
This equation says that whenever we encounter the vector w3, we can replace it with a specific linear combination of the vectors w1 and w2. So, as before, the vector w3 is not needed in the description of W, provided we have w1 and w2 available. In particular, a careful proof (such as is done in Example RSC5) would show that
\[
W = \langle\{\mathbf{w}_1,\,\mathbf{w}_2\}\rangle
\]
So W began life as the span of a set of four vectors, and we have now shown (utilizing solutions to a homogeneous system) that W can also be described as the span of a set of just two vectors. Convince yourself that we cannot go any further. In other words, it is not possible to dismiss either w1 or w2 in a similar fashion and winnow the set down to just one vector.
What was it about the original set of four vectors that allowed us to declare certain vectors as surplus? And just which vectors were we able to dismiss? And why did we have to stop once we had two vectors remaining? The answers to these questions motivate "linear independence," our next section and next definition, and so are worth considering carefully now. △
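The two reductions in Example SCAD rest on two dependency relations, and both can be verified in a few lines. This sketch is our addition (assuming SymPy), not part of the original text.

```python
from sympy import Matrix

# The four columns of the coefficient matrix of Archetype D.
w1 = Matrix([2, -3, 1])
w2 = Matrix([1, 4, 1])
w3 = Matrix([7, -5, 4])
w4 = Matrix([-7, -6, -5])

# The two relations of linear dependence derived from z2 and z1.
assert 2*w1 + 3*w2 + 0*w3 + 1*w4 == Matrix([0, 0, 0])
assert (-3)*w1 + (-1)*w2 + 1*w3 == Matrix([0, 0, 0])

# Equivalently, each discarded vector is a combination of w1 and w2,
# so the span collapses from four vectors down to two.
assert w4 == -2*w1 - 3*w2
assert w3 == 3*w1 + w2

# We cannot winnow further: w1 and w2 are not scalar multiples of
# each other, so neither one spans the other.
assert Matrix.hstack(w1, w2).rank() == 2
```

The final rank check is a compact way of stating the "convince yourself" step: no further vector can be dismissed.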
Reading Questions
1. Let S be the set of three vectors below.
\[
S = \left\{
\begin{bmatrix}1\\2\\-1\end{bmatrix},\;
\begin{bmatrix}3\\-4\\2\end{bmatrix},\;
\begin{bmatrix}4\\-2\\1\end{bmatrix}
\right\}
\]
Let W = 〈S〉 be the span of S. Is the vector $\begin{bmatrix}-1\\8\\-4\end{bmatrix}$ in W? Give an explanation of the reason for your answer.
2. Use S and W from the previous question. Is the vector $\begin{bmatrix}6\\5\\-1\end{bmatrix}$ in W? Give an explanation of the reason for your answer.
3. For the matrix A below, find a set S so that 〈S〉 = N(A), where N(A) is the null space of A. (See Theorem SSNS.)
\[
A = \begin{bmatrix}
1 & 3 & 1 & 9\\
2 & 1 & -3 & 8\\
1 & 1 & -1 & 5
\end{bmatrix}
\]
Exercises
C22† For each archetype that is a system of equations, consider the corresponding homogeneous system of equations. Write elements of the solution set to these homogeneous systems in vector form, as guaranteed by Theorem VFSLS. Then write the null space of the coefficient matrix of each system as the span of a set of vectors, as described in Theorem SSNS.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J
C23† Archetype K and Archetype L are defined as matrices. Use Theorem SSNS directly to find a set S so that 〈S〉 is the null space of the matrix. Do not make any reference to the associated homogeneous system of equations in your solution.
C40† Suppose that
\[
S = \left\{
\begin{bmatrix}2\\-1\\3\\4\end{bmatrix},\;
\begin{bmatrix}3\\2\\-2\\1\end{bmatrix}
\right\}
\]
Let W = 〈S〉 and let $\mathbf{x} = \begin{bmatrix}5\\8\\-12\\-5\end{bmatrix}$. Is x ∈ W? If so, provide an explicit linear combination that demonstrates this.
C41† Suppose that
\[
S = \left\{
\begin{bmatrix}2\\-1\\3\\4\end{bmatrix},\;
\begin{bmatrix}3\\2\\-2\\1\end{bmatrix}
\right\}
\]
Let W = 〈S〉 and let $\mathbf{y} = \begin{bmatrix}5\\1\\3\\5\end{bmatrix}$. Is y ∈ W? If so, provide an explicit linear combination that demonstrates this.
C42† Suppose
\[
R = \left\{
\begin{bmatrix}2\\-1\\3\\4\\0\end{bmatrix},\;
\begin{bmatrix}1\\1\\2\\2\\-1\end{bmatrix},\;
\begin{bmatrix}3\\-1\\0\\3\\-2\end{bmatrix}
\right\}
\]
Is $\mathbf{y} = \begin{bmatrix}1\\-1\\-8\\-4\\-3\end{bmatrix}$ in 〈R〉?
C43† Suppose
\[
R = \left\{
\begin{bmatrix}2\\-1\\3\\4\\0\end{bmatrix},\;
\begin{bmatrix}1\\1\\2\\2\\-1\end{bmatrix},\;
\begin{bmatrix}3\\-1\\0\\3\\-2\end{bmatrix}
\right\}
\]
Is $\mathbf{z} = \begin{bmatrix}1\\1\\5\\3\\1\end{bmatrix}$ in 〈R〉?
C44† Suppose that
\[
S = \left\{
\begin{bmatrix}-1\\2\\1\end{bmatrix},\;
\begin{bmatrix}3\\1\\2\end{bmatrix},\;
\begin{bmatrix}1\\5\\4\end{bmatrix},\;
\begin{bmatrix}-6\\5\\1\end{bmatrix}
\right\}
\]
Let W = 〈S〉 and let $\mathbf{y} = \begin{bmatrix}-5\\3\\0\end{bmatrix}$. Is y ∈ W? If so, provide an explicit linear combination that demonstrates this.
C45† Suppose that
\[
S = \left\{
\begin{bmatrix}-1\\2\\1\end{bmatrix},\;
\begin{bmatrix}3\\1\\2\end{bmatrix},\;
\begin{bmatrix}1\\5\\4\end{bmatrix},\;
\begin{bmatrix}-6\\5\\1\end{bmatrix}
\right\}
\]
Let W = 〈S〉 and let $\mathbf{w} = \begin{bmatrix}2\\1\\3\end{bmatrix}$. Is w ∈ W? If so, provide an explicit linear combination that demonstrates this.
C50† Let A be the matrix below.
1. Find a set S so that N(A) = 〈S〉.
2. If $\mathbf{z} = \begin{bmatrix}3\\-5\\1\\2\end{bmatrix}$, then show directly that z ∈ N(A).
3. Write z as a linear combination of the vectors in S.
\[
A = \begin{bmatrix}
2 & 3 & 1 & 4\\
1 & 2 & 1 & 3\\
-1 & 0 & 1 & 1
\end{bmatrix}
\]
C60† For the matrix A below, find a set of vectors S so that the span of S equals the null space of A, 〈S〉 = N(A).
\[
A = \begin{bmatrix}
1 & 1 & 6 & -8\\
1 & -2 & 0 & 1\\
-2 & 1 & -6 & 7
\end{bmatrix}
\]
M10† Consider the set of all size 2 vectors in the Cartesian plane R².
1. Give a geometric description of the span of a single vector.
2. How can you tell if two vectors span the entire plane, without doing any row reduction or calculation?
M11† Consider the set of all size 3 vectors in Cartesian 3-space R³.
1. Give a geometric description of the span of a single vector.
2. Describe the possibilities for the span of two vectors.
3. Describe the possibilities for the span of three vectors.
M12† Let $\mathbf{u} = \begin{bmatrix}1\\3\\-2\end{bmatrix}$ and $\mathbf{v} = \begin{bmatrix}2\\-2\\1\end{bmatrix}$.
1. Find a vector w1, different from u and v, so that 〈{u, v, w1}〉 = 〈{u, v}〉.
2. Find a vector w2 so that 〈{u, v, w2}〉 ≠ 〈{u, v}〉.
M20 In Example SCAD we began with the four columns of the coefficient matrix of Archetype D, and used these columns in a span construction. Then we methodically argued that we could remove the last column, then the third column, and create the same set by just doing a span construction with the first two columns. We claimed we could not go any further, and had removed as many vectors as possible. Provide a convincing argument for why a third vector cannot be removed.
M21† In the spirit of Example SCAD, begin with the four columns of the coefficient matrix of Archetype C, and use these columns in a span construction to build the set S. Argue that S can be expressed as the span of just three of the columns of the coefficient matrix (saying exactly which three) and in the spirit of Exercise SS.M20 argue that no one of these three vectors can be removed and still have a span construction create S.
T10† Suppose that v1, v2 ∈ Cᵐ. Prove that
\[
\langle\{\mathbf{v}_1,\,\mathbf{v}_2\}\rangle = \langle\{\mathbf{v}_1,\,\mathbf{v}_2,\,5\mathbf{v}_1 + 3\mathbf{v}_2\}\rangle
\]
T20† Suppose that S is a set of vectors from Cᵐ. Prove that the zero vector, 0, is an element of 〈S〉.
T21 Suppose that S is a set of vectors from Cᵐ and x, y ∈ 〈S〉. Prove that x + y ∈ 〈S〉.
T22 Suppose that S is a set of vectors from Cᵐ, α ∈ C, and x ∈ 〈S〉. Prove that αx ∈ 〈S〉.
Section LI
Linear Independence
"Linear independence" is one of the most fundamental conceptual ideas in linear algebra, along with the notion of a span. So this section, and the subsequent Section LDS, will explore this new idea.
Subsection LISV
Linearly Independent Sets of Vectors
Theorem SLSLC tells us that a solution to a homogeneous system of equations is a linear combination of the columns of the coefficient matrix that equals the zero vector. We used just this situation to our advantage (twice!) in Example SCAD where we reduced the set of vectors used in a span construction from four down to two, by declaring certain vectors as surplus. The next two definitions will allow us to formalize this situation.
Definition RLDCV Relation of Linear Dependence for Column Vectors
Given a set of vectors S = {u1, u2, u3, ..., un}, a true statement of the form
\[
\alpha_1\mathbf{u}_1 + \alpha_2\mathbf{u}_2 + \alpha_3\mathbf{u}_3 + \cdots + \alpha_n\mathbf{u}_n = \mathbf{0}
\]
is a relation of linear dependence on S. If this statement is formed in a trivial fashion, i.e. αᵢ = 0, 1 ≤ i ≤ n, then we say it is the trivial relation of linear dependence on S. □
Definition LICV Linear Independence of Column Vectors
The set of vectors S = {u1, u2, u3, ..., un} is linearly dependent if there is a relation of linear dependence on S that is not trivial. In the case where the only relation of linear dependence on S is the trivial one, then S is a linearly independent set of vectors. □
Notice that a relation of linear dependence is an equation. Though most of it is a linear combination, it is not a linear combination (that would be a vector). Linear independence is a property of a set of vectors. It is easy to take a set of vectors, and an equal number of scalars, all zero, and form a linear combination that equals the zero vector. When the easy way is the only way, then we say the set is linearly independent. Here are a couple of examples.
Example LDS Linearly dependent set in C⁵
Consider the set of n = 4 vectors from C⁵,
\[
S = \left\{
\begin{bmatrix}2\\-1\\3\\1\\2\end{bmatrix},\;
\begin{bmatrix}1\\2\\-1\\5\\2\end{bmatrix},\;
\begin{bmatrix}2\\1\\-3\\6\\1\end{bmatrix},\;
\begin{bmatrix}-6\\7\\-1\\0\\1\end{bmatrix}
\right\}
\]
To determine linear independence we first form a relation of linear dependence,
\[
\alpha_1\begin{bmatrix}2\\-1\\3\\1\\2\end{bmatrix}
+ \alpha_2\begin{bmatrix}1\\2\\-1\\5\\2\end{bmatrix}
+ \alpha_3\begin{bmatrix}2\\1\\-3\\6\\1\end{bmatrix}
+ \alpha_4\begin{bmatrix}-6\\7\\-1\\0\\1\end{bmatrix}
= \mathbf{0}
\]
We know that α1 = α2 = α3 = α4 = 0 is a solution to this equation, but that is of no interest whatsoever. That is always the case, no matter what four vectors we might have chosen. We are curious to know if there are other, nontrivial, solutions. Theorem SLSLC tells us that we can find such solutions as solutions to the homogeneous system LS(A, 0) where the coefficient matrix has these four vectors as columns, which we then row-reduce
\[
A = \begin{bmatrix}
2 & 1 & 2 & -6\\
-1 & 2 & 1 & 7\\
3 & -1 & -3 & -1\\
1 & 5 & 6 & 0\\
2 & 2 & 1 & 1
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & -2\\
0 & 1 & 0 & 4\\
0 & 0 & 1 & -3\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}
\]
We could solve this homogeneous system completely, but for this example all we need is one nontrivial solution. Setting the lone free variable to any nonzero value, such as x4 = 1, yields the nontrivial solution
\[
\mathbf{x} = \begin{bmatrix}2\\-4\\3\\1\end{bmatrix}
\]
completing our application of Theorem SLSLC, we have
\[
2\begin{bmatrix}2\\-1\\3\\1\\2\end{bmatrix}
+ (-4)\begin{bmatrix}1\\2\\-1\\5\\2\end{bmatrix}
+ 3\begin{bmatrix}2\\1\\-3\\6\\1\end{bmatrix}
+ 1\begin{bmatrix}-6\\7\\-1\\0\\1\end{bmatrix}
= \mathbf{0}
\]
This is a relation of linear dependence on S that is not trivial, so we conclude that S is linearly dependent. △
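The free-variable computation in Example LDS is easy to check numerically. The sketch below is an illustration we add here (assuming SymPy), not part of the original text.

```python
from sympy import Matrix

# The four vectors of S from Example LDS, as the columns of A.
A = Matrix([
    [ 2,  1,  2, -6],
    [-1,  2,  1,  7],
    [ 3, -1, -3, -1],
    [ 1,  5,  6,  0],
    [ 2,  2,  1,  1],
])

# Row-reducing leaves one free column, so LS(A, 0) has nontrivial
# solutions and the columns form a linearly dependent set.
_, pivots = A.rref()
assert pivots == (0, 1, 2)          # r = 3 < 4 = n

# The particular solution found in the example: x = (2, -4, 3, 1).
x = Matrix([2, -4, 3, 1])
assert A * x == Matrix.zeros(5, 1)  # a nontrivial relation of
                                    # linear dependence on S
```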
Example LIS Linearly independent set in C⁵
Consider the set of n = 4 vectors from C⁵,
\[
T = \left\{
\begin{bmatrix}2\\-1\\3\\1\\2\end{bmatrix},\;
\begin{bmatrix}1\\2\\-1\\5\\2\end{bmatrix},\;
\begin{bmatrix}2\\1\\-3\\6\\1\end{bmatrix},\;
\begin{bmatrix}-6\\7\\-1\\1\\1\end{bmatrix}
\right\}
\]
To determine linear independence we first form a relation of linear dependence,
\[
\alpha_1\begin{bmatrix}2\\-1\\3\\1\\2\end{bmatrix}
+ \alpha_2\begin{bmatrix}1\\2\\-1\\5\\2\end{bmatrix}
+ \alpha_3\begin{bmatrix}2\\1\\-3\\6\\1\end{bmatrix}
+ \alpha_4\begin{bmatrix}-6\\7\\-1\\1\\1\end{bmatrix}
= \mathbf{0}
\]
We know that α1 = α2 = α3 = α4 = 0 is a solution to this equation, but that is of no interest whatsoever. That is always the case, no matter what four vectors we might have chosen. We are curious to know if there are other, nontrivial, solutions. Theorem SLSLC tells us that we can find such solutions as solutions to the homogeneous system LS(B, 0) where the coefficient matrix has these four vectors as columns. Row-reducing this coefficient matrix yields,
\[
B = \begin{bmatrix}
2 & 1 & 2 & -6\\
-1 & 2 & 1 & 7\\
3 & -1 & -3 & -1\\
1 & 5 & 6 & 1\\
2 & 2 & 1 & 1
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 0
\end{bmatrix}
\]
From the form of this matrix, we see that there are no free variables, so the solution is unique, and because the system is homogeneous, this unique solution is the trivial solution. So we now know that there is but one way to combine the four vectors of T into a relation of linear dependence, and that one way is the easy and obvious way. In this situation we say that the set, T, is linearly independent. △
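For contrast with Example LDS, the same check applied to T (our added sketch, assuming SymPy) finds no free variables and an empty null space basis, confirming linear independence.

```python
from sympy import Matrix

# The four vectors of T from Example LIS, as the columns of B.  T
# differs from the set S of Example LDS only in the fourth entry of
# the last vector (1 instead of 0).
B = Matrix([
    [ 2,  1,  2, -6],
    [-1,  2,  1,  7],
    [ 3, -1, -3, -1],
    [ 1,  5,  6,  1],
    [ 2,  2,  1,  1],
])

_, pivots = B.rref()
assert pivots == (0, 1, 2, 3)   # r = 4 = n: no free variables
assert B.nullspace() == []      # only the trivial relation exists
```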
Example LDS and Example LIS relied on solving a homogeneous system of equations to determine linear independence. We can codify this process in a time-saving theorem.
Theorem LIVHS Linearly Independent Vectors and Homogeneous Systems
Suppose that S = {v1, v2, v3, ..., vn} ⊆ Cᵐ is a set of vectors and A is the m × n matrix whose columns are the vectors in S. Then S is a linearly independent set if and only if the homogeneous system LS(A, 0) has a unique solution.
Proof. (⇐) Suppose that LS(A, 0) has a unique solution. Since it is a homogeneous system, this solution must be the trivial solution x = 0. By Theorem SLSLC, this means that the only relation of linear dependence on S is the trivial one. So S is linearly independent.
(⇒) We will prove the contrapositive. Suppose that LS(A, 0) does not have a unique solution. Since it is a homogeneous system, it is consistent (Theorem HSC), and so must have infinitely many solutions (Theorem PSSLS). One of these infinitely many solutions must be nontrivial (in fact, almost all of them are), so choose one. By Theorem SLSLC this nontrivial solution will give a nontrivial relation of linear dependence on S, so we can conclude that S is a linearly dependent set. ■
Since Theorem LIVHS is an equivalence, we can use it to determine the linear independence or dependence of any set of column vectors, just by creating a matrix and analyzing the row-reduced form. Let us illustrate this with two more examples.
Example LIHS Linearly independent, homogeneous system
Is the set of vectors
\[
S = \left\{
\begin{bmatrix}2\\-1\\3\\4\\2\end{bmatrix},\;
\begin{bmatrix}6\\2\\-1\\3\\4\end{bmatrix},\;
\begin{bmatrix}4\\3\\-4\\5\\1\end{bmatrix}
\right\}
\]
linearly independent or linearly dependent?
Theorem LIVHS suggests we study the matrix, A, whose columns are the vectors in S. Specifically, we are interested in the size of the solution set for the homogeneous system LS(A, 0), so we row-reduce A.
\[
A = \begin{bmatrix}
2 & 6 & 4\\
-1 & 2 & 3\\
3 & -1 & -4\\
4 & 3 & 5\\
2 & 4 & 1
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 0\\
0 & 0 & 0
\end{bmatrix}
\]
Now, r = 3, so there are n − r = 3 − 3 = 0 free variables and we see that LS(A, 0) has a unique solution (Theorem HSC, Theorem FVCS). By Theorem LIVHS, the set S is linearly independent. △
Example LDHS Linearly dependent, homogeneous system
Is the set of vectors
\[
S = \left\{
\begin{bmatrix}2\\-1\\3\\4\\2\end{bmatrix},\;
\begin{bmatrix}6\\2\\-1\\3\\4\end{bmatrix},\;
\begin{bmatrix}4\\3\\-4\\-1\\2\end{bmatrix}
\right\}
\]
linearly independent or linearly dependent?
Theorem LIVHS suggests we study the matrix, A, whose columns are the vectors in S. Specifically, we are interested in the size of the solution set for the homogeneous system LS(A, 0), so we row-reduce A.
\[
A = \begin{bmatrix}
2 & 6 & 4\\
-1 & 2 & 3\\
3 & -1 & -4\\
4 & 3 & -1\\
2 & 4 & 2
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & -1\\
0 & 1 & 1\\
0 & 0 & 0\\
0 & 0 & 0\\
0 & 0 & 0
\end{bmatrix}
\]
Now, r = 2, so there is n − r = 3 − 2 = 1 free variable and we see that LS(A, 0) has infinitely many solutions (Theorem HSC, Theorem FVCS). By Theorem LIVHS, the set S is linearly dependent. △
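Since the sets in the two examples above differ only in the third vector, a side-by-side check makes the contrast concrete. This sketch is our addition (assuming SymPy); the specific relation v1 − v2 + v3 = 0 exhibited at the end is read off from the null space and is not stated in the original text.

```python
from sympy import Matrix

# Columns from Example LIHS (top) and Example LDHS (bottom); the two
# matrices differ only in the last column.
A_lihs = Matrix([[2, 6, 4], [-1, 2, 3], [3, -1, -4], [4, 3,  5], [2, 4, 1]])
A_ldhs = Matrix([[2, 6, 4], [-1, 2, 3], [3, -1, -4], [4, 3, -1], [2, 4, 2]])

# r = 3 = n: unique (trivial) solution, so linearly independent.
assert A_lihs.rank() == 3

# r = 2 < 3 = n: one free variable, so linearly dependent.
assert A_ldhs.rank() == 2

# A nontrivial relation in the dependent case, read off from the
# null space vector (1, -1, 1): v1 - v2 + v3 = 0.
v1, v2, v3 = (A_ldhs[:, j] for j in range(3))
assert v1 - v2 + v3 == Matrix.zeros(5, 1)
```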
As an equivalence, Theorem LIVHS gives us a straightforward way to determine if a set of vectors is linearly independent or dependent.
Review Example LIHS and Example LDHS. They are very similar, differing only in the last two slots of the third vector. This resulted in slightly different matrices when row-reduced, and slightly different values of r, the number of nonzero rows. Notice, too, that we are less interested in the actual solution set, and more interested in its form or size. These observations allow us to make a slight improvement in Theorem LIVHS.
Theorem LIVRN Linearly Independent Vectors, r and n
Suppose that S = {v1, v2, v3, ..., vn} ⊆ Cᵐ is a set of vectors and A is the m × n matrix whose columns are the vectors in S. Let B be a matrix in reduced row-echelon form that is row-equivalent to A and let r denote the number of pivot columns in B. Then S is linearly independent if and only if n = r.
Proof. Theorem LIVHS says the linear independence of S is equivalent to the homogeneous linear system LS(A, 0) having a unique solution. Since LS(A, 0) is consistent (Theorem HSC) we can apply Theorem CSRN to see that the solution is unique exactly when n = r. ■
So now here is an example of the most straightforward way to determine if a set of column vectors is linearly independent or linearly dependent. While this method can be quick and easy, do not forget the logical progression from the definition of linear independence through homogeneous systems of equations which makes it possible.
Example LDRN Linearly dependent, r and n
Is the set of vectors
\[
S = \left\{
\begin{bmatrix}2\\-1\\3\\1\\0\\3\end{bmatrix},\;
\begin{bmatrix}9\\-6\\-2\\3\\2\\1\end{bmatrix},\;
\begin{bmatrix}1\\1\\1\\0\\0\\1\end{bmatrix},\;
\begin{bmatrix}-3\\1\\4\\2\\1\\2\end{bmatrix},\;
\begin{bmatrix}6\\-2\\1\\4\\3\\2\end{bmatrix}
\right\}
\]
linearly independent or linearly dependent?
Theorem LIVRN suggests we place these vectors into a matrix as columns and analyze the row-reduced version of the matrix,
\[
\begin{bmatrix}
2 & 9 & 1 & -3 & 6\\
-1 & -6 & 1 & 1 & -2\\
3 & -2 & 1 & 4 & 1\\
1 & 3 & 0 & 2 & 4\\
0 & 2 & 0 & 1 & 3\\
3 & 1 & 1 & 2 & 2
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 0 & -1\\
0 & 1 & 0 & 0 & 1\\
0 & 0 & 1 & 0 & 2\\
0 & 0 & 0 & 1 & 1\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Now we need only compute that r = 4 < 5 = n to recognize, via Theorem LIVRN, that S is a linearly dependent set. Boom! △
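Following Theorem LIVRN, the whole of Example LDRN reduces to one rank computation; the sketch below (our addition, assuming SymPy) performs it.

```python
from sympy import Matrix

# The five vectors of Example LDRN as the columns of a 6 x 5 matrix.
A = Matrix([
    [ 2,  9, 1, -3,  6],
    [-1, -6, 1,  1, -2],
    [ 3, -2, 1,  4,  1],
    [ 1,  3, 0,  2,  4],
    [ 0,  2, 0,  1,  3],
    [ 3,  1, 1,  2,  2],
])

r, n = A.rank(), A.cols
assert (r, n) == (4, 5)   # r = 4 < 5 = n

# By Theorem LIVRN, r < n means the columns are linearly dependent.
assert r < n
```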
Example LLDS Large linearly dependent set in C⁴
Consider the set of n = 9 vectors from C⁴,
\[
R = \left\{
\begin{bmatrix}-1\\3\\1\\2\end{bmatrix},\;
\begin{bmatrix}7\\1\\-3\\6\end{bmatrix},\;
\begin{bmatrix}1\\2\\-1\\-2\end{bmatrix},\;
\begin{bmatrix}0\\4\\2\\9\end{bmatrix},\;
\begin{bmatrix}5\\-2\\4\\3\end{bmatrix},\;
\begin{bmatrix}2\\1\\-6\\4\end{bmatrix},\;
\begin{bmatrix}3\\0\\-3\\1\end{bmatrix},\;
\begin{bmatrix}1\\1\\5\\3\end{bmatrix},\;
\begin{bmatrix}-6\\-1\\1\\1\end{bmatrix}
\right\}.
\]
To employ Theorem LIVHS, we form a 4 × 9 matrix, C, whose columns are the vectors in R,
\[
C = \begin{bmatrix}
-1 & 7 & 1 & 0 & 5 & 2 & 3 & 1 & -6\\
3 & 1 & 2 & 4 & -2 & 1 & 0 & 1 & -1\\
1 & -3 & -1 & 2 & 4 & -6 & -3 & 5 & 1\\
2 & 6 & -2 & 9 & 3 & 4 & 1 & 3 & 1
\end{bmatrix}.
\]
To determine if the homogeneous system LS(C, 0) has a unique solution or not, we would normally row-reduce this matrix. But in this particular example, we can do better. Theorem HMVEI tells us that since the system is homogeneous with n = 9 variables in m = 4 equations, and n > m, there must be infinitely many solutions. Since there is not a unique solution, Theorem LIVHS says the set is linearly dependent. △
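No row reduction is needed to reach the conclusion of Example LLDS, but it is still instructive to see the size of the null space. This sketch is our addition (assuming SymPy): with 9 columns and rank at most 4, there are at least 5 basis vectors of nontrivial solutions.

```python
from sympy import Matrix

# The 4 x 9 matrix C of Example LLDS: more columns than rows.
C = Matrix([
    [-1,  7,  1, 0,  5,  2,  3, 1, -6],
    [ 3,  1,  2, 4, -2,  1,  0, 1, -1],
    [ 1, -3, -1, 2,  4, -6, -3, 5,  1],
    [ 2,  6, -2, 9,  3,  4,  1, 3,  1],
])

m, n = C.shape
assert n > m                        # the hypothesis of Theorem MVSLD

# rank <= m, so the nullity n - rank is at least n - m = 5 > 0:
# LS(C, 0) has nontrivial solutions and R is linearly dependent.
assert len(C.nullspace()) >= n - m
```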
The situation in Example LLDS is slick enough to warrant formulating as a theorem.
Theorem MVSLD More Vectors than Size implies Linear Dependence
Suppose that S = {u1, u2, u3, ..., un} ⊆ Cᵐ and n > m. Then S is a linearly dependent set.
Proof. Form the m × n matrix A whose columns are uᵢ, 1 ≤ i ≤ n. Consider the homogeneous system LS(A, 0). By Theorem HMVEI this system has infinitely many solutions. Since the system does not have a unique solution, Theorem LIVHS says the columns of A form a linearly dependent set, as desired. ■
Subsection LINM
Linear Independence and Nonsingular Matrices
We will now specialize to sets of n vectors from Cⁿ. This will put Theorem MVSLD off-limits, while Theorem LIVHS will involve square matrices. Let us begin by contrasting Archetype A and Archetype B.
Example LDCAA Linearly dependent columns in Archetype A
Archetype A is a system of linear equations with coefficient matrix,
\[
A = \begin{bmatrix}
1 & -1 & 2\\
2 & 1 & 1\\
1 & 1 & 0
\end{bmatrix}
\]
Do the columns of this matrix form a linearly independent or dependent set? By Example S we know that A is singular. According to the definition of nonsingular matrices, Definition NM, the homogeneous system LS(A, 0) has infinitely many solutions. So by Theorem LIVHS, the columns of A form a linearly dependent set. △
Example LICAB Linearly independent columns in Archetype B
Archetype B is a system of linear equations with coefficient matrix,
\[
B = \begin{bmatrix}
-7 & -6 & -12\\
5 & 5 & 7\\
1 & 0 & 4
\end{bmatrix}
\]
Do the columns of this matrix form a linearly independent or dependent set? By Example NM we know that B is nonsingular. According to the definition of nonsingular matrices, Definition NM, the homogeneous system LS(B, 0) has a unique solution. So by Theorem LIVHS, the columns of B form a linearly independent set. △
That Archetype A and Archetype B have opposite properties for the columns of their coefficient matrices is no accident. Here is the theorem, and then we will update our equivalences for nonsingular matrices, Theorem NME1.
Theorem NMLIC Nonsingular Matrices have Linearly Independent Columns
Suppose that A is a square matrix. Then A is nonsingular if and only if the columns of A form a linearly independent set.
Proof. This is a proof where we can chain together equivalences, rather than proving the two halves separately.
A nonsingular ⇐⇒ LS(A, 0) has a unique solution          Definition NM
              ⇐⇒ columns of A are linearly independent   Theorem LIVHS
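Theorem NMLIC is easy to check computationally: a square matrix has linearly independent columns exactly when every column of its reduced row-echelon form is a pivot column. The sketch below (an illustration in Python, with an exact-arithmetic row-reducer written for this purpose, not taken from the text) tests the two archetypes above.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row-echelon form of a matrix given as a list of rows."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    r = 0
    for c in range(n):
        pivot = next((i for i in range(r, m) if M[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]   # swap the pivot row into place
        pv = M[r][c]
        M[r] = [x / pv for x in M[r]]     # scale the pivot to 1
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

def columns_independent(A):
    """Columns are independent iff the number of pivot rows equals the number of columns."""
    R = rref(A)
    rank = sum(1 for row in R if any(x != 0 for x in row))
    return rank == len(A[0])

# Archetype A (singular) and Archetype B (nonsingular), from the examples above
A = [[1, -1, 2], [2, 1, 1], [1, 1, 0]]
B = [[-7, -6, -12], [5, 5, 7], [1, 0, 4]]

print(columns_independent(A))  # False: columns of A are linearly dependent
print(columns_independent(B))  # True: columns of B are linearly independent
```

The exact `Fraction` arithmetic avoids the floating-point rank ambiguities a naive row-reducer would have.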
Here is the update to Theorem NME1.
Theorem NME2 Nonsingular Matrix Equivalences, Round 2
Suppose that A is a square matrix. The following are equivalent.
1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A form a linearly independent set.
Proof. Theorem NMLIC is yet another equivalence for a nonsingular matrix, so we can add it to the list in Theorem NME1. ■
Subsection NSSLI
Null Spaces, Spans, Linear Independence
In Subsection SS.SSNS we proved Theorem SSNS, which provided n − r vectors that could be used with the span construction to build the entire null space of a matrix. As we have hinted in Example SCAD, and as we will see again going forward, linearly dependent sets carry redundant vectors with them when used in building a set as a span. Our aim now is to show that the vectors provided by Theorem SSNS form a linearly independent set, so in one sense they are as efficient a way as possible to describe the null space. Notice that the vectors z_j, 1 ≤ j ≤ n − r, first appear in the vector form of solutions to arbitrary linear systems (Theorem VFSLS). The exact same vectors appear again in the span construction in the conclusion of Theorem SSNS. Since this second theorem specializes to homogeneous systems, the only real difference is that the vector c in Theorem VFSLS is the zero vector for a homogeneous system. Finally, Theorem BNS will now show that these same vectors are a linearly independent set. We will set the stage for the proof of this theorem with a moderately large example. Study the example carefully, as it will make it easier to understand the proof.
Example LINSB Linear independence of null space basis
Suppose that we are interested in the null space of a 3 × 7 matrix, A, which row-reduces to

B = [ 1 0 −2 4 0 3  9 ]
    [ 0 1  5 6 0 7  1 ]
    [ 0 0  0 0 1 8 −5 ]
The set F = {3, 4, 6, 7} is the set of indices for our four free variables that would be used in a description of the solution set for the homogeneous system LS(A, 0). Applying Theorem SSNS we can begin to construct a set of four vectors whose span is the null space of A, a set of vectors we will reference as T.

N(A) = 〈T〉 = 〈{z_1, z_2, z_3, z_4}〉

where, recording only the entries determined so far (∗ marks a slot not yet known),

z_1 = (∗, ∗, 1, 0, ∗, 0, 0)    z_2 = (∗, ∗, 0, 1, ∗, 0, 0)
z_3 = (∗, ∗, 0, 0, ∗, 1, 0)    z_4 = (∗, ∗, 0, 0, ∗, 0, 1)
So far, we have constructed as much of these individual vectors as we can, based just on the knowledge of the contents of the set F. This has allowed us to determine the entries in slots 3, 4, 6 and 7, while we have left slots 1, 2 and 5 blank. Without doing any more, let us ask whether T is linearly independent. Begin with a relation of linear dependence on T, and see what we can learn about the scalars,
0 = α_1 z_1 + α_2 z_2 + α_3 z_3 + α_4 z_4

Writing out the entries of this equation in the slots we know (3, 4, 6 and 7), the pattern of zeros and ones in the z_j means the right-hand side is a vector with α_1 in slot 3, α_2 in slot 4, α_3 in slot 6 and α_4 in slot 7, while the corresponding entries of the zero vector on the left are all zero.
Applying Definition CVE to compare these entries, we see that α_1 = α_2 = α_3 = α_4 = 0. So the only relation of linear dependence on the set T is a trivial one. By Definition LICV the set T is linearly independent. The important feature of this example is how the "pattern of zeros and ones" in the four vectors led to the conclusion of linear independence. △
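The mechanism of this example can be checked mechanically: whatever values eventually occupy the blank slots 1, 2 and 5, reading off a relation of linear dependence in slots 3, 4, 6 and 7 isolates one scalar at a time. A small Python sketch, where the blank entries are filled with arbitrary random values purely for illustration:

```python
import random

random.seed(2015)

free_slots = [2, 3, 5, 6]   # 0-based versions of slots 3, 4, 6 and 7
blank_slots = [0, 1, 4]     # slots 1, 2 and 5, not yet determined

# Build z_1, ..., z_4: the "pattern of zeros and ones" in the free slots,
# completely arbitrary values in the blank slots.
Z = []
for j in range(4):
    z = [0] * 7
    for s in blank_slots:
        z[s] = random.randint(-99, 99)   # anything at all
    for k, s in enumerate(free_slots):
        z[s] = 1 if k == j else 0
    Z.append(z)

# In a relation a_1 z_1 + a_2 z_2 + a_3 z_3 + a_4 z_4 = 0, the entry of the
# left-hand side in slot f_j is exactly a_j.  So the 4x4 matrix of entries
# of the z_j in the free slots is the identity, forcing every scalar to zero.
entry_pattern = [[Z[j][s] for j in range(4)] for s in free_slots]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
print(entry_pattern == identity)  # True, regardless of the blank entries
```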
The proof of Theorem BNS is really quite straightforward, and relies on the "pattern of zeros and ones" that arises in the entries of the vectors z_i, 1 ≤ i ≤ n − r, at the locations of the non-pivot columns. Play along with Example LINSB as you study the proof. Also, take a look at Example VFSAD, Example VFSAI and Example VFSAL, especially at the conclusion of Step 2 (temporarily ignore the construction of the constant vector, c). This proof is also a good first example of how to prove a conclusion that states a set is linearly independent.
Theorem BNS Basis for Null Spaces
Suppose that A is an m × n matrix, and B is a row-equivalent matrix in reduced row-echelon form with r pivot columns. Let D = {d_1, d_2, d_3, ..., d_r} and F = {f_1, f_2, f_3, ..., f_{n−r}} be the sets of column indices where B does and does not (respectively) have pivot columns. Construct the n − r vectors z_j, 1 ≤ j ≤ n − r, of size n as

[z_j]_i = 1              if i ∈ F, i = f_j
        = 0              if i ∈ F, i ≠ f_j
        = −[B]_{k,f_j}   if i ∈ D, i = d_k

Define the set S = {z_1, z_2, z_3, ..., z_{n−r}}. Then
1. N(A) = 〈S〉.
2. S is a linearly independent set.
Proof. Notice first that the vectors z_j, 1 ≤ j ≤ n − r, are exactly the same as the n − r vectors defined in Theorem SSNS. Also, the hypotheses of Theorem SSNS are the same as the hypotheses of the theorem we are currently proving. So it is then simply the conclusion of Theorem SSNS that tells us that N(A) = 〈S〉. That was the easy half, but the second part is not much harder. What is new here is the claim that S is a linearly independent set.
To prove the linear independence of a set, we need to start with a relation of linear dependence and somehow conclude that the scalars involved must all be zero, i.e. that the relation of linear dependence only happens in the trivial fashion. So to establish the linear independence of S, we start with

α_1 z_1 + α_2 z_2 + α_3 z_3 + ··· + α_{n−r} z_{n−r} = 0.

For each j, 1 ≤ j ≤ n − r, consider the equality of the individual entries of the vectors on both sides of this equality in position f_j,
0 = [0]_{f_j}
  = [α_1 z_1 + α_2 z_2 + α_3 z_3 + ··· + α_{n−r} z_{n−r}]_{f_j}                    Definition CVE
  = [α_1 z_1]_{f_j} + [α_2 z_2]_{f_j} + [α_3 z_3]_{f_j} + ··· + [α_{n−r} z_{n−r}]_{f_j}   Definition CVA
  = α_1 [z_1]_{f_j} + α_2 [z_2]_{f_j} + α_3 [z_3]_{f_j} + ··· +
    α_{j−1} [z_{j−1}]_{f_j} + α_j [z_j]_{f_j} + α_{j+1} [z_{j+1}]_{f_j} + ··· +
    α_{n−r} [z_{n−r}]_{f_j}                                                        Definition CVSM
  = α_1(0) + α_2(0) + α_3(0) + ··· +
    α_{j−1}(0) + α_j(1) + α_{j+1}(0) + ··· + α_{n−r}(0)                            Definition of z_j
  = α_j

So for all j, 1 ≤ j ≤ n − r, we have α_j = 0, which is the conclusion that tells us that the only relation of linear dependence on S = {z_1, z_2, z_3, ..., z_{n−r}} is the trivial one. Hence, by Definition LICV the set is linearly independent, as desired.
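The construction in Theorem BNS translates directly into a short routine. The Python sketch below (an illustration, assuming its input is already in reduced row-echelon form) builds the vectors z_j for the matrix B of Example LINSB and checks that each lies in the null space:

```python
from fractions import Fraction

def null_space_basis(B):
    """Given B in reduced row-echelon form, build the vectors z_j of Theorem BNS."""
    m, n = len(B), len(B[0])
    # D: pivot column indices; F: non-pivot (free) column indices (0-based)
    D = []
    for row in B:
        piv = next((c for c, x in enumerate(row) if x != 0), None)
        if piv is not None:
            D.append(piv)
    F = [c for c in range(n) if c not in D]
    basis = []
    for fj in F:
        z = [Fraction(0)] * n
        z[fj] = Fraction(1)                 # 1 in slot i = f_j, 0 in the other free slots
        for k, dk in enumerate(D):
            z[dk] = -Fraction(B[k][fj])     # -[B]_{k, f_j} in slot i = d_k
        basis.append(z)
    return basis

# The 3x7 matrix of Example LINSB (already in reduced row-echelon form)
B = [[1, 0, -2, 4, 0, 3, 9],
     [0, 1, 5, 6, 0, 7, 1],
     [0, 0, 0, 0, 1, 8, -5]]

Z = null_space_basis(B)
# Every z_j satisfies B z_j = 0, so each really is in the null space
for z in Z:
    assert all(sum(row[i] * z[i] for i in range(7)) == 0 for row in B)
print(len(Z))  # prints 4, which is n - r = 7 - 3
```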
Example NSLIL Null space spanned by linearly independent set, Archetype L
In Example VFSAL we previewed Theorem SSNS by finding a set of two vectors such that their span was the null space for the matrix in Archetype L. Writing the matrix as L, we have

N(L) = 〈{(−1, 2, −2, 1, 0), (2, −2, 1, 0, 1)}〉

Solving the homogeneous system LS(L, 0) resulted in recognizing x_4 and x_5 as the free variables. So look in entries 4 and 5 of the two vectors above and notice the pattern of zeros and ones that provides the linear independence of the set. △
Reading Questions
1. Let S be the set of three vectors below.
S = {(1, 2, −1), (3, −4, 2), (4, −2, 1)}
Is S linearly independent or linearly dependent? Explain why.
2. Let S be the set of three vectors below.
S = {(1, −1, 0), (3, 2, 2), (4, 3, −4)}
Is S linearly independent or linearly dependent? Explain why.
3. Is the matrix below singular or nonsingular? Explain your answer using only the final conclusion you reached in the previous question, along with one new theorem.

[  1 3  4 ]
[ −1 2  3 ]
[  0 2 −4 ]
Exercises
Determine if the sets of vectors in Exercises C20–C25 are linearly independent or linearly dependent. When the set is linearly dependent, exhibit a nontrivial relation of linear dependence.
C20† {(1, −2, 1), (2, −1, 3), (1, 5, 0)}

C21† {(−1, 2, 4, 2), (3, 3, −1, 3), (7, 3, −6, 4)}

C22† {(−2, −1, −1), (1, 0, −1), (3, 3, 6), (−5, −4, −6), (4, 4, 7)}

C23† {(1, −2, 2, 5, 3), (3, 3, 1, 2, −4), (2, 1, 2, −1, 1), (1, 0, 1, 2, 2)}

C24† {(1, 2, −1, 0, 1), (3, 2, −1, 2, 2), (4, 4, −2, 2, 3), (−1, 2, −1, −2, 0)}

C25† {(2, 1, 3, −1, 2), (4, −2, 1, 3, 2), (10, −7, 0, 10, 4)}
C30† For the matrix B below, find a set S that is linearly independent and spans the null space of B, that is, N(B) = 〈S〉.

B = [ −3 1 −2  7 ]
    [ −1 2  1  4 ]
    [  1 1  2 −1 ]
C31† For the matrix A below, find a linearly independent set S so that the null space of A is spanned by S, that is, N(A) = 〈S〉.

A = [ −1 −2 2 1 5 ]
    [  1  2 1 1 5 ]
    [  3  6 1 2 7 ]
    [  2  4 0 1 2 ]
C32† Find a set of column vectors, T, such that (1) the span of T is the null space of B, 〈T〉 = N(B), and (2) T is a linearly independent set.

B = [  2  1  1  1 ]
    [ −4 −3  1 −7 ]
    [  1  1 −1  3 ]
C33† Find a set S so that S is linearly independent and N(A) = 〈S〉, where N(A) is the null space of the matrix A below.

A = [ 2 3  3  1  4 ]
    [ 1 1 −1 −1 −3 ]
    [ 3 2 −8 −1  1 ]
C50 Consider each archetype that is a system of equations and consider the solutions listed for the homogeneous version of the archetype. (If only the trivial solution is listed, then assume this is the only solution to the system.) From the solution set, determine if the columns of the coefficient matrix form a linearly independent or linearly dependent set. In the case of a linearly dependent set, use one of the sample solutions to provide a nontrivial relation of linear dependence on the set of columns of the coefficient matrix (Definition RLD). Indicate when Theorem MVSLD applies and connect this with the number of variables and equations in the system of equations.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J
C51 For each archetype that is a system of equations consider the homogeneous version. Write elements of the solution set in vector form (Theorem VFSLS) and from this extract the vectors z_j described in Theorem BNS. These vectors are used in a span construction to describe the null space of the coefficient matrix for each archetype. What does it mean when we write a null space as 〈{ }〉?
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J
C52 For each archetype that is a system of equations consider the homogeneous version. Sample solutions are given and a linearly independent spanning set is given for the null space of the coefficient matrix. Write each of the sample solutions individually as a linear combination of the vectors in the spanning set for the null space of the coefficient matrix.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J
C60† For the matrix A below, find a set of vectors S so that (1) S is linearly independent, and (2) the span of S equals the null space of A, 〈S〉 = N(A). (See Exercise SS.C60.)

A = [  1  1  6 −8 ]
    [  1 −2  0  1 ]
    [ −2  1 −6  7 ]
M20† Suppose that S = {v_1, v_2, v_3} is a set of three vectors from C^873. Prove that the set
T = {2v_1 + 3v_2 + v_3, v_1 − v_2 − 2v_3, 2v_1 + v_2 − v_3}
is linearly dependent.
M21† Suppose that S = {v_1, v_2, v_3} is a linearly independent set of three vectors from C^873. Prove that the set
T = {2v_1 + 3v_2 + v_3, v_1 − v_2 + 2v_3, 2v_1 + v_2 − v_3}
is linearly independent.
M50† Consider the set of vectors from C^3, W, given below. Find a set T that contains three vectors from W and such that W = 〈T〉.

W = 〈{v_1, v_2, v_3, v_4, v_5}〉 = 〈{(2, 1, 1), (−1, −1, 1), (1, 2, 3), (3, 1, 3), (0, 1, −3)}〉
M51† Consider the subspace W = 〈{v_1, v_2, v_3, v_4}〉. Find a set S so that (1) S is a subset of W, (2) S is linearly independent, and (3) W = 〈S〉. Write each vector not included in S as a linear combination of the vectors that are in S.

v_1 = (1, −1, 2)   v_2 = (4, −4, 8)   v_3 = (−3, 2, −7)   v_4 = (2, 1, 7)
T10 Prove that if a set of vectors contains the zero vector, then the set is linearly dependent. (Ed. "The zero vector is death to linearly independent sets.")
T12 Suppose that S is a linearly independent set of vectors, and T is a subset of S, T ⊆ S (Definition SSET). Prove that T is linearly independent.
T13 Suppose that T is a linearly dependent set of vectors, and T is a subset of S, T ⊆ S (Definition SSET). Prove that S is linearly dependent.
T15† Suppose that {v_1, v_2, v_3, ..., v_n} is a set of vectors. Prove that
{v_1 − v_2, v_2 − v_3, v_3 − v_4, ..., v_n − v_1}
is a linearly dependent set.
T20† Suppose that {v_1, v_2, v_3, v_4} is a linearly independent set in C^35. Prove that
{v_1, v_1 + v_2, v_1 + v_2 + v_3, v_1 + v_2 + v_3 + v_4}
is a linearly independent set.
T50† Suppose that A is an m × n matrix with linearly independent columns and the linear system LS(A, b) is consistent. Show that this system has a unique solution. (Notice that we are not requiring A to be square.)
Section LDS
Linear Dependence and Spans
In any linearly dependent set there is always one vector that can be written as a linear combination of the others. This is the substance of the upcoming Theorem DLDS. Perhaps this will explain the use of the word "dependent." In a linearly dependent set, at least one vector "depends" on the others (via a linear combination).
Indeed, because Theorem DLDS is an equivalence (Proof Technique E), some authors use this condition as a definition (Proof Technique D) of linear dependence. Then linear independence is defined as the logical opposite of linear dependence. Of course, we have chosen to take Definition LICV as our definition, and then follow with Theorem DLDS as a theorem.
Subsection LDSS
Linearly Dependent Sets and Spans
If we use a linearly dependent set to construct a span, then we can always create the same infinite set with a starting set that is one vector smaller in size. We will illustrate this behavior in Example RSC5. However, this will not be possible if we build a span from a linearly independent set. So in a certain sense, using a linearly independent set to formulate a span is the best possible way, since there are not any extra vectors being used to build up all the necessary linear combinations. OK, here is the theorem, and then the example.
Theorem DLDS Dependency in Linearly Dependent Sets
Suppose that S = {u_1, u_2, u_3, ..., u_n} is a set of vectors. Then S is a linearly dependent set if and only if there is an index t, 1 ≤ t ≤ n, such that u_t is a linear combination of the vectors u_1, u_2, u_3, ..., u_{t−1}, u_{t+1}, ..., u_n.
Proof. (⇒) Suppose that S is linearly dependent, so there exists a nontrivial relation of linear dependence by Definition LICV. That is, there are scalars, α_i, 1 ≤ i ≤ n, which are not all zero, such that

α_1 u_1 + α_2 u_2 + α_3 u_3 + ··· + α_n u_n = 0.

Since the α_i cannot all be zero, choose one, say α_t, that is nonzero. Then,

u_t = (−1/α_t)(−α_t u_t)                                                     Property MICN
    = (−1/α_t)(α_1 u_1 + ··· + α_{t−1} u_{t−1} + α_{t+1} u_{t+1} + ··· + α_n u_n)   Theorem VSPCV
    = (−α_1/α_t) u_1 + ··· + (−α_{t−1}/α_t) u_{t−1} + (−α_{t+1}/α_t) u_{t+1} + ··· + (−α_n/α_t) u_n   Theorem VSPCV
Since the values of α_i/α_t are again scalars, we have expressed u_t as a linear combination of the other elements of S.
(⇐) Assume that the vector u_t is a linear combination of the other vectors in S. Write this linear combination, denoting the relevant scalars as β_1, β_2, ..., β_{t−1}, β_{t+1}, ..., β_n, as

u_t = β_1 u_1 + β_2 u_2 + ··· + β_{t−1} u_{t−1} + β_{t+1} u_{t+1} + ··· + β_n u_n
Then we have

β_1 u_1 + ··· + β_{t−1} u_{t−1} + (−1)u_t + β_{t+1} u_{t+1} + ··· + β_n u_n
    = u_t + (−1)u_t       Theorem VSPCV
    = (1 + (−1)) u_t      Property DSAC
    = 0 u_t               Property AICN
    = 0                   Definition CVSM

So the scalars β_1, β_2, β_3, ..., β_{t−1}, β_t = −1, β_{t+1}, ..., β_n provide a nontrivial linear combination of the vectors in S, thus establishing that S is a linearly dependent set (Definition LICV). ■
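The forward half of this proof is constructive, and can be sketched in Python. The set and the function name below are hypothetical, chosen only so that a nontrivial relation is known in advance; the routine solves the relation for a vector with a nonzero scalar, exactly as in the proof:

```python
from fractions import Fraction

def solve_for_dependent(vectors, alphas):
    """Given a nontrivial relation sum(alphas[i] * vectors[i]) = 0, solve for one
    vector whose coefficient is nonzero, as in the proof of Theorem DLDS."""
    t = next(i for i, a in enumerate(alphas) if a != 0)   # choose a nonzero alpha_t
    n = len(vectors[0])
    combo = [Fraction(0)] * n
    for i, (a, v) in enumerate(zip(alphas, vectors)):
        if i == t:
            continue
        for k in range(n):
            combo[k] += Fraction(-a, alphas[t]) * v[k]    # coefficient -alpha_i / alpha_t
    return t, combo   # u_t equals this combination of the other vectors

# A hypothetical dependent set: u3 = u1 + 2 u2, i.e. 1*u1 + 2*u2 + (-1)*u3 = 0
u1, u2, u3 = [1, 0, 2], [0, 1, 1], [1, 2, 4]
t, combo = solve_for_dependent([u1, u2, u3], [1, 2, -1])
print(t)                  # 0: the routine chose u1
print(combo == [1, 0, 2]) # True: (-2)u2 + (1)u3 reproduces u1
```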
This theorem can be used, sometimes repeatedly, to whittle down the size of a set of vectors used in a span construction. We have seen some of this already in Example SCAD, but in the next example we will detail some of the subtleties.
Example RSC5 Reducing a span in C^5
Consider the set of n = 4 vectors from C^5,
R = {v_1, v_2, v_3, v_4} = {(1, 2, −1, 3, 2), (2, 1, 3, 1, 2), (0, −7, 6, −11, −2), (4, 1, 2, 1, 6)}

and define V = 〈R〉.
To employ Theorem LIVHS, we form a 5 × 4 matrix, D, and row-reduce to understand solutions to the homogeneous system LS(D, 0),

D = [  1 2   0 4 ]             [ 1 0 0 4 ]
    [  2 1  −7 1 ]    RREF     [ 0 1 0 0 ]
    [ −1 3   6 2 ]   −−−−→     [ 0 0 1 1 ]
    [  3 1 −11 1 ]             [ 0 0 0 0 ]
    [  2 2  −2 6 ]             [ 0 0 0 0 ]
We can find infinitely many solutions to this system, most of them nontrivial, and we choose any one we like to build a relation of linear dependence on R. Let us begin with x_4 = 1, to find the solution

(−4, 0, −1, 1)
So we can write the relation of linear dependence,

(−4)v_1 + 0v_2 + (−1)v_3 + 1v_4 = 0

Theorem DLDS guarantees that we can solve this relation of linear dependence for some vector in R, but the choice of which one is up to us. Notice however that v_2 has a zero coefficient. In this case, we cannot choose to solve for v_2. Maybe some other relation of linear dependence would produce a nonzero coefficient for v_2 if we just had to solve for this vector. Unfortunately, this example has been engineered to always produce a zero coefficient here, as you can see from solving the homogeneous system. Every solution has x_2 = 0!
OK, if we are convinced that we cannot solve for v_2, let us instead solve for v_3,

v_3 = (−4)v_1 + 0v_2 + 1v_4 = (−4)v_1 + 1v_4

We now claim that this particular equation will allow us to write

V = 〈R〉 = 〈{v_1, v_2, v_3, v_4}〉 = 〈{v_1, v_2, v_4}〉

in essence declaring v_3 as surplus for the task of building V as a span. This claim is an equality of two sets, so we will use Definition SE to establish it carefully. Let R′ = {v_1, v_2, v_4} and V′ = 〈R′〉. We want to show that V = V′.
First show that V′ ⊆ V. Since every vector of R′ is in R, any vector we can construct in V′ as a linear combination of vectors from R′ can also be constructed as a vector in V by the same linear combination of the same vectors in R. That was easy, now turn it around.
Next show that V ⊆ V′. Choose any v from V. So there are scalars α_1, α_2, α_3, α_4 such that

v = α_1 v_1 + α_2 v_2 + α_3 v_3 + α_4 v_4
  = α_1 v_1 + α_2 v_2 + α_3 ((−4)v_1 + 1v_4) + α_4 v_4
  = α_1 v_1 + α_2 v_2 + ((−4α_3)v_1 + α_3 v_4) + α_4 v_4
  = (α_1 − 4α_3) v_1 + α_2 v_2 + (α_3 + α_4) v_4.

This equation says that v can then be written as a linear combination of the vectors in R′ and hence qualifies for membership in V′. So V ⊆ V′ and we have established that V = V′.
If R′ were also linearly dependent (it is not), we could reduce the set even further. Notice that we could have chosen to eliminate any one of v_1, v_3 or v_4, but somehow v_2 is essential to the creation of V since it cannot be replaced by any linear combination of v_1, v_3 or v_4. △
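The arithmetic claims of Example RSC5 are easy to confirm numerically (a check for the reader, not part of the development):

```python
# The four vectors of Example RSC5
v1 = [1, 2, -1, 3, 2]
v2 = [2, 1, 3, 1, 2]
v3 = [0, -7, 6, -11, -2]
v4 = [4, 1, 2, 1, 6]

# The relation (-4) v1 + 0 v2 + (-1) v3 + 1 v4 = 0 ...
relation = [(-4) * a + 0 * b + (-1) * c + 1 * d
            for a, b, c, d in zip(v1, v2, v3, v4)]
print(relation)  # [0, 0, 0, 0, 0]

# ... solved for v3 gives v3 = (-4) v1 + 1 v4
rebuilt = [(-4) * a + d for a, d in zip(v1, v4)]
print(rebuilt == v3)  # True
```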
Subsection COV
Casting Out Vectors
In Example RSC5 we used four vectors to create a span. With a relation of linear dependence in hand, we were able to "toss out" one of these four vectors and create the same span from a subset of just three vectors from the original set of four. We did have to take some care as to just which vector we tossed out. In the next example, we will be more methodical about just how we choose to eliminate vectors from a linearly dependent set while preserving a span.
Example COV Casting out vectors
We begin with a set S containing seven vectors from C^4,

S = {(1, 2, 0, −1), (4, 8, 0, −4), (0, −1, 2, 2), (−1, 3, −3, 4), (0, 9, −4, 8), (7, −13, 12, −31), (−9, 7, −8, 37)}

and define W = 〈S〉.
The set S is obviously linearly dependent by Theorem MVSLD, since we have n = 7 vectors from C^4. So we can slim down S some, and still create W as the span of a smaller set of vectors.
As a device for identifying relations of linear dependence among the vectors of S, we place the seven column vectors of S into a matrix as columns,

A = [A_1|A_2|A_3|...|A_7] = [  1  4  0 −1  0   7  −9 ]
                            [  2  8 −1  3  9 −13   7 ]
                            [  0  0  2 −3 −4  12  −8 ]
                            [ −1 −4  2  4  8 −31  37 ]
By Theorem SLSLC a nontrivial solution to LS(A, 0) will give us a nontrivial<br />
relation of l<strong>in</strong>ear dependence (Def<strong>in</strong>ition RLDCV) onthecolumnsofA (which are<br />
theelementsofthesetS). The row-reduced form for A is the matrix<br />
⎡<br />
⎤<br />
B =<br />
⎢<br />
⎣<br />
1 4 0 0 2 1 −3<br />
0 0 1 0 1 −3 5<br />
0 0 0 1 2 −6 6<br />
0 0 0 0 0 0 0<br />
so we can easily create solutions to the homogeneous system LS(A, 0) us<strong>in</strong>g the<br />
free variables x 2 ,x 5 ,x 6 ,x 7 . Any such solution will provide a relation of l<strong>in</strong>ear<br />
dependence on the columns of B. These solutions will allow us to solve for one<br />
column vector as a l<strong>in</strong>ear comb<strong>in</strong>ation of some others, <strong>in</strong> the spirit of Theorem<br />
DLDS, and remove that vector from the set. We will set about form<strong>in</strong>g these l<strong>in</strong>ear<br />
comb<strong>in</strong>ations methodically.<br />
⎥<br />
⎦
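The row reduction above can be checked mechanically. Here is a sketch (ours, not from the text), assuming SymPy is available: row-reduce the matrix A of Example COV and read off the pivot columns.

```python
# A sketch (ours, not from the text), assuming SymPy is available: row-reduce
# the matrix A of Example COV and read off the pivot columns.
from sympy import Matrix

A = Matrix([
    [ 1,  4,  0, -1,  0,   7,  -9],
    [ 2,  8, -1,  3,  9, -13,   7],
    [ 0,  0,  2, -3, -4,  12,  -8],
    [-1, -4,  2,  4,  8, -31,  37],
])

B, pivots = A.rref()  # B is the reduced row-echelon form of A
print(pivots)         # (0, 2, 3): columns 1, 3, 4 in the text's 1-based numbering
```

The non-pivot columns (2, 5, 6, 7 in the text's numbering) correspond exactly to the free variables used below.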
§LDS Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 140<br />
Set the free variable x_2 = 1, and set the other free variables to zero. Then a solution to LS(A, 0) is
\[
x = \begin{bmatrix} -4\\1\\0\\0\\0\\0\\0 \end{bmatrix}
\]
which can be used to create the linear combination
\[
(-4)A_1 + 1A_2 + 0A_3 + 0A_4 + 0A_5 + 0A_6 + 0A_7 = \mathbf{0}
\]
This can then be arranged and solved for A_2, resulting in A_2 expressed as a linear combination of {A_1, A_3, A_4},
\[
A_2 = 4A_1 + 0A_3 + 0A_4
\]
This means that A_2 is surplus, and we can create W just as well with a smaller set with this vector removed,
\[
W = \langle\{A_1, A_3, A_4, A_5, A_6, A_7\}\rangle
\]
Technically, this set equality for W requires a proof, in the spirit of Example RSC5, but we will bypass this requirement here, and in the next few paragraphs.
Now, set the free variable x_5 = 1, and set the other free variables to zero. Then a solution to LS(B, 0) is
\[
x = \begin{bmatrix} -2\\0\\-1\\-2\\1\\0\\0 \end{bmatrix}
\]
which can be used to create the linear combination
\[
(-2)A_1 + 0A_2 + (-1)A_3 + (-2)A_4 + 1A_5 + 0A_6 + 0A_7 = \mathbf{0}
\]
This can then be arranged and solved for A_5, resulting in A_5 expressed as a linear combination of {A_1, A_3, A_4},
\[
A_5 = 2A_1 + 1A_3 + 2A_4
\]
This means that A_5 is surplus, and we can create W just as well with a smaller set with this vector removed,
\[
W = \langle\{A_1, A_3, A_4, A_6, A_7\}\rangle
\]
Do it again: set the free variable x_6 = 1, and set the other free variables to zero. Then a solution to LS(B, 0) is
\[
x = \begin{bmatrix} -1\\0\\3\\6\\0\\1\\0 \end{bmatrix}
\]
which can be used to create the linear combination
\[
(-1)A_1 + 0A_2 + 3A_3 + 6A_4 + 0A_5 + 1A_6 + 0A_7 = \mathbf{0}
\]
This can then be arranged and solved for A_6, resulting in A_6 expressed as a linear combination of {A_1, A_3, A_4},
\[
A_6 = 1A_1 + (-3)A_3 + (-6)A_4
\]
This means that A_6 is surplus, and we can create W just as well with a smaller set with this vector removed,
\[
W = \langle\{A_1, A_3, A_4, A_7\}\rangle
\]
Set the free variable x_7 = 1, and set the other free variables to zero. Then a solution to LS(B, 0) is
\[
x = \begin{bmatrix} 3\\0\\-5\\-6\\0\\0\\1 \end{bmatrix}
\]
which can be used to create the linear combination
\[
3A_1 + 0A_2 + (-5)A_3 + (-6)A_4 + 0A_5 + 0A_6 + 1A_7 = \mathbf{0}
\]
This can then be arranged and solved for A_7, resulting in A_7 expressed as a linear combination of {A_1, A_3, A_4},
\[
A_7 = (-3)A_1 + 5A_3 + 6A_4
\]
This means that A_7 is surplus, and we can create W just as well with a smaller set with this vector removed,
\[
W = \langle\{A_1, A_3, A_4\}\rangle
\]
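The claim that the span survives each removal can be verified numerically. Here is a sketch (the helper name is ours), assuming SymPy: a vector b lies in the span of a set of columns exactly when appending b does not increase the rank.

```python
# A sketch (the helper name is ours), assuming SymPy: a vector b lies in the
# span of a set of columns exactly when appending b does not increase the rank.
from sympy import Matrix

cols = {
    1: [1, 2, 0, -1],      2: [4, 8, 0, -4],
    3: [0, -1, 2, 2],      4: [-1, 3, -3, 4],
    5: [0, 9, -4, 8],      6: [7, -13, 12, -31],
    7: [-9, 7, -8, 37],
}
keep = Matrix.hstack(*[Matrix(cols[j]) for j in (1, 3, 4)])

def in_span(basis, b):
    return basis.rank() == Matrix.hstack(basis, Matrix(b)).rank()

# Every cast-out vector still lies in <{A1, A3, A4}>, so the span is unchanged.
print(all(in_span(keep, cols[j]) for j in (2, 5, 6, 7)))  # True
```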
You might think we could keep this up, but we have run out of free variables. And not coincidentally, the set {A_1, A_3, A_4} is linearly independent (check this!). It should be clear how each free variable was used to eliminate one column from the set used to span the column space, as this will be the essence of the proof of the next theorem. The column vectors in S were not chosen entirely at random; they are the columns of Archetype I. See if you can mimic this example using the columns of Archetype J. Go ahead, we'll go grab a cup of coffee and be back before you finish up.
For extra credit, notice that the vector
\[
b = \begin{bmatrix} 3\\9\\1\\4 \end{bmatrix}
\]
is the vector of constants in the definition of Archetype I. Since the system LS(A, b) is consistent, we know by Theorem SLSLC that b is a linear combination of the columns of A, or stated equivalently, b ∈ W. This means that b must also be a linear combination of just the three columns A_1, A_3, A_4. Can you find such a linear combination? Did you notice that there is just a single (unique) answer? Hmmmm. △
Example COV deserves your careful attention, since this important example motivates the following very fundamental theorem.
Theorem BS Basis of a Span
Suppose that S = {v_1, v_2, v_3, ..., v_n} is a set of column vectors. Define W = ⟨S⟩ and let A be the matrix whose columns are the vectors from S. Let B be the reduced row-echelon form of A, with D = {d_1, d_2, d_3, ..., d_r} the set of indices for the pivot columns of B. Then
1. T = {v_{d_1}, v_{d_2}, v_{d_3}, ..., v_{d_r}} is a linearly independent set.
2. W = ⟨T⟩.
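The procedure in the theorem is short enough to state as code. Here is a minimal sketch (the function name is ours), assuming SymPy: place the vectors as columns, row-reduce, and keep the vectors whose indices are pivot columns.

```python
# A minimal sketch of the procedure in Theorem BS (the function name is ours):
# place the vectors as columns, row-reduce, and keep the pivot columns.
from sympy import Matrix

def basis_of_span(vectors):
    A = Matrix.hstack(*[Matrix(v) for v in vectors])
    _, pivots = A.rref()          # pivots are the 0-based indices of set D
    return [vectors[j] for j in pivots]

# The set S of Example COV recovers T = {A1, A3, A4}.
S = [[1, 2, 0, -1], [4, 8, 0, -4], [0, -1, 2, 2], [-1, 3, -3, 4],
     [0, 9, -4, 8], [7, -13, 12, -31], [-9, 7, -8, 37]]
T = basis_of_span(S)
print(T)  # [[1, 2, 0, -1], [0, -1, 2, 2], [-1, 3, -3, 4]]
```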
Proof. To prove that T is linearly independent, begin with a relation of linear dependence on T,
\[
\mathbf{0} = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \alpha_3 v_{d_3} + \cdots + \alpha_r v_{d_r}
\]
and we will try to conclude that the only possibility for the scalars α_i is that they are all zero. Denote the non-pivot columns of B by F = {f_1, f_2, f_3, ..., f_{n-r}}. Then we can preserve the equality by adding a big fat zero to the linear combination,
\[
\mathbf{0} = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \alpha_3 v_{d_3} + \cdots + \alpha_r v_{d_r} + 0v_{f_1} + 0v_{f_2} + 0v_{f_3} + \cdots + 0v_{f_{n-r}}
\]
By Theorem SLSLC, the scalars in this linear combination (suitably reordered) are a solution to the homogeneous system LS(A, 0). But notice that this is the solution obtained by setting each free variable to zero. If we consider the description of a solution vector in the conclusion of Theorem VFSLS, in the case of a homogeneous system, then we see that if all the free variables are set to zero the resulting solution vector is trivial (all zeros). So it must be that α_i = 0, 1 ≤ i ≤ r. This implies by Definition LICV that T is a linearly independent set.
The second conclusion of this theorem is an equality of sets (Definition SE). Since T is a subset of S, any linear combination of elements of the set T can also be viewed as a linear combination of elements of the set S. So ⟨T⟩ ⊆ ⟨S⟩ = W. It remains to prove that W = ⟨S⟩ ⊆ ⟨T⟩.
For each k, 1 ≤ k ≤ n − r, form a solution x to LS(A, 0) by setting the free variables as follows:
\[
x_{f_1} = 0 \quad x_{f_2} = 0 \quad x_{f_3} = 0 \quad \ldots \quad x_{f_k} = 1 \quad \ldots \quad x_{f_{n-r}} = 0
\]
By Theorem VFSLS, the remainder of this solution vector is given by,
\[
x_{d_1} = -[B]_{1,f_k} \quad x_{d_2} = -[B]_{2,f_k} \quad x_{d_3} = -[B]_{3,f_k} \quad \ldots \quad x_{d_r} = -[B]_{r,f_k}
\]
From this solution, we obtain a relation of linear dependence on the columns of A,
\[
-[B]_{1,f_k} v_{d_1} - [B]_{2,f_k} v_{d_2} - [B]_{3,f_k} v_{d_3} - \cdots - [B]_{r,f_k} v_{d_r} + 1v_{f_k} = \mathbf{0}
\]
which can be arranged as the equality
\[
v_{f_k} = [B]_{1,f_k} v_{d_1} + [B]_{2,f_k} v_{d_2} + [B]_{3,f_k} v_{d_3} + \cdots + [B]_{r,f_k} v_{d_r}
\]
Now, suppose we take an arbitrary element, w, of W = ⟨S⟩ and write it as a linear combination of the elements of S, but with the terms organized according to the indices in D and F,
\[
w = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \cdots + \alpha_r v_{d_r} + \beta_1 v_{f_1} + \beta_2 v_{f_2} + \cdots + \beta_{n-r} v_{f_{n-r}}
\]
From the above, we can replace each v_{f_j} by a linear combination of the v_{d_i},
\begin{align*}
w = \alpha_1 v_{d_1} + \alpha_2 v_{d_2} + \cdots + \alpha_r v_{d_r}
&+ \beta_1\left([B]_{1,f_1} v_{d_1} + [B]_{2,f_1} v_{d_2} + [B]_{3,f_1} v_{d_3} + \cdots + [B]_{r,f_1} v_{d_r}\right)\\
&+ \beta_2\left([B]_{1,f_2} v_{d_1} + [B]_{2,f_2} v_{d_2} + [B]_{3,f_2} v_{d_3} + \cdots + [B]_{r,f_2} v_{d_r}\right)\\
&\;\;\vdots\\
&+ \beta_{n-r}\left([B]_{1,f_{n-r}} v_{d_1} + [B]_{2,f_{n-r}} v_{d_2} + [B]_{3,f_{n-r}} v_{d_3} + \cdots + [B]_{r,f_{n-r}} v_{d_r}\right)
\end{align*}
With repeated applications of several of the properties of Theorem VSPCV we can rearrange this expression as,
\begin{align*}
= \left(\alpha_1 + \beta_1 [B]_{1,f_1} + \beta_2 [B]_{1,f_2} + \beta_3 [B]_{1,f_3} + \cdots + \beta_{n-r} [B]_{1,f_{n-r}}\right) v_{d_1}&\\
+ \left(\alpha_2 + \beta_1 [B]_{2,f_1} + \beta_2 [B]_{2,f_2} + \beta_3 [B]_{2,f_3} + \cdots + \beta_{n-r} [B]_{2,f_{n-r}}\right) v_{d_2}&\\
\vdots\quad&\\
+ \left(\alpha_r + \beta_1 [B]_{r,f_1} + \beta_2 [B]_{r,f_2} + \beta_3 [B]_{r,f_3} + \cdots + \beta_{n-r} [B]_{r,f_{n-r}}\right) v_{d_r}&
\end{align*}
This mess expresses the vector w as a linear combination of the vectors in
\[
T = \{v_{d_1}, v_{d_2}, v_{d_3}, \ldots, v_{d_r}\}
\]
thus saying that w ∈ ⟨T⟩. Therefore, W = ⟨S⟩ ⊆ ⟨T⟩. ∎
In Example COV, we tossed out vectors one at a time. But in each instance, we rewrote the offending vector as a linear combination of those vectors with the column indices of the pivot columns of the reduced row-echelon form of the matrix of columns. In the proof of Theorem BS, we accomplish this reduction in one big step. In Example COV we arrived at a linearly independent set at exactly the same moment that we ran out of free variables to exploit. This was not a coincidence; it is the substance of our conclusion of linear independence in Theorem BS.
Here is a straightforward application of Theorem BS.
Example RSC4 Reducing a span in C^4
Begin with a set of five vectors from C^4,
\[
S = \left\{
\begin{bmatrix} 1\\1\\2\\1 \end{bmatrix},
\begin{bmatrix} 2\\2\\4\\2 \end{bmatrix},
\begin{bmatrix} 2\\0\\-1\\1 \end{bmatrix},
\begin{bmatrix} 7\\1\\-1\\4 \end{bmatrix},
\begin{bmatrix} 0\\2\\5\\1 \end{bmatrix}
\right\}
\]
and let W = ⟨S⟩. To arrive at a (smaller) linearly independent set, follow the procedure described in Theorem BS. Place the vectors from S into a matrix as columns, and row-reduce,
\[
\begin{bmatrix}
1 & 2 & 2 & 7 & 0\\
1 & 2 & 0 & 1 & 2\\
2 & 4 & -1 & -1 & 5\\
1 & 2 & 1 & 4 & 1
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 2 & 0 & 1 & 2\\
0 & 0 & 1 & 3 & -1\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Columns 1 and 3 are the pivot columns (D = {1, 3}) so the set
\[
T = \left\{
\begin{bmatrix} 1\\1\\2\\1 \end{bmatrix},
\begin{bmatrix} 2\\0\\-1\\1 \end{bmatrix}
\right\}
\]
is linearly independent and ⟨T⟩ = ⟨S⟩ = W. Boom!
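This row reduction can be reproduced in a couple of lines. A sketch of the check with SymPy (ours, not part of the text):

```python
# Checking Example RSC4 with SymPy (our sketch, not part of the text).
from sympy import Matrix

S = [[1, 1, 2, 1], [2, 2, 4, 2], [2, 0, -1, 1], [7, 1, -1, 4], [0, 2, 5, 1]]
A = Matrix.hstack(*[Matrix(v) for v in S])
B, pivots = A.rref()
print(pivots)  # (0, 2), i.e. D = {1, 3} in 1-based numbering
```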
Since the reduced row-echelon form of a matrix is unique (Theorem RREFU), the procedure of Theorem BS leads us to a unique set T. However, there is a wide variety of possibilities for sets T that are linearly independent and which can be employed in a span to create W. Without proof, we list two other possibilities:
\[
T' = \left\{
\begin{bmatrix} 2\\2\\4\\2 \end{bmatrix},
\begin{bmatrix} 2\\0\\-1\\1 \end{bmatrix}
\right\}
\qquad
T^{*} = \left\{
\begin{bmatrix} 3\\1\\1\\2 \end{bmatrix},
\begin{bmatrix} -1\\1\\3\\0 \end{bmatrix}
\right\}
\]
Can you prove that T' and T* are linearly independent sets and W = ⟨S⟩ = ⟨T'⟩ = ⟨T*⟩? △
Example RES Reworking elements of a span
Begin with a set of five vectors from C^4,
\[
R = \left\{
\begin{bmatrix} 2\\1\\3\\2 \end{bmatrix},
\begin{bmatrix} -1\\1\\0\\1 \end{bmatrix},
\begin{bmatrix} -8\\-1\\-9\\-4 \end{bmatrix},
\begin{bmatrix} 3\\1\\-1\\-2 \end{bmatrix},
\begin{bmatrix} -10\\-1\\-1\\4 \end{bmatrix}
\right\}
\]
It is easy to create elements of X = ⟨R⟩; we will create one at random,
\[
y = 6\begin{bmatrix} 2\\1\\3\\2 \end{bmatrix}
+ (-7)\begin{bmatrix} -1\\1\\0\\1 \end{bmatrix}
+ 1\begin{bmatrix} -8\\-1\\-9\\-4 \end{bmatrix}
+ 6\begin{bmatrix} 3\\1\\-1\\-2 \end{bmatrix}
+ 2\begin{bmatrix} -10\\-1\\-1\\4 \end{bmatrix}
= \begin{bmatrix} 9\\2\\1\\-3 \end{bmatrix}
\]
We know we can replace R by a smaller set (since it is obviously linearly dependent by Theorem MVSLD) that will create the same span. Here goes,
\[
\begin{bmatrix}
2 & -1 & -8 & 3 & -10\\
1 & 1 & -1 & 1 & -1\\
3 & 0 & -9 & -1 & -1\\
2 & 1 & -4 & -2 & 4
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & -3 & 0 & -1\\
0 & 1 & 2 & 0 & 2\\
0 & 0 & 0 & 1 & -2\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
So, if we collect the first, second and fourth vectors from R,
\[
P = \left\{
\begin{bmatrix} 2\\1\\3\\2 \end{bmatrix},
\begin{bmatrix} -1\\1\\0\\1 \end{bmatrix},
\begin{bmatrix} 3\\1\\-1\\-2 \end{bmatrix}
\right\}
\]
then P is linearly independent and ⟨P⟩ = ⟨R⟩ = X by Theorem BS. Since we built y as an element of ⟨R⟩ it must also be an element of ⟨P⟩. Can we write y as a linear combination of just the three vectors in P? The answer is, of course, yes. But let us compute an explicit linear combination just for fun. By Theorem SLSLC we can get such a linear combination by solving a system of equations with the column vectors of P as the columns of a coefficient matrix, and y as the vector of constants.
Employing an augmented matrix to solve this system,
\[
\begin{bmatrix}
2 & -1 & 3 & 9\\
1 & 1 & 1 & 2\\
3 & 0 & -1 & 1\\
2 & 1 & -2 & -3
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 & 1\\
0 & 1 & 0 & -1\\
0 & 0 & 1 & 2\\
0 & 0 & 0 & 0
\end{bmatrix}
\]
So we see, as expected, that
\[
1\begin{bmatrix} 2\\1\\3\\2 \end{bmatrix}
+ (-1)\begin{bmatrix} -1\\1\\0\\1 \end{bmatrix}
+ 2\begin{bmatrix} 3\\1\\-1\\-2 \end{bmatrix}
= \begin{bmatrix} 9\\2\\1\\-3 \end{bmatrix} = y
\]
A key feature of this example is that the linear combination that expresses y as a linear combination of the vectors in P is unique. This is a consequence of the linear independence of P. The linearly independent set P is smaller than R, but still just (barely) big enough to create elements of the set X = ⟨R⟩. There are many, many ways to write y as a linear combination of the five vectors in R (the appropriate system of equations to verify this claim yields two free variables in the description of the solution set), yet there is precisely one way to write y as a linear combination of the three vectors in P. △
Reading Questions
1. Let S be the linearly dependent set of three vectors below.
\[
S = \left\{
\begin{bmatrix} 1\\10\\100\\1000 \end{bmatrix},
\begin{bmatrix} 1\\1\\1\\1 \end{bmatrix},
\begin{bmatrix} 5\\23\\203\\2003 \end{bmatrix}
\right\}
\]
Write one vector from S as a linear combination of the other two and include this vector equality in your response. (You should be able to do this on sight, rather than doing some computations.) Convert this expression into a nontrivial relation of linear dependence on S.
2. Explain why the word "dependent" is used in the definition of linear dependence.
3. Suppose that Y = ⟨P⟩ = ⟨Q⟩, where P is a linearly dependent set and Q is linearly independent. Would you rather use P or Q to describe Y? Why?
Exercises
C20† Let T be the set of columns of the matrix B below. Define W = ⟨T⟩. Find a set R so that (1) R has 3 vectors, (2) R is a subset of T, and (3) W = ⟨R⟩.
\[
B = \begin{bmatrix}
-3 & 1 & -2 & 7\\
-1 & 2 & 1 & 4\\
1 & 1 & 2 & -1
\end{bmatrix}
\]
C40 Verify that the set R' = {v_1, v_2, v_4} at the end of Example RSC5 is linearly independent.
C50† Consider the set of vectors from C^3, W, given below. Find a linearly independent set T that contains three vectors from W and such that ⟨W⟩ = ⟨T⟩.
\[
W = \{v_1, v_2, v_3, v_4, v_5\} = \left\{
\begin{bmatrix} 2\\1\\1 \end{bmatrix},
\begin{bmatrix} \ \\-1\\1 \end{bmatrix},
\begin{bmatrix} 1\\2\\3 \end{bmatrix},
\begin{bmatrix} 3\\1\\3 \end{bmatrix},
\begin{bmatrix} 0\\1\\-3 \end{bmatrix}
\right\}
\]
C51† Given the set S below, find a linearly independent set T so that ⟨T⟩ = ⟨S⟩.
\[
S = \left\{
\begin{bmatrix} 2\\-1\\2 \end{bmatrix},
\begin{bmatrix} 3\\0\\1 \end{bmatrix},
\begin{bmatrix} 1\\1\\-1 \end{bmatrix},
\begin{bmatrix} 5\\-1\\3 \end{bmatrix}
\right\}
\]
C52† Let W be the span of the set of vectors S below, W = ⟨S⟩. Find a set T so that (1) the span of T is W, ⟨T⟩ = W, (2) T is a linearly independent set, and (3) T is a subset of S.
\[
S = \left\{
\begin{bmatrix} 1\\2\\-1 \end{bmatrix},
\begin{bmatrix} 2\\-3\\1 \end{bmatrix},
\begin{bmatrix} 4\\1\\-1 \end{bmatrix},
\begin{bmatrix} 3\\1\\1 \end{bmatrix},
\begin{bmatrix} 3\\-1\\0 \end{bmatrix}
\right\}
\]
C55† Let T be the set of vectors
\[
T = \left\{
\begin{bmatrix} 1\\-1\\2 \end{bmatrix},
\begin{bmatrix} 3\\0\\1 \end{bmatrix},
\begin{bmatrix} 4\\2\\3 \end{bmatrix},
\begin{bmatrix} 3\\0\\6 \end{bmatrix}
\right\}
\]
Find two different subsets of T, named R and S, so that R and S each contain three vectors, and so that ⟨R⟩ = ⟨T⟩ and ⟨S⟩ = ⟨T⟩. Prove that both R and S are linearly independent.
C70 Reprise Example RES by creating a new version of the vector y. In other words, form a new, different linear combination of the vectors in R to create a new vector y (but do not simplify the problem too much by choosing any of the five new scalars to be zero). Then express this new y as a combination of the vectors in P.
M10 At the conclusion of Example RSC4 two alternative solutions, sets T' and T*, are proposed. Verify these claims by proving that ⟨T⟩ = ⟨T'⟩ and ⟨T⟩ = ⟨T*⟩.
T40† Suppose that v_1 and v_2 are any two vectors from C^m. Prove the following set equality.
\[
\langle\{v_1, v_2\}\rangle = \langle\{v_1 + v_2, v_1 - v_2\}\rangle
\]
Section O
Orthogonality
In this section we define a couple more operations with vectors, and prove a few theorems. At first blush these definitions and results will not appear central to what follows, but we will make use of them at key points in the remainder of the course (such as Section MINM, Section OD). Because we have chosen to use C as our set of scalars, this subsection is a bit more, uh, ... complex than it would be for the real numbers. We will explain as we go along how things get easier for the real numbers R. If you have not already, now would be a good time to review some of the basic properties of arithmetic with complex numbers described in Section CNO. With that done, we can extend the basics of complex number arithmetic to our study of vectors in C^m.
Subsection CAV
Complex Arithmetic and Vectors
We know how the addition and multiplication of complex numbers is employed in defining the operations for vectors in C^m (Definition CVA and Definition CVSM). We can also extend the idea of the conjugate to vectors.
Definition CCCV Complex Conjugate of a Column Vector
Suppose that u is a vector from C^m. Then the conjugate of the vector, \(\overline{u}\), is defined by
\[
\left[\overline{u}\right]_i = \overline{[u]_i} \qquad 1 \le i \le m
\]
□
With this definition we can show that the conjugate of a column vector behaves as we would expect with regard to vector addition and scalar multiplication.
Theorem CRVA Conjugation Respects Vector Addition
Suppose x and y are two vectors from C^m. Then
\[
\overline{x + y} = \overline{x} + \overline{y}
\]
Proof. For each 1 ≤ i ≤ m,
\begin{align*}
\left[\overline{x + y}\right]_i &= \overline{[x + y]_i} & &\text{Definition CCCV}\\
&= \overline{[x]_i + [y]_i} & &\text{Definition CVA}\\
&= \overline{[x]_i} + \overline{[y]_i} & &\text{Theorem CCRA}\\
&= \left[\overline{x}\right]_i + \left[\overline{y}\right]_i & &\text{Definition CCCV}\\
&= \left[\overline{x} + \overline{y}\right]_i & &\text{Definition CVA}
\end{align*}
§O Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 149<br />
Then by Definition CVE we have \(\overline{x + y} = \overline{x} + \overline{y}\). ∎
Theorem CRSM Conjugation Respects Vector Scalar Multiplication
Suppose x is a vector from C^m, and α ∈ C is a scalar. Then
\[
\overline{\alpha x} = \overline{\alpha}\,\overline{x}
\]
Proof. For 1 ≤ i ≤ m,
\begin{align*}
\left[\overline{\alpha x}\right]_i &= \overline{[\alpha x]_i} & &\text{Definition CCCV}\\
&= \overline{\alpha [x]_i} & &\text{Definition CVSM}\\
&= \overline{\alpha}\,\overline{[x]_i} & &\text{Theorem CCRM}\\
&= \overline{\alpha}\left[\overline{x}\right]_i & &\text{Definition CCCV}\\
&= \left[\overline{\alpha}\,\overline{x}\right]_i & &\text{Definition CVSM}
\end{align*}
Then by Definition CVE we have \(\overline{\alpha x} = \overline{\alpha}\,\overline{x}\). ∎
These two theorems together tell us how we can "push" complex conjugation through linear combinations.
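The two conjugation theorems are easy to confirm numerically. A quick check on example vectors of our own choosing (the `conj` helper is ours):

```python
# A quick numerical check of Theorems CRVA and CRSM on example vectors of our
# own choosing (the conj helper is ours).
def conj(v):
    return [z.conjugate() for z in v]

x = [2 + 3j, -1j, 4 + 0j]
y = [1 - 1j, 5 + 2j, -2j]
alpha = 2 - 5j

# conjugate of a sum is the sum of conjugates, entrywise
add_ok = conj([a + b for a, b in zip(x, y)]) == [a + b for a, b in zip(conj(x), conj(y))]
# conjugate of a scalar multiple picks up the conjugate of the scalar
scale_ok = conj([alpha * a for a in x]) == [alpha.conjugate() * a for a in conj(x)]
print(add_ok, scale_ok)  # True True
```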
Subsection IP
Inner products
Definition IP Inner Product
Given the vectors u, v ∈ C^m the inner product of u and v is the scalar quantity in C,
\[
\langle u, v\rangle = \overline{[u]_1}[v]_1 + \overline{[u]_2}[v]_2 + \overline{[u]_3}[v]_3 + \cdots + \overline{[u]_m}[v]_m = \sum_{i=1}^{m} \overline{[u]_i}[v]_i
\]
□
This operation is a bit different in that we begin with two vectors but produce a scalar. Computing one is straightforward.
Example CSIP Computing some inner products
The inner product of
\[
u = \begin{bmatrix} 2+3i\\ 5+2i\\ -3+i \end{bmatrix}
\quad\text{and}\quad
v = \begin{bmatrix} 1+2i\\ -4+5i\\ 0+5i \end{bmatrix}
\]
is
\begin{align*}
\langle u, v\rangle &= \overline{(2+3i)}(1+2i) + \overline{(5+2i)}(-4+5i) + \overline{(-3+i)}(0+5i)\\
&= (2-3i)(1+2i) + (5-2i)(-4+5i) + (-3-i)(0+5i)\\
&= (8+i) + (-10+33i) + (5-15i)\\
&= 3+19i
\end{align*}
The inner product of
\[
w = \begin{bmatrix} 2\\4\\-3\\2\\8 \end{bmatrix}
\quad\text{and}\quad
x = \begin{bmatrix} 3\\1\\0\\-1\\-2 \end{bmatrix}
\]
is
\[
\langle w, x\rangle = \overline{(2)}3 + \overline{(4)}1 + \overline{(-3)}0 + \overline{(2)}(-1) + \overline{(8)}(-2)
= 2(3) + 4(1) + (-3)0 + 2(-1) + 8(-2) = -8.
\]
△
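Definition IP translates directly into code. A sketch (the function name is ours): conjugate the entries of the first vector, multiply entrywise, and sum.

```python
# Definition IP translated directly (the function name is ours): conjugate the
# entries of the first vector, multiply entrywise, and sum.
def ip(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

u = [2 + 3j, 5 + 2j, -3 + 1j]
v = [1 + 2j, -4 + 5j, 0 + 5j]
print(ip(u, v))  # (3+19j), matching the first computation in Example CSIP

w = [2, 4, -3, 2, 8]
x = [3, 1, 0, -1, -2]
print(ip(w, x))  # -8, the dot product, since all entries are real
```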
In the case where the entries of our vectors are all real numbers (as in the second part of Example CSIP), the computation of the inner product may look familiar and be known to you as a dot product or scalar product. So you can view the inner product as a generalization of the scalar product to vectors from C^m (rather than R^m).
Note that we have chosen to conjugate the entries of the first vector listed in the inner product, while it is almost equally feasible to conjugate entries from the second vector instead. In particular, prior to Version 2.90, we did use the latter definition, and this has now changed to the former, with resulting adjustments propagated up through Section CB (only). However, conjugating the first vector leads to much nicer formulas for certain matrix decompositions and also shortens some proofs.
There are several quick theorems we can now prove, and they will each be useful later.
Theorem IPVA Inner Product and Vector Addition
Suppose u, v, w ∈ C^m. Then
1. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩
2. ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩
Proof. The proofs of the two parts are very similar, with the second one requiring just a bit more effort due to the conjugation that occurs. We will prove part 1 and you can prove part 2 (Exercise O.T10).
\begin{align*}
\langle u + v, w\rangle &= \sum_{i=1}^{m} \overline{[u + v]_i}\,[w]_i & &\text{Definition IP}\\
&= \sum_{i=1}^{m} \overline{([u]_i + [v]_i)}\,[w]_i & &\text{Definition CVA}
\end{align*}
\begin{align*}
&= \sum_{i=1}^{m} \left(\overline{[u]_i} + \overline{[v]_i}\right)[w]_i & &\text{Theorem CCRA}\\
&= \sum_{i=1}^{m} \overline{[u]_i}\,[w]_i + \overline{[v]_i}\,[w]_i & &\text{Property DCN}\\
&= \sum_{i=1}^{m} \overline{[u]_i}\,[w]_i + \sum_{i=1}^{m} \overline{[v]_i}\,[w]_i & &\text{Property CACN}\\
&= \langle u, w\rangle + \langle v, w\rangle & &\text{Definition IP}
\end{align*}
∎
Theorem IPSM Inner Product and Scalar Multiplication
Suppose u, v ∈ C^m and α ∈ C. Then
1. ⟨αu, v⟩ = \(\overline{\alpha}\)⟨u, v⟩
2. ⟨u, αv⟩ = α⟨u, v⟩
Proof. The proofs of the two parts are very similar, with the second one requiring just a bit more effort due to the conjugation that occurs. We will prove part 1 and you can prove part 2 (Exercise O.T11).
\begin{align*}
\langle\alpha u, v\rangle &= \sum_{i=1}^{m} \overline{[\alpha u]_i}\,[v]_i & &\text{Definition IP}\\
&= \sum_{i=1}^{m} \overline{\alpha [u]_i}\,[v]_i & &\text{Definition CVSM}\\
&= \sum_{i=1}^{m} \overline{\alpha}\,\overline{[u]_i}\,[v]_i & &\text{Theorem CCRM}\\
&= \overline{\alpha}\sum_{i=1}^{m} \overline{[u]_i}\,[v]_i & &\text{Property DCN}\\
&= \overline{\alpha}\,\langle u, v\rangle & &\text{Definition IP}
\end{align*}
∎
Theorem IPAC Inner Product is Anti-Commutative
Suppose that u and v are vectors in C^m. Then ⟨u, v⟩ = \(\overline{\langle v, u\rangle}\).
Proof.
\begin{align*}
\langle u, v\rangle &= \sum_{i=1}^{m} \overline{[u]_i}\,[v]_i & &\text{Definition IP}\\
&= \sum_{i=1}^{m} \overline{[u]_i}\,\overline{\overline{[v]_i}} & &\text{Theorem CCT}\\
&= \sum_{i=1}^{m} \overline{[u]_i\,\overline{[v]_i}} & &\text{Theorem CCRM}\\
&= \overline{\left(\sum_{i=1}^{m} [u]_i\,\overline{[v]_i}\right)} & &\text{Theorem CCRA}\\
&= \overline{\left(\sum_{i=1}^{m} \overline{[v]_i}\,[u]_i\right)} & &\text{Property CMCN}\\
&= \overline{\langle v, u\rangle} & &\text{Definition IP}
\end{align*}
∎
Subsection N
Norm
If treating linear algebra in a more geometric fashion, the length of a vector occurs naturally, and is what you would expect from its name. With complex numbers, we will define a similar function. Recall that if c is a complex number, then |c| denotes its modulus (Definition MCN).
Def<strong>in</strong>ition NV Norm of a Vector<br />
The norm of the vector u is the scalar quantity <strong>in</strong> C<br />
‖u‖ =<br />
√<br />
|[u] 1<br />
| 2 + |[u] 2<br />
| 2 + |[u] 3<br />
| 2 + ···+ |[u] m<br />
| 2 =<br />
Comput<strong>in</strong>g a norm is also easy to do.<br />
Example CNSV Comput<strong>in</strong>g the norm of some vectors<br />
The norm of<br />
⎡ ⎤<br />
3+2i<br />
⎢1 − 6i⎥<br />
u = ⎣<br />
2+4i<br />
⎦<br />
2+i<br />
is<br />
‖u‖ =<br />
∑<br />
√ m |[u] i<br />
| 2<br />
i=1<br />
√<br />
|3+2i| 2 + |1 − 6i| 2 + |2+4i| 2 + |2+i| 2<br />
<br />
□
§O Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 153<br />
is<br />
The norm of<br />
‖v‖ =<br />
= √ 13 + 37 + 20 + 5 = √ 75 = 5 √ 3<br />
⎡ ⎤<br />
3<br />
−1<br />
v = ⎢ 2 ⎥<br />
⎣ 4 ⎦<br />
−3<br />
√<br />
|3| 2 + |−1| 2 + |2| 2 + |4| 2 + |−3| 2 = √ 3 2 +1 2 +2 2 +4 2 +3 2 = √ 39.<br />
Notice how the norm of a vector with real number entries is just the length of the vector. Inner products and norms are related by the following theorem.
Theorem IPN Inner Products and Norms
Suppose that u is a vector in C^m. Then ‖u‖^2 = ⟨u, u⟩.
Proof.

‖u‖^2 = \left( \sqrt{\sum_{i=1}^{m} |[u]_i|^2} \right)^2    Definition NV
= \sum_{i=1}^{m} |[u]_i|^2    Inverse functions
= \sum_{i=1}^{m} \overline{[u]_i}\,[u]_i    Definition MCN
= ⟨u, u⟩    Definition IP

■
When our vectors have entries only from the real numbers, Theorem IPN says that the dot product of a vector with itself is equal to the length of the vector squared.
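Theorem IPN is easy to check numerically. A quick Python sketch (illustrative only; the function names are ours), with the inner product conjugating the first argument as in Definition IP:

```python
from math import isclose, sqrt

def ip(u, v):
    # Definition IP: conjugate the entries of the first vector.
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    return sqrt(sum(abs(x) ** 2 for x in u))

u = [3 + 2j, 1 - 6j, 2 + 4j, 2 + 1j]
# Theorem IPN: <u, u> equals ||u||^2, and it is a real number
# (zero imaginary part) even though the entries of u are complex.
print(ip(u, u))      # (75+0j)
print(norm(u) ** 2)  # 75.0, up to floating-point rounding
```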
Theorem PIP Positive Inner Products
Suppose that u is a vector in C^m. Then ⟨u, u⟩ ≥ 0 with equality if and only if u = 0.
Proof. From the proof of Theorem IPN we see that

⟨u, u⟩ = |[u]_1|^2 + |[u]_2|^2 + |[u]_3|^2 + \cdots + |[u]_m|^2

Since each term is a squared modulus, every term is nonnegative, and so the sum is nonnegative. (Notice that in general the inner product is a complex number and cannot be compared with zero, but in the special case of ⟨u, u⟩ the result is a real number.)
The phrase "with equality if and only if" means that we want to show that the statement ⟨u, u⟩ = 0 (i.e. with equality) is equivalent ("if and only if") to the statement u = 0.
If u = 0, then it is a straightforward computation to see that ⟨u, u⟩ = 0. In the other direction, assume that ⟨u, u⟩ = 0. As before, ⟨u, u⟩ is a sum of squared moduli. So we have

0 = ⟨u, u⟩ = |[u]_1|^2 + |[u]_2|^2 + |[u]_3|^2 + \cdots + |[u]_m|^2
Now we have a sum of squares equaling zero, so each term must be zero. Then by similar logic, |[u]_i| = 0 will imply that [u]_i = 0, since 0 + 0i is the only complex number with zero modulus. Thus every entry of u is zero and so u = 0, as desired. ■
Notice that Theorem PIP contains three implications:

u ∈ C^m ⇒ ⟨u, u⟩ ≥ 0
u = 0 ⇒ ⟨u, u⟩ = 0
⟨u, u⟩ = 0 ⇒ u = 0

The results contained in Theorem PIP are summarized by saying "the inner product is positive definite."
Subsection OV
Orthogonal Vectors
"Orthogonal" is a generalization of "perpendicular." You may have used mutually perpendicular vectors in a physics class, or you may recall from a calculus class that perpendicular vectors have a zero dot product. We will now extend these ideas into the realm of higher dimensions and complex scalars.
Definition OV Orthogonal Vectors
A pair of vectors, u and v, from C^m are orthogonal if their inner product is zero, that is, ⟨u, v⟩ = 0.
□
Example TOV Two orthogonal vectors
The vectors

u = \begin{bmatrix} 2+3i \\ 4-2i \\ 1+i \\ 1+i \end{bmatrix} \qquad v = \begin{bmatrix} 1-i \\ 2+3i \\ 4-6i \\ 1 \end{bmatrix}

are orthogonal since

⟨u, v⟩ = (2-3i)(1-i) + (4+2i)(2+3i) + (1-i)(4-6i) + (1-i)(1)
= (-1-5i) + (2+16i) + (-2-10i) + (1-i)
= 0 + 0i.

△
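The check in Example TOV can be replicated in a couple of lines of Python (an illustrative sketch, not the book's Sage code); the only subtlety is remembering to conjugate the entries of the first vector:

```python
def ip(u, v):
    # Definition IP: conjugate the entries of the first vector.
    return sum(a.conjugate() * b for a, b in zip(u, v))

u = [2 + 3j, 4 - 2j, 1 + 1j, 1 + 1j]
v = [1 - 1j, 2 + 3j, 4 - 6j, 1]
print(ip(u, v))  # 0j, so u and v are orthogonal
```

Because the entries are Gaussian integers, the arithmetic is exact and the result is exactly zero, not merely close to it.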
We extend this definition to whole sets by requiring vectors to be pairwise orthogonal. Despite using the same word, careful thought about what objects you are using will eliminate any source of confusion.
Definition OSV Orthogonal Set of Vectors
Suppose that S = {u_1, u_2, u_3, ..., u_n} is a set of vectors from C^m. Then S is an orthogonal set if every pair of different vectors from S is orthogonal, that is, ⟨u_i, u_j⟩ = 0 whenever i ≠ j. □
We now define the prototypical orthogonal set, which we will reference repeatedly.
Definition SUV Standard Unit Vectors
Let e_j ∈ C^m, 1 ≤ j ≤ m, denote the column vectors defined by

[e_j]_i = \begin{cases} 0 & \text{if } i ≠ j \\ 1 & \text{if } i = j \end{cases}

Then the set

{e_1, e_2, e_3, ..., e_m} = \{ e_j \mid 1 ≤ j ≤ m \}

is the set of standard unit vectors in C^m. □
Notice that e_j is identical to column j of the m × m identity matrix I_m (Definition IM) and is a pivot column for I_m, since the identity matrix is in reduced row-echelon form. These observations will often be useful. We will reserve the notation e_i for these vectors. It is not hard to see that the set of standard unit vectors is an orthogonal set.
Example SUVOS Standard Unit Vectors are an Orthogonal Set
Compute the inner product of two distinct vectors from the set of standard unit vectors (Definition SUV), say e_i, e_j, where i ≠ j,

⟨e_i, e_j⟩ = \overline{0}(0) + \overline{0}(0) + \cdots + \overline{1}(0) + \cdots + \overline{0}(1) + \cdots + \overline{0}(0) + \overline{0}(0)
= 0(0) + 0(0) + \cdots + 1(0) + \cdots + 0(1) + \cdots + 0(0) + 0(0)
= 0

So the set {e_1, e_2, e_3, ..., e_m} is an orthogonal set.
△
Example AOS An orthogonal set
The set

{x_1, x_2, x_3, x_4} = \left\{ \begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix}, \begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix}, \begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix}, \begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix} \right\}

is an orthogonal set.
Since the inner product is anti-commutative (Theorem IPAC) we can test pairs of different vectors in any order. If the result is zero, then it will also be zero if the inner product is computed in the opposite order. This means there are six different pairs of vectors to use in an inner product computation. We will do two and you can practice your inner products on the other four.
⟨x_1, x_3⟩ = (1-i)(-7+34i) + (1)(-8-23i) + (1+i)(-10+22i) + (-i)(30+13i)
= (27+41i) + (-8-23i) + (-32+12i) + (13-30i)
= 0 + 0i

and

⟨x_2, x_4⟩ = (1-5i)(-2-4i) + (6-5i)(6+i) + (-7+i)(4+3i) + (1+6i)(6-i)
= (-22+6i) + (41-24i) + (-31-17i) + (12+35i)
= 0 + 0i

△
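All six pairings from Example AOS can be checked at once by machine. A short Python sketch (illustrative; not the book's Sage code), with the inner product conjugating its first argument per Definition IP:

```python
from itertools import combinations

def ip(u, v):
    # Definition IP: conjugate the entries of the first vector.
    return sum(a.conjugate() * b for a, b in zip(u, v))

x1 = [1 + 1j, 1, 1 - 1j, 1j]
x2 = [1 + 5j, 6 + 5j, -7 - 1j, 1 - 6j]
x3 = [-7 + 34j, -8 - 23j, -10 + 22j, 30 + 13j]
x4 = [-2 - 4j, 6 + 1j, 4 + 3j, 6 - 1j]

# Definition OSV: every distinct pair must have a zero inner product.
for a, b in combinations([x1, x2, x3, x4], 2):
    print(ip(a, b))  # 0j for every pair
```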
So far, this section has seen lots of definitions, and lots of theorems establishing unsurprising consequences of those definitions. But here is our first theorem that suggests that inner products and orthogonal vectors have some utility. It is also one of our first illustrations of how to arrive at linear independence as the conclusion of a theorem.
Theorem OSLI Orthogonal Sets are Linearly Independent
Suppose that S is an orthogonal set of nonzero vectors. Then S is linearly independent.
Proof. Let S = {u_1, u_2, u_3, ..., u_n} be an orthogonal set of nonzero vectors. To prove the linear independence of S, we can appeal to the definition (Definition LICV) and begin with an arbitrary relation of linear dependence (Definition RLDCV),

α_1 u_1 + α_2 u_2 + α_3 u_3 + \cdots + α_n u_n = 0.

Then, for every 1 ≤ i ≤ n, we have

α_i ⟨u_i, u_i⟩
= α_1(0) + α_2(0) + \cdots + α_i ⟨u_i, u_i⟩ + \cdots + α_n(0)    Property ZCN
= α_1 ⟨u_i, u_1⟩ + \cdots + α_i ⟨u_i, u_i⟩ + \cdots + α_n ⟨u_i, u_n⟩    Definition OSV
= ⟨u_i, α_1 u_1⟩ + ⟨u_i, α_2 u_2⟩ + \cdots + ⟨u_i, α_n u_n⟩    Theorem IPSM
= ⟨u_i, α_1 u_1 + α_2 u_2 + α_3 u_3 + \cdots + α_n u_n⟩    Theorem IPVA
= ⟨u_i, 0⟩    Definition RLDCV
= 0    Definition IP

Because u_i was assumed to be nonzero, Theorem PIP says ⟨u_i, u_i⟩ is nonzero and thus α_i must be zero. So we conclude that α_i = 0 for all 1 ≤ i ≤ n in any relation of linear dependence on S. But this says that S is a linearly independent set, since the only way to form a relation of linear dependence is the trivial way (Definition LICV). Boom! ■
Subsection GSP
Gram-Schmidt Procedure
The Gram-Schmidt Procedure is really a theorem. It says that if we begin with a linearly independent set of p vectors, S, then we can do a number of calculations with these vectors and produce an orthogonal set of p vectors, T, so that ⟨S⟩ = ⟨T⟩. Given the large number of computations involved, it is indeed a procedure to do all the necessary computations, and it is best employed on a computer. However, it also has value in proofs where we may on occasion wish to replace a linearly independent set by an orthogonal set.
This is our first occasion to use the technique of "mathematical induction" for a proof, a technique we will see again several times, especially in Chapter D. So study the simple example described in Proof Technique I first.
Theorem GSP Gram-Schmidt Procedure
Suppose that S = {v_1, v_2, v_3, ..., v_p} is a linearly independent set of vectors in C^m. Define the vectors u_i, 1 ≤ i ≤ p, by

u_i = v_i - \frac{⟨u_1, v_i⟩}{⟨u_1, u_1⟩}u_1 - \frac{⟨u_2, v_i⟩}{⟨u_2, u_2⟩}u_2 - \frac{⟨u_3, v_i⟩}{⟨u_3, u_3⟩}u_3 - \cdots - \frac{⟨u_{i-1}, v_i⟩}{⟨u_{i-1}, u_{i-1}⟩}u_{i-1}

Let T = {u_1, u_2, u_3, ..., u_p}. Then T is an orthogonal set of nonzero vectors, and ⟨T⟩ = ⟨S⟩.
Proof. We will prove the result by using induction on p (Proof Technique I). To begin, we prove that T has the desired properties when p = 1. In this case u_1 = v_1 and T = {u_1} = {v_1} = S. Because S and T are equal, ⟨S⟩ = ⟨T⟩. Equally trivially, T is an orthogonal set. And u_1 is nonzero: if u_1 = 0, then S would be a linearly dependent set, a contradiction.
Suppose that the theorem is true for any set of p − 1 linearly independent vectors. Let S = {v_1, v_2, v_3, ..., v_p} be a linearly independent set of p vectors. Then S′ = {v_1, v_2, v_3, ..., v_{p−1}} is also linearly independent. So we can apply the theorem to S′ and construct the vectors T′ = {u_1, u_2, u_3, ..., u_{p−1}}. T′ is therefore an orthogonal set of nonzero vectors and ⟨S′⟩ = ⟨T′⟩. Define

u_p = v_p - \frac{⟨u_1, v_p⟩}{⟨u_1, u_1⟩}u_1 - \frac{⟨u_2, v_p⟩}{⟨u_2, u_2⟩}u_2 - \frac{⟨u_3, v_p⟩}{⟨u_3, u_3⟩}u_3 - \cdots - \frac{⟨u_{p-1}, v_p⟩}{⟨u_{p-1}, u_{p-1}⟩}u_{p-1}

and let T = T′ ∪ {u_p}. We now need to show that T has several properties by building on what we know about T′. But first notice that the above equation has no problems with the denominators (⟨u_i, u_i⟩) being zero, since the u_i are from T′, which is composed of nonzero vectors.
We show that ⟨T⟩ = ⟨S⟩ by first establishing that ⟨T⟩ ⊆ ⟨S⟩. Suppose x ∈ ⟨T⟩, so

x = a_1 u_1 + a_2 u_2 + a_3 u_3 + \cdots + a_p u_p
The term a_p u_p is a linear combination of vectors from T′ and the vector v_p, while the remaining terms are a linear combination of vectors from T′. Since ⟨T′⟩ = ⟨S′⟩, any term that is a multiple of a vector from T′ can be rewritten as a linear combination of vectors from S′. The remaining term a_p v_p is a multiple of a vector in S. So we see that x can be rewritten as a linear combination of vectors from S, i.e. x ∈ ⟨S⟩.
To show that ⟨S⟩ ⊆ ⟨T⟩, begin with y ∈ ⟨S⟩, so

y = a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_p v_p

Rearrange our defining equation for u_p by solving for v_p. Then the term a_p v_p is a multiple of a linear combination of elements of T. The remaining terms are a linear combination of v_1, v_2, v_3, ..., v_{p−1}, hence an element of ⟨S′⟩ = ⟨T′⟩. Thus these remaining terms can be written as a linear combination of the vectors in T′. So y is a linear combination of vectors from T, i.e. y ∈ ⟨T⟩.
The elements of T′ are nonzero, but what about u_p? Suppose to the contrary that u_p = 0,

0 = u_p = v_p - \frac{⟨u_1, v_p⟩}{⟨u_1, u_1⟩}u_1 - \frac{⟨u_2, v_p⟩}{⟨u_2, u_2⟩}u_2 - \frac{⟨u_3, v_p⟩}{⟨u_3, u_3⟩}u_3 - \cdots - \frac{⟨u_{p-1}, v_p⟩}{⟨u_{p-1}, u_{p-1}⟩}u_{p-1}

v_p = \frac{⟨u_1, v_p⟩}{⟨u_1, u_1⟩}u_1 + \frac{⟨u_2, v_p⟩}{⟨u_2, u_2⟩}u_2 + \frac{⟨u_3, v_p⟩}{⟨u_3, u_3⟩}u_3 + \cdots + \frac{⟨u_{p-1}, v_p⟩}{⟨u_{p-1}, u_{p-1}⟩}u_{p-1}

Since ⟨S′⟩ = ⟨T′⟩ we can write the vectors u_1, u_2, u_3, ..., u_{p−1} on the right side of this equation in terms of the vectors v_1, v_2, v_3, ..., v_{p−1}, and we then have the vector v_p expressed as a linear combination of the other p − 1 vectors in S, implying that S is a linearly dependent set (Theorem DLDS), contrary to our lone hypothesis about S.
Finally, it is a simple matter to establish that T is an orthogonal set, though it will not appear so simple looking. Think about your objects as you work through the following: what is a vector and what is a scalar. Since T′ is an orthogonal set by induction, most pairs of elements in T are already known to be orthogonal. We just need to test the "new" inner products, between u_p and u_i, for 1 ≤ i ≤ p − 1. Here we go, using summation notation,

⟨u_i, u_p⟩ = \left\langle u_i,\ v_p - \sum_{k=1}^{p-1} \frac{⟨u_k, v_p⟩}{⟨u_k, u_k⟩} u_k \right\rangle
= ⟨u_i, v_p⟩ - \left\langle u_i,\ \sum_{k=1}^{p-1} \frac{⟨u_k, v_p⟩}{⟨u_k, u_k⟩} u_k \right\rangle    Theorem IPVA
= ⟨u_i, v_p⟩ - \sum_{k=1}^{p-1} \left\langle u_i,\ \frac{⟨u_k, v_p⟩}{⟨u_k, u_k⟩} u_k \right\rangle    Theorem IPVA
= ⟨u_i, v_p⟩ - \sum_{k=1}^{p-1} \frac{⟨u_k, v_p⟩}{⟨u_k, u_k⟩} ⟨u_i, u_k⟩    Theorem IPSM
= ⟨u_i, v_p⟩ - \frac{⟨u_i, v_p⟩}{⟨u_i, u_i⟩} ⟨u_i, u_i⟩ - \sum_{k≠i} \frac{⟨u_k, v_p⟩}{⟨u_k, u_k⟩}(0)    Induction Hypothesis
= ⟨u_i, v_p⟩ - ⟨u_i, v_p⟩ - \sum_{k≠i} 0
= 0

■
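The defining formula of Theorem GSP translates almost directly into code. Here is a minimal pure-Python sketch (illustrative only, not the book's Sage code; `gram_schmidt` and `ip` are our names), run on the set S of Example GSTV:

```python
def ip(u, v):
    # Definition IP: conjugate the entries of the first vector.
    return sum(a.conjugate() * b for a, b in zip(u, v))

def gram_schmidt(S):
    # For each v_i, subtract its projection onto every previously
    # constructed u_k, exactly as in the formula of Theorem GSP.
    T = []
    for v in S:
        u = list(v)
        for w in T:
            coeff = ip(w, v) / ip(w, w)  # <u_k, v_i> / <u_k, u_k>
            u = [a - coeff * b for a, b in zip(u, w)]
        T.append(u)
    return T

S = [[1, 1 + 1j, 1], [-1j, 1, 1 + 1j], [0, 1j, 1j]]
T = gram_schmidt(S)
for i in range(len(T)):
    for j in range(i + 1, len(T)):
        print(abs(ip(T[i], T[j])))  # essentially 0 for each distinct pair
```

The division is exact for u_1 but floating-point thereafter, so the off-diagonal inner products come out as tiny roundoff values rather than exact zeros.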
Example GSTV Gram-Schmidt of three vectors
We will illustrate the Gram-Schmidt process with three vectors. Begin with the linearly independent (check this!) set

S = {v_1, v_2, v_3} = \left\{ \begin{bmatrix} 1 \\ 1+i \\ 1 \end{bmatrix}, \begin{bmatrix} -i \\ 1 \\ 1+i \end{bmatrix}, \begin{bmatrix} 0 \\ i \\ i \end{bmatrix} \right\}

Then

u_1 = v_1 = \begin{bmatrix} 1 \\ 1+i \\ 1 \end{bmatrix}

u_2 = v_2 - \frac{⟨u_1, v_2⟩}{⟨u_1, u_1⟩} u_1 = \frac{1}{4}\begin{bmatrix} -2-3i \\ 1-i \\ 2+5i \end{bmatrix}

u_3 = v_3 - \frac{⟨u_1, v_3⟩}{⟨u_1, u_1⟩} u_1 - \frac{⟨u_2, v_3⟩}{⟨u_2, u_2⟩} u_2 = \frac{1}{11}\begin{bmatrix} -3-i \\ 1+3i \\ -1-i \end{bmatrix}

and

T = {u_1, u_2, u_3} = \left\{ \begin{bmatrix} 1 \\ 1+i \\ 1 \end{bmatrix}, \frac{1}{4}\begin{bmatrix} -2-3i \\ 1-i \\ 2+5i \end{bmatrix}, \frac{1}{11}\begin{bmatrix} -3-i \\ 1+3i \\ -1-i \end{bmatrix} \right\}

is an orthogonal set (which you can check) of nonzero vectors and ⟨T⟩ = ⟨S⟩ (all by Theorem GSP). Of course, as a by-product of orthogonality, the set T is also linearly independent (Theorem OSLI).
△
One final definition related to orthogonal vectors.
Definition ONS OrthoNormal Set
Suppose S = {u_1, u_2, u_3, ..., u_n} is an orthogonal set of vectors such that ‖u_i‖ = 1 for all 1 ≤ i ≤ n. Then S is an orthonormal set of vectors.
□
Once you have an orthogonal set, it is easy to convert it to an orthonormal set: multiply each vector by the reciprocal of its norm, and the resulting vector will have norm 1. This scaling of each vector will not affect the orthogonality properties (apply Theorem IPSM).
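This conversion is one line per vector in code. A small Python sketch (illustrative; `normalize` is our name, not the book's Sage code):

```python
from math import isclose, sqrt

def norm(u):
    # Definition NV: square root of the sum of squared moduli.
    return sqrt(sum(abs(x) ** 2 for x in u))

def normalize(T):
    # Scale each vector by the reciprocal of its norm; by Theorem IPSM
    # this cannot disturb orthogonality, and each result has norm 1.
    return [[x / norm(u) for x in u] for u in T]

u1 = [1, 1 + 1j, 1]  # u_1 from Example GSTV, with norm 2
(w1,) = normalize([u1])
print(norm(w1))  # 1.0, up to rounding
```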
Example ONTV Orthonormal set, three vectors
The set

T = {u_1, u_2, u_3} = \left\{ \begin{bmatrix} 1 \\ 1+i \\ 1 \end{bmatrix}, \frac{1}{4}\begin{bmatrix} -2-3i \\ 1-i \\ 2+5i \end{bmatrix}, \frac{1}{11}\begin{bmatrix} -3-i \\ 1+3i \\ -1-i \end{bmatrix} \right\}

from Example GSTV is an orthogonal set.
We compute the norm of each vector,

‖u_1‖ = 2 \qquad ‖u_2‖ = \frac{1}{2}\sqrt{11} \qquad ‖u_3‖ = \frac{\sqrt{2}}{\sqrt{11}}

Converting each vector to a norm of 1 yields an orthonormal set,

w_1 = \frac{1}{2}\begin{bmatrix} 1 \\ 1+i \\ 1 \end{bmatrix}

w_2 = \frac{1}{\frac{1}{2}\sqrt{11}} \cdot \frac{1}{4}\begin{bmatrix} -2-3i \\ 1-i \\ 2+5i \end{bmatrix} = \frac{1}{2\sqrt{11}}\begin{bmatrix} -2-3i \\ 1-i \\ 2+5i \end{bmatrix}

w_3 = \frac{1}{\frac{\sqrt{2}}{\sqrt{11}}} \cdot \frac{1}{11}\begin{bmatrix} -3-i \\ 1+3i \\ -1-i \end{bmatrix} = \frac{1}{\sqrt{22}}\begin{bmatrix} -3-i \\ 1+3i \\ -1-i \end{bmatrix}

△
Example ONFV Orthonormal set, four vectors
As an exercise convert the linearly independent set

S = \left\{ \begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix}, \begin{bmatrix} i \\ 1+i \\ -1 \\ -i \end{bmatrix}, \begin{bmatrix} i \\ -i \\ -1+i \\ 1 \end{bmatrix}, \begin{bmatrix} -1-i \\ i \\ 1 \\ -1 \end{bmatrix} \right\}

to an orthogonal set via the Gram-Schmidt Process (Theorem GSP) and then scale the vectors to norm 1 to create an orthonormal set. You should get the same set you would if you scaled the orthogonal set of Example AOS to become an orthonormal set.
△
We will see orthonormal sets again in Subsection MINM.UM. They are intimately related to unitary matrices (Definition UM) through Theorem CUMOS. Some of the utility of orthonormal sets is captured by Theorem COB in Subsection B.OBC. Orthonormal sets appear once again in Section OD where they are key in orthonormal diagonalization.
Reading Questions
1. Is the set

\left\{ \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \begin{bmatrix} 5 \\ 3 \\ -1 \end{bmatrix}, \begin{bmatrix} 8 \\ 4 \\ -2 \end{bmatrix} \right\}

an orthogonal set? Why?
2. What is the distinction between an orthogonal set and an orthonormal set?
3. What is nice about the output of the Gram-Schmidt process?
Exercises
C20 Complete Example AOS by verifying that the four remaining inner products are zero.
C21 Verify that the set T created in Example GSTV by the Gram-Schmidt Procedure is an orthogonal set.
M60 Suppose that {u, v, w} ⊆ C^n is an orthonormal set. Prove that u + v is not orthogonal to v + w.
T10 Prove part 2 of the conclusion of Theorem IPVA.
T11 Prove part 2 of the conclusion of Theorem IPSM.
T20† Suppose that u, v, w ∈ C^n, α, β ∈ C and u is orthogonal to both v and w. Prove that u is orthogonal to αv + βw.
T30 Suppose that the set S in the hypothesis of Theorem GSP is not just linearly independent, but is also orthogonal. Prove that the set T created by the Gram-Schmidt procedure is equal to S. (Note that we are getting a stronger conclusion than ⟨T⟩ = ⟨S⟩: the conclusion is that T = S.) In other words, it is pointless to apply the Gram-Schmidt procedure to a set that is already orthogonal.
T31 Suppose that the set S is linearly independent. Apply the Gram-Schmidt procedure (Theorem GSP) twice, creating first the linearly independent set T_1 from S, and then creating T_2 from T_1. As a consequence of Exercise O.T30, prove that T_1 = T_2. In other words, it is pointless to apply the Gram-Schmidt procedure twice.
Chapter M
Matrices
We have made frequent use of matrices for solving systems of equations, and we have begun to investigate a few of their properties, such as the null space and nonsingularity. In this chapter, we will take a more systematic approach to the study of matrices.
Section MO
Matrix Operations
In this section we will back up and start simple. We begin with a definition of a totally general set of matrices, and see where that takes us.
Subsection MEASM
Matrix Equality, Addition, Scalar Multiplication
Definition VSM Vector Space of m × n Matrices
The vector space M_{mn} is the set of all m × n matrices with entries from the set of complex numbers.
□
Just as we made, and used, a careful definition of equality for column vectors, so too, we have precise definitions for matrices.
Definition ME Matrix Equality
The m × n matrices A and B are equal, written A = B, provided [A]_{ij} = [B]_{ij} for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.
□
So equality of matrices translates to the equality of complex numbers, on an entry-by-entry basis. Notice that we now have yet another definition that uses the symbol "=" for shorthand. Whenever a theorem has a conclusion saying two matrices
are equal (think about your objects), we will consider appealing to this definition as a way of formulating the top-level structure of the proof.
We will now define two operations on the set M_{mn}. Again, we will overload a symbol ('+') and a convention (juxtaposition for scalar multiplication).
Definition MA Matrix Addition
Given the m × n matrices A and B, define the sum of A and B as an m × n matrix, written A + B, according to

[A + B]_{ij} = [A]_{ij} + [B]_{ij} \qquad 1 ≤ i ≤ m,\ 1 ≤ j ≤ n

□
So matrix addition takes two matrices of the same size and combines them (in a natural way!) to create a new matrix of the same size. Perhaps this is the "obvious" thing to do, but it does not relieve us from the obligation to state it carefully.
Example MA Addition of two matrices in M_{23}
If

A = \begin{bmatrix} 2 & -3 & 4 \\ 1 & 0 & -7 \end{bmatrix} \qquad B = \begin{bmatrix} 6 & 2 & -4 \\ 3 & 5 & 2 \end{bmatrix}

then

A + B = \begin{bmatrix} 2 & -3 & 4 \\ 1 & 0 & -7 \end{bmatrix} + \begin{bmatrix} 6 & 2 & -4 \\ 3 & 5 & 2 \end{bmatrix} = \begin{bmatrix} 2+6 & -3+2 & 4+(-4) \\ 1+3 & 0+5 & -7+2 \end{bmatrix} = \begin{bmatrix} 8 & -1 & 0 \\ 4 & 5 & -5 \end{bmatrix}

△
Our second operation takes two objects of different types, specifically a number and a matrix, and combines them to create another matrix. As with vectors, in this context we call a number a scalar in order to emphasize that it is not a matrix.
Definition MSM Matrix Scalar Multiplication
Given the m × n matrix A and the scalar α ∈ C, the scalar multiple of A is an m × n matrix, written αA and defined according to

[αA]_{ij} = α [A]_{ij} \qquad 1 ≤ i ≤ m,\ 1 ≤ j ≤ n

□
Notice again that we have yet another kind of multiplication, and it is again written by putting two symbols side-by-side. Computationally, scalar matrix multiplication is very easy.
Example MSM Scalar multiplication in M_{32}
If

A = \begin{bmatrix} 2 & 8 \\ -3 & 5 \\ 0 & 1 \end{bmatrix}

and α = 7, then

αA = 7\begin{bmatrix} 2 & 8 \\ -3 & 5 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 7(2) & 7(8) \\ 7(-3) & 7(5) \\ 7(0) & 7(1) \end{bmatrix} = \begin{bmatrix} 14 & 56 \\ -21 & 35 \\ 0 & 7 \end{bmatrix}

△
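Both operations are entrywise, so they are one line each in code. A small Python sketch (illustrative only; `mat_add` and `scalar_mult` are our names, not the book's Sage code), using lists of lists as matrices:

```python
def mat_add(A, B):
    # Definition MA: entrywise sum; A and B must be the same size.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mult(alpha, A):
    # Definition MSM: multiply every entry by the scalar alpha.
    return [[alpha * a for a in row] for row in A]

A = [[2, -3, 4], [1, 0, -7]]
B = [[6, 2, -4], [3, 5, 2]]
print(mat_add(A, B))  # [[8, -1, 0], [4, 5, -5]], as in Example MA
print(scalar_mult(7, [[2, 8], [-3, 5], [0, 1]]))
# [[14, 56], [-21, 35], [0, 7]], as in Example MSM
```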
Subsection VSP
Vector Space Properties
With definitions of matrix addition and scalar multiplication we can now state, and prove, several properties of each operation, and some properties that involve their interplay. We now collect ten of them here for later reference.
Theorem VSPM Vector Space Properties of Matrices
Suppose that M_{mn} is the set of all m × n matrices (Definition VSM) with addition and scalar multiplication as defined in Definition MA and Definition MSM. Then
• ACM Additive Closure, Matrices
If A, B ∈ M_{mn}, then A + B ∈ M_{mn}.
• SCM Scalar Closure, Matrices
If α ∈ C and A ∈ M_{mn}, then αA ∈ M_{mn}.
• CM Commutativity, Matrices
If A, B ∈ M_{mn}, then A + B = B + A.
• AAM Additive Associativity, Matrices
If A, B, C ∈ M_{mn}, then A + (B + C) = (A + B) + C.
• ZM Zero Matrix, Matrices
There is a matrix, O, called the zero matrix, such that A + O = A for all A ∈ M_{mn}.
• AIM Additive Inverses, Matrices
If A ∈ M_{mn}, then there exists a matrix −A ∈ M_{mn} so that A + (−A) = O.
• SMAM Scalar Multiplication Associativity, Matrices
If α, β ∈ C and A ∈ M_{mn}, then α(βA) = (αβ)A.
• DMAM Distributivity across Matrix Addition, Matrices
If α ∈ C and A, B ∈ M_{mn}, then α(A + B) = αA + αB.
• DSAM Distributivity across Scalar Addition, Matrices
If α, β ∈ C and A ∈ M_{mn}, then (α + β)A = αA + βA.
• OM One, Matrices
If A ∈ M_{mn}, then 1A = A.
Proof. While some of these properties seem very obvious, they all require proof. However, the proofs are not very interesting, and border on tedious. We will prove one version of distributivity very carefully, and you can test your proof-building skills on some of the others. We will give our new notation for matrix entries a workout here. Compare the style of the proofs here with those given for vectors in Theorem VSPCV: while the objects here are more complicated, our notation makes the proofs cleaner.
To prove Property DSAM, (α + β)A = αA + βA, we need to establish the equality of two matrices (see Proof Technique GS). Definition ME says we need to establish the equality of their entries, one-by-one. How do we do this, when we do not even know how many entries the two matrices might have? This is where the notation for matrix entries, given in Definition M, comes into play. Ready? Here we go.
For any i and j, 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[(α + β)A]_{ij} = (α + β)[A]_{ij}    Definition MSM
= α[A]_{ij} + β[A]_{ij}    Distributivity in C
= [αA]_{ij} + [βA]_{ij}    Definition MSM
= [αA + βA]_{ij}    Definition MA

There are several things to notice here. (1) Each equals sign is an equality of scalars (numbers). (2) The two ends of the equation, being true for any i and j, allow us to conclude the equality of the matrices by Definition ME. (3) There are several plus signs, and several instances of juxtaposition. Identify each one, and state exactly what operation is being represented by each. ■
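Before trying the entrywise proof yourself, it can be reassuring to see Property DSAM hold on a concrete matrix. A quick Python check (illustrative only; not a proof, just one instance):

```python
def mat_add(A, B):
    # Definition MA: entrywise sum of two same-size matrices.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mult(alpha, A):
    # Definition MSM: multiply every entry by alpha.
    return [[alpha * a for a in row] for row in A]

A = [[2, -3, 4], [1, 0, -7]]
alpha, beta = 3, -5
# Property DSAM: (alpha + beta)A and alpha*A + beta*A agree entry by entry.
lhs = scalar_mult(alpha + beta, A)
rhs = mat_add(scalar_mult(alpha, A), scalar_mult(beta, A))
print(lhs == rhs)  # True
```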
For now, note the similarities between Theorem VSPM about matrices and Theorem VSPCV about vectors.
The zero matrix described in this theorem, O, is what you would expect: a matrix full of zeros.
Definition ZM Zero Matrix
The m × n zero matrix is written as O = O_{m×n} and defined by [O]_{ij} = 0, for all 1 ≤ i ≤ m, 1 ≤ j ≤ n. □
Subsection TSM
Transposes and Symmetric Matrices
We describe one more common operation we can perform on matrices. Informally, to transpose a matrix is to build a new matrix by swapping its rows and columns.
Definition TM Transpose of a Matrix
Given an m × n matrix A, its transpose is the n × m matrix A^t given by

[A^t]_{ij} = [A]_{ji}, \qquad 1 ≤ i ≤ n,\ 1 ≤ j ≤ m.

□
Example TM Transpose of a 3 × 4 matrix<br />
Suppose<br />
[ 3 7 2<br />
] −3<br />
D = −1 4 2 8 .<br />
0 3 −2 5<br />
We could formulate the transpose, entry-by-entry, us<strong>in</strong>g the def<strong>in</strong>ition. But it is<br />
easier to just systematically rewrite rows as columns (or vice-versa). The form of<br />
the def<strong>in</strong>ition given will be more useful <strong>in</strong> proofs. So we have<br />
⎡<br />
⎤<br />
3 −1 0<br />
7 4 3 ⎥<br />
D t =<br />
⎢<br />
⎣<br />
2 2 −2<br />
−3 8 5<br />
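Rewriting rows as columns is a one-line computation. Here is a minimal Python sketch (the `transpose` helper is our own name, not the text's) that reproduces D^t:

```python
def transpose(A):
    # Rows of A become columns of A^t (Definition TM).
    return [list(col) for col in zip(*A)]

D = [[3, 7, 2, -3], [-1, 4, 2, 8], [0, 3, -2, 5]]
Dt = transpose(D)
assert Dt == [[3, -1, 0], [7, 4, 3], [2, 2, -2], [-3, 8, 5]]
```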
It will sometimes happen that a matrix is equal to its transpose. In this case, we will call the matrix symmetric. These matrices occur naturally in certain situations, and also have some nice properties, so it is worth stating the definition carefully. Informally a matrix is symmetric if we can "flip" it about the main diagonal (upper-left corner, running down to the lower-right corner) and have it look unchanged.

Definition SYM Symmetric Matrix
The matrix A is symmetric if A = A^t. □

Example SYM A symmetric 5 × 5 matrix
The matrix

E = \begin{bmatrix} 2 & 3 & -9 & 5 & 7 \\ 3 & 1 & 6 & -2 & -3 \\ -9 & 6 & 0 & -1 & 9 \\ 5 & -2 & -1 & 4 & -8 \\ 7 & -3 & 9 & -8 & -3 \end{bmatrix}

is symmetric. △
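Testing symmetry follows directly from Definition SYM. A hedged Python sketch (helper names are ours) confirming that the matrix E above is symmetric:

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    # Definition SYM: A is symmetric exactly when A equals its transpose.
    return A == transpose(A)

E = [[2, 3, -9, 5, 7],
     [3, 1, 6, -2, -3],
     [-9, 6, 0, -1, 9],
     [5, -2, -1, 4, -8],
     [7, -3, 9, -8, -3]]
assert is_symmetric(E)
assert not is_symmetric([[1, 2, 3], [4, 5, 6]])  # not even square
```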
You might have noticed that Definition SYM did not specify the size of the matrix
A, as has been our custom. That is because it was not necessary. An alternative would have been to state the definition just for square matrices, but this is the substance of the next proof.

Before reading the next proof, we want to offer you some advice about how to become more proficient at constructing proofs. Perhaps you can apply this advice to the next theorem. Have a peek at Proof Technique P now.
Theorem SMS Symmetric Matrices are Square
Suppose that A is a symmetric matrix. Then A is square.

Proof. We start by specifying A's size, without assuming it is square, since we are trying to prove that, so we cannot also assume it. Suppose A is an m × n matrix. Because A is symmetric, we know by Definition SYM that A = A^t. So, in particular, Definition ME requires that A and A^t must have the same size. The size of A^t is n × m. Because A has m rows and A^t has n rows, we conclude that m = n, and hence A must be square by Definition SQM. ■
We finish this section with three easy theorems, but they illustrate the interplay of our three new operations, our new notation, and the techniques used to prove matrix equalities.

Theorem TMA Transpose and Matrix Addition
Suppose that A and B are m × n matrices. Then (A + B)^t = A^t + B^t.

Proof. The statement to be proved is an equality of matrices, so we work entry-by-entry and use Definition ME. Think carefully about the objects involved here, and the many uses of the plus sign. Realize too that while A and B are m × n matrices, the conclusion is a statement about the equality of two n × m matrices. So we begin with: for 1 ≤ i ≤ n, 1 ≤ j ≤ m,

[(A + B)^t]_{ij}
  = [A + B]_{ji}                  Definition TM
  = [A]_{ji} + [B]_{ji}           Definition MA
  = [A^t]_{ij} + [B^t]_{ij}       Definition TM
  = [A^t + B^t]_{ij}              Definition MA

Since the matrices (A + B)^t and A^t + B^t agree at each entry, Definition ME tells us the two matrices are equal. ■
Theorem TMSM Transpose and Matrix Scalar Multiplication
Suppose that α ∈ C and A is an m × n matrix. Then (\alpha A)^t = \alpha A^t.

Proof. The statement to be proved is an equality of matrices, so we work entry-by-entry and use Definition ME. Notice that the desired equality is of n × m matrices, and think carefully about the objects involved here, plus the many uses of juxtaposition. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[(\alpha A)^t]_{ji}
  = [\alpha A]_{ij}               Definition TM
  = \alpha[A]_{ij}                Definition MSM
  = \alpha[A^t]_{ji}              Definition TM
  = [\alpha A^t]_{ji}             Definition MSM

Since the matrices (\alpha A)^t and \alpha A^t agree at each entry, Definition ME tells us the two matrices are equal. ■
Theorem TT Transpose of a Transpose
Suppose that A is an m × n matrix. Then (A^t)^t = A.

Proof. We again want to prove an equality of matrices, so we work entry-by-entry and use Definition ME. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[(A^t)^t]_{ij}
  = [A^t]_{ji}                    Definition TM
  = [A]_{ij}                      Definition TM

Since the matrices (A^t)^t and A agree at each entry, Definition ME tells us the two matrices are equal. ■
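These three transpose identities can be spot-checked numerically. A Python sketch, assuming our own entry-wise helpers (not notation from the text):

```python
def transpose(A):
    # Definition TM: rows become columns.
    return [list(col) for col in zip(*A)]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def msm(alpha, A):
    return [[alpha * entry for entry in row] for row in A]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, -1, 2], [7, 3, -5]]
alpha = 4

assert transpose(madd(A, B)) == madd(transpose(A), transpose(B))  # Theorem TMA
assert transpose(msm(alpha, A)) == msm(alpha, transpose(A))       # Theorem TMSM
assert transpose(transpose(A)) == A                               # Theorem TT
```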
Subsection MCC
Matrices and Complex Conjugation

As we did with vectors (Definition CCCV), we can define what it means to take the conjugate of a matrix.

Definition CCM Complex Conjugate of a Matrix
Suppose A is an m × n matrix. Then the conjugate of A, written \overline{A}, is an m × n matrix defined by

[\overline{A}]_{ij} = \overline{[A]_{ij}} □
Example CCM Complex conjugate of a matrix
If

A = \begin{bmatrix} 2-i & 3 & 5+4i \\ -3+6i & 2-3i & 0 \end{bmatrix}

then

\overline{A} = \begin{bmatrix} 2+i & 3 & 5-4i \\ -3-6i & 2+3i & 0 \end{bmatrix}

△
The interplay between the conjugate of a matrix and the two operations on matrices is what you might expect.

Theorem CRMA Conjugation Respects Matrix Addition
Suppose that A and B are m × n matrices. Then \overline{A + B} = \overline{A} + \overline{B}.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[\overline{A + B}]_{ij}
  = \overline{[A + B]_{ij}}                           Definition CCM
  = \overline{[A]_{ij} + [B]_{ij}}                    Definition MA
  = \overline{[A]_{ij}} + \overline{[B]_{ij}}         Theorem CCRA
  = [\overline{A}]_{ij} + [\overline{B}]_{ij}         Definition CCM
  = [\overline{A} + \overline{B}]_{ij}                Definition MA

Since the matrices \overline{A + B} and \overline{A} + \overline{B} are equal in each entry, Definition ME says that \overline{A + B} = \overline{A} + \overline{B}. ■
Theorem CRMSM Conjugation Respects Matrix Scalar Multiplication
Suppose that α ∈ C and A is an m × n matrix. Then \overline{\alpha A} = \overline{\alpha}\,\overline{A}.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[\overline{\alpha A}]_{ij}
  = \overline{[\alpha A]_{ij}}                        Definition CCM
  = \overline{\alpha[A]_{ij}}                         Definition MSM
  = \overline{\alpha}\,\overline{[A]_{ij}}            Theorem CCRM
  = \overline{\alpha}\,[\overline{A}]_{ij}            Definition CCM
  = [\overline{\alpha}\,\overline{A}]_{ij}            Definition MSM

Since the matrices \overline{\alpha A} and \overline{\alpha}\,\overline{A} are equal in each entry, Definition ME says that \overline{\alpha A} = \overline{\alpha}\,\overline{A}. ■
Theorem CCM Conjugate of the Conjugate of a Matrix
Suppose that A is an m × n matrix. Then \overline{\overline{A}} = A.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[\overline{\overline{A}}]_{ij}
  = \overline{[\overline{A}]_{ij}}                    Definition CCM
  = \overline{\overline{[A]_{ij}}}                    Definition CCM
  = [A]_{ij}                                          Theorem CCT

Since the matrices \overline{\overline{A}} and A are equal in each entry, Definition ME says that \overline{\overline{A}} = A. ■
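Python's built-in complex numbers make these conjugation theorems easy to check on examples. A sketch (all helper names are our own, not the text's):

```python
def conjugate(A):
    # Definition CCM: conjugate each entry.
    return [[entry.conjugate() for entry in row] for row in A]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def msm(alpha, A):
    return [[alpha * entry for entry in row] for row in A]

A = [[2 - 1j, 3, 5 + 4j], [-3 + 6j, 2 - 3j, 0]]
B = [[1 + 1j, -2j, 4], [0, 3 + 3j, 1 - 1j]]
alpha = 2 + 5j

assert conjugate(madd(A, B)) == madd(conjugate(A), conjugate(B))        # Theorem CRMA
assert conjugate(msm(alpha, A)) == msm(alpha.conjugate(), conjugate(A)) # Theorem CRMSM
assert conjugate(conjugate(A)) == A                                     # Theorem CCM
```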
Finally, we will need the following result about matrix conjugation and transposes later.

Theorem MCT Matrix Conjugation and Transposes
Suppose that A is an m × n matrix. Then \overline{A^t} = \left(\overline{A}\right)^t.

Proof. For 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[\overline{A^t}]_{ji}
  = \overline{[A^t]_{ji}}                             Definition CCM
  = \overline{[A]_{ij}}                               Definition TM
  = [\overline{A}]_{ij}                               Definition CCM
  = [(\overline{A})^t]_{ji}                           Definition TM

Since the matrices \overline{A^t} and (\overline{A})^t are equal in each entry, Definition ME says that \overline{A^t} = (\overline{A})^t. ■
Subsection AM
Adjoint of a Matrix

The combination of transposing and conjugating a matrix will be important in subsequent sections, such as Subsection MINM.UM and Section OD. We make a key definition here and prove some basic results in the same spirit as those above.

Definition A Adjoint
If A is a matrix, then its adjoint is A^* = \left(\overline{A}\right)^t. □

You will see the adjoint written elsewhere variously as A^H, A^* or A^†. Notice that Theorem MCT says it does not really matter if we conjugate and then transpose, or transpose and then conjugate.

Theorem AMA Adjoint and Matrix Addition
Suppose A and B are matrices of the same size. Then (A + B)^* = A^* + B^*.
Proof.

(A + B)^*
  = \left(\overline{A + B}\right)^t                               Definition A
  = \left(\overline{A} + \overline{B}\right)^t                    Theorem CRMA
  = \left(\overline{A}\right)^t + \left(\overline{B}\right)^t     Theorem TMA
  = A^* + B^*                                                     Definition A
■
Theorem AMSM Adjoint and Matrix Scalar Multiplication
Suppose α ∈ C is a scalar and A is a matrix. Then (\alpha A)^* = \overline{\alpha}A^*.

Proof.

(\alpha A)^*
  = \left(\overline{\alpha A}\right)^t                            Definition A
  = \left(\overline{\alpha}\,\overline{A}\right)^t                Theorem CRMSM
  = \overline{\alpha}\left(\overline{A}\right)^t                  Theorem TMSM
  = \overline{\alpha}A^*                                          Definition A
■
Theorem AA Adjoint of an Adjoint
Suppose that A is a matrix. Then (A^*)^* = A.

Proof.

(A^*)^*
  = \left(\overline{(A^*)}\right)^t                               Definition A
  = \left(\overline{\left(\overline{A}\right)^t}\right)^t         Definition A
  = \left(\left(\overline{\overline{A}}\right)^t\right)^t         Theorem MCT
  = \overline{\overline{A}}                                       Theorem TT
  = A                                                             Theorem CCM
■
Take note of how the theorems in this section, while simple, build on earlier theorems and definitions and never descend to the level of entry-by-entry proofs based on Definition ME. In other words, the equal signs that appear in the previous proofs are equalities of matrices, not scalars (which is the opposite of a proof like that of Theorem TMA).
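The adjoint results above can likewise be verified on a small complex matrix. A hedged Python sketch (all helper names are ours, not the text's):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def conjugate(A):
    return [[entry.conjugate() for entry in row] for row in A]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def msm(alpha, A):
    return [[alpha * entry for entry in row] for row in A]

def adjoint(A):
    # Definition A: A^* is the transpose of the conjugate.
    return transpose(conjugate(A))

A = [[2 - 1j, 3, 5 + 4j], [-3 + 6j, 2 - 3j, 0]]
B = [[1j, 2, 0], [1, 1 - 1j, 4]]
alpha = 2 + 5j

assert conjugate(transpose(A)) == transpose(conjugate(A))           # Theorem MCT
assert adjoint(madd(A, B)) == madd(adjoint(A), adjoint(B))          # Theorem AMA
assert adjoint(msm(alpha, A)) == msm(alpha.conjugate(), adjoint(A)) # Theorem AMSM
assert adjoint(adjoint(A)) == A                                     # Theorem AA
```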
Reading Questions

1. Perform the following matrix computation.

6 \begin{bmatrix} 2 & -2 & 8 & 1 \\ 4 & 5 & -1 & 3 \\ 7 & -3 & 0 & 2 \end{bmatrix} + (-2) \begin{bmatrix} 2 & 7 & 1 & 2 \\ 3 & -1 & 0 & 5 \\ 1 & 7 & 3 & 3 \end{bmatrix}
2. Theorem VSPM reminds you of what previous theorem? How strong is the similarity?

3. Compute the transpose of the matrix below.

\begin{bmatrix} 6 & 8 & 4 \\ -2 & 1 & 0 \\ 9 & -5 & 6 \end{bmatrix}
Exercises

C10† Let

A = \begin{bmatrix} 1 & 4 & -3 \\ 6 & 3 & 0 \end{bmatrix},  B = \begin{bmatrix} 3 & 2 & 1 \\ -2 & -6 & 5 \end{bmatrix},  C = \begin{bmatrix} 2 & 4 \\ 4 & 0 \\ -2 & 2 \end{bmatrix}.

Let α = 4 and β = 1/2. Perform the following calculations: (1) A + B, (2) A + C, (3) B^t + C, (4) A + B^t, (5) βC, (6) 4A − 3B, (7) A^t + αC, (8) A + B − C^t, (9) 4A + 2B − 5C^t.
C11† Solve the given vector equation for x, or explain why no solution exists:

2 \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 2 \end{bmatrix} - 3 \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & x \end{bmatrix} = \begin{bmatrix} -1 & 1 & 0 \\ 0 & 5 & -2 \end{bmatrix}
C12† Solve the given matrix equation for α, or explain why no solution exists:

\alpha \begin{bmatrix} 1 & 3 & 4 \\ 2 & 1 & -1 \end{bmatrix} + \begin{bmatrix} 4 & 3 & -6 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 7 & 12 & 6 \\ 6 & 4 & -2 \end{bmatrix}

C13† Solve the given matrix equation for α, or explain why no solution exists:

\alpha \begin{bmatrix} 3 & 1 \\ 2 & 0 \\ 1 & 4 \end{bmatrix} - \begin{bmatrix} 4 & 1 \\ 3 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & -2 \\ 2 & 6 \end{bmatrix}

C14† Find α and β that solve the following equation:

\alpha \begin{bmatrix} 1 & 2 \\ 4 & 1 \end{bmatrix} + \beta \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 4 \\ 6 & 1 \end{bmatrix}
In Chapter V we defined the operations of vector addition and vector scalar multiplication in Definition CVA and Definition CVSM. These two operations formed the underpinnings of the remainder of the chapter. We have now defined similar operations for matrices in Definition MA and Definition MSM. You will have noticed the resulting similarities between Theorem VSPCV and Theorem VSPM.

In Exercises M20–M25, you will be asked to extend these similarities to other fundamental definitions and concepts we first saw in Chapter V. This sequence of problems was suggested by Martin Jackson.

M20 Suppose S = {B_1, B_2, B_3, ..., B_p} is a set of matrices from M_{mn}. Formulate appropriate definitions for the following terms and give an example of the use of each.

1. A linear combination of elements of S.
2. A relation of linear dependence on S, both trivial and nontrivial.
3. S is a linearly independent set.
4. ⟨S⟩.

M21† Show that the set S is linearly independent in M_{22}.

S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}
M22† Determine if the set S below is linearly independent in M_{23}.

S = \left\{ \begin{bmatrix} -2 & 3 & 4 \\ -1 & 3 & -2 \end{bmatrix}, \begin{bmatrix} 4 & -2 & 2 \\ 0 & -1 & 1 \end{bmatrix}, \begin{bmatrix} -1 & -2 & -2 \\ 2 & 2 & 2 \end{bmatrix}, \begin{bmatrix} -1 & 1 & 0 \\ -1 & 0 & -2 \end{bmatrix}, \begin{bmatrix} -1 & 2 & -2 \\ 0 & -1 & -2 \end{bmatrix} \right\}

M23† Determine if the matrix A is in the span of S. In other words, is A ∈ ⟨S⟩? If so, write A as a linear combination of the elements of S.

A = \begin{bmatrix} -13 & 24 & 2 \\ -8 & -2 & -20 \end{bmatrix}

S = \left\{ \begin{bmatrix} -2 & 3 & 4 \\ -1 & 3 & -2 \end{bmatrix}, \begin{bmatrix} 4 & -2 & 2 \\ 0 & -1 & 1 \end{bmatrix}, \begin{bmatrix} -1 & -2 & -2 \\ 2 & 2 & 2 \end{bmatrix}, \begin{bmatrix} -1 & 1 & 0 \\ -1 & 0 & -2 \end{bmatrix}, \begin{bmatrix} -1 & 2 & -2 \\ 0 & -1 & -2 \end{bmatrix} \right\}
M24† Suppose Y is the set of all 3 × 3 symmetric matrices (Definition SYM). Find a set T so that T is linearly independent and ⟨T⟩ = Y.

M25 Define a subset of M_{33} by

U_{33} = \left\{ A \in M_{33} \mid [A]_{ij} = 0 \text{ whenever } i > j \right\}

Find a set R so that R is linearly independent and ⟨R⟩ = U_{33}.
T13† Prove Property CM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

T14 Prove Property AAM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

T17 Prove Property SMAM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.

T18 Prove Property DMAM of Theorem VSPM. Write your proof in the style of the proof of Property DSAM given in this section.
A matrix A is skew-symmetric if A^t = −A. Exercises T30–T37 employ this definition.

T30 Prove that a skew-symmetric matrix is square. (Hint: study the proof of Theorem SMS.)

T31 Prove that a skew-symmetric matrix must have zeros for its diagonal elements. In other words, if A is skew-symmetric of size n, then [A]_{ii} = 0 for 1 ≤ i ≤ n. (Hint: carefully construct an example of a 3 × 3 skew-symmetric matrix before attempting a proof.)

T32 Prove that a matrix A is both skew-symmetric and symmetric if and only if A is the zero matrix. (Hint: one half of this proof is very easy, the other half takes a little more work.)
T33 Suppose A and B are both skew-symmetric matrices of the same size and α, β ∈ C. Prove that αA + βB is a skew-symmetric matrix.

T34 Suppose A is a square matrix. Prove that A + A^t is a symmetric matrix.

T35 Suppose A is a square matrix. Prove that A − A^t is a skew-symmetric matrix.

T36 Suppose A is a square matrix. Prove that there is a symmetric matrix B and a skew-symmetric matrix C such that A = B + C. In other words, any square matrix can be decomposed into a symmetric matrix and a skew-symmetric matrix (Proof Technique DC). (Hint: consider building a proof on Exercise MO.T34 and Exercise MO.T35.)

T37 Prove that the decomposition in Exercise MO.T36 is unique (see Proof Technique U). (Hint: a proof can turn on Exercise MO.T31.)
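As a preview of Exercises T34–T36, the symmetric/skew-symmetric decomposition of a square matrix is directly computable. A hedged Python sketch (helper names are ours; exact rational arithmetic via `fractions` avoids rounding):

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def msm(alpha, A):
    return [[alpha * entry for entry in row] for row in A]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
half = Fraction(1, 2)

B = msm(half, madd(A, transpose(A)))           # symmetric part (Exercise T34)
C = msm(half, madd(A, msm(-1, transpose(A))))  # skew-symmetric part (Exercise T35)

assert B == transpose(B)            # B is symmetric
assert C == msm(-1, transpose(C))   # C is skew-symmetric
assert madd(B, C) == A              # A = B + C (Exercise T36)
```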
Section MM
Matrix Multiplication

We know how to add vectors and how to multiply them by scalars. Together, these operations give us the possibility of making linear combinations. Similarly, we know how to add matrices and how to multiply matrices by scalars. In this section we mix all these ideas together and produce an operation known as "matrix multiplication." This will lead to some results that are both surprising and central. We begin with a definition of how to multiply a vector by a matrix.
Subsection MVP
Matrix-Vector Product

We have repeatedly seen the importance of forming linear combinations of the columns of a matrix. As one example of this, the oft-used Theorem SLSLC said that every solution to a system of linear equations gives rise to a linear combination of the column vectors of the coefficient matrix that equals the vector of constants. This theorem, and others, motivate the following central definition.
Definition MVP Matrix-Vector Product
Suppose A is an m × n matrix with columns A_1, A_2, A_3, ..., A_n and u is a vector of size n. Then the matrix-vector product of A with u is the linear combination

Au = [u]_1 A_1 + [u]_2 A_2 + [u]_3 A_3 + \cdots + [u]_n A_n □
So, the matrix-vector product is yet another version of "multiplication," at least in the sense that we have yet again overloaded juxtaposition of two symbols as our notation. Remember your objects: an m × n matrix times a vector of size n will create a vector of size m. So if A is rectangular, then the size of the vector changes. With all the linear combinations we have performed so far, this computation should now seem second nature.
Example MTV A matrix times a vector
Consider

A = \begin{bmatrix} 1 & 4 & 2 & 3 & 4 \\ -3 & 2 & 0 & 1 & -2 \\ 1 & 6 & -3 & -1 & 5 \end{bmatrix}    u = \begin{bmatrix} 2 \\ 1 \\ -2 \\ 3 \\ -1 \end{bmatrix}

Then

Au = 2\begin{bmatrix} 1 \\ -3 \\ 1 \end{bmatrix} + 1\begin{bmatrix} 4 \\ 2 \\ 6 \end{bmatrix} + (-2)\begin{bmatrix} 2 \\ 0 \\ -3 \end{bmatrix} + 3\begin{bmatrix} 3 \\ 1 \\ -1 \end{bmatrix} + (-1)\begin{bmatrix} 4 \\ -2 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \\ 6 \end{bmatrix}.

△

§MM Beezer: A First Course in Linear Algebra 176
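Definition MVP is directly computable as a weighted sum of columns. A Python sketch (the `mvp` helper is our own name, not the text's) reproducing the result of Example MTV:

```python
def mvp(A, u):
    # Au = [u]_1 A_1 + [u]_2 A_2 + ... + [u]_n A_n (Definition MVP):
    # a linear combination of the columns of A, weighted by the entries of u.
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        for i in range(m):
            result[i] += u[j] * A[i][j]
    return result

A = [[1, 4, 2, 3, 4],
     [-3, 2, 0, 1, -2],
     [1, 6, -3, -1, 5]]
u = [2, 1, -2, 3, -1]
assert mvp(A, u) == [7, 1, 6]   # matches Example MTV
```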
We can now represent systems of linear equations compactly with a matrix-vector product (Definition MVP) and column vector equality (Definition CVE). This finally yields a very popular alternative to our unconventional LS(A, b) notation.

Theorem SLEMM Systems of Linear Equations as Matrix Multiplication
The set of solutions to the linear system LS(A, b) equals the set of solutions for x in the vector equation Ax = b.

Proof. This theorem says that two sets (of solutions) are equal. So we need to show that one set of solutions is a subset of the other, and vice versa (Definition SE). Let A_1, A_2, A_3, ..., A_n be the columns of A. Both of these set inclusions then follow from the following chain of equivalences (Proof Technique E):

x is a solution to LS(A, b)
  ⟺ [x]_1 A_1 + [x]_2 A_2 + [x]_3 A_3 + \cdots + [x]_n A_n = b    Theorem SLSLC
  ⟺ x is a solution to Ax = b                                       Definition MVP
■
Example MNSLE Matrix notation for systems of linear equations
Consider the system of linear equations from Example NSLE.

2x_1 + 4x_2 - 3x_3 + 5x_4 + x_5 = 9
3x_1 + x_2 + x_4 - 3x_5 = 0
-2x_1 + 7x_2 - 5x_3 + 2x_4 + 2x_5 = -3

This system has coefficient matrix and vector of constants

A = \begin{bmatrix} 2 & 4 & -3 & 5 & 1 \\ 3 & 1 & 0 & 1 & -3 \\ -2 & 7 & -5 & 2 & 2 \end{bmatrix}    b = \begin{bmatrix} 9 \\ 0 \\ -3 \end{bmatrix}

and so will be described compactly by the vector equation Ax = b. △
The matrix-vector product is a very natural computation. We have motivated it by its connections with systems of equations, but here is another example.

Example MBC Money's best cities
Every year Money magazine selects several cities in the United States as the "best" cities to live in, based on a wide array of statistics about each city. This is an example of how the editors of Money might arrive at a single number that consolidates the statistics about a city. We will analyze Los Angeles, Chicago and New York City, based on four criteria: average high temperature in July (Fahrenheit), number of colleges and universities in a 30-mile radius, number of toxic waste sites in the Superfund environmental clean-up program and a personal crime index based on FBI
statistics (average = 100, smaller is safer). It should be apparent how to generalize the example to a greater number of cities and a greater number of statistics.

We begin by building a table of statistics. The rows will be labeled with the cities, and the columns with statistical categories. These values are from Money's website in early 2005.

City          Temp  Colleges  Superfund  Crime
Los Angeles    77      28        93       254
Chicago        84      38        85       363
New York       84      99         1       193

Conceivably these data might reside in a spreadsheet. Now we must combine the statistics for each city. We could accomplish this by weighting each category, scaling the values and summing them. The sizes of the weights would depend upon the numerical size of each statistic generally, but more importantly, they would reflect the editors' opinions or beliefs about which statistics were most important to their readers. Is the crime index more important than the number of colleges and universities? Of course, there is no right answer to this question.

Suppose the editors finally decide on the following weights to employ: temperature, 0.23; colleges, 0.46; Superfund, −0.05; crime, −0.20. Notice how negative weights are used for undesirable statistics. Then, for example, the editors would compute for Los Angeles,

(0.23)(77) + (0.46)(28) + (−0.05)(93) + (−0.20)(254) = −24.86
This computation might remind you of an inner product, but we will produce the computations for all of the cities as a matrix-vector product. Write the table of raw statistics as a matrix

T = \begin{bmatrix} 77 & 28 & 93 & 254 \\ 84 & 38 & 85 & 363 \\ 84 & 99 & 1 & 193 \end{bmatrix}

and the weights as a vector

w = \begin{bmatrix} 0.23 \\ 0.46 \\ -0.05 \\ -0.20 \end{bmatrix}

then the matrix-vector product (Definition MVP) yields

Tw = (0.23)\begin{bmatrix} 77 \\ 84 \\ 84 \end{bmatrix} + (0.46)\begin{bmatrix} 28 \\ 38 \\ 99 \end{bmatrix} + (-0.05)\begin{bmatrix} 93 \\ 85 \\ 1 \end{bmatrix} + (-0.20)\begin{bmatrix} 254 \\ 363 \\ 193 \end{bmatrix} = \begin{bmatrix} -24.86 \\ -40.05 \\ 26.21 \end{bmatrix}
This vector contains a single number for each of the cities being studied, so the editors would rank New York best (26.21), Los Angeles next (−24.86), and Chicago third (−40.05). Of course, the mayor's offices in Chicago and Los Angeles are free to counter with a different set of weights that cause their city to be ranked best. These alternative weights would be chosen to play to each city's strengths, and minimize its problem areas.

If a spreadsheet were used to make these computations, a row of weights would be entered somewhere near the table of data and the formulas in the spreadsheet would effect a matrix-vector product. This example is meant to illustrate how "linear" computations (addition, multiplication) can be organized as a matrix-vector product. Another example would be the matrix of numerical scores on examinations and exercises for students in a class. The rows would be indexed by students and the columns would be indexed by exams and assignments. The instructor could then assign weights to the different exams and assignments, and via a matrix-vector product, compute a single score for each student. △
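The editors' computation is a single matrix-vector product in code. A sketch (the `mvp` helper name is ours) that reproduces the three city scores, rounding to absorb floating-point error:

```python
def mvp(A, u):
    # Linear combination of the columns of A weighted by u (Definition MVP).
    return [sum(u[j] * A[i][j] for j in range(len(u))) for i in range(len(A))]

T = [[77, 28, 93, 254],   # Los Angeles
     [84, 38, 85, 363],   # Chicago
     [84, 99, 1, 193]]    # New York
w = [0.23, 0.46, -0.05, -0.20]

scores = [round(s, 2) for s in mvp(T, w)]
assert scores == [-24.86, -40.05, 26.21]
```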
Later (much later) we will need the following theorem, which is really a technical lemma (see Proof Technique LC). Since we are in a position to prove it now, we will. But you can safely skip it for the moment, if you promise to come back later to study the proof when the theorem is employed. At that point you will also be able to understand the comments in the paragraph following the proof.

Theorem EMMVP Equal Matrices and Matrix-Vector Products
Suppose that A and B are m × n matrices such that Ax = Bx for every x ∈ C^n. Then A = B.
Proof. We are assuming Ax = Bx for all x ∈ C^n, so we can employ this equality for any choice of the vector x. However, we will limit our use of this equality to the standard unit vectors, e_j, 1 ≤ j ≤ n (Definition SUV). For all 1 ≤ j ≤ n, 1 ≤ i ≤ m,

[A]_{ij}
  = 0[A]_{i1} + \cdots + 0[A]_{i,j-1} + 1[A]_{ij} + 0[A]_{i,j+1} + \cdots + 0[A]_{in}       Theorem PCNA
  = [e_j]_1 [A]_{i1} + [e_j]_2 [A]_{i2} + [e_j]_3 [A]_{i3} + \cdots + [e_j]_n [A]_{in}      Definition SUV
  = [A]_{i1} [e_j]_1 + [A]_{i2} [e_j]_2 + [A]_{i3} [e_j]_3 + \cdots + [A]_{in} [e_j]_n      Property CMCN
  = [Ae_j]_i                                                                                 Definition MVP
  = [Be_j]_i                                                                                 Hypothesis
  = [B]_{i1} [e_j]_1 + [B]_{i2} [e_j]_2 + [B]_{i3} [e_j]_3 + \cdots + [B]_{in} [e_j]_n      Definition MVP
  = [e_j]_1 [B]_{i1} + [e_j]_2 [B]_{i2} + [e_j]_3 [B]_{i3} + \cdots + [e_j]_n [B]_{in}      Property CMCN
  = 0[B]_{i1} + \cdots + 0[B]_{i,j-1} + 1[B]_{ij} + 0[B]_{i,j+1} + \cdots + 0[B]_{in}       Definition SUV
  = [B]_{ij}                                                                                 Theorem PCNA

So by Definition ME the matrices A and B are equal, as desired. ■
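The heart of the proof, that Ae_j is exactly column j of A, is easy to confirm numerically. A sketch with our own `mvp` helper (an assumption, not the text's notation):

```python
def mvp(A, u):
    # Matrix-vector product as a linear combination of columns (Definition MVP).
    return [sum(u[j] * A[i][j] for j in range(len(u))) for i in range(len(A))]

A = [[1, 2, -1], [0, -4, 1]]
n = 3

for j in range(n):
    e_j = [1 if k == j else 0 for k in range(n)]   # standard unit vector (Definition SUV)
    column_j = [A[i][j] for i in range(len(A))]
    assert mvp(A, e_j) == column_j                 # [Ae_j]_i = [A]_ij
```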
You might notice from studying the proof that the hypotheses of this theorem could be "weakened" (i.e. made less restrictive). We need only suppose the equality of the matrix-vector products for just the standard unit vectors (Definition SUV) or any other spanning set (Definition SSVS) of C^n (Exercise LISS.T40). However, in practice, when we apply this theorem the stronger hypothesis will be in effect, so this version of the theorem will suffice for our purposes. (If we changed the statement of the theorem to have the less restrictive hypothesis, then we would call the theorem "stronger.")
Subsection MM
Matrix Multiplication

We now define how to multiply two matrices together. Stop for a minute and think about how you might define this new operation.

Many books would present this definition much earlier in the course. However, we have taken great care to delay it as long as possible and to present as many ideas as practical based mostly on the notion of linear combinations. Towards the conclusion of the course, or when you perhaps take a second course in linear algebra, you may be in a position to appreciate the reasons for this. For now, understand that matrix multiplication is a central definition and perhaps you will appreciate its importance more by having saved it for later.
Definition MM Matrix Multiplication
Suppose A is an m × n matrix and B_1, B_2, B_3, ..., B_p are the columns of an n × p matrix B. Then the matrix product of A with B is the m × p matrix where column i is the matrix-vector product AB_i. Symbolically,

AB = A[B_1|B_2|B_3|\ldots|B_p] = [AB_1|AB_2|AB_3|\ldots|AB_p].
□
Example PTM Product of two matrices
Set

A = \begin{bmatrix} 1 & 2 & -1 & 4 & 6 \\ 0 & -4 & 1 & 2 & 3 \\ -5 & 1 & 2 & -3 & 4 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 6 & 2 & 1 \\ -1 & 4 & 3 & 2 \\ 1 & 1 & 2 & 3 \\ 6 & 4 & -1 & 2 \\ 1 & -2 & 3 & 0 \end{bmatrix}

Then

AB = \left[\, A\begin{bmatrix}1\\-1\\1\\6\\1\end{bmatrix} \,\middle|\, A\begin{bmatrix}6\\4\\1\\4\\-2\end{bmatrix} \,\middle|\, A\begin{bmatrix}2\\3\\2\\-1\\3\end{bmatrix} \,\middle|\, A\begin{bmatrix}1\\2\\3\\2\\0\end{bmatrix} \,\right]
= \begin{bmatrix} 28 & 17 & 20 & 10 \\ 20 & -13 & -3 & -1 \\ -18 & -44 & 12 & -3 \end{bmatrix}.
△
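Definition MM translates directly into code. The sketch below (plain Python, not part of the text) builds AB one column at a time, each column being a matrix-vector product of A with a column of B, and reproduces the result of Example PTM.

```python
# A sketch of Definition MM: column j of AB is the matrix-vector product
# of A with column j of B.

def mat_vec(A, v):
    """Matrix-vector product Av, computed row by row."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def mat_mul(A, B):
    """Definition MM: assemble AB from the products A * (column of B)."""
    p = len(B[0])
    cols = [mat_vec(A, [row[j] for row in B]) for j in range(p)]
    # Reassemble the p columns into a row-major m x p matrix.
    return [[cols[j][i] for j in range(p)] for i in range(len(A))]

A = [[1, 2, -1, 4, 6], [0, -4, 1, 2, 3], [-5, 1, 2, -3, 4]]
B = [[1, 6, 2, 1], [-1, 4, 3, 2], [1, 1, 2, 3], [6, 4, -1, 2], [1, -2, 3, 0]]
print(mat_mul(A, B))  # [[28, 17, 20, 10], [20, -13, -3, -1], [-18, -44, 12, -3]]
```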
§MM Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 180<br />
Is this the definition of matrix multiplication you expected? Perhaps our previous operations for matrices caused you to think that we might multiply two matrices of the same size, entry-by-entry? Notice that our current definition uses matrices of different sizes (though the number of columns in the first must equal the number of rows in the second), and the result is of a third size. Notice too in the previous example that we cannot even consider the product BA, since the sizes of the two matrices in this order are not right.

But it gets weirder than that. Many of your old ideas about "multiplication" will not apply to matrix multiplication, but some still will. So make no assumptions, and do not do anything until you have a theorem that says you can. Even if the sizes are right, matrix multiplication is not commutative; order matters.
Example MMNC Matrix multiplication is not commutative
Set

A = \begin{bmatrix} 1 & 3 \\ -1 & 2 \end{bmatrix}
\qquad
B = \begin{bmatrix} 4 & 0 \\ 5 & 1 \end{bmatrix}.

Then we have two square, 2 × 2 matrices, so Definition MM allows us to multiply them in either order. We find

AB = \begin{bmatrix} 19 & 3 \\ 6 & 2 \end{bmatrix}
\qquad
BA = \begin{bmatrix} 4 & 12 \\ 4 & 17 \end{bmatrix}

and AB ≠ BA. Not even close. It should not be hard for you to construct other pairs of matrices that do not commute (try a couple of 3 × 3's). Can you find a pair of non-identical matrices that do commute?
△
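The failure of commutativity in Example MMNC is easy to confirm numerically. This small check (plain Python, not part of the text) multiplies the same pair of matrices in both orders.

```python
# Multiply the matrices of Example MMNC in both orders.

def mat_mul(A, B):
    """Entry-by-entry matrix product: [AB]_ij = sum_k A_ik * B_kj."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 3], [-1, 2]]
B = [[4, 0], [5, 1]]
AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)  # [[19, 3], [6, 2]]
print(BA)  # [[4, 12], [4, 17]]
print(AB != BA)  # True: order matters
```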
Subsection MMEE
Matrix Multiplication, Entry-by-Entry

While certain "natural" properties of multiplication do not hold, many more do. In the next subsection, we will state and prove the relevant theorems. But first, we need a theorem that provides an alternate means of multiplying two matrices. In many texts, this would be given as the definition of matrix multiplication. We prefer to turn it around and have the following formula as a consequence of our definition. It will prove useful for proofs of matrix equality, where we need to examine products of matrices, entry-by-entry.
Theorem EMP Entries of Matrix Products
Suppose A is an m × n matrix and B is an n × p matrix. Then for 1 ≤ i ≤ m, 1 ≤ j ≤ p, the individual entries of AB are given by

[AB]_{ij} = [A]_{i1}[B]_{1j} + [A]_{i2}[B]_{2j} + [A]_{i3}[B]_{3j} + \cdots + [A]_{in}[B]_{nj}
          = \sum_{k=1}^{n} [A]_{ik}[B]_{kj}
Proof. Let the vectors A_1, A_2, A_3, ..., A_n denote the columns of A and let the vectors B_1, B_2, B_3, ..., B_p denote the columns of B. Then for 1 ≤ i ≤ m, 1 ≤ j ≤ p,

[AB]_{ij} = [AB_j]_i                                                              Definition MM
          = \left[ [B_j]_1 A_1 + [B_j]_2 A_2 + \cdots + [B_j]_n A_n \right]_i     Definition MVP
          = \left[ [B_j]_1 A_1 \right]_i + \left[ [B_j]_2 A_2 \right]_i + \cdots + \left[ [B_j]_n A_n \right]_i    Definition CVA
          = [B_j]_1 [A_1]_i + [B_j]_2 [A_2]_i + \cdots + [B_j]_n [A_n]_i          Definition CVSM
          = [B]_{1j}[A]_{i1} + [B]_{2j}[A]_{i2} + \cdots + [B]_{nj}[A]_{in}       Definition M
          = [A]_{i1}[B]_{1j} + [A]_{i2}[B]_{2j} + \cdots + [A]_{in}[B]_{nj}       Property CMCN
          = \sum_{k=1}^{n} [A]_{ik}[B]_{kj}
■
Example PTMEE Product of two matrices, entry-by-entry
Consider again the two matrices from Example PTM

A = \begin{bmatrix} 1 & 2 & -1 & 4 & 6 \\ 0 & -4 & 1 & 2 & 3 \\ -5 & 1 & 2 & -3 & 4 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 6 & 2 & 1 \\ -1 & 4 & 3 & 2 \\ 1 & 1 & 2 & 3 \\ 6 & 4 & -1 & 2 \\ 1 & -2 & 3 & 0 \end{bmatrix}
Then suppose we just wanted the entry of AB in the second row, third column:

[AB]_{23} = [A]_{21}[B]_{13} + [A]_{22}[B]_{23} + [A]_{23}[B]_{33} + [A]_{24}[B]_{43} + [A]_{25}[B]_{53}
          = (0)(2) + (-4)(3) + (1)(2) + (2)(-1) + (3)(3) = -3
Notice how there are 5 terms in the sum, since 5 is the common dimension of the two matrices (column count for A, row count for B). In the conclusion of Theorem EMP, it would be the index k that would run from 1 to 5 in this computation. Here is a bit more practice.
The entry of third row, first column:

[AB]_{31} = [A]_{31}[B]_{11} + [A]_{32}[B]_{21} + [A]_{33}[B]_{31} + [A]_{34}[B]_{41} + [A]_{35}[B]_{51}
          = (-5)(1) + (1)(-1) + (2)(1) + (-3)(6) + (4)(1) = -18
To get some more practice on your own, complete the computation of the other 10 entries of this product. Construct some other pairs of matrices (of compatible sizes) and compute their product two ways. First use Definition MM. Since linear combinations are straightforward for you now, this should be easy to do and to do correctly. Then do it again, using Theorem EMP. Since this process may take some practice, use your first computation to check your work.
△
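Theorem EMP is itself a one-line program. The sketch below (plain Python, not part of the text) computes single entries with the sum \sum_k [A]_{ik}[B]_{kj} and reproduces the two entries worked out in Example PTMEE.

```python
def entry(A, B, i, j):
    """[AB]_ij by Theorem EMP, with 1-based indices i and j as in the text."""
    n = len(B)  # the common dimension: columns of A, rows of B
    return sum(A[i - 1][k] * B[k][j - 1] for k in range(n))

A = [[1, 2, -1, 4, 6], [0, -4, 1, 2, 3], [-5, 1, 2, -3, 4]]
B = [[1, 6, 2, 1], [-1, 4, 3, 2], [1, 1, 2, 3], [6, 4, -1, 2], [1, -2, 3, 0]]
print(entry(A, B, 2, 3))  # -3
print(entry(A, B, 3, 1))  # -18
```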
Theorem EMP is the way many people compute matrix products by hand. It
will also be very useful for the theorems we are going to prove shortly. However, the definition (Definition MM) is frequently the most useful for its connections with deeper ideas like the null space and the upcoming column space.
Subsection PMM<br />
Properties of Matrix Multiplication<br />
In this subsection, we collect properties of matrix multiplication and its <strong>in</strong>teraction<br />
with the zero matrix (Def<strong>in</strong>ition ZM), the identity matrix (Def<strong>in</strong>ition IM), matrix<br />
addition (Def<strong>in</strong>ition MA), scalar matrix multiplication (Def<strong>in</strong>ition MSM), the <strong>in</strong>ner<br />
product (Def<strong>in</strong>ition IP), conjugation (Theorem MMCC), and the transpose (Def<strong>in</strong>ition<br />
TM). Whew! Here we go. These are great proofs to practice with, so try to<br />
concoct the proofs before read<strong>in</strong>g them, they will get progressively more complicated<br />
as we go.<br />
Theorem MMZM Matrix Multiplication and the Zero Matrix
Suppose A is an m × n matrix. Then
1. A O_{n×p} = O_{m×p}
2. O_{p×m} A = O_{p×n}

Proof. We will prove (1) and leave (2) to you. Entry-by-entry, for 1 ≤ i ≤ m, 1 ≤ j ≤ p,

[A O_{n×p}]_{ij} = \sum_{k=1}^{n} [A]_{ik} [O_{n×p}]_{kj}     Theorem EMP
                 = \sum_{k=1}^{n} [A]_{ik}\, 0                Definition ZM
                 = \sum_{k=1}^{n} 0
                 = 0                                          Property ZCN
                 = [O_{m×p}]_{ij}                             Definition ZM

So by the definition of matrix equality (Definition ME), the matrices A O_{n×p} and O_{m×p} are equal.  ■
Theorem MMIM Matrix Multiplication and Identity Matrix
Suppose A is an m × n matrix. Then
1. A I_n = A
2. I_m A = A
Proof. Again, we will prove (1) and leave (2) to you. Entry-by-entry, for 1 ≤ i ≤ m, 1 ≤ j ≤ n,

[A I_n]_{ij} = \sum_{k=1}^{n} [A]_{ik} [I_n]_{kj}                             Theorem EMP
             = [A]_{ij} [I_n]_{jj} + \sum_{k=1,\,k\neq j}^{n} [A]_{ik} [I_n]_{kj}    Property CACN
             = [A]_{ij}(1) + \sum_{k=1,\,k\neq j}^{n} [A]_{ik}(0)             Definition IM
             = [A]_{ij} + \sum_{k=1,\,k\neq j}^{n} 0
             = [A]_{ij}

So the matrices A and A I_n are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.  ■
It is this theorem that gives the identity matrix its name. It is a matrix that<br />
behaves with matrix multiplication like the scalar 1 does with scalar multiplication.<br />
To multiply by the identity matrix is to have no effect on the other matrix.<br />
Theorem MMDAA Matrix Multiplication Distributes Across Addition
Suppose A is an m × n matrix and B and C are n × p matrices and D is a p × s matrix. Then
1. A(B + C) = AB + AC
2. (B + C)D = BD + CD
Proof. We will do (1), you do (2). Entry-by-entry, for 1 ≤ i ≤ m, 1 ≤ j ≤ p,

[A(B + C)]_{ij} = \sum_{k=1}^{n} [A]_{ik} [B + C]_{kj}                                    Theorem EMP
                = \sum_{k=1}^{n} [A]_{ik} \left( [B]_{kj} + [C]_{kj} \right)              Definition MA
                = \sum_{k=1}^{n} [A]_{ik}[B]_{kj} + [A]_{ik}[C]_{kj}                      Property DCN
                = \sum_{k=1}^{n} [A]_{ik}[B]_{kj} + \sum_{k=1}^{n} [A]_{ik}[C]_{kj}       Property CACN
                = [AB]_{ij} + [AC]_{ij}                                                   Theorem EMP
                = [AB + AC]_{ij}                                                          Definition MA

So the matrices A(B + C) and AB + AC are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.  ■
Theorem MMSMM Matrix Multiplication and Scalar Matrix Multiplication
Suppose A is an m × n matrix and B is an n × p matrix. Let α be a scalar. Then α(AB) = (αA)B = A(αB).
Proof. These are equalities of matrices. We will do the first one; the second is similar and will be good practice for you. For 1 ≤ i ≤ m, 1 ≤ j ≤ p,

[α(AB)]_{ij} = α [AB]_{ij}                                Definition MSM
             = α \sum_{k=1}^{n} [A]_{ik}[B]_{kj}          Theorem EMP
             = \sum_{k=1}^{n} α [A]_{ik}[B]_{kj}          Property DCN
             = \sum_{k=1}^{n} [αA]_{ik}[B]_{kj}           Definition MSM
             = [(αA)B]_{ij}                               Theorem EMP

So the matrices α(AB) and (αA)B are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.  ■
Theorem MMA Matrix Multiplication is Associative
Suppose A is an m × n matrix, B is an n × p matrix and D is a p × s matrix. Then A(BD) = (AB)D.
Proof. A matrix equality, so we will go entry-by-entry, no surprise there. For 1 ≤ i ≤ m, 1 ≤ j ≤ s,

[A(BD)]_{ij} = \sum_{k=1}^{n} [A]_{ik} [BD]_{kj}                                        Theorem EMP
             = \sum_{k=1}^{n} [A]_{ik} \left( \sum_{l=1}^{p} [B]_{kl}[D]_{lj} \right)   Theorem EMP
             = \sum_{k=1}^{n} \sum_{l=1}^{p} [A]_{ik}[B]_{kl}[D]_{lj}                   Property DCN
We can switch the order of the summation since these are finite sums,

             = \sum_{l=1}^{p} \sum_{k=1}^{n} [A]_{ik}[B]_{kl}[D]_{lj}                   Property CACN

As [D]_{lj} does not depend on the index k, we can use distributivity to move it outside of the inner sum,

             = \sum_{l=1}^{p} [D]_{lj} \left( \sum_{k=1}^{n} [A]_{ik}[B]_{kl} \right)   Property DCN
             = \sum_{l=1}^{p} [D]_{lj} [AB]_{il}                                        Theorem EMP
             = \sum_{l=1}^{p} [AB]_{il} [D]_{lj}                                        Property CMCN
             = [(AB)D]_{ij}                                                             Theorem EMP

So the matrices (AB)D and A(BD) are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.  ■
Since Theorem MMA says matrix multiplication is associative, it means we do not have to be careful about the order in which we perform matrix multiplication, nor how we parenthesize an expression with just several matrices multiplied together. So this is where we draw the line on explaining every last detail in a proof. We will frequently add, remove, or rearrange parentheses with no comment. Indeed, I only see about a dozen places where Theorem MMA is cited in a proof. You could try to count how many times we avoid making a reference to this theorem.
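A quick numerical sanity check of Theorem MMA (plain Python, not part of the text; the matrix sizes and random seed are arbitrary choices): the two parenthesizations of a triple product agree.

```python
import random

def mat_mul(A, B):
    """Entry-by-entry matrix product (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

random.seed(0)  # reproducible example
def rand_mat(r, c):
    return [[random.randint(-5, 5) for _ in range(c)] for _ in range(r)]

# Sizes m x n, n x p, p x s, as in the statement of Theorem MMA.
A, B, D = rand_mat(2, 3), rand_mat(3, 4), rand_mat(4, 2)
print(mat_mul(A, mat_mul(B, D)) == mat_mul(mat_mul(A, B), D))  # True
```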
The statement of our next theorem is technically inaccurate. If we upgrade the vectors u, v to matrices with a single column, then the expression \overline{u}^t v is a 1 × 1 matrix, though we will treat this small matrix as if it were simply the scalar quantity in its lone entry. When we apply Theorem MMIP there should not be any confusion. Notice that if we treat a column vector as a matrix with a single column, then we can also construct the adjoint of a vector, though we will not make this a common practice.
Theorem MMIP Matrix Multiplication and Inner Products
If we consider the vectors u, v ∈ C^m as m × 1 matrices then

⟨u, v⟩ = \overline{u}^t v = u^* v

Proof.

⟨u, v⟩ = \sum_{k=1}^{m} \overline{[u]_k}\, [v]_k              Definition IP
       = \sum_{k=1}^{m} \overline{[u]_{k1}}\, [v]_{k1}        Column vectors as matrices
       = \sum_{k=1}^{m} [\overline{u}]_{k1} [v]_{k1}          Definition CCM
       = \sum_{k=1}^{m} [\overline{u}^t]_{1k} [v]_{k1}        Definition TM
       = \left[ \overline{u}^t v \right]_{11}                 Theorem EMP

To finish we just blur the distinction between a 1 × 1 matrix (\overline{u}^t v) and its lone entry.  ■
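Theorem MMIP can be checked on a small complex example (plain Python, not part of the text; the vectors are arbitrary choices): the inner product of Definition IP equals the lone entry of the 1 × 1 matrix product \overline{u}^t v.

```python
def mat_mul(A, B):
    """Entry-by-entry matrix product (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

u = [2 + 1j, -1j, 3]
v = [1 - 1j, 4, 2 + 2j]
ip = sum(a.conjugate() * b for a, b in zip(u, v))  # Definition IP
u_bar_t = [[a.conjugate() for a in u]]             # conjugate-transpose: 1 x m
v_col = [[b] for b in v]                           # v viewed as an m x 1 matrix
print(ip == mat_mul(u_bar_t, v_col)[0][0])  # True
```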
Theorem MMCC Matrix Multiplication and Complex Conjugation
Suppose A is an m × n matrix and B is an n × p matrix. Then \overline{AB} = \overline{A}\,\overline{B}.

Proof. To obtain this matrix equality, we will work entry-by-entry. For 1 ≤ i ≤ m, 1 ≤ j ≤ p,

\left[ \overline{AB} \right]_{ij} = \overline{[AB]_{ij}}                                  Definition CCM
                                  = \overline{\sum_{k=1}^{n} [A]_{ik}[B]_{kj}}            Theorem EMP
                                  = \sum_{k=1}^{n} \overline{[A]_{ik}[B]_{kj}}            Theorem CCRA
                                  = \sum_{k=1}^{n} \overline{[A]_{ik}}\,\overline{[B]_{kj}}   Theorem CCRM
                                  = \sum_{k=1}^{n} [\overline{A}]_{ik} [\overline{B}]_{kj}    Definition CCM
                                  = \left[ \overline{A}\,\overline{B} \right]_{ij}        Theorem EMP

So the matrices \overline{AB} and \overline{A}\,\overline{B} are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.  ■
Another theorem in this style, and it is a good one. If you have been practicing with the previous proofs you should be able to do this one yourself.

Theorem MMT Matrix Multiplication and Transposes
Suppose A is an m × n matrix and B is an n × p matrix. Then (AB)^t = B^t A^t.
Proof. This theorem may be surprising but if we check the sizes of the matrices involved, then maybe it will not seem so far-fetched. First, AB has size m × p, so its transpose has size p × m. The product of B^t with A^t is a p × n matrix times an n × m matrix, also resulting in a p × m matrix. So at least our objects are compatible for equality (and would not be, in general, if we did not reverse the order of the matrix multiplication).
Here we go again, entry-by-entry. For 1 ≤ i ≤ m, 1 ≤ j ≤ p,

\left[ (AB)^t \right]_{ji} = [AB]_{ij}                             Definition TM
                           = \sum_{k=1}^{n} [A]_{ik}[B]_{kj}       Theorem EMP
                           = \sum_{k=1}^{n} [B]_{kj}[A]_{ik}       Property CMCN
                           = \sum_{k=1}^{n} [B^t]_{jk} [A^t]_{ki}  Definition TM
                           = \left[ B^t A^t \right]_{ji}           Theorem EMP

So the matrices (AB)^t and B^t A^t are equal, entry-by-entry, and by the definition of matrix equality (Definition ME) we can say they are equal matrices.  ■
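Theorem MMT is also easy to confirm on a small example (plain Python, not part of the text; the matrices are arbitrary choices with compatible sizes): (AB)^t equals B^t A^t, with the order of the factors reversed.

```python
def mat_mul(A, B):
    """Entry-by-entry matrix product (Theorem EMP)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1, 2, 3], [4, 5, 6]]     # 2 x 3
B = [[1, 0], [2, -1], [0, 3]]  # 3 x 2
lhs = transpose(mat_mul(A, B))              # (AB)^t, a 2 x 2 matrix
rhs = mat_mul(transpose(B), transpose(A))   # B^t A^t, also 2 x 2
print(lhs == rhs)  # True
```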
This theorem seems odd at first glance, since we have to switch the order of A and B. But if we simply consider the sizes of the matrices involved, we can see that the switch is necessary for this reason alone. That the individual entries of the products then come along to be equal is a bonus.

As the adjoint of a matrix is a composition of a conjugate and a transpose, its interaction with matrix multiplication is similar to that of a transpose. Here is the last of our long list of basic properties of matrix multiplication.
Theorem MMAD Matrix Multiplication and Adjoints
Suppose A is an m × n matrix and B is an n × p matrix. Then (AB)^* = B^* A^*.

Proof.

(AB)^* = \left( \overline{AB} \right)^t                 Definition A
       = \left( \overline{A}\,\overline{B} \right)^t    Theorem MMCC
       = \overline{B}^t\, \overline{A}^t                Theorem MMT
       = B^* A^*                                        Definition A
■
Notice how none of these proofs above relied on writing out huge general matrices with lots of ellipses ("...") and trying to formulate the equalities a whole matrix at
a time. This messy business is a "proof technique" to be avoided at all costs. Notice too how the proof of Theorem MMAD does not use an entry-by-entry approach, but simply builds on previous results about matrix multiplication's interaction with conjugation and transposes.

These theorems, along with Theorem VSPM and the other results in Section MO, give you the "rules" for how matrices interact with the various operations we have defined on matrices (addition, scalar multiplication, matrix multiplication, conjugation, transposes and adjoints). Use them and use them often. But do not try to do anything with a matrix that you do not have a rule for. Together, we would informally call all these operations, and the attendant theorems, "the algebra of matrices." Notice, too, that every column vector is just an n × 1 matrix, so these theorems apply to column vectors also. Finally, these results, taken as a whole, may make us feel that the definition of matrix multiplication is not so unnatural.
Subsection HM
Hermitian Matrices

The adjoint of a matrix has a basic property when employed in a matrix-vector product as part of an inner product. At this point, you could even use the following result as a motivation for the definition of an adjoint.
Theorem AIP Adjoint and Inner Product
Suppose that A is an m × n matrix and x ∈ C^n, y ∈ C^m. Then ⟨Ax, y⟩ = ⟨x, A^* y⟩.

Proof.

⟨Ax, y⟩ = \left( \overline{Ax} \right)^t y                  Theorem MMIP
        = \left( \overline{A}\,\overline{x} \right)^t y     Theorem MMCC
        = \overline{x}^t\, \overline{A}^t y                 Theorem MMT
        = \overline{x}^t \left( A^* y \right)               Definition A
        = ⟨x, A^* y⟩                                        Theorem MMIP
■
Sometimes a matrix is equal to its adjoint (Definition A), and these matrices have interesting properties. One of the most common situations where this occurs is when a matrix has only real number entries. Then we are simply talking about symmetric matrices (Definition SYM), so you can view this as a generalization of a symmetric matrix.

Definition HM Hermitian Matrix
The square matrix A is Hermitian (or self-adjoint) if A = A^*.
□
Again, the set of real matrices that are Hermitian is exactly the set of symmetric matrices. In Section PEE we will uncover some amazing properties of Hermitian matrices, so when you get there, run back here to remind yourself of this definition. Further properties will also appear in Section OD. Right now we prove a fundamental result about Hermitian matrices, matrix-vector products and inner products. As a characterization, this could be employed as a definition of a Hermitian matrix and some authors take this approach.

Theorem HMIP Hermitian Matrices and Inner Products
Suppose that A is a square matrix of size n. Then A is Hermitian if and only if ⟨Ax, y⟩ = ⟨x, Ay⟩ for all x, y ∈ C^n.
Proof. (⇒) This is the "easy half" of the proof, and makes the rationale for a definition of Hermitian matrices most obvious. Assume A is Hermitian,

⟨Ax, y⟩ = ⟨x, A^* y⟩    Theorem AIP
        = ⟨x, Ay⟩       Definition HM

(⇐) This "half" will take a bit more work. Assume that ⟨Ax, y⟩ = ⟨x, Ay⟩ for all x, y ∈ C^n. We show that A = A^* by establishing that Ax = A^* x for all x, so we can then apply Theorem EMMVP. With only this much motivation, consider the inner product for any x ∈ C^n.

⟨Ax − A^* x, Ax − A^* x⟩ = ⟨Ax − A^* x, Ax⟩ − ⟨Ax − A^* x, A^* x⟩     Theorem IPVA
                         = ⟨Ax − A^* x, Ax⟩ − ⟨A(Ax − A^* x), x⟩      Theorem AIP
                         = ⟨Ax − A^* x, Ax⟩ − ⟨Ax − A^* x, Ax⟩        Hypothesis
                         = 0                                          Property AICN

Because this first inner product equals zero, and has the same vector in each argument (Ax − A^* x), Theorem PIP gives the conclusion that Ax − A^* x = 0. With Ax = A^* x for all x ∈ C^n, Theorem EMMVP says A = A^*, which is the defining property of a Hermitian matrix (Definition HM).
■
So, <strong>in</strong>formally, Hermitian matrices are those that can be tossed around from one<br />
side of an <strong>in</strong>ner product to the other with reckless abandon. We will see later what<br />
this buys us.<br />
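A concrete example makes the "tossing" visible (plain Python, not part of the text; the matrix and vectors are arbitrary choices): the matrix below equals its adjoint, and ⟨Ax, y⟩ = ⟨x, Ay⟩ holds for it, as Theorem HMIP promises.

```python
def adjoint(M):
    """Conjugate-transpose of M (Definition A)."""
    return [[M[j][i].conjugate() for j in range(len(M))]
            for i in range(len(M[0]))]

def mat_vec(M, x):
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]

def ip(u, v):
    """Definition IP: conjugate the entries of the first vector."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

A = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]  # a Hermitian matrix: A = A^*
x = [1 + 2j, -1 + 0j]
y = [3j, 2 + 0j]
print(adjoint(A) == A)                               # True
print(ip(mat_vec(A, x), y) == ip(x, mat_vec(A, y)))  # True
```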
Reading Questions

1. Form the matrix-vector product of
\begin{bmatrix} 2 & 3 & -1 & 0 \\ 1 & -2 & 7 & 3 \\ 1 & 5 & 3 & 2 \end{bmatrix}
with
\begin{bmatrix} 2 \\ -3 \\ 0 \\ 5 \end{bmatrix}
2. Multiply together the two matrices below (in the order given).
\begin{bmatrix} 2 & 3 & -1 & 0 \\ 1 & -2 & 7 & 3 \\ 1 & 5 & 3 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 6 \\ -3 & -4 \\ 0 & 2 \\ 3 & -1 \end{bmatrix}

3. Rewrite the system of linear equations below as a vector equality and using a matrix-vector product. (This question does not ask for a solution to the system. But it does ask you to express the system of equations in a new form using tools from this section.)

2x_1 + 3x_2 − x_3 = 0
x_1 + 2x_2 + x_3 = 3
x_1 + 3x_2 + 3x_3 = 7

Exercises
C20† Compute the product of the two matrices below, AB. Do this using the definitions of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

A = \begin{bmatrix} 2 & 5 \\ -1 & 3 \\ 2 & -2 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 5 & -3 & 4 \\ 2 & 0 & 2 & -3 \end{bmatrix}
C21† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

A = \begin{bmatrix} 1 & 3 & 2 \\ -1 & 2 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\qquad
B = \begin{bmatrix} 4 & 1 & 2 \\ 1 & 0 & 1 \\ 3 & 1 & 5 \end{bmatrix}
C22† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

A = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}
\qquad
B = \begin{bmatrix} 2 & 3 \\ 4 & 6 \end{bmatrix}
C23† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

A = \begin{bmatrix} 3 & 1 \\ 2 & 4 \\ 6 & 5 \\ 1 & 2 \end{bmatrix}
\qquad
B = \begin{bmatrix} -3 & 1 \\ 4 & 2 \end{bmatrix}
C24† Compute the product AB of the two matrices below.

A = \begin{bmatrix} 1 & 2 & 3 & -2 \\ 0 & 1 & -2 & -1 \\ 1 & 1 & 3 & 1 \end{bmatrix}
\qquad
B = \begin{bmatrix} 3 \\ 4 \\ 0 \\ 2 \end{bmatrix}

C25† Compute the product AB of the two matrices below.

A = \begin{bmatrix} 1 & 2 & 3 & -2 \\ 0 & 1 & -2 & -1 \\ 1 & 1 & 3 & 1 \end{bmatrix}
\qquad
B = \begin{bmatrix} -7 \\ 3 \\ 1 \\ 1 \end{bmatrix}
C26† Compute the product AB of the two matrices below using both the definition of the matrix-vector product (Definition MVP) and the definition of matrix multiplication (Definition MM).

A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 1 & 0 \\ 1 & 1 & 2 \end{bmatrix}
\qquad
B = \begin{bmatrix} 2 & -5 & -1 \\ 0 & 1 & 0 \\ -1 & 2 & 1 \end{bmatrix}
C30† For the matrix A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

C31† For the matrix A = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

C32† For the matrix A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.

C33† For the matrix A = \begin{bmatrix} 0 & 1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, find A^2, A^3, A^4. Find a general formula for A^n for any positive integer n.
T10† Suppose that A is a square matrix and there is a vector, b, such that LS(A, b) has a unique solution. Prove that A is nonsingular. Give a direct proof (perhaps appealing to Theorem PSPHS) rather than just negating a sentence from the text discussing a similar situation.

T12 The conclusion of Theorem AIP is ⟨Ax, y⟩ = ⟨x, A^* y⟩. Use the same hypotheses, and prove the similar conclusion: ⟨x, Ay⟩ = ⟨A^* x, y⟩. Two different approaches can be based on an application of Theorem AIP. The first uses Theorem AA, while a second uses Theorem IPAC. Can you provide two proofs?

T20 Prove the second part of Theorem MMZM.

T21 Prove the second part of Theorem MMIM.

T22 Prove the second part of Theorem MMDAA.
T23† Prove the second part of Theorem MMSMM.

T31 Suppose that A is an m × n matrix and x, y ∈ N(A). Prove that x + y ∈ N(A).

T32 Suppose that A is an m × n matrix, α ∈ C, and x ∈ N(A). Prove that αx ∈ N(A).

T35 Suppose that A is an n × n matrix. Prove that A^* A and A A^* are Hermitian matrices.

T40† Suppose that A is an m × n matrix and B is an n × p matrix. Prove that the null space of B is a subset of the null space of AB, that is N(B) ⊆ N(AB). Provide an example where the opposite is false, in other words give an example where N(AB) ⊈ N(B).

T41† Suppose that A is an n × n nonsingular matrix and B is an n × p matrix. Prove that the null space of B is equal to the null space of AB, that is N(B) = N(AB). (Compare with Exercise MM.T40.)

T50 Suppose u and v are any two solutions of the linear system LS(A, b). Prove that u − v is an element of the null space of A, that is, u − v ∈ N(A).

T51† Give a new proof of Theorem PSPHS replacing applications of Theorem SLSLC with matrix-vector products (Theorem SLEMM).

T52† Suppose that x, y ∈ C^n, b ∈ C^m and A is an m × n matrix. If x, y and x + y are each a solution to the linear system LS(A, b), what can you say that is interesting about b? Form an implication with the existence of the three solutions as the hypothesis and an interesting statement about LS(A, b) as the conclusion, and then give a proof.
Section MISLE

Matrix Inverses and Systems of Linear Equations

The inverse of a square matrix, and solutions to linear systems with square coefficient matrices, are intimately connected.

Subsection SI

Solutions and Inverses

We begin with a familiar example, performed in a novel way.

Example SABMI Solutions to Archetype B with a matrix inverse

Archetype B is the system of $m = 3$ linear equations in $n = 3$ variables,
\begin{align*}
-7x_1 - 6x_2 - 12x_3 &= -33\\
5x_1 + 5x_2 + 7x_3 &= 24\\
x_1 + 4x_3 &= 5
\end{align*}
By Theorem SLEMM we can represent this system of equations as
$$Ax = b$$
where
$$A = \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix} \qquad x = \begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix} \qquad b = \begin{bmatrix} -33\\ 24\\ 5 \end{bmatrix}$$
Now, entirely unmotivated, we define the $3 \times 3$ matrix $B$,
$$B = \begin{bmatrix} -10 & -12 & -9\\ \frac{13}{2} & 8 & \frac{11}{2}\\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix}$$
and note the remarkable fact that
$$BA = \begin{bmatrix} -10 & -12 & -9\\ \frac{13}{2} & 8 & \frac{11}{2}\\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix} \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}$$
Now apply this computation to the problem of solving the system of equations,
\begin{align*}
x &= I_3 x & &\text{Theorem MMIM}\\
&= (BA)x & &\text{Substitution}\\
&= B(Ax) & &\text{Theorem MMA}\\
&= Bb & &\text{Substitution}
\end{align*}
So we have
$$x = Bb = \begin{bmatrix} -10 & -12 & -9\\ \frac{13}{2} & 8 & \frac{11}{2}\\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix} \begin{bmatrix} -33\\ 24\\ 5 \end{bmatrix} = \begin{bmatrix} -3\\ 5\\ 2 \end{bmatrix}$$
So with the help of $B$ we have been able to determine a solution to the system represented by $Ax = b$ through judicious use of matrix multiplication. We know by Theorem NMUS that since the coefficient matrix in this example is nonsingular, there would be a unique solution, no matter what the choice of $b$. The derivation above amplifies this result, since we were forced to conclude that $x = Bb$ and the solution could not be anything else. You should notice that this argument would hold for any particular choice of $b$.

△
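The derivation in this example is easy to check by machine. The following is a minimal sketch using NumPy (an assumption of this aside; the text itself does no computation here), with the matrices of Example SABMI:

```python
import numpy as np

# Coefficient matrix and vector of constants from Archetype B
A = np.array([[-7., -6., -12.],
              [ 5.,  5.,   7.],
              [ 1.,  0.,   4.]])
b = np.array([-33., 24., 5.])

# The matrix B of the example; note that BA is the 3x3 identity matrix
B = np.array([[-10.0, -12.0, -9.0],
              [  6.5,   8.0,  5.5],
              [  2.5,   3.0,  2.5]])
assert np.allclose(B @ A, np.eye(3))

x = B @ b          # x = Bb, the unique solution
print(x)           # → [-3.  5.  2.]
```

The printed vector agrees with the solution $x = (-3, 5, 2)$ found above.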
The matrix $B$ of the previous example is called the inverse of $A$. When $A$ and $B$ are combined via matrix multiplication, the result is the identity matrix, which can be inserted "in front" of $x$ as the first step in finding the solution. This is entirely analogous to how we might solve a single linear equation like $3x = 12$.
$$x = 1x = \left(\frac{1}{3}(3)\right)x = \frac{1}{3}(3x) = \frac{1}{3}(12) = 4$$
Here we have obtained a solution by employing the "multiplicative inverse" of 3, $3^{-1} = \frac{1}{3}$. This works fine for any scalar multiple of $x$, except for zero, since zero does not have a multiplicative inverse. Consider separately the two linear equations,
$$0x = 12 \qquad\qquad 0x = 0$$
The first has no solutions, while the second has infinitely many solutions. For matrices, it is all just a little more complicated. Some matrices have inverses, some do not. And when a matrix does have an inverse, just how would we compute it? In other words, just where did that matrix $B$ in the last example come from? Are there other matrices that might have worked just as well?
Subsection IM

Inverse of a Matrix

Definition MI Matrix Inverse

Suppose $A$ and $B$ are square matrices of size $n$ such that $AB = I_n$ and $BA = I_n$. Then $A$ is invertible and $B$ is the inverse of $A$. In this situation, we write $B = A^{-1}$. □

Notice that if $B$ is the inverse of $A$, then we can just as easily say $A$ is the inverse of $B$, or $A$ and $B$ are inverses of each other.

Not every square matrix has an inverse. In Example SABMI the matrix $B$ is the inverse of the coefficient matrix of Archetype B. To see this it only remains to check that $AB = I_3$. What about Archetype A? It is an example of a square matrix without an inverse.
Example MWIAA A matrix without an inverse, Archetype A

Consider the coefficient matrix from Archetype A,
$$A = \begin{bmatrix} 1 & -1 & 2\\ 2 & 1 & 1\\ 1 & 1 & 0 \end{bmatrix}$$
Suppose that $A$ is invertible and does have an inverse, say $B$. Choose the vector of constants
$$b = \begin{bmatrix} 1\\ 3\\ 2 \end{bmatrix}$$
and consider the system of equations $\mathcal{LS}(A, b)$. Just as in Example SABMI, this vector equation would have the unique solution $x = Bb$.

However, the system $\mathcal{LS}(A, b)$ is inconsistent. Form the augmented matrix $[\,A \mid b\,]$ and row-reduce to
$$\begin{bmatrix} 1 & 0 & 1 & 0\\ 0 & 1 & -1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}$$
which allows us to recognize the inconsistency by Theorem RCLS.

So the assumption of $A$'s inverse leads to a logical inconsistency (the system cannot be both consistent and inconsistent), so our assumption is false. $A$ is not invertible.

It is possible this example is less than satisfying. Just where did that particular choice of the vector $b$ come from anyway? Stay tuned for an application of the future Theorem CSCS in Example CSAA.

△
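As a numerical aside (not part of the example's argument): a quantity we will meet formally later, the determinant of Chapter D, also detects this failure, since a matrix with zero determinant has no inverse. A sketch assuming NumPy:

```python
import numpy as np

# Coefficient matrix of Archetype A
A = np.array([[1., -1., 2.],
              [2.,  1., 1.],
              [1.,  1., 0.]])

# A singular matrix has determinant zero, and hence no inverse
d = np.linalg.det(A)
print(np.isclose(d, 0.0))   # → True
```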
Let us look at one more matrix inverse before we embark on a more systematic study.
Example MI Matrix inverse

Consider the matrices,
$$A = \begin{bmatrix} 1 & 2 & 1 & 2 & 1\\ -2 & -3 & 0 & -5 & -1\\ 1 & 1 & 0 & 2 & 1\\ -2 & -3 & -1 & -3 & -2\\ -1 & -3 & -1 & -3 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} -3 & 3 & 6 & -1 & -2\\ 0 & -2 & -5 & -1 & 1\\ 1 & 2 & 4 & 1 & -1\\ 1 & 0 & 1 & 1 & 0\\ 1 & -1 & -2 & 0 & 1 \end{bmatrix}$$
Then
$$AB = \begin{bmatrix} 1 & 2 & 1 & 2 & 1\\ -2 & -3 & 0 & -5 & -1\\ 1 & 1 & 0 & 2 & 1\\ -2 & -3 & -1 & -3 & -2\\ -1 & -3 & -1 & -3 & 1 \end{bmatrix} \begin{bmatrix} -3 & 3 & 6 & -1 & -2\\ 0 & -2 & -5 & -1 & 1\\ 1 & 2 & 4 & 1 & -1\\ 1 & 0 & 1 & 1 & 0\\ 1 & -1 & -2 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
and
$$BA = \begin{bmatrix} -3 & 3 & 6 & -1 & -2\\ 0 & -2 & -5 & -1 & 1\\ 1 & 2 & 4 & 1 & -1\\ 1 & 0 & 1 & 1 & 0\\ 1 & -1 & -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 & 2 & 1\\ -2 & -3 & 0 & -5 & -1\\ 1 & 1 & 0 & 2 & 1\\ -2 & -3 & -1 & -3 & -2\\ -1 & -3 & -1 & -3 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
so by Definition MI, we can say that $A$ is invertible and write $B = A^{-1}$.
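Both products in this example can be checked by machine rather than by hand. A minimal sketch (NumPy is an assumption of this aside, not something the text uses):

```python
import numpy as np

A = np.array([[ 1,  2,  1,  2,  1],
              [-2, -3,  0, -5, -1],
              [ 1,  1,  0,  2,  1],
              [-2, -3, -1, -3, -2],
              [-1, -3, -1, -3,  1]], dtype=float)

B = np.array([[-3,  3,  6, -1, -2],
              [ 0, -2, -5, -1,  1],
              [ 1,  2,  4,  1, -1],
              [ 1,  0,  1,  1,  0],
              [ 1, -1, -2,  0,  1]], dtype=float)

# Definition MI requires the product in both orders to equal the identity
print(np.allclose(A @ B, np.eye(5)))   # → True
print(np.allclose(B @ A, np.eye(5)))   # → True
```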
We will now concern ourselves less with whether or not an inverse of a matrix exists, but instead with how you can find one when it does exist. In Section MINM we will have some theorems that allow us to more quickly and easily determine just when a matrix is invertible.

Subsection CIM

Computing the Inverse of a Matrix

We have seen that the matrices from Archetype B and Archetype K both have inverses, but these inverse matrices have just dropped from the sky. How would we compute an inverse? And just when is a matrix invertible, and when is it not? Writing a putative inverse with $n^2$ unknowns and solving the resultant $n^2$ equations is one approach. Applying this approach to $2 \times 2$ matrices can get us somewhere, so just for fun, let us do it.
Theorem TTMI Two-by-Two Matrix Inverse

Suppose
$$A = \begin{bmatrix} a & b\\ c & d \end{bmatrix}$$
Then $A$ is invertible if and only if $ad - bc \neq 0$. When $A$ is invertible, then
$$A^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b\\ -c & a \end{bmatrix}$$

Proof. (⇐) Assume that $ad - bc \neq 0$. We will use the definition of the inverse of a matrix to establish that $A$ has an inverse (Definition MI). Note that if $ad - bc \neq 0$ then the displayed formula for $A^{-1}$ is legitimate since we are not dividing by zero.
Using this proposed formula for the inverse of $A$, we compute
$$AA^{-1} = \begin{bmatrix} a & b\\ c & d \end{bmatrix}\left(\frac{1}{ad - bc}\begin{bmatrix} d & -b\\ -c & a \end{bmatrix}\right) = \frac{1}{ad - bc}\begin{bmatrix} ad - bc & 0\\ 0 & ad - bc \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}$$
and
$$A^{-1}A = \frac{1}{ad - bc}\begin{bmatrix} d & -b\\ -c & a \end{bmatrix}\begin{bmatrix} a & b\\ c & d \end{bmatrix} = \frac{1}{ad - bc}\begin{bmatrix} ad - bc & 0\\ 0 & ad - bc \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}$$
By Definition MI this is sufficient to establish that $A$ is invertible, and that the expression for $A^{-1}$ is correct.
(⇒) Assume that $A$ is invertible, and proceed with a proof by contradiction (Proof Technique CD), by assuming also that $ad - bc = 0$. This translates to $ad = bc$. Let
$$B = \begin{bmatrix} e & f\\ g & h \end{bmatrix}$$
be a putative inverse of $A$. This means that
$$I_2 = AB = \begin{bmatrix} a & b\\ c & d \end{bmatrix}\begin{bmatrix} e & f\\ g & h \end{bmatrix} = \begin{bmatrix} ae + bg & af + bh\\ ce + dg & cf + dh \end{bmatrix}$$
Working on the matrices on the two ends of this equation, we will multiply the top row by $c$ and the bottom row by $a$.
$$\begin{bmatrix} c & 0\\ 0 & a \end{bmatrix} = \begin{bmatrix} ace + bcg & acf + bch\\ ace + adg & acf + adh \end{bmatrix}$$
We are assuming that $ad = bc$, so we can replace two occurrences of $ad$ by $bc$ in the bottom row of the right matrix.
$$\begin{bmatrix} c & 0\\ 0 & a \end{bmatrix} = \begin{bmatrix} ace + bcg & acf + bch\\ ace + bcg & acf + bch \end{bmatrix}$$
The matrix on the right now has two rows that are identical, and therefore the same must be true of the matrix on the left. Identical rows for the matrix on the left implies that $a = 0$ and $c = 0$.

With this information, the product $AB$ becomes
$$\begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} = I_2 = AB = \begin{bmatrix} ae + bg & af + bh\\ ce + dg & cf + dh \end{bmatrix} = \begin{bmatrix} bg & bh\\ dg & dh \end{bmatrix}$$
So $bg = dh = 1$ and thus $b, g, d, h$ are all nonzero. But then $bh$ and $dg$ (the "other corners") must also be nonzero, while the identity matrix requires them to be zero, so this is (finally) a contradiction. So our assumption was false and we see that $ad - bc \neq 0$ whenever $A$ has an inverse. ∎
There are several ways one could try to prove this theorem, and there is a continual temptation to divide by one of the eight entries involved ($a$ through $h$); but we can never be sure whether these numbers are zero or not. This could lead to an analysis by cases, which is messy, messy, messy. Note how the above proof never divides, but always multiplies, and how zero/nonzero considerations are handled. Pay attention to the expression $ad - bc$, as we will see it again in a while (Chapter D).
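The formula of Theorem TTMI is mechanical enough to implement directly. Here is a small sketch (Python with NumPy is an assumption of this aside, and the function name `ttmi_inverse` is ours, not the text's):

```python
import numpy as np

def ttmi_inverse(A):
    """Inverse of a 2x2 matrix by the formula of Theorem TTMI.

    Raises ValueError when ad - bc = 0, the case where no inverse exists.
    """
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c          # the quantity ad - bc from the theorem
    if det == 0:
        raise ValueError("ad - bc = 0, so A is not invertible")
    return (1.0 / det) * np.array([[d, -b], [-c, a]])

A = np.array([[1., 2.],
              [3., 4.]])         # ad - bc = 4 - 6 = -2, nonzero
Ainv = ttmi_inverse(A)
print(np.allclose(A @ Ainv, np.eye(2)))   # → True
```

Note that the implementation only multiplies by the single reciprocal $\frac{1}{ad-bc}$, mirroring the theorem's one nonzero condition.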
This theorem is cute, and it is nice to have a formula for the inverse, and a condition that tells us when we can use it. However, this approach becomes impractical for larger matrices, even though it is possible to demonstrate that, in theory, there is a general formula. (Think for a minute about extending this result to just $3 \times 3$ matrices. For starters, we need 18 letters!) Instead, we will work column-by-column. Let us first work an example that will motivate the main theorem and remove some of the previous mystery.
Example CMI Computing a matrix inverse

Consider the matrix defined in Example MI as,
$$A = \begin{bmatrix} 1 & 2 & 1 & 2 & 1\\ -2 & -3 & 0 & -5 & -1\\ 1 & 1 & 0 & 2 & 1\\ -2 & -3 & -1 & -3 & -2\\ -1 & -3 & -1 & -3 & 1 \end{bmatrix}$$
For its inverse, we desire a matrix $B$ so that $AB = I_5$. Emphasizing the structure of the columns and employing the definition of matrix multiplication (Definition MM),
\begin{align*}
AB &= I_5\\
A[B_1|B_2|B_3|B_4|B_5] &= [e_1|e_2|e_3|e_4|e_5]\\
[AB_1|AB_2|AB_3|AB_4|AB_5] &= [e_1|e_2|e_3|e_4|e_5]
\end{align*}
Equating the matrices column-by-column we have
$$AB_1 = e_1 \quad AB_2 = e_2 \quad AB_3 = e_3 \quad AB_4 = e_4 \quad AB_5 = e_5.$$
Since the matrix $B$ is what we are trying to compute, we can view each column, $B_i$, as a column vector of unknowns. Then we have five systems of equations to solve, each with 5 equations in 5 variables. Notice that all 5 of these systems have the same coefficient matrix. We will now solve each system in turn.
Row-reduce the augmented matrix of the linear system $\mathcal{LS}(A, e_1)$,
$$\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 1\\ -2 & -3 & 0 & -5 & -1 & 0\\ 1 & 1 & 0 & 2 & 1 & 0\\ -2 & -3 & -1 & -3 & -2 & 0\\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & -3\\ 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 1 \end{array}\right]; \quad B_1 = \begin{bmatrix} -3\\ 0\\ 1\\ 1\\ 1 \end{bmatrix}$$
Row-reduce the augmented matrix of the linear system $\mathcal{LS}(A, e_2)$,
$$\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0\\ -2 & -3 & 0 & -5 & -1 & 1\\ 1 & 1 & 0 & 2 & 1 & 0\\ -2 & -3 & -1 & -3 & -2 & 0\\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & 3\\ 0 & 1 & 0 & 0 & 0 & -2\\ 0 & 0 & 1 & 0 & 0 & 2\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & -1 \end{array}\right]; \quad B_2 = \begin{bmatrix} 3\\ -2\\ 2\\ 0\\ -1 \end{bmatrix}$$
Row-reduce the augmented matrix of the linear system $\mathcal{LS}(A, e_3)$,
$$\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0\\ -2 & -3 & 0 & -5 & -1 & 0\\ 1 & 1 & 0 & 2 & 1 & 1\\ -2 & -3 & -1 & -3 & -2 & 0\\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & 6\\ 0 & 1 & 0 & 0 & 0 & -5\\ 0 & 0 & 1 & 0 & 0 & 4\\ 0 & 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & -2 \end{array}\right]; \quad B_3 = \begin{bmatrix} 6\\ -5\\ 4\\ 1\\ -2 \end{bmatrix}$$
Row-reduce the augmented matrix of the linear system $\mathcal{LS}(A, e_4)$,
$$\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0\\ -2 & -3 & 0 & -5 & -1 & 0\\ 1 & 1 & 0 & 2 & 1 & 0\\ -2 & -3 & -1 & -3 & -2 & 1\\ -1 & -3 & -1 & -3 & 1 & 0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & -1\\ 0 & 1 & 0 & 0 & 0 & -1\\ 0 & 0 & 1 & 0 & 0 & 1\\ 0 & 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 0 & 1 & 0 \end{array}\right]; \quad B_4 = \begin{bmatrix} -1\\ -1\\ 1\\ 1\\ 0 \end{bmatrix}$$
Row-reduce the augmented matrix of the linear system $\mathcal{LS}(A, e_5)$,
$$\left[\begin{array}{ccccc|c} 1 & 2 & 1 & 2 & 1 & 0\\ -2 & -3 & 0 & -5 & -1 & 0\\ 1 & 1 & 0 & 2 & 1 & 0\\ -2 & -3 & -1 & -3 & -2 & 0\\ -1 & -3 & -1 & -3 & 1 & 1 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccccc|c} 1 & 0 & 0 & 0 & 0 & -2\\ 0 & 1 & 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 & 0 & -1\\ 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 1 \end{array}\right]; \quad B_5 = \begin{bmatrix} -2\\ 1\\ -1\\ 0\\ 1 \end{bmatrix}$$
We can now collect our 5 solution vectors into the matrix $B$,
\begin{align*}
B &= [B_1|B_2|B_3|B_4|B_5]\\
&= \left[\begin{bmatrix} -3\\ 0\\ 1\\ 1\\ 1 \end{bmatrix}\middle|\begin{bmatrix} 3\\ -2\\ 2\\ 0\\ -1 \end{bmatrix}\middle|\begin{bmatrix} 6\\ -5\\ 4\\ 1\\ -2 \end{bmatrix}\middle|\begin{bmatrix} -1\\ -1\\ 1\\ 1\\ 0 \end{bmatrix}\middle|\begin{bmatrix} -2\\ 1\\ -1\\ 0\\ 1 \end{bmatrix}\right]\\
&= \begin{bmatrix} -3 & 3 & 6 & -1 & -2\\ 0 & -2 & -5 & -1 & 1\\ 1 & 2 & 4 & 1 & -1\\ 1 & 0 & 1 & 1 & 0\\ 1 & -1 & -2 & 0 & 1 \end{bmatrix}
\end{align*}
By this method, we know that $AB = I_5$. Check that $BA = I_5$, and then we will know that we have the inverse of $A$.

△
Notice how the five systems of equations in the preceding example were all solved by exactly the same sequence of row operations. Would it not be nice to avoid this obvious duplication of effort? Our main theorem for this section follows, and it mimics this previous example, while also avoiding all the overhead.
Theorem CINM Computing the Inverse of a Nonsingular Matrix

Suppose $A$ is a nonsingular square matrix of size $n$. Create the $n \times 2n$ matrix $M$ by placing the $n \times n$ identity matrix $I_n$ to the right of the matrix $A$. Let $N$ be a matrix that is row-equivalent to $M$ and in reduced row-echelon form. Finally, let $J$ be the matrix formed from the final $n$ columns of $N$. Then $AJ = I_n$.

Proof. $A$ is nonsingular, so by Theorem NMRRI there is a sequence of row operations that will convert $A$ into $I_n$. It is this same sequence of row operations that will convert $M$ into $N$, since having the identity matrix in the first $n$ columns of $N$ is sufficient to guarantee that $N$ is in reduced row-echelon form.

If we consider the systems of linear equations, $\mathcal{LS}(A, e_i)$, $1 \leq i \leq n$, we see that the aforementioned sequence of row operations will also bring the augmented matrix of each of these systems into reduced row-echelon form. Furthermore, the unique solution to $\mathcal{LS}(A, e_i)$ appears in column $n + 1$ of the row-reduced augmented matrix of the system and is identical to column $n + i$ of $N$. Let $N_1, N_2, N_3, \ldots, N_{2n}$ denote the columns of $N$. So we find,
\begin{align*}
AJ &= A[N_{n+1}|N_{n+2}|N_{n+3}|\ldots|N_{n+n}]\\
&= [AN_{n+1}|AN_{n+2}|AN_{n+3}|\ldots|AN_{n+n}] & &\text{Definition MM}\\
&= [e_1|e_2|e_3|\ldots|e_n] & &\text{Solutions to } \mathcal{LS}(A, e_i)\\
&= I_n & &\text{Definition IM}
\end{align*}
as desired. ∎
We have to be just a bit careful here about both what this theorem says and what it does not say. If $A$ is a nonsingular matrix, then we are guaranteed a matrix $B$ such that $AB = I_n$, and the proof gives us a process for constructing $B$. However, the definition of the inverse of a matrix (Definition MI) requires that $BA = I_n$ also. So at this juncture we must compute the matrix product in the "opposite" order before we claim $B$ as the inverse of $A$. However, we will soon see that this is always the case, in Theorem OSIS, so the title of this theorem is not inaccurate.

What if $A$ is singular? At this point we only know that Theorem CINM cannot be applied. The question of $A$'s inverse is still open. (But see Theorem NI in the next section.)
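The procedure of Theorem CINM can be sketched in code: augment $A$ with the identity, row-reduce, and read off the final $n$ columns. The following is a minimal illustration (Python/NumPy is our assumption, and the function name and the partial-pivoting detail are ours, not the text's):

```python
import numpy as np

def inverse_via_cinm(A):
    """Row-reduce M = [A | I_n]; return J, the final n columns of the result.

    By Theorem CINM, A @ J equals I_n when A is nonsingular.
    """
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for j in range(n):
        # choose the largest available pivot in column j (partial pivoting,
        # a numerical-stability refinement beyond the theorem's statement)
        p = j + int(np.argmax(np.abs(M[j:, j])))
        if np.isclose(M[p, j], 0.0):
            raise ValueError("matrix is singular; Theorem CINM does not apply")
        M[[j, p]] = M[[p, j]]              # swap rows j and p
        M[j] = M[j] / M[j, j]              # scale to create a leading 1
        for i in range(n):
            if i != j:
                M[i] = M[i] - M[i, j] * M[j]   # clear the rest of column j
    return M[:, n:]

# the coefficient matrix of Archetype B
A = np.array([[-7., -6., -12.],
              [ 5.,  5.,   7.],
              [ 1.,  0.,   4.]])
J = inverse_via_cinm(A)
print(np.allclose(A @ J, np.eye(3)))   # → True
```

The returned $J$ matches the inverse computed by hand in Example CMIAB below, up to floating-point roundoff.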
We will finish by computing the inverse for the coefficient matrix of Archetype B, the one we just pulled from a hat in Example SABMI. There are more examples in the Archetypes (Archetypes) to practice with, though notice that it is silly to ask for the inverse of a rectangular matrix (the sizes are not right) and not every square matrix has an inverse (remember Example MWIAA?).
Example CMIAB Computing a matrix inverse, Archetype B

Archetype B has a coefficient matrix given as
$$B = \begin{bmatrix} -7 & -6 & -12\\ 5 & 5 & 7\\ 1 & 0 & 4 \end{bmatrix}$$
Exercising Theorem CINM we set
$$M = \left[\begin{array}{ccc|ccc} -7 & -6 & -12 & 1 & 0 & 0\\ 5 & 5 & 7 & 0 & 1 & 0\\ 1 & 0 & 4 & 0 & 0 & 1 \end{array}\right]$$
which row-reduces to
$$N = \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -10 & -12 & -9\\ 0 & 1 & 0 & \frac{13}{2} & 8 & \frac{11}{2}\\ 0 & 0 & 1 & \frac{5}{2} & 3 & \frac{5}{2} \end{array}\right].$$
So
$$B^{-1} = \begin{bmatrix} -10 & -12 & -9\\ \frac{13}{2} & 8 & \frac{11}{2}\\ \frac{5}{2} & 3 & \frac{5}{2} \end{bmatrix}$$
once we check that $B^{-1}B = I_3$ (the product in the opposite order is a consequence of the theorem).

△
Subsection PMI

Properties of Matrix Inverses

The inverse of a matrix enjoys some nice properties. We collect a few here. First, a matrix can have but one inverse.

Theorem MIU Matrix Inverse is Unique

Suppose the square matrix $A$ has an inverse. Then $A^{-1}$ is unique.

Proof. As described in Proof Technique U, we will assume that $A$ has two inverses. The hypothesis tells us there is at least one. Suppose then that $B$ and $C$ are both inverses for $A$, so we know by Definition MI that $AB = BA = I_n$ and $AC = CA = I_n$. Then we have,
\begin{align*}
B &= BI_n & &\text{Theorem MMIM}\\
&= B(AC) & &\text{Definition MI}\\
&= (BA)C & &\text{Theorem MMA}\\
&= I_n C & &\text{Definition MI}\\
&= C & &\text{Theorem MMIM}
\end{align*}
So we conclude that $B$ and $C$ are the same, and cannot be different. So any matrix that acts like an inverse must be the inverse. ∎
When most of us dress in the morning, we put on our socks first, followed by our shoes. In the evening we must then first remove our shoes, followed by our socks. Try to connect the conclusion of the following theorem with this everyday example.

Theorem SS Socks and Shoes

Suppose $A$ and $B$ are invertible matrices of size $n$. Then $AB$ is an invertible matrix and $(AB)^{-1} = B^{-1}A^{-1}$.

Proof. At the risk of carrying our everyday analogies too far, the proof of this theorem is quite easy when we compare it to the workings of a dating service. We have a statement about the inverse of the matrix $AB$, which for all we know right now might not even exist. Suppose $AB$ was to sign up for a dating service with two requirements for a compatible date. Upon multiplication on the left, and on the right, the result should be the identity matrix. In other words, $AB$'s ideal date would be its inverse.

Now along comes the matrix $B^{-1}A^{-1}$ (which we know exists because our hypothesis says both $A$ and $B$ are invertible and we can form the product of these two matrices), also looking for a date. Let us see if $B^{-1}A^{-1}$ is a good match for $AB$. First they meet at a noncommittal neutral location, say a coffee shop, for quiet conversation:
\begin{align*}
(B^{-1}A^{-1})(AB) &= B^{-1}(A^{-1}A)B & &\text{Theorem MMA}\\
&= B^{-1}I_n B & &\text{Definition MI}\\
&= B^{-1}B & &\text{Theorem MMIM}\\
&= I_n & &\text{Definition MI}
\end{align*}
The first date having gone smoothly, a second, more serious, date is arranged, say dinner and a show:
\begin{align*}
(AB)(B^{-1}A^{-1}) &= A(BB^{-1})A^{-1} & &\text{Theorem MMA}\\
&= AI_n A^{-1} & &\text{Definition MI}\\
&= AA^{-1} & &\text{Theorem MMIM}\\
&= I_n & &\text{Definition MI}
\end{align*}
So the matrix $B^{-1}A^{-1}$ has met all of the requirements to be $AB$'s inverse (date) and with the ensuing marriage proposal we can announce that $(AB)^{-1} = B^{-1}A^{-1}$. ∎
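The reversed order in Theorem SS is easy to spot-check numerically on a pair of randomly chosen matrices, which are invertible with probability 1. A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(42)
# random square matrices are almost surely invertible
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)   # note the socks-and-shoes order
print(np.allclose(lhs, rhs))                # → True
```

A numerical check like this is of course evidence, not a proof; the proof above is what establishes the identity for all invertible $A$ and $B$.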
Theorem MIMI Matrix Inverse of a Matrix Inverse

Suppose $A$ is an invertible matrix. Then $A^{-1}$ is invertible and $(A^{-1})^{-1} = A$.

Proof. As with the proof of Theorem SS, we examine if $A$ is a suitable inverse for $A^{-1}$ (by definition, the opposite is true).
$$AA^{-1} = I_n \qquad\text{Definition MI}$$
and
$$A^{-1}A = I_n \qquad\text{Definition MI}$$
The matrix $A$ has met all the requirements to be the inverse of $A^{-1}$, and so is invertible and we can write $A = (A^{-1})^{-1}$. ∎
Theorem MIT Matrix Inverse of a Transpose

Suppose $A$ is an invertible matrix. Then $A^t$ is invertible and $(A^t)^{-1} = (A^{-1})^t$.

Proof. As with the proof of Theorem SS, we see if $(A^{-1})^t$ is a suitable inverse for $A^t$. Apply Theorem MMT to see that
\begin{align*}
(A^{-1})^t A^t &= (AA^{-1})^t & &\text{Theorem MMT}\\
&= I_n^t & &\text{Definition MI}\\
&= I_n & &\text{Definition SYM}
\end{align*}
and
\begin{align*}
A^t(A^{-1})^t &= (A^{-1}A)^t & &\text{Theorem MMT}\\
&= I_n^t & &\text{Definition MI}\\
&= I_n & &\text{Definition SYM}
\end{align*}
The matrix $(A^{-1})^t$ has met all the requirements to be the inverse of $A^t$, and so is invertible and we can write $(A^t)^{-1} = (A^{-1})^t$. ∎
Theorem MISM Matrix Inverse of a Scalar Multiple

Suppose $A$ is an invertible matrix and $\alpha$ is a nonzero scalar. Then $\alpha A$ is invertible and $(\alpha A)^{-1} = \frac{1}{\alpha}A^{-1}$.

Proof. As with the proof of Theorem SS, we see if $\frac{1}{\alpha}A^{-1}$ is a suitable inverse for $\alpha A$.
\begin{align*}
\left(\frac{1}{\alpha}A^{-1}\right)(\alpha A) &= \left(\frac{1}{\alpha}\alpha\right)\left(A^{-1}A\right) & &\text{Theorem MMSMM}\\
&= 1I_n & &\text{Scalar multiplicative inverses}\\
&= I_n & &\text{Property OM}
\end{align*}
and
\begin{align*}
(\alpha A)\left(\frac{1}{\alpha}A^{-1}\right) &= \left(\alpha\frac{1}{\alpha}\right)\left(AA^{-1}\right) & &\text{Theorem MMSMM}\\
&= 1I_n & &\text{Scalar multiplicative inverses}\\
&= I_n & &\text{Property OM}
\end{align*}
The matrix $\frac{1}{\alpha}A^{-1}$ has met all the requirements to be the inverse of $\alpha A$, so we can write $(\alpha A)^{-1} = \frac{1}{\alpha}A^{-1}$. ∎
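Theorem MISM, too, can be checked on a concrete matrix; a sketch assuming NumPy, with a small invertible matrix of our own choosing:

```python
import numpy as np

alpha = 4.0
A = np.array([[2., 1.],
              [1., 1.]])               # invertible, since ad - bc = 1

lhs = np.linalg.inv(alpha * A)         # (alpha A)^{-1}
rhs = (1.0 / alpha) * np.linalg.inv(A) # (1/alpha) A^{-1}
print(np.allclose(lhs, rhs))           # → True
```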
Notice that there are some likely theorems that are missing here. For example, it would be tempting to think that $(A + B)^{-1} = A^{-1} + B^{-1}$, but this is false. Can you find a counterexample? (See Exercise MISLE.T10.)
Reading Questions

1. Compute the inverse of the matrix below.
$$\begin{bmatrix} -2 & 3\\ -3 & 4 \end{bmatrix}$$

2. Compute the inverse of the matrix below.
$$\begin{bmatrix} 2 & 3 & 1\\ 1 & -2 & -3\\ -2 & 4 & 6 \end{bmatrix}$$

3. Explain why Theorem SS has the title it does. (Do not just state the theorem, explain the choice of the title making reference to the theorem itself.)
Exercises

C16 † If it exists, find the inverse of $A = \begin{bmatrix} 1 & 0 & 1\\ 1 & 1 & 1\\ 2 & -1 & 1 \end{bmatrix}$, and check your answer.

C17 † If it exists, find the inverse of $A = \begin{bmatrix} 2 & -1 & 1\\ 1 & 2 & 1\\ 3 & 1 & 2 \end{bmatrix}$, and check your answer.

C18 † If it exists, find the inverse of $A = \begin{bmatrix} 1 & 3 & 1\\ 1 & 2 & 1\\ 2 & 2 & 1 \end{bmatrix}$, and check your answer.

C19 † If it exists, find the inverse of $A = \begin{bmatrix} 1 & 3 & 1\\ 0 & 2 & 1\\ 2 & 2 & 1 \end{bmatrix}$, and check your answer.
C21 † Verify that $B$ is the inverse of $A$.
$$A = \begin{bmatrix} 1 & 1 & -1 & 2\\ -2 & -1 & 2 & -3\\ 1 & 1 & 0 & 2\\ -1 & 2 & 0 & 2 \end{bmatrix} \qquad B = \begin{bmatrix} 4 & 2 & 0 & -1\\ 8 & 4 & -1 & -1\\ -1 & 0 & 1 & 0\\ -6 & -3 & 1 & 1 \end{bmatrix}$$

C22 † Recycle the matrices $A$ and $B$ from Exercise MISLE.C21 and set
$$c = \begin{bmatrix} 2\\ 1\\ -3\\ 2 \end{bmatrix} \qquad d = \begin{bmatrix} 1\\ 1\\ 1\\ 1 \end{bmatrix}$$
Employ the matrix $B$ to solve the two linear systems $\mathcal{LS}(A, c)$ and $\mathcal{LS}(A, d)$.
C23 If it exists, find the inverse of the $2 \times 2$ matrix
$$A = \begin{bmatrix} 7 & 3\\ 5 & 2 \end{bmatrix}$$
and check your answer. (See Theorem TTMI.)

C24 If it exists, find the inverse of the $2 \times 2$ matrix
$$A = \begin{bmatrix} 6 & 3\\ 4 & 2 \end{bmatrix}$$
and check your answer. (See Theorem TTMI.)

C25 At the conclusion of Example CMI, verify that $BA = I_5$ by computing the matrix product.
C26 † Let
$$D = \begin{bmatrix} 1 & -1 & 3 & -2 & 1\\ -2 & 3 & -5 & 3 & 0\\ 1 & -1 & 4 & -2 & 2\\ -1 & 4 & -1 & 0 & 4\\ 1 & 0 & 5 & -2 & 5 \end{bmatrix}$$
Compute the inverse of $D$, $D^{-1}$, by forming the $5 \times 10$ matrix $[\,D \mid I_5\,]$ and row-reducing (Theorem CINM). Then use a calculator to compute $D^{-1}$ directly.
C27 † Let<br />
⎡<br />
⎤<br />
1 −1 3 −2 1<br />
−2 3 −5 3 −1<br />
E = ⎢ 1 −1 4 −2 2 ⎥<br />
⎣<br />
−1 4 −1 0 2<br />
⎦<br />
1 0 5 −2 4<br />
Compute the <strong>in</strong>verse of E, E −1 , by form<strong>in</strong>g the 5 × 10 matrix [ E | I 5 ] and row-reduc<strong>in</strong>g<br />
(Theorem CINM). Then use a calculator to compute E −1 directly.<br />
C28 † Let<br />
⎡<br />
⎤<br />
1 1 3 1<br />
⎢−2 −1 −4 −1⎥<br />
C = ⎣<br />
1 4 10 2<br />
⎦<br />
−2 0 −4 5<br />
Compute the inverse of C, C⁻¹, by forming the 4 × 8 matrix [C | I₄] and row-reducing (Theorem CINM). Then use a calculator to compute C⁻¹ directly.
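The row-reduction procedure of Theorem CINM can be sketched in plain Python with exact rational arithmetic (the book itself uses Sage for such computations; this standalone version is illustrative, not the text's): augment the matrix with the identity, row-reduce, and read the inverse off the right half, returning `None` when some column yields no pivot.

```python
from fractions import Fraction

def matrix_inverse(A):
    """Invert a square matrix by row-reducing [A | I] to [I | A^-1] (Theorem CINM).
    Returns None when A is singular (the left block cannot reach the identity)."""
    n = len(A)
    # Augment A with the identity, using exact rational arithmetic throughout.
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a pivot in this column at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                                  # singular: no pivot available
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]       # scale the pivot row to 1
        for r in range(n):                               # clear the rest of the column
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]                        # right half is the inverse

# The matrix C of this exercise.
C = [[1, 1, 3, 1], [-2, -1, -4, -1], [1, 4, 10, 2], [-2, 0, -4, 5]]
Cinv = matrix_inverse(C)
```

The matrices D and E of C26 and C27 can be fed through the same function.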
C40†  Find all solutions to the system of equations below, making use of the matrix inverse found in Exercise MISLE.C28.

    x₁ + x₂ + 3x₃ + x₄ = −4
    −2x₁ − x₂ − 4x₃ − x₄ = 4
    x₁ + 4x₂ + 10x₃ + 2x₄ = −20
    −2x₁ − 4x₃ + 5x₄ = 9
C41†  Use the inverse of a matrix to find all the solutions to the following system of equations.

    x₁ + 2x₂ − x₃ = −3
    2x₁ + 5x₂ − x₃ = −4
    −x₁ − 4x₂ = 2
C42†  Use a matrix inverse to solve the linear system of equations.

    x₁ − x₂ + 2x₃ = 5
    x₁ − 2x₃ = −8
    2x₁ − x₂ − x₃ = −6
T10†  Construct an example to demonstrate that (A + B)⁻¹ = A⁻¹ + B⁻¹ is not true for all square matrices A and B of the same size.
Section MINM<br />
Matrix Inverses and Nons<strong>in</strong>gular Matrices<br />
We saw in Theorem CINM that if a square matrix A is nonsingular, then there is a matrix B so that AB = Iₙ. In other words, B is halfway to being an inverse of A. We will see in this section that B automatically fulfills the second condition (BA = Iₙ). Example MWIAA showed us that the coefficient matrix from Archetype A had no inverse. Not coincidentally, this coefficient matrix is singular. We will make all these connections precise now. There are not many examples or definitions in this section, just theorems.
Subsection NMI<br />
Nons<strong>in</strong>gular Matrices are Invertible<br />
We need a couple of technical results for starters. Some books would call these m<strong>in</strong>or,<br />
but essential, results “lemmas.” We’ll just call ’em theorems. See Proof Technique<br />
LC for more on the dist<strong>in</strong>ction.<br />
The first of these technical results is <strong>in</strong>terest<strong>in</strong>g <strong>in</strong> that the hypothesis says<br />
someth<strong>in</strong>g about a product of two square matrices and the conclusion then says the<br />
same th<strong>in</strong>g about each <strong>in</strong>dividual matrix <strong>in</strong> the product. This result has an analogy<br />
in the algebra of complex numbers: suppose α, β ∈ C; then αβ ≠ 0 if and only if
α ≠ 0 and β ≠ 0. We can view this result as suggesting that the term “nonsingular”
for matrices is like the term “nonzero” for scalars. Consider too that we know<br />
s<strong>in</strong>gular matrices, as coefficient matrices for systems of equations, will sometimes<br />
lead to systems with no solutions, or systems with <strong>in</strong>f<strong>in</strong>itely many solutions (Theorem<br />
NMUS). What do l<strong>in</strong>ear equations with zero look like? Consider 0x = 5, which has<br />
no solution, and 0x = 0, which has <strong>in</strong>f<strong>in</strong>itely many solutions. In the algebra of scalars,<br />
zero is exceptional (mean<strong>in</strong>g different, not better), and <strong>in</strong> the algebra of matrices,<br />
s<strong>in</strong>gular matrices are also the exception. While there is only one zero scalar, and<br />
there are <strong>in</strong>f<strong>in</strong>itely many s<strong>in</strong>gular matrices, we will see that s<strong>in</strong>gular matrices are a<br />
dist<strong>in</strong>ct m<strong>in</strong>ority.<br />
Theorem NPNT Nons<strong>in</strong>gular Product has Nons<strong>in</strong>gular Terms<br />
Suppose that A and B are square matrices of size n. The product AB is nons<strong>in</strong>gular<br />
if and only if A and B are both nons<strong>in</strong>gular.<br />
Proof. (⇒) For this portion of the proof we will form the logically-equivalent contrapositive<br />
and prove that statement us<strong>in</strong>g two cases. “AB is nons<strong>in</strong>gular implies A<br />
and B are both nons<strong>in</strong>gular” becomes “A or B is s<strong>in</strong>gular implies AB is s<strong>in</strong>gular.”<br />
(Be sure to understand why the “and” became an “or”; see Proof Technique CP.)
Case 1. Suppose B is s<strong>in</strong>gular. Then there is a nonzero vector z that is a solution<br />
to LS(B, 0). So

    (AB)z = A(Bz)    Theorem MMA
          = A0       Theorem SLEMM
          = 0        Theorem MMZM

With Theorem SLEMM we can translate this vector equality to the statement that z is a nonzero solution to LS(AB, 0). Thus AB is singular (Definition NM), as desired.
Case 2. Suppose A is s<strong>in</strong>gular, and B is not s<strong>in</strong>gular. In other words, with Case<br />
1 complete, we can be more precise about this rema<strong>in</strong><strong>in</strong>g case and assume that B is<br />
nons<strong>in</strong>gular. Because A is s<strong>in</strong>gular, there is a nonzero vector y that is a solution<br />
to LS(A, 0). Now consider the linear system LS(B, y). Since B is nonsingular, the
system has a unique solution (Theorem NMUS), which we will denote as w. We first
claim w is not the zero vector either. Assuming the opposite, suppose that w = 0
(Proof Technique CD). Then<br />
    y = Bw    Theorem SLEMM
      = B0    Hypothesis
      = 0     Theorem MMZM

contrary to y being nonzero. So w ≠ 0. The pieces are in place, so here we go,
    (AB)w = A(Bw)    Theorem MMA
          = Ay       Theorem SLEMM
          = 0        Theorem SLEMM

With Theorem SLEMM we can translate this vector equality to the statement that w is a nonzero solution to LS(AB, 0). Thus AB is singular (Definition NM), as desired. And this conclusion holds for both cases.
(⇐) Now assume that both A and B are nons<strong>in</strong>gular. Suppose that x ∈ C n is a<br />
solution to LS(AB, 0). Then<br />
    0 = (AB)x    Theorem SLEMM
      = A(Bx)    Theorem MMA
By Theorem SLEMM, Bx is a solution to LS(A, 0), and by the def<strong>in</strong>ition of a<br />
nonsingular matrix (Definition NM), we conclude that Bx = 0. Now, by an entirely
similar argument, the nonsingularity of B forces us to conclude that x = 0. So
the only solution to LS(AB, 0) is the zero vector and we conclude that AB is<br />
nons<strong>in</strong>gular by Def<strong>in</strong>ition NM.
This is a powerful result <strong>in</strong> the “forward” direction, because it allows us to beg<strong>in</strong><br />
with a hypothesis that someth<strong>in</strong>g complicated (the matrix product AB) has the<br />
property of be<strong>in</strong>g nons<strong>in</strong>gular, and we can then conclude that the simpler constituents<br />
(A and B <strong>in</strong>dividually) then also have the property of be<strong>in</strong>g nons<strong>in</strong>gular. If we had<br />
thought that the matrix product was an artificial construction, results like this would<br />
make us beg<strong>in</strong> to th<strong>in</strong>k twice.<br />
The contrapositive of this entire result is equally <strong>in</strong>terest<strong>in</strong>g. It says that A or B<br />
(or both) is a s<strong>in</strong>gular matrix if and only if the product AB is s<strong>in</strong>gular. (See Proof<br />
Technique CP.)<br />
Theorem OSIS One-Sided Inverse is Sufficient<br />
Suppose A and B are square matrices of size n such that AB = Iₙ. Then BA = Iₙ.
Proof. The matrix Iₙ is nonsingular (since it row-reduces easily to Iₙ, Theorem NMRRI). So A and B are nonsingular by Theorem NPNT, so in particular B is nonsingular. We can therefore apply Theorem CINM to assert the existence of a matrix C so that BC = Iₙ. This application of Theorem CINM could be a bit confusing, mostly because of the names of the matrices involved. B is nonsingular, so there must be a “right-inverse” for B, and we are calling it C.
Now

    BA = (BA)Iₙ      Theorem MMIM
       = (BA)(BC)    Theorem CINM
       = B(AB)C      Theorem MMA
       = BIₙC        Hypothesis
       = BC          Theorem MMIM
       = Iₙ          Theorem CINM

which is the desired conclusion.
<br />
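Theorem OSIS can also be sanity-checked numerically: pick any pair with AB = Iₙ and confirm that BA = Iₙ as well. A minimal Python check with one hypothetical 2 × 2 pair (these particular matrices are my illustrative choice, not from the text):

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

# Hypothetical pair with AB = I_2 (A has determinant 1, and B is its inverse).
A = [[2, 1], [5, 3]]
B = [[3, -1], [-5, 2]]
I2 = [[1, 0], [0, 1]]

AB = matmul(A, B)   # the hypothesis of Theorem OSIS: AB == I2
BA = matmul(B, A)   # the conclusion: BA == I2 as well
```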
So Theorem OSIS tells us that if A is nonsingular, then the matrix B guaranteed by Theorem CINM will be both a “right-inverse” and a “left-inverse” for A, so A is invertible and A⁻¹ = B.
So if you have a nonsingular matrix, A, you can use the procedure described in Theorem CINM to find an inverse for A. If A is singular, then the procedure in Theorem CINM will fail as the first n columns of M will not row-reduce to the identity matrix. However, we can say a bit more. When A is singular, then A
does not have an <strong>in</strong>verse (which is very different from say<strong>in</strong>g that the procedure <strong>in</strong><br />
Theorem CINM fails to f<strong>in</strong>d an <strong>in</strong>verse). This may feel like we are splitt<strong>in</strong>g hairs,<br />
but it is important that we do not make unfounded assumptions. These observations<br />
motivate the next theorem.<br />
Theorem NI Nons<strong>in</strong>gularity is Invertibility<br />
Suppose that A is a square matrix. Then A is nons<strong>in</strong>gular if and only if A is <strong>in</strong>vertible.
Proof. (⇐) Since A is invertible, we can write Iₙ = AA⁻¹ (Definition MI). Notice that Iₙ is nonsingular (Theorem NMRRI), so Theorem NPNT implies that A (and A⁻¹) is nonsingular.
(⇒) Suppose now that A is nonsingular. By Theorem CINM we find B so that AB = Iₙ. Then Theorem OSIS tells us that BA = Iₙ. So B is A’s inverse, and by construction, A is invertible.
<br />
So for a square matrix, the properties of having an inverse and of having a trivial null space are one and the same. You cannot have one without the other.
Theorem NME3 Nons<strong>in</strong>gular Matrix Equivalences, Round 3<br />
Suppose that A is a square matrix of size n. The follow<strong>in</strong>g are equivalent.<br />
1. A is nons<strong>in</strong>gular.<br />
2. A row-reduces to the identity matrix.<br />
3. The null space of A conta<strong>in</strong>s only the zero vector, N (A) ={0}.<br />
4. The l<strong>in</strong>ear system LS(A, b) has a unique solution for every possible choice of<br />
b.<br />
5. The columns of A are a l<strong>in</strong>early <strong>in</strong>dependent set.<br />
6. A is <strong>in</strong>vertible.<br />
Proof. We can update our list of equivalences for nons<strong>in</strong>gular matrices (Theorem<br />
NME2) with the equivalent condition from Theorem NI.<br />
<br />
In the case that A is a nonsingular coefficient matrix of a system of equations,
the <strong>in</strong>verse allows us to very quickly compute the unique solution, for any vector of<br />
constants.<br />
Theorem SNCM Solution with Nons<strong>in</strong>gular Coefficient Matrix<br />
Suppose that A is nonsingular. Then the unique solution to LS(A, b) is A⁻¹b.
Proof. By Theorem NMUS we know already that LS(A, b) has a unique solution<br />
for every choice of b. We need to show that the expression stated is <strong>in</strong>deed a solution<br />
(the solution). That is easy, just “plug it <strong>in</strong>” to the vector equation representation<br />
of the system (Theorem SLEMM),<br />
    A(A⁻¹b) = (AA⁻¹)b    Theorem MMA
            = Iₙb        Definition MI
            = b          Theorem MMIM

Since Ax = b is true when we substitute A⁻¹b for x, A⁻¹b is a (the!) solution to LS(A, b).
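Theorem SNCM turns solving LS(A, b) into a single matrix-vector product. A small Python sketch with a hypothetical 2 × 2 system (the matrix, its inverse, and the vector of constants are illustrative values of mine, not from the text):

```python
from fractions import Fraction

def solve_via_inverse(Ainv, b):
    """The unique solution of LS(A, b), computed as the product A^-1 b (Theorem SNCM)."""
    n = len(Ainv)
    return [sum(Fraction(Ainv[i][j]) * b[j] for j in range(n)) for i in range(n)]

# Hypothetical system: 2x1 + x2 = 3 and 5x1 + 3x2 = 7, i.e. A = [[2, 1], [5, 3]].
# A has determinant 1, so its inverse is [[3, -1], [-5, 2]].
Ainv = [[3, -1], [-5, 2]]
x = solve_via_inverse(Ainv, [3, 7])   # x == [2, -1]; check: 2*2 + (-1) = 3, 5*2 + 3*(-1) = 7
```

Changing the vector of constants reuses the same inverse, which is exactly the economy the paragraph above describes.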
Subsection UM<br />
Unitary Matrices<br />
Recall that the adjoint of a matrix is A∗ = (Ā)ᵗ (Definition A).
Def<strong>in</strong>ition UM Unitary Matrices<br />
Suppose that U is a square matrix of size n such that U∗U = Iₙ. Then we say U is
unitary.<br />
□<br />
This condition may seem rather far-fetched at first glance. Would there be any<br />
matrix that behaved this way? Well, yes, here is one.<br />
Example UM3  Unitary matrix of size 3

    U = ⎡ (1+i)/√5    (3+2i)/√55    (2+2i)/√22 ⎤
        ⎢ (1−i)/√5    (2+2i)/√55    (−3+i)/√22 ⎥
        ⎣ i/√5        (3−5i)/√55    −2/√22     ⎦

The computations get a bit tiresome, but if you work your way through the computation of U∗U, you will arrive at the 3 × 3 identity matrix I₃.
△<br />
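Those tiresome computations are a natural job for a machine. The book's computational examples use Sage; the plain-Python sketch below is my own illustrative version, building U from Example UM3 and confirming that U∗U is the identity up to floating-point roundoff.

```python
from math import sqrt

# The matrix U of Example UM3, rows given as lists of complex numbers.
U = [
    [(1 + 1j) / sqrt(5), (3 + 2j) / sqrt(55), (2 + 2j) / sqrt(22)],
    [(1 - 1j) / sqrt(5), (2 + 2j) / sqrt(55), (-3 + 1j) / sqrt(22)],
    [1j / sqrt(5),       (3 - 5j) / sqrt(55), -2 / sqrt(22)],
]

def adjoint_product(M):
    """Compute M*M: entry (i, j) is the inner product of columns i and j of M."""
    n = len(M)
    return [[sum(M[k][i].conjugate() * M[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

UstarU = adjoint_product(U)   # approximately the 3 x 3 identity matrix
```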
Unitary matrices do not have to look quite so gruesome. Here is a larger one<br />
that is a bit more pleas<strong>in</strong>g.<br />
Example UPM Unitary permutation matrix<br />
The matrix<br />
    P = ⎡0 1 0 0 0⎤
        ⎢0 0 0 1 0⎥
        ⎢1 0 0 0 0⎥
        ⎢0 0 0 0 1⎥
        ⎣0 0 1 0 0⎦
is unitary as can be easily checked. Notice that it is just a rearrangement of the<br />
columns of the 5 × 5 identity matrix, I 5 (Def<strong>in</strong>ition IM).<br />
An interesting exercise is to build another 5 × 5 unitary matrix, R, using a different rearrangement of the columns of I₅. Then form the product PR. This will be another unitary matrix (Exercise MINM.T10). If you were to build all 5! = 5 × 4 × 3 × 2 × 1 = 120 matrices of this type you would have a set that remains closed under matrix multiplication. It is an example of another algebraic structure known as a group, since the set together with the one operation (matrix multiplication here) is closed, associative, has an identity (I₅), and has inverses (Theorem UMI). Notice though that the operation in this group is not commutative!  △
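A quick Python check of the claims in this example: the columns of P are a rearrangement of the columns of I₅, so PᵗP = I₅ (P is real, hence unitary/orthogonal), and the product of two permutation matrices is again a permutation matrix. The matrix R below is one hypothetical rearrangement of my choosing; any other works.

```python
# P from Example UPM, and a second (hypothetical) rearrangement R of I_5's columns.
P = [[0, 1, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0]]
R = [[0, 0, 1, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def is_permutation_matrix(M):
    """Each row and each column holds a single 1 and zeros elsewhere."""
    n = len(M)
    target = [0] * (n - 1) + [1]
    return (all(sorted(row) == target for row in M)
            and all(sorted(M[i][j] for i in range(n)) == target for j in range(n)))

Pt = [list(col) for col in zip(*P)]   # P is real, so its adjoint is just the transpose
PtP = matmul(Pt, P)                   # equals I_5: P is unitary
PR = matmul(P, R)                     # again a permutation matrix (closure)
```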
If a matrix A has only real number entries (we say it is a real matrix) then<br />
the def<strong>in</strong><strong>in</strong>g property of be<strong>in</strong>g unitary simplifies to A t A = I n . In this case we, and<br />
everybody else, call the matrix orthogonal, so you may often encounter this term
<strong>in</strong> your other read<strong>in</strong>g when the complex numbers are not under consideration.<br />
Unitary matrices have easily computed <strong>in</strong>verses. They also have columns that<br />
form orthonormal sets. Here are the theorems that show us that unitary matrices<br />
are not as strange as they might <strong>in</strong>itially appear.<br />
Theorem UMI Unitary Matrices are Invertible<br />
Suppose that U is a unitary matrix of size n. Then U is nonsingular, and U⁻¹ = U∗.
Proof. By Definition UM, we know that U∗U = Iₙ. The matrix Iₙ is nonsingular (since it row-reduces easily to Iₙ, Theorem NMRRI). So by Theorem NPNT, U and U∗ are both nonsingular matrices.
The equation U∗U = Iₙ gets us halfway to an inverse of U, and Theorem OSIS tells us that then UU∗ = Iₙ also. So U and U∗ are inverses of each other (Definition MI).
<br />
Theorem CUMOS Columns of Unitary Matrices are Orthonormal Sets<br />
Suppose that S = {A₁, A₂, A₃, ..., Aₙ} is the set of columns of a square matrix A of size n. Then A is a unitary matrix if and only if S is an orthonormal set.
Proof. The proof revolves around recogniz<strong>in</strong>g that a typical entry of the product<br />
A ∗ A is an <strong>in</strong>ner product of columns of A. Here are the details to support this claim.<br />
    [A^*A]_{ij} = \sum_{k=1}^{n} [A^*]_{ik} [A]_{kj}               Theorem EMP
                = \sum_{k=1}^{n} [(\overline{A})^t]_{ik} [A]_{kj}  Definition A
                = \sum_{k=1}^{n} [\overline{A}]_{ki} [A]_{kj}      Definition TM
                = \sum_{k=1}^{n} \overline{[A]_{ki}} [A]_{kj}      Definition CCM
                = \sum_{k=1}^{n} \overline{[A_i]_k} [A_j]_k
                = ⟨Aᵢ, Aⱼ⟩                                         Definition IP
We now employ this equality in a chain of equivalences,

    S = {A₁, A₂, A₃, ..., Aₙ} is an orthonormal set
    ⇐⇒ ⟨Aᵢ, Aⱼ⟩ = 0 for i ≠ j, and ⟨Aᵢ, Aⱼ⟩ = 1 for i = j      Definition ONS
    ⇐⇒ [A∗A]ᵢⱼ = 0 for i ≠ j, and [A∗A]ᵢⱼ = 1 for i = j
    ⇐⇒ [A∗A]ᵢⱼ = [Iₙ]ᵢⱼ, 1 ≤ i ≤ n, 1 ≤ j ≤ n                  Definition IM
    ⇐⇒ A∗A = Iₙ                                                 Definition ME
    ⇐⇒ A is a unitary matrix                                    Definition UM

Example OSMC  Orthonormal set from matrix columns
The matrix

    U = ⎡ (1+i)/√5    (3+2i)/√55    (2+2i)/√22 ⎤
        ⎢ (1−i)/√5    (2+2i)/√55    (−3+i)/√22 ⎥
        ⎣ i/√5        (3−5i)/√55    −2/√22     ⎦

from Example UM3 is a unitary matrix. By Theorem CUMOS, its columns

    ⎧ ⎡(1+i)/√5⎤   ⎡(3+2i)/√55⎤   ⎡(2+2i)/√22⎤ ⎫
    ⎨ ⎢(1−i)/√5⎥ , ⎢(2+2i)/√55⎥ , ⎢(−3+i)/√22⎥ ⎬
    ⎩ ⎣ i/√5   ⎦   ⎣(3−5i)/√55⎦   ⎣ −2/√22   ⎦ ⎭

form an orthonormal set. You might find checking the six inner products of pairs of these vectors easier than doing the matrix product U∗U. Or, because the inner product is anti-commutative (Theorem IPAC), you only need to check three inner products (see Exercise MINM.T12).  △
When us<strong>in</strong>g vectors and matrices that only have real number entries, orthogonal<br />
matrices are those matrices with <strong>in</strong>verses that equal their transpose. Similarly, the<br />
<strong>in</strong>ner product is the familiar dot product. Keep this special case <strong>in</strong> m<strong>in</strong>d as you read<br />
the next theorem.<br />
Theorem UMPIP Unitary Matrices Preserve Inner Products<br />
Suppose that U is a unitary matrix of size n and u and v are two vectors from C n .<br />
Then<br />
〈Uu, Uv〉 = 〈u, v〉 and ‖Uv‖ = ‖v‖<br />
Proof.

    ⟨Uu, Uv⟩ = \overline{(Uu)}^t\, Uv                 Theorem MMIP
             = (\overline{U}\,\overline{u})^t\, Uv    Theorem MMCC
             = \overline{u}^t\, \overline{U}^t\, Uv   Theorem MMT
             = \overline{u}^t\, U^* U v               Definition A
             = \overline{u}^t\, I_n v                 Definition UM
             = \overline{u}^t\, v                     Theorem MMIM
             = ⟨u, v⟩                                 Theorem MMIP

The second conclusion is just a specialization of the first conclusion.

    ‖Uv‖ = √(‖Uv‖²)
         = √⟨Uv, Uv⟩    Theorem IPN
         = √⟨v, v⟩
         = √(‖v‖²)       Theorem IPN
         = ‖v‖
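The norm-preservation conclusion is easy to check numerically. A Python sketch using U from Example UM3 and one arbitrary vector v (my choice of v, not the text's):

```python
from math import sqrt

# U from Example UM3.
U = [
    [(1 + 1j) / sqrt(5), (3 + 2j) / sqrt(55), (2 + 2j) / sqrt(22)],
    [(1 - 1j) / sqrt(5), (2 + 2j) / sqrt(55), (-3 + 1j) / sqrt(22)],
    [1j / sqrt(5),       (3 - 5j) / sqrt(55), -2 / sqrt(22)],
]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def norm(v):
    """||v|| = sqrt(<v, v>) for a complex vector."""
    return sqrt(sum((z.conjugate() * z).real for z in v))

v = [1 - 2j, 3j, 4 + 1j]   # an arbitrary vector from C^3
# By Theorem UMPIP, norm(matvec(U, v)) agrees with norm(v) up to roundoff.
```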
Aside from the <strong>in</strong>herent <strong>in</strong>terest <strong>in</strong> this theorem, it makes a bigger statement<br />
about unitary matrices. When we view vectors geometrically as directions or forces,<br />
then the norm equates to a notion of length. If we transform a vector by multiplication<br />
with a unitary matrix, then the length (norm) of that vector stays the same. If we<br />
consider column vectors with two or three slots conta<strong>in</strong><strong>in</strong>g only real numbers, then<br />
the <strong>in</strong>ner product of two such vectors is just the dot product, and this quantity can<br />
be used to compute the angle between two vectors. When two vectors are multiplied<br />
(transformed) by the same unitary matrix, their dot product is unchanged and their<br />
<strong>in</strong>dividual lengths are unchanged. This results <strong>in</strong> the angle between the two vectors<br />
rema<strong>in</strong><strong>in</strong>g unchanged.<br />
A “unitary transformation” (a matrix-vector product with a unitary matrix) thus
preserves geometrical relationships among vectors representing directions, forces, or
other physical quantities. In the case of a two-slot vector with real entries, this is<br />
simply a rotation. These sorts of computations are exceed<strong>in</strong>gly important <strong>in</strong> computer<br />
graphics such as games and real-time simulations, especially when <strong>in</strong>creased realism<br />
is achieved by perform<strong>in</strong>g many such computations quickly. We will see unitary<br />
matrices aga<strong>in</strong> <strong>in</strong> subsequent sections (especially Theorem OD) and <strong>in</strong> each <strong>in</strong>stance,<br />
consider the <strong>in</strong>terpretation of the unitary matrix as a sort of geometry-preserv<strong>in</strong>g<br />
transformation. Some authors use the term isometry to highlight this behavior. We<br />
will speak loosely of a unitary matrix as be<strong>in</strong>g a sort of generalized rotation.<br />
A f<strong>in</strong>al rem<strong>in</strong>der: the terms “dot product,” “symmetric matrix” and “orthogonal<br />
matrix” used <strong>in</strong> reference to vectors or matrices with real number entries are special<br />
cases of the terms “<strong>in</strong>ner product,” “Hermitian matrix” and “unitary matrix” that<br />
we use for vectors or matrices with complex number entries, so keep that <strong>in</strong> m<strong>in</strong>d as<br />
you read elsewhere.<br />
Read<strong>in</strong>g Questions
1. Compute the <strong>in</strong>verse of the coefficient matrix of the system of equations below and use<br />
the <strong>in</strong>verse to solve the system.<br />
    4x₁ + 10x₂ = 12
    2x₁ + 6x₂ = 4
2. In the read<strong>in</strong>g questions for Section MISLE you were asked to f<strong>in</strong>d the <strong>in</strong>verse of the<br />
3 × 3 matrix below.

    ⎡ 2  3  1⎤
    ⎢ 1 −2 −3⎥
    ⎣−2  4  6⎦
Because the matrix was not nons<strong>in</strong>gular, you had no theorems at that po<strong>in</strong>t that would<br />
allow you to compute the <strong>in</strong>verse. Expla<strong>in</strong> why you now know that the <strong>in</strong>verse does<br />
not exist (which is different than not be<strong>in</strong>g able to compute it) by quot<strong>in</strong>g the relevant<br />
theorem’s acronym.<br />
3. Is the matrix A unitary? Why?

    A = ⎡ (4+2i)/√22    (5+3i)/√374   ⎤
        ⎣ (−1−i)/√22    (12+14i)/√374 ⎦
Exercises<br />
C20  Let

    A = ⎡1 2 1⎤      B = ⎡−1 1 0⎤
        ⎢0 1 1⎥          ⎢ 1 2 1⎥
        ⎣1 0 2⎦          ⎣ 0 1 1⎦

Verify that AB is nonsingular.
C40†  Solve the system of equations below using the inverse of a matrix.

    x₁ + x₂ + 3x₃ + x₄ = 5
    −2x₁ − x₂ − 4x₃ − x₄ = −7
    x₁ + 4x₂ + 10x₃ + 2x₄ = 9
    −2x₁ − 4x₃ + 5x₄ = 9
M10†  Find values of x, y, z so that the matrix

    A = ⎡1 2 x⎤
        ⎢3 0 y⎥
        ⎣1 1 z⎦

is invertible.

M11†  Find values of x, y, z so that the matrix

    A = ⎡1 x 1⎤
        ⎢1 y 4⎥
        ⎣0 z 5⎦

is singular.
M15 † If A and B are n × n matrices, A is nons<strong>in</strong>gular, and B is s<strong>in</strong>gular, show directly<br />
that AB is s<strong>in</strong>gular, without us<strong>in</strong>g Theorem NPNT.<br />
M20 † Construct an example of a 4 × 4 unitary matrix.<br />
M80 † Matrix multiplication <strong>in</strong>teracts nicely with many operations. But not always with<br />
transform<strong>in</strong>g a matrix to reduced row-echelon form. Suppose that A is an m × n matrix<br />
and B is an n × p matrix. Let P be a matrix that is row-equivalent to A and in reduced
row-echelon form, Q be a matrix that is row-equivalent to B and <strong>in</strong> reduced row-echelon<br />
form, and let R be a matrix that is row-equivalent to AB and <strong>in</strong> reduced row-echelon form.<br />
Is PQ = R? (In other words, with nonstandard notation, is rref(A)rref(B) = rref(AB)?)<br />
Construct a counterexample to show that, <strong>in</strong> general, this statement is false. Then f<strong>in</strong>d a<br />
large class of matrices where if A and B are <strong>in</strong> the class, then the statement is true.<br />
T10  Suppose that Q and P are unitary matrices of size n. Prove that QP is a unitary matrix.
T11  Prove that Hermitian matrices (Definition HM) have real entries on the diagonal. More precisely, suppose that A is a Hermitian matrix of size n. Then [A]ᵢᵢ ∈ R for 1 ≤ i ≤ n.
T12  Suppose that we are checking if a square matrix of size n is unitary. Show that a straightforward application of Theorem CUMOS requires the computation of n² inner products when the matrix is unitary, and fewer when the matrix is not unitary. Then show that this maximum number of inner products can be reduced to n(n + 1)/2 in light of Theorem IPAC.
T25  The notation Aᵏ means a repeated matrix product between k copies of the square matrix A.

1. Assume A is an n × n matrix where A² = O (which does not imply that A = O). Prove that Iₙ − A is invertible by showing that Iₙ + A is an inverse of Iₙ − A.
2. Assume that A is an n × n matrix where A³ = O. Prove that Iₙ − A is invertible.
3. Form a general theorem based on your observations from parts (1) and (2) and provide a proof.
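Part 1 can be sanity-checked numerically before proving it: when A² = O, both products (Iₙ − A)(Iₙ + A) and (Iₙ + A)(Iₙ − A) collapse to Iₙ − A² = Iₙ. A quick Python spot check with one hypothetical 2 × 2 nilpotent matrix (a single example, not a proof):

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[0, 1], [0, 0]]   # hypothetical choice: A != O, yet A squared is O
I2 = [[1, 0], [0, 1]]
ImA = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]   # I - A
IpA = [[I2[i][j] + A[i][j] for j in range(2)] for i in range(2)]   # I + A

assert matmul(A, A) == [[0, 0], [0, 0]]   # A is nilpotent of index 2
# Both matmul(ImA, IpA) and matmul(IpA, ImA) come out equal to I2.
```

For part 2 the analogous candidate inverse is Iₙ + A + A², which suggests the general pattern asked for in part 3.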
Section CRS<br />
Column and Row Spaces<br />
A matrix-vector product (Def<strong>in</strong>ition MVP) is a l<strong>in</strong>ear comb<strong>in</strong>ation of the columns<br />
of the matrix and this allows us to connect matrix multiplication with systems of<br />
equations via Theorem SLSLC. Row operations are l<strong>in</strong>ear comb<strong>in</strong>ations of the rows<br />
of a matrix, and of course, reduced row-echelon form (Definition RREF) is also
<strong>in</strong>timately related to solv<strong>in</strong>g systems of equations. In this section we will formalize<br />
these ideas with two key def<strong>in</strong>itions of sets of vectors derived from a matrix.<br />
Subsection CSSE<br />
Column Spaces and Systems of Equations<br />
Theorem SLSLC showed us that there is a natural correspondence between solutions<br />
to l<strong>in</strong>ear systems and l<strong>in</strong>ear comb<strong>in</strong>ations of the columns of the coefficient matrix.<br />
This idea motivates the follow<strong>in</strong>g important def<strong>in</strong>ition.<br />
Def<strong>in</strong>ition CSM Column Space of a Matrix<br />
Suppose that A is an m × n matrix with columns A 1 , A 2 , A 3 , ..., A n . Then<br />
the column space of A, written C(A), is the subset of C m conta<strong>in</strong><strong>in</strong>g all l<strong>in</strong>ear<br />
comb<strong>in</strong>ations of the columns of A,<br />
C(A) =〈{A 1 , A 2 , A 3 , ..., A n }〉<br />
□<br />
Some authors refer to the column space of a matrix as the range, but we will<br />
reserve this term for use with l<strong>in</strong>ear transformations (Def<strong>in</strong>ition RLT).<br />
Upon encounter<strong>in</strong>g any new set, the first question we ask is what objects are <strong>in</strong><br />
the set, and which objects are not? Here is an example of one way to answer this<br />
question, and it will motivate a theorem that will then answer the question precisely.<br />
Example CSMCS Column space of a matrix and consistent systems<br />
Archetype D and Archetype E are l<strong>in</strong>ear systems of equations, with an identical<br />
3 × 4 coefficient matrix, which we call A here. However, Archetype D is consistent,<br />
while Archetype E is not. We can expla<strong>in</strong> this difference by employ<strong>in</strong>g the column<br />
space of the matrix A.<br />
The column vector of constants, b, in Archetype D is given below, and one solution listed for LS(A, b) is x,

    b = ⎡  8⎤        x = ⎡7⎤
        ⎢−12⎥            ⎢8⎥
        ⎣  4⎦            ⎢1⎥
                         ⎣3⎦
By Theorem SLSLC, we can summarize this solution as a linear combination of the columns of A that equals b,

    7⎡ 2⎤ + 8⎡1⎤ + 1⎡ 7⎤ + 3⎡−7⎤ = ⎡  8⎤ = b.
     ⎢−3⎥    ⎢4⎥    ⎢−5⎥    ⎢−6⎥   ⎢−12⎥
     ⎣ 1⎦    ⎣1⎦    ⎣ 4⎦    ⎣−5⎦   ⎣  4⎦

This equation says that b is a linear combination of the columns of A, and then by Definition CSM, we can say that b ∈ C(A).
On the other hand, Archetype E is the linear system LS(A, c), where the vector of constants is

    c = ⎡2⎤
        ⎢3⎥
        ⎣2⎦

and this system of equations is inconsistent. This means c ∉ C(A), for if it were, then it would equal a linear combination of the columns of A and Theorem SLSLC would lead us to a solution of the system LS(A, c).  △
So if we fix the coefficient matrix, and vary the vector of constants, we can sometimes find consistent systems, and sometimes inconsistent systems. The vectors of constants that lead to consistent systems are exactly the elements of the column space. This is the content of the next theorem, and since it is an equivalence, it provides an alternate view of the column space.

Theorem CSCS  Column Spaces and Consistent Systems
Suppose A is an m × n matrix and b is a vector of size m. Then b ∈ C(A) if and only if LS(A, b) is consistent.
Proof. (⇒) Suppose b ∈ C(A). Then we can write b as some linear combination of the columns of A. By Theorem SLSLC we can use the scalars from this linear combination to form a solution to LS(A, b), so this system is consistent.

(⇐) If LS(A, b) is consistent, there is a solution that may be used with Theorem SLSLC to write b as a linear combination of the columns of A. This qualifies b for membership in C(A). ∎
This theorem tells us that asking if the system LS(A, b) is consistent is exactly the same question as asking if b is in the column space of A. Or equivalently, it tells us that the column space of the matrix A is precisely those vectors of constants, b, that can be paired with A to create a system of linear equations LS(A, b) that is consistent.

Employing Theorem SLEMM we can form the chain of equivalences
\[
b \in C(A) \iff LS(A,\,b)\ \text{is consistent} \iff Ax = b\ \text{for some}\ x
\]
Thus, an alternative (and popular) definition of the column space of an m × n matrix A is
\[
C(A) = \{\, y \in \mathbb{C}^m \mid y = Ax\ \text{for some}\ x \in \mathbb{C}^n \,\} = \{\, Ax \mid x \in \mathbb{C}^n \,\} \subseteq \mathbb{C}^m
\]
We recognize this as saying create all the matrix-vector products possible with the matrix A by letting x range over all of the possibilities. By Definition MVP we see that this means take all possible linear combinations of the columns of A, which is precisely the definition of the column space (Definition CSM) we have chosen.

Notice how this formulation of the column space looks very much like the definition of the null space of a matrix (Definition NSM), but for a rectangular matrix the column vectors of C(A) and N(A) have different sizes, so the sets are very different.

Given a vector b and a matrix A it is now very mechanical to test if b ∈ C(A). Form the linear system LS(A, b), row-reduce the augmented matrix, [A | b], and test for consistency with Theorem RCLS. Here is an example of this procedure.
Example MCSM  Membership in the column space of a matrix
Consider the column space of the 3 × 4 matrix A,
\[
A = \begin{bmatrix} 3 & 2 & 1 & -4 \\ -1 & 1 & -2 & 3 \\ 2 & -4 & 6 & -8 \end{bmatrix}
\]
We first show that the vector
\[
v = \begin{bmatrix} 18 \\ -6 \\ 12 \end{bmatrix}
\]
is in the column space of A, v ∈ C(A). Theorem CSCS says we need only check the consistency of LS(A, v). Form the augmented matrix and row-reduce,
\[
\begin{bmatrix} 3 & 2 & 1 & -4 & 18 \\ -1 & 1 & -2 & 3 & -6 \\ 2 & -4 & 6 & -8 & 12 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 1 & -2 & 6 \\ 0 & 1 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
Since the final column is not a pivot column, Theorem RCLS tells us the system is consistent and therefore by Theorem CSCS, v ∈ C(A).
If we wished to demonstrate explicitly that v is a linear combination of the columns of A, we can find a solution (any solution) of LS(A, v) and use Theorem SLSLC to construct the desired linear combination. For example, set the free variables to x_3 = 2 and x_4 = 1. Then a solution has x_2 = 1 and x_1 = 6. Then by Theorem SLSLC,
\[
v = \begin{bmatrix} 18 \\ -6 \\ 12 \end{bmatrix}
= 6\begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}
+ 1\begin{bmatrix} 2 \\ 1 \\ -4 \end{bmatrix}
+ 2\begin{bmatrix} 1 \\ -2 \\ 6 \end{bmatrix}
+ 1\begin{bmatrix} -4 \\ 3 \\ -8 \end{bmatrix}
\]
Now we show that the vector
\[
w = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix}
\]
is not in the column space of A, w ∉ C(A). Theorem CSCS says we need only check the consistency of LS(A, w). Form the augmented matrix and row-reduce,
\[
\begin{bmatrix} 3 & 2 & 1 & -4 & 2 \\ -1 & 1 & -2 & 3 & 1 \\ 2 & -4 & 6 & -8 & -3 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 1 & -2 & 0 \\ 0 & 1 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\]
Since the final column is a pivot column, Theorem RCLS tells us the system is inconsistent and therefore by Theorem CSCS, w ∉ C(A). △
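The membership test just illustrated is easy to automate. Below is a short computational sketch (in Python with NumPy, which the text itself does not use) that phrases Theorem RCLS in terms of rank: LS(A, b) is consistent exactly when appending b to A as a final column does not increase the rank. The function name `in_column_space` is our own invention.

```python
import numpy as np

def in_column_space(A, b):
    """Return True if b is in C(A), i.e. LS(A, b) is consistent.

    By Theorem CSCS this holds exactly when appending b to A as a
    final column does not increase the rank (the final column of the
    row-reduced augmented matrix is then not a pivot column)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    augmented = np.hstack([A, b])
    return np.linalg.matrix_rank(augmented) == np.linalg.matrix_rank(A)

# The matrix and vectors of Example MCSM
A = [[3, 2, 1, -4],
     [-1, 1, -2, 3],
     [2, -4, 6, -8]]
v = [18, -6, 12]   # shown above to lie in C(A)
w = [2, 1, -3]     # shown above to lie outside C(A)

print(in_column_space(A, v))  # True
print(in_column_space(A, w))  # False
```

A rank comparison avoids computing the reduced row-echelon form explicitly, though for matrices with entries of wildly different magnitudes the floating-point rank tolerance would deserve attention.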
Theorem CSCS completes a collection of three theorems, and one definition, that deserve comment. Many questions about spans, linear independence, null spaces, column spaces and similar objects can be converted to questions about systems of equations (homogeneous or not), which we understand well from our previous results, especially those in Chapter SLE. These previous results include theorems like Theorem RCLS, which allows us to quickly decide consistency of a system, and Theorem BNS, which allows us to describe solution sets for homogeneous systems compactly as the span of a linearly independent set of column vectors.

The table below lists these four definitions and theorems along with a brief reminder of the statement and an example of how the statement is used.
Definition NSM
    Synopsis:  Null space is solution set of homogeneous system
    Example:   General solution sets described by Theorem PSPHS

Theorem SLSLC
    Synopsis:  Solutions for linear combinations with unknown scalars
    Example:   Deciding membership in spans

Theorem SLEMM
    Synopsis:  System of equations represented by matrix-vector product
    Example:   Solution to LS(A, b) is A^{-1}b when A is nonsingular

Theorem CSCS
    Synopsis:  Column space vectors create consistent systems
    Example:   Deciding membership in column spaces
Subsection CSSOC
Column Space Spanned by Original Columns

So we have a foolproof, automated procedure for determining membership in C(A). While this works just fine a vector at a time, we would like to have a more useful description of the set C(A) as a whole. The next example will preview the first of two fundamental results about the column space of a matrix.
Example CSTW  Column space, two ways
Consider the 5 × 7 matrix A,
\[
A = \begin{bmatrix}
2 & 4 & 1 & -1 & 1 & 4 & 4 \\
1 & 2 & 1 & 0 & 2 & 4 & 7 \\
0 & 0 & 1 & 4 & 1 & 8 & 7 \\
1 & 2 & -1 & 2 & 1 & 9 & 6 \\
-2 & -4 & 1 & 3 & -1 & -2 & -2
\end{bmatrix}
\]
According to the definition (Definition CSM), the column space of A is
\[
C(A) = \left\langle\left\{
\begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \\ -2 \end{bmatrix},
\begin{bmatrix} 4 \\ 2 \\ 0 \\ 2 \\ -4 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \\ 1 \end{bmatrix},
\begin{bmatrix} -1 \\ 0 \\ 4 \\ 2 \\ 3 \end{bmatrix},
\begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \\ -1 \end{bmatrix},
\begin{bmatrix} 4 \\ 4 \\ 8 \\ 9 \\ -2 \end{bmatrix},
\begin{bmatrix} 4 \\ 7 \\ 7 \\ 6 \\ -2 \end{bmatrix}
\right\}\right\rangle
\]
While this is a concise description of an infinite set, we might be able to describe the span with fewer than seven vectors. This is the substance of Theorem BS. So we take these seven vectors and make them the columns of a matrix, which is simply the original matrix A again. Now we row-reduce,
\[
\begin{bmatrix}
2 & 4 & 1 & -1 & 1 & 4 & 4 \\
1 & 2 & 1 & 0 & 2 & 4 & 7 \\
0 & 0 & 1 & 4 & 1 & 8 & 7 \\
1 & 2 & -1 & 2 & 1 & 9 & 6 \\
-2 & -4 & 1 & 3 & -1 & -2 & -2
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 2 & 0 & 0 & 0 & 3 & 1 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & 2 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 3 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
The pivot columns are D = {1, 3, 4, 5}, so we can create the set
\[
T = \left\{
\begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \\ -2 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 1 \\ -1 \\ 1 \end{bmatrix},
\begin{bmatrix} -1 \\ 0 \\ 4 \\ 2 \\ 3 \end{bmatrix},
\begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \\ -1 \end{bmatrix}
\right\}
\]
and know that C(A) = ⟨T⟩ and T is a linearly independent set of columns from the set of columns of A. △

We will now formalize the previous example, which will make it trivial to determine a linearly independent set of vectors that will span the column space of a matrix, and is constituted of just columns of A.
Theorem BCS  Basis of the Column Space
Suppose that A is an m × n matrix with columns A_1, A_2, A_3, ..., A_n, and B is a row-equivalent matrix in reduced row-echelon form with r pivot columns. Let D = {d_1, d_2, d_3, ..., d_r} be the set of indices for the pivot columns of B. Let T = {A_{d_1}, A_{d_2}, A_{d_3}, ..., A_{d_r}}. Then

1. T is a linearly independent set.
2. C(A) = ⟨T⟩.

Proof. Definition CSM describes the column space as the span of the set of columns of A. Theorem BS tells us that we can reduce the set of vectors used in a span. If we apply Theorem BS to C(A), we would collect the columns of A into a matrix (which would just be A again) and bring the matrix to reduced row-echelon form, which is the matrix B in the statement of the theorem. In this case, the conclusions of Theorem BS applied to A, B and C(A) are exactly the conclusions we desire. ∎
This is a nice result since it gives us a handful of vectors that describe the entire column space (through the span), and we believe this set is as small as possible because we cannot create any more relations of linear dependence to trim it down further. Furthermore, we defined the column space (Definition CSM) as all linear combinations of the columns of the matrix, and the elements of the set T are still columns of the matrix (we will not be so lucky in the next two constructions of the column space).

Procedurally this theorem is extremely easy to apply. Row-reduce the original matrix, identify the r pivot columns of the reduced matrix, and grab the columns of the original matrix with the same indices as the pivot columns. But it is still important to study the proof of Theorem BS and its motivation in Example COV, which lie at the root of this theorem. We will trot through an example all the same.
Example CSOCD  Column space, original columns, Archetype D
Let us determine a compact expression for the entire column space of the coefficient matrix of the system of equations that is Archetype D. Notice that in Example CSMCS we were only determining if individual vectors were in the column space or not; now we are describing the entire column space.

To start with the application of Theorem BCS, call the coefficient matrix A and row-reduce it to reduced row-echelon form B,
\[
A = \begin{bmatrix} 2 & 1 & 7 & -7 \\ -3 & 4 & -5 & -6 \\ 1 & 1 & 4 & -5 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 0 & 3 & -2 \\ 0 & 1 & 1 & -3 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
Since columns 1 and 2 are pivot columns, D = {1, 2}. To construct a set that spans C(A), just grab the columns of A with indices in D, so
\[
C(A) = \left\langle\left\{
\begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix},
\begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix}
\right\}\right\rangle.
\]
That's it.
In Example CSMCS we determined that the vector
\[
c = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}
\]
was not in the column space of A. Try to write c as a linear combination of the first two columns of A. What happens?

Also in Example CSMCS we determined that the vector
\[
b = \begin{bmatrix} 8 \\ -12 \\ 4 \end{bmatrix}
\]
was in the column space of A. Try to write b as a linear combination of the first two columns of A. What happens? Did you find a unique solution to this question? Hmmmm. △
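The recipe in Theorem BCS can also be sketched computationally. The following Python/NumPy fragment (an illustration of ours, not part of the text) finds the pivot column indices of the Archetype D coefficient matrix by watching the rank grow as columns are appended, then checks that the selected original columns span C(A). The helper name `pivot_column_indices` is our own, and the indices are 0-based as is usual in Python.

```python
import numpy as np

def pivot_column_indices(A):
    """Indices of the pivot columns of A (0-based).

    Column j is a pivot column exactly when it is not a linear
    combination of the columns before it, i.e. when including it
    increases the rank of the leading submatrix."""
    A = np.asarray(A, dtype=float)
    indices, rank = [], 0
    for j in range(A.shape[1]):
        new_rank = np.linalg.matrix_rank(A[:, :j + 1])
        if new_rank > rank:
            indices.append(j)
            rank = new_rank
    return indices

# Coefficient matrix of Archetype D
A = [[2, 1, 7, -7],
     [-3, 4, -5, -6],
     [1, 1, 4, -5]]

D = pivot_column_indices(A)
print(D)  # [0, 1] -- columns 1 and 2 in the book's 1-based numbering

# Theorem BCS: those original columns span C(A); adding the remaining
# columns of A does not raise the rank beyond that of T
T = np.asarray(A, dtype=float)[:, D]
assert np.linalg.matrix_rank(T) == np.linalg.matrix_rank(A)
```

In exact arithmetic this incremental rank test selects the same columns as row reduction does; with floating-point entries the rank tolerance matters, so a library RREF over exact rationals is preferable for ill-conditioned input.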
Subsection CSNM
Column Space of a Nonsingular Matrix

Let us specialize to square matrices and contrast the column spaces of the coefficient matrices in Archetype A and Archetype B.

Example CSAA  Column space of Archetype A
The coefficient matrix in Archetype A is A, which row-reduces to B,
\[
A = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
\]
Columns 1 and 2 are pivot columns, so by Theorem BCS we can write
\[
C(A) = \langle\{A_1,\, A_2\}\rangle = \left\langle\left\{
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},
\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}
\right\}\right\rangle.
\]
We want to show in this example that C(A) ≠ C^3. So take, for example, the vector
\[
b = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}.
\]
Then there is no solution to the system LS(A, b), or equivalently, it is not possible to write b as a linear combination of A_1 and A_2. Try one of these two computations yourself. (Or try both!) Since b ∉ C(A), the column space of A cannot be all of C^3. So by varying the vector of constants, it is possible to create inconsistent systems of equations with this coefficient matrix (the vector b being one such example).

In Example MWIAA we wished to show that the coefficient matrix from Archetype A was not invertible as a first example of a matrix without an inverse. Our device there was to find an inconsistent linear system with A as the coefficient matrix. The vector of constants in that example was b, deliberately chosen outside the column space of A. △
Example CSAB  Column space of Archetype B
The coefficient matrix in Archetype B, call it B here, is known to be nonsingular (see Example NM). By Theorem NMUS, the linear system LS(B, b) has a (unique) solution for every choice of b. Theorem CSCS then says that b ∈ C(B) for all b ∈ C^3. Stated differently, there is no way to build an inconsistent system with the coefficient matrix B, but then we knew that already from Theorem NMUS. △
Example CSAA and Example CSAB together motivate the following equivalence, which says that nonsingular matrices have column spaces that are as big as possible.

Theorem CSNM  Column Space of a Nonsingular Matrix
Suppose A is a square matrix of size n. Then A is nonsingular if and only if C(A) = C^n.

Proof. (⇒) Suppose A is nonsingular. We wish to establish the set equality C(A) = C^n. By Definition CSM, C(A) ⊆ C^n. To show that C^n ⊆ C(A) choose b ∈ C^n. By Theorem NMUS, we know the linear system LS(A, b) has a (unique) solution and therefore is consistent. Theorem CSCS then says that b ∈ C(A). So by Definition SE, C(A) = C^n.

(⇐) If e_i is column i of the n × n identity matrix (Definition SUV) and by hypothesis C(A) = C^n, then e_i ∈ C(A) for 1 ≤ i ≤ n. By Theorem CSCS, the system LS(A, e_i) is consistent for 1 ≤ i ≤ n. Let b_i denote any one particular solution to LS(A, e_i), 1 ≤ i ≤ n.

Define the n × n matrix B = [b_1|b_2|b_3|...|b_n]. Then
\[
\begin{aligned}
AB &= A[\,b_1|b_2|b_3|\ldots|b_n\,] \\
&= [\,Ab_1|Ab_2|Ab_3|\ldots|Ab_n\,] && \text{Definition MM}\\
&= [\,e_1|e_2|e_3|\ldots|e_n\,] \\
&= I_n && \text{Definition SUV}
\end{aligned}
\]
So the matrix B is a "right-inverse" for A. By Theorem NMRRI, I_n is a nonsingular matrix, so by Theorem NPNT both A and B are nonsingular. Thus, in particular, A is nonsingular. (Travis Osborne contributed to this proof.) ∎
With this equivalence for nonsingular matrices we can update our list, Theorem NME3.

Theorem NME4  Nonsingular Matrix Equivalences, Round 4
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A are a linearly independent set.
6. A is invertible.
7. The column space of A is C^n, C(A) = C^n.

Proof. Since Theorem CSNM is an equivalence, we can add it to the list in Theorem NME3. ∎
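Theorem CSNM can be restated in terms of rank: a square matrix of size n is nonsingular exactly when its rank is n, which happens exactly when C(A) = C^n. The sketch below (Python/NumPy, our own illustration) checks this for the singular coefficient matrix of Archetype A and for an assumed nonsingular 3 × 3 matrix that does not appear in the text.

```python
import numpy as np

def column_space_is_all_of_Cn(A):
    """For square A of size n: C(A) = C^n exactly when rank(A) = n,
    which is Theorem CSNM restated in terms of rank."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return np.linalg.matrix_rank(A) == n

# Singular coefficient matrix of Archetype A: C(A) is a proper subset of C^3
A = [[1, -1, 2],
     [2, 1, 1],
     [1, 1, 0]]
print(column_space_is_all_of_Cn(A))  # False

# b = (1, 3, 2) lies outside C(A), so augmenting raises the rank from 2 to 3
b = np.array([[1], [3], [2]])
print(np.linalg.matrix_rank(np.hstack([np.asarray(A, float), b])))  # 3

# A nonsingular matrix (an assumed example, not from the text):
# every vector of constants gives a consistent system
B = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]
print(column_space_is_all_of_Cn(B))  # True
```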
Subsection RSM
Row Space of a Matrix

The rows of a matrix can be viewed as vectors, since they are just lists of numbers, arranged horizontally. So we will transpose a matrix, turning rows into columns, so we can then manipulate rows as column vectors. As a result we will be able to make some new connections between row operations and solutions to systems of equations. OK, here is the second primary definition of this section.

Definition RSM  Row Space of a Matrix
Suppose A is an m × n matrix. Then the row space of A, R(A), is the column space of A^t, i.e. R(A) = C(A^t). □
Informally, the row space is the set of all linear combinations of the rows of A. However, we write the rows as column vectors, thus the necessity of using the transpose to make the rows into columns. Additionally, with the row space defined in terms of the column space, all of the previous results of this section can be applied to row spaces.

Notice that if A is a rectangular m × n matrix, then C(A) ⊆ C^m, while R(A) ⊆ C^n and the two sets are not comparable since they do not even hold objects of the same type. However, when A is square of size n, both C(A) and R(A) are subsets of C^n, though usually the sets will not be equal (but see Exercise CRS.M20).
Example RSAI  Row space of Archetype I
The coefficient matrix in Archetype I is
\[
I = \begin{bmatrix}
1 & 4 & 0 & -1 & 0 & 7 & -9 \\
2 & 8 & -1 & 3 & 9 & -13 & 7 \\
0 & 0 & 2 & -3 & -4 & 12 & -8 \\
-1 & -4 & 2 & 4 & 8 & -31 & 37
\end{bmatrix}
\]
To build the row space, we transpose the matrix,
\[
I^t = \begin{bmatrix}
1 & 2 & 0 & -1 \\
4 & 8 & 0 & -4 \\
0 & -1 & 2 & 2 \\
-1 & 3 & -3 & 4 \\
0 & 9 & -4 & 8 \\
7 & -13 & 12 & -31 \\
-9 & 7 & -8 & 37
\end{bmatrix}
\]
Then the columns of this matrix are used in a span to build the row space,
\[
R(I) = C(I^t) = \left\langle\left\{
\begin{bmatrix} 1 \\ 4 \\ 0 \\ -1 \\ 0 \\ 7 \\ -9 \end{bmatrix},
\begin{bmatrix} 2 \\ 8 \\ -1 \\ 3 \\ 9 \\ -13 \\ 7 \end{bmatrix},
\begin{bmatrix} 0 \\ 0 \\ 2 \\ -3 \\ -4 \\ 12 \\ -8 \end{bmatrix},
\begin{bmatrix} -1 \\ -4 \\ 2 \\ 4 \\ 8 \\ -31 \\ 37 \end{bmatrix}
\right\}\right\rangle.
\]
However, we can use Theorem BCS to get a slightly better description. First, row-reduce I^t,
\[
\begin{bmatrix}
1 & 0 & 0 & -\frac{31}{7} \\
0 & 1 & 0 & \frac{12}{7} \\
0 & 0 & 1 & \frac{13}{7} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]
Since the pivot columns have indices D = {1, 2, 3}, the column space of I^t can be spanned by just the first three columns of I^t,
\[
R(I) = C(I^t) = \left\langle\left\{
\begin{bmatrix} 1 \\ 4 \\ 0 \\ -1 \\ 0 \\ 7 \\ -9 \end{bmatrix},
\begin{bmatrix} 2 \\ 8 \\ -1 \\ 3 \\ 9 \\ -13 \\ 7 \end{bmatrix},
\begin{bmatrix} 0 \\ 0 \\ 2 \\ -3 \\ -4 \\ 12 \\ -8 \end{bmatrix}
\right\}\right\rangle.
\]
△

The row space would not be too interesting if it was simply the column space of the transpose. However, when we do row operations on a matrix we have no effect on the many linear combinations that can be formed with the rows of the matrix. This is stated more carefully in the following theorem.

Theorem REMRS  Row-Equivalent Matrices have equal Row Spaces
Suppose A and B are row-equivalent matrices. Then R(A) = R(B).
Proof. Two matrices are row-equivalent (Definition REM) if one can be obtained from another by a sequence of (possibly many) row operations. We will prove the theorem for two matrices that differ by a single row operation, and then this result can be applied repeatedly to get the full statement of the theorem. The row spaces of A and B are spans of the columns of their transposes. For each row operation we perform on a matrix, we can define an analogous operation on the columns. Perhaps we should call these column operations. Instead, we will still call them row operations, but we will apply them to the columns of the transposes.

Refer to the columns of A^t and B^t as A_i and B_i, 1 ≤ i ≤ m. The row operation that switches rows will just switch columns of the transposed matrices. This will have no effect on the possible linear combinations formed by the columns.

Suppose that B^t is formed from A^t by multiplying column A_t by α ≠ 0. In other words, B_t = αA_t, and B_i = A_i for all i ≠ t. We need to establish that two sets are equal, C(A^t) = C(B^t). We will take a generic element of one and show that it is contained in the other.
\[
\begin{aligned}
\beta_1 B_1 + \beta_2 B_2 &+ \beta_3 B_3 + \cdots + \beta_t B_t + \cdots + \beta_m B_m \\
&= \beta_1 A_1 + \beta_2 A_2 + \beta_3 A_3 + \cdots + \beta_t (\alpha A_t) + \cdots + \beta_m A_m \\
&= \beta_1 A_1 + \beta_2 A_2 + \beta_3 A_3 + \cdots + (\alpha\beta_t) A_t + \cdots + \beta_m A_m
\end{aligned}
\]
says that C(B^t) ⊆ C(A^t). Similarly,
\[
\begin{aligned}
\gamma_1 A_1 + \gamma_2 A_2 &+ \gamma_3 A_3 + \cdots + \gamma_t A_t + \cdots + \gamma_m A_m \\
&= \gamma_1 A_1 + \gamma_2 A_2 + \gamma_3 A_3 + \cdots + \alpha\left(\frac{\gamma_t}{\alpha}\right) A_t + \cdots + \gamma_m A_m \\
&= \gamma_1 A_1 + \gamma_2 A_2 + \gamma_3 A_3 + \cdots + \frac{\gamma_t}{\alpha}(\alpha A_t) + \cdots + \gamma_m A_m \\
&= \gamma_1 B_1 + \gamma_2 B_2 + \gamma_3 B_3 + \cdots + \frac{\gamma_t}{\alpha} B_t + \cdots + \gamma_m B_m
\end{aligned}
\]
says that C(A^t) ⊆ C(B^t). So R(A) = C(A^t) = C(B^t) = R(B) when a single row operation of the second type is performed.
Suppose now that B^t is formed from A^t by replacing A_t with αA_s + A_t for some α ∈ C and s ≠ t. In other words, B_t = αA_s + A_t, and B_i = A_i for i ≠ t.
\[
\begin{aligned}
\beta_1 B_1 + \beta_2 B_2 &+ \cdots + \beta_s B_s + \cdots + \beta_t B_t + \cdots + \beta_m B_m \\
&= \beta_1 A_1 + \beta_2 A_2 + \cdots + \beta_s A_s + \cdots + \beta_t (\alpha A_s + A_t) + \cdots + \beta_m A_m \\
&= \beta_1 A_1 + \beta_2 A_2 + \cdots + \beta_s A_s + \cdots + (\beta_t\alpha) A_s + \beta_t A_t + \cdots + \beta_m A_m \\
&= \beta_1 A_1 + \beta_2 A_2 + \cdots + \beta_s A_s + (\beta_t\alpha) A_s + \cdots + \beta_t A_t + \cdots + \beta_m A_m \\
&= \beta_1 A_1 + \beta_2 A_2 + \cdots + (\beta_s + \beta_t\alpha) A_s + \cdots + \beta_t A_t + \cdots + \beta_m A_m
\end{aligned}
\]
says that C(B^t) ⊆ C(A^t). Similarly,
\[
\begin{aligned}
\gamma_1 A_1 + \gamma_2 A_2 &+ \cdots + \gamma_s A_s + \cdots + \gamma_t A_t + \cdots + \gamma_m A_m \\
&= \gamma_1 A_1 + \gamma_2 A_2 + \cdots + \gamma_s A_s + \cdots + (-\alpha\gamma_t A_s + \alpha\gamma_t A_s) + \gamma_t A_t + \cdots + \gamma_m A_m \\
&= \gamma_1 A_1 + \gamma_2 A_2 + \cdots + (-\alpha\gamma_t + \gamma_s) A_s + \cdots + \gamma_t (\alpha A_s + A_t) + \cdots + \gamma_m A_m \\
&= \gamma_1 B_1 + \gamma_2 B_2 + \cdots + (-\alpha\gamma_t + \gamma_s) B_s + \cdots + \gamma_t B_t + \cdots + \gamma_m B_m
\end{aligned}
\]
says that C(A^t) ⊆ C(B^t). So R(A) = C(A^t) = C(B^t) = R(B) when a single row operation of the third type is performed.

So the row space of a matrix is preserved by each row operation, and hence row spaces of row-equivalent matrices are equal sets. ∎
Example RSREM  Row spaces of two row-equivalent matrices
In Example TREM we saw that the matrices
\[
A = \begin{bmatrix} 2 & -1 & 3 & 4 \\ 5 & 2 & -2 & 3 \\ 1 & 1 & 0 & 6 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 1 & 0 & 6 \\ 3 & 0 & -2 & -9 \\ 2 & -1 & 3 & 4 \end{bmatrix}
\]
are row-equivalent by demonstrating a sequence of two row operations that converted A into B. Applying Theorem REMRS we can say
\[
R(A) = \left\langle\left\{
\begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \end{bmatrix},
\begin{bmatrix} 5 \\ 2 \\ -2 \\ 3 \end{bmatrix},
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 6 \end{bmatrix}
\right\}\right\rangle
= \left\langle\left\{
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 6 \end{bmatrix},
\begin{bmatrix} 3 \\ 0 \\ -2 \\ -9 \end{bmatrix},
\begin{bmatrix} 2 \\ -1 \\ 3 \\ 4 \end{bmatrix}
\right\}\right\rangle
= R(B)
\]
△
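Theorem REMRS also has a convenient computational restatement: R(A) = R(B) exactly when stacking the rows of A and B together produces no more independent rows than either matrix has alone. The Python/NumPy sketch below (our own illustration, not from the text) checks this for the matrices of Example RSREM, and for a fresh matrix obtained from A by a single row operation.

```python
import numpy as np

def same_row_space(A, B):
    """R(A) = R(B) exactly when neither matrix's rows add anything
    to the span of the other's, i.e. stacking them does not raise
    the rank above that of either matrix alone."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    stacked = np.vstack([A, B])
    r = np.linalg.matrix_rank(stacked)
    return r == np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B)

# The row-equivalent matrices of Example RSREM
A = [[2, -1, 3, 4],
     [5, 2, -2, 3],
     [1, 1, 0, 6]]
B = [[1, 1, 0, 6],
     [3, 0, -2, -9],
     [2, -1, 3, 4]]
print(same_row_space(A, B))  # True

# A single row operation also leaves the row space alone:
# add 2 times row 1 of A to row 2
C = np.asarray(A, float)
C[1] += 2 * C[0]
print(same_row_space(A, C))  # True
```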
Theorem REMRS is at its best when one of the row-equivalent matrices is in reduced row-echelon form. The vectors that are zero rows can be ignored. (Who needs the zero vector when building a span? See Exercise LI.T10.) The echelon pattern insures that the nonzero rows yield vectors that are linearly independent. Here is the theorem.

Theorem BRS  Basis for the Row Space
Suppose that A is a matrix and B is a row-equivalent matrix in reduced row-echelon form. Let S be the set of nonzero columns of B^t. Then

1. R(A) = ⟨S⟩.
2. S is a linearly independent set.
Proof. From Theorem REMRS we know that R(A) = R(B). If B has any zero rows, these are columns of B^t that are the zero vector. We can safely toss out the zero vector in the span construction, since it can be recreated from the nonzero vectors by a linear combination where all the scalars are zero. So R(A) = ⟨S⟩.

Suppose B has r nonzero rows and let D = {d_1, d_2, d_3, ..., d_r} denote the indices of the pivot columns of B. Denote the r column vectors of B^t, the vectors in S, as B_1, B_2, B_3, ..., B_r. To show that S is linearly independent, start with a relation of linear dependence
\[
\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 + \cdots + \alpha_r B_r = 0
\]
Now consider this vector equality in location d_i. Since B is in reduced row-echelon form, the entries of column d_i of B are all zero, except for a leading 1 in row i. Thus, in B^t, row d_i is all zeros, excepting a 1 in column i. So, for 1 ≤ i ≤ r,
\[
\begin{aligned}
0 &= [\,0\,]_{d_i} && \text{Definition ZCV}\\
&= [\alpha_1 B_1 + \alpha_2 B_2 + \alpha_3 B_3 + \cdots + \alpha_r B_r]_{d_i} && \text{Definition RLDCV}\\
&= [\alpha_1 B_1]_{d_i} + [\alpha_2 B_2]_{d_i} + [\alpha_3 B_3]_{d_i} + \cdots + [\alpha_r B_r]_{d_i} && \text{Definition MA}\\
&= \alpha_1 [B_1]_{d_i} + \alpha_2 [B_2]_{d_i} + \alpha_3 [B_3]_{d_i} + \cdots + \alpha_r [B_r]_{d_i} && \text{Definition MSM}\\
&= \alpha_1(0) + \alpha_2(0) + \alpha_3(0) + \cdots + \alpha_i(1) + \cdots + \alpha_r(0) && \text{Definition RREF}\\
&= \alpha_i
\end{aligned}
\]
So we conclude that α_i = 0 for all 1 ≤ i ≤ r, establishing the linear independence of S (Definition LICV). ∎
Example IAS  Improving a span
Suppose in the course of analyzing a matrix (its column space, its null space, its ...) we encounter the following set of vectors, described by a span
\[
X = \left\langle\left\{
\begin{bmatrix} 1 \\ 2 \\ 1 \\ 6 \\ 6 \end{bmatrix},
\begin{bmatrix} 3 \\ -1 \\ 2 \\ -1 \\ 6 \end{bmatrix},
\begin{bmatrix} 1 \\ -1 \\ 0 \\ -1 \\ -2 \end{bmatrix},
\begin{bmatrix} -3 \\ 2 \\ -3 \\ 6 \\ -10 \end{bmatrix}
\right\}\right\rangle
\]
Let A be the matrix whose rows are the vectors in X, so by design X = R(A),
\[
A = \begin{bmatrix}
1 & 2 & 1 & 6 & 6 \\
3 & -1 & 2 & -1 & 6 \\
1 & -1 & 0 & -1 & -2 \\
-3 & 2 & -3 & 6 & -10
\end{bmatrix}
\]
Row-reduce A to form a row-equivalent matrix in reduced row-echelon form,
\[
B = \begin{bmatrix}
1 & 0 & 0 & 2 & -1 \\
0 & 1 & 0 & 3 & 1 \\
0 & 0 & 1 & -2 & 5 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Then Theorem BRS says we can grab the nonzero columns of B^t and write
\[
X = R(A) = R(B) = \left\langle\left\{
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 2 \\ -1 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 3 \\ 1 \end{bmatrix},
\begin{bmatrix} 0 \\ 0 \\ 1 \\ -2 \\ 5 \end{bmatrix}
\right\}\right\rangle
\]
These three vectors provide a much-improved description of X. There are fewer vectors, and the pattern of zeros and ones in the first three entries makes it easier to determine membership in X. △
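Theorem BRS is easy to carry out by machine. The sketch below (plain Python with exact `Fraction` arithmetic; the `rref` helper is our own, not a library routine) row-reduces the matrix A of Example IAS and keeps the nonzero rows, recovering the improved spanning set found above.

```python
from fractions import Fraction

def rref(rows):
    """Bring a matrix (list of lists) to reduced row-echelon form
    using exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in rows]
    lead = 0
    for r in range(len(M)):
        while lead < len(M[0]):
            # find a row at or below r with a nonzero entry in column `lead`
            pivot = next((i for i in range(r, len(M)) if M[i][lead] != 0), None)
            if pivot is None:
                lead += 1
                continue
            M[r], M[pivot] = M[pivot], M[r]
            M[r] = [x / M[r][lead] for x in M[r]]          # scale pivot row to a leading 1
            for i in range(len(M)):
                if i != r and M[i][lead] != 0:             # clear the rest of the column
                    M[i] = [a - M[i][lead] * b for a, b in zip(M[i], M[r])]
            lead += 1
            break
    return M

# The matrix A of Example IAS, whose rows span X
A = [[1, 2, 1, 6, 6],
     [3, -1, 2, -1, 6],
     [1, -1, 0, -1, -2],
     [-3, 2, -3, 6, -10]]

# Theorem BRS: the nonzero rows of the RREF give a linearly
# independent spanning set for the row space
basis = [row for row in rref(A) if any(row)]
for row in basis:
    print([int(x) for x in row])
# [1, 0, 0, 2, -1]
# [0, 1, 0, 3, 1]
# [0, 0, 1, -2, 5]
```

Exact rational arithmetic matters here: the clean pattern of zeros and ones that makes the improved span useful would be obscured by floating-point roundoff.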
Notice that in Example IAS all we had to do was row-reduce the right matrix and toss out a zero row. Next to row operations themselves, Theorem BRS is probably the most powerful computational technique at your disposal as it quickly provides a much improved description of a span, any span (row space, column space, ...).
Theorem BRS and the techniques of Example IAS will provide yet another description of the column space of a matrix. First we state a triviality as a theorem, so we can reference it later.
Theorem CSRST Column Space, Row Space, Transpose
Suppose A is a matrix. Then C(A) = R(A^t).
Proof.
\begin{align*}
C(A) &= C\left(\left(A^t\right)^t\right) && \text{Theorem TT}\\
&= R\left(A^t\right) && \text{Definition RSM}
\end{align*}
■
So to find another expression for the column space of a matrix, build its transpose, row-reduce it, toss out the zero rows, and convert the nonzero rows to column vectors to yield an improved set for the span construction. We will do Archetype I, then you do Archetype J.
Example CSROI Column space from row operations, Archetype I
To find the column space of the coefficient matrix of Archetype I, we proceed as follows. The matrix is
\[
I = \begin{bmatrix}
1 & 4 & 0 & -1 & 0 & 7 & -9 \\
2 & 8 & -1 & 3 & 9 & -13 & 7 \\
0 & 0 & 2 & -3 & -4 & 12 & -8 \\
-1 & -4 & 2 & 4 & 8 & -31 & 37
\end{bmatrix}
\]
The transpose is
\[
I^t = \begin{bmatrix}
1 & 2 & 0 & -1 \\
4 & 8 & 0 & -4 \\
0 & -1 & 2 & 2 \\
-1 & 3 & -3 & 4 \\
0 & 9 & -4 & 8 \\
7 & -13 & 12 & -31 \\
-9 & 7 & -8 & 37
\end{bmatrix}.
\]
Row-reduced this becomes
\[
\begin{bmatrix}
1 & 0 & 0 & -\frac{31}{7} \\
0 & 1 & 0 & \frac{12}{7} \\
0 & 0 & 1 & \frac{13}{7} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}.
\]
Now, using Theorem CSRST and Theorem BRS,
\[
C(I) = R\left(I^t\right) = \left\langle\left\{
\begin{bmatrix} 1 \\ 0 \\ 0 \\ -\frac{31}{7} \end{bmatrix},\,
\begin{bmatrix} 0 \\ 1 \\ 0 \\ \frac{12}{7} \end{bmatrix},\,
\begin{bmatrix} 0 \\ 0 \\ 1 \\ \frac{13}{7} \end{bmatrix}
\right\}\right\rangle.
\]
This is a very nice description of the column space. It uses fewer vectors than the 7 involved in the definition, and the pattern of zeros and ones in the first 3 entries can be used to advantage. For example, Archetype I is presented as a consistent system of equations with a vector of constants
\[
b = \begin{bmatrix} 3 \\ 9 \\ 1 \\ 4 \end{bmatrix}.
\]
Since LS(I, b) is consistent, Theorem CSCS tells us that b ∈ C(I). But we can see this quickly with the following computation, where the only real work happens in the 4th entry of the vectors, since the scalars in the linear combination are dictated by the first three entries of b.
\[
b = \begin{bmatrix} 3 \\ 9 \\ 1 \\ 4 \end{bmatrix}
= 3\begin{bmatrix} 1 \\ 0 \\ 0 \\ -\frac{31}{7} \end{bmatrix}
+ 9\begin{bmatrix} 0 \\ 1 \\ 0 \\ \frac{12}{7} \end{bmatrix}
+ 1\begin{bmatrix} 0 \\ 0 \\ 1 \\ \frac{13}{7} \end{bmatrix}
\]
Can you now rapidly construct several vectors, b, so that LS(I, b) is consistent, and several more so that the system is inconsistent?
△
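The transpose-and-row-reduce procedure of Example CSROI can be sketched computationally. The following SymPy snippet is our own illustration (the book's computational cells use Sage): transpose, row-reduce, and keep the nonzero rows as column vectors.

```python
# A sketch of Example CSROI with SymPy: C(I) via Theorem CSRST and Theorem BRS.
from sympy import Matrix

I_mat = Matrix([  # coefficient matrix of Archetype I
    [ 1,  4,  0, -1,  0,   7,  -9],
    [ 2,  8, -1,  3,  9, -13,   7],
    [ 0,  0,  2, -3, -4,  12,  -8],
    [-1, -4,  2,  4,  8, -31,  37],
])

# Row-reduce the transpose; the nonzero rows, as columns, span C(I).
R, pivots = I_mat.T.rref()
basis = [R[i, :].T for i in range(len(pivots))]

# The linear combination from the text: b = 3*v1 + 9*v2 + 1*v3 lies in C(I).
b = 3*basis[0] + 9*basis[1] + 1*basis[2]
```

The scalars 3, 9, 1 are read directly off the first three entries of b, exactly the "nice pattern" advantage described above.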
Reading Questions
1. Write the column space of the matrix below as the span of a set of three vectors and explain your choice of method.
\[
\begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 0 & 1 & 1 \\ -1 & 2 & 1 & 0 \end{bmatrix}
\]
2. Suppose that A is an n × n nonsingular matrix. What can you say about its column space?
3. Is the vector \(\begin{bmatrix} 0 \\ 5 \\ 2 \\ 3 \end{bmatrix}\) in the row space of the following matrix? Why or why not?
\[
\begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 0 & 1 & 1 \\ -1 & 2 & 1 & 0 \end{bmatrix}
\]
Exercises
C20 For each matrix below, find a set of linearly independent vectors X so that ⟨X⟩ equals the column space of the matrix, and a set of linearly independent vectors Y so that ⟨Y⟩ equals the row space of the matrix.
\[
A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 1 & 2 \\ 1 & -1 & 2 & 3 \\ 1 & 1 & 2 & -1 \end{bmatrix}
\qquad
B = \begin{bmatrix} 1 & 2 & 1 & 1 & 1 \\ 3 & 2 & -1 & 4 & 5 \\ 0 & 1 & 1 & 1 & 2 \end{bmatrix}
\qquad
C = \begin{bmatrix} 2 & 1 & 0 \\ 3 & 0 & 3 \\ 1 & 2 & -3 \\ 1 & 1 & -1 \\ 1 & 1 & -1 \end{bmatrix}
\]
From your results for these three matrices, can you formulate a conjecture about the sets X and Y?
C30† Example CSOCD expresses the column space of the coefficient matrix from Archetype D (call the matrix A here) as the span of the first two columns of A. In Example CSMCS we determined that the vector
\[
c = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}
\]
was not in the column space of A and that the vector
\[
b = \begin{bmatrix} 8 \\ -12 \\ 4 \end{bmatrix}
\]
was in the column space of A. Attempt to write c and b as linear combinations of the two vectors in the span construction for the column space in Example CSOCD and record your observations.
C31† For the matrix A below find a set of vectors T meeting the following requirements: (1) the span of T is the column space of A, that is, ⟨T⟩ = C(A), (2) T is linearly independent, and (3) the elements of T are columns of A.
\[
A = \begin{bmatrix} 2 & 1 & 4 & -1 & 2 \\ 1 & -1 & 5 & 1 & 1 \\ -1 & 2 & -7 & 0 & 1 \\ 2 & -1 & 8 & -1 & 2 \end{bmatrix}
\]
C32 In Example CSAA, verify that the vector b is not in the column space of the coefficient matrix.
C33† Find a linearly independent set S so that the span of S, ⟨S⟩, is the row space of the matrix B.
\[
B = \begin{bmatrix} 2 & 3 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ -1 & 2 & 3 & -4 \end{bmatrix}
\]
C34† For the 3 × 4 matrix A and the column vector y ∈ C^4 given below, determine if y is in the row space of A. In other words, answer the question: y ∈ R(A)?
\[
A = \begin{bmatrix} -2 & 6 & 7 & -1 \\ 7 & -3 & 0 & -3 \\ 8 & 0 & 7 & 6 \end{bmatrix}
\qquad
y = \begin{bmatrix} 2 \\ 1 \\ 3 \\ -2 \end{bmatrix}
\]
C35† For the matrix A below, find two different linearly independent sets whose spans equal the column space of A, C(A), such that
1. the elements are each columns of A.
2. the set is obtained by a procedure that is substantially different from the procedure you use in part (1).
\[
A = \begin{bmatrix} 3 & 5 & 1 & -2 \\ 1 & 2 & 3 & 3 \\ -3 & -4 & 7 & 13 \end{bmatrix}
\]
C40 The following archetypes are systems of equations. For each system, write the vector of constants as a linear combination of the vectors in the span construction for the column space provided by Theorem BCS (these vectors are listed for each of these archetypes).
Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
C42 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute a set of column vectors such that (1) the vectors are columns of the matrix, (2) the set is linearly independent, and (3) the span of the set is the column space of the matrix. See Theorem BCS.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C50 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute a set of column vectors such that (1) the set is linearly independent, and (2) the span of the set is the row space of the matrix. See Theorem BRS.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C51 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute the column space as the span of a linearly independent set as follows: transpose the matrix, row-reduce, toss out zero rows, convert rows into column vectors. See Example CSROI.
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C52 The following archetypes are systems of equations. For each different coefficient matrix build two new vectors of constants. The first should lead to a consistent system and the second should lead to an inconsistent system. Descriptions of the column space as spans of linearly independent sets of vectors with "nice patterns" of zeros and ones might be most useful and instructive in connection with this exercise. (See the end of Example CSROI.)
Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J
M10† For the matrix E below, find vectors b and c so that the system LS(E, b) is consistent and LS(E, c) is inconsistent.
\[
E = \begin{bmatrix} -2 & 1 & 1 & 0 \\ 3 & -1 & 0 & 2 \\ 4 & 1 & 1 & 6 \end{bmatrix}
\]
M20† Usually the column space and null space of a matrix contain vectors of different sizes. For a square matrix, though, the vectors in these two sets have the same size. Usually the two sets will be different. Construct an example of a square matrix where the column space and null space are equal.
M21† We have a variety of theorems about how to create column spaces and row spaces and they frequently involve row-reducing a matrix. Here is a procedure that some try to use to get a column space. Begin with an m × n matrix A and row-reduce to a matrix B with columns B_1, B_2, B_3, ..., B_n. Then form the column space of A as
\[
C(A) = \left\langle\left\{B_1,\, B_2,\, B_3,\, \ldots,\, B_n\right\}\right\rangle = C(B)
\]
This is not a legitimate procedure, and therefore is not a theorem. Construct an example to show that the procedure will not in general create the column space of A.
T40† Suppose that A is an m × n matrix and B is an n × p matrix. Prove that the column space of AB is a subset of the column space of A, that is, C(AB) ⊆ C(A). Provide an example where the opposite is false, in other words give an example where C(A) ⊈ C(AB). (Compare with Exercise MM.T40.)
T41† Suppose that A is an m × n matrix and B is an n × n nonsingular matrix. Prove that the column space of A is equal to the column space of AB, that is, C(A) = C(AB). (Compare with Exercise MM.T41 and Exercise CRS.T40.)
T45† Suppose that A is an m × n matrix and B is an n × m matrix where AB is a nonsingular matrix. Prove that
1. N(B) = {0}
2. C(B) ∩ N(A) = {0}
Discuss the case when m = n in connection with Theorem NPNT.
Section FS
Four Subsets
There are four natural subsets associated with a matrix. We have met three already: the null space, the column space and the row space. In this section we will introduce a fourth, the left null space. The objective of this section is to describe one procedure that will allow us to find linearly independent sets that span each of these four sets of column vectors. Along the way, we will make a connection with the inverse of a matrix, so Theorem FS will tie together nearly all of this chapter (and the entire course so far).
Subsection LNS
Left Null Space
Definition LNS Left Null Space
Suppose A is an m × n matrix. Then the left null space is defined as L(A) = N(A^t) ⊆ C^m.
□
The left null space will not feature prominently in the sequel, but we can explain its name and connect it to row operations. Suppose y ∈ L(A). Then by Definition LNS, A^t y = 0. We can then write
\begin{align*}
0^t &= \left(A^t y\right)^t && \text{Definition LNS}\\
&= y^t\left(A^t\right)^t && \text{Theorem MMT}\\
&= y^t A && \text{Theorem TT}
\end{align*}
The product y^t A can be viewed as the components of y acting as the scalars in a linear combination of the rows of A. And the result is a "row vector", 0^t, that is entirely zeros. When we apply a sequence of row operations to a matrix, each row of the resulting matrix is some linear combination of the rows. These observations tell us that the vectors in the left null space are scalars that record a sequence of row operations that result in a row of zeros in the row-reduced version of the matrix. We will see this idea more explicitly in the course of proving Theorem FS.
Example LNS Left null space
We will find the left null space of
\[
A = \begin{bmatrix} 1 & -3 & 1 \\ -2 & 1 & 1 \\ 1 & 5 & 1 \\ 9 & -4 & 0 \end{bmatrix}
\]
We transpose A and row-reduce,
\[
A^t = \begin{bmatrix} 1 & -2 & 1 & 9 \\ -3 & 1 & 5 & -4 \\ 1 & 1 & 1 & 0 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -3 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\]
Applying Definition LNS and Theorem BNS we have
\[
L(A) = N\left(A^t\right) = \left\langle\left\{ \begin{bmatrix} -2 \\ 3 \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle
\]
If you row-reduce A you will discover one zero row in the reduced row-echelon form. This zero row is created by a sequence of row operations, which in total amounts to a linear combination, with scalars a_1 = -2, a_2 = 3, a_3 = -1 and a_4 = 1, of the rows of A, and which results in the zero vector (check this!). So the components of the vector describing the left null space of A provide a relation of linear dependence on the rows of A.
△
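Example LNS can also be checked mechanically. The sketch below uses SymPy (our own illustration; the book's computational cells use Sage): compute N(A^t) and confirm that its basis vector really does give a relation of linear dependence on the rows of A.

```python
# A sketch of Example LNS with SymPy: the left null space L(A) = N(A^t).
from sympy import Matrix

A = Matrix([
    [ 1, -3, 1],
    [-2,  1, 1],
    [ 1,  5, 1],
    [ 9, -4, 0],
])

lns = A.T.nullspace()   # basis for N(A^t) = L(A)
y = lns[0]

# y^t * A is the linear combination of the rows of A with scalars from y;
# it is the zero row vector, exhibiting the dependence relation on the rows.
zero_row = y.T * A
```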
Subsection CCS
Computing Column Spaces
We have three ways to build the column space of a matrix. First, we can use just the definition, Definition CSM, and express the column space as a span of the columns of the matrix. A second approach gives us the column space as the span of some of the columns of the matrix, and additionally, this set is linearly independent (Theorem BCS). Finally, we can transpose the matrix, row-reduce the transpose, kick out zero rows, and write the remaining rows as column vectors. Theorem CSRST and Theorem BRS tell us that the resulting vectors are linearly independent and their span is the column space of the original matrix.
We will now demonstrate a fourth method by way of a rather complicated example. Study this example carefully, but realize that its main purpose is to motivate a theorem that simplifies much of the apparent complexity. So other than an instructive exercise or two, the procedure we are about to describe will not be a usual approach to computing a column space.
Example CSANS Column space as null space
Let us find the column space of the matrix A below with a new approach.
\[
A = \begin{bmatrix}
10 & 0 & 3 & 8 & 7 \\
-16 & -1 & -4 & -10 & -13 \\
-6 & 1 & -3 & -6 & -6 \\
0 & 2 & -2 & -3 & -2 \\
3 & 0 & 1 & 2 & 3 \\
-1 & -1 & 1 & 1 & 0
\end{bmatrix}
\]
By Theorem CSCS we know that the column vector b is in the column space of A if and only if the linear system LS(A, b) is consistent. So let us try to solve this system in full generality, using a vector of variables for the vector of constants. In other words, which vectors b lead to consistent systems? Begin by forming the augmented matrix [A | b] with a general version of b,
\[
[A \mid b] = \begin{bmatrix}
10 & 0 & 3 & 8 & 7 & b_1 \\
-16 & -1 & -4 & -10 & -13 & b_2 \\
-6 & 1 & -3 & -6 & -6 & b_3 \\
0 & 2 & -2 & -3 & -2 & b_4 \\
3 & 0 & 1 & 2 & 3 & b_5 \\
-1 & -1 & 1 & 1 & 0 & b_6
\end{bmatrix}
\]
To identify solutions we will bring this matrix to reduced row-echelon form. Despite the presence of variables in the last column, there is nothing to stop us from doing this, except that numerical computational routines cannot be used, and even some of the symbolic algebra routines do some unexpected maneuvers with this computation. So do it by hand. Yes, it is a bit of work. But worth it. We'll still be here when you get back. Notice along the way that the row operations are exactly the same ones you would do if you were just row-reducing the coefficient matrix alone, say in connection with a homogeneous system of equations. The column with the b_i acts as a sort of bookkeeping device. There are many different possibilities for the result, depending on what order you choose to perform the row operations, but shortly we will all be on the same page. If you want to match our work right now, use row 5 to remove any occurrence of b_1 from the other entries of the last column, and use row 6 to remove any occurrence of b_2 from the last column. We have:
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 2 & b_3 - b_4 + 2b_5 - b_6 \\
0 & 1 & 0 & 0 & -3 & -2b_3 + 3b_4 - 3b_5 + 3b_6 \\
0 & 0 & 1 & 0 & 1 & b_3 + b_4 + 3b_5 + 3b_6 \\
0 & 0 & 0 & 1 & -2 & -2b_3 + b_4 - 4b_5 \\
0 & 0 & 0 & 0 & 0 & b_1 + 3b_3 - b_4 + 3b_5 + b_6 \\
0 & 0 & 0 & 0 & 0 & b_2 - 2b_3 + b_4 + b_5 - b_6
\end{bmatrix}
\]
Our goal is to identify those vectors b which make LS(A, b) consistent. By Theorem RCLS we know that the consistent systems are precisely those without a pivot column in the last column. Are the expressions in the last column of rows 5 and 6 equal to zero, or are they leading 1's? The answer is: maybe. It depends on b. With a nonzero value for either of these expressions, we would scale the row and produce a leading 1. So we get a consistent system, and b is in the column space, if and only if these two expressions are both simultaneously zero. In other words, members of the column space of A are exactly those vectors b that satisfy
\begin{align*}
b_1 + 3b_3 - b_4 + 3b_5 + b_6 &= 0\\
b_2 - 2b_3 + b_4 + b_5 - b_6 &= 0
\end{align*}
Hmmm. Looks suspiciously like a homogeneous system of two equations with six variables. If you have been playing along (and we hope you have) then you may have a slightly different system, but you should have just two equations. Form the coefficient matrix and row-reduce (notice that the system above has a coefficient matrix that is already in reduced row-echelon form). We should all be together now with the same matrix,
\[
L = \begin{bmatrix} 1 & 0 & 3 & -1 & 3 & 1 \\ 0 & 1 & -2 & 1 & 1 & -1 \end{bmatrix}
\]
So C(A) = N(L), and we can apply Theorem BNS to obtain a linearly independent set to use in a span construction,
\[
C(A) = N(L) = \left\langle\left\{
\begin{bmatrix} -3 \\ 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},\,
\begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix},\,
\begin{bmatrix} -3 \\ -1 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix},\,
\begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\right\}\right\rangle
\]
Whew! As a postscript to this central example, you may wish to convince yourself that the four vectors above really are elements of the column space. Do they create consistent systems with A as coefficient matrix? Can you recognize the constant vector in your description of these solution sets?
OK, that was so much fun, let us do it again. But simpler this time. And we will all get the same results all the way through. Doing row operations by hand with variables can be a bit error prone, so let us see if we can improve the process some. Rather than row-reduce a column vector b full of variables, let us write b = I_6 b; we will row-reduce the matrix I_6 and, when we finish row-reducing, compute the matrix-vector product. You should first convince yourself that we can operate like this (this is the subject of a future homework exercise).
Rather than augmenting A with b, we will instead augment it with I_6 (does this feel familiar?),
\[
M = \begin{bmatrix}
10 & 0 & 3 & 8 & 7 & 1 & 0 & 0 & 0 & 0 & 0 \\
-16 & -1 & -4 & -10 & -13 & 0 & 1 & 0 & 0 & 0 & 0 \\
-6 & 1 & -3 & -6 & -6 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 2 & -2 & -3 & -2 & 0 & 0 & 0 & 1 & 0 & 0 \\
3 & 0 & 1 & 2 & 3 & 0 & 0 & 0 & 0 & 1 & 0 \\
-1 & -1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
We want to row-reduce the left-hand side of this matrix, but we will apply the same row operations to the right-hand side as well. And once we get the left-hand side in reduced row-echelon form, we will continue on to put leading 1's in the final two rows, as well as making pivot columns that contain these two additional leading 1's. It is these additional row operations that will ensure that we all get to the same
place, since the reduced row-echelon form is unique (Theorem RREFU),
\[
N = \begin{bmatrix}
1 & 0 & 0 & 0 & 2 & 0 & 0 & 1 & -1 & 2 & -1 \\
0 & 1 & 0 & 0 & -3 & 0 & 0 & -2 & 3 & -3 & 3 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 3 & 3 \\
0 & 0 & 0 & 1 & -2 & 0 & 0 & -2 & 1 & -4 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 3 & -1 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -2 & 1 & 1 & -1
\end{bmatrix}
\]
We are after the final six columns of this matrix, which we will multiply by b, so
\[
J = \begin{bmatrix}
0 & 0 & 1 & -1 & 2 & -1 \\
0 & 0 & -2 & 3 & -3 & 3 \\
0 & 0 & 1 & 1 & 3 & 3 \\
0 & 0 & -2 & 1 & -4 & 0 \\
1 & 0 & 3 & -1 & 3 & 1 \\
0 & 1 & -2 & 1 & 1 & -1
\end{bmatrix}
\]
and
\[
Jb = \begin{bmatrix}
0 & 0 & 1 & -1 & 2 & -1 \\
0 & 0 & -2 & 3 & -3 & 3 \\
0 & 0 & 1 & 1 & 3 & 3 \\
0 & 0 & -2 & 1 & -4 & 0 \\
1 & 0 & 3 & -1 & 3 & 1 \\
0 & 1 & -2 & 1 & 1 & -1
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \end{bmatrix}
= \begin{bmatrix}
b_3 - b_4 + 2b_5 - b_6 \\
-2b_3 + 3b_4 - 3b_5 + 3b_6 \\
b_3 + b_4 + 3b_5 + 3b_6 \\
-2b_3 + b_4 - 4b_5 \\
b_1 + 3b_3 - b_4 + 3b_5 + b_6 \\
b_2 - 2b_3 + b_4 + b_5 - b_6
\end{bmatrix}
\]
So by applying to the identity matrix the same row operations that row-reduce A (which we could do computationally once I_6 is placed alongside A), we can then arrive at the result of row-reducing a column of symbols where the vector of constants usually resides. Since the row-reduced version of A has two zero rows, for a consistent system we require that
\begin{align*}
b_1 + 3b_3 - b_4 + 3b_5 + b_6 &= 0\\
b_2 - 2b_3 + b_4 + b_5 - b_6 &= 0
\end{align*}
Now we are exactly back where we were on the first go-round. Notice that we obtain the matrix L as simply the last two rows and last six columns of N. △
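The second pass through Example CSANS is itself mechanical: augment with the identity, row-reduce, and read L off the bottom of N. The sketch below uses SymPy (our own illustration; the book's computational cells use Sage).

```python
# A sketch of the second approach in Example CSANS with SymPy:
# row-reduce [A | I_6] and extract L from the bottom rows.
from sympy import Matrix, eye, zeros

A = Matrix([
    [ 10,  0,  3,   8,   7],
    [-16, -1, -4, -10, -13],
    [ -6,  1, -3,  -6,  -6],
    [  0,  2, -2,  -3,  -2],
    [  3,  0,  1,   2,   3],
    [ -1, -1,  1,   1,   0],
])

N, _ = A.row_join(eye(6)).rref()  # unique RREF of [A | I_6]
L = N[4:, 5:]                     # last two rows, last six columns

# Every column of A lies in C(A) = N(L), so L*A must be the zero matrix.
check = L * A
```

The check `L * A = 0` is a quick way to convince yourself that the columns of A satisfy the two homogeneous conditions derived in the example.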
This example motivates the remainder of this section, so it is worth careful study. You might attempt to mimic the second approach with the coefficient matrices of Archetype I and Archetype J. We will see shortly that the matrix L contains more information about A than just the column space.
Subsection EEF
Extended Echelon Form
The final matrix that we row-reduced in Example CSANS should look familiar in most respects to the procedure we used to compute the inverse of a nonsingular matrix, Theorem CINM. We will now generalize that procedure to matrices that are not necessarily nonsingular, or even square. First a definition.
Definition EEF Extended Echelon Form
Suppose A is an m × n matrix. Extend A on its right side with the addition of an m × m identity matrix to form an m × (n + m) matrix M. Use row operations to bring M to reduced row-echelon form and call the result N. N is the extended reduced row-echelon form of A, and we will standardize on names for five submatrices (B, C, J, K, L) of N.
Let B denote the m × n matrix formed from the first n columns of N and let J denote the m × m matrix formed from the last m columns of N. Suppose that B has r nonzero rows. Further partition N by letting C denote the r × n matrix formed from all of the nonzero rows of B. Let K be the r × m matrix formed from the first r rows of J, while L will be the (m − r) × m matrix formed from the bottom m − r rows of J. Pictorially,
\[
M = [A \mid I_m] \xrightarrow{\text{RREF}} N = [B \mid J] = \begin{bmatrix} C & K \\ 0 & L \end{bmatrix}
\]
□
Example SEEF Submatrices of extended echelon form
We illustrate Definition EEF with the matrix A,
\[
A = \begin{bmatrix}
1 & -1 & -2 & 7 & 1 & 6 \\
-6 & 2 & -4 & -18 & -3 & -26 \\
4 & -1 & 4 & 10 & 2 & 17 \\
3 & -1 & 2 & 9 & 1 & 12
\end{bmatrix}
\]
Augmenting with the 4 × 4 identity matrix,
\[
M = \begin{bmatrix}
1 & -1 & -2 & 7 & 1 & 6 & 1 & 0 & 0 & 0 \\
-6 & 2 & -4 & -18 & -3 & -26 & 0 & 1 & 0 & 0 \\
4 & -1 & 4 & 10 & 2 & 17 & 0 & 0 & 1 & 0 \\
3 & -1 & 2 & 9 & 1 & 12 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
and row-reducing, we obtain
\[
N = \begin{bmatrix}
1 & 0 & 2 & 1 & 0 & 3 & 0 & 1 & 1 & 1 \\
0 & 1 & 4 & -6 & 0 & -1 & 0 & 2 & 3 & 0 \\
0 & 0 & 0 & 0 & 1 & 2 & 0 & -1 & 0 & -2 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 2 & 2 & 1
\end{bmatrix}
\]
So we then obtain
\[
B = \begin{bmatrix}
1 & 0 & 2 & 1 & 0 & 3 \\
0 & 1 & 4 & -6 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
\[
C = \begin{bmatrix} 1 & 0 & 2 & 1 & 0 & 3 \\ 0 & 1 & 4 & -6 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix}
\]
\[
J = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 2 & 3 & 0 \\ 0 & -1 & 0 & -2 \\ 1 & 2 & 2 & 1 \end{bmatrix}
\]
\[
K = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 2 & 3 & 0 \\ 0 & -1 & 0 & -2 \end{bmatrix}
\]
\[
L = \begin{bmatrix} 1 & 2 & 2 & 1 \end{bmatrix}
\]
You can observe (or verify) the properties of the following theorem with this example.
△
Theorem PEEF Properties of Extended Echelon Form
Suppose that A is an m × n matrix and that N is its extended echelon form. Then
1. J is nonsingular.
2. B = JA.
3. If x ∈ C^n and y ∈ C^m, then Ax = y if and only if Bx = Jy.
4. C is in reduced row-echelon form, has no zero rows and has r pivot columns.
5. L is in reduced row-echelon form, has no zero rows and has m − r pivot columns.
Proof. J is the result of applying a sequence of row operations to I_m, and therefore J and I_m are row-equivalent. LS(I_m, 0) has only the zero solution, since I_m is nonsingular (Theorem NMRRI). Thus, LS(J, 0) also has only the zero solution (Theorem REMES, Definition ESYS), and J is therefore nonsingular (Definition NSM).
To prove the second part of this conclusion, first convince yourself that row operations and the matrix-vector product are associative operations. By this we mean the following. Suppose that F is an m × n matrix that is row-equivalent to the matrix G. Apply to the column vector Fw the same sequence of row operations that converts F to G. Then the result is Gw. So we can do row operations on the matrix, then do a matrix-vector product, or do a matrix-vector product and then do row operations on a column vector, and the result will be the same either way. Since matrix multiplication is defined by a collection of matrix-vector products (Definition MM), the matrix product FH will become GH if we apply the same sequence of row operations to FH that converts F to G. (This argument can be made more rigorous using elementary matrices from the upcoming Subsection DM.EM and the associative property of matrix multiplication established in Theorem MMA.) Now apply these observations to A.
Write AI_n = I_m A and apply the row operations that convert M to N. A is converted to B, while I_m is converted to J, so we have BI_n = JA. Simplifying the left side gives the desired conclusion.
For the third conclusion, we now establish the two equivalences
\[
Ax = y \iff JAx = Jy \iff Bx = Jy
\]
The forward direction of the first equivalence is accomplished by multiplying both sides of the matrix equality by J, while the backward direction is accomplished by multiplying by the inverse of J (which we know exists by Theorem NI since J is nonsingular). The second equivalence is obtained simply by the substitution JA = B.
The first r rows of N are in reduced row-echelon form, since any contiguous collection of rows taken from a matrix in reduced row-echelon form will form a matrix that is again in reduced row-echelon form (Exercise RREF.T12). Since the matrix C is formed by removing the last m entries of each of these rows, the remainder is still in reduced row-echelon form. By its construction, C has no zero rows. C has r rows and each contains a leading 1, so there are r pivot columns in C.
The final m − r rows of N are in reduced row-echelon form, since any contiguous collection of rows taken from a matrix in reduced row-echelon form will form a matrix that is again in reduced row-echelon form. Since the matrix L is formed by removing the first n entries of each of these rows, and these entries are all zero (they form the zero rows of B), the remainder is still in reduced row-echelon form. L consists of the final m − r rows of the nonsingular matrix J, so none of these rows can be totally zero, or J would not row-reduce to the identity matrix. L has m − r rows and each contains a leading 1, so there are m − r pivot columns in L.
Notice that in the case where A is a nonsingular matrix we know that the reduced row-echelon form of A is the identity matrix (Theorem NMRRI), so B = I_n. Then the second conclusion above says JA = B = I_n, so J is the inverse of A. Thus this theorem generalizes Theorem CINM, though the result is a “left-inverse” of A rather than a “right-inverse.”
The third conclusion of Theorem PEEF is the most telling. It says that x is a solution to the linear system LS(A, y) if and only if x is a solution to the linear system LS(B, Jy). Or, said differently, if we row-reduce the augmented matrix [A | y] we will get the augmented matrix [B | Jy]. The matrix J tracks the cumulative effect of the row operations that convert A to reduced row-echelon form, here effectively applying them to the vector of constants in a system of equations having A as a coefficient matrix. When A row-reduces to a matrix with zero rows, then Jy should also have zero entries in the same rows if the system is to be consistent.
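This consistency test is easy to see on a small example. In the sketch below (the sample matrix and the matrix J are assumptions of this illustration, not data from the text), J records one sequence of row operations taking A to its reduced row-echelon form B, whose third row is zero, so the third entry of Jy diagnoses consistency:

```python
from fractions import Fraction

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

A = [[1, -1, 2], [2, 1, 1], [3, 0, 3]]   # third row = sum of first two
# A nonsingular J with JA = B, where B (the rref of A) has a zero third row.
J = [[Fraction(1, 3), Fraction(1, 3), 0],
     [Fraction(-2, 3), Fraction(1, 3), 0],
     [-1, -1, 1]]

# LS(A, y) is consistent exactly when Jy is zero wherever B has a zero row.
y_good = [0, 3, 3]   # A applied to (1, 1, 0), so y_good lies in C(A)
y_bad = [0, 3, 4]
assert matvec(J, y_good)[2] == 0   # zero in the zero-row position: consistent
assert matvec(J, y_bad)[2] != 0    # nonzero there: inconsistent
```

The third row of J here is (−1, −1, 1), so the test amounts to checking whether −y₁ − y₂ + y₃ = 0, i.e. whether y₃ = y₁ + y₂, exactly the relation the rows of A satisfy.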
Subsection FS
Four Subsets

With all the preliminaries in place we can state our main result for this section. In essence this result will allow us to say that we can find linearly independent sets to use in span constructions for all four subsets (null space, column space, row space, left null space) by analyzing only the extended echelon form of the matrix, and specifically, just the two submatrices C and L, which will be ripe for analysis since they are already in reduced row-echelon form (Theorem PEEF).
Theorem FS Four Subsets
Suppose A is an m × n matrix with extended echelon form N. Suppose the reduced row-echelon form of A has r nonzero rows. Then C is the submatrix of N formed from the first r rows and the first n columns and L is the submatrix of N formed from the last m columns and the last m − r rows. Then

1. The null space of A is the null space of C, N(A) = N(C).
2. The row space of A is the row space of C, R(A) = R(C).
3. The column space of A is the null space of L, C(A) = N(L).
4. The left null space of A is the row space of L, L(A) = R(L).
Proof. First, N(A) = N(B) since B is row-equivalent to A (Theorem REMES). The zero rows of B represent equations that are always true in the homogeneous system LS(B, 0), so the removal of these equations will not change the solution set. Thus, in turn, N(B) = N(C).

Second, R(A) = R(B) since B is row-equivalent to A (Theorem REMRS). The zero rows of B contribute nothing to the span that is the row space of B, so the removal of these rows will not change the row space. Thus, in turn, R(B) = R(C).
Third, we prove the set equality C(A) = N(L) with Definition SE. Begin by showing that C(A) ⊆ N(L). Choose y ∈ C(A) ⊆ C^m. Then there exists a vector x ∈ C^n such that Ax = y (Theorem CSCS). Then for 1 ≤ k ≤ m − r,

$$\begin{aligned}
[Ly]_k &= [Jy]_{r+k} && \text{$L$ a submatrix of $J$}\\
&= [Bx]_{r+k} && \text{Theorem PEEF}\\
&= [Ox]_{k} && \text{zero matrix a submatrix of $B$}\\
&= [0]_{k} && \text{Theorem MMZM}
\end{aligned}$$

So, for all 1 ≤ k ≤ m − r, [Ly]_k = [0]_k. So by Definition CVE we have Ly = 0 and thus y ∈ N(L).
Now, show that N(L) ⊆ C(A). Choose y ∈ N(L) ⊆ C^m. Form the vector Ky ∈ C^r. The linear system LS(C, Ky) is consistent since C is in reduced row-echelon form and has no zero rows (Theorem PEEF). Let x ∈ C^n denote a solution to LS(C, Ky).
Then for 1 ≤ j ≤ r,

$$\begin{aligned}
[Bx]_j &= [Cx]_j && \text{$C$ a submatrix of $B$}\\
&= [Ky]_j && \text{$x$ a solution to $LS(C, Ky)$}\\
&= [Jy]_j && \text{$K$ a submatrix of $J$}
\end{aligned}$$

And for r + 1 ≤ k ≤ m,

$$\begin{aligned}
[Bx]_k &= [Ox]_{k-r} && \text{zero matrix a submatrix of $B$}\\
&= [0]_{k-r} && \text{Theorem MMZM}\\
&= [Ly]_{k-r} && \text{$y \in N(L)$}\\
&= [Jy]_k && \text{$L$ a submatrix of $J$}
\end{aligned}$$

So for all 1 ≤ i ≤ m, [Bx]_i = [Jy]_i, and by Definition CVE we have Bx = Jy. From Theorem PEEF we know then that Ax = y, and therefore y ∈ C(A) (Theorem CSCS). By Definition SE we now have C(A) = N(L).
Fourth, we prove the set equality L(A) = R(L) with Definition SE. Begin by showing that R(L) ⊆ L(A). Choose y ∈ R(L) ⊆ C^m. Then there exists a vector w ∈ C^{m−r} such that y = L^t w (Definition RSM, Theorem CSCS). Then for 1 ≤ i ≤ n,
$$\begin{aligned}
[A^ty]_i &= \sum_{k=1}^{m} [A^t]_{ik}[y]_k && \text{Theorem EMP}\\
&= \sum_{k=1}^{m} [A^t]_{ik}[L^tw]_k && \text{Definition of $w$}\\
&= \sum_{k=1}^{m} [A^t]_{ik}\sum_{l=1}^{m-r} [L^t]_{kl}[w]_l && \text{Theorem EMP}\\
&= \sum_{k=1}^{m} \sum_{l=1}^{m-r} [A^t]_{ik}[L^t]_{kl}[w]_l && \text{Property DCN}\\
&= \sum_{l=1}^{m-r} \sum_{k=1}^{m} [A^t]_{ik}[L^t]_{kl}[w]_l && \text{Property CACN}\\
&= \sum_{l=1}^{m-r} \left(\sum_{k=1}^{m} [A^t]_{ik}[L^t]_{kl}\right)[w]_l && \text{Property DCN}\\
&= \sum_{l=1}^{m-r} \left(\sum_{k=1}^{m} [A^t]_{ik}[J^t]_{k,r+l}\right)[w]_l && \text{$L$ a submatrix of $J$}\\
&= \sum_{l=1}^{m-r} [A^tJ^t]_{i,r+l}[w]_l && \text{Theorem EMP}\\
&= \sum_{l=1}^{m-r} [(JA)^t]_{i,r+l}[w]_l && \text{Theorem MMT}\\
&= \sum_{l=1}^{m-r} [B^t]_{i,r+l}[w]_l && \text{Theorem PEEF}\\
&= \sum_{l=1}^{m-r} 0\,[w]_l && \text{zero rows in $B$}\\
&= 0 && \text{Property ZCN}\\
&= [0]_i && \text{Definition ZCV}
\end{aligned}$$

Since [A^t y]_i = [0]_i for 1 ≤ i ≤ n, Definition CVE implies that A^t y = 0. This means that y ∈ N(A^t) = L(A).
Now, show that L(A) ⊆ R(L). Choose y ∈ L(A) ⊆ C^m. The matrix J is nonsingular (Theorem PEEF), so J^t is also nonsingular (Theorem MIT) and therefore the linear system LS(J^t, y) has a unique solution. Denote this solution as x ∈ C^m. We will need to work with two “halves” of x, which we will denote as z and w with formal definitions given by

$$[z]_j = [x]_j,\ 1 \le j \le r \qquad [w]_k = [x]_{r+k},\ 1 \le k \le m-r$$
Now, for 1 ≤ j ≤ n,

$$\begin{aligned}
[C^tz]_j &= \sum_{k=1}^{r} [C^t]_{jk}[z]_k && \text{Theorem EMP}\\
&= \sum_{k=1}^{r} [C^t]_{jk}[z]_k + \sum_{l=1}^{m-r} [O]_{jl}[w]_l && \text{Definition ZM}\\
&= \sum_{k=1}^{r} [B^t]_{jk}[z]_k + \sum_{l=1}^{m-r} [B^t]_{j,r+l}[w]_l && \text{$C$, $O$ submatrices of $B$}\\
&= \sum_{k=1}^{r} [B^t]_{jk}[x]_k + \sum_{l=1}^{m-r} [B^t]_{j,r+l}[x]_{r+l} && \text{Definitions of $z$ and $w$}\\
&= \sum_{k=1}^{r} [B^t]_{jk}[x]_k + \sum_{k=r+1}^{m} [B^t]_{jk}[x]_k && \text{Re-index second sum}\\
&= \sum_{k=1}^{m} [B^t]_{jk}[x]_k && \text{Combine sums}\\
&= \sum_{k=1}^{m} [(JA)^t]_{jk}[x]_k && \text{Theorem PEEF}\\
&= \sum_{k=1}^{m} [A^tJ^t]_{jk}[x]_k && \text{Theorem MMT}\\
&= \sum_{k=1}^{m}\sum_{l=1}^{m} [A^t]_{jl}[J^t]_{lk}[x]_k && \text{Theorem EMP}\\
&= \sum_{l=1}^{m}\sum_{k=1}^{m} [A^t]_{jl}[J^t]_{lk}[x]_k && \text{Property CACN}\\
&= \sum_{l=1}^{m} [A^t]_{jl}\left(\sum_{k=1}^{m} [J^t]_{lk}[x]_k\right) && \text{Property DCN}\\
&= \sum_{l=1}^{m} [A^t]_{jl}[J^tx]_l && \text{Theorem EMP}\\
&= \sum_{l=1}^{m} [A^t]_{jl}[y]_l && \text{Definition of $x$}\\
&= [A^ty]_j && \text{Theorem EMP}\\
&= [0]_j && \text{$y \in L(A)$}
\end{aligned}$$

So, by Definition CVE, C^t z = 0 and the vector z gives us a linear combination of the columns of C^t that equals the zero vector. In other words, z gives a relation of linear dependence on the rows of C. However, the rows of C are a linearly independent set by Theorem BRS. According to Definition LICV we must conclude that the entries of z are all zero, i.e. z = 0.
Now, for 1 ≤ i ≤ m, we have

$$\begin{aligned}
[y]_i &= [J^tx]_i && \text{Definition of $x$}\\
&= \sum_{k=1}^{m} [J^t]_{ik}[x]_k && \text{Theorem EMP}\\
&= \sum_{k=1}^{r} [J^t]_{ik}[x]_k + \sum_{k=r+1}^{m} [J^t]_{ik}[x]_k && \text{Break apart sum}\\
&= \sum_{k=1}^{r} [J^t]_{ik}[z]_k + \sum_{k=r+1}^{m} [J^t]_{ik}[w]_{k-r} && \text{Definitions of $z$ and $w$}\\
&= \sum_{k=1}^{r} [J^t]_{ik}\,0 + \sum_{l=1}^{m-r} [J^t]_{i,r+l}[w]_l && \text{$z = 0$, re-index}\\
&= 0 + \sum_{l=1}^{m-r} [L^t]_{il}[w]_l && \text{$L$ a submatrix of $J$}\\
&= [L^tw]_i && \text{Theorem EMP}
\end{aligned}$$

So by Definition CVE, y = L^t w. The existence of w implies that y ∈ R(L), and therefore L(A) ⊆ R(L). So by Definition SE we have L(A) = R(L).
The first two conclusions of this theorem are nearly trivial. But they set up a pattern of results for C that is reflected in the latter two conclusions about L. In total, they tell us that we can compute all four subsets just by finding null spaces and row spaces. This theorem does not tell us exactly how to compute these subsets, but instead simply expresses them as null spaces and row spaces of matrices in reduced row-echelon form without any zero rows (C and L). A linearly independent set that spans the null space of a matrix in reduced row-echelon form can be found easily with Theorem BNS. It is an even easier matter to find a linearly independent set that spans the row space of a matrix in reduced row-echelon form with Theorem BRS, especially when there are no zero rows present. So an application of Theorem FS is typically followed by two applications each of Theorem BNS and Theorem BRS.
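The Theorem BNS step is mechanical enough to sketch in a few lines of code. The helper below is an assumption of this illustration (not code from the text); it is applied to the matrices C and L that appear in Example FS1 later in this section:

```python
def nullspace_basis(R):
    """Theorem BNS construction: a basis for the null space of a matrix R
    that is in reduced row-echelon form with no zero rows."""
    nrows, ncols = len(R), len(R[0])
    # In RREF the first nonzero entry of each row marks its pivot column.
    pivots = [next(c for c in range(ncols) if R[r][c] != 0) for r in range(nrows)]
    free = [c for c in range(ncols) if c not in pivots]
    basis = []
    for f in free:
        v = [0] * ncols
        v[f] = 1                    # a one in the free column...
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]         # ...and negated pivot-column entries
        basis.append(v)
    return basis

# C and L from Example FS1 (later in this section)
C = [[1, 0, 2, 1, 0, 3],
     [0, 1, 4, -6, 0, -1],
     [0, 0, 0, 0, 1, 2]]
L = [[1, 2, 2, 1]]

print(nullspace_basis(C))   # a basis for N(A) = N(C)
print(nullspace_basis(L))   # a basis for C(A) = N(L)
# Theorem BRS needs no code: the rows of C and L themselves are
# linearly independent spanning sets for R(A) and L(A).
```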
The situation when r = m deserves comment, since now the matrix L has no rows. What is C(A) when we try to apply Theorem FS and encounter N(L)? One interpretation of this situation is that L is the coefficient matrix of a homogeneous system that has no equations. How hard is it to find a solution vector to this system? Some thought will convince you that any proposed vector will qualify as a solution, since it makes all of the equations true. So every possible vector is in the null space of L and therefore C(A) = N(L) = C^m. OK, perhaps this sounds like some twisted argument from Alice in Wonderland. Let us try another argument that might solidly convince you of this logic.

If r = m, when we row-reduce the augmented matrix of LS(A, b) the result will have no zero rows, and each of the m rows will have its leading 1 among the first n columns, leaving no row with a pivot in the final column, so by Theorem RCLS the system will be consistent. By Theorem CSCS, b ∈ C(A). Since b was arbitrary, every possible vector is in the column space of A, so we again have C(A) = C^m. The situation when a matrix has r = m is known by the term full rank, and in the case of a square matrix coincides with nonsingularity (see Exercise FS.M50).
The properties of the matrix L described by this theorem can be explained informally as follows. A column vector y ∈ C^m is in the column space of A if the linear system LS(A, y) is consistent (Theorem CSCS). By Theorem RCLS, the reduced row-echelon form of the augmented matrix [A | y] of a consistent system will have zeros in the bottom m − r locations of the last column. By Theorem PEEF this final column is the vector Jy and so should then have zeros in the final m − r locations. But since L comprises the final m − r rows of J, this condition is expressed by saying y ∈ N(L).

Additionally, the rows of J are the scalars in linear combinations of the rows of A that create the rows of B. That is, the rows of J record the net effect of the sequence of row operations that takes A to its reduced row-echelon form, B. This can be seen in the equation JA = B (Theorem PEEF). As such, the rows of L are scalars for linear combinations of the rows of A that yield zero rows. But such linear combinations are precisely the elements of the left null space. So any element of the row space of L is also an element of the left null space of A.
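This informal description can be verified directly: each row of L lists scalars for a linear combination of the rows of A that yields a zero row, i.e. LA = O. A quick check, using the matrices A and L as they appear in Example FS1 later in this section (an illustration, not code from the text):

```python
# A and L as they appear in Example FS1
A = [[1, -1, -2, 7, 1, 6],
     [-6, 2, -4, -18, -3, -26],
     [4, -1, 4, 10, 2, 17],
     [3, -1, 2, 9, 1, 12]]
L = [[1, 2, 2, 1]]

# Each row of L combines the rows of A into the zero row: LA = O,
# so (the transpose of) each row of L lies in the left null space of A.
LA = [[sum(l * a for l, a in zip(lrow, col)) for col in zip(*A)] for lrow in L]
assert all(entry == 0 for row in LA for entry in row)
```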
We will now illustrate Theorem FS with a few examples.
Example FS1 Four subsets, no. 1
In Example SEEF we found the five relevant submatrices of the extended echelon form for the matrix

$$A = \begin{bmatrix} 1 & -1 & -2 & 7 & 1 & 6\\ -6 & 2 & -4 & -18 & -3 & -26\\ 4 & -1 & 4 & 10 & 2 & 17\\ 3 & -1 & 2 & 9 & 1 & 12 \end{bmatrix}$$

To apply Theorem FS we only need C and L,

$$C = \begin{bmatrix} 1 & 0 & 2 & 1 & 0 & 3\\ 0 & 1 & 4 & -6 & 0 & -1\\ 0 & 0 & 0 & 0 & 1 & 2 \end{bmatrix} \qquad L = \begin{bmatrix} 1 & 2 & 2 & 1 \end{bmatrix}$$

Then we use Theorem FS to obtain

$$N(A) = N(C) = \left\langle\left\{ \begin{bmatrix} -2\\-4\\1\\0\\0\\0 \end{bmatrix},\ \begin{bmatrix} -1\\6\\0\\1\\0\\0 \end{bmatrix},\ \begin{bmatrix} -3\\1\\0\\0\\-2\\1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BNS}$$

$$R(A) = R(C) = \left\langle\left\{ \begin{bmatrix} 1\\0\\2\\1\\0\\3 \end{bmatrix},\ \begin{bmatrix} 0\\1\\4\\-6\\0\\-1 \end{bmatrix},\ \begin{bmatrix} 0\\0\\0\\0\\1\\2 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BRS}$$

$$C(A) = N(L) = \left\langle\left\{ \begin{bmatrix} -2\\1\\0\\0 \end{bmatrix},\ \begin{bmatrix} -2\\0\\1\\0 \end{bmatrix},\ \begin{bmatrix} -1\\0\\0\\1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BNS}$$

$$L(A) = R(L) = \left\langle\left\{ \begin{bmatrix} 1\\2\\2\\1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BRS}$$

Boom! △
Example FS2 Four subsets, no. 2
Now let us return to the matrix A that we used to motivate this section in Example CSANS,

$$A = \begin{bmatrix} 10 & 0 & 3 & 8 & 7\\ -16 & -1 & -4 & -10 & -13\\ -6 & 1 & -3 & -6 & -6\\ 0 & 2 & -2 & -3 & -2\\ 3 & 0 & 1 & 2 & 3\\ -1 & -1 & 1 & 1 & 0 \end{bmatrix}$$

We form the matrix M by adjoining the 6 × 6 identity matrix I_6,

$$M = \begin{bmatrix} 10 & 0 & 3 & 8 & 7 & 1 & 0 & 0 & 0 & 0 & 0\\ -16 & -1 & -4 & -10 & -13 & 0 & 1 & 0 & 0 & 0 & 0\\ -6 & 1 & -3 & -6 & -6 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 2 & -2 & -3 & -2 & 0 & 0 & 0 & 1 & 0 & 0\\ 3 & 0 & 1 & 2 & 3 & 0 & 0 & 0 & 0 & 1 & 0\\ -1 & -1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

and row-reduce to obtain N

$$N = \begin{bmatrix} 1 & 0 & 0 & 0 & 2 & 0 & 0 & 1 & -1 & 2 & -1\\ 0 & 1 & 0 & 0 & -3 & 0 & 0 & -2 & 3 & -3 & 3\\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 3 & 3\\ 0 & 0 & 0 & 1 & -2 & 0 & 0 & -2 & 1 & -4 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 3 & -1 & 3 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & -2 & 1 & 1 & -1 \end{bmatrix}$$

To find the four subsets for A, we need only identify the 4 × 5 matrix C and the 2 × 6 matrix L,

$$C = \begin{bmatrix} 1 & 0 & 0 & 0 & 2\\ 0 & 1 & 0 & 0 & -3\\ 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 1 & -2 \end{bmatrix} \qquad L = \begin{bmatrix} 1 & 0 & 3 & -1 & 3 & 1\\ 0 & 1 & -2 & 1 & 1 & -1 \end{bmatrix}$$
Then we apply Theorem FS,

$$N(A) = N(C) = \left\langle\left\{ \begin{bmatrix} -2\\3\\-1\\2\\1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BNS}$$

$$R(A) = R(C) = \left\langle\left\{ \begin{bmatrix} 1\\0\\0\\0\\2 \end{bmatrix},\ \begin{bmatrix} 0\\1\\0\\0\\-3 \end{bmatrix},\ \begin{bmatrix} 0\\0\\1\\0\\1 \end{bmatrix},\ \begin{bmatrix} 0\\0\\0\\1\\-2 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BRS}$$

$$C(A) = N(L) = \left\langle\left\{ \begin{bmatrix} -3\\2\\1\\0\\0\\0 \end{bmatrix},\ \begin{bmatrix} 1\\-1\\0\\1\\0\\0 \end{bmatrix},\ \begin{bmatrix} -3\\-1\\0\\0\\1\\0 \end{bmatrix},\ \begin{bmatrix} -1\\1\\0\\0\\0\\1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BNS}$$

$$L(A) = R(L) = \left\langle\left\{ \begin{bmatrix} 1\\0\\3\\-1\\3\\1 \end{bmatrix},\ \begin{bmatrix} 0\\1\\-2\\1\\1\\-1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BRS}$$

△
The next example is just a bit different since the matrix has more rows than columns, and a trivial null space.

Example FSAG Four subsets, Archetype G
Archetype G and Archetype H are both systems of m = 5 equations in n = 2 variables. They have identical coefficient matrices, which we will denote here as the matrix G,

$$G = \begin{bmatrix} 2 & 3\\ -1 & 4\\ 3 & 10\\ 3 & -1\\ 6 & 9 \end{bmatrix}$$

Adjoin the 5 × 5 identity matrix, I_5, to form

$$M = \begin{bmatrix} 2 & 3 & 1 & 0 & 0 & 0 & 0\\ -1 & 4 & 0 & 1 & 0 & 0 & 0\\ 3 & 10 & 0 & 0 & 1 & 0 & 0\\ 3 & -1 & 0 & 0 & 0 & 1 & 0\\ 6 & 9 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
This row-reduces to

$$N = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \frac{3}{11} & \frac{1}{33}\\ 0 & 1 & 0 & 0 & 0 & -\frac{2}{11} & \frac{1}{11}\\ 0 & 0 & 1 & 0 & 0 & 0 & -\frac{1}{3}\\ 0 & 0 & 0 & 1 & 0 & 1 & -\frac{1}{3}\\ 0 & 0 & 0 & 0 & 1 & 1 & -1 \end{bmatrix}$$

The first n = 2 columns contain r = 2 leading 1's, so we obtain C as the 2 × 2 identity matrix and extract L from the final m − r = 3 rows in the final m = 5 columns.

$$C = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \qquad L = \begin{bmatrix} 1 & 0 & 0 & 0 & -\frac{1}{3}\\ 0 & 1 & 0 & 1 & -\frac{1}{3}\\ 0 & 0 & 1 & 1 & -1 \end{bmatrix}$$

Then we apply Theorem FS,

$$N(G) = N(C) = \langle\emptyset\rangle = \{0\} \qquad\text{Theorem BNS}$$

$$R(G) = R(C) = \left\langle\left\{ \begin{bmatrix} 1\\0 \end{bmatrix},\ \begin{bmatrix} 0\\1 \end{bmatrix} \right\}\right\rangle = \mathbb{C}^2 \qquad\text{Theorem BRS}$$

$$C(G) = N(L) = \left\langle\left\{ \begin{bmatrix} 0\\-1\\-1\\1\\0 \end{bmatrix},\ \begin{bmatrix} \frac{1}{3}\\ \frac{1}{3}\\ 1\\ 0\\ 1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BNS}$$

$$\phantom{C(G) = N(L)} = \left\langle\left\{ \begin{bmatrix} 0\\-1\\-1\\1\\0 \end{bmatrix},\ \begin{bmatrix} 1\\1\\3\\0\\3 \end{bmatrix} \right\}\right\rangle$$

$$L(G) = R(L) = \left\langle\left\{ \begin{bmatrix} 1\\0\\0\\0\\-\frac{1}{3} \end{bmatrix},\ \begin{bmatrix} 0\\1\\0\\1\\-\frac{1}{3} \end{bmatrix},\ \begin{bmatrix} 0\\0\\1\\1\\-1 \end{bmatrix} \right\}\right\rangle \qquad\text{Theorem BRS}$$

$$\phantom{L(G) = R(L)} = \left\langle\left\{ \begin{bmatrix} 3\\0\\0\\0\\-1 \end{bmatrix},\ \begin{bmatrix} 0\\3\\0\\3\\-1 \end{bmatrix},\ \begin{bmatrix} 0\\0\\1\\1\\-1 \end{bmatrix} \right\}\right\rangle$$

As mentioned earlier, Archetype G is consistent, while Archetype H is inconsistent. See if you can write the two different vectors of constants from these two archetypes as linear combinations of the two vectors that form the spanning set for C(G). How
about the two columns of G? Can you write each individually as a linear combination of the two vectors that form the spanning set for C(G)? They must be in the column space of G also. Are your answers unique? Do you notice anything about the scalars that appear in the linear combinations you are forming? △
Example COV and Example CSROI each describes the column space of the coefficient matrix from Archetype I as the span of a set of r = 3 linearly independent vectors. It is no accident that these two different sets both have the same size. If we (you?) were to calculate the column space of this matrix using the null space of the matrix L from Theorem FS then we would again find a set of 3 linearly independent vectors that span the range. More on this later.

So we have three different methods to obtain a description of the column space of a matrix as the span of a linearly independent set. Theorem BCS is sometimes useful since the vectors it specifies are equal to actual columns of the matrix. Theorem BRS and Theorem CSRST combine to create vectors with lots of zeros, and strategically placed 1's near the top of the vector. Theorem FS and the matrix L from the extended echelon form give us a third method, which tends to create vectors with lots of zeros, and strategically placed 1's near the bottom of the vector. If we do not care about linear independence we can also appeal to Definition CSM and simply express the column space as the span of all the columns of the matrix, giving us a fourth description.

With Theorem CSRST and Definition RSM, we can compute column spaces with theorems about row spaces, and we can compute row spaces with theorems about column spaces, but in each case we must transpose the matrix first. At this point you may be overwhelmed by all the possibilities for computing column and row spaces. Diagram CSRST is meant to help. For both the column space and row space, it suggests four techniques. One is to appeal to the definition, another yields a span of a linearly independent set, and a third uses Theorem FS. A fourth suggests transposing the matrix, and the dashed line implies that then the companion set of techniques can be applied. This can lead to a bit of silliness, since if you were to follow the dashed lines twice you would transpose the matrix twice, and by Theorem TT would accomplish nothing productive.
[Diagram CSRST: Column Space and Row Space Techniques. The diagram lists four techniques for each subset. For C(A): Definition CSM; Theorem BCS; Theorem FS, N(L); Theorem CSRST, R(A^t). For R(A): Definition RSM; Theorem BRS; Theorem FS, R(C); Definition RSM, C(A^t).]
Although we have many ways to describe a column space, notice that one tempting strategy will usually fail. It is not possible to simply row-reduce a matrix directly and then use the columns of the row-reduced matrix as a set whose span equals the column space. In other words, row operations do not preserve column spaces (however, row operations do preserve row spaces, Theorem REMRS). See Exercise CRS.M21.
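A two-line counterexample (an illustration of this warning, not an exercise from the text) makes the failure concrete:

```python
A = [[1, 1],
     [1, 1]]
R = [[1, 1],
     [0, 0]]    # the reduced row-echelon form of A

# Every column of R has second entry 0, so any combination of R's columns
# also has second entry 0. But the column (1, 1) of A certainly lies in
# C(A), so C(R) != C(A): row-reducing changed the column space.
assert all(col[1] == 0 for col in zip(*R))
assert A[0][1] == 1 and A[1][1] == 1
# By contrast, the row (1, 1) spans the row space of both A and R,
# consistent with Theorem REMRS.
```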
Reading Questions

1. Find a nontrivial element of the left null space of A.

$$A = \begin{bmatrix} 2 & 1 & -3 & 4\\ -1 & -1 & 2 & -1\\ 0 & -1 & 1 & 2 \end{bmatrix}$$

2. Find the matrices C and L in the extended echelon form of A.

$$A = \begin{bmatrix} -9 & 5 & -3\\ 2 & -1 & 1\\ -5 & 3 & -1 \end{bmatrix}$$

3. Why is Theorem FS a great conclusion to Chapter M?
Exercises

C20 Example FSAG concludes with several questions. Perform the analysis suggested by these questions.
C25† Given the matrix A below, use the extended echelon form of A to answer each part of this problem. In each part, find a linearly independent set of vectors, S, so that the span of S, ⟨S⟩, equals the specified set of vectors.

$$A = \begin{bmatrix} -5 & 3 & -1\\ -1 & 1 & 1\\ -8 & 5 & -1\\ 3 & -2 & 0 \end{bmatrix}$$

1. The row space of A, R(A).
2. The column space of A, C(A).
3. The null space of A, N(A).
4. The left null space of A, L(A).
C26† For the matrix D below use the extended echelon form to find:

1. A linearly independent set whose span is the column space of D.
2. A linearly independent set whose span is the left null space of D.

$$D = \begin{bmatrix} -7 & -11 & -19 & -15\\ 6 & 10 & 18 & 14\\ 3 & 5 & 9 & 7\\ -1 & -2 & -4 & -3 \end{bmatrix}$$
C41 The following archetypes are systems of equations. For each system, write the vector of constants as a linear combination of the vectors in the span construction for the column space provided by Theorem FS and Theorem BNS (these vectors are listed for each of these archetypes).

Archetype A, Archetype B, Archetype C, Archetype D, Archetype E, Archetype F, Archetype G, Archetype H, Archetype I, Archetype J
C43 The following archetypes are either matrices or systems of equations with coefficient matrices. For each matrix, compute the extended echelon form N and identify the matrices C and L. Using Theorem FS, Theorem BNS and Theorem BRS express the null space, the row space, the column space and left null space of each coefficient matrix as a span of a linearly independent set.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C60† For the matrix B below, find sets of vectors whose span equals the column space of B (C(B)) and which individually meet the following extra requirements.

1. The set illustrates the definition of the column space.
2. The set is linearly independent and the members of the set are columns of B.
3. The set is linearly independent with a “nice pattern of zeros and ones” at the top of each vector.
4. The set is linearly independent with a “nice pattern of zeros and ones” at the bottom of each vector.

$$B = \begin{bmatrix} 2 & 3 & 1 & 1\\ 1 & 1 & 0 & 1\\ -1 & 2 & 3 & -4 \end{bmatrix}$$
C61 †<br />
Let A be the matrix below, and f<strong>in</strong>d the <strong>in</strong>dicated sets with the requested properties.<br />
⎡<br />
⎤<br />
2 −1 5 −3<br />
A = ⎣−5 3 −12 7 ⎦<br />
1 1 4 −3<br />
1. A linearly independent set S so that C(A) = 〈S〉 and S is composed of columns of A.
2. A linearly independent set S so that C(A) = 〈S〉 and the vectors in S have a nice pattern of zeros and ones at the top of the vectors.
3. A linearly independent set S so that C(A) = 〈S〉 and the vectors in S have a nice pattern of zeros and ones at the bottom of the vectors.
4. A linearly independent set S so that R(A) = 〈S〉.
M50 Suppose that A is a nonsingular matrix. Extend the four conclusions of Theorem FS in this special case and discuss connections with previous results (such as Theorem NME4).
M51 Suppose that A is a singular matrix. Extend the four conclusions of Theorem FS in this special case and discuss connections with previous results (such as Theorem NME4).
Chapter VS<br />
Vector Spaces<br />
We now have a computational toolkit in place and so we can begin our study of linear algebra at a more theoretical level.
Linear algebra is the study of two fundamental objects, vector spaces and linear transformations (see Chapter LT). This chapter will focus on the former. The power of mathematics is often derived from generalizing many different situations into one abstract formulation, and that is exactly what we will be doing throughout this chapter.
Section VS<br />
Vector Spaces<br />
In this section we present a formal definition of a vector space, which will lead to an extra increment of abstraction. Once defined, we study its most basic properties.
Subsection VS<br />
Vector Spaces<br />
Here is one of the two most important definitions in the entire course.
Definition VS Vector Space
Suppose that V is a set upon which we have defined two operations: (1) vector addition, which combines two elements of V and is denoted by “+”, and (2) scalar multiplication, which combines a complex number with an element of V and is denoted by juxtaposition. Then V, along with the two operations, is a vector space over C if the following ten properties hold.
• AC Additive Closure
If u, v ∈ V, then u + v ∈ V.
• SC Scalar Closure
If α ∈ C and u ∈ V, then αu ∈ V.
• C Commutativity
If u, v ∈ V, then u + v = v + u.
• AA Additive Associativity
If u, v, w ∈ V, then u + (v + w) = (u + v) + w.
• Z Zero Vector
There is a vector, 0, called the zero vector, such that u + 0 = u for all u ∈ V.
• AI Additive Inverses
If u ∈ V, then there exists a vector −u ∈ V so that u + (−u) = 0.
• SMA Scalar Multiplication Associativity
If α, β ∈ C and u ∈ V, then α(βu) = (αβ)u.
• DVA Distributivity across Vector Addition
If α ∈ C and u, v ∈ V, then α(u + v) = αu + αv.
• DSA Distributivity across Scalar Addition
If α, β ∈ C and u ∈ V, then (α + β)u = αu + βu.
• O One
If u ∈ V, then 1u = u.
The objects in V are called vectors, no matter what else they might really be, simply by virtue of being elements of a vector space.
□
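Because the ten properties are stated entirely in terms of the two operations, a candidate vector space can be spot-checked mechanically before any proof is attempted. Here is a minimal Python sketch (the helper names are our own, not notation from the text) that tests a few of the properties for any supplied addition and scalar multiplication on sample data; passing such checks is evidence, not proof.

```python
def check_some_properties(add, smul, samples, scalars):
    # Spot-check Properties C, AA and DSA of Definition VS on sample data.
    for u in samples:
        for v in samples:
            assert add(u, v) == add(v, u)                      # Property C
            for w in samples:
                assert add(u, add(v, w)) == add(add(u, v), w)  # Property AA
        for a in scalars:
            for b in scalars:
                # Property DSA: (a + b)u = au + bu
                assert smul(a + b, u) == add(smul(a, u), smul(b, u))
    return True

# Column vectors of size 2, with the usual entrywise operations.
vec_add = lambda u, v: (u[0] + v[0], u[1] + v[1])
vec_smul = lambda a, u: (a * u[0], a * u[1])
assert check_some_properties(vec_add, vec_smul, [(1, 2), (0, 0), (-3, 5)], [0, 1, -2])
```

A failed assertion pinpoints exactly which property a proposed pair of operations violates.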
Now, there are several important observations to make. Many of these will be easier to understand on a second or third reading, and especially after carefully studying the examples in Subsection VS.EVS.
An axiom is often a “self-evident” truth. Something so fundamental that we all agree it is true and accept it without proof. Typically, it would be the logical underpinning that we would begin to build theorems upon. Some might refer to the ten properties of Definition VS as axioms, implying that a vector space is a very natural object and the ten properties are the essence of a vector space. We will instead emphasize that we will begin with a definition of a vector space. After studying the remainder of this chapter, you might return here and remind yourself how all our forthcoming theorems and definitions rest on this foundation.
As we will see shortly, the objects in V can be anything, even though we will call them vectors. We have been working with vectors frequently, but we should stress here that these have so far just been column vectors — scalars arranged in a columnar list of fixed length. In a similar vein, you have used the symbol “+” for many years to represent the addition of numbers (scalars). We have extended its use to the addition of column vectors and to the addition of matrices, and now we are going to recycle it even further and let it denote vector addition in any possible vector space. So when describing a new vector space, we will have to define exactly what “+” is. Similar comments apply to scalar multiplication. Conversely, we can define our operations any way we like, so long as the ten properties are fulfilled (see Example CVS).
In Definition VS, the scalars do not have to be complex numbers. They can come from what are called, in more advanced mathematics, “fields”. Examples of fields are the set of complex numbers, the set of real numbers, the set of rational numbers, and even the finite set of “binary numbers”, {0, 1}. There are many, many others. In this case we would call V a vector space over (the field) F.
A vector space is composed of three objects, a set and two operations. Some would explicitly state in the definition that V must be a nonempty set, but we can infer this from Property Z, since the set cannot be empty and contain a vector that behaves as the zero vector. Also, we usually use the same symbol for both the set and the vector space itself. Do not let this convenience fool you into thinking the operations are secondary!
This discussion has either convinced you that we are really embarking on a new level of abstraction, or it has seemed cryptic, mysterious or nonsensical. You might want to return to this section in a few days and give it another read then. In any case, let us look at some concrete examples now.
Subsection EVS<br />
Examples of Vector Spaces<br />
Our aim in this subsection is to give you a storehouse of examples to work with, to become comfortable with the ten vector space properties and to convince you that the multitude of examples justifies (at least initially) making such a broad definition as Definition VS. Some of our claims will be justified by reference to previous theorems, we will prove some facts from scratch, and we will do one nontrivial example completely. In other places, our usual thoroughness will be neglected, so grab paper and pencil and play along.
Example VSCV The vector space C^m
Set: C^m, all column vectors of size m, Definition VSCV.
Equality: Entry-wise, Definition CVE.
Vector Addition: The “usual” addition, given in Definition CVA.
Scalar Multiplication: The “usual” scalar multiplication, given in Definition CVSM.
Does this set with these operations fulfill the ten properties? Yes. And by design all we need to do is quote Theorem VSPCV. That was easy.
△
Example VSM The vector space of matrices, M_mn
Set: M_mn, the set of all matrices of size m × n and entries from C, Definition VSM.
Equality: Entry-wise, Definition ME.
Vector Addition: The “usual” addition, given in Definition MA.
Scalar Multiplication: The “usual” scalar multiplication, given in Definition MSM.
Does this set with these operations fulfill the ten properties? Yes. And all we need to do is quote Theorem VSPM. Another easy one (by design).
△
So, the set of all matrices of a fixed size forms a vector space. That entitles us to call a matrix a vector, since a matrix is an element of a vector space. For example, if A, B ∈ M_34 then we call A and B “vectors,” and we even use our previous notation for column vectors to refer to A and B. So we could legitimately write expressions like
u + v = A + B = B + A = v + u
This could lead to some confusion, but it is not too great a danger. But it is worth comment.
The previous two examples may be less than satisfying. We made all the relevant definitions long ago. And the required verifications were all handled by quoting old theorems. However, it is important to consider these two examples first. We have been studying vectors and matrices carefully (Chapter V, Chapter M), and both objects, along with their operations, have certain properties in common, as you may have noticed in comparing Theorem VSPCV with Theorem VSPM. Indeed, it is these two theorems that motivate us to formulate the abstract definition of a vector space, Definition VS. Now, if we prove some general theorems about vector spaces (as we will shortly in Subsection VS.VSP), we can then instantly apply the conclusions to both C^m and M_mn. Notice too, how we have taken six definitions and two theorems and reduced them down to two examples. With greater generalization and abstraction our old ideas get downgraded in stature.
Let us look at some more examples, now considering some new vector spaces.
Example VSP The vector space of polynomials, P_n
Set: P_n, the set of all polynomials of degree n or less in the variable x with coefficients from C.
Equality:
a_0 + a_1x + a_2x^2 + ··· + a_nx^n = b_0 + b_1x + b_2x^2 + ··· + b_nx^n
if and only if a_i = b_i for 0 ≤ i ≤ n
Vector Addition:
(a_0 + a_1x + a_2x^2 + ··· + a_nx^n) + (b_0 + b_1x + b_2x^2 + ··· + b_nx^n) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + ··· + (a_n + b_n)x^n
Scalar Multiplication:
α(a_0 + a_1x + a_2x^2 + ··· + a_nx^n) = (αa_0) + (αa_1)x + (αa_2)x^2 + ··· + (αa_n)x^n
This set, with these operations, will fulfill the ten properties, though we will not work all the details here. However, we will make a few comments and prove one of the properties. First, the zero vector (Property Z) is what you might expect, and you can check that it has the required property.
0 = 0 + 0x + 0x^2 + ··· + 0x^n
The additive inverse (Property AI) is also no surprise, though consider how we have chosen to write it.
−(a_0 + a_1x + a_2x^2 + ··· + a_nx^n) = (−a_0) + (−a_1)x + (−a_2)x^2 + ··· + (−a_n)x^n
Now let us prove the associativity of vector addition (Property AA). This is a bit tedious, though necessary. Throughout, the plus sign (“+”) does triple-duty. You might ask yourself what each plus sign represents as you work through this proof.
u + (v + w)
= (a_0 + a_1x + ··· + a_nx^n) + ((b_0 + b_1x + ··· + b_nx^n) + (c_0 + c_1x + ··· + c_nx^n))
= (a_0 + a_1x + ··· + a_nx^n) + ((b_0 + c_0) + (b_1 + c_1)x + ··· + (b_n + c_n)x^n)
= (a_0 + (b_0 + c_0)) + (a_1 + (b_1 + c_1))x + ··· + (a_n + (b_n + c_n))x^n
= ((a_0 + b_0) + c_0) + ((a_1 + b_1) + c_1)x + ··· + ((a_n + b_n) + c_n)x^n
= ((a_0 + b_0) + (a_1 + b_1)x + ··· + (a_n + b_n)x^n) + (c_0 + c_1x + ··· + c_nx^n)
= ((a_0 + a_1x + ··· + a_nx^n) + (b_0 + b_1x + ··· + b_nx^n)) + (c_0 + c_1x + ··· + c_nx^n)
= (u + v) + w
Notice how it is the application of the associativity of the (old) addition of complex numbers in the middle of this chain of equalities that makes the whole proof happen. The remainder is successive applications of our (new) definition of vector (polynomial) addition. Proving the remainder of the ten properties is similar in style and tedium. You might try proving the commutativity of vector addition (Property C), or one of the distributivity properties (Property DVA, Property DSA). △
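Since an element of P_n is determined by its list of coefficients, these operations can be modeled concretely. A short Python sketch (the tuple representation and function names are our own, not notation from the text) encoding vectors of P_2 as coefficient tuples, and spot-checking Property AA and Property DSA on one instance:

```python
# Represent an element a_0 + a_1 x + a_2 x^2 of P_2 by its coefficient tuple (a_0, a_1, a_2).
def add(p, q):
    # Vector addition in P_n: add coefficients term by term.
    return tuple(a + b for a, b in zip(p, q))

def smul(alpha, p):
    # Scalar multiplication: scale every coefficient.
    return tuple(alpha * a for a in p)

u, v, w = (1, 2, 3), (4, 5, 6), (7, 8, 9)
# Property AA: u + (v + w) = (u + v) + w
assert add(u, add(v, w)) == add(add(u, v), w)
# Property DSA, one instance: (2 + 3)u = 2u + 3u
assert smul(2 + 3, u) == add(smul(2, u), smul(3, u))
```

Each assertion checks a single numerical instance, mirroring one line of the proof above; the proof itself is what guarantees the property for every choice of coefficients.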
Example VSIS The vector space of infinite sequences
Set: C^∞ = {(c_0, c_1, c_2, c_3, ...) | c_i ∈ C, i ∈ N}.
Equality:
(c_0, c_1, c_2, ...) = (d_0, d_1, d_2, ...) if and only if c_i = d_i for all i ≥ 0
Vector Addition:
(c_0, c_1, c_2, ...) + (d_0, d_1, d_2, ...) = (c_0 + d_0, c_1 + d_1, c_2 + d_2, ...)
Scalar Multiplication:
α(c_0, c_1, c_2, c_3, ...) = (αc_0, αc_1, αc_2, αc_3, ...)
This should remind you of the vector space C^m, though now our lists of scalars are written horizontally with commas as delimiters and they are allowed to be infinite in length. What does the zero vector look like (Property Z)? Additive inverses (Property AI)? Can you prove the associativity of vector addition (Property AA)? △
Example VSF The vector space of functions
Let X be any set.
Set: F = {f | f : X → C}.
Equality: f = g if and only if f(x) = g(x) for all x ∈ X.
Vector Addition: f + g is the function with outputs defined by (f + g)(x) = f(x) + g(x).
Scalar Multiplication: αf is the function with outputs defined by (αf)(x) = αf(x).
So this is the set of all functions of one variable that take elements of the set X to a complex number. You might have studied functions of one variable that take a real number to a real number, and that might be a more natural set to use as X. But since we are allowing our scalars to be complex numbers, we need to specify that the range of our functions is the complex numbers. Study carefully how the definitions of the operations are made, and think about the different uses of “+” and juxtaposition. As an example of what is required when verifying that this is a vector space, consider that the zero vector (Property Z) is the function z whose definition is z(x) = 0 for every input x ∈ X.
Vector spaces of functions are very important in mathematics and physics, where the field of scalars may be the real numbers, so the ranges of the functions can in turn also be the set of real numbers.
△
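The function-space operations can be mirrored directly with closures. In this small Python sketch (the names are ours), a “vector” is literally a Python function on X, and f + g builds a brand-new function whose outputs are the sums of the outputs of f and g:

```python
def vadd(f, g):
    # (f + g)(x) = f(x) + g(x): the sum of two vectors is itself a function on X.
    return lambda x: f(x) + g(x)

def vsmul(alpha, f):
    # (alpha f)(x) = alpha * f(x)
    return lambda x: alpha * f(x)

zero = lambda x: 0              # the zero vector z, with z(x) = 0 for every input x

f = lambda x: x ** 2
g = lambda x: 3 * x + 1
h = vadd(f, vsmul(2, g))        # the vector f + 2g
assert h(5) == 25 + 2 * 16      # (f + 2g)(5) = f(5) + 2 g(5) = 57
assert vadd(f, zero)(5) == f(5) # Property Z, checked at the single input x = 5
```

Notice that `vadd` never evaluates f or g itself; it only promises how the new function will behave, exactly as the definition (f + g)(x) = f(x) + g(x) does.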
Here is a unique example.
Example VSS The singleton vector space
Set: Z = {z}.
Equality: Huh?
Vector Addition: z + z = z.
Scalar Multiplication: αz = z.
This should look pretty wild. First, just what is z? Column vector, matrix, polynomial, sequence, function? Mineral, plant, or animal? We are not saying! z just is. And we have definitions of vector addition and scalar multiplication that are sufficient for an occurrence of either that may come along.
Our only concern is if this set, along with the definitions of two operations, fulfills the ten properties of Definition VS. Let us check associativity of vector addition
(Property AA). For all u, v, w ∈ Z,
u + (v + w) = z + (z + z)
            = z + z
            = (z + z) + z
            = (u + v) + w
What is the zero vector in this vector space (Property Z)? With only one element in the set, we do not have much choice. Is z = 0? It appears that z behaves like the zero vector should, so it gets the title. Maybe now the definition of this vector space does not seem so bizarre. It is a set whose only element is the element that behaves like the zero vector, so that lone element is the zero vector.
△
Perhaps some of the above definitions and verifications seem obvious or like splitting hairs, but the next example should convince you that they are necessary. We will study this one carefully. Ready? Check your preconceptions at the door.
Example CVS The crazy vector space
Set: C = {(x_1, x_2) | x_1, x_2 ∈ C}.
Vector Addition: (x_1, x_2) + (y_1, y_2) = (x_1 + y_1 + 1, x_2 + y_2 + 1).
Scalar Multiplication: α(x_1, x_2) = (αx_1 + α − 1, αx_2 + α − 1).
Now, the first thing I hear you say is “You can’t do that!” And my response is, “Oh yes, I can!” I am free to define my set and my operations any way I please. They may not look natural, or even useful, but we will now verify that they provide us with another example of a vector space. And that is enough. If you are adventurous, you might try first checking some of the properties yourself. What is the zero vector? Additive inverses? Can you prove associativity? Ready, here we go.
Property AC, Property SC: The result of each operation is a pair of complex numbers, so these two closure properties are fulfilled.
Property C:
u + v = (x_1, x_2) + (y_1, y_2) = (x_1 + y_1 + 1, x_2 + y_2 + 1)
      = (y_1 + x_1 + 1, y_2 + x_2 + 1) = (y_1, y_2) + (x_1, x_2)
      = v + u
Property AA:
u + (v + w) = (x_1, x_2) + ((y_1, y_2) + (z_1, z_2))
            = (x_1, x_2) + (y_1 + z_1 + 1, y_2 + z_2 + 1)
            = (x_1 + (y_1 + z_1 + 1) + 1, x_2 + (y_2 + z_2 + 1) + 1)
            = (x_1 + y_1 + z_1 + 2, x_2 + y_2 + z_2 + 2)
            = ((x_1 + y_1 + 1) + z_1 + 1, (x_2 + y_2 + 1) + z_2 + 1)
            = (x_1 + y_1 + 1, x_2 + y_2 + 1) + (z_1, z_2)
            = ((x_1, x_2) + (y_1, y_2)) + (z_1, z_2)
            = (u + v) + w
Property Z: The zero vector is . . . 0 = (−1, −1). Now I hear you say, “No, no, that can’t be, it must be (0, 0)!”. Indulge me for a moment and let us check my proposal.
u + 0 = (x_1, x_2) + (−1, −1) = (x_1 + (−1) + 1, x_2 + (−1) + 1) = (x_1, x_2) = u
Feeling better? Or worse?
Property AI: For each vector, u, we must locate an additive inverse, −u. Here it is, −(x_1, x_2) = (−x_1 − 2, −x_2 − 2). As odd as it may look, I hope you are withholding judgment. Check:
u + (−u) = (x_1, x_2) + (−x_1 − 2, −x_2 − 2)
         = (x_1 + (−x_1 − 2) + 1, x_2 + (−x_2 − 2) + 1) = (−1, −1) = 0
Property SMA:
α(βu) = α(β(x_1, x_2))
      = α(βx_1 + β − 1, βx_2 + β − 1)
      = (α(βx_1 + β − 1) + α − 1, α(βx_2 + β − 1) + α − 1)
      = ((αβx_1 + αβ − α) + α − 1, (αβx_2 + αβ − α) + α − 1)
      = (αβx_1 + αβ − 1, αβx_2 + αβ − 1)
      = (αβ)(x_1, x_2)
      = (αβ)u
Property DVA: If you have hung on so far, here is where it gets even wilder. In the next two properties we mix and mash the two operations.
α(u + v)
= α((x_1, x_2) + (y_1, y_2))
= α(x_1 + y_1 + 1, x_2 + y_2 + 1)
= (α(x_1 + y_1 + 1) + α − 1, α(x_2 + y_2 + 1) + α − 1)
= (αx_1 + αy_1 + α + α − 1, αx_2 + αy_2 + α + α − 1)
= (αx_1 + α − 1 + αy_1 + α − 1 + 1, αx_2 + α − 1 + αy_2 + α − 1 + 1)
= ((αx_1 + α − 1) + (αy_1 + α − 1) + 1, (αx_2 + α − 1) + (αy_2 + α − 1) + 1)
= (αx_1 + α − 1, αx_2 + α − 1) + (αy_1 + α − 1, αy_2 + α − 1)
= α(x_1, x_2) + α(y_1, y_2)
= αu + αv
Property DSA:
(α + β)u
= (α + β)(x_1, x_2)
= ((α + β)x_1 + (α + β) − 1, (α + β)x_2 + (α + β) − 1)
= (αx_1 + βx_1 + α + β − 1, αx_2 + βx_2 + α + β − 1)
= (αx_1 + α − 1 + βx_1 + β − 1 + 1, αx_2 + α − 1 + βx_2 + β − 1 + 1)
= ((αx_1 + α − 1) + (βx_1 + β − 1) + 1, (αx_2 + α − 1) + (βx_2 + β − 1) + 1)
= (αx_1 + α − 1, αx_2 + α − 1) + (βx_1 + β − 1, βx_2 + β − 1)
= α(x_1, x_2) + β(x_1, x_2)
= αu + βu
Property O: After all that, this one is easy, but no less pleasing.
1u = 1(x_1, x_2) = (x_1 + 1 − 1, x_2 + 1 − 1) = (x_1, x_2) = u
That is it, C is a vector space, as crazy as that may seem.
Notice that in the case of the zero vector and additive inverses, we only had to propose possibilities and then verify that they were the correct choices. You might try to discover how you would arrive at these choices, though you should understand why the process of discovering them is not a necessary component of the proof itself.
△
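The algebra above is also easy to replay numerically. Here is a brief Python sketch (the function names are ours) of the crazy operations, confirming the zero vector (−1, −1) and the additive inverse formula on a sample vector. Checks like these cannot replace the proofs, but they catch a wrong conjecture quickly:

```python
def cadd(u, v):
    # Crazy vector addition: (x1, x2) + (y1, y2) = (x1 + y1 + 1, x2 + y2 + 1)
    return (u[0] + v[0] + 1, u[1] + v[1] + 1)

def csmul(alpha, u):
    # Crazy scalar multiplication: alpha(x1, x2) = (alpha x1 + alpha - 1, alpha x2 + alpha - 1)
    return (alpha * u[0] + alpha - 1, alpha * u[1] + alpha - 1)

zero = (-1, -1)
u = (3, -7)
assert cadd(u, zero) == u            # Property Z: u + 0 = u
neg_u = (-u[0] - 2, -u[1] - 2)       # proposed additive inverse -(x1, x2)
assert cadd(u, neg_u) == zero        # Property AI: u + (-u) = 0
assert csmul(1, u) == u              # Property O: 1u = u
```

Had we guessed (0, 0) for the zero vector, the first assertion would fail immediately, which is exactly the point of the example.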
Subsection VSP<br />
Vector Space Properties<br />
Subsection VS.EVS has provided us with an abundance of examples of vector spaces, most of them containing useful and interesting mathematical objects along with natural operations. In this subsection we will prove some general properties of vector spaces. Some of these results will again seem obvious, but it is important to understand why it is necessary to state and prove them. A typical hypothesis will be “Let V be a vector space.” From this we may assume the ten properties of Definition VS, and nothing more. It is like starting over, as we learn about what can happen in this new algebra we are learning. But the power of this careful approach is that we can apply these theorems to any vector space we encounter — those in the previous examples, or new ones we have not yet contemplated. Or perhaps new ones that nobody has ever contemplated. We will illustrate some of these results with examples from the crazy vector space (Example CVS), but mostly we are stating theorems and doing proofs. These proofs do not get too involved, but are not trivial either, so these are good theorems to try proving yourself before you study the proof given here. (See Proof Technique P.)
<strong>First</strong> we show that there is just one zero vector. Notice that the properties only<br />
require there to be at least one, and say noth<strong>in</strong>g about there possibly be<strong>in</strong>g more.<br />
That is because we can use the ten properties of a vector space (Def<strong>in</strong>ition VS) to<br />
learn that there can never be more than one. To require that this extra condition
§VS Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 266<br />
be stated as an eleventh property would make the def<strong>in</strong>ition of a vector space more<br />
complicated than it needs to be.<br />
Theorem ZVU Zero Vector is Unique
Suppose that V is a vector space. The zero vector, 0, is unique.
Proof. To prove uniqueness, a standard technique is to suppose the existence of two objects (Proof Technique U). So let 0_1 and 0_2 be two zero vectors in V. Then
0_1 = 0_1 + 0_2    Property Z for 0_2
    = 0_2 + 0_1    Property C
    = 0_2          Property Z for 0_1
This proves the uniqueness since the two zero vectors are really the same. ■
Theorem AIU Additive Inverses are Unique
Suppose that V is a vector space. For each u ∈ V, the additive inverse, −u, is unique.
Proof. To prove uniqueness, a standard technique is to suppose the existence of two objects (Proof Technique U). So let −u_1 and −u_2 be two additive inverses for u. Then
−u_1 = −u_1 + 0               Property Z
     = −u_1 + (u + (−u_2))    Property AI
     = (−u_1 + u) + (−u_2)    Property AA
     = 0 + (−u_2)             Property AI
     = −u_2                   Property Z
So the two additive inverses are really the same. ■
As obvious as the next three theorems appear, nowhere have we guaranteed that the zero scalar, scalar multiplication and the zero vector all interact this way. Until we have proved it, anyway.
Theorem ZSSM Zero Scalar in Scalar Multiplication
Suppose that V is a vector space and u ∈ V. Then 0u = 0.
Proof. Notice that 0 is a scalar, u is a vector, so Property SC says 0u is again a vector. As such, 0u has an additive inverse, −(0u), by Property AI.
0u = 0 + 0u                  Property Z
   = (−(0u) + 0u) + 0u       Property AI
   = −(0u) + (0u + 0u)       Property AA
   = −(0u) + (0 + 0)u        Property DSA
   = −(0u) + 0u              Property ZCN
   = 0                       Property AI
■
Here is another theorem that looks like it should be obvious, but is still in need of a proof.
Theorem ZVSM Zero Vector in Scalar Multiplication
Suppose that V is a vector space and α ∈ C. Then α0 = 0.
Proof. Notice that α is a scalar, 0 is a vector, so Property SC means α0 is again a vector. As such, α0 has an additive inverse, −(α0), by Property AI.
α0 = 0 + α0                  Property Z
   = (−(α0) + α0) + α0       Property AI
   = −(α0) + (α0 + α0)       Property AA
   = −(α0) + α(0 + 0)        Property DVA
   = −(α0) + α0              Property Z
   = 0                       Property AI
■
Here is another one that sure looks obvious. But understand that we have chosen to use certain notation because it makes the theorem’s conclusion look so nice. The theorem is not true because the notation looks so good; it still needs a proof. If we had really wanted to make this point, we might have used notation like u♯ for the additive inverse of u. Then we would have written the defining property, Property AI, as u + u♯ = 0. This theorem would become u♯ = (−1)u. Not really quite as pretty, is it?
Theorem AISM Additive Inverses from Scalar Multiplication
Suppose that V is a vector space and u ∈ V. Then −u = (−1)u.
Proof.
−u = −u + 0                  Property Z
   = −u + 0u                 Theorem ZSSM
   = −u + (1 + (−1))u        Property AICN
   = −u + (1u + (−1)u)       Property DSA
   = −u + (u + (−1)u)        Property O
   = (−u + u) + (−1)u        Property AA
   = 0 + (−1)u               Property AI
   = (−1)u                   Property Z
■
Because of this theorem, we can now write linear combinations like 6u_1 + (−4)u_2 as 6u_1 − 4u_2, even though we have not formally defined an operation called vector subtraction.
Our next theorem is a bit different from several of the others in the list. Rather than making a declaration (“the zero vector is unique”) it is an implication (“if . . . , then . . . ”) and so can be used in proofs to convert a vector equality into two possibilities, one a scalar equality and the other a vector equality. It should remind you of the situation for complex numbers. If α, β ∈ C and αβ = 0, then α = 0 or β = 0. This critical property is the driving force behind using a factorization to solve a polynomial equation.
Theorem SMEZV Scalar Multiplication Equals the Zero Vector
Suppose that V is a vector space and α ∈ C. If αu = 0, then either α = 0 or u = 0.
Proof. We prove this theorem by breaking up the analysis into two cases. The first seems too trivial, and it is, but the logic of the argument is still legitimate.
Case 1. Suppose α = 0. In this case our conclusion is true (the first part of the either/or is true) and we are done. That was easy.
Case 2. Suppose α ≠ 0.
u = 1u               Property O
  = ((1/α)α)u        α ≠ 0, Property MICN
  = (1/α)(αu)        Property SMA
  = (1/α)0           Hypothesis
  = 0                Theorem ZVSM
So in this case, the conclusion is true (the second part of the either/or is true) and we are done since the conclusion was true in each of the two cases. ■
Example PCVS Properties for the Crazy Vector Space<br />
Several of the above theorems have <strong>in</strong>terest<strong>in</strong>g demonstrations when applied to the<br />
crazy vector space, C (Example CVS). We are not prov<strong>in</strong>g anyth<strong>in</strong>g new here, or<br />
learn<strong>in</strong>g anyth<strong>in</strong>g we did not know already about C. It is just pla<strong>in</strong> fun to see how<br />
these general theorems apply <strong>in</strong> a specific <strong>in</strong>stance. For most of our examples, the<br />
applications are obvious or trivial, but not with C.<br />
Suppose u ∈ C. Then, as given by Theorem ZSSM,<br />
0u = 0(x1, x2) = (0x1 + 0 − 1, 0x2 + 0 − 1) = (−1, −1) = 0<br />
And as given by Theorem ZVSM,<br />
α0 = α(−1, −1) = (α(−1) + α − 1, α(−1) + α − 1) = (−α + α − 1, −α + α − 1) = (−1, −1) = 0<br />
§VS Beezer: A First Course in Linear Algebra 269<br />
Finally, as given by Theorem AISM,<br />
(−1)u = (−1)(x1, x2) = ((−1)x1 + (−1) − 1, (−1)x2 + (−1) − 1) = (−x1 − 2, −x2 − 2) = −u<br />
△<br />
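These crazy-vector-space computations are easy to mis-simplify by hand, so here is a quick numerical sanity check (not part of the text) of the three theorems above, using the two nonstandard operations of Example CVS:

```python
# The "crazy" operations of Example CVS on pairs of numbers.
def add(u, v):
    # crazy vector addition: (x1, x2) + (y1, y2) = (x1 + y1 + 1, x2 + y2 + 1)
    return (u[0] + v[0] + 1, u[1] + v[1] + 1)

def smult(a, u):
    # crazy scalar multiplication: a(x1, x2) = (a x1 + a - 1, a x2 + a - 1)
    return (a * u[0] + a - 1, a * u[1] + a - 1)

zero = (-1, -1)              # the zero vector of C
u = (3, 7)                   # an arbitrary vector

print(smult(0, u))           # Theorem ZSSM: 0u = 0, i.e. (-1, -1)
print(smult(5, zero))        # Theorem ZVSM: alpha 0 = 0, i.e. (-1, -1)
print(add(u, smult(-1, u)))  # Theorem AISM: (-1)u is the additive inverse
```

Each print lands back on (−1, −1), the zero vector of C, exactly as the general theorems predict.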
Subsection RD<br />
Recycling Definitions<br />
When we say that V is a vector space, we then know we have a set of objects (the<br />
“vectors”), but we also know we have been provided with two operations (“vector<br />
addition” and “scalar multiplication”) and these operations behave with these objects<br />
according to the ten properties of Definition VS. One combines two vectors and<br />
produces a vector, the other takes a scalar and a vector, producing a vector as the<br />
result. So if u1, u2, u3 ∈ V then an expression like<br />
5u1 + 7u2 − 13u3<br />
would be unambiguous in any of the vector spaces we have discussed in this section.<br />
And the resulting object would be another vector in the vector space. If you were<br />
tempted to call the above expression a linear combination, you would be right. Four<br />
of the definitions that were central to our discussions in Chapter V were stated in<br />
the context of vectors being column vectors, but were purposely kept broad enough<br />
that they could be applied in the context of any vector space. They only rely on the<br />
presence of scalars, vectors, vector addition and scalar multiplication to make sense.<br />
We will restate them shortly, unchanged, except that their titles and acronyms no<br />
longer refer to column vectors, and the hypothesis of being in a vector space has<br />
been added. Take the time now to look forward and review each one, and begin<br />
to form some connections to what we have done earlier and what we will be doing<br />
in subsequent sections and chapters. Specifically, compare the following pairs of<br />
definitions:<br />
• Definition LCCV and Definition LC<br />
• Definition SSCV and Definition SS<br />
• Definition RLDCV and Definition RLD<br />
• Definition LICV and Definition LI<br />
Reading Questions
1. Comment on how the vector space C^m went from a theorem (Theorem VSPCV) to an<br />
example (Example VSCV).<br />
2. In the crazy vector space, C, (Example CVS) compute the linear combination<br />
2(3, 4) + (−6)(1, 2).<br />
3. Suppose that α is a scalar and 0 is the zero vector. Why should we prove anything as<br />
obvious as α0 = 0, as we did in Theorem ZVSM?<br />
Exercises<br />
M10 Define a possibly new vector space by beginning with the set and vector addition<br />
from C^2 (Example VSCV) but change the definition of scalar multiplication to<br />
αx = 0 = [0; 0]        for all α ∈ C, x ∈ C^2<br />
Prove that the first nine properties required for a vector space hold, but Property O does<br />
not hold.<br />
This example shows us that we cannot expect to be able to derive Property O as a<br />
consequence of assuming the first nine properties. In other words, we cannot slim down our<br />
list of properties by jettisoning the last one, and still have the same collection of objects<br />
qualify as vector spaces.<br />
M11† Let V be the set C^2 with the usual vector addition, but with scalar multiplication<br />
defined by<br />
α[x; y] = [αy; αx]<br />
Determine whether or not V is a vector space with these operations.<br />
M12† Let V be the set C^2 with the usual scalar multiplication, but with vector addition<br />
defined by<br />
[x; y] + [z; w] = [y + w; x + z]<br />
Determine whether or not V is a vector space with these operations.<br />
M13† Let V be the set M22 with the usual scalar multiplication, but with addition defined<br />
by A + B = O2,2 for all 2 × 2 matrices A and B. Determine whether or not V is a vector<br />
space with these operations.<br />
M14† Let V be the set M22 with the usual addition, but with scalar multiplication defined<br />
by αA = O2,2 for all 2 × 2 matrices A and scalars α. Determine whether or not V is a<br />
vector space with these operations.<br />
M15† Consider the following sets of 3 × 3 matrices, where the symbol ∗ indicates the<br />
position of an arbitrary complex number. Determine whether or not these sets form vector<br />
spaces with the usual operations of addition and scalar multiplication for matrices.<br />
1. All matrices of the form<br />
[∗ ∗ 1; ∗ 1 ∗; 1 ∗ ∗]<br />
2. All matrices of the form<br />
[∗ 0 ∗; 0 ∗ 0; ∗ 0 ∗]<br />
3. All matrices of the form<br />
[∗ 0 0; 0 ∗ 0; 0 0 ∗]   (These are the diagonal matrices.)<br />
4. All matrices of the form<br />
[∗ ∗ ∗; 0 ∗ ∗; 0 0 ∗]   (These are the upper triangular matrices.)<br />
M20† Explain why we need to define the vector space Pn as the set of all polynomials<br />
with degree up to and including n instead of the more obvious set of all polynomials of<br />
degree exactly n.<br />
M21† The set of integers is denoted Z. Does the set Z^2 = { [m; n] | m, n ∈ Z } with the<br />
operations of standard addition and scalar multiplication of vectors form a vector space?<br />
T10 Prove each of the ten properties of Definition VS for each of the following examples<br />
of a vector space: Example VSP, Example VSIS, Example VSF, Example VSS.<br />
The next three problems suggest that under the right situations we can “cancel.” In practice,<br />
these techniques should be avoided in other proofs. Prove each of the following statements.<br />
T21† Suppose that V is a vector space, and u, v, w ∈ V. If w + u = w + v, then<br />
u = v.<br />
T22† Suppose V is a vector space, u, v ∈ V and α is a nonzero scalar from C. If<br />
αu = αv, then u = v.<br />
T23† Suppose V is a vector space, u ≠ 0 is a vector in V and α, β ∈ C. If αu = βu,<br />
then α = β.<br />
T30† Suppose that V is a vector space and α ∈ C is a scalar such that αx = x<br />
for every x ∈ V. Prove that α = 1. In other words, Property O is not duplicated for<br />
any other scalar but the “special” scalar, 1. (This question was suggested by James<br />
Gallagher.)<br />
Section S<br />
Subspaces<br />
A subspace is a vector space that is contained within another vector space. So every<br />
subspace is a vector space in its own right, but it is also defined relative to some<br />
other (larger) vector space. We will discover shortly that we are already familiar<br />
with a wide variety of subspaces from previous sections.<br />
Subsection S<br />
Subspaces<br />
Here is the principal definition for this section.<br />
Definition S Subspace<br />
Suppose that V and W are two vector spaces that have identical definitions of vector<br />
addition and scalar multiplication, and that W is a subset of V, W ⊆ V. Then W is<br />
a subspace of V.<br />
□<br />
Let us look at an example of a vector space inside another vector space.<br />
Example SC3 A subspace of C^3<br />
We know that C^3 is a vector space (Example VSCV). Consider the subset<br />
W = { [x1; x2; x3] | 2x1 − 5x2 + 7x3 = 0 }<br />
It is clear that W ⊆ C^3, since the objects in W are column vectors of size 3. But<br />
is W a vector space? Does it satisfy the ten properties of Definition VS when we use<br />
the same operations?<br />
That is the main question.<br />
Suppose x = [x1; x2; x3] and y = [y1; y2; y3] are vectors from W. Then we know that these<br />
vectors cannot be totally arbitrary, they must have gained membership in W by<br />
virtue of meeting the membership test. For example, we know that x must satisfy<br />
2x1 − 5x2 + 7x3 = 0 while y must satisfy 2y1 − 5y2 + 7y3 = 0. Our first property<br />
(Property AC) asks the question, is x + y ∈ W? When our set of vectors was C^3,<br />
this was an easy question to answer. Now it is not so obvious. Notice first that<br />
x + y = [x1; x2; x3] + [y1; y2; y3] = [x1 + y1; x2 + y2; x3 + y3]<br />
and we can test this vector for membership in W as follows. Because x ∈ W we know<br />
2x1 − 5x2 + 7x3 = 0 and because y ∈ W we know 2y1 − 5y2 + 7y3 = 0. Therefore,<br />
2(x1 + y1) − 5(x2 + y2) + 7(x3 + y3) = 2x1 + 2y1 − 5x2 − 5y2 + 7x3 + 7y3<br />
= (2x1 − 5x2 + 7x3) + (2y1 − 5y2 + 7y3)<br />
= 0 + 0<br />
= 0<br />
and by this computation we see that x + y ∈ W. One property down, nine to go.<br />
If α is a scalar and x ∈ W, is it always true that αx ∈ W? This is what we need<br />
to establish Property SC. Again, the answer is not as obvious as it was when our<br />
set of vectors was all of C^3. Let us see.<br />
αx = α[x1; x2; x3] = [αx1; αx2; αx3]<br />
and we can test this vector for membership in W. First, note that because x ∈ W<br />
we know 2x1 − 5x2 + 7x3 = 0. Therefore,<br />
2(αx1) − 5(αx2) + 7(αx3) = α(2x1 − 5x2 + 7x3)<br />
= α0<br />
= 0<br />
and we see that indeed αx ∈ W. Always.<br />
If W has a zero vector, it will be unique (Theorem ZVU). The zero vector for C^3<br />
should also perform the required duties when added to elements of W. So the likely<br />
candidate for a zero vector in W is the same zero vector that we know C^3 has. You<br />
can check that 0 = [0; 0; 0] is a zero vector in W too (Property Z).<br />
With a zero vector, we can now ask about additive inverses (Property AI). As<br />
you might suspect, the natural candidate for an additive inverse in W is the same as<br />
the additive inverse from C^3. However, we must ensure that these additive inverses<br />
actually are elements of W. Given x ∈ W, is −x ∈ W?<br />
−x = [−x1; −x2; −x3]<br />
and we can test this vector for membership in W. As before, because x ∈ W we<br />
know 2x1 − 5x2 + 7x3 = 0.<br />
2(−x1) − 5(−x2) + 7(−x3) = −(2x1 − 5x2 + 7x3)<br />
= −0<br />
= 0<br />
and we now believe that −x ∈ W.<br />
Is the vector addition in W commutative (Property C)? Is x + y = y + x? Of<br />
course! Nothing about restricting the scope of our set of vectors will prevent the<br />
operation from still being commutative. Indeed, the remaining five properties are<br />
unaffected by the transition to a smaller set of vectors, and so remain true. That<br />
was convenient.<br />
So W satisfies all ten properties, is therefore a vector space, and thus earns the<br />
title of being a subspace of C^3.<br />
△<br />
Subsection TS<br />
Testing Subspaces<br />
In Example SC3 we proceeded through all ten of the vector space properties before<br />
believing that a subset was a subspace. But six of the properties were easy to prove,<br />
and we can lean on some of the properties of the vector space (the superset) to make<br />
the other four easier. Here is a theorem that will make it easier to test if a subset is<br />
a vector space. A shortcut if there ever was one.<br />
Theorem TSS Testing Subsets for Subspaces<br />
Suppose that V is a vector space and W is a subset of V, W ⊆ V. Endow W with<br />
the same operations as V. Then W is a subspace if and only if three conditions are<br />
met:<br />
1. W is nonempty, W ≠ ∅.<br />
2. If x ∈ W and y ∈ W, then x + y ∈ W.<br />
3. If α ∈ C and x ∈ W, then αx ∈ W.<br />
Proof. (⇒) We have the hypothesis that W is a subspace, so by Property Z we know<br />
that W contains a zero vector. This is enough to show that W ≠ ∅. Also, since W is<br />
a vector space it satisfies the additive and scalar multiplication closure properties<br />
(Property AC, Property SC), and so exactly meets the second and third conditions.<br />
If that was easy, the other direction might require a bit more work.<br />
(⇐) We have three properties for our hypothesis, and from this we should<br />
conclude that W has the ten defining properties of a vector space. The second and<br />
third conditions of our hypothesis are exactly Property AC and Property SC. Our<br />
hypothesis that V is a vector space implies that Property C, Property AA, Property<br />
SMA, Property DVA, Property DSA and Property O all hold. They continue to be<br />
true for vectors from W since passing to a subset, and keeping the operation the<br />
same, leaves their statements unchanged. Eight down, two to go.<br />
Suppose x ∈ W. Then by the third part of our hypothesis (scalar closure), we<br />
know that (−1)x ∈ W. By Theorem AISM, (−1)x = −x, so together these statements<br />
show us that −x ∈ W. −x is the additive inverse of x in V, but will continue in<br />
this role when viewed as an element of the subset W. So every element of W has an<br />
additive inverse that is an element of W and Property AI is completed. Just one<br />
property left.<br />
While we have implicitly discussed the zero vector in the previous paragraph, we<br />
need to be certain that the zero vector (of V) really lives in W. Since W is nonempty,<br />
we can choose some vector z ∈ W. Then by the argument in the previous paragraph,<br />
we know −z ∈ W. Now by Property AI for V and then by the second part of our<br />
hypothesis (additive closure) we see that<br />
0 = z + (−z) ∈ W<br />
So W contains the zero vector from V. Since this vector performs the required<br />
duties of a zero vector in V, it will continue in that role as an element of W. This<br />
gives us Property Z, the final property of the ten required. (Sarah Fellez contributed<br />
to this proof.)<br />
<br />
So just three conditions, plus being a subset of a known vector space, gets us all<br />
ten properties. Fabulous! This theorem can be paraphrased by saying that a subspace<br />
is “a nonempty subset (of a vector space) that is closed under vector addition and<br />
scalar multiplication.”<br />
You might want to go back and rework Example SC3 in light of this result,<br />
perhaps seeing where we can now economize or where the work done in the example<br />
mirrored the proof and where it did not. We will press on and apply this theorem in<br />
a slightly more abstract setting.<br />
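The three conditions of Theorem TSS can also be spot-checked numerically. The sketch below (not from the text; the function name and sampling scheme are our own, and passing the check is only evidence, never a proof) samples sums and scalar multiples of known members of the plane from Example SC3:

```python
import random

def spot_check_tss(members, in_W, trials=100):
    # members: a nonempty list of known elements of W (condition 1)
    # in_W:    the membership test for the candidate subset W
    for _ in range(trials):
        x = random.choice(members)
        y = random.choice(members)
        a = random.uniform(-10, 10)
        s = tuple(xi + yi for xi, yi in zip(x, y))   # condition 2: x + y
        m = tuple(a * xi for xi in x)                # condition 3: alpha x
        if not (in_W(s) and in_W(m)):
            return False
    return True

# the plane 2 x1 - 5 x2 + 7 x3 = 0 from Example SC3 (with a floating point tolerance)
plane = lambda v: abs(2 * v[0] - 5 * v[1] + 7 * v[2]) < 1e-9
print(spot_check_tss([(5, 2, 0), (-7, 0, 2), (0, 7, 5)], plane))  # True
```

Running the same check on the non-subspaces of the later examples (NSC2Z, NSC2A, NSC2S) would quickly return False.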
Example SP4 A subspace of P4<br />
P4 is the vector space of polynomials with degree at most 4 (Example VSP). Define<br />
a subset W as<br />
W = { p(x) | p ∈ P4, p(2) = 0 }<br />
so W is the collection of those polynomials (with degree 4 or less) whose graphs<br />
cross the x-axis at x = 2. Whenever we encounter a new set it is a good idea to gain<br />
a better understanding of the set by finding a few elements in the set, and a few<br />
outside it. For example x^2 − x − 2 ∈ W, while x^4 + x^3 − 7 ∉ W.<br />
Is W nonempty? Yes, x − 2 ∈ W.<br />
Additive closure? Suppose p ∈ W and q ∈ W. Is p + q ∈ W? p and q are not<br />
totally arbitrary, we know that p(2) = 0 and q(2) = 0. Then we can check p + q for<br />
membership in W,<br />
(p + q)(2) = p(2) + q(2)        Addition in P4<br />
= 0 + 0                        p ∈ W, q ∈ W<br />
= 0<br />
so we see that p + q qualifies for membership in W.<br />
Scalar multiplication closure? Suppose that α ∈ C and p ∈ W. Then we know<br />
that p(2) = 0. Testing αp for membership,<br />
(αp)(2) = αp(2)                Scalar multiplication in P4<br />
= α0                           p ∈ W<br />
= 0<br />
so αp ∈ W.<br />
We have shown that W meets the three conditions of Theorem TSS and so<br />
qualifies as a subspace of P4. Notice that by Definition S we now know that W is<br />
also a vector space. So all the properties of a vector space (Definition VS) and the<br />
theorems of Section VS apply in full.<br />
△<br />
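A quick numerical check (not part of the text) of the two closure computations in Example SP4, representing a polynomial in P4 by its coefficient list [c0, c1, c2, c3, c4]:

```python
def ev(p, x):
    # evaluate the coefficient list [c0, c1, ..., c4] at x by Horner's rule
    r = 0
    for c in reversed(p):
        r = r * x + c
    return r

p = [-2, -1, 1, 0, 0]    # x^2 - x - 2, in W since p(2) = 0
q = [-16, 0, 0, 0, 1]    # x^4 - 16, in W since q(2) = 0

s = [a + b for a, b in zip(p, q)]   # p + q, coefficientwise
m = [7 * a for a in p]              # the scalar multiple 7p

print(ev(s, 2), ev(m, 2))           # 0 0 -- both land back in W
print(ev([-7, 0, 0, 1, 1], 2))      # x^4 + x^3 - 7 gives 17, so it is not in W
```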
Much of the power of Theorem TSS is that we can easily establish new vector<br />
spaces if we can locate them as subsets of other vector spaces, such as the vector<br />
spaces presented in Subsection VS.EVS.<br />
It can be as instructive to consider some subsets that are not subspaces. Since<br />
Theorem TSS is an equivalence (see Proof Technique E) we can be assured that a<br />
subset is not a subspace if it violates one of the three conditions, and in any example<br />
of interest this will not be the “nonempty” condition. However, since a subspace has<br />
to be a vector space in its own right, we can also search for a violation of any one<br />
of the ten defining properties in Definition VS or any inherent property of a vector<br />
space, such as those given by the basic theorems of Subsection VS.VSP. Notice also<br />
that a violation need only be for a specific vector or pair of vectors.<br />
Example NSC2Z A non-subspace in C^2, zero vector<br />
Consider the subset W below as a candidate for being a subspace of C^2<br />
W = { [x1; x2] | 3x1 − 5x2 = 12 }<br />
The zero vector of C^2, 0 = [0; 0], will need to be the zero vector in W also. However,<br />
0 ∉ W since 3(0) − 5(0) = 0 ≠ 12. So W has no zero vector and fails Property Z<br />
of Definition VS. This subset also fails to be closed under addition and scalar<br />
multiplication. Can you find examples of this?<br />
△<br />
Example NSC2A A non-subspace in C^2, additive closure<br />
Consider the subset X below as a candidate for being a subspace of C^2<br />
X = { [x1; x2] | x1 x2 = 0 }<br />
You can check that 0 ∈ X, so the approach of the last example will not get us<br />
anywhere. However, notice that x = [1; 0] ∈ X and y = [0; 1] ∈ X. Yet<br />
x + y = [1; 0] + [0; 1] = [1; 1] ∉ X<br />
So X fails the additive closure requirement of either Property AC or Theorem<br />
TSS, and is therefore not a subspace.<br />
△
Example NSC2S A non-subspace in C^2, scalar multiplication closure<br />
Consider the subset Y below as a candidate for being a subspace of C^2<br />
Y = { [x1; x2] | x1 ∈ Z, x2 ∈ Z }<br />
Z is the set of integers, so we are only allowing “whole numbers” as the constituents<br />
of our vectors. Now, 0 ∈ Y, and additive closure also holds (can you prove these<br />
claims?). So we will have to try something different. Note that α = 1/2 ∈ C and<br />
x = [2; 3] ∈ Y, but<br />
αx = (1/2)[2; 3] = [1; 3/2] ∉ Y<br />
So Y fails the scalar multiplication closure requirement of either Property SC or<br />
Theorem TSS, and is therefore not a subspace.<br />
△<br />
There are two examples of subspaces that are trivial. Suppose that V is any<br />
vector space. Then V is a subset of itself and is a vector space. By Definition S, V<br />
qualifies as a subspace of itself. The set containing just the zero vector Z = {0} is<br />
also a subspace as can be seen by applying Theorem TSS or by simple modifications<br />
of the techniques hinted at in Example VSS. Since these subspaces are so obvious<br />
(and therefore not too interesting) we will refer to them as being trivial.<br />
Definition TS Trivial Subspaces<br />
Given the vector space V, the subspaces V and {0} are each called a trivial<br />
subspace.<br />
□<br />
We can also use Theorem TSS to prove more general statements about subspaces,<br />
as illustrated in the next theorem.<br />
Theorem NSMS Null Space of a Matrix is a Subspace<br />
Suppose that A is an m × n matrix. Then the null space of A, N(A), is a subspace<br />
of C^n.<br />
Proof. We will examine the three requirements of Theorem TSS. Recall that Definition<br />
NSM can be formulated as N(A) = { x ∈ C^n | Ax = 0 }.<br />
First, 0 ∈ N(A), which can be inferred as a consequence of Theorem HSC. So<br />
N(A) ≠ ∅.<br />
Second, check additive closure by supposing that x ∈ N(A) and y ∈ N(A). So<br />
we know a little something about x and y: Ax = 0 and Ay = 0, and that is all we<br />
know. Question: Is x + y ∈ N(A)? Let us check.<br />
A(x + y) = Ax + Ay             Theorem MMDAA<br />
= 0 + 0                        x ∈ N(A), y ∈ N(A)<br />
= 0                            Theorem VSPCV<br />
So, yes, x + y qualifies for membership in N(A).<br />
Third, check scalar multiplication closure by supposing that α ∈ C and x ∈ N(A).<br />
So we know a little something about x: Ax = 0, and that is all we know. Question:<br />
Is αx ∈ N(A)? Let us check.<br />
A(αx) = α(Ax)                  Theorem MMSMM<br />
= α0                           x ∈ N(A)<br />
= 0                            Theorem ZVSM<br />
So, yes, αx qualifies for membership in N(A).<br />
Having met the three conditions in Theorem TSS we can now say that the null<br />
space of a matrix is a subspace (and hence a vector space in its own right!). <br />
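A small numerical illustration of Theorem NSMS (not part of the text; the matrix A below is a hypothetical example chosen for its obvious null space):

```python
import numpy as np

A = np.array([[1, 2, 3], [2, 4, 6]], dtype=float)   # rank 1, so nullity 2

x = np.array([3.0, 0.0, -1.0])    # A x = 0, so x is in N(A)
y = np.array([-2.0, 1.0, 0.0])    # A y = 0, so y is in N(A)

print(np.allclose(A @ x, 0), np.allclose(A @ y, 0))   # True True
print(np.allclose(A @ (x + y), 0))                    # additive closure: True
print(np.allclose(A @ (5 * y), 0))                    # scalar closure: True
```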
Here is an example where we can exercise Theorem NSMS.<br />
Example RSNS Recasting a subspace as a null space<br />
Consider the subset of C^5 defined as<br />
W = { [x1; x2; x3; x4; x5] | 3x1 + x2 − 5x3 + 7x4 + x5 = 0,<br />
4x1 + 6x2 + 3x3 − 6x4 − 5x5 = 0, −2x1 + 4x2 + 7x4 + x5 = 0 }<br />
It is possible to show that W is a subspace of C^5 by checking the three conditions<br />
of Theorem TSS directly, but it will get tedious rather quickly. Instead, give W<br />
a fresh look and notice that it is a set of solutions to a homogeneous system of<br />
equations. Define the matrix<br />
A = [3 1 −5 7 1; 4 6 3 −6 −5; −2 4 0 7 1]<br />
and then recognize that W = N(A). By Theorem NSMS we can immediately see<br />
that W is a subspace. Boom!<br />
△<br />
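One common numerical recipe (not from the text) for producing members of N(A), and hence of W, uses the singular value decomposition: the right singular vectors beyond the rank of A span the null space. A sketch:

```python
import numpy as np

# The matrix A from Example RSNS, whose null space is W.
A = np.array([[ 3, 1, -5,  7,  1],
              [ 4, 6,  3, -6, -5],
              [-2, 4,  0,  7,  1]], dtype=float)

# Rows of Vt past the rank of A form an orthonormal basis of N(A).
_, _, Vt = np.linalg.svd(A)
rank = np.linalg.matrix_rank(A)
null_basis = Vt[rank:]                   # here rank is 3, so 2 basis vectors

x = null_basis[0] + 2 * null_basis[1]    # an arbitrary member of N(A)
print(np.allclose(A @ x, 0))             # True: x satisfies all three equations
```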
Subsection TSS<br />
The Span of a Set<br />
The span of a set of column vectors got a heavy workout in Chapter V and Chapter<br />
M. The definition of the span depended only on being able to formulate linear<br />
combinations. In any of our more general vector spaces we always have a definition<br />
of vector addition and of scalar multiplication. So we can build linear combinations<br />
and manufacture spans. This subsection contains two definitions that are just mild<br />
variants of definitions we have seen earlier for column vectors. If you have not already,<br />
compare them with Definition LCCV and Definition SSCV.<br />
Definition LC Linear Combination<br />
Suppose that V is a vector space. Given n vectors u1, u2, u3, ..., un and n scalars<br />
α1, α2, α3, ..., αn, their linear combination is the vector<br />
α1u1 + α2u2 + α3u3 + ··· + αnun.<br />
□<br />
Example LCM A linear combination of matrices<br />
In the vector space M23 of 2 × 3 matrices, we have the vectors<br />
x = [1 3 −2; 2 0 7]    y = [3 −1 2; 5 5 1]    z = [4 2 −4; 1 1 1]<br />
and we can form linear combinations such as<br />
2x + 4y + (−1)z = 2[1 3 −2; 2 0 7] + 4[3 −1 2; 5 5 1] + (−1)[4 2 −4; 1 1 1]<br />
= [2 6 −4; 4 0 14] + [12 −4 8; 20 20 4] + [−4 −2 4; −1 −1 −1]<br />
= [10 0 8; 23 19 17]<br />
or,<br />
4x − 2y + 3z = 4[1 3 −2; 2 0 7] − 2[3 −1 2; 5 5 1] + 3[4 2 −4; 1 1 1]<br />
= [4 12 −8; 8 0 28] + [−6 2 −4; −10 −10 −2] + [12 6 −12; 3 3 3]<br />
= [10 20 −24; 1 −7 29]<br />
△<br />
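The matrix arithmetic in Example LCM is easy to confirm with numpy, where entrywise addition and scalar multiplication are exactly the vector space operations of M23 (this check is not part of the text):

```python
import numpy as np

x = np.array([[1, 3, -2], [2, 0, 7]])
y = np.array([[3, -1, 2], [5, 5, 1]])
z = np.array([[4, 2, -4], [1, 1, 1]])

print(2 * x + 4 * y + (-1) * z)   # [[10  0  8], [23 19 17]]
print(4 * x - 2 * y + 3 * z)      # [[ 10  20 -24], [  1  -7  29]]
```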
When we realize that we can form linear combinations in any vector space, then<br />
it is natural to revisit our definition of the span of a set, since it is the set of all<br />
possible linear combinations of a set of vectors.<br />
Definition SS Span of a Set<br />
Suppose that V is a vector space. Given a set of vectors S = {u1, u2, u3, ..., ut},<br />
their span, ⟨S⟩, is the set of all possible linear combinations of u1, u2, u3, ..., ut.<br />
Symbolically,<br />
⟨S⟩ = { α1u1 + α2u2 + α3u3 + ··· + αtut | αi ∈ C, 1 ≤ i ≤ t }<br />
= { ∑_{i=1}^{t} αiui | αi ∈ C, 1 ≤ i ≤ t }<br />
□<br />
Theorem SSS Span of a Set is a Subspace<br />
Suppose V is a vector space. Given a set of vectors S = {u1, u2, u3, ..., ut} ⊆ V,<br />
their span, ⟨S⟩, is a subspace.<br />
Proof. By Definition SS, the span contains linear combinations of vectors from the<br />
vector space V, so by repeated use of the closure properties, Property AC and<br />
Property SC, ⟨S⟩ can be seen to be a subset of V.<br />
We will then verify the three conditions of Theorem TSS. First,<br />
0 = 0 + 0 + 0 + ··· + 0                  Property Z<br />
= 0u1 + 0u2 + 0u3 + ··· + 0ut            Theorem ZSSM<br />
So we have written 0 as a linear combination of the vectors in S and by Definition<br />
SS, 0 ∈ ⟨S⟩ and therefore ⟨S⟩ ≠ ∅.<br />
Second, suppose x ∈ ⟨S⟩ and y ∈ ⟨S⟩. Can we conclude that x + y ∈ ⟨S⟩? What<br />
do we know about x and y by virtue of their membership in ⟨S⟩? There must be<br />
scalars from C, α1, α2, α3, ..., αt and β1, β2, β3, ..., βt so that<br />
x = α1u1 + α2u2 + α3u3 + ··· + αtut<br />
y = β1u1 + β2u2 + β3u3 + ··· + βtut<br />
Then<br />
x + y = α1u1 + α2u2 + α3u3 + ··· + αtut + β1u1 + β2u2 + β3u3 + ··· + βtut<br />
= α1u1 + β1u1 + α2u2 + β2u2 + α3u3 + β3u3 + ··· + αtut + βtut     Property AA, Property C<br />
= (α1 + β1)u1 + (α2 + β2)u2 + (α3 + β3)u3 + ··· + (αt + βt)ut     Property DSA<br />
Since each αi + βi is again a scalar from C we have expressed the vector sum x + y<br />
as a linear combination of the vectors from S, and therefore by Definition SS we can<br />
say that x + y ∈ ⟨S⟩.<br />
Third, suppose α ∈ C and x ∈ ⟨S⟩. Can we conclude that αx ∈ ⟨S⟩? What do<br />
we know about x by virtue of its membership in ⟨S⟩? There must be scalars from C,<br />
α1, α2, α3, ..., αt so that<br />
x = α1u1 + α2u2 + α3u3 + ··· + αtut<br />
Then<br />
αx = α(α1u1 + α2u2 + α3u3 + ··· + αtut)<br />
= α(α1u1) + α(α2u2) + α(α3u3) + ··· + α(αtut)          Property DVA<br />
= (αα1)u1 + (αα2)u2 + (αα3)u3 + ··· + (ααt)ut          Property SMA<br />
Since each ααi is again a scalar from C we have expressed the scalar multiple αx as<br />
a linear combination of the vectors from S, and therefore by Definition SS we can<br />
say that αx ∈ ⟨S⟩.<br />
With the three conditions of Theorem TSS met, we can say that ⟨S⟩ is a subspace<br />
(and so is also a vector space, Definition VS). (See Exercise SS.T20, Exercise SS.T21,<br />
Exercise SS.T22.)<br />
<br />
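The two closure computations in the proof can be sanity-checked numerically. The sketch below (Python; the specific vectors u_1, u_2 in R^3 are an illustrative choice, not from the text) confirms that adding two linear combinations of a fixed set yields the combination with summed scalars, and that scaling yields the combination with scaled scalars.<br />

```python
def add(v, w):
    # vector addition, componentwise
    return tuple(a + b for a, b in zip(v, w))

def scale(c, v):
    # scalar multiplication, componentwise
    return tuple(c * a for a in v)

def lincomb(coeffs, vecs):
    # alpha_1 u_1 + alpha_2 u_2 + ... built from the two operations above
    out = (0,) * len(vecs[0])
    for c, v in zip(coeffs, vecs):
        out = add(out, scale(c, v))
    return out

# an illustrative S = {u1, u2} in R^3 (hypothetical, not from the text)
S = [(1, 2, 0), (0, 1, 3)]
x = lincomb([2, -1], S)   # x is in <S>
y = lincomb([4, 5], S)    # y is in <S>

# x + y is the combination with scalars alpha_i + beta_i (Property DSA)
assert add(x, y) == lincomb([2 + 4, -1 + 5], S)
# 7x is the combination with scalars 7*alpha_i (Property SMA)
assert scale(7, x) == lincomb([7 * 2, 7 * (-1)], S)
```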
Example SSP Span of a set of polynomials<br />
In Example SP4 we proved that<br />
\[ W = \{ p(x) \mid p \in P_4,\ p(2) = 0 \} \]<br />
is a subspace of P_4, the vector space of polynomials of degree at most 4. Since W is<br />
a vector space itself, let us construct a span within W. First let<br />
\[ S = \left\{ x^4 - 4x^3 + 5x^2 - x - 2,\ 2x^4 - 3x^3 - 6x^2 + 6x + 4 \right\} \]<br />
and verify that S is a subset of W by checking that each of these two polynomials<br />
has x = 2 as a root. Now, if we define U = 〈S〉, then Theorem SSS tells us that U is<br />
a subspace of W. So quite quickly we have built a chain of subspaces, U inside W,<br />
and W inside P_4.<br />
Rather than dwell on how quickly we can build subspaces, let us try to gain a<br />
better understanding of just how the span construction creates subspaces, in the<br />
context of this example. We can quickly build representative elements of U,<br />
\[ 3(x^4 - 4x^3 + 5x^2 - x - 2) + 5(2x^4 - 3x^3 - 6x^2 + 6x + 4) = 13x^4 - 27x^3 - 15x^2 + 27x + 14 \]<br />
and<br />
\[ (-2)(x^4 - 4x^3 + 5x^2 - x - 2) + 8(2x^4 - 3x^3 - 6x^2 + 6x + 4) = 14x^4 - 16x^3 - 58x^2 + 50x + 36 \]<br />
and each of these polynomials must be in W since W is closed under addition and<br />
scalar multiplication. But you might check for yourself that both of these polynomials<br />
have x = 2 as a root.<br />
I can tell you that y = 3x^4 − 7x^3 − x^2 + 7x − 2 is not in U, but would you believe<br />
me? A first check shows that y does have x = 2 as a root, but that only shows that<br />
y ∈ W. What does y have to do to gain membership in U = 〈S〉? It must be a linear<br />
combination of the vectors in S, x^4 − 4x^3 + 5x^2 − x − 2 and 2x^4 − 3x^3 − 6x^2 + 6x + 4.<br />
So let us suppose that y is such a linear combination,<br />
\begin{align*}
y &= 3x^4 - 7x^3 - x^2 + 7x - 2 \\
&= \alpha_1(x^4 - 4x^3 + 5x^2 - x - 2) + \alpha_2(2x^4 - 3x^3 - 6x^2 + 6x + 4) \\
&= (\alpha_1 + 2\alpha_2)x^4 + (-4\alpha_1 - 3\alpha_2)x^3 + (5\alpha_1 - 6\alpha_2)x^2 + (-\alpha_1 + 6\alpha_2)x + (-2\alpha_1 + 4\alpha_2)
\end{align*}<br />
Notice that the operations above are done in accordance with the definition of the<br />
vector space of polynomials (Example VSP). Now, if we equate coefficients, which is<br />
the definition of equality for polynomials, then we obtain the system of five linear<br />
equations in two variables<br />
\begin{align*}
\alpha_1 + 2\alpha_2 &= 3 \\
-4\alpha_1 - 3\alpha_2 &= -7 \\
5\alpha_1 - 6\alpha_2 &= -1 \\
-\alpha_1 + 6\alpha_2 &= 7 \\
-2\alpha_1 + 4\alpha_2 &= -2
\end{align*}<br />
Build an augmented matrix from the system and row-reduce,<br />
\[
\begin{bmatrix}
1 & 2 & 3 \\
-4 & -3 & -7 \\
5 & -6 & -1 \\
-1 & 6 & 7 \\
-2 & 4 & -2
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}
\]<br />
Since the final column of the row-reduced augmented matrix is a pivot column,<br />
Theorem RCLS tells us the system of equations is inconsistent. Therefore, there are<br />
no scalars, α_1 and α_2, to establish y as a linear combination of the elements of S.<br />
So y ∉ U.<br />
△<br />
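The inconsistency can also be seen with a small amount of arithmetic: the first two equations already determine α_1 and α_2, and that unique candidate fails a later equation. A sketch (Python, not part of the text):<br />

```python
from fractions import Fraction

# rows (a, b, r) encode a*alpha1 + b*alpha2 = r, one row per coefficient of y
eqs = [(1, 2, 3), (-4, -3, -7), (5, -6, -1), (-1, 6, 7), (-2, 4, -2)]

# solve the first two equations for the scalars by elimination
(a, b, r), (c, d, s) = eqs[0], eqs[1]
alpha2 = Fraction(s * a - r * c, d * a - b * c)
alpha1 = Fraction(r - b * alpha2, a)
assert (alpha1, alpha2) == (1, 1)

# the unique candidate violates the fourth equation (-1 + 6 = 5, not 7),
# so the system is inconsistent and y is not in U
assert any(p * alpha1 + q * alpha2 != t for p, q, t in eqs[2:])
```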
Let us again examine membership in a span.<br />
Example SM32 A subspace of M_32<br />
The set of all 3 × 2 matrices forms a vector space when we use the operations of<br />
matrix addition (Definition MA) and scalar matrix multiplication (Definition MSM),<br />
as was shown in Example VSM. Consider the subset<br />
\[
S = \left\{
\begin{bmatrix} 3 & 1 \\ 4 & 2 \\ 5 & -5 \end{bmatrix},\
\begin{bmatrix} 1 & 1 \\ 2 & -1 \\ 14 & -1 \end{bmatrix},\
\begin{bmatrix} 3 & -1 \\ -1 & 2 \\ -19 & -11 \end{bmatrix},\
\begin{bmatrix} 4 & 2 \\ 1 & -2 \\ 14 & -2 \end{bmatrix},\
\begin{bmatrix} 3 & 1 \\ -4 & 0 \\ -17 & 7 \end{bmatrix}
\right\}
\]<br />
and define a new subset of vectors W in M_32 using the span (Definition SS), W = 〈S〉.<br />
So by Theorem SSS we know that W is a subspace of M_32. While W is an infinite set,<br />
and this is a precise description, it would still be worthwhile to investigate whether<br />
or not W contains certain elements.<br />
First, is<br />
\[ y = \begin{bmatrix} 9 & 3 \\ 7 & 3 \\ 10 & -11 \end{bmatrix} \]<br />
in W? To answer this, we want to determine if y can be written as a linear combination<br />
of the five matrices in S. Can we find scalars, α_1, α_2, α_3, α_4, α_5 so that<br />
\begin{align*}
\begin{bmatrix} 9 & 3 \\ 7 & 3 \\ 10 & -11 \end{bmatrix}
&= \alpha_1 \begin{bmatrix} 3 & 1 \\ 4 & 2 \\ 5 & -5 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} 1 & 1 \\ 2 & -1 \\ 14 & -1 \end{bmatrix}
+ \alpha_3 \begin{bmatrix} 3 & -1 \\ -1 & 2 \\ -19 & -11 \end{bmatrix}
+ \alpha_4 \begin{bmatrix} 4 & 2 \\ 1 & -2 \\ 14 & -2 \end{bmatrix}
+ \alpha_5 \begin{bmatrix} 3 & 1 \\ -4 & 0 \\ -17 & 7 \end{bmatrix} \\
&= \begin{bmatrix}
3\alpha_1 + \alpha_2 + 3\alpha_3 + 4\alpha_4 + 3\alpha_5 & \alpha_1 + \alpha_2 - \alpha_3 + 2\alpha_4 + \alpha_5 \\
4\alpha_1 + 2\alpha_2 - \alpha_3 + \alpha_4 - 4\alpha_5 & 2\alpha_1 - \alpha_2 + 2\alpha_3 - 2\alpha_4 \\
5\alpha_1 + 14\alpha_2 - 19\alpha_3 + 14\alpha_4 - 17\alpha_5 & -5\alpha_1 - \alpha_2 - 11\alpha_3 - 2\alpha_4 + 7\alpha_5
\end{bmatrix}
\end{align*}<br />
Using our definition of matrix equality (Definition ME) we can translate this<br />
statement into six equations in the five unknowns,<br />
\begin{align*}
3\alpha_1 + \alpha_2 + 3\alpha_3 + 4\alpha_4 + 3\alpha_5 &= 9 \\
\alpha_1 + \alpha_2 - \alpha_3 + 2\alpha_4 + \alpha_5 &= 3 \\
4\alpha_1 + 2\alpha_2 - \alpha_3 + \alpha_4 - 4\alpha_5 &= 7 \\
2\alpha_1 - \alpha_2 + 2\alpha_3 - 2\alpha_4 &= 3 \\
5\alpha_1 + 14\alpha_2 - 19\alpha_3 + 14\alpha_4 - 17\alpha_5 &= 10 \\
-5\alpha_1 - \alpha_2 - 11\alpha_3 - 2\alpha_4 + 7\alpha_5 &= -11
\end{align*}<br />
This is a linear system of equations, which we can represent with an augmented<br />
matrix and row-reduce in search of solutions. The matrix that is row-equivalent to<br />
the augmented matrix is<br />
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & \frac{5}{8} & 2 \\
0 & 1 & 0 & 0 & -\frac{19}{4} & -1 \\
0 & 0 & 1 & 0 & -\frac{7}{8} & 0 \\
0 & 0 & 0 & 1 & \frac{17}{8} & 1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]<br />
So we recognize that the system is consistent since the final column is not a pivot<br />
column (Theorem RCLS), and compute n − r = 5 − 4 = 1 free variables (Theorem<br />
FVCS). While there are infinitely many solutions, we are only in pursuit of a single<br />
solution, so let us choose the free variable α_5 = 0 for simplicity's sake. Then we<br />
easily see that α_1 = 2, α_2 = −1, α_3 = 0, α_4 = 1. So the scalars α_1 = 2, α_2 = −1,<br />
α_3 = 0, α_4 = 1, α_5 = 0 will provide a linear combination of the elements of S that<br />
equals y, as we can verify by checking,<br />
\[
\begin{bmatrix} 9 & 3 \\ 7 & 3 \\ 10 & -11 \end{bmatrix}
= 2 \begin{bmatrix} 3 & 1 \\ 4 & 2 \\ 5 & -5 \end{bmatrix}
+ (-1) \begin{bmatrix} 1 & 1 \\ 2 & -1 \\ 14 & -1 \end{bmatrix}
+ (1) \begin{bmatrix} 4 & 2 \\ 1 & -2 \\ 14 & -2 \end{bmatrix}
\]<br />
So with one particular linear combination in hand, we are convinced that y deserves<br />
to be a member of W = 〈S〉.<br />
Second, is<br />
\[ x = \begin{bmatrix} 2 & 1 \\ 3 & 1 \\ 4 & -2 \end{bmatrix} \]<br />
in W? To answer this, we want to determine if x can be written as a linear combination<br />
of the five matrices in S. Can we find scalars, α_1, α_2, α_3, α_4, α_5 so that<br />
\begin{align*}
\begin{bmatrix} 2 & 1 \\ 3 & 1 \\ 4 & -2 \end{bmatrix}
&= \alpha_1 \begin{bmatrix} 3 & 1 \\ 4 & 2 \\ 5 & -5 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} 1 & 1 \\ 2 & -1 \\ 14 & -1 \end{bmatrix}
+ \alpha_3 \begin{bmatrix} 3 & -1 \\ -1 & 2 \\ -19 & -11 \end{bmatrix}
+ \alpha_4 \begin{bmatrix} 4 & 2 \\ 1 & -2 \\ 14 & -2 \end{bmatrix}
+ \alpha_5 \begin{bmatrix} 3 & 1 \\ -4 & 0 \\ -17 & 7 \end{bmatrix} \\
&= \begin{bmatrix}
3\alpha_1 + \alpha_2 + 3\alpha_3 + 4\alpha_4 + 3\alpha_5 & \alpha_1 + \alpha_2 - \alpha_3 + 2\alpha_4 + \alpha_5 \\
4\alpha_1 + 2\alpha_2 - \alpha_3 + \alpha_4 - 4\alpha_5 & 2\alpha_1 - \alpha_2 + 2\alpha_3 - 2\alpha_4 \\
5\alpha_1 + 14\alpha_2 - 19\alpha_3 + 14\alpha_4 - 17\alpha_5 & -5\alpha_1 - \alpha_2 - 11\alpha_3 - 2\alpha_4 + 7\alpha_5
\end{bmatrix}
\end{align*}<br />
Using our definition of matrix equality (Definition ME) we can translate this statement<br />
into six equations in the five unknowns,<br />
\begin{align*}
3\alpha_1 + \alpha_2 + 3\alpha_3 + 4\alpha_4 + 3\alpha_5 &= 2 \\
\alpha_1 + \alpha_2 - \alpha_3 + 2\alpha_4 + \alpha_5 &= 1 \\
4\alpha_1 + 2\alpha_2 - \alpha_3 + \alpha_4 - 4\alpha_5 &= 3 \\
2\alpha_1 - \alpha_2 + 2\alpha_3 - 2\alpha_4 &= 1 \\
5\alpha_1 + 14\alpha_2 - 19\alpha_3 + 14\alpha_4 - 17\alpha_5 &= 4 \\
-5\alpha_1 - \alpha_2 - 11\alpha_3 - 2\alpha_4 + 7\alpha_5 &= -2
\end{align*}<br />
This is a linear system of equations, which we can represent with an augmented<br />
matrix and row-reduce in search of solutions. The matrix that is row-equivalent to<br />
the augmented matrix is<br />
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & \frac{5}{8} & 0 \\
0 & 1 & 0 & 0 & -\frac{19}{4} & 0 \\
0 & 0 & 1 & 0 & -\frac{7}{8} & 0 \\
0 & 0 & 0 & 1 & \frac{17}{8} & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]<br />
Since the last column is a pivot column, Theorem RCLS tells us that the system is<br />
inconsistent. Therefore, there are no values for the scalars that will place x in W,<br />
and so we conclude that x ∉ W.<br />
△<br />
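Both membership questions come down to the same entrywise arithmetic. The sketch below (Python, not part of the text) stores each 3 × 2 matrix as nested tuples and confirms the linear combination found for y:<br />

```python
# the five matrices of S, stored row by row
S = [
    ((3, 1), (4, 2), (5, -5)),
    ((1, 1), (2, -1), (14, -1)),
    ((3, -1), (-1, 2), (-19, -11)),
    ((4, 2), (1, -2), (14, -2)),
    ((3, 1), (-4, 0), (-17, 7)),
]

def comb(coeffs, mats):
    # entrywise linear combination of 3x2 matrices
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2))
        for i in range(3)
    )

# the scalars found above place y in W = <S>
assert comb([2, -1, 0, 1, 0], S) == ((9, 3), (7, 3), (10, -11))
```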
Notice how Example SSP and Example SM32 contained questions about membership<br />
in a span, but these questions quickly became questions about solutions to<br />
a system of linear equations. This will be a common theme going forward.<br />
Subsection SC<br />
Subspace Constructions<br />
Several of the subsets of vector spaces that we worked with in Chapter M are also<br />
subspaces — they are closed under vector addition and scalar multiplication in C^m.<br />
Theorem CSMS Column Space of a Matrix is a Subspace<br />
Suppose that A is an m × n matrix. Then C(A) is a subspace of C^m.<br />
Proof. Definition CSM shows us that C(A) is a subset of C^m, and that it is defined<br />
as the span of a set of vectors from C^m (the columns of the matrix). Since C(A) is a<br />
span, Theorem SSS says it is a subspace.<br />
<br />
That was easy! Notice that we could have used this same approach to prove that<br />
the null space is a subspace, since Theorem SSNS provided a description of the null<br />
space of a matrix as the span of a set of vectors. However, I much prefer the current<br />
proof of Theorem NSMS. Speaking of easy, here is a very easy theorem that exposes<br />
another of our constructions as creating subspaces.<br />
Theorem RSMS Row Space of a Matrix is a Subspace<br />
Suppose that A is an m × n matrix. Then R(A) is a subspace of C^n.<br />
Proof. Definition RSM says R(A) = C(A^t), so the row space of a matrix is a column<br />
space, and every column space is a subspace by Theorem CSMS. That's enough.<br />
One more.<br />
Theorem LNSMS Left Null Space of a Matrix is a Subspace<br />
Suppose that A is an m × n matrix. Then L(A) is a subspace of C^m.<br />
Proof. Definition LNS says L(A) = N(A^t), so the left null space is a null space, and<br />
every null space is a subspace by Theorem NSMS. Done.<br />
<br />
So the span of a set of vectors, and the null space, column space, row space and<br />
left null space of a matrix are all subspaces, and hence are all vector spaces, meaning<br />
they have all the properties detailed in Definition VS and in the basic theorems<br />
presented in Section VS. We have worked with these objects as just sets in Chapter<br />
V and Chapter M, but now we understand that they have much more structure. In<br />
particular, being closed under vector addition and scalar multiplication means a<br />
subspace is also closed under linear combinations.<br />
Reading Questions<br />
1. Summarize the three conditions that allow us to quickly test if a set is a subspace.<br />
2. Consider the set of vectors<br />
\[ W = \left\{ \begin{bmatrix} a \\ b \\ c \end{bmatrix} \,\middle|\, 3a - 2b + c = 5 \right\} \]<br />
Is the set W a subspace of C^3? Explain your answer.<br />
3. Name five general constructions of sets of column vectors (subsets of C^m) that we now<br />
know as subspaces.<br />
Exercises<br />
C15† Working within the vector space C^3, determine if b = \begin{bmatrix} 4 \\ 3 \\ 1 \end{bmatrix} is in the subspace W,<br />
\[ W = \left\langle \left\{ \begin{bmatrix} 3 \\ 2 \\ 3 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix},\ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} \right\} \right\rangle \]<br />
C16† Working within the vector space C^4, determine if b = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} is in the subspace W,<br />
\[ W = \left\langle \left\{ \begin{bmatrix} 1 \\ 2 \\ -1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ 3 \\ 1 \end{bmatrix},\ \begin{bmatrix} 2 \\ 1 \\ 1 \\ 2 \end{bmatrix} \right\} \right\rangle \]<br />
C17† Working within the vector space C^4, determine if b = \begin{bmatrix} 2 \\ 1 \\ 2 \\ 1 \end{bmatrix} is in the subspace W,<br />
\[ W = \left\langle \left\{ \begin{bmatrix} 1 \\ 2 \\ 0 \\ 2 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ 3 \\ 1 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 2 \end{bmatrix},\ \begin{bmatrix} 1 \\ 1 \\ 2 \\ 0 \end{bmatrix} \right\} \right\rangle \]<br />
C20† Working within the vector space P_3 of polynomials of degree 3 or less, determine if<br />
p(x) = x^3 + 6x + 4 is in the subspace W below.<br />
\[ W = \left\langle \left\{ x^3 + x^2 + x,\ x^3 + 2x - 6,\ x^2 - 5 \right\} \right\rangle \]<br />
C21† Consider the subspace<br />
\[ W = \left\langle \left\{ \begin{bmatrix} 2 & 1 \\ 3 & -1 \end{bmatrix},\ \begin{bmatrix} 4 & 0 \\ 2 & 3 \end{bmatrix},\ \begin{bmatrix} -3 & 1 \\ 2 & 1 \end{bmatrix} \right\} \right\rangle \]<br />
of the vector space of 2 × 2 matrices, M_22. Is C = \begin{bmatrix} -3 & 3 \\ 6 & -4 \end{bmatrix} an element of W?<br />
C25 Show that the set W = \left\{ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \,\middle|\, 3x_1 - 5x_2 = 12 \right\} from Example NSC2Z fails Property<br />
AC and Property SC.<br />
C26 Show that the set Y = \left\{ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \,\middle|\, x_1 \in \mathbb{Z},\ x_2 \in \mathbb{Z} \right\} from Example NSC2S has Property<br />
AC.<br />
M20† In C^3, the vector space of column vectors of size 3, prove that the set Z is a<br />
subspace.<br />
\[ Z = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \,\middle|\, 4x_1 - x_2 + 5x_3 = 0 \right\} \]<br />
T20† A square matrix A of size n is upper triangular if [A]_{ij} = 0 whenever i > j. Let<br />
UT_n be the set of all upper triangular matrices of size n. Prove that UT_n is a subspace of<br />
the vector space of all square matrices of size n, M_nn.<br />
T30† Let P be the set of all polynomials, of any degree. The set P is a vector space. Let<br />
E be the subset of P consisting of all polynomials with only terms of even degree. Prove or<br />
disprove: the set E is a subspace of P.<br />
T31† Let P be the set of all polynomials, of any degree. The set P is a vector space. Let<br />
F be the subset of P consisting of all polynomials with only terms of odd degree. Prove or<br />
disprove: the set F is a subspace of P.<br />
Section LISS<br />
Linear Independence and Spanning Sets<br />
A vector space is defined as a set with two operations, meeting ten properties<br />
(Definition VS). Just as the definition of span of a set of vectors only required<br />
knowing how to add vectors and how to multiply vectors by scalars, so it is with<br />
linear independence. A definition of a linearly independent set of vectors in an<br />
arbitrary vector space only requires knowing how to form linear combinations and<br />
equating these with the zero vector. Since every vector space must have a zero vector<br />
(Property Z), we always have a zero vector at our disposal.<br />
In this section we will also put a twist on the notion of the span of a set of<br />
vectors. Rather than beginning with a set of vectors and creating a subspace that is<br />
the span, we will instead begin with a subspace and look for a set of vectors whose<br />
span equals the subspace.<br />
The combination of linear independence and spanning will be very important<br />
going forward.<br />
Subsection LI<br />
Linear Independence<br />
Our previous definition of linear independence (Definition LICV) employed a relation<br />
of linear dependence that was a linear combination on one side of an equality and a<br />
zero vector on the other side. As a linear combination in a vector space (Definition<br />
LC) depends only on vector addition and scalar multiplication, and every vector<br />
space must have a zero vector (Property Z), we can extend our definition of linear<br />
independence from the setting of C^m to the setting of a general vector space V with<br />
almost no changes. Compare these next two definitions with Definition RLDCV and<br />
Definition LICV.<br />
Definition RLD Relation of Linear Dependence<br />
Suppose that V is a vector space. Given a set of vectors S = {u_1, u_2, u_3, …, u_n},<br />
an equation of the form<br />
\[ \alpha_1 u_1 + \alpha_2 u_2 + \alpha_3 u_3 + \cdots + \alpha_n u_n = 0 \]<br />
is a relation of linear dependence on S. If this equation is formed in a trivial<br />
fashion, i.e. α_i = 0, 1 ≤ i ≤ n, then we say it is a trivial relation of linear<br />
dependence on S.<br />
□<br />
Definition LI Linear Independence<br />
Suppose that V is a vector space. The set of vectors S = {u_1, u_2, u_3, …, u_n} from<br />
V is linearly dependent if there is a relation of linear dependence on S that is not<br />
trivial. In the case where the only relation of linear dependence on S is the trivial<br />
one, then S is a linearly independent set of vectors.<br />
□<br />
Notice the emphasis on the word “only.” This might remind you of the definition<br />
of a nonsingular matrix, where if the matrix is employed as the coefficient matrix of<br />
a homogeneous system then the only solution is the trivial one.<br />
Example LIP4 Linear independence in P_4<br />
In the vector space of polynomials with degree 4 or less, P_4 (Example VSP), consider<br />
the set S below<br />
\[ S = \left\{ 2x^4 + 3x^3 + 2x^2 - x + 10,\ -x^4 - 2x^3 + x^2 + 5x - 8,\ 2x^4 + x^3 + 10x^2 + 17x - 2 \right\} \]<br />
Is this set of vectors linearly independent or dependent? Consider that<br />
\begin{align*}
&3\left(2x^4 + 3x^3 + 2x^2 - x + 10\right) + 4\left(-x^4 - 2x^3 + x^2 + 5x - 8\right) \\
&\quad + (-1)\left(2x^4 + x^3 + 10x^2 + 17x - 2\right) = 0x^4 + 0x^3 + 0x^2 + 0x + 0 = 0
\end{align*}<br />
This is a nontrivial relation of linear dependence (Definition RLD) on the set S and<br />
so convinces us that S is linearly dependent (Definition LI).<br />
Now, I hear you say, “Where did those scalars come from?” Do not worry about<br />
that right now, just be sure you understand why the above explanation is sufficient to<br />
prove that S is linearly dependent. The remainder of the example will demonstrate<br />
how we might find these scalars if they had not been provided so readily.<br />
Let us look at another set of vectors (polynomials) from P_4. Let<br />
\begin{align*}
T = \{ &3x^4 - 2x^3 + 4x^2 + 6x - 1,\ -3x^4 + 1x^3 + 0x^2 + 4x + 2, \\
&4x^4 + 5x^3 - 2x^2 + 3x + 1,\ 2x^4 - 7x^3 + 4x^2 + 2x + 1 \}
\end{align*}<br />
Suppose we have a relation of linear dependence on this set,<br />
\begin{align*}
0 &= 0x^4 + 0x^3 + 0x^2 + 0x + 0 \\
&= \alpha_1\left(3x^4 - 2x^3 + 4x^2 + 6x - 1\right) + \alpha_2\left(-3x^4 + 1x^3 + 0x^2 + 4x + 2\right) \\
&\quad + \alpha_3\left(4x^4 + 5x^3 - 2x^2 + 3x + 1\right) + \alpha_4\left(2x^4 - 7x^3 + 4x^2 + 2x + 1\right)
\end{align*}<br />
Using our definitions of vector addition and scalar multiplication in P_4 (Example<br />
VSP), we arrive at,<br />
\begin{align*}
0x^4 + 0x^3 + 0x^2 + 0x + 0 = {} &(3\alpha_1 - 3\alpha_2 + 4\alpha_3 + 2\alpha_4)x^4 + (-2\alpha_1 + \alpha_2 + 5\alpha_3 - 7\alpha_4)x^3 \\
&+ (4\alpha_1 - 2\alpha_3 + 4\alpha_4)x^2 + (6\alpha_1 + 4\alpha_2 + 3\alpha_3 + 2\alpha_4)x + (-\alpha_1 + 2\alpha_2 + \alpha_3 + \alpha_4)
\end{align*}<br />
Equating coefficients, we arrive at the homogeneous system of equations,<br />
\begin{align*}
3\alpha_1 - 3\alpha_2 + 4\alpha_3 + 2\alpha_4 &= 0 \\
-2\alpha_1 + \alpha_2 + 5\alpha_3 - 7\alpha_4 &= 0 \\
4\alpha_1 - 2\alpha_3 + 4\alpha_4 &= 0 \\
6\alpha_1 + 4\alpha_2 + 3\alpha_3 + 2\alpha_4 &= 0 \\
-\alpha_1 + 2\alpha_2 + \alpha_3 + \alpha_4 &= 0
\end{align*}<br />
We form the coefficient matrix of this homogeneous system of equations and<br />
row-reduce to find<br />
\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]<br />
We expected the system to be consistent (Theorem HSC) and so can compute<br />
n − r = 4 − 4 = 0 and Theorem CSRN tells us that the solution is unique. Since<br />
this is a homogeneous system, this unique solution is the trivial solution (Definition<br />
TSHSE), α_1 = 0, α_2 = 0, α_3 = 0, α_4 = 0. So by Definition LI the set T is linearly<br />
independent.<br />
A few observations. If we had discovered infinitely many solutions, then we could<br />
have used one of the nontrivial solutions to provide a linear combination in the<br />
manner we used to show that S was linearly dependent. It is important to realize<br />
that it is not interesting that we can create a relation of linear dependence with zero<br />
scalars — we can always do that — but for T, this is the only way to create a relation<br />
of linear dependence. It was no accident that we arrived at a homogeneous system<br />
of equations in this example; it is related to our use of the zero vector in defining a<br />
relation of linear dependence. It is easy to present a convincing statement that a set<br />
is linearly dependent (just exhibit a nontrivial relation of linear dependence) but a<br />
convincing statement of linear independence requires demonstrating that there is<br />
no relation of linear dependence other than the trivial one. Notice how we relied<br />
on theorems from Chapter SLE to provide this demonstration. Whew! There is a<br />
lot going on in this example. Spend some time with it; we will be waiting patiently<br />
right here when you get back.<br />
△<br />
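The relation of linear dependence exhibited for S can be verified mechanically by trading polynomials for their coefficient vectors, ordered (x^4, x^3, x^2, x, 1). A sketch (Python, not part of the text):<br />

```python
# the three polynomials of S as coefficient tuples (x^4, x^3, x^2, x, 1)
S = [(2, 3, 2, -1, 10), (-1, -2, 1, 5, -8), (2, 1, 10, 17, -2)]

def comb(coeffs, polys):
    # coefficientwise linear combination
    return tuple(sum(c * p[k] for c, p in zip(coeffs, polys)) for k in range(5))

# the nontrivial relation 3*p1 + 4*p2 + (-1)*p3 = 0 from the example
assert comb([3, 4, -1], S) == (0, 0, 0, 0, 0)
```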
Example LIM32 Linear independence in M_32<br />
Consider the two sets of vectors R and S from the vector space of all 3 × 2 matrices,<br />
M_32 (Example VSM)<br />
\[
R = \left\{
\begin{bmatrix} 3 & -1 \\ 1 & 4 \\ 6 & -6 \end{bmatrix},\
\begin{bmatrix} -2 & 3 \\ 1 & -3 \\ -2 & -6 \end{bmatrix},\
\begin{bmatrix} 6 & -6 \\ -1 & 0 \\ 7 & -9 \end{bmatrix},\
\begin{bmatrix} 7 & 9 \\ -4 & -5 \\ 2 & 5 \end{bmatrix}
\right\}
\]<br />
\[
S = \left\{
\begin{bmatrix} 2 & 0 \\ 1 & -1 \\ 1 & 3 \end{bmatrix},\
\begin{bmatrix} -4 & 0 \\ -2 & 2 \\ -2 & -6 \end{bmatrix},\
\begin{bmatrix} 1 & 1 \\ -2 & 1 \\ 2 & 4 \end{bmatrix},\
\begin{bmatrix} -5 & 3 \\ -10 & 7 \\ 2 & 0 \end{bmatrix}
\right\}
\]<br />
One set is linearly independent, the other is not. Which is which? Let us examine<br />
R first. Build a generic relation of linear dependence (Definition RLD),<br />
\[
\alpha_1 \begin{bmatrix} 3 & -1 \\ 1 & 4 \\ 6 & -6 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} -2 & 3 \\ 1 & -3 \\ -2 & -6 \end{bmatrix}
+ \alpha_3 \begin{bmatrix} 6 & -6 \\ -1 & 0 \\ 7 & -9 \end{bmatrix}
+ \alpha_4 \begin{bmatrix} 7 & 9 \\ -4 & -5 \\ 2 & 5 \end{bmatrix} = 0
\]<br />
Massaging the left-hand side with our definitions of vector addition and scalar<br />
multiplication in M_32 (Example VSM) we obtain,<br />
\[
\begin{bmatrix}
3\alpha_1 - 2\alpha_2 + 6\alpha_3 + 7\alpha_4 & -\alpha_1 + 3\alpha_2 - 6\alpha_3 + 9\alpha_4 \\
\alpha_1 + \alpha_2 - \alpha_3 - 4\alpha_4 & 4\alpha_1 - 3\alpha_2 - 5\alpha_4 \\
6\alpha_1 - 2\alpha_2 + 7\alpha_3 + 2\alpha_4 & -6\alpha_1 - 6\alpha_2 - 9\alpha_3 + 5\alpha_4
\end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\]<br />
Using our definition of matrix equality (Definition ME) to equate entries we get<br />
the homogeneous system of six equations in four variables,<br />
\begin{align*}
3\alpha_1 - 2\alpha_2 + 6\alpha_3 + 7\alpha_4 &= 0 \\
-\alpha_1 + 3\alpha_2 - 6\alpha_3 + 9\alpha_4 &= 0 \\
\alpha_1 + \alpha_2 - \alpha_3 - 4\alpha_4 &= 0 \\
4\alpha_1 - 3\alpha_2 - 5\alpha_4 &= 0 \\
6\alpha_1 - 2\alpha_2 + 7\alpha_3 + 2\alpha_4 &= 0 \\
-6\alpha_1 - 6\alpha_2 - 9\alpha_3 + 5\alpha_4 &= 0
\end{align*}<br />
Form the coefficient matrix of this homogeneous system and row-reduce to obtain<br />
\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]<br />
Analyzing this matrix we are led to conclude that α_1 = 0, α_2 = 0, α_3 = 0, α_4 = 0.<br />
This means there is only a trivial relation of linear dependence on the vectors of R<br />
and so we call R a linearly independent set (Definition LI).<br />
So it must be that S is linearly dependent. Let us see if we can find a nontrivial<br />
relation of linear dependence on S. We will begin as with R, by constructing a<br />
relation of linear dependence (Definition RLD) with unknown scalars,<br />
\[
\alpha_1 \begin{bmatrix} 2 & 0 \\ 1 & -1 \\ 1 & 3 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} -4 & 0 \\ -2 & 2 \\ -2 & -6 \end{bmatrix}
+ \alpha_3 \begin{bmatrix} 1 & 1 \\ -2 & 1 \\ 2 & 4 \end{bmatrix}
+ \alpha_4 \begin{bmatrix} -5 & 3 \\ -10 & 7 \\ 2 & 0 \end{bmatrix} = 0
\]<br />
Massaging the left-hand side with our definitions of vector addition and scalar<br />
multiplication in M_32 (Example VSM) we obtain,<br />
\[
\begin{bmatrix}
2\alpha_1 - 4\alpha_2 + \alpha_3 - 5\alpha_4 & \alpha_3 + 3\alpha_4 \\
\alpha_1 - 2\alpha_2 - 2\alpha_3 - 10\alpha_4 & -\alpha_1 + 2\alpha_2 + \alpha_3 + 7\alpha_4 \\
\alpha_1 - 2\alpha_2 + 2\alpha_3 + 2\alpha_4 & 3\alpha_1 - 6\alpha_2 + 4\alpha_3
\end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\]<br />
Using our definition of matrix equality (Definition ME) to equate entries we get<br />
the homogeneous system of six equations in four variables,<br />
\begin{align*}
2\alpha_1 - 4\alpha_2 + \alpha_3 - 5\alpha_4 &= 0 \\
\alpha_3 + 3\alpha_4 &= 0 \\
\alpha_1 - 2\alpha_2 - 2\alpha_3 - 10\alpha_4 &= 0 \\
-\alpha_1 + 2\alpha_2 + \alpha_3 + 7\alpha_4 &= 0 \\
\alpha_1 - 2\alpha_2 + 2\alpha_3 + 2\alpha_4 &= 0 \\
3\alpha_1 - 6\alpha_2 + 4\alpha_3 &= 0
\end{align*}<br />
Form the coefficient matrix of this homogeneous system and row-reduce to obtain<br />
\[
\begin{bmatrix}
1 & -2 & 0 & -4 \\
0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]<br />
Analyzing this we see that the system is consistent (we expected this since the<br />
system is homogeneous, Theorem HSC) and has n − r = 4 − 2 = 2 free variables,<br />
namely α_2 and α_4. This means there are infinitely many solutions, and in particular,<br />
we can find a nontrivial solution, so long as we do not pick all of our free variables<br />
to be zero. The mere presence of a nontrivial solution for these scalars is enough to<br />
conclude that S is a linearly dependent set (Definition LI). But let us go ahead and<br />
explicitly construct a nontrivial relation of linear dependence.<br />
Choose α_2 = 1 and α_4 = −1. There is nothing special about this choice; there<br />
are infinitely many possibilities, some “easier” than this one, just avoid picking both<br />
variables to be zero. (Why not?) Then we find the dependent variables to have values<br />
α_1 = −2 and α_3 = 3. So the relation of linear dependence,<br />
\[
(-2) \begin{bmatrix} 2 & 0 \\ 1 & -1 \\ 1 & 3 \end{bmatrix}
+ (1) \begin{bmatrix} -4 & 0 \\ -2 & 2 \\ -2 & -6 \end{bmatrix}
+ (3) \begin{bmatrix} 1 & 1 \\ -2 & 1 \\ 2 & 4 \end{bmatrix}
+ (-1) \begin{bmatrix} -5 & 3 \\ -10 & 7 \\ 2 & 0 \end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\]<br />
is an iron-clad demonstration that S is linearly dependent. Can you construct another<br />
such demonstration?<br />
△<br />
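The relation just found for S is easy to confirm by entrywise arithmetic (a Python sketch, not part of the text):<br />

```python
# the four matrices of S, stored row by row
S = [
    ((2, 0), (1, -1), (1, 3)),
    ((-4, 0), (-2, 2), (-2, -6)),
    ((1, 1), (-2, 1), (2, 4)),
    ((-5, 3), (-10, 7), (2, 0)),
]

def comb(coeffs, mats):
    # entrywise linear combination of 3x2 matrices
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2))
        for i in range(3)
    )

zero = ((0, 0), (0, 0), (0, 0))
# the nontrivial relation with scalars (-2, 1, 3, -1) really gives the zero matrix
assert comb([-2, 1, 3, -1], S) == zero
```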
Example LIC Linearly independent set in the crazy vector space<br />
Is the set R = {(1, 0), (6, 3)} linearly independent in the crazy vector space C<br />
(Example CVS)?<br />
We begin with an arbitrary relation of linear dependence on R<br />
\[ 0 = a_1(1, 0) + a_2(6, 3) \qquad \text{Definition RLD} \]<br />
and then massage it to a point where we can apply the definition of equality in C.<br />
Recall the definitions of vector addition and scalar multiplication in C are not what<br />
you would expect.<br />
\begin{align*}
(-1, -1) &= 0 && \text{Example CVS} \\
&= a_1(1, 0) + a_2(6, 3) && \text{Definition RLD} \\
&= (1a_1 + a_1 - 1,\ 0a_1 + a_1 - 1) + (6a_2 + a_2 - 1,\ 3a_2 + a_2 - 1) && \text{Example CVS} \\
&= (2a_1 - 1,\ a_1 - 1) + (7a_2 - 1,\ 4a_2 - 1) \\
&= (2a_1 - 1 + 7a_2 - 1 + 1,\ a_1 - 1 + 4a_2 - 1 + 1) && \text{Example CVS} \\
&= (2a_1 + 7a_2 - 1,\ a_1 + 4a_2 - 1)
\end{align*}<br />
Equality <strong>in</strong> C (Example CVS) then yields the two equations,<br />
2a 1 +7a 2 − 1=−1<br />
a 1 +4a 2 − 1=−1<br />
which becomes the homogeneous system<br />
2a 1 +7a 2 =0<br />
a 1 +4a 2 =0<br />
S<strong>in</strong>ce the coefficient matrix of this system is nons<strong>in</strong>gular (check this!) the system<br />
has only the trivial solution a 1 = a 2 = 0. By Def<strong>in</strong>ition LI the set R is l<strong>in</strong>early<br />
<strong>in</strong>dependent. Notice that even though the zero vector of C is not what we might<br />
have first suspected, a question about l<strong>in</strong>ear <strong>in</strong>dependence still concludes with a<br />
question about a homogeneous system of equations. Hmmm.<br />
△<br />
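The crazy operations are simple enough to code up. This sketch (ours, not the book's) encodes the operations from Example CVS and spot-checks the key facts used above: the coefficient matrix is nonsingular, and the trivial combination really produces the crazy zero vector $(-1, -1)$.

```python
# A sketch of the crazy vector space C from Example CVS: vectors are pairs,
# with addition (x1 + x2 + 1, y1 + y2 + 1) and scalar multiplication
# (a*x + a - 1, a*y + a - 1).  The zero vector is (-1, -1).
def cvs_add(u, v):
    return (u[0] + v[0] + 1, u[1] + v[1] + 1)

def cvs_smult(a, u):
    return (a * u[0] + a - 1, a * u[1] + a - 1)

zero = (-1, -1)

# The relation of linear dependence reduces to the homogeneous system
# 2a1 + 7a2 = 0, a1 + 4a2 = 0; its coefficient matrix [[2,7],[1,4]] has
# determinant 1, so it is nonsingular and only a1 = a2 = 0 works.
det = 2 * 4 - 7 * 1
print(det)        # 1, so R is linearly independent

# Spot-check: the trivial combination gives the crazy zero vector.
assert cvs_add(cvs_smult(0, (1, 0)), cvs_smult(0, (6, 3))) == zero
```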
Subsection SS
Spanning Sets

In a vector space $V$, suppose we are given a set of vectors $S \subseteq V$. Then we can
immediately construct a subspace, $\langle S\rangle$, using Definition SS and then be assured
by Theorem SSS that the construction does provide a subspace. We now turn the
situation upside-down. Suppose we are first given a subspace $W \subseteq V$. Can we find a
set $S$ so that $\langle S\rangle = W$? Typically $W$ is infinite and we are searching for a finite set
of vectors $S$ that we can combine in linear combinations and "build" all of $W$.

I like to think of $S$ as the raw materials that are sufficient for the construction of
$W$. If you have nails, lumber, wire, copper pipe, drywall, plywood, carpet, shingles,
paint (and a few other things), then you can combine them in many different ways
to create a house (or infinitely many different houses for that matter). A fast-food
restaurant may have beef, chicken, beans, cheese, tortillas, taco shells and hot sauce
and from this small list of ingredients build a wide variety of items for sale. Or
maybe a better analogy comes from Ben Cordes — the additive primary colors (red,
green and blue) can be combined to create many different colors by varying the
intensity of each. The intensity is like a scalar multiple, and the combination of the
three intensities is like vector addition. The three individual colors, red, green and
blue, are the elements of the spanning set.

Because we will use terms like "spanned by" and "spanning set," there is the
potential for confusion with "the span." Come back and reread the first paragraph of
this subsection whenever you are uncertain about the difference. Here is the working
definition.

Definition SSVS Spanning Set of a Vector Space
Suppose $V$ is a vector space. A subset $S$ of $V$ is a spanning set of $V$ if $\langle S\rangle = V$.
In this case, we also frequently say $S$ spans $V$.
□

The definition of a spanning set requires that two sets (subspaces, actually) be
equal. If $S$ is a subset of $V$, then $\langle S\rangle \subseteq V$, always. Thus it is usually only necessary
to prove that $V \subseteq \langle S\rangle$. Now would be a good time to review Definition SE.
Example SSP4 Spanning set in $P_4$
In Example SP4 we showed that
\[
W = \{p(x) \mid p \in P_4,\ p(2) = 0\}
\]
is a subspace of $P_4$, the vector space of polynomials with degree at most 4 (Example
VSP). In this example, we will show that the set
\[
S = \left\{ x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16 \right\}
\]
is a spanning set for $W$. To do this, we require that $W = \langle S\rangle$. This is an equality of
sets. We can check that every polynomial in $S$ has $x = 2$ as a root and therefore
$S \subseteq W$. Since $W$ is closed under addition and scalar multiplication, $\langle S\rangle \subseteq W$ also.

So it remains to show that $W \subseteq \langle S\rangle$ (Definition SE). To do this, begin by
choosing an arbitrary polynomial in $W$, say $r(x) = ax^4 + bx^3 + cx^2 + dx + e \in W$.
This polynomial is not as arbitrary as it would appear, since we also know it must
have $x = 2$ as a root. This translates to
\[
0 = a(2)^4 + b(2)^3 + c(2)^2 + d(2) + e = 16a + 8b + 4c + 2d + e
\]
as a condition on $r$.
We wish to show that $r$ is a polynomial in $\langle S\rangle$, that is, we want to show that $r$
can be written as a linear combination of the vectors (polynomials) in $S$. So let us
try.
\begin{align*}
r(x) &= ax^4 + bx^3 + cx^2 + dx + e\\
&= \alpha_1(x - 2) + \alpha_2\left(x^2 - 4x + 4\right) + \alpha_3\left(x^3 - 6x^2 + 12x - 8\right)\\
&\quad + \alpha_4\left(x^4 - 8x^3 + 24x^2 - 32x + 16\right)\\
&= \alpha_4 x^4 + (\alpha_3 - 8\alpha_4)\,x^3 + (\alpha_2 - 6\alpha_3 + 24\alpha_4)\,x^2\\
&\quad + (\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4)\,x + (-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4)
\end{align*}
Equating coefficients (vector equality in $P_4$) gives the system of five equations in
four variables,
\begin{align*}
\alpha_4 &= a\\
\alpha_3 - 8\alpha_4 &= b\\
\alpha_2 - 6\alpha_3 + 24\alpha_4 &= c\\
\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4 &= d\\
-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4 &= e
\end{align*}
Any solution to this system of equations will provide the linear combination we
need to determine if $r \in \langle S\rangle$, but we need to be convinced there is a solution for any
values of $a$, $b$, $c$, $d$, $e$ that qualify $r$ to be a member of $W$. So the question is: is this
system of equations consistent? We will form the augmented matrix, and row-reduce.
(We probably need to do this by hand, since the matrix is symbolic — reversing the
order of the first four rows is the best way to start.) We obtain a matrix in reduced
row-echelon form
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 32a + 12b + 4c + d\\
0 & 1 & 0 & 0 & 24a + 6b + c\\
0 & 0 & 1 & 0 & 8a + b\\
0 & 0 & 0 & 1 & a\\
0 & 0 & 0 & 0 & 16a + 8b + 4c + 2d + e
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 32a + 12b + 4c + d\\
0 & 1 & 0 & 0 & 24a + 6b + c\\
0 & 0 & 1 & 0 & 8a + b\\
0 & 0 & 0 & 1 & a\\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
For your results to match our first matrix, you may find it necessary to multiply
the final row of your row-reduced matrix by the appropriate scalar, and/or add
multiples of this row to some of the other rows. To obtain the second version of the
matrix, the last entry of the last column has been simplified to zero according to the
one condition we were able to impose on an arbitrary polynomial from $W$. Since the
last column is not a pivot column, Theorem RCLS tells us this system is consistent.
Therefore, any polynomial from $W$ can be written as a linear combination of the
polynomials in $S$, so $W \subseteq \langle S\rangle$. Therefore, $W = \langle S\rangle$ and $S$ is a spanning set for $W$
by Definition SSVS.

Notice that an alternative to row-reducing the augmented matrix by hand would
be to appeal to Theorem FS by expressing the column space of the coefficient matrix
as a null space, and then verifying that the condition on $r$ guarantees that $r$ is in
the column space, thus implying that the system is always consistent. Give it a try;
we will wait. This has been a complicated example, but worth studying carefully. △
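The scalars read off the row-reduced matrix can be verified without any symbolic row reduction. This brute-force check (ours, not the book's) samples integer values of $a, b, c, d$, sets $e$ by the membership condition, and compares $r$ with the claimed linear combination at several points:

```python
# Check (a sketch, not from the text) that the solution read off the RREF in
# Example SSP4 rebuilds an arbitrary r(x) in W.  Two degree-4 polynomials
# that agree at 7 distinct points are identical.
def r(x, a, b, c, d, e):
    return a*x**4 + b*x**3 + c*x**2 + d*x + e

def combo(x, a, b, c, d):
    a1 = 32*a + 12*b + 4*c + d      # coefficient of (x - 2)
    a2 = 24*a + 6*b + c             # coefficient of (x - 2)^2
    a3 = 8*a + b                    # coefficient of (x - 2)^3
    a4 = a                          # coefficient of (x - 2)^4
    return a1*(x - 2) + a2*(x - 2)**2 + a3*(x - 2)**3 + a4*(x - 2)**4

ok = True
for (a, b, c, d) in [(1, 0, 0, 0), (0, 1, 0, 0), (2, -1, 3, 5), (1, 1, 1, 1)]:
    e = -(16*a + 8*b + 4*c + 2*d)   # the condition that r(2) = 0, i.e. r in W
    for x in range(-3, 4):
        ok = ok and r(x, a, b, c, d, e) == combo(x, a, b, c, d)
print(ok)   # True
```

Note that the four polynomials in $S$ are exactly $(x-2)$, $(x-2)^2$, $(x-2)^3$, $(x-2)^4$, which is why the combination above is written in powers of $(x - 2)$.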
Given a subspace and a set of vectors, as in Example SSP4, it can take some
work to determine that the set actually is a spanning set. An even harder problem is
to be confronted with a subspace and required to construct a spanning set with no
guidance. We will now work an example of this flavor, but some of the steps will be
unmotivated. Fortunately, we will have some better tools for this type of problem
later on.
Example SSM22 Spanning set in $M_{22}$
In the space of all $2\times 2$ matrices, $M_{22}$, consider the subspace
\[
Z = \left\{\begin{bmatrix} a & b\\ c & d \end{bmatrix} \,\middle|\, a + 3b - c - 5d = 0,\ -2a - 6b + 3c + 14d = 0\right\}
\]
and find a spanning set for $Z$.

We need to construct a limited number of matrices in $Z$ so that every matrix
in $Z$ can be expressed as a linear combination of this limited number of matrices.
Suppose that $B = \begin{bmatrix} a & b\\ c & d \end{bmatrix}$ is a matrix in $Z$. Then we can form a column vector with
the entries of $B$ and write
\[
\begin{bmatrix} a\\ b\\ c\\ d \end{bmatrix} \in \mathcal{N}\!\left(\begin{bmatrix} 1 & 3 & -1 & -5\\ -2 & -6 & 3 & 14 \end{bmatrix}\right)
\]
Row-reducing this matrix and applying Theorem REMES we obtain the equivalent
statement,
\[
\begin{bmatrix} a\\ b\\ c\\ d \end{bmatrix} \in \mathcal{N}\!\left(\begin{bmatrix} 1 & 3 & 0 & -1\\ 0 & 0 & 1 & 4 \end{bmatrix}\right)
\]
We can then express the subspace $Z$ in the following equal forms,
\begin{align*}
Z &= \left\{\begin{bmatrix} a & b\\ c & d \end{bmatrix} \,\middle|\, a + 3b - c - 5d = 0,\ -2a - 6b + 3c + 14d = 0\right\}\\
&= \left\{\begin{bmatrix} a & b\\ c & d \end{bmatrix} \,\middle|\, a + 3b - d = 0,\ c + 4d = 0\right\}\\
&= \left\{\begin{bmatrix} a & b\\ c & d \end{bmatrix} \,\middle|\, a = -3b + d,\ c = -4d\right\}\\
&= \left\{\begin{bmatrix} -3b + d & b\\ -4d & d \end{bmatrix} \,\middle|\, b, d \in \mathbb{C}\right\}\\
&= \left\{\begin{bmatrix} -3b & b\\ 0 & 0 \end{bmatrix} + \begin{bmatrix} d & 0\\ -4d & d \end{bmatrix} \,\middle|\, b, d \in \mathbb{C}\right\}\\
&= \left\{b\begin{bmatrix} -3 & 1\\ 0 & 0 \end{bmatrix} + d\begin{bmatrix} 1 & 0\\ -4 & 1 \end{bmatrix} \,\middle|\, b, d \in \mathbb{C}\right\}\\
&= \left\langle\left\{\begin{bmatrix} -3 & 1\\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0\\ -4 & 1 \end{bmatrix}\right\}\right\rangle
\end{align*}
So the set
\[
Q = \left\{\begin{bmatrix} -3 & 1\\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0\\ -4 & 1 \end{bmatrix}\right\}
\]
spans $Z$ by Definition SSVS.
△
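A direct check of the two claims in Example SSM22 (this sketch is ours, not the book's): the matrices of $Q$ satisfy both defining conditions of $Z$, and every choice of $b$ and $d$ produces entries $(a, b, c, d) = (-3b + d,\, b,\, -4d,\, d)$ satisfying them as well.

```python
# Verify (informally) that Q is contained in Z, and that the parametrized
# form of Z's entries satisfies both defining equations for many (b, d).
def in_Z(a, b, c, d):
    return (a + 3*b - c - 5*d == 0) and (-2*a - 6*b + 3*c + 14*d == 0)

Q1 = (-3, 1, 0, 0)      # entries (a, b, c, d) of [[-3, 1], [0, 0]]
Q2 = (1, 0, -4, 1)      # entries (a, b, c, d) of [[1, 0], [-4, 1]]
assert in_Z(*Q1) and in_Z(*Q2)

for b in range(-3, 4):
    for d in range(-3, 4):
        assert in_Z(-3*b + d, b, -4*d, d)
print("Q lies in Z, and b*Q1 + d*Q2 lies in Z for every sampled b, d")
```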
Example SSC Spanning set in the crazy vector space
In Example LIC we determined that the set $R = \{(1, 0), (6, 3)\}$ is linearly independent
in the crazy vector space $C$ (Example CVS). We now show that $R$ is a spanning
set for $C$.

Given an arbitrary vector $(x, y) \in C$ we desire to show that it can be written as
a linear combination of the elements of $R$. In other words, are there scalars $a_1$ and
$a_2$ so that
\[
(x, y) = a_1(1, 0) + a_2(6, 3)
\]
We will act as if this equation is true and try to determine just what $a_1$ and
$a_2$ would be (as functions of $x$ and $y$). Recall that our vector space operations are
unconventional and are defined in Example CVS.
\begin{align*}
(x, y) &= a_1(1, 0) + a_2(6, 3)\\
&= (1a_1 + a_1 - 1,\ 0a_1 + a_1 - 1) + (6a_2 + a_2 - 1,\ 3a_2 + a_2 - 1)\\
&= (2a_1 - 1,\ a_1 - 1) + (7a_2 - 1,\ 4a_2 - 1)\\
&= (2a_1 - 1 + 7a_2 - 1 + 1,\ a_1 - 1 + 4a_2 - 1 + 1)\\
&= (2a_1 + 7a_2 - 1,\ a_1 + 4a_2 - 1)
\end{align*}
Equality in $C$ then yields the two equations,
\begin{align*}
2a_1 + 7a_2 - 1 &= x\\
a_1 + 4a_2 - 1 &= y
\end{align*}
which become the linear system with a matrix representation
\[
\begin{bmatrix} 2 & 7\\ 1 & 4 \end{bmatrix}\begin{bmatrix} a_1\\ a_2 \end{bmatrix} = \begin{bmatrix} x + 1\\ y + 1 \end{bmatrix}
\]
The coefficient matrix of this system is nonsingular, hence invertible (Theorem
NI), and we can employ its inverse to find a solution (Theorem TTMI, Theorem
SNCM),
\[
\begin{bmatrix} a_1\\ a_2 \end{bmatrix}
= \begin{bmatrix} 2 & 7\\ 1 & 4 \end{bmatrix}^{-1}\begin{bmatrix} x + 1\\ y + 1 \end{bmatrix}
= \begin{bmatrix} 4 & -7\\ -1 & 2 \end{bmatrix}\begin{bmatrix} x + 1\\ y + 1 \end{bmatrix}
= \begin{bmatrix} 4x - 7y - 3\\ -x + 2y + 1 \end{bmatrix}
\]
We could chase through the above implications backwards and take the existence
of these solutions as sufficient evidence for $R$ being a spanning set for $C$. Instead, let
us view the above as simply scratchwork and now get serious with a simple direct
proof that $R$ is a spanning set. Ready? Suppose $(x, y)$ is any vector from $C$, then
compute the following linear combination using the definitions of the operations in
$C$,
\begin{align*}
&(4x - 7y - 3)(1, 0) + (-x + 2y + 1)(6, 3)\\
&\quad= (1(4x - 7y - 3) + (4x - 7y - 3) - 1,\ 0(4x - 7y - 3) + (4x - 7y - 3) - 1)\\
&\qquad + (6(-x + 2y + 1) + (-x + 2y + 1) - 1,\ 3(-x + 2y + 1) + (-x + 2y + 1) - 1)\\
&\quad= (8x - 14y - 7,\ 4x - 7y - 4) + (-7x + 14y + 6,\ -4x + 8y + 3)\\
&\quad= ((8x - 14y - 7) + (-7x + 14y + 6) + 1,\ (4x - 7y - 4) + (-4x + 8y + 3) + 1)\\
&\quad= (x, y)
\end{align*}
This final sequence of computations in $C$ is sufficient to demonstrate that any
element of $C$ can be written (or expressed) as a linear combination of the two vectors
in $R$, so $C \subseteq \langle R\rangle$. Since the reverse inclusion $\langle R\rangle \subseteq C$ is trivially true, $C = \langle R\rangle$ and
we say $R$ spans $C$ (Definition SSVS). Notice that this demonstration is no more or
less valid if we hide from the reader our scratchwork that suggested $a_1 = 4x - 7y - 3$
and $a_2 = -x + 2y + 1$.
△
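The closing computation in Example SSC can be confirmed by brute force. This is our sketch (not from the text), reusing the crazy operations of Example CVS over a grid of sample vectors:

```python
# Confirm that the scalars a1 = 4x - 7y - 3 and a2 = -x + 2y + 1 combine
# (1,0) and (6,3), under the crazy operations, into any sampled (x, y).
def cvs_add(u, v):
    return (u[0] + v[0] + 1, u[1] + v[1] + 1)

def cvs_smult(a, u):
    return (a * u[0] + a - 1, a * u[1] + a - 1)

for x in range(-5, 6):
    for y in range(-5, 6):
        a1 = 4*x - 7*y - 3
        a2 = -x + 2*y + 1
        combo = cvs_add(cvs_smult(a1, (1, 0)), cvs_smult(a2, (6, 3)))
        assert combo == (x, y)
print("every sampled (x, y) is a combination of (1, 0) and (6, 3)")
```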
Subsection VR
Vector Representation

In Chapter R we will take up the matter of representations fully, where Theorem
VRRB will be critical for Definition VR. We will now motivate and prove a critical
theorem that tells us how to "represent" a vector. This theorem could wait, but
working with it now will provide some extra insight into the nature of linearly
independent spanning sets. First an example, then the theorem.
Example AVR A vector representation
Consider the set
\[
S = \left\{\begin{bmatrix} -7\\ 5\\ 1 \end{bmatrix},\ \begin{bmatrix} -6\\ 5\\ 0 \end{bmatrix},\ \begin{bmatrix} -12\\ 7\\ 4 \end{bmatrix}\right\}
\]
from the vector space $\mathbb{C}^3$. Let $A$ be the matrix whose columns are the set $S$, and
verify that $A$ is nonsingular. By Theorem NMLIC the elements of $S$ form a linearly
independent set. Suppose that $\mathbf{b} \in \mathbb{C}^3$. Then $\mathcal{LS}(A,\, \mathbf{b})$ has a (unique) solution
(Theorem NMUS) and hence is consistent. By Theorem SLSLC, $\mathbf{b} \in \langle S\rangle$. Since $\mathbf{b}$
is arbitrary, this is enough to show that $\langle S\rangle = \mathbb{C}^3$, and therefore $S$ is a spanning
set for $\mathbb{C}^3$ (Definition SSVS). (This set comes from the columns of the coefficient
matrix of Archetype B.)

Now examine the situation for a particular choice of $\mathbf{b}$, say $\mathbf{b} = \begin{bmatrix} -33\\ 24\\ 5 \end{bmatrix}$. Because
$S$ is a spanning set for $\mathbb{C}^3$, we know we can write $\mathbf{b}$ as a linear combination of the
vectors in $S$,
\[
\begin{bmatrix} -33\\ 24\\ 5 \end{bmatrix}
= (-3)\begin{bmatrix} -7\\ 5\\ 1 \end{bmatrix}
+ (5)\begin{bmatrix} -6\\ 5\\ 0 \end{bmatrix}
+ (2)\begin{bmatrix} -12\\ 7\\ 4 \end{bmatrix}.
\]
The nonsingularity of the matrix $A$ tells us that the scalars in this linear combination
are unique. More precisely, it is the linear independence of $S$ that provides the
uniqueness. We will refer to the scalars $a_1 = -3$, $a_2 = 5$, $a_3 = 2$ as a "representation
of $\mathbf{b}$ relative to $S$." In other words, once we settle on $S$ as a linearly independent
set that spans $\mathbb{C}^3$, the vector $\mathbf{b}$ is recoverable just by knowing the scalars $a_1 = -3$,
$a_2 = 5$, $a_3 = 2$ (use these scalars in a linear combination of the vectors in $S$). This is
all an illustration of the following important theorem, which we prove in the setting
of a general vector space.
△
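Numerically, recovering a representation relative to $S$ amounts to solving $\mathcal{LS}(A, \mathbf{b})$. A quick companion sketch (ours, not the book's) using NumPy:

```python
# Solve LS(A, b) where the columns of A are the vectors of S from Example
# AVR; the solution is the representation of b relative to S.
import numpy as np

A = np.array([[-7, -6, -12],
              [ 5,  5,   7],
              [ 1,  0,   4]])
b = np.array([-33, 24, 5])

scalars = np.linalg.solve(A, b)
print(scalars)                        # approximately [-3.  5.  2.]
assert np.allclose(scalars, [-3, 5, 2])
assert np.allclose(A @ scalars, b)    # the combination really rebuilds b
```

Because $A$ is nonsingular, `np.linalg.solve` succeeds and the scalars are unique, mirroring the appeal to Theorem NMUS in the example.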
Theorem VRRB Vector Representation Relative to a Basis
Suppose that $V$ is a vector space and $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_m\}$ is a linearly independent
set that spans $V$. Let $\mathbf{w}$ be any vector in $V$. Then there exist unique scalars
$a_1, a_2, a_3, \ldots, a_m$ such that
\[
\mathbf{w} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + \cdots + a_m\mathbf{v}_m.
\]

Proof. That $\mathbf{w}$ can be written as a linear combination of the vectors in $B$ follows
from the spanning property of the set (Definition SSVS). This is good, but not the
meat of this theorem. We now know that for any choice of the vector $\mathbf{w}$ there exist
some scalars that will create $\mathbf{w}$ as a linear combination of the basis vectors. The
real question is: Is there more than one way to write $\mathbf{w}$ as a linear combination of
$\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_m\}$? Are the scalars $a_1, a_2, a_3, \ldots, a_m$ unique? (Proof Technique
U)

Assume there are two different linear combinations of $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_m\}$
that equal the vector $\mathbf{w}$. In other words there exist scalars $a_1, a_2, a_3, \ldots, a_m$ and
$b_1, b_2, b_3, \ldots, b_m$ so that
\begin{align*}
\mathbf{w} &= a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + \cdots + a_m\mathbf{v}_m\\
\mathbf{w} &= b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + b_3\mathbf{v}_3 + \cdots + b_m\mathbf{v}_m.
\end{align*}
Then notice that
\begin{align*}
\mathbf{0} &= \mathbf{w} + (-\mathbf{w}) & &\text{Property AI}\\
&= \mathbf{w} + (-1)\mathbf{w} & &\text{Theorem AISM}\\
&= (a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + \cdots + a_m\mathbf{v}_m)\\
&\quad + (-1)(b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + b_3\mathbf{v}_3 + \cdots + b_m\mathbf{v}_m)\\
&= (a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + \cdots + a_m\mathbf{v}_m)\\
&\quad + (-b_1\mathbf{v}_1 - b_2\mathbf{v}_2 - b_3\mathbf{v}_3 - \cdots - b_m\mathbf{v}_m) & &\text{Property DVA}\\
&= (a_1 - b_1)\mathbf{v}_1 + (a_2 - b_2)\mathbf{v}_2 + (a_3 - b_3)\mathbf{v}_3\\
&\quad + \cdots + (a_m - b_m)\mathbf{v}_m & &\text{Property C, Property DSA}
\end{align*}
But this is a relation of linear dependence on a linearly independent set of
vectors (Definition RLD)! Now we are using the other assumption about $B$, that
$\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_m\}$ is a linearly independent set. So by Definition LI it must
happen that the scalars are all zero. That is,
\begin{align*}
a_1 - b_1 = 0 \qquad a_2 - b_2 = 0 \qquad a_3 - b_3 = 0 \qquad \ldots \qquad a_m - b_m = 0\\
a_1 = b_1 \qquad a_2 = b_2 \qquad a_3 = b_3 \qquad \ldots \qquad a_m = b_m.
\end{align*}
And so we find that the scalars are unique. ∎
The converse of Theorem VRRB is true as well, but is not important enough to
rise beyond an exercise (see Exercise LISS.T51).

This is a very typical use of the hypothesis that a set is linearly independent —
obtain a relation of linear dependence and then conclude that the scalars must all
be zero. The result of this theorem tells us that we can write any vector in a vector
space as a linear combination of the vectors in a linearly independent spanning set,
but only just. There is only enough raw material in the spanning set to write each
vector one way as a linear combination. So in this sense, we could call a linearly
independent spanning set a "minimal spanning set." These sets are so important
that we will give them a simpler name ("basis") and explore their properties further
in the next section.
Reading Questions

1. Is the set of matrices below linearly independent or linearly dependent in the vector
space $M_{22}$? Why or why not?
\[
\left\{\begin{bmatrix} 1 & 3\\ -2 & 4 \end{bmatrix},\ \begin{bmatrix} -2 & 3\\ 3 & -5 \end{bmatrix},\ \begin{bmatrix} 0 & 9\\ -1 & 3 \end{bmatrix}\right\}
\]
2. Explain the difference between the following two uses of the term "span":

   1. $S$ is a subset of the vector space $V$ and the span of $S$ is a subspace of $V$.

   2. $W$ is a subspace of the vector space $Y$ and $T$ spans $W$.

3. The set
\[
S = \left\{\begin{bmatrix} 6\\ 2\\ 1 \end{bmatrix},\ \begin{bmatrix} 4\\ -3\\ 1 \end{bmatrix},\ \begin{bmatrix} 5\\ 8\\ 2 \end{bmatrix}\right\}
\]
is linearly independent and spans $\mathbb{C}^3$. Write the vector $\mathbf{x} = \begin{bmatrix} -6\\ 2\\ 2 \end{bmatrix}$ as a linear combination
of the elements of $S$. How many ways are there to answer this question, and which
theorem allows you to say so?
Exercises

C20† In the vector space of $2\times 2$ matrices, $M_{22}$, determine if the set $S$ below is linearly
independent.
\[
S = \left\{\begin{bmatrix} 2 & -1\\ 1 & 3 \end{bmatrix},\ \begin{bmatrix} 0 & 4\\ -1 & 2 \end{bmatrix},\ \begin{bmatrix} 4 & 2\\ 1 & 3 \end{bmatrix}\right\}
\]
C21† In the crazy vector space $C$ (Example CVS), is the set $S = \{(0, 2), (2, 8)\}$ linearly
independent?

C22† In the vector space of polynomials $P_3$, determine if the set $S$ is linearly independent
or linearly dependent.
\[
S = \left\{ 2 + x - 3x^2 - 8x^3,\ 1 + x + x^2 + 5x^3,\ 3 - 4x^2 - 7x^3 \right\}
\]
C23† Determine if the set $S = \{(3, 1), (7, 3)\}$ is linearly independent in the crazy vector
space $C$ (Example CVS).

C24† In the vector space of real-valued functions $F = \{f \mid f : \mathbb{R} \to \mathbb{R}\}$, determine if the
following set $S$ is linearly independent.
\[
S = \left\{ \sin^2 x,\ \cos^2 x,\ 2 \right\}
\]
C25† Let
\[
S = \left\{\begin{bmatrix} 1 & 2\\ 2 & 1 \end{bmatrix},\ \begin{bmatrix} 2 & 1\\ -1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 1\\ 1 & 2 \end{bmatrix}\right\}
\]
1. Determine if $S$ spans $M_{22}$.

2. Determine if $S$ is linearly independent.

C26† Let
\[
S = \left\{\begin{bmatrix} 1 & 2\\ 2 & 1 \end{bmatrix},\ \begin{bmatrix} 2 & 1\\ -1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 1\\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 1 & 0\\ 1 & 1 \end{bmatrix},\ \begin{bmatrix} 1 & 4\\ 0 & 3 \end{bmatrix}\right\}
\]
1. Determine if $S$ spans $M_{22}$.

2. Determine if $S$ is linearly independent.

C30 In Example LIM32, find another nontrivial relation of linear dependence on the
linearly dependent set of $3\times 2$ matrices, $S$.

C40† Determine if the set $T = \left\{ x^2 - x + 5,\ 4x^3 - x^2 + 5x,\ 3x + 2 \right\}$ spans the vector
space of polynomials with degree 4 or less, $P_4$.

C41† The set $W$ is a subspace of $M_{22}$, the vector space of all $2\times 2$ matrices. Prove that
$S$ is a spanning set for $W$.
\[
W = \left\{\begin{bmatrix} a & b\\ c & d \end{bmatrix} \,\middle|\, 2a - 3b + 4c - d = 0\right\}
\qquad
S = \left\{\begin{bmatrix} 1 & 0\\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 1\\ 0 & -3 \end{bmatrix},\ \begin{bmatrix} 0 & 0\\ 1 & 4 \end{bmatrix}\right\}
\]
C42† Determine if the set $S = \{(3, 1), (7, 3)\}$ spans the crazy vector space $C$ (Example
CVS).

M10† Halfway through Example SSP4, we need to show that the system of equations
\[
\mathcal{LS}\!\left(\begin{bmatrix} 0 & 0 & 0 & 1\\ 0 & 0 & 1 & -8\\ 0 & 1 & -6 & 24\\ 1 & -4 & 12 & -32\\ -2 & 4 & -8 & 16 \end{bmatrix},\ \begin{bmatrix} a\\ b\\ c\\ d\\ e \end{bmatrix}\right)
\]
is consistent for every choice of the vector of constants satisfying $16a + 8b + 4c + 2d + e = 0$.
Express the column space of the coefficient matrix of this system as a null space, using
Theorem FS. From this use Theorem CSCS to establish that the system is always
consistent. Notice that this approach removes from Example SSP4 the need to row-reduce
a symbolic matrix.

T20† Suppose that $S$ is a finite linearly independent set of vectors from the vector space
$V$. Let $T$ be any subset of $S$. Prove that $T$ is linearly independent.

T40 Prove the following variant of Theorem EMMVP that has a weaker hypothesis: Suppose
that $C = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_p\}$ is a linearly independent spanning set for $\mathbb{C}^n$. Suppose
also that $A$ and $B$ are $m\times n$ matrices such that $A\mathbf{u}_i = B\mathbf{u}_i$ for every $1 \le i \le p$. Then $A = B$.
Can you weaken the hypothesis even further while still preserving the conclusion?

T50† Suppose that $V$ is a vector space and $\mathbf{u}, \mathbf{v} \in V$ are two vectors in $V$. Use the
definition of linear independence to prove that $S = \{\mathbf{u}, \mathbf{v}\}$ is a linearly dependent set if
and only if one of the two vectors is a scalar multiple of the other. Prove this directly in
the context of an abstract vector space ($V$), without simply giving an upgraded version of
Theorem DLDS for the special case of just two vectors.

T51† Carefully formulate the converse of Theorem VRRB and provide a proof.
Section B
Bases

A basis of a vector space is one of the most useful concepts in linear algebra. It often
provides a concise, finite description of an infinite vector space.

Subsection B
Bases

We now have all the tools in place to define a basis of a vector space.

Definition B Basis
Suppose $V$ is a vector space. Then a subset $S \subseteq V$ is a basis of $V$ if it is linearly
independent and spans $V$.
□

So, a basis is a linearly independent spanning set for a vector space. The requirement
that the set spans $V$ insures that $S$ has enough raw material to build $V$,
while the linear independence requirement insures that we do not have any more
raw material than we need. As we shall see soon in Section D, a basis is a minimal
spanning set.

You may have noticed that we used the term basis for some of the titles of previous
theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each
of these theorems you will see that their conclusions provide linearly independent
spanning sets for sets that we now recognize as subspaces of $\mathbb{C}^m$. Examples associated
with these theorems include Example NSLIL, Example CSOCD and Example IAS.
As we will see, these three theorems will continue to be powerful tools, even in the
setting of more general vector spaces.

Furthermore, the archetypes contain an abundance of bases. For each coefficient
matrix of a system of equations, and for each archetype defined simply as a matrix,
there is a basis for the null space, three bases for the column space, and a basis for
the row space. For this reason, our subsequent examples will concentrate on bases
for vector spaces other than $\mathbb{C}^m$.

Notice that Definition B does not preclude a vector space from having many
bases, and this is the case, as hinted above by the statement that the archetypes
contain three bases for the column space of a matrix. More generally, we can grab
any basis for a vector space, multiply any one basis vector by a nonzero scalar and
create a slightly different set that is still a basis. For "important" vector spaces, it
will be convenient to have a collection of "nice" bases. When a vector space has
a single particularly nice basis, it is sometimes called the standard basis, though
there is nothing precise enough about this term to allow us to define it formally —
it is a question of style. Here are some nice bases for important vector spaces.
Theorem SUVB Standard Unit Vectors are a Basis
The set of standard unit vectors for $\mathbb{C}^m$ (Definition SUV), $B = \{\mathbf{e}_i \mid 1 \le i \le m\}$, is a
basis for the vector space $\mathbb{C}^m$.

Proof. We must show that the set $B$ is both linearly independent and a spanning
set for $\mathbb{C}^m$. First, the vectors in $B$ are, by Definition SUV, the columns of the
identity matrix, which we know is nonsingular (since it row-reduces to the identity
matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly
independent by Theorem NMLIC.

Suppose we grab an arbitrary vector from $\mathbb{C}^m$, say
\[
\mathbf{v} = \begin{bmatrix} v_1\\ v_2\\ v_3\\ \vdots\\ v_m \end{bmatrix}.
\]
Can we write $\mathbf{v}$ as a linear combination of the vectors in $B$? Yes, and quite
simply.
\[
\begin{bmatrix} v_1\\ v_2\\ v_3\\ \vdots\\ v_m \end{bmatrix}
= v_1\begin{bmatrix} 1\\ 0\\ 0\\ \vdots\\ 0 \end{bmatrix}
+ v_2\begin{bmatrix} 0\\ 1\\ 0\\ \vdots\\ 0 \end{bmatrix}
+ v_3\begin{bmatrix} 0\\ 0\\ 1\\ \vdots\\ 0 \end{bmatrix}
+ \cdots
+ v_m\begin{bmatrix} 0\\ 0\\ 0\\ \vdots\\ 1 \end{bmatrix}
\]
\[
\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3 + \cdots + v_m\mathbf{e}_m
\]
This shows that $\mathbb{C}^m \subseteq \langle B\rangle$, which is sufficient to show that $B$ is a spanning set
for $\mathbb{C}^m$. ∎
Example BP Bases for P_n
The vector space of polynomials with degree at most n, P_n, has the basis
\[
B = \{ 1,\ x,\ x^2,\ x^3,\ \ldots,\ x^n \}.
\]
Another nice basis for P_n is
\[
C = \{ 1,\ 1+x,\ 1+x+x^2,\ 1+x+x^2+x^3,\ \ldots,\ 1+x+x^2+x^3+\cdots+x^n \}.
\]
Checking that each of B and C is a linearly independent spanning set is a good
exercise.
△
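The linear independence check for the basis C of Example BP can be delegated to a computer algebra system. The sketch below is not from the text; it uses Python's sympy (an assumption, since the book itself presents no code here) with the illustrative choice n = 3. The coefficient vectors of the polynomials in C, written relative to B = {1, x, ..., x^n}, form an upper triangular matrix of ones whose determinant is nonzero, so C is linearly independent.

```python
import sympy as sp

n = 3  # small illustrative degree; the argument is identical for any n
x = sp.symbols('x')
C = [sum(x**k for k in range(j + 1)) for j in range(n + 1)]  # 1, 1+x, 1+x+x^2, ...

# Column j holds the coefficients of C[j] relative to B = {1, x, ..., x^n}.
M = sp.Matrix(n + 1, n + 1,
              lambda i, j: sp.Poly(C[j], x).all_coeffs()[::-1][i] if i <= j else 0)

print(M.det())  # 1, nonzero: the coefficient vectors are linearly independent
```

Since the matrix is unitriangular, the determinant is 1 for every n, which is why the same check succeeds regardless of the degree chosen.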
Example BM A basis for the vector space of matrices
In the vector space M_{mn} of matrices (Example VSM) define the matrices B_{kl},
1 ≤ k ≤ m, 1 ≤ l ≤ n, by
\[
[B_{kl}]_{ij} =
\begin{cases}
1 & \text{if } k = i,\ l = j \\
0 & \text{otherwise}
\end{cases}
\]
So these matrices have entries that are all zeros, with the exception of a lone
entry that is one. The set of all mn of them,
\[
B = \{ B_{kl} \mid 1 \le k \le m,\ 1 \le l \le n \}
\]
forms a basis for M_{mn}. See Exercise B.M20.
△
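The spanning half of Exercise B.M20 can be previewed numerically. This sketch is not from the text and uses numpy (an assumption) with the illustrative sizes m = 2, n = 3: any matrix A is recovered as the linear combination of the matrix units B_{kl} weighted by its own entries.

```python
import numpy as np

m, n = 2, 3  # illustrative sizes

def matrix_unit(k, l):
    """B_kl: all zeros except a lone 1 in entry (k, l)."""
    B = np.zeros((m, n))
    B[k, l] = 1.0
    return B

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# A = sum over k, l of [A]_kl * B_kl -- the spanning half of Exercise B.M20.
combo = sum(A[k, l] * matrix_unit(k, l) for k in range(m) for l in range(n))
print(np.array_equal(combo, A))  # True
```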
The bases described above will often be convenient ones to work with. However,
a basis does not have to look obviously like a basis.
Example BSP4 A basis for a subspace of P_4
In Example SSP4 we showed that
\[
S = \{ x-2,\ x^2-4x+4,\ x^3-6x^2+12x-8,\ x^4-8x^3+24x^2-32x+16 \}
\]
is a spanning set for W = { p(x) | p ∈ P_4, p(2) = 0 }. We will now show that S is also
linearly independent in W. Begin with a relation of linear dependence,
\begin{align*}
0 + 0x + 0x^2 + 0x^3 + 0x^4
&= \alpha_1 (x-2) + \alpha_2 \left( x^2-4x+4 \right) + \alpha_3 \left( x^3-6x^2+12x-8 \right) \\
&\quad + \alpha_4 \left( x^4-8x^3+24x^2-32x+16 \right) \\
&= \alpha_4 x^4 + (\alpha_3 - 8\alpha_4) x^3 + (\alpha_2 - 6\alpha_3 + 24\alpha_4) x^2 \\
&\quad + (\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4) x + (-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4)
\end{align*}
Equating coefficients (vector equality in P_4) gives the homogeneous system of
five equations in four variables,
\begin{align*}
\alpha_4 &= 0 \\
\alpha_3 - 8\alpha_4 &= 0 \\
\alpha_2 - 6\alpha_3 + 24\alpha_4 &= 0 \\
\alpha_1 - 4\alpha_2 + 12\alpha_3 - 32\alpha_4 &= 0 \\
-2\alpha_1 + 4\alpha_2 - 8\alpha_3 + 16\alpha_4 &= 0
\end{align*}
We form the coefficient matrix and row-reduce to obtain a matrix in reduced
row-echelon form,
\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]
With only the trivial solution to this homogeneous system, we conclude that the
only scalars that will form a relation of linear dependence are the trivial ones, and
therefore the set S is linearly independent (Definition LI). Finally, S has earned the
right to be called a basis for W (Definition B).
△
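The row-reduction step of Example BSP4 is easy to reproduce by machine. This sketch is not from the text; it uses sympy's `rref` (an assumption) on the 5 × 4 coefficient matrix of the homogeneous system to confirm that every column has a pivot, so only the trivial relation of linear dependence exists.

```python
import sympy as sp

# Rows are the five equations in alpha_1, ..., alpha_4 from Example BSP4.
A = sp.Matrix([
    [ 0,  0,  0,   1],
    [ 0,  0,  1,  -8],
    [ 0,  1, -6,  24],
    [ 1, -4, 12, -32],
    [-2,  4, -8,  16],
])

R, pivots = A.rref()
print(pivots)  # (0, 1, 2, 3): a pivot in every column, so only the trivial relation
```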
Example BSM22 A basis for a subspace of M_{22}
In Example SSM22 we discovered that
\[
Q = \left\{ \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix} \right\}
\]
is a spanning set for the subspace
\[
Z = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, a + 3b - c - 5d = 0,\ -2a - 6b + 3c + 14d = 0 \right\}
\]
of the vector space of all 2 × 2 matrices, M_{22}. If we can also determine that Q is
linearly independent in Z (or in M_{22}), then it will qualify as a basis for Z.
Let us begin with a relation of linear dependence,
\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
= \alpha_1 \begin{bmatrix} -3 & 1 \\ 0 & 0 \end{bmatrix}
+ \alpha_2 \begin{bmatrix} 1 & 0 \\ -4 & 1 \end{bmatrix}
= \begin{bmatrix} -3\alpha_1 + \alpha_2 & \alpha_1 \\ -4\alpha_2 & \alpha_2 \end{bmatrix}
\]
Using our definition of matrix equality (Definition ME) we equate entries and
get a homogeneous system of four equations in two variables,
\begin{align*}
-3\alpha_1 + \alpha_2 &= 0 \\
\alpha_1 &= 0 \\
-4\alpha_2 &= 0 \\
\alpha_2 &= 0
\end{align*}
We could row-reduce the coefficient matrix of this homogeneous system, but it is
not necessary. The second and fourth equations tell us that α_1 = 0, α_2 = 0 is the
only solution to this homogeneous system. This qualifies the set Q as being linearly
independent, since the only relation of linear dependence is trivial (Definition LI).
Therefore Q is a basis for Z (Definition B).
△
Example BC Basis for the crazy vector space
In Example LIC and Example SSC we determined that the set R = {(1, 0), (6, 3)}
from the crazy vector space, C (Example CVS), is linearly independent and is a
spanning set for C. By Definition B we see that R is a basis for C.
△
We have seen that several of the sets associated with a matrix are subspaces
of vector spaces of column vectors. Specifically these are the null space (Theorem
NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null
space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it
is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS
and Theorem BRS each have conclusions that provide linearly independent spanning
sets for (respectively) the null space, column space, and row space. Notice that each of
these theorems contains the word "basis" in its title, even though we did not know
the precise meaning of the word at the time. To find a basis for a left null space we
can use the definition of this subspace as a null space (Definition LNS) and apply
Theorem BNS. Or Theorem FS tells us that the left null space can be expressed as
a row space and we can then use Theorem BRS.
Theorem BS is another early result that provides a linearly independent spanning
set (i.e. a basis) as its conclusion. If a vector space of column vectors can be
expressed as a span of a set of column vectors, then Theorem BS can be employed
in a straightforward manner to quickly yield a basis.
Subsection BSCV
Bases for Spans of Column Vectors
We have seen several examples of bases in different vector spaces. In this subsection,
and the next (Subsection B.BNM), we will consider building bases for C^m and its
subspaces.
Suppose we have a subspace of C^m that is expressed as the span of a set of vectors,
S, and S is not necessarily linearly independent, or perhaps not very attractive.
Theorem REMRS says that row-equivalent matrices have identical row spaces, while
Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a
basis for the row space. These theorems together give us a great computational tool
for quickly finding a basis for a subspace that is expressed originally as a span.
Example RSB Row space basis
When we first defined the span of a set of column vectors, in Example SCAD we
looked at the set
\[
W = \left\langle \left\{
\begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix},\
\begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix},\
\begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix}
\right\} \right\rangle
\]
with an eye towards realizing W as the span of a smaller set. By building relations
of linear dependence (though we did not know them by that name then) we were
able to remove two vectors and write W as the span of the other two vectors. These
two remaining vectors formed a linearly independent set, even though we did not
know that at the time.
Now we know that W is a subspace and must have a basis. Consider the matrix
C whose rows are the vectors in the spanning set for W,
\[
C = \begin{bmatrix}
2 & -3 & 1 \\
1 & 4 & 1 \\
7 & -5 & 4 \\
-7 & -6 & -5
\end{bmatrix}
\]
Then, by Definition RSM, the row space of C will be W, R(C) = W. Theorem
BRS tells us that if we row-reduce C, the nonzero rows of the row-equivalent matrix
in reduced row-echelon form will be a basis for R(C), and hence a basis for W. Let
us do it: C row-reduces to
\[
\begin{bmatrix}
1 & 0 & \frac{7}{11} \\
0 & 1 & \frac{1}{11} \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}
\]
If we convert the two nonzero rows to column vectors then we have a basis,
\[
B = \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\}
\]
and
\[
W = \left\langle \left\{ \begin{bmatrix} 1 \\ 0 \\ \frac{7}{11} \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ \frac{1}{11} \end{bmatrix} \right\} \right\rangle
\]
For aesthetic reasons, we might wish to multiply each vector in B by 11, which
will not change the spanning or linear independence properties of B as a basis. Then
we can also write
\[
W = \left\langle \left\{ \begin{bmatrix} 11 \\ 0 \\ 7 \end{bmatrix},\ \begin{bmatrix} 0 \\ 11 \\ 1 \end{bmatrix} \right\} \right\rangle
\]
△
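The computation in Example RSB is exactly the kind a computer algebra system does well. This sketch is not from the text; it uses sympy's `rref` (an assumption) on the matrix C whose rows span W, and collects the nonzero rows as the basis vectors.

```python
import sympy as sp

# Rows are the four vectors spanning W in Example RSB.
C = sp.Matrix([
    [ 2, -3,  1],
    [ 1,  4,  1],
    [ 7, -5,  4],
    [-7, -6, -5],
])

R, _ = C.rref()
basis = [R.row(i).T for i in range(R.rows) if any(R.row(i))]
print(basis)  # the columns (1, 0, 7/11) and (0, 1, 1/11), as in the text
```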
Example IAS provides another example of this flavor, though now we can notice
that X is a subspace, and that the resulting set of three vectors is a basis. This is
such a powerful technique that we should do one more example.
Example RS Reducing a span
In Example RSC5 we began with a set of n = 4 vectors from C^5,
\[
R = \{ v_1,\ v_2,\ v_3,\ v_4 \} = \left\{
\begin{bmatrix} 1 \\ 2 \\ -1 \\ 3 \\ 2 \end{bmatrix},\
\begin{bmatrix} 2 \\ 1 \\ 3 \\ 1 \\ 2 \end{bmatrix},\
\begin{bmatrix} 0 \\ -7 \\ 6 \\ -11 \\ -2 \end{bmatrix},\
\begin{bmatrix} 4 \\ 1 \\ 2 \\ 1 \\ 6 \end{bmatrix}
\right\}
\]
and defined V = ⟨R⟩. Our goal in that problem was to find a relation of linear
dependence on the vectors in R, solve the resulting equation for one of the vectors,
and re-express V as the span of a set of three vectors.
Here is another way to accomplish something similar. The row space of the matrix
\[
A = \begin{bmatrix}
1 & 2 & -1 & 3 & 2 \\
2 & 1 & 3 & 1 & 2 \\
0 & -7 & 6 & -11 & -2 \\
4 & 1 & 2 & 1 & 6
\end{bmatrix}
\]
is equal to ⟨R⟩. By Theorem BRS we can row-reduce this matrix, ignore any zero
rows, and use the nonzero rows as column vectors that are a basis for the row space
of A. Row-reducing A creates the matrix
\[
\begin{bmatrix}
1 & 0 & 0 & -\frac{1}{17} & \frac{30}{17} \\
0 & 1 & 0 & \frac{25}{17} & -\frac{2}{17} \\
0 & 0 & 1 & -\frac{2}{17} & -\frac{8}{17} \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
So
\[
\left\{
\begin{bmatrix} 1 \\ 0 \\ 0 \\ -\frac{1}{17} \\ \frac{30}{17} \end{bmatrix},\
\begin{bmatrix} 0 \\ 1 \\ 0 \\ \frac{25}{17} \\ -\frac{2}{17} \end{bmatrix},\
\begin{bmatrix} 0 \\ 0 \\ 1 \\ -\frac{2}{17} \\ -\frac{8}{17} \end{bmatrix}
\right\}
\]
is a basis for V. Our theorem tells us this is a basis: there is no need to verify that
the subspace spanned by three vectors (rather than four) is the identical subspace,
and there is no need to verify that we have reached the limit in reducing the set,
since the set of three vectors is guaranteed to be linearly independent.
△
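A rank computation makes the two "no need to verify" claims of Example RS concrete. This sketch is not from the text; it uses sympy (an assumption). The rank of A is 3, so three basis vectors suffice, and stacking A on its reduced row-echelon form does not raise the rank, so the nonzero rows of the latter span exactly the row space of A.

```python
import sympy as sp

# Rows are the four vectors of R from Example RS.
A = sp.Matrix([
    [1,  2, -1,   3,  2],
    [2,  1,  3,   1,  2],
    [0, -7,  6, -11, -2],
    [4,  1,  2,   1,  6],
])

R, _ = A.rref()
print(A.rank())  # 3: a basis for V needs only three vectors
# Same row space: stacking A on top of R cannot increase the rank.
print(A.col_join(R).rank())  # still 3
```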
Subsection BNM
Bases and Nonsingular Matrices
A quick source of diverse bases for C^m is the set of columns of a nonsingular matrix.
Theorem CNMB Columns of Nonsingular Matrix are a Basis
Suppose that A is a square matrix of size m. Then the columns of A are a basis of
C^m if and only if A is nonsingular.
Proof. (⇒) Suppose that the columns of A are a basis for C^m. Then Definition B
says the set of columns is linearly independent. Theorem NMLIC then says that A
is nonsingular.
(⇐) Suppose that A is nonsingular. Then by Theorem NMLIC this set of
columns is linearly independent. Theorem CSNM says that for a nonsingular matrix,
C(A) = C^m. This is equivalent to saying that the columns of A are a spanning set
for the vector space C^m. As a linearly independent spanning set, the columns of A
qualify as a basis for C^m (Definition B). ∎
Example CABAK Columns as Basis, Archetype K
Archetype K is the 5 × 5 matrix
\[
K = \begin{bmatrix}
10 & 18 & 24 & 24 & -12 \\
12 & -2 & -6 & 0 & -18 \\
-30 & -21 & -23 & -30 & 39 \\
27 & 30 & 36 & 37 & -30 \\
18 & 24 & 30 & 30 & -20
\end{bmatrix}
\]
which is row-equivalent to the 5 × 5 identity matrix I_5. So by Theorem NMRRI, K
is nonsingular. Then Theorem CNMB says the set
\[
\left\{
\begin{bmatrix} 10 \\ 12 \\ -30 \\ 27 \\ 18 \end{bmatrix},\
\begin{bmatrix} 18 \\ -2 \\ -21 \\ 30 \\ 24 \end{bmatrix},\
\begin{bmatrix} 24 \\ -6 \\ -23 \\ 36 \\ 30 \end{bmatrix},\
\begin{bmatrix} 24 \\ 0 \\ -30 \\ 37 \\ 30 \end{bmatrix},\
\begin{bmatrix} -12 \\ -18 \\ 39 \\ -30 \\ -20 \end{bmatrix}
\right\}
\]
is a (novel) basis of C^5.
△
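Theorem CNMB has a direct numerical payoff: once a matrix is known to be nonsingular, its columns coordinatize every vector uniquely. This sketch is not from the text; it uses numpy (an assumption) on the matrix K of Example CABAK, with an arbitrary illustrative target vector w.

```python
import numpy as np

K = np.array([
    [ 10,  18,  24,  24, -12],
    [ 12,  -2,  -6,   0, -18],
    [-30, -21, -23, -30,  39],
    [ 27,  30,  36,  37, -30],
    [ 18,  24,  30,  30, -20],
], dtype=float)

print(np.linalg.matrix_rank(K))  # 5: K is nonsingular

# Columns of a nonsingular matrix are a basis, so any vector has unique
# coordinates relative to them; solving K a = w finds those coordinates.
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # an arbitrary (illustrative) vector
a = np.linalg.solve(K, w)
print(np.allclose(K @ a, w))  # True
```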
Perhaps we should view the fact that the standard unit vectors are a basis
(Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Proof Technique
LC.)
With a new equivalence for a nonsingular matrix, we can update our list of
equivalences.
Theorem NME5 Nonsingular Matrix Equivalences, Round 5
Suppose that A is a square matrix of size n. The following are equivalent.
1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A are a linearly independent set.
6. A is invertible.
7. The column space of A is C^n, C(A) = C^n.
8. The columns of A are a basis for C^n.
Proof. With a new equivalence for a nonsingular matrix in Theorem CNMB we can
expand Theorem NME4. ∎
Subsection OBC
Orthonormal Bases and Coordinates
We learned about orthogonal sets of vectors in C^m back in Section O, and we also
learned that orthogonal sets are automatically linearly independent (Theorem OSLI).
When an orthogonal set also spans a subspace of C^m, then the set is a basis. And
when the set is orthonormal, then the set is an incredibly nice basis. We will back up
this claim with a theorem, but first consider how you might manufacture such a set.
Suppose that W is a subspace of C^m with basis B. Then B spans W and is
a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt
Procedure (Theorem GSP) and obtain a linearly independent set T such that
⟨T⟩ = ⟨B⟩ = W and T is orthogonal. In other words, T is a basis for W, and is an
orthogonal set. By scaling each vector of T to norm 1, we can convert T into an
orthonormal set, without destroying the properties that make it a basis of W. In
short, we can convert any basis into an orthonormal basis. Example GSTV, followed
by Example ONTV, illustrates this process.
Unitary matrices (Definition UM) are another good source of orthonormal bases
(and vice versa). Suppose that Q is a unitary matrix of size n. Then the n columns of
Q form an orthonormal set (Theorem CUMOS) that is therefore linearly independent
(Theorem OSLI). Since Q is invertible (Theorem UMI), we know Q is nonsingular
(Theorem NI), and then the columns of Q span C^n (Theorem CSNM). So the columns
of a unitary matrix of size n are an orthonormal basis for C^n.
Why all the fuss about orthonormal bases? Theorem VRRB told us that any
vector in a vector space could be written, uniquely, as a linear combination of basis
vectors. For an orthonormal basis, finding the scalars for this linear combination
is extremely easy, and this is the content of the next theorem. Furthermore, with
vectors written this way (as linear combinations of the elements of an orthonormal
set) certain computations and analysis become much easier. Here is the promised
theorem.
Theorem COB Coordinates and Orthonormal Bases
Suppose that B = {v_1, v_2, v_3, ..., v_p} is an orthonormal basis of the subspace W
of C^m. For any w ∈ W,
\[
w = \langle v_1, w\rangle v_1 + \langle v_2, w\rangle v_2 + \langle v_3, w\rangle v_3 + \cdots + \langle v_p, w\rangle v_p
\]
Proof. Because B is a basis of W, Theorem VRRB tells us that we can write w
uniquely as a linear combination of the vectors in B. So it is not this aspect of
the conclusion that makes this theorem interesting. What is interesting is that the
particular scalars are so easy to compute. No need to solve big systems of equations;
just do an inner product of w with v_i to arrive at the coefficient of v_i in the linear
combination.
So begin the proof by writing w as a linear combination of the vectors in B,
using unknown scalars,
\[
w = a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_p v_p
\]
and compute,
\begin{align*}
\langle v_i, w\rangle
&= \left\langle v_i, \sum_{k=1}^{p} a_k v_k \right\rangle && \text{Theorem VRRB} \\
&= \sum_{k=1}^{p} \langle v_i, a_k v_k\rangle && \text{Theorem IPVA} \\
&= \sum_{k=1}^{p} a_k \langle v_i, v_k\rangle && \text{Theorem IPSM} \\
&= a_i \langle v_i, v_i\rangle + \sum_{\substack{k=1 \\ k \ne i}}^{p} a_k \langle v_i, v_k\rangle && \text{Property C} \\
&= a_i(1) + \sum_{\substack{k=1 \\ k \ne i}}^{p} a_k(0) && \text{Definition ONS} \\
&= a_i
\end{align*}
So the (unique) scalars for the linear combination are indeed the inner products
advertised in the conclusion of the theorem's statement. ∎
Example CROB4 Coordinatization relative to an orthonormal basis, C^4
The set
\[
\{x_1, x_2, x_3, x_4\} = \left\{
\begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix},\
\begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix},\
\begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix},\
\begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix}
\right\}
\]
was proposed, and partially verified, as an orthogonal set in Example AOS. Let
us scale each vector to norm 1, so as to form an orthonormal set in C^4. Then by
Theorem OSLI the set will be linearly independent, and by Theorem NME5 the
set will be a basis for C^4. So, once scaled to norm 1, the adjusted set will be an
orthonormal basis of C^4. The norms are
\[
\|x_1\| = \sqrt{6} \qquad \|x_2\| = \sqrt{174} \qquad \|x_3\| = \sqrt{3451} \qquad \|x_4\| = \sqrt{119}
\]
So an orthonormal basis is
\[
B = \{v_1, v_2, v_3, v_4\} = \left\{
\frac{1}{\sqrt{6}} \begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix},\
\frac{1}{\sqrt{174}} \begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix},\
\frac{1}{\sqrt{3451}} \begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix},\
\frac{1}{\sqrt{119}} \begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix}
\right\}
\]
Now, to illustrate Theorem COB, choose any vector from C^4, say
\[
w = \begin{bmatrix} 2 \\ -3 \\ 1 \\ 4 \end{bmatrix},
\]
and compute
\[
\langle v_1, w\rangle = \frac{-5i}{\sqrt{6}} \qquad
\langle v_2, w\rangle = \frac{-19+30i}{\sqrt{174}} \qquad
\langle v_3, w\rangle = \frac{120-211i}{\sqrt{3451}} \qquad
\langle v_4, w\rangle = \frac{6+12i}{\sqrt{119}}
\]
Then Theorem COB guarantees that
\begin{align*}
\begin{bmatrix} 2 \\ -3 \\ 1 \\ 4 \end{bmatrix}
&= \frac{-5i}{\sqrt{6}} \left( \frac{1}{\sqrt{6}} \begin{bmatrix} 1+i \\ 1 \\ 1-i \\ i \end{bmatrix} \right)
+ \frac{-19+30i}{\sqrt{174}} \left( \frac{1}{\sqrt{174}} \begin{bmatrix} 1+5i \\ 6+5i \\ -7-i \\ 1-6i \end{bmatrix} \right) \\
&\quad + \frac{120-211i}{\sqrt{3451}} \left( \frac{1}{\sqrt{3451}} \begin{bmatrix} -7+34i \\ -8-23i \\ -10+22i \\ 30+13i \end{bmatrix} \right)
+ \frac{6+12i}{\sqrt{119}} \left( \frac{1}{\sqrt{119}} \begin{bmatrix} -2-4i \\ 6+i \\ 4+3i \\ 6-i \end{bmatrix} \right)
\end{align*}
as you might want to check (if you have unlimited patience).
△
A slightly less intimidating example follows, in three dimensions and with just
real numbers.
Example CROB3 Coordinatization relative to an orthonormal basis, C^3
The set
\[
\{x_1, x_2, x_3\} = \left\{
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\
\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},\
\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}
\right\}
\]
is a linearly independent set, which the Gram-Schmidt Process (Theorem GSP)
converts to an orthogonal set, and which can then be converted to the orthonormal
set,
\[
B = \{v_1, v_2, v_3\} = \left\{
\frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\
\frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix},\
\frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}
\right\}
\]
which is therefore an orthonormal basis of C^3. With three vectors in C^3, all with
real number entries, the inner product (Definition IP) reduces to the usual "dot
product" (or scalar product) and the orthogonal pairs of vectors can be interpreted
as perpendicular pairs of directions. So the vectors in B serve as replacements for our
usual 3-D axes, or the usual 3-D unit vectors \vec{i}, \vec{j} and \vec{k}. We would like to decompose
arbitrary vectors into "components" in the directions of each of these basis vectors.
It is Theorem COB that tells us how to do this.
Suppose that we choose
\[
w = \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}.
\]
Compute
\[
\langle v_1, w\rangle = \frac{5}{\sqrt{6}} \qquad
\langle v_2, w\rangle = \frac{3}{\sqrt{2}} \qquad
\langle v_3, w\rangle = \frac{8}{\sqrt{3}}
\]
then Theorem COB guarantees that
\[
\begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}
= \frac{5}{\sqrt{6}} \left( \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \right)
+ \frac{3}{\sqrt{2}} \left( \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right)
+ \frac{8}{\sqrt{3}} \left( \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right)
\]
which you should be able to check easily, even if you do not have much patience.
△
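The "easy check" at the end of Example CROB3 takes a few lines by machine. This sketch is not from the text; it uses numpy (an assumption). Since all entries are real, the inner products of Theorem COB are plain dot products, and summing each coordinate times its basis vector rebuilds w exactly.

```python
import numpy as np

v1 = np.array([1,  2, 1]) / np.sqrt(6)
v2 = np.array([-1, 0, 1]) / np.sqrt(2)
v3 = np.array([1, -1, 1]) / np.sqrt(3)
w = np.array([2.0, -1.0, 5.0])

# Real entries, so the inner products of Theorem COB are plain dot products.
coords = [v.dot(w) for v in (v1, v2, v3)]  # 5/sqrt(6), 3/sqrt(2), 8/sqrt(3)
rebuilt = coords[0] * v1 + coords[1] * v2 + coords[2] * v3
print(np.allclose(rebuilt, w))  # True
```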
Not only do the columns of a unitary matrix form an orthonormal basis, but there
is a deeper connection between orthonormal bases and unitary matrices. Informally,
the next theorem says that if we transform each vector of an orthonormal basis by
multiplying it by a unitary matrix, then the resulting set will be another orthonormal
basis. And more remarkably, any matrix with this property must be unitary! As an
equivalence (Proof Technique E) we could take this as our defining property of a
unitary matrix, though it might not have the same utility as Definition UM.
Theorem UMCOB Unitary Matrices Convert Orthonormal Bases
Let A be an n × n matrix and B = {x_1, x_2, x_3, ..., x_n} be an orthonormal basis of
C^n. Define
\[
C = \{ Ax_1,\ Ax_2,\ Ax_3,\ \ldots,\ Ax_n \}
\]
Then A is a unitary matrix if and only if C is an orthonormal basis of C^n.
Proof. (⇒) Assume A is a unitary matrix and establish several facts about C. First
we check that C is an orthonormal set (Definition ONS). By Theorem UMPIP, for
i ≠ j,
\[
\langle Ax_i, Ax_j\rangle = \langle x_i, x_j\rangle = 0
\]
Similarly, Theorem UMPIP also gives, for 1 ≤ i ≤ n,
\[
\|Ax_i\| = \|x_i\| = 1
\]
As C is an orthogonal set (Definition OSV), Theorem OSLI yields the linear
independence of C. Having established that the vectors of C form a linearly
independent set, a matrix whose columns are the vectors of C is nonsingular (Theorem
NMLIC), and hence these vectors form a basis of C^n by Theorem CNMB.
(⇐) Now assume that C is an orthonormal set. Let y be an arbitrary vector
from C^n. Since B spans C^n, there are scalars, a_1, a_2, a_3, ..., a_n, such that
\[
y = a_1 x_1 + a_2 x_2 + a_3 x_3 + \cdots + a_n x_n
\]
Now
\begin{align*}
A^\ast A y
&= \sum_{i=1}^{n} \langle x_i, A^\ast A y\rangle x_i && \text{Theorem COB} \\
&= \sum_{i=1}^{n} \left\langle x_i, A^\ast A \sum_{j=1}^{n} a_j x_j \right\rangle x_i && \text{Definition SSVS} \\
&= \sum_{i=1}^{n} \left\langle x_i, \sum_{j=1}^{n} A^\ast A a_j x_j \right\rangle x_i && \text{Theorem MMDAA} \\
&= \sum_{i=1}^{n} \left\langle x_i, \sum_{j=1}^{n} a_j A^\ast A x_j \right\rangle x_i && \text{Theorem MMSMM} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \langle x_i, a_j A^\ast A x_j\rangle x_i && \text{Theorem IPVA} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} a_j \langle x_i, A^\ast A x_j\rangle x_i && \text{Theorem IPSM} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} a_j \langle Ax_i, Ax_j\rangle x_i && \text{Theorem AIP} \\
&= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} a_j \langle Ax_i, Ax_j\rangle x_i
   + \sum_{l=1}^{n} a_l \langle Ax_l, Ax_l\rangle x_l && \text{Property C} \\
&= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} a_j(0) x_i
   + \sum_{l=1}^{n} a_l(1) x_l && \text{Definition ONS} \\
&= \sum_{i=1}^{n} 0 + \sum_{l=1}^{n} a_l x_l && \text{Theorem ZSSM} \\
&= \sum_{l=1}^{n} a_l x_l && \text{Property Z} \\
&= y \\
&= I_n y && \text{Theorem MMIM}
\end{align*}
Since the choice of y was arbitrary, Theorem EMMVP tells us that A^\ast A = I_n,
so A is unitary (Definition UM). ∎
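The forward direction of Theorem UMCOB is easy to watch numerically. This sketch is not from the text; it uses numpy (an assumption) with a real rotation matrix, a simple instance of a unitary matrix. Applying Q to the standard (orthonormal) basis yields a set whose Gram matrix is the identity, i.e. another orthonormal basis.

```python
import numpy as np

theta = 0.7  # any angle; rotation matrices are unitary (real orthogonal)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

B = np.eye(2)        # the standard (orthonormal) basis, as columns
C = Q @ B            # images of the basis vectors under Q

# Gram matrix of the image set: identity exactly when C is orthonormal.
G = C.conj().T @ C
print(np.allclose(G, np.eye(2)))  # True
```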
Reading Questions
1. The matrix below is nonsingular. What can you now say about its columns?
\[
A = \begin{bmatrix} -3 & 0 & 1 \\ 1 & 2 & 1 \\ 5 & 1 & 6 \end{bmatrix}
\]
2. Write the vector
\[
w = \begin{bmatrix} 6 \\ 6 \\ 15 \end{bmatrix}
\]
as a linear combination of the columns of the matrix A
above. How many ways are there to answer this question?
3. Why is an orthonormal basis desirable?
Exercises
C10† Find a basis for ⟨S⟩, where
\[
S = \left\{
\begin{bmatrix} 1 \\ 3 \\ 2 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 2 \\ 1 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix},\
\begin{bmatrix} 1 \\ 2 \\ 2 \\ 1 \end{bmatrix},\
\begin{bmatrix} 3 \\ 4 \\ 1 \\ 3 \end{bmatrix}
\right\}.
\]
C11† Find a basis for the subspace W of C^4,
\[
W = \left\{ \begin{bmatrix} a+b-2c \\ a+b-2c+d \\ -2a+2b+4c-d \\ b+d \end{bmatrix} \,\middle|\, a, b, c, d \in \mathbb{C} \right\}
\]
C12† Find a basis for the vector space T of lower triangular 3 × 3 matrices; that is,
matrices of the form
\[
\begin{bmatrix} \ast & 0 & 0 \\ \ast & \ast & 0 \\ \ast & \ast & \ast \end{bmatrix}
\]
where an asterisk represents any complex number.
C13† Find a basis for the subspace Q of P_2, Q = { p(x) = a + bx + cx^2 | p(0) = 0 }.
C14† Find a basis for the subspace R of P_2, R = { p(x) = a + bx + cx^2 | p'(0) = 0 }, where
p' denotes the derivative.
C40† From Example RSB, form an arbitrary (and nontrivial) linear combination of the
four vectors in the original spanning set for W. So the result of this computation is of
course an element of W. As such, this vector should be a linear combination of the basis
vectors in B. Find the (unique) scalars that provide this linear combination. Repeat with
another linear combination of the original four vectors.
C80 Prove that {(1, 2), (2, 3)} is a basis for the crazy vector space C (Example CVS).
M20† In Example BM provide the verifications (linear independence and spanning) to
show that B is a basis of M_{mn}.
T50† Theorem UMCOB says that unitary matrices are characterized as those matrices
that "carry" orthonormal bases to orthonormal bases. This problem asks you to prove a
similar result: nonsingular matrices are characterized as those matrices that "carry" bases
to bases.
More precisely, suppose that A is a square matrix of size n and B = {x_1, x_2, x_3, ..., x_n}
is a basis of C^n. Prove that A is nonsingular if and only if C = {Ax_1, Ax_2, Ax_3, ..., Ax_n}
is a basis of C^n. (See also Exercise PD.T33, Exercise MR.T20.)
T51† Use the result of Exercise B.T50 to build a very concise proof of Theorem CNMB.
(Hint: make a judicious choice for the basis B.)
Section D
Dimension

Almost every vector space we have encountered has been infinite in size (an exception is Example VSS). But some are bigger and richer than others. Dimension, once suitably defined, will be a measure of the size of a vector space, and a useful tool for studying its properties. You probably already have a rough notion of what a mathematical definition of dimension might be — try to forget these imprecise ideas and go with the new ones given here.
Subsection D
Dimension

Definition D Dimension
Suppose that V is a vector space and {v_1, v_2, v_3, ..., v_t} is a basis of V. Then the dimension of V is defined by dim(V) = t. If V has no finite bases, we say V has infinite dimension. □

This is a very simple definition, which belies its power. Grab a basis, any basis, and count up the number of vectors it contains. That is the dimension. However, this simplicity causes a problem. Given a vector space, you and I could each construct different bases — remember that a vector space might have many bases. And what if your basis and my basis had different sizes? Applying Definition D we would arrive at different numbers! With our current knowledge about vector spaces, we would have to say that dimension is not "well-defined." Fortunately, there is a theorem that will correct this problem.

In a strictly logical progression, the next two theorems would precede the definition of dimension. Many subsequent theorems will trace their lineage back to the following fundamental result.
Theorem SSLD Spanning Sets and Linear Dependence
Suppose that S = {v_1, v_2, v_3, ..., v_t} is a finite set of vectors which spans the vector space V. Then any set of t + 1 or more vectors from V is linearly dependent.

Proof. We want to prove that any set of t + 1 or more vectors from V is linearly dependent. So we will begin with a totally arbitrary set of vectors from V, R = {u_1, u_2, u_3, ..., u_m}, where m > t. We will now construct a nontrivial relation of linear dependence on R.

Each vector u_1, u_2, u_3, ..., u_m can be written as a linear combination of the vectors v_1, v_2, v_3, ..., v_t since S is a spanning set of V. This means there exist scalars a_ij, 1 ≤ i ≤ t, 1 ≤ j ≤ m, so that

u_1 = a_11 v_1 + a_21 v_2 + a_31 v_3 + ··· + a_t1 v_t
u_2 = a_12 v_1 + a_22 v_2 + a_32 v_3 + ··· + a_t2 v_t
u_3 = a_13 v_1 + a_23 v_2 + a_33 v_3 + ··· + a_t3 v_t
...
u_m = a_1m v_1 + a_2m v_2 + a_3m v_3 + ··· + a_tm v_t
Now we form, unmotivated, the homogeneous system of t equations in the m variables, x_1, x_2, x_3, ..., x_m, where the coefficients are the just-discovered scalars a_ij,

a_11 x_1 + a_12 x_2 + a_13 x_3 + ··· + a_1m x_m = 0
a_21 x_1 + a_22 x_2 + a_23 x_3 + ··· + a_2m x_m = 0
a_31 x_1 + a_32 x_2 + a_33 x_3 + ··· + a_3m x_m = 0
...
a_t1 x_1 + a_t2 x_2 + a_t3 x_3 + ··· + a_tm x_m = 0

This is a homogeneous system with more variables than equations (our hypothesis is expressed as m > t), so by Theorem HMVEI there are infinitely many solutions. Choose a nontrivial solution and denote it by x_1 = c_1, x_2 = c_2, x_3 = c_3, ..., x_m = c_m. As a solution to the homogeneous system, we then have

a_11 c_1 + a_12 c_2 + a_13 c_3 + ··· + a_1m c_m = 0
a_21 c_1 + a_22 c_2 + a_23 c_3 + ··· + a_2m c_m = 0
a_31 c_1 + a_32 c_2 + a_33 c_3 + ··· + a_3m c_m = 0
...
a_t1 c_1 + a_t2 c_2 + a_t3 c_3 + ··· + a_tm c_m = 0
As a collection of nontrivial scalars, c_1, c_2, c_3, ..., c_m will provide the nontrivial relation of linear dependence we desire,

c_1 u_1 + c_2 u_2 + c_3 u_3 + ··· + c_m u_m
  = c_1 (a_11 v_1 + a_21 v_2 + a_31 v_3 + ··· + a_t1 v_t)        Definition SSVS
    + c_2 (a_12 v_1 + a_22 v_2 + a_32 v_3 + ··· + a_t2 v_t)
    + c_3 (a_13 v_1 + a_23 v_2 + a_33 v_3 + ··· + a_t3 v_t)
    ...
    + c_m (a_1m v_1 + a_2m v_2 + a_3m v_3 + ··· + a_tm v_t)
  = c_1 a_11 v_1 + c_1 a_21 v_2 + c_1 a_31 v_3 + ··· + c_1 a_t1 v_t        Property DVA
    + c_2 a_12 v_1 + c_2 a_22 v_2 + c_2 a_32 v_3 + ··· + c_2 a_t2 v_t
    + c_3 a_13 v_1 + c_3 a_23 v_2 + c_3 a_33 v_3 + ··· + c_3 a_t3 v_t
    ...
    + c_m a_1m v_1 + c_m a_2m v_2 + c_m a_3m v_3 + ··· + c_m a_tm v_t
  = (c_1 a_11 + c_2 a_12 + c_3 a_13 + ··· + c_m a_1m) v_1        Property DSA
    + (c_1 a_21 + c_2 a_22 + c_3 a_23 + ··· + c_m a_2m) v_2
    + (c_1 a_31 + c_2 a_32 + c_3 a_33 + ··· + c_m a_3m) v_3
    ...
    + (c_1 a_t1 + c_2 a_t2 + c_3 a_t3 + ··· + c_m a_tm) v_t
  = (a_11 c_1 + a_12 c_2 + a_13 c_3 + ··· + a_1m c_m) v_1        Property CMCN
    + (a_21 c_1 + a_22 c_2 + a_23 c_3 + ··· + a_2m c_m) v_2
    + (a_31 c_1 + a_32 c_2 + a_33 c_3 + ··· + a_3m c_m) v_3
    ...
    + (a_t1 c_1 + a_t2 c_2 + a_t3 c_3 + ··· + a_tm c_m) v_t
  = 0 v_1 + 0 v_2 + 0 v_3 + ··· + 0 v_t        c_j as solution
  = 0 + 0 + 0 + ··· + 0        Theorem ZSSM
  = 0        Property Z

That does it. R has been undeniably shown to be a linearly dependent set. ■
The proof just given has some monstrous expressions in it, mostly owing to the double subscripts present. Now is a great opportunity to show the value of a more compact notation. We will rewrite the key steps of the previous proof using summation notation, resulting in a more economical presentation, and even greater insight into the key aspects of the proof. So here is an alternate proof — study it carefully.

Alternate Proof: We want to prove that any set of t + 1 or more vectors from V is linearly dependent. So we will begin with a totally arbitrary set of vectors from V, R = {u_j | 1 ≤ j ≤ m}, where m > t. We will now construct a nontrivial relation of linear dependence on R.

Each vector u_j, 1 ≤ j ≤ m, can be written as a linear combination of v_i, 1 ≤ i ≤ t, since S is a spanning set of V. This means there are scalars a_ij, 1 ≤ i ≤ t, 1 ≤ j ≤ m, so that

u_j = Σ_{i=1}^{t} a_ij v_i        1 ≤ j ≤ m
Now we form, unmotivated, the homogeneous system of t equations in the m variables, x_j, 1 ≤ j ≤ m, where the coefficients are the just-discovered scalars a_ij,

Σ_{j=1}^{m} a_ij x_j = 0        1 ≤ i ≤ t

This is a homogeneous system with more variables than equations (our hypothesis is expressed as m > t), so by Theorem HMVEI there are infinitely many solutions. Choose one of these solutions that is not trivial and denote it by x_j = c_j, 1 ≤ j ≤ m. As a solution to the homogeneous system, we then have Σ_{j=1}^{m} a_ij c_j = 0 for 1 ≤ i ≤ t. As a collection of nontrivial scalars, c_j, 1 ≤ j ≤ m, will provide the nontrivial relation of linear dependence we desire,
Σ_{j=1}^{m} c_j u_j = Σ_{j=1}^{m} c_j ( Σ_{i=1}^{t} a_ij v_i )        Definition SSVS
  = Σ_{j=1}^{m} Σ_{i=1}^{t} c_j a_ij v_i        Property DVA
  = Σ_{i=1}^{t} Σ_{j=1}^{m} c_j a_ij v_i        Property C
  = Σ_{i=1}^{t} Σ_{j=1}^{m} a_ij c_j v_i        Property CMCN
  = Σ_{i=1}^{t} ( Σ_{j=1}^{m} a_ij c_j ) v_i        Property DSA
  = Σ_{i=1}^{t} 0 v_i        c_j as solution
  = Σ_{i=1}^{t} 0        Theorem ZSSM
  = 0        Property Z

That does it. R has been undeniably shown to be a linearly dependent set. ■
Notice how the swap of the two summations is so much easier in the third step above, as opposed to all the rearranging and regrouping that takes place in the previous proof. And using only about half the space. And there are no ellipses (...).

Theorem SSLD can be viewed as a generalization of Theorem MVSLD. We know that C^m has a basis with m vectors in it (Theorem SUVB), so it is a set of m vectors that spans C^m. By Theorem SSLD, any set of more than m vectors from C^m will be linearly dependent. But this is exactly the conclusion we have in Theorem MVSLD. Maybe this is not a total shock, as the proofs of both theorems rely heavily on Theorem HMVEI. The beauty of Theorem SSLD is that it applies in any vector space. We illustrate the generality of this theorem, and hint at its power, in the next example.
Example LDP4 Linearly dependent set in P_4
In Example SSP4 we showed that

S = { x − 2, x^2 − 4x + 4, x^3 − 6x^2 + 12x − 8, x^4 − 8x^3 + 24x^2 − 32x + 16 }

is a spanning set for W = {p(x) | p ∈ P_4, p(2) = 0}. So we can apply Theorem SSLD to W with t = 4. Here is a set of five vectors from W, as you may check by verifying that each is a polynomial of degree 4 or less and has x = 2 as a root,

T = {p_1, p_2, p_3, p_4, p_5} ⊆ W

p_1 = x^4 − 2x^3 + 2x^2 − 8x + 8
p_2 = −x^3 + 6x^2 − 5x − 6
p_3 = 2x^4 − 5x^3 + 5x^2 − 7x + 2
p_4 = −x^4 + 4x^3 − 7x^2 + 6x
p_5 = 4x^3 − 9x^2 + 5x − 6

By Theorem SSLD we conclude that T is linearly dependent, with no further computations. △
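Theorem SSLD needs no computation here, but the conclusion is easy to confirm by machine. The following sketch is not part of the text, and it uses SymPy rather than the Sage system that accompanies this book: write each polynomial of T as its coefficient vector and check that the resulting matrix of five columns has rank at most 4, so the five vectors must be linearly dependent.

```python
from sympy import Matrix

# Coefficient vectors of p_1, ..., p_5, constant term first:
# e.g. p_1 = x^4 - 2x^3 + 2x^2 - 8x + 8 becomes [8, -8, 2, -2, 1].
coeffs = [
    [8, -8, 2, -2, 1],   # p_1
    [-6, -5, 6, -1, 0],  # p_2
    [2, -7, 5, -5, 2],   # p_3
    [0, 6, -7, 4, -1],   # p_4
    [-6, 5, -9, 4, 0],   # p_5
]

# Each polynomial has x = 2 as a root, so each lies in W.
for c in coeffs:
    assert sum(ck * 2**k for k, ck in enumerate(c)) == 0

M = Matrix(coeffs).T     # columns are the five polynomials
assert M.rank() <= 4     # five vectors, rank at most 4: T is linearly dependent
```

Since W is spanned by the four polynomials of S, the rank can never exceed t = 4, which is exactly the content of Theorem SSLD.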
Theorem SSLD is indeed powerful, but our main purpose in proving it right now was to make sure that our definition of dimension (Definition D) is well-defined. Here is the theorem.

Theorem BIS Bases have Identical Sizes
Suppose that V is a vector space with a finite basis B and a second basis C. Then B and C have the same size.

Proof. Suppose that C has more vectors than B. (Allowing for the possibility that C is infinite, we can replace C by a subset that has more vectors than B.) As a basis, B is a spanning set for V (Definition B), so Theorem SSLD says that C is linearly dependent. However, this contradicts the fact that as a basis C is linearly independent (Definition B). So C must also be a finite set, with size less than, or equal to, that of B.

Suppose that B has more vectors than C. As a basis, C is a spanning set for V (Definition B), so Theorem SSLD says that B is linearly dependent. However, this contradicts the fact that as a basis B is linearly independent (Definition B). So C cannot be strictly smaller than B.

The only possibility left for the sizes of B and C is for them to be equal. ■

Theorem BIS tells us that if we find one finite basis in a vector space, then all of its bases have the same size. This (finally) makes Definition D unambiguous.
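Theorem BIS can be watched in action on a concrete subspace. In this sketch (again SymPy, not part of the text), two different bases for the same plane in C^3 are produced by two different procedures, and both have exactly two vectors.

```python
from sympy import Matrix, GramSchmidt

# Three vectors that span a plane in C^3 (the third is the sum of the first two).
A = Matrix([[1, 0, 1],
            [0, 1, 1],
            [1, 1, 2]]).T        # columns are the spanning vectors

basis1 = A.columnspace()         # one basis, built from the pivot columns
basis2 = GramSchmidt(basis1)     # a different basis, built by orthogonalization

# Different bases, but Theorem BIS guarantees identical sizes.
assert len(basis1) == len(basis2) == 2
```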
Subsection DVS
Dimension of Vector Spaces

We can now collect the dimension of some common, and not so common, vector spaces.

Theorem DCM Dimension of C^m
The dimension of C^m (Example VSCV) is m.

Proof. Theorem SUVB provides a basis with m vectors. ■

Theorem DP Dimension of P_n
The dimension of P_n (Example VSP) is n + 1.

Proof. Example BP provides two bases with n + 1 vectors. Take your pick. ■

Theorem DM Dimension of M_mn
The dimension of M_mn (Example VSM) is mn.

Proof. Example BM provides a basis with mn vectors. ■
Example DSM22 Dimension of a subspace of M_22
It should now be plausible that

Z = { [a b; c d] | 2a + b + 3c + 4d = 0, −a + 3b − 5c − 2d = 0 }

is a subspace of the vector space M_22 (Example VSM). (It is.) To find the dimension of Z we must first find a basis, though any old basis will do.

First concentrate on the conditions relating a, b, c and d. They form a homogeneous system of two equations in four variables with coefficient matrix

[  2 1  3  4 ]
[ −1 3 −5 −2 ]

We can row-reduce this matrix to obtain

[ 1 0  2 2 ]
[ 0 1 −1 0 ]

Rewrite the two equations represented by each row of this matrix, expressing the dependent variables (a and b) in terms of the free variables (c and d), and we obtain,

a = −2c − 2d
b = c
We can now write a typical entry of Z strictly in terms of c and d, and we can decompose the result,

[a b; c d] = [−2c − 2d, c; c, d] = [−2c, c; c, 0] + [−2d, 0; 0, d] = c[−2 1; 1 0] + d[−2 0; 0 1]

This equation says that an arbitrary matrix in Z can be written as a linear combination of the two vectors in

S = { [−2 1; 1 0], [−2 0; 0 1] }

so we know that

Z = 〈S〉 = 〈{ [−2 1; 1 0], [−2 0; 0 1] }〉

Are these two matrices (vectors) also linearly independent? Begin with a relation of linear dependence on S,

a_1 [−2 1; 1 0] + a_2 [−2 0; 0 1] = O
[−2a_1 − 2a_2, a_1; a_1, a_2] = [0 0; 0 0]

From the equality of the two entries in the last row, we conclude that a_1 = 0, a_2 = 0. Thus the only possible relation of linear dependence is the trivial one, and therefore S is linearly independent (Definition LI). So S is a basis for Z (Definition B). Finally, we can conclude that dim(Z) = 2 (Definition D) since S has two elements. △
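The dimension found in Example DSM22 can be double-checked mechanically. The sketch below (SymPy, not part of the text) feeds the two defining conditions into a null space computation; each basis vector (a, b, c, d) of the null space yields one matrix in a basis of Z, so dim(Z) equals the nullity of the coefficient matrix.

```python
from sympy import Matrix

# Coefficient matrix of the conditions
# 2a + b + 3c + 4d = 0  and  -a + 3b - 5c - 2d = 0.
C = Matrix([[2, 1, 3, 4],
            [-1, 3, -5, -2]])

null_basis = C.nullspace()   # basis of solution vectors (a, b, c, d)
assert len(null_basis) == 2  # dim(Z) = 2, agreeing with the example
```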
Example DSP4 Dimension of a subspace of P_4
In Example BSP4 we showed that

S = { x − 2, x^2 − 4x + 4, x^3 − 6x^2 + 12x − 8, x^4 − 8x^3 + 24x^2 − 32x + 16 }

is a basis for W = {p(x) | p ∈ P_4, p(2) = 0}. Thus, the dimension of W is four, dim(W) = 4.

Note that dim(P_4) = 5 by Theorem DP, so W is a subspace of dimension 4 within the vector space P_4 of dimension 5, illustrating the upcoming Theorem PSSD. △

Example DC Dimension of the crazy vector space
In Example BC we determined that the set R = {(1, 0), (6, 3)} from the crazy vector space, C (Example CVS), is a basis for C. By Definition D we see that C has dimension 2, dim(C) = 2. △
It is possible for a vector space to have no finite bases, in which case we say it has infinite dimension. Many of the best examples of this are vector spaces of functions, which lead to constructions like Hilbert spaces. We will focus exclusively on finite-dimensional vector spaces. OK, one infinite-dimensional example, and then we will focus exclusively on finite-dimensional vector spaces.

Example VSPUD Vector space of polynomials with unbounded degree
Define the set P by

P = { p | p(x) is a polynomial in x }

Our operations will be the same as those defined for P_n (Example VSP).

With no restrictions on the possible degrees of our polynomials, any finite set that is a candidate for spanning P will come up short. We will give a proof by contradiction (Proof Technique CD). To this end, suppose that the dimension of P is finite, say dim(P) = n.

The set T = { 1, x, x^2, ..., x^n } is a linearly independent set (check this!) containing n + 1 polynomials from P. However, a basis of P will be a spanning set of P containing n vectors. This situation is a contradiction of Theorem SSLD, so our assumption that P has finite dimension is false. Thus, we say dim(P) = ∞. △
Subsection RNM
Rank and Nullity of a Matrix

For any matrix, we have seen that we can associate several subspaces — the null space (Theorem NSMS), the column space (Theorem CSMS), the row space (Theorem RSMS) and the left null space (Theorem LNSMS). As vector spaces, each of these has a dimension, and for the null space and column space, they are important enough to warrant names.

Definition NOM Nullity Of a Matrix
Suppose that A is an m × n matrix. Then the nullity of A is the dimension of the null space of A, n(A) = dim(N(A)). □

Definition ROM Rank Of a Matrix
Suppose that A is an m × n matrix. Then the rank of A is the dimension of the column space of A, r(A) = dim(C(A)). □
Example RNM Rank and nullity of a matrix
Let us compute the rank and nullity of

     [  2 −4 −1  3  2  1 −4 ]
     [  1 −2  0  0  4  0  1 ]
A =  [ −2  4  1  0 −5 −4 −8 ]
     [  1 −2  1  1  6  1 −3 ]
     [  2 −4 −1  1  4 −2 −1 ]
     [ −1  2  3 −1  6  3 −1 ]

To do this, we will first row-reduce the matrix since that will help us determine
bases for the null space and column space.

[ 1 −2 0 0  4 0  1 ]
[ 0  0 1 0  3 0 −2 ]
[ 0  0 0 1 −1 0 −3 ]
[ 0  0 0 0  0 1  1 ]
[ 0  0 0 0  0 0  0 ]
[ 0  0 0 0  0 0  0 ]

From this row-equivalent matrix in reduced row-echelon form we record D = {1, 3, 4, 6} and F = {2, 5, 7}.

For each index in D, Theorem BCS creates a single basis vector. In total the basis will have 4 vectors, so the column space of A will have dimension 4 and we write r(A) = 4.

For each index in F, Theorem BNS creates a single basis vector. In total the basis will have 3 vectors, so the null space of A will have dimension 3 and we write n(A) = 3. △
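The whole computation in Example RNM can be reproduced in a few lines. This sketch uses SymPy (not part of the text, which works in Sage); `rref()` returns both the reduced row-echelon form and the 0-based indices of the pivot columns, so the set D = {1, 3, 4, 6} appears as (0, 2, 3, 5).

```python
from sympy import Matrix

A = Matrix([
    [ 2, -4, -1,  3,  2,  1, -4],
    [ 1, -2,  0,  0,  4,  0,  1],
    [-2,  4,  1,  0, -5, -4, -8],
    [ 1, -2,  1,  1,  6,  1, -3],
    [ 2, -4, -1,  1,  4, -2, -1],
    [-1,  2,  3, -1,  6,  3, -1],
])

B, pivots = A.rref()         # reduced row-echelon form and pivot-column indices
rank = len(pivots)           # r(A), by Theorem CRN
nullity = A.cols - rank      # n(A), by Theorem CRN (equivalently Theorem RPNC)

assert pivots == (0, 2, 3, 5)        # the set D, 0-indexed
assert rank == 4 and nullity == 3    # matching the example
```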
There were no accidents or coincidences in the previous example — with the row-reduced version of a matrix in hand, the rank and nullity are easy to compute.

Theorem CRN Computing Rank and Nullity
Suppose that A is an m × n matrix and B is a row-equivalent matrix in reduced row-echelon form. Let r denote the number of pivot columns (or the number of nonzero rows). Then r(A) = r and n(A) = n − r.

Proof. Theorem BCS provides a basis for the column space by choosing columns of A that have the same indices as the pivot columns of B. In the analysis of B, each leading 1 provides one nonzero row and one pivot column. So there are r column vectors in a basis for C(A).

Theorem BNS provides a basis for the null space by creating basis vectors of the null space of A from entries of B, one basis vector for each column that is not a pivot column. So there are n − r column vectors in a basis for N(A). ■
Every archetype (Archetypes) that involves a matrix lists its rank and nullity. You may have noticed as you studied the archetypes that the larger the column space is, the smaller the null space is. A simple corollary states this trade-off succinctly. (See Proof Technique LC.)

Theorem RPNC Rank Plus Nullity is Columns
Suppose that A is an m × n matrix. Then r(A) + n(A) = n.

Proof. Let r be the number of nonzero rows in a row-equivalent matrix in reduced row-echelon form. By Theorem CRN,

r(A) + n(A) = r + (n − r) = n ■

When we first introduced r as our standard notation for the number of nonzero rows in a matrix in reduced row-echelon form you might have thought r stood for "rows." Not really — it stands for "rank"!
Subsection RNNM
Rank and Nullity of a Nonsingular Matrix

Let us take a look at the rank and nullity of a square matrix.

Example RNSM Rank and nullity of a square matrix
The matrix

     [  0  4 −1  2  2  3  1 ]
     [  2 −2  1 −1  0 −4 −3 ]
     [ −2 −3  9 −3  9 −1  9 ]
E =  [ −3 −4  9  4 −1  6 −2 ]
     [ −3 −4  6 −2  5  9 −4 ]
     [  9 −3  8 −2 −4  2  4 ]
     [  8  2  2  9  3  0  9 ]

is row-equivalent to the matrix in reduced row-echelon form,

[ 1 0 0 0 0 0 0 ]
[ 0 1 0 0 0 0 0 ]
[ 0 0 1 0 0 0 0 ]
[ 0 0 0 1 0 0 0 ]
[ 0 0 0 0 1 0 0 ]
[ 0 0 0 0 0 1 0 ]
[ 0 0 0 0 0 0 1 ]

With n = 7 columns and r = 7 nonzero rows, Theorem CRN tells us the rank is r(E) = 7 and the nullity is n(E) = 7 − 7 = 0. △
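As a check on Example RNSM, the claimed reduced row-echelon form can be confirmed by machine. This sketch is not part of the text and uses SymPy in place of the book's Sage:

```python
from sympy import Matrix, eye

E = Matrix([
    [ 0,  4, -1,  2,  2,  3,  1],
    [ 2, -2,  1, -1,  0, -4, -3],
    [-2, -3,  9, -3,  9, -1,  9],
    [-3, -4,  9,  4, -1,  6, -2],
    [-3, -4,  6, -2,  5,  9, -4],
    [ 9, -3,  8, -2, -4,  2,  4],
    [ 8,  2,  2,  9,  3,  0,  9],
])

B, pivots = E.rref()
assert B == eye(7)       # row-equivalent to the identity matrix
assert len(pivots) == 7  # r(E) = 7, so n(E) = 7 - 7 = 0
```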
The value of either the nullity or the rank is enough to characterize a nonsingular matrix.

Theorem RNNM Rank and Nullity of a Nonsingular Matrix
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.
2. The rank of A is n, r(A) = n.
3. The nullity of A is zero, n(A) = 0.

Proof. (1 ⇒ 2) Theorem CSNM says that if A is nonsingular then C(A) = C^n. If C(A) = C^n, then the column space has dimension n by Theorem DCM, so the rank of A is n.

(2 ⇒ 3) Suppose r(A) = n. Then Theorem RPNC gives

n(A) = n − r(A)        Theorem RPNC
     = n − n           Hypothesis
     = 0

(3 ⇒ 1) Suppose n(A) = 0, so a basis for the null space of A is the empty set. This implies that N(A) = {0}, and Theorem NMTNS says A is nonsingular. ■
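The three equivalent conditions of Theorem RNNM can be seen side by side on a small example and a small counterexample (a SymPy sketch, not part of the text):

```python
from sympy import Matrix

A = Matrix([[1, 2], [3, 5]])   # det(A) = -1: nonsingular
B = Matrix([[1, 2], [2, 4]])   # det(B) = 0: singular

# Nonsingular: full rank and zero nullity.
assert A.rank() == 2 and len(A.nullspace()) == 0

# Singular: the rank falls short and the nullity picks up the difference.
assert B.rank() == 1 and len(B.nullspace()) == 1
```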
With a new equivalence for a nonsingular matrix, we can update our list of equivalences (Theorem NME5), which now becomes a list requiring double digits to number.
Theorem NME6 Nonsingular Matrix Equivalences, Round 6
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A are a linearly independent set.
6. A is invertible.
7. The column space of A is C^n, C(A) = C^n.
8. The columns of A are a basis for C^n.
9. The rank of A is n, r(A) = n.
10. The nullity of A is zero, n(A) = 0.

Proof. Building on Theorem NME5 we can add two of the statements from Theorem RNNM. ■
Reading Questions

1. What is the dimension of the vector space P_6, the set of all polynomials of degree 6 or less?
2. How are the rank and nullity of a matrix related?
3. Explain why we might say that a nonsingular matrix has "full rank."
Exercises

C20 The archetypes listed below are matrices, or systems of equations with coefficient matrices. For each, compute the nullity and rank of the matrix. This information is listed for each archetype (along with the number of columns in the matrix, so as to illustrate Theorem RPNC), and notice how it could have been computed immediately after the determination of the sets D and F associated with the reduced row-echelon form of the matrix.

Archetype A, Archetype B, Archetype C, Archetype D/Archetype E, Archetype F, Archetype G/Archetype H, Archetype I, Archetype J, Archetype K, Archetype L
C21† Find the dimension of the subspace

W = { [a + b; a + c; a + d; d] | a, b, c, d ∈ C }

of C^4.
C22† Find the dimension of the subspace W = { a + bx + cx^2 + dx^3 | a + b + c + d = 0 } of P_3.

C23† Find the dimension of the subspace W = { [a b; c d] | a + b = c, b + c = d, c + d = a } of M_22.
C30† For the matrix A below, compute the dimension of the null space of A, dim(N(A)).

     [ 2 −1 −3 11  9 ]
A =  [ 1  2  1 −7 −3 ]
     [ 3  1 −3  6  8 ]
     [ 2  1  2 −5 −3 ]
C31† The set W below is a subspace of C^4. Find the dimension of W.

W = 〈{ [2; −3; 4; 1], [3; 0; 1; −2], [−4; −3; 2; 5] }〉
C35† Find the rank and nullity of the matrix

     [  1 0 1 ]
     [  1 2 2 ]
A =  [  2 1 1 ]
     [ −1 0 1 ]
     [  1 1 2 ]

C36† Find the rank and nullity of the matrix

     [ 1 2 1 1 1 ]
A =  [ 1 3 2 0 4 ]
     [ 1 2 1 1 1 ]

C37† Find the rank and nullity of the matrix

     [  3 2 1 1  1 ]
     [  2 3 0 1  1 ]
A =  [ −1 1 2 1  0 ]
     [  1 1 0 1  1 ]
     [  0 1 1 2 −1 ]
C40 In Example LDP4 we determined that the set of five polynomials, T, is linearly dependent by a simple invocation of Theorem SSLD. Prove that T is linearly dependent from scratch, beginning with Definition LI.
M20† M_22 is the vector space of 2 × 2 matrices. Let S_22 denote the set of all 2 × 2 symmetric matrices. That is,

S_22 = { A ∈ M_22 | A^t = A }

1. Show that S_22 is a subspace of M_22.
2. Exhibit a basis for S_22 and prove that it has the required properties.
3. What is the dimension of S_22?
M21† A 2 × 2 matrix B is upper triangular if [B]_21 = 0. Let UT_2 be the set of all 2 × 2 upper triangular matrices. Then UT_2 is a subspace of the vector space of all 2 × 2 matrices, M_22 (you may assume this). Determine the dimension of UT_2, providing all of the necessary justifications for your answer.
Section PD
Properties of Dimension

Once the dimension of a vector space is known, then the determination of whether or not a set of vectors is linearly independent, or if it spans the vector space, can often be much easier. In this section we will state a workhorse theorem and then apply it to the column space and row space of a matrix. It will also help us describe a super-basis for C^m.
Subsection GT
Goldilocks' Theorem
We begin with a useful theorem that we will need later, and in the proof of the main theorem in this subsection. This theorem says that we can extend linearly independent sets, one vector at a time, by adding vectors from outside the span of the linearly independent set, all the while preserving the linear independence of the set.
Theorem ELIS Extending Linearly Independent Sets
Suppose V is a vector space and S is a linearly independent set of vectors from V. Suppose w is a vector such that w ∉ ⟨S⟩. Then the set S′ = S ∪ {w} is linearly independent.
Proof. Suppose S = {v_1, v_2, v_3, ..., v_m} and begin with a relation of linear dependence on S′,
$$a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_m v_m + a_{m+1} w = 0.$$
There are two cases to consider. First suppose that a_{m+1} = 0. Then the relation of linear dependence on S′ becomes
$$a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_m v_m = 0$$
and by the linear independence of the set S, we conclude that a_1 = a_2 = a_3 = · · · = a_m = 0. So all of the scalars in the relation of linear dependence on S′ are zero.
In the second case, suppose that a_{m+1} ≠ 0. Then the relation of linear dependence on S′ becomes
$$a_{m+1} w = -a_1 v_1 - a_2 v_2 - a_3 v_3 - \cdots - a_m v_m$$
$$w = -\frac{a_1}{a_{m+1}} v_1 - \frac{a_2}{a_{m+1}} v_2 - \frac{a_3}{a_{m+1}} v_3 - \cdots - \frac{a_m}{a_{m+1}} v_m$$
This equation expresses w as a linear combination of the vectors in S, contrary to the assumption that w ∉ ⟨S⟩, so this case leads to a contradiction.
The first case yielded only a trivial relation of linear dependence on S′ and the second case led to a contradiction. So S′ is a linearly independent set, since any relation of linear dependence is trivial. ∎
§PD Beezer: A First Course in Linear Algebra 332
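The mechanics of Theorem ELIS can be checked numerically: a set of vectors is linearly independent exactly when the matrix having those vectors as rows has rank equal to the number of vectors. The sketch below is plain Python with exact rational arithmetic (the `rank` helper is ours, not from the text); it extends an independent pair in C^3 by a vector outside its span, and contrasts that with a vector inside the span.

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gauss-Jordan elimination on a list of row-lists."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        m[r], m[piv] = m[piv], m[r]       # move pivot row into place
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

v1, v2 = [1, 0, 0], [0, 1, 0]      # S = {v1, v2}, linearly independent
w_out = [1, 1, 1]                  # outside <S>: its third slot is nonzero
w_in = [2, 3, 0]                   # inside <S>: w_in = 2 v1 + 3 v2

assert rank([v1, v2]) == 2              # S is linearly independent
assert rank([v1, v2, w_out]) == 3       # S' = S ∪ {w} stays independent
assert rank([v1, v2, w_in]) == 2        # a vector from <S> creates dependence
```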
In the story Goldilocks and the Three Bears, the young girl Goldilocks visits the empty house of the three bears while out walking in the woods. One bowl of porridge is too hot, the other too cold, the third is just right. One chair is too hard, one too soft, the third is just right. So it is with sets of vectors — some are too big (linearly dependent), some are too small (they do not span), and some are just right (bases). Here is Goldilocks' Theorem.
Theorem G Goldilocks
Suppose that V is a vector space of dimension t. Let S = {v_1, v_2, v_3, ..., v_m} be a set of vectors from V. Then
1. If m > t, then S is linearly dependent.
2. If m < t, then S does not span V.
3. If m = t and S is linearly independent, then S spans V.
4. If m = t and S spans V, then S is linearly independent.
There is a tension in the construction of a basis. Make a set too big and you will end up with relations of linear dependence among the vectors. Make a set too small and you will not have enough raw material to span the entire vector space. Make a set just the right size (the dimension) and you only need to have linear independence or spanning, and you get the other property for free. These roughly-stated ideas are made precise by Theorem G.
The structure and proof of this theorem also deserve comment. The hypotheses seem innocuous. We presume we know the dimension of the vector space in hand, then we mostly just look at the size of the set S. From this we get big conclusions about spanning and linear independence. Each of the four proofs relies on ultimately contradicting Theorem SSLD, so in a way we could think of this entire theorem as a corollary of Theorem SSLD. (See Proof Technique LC.) The proofs of the third and fourth parts parallel each other in style: introduce w using Theorem ELIS or toss v_k using Theorem DLDS. Then obtain a contradiction to Theorem SSLD.
Theorem G is useful in both concrete examples and as a tool in other proofs. We will use it often to bypass verifying linear independence or spanning.
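For a quick numerical illustration of how Theorem G gets used, treat C^3 as a vector space of dimension t = 3 and test sets of row vectors by rank: a set is independent when its rank equals its size, and spans when its rank equals t. This is a sketch in plain Python with a hand-rolled exact-rank routine; the example vectors are ours, not the book's.

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gauss-Jordan elimination on a list of row-lists."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

t = 3                                              # dimension of C^3
big = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 2, 3]]  # m = 4 > t
small = [[1, 1, 0], [0, 1, 1]]                      # m = 2 < t
just = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]            # m = 3 = t

assert rank(big) < len(big)   # part 1: too big, so linearly dependent
assert rank(small) < t        # part 2: too small, so cannot span
# parts 3 and 4: at exactly the right size, independence and spanning coincide
assert (rank(just) == len(just)) == (rank(just) == t)
```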
Example BPR Bases for P_n, reprised
In Example BP we claimed that
$$B = \{\, 1,\ x,\ x^2,\ x^3,\ \ldots,\ x^n \,\}$$
$$C = \{\, 1,\ 1 + x,\ 1 + x + x^2,\ 1 + x + x^2 + x^3,\ \ldots,\ 1 + x + x^2 + x^3 + \cdots + x^n \,\}$$
were both bases for P_n (Example VSP). Suppose we had first verified that B was a basis, so we would then know that dim(P_n) = n + 1. The size of C is n + 1, the right size to be a basis. We could then verify that C is linearly independent. We would not have to make any special efforts to prove that C spans P_n, since Theorem G would allow us to conclude this property of C directly. Then we would be able to say that C is a basis of P_n also.
△
Example BDM22 Basis by dimension in M22
In Example DSM22 we showed that
$$B = \left\{ \begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$
is a basis for the subspace Z of M22 (Example VSM) given by
$$Z = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \,\middle|\, 2a + b + 3c + 4d = 0,\ -a + 3b - 5c - d = 0 \right\}$$
This tells us that dim(Z) = 2. In this example we will find another basis. We can construct two new matrices in Z by forming linear combinations of the matrices in B.
$$2\begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix} + (-3)\begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix}$$
$$3\begin{bmatrix} -2 & 1 \\ 1 & 0 \end{bmatrix} + 1\begin{bmatrix} -2 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -8 & 3 \\ 3 & 1 \end{bmatrix}$$
Then the set
$$C = \left\{ \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix},\ \begin{bmatrix} -8 & 3 \\ 3 & 1 \end{bmatrix} \right\}$$
has the right size to be a basis of Z. Let us see if it is a linearly independent set.
The relation of linear dependence
$$a_1 \begin{bmatrix} 2 & 2 \\ 2 & -3 \end{bmatrix} + a_2 \begin{bmatrix} -8 & 3 \\ 3 & 1 \end{bmatrix} = O$$
$$\begin{bmatrix} 2a_1 - 8a_2 & 2a_1 + 3a_2 \\ 2a_1 + 3a_2 & -3a_1 + a_2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$
leads to the homogeneous system of equations whose coefficient matrix
$$\begin{bmatrix} 2 & -8 \\ 2 & 3 \\ 2 & 3 \\ -3 & 1 \end{bmatrix}$$
row-reduces to
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$
So with a_1 = a_2 = 0 as the only solution, the set is linearly independent. Now we can apply Theorem G to see that C also spans Z and therefore is a second basis for Z.
△
Example SVP4 Sets of vectors in P4
In Example BSP4 we showed that
$$B = \{\, x - 2,\ x^2 - 4x + 4,\ x^3 - 6x^2 + 12x - 8,\ x^4 - 8x^3 + 24x^2 - 32x + 16 \,\}$$
is a basis for W = {p(x) | p ∈ P4, p(2) = 0}. So dim(W) = 4.
The set
$$\{\, 3x^2 - 5x - 2,\ 2x^2 - 7x + 6,\ x^3 - 2x^2 + x - 2 \,\}$$
is a subset of W (check this) and it happens to be linearly independent (check this, too). However, by Theorem G it cannot span W.
The set
$$\{\, 3x^2 - 5x - 2,\ 2x^2 - 7x + 6,\ x^3 - 2x^2 + x - 2,\ -x^4 + 2x^3 + 5x^2 - 10x,\ x^4 - 16 \,\}$$
is another subset of W (check this) and Theorem G tells us that it must be linearly dependent.
The set
$$\{\, x - 2,\ x^2 - 2x,\ x^3 - 2x^2,\ x^4 - 2x^3 \,\}$$
is a third subset of W (check this) and is linearly independent (check this). Since it has the right size to be a basis, and is linearly independent, Theorem G tells us that it also spans W, and therefore is a basis of W.
△
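The "check this" claims in Example SVP4 are easy to verify mechanically. The sketch below (plain Python; the helper names are ours, not the book's) encodes each polynomial as its coefficient list, confirms membership in W by evaluating at x = 2, and checks independence via the rank of the coefficient matrix.

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gauss-Jordan elimination on a list of row-lists."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def p_at(coeffs, x):
    # coeffs = [c0, c1, c2, c3, c4] represents c0 + c1 x + ... + c4 x^4
    return sum(c * x ** k for k, c in enumerate(coeffs))

trio = [
    [-2, -5, 3, 0, 0],   # 3x^2 - 5x - 2
    [6, -7, 2, 0, 0],    # 2x^2 - 7x + 6
    [-2, 1, -2, 1, 0],   # x^3 - 2x^2 + x - 2
]
assert all(p_at(p, 2) == 0 for p in trio)  # each polynomial lies in W
assert rank(trio) == 3                     # the trio is linearly independent
assert rank(trio) < 4                      # three vectors cannot span the 4-dimensional W
```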
A simple consequence of Theorem G is the observation that a proper subspace has strictly smaller dimension than its parent vector space. Hopefully this seems intuitively obvious, but it still requires proof, and we will cite this result later.
Theorem PSSD Proper Subspaces have Smaller Dimension
Suppose that U and V are subspaces of the vector space W, such that U ⊊ V. Then dim(U) < dim(V).
Proof. Suppose that dim(U) = m and dim(V) = t. Then U has a basis B of size m. If m > t, then by Theorem G, B is linearly dependent, which is a contradiction. If m = t, then by Theorem G, B spans V. Then U = ⟨B⟩ = V, also a contradiction. The only possibility that remains is m < t, which is the desired conclusion. ∎
Theorem RMRT Rank of a Matrix is the Rank of the Transpose
Suppose A is an m × n matrix. Then r(A) = r(A^t).
Proof. Suppose we row-reduce A to the matrix B in reduced row-echelon form, and B has r nonzero rows. The quantity r tells us three things about B: the number of leading 1's, the number of nonzero rows and the number of pivot columns. For this proof we will be interested in the latter two.
Theorem BRS and Theorem BCS each has a conclusion that provides a basis, for the row space and the column space, respectively. In each case, these bases contain r vectors. This observation makes the following go.
$$\begin{aligned}
r(A) &= \dim(C(A)) && \text{Definition ROM} \\
&= r && \text{Theorem BCS} \\
&= \dim(R(A)) && \text{Theorem BRS} \\
&= \dim(C(A^t)) && \text{Theorem CSRST} \\
&= r(A^t) && \text{Definition ROM}
\end{aligned}$$
Jacob Linenthal helped with this proof. ∎
This says that the row space and the column space of a matrix have the same dimension, which should be very surprising. It does not say that the column space and the row space are identical. Indeed, if the matrix is not square, then the sizes (number of slots) of the vectors in each space are different, so the sets are not even comparable.
It is not hard to construct by yourself examples of matrices that illustrate Theorem RMRT, since it applies equally well to any matrix. Grab a matrix, row-reduce it, count the nonzero rows or the number of pivot columns. That is the rank. Transpose the matrix, row-reduce that, count the nonzero rows or the pivot columns. That is the rank of the transpose. The theorem says the two will be equal. Every time. Here is an example anyway.
Example RRTI Rank, rank of transpose, Archetype I
Archetype I has a 4 × 7 coefficient matrix which row-reduces to
$$\begin{bmatrix} 1 & 4 & 0 & 0 & 2 & 1 & -3 \\ 0 & 0 & 1 & 0 & 1 & -3 & 5 \\ 0 & 0 & 0 & 1 & 2 & -6 & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
so the rank is 3. Row-reducing the transpose yields
$$\begin{bmatrix} 1 & 0 & 0 & -\frac{31}{7} \\ 0 & 1 & 0 & \frac{12}{7} \\ 0 & 0 & 1 & \frac{13}{7} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
demonstrating that the rank of the transpose is also 3.
△
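Theorem RMRT invites exactly the experiment described above: row-reduce a matrix and its transpose and compare ranks. Here is a sketch in plain Python with exact arithmetic; the matrix is our own small example, not one of the book's archetypes.

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gauss-Jordan elimination on a list of row-lists."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2, 0, 3],
     [2, 4, 1, 7],    # row 2 = 2*(row 1) + row 3, so the rank is 2
     [0, 0, 1, 1]]
At = [list(col) for col in zip(*A)]   # transpose

assert rank(A) == rank(At) == 2       # Theorem RMRT in action
```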
Subsection DFS
Dimension of Four Subspaces
That the rank of a matrix equals the rank of its transpose is a fundamental and surprising result. However, applying Theorem FS we can easily determine the dimension of all four fundamental subspaces associated with a matrix.
Theorem DFS Dimensions of Four Subspaces
Suppose that A is an m × n matrix, and B is a row-equivalent matrix in reduced row-echelon form with r nonzero rows. Then
1. dim(N(A)) = n − r
2. dim(C(A)) = r
3. dim(R(A)) = r
4. dim(L(A)) = m − r
Proof. If A row-reduces to a matrix in reduced row-echelon form with r nonzero rows, then the matrix C of extended echelon form (Definition EEF) will be an r × n matrix in reduced row-echelon form with no zero rows and r pivot columns (Theorem PEEF). Similarly, the matrix L of extended echelon form (Definition EEF) will be an (m − r) × m matrix in reduced row-echelon form with no zero rows and m − r pivot columns (Theorem PEEF).
$$\begin{aligned}
\dim(N(A)) &= \dim(N(C)) && \text{Theorem FS} \\
&= n - r && \text{Theorem BNS}
\end{aligned}$$
$$\begin{aligned}
\dim(C(A)) &= \dim(N(L)) && \text{Theorem FS} \\
&= m - (m - r) && \text{Theorem BNS} \\
&= r
\end{aligned}$$
$$\begin{aligned}
\dim(R(A)) &= \dim(R(C)) && \text{Theorem FS} \\
&= r && \text{Theorem BRS}
\end{aligned}$$
$$\begin{aligned}
\dim(L(A)) &= \dim(R(L)) && \text{Theorem FS} \\
&= m - r && \text{Theorem BRS}
\end{aligned}$$
∎
There are many different ways to state and prove this result, and indeed, the equality of the dimensions of the column space and row space is just a slight expansion of Theorem RMRT. However, we have restricted our techniques to applying Theorem FS and then determining dimensions with bases provided by Theorem BNS and Theorem BRS. This provides an appealing symmetry to the results and the proof.
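Theorem DFS reduces all four dimensions to the single number r. A sketch in plain Python (rank computed by a hand-rolled exact elimination; the matrix is our own example) for a 2 × 3 matrix of rank 1:

```python
from fractions import Fraction

def rank(rows):
    """Rank via exact Gauss-Jordan elimination on a list of row-lists."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, -1, 2],
     [2, -2, 4]]        # second row is twice the first, so r = 1
m, n = len(A), len(A[0])
r = rank(A)

# The four dimensions of Theorem DFS, computed from m, n and r alone.
dims = {"null": n - r, "col": r, "row": r, "left null": m - r}
assert dims == {"null": 2, "col": 1, "row": 1, "left null": 1}
```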
Reading Questions
1. Why does Theorem G have the title it does?
2. Why is Theorem RMRT so surprising?
3. Row-reduce the matrix A to reduced row-echelon form. Without any further computations, compute the dimensions of the four subspaces, (a) N(A), (b) C(A), (c) R(A) and (d) L(A).
$$A = \begin{bmatrix} 1 & -1 & 2 & 8 & 5 \\ 1 & 1 & 1 & 4 & -1 \\ 0 & 2 & -3 & -8 & -6 \\ 2 & 0 & 1 & 8 & 4 \end{bmatrix}$$
Exercises
C10 Example SVP4 leaves several details for the reader to check. Verify these five claims.
C40† Determine if the set T = { x^2 − x + 5, 4x^3 − x^2 + 5x, 3x + 2 } spans the vector space of polynomials with degree 4 or less, P4. (Compare the solution to this exercise with Solution LISS.C40.)
T05 Trivially, if U and V are two subspaces of W with U = V, then dim(U) = dim(V). Combine this fact, Theorem PSSD, and Theorem EDYES all into one grand combined theorem. You might look to Theorem PIP for stylistic inspiration. (Notice this problem does not ask you to prove anything. It just asks you to roll up three theorems into one compact, logically equivalent statement.)
T10 Prove the following theorem, which could be viewed as a reformulation of parts (3) and (4) of Theorem G, or more appropriately as a corollary of Theorem G (Proof Technique LC).
Suppose V is a vector space and S is a subset of V such that the number of vectors in S equals the dimension of V. Then S is linearly independent if and only if S spans V.
T15 Suppose that A is an m × n matrix and let min(m, n) denote the minimum of m and n. Prove that r(A) ≤ min(m, n). (If m and n are two numbers, then min(m, n) stands for the number that is the smaller of the two. For example min(4, 6) = 4.)
T20† Suppose that A is an m × n matrix and b ∈ C^m. Prove that the linear system LS(A, b) is consistent if and only if r(A) = r([A | b]).
T25 Suppose that V is a vector space with finite dimension. Let W be any subspace of V. Prove that W has finite dimension.
T33† Part of Exercise B.T50 is the half of the proof where we assume the matrix A is nonsingular and prove that a set is a basis. In Solution B.T50 we proved directly that the set was both linearly independent and a spanning set. Shorten this part of the proof by applying Theorem G. Be careful, there is one subtlety.
T60† Suppose that W is a vector space with dimension 5, and U and V are subspaces of W, each of dimension 3. Prove that U ∩ V contains a nonzero vector. State a more general result.
Chapter D
Determinants
The determinant is a function that takes a square matrix as an input and produces a scalar as an output. So unlike a vector space, it is not an algebraic structure. However, it has many beneficial properties for studying vector spaces, matrices and systems of equations, so it is hard to ignore (though some have tried). While the properties of a determinant can be very useful, they are also complicated to prove.
Section DM
Determinant of a Matrix
Before we define the determinant of a matrix, we take a slight detour to introduce elementary matrices. These will bring us back to the beginning of the course and our old friend, row operations.
Subsection EM
Elementary Matrices
Elementary matrices are very simple, as you might have suspected from their name. Their purpose is to effect row operations (Definition RO) on a matrix through matrix multiplication (Definition MM). Their definitions look much more complicated than they really are, so be sure to skip over them on your first reading and head right for the explanation that follows and the first example.
Definition ELEM Elementary Matrices
1. For i ≠ j, E_{i,j} is the square matrix of size n with
$$[E_{i,j}]_{kl} = \begin{cases} 0 & k \neq i, k \neq j, l \neq k \\ 1 & k \neq i, k \neq j, l = k \\ 0 & k = i, l \neq j \\ 1 & k = i, l = j \\ 0 & k = j, l \neq i \\ 1 & k = j, l = i \end{cases}$$
2. For α ≠ 0, E_i(α) is the square matrix of size n with
$$[E_{i}(\alpha)]_{kl} = \begin{cases} 0 & l \neq k \\ 1 & k \neq i, l = k \\ \alpha & k = i, l = i \end{cases}$$
3. For i ≠ j, E_{i,j}(α) is the square matrix of size n with
$$[E_{i,j}(\alpha)]_{kl} = \begin{cases} 0 & k \neq j, l \neq k \\ 1 & k \neq j, l = k \\ 0 & k = j, l \neq i, l \neq j \\ 1 & k = j, l = j \\ \alpha & k = j, l = i \end{cases}$$
□
Again, these matrices are not as complicated as their definitions suggest, since they are just small perturbations of the n × n identity matrix (Definition IM). E_{i,j} is the identity matrix with rows (or columns) i and j trading places, E_i(α) is the identity matrix where the diagonal entry in row i and column i has been replaced by α, and E_{i,j}(α) is the identity matrix where the entry in row j and column i has been replaced by α. (Yes, those subscripts look backwards in the description of E_{i,j}(α).) Notice that our notation makes no reference to the size of the elementary matrix, since this will always be apparent from the context, or unimportant.
The raison d'être for elementary matrices is to "do" row operations on matrices with matrix multiplication. So here is an example where we will both see some elementary matrices and see how they accomplish row operations when used with matrix multiplication.
Example EMRO Elementary matrices and row operations
We will perform a sequence of row operations (Definition RO) on the 3 × 4 matrix A, while also multiplying the matrix on the left by the appropriate 3 × 3 elementary matrix.
$$A = \begin{bmatrix} 2 & 1 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 5 & 0 & 3 & 1 \end{bmatrix}$$
R1 ↔ R3:
$$\begin{bmatrix} 5 & 0 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 2 & 1 & 3 & 1 \end{bmatrix} \qquad E_{1,3}: \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 2 & 1 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 5 & 0 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 0 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 2 & 1 & 3 & 1 \end{bmatrix}$$
2R2:
$$\begin{bmatrix} 5 & 0 & 3 & 1 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix} \qquad E_{2}(2): \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 & 3 & 1 \\ 1 & 3 & 2 & 4 \\ 2 & 1 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 0 & 3 & 1 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix}$$
2R3 + R1:
$$\begin{bmatrix} 9 & 2 & 9 & 3 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix} \qquad E_{3,1}(2): \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 & 3 & 1 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 9 & 2 & 9 & 3 \\ 2 & 6 & 4 & 8 \\ 2 & 1 & 3 & 1 \end{bmatrix}$$
△
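Example EMRO can be replayed in code. The sketch below (plain Python; 0-indexed rows, unlike the 1-indexed notation of Definition ELEM) builds the three kinds of elementary matrices and confirms that left-multiplication performs the corresponding row operation.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def E_swap(n, i, j):
    """E_{i,j}: identity with rows i and j trading places (0-indexed)."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def E_scale(n, i, a):
    """E_i(a): identity with the (i, i) entry replaced by a."""
    E = identity(n)
    E[i][i] = a
    return E

def E_addmul(n, i, j, a):
    """E_{i,j}(a): identity with entry (j, i) replaced by a
    (a times row i added to row j)."""
    E = identity(n)
    E[j][i] = a
    return E

A = [[2, 1, 3, 1], [1, 3, 2, 4], [5, 0, 3, 1]]

# R1 <-> R3 via E_{1,3}
B1 = matmul(E_swap(3, 0, 2), A)
assert B1 == [[5, 0, 3, 1], [1, 3, 2, 4], [2, 1, 3, 1]]

# 2 R2 via E_2(2)
B2 = matmul(E_scale(3, 1, 2), B1)
assert B2 == [[5, 0, 3, 1], [2, 6, 4, 8], [2, 1, 3, 1]]

# 2 R3 + R1 via E_{3,1}(2)
B3 = matmul(E_addmul(3, 2, 0, 2), B2)
assert B3 == [[9, 2, 9, 3], [2, 6, 4, 8], [2, 1, 3, 1]]
```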
The next theorem establishes that each elementary matrix effects a row operation via matrix multiplication.
Theorem EMDRO Elementary Matrices Do Row Operations
Suppose that A is an m × n matrix, and B is a matrix of the same size that is obtained from A by a single row operation (Definition RO). Then there is an elementary matrix of size m that will convert A to B via matrix multiplication on the left. More precisely,
1. If the row operation swaps rows i and j, then B = E_{i,j} A.
2. If the row operation multiplies row i by α, then B = E_i(α) A.
3. If the row operation multiplies row i by α and adds the result to row j, then B = E_{i,j}(α) A.
Proof. In each of the three conclusions, performing the row operation on A will create the matrix B where only one or two rows will have changed. So we will establish the equality of the matrix entries row by row, first for the unchanged rows, then for the changed rows, showing in each case that the result of the matrix product is the same as the result of the row operation. Here we go.
Row k of the product E_{i,j} A, where k ≠ i, k ≠ j, is unchanged from A,
$$\begin{aligned}
[E_{i,j}A]_{kl} &= \sum_{p=1}^{n} [E_{i,j}]_{kp}[A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}]_{kk}[A]_{kl} + \sum_{\substack{p=1 \\ p \neq k}}^{n} [E_{i,j}]_{kp}[A]_{pl} && \text{Property CACN} \\
&= 1[A]_{kl} + \sum_{\substack{p=1 \\ p \neq k}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{kl}
\end{aligned}$$
Row i of the product E_{i,j} A is row j of A,
$$\begin{aligned}
[E_{i,j}A]_{il} &= \sum_{p=1}^{n} [E_{i,j}]_{ip}[A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}]_{ij}[A]_{jl} + \sum_{\substack{p=1 \\ p \neq j}}^{n} [E_{i,j}]_{ip}[A]_{pl} && \text{Property CACN} \\
&= 1[A]_{jl} + \sum_{\substack{p=1 \\ p \neq j}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{jl}
\end{aligned}$$
Row j of the product E_{i,j} A is row i of A,
$$\begin{aligned}
[E_{i,j}A]_{jl} &= \sum_{p=1}^{n} [E_{i,j}]_{jp}[A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}]_{ji}[A]_{il} + \sum_{\substack{p=1 \\ p \neq i}}^{n} [E_{i,j}]_{jp}[A]_{pl} && \text{Property CACN} \\
&= 1[A]_{il} + \sum_{\substack{p=1 \\ p \neq i}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{il}
\end{aligned}$$
So the matrix product E_{i,j} A is the same as the row operation that swaps rows i and j.
Row k of the product E_i(α) A, where k ≠ i, is unchanged from A,
$$\begin{aligned}
[E_i(\alpha)A]_{kl} &= \sum_{p=1}^{n} [E_i(\alpha)]_{kp}[A]_{pl} && \text{Theorem EMP} \\
&= [E_i(\alpha)]_{kk}[A]_{kl} + \sum_{\substack{p=1 \\ p \neq k}}^{n} [E_i(\alpha)]_{kp}[A]_{pl} && \text{Property CACN} \\
&= 1[A]_{kl} + \sum_{\substack{p=1 \\ p \neq k}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{kl}
\end{aligned}$$
Row i of the product E_i(α) A is α times row i of A,
$$\begin{aligned}
[E_i(\alpha)A]_{il} &= \sum_{p=1}^{n} [E_i(\alpha)]_{ip}[A]_{pl} && \text{Theorem EMP} \\
&= [E_i(\alpha)]_{ii}[A]_{il} + \sum_{\substack{p=1 \\ p \neq i}}^{n} [E_i(\alpha)]_{ip}[A]_{pl} && \text{Property CACN} \\
&= \alpha[A]_{il} + \sum_{\substack{p=1 \\ p \neq i}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= \alpha[A]_{il}
\end{aligned}$$
So the matrix product E_i(α) A is the same as the row operation that multiplies row i by α.
Row k of the product E_{i,j}(α) A, where k ≠ j, is unchanged from A,
$$\begin{aligned}
[E_{i,j}(\alpha)A]_{kl} &= \sum_{p=1}^{n} [E_{i,j}(\alpha)]_{kp}[A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}(\alpha)]_{kk}[A]_{kl} + \sum_{\substack{p=1 \\ p \neq k}}^{n} [E_{i,j}(\alpha)]_{kp}[A]_{pl} && \text{Property CACN} \\
&= 1[A]_{kl} + \sum_{\substack{p=1 \\ p \neq k}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{kl}
\end{aligned}$$
Row j of the product E_{i,j}(α) A is α times row i of A added to row j of A,
$$\begin{aligned}
[E_{i,j}(\alpha)A]_{jl} &= \sum_{p=1}^{n} [E_{i,j}(\alpha)]_{jp}[A]_{pl} && \text{Theorem EMP} \\
&= [E_{i,j}(\alpha)]_{jj}[A]_{jl} + [E_{i,j}(\alpha)]_{ji}[A]_{il} + \sum_{\substack{p=1 \\ p \neq j,i}}^{n} [E_{i,j}(\alpha)]_{jp}[A]_{pl} && \text{Property CACN} \\
&= 1[A]_{jl} + \alpha[A]_{il} + \sum_{\substack{p=1 \\ p \neq j,i}}^{n} 0[A]_{pl} && \text{Definition ELEM} \\
&= [A]_{jl} + \alpha[A]_{il}
\end{aligned}$$
So the matrix product E_{i,j}(α) A is the same as the row operation that multiplies row i by α and adds the result to row j. ∎
Later in this section we will need two facts about elementary matrices.
Theorem EMN Elementary Matrices are Nonsingular
If E is an elementary matrix, then E is nonsingular.
Proof. We show that we can row-reduce each elementary matrix to the identity matrix. Given an elementary matrix of the form E_{i,j}, perform the row operation that swaps row j with row i. Given an elementary matrix of the form E_i(α), with α ≠ 0, perform the row operation that multiplies row i by 1/α. Given an elementary matrix of the form E_{i,j}(α), with α ≠ 0, perform the row operation that multiplies row i by −α and adds it to row j. In each case, the result of the single row operation is the identity matrix. So each elementary matrix is row-equivalent to the identity matrix, and by Theorem NMRRI is nonsingular. ∎
Notice that we have now made use of the nonzero restriction on α in the definition of E_i(α). One more key property of elementary matrices.
Theorem NMPEM Nonsingular Matrices are Products of Elementary Matrices
Suppose that A is a nonsingular matrix. Then there exist elementary matrices E_1, E_2, E_3, ..., E_t so that A = E_1 E_2 E_3 · · · E_t.
Proof. Since A is nonsingular, it is row-equivalent to the identity matrix by Theorem NMRRI, so there is a sequence of t row operations that converts I to A. For each of these row operations, form the associated elementary matrix from Theorem EMDRO and denote these matrices by E_1, E_2, E_3, ..., E_t. Applying the first row operation to I yields the matrix E_1 I. The second row operation yields E_2 (E_1 I), and the third row operation creates E_3 E_2 E_1 I. The result of the full sequence of t row operations will yield A, so
$$A = E_t \cdots E_3 E_2 E_1 I = E_t \cdots E_3 E_2 E_1$$
Other than the cosmetic matter of re-indexing these elementary matrices in the opposite order, this is the desired result. ∎
Subsection DD
Definition of the Determinant
We will now turn to the definition of a determinant and do some sample computations. The definition of the determinant function is recursive, that is, the determinant of a large matrix is defined in terms of the determinants of smaller matrices. To this end, we will make a few definitions.
Definition SM SubMatrix
Suppose that A is an m × n matrix. Then the submatrix A(i|j) is the (m − 1) × (n − 1) matrix obtained from A by removing row i and column j. □
Example SS Some submatrices
For the matrix
$$A = \begin{bmatrix} 1 & -2 & 3 & 9 \\ 4 & -2 & 0 & 1 \\ 3 & 5 & 2 & 1 \end{bmatrix}$$
we have the submatrices
$$A(2|3) = \begin{bmatrix} 1 & -2 & 9 \\ 3 & 5 & 1 \end{bmatrix} \qquad A(3|1) = \begin{bmatrix} -2 & 3 & 9 \\ -2 & 0 & 1 \end{bmatrix}$$
△
Definition DM Determinant of a Matrix
Suppose A is a square matrix. Then its determinant, det(A) = |A|, is an element of C defined recursively by:
1. If A is a 1 × 1 matrix, then det(A) = [A]_{11}.
2. If A is a matrix of size n with n ≥ 2, then
$$\det(A) = [A]_{11}\det(A(1|1)) - [A]_{12}\det(A(1|2)) + [A]_{13}\det(A(1|3)) - [A]_{14}\det(A(1|4)) + \cdots + (-1)^{n+1}[A]_{1n}\det(A(1|n))$$
□
So to compute the determinant of a 5 × 5 matrix we must build 5 submatrices, each of size 4. To compute the determinants of each of the 4 × 4 matrices we need to create 4 submatrices each, these now of size 3, and so on. To compute the determinant of a 10 × 10 matrix would require computing the determinant of 10! = 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 = 3,628,800 1 × 1 matrices. Fortunately there are better ways. However, this does suggest an excellent computer programming exercise to write a recursive procedure to compute a determinant.
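The suggested programming exercise is short. Here is one sketch of the recursive procedure, following Definition DM's expansion about the first row (rows and columns are 0-indexed in the code, so the alternating sign becomes (−1)^j):

```python
def det(A):
    """Determinant by recursive expansion about the first row (Definition DM)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # A(1|j+1) in the book's 1-indexed notation: drop row 0 and column j
        sub = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(sub)
    return total

# The 3 x 3 matrix of the next example, whose determinant works out to -27.
assert det([[3, 2, -1], [4, 1, 6], [-3, -1, 2]]) == -27
assert det([[1, 2], [3, 4]]) == -2   # matches ad - bc
```

Note that this naive recursion really does the factorial amount of work described above; it is an illustration of the definition, not a practical algorithm for large matrices.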
Let us compute the determinant of a reasonably sized matrix by hand.
Example D33M Determinant of a 3 × 3 matrix
Suppose that we have the 3 × 3 matrix
$$A = \begin{bmatrix} 3 & 2 & -1 \\ 4 & 1 & 6 \\ -3 & -1 & 2 \end{bmatrix}$$
Then
$$\begin{aligned}
\det(A) = |A| &= \begin{vmatrix} 3 & 2 & -1 \\ 4 & 1 & 6 \\ -3 & -1 & 2 \end{vmatrix} \\
&= 3\begin{vmatrix} 1 & 6 \\ -1 & 2 \end{vmatrix} - 2\begin{vmatrix} 4 & 6 \\ -3 & 2 \end{vmatrix} + (-1)\begin{vmatrix} 4 & 1 \\ -3 & -1 \end{vmatrix} \\
&= 3\left(1|2| - 6|-1|\right) - 2\left(4|2| - 6|-3|\right) - \left(4|-1| - 1|-3|\right) \\
&= 3(1(2) - 6(-1)) - 2(4(2) - 6(-3)) - (4(-1) - 1(-3)) \\
&= 24 - 52 + 1 \\
&= -27
\end{aligned}$$
△
In practice it is a bit silly to decompose a 2 × 2 matrix down into a couple of 1 × 1 matrices and then compute the exceedingly easy determinant of these puny matrices. So here is a simple theorem.
Theorem DMST Determinant of Matrices of Size Two
Suppose that $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then $\det(A) = ad - bc$.

Proof. Applying Definition DM,
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = a\,|d| - b\,|c| = ad - bc$$


Do you recall seeing the expression $ad - bc$ before? (Hint: Theorem TTMI)
Subsection CD
Computing Determinants

There are a variety of ways to compute the determinant. We will establish first that we can choose to mimic our definition of the determinant, but by using matrix entries and submatrices based on a row other than the first one.
Theorem DER Determinant Expansion about Rows
Suppose that A is a square matrix of size n. Then for $1 \leq i \leq n$,
$$\det(A) = (-1)^{i+1}[A]_{i1}\det(A(i|1)) + (-1)^{i+2}[A]_{i2}\det(A(i|2)) + (-1)^{i+3}[A]_{i3}\det(A(i|3)) + \cdots + (-1)^{i+n}[A]_{in}\det(A(i|n))$$
which is known as expansion about row i.<br />
Proof. First, the statement of the theorem coincides with Definition DM when i = 1, so throughout, we need only consider i > 1.
Given the recursive definition of the determinant, it should be no surprise that we will use induction for this proof (Proof Technique I). When n = 1, there is nothing to prove since there is but one row. When n = 2, we just examine expansion about the second row,
\begin{align*}
(-1)^{2+1}&[A]_{21}\det(A(2|1)) + (-1)^{2+2}[A]_{22}\det(A(2|2))\\
&= -[A]_{21}[A]_{12} + [A]_{22}[A]_{11} && \text{Definition DM}\\
&= [A]_{11}[A]_{22} - [A]_{12}[A]_{21}\\
&= \det(A) && \text{Theorem DMST}
\end{align*}
So the theorem is true for matrices of size n = 1 and n = 2. Now assume the result is true for all matrices of size n − 1 as we derive an expression for expansion about row i for a matrix of size n. We will abuse our notation for a submatrix slightly, so $A(i_1, i_2|j_1, j_2)$ will denote the matrix formed by removing rows $i_1$ and $i_2$, along with removing columns $j_1$ and $j_2$. Also, as we take a determinant of a submatrix, we will need to "jump up" the index of summation partway through as we "skip over" a missing column. To do this smoothly we will set
$$\epsilon_{lj} = \begin{cases} 0 & l < j \\ 1 & l > j \end{cases}$$
Now,
\begin{align*}
\det(A) &= \sum_{j=1}^{n} (-1)^{1+j}[A]_{1j}\det(A(1|j)) && \text{Definition DM}\\
&= \sum_{j=1}^{n} (-1)^{1+j}[A]_{1j}\sum_{\substack{1\le l\le n\\ l\ne j}} (-1)^{i-1+l-\epsilon_{lj}}[A]_{il}\det(A(1,i|j,l)) && \text{Induction}\\
&= \sum_{j=1}^{n}\sum_{\substack{1\le l\le n\\ l\ne j}} (-1)^{j+i+l-\epsilon_{lj}}[A]_{1j}[A]_{il}\det(A(1,i|j,l)) && \text{Property DCN}
\end{align*}
\begin{align*}
&= \sum_{l=1}^{n}\sum_{\substack{1\le j\le n\\ j\ne l}} (-1)^{j+i+l-\epsilon_{lj}}[A]_{1j}[A]_{il}\det(A(1,i|j,l)) && \text{Property CACN}\\
&= \sum_{l=1}^{n} (-1)^{i+l}[A]_{il}\sum_{\substack{1\le j\le n\\ j\ne l}} (-1)^{j-\epsilon_{lj}}[A]_{1j}\det(A(1,i|j,l)) && \text{Property DCN}\\
&= \sum_{l=1}^{n} (-1)^{i+l}[A]_{il}\sum_{\substack{1\le j\le n\\ j\ne l}} (-1)^{\epsilon_{lj}+j}[A]_{1j}\det(A(i,1|l,j)) && 2\epsilon_{lj}\text{ is even}\\
&= \sum_{l=1}^{n} (-1)^{i+l}[A]_{il}\det(A(i|l)) && \text{Definition DM}
\end{align*}

We can also obtain a formula that computes a determinant by expansion about a column, but this will be simpler if we first prove a result about the interplay of determinants and transposes. Notice how the following proof makes use of the ability to compute a determinant by expanding about any row.
Theorem DT Determinant of the Transpose
Suppose that A is a square matrix. Then $\det(A^t) = \det(A)$.

Proof. With our definition of the determinant (Definition DM) and theorems like Theorem DER, using induction (Proof Technique I) is a natural approach to proving properties of determinants. And so it is here. Let n be the size of the matrix A, and we will use induction on n.
For n = 1, the transpose of a matrix is identical to the original matrix, so trivially, the determinants are equal.
Now assume the result is true for matrices of size n − 1. Then,
\begin{align*}
\det(A^t) &= \frac{1}{n}\sum_{i=1}^{n}\det(A^t)\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} (-1)^{i+j}[A^t]_{ij}\det\left(A^t(i|j)\right) && \text{Theorem DER}\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} (-1)^{i+j}[A]_{ji}\det\left(A^t(i|j)\right) && \text{Definition TM}\\
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} (-1)^{i+j}[A]_{ji}\det\left((A(j|i))^t\right) && \text{Definition TM}
\end{align*}
\begin{align*}
&= \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} (-1)^{i+j}[A]_{ji}\det(A(j|i)) && \text{Induction Hypothesis}\\
&= \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{n} (-1)^{j+i}[A]_{ji}\det(A(j|i)) && \text{Property CACN}\\
&= \frac{1}{n}\sum_{j=1}^{n}\det(A) && \text{Theorem DER}\\
&= \det(A)
\end{align*}

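Theorem DT is easy to spot-check numerically. Below is a minimal Python sketch (the `det` helper is our own, implementing the recursive first-row expansion of Definition DM) applied to the matrix of Example D33M:

```python
def det(A):
    """Recursive first-row expansion (Definition DM); signs are 0-based here."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] *
               det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def transpose(A):
    """Rows become columns (Definition TM)."""
    return [list(col) for col in zip(*A)]

A = [[3, 2, -1], [4, 1, 6], [-3, -1, 2]]  # the matrix of Example D33M
assert det(A) == det(transpose(A)) == -27
```

One check is not a proof, of course, but running this over many random integer matrices is a quick sanity test for any determinant routine you write.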
Now we can easily get the result that a determinant can be computed by expansion about any column as well.

Theorem DEC Determinant Expansion about Columns
Suppose that A is a square matrix of size n. Then for $1 \leq j \leq n$,
$$\det(A) = (-1)^{1+j}[A]_{1j}\det(A(1|j)) + (-1)^{2+j}[A]_{2j}\det(A(2|j)) + (-1)^{3+j}[A]_{3j}\det(A(3|j)) + \cdots + (-1)^{n+j}[A]_{nj}\det(A(n|j))$$
which is known as expansion about column j.<br />
Proof.
\begin{align*}
\det(A) &= \det(A^t) && \text{Theorem DT}\\
&= \sum_{i=1}^{n} (-1)^{j+i}[A^t]_{ji}\det\left(A^t(j|i)\right) && \text{Theorem DER}\\
&= \sum_{i=1}^{n} (-1)^{j+i}[A^t]_{ji}\det\left((A(i|j))^t\right) && \text{Definition TM}\\
&= \sum_{i=1}^{n} (-1)^{j+i}[A^t]_{ji}\det(A(i|j)) && \text{Theorem DT}\\
&= \sum_{i=1}^{n} (-1)^{i+j}[A]_{ij}\det(A(i|j)) && \text{Definition TM}
\end{align*}

That the determinant of an n × n matrix can be computed in 2n different (albeit similar) ways is nothing short of remarkable. For the doubters among us, we will do an example, computing the determinant of a 4 × 4 matrix in two different ways.
Example TCSD Two computations, same determinant
Let
$$A = \begin{bmatrix} -2 & 3 & 0 & 1 \\ 9 & -2 & 0 & 1 \\ 1 & 3 & -2 & -1 \\ 4 & 1 & 2 & 6 \end{bmatrix}$$
Then expanding about the fourth row (Theorem DER with i = 4) yields,
\begin{align*}
|A| &= (4)(-1)^{4+1}\begin{vmatrix} 3 & 0 & 1 \\ -2 & 0 & 1 \\ 3 & -2 & -1 \end{vmatrix} + (1)(-1)^{4+2}\begin{vmatrix} -2 & 0 & 1 \\ 9 & 0 & 1 \\ 1 & -2 & -1 \end{vmatrix}\\
&\quad + (2)(-1)^{4+3}\begin{vmatrix} -2 & 3 & 1 \\ 9 & -2 & 1 \\ 1 & 3 & -1 \end{vmatrix} + (6)(-1)^{4+4}\begin{vmatrix} -2 & 3 & 0 \\ 9 & -2 & 0 \\ 1 & 3 & -2 \end{vmatrix}\\
&= (-4)(10) + (1)(-22) + (-2)(61) + 6(46) = 92
\end{align*}
Expanding about column 3 (Theorem DEC with j = 3) gives
\begin{align*}
|A| &= (0)(-1)^{1+3}\begin{vmatrix} 9 & -2 & 1 \\ 1 & 3 & -1 \\ 4 & 1 & 6 \end{vmatrix} + (0)(-1)^{2+3}\begin{vmatrix} -2 & 3 & 1 \\ 1 & 3 & -1 \\ 4 & 1 & 6 \end{vmatrix}\\
&\quad + (-2)(-1)^{3+3}\begin{vmatrix} -2 & 3 & 1 \\ 9 & -2 & 1 \\ 4 & 1 & 6 \end{vmatrix} + (2)(-1)^{4+3}\begin{vmatrix} -2 & 3 & 1 \\ 9 & -2 & 1 \\ 1 & 3 & -1 \end{vmatrix}\\
&= 0 + 0 + (-2)(-107) + (-2)(61) = 92
\end{align*}
Notice how much easier the second computation was. By choosing to expand about the third column, we have two entries that are zero, so two 3 × 3 determinants need not be computed at all!
△
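The claim that all 2n expansions agree is also easy to verify mechanically. Here is a hedged Python sketch (function names `minor`, `det_row`, `det_col` are our own) that expands about an arbitrary row or column, per Theorem DER and Theorem DEC, and confirms every choice yields 92 for the matrix of this example:

```python
def minor(A, i, j):
    """A(i|j): remove row i and column j (1-based indices)."""
    return [[A[r][c] for c in range(len(A)) if c != j - 1]
            for r in range(len(A)) if r != i - 1]

def det_row(A, i):
    """Expansion about row i (Theorem DER); submatrices expand about row 1."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i - 1][j - 1] * det_row(minor(A, i, j), 1)
               for j in range(1, len(A) + 1))

def det_col(A, j):
    """Expansion about column j (Theorem DEC); submatrices expand about column 1."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** (i + j) * A[i - 1][j - 1] * det_col(minor(A, i, j), 1)
               for i in range(1, len(A) + 1))

A = [[-2, 3, 0, 1], [9, -2, 0, 1], [1, 3, -2, -1], [4, 1, 2, 6]]
# All 2n = 8 expansions agree on the same determinant:
assert all(det_row(A, i) == 92 for i in range(1, 5))
assert all(det_col(A, j) == 92 for j in range(1, 5))
```

A practical routine would first scan for the row or column with the most zeros, exactly the trick this example exploits by hand.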
When a matrix has all zeros above (or below) the diagonal, exploiting the zeros by expanding about the proper row or column makes computing a determinant insanely easy.
Example DUTM Determinant of an upper triangular matrix
Suppose that
$$T = \begin{bmatrix} 2 & 3 & -1 & 3 & 3 \\ 0 & -1 & 5 & 2 & -1 \\ 0 & 0 & 3 & 9 & 2 \\ 0 & 0 & 0 & -1 & 3 \\ 0 & 0 & 0 & 0 & 5 \end{bmatrix}$$
We will compute the determinant of this 5 × 5 matrix by consistently expanding about the first column for each submatrix that arises and does not have a zero entry multiplying it.
\begin{align*}
\det(T) &= \begin{vmatrix} 2 & 3 & -1 & 3 & 3 \\ 0 & -1 & 5 & 2 & -1 \\ 0 & 0 & 3 & 9 & 2 \\ 0 & 0 & 0 & -1 & 3 \\ 0 & 0 & 0 & 0 & 5 \end{vmatrix}
= 2(-1)^{1+1}\begin{vmatrix} -1 & 5 & 2 & -1 \\ 0 & 3 & 9 & 2 \\ 0 & 0 & -1 & 3 \\ 0 & 0 & 0 & 5 \end{vmatrix}\\
&= 2(-1)(-1)^{1+1}\begin{vmatrix} 3 & 9 & 2 \\ 0 & -1 & 3 \\ 0 & 0 & 5 \end{vmatrix}
= 2(-1)(3)(-1)^{1+1}\begin{vmatrix} -1 & 3 \\ 0 & 5 \end{vmatrix}\\
&= 2(-1)(3)(-1)(-1)^{1+1}|5| = 2(-1)(3)(-1)(5) = 30
\end{align*}
△
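The pattern in this computation holds in general: the determinant of any triangular matrix is the product of its diagonal entries, since each expansion about the first column picks off one diagonal entry. A quick Python check (our own sketch) on the matrix T above:

```python
from math import prod  # available in Python 3.8+

T = [[2, 3, -1, 3, 3],
     [0, -1, 5, 2, -1],
     [0, 0, 3, 9, 2],
     [0, 0, 0, -1, 3],
     [0, 0, 0, 0, 5]]

# For a triangular matrix, the determinant is the product of the diagonal.
diag_product = prod(T[k][k] for k in range(len(T)))
assert diag_product == 30  # agrees with the expansion above
```

This observation is what makes the row-operation approach of the next section so attractive: reduce to triangular form, then multiply down the diagonal.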
When you consult other texts in your study of determinants, you may run into the terms "minor" and "cofactor," especially in a discussion centered on expansion about rows and columns. We have chosen not to make these definitions formally since we have been able to get along without them. However, informally, a minor is a determinant of a submatrix, specifically $\det(A(i|j))$, and is usually referenced as the minor of $[A]_{ij}$. A cofactor is a signed minor, specifically the cofactor of $[A]_{ij}$ is $(-1)^{i+j}\det(A(i|j))$.
Reading Questions

1. Construct the elementary matrix that will effect the row operation $-6R_2 + R_3$ on a 4 × 7 matrix.

2. Compute the determinant of the matrix
$$\begin{bmatrix} 2 & 3 & -1 \\ 3 & 8 & 2 \\ 4 & -1 & -3 \end{bmatrix}$$

3. Compute the determinant of the matrix
$$\begin{bmatrix} 3 & 9 & -2 & 4 & 2 \\ 0 & 1 & 4 & -2 & 7 \\ 0 & 0 & -2 & 5 & 2 \\ 0 & 0 & 0 & -1 & 6 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}$$
Exercises

C21† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 1 & 3 \\ 6 & 2 \end{bmatrix}$$

C22† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$$

C23† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 1 & 3 & 2 \\ 4 & 1 & 3 \\ 1 & 0 & 1 \end{bmatrix}$$

C24† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} -2 & 3 & -2 \\ -4 & -2 & 1 \\ 2 & 4 & 2 \end{bmatrix}$$

C25† Doing the computations by hand, find the determinant of the matrix below.
$$\begin{bmatrix} 3 & -1 & 4 \\ 2 & 5 & 1 \\ 2 & 0 & 6 \end{bmatrix}$$

C26† Doing the computations by hand, find the determinant of the matrix A.
$$A = \begin{bmatrix} 2 & 0 & 3 & 2 \\ 5 & 1 & 2 & 4 \\ 3 & 0 & 1 & 2 \\ 5 & 3 & 2 & 1 \end{bmatrix}$$

C27† Doing the computations by hand, find the determinant of the matrix A.
$$A = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 2 & 2 & -1 & 1 \\ 2 & 1 & 3 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}$$

C28† Doing the computations by hand, find the determinant of the matrix A.
$$A = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 2 & -1 & -1 & 1 \\ 2 & 5 & 3 & 0 \\ 1 & -1 & 0 & 1 \end{bmatrix}$$
C29† Doing the computations by hand, find the determinant of the matrix A.
$$A = \begin{bmatrix} 2 & 3 & 0 & 2 & 1 \\ 0 & 1 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 1 & 2 \end{bmatrix}$$

C30† Doing the computations by hand, find the determinant of the matrix A.
$$A = \begin{bmatrix} 2 & 1 & 1 & 0 & 1 \\ 2 & 1 & 2 & -1 & 1 \\ 0 & 0 & 1 & 2 & 0 \\ 1 & 0 & 3 & 1 & 1 \\ 2 & 1 & 1 & 2 & 1 \end{bmatrix}$$

M10† Find a value of k so that the matrix $A = \begin{bmatrix} 2 & 4 \\ 3 & k \end{bmatrix}$ has $\det(A) = 0$, or explain why it is not possible.

M11† Find a value of k so that the matrix $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 0 & 1 \\ 2 & 3 & k \end{bmatrix}$ has $\det(A) = 0$, or explain why it is not possible.

M15† Given the matrix $B = \begin{bmatrix} 2-x & 1 \\ 4 & 2-x \end{bmatrix}$, find all values of x that are solutions of $\det(B) = 0$.

M16† Given the matrix $B = \begin{bmatrix} 4-x & -4 & -4 \\ 2 & -2-x & -4 \\ 3 & -3 & -4-x \end{bmatrix}$, find all values of x that are solutions of $\det(B) = 0$.

M30 The two matrices below are row-equivalent. How would you confirm this? Since the matrices are row-equivalent, there is a sequence of row operations that converts X into Y, which would be a product of elementary matrices, M, such that MX = Y. Find M. (This approach could be used to find the "9 scalars" of the very early Exercise RREF.M40.)
Hint: Compute the extended echelon form for both matrices, and then use the property from Theorem PEEF that reads B = JA, where A is the original matrix, B is the echelon form of the matrix and J is a nonsingular matrix obtained from the extended echelon form. Combine the two square matrices in the right way to obtain M.
$$X = \begin{bmatrix} -1 & 3 & 1 & -2 & 8 \\ -1 & 3 & 2 & -1 & 4 \\ 2 & -4 & -3 & 2 & -7 \\ -2 & 5 & 3 & -2 & 8 \end{bmatrix} \qquad Y = \begin{bmatrix} -1 & 2 & 2 & 0 & 0 \\ -3 & 6 & 8 & -1 & 1 \\ 0 & 1 & -2 & -2 & 9 \\ -1 & 4 & -3 & -3 & 16 \end{bmatrix}$$
Section PDM
Properties of Determinants of Matrices

We have seen how to compute the determinant of a matrix, and the incredible fact that we can perform expansion about any row or column to make this computation. In this largely theoretical section, we will state and prove several more intriguing properties of determinants. Our main goal will be the two results in Theorem SMZD and Theorem DRMM, but more specifically, we will see how the value of a determinant will allow us to gain insight into the various properties of a square matrix.

Subsection DRO
Determinants and Row Operations

We start easy with a straightforward theorem whose proof presages the style of subsequent proofs in this subsection.

Theorem DZRC Determinant with Zero Row or Column
Suppose that A is a square matrix with a row where every entry is zero, or a column where every entry is zero. Then $\det(A) = 0$.
Proof. Suppose that A is a square matrix of size n and row i has every entry equal to zero. We compute det(A) via expansion about row i.
\begin{align*}
\det(A) &= \sum_{j=1}^{n} (-1)^{i+j}[A]_{ij}\det(A(i|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n} (-1)^{i+j}\,0\,\det(A(i|j)) && \text{Row } i \text{ is all zeros}\\
&= \sum_{j=1}^{n} 0 = 0
\end{align*}
The proof for the case of a zero column is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. 
Theorem DRCS Determinant for Row or Column Swap
Suppose that A is a square matrix. Let B be the square matrix obtained from A by interchanging the location of two rows, or interchanging the location of two columns. Then $\det(B) = -\det(A)$.
Proof. Begin with the special case where A is a square matrix of size n and we form B by swapping adjacent rows i and i + 1 for some $1 \leq i \leq n - 1$. Notice that the assumption about swapping adjacent rows means that $B(i+1|j) = A(i|j)$ for all
§PDM Beezer: A First Course in Linear Algebra 356
$1 \leq j \leq n$, and $[B]_{i+1,j} = [A]_{ij}$ for all $1 \leq j \leq n$. We compute det(B) via expansion about row i + 1.
\begin{align*}
\det(B) &= \sum_{j=1}^{n} (-1)^{(i+1)+j}[B]_{i+1,j}\det(B(i+1|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n} (-1)^{(i+1)+j}[A]_{ij}\det(A(i|j)) && \text{Hypothesis}\\
&= \sum_{j=1}^{n} (-1)^{1}(-1)^{i+j}[A]_{ij}\det(A(i|j))\\
&= (-1)\sum_{j=1}^{n} (-1)^{i+j}[A]_{ij}\det(A(i|j))\\
&= -\det(A) && \text{Theorem DER}
\end{align*}
So the result holds for the special case where we swap adjacent rows of the matrix. As any computer scientist knows, we can accomplish any rearrangement of an ordered list by swapping adjacent elements. This principle can be demonstrated by naïve sorting algorithms such as "bubble sort." In any event, we do not need to discuss every possible reordering, we just need to consider a swap of two rows, say rows s and t with $1 \leq s < t \leq n$.
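The adjacent-swap argument can be made concrete: bubbling row t up into position s and then bubbling the displaced row s down into position t uses $2(t-s)-1$ adjacent swaps, an odd number, so the net effect on the determinant is exactly one sign change. A small Python sketch (the helper name `swap_via_adjacent` is our own) counts the swaps:

```python
def swap_via_adjacent(rows, s, t):
    """Swap entries s and t (1-based, s < t) using only adjacent swaps;
    return the new list and the number of adjacent swaps used."""
    rows = list(rows)
    count = 0
    # bubble entry t upward into position s
    for k in range(t - 1, s - 1, -1):
        rows[k - 1], rows[k] = rows[k], rows[k - 1]
        count += 1
    # bubble the displaced entry (old s) downward into position t
    for k in range(s + 1, t):
        rows[k - 1], rows[k] = rows[k], rows[k - 1]
        count += 1
    return rows, count

rows, count = swap_via_adjacent([1, 2, 3, 4, 5], 2, 5)
assert rows == [1, 5, 3, 4, 2]   # entries 2 and 5 exchanged, others fixed
assert count == 2 * (5 - 2) - 1  # 5 adjacent swaps: always an odd number
```

Since each adjacent swap negates the determinant and the count is odd, the general swap negates it once, which is what the remainder of the proof establishes.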
B by multiplying each entry of row i of A by α. Notice that the other rows of A and B are equal, so $A(i|j) = B(i|j)$, for all $1 \leq j \leq n$. We compute det(B) via expansion about row i.
\begin{align*}
\det(B) &= \sum_{j=1}^{n} (-1)^{i+j}[B]_{ij}\det(B(i|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n} (-1)^{i+j}[B]_{ij}\det(A(i|j)) && \text{Hypothesis}\\
&= \sum_{j=1}^{n} (-1)^{i+j}\alpha[A]_{ij}\det(A(i|j)) && \text{Hypothesis}\\
&= \alpha\sum_{j=1}^{n} (-1)^{i+j}[A]_{ij}\det(A(i|j))\\
&= \alpha\det(A) && \text{Theorem DER}
\end{align*}
The proof for the case of a multiple of a column is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. 
Let us now go for an understanding of the effect of all three row operations. First, though, we need an intermediate result; it is an easy one.

Theorem DERC Determinant with Equal Rows or Columns
Suppose that A is a square matrix with two equal rows, or two equal columns. Then $\det(A) = 0$.
Proof. Suppose that A is a square matrix of size n where the two rows s and t are equal. Form the matrix B by swapping rows s and t. Notice that as a consequence of our hypothesis, A = B. Then
\begin{align*}
\det(A) &= \frac{1}{2}\left(\det(A) + \det(A)\right)\\
&= \frac{1}{2}\left(\det(A) - \det(B)\right) && \text{Theorem DRCS}\\
&= \frac{1}{2}\left(\det(A) - \det(A)\right) && \text{Hypothesis, } A = B\\
&= \frac{1}{2}(0) = 0
\end{align*}
The proof for the case of two equal columns is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. 
Now we can explain the effect of the third row operation. Here we go.
Theorem DRCMA Determinant for Row or Column Multiples and Addition
Suppose that A is a square matrix. Let B be the square matrix obtained from A by multiplying a row by the scalar α and then adding it to another row, or by multiplying a column by the scalar α and then adding it to another column. Then $\det(B) = \det(A)$.
Proof. Suppose that A is a square matrix of size n. Form the matrix B by multiplying row s by α and adding it to row t. Let C be the auxiliary matrix where we replace row t of A by row s of A. Notice that $A(t|j) = B(t|j) = C(t|j)$ for all $1 \leq j \leq n$. We compute the determinant of B by expansion about row t.
\begin{align*}
\det(B) &= \sum_{j=1}^{n} (-1)^{t+j}[B]_{tj}\det(B(t|j)) && \text{Theorem DER}\\
&= \sum_{j=1}^{n} (-1)^{t+j}\left(\alpha[A]_{sj} + [A]_{tj}\right)\det(B(t|j)) && \text{Hypothesis}\\
&= \sum_{j=1}^{n} (-1)^{t+j}\alpha[A]_{sj}\det(B(t|j)) + \sum_{j=1}^{n} (-1)^{t+j}[A]_{tj}\det(B(t|j))\\
&= \alpha\sum_{j=1}^{n} (-1)^{t+j}[A]_{sj}\det(B(t|j)) + \sum_{j=1}^{n} (-1)^{t+j}[A]_{tj}\det(B(t|j))\\
&= \alpha\sum_{j=1}^{n} (-1)^{t+j}[C]_{tj}\det(C(t|j)) + \sum_{j=1}^{n} (-1)^{t+j}[A]_{tj}\det(A(t|j))\\
&= \alpha\det(C) + \det(A) && \text{Theorem DER}\\
&= \alpha\,0 + \det(A) = \det(A) && \text{Theorem DERC}
\end{align*}
The proof for the case of adding a multiple of a column is entirely similar, or could be derived from an application of Theorem DT employing the transpose of the matrix. 
Is this what you expected? We could argue that the third row operation is the most popular, and yet it has no effect whatsoever on the determinant of a matrix! We can exploit this, along with our understanding of the other two row operations, to provide another approach to computing a determinant. We will explain this in the context of an example.
Example DRO Determinant by row operations
Suppose we desire the determinant of the 4 × 4 matrix
$$A = \begin{bmatrix} 2 & 0 & 2 & 3 \\ 1 & 3 & -1 & 1 \\ -1 & 1 & -1 & 2 \\ 3 & 5 & 4 & 0 \end{bmatrix}$$
We will perform a sequence of row operations on this matrix, shooting for an upper triangular matrix, whose determinant will be simply the product of its diagonal entries. For each row operation, we will track the effect on the determinant via Theorem DRCS, Theorem DRCM and Theorem DRCMA.
\begin{align*}
A &= \begin{bmatrix} 2 & 0 & 2 & 3 \\ 1 & 3 & -1 & 1 \\ -1 & 1 & -1 & 2 \\ 3 & 5 & 4 & 0 \end{bmatrix} && \det(A)\\
\xrightarrow{R_1 \leftrightarrow R_2} A_1 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 2 & 0 & 2 & 3 \\ -1 & 1 & -1 & 2 \\ 3 & 5 & 4 & 0 \end{bmatrix} && = -\det(A_1) && \text{Theorem DRCS}\\
\xrightarrow{-2R_1 + R_2} A_2 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & -6 & 4 & 1 \\ -1 & 1 & -1 & 2 \\ 3 & 5 & 4 & 0 \end{bmatrix} && = -\det(A_2) && \text{Theorem DRCMA}\\
\xrightarrow{1R_1 + R_3} A_3 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & -6 & 4 & 1 \\ 0 & 4 & -2 & 3 \\ 3 & 5 & 4 & 0 \end{bmatrix} && = -\det(A_3) && \text{Theorem DRCMA}\\
\xrightarrow{-3R_1 + R_4} A_4 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & -6 & 4 & 1 \\ 0 & 4 & -2 & 3 \\ 0 & -4 & 7 & -3 \end{bmatrix} && = -\det(A_4) && \text{Theorem DRCMA}\\
\xrightarrow{1R_3 + R_2} A_5 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & -2 & 2 & 4 \\ 0 & 4 & -2 & 3 \\ 0 & -4 & 7 & -3 \end{bmatrix} && = -\det(A_5) && \text{Theorem DRCMA}\\
\xrightarrow{-\frac{1}{2}R_2} A_6 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 4 & -2 & 3 \\ 0 & -4 & 7 & -3 \end{bmatrix} && = 2\det(A_6) && \text{Theorem DRCM}\\
\xrightarrow{-4R_2 + R_3} A_7 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 0 & 2 & 11 \\ 0 & -4 & 7 & -3 \end{bmatrix} && = 2\det(A_7) && \text{Theorem DRCMA}\\
\xrightarrow{4R_2 + R_4} A_8 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 0 & 2 & 11 \\ 0 & 0 & 3 & -11 \end{bmatrix} && = 2\det(A_8) && \text{Theorem DRCMA}\\
\xrightarrow{-1R_3 + R_4} A_9 &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 0 & 2 & 11 \\ 0 & 0 & 1 & -22 \end{bmatrix} && = 2\det(A_9) && \text{Theorem DRCMA}\\
\xrightarrow{-2R_4 + R_3} A_{10} &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 0 & 0 & 55 \\ 0 & 0 & 1 & -22 \end{bmatrix} && = 2\det(A_{10}) && \text{Theorem DRCMA}\\
\xrightarrow{R_3 \leftrightarrow R_4} A_{11} &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 0 & 1 & -22 \\ 0 & 0 & 0 & 55 \end{bmatrix} && = -2\det(A_{11}) && \text{Theorem DRCS}\\
\xrightarrow{\frac{1}{55}R_4} A_{12} &= \begin{bmatrix} 1 & 3 & -1 & 1 \\ 0 & 1 & -1 & -2 \\ 0 & 0 & 1 & -22 \\ 0 & 0 & 0 & 1 \end{bmatrix} && = -110\det(A_{12}) && \text{Theorem DRCM}
\end{align*}
The matrix $A_{12}$ is upper triangular, so expansion about the first column (repeatedly) will result in $\det(A_{12}) = (1)(1)(1)(1) = 1$ (see Example DUTM) and thus, $\det(A) = -110(1) = -110$.

Notice that our sequence of row operations was somewhat ad hoc, such as the transformation to $A_5$. We could have been even more methodical, and strictly followed the process that converts a matrix to reduced row-echelon form (Theorem REMEF), eventually achieving the same numerical result with a final matrix that equaled the 4 × 4 identity matrix. Notice too that we could have stopped with $A_8$, since at this point we could compute $\det(A_8)$ by two expansions about first columns, followed by a simple determinant of a 2 × 2 matrix (Theorem DMST).

The beauty of this approach is that computationally we should already have written a procedure to convert matrices to reduced row-echelon form, so all we need to do is track the multiplicative changes to the determinant as the algorithm proceeds. Further, for a square matrix of size n this approach requires on the order of $n^3$ multiplications, while a recursive application of expansion about a row or column (Theorem DER, Theorem DEC) will require in the vicinity of $(n-1)(n!)$ multiplications. So even for very small matrices, a computational approach utilizing row operations will have superior run-time. Tracking, and controlling, the effects of round-off errors is another story, best saved for a numerical linear algebra course. △
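The bookkeeping just described is easy to mechanize. Here is a hedged Python sketch (the function name is our own) of an $O(n^3)$ determinant that reduces to upper triangular form using only row swaps, which flip the sign (Theorem DRCS), and the add-a-multiple operation, which leaves the determinant alone (Theorem DRCMA); exact rational arithmetic sidesteps the round-off issues mentioned above:

```python
def det_by_row_reduction(A):
    """Determinant via Gaussian elimination with exact rational arithmetic.
    Row swaps flip the sign (Theorem DRCS); adding a multiple of one row
    to another leaves the determinant unchanged (Theorem DRCMA)."""
    from fractions import Fraction
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    sign = 1
    for k in range(n):
        # find a nonzero pivot at or below the diagonal in column k
        pivot = next((r for r in range(k, n) if M[r][k] != 0), None)
        if pivot is None:
            return 0  # no pivot available: the matrix is singular
        if pivot != k:
            M[k], M[pivot] = M[pivot], M[k]
            sign = -sign
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            M[r] = [a - factor * b for a, b in zip(M[r], M[k])]
    # upper triangular now: determinant is sign times the diagonal product
    result = Fraction(sign)
    for k in range(n):
        result *= M[k][k]
    return result

A = [[2, 0, 2, 3], [1, 3, -1, 1], [-1, 1, -1, 2], [3, 5, 4, 0]]
assert det_by_row_reduction(A) == -110  # the matrix of Example DRO
```

Unlike the hand computation above, this sketch never rescales a row, so no compensating factors need to be tracked; only the swap sign matters.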
Subsection DROEM
Determinants, Row Operations, Elementary Matrices

As a final preparation for our two most important theorems about determinants, we prove a handful of facts about the interplay of row operations and matrix multiplication with elementary matrices with regard to the determinant. But first, a simple, but crucial, fact about the identity matrix.

Theorem DIM Determinant of the Identity Matrix
For every $n \geq 1$, $\det(I_n) = 1$.
Proof. It may be overkill, but this is a good situation to run through a proof by induction on n (Proof Technique I). Is the result true when n = 1? Yes,
\begin{align*}
\det(I_1) &= [I_1]_{11} && \text{Definition DM}\\
&= 1 && \text{Definition IM}
\end{align*}
Now assume the theorem is true for the identity matrix of size n − 1 and investigate the determinant of the identity matrix of size n with expansion about row 1,
\begin{align*}
\det(I_n) &= \sum_{j=1}^{n} (-1)^{1+j}[I_n]_{1j}\det(I_n(1|j)) && \text{Definition DM}\\
&= (-1)^{1+1}[I_n]_{11}\det(I_n(1|1)) + \sum_{j=2}^{n} (-1)^{1+j}[I_n]_{1j}\det(I_n(1|j))\\
&= 1\det(I_{n-1}) + \sum_{j=2}^{n} (-1)^{1+j}\,0\,\det(I_n(1|j)) && \text{Definition IM}\\
&= 1(1) + \sum_{j=2}^{n} 0 = 1 && \text{Induction Hypothesis}
\end{align*}

Theorem DEM Determinants of Elementary Matrices
For the three possible versions of an elementary matrix (Definition ELEM) we have the determinants,

1. $\det(E_{i,j}) = -1$
2. $\det(E_i(\alpha)) = \alpha$
3. $\det(E_{i,j}(\alpha)) = 1$

Proof. Swapping rows i and j of the identity matrix will create $E_{i,j}$ (Definition ELEM), so
\begin{align*}
\det(E_{i,j}) &= -\det(I_n) && \text{Theorem DRCS}\\
&= -1 && \text{Theorem DIM}
\end{align*}
Multiplying row i of the identity matrix by α will create $E_i(\alpha)$ (Definition ELEM), so
\begin{align*}
\det(E_i(\alpha)) &= \alpha\det(I_n) && \text{Theorem DRCM}\\
&= \alpha(1) = \alpha && \text{Theorem DIM}
\end{align*}
Multiplying row i of the identity matrix by α and adding to row j will create $E_{i,j}(\alpha)$ (Definition ELEM), so
\begin{align*}
\det(E_{i,j}(\alpha)) &= \det(I_n) && \text{Theorem DRCMA}\\
&= 1 && \text{Theorem DIM}
\end{align*}

Theorem DEMMM  Determinants, Elementary Matrices, Matrix Multiplication
Suppose that $A$ is a square matrix of size $n$ and $E$ is any elementary matrix of size $n$. Then
\[
\det(EA) = \det(E)\det(A)
\]

Proof. The proof proceeds in three parts, one for each type of elementary matrix, with each part very similar to the other two.

First, let $B$ be the matrix obtained from $A$ by swapping rows $i$ and $j$,
\begin{align*}
\det(E_{i,j}A) &= \det(B) && \text{Theorem EMDRO}\\
&= -\det(A) && \text{Theorem DRCS}\\
&= \det(E_{i,j})\det(A) && \text{Theorem DEM}
\end{align*}
Second, let $B$ be the matrix obtained from $A$ by multiplying row $i$ by $\alpha$,
\begin{align*}
\det(E_i(\alpha)A) &= \det(B) && \text{Theorem EMDRO}\\
&= \alpha\det(A) && \text{Theorem DRCM}\\
&= \det(E_i(\alpha))\det(A) && \text{Theorem DEM}
\end{align*}
Third, let $B$ be the matrix obtained from $A$ by multiplying row $i$ by $\alpha$ and adding to row $j$,
\begin{align*}
\det(E_{i,j}(\alpha)A) &= \det(B) && \text{Theorem EMDRO}\\
&= \det(A) && \text{Theorem DRCMA}\\
&= \det(E_{i,j}(\alpha))\det(A) && \text{Theorem DEM}
\end{align*}
Since the desired result holds for each variety of elementary matrix individually, we are done.
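A quick sanity check of Theorem DEMMM for $2\times 2$ matrices, using the explicit formula $\det\begin{bmatrix}a&b\\c&d\end{bmatrix} = ad - bc$ (Theorem DMST). The matrix entries are arbitrary picks, not taken from the text.

```python
def det2(M):
    # Theorem DMST: determinant of a 2x2 matrix
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[3, 1], [4, 2]]            # det(A) = 2
E_swap  = [[0, 1], [1, 0]]      # row swap, det = -1
E_scale = [[5, 0], [0, 1]]      # row 1 scaled by 5, det = 5
E_add   = [[1, 0], [7, 1]]      # 7 times row 1 added to row 2, det = 1

# det(EA) = det(E) det(A) for each type of elementary matrix
for E in (E_swap, E_scale, E_add):
    assert det2(matmul2(E, A)) == det2(E) * det2(A)
```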
Subsection DNMMM
Determinants, Nonsingular Matrices, Matrix Multiplication

If you asked someone with substantial experience working with matrices about the value of the determinant, they'd be likely to quote the following theorem as the first thing to come to mind.
Theorem SMZD  Singular Matrices have Zero Determinants
Let $A$ be a square matrix. Then $A$ is singular if and only if $\det(A) = 0$.

Proof. Rather than jumping into the two halves of the equivalence, we first establish a few items. Let $B$ be the unique square matrix that is row-equivalent to $A$ and in reduced row-echelon form (Theorem REMEF, Theorem RREFU). For each of the row operations that converts $B$ into $A$, there is an elementary matrix $E_i$ which effects the row operation by matrix multiplication (Theorem EMDRO). Repeated applications of Theorem EMDRO allow us to write
\[
A = E_s E_{s-1} \cdots E_2 E_1 B
\]
Then
\begin{align*}
\det(A) &= \det(E_s E_{s-1} \cdots E_2 E_1 B)\\
&= \det(E_s)\det(E_{s-1}) \cdots \det(E_2)\det(E_1)\det(B) && \text{Theorem DEMMM}
\end{align*}
From Theorem DEM we can infer that the determinant of an elementary matrix is never zero (note the ban on $\alpha = 0$ for $E_i(\alpha)$ in Definition ELEM). So the product on the right is composed of nonzero scalars, with the possible exception of $\det(B)$. More precisely, we can argue that $\det(A) = 0$ if and only if $\det(B) = 0$. With this established, we can take up the two halves of the equivalence.

($\Rightarrow$) If $A$ is singular, then by Theorem NMRRI, $B$ cannot be the identity matrix. Because (1) the number of pivot columns is equal to the number of nonzero rows, (2) not every column is a pivot column, and (3) $B$ is square, we see that $B$ must have a zero row. By Theorem DZRC the determinant of $B$ is zero, and by the above, we conclude that the determinant of $A$ is zero.

($\Leftarrow$) We will prove the contrapositive (Proof Technique CP). So assume $A$ is nonsingular, then by Theorem NMRRI, $B$ is the identity matrix and Theorem DIM tells us that $\det(B) = 1 \neq 0$. With the argument above, we conclude that the determinant of $A$ is nonzero as well.
For the case of $2\times 2$ matrices you might compare the application of Theorem SMZD with the combination of the results stated in Theorem DMST and Theorem TTMI.

Example ZNDAB  Zero and nonzero determinant, Archetypes A and B
The coefficient matrix in Archetype A has a zero determinant (check this!) while the coefficient matrix in Archetype B has a nonzero determinant (check this, too). These matrices are singular and nonsingular, respectively. This is exactly what Theorem SMZD says, and continues our list of contrasts between these two archetypes. △
In Section MINM we said "singular matrices are a distinct minority." If you built a random matrix and took its determinant, how likely would it be that you got zero?

Since Theorem SMZD is an equivalence (Proof Technique E) we can expand on our growing list of equivalences about nonsingular matrices. The addition of the condition $\det(A) \neq 0$ is one of the best motivations for learning about determinants.
Theorem NME7  Nonsingular Matrix Equivalences, Round 7
Suppose that $A$ is a square matrix of size $n$. The following are equivalent.

1. $A$ is nonsingular.
2. $A$ row-reduces to the identity matrix.
3. The null space of $A$ contains only the zero vector, $\mathcal{N}(A) = \{\mathbf{0}\}$.
4. The linear system $\mathcal{LS}(A,\,\mathbf{b})$ has a unique solution for every possible choice of $\mathbf{b}$.
5. The columns of $A$ are a linearly independent set.
6. $A$ is invertible.
7. The column space of $A$ is $\mathbb{C}^n$, $\mathcal{C}(A) = \mathbb{C}^n$.
8. The columns of $A$ are a basis for $\mathbb{C}^n$.
9. The rank of $A$ is $n$, $r(A) = n$.
10. The nullity of $A$ is zero, $n(A) = 0$.
11. The determinant of $A$ is nonzero, $\det(A) \neq 0$.

Proof. Theorem SMZD says $A$ is singular if and only if $\det(A) = 0$. If we negate each of these statements, we arrive at two contrapositives that we can combine as the equivalence, $A$ is nonsingular if and only if $\det(A) \neq 0$. This allows us to add a new statement to the list found in Theorem NME6.
Computationally, row-reducing a matrix is the most efficient way to determine if a matrix is nonsingular, though the effect of using division in a computer can lead to round-off errors that confuse small quantities with critical zero quantities. Conceptually, the determinant may seem the most efficient way to determine if a matrix is nonsingular. The definition of a determinant uses just addition, subtraction and multiplication, so division is never a problem. And the final test is easy: is the determinant zero or not? However, the number of operations involved in computing a determinant by the definition very quickly becomes so excessive as to be impractical.
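To put a number on "excessive", here is an illustrative count (a sketch, not from the text) of the scalar multiplications used by a full first-row cofactor expansion: an $n\times n$ expansion needs $n$ cofactors, each costing one multiply plus a size-$(n-1)$ determinant, so the count grows like $n!$.

```python
def mults(n):
    """Multiplications used by first-row cofactor expansion of size n."""
    if n == 1:
        return 0
    # n cofactors, each needing one multiply plus a size-(n-1) determinant
    return n * (1 + mults(n - 1))

for n in (2, 5, 10):
    print(n, mults(n))
# 2 takes 2 multiplications, 5 takes 205, 10 already takes 6,235,300
```

By contrast, Gaussian elimination needs on the order of $n^3$ operations, which is why row-reduction wins in practice.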
Now for the coup de grâce. We will generalize Theorem DEMMM to the case of any two square matrices. You may recall thinking that matrix multiplication was defined in a needlessly complicated manner. For sure, the definition of a determinant seems even stranger. (Though Theorem SMZD might be forcing you to reconsider.) Read the statement of the next theorem and contemplate how nicely matrix multiplication and determinants play with each other.
Theorem DRMM  Determinant Respects Matrix Multiplication
Suppose that $A$ and $B$ are square matrices of the same size. Then $\det(AB) = \det(A)\det(B)$.

Proof. This proof is constructed in two cases. First, suppose that $A$ is singular. Then $\det(A) = 0$ by Theorem SMZD. By the contrapositive of Theorem NPNT, $AB$ is singular as well. So by a second application of Theorem SMZD, $\det(AB) = 0$. Putting it all together
\[
\det(AB) = 0 = 0\det(B) = \det(A)\det(B)
\]
as desired.

For the second case, suppose that $A$ is nonsingular. By Theorem NMPEM there are elementary matrices $E_1, E_2, E_3, \ldots, E_s$ such that $A = E_1 E_2 E_3 \cdots E_s$. Then
\begin{align*}
\det(AB) &= \det(E_1 E_2 E_3 \cdots E_s B)\\
&= \det(E_1)\det(E_2)\det(E_3) \cdots \det(E_s)\det(B) && \text{Theorem DEMMM}\\
&= \det(E_1 E_2 E_3 \cdots E_s)\det(B) && \text{Theorem DEMMM}\\
&= \det(A)\det(B)
\end{align*}

It is amazing that matrix multiplication and the determinant interact this way. Might it also be true that $\det(A + B) = \det(A) + \det(B)$? (Exercise PDM.M30)
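A numerical probe of both questions on a pair of $2\times 2$ matrices (arbitrary entries, chosen for illustration, with $\det\begin{bmatrix}a&b\\c&d\end{bmatrix} = ad - bc$ from Theorem DMST): the product rule of Theorem DRMM holds exactly, while the two sides of the sum question need not agree.

```python
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd2(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

A = [[1, 2], [3, 4]]   # det(A) = -2
B = [[0, 1], [5, 6]]   # det(B) = -5

# Theorem DRMM: always holds
assert det2(matmul2(A, B)) == det2(A) * det2(B)

# The sum question: compare the two sides yourself before Exercise PDM.M30
print(det2(matadd2(A, B)), det2(A) + det2(B))
```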
Reading Questions

1. Consider the two matrices below, and suppose you already have computed $\det(A) = -120$. What is $\det(B)$? Why?
\[
A = \begin{bmatrix}
0 & 8 & 3 & -4\\
-1 & 2 & -2 & 5\\
-2 & 8 & 4 & 3\\
0 & -4 & 2 & -3
\end{bmatrix}
\qquad
B = \begin{bmatrix}
0 & 8 & 3 & -4\\
0 & -4 & 2 & -3\\
-2 & 8 & 4 & 3\\
-1 & 2 & -2 & 5
\end{bmatrix}
\]
2. State the theorem that allows us to make yet another extension to our NMEx series of theorems.
3. What is amazing about the interaction between matrix multiplication and the determinant?
Exercises

C30  Each of the archetypes below is a system of equations with a square coefficient matrix, or is a square matrix itself. Compute the determinant of each matrix, noting how Theorem SMZD indicates when the matrix is singular or nonsingular.
Archetype A, Archetype B, Archetype F, Archetype K, Archetype L

M20†  Construct a $3\times 3$ nonsingular matrix and call it $A$. Then, for each entry of the matrix, compute the corresponding cofactor, and create a new $3\times 3$ matrix full of these cofactors by placing the cofactor of an entry in the same location as the entry it was based on. Once complete, call this matrix $C$. Compute $AC^t$. Any observations? Repeat with a new matrix, or perhaps with a $4\times 4$ matrix.

M30  Construct an example to show that the following statement is not true for all square matrices $A$ and $B$ of the same size: $\det(A + B) = \det(A) + \det(B)$.

T10  Theorem NPNT says that if the product of square matrices $AB$ is nonsingular, then the individual matrices $A$ and $B$ are nonsingular also. Construct a new proof of this result making use of theorems about determinants of matrices.

T15  Use Theorem DRCM to prove Theorem DZRC as a corollary. (See Proof Technique LC.)

T20  Suppose that $A$ is a square matrix of size $n$ and $\alpha \in \mathbb{C}$ is a scalar. Prove that $\det(\alpha A) = \alpha^n \det(A)$.

T25  Employ Theorem DT to construct the second half of the proof of Theorem DRCM (the portion about a multiple of a column).
Chapter E
Eigenvalues

When we have a square matrix of size $n$, $A$, and we multiply it by a vector $\mathbf{x}$ from $\mathbb{C}^n$ to form the matrix-vector product (Definition MVP), the result is another vector in $\mathbb{C}^n$. So we can adopt a functional view of this computation — the act of multiplying by a square matrix is a function that converts one vector ($\mathbf{x}$) into another one ($A\mathbf{x}$) of the same size. For some vectors, this seemingly complicated computation is really no more complicated than scalar multiplication. The vectors vary according to the choice of $A$, so the question is to determine, for an individual choice of $A$, if there are any such vectors, and if so, which ones. It happens in a variety of situations that these vectors (and the scalars that go along with them) are of special interest.

We will be solving polynomial equations in this chapter, which raises the specter of complex numbers as roots. This distinct possibility is our main reason for entertaining the complex numbers throughout the course. You might be moved to revisit Section CNO and Section O.
Section EE
Eigenvalues and Eigenvectors

In this section, we will define the eigenvalues and eigenvectors of a matrix, and see how to compute them. More theoretical properties will be taken up in the next section.

Subsection EEM
Eigenvalues and Eigenvectors of a Matrix

We start with the principal definition for this chapter.

Definition EEM  Eigenvalues and Eigenvectors of a Matrix
Suppose that $A$ is a square matrix of size $n$, $\mathbf{x} \neq \mathbf{0}$ is a vector in $\mathbb{C}^n$, and $\lambda$ is a scalar in $\mathbb{C}$. Then we say $\mathbf{x}$ is an eigenvector of $A$ with eigenvalue $\lambda$ if
\[
A\mathbf{x} = \lambda\mathbf{x}
\]

Before going any further, perhaps we should convince you that such things ever happen at all. Understand the next example, but do not concern yourself with where the pieces come from. We will have methods soon enough to be able to discover these eigenvectors ourselves.
Example SEE  Some eigenvalues and eigenvectors
Consider the matrix
\[
A = \begin{bmatrix}
204 & 98 & -26 & -10\\
-280 & -134 & 36 & 14\\
716 & 348 & -90 & -36\\
-472 & -232 & 60 & 28
\end{bmatrix}
\]
and the vectors
\[
\mathbf{x} = \begin{bmatrix} 1\\ -1\\ 2\\ 5 \end{bmatrix}
\qquad
\mathbf{y} = \begin{bmatrix} -3\\ 4\\ -10\\ 4 \end{bmatrix}
\qquad
\mathbf{z} = \begin{bmatrix} -3\\ 7\\ 0\\ 8 \end{bmatrix}
\qquad
\mathbf{w} = \begin{bmatrix} 1\\ -1\\ 4\\ 0 \end{bmatrix}
\]
Then
\[
A\mathbf{x} = \begin{bmatrix}
204 & 98 & -26 & -10\\
-280 & -134 & 36 & 14\\
716 & 348 & -90 & -36\\
-472 & -232 & 60 & 28
\end{bmatrix}
\begin{bmatrix} 1\\ -1\\ 2\\ 5 \end{bmatrix}
= \begin{bmatrix} 4\\ -4\\ 8\\ 20 \end{bmatrix}
= 4\begin{bmatrix} 1\\ -1\\ 2\\ 5 \end{bmatrix}
= 4\mathbf{x}
\]
so $\mathbf{x}$ is an eigenvector of $A$ with eigenvalue $\lambda = 4$.

Also,
\[
A\mathbf{y} = \begin{bmatrix}
204 & 98 & -26 & -10\\
-280 & -134 & 36 & 14\\
716 & 348 & -90 & -36\\
-472 & -232 & 60 & 28
\end{bmatrix}
\begin{bmatrix} -3\\ 4\\ -10\\ 4 \end{bmatrix}
= \begin{bmatrix} 0\\ 0\\ 0\\ 0 \end{bmatrix}
= 0\begin{bmatrix} -3\\ 4\\ -10\\ 4 \end{bmatrix}
= 0\mathbf{y}
\]
so $\mathbf{y}$ is an eigenvector of $A$ with eigenvalue $\lambda = 0$.

Also,
\[
A\mathbf{z} = \begin{bmatrix}
204 & 98 & -26 & -10\\
-280 & -134 & 36 & 14\\
716 & 348 & -90 & -36\\
-472 & -232 & 60 & 28
\end{bmatrix}
\begin{bmatrix} -3\\ 7\\ 0\\ 8 \end{bmatrix}
= \begin{bmatrix} -6\\ 14\\ 0\\ 16 \end{bmatrix}
= 2\begin{bmatrix} -3\\ 7\\ 0\\ 8 \end{bmatrix}
= 2\mathbf{z}
\]
so $\mathbf{z}$ is an eigenvector of $A$ with eigenvalue $\lambda = 2$.

Also,
\[
A\mathbf{w} = \begin{bmatrix}
204 & 98 & -26 & -10\\
-280 & -134 & 36 & 14\\
716 & 348 & -90 & -36\\
-472 & -232 & 60 & 28
\end{bmatrix}
\begin{bmatrix} 1\\ -1\\ 4\\ 0 \end{bmatrix}
= \begin{bmatrix} 2\\ -2\\ 8\\ 0 \end{bmatrix}
= 2\begin{bmatrix} 1\\ -1\\ 4\\ 0 \end{bmatrix}
= 2\mathbf{w}
\]
so $\mathbf{w}$ is an eigenvector of $A$ with eigenvalue $\lambda = 2$.

So we have demonstrated four eigenvectors of $A$. Are there more? Yes, any nonzero scalar multiple of an eigenvector is again an eigenvector. In this example, set $\mathbf{u} = 30\mathbf{x}$. Then
\begin{align*}
A\mathbf{u} &= A(30\mathbf{x})\\
&= 30A\mathbf{x} && \text{Theorem MMSMM}\\
&= 30(4\mathbf{x}) && \text{Definition EEM}\\
&= 4(30\mathbf{x}) && \text{Property SMAM}\\
&= 4\mathbf{u}
\end{align*}
so that $\mathbf{u}$ is also an eigenvector of $A$ for the same eigenvalue, $\lambda = 4$.

The vectors $\mathbf{z}$ and $\mathbf{w}$ are both eigenvectors of $A$ for the same eigenvalue $\lambda = 2$, yet this is not as simple as the two vectors just being scalar multiples of each other (they are not). Look what happens when we add them together, to form $\mathbf{v} = \mathbf{z} + \mathbf{w}$, and multiply by $A$,
\begin{align*}
A\mathbf{v} &= A(\mathbf{z} + \mathbf{w})\\
&= A\mathbf{z} + A\mathbf{w} && \text{Theorem MMDAA}\\
&= 2\mathbf{z} + 2\mathbf{w} && \text{Definition EEM}\\
&= 2(\mathbf{z} + \mathbf{w}) && \text{Property DVAC}\\
&= 2\mathbf{v}
\end{align*}
so that $\mathbf{v}$ is also an eigenvector of $A$ for the eigenvalue $\lambda = 2$. So it would appear that the set of eigenvectors that are associated with a fixed eigenvalue is closed under the vector space operations of $\mathbb{C}^n$. Hmmm.

The vector $\mathbf{y}$ is an eigenvector of $A$ for the eigenvalue $\lambda = 0$, so we can use Theorem ZSSM to write $A\mathbf{y} = 0\mathbf{y} = \mathbf{0}$. But this also means that $\mathbf{y} \in \mathcal{N}(A)$. There would appear to be a connection here also. △
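The four eigenpairs in Example SEE can be confirmed directly from Definition EEM with plain list arithmetic; a sketch (the helper name `matvec` is an illustrative choice):

```python
A = [[ 204,   98, -26, -10],
     [-280, -134,  36,  14],
     [ 716,  348, -90, -36],
     [-472, -232,  60,  28]]

def matvec(M, v):
    """Matrix-vector product (Definition MVP style, by entries)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

pairs = [([ 1, -1,  2, 5], 4),   # x, lambda = 4
         ([-3,  4, -10, 4], 0),  # y, lambda = 0
         ([-3,  7,  0, 8], 2),   # z, lambda = 2
         ([ 1, -1,  4, 0], 2)]   # w, lambda = 2

# Definition EEM: Av must equal lambda * v, entry by entry
for v, lam in pairs:
    assert matvec(A, v) == [lam * e for e in v]
```

Scaling any `v` by a nonzero constant before the check still passes, which is the closure observation made above.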
Example SEE hints at a number of intriguing properties, and there are many more. We will explore the general properties of eigenvalues and eigenvectors in Section PEE, but in this section we will concern ourselves with the question of actually computing eigenvalues and eigenvectors. First we need a bit of background material on polynomials and matrices.
Subsection PM
Polynomials and Matrices

A polynomial is a combination of powers, multiplication by scalar coefficients, and addition (with subtraction just being the inverse of addition). We never have occasion to divide when computing the value of a polynomial. So it is with matrices. We can add and subtract matrices, we can multiply matrices by scalars, and we can form powers of square matrices by repeated applications of matrix multiplication. We do not normally divide matrices (though sometimes we can multiply by an inverse). If a matrix is square, all the operations constituting a polynomial will preserve the size of the matrix. So it is natural to consider evaluating a polynomial with a matrix, effectively replacing the variable of the polynomial by a matrix. We will demonstrate with an example.
Example PM  Polynomial of a matrix
Let
\[
p(x) = 14 + 19x - 3x^2 - 7x^3 + x^4
\qquad
D = \begin{bmatrix}
-1 & 3 & 2\\
1 & 0 & -2\\
-3 & 1 & 1
\end{bmatrix}
\]
and we will compute $p(D)$.

First, the necessary powers of $D$. Notice that $D^0$ is defined to be the multiplicative identity, $I_3$, as will be the case in general.
\begin{align*}
D^0 &= I_3 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}\\
D^1 &= D = \begin{bmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{bmatrix}\\
D^2 &= DD^1 = \begin{bmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{bmatrix}\begin{bmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{bmatrix} = \begin{bmatrix} -2 & -1 & -6\\ 5 & 1 & 0\\ 1 & -8 & -7 \end{bmatrix}\\
D^3 &= DD^2 = \begin{bmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{bmatrix}\begin{bmatrix} -2 & -1 & -6\\ 5 & 1 & 0\\ 1 & -8 & -7 \end{bmatrix} = \begin{bmatrix} 19 & -12 & -8\\ -4 & 15 & 8\\ 12 & -4 & 11 \end{bmatrix}\\
D^4 &= DD^3 = \begin{bmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{bmatrix}\begin{bmatrix} 19 & -12 & -8\\ -4 & 15 & 8\\ 12 & -4 & 11 \end{bmatrix} = \begin{bmatrix} -7 & 49 & 54\\ -5 & -4 & -30\\ -49 & 47 & 43 \end{bmatrix}
\end{align*}
Then
\begin{align*}
p(D) &= 14 + 19D - 3D^2 - 7D^3 + D^4\\
&= 14\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}
+ 19\begin{bmatrix} -1 & 3 & 2\\ 1 & 0 & -2\\ -3 & 1 & 1 \end{bmatrix}
- 3\begin{bmatrix} -2 & -1 & -6\\ 5 & 1 & 0\\ 1 & -8 & -7 \end{bmatrix}
- 7\begin{bmatrix} 19 & -12 & -8\\ -4 & 15 & 8\\ 12 & -4 & 11 \end{bmatrix}
+ \begin{bmatrix} -7 & 49 & 54\\ -5 & -4 & -30\\ -49 & 47 & 43 \end{bmatrix}\\
&= \begin{bmatrix} -139 & 193 & 166\\ 27 & -98 & -124\\ -193 & 118 & 20 \end{bmatrix}
\end{align*}
Notice that $p(x)$ factors as
\[
p(x) = 14 + 19x - 3x^2 - 7x^3 + x^4 = (x - 2)(x - 7)(x + 1)^2
\]
Because $D$ commutes with itself ($DD = DD$), we can use distributivity of matrix multiplication across matrix addition (Theorem MMDAA) without being careful with any of the matrix products, and just as easily evaluate $p(D)$ using the factored form of $p(x)$,
\begin{align*}
p(D) &= 14 + 19D - 3D^2 - 7D^3 + D^4 = (D - 2I_3)(D - 7I_3)(D + I_3)^2\\
&= \begin{bmatrix} -3 & 3 & 2\\ 1 & -2 & -2\\ -3 & 1 & -1 \end{bmatrix}
\begin{bmatrix} -8 & 3 & 2\\ 1 & -7 & -2\\ -3 & 1 & -6 \end{bmatrix}
\begin{bmatrix} 0 & 3 & 2\\ 1 & 1 & -2\\ -3 & 1 & 2 \end{bmatrix}^2\\
&= \begin{bmatrix} -139 & 193 & 166\\ 27 & -98 & -124\\ -193 & 118 & 20 \end{bmatrix}
\end{align*}
This example is not meant to be too profound. It is meant to show you that it is natural to evaluate a polynomial with a matrix, and that the factored form of the polynomial is as good as (or maybe better than) the expanded form. And do not forget that constant terms in polynomials are really multiples of the identity matrix when we are evaluating the polynomial with a matrix. △
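Both evaluations of $p(D)$ in Example PM can be replayed with a few pure-Python $3\times 3$ helpers; a sketch (helper names `mul`, `add`, `scale` are illustrative), with the constant term written as $14I_3$:

```python
D = [[-1, 3, 2], [1, 0, -2], [-3, 1, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]

def scale(c, X):
    return [[c * e for e in row] for row in X]

D2 = mul(D, D)
D3 = mul(D, D2)
D4 = mul(D, D3)

# Expanded form: p(D) = 14 I_3 + 19 D - 3 D^2 - 7 D^3 + D^4
expanded = add(add(add(add(scale(14, I3), scale(19, D)),
                       scale(-3, D2)), scale(-7, D3)), D4)

# Factored form: p(D) = (D - 2 I_3)(D - 7 I_3)(D + I_3)^2
F1 = add(D, scale(-2, I3))
F2 = add(D, scale(-7, I3))
F3 = add(D, I3)
factored = mul(mul(F1, F2), mul(F3, F3))

assert expanded == factored  # the two evaluations agree, as in the example
```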
Subsection EEE
Existence of Eigenvalues and Eigenvectors

Before we embark on computing eigenvalues and eigenvectors, we will prove that every matrix has at least one eigenvalue (and an eigenvector to go with it). Later, in Theorem MNEM, we will determine the maximum number of eigenvalues a matrix may have.

The determinant (Definition DM) will be a powerful tool in Subsection EE.CEE when it comes time to compute eigenvalues. However, it is possible, with some more advanced machinery, to compute eigenvalues without ever making use of the determinant. Sheldon Axler does just that in his book, Linear Algebra Done Right. Here and now, we give Axler's "determinant-free" proof that every matrix has an eigenvalue. The result is not too startling, but the proof is most enjoyable.
Theorem EMHE  Every Matrix Has an Eigenvalue
Suppose $A$ is a square matrix. Then $A$ has at least one eigenvalue.

Proof. Suppose that $A$ has size $n$, and choose $\mathbf{x}$ as any nonzero vector from $\mathbb{C}^n$. (Notice how much latitude we have in our choice of $\mathbf{x}$. Only the zero vector is off-limits.) Consider the set
\[
S = \left\{\mathbf{x},\, A\mathbf{x},\, A^2\mathbf{x},\, A^3\mathbf{x},\, \ldots,\, A^n\mathbf{x}\right\}
\]
This is a set of $n + 1$ vectors from $\mathbb{C}^n$, so by Theorem MVSLD, $S$ is linearly dependent. Let $a_0, a_1, a_2, \ldots, a_n$ be a collection of $n + 1$ scalars from $\mathbb{C}$, not all zero, that provide a relation of linear dependence on $S$. In other words,
\[
a_0\mathbf{x} + a_1 A\mathbf{x} + a_2 A^2\mathbf{x} + a_3 A^3\mathbf{x} + \cdots + a_n A^n\mathbf{x} = \mathbf{0}
\]
Some of the $a_i$ are nonzero. Suppose that just $a_0 \neq 0$, and $a_1 = a_2 = a_3 = \cdots = a_n = 0$. Then $a_0\mathbf{x} = \mathbf{0}$ and by Theorem SMEZV, either $a_0 = 0$ or $\mathbf{x} = \mathbf{0}$, which are both contradictions. So $a_i \neq 0$ for some $i \geq 1$. Let $m$ be the largest integer such that $a_m \neq 0$. From this discussion we know that $m \geq 1$. We can also assume that $a_m = 1$, for if not, replace each $a_i$ by $a_i/a_m$ to obtain scalars that serve equally well in providing a relation of linear dependence on $S$.

Define the polynomial
\[
p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_m x^m
\]
Because we have consistently used $\mathbb{C}$ as our set of scalars (rather than $\mathbb{R}$), we know that we can factor $p(x)$ into linear factors of the form $(x - b_i)$, where $b_i \in \mathbb{C}$. So there are scalars, $b_1, b_2, b_3, \ldots, b_m$, from $\mathbb{C}$ so that,
\[
p(x) = (x - b_m)(x - b_{m-1}) \cdots (x - b_3)(x - b_2)(x - b_1)
\]
Put it all together and
\begin{align*}
\mathbf{0} &= a_0\mathbf{x} + a_1 A\mathbf{x} + a_2 A^2\mathbf{x} + \cdots + a_n A^n\mathbf{x}\\
&= a_0\mathbf{x} + a_1 A\mathbf{x} + a_2 A^2\mathbf{x} + \cdots + a_m A^m\mathbf{x} && a_i = 0 \text{ for } i > m\\
&= \left(a_0 I_n + a_1 A + a_2 A^2 + \cdots + a_m A^m\right)\mathbf{x} && \text{Theorem MMDAA}\\
&= p(A)\mathbf{x} && \text{Definition of } p(x)\\
&= (A - b_m I_n)(A - b_{m-1} I_n) \cdots (A - b_2 I_n)(A - b_1 I_n)\mathbf{x}
\end{align*}
Let $k$ be the smallest integer such that
\[
(A - b_k I_n)(A - b_{k-1} I_n) \cdots (A - b_2 I_n)(A - b_1 I_n)\mathbf{x} = \mathbf{0}.
\]
From the preceding equation, we know that $k \leq m$. Define the vector $\mathbf{z}$ by
\[
\mathbf{z} = (A - b_{k-1} I_n) \cdots (A - b_2 I_n)(A - b_1 I_n)\mathbf{x}
\]
Notice that by the definition of $k$, the vector $\mathbf{z}$ must be nonzero. In the case where $k = 1$, we understand that $\mathbf{z}$ is defined by $\mathbf{z} = \mathbf{x}$, and $\mathbf{z}$ is still nonzero. Now
\[
(A - b_k I_n)\mathbf{z} = (A - b_k I_n)(A - b_{k-1} I_n) \cdots (A - b_3 I_n)(A - b_2 I_n)(A - b_1 I_n)\mathbf{x} = \mathbf{0}
\]
which allows us to write
\begin{align*}
A\mathbf{z} &= (A + O)\mathbf{z} && \text{Property ZM}\\
&= (A - b_k I_n + b_k I_n)\mathbf{z} && \text{Property AIM}\\
&= (A - b_k I_n)\mathbf{z} + b_k I_n\mathbf{z} && \text{Theorem MMDAA}\\
&= \mathbf{0} + b_k I_n\mathbf{z} && \text{Defining property of } \mathbf{z}\\
&= b_k I_n\mathbf{z} && \text{Property ZM}\\
&= b_k\mathbf{z} && \text{Theorem MMIM}
\end{align*}
Since $\mathbf{z} \neq \mathbf{0}$, this equation says that $\mathbf{z}$ is an eigenvector of $A$ for the eigenvalue $\lambda = b_k$ (Definition EEM), so we have shown that any square matrix $A$ does have at least one eigenvalue.
The proof of Theorem EMHE is constructive (it contains an unambiguous procedure that leads to an eigenvalue), but it is not meant to be practical. We will illustrate the theorem with an example, the purpose being to provide a companion for studying the proof and not to suggest this is the best procedure for computing an eigenvalue.

Example CAEHW  Computing an eigenvalue the hard way
This example illustrates the proof of Theorem EMHE, and so will employ the same notation as the proof — look there for full explanations. It is not meant to be an example of a reasonable computational approach to finding eigenvalues and eigenvectors. OK, warnings in place, here we go.
Consider the matrix $A$, and choose the vector $\mathbf{x}$,
\[
A = \begin{bmatrix}
-7 & -1 & 11 & 0 & -4\\
4 & 1 & 0 & 2 & 0\\
-10 & -1 & 14 & 0 & -4\\
8 & 2 & -15 & -1 & 5\\
-10 & -1 & 16 & 0 & -6
\end{bmatrix}
\qquad
\mathbf{x} = \begin{bmatrix} 3\\ 0\\ 3\\ -5\\ 4 \end{bmatrix}
\]
It is important to notice that the choice of $\mathbf{x}$ could be anything, so long as it is not the zero vector. We have not chosen $\mathbf{x}$ totally at random, but so as to make our illustration of the theorem as general as possible. You could replicate this example with your own choice and the computations are guaranteed to be reasonable, provided you have a computational tool that will factor a fifth degree polynomial for you.

The set
\begin{align*}
S &= \left\{\mathbf{x},\, A\mathbf{x},\, A^2\mathbf{x},\, A^3\mathbf{x},\, A^4\mathbf{x},\, A^5\mathbf{x}\right\}\\
&= \left\{
\begin{bmatrix} 3\\ 0\\ 3\\ -5\\ 4 \end{bmatrix},
\begin{bmatrix} -4\\ 2\\ -4\\ 4\\ -6 \end{bmatrix},
\begin{bmatrix} 6\\ -6\\ 6\\ -2\\ 10 \end{bmatrix},
\begin{bmatrix} -10\\ 14\\ -10\\ -2\\ -18 \end{bmatrix},
\begin{bmatrix} 18\\ -30\\ 18\\ 10\\ 34 \end{bmatrix},
\begin{bmatrix} -34\\ 62\\ -34\\ -26\\ -66 \end{bmatrix}
\right\}
\end{align*}
is guaranteed to be linearly dependent, as it has six vectors from $\mathbb{C}^5$ (Theorem MVSLD).

We will search for a nontrivial relation of linear dependence by solving a homogeneous system of equations whose coefficient matrix has the vectors of $S$ as columns, through row operations,
\[
\begin{bmatrix}
3 & -4 & 6 & -10 & 18 & -34\\
0 & 2 & -6 & 14 & -30 & 62\\
3 & -4 & 6 & -10 & 18 & -34\\
-5 & 4 & -2 & -2 & 10 & -26\\
4 & -6 & 10 & -18 & 34 & -66
\end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix}
1 & 0 & -2 & 6 & -14 & 30\\
0 & 1 & -3 & 7 & -15 & 31\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
There are four free variables for describing solutions to this homogeneous system, so we have our pick of solutions. The most expedient choice would be to set $x_3 = 1$ and $x_4 = x_5 = x_6 = 0$. However, we will again opt to maximize the generality of our illustration of Theorem EMHE and choose $x_3 = -8$, $x_4 = -3$, $x_5 = 1$ and $x_6 = 0$. This leads to a solution with $x_1 = 16$ and $x_2 = 12$.

This relation of linear dependence then says that
\begin{align*}
\mathbf{0} &= 16\mathbf{x} + 12A\mathbf{x} - 8A^2\mathbf{x} - 3A^3\mathbf{x} + A^4\mathbf{x} + 0A^5\mathbf{x}\\
\mathbf{0} &= \left(16 + 12A - 8A^2 - 3A^3 + A^4\right)\mathbf{x}
\end{align*}
So we define $p(x) = 16 + 12x - 8x^2 - 3x^3 + x^4$, and as advertised in the proof of Theorem EMHE, we have a polynomial of degree $m = 4 > 1$ such that $p(A)\mathbf{x} = \mathbf{0}$. Now we need to factor $p(x)$ over $\mathbb{C}$. If you made your own choice of $\mathbf{x}$ at the start, this is where you might have a fifth degree polynomial, and where you might need to use a computational tool to find roots and factors. We have
\[
p(x) = 16 + 12x - 8x^2 - 3x^3 + x^4 = (x - 4)(x + 2)(x - 2)(x + 1)
\]
So we know that
\[
\mathbf{0} = p(A)\mathbf{x} = (A - 4I_5)(A + 2I_5)(A - 2I_5)(A + 1I_5)\mathbf{x}
\]
We apply one factor at a time, until we get the zero vector, so as to determine the value of $k$ described in the proof of Theorem EMHE,
\begin{align*}
(A + 1I_5)\mathbf{x} &= \begin{bmatrix}
-6 & -1 & 11 & 0 & -4\\
4 & 2 & 0 & 2 & 0\\
-10 & -1 & 15 & 0 & -4\\
8 & 2 & -15 & 0 & 5\\
-10 & -1 & 16 & 0 & -5
\end{bmatrix}
\begin{bmatrix} 3\\ 0\\ 3\\ -5\\ 4 \end{bmatrix}
= \begin{bmatrix} -1\\ 2\\ -1\\ -1\\ -2 \end{bmatrix}\\
(A - 2I_5)(A + 1I_5)\mathbf{x} &= \begin{bmatrix}
-9 & -1 & 11 & 0 & -4\\
4 & -1 & 0 & 2 & 0\\
-10 & -1 & 12 & 0 & -4\\
8 & 2 & -15 & -3 & 5\\
-10 & -1 & 16 & 0 & -8
\end{bmatrix}
\begin{bmatrix} -1\\ 2\\ -1\\ -1\\ -2 \end{bmatrix}
= \begin{bmatrix} 4\\ -8\\ 4\\ 4\\ 8 \end{bmatrix}\\
(A + 2I_5)(A - 2I_5)(A + 1I_5)\mathbf{x} &= \begin{bmatrix}
-5 & -1 & 11 & 0 & -4\\
4 & 3 & 0 & 2 & 0\\
-10 & -1 & 16 & 0 & -4\\
8 & 2 & -15 & 1 & 5\\
-10 & -1 & 16 & 0 & -4
\end{bmatrix}
\begin{bmatrix} 4\\ -8\\ 4\\ 4\\ 8 \end{bmatrix}
= \begin{bmatrix} 0\\ 0\\ 0\\ 0\\ 0 \end{bmatrix}
\end{align*}
So $k = 3$ and
\[
\mathbf{z} = (A - 2I_5)(A + 1I_5)\mathbf{x} = \begin{bmatrix} 4\\ -8\\ 4\\ 4\\ 8 \end{bmatrix}
\]
is an eigenvector of $A$ for the eigenvalue $\lambda = -2$, as you can check by doing the computation $A\mathbf{z}$. If you work through this example with your own choice of the vector $\mathbf{x}$ (strongly recommended) then the eigenvalue you will find may be different, but will be in the set $\{3, 0, 1, -1, -2\}$. See Exercise EE.M60 for a suggested starting vector. △
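The suggested check $A\mathbf{z}$ at the end of Example CAEHW takes a few lines of plain arithmetic; a sketch confirming $A\mathbf{z} = -2\mathbf{z}$:

```python
A = [[ -7, -1,  11,  0, -4],
     [  4,  1,   0,  2,  0],
     [-10, -1,  14,  0, -4],
     [  8,  2, -15, -1,  5],
     [-10, -1,  16,  0, -6]]
z = [4, -8, 4, 4, 8]

# Matrix-vector product, entry by entry
Az = [sum(A[i][j] * z[j] for j in range(5)) for i in range(5)]

# Definition EEM with lambda = -2
assert Az == [-2 * e for e in z]
```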
Subsection CEE
Computing Eigenvalues and Eigenvectors

Fortunately, we need not rely on the procedure of Theorem EMHE each time we need an eigenvalue. It is the determinant, and specifically Theorem SMZD, that provides the main tool for computing eigenvalues. Here is an informal sequence of equivalences that is the key to determining the eigenvalues and eigenvectors of a matrix,
\[
A\mathbf{x} = \lambda\mathbf{x} \iff A\mathbf{x} - \lambda I_n\mathbf{x} = \mathbf{0} \iff (A - \lambda I_n)\mathbf{x} = \mathbf{0}
\]
So, for an eigenvalue $\lambda$ and associated eigenvector $\mathbf{x} \neq \mathbf{0}$, the vector $\mathbf{x}$ will be a nonzero element of the null space of $A - \lambda I_n$, while the matrix $A - \lambda I_n$ will be singular and therefore have zero determinant. These ideas are made precise in Theorem EMRCP and Theorem EMNS, but for now this brief discussion should suffice as motivation for the following definition and example.
Definition CP Characteristic Polynomial
Suppose that A is a square matrix of size n. Then the characteristic polynomial of A is the polynomial p_A(x) defined by

\[ p_A(x) = \det(A - xI_n) \]

□
Example CPMS3 Characteristic polynomial of a matrix, size 3
Consider

\[ F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \]

Then

\begin{align*}
p_F(x) &= \det(F - xI_3) && \text{Definition CP} \\
&= \begin{vmatrix} -13 - x & -8 & -4 \\ 12 & 7 - x & 4 \\ 24 & 16 & 7 - x \end{vmatrix} && \text{Definition DM} \\
&= (-13 - x)\begin{vmatrix} 7 - x & 4 \\ 16 & 7 - x \end{vmatrix}
 + (-8)(-1)\begin{vmatrix} 12 & 4 \\ 24 & 7 - x \end{vmatrix}
 + (-4)\begin{vmatrix} 12 & 7 - x \\ 24 & 16 \end{vmatrix} && \text{Theorem DMST} \\
&= (-13 - x)((7 - x)(7 - x) - 4(16)) + (-8)(-1)(12(7 - x) - 4(24)) + (-4)(12(16) - (7 - x)(24)) \\
&= 3 + 5x + x^2 - x^3 \\
&= -(x - 3)(x + 1)^2
\end{align*}

△
The characteristic polynomial is our main computational tool for finding eigenvalues, and will sometimes be used to aid us in determining the properties of eigenvalues.

Theorem EMRCP Eigenvalues of a Matrix are Roots of Characteristic Polynomials
Suppose A is a square matrix. Then λ is an eigenvalue of A if and only if p_A(λ) = 0.
Proof. Suppose A has size n.

λ is an eigenvalue of A
⇐⇒ there exists x ≠ 0 so that Ax = λx    Definition EEM
⇐⇒ there exists x ≠ 0 so that Ax − λx = 0    Property AIC
⇐⇒ there exists x ≠ 0 so that Ax − λI_n x = 0    Theorem MMIM
⇐⇒ there exists x ≠ 0 so that (A − λI_n)x = 0    Theorem MMDAA
⇐⇒ A − λI_n is singular    Definition NM
⇐⇒ det(A − λI_n) = 0    Theorem SMZD
⇐⇒ p_A(λ) = 0    Definition CP
■
Example EMS3 Eigenvalues of a matrix, size 3
In Example CPMS3 we found the characteristic polynomial of

\[ F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \]

to be p_F(x) = −(x − 3)(x + 1)^2. Factored, we can find all of its roots easily; they are x = 3 and x = −1. By Theorem EMRCP, λ = 3 and λ = −1 are both eigenvalues of F, and these are the only eigenvalues of F. We have found them all.
△
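Examples CPMS3 and EMS3 can be checked with a computer algebra system. Here is a sketch with SymPy (a third-party library), using this book's sign convention p_A(x) = det(A − xI_n) from Definition CP.

```python
import sympy as sp

x = sp.symbols('x')
F = sp.Matrix([[-13, -8, -4],
               [ 12,  7,  4],
               [ 24, 16,  7]])

# Characteristic polynomial with the convention of Definition CP.
pF = (F - x * sp.eye(3)).det()
print(sp.expand(pF))    # -x**3 + x**2 + 5*x + 3
print(sp.factor(pF))    # the factored form -(x - 3)*(x + 1)**2

# Theorem EMRCP: the roots of p_F, with multiplicity, are the eigenvalues 3 and -1.
print(sp.roots(pF, x))
```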
Let us now turn our attention to the computation of eigenvectors.

Definition EM Eigenspace of a Matrix
Suppose that A is a square matrix and λ is an eigenvalue of A. Then the eigenspace of A for λ, E_A(λ), is the set of all the eigenvectors of A for λ, together with the inclusion of the zero vector.
□

Example SEE hinted that the set of eigenvectors for a single eigenvalue might have some closure properties, and with the addition of the one eigenvector that is never an eigenvector, 0, we indeed get a whole subspace.
Theorem EMS Eigenspace for a Matrix is a Subspace
Suppose A is a square matrix of size n and λ is an eigenvalue of A. Then the eigenspace E_A(λ) is a subspace of the vector space C^n.

Proof. We will check the three conditions of Theorem TSS. First, Definition EM explicitly includes the zero vector in E_A(λ), so the set is nonempty.
Suppose that x, y ∈ E_A(λ), that is, x and y are two eigenvectors of A for λ. Then

\begin{align*}
A(\mathbf{x} + \mathbf{y}) &= A\mathbf{x} + A\mathbf{y} && \text{Theorem MMDAA} \\
&= \lambda\mathbf{x} + \lambda\mathbf{y} && \text{Definition EEM} \\
&= \lambda(\mathbf{x} + \mathbf{y}) && \text{Property DVAC}
\end{align*}

So either x + y = 0, or x + y is an eigenvector of A for λ (Definition EEM). So, in either event, x + y ∈ E_A(λ), and we have additive closure.
Suppose that α ∈ C, and that x ∈ E_A(λ), that is, x is an eigenvector of A for λ. Then

\begin{align*}
A(\alpha\mathbf{x}) &= \alpha(A\mathbf{x}) && \text{Theorem MMSMM} \\
&= \alpha\lambda\mathbf{x} && \text{Definition EEM} \\
&= \lambda(\alpha\mathbf{x}) && \text{Property SMAC}
\end{align*}

So either αx = 0, or αx is an eigenvector of A for λ (Definition EEM). So, in either event, αx ∈ E_A(λ), and we have scalar closure.
With the three conditions of Theorem TSS met, we know E_A(λ) is a subspace.
■
Theorem EMS tells us that an eigenspace is a subspace (and hence a vector space in its own right). Our next theorem tells us how to quickly construct this subspace.

Theorem EMNS Eigenspace of a Matrix is a Null Space
Suppose A is a square matrix of size n and λ is an eigenvalue of A. Then

\[ E_A(\lambda) = N(A - \lambda I_n) \]
Proof. The conclusion of this theorem is an equality of sets, so normally we would follow the advice of Definition SE. However, in this case we can construct a sequence of equivalences which will together provide the two subset inclusions we need. First, notice that 0 ∈ E_A(λ) by Definition EM and 0 ∈ N(A − λI_n) by Theorem HSC.
Now consider any nonzero vector x ∈ C^n,

x ∈ E_A(λ) ⇐⇒ Ax = λx    Definition EM
⇐⇒ Ax − λx = 0    Property AIC
⇐⇒ Ax − λI_n x = 0    Theorem MMIM
⇐⇒ (A − λI_n)x = 0    Theorem MMDAA
⇐⇒ x ∈ N(A − λI_n)    Definition NSM
■
You might notice the close parallels (and differences) between the proofs of Theorem EMRCP and Theorem EMNS. Since Theorem EMNS describes the set of all the eigenvectors of A as a null space we can use techniques such as Theorem BNS to provide concise descriptions of eigenspaces. Theorem EMNS also provides a trivial proof for Theorem EMS.
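Since Theorem EMNS turns eigenvector computation into a null space computation, a routine like SymPy's `nullspace` does the work. A sketch, reusing the matrix F from Example CPMS3 (SymPy is a third-party library):

```python
import sympy as sp

F = sp.Matrix([[-13, -8, -4],
               [ 12,  7,  4],
               [ 24, 16,  7]])

# E_F(3) = N(F - 3 I_3) by Theorem EMNS.
basis = (F - 3 * sp.eye(3)).nullspace()
v = basis[0]
print(len(basis))          # 1, so the eigenspace has dimension 1
print((F * v - 3 * v).T)   # Matrix([[0, 0, 0]]), confirming F v = 3 v
```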
Example ESMS3 Eigenspaces of a matrix, size 3
Example CPMS3 and Example EMS3 describe the characteristic polynomial and eigenvalues of the 3 × 3 matrix

\[ F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \]

We will now take each eigenvalue in turn and compute its eigenspace. To do this, we row-reduce the matrix F − λI_3 in order to determine solutions to the homogeneous system LS(F − λI_3, 0) and then express the eigenspace as the null space of F − λI_3 (Theorem EMNS). Theorem BNS then tells us how to write the null space as the span of a basis.

λ = 3:
\[ F - 3I_3 = \begin{bmatrix} -16 & -8 & -4 \\ 12 & 4 & 4 \\ 24 & 16 & 4 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 1/2 \\ 0 & 1 & -1/2 \\ 0 & 0 & 0 \end{bmatrix} \]
\[ E_F(3) = N(F - 3I_3) = \left\langle\left\{ \begin{bmatrix} -1/2 \\ 1/2 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix} \right\}\right\rangle \]

λ = −1:
\[ F + 1I_3 = \begin{bmatrix} -12 & -8 & -4 \\ 12 & 8 & 4 \\ 24 & 16 & 8 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 2/3 & 1/3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
\[ E_F(-1) = N(F + 1I_3) = \left\langle\left\{ \begin{bmatrix} -2/3 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1/3 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -2 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 3 \end{bmatrix} \right\}\right\rangle \]

Eigenspaces in hand, we can easily compute eigenvectors by forming nontrivial linear combinations of the basis vectors describing each eigenspace. In particular, notice that we can "pretty up" our basis vectors by using scalar multiples to clear out fractions.
△
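SymPy can reproduce the whole analysis of this example in one call. Its `eigenvects` method returns one (eigenvalue, algebraic multiplicity, eigenspace basis) triple per eigenvalue, though the basis vectors it reports may be scalar multiples of the ones found above.

```python
import sympy as sp

F = sp.Matrix([[-13, -8, -4],
               [ 12,  7,  4],
               [ 24, 16,  7]])

for val, mult, basis in F.eigenvects():
    # Every reported basis vector must actually satisfy F v = val * v.
    assert all(F * v == val * v for v in basis)
    print(val, mult, [list(v) for v in basis])
```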
Subsection ECEE
Examples of Computing Eigenvalues and Eigenvectors

There are no theorems in this section, just a selection of examples meant to illustrate the range of possibilities for the eigenvalues and eigenvectors of a matrix. These examples can all be done by hand, though the computation of the characteristic polynomial would be very time-consuming and error-prone. It can also be difficult to factor an arbitrary polynomial, though if we were to suggest that most of our eigenvalues are going to be integers, then it can be easier to hunt for roots. These examples are meant to look similar to a concatenation of Example CPMS3, Example EMS3 and Example ESMS3. First, we will sneak in a pair of definitions so we can illustrate them throughout this sequence of examples.

Definition AME Algebraic Multiplicity of an Eigenvalue
Suppose that A is a square matrix and λ is an eigenvalue of A. Then the algebraic multiplicity of λ, α_A(λ), is the highest power of (x − λ) that divides the characteristic polynomial, p_A(x).
□

Since an eigenvalue λ is a root of the characteristic polynomial, there is always a factor of (x − λ), and the algebraic multiplicity is just the power of this factor in a factorization of p_A(x). So in particular, α_A(λ) ≥ 1. Compare the definition of algebraic multiplicity with the next definition.
Definition GME Geometric Multiplicity of an Eigenvalue
Suppose that A is a square matrix and λ is an eigenvalue of A. Then the geometric multiplicity of λ, γ_A(λ), is the dimension of the eigenspace E_A(λ).
□

Every eigenvalue must have at least one eigenvector, so the associated eigenspace cannot be trivial, and so γ_A(λ) ≥ 1.
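The two multiplicities can be computed side by side: read α_A(λ) off the multiplicity of λ as a root of p_A(x), and get γ_A(λ) as the dimension of the null space of A − λI_n. A sketch with SymPy (the helper name `multiplicities` is our own invention), checked on F from Example ESMS3:

```python
import sympy as sp

def multiplicities(A):
    """Return {eigenvalue: (algebraic, geometric)} for a square SymPy matrix."""
    x = sp.symbols('x')
    p = (A - x * sp.eye(A.rows)).det()   # p_A(x), Definition CP
    return {lam: (alg, len((A - lam * sp.eye(A.rows)).nullspace()))
            for lam, alg in sp.roots(p, x).items()}

F = sp.Matrix([[-13, -8, -4],
               [ 12,  7,  4],
               [ 24, 16,  7]])
print(multiplicities(F))   # alpha and gamma are 1 for lambda = 3, 2 for lambda = -1
```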
Example EMMS4 Eigenvalue multiplicities, matrix of size 4
Consider the matrix

\[ B = \begin{bmatrix} -2 & 1 & -2 & -4 \\ 12 & 1 & 4 & 9 \\ 6 & 5 & -2 & -4 \\ 3 & -4 & 5 & 10 \end{bmatrix} \]

then

\[ p_B(x) = 8 - 20x + 18x^2 - 7x^3 + x^4 = (x - 1)(x - 2)^3 \]

So the eigenvalues are λ = 1, 2 with algebraic multiplicities α_B(1) = 1 and α_B(2) = 3.
Computing eigenvectors,

λ = 1:
\[ B - 1I_4 = \begin{bmatrix} -3 & 1 & -2 & -4 \\ 12 & 0 & 4 & 9 \\ 6 & 5 & -3 & -4 \\ 3 & -4 & 5 & 9 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 1/3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_B(1) = N(B - 1I_4) = \left\langle\left\{ \begin{bmatrix} -1/3 \\ 1 \\ 1 \\ 0 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -1 \\ 3 \\ 3 \\ 0 \end{bmatrix} \right\}\right\rangle \]

λ = 2:
\[ B - 2I_4 = \begin{bmatrix} -4 & 1 & -2 & -4 \\ 12 & -1 & 4 & 9 \\ 6 & 5 & -4 & -4 \\ 3 & -4 & 5 & 8 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 1/2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1/2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_B(2) = N(B - 2I_4) = \left\langle\left\{ \begin{bmatrix} -1/2 \\ 1 \\ -1/2 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -1 \\ 2 \\ -1 \\ 2 \end{bmatrix} \right\}\right\rangle \]

So each eigenspace has dimension 1 and so γ_B(1) = 1 and γ_B(2) = 1. This example is of interest because of the discrepancy between the two multiplicities for λ = 2. In many of our examples the algebraic and geometric multiplicities will be equal for all of the eigenvalues (as it was for λ = 1 in this example), so keep this example in mind. We will have some explanations for this phenomenon later (see Example NDMS4).
△
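The discrepancy α_B(2) = 3 versus γ_B(2) = 1 is quick to confirm with SymPy (a third-party library):

```python
import sympy as sp

x = sp.symbols('x')
B = sp.Matrix([[-2,  1, -2, -4],
               [12,  1,  4,  9],
               [ 6,  5, -2, -4],
               [ 3, -4,  5, 10]])

pB = (B - x * sp.eye(4)).det()
print(sp.factor(pB))   # contains the factor (x - 2)**3: alpha_B(2) = 3
print(len((B - 2 * sp.eye(4)).nullspace()))   # 1: gamma_B(2) = 1
```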
Example ESMS4 Eigenvalues, symmetric matrix of size 4
Consider the matrix

\[ C = \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix} \]

then

\[ p_C(x) = -3 + 4x + 2x^2 - 4x^3 + x^4 = (x - 3)(x - 1)^2(x + 1) \]

So the eigenvalues are λ = 3, 1, −1 with algebraic multiplicities α_C(3) = 1, α_C(1) = 2 and α_C(−1) = 1.
Computing eigenvectors,

λ = 3:
\[ C - 3I_4 = \begin{bmatrix} -2 & 0 & 1 & 1 \\ 0 & -2 & 1 & 1 \\ 1 & 1 & -2 & 0 \\ 1 & 1 & 0 & -2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_C(3) = N(C - 3I_4) = \left\langle\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right\}\right\rangle \]

λ = 1:
\[ C - 1I_4 = \begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_C(1) = N(C - 1I_4) = \left\langle\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle \]

λ = −1:
\[ C + 1I_4 = \begin{bmatrix} 2 & 0 & 1 & 1 \\ 0 & 2 & 1 & 1 \\ 1 & 1 & 2 & 0 \\ 1 & 1 & 0 & 2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_C(-1) = N(C + 1I_4) = \left\langle\left\{ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 1 \end{bmatrix} \right\}\right\rangle \]

So the eigenspace dimensions yield geometric multiplicities γ_C(3) = 1, γ_C(1) = 2 and γ_C(−1) = 1, the same as for the algebraic multiplicities. This example is of interest because C is a symmetric matrix, and symmetric matrices will be the subject of Theorem HMRE.
△
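Because C is symmetric, its eigenvalues are guaranteed to be real (the upcoming Theorem HMRE). Numerically, `numpy.linalg.eigh` is the routine specialized for symmetric matrices; it returns the eigenvalues in ascending order. A sketch (NumPy is a third-party library):

```python
import numpy as np

C = np.array([[1, 0, 1, 1],
              [0, 1, 1, 1],
              [1, 1, 1, 0],
              [1, 1, 0, 1]], dtype=float)

vals, vecs = np.linalg.eigh(C)      # specialized for symmetric/Hermitian input
print(np.round(vals).astype(int))   # [-1  1  1  3]
# The columns of vecs are orthonormal eigenvectors, so C @ vecs = vecs * vals.
print(np.allclose(C @ vecs, vecs * vals))   # True
```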
Example HMEM5 High multiplicity eigenvalues, matrix of size 5
Consider the matrix

\[ E = \begin{bmatrix} 29 & 14 & 2 & 6 & -9 \\ -47 & -22 & -1 & -11 & 13 \\ 19 & 10 & 5 & 4 & -8 \\ -19 & -10 & -3 & -2 & 8 \\ 7 & 4 & 3 & 1 & -3 \end{bmatrix} \]

then

\[ p_E(x) = -16 + 16x + 8x^2 - 16x^3 + 7x^4 - x^5 = -(x - 2)^4(x + 1) \]

So the eigenvalues are λ = 2, −1 with algebraic multiplicities α_E(2) = 4 and α_E(−1) = 1.
Computing eigenvectors,

λ = 2:
\[ E - 2I_5 = \begin{bmatrix} 27 & 14 & 2 & 6 & -9 \\ -47 & -24 & -1 & -11 & 13 \\ 19 & 10 & 3 & 4 & -8 \\ -19 & -10 & -3 & -4 & 8 \\ 7 & 4 & 3 & 1 & -5 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & -3/2 & -1/2 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_E(2) = N(E - 2I_5) = \left\langle\left\{ \begin{bmatrix} -1 \\ 3/2 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1/2 \\ 1 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -2 \\ 3 \\ 0 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 2 \\ 0 \\ 2 \end{bmatrix} \right\}\right\rangle \]

λ = −1:
\[ E + 1I_5 = \begin{bmatrix} 30 & 14 & 2 & 6 & -9 \\ -47 & -21 & -1 & -11 & 13 \\ 19 & 10 & 6 & 4 & -8 \\ -19 & -10 & -3 & -1 & 8 \\ 7 & 4 & 3 & 1 & -2 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & -4 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_E(-1) = N(E + 1I_5) = \left\langle\left\{ \begin{bmatrix} -2 \\ 4 \\ -1 \\ 1 \\ 0 \end{bmatrix} \right\}\right\rangle \]

So the eigenspace dimensions yield geometric multiplicities γ_E(2) = 2 and γ_E(−1) = 1. This example is of interest because λ = 2 has such a large algebraic multiplicity, which is also not equal to its geometric multiplicity.
△
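Here λ = 2 contributes four roots of p_E(x) but only a two-dimensional eigenspace; SymPy confirms both counts (a sketch; SymPy is a third-party library):

```python
import sympy as sp

x = sp.symbols('x')
E = sp.Matrix([[ 29,  14,  2,   6, -9],
               [-47, -22, -1, -11, 13],
               [ 19,  10,  5,   4, -8],
               [-19, -10, -3,  -2,  8],
               [  7,   4,  3,   1, -3]])

pE = (E - x * sp.eye(5)).det()
print(sp.roots(pE, x)[2])                     # 4: alpha_E(2) = 4
print(len((E - 2 * sp.eye(5)).nullspace()))   # 2: gamma_E(2) = 2
```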
Example CEMS6 Complex eigenvalues, matrix of size 6
Consider the matrix

\[ F = \begin{bmatrix} -59 & -34 & 41 & 12 & 25 & 30 \\ 1 & 7 & -46 & -36 & -11 & -29 \\ -233 & -119 & 58 & -35 & 75 & 54 \\ 157 & 81 & -43 & 21 & -51 & -39 \\ -91 & -48 & 32 & -5 & 32 & 26 \\ 209 & 107 & -55 & 28 & -69 & -50 \end{bmatrix} \]

then

\begin{align*}
p_F(x) &= -50 + 55x + 13x^2 - 50x^3 + 32x^4 - 9x^5 + x^6 \\
&= (x - 2)(x + 1)(x^2 - 4x + 5)^2 \\
&= (x - 2)(x + 1)((x - (2 + i))(x - (2 - i)))^2 \\
&= (x - 2)(x + 1)(x - (2 + i))^2(x - (2 - i))^2
\end{align*}

So the eigenvalues are λ = 2, −1, 2 + i, 2 − i with algebraic multiplicities α_F(2) = 1, α_F(−1) = 1, α_F(2 + i) = 2 and α_F(2 − i) = 2.
We compute eigenvectors, noting that the last two basis vectors are each a scalar multiple of what Theorem BNS will provide,

λ = 2:
\[ F - 2I_6 = \begin{bmatrix} -61 & -34 & 41 & 12 & 25 & 30 \\ 1 & 5 & -46 & -36 & -11 & -29 \\ -233 & -119 & 56 & -35 & 75 & 54 \\ 157 & 81 & -43 & 19 & -51 & -39 \\ -91 & -48 & 32 & -5 & 30 & 26 \\ 209 & 107 & -55 & 28 & -69 & -52 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1/5 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 3/5 \\ 0 & 0 & 0 & 1 & 0 & -1/5 \\ 0 & 0 & 0 & 0 & 1 & 4/5 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_F(2) = N(F - 2I_6) = \left\langle\left\{ \begin{bmatrix} -1/5 \\ 0 \\ -3/5 \\ 1/5 \\ -4/5 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -1 \\ 0 \\ -3 \\ 1 \\ -4 \\ 5 \end{bmatrix} \right\}\right\rangle \]

λ = −1:
\[ F + 1I_6 = \begin{bmatrix} -58 & -34 & 41 & 12 & 25 & 30 \\ 1 & 8 & -46 & -36 & -11 & -29 \\ -233 & -119 & 59 & -35 & 75 & 54 \\ 157 & 81 & -43 & 22 & -51 & -39 \\ -91 & -48 & 32 & -5 & 33 & 26 \\ 209 & 107 & -55 & 28 & -69 & -49 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 1/2 \\ 0 & 1 & 0 & 0 & 0 & -3/2 \\ 0 & 0 & 1 & 0 & 0 & 1/2 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1/2 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_F(-1) = N(F + 1I_6) = \left\langle\left\{ \begin{bmatrix} -1/2 \\ 3/2 \\ -1/2 \\ 0 \\ 1/2 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -1 \\ 3 \\ -1 \\ 0 \\ 1 \\ 2 \end{bmatrix} \right\}\right\rangle \]

λ = 2 + i:
\[ F - (2 + i)I_6 = \begin{bmatrix} -61 - i & -34 & 41 & 12 & 25 & 30 \\ 1 & 5 - i & -46 & -36 & -11 & -29 \\ -233 & -119 & 56 - i & -35 & 75 & 54 \\ 157 & 81 & -43 & 19 - i & -51 & -39 \\ -91 & -48 & 32 & -5 & 30 - i & 26 \\ 209 & 107 & -55 & 28 & -69 & -52 - i \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \frac{1}{5}(7 + i) \\ 0 & 1 & 0 & 0 & 0 & \frac{1}{5}(-9 - 2i) \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_F(2 + i) = N(F - (2 + i)I_6) = \left\langle\left\{ \begin{bmatrix} -7 - i \\ 9 + 2i \\ -5 \\ 5 \\ -5 \\ 5 \end{bmatrix} \right\}\right\rangle \]

λ = 2 − i:
\[ F - (2 - i)I_6 = \begin{bmatrix} -61 + i & -34 & 41 & 12 & 25 & 30 \\ 1 & 5 + i & -46 & -36 & -11 & -29 \\ -233 & -119 & 56 + i & -35 & 75 & 54 \\ 157 & 81 & -43 & 19 + i & -51 & -39 \\ -91 & -48 & 32 & -5 & 30 + i & 26 \\ 209 & 107 & -55 & 28 & -69 & -52 + i \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & \frac{1}{5}(7 - i) \\ 0 & 1 & 0 & 0 & 0 & \frac{1}{5}(-9 + 2i) \\ 0 & 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_F(2 - i) = N(F - (2 - i)I_6) = \left\langle\left\{ \begin{bmatrix} -7 + i \\ 9 - 2i \\ -5 \\ 5 \\ -5 \\ 5 \end{bmatrix} \right\}\right\rangle \]

Eigenspace dimensions yield geometric multiplicities of γ_F(2) = 1, γ_F(−1) = 1, γ_F(2 + i) = 1 and γ_F(2 − i) = 1. This example demonstrates some of the possibilities for the appearance of complex eigenvalues, even when all the entries of the matrix are real. Notice how all the numbers in the analysis of λ = 2 − i are conjugates of the corresponding numbers in the analysis of λ = 2 + i. This is the content of the upcoming Theorem ERMCP.
△
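Floating-point routines find the same spectrum; a sketch with NumPy (third-party), where the conjugate pairing of Theorem ERMCP shows up in the computed values. Since 2 ± i are repeated eigenvalues whose algebraic multiplicity exceeds their geometric multiplicity, the computed copies may differ from the exact values in the last several digits, so we compare with a tolerance.

```python
import numpy as np

F = np.array([[ -59,  -34,  41,  12,  25,  30],
              [   1,    7, -46, -36, -11, -29],
              [-233, -119,  58, -35,  75,  54],
              [ 157,   81, -43,  21, -51, -39],
              [ -91,  -48,  32,  -5,  32,  26],
              [ 209,  107, -55,  28, -69, -50]], dtype=float)

vals = np.linalg.eigvals(F)   # complex values, even though F is real

# Each exact eigenvalue is approximated by some computed value.
for w in [2, -1, 2 + 1j, 2 - 1j]:
    assert np.abs(vals - w).min() < 1e-4

# The computed spectrum is closed under conjugation (Theorem ERMCP).
print(np.allclose(np.sort(vals.imag), np.sort(-vals.imag), atol=1e-4))   # True
```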
Example DEMS5 Distinct eigenvalues, matrix of size 5
Consider the matrix

\[ H = \begin{bmatrix} 15 & 18 & -8 & 6 & -5 \\ 5 & 3 & 1 & -1 & -3 \\ 0 & -4 & 5 & -4 & -2 \\ -43 & -46 & 17 & -14 & 15 \\ 26 & 30 & -12 & 8 & -10 \end{bmatrix} \]

then

\[ p_H(x) = -6x + x^2 + 7x^3 - x^4 - x^5 = -x(x - 2)(x - 1)(x + 1)(x + 3) \]

So the eigenvalues are λ = 2, 1, 0, −1, −3 with algebraic multiplicities α_H(2) = 1, α_H(1) = 1, α_H(0) = 1, α_H(−1) = 1 and α_H(−3) = 1.
Computing eigenvectors,

λ = 2:
\[ H - 2I_5 = \begin{bmatrix} 13 & 18 & -8 & 6 & -5 \\ 5 & 1 & 1 & -1 & -3 \\ 0 & -4 & 3 & -4 & -2 \\ -43 & -46 & 17 & -16 & 15 \\ 26 & 30 & -12 & 8 & -12 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_H(2) = N(H - 2I_5) = \left\langle\left\{ \begin{bmatrix} 1 \\ -1 \\ -2 \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle \]

λ = 1:
\[ H - 1I_5 = \begin{bmatrix} 14 & 18 & -8 & 6 & -5 \\ 5 & 2 & 1 & -1 & -3 \\ 0 & -4 & 4 & -4 & -2 \\ -43 & -46 & 17 & -15 & 15 \\ 26 & 30 & -12 & 8 & -11 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -1/2 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1/2 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_H(1) = N(H - 1I_5) = \left\langle\left\{ \begin{bmatrix} 1/2 \\ 0 \\ -1/2 \\ -1 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \\ -2 \\ 2 \end{bmatrix} \right\}\right\rangle \]

λ = 0:
\[ H - 0I_5 = \begin{bmatrix} 15 & 18 & -8 & 6 & -5 \\ 5 & 3 & 1 & -1 & -3 \\ 0 & -4 & 5 & -4 & -2 \\ -43 & -46 & 17 & -14 & 15 \\ 26 & 30 & -12 & 8 & -10 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 0 & -2 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_H(0) = N(H - 0I_5) = \left\langle\left\{ \begin{bmatrix} -1 \\ 2 \\ 2 \\ 0 \\ 1 \end{bmatrix} \right\}\right\rangle \]

λ = −1:
\[ H + 1I_5 = \begin{bmatrix} 16 & 18 & -8 & 6 & -5 \\ 5 & 4 & 1 & -1 & -3 \\ 0 & -4 & 6 & -4 & -2 \\ -43 & -46 & 17 & -13 & 15 \\ 26 & 30 & -12 & 8 & -9 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -1/2 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1/2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_H(-1) = N(H + 1I_5) = \left\langle\left\{ \begin{bmatrix} 1/2 \\ 0 \\ 0 \\ -1/2 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ -1 \\ 2 \end{bmatrix} \right\}\right\rangle \]

λ = −3:
\[ H + 3I_5 = \begin{bmatrix} 18 & 18 & -8 & 6 & -5 \\ 5 & 6 & 1 & -1 & -3 \\ 0 & -4 & 8 & -4 & -2 \\ -43 & -46 & 17 & -11 & 15 \\ 26 & 30 & -12 & 8 & -7 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & 1/2 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]
\[ E_H(-3) = N(H + 3I_5) = \left\langle\left\{ \begin{bmatrix} 1 \\ -1/2 \\ -1 \\ -2 \\ 1 \end{bmatrix} \right\}\right\rangle = \left\langle\left\{ \begin{bmatrix} -2 \\ 1 \\ 2 \\ 4 \\ -2 \end{bmatrix} \right\}\right\rangle \]

So the eigenspace dimensions yield geometric multiplicities γ_H(2) = 1, γ_H(1) = 1, γ_H(0) = 1, γ_H(−1) = 1 and γ_H(−3) = 1, identical to the algebraic multiplicities. This example is of interest for two reasons. First, λ = 0 is an eigenvalue, illustrating the upcoming Theorem SMZE. Second, all the eigenvalues are distinct, yielding algebraic and geometric multiplicities of 1 for each eigenvalue, illustrating Theorem DED.
△
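A quick cross-check of this example with SymPy (third-party): `eigenvals` returns each eigenvalue with its algebraic multiplicity, and a zero eigenvalue forces a zero determinant (the upcoming Theorem SMZE).

```python
import sympy as sp

H = sp.Matrix([[ 15,  18,  -8,   6,  -5],
               [  5,   3,   1,  -1,  -3],
               [  0,  -4,   5,  -4,  -2],
               [-43, -46,  17, -14,  15],
               [ 26,  30, -12,   8, -10]])

print(sorted(H.eigenvals().items()))   # [(-3, 1), (-1, 1), (0, 1), (1, 1), (2, 1)]
print(H.det())                         # 0, consistent with the eigenvalue 0
```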
Reading Questions

1. Suppose A is the 2 × 2 matrix
\[ A = \begin{bmatrix} -5 & 8 \\ -4 & 7 \end{bmatrix} \]
Find the eigenvalues of A.
2. For each eigenvalue of A, find the corresponding eigenspace.
3. For the polynomial p(x) = 3x^2 − x + 2 and A from above, compute p(A).
Exercises

C10† Find the characteristic polynomial of the matrix A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}.

C11† Find the characteristic polynomial of the matrix A = \begin{bmatrix} 3 & 2 & 1 \\ 0 & 1 & 1 \\ 1 & 2 & 0 \end{bmatrix}.

C12† Find the characteristic polynomial of the matrix A = \begin{bmatrix} 1 & 2 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 2 & 1 & 1 & 0 \\ 3 & 1 & 0 & 1 \end{bmatrix}.

C19† Find the eigenvalues, eigenspaces, algebraic multiplicities and geometric multiplicities for the matrix below. It is possible to do all these computations by hand, and it would be instructive to do so.
\[ C = \begin{bmatrix} -1 & 2 \\ -6 & 6 \end{bmatrix} \]

C20† Find the eigenvalues, eigenspaces, algebraic multiplicities and geometric multiplicities for the matrix below. It is possible to do all these computations by hand, and it would be instructive to do so.
\[ B = \begin{bmatrix} -12 & 30 \\ -5 & 13 \end{bmatrix} \]

C21† The matrix A below has λ = 2 as an eigenvalue. Find the geometric multiplicity of λ = 2 using your calculator only for row-reducing matrices.
\[ A = \begin{bmatrix} 18 & -15 & 33 & -15 \\ -4 & 8 & -6 & 6 \\ -9 & 9 & -16 & 9 \\ 5 & -6 & 9 & -4 \end{bmatrix} \]

C22† Without using a calculator, find the eigenvalues of the matrix B.
\[ B = \begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix} \]

C23† Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.

C24† Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for A = \begin{bmatrix} 1 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & -1 & 1 \end{bmatrix}.

C25† Find the eigenvalues, eigenspaces, algebraic and geometric multiplicities for the 3 × 3 identity matrix I_3. Do your results make sense?
C26† For matrix A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}, the characteristic polynomial of A is p_A(x) = (4 − x)(1 − x)^2. Find the eigenvalues and corresponding eigenspaces of A.

C27† For matrix A = \begin{bmatrix} 0 & 4 & -1 & 1 \\ -2 & 6 & -1 & 1 \\ -2 & 8 & -1 & -1 \\ -2 & 8 & -3 & 1 \end{bmatrix}, the characteristic polynomial of A is p_A(x) = (x + 2)(x − 2)^2(x − 4). Find the eigenvalues and corresponding eigenspaces of A.

M60† Repeat Example CAEHW by choosing x = \begin{bmatrix} 0 \\ 8 \\ 2 \\ 1 \\ 2 \end{bmatrix} and then arrive at an eigenvalue and eigenvector of the matrix A. The hard way.

T10† A matrix A is idempotent if A^2 = A. Show that the only possible eigenvalues of an idempotent matrix are λ = 0 and λ = 1. Then give an example of a matrix that is idempotent and has both of these two values as eigenvalues.

T15† The characteristic polynomial of the square matrix A is usually defined as r_A(x) = det(xI_n − A). Find a specific relationship between our characteristic polynomial, p_A(x), and r_A(x), give a proof of your relationship, and use this to explain why Theorem EMRCP can remain essentially unchanged with either definition. Explain the advantages of each definition over the other. (Computing with both definitions, for a 2 × 2 and a 3 × 3 matrix, might be a good way to start.)

T20† Suppose that λ and ρ are two different eigenvalues of the square matrix A. Prove that the intersection of the eigenspaces for these two eigenvalues is trivial. That is, E_A(λ) ∩ E_A(ρ) = {0}.
Section PEE
Properties of Eigenvalues and Eigenvectors

The previous section introduced eigenvalues and eigenvectors, and concentrated on their existence and determination. This section will be more about theorems, and the various properties eigenvalues and eigenvectors enjoy. Like a good 4 × 100 meter relay, we will lead off with one of our better theorems and save the very best for the anchor leg.

Subsection BPE
Basic Properties of Eigenvalues

Theorem EDELI Eigenvectors with Distinct Eigenvalues are Linearly Independent
Suppose that A is an n × n square matrix and S = {x_1, x_2, x_3, ..., x_p} is a set of eigenvectors with eigenvalues λ_1, λ_2, λ_3, ..., λ_p such that λ_i ≠ λ_j whenever i ≠ j. Then S is a linearly independent set.

Proof. If p = 1, then the set S = {x_1} is linearly independent since eigenvectors are nonzero (Definition EEM), so assume for the remainder that p ≥ 2.
We will prove this result by contradiction (Proof Technique CD). Suppose to the contrary that S is a linearly dependent set. Define S_i = {x_1, x_2, x_3, ..., x_i} and let k be an integer such that S_{k−1} = {x_1, x_2, x_3, ..., x_{k−1}} is linearly independent and S_k = {x_1, x_2, x_3, ..., x_k} is linearly dependent. Is there even such an integer k? First, since eigenvectors are nonzero, the set {x_1} is linearly independent. Since we are assuming that S = S_p is linearly dependent, there must be an integer k, 2 ≤ k ≤ p, where the sets S_i transition from linear independence to linear dependence (and stay that way). In other words, x_k is the vector with the smallest index that is a linear combination of just vectors with smaller indices.
Since {x_1, x_2, x_3, ..., x_k} is a linearly dependent set there must be scalars, a_1, a_2, a_3, ..., a_k, not all zero (Definition LI), so that

    0 = a_1 x_1 + a_2 x_2 + a_3 x_3 + ··· + a_k x_k

Then,

    0 = (A − λ_k I_n) 0                                                        Theorem ZVSM
      = (A − λ_k I_n)(a_1 x_1 + a_2 x_2 + a_3 x_3 + ··· + a_k x_k)             Definition RLD
      = (A − λ_k I_n) a_1 x_1 + ··· + (A − λ_k I_n) a_k x_k                    Theorem MMDAA
      = a_1 (A − λ_k I_n) x_1 + ··· + a_k (A − λ_k I_n) x_k                    Theorem MMSMM
      = a_1 (A x_1 − λ_k I_n x_1) + ··· + a_k (A x_k − λ_k I_n x_k)            Theorem MMDAA
      = a_1 (A x_1 − λ_k x_1) + ··· + a_k (A x_k − λ_k x_k)                    Theorem MMIM
      = a_1 (λ_1 x_1 − λ_k x_1) + ··· + a_k (λ_k x_k − λ_k x_k)                Definition EEM
      = a_1 (λ_1 − λ_k) x_1 + ··· + a_k (λ_k − λ_k) x_k                        Theorem MMDAA
      = a_1 (λ_1 − λ_k) x_1 + ··· + a_{k−1} (λ_{k−1} − λ_k) x_{k−1} + a_k (0) x_k   Property AICN
      = a_1 (λ_1 − λ_k) x_1 + ··· + a_{k−1} (λ_{k−1} − λ_k) x_{k−1} + 0        Theorem ZSSM
      = a_1 (λ_1 − λ_k) x_1 + ··· + a_{k−1} (λ_{k−1} − λ_k) x_{k−1}            Property Z
This equation is a relation of linear dependence on the linearly independent set {x_1, x_2, x_3, ..., x_{k−1}}, so the scalars must all be zero. That is, a_i (λ_i − λ_k) = 0 for 1 ≤ i ≤ k − 1. However, we have the hypothesis that the eigenvalues are distinct, so λ_i ≠ λ_k for 1 ≤ i ≤ k − 1. Thus a_i = 0 for 1 ≤ i ≤ k − 1.
This reduces the original relation of linear dependence on {x_1, x_2, x_3, ..., x_k} to the simpler equation a_k x_k = 0. By Theorem SMEZV we conclude that a_k = 0 or x_k = 0. Eigenvectors are never the zero vector (Definition EEM), so a_k = 0. So all of the scalars a_i, 1 ≤ i ≤ k, are zero, contradicting their introduction as the scalars creating a nontrivial relation of linear dependence on the set {x_1, x_2, x_3, ..., x_k}. With a contradiction in hand, we conclude that S must be linearly independent.
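For readers following along computationally, here is a small numpy sketch of Theorem EDELI. The matrix and the use of numpy are our own assumptions, not part of the text: a triangular matrix with three distinct eigenvalues should yield a linearly independent (full-rank) set of eigenvectors.

```python
import numpy as np

# An assumed example: upper triangular, so the eigenvalues 2, 3, 5
# can be read off the diagonal, and they are distinct.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

evals, evecs = np.linalg.eig(A)          # columns of evecs are eigenvectors
# Theorem EDELI predicts the eigenvectors form a linearly independent
# set, i.e. the matrix whose columns they are has full rank.
rank = np.linalg.matrix_rank(evecs)
```

Here `rank` comes out equal to 3, the size of the matrix, exactly as the theorem guarantees for distinct eigenvalues.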
There is a simple connection between the eigenvalues of a matrix and whether or not the matrix is nonsingular.
Theorem SMZE Singular Matrices have Zero Eigenvalues
Suppose A is a square matrix. Then A is singular if and only if λ = 0 is an eigenvalue of A.
Proof. We have the following equivalences:

    A is singular ⟺ there exists x ≠ 0, Ax = 0      Definition NM
                  ⟺ there exists x ≠ 0, Ax = 0x     Theorem ZSSM
                  ⟺ λ = 0 is an eigenvalue of A     Definition EEM
<br />
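A quick numerical illustration of Theorem SMZE, on a matrix of our own choosing (the example is assumed, not from the text): a matrix with a dependent row is singular, so 0 should appear among its eigenvalues.

```python
import numpy as np

# An assumed, visibly singular matrix: the second row is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

evals = np.linalg.eig(A)[0]
# Theorem SMZE: A singular  <=>  0 is an eigenvalue of A.
has_zero = np.any(np.isclose(evals, 0.0))
det = np.linalg.det(A)                   # also 0, confirming singularity
```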
With an equivalence about singular matrices we can update our list of equivalences about nonsingular matrices.
Theorem NME8 Nonsingular Matrix Equivalences, Round 8
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A are a linearly independent set.
6. A is invertible.
7. The column space of A is C^n, C(A) = C^n.
8. The columns of A are a basis for C^n.
9. The rank of A is n, r(A) = n.
10. The nullity of A is zero, n(A) = 0.
11. The determinant of A is nonzero, det(A) ≠ 0.
12. λ = 0 is not an eigenvalue of A.
Proof. The equivalence of the first and last statements is Theorem SMZE, reformulated by negating each statement in the equivalence. So we are able to improve on Theorem NME7 with this addition.
<br />
Certain changes to a matrix change its eigenvalues in a predictable way.
Theorem ESMM Eigenvalues of a Scalar Multiple of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Then αλ is an eigenvalue of αA.
Proof. Let x ≠ 0 be one eigenvector of A for λ. Then

    (αA) x = α (Ax)      Theorem MMSMM
           = α (λx)      Definition EEM
           = (αλ) x      Property SMAC

So x ≠ 0 is an eigenvector of αA for the eigenvalue αλ.
<br />
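Theorem ESMM is easy to confirm numerically; the matrix and scalar below are assumed examples of ours, not taken from the text.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])             # assumed example; eigenvalues 2 and 5
alpha = 7.0

evals_A = np.sort(np.linalg.eig(A)[0])
evals_aA = np.sort(np.linalg.eig(alpha * A)[0])
# Theorem ESMM: the eigenvalues of alpha*A are alpha times those of A.
```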
Unfortunately, there are not parallel theorems about the sum or product of<br />
arbitrary matrices. But we can prove a similar result for powers of a matrix.<br />
Theorem EOMP Eigenvalues Of Matrix Powers
Suppose A is a square matrix, λ is an eigenvalue of A, and s ≥ 0 is an integer. Then λ^s is an eigenvalue of A^s.
Proof. Let x ≠ 0 be one eigenvector of A for λ. Suppose A has size n. Then we proceed by induction on s (Proof Technique I). First, for s = 0,

    A^s x = A^0 x
          = I_n x
          = x        Theorem MMIM
          = 1x       Property OC
          = λ^0 x
          = λ^s x

so λ^s is an eigenvalue of A^s in this special case. If we assume the theorem is true for s, then we find

    A^{s+1} x = A^s A x
              = A^s (λx)       Definition EEM
              = λ (A^s x)      Theorem MMSMM
              = λ (λ^s x)      Induction hypothesis
              = (λ λ^s) x      Property SMAC
              = λ^{s+1} x

So x ≠ 0 is an eigenvector of A^{s+1} for λ^{s+1}, and induction tells us the theorem is true for all s ≥ 0.
<br />
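As an informal check of Theorem EOMP (the symmetric matrix and the exponent are our own assumed choices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])             # assumed example; eigenvalues 1 and 3
s = 4

evals_A = np.sort(np.linalg.eigvals(A))
evals_As = np.sort(np.linalg.eigvals(np.linalg.matrix_power(A, s)))
# Theorem EOMP: the eigenvalues of A^s are the s-th powers lambda^s.
```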
While we cannot prove that the sum of two arbitrary matrices behaves in any reasonable way with regard to eigenvalues, we can work with sums of different powers of the same matrix. We have already seen two connections between eigenvalues and polynomials: in the proof of Theorem EMHE and in the characteristic polynomial (Definition CP). Our next theorem strengthens this connection.
Theorem EPM Eigenvalues of the Polynomial of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Let q(x) be a polynomial in the variable x. Then q(λ) is an eigenvalue of the matrix q(A).
Proof. Let x ≠ 0 be one eigenvector of A for λ, and write q(x) = a_0 + a_1 x + a_2 x^2 + ··· + a_m x^m. Then

    q(A) x = (a_0 A^0 + a_1 A^1 + a_2 A^2 + ··· + a_m A^m) x
           = (a_0 A^0) x + (a_1 A^1) x + (a_2 A^2) x + ··· + (a_m A^m) x    Theorem MMDAA
           = a_0 (A^0 x) + a_1 (A^1 x) + a_2 (A^2 x) + ··· + a_m (A^m x)    Theorem MMSMM
           = a_0 (λ^0 x) + a_1 (λ^1 x) + a_2 (λ^2 x) + ··· + a_m (λ^m x)    Theorem EOMP
           = (a_0 λ^0) x + (a_1 λ^1) x + (a_2 λ^2) x + ··· + (a_m λ^m) x    Property SMAC
           = (a_0 λ^0 + a_1 λ^1 + a_2 λ^2 + ··· + a_m λ^m) x                Property DSAC
           = q(λ) x

So x ≠ 0 is an eigenvector of q(A) for the eigenvalue q(λ).
<br />
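Theorem EPM can also be tried out numerically. The matrix and the polynomial q(x) = x² − 4x + 5 below are assumptions of ours for illustration; note that the constant term becomes a multiple of the identity when the polynomial is evaluated at a matrix.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])             # assumed example; eigenvalues 2 and 5

def q_matrix(M):
    # q(x) = x^2 - 4x + 5 evaluated at a matrix: x^0 becomes the identity
    n = M.shape[0]
    return M @ M - 4 * M + 5 * np.eye(n)

def q_scalar(x):
    return x**2 - 4 * x + 5

evals_A = np.linalg.eigvals(A)
evals_qA = np.sort(np.linalg.eigvals(q_matrix(A)))
expected = np.sort(q_scalar(evals_A))
# Theorem EPM: the eigenvalues of q(A) are exactly the values q(lambda).
```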
Example BDE Building desired eigenvalues
In Example ESMS4 the 4 × 4 symmetric matrix

    C = [ 1  0  1  1 ]
        [ 0  1  1  1 ]
        [ 1  1  1  0 ]
        [ 1  1  0  1 ]

is shown to have the three eigenvalues λ = 3, 1, −1. Suppose we wanted a 4 × 4 matrix that has the three eigenvalues λ = 4, 0, −2. We can employ Theorem EPM by finding a polynomial that converts 3 to 4, 1 to 0, and −1 to −2. Such a polynomial is called an interpolating polynomial, and in this example we can use

    r(x) = (1/4) x^2 + x − 5/4

We will not discuss how to concoct this polynomial, but a text on numerical analysis should provide the details. For now, simply verify that r(3) = 4, r(1) = 0 and r(−1) = −2.
Now compute

    r(C) = (1/4) C^2 + C − (5/4) I_4

         = (1/4) [ 3 2 2 2 ]   [ 1 0 1 1 ]         [ 1 0 0 0 ]         [ 1 1 3 3 ]
                 [ 2 3 2 2 ] + [ 0 1 1 1 ] − (5/4) [ 0 1 0 0 ] = (1/2) [ 1 1 3 3 ]
                 [ 2 2 3 2 ]   [ 1 1 1 0 ]         [ 0 0 1 0 ]         [ 3 3 1 1 ]
                 [ 2 2 2 3 ]   [ 1 1 0 1 ]         [ 0 0 0 1 ]         [ 3 3 1 1 ]
Theorem EPM tells us that if r(x) transforms the eigenvalues in the desired manner, then r(C) will have the desired eigenvalues. You can check this by computing the eigenvalues of r(C) directly. Furthermore, notice that the multiplicities are the same, and the eigenspaces of C and r(C) are identical.
△<br />
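The suggested check on r(C) is easy to carry out with numpy (the use of numpy here is our own; the matrices come straight from the example). Since λ = 1 has multiplicity two in C, its image 0 appears twice among the eigenvalues of r(C).

```python
import numpy as np

C = np.array([[1, 0, 1, 1],
              [0, 1, 1, 1],
              [1, 1, 1, 0],
              [1, 1, 0, 1]], dtype=float)

# r(x) = (1/4) x^2 + x - 5/4, applied to the matrix C
rC = 0.25 * (C @ C) + C - 1.25 * np.eye(4)

# rC is symmetric, so eigvalsh applies; round away floating-point fuzz.
evals = np.round(np.sort(np.linalg.eigvalsh(rC)), 6)
# Eigenvalues of C are 3, 1, 1, -1; r maps them to 4, 0, 0, -2.
```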
Inverses and transposes also behave predictably with regard to their eigenvalues.
Theorem EIM Eigenvalues of the Inverse of a Matrix
Suppose A is a square nonsingular matrix and λ is an eigenvalue of A. Then λ^-1 is an eigenvalue of the matrix A^-1.
Proof. Notice that since A is assumed nonsingular, A^-1 exists by Theorem NI, but more importantly, λ^-1 = 1/λ does not involve division by zero since Theorem SMZE prohibits this possibility.
Let x ≠ 0 be one eigenvector of A for λ. Suppose A has size n. Then

    A^-1 x = A^-1 (1x)             Property OC
           = A^-1 ((1/λ) λ x)      Property MICN
           = (1/λ) A^-1 (λx)       Theorem MMSMM
           = (1/λ) A^-1 (Ax)       Definition EEM
           = (1/λ) (A^-1 A) x      Theorem MMA
           = (1/λ) I_n x           Definition MI
           = (1/λ) x               Theorem MMIM

So x ≠ 0 is an eigenvector of A^-1 for the eigenvalue 1/λ.
<br />
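A numerical sketch of Theorem EIM on an assumed example matrix of ours (nonsingular, so every eigenvalue is nonzero and reciprocals are safe):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])             # assumed example; eigenvalues 2 and 5

evals_A = np.sort(np.linalg.eigvals(A))
evals_inv = np.sort(np.linalg.eigvals(np.linalg.inv(A)))
# Theorem EIM: the eigenvalues of A^{-1} are the reciprocals 1/lambda.
```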
The proofs of the theorems above have a similar style to them. They all begin by grabbing an eigenvalue-eigenvector pair and adjusting it in some way to reach the desired conclusion. You should add this to your toolkit as a general approach to proving theorems about eigenvalues.
So far we have been able to reserve the characteristic polynomial for strictly computational purposes. However, sometimes a theorem about eigenvalues can be proved easily by employing the characteristic polynomial (rather than using an eigenvalue-eigenvector pair). The next theorem is an example of this.
Theorem ETM Eigenvalues of the Transpose of a Matrix
Suppose A is a square matrix and λ is an eigenvalue of A. Then λ is an eigenvalue of the matrix A^t.
Proof. Suppose A has size n. Then

    p_A(x) = det(A − x I_n)           Definition CP
           = det((A − x I_n)^t)       Theorem DT
           = det(A^t − (x I_n)^t)     Theorem TMA
           = det(A^t − x I_n^t)       Theorem TMSM
           = det(A^t − x I_n)         Definition IM
           = p_{A^t}(x)               Definition CP

So A and A^t have the same characteristic polynomial, and by Theorem EMRCP, their eigenvalues are identical and have equal algebraic multiplicities. Notice that what we have proved here is a bit stronger than the stated conclusion in the theorem.
<br />
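The equality of characteristic polynomials in Theorem ETM can be observed directly; the matrix below is an assumed example of ours, and `np.poly` computes the coefficients of the characteristic polynomial from a square matrix.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 5.0],
              [0.0, 6.0, 7.0]])        # an arbitrary assumed example

p_A  = np.poly(A)        # characteristic polynomial coefficients of A
p_At = np.poly(A.T)      # ... and of the transpose
# Theorem ETM: A and A^t share the same characteristic polynomial, hence
# the same eigenvalues with the same algebraic multiplicities.
```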
If a matrix has only real entries, then the computation of the characteristic polynomial (Definition CP) will result in a polynomial with coefficients that are real numbers. Complex numbers could result as roots of this polynomial, but they are roots of quadratic factors with real coefficients, and as such, come in conjugate pairs. The next theorem proves this, and a bit more, without mentioning the characteristic polynomial.
Theorem ERMCP Eigenvalues of Real Matrices come in Conjugate Pairs
Suppose A is a square matrix with real entries and x is an eigenvector of A for the eigenvalue λ. Then x̄ is an eigenvector of A for the eigenvalue λ̄.
Proof.

    A x̄ = Ā x̄       A has real entries
        = (Ax)̄     Theorem MMCC
        = (λx)̄     Definition EEM
        = λ̄ x̄       Theorem CRSM

So x̄ is an eigenvector of A for the eigenvalue λ̄.
<br />
This phenomenon is amply illustrated in Example CEMS6, where the four complex eigenvalues come in two pairs, and the two basis vectors of the eigenspaces are complex conjugates of each other. Theorem ERMCP can be a time-saver for computing eigenvalues and eigenvectors of real matrices with complex eigenvalues, since the conjugate eigenvalue and eigenspace can be inferred from the theorem rather than computed.
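A minimal numerical illustration of Theorem ERMCP, using an assumed rotation-like matrix of ours whose eigenvalues are ±i:

```python
import numpy as np

# Assumed example: a real matrix with genuinely complex eigenvalues.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

evals, evecs = np.linalg.eig(A)
# Theorem ERMCP: the complex eigenvalues of a real matrix come in
# conjugate pairs (and so do corresponding eigenvectors).
pair_ok = np.isclose(evals[0], np.conj(evals[1]))
```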
Subsection ME
Multiplicities of Eigenvalues
A polynomial of degree n has exactly n roots over the complex numbers, counted with multiplicity. From this fact about polynomial equations we can say more about the algebraic multiplicities of eigenvalues.
Theorem DCP Degree of the Characteristic Polynomial
Suppose that A is a square matrix of size n. Then the characteristic polynomial of A, p_A(x), has degree n.
Proof. We will prove a more general result by induction (Proof Technique I). Then the theorem will be true as a special case. We will carefully state this result as a proposition indexed by m, m ≥ 1.
P(m): Suppose that A is an m × m matrix whose entries are complex numbers or linear polynomials in the variable x of the form c − x, where c is a complex number. Suppose further that there are exactly k entries that contain x and that no row or column contains more than one such entry. Then, when k = m, det(A) is a polynomial in x of degree m, with leading coefficient ±1, and when k < m, det(A) is a polynomial in x of degree k or less.
For the base case, a 1 × 1 matrix with k = m = 1 has lone entry of the form c − x, so its determinant is a polynomial in x of degree m = 1 with leading coefficient −1. When k < m = 1, so k = 0, the lone entry is a complex number, a polynomial of degree k = 0 or less.

Theorem NEM Number of Eigenvalues of a Matrix
Suppose that A is a square matrix of size n with distinct eigenvalues λ_1, λ_2, λ_3, ..., λ_k. Then

    Σ_{i=1}^{k} α_A(λ_i) = n
Proof. By the definition of the algebraic multiplicity (Definition AME), we can factor the characteristic polynomial as

    p_A(x) = c (x − λ_1)^{α_A(λ_1)} (x − λ_2)^{α_A(λ_2)} (x − λ_3)^{α_A(λ_3)} ··· (x − λ_k)^{α_A(λ_k)}

where c is a nonzero constant. (We could prove that c = (−1)^n, but we do not need that specificity right now. See Exercise PEE.T30.) The left-hand side is a polynomial of degree n by Theorem DCP and the right-hand side is a polynomial of degree Σ_{i=1}^{k} α_A(λ_i). So the equality of the polynomials' degrees gives the equality Σ_{i=1}^{k} α_A(λ_i) = n.
<br />
Theorem ME Multiplicities of an Eigenvalue
Suppose that A is a square matrix of size n and λ is an eigenvalue. Then

    1 ≤ γ_A(λ) ≤ α_A(λ) ≤ n

Proof. Since λ is an eigenvalue of A, there is an eigenvector of A for λ, x. Then x ∈ E_A(λ), so γ_A(λ) ≥ 1, since we can extend {x} into a basis of E_A(λ) (Theorem ELIS).
To show γ_A(λ) ≤ α_A(λ) is the most involved portion of this proof. To this end, let g = γ_A(λ) and let x_1, x_2, x_3, ..., x_g be a basis for the eigenspace of λ, E_A(λ). Construct another n − g vectors, y_1, y_2, y_3, ..., y_{n−g}, so that

    {x_1, x_2, x_3, ..., x_g, y_1, y_2, y_3, ..., y_{n−g}}

is a basis of C^n. This can be done by repeated applications of Theorem ELIS.
Finally, define a matrix S by

    S = [x_1|x_2|x_3|...|x_g|y_1|y_2|y_3|...|y_{n−g}] = [x_1|x_2|x_3|...|x_g|R]

where R is an n × (n − g) matrix whose columns are y_1, y_2, y_3, ..., y_{n−g}. The columns of S are linearly independent by design, so S is nonsingular (Theorem NMLIC) and therefore invertible (Theorem NI).
Then,

    [e_1|e_2|e_3|...|e_n] = I_n
                          = S^-1 S
                          = S^-1 [x_1|x_2|x_3|...|x_g|R]
                          = [S^-1 x_1|S^-1 x_2|S^-1 x_3|...|S^-1 x_g|S^-1 R]

So

    S^-1 x_i = e_i    for 1 ≤ i ≤ g    (∗)

Preparations in place, we compute the characteristic polynomial of A,

    p_A(x) = det(A − x I_n)                      Definition CP
           = 1 det(A − x I_n)                    Property OCN
           = det(I_n) det(A − x I_n)             Definition DM
           = det(S^-1 S) det(A − x I_n)          Definition MI
           = det(S^-1) det(S) det(A − x I_n)     Theorem DRMM
           = det(S^-1) det(A − x I_n) det(S)     Property CMCN
           = det(S^-1 (A − x I_n) S)             Theorem DRMM
           = det(S^-1 A S − S^-1 x I_n S)        Theorem MMDAA
           = det(S^-1 A S − x S^-1 I_n S)        Theorem MMSMM
           = det(S^-1 A S − x S^-1 S)            Theorem MMIM
           = det(S^-1 A S − x I_n)               Definition MI
           = p_{S^-1 A S}(x)                     Definition CP
What can we learn then about the matrix S^-1 A S?

    S^-1 A S = S^-1 A [x_1|x_2|x_3|...|x_g|R]
             = S^-1 [A x_1|A x_2|A x_3|...|A x_g|A R]                       Definition MM
             = S^-1 [λ x_1|λ x_2|λ x_3|...|λ x_g|A R]                       Definition EEM
             = [S^-1 λ x_1|S^-1 λ x_2|S^-1 λ x_3|...|S^-1 λ x_g|S^-1 A R]   Definition MM
             = [λ S^-1 x_1|λ S^-1 x_2|λ S^-1 x_3|...|λ S^-1 x_g|S^-1 A R]   Theorem MMSMM
             = [λ e_1|λ e_2|λ e_3|...|λ e_g|S^-1 A R]                       S^-1 S = I_n, (∗) above
Now imagine computing the characteristic polynomial of A by computing the characteristic polynomial of S^-1 A S using the form just obtained. The first g columns of S^-1 A S are all zero, save for a λ on the diagonal. So if we compute the determinant by expanding about the first column, successively, we will get successive factors of (λ − x). More precisely, let T be the square matrix of size n − g that is formed from the last n − g rows and last n − g columns of S^-1 A R. Then

    p_A(x) = p_{S^-1 A S}(x) = (λ − x)^g p_T(x)

This says that (x − λ) is a factor of the characteristic polynomial at least g times, so the algebraic multiplicity of λ as an eigenvalue of A is greater than or equal to g (Definition AME). In other words,

    γ_A(λ) = g ≤ α_A(λ)

as desired.
Theorem NEM says that the sum of the algebraic multiplicities for all the eigenvalues of A is equal to n. Since the algebraic multiplicity is a positive quantity, no single algebraic multiplicity can exceed n without the sum of all of the algebraic multiplicities doing the same.
<br />
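Theorem ME is worth seeing on a concrete matrix where the two multiplicities genuinely differ. The Jordan-like example below is our own assumption; the geometric multiplicity is computed as the dimension of the null space of A − λI.

```python
import numpy as np

# Assumed example: eigenvalue 2 has algebraic multiplicity 2,
# but its eigenspace is only one-dimensional.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

lam = 2.0
n = A.shape[0]
alg_mult = np.sum(np.isclose(np.linalg.eigvals(A), lam))    # alpha_A(lam)
geo_mult = n - np.linalg.matrix_rank(A - lam * np.eye(n))   # gamma_A(lam)
# Theorem ME: 1 <= gamma_A(lam) <= alpha_A(lam) <= n.
```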
Theorem MNEM Maximum Number of Eigenvalues of a Matrix
Suppose that A is a square matrix of size n. Then A cannot have more than n distinct eigenvalues.
Proof. Suppose that A has k distinct eigenvalues, λ_1, λ_2, λ_3, ..., λ_k. Then

    k = Σ_{i=1}^{k} 1
      ≤ Σ_{i=1}^{k} α_A(λ_i)    Theorem ME
      = n                       Theorem NEM
<br />
Subsection EHM
Eigenvalues of Hermitian Matrices
Recall that a matrix is Hermitian (or self-adjoint) if A = A^* (Definition HM). In the case where A is a matrix whose entries are all real numbers, being Hermitian is identical to being symmetric (Definition SYM). Keep this in mind as you read the next two theorems. Their hypotheses could be changed to "suppose A is a real symmetric matrix."
Theorem HMRE Hermitian Matrices have Real Eigenvalues
Suppose that A is a Hermitian matrix and λ is an eigenvalue of A. Then λ ∈ R.
Proof. Let x ≠ 0 be one eigenvector of A for the eigenvalue λ. Then by Theorem PIP we know ⟨x, x⟩ ≠ 0. So

    λ = (1/⟨x, x⟩) λ ⟨x, x⟩    Property MICN
      = (1/⟨x, x⟩) ⟨x, λx⟩     Theorem IPSM
      = (1/⟨x, x⟩) ⟨x, Ax⟩     Definition EEM
      = (1/⟨x, x⟩) ⟨Ax, x⟩     Theorem HMIP
      = (1/⟨x, x⟩) ⟨λx, x⟩     Definition EEM
      = (1/⟨x, x⟩) λ̄ ⟨x, x⟩    Theorem IPSM
      = λ̄                      Property MICN

If a complex number is equal to its conjugate, then it has a complex part equal to zero, and therefore is a real number.
<br />
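Theorem HMRE is easy to witness numerically; the Hermitian matrix below, with complex off-diagonal entries, is our own assumed example.

```python
import numpy as np

# Assumed example of a Hermitian matrix with complex entries.
A = np.array([[2.0, 1.0 + 1.0j],
              [1.0 - 1.0j, 3.0]])

assert np.allclose(A, A.conj().T)      # confirm A = A^* (Hermitian)
evals = np.linalg.eig(A)[0]
# Theorem HMRE: every eigenvalue of a Hermitian matrix is real,
# so the imaginary parts below are (numerically) zero.
max_imag = np.max(np.abs(evals.imag))
```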
Notice the appealing symmetry to the justifications given for the steps of this proof. In the center is the ability to pitch a Hermitian matrix from one side of the inner product to the other.
Look back and compare Example ESMS4 and Example CEMS6. In Example CEMS6 the matrix has only real entries, yet the characteristic polynomial has roots that are complex numbers, and so the matrix has complex eigenvalues. However, in Example ESMS4, the matrix has only real entries, but is also symmetric, and hence Hermitian. So by Theorem HMRE, we were guaranteed eigenvalues that are real numbers.
In many physical problems, a matrix of interest will be real and symmetric, or Hermitian. Then if the eigenvalues are to represent physical quantities of interest, Theorem HMRE guarantees that these values will not be complex numbers.
The eigenvectors of a Hermitian matrix also enjoy a pleasing property that we will exploit later.
Theorem HMOE Hermitian Matrices have Orthogonal Eigenvectors
Suppose that A is a Hermitian matrix and x and y are two eigenvectors of A for different eigenvalues. Then x and y are orthogonal vectors.
Proof. Let x be an eigenvector of A for λ and let y be an eigenvector of A for a different eigenvalue ρ. So we have λ − ρ ≠ 0. Then

    ⟨x, y⟩ = (1/(λ − ρ)) (λ − ρ) ⟨x, y⟩            Property MICN
           = (1/(λ − ρ)) (λ ⟨x, y⟩ − ρ ⟨x, y⟩)     Property DCN
           = (1/(λ − ρ)) (⟨λ̄x, y⟩ − ⟨x, ρy⟩)       Theorem IPSM
           = (1/(λ − ρ)) (⟨λx, y⟩ − ⟨x, ρy⟩)       Theorem HMRE
           = (1/(λ − ρ)) (⟨Ax, y⟩ − ⟨x, Ay⟩)       Definition EEM
           = (1/(λ − ρ)) (⟨Ax, y⟩ − ⟨Ax, y⟩)       Theorem HMIP
           = (1/(λ − ρ)) (0)                       Property AICN
           = 0

This equality says that x and y are orthogonal vectors (Definition OV).
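Continuing with the same assumed Hermitian matrix as before, the orthogonality promised by Theorem HMOE can be checked directly; `np.vdot` conjugates its first argument, matching the inner product used in the text.

```python
import numpy as np

A = np.array([[2.0, 1.0 + 1.0j],
              [1.0 - 1.0j, 3.0]])     # assumed Hermitian example; eigenvalues 1 and 4

evals, evecs = np.linalg.eig(A)
x, y = evecs[:, 0], evecs[:, 1]       # eigenvectors for the two distinct eigenvalues
# Theorem HMOE: <x, y> should vanish for eigenvectors of different eigenvalues.
inner = np.vdot(x, y)                 # conjugates the first argument
```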
Notice again how the key step in this proof is the fundamental property of a Hermitian matrix (Theorem HMIP) — the ability to swap A across the two arguments of the inner product. We will build on these results and continue to see some more interesting properties in Section OD.
Reading Questions
1. How can you identify a nonsingular matrix just by looking at its eigenvalues?
2. How many different eigenvalues may a square matrix of size n have?
3. What is amazing about the eigenvalues of a Hermitian matrix, and why is it amazing?
Exercises
T10 † Suppose that A is a square matrix. Prove that the constant term of the characteristic polynomial of A is equal to the determinant of A.
T20 † Suppose that A is a square matrix. Prove that a single vector may not be an eigenvector of A for two different eigenvalues.
T22 Suppose that U is a unitary matrix with eigenvalue λ. Prove that λ has modulus 1, i.e. |λ| = 1. This says that all of the eigenvalues of a unitary matrix lie on the unit circle of the complex plane.
T30 Theorem DCP tells us that the characteristic polynomial of a square matrix of size n has degree n. By suitably augmenting the proof of Theorem DCP, prove that the coefficient of x^n in the characteristic polynomial is (−1)^n.
T50 † Theorem EIM says that if λ is an eigenvalue of the nonsingular matrix A, then 1/λ is an eigenvalue of A^-1. Write an alternate proof of this theorem using the characteristic polynomial and without making reference to an eigenvector of A for λ.
Section SD
Similarity and Diagonalization
This section's topic will perhaps seem out of place at first, but we will make the connection soon with eigenvalues and eigenvectors. This is also our first look at one of the central ideas of Chapter R.
Subsection SM
Similar Matrices
The notion of matrices being "similar" is a lot like saying two matrices are row-equivalent. Two similar matrices are not equal, but they share many important properties. This section, and later sections in Chapter R, will be devoted in part to discovering just what these common properties are.
First, the main definition for this section.
Definition SIM Similar Matrices
Suppose A and B are two square matrices of size n. Then A and B are similar if there exists a nonsingular matrix of size n, S, such that A = S^-1 B S. □
We will say "A is similar to B via S" when we want to emphasize the role of S in the relationship between A and B. Also, it does not matter if we say A is similar to B, or B is similar to A. If one statement is true then so is the other, as can be seen by using S^-1 in place of S (see Theorem SER for the careful proof). Finally, we will refer to S^-1 B S as a similarity transformation when we want to emphasize the way S changes B. OK, enough about language, let us build a few examples.
Example SMS5 Similar matrices of size 5
If you wondered if there are examples of similar matrices, then it will not be hard to convince you they exist. Define

    B = [ −4  1 −3 −2  2 ]        S = [  1  2 −1  1  1 ]
        [  1  2 −1  3 −2 ]            [  0  1 −1 −2 −1 ]
        [ −4  1  3  2  2 ]            [  1  3 −1  1  1 ]
        [ −3  4 −2 −1 −3 ]            [ −2 −3  3  1 −2 ]
        [  3  1 −1  1 −4 ]            [  1  3 −1  2  1 ]

Check that S is nonsingular and then compute

    A = S^-1 B S

      = [ 10  1  0  2 −5 ] [ −4  1 −3 −2  2 ] [  1  2 −1  1  1 ]
        [ −1  0  1  0  0 ] [  1  2 −1  3 −2 ] [  0  1 −1 −2 −1 ]
        [  3  0  2  1 −3 ] [ −4  1  3  2  2 ] [  1  3 −1  1  1 ]
        [  0  0 −1  0  1 ] [ −3  4 −2 −1 −3 ] [ −2 −3  3  1 −2 ]
        [ −4 −1  1 −1  1 ] [  3  1 −1  1 −4 ] [  1  3 −1  2  1 ]

      = [ −10 −27 −29 −80 −25 ]
        [  −2   6   6  10  −2 ]
        [  −3  11  −9 −14  −9 ]
        [  −1 −13   0 −10  −1 ]
        [  11  35   6  49  19 ]

So by this construction, we know that A and B are similar. △
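The arithmetic in this example is tedious by hand but quick to confirm with numpy (the tool choice is ours; the matrices are from the example). Looking ahead to Theorem SMEE, the similar matrices A and B also share a characteristic polynomial.

```python
import numpy as np

B = np.array([[-4, 1, -3, -2, 2],
              [1, 2, -1, 3, -2],
              [-4, 1, 3, 2, 2],
              [-3, 4, -2, -1, -3],
              [3, 1, -1, 1, -4]], dtype=float)
S = np.array([[1, 2, -1, 1, 1],
              [0, 1, -1, -2, -1],
              [1, 3, -1, 1, 1],
              [-2, -3, 3, 1, -2],
              [1, 3, -1, 2, 1]], dtype=float)

A = np.linalg.inv(S) @ B @ S      # the similarity transformation
# Similar matrices share a characteristic polynomial, hence eigenvalues.
same_poly = np.allclose(np.poly(A), np.poly(B), atol=1e-6)
```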
Let us do that aga<strong>in</strong>.<br />
Example SMS3 Similar matrices of size 3
Define

    B = [ −13 −8 −4 ]        S = [  1  1  2 ]
        [  12  7  4 ]            [ −2 −1 −3 ]
        [  24 16  7 ]            [  1 −2  0 ]

Check that S is nonsingular and then compute

    A = S^-1 B S

      = [ −6 −4 −1 ] [ −13 −8 −4 ] [  1  1  2 ]
        [ −3 −2 −1 ] [  12  7  4 ] [ −2 −1 −3 ]
        [  5  3  1 ] [  24 16  7 ] [  1 −2  0 ]

      = [ −1 0  0 ]
        [  0 3  0 ]
        [  0 0 −1 ]

So by this construction, we know that A and B are similar. But before we move on, look at how pleasing the form of A is. Not convinced? Then consider that several computations related to A are especially easy. For example, in the spirit of Example DUTM, det(A) = (−1)(3)(−1) = 3. Similarly, the characteristic polynomial is straightforward to compute by hand, p_A(x) = (−1 − x)(3 − x)(−1 − x) = −(x − 3)(x + 1)^2, and since the result is already factored, the eigenvalues are transparently λ = 3, −1. Finally, the eigenvectors of A are just the standard unit vectors (Definition SUV).
△<br />
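A numerical confirmation of this example (numpy is our own assumed tool; the matrices are from the example): the similarity transformation really does produce the diagonal matrix, and the eigenvalues of B can then be read off that diagonal.

```python
import numpy as np

B = np.array([[-13, -8, -4],
              [12, 7, 4],
              [24, 16, 7]], dtype=float)
S = np.array([[1, 1, 2],
              [-2, -1, -3],
              [1, -2, 0]], dtype=float)

A = np.linalg.inv(S) @ B @ S
# A is the diagonal matrix diag(-1, 3, -1), so the eigenvalues of B are
# 3 and -1, the latter with algebraic multiplicity 2.
evals_B = np.sort(np.linalg.eigvals(B).real)
```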
Subsection PSM
Properties of Similar Matrices
Similar matrices share many properties and it is these theorems that justify the choice of the word "similar." First we will show that similarity is an equivalence relation. Equivalence relations are important in the study of various algebras and can always be regarded as a kind of weak version of equality. Sort of alike, but not quite equal. The notion of two matrices being row-equivalent is an example of an equivalence relation we have been working with since the beginning of the course (see Exercise RREF.T11). Row-equivalent matrices are not equal, but they are a lot alike. For example, row-equivalent matrices have the same rank. Formally, an equivalence relation requires three conditions hold: reflexive, symmetric and transitive. We will illustrate these as we prove that similarity is an equivalence relation.
Theorem SER Similarity is an Equivalence Relation<br />
Suppose A, B and C are square matrices of size n. Then<br />
1. A is similar to A. (Reflexive)<br />
2. If A is similar to B, thenB is similar to A. (Symmetric)<br />
3. If A is similar to B and B is similar to C, thenA is similar to C. (Transitive)<br />
Proof. To see that A is similar to A, we need only demonstrate a nonsingular matrix that effects a similarity transformation of A to A. I_n is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI), and
\[ I_n^{-1} A I_n = I_n A I_n = A \]
If we assume that A is similar to B, then we know there is a nonsingular matrix S so that A = S^{-1}BS by Definition SIM. By Theorem MIMI, S^{-1} is invertible, and by Theorem NI is therefore nonsingular. So
\begin{align*}
(S^{-1})^{-1} A (S^{-1}) &= S A S^{-1} && \text{Theorem MIMI}\\
&= S S^{-1} B S S^{-1} && \text{Definition SIM}\\
&= \left(S S^{-1}\right) B \left(S S^{-1}\right) && \text{Theorem MMA}\\
&= I_n B I_n && \text{Definition MI}\\
&= B && \text{Theorem MMIM}
\end{align*}
and we see that B is similar to A.

Assume that A is similar to B, and B is similar to C. This gives us the existence of two nonsingular matrices, S and R, such that A = S^{-1}BS and B = R^{-1}CR, by Definition SIM. (Notice how we have to assume S ≠ R, as will usually be the case.) Since S and R are invertible, so too RS is invertible by Theorem SS and then nonsingular by Theorem NI. Now
\begin{align*}
(RS)^{-1} C (RS) &= S^{-1} R^{-1} C R S && \text{Theorem SS}\\
&= S^{-1} \left(R^{-1} C R\right) S && \text{Theorem MMA}\\
&= S^{-1} B S && \text{Definition SIM}\\
&= A
\end{align*}
so A is similar to C via the nonsingular matrix RS. ■
Here is another theorem that tells us exactly what sorts of properties similar matrices share.
Theorem SMEE Similar Matrices have Equal Eigenvalues
Suppose A and B are similar matrices. Then the characteristic polynomials of A and B are equal, that is, p_A(x) = p_B(x).
Proof. Let n denote the size of A and B. Since A and B are similar, there exists a nonsingular matrix S, such that A = S^{-1}BS (Definition SIM). Then
\begin{align*}
p_A(x) &= \det\left(A - x I_n\right) && \text{Definition CP}\\
&= \det\left(S^{-1} B S - x I_n\right) && \text{Definition SIM}\\
&= \det\left(S^{-1} B S - x S^{-1} I_n S\right) && \text{Theorem MMIM}\\
&= \det\left(S^{-1} B S - S^{-1} x I_n S\right) && \text{Theorem MMSMM}\\
&= \det\left(S^{-1}\left(B - x I_n\right) S\right) && \text{Theorem MMDAA}\\
&= \det\left(S^{-1}\right) \det\left(B - x I_n\right) \det\left(S\right) && \text{Theorem DRMM}\\
&= \det\left(S^{-1}\right) \det\left(S\right) \det\left(B - x I_n\right) && \text{Property CMCN}\\
&= \det\left(S^{-1} S\right) \det\left(B - x I_n\right) && \text{Theorem DRMM}\\
&= \det\left(I_n\right) \det\left(B - x I_n\right) && \text{Definition MI}\\
&= 1 \det\left(B - x I_n\right) && \text{Definition DM}\\
&= p_B(x) && \text{Definition CP}
\end{align*}
■
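Theorem SMEE is easy to check computationally. The sketch below uses exact rational arithmetic for a 2 × 2 case; the particular matrices B and S are made up purely for illustration, and the characteristic polynomial of a 2 × 2 matrix M is recovered from its trace and determinant as x^2 - (tr M)x + det M.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a nonsingular 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    assert det != 0
    return [[ M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det,  M[0][0] / det]]

def char_poly(M):
    """Coefficients (1, -trace, det) of x^2 - (tr M)x + det M."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return (F(1), -tr, det)

# Arbitrary example matrices (made up for illustration).
B = [[F(2), F(1)], [F(0), F(3)]]
S = [[F(1), F(2)], [F(1), F(3)]]          # nonsingular: det = 1
A = mat_mul(mat_mul(inv2(S), B), S)       # A = S^{-1} B S, so A is similar to B

# Similar matrices have equal characteristic polynomials (Theorem SMEE).
assert char_poly(A) == char_poly(B)
```

Running this silently succeeds: both characteristic polynomials come out as x^2 - 5x + 6, even though A and B have different entries.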
So similar matrices not only have the same set of eigenvalues, the algebraic multiplicities of these eigenvalues will also be the same. However, be careful with this theorem. It is tempting to think the converse is true, and argue that if two matrices have the same eigenvalues, then they are similar. Not so, as the following example illustrates.
Example EENS Equal eigenvalues, not similar
Define
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
and check that
\[ p_A(x) = p_B(x) = 1 - 2x + x^2 = (x - 1)^2 \]
and so A and B have equal characteristic polynomials. If the converse of Theorem SMEE were true, then A and B would be similar. Suppose this is the case. More precisely, suppose there is a nonsingular matrix S so that A = S^{-1}BS. Then
\[ A = S^{-1} B S = S^{-1} I_2 S = S^{-1} S = I_2 \]
Clearly A ≠ I_2 and this contradiction tells us that the converse of Theorem SMEE is false. △
Subsection D
Diagonalization

Good things happen when a matrix is similar to a diagonal matrix. For example, the eigenvalues of the matrix are the entries on the diagonal of the diagonal matrix. And it can be a much simpler matter to compute high powers of the matrix. Diagonalizable matrices are also of interest in more abstract settings. Here are the relevant definitions, then our main theorem for this section.

Definition DIM Diagonal Matrix
Suppose that A is a square matrix. Then A is a diagonal matrix if [A]_{ij} = 0 whenever i ≠ j. □

Definition DZM Diagonalizable Matrix
Suppose A is a square matrix. Then A is diagonalizable if A is similar to a diagonal matrix. □
Example DAB Diagonalization of Archetype B
Archetype B has a 3 × 3 coefficient matrix
\[ B = \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix} \]
and is similar to a diagonal matrix, as can be seen by the following computation with the nonsingular matrix S,
\begin{align*}
S^{-1} B S &= \begin{bmatrix} -5 & -3 & -2 \\ 3 & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix} \begin{bmatrix} -5 & -3 & -2 \\ 3 & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix}\\
&= \begin{bmatrix} -1 & -1 & -1 \\ 2 & 3 & 1 \\ -1 & -2 & 1 \end{bmatrix} \begin{bmatrix} -7 & -6 & -12 \\ 5 & 5 & 7 \\ 1 & 0 & 4 \end{bmatrix} \begin{bmatrix} -5 & -3 & -2 \\ 3 & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix}\\
&= \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\end{align*}
△

Example SMS3 provides yet another example of a matrix that is subjected to a similarity transformation and the result is a diagonal matrix. Alright, just how would we find the magic matrix S that can be used in a similarity transformation to produce a diagonal matrix? Before you read the statement of the next theorem, you might study the eigenvalues and eigenvectors of Archetype B and compute the eigenvalues and eigenvectors of the matrix in Example SMS3.
Theorem DC Diagonalization Characterization
Suppose A is a square matrix of size n. Then A is diagonalizable if and only if there exists a linearly independent set S that contains n eigenvectors of A.

Proof. (⇐) Let S = \{x_1, x_2, x_3, \ldots, x_n\} be a linearly independent set of eigenvectors of A for the eigenvalues \lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_n. Recall Definition SUV and define
\[ R = [x_1 | x_2 | x_3 | \cdots | x_n] \]
\[ D = \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & \lambda_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{bmatrix} = [\lambda_1 e_1 | \lambda_2 e_2 | \lambda_3 e_3 | \cdots | \lambda_n e_n] \]
The columns of R are the vectors of the linearly independent set S and so by Theorem NMLIC the matrix R is nonsingular. By Theorem NI we know R^{-1} exists.
\begin{align*}
R^{-1} A R &= R^{-1} A [x_1 | x_2 | x_3 | \cdots | x_n]\\
&= R^{-1} [A x_1 | A x_2 | A x_3 | \cdots | A x_n] && \text{Definition MM}\\
&= R^{-1} [\lambda_1 x_1 | \lambda_2 x_2 | \lambda_3 x_3 | \cdots | \lambda_n x_n] && \text{Definition EEM}\\
&= R^{-1} [\lambda_1 R e_1 | \lambda_2 R e_2 | \lambda_3 R e_3 | \cdots | \lambda_n R e_n] && \text{Definition MVP}\\
&= R^{-1} [R(\lambda_1 e_1) | R(\lambda_2 e_2) | R(\lambda_3 e_3) | \cdots | R(\lambda_n e_n)] && \text{Theorem MMSMM}\\
&= R^{-1} R [\lambda_1 e_1 | \lambda_2 e_2 | \lambda_3 e_3 | \cdots | \lambda_n e_n] && \text{Definition MM}\\
&= I_n D && \text{Definition MI}\\
&= D && \text{Theorem MMIM}
\end{align*}
This says that A is similar to the diagonal matrix D via the nonsingular matrix R. Thus A is diagonalizable (Definition DZM).
(⇒) Suppose that A is diagonalizable, so there is a nonsingular matrix of size n
\[ T = [y_1 | y_2 | y_3 | \cdots | y_n] \]
and a diagonal matrix (recall Definition SUV)
\[ E = \begin{bmatrix} d_1 & 0 & 0 & \cdots & 0 \\ 0 & d_2 & 0 & \cdots & 0 \\ 0 & 0 & d_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & d_n \end{bmatrix} = [d_1 e_1 | d_2 e_2 | d_3 e_3 | \cdots | d_n e_n] \]
such that T^{-1}AT = E. Then consider,
\begin{align*}
[A y_1 | A y_2 | A y_3 | \cdots | A y_n] &= A [y_1 | y_2 | y_3 | \cdots | y_n] && \text{Definition MM}\\
&= AT\\
&= I_n A T && \text{Theorem MMIM}\\
&= T T^{-1} A T && \text{Definition MI}\\
&= TE\\
&= T [d_1 e_1 | d_2 e_2 | d_3 e_3 | \cdots | d_n e_n]\\
&= [T(d_1 e_1) | T(d_2 e_2) | T(d_3 e_3) | \cdots | T(d_n e_n)] && \text{Definition MM}\\
&= [d_1 T e_1 | d_2 T e_2 | d_3 T e_3 | \cdots | d_n T e_n] && \text{Theorem MMSMM}\\
&= [d_1 y_1 | d_2 y_2 | d_3 y_3 | \cdots | d_n y_n] && \text{Definition MVP}
\end{align*}
This equality of matrices (Definition ME) allows us to conclude that the individual columns are equal vectors (Definition CVE). That is, A y_i = d_i y_i for 1 ≤ i ≤ n. In other words, y_i is an eigenvector of A for the eigenvalue d_i, 1 ≤ i ≤ n. (Why does y_i ≠ 0?) Because T is nonsingular, the set containing T's columns, S = \{y_1, y_2, y_3, \ldots, y_n\}, is a linearly independent set (Theorem NMLIC). So the set S has all the required properties. ■
Notice that the proof of Theorem DC is constructive. To diagonalize a matrix, we need only locate n linearly independent eigenvectors. Then we can construct a nonsingular matrix using the eigenvectors as columns (R) so that R^{-1}AR is a diagonal matrix (D). The entries on the diagonal of D will be the eigenvalues of the eigenvectors used to create R, in the same order as the eigenvectors appear in R. We illustrate this by diagonalizing some matrices.
Example DMS3 Diagonalizing a matrix of size 3
Consider the matrix
\[ F = \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \]
of Example CPMS3, Example EMS3 and Example ESMS3. F's eigenvalues and eigenspaces are
\begin{align*}
\lambda = 3 &\qquad \mathcal{E}_F(3) = \left\langle \left\{ \begin{bmatrix} -\frac{1}{2} \\ \frac{1}{2} \\ 1 \end{bmatrix} \right\} \right\rangle\\
\lambda = -1 &\qquad \mathcal{E}_F(-1) = \left\langle \left\{ \begin{bmatrix} -\frac{2}{3} \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} -\frac{1}{3} \\ 0 \\ 1 \end{bmatrix} \right\} \right\rangle
\end{align*}
Define the matrix S to be the 3 × 3 matrix whose columns are the three basis vectors in the eigenspaces for F,
\[ S = \begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3} \\ \frac{1}{2} & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \]
Check that S is nonsingular (row-reduces to the identity matrix, Theorem NMRRI, or has a nonzero determinant, Theorem SMZD). Then the three columns of S are a linearly independent set (Theorem NMLIC). By Theorem DC we now know that F is diagonalizable. Furthermore, the construction in the proof of Theorem DC tells us that if we apply the matrix S to F in a similarity transformation, the result will be a diagonal matrix with the eigenvalues of F on the diagonal. The eigenvalues appear on the diagonal of the matrix in the same order as the eigenvectors appear in S. So,
\begin{align*}
S^{-1} F S &= \begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3} \\ \frac{1}{2} & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3} \\ \frac{1}{2} & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}\\
&= \begin{bmatrix} 6 & 4 & 2 \\ -3 & -1 & -1 \\ -6 & -4 & -1 \end{bmatrix} \begin{bmatrix} -13 & -8 & -4 \\ 12 & 7 & 4 \\ 24 & 16 & 7 \end{bmatrix} \begin{bmatrix} -\frac{1}{2} & -\frac{2}{3} & -\frac{1}{3} \\ \frac{1}{2} & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}\\
&= \begin{bmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\end{align*}
Note that the above computations can be viewed two ways. The proof of Theorem DC tells us that the four matrices (F, S, S^{-1} and the diagonal matrix) will interact the way we have written the equation. Or as an example, we can actually perform the computations to verify what the theorem predicts. △
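The computation in Example DMS3 is easy to reproduce numerically. A sketch with NumPy (floating point, so comparisons use a tolerance; the matrices are copied straight from the example):

```python
import numpy as np

# The matrix F and the eigenvector matrix S from Example DMS3.
F = np.array([[-13.0, -8.0, -4.0],
              [ 12.0,  7.0,  4.0],
              [ 24.0, 16.0,  7.0]])
S = np.array([[-1/2, -2/3, -1/3],
              [ 1/2,  1.0,  0.0],
              [ 1.0,  0.0,  1.0]])

# The similarity transformation S^{-1} F S produces the diagonal matrix
# with the eigenvalues of F on the diagonal, in the order the eigenvectors
# appear as columns of S.
D = np.linalg.inv(S) @ F @ S
assert np.allclose(D, np.diag([3.0, -1.0, -1.0]))

# np.linalg.eigvals finds the eigenvalues directly (order may differ).
evals = np.sort(np.linalg.eigvals(F).real)
assert np.allclose(evals, [-1.0, -1.0, 3.0])
```

In practice one would let `np.linalg.eig` produce the eigenvector matrix as well; here S is hardcoded so the diagonal comes out in the same order as in the example.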
The dimension of an eigenspace can be no larger than the algebraic multiplicity of the eigenvalue by Theorem ME. When every eigenvalue's eigenspace is this large, then we can diagonalize the matrix, and only then. Three examples we have seen so far in this section, Example SMS5, Example DAB and Example DMS3, illustrate the diagonalization of a matrix, with varying degrees of detail about just how the diagonalization is achieved. However, in each case, you can verify that the geometric and algebraic multiplicities are equal for every eigenvalue. This is the substance of the next theorem.
Theorem DMFE Diagonalizable Matrices have Full Eigenspaces
Suppose A is a square matrix. Then A is diagonalizable if and only if \gamma_A(\lambda) = \alpha_A(\lambda) for every eigenvalue \lambda of A.

Proof. Suppose A has size n and k distinct eigenvalues, \lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_k. Let S_i = \{x_{i1}, x_{i2}, x_{i3}, \ldots, x_{i\gamma_A(\lambda_i)}\} denote a basis for the eigenspace of \lambda_i, \mathcal{E}_A(\lambda_i), for 1 ≤ i ≤ k. Then
\[ S = S_1 \cup S_2 \cup S_3 \cup \cdots \cup S_k \]
is a set of eigenvectors for A. A vector cannot be an eigenvector for two different eigenvalues (see Exercise EE.T20) so S_i \cap S_j = \emptyset whenever i ≠ j. In other words, S is a disjoint union of the S_i, 1 ≤ i ≤ k.
(⇐) The size of S is
\begin{align*}
|S| &= \sum_{i=1}^{k} \gamma_A(\lambda_i) && S \text{ disjoint union of } S_i\\
&= \sum_{i=1}^{k} \alpha_A(\lambda_i) && \text{Hypothesis}\\
&= n && \text{Theorem NEM}
\end{align*}
We next show that S is a linearly independent set. So we will begin with a relation of linear dependence on S, using doubly-subscripted scalars and eigenvectors,
\begin{align*}
\mathbf{0} = {} & \left(a_{11}x_{11} + a_{12}x_{12} + \cdots + a_{1\gamma_A(\lambda_1)}x_{1\gamma_A(\lambda_1)}\right) + {}\\
& \left(a_{21}x_{21} + a_{22}x_{22} + \cdots + a_{2\gamma_A(\lambda_2)}x_{2\gamma_A(\lambda_2)}\right) + {}\\
& \left(a_{31}x_{31} + a_{32}x_{32} + \cdots + a_{3\gamma_A(\lambda_3)}x_{3\gamma_A(\lambda_3)}\right) + {}\\
& \qquad \vdots\\
& \left(a_{k1}x_{k1} + a_{k2}x_{k2} + \cdots + a_{k\gamma_A(\lambda_k)}x_{k\gamma_A(\lambda_k)}\right)
\end{align*}
Define the vectors y_i, 1 ≤ i ≤ k by
\begin{align*}
y_1 &= a_{11}x_{11} + a_{12}x_{12} + a_{13}x_{13} + \cdots + a_{1\gamma_A(\lambda_1)}x_{1\gamma_A(\lambda_1)}\\
y_2 &= a_{21}x_{21} + a_{22}x_{22} + a_{23}x_{23} + \cdots + a_{2\gamma_A(\lambda_2)}x_{2\gamma_A(\lambda_2)}\\
y_3 &= a_{31}x_{31} + a_{32}x_{32} + a_{33}x_{33} + \cdots + a_{3\gamma_A(\lambda_3)}x_{3\gamma_A(\lambda_3)}\\
&\ \ \vdots\\
y_k &= a_{k1}x_{k1} + a_{k2}x_{k2} + a_{k3}x_{k3} + \cdots + a_{k\gamma_A(\lambda_k)}x_{k\gamma_A(\lambda_k)}
\end{align*}
Then the relation of linear dependence becomes
\[ \mathbf{0} = y_1 + y_2 + y_3 + \cdots + y_k \]
Since the eigenspace \mathcal{E}_A(\lambda_i) is closed under vector addition and scalar multiplication, y_i \in \mathcal{E}_A(\lambda_i), 1 ≤ i ≤ k. Thus, for each i, the vector y_i is an eigenvector of A for \lambda_i, or is the zero vector. Recall that sets of eigenvectors whose eigenvalues are distinct form a linearly independent set by Theorem EDELI. Should any (or some) y_i be nonzero, the previous equation would provide a nontrivial relation of linear dependence on a set of eigenvectors with distinct eigenvalues, contradicting Theorem EDELI. Thus y_i = \mathbf{0}, 1 ≤ i ≤ k.

Each of the k equations, y_i = \mathbf{0}, is a relation of linear dependence on the corresponding set S_i, a set of basis vectors for the eigenspace \mathcal{E}_A(\lambda_i), which is therefore linearly independent. From these relations of linear dependence on linearly independent sets we conclude that the scalars are all zero, more precisely, a_{ij} = 0, 1 ≤ j ≤ \gamma_A(\lambda_i) for 1 ≤ i ≤ k. This establishes that our original relation of linear dependence on S has only the trivial relation of linear dependence, and hence S is a linearly independent set.

We have determined that S is a set of n linearly independent eigenvectors for A, and so by Theorem DC is diagonalizable.

(⇒) Now we assume that A is diagonalizable. Aiming for a contradiction (Proof Technique CD), suppose that there is at least one eigenvalue, say \lambda_t, such that \gamma_A(\lambda_t) ≠ \alpha_A(\lambda_t). By Theorem ME we must have \gamma_A(\lambda_t) < \alpha_A(\lambda_t).
In Example EMMS4 the matrix
\[ B = \begin{bmatrix} -2 & 1 & -2 & -4 \\ 12 & 1 & 4 & 9 \\ 6 & 5 & -2 & -4 \\ 3 & -4 & 5 & 10 \end{bmatrix} \]
was determined to have characteristic polynomial
\[ p_B(x) = (x - 1)(x - 2)^3 \]
and an eigenspace for \lambda = 2 of
\[ \mathcal{E}_B(2) = \left\langle \left\{ \begin{bmatrix} -\frac{1}{2} \\ 1 \\ -\frac{1}{2} \\ 1 \end{bmatrix} \right\} \right\rangle \]
So the geometric multiplicity of \lambda = 2 is \gamma_B(2) = 1, while the algebraic multiplicity is \alpha_B(2) = 3. By Theorem DMFE, the matrix B is not diagonalizable. △
Archetype A is the lone archetype with a square matrix that is not diagonalizable, as the algebraic and geometric multiplicities of the eigenvalue \lambda = 0 differ. Example HMEM5 is another example of a matrix that cannot be diagonalized due to the difference between the geometric and algebraic multiplicities of \lambda = 2, as is Example CEMS6 which has two complex eigenvalues, each with differing multiplicities. Likewise, Example EMMS4 has an eigenvalue with different algebraic and geometric multiplicities and so cannot be diagonalized.
Theorem DED Distinct Eigenvalues implies Diagonalizable
Suppose A is a square matrix of size n with n distinct eigenvalues. Then A is diagonalizable.

Proof. Let \lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_n denote the n distinct eigenvalues of A. Then by Theorem NEM we have n = \sum_{i=1}^{n} \alpha_A(\lambda_i), which implies that \alpha_A(\lambda_i) = 1, 1 ≤ i ≤ n. From Theorem ME it follows that \gamma_A(\lambda_i) = 1, 1 ≤ i ≤ n. So \gamma_A(\lambda_i) = \alpha_A(\lambda_i), 1 ≤ i ≤ n and Theorem DMFE says A is diagonalizable. ■
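Numerically, Theorem DED corresponds to the fact that a matrix with distinct eigenvalues has an eigenvector matrix of full rank, which then performs the diagonalization. A sketch (the 3 × 3 matrix here is made up for illustration):

```python
import numpy as np

# A made-up upper triangular matrix with three distinct eigenvalues 2, 3, 5.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

evals, V = np.linalg.eig(A)                 # columns of V are eigenvectors
assert len(set(np.round(evals, 8))) == 3    # the eigenvalues are distinct

# Distinct eigenvalues guarantee linearly independent eigenvectors
# (Theorem EDELI), so V is nonsingular and diagonalizes A (Theorem DC).
assert np.linalg.matrix_rank(V) == 3
D = np.linalg.inv(V) @ A @ V
assert np.allclose(D, np.diag(evals))
```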
Example DEHD Distinct eigenvalues, hence diagonalizable
In Example DEMS5 the matrix
\[ H = \begin{bmatrix} 15 & 18 & -8 & 6 & -5 \\ 5 & 3 & 1 & -1 & -3 \\ 0 & -4 & 5 & -4 & -2 \\ -43 & -46 & 17 & -14 & 15 \\ 26 & 30 & -12 & 8 & -10 \end{bmatrix} \]
has characteristic polynomial
\[ p_H(x) = x(x - 2)(x - 1)(x + 1)(x + 3) \]
and so is a 5 × 5 matrix with 5 distinct eigenvalues.

By Theorem DED we know H must be diagonalizable. But just for practice, we exhibit a diagonalization. The matrix S contains eigenvectors of H as columns, one from each eigenspace, guaranteeing linearly independent columns and thus the nonsingularity of S. Notice that we are using the versions of the eigenvectors from Example DEMS5 that have integer entries. The diagonal matrix has the eigenvalues of H in the same order that their respective eigenvectors appear as the columns of S. With these matrices, verify computationally that S^{-1}HS = D.
\[ S = \begin{bmatrix} 2 & 1 & -1 & 1 & 1 \\ -1 & 0 & 2 & 0 & -1 \\ -2 & 0 & 2 & -1 & -2 \\ -4 & -1 & 0 & -2 & -1 \\ 2 & 2 & 1 & 2 & 1 \end{bmatrix} \qquad D = \begin{bmatrix} -3 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix} \]
Note that there are many different ways to diagonalize H. We could replace eigenvectors by nonzero scalar multiples, or we could rearrange the order of the eigenvectors as the columns of S (which would subsequently reorder the eigenvalues along the diagonal of D). △
Archetype B is another example of a matrix that has as many distinct eigenvalues as its size, and is hence diagonalizable by Theorem DED.

Powers of a diagonal matrix are easy to compute, and when a matrix is diagonalizable, it is almost as easy. We could state a theorem here perhaps, but we will settle instead for an example that makes the point just as well.
Example HPDM High power of a diagonalizable matrix
Suppose that
\[ A = \begin{bmatrix} 19 & 0 & 6 & 13 \\ -33 & -1 & -9 & -21 \\ 21 & -4 & 12 & 21 \\ -36 & 2 & -14 & -28 \end{bmatrix} \]
and we wish to compute A^{20}. Normally this would require 19 matrix multiplications, but since A is diagonalizable, we can simplify the computations substantially.

First, we diagonalize A. With
\[ S = \begin{bmatrix} 1 & -1 & 2 & -1 \\ -2 & 3 & -3 & 3 \\ 1 & 1 & 3 & 3 \\ -2 & 1 & -4 & 0 \end{bmatrix} \]
we find
\begin{align*}
D = S^{-1} A S &= \begin{bmatrix} -6 & 1 & -3 & -6 \\ 0 & 2 & -2 & -3 \\ 3 & 0 & 1 & 2 \\ -1 & -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 19 & 0 & 6 & 13 \\ -33 & -1 & -9 & -21 \\ 21 & -4 & 12 & 21 \\ -36 & 2 & -14 & -28 \end{bmatrix} \begin{bmatrix} 1 & -1 & 2 & -1 \\ -2 & 3 & -3 & 3 \\ 1 & 1 & 3 & 3 \\ -2 & 1 & -4 & 0 \end{bmatrix}\\
&= \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\end{align*}
Now we find an alternate expression for A^{20},
\begin{align*}
A^{20} &= A A A \ldots A\\
&= I_n A I_n A I_n A I_n \ldots I_n A I_n\\
&= \left(S S^{-1}\right) A \left(S S^{-1}\right) A \left(S S^{-1}\right) A \left(S S^{-1}\right) \ldots \left(S S^{-1}\right) A \left(S S^{-1}\right)\\
&= S \left(S^{-1} A S\right)\left(S^{-1} A S\right)\left(S^{-1} A S\right) \ldots \left(S^{-1} A S\right) S^{-1}\\
&= S D D D \ldots D S^{-1}\\
&= S D^{20} S^{-1}
\end{align*}
and since D is a diagonal matrix, powers are much easier to compute,
\begin{align*}
&= S \begin{bmatrix} -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}^{20} S^{-1}\\
&= S \begin{bmatrix} (-1)^{20} & 0 & 0 & 0 \\ 0 & (0)^{20} & 0 & 0 \\ 0 & 0 & (2)^{20} & 0 \\ 0 & 0 & 0 & (1)^{20} \end{bmatrix} S^{-1}\\
&= \begin{bmatrix} 1 & -1 & 2 & -1 \\ -2 & 3 & -3 & 3 \\ 1 & 1 & 3 & 3 \\ -2 & 1 & -4 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1048576 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -6 & 1 & -3 & -6 \\ 0 & 2 & -2 & -3 \\ 3 & 0 & 1 & 2 \\ -1 & -1 & 1 & 1 \end{bmatrix}\\
&= \begin{bmatrix} 6291451 & 2 & 2097148 & 4194297 \\ -9437175 & -5 & -3145719 & -6291441 \\ 9437175 & -2 & 3145728 & 6291453 \\ -12582900 & -2 & -4194298 & -8388596 \end{bmatrix}
\end{align*}
Notice how we effectively replaced the twentieth power of A by the twentieth power of D, and how a high power of a diagonal matrix is just a collection of powers of scalars on the diagonal. The price we pay for this simplification is the need to diagonalize the matrix (by computing eigenvalues and eigenvectors) and finding the inverse of the matrix of eigenvectors. And we still need to do two matrix products. But the higher the power, the greater the savings. △
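Since S, S^{-1} and D all have integer entries in this example, the whole computation can be checked with exact integer arithmetic. A sketch with NumPy integer arrays, hardcoding the S^{-1} from the example rather than inverting in floating point:

```python
import numpy as np

A = np.array([[ 19,  0,   6,  13],
              [-33, -1,  -9, -21],
              [ 21, -4,  12,  21],
              [-36,  2, -14, -28]])
S = np.array([[ 1, -1,  2, -1],
              [-2,  3, -3,  3],
              [ 1,  1,  3,  3],
              [-2,  1, -4,  0]])
Sinv = np.array([[-6,  1, -3, -6],
                 [ 0,  2, -2, -3],
                 [ 3,  0,  1,  2],
                 [-1, -1,  1,  1]])

assert (S @ Sinv == np.eye(4, dtype=int)).all()   # Sinv really is S^{-1}
D = Sinv @ A @ S
assert (D == np.diag([-1, 0, 2, 1])).all()        # A is diagonalized

# A^20 = S D^20 S^{-1}; powering D is just powering its diagonal entries.
D20 = np.diag(np.diag(D) ** 20)
A20 = S @ D20 @ Sinv
assert (A20 == np.linalg.matrix_power(A, 20)).all()
assert A20[0, 0] == 6291451                       # matches the example
```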
Subsection FS
Fibonacci Sequences
Example FSCF Fibonacci sequence, closed form
The Fibonacci sequence is a sequence of integers defined recursively by
\[ a_0 = 0 \qquad a_1 = 1 \qquad a_{n+1} = a_n + a_{n-1},\quad n \geq 1 \]
So the initial portion of the sequence is 0, 1, 1, 2, 3, 5, 8, 13, 21, .... In this subsection we will illustrate an application of eigenvalues and diagonalization through the determination of a closed-form expression for an arbitrary term of this sequence.

To begin, verify that for any n ≥ 1 the recursive statement above establishes the truth of the statement
\[ \begin{bmatrix} a_n \\ a_{n+1} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} a_{n-1} \\ a_n \end{bmatrix} \]
Let A denote this 2 × 2 matrix. Through repeated applications of the statement above we have
\[ \begin{bmatrix} a_n \\ a_{n+1} \end{bmatrix} = A \begin{bmatrix} a_{n-1} \\ a_n \end{bmatrix} = A^2 \begin{bmatrix} a_{n-2} \\ a_{n-1} \end{bmatrix} = A^3 \begin{bmatrix} a_{n-3} \\ a_{n-2} \end{bmatrix} = \cdots = A^n \begin{bmatrix} a_0 \\ a_1 \end{bmatrix} \]
In preparation for working with this high power of A, not unlike in Example HPDM, we will diagonalize A. The characteristic polynomial of A is p_A(x) = x^2 - x - 1, with roots (the eigenvalues of A by Theorem EMRCP)
\[ \rho = \frac{1 + \sqrt{5}}{2} \qquad \delta = \frac{1 - \sqrt{5}}{2} \]
With two distinct eigenvalues, Theorem DED implies that A is diagonalizable. It will be easier to compute with these eigenvalues once you confirm the following properties (all but the last can be derived from the fact that \rho and \delta are roots of the characteristic polynomial, in a factored or unfactored form)
\[ \rho + \delta = 1 \qquad \rho\delta = -1 \qquad 1 + \rho = \rho^2 \qquad 1 + \delta = \delta^2 \qquad \rho - \delta = \sqrt{5} \]
Then eigenvectors of A (for \rho and \delta, respectively) are
\[ \begin{bmatrix} 1 \\ \rho \end{bmatrix} \qquad \begin{bmatrix} 1 \\ \delta \end{bmatrix} \]
which can be easily confirmed, as we demonstrate for the eigenvector for \rho,
\[ \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ \rho \end{bmatrix} = \begin{bmatrix} \rho \\ 1 + \rho \end{bmatrix} = \begin{bmatrix} \rho \\ \rho^2 \end{bmatrix} = \rho \begin{bmatrix} 1 \\ \rho \end{bmatrix} \]
From the proof of Theorem DC we know A can be diagonalized by a matrix S with these eigenvectors as columns, giving D = S^{-1}AS. We list S, S^{-1} and the diagonal matrix D,
\[ S = \begin{bmatrix} 1 & 1 \\ \rho & \delta \end{bmatrix} \qquad S^{-1} = \frac{1}{\rho - \delta}\begin{bmatrix} -\delta & 1 \\ \rho & -1 \end{bmatrix} \qquad D = \begin{bmatrix} \rho & 0 \\ 0 & \delta \end{bmatrix} \]
OK, we have everything in place now. The main step in the following is to replace A by SDS^{-1}. Here we go,
\begin{align*}
\begin{bmatrix} a_n \\ a_{n+1} \end{bmatrix} &= A^n \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\\
&= \left(S D S^{-1}\right)^n \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\\
&= S D S^{-1} S D S^{-1} S D S^{-1} \cdots S D S^{-1} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\\
&= S D D D \cdots D S^{-1} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\\
&= S D^n S^{-1} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\\
&= \begin{bmatrix} 1 & 1 \\ \rho & \delta \end{bmatrix} \begin{bmatrix} \rho & 0 \\ 0 & \delta \end{bmatrix}^n \frac{1}{\rho - \delta} \begin{bmatrix} -\delta & 1 \\ \rho & -1 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}\\
&= \frac{1}{\rho - \delta} \begin{bmatrix} 1 & 1 \\ \rho & \delta \end{bmatrix} \begin{bmatrix} \rho^n & 0 \\ 0 & \delta^n \end{bmatrix} \begin{bmatrix} -\delta & 1 \\ \rho & -1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix}\\
&= \frac{1}{\rho - \delta} \begin{bmatrix} 1 & 1 \\ \rho & \delta \end{bmatrix} \begin{bmatrix} \rho^n & 0 \\ 0 & \delta^n \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix}\\
&= \frac{1}{\rho - \delta} \begin{bmatrix} 1 & 1 \\ \rho & \delta \end{bmatrix} \begin{bmatrix} \rho^n \\ -\delta^n \end{bmatrix}\\
&= \frac{1}{\rho - \delta} \begin{bmatrix} \rho^n - \delta^n \\ \rho^{n+1} - \delta^{n+1} \end{bmatrix}
\end{align*}
Performing the scalar multiplication and equating the first entries of the two vectors, we arrive at the closed form expression
\begin{align*}
a_n &= \frac{1}{\rho - \delta}\left(\rho^n - \delta^n\right)\\
&= \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right)\\
&= \frac{1}{2^n\sqrt{5}}\left(\left(1 + \sqrt{5}\right)^n - \left(1 - \sqrt{5}\right)^n\right)
\end{align*}
Notice that it does not matter whether we use the equality of the first or second entries of the vectors, we will arrive at the same formula, once in terms of n and again in terms of n + 1. Also, our definition clearly describes a sequence that will only contain integers, yet the presence of the irrational number \sqrt{5} might make us suspicious. But no, our expression for a_n will always yield an integer!
The Fibonacci sequence, and generalizations of it, have been extensively studied (Fibonacci lived in the 12th and 13th centuries). There are many ways to derive the closed-form expression we just found, and our approach may not be the most efficient route. But it is a nice demonstration of how diagonalization can be used to solve a problem outside the field of linear algebra. △
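The closed form can be checked directly in a few lines of Python, comparing a floating-point evaluation of the formula (rounded to the nearest integer, which removes the small numerical error) against the recursive definition:

```python
from math import sqrt

def fib_recursive(n):
    """a_0 = 0, a_1 = 1, a_{n+1} = a_n + a_{n-1}, computed iteratively."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed(n):
    """Closed form a_n = (rho^n - delta^n) / (rho - delta)."""
    rho = (1 + sqrt(5)) / 2
    delta = (1 - sqrt(5)) / 2
    # The exact value is an integer; rounding removes floating-point error
    # (safe here for moderate n, where the error is far below 1/2).
    return round((rho**n - delta**n) / (rho - delta))

# The two definitions agree on the initial portion of the sequence.
assert [fib_closed(n) for n in range(9)] == [0, 1, 1, 2, 3, 5, 8, 13, 21]
for n in range(40):
    assert fib_closed(n) == fib_recursive(n)
```

Note that |\delta| < 1, so \delta^n rapidly becomes negligible: a_n is simply \rho^n/\sqrt{5} rounded to the nearest integer.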
We close this section with a comment about an important upcoming theorem that we prove in Chapter R. A consequence of Theorem OD is that every Hermitian matrix (Definition HM) is diagonalizable (Definition DZM), and the similarity transformation that accomplishes the diagonalization uses a unitary matrix (Definition UM). This means that for every Hermitian matrix of size n there is a basis of C^n that is composed entirely of eigenvectors for the matrix and also forms an orthonormal set (Definition ONS). Notice that for matrices with only real entries, we only need the hypothesis that the matrix is symmetric (Definition SYM) to reach this conclusion (Example ESMS4). Can you imagine a prettier basis for use with a matrix? I cannot.

These results in Section OD explain much of our recurring interest in orthogonality, and make the section a high point in your study of linear algebra. A precise statement of this diagonalization result applies to a slightly broader class of matrices, known as "normal" matrices (Definition NRML), which are matrices that commute with their adjoints. With this expanded category of matrices, the result becomes an equivalence (Proof Technique E). See Theorem OD and Theorem OBNM in Section OD for all the details.
Reading Questions

1. What is an equivalence relation?
2. State a condition that is equivalent to a matrix being diagonalizable, but is not the definition.
3. Find a diagonal matrix similar to
\[ A = \begin{bmatrix} -5 & 8 \\ -4 & 7 \end{bmatrix} \]
Exercises

C20† Consider the matrix A below. First, show that A is diagonalizable by computing the geometric multiplicities of the eigenvalues and quoting the relevant theorem. Second, find a diagonal matrix D and a nonsingular matrix S so that S^{-1}AS = D. (See Exercise EE.C20 for some of the necessary computations.)
\[ A = \begin{bmatrix} 18 & -15 & 33 & -15 \\ -4 & 8 & -6 & 6 \\ -9 & 9 & -16 & 9 \\ 5 & -6 & 9 & -4 \end{bmatrix} \]
C21† Determine if the matrix A below is diagonalizable. If the matrix is diagonalizable, then find a diagonal matrix D that is similar to A, and provide the invertible matrix S that performs the similarity transformation. You should use your calculator to find the eigenvalues of the matrix, but try only using the row-reducing function of your calculator to assist with finding eigenvectors.
\[
A = \begin{bmatrix} 1 & 9 & 9 & 24 \\ -3 & -27 & -29 & -68 \\ 1 & 11 & 13 & 26 \\ 1 & 7 & 7 & 18 \end{bmatrix}
\]
C22† Consider the matrix A below. Find the eigenvalues of A using a calculator and use these to construct the characteristic polynomial of A, $p_A(x)$. State the algebraic multiplicity of each eigenvalue. Find all of the eigenspaces for A by computing expressions for null spaces, only using your calculator to row-reduce matrices. State the geometric multiplicity of each eigenvalue. Is A diagonalizable? If not, explain why. If so, find a diagonal matrix D that is similar to A.
\[
A = \begin{bmatrix} 19 & 25 & 30 & 5 \\ -23 & -30 & -35 & -5 \\ 7 & 9 & 10 & 1 \\ -3 & -4 & -5 & -1 \end{bmatrix}
\]
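For exercises like C20, C21 and C22, a numerical computation can confirm a hand calculation. The following sketch (an illustration with numpy, not part of the text's exercises) applies `numpy.linalg.eig` to the matrix of Exercise C20, which the exercise asserts is diagonalizable, so its eigenvectors form a basis.

```python
import numpy as np

# The matrix from Exercise C20, asserted diagonalizable by the exercise
A = np.array([[18, -15, 33, -15],
              [-4,   8,  -6,   6],
              [-9,   9, -16,   9],
              [ 5,  -6,   9,  -4]], dtype=float)

# w holds the eigenvalues; the columns of S are corresponding eigenvectors
w, S = np.linalg.eig(A)
D = np.diag(w)

# Each column satisfies A v_i = lambda_i v_i, so A S = S D.  When S is
# invertible (the eigenvectors form a basis) this says S^{-1} A S = D.
print(np.allclose(A @ S, S @ D))
print(np.linalg.matrix_rank(S))
```

Such a check is no substitute for computing the geometric multiplicities by hand, but it catches arithmetic slips quickly.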
T15† Suppose that A and B are similar matrices of size n. Prove that $A^3$ and $B^3$ are similar matrices. Generalize.
T16† Suppose that A and B are similar matrices, with A nonsingular. Prove that B is nonsingular, and that $A^{-1}$ is similar to $B^{-1}$.
T17† Suppose that B is a nonsingular matrix. Prove that AB is similar to BA.
Chapter LT
Linear Transformations
In the next linear algebra course you take, the first lecture might be a reminder about what a vector space is (Definition VS), its ten properties, basic theorems, and then some examples. The second lecture would likely be all about linear transformations. While it may seem we have waited a long time to present what must be a central topic, in truth we have already been working with linear transformations for some time.
Functions are important objects in the study of calculus, but have been absent from this course until now (well, not really, it just seems that way). In your study of more advanced mathematics it is nearly impossible to escape the use of functions; they are as fundamental as sets are.
Section LT
Linear Transformations
Early in Chapter VS we prefaced the definition of a vector space with the comment that it was "one of the two most important definitions in the entire course." Here comes the other. Any capsule summary of linear algebra would have to describe the subject as the interplay of linear transformations and vector spaces. Here we go.
Subsection LT
Linear Transformations
Definition LT Linear Transformation
A linear transformation, $T: U \to V$, is a function that carries elements of the vector space U (called the domain) to the vector space V (called the codomain), and which has two additional properties
1. $T(\mathbf{u}_1 + \mathbf{u}_2) = T(\mathbf{u}_1) + T(\mathbf{u}_2)$ for all $\mathbf{u}_1, \mathbf{u}_2 \in U$
2. $T(\alpha\mathbf{u}) = \alpha T(\mathbf{u})$ for all $\mathbf{u} \in U$ and all $\alpha \in \mathbb{C}$
□
The two defining conditions in the definition of a linear transformation should "feel linear," whatever that means. Conversely, these two conditions could be taken as exactly what it means to be linear. As every vector space property derives from vector addition and scalar multiplication, so too, every property of a linear transformation derives from these two defining properties. While these conditions may be reminiscent of how we test subspaces, they really are quite different, so do not confuse the two.
Here are two diagrams that convey the essence of the two defining properties of a linear transformation. In each case, begin in the upper left-hand corner, and follow the arrows around the rectangle to the lower right-hand corner, taking two different routes and doing the indicated operations labeled on the arrows, yielding two results. For a linear transformation these two expressions are always equal.
[Diagram DLTA: Definition of Linear Transformation, Additive. Starting from the pair $\mathbf{u}_1, \mathbf{u}_2$, applying T and then adding, or adding and then applying T, both arrive at $T(\mathbf{u}_1 + \mathbf{u}_2) = T(\mathbf{u}_1) + T(\mathbf{u}_2)$.]

[Diagram DLTM: Definition of Linear Transformation, Multiplicative. Starting from $\mathbf{u}$, applying T and then scaling by $\alpha$, or scaling by $\alpha$ and then applying T, both arrive at $T(\alpha\mathbf{u}) = \alpha T(\mathbf{u})$.]
A couple of words about notation. T is the name of the linear transformation, and should be used when we want to discuss the function as a whole. T(u) is how we talk about the output of the function; it is a vector in the vector space V. When we write $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y})$, the plus sign on the left is the operation of vector addition in the vector space U, since $\mathbf{x}$ and $\mathbf{y}$ are elements of U. The plus sign on the right is the operation of vector addition in the vector space V, since $T(\mathbf{x})$ and $T(\mathbf{y})$ are elements of the vector space V. These two instances of vector addition might be wildly different.
Let us examine several examples and begin to form a catalog of known linear transformations to work with.
Example ALT A linear transformation
Define $T: \mathbb{C}^3 \to \mathbb{C}^2$ by describing the output of the function for a generic input with the formula
\[
T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 + x_3 \\ -4x_2 \end{bmatrix}
\]
and check the two defining properties.
\begin{align*}
T(\mathbf{x} + \mathbf{y}) &= T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}\right) \\
&= T\left(\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \end{bmatrix}\right) \\
&= \begin{bmatrix} 2(x_1 + y_1) + (x_3 + y_3) \\ -4(x_2 + y_2) \end{bmatrix} \\
&= \begin{bmatrix} (2x_1 + x_3) + (2y_1 + y_3) \\ -4x_2 + (-4)y_2 \end{bmatrix} \\
&= \begin{bmatrix} 2x_1 + x_3 \\ -4x_2 \end{bmatrix} + \begin{bmatrix} 2y_1 + y_3 \\ -4y_2 \end{bmatrix} \\
&= T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) + T\left(\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}\right) \\
&= T(\mathbf{x}) + T(\mathbf{y})
\end{align*}
and
\begin{align*}
T(\alpha\mathbf{x}) &= T\left(\alpha\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) \\
&= T\left(\begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \alpha x_3 \end{bmatrix}\right) \\
&= \begin{bmatrix} 2(\alpha x_1) + (\alpha x_3) \\ -4(\alpha x_2) \end{bmatrix} \\
&= \begin{bmatrix} \alpha(2x_1 + x_3) \\ \alpha(-4x_2) \end{bmatrix} \\
&= \alpha\begin{bmatrix} 2x_1 + x_3 \\ -4x_2 \end{bmatrix} \\
&= \alpha T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) \\
&= \alpha T(\mathbf{x})
\end{align*}
So by Definition LT, T is a linear transformation. △
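The algebra above can also be spot-checked numerically. Here is a sketch (assuming numpy) that tests both defining properties of the T from Example ALT on one random pair of vectors; this illustrates Definition LT but is no substitute for the general proof, since the properties must hold for all inputs and scalars.

```python
import numpy as np

# T from Example ALT: T(x1, x2, x3) = (2 x1 + x3, -4 x2)
def T(x):
    return np.array([2 * x[0] + x[2], -4 * x[1]])

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
y = rng.standard_normal(3)
alpha = 1.7

# The two defining properties of Definition LT, on one sample
print(np.allclose(T(x + y), T(x) + T(y)))      # additive property
print(np.allclose(T(alpha * x), alpha * T(x))) # scalar property
```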
It can be just as instructive to look at functions that are not linear transformations. Since the defining conditions must be true for all vectors and scalars, it is enough to find just one situation where the properties fail.
Example NLT Not a linear transformation
Define $S: \mathbb{C}^3 \to \mathbb{C}^3$ by
\[
S\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 4x_1 + 2x_2 \\ 0 \\ x_1 + 3x_3 - 2 \end{bmatrix}
\]
This function "looks" linear, but consider
\[
3\,S\left(\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\right) = 3\begin{bmatrix} 8 \\ 0 \\ 8 \end{bmatrix} = \begin{bmatrix} 24 \\ 0 \\ 24 \end{bmatrix}
\]
while
\[
S\left(3\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\right) = S\left(\begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix}\right) = \begin{bmatrix} 24 \\ 0 \\ 28 \end{bmatrix}
\]
So the second required property fails for the choice of $\alpha = 3$ and $\mathbf{x} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ and by Definition LT, S is not a linear transformation. It is just about as easy to find an example where the first defining property fails (try it!). Notice that it is the "−2" in the third component of the definition of S that prevents the function from being a linear transformation. △
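A two-line computation (a numpy sketch) reproduces this failure: scaling the input by 3 and scaling the output by 3 disagree in the third component, precisely because of the constant −2.

```python
import numpy as np

# S from Example NLT; the "-2" in the third component breaks linearity
def S(x):
    return np.array([4 * x[0] + 2 * x[1], 0, x[0] + 3 * x[2] - 2])

x = np.array([1, 2, 3])
print(3 * S(x))    # third component is 24
print(S(3 * x))    # third component is 28, so S(3x) != 3 S(x)
```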
Example LTPM Linear transformation, polynomials to matrices
Define a linear transformation $T: P_3 \to M_{22}$ by
\[
T\left(a + bx + cx^2 + dx^3\right) = \begin{bmatrix} a + b & a - 2c \\ d & b - d \end{bmatrix}
\]
We verify the two defining conditions of a linear transformation.
\begin{align*}
T(\mathbf{x} + \mathbf{y}) &= T\left((a_1 + b_1x + c_1x^2 + d_1x^3) + (a_2 + b_2x + c_2x^2 + d_2x^3)\right) \\
&= T\left((a_1 + a_2) + (b_1 + b_2)x + (c_1 + c_2)x^2 + (d_1 + d_2)x^3\right) \\
&= \begin{bmatrix} (a_1 + a_2) + (b_1 + b_2) & (a_1 + a_2) - 2(c_1 + c_2) \\ d_1 + d_2 & (b_1 + b_2) - (d_1 + d_2) \end{bmatrix} \\
&= \begin{bmatrix} (a_1 + b_1) + (a_2 + b_2) & (a_1 - 2c_1) + (a_2 - 2c_2) \\ d_1 + d_2 & (b_1 - d_1) + (b_2 - d_2) \end{bmatrix} \\
&= \begin{bmatrix} a_1 + b_1 & a_1 - 2c_1 \\ d_1 & b_1 - d_1 \end{bmatrix} + \begin{bmatrix} a_2 + b_2 & a_2 - 2c_2 \\ d_2 & b_2 - d_2 \end{bmatrix} \\
&= T\left(a_1 + b_1x + c_1x^2 + d_1x^3\right) + T\left(a_2 + b_2x + c_2x^2 + d_2x^3\right) \\
&= T(\mathbf{x}) + T(\mathbf{y})
\end{align*}
and
\begin{align*}
T(\alpha\mathbf{x}) &= T\left(\alpha(a + bx + cx^2 + dx^3)\right) \\
&= T\left((\alpha a) + (\alpha b)x + (\alpha c)x^2 + (\alpha d)x^3\right) \\
&= \begin{bmatrix} (\alpha a) + (\alpha b) & (\alpha a) - 2(\alpha c) \\ \alpha d & (\alpha b) - (\alpha d) \end{bmatrix} \\
&= \begin{bmatrix} \alpha(a + b) & \alpha(a - 2c) \\ \alpha d & \alpha(b - d) \end{bmatrix} \\
&= \alpha\begin{bmatrix} a + b & a - 2c \\ d & b - d \end{bmatrix} \\
&= \alpha T\left(a + bx + cx^2 + dx^3\right) \\
&= \alpha T(\mathbf{x})
\end{align*}
So by Definition LT, T is a linear transformation. △
Example LTPP Linear transformation, polynomials to polynomials
Define a function $S: P_4 \to P_5$ by
\[
S(p(x)) = (x - 2)p(x)
\]
Then
\begin{align*}
S(p(x) + q(x)) &= (x - 2)(p(x) + q(x)) \\
&= (x - 2)p(x) + (x - 2)q(x) = S(p(x)) + S(q(x)) \\
S(\alpha p(x)) &= (x - 2)(\alpha p(x)) = (x - 2)\alpha p(x) = \alpha(x - 2)p(x) = \alpha S(p(x))
\end{align*}
So by Definition LT, S is a linear transformation. △
Linear transformations have many amazing properties, which we will investigate through the next few sections. However, as a taste of things to come, here is a theorem we can prove now and put to use immediately.
Theorem LTTZZ Linear Transformations Take Zero to Zero
Suppose $T: U \to V$ is a linear transformation. Then $T(\mathbf{0}) = \mathbf{0}$.

Proof. The two zero vectors in the conclusion of the theorem are different. The first is from U while the second is from V. We will subscript the zero vectors in this proof to highlight the distinction. Think about your objects. (This proof is contributed by Mark Shoemaker.)
\begin{align*}
T(\mathbf{0}_U) &= T(0\mathbf{0}_U) && \text{Theorem ZSSM in } U \\
&= 0\,T(\mathbf{0}_U) && \text{Definition LT} \\
&= \mathbf{0}_V && \text{Theorem ZSSM in } V
\end{align*}
■
Return to Example NLT and compute
\[
S\left(\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 0 \\ -2 \end{bmatrix}
\]
to quickly see again that S is not a linear transformation, while in Example LTPM compute
\[
T\left(0 + 0x + 0x^2 + 0x^3\right) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
as an example of Theorem LTTZZ at work.
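Theorem LTTZZ is easy to watch in action. This sketch (assuming numpy) builds a linear transformation from an arbitrary matrix, a construction that Subsection MLT makes official, and checks that the zero vector of the domain lands on the zero vector of the codomain.

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.standard_normal((3, 4))   # any 3 x 4 matrix

def T(x):
    # T: C^4 -> C^3 defined by a matrix-vector product
    return A @ x

zero_U = np.zeros(4)   # zero vector of the domain U = C^4
zero_V = np.zeros(3)   # zero vector of the codomain V = C^3

# Theorem LTTZZ: a linear transformation takes zero to zero
print(np.array_equal(T(zero_U), zero_V))
```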
Subsection LTC
Linear Transformation Cartoons
Throughout this chapter, and Chapter R, we will include drawings of linear transformations. We will call them "cartoons," not because they are humorous, but because they will only expose a portion of the truth. A Bugs Bunny cartoon might give us some insights on human nature, but the rules of physics and biology are routinely (and grossly) violated. So it will be with our linear transformation cartoons.
Here is our first, followed by a guide to help you understand how these are meant to describe fundamental truths about linear transformations, while simultaneously violating other truths.
[Diagram GLT: General Linear Transformation. Two ovals represent the domain U (containing the vectors $\mathbf{u}$, $\mathbf{w}$, $\mathbf{x}$ and $\mathbf{0}_U$) and the codomain V (containing $\mathbf{v}$, $\mathbf{y}$, $\mathbf{t}$ and $\mathbf{0}_V$), with arrows labeled T carrying domain vectors to their outputs.]
Here we picture a linear transformation $T: U \to V$, where this information will be consistently displayed along the bottom edge. The ovals are meant to represent the vector spaces, in this case U, the domain, on the left and V, the codomain, on the right. Of course, vector spaces are typically infinite sets, so you will have to imagine that characteristic of these sets. A small dot inside of an oval will represent a vector within that vector space, sometimes with a name, sometimes not (in this case every vector has a name). The sizes of the ovals are meant to be proportional to the dimensions of the vector spaces. However, when we make no assumptions about the dimensions, we will draw the ovals as the same size, as we have done here (which is not meant to suggest that the dimensions have to be equal).
To convey that the linear transformation associates a certain input with a certain output, we will draw an arrow from the input to the output. So, for example, in this cartoon we suggest that $T(\mathbf{x}) = \mathbf{y}$. Nothing in the definition of a linear transformation prevents two different inputs being sent to the same output and we see this in $T(\mathbf{u}) = \mathbf{v} = T(\mathbf{w})$. Similarly, an output may not have any input being sent its way, as illustrated by no arrow pointing at $\mathbf{t}$. In this cartoon, we have captured the essence of our one general theorem about linear transformations, Theorem LTTZZ, $T(\mathbf{0}_U) = \mathbf{0}_V$. On occasion we might include this basic fact when it is relevant, at other times maybe not. Note that the definition of a linear transformation requires that it be a function, so every element of the domain should be associated with some element of the codomain. This will be reflected by never having an element of the domain without an arrow originating there.
These cartoons are of course no substitute for careful definitions and proofs, but they can be a handy way to think about the various properties we will be studying.
Subsection MLT
Matrices and Linear Transformations
If you give me a matrix, then I can quickly build you a linear transformation. Always. First a motivating example and then the theorem.
Example LTM Linear transformation from a matrix
Let
\[
A = \begin{bmatrix} 3 & -1 & 8 & 1 \\ 2 & 0 & 5 & -2 \\ 1 & 1 & 3 & -7 \end{bmatrix}
\]
and define a function $P: \mathbb{C}^4 \to \mathbb{C}^3$ by
\[
P(\mathbf{x}) = A\mathbf{x}
\]
So we are using an old friend, the matrix-vector product (Definition MVP) as a way to convert a vector with 4 components into a vector with 3 components. Applying Definition MVP allows us to write the defining formula for P in a slightly different form,
\[
P(\mathbf{x}) = A\mathbf{x} = \begin{bmatrix} 3 & -1 & 8 & 1 \\ 2 & 0 & 5 & -2 \\ 1 & 1 & 3 & -7 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = x_1\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} + x_2\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} + x_3\begin{bmatrix} 8 \\ 5 \\ 3 \end{bmatrix} + x_4\begin{bmatrix} 1 \\ -2 \\ -7 \end{bmatrix}
\]
So we recognize the action of the function P as using the components of the vector $(x_1, x_2, x_3, x_4)$ as scalars to form the output of P as a linear combination of the four columns of the matrix A, which are all members of $\mathbb{C}^3$, so the result is a vector in $\mathbb{C}^3$. We can rearrange this expression further, using our definitions of operations in $\mathbb{C}^3$ (Section VO).
\begin{align*}
P(\mathbf{x}) &= A\mathbf{x} && \text{Definition of } P \\
&= x_1\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} + x_2\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} + x_3\begin{bmatrix} 8 \\ 5 \\ 3 \end{bmatrix} + x_4\begin{bmatrix} 1 \\ -2 \\ -7 \end{bmatrix} && \text{Definition MVP} \\
&= \begin{bmatrix} 3x_1 \\ 2x_1 \\ x_1 \end{bmatrix} + \begin{bmatrix} -x_2 \\ 0 \\ x_2 \end{bmatrix} + \begin{bmatrix} 8x_3 \\ 5x_3 \\ 3x_3 \end{bmatrix} + \begin{bmatrix} x_4 \\ -2x_4 \\ -7x_4 \end{bmatrix} && \text{Definition CVSM} \\
&= \begin{bmatrix} 3x_1 - x_2 + 8x_3 + x_4 \\ 2x_1 + 5x_3 - 2x_4 \\ x_1 + x_2 + 3x_3 - 7x_4 \end{bmatrix} && \text{Definition CVA}
\end{align*}
You might recognize this final expression as being similar in style to some previous examples (Example ALT) and some linear transformations defined in the archetypes (Archetype M through Archetype R). But the expression that says the output of this linear transformation is a linear combination of the columns of A is probably the most powerful way of thinking about examples of this type.
Almost forgot: we should verify that P is indeed a linear transformation. This is easy with two matrix properties from Section MM.
\begin{align*}
P(\mathbf{x} + \mathbf{y}) &= A(\mathbf{x} + \mathbf{y}) && \text{Definition of } P \\
&= A\mathbf{x} + A\mathbf{y} && \text{Theorem MMDAA} \\
&= P(\mathbf{x}) + P(\mathbf{y}) && \text{Definition of } P
\end{align*}
and
\begin{align*}
P(\alpha\mathbf{x}) &= A(\alpha\mathbf{x}) && \text{Definition of } P \\
&= \alpha(A\mathbf{x}) && \text{Theorem MMSMM} \\
&= \alpha P(\mathbf{x}) && \text{Definition of } P
\end{align*}
So by Definition LT, P is a linear transformation. △
So the multiplication of a vector by a matrix "transforms" the input vector into an output vector, possibly of a different size, by performing a linear combination. And this transformation happens in a "linear" fashion. This "functional" view of the matrix-vector product is the most important shift you can make right now in how you think about linear algebra. Here is the theorem, whose proof is very nearly an exact copy of the verification in the last example.
Theorem MBLT Matrices Build Linear Transformations
Suppose that A is an $m \times n$ matrix. Define a function $T: \mathbb{C}^n \to \mathbb{C}^m$ by $T(\mathbf{x}) = A\mathbf{x}$. Then T is a linear transformation.

Proof.
\begin{align*}
T(\mathbf{x} + \mathbf{y}) &= A(\mathbf{x} + \mathbf{y}) && \text{Definition of } T \\
&= A\mathbf{x} + A\mathbf{y} && \text{Theorem MMDAA} \\
&= T(\mathbf{x}) + T(\mathbf{y}) && \text{Definition of } T
\end{align*}
and
\begin{align*}
T(\alpha\mathbf{x}) &= A(\alpha\mathbf{x}) && \text{Definition of } T \\
&= \alpha(A\mathbf{x}) && \text{Theorem MMSMM} \\
&= \alpha T(\mathbf{x}) && \text{Definition of } T
\end{align*}
So by Definition LT, T is a linear transformation. ■
So Theorem MBLT gives us a rapid way to construct linear transformations. Grab an $m \times n$ matrix A, define $T(\mathbf{x}) = A\mathbf{x}$ and Theorem MBLT tells us that T is a linear transformation from $\mathbb{C}^n$ to $\mathbb{C}^m$, without any further checking.
We can turn Theorem MBLT around. You give me a linear transformation and I will give you a matrix.
Example MFLT Matrix from a linear transformation
Define the function $R: \mathbb{C}^3 \to \mathbb{C}^4$ by
\[
R\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 - 3x_2 + 4x_3 \\ x_1 + x_2 + x_3 \\ -x_1 + 5x_2 - 3x_3 \\ x_2 - 4x_3 \end{bmatrix}
\]
You could verify that R is a linear transformation by applying the definition, but we will instead massage the expression defining a typical output until we recognize the form of a known class of linear transformations.
\begin{align*}
R\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) &= \begin{bmatrix} 2x_1 - 3x_2 + 4x_3 \\ x_1 + x_2 + x_3 \\ -x_1 + 5x_2 - 3x_3 \\ x_2 - 4x_3 \end{bmatrix} \\
&= \begin{bmatrix} 2x_1 \\ x_1 \\ -x_1 \\ 0 \end{bmatrix} + \begin{bmatrix} -3x_2 \\ x_2 \\ 5x_2 \\ x_2 \end{bmatrix} + \begin{bmatrix} 4x_3 \\ x_3 \\ -3x_3 \\ -4x_3 \end{bmatrix} && \text{Definition CVA} \\
&= x_1\begin{bmatrix} 2 \\ 1 \\ -1 \\ 0 \end{bmatrix} + x_2\begin{bmatrix} -3 \\ 1 \\ 5 \\ 1 \end{bmatrix} + x_3\begin{bmatrix} 4 \\ 1 \\ -3 \\ -4 \end{bmatrix} && \text{Definition CVSM} \\
&= \begin{bmatrix} 2 & -3 & 4 \\ 1 & 1 & 1 \\ -1 & 5 & -3 \\ 0 & 1 & -4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} && \text{Definition MVP}
\end{align*}
So if we define the matrix
\[
B = \begin{bmatrix} 2 & -3 & 4 \\ 1 & 1 & 1 \\ -1 & 5 & -3 \\ 0 & 1 & -4 \end{bmatrix}
\]
then $R(\mathbf{x}) = B\mathbf{x}$. By Theorem MBLT, we can easily recognize R as a linear transformation since it has the form described in the hypothesis of the theorem. △
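A quick check (a numpy sketch) confirms that the matrix B derived in Example MFLT really reproduces the defining formula for R on an arbitrary input.

```python
import numpy as np

# B as derived in Example MFLT
B = np.array([[ 2, -3,  4],
              [ 1,  1,  1],
              [-1,  5, -3],
              [ 0,  1, -4]], dtype=float)

# R written out directly from its defining formula
def R(x):
    x1, x2, x3 = x
    return np.array([2*x1 - 3*x2 + 4*x3,
                     x1 + x2 + x3,
                     -x1 + 5*x2 - 3*x3,
                     x2 - 4*x3])

x = np.array([1.0, 2.0, 3.0])
print(np.allclose(R(x), B @ x))   # the two computations agree
```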
Example MFLT was not an accident. Consider any one of the archetypes where both the domain and codomain are sets of column vectors (Archetype M through Archetype R) and you should be able to mimic the previous example. Here is the theorem, which is notable since it is our first occasion to use the full power of the defining properties of a linear transformation when our hypothesis includes a linear transformation.
Theorem MLTCV Matrix of a Linear Transformation, Column Vectors
Suppose that $T: \mathbb{C}^n \to \mathbb{C}^m$ is a linear transformation. Then there is an $m \times n$ matrix A such that $T(\mathbf{x}) = A\mathbf{x}$.
Proof. The conclusion says a certain matrix exists. What better way to prove something exists than to actually build it? So our proof will be constructive (Proof Technique C), and the procedure that we will use abstractly in the proof can be used concretely in specific examples.
Let $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n$ be the columns of the identity matrix of size n, $I_n$ (Definition SUV). Evaluate the linear transformation T with each of these standard unit vectors as an input, and record the result. In other words, define n vectors in $\mathbb{C}^m$, $\mathbf{A}_i$, $1 \le i \le n$, by
\[
\mathbf{A}_i = T(\mathbf{e}_i)
\]
Then package up these vectors as the columns of a matrix
\[
A = [\mathbf{A}_1|\mathbf{A}_2|\mathbf{A}_3|\dots|\mathbf{A}_n]
\]
Does A have the desired properties? First, A is clearly an $m \times n$ matrix. Then
\begin{align*}
T(\mathbf{x}) &= T(I_n\mathbf{x}) && \text{Theorem MMIM} \\
&= T([\mathbf{e}_1|\mathbf{e}_2|\mathbf{e}_3|\dots|\mathbf{e}_n]\,\mathbf{x}) && \text{Definition SUV} \\
&= T\left([\mathbf{x}]_1\mathbf{e}_1 + [\mathbf{x}]_2\mathbf{e}_2 + [\mathbf{x}]_3\mathbf{e}_3 + \dots + [\mathbf{x}]_n\mathbf{e}_n\right) && \text{Definition MVP} \\
&= T([\mathbf{x}]_1\mathbf{e}_1) + T([\mathbf{x}]_2\mathbf{e}_2) + T([\mathbf{x}]_3\mathbf{e}_3) + \dots + T([\mathbf{x}]_n\mathbf{e}_n) && \text{Definition LT} \\
&= [\mathbf{x}]_1T(\mathbf{e}_1) + [\mathbf{x}]_2T(\mathbf{e}_2) + [\mathbf{x}]_3T(\mathbf{e}_3) + \dots + [\mathbf{x}]_nT(\mathbf{e}_n) && \text{Definition LT} \\
&= [\mathbf{x}]_1\mathbf{A}_1 + [\mathbf{x}]_2\mathbf{A}_2 + [\mathbf{x}]_3\mathbf{A}_3 + \dots + [\mathbf{x}]_n\mathbf{A}_n && \text{Definition of } \mathbf{A}_i \\
&= A\mathbf{x} && \text{Definition MVP}
\end{align*}
as desired. ■
So if we were to restrict our study of linear transformations to those where the domain and codomain are both vector spaces of column vectors (Definition VSCV), every matrix leads to a linear transformation of this type (Theorem MBLT), while every such linear transformation leads to a matrix (Theorem MLTCV). So matrices and linear transformations are fundamentally the same. We call the matrix A of Theorem MLTCV the matrix representation of T.
We have defined linear transformations for more general vector spaces than just $\mathbb{C}^m$. Can we extend this correspondence between linear transformations and matrices to more general linear transformations (more general domains and codomains)? Yes, and this is the main theme of Chapter R. Stay tuned. For now, let us illustrate Theorem MLTCV with an example.
Example MOLT Matrix of a linear transformation
Suppose $S: \mathbb{C}^3 \to \mathbb{C}^4$ is defined by
\[
S\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 3x_1 - 2x_2 + 5x_3 \\ x_1 + x_2 + x_3 \\ 9x_1 - 2x_2 + 5x_3 \\ 4x_2 \end{bmatrix}
\]
Then
\begin{align*}
\mathbf{C}_1 &= S(\mathbf{e}_1) = S\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 3 \\ 1 \\ 9 \\ 0 \end{bmatrix} \\
\mathbf{C}_2 &= S(\mathbf{e}_2) = S\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} -2 \\ 1 \\ -2 \\ 4 \end{bmatrix} \\
\mathbf{C}_3 &= S(\mathbf{e}_3) = S\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 5 \\ 1 \\ 5 \\ 0 \end{bmatrix}
\end{align*}
so define
\[
C = [\mathbf{C}_1|\mathbf{C}_2|\mathbf{C}_3] = \begin{bmatrix} 3 & -2 & 5 \\ 1 & 1 & 1 \\ 9 & -2 & 5 \\ 0 & 4 & 0 \end{bmatrix}
\]
and Theorem MLTCV guarantees that $S(\mathbf{x}) = C\mathbf{x}$.
As an illuminating exercise, let $\mathbf{z} = \begin{bmatrix} 2 \\ -3 \\ 3 \end{bmatrix}$ and compute $S(\mathbf{z})$ two different ways. First, return to the definition of S and evaluate $S(\mathbf{z})$ directly. Then do the matrix-vector product $C\mathbf{z}$. In both cases you should obtain the vector $S(\mathbf{z}) = \begin{bmatrix} 27 \\ 2 \\ 39 \\ -12 \end{bmatrix}$. △
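The construction in the proof of Theorem MLTCV is entirely mechanical, as this sketch (assuming numpy) shows for the S of Example MOLT: evaluate S on the standard unit vectors, package the outputs as columns, and the resulting matrix reproduces S.

```python
import numpy as np

# S from Example MOLT
def S(x):
    x1, x2, x3 = x
    return np.array([3*x1 - 2*x2 + 5*x3,
                     x1 + x2 + x3,
                     9*x1 - 2*x2 + 5*x3,
                     4*x2])

# Evaluate S on the columns e_1, e_2, e_3 of I_3 and package as columns
C = np.column_stack([S(e) for e in np.eye(3)])

z = np.array([2.0, -3.0, 3.0])
print(np.allclose(C @ z, S(z)))   # both computations give (27, 2, 39, -12)
```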
Subsection LTLC
Linear Transformations and Linear Combinations
It is the interaction between linear transformations and linear combinations that lies at the heart of many of the important theorems of linear algebra. The next theorem distills the essence of this. The proof is not deep, the result is hardly startling, but it will be referenced frequently. We have already passed by one occasion to employ it, in the proof of Theorem MLTCV. Paraphrasing, this theorem says that we can "push" linear transformations "down into" linear combinations, or "pull" linear transformations "up out" of linear combinations. We will have opportunities to both push and pull.
Theorem LTLC Linear Transformations and Linear Combinations
Suppose that $T: U \to V$ is a linear transformation, $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \dots, \mathbf{u}_t$ are vectors from U and $a_1, a_2, a_3, \dots, a_t$ are scalars from $\mathbb{C}$. Then
\[
T(a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + a_3\mathbf{u}_3 + \dots + a_t\mathbf{u}_t) = a_1T(\mathbf{u}_1) + a_2T(\mathbf{u}_2) + a_3T(\mathbf{u}_3) + \dots + a_tT(\mathbf{u}_t)
\]

Proof.
\begin{align*}
T(a_1\mathbf{u}_1 + a_2\mathbf{u}_2 + \dots + a_t\mathbf{u}_t) &= T(a_1\mathbf{u}_1) + T(a_2\mathbf{u}_2) + \dots + T(a_t\mathbf{u}_t) && \text{Definition LT} \\
&= a_1T(\mathbf{u}_1) + a_2T(\mathbf{u}_2) + \dots + a_tT(\mathbf{u}_t) && \text{Definition LT}
\end{align*}
■
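Theorem LTLC in miniature: a numpy sketch that pushes a matrix-defined transformation into a linear combination of four random vectors and confirms it can be pulled back out.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))

def T(x):
    return A @ x   # a linear transformation from C^5 to C^3

u = [rng.standard_normal(5) for _ in range(4)]   # vectors u_1, ..., u_4
a = rng.standard_normal(4)                       # scalars a_1, ..., a_4

lhs = T(sum(a[i] * u[i] for i in range(4)))      # T pushed into the combination
rhs = sum(a[i] * T(u[i]) for i in range(4))      # T pulled out of the combination
print(np.allclose(lhs, rhs))
```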
Some authors, especially in more advanced texts, take the conclusion of Theorem LTLC as the defining condition of a linear transformation. This has the appeal of being a single condition, rather than the two-part condition of Definition LT. (See Exercise LT.T20.)
Our next theorem says, informally, that it is enough to know how a linear transformation behaves for inputs from any basis of the domain, and all the other outputs are described by a linear combination of these few values. Again, the statement of the theorem, and its proof, are not remarkable, but the insight that goes along with it is very fundamental.
Theorem LTDB Linear Transformation Defined on a Basis
Suppose U is a vector space with basis $B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \dots, \mathbf{u}_n\}$ and the vector space V contains the vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \dots, \mathbf{v}_n$ (which may not be distinct). Then there is a unique linear transformation, $T: U \to V$, such that $T(\mathbf{u}_i) = \mathbf{v}_i$, $1 \le i \le n$.
Proof. To prove the existence of T , we construct a function and show that it is a<br />
l<strong>in</strong>ear transformation (Proof Technique C). Suppose w ∈ U is an arbitrary element<br />
of the doma<strong>in</strong>. Then by Theorem VRRB there are unique scalars a 1 ,a 2 ,a 3 , ..., a n<br />
such that<br />
w = a 1 u 1 + a 2 u 2 + a 3 u 3 + ···+ a n u n<br />
Then def<strong>in</strong>e the function T by<br />
T (w) =a 1 v 1 + a 2 v 2 + a 3 v 3 + ···+ a n v n<br />
It should be clear that T behavesasrequiredforn <strong>in</strong>puts from B. S<strong>in</strong>ce the scalars<br />
provided by Theorem VRRB are unique, there is no ambiguity <strong>in</strong> this def<strong>in</strong>ition,
§LT Beezer: A First Course in Linear Algebra 433

and T qualifies as a function with domain U and codomain V (i.e. T is well-defined). But is T a linear transformation as well?
Let x ∈ U be a second element of the domain, and suppose the scalars provided by Theorem VRRB (relative to B) are b_1, b_2, b_3, ..., b_n. Then

T(w + x) = T(a_1 u_1 + \cdots + a_n u_n + b_1 u_1 + \cdots + b_n u_n)
  = T((a_1 + b_1) u_1 + \cdots + (a_n + b_n) u_n)                 Definition VS
  = (a_1 + b_1) v_1 + \cdots + (a_n + b_n) v_n                    Definition of T
  = a_1 v_1 + \cdots + a_n v_n + b_1 v_1 + \cdots + b_n v_n       Definition VS
  = T(w) + T(x)

Let α ∈ C be any scalar. Then

T(αw) = T(α(a_1 u_1 + a_2 u_2 + a_3 u_3 + \cdots + a_n u_n))
  = T(αa_1 u_1 + αa_2 u_2 + αa_3 u_3 + \cdots + αa_n u_n)         Definition VS
  = αa_1 v_1 + αa_2 v_2 + αa_3 v_3 + \cdots + αa_n v_n            Definition of T
  = α(a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_n v_n)             Definition VS
  = αT(w)

So by Definition LT, T is a linear transformation.
Is T unique (among all linear transformations that take the u_i to the v_i)? Applying Proof Technique U, we posit the existence of a second linear transformation, S: U → V such that S(u_i) = v_i, 1 ≤ i ≤ n. Again, let w ∈ U represent an arbitrary element of U and let a_1, a_2, a_3, ..., a_n be the scalars provided by Theorem VRRB (relative to B). We have,

T(w) = T(a_1 u_1 + a_2 u_2 + a_3 u_3 + \cdots + a_n u_n)          Theorem VRRB
  = a_1 T(u_1) + a_2 T(u_2) + a_3 T(u_3) + \cdots + a_n T(u_n)    Theorem LTLC
  = a_1 v_1 + a_2 v_2 + a_3 v_3 + \cdots + a_n v_n                Definition of T
  = a_1 S(u_1) + a_2 S(u_2) + a_3 S(u_3) + \cdots + a_n S(u_n)    Definition of S
  = S(a_1 u_1 + a_2 u_2 + a_3 u_3 + \cdots + a_n u_n)             Theorem LTLC
  = S(w)                                                          Theorem VRRB

So the outputs of T and S agree on every input, which means they are equal as functions, T = S. So T is unique.
■
You might recall facts from analytic geometry, such as “any two points determine a line” and “any three non-collinear points determine a parabola.” Theorem LTDB has much of the same feel. By specifying the n outputs for inputs from a basis, an entire linear transformation is determined. The analogy is not perfect, but the style of these facts is not very dissimilar from Theorem LTDB.
Notice that the statement of Theorem LTDB asserts the existence of a linear transformation with certain properties, while the proof shows us exactly how to define the desired linear transformation. The next two examples show how to compute values of linear transformations that we create this way.
Example LTDB1 Linear transformation defined on a basis
Consider the linear transformation T: C^3 → C^2 that is required to have the following three values,

T\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\qquad
T\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} -1 \\ 4 \end{bmatrix}
\qquad
T\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 6 \\ 0 \end{bmatrix}

Because

B = \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}

is a basis for C^3 (Theorem SUVB), Theorem LTDB says there is a unique linear transformation T that behaves this way.
How do we compute other values of T? Consider the input

w = \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix} = (2)\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + (-3)\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + (1)\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

Then

T(w) = (2)\begin{bmatrix} 2 \\ 1 \end{bmatrix} + (-3)\begin{bmatrix} -1 \\ 4 \end{bmatrix} + (1)\begin{bmatrix} 6 \\ 0 \end{bmatrix} = \begin{bmatrix} 13 \\ -10 \end{bmatrix}

Doing it again,

x = \begin{bmatrix} 5 \\ 2 \\ -3 \end{bmatrix} = (5)\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + (2)\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + (-3)\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

so

T(x) = (5)\begin{bmatrix} 2 \\ 1 \end{bmatrix} + (2)\begin{bmatrix} -1 \\ 4 \end{bmatrix} + (-3)\begin{bmatrix} 6 \\ 0 \end{bmatrix} = \begin{bmatrix} -10 \\ 13 \end{bmatrix}
Any other value of T could be computed in a similar manner. So rather than being given a formula for the outputs of T, the requirement that T behave in a certain way for inputs chosen from a basis of the domain is just as sufficient as a formula for computing any value of the function. You might notice some parallels between this example and Example MOLT or Theorem MLTCV.
△
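If you want to experiment with this example, the computation can be sketched in a few lines of NumPy (NumPy and the names `M` and `T` are our choices here, not part of the text). Because B is the standard basis, the scalars in the linear combination are just the entries of the input, so evaluating T amounts to a matrix-vector product whose matrix has the three prescribed outputs as columns:

```python
import numpy as np

# Outputs of T on the standard basis of C^3 (from Example LTDB1),
# collected as the columns of a matrix.
M = np.column_stack(([2, 1], [-1, 4], [6, 0]))

def T(w):
    """Evaluate T: combine the basis outputs using the entries of w
    as the scalars (the recipe in the proof of Theorem LTDB)."""
    return M @ w

print(T(np.array([2, -3, 1])))   # [13, -10]
print(T(np.array([5, 2, -3])))   # [-10, 13]
```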
Example LTDB2 Linear transformation defined on a basis
Consider the linear transformation R: C^3 → C^2 with the three values,

R\left(\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 5 \\ -1 \end{bmatrix}
\qquad
R\left(\begin{bmatrix} -1 \\ 5 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 4 \end{bmatrix}
\qquad
R\left(\begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 3 \end{bmatrix}

You can check that

D = \left\{ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},\ \begin{bmatrix} -1 \\ 5 \\ 1 \end{bmatrix},\ \begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix} \right\}

is a basis for C^3 (make the vectors the columns of a square matrix and check that the matrix is nonsingular, Theorem CNMB). By Theorem LTDB we know there is a unique linear transformation R with the three specified outputs. However, we have to work just a bit harder to take an input vector and express it as a linear combination of the vectors in D.
For example, consider,

y = \begin{bmatrix} 8 \\ -3 \\ 5 \end{bmatrix}

Then we must first write y as a linear combination of the vectors in D and solve for the unknown scalars, to arrive at

y = \begin{bmatrix} 8 \\ -3 \\ 5 \end{bmatrix} = (3)\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + (-2)\begin{bmatrix} -1 \\ 5 \\ 1 \end{bmatrix} + (1)\begin{bmatrix} 3 \\ 1 \\ 4 \end{bmatrix}

Then the proof of Theorem LTDB gives us

R(y) = (3)\begin{bmatrix} 5 \\ -1 \end{bmatrix} + (-2)\begin{bmatrix} 0 \\ 4 \end{bmatrix} + (1)\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 17 \\ -8 \end{bmatrix}

Any other value of R could be computed in a similar manner.
△
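The extra step in this example, solving for the scalars relative to D, is itself a linear system, so it can be automated. A short NumPy sketch (our own naming, not the book's) that evaluates R by first solving D a = y:

```python
import numpy as np

# Basis D as columns, and the prescribed outputs of R as columns.
D = np.column_stack(([1, 2, 1], [-1, 5, 1], [3, 1, 4]))
R_outputs = np.column_stack(([5, -1], [0, 4], [2, 3]))

def R(y):
    """Evaluate R: solve D a = y for the coordinates a of y relative
    to the basis D, then combine the outputs with those scalars."""
    a = np.linalg.solve(D, y)          # D is nonsingular, so a is unique
    return R_outputs @ a

print(R(np.array([8, -3, 5])))         # approximately [17, -8]
```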
Here is a third example of a linear transformation defined by its action on a basis, only with more abstract vector spaces involved.
Example LTDB3 Linear transformation defined on a basis
The set W = {p(x) ∈ P_3 | p(1) = 0, p(3) = 0} ⊆ P_3 is a subspace of the vector space of polynomials P_3. This subspace has C = {3 − 4x + x^2, 12 − 13x + x^3} as a basis (check this!). Suppose we consider the linear transformation S: P_3 → M_{22} with values

S(3 - 4x + x^2) = \begin{bmatrix} 1 & -3 \\ 2 & 0 \end{bmatrix}
\qquad
S(12 - 13x + x^3) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}

By Theorem LTDB we know there is a unique linear transformation with these two values. To illustrate a sample computation of S, consider q(x) = 9 − 6x − 5x^2 + 2x^3. Verify that q(x) is an element of W (does it have roots at x = 1 and x = 3?), then find the scalars needed to write it as a linear combination of the basis vectors in C. Because

q(x) = 9 - 6x - 5x^2 + 2x^3 = (-5)(3 - 4x + x^2) + (2)(12 - 13x + x^3)
the proof of Theorem LTDB gives us

S(q) = (-5)\begin{bmatrix} 1 & -3 \\ 2 & 0 \end{bmatrix} + (2)\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} -5 & 17 \\ -8 & 0 \end{bmatrix}

And all the other outputs of S could be computed in the same manner. Every output of S will have a zero in the second row, second column. Can you see why this is so?
△
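Even with abstract vector spaces, the same recipe works once we fix coordinates. In the sketch below (our representation choices, not the book's), a polynomial in P_3 becomes its vector of coefficients and a 2×2 matrix becomes a 2×2 array; the scalars come from solving on the coefficient vectors:

```python
import numpy as np

# Coefficient vectors (constant, x, x^2, x^3 entries) of the basis C of W.
C = np.column_stack(([3, -4, 1, 0], [12, -13, 0, 1]))
# The two prescribed outputs of S, as 2x2 arrays.
S_outputs = [np.array([[1, -3], [2, 0]]), np.array([[0, 1], [1, 0]])]

def S(q_coeffs):
    """Evaluate S on a polynomial in W, given by its coefficient vector:
    find scalars a with C @ a = q_coeffs (least squares, exact here
    because q lies in W), then combine the matrix outputs."""
    a, *_ = np.linalg.lstsq(C, q_coeffs, rcond=None)
    return a[0] * S_outputs[0] + a[1] * S_outputs[1]

q = np.array([9, -6, -5, 2])   # q(x) = 9 - 6x - 5x^2 + 2x^3
print(S(q))                    # approximately [[-5, 17], [-8, 0]]
```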
Informally, we can describe Theorem LTDB by saying “it is enough to know what a linear transformation does to a basis (of the domain).”

Subsection PI
Pre-Images

The definition of a function requires that for each input in the domain there is exactly one output in the codomain. However, the correspondence does not have to behave the other way around. An output from the codomain could have many different inputs from the domain which the transformation sends to that output, or there could be no inputs at all which the transformation sends to that output. To formalize our discussion of this aspect of linear transformations, we define the pre-image.

Definition PI Pre-Image
Suppose that T: U → V is a linear transformation. For each v, define the pre-image of v to be the subset of U given by

T^{-1}(v) = \{\, u \in U \mid T(u) = v \,\}

□
In other words, T^{-1}(v) is the set of all those vectors in the domain U that get “sent” to the vector v.
Example SPIAS Sample pre-images, Archetype S
Archetype S is the linear transformation defined by

T: C^3 → M_{22}, \quad T\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} a - b & 2a + 2b + c \\ 3a + b + c & -2a - 6b - 2c \end{bmatrix}

We could compute a pre-image for every element of the codomain M_{22}. However, even in a free textbook, we do not have the room to do that, so we will compute just two.
Choose

v = \begin{bmatrix} 2 & 1 \\ 3 & 2 \end{bmatrix} \in M_{22}
for no particular reason. What is T^{-1}(v)? Suppose u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} ∈ T^{-1}(v). The condition that T(u) = v becomes

\begin{bmatrix} 2 & 1 \\ 3 & 2 \end{bmatrix} = v = T(u) = T\left(\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}\right) = \begin{bmatrix} u_1 - u_2 & 2u_1 + 2u_2 + u_3 \\ 3u_1 + u_2 + u_3 & -2u_1 - 6u_2 - 2u_3 \end{bmatrix}

Using matrix equality (Definition ME), we arrive at a system of four equations in the three unknowns u_1, u_2, u_3 with an augmented matrix that we can row-reduce in the hunt for solutions,

\begin{bmatrix} 1 & -1 & 0 & 2 \\ 2 & 2 & 1 & 1 \\ 3 & 1 & 1 & 3 \\ -2 & -6 & -2 & 2 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & \frac{1}{4} & \frac{5}{4} \\ 0 & 1 & \frac{1}{4} & -\frac{3}{4} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
We recognize this system as having infinitely many solutions described by the single free variable u_3. Eventually obtaining the vector form of the solutions (Theorem VFSLS), we can describe the preimage precisely as,

T^{-1}(v) = \{\, u \in C^3 \mid T(u) = v \,\}
  = \left\{ \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} \,\middle|\, u_1 = \frac{5}{4} - \frac{1}{4}u_3,\ u_2 = -\frac{3}{4} - \frac{1}{4}u_3 \right\}
  = \left\{ \begin{bmatrix} \frac{5}{4} - \frac{1}{4}u_3 \\ -\frac{3}{4} - \frac{1}{4}u_3 \\ u_3 \end{bmatrix} \,\middle|\, u_3 \in C \right\}
  = \left\{ \begin{bmatrix} \frac{5}{4} \\ -\frac{3}{4} \\ 0 \end{bmatrix} + u_3 \begin{bmatrix} -\frac{1}{4} \\ -\frac{1}{4} \\ 1 \end{bmatrix} \,\middle|\, u_3 \in C \right\}
  = \begin{bmatrix} \frac{5}{4} \\ -\frac{3}{4} \\ 0 \end{bmatrix} + \left\langle \left\{ \begin{bmatrix} -\frac{1}{4} \\ -\frac{1}{4} \\ 1 \end{bmatrix} \right\} \right\rangle

This last line is merely a suggestive way of describing the set on the previous line. You might create three or four vectors in the preimage, and evaluate T with each. Was the result what you expected? For a hint of things to come, you might try evaluating T with just the lone vector in the spanning set above. What was the result? Now take a look back at Theorem PSPHS. Hmmmm.
OK, let us compute another preimage, but with a different outcome this time. Choose

v = \begin{bmatrix} 1 & 1 \\ 2 & 4 \end{bmatrix} \in M_{22}

What is T^{-1}(v)? Suppose u = \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} ∈ T^{-1}(v). That T(u) = v becomes

\begin{bmatrix} 1 & 1 \\ 2 & 4 \end{bmatrix} = v = T(u) = T\left(\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}\right) = \begin{bmatrix} u_1 - u_2 & 2u_1 + 2u_2 + u_3 \\ 3u_1 + u_2 + u_3 & -2u_1 - 6u_2 - 2u_3 \end{bmatrix}
Using matrix equality (Definition ME), we arrive at a system of four equations in the three unknowns u_1, u_2, u_3 with an augmented matrix that we can row-reduce in the hunt for solutions,

\begin{bmatrix} 1 & -1 & 0 & 1 \\ 2 & 2 & 1 & 1 \\ 3 & 1 & 1 & 2 \\ -2 & -6 & -2 & 4 \end{bmatrix}
\xrightarrow{\text{RREF}}
\begin{bmatrix} 1 & 0 & \frac{1}{4} & 0 \\ 0 & 1 & \frac{1}{4} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}

By Theorem RCLS we recognize this system as inconsistent. So no vector u is a member of T^{-1}(v) and so

T^{-1}(v) = \emptyset

△
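The two different outcomes in this example come down to whether the linear system T(u) = v is consistent. One way to experiment numerically is to compare ranks of the coefficient and augmented matrices (a NumPy sketch of the idea behind Theorem RCLS; the function name is ours):

```python
import numpy as np

# Coefficient matrix of the system T(u) = v for Archetype S,
# one row per entry of the 2x2 output, read across the rows.
A = np.array([[ 1, -1,  0],
              [ 2,  2,  1],
              [ 3,  1,  1],
              [-2, -6, -2]])

def preimage_nonempty(v_flat):
    """A u = v is consistent exactly when appending v does not raise the rank."""
    aug = np.column_stack((A, v_flat))
    return np.linalg.matrix_rank(aug) == np.linalg.matrix_rank(A)

print(preimage_nonempty(np.array([2, 1, 3, 2])))   # True: infinitely many solutions
print(preimage_nonempty(np.array([1, 1, 2, 4])))   # False: empty preimage
print(A @ np.array([5/4, -3/4, 0]))                # recovers [2, 1, 3, 2]
```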
The preimage is just a set; it is almost never a subspace of U (you might think about just when T^{-1}(v) is a subspace; see Exercise ILT.T10). We will describe its properties going forward, and it will be central to the main ideas of this chapter.
Subsection NLTFO
New Linear Transformations From Old

We can combine linear transformations in natural ways to create new linear transformations. So we will define these combinations and then prove that the results really are still linear transformations. First the sum of two linear transformations.

Definition LTA Linear Transformation Addition
Suppose that T: U → V and S: U → V are two linear transformations with the same domain and codomain. Then their sum is the function T + S: U → V whose outputs are defined by

(T + S)(u) = T(u) + S(u)

□
Notice that the first plus sign in the definition is the operation being defined, while the second one is the vector addition in V. (Vector addition in U will appear just now in the proof that T + S is a linear transformation.) Definition LTA only provides a function. It would be nice to know that when the constituents (T, S) are linear transformations, then so too is T + S.
Theorem SLTLT Sum of Linear Transformations is a Linear Transformation
Suppose that T: U → V and S: U → V are two linear transformations with the same domain and codomain. Then T + S: U → V is a linear transformation.

Proof. We simply check the defining properties of a linear transformation (Definition LT). This is a good place to consistently ask yourself which objects are being combined with which operations.

(T + S)(x + y) = T(x + y) + S(x + y)      Definition LTA
  = T(x) + T(y) + S(x) + S(y)             Definition LT
  = T(x) + S(x) + T(y) + S(y)             Property C in V
  = (T + S)(x) + (T + S)(y)               Definition LTA

and

(T + S)(αx) = T(αx) + S(αx)               Definition LTA
  = αT(x) + αS(x)                         Definition LT
  = α(T(x) + S(x))                        Property DVA in V
  = α(T + S)(x)                           Definition LTA
■
Example STLT Sum of two linear transformations
Suppose that T: C^2 → C^3 and S: C^2 → C^3 are defined by

T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 2x_2 \\ 3x_1 - 4x_2 \\ 5x_1 + 2x_2 \end{bmatrix}
\qquad
S\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 4x_1 - x_2 \\ x_1 + 3x_2 \\ -7x_1 + 5x_2 \end{bmatrix}

Then by Definition LTA, we have

(T + S)\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) + S\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 2x_2 \\ 3x_1 - 4x_2 \\ 5x_1 + 2x_2 \end{bmatrix} + \begin{bmatrix} 4x_1 - x_2 \\ x_1 + 3x_2 \\ -7x_1 + 5x_2 \end{bmatrix} = \begin{bmatrix} 5x_1 + x_2 \\ 4x_1 - x_2 \\ -2x_1 + 7x_2 \end{bmatrix}

and by Theorem SLTLT we know T + S is also a linear transformation from C^2 to C^3.
△
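Since T and S here act on column vectors, each has a matrix of coefficients, and a quick NumPy check (our own naming and sample input) shows the sum of the transformations agreeing with the single matrix formed by adding those coefficient matrices, a small preview of a question asked at the end of this subsection:

```python
import numpy as np

# Coefficient matrices of T and S from Example STLT (rows of x1, x2 coefficients).
MT = np.array([[1, 2], [3, -4], [5, 2]])
MS = np.array([[4, -1], [1, 3], [-7, 5]])

x = np.array([3, 7])   # an arbitrary input, chosen only for illustration

# (T + S)(x) two ways: pointwise sum of the outputs, and (MT + MS) acting on x.
print(MT @ x + MS @ x)
print((MT + MS) @ x)   # the same vector
```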
Definition LTSM Linear Transformation Scalar Multiplication
Suppose that T: U → V is a linear transformation and α ∈ C. Then the scalar multiple is the function αT: U → V whose outputs are defined by

(αT)(u) = αT(u)

□
Given that T is a linear transformation, it would be nice to know that αT is also a linear transformation.
Theorem MLTLT Multiple of a Linear Transformation is a Linear Transformation
Suppose that T: U → V is a linear transformation and α ∈ C. Then (αT): U → V is a linear transformation.

Proof. We simply check the defining properties of a linear transformation (Definition LT). This is another good place to consistently ask yourself which objects are being combined with which operations.

(αT)(x + y) = α(T(x + y))       Definition LTSM
  = α(T(x) + T(y))              Definition LT
  = αT(x) + αT(y)               Property DVA in V
  = (αT)(x) + (αT)(y)           Definition LTSM

and

(αT)(βx) = αT(βx)               Definition LTSM
  = α(βT(x))                    Definition LT
  = (αβ)T(x)                    Property SMA in V
  = (βα)T(x)                    Commutativity in C
  = β(αT(x))                    Property SMA in V
  = β((αT)(x))                  Definition LTSM
■
Example SMLT Scalar multiple of a linear transformation
Suppose that T: C^4 → C^3 is defined by

T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 2x_2 - x_3 + 2x_4 \\ x_1 + 5x_2 - 3x_3 + x_4 \\ -2x_1 + 3x_2 - 4x_3 + 2x_4 \end{bmatrix}

For the sake of an example, choose α = 2, so by Definition LTSM, we have

αT\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) = 2T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) = 2\begin{bmatrix} x_1 + 2x_2 - x_3 + 2x_4 \\ x_1 + 5x_2 - 3x_3 + x_4 \\ -2x_1 + 3x_2 - 4x_3 + 2x_4 \end{bmatrix} = \begin{bmatrix} 2x_1 + 4x_2 - 2x_3 + 4x_4 \\ 2x_1 + 10x_2 - 6x_3 + 2x_4 \\ -4x_1 + 6x_2 - 8x_3 + 4x_4 \end{bmatrix}

and by Theorem MLTLT we know 2T is also a linear transformation from C^4 to C^3.
△
Now, let us imagine we have two vector spaces, U and V, and we collect every possible linear transformation from U to V into one big set, and call it LT(U, V). Definition LTA and Definition LTSM tell us how we can “add” and “scalar multiply” two elements of LT(U, V). Theorem SLTLT and Theorem MLTLT tell us that if we do these operations, then the resulting functions are linear transformations that are also in LT(U, V). Hmmmm, sounds like a vector space to me! A set of objects, an addition and a scalar multiplication. Why not?

Theorem VSLT Vector Space of Linear Transformations
Suppose that U and V are vector spaces. Then the set of all linear transformations from U to V, LT(U, V), is a vector space when the operations are those given in Definition LTA and Definition LTSM.

Proof. Theorem SLTLT and Theorem MLTLT provide two of the ten properties in Definition VS. However, we still need to verify the remaining eight properties. By and large, the proofs are straightforward and rely on concocting the obvious object, or by reducing the question to the same vector space property in the vector space V.
The zero vector is of some interest, though. What linear transformation would we add to any other linear transformation, so as to keep the second one unchanged? The answer is Z: U → V defined by Z(u) = 0_V for every u ∈ U. Notice how we do not need to know any of the specifics about U and V to make this definition of Z.
■
Definition LTC Linear Transformation Composition
Suppose that T: U → V and S: V → W are linear transformations. Then the composition of S and T is the function (S ◦ T): U → W whose outputs are defined by

(S ◦ T)(u) = S(T(u))

□
Given that T and S are linear transformations, it would be nice to know that S ◦ T is also a linear transformation.

Theorem CLTLT Composition of Linear Transformations is a Linear Transformation
Suppose that T: U → V and S: V → W are linear transformations. Then (S ◦ T): U → W is a linear transformation.
Proof. We simply check the defining properties of a linear transformation (Definition LT).

(S ◦ T)(x + y) = S(T(x + y))      Definition LTC
  = S(T(x) + T(y))                Definition LT for T
  = S(T(x)) + S(T(y))             Definition LT for S
  = (S ◦ T)(x) + (S ◦ T)(y)       Definition LTC

and

(S ◦ T)(αx) = S(T(αx))            Definition LTC
  = S(αT(x))                      Definition LT for T
  = αS(T(x))                      Definition LT for S
  = α(S ◦ T)(x)                   Definition LTC
■
Example CTLT Composition of two linear transformations
Suppose that T: C^2 → C^4 and S: C^4 → C^3 are defined by

T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 2x_2 \\ 3x_1 - 4x_2 \\ 5x_1 + 2x_2 \\ 6x_1 - 3x_2 \end{bmatrix}
\qquad
S\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 - x_2 + x_3 - x_4 \\ 5x_1 - 3x_2 + 8x_3 - 2x_4 \\ -4x_1 + 3x_2 - 4x_3 + 5x_4 \end{bmatrix}

Then by Definition LTC

(S ◦ T)\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = S\left(T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)\right) = S\left(\begin{bmatrix} x_1 + 2x_2 \\ 3x_1 - 4x_2 \\ 5x_1 + 2x_2 \\ 6x_1 - 3x_2 \end{bmatrix}\right)
  = \begin{bmatrix} 2(x_1 + 2x_2) - (3x_1 - 4x_2) + (5x_1 + 2x_2) - (6x_1 - 3x_2) \\ 5(x_1 + 2x_2) - 3(3x_1 - 4x_2) + 8(5x_1 + 2x_2) - 2(6x_1 - 3x_2) \\ -4(x_1 + 2x_2) + 3(3x_1 - 4x_2) - 4(5x_1 + 2x_2) + 5(6x_1 - 3x_2) \end{bmatrix}
  = \begin{bmatrix} -2x_1 + 13x_2 \\ 24x_1 + 44x_2 \\ 15x_1 - 43x_2 \end{bmatrix}

and by Theorem CLTLT S ◦ T is a linear transformation from C^2 to C^3.
△
Here is an interesting exercise that will presage an important result later. In Example STLT compute (via Theorem MLTCV) the matrix of T, S and T + S. Do you see a relationship between these three matrices?
In Example SMLT compute (via Theorem MLTCV) the matrix of T and 2T. Do you see a relationship between these two matrices?
Here is the tough one. In Example CTLT compute (via Theorem MLTCV) the matrix of T, S and S ◦ T. Do you see a relationship between these three matrices???
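If you would like to experiment with the tough one numerically before committing to an answer, the matrices from Example CTLT can be entered directly (a NumPy sketch; the variable names are ours). Print the matrix of S ◦ T alongside a candidate combination of the matrices of S and T, and compare for yourself:

```python
import numpy as np

# Matrices (via Theorem MLTCV) of T and S from Example CTLT:
# columns are the outputs on the standard basis vectors.
MT = np.array([[1, 2], [3, -4], [5, 2], [6, -3]])
MS = np.array([[2, -1, 1, -1], [5, -3, 8, -2], [-4, 3, -4, 5]])

# Matrix of S o T, read off from the formula computed in the example.
MST = np.array([[-2, 13], [24, 44], [15, -43]])

print(MST)
print(MS @ MT)   # compare these two for yourself
```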
Reading Questions

1. Is the function below a linear transformation? Why or why not?

T: C^3 → C^2, \quad T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 3x_1 - x_2 + x_3 \\ 8x_2 - 6 \end{bmatrix}

2. Determine the matrix representation of the linear transformation S below.

S: C^2 → C^3, \quad S\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} 3x_1 + 5x_2 \\ 8x_1 - 3x_2 \\ -4x_1 \end{bmatrix}

3. Theorem LTLC has a fairly simple proof. Yet the result itself is very powerful. Comment on why we might say this.
Exercises

C15 The archetypes below are all linear transformations whose domains and codomains are vector spaces of column vectors (Definition VSCV). For each one, compute the matrix representation described in the proof of Theorem MLTCV.
Archetype M, Archetype N, Archetype O, Archetype P, Archetype Q, Archetype R

C16† Find the matrix representation of T: C^3 → C^4, T\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 3x + 2y + z \\ x + y + z \\ x - 3y \\ 2x + 3y + z \end{bmatrix}.

C20† Let w = \begin{bmatrix} -3 \\ 1 \\ 4 \end{bmatrix}. Referring to Example MOLT, compute S(w) two different ways. First use the definition of S, then compute the matrix-vector product Cw (Definition MVP).
C25† Define the linear transformation

T: C^3 → C^2, \quad T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 - x_2 + 5x_3 \\ -4x_1 + 2x_2 - 10x_3 \end{bmatrix}

Verify that T is a linear transformation.

C26† Verify that the function below is a linear transformation.

T: P_2 → C^2, \quad T(a + bx + cx^2) = \begin{bmatrix} 2a - b \\ b + c \end{bmatrix}
C30† Define the linear transformation

T: C^3 → C^2, \quad T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 - x_2 + 5x_3 \\ -4x_1 + 2x_2 - 10x_3 \end{bmatrix}

Compute the preimages, T^{-1}\left(\begin{bmatrix} 2 \\ 3 \end{bmatrix}\right) and T^{-1}\left(\begin{bmatrix} 4 \\ -8 \end{bmatrix}\right).
C31† For the linear transformation S compute the pre-images.

S: C^3 → C^3, \quad S\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} a - 2b - c \\ 3a - b + 2c \\ a + b + 2c \end{bmatrix}

S^{-1}\left(\begin{bmatrix} -2 \\ 5 \\ 3 \end{bmatrix}\right) \qquad S^{-1}\left(\begin{bmatrix} -5 \\ 5 \\ 7 \end{bmatrix}\right)
C40† If T: C^2 → C^2 satisfies T\left(\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 3 \\ 4 \end{bmatrix} and T\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} -1 \\ 2 \end{bmatrix}, find T\left(\begin{bmatrix} 4 \\ 3 \end{bmatrix}\right).

C41† If T: C^2 → C^3 satisfies T\left(\begin{bmatrix} 2 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} and T\left(\begin{bmatrix} 3 \\ 4 \end{bmatrix}\right) = \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix}, find the matrix representation of T.
C42† Define T: M_{22} → C^1 by T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = a + b + c - d. Find the pre-image T^{-1}(3).

C43† Define T: P_3 → P_2 by T(a + bx + cx^2 + dx^3) = b + 2cx + 3dx^2. Find the pre-image of 0. Does this linear transformation seem familiar?
M10† Define two linear transformations, T: C^4 → C^3 and S: C^3 → C^2 by

S\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} x_1 - 2x_2 + 3x_3 \\ 5x_1 + 4x_2 + 2x_3 \end{bmatrix}
\qquad
T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) = \begin{bmatrix} -x_1 + 3x_2 + x_3 + 9x_4 \\ 2x_1 + x_3 + 7x_4 \\ 4x_1 + 2x_2 + x_3 + 2x_4 \end{bmatrix}

Using the proof of Theorem MLTCV compute the matrix representations of the three linear transformations T, S and S ◦ T. Discover and comment on the relationship between these three matrices.
M60 Suppose U and V are vector spaces and define a function Z: U → V by Z(u) = 0_V for every u ∈ U. Prove that Z is a (stupid) linear transformation. (See Exercise ILT.M60, Exercise SLT.M60, Exercise IVLT.M60.)
T20 Use the conclusion of Theorem LTLC to motivate a new definition of a linear transformation. Then prove that your new definition is equivalent to Definition LT. (Proof Technique D and Proof Technique E might be helpful if you are not sure what you are being asked to prove here.)
§LT Beezer: A First Course in Linear Algebra 445

Theorem SER established three properties of matrix similarity that are collectively known as the defining properties of an “equivalence relation”. Exercises T30 and T31 extend this idea to linear transformations.
T30 Suppose that T: U → V is a linear transformation. Say that two vectors from U, x and y, are related exactly when T(x) = T(y) in V. Prove the three properties of an equivalence relation on U: (a) for any x ∈ U, x is related to x, (b) if x is related to y, then y is related to x, and (c) if x is related to y and y is related to z, then x is related to z.
T31† Equivalence relations always create a partition of the set they are defined on, via a construction called equivalence classes. For the relation in the previous problem, the equivalence classes are the pre-images. Prove directly that the collection of pre-images partitions U by showing that (a) every x ∈ U is contained in some pre-image, and that (b) any two different pre-images do not have any elements in common.
Section ILT
Injective Linear Transformations

Some linear transformations possess one, or both, of two key properties, which go by the names injective and surjective. We will see that they are closely related to ideas like linear independence and spanning, and subspaces like the null space and the column space. In this section we will define an injective linear transformation and analyze the resulting consequences. The next section will do the same for the surjective property. In the final section of this chapter we will see what happens when we have the two properties simultaneously.
Subsection ILT
Injective Linear Transformations

As usual, we lead with a definition.

Definition ILT Injective Linear Transformation
Suppose T: U → V is a linear transformation. Then T is injective if whenever T(x) = T(y), then x = y. □
Given an arbitrary function, it is possible for two different inputs to yield the same output (think about the function f(x) = x^2 and the inputs x = 3 and x = −3). For an injective function, this never happens. If we have equal outputs (T(x) = T(y)) then we must have achieved those equal outputs by employing equal inputs (x = y). Some authors prefer the term one-to-one where we use injective, and we will sometimes refer to an injective linear transformation as an injection.
Subsection EILT
Examples of Injective Linear Transformations

It is perhaps most instructive to examine a linear transformation that is not injective first.
Example NIAQ Not injective, Archetype Q
Archetype Q is the linear transformation

T: C^5 → C^5,  T([x_1, x_2, x_3, x_4, x_5]) = [ −2x_1 + 3x_2 + 3x_3 − 6x_4 + 3x_5 ]
                                              [ −16x_1 + 9x_2 + 12x_3 − 28x_4 + 28x_5 ]
                                              [ −19x_1 + 7x_2 + 14x_3 − 32x_4 + 37x_5 ]
                                              [ −21x_1 + 9x_2 + 15x_3 − 35x_4 + 39x_5 ]
                                              [ −9x_1 + 5x_2 + 7x_3 − 16x_4 + 16x_5 ]
Notice that for

x = [1, 3, −1, 2, 4]        y = [4, 7, 0, 5, 7]

we have

T([1, 3, −1, 2, 4]) = [4, 55, 72, 77, 31]        T([4, 7, 0, 5, 7]) = [4, 55, 72, 77, 31]
So we have two vectors from the domain, x ≠ y, yet T(x) = T(y), in violation of Definition ILT. This is another example where you should not concern yourself with how x and y were selected, as this will be explained shortly. However, do understand why these two vectors provide enough evidence to conclude that T is not injective. △
Here is a cartoon of a non-injective linear transformation. Notice that the central feature of this cartoon is that T(u) = v = T(w). Even though this happens again with some unnamed vectors, it only takes one occurrence to destroy the possibility of injectivity. Note also that the two vectors displayed in the bottom of V have no bearing, either way, on the injectivity of T.
[Cartoon: u and w in U both map to v in V.]
Diagram NILT: Non-Injective Linear Transformation
To show that a linear transformation is not injective, it is enough to find a single pair of inputs that get sent to the identical output, as in Example NIAQ. However, to show that a linear transformation is injective we must establish that this coincidence of outputs never occurs. Here is an example that shows how to establish this.
Example IAR Injective, Archetype R
Archetype R is the linear transformation

T: C^5 → C^5,  T([x_1, x_2, x_3, x_4, x_5]) = [ −65x_1 + 128x_2 + 10x_3 − 262x_4 + 40x_5 ]
                                              [ 36x_1 − 73x_2 − x_3 + 151x_4 − 16x_5 ]
                                              [ −44x_1 + 88x_2 + 5x_3 − 180x_4 + 24x_5 ]
                                              [ 34x_1 − 68x_2 − 3x_3 + 140x_4 − 18x_5 ]
                                              [ 12x_1 − 24x_2 − x_3 + 49x_4 − 5x_5 ]
To establish that R is injective we must begin with the assumption that T(x) = T(y) and somehow arrive at the conclusion that x = y. Here we go,

[0, 0, 0, 0, 0] = T(x) − T(y)
= T([x_1, x_2, x_3, x_4, x_5]) − T([y_1, y_2, y_3, y_4, y_5])
= [ −65x_1 + 128x_2 + 10x_3 − 262x_4 + 40x_5 ]   [ −65y_1 + 128y_2 + 10y_3 − 262y_4 + 40y_5 ]
  [ 36x_1 − 73x_2 − x_3 + 151x_4 − 16x_5 ]       [ 36y_1 − 73y_2 − y_3 + 151y_4 − 16y_5 ]
  [ −44x_1 + 88x_2 + 5x_3 − 180x_4 + 24x_5 ]  −  [ −44y_1 + 88y_2 + 5y_3 − 180y_4 + 24y_5 ]
  [ 34x_1 − 68x_2 − 3x_3 + 140x_4 − 18x_5 ]      [ 34y_1 − 68y_2 − 3y_3 + 140y_4 − 18y_5 ]
  [ 12x_1 − 24x_2 − x_3 + 49x_4 − 5x_5 ]         [ 12y_1 − 24y_2 − y_3 + 49y_4 − 5y_5 ]
= [ −65(x_1 − y_1) + 128(x_2 − y_2) + 10(x_3 − y_3) − 262(x_4 − y_4) + 40(x_5 − y_5) ]
  [ 36(x_1 − y_1) − 73(x_2 − y_2) − (x_3 − y_3) + 151(x_4 − y_4) − 16(x_5 − y_5) ]
  [ −44(x_1 − y_1) + 88(x_2 − y_2) + 5(x_3 − y_3) − 180(x_4 − y_4) + 24(x_5 − y_5) ]
  [ 34(x_1 − y_1) − 68(x_2 − y_2) − 3(x_3 − y_3) + 140(x_4 − y_4) − 18(x_5 − y_5) ]
  [ 12(x_1 − y_1) − 24(x_2 − y_2) − (x_3 − y_3) + 49(x_4 − y_4) − 5(x_5 − y_5) ]
= [ −65  128  10  −262   40 ] [ x_1 − y_1 ]
  [  36  −73  −1   151  −16 ] [ x_2 − y_2 ]
  [ −44   88   5  −180   24 ] [ x_3 − y_3 ]
  [  34  −68  −3   140  −18 ] [ x_4 − y_4 ]
  [  12  −24  −1    49   −5 ] [ x_5 − y_5 ]
Now we recognize that we have a homogeneous system of 5 equations in 5 variables (the terms x_i − y_i are the variables), so we row-reduce the coefficient matrix to

[ 1 0 0 0 0 ]
[ 0 1 0 0 0 ]
[ 0 0 1 0 0 ]
[ 0 0 0 1 0 ]
[ 0 0 0 0 1 ]

So the only solution is the trivial solution

x_1 − y_1 = 0    x_2 − y_2 = 0    x_3 − y_3 = 0    x_4 − y_4 = 0    x_5 − y_5 = 0

and we conclude that indeed x = y. By Definition ILT, T is injective. △

Here is the cartoon for an injective linear transformation. It is meant to suggest that we never have two inputs associated with a single output. Again, the two lonely vectors at the bottom of V have no bearing either way on the injectivity of T.

[Cartoon: no two vectors of U are sent to the same vector of V.]
Diagram ILT: Injective Linear Transformation
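The row-reduction argument above is easy to automate: T is injective exactly when the coefficient matrix has rank 5, so the homogeneous system has only the trivial solution. A quick sketch (not part of the original text), with the matrix entries read off Archetype R as displayed above:

```python
import numpy as np

# Coefficient matrix of Archetype R, read off the definition above.
A = np.array([[-65, 128, 10, -262,  40],
              [ 36, -73, -1,  151, -16],
              [-44,  88,  5, -180,  24],
              [ 34, -68, -3,  140, -18],
              [ 12, -24, -1,   49,  -5]])

# T(x) = T(y) exactly when A(x - y) = 0, so T is injective exactly
# when the only solution of the homogeneous system is trivial,
# i.e. when A has rank 5 (full rank).
print(np.linalg.matrix_rank(A))  # 5, so T is injective
```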
Let us now examine an injective linear transformation between abstract vector spaces.
Example IAV Injective, Archetype V
Archetype V is defined by

T: P_3 → M_22,  T(a + bx + cx^2 + dx^3) = [ a + b    a − 2c ]
                                          [ d        b − d  ]

To establish that the linear transformation is injective, begin by supposing that two polynomial inputs yield the same output matrix,

T(a_1 + b_1x + c_1x^2 + d_1x^3) = T(a_2 + b_2x + c_2x^2 + d_2x^3)

Then

O = [ 0 0 ]
    [ 0 0 ]
= T(a_1 + b_1x + c_1x^2 + d_1x^3) − T(a_2 + b_2x + c_2x^2 + d_2x^3)       Hypothesis
= T((a_1 + b_1x + c_1x^2 + d_1x^3) − (a_2 + b_2x + c_2x^2 + d_2x^3))      Definition LT
= T((a_1 − a_2) + (b_1 − b_2)x + (c_1 − c_2)x^2 + (d_1 − d_2)x^3)         Operations in P_3
= [ (a_1 − a_2) + (b_1 − b_2)    (a_1 − a_2) − 2(c_1 − c_2)  ]            Definition of T
  [ (d_1 − d_2)                  (b_1 − b_2) − (d_1 − d_2)   ]
This single matrix equality translates to a homogeneous system of equations in the differences a_1 − a_2, b_1 − b_2, c_1 − c_2, d_1 − d_2,

(a_1 − a_2) + (b_1 − b_2) = 0
(a_1 − a_2) − 2(c_1 − c_2) = 0
(d_1 − d_2) = 0
(b_1 − b_2) − (d_1 − d_2) = 0
This system of equations can be rewritten as the matrix equation

[ 1 1  0  0 ] [ a_1 − a_2 ]   [ 0 ]
[ 1 0 −2  0 ] [ b_1 − b_2 ] = [ 0 ]
[ 0 0  0  1 ] [ c_1 − c_2 ]   [ 0 ]
[ 0 1  0 −1 ] [ d_1 − d_2 ]   [ 0 ]
Since the coefficient matrix is nonsingular (check this) the only solution is trivial, i.e.

a_1 − a_2 = 0    b_1 − b_2 = 0    c_1 − c_2 = 0    d_1 − d_2 = 0

so that

a_1 = a_2    b_1 = b_2    c_1 = c_2    d_1 = d_2

so the two inputs must be equal polynomials. By Definition ILT, T is injective. △
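The "check this" can be done mechanically: a square matrix is nonsingular exactly when its determinant is nonzero. A sketch (not part of the original text), with the 4 × 4 coefficient matrix copied from the matrix equation above:

```python
import numpy as np

# Coefficient matrix of the homogeneous system from Example IAV.
A = np.array([[1, 1,  0,  0],
              [1, 0, -2,  0],
              [0, 0,  0,  1],
              [0, 1,  0, -1]])

# Nonzero determinant means the matrix is nonsingular, so the
# homogeneous system has only the trivial solution.
print(round(np.linalg.det(A)))  # -2
```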
Subsection KLT
Kernel of a Linear Transformation

For a linear transformation T: U → V, the kernel is a subset of the domain U. Informally, it is the set of all inputs that the transformation sends to the zero vector of the codomain. It will have some natural connections with the null space of a matrix, so we will keep the same notation, and if you think about your objects, then there should be little confusion. Here is the careful definition.
Definition KLT Kernel of a Linear Transformation
Suppose T: U → V is a linear transformation. Then the kernel of T is the set

K(T) = {u ∈ U | T(u) = 0}   □

Notice that the kernel of T is just the preimage of 0, T^{-1}(0) (Definition PI). Here is an example.
Example NKAO Nontrivial kernel, Archetype O
Archetype O is the linear transformation

T: C^3 → C^5,  T([x_1, x_2, x_3]) = [ −x_1 + x_2 − 3x_3 ]
                                    [ −x_1 + 2x_2 − 4x_3 ]
                                    [ x_1 + x_2 + x_3 ]
                                    [ 2x_1 + 3x_2 + x_3 ]
                                    [ x_1 + 2x_3 ]
To determine the elements of C^3 in K(T), find those vectors u such that T(u) = 0, that is,

T(u) = 0

[ −u_1 + u_2 − 3u_3 ]    [ 0 ]
[ −u_1 + 2u_2 − 4u_3 ]   [ 0 ]
[ u_1 + u_2 + u_3 ]    = [ 0 ]
[ 2u_1 + 3u_2 + u_3 ]    [ 0 ]
[ u_1 + 2u_3 ]           [ 0 ]

Vector equality (Definition CVE) leads us to a homogeneous system of 5 equations in the variables u_i,

−u_1 + u_2 − 3u_3 = 0
−u_1 + 2u_2 − 4u_3 = 0
u_1 + u_2 + u_3 = 0
2u_1 + 3u_2 + u_3 = 0
u_1 + 2u_3 = 0
Row-reducing the coefficient matrix gives

[ 1 0  2 ]
[ 0 1 −1 ]
[ 0 0  0 ]
[ 0 0  0 ]
[ 0 0  0 ]

The kernel of T is the set of solutions to this homogeneous system of equations, which by Theorem BNS can be expressed as

K(T) = ⟨{ [−2, 1, 1] }⟩
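This kernel computation can be mirrored numerically. A sketch (not part of the original text), using the coefficient matrix of Archetype O as displayed above:

```python
import numpy as np

# Coefficient matrix of Archetype O, read off the definition above.
A = np.array([[-1, 1, -3],
              [-1, 2, -4],
              [ 1, 1,  1],
              [ 2, 3,  1],
              [ 1, 0,  2]])

# Rank 2 on 3 columns leaves one free variable, so the kernel is a
# line through the origin, spanned by the vector from Theorem BNS.
z = np.array([-2, 1, 1])
print(np.linalg.matrix_rank(A))  # 2
print(A @ z)                     # the zero vector: z is in the kernel
```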
We know that the span of a set of vectors is always a subspace (Theorem SSS), so the kernel computed in Example NKAO is also a subspace. This is no accident; the kernel of a linear transformation is always a subspace.

Theorem KLTS Kernel of a Linear Transformation is a Subspace
Suppose that T: U → V is a linear transformation. Then the kernel of T, K(T), is a subspace of U.
Proof. We can apply the three-part test of Theorem TSS. First, T(0_U) = 0_V by Theorem LTTZZ, so 0_U ∈ K(T) and we know that the kernel is nonempty.

Suppose we assume that x, y ∈ K(T). Is x + y ∈ K(T)?

T(x + y) = T(x) + T(y)    Definition LT
         = 0 + 0          x, y ∈ K(T)
         = 0              Property Z

This qualifies x + y for membership in K(T). So we have additive closure.

Suppose we assume that α ∈ C and x ∈ K(T). Is αx ∈ K(T)?

T(αx) = αT(x)    Definition LT
      = α0       x ∈ K(T)
      = 0        Theorem ZVSM

This qualifies αx for membership in K(T). So we have scalar closure and Theorem TSS tells us that K(T) is a subspace of U. ■
Let us compute another kernel, now that we know in advance that it will be a subspace.

Example TKAP Trivial kernel, Archetype P
Archetype P is the linear transformation

T: C^3 → C^5,  T([x_1, x_2, x_3]) = [ −x_1 + x_2 + x_3 ]
                                    [ −x_1 + 2x_2 + 2x_3 ]
                                    [ x_1 + x_2 + 3x_3 ]
                                    [ 2x_1 + 3x_2 + x_3 ]
                                    [ −2x_1 + x_2 + 3x_3 ]
To determine the elements of C^3 in K(T), find those vectors u such that T(u) = 0, that is,

T(u) = 0

[ −u_1 + u_2 + u_3 ]     [ 0 ]
[ −u_1 + 2u_2 + 2u_3 ]   [ 0 ]
[ u_1 + u_2 + 3u_3 ]   = [ 0 ]
[ 2u_1 + 3u_2 + u_3 ]    [ 0 ]
[ −2u_1 + u_2 + 3u_3 ]   [ 0 ]

Vector equality (Definition CVE) leads us to a homogeneous system of 5 equations in the variables u_i,

−u_1 + u_2 + u_3 = 0
−u_1 + 2u_2 + 2u_3 = 0
u_1 + u_2 + 3u_3 = 0
2u_1 + 3u_2 + u_3 = 0
−2u_1 + u_2 + 3u_3 = 0
Row-reducing the coefficient matrix gives

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
[ 0 0 0 ]
[ 0 0 0 ]

The kernel of T is the set of solutions to this homogeneous system of equations, which is simply the trivial solution u = 0, so

K(T) = {0} = ⟨{ }⟩   △
Our next theorem says that if a preimage is a nonempty set then we can construct it by picking any one element and adding on elements of the kernel.
Theorem KPI Kernel and Pre-Image
Suppose T: U → V is a linear transformation and v ∈ V. If the preimage T^{-1}(v) is nonempty, and u ∈ T^{-1}(v), then

T^{-1}(v) = {u + z | z ∈ K(T)} = u + K(T)
Proof. Let M = {u + z | z ∈ K(T)}. First, we show that M ⊆ T^{-1}(v). Suppose that w ∈ M, so w has the form w = u + z, where z ∈ K(T). Then

T(w) = T(u + z)
     = T(u) + T(z)    Definition LT
     = v + 0          u ∈ T^{-1}(v), z ∈ K(T)
     = v              Property Z

which qualifies w for membership in the preimage of v, w ∈ T^{-1}(v).

For the opposite inclusion, suppose x ∈ T^{-1}(v). Then,

T(x − u) = T(x) − T(u)    Definition LT
         = v − v          x, u ∈ T^{-1}(v)
         = 0

This qualifies x − u for membership in the kernel of T, K(T). So there is a vector z ∈ K(T) such that x − u = z. Rearranging this equation gives x = u + z and so x ∈ M. So T^{-1}(v) ⊆ M and we see that M = T^{-1}(v), as desired. ■
This theorem, and its proof, should remind you very much of Theorem PSPHS. Additionally, you might go back and review Example SPIAS. Can you tell now which is the only preimage to be a subspace?
Here is the cartoon which describes the “many-to-one” behavior of a typical linear transformation. Presume that T(u_i) = v_i, for i = 1, 2, 3, and as guaranteed by Theorem LTTZZ, T(0_U) = 0_V. Then four pre-images are depicted, each labeled slightly differently. T^{-1}(v_2) is the most general, employing Theorem KPI to provide two equal descriptions of the set. The most unusual is T^{-1}(0_V), which is equal to the kernel, K(T), and hence is a subspace (by Theorem KLTS). The subdivisions of the domain, U, are meant to suggest the partitioning of the domain by the collection of pre-images. It also suggests that each pre-image is of similar size or structure, since each is a “shifted” copy of the kernel. Notice that we cannot speak of the dimension of a pre-image, since it is almost never a subspace. Also notice that x, y ∈ V are elements of the codomain with empty pre-images.
[Cartoon: the pre-images u_1 + K(T), T^{-1}(v_2) = u_2 + K(T), T^{-1}(0_V) = K(T), and T^{-1}(v_3) subdivide U; x, y ∈ V have empty pre-images.]
Diagram KPI: Kernel and Pre-Image
The next theorem is one we will cite frequently, as it characterizes injections by the size of the kernel.

Theorem KILT Kernel of an Injective Linear Transformation
Suppose that T: U → V is a linear transformation. Then T is injective if and only if the kernel of T is trivial, K(T) = {0}.
Proof. (⇒) We assume T is injective and we need to establish that two sets are equal (Definition SE). Since the kernel is a subspace (Theorem KLTS), {0} ⊆ K(T). To establish the opposite inclusion, suppose x ∈ K(T).

T(x) = 0       Definition KLT
     = T(0)    Theorem LTTZZ

We can apply Definition ILT to conclude that x = 0. Therefore K(T) ⊆ {0} and by Definition SE, K(T) = {0}.

(⇐) To establish that T is injective, appeal to Definition ILT and begin with the assumption that T(x) = T(y). Then

T(x − y) = T(x) − T(y)    Definition LT
         = 0              Hypothesis

So x − y ∈ K(T) by Definition KLT and with the hypothesis that the kernel is trivial we conclude that x − y = 0. Then

y = y + 0 = y + (x − y) = x

thus establishing that T is injective by Definition ILT. ■
You might begin to think about how Diagram KPI would change if the linear transformation is injective, which would make the kernel trivial by Theorem KILT.
Example NIAQR Not injective, Archetype Q, revisited
We are now in a position to revisit our first example in this section, Example NIAQ. In that example, we showed that Archetype Q is not injective by constructing two vectors, which when used to evaluate the linear transformation provided the same output, thus violating Definition ILT. Just where did those two vectors come from?
The key is the vector

z = [3, 4, 1, 3, 3]
which you can check is an element of K(T) for Archetype Q. Choose a vector x at random, and then compute y = x + z (verify this computation back in Example NIAQ). Then

T(y) = T(x + z)
     = T(x) + T(z)    Definition LT
     = T(x) + 0       z ∈ K(T)
     = T(x)           Property Z
Whenever the kernel of a linear transformation is nontrivial, we can employ this device and conclude that the linear transformation is not injective. This is another way of viewing Theorem KILT. For an injective linear transformation, the kernel is trivial and our only choice for z is the zero vector, which will not help us create two different inputs for T that yield identical outputs. For every one of the archetypes that is not injective, there is an example presented of exactly this form. △
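This device is easy to verify numerically. A sketch (not part of the original text), using the matrix of Archetype Q from Example NIAQ and the kernel vector z above:

```python
import numpy as np

# Matrix of Archetype Q, read off Example NIAQ.
A = np.array([[ -2, 3,  3,  -6,  3],
              [-16, 9, 12, -28, 28],
              [-19, 7, 14, -32, 37],
              [-21, 9, 15, -35, 39],
              [ -9, 5,  7, -16, 16]])

# The kernel element used above, and the pair from Example NIAQ.
z = np.array([3, 4, 1, 3, 3])
x = np.array([1, 3, -1, 2, 4])
y = x + z                            # equals [4, 7, 0, 5, 7]

assert not np.any(A @ z)             # z is in the kernel: A z = 0
assert np.array_equal(A @ x, A @ y)  # distinct inputs, identical outputs
```

Any other choice of x would work just as well, which is why the example said not to worry about how x and y were selected.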
Example NIAO Not injective, Archetype O
In Example NKAO the kernel of Archetype O was determined to be

⟨{ [−2, 1, 1] }⟩

a subspace of C^3 with dimension 1. Since the kernel is not trivial, Theorem KILT tells us that T is not injective. △
Example IAP Injective, Archetype P
In Example TKAP it was shown that the linear transformation in Archetype P has a trivial kernel. So by Theorem KILT, T is injective. △
Subsection ILTLI
Injective Linear Transformations and Linear Independence

There is a connection between injective linear transformations and linearly independent sets that we will make precise in the next two theorems. However, more informally, we can get a feel for this connection when we think about how each property is defined. A set of vectors is linearly independent if the only relation of linear dependence is the trivial one. A linear transformation is injective if the only way two input vectors can produce the same output is in the trivial way, when both input vectors are equal.
Theorem ILTLI Injective Linear Transformations and Linear Independence
Suppose that T: U → V is an injective linear transformation and

S = {u_1, u_2, u_3, ..., u_t}

is a linearly independent subset of U. Then

R = {T(u_1), T(u_2), T(u_3), ..., T(u_t)}

is a linearly independent subset of V.
Proof. Begin with a relation of linear dependence on R (Definition RLD, Definition LI),

a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + ... + a_tT(u_t) = 0
T(a_1u_1 + a_2u_2 + a_3u_3 + ... + a_tu_t) = 0         Theorem LTLC
a_1u_1 + a_2u_2 + a_3u_3 + ... + a_tu_t ∈ K(T)         Definition KLT
a_1u_1 + a_2u_2 + a_3u_3 + ... + a_tu_t ∈ {0}          Theorem KILT
a_1u_1 + a_2u_2 + a_3u_3 + ... + a_tu_t = 0            Definition SET

Since this is a relation of linear dependence on the linearly independent set S, we can conclude that

a_1 = 0    a_2 = 0    a_3 = 0    ...    a_t = 0

and this establishes that R is a linearly independent set. ■
Theorem ILTB Injective Linear Transformations and Bases
Suppose that T: U → V is a linear transformation and

B = {u_1, u_2, u_3, ..., u_m}

is a basis of U. Then T is injective if and only if

C = {T(u_1), T(u_2), T(u_3), ..., T(u_m)}

is a linearly independent subset of V.
Proof. (⇒) Assume T is injective. Since B is a basis, we know B is linearly independent (Definition B). Then Theorem ILTLI says that C is a linearly independent subset of V.

(⇐) Assume that C is linearly independent. To establish that T is injective, we will show that the kernel of T is trivial (Theorem KILT). Suppose that u ∈ K(T). As an element of U, we can write u as a linear combination of the basis vectors in B (uniquely). So there are scalars, a_1, a_2, a_3, ..., a_m, such that

u = a_1u_1 + a_2u_2 + a_3u_3 + ... + a_mu_m

Then,

0 = T(u)                                                   Definition KLT
  = T(a_1u_1 + a_2u_2 + a_3u_3 + ... + a_mu_m)             Definition SSVS
  = a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + ... + a_mT(u_m)    Theorem LTLC

This is a relation of linear dependence (Definition RLD) on the linearly independent set C, so the scalars are all zero: a_1 = a_2 = a_3 = ... = a_m = 0. Then

u = a_1u_1 + a_2u_2 + a_3u_3 + ... + a_mu_m
  = 0u_1 + 0u_2 + 0u_3 + ... + 0u_m         Theorem ZSSM
  = 0 + 0 + 0 + ... + 0                     Theorem ZSSM
  = 0                                       Property Z

Since u was chosen as an arbitrary vector from K(T), we have K(T) = {0} and Theorem KILT tells us that T is injective. ■
Subsection ILTD
Injective Linear Transformations and Dimension

Theorem ILTD Injective Linear Transformations and Dimension
Suppose that T: U → V is an injective linear transformation. Then dim(U) ≤ dim(V).
Proof. Suppose to the contrary that m = dim(U) > dim(V) = t. Let B be a basis of U, which will then contain m vectors. Apply T to each element of B to form a set C that is a subset of V. By Theorem ILTB, C is linearly independent and therefore must contain m distinct vectors. So we have found a set of m linearly independent vectors in V, a vector space of dimension t, with m > t. However, this contradicts Theorem G, so our assumption is false and dim(U) ≤ dim(V). ■
Example NIDAU Not injective by dimension, Archetype U
The linear transformation in Archetype U is

T: M_23 → C^4,  T([a b c; d e f]) = [ a + 2b + 12c − 3d + e + 6f ]
                                    [ 2a − b − c + d − 11f ]
                                    [ a + b + 7c + 2d + e − 3f ]
                                    [ a + 2b + 12c + 5e − 5f ]

Since dim(M_23) = 6 > 4 = dim(C^4), T cannot be injective, for then T would violate Theorem ILTD. △
Notice that the previous example made no use of the actual formula defining the function. A mere comparison of the dimensions of the domain and codomain is enough to conclude that the linear transformation is not injective. Archetype M and Archetype N are two more examples of linear transformations that have “big” domains and “small” codomains, resulting in “collisions” of outputs, and thus are non-injective linear transformations.
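The dimension count can also be seen at the matrix level. A sketch (not part of the original text): acting on the six entries (a, b, c, d, e, f) of a matrix in M_23, Archetype U is a 4 × 6 matrix, so it has at most 4 pivots, its kernel has dimension at least 2, and Theorem KILT rules out injectivity.

```python
import numpy as np

# Matrix of Archetype U acting on (a, b, c, d, e, f), read off the
# definition in Example NIDAU.
A = np.array([[1,  2, 12, -3, 1,   6],
              [2, -1, -1,  1, 0, -11],
              [1,  1,  7,  2, 1,  -3],
              [1,  2, 12,  0, 5,  -5]])

# A 4 x 6 matrix has rank at most 4, so its nullity is at least 2:
# the kernel is nontrivial and T cannot be injective.
rank = np.linalg.matrix_rank(A)
print(6 - rank)  # nullity, at least 2
```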
Subsection CILT
Composition of Injective Linear Transformations

In Subsection LT.NLTFO we saw how to combine linear transformations to build new linear transformations, specifically, how to build the composition of two linear transformations (Definition LTC). It will be useful later to know that the composition of injective linear transformations is again injective, so we prove that here.
Theorem CILTI Composition of Injective Linear Transformations is Injective
Suppose that T: U → V and S: V → W are injective linear transformations. Then (S ∘ T): U → W is an injective linear transformation.
Proof. That the composition is a linear transformation was established in Theorem CLTLT, so we need only establish that the composition is injective. Applying Definition ILT, choose x, y from U. Then if (S ∘ T)(x) = (S ∘ T)(y),
\[
\begin{aligned}
\Rightarrow\quad S(T(x))&=S(T(y))&&\text{Definition LTC}\\
\Rightarrow\quad T(x)&=T(y)&&\text{Definition ILT for }S\\
\Rightarrow\quad x&=y&&\text{Definition ILT for }T
\end{aligned}
\]
∎
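At the level of matrices, injectivity of a map C^n → C^m amounts to full column rank (rank n), and Theorem CILTI then says full column rank survives composition. A small numeric sanity check; the matrices here are illustrative choices, not taken from the text:

```python
import numpy as np

A = np.array([[1, 0], [2, 1], [0, 3]])                      # T : C^2 -> C^3
B = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1], [2, 1, 1]])  # S : C^3 -> C^4

assert np.linalg.matrix_rank(A) == 2      # T injective: rank = dim of domain
assert np.linalg.matrix_rank(B) == 3      # S injective
assert np.linalg.matrix_rank(B @ A) == 2  # S o T is injective again
```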
Reading Questions
1. Suppose T: C^8 → C^5 is a linear transformation. Why is T not injective?
2. Describe the kernel of an injective linear transformation.
3. Theorem KPI should remind you of Theorem PSPHS. Why do we say this?
Exercises
C10 Each archetype below is a linear transformation. Compute the kernel for each.
Archetype M, Archetype N, Archetype O, Archetype P, Archetype Q, Archetype R, Archetype S, Archetype T, Archetype U, Archetype V, Archetype W, Archetype X
C20† The linear transformation T: C^4 → C^3 is not injective. Find two inputs x, y ∈ C^4 that yield the same output (that is, T(x) = T(y)).
\[
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}\right)=
\begin{bmatrix}
2x_1+x_2+x_3\\
-x_1+3x_2+x_3-x_4\\
x_1+x_2+2x_3-2x_4
\end{bmatrix}
\]
C25† Define the linear transformation
\[
T\colon \mathbb{C}^{3}\to\mathbb{C}^{2},\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)=
\begin{bmatrix}
2x_1-x_2+5x_3\\
-4x_1+2x_2-10x_3
\end{bmatrix}
\]
Find a basis for the kernel of T, K(T). Is T injective?
C26† Let
\[
A=\begin{bmatrix}1&2&3&1&0\\2&-1&1&0&1\\1&2&-1&-2&1\\1&3&2&1&2\end{bmatrix}
\]
and let T: C^5 → C^4 be given by T(x) = Ax. Is T injective? (Hint: no calculation is required.)
C27† Let T: C^3 → C^3 be given by
\[
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)=
\begin{bmatrix}2x+y+z\\x-y+2z\\x+2y-z\end{bmatrix}
\]
Find K(T). Is T injective?
C28† Let
\[
A=\begin{bmatrix}1&2&3&1\\2&-1&1&0\\1&2&-1&-2\\1&3&2&1\end{bmatrix}
\]
and let T: C^4 → C^4 be given by T(x) = Ax. Find K(T). Is T injective?
C29† Let
\[
A=\begin{bmatrix}1&2&1&1\\2&1&1&0\\1&2&1&2\\1&2&1&1\end{bmatrix}
\]
and let T: C^4 → C^4 be given by T(x) = Ax. Find K(T). Is T injective?
C30† Let T: M22 → P2 be given by
\[
T\left(\begin{bmatrix}a&b\\c&d\end{bmatrix}\right)=(a+b)+(a+c)x+(a+d)x^{2}
\]
Is T injective? Find K(T).
C31† Given that the linear transformation T: C^3 → C^3,
\[
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)=\begin{bmatrix}2x+y\\2y+z\\x+2z\end{bmatrix}
\]
is injective, show directly that {T(e_1), T(e_2), T(e_3)} is a linearly independent set.
C32† Given that the linear transformation T: C^2 → C^3,
\[
T\left(\begin{bmatrix}x\\y\end{bmatrix}\right)=\begin{bmatrix}x+y\\2x+y\\x+2y\end{bmatrix}
\]
is injective, show directly that {T(e_1), T(e_2)} is a linearly independent set.
C33† Given that the linear transformation T: C^3 → C^5,
\[
T\left(\begin{bmatrix}x\\y\\z\end{bmatrix}\right)=
\begin{bmatrix}1&3&2\\0&1&1\\1&2&1\\1&0&1\\3&1&2\end{bmatrix}
\begin{bmatrix}x\\y\\z\end{bmatrix}
\]
is injective, show directly that {T(e_1), T(e_2), T(e_3)} is a linearly independent set.
C40† Show that the linear transformation R is not injective by finding two different elements of the domain, x and y, such that R(x) = R(y). (S22 is the vector space of symmetric 2 × 2 matrices.)
\[
R\colon S_{22}\to P_{1},\quad
R\left(\begin{bmatrix}a&b\\b&c\end{bmatrix}\right)=(2a-b+c)+(a+b+2c)x
\]
M60 Suppose U and V are vector spaces. Define the function Z: U → V by Z(u) = 0_V for every u ∈ U. Then by Exercise LT.M60, Z is a linear transformation. Formulate a condition on U that is equivalent to Z being an injective linear transformation. In other words, fill in the blank to complete the following statement (and then give a proof): Z is injective if and only if U is ________. (See Exercise SLT.M60, Exercise IVLT.M60.)
T10† Suppose T: U → V is a linear transformation. For which vectors v ∈ V is T⁻¹(v) a subspace of U?
T15† Suppose that T: U → V and S: V → W are linear transformations. Prove the following relationship between kernels.
\[
K(T)\subseteq K(S\circ T)
\]
T20† Suppose that A is an m × n matrix. Define the linear transformation T by
\[
T\colon \mathbb{C}^{n}\to\mathbb{C}^{m},\quad T(x)=Ax
\]
Prove that the kernel of T equals the null space of A, K(T) = N(A).
Section SLT
Surjective Linear Transformations
The companion to an injection is a surjection. Surjective linear transformations are closely related to spanning sets and ranges. So as you read this section reflect back on Section ILT and note the parallels and the contrasts. In the next section, Section IVLT, we will combine the two properties.
Subsection SLT
Surjective Linear Transformations
As usual, we lead with a definition.
Definition SLT Surjective Linear Transformation
Suppose T: U → V is a linear transformation. Then T is surjective if for every v ∈ V there exists a u ∈ U so that T(u) = v. □
Given an arbitrary function, it is possible for there to be an element of the codomain that is not an output of the function (think about the function y = f(x) = x^2 and the codomain element y = −3). For a surjective function, this never happens. If we choose any element of the codomain (v ∈ V) then there must be an input from the domain (u ∈ U) which will create the output when used to evaluate the linear transformation (T(u) = v). Some authors prefer the term onto where we use surjective, and we will sometimes refer to a surjective linear transformation as a surjection.
Subsection ESLT
Examples of Surjective Linear Transformations
It is perhaps most instructive to examine a linear transformation that is not surjective first.
Example NSAQ Not surjective, Archetype Q
Archetype Q is the linear transformation
\[
T\colon \mathbb{C}^{5}\to\mathbb{C}^{5},\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)=
\begin{bmatrix}
-2x_1+3x_2+3x_3-6x_4+3x_5\\
-16x_1+9x_2+12x_3-28x_4+28x_5\\
-19x_1+7x_2+14x_3-32x_4+37x_5\\
-21x_1+9x_2+15x_3-35x_4+39x_5\\
-9x_1+5x_2+7x_3-16x_4+16x_5
\end{bmatrix}
\]
We will demonstrate that
\[
v=\begin{bmatrix}-1\\2\\3\\-1\\4\end{bmatrix}
\]
is an unobtainable element of the codomain. Suppose to the contrary that u is an element of the domain such that T(u) = v.
Then
\[
\begin{bmatrix}-1\\2\\3\\-1\\4\end{bmatrix}
= v = T(u)
= T\left(\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}\right)
\]
\[
=\begin{bmatrix}
-2u_1+3u_2+3u_3-6u_4+3u_5\\
-16u_1+9u_2+12u_3-28u_4+28u_5\\
-19u_1+7u_2+14u_3-32u_4+37u_5\\
-21u_1+9u_2+15u_3-35u_4+39u_5\\
-9u_1+5u_2+7u_3-16u_4+16u_5
\end{bmatrix}
=\begin{bmatrix}
-2&3&3&-6&3\\
-16&9&12&-28&28\\
-19&7&14&-32&37\\
-21&9&15&-35&39\\
-9&5&7&-16&16
\end{bmatrix}
\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}
\]
Now we recognize the appropriate input vector u as a solution to a linear system of equations. Form the augmented matrix of the system, and row-reduce to
\[
\begin{bmatrix}
1&0&0&0&-1&0\\
0&1&0&0&-\frac{4}{3}&0\\
0&0&1&0&-\frac{1}{3}&0\\
0&0&0&1&-1&0\\
0&0&0&0&0&1
\end{bmatrix}
\]
With a leading 1 in the last column, Theorem RCLS tells us the system is inconsistent. From the absence of any solutions we conclude that no such vector u exists, and by Definition SLT, T is not surjective.
Again, do not concern yourself with how v was selected, as this will be explained shortly. However, do understand why this vector provides enough evidence to conclude that T is not surjective. △
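The row reduction in this example is easy to replay by machine. Here is a sketch with SymPy (an assumed stand-in for the Sage the book uses elsewhere): the pivot columns of the reduced augmented matrix include the final column, which is exactly the Theorem RCLS test for inconsistency.

```python
from sympy import Matrix

# Coefficient matrix of Archetype Q and the unobtainable vector v.
A = Matrix([
    [ -2, 3,  3,  -6,  3],
    [-16, 9, 12, -28, 28],
    [-19, 7, 14, -32, 37],
    [-21, 9, 15, -35, 39],
    [ -9, 5,  7, -16, 16],
])
v = Matrix([-1, 2, 3, -1, 4])

rref, pivots = A.row_join(v).rref()

# A pivot in the last (augmented) column means the system T(u) = v has
# no solution, so T is not surjective (Theorem RCLS, Definition SLT).
assert 5 in pivots
```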
Here is a cartoon of a non-surjective linear transformation. Notice that the central feature of this cartoon is that the vector v ∈ V does not have an arrow pointing to it, implying there is no u ∈ U such that T(u) = v. Even though this happens again with a second unnamed vector in V, it only takes one occurrence to destroy the possibility of surjectivity.
Diagram NSLT: Non-Surjective Linear Transformation
To show that a linear transformation is not surjective, it is enough to find a single element of the codomain that is never created by any input, as in Example NSAQ. However, to show that a linear transformation is surjective we must establish that every element of the codomain occurs as an output of the linear transformation for some appropriate input.
Example SAR Surjective, Archetype R
Archetype R is the linear transformation
\[
T\colon \mathbb{C}^{5}\to\mathbb{C}^{5},\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)=
\begin{bmatrix}
-65x_1+128x_2+10x_3-262x_4+40x_5\\
36x_1-73x_2-x_3+151x_4-16x_5\\
-44x_1+88x_2+5x_3-180x_4+24x_5\\
34x_1-68x_2-3x_3+140x_4-18x_5\\
12x_1-24x_2-x_3+49x_4-5x_5
\end{bmatrix}
\]
To establish that R is surjective we must begin with a totally arbitrary element of the codomain, v, and somehow find an input vector u such that T(u) = v. We desire,
\[
T(u)=v
\]
\[
\begin{bmatrix}
-65u_1+128u_2+10u_3-262u_4+40u_5\\
36u_1-73u_2-u_3+151u_4-16u_5\\
-44u_1+88u_2+5u_3-180u_4+24u_5\\
34u_1-68u_2-3u_3+140u_4-18u_5\\
12u_1-24u_2-u_3+49u_4-5u_5
\end{bmatrix}
=\begin{bmatrix}v_1\\v_2\\v_3\\v_4\\v_5\end{bmatrix}
\]
\[
\begin{bmatrix}
-65&128&10&-262&40\\
36&-73&-1&151&-16\\
-44&88&5&-180&24\\
34&-68&-3&140&-18\\
12&-24&-1&49&-5
\end{bmatrix}
\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}
=\begin{bmatrix}v_1\\v_2\\v_3\\v_4\\v_5\end{bmatrix}
\]
We recognize this equation as a system of equations in the variables u_i, but our vector of constants contains symbols. In general, we would have to row-reduce the augmented matrix by hand, due to the symbolic final column. However, in this particular example, the 5 × 5 coefficient matrix is nonsingular and so has an inverse (Theorem NI, Definition MI).
\[
\begin{bmatrix}
-65&128&10&-262&40\\
36&-73&-1&151&-16\\
-44&88&5&-180&24\\
34&-68&-3&140&-18\\
12&-24&-1&49&-5
\end{bmatrix}^{-1}
=\begin{bmatrix}
-47&92&1&-181&-14\\
27&-55&\frac{7}{2}&\frac{221}{2}&11\\
-32&64&-1&-126&-12\\
25&-50&\frac{3}{2}&\frac{199}{2}&9\\
9&-18&\frac{1}{2}&\frac{71}{2}&4
\end{bmatrix}
\]
so we find that
\[
\begin{bmatrix}u_1\\u_2\\u_3\\u_4\\u_5\end{bmatrix}
=\begin{bmatrix}
-47&92&1&-181&-14\\
27&-55&\frac{7}{2}&\frac{221}{2}&11\\
-32&64&-1&-126&-12\\
25&-50&\frac{3}{2}&\frac{199}{2}&9\\
9&-18&\frac{1}{2}&\frac{71}{2}&4
\end{bmatrix}
\begin{bmatrix}v_1\\v_2\\v_3\\v_4\\v_5\end{bmatrix}
=\begin{bmatrix}
-47v_1+92v_2+v_3-181v_4-14v_5\\
27v_1-55v_2+\frac{7}{2}v_3+\frac{221}{2}v_4+11v_5\\
-32v_1+64v_2-v_3-126v_4-12v_5\\
25v_1-50v_2+\frac{3}{2}v_3+\frac{199}{2}v_4+9v_5\\
9v_1-18v_2+\frac{1}{2}v_3+\frac{71}{2}v_4+4v_5
\end{bmatrix}
\]
This establishes that if we are given any output vector v, we can use its components in this final expression to formulate a vector u such that T(u) = v. So by Definition SLT we now know that T is surjective. You might try to verify this condition in its full generality (i.e. evaluate T with this final expression and see if you get v as the result), or test it more specifically for some numerical vector v (see Exercise SLT.C20). △
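The inverse displayed in this example can be checked, and the surjectivity argument replayed, with a few lines of SymPy (assumed tooling, not the text's own):

```python
from sympy import Matrix, Rational, eye

# Coefficient matrix of Archetype R.
A = Matrix([
    [-65, 128, 10, -262,  40],
    [ 36, -73, -1,  151, -16],
    [-44,  88,  5, -180,  24],
    [ 34, -68, -3,  140, -18],
    [ 12, -24, -1,   49,  -5],
])

# The inverse as displayed in the text.
R = Rational
B = Matrix([
    [-47,  92,       1,      -181, -14],
    [ 27, -55, R(7, 2), R(221, 2),  11],
    [-32,  64,      -1,      -126, -12],
    [ 25, -50, R(3, 2), R(199, 2),   9],
    [  9, -18, R(1, 2),  R(71, 2),   4],
])
assert A * B == eye(5)        # the displayed inverse checks out

# Surjectivity in action: any v has the preimage u = A^{-1} v.
v = Matrix([3, 1, 4, 1, 5])   # an arbitrary test vector
u = B * v
assert A * u == v
```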
Here is the cartoon for a surjective linear transformation. It is meant to suggest that for every output in V there is at least one input in U that is sent to the output. (Even though we have depicted several inputs sent to each output.) The key feature of this cartoon is that there are no vectors in V without an arrow pointing to them.
Diagram SLT: Surjective Linear Transformation
Let us now examine a surjective linear transformation between abstract vector spaces.
Example SAV Surjective, Archetype V
Archetype V is defined by
\[
T\colon P_{3}\to M_{22},\quad
T\left(a+bx+cx^{2}+dx^{3}\right)=
\begin{bmatrix}a+b&a-2c\\d&b-d\end{bmatrix}
\]
To establish that the linear transformation is surjective, begin by choosing an arbitrary output. In this example, we need to choose an arbitrary 2 × 2 matrix, say
\[
v=\begin{bmatrix}x&y\\z&w\end{bmatrix}
\]
and we would like to find an input polynomial
\[
u=a+bx+cx^{2}+dx^{3}
\]
so that T(u) = v. So we have,
\[
\begin{bmatrix}x&y\\z&w\end{bmatrix}
= v
= T(u)
= T\left(a+bx+cx^{2}+dx^{3}\right)
=\begin{bmatrix}a+b&a-2c\\d&b-d\end{bmatrix}
\]
Matrix equality leads us to a system of four equations in the four unknowns a, b, c, d,
\[
\begin{aligned}
a+b&=x\\
a-2c&=y\\
d&=z\\
b-d&=w
\end{aligned}
\]
which can be rewritten as a matrix equation,
\[
\begin{bmatrix}1&1&0&0\\1&0&-2&0\\0&0&0&1\\0&1&0&-1\end{bmatrix}
\begin{bmatrix}a\\b\\c\\d\end{bmatrix}
=\begin{bmatrix}x\\y\\z\\w\end{bmatrix}
\]
The coefficient matrix is nonsingular, hence it has an inverse,
\[
\begin{bmatrix}1&1&0&0\\1&0&-2&0\\0&0&0&1\\0&1&0&-1\end{bmatrix}^{-1}
=\begin{bmatrix}1&0&-1&-1\\0&0&1&1\\\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}\\0&0&1&0\end{bmatrix}
\]
so we have
\[
\begin{bmatrix}a\\b\\c\\d\end{bmatrix}
=\begin{bmatrix}1&0&-1&-1\\0&0&1&1\\\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}\\0&0&1&0\end{bmatrix}
\begin{bmatrix}x\\y\\z\\w\end{bmatrix}
=\begin{bmatrix}x-z-w\\z+w\\\frac{1}{2}(x-y-z-w)\\z\end{bmatrix}
\]
So the input polynomial u = (x − z − w) + (z + w)x + ½(x − y − z − w)x² + zx³ will yield the output matrix v, no matter what form v takes. This means by Definition SLT that T is surjective. All the same, let us do a concrete demonstration and evaluate T with u,
\[
\begin{aligned}
T(u)&=T\left((x-z-w)+(z+w)x+\tfrac{1}{2}(x-y-z-w)x^{2}+zx^{3}\right)\\
&=\begin{bmatrix}(x-z-w)+(z+w)&(x-z-w)-2\left(\tfrac{1}{2}(x-y-z-w)\right)\\z&(z+w)-z\end{bmatrix}\\
&=\begin{bmatrix}x&y\\z&w\end{bmatrix}\\
&=v
\end{aligned}
\]
△
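The 4 × 4 system at the heart of this example is small enough to verify mechanically. A sketch with SymPy (an assumption of convenience; the text itself works by hand):

```python
from sympy import Matrix, Rational, eye, symbols

x, y, z, w = symbols('x y z w')

# Coefficient matrix of the system a+b=x, a-2c=y, d=z, b-d=w.
M = Matrix([
    [1, 1,  0,  0],
    [1, 0, -2,  0],
    [0, 0,  0,  1],
    [0, 1,  0, -1],
])

half = Rational(1, 2)
Minv = Matrix([
    [1,     0,     -1,    -1],
    [0,     0,      1,     1],
    [half, -half, -half, -half],
    [0,     0,      1,     0],
])
assert M * Minv == eye(4)   # the inverse displayed in the text checks out

# Coefficients (a, b, c, d) of the input polynomial for an arbitrary output.
coeffs = Minv * Matrix([x, y, z, w])
assert list(coeffs) == [x - z - w, z + w, (x - y - z - w)/2, z]
```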
Subsection RLT
Range of a Linear Transformation
For a linear transformation T: U → V, the range is a subset of the codomain V. Informally, it is the set of all outputs that the transformation creates when fed every possible input from the domain. It will have some natural connections with the column space of a matrix, so we will keep the same notation, and if you think about your objects, then there should be little confusion. Here is the careful definition.
Definition RLT Range of a Linear Transformation
Suppose T: U → V is a linear transformation. Then the range of T is the set
\[
R(T)=\{\,T(u)\mid u\in U\,\}
\]
□
Example RAO Range, Archetype O
Archetype O is the linear transformation
\[
T\colon \mathbb{C}^{3}\to\mathbb{C}^{5},\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix}\right)=
\begin{bmatrix}
-x_1+x_2-3x_3\\
-x_1+2x_2-4x_3\\
x_1+x_2+x_3\\
2x_1+3x_2+x_3\\
x_1+2x_3
\end{bmatrix}
\]
To determine the elements of C^5 in R(T), find those vectors v such that T(u) = v for some u ∈ C^3,
\[
\begin{aligned}
v&=T(u)\\
&=\begin{bmatrix}
-u_1+u_2-3u_3\\
-u_1+2u_2-4u_3\\
u_1+u_2+u_3\\
2u_1+3u_2+u_3\\
u_1+2u_3
\end{bmatrix}\\
&=\begin{bmatrix}-u_1\\-u_1\\u_1\\2u_1\\u_1\end{bmatrix}
+\begin{bmatrix}u_2\\2u_2\\u_2\\3u_2\\0\end{bmatrix}
+\begin{bmatrix}-3u_3\\-4u_3\\u_3\\u_3\\2u_3\end{bmatrix}
\end{aligned}
\]
\[
=u_1\begin{bmatrix}-1\\-1\\1\\2\\1\end{bmatrix}
+u_2\begin{bmatrix}1\\2\\1\\3\\0\end{bmatrix}
+u_3\begin{bmatrix}-3\\-4\\1\\1\\2\end{bmatrix}
\]
This says that every output of T (in other words, the vector v) can be written as a linear combination of the three vectors
\[
\begin{bmatrix}-1\\-1\\1\\2\\1\end{bmatrix}\quad
\begin{bmatrix}1\\2\\1\\3\\0\end{bmatrix}\quad
\begin{bmatrix}-3\\-4\\1\\1\\2\end{bmatrix}
\]
using the scalars u_1, u_2, u_3. Furthermore, since u can be any element of C^3, every such linear combination is an output. This means that
\[
R(T)=\left\langle\left\{
\begin{bmatrix}-1\\-1\\1\\2\\1\end{bmatrix},\,
\begin{bmatrix}1\\2\\1\\3\\0\end{bmatrix},\,
\begin{bmatrix}-3\\-4\\1\\1\\2\end{bmatrix}
\right\}\right\rangle
\]
The three vectors in this spanning set for R(T) form a linearly dependent set (check this!). So we can find a more economical presentation by any of the various methods from Section CRS and Section FS. We will place the vectors into a matrix as rows, row-reduce, toss out zero rows and appeal to Theorem BRS, so we can describe the range of T with a basis,
\[
R(T)=\left\langle\left\{
\begin{bmatrix}1\\0\\-3\\-7\\-2\end{bmatrix},\,
\begin{bmatrix}0\\1\\2\\5\\1\end{bmatrix}
\right\}\right\rangle
\]
△
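The "place as rows, row-reduce" recipe is easy to confirm. A sketch with SymPy (assumed tooling):

```python
from sympy import Matrix

# The three spanning vectors for R(T) from Example RAO, as rows.
rows = Matrix([
    [-1, -1, 1, 2, 1],
    [ 1,  2, 1, 3, 0],
    [-3, -4, 1, 1, 2],
])

rref, pivots = rows.rref()

# The set is dependent (rank 2), and the nonzero rows of the RREF are
# exactly the two basis vectors of R(T) (Theorem BRS).
assert rows.rank() == 2
assert list(rref.row(0)) == [1, 0, -3, -7, -2]
assert list(rref.row(1)) == [0, 1, 2, 5, 1]
assert list(rref.row(2)) == [0, 0, 0, 0, 0]
```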
We know that the span of a set of vectors is always a subspace (Theorem SSS), so the range computed in Example RAO is also a subspace. This is no accident; the range of a linear transformation is always a subspace.
Theorem RLTS Range of a Linear Transformation is a Subspace
Suppose that T: U → V is a linear transformation. Then the range of T, R(T), is a subspace of V.
Proof. We can apply the three-part test of Theorem TSS. First, 0_U ∈ U and T(0_U) = 0_V by Theorem LTTZZ, so 0_V ∈ R(T) and we know that the range is nonempty.
Suppose we assume that x, y ∈ R(T). Is x + y ∈ R(T)? If x, y ∈ R(T) then we know there are vectors w, z ∈ U such that T(w) = x and T(z) = y. Because U is a vector space, additive closure (Property AC) implies that w + z ∈ U. Then
\[
\begin{aligned}
T(w+z)&=T(w)+T(z)&&\text{Definition LT}\\
&=x+y&&\text{Definition of }w\text{ and }z
\end{aligned}
\]
So we have found an input, w + z, which when fed into T creates x + y as an output. This qualifies x + y for membership in R(T). So we have additive closure.
Suppose we assume that α ∈ C and x ∈ R(T). Is αx ∈ R(T)? If x ∈ R(T), then there is a vector w ∈ U such that T(w) = x. Because U is a vector space, scalar closure implies that αw ∈ U. Then
\[
\begin{aligned}
T(\alpha w)&=\alpha T(w)&&\text{Definition LT}\\
&=\alpha x&&\text{Definition of }w
\end{aligned}
\]
So we have found an input (αw) which when fed into T creates αx as an output. This qualifies αx for membership in R(T). So we have scalar closure and Theorem TSS tells us that R(T) is a subspace of V. ∎
Let us compute another range, now that we know in advance that it will be a subspace.
Example FRAN Full range, Archetype N
Archetype N is the linear transformation
\[
T\colon \mathbb{C}^{5}\to\mathbb{C}^{3},\quad
T\left(\begin{bmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{bmatrix}\right)=
\begin{bmatrix}
2x_1+x_2+3x_3-4x_4+5x_5\\
x_1-2x_2+3x_3-9x_4+3x_5\\
3x_1+4x_3-6x_4+5x_5
\end{bmatrix}
\]
To determine the elements of C^3 in R(T), find those vectors v such that T(u) = v for some u ∈ C^5,
\[
\begin{aligned}
v&=T(u)\\
&=\begin{bmatrix}
2u_1+u_2+3u_3-4u_4+5u_5\\
u_1-2u_2+3u_3-9u_4+3u_5\\
3u_1+4u_3-6u_4+5u_5
\end{bmatrix}\\
&=\begin{bmatrix}2u_1\\u_1\\3u_1\end{bmatrix}
+\begin{bmatrix}u_2\\-2u_2\\0\end{bmatrix}
+\begin{bmatrix}3u_3\\3u_3\\4u_3\end{bmatrix}
+\begin{bmatrix}-4u_4\\-9u_4\\-6u_4\end{bmatrix}
+\begin{bmatrix}5u_5\\3u_5\\5u_5\end{bmatrix}\\
&=u_1\begin{bmatrix}2\\1\\3\end{bmatrix}
+u_2\begin{bmatrix}1\\-2\\0\end{bmatrix}
+u_3\begin{bmatrix}3\\3\\4\end{bmatrix}
+u_4\begin{bmatrix}-4\\-9\\-6\end{bmatrix}
+u_5\begin{bmatrix}5\\3\\5\end{bmatrix}
\end{aligned}
\]
This says that every output of T (in other words, the vector v) can be written as a linear combination of the five vectors
\[
\begin{bmatrix}2\\1\\3\end{bmatrix}\quad
\begin{bmatrix}1\\-2\\0\end{bmatrix}\quad
\begin{bmatrix}3\\3\\4\end{bmatrix}\quad
\begin{bmatrix}-4\\-9\\-6\end{bmatrix}\quad
\begin{bmatrix}5\\3\\5\end{bmatrix}
\]
using the scalars u_1, u_2, u_3, u_4, u_5. Furthermore, since u can be any element of C^5, every such linear combination is an output. This means that
\[
R(T)=\left\langle\left\{
\begin{bmatrix}2\\1\\3\end{bmatrix},\,
\begin{bmatrix}1\\-2\\0\end{bmatrix},\,
\begin{bmatrix}3\\3\\4\end{bmatrix},\,
\begin{bmatrix}-4\\-9\\-6\end{bmatrix},\,
\begin{bmatrix}5\\3\\5\end{bmatrix}
\right\}\right\rangle
\]
The five vectors in this spanning set for R(T) form a linearly dependent set (Theorem MVSLD). So we can find a more economical presentation by any of the various methods from Section CRS and Section FS. We will place the vectors into a matrix as rows, row-reduce, toss out zero rows and appeal to Theorem BRS, so we can describe the range of T with a (nice) basis,
\[
R(T)=\left\langle\left\{
\begin{bmatrix}1\\0\\0\end{bmatrix},\,
\begin{bmatrix}0\\1\\0\end{bmatrix},\,
\begin{bmatrix}0\\0\\1\end{bmatrix}
\right\}\right\rangle
=\mathbb{C}^{3}
\]
△
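Since the outputs of T are exactly the linear combinations of the columns of its coefficient matrix, the conclusion R(T) = C^3 is a rank statement. A quick check with SymPy (assumed tooling):

```python
from sympy import Matrix

# Coefficient matrix of Archetype N; its columns span R(T).
A = Matrix([
    [2,  1, 3, -4, 5],
    [1, -2, 3, -9, 3],
    [3,  0, 4, -6, 5],
])

# Rank 3 means the columns span all of C^3, so T is surjective.
assert A.rank() == 3

# Row-reducing the five spanning vectors (as rows) recovers the nice basis.
rref, _ = A.T.rref()
assert [list(rref.row(i)) for i in range(3)] == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```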
In contrast to injective linear transformations having small (trivial) kernels (Theorem KILT), surjective linear transformations have large ranges, as indicated in the next theorem.
Theorem RSLT Range of a Surjective Linear Transformation
Suppose that T: U → V is a linear transformation. Then T is surjective if and only if the range of T equals the codomain, R(T) = V.
Proof. (⇒) By Definition RLT, we know that R(T) ⊆ V. To establish the reverse inclusion, assume v ∈ V. Then since T is surjective (Definition SLT), there exists a vector u ∈ U so that T(u) = v. However, the existence of u gains v membership in R(T), so V ⊆ R(T). Thus, R(T) = V.
(⇐) To establish that T is surjective, choose v ∈ V. Since we are assuming that R(T) = V, v ∈ R(T). This says there is a vector u ∈ U so that T(u) = v, i.e. T is surjective. ∎
Example NSAQR Not surjective, Archetype Q, revisited
We are now in a position to revisit our first example in this section, Example NSAQ. In that example, we showed that Archetype Q is not surjective by constructing a vector in the codomain where no element of the domain could be used to evaluate the linear transformation to create the output, thus violating Definition SLT. Just where did this vector come from?
The short answer is that the vector
\[
v=\begin{bmatrix}-1\\2\\3\\-1\\4\end{bmatrix}
\]
was constructed to lie outside of the range of T. How was this accomplished? First, the range of T is given by
\[
R(T)=\left\langle\left\{
\begin{bmatrix}1\\0\\0\\0\\1\end{bmatrix},\,
\begin{bmatrix}0\\1\\0\\0\\-1\end{bmatrix},\,
\begin{bmatrix}0\\0\\1\\0\\-1\end{bmatrix},\,
\begin{bmatrix}0\\0\\0\\1\\2\end{bmatrix}
\right\}\right\rangle
\]
Suppose an element of the range v* has its first 4 components equal to −1, 2, 3, −1, in that order. Then to be an element of R(T), we would have
\[
v^{*}=(-1)\begin{bmatrix}1\\0\\0\\0\\1\end{bmatrix}
+(2)\begin{bmatrix}0\\1\\0\\0\\-1\end{bmatrix}
+(3)\begin{bmatrix}0\\0\\1\\0\\-1\end{bmatrix}
+(-1)\begin{bmatrix}0\\0\\0\\1\\2\end{bmatrix}
=\begin{bmatrix}-1\\2\\3\\-1\\-8\end{bmatrix}
\]
So the only vector in the range with these first four components specified must have −8 in the fifth component. To set the fifth component to any other value (say, 4) will result in a vector (v in Example NSAQ) outside of the range. Any attempt to find an input for T that will produce v as an output will be doomed to failure.
Whenever the range of a linear transformation is not the whole codomain, we can employ this device and conclude that the linear transformation is not surjective. This is another way of viewing Theorem RSLT. For a surjective linear transformation, the range is all of the codomain and there is no choice for a vector v that lies in V, yet not in the range. For every one of the archetypes that is not surjective, there is an example presented of exactly this form. △
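The device of this example automates nicely. In the sketch below (SymPy, assumed tooling), the basis vectors of R(T) sit as columns of B; matching the first four components pins down the coefficients and hence forces the fifth component, and replacing it by 4 makes the augmented system inconsistent.

```python
from sympy import Matrix

# Basis of R(T) for Archetype Q, from Example NSAQR, as columns.
B = Matrix([
    [1,  0,  0, 0],
    [0,  1,  0, 0],
    [0,  0,  1, 0],
    [0,  0,  0, 1],
    [1, -1, -1, 2],
])

# First four components (-1, 2, 3, -1) force the coefficients outright.
v_star = B * Matrix([-1, 2, 3, -1])
assert list(v_star) == [-1, 2, 3, -1, -8]   # fifth component must be -8

# Any other fifth component (say 4) leaves the range: [B | v] is inconsistent.
v = Matrix([-1, 2, 3, -1, 4])
rref, pivots = B.row_join(v).rref()
assert 4 in pivots                          # pivot in the last column
```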
Example NSAO Not surjective, Archetype O
In Example RAO the range of Archetype O was determined to be
\[
R(T)=\left\langle\left\{
\begin{bmatrix}1\\0\\-3\\-7\\-2\end{bmatrix},\,
\begin{bmatrix}0\\1\\2\\5\\1\end{bmatrix}
\right\}\right\rangle
\]
a subspace of dimension 2 in C^5. Since R(T) ≠ C^5, Theorem RSLT says T is not surjective. △
Example SAN Surjective, Archetype N
The range of Archetype N was computed in Example FRAN to be
\[
R(T)=\left\langle\left\{
\begin{bmatrix}1\\0\\0\end{bmatrix},\,
\begin{bmatrix}0\\1\\0\end{bmatrix},\,
\begin{bmatrix}0\\0\\1\end{bmatrix}
\right\}\right\rangle
\]
Since the basis for this subspace is the set of standard unit vectors for C^3 (Theorem SUVB), we have R(T) = C^3 and by Theorem RSLT, T is surjective. △
Subsection SSSLT
Spanning Sets and Surjective Linear Transformations
Just as injective linear transformations are allied with linear independence (Theorem ILTLI, Theorem ILTB), surjective linear transformations are allied with spanning sets.
Theorem SSRLT Spanning Set for Range of a Linear Transformation
Suppose that T: U → V is a linear transformation and
\[
S=\{u_1,\,u_2,\,u_3,\,\dots,\,u_t\}
\]
spans U. Then
\[
R=\{T(u_1),\,T(u_2),\,T(u_3),\,\dots,\,T(u_t)\}
\]
spans R(T).
Proof. We need to establish that R(T) = ⟨R⟩, a set equality. First we establish that R(T) ⊆ ⟨R⟩. To this end, choose v ∈ R(T). Then there exists a vector u ∈ U such that T(u) = v (Definition RLT). Because S spans U there are scalars, a_1, a_2, a_3, ..., a_t, such that

u = a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_tu_t

Then

v = T(u)                                                     Definition RLT
  = T(a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_tu_t)               Definition SSVS
  = a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + ··· + a_tT(u_t)      Theorem LTLC

which establishes that v ∈ ⟨R⟩ (Definition SS). So R(T) ⊆ ⟨R⟩.
To establish the opposite inclusion, choose an element of the span of R, say v ∈ ⟨R⟩. Then there are scalars b_1, b_2, b_3, ..., b_t so that

v = b_1T(u_1) + b_2T(u_2) + b_3T(u_3) + ··· + b_tT(u_t)      Definition SS
  = T(b_1u_1 + b_2u_2 + b_3u_3 + ··· + b_tu_t)               Theorem LTLC

This demonstrates that v is an output of the linear transformation T, so v ∈ R(T). Therefore ⟨R⟩ ⊆ R(T), so we have the set equality R(T) = ⟨R⟩ (Definition SE). In other words, R spans R(T) (Definition SSVS). ■
Theorem SSRLT provides an easy way to begin the construction of a basis for the range of a linear transformation, since the construction of a spanning set requires simply evaluating the linear transformation on a spanning set of the domain. In practice the best choice for a spanning set of the domain would be as small as possible, in other words, a basis. The resulting spanning set for the codomain may not be linearly independent, so to find a basis for the range might require tossing out redundant vectors from the spanning set. Here is an example.
Example BRLT A basis for the range of a linear transformation
Define the linear transformation T: M_22 → P_2 by

\[ T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = (a + 2b + 8c + d) + (-3a + 2b + 5d)x + (a + b + 5c)x^2 \]

A convenient spanning set for M_22 is the basis

\[ S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \]

So by Theorem SSRLT, a spanning set for R(T) is

\[ R = \left\{ T\left(\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\right),\ T\left(\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\right),\ T\left(\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}\right),\ T\left(\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}\right) \right\} = \left\{ 1 - 3x + x^2,\ 2 + 2x + x^2,\ 8 + 5x^2,\ 1 + 5x \right\} \]

The set R is not linearly independent, so if we desire a basis for R(T), we need to eliminate some redundant vectors. Two particular relations of linear dependence on R are

(−2)(1 − 3x + x²) + (−3)(2 + 2x + x²) + (8 + 5x²) = 0 + 0x + 0x² = 0
(1 − 3x + x²) + (−1)(2 + 2x + x²) + (1 + 5x) = 0 + 0x + 0x² = 0

These, individually, allow us to remove 8 + 5x² and 1 + 5x from R without destroying the property that R spans R(T). The two remaining vectors are linearly independent (check this!), so we can write

R(T) = ⟨{1 − 3x + x², 2 + 2x + x²}⟩

and see that dim(R(T)) = 2. △
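The pruning in Example BRLT can be mechanized: write each image polynomial as a coefficient vector, row-reduce, and keep the nonzero rows. The following is a minimal sketch using exact rational arithmetic; the `rref` helper is ours, not a routine from the text.

```python
from fractions import Fraction

def rref(rows):
    # Reduced row-echelon form over the rationals (bare-bones, no size checks).
    rows = [[Fraction(x) for x in r] for r in rows]
    lead = 0
    for r in range(len(rows)):
        if lead >= len(rows[0]):
            break
        i = r
        while rows[i][lead] == 0:
            i += 1
            if i == len(rows):
                i = r
                lead += 1
                if lead == len(rows[0]):
                    return rows
        rows[i], rows[r] = rows[r], rows[i]
        rows[r] = [x / rows[r][lead] for x in rows[r]]
        for j in range(len(rows)):
            if j != r and rows[j][lead] != 0:
                rows[j] = [a - rows[j][lead] * b for a, b in zip(rows[j], rows[r])]
        lead += 1
    return rows

# Images of the basis of M22 under T from Example BRLT, written as
# coefficient vectors (constant, x, x^2) of polynomials in P2.
R = [[1, -3, 1],   # 1 - 3x + x^2
     [2,  2, 1],   # 2 + 2x + x^2
     [8,  0, 5],   # 8 + 5x^2
     [1,  5, 0]]   # 1 + 5x
basis = [r for r in rref(R) if any(r)]
print(len(basis))  # dim R(T) = 2
```

The two nonzero rows of the RREF give a (different but equally valid) basis of R(T); the count 2 agrees with the example.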
Elements of the range are precisely those elements of the codomain with nonempty pre-images.

Theorem RPI Range and Pre-Image
Suppose that T: U → V is a linear transformation. Then

v ∈ R(T) if and only if T^{−1}(v) ≠ ∅

Proof. (⇒) If v ∈ R(T), then there is a vector u ∈ U such that T(u) = v. This qualifies u for membership in T^{−1}(v), and thus the pre-image of v is not empty.
(⇐) Suppose the pre-image of v is not empty, so we can choose a vector u ∈ U such that T(u) = v. Then v ∈ R(T). ■
Now would be a good time to return to Diagram KPI, which depicted the pre-images of a non-surjective linear transformation. The vectors x, y ∈ V were elements of the codomain whose pre-images were empty, as we expect for a non-surjective linear transformation from the characterization in Theorem RPI.
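Theorem RPI recasts membership in the range as a pre-image question: v ∈ R(T) exactly when T(u) = v has a solution. For a matrix-defined map T(x) = Ax this is a consistency check, which can be phrased as comparing the rank of A to the rank of the augmented matrix [A | v]. A small sketch; the helper `in_range` and the sample matrix are our own, not from the text.

```python
from fractions import Fraction

def in_range(A, v):
    # v is in R(T) for T(x) = Ax iff augmenting A with v does not raise the rank.
    def rank(rows):
        rows = [list(map(Fraction, r)) for r in rows]
        rk = 0
        for col in range(len(rows[0])):
            piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
            if piv is None:
                continue
            rows[rk], rows[piv] = rows[piv], rows[rk]
            for i in range(len(rows)):
                if i != rk and rows[i][col] != 0:
                    f = rows[i][col] / rows[rk][col]
                    rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
            rk += 1
        return rk
    aug = [row + [vi] for row, vi in zip(A, v)]
    return rank(aug) == rank(A)

A = [[1, 2], [2, 4]]        # sample matrix whose range is 1-dimensional
print(in_range(A, [3, 6]))  # True: (3, 6) lies on the range line
print(in_range(A, [1, 0]))  # False: empty pre-image, so this map is not surjective
```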
Theorem SLTB Surjective Linear Transformations and Bases
Suppose that T: U → V is a linear transformation and

B = {u_1, u_2, u_3, ..., u_m}

is a basis of U. Then T is surjective if and only if

C = {T(u_1), T(u_2), T(u_3), ..., T(u_m)}

is a spanning set for V.

Proof. (⇒) Assume T is surjective. Since B is a basis, we know B is a spanning set of U (Definition B). Then Theorem SSRLT says that C spans R(T). But the hypothesis that T is surjective means V = R(T) (Theorem RSLT), so C spans V.
(⇐) Assume that C spans V. To establish that T is surjective, we will show that every element of V is an output of T for some input (Definition SLT). Suppose that v ∈ V. As an element of V, we can write v as a linear combination of the spanning set C. So there are scalars, b_1, b_2, b_3, ..., b_m, such that

v = b_1T(u_1) + b_2T(u_2) + b_3T(u_3) + ··· + b_mT(u_m)

Now define the vector u ∈ U by

u = b_1u_1 + b_2u_2 + b_3u_3 + ··· + b_mu_m

Then

T(u) = T(b_1u_1 + b_2u_2 + b_3u_3 + ··· + b_mu_m)
     = b_1T(u_1) + b_2T(u_2) + b_3T(u_3) + ··· + b_mT(u_m)   Theorem LTLC
     = v

So, given any choice of a vector v ∈ V, we can design an input u ∈ U to produce v as an output of T. Thus, by Definition SLT, T is surjective. ■
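Theorem SLTB gives a finite test for surjectivity: apply T to a basis of the domain and check whether the images span the codomain. A sketch under our own assumptions — the map `T` below is a made-up example (not from the text), and `rank` is a bare-bones Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(rows):
    # Rank via forward elimination over the rationals.
    rows = [[Fraction(x) for x in r] for r in rows]
    rk, lead, ncols = 0, 0, len(rows[0])
    for r in range(len(rows)):
        while lead < ncols:
            pivot = next((i for i in range(r, len(rows)) if rows[i][lead] != 0), None)
            if pivot is not None:
                break
            lead += 1
        if lead == ncols:
            break
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][lead] / rows[r][lead]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        rk += 1
        lead += 1
    return rk

def T(x):
    # A hypothetical map C^3 -> C^2 (illustration only).
    return [x[0] + x[1], x[1] + x[2]]

# Images of the standard basis of the domain; they span C^2 iff their rank is 2.
images = [T(e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
is_surjective = rank(images) == 2   # 2 = dim of the codomain
print(is_surjective)  # True
```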
Subsection SLTD
Surjective Linear Transformations and Dimension

Theorem SLTD Surjective Linear Transformations and Dimension
Suppose that T: U → V is a surjective linear transformation. Then dim(U) ≥ dim(V).

Proof. Suppose to the contrary that m = dim(U) < dim(V) = t. Let B be a basis of U, which will then contain m vectors. Apply T to each element of B to form a set C that is a subset of V. By Theorem SLTB, C is a spanning set of V with m or fewer vectors. So we have a set of m or fewer vectors that span V, a vector space of dimension t, with m < t. This contradicts Theorem SSLD, since a basis of V would then be a linearly independent set larger than a spanning set. So our assumption is false and dim(U) ≥ dim(V). ■
Theorem CSLTS Composition of Surjective Linear Transformations is Surjective
Suppose that T: U → V and S: V → W are surjective linear transformations. Then (S ∘ T): U → W is a surjective linear transformation.

Proof. Suppose that w ∈ W. Because S is surjective, there must be a vector v ∈ V such that S(v) = w. With the existence of v established, that T is surjective guarantees a vector u ∈ U such that T(u) = v. Now,

(S ∘ T)(u) = S(T(u))   Definition LTC
           = S(v)      Definition of u
           = w         Definition of v

This establishes that any element of the codomain (w) can be created by evaluating S ∘ T with the right input (u). Thus, by Definition SLT, S ∘ T is surjective. ■
Reading Questions

1. Suppose T: C^5 → C^8 is a linear transformation. Why is T not surjective?
2. What is the relationship between a surjective linear transformation and its range?
3. There are many similarities and differences between injective and surjective linear transformations. Compare and contrast these two different types of linear transformations. (This means going well beyond just stating their definitions.)
Exercises

C10 Each archetype below is a linear transformation. Compute the range for each.
Archetype M, Archetype N, Archetype O, Archetype P, Archetype Q, Archetype R, Archetype S, Archetype T, Archetype U, Archetype V, Archetype W, Archetype X

C20 Example SAR concludes with an expression for a vector u ∈ C^5 that we believe will create the vector v ∈ C^5 when used to evaluate T. That is, T(u) = v. Verify this assertion by actually evaluating T with u. If you do not have the patience to push around all these symbols, try choosing a numerical instance of v, compute u, and then compute T(u), which should result in v.

C22† The linear transformation S: C^4 → C^3 is not surjective. Find an output w ∈ C^3 that has an empty pre-image (that is, S^{−1}(w) = ∅).
\[ S\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 + x_2 + 3x_3 - 4x_4 \\ x_1 + 3x_2 + 4x_3 + 3x_4 \\ -x_1 + 2x_2 + x_3 + 7x_4 \end{bmatrix} \]
C23† Determine whether or not the following linear transformation T: C^5 → P_3 is surjective:

\[ T\left(\begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix}\right) = a + (b + c)x + (c + d)x^2 + (d + e)x^3 \]
C24† Determine whether or not the linear transformation T: P_3 → C^5 below is surjective:

\[ T\left(a + bx + cx^2 + dx^3\right) = \begin{bmatrix} a + b \\ b + c \\ c + d \\ a + c \\ b + d \end{bmatrix} \]
C25† Define the linear transformation

\[ T: \mathbb{C}^3 \to \mathbb{C}^2,\quad T\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right) = \begin{bmatrix} 2x_1 - x_2 + 5x_3 \\ -4x_1 + 2x_2 - 10x_3 \end{bmatrix} \]

Find a basis for the range of T, R(T). Is T surjective?
C26† Let T: C^3 → C^3 be given by

\[ T\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} a + b + 2c \\ 2c \\ a + b + c \end{bmatrix} \]

Find a basis of R(T). Is T surjective?
C27† Let T: C^3 → C^4 be given by

\[ T\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} a + b - c \\ a - b + c \\ -a + b + c \\ a + b + c \end{bmatrix} \]

Find a basis of R(T). Is T surjective?
C28† Let T: C^4 → M_22 be given by

\[ T\left(\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}\right) = \begin{bmatrix} a + b & a + b + c \\ a + b + c & a + d \end{bmatrix} \]

Find a basis of R(T). Is T surjective?
C29† Let T: P_2 → P_4 be given by T(p(x)) = x²p(x). Find a basis of R(T). Is T surjective?

C30† Let T: P_4 → P_3 be given by T(p(x)) = p′(x), where p′(x) is the derivative. Find a basis of R(T). Is T surjective?
C40† Show that the linear transformation T is not surjective by finding an element of the codomain, v, such that there is no vector u with T(u) = v.

\[ T: \mathbb{C}^3 \to \mathbb{C}^3,\quad T\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} 2a + 3b - c \\ 2b - 2c \\ a - b + 2c \end{bmatrix} \]
M60 Suppose U and V are vector spaces. Define the function Z: U → V by Z(u) = 0_V for every u ∈ U. Then by Exercise LT.M60, Z is a linear transformation. Formulate a condition on V that is equivalent to Z being a surjective linear transformation. In other words, fill in the blank to complete the following statement (and then give a proof): Z is surjective if and only if V is ________. (See Exercise ILT.M60, Exercise IVLT.M60.)
T15† Suppose that T: U → V and S: V → W are linear transformations. Prove the following relationship between ranges.

R(S ∘ T) ⊆ R(S)

T20† Suppose that A is an m × n matrix. Define the linear transformation T by

T: C^n → C^m,  T(x) = Ax

Prove that the range of T equals the column space of A, R(T) = C(A).
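The claim in Exercise T20 can be made concrete before proving it: since T(e_j) = Ae_j is the j-th column of A, the outputs of T are exactly the linear combinations of A's columns. A quick sanity check with a sample matrix of our own choosing (not from the text):

```python
# For T(x) = Ax, T(e_j) is the j-th column of A, so R(T) is spanned by the columns.
A = [[2, 0, 1],
     [1, 3, -1]]   # a sample 2x3 matrix

def T(x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

m, n = len(A), len(A[0])
basis_images = [T([1 if j == k else 0 for j in range(n)]) for k in range(n)]
columns = [[A[i][k] for i in range(m)] for k in range(n)]
print(basis_images == columns)  # True
```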
Section IVLT
Invertible Linear Transformations

In this section we will conclude our introduction to linear transformations by bringing together the twin properties of injectivity and surjectivity and consider linear transformations with both of these properties.
Subsection IVLT
Invertible Linear Transformations

One preliminary definition, and then we will have our main definition for this section.

Definition IDLT Identity Linear Transformation
The identity linear transformation on the vector space W is defined as

I_W: W → W,  I_W(w) = w  □

Informally, I_W is the "do-nothing" function. You should check that I_W is really a linear transformation, as claimed, and then compute its kernel and range to see that it is both injective and surjective. All of these facts should be straightforward to verify (Exercise IVLT.T05). With this in hand we can make our main definition.

Definition IVLT Invertible Linear Transformations
Suppose that T: U → V is a linear transformation. If there is a function S: V → U such that

S ∘ T = I_U        T ∘ S = I_V

then T is invertible. In this case, we call S the inverse of T and write S = T^{−1}. □
Informally, a linear transformation T is invertible if there is a companion linear transformation, S, which "undoes" the action of T. When the two linear transformations are applied consecutively (composition), in either order, the result is to have no real effect. It is entirely analogous to squaring a positive number and then taking its (positive) square root.
Here is an example of a linear transformation that is invertible. As usual at the beginning of a section, do not be concerned with where S came from, just understand how it illustrates Definition IVLT.
Example AIVLT An invertible linear transformation
Archetype V is the linear transformation

\[ T: P_3 \to M_{22},\quad T\left(a + bx + cx^2 + dx^3\right) = \begin{bmatrix} a + b & a - 2c \\ d & b - d \end{bmatrix} \]
Define the function S: M_22 → P_3 by

\[ S\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = (a - c - d) + (c + d)x + \frac{1}{2}(a - b - c - d)x^2 + cx^3 \]

Then

\begin{align*}
(T \circ S)\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right)
&= T\left(S\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right)\right) \\
&= T\left((a - c - d) + (c + d)x + \tfrac{1}{2}(a - b - c - d)x^2 + cx^3\right) \\
&= \begin{bmatrix} (a - c - d) + (c + d) & (a - c - d) - 2\left(\tfrac{1}{2}(a - b - c - d)\right) \\ c & (c + d) - c \end{bmatrix} \\
&= \begin{bmatrix} a & b \\ c & d \end{bmatrix} \\
&= I_{M_{22}}\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right)
\end{align*}

and

\begin{align*}
(S \circ T)\left(a + bx + cx^2 + dx^3\right)
&= S\left(T\left(a + bx + cx^2 + dx^3\right)\right) \\
&= S\left(\begin{bmatrix} a + b & a - 2c \\ d & b - d \end{bmatrix}\right) \\
&= ((a + b) - d - (b - d)) + (d + (b - d))x + \left(\tfrac{1}{2}((a + b) - (a - 2c) - d - (b - d))\right)x^2 + (d)x^3 \\
&= a + bx + cx^2 + dx^3 \\
&= I_{P_3}\left(a + bx + cx^2 + dx^3\right)
\end{align*}

For now, understand why these computations show that T is invertible, and that S = T^{−1}. Maybe even be amazed by how S works so perfectly in concert with T! We will see later just how to arrive at the correct form of S (when it is possible). △
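The verification in Example AIVLT can also be spot-checked numerically by coding T and S on coefficient tuples. A sketch; the particular test vectors below are arbitrary choices of ours.

```python
from fractions import Fraction

def T(p):
    # T from Archetype V: P3 -> M22, with p = (a, b, c, d) for a + bx + cx^2 + dx^3.
    a, b, c, d = p
    return ((a + b, a - 2 * c),
            (d, b - d))

def S(M):
    # The candidate inverse S: M22 -> P3 from Example AIVLT.
    (a, b), (c, d) = M
    return (a - c - d, c + d, Fraction(a - b - c - d, 2), c)

# Spot-check S o T = identity on P3 and T o S = identity on M22.
p = (3, -1, 4, 2)
M = ((5, 7), (-2, 6))
print(S(T(p)) == p)  # True
print(T(S(M)) == M)  # True
```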
It can be as instructive to study a linear transformation that is not invertible.
Example ANILT A non-invertible linear transformation
Consider the linear transformation T: C^3 → M_22 defined by

\[ T\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} a - b & 2a + 2b + c \\ 3a + b + c & -2a - 6b - 2c \end{bmatrix} \]
Suppose we were to search for an inverse function S: M_22 → C^3.
First verify that the 2 × 2 matrix

\[ A = \begin{bmatrix} 5 & 3 \\ 8 & 2 \end{bmatrix} \]

is not in the range of T. This will amount to finding an input \(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\) to T such that

a − b = 5
2a + 2b + c = 3
3a + b + c = 8
−2a − 6b − 2c = 2

As this system of equations is inconsistent, there is no input column vector, and A ∉ R(T). How should we define S(A)? Note that

T(S(A)) = (T ∘ S)(A) = I_{M22}(A) = A

So any definition we would provide for S(A) must then be a column vector that T sends to A and we would have A ∈ R(T), contrary to the definition of T. This is enough to see that there is no function S that will allow us to conclude that T is invertible, since we cannot provide a consistent definition for S(A) if we assume T is invertible.
Even though we now know that T is not invertible, let us not leave this example just yet. Check that

\[ T\left(\begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix}\right) = \begin{bmatrix} 3 & 2 \\ 5 & 2 \end{bmatrix} = B \qquad T\left(\begin{bmatrix} 0 \\ -3 \\ 8 \end{bmatrix}\right) = \begin{bmatrix} 3 & 2 \\ 5 & 2 \end{bmatrix} = B \]

How would we define S(B)?

\[ S(B) = S\left(T\left(\begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix}\right)\right) = (S \circ T)\left(\begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix}\right) = I_{\mathbb{C}^3}\left(\begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} \]

or

\[ S(B) = S\left(T\left(\begin{bmatrix} 0 \\ -3 \\ 8 \end{bmatrix}\right)\right) = (S \circ T)\left(\begin{bmatrix} 0 \\ -3 \\ 8 \end{bmatrix}\right) = I_{\mathbb{C}^3}\left(\begin{bmatrix} 0 \\ -3 \\ 8 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ -3 \\ 8 \end{bmatrix} \]

Which definition should we provide for S(B)? Both are necessary. But then S is not a function. So we have a second reason to know that there is no function S that will allow us to conclude that T is invertible. It happens that there are infinitely many column vectors that S would have to take to B. Construct the kernel of T,

\[ \mathcal{K}(T) = \left\langle\left\{ \begin{bmatrix} -1 \\ -1 \\ 4 \end{bmatrix} \right\}\right\rangle \]

Now choose either of the two inputs used above for T and add to it a scalar multiple of the basis vector for the kernel of T. For example,

\[ \mathbf{x} = \begin{bmatrix} 1 \\ -2 \\ 4 \end{bmatrix} + (-2)\begin{bmatrix} -1 \\ -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \\ -4 \end{bmatrix} \]

then verify that T(x) = B. Practice creating a few more inputs for T that would be sent to B, and see why it is hopeless to think that we could ever provide a reasonable definition for S(B)! There is a "whole subspace's worth" of values that S(B) would have to take on. △
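The "whole subspace's worth" of candidate values for S(B) can be seen computationally: every vector u + t·k, with k spanning K(T), is sent to B. A small sketch; writing T as a matrix and flattening the M_22 output to a 4-tuple are our conventions.

```python
# T from Example ANILT as a matrix acting on (a, b, c); the 2x2 output is
# flattened to (entry11, entry12, entry21, entry22).
A = [[1, -1, 0],
     [2, 2, 1],
     [3, 1, 1],
     [-2, -6, -2]]
B = [3, 2, 5, 2]      # the flattened matrix B
u = [1, -2, 4]        # one pre-image of B
k = [-1, -1, 4]       # basis vector of the kernel K(T)

def apply(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# Every vector u + t*k maps to B, so S(B) cannot be a single value.
results = [apply(A, [ui + t * ki for ui, ki in zip(u, k)]) for t in range(-3, 4)]
print(all(r == B for r in results))  # True
```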
In Example ANILT you may have noticed that T is not surjective, since the matrix A was not in the range of T. And T is not injective since there are two different input column vectors that T sends to the matrix B. Linear transformations T that are not surjective lead to putative inverse functions S that are undefined on inputs outside of the range of T. Linear transformations T that are not injective lead to putative inverse functions S that are multiply-defined on each of their inputs. We will formalize these ideas in Theorem ILTIS.
But first notice in Definition IVLT that we only require the inverse (when it exists) to be a function. When it does exist, it too is a linear transformation.
Theorem ILTLT Inverse of a Linear Transformation is a Linear Transformation
Suppose that T: U → V is an invertible linear transformation. Then the function T^{−1}: V → U is a linear transformation.

Proof. We work through verifying Definition LT for T^{−1}, using the fact that T is a linear transformation to obtain the second equality in each half of the proof. To this end, suppose x, y ∈ V and α ∈ C.

T^{−1}(x + y) = T^{−1}(T(T^{−1}(x)) + T(T^{−1}(y)))    Definition IVLT
             = T^{−1}(T(T^{−1}(x) + T^{−1}(y)))        Definition LT
             = T^{−1}(x) + T^{−1}(y)                   Definition IVLT

Now check the second defining property of a linear transformation for T^{−1},

T^{−1}(αx) = T^{−1}(αT(T^{−1}(x)))    Definition IVLT
           = T^{−1}(T(αT^{−1}(x)))    Definition LT
           = αT^{−1}(x)               Definition IVLT

So T^{−1} fulfills the requirements of Definition LT and is therefore a linear transformation. ■
So when T has an inverse, T^{−1} is also a linear transformation. Furthermore, T^{−1} is an invertible linear transformation and its inverse is what you might expect.
Theorem IILT Inverse of an Invertible Linear Transformation
Suppose that T: U → V is an invertible linear transformation. Then T^{−1} is an invertible linear transformation and (T^{−1})^{−1} = T.

Proof. Because T is invertible, Definition IVLT tells us there is a function T^{−1}: V → U such that

T^{−1} ∘ T = I_U        T ∘ T^{−1} = I_V

Additionally, Theorem ILTLT tells us that T^{−1} is more than just a function, it is a linear transformation. Now view these two statements as properties of the linear transformation T^{−1}. In light of Definition IVLT, they together say that T^{−1} is invertible (let T play the role of S in the statement of the definition). Furthermore, the inverse of T^{−1} is then T, i.e. (T^{−1})^{−1} = T. ■
Subsection IV
Invertibility

We now know what an inverse linear transformation is, but just which linear transformations have inverses? Here is a theorem we have been preparing for all chapter long.
Theorem ILTIS Invertible Linear Transformations are Injective and Surjective
Suppose T: U → V is a linear transformation. Then T is invertible if and only if T is injective and surjective.

Proof. (⇒) Since T is presumed invertible, we can employ its inverse, T^{−1} (Definition IVLT). To see that T is injective, suppose x, y ∈ U and assume that T(x) = T(y),

x = I_U(x)            Definition IDLT
  = (T^{−1} ∘ T)(x)   Definition IVLT
  = T^{−1}(T(x))      Definition LTC
  = T^{−1}(T(y))      Definition ILT
  = (T^{−1} ∘ T)(y)   Definition LTC
  = I_U(y)            Definition IVLT
  = y                 Definition IDLT

So by Definition ILT, T is injective.
To check that T is surjective, suppose v ∈ V. Then T^{−1}(v) is a vector in U. Compute

T(T^{−1}(v)) = (T ∘ T^{−1})(v)   Definition LTC
             = I_V(v)            Definition IVLT
             = v                 Definition IDLT

So there is an element from U, when used as an input to T (namely T^{−1}(v)), that produces the desired output, v, and hence T is surjective by Definition SLT.
(⇐) Now assume that T is both injective and surjective. We will build a function S: V → U that will establish that T is invertible. To this end, choose any v ∈ V. Since T is surjective, Theorem RSLT says R(T) = V, so we have v ∈ R(T). Theorem RPI says that the pre-image of v, T^{−1}(v), is nonempty. So we can choose a vector from the pre-image of v, say u. In other words, there exists u ∈ T^{−1}(v).
Since T^{−1}(v) is nonempty, Theorem KPI then says that

T^{−1}(v) = {u + z | z ∈ K(T)}

However, because T is injective, by Theorem KILT the kernel is trivial, K(T) = {0}. So the pre-image is a set with just one element, T^{−1}(v) = {u}. Now we can define S by S(v) = u. This is the key to this half of this proof. Normally the pre-image of a vector from the codomain might be an empty set, or an infinite set. But surjectivity requires that the pre-image not be empty, and then injectivity limits the pre-image to a singleton. Since our choice of v was arbitrary, we know that every pre-image for T is a set with a single element. This allows us to construct S as a function. Now that it is defined, verifying that it is the inverse of T will be easy. Here we go.
Choose u ∈ U. Define v = T(u). Then T^{−1}(v) = {u}, so that S(v) = u and,

(S ∘ T)(u) = S(T(u)) = S(v) = u = I_U(u)

and since our choice of u was arbitrary we have function equality, S ∘ T = I_U.
Now choose v ∈ V. Define u to be the single vector in the set T^{−1}(v), in other words, u = S(v). Then T(u) = v, so

(T ∘ S)(v) = T(S(v)) = T(u) = v = I_V(v)

and since our choice of v was arbitrary we have function equality, T ∘ S = I_V. ■
When a linear transformation is both injective and surjective, the pre-image of any element of the codomain is a set of size one (a "singleton"). This fact allowed us to construct the inverse linear transformation in one half of the proof of Theorem ILTIS (see Proof Technique C) and is illustrated in the following cartoon. This should remind you of the very general Diagram KPI which was used to illustrate Theorem KPI about pre-images, only now we have an invertible linear transformation which is therefore surjective and injective (Theorem ILTIS). As a surjective linear transformation, there are no vectors depicted in the codomain, V, that have empty pre-images. More importantly, as an injective linear transformation, the kernel is trivial (Theorem KILT), so each pre-image is a single vector. This makes it possible to "turn around" all the arrows to create the inverse linear transformation T^{−1}.
[Diagram IVLT: Invertible Linear Transformation — a cartoon of T: U → V in which every pre-image is a singleton, so the arrows can be reversed to give T^{−1}]
Many will call an injective and surjective function a bijective function or just a bijection. Theorem ILTIS tells us that this is just a synonym for the term invertible (which we will use exclusively).
We can follow the constructive approach of the proof of Theorem ILTIS to construct the inverse of a specific linear transformation, as the next example shows.
Example CIVLT Computing the Inverse of a Linear Transformation
Consider the linear transformation T: S_22 → P_2 defined by

\[ T\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = (a + b + c) + (-a + 2c)x + (2a + 3b + 6c)x^2 \]

T is invertible, which you are able to verify, perhaps by determining that the kernel of T is trivial and the range of T is all of P_2. This will be easier once we have Theorem RPNDD, which appears later in this section.
By Theorem ILTIS we know T^{−1} exists, and it will be critical shortly to realize that T^{−1} is automatically known to be a linear transformation as well (Theorem ILTLT). To determine the complete behavior of T^{−1}: P_2 → S_22 we can simply determine its action on a basis for the domain, P_2. This is the substance of Theorem LTDB, and an excellent example of its application. Choose any basis of P_2, the simpler the better, such as B = {1, x, x²}. Values of T^{−1} for these three basis
elements will be the single elements of their preimages. In turn, we have

T^{-1}(1): solve T\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = 1 + 0x + 0x^2,

\begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 0 & 2 & 0 \\ 2 & 3 & 6 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -6 \\ 0 & 1 & 0 & 10 \\ 0 & 0 & 1 & -3 \end{bmatrix}

(preimage) T^{-1}(1) = \left\{\begin{bmatrix} -6 & 10 \\ 10 & -3 \end{bmatrix}\right\}

(function) T^{-1}(1) = \begin{bmatrix} -6 & 10 \\ 10 & -3 \end{bmatrix}
T^{-1}(x): solve T\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = 0 + 1x + 0x^2,

\begin{bmatrix} 1 & 1 & 1 & 0 \\ -1 & 0 & 2 & 1 \\ 2 & 3 & 6 & 0 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -3 \\ 0 & 1 & 0 & 4 \\ 0 & 0 & 1 & -1 \end{bmatrix}

(preimage) T^{-1}(x) = \left\{\begin{bmatrix} -3 & 4 \\ 4 & -1 \end{bmatrix}\right\}

(function) T^{-1}(x) = \begin{bmatrix} -3 & 4 \\ 4 & -1 \end{bmatrix}

T^{-1}(x^2): solve T\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = 0 + 0x + 1x^2,

\begin{bmatrix} 1 & 1 & 1 & 0 \\ -1 & 0 & 2 & 0 \\ 2 & 3 & 6 & 1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -3 \\ 0 & 0 & 1 & 1 \end{bmatrix}

(preimage) T^{-1}(x^2) = \left\{\begin{bmatrix} 2 & -3 \\ -3 & 1 \end{bmatrix}\right\}

(function) T^{-1}(x^2) = \begin{bmatrix} 2 & -3 \\ -3 & 1 \end{bmatrix}
Theorem LTDB says, informally, "it is enough to know what a linear transformation does to a basis." Formally, we have the outputs of T^{-1} for a basis, so by Theorem LTDB there is a unique linear transformation with these outputs. So we put this information to work. The key step here is that we can convert any element of P_2 into a linear combination of the elements of the basis B (Theorem VRRB). We are after a "formula" for the value of T^{-1} on a generic element of P_2, say p + qx + rx^2.

T^{-1}(p + qx + rx^2) = T^{-1}(p(1) + q(x) + r(x^2))   Theorem VRRB
= pT^{-1}(1) + qT^{-1}(x) + rT^{-1}(x^2)   Theorem LTLC
= p\begin{bmatrix} -6 & 10 \\ 10 & -3 \end{bmatrix} + q\begin{bmatrix} -3 & 4 \\ 4 & -1 \end{bmatrix} + r\begin{bmatrix} 2 & -3 \\ -3 & 1 \end{bmatrix}
= \begin{bmatrix} -6p - 3q + 2r & 10p + 4q - 3r \\ 10p + 4q - 3r & -3p - q + r \end{bmatrix}
Notice how a linear combination in the domain of T^{-1} has been translated into a linear combination in the codomain of T^{-1}, since we know T^{-1} is a linear transformation by Theorem ILTLT.
Also, notice how the augmented matrices used to determine the three pre-images could be combined into one calculation of a matrix in extended echelon form, reminiscent of a procedure we know for computing the inverse of a matrix (see Example CMI). Hmmmm.
△
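That last observation can be made concrete with a short computation. The sketch below is not from the text, and it uses Python with NumPy where the book's own computational notes use Sage: inverting the 3×3 coefficient matrix of T in one step reproduces all three pre-images found above as the columns of the inverse.

```python
import numpy as np

# Coefficient matrix of T: it sends the entries (a, b, c) of the symmetric
# matrix [[a, b], [b, c]] to the coefficients of 1, x, x^2 in the output.
M = np.array([[ 1.0, 1.0, 1.0],
              [-1.0, 0.0, 2.0],
              [ 2.0, 3.0, 6.0]])

# Inverting M recovers (a, b, c) from a polynomial's coefficient vector,
# which is the action of T^{-1} expressed in coordinates.
M_inv = np.linalg.inv(M)

# Columns of M_inv give T^{-1}(1), T^{-1}(x), T^{-1}(x^2) as (a, b, c):
print(np.round(M_inv[:, 0]).astype(int))  # [-6 10 -3]
print(np.round(M_inv[:, 1]).astype(int))  # [-3  4 -1]
print(np.round(M_inv[:, 2]).astype(int))  # [ 2 -3  1]
```

This is exactly the "extended echelon form" shortcut hinted at: row-reducing [M | I_3] performs the three augmented-matrix computations simultaneously.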
We will make frequent use of the characterization of invertible linear transformations provided by Theorem ILTIS. The next theorem is a good example of this, and we will use it often, too.
Theorem CIVLT Composition of Invertible Linear Transformations
Suppose that T: U → V and S: V → W are invertible linear transformations. Then the composition, (S ∘ T): U → W, is an invertible linear transformation.
Proof. Since S and T are both linear transformations, S ∘ T is also a linear transformation by Theorem CLTLT. Since S and T are both invertible, Theorem ILTIS says that S and T are both injective and surjective. Then Theorem CILTI says S ∘ T is injective, and Theorem CSLTS says S ∘ T is surjective. Now apply the "other half" of Theorem ILTIS and conclude that S ∘ T is invertible.
■
When a composition is invertible, the inverse is easy to construct.
Theorem ICLT Inverse of a Composition of Linear Transformations
Suppose that T: U → V and S: V → W are invertible linear transformations. Then S ∘ T is invertible and (S ∘ T)^{-1} = T^{-1} ∘ S^{-1}.
Proof. Compute, for all w ∈ W,

((S ∘ T) ∘ (T^{-1} ∘ S^{-1}))(w) = S(T(T^{-1}(S^{-1}(w))))
= S(I_V(S^{-1}(w)))   Definition IVLT
= S(S^{-1}(w))   Definition IDLT
= w   Definition IVLT
= I_W(w)   Definition IDLT

So (S ∘ T) ∘ (T^{-1} ∘ S^{-1}) = I_W, and also

((T^{-1} ∘ S^{-1}) ∘ (S ∘ T))(u) = T^{-1}(S^{-1}(S(T(u))))
= T^{-1}(I_V(T(u)))   Definition IVLT
= T^{-1}(T(u))   Definition IDLT
= u   Definition IVLT
= I_U(u)   Definition IDLT

for all u ∈ U, so (T^{-1} ∘ S^{-1}) ∘ (S ∘ T) = I_U.
By Definition IVLT, S ∘ T is invertible and (S ∘ T)^{-1} = T^{-1} ∘ S^{-1}.
■
Notice that this theorem not only establishes what the inverse of S ∘ T is, it also duplicates the conclusion of Theorem CIVLT by establishing the invertibility of S ∘ T. But somehow, the proof of Theorem CIVLT is a nicer way to obtain this property. Does Theorem ICLT remind you of the flavor of any theorem we have seen about matrices? (Hint: Think about getting dressed.) Hmmmm.
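The hint can be checked numerically. The following sketch is an illustration that is not part of the text: two invertible matrices (arbitrary choices here) stand in for the transformations, and the "socks and shoes" rule (AB)^{-1} = B^{-1}A^{-1} mirrors (S ∘ T)^{-1} = T^{-1} ∘ S^{-1}.

```python
import numpy as np

# Matrices standing in for the linear transformations: A plays the role
# of S, B plays the role of T, so A @ B is the composition S o T.
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
B = np.array([[1.0, 3.0],
              [0.0, 1.0]])

lhs = np.linalg.inv(A @ B)                 # inverse of the composition
rhs = np.linalg.inv(B) @ np.linalg.inv(A)  # inverses composed in reverse order
print(np.allclose(lhs, rhs))               # True
```

Reversing the order matters: socks go on before shoes, but shoes come off first.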
Subsection SI
Structure and Isomorphism
A vector space is defined (Definition VS) as a set of objects ("vectors") endowed with a definition of vector addition (+) and a definition of scalar multiplication (written with juxtaposition). Many of our definitions about vector spaces involve linear combinations (Definition LC), such as the span of a set (Definition SS) and linear independence (Definition LI). Other definitions are built up from these ideas, such as bases (Definition B) and dimension (Definition D). The defining properties of a linear transformation require that a function "respect" the operations of the two vector spaces that are the domain and the codomain (Definition LT). Finally, an invertible linear transformation is one that can be "undone"; it has a companion that reverses its effect. In this subsection we are going to begin to roll all these ideas into one.
A vector space has "structure" derived from definitions of the two operations and the requirement that these operations interact in ways that satisfy the ten properties of Definition VS. When two different vector spaces have an invertible linear transformation defined between them, then we can translate questions about linear combinations (spans, linear independence, bases, dimension) from the first vector space to the second. The answers obtained in the second vector space can then be translated back, via the inverse linear transformation, and interpreted in the setting of the first vector space. We say that these invertible linear transformations "preserve structure." And we say that the two vector spaces are "structurally the same." The precise term is "isomorphic," from Greek meaning "of the same form." Let us begin to try to understand this important concept.
Definition IVS Isomorphic Vector Spaces
Two vector spaces U and V are isomorphic if there exists an invertible linear transformation T with domain U and codomain V, T: U → V. In this case, we write U ≅ V, and the linear transformation T is known as an isomorphism between U and V.
□
A few comments on this definition. First, be careful with your language (Proof Technique L). Two vector spaces are isomorphic, or they are not. It is a yes/no situation and the term only applies to a pair of vector spaces. Any invertible linear transformation can be called an isomorphism; it is a term that applies to functions. Second, given a pair of vector spaces, there might be several different isomorphisms between the two vector spaces. But it only takes the existence of one to call the pair isomorphic. Third, is U isomorphic to V, or is V isomorphic to U? It does not matter, since the inverse linear transformation will provide the needed isomorphism in the "opposite" direction. Being "isomorphic to" is an equivalence relation on the set of all vector spaces (see Theorem SER for a reminder about equivalence relations).
Example IVSAV Isomorphic vector spaces, Archetype V
Archetype V is a linear transformation from P_3 to M_{22},

T: P_3 → M_{22},  T(a + bx + cx^2 + dx^3) = \begin{bmatrix} a + b & a - 2c \\ d & b - d \end{bmatrix}

Since it is injective and surjective, Theorem ILTIS tells us that it is an invertible linear transformation. By Definition IVS we say P_3 and M_{22} are isomorphic.
At a basic level, the term "isomorphic" is nothing more than a codeword for the presence of an invertible linear transformation. However, it is also a description of a powerful idea, and this power only becomes apparent in the course of studying examples and related theorems. In this example, we are led to believe that there is nothing "structurally" different about P_3 and M_{22}. In a certain sense they are the same. Not equal, but the same. One is as good as the other. One is just as interesting as the other.
Here is an extremely basic application of this idea. Suppose we want to compute the following linear combination of polynomials in P_3,

5(2 + 3x - 4x^2 + 5x^3) + (-3)(3 - 5x + 3x^2 + x^3)

Rather than doing it straight away (which is very easy), we will apply the transformation T to convert into a linear combination of matrices, and then compute in M_{22} according to the definitions of the vector space operations there (Example VSM),

T(5(2 + 3x - 4x^2 + 5x^3) + (-3)(3 - 5x + 3x^2 + x^3))
= 5T(2 + 3x - 4x^2 + 5x^3) + (-3)T(3 - 5x + 3x^2 + x^3)   Theorem LTLC
= 5\begin{bmatrix} 5 & 10 \\ 5 & -2 \end{bmatrix} + (-3)\begin{bmatrix} -2 & -3 \\ 1 & -6 \end{bmatrix}   Definition of T
= \begin{bmatrix} 31 & 59 \\ 22 & 8 \end{bmatrix}   Operations in M_{22}

Now we will translate our answer back to P_3 by applying T^{-1}, which we demonstrated in Example AIVLT,

T^{-1}: M_{22} → P_3,  T^{-1}\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = (a - c - d) + (c + d)x + \frac{1}{2}(a - b - c - d)x^2 + cx^3

We compute,

T^{-1}\left(\begin{bmatrix} 31 & 59 \\ 22 & 8 \end{bmatrix}\right) = 1 + 30x - 29x^2 + 22x^3

which is, as expected, exactly what we would have computed for the original linear combination had we just used the definitions of the operations in P_3 (Example VSP). Notice this is meant only as an illustration and not a suggested route for doing this particular computation.
△
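The detour through M_{22} in Example IVSAV is easy to replay mechanically. This sketch is an illustration that is not part of the text, written in Python with NumPy: P_3 elements are stored as coefficient vectors (a, b, c, d), and T and T^{-1} follow the formulas above.

```python
import numpy as np

# P3 elements as coefficient vectors (a, b, c, d) of a + bx + cx^2 + dx^3;
# M22 elements as 2x2 arrays.  T is Archetype V, T_inv its inverse.
def T(p):
    a, b, c, d = p
    return np.array([[a + b, a - 2 * c],
                     [d, b - d]])

def T_inv(m):
    (a, b), (c, d) = m
    return np.array([a - c - d, c + d, (a - b - c - d) / 2, c])

p1 = np.array([2, 3, -4, 5])    # 2 + 3x - 4x^2 + 5x^3
p2 = np.array([3, -5, 3, 1])    # 3 - 5x + 3x^2 + x^3

direct = 5 * p1 + (-3) * p2                # compute directly in P3
detour = T_inv(5 * T(p1) + (-3) * T(p2))   # compute in M22, return via T_inv

print(direct)   # coefficients of 1 + 30x - 29x^2 + 22x^3
print(detour)   # the same coefficients, recovered through M22
```

Both routes land on 1 + 30x - 29x^2 + 22x^3, which is the point of "preserving structure": the isomorphism commutes with linear combinations.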
In Example IVSAV we avoided a computation in P_3 by converting the computation to a new vector space, M_{22}, via an invertible linear transformation (also known as an isomorphism). Here is a diagram meant to illustrate the more general situation of two vector spaces, U and V, and an invertible linear transformation, T. The diagram is simply about a sum of two vectors from U, rather than a more involved linear combination. It should remind you of Diagram DLTA.
[Figure: a square of maps. Across the top, T carries u_1, u_2 to T(u_1), T(u_2); down the left side, addition in U produces u_1 + u_2; down the right side, addition in V produces T(u_1 + u_2) = T(u_1) + T(u_2); across the bottom, T^{-1} carries this sum back to u_1 + u_2.]
Diagram AIVS: Addition in Isomorphic Vector Spaces
To understand this diagram, begin in the upper-left corner; by going straight down we can compute the sum of the two vectors using the addition of the vector space U. The more circuitous alternative, in the spirit of Example IVSAV, is to begin in the upper-left corner and then proceed clockwise around the other three sides of the rectangle. Notice that the vector addition is accomplished using the addition in the vector space V. Then, because T is a linear transformation, we can say that the result of T(u_1) + T(u_2) is equal to T(u_1 + u_2). The key feature is to recognize that applying T^{-1} converts this second version of the result into the sum in the lower-left corner. So there are two routes to the sum u_1 + u_2, each employing an addition from a different vector space, but one is "direct" and the other is "roundabout". You might try designing a similar diagram for the case of scalar multiplication (see Diagram DLTM) or for a full linear combination.
Checking the dimensions of two vector spaces can be a quick way to establish that they are not isomorphic. Here is the theorem.
Theorem IVSED Isomorphic Vector Spaces have Equal Dimension
Suppose U and V are isomorphic vector spaces. Then dim(U) = dim(V).
Proof. If U and V are isomorphic, there is an invertible linear transformation T: U → V (Definition IVS). T is injective by Theorem ILTIS and so by Theorem ILTD, dim(U) ≤ dim(V). Similarly, T is surjective by Theorem ILTIS and so by Theorem SLTD, dim(U) ≥ dim(V). The net effect of these two inequalities is that dim(U) = dim(V).
■
The contrapositive of Theorem IVSED says that if U and V have different dimensions, then they are not isomorphic. Dimension is the simplest "structural" characteristic that will allow you to distinguish non-isomorphic vector spaces. For example, P_6 is not isomorphic to M_{34} since their dimensions (7 and 12, respectively) are not equal. With tools developed in Section VR we will be able to establish that the converse of Theorem IVSED is true. Think about that one for a moment.
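The dimension comparison above is just arithmetic; as a trivial check (an illustration, not from the text):

```python
# dim(P_6) counts the basis {1, x, ..., x^6}; dim(M_34) counts the twelve
# standard basis matrices.  Unequal dimensions rule out an isomorphism
# by the contrapositive of Theorem IVSED.
dim_P6 = 6 + 1
dim_M34 = 3 * 4
print(dim_P6, dim_M34, dim_P6 == dim_M34)  # 7 12 False
```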
Subsection RNLT
Rank and Nullity of a Linear Transformation
Just as a matrix has a rank and a nullity, so too do linear transformations. And just like the rank and nullity of a matrix are related (they sum to the number of columns, Theorem RPNC), the rank and nullity of a linear transformation are related. Here are the definitions and theorems; see the Archetypes (Archetypes) for loads of examples.
Definition ROLT Rank Of a Linear Transformation
Suppose that T: U → V is a linear transformation. Then the rank of T, r(T), is the dimension of the range of T,

r(T) = dim(R(T))

□
Definition NOLT Nullity Of a Linear Transformation
Suppose that T: U → V is a linear transformation. Then the nullity of T, n(T), is the dimension of the kernel of T,

n(T) = dim(K(T))

□
Here are two quick theorems.
Theorem ROSLT Rank Of a Surjective Linear Transformation
Suppose that T: U → V is a linear transformation. Then the rank of T is the dimension of V, r(T) = dim(V), if and only if T is surjective.
Proof. By Theorem RSLT, T is surjective if and only if R(T) = V. Applying Definition ROLT, R(T) = V if and only if r(T) = dim(R(T)) = dim(V). ■
Theorem NOILT Nullity Of an Injective Linear Transformation
Suppose that T: U → V is a linear transformation. Then the nullity of T is zero, n(T) = 0, if and only if T is injective.
Proof. By Theorem KILT, T is injective if and only if K(T) = {0}. Applying Definition NOLT, K(T) = {0} if and only if n(T) = 0.
■
Just as injectivity and surjectivity come together in invertible linear transformations, there is a clear relationship between the rank and nullity of a linear transformation. If one is big, the other is small.
Theorem RPNDD Rank Plus Nullity is Domain Dimension
Suppose that T: U → V is a linear transformation. Then

r(T) + n(T) = dim(U)
Proof. Let r = r(T) and s = n(T). Suppose that R = {v_1, v_2, v_3, ..., v_r} ⊆ V is a basis of the range of T, R(T), and S = {u_1, u_2, u_3, ..., u_s} ⊆ U is a basis of the kernel of T, K(T). Note that R and S are possibly empty, which means that some of the sums in this proof are "empty" and are equal to the zero vector.
Because the elements of R are all in the range of T, each must have a nonempty pre-image by Theorem RPI. Choose vectors w_i ∈ U, 1 ≤ i ≤ r, such that w_i ∈ T^{-1}(v_i). So T(w_i) = v_i, 1 ≤ i ≤ r. Consider the set

B = {u_1, u_2, u_3, ..., u_s, w_1, w_2, w_3, ..., w_r}

We claim that B is a basis for U.
To establish linear independence for B, begin with a relation of linear dependence on B. So suppose there are scalars a_1, a_2, a_3, ..., a_s and b_1, b_2, b_3, ..., b_r such that

0 = a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_su_s + b_1w_1 + b_2w_2 + b_3w_3 + ··· + b_rw_r

Then

0 = T(0)   Theorem LTTZZ
= T(a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_su_s + b_1w_1 + b_2w_2 + b_3w_3 + ··· + b_rw_r)   Definition LI
= a_1T(u_1) + a_2T(u_2) + a_3T(u_3) + ··· + a_sT(u_s) + b_1T(w_1) + b_2T(w_2) + b_3T(w_3) + ··· + b_rT(w_r)   Theorem LTLC
= a_10 + a_20 + a_30 + ··· + a_s0 + b_1T(w_1) + b_2T(w_2) + b_3T(w_3) + ··· + b_rT(w_r)   Definition KLT
= 0 + 0 + 0 + ··· + 0 + b_1T(w_1) + b_2T(w_2) + b_3T(w_3) + ··· + b_rT(w_r)   Theorem ZVSM
= b_1T(w_1) + b_2T(w_2) + b_3T(w_3) + ··· + b_rT(w_r)   Property Z
= b_1v_1 + b_2v_2 + b_3v_3 + ··· + b_rv_r   Definition PI

This is a relation of linear dependence on R (Definition RLD), and since R is a linearly independent set (Definition LI), we see that b_1 = b_2 = b_3 = ··· = b_r = 0. Then the original relation of linear dependence on B becomes

0 = a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_su_s + 0w_1 + 0w_2 + ··· + 0w_r
= a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_su_s + 0 + 0 + ··· + 0   Theorem ZSSM
= a_1u_1 + a_2u_2 + a_3u_3 + ··· + a_su_s   Property Z

But this is again a relation of linear dependence (Definition RLD), now on the set S. Since S is linearly independent (Definition LI), we have a_1 = a_2 = a_3 = ··· = a_s = 0. Since we now know that all the scalars in the relation of linear dependence on B must be zero, we have established the linear independence of B through Definition LI.
To now establish that B spans U, choose an arbitrary vector u ∈ U. Then T(u) ∈ R(T), so there are scalars c_1, c_2, c_3, ..., c_r such that

T(u) = c_1v_1 + c_2v_2 + c_3v_3 + ··· + c_rv_r

Use the scalars c_1, c_2, c_3, ..., c_r to define a vector y ∈ U,

y = c_1w_1 + c_2w_2 + c_3w_3 + ··· + c_rw_r

Then

T(u − y) = T(u) − T(y)   Theorem LTLC
= T(u) − T(c_1w_1 + c_2w_2 + c_3w_3 + ··· + c_rw_r)   Substitution
= T(u) − (c_1T(w_1) + c_2T(w_2) + ··· + c_rT(w_r))   Theorem LTLC
= T(u) − (c_1v_1 + c_2v_2 + c_3v_3 + ··· + c_rv_r)   w_i ∈ T^{-1}(v_i)
= T(u) − T(u)   Substitution
= 0   Property AI

So the vector u − y is sent to the zero vector by T and hence is an element of the kernel of T. As such it can be written as a linear combination of the basis vectors for K(T), the elements of the set S. So there are scalars d_1, d_2, d_3, ..., d_s such that

u − y = d_1u_1 + d_2u_2 + d_3u_3 + ··· + d_su_s

Then

u = (u − y) + y
= d_1u_1 + d_2u_2 + d_3u_3 + ··· + d_su_s + c_1w_1 + c_2w_2 + c_3w_3 + ··· + c_rw_r

This says that for any vector, u, from U, there exist scalars (d_1, d_2, d_3, ..., d_s, c_1, c_2, c_3, ..., c_r) that form u as a linear combination of the vectors in the set B. In other words, B spans U (Definition SS).
So B is a basis (Definition B) of U with s + r vectors, and thus

dim(U) = s + r = n(T) + r(T)

as desired.
■
Theorem RPNC said that the rank and nullity of a matrix sum to the number of columns of the matrix. This result is now an easy consequence of Theorem RPNDD when we consider the linear transformation T: C^n → C^m defined with the m × n matrix A by T(x) = Ax. The range and kernel of T are identical to the column space and null space of the matrix A (Exercise ILT.T20, Exercise SLT.T20), so the rank and nullity of the matrix A are identical to the rank and nullity of the linear transformation T. The dimension of the domain of T is the dimension of C^n, exactly the number of columns of the matrix A.
This theorem can be especially useful in determining basic properties of linear transformations. For example, suppose that T: C^6 → C^6 is a linear transformation and you are able to quickly establish that the kernel is trivial. Then n(T) = 0. First, this means that T is injective by Theorem NOILT. Also, Theorem RPNDD becomes

6 = dim(C^6) = r(T) + n(T) = r(T) + 0 = r(T)

So the rank of T is equal to the dimension of the codomain, and by Theorem ROSLT we know T is surjective. Finally, we know T is invertible by Theorem ILTIS. So from the determination that the kernel is trivial, and consideration of various dimensions, the theorems of this section allow us to conclude the existence of an inverse linear transformation for T. Similarly, Theorem RPNDD can be used to provide alternative proofs for Theorem ILTD, Theorem SLTD and Theorem IVSED. It would be an interesting exercise to construct these proofs.
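The C^6 → C^6 reasoning above can be rehearsed on a concrete matrix. This sketch is not from the text, and it uses Python with SymPy (the book's own computational notes use Sage) on an illustrative matrix chosen to have a trivial null space.

```python
from sympy import Matrix, eye

# An upper-triangular matrix with ones on the diagonal: its null space is
# trivial, so the transformation T(x) = Ax on C^6 has nullity zero.
A = eye(6) + Matrix(6, 6, lambda i, j: 1 if j == i + 1 else 0)

nullity = len(A.nullspace())     # n(T) = 0: trivial kernel, T injective
rank = A.rank()                  # Theorem RPNDD forces r(T) = 6 - 0 = 6
print(nullity, rank)             # 0 6
print(A.det())                   # 1 -- nonzero, confirming invertibility
```

Nullity zero alone, plus the dimension count, delivers injectivity, surjectivity, and hence invertibility, just as the argument in the text concludes.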
It would be instructive to study the archetypes that are linear transformations and see how many of their properties can be deduced just from considering only the dimensions of the domain and codomain. Then add in just knowledge of either the nullity or rank, and see how much more you can learn about the linear transformation. The table preceding all of the archetypes (Archetypes) could be a good place to start this analysis.
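One concrete way to start is to trace the basis construction from the proof of Theorem RPNDD on a small matrix transformation. The sketch below is not from the text; it uses Python with SymPy, and the matrix A is an arbitrary illustrative choice: a basis of the kernel plus one pre-image for each range basis vector assembles into a basis of the domain.

```python
from sympy import Matrix

# T(x) = Ax with an illustrative rank-1 matrix: r(T) = 1, n(T) = 2,
# and the domain C^3 has dimension 1 + 2 (Theorem RPNDD).
A = Matrix([[1, 2, 3],
            [2, 4, 6]])

S = A.nullspace()        # basis of the kernel K(T): s = 2 vectors
R = A.columnspace()      # basis of the range R(T): r = 1 vector

# For each range basis vector v_i, pick one pre-image w_i with T(w_i) = v_i.
W = []
for v in R:
    sol, params = A.gauss_jordan_solve(v)
    W.append(sol.subs({t: 0 for t in params}))  # one particular solution

B = S + W                # the set B from the proof
print(len(B), Matrix.hstack(*B).rank())  # 3 3 -- B is a basis of C^3
```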
Subsection SLELT
Systems of Linear Equations and Linear Transformations
This subsection does not really belong in this section, or any other section, for that matter. It is just the right time to have a discussion about the connections between the central topic of linear algebra, linear transformations, and our motivating topic from Chapter SLE, systems of linear equations. We will discuss several theorems we have seen already, but we will also make some forward-looking statements that will be justified in Chapter R.
Archetype D and Archetype E are ideal examples to illustrate connections with linear transformations. Both have the same coefficient matrix,

D = \begin{bmatrix} 2 & 1 & 7 & -7 \\ -3 & 4 & -5 & -6 \\ 1 & 1 & 4 & -5 \end{bmatrix}
To apply the theory of linear transformations to these two archetypes, employ the matrix-vector product (Definition MVP) and define the linear transformation,

T: C^4 → C^3,  T(x) = Dx = x_1\begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix} + x_2\begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix} + x_3\begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix} + x_4\begin{bmatrix} -7 \\ -6 \\ -5 \end{bmatrix}

Theorem MBLT tells us that T is indeed a linear transformation. Archetype D asks for solutions to LS(D, b), where b = \begin{bmatrix} 8 \\ -12 \\ 4 \end{bmatrix}. In the language of linear transformations this is equivalent to asking for T^{-1}(b). In the language of vectors and matrices it asks for a linear combination of the four columns of D that will equal b. One solution listed is

w = \begin{bmatrix} 7 \\ 8 \\ 1 \\ 3 \end{bmatrix}

With a nonempty preimage, Theorem KPI tells us that the complete solution set of the linear system is the preimage of b,

w + K(T) = {w + z | z ∈ K(T)}

The kernel of the linear transformation T is exactly the null space of the matrix D (see Exercise ILT.T20), so this approach to the solution set should be reminiscent of Theorem PSPHS. The kernel of the linear transformation is the preimage of the zero vector, exactly equal to the solution set of the homogeneous system LS(D, 0). Since D has a null space of dimension two, every preimage (and in particular the preimage of b) is as "big" as a subspace of dimension two (but is not a subspace).
Archetype E is identical to Archetype D but with a different vector of constants, d = \begin{bmatrix} 2 \\ 3 \\ 2 \end{bmatrix}. We can use the same linear transformation T to discuss this system of equations since the coefficient matrix is identical. Now the set of solutions to LS(D, d) is the pre-image of d, T^{-1}(d). However, the vector d is not in the range of the linear transformation (nor is it in the column space of the matrix, since these two sets are equal by Exercise SLT.T20). So the empty pre-image is equivalent to the inconsistency of the linear system.
These two archetypes each have three equations <strong>in</strong> four variables, so either the<br />
result<strong>in</strong>g l<strong>in</strong>ear systems are <strong>in</strong>consistent, or they are consistent and application of<br />
Theorem CMVEI tells us that the system has <strong>in</strong>f<strong>in</strong>itely many solutions. Consider<strong>in</strong>g<br />
these same parameters for the l<strong>in</strong>ear transformation, the dimension of the doma<strong>in</strong>,<br />
C 4 , is four, while the codoma<strong>in</strong>, C 3 , has dimension three. Then<br />
n(T) = dim(C^4) − r(T)        Theorem RPNDD
     = 4 − dim(R(T))          Definition ROLT
     ≥ 4 − 3                  R(T) a subspace of C^3
     = 1
So the kernel of T is nontrivial simply by considering the dimensions of the domain (number of variables) and the codomain (number of equations). Preimages of elements of the codomain that are not in the range of T are empty (inconsistent systems). For elements of the codomain that are in the range of T (consistent systems), Theorem KPI tells us that the preimages are built from the kernel, and with a nontrivial kernel, these preimages are infinite (infinitely many solutions).
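This dimension count is easy to reproduce numerically. The sketch below is plain NumPy, with a hypothetical 3 × 4 matrix standing in for the coefficient matrix (Archetype D's entries are not reproduced here); it computes r(T) as a matrix rank and n(T) from Theorem RPNDD.

```python
import numpy as np

# Hypothetical 3 x 4 coefficient matrix (a stand-in, not Archetype D itself);
# any 3 x 4 matrix must have nullity at least 4 - 3 = 1.
D = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 3.0, 1.0, 2.0]])  # third row = first row + second row

rank = np.linalg.matrix_rank(D)      # r(T), the dimension of the range
nullity = D.shape[1] - rank          # n(T) = dim(domain) - r(T), Theorem RPNDD

print(rank, nullity)                 # here: rank 2, nullity 2
```

A nonzero nullity confirms a nontrivial kernel, so any consistent system with this coefficient matrix has infinitely many solutions.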
When do systems of equations have unique solutions? Consider the system of linear equations LS(C, f) and the linear transformation S(x) = Cx. If S has a trivial kernel, then preimages will either be empty or be finite sets with single elements. Correspondingly, the coefficient matrix C will have a trivial null space and solution sets will either be empty (inconsistent) or contain a single solution (unique solution). Should the matrix be square and have a trivial null space, then we recognize the matrix as being nonsingular. A square matrix means that the corresponding linear transformation, S, has equal-sized domain and codomain. With a nullity of zero, S is injective, and also Theorem RPNDD tells us that the rank of S is equal to the dimension of the domain, which in turn is equal to the dimension of the codomain. In other words, S is surjective. Injective and surjective, and Theorem ILTIS tells us that S is invertible. Just as we can use the inverse of the coefficient matrix to find the unique solution of any linear system with a nonsingular coefficient matrix (Theorem SNCM), we can use the inverse of the linear transformation to construct the unique element of any preimage (proof of Theorem ILTIS).
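As a small illustration of this last point, here is a NumPy sketch with a made-up nonsingular 2 × 2 matrix C and vector f (illustrative values, not taken from the text) that constructs the unique solution via the inverse:

```python
import numpy as np

# Made-up nonsingular coefficient matrix C and constants f (illustrative only).
C = np.array([[2.0, 1.0],
              [1.0, 1.0]])
f = np.array([3.0, 2.0])

# With C nonsingular, the lone element of the preimage S^{-1}({f}) is C^{-1} f
# (Theorem SNCM); np.linalg.solve(C, f) is the numerically preferred route.
x = np.linalg.inv(C) @ f
print(x)   # the unique solution, here [1. 1.]
```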
The executive summary of this discussion is that to every coefficient matrix of a system of linear equations we can associate a natural linear transformation. Solution sets for systems with this coefficient matrix are preimages of elements of the codomain of the linear transformation. For every theorem about systems of linear equations there is an analogue about linear transformations. The theory of linear
transformations provides all the tools to recreate the theory of solutions to linear systems of equations.

We will continue this adventure in Chapter R.
Reading Questions

1. What conditions allow us to easily determine if a linear transformation is invertible?

2. What does it mean to say two vector spaces are isomorphic? Both technically, and informally?

3. How do linear transformations relate to systems of linear equations?
Exercises

C10 The archetypes below are linear transformations of the form T: U → V that are invertible. For each, the inverse linear transformation is given explicitly as part of the archetype's description. Verify for each linear transformation that

T^{-1} ◦ T = I_U        T ◦ T^{-1} = I_V

Archetype R, Archetype V, Archetype W
C20† Determine if the linear transformation T: P_2 → M_22 is (a) injective, (b) surjective, (c) invertible.

T(a + bx + cx^2) = [ a + 2b − 2c     2a + 2b      ]
                   [ −a + b − 4c     3a + 2b + 2c ]
C21† Determine if the linear transformation S: P_3 → M_22 is (a) injective, (b) surjective, (c) invertible.

S(a + bx + cx^2 + dx^3) = [ −a + 4b + c + 2d     4a − b + 6c − d ]
                          [ a + 5b − 2c + 2d     a + 2c + 5d     ]
C25 For each linear transformation below: (a) Find the matrix representation of T, (b) Calculate n(T), (c) Calculate r(T), (d) Graph the image in either R^2 or R^3 as appropriate, (e) How many dimensions are lost?, and (f) How many dimensions are preserved?
1. T: C^3 → C^3 given by T([x, y, z]^t) = [x, x, x]^t

2. T: C^3 → C^3 given by T([x, y, z]^t) = [x, y, 0]^t

3. T: C^3 → C^2 given by T([x, y, z]^t) = [x, x]^t

4. T: C^3 → C^2 given by T([x, y, z]^t) = [x, y]^t

5. T: C^2 → C^3 given by T([x, y]^t) = [x, y, 0]^t

6. T: C^2 → C^3 given by T([x, y]^t) = [x, y, x + y]^t
C50† Consider the linear transformation S: M_12 → P_1 from the set of 1 × 2 matrices to the set of polynomials of degree at most 1, defined by

S([a b]) = (3a + b) + (5a + 2b)x

Prove that S is invertible. Then show that the linear transformation

R: P_1 → M_12,   R(r + sx) = [(2r − s)  (−5r + 3s)]

is the inverse of S, that is, S^{-1} = R.
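Since S and R in Exercise C50 are given by 2 × 2 coordinate matrices, the claim S^{-1} = R can be spot-checked numerically. The sketch below encodes each map by its matrix relative to the obvious bases of M_12 and P_1 and verifies that both products are the identity:

```python
import numpy as np

# S([a b]) = (3a + b) + (5a + 2b)x acts on coordinates (a, b) by the matrix S;
# R(r + sx) = [(2r - s) (-5r + 3s)] acts on (r, s) by the matrix R.
S = np.array([[3, 1],
              [5, 2]])
R = np.array([[ 2, -1],
              [-5,  3]])

# R = S^{-1} exactly when both compositions give the identity transformation.
print(R @ S)   # [[1 0] [0 1]]
print(S @ R)   # [[1 0] [0 1]]
```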
M30† The linear transformation S below is invertible. Find a formula for the inverse linear transformation, S^{-1}.

S: P_1 → M_12,   S(a + bx) = [3a + b   2a + b]
M31† The linear transformation R: M_12 → M_21 is invertible. Determine a formula for the inverse linear transformation R^{-1}: M_21 → M_12.

R([a b]) = [ a + 3b   ]
           [ 4a + 11b ]
M50 Rework Example CIVLT, only in place of the basis B for P_2, choose instead to use the basis C = {1, 1 + x, 1 + x + x^2}. This will complicate writing a generic element of the domain of T^{-1} as a linear combination of the basis elements, and the algebra will be a bit messier, but in the end you should obtain the same formula for T^{-1}. The inverse linear transformation is what it is, and the choice of a particular basis should not influence the outcome.
M60 Suppose U and V are vector spaces. Define the function Z: U → V by Z(u) = 0_V for every u ∈ U. Then by Exercise LT.M60, Z is a linear transformation. Formulate a condition on U and V that is equivalent to Z being an invertible linear transformation. In other words, fill in the blank to complete the following statement (and then give a proof): Z is invertible if and only if U and V are ______. (See Exercise ILT.M60, Exercise SLT.M60, Exercise MR.M60.)
T05 Prove that the identity linear transformation (Definition IDLT) is both injective and surjective, and hence invertible.

T15† Suppose that T: U → V is a surjective linear transformation and dim(U) = dim(V). Prove that T is injective.
T16 Suppose that T: U → V is an injective linear transformation and dim(U) = dim(V). Prove that T is surjective.

T30† Suppose that U and V are isomorphic vector spaces, not of dimension zero. Prove that there are infinitely many isomorphisms between U and V.

T40† Suppose T: U → V and S: V → W are linear transformations and dim(U) = dim(V) = dim(W). Suppose that S ◦ T is invertible. Prove that S and T are individually invertible (this could be construed as a converse of Theorem CIVLT).
Chapter R
Representations

Previous work with linear transformations may have convinced you that we can convert most questions about linear transformations into questions about systems of equations or properties of subspaces of C^m. In this section we begin to make these vague notions precise. We have used the word "representation" prior, but it will get a heavy workout in this chapter. In many ways, everything we have studied so far was in preparation for this chapter.
Section VR
Vector Representations

You may have noticed that many questions about elements of abstract vector spaces eventually become questions about column vectors or systems of equations. Example SM32 would be an example of this. We will make this vague idea more precise in this section.

Subsection VR
Vector Representation

We begin by establishing an invertible linear transformation between any vector space V of dimension n and C^n. This will allow us to "go back and forth" between the two vector spaces, no matter how abstract the definition of V might be.
Definition VR Vector Representation
Suppose that V is a vector space with a basis B = {v_1, v_2, v_3, ..., v_n}. Define a function ρ_B: V → C^n as follows. For w ∈ V define the column vector ρ_B(w) ∈ C^n by

w = [ρ_B(w)]_1 v_1 + [ρ_B(w)]_2 v_2 + [ρ_B(w)]_3 v_3 + ··· + [ρ_B(w)]_n v_n
□<br />
This definition looks more complicated than it really is, though the form above will be useful in proofs. Simply stated, given w ∈ V, we write w as a linear combination of the basis elements of B. It is key to realize that Theorem VRRB guarantees that we can do this for every w, and furthermore this expression as a linear combination is unique. The resulting scalars are just the entries of the vector ρ_B(w). This discussion should convince you that ρ_B is "well-defined" as a function. We can determine a precise output for any input. Now we want to establish that ρ_B is a function with additional properties: it is a linear transformation.
Theorem VRLT Vector Representation is a Linear Transformation
The function ρ_B (Definition VR) is a linear transformation.

Proof. We will take a novel approach in this proof. We will construct another function, which we will easily determine is a linear transformation, and then show that this second function is really ρ_B in disguise. Here we go.

Since B is a basis, we can define T: V → C^n to be the unique linear transformation such that T(v_i) = e_i, 1 ≤ i ≤ n, as guaranteed by Theorem LTDB, and where the e_i are the standard unit vectors (Definition SUV). Then suppose for an arbitrary w ∈ V we have,
[T(w)]_i = [ T( Σ_{j=1}^{n} [ρ_B(w)]_j v_j ) ]_i                        Definition VR
         = [ Σ_{j=1}^{n} [ρ_B(w)]_j T(v_j) ]_i                          Theorem LTLC
         = [ Σ_{j=1}^{n} [ρ_B(w)]_j e_j ]_i                             T(v_j) = e_j
         = Σ_{j=1}^{n} [ [ρ_B(w)]_j e_j ]_i                             Definition CVA
         = Σ_{j=1}^{n} [ρ_B(w)]_j [e_j]_i                               Definition CVSM
         = [ρ_B(w)]_i [e_i]_i + Σ_{j=1, j≠i}^{n} [ρ_B(w)]_j [e_j]_i     Property CC
         = [ρ_B(w)]_i (1) + Σ_{j=1, j≠i}^{n} [ρ_B(w)]_j (0)             Definition SUV
         = [ρ_B(w)]_i
As column vectors, Definition CVE implies that T(w) = ρ_B(w). Since w was an arbitrary element of V, as functions T = ρ_B. Now, since T is known to be a linear transformation, it must follow that ρ_B is also a linear transformation. ∎

The proof of Theorem VRLT provides an alternate definition of vector representation relative to a basis B that we could state as a corollary (Proof Technique LC): ρ_B is the unique linear transformation that takes B to the standard unit basis.
Example VRC4 Vector representation in C^4
Consider the vector y ∈ C^4,

y = [6, 14, 6, 7]^t
We will find several vector representations of y in this example. Notice that y never changes, but the representations of y do change. One basis for C^4 is

B = {u_1, u_2, u_3, u_4} = { [−2, 1, 2, −3]^t, [3, −6, 2, −4]^t, [1, 2, 0, 5]^t, [4, 3, 1, 6]^t }
as can be seen by making these vectors the columns of a matrix, checking that the matrix is nonsingular and applying Theorem CNMB. To find ρ_B(y), we need to find scalars, a_1, a_2, a_3, a_4 such that

y = a_1 u_1 + a_2 u_2 + a_3 u_3 + a_4 u_4

By Theorem SLSLC the desired scalars are a solution to the linear system of equations with a coefficient matrix whose columns are the vectors in B and with a vector of constants y. With a nonsingular coefficient matrix, the solution is unique, but this is no surprise as this is the content of Theorem VRRB. This unique solution is

a_1 = 2    a_2 = −1    a_3 = −3    a_4 = 4

Then by Definition VR, we have
ρ_B(y) = [2, −1, −3, 4]^t
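This computation is exactly a call to a linear solver: place the vectors of B as matrix columns and solve against y. A NumPy sketch:

```python
import numpy as np

# Columns are the basis vectors u_1, ..., u_4 of B from Example VRC4.
B = np.array([[-2,  3, 1, 4],
              [ 1, -6, 2, 3],
              [ 2,  2, 0, 1],
              [-3, -4, 5, 6]], dtype=float)
y = np.array([6, 14, 6, 7], dtype=float)

# Theorem SLSLC: the coordinates of y relative to B solve this linear system.
rho_B_y = np.linalg.solve(B, y)
print(rho_B_y)   # [ 2. -1. -3.  4.]
```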
Suppose now that we construct a representation of y relative to another basis of C^4,

C = { [−15, 9, −4, −2]^t, [16, −14, 5, 2]^t, [−26, 14, −6, −3]^t, [14, −13, 4, 6]^t }
As with B, it is easy to check that C is a basis. Writing y as a linear combination of the vectors in C leads to solving a system of four equations in the four unknown scalars with a nonsingular coefficient matrix. The unique solution can be expressed as

y = [6, 14, 6, 7]^t = (−28)[−15, 9, −4, −2]^t + (−8)[16, −14, 5, 2]^t + 11[−26, 14, −6, −3]^t + 0[14, −13, 4, 6]^t

so that Definition VR gives

ρ_C(y) = [−28, −8, 11, 0]^t
We often perform representations relative to standard bases, but for vectors in C^m this is a little silly. Let us find the vector representation of y relative to the standard basis (Theorem SUVB),

D = {e_1, e_2, e_3, e_4}
Then, without any computation, we can check that

y = [6, 14, 6, 7]^t = 6e_1 + 14e_2 + 6e_3 + 7e_4

so by Definition VR,

ρ_D(y) = [6, 14, 6, 7]^t

which is not very exciting. Notice however that the order in which we place the vectors in the basis is critical to the representation. Let us keep the standard unit vectors as our basis, but rearrange the order we place them in the basis. So a fourth basis is

E = {e_3, e_4, e_2, e_1}
Then,

y = [6, 14, 6, 7]^t = 6e_3 + 7e_4 + 14e_2 + 6e_1

so by Definition VR,

ρ_E(y) = [6, 7, 14, 6]^t

So for every possible basis of C^4 we could construct a different representation of y. △

Vector representations are most interesting for vector spaces that are not C^m.

Example VRP2 Vector representations in P_2
Consider the vector u = 15 + 10x − 6x^2 ∈ P_2 from the vector space of polynomials with degree at most 2 (Example VSP). A nice basis for P_2 is

B = {1, x, x^2}

so that

u = 15 + 10x − 6x^2 = 15(1) + 10(x) + (−6)(x^2)

so by Definition VR

ρ_B(u) = [15, 10, −6]^t
Another nice basis for P_2 is

C = {1, 1 + x, 1 + x + x^2}

so that now it takes a bit of computation to determine the scalars for the representation. We want a_1, a_2, a_3 so that

15 + 10x − 6x^2 = a_1(1) + a_2(1 + x) + a_3(1 + x + x^2)

Performing the operations in P_2 on the right-hand side, and equating coefficients, gives the three equations in the three unknown scalars,

15 = a_1 + a_2 + a_3
10 = a_2 + a_3
−6 = a_3

The coefficient matrix of this system is nonsingular, leading to a unique solution (no surprise there, see Theorem VRRB),

a_1 = 5    a_2 = 16    a_3 = −6

so by Definition VR

ρ_C(u) = [5, 16, −6]^t
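The coefficient-matching step can be automated with a computer algebra system. Here is a SymPy sketch of the computation just performed (the book's own computational examples use Sage, so treat this as an independent check):

```python
import sympy as sp

x, a1, a2, a3 = sp.symbols('x a1 a2 a3')

# Coordinatize u = 15 + 10x - 6x^2 relative to C = {1, 1+x, 1+x+x^2} by
# forcing every coefficient of u - (a1*1 + a2*(1+x) + a3*(1+x+x^2)) to vanish.
u = 15 + 10*x - 6*x**2
combo = a1*1 + a2*(1 + x) + a3*(1 + x + x**2)

eqs = sp.Poly(u - combo, x).all_coeffs()   # one equation per power of x
sol = sp.solve(eqs, [a1, a2, a3])
print(sol)   # {a1: 5, a2: 16, a3: -6}
```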
While we often form vector representations relative to "nice" bases, nothing prevents us from forming representations relative to "nasty" bases. For example, the set

D = { −2 − x + 3x^2, 1 − 2x^2, 5 + 4x + x^2 }

can be verified as a basis of P_2 by checking linear independence with Definition LI and then arguing that 3 vectors from P_2, a vector space of dimension 3 (Theorem DP), must also be a spanning set (Theorem G).
Now we desire scalars a_1, a_2, a_3 so that

15 + 10x − 6x^2 = a_1(−2 − x + 3x^2) + a_2(1 − 2x^2) + a_3(5 + 4x + x^2)

Performing the operations in P_2 on the right-hand side, and equating coefficients, gives the three equations in the three unknown scalars,

15 = −2a_1 + a_2 + 5a_3
10 = −a_1 + 4a_3
−6 = 3a_1 − 2a_2 + a_3

The coefficient matrix of this system is nonsingular, leading to a unique solution (no surprise there, see Theorem VRRB),

a_1 = −2    a_2 = 1    a_3 = 2

so by Definition VR

ρ_D(u) = [−2, 1, 2]^t
△<br />
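The "nasty" basis computation reduces to the same mechanical step: solve a 3 × 3 system whose columns are the coordinate vectors of the basis D relative to {1, x, x^2}. A NumPy sketch:

```python
import numpy as np

# Columns: coordinates of -2 - x + 3x^2, 1 - 2x^2, and 5 + 4x + x^2
# relative to the basis {1, x, x^2}; right-hand side: u = 15 + 10x - 6x^2.
A = np.array([[-2,  1, 5],
              [-1,  0, 4],
              [ 3, -2, 1]], dtype=float)
b = np.array([15, 10, -6], dtype=float)

coords = np.linalg.solve(A, b)
print(coords)   # [-2.  1.  2.], matching rho_D(u)
```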
Theorem VRI Vector Representation is Injective
The function ρ_B (Definition VR) is an injective linear transformation.

Proof. We will appeal to Theorem KILT. Suppose U is a vector space of dimension n, so vector representation is ρ_B: U → C^n. Let B = {u_1, u_2, u_3, ..., u_n} be the basis of U used in the definition of ρ_B. Suppose u ∈ K(ρ_B). We write u as a linear combination of the vectors in the basis B where the scalars are the components of the vector representation, ρ_B(u).

u = [ρ_B(u)]_1 u_1 + [ρ_B(u)]_2 u_2 + ··· + [ρ_B(u)]_n u_n    Definition VR
  = [0]_1 u_1 + [0]_2 u_2 + ··· + [0]_n u_n                    Definition KLT
  = 0u_1 + 0u_2 + ··· + 0u_n                                   Definition ZCV
  = 0 + 0 + ··· + 0                                            Theorem ZSSM
  = 0                                                          Property Z

Thus an arbitrary vector, u, from the kernel, K(ρ_B), must equal the zero vector of U. So K(ρ_B) = {0} and by Theorem KILT, ρ_B is injective.
<br />
Theorem VRS Vector Representation is Surjective
The function ρ_B (Definition VR) is a surjective linear transformation.

Proof. We will appeal to Theorem RSLT. Suppose U is a vector space of dimension n, so vector representation is ρ_B: U → C^n. Let B = {u_1, u_2, u_3, ..., u_n} be the basis of U used in the definition of ρ_B. Suppose v ∈ C^n. Define the vector u by

u = [v]_1 u_1 + [v]_2 u_2 + [v]_3 u_3 + ··· + [v]_n u_n

Then for 1 ≤ i ≤ n, by Definition VR,

[ρ_B(u)]_i = [ρ_B([v]_1 u_1 + [v]_2 u_2 + [v]_3 u_3 + ··· + [v]_n u_n)]_i = [v]_i

so the entries of vectors ρ_B(u) and v are equal and Definition CVE yields the vector equality ρ_B(u) = v. This demonstrates that v ∈ R(ρ_B), so C^n ⊆ R(ρ_B). Since R(ρ_B) ⊆ C^n by Definition RLT, we have R(ρ_B) = C^n and Theorem RSLT says ρ_B is surjective.
<br />
We will have many occasions later to employ the inverse of vector representation, so we will record the fact that vector representation is an invertible linear transformation.

Theorem VRILT Vector Representation is an Invertible Linear Transformation
The function ρ_B (Definition VR) is an invertible linear transformation.

Proof. The function ρ_B (Definition VR) is a linear transformation (Theorem VRLT) that is injective (Theorem VRI) and surjective (Theorem VRS) with domain V and codomain C^n. By Theorem ILTIS we then know that ρ_B is an invertible linear transformation. ∎

Informally, we will refer to the application of ρ_B as coordinatizing a vector, while the application of ρ_B^{-1} will be referred to as un-coordinatizing a vector.
Subsection CVS
Characterization of Vector Spaces

Limiting our attention to vector spaces with finite dimension, we now describe every possible vector space. All of them. Really.

Theorem CFDVS Characterization of Finite Dimensional Vector Spaces
Suppose that V is a vector space with dimension n. Then V is isomorphic to C^n.

Proof. Since V has dimension n we can find a basis of V of size n (Definition D) which we will call B. The linear transformation ρ_B is an invertible linear transformation from V to C^n, so by Definition IVS, we have that V and C^n are isomorphic. ∎

Theorem CFDVS is the first of several surprises in this chapter, though it might be a bit demoralizing too. It says that there really are not all that many different (finite dimensional) vector spaces, and none are really any more complicated than C^n. Hmmm. The following examples should make this point.
Example TIVS Two isomorphic vector spaces
The vector space of polynomials with degree 8 or less, P_8, has dimension 9 (Theorem DP). By Theorem CFDVS, P_8 is isomorphic to C^9. △

Example CVSR Crazy vector space revealed
The crazy vector space, C of Example CVS, has dimension 2 by Example DC. By Theorem CFDVS, C is isomorphic to C^2. Hmmmm. Not really so crazy after all? △

Example ASC A subspace characterized
In Example DSP4 we determined that a certain subspace W of P_4 has dimension 4. By Theorem CFDVS, W is isomorphic to C^4. △
Theorem IFDVS Isomorphism of Finite Dimensional Vector Spaces
Suppose U and V are both finite-dimensional vector spaces. Then U and V are isomorphic if and only if dim(U) = dim(V).

Proof. (⇒) This is just the statement proved in Theorem IVSED.

(⇐) This is the advertised converse of Theorem IVSED. We will assume U and V have equal dimension and discover that they are isomorphic vector spaces. Let n be the common dimension of U and V. Then by Theorem CFDVS there are isomorphisms T: U → C^n and S: V → C^n.

T is therefore an invertible linear transformation by Definition IVS. Similarly, S is an invertible linear transformation, and so S^{-1} is an invertible linear transformation (Theorem IILT). The composition of invertible linear transformations is again invertible (Theorem CIVLT) so the composition of S^{-1} with T is invertible. Then (S^{-1} ◦ T): U → V is an invertible linear transformation from U to V and Definition IVS says U and V are isomorphic.
Example MIVS Multiple isomorphic vector spaces
C^10, P_9, M_25 and M_52 are all vector spaces and each has dimension 10. By Theorem IFDVS each is isomorphic to any other.

The subspace of M_44 that contains all the symmetric matrices (Definition SYM) has dimension 10, so this subspace is also isomorphic to each of the four vector spaces above. △
Subsection CP
Coordinatization Principle

With ρ_B available as an invertible linear transformation, we can translate between vectors in a vector space U of dimension m and C^m. Furthermore, as a linear transformation, ρ_B respects the addition and scalar multiplication in U, while ρ_B^{-1} respects the addition and scalar multiplication in C^m. Since our definitions of linear independence, spans, bases and dimension are all built up from linear combinations, we will finally be able to translate fundamental properties between abstract vector spaces (U) and concrete vector spaces (C^m).
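As a concrete preview of this translation (a sketch, assuming the standard basis {1, x, x^2} for P_2): a linear independence question about polynomials becomes a rank computation on their coordinate vectors in C^3.

```python
import numpy as np

# Coordinate vectors (relative to {1, x, x^2}) of the polynomials
# -2 - x + 3x^2, 1 - 2x^2, 5 + 4x + x^2, placed as columns.
R = np.array([[-2,  1, 5],
              [-1,  0, 4],
              [ 3, -2, 1]], dtype=float)

# Full column rank of the coordinate matrix means the coordinate vectors,
# and hence the original polynomials, are linearly independent.
independent = np.linalg.matrix_rank(R) == R.shape[1]
print(independent)   # True
```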
Theorem CLI Coordinatization and Linear Independence
Suppose that U is a vector space with a basis B of size n. Then

S = {u_1, u_2, u_3, ..., u_k}

is a linearly independent subset of U if and only if

R = {ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}

is a linearly independent subset of C^n.

Proof. The linear transformation ρ_B is an isomorphism between U and C^n (Theorem VRILT). As an invertible linear transformation, ρ_B is an injective linear transformation (Theorem ILTIS), and ρ_B^{-1} is also an injective linear transformation (Theorem IILT, Theorem ILTIS).

(⇒) Since ρ_B is an injective linear transformation and S is linearly independent, Theorem ILTLI says that R is linearly independent.

(⇐) If we apply ρ_B^{-1} to each element of R, we will create the set S. Since we are assuming R is linearly independent and ρ_B^{-1} is injective, Theorem ILTLI says that S is linearly independent.
<br />
Theorem CSS Coordinatization and Spanning Sets
Suppose that U is a vector space with a basis B of size n. Then

u ∈ 〈{u_1, u_2, u_3, ..., u_k}〉

if and only if

ρ_B(u) ∈ 〈{ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}〉
§VR Beezer: A First Course in Linear Algebra 510
Proof. (⇒) Suppose u ∈ ⟨{u_1, u_2, u_3, ..., u_k}⟩. Then we know there are scalars, a_1, a_2, a_3, ..., a_k, such that

    u = a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_k u_k

Then, by Theorem LTLC,

    ρ_B(u) = ρ_B(a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_k u_k)
           = a_1 ρ_B(u_1) + a_2 ρ_B(u_2) + a_3 ρ_B(u_3) + ··· + a_k ρ_B(u_k)

which says that ρ_B(u) ∈ ⟨{ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}⟩.

(⇐) Suppose that ρ_B(u) ∈ ⟨{ρ_B(u_1), ρ_B(u_2), ρ_B(u_3), ..., ρ_B(u_k)}⟩. Then there are scalars b_1, b_2, b_3, ..., b_k such that

    ρ_B(u) = b_1 ρ_B(u_1) + b_2 ρ_B(u_2) + b_3 ρ_B(u_3) + ··· + b_k ρ_B(u_k)

Recall that ρ_B is invertible (Theorem VRILT), so

    u = I_U(u)                                                         Definition IDLT
      = (ρ_B^{-1} ∘ ρ_B)(u)                                            Definition IVLT
      = ρ_B^{-1}(ρ_B(u))                                               Definition LTC
      = ρ_B^{-1}(b_1 ρ_B(u_1) + b_2 ρ_B(u_2) + ··· + b_k ρ_B(u_k))
      = b_1 ρ_B^{-1}(ρ_B(u_1)) + b_2 ρ_B^{-1}(ρ_B(u_2)) + ··· + b_k ρ_B^{-1}(ρ_B(u_k))   Theorem LTLC
      = b_1 I_U(u_1) + b_2 I_U(u_2) + ··· + b_k I_U(u_k)               Definition IVLT
      = b_1 u_1 + b_2 u_2 + b_3 u_3 + ··· + b_k u_k                    Definition IDLT

which says that u ∈ ⟨{u_1, u_2, u_3, ..., u_k}⟩. ∎
Here is a fairly simple example that illustrates a very, very important idea.

Example CP2 Coordinatizing in P_2
In Example VRP2 we needed to know that

    D = {−2 − x + 3x^2, 1 − 2x^2, 5 + 4x + x^2}

is a basis for P_2. With Theorem CLI and Theorem CSS this task is much easier.

First, choose a known basis for P_2, a basis that forms vector representations easily. We will choose

    B = {1, x, x^2}

Now, form the subset of C^3 that is the result of applying ρ_B to each element of D,

    F = {ρ_B(−2 − x + 3x^2), ρ_B(1 − 2x^2), ρ_B(5 + 4x + x^2)}
      = {[−2; −1; 3], [1; 0; −2], [5; 4; 1]}

and ask if F is a linearly independent spanning set for C^3. This is easily seen to be the case by forming a matrix A whose columns are the vectors of F, row-reducing A to the identity matrix I_3, and then using the nonsingularity of A to assert that F is a basis for C^3 (Theorem CNMB). Now, since F is a basis for C^3, Theorem CLI and Theorem CSS tell us that D is also a basis for P_2. △
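The nonsingularity check in Example CP2 is mechanical, so it is easy to sketch in code. Here is a minimal Python/NumPy version (the matrix below simply transcribes the columns of F; a rank of 3 is equivalent to row-reducing to I_3):

```python
import numpy as np

# Columns are the coordinatizations, relative to B = {1, x, x^2}, of the
# three polynomials in D from Example CP2.
A = np.array([[-2,  1, 5],
              [-1,  0, 4],
              [ 3, -2, 1]])

# A is nonsingular exactly when rank(A) = 3, so F is then a basis for C^3,
# and by Theorem CLI and Theorem CSS, D is a basis for P_2.
print(np.linalg.matrix_rank(A))  # 3
```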
Example CP2 illustrates the broad notion that computations in abstract vector spaces can be reduced to computations in C^m. You may have noticed this phenomenon as you worked through examples in Chapter VS or Chapter LT employing vector spaces of matrices or polynomials. These computations seemed to invariably result in systems of equations or the like from Chapter SLE, Chapter V and Chapter M. It is vector representation, ρ_B, that allows us to make this connection formal and precise.

Knowing that vector representation allows us to translate questions about linear combinations, linear independence and spans from general vector spaces to C^m allows us to prove a great many theorems about how to translate other properties. Rather than prove these theorems, each in the same style as the others, we will offer some general guidance about how to best employ Theorem VRLT, Theorem CLI and Theorem CSS. This comes in the form of a "principle": a basic truth, but most definitely not a theorem (hence, no proof).

The Coordinatization Principle
Suppose that U is a vector space with a basis B of size n. Then any question about U, or its elements, which ultimately depends on the vector addition or scalar multiplication in U, or depends on linear independence or spanning, may be translated into the same question in C^n by application of the linear transformation ρ_B to the relevant vectors. Once the question is answered in C^n, the answer may be translated back to U through application of the inverse linear transformation ρ_B^{-1} (if necessary).
Example CM32 Coordinatization in M_32
This is a simple example of the Coordinatization Principle, depending only on the fact that coordinatizing is an invertible linear transformation (Theorem VRILT). Suppose we have a linear combination to perform in M_32, the vector space of 3 × 2 matrices, but we are averse to doing the operations of M_32 (Definition MA, Definition MSM). More specifically, suppose we are faced with the computation

      [ 3  7]     [−1  3]
    6 [−2  4] + 2 [ 4  8]
      [ 0 −3]     [−2  5]

We choose a nice basis for M_32 (or a nasty basis if we are so inclined),

    B = {[1 0; 0 0; 0 0], [0 0; 1 0; 0 0], [0 0; 0 0; 1 0],
         [0 1; 0 0; 0 0], [0 0; 0 1; 0 0], [0 0; 0 0; 0 1]}

and apply ρ_B to each vector in the linear combination. This gives us a new computation, now in the vector space C^6, which we can compute with operations in C^6 (Definition CVA, Definition CVSM),

      [ 3]     [−1]   [ 16]
      [−2]     [ 4]   [ −4]
    6 [ 0] + 2 [−2] = [ −4]
      [ 7]     [ 3]   [ 48]
      [ 4]     [ 8]   [ 40]
      [−3]     [ 5]   [ −8]

We are after the result of a computation in M_32, so we now can apply ρ_B^{-1} to obtain a 3 × 2 matrix,

    16 [1 0; 0 0; 0 0] + (−4)[0 0; 1 0; 0 0] + (−4)[0 0; 0 0; 1 0]
       + 48 [0 1; 0 0; 0 0] + 40 [0 0; 0 1; 0 0] + (−8)[0 0; 0 0; 0 1]

      [16 48]
    = [−4 40]
      [−4 −8]

which is exactly the matrix we would have computed had we just performed the matrix operations in the first place. So this was not meant to be an easier way to compute a linear combination of two matrices, just a different way. △
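The round trip in Example CM32 can be sketched in a few lines of Python with NumPy. For the "nice" basis above, coordinatizing is just reading the entries of a matrix in column order, and un-coordinatizing is the reverse reshape (the function names here are our own, not from the text):

```python
import numpy as np

def rho_B(M):
    # Coordinatize relative to the "nice" basis: list entries column by column.
    return M.flatten(order="F")

def rho_B_inv(v):
    # Un-coordinatize: reshape a vector of C^6 back into a 3x2 matrix.
    return v.reshape(3, 2, order="F")

X = np.array([[3, 7], [-2, 4], [0, -3]])
Y = np.array([[-1, 3], [4, 8], [-2, 5]])

# Do the linear combination in C^6, then translate back to M_32.
result = rho_B_inv(6 * rho_B(X) + 2 * rho_B(Y))
print(np.array_equal(result, 6 * X + 2 * Y))  # True: same answer either way
```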
Reading Questions

1. The vector space of 3 × 5 matrices, M_35, is isomorphic to what fundamental vector space?

2. A basis for C^3 is

       B = {[1; 2; −1], [3; −1; 2], [1; 1; 1]}

   Compute ρ_B([5; 8; −1]).

3. What is the first "surprise," and why is it surprising?
Exercises

C10† In the vector space C^3, compute the vector representation ρ_B(v) for the basis B and vector v below.

    B = {[2; −2; 2], [1; 3; 1], [3; 5; 2]}      v = [11; 5; 8]

C20† Rework Example CM32 replacing the basis B by the basis

    C = {[−14 −9; 10 10; −6 −2], [−7 −4; 5 5; −3 −1], [−3 −1; 0 −2; 1 1],
         [−7 −4; 3 2; −1 0], [4 2; −3 −3; 2 1], [0 0; −1 −2; 1 1]}

M10 Prove that the set S below is a basis for the vector space of 2 × 2 matrices, M_22. Do this by choosing a natural basis for M_22 and coordinatizing the elements of S with respect to this basis. Examine the resulting set of column vectors from C^4 and apply the Coordinatization Principle.

    S = {[33 99; 78 −9], [−16 −47; −36 2], [10 27; 17 3], [−2 −7; −6 4]}

M20† The set B = {v_1, v_2, v_3, v_4} is a basis of the vector space P_3, polynomials with degree 3 or less. Therefore ρ_B is a linear transformation, according to Theorem VRLT. Find a "formula" for ρ_B. In other words, find an expression for ρ_B(a + bx + cx^2 + dx^3).

    v_1 = 1 − 5x − 22x^2 + 3x^3      v_2 = −2 + 11x + 49x^2 − 8x^3
    v_3 = −1 + 7x + 33x^2 − 8x^3     v_4 = −1 + 4x + 16x^2 + x^3
Section MR
Matrix Representations

We have seen that linear transformations whose domain and codomain are vector spaces of column vectors have a close relationship with matrices (Theorem MBLT, Theorem MLTCV). In this section, we will extend the relationship between matrices and linear transformations to the setting of linear transformations between abstract vector spaces.

Subsection MR
Matrix Representations

This is a fundamental definition.

Definition MR Matrix Representation
Suppose that T: U → V is a linear transformation, B = {u_1, u_2, u_3, ..., u_n} is a basis for U of size n, and C is a basis for V of size m. Then the matrix representation of T relative to B and C is the m × n matrix,

    M^T_{B,C} = [ρ_C(T(u_1)) | ρ_C(T(u_2)) | ρ_C(T(u_3)) | ... | ρ_C(T(u_n))]   □
Example OLTTR One linear transformation, three representations
Consider the linear transformation, S: P_3 → M_22, given by

    S(a + bx + cx^2 + dx^3) = [ 3a + 7b − 2c − 5d    8a + 14b − 2c − 11d]
                              [−4a − 8b + 2c + 6d   12a + 22b − 4c − 17d]

First, we build a representation relative to the bases,

    B = {1 + 2x + x^2 − x^3, 1 + 3x + x^2 + x^3, −1 − 2x + 2x^3, 2 + 3x + 2x^2 − 5x^3}
    C = {[1 1; 1 2], [2 3; 2 5], [−1 −1; 0 −2], [−1 −4; −2 −4]}

We evaluate S with each element of the basis for the domain, B, and coordinatize the result relative to the vectors in the basis for the codomain, C. Notice here how we take elements of vector spaces and decompose them into linear combinations of basis elements as the key step in constructing coordinatizations of vectors. There is a system of equations involved almost every time, but we will omit these details since this should be a routine exercise at this stage.

    ρ_C(S(1 + 2x + x^2 − x^3))
      = ρ_C([20 45; −24 69])
      = ρ_C((−90)[1 1; 1 2] + 37[2 3; 2 5] + (−40)[−1 −1; 0 −2] + 4[−1 −4; −2 −4])
      = [−90; 37; −40; 4]

    ρ_C(S(1 + 3x + x^2 + x^3))
      = ρ_C([17 37; −20 57])
      = ρ_C((−72)[1 1; 1 2] + 29[2 3; 2 5] + (−34)[−1 −1; 0 −2] + 3[−1 −4; −2 −4])
      = [−72; 29; −34; 3]

    ρ_C(S(−1 − 2x + 2x^3))
      = ρ_C([−27 −58; 32 −90])
      = ρ_C(114[1 1; 1 2] + (−46)[2 3; 2 5] + 54[−1 −1; 0 −2] + (−5)[−1 −4; −2 −4])
      = [114; −46; 54; −5]

    ρ_C(S(2 + 3x + 2x^2 − 5x^3))
      = ρ_C([48 109; −58 167])
      = ρ_C((−220)[1 1; 1 2] + 91[2 3; 2 5] + (−96)[−1 −1; 0 −2] + 10[−1 −4; −2 −4])
      = [−220; 91; −96; 10]

Thus, employing Definition MR,

                [−90  −72  114  −220]
    M^S_{B,C} = [ 37   29  −46    91]
                [−40  −34   54   −96]
                [  4    3   −5    10]
Often we use "nice" bases to build matrix representations and the work involved is much easier. Suppose we take bases

    D = {1, x, x^2, x^3}
    E = {[1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1]}

The evaluation of S at the elements of D is easy and coordinatization relative to E can be done on sight,

    ρ_E(S(1)) = ρ_E([3 8; −4 12])
      = ρ_E(3[1 0; 0 0] + 8[0 1; 0 0] + (−4)[0 0; 1 0] + 12[0 0; 0 1])
      = [3; 8; −4; 12]

    ρ_E(S(x)) = ρ_E([7 14; −8 22])
      = ρ_E(7[1 0; 0 0] + 14[0 1; 0 0] + (−8)[0 0; 1 0] + 22[0 0; 0 1])
      = [7; 14; −8; 22]

    ρ_E(S(x^2)) = ρ_E([−2 −2; 2 −4])
      = ρ_E((−2)[1 0; 0 0] + (−2)[0 1; 0 0] + 2[0 0; 1 0] + (−4)[0 0; 0 1])
      = [−2; −2; 2; −4]

    ρ_E(S(x^3)) = ρ_E([−5 −11; 6 −17])
      = ρ_E((−5)[1 0; 0 0] + (−11)[0 1; 0 0] + 6[0 0; 1 0] + (−17)[0 0; 0 1])
      = [−5; −11; 6; −17]

So the matrix representation of S relative to D and E is

                [ 3    7   −2   −5]
    M^S_{D,E} = [ 8   14   −2  −11]
                [−4   −8    2    6]
                [12   22   −4  −17]
One more time, but now let us use bases

    F = {1 + x − x^2 + 2x^3, −1 + 2x + 2x^3, 2 + x − 2x^2 + 3x^3, 1 + x + 2x^3}
    G = {[1 1; −1 2], [−1 2; 0 2], [2 1; −2 3], [1 1; 0 2]}

and evaluate S with the elements of F, then coordinatize the results relative to G,

    ρ_G(S(1 + x − x^2 + 2x^3)) = ρ_G([2 2; −2 4]) = ρ_G(2[1 1; −1 2]) = [2; 0; 0; 0]
    ρ_G(S(−1 + 2x + 2x^3)) = ρ_G([1 −2; 0 −2]) = ρ_G((−1)[−1 2; 0 2]) = [0; −1; 0; 0]
    ρ_G(S(2 + x − 2x^2 + 3x^3)) = ρ_G([2 1; −2 3]) = [0; 0; 1; 0]
    ρ_G(S(1 + x + 2x^3)) = ρ_G([0 0; 0 0]) = [0; 0; 0; 0]

So we arrive at an especially economical matrix representation,

                [2   0  0  0]
    M^S_{F,G} = [0  −1  0  0]
                [0   0  1  0]
                [0   0  0  0]
△
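With the "nice" bases D and E, building M^S_{D,E} is mechanical enough to sketch in Python with NumPy. The code below transcribes the formula for S as a function on coefficient tuples (our own naming) and stacks the coordinatized outputs as columns, exactly as in Definition MR:

```python
import numpy as np

# S : P_3 -> M_22 from Example OLTTR, acting on the coefficients (a, b, c, d)
# of a + bx + cx^2 + dx^3.
def S(a, b, c, d):
    return np.array([[ 3*a +  7*b - 2*c -  5*d,  8*a + 14*b - 2*c - 11*d],
                     [-4*a -  8*b + 2*c +  6*d, 12*a + 22*b - 4*c - 17*d]])

# Definition MR with the bases D = {1, x, x^2, x^3} and the standard basis E:
# coordinatizing relative to E just reads a 2x2 matrix's entries in row order.
D_basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
M_DE = np.column_stack([S(*p).flatten() for p in D_basis])
print(M_DE.tolist())
# [[3, 7, -2, -5], [8, 14, -2, -11], [-4, -8, 2, 6], [12, 22, -4, -17]]
```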
We may choose to use whatever terms we want when we make a definition. Some are arbitrary, while others make sense, but only in light of subsequent theorems. Matrix representation is in the latter category. We begin with a linear transformation and produce a matrix. So what? Here is the theorem that justifies the term "matrix representation."
Theorem FTMR Fundamental Theorem of Matrix Representation
Suppose that T: U → V is a linear transformation, B is a basis for U, C is a basis for V and M^T_{B,C} is the matrix representation of T relative to B and C. Then, for any u ∈ U,

    ρ_C(T(u)) = M^T_{B,C} ρ_B(u)

or equivalently

    T(u) = ρ_C^{-1}(M^T_{B,C} ρ_B(u))

Proof. Let B = {u_1, u_2, u_3, ..., u_n} be the basis of U. Since u ∈ U, there are scalars a_1, a_2, a_3, ..., a_n such that

    u = a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_n u_n

Then,

    M^T_{B,C} ρ_B(u)
      = [ρ_C(T(u_1)) | ρ_C(T(u_2)) | ... | ρ_C(T(u_n))] ρ_B(u)                    Definition MR
      = [ρ_C(T(u_1)) | ρ_C(T(u_2)) | ... | ρ_C(T(u_n))] [a_1; a_2; a_3; ...; a_n] Definition VR
      = a_1 ρ_C(T(u_1)) + a_2 ρ_C(T(u_2)) + ··· + a_n ρ_C(T(u_n))                 Definition MVP
      = ρ_C(a_1 T(u_1) + a_2 T(u_2) + a_3 T(u_3) + ··· + a_n T(u_n))              Theorem LTLC
      = ρ_C(T(a_1 u_1 + a_2 u_2 + a_3 u_3 + ··· + a_n u_n))                       Theorem LTLC
      = ρ_C(T(u))

The alternative conclusion is obtained as

    T(u) = I_V(T(u))                       Definition IDLT
         = (ρ_C^{-1} ∘ ρ_C)(T(u))          Definition IVLT
         = ρ_C^{-1}(ρ_C(T(u)))             Definition LTC
         = ρ_C^{-1}(M^T_{B,C} ρ_B(u))      ∎

This theorem says that we can apply T to u and coordinatize the result relative to C in V, or we can first coordinatize u relative to B in U, then multiply by the matrix representation. Either way, the result is the same. So the effect of a linear transformation can always be accomplished by a matrix-vector product (Definition MVP). That is important enough to say again. The effect of a linear transformation is a matrix-vector product.
                 T
       u ------------------> T(u)
       |                      |
   ρ_B |                      | ρ_C
       v                      v
    ρ_B(u) ----------------> M^T_{B,C} ρ_B(u) = ρ_C(T(u))
              M^T_{B,C}

    Diagram FTMR: Fundamental Theorem of Matrix Representations
The alternative conclusion of this result might be even more striking. It says that to effect a linear transformation (T) of a vector (u), coordinatize the input (with ρ_B), do a matrix-vector product (with M^T_{B,C}), and un-coordinatize the result (with ρ_C^{-1}). So, absent some bookkeeping about vector representations, a linear transformation is a matrix. To adjust the diagram, we "reverse" the arrow on the right, which means inverting the vector representation ρ_C on V. Now we can go directly across the top of the diagram, computing the linear transformation between the abstract vector spaces. Or, we can go around the other three sides, using vector representation, a matrix-vector product, followed by un-coordinatization.

                 T
       u ------------------> T(u) = ρ_C^{-1}(M^T_{B,C} ρ_B(u))
       |                      ^
   ρ_B |                      | ρ_C^{-1}
       v                      |
    ρ_B(u) ----------------> M^T_{B,C} ρ_B(u)
              M^T_{B,C}

    Diagram FTMRA: Fundamental Theorem of Matrix Representations (Alternate)
Here is an example to illustrate how the "action" of a linear transformation can be effected by matrix multiplication.

Example ALTMM A linear transformation as matrix multiplication
In Example OLTTR we found three representations of the linear transformation S. In this example, we will compute a single output of S in four different ways. First "normally," then three times over using Theorem FTMR.

Choose p(x) = 3 − x + 2x^2 − 5x^3, for no particular reason. Then the straightforward application of S to p(x) yields

    S(p(x)) = S(3 − x + 2x^2 − 5x^3)
            = [ 3(3) + 7(−1) − 2(2) − 5(−5)    8(3) + 14(−1) − 2(2) − 11(−5)]
              [−4(3) − 8(−1) + 2(2) + 6(−5)   12(3) + 22(−1) − 4(2) − 17(−5)]
            = [ 23  61]
              [−30  91]

Now use the representation of S relative to the bases B and C and Theorem FTMR. Note that we will employ the following linear combination in moving from the second line to the third,

    3 − x + 2x^2 − 5x^3 = 48(1 + 2x + x^2 − x^3) + (−20)(1 + 3x + x^2 + x^3)
                          + (−1)(−1 − 2x + 2x^3) + (−13)(2 + 3x + 2x^2 − 5x^3)

    S(p(x)) = ρ_C^{-1}(M^S_{B,C} ρ_B(p(x)))
            = ρ_C^{-1}(M^S_{B,C} ρ_B(3 − x + 2x^2 − 5x^3))
            = ρ_C^{-1}(M^S_{B,C} [48; −20; −1; −13])

                       ( [−90  −72  114  −220] [ 48] )
            = ρ_C^{-1} ( [ 37   29  −46    91] [−20] )
                       ( [−40  −34   54   −96] [ −1] )
                       ( [  4    3   −5    10] [−13] )

            = ρ_C^{-1}([−134; 59; −46; 7])
            = (−134)[1 1; 1 2] + 59[2 3; 2 5] + (−46)[−1 −1; 0 −2] + 7[−1 −4; −2 −4]
            = [23 61; −30 91]
Again, but now with "nice" bases like D and E, the computations are more transparent.

    S(p(x)) = ρ_E^{-1}(M^S_{D,E} ρ_D(p(x)))
            = ρ_E^{-1}(M^S_{D,E} ρ_D(3(1) + (−1)(x) + 2(x^2) + (−5)(x^3)))
            = ρ_E^{-1}(M^S_{D,E} [3; −1; 2; −5])

                       ( [ 3    7   −2   −5] [ 3] )
            = ρ_E^{-1} ( [ 8   14   −2  −11] [−1] )
                       ( [−4   −8    2    6] [ 2] )
                       ( [12   22   −4  −17] [−5] )

            = ρ_E^{-1}([23; 61; −30; 91])
            = 23[1 0; 0 0] + 61[0 1; 0 0] + (−30)[0 0; 1 0] + 91[0 0; 0 1]
            = [23 61; −30 91]
OK, last time, now with the bases F and G. The coordinatizations will take some work this time, but the matrix-vector product (Definition MVP) (which is the actual action of the linear transformation) will be especially easy, given the diagonal nature of the matrix representation, M^S_{F,G}. Here we go,

    S(p(x)) = ρ_G^{-1}(M^S_{F,G} ρ_F(p(x)))
            = ρ_G^{-1}(M^S_{F,G} ρ_F(3 − x + 2x^2 − 5x^3))
            = ρ_G^{-1}(M^S_{F,G} ρ_F(32(1 + x − x^2 + 2x^3) − 7(−1 + 2x + 2x^3)
                                     − 17(2 + x − 2x^2 + 3x^3) − 2(1 + x + 2x^3)))
            = ρ_G^{-1}(M^S_{F,G} [32; −7; −17; −2])

                       ( [2   0  0  0] [ 32] )
            = ρ_G^{-1} ( [0  −1  0  0] [ −7] )
                       ( [0   0  1  0] [−17] )
                       ( [0   0  0  0] [ −2] )

            = ρ_G^{-1}([64; 7; −17; 0])
            = 64[1 1; −1 2] + 7[−1 2; 0 2] + (−17)[2 1; −2 3] + 0[1 1; 0 2]
            = [23 61; −30 91]

This example is not necessarily meant to illustrate that any one of these four computations is simpler than the others. Instead, it is meant to illustrate the many different ways we can arrive at the same result, with the last three all employing a matrix representation to effect the linear transformation. △
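Theorem FTMR is easy to check numerically for Example ALTMM. Here is a Python/NumPy sketch with the "nice" bases D and E (function and variable names are ours), comparing the direct evaluation of S against the coordinatize, multiply, un-coordinatize route:

```python
import numpy as np

# S from Example OLTTR, acting on the coefficients of a + bx + cx^2 + dx^3.
def S(a, b, c, d):
    return np.array([[ 3*a +  7*b - 2*c -  5*d,  8*a + 14*b - 2*c - 11*d],
                     [-4*a -  8*b + 2*c +  6*d, 12*a + 22*b - 4*c - 17*d]])

M_DE = np.array([[ 3,  7, -2,  -5],
                 [ 8, 14, -2, -11],
                 [-4, -8,  2,   6],
                 [12, 22, -4, -17]])

p = np.array([3, -1, 2, -5])           # rho_D(p) for p = 3 - x + 2x^2 - 5x^3
via_matrix = (M_DE @ p).reshape(2, 2)  # rho_E^{-1} of the matrix-vector product
direct = S(*p)

print(direct.tolist())                     # [[23, 61], [-30, 91]]
print(np.array_equal(direct, via_matrix))  # True
```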
We will use Theorem FTMR frequently in the next few sections. A typical application will feel like the linear transformation T "commutes" with a vector representation, ρ_C, and as it does, the transformation morphs into a matrix, M^T_{B,C}, while the vector representation changes to a new basis, ρ_B. Or vice versa.
Subsection NRFO
New Representations from Old

In Subsection LT.NLTFO we built new linear transformations from other linear transformations: sums, scalar multiples and compositions. These new linear transformations will have matrix representations as well. How do the new matrix representations relate to the old matrix representations? Here are the three theorems.

Theorem MRSLT Matrix Representation of a Sum of Linear Transformations
Suppose that T: U → V and S: U → V are linear transformations, B is a basis of U and C is a basis of V. Then

    M^{T+S}_{B,C} = M^T_{B,C} + M^S_{B,C}
Proof. Let x be any vector in C^n. Define u ∈ U by u = ρ_B^{-1}(x), so x = ρ_B(u). Then,

    M^{T+S}_{B,C} x = M^{T+S}_{B,C} ρ_B(u)             Substitution
      = ρ_C((T + S)(u))                                Theorem FTMR
      = ρ_C(T(u) + S(u))                               Definition LTA
      = ρ_C(T(u)) + ρ_C(S(u))                          Definition LT
      = M^T_{B,C} ρ_B(u) + M^S_{B,C} ρ_B(u)            Theorem FTMR
      = (M^T_{B,C} + M^S_{B,C}) ρ_B(u)                 Theorem MMDAA
      = (M^T_{B,C} + M^S_{B,C}) x                      Substitution

Since the matrices M^{T+S}_{B,C} and M^T_{B,C} + M^S_{B,C} have equal matrix-vector products for every vector in C^n, by Theorem EMMVP they are equal matrices. (Now would be a good time to double back and study the proof of Theorem EMMVP. You did promise to come back to this theorem sometime, didn't you?) ∎
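The statement of Theorem MRSLT can also be seen concretely. Here is a small Python/NumPy sketch (with hypothetical maps given by fixed matrices, and standard bases so that the helper `matrix_rep` implements Definition MR directly):

```python
import numpy as np

def matrix_rep(f, n):
    # Definition MR with standard bases: columns are f applied to e_1, ..., e_n.
    return np.column_stack([f(e) for e in np.eye(n)])

# Two hypothetical linear maps C^3 -> C^2, defined here by fixed matrices.
A = np.array([[1, -2, 0], [3, 1, 4]])
B = np.array([[2, 5, -1], [0, -3, 2]])
T = lambda x: A @ x
S = lambda x: B @ x

# Theorem MRSLT: the representation of T + S is the sum of representations.
sum_rep = matrix_rep(lambda x: T(x) + S(x), 3)
print(np.array_equal(sum_rep, matrix_rep(T, 3) + matrix_rep(S, 3)))  # True
```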
Theorem MRMLT Matrix Representation of a Multiple of a Linear Transformation
Suppose that T: U → V is a linear transformation, α ∈ C, B is a basis of U and C is a basis of V. Then

    M^{αT}_{B,C} = α M^T_{B,C}

Proof. Let x be any vector in C^n. Define u ∈ U by u = ρ_B^{-1}(x), so x = ρ_B(u). Then,

    M^{αT}_{B,C} x = M^{αT}_{B,C} ρ_B(u)       Substitution
      = ρ_C((αT)(u))                           Theorem FTMR
      = ρ_C(α T(u))                            Definition LTSM
      = α ρ_C(T(u))                            Definition LT
      = α (M^T_{B,C} ρ_B(u))                   Theorem FTMR
      = (α M^T_{B,C}) ρ_B(u)                   Theorem MMSMM
      = (α M^T_{B,C}) x                        Substitution

Since the matrices M^{αT}_{B,C} and α M^T_{B,C} have equal matrix-vector products for every vector in C^n, by Theorem EMMVP they are equal matrices. ∎
The vector space of all linear transformations from U to V is now isomorphic to the vector space of all m × n matrices.

Theorem MRCLT Matrix Representation of a Composition of Linear Transformations
Suppose that T: U → V and S: V → W are linear transformations, B is a basis of U, C is a basis of V, and D is a basis of W. Then

    M^{S∘T}_{B,D} = M^S_{C,D} M^T_{B,C}

Proof. Let x be any vector in C^n. Define u ∈ U by u = ρ_B^{-1}(x), so x = ρ_B(u). Then,

    M^{S∘T}_{B,D} x = M^{S∘T}_{B,D} ρ_B(u)     Substitution
      = ρ_D((S ∘ T)(u))                        Theorem FTMR
      = ρ_D(S(T(u)))                           Definition LTC
      = M^S_{C,D} ρ_C(T(u))                    Theorem FTMR
      = M^S_{C,D} (M^T_{B,C} ρ_B(u))           Theorem FTMR
      = (M^S_{C,D} M^T_{B,C}) ρ_B(u)           Theorem MMA
      = (M^S_{C,D} M^T_{B,C}) x                Substitution

Since the matrices M^{S∘T}_{B,D} and M^S_{C,D} M^T_{B,C} have equal matrix-vector products for every vector in C^n, by Theorem EMMVP they are equal matrices. ∎
This is the second great surprise of introductory linear algebra. Matrices are linear transformations (functions, really), and matrix multiplication is function composition! We can form the composition of two linear transformations, then form the matrix representation of the result. Or we can form the matrix representation of each linear transformation separately, then multiply the two representations together via Definition MM. In either case, we arrive at the same result.
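Theorem MRCLT can be checked numerically for the transformations of the next example. Here is a Python/NumPy sketch using the standard bases of C^2, P_2 and M_22 (an assumption made for simplicity; Example MPMR itself uses the bases B, C and D):

```python
import numpy as np

# T : C^2 -> P_2 and S : P_2 -> M_22 from Example MPMR, represented relative
# to standard bases: rows of M_T are the coefficients of 1, x, x^2 in T([a; b]);
# rows of M_S are the entries of S(a + bx + cx^2), read in row order.
M_T = np.array([[-1,  3],
                [ 2,  4],
                [ 1, -2]])
M_S = np.array([[ 2, 1,  2],
                [ 1, 4, -1],
                [-1, 0,  3],
                [ 3, 1,  2]])

# Theorem MRCLT: the representation of S∘T is the product of representations.
M_ST = M_S @ M_T
print(M_ST.tolist())
# [[2, 6], [6, 21], [4, -9], [1, 9]] -- the rows are the entries 2a+6b, 6a+21b,
# 4a-9b, a+9b of (S∘T)([a; b]), matching the composition computed in the example.
```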
Example MPMR Matrix product of matrix representations
Consider the two linear transformations,

    T: C^2 → P_2      T([a; b]) = (−a + 3b) + (2a + 4b)x + (a − 2b)x^2

    S: P_2 → M_22     S(a + bx + cx^2) = [2a + b + 2c   a + 4b − c ]
                                         [−a + 3c       3a + b + 2c]

and bases for C^2, P_2 and M_22 (respectively),

    B = {[3; 1], [2; 1]}
    C = {1 − 2x + x^2, −1 + 3x, 2x + 3x^2}
    D = {[1 −2; 1 −1], [1 −1; 1 2], [−1 2; 0 0], [2 −3; 2 2]}
Beg<strong>in</strong> by comput<strong>in</strong>g the new l<strong>in</strong>ear transformation that is the composition of T<br />
and S (Def<strong>in</strong>ition LTC,TheoremCLTLT), (S ◦ T ):C 2 → M 22 ,<br />
([ ( ([<br />
a a<br />
(S ◦ T ) = S T<br />
b])<br />
b]))<br />
= S ( (−a +3b)+(2a +4b)x +(a − 2b)x 2)<br />
[ ]<br />
2(−a +3b)+(2a +4b)+2(a − 2b) (−a +3b) + 4(2a +4b) − (a − 2b)<br />
=<br />
−(−a +3b)+3(a − 2b) 3(−a +3b)+(2a +4b)+2(a − 2b)<br />
[ ]<br />
2a +6b 6a +21b<br />
=<br />
4a − 9b a+9b<br />
Now compute the matrix representations (Def<strong>in</strong>ition MR) for each of these three<br />
l<strong>in</strong>ear transformations (T , S, S ◦ T ), relative to the appropriate bases. <strong>First</strong> for T ,<br />
([<br />
3 (<br />
ρ C<br />
(T = ρ<br />
1]))<br />
C 10x + x<br />
2 )<br />
(<br />
= ρ C 28(1 − 2x + x 2 ) + 28(−1+3x)+(−9)(2x +3x 2 ) )<br />
[ ] 28<br />
= 28<br />
−9<br />
([<br />
2<br />
ρ C<br />
(T = ρ<br />
1]))<br />
C (1 + 8x)<br />
(<br />
= ρ C 33(1 − 2x + x 2 ) + 32(−1+3x)+(−11)(2x +3x 2 ) )<br />
[ ] 33<br />
= 32<br />
−11<br />
So we have the matrix representation of T ,<br />
[ ] 28 33<br />
MB,C T = 28 32<br />
−9 −11
Now, a representation of S,

ρ_D(S(1 − 2x + x^2)) = ρ_D([2, −8; 2, 3])
    = ρ_D((−11)[1, −2; 1, −1] + (−21)[1, −1; 1, 2] + 0[−1, 2; 0, 0] + 17[2, −3; 2, 2])
    = [−11; −21; 0; 17]

ρ_D(S(−1 + 3x)) = ρ_D([1, 11; 1, 0])
    = ρ_D(26[1, −2; 1, −1] + 51[1, −1; 1, 2] + 0[−1, 2; 0, 0] + (−38)[2, −3; 2, 2])
    = [26; 51; 0; −38]

ρ_D(S(2x + 3x^2)) = ρ_D([8, 5; 9, 8])
    = ρ_D(34[1, −2; 1, −1] + 67[1, −1; 1, 2] + 1[−1, 2; 0, 0] + (−46)[2, −3; 2, 2])
    = [34; 67; 1; −46]

So we have the matrix representation of S,

M^{S}_{C,D} = [−11, 26, 34; −21, 51, 67; 0, 0, 1; 17, −38, −46]
Finally, a representation of S ∘ T,

ρ_D((S ∘ T)([3; 1])) = ρ_D([12, 39; 3, 12])
    = ρ_D(114[1, −2; 1, −1] + 237[1, −1; 1, 2] + (−9)[−1, 2; 0, 0] + (−174)[2, −3; 2, 2])
    = [114; 237; −9; −174]

ρ_D((S ∘ T)([2; 1])) = ρ_D([10, 33; −1, 11])
    = ρ_D(95[1, −2; 1, −1] + 202[1, −1; 1, 2] + (−11)[−1, 2; 0, 0] + (−149)[2, −3; 2, 2])
    = [95; 202; −11; −149]

So we have the matrix representation of S ∘ T,

M^{S∘T}_{B,D} = [114, 95; 237, 202; −9, −11; −174, −149]
Now, we are all set to verify the conclusion of Theorem MRCLT,

M^{S}_{C,D} M^{T}_{B,C} = [−11, 26, 34; −21, 51, 67; 0, 0, 1; 17, −38, −46] [28, 33; 28, 32; −9, −11]
    = [114, 95; 237, 202; −9, −11; −174, −149]
    = M^{S∘T}_{B,D}

We have intentionally used nonstandard bases. If you were to choose "nice" bases for the three vector spaces, then the result of the theorem might be rather transparent. But this would still be a worthwhile exercise — give it a go. △
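If you want to check arithmetic like this without grinding through it by hand, the final verification is a few lines of code. Here is a minimal sketch in plain Python (no linear algebra library assumed; the names matmul, M_S, M_T, M_ST are ours, not the text's), with the three matrices copied from the example:

```python
# Check Theorem MRCLT on Example MPMR: the product of the two
# representations equals the representation of the composition.

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M_S = [[-11, 26, 34],
       [-21, 51, 67],
       [0, 0, 1],
       [17, -38, -46]]           # M^S_{C,D}

M_T = [[28, 33],
       [28, 32],
       [-9, -11]]                # M^T_{B,C}

M_ST = [[114, 95],
        [237, 202],
        [-9, -11],
        [-174, -149]]            # M^{S o T}_{B,D}

assert matmul(M_S, M_T) == M_ST
```

The same three-line matmul works for any of the matrix products in this section.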
A diagram, similar to ones we have seen earlier, might make the importance of this theorem clearer,
    S, T  ────(Definition MR)────>  M^{S}_{C,D}, M^{T}_{B,C}
      │                                     │
 (Definition LTC)                    (Definition MM)
      ↓                                     ↓
    S ∘ T ────(Definition MR)────>  M^{S∘T}_{B,D} = M^{S}_{C,D} M^{T}_{B,C}

Diagram MRCLT: Matrix Representation and Composition of Linear Transformations
One of our goals in the first part of this book is to make the definition of matrix multiplication (Definition MVP, Definition MM) seem as natural as possible. However, many of us are brought up with an entry-by-entry description of matrix multiplication (Theorem EMP) as the definition of matrix multiplication, and then theorems about columns of matrices and linear combinations follow from that definition. With this unmotivated definition, the realization that matrix multiplication is function composition is quite remarkable. It is an interesting exercise to begin with the question, "What is the matrix representation of the composition of two linear transformations?" and then, without using any theorems about matrix multiplication, finally arrive at the entry-by-entry description of matrix multiplication. Try it yourself (Exercise MR.T80).
Subsection PMR
Properties of Matrix Representations

It will not be a surprise to discover that the kernel and range of a linear transformation are closely related to the null space and column space of the transformation's matrix representation. Perhaps this idea has been bouncing around in your head already, even before seeing the definition of a matrix representation. However, with a formal definition of a matrix representation (Definition MR), and a fundamental theorem to go with it (Theorem FTMR), we can be formal about the relationship, using the idea of isomorphic vector spaces (Definition IVS). Here are the twin theorems.
Theorem KNSI Kernel and Null Space Isomorphism
Suppose that T: U → V is a linear transformation, B is a basis for U of size n, and C is a basis for V. Then the kernel of T is isomorphic to the null space of M^{T}_{B,C},

K(T) ≅ N(M^{T}_{B,C})

Proof. To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The kernel of the linear transformation T, K(T), is a subspace of U, while the null space of the
matrix representation, N(M^{T}_{B,C}), is a subspace of C^n. The function ρ_B is defined as a function from U to C^n, but we can just as well employ the definition of ρ_B as a function from K(T) to N(M^{T}_{B,C}).

We must first ensure that if we choose an input for ρ_B from K(T), then the output will be an element of N(M^{T}_{B,C}). So suppose that u ∈ K(T). Then

M^{T}_{B,C} ρ_B(u) = ρ_C(T(u))     Theorem FTMR
    = ρ_C(0)                       Definition KLT
    = 0                            Theorem LTTZZ

This says that ρ_B(u) ∈ N(M^{T}_{B,C}), as desired.

The restriction in the size of the domain and codomain of ρ_B will not affect the fact that ρ_B is a linear transformation (Theorem VRLT), nor will it affect the fact that ρ_B is injective (Theorem VRI). Something must be done though to verify that ρ_B is surjective. To this end, appeal to the definition of surjective (Definition SLT), and suppose that we have an element of the codomain, x ∈ N(M^{T}_{B,C}) ⊆ C^n, and we wish to find an element of the domain with x as its image. We now show that the desired element of the domain is u = ρ_B^{−1}(x). First, verify that u ∈ K(T),

T(u) = T(ρ_B^{−1}(x))
    = ρ_C^{−1}(M^{T}_{B,C}(ρ_B(ρ_B^{−1}(x))))     Theorem FTMR
    = ρ_C^{−1}(M^{T}_{B,C}(I_{C^n}(x)))           Definition IVLT
    = ρ_C^{−1}(M^{T}_{B,C} x)                     Definition IDLT
    = ρ_C^{−1}(0_{C^n})                           Definition KLT
    = 0_V                                         Theorem LTTZZ

Second, verify that the proposed isomorphism, ρ_B, takes u to x,

ρ_B(u) = ρ_B(ρ_B^{−1}(x))     Substitution
    = I_{C^n}(x)              Definition IVLT
    = x                       Definition IDLT

With ρ_B demonstrated to be an injective and surjective linear transformation from K(T) to N(M^{T}_{B,C}), Theorem ILTIS tells us ρ_B is invertible, and so by Definition IVS, we say K(T) and N(M^{T}_{B,C}) are isomorphic. ■
Example KVMR Kernel via matrix representation
Consider the kernel of the linear transformation T: M_22 → P_2, given by

T([a, b; c, d]) = (2a − b + c − 5d) + (a + 4b + 5c + 2d)x + (3a − 2b + c − 8d)x^2
We will begin with a matrix representation of T relative to the bases for M_22 and P_2 (respectively),

B = { [1, 2; −1, −1], [1, 3; −1, −4], [1, 2; 0, −2], [2, 5; −2, −4] }
C = { 1 + x + x^2, 2 + 3x, −1 − 2x^2 }

Then,

ρ_C(T([1, 2; −1, −1])) = ρ_C(4 + 2x + 6x^2)
    = ρ_C(2(1 + x + x^2) + 0(2 + 3x) + (−2)(−1 − 2x^2))
    = [2; 0; −2]

ρ_C(T([1, 3; −1, −4])) = ρ_C(18 + 28x^2)
    = ρ_C((−24)(1 + x + x^2) + 8(2 + 3x) + (−26)(−1 − 2x^2))
    = [−24; 8; −26]

ρ_C(T([1, 2; 0, −2])) = ρ_C(10 + 5x + 15x^2)
    = ρ_C(5(1 + x + x^2) + 0(2 + 3x) + (−5)(−1 − 2x^2))
    = [5; 0; −5]

ρ_C(T([2, 5; −2, −4])) = ρ_C(17 + 4x + 26x^2)
    = ρ_C((−8)(1 + x + x^2) + 4(2 + 3x) + (−17)(−1 − 2x^2))
    = [−8; 4; −17]

So the matrix representation of T (relative to B and C) is

M^{T}_{B,C} = [2, −24, 5, −8; 0, 8, 0, 4; −2, −26, −5, −17]
We know from Theorem KNSI that the kernel of the linear transformation T is isomorphic to the null space of the matrix representation M^{T}_{B,C}, and by studying the proof of Theorem KNSI we learn that ρ_B is an isomorphism between these null spaces. Rather than trying to compute the kernel of T using definitions and techniques from Chapter LT, we will instead analyze the null space of M^{T}_{B,C} using techniques from way back in Chapter V. First row-reduce M^{T}_{B,C},

[2, −24, 5, −8; 0, 8, 0, 4; −2, −26, −5, −17]  --RREF-->  [1, 0, 5/2, 2; 0, 1, 0, 1/2; 0, 0, 0, 0]

So, by Theorem BNS, a basis for N(M^{T}_{B,C}) is

{ [−5/2; 0; 1; 0], [−2; −1/2; 0; 1] }
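Before converting these coordinate vectors back into matrices, it is reassuring to confirm that both really lie in the null space, i.e. that the matrix-vector product with each one is the zero vector. A small sketch using exact rational arithmetic from the standard library (the names matvec, z1, z2 are ours, not the text's):

```python
from fractions import Fraction as F

# The representation M^T_{B,C} from Example KVMR and the two
# null space basis vectors produced by Theorem BNS.
M = [[2, -24, 5, -8],
     [0, 8, 0, 4],
     [-2, -26, -5, -17]]

def matvec(A, v):
    """Matrix-vector product, computed with exact fractions."""
    return [sum(F(A[i][j]) * v[j] for j in range(len(v)))
            for i in range(len(A))]

z1 = [F(-5, 2), F(0), F(1), F(0)]
z2 = [F(-2), F(-1, 2), F(0), F(1)]

assert matvec(M, z1) == [0, 0, 0]
assert matvec(M, z2) == [0, 0, 0]
```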
We can now convert this basis of N(M^{T}_{B,C}) into a basis of K(T) by applying ρ_B^{−1} to each element of the basis,

ρ_B^{−1}([−5/2; 0; 1; 0]) = (−5/2)[1, 2; −1, −1] + 0[1, 3; −1, −4] + 1[1, 2; 0, −2] + 0[2, 5; −2, −4]
    = [−3/2, −3; 5/2, 1/2]

ρ_B^{−1}([−2; −1/2; 0; 1]) = (−2)[1, 2; −1, −1] + (−1/2)[1, 3; −1, −4] + 0[1, 2; 0, −2] + 1[2, 5; −2, −4]
    = [−1/2, −1/2; 1/2, 0]

So the set

{ [−3/2, −3; 5/2, 1/2], [−1/2, −1/2; 1/2, 0] }

is a basis for K(T). Just for fun, you might evaluate T with each of these two basis vectors and verify that the output is the zero polynomial (Exercise MR.C10). △
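The suggested check is easy to script as well. Below, the function t is our encoding of the rule for T from this example; applying it to each kernel basis vector should return the coefficients of the zero polynomial:

```python
from fractions import Fraction as F

def t(m):
    """Apply T from Example KVMR to a 2x2 matrix [[a, b], [c, d]];
    returns the coefficients (constant, x, x^2) of the image polynomial."""
    (a, b), (c, d) = m
    return (2*a - b + c - 5*d, a + 4*b + 5*c + 2*d, 3*a - 2*b + c - 8*d)

# The two kernel basis vectors computed above.
k1 = [[F(-3, 2), F(-3)], [F(5, 2), F(1, 2)]]
k2 = [[F(-1, 2), F(-1, 2)], [F(1, 2), F(0)]]

assert t(k1) == (0, 0, 0)    # the zero polynomial
assert t(k2) == (0, 0, 0)    # the zero polynomial
```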
An entirely similar result applies to the range of a linear transformation and the column space of a matrix representation of the linear transformation.

Theorem RCSI Range and Column Space Isomorphism
Suppose that T: U → V is a linear transformation, B is a basis for U of size n, and C is a basis for V of size m. Then the range of T is isomorphic to the column space of M^{T}_{B,C},

R(T) ≅ C(M^{T}_{B,C})
Proof. To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The range of the linear transformation T, R(T), is a subspace of V, while the column space of the matrix representation, C(M^{T}_{B,C}), is a subspace of C^m. The function ρ_C is defined as a function from V to C^m, but we can just as well employ the definition of ρ_C as a function from R(T) to C(M^{T}_{B,C}).

We must first ensure that if we choose an input for ρ_C from R(T), then the output will be an element of C(M^{T}_{B,C}). So suppose that v ∈ R(T). Then there is a vector u ∈ U such that T(u) = v. Consider

M^{T}_{B,C} ρ_B(u) = ρ_C(T(u))     Theorem FTMR
    = ρ_C(v)                       Definition RLT

This says that ρ_C(v) ∈ C(M^{T}_{B,C}), as desired.

The restriction in the size of the domain and codomain will not affect the fact that ρ_C is a linear transformation (Theorem VRLT), nor will it affect the fact that ρ_C is injective (Theorem VRI). Something must be done though to verify that ρ_C is surjective. This all gets a bit confusing, since the domain of our isomorphism is the range of the linear transformation, so think about your objects as you go. To establish that ρ_C is surjective, appeal to the definition of a surjective linear transformation (Definition SLT), and suppose that we have an element of the codomain, y ∈ C(M^{T}_{B,C}) ⊆ C^m, and we wish to find an element of the domain with y as its image. Since y ∈ C(M^{T}_{B,C}), there exists a vector x ∈ C^n with M^{T}_{B,C} x = y. We now show that the desired element of the domain is v = ρ_C^{−1}(y). First, verify that v ∈ R(T) by applying T to u = ρ_B^{−1}(x),

T(u) = T(ρ_B^{−1}(x))
    = ρ_C^{−1}(M^{T}_{B,C}(ρ_B(ρ_B^{−1}(x))))     Theorem FTMR
    = ρ_C^{−1}(M^{T}_{B,C}(I_{C^n}(x)))           Definition IVLT
    = ρ_C^{−1}(M^{T}_{B,C} x)                     Definition IDLT
    = ρ_C^{−1}(y)                                 Definition CSM
    = v                                           Substitution

Second, verify that the proposed isomorphism, ρ_C, takes v to y,

ρ_C(v) = ρ_C(ρ_C^{−1}(y))     Substitution
    = I_{C^m}(y)              Definition IVLT
    = y                       Definition IDLT

With ρ_C demonstrated to be an injective and surjective linear transformation from R(T) to C(M^{T}_{B,C}), Theorem ILTIS tells us ρ_C is invertible, and so by Definition IVS, we say R(T) and C(M^{T}_{B,C}) are isomorphic. ■
Example RVMR Range via matrix representation
In this example, we will recycle the linear transformation T and the bases B and C of Example KVMR, but now we will compute the range of T: M_22 → P_2, given by

T([a, b; c, d]) = (2a − b + c − 5d) + (a + 4b + 5c + 2d)x + (3a − 2b + c − 8d)x^2

With bases B and C,

B = { [1, 2; −1, −1], [1, 3; −1, −4], [1, 2; 0, −2], [2, 5; −2, −4] }
C = { 1 + x + x^2, 2 + 3x, −1 − 2x^2 }

we obtain the matrix representation

M^{T}_{B,C} = [2, −24, 5, −8; 0, 8, 0, 4; −2, −26, −5, −17]
We know from Theorem RCSI that the range of the linear transformation T is isomorphic to the column space of the matrix representation M^{T}_{B,C}, and by studying the proof of Theorem RCSI we learn that ρ_C is an isomorphism between these subspaces. Notice that since the range is a subspace of the codomain, we will employ ρ_C as the isomorphism, rather than ρ_B, which was the correct choice for an isomorphism between the null spaces of Example KVMR.

Rather than trying to compute the range of T using definitions and techniques from Chapter LT, we will instead analyze the column space of M^{T}_{B,C} using techniques from way back in Chapter M. First row-reduce (M^{T}_{B,C})^t,

[2, 0, −2; −24, 8, −26; 5, 0, −5; −8, 4, −17]  --RREF-->  [1, 0, −1; 0, 1, −25/4; 0, 0, 0; 0, 0, 0]

Now employ Theorem CSRST and Theorem BRS (there are other methods we could choose here to compute the column space, such as Theorem BCS) to obtain the basis for C(M^{T}_{B,C}),

{ [1; 0; −1], [0; 1; −25/4] }
We can now convert this basis of C(M^{T}_{B,C}) into a basis of R(T) by applying ρ_C^{−1} to each element of the basis,

ρ_C^{−1}([1; 0; −1]) = (1 + x + x^2) − (−1 − 2x^2) = 2 + x + 3x^2
ρ_C^{−1}([0; 1; −25/4]) = (2 + 3x) − (25/4)(−1 − 2x^2) = 33/4 + 3x + (25/2)x^2

So the set

{ 2 + x + 3x^2, 33/4 + 3x + (25/2)x^2 }

is a basis for R(T). △
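A quick sanity check on the column space computation: every column of M^{T}_{B,C} should be a combination of the two claimed basis vectors, and since those vectors have their leading 1s in positions 1 and 2, the coefficients for any column can simply be read off its first two entries. A sketch with exact rational arithmetic (the names b1, b2 are ours, not the text's):

```python
from fractions import Fraction as F

M = [[2, -24, 5, -8],
     [0, 8, 0, 4],
     [-2, -26, -5, -17]]          # M^T_{B,C}

# Claimed basis for the column space from Example RVMR.
b1 = [F(1), F(0), F(-1)]
b2 = [F(0), F(1), F(-25, 4)]

for j in range(4):
    col = [F(M[i][j]) for i in range(3)]
    # Coefficients read from the pivot positions (entries 1 and 2).
    c1, c2 = col[0], col[1]
    assert all(col[i] == c1 * b1[i] + c2 * b2[i] for i in range(3))
```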
Theorem KNSI and Theorem RCSI can be viewed as further formal evidence for the Coordinatization Principle, though they are not direct consequences.

Diagram KRI is meant to suggest Theorem KNSI and Theorem RCSI, in addition to their proofs (and so carries the same notation as the statements of these two theorems). The dashed lines indicate a subspace relationship, with the smaller vector space lower down in the diagram. The central square is highly reminiscent of Diagram FTMR. Each of the four vector representations is an isomorphism, so the inverse linear transformation could be depicted with an arrow pointing in the other direction. The four vector spaces across the bottom are familiar from the earliest days of the course, while the four vector spaces across the top are completely abstract. The vector representations that are restrictions (far left and far right) are the functions shown to be invertible representations as the key technique in the proofs of Theorem KNSI and Theorem RCSI. So this diagram could be helpful as you study those two proofs.
    K(T)  ⊆  U ──────T──────> V  ⊇  R(T)
      │      │                │      │
 ρ_B|_K(T)  ρ_B              ρ_C   ρ_C|_R(T)
      ↓      ↓                ↓      ↓
 N(M^{T}_{B,C}) ⊆ C^n ──M^{T}_{B,C}──> C^m ⊇ C(M^{T}_{B,C})

Diagram KRI: Kernel and Range Isomorphisms
Subsection IVLT
Invertible Linear Transformations

We have seen, both in theorems and in examples, that questions about linear transformations are often equivalent to questions about matrices. It is the matrix representation of a linear transformation that makes this idea precise. Here is our final theorem that solidifies this connection.
Theorem IMR Invertible Matrix Representations
Suppose that T: U → V is a linear transformation, B is a basis for U and C is a basis for V. Then T is an invertible linear transformation if and only if the matrix representation of T relative to B and C, M^{T}_{B,C}, is an invertible matrix. When T is invertible,

M^{T^{−1}}_{C,B} = (M^{T}_{B,C})^{−1}
Proof. (⇒) Suppose T is invertible, so the inverse linear transformation T^{−1}: V → U exists (Definition IVLT). Both linear transformations have matrix representations relative to the bases of U and V, namely M^{T}_{B,C} and M^{T^{−1}}_{C,B} (Definition MR). Then

M^{T^{−1}}_{C,B} M^{T}_{B,C} = M^{T^{−1}∘T}_{B,B}              Theorem MRCLT
    = M^{I_U}_{B,B}                                            Definition IVLT
    = [ρ_B(I_U(u_1)) | ρ_B(I_U(u_2)) | ... | ρ_B(I_U(u_n))]    Definition MR
    = [ρ_B(u_1) | ρ_B(u_2) | ρ_B(u_3) | ... | ρ_B(u_n)]        Definition IDLT
    = [e_1 | e_2 | e_3 | ... | e_n]                            Definition VR
    = I_n                                                      Definition IM

And

M^{T}_{B,C} M^{T^{−1}}_{C,B} = M^{T∘T^{−1}}_{C,C}              Theorem MRCLT
    = M^{I_V}_{C,C}                                            Definition IVLT
    = [ρ_C(I_V(v_1)) | ρ_C(I_V(v_2)) | ... | ρ_C(I_V(v_n))]    Definition MR
    = [ρ_C(v_1) | ρ_C(v_2) | ρ_C(v_3) | ... | ρ_C(v_n)]        Definition IDLT
    = [e_1 | e_2 | e_3 | ... | e_n]                            Definition VR
    = I_n                                                      Definition IM

These two equations show that M^{T}_{B,C} and M^{T^{−1}}_{C,B} are inverse matrices (Definition MI) and establish that when T is invertible, then M^{T^{−1}}_{C,B} = (M^{T}_{B,C})^{−1}.

(⇐) Suppose now that M^{T}_{B,C} is an invertible matrix and hence nonsingular (Theorem NI). We compute the nullity of T,

n(T) = dim(K(T))              Definition KLT
    = dim(N(M^{T}_{B,C}))     Theorem KNSI
    = n(M^{T}_{B,C})          Definition NOM
    = 0                       Theorem RNNM
So the kernel of T is trivial, and by Theorem KILT, T is injective.
We now compute the rank of T,

r(T) = dim(R(T))              Definition RLT
    = dim(C(M^{T}_{B,C}))     Theorem RCSI
    = r(M^{T}_{B,C})          Definition ROM
    = dim(V)                  Theorem RNNM

Since the dimension of the range of T equals the dimension of the codomain V, by Theorem EDYES, R(T) = V, which says that T is surjective by Theorem RSLT.
Because T is both injective and surjective, by Theorem ILTIS, T is invertible. ■
By now, the connections between matrices and linear transformations should be starting to become more transparent, and you may have already recognized the invertibility of a matrix as being tantamount to the invertibility of the associated matrix representation. The next example shows how to apply this theorem to the problem of actually building a formula for the inverse of an invertible linear transformation.
Example ILTVR Inverse of a linear transformation via a representation
Consider the linear transformation

R: P_3 → M_22,   R(a + bx + cx^2 + dx^3) = [a + b − c + 2d, 2a + 3b − 2c + 3d; a + b + 2d, −a + b + 2c − 5d]
If we wish to quickly find a formula for the inverse of R (presuming it exists), then choosing "nice" bases will work best. So build a matrix representation of R relative to the bases B and C,

B = { 1, x, x^2, x^3 }
C = { [1, 0; 0, 0], [0, 1; 0, 0], [0, 0; 1, 0], [0, 0; 0, 1] }

Then,

ρ_C(R(1)) = ρ_C([1, 2; 1, −1]) = [1; 2; 1; −1]
ρ_C(R(x)) = ρ_C([1, 3; 1, 1]) = [1; 3; 1; 1]

ρ_C(R(x^2)) = ρ_C([−1, −2; 0, 2]) = [−1; −2; 0; 2]

ρ_C(R(x^3)) = ρ_C([2, 3; 2, −5]) = [2; 3; 2; −5]

So a representation of R is

M^{R}_{B,C} = [1, 1, −1, 2; 2, 3, −2, 3; 1, 1, 0, 2; −1, 1, 2, −5]
The matrix M^{R}_{B,C} is invertible (as you can check), so we know for sure that R is invertible by Theorem IMR. Furthermore,

M^{R^{−1}}_{C,B} = (M^{R}_{B,C})^{−1} = [1, 1, −1, 2; 2, 3, −2, 3; 1, 1, 0, 2; −1, 1, 2, −5]^{−1} = [20, −7, −2, 3; −8, 3, 1, −1; −1, 0, 1, 0; −6, 2, 1, −1]
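The "as you can check" is a one-liner by machine: multiplying the representation by the claimed inverse, in both orders, should give the 4×4 identity matrix. A minimal sketch in plain Python (the names matmul, M, Mi are ours, not the text's):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M = [[1, 1, -1, 2],
     [2, 3, -2, 3],
     [1, 1, 0, 2],
     [-1, 1, 2, -5]]              # M^R_{B,C}
Mi = [[20, -7, -2, 3],
      [-8, 3, 1, -1],
      [-1, 0, 1, 0],
      [-6, 2, 1, -1]]             # the claimed inverse

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
assert matmul(Mi, M) == I4
assert matmul(M, Mi) == I4
```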
We can use this representation of the inverse linear transformation, in concert with Theorem FTMR, to determine an explicit formula for the inverse itself,

R^{−1}([a, b; c, d]) = ρ_B^{−1}(M^{R^{−1}}_{C,B} ρ_C([a, b; c, d]))     Theorem FTMR
    = ρ_B^{−1}((M^{R}_{B,C})^{−1} ρ_C([a, b; c, d]))                    Theorem IMR
    = ρ_B^{−1}((M^{R}_{B,C})^{−1} [a; b; c; d])                         Definition VR
    = ρ_B^{−1}([20, −7, −2, 3; −8, 3, 1, −1; −1, 0, 1, 0; −6, 2, 1, −1] [a; b; c; d])     Definition MI
    = ρ_B^{−1}([20a − 7b − 2c + 3d; −8a + 3b + c − d; −a + c; −6a + 2b + c − d])          Definition MVP
    = (20a − 7b − 2c + 3d) + (−8a + 3b + c − d)x + (−a + c)x^2 + (−6a + 2b + c − d)x^3    Definition VR
You might look back at Example AIVLT, where we first witnessed the inverse of a linear transformation, and recognize that the inverse (S) was built using the method of Example ILTVR with a matrix representation of T. △
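A formula derived this way is easy to test: applying the formula for R^{−1} after R (and the other way around) should recover the input. A short sketch, with R and R_inv as our encodings of the two rules from the example:

```python
def R(a, b, c, d):
    """R from Example ILTVR: entries of the 2x2 image, row by row."""
    return (a + b - c + 2*d, 2*a + 3*b - 2*c + 3*d,
            a + b + 2*d, -a + b + 2*c - 5*d)

def R_inv(a, b, c, d):
    """The formula for R^{-1} derived above, applied to [a, b; c, d]."""
    return (20*a - 7*b - 2*c + 3*d, -8*a + 3*b + c - d,
            -a + c, -6*a + 2*b + c - d)

# Round-trips on sample inputs recover the original coefficients.
assert R_inv(*R(1, 2, 3, 4)) == (1, 2, 3, 4)
assert R(*R_inv(5, -1, 0, 7)) == (5, -1, 0, 7)
```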
Theorem IMILT Invertible Matrices, Invertible Linear Transformation
Suppose that A is a square matrix of size n and T: C^n → C^n is the linear transformation defined by T(x) = Ax. Then A is an invertible matrix if and only if T is an invertible linear transformation.
Proof. Choose bases B = C = {e_1, e_2, e_3, ..., e_n} consisting of the standard unit vectors as a basis of C^n (Theorem SUVB) and build a matrix representation of T relative to B and C. Then

ρ_C(T(e_i)) = ρ_C(A e_i) = ρ_C(A_i) = A_i

So the matrix representation of T, relative to B and C, is simply M^{T}_{B,C} = A. With this observation, the proof becomes a specialization of Theorem IMR,

T is invertible  ⟺  M^{T}_{B,C} is invertible  ⟺  A is invertible

■
This theorem may seem gratuitous. Why state such a special case of Theorem IMR? Because it adds another condition to our NMEx series of theorems, and in some ways it is the most fundamental expression of what it means for a matrix to be nonsingular — the associated linear transformation is invertible. This is our final update.
Theorem NME9 Nonsingular Matrix Equivalences, Round 9
Suppose that A is a square matrix of size n. The following are equivalent.

1. A is nonsingular.
2. A row-reduces to the identity matrix.
3. The null space of A contains only the zero vector, N(A) = {0}.
4. The linear system LS(A, b) has a unique solution for every possible choice of b.
5. The columns of A are a linearly independent set.
6. A is invertible.
7. The column space of A is C^n, C(A) = C^n.
8. The columns of A are a basis for C^n.
9. The rank of A is n, r(A) = n.
10. The nullity of A is zero, n(A) = 0.
11. The determinant of A is nonzero, det(A) ≠ 0.
12. λ = 0 is not an eigenvalue of A.
13. The linear transformation T: C^n → C^n defined by T(x) = Ax is invertible.

Proof. By Theorem IMILT, the new addition to this list is equivalent to the
statement that A is invertible, so we can expand Theorem NME8. ∎
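Several of the NME9 conditions are easy to check numerically. The following
sketch (not part of the text; it uses NumPy, and the matrix A is an arbitrary
illustration) verifies a few of the equivalent conditions at once for one
nonsingular matrix.

```python
import numpy as np

# An arbitrary nonsingular matrix chosen for illustration
A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
n = A.shape[0]

# Condition 9: the rank of A is n
assert np.linalg.matrix_rank(A) == n
# Condition 11: the determinant is nonzero
assert abs(np.linalg.det(A)) > 1e-12
# Condition 12: zero is not an eigenvalue
assert not np.isclose(np.linalg.eigvals(A), 0).any()
# Condition 4: LS(A, b) has a (unique) solution for this choice of b
b = np.array([1.0, 0.0])
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)
```

Failing any one of these checks would, by Theorem NME9, make all of the others
fail as well.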
Reading Questions

1. Why does Theorem FTMR deserve the moniker "fundamental"?

2. Find the matrix representation, M^T_{B,C}, of the linear transformation

     T: C^2 → C^2,  T([x_1; x_2]) = [2x_1 - x_2; 3x_1 + 2x_2]

   relative to the bases

     B = { [2; 3], [-1; 2] }    C = { [1; 0], [1; 1] }

3. What is the second "surprise," and why is it surprising?

Exercises
C10 Example KVMR concludes with a basis for the kernel of the linear
transformation T. Compute the value of T for each of these two basis vectors.
Did you get what you expected?

C20† Compute the matrix representation of T relative to the bases B and C.

  T: P_3 → C^3,  T(a + bx + cx^2 + dx^3) = [2a - 3b + 4c - 2d; a + b - c + d; 3a + 2c - 3d]
  B = {1, x, x^2, x^3}    C = { [1; 0; 0], [1; 1; 0], [1; 1; 1] }
C21† Find a matrix representation of the linear transformation T relative to
the bases B and C.

  T: P_2 → C^2,  T(p(x)) = [p(1); p(3)]

  B = {2 - 5x + x^2, 1 + x - x^2, x^2}    C = { [3; 4], [2; 3] }
C22† Let S_22 be the vector space of 2 × 2 symmetric matrices. Build the matrix
representation of the linear transformation T: P_2 → S_22 relative to the bases
B and C and then use this matrix representation to compute T(3 + 5x - 2x^2).

  B = {1, 1 + x, 1 + x + x^2}    C = { [1 0; 0 0], [0 1; 1 0], [0 0; 0 1] }

  T(a + bx + cx^2) = [2a - b + c, a + 3b - c; a + 3b - c, a - c]
C25† Use a matrix representation to determine if the linear transformation
T: P_3 → M_22 is surjective.

  T(a + bx + cx^2 + dx^3) = [-a + 4b + c + 2d, 4a - b + 6c - d; a + 5b - 2c + 2d, a + 2c + 5d]
C30† Find bases for the kernel and range of the linear transformation S below.

  S: M_22 → P_2,  S([a b; c d]) = (a + 2b + 5c - 4d) + (3a - b + 8c + 2d)x + (a + b + 4c - 2d)x^2
C40† Let S_22 be the set of 2 × 2 symmetric matrices. Verify that the linear
transformation R is invertible and find R^{-1}.

  R: S_22 → P_2,  R([a b; b c]) = (a - b) + (2a - 3b - 2c)x + (a - b + c)x^2
C41† Prove that the linear transformation S is invertible. Then find a formula
for the inverse linear transformation, S^{-1}, by employing a matrix inverse.

  S: P_1 → M_12,  S(a + bx) = [3a + b  2a + b]
C42† The linear transformation R: M_12 → M_21 is invertible. Use a matrix
representation to determine a formula for the inverse linear transformation
R^{-1}: M_21 → M_12.

  R([a b]) = [a + 3b; 4a + 11b]
C50† Use a matrix representation to find a basis for the range of the linear
transformation L.

  L: M_22 → P_2,  L([a b; c d]) = (a + 2b + 4c + d) + (3a + c - 2d)x + (-a + b + 3c + 3d)x^2
C51 Use a matrix representation to find a basis for the kernel of the linear
transformation L.

  L: M_22 → P_2,  L([a b; c d]) = (a + 2b + 4c + d) + (3a + c - 2d)x + (-a + b + 3c + 3d)x^2
C52† Find a basis for the kernel of the linear transformation T: P_2 → M_22.

  T(a + bx + cx^2) = [a + 2b - 2c, 2a + 2b; -a + b - 4c, 3a + 2b + 2c]
M20† The linear transformation D performs differentiation on polynomials. Use
a matrix representation of D to find the rank and nullity of D.

  D: P_n → P_n,  D(p(x)) = p′(x)
M60 Suppose U and V are vector spaces and define a function Z: U → V by
Z(u) = 0_V for every u ∈ U. Then Exercise IVLT.M60 asks you to formulate the
theorem: Z is invertible if and only if U = {0_U} and V = {0_V}. What would a
matrix representation of Z look like in this case? How does Theorem IMR read in
this case?
M80 In light of Theorem KNSI and Theorem MRCLT, write a short comparison of
Exercise MM.T40 with Exercise ILT.T15.

M81 In light of Theorem RCSI and Theorem MRCLT, write a short comparison of
Exercise CRS.T40 with Exercise SLT.T15.

M82 In light of Theorem MRCLT and Theorem IMR, write a short comparison of
Theorem SS and Theorem ICLT.

M83 In light of Theorem MRCLT and Theorem IMR, write a short comparison of
Theorem NPNT and Exercise IVLT.T40.
T20† Construct a new solution to Exercise B.T50 along the following outline.
From the n × n matrix A, construct the linear transformation T: C^n → C^n,
T(x) = Ax. Use Theorem NI, Theorem IMILT and Theorem ILTIS to translate between
the nonsingularity of A and the surjectivity/injectivity of T. Then apply
Theorem ILTB and Theorem SLTB to connect these properties with bases.

T40 Theorem VSLT defines the vector space LT(U, V) containing all linear
transformations with domain U and codomain V. Suppose dim(U) = n and
dim(V) = m. Prove that LT(U, V) is isomorphic to M_mn, the vector space of all
m × n matrices (Example VSM). (Hint: we could have suggested this exercise in
Chapter LT, but have postponed it to this section. Why?)

T41 Theorem VSLT defines the vector space LT(U, V) containing all linear
transformations with domain U and codomain V. Determine a basis for LT(U, V).
(Hint: study Exercise MR.T40 first.)
T60 Create an entirely different proof of Theorem IMILT that relies on
Definition IVLT to establish the invertibility of T, and that relies on
Definition MI to establish the invertibility of A.

T80† Suppose that T: U → V and S: V → W are linear transformations, and that
B, C and D are bases for U, V, and W. Using only Definition MR, define matrix
representations for T and S. Using these two definitions, and Definition MR,
derive a matrix representation for the composition S ∘ T in terms of the
entries of the matrices M^T_{B,C} and M^S_{C,D}. Explain how you would use this
result to motivate a definition for matrix multiplication that is strikingly
similar to Theorem EMP.
Section CB
Change of Basis

We have seen in Section MR that a linear transformation can be represented by a
matrix, once we pick bases for the domain and codomain. How does the matrix
representation change if we choose different bases? Which bases lead to
especially nice representations? From the infinite possibilities, what is the
best possible representation? This section will begin to answer these
questions. But first we need to define eigenvalues for linear transformations
and the change-of-basis matrix.
Subsection EELT
Eigenvalues and Eigenvectors of Linear Transformations

We now define the notion of an eigenvalue and eigenvector of a linear
transformation. It should not be too surprising, especially if you remind
yourself of the close relationship between matrices and linear transformations.

Definition EELT Eigenvalue and Eigenvector of a Linear Transformation
Suppose that T: V → V is a linear transformation. Then a nonzero vector v ∈ V
is an eigenvector of T for the eigenvalue λ if T(v) = λv. □

We will see shortly the best method for computing the eigenvalues and
eigenvectors of a linear transformation, but for now, here are some examples to
verify that such things really do exist.
Example ELTBM Eigenvectors of linear transformation between matrices
Consider the linear transformation T: M_22 → M_22 defined by

  T([a b; c d]) = [-17a + 11b + 8c - 11d, -57a + 35b + 24c - 33d;
                   -14a + 10b + 6c - 10d, -41a + 25b + 16c - 23d]

and the vectors

  x_1 = [0 1; 0 1]   x_2 = [1 1; 1 0]   x_3 = [1 3; 2 3]   x_4 = [2 6; 1 4]

Then compute

  T(x_1) = T([0 1; 0 1]) = [0 2; 0 2] = 2x_1
  T(x_2) = T([1 1; 1 0]) = [2 2; 2 0] = 2x_2
  T(x_3) = T([1 3; 2 3]) = [-1 -3; -2 -3] = (-1)x_3
  T(x_4) = T([2 6; 1 4]) = [-4 -12; -2 -8] = (-2)x_4

So x_1, x_2, x_3, x_4 are eigenvectors of T with eigenvalues (respectively)
λ_1 = 2, λ_2 = 2, λ_3 = -1, λ_4 = -2. △
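The computations in Example ELTBM can be confirmed mechanically. Here is a
short check (a NumPy sketch, not part of the text) that applies the formula for
T to each x_i and compares against λ_i x_i.

```python
import numpy as np

def T(M):
    # The transformation of Example ELTBM, applied to a 2x2 matrix [a b; c d]
    a, b, c, d = M[0, 0], M[0, 1], M[1, 0], M[1, 1]
    return np.array([
        [-17*a + 11*b + 8*c - 11*d, -57*a + 35*b + 24*c - 33*d],
        [-14*a + 10*b + 6*c - 10*d, -41*a + 25*b + 16*c - 23*d],
    ])

eigenpairs = [
    (2,  np.array([[0, 1], [0, 1]])),   # x_1
    (2,  np.array([[1, 1], [1, 0]])),   # x_2
    (-1, np.array([[1, 3], [2, 3]])),   # x_3
    (-2, np.array([[2, 6], [1, 4]])),   # x_4
]

for lam, x in eigenpairs:
    assert np.array_equal(T(x), lam * x)   # T(x_i) = lambda_i x_i
```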
Here is another.

Example ELTBP Eigenvectors of linear transformation between polynomials
Consider the linear transformation R: P_2 → P_2 defined by

  R(a + bx + cx^2) = (15a + 8b - 4c) + (-12a - 6b + 3c)x + (24a + 14b - 7c)x^2

and the vectors

  w_1 = 1 - x + x^2    w_2 = x + 2x^2    w_3 = 1 + 4x^2

Then compute

  R(w_1) = R(1 - x + x^2) = 3 - 3x + 3x^2 = 3w_1
  R(w_2) = R(x + 2x^2) = 0 + 0x + 0x^2 = 0w_2
  R(w_3) = R(1 + 4x^2) = -1 - 4x^2 = (-1)w_3

So w_1, w_2, w_3 are eigenvectors of R with eigenvalues (respectively) λ_1 = 3,
λ_2 = 0, λ_3 = -1. Notice how the eigenvalue λ_2 = 0 indicates that the
eigenvector w_2 is a nontrivial element of the kernel of R, and therefore R is
not injective (Exercise CB.T15). △
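Example ELTBP can be checked the same way: representing a + bx + cx^2 by its
coefficient vector (a, b, c) turns R into multiplication by a 3 × 3 matrix (a
sketch, anticipating the matrix representations used later in this section).

```python
import numpy as np

# Represent p = a + bx + cx^2 by its coefficient vector (a, b, c); the
# transformation R of Example ELTBP then acts by this matrix:
R = np.array([[ 15,  8, -4],
              [-12, -6,  3],
              [ 24, 14, -7]])

w1 = np.array([1, -1, 1])   # 1 - x + x^2
w2 = np.array([0,  1, 2])   # x + 2x^2
w3 = np.array([1,  0, 4])   # 1 + 4x^2

assert np.array_equal(R @ w1, 3 * w1)     # eigenvalue 3
assert np.array_equal(R @ w2, 0 * w2)     # eigenvalue 0: w2 lies in the kernel
assert np.array_equal(R @ w3, -1 * w3)    # eigenvalue -1
```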
Of course, these examples are meant only to illustrate the definition of
eigenvectors and eigenvalues for linear transformations, and therefore beg the
question, "How would I find eigenvectors?" We will have an answer before we
finish this section. We need one more construction first.
Subsection CBM
Change-of-Basis Matrix

Given a vector space, we know we can usually find many different bases for the
vector space, some nice, some nasty. If we choose a single vector from this
vector space, we can build many different representations of the vector by
constructing the representations relative to different bases. How are these
different representations related to each other? A change-of-basis matrix
answers this question.

Definition CBM Change-of-Basis Matrix
Suppose that V is a vector space, and I_V: V → V is the identity linear
transformation on V. Let B = {v_1, v_2, v_3, ..., v_n} and C be two bases of V.
Then the change-of-basis matrix from B to C is the matrix representation of
I_V relative to B and C,

  C_{B,C} = M^{I_V}_{B,C}
          = [ρ_C(I_V(v_1)) | ρ_C(I_V(v_2)) | ρ_C(I_V(v_3)) | ... | ρ_C(I_V(v_n))]
          = [ρ_C(v_1) | ρ_C(v_2) | ρ_C(v_3) | ... | ρ_C(v_n)]  □
Notice that this definition is primarily about a single vector space (V) and
two bases of V (B, C). The linear transformation (I_V) is necessary but not
critical. As you might expect, this matrix has something to do with changing
bases. Here is the theorem that gives the matrix its name (not the other way
around).

Theorem CB Change-of-Basis
Suppose that v is a vector in the vector space V and B and C are bases of V.
Then

  ρ_C(v) = C_{B,C} ρ_B(v)
Proof.

  ρ_C(v) = ρ_C(I_V(v))           Definition IDLT
         = M^{I_V}_{B,C} ρ_B(v)  Theorem FTMR
         = C_{B,C} ρ_B(v)        Definition CBM  ∎

So the change-of-basis matrix can be used with matrix multiplication to convert
a vector representation of a vector (v) relative to one basis (ρ_B(v)) to a
representation of the same vector relative to a second basis (ρ_C(v)).
Theorem ICBM Inverse of Change-of-Basis Matrix
Suppose that V is a vector space, and B and C are bases of V. Then the
change-of-basis matrix C_{B,C} is nonsingular and

  C_{B,C}^{-1} = C_{C,B}

Proof. The linear transformation I_V: V → V is invertible, and its inverse is
itself, I_V (check this!). So by Theorem IMR, the matrix M^{I_V}_{B,C} = C_{B,C}
is invertible. Theorem NI says an invertible matrix is nonsingular.
Then

  C_{B,C}^{-1} = (M^{I_V}_{B,C})^{-1}   Definition CBM
               = M^{I_V^{-1}}_{C,B}     Theorem IMR
               = M^{I_V}_{C,B}          Definition IDLT
               = C_{C,B}                Definition CBM  ∎
Example CBP Change of basis with polynomials
The vector space P_4 (Example VSP) has two nice bases (Example BP),

  B = {1, x, x^2, x^3, x^4}
  C = {1, 1 + x, 1 + x + x^2, 1 + x + x^2 + x^3, 1 + x + x^2 + x^3 + x^4}

To build the change-of-basis matrix between B and C, we must first build a
vector representation of each vector in B relative to C,

  ρ_C(1)   = ρ_C((1)(1)) = [1; 0; 0; 0; 0]
  ρ_C(x)   = ρ_C((-1)(1) + (1)(1 + x)) = [-1; 1; 0; 0; 0]
  ρ_C(x^2) = ρ_C((-1)(1 + x) + (1)(1 + x + x^2)) = [0; -1; 1; 0; 0]
  ρ_C(x^3) = ρ_C((-1)(1 + x + x^2) + (1)(1 + x + x^2 + x^3)) = [0; 0; -1; 1; 0]
  ρ_C(x^4) = ρ_C((-1)(1 + x + x^2 + x^3) + (1)(1 + x + x^2 + x^3 + x^4)) = [0; 0; 0; -1; 1]

Then we package up these vectors as the columns of a matrix,

  C_{B,C} = [ 1 -1  0  0  0 ]
            [ 0  1 -1  0  0 ]
            [ 0  0  1 -1  0 ]
            [ 0  0  0  1 -1 ]
            [ 0  0  0  0  1 ]
Now, to illustrate Theorem CB, consider the vector u = 5 - 3x + 2x^2 + 8x^3 - 3x^4.
We can build the representation of u relative to B easily,

  ρ_B(u) = ρ_B(5 - 3x + 2x^2 + 8x^3 - 3x^4) = [5; -3; 2; 8; -3]

Applying Theorem CB, we obtain a second representation of u, but now relative
to C,

  ρ_C(u) = C_{B,C} ρ_B(u)                Theorem CB
         = C_{B,C} [5; -3; 2; 8; -3]
         = [8; -5; -6; 11; -3]           Definition MVP

We can check our work by unraveling this second representation,

  u = ρ_C^{-1}(ρ_C(u))                   Definition IVLT
    = ρ_C^{-1}([8; -5; -6; 11; -3])
    = 8(1) + (-5)(1 + x) + (-6)(1 + x + x^2)
      + (11)(1 + x + x^2 + x^3) + (-3)(1 + x + x^2 + x^3 + x^4)   Definition VR
    = 5 - 3x + 2x^2 + 8x^3 - 3x^4
The change-of-basis matrix from C to B is actually easier to build. Grab each
vector in the basis C and form its representation relative to B,

  ρ_B(1) = ρ_B((1)1) = [1; 0; 0; 0; 0]
  ρ_B(1 + x) = ρ_B((1)1 + (1)x) = [1; 1; 0; 0; 0]
  ρ_B(1 + x + x^2) = ρ_B((1)1 + (1)x + (1)x^2) = [1; 1; 1; 0; 0]
  ρ_B(1 + x + x^2 + x^3) = ρ_B((1)1 + (1)x + (1)x^2 + (1)x^3) = [1; 1; 1; 1; 0]
  ρ_B(1 + x + x^2 + x^3 + x^4) = ρ_B((1)1 + (1)x + (1)x^2 + (1)x^3 + (1)x^4) = [1; 1; 1; 1; 1]

Then we package up these vectors as the columns of a matrix,

  C_{C,B} = [ 1 1 1 1 1 ]
            [ 0 1 1 1 1 ]
            [ 0 0 1 1 1 ]
            [ 0 0 0 1 1 ]
            [ 0 0 0 0 1 ]

We formed two representations of the vector u above, so we can again provide a
check on our computations by converting from the representation of u relative
to C to the representation of u relative to B,

  ρ_B(u) = C_{C,B} ρ_C(u)                Theorem CB
         = C_{C,B} [8; -5; -6; 11; -3]
         = [5; -3; 2; 8; -3]             Definition MVP
One more computation that is either a check on our work, or an illustration of
a theorem. The two change-of-basis matrices, C_{B,C} and C_{C,B}, should be
inverses of each other, according to Theorem ICBM. Here we go,

  C_{B,C} C_{C,B} = [ 1 -1  0  0  0 ] [ 1 1 1 1 1 ]   [ 1 0 0 0 0 ]
                    [ 0  1 -1  0  0 ] [ 0 1 1 1 1 ]   [ 0 1 0 0 0 ]
                    [ 0  0  1 -1  0 ] [ 0 0 1 1 1 ] = [ 0 0 1 0 0 ]
                    [ 0  0  0  1 -1 ] [ 0 0 0 1 1 ]   [ 0 0 0 1 0 ]
                    [ 0  0  0  0  1 ] [ 0 0 0 0 1 ]   [ 0 0 0 0 1 ]  △
The computations of the previous example are not meant to present any
labor-saving devices, but instead are meant to illustrate the utility of the
change-of-basis matrix. However, you might have noticed that C_{C,B} was easier
to compute than C_{B,C}. If you needed C_{B,C}, then you could first compute
C_{C,B} and then compute its inverse, which by Theorem ICBM, would equal
C_{B,C}.

Here is another illustrative example. We have been concentrating on working
with abstract vector spaces, but all of our theorems and techniques apply just
as well to C^m, the vector space of column vectors. We only need to use more
complicated bases than the standard unit vectors (Theorem SUVB) to make things
interesting.
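The strategy of computing the easy direction and inverting is simple to carry
out numerically. A sketch for Example CBP (using NumPy; coordinates relative to
B are just coefficient vectors):

```python
import numpy as np

# Coordinates relative to B = {1, x, x^2, x^3, x^4} are coefficient vectors,
# so the B-coordinates of the vectors of C fill an upper-triangular matrix of
# ones: this is C_{C,B}, the easy direction.
C_CB = np.triu(np.ones((5, 5), dtype=int))

# Theorem ICBM: the matrix for the other direction is the inverse.
C_BC = np.round(np.linalg.inv(C_CB)).astype(int)

# The vector u = 5 - 3x + 2x^2 + 8x^3 - 3x^4 of Example CBP
rho_B_u = np.array([5, -3, 2, 8, -3])
rho_C_u = C_BC @ rho_B_u        # Theorem CB: rho_C(u) = C_{B,C} rho_B(u)

print(rho_C_u.tolist())         # [8, -5, -6, 11, -3], matching the example
```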
Example CBCV Change of basis with column vectors
For the vector space C^4 we have the two bases,

  B = { [1; -2; 1; -2], [-1; 3; 1; 1], [2; -3; 3; -4], [-1; 3; 3; 0] }
  C = { [1; -6; -4; -1], [-4; 8; -5; 8], [-5; 13; -2; 9], [3; -7; 3; -6] }

The change-of-basis matrix from B to C requires writing each vector of B as a
linear combination of the vectors in C,

  ρ_C([1; -2; 1; -2]) = ρ_C( (1)[1; -6; -4; -1] + (-2)[-4; 8; -5; 8]
                             + (1)[-5; 13; -2; 9] + (-1)[3; -7; 3; -6] ) = [1; -2; 1; -1]
  ρ_C([-1; 3; 1; 1]) = ρ_C( (2)[1; -6; -4; -1] + (-3)[-4; 8; -5; 8]
                             + (3)[-5; 13; -2; 9] + (0)[3; -7; 3; -6] ) = [2; -3; 3; 0]
  ρ_C([2; -3; 3; -4]) = ρ_C( (1)[1; -6; -4; -1] + (-3)[-4; 8; -5; 8]
                             + (1)[-5; 13; -2; 9] + (-2)[3; -7; 3; -6] ) = [1; -3; 1; -2]
  ρ_C([-1; 3; 3; 0]) = ρ_C( (2)[1; -6; -4; -1] + (-2)[-4; 8; -5; 8]
                             + (4)[-5; 13; -2; 9] + (3)[3; -7; 3; -6] ) = [2; -2; 4; 3]

Then we package these vectors up as the change-of-basis matrix,

  C_{B,C} = [  1  2  1  2 ]
            [ -2 -3 -3 -2 ]
            [  1  3  1  4 ]
            [ -1  0 -2  3 ]
Now consider a single (arbitrary) vector y = [2; 6; -3; 4]. First, build the
vector representation of y relative to B. This will require writing y as a
linear combination of the vectors in B,

  ρ_B(y) = ρ_B([2; 6; -3; 4])
         = ρ_B( (-21)[1; -2; 1; -2] + (6)[-1; 3; 1; 1]
                + (11)[2; -3; 3; -4] + (-7)[-1; 3; 3; 0] )
         = [-21; 6; 11; -7]
Now, applying Theorem CB we can convert the representation of y relative to B
into a representation relative to C,

  ρ_C(y) = C_{B,C} ρ_B(y)                Theorem CB
         = [  1  2  1  2 ]
           [ -2 -3 -3 -2 ] [-21; 6; 11; -7]
           [  1  3  1  4 ]
           [ -1  0 -2  3 ]
         = [-12; 5; -20; -22]            Definition MVP

We could continue further with this example, perhaps by computing the
representation of y relative to the basis C directly as a check on our work
(Exercise CB.C20). Or we could choose another vector to play the role of y and
compute two different representations of this vector relative to the two bases
B and C. △
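Example CBCV's arithmetic is a good candidate for machine verification. In the
sketch below (NumPy, not part of the text), each "write v as a linear
combination of basis vectors" step becomes a call to a linear solver.

```python
import numpy as np

# The bases of Example CBCV, one basis vector per column
B = np.array([[ 1, -1,  2, -1],
              [-2,  3, -3,  3],
              [ 1,  1,  3,  3],
              [-2,  1, -4,  0]])
C = np.array([[ 1, -4, -5,  3],
              [-6,  8, 13, -7],
              [-4, -5, -2,  3],
              [-1,  8,  9, -6]])

# Writing every column of B as a linear combination of the columns of C
# means solving C X = B; the solution X is exactly C_{B,C}.
C_BC = np.round(np.linalg.solve(C, B)).astype(int)

y = np.array([2, 6, -3, 4])
rho_B_y = np.round(np.linalg.solve(B, y)).astype(int)  # coordinates of y in B
rho_C_y = C_BC @ rho_B_y                               # Theorem CB

print(rho_B_y.tolist())   # [-21, 6, 11, -7]
print(rho_C_y.tolist())   # [-12, 5, -20, -22]
```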
Subsection MRS
Matrix Representations and Similarity

Here is the main theorem of this section. It looks a bit involved at first
glance, but the proof should make you realize it is not all that complicated.
In any event, we are more interested in a special case.

Theorem MRCB Matrix Representation and Change of Basis
Suppose that T: U → V is a linear transformation, B and C are bases for U, and
D and E are bases for V. Then

  M^T_{B,D} = C_{E,D} M^T_{C,E} C_{B,C}

Proof.

  C_{E,D} M^T_{C,E} C_{B,C} = M^{I_V}_{E,D} M^T_{C,E} M^{I_U}_{B,C}   Definition CBM
                            = M^{I_V}_{E,D} M^{T∘I_U}_{B,E}           Theorem MRCLT
                            = M^{I_V}_{E,D} M^T_{B,E}                 Definition IDLT
                            = M^{I_V∘T}_{B,D}                         Theorem MRCLT
                            = M^T_{B,D}                               Definition IDLT  ∎
We will be most interested in a special case of this theorem (Theorem SCB),
but here is an example that illustrates the full generality of Theorem MRCB.

Example MRCM Matrix representations and change-of-basis matrices
Begin with two vector spaces, S_2, the subspace of M_22 containing all 2 × 2
symmetric matrices, and P_3 (Example VSP), the vector space of all polynomials
of degree 3 or less. Then define the linear transformation Q: S_2 → P_3 by

  Q([a b; b c]) = (5a - 2b + 6c) + (3a - b + 2c)x + (a + 3b - c)x^2 + (-4a + 2b + c)x^3

Here are two bases for each vector space, one nice, one nasty. First for S_2,

  B = { [5 -3; -3 -2], [2 -3; -3 0], [1 2; 2 4] }
  C = { [1 0; 0 0], [0 1; 1 0], [0 0; 0 1] }

and then for P_3,

  D = {2 + x - 2x^2 + 3x^3, -1 - 2x^2 + 3x^3, -3 - x + x^3, -x^2 + x^3}
  E = {1, x, x^2, x^3}
We will begin with a matrix representation of Q relative to C and E. We first
find vector representations of the elements of C relative to E,

  ρ_E(Q([1 0; 0 0])) = ρ_E(5 + 3x + x^2 - 4x^3) = [5; 3; 1; -4]
  ρ_E(Q([0 1; 1 0])) = ρ_E(-2 - x + 3x^2 + 2x^3) = [-2; -1; 3; 2]
  ρ_E(Q([0 0; 0 1])) = ρ_E(6 + 2x - x^2 + x^3) = [6; 2; -1; 1]

So

  M^Q_{C,E} = [  5 -2  6 ]
              [  3 -1  2 ]
              [  1  3 -1 ]
              [ -4  2  1 ]
Now we construct two change-of-basis matrices. First, C_{B,C} requires vector
representations of the elements of B, relative to C. Since C is a nice basis,
this is straightforward,

  ρ_C([5 -3; -3 -2]) = ρ_C( (5)[1 0; 0 0] + (-3)[0 1; 1 0] + (-2)[0 0; 0 1] ) = [5; -3; -2]
  ρ_C([2 -3; -3 0]) = ρ_C( (2)[1 0; 0 0] + (-3)[0 1; 1 0] + (0)[0 0; 0 1] ) = [2; -3; 0]
  ρ_C([1 2; 2 4]) = ρ_C( (1)[1 0; 0 0] + (2)[0 1; 1 0] + (4)[0 0; 0 1] ) = [1; 2; 4]

So

  C_{B,C} = [  5  2  1 ]
            [ -3 -3  2 ]
            [ -2  0  4 ]

The other change-of-basis matrix we will compute is C_{E,D}. However, since E
is a nice basis (and D is not) we will turn it around and instead compute
C_{D,E} and apply Theorem ICBM to use an inverse to compute C_{E,D}.

  ρ_E(2 + x - 2x^2 + 3x^3) = ρ_E((2)1 + (1)x + (-2)x^2 + (3)x^3) = [2; 1; -2; 3]
  ρ_E(-1 - 2x^2 + 3x^3) = ρ_E((-1)1 + (0)x + (-2)x^2 + (3)x^3) = [-1; 0; -2; 3]
  ρ_E(-3 - x + x^3) = ρ_E((-3)1 + (-1)x + (0)x^2 + (1)x^3) = [-3; -1; 0; 1]
  ρ_E(-x^2 + x^3) = ρ_E((0)1 + (0)x + (-1)x^2 + (1)x^3) = [0; 0; -1; 1]

So, we can package these column vectors up as a matrix to obtain

  C_{D,E} = [  2 -1 -3  0 ]
            [  1  0 -1  0 ]
            [ -2 -2  0 -1 ]
            [  3  3  1  1 ]

and then, by Theorem ICBM,

  C_{E,D} = (C_{D,E})^{-1} = [  1 -2  1  1 ]
                             [ -2  5 -1 -1 ]
                             [  1 -3  1  1 ]
                             [  2 -6 -1  0 ]
We are now in a position to apply Theorem MRCB. The matrix representation of Q
relative to B and D can be obtained as follows,

  M^Q_{B,D} = C_{E,D} M^Q_{C,E} C_{B,C}          Theorem MRCB

Computing the product of the last two factors first,

  M^Q_{C,E} C_{B,C} = [  19  16  25 ]
                      [  14   9   9 ]
                      [  -2  -7   3 ]
                      [ -28 -14   4 ]

and then

  M^Q_{B,D} = C_{E,D} (M^Q_{C,E} C_{B,C}) = [ -39 -23  14 ]
                                            [  62  34 -12 ]
                                            [ -53 -32   5 ]
                                            [ -44 -15  -7 ]

Now check our work by computing M^Q_{B,D} directly (Exercise CB.C21). △
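Theorem MRCB turns Example MRCM into three matrix multiplications and one
inverse, so it is easy to confirm with a few lines of code (a NumPy sketch; the
matrices are copied from the example above).

```python
import numpy as np

# Matrices computed in Example MRCM
M_Q_CE = np.array([[ 5, -2,  6],
                   [ 3, -1,  2],
                   [ 1,  3, -1],
                   [-4,  2,  1]])
C_BC = np.array([[ 5,  2,  1],
                 [-3, -3,  2],
                 [-2,  0,  4]])
C_DE = np.array([[ 2, -1, -3,  0],
                 [ 1,  0, -1,  0],
                 [-2, -2,  0, -1],
                 [ 3,  3,  1,  1]])

# Theorem ICBM: C_{E,D} = (C_{D,E})^{-1}
C_ED = np.round(np.linalg.inv(C_DE)).astype(int)

# Theorem MRCB: M^Q_{B,D} = C_{E,D} M^Q_{C,E} C_{B,C}
M_Q_BD = C_ED @ M_Q_CE @ C_BC
print(M_Q_BD.tolist())
# [[-39, -23, 14], [62, 34, -12], [-53, -32, 5], [-44, -15, -7]]
```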
Here is a special case of the previous theorem, where we choose U and V to<br />
be the same vector space, so the matrix representations and the change-of-basis<br />
matrices are all square of the same size.<br />
Theorem SCB Similarity and Change of Basis
Suppose that $T\colon V\to V$ is a linear transformation and B and C are bases of V. Then
\[ M^T_{B,B} = C_{B,C}^{-1}\, M^T_{C,C}\, C_{B,C} \]
Proof. In the conclusion of Theorem MRCB, replace D by B, and replace E by C,
\begin{align*}
M^T_{B,B} &= C_{C,B}\, M^T_{C,C}\, C_{B,C} && \text{Theorem MRCB}\\
&= C_{B,C}^{-1}\, M^T_{C,C}\, C_{B,C} && \text{Theorem ICBM}
\end{align*}
■
This is the third surprise of this chapter. Theorem SCB considers the special case where a linear transformation has the same vector space for the domain and codomain (V). We build a matrix representation of T using the basis B simultaneously for both the domain and codomain ($M^T_{B,B}$), and then we build a second matrix representation of T, now using the basis C for both the domain and codomain ($M^T_{C,C}$). Then these two representations are related via a similarity transformation (Definition SIM) using a change-of-basis matrix ($C_{B,C}$)!
Example MRBE Matrix representation with basis of eigenvectors
We return to the linear transformation $T\colon M_{22}\to M_{22}$ of Example ELTBM defined by
\[ T\left(\begin{bmatrix} a & b\\ c & d \end{bmatrix}\right) = \begin{bmatrix} -17a + 11b + 8c - 11d & -57a + 35b + 24c - 33d\\ -14a + 10b + 6c - 10d & -41a + 25b + 16c - 23d \end{bmatrix} \]
In Example ELTBM we showcased four eigenvectors of T. We will now put these four vectors in a set,
\[ B = \{x_1,\, x_2,\, x_3,\, x_4\} = \left\{ \begin{bmatrix} 0 & 1\\ 0 & 1 \end{bmatrix},\, \begin{bmatrix} 1 & 1\\ 1 & 0 \end{bmatrix},\, \begin{bmatrix} 1 & 3\\ 2 & 3 \end{bmatrix},\, \begin{bmatrix} 2 & 6\\ 1 & 4 \end{bmatrix} \right\} \]
Check that B is a basis of $M_{22}$ by first establishing the linear independence of B and then employing Theorem G to get the spanning property easily. Here is a second set of 2 × 2 matrices, which also forms a basis of $M_{22}$ (Example BM),
\[ C = \{y_1,\, y_2,\, y_3,\, y_4\} = \left\{ \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix},\, \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix},\, \begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix},\, \begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix} \right\} \]
We can build two matrix representations of T, one relative to B and one relative to C. Each is easy, but for wildly different reasons. In our computation of the matrix representation relative to B we borrow some of our work in Example ELTBM. Here are the representations, then the explanation.
\begin{align*}
\rho_B(T(x_1)) &= \rho_B(2x_1) = \rho_B(2x_1 + 0x_2 + 0x_3 + 0x_4) = \begin{bmatrix} 2\\ 0\\ 0\\ 0 \end{bmatrix}\\
\rho_B(T(x_2)) &= \rho_B(2x_2) = \rho_B(0x_1 + 2x_2 + 0x_3 + 0x_4) = \begin{bmatrix} 0\\ 2\\ 0\\ 0 \end{bmatrix}\\
\rho_B(T(x_3)) &= \rho_B((-1)x_3) = \rho_B(0x_1 + 0x_2 + (-1)x_3 + 0x_4) = \begin{bmatrix} 0\\ 0\\ -1\\ 0 \end{bmatrix}\\
\rho_B(T(x_4)) &= \rho_B((-2)x_4) = \rho_B(0x_1 + 0x_2 + 0x_3 + (-2)x_4) = \begin{bmatrix} 0\\ 0\\ 0\\ -2 \end{bmatrix}
\end{align*}
So the resulting representation is
\[ M^T_{B,B} = \begin{bmatrix} 2 & 0 & 0 & 0\\ 0 & 2 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -2 \end{bmatrix} \]
Very pretty.
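Each column above rests on a claim from Example ELTBM, namely that $T(x_i) = \lambda_i x_i$. That claim can be checked directly against the formula for T; in this sketch (helper names are mine) the transformation acts on 2 × 2 matrices stored as lists of rows:

```python
def T(m):
    """The transformation T of Example ELTBM applied to [[a, b], [c, d]]."""
    a, b = m[0]
    c, d = m[1]
    return [[-17*a + 11*b + 8*c - 11*d, -57*a + 35*b + 24*c - 33*d],
            [-14*a + 10*b + 6*c - 10*d, -41*a + 25*b + 16*c - 23*d]]

def scale(k, m):
    return [[k * e for e in row] for row in m]

# the four eigenvectors of B paired with their eigenvalues
eigenpairs = [(2, [[0, 1], [0, 1]]), (2, [[1, 1], [1, 0]]),
              (-1, [[1, 3], [2, 3]]), (-2, [[2, 6], [1, 4]])]
print(all(T(x) == scale(lam, x) for lam, x in eigenpairs))  # → True
```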
Now for the matrix representation relative to C first compute,
\begin{align*}
\rho_C(T(y_1)) &= \rho_C\left(\begin{bmatrix} -17 & -57\\ -14 & -41 \end{bmatrix}\right)\\
&= \rho_C\left((-17)\begin{bmatrix}1&0\\0&0\end{bmatrix} + (-57)\begin{bmatrix}0&1\\0&0\end{bmatrix} + (-14)\begin{bmatrix}0&0\\1&0\end{bmatrix} + (-41)\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}-17\\-57\\-14\\-41\end{bmatrix}\\
\rho_C(T(y_2)) &= \rho_C\left(\begin{bmatrix} 11 & 35\\ 10 & 25 \end{bmatrix}\right)\\
&= \rho_C\left(11\begin{bmatrix}1&0\\0&0\end{bmatrix} + 35\begin{bmatrix}0&1\\0&0\end{bmatrix} + 10\begin{bmatrix}0&0\\1&0\end{bmatrix} + 25\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}11\\35\\10\\25\end{bmatrix}\\
\rho_C(T(y_3)) &= \rho_C\left(\begin{bmatrix} 8 & 24\\ 6 & 16 \end{bmatrix}\right)\\
&= \rho_C\left(8\begin{bmatrix}1&0\\0&0\end{bmatrix} + 24\begin{bmatrix}0&1\\0&0\end{bmatrix} + 6\begin{bmatrix}0&0\\1&0\end{bmatrix} + 16\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}8\\24\\6\\16\end{bmatrix}\\
\rho_C(T(y_4)) &= \rho_C\left(\begin{bmatrix} -11 & -33\\ -10 & -23 \end{bmatrix}\right)\\
&= \rho_C\left((-11)\begin{bmatrix}1&0\\0&0\end{bmatrix} + (-33)\begin{bmatrix}0&1\\0&0\end{bmatrix} + (-10)\begin{bmatrix}0&0\\1&0\end{bmatrix} + (-23)\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}-11\\-33\\-10\\-23\end{bmatrix}
\end{align*}
So the resulting representation is
\[ M^T_{C,C} = \begin{bmatrix} -17 & 11 & 8 & -11\\ -57 & 35 & 24 & -33\\ -14 & 10 & 6 & -10\\ -41 & 25 & 16 & -23 \end{bmatrix} \]
Not quite as pretty.
The purpose of this example is to illustrate Theorem SCB. This theorem says that the two matrix representations, $M^T_{B,B}$ and $M^T_{C,C}$, of the one linear transformation, T, are related by a similarity transformation using the change-of-basis matrix $C_{B,C}$. Let us compute this change-of-basis matrix. Notice that since C is such a nice basis, this is fairly straightforward,
\begin{align*}
\rho_C(x_1) &= \rho_C\left(\begin{bmatrix}0&1\\0&1\end{bmatrix}\right) = \rho_C\left(0\begin{bmatrix}1&0\\0&0\end{bmatrix} + 1\begin{bmatrix}0&1\\0&0\end{bmatrix} + 0\begin{bmatrix}0&0\\1&0\end{bmatrix} + 1\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}0\\1\\0\\1\end{bmatrix}\\
\rho_C(x_2) &= \rho_C\left(\begin{bmatrix}1&1\\1&0\end{bmatrix}\right) = \rho_C\left(1\begin{bmatrix}1&0\\0&0\end{bmatrix} + 1\begin{bmatrix}0&1\\0&0\end{bmatrix} + 1\begin{bmatrix}0&0\\1&0\end{bmatrix} + 0\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}1\\1\\1\\0\end{bmatrix}\\
\rho_C(x_3) &= \rho_C\left(\begin{bmatrix}1&3\\2&3\end{bmatrix}\right) = \rho_C\left(1\begin{bmatrix}1&0\\0&0\end{bmatrix} + 3\begin{bmatrix}0&1\\0&0\end{bmatrix} + 2\begin{bmatrix}0&0\\1&0\end{bmatrix} + 3\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}1\\3\\2\\3\end{bmatrix}\\
\rho_C(x_4) &= \rho_C\left(\begin{bmatrix}2&6\\1&4\end{bmatrix}\right) = \rho_C\left(2\begin{bmatrix}1&0\\0&0\end{bmatrix} + 6\begin{bmatrix}0&1\\0&0\end{bmatrix} + 1\begin{bmatrix}0&0\\1&0\end{bmatrix} + 4\begin{bmatrix}0&0\\0&1\end{bmatrix}\right) = \begin{bmatrix}2\\6\\1\\4\end{bmatrix}
\end{align*}
So we have,
\[ C_{B,C} = \begin{bmatrix} 0 & 1 & 1 & 2\\ 1 & 1 & 3 & 6\\ 0 & 1 & 2 & 1\\ 1 & 0 & 3 & 4 \end{bmatrix} \]
Now, according to Theorem SCB we can write,
\begin{align*}
M^T_{B,B} &= C_{B,C}^{-1}\, M^T_{C,C}\, C_{B,C}\\
\begin{bmatrix} 2&0&0&0\\ 0&2&0&0\\ 0&0&-1&0\\ 0&0&0&-2 \end{bmatrix}
&= \begin{bmatrix} 0&1&1&2\\ 1&1&3&6\\ 0&1&2&1\\ 1&0&3&4 \end{bmatrix}^{-1}
\begin{bmatrix} -17&11&8&-11\\ -57&35&24&-33\\ -14&10&6&-10\\ -41&25&16&-23 \end{bmatrix}
\begin{bmatrix} 0&1&1&2\\ 1&1&3&6\\ 0&1&2&1\\ 1&0&3&4 \end{bmatrix}
\end{align*}
This should look and feel exactly like the process for diagonalizing a matrix, as was described in Section SD. And it is.
△
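The similarity claim of Theorem SCB can be verified without computing an inverse at all: $M^T_{B,B} = C_{B,C}^{-1} M^T_{C,C} C_{B,C}$ is equivalent to $C_{B,C} M^T_{B,B} = M^T_{C,C} C_{B,C}$. A quick check of Example MRBE in that form (helper name `matmul` is mine, not the book's):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M_CC = [[-17, 11, 8, -11], [-57, 35, 24, -33],
        [-14, 10, 6, -10], [-41, 25, 16, -23]]
M_BB = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, -1, 0], [0, 0, 0, -2]]
C_BC = [[0, 1, 1, 2], [1, 1, 3, 6], [0, 1, 2, 1], [1, 0, 3, 4]]

# rearranged similarity relation: C_BC * M_BB == M_CC * C_BC
print(matmul(C_BC, M_BB) == matmul(M_CC, C_BC))  # → True
```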
We can now return to the question of computing an eigenvalue or eigenvector of a linear transformation. For a linear transformation of the form $T\colon V\to V$, we know that representations relative to different bases are similar matrices. We also know that similar matrices have equal characteristic polynomials by Theorem SMEE. We will now show that eigenvalues of a linear transformation T are precisely the eigenvalues of any matrix representation of T. Since the choice of a different matrix representation leads to a similar matrix, there will be no "new" eigenvalues obtained from this second representation. Similarly, the change-of-basis matrix can be used to show that eigenvectors obtained from one matrix representation will be precisely those obtained from any other representation. So we can determine the eigenvalues and eigenvectors of a linear transformation by forming one matrix representation, using any basis we please, and analyzing the matrix in the manner of Chapter E.
Theorem EER Eigenvalues, Eigenvectors, Representations
Suppose that $T\colon V\to V$ is a linear transformation and B is a basis of V. Then $v\in V$ is an eigenvector of T for the eigenvalue λ if and only if $\rho_B(v)$ is an eigenvector of $M^T_{B,B}$ for the eigenvalue λ.
Proof. (⇒) Assume that $v\in V$ is an eigenvector of T for the eigenvalue λ. Then
\begin{align*}
M^T_{B,B}\rho_B(v) &= \rho_B(T(v)) && \text{Theorem FTMR}\\
&= \rho_B(\lambda v) && \text{Definition EELT}\\
&= \lambda\rho_B(v) && \text{Theorem VRLT}
\end{align*}
which by Definition EEM says that $\rho_B(v)$ is an eigenvector of the matrix $M^T_{B,B}$ for the eigenvalue λ.
(⇐) Assume that $\rho_B(v)$ is an eigenvector of $M^T_{B,B}$ for the eigenvalue λ. Then
\begin{align*}
T(v) &= \rho_B^{-1}(\rho_B(T(v))) && \text{Definition IVLT}\\
&= \rho_B^{-1}\left(M^T_{B,B}\rho_B(v)\right) && \text{Theorem FTMR}\\
&= \rho_B^{-1}(\lambda\rho_B(v)) && \text{Definition EEM}\\
&= \lambda\rho_B^{-1}(\rho_B(v)) && \text{Theorem ILTLT}\\
&= \lambda v && \text{Definition IVLT}
\end{align*}
which by Definition EELT says v is an eigenvector of T for the eigenvalue λ. ■
Subsection CELT
Computing Eigenvectors of Linear Transformations

Theorem EER tells us that the eigenvalues of a linear transformation are the eigenvalues of any representation, no matter what the choice of the basis B might be. So we could now unambiguously define items such as the characteristic polynomial of a linear transformation, which we would define as the characteristic polynomial of any matrix representation. We will say that again — eigenvalues, eigenvectors, and characteristic polynomials are intrinsic properties of a linear transformation, independent of the choice of a basis used to construct a matrix representation.

As a practical matter, how does one compute the eigenvalues and eigenvectors of a linear transformation of the form $T\colon V\to V$? Choose a nice basis B for V, one where the vector representations of the values of the linear transformation necessary for the matrix representation are easy to compute. Construct the matrix representation relative to this basis, and find the eigenvalues and eigenvectors of this matrix using the techniques of Chapter E. The resulting eigenvalues of the matrix are precisely the eigenvalues of the linear transformation. The eigenvectors of the matrix are column vectors that need to be converted to vectors in V through application of $\rho_B^{-1}$ (this is part of the content of Theorem EER).

Now consider the case where the matrix representation of a linear transformation is diagonalizable. The n linearly independent eigenvectors that must exist for the matrix (Theorem DC) can be converted (via $\rho_B^{-1}$) into eigenvectors of the linear transformation. A matrix representation of the linear transformation relative to a basis of eigenvectors will be a diagonal matrix — an especially nice representation! Though we did not know it at the time, the diagonalizations of Section SD were really about finding especially pleasing matrix representations of linear transformations. Here are some examples.
Example ELTT Eigenvectors of a linear transformation, twice
Consider the linear transformation $S\colon M_{22}\to M_{22}$ defined by
\[ S\left(\begin{bmatrix} a & b\\ c & d \end{bmatrix}\right) = \begin{bmatrix} -b - c - 3d & -14a - 15b - 13c + d\\ 18a + 21b + 19c + 3d & -6a - 7b - 7c - 3d \end{bmatrix} \]
To find the eigenvalues and eigenvectors of S we will build a matrix representation and analyze the matrix. Since Theorem EER places no restriction on the choice of the basis B, we may as well use a basis that is easy to work with. So set
\[ B = \{x_1,\, x_2,\, x_3,\, x_4\} = \left\{ \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix},\, \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix},\, \begin{bmatrix} 0 & 0\\ 1 & 0 \end{bmatrix},\, \begin{bmatrix} 0 & 0\\ 0 & 1 \end{bmatrix} \right\} \]
Then to build the matrix representation of S relative to B compute,
\begin{align*}
\rho_B(S(x_1)) &= \rho_B\left(\begin{bmatrix} 0 & -14\\ 18 & -6 \end{bmatrix}\right) = \rho_B(0x_1 + (-14)x_2 + 18x_3 + (-6)x_4) = \begin{bmatrix} 0\\ -14\\ 18\\ -6 \end{bmatrix}\\
\rho_B(S(x_2)) &= \rho_B\left(\begin{bmatrix} -1 & -15\\ 21 & -7 \end{bmatrix}\right) = \rho_B((-1)x_1 + (-15)x_2 + 21x_3 + (-7)x_4) = \begin{bmatrix} -1\\ -15\\ 21\\ -7 \end{bmatrix}\\
\rho_B(S(x_3)) &= \rho_B\left(\begin{bmatrix} -1 & -13\\ 19 & -7 \end{bmatrix}\right) = \rho_B((-1)x_1 + (-13)x_2 + 19x_3 + (-7)x_4) = \begin{bmatrix} -1\\ -13\\ 19\\ -7 \end{bmatrix}\\
\rho_B(S(x_4)) &= \rho_B\left(\begin{bmatrix} -3 & 1\\ 3 & -3 \end{bmatrix}\right) = \rho_B((-3)x_1 + 1x_2 + 3x_3 + (-3)x_4) = \begin{bmatrix} -3\\ 1\\ 3\\ -3 \end{bmatrix}
\end{align*}
So by Definition MR we have
\[ M = M^S_{B,B} = \begin{bmatrix} 0 & -1 & -1 & -3\\ -14 & -15 & -13 & 1\\ 18 & 21 & 19 & 3\\ -6 & -7 & -7 & -3 \end{bmatrix} \]
Now compute eigenvalues and eigenvectors of the matrix representation M with the techniques of Section EE. First the characteristic polynomial,
\[ p_M(x) = \det(M - xI_4) = x^4 - x^3 - 10x^2 + 4x + 24 = (x-3)(x-2)(x+2)^2 \]
We could now make statements about the eigenvalues of M, but in light of Theorem EER we can refer to the eigenvalues of S and mildly abuse (or extend) our notation for multiplicities to write
\[ \alpha_S(3) = 1 \qquad \alpha_S(2) = 1 \qquad \alpha_S(-2) = 2 \]
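A degree-4 polynomial is determined by its values at five points, so the characteristic polynomial above can be checked by comparing $\det(M - tI_4)$ with the claimed factorization at a handful of integers. This sketch (mine, not the book's) uses a naive Laplace expansion, which is fine for a 4 × 4 matrix:

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j+1:] for row in A[1:]])
               for j in range(n))

M = [[0, -1, -1, -3], [-14, -15, -13, 1], [18, 21, 19, 3], [-6, -7, -7, -3]]

def p(x):
    """The claimed characteristic polynomial (x-3)(x-2)(x+2)^2."""
    return x**4 - x**3 - 10*x**2 + 4*x + 24

ok = all(det([[M[i][j] - (t if i == j else 0) for j in range(4)]
              for i in range(4)]) == p(t)
         for t in range(-3, 4))
print(ok)  # → True: agreement at 7 points forces the polynomials to be equal
```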
Now compute the eigenvectors of M,
\begin{align*}
\lambda = 3\colon\quad M - 3I_4 &= \begin{bmatrix} -3 & -1 & -1 & -3\\ -14 & -18 & -13 & 1\\ 18 & 21 & 16 & 3\\ -6 & -7 & -7 & -6 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 1\\ 0 & 1 & 0 & -3\\ 0 & 0 & 1 & 3\\ 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(3) &= \mathcal{N}(M - 3I_4) = \left\langle\left\{\begin{bmatrix}-1\\3\\-3\\1\end{bmatrix}\right\}\right\rangle\\
\lambda = 2\colon\quad M - 2I_4 &= \begin{bmatrix} -2 & -1 & -1 & -3\\ -14 & -17 & -13 & 1\\ 18 & 21 & 17 & 3\\ -6 & -7 & -7 & -5 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 2\\ 0 & 1 & 0 & -4\\ 0 & 0 & 1 & 3\\ 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(2) &= \mathcal{N}(M - 2I_4) = \left\langle\left\{\begin{bmatrix}-2\\4\\-3\\1\end{bmatrix}\right\}\right\rangle\\
\lambda = -2\colon\quad M - (-2)I_4 &= \begin{bmatrix} 2 & -1 & -1 & -3\\ -14 & -13 & -13 & 1\\ 18 & 21 & 21 & 3\\ -6 & -7 & -7 & -1 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & -1\\ 0 & 1 & 1 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(-2) &= \mathcal{N}(M - (-2)I_4) = \left\langle\left\{\begin{bmatrix}0\\-1\\1\\0\end{bmatrix},\, \begin{bmatrix}1\\-1\\0\\1\end{bmatrix}\right\}\right\rangle
\end{align*}
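Each claimed eigenvector can be checked instantly with a matrix-vector product: $Mv$ should equal $\lambda v$. A quick confirmation in plain Python (helper name `matvec` is mine):

```python
def matvec(A, v):
    """Matrix-vector product for A as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

M = [[0, -1, -1, -3], [-14, -15, -13, 1], [18, 21, 19, 3], [-6, -7, -7, -3]]
pairs = [(3, [-1, 3, -3, 1]), (2, [-2, 4, -3, 1]),
         (-2, [0, -1, 1, 0]), (-2, [1, -1, 0, 1])]
print(all(matvec(M, v) == [lam * x for x in v] for lam, v in pairs))  # → True
```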
According to Theorem EER the eigenvectors just listed as basis vectors for the eigenspaces of M are vector representations (relative to B) of eigenvectors for S. So the application of the inverse function $\rho_B^{-1}$ will convert these column vectors into elements of the vector space $M_{22}$ (2 × 2 matrices) that are eigenvectors of S. Since $\rho_B$ is an isomorphism (Theorem VRILT), so is $\rho_B^{-1}$. Applying the inverse function will then preserve linear independence and spanning properties, so with a sweeping application of the Coordinatization Principle and some extensions of our previous notation for eigenspaces and geometric multiplicities, we can write,
\begin{align*}
\rho_B^{-1}\left(\begin{bmatrix}-1\\3\\-3\\1\end{bmatrix}\right) &= (-1)x_1 + 3x_2 + (-3)x_3 + 1x_4 = \begin{bmatrix}-1&3\\-3&1\end{bmatrix}\\
\rho_B^{-1}\left(\begin{bmatrix}-2\\4\\-3\\1\end{bmatrix}\right) &= (-2)x_1 + 4x_2 + (-3)x_3 + 1x_4 = \begin{bmatrix}-2&4\\-3&1\end{bmatrix}\\
\rho_B^{-1}\left(\begin{bmatrix}0\\-1\\1\\0\end{bmatrix}\right) &= 0x_1 + (-1)x_2 + 1x_3 + 0x_4 = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\\
\rho_B^{-1}\left(\begin{bmatrix}1\\-1\\0\\1\end{bmatrix}\right) &= 1x_1 + (-1)x_2 + 0x_3 + 1x_4 = \begin{bmatrix}1&-1\\0&1\end{bmatrix}
\end{align*}
So
\begin{align*}
\mathcal{E}_S(3) &= \left\langle\left\{\begin{bmatrix}-1&3\\-3&1\end{bmatrix}\right\}\right\rangle\\
\mathcal{E}_S(2) &= \left\langle\left\{\begin{bmatrix}-2&4\\-3&1\end{bmatrix}\right\}\right\rangle\\
\mathcal{E}_S(-2) &= \left\langle\left\{\begin{bmatrix}0&-1\\1&0\end{bmatrix},\, \begin{bmatrix}1&-1\\0&1\end{bmatrix}\right\}\right\rangle
\end{align*}
with geometric multiplicities given by
\[ \gamma_S(3) = 1 \qquad \gamma_S(2) = 1 \qquad \gamma_S(-2) = 2 \]
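Since eigenvectors of S are intrinsic, the four matrices above should satisfy $S(x) = \lambda x$ directly, with no representation in sight. A check against the formula for S (the function name is mine, not the book's):

```python
def S(m):
    """The transformation S of Example ELTT applied to [[a, b], [c, d]]."""
    a, b = m[0]
    c, d = m[1]
    return [[-b - c - 3*d, -14*a - 15*b - 13*c + d],
            [18*a + 21*b + 19*c + 3*d, -6*a - 7*b - 7*c - 3*d]]

pairs = [(3, [[-1, 3], [-3, 1]]), (2, [[-2, 4], [-3, 1]]),
         (-2, [[0, -1], [1, 0]]), (-2, [[1, -1], [0, 1]])]
print(all(S(x) == [[lam * e for e in row] for row in x]
          for lam, x in pairs))  # → True
```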
Suppose we now decided to build another matrix representation of S, only now relative to a linearly independent set of eigenvectors of S, such as
\[ C = \left\{ \begin{bmatrix}-1&3\\-3&1\end{bmatrix},\, \begin{bmatrix}-2&4\\-3&1\end{bmatrix},\, \begin{bmatrix}0&-1\\1&0\end{bmatrix},\, \begin{bmatrix}1&-1\\0&1\end{bmatrix} \right\} \]
At this point you should have computed enough matrix representations to predict that the result of representing S relative to C will be a diagonal matrix. Computing this representation is an example of how Theorem SCB generalizes the diagonalizations from Section SD. For the record, here is the diagonal representation,
\[ M^S_{C,C} = \begin{bmatrix} 3 & 0 & 0 & 0\\ 0 & 2 & 0 & 0\\ 0 & 0 & -2 & 0\\ 0 & 0 & 0 & -2 \end{bmatrix} \]
Our interest in this example is not necessarily building nice representations, but instead we want to demonstrate how eigenvalues and eigenvectors are an intrinsic property of a linear transformation, independent of any particular representation. To this end, we will repeat the foregoing, but replace B by another basis. We will make this basis different, but not extremely so,
\[ D = \{y_1,\, y_2,\, y_3,\, y_4\} = \left\{ \begin{bmatrix}1&0\\0&0\end{bmatrix},\, \begin{bmatrix}1&1\\0&0\end{bmatrix},\, \begin{bmatrix}1&1\\1&0\end{bmatrix},\, \begin{bmatrix}1&1\\1&1\end{bmatrix} \right\} \]
Then to build the matrix representation of S relative to D compute,
\begin{align*}
\rho_D(S(y_1)) &= \rho_D\left(\begin{bmatrix} 0 & -14\\ 18 & -6 \end{bmatrix}\right) = \rho_D(14y_1 + (-32)y_2 + 24y_3 + (-6)y_4) = \begin{bmatrix} 14\\ -32\\ 24\\ -6 \end{bmatrix}\\
\rho_D(S(y_2)) &= \rho_D\left(\begin{bmatrix} -1 & -29\\ 39 & -13 \end{bmatrix}\right) = \rho_D(28y_1 + (-68)y_2 + 52y_3 + (-13)y_4) = \begin{bmatrix} 28\\ -68\\ 52\\ -13 \end{bmatrix}\\
\rho_D(S(y_3)) &= \rho_D\left(\begin{bmatrix} -2 & -42\\ 58 & -20 \end{bmatrix}\right) = \rho_D(40y_1 + (-100)y_2 + 78y_3 + (-20)y_4) = \begin{bmatrix} 40\\ -100\\ 78\\ -20 \end{bmatrix}\\
\rho_D(S(y_4)) &= \rho_D\left(\begin{bmatrix} -5 & -41\\ 61 & -23 \end{bmatrix}\right) = \rho_D(36y_1 + (-102)y_2 + 84y_3 + (-23)y_4) = \begin{bmatrix} 36\\ -102\\ 84\\ -23 \end{bmatrix}
\end{align*}
So by Definition MR we have
\[ N = M^S_{D,D} = \begin{bmatrix} 14 & 28 & 40 & 36\\ -32 & -68 & -100 & -102\\ 24 & 52 & 78 & 84\\ -6 & -13 & -20 & -23 \end{bmatrix} \]
Now compute eigenvalues and eigenvectors of the matrix representation N with the techniques of Section EE. First the characteristic polynomial,
\[ p_N(x) = \det(N - xI_4) = x^4 - x^3 - 10x^2 + 4x + 24 = (x-3)(x-2)(x+2)^2 \]
Of course this is not news. We now know that $M = M^S_{B,B}$ and $N = M^S_{D,D}$ are similar matrices (Theorem SCB). But Theorem SMEE told us long ago that similar matrices have identical characteristic polynomials. Now compute eigenvectors for this matrix representation, which will be different from what we found for M,
\begin{align*}
\lambda = 3\colon\quad N - 3I_4 &= \begin{bmatrix} 11 & 28 & 40 & 36\\ -32 & -71 & -100 & -102\\ 24 & 52 & 75 & 84\\ -6 & -13 & -20 & -26 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 4\\ 0 & 1 & 0 & -6\\ 0 & 0 & 1 & 4\\ 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_N(3) &= \mathcal{N}(N - 3I_4) = \left\langle\left\{\begin{bmatrix}-4\\6\\-4\\1\end{bmatrix}\right\}\right\rangle\\
\lambda = 2\colon\quad N - 2I_4 &= \begin{bmatrix} 12 & 28 & 40 & 36\\ -32 & -70 & -100 & -102\\ 24 & 52 & 76 & 84\\ -6 & -13 & -20 & -25 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 6\\ 0 & 1 & 0 & -7\\ 0 & 0 & 1 & 4\\ 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_N(2) &= \mathcal{N}(N - 2I_4) = \left\langle\left\{\begin{bmatrix}-6\\7\\-4\\1\end{bmatrix}\right\}\right\rangle\\
\lambda = -2\colon\quad N - (-2)I_4 &= \begin{bmatrix} 16 & 28 & 40 & 36\\ -32 & -66 & -100 & -102\\ 24 & 52 & 80 & 84\\ -6 & -13 & -20 & -21 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & -1 & -3\\ 0 & 1 & 2 & 3\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_N(-2) &= \mathcal{N}(N - (-2)I_4) = \left\langle\left\{\begin{bmatrix}1\\-2\\1\\0\end{bmatrix},\, \begin{bmatrix}3\\-3\\0\\1\end{bmatrix}\right\}\right\rangle
\end{align*}
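As with M, these eigenvectors of N can be confirmed by a matrix-vector product: the eigenvalues are the same, even though the eigenvectors differ. A quick check (helper name is mine):

```python
def matvec(A, v):
    """Matrix-vector product for A as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

N = [[14, 28, 40, 36], [-32, -68, -100, -102],
     [24, 52, 78, 84], [-6, -13, -20, -23]]
pairs = [(3, [-4, 6, -4, 1]), (2, [-6, 7, -4, 1]),
         (-2, [1, -2, 1, 0]), (-2, [3, -3, 0, 1])]
print(all(matvec(N, v) == [lam * x for x in v] for lam, v in pairs))  # → True
```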
Employing Theorem EER we can apply $\rho_D^{-1}$ to each of the basis vectors of the eigenspaces of N to obtain eigenvectors for S that also form bases for eigenspaces of S,
\begin{align*}
\rho_D^{-1}\left(\begin{bmatrix}-4\\6\\-4\\1\end{bmatrix}\right) &= (-4)y_1 + 6y_2 + (-4)y_3 + 1y_4 = \begin{bmatrix}-1&3\\-3&1\end{bmatrix}\\
\rho_D^{-1}\left(\begin{bmatrix}-6\\7\\-4\\1\end{bmatrix}\right) &= (-6)y_1 + 7y_2 + (-4)y_3 + 1y_4 = \begin{bmatrix}-2&4\\-3&1\end{bmatrix}\\
\rho_D^{-1}\left(\begin{bmatrix}1\\-2\\1\\0\end{bmatrix}\right) &= 1y_1 + (-2)y_2 + 1y_3 + 0y_4 = \begin{bmatrix}0&-1\\1&0\end{bmatrix}\\
\rho_D^{-1}\left(\begin{bmatrix}3\\-3\\0\\1\end{bmatrix}\right) &= 3y_1 + (-3)y_2 + 0y_3 + 1y_4 = \begin{bmatrix}1&-2\\1&1\end{bmatrix}
\end{align*}
The eigenspaces for the eigenvalues of algebraic multiplicity 1 are exactly as before,
\begin{align*}
\mathcal{E}_S(3) &= \left\langle\left\{\begin{bmatrix}-1&3\\-3&1\end{bmatrix}\right\}\right\rangle\\
\mathcal{E}_S(2) &= \left\langle\left\{\begin{bmatrix}-2&4\\-3&1\end{bmatrix}\right\}\right\rangle
\end{align*}
However, the eigenspace for λ = −2 would at first glance appear to be different. Here are the two eigenspaces for λ = −2, first the eigenspace obtained from $M = M^S_{B,B}$, then followed by the eigenspace obtained from $N = M^S_{D,D}$.
\begin{align*}
\mathcal{E}_S(-2) &= \left\langle\left\{\begin{bmatrix}0&-1\\1&0\end{bmatrix},\, \begin{bmatrix}1&-1\\0&1\end{bmatrix}\right\}\right\rangle &
\mathcal{E}_S(-2) &= \left\langle\left\{\begin{bmatrix}0&-1\\1&0\end{bmatrix},\, \begin{bmatrix}1&-2\\1&1\end{bmatrix}\right\}\right\rangle
\end{align*}
Subspaces generally have many bases, and that is the situation here. With a careful proof of set equality, you can show that these two eigenspaces are equal sets. The key observation to make such a proof go is that
\[ \begin{bmatrix}1&-2\\1&1\end{bmatrix} = \begin{bmatrix}0&-1\\1&0\end{bmatrix} + \begin{bmatrix}1&-1\\0&1\end{bmatrix} \]
which will establish that the second set is a subset of the first. With equal dimensions, Theorem EDYES will finish the task.
So the eigenvalues of a linear transformation are independent of the matrix representation employed to compute them!
△
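The key observation is a one-line arithmetic fact, easy to confirm by machine. This sketch (variable names are mine) checks that each basis matrix from the second eigenspace is an explicit linear combination of the first basis:

```python
def add(m1, m2):
    """Entrywise sum of two 2x2 matrices stored as lists of rows."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

u1, u2 = [[0, -1], [1, 0]], [[1, -1], [0, 1]]   # basis obtained from M
w1, w2 = [[0, -1], [1, 0]], [[1, -2], [1, 1]]   # basis obtained from N

# w1 = u1 and w2 = u1 + u2, so the second basis lies in the span of the first
print(w1 == u1 and w2 == add(u1, u2))  # → True
```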
Another example, this time a bit larger and with complex eigenvalues.
Example CELT Complex eigenvectors of a linear transformation
Consider the linear transformation $Q\colon P_4\to P_4$ defined by
\begin{align*}
Q\left(a + bx + cx^2 + dx^3 + ex^4\right) &= (-46a - 22b + 13c + 5d + e) + (117a + 57b - 32c - 15d - 4e)x\\
&\quad + (-69a - 29b + 21c - 7e)x^2 + (159a + 73b - 44c - 13d + 2e)x^3\\
&\quad + (-195a - 87b + 55c + 10d - 13e)x^4
\end{align*}
Choose a simple basis to compute with, say
\[ B = \left\{1,\, x,\, x^2,\, x^3,\, x^4\right\} \]
Then it should be apparent that the matrix representation of Q relative to B is
\[ M = M^Q_{B,B} = \begin{bmatrix} -46 & -22 & 13 & 5 & 1\\ 117 & 57 & -32 & -15 & -4\\ -69 & -29 & 21 & 0 & -7\\ 159 & 73 & -44 & -13 & 2\\ -195 & -87 & 55 & 10 & -13 \end{bmatrix} \]
Compute the characteristic polynomial, eigenvalues and eigenvectors according to the techniques of Section EE,
\begin{align*}
p_Q(x) &= -x^5 + 6x^4 - x^3 - 88x^2 + 252x - 208\\
&= -(x - 2)^2(x + 4)\left(x^2 - 6x + 13\right)\\
&= -(x - 2)^2(x + 4)(x - (3 + 2i))(x - (3 - 2i))
\end{align*}
\[ \alpha_Q(2) = 2 \qquad \alpha_Q(-4) = 1 \qquad \alpha_Q(3 + 2i) = 1 \qquad \alpha_Q(3 - 2i) = 1 \]
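The same sample-point trick used for the 4 × 4 case scales to this 5 × 5 matrix: a degree-5 polynomial is pinned down by six values, so comparing $\det(M - tI_5)$ with the claimed factorization at seven integers settles the characteristic polynomial. Again a sketch of mine, not the book's:

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j+1:] for row in A[1:]])
               for j in range(n))

M = [[-46, -22, 13, 5, 1], [117, 57, -32, -15, -4], [-69, -29, 21, 0, -7],
     [159, 73, -44, -13, 2], [-195, -87, 55, 10, -13]]

def p(x):
    """The claimed characteristic polynomial of Q."""
    return -(x - 2) ** 2 * (x + 4) * (x ** 2 - 6 * x + 13)

ok = all(det([[M[i][j] - (t if i == j else 0) for j in range(5)]
              for i in range(5)]) == p(t)
         for t in range(-3, 4))
print(ok)  # → True
```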
λ = 2:
\begin{align*}
M - (2)I_5 &= \begin{bmatrix} -48 & -22 & 13 & 5 & 1\\ 117 & 55 & -32 & -15 & -4\\ -69 & -29 & 19 & 0 & -7\\ 159 & 73 & -44 & -15 & 2\\ -195 & -87 & 55 & 10 & -15 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & \frac{1}{2} & -\frac{1}{2}\\ 0 & 1 & 0 & -\frac{5}{2} & -\frac{5}{2}\\ 0 & 0 & 1 & -2 & -6\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(2) &= \mathcal{N}(M - (2)I_5) = \left\langle\left\{\begin{bmatrix}-\frac{1}{2}\\\frac{5}{2}\\2\\1\\0\end{bmatrix},\, \begin{bmatrix}\frac{1}{2}\\\frac{5}{2}\\6\\0\\1\end{bmatrix}\right\}\right\rangle = \left\langle\left\{\begin{bmatrix}-1\\5\\4\\2\\0\end{bmatrix},\, \begin{bmatrix}1\\5\\12\\0\\2\end{bmatrix}\right\}\right\rangle
\end{align*}
λ = −4:
\begin{align*}
M - (-4)I_5 &= \begin{bmatrix} -42 & -22 & 13 & 5 & 1\\ 117 & 61 & -32 & -15 & -4\\ -69 & -29 & 25 & 0 & -7\\ 159 & 73 & -44 & -9 & 2\\ -195 & -87 & 55 & 10 & -9 \end{bmatrix} \xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & -3\\ 0 & 0 & 1 & 0 & -1\\ 0 & 0 & 0 & 1 & -2\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(-4) &= \mathcal{N}(M - (-4)I_5) = \left\langle\left\{\begin{bmatrix}-1\\3\\1\\2\\1\end{bmatrix}\right\}\right\rangle
\end{align*}
λ = 3 + 2i:
\begin{align*}
M - (3 + 2i)I_5 &= \begin{bmatrix} -49 - 2i & -22 & 13 & 5 & 1\\ 117 & 54 - 2i & -32 & -15 & -4\\ -69 & -29 & 18 - 2i & 0 & -7\\ 159 & 73 & -44 & -16 - 2i & 2\\ -195 & -87 & 55 & 10 & -16 - 2i \end{bmatrix}\\
&\xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -\frac{3}{4} + \frac{i}{4}\\ 0 & 1 & 0 & 0 & \frac{7}{4} - \frac{i}{4}\\ 0 & 0 & 1 & 0 & -\frac{1}{2} + \frac{i}{2}\\ 0 & 0 & 0 & 1 & \frac{7}{4} - \frac{i}{4}\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(3 + 2i) &= \mathcal{N}(M - (3 + 2i)I_5) = \left\langle\left\{\begin{bmatrix}\frac{3}{4} - \frac{i}{4}\\-\frac{7}{4} + \frac{i}{4}\\\frac{1}{2} - \frac{i}{2}\\-\frac{7}{4} + \frac{i}{4}\\1\end{bmatrix}\right\}\right\rangle = \left\langle\left\{\begin{bmatrix}3 - i\\-7 + i\\2 - 2i\\-7 + i\\4\end{bmatrix}\right\}\right\rangle
\end{align*}
λ = 3 − 2i:
\begin{align*}
M - (3 - 2i)I_5 &= \begin{bmatrix} -49 + 2i & -22 & 13 & 5 & 1\\ 117 & 54 + 2i & -32 & -15 & -4\\ -69 & -29 & 18 + 2i & 0 & -7\\ 159 & 73 & -44 & -16 + 2i & 2\\ -195 & -87 & 55 & 10 & -16 + 2i \end{bmatrix}\\
&\xrightarrow{\text{RREF}} \begin{bmatrix} 1 & 0 & 0 & 0 & -\frac{3}{4} - \frac{i}{4}\\ 0 & 1 & 0 & 0 & \frac{7}{4} + \frac{i}{4}\\ 0 & 0 & 1 & 0 & -\frac{1}{2} - \frac{i}{2}\\ 0 & 0 & 0 & 1 & \frac{7}{4} + \frac{i}{4}\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\
\mathcal{E}_M(3 - 2i) &= \mathcal{N}(M - (3 - 2i)I_5) = \left\langle\left\{\begin{bmatrix}\frac{3}{4} + \frac{i}{4}\\-\frac{7}{4} - \frac{i}{4}\\\frac{1}{2} + \frac{i}{2}\\-\frac{7}{4} - \frac{i}{4}\\1\end{bmatrix}\right\}\right\rangle = \left\langle\left\{\begin{bmatrix}3 + i\\-7 - i\\2 + 2i\\-7 - i\\4\end{bmatrix}\right\}\right\rangle
\end{align*}
It is straightforward to convert each of these basis vectors for eigenspaces of M back to elements of $P_4$ by applying the isomorphism $\rho_B^{-1}$,
\begin{align*}
\rho_B^{-1}\left(\begin{bmatrix}-1\\5\\4\\2\\0\end{bmatrix}\right) &= -1 + 5x + 4x^2 + 2x^3\\
\rho_B^{-1}\left(\begin{bmatrix}1\\5\\12\\0\\2\end{bmatrix}\right) &= 1 + 5x + 12x^2 + 2x^4\\
\rho_B^{-1}\left(\begin{bmatrix}-1\\3\\1\\2\\1\end{bmatrix}\right) &= -1 + 3x + x^2 + 2x^3 + x^4\\
\rho_B^{-1}\left(\begin{bmatrix}3 - i\\-7 + i\\2 - 2i\\-7 + i\\4\end{bmatrix}\right) &= (3 - i) + (-7 + i)x + (2 - 2i)x^2 + (-7 + i)x^3 + 4x^4\\
\rho_B^{-1}\left(\begin{bmatrix}3 + i\\-7 - i\\2 + 2i\\-7 - i\\4\end{bmatrix}\right) &= (3 + i) + (-7 - i)x + (2 + 2i)x^2 + (-7 - i)x^3 + 4x^4
\end{align*}
So we apply Theorem EER and the Coordinatization Principle to get the eigenspaces for Q,
\begin{align*}
\mathcal{E}_Q(2) &= \left\langle\left\{-1 + 5x + 4x^2 + 2x^3,\ 1 + 5x + 12x^2 + 2x^4\right\}\right\rangle\\
\mathcal{E}_Q(-4) &= \left\langle\left\{-1 + 3x + x^2 + 2x^3 + x^4\right\}\right\rangle\\
\mathcal{E}_Q(3 + 2i) &= \left\langle\left\{(3 - i) + (-7 + i)x + (2 - 2i)x^2 + (-7 + i)x^3 + 4x^4\right\}\right\rangle\\
\mathcal{E}_Q(3 - 2i) &= \left\langle\left\{(3 + i) + (-7 - i)x + (2 + 2i)x^2 + (-7 - i)x^3 + 4x^4\right\}\right\rangle
\end{align*}
with geometric multiplicities
\[ \gamma_Q(2) = 2 \qquad \gamma_Q(-4) = 1 \qquad \gamma_Q(3 + 2i) = 1 \qquad \gamma_Q(3 - 2i) = 1 \]
△
Reading Questions

1. The change-of-basis matrix is a matrix representation of which linear transformation?
2. Find the change-of-basis matrix, $C_{B,C}$, for the two bases of $\mathbb{C}^2$
\[ B = \left\{\begin{bmatrix}2\\3\end{bmatrix},\, \begin{bmatrix}-1\\2\end{bmatrix}\right\} \qquad C = \left\{\begin{bmatrix}1\\0\end{bmatrix},\, \begin{bmatrix}1\\1\end{bmatrix}\right\} \]
3. What is the third "surprise," and why is it surprising?
Exercises<br />
C20 In Example CBCV we computed the vector representation of y relative to C, ρ C (y),<br />
as an example of Theorem CB. Compute this same representation directly. In other words,<br />
apply Def<strong>in</strong>ition VR rather than Theorem CB.<br />
C21 † Perform a check on Example MRCM by comput<strong>in</strong>g M Q B,D<br />
directly. In other words,<br />
apply Def<strong>in</strong>ition MR rather than Theorem MRCB.<br />
C30 † F<strong>in</strong>d a basis for the vector space P 3 composed of eigenvectors of the l<strong>in</strong>ear transformation<br />
T . Then f<strong>in</strong>d a matrix representation of T relative to this basis.<br />
T : P 3 → P 3 , T ( a + bx + cx 2 + dx 3) =(a+c+d)+(b+c+d)x+(a+b+c)x 2 +(a+b+d)x 3<br />
C40 † Let S 22 be the vector space of 2 × 2 symmetric matrices. F<strong>in</strong>d a basis C for S 22<br />
that yields a diagonal matrix representation of the l<strong>in</strong>ear transformation R.<br />
([ ]) [ ]<br />
a b −5a +2b − 3c −12a +5b − 6c<br />
R: S 22 → S 22 , R<br />
=<br />
b c −12a +5b − 6c 6a − 2b +4c<br />
C41 † Let S 22 be the vector space of 2 × 2 symmetric matrices. F<strong>in</strong>d a basis for S 22<br />
composed of eigenvectors of the l<strong>in</strong>ear transformation Q: S 22 → S 22 .<br />
Q([a b; b c]) = [25a + 18b + 30c, −16a − 11b − 20c; −16a − 11b − 20c, −11a − 9b − 12c]
T10† Suppose that T: V → V is an invertible linear transformation with a nonzero eigenvalue λ. Prove that 1/λ is an eigenvalue of T^{−1}.
T15 Suppose that V is a vector space and T : V → V is a l<strong>in</strong>ear transformation. Prove<br />
that T is <strong>in</strong>jective if and only if λ = 0 is not an eigenvalue of T .
Section OD<br />
Orthonormal Diagonalization<br />
We have seen <strong>in</strong> Section SD that under the right conditions a square matrix is<br />
similar to a diagonal matrix. We recognize now, via Theorem SCB, that a similarity<br />
transformation is a change of basis on a matrix representation. So we can now discuss<br />
the choice of a basis used to build a matrix representation, and decide if some bases<br />
are better than others for this purpose. This will be the tone of this section. We will<br />
also see that every matrix has a reasonably useful matrix representation, and we will<br />
discover a new class of diagonalizable l<strong>in</strong>ear transformations. <strong>First</strong> we need some<br />
basic facts about triangular matrices.<br />
Subsection TM<br />
Triangular Matrices<br />
An upper, or lower, triangular matrix is exactly what it sounds like it should be,<br />
but here are the two relevant def<strong>in</strong>itions.<br />
Def<strong>in</strong>ition UTM Upper Triangular Matrix<br />
The n × n square matrix A is upper triangular if [A]_{ij} = 0 whenever i > j. □
Def<strong>in</strong>ition LTM Lower Triangular Matrix<br />
The n × n square matrix A is lower triangular if [A]_{ij} = 0 whenever i < j. □

Theorem PTMT Product of Triangular Matrices
Suppose that A and B are square matrices of size n that are both upper (lower) triangular. Then the product, AB, is an upper (lower) triangular matrix.

Proof. We prove the lower triangular case; the upper triangular case is entirely similar. Suppose that A and B are both lower triangular, and suppose that i < j. Then

[AB]_{ij} = Σ_{k=1}^{n} [A]_{ik} [B]_{kj}                                         Theorem EMP
= Σ_{k=1}^{j−1} [A]_{ik} [B]_{kj} + Σ_{k=j}^{n} [A]_{ik} [B]_{kj}
= Σ_{k=1}^{j−1} [A]_{ik} 0 + Σ_{k=j}^{n} [A]_{ik} [B]_{kj}                        Definition LTM (k < j)
= Σ_{k=1}^{j−1} [A]_{ik} 0 + Σ_{k=j}^{n} 0 [B]_{kj}                               Definition LTM (i < j ≤ k)
= 0

So [AB]_{ij} = 0 whenever i < j, and by Definition LTM, AB is lower triangular. ■

Theorem ITMT Inverse of a Triangular Matrix
Suppose that A is a nonsingular matrix of size n that is upper (lower) triangular. Then the inverse of A, A^{−1}, is an upper (lower) triangular matrix, and the diagonal entries of A^{−1} are the reciprocals of the corresponding diagonal entries of A.

Proof. We give the argument for the lower triangular case; the upper triangular case is similar. By Theorem CINM we can compute A^{−1} by row-reducing the n × 2n matrix M = [A | I_n] until its first n columns become the identity matrix. Since A is nonsingular and triangular, each diagonal entry [A]_{ii} is nonzero, so we may begin by scaling row i by the reciprocal of [A]_{ii} for each i; call the result M′. The first n columns of M′ are still lower triangular, now with every diagonal entry equal to 1. Then, for j = 1, 2, 3, ..., n in turn, add suitable multiples of row j to the rows below it to convert the entries of column j below the diagonal to zeros, and let M_j denote the result once column j has been processed (with M_0 = M′). Each of these operations adds a multiple of an earlier row to a later row, so the first n columns of every M_j remain lower triangular with diagonal entries equal to 1, and the first n columns of M_n form the identity matrix.
What happens in columns n + 1 through 2n of M′? These columns began in M as the identity matrix, and in M′ each diagonal entry was scaled to a reciprocal of the corresponding diagonal entry of A. Notice that, trivially, these final n columns of M′ form a lower triangular matrix. Just as we argued for the first n columns, the row operations that convert M_{j−1} into M_j will preserve the lower triangular form in the final n columns and preserve the exact values of the diagonal entries. By Theorem CINM, the final n columns of M_n form the inverse of A, and this matrix has the necessary properties advertised in the conclusion of this theorem. ■
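Both facts are easy to spot-check numerically. Here is a minimal pure-Python sketch, with arbitrarily chosen 3 × 3 upper triangular matrices and helper functions of our own naming:

```python
# Spot-check of Theorems PTMT and ITMT: products of upper triangular
# matrices are upper triangular, and the inverse of a nonsingular upper
# triangular matrix is upper triangular with reciprocal diagonal entries.
# Matrices are lists of rows; the 3 x 3 examples are illustrative choices.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_upper_triangular(A, tol=1e-12):
    # Definition UTM: entries below the diagonal (i > j) are zero
    return all(abs(A[i][j]) < tol for i in range(len(A)) for j in range(i))

def upper_inverse(A):
    # back substitution, one column of the identity at a time
    n = len(A)
    X = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(n - 1, -1, -1):
            rhs = (1.0 if i == col else 0.0)
            rhs -= sum(A[i][k] * X[k][col] for k in range(i + 1, n))
            X[i][col] = rhs / A[i][i]
    return X

A = [[2.0, 3.0, -1.0], [0.0, 5.0, 4.0], [0.0, 0.0, -3.0]]
B = [[1.0, -2.0, 0.0], [0.0, 4.0, 7.0], [0.0, 0.0, 6.0]]

P = matmul(A, B)
assert is_upper_triangular(P)                       # Theorem PTMT
Ainv = upper_inverse(A)
assert is_upper_triangular(Ainv)                    # Theorem ITMT
assert all(abs(Ainv[i][i] - 1.0 / A[i][i]) < 1e-12 for i in range(3))
assert all(abs(matmul(A, Ainv)[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(3) for j in range(3))
```

The back substitution in `upper_inverse` mirrors the proof: each pivot is a diagonal entry of A, which is why the diagonal entries of the inverse come out as reciprocals.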
<br />
Subsection UTMR<br />
Upper Triangular Matrix Representation<br />
Not every matrix is diagonalizable, but every l<strong>in</strong>ear transformation has a matrix<br />
representation that is an upper triangular matrix, and the basis that achieves this<br />
representation is especially pleas<strong>in</strong>g. Here is the theorem.<br />
Theorem UTMR Upper Triangular Matrix Representation<br />
Suppose that T : V → V is a l<strong>in</strong>ear transformation. Then there is a basis B for V<br />
such that the matrix representation of T relative to B, MB,B T , is an upper triangular<br />
matrix. Each diagonal entry is an eigenvalue of T, and if λ is an eigenvalue of T,
then λ occurs α T (λ) times on the diagonal.<br />
Proof. We beg<strong>in</strong> with a proof by <strong>in</strong>duction (Proof Technique I) of the first statement<br />
<strong>in</strong> the conclusion of the theorem. We use <strong>in</strong>duction on the dimension of V to show<br />
that if T : V → V is a l<strong>in</strong>ear transformation, then there is a basis B for V such that<br />
the matrix representation of T relative to B, MB,B T , is an upper triangular matrix.<br />
To start suppose that dim (V ) = 1. Choose any nonzero vector v ∈ V and realize<br />
that V = 〈{v}〉. Then T(v) = βv for some β ∈ C, which determines T uniquely
(Theorem LTDB). This description of T also gives us a matrix representation relative<br />
to the basis B = {v} as the 1 × 1 matrix with lone entry equal to β. And this matrix<br />
representation is upper triangular (Def<strong>in</strong>ition UTM).<br />
For the <strong>in</strong>duction step let dim (V ) = m, and assume the theorem is true for every<br />
linear transformation defined on a vector space of dimension less than m. By Theorem
EMHE (suitably converted to the sett<strong>in</strong>g of a l<strong>in</strong>ear transformation), T has at least<br />
one eigenvalue, and we denote this eigenvalue as λ. (We will remark later about<br />
how critical this step is.) We now consider properties of the l<strong>in</strong>ear transformation<br />
T − λI V : V → V .<br />
Let x be an eigenvector of T for λ. By def<strong>in</strong>ition x ≠ 0. Then<br />
(T − λI V )(x) =T (x) − λI V (x) Theorem VSLT<br />
= T (x) − λx Def<strong>in</strong>ition IDLT<br />
= λx − λx Def<strong>in</strong>ition EELT<br />
= 0 Property AI
So T − λI V is not <strong>in</strong>jective, as it has a nontrivial kernel (Theorem KILT). With<br />
an application of Theorem RPNDD we bound the rank of T − λI V ,<br />
r(T − λI_V) = dim(V) − n(T − λI_V) ≤ m − 1
Let W be the subspace of V that is the range of T − λI_V, W = R(T − λI_V), and define k = dim(W) ≤ m − 1. We define a new linear transformation S, on W,

S: W → W,    S(w) = T(w)
This does not look like we have accomplished much, since the action of S is identical
to the action of T . For our purposes this will be a good th<strong>in</strong>g. What is different is<br />
the doma<strong>in</strong> and codoma<strong>in</strong>. S is def<strong>in</strong>ed on W , a vector space with dimension less<br />
than m, and so is susceptible to our <strong>in</strong>duction hypothesis. Verify<strong>in</strong>g that S is really a<br />
l<strong>in</strong>ear transformation is almost entirely rout<strong>in</strong>e, with one exception. Employ<strong>in</strong>g T <strong>in</strong><br />
our def<strong>in</strong>ition of S raises the possibility that the outputs of S will not be conta<strong>in</strong>ed<br />
with<strong>in</strong> W (but <strong>in</strong>stead will lie <strong>in</strong>side V , but outside W ). To exam<strong>in</strong>e this possibility,<br />
suppose that w ∈ W .<br />
S(w) = T(w)
= T(w) + 0                                   Property Z
= T(w) + (λI_V(w) − λI_V(w))                 Property AI
= (T(w) − λI_V(w)) + λI_V(w)                 Property AA
= (T(w) − λI_V(w)) + λw                      Definition IDLT
= (T − λI_V)(w) + λw                         Theorem VSLT
S<strong>in</strong>ce W is the range of T − λI V , (T − λI V )(w) ∈ W . And by Property SC,<br />
λw ∈ W . F<strong>in</strong>ally, apply<strong>in</strong>g Property AC we see by closure that the sum is <strong>in</strong> W and<br />
so we conclude that S (w) ∈ W . This argument conv<strong>in</strong>ces us that it is legitimate to<br />
def<strong>in</strong>e S as we did with W as the codoma<strong>in</strong>.<br />
S is a l<strong>in</strong>ear transformation def<strong>in</strong>ed on a vector space with dimension k, less<br />
than m, so we can apply the <strong>in</strong>duction hypothesis and conclude that W has a basis,<br />
C = {w 1 , w 2 , w 3 , ..., w k }, such that the matrix representation of S relative to C<br />
is an upper triangular matrix.<br />
Beg<strong>in</strong>n<strong>in</strong>g with the l<strong>in</strong>early <strong>in</strong>dependent set C, repeatedly apply Theorem ELIS<br />
to add vectors to C, ma<strong>in</strong>ta<strong>in</strong><strong>in</strong>g a l<strong>in</strong>early <strong>in</strong>dependent set and spann<strong>in</strong>g ever larger<br />
subspaces of V . This process will end with the addition of m−k vectors, which together<br />
with C will span all of V . Denote these vectors as D = {u 1 , u 2 , u 3 , ..., u m−k }.<br />
Then B = C ∪ D is a basis for V , and is the basis we desire for the conclusion of<br />
the theorem. So we now consider the matrix representation of T relative to B.<br />
S<strong>in</strong>ce the def<strong>in</strong>ition of T and S agree on W , the first k columns of MB,B T will have<br />
the upper triangular matrix representation of S <strong>in</strong> the first k rows. The rema<strong>in</strong><strong>in</strong>g<br />
m − k rows of these first k columns will be all zeros s<strong>in</strong>ce the outputs of T for basis<br />
vectors from C are all conta<strong>in</strong>ed <strong>in</strong> W and hence are l<strong>in</strong>ear comb<strong>in</strong>ations of the basis
vectors <strong>in</strong> C. The situation for T on the basis vectors <strong>in</strong> D is not quite as pretty,<br />
but it is close.<br />
For 1 ≤ i ≤ m − k, consider<br />
ρ_B(T(u_i)) = ρ_B(T(u_i) + 0)                                        Property Z
= ρ_B(T(u_i) + (λI_V(u_i) − λI_V(u_i)))                              Property AI
= ρ_B((T(u_i) − λI_V(u_i)) + λI_V(u_i))                              Property AA
= ρ_B((T(u_i) − λI_V(u_i)) + λu_i)                                   Definition IDLT
= ρ_B((T − λI_V)(u_i) + λu_i)                                        Theorem VSLT
= ρ_B(a_1 w_1 + a_2 w_2 + a_3 w_3 + ··· + a_k w_k + λu_i)            Definition RLT
= [a_1  a_2  ···  a_k  0  ···  0  λ  0  ···  0]^t                    Definition VR

where the lone entry λ among the final m − k entries of this column vector occurs in position k + i.
In the penultimate equality, we have rewritten an element of the range of T − λI_V as a linear combination of the basis vectors, C, for the range of T − λI_V, W, using the scalars a_1, a_2, a_3, ..., a_k. If we incorporate these m − k column vectors into
the matrix representation MB,B T we f<strong>in</strong>d m − k occurrences of λ on the diagonal,<br />
and any nonzero entries ly<strong>in</strong>g only <strong>in</strong> the first k rows. Together with the k × k<br />
upper triangular representation <strong>in</strong> the upper left-hand corner, the entire matrix<br />
representation for T is clearly upper triangular. This completes the <strong>in</strong>duction step.<br />
So for any l<strong>in</strong>ear transformation there is a basis that creates an upper triangular<br />
matrix representation.<br />
We have one more statement <strong>in</strong> the conclusion of the theorem to verify. The<br />
eigenvalues of T , and their multiplicities, can be computed with the techniques of<br />
Chapter E relative to any matrix representation (Theorem EER). We take this<br />
approach with our upper triangular matrix representation M^T_{B,B}. Let d_i be the diagonal entry of M^T_{B,B} in row i and column i. Then the characteristic polynomial, computed as a determinant (Definition CP) with repeated expansions about the first column, is

p_{M^T_{B,B}}(x) = (d_1 − x)(d_2 − x)(d_3 − x) ··· (d_m − x)
The roots of the polynomial equation p_{M^T_{B,B}}(x) = 0 are the eigenvalues of the linear transformation (Theorem EMRCP). So each diagonal entry is an eigenvalue, and is repeated on the diagonal exactly α_T(λ) times (Definition AME). ■
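The determinant step can be seen concretely. This sketch, with an illustrative 3 × 3 upper triangular matrix of our own choosing, expands det(T − xI) about the first column and checks at a few sample points that the result is the product of the terms (d_i − x):

```python
# For an upper triangular matrix, cofactor expansion about the first
# column of T - x*I leaves only the diagonal product, so the
# characteristic polynomial is (d_1 - x)(d_2 - x)(d_3 - x).

def det(M):
    # cofactor expansion about the first column
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for i in range(n):
        if M[i][0] == 0:
            continue
        minor = [row[1:] for r, row in enumerate(M) if r != i]
        total += (-1) ** i * M[i][0] * det(minor)
    return total

T = [[2, 5, -1],
     [0, -3, 4],
     [0, 0, 7]]

for x in (-2, 0, 1, 3):
    TxI = [[T[i][j] - (x if i == j else 0) for j in range(3)] for i in range(3)]
    assert det(TxI) == (2 - x) * (-3 - x) * (7 - x)
```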
A key step <strong>in</strong> this proof was the construction of the subspace W with dimension<br />
strictly less than that of V . This required an eigenvalue/eigenvector pair, which was<br />
guaranteed to us by Theorem EMHE. Digg<strong>in</strong>g deeper, the proof of Theorem EMHE<br />
requires that we can factor polynomials completely, <strong>in</strong>to l<strong>in</strong>ear factors. This will not<br />
always happen if our set of scalars is the reals, R. So this is our f<strong>in</strong>al explanation of<br />
our choice of the complex numbers, C, as our set of scalars. In C polynomials factor<br />
completely, so every matrix has at least one eigenvalue, and an <strong>in</strong>ductive argument<br />
will get us to upper triangular matrix representations.<br />
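A concrete instance of this point, with an example matrix of our choosing: the real rotation matrix [0 −1; 1 0] has characteristic polynomial x^2 + 1, which has no real roots, so over R it has no eigenvalues at all; over C the roots ±i appear, along with eigenvectors.

```python
# The real rotation matrix [[0, -1], [1, 0]] has characteristic
# polynomial x^2 + 1: no real roots, but over C the roots are i and -i,
# and eigenvectors exist.
import cmath

roots = {cmath.sqrt(-1), -cmath.sqrt(-1)}
assert roots == {1j, -1j}              # x^2 + 1 = (x - i)(x + i) over C

# and v = [1, -i] really is an eigenvector for the eigenvalue i:
v = [1, -1j]
Av = [0 * v[0] - 1 * v[1], 1 * v[0] + 0 * v[1]]
assert Av == [1j * v[0], 1j * v[1]]    # A v = i v
```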
In the case of l<strong>in</strong>ear transformations def<strong>in</strong>ed on C m , we can use the <strong>in</strong>ner product<br />
(Def<strong>in</strong>ition IP) profitably to f<strong>in</strong>e-tune the basis that yields an upper triangular matrix<br />
representation. Recall that the adjo<strong>in</strong>t of matrix A (Def<strong>in</strong>ition A) is written as A ∗ .<br />
Theorem OBUTR Orthonormal Basis for Upper Triangular Representation<br />
Suppose that A is a square matrix. Then there is a unitary matrix U, andanupper<br />
triangular matrix T , such that<br />
U ∗ AU = T<br />
and T has the eigenvalues of A as the entries of the diagonal.<br />
Proof. This theorem is a statement about matrices and similarity. We can convert<br />
it to a statement about l<strong>in</strong>ear transformations, matrix representations and bases<br />
(Theorem SCB). Suppose that A is an n × n matrix, and def<strong>in</strong>e the l<strong>in</strong>ear transformation<br />
S : C n → C n by S (x) = Ax. Then Theorem UTMR gives us a basis<br />
B = {v 1 , v 2 , v 3 , ..., v n } for C n such that a matrix representation of S relative to<br />
B, MB,B S , is upper triangular.<br />
Now convert the basis B <strong>in</strong>to an orthogonal basis, C, by an application of the<br />
Gram-Schmidt procedure (Theorem GSP). This is a messy bus<strong>in</strong>ess computationally,<br />
but here we have an excellent illustration of the power of the Gram-Schmidt procedure.<br />
We need only be sure that B is l<strong>in</strong>early <strong>in</strong>dependent and spans C n ,andthenweknow<br />
that C is l<strong>in</strong>early <strong>in</strong>dependent, spans C n and is also an orthogonal set. We will now<br />
consider the matrix representation of S relative to C (rather than B). Write the new<br />
basis as C = {y 1 , y 2 , y 3 , ..., y n }. The application of the Gram-Schmidt procedure<br />
creates each vector of C, sayy j , as the difference of v j and a l<strong>in</strong>ear comb<strong>in</strong>ation<br />
of y 1 , y 2 , y 3 , ..., y j−1 . We are not concerned here with the actual values of the<br />
scalars <strong>in</strong> this l<strong>in</strong>ear comb<strong>in</strong>ation, so we will write<br />
y_j = v_j − Σ_{k=1}^{j−1} b_{jk} y_k
where the b jk are shorthand for the scalars. The equation above is <strong>in</strong> a form useful<br />
for creat<strong>in</strong>g the basis C from B. To better understand the relationship between B<br />
and C convert it to read<br />
v_j = y_j + Σ_{k=1}^{j−1} b_{jk} y_k

In this form, we recognize that the change-of-basis matrix C_{B,C} = M^{I_{C^n}}_{B,C} (Definition CBM) is an upper triangular matrix. By Theorem SCB we have

M^S_{C,C} = C_{B,C} M^S_{B,B} C^{−1}_{B,C}
The <strong>in</strong>verse of an upper triangular matrix is upper triangular (Theorem ITMT), and<br />
the product of two upper triangular matrices is aga<strong>in</strong> upper triangular (Theorem<br />
PTMT). So MC,C S is an upper triangular matrix.<br />
Now, multiply each vector of C by a nonzero scalar, so that the result has norm<br />
1. In this way we create a new basis D which is an orthonormal set (Def<strong>in</strong>ition ONS).<br />
Note that the change-of-basis matrix C C,D is a diagonal matrix with nonzero entries<br />
equal to the norms of the vectors <strong>in</strong> C.<br />
Now we can convert our results <strong>in</strong>to the language of matrices. Let E be the basis<br />
of C n formed with the standard unit vectors (Def<strong>in</strong>ition SUV). Then the matrix<br />
representation of S relative to E is simply A, A = ME,E S . The change-of-basis matrix<br />
C D,E has columns that are simply the vectors <strong>in</strong> D, the orthonormal basis. As such,<br />
Theorem CUMOS tells us that C D,E is a unitary matrix, and by Def<strong>in</strong>ition UM has<br />
an inverse equal to its adjoint. Write U = C_{D,E}. We have
U∗AU = U^{−1}AU                              Theorem UMI
= C^{−1}_{D,E} M^S_{E,E} C_{D,E}
= M^S_{D,D}                                  Theorem SCB
= C_{C,D} M^S_{C,C} C^{−1}_{C,D}             Theorem SCB
The <strong>in</strong>verse of a diagonal matrix is also a diagonal matrix, and so this f<strong>in</strong>al<br />
expression is the product of three upper triangular matrices, and so is aga<strong>in</strong> upper<br />
triangular (Theorem PTMT). Thus the desired upper triangular matrix, T, is the matrix representation of S relative to the orthonormal basis D, M^S_{D,D}. ■
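The structural claim at the heart of this proof, that Gram-Schmidt relates B to C through an upper triangular change of basis, can be sketched numerically; the basis below and the helper names are arbitrary illustrative choices:

```python
# Orthogonalizing a basis B by Gram-Schmidt produces an orthogonal set C
# with v_j equal to y_j plus a combination of the *earlier* y_k only, so
# the change-of-basis matrix from B to C is upper triangular with ones
# on its diagonal.

def ip(u, v):
    # inner product, conjugating the entries of the second vector
    return sum(a * b.conjugate() for a, b in zip(u, v))

def gram_schmidt(vs):
    ys = []
    for v in vs:
        y = list(v)
        for prev in ys:
            c = ip(v, prev) / ip(prev, prev)
            y = [yi - c * pi for yi, pi in zip(y, prev)]
        ys.append(y)
    return ys

B = [[1 + 1j, 1, 0], [0, 1j, 1], [1, 0, 1]]   # a basis of C^3
C = gram_schmidt(B)

# C is an orthogonal set
for i in range(3):
    for j in range(i + 1, 3):
        assert abs(ip(C[i], C[j])) < 1e-12

# coefficients of v_j expanded in C: zero past position j (upper
# triangular) and one in position j itself
for j in range(3):
    for k in range(3):
        coeff = ip(B[j], C[k]) / ip(C[k], C[k])
        if k > j:
            assert abs(coeff) < 1e-12
        elif k == j:
            assert abs(coeff - 1) < 1e-12
```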
Subsection NM<br />
Normal Matrices<br />
Normal matrices comprise a broad class of <strong>in</strong>terest<strong>in</strong>g matrices, many of which we<br />
have met already. But they are most <strong>in</strong>terest<strong>in</strong>g s<strong>in</strong>ce they def<strong>in</strong>e exactly which<br />
matrices we can diagonalize via a unitary matrix. This is the upcom<strong>in</strong>g Theorem<br />
OD. Here is the def<strong>in</strong>ition.
Def<strong>in</strong>ition NRML Normal Matrix<br />
The square matrix A is normal if A ∗ A = AA ∗ .<br />
So a normal matrix commutes with its adjo<strong>in</strong>t. Part of the beauty of this def<strong>in</strong>ition<br />
is that it <strong>in</strong>cludes many other types of matrices. A diagonal matrix will commute<br />
with its adjo<strong>in</strong>t, s<strong>in</strong>ce the adjo<strong>in</strong>t is aga<strong>in</strong> diagonal and the entries are just conjugates<br />
of the entries of the orig<strong>in</strong>al diagonal matrix. A Hermitian (self-adjo<strong>in</strong>t) matrix<br />
(Def<strong>in</strong>ition HM) will trivially commute with its adjo<strong>in</strong>t, s<strong>in</strong>ce the two matrices are<br />
the same. A real, symmetric matrix is Hermitian, so these matrices are also normal.<br />
A unitary matrix (Def<strong>in</strong>ition UM) has its adjo<strong>in</strong>t as its <strong>in</strong>verse, and <strong>in</strong>verses commute<br />
(Theorem OSIS), so unitary matrices are normal. Another class of normal matrices is<br />
the skew-symmetric matrices. However, these broad descriptions still do not capture<br />
all of the normal matrices, as the next example shows.<br />
Example ANM A normal matrix<br />
Let A = [1 −1; 1 1] (rows separated by semicolons). Then

A∗A = [1 1; −1 1][1 −1; 1 1] = [2 0; 0 2] = [1 −1; 1 1][1 1; −1 1] = AA∗
so we see by Def<strong>in</strong>ition NRML that A is normal. However, A is not symmetric (hence,<br />
as a real matrix, not Hermitian), not unitary, and not skew-symmetric. △<br />
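A quick computational confirmation of this example (pure Python; the helper names are ours):

```python
# A = [[1, -1], [1, 1]] commutes with its adjoint, so it is normal, yet
# it is not symmetric (hence not Hermitian as a real matrix), not
# unitary, and not skew-symmetric.

def adjoint(M):
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, -1], [1, 1]]
Ast = adjoint(A)

assert matmul(Ast, A) == matmul(A, Ast) == [[2, 0], [0, 2]]   # normal
assert A[0][1] != A[1][0]                    # not symmetric
assert matmul(Ast, A) != [[1, 0], [0, 1]]    # adjoint is not an inverse
assert A[0][0] != 0                          # skew-symmetric needs zero diagonal
```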
Subsection OD<br />
Orthonormal Diagonalization<br />
A diagonal matrix is very easy to work with <strong>in</strong> matrix multiplication (Example<br />
HPDM) and an orthonormal basis also has many advantages (Theorem COB). How<br />
about convert<strong>in</strong>g a matrix to a diagonal matrix through a similarity transformation<br />
us<strong>in</strong>g a unitary matrix (i.e. build a diagonal matrix representation with an orthonormal<br />
matrix)? That’d be fantastic! When can we do this? We can always accomplish<br />
this feat when the matrix is normal, and normal matrices are the only ones that<br />
behave this way. Here is the theorem.<br />
Theorem OD Orthonormal Diagonalization<br />
Suppose that A is a square matrix. Then there is a unitary matrix U and a diagonal<br />
matrix D, with diagonal entries equal to the eigenvalues of A, such that U ∗ AU = D<br />
if and only if A is a normal matrix.<br />
Proof. (⇒) Suppose there is a unitary matrix U that diagonalizes A. We would usually write this condition as U∗AU = D, but we will find it convenient in this part of the proof to use our hypothesis in the equivalent form, A = UDU∗. Recall that a
diagonal matrix is normal, and notice that this observation is at the center of the<br />
next sequence of equalities. We check the normality of A,<br />
A ∗ A =(UDU ∗ ) ∗ (UDU ∗ )<br />
Hypothesis<br />
=(U ∗ ) ∗ D ∗ U ∗ UDU ∗<br />
Theorem MMAD<br />
= UD ∗ U ∗ UDU ∗ Theorem AA<br />
= UD ∗ I n DU ∗ Def<strong>in</strong>ition UM<br />
= UD ∗ DU ∗ Theorem MMIM<br />
= UDD ∗ U ∗ Def<strong>in</strong>ition NRML<br />
= UDI n D ∗ U ∗ Theorem MMIM<br />
= UDU ∗ UD ∗ U ∗ Def<strong>in</strong>ition UM<br />
= UDU ∗ (U ∗ ) ∗ D ∗ U ∗ Theorem AA<br />
=(UDU ∗ )(UDU ∗ ) ∗<br />
Theorem MMAD<br />
= AA ∗ Hypothesis<br />
So by Def<strong>in</strong>ition NRML, A is a normal matrix.<br />
(⇐) For the converse, suppose that A is a normal matrix. Whether or not A is<br />
normal, Theorem OBUTR provides a unitary matrix U and an upper triangular<br />
matrix T , whose diagonal entries are the eigenvalues of A, andsuchthatU ∗ AU = T .<br />
With the added condition that A is normal, we will determ<strong>in</strong>e that the entries of T<br />
above the diagonal must be all zero. Here we go.<br />
<strong>First</strong> notice that Def<strong>in</strong>ition UM implies that the <strong>in</strong>verse of a unitary matrix U is<br />
the adjo<strong>in</strong>t, U ∗ , so the product of these two matrices, <strong>in</strong> either order, is the identity<br />
matrix (Theorem OSIS). We beg<strong>in</strong> by show<strong>in</strong>g that T is normal.<br />
T ∗ T =(U ∗ AU) ∗ (U ∗ AU)<br />
Theorem OBUTR<br />
= U ∗ A ∗ (U ∗ ) ∗ U ∗ AU Theorem MMAD<br />
= U ∗ A ∗ UU ∗ AU Theorem AA<br />
= U ∗ A ∗ I n AU Def<strong>in</strong>ition UM<br />
= U ∗ A ∗ AU Theorem MMIM<br />
= U ∗ AA ∗ U Def<strong>in</strong>ition NRML<br />
= U ∗ AI n A ∗ U Theorem MMIM<br />
= U ∗ AUU ∗ A ∗ U Def<strong>in</strong>ition UM<br />
= U ∗ AUU ∗ A ∗ (U ∗ ) ∗ Theorem AA<br />
=(U ∗ AU)(U ∗ AU) ∗<br />
Theorem MMAD<br />
= TT ∗ Theorem OBUTR<br />
So by Def<strong>in</strong>ition NRML, T is a normal matrix.
We can translate the normality of T <strong>in</strong>to the statement TT ∗ − T ∗ T = O. We<br />
now establish an equality we will use repeatedly. For 1 ≤ i ≤ n,<br />
0 = [O]_{ii}                                                                  Definition ZM
= [TT∗ − T∗T]_{ii}                                                            Definition NRML
= [TT∗]_{ii} − [T∗T]_{ii}                                                     Definition MA
= Σ_{k=1}^{n} [T]_{ik} [T∗]_{ki} − Σ_{k=1}^{n} [T∗]_{ik} [T]_{ki}             Theorem EMP
= Σ_{k=1}^{n} [T]_{ik} conj([T]_{ik}) − Σ_{k=1}^{n} conj([T]_{ki}) [T]_{ki}   Definition A
= Σ_{k=i}^{n} [T]_{ik} conj([T]_{ik}) − Σ_{k=1}^{i} conj([T]_{ki}) [T]_{ki}   Definition UTM
= Σ_{k=i}^{n} |[T]_{ik}|^2 − Σ_{k=1}^{i} |[T]_{ki}|^2                         Definition MCN

where conj(·) denotes complex conjugation.
To conclude, we use the above equality repeatedly, beginning with i = 1, and discover, row by row, that the entries above the diagonal of T are all zero. The key observation is that a sum of squares can only equal zero when each term of the sum is zero. For i = 1 we have

0 = Σ_{k=1}^{n} |[T]_{1k}|^2 − Σ_{k=1}^{1} |[T]_{k1}|^2 = Σ_{k=2}^{n} |[T]_{1k}|^2

which forces the conclusions

[T]_{12} = 0,   [T]_{13} = 0,   [T]_{14} = 0,   ...,   [T]_{1n} = 0
For i = 2 we use the same equality, but also incorporate the portion of the above conclusions that says [T]_{12} = 0,

0 = Σ_{k=1}^{n} |[T]_{2k}|^2 − Σ_{k=1}^{2} |[T]_{k2}|^2 = Σ_{k=2}^{n} |[T]_{2k}|^2 − Σ_{k=2}^{2} |[T]_{k2}|^2 = Σ_{k=3}^{n} |[T]_{2k}|^2

which forces the conclusions

[T]_{23} = 0,   [T]_{24} = 0,   [T]_{25} = 0,   ...,   [T]_{2n} = 0
We can repeat this process for the subsequent values of i = 3, 4, 5, ..., n − 1. Notice that it is critical we do this in order, since we need to employ portions of each of the previous conclusions about rows having zero entries in order to successfully get the same conclusion for later rows. Eventually, we conclude that all of the nondiagonal entries of T are zero, so the extra assumption of normality forces T to be diagonal. ■
We can rearrange the conclusion of this theorem to read A = UDU ∗ . Recall that<br />
a unitary matrix can be viewed as a geometry-preserv<strong>in</strong>g transformation (isometry),<br />
or more loosely as a rotation of sorts. Then a matrix-vector product, Ax, can be
viewed <strong>in</strong>stead as a sequence of three transformations. U ∗ is unitary, and so is a<br />
rotation. S<strong>in</strong>ce D is diagonal, it just multiplies each entry of a vector by a scalar.<br />
Diagonal entries that are positive or negative, with absolute values bigger or smaller<br />
than 1 evoke descriptions like reflection, expansion and contraction. Generally we<br />
can say that D “stretches” a vector <strong>in</strong> each component. F<strong>in</strong>al multiplication by U<br />
undoes (inverts) the rotation performed by U∗. So a normal matrix is a rotation-stretch-rotation transformation.
The orthonormal basis formed from the columns of U can be viewed as a system<br />
of mutually perpendicular axes. The rotation by U ∗ allows the transformation by A<br />
to be replaced by the simple transformation D along these axes, and then U brings
the result back to the orig<strong>in</strong>al coord<strong>in</strong>ate system. For this reason Theorem OD is<br />
known as the Pr<strong>in</strong>cipal Axis Theorem.<br />
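We can exhibit Theorem OD concretely for the matrix of Example ANM. The eigenvalues 1 ± i and the orthonormal eigenvectors below were found by hand from the characteristic polynomial (1 − x)^2 + 1; the sketch only verifies the claimed factorization:

```python
# For the normal matrix A = [[1, -1], [1, 1]], the unitary matrix U with
# columns (1/sqrt(2))[1, -i] and (1/sqrt(2))[1, i] satisfies
# U* A U = diag(1 + i, 1 - i).
from math import sqrt

def adjoint(M):
    n = len(M)
    return [[complex(M[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = 1 / sqrt(2)
A = [[1, -1], [1, 1]]
U = [[s, s], [-1j * s, 1j * s]]          # columns are the two eigenvectors

UstU = matmul(adjoint(U), U)             # U is unitary: U* U = I
assert all(abs(UstU[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(2) for j in range(2))

D = matmul(matmul(adjoint(U), A), U)     # U* A U should be diag(1+i, 1-i)
assert abs(D[0][0] - (1 + 1j)) < 1e-12
assert abs(D[1][1] - (1 - 1j)) < 1e-12
assert abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12
```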
The columns of the unitary matrix <strong>in</strong> Theorem OD create an especially nice<br />
basis for use with the normal matrix. We record this observation as a theorem.<br />
Theorem OBNM Orthonormal Bases and Normal Matrices<br />
Suppose that A is a normal matrix of size n. Then there is an orthonormal basis of<br />
C n composed of eigenvectors of A.<br />
Proof. Let U be the unitary matrix promised by Theorem OD and let D be the<br />
result<strong>in</strong>g diagonal matrix. The desired set of vectors is formed by collect<strong>in</strong>g the<br />
columns of U <strong>in</strong>to a set. Theorem CUMOS says this set of columns is orthonormal.<br />
S<strong>in</strong>ce U is nons<strong>in</strong>gular (Theorem UMI), Theorem CNMB says the set is a basis.<br />
S<strong>in</strong>ce A is diagonalized by U, the diagonal entries of the matrix D are the<br />
eigenvalues of A. An argument exactly like the second half of the proof of Theorem<br />
DC shows that each vector of the basis is an eigenvector of A.<br />
<br />
In a vague way Theorem OBNM is an improvement on Theorem HMOE which<br />
said that eigenvectors of a Hermitian matrix for different eigenvalues are always<br />
orthogonal. Hermitian matrices are normal and we see that we can f<strong>in</strong>d at least<br />
one basis where every pair of eigenvectors is orthogonal. Notice that this is not a<br />
generalization, s<strong>in</strong>ce Theorem HMOE states a weak result which applies to many<br />
(but not all) pairs of eigenvectors, while Theorem OBNM is a seem<strong>in</strong>gly stronger<br />
result, but only asserts that there is one collection of eigenvectors with the stronger<br />
property.<br />
Given an n × n matrix A, an orthonormal basis for C^n composed of eigenvectors of A is an extremely useful basis to have at the service of the matrix A. Why do
we say this? We can consider the vectors of a basis as a preferred set of directions,<br />
known as “axes,” which taken together might also be called a “coord<strong>in</strong>ate system.”<br />
The standard basis of Def<strong>in</strong>ition SUV could be considered the default, or prototype,
coordinate system. When a basis is orthonormal, we can consider the directions to
be standardized to have unit length, and we can consider the axes as be<strong>in</strong>g mutually<br />
perpendicular. But there is more — let us be a bit more formal.<br />
Suppose U is a matrix whose columns are an orthonormal basis of eigenvectors<br />
of the n × n matrix A. So, <strong>in</strong> particular U is a unitary matrix (Theorem CUMOS).<br />
For a vector x ∈ C^n, use the notation x̂ for the vector representation of x relative to the orthonormal basis. So the entries of x̂, used in a linear combination of the columns of U, will create x. With Definition MVP, we can write this relationship as
U ˆx = x<br />
S<strong>in</strong>ce U ∗ is the <strong>in</strong>verse of U (Def<strong>in</strong>ition UM), we can rearrange this equation as<br />
ˆx = U ∗ x<br />
This says we can easily create the vector representation relative to the orthonormal<br />
basis with a matrix-vector product of the adjo<strong>in</strong>t of U. Note that the adjo<strong>in</strong>t is much<br />
easier to compute than a matrix <strong>in</strong>verse, which would be one general way to obta<strong>in</strong><br />
a vector representation. This is our first observation about coord<strong>in</strong>atization relative<br />
to orthonormal basis. However, we already knew this, as we just have Theorem COB<br />
<strong>in</strong> disguise (see Exercise OD.T20).<br />
We also know that orthonormal bases play nicely with <strong>in</strong>ner products. Theorem<br />
UMPIP says unitary matrices preserve <strong>in</strong>ner products (and hence norms). More<br />
geometrically, lengths and angles are preserved by multiplication by a unitary matrix.<br />
Us<strong>in</strong>g our notation, this becomes<br />
〈x, y〉 = 〈U ˆx, Uŷ〉 = 〈ˆx, ŷ〉<br />
So we can compute <strong>in</strong>ner products with the orig<strong>in</strong>al vectors, or with their representations,<br />
and obta<strong>in</strong> the same result. It follows that norms, lengths, and angles can<br />
all be computed with the orig<strong>in</strong>al vectors or with the representations <strong>in</strong> the new<br />
coord<strong>in</strong>ate system based on an orthonormal basis.<br />
So far we have not really said anyth<strong>in</strong>g new, nor has the matrix A, oritseigenvectors,<br />
come <strong>in</strong>to play. We know that a matrix is really a l<strong>in</strong>ear transformation, so<br />
we express this view of a matrix as a function by writ<strong>in</strong>g generically that Ax = y.<br />
The matrix U will diagonalize A, creat<strong>in</strong>g the diagonal matrix D with diagonal<br />
entries equal to the eigenvalues of A. WecanwritethisasU ∗ AU = D and convert<br />
to U ∗ A = DU ∗ .Thenwehave<br />
ŷ = U ∗ y = U ∗ Ax = DU ∗ x = Dˆx<br />
So with the coord<strong>in</strong>atized vectors, the transformation by the matrix A can be<br />
accomplished with multiplication by a diagonal matrix D. A moment’s thought<br />
should conv<strong>in</strong>ce you that a matrix-vector product with a diagonal matrix is exeed<strong>in</strong>gly<br />
simple computationally. Geometrically, this is simply stretch<strong>in</strong>g, contract<strong>in</strong>g and/or<br />
reflect<strong>in</strong>g <strong>in</strong> the direction of each basis vector (“axis”). And the multiples used for<br />
these changes are the diagonal entries of the diagonal matrix, the eigenvalues of A.
So the new coordinate system (provided by the orthonormal basis of eigenvectors) is a collection of mutually perpendicular unit vectors where inner products are preserved, and the action of the matrix A is described by multiples (eigenvalues) of the entries of the coordinatized versions of the vectors. Nice.
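The relationship ŷ = Dx̂ can be checked numerically. Below is a small Python sketch (not from the text: the matrix A, the vector x, and the helper functions are invented for illustration) using a 2 × 2 symmetric matrix whose orthonormal eigenvectors form the columns of U.

```python
from math import sqrt, isclose

# Hypothetical example: A is symmetric (hence normal), with orthonormal
# eigenvectors (1,-1)/sqrt(2) and (1,1)/sqrt(2), eigenvalues 1 and 3.
A = [[2.0, 1.0],
     [1.0, 2.0]]
s = 1.0 / sqrt(2.0)
U = [[s,  s],           # columns of U are the orthonormal eigenvectors
     [-s, s]]

def mat_vec(M, v):
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

def adjoint(M):
    # for a real matrix the adjoint is just the transpose
    return [[M[j][i] for j in range(2)] for i in range(2)]

x = [3.0, 1.0]
xhat = mat_vec(adjoint(U), x)     # coordinatize: xhat = U* x
y = mat_vec(A, x)                 # transform:    y = A x
yhat = mat_vec(adjoint(U), y)     # coordinatize the output

D = [1.0, 3.0]                    # eigenvalues of A, the diagonal of D
Dxhat = [D[i] * xhat[i] for i in range(2)]

# yhat = D xhat: in the new coordinates, A acts by scaling each entry
assert all(isclose(yhat[i], Dxhat[i]) for i in range(2))
```

The assertion at the end verifies exactly the chain ŷ = U∗Ax = Dx̂ for this one matrix and vector; it is a spot check, not a proof.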
Reading Questions

1. Name three broad classes of normal matrices that we have studied previously. No set that you give should be a subset of another on your list.

2. Compare and contrast Theorem UTMR with Theorem OD.

3. Given an n × n matrix A, why would you desire an orthonormal basis of Cⁿ composed entirely of eigenvectors of A?
Exercises

T10 Exercise MM.T35 asked you to show that AA∗ is Hermitian. Prove directly that AA∗ is a normal matrix.

T20 In the discussion following Theorem OBNM we comment that the equation x̂ = U∗x is just Theorem COB in disguise. Formulate this observation more formally and prove the equivalence.

T30 For the proof of Theorem PTMT we only show that the product of two lower triangular matrices is again lower triangular. Provide a proof that the product of two upper triangular matrices is again upper triangular. Look to the proof of Theorem PTMT for guidance if you need a hint.
Chapter P
Preliminaries

"Preliminaries" are basic mathematical concepts you are likely to have seen before. So we have collected them near the end as reference material (despite the name). Head back here when you need a refresher, or when a theorem or exercise builds on some of this basic material.
Section CNO
Complex Number Operations

In this section we review some of the basics of working with complex numbers.

Subsection CNA
Arithmetic with complex numbers

A complex number is a linear combination of 1 and i = √−1, typically written in the form a + bi. Complex numbers can be added, subtracted, multiplied and divided, just like we are used to doing with real numbers, including the restriction on division by zero. We will not define these operations carefully immediately, but instead first illustrate with examples.

Example ACN Arithmetic of complex numbers

(2 + 5i) + (6 − 4i) = (2 + 6) + (5 + (−4))i = 8 + i
(2 + 5i) − (6 − 4i) = (2 − 6) + (5 − (−4))i = −4 + 9i
(2 + 5i)(6 − 4i) = (2)(6) + (5i)(6) + (2)(−4i) + (5i)(−4i) = 12 + 30i − 8i − 20i²
                 = 12 + 22i − 20(−1) = 32 + 22i
Division takes just a bit more care. We multiply the denominator by a complex number chosen to produce a real number, and then we can produce a complex number as a result.

(2 + 5i)/(6 − 4i) = (2 + 5i)/(6 − 4i) · (6 + 4i)/(6 + 4i) = (−8 + 38i)/52 = −8/52 + (38/52)i = −2/13 + (19/26)i △

In this example, we used 6 + 4i to convert the denominator in the fraction to a real number. This number is known as the conjugate, which we define in the next section.
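These operations map directly onto Python's built-in complex type, where 1j plays the role of i. A quick check of the examples above (the variable names are ours):

```python
# Python's built-in complex type; 1j plays the role of i.
a = 2 + 5j
b = 6 - 4j

assert a + b == 8 + 1j      # addition, as in Example ACN
assert a - b == -4 + 9j     # subtraction
assert a * b == 32 + 22j    # multiplication

# Division: Python multiplies through by the conjugate internally,
# giving (-8 + 38j)/52 = -2/13 + (19/26)j.
q = a / b
assert abs(q - complex(-2 / 13, 19 / 26)) < 1e-12
```

The division check uses a small tolerance since the quotient is stored in floating point.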
We will often exploit the basic properties of complex number addition, subtraction, multiplication and division, so we will carefully define the two basic operations, together with a definition of equality, and then collect nine basic properties in a theorem.

Definition CNE Complex Number Equality
The complex numbers α = a + bi and β = c + di are equal, denoted α = β, if a = c and b = d. □

Definition CNA Complex Number Addition
The sum of the complex numbers α = a + bi and β = c + di, denoted α + β, is (a + c) + (b + d)i. □

Definition CNM Complex Number Multiplication
The product of the complex numbers α = a + bi and β = c + di, denoted αβ, is (ac − bd) + (ad + bc)i. □
Theorem PCNA Properties of Complex Number Arithmetic
The operations of addition and multiplication of complex numbers have the following properties.

• ACCN Additive Closure, Complex Numbers
If α, β ∈ C, then α + β ∈ C.

• MCCN Multiplicative Closure, Complex Numbers
If α, β ∈ C, then αβ ∈ C.

• CACN Commutativity of Addition, Complex Numbers
For any α, β ∈ C, α + β = β + α.

• CMCN Commutativity of Multiplication, Complex Numbers
For any α, β ∈ C, αβ = βα.

• AACN Additive Associativity, Complex Numbers
For any α, β, γ ∈ C, α + (β + γ) = (α + β) + γ.
• MACN Multiplicative Associativity, Complex Numbers
For any α, β, γ ∈ C, α(βγ) = (αβ)γ.

• DCN Distributivity, Complex Numbers
For any α, β, γ ∈ C, α(β + γ) = αβ + αγ.

• ZCN Zero, Complex Numbers
There is a complex number 0 = 0 + 0i so that for any α ∈ C, 0 + α = α.

• OCN One, Complex Numbers
There is a complex number 1 = 1 + 0i so that for any α ∈ C, 1α = α.

• AICN Additive Inverse, Complex Numbers
For every α ∈ C there exists −α ∈ C so that α + (−α) = 0.

• MICN Multiplicative Inverse, Complex Numbers
For every α ∈ C, α ≠ 0, there exists 1/α ∈ C so that α(1/α) = 1.
Proof. We could derive each of these properties of complex numbers with a proof that builds on the identical properties of the real numbers. The only proof that might be at all interesting would be to show Property MICN since we would need to trot out a conjugate. For this property, and especially for the others, we might be tempted to construct proofs of the identical properties for the reals. This would take us way too far afield, so we will draw a line in the sand right here and just agree that these nine fundamental behaviors are true. OK?

Mostly we have stated these nine properties carefully so that we can make reference to them later in other proofs. So we will be linking back here often. ■

Zero and one play special roles, of course, and especially zero. Our first result is one we take for granted, but it requires a proof, derived from our nine properties. You can compare it to its vector space counterparts, Theorem ZSSM and Theorem ZVSM.
Theorem ZPCN Zero Product, Complex Numbers
Suppose α ∈ C. Then 0α = 0.

Proof.

0α = 0α + 0               Property ZCN
   = 0α + (0α − (0α))     Property AICN
   = (0α + 0α) − (0α)     Property AACN
   = (0 + 0)α − (0α)      Property DCN
   = 0α − (0α)            Property ZCN
   = 0                    Property AICN ■
Our next theorem could be called "cancellation", since it will make that possible. Though you will never see us drawing slashes through parts of products. We will also make very limited use of this result, or its vector space counterpart, Theorem SMEZV.

Theorem ZPZT Zero Product, Zero Terms
Suppose α, β ∈ C. Then αβ = 0 if and only if at least one of α = 0 or β = 0.

Proof. (⇒) We conduct the forward argument in two cases. First suppose that α = 0. Then we are done. (That was easy.)

For the second case, suppose now that α ≠ 0. Then

β = 1β               Property OCN
  = ((1/α)α)β        Property MICN
  = (1/α)(αβ)        Property MACN
  = (1/α)0           Hypothesis
  = 0                Theorem ZPCN

(⇐) With two applications of Theorem ZPCN it is easy to see that if one of the scalars is zero, then so is the product. ■
As an equivalence (Proof Technique E), we could restate this result as the contrapositive (Proof Technique CP) by negating each statement, so it would read "αβ ≠ 0 if and only if α ≠ 0 and β ≠ 0." After you have learned more about nonsingular matrices and matrix multiplication, you should compare this result with Theorem NPNT.
Subsection CCN
Conjugates of Complex Numbers

Definition CCN Conjugate of a Complex Number
The conjugate of the complex number α = a + bi ∈ C is the complex number \overline{α} = a − bi. □

Example CSCN Conjugate of some complex numbers

\overline{2 + 3i} = 2 − 3i   \overline{5 − 4i} = 5 + 4i   \overline{−3 + 0i} = −3 + 0i   \overline{0 + 0i} = 0 + 0i

△
Notice how the conjugate of a real number leaves the number unchanged. The conjugate enjoys some basic properties that are useful when we work with linear expressions involving addition and multiplication.

Theorem CCRA Complex Conjugation Respects Addition
Suppose that α and β are complex numbers. Then \overline{α + β} = \overline{α} + \overline{β}.

Proof. Let α = a + bi and β = r + si. Then

\overline{α + β} = \overline{(a + r) + (b + s)i} = (a + r) − (b + s)i = (a − bi) + (r − si) = \overline{α} + \overline{β} ■

Theorem CCRM Complex Conjugation Respects Multiplication
Suppose that α and β are complex numbers. Then \overline{αβ} = \overline{α}\,\overline{β}.

Proof. Let α = a + bi and β = r + si. Then

\overline{αβ} = \overline{(ar − bs) + (as + br)i} = (ar − bs) − (as + br)i
             = (ar − (−b)(−s)) + (a(−s) + (−b)r)i = (a − bi)(r − si) = \overline{α}\,\overline{β} ■

Theorem CCT Complex Conjugation Twice
Suppose that α is a complex number. Then \overline{\overline{α}} = α.

Proof. Let α = a + bi. Then

\overline{\overline{α}} = \overline{a − bi} = a − (−bi) = a + bi = α ■
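Python's complex type exposes conjugation as the `.conjugate()` method, so the three theorems above can be spot-checked on random inputs (the loop and sample values are ours, not from the text; a check on finitely many numbers illustrates the theorems but does not prove them):

```python
import random

# Spot-check Theorems CCRA, CCRM and CCT with Python's .conjugate()
# on randomly chosen complex numbers.
rng = random.Random(0)
for _ in range(100):
    a = complex(rng.uniform(-10, 10), rng.uniform(-10, 10))
    b = complex(rng.uniform(-10, 10), rng.uniform(-10, 10))
    # CCRA: conjugation respects addition
    assert abs((a + b).conjugate() - (a.conjugate() + b.conjugate())) < 1e-9
    # CCRM: conjugation respects multiplication
    assert abs((a * b).conjugate() - (a.conjugate() * b.conjugate())) < 1e-9
    # CCT: conjugating twice gives the number back (exactly)
    assert a.conjugate().conjugate() == a
```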
Subsection MCN
Modulus of a Complex Number

We define one more operation with complex numbers that may be new to you.

Definition MCN Modulus of a Complex Number
The modulus of the complex number α = a + bi ∈ C is the nonnegative real number

|α| = √(\overline{α}α) = √(a² + b²). □

Example MSCN Modulus of some complex numbers

|2 + 3i| = √13   |5 − 4i| = √41   |−3 + 0i| = 3   |0 + 0i| = 0

△
The modulus can be interpreted as a version of the absolute value for complex numbers, as is suggested by the notation employed. You can see this in how |−3| = |−3 + 0i| = 3. Notice too how the modulus of the complex zero, 0 + 0i, has value 0.
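In Python, `abs()` applied to a complex number computes exactly this modulus, so both formulas in Definition MCN can be checked on the numbers from Example MSCN:

```python
from math import sqrt, isclose

# abs() on a Python complex number is the modulus defined above.
for z in (2 + 3j, 5 - 4j, -3 + 0j, 0 + 0j):
    # |alpha| = sqrt(a^2 + b^2)
    assert isclose(abs(z), sqrt(z.real ** 2 + z.imag ** 2))
    # |alpha| = sqrt(conj(alpha) * alpha); the product is real and nonnegative
    assert isclose(abs(z), sqrt((z.conjugate() * z).real))

assert isclose(abs(2 + 3j), sqrt(13))
assert abs(-3 + 0j) == 3.0
```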
Section SET
Sets

We will frequently work carefully with sets, so the material in this review section is very important. If these topics are new to you, study this section carefully and consider consulting another text for a more comprehensive introduction.

Subsection SET
Sets

Definition SET Set
A set is an unordered collection of objects. If S is a set and x is an object that is in the set S, we write x ∈ S. If x is not in S, then we write x ∉ S. We refer to the objects in a set as its elements. □

Hard to get much more basic than that. Notice that the objects in a set can be anything, and there is no notion of order among the elements of the set. A set can be finite as well as infinite. A set can contain other sets as its objects. At a primitive level, a set is just a way to break up some class of objects into two groupings: those objects in the set, and those objects not in the set.

Example SETM Set membership
From the set of all possible symbols, construct the following set of three symbols,

S = {, , ⋆}

Then the statement  ∈ S is true, while the statement  ∈ S is false. However, the statement  ∉ S is then true. △
A portion of a set is known as a subset. Notice how the following definition uses an implication (if ... then ...). Note too how the definition of a subset relies on the definition of a set through the idea of set membership.

Definition SSET Subset
If S and T are two sets, then S is a subset of T, written S ⊆ T, if whenever x ∈ S then x ∈ T. □

If we want to disallow the possibility that S is the same as T, we use the notation S ⊂ T and we say that S is a proper subset of T. We will do an example, but first we will define a special set.

Definition ES Empty Set
The empty set is the set with no elements. It is denoted by ∅. □

Example SSET Subset
If S = {, , ⋆}, T = {⋆, }, R = {, ⋆}, then

T ⊆ S   R ⊈ T   ∅ ⊆ S
T ⊂ S   S ⊆ S   S ⊄ S

△

What does it mean for two sets to be equal? They must be the same. Well, that explanation is not really too helpful, is it? How about: If A ⊆ B and B ⊆ A, then A equals B. This gives us something to work with: if A is a subset of B, and vice versa, then they must really be the same set. We will now make the symbol "=" do double duty and extend its use to statements like A = B, where A and B are sets. Here is the definition, which we will reference often.

Definition SE Set Equality
Two sets, S and T, are equal, if S ⊆ T and T ⊆ S. In this case, we write S = T. □
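Python's set type supports the subset relations and equality test used here directly (the element names below are ours, standing in for the symbols of Example SSET):

```python
# Python sets support the subset relations of this section directly.
S = {"a", "b", "c"}
T = {"a", "b"}

assert T <= S            # T is a subset of S        (T ⊆ S)
assert T < S             # T is a proper subset      (T ⊂ S)
assert S <= S            # every set is a subset of itself
assert not (S < S)       # ... but never a proper subset of itself
assert set() <= S        # the empty set is a subset of any set

# Definition SE in code: mutual inclusion is exactly equality.
A = {1, 2, 3}
B = {3, 2, 1, 1}         # duplicates and order are irrelevant
assert A <= B and B <= A
assert A == B
```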
Sets are typically written inside of braces, as { }, as we have seen above. However, when sets have more than a few elements, a description will typically have two components. The first is a description of the general type of objects contained in a set, while the second is some sort of restriction on the properties the objects have. Every object in the set must be of the type described in the first part and it must satisfy the restrictions in the second part. Conversely, any object of the proper type for the first part, that also meets the conditions of the second part, will be in the set. These two parts are set off from each other somehow, often with a vertical bar (|) or a colon (:).

I like to think of sets as clubs. The first part is some description of the type of people who might belong to the club, the basic objects. For example, a bicycle club would describe its members as being people who like to ride bicycles. The second part is like a membership committee, it restricts the people who are allowed in the club. Continuing with our bicycle club analogy, we might decide to limit ourselves to "serious" riders and only have members who can document having ridden 100 kilometers or more in a single day at least one time.

The restrictions on membership can migrate around some between the first and second part, and there may be several ways to describe the same set of objects. Here is a more mathematical example, employing the set of all integers, Z, to describe the set of even integers.

E = {x ∈ Z | x is an even number}
  = {x ∈ Z | 2 divides x evenly}
  = {2k | k ∈ Z}

Notice how this set tells us that its objects are integer numbers (not, say, matrices or functions, for example) and just those that are even. So we can write that 10 ∈ E,
while 17 ∉ E once we check the membership criteria. We also recognize the question

[ 1 −3 5 ]
[ 2  0 3 ]  ∈ E?

as being simply ridiculous.
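In code, an infinite set like E cannot be stored whole, but the two styles of description above both translate naturally: a set comprehension for a finite slice of E, or a membership predicate (the names E_slice and in_E are ours):

```python
# The set E of even integers is infinite, so in code we either take a
# finite slice of it or express membership as a predicate.
E_slice = {2 * k for k in range(-50, 51)}    # {2k | k in Z}, restricted

def in_E(x):
    # membership test: "2 divides x evenly"
    return isinstance(x, int) and x % 2 == 0

assert 10 in E_slice and in_E(10)
assert 17 not in E_slice and not in_E(17)
assert in_E(0) and in_E(-4)
assert not in_E(3.5)    # not even an integer, let alone an even one
```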
Subsection SC
Set Cardinality

On occasion, we will be interested in the number of elements in a finite set. Here is the definition and the associated notation.

Definition C Cardinality
Suppose S is a finite set. Then the number of elements in S is called the cardinality or size of S, and is denoted |S|. □

Example CS Cardinality and Size
If S = {, ⋆, }, then |S| = 3. △
Subsection SO
Set Operations

In this subsection we define and illustrate the three most common basic ways to manipulate sets to create other sets. Since much of linear algebra is about sets, we will use these often.

Definition SU Set Union
Suppose S and T are sets. Then the union of S and T, denoted S ∪ T, is the set whose elements are those that are elements of S or of T, or both. More formally,

x ∈ S ∪ T if and only if x ∈ S or x ∈ T □

Notice that the use of the word "or" in this definition is meant to be nonexclusive. That is, it allows for x to be an element of both S and T and still qualify for membership in S ∪ T.

Example SU Set union
If S = {, ⋆, } and T = {, ⋆, } then S ∪ T = {, ⋆, , }. △

Definition SI Set Intersection
Suppose S and T are sets. Then the intersection of S and T, denoted S ∩ T, is the set whose elements are only those that are elements of S and of T. More formally,

x ∈ S ∩ T if and only if x ∈ S and x ∈ T □
Example SI Set intersection
If S = {, ⋆, } and T = {, ⋆, } then S ∩ T = {, ⋆}. △

The union and intersection of sets are operations that begin with two sets and produce a third, new, set. Our final operation is the set complement, which we usually think of as an operation that takes a single set and creates a second, new, set. However, if you study the definition carefully, you will see that it needs to be computed relative to some "universal" set.

Definition SC Set Complement
Suppose S is a set that is a subset of a universal set U. Then the complement of S, denoted \overline{S}, is the set whose elements are those that are elements of U and not elements of S. More formally,

x ∈ \overline{S} if and only if x ∈ U and x ∉ S □

Notice that there is nothing at all special about the universal set. This is simply a term that suggests that U contains all of the possible objects we are considering. Often this set will be clear from the context, and we will not think much about it, nor reference it in our notation. In other cases (rarely in our work in this course) the exact nature of the universal set must be made explicit, and reference to it will possibly be carried through in our choice of notation.

Example SC Set complement
If U = {, ⋆, , } and S = {, ⋆, } then \overline{S} = {}. △
There are many more natural operations that can be performed on sets, such as an exclusive-or and the symmetric difference. Many of these can be defined in terms of the union, intersection and complement. We will not have much need of them in this course, and so we will not give precise descriptions here in this preliminary section.

There is also an interesting variety of basic results that describe the interplay of these operations with each other. We mention just two as an example; these are known as DeMorgan's Laws.

\overline{S ∪ T} = \overline{S} ∩ \overline{T}
\overline{S ∩ T} = \overline{S} ∪ \overline{T}

Besides having an appealing symmetry, we mention these two facts since constructing the proofs of each is a useful exercise that will require a solid understanding of all but one of the definitions presented in this section. Give it a try.
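Union, intersection and the relative complement are all built into Python sets, so DeMorgan's Laws can be spot-checked on a small example (the universal set and its elements below are ours; checking one example is not a proof):

```python
# Union, intersection, complement and DeMorgan's Laws with Python sets.
U = {"square", "star", "heart", "diamond"}    # a small universal set
S = {"square", "star"}
T = {"star", "heart"}

assert S | T == {"square", "star", "heart"}   # union
assert S & T == {"star"}                      # intersection

def complement(X):
    # complement relative to the universal set U
    return U - X

# DeMorgan's Laws hold on this example
assert complement(S | T) == complement(S) & complement(T)
assert complement(S & T) == complement(S) | complement(T)
```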
Reference

Proof Techniques

In this section we collect many short essays designed to help you understand how to read, understand and construct proofs. Some are very factual, while others consist of advice. They appear in the order that they are first needed (or advisable) in the text, and are meant to be self-contained. So you should not think of reading through this section in one sitting as you begin this course. But be sure to head back here for a first reading whenever the text suggests it. Also think about returning to browse at various points during the course, and especially as you struggle with becoming an accomplished mathematician who is comfortable with the difficult process of designing new proofs.

Proof Technique D
Definitions

A definition is a made-up term, used as a kind of shortcut for some typically more complicated idea. For example, we say a whole number is even as a shortcut for saying that when we divide the number by two we get a remainder of zero. With a precise definition, we can answer certain questions unambiguously. For example, did you ever wonder if zero was an even number? Now the answer should be clear since we have a precise definition of what we mean by the term even.

A single term might have several possible definitions. For example, we could say that the whole number n is even if there is some whole number k such that n = 2k. We say this is an equivalent definition since it categorizes even numbers the same way our first definition does.

Definitions are like two-way streets — we can use a definition to replace something rather complicated by its definition (if it fits) and we can replace a definition by its more complicated description. A definition is usually written as some form of an implication, such as "If something-nice-happens, then blatzo." However, this also means that "If blatzo, then something-nice-happens," even though this may not be
formally stated. This is what we mean when we say a definition is a two-way street — it is really two implications, going in opposite "directions."

Anybody (including you) can make up a definition, so long as it is unambiguous, but the real test of a definition's utility is whether or not it is useful for concisely describing interesting or frequent situations. We will try to restrict our definitions to parts of speech that are nouns (e.g. "matrix") or adjectives (e.g. "nonsingular" matrix), and so avoid definitions that are verbs or adverbs. Therefore our definitions will describe an object (noun) or a property of an object (adjective).

We will talk about theorems later (and especially equivalences). For now, be sure not to confuse the notion of a definition with that of a theorem.

In this book, we will display every new definition carefully set off from the text, and the term being defined will be written thus: definition. Additionally, there is a full list of all the definitions, in order of their appearance, located in the reference section of the same name (Definitions). Definitions are critical to doing mathematics and proving theorems, so we have given you lots of ways to locate a definition should you forget its . . . uh, well, . . . definition.

Can you formulate a precise definition for what it means for a number to be odd? (Do not just say it is the opposite of even. Act as if you do not have a definition for even yet.) Can you formulate your definition a second, equivalent, way? Can you employ your definition to test an odd and an even number for "odd-ness"?
Proof Technique T
Theorems

Higher mathematics is about understanding theorems. Reading them, understanding them, applying them, proving them. Every theorem is a shortcut — we prove something in general, and then whenever we find a specific instance covered by the theorem we can immediately say that we know something else about the situation by applying the theorem. In many cases, this new information can be gained with much less effort than if we did not know the theorem.

The first step in understanding a theorem is to realize that the statement of every theorem can be rewritten using statements of the form "If something-happens, then something-else-happens." The "something-happens" part is the hypothesis and the "something-else-happens" is the conclusion. To understand a theorem, it helps to rewrite its statement using this construction. To apply a theorem, we verify that "something-happens" in a particular instance and immediately conclude that "something-else-happens." To prove a theorem, we must argue based on the assumption that the hypothesis is true, and conclude, through the process of logic, that the conclusion must then also be true.
Proof Technique L<br />
Language<br />
Like any science, the language of math must be understood before further<br />
study can cont<strong>in</strong>ue.<br />
Er<strong>in</strong> Wilson, Student<br />
September, 2004<br />
Mathematics is a language. It is a way to express complicated ideas clearly,<br />
precisely, and unambiguously. Because of this, it can be difficult to read. Read slowly,<br />
and have pencil and paper at hand. It will usually be necessary to read someth<strong>in</strong>g<br />
several times. While read<strong>in</strong>g can be difficult, it is even harder to speak mathematics,<br />
and so that is the topic of this technique.<br />
“Natural” language, in the present case English, is fraught with ambiguity. Consider
the possible meanings of the sentence: The fish is ready to eat. One fish, or two
fish? Are the fish hungry, or will the fish be eaten? (See Exercise SSLE.M10, Exercise
SSLE.M11, Exercise SSLE.M12, Exercise SSLE.M13.) In your daily interactions with
others, give some thought to how many misunderstandings arise from the ambiguity
of pronouns, modifiers and objects.
I am going to suggest a simple modification to the way you use language that
will make it much, much easier to become proficient at speaking mathematics and
eventually it will become second nature. Think of it as a training aid or practice
drill you might use when learning to become skilled at a sport.
First, eliminate pronouns from your vocabulary when discussing linear algebra,
in class or with your colleagues. Do not use: it, that, those, their or similar sources
of confusion. This is the single easiest step you can take to make your oral expression
of mathematics clearer to others, and in turn, it will greatly help your own
understanding.
Second, rid yourself of the word “thing” (or variants like “something”). When
you are tempted to use this word, realize that there is some object you want to
discuss, and we likely have a definition for that object (see the discussion at Proof
Technique D). Always “think about your objects” and many aspects of the study of
mathematics will get easier. Ask yourself: “Am I working with a set, a number, a
function, an operation, a differential equation, or what?” Knowing what an object
is will allow you to narrow down the procedures you may apply to it. If you have
studied an object-oriented computer programming language, then you will already
have experience identifying objects and thinking carefully about what procedures
are allowed to be applied to them.
Third, eliminate the verb “works” (as in “the equation works”) from your
vocabulary. This term is used as a substitute when we are not sure just what we
are trying to accomplish. Usually we are trying to say that some object fulfills some
condition. The condition might even have a definition associated with it, making it
even easier to describe.
Last, speak slooooowly and thoughtfully as you try to get by without all these
lazy words. It is hard at first, but you will get better with practice. Especially in class,
when the pressure is on and all eyes are on you, do not succumb to the temptation
to use these weak words. Slow down; we would all rather wait for a slow, well-formed
question or answer than a fast, sloppy, incomprehensible one.
You will find the improvement in your ability to speak clearly about complicated
ideas will greatly improve your ability to think clearly about complicated ideas.
And I believe that you cannot think clearly about complicated ideas if you cannot
formulate questions or answers clearly in the correct language. This is as applicable
to the study of law, economics or philosophy as it is to the study of science or
mathematics.
In this spirit, Dupont Hubert has contributed the following quotation, which is
widely used in French mathematics courses (and which might be construed as the
contrapositive of Proof Technique CP):

Ce que l’on conçoit bien s’énonce clairement,
Et les mots pour le dire arrivent aisément.

which translates as

Whatever is well conceived is clearly said,
And the words to say it flow with ease.

Nicolas Boileau
L’art poétique
Chant I, 1674
So when you come to class, check your pronouns at the door, along with other
weak words. And when studying with friends, you might make a game of catching
one another using pronouns, “thing,” or “works.” I know I’ll be calling you on it!
Proof Technique GS
Getting Started

“I don’t know how to get started!” is often the lament of the novice proof-builder.
Here are a few pieces of advice.

1. As mentioned in Proof Technique T, rewrite the statement of the theorem in
an “if-then” form. This will simplify identifying the hypothesis and conclusion,
which are referenced in the next few items.
Techniques Beezer: A <strong>First</strong> <strong>Course</strong> <strong>in</strong> L<strong>in</strong>ear <strong>Algebra</strong> 595<br />
2. Ask yourself what k<strong>in</strong>d of statement you are try<strong>in</strong>g to prove. This is always<br />
part of your conclusion. Are you be<strong>in</strong>g asked to conclude that two numbers<br />
are equal, that a function is differentiable or a set is a subset of another?<br />
You cannot br<strong>in</strong>g other techniques to bear if you do not know what type of<br />
conclusion you have.<br />
3. Write down reformulations of your hypotheses. Interpret and translate each
definition properly.

4. Write your hypothesis at the top of a sheet of paper and your conclusion at the
bottom. See if you can formulate a statement that precedes the conclusion and
also implies it. Work down from your hypothesis, and up from your conclusion,
and see if you can meet in the middle. When you are finished, rewrite the
proof nicely, from hypothesis to conclusion, with verifiable implications giving
each subsequent statement.
5. As you work through your proof, think about what kinds of objects your
symbols represent. For example, suppose A is a set and f(x) is a real-valued
function. Then the expression A + f might make no sense if we have not
defined what it means to “add” a set to a function, so we can stop at that
point and adjust accordingly. On the other hand, we might understand 2f to
be the function whose rule is described by (2f)(x) = 2f(x). “Think about
your objects” means to always verify that your objects and operations are
compatible.
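This compatibility check can be mimicked in a programming language. The following Python sketch (a hypothetical illustration, not part of the text) shows that “adding” a set to a function is undefined, while 2f can be given a sensible meaning:

```python
# Hypothetical illustration: operations must be compatible with their objects.
A = {1, 2, 3}                      # A is a set

def f(x):                          # f is a real-valued function
    return x * x

# "A + f" is undefined: Python refuses to add a set to a function.
try:
    A + f
except TypeError as e:
    print("A + f is undefined:", e)

def scale(c, g):
    """Return the function whose rule is x -> c * g(x), i.e. (cg)(x) = c g(x)."""
    return lambda x: c * g(x)

two_f = scale(2, f)                # the function with rule (2f)(x) = 2 f(x)
print(two_f(3))                    # 2 * f(3) = 2 * 9 = 18
```

An object-oriented language makes the same point at the level of types: the operations you may apply depend on what kind of object you hold.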
Proof Technique C
Constructive Proofs

Conclusions of proofs come in a variety of types. Often a theorem will simply assert
that something exists. The best way, but not the only way, to show something exists
is to actually build it. Such a proof is called constructive. The thing to realize
about constructive proofs is that the proof itself will contain a procedure that might
be used computationally to construct the desired object. If the procedure is not too
cumbersome, then the proof itself is as useful as the statement of the theorem.
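As a hypothetical illustration (not an example from the text), consider the constructive proof that every positive integer can be written as a sum of distinct powers of 2: repeatedly remove the largest power of 2 that fits. The proof's procedure runs directly as code:

```python
def distinct_powers_of_two(n):
    """Constructively decompose n >= 1 as a sum of distinct powers of 2.
    Mirrors the constructive existence proof: at each step, remove the
    largest power of 2 not exceeding what remains."""
    assert n >= 1
    parts = []
    while n > 0:
        p = 1
        while 2 * p <= n:
            p *= 2
        parts.append(p)
        n -= p
    return parts

print(distinct_powers_of_two(100))   # [64, 32, 4], and 64 + 32 + 4 = 100
```

The procedure is the proof: running it on any positive integer produces the object whose existence the theorem asserts.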
Proof Technique E
Equivalences

When a theorem uses the phrase “if and only if” (or the abbreviation “iff”) it is a
shorthand way of saying that two if-then statements are true. So if a theorem says
“P if and only if Q,” then it is true that “if P, then Q” while it is also true that “if
Q, then P.” For example, it may be a theorem that “I wear bright yellow knee-high
plastic boots if and only if it is raining.” This means that I never forget to wear my
super-duper yellow boots when it is raining and I would not be seen in such silly
boots unless it was raining. You never have one without the other. Either I have my
boots on and it is raining, or I do not have my boots on and it is dry.
The upshot for proving such theorems is that it is like a 2-for-1 sale: we get to do
two proofs. Assume P and conclude Q, then start over and assume Q and conclude
P. For this reason, “if and only if” is sometimes abbreviated by ⇐⇒, while proofs
indicate which of the two implications is being proved by prefacing each with ⇒ or
⇐. A carefully written proof will remind the reader which statement is being used
as the hypothesis; a quicker version will let the reader deduce it from the direction
of the arrow. Tradition dictates we do the “easy” half first, but that is hard for a
student to know until you have finished doing both halves! Oh well, if you rewrite
your proofs (a good habit), you can then choose to put the easy half first.
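The two-for-one structure can even be checked mechanically. In this hypothetical Python sketch (not part of the text), exhausting the four truth assignments confirms that “P if and only if Q” says exactly that both implications hold:

```python
from itertools import product

def implies(p, q):
    # "if p, then q" is false only when p is true and q is false
    return (not p) or q

# For booleans, "P if and only if Q" is simply P == Q.
for P, Q in product([True, False], repeat=2):
    iff = (P == Q)                          # the single "iff" statement
    both = implies(P, Q) and implies(Q, P)  # the two separate proofs
    assert iff == both
print("P <=> Q agrees with (P => Q) and (Q => P) in all four cases")
```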
Theorems of this type are called “equivalences” or “characterizations,” and they
are some of the most pleasing results in mathematics. They say that two objects, or
two situations, are really the same. You do not have one without the other, like rain
and my yellow boots. The more different P and Q seem to be, the more pleasing
it is to discover they are really equivalent. And if P describes a very mysterious
solution or involves a tough computation, while Q is transparent or involves easy
computations, then we have found a great shortcut for better understanding or faster
computation. Remember that every theorem really is a shortcut in some form. You
will also discover that if proving P ⇒ Q is very easy, then proving Q ⇒ P is likely
to be proportionately harder. Sometimes the two halves are about equally hard. And
in rare cases, you can string together a whole sequence of other equivalences to form
the one you are after and you do not even need to do two halves. In this case, the
argument of one half is just the argument of the other half, but in reverse.
One last thing about equivalences. If you see a statement of a theorem that says
two things are “equivalent,” translate it first into an “if and only if” statement.
Proof Technique N
Negation

When we construct the contrapositive of a theorem (Proof Technique CP), we need
to negate the two statements in the implication. And when we construct a proof
by contradiction (Proof Technique CD), we need to negate the conclusion of the
theorem. One way to construct a converse (Proof Technique CV) is to simultaneously
negate the hypothesis and conclusion of an implication (but remember that this
is not guaranteed to be a true statement). So we often have the need to negate
statements, and in some situations it can be tricky.

If a statement says that a set is empty, then its negation is the statement that
the set is nonempty. That is straightforward. Suppose a statement says “something-happens”
for all i, or every i, or any i. Then the negation is that “something-does-not-happen”
for at least one value of i. If a statement says that there exists at
least one “thing,” then the negation is the statement that there is no “thing.” If a
statement says that a “thing” is unique, then the negation is that there is zero, or
more than one, of the “thing.”
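Over a finite index set, Python's all and any make the quantifier rule concrete. This hypothetical sketch (not part of the text) checks both directions of the rule:

```python
# De Morgan's laws for quantifiers over a finite index set (hypothetical example).
values = [2, 4, 6, 7, 8]
happens = lambda i: i % 2 == 0   # "something-happens" at index i

# not (for all i, something-happens)  ==  (for at least one i, something-does-not-happen)
assert (not all(happens(i) for i in values)) == any(not happens(i) for i in values)

# not (there exists i, something-happens)  ==  (for all i, something-does-not-happen)
assert (not any(happens(i) for i in values)) == all(not happens(i) for i in values)
print("quantifier negation checks pass")
```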
We are not covering all of the possibilities, but we wish to make the point that
logical quantifiers like “there exists” or “for every” must be handled with care when
negating statements. Studying the proofs which employ contradiction (as listed
in Proof Technique CD) is a good first step towards understanding the range of
possibilities.
Proof Technique CP
Contrapositives

The contrapositive of an implication P ⇒ Q is the implication not(Q) ⇒ not(P),
where “not” means the logical negation, or opposite. An implication is true if and
only if its contrapositive is true. In symbols, (P ⇒ Q) ⇐⇒ (not(Q) ⇒ not(P))
is a theorem. Such statements about logic, that are always true, are known as
tautologies.
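That this is a tautology can be verified by brute force over the four possible truth assignments, as in this hypothetical Python sketch (not part of the text):

```python
from itertools import product

def implies(p, q):
    # "if p, then q" is false only when p is true and q is false
    return (not p) or q

# (P => Q) <=> (not Q => not P): an implication and its contrapositive
# take the same truth value under every assignment, so this is a tautology.
for P, Q in product([True, False], repeat=2):
    assert implies(P, Q) == implies(not Q, not P)
print("an implication and its contrapositive agree in all four cases")
```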
For example, it is a theorem that “if a vehicle is a fire truck, then it has big
tires and has a siren.” (Yes, I’m sure you can conjure up a counterexample, but play
along with me anyway.) The contrapositive is “if a vehicle does not have big tires or
does not have a siren, then it is not a fire truck.” Notice how the “and” became an
“or” when we negated the conclusion of the original theorem.
It will frequently happen that it is easier to construct a proof of the contrapositive
than of the original implication. If you are having difficulty formulating a proof of
some implication, see if the contrapositive is easier for you. The trick is to construct
the negation of complicated statements accurately. More on that later.
Proof Technique CV
Converses

The converse of the implication P ⇒ Q is the implication Q ⇒ P. There is no
guarantee that the truth of these two statements is related. In particular, if an
implication has been proven to be a theorem, then do not try to use its converse too,
as if it were a theorem. Sometimes the converse is true (and we have an equivalence,
see Proof Technique E). But more likely the converse is false, especially if it was not
included in the statement of the original theorem.

For example, we have the theorem, “if a vehicle is a fire truck, then it has big
tires and has a siren.” The converse is false. The statement that “if a vehicle has big
tires and a siren, then it is a fire truck” is false. A police vehicle for use on a sandy
public beach would have big tires and a siren, yet is not equipped to fight fires.
We bring this up now, because Theorem CSRN has a tempting converse. Does
this theorem say that if r < n, then the system is consistent? Definitely not, as
Archetype E has r = 3 < 4 = n, yet is inconsistent. This example is then said
to be a counterexample to the converse. Whenever you think a theorem that is
an implication might actually be an equivalence, it is good to hunt around for a
counterexample that shows the converse to be false (the archetypes, Archetypes, can
be a good hunting ground).
Proof Technique CD
Contradiction

Another proof technique is known as “proof by contradiction” and it can be a
powerful (and satisfying) approach. Simply put, suppose you wish to prove the
implication, “If A, then B.” As usual, we assume that A is true, but we also make
the additional assumption that B is false. If our original implication is true, then
these twin assumptions should lead us to a logical inconsistency. In practice we
assume the negation of B to be true (see Proof Technique N). So we argue from the
assumptions A and not(B) looking for some obviously false conclusion such as 1 = 6,
or a set is simultaneously empty and nonempty, or a matrix is both nonsingular and
singular.
You should be careful about formulating proofs that look like proofs by contradiction
but really are not. This happens when you assume A and not(B) and proceed
to give a “normal” and direct proof that B is true by only using the assumption
that A is true. Your last step is to then claim that B is true and you then appeal to
the assumption that not(B) is true, thus getting the desired contradiction. Instead,
you could have avoided the overhead of a proof by contradiction and just run with
the direct proof. This stylistic flaw is known, quite graphically, as “setting up the
strawman to knock him down.”
Here is a simple example of a proof by contradiction. There are direct proofs that
are just about as easy, but this will demonstrate the point, while narrowly avoiding
knocking down the straw man.
Theorem: If a and b are odd integers, then their product, ab, is odd.

Proof: To begin a proof by contradiction, assume the hypothesis, that a and b
are odd. Also assume the negation of the conclusion, in this case, that ab is even.
Then there are integers j, k, l so that a = 2j + 1, b = 2k + 1, ab = 2l. Then

0 = ab − ab
  = (2j + 1)(2k + 1) − (2l)
  = 4jk + 2j + 2k − 2l + 1
  = 2(2jk + j + k − l) + 1

The last expression is an odd integer, yet 0 is even, so we have reached the desired
contradiction.
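A finite computation is never a proof, but a quick sanity check of the theorem's claim over a sample of odd integers can be reassuring. A hypothetical Python sketch (not part of the text):

```python
# Sanity check (not a proof): products of odd integers are odd.
odds = range(-9, 10, 2)          # -9, -7, ..., 7, 9: all odd
for a in odds:
    for b in odds:
        assert (a * b) % 2 != 0  # ab is odd, never of the form 2l
print("ab is odd for every pair of odd a, b sampled")
```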
Again, we do not offer this example as the best proof of this fact about even and
odd numbers, but rather it is a simple illustration of a proof by contradiction. You
can find examples of proofs by contradiction in
Theorem RREFU, Theorem NMUS, Theorem NPNT, Theorem TTMI, Theorem
GSP, Theorem ELIS, Theorem EDYES, Theorem EMHE, Theorem EDELI, and
Theorem DMFE, in addition to several examples and solutions to exercises.
Proof Technique U
Uniqueness

A theorem will sometimes claim that some object, having some desirable property,
is unique. In other words, there should be only one such object. To prove this, a
standard technique is to assume there are two such objects and proceed to analyze
the consequences. The end result may be a contradiction (Proof Technique CD), or
the conclusion that the two allegedly different objects really are equal.
Proof Technique ME
Multiple Equivalences

A very specialized form of a theorem begins with the statement “The following are
equivalent...,” which is then followed by a list of statements. Informally, this lead-in
sometimes gets abbreviated by “TFAE.” This formulation means that any two of the
statements on the list can be connected with an “if and only if” to form a theorem.
So if the list has n statements, then there are n(n − 1)/2 possible equivalences that
can be constructed (and are claimed to be true).
Suppose a theorem of this form has statements denoted as A, B, C, . . . , Z. To
prove the entire theorem, we can prove A ⇒ B, B ⇒ C, C ⇒ D, . . . , Y ⇒ Z and
finally, Z ⇒ A. This circular chain of n implications would allow us, logically, if
not practically, to form any one of the n(n − 1)/2 possible equivalences by chasing
the implications around the circle as far as required.
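The economy of the circular chain can be illustrated in code. This hypothetical Python sketch (not part of the text) checks that, following only the n implications around the circle, every statement can be reached from every other, so all n(n − 1)/2 equivalences follow:

```python
def reachable(start, target, n):
    """Chase the circular chain of implications 0 -> 1 -> ... -> n-1 -> 0."""
    steps, current = 0, start
    while current != target and steps <= n:
        current = (current + 1) % n
        steps += 1
    return current == target

n = 6  # six statements, say A, B, C, D, E, F
# Every ordered pair is connected around the circle, so each of the
# n(n-1)/2 = 15 equivalences follows from just n = 6 proved implications.
assert all(reachable(i, j, n) for i in range(n) for j in range(n) if i != j)
print("all", n * (n - 1) // 2, "equivalences follow from", n, "implications")
```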
Proof Technique PI
Proving Identities

Many theorems have conclusions that say two objects are equal. Perhaps one object
is hard to compute or understand, while the other is easy to compute or understand.
This would make for a pleasing theorem. Whether the result is pleasing or not,
we take the same approach to formulate a proof. Sometimes we need to employ
specialized notions of equality, such as Definition SE or Definition CVE, but in other
cases we can string together a list of equalities.
The wrong way to prove an identity is to begin by writing it down and then
beating on it until it reduces to an obvious identity. The first flaw is that you would
be writing down the statement you wish to prove, as if you already believed it to be
true. But more dangerous is the possibility that some of your maneuvers are not
reversible. Here is an example. Let us prove that 3 = −3.

3 = −3            (This is a bad start)
3² = (−3)²        Square both sides
9 = 9
0 = 0             Subtract 9 from both sides

So because 0 = 0 is a true statement, does it follow that 3 = −3 is a true
statement? Nope. Of course, we did not really expect a legitimate proof of 3 = −3,
but this attempt should illustrate the dangers of this (incorrect) approach.
What you have just seen in the proof of Theorem VSPCV, and what you will
see consistently throughout this text, is proofs of the following form. To prove that
A = D we write

A = B    Theorem, Definition or Hypothesis justifying A = B
  = C    Theorem, Definition or Hypothesis justifying B = C
  = D    Theorem, Definition or Hypothesis justifying C = D

In your scratch work exploring possible approaches to proving a theorem you
may massage a variety of expressions, sometimes making connections to various bits
and pieces, while some parts get abandoned. Once you see a line of attack, rewrite
your proof carefully mimicking this style.
Proof Technique DC
Decompositions

Much of your mathematical upbringing, especially once you began a study of algebra,
revolved around simplifying expressions — combining like terms, obtaining common
denominators so as to add fractions, factoring in order to solve polynomial equations.
However, as often as not, we will do the opposite. Many theorems and techniques
will revolve around taking some object and “decomposing” it into some combination
of other objects, ostensibly in a more complicated fashion. When we say something
can “be written as” something else, we mean that the one object can be decomposed
into some combination of other objects. This may seem unnatural at first, but results
of this type will give us insight into the structure of the original object by exposing
its inner workings. An appropriate analogy might be stripping the wallboards away
from the interior of a building to expose the structural members supporting the
whole building.
Perhaps you have studied integral calculus, or a pre-calculus course, where you
learned about partial fractions. This is a technique where a fraction of two polynomials
is decomposed (written as, expressed as) a sum of simpler fractions. The purpose in
calculus is to make finding an antiderivative simpler. For example, you can verify
the truth of the expression

(12x⁵ + 2x⁴ − 20x³ + 66x² − 294x + 308)/(x⁶ + x⁵ − 3x⁴ + 21x³ − 52x² + 20x − 48)
    = (5x + 2)/(x² − x + 6) + (3x − 7)/(x² + 1) + 3/(x + 4) + 1/(x − 2)
In an early course in algebra, you might be expected to combine the four terms
on the right over a common denominator to create the “simpler” expression on
the left. Going the other way, the partial fraction technique would allow you to
systematically decompose the fraction of polynomials on the left into the sum of the
four (arguably) simpler fractions of polynomials on the right.
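One way to gain confidence in the displayed identity is to evaluate both sides at several points with exact rational arithmetic. A hypothetical Python spot-check (not part of the text, and not a proof):

```python
from fractions import Fraction

def lhs(x):
    # The single fraction of polynomials on the left of the identity.
    num = 12*x**5 + 2*x**4 - 20*x**3 + 66*x**2 - 294*x + 308
    den = x**6 + x**5 - 3*x**4 + 21*x**3 - 52*x**2 + 20*x - 48
    return Fraction(num, den)

def rhs(x):
    # The sum of the four simpler fractions on the right.
    return (Fraction(5*x + 2, x**2 - x + 6) + Fraction(3*x - 7, x**2 + 1)
            + Fraction(3, x + 4) + Fraction(1, x - 2))

# Spot-check at integer points chosen to avoid the real poles x = 2 and x = -4.
for x in [0, 1, 3, 5, -1, -2, 7]:
    assert lhs(x) == rhs(x)
print("both sides agree at every sampled point")
```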
This is a major shift in thinking, so come back here often, especially when we
say “can be written as,” or “can be expressed as,” or “can be decomposed as.”
Proof Technique I
Induction

“Induction” or “mathematical induction” is a framework for proving statements that
are indexed by integers. In other words, suppose you have a statement to prove that
is really multiple statements, one for n = 1, another for n = 2, a third for n = 3,
and so on. If there is enough similarity between the statements, then you can use a
script (the framework) to prove them all at once.

For example, consider the theorem: 1 + 2 + 3 + ··· + n = n(n + 1)/2 for n ≥ 1.
This is shorthand for the many statements 1 = 1(1 + 1)/2, 1 + 2 = 2(2 + 1)/2,
1 + 2 + 3 = 3(3 + 1)/2, 1 + 2 + 3 + 4 = 4(4 + 1)/2, and so on. Forever. You can
do the calculations in each of these statements and verify that all four are true. We
might not be surprised to learn that the fifth statement is true as well (go ahead
and check). However, do we think the theorem is true for n = 872? Or n = 1,234,529?
To see that these questions are not so ridiculous, consider the following example
from Rotman’s Journey into Mathematics. The statement “n² − n + 41 is prime” is
true for integers 1 ≤ n ≤ 40 (check a few). However, when we check n = 41 we find
41² − 41 + 41 = 41², which is not prime.
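A short search reproduces Rotman's example. In this hypothetical Python sketch (not part of the text), the pattern survives forty checks and then fails:

```python
def is_prime(m):
    """Trial division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

# n^2 - n + 41 is prime for 1 <= n <= 40, but fails at n = 41.
assert all(is_prime(n*n - n + 41) for n in range(1, 41))
assert not is_prime(41*41 - 41 + 41)   # 41^2 - 41 + 41 = 41^2
print("first failure is at n = 41")
```

Forty confirmations, and still the forty-first case fails: checking examples is evidence, never proof.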
So how do we prove infinitely many statements all at once? More formally, let us
denote our statements as P(n). Then, if we can prove the two assertions

1. P(1) is true.

2. If P(k) is true, then P(k + 1) is true.

then it follows that P(n) is true for all n ≥ 1. To understand this, I liken the
process to climbing an infinitely long ladder with equally spaced rungs. Confronted
with such a ladder, suppose I tell you that you are able to step up onto the first
rung, and if you are on any particular rung, then you are capable of stepping up to
the next rung. It follows that you can climb the ladder as far up as you wish. The
first formal assertion above is akin to stepping onto the first rung, and the second
formal assertion is akin to assuming that if you are on any one rung then you can
always reach the next rung.
In practice, establishing that P(1) is true is called the “base case” and in
most cases is straightforward. Establishing that P(k) ⇒ P(k + 1) is referred to
as the “induction step,” or in this book (and elsewhere) we will typically refer to
the assumption of P(k) as the “induction hypothesis.” This is perhaps the most
mysterious part of a proof by induction, since we are eventually trying to prove that
P(n) is true and it appears we do this by assuming what we are trying to prove
(when we assume P(k)). We are trying to prove the truth of P(n) (for all n), but in
the induction step we establish the truth of an implication, P(k) ⇒ P(k + 1), an
“if-then” statement. Sometimes it is even worse, since as you get more comfortable
with induction, we often do not bother to use a different letter (k) for the index (n)
in the induction step. Notice that the second formal assertion never says that P(k)
is true, it simply says that if P(k) were true, what might logically follow. We can
establish statements like “If I lived on the moon, then I could pole-vault over a bar
12 meters high.” This may be a true statement, but it does not say we live on the
moon, and indeed we may never live there.
Enough generalities. Let us work an example and prove the theorem above about
sums of integers. Formally, our statement is P(n): 1 + 2 + 3 + ··· + n = n(n + 1)/2.

Proof: Base Case. P(1) is the statement 1 = 1(1 + 1)/2, which we see simplifies to
the true statement 1 = 1.

Induction Step: We will assume P(k) is true, and will try to prove P(k + 1).
Given what we want to accomplish, it is natural to begin by examining the sum of
the first k + 1 integers.

1 + 2 + 3 + ··· + (k + 1) = (1 + 2 + 3 + ··· + k) + (k + 1)
                          = k(k + 1)/2 + (k + 1)        Induction Hypothesis
                          = (k² + k)/2 + (2k + 2)/2
                          = (k² + 3k + 2)/2
                          = (k + 1)(k + 2)/2
                          = (k + 1)((k + 1) + 1)/2
We then recognize the two ends of this chain of equalities as P(k + 1). So, by
mathematical induction, the theorem is true for all n.
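The induction proof is what certifies every case at once, but the formula is also easy to spot-check in code, including the large values raised earlier. A hypothetical Python sketch (not part of the text):

```python
# Spot-check (not a proof): compare the sum 1 + 2 + ... + n with n(n+1)/2.
for n in [1, 2, 3, 4, 5, 872, 1_234_529]:
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
print("formula verified for every sampled n, including n = 872")
```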
How do you recognize when to use induction? The first clue is a statement that
is really many statements, one for each integer. The second clue would be that you
begin a more standard proof and you find yourself using words like “and so on” (as
above!) or lots of ellipses (dots) to establish patterns that you are convinced continue
on and on forever. However, there are many minor instances where induction might
be warranted but we do not bother.
Induction is important enough, and used often enough, that it appears in several variations. The base case sometimes begins with n = 0, or perhaps with an integer greater than one. Some formulate the induction step as P(k − 1) ⇒ P(k). There is also a “strong form” of induction where we assume all of P(1), P(2), P(3), ..., P(k) as a hypothesis for showing the conclusion P(k + 1).
You can find examples of induction in the proofs of Theorem GSP, Theorem DER, Theorem DT, Theorem DIM, Theorem EOMP, Theorem DCP, and Theorem UTMR.
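The identity proved above is also easy to spot-check numerically. The following Python sketch (not part of the original text) compares both sides of P(n) for the first thousand positive integers; it is no substitute for the induction proof, but it is a useful habit for catching algebra slips.

```python
# Spot-check P(n): 1 + 2 + ... + n == n(n + 1)/2 for many values of n.
def lhs(n):
    # The sum of the first n positive integers, computed directly.
    return sum(range(1, n + 1))

def rhs(n):
    # The closed-form expression from the theorem.
    return n * (n + 1) // 2

all_match = all(lhs(n) == rhs(n) for n in range(1, 1001))
print(all_match)  # True
```

A finite check like this can never prove the statement for all n; that is exactly what the induction argument supplies.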
Proof Technique P
Practice
Here is a technique used by many practicing mathematicians when they are teaching themselves new mathematics. As they read a textbook, monograph or research article, they attempt to prove each new theorem themselves, before reading the proof. Often the proofs can be very difficult, so it is wise not to spend too much time on each. Maybe limit your losses and try each proof for 10 or 15 minutes. Even if you do not find the proof, the time is well spent. You become more familiar with the definitions involved, and the hypothesis and conclusion of the theorem. When you do work through the proof, it might make more sense, and you will gain added insight about just how to construct a proof.
Proof Technique LC
Lemmas and Corollaries
Theorems often go by different titles, two of the most popular being “lemma” and “corollary.” Before we describe the fine distinctions, be aware that lemmas, corollaries, propositions, claims and facts are all just theorems. And every theorem can be rephrased as an “if-then” statement, or perhaps a pair of “if-then” statements expressed as an equivalence (Proof Technique E).
A lemma is a theorem that is not too interesting in its own right, but is important for proving other theorems. It might be a generalization or abstraction of a key step of several different proofs. For this reason you often hear the phrase “technical lemma,” though some might argue that the adjective “technical” is redundant.
A corollary is a theorem that follows very easily from another theorem. For this reason, corollaries frequently do not have proofs. You are expected to easily and quickly see how a previous theorem implies the corollary.
A proposition or fact is really just a codeword for a theorem. A claim might be similar, but some authors like to use claims within a proof to organize key steps. In a similar manner, some long proofs are organized as a series of lemmas.
In order not to confuse the novice, we have just called all our theorems theorems. It is also an organizational convenience. With only theorems and definitions, the theoretical backbone of the course is laid bare in the two lists of Definitions and Theorems.
Archetypes
This section contains definitions and capsule summaries for each archetypical example. Comprehensive and detailed analysis of each can be found in the online supplement.
Archetype A Linear system of three equations, three unknowns. Singular coefficient matrix with dimension 1 null space. Integer eigenvalues and a degenerate eigenspace for coefficient matrix.

x_1 − x_2 + 2x_3 = 1
2x_1 + x_2 + x_3 = 8
x_1 + x_2 = 5
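The advertised properties of Archetype A — a singular coefficient matrix whose null space has dimension 1 — can be confirmed numerically. A minimal sketch, assuming NumPy is available (this check is not part of the original text):

```python
import numpy as np

# Coefficient matrix of Archetype A, transcribed from the system above.
A = np.array([[1, -1, 2],
              [2,  1, 1],
              [1,  1, 0]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank  # rank-nullity: nullity = n - rank
print(rank, nullity)  # 2 1
```

A rank of 2 (less than 3) confirms the matrix is singular, and the nullity of 1 matches the dimension-1 null space described above.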
Archetype B System with three equations, three unknowns. Nonsingular coefficient matrix. Distinct integer eigenvalues for coefficient matrix.

−7x_1 − 6x_2 − 12x_3 = −33
5x_1 + 5x_2 + 7x_3 = 24
x_1 + 4x_3 = 5
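Because Archetype B has a nonsingular coefficient matrix, it has a unique solution. A short NumPy sketch (not part of the original text) that solves the system:

```python
import numpy as np

# Coefficient matrix and vector of constants for Archetype B.
B = np.array([[-7, -6, -12],
              [ 5,  5,   7],
              [ 1,  0,   4]])
b = np.array([-33, 24, 5])

# A nonsingular coefficient matrix guarantees exactly one solution.
x = np.linalg.solve(B, b)
print(np.round(x))  # [-3.  5.  2.]
```

Substituting x_1 = −3, x_2 = 5, x_3 = 2 back into each equation verifies the solution by hand as well.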
Archetype C System with three equations, four variables. Consistent. Null space of coefficient matrix has dimension 1.

2x_1 − 3x_2 + x_3 − 6x_4 = −7
4x_1 + x_2 + 2x_3 + 9x_4 = −7
3x_1 + x_2 + x_3 + 8x_4 = −8
Archetype D System with three equations, four variables. Consistent. Null space of coefficient matrix has dimension 2. Coefficient matrix identical to that of Archetype E, vector of constants is different.

2x_1 + x_2 + 7x_3 − 7x_4 = 8
−3x_1 + 4x_2 − 5x_3 − 6x_4 = −12
x_1 + x_2 + 4x_3 − 5x_4 = 4
Archetype E System with three equations, four variables. Inconsistent. Null space of coefficient matrix has dimension 2. Coefficient matrix identical to that of Archetype D, constant vector is different.

2x_1 + x_2 + 7x_3 − 7x_4 = 2
−3x_1 + 4x_2 − 5x_3 − 6x_4 = 3
x_1 + x_2 + 4x_3 − 5x_4 = 2
Archetype F System with four equations, four variables. Nonsingular coefficient matrix. Integer eigenvalues, one has “high” multiplicity.

33x_1 − 16x_2 + 10x_3 − 2x_4 = −27
99x_1 − 47x_2 + 27x_3 − 7x_4 = −77
78x_1 − 36x_2 + 17x_3 − 6x_4 = −52
−9x_1 + 2x_2 + 3x_3 + 4x_4 = 5
Archetype G System with five equations, two variables. Consistent. Null space of coefficient matrix has dimension 0. Coefficient matrix identical to that of Archetype H, constant vector is different.

2x_1 + 3x_2 = 6
−x_1 + 4x_2 = −14
3x_1 + 10x_2 = −2
3x_1 − x_2 = 20
6x_1 + 9x_2 = 18
Archetype H System with five equations, two variables. Inconsistent, overdetermined. Null space of coefficient matrix has dimension 0. Coefficient matrix identical to that of Archetype G, constant vector is different.

2x_1 + 3x_2 = 5
−x_1 + 4x_2 = 6
3x_1 + 10x_2 = 2
3x_1 − x_2 = −1
6x_1 + 9x_2 = 3
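Archetypes G and H share a coefficient matrix and differ only in their constant vectors, which makes them a nice pair for testing consistency. A system is consistent exactly when appending the constant vector does not increase the rank (this is the content of Theorem RCLS). A NumPy sketch, not part of the original text:

```python
import numpy as np

# Shared coefficient matrix of Archetypes G and H, and their constant vectors.
A = np.array([[2, 3], [-1, 4], [3, 10], [3, -1], [6, 9]])
b_G = np.array([6, -14, -2, 20, 18])
b_H = np.array([5, 6, 2, -1, 3])

def consistent(A, b):
    # Consistent iff rank of the augmented matrix equals rank of A.
    aug = np.column_stack([A, b])
    return np.linalg.matrix_rank(aug) == np.linalg.matrix_rank(A)

print(consistent(A, b_G), consistent(A, b_H))  # True False
```

Archetype G is consistent (its unique solution is x_1 = 6, x_2 = −2), while Archetype H is inconsistent, just as the capsule summaries state.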
Archetype I System with four equations, seven variables. Consistent. Null space of coefficient matrix has dimension 4.

x_1 + 4x_2 − x_4 + 7x_6 − 9x_7 = 3
2x_1 + 8x_2 − x_3 + 3x_4 + 9x_5 − 13x_6 + 7x_7 = 9
2x_3 − 3x_4 − 4x_5 + 12x_6 − 8x_7 = 1
−x_1 − 4x_2 + 2x_3 + 4x_4 + 8x_5 − 31x_6 + 37x_7 = 4
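The claimed dimension-4 null space of Archetype I follows from the rank-nullity relation: with 7 variables and a coefficient matrix of rank 3, the nullity is 7 − 3 = 4. A sketch, assuming NumPy (not part of the original text):

```python
import numpy as np

# Coefficient matrix of Archetype I (4 equations, 7 variables).
A = np.array([[ 1,  4,  0, -1,  0,   7,  -9],
              [ 2,  8, -1,  3,  9, -13,   7],
              [ 0,  0,  2, -3, -4,  12,  -8],
              [-1, -4,  2,  4,  8, -31,  37]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank  # rank-nullity: nullity = n - rank
print(rank, nullity)  # 3 4
```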
Archetype J System with six equations, nine variables. Consistent. Null space of coefficient matrix has dimension 5.

x_1 + 2x_2 − 2x_3 + 9x_4 + 3x_5 − 5x_6 − 2x_7 + x_8 + 27x_9 = −5
2x_1 + 4x_2 + 3x_3 + 4x_4 − x_5 + 4x_6 + 10x_7 + 2x_8 − 23x_9 = 18
x_1 + 2x_2 + x_3 + 3x_4 + x_5 + x_6 + 5x_7 + 2x_8 − 7x_9 = 6
2x_1 + 4x_2 + 3x_3 + 4x_4 − 7x_5 + 2x_6 + 4x_7 − 11x_9 = 20
x_1 + 2x_2 + 5x_4 + 2x_5 − 4x_6 + 3x_7 + 8x_8 + 13x_9 = −4
−3x_1 − 6x_2 − x_3 − 13x_4 + 2x_5 − 5x_6 − 4x_7 + 13x_8 + 10x_9 = −29
Archetype K Square matrix of size 5. Nonsingular. 3 distinct eigenvalues, 2 of multiplicity 2.

[  10  18  24  24 −12 ]
[  12  −2  −6   0 −18 ]
[ −30 −21 −23 −30  39 ]
[  27  30  36  37 −30 ]
[  18  24  30  30 −20 ]

Archetype L Square matrix of size 5. Singular, nullity 2. 2 distinct eigenvalues, each of “high” multiplicity.

[ −2 −1 −2 −4   4 ]
[ −6 −5 −4 −4   6 ]
[ 10  7  7  10 −13 ]
[ −7 −5 −6 −9  10 ]
[ −4 −3 −4 −6   6 ]
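The contrast between Archetypes K and L is exactly the nonsingular/singular divide, and a rank computation makes it visible. A sketch assuming NumPy, using the matrix entries transcribed above (not part of the original text):

```python
import numpy as np

# Archetype K: claimed nonsingular, so its rank should equal its size, 5.
K = np.array([[ 10,  18,  24,  24, -12],
              [ 12,  -2,  -6,   0, -18],
              [-30, -21, -23, -30,  39],
              [ 27,  30,  36,  37, -30],
              [ 18,  24,  30,  30, -20]])

# Archetype L: claimed singular with nullity 2, so its rank should be 5 - 2 = 3.
L = np.array([[-2, -1, -2, -4,   4],
              [-6, -5, -4, -4,   6],
              [10,  7,  7, 10, -13],
              [-7, -5, -6, -9,  10],
              [-4, -3, -4, -6,   6]])

rank_K = np.linalg.matrix_rank(K)
rank_L = np.linalg.matrix_rank(L)
print(rank_K, rank_L)  # 5 3
```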
Archetype M Linear transformation with a domain larger than its codomain, so it is guaranteed to not be injective. Happens to not be surjective.

T : C^5 → C^3,
T([x_1, x_2, x_3, x_4, x_5]) = [ x_1 + 2x_2 + 3x_3 + 4x_4 + 4x_5,
                                 3x_1 + x_2 + 4x_3 − 3x_4 + 7x_5,
                                 x_1 − x_2 − 5x_4 + x_5 ]
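Both claims about Archetype M can be read off from the rank of its matrix relative to the standard bases. A sketch assuming NumPy (not part of the original text):

```python
import numpy as np

# Matrix of Archetype M with respect to the standard bases of C^5 and C^3,
# one row per component of the output.
M = np.array([[1,  2, 3,  4, 4],
              [3,  1, 4, -3, 7],
              [1, -1, 0, -5, 1]])

rank = np.linalg.matrix_rank(M)
print(rank)      # 2: range has dimension 2 < 3, so T is not surjective
print(5 - rank)  # 3: kernel has dimension 3 > 0, so T is not injective
```

Note that the third row is a linear combination of the first two (specifically, −4/5 times the first plus 3/5 times the second), which is why the rank drops to 2.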
Archetype N Linear transformation with domain larger than its codomain, so it is guaranteed to not be injective. Happens to be onto.

T : C^5 → C^3,
T([x_1, x_2, x_3, x_4, x_5]) = [ 2x_1 + x_2 + 3x_3 − 4x_4 + 5x_5,
                                 x_1 − 2x_2 + 3x_3 − 9x_4 + 3x_5,
                                 3x_1 + 4x_3 − 6x_4 + 5x_5 ]
Archetype O Linear transformation with a domain smaller than the codomain, so it is guaranteed to not be onto. Happens to not be one-to-one.

T : C^3 → C^5,
T([x_1, x_2, x_3]) = [ −x_1 + x_2 − 3x_3,
                       −x_1 + 2x_2 − 4x_3,
                       x_1 + x_2 + x_3,
                       2x_1 + 3x_2 + x_3,
                       x_1 + 2x_3 ]
Archetype P Linear transformation with a domain smaller than its codomain, so it is guaranteed to not be surjective. Happens to be injective.

T : C^3 → C^5,
T([x_1, x_2, x_3]) = [ −x_1 + x_2 + x_3,
                       −x_1 + 2x_2 + 2x_3,
                       x_1 + x_2 + 3x_3,
                       2x_1 + 3x_2 + x_3,
                       −2x_1 + x_2 + 3x_3 ]
Archetype Q Linear transformation with equal-sized domain and codomain, so it has the potential to be invertible, but in this case is not. Neither injective nor surjective. Diagonalizable, though.

T : C^5 → C^5,
T([x_1, x_2, x_3, x_4, x_5]) = [ −2x_1 + 3x_2 + 3x_3 − 6x_4 + 3x_5,
                                 −16x_1 + 9x_2 + 12x_3 − 28x_4 + 28x_5,
                                 −19x_1 + 7x_2 + 14x_3 − 32x_4 + 37x_5,
                                 −21x_1 + 9x_2 + 15x_3 − 35x_4 + 39x_5,
                                 −9x_1 + 5x_2 + 7x_3 − 16x_4 + 16x_5 ]
Archetype R Linear transformation with equal-sized domain and codomain. Injective, surjective, invertible, diagonalizable, the works.

T : C^5 → C^5,
T([x_1, x_2, x_3, x_4, x_5]) = [ −65x_1 + 128x_2 + 10x_3 − 262x_4 + 40x_5,
                                 36x_1 − 73x_2 − x_3 + 151x_4 − 16x_5,
                                 −44x_1 + 88x_2 + 5x_3 − 180x_4 + 24x_5,
                                 34x_1 − 68x_2 − 3x_3 + 140x_4 − 18x_5,
                                 12x_1 − 24x_2 − x_3 + 49x_4 − 5x_5 ]
Archetype S Domain is column vectors, codomain is matrices. Domain has dimension 3 and codomain has dimension 4. Not injective, not surjective.

T : C^3 → M_22,
T([a, b, c]) = [ a − b        2a + 2b + c   ]
               [ 3a + b + c   −2a − 6b − 2c ]
Archetype T Domain and codomain are polynomials. Domain has dimension 5, while codomain has dimension 6. Is injective, cannot be surjective.

T : P_4 → P_5, T(p(x)) = (x − 2)p(x)
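Archetype T is concrete enough to compute with directly if polynomials are stored as coefficient lists. The sketch below (not part of the original text; the representation [a_0, a_1, ...] and the helper name apply_T are illustrative choices) multiplies a polynomial by (x − 2):

```python
# Apply T(p) = (x - 2) p(x) to a polynomial stored as a coefficient
# list [a0, a1, a2, ...] (constant term first).
def apply_T(p):
    shifted = [0] + p                   # coefficients of x * p(x)
    scaled = [-2 * a for a in p] + [0]  # coefficients of -2 * p(x)
    return [s + t for s, t in zip(shifted, scaled)]

# p(x) = 1 + x maps to (x - 2)(1 + x) = -2 - x + x^2.
print(apply_T([1, 1]))  # [-2, -1, 1]
```

Multiplying by the fixed nonzero polynomial (x − 2) never sends a nonzero polynomial to zero, which is why T is injective; and since the degree always rises by one, constants in P_5 are never hit, so T cannot be surjective.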
Archetype U Domain is matrices, codomain is column vectors. Domain has dimension 6, while codomain has dimension 4. Cannot be injective, is surjective.

T : M_23 → C^4,
T([ a b c ; d e f ]) = [ a + 2b + 12c − 3d + e + 6f,
                         2a − b − c + d − 11f,
                         a + b + 7c + 2d + e − 3f,
                         a + 2b + 12c + 5e − 5f ]
Archetype V Domain is polynomials, codomain is matrices. Both domain and codomain have dimension 4. Injective, surjective, invertible. Square matrix representation, but domain and codomain are unequal, so no eigenvalue information.

T : P_3 → M_22,
T(a + bx + cx^2 + dx^3) = [ a + b   a − 2c ]
                          [ d       b − d  ]
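One way to see that Archetype V is invertible is to build its matrix representation relative to the standard bases and check that the determinant is nonzero. A sketch assuming NumPy (the basis ordering below, {1, x, x^2, x^3} for P_3 and the four entries of a 2 × 2 matrix read row by row, is an illustrative choice, not taken from the text):

```python
import numpy as np

# Matrix representation of (a, b, c, d) -> (a + b, a - 2c, d, b - d),
# one row per output coordinate.
V = np.array([[1, 1,  0,  0],
              [1, 0, -2,  0],
              [0, 0,  0,  1],
              [0, 1,  0, -1]])

det = np.linalg.det(V)
print(round(det))  # -2: nonzero, so the transformation is invertible
```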
Archetype W Domain is polynomials, codomain is polynomials. Domain and codomain both have dimension 3. Injective, surjective, invertible, 3 distinct eigenvalues, diagonalizable.

T : P_2 → P_2,
T(a + bx + cx^2) = (19a + 6b − 4c) + (−24a − 7b + 4c)x + (36a + 12b − 9c)x^2
Archetype X Domain and codomain are square matrices. Domain and codomain both have dimension 4. Not injective, not surjective, not invertible, 3 distinct eigenvalues, diagonalizable.

T : M_22 → M_22,
T([ a b ; c d ]) = [ −2a + 15b + 3c + 27d   10b + 6c + 18d    ]
                   [ a − 5b − 9d            −a − 4b − 5c − 8d ]
Definitions

Section WILA What is Linear Algebra?

Section SSLE Solving Systems of Linear Equations
SLE System of Linear Equations . . . 9
SSLE Solution of a System of Linear Equations . . . 9
SSSLE Solution Set of a System of Linear Equations . . . 9
ESYS Equivalent Systems . . . 11
EO Equation Operations . . . 12

Section RREF Reduced Row-Echelon Form
M Matrix . . . 22
CV Column Vector . . . 23
ZCV Zero Column Vector . . . 23
CM Coefficient Matrix . . . 23
VOC Vector of Constants . . . 24
SOLV Solution Vector . . . 24
MRLS Matrix Representation of a Linear System . . . 24
AM Augmented Matrix . . . 25
RO Row Operations . . . 26
REM Row-Equivalent Matrices . . . 26
RREF Reduced Row-Echelon Form . . . 28

Section TSS Types of Solution Sets
CS Consistent System . . . 45
IDV Independent and Dependent Variables . . . 48

Section HSE Homogeneous Systems of Equations
HS Homogeneous System . . . 58
TSHSE Trivial Solution to Homogeneous Systems of Equations . . . 59
NSM Null Space of a Matrix . . . 61

Section NM Nonsingular Matrices
SQM Square Matrix . . . 66
NM Nonsingular Matrix . . . 66
IM Identity Matrix . . . 67

Section VO Vector Operations
VSCV Vector Space of Column Vectors . . . 74
CVE Column Vector Equality . . . 75
CVA Column Vector Addition . . . 76
610
CVSM Column Vector Scalar Multiplication . . . 77

Section LC Linear Combinations
LCCV Linear Combination of Column Vectors . . . 83

Section SS Spanning Sets
SSCV Span of a Set of Column Vectors . . . 107

Section LI Linear Independence
RLDCV Relation of Linear Dependence for Column Vectors . . . 122
LICV Linear Independence of Column Vectors . . . 122

Section LDS Linear Dependence and Spans

Section O Orthogonality
CCCV Complex Conjugate of a Column Vector . . . 148
IP Inner Product . . . 149
NV Norm of a Vector . . . 152
OV Orthogonal Vectors . . . 154
OSV Orthogonal Set of Vectors . . . 155
SUV Standard Unit Vectors . . . 155
ONS OrthoNormal Set . . . 159

Section MO Matrix Operations
VSM Vector Space of m × n Matrices . . . 162
ME Matrix Equality . . . 162
MA Matrix Addition . . . 163
MSM Matrix Scalar Multiplication . . . 163
ZM Zero Matrix . . . 165
TM Transpose of a Matrix . . . 166
SYM Symmetric Matrix . . . 166
CCM Complex Conjugate of a Matrix . . . 168
A Adjoint . . . 170

Section MM Matrix Multiplication
MVP Matrix-Vector Product . . . 175
MM Matrix Multiplication . . . 179
HM Hermitian Matrix . . . 188

Section MISLE Matrix Inverses and Systems of Linear Equations
MI Matrix Inverse . . . 194

Section MINM Matrix Inverses and Nonsingular Matrices
UM Unitary Matrices . . . 211

Section CRS Column and Row Spaces
CSM Column Space of a Matrix . . . 217
RSM Row Space of a Matrix . . . 225

Section FS Four Subsets
LNS Left Null Space . . . 236
EEF Extended Echelon Form . . . 241

Section VS Vector Spaces
VS Vector Space . . . 257

Section S Subspaces
S Subspace . . . 272
TS Trivial Subspaces . . . 277
LC Linear Combination . . . 279
SS Span of a Set . . . 279

Section LISS Linear Independence and Spanning Sets
RLD Relation of Linear Dependence . . . 288
LI Linear Independence . . . 288
SSVS Spanning Set of a Vector Space . . . 294

Section B Bases
B Basis . . . 303

Section D Dimension
D Dimension . . . 318
NOM Nullity Of a Matrix . . . 325
ROM Rank Of a Matrix . . . 325

Section PD Properties of Dimension

Section DM Determinant of a Matrix
ELEM Elementary Matrices . . . 340
SM SubMatrix . . . 346
DM Determinant of a Matrix . . . 346

Section PDM Properties of Determinants of Matrices

Section EE Eigenvalues and Eigenvectors
EEM Eigenvalues and Eigenvectors of a Matrix . . . 367
CP Characteristic Polynomial . . . 375
EM Eigenspace of a Matrix . . . 377
AME Algebraic Multiplicity of an Eigenvalue . . . 379
GME Geometric Multiplicity of an Eigenvalue . . . 380

Section PEE Properties of Eigenvalues and Eigenvectors

Section SD Similarity and Diagonalization
SIM Similar Matrices . . . 403
DIM Diagonal Matrix . . . 407
DZM Diagonalizable Matrix . . . 407

Section LT Linear Transformations
LT Linear Transformation . . . 420
PI Pre-Image . . . 436
LTA Linear Transformation Addition . . . 438
LTSM Linear Transformation Scalar Multiplication . . . 439
LTC Linear Transformation Composition . . . 441

Section ILT Injective Linear Transformations
ILT Injective Linear Transformation . . . 446
KLT Kernel of a Linear Transformation . . . 451

Section SLT Surjective Linear Transformations
SLT Surjective Linear Transformation . . . 462
RLT Range of a Linear Transformation . . . 468

Section IVLT Invertible Linear Transformations
IDLT Identity Linear Transformation . . . 480
IVLT Invertible Linear Transformations . . . 480
IVS Isomorphic Vector Spaces . . . 490
ROLT Rank Of a Linear Transformation . . . 492
NOLT Nullity Of a Linear Transformation . . . 492

Section VR Vector Representations
VR Vector Representation . . . 501

Section MR Matrix Representations
MR Matrix Representation . . . 514

Section CB Change of Basis
EELT Eigenvalue and Eigenvector of a Linear Transformation . . . 542
CBM Change-of-Basis Matrix . . . 543
Section OD Orthonormal Diagonalization
UTM Upper Triangular Matrix . . . 568
LTM Lower Triangular Matrix . . . 568
NRML Normal Matrix . . . 575

Section CNO Complex Number Operations
CNE Complex Number Equality . . . 582
CNA Complex Number Addition . . . 582
CNM Complex Number Multiplication . . . 582
CCN Conjugate of a Complex Number . . . 584
MCN Modulus of a Complex Number . . . 585

Section SET Sets
SET Set . . . 587
SSET Subset . . . 587
ES Empty Set . . . 587
SE Set Equality . . . 588
C Cardinality . . . 589
SU Set Union . . . 589
SI Set Intersection . . . 589
SC Set Complement . . . 590
Theorems

Section WILA What is Linear Algebra?

Section SSLE Solving Systems of Linear Equations
EOPSS Equation Operations Preserve Solution Sets . . . 12

Section RREF Reduced Row-Echelon Form
REMES Row-Equivalent Matrices represent Equivalent Systems . . . 27
REMEF Row-Equivalent Matrix in Echelon Form . . . 29
RREFU Reduced Row-Echelon Form is Unique . . . 32

Section TSS Types of Solution Sets
RCLS Recognizing Consistency of a Linear System . . . 49
CSRN Consistent Systems, r and n . . . 50
FVCS Free Variables for Consistent Systems . . . 51
PSSLS Possible Solution Sets for Linear Systems . . . 52
CMVEI Consistent, More Variables than Equations, Infinite solutions . . . 53

Section HSE Homogeneous Systems of Equations
HSC Homogeneous Systems are Consistent . . . 58
HMVEI Homogeneous, More Variables than Equations, Infinite solutions . . . 60

Section NM Nonsingular Matrices
NMRRI Nonsingular Matrices Row Reduce to the Identity matrix . . . 68
NMTNS Nonsingular Matrices have Trivial Null Spaces . . . 69
NMUS Nonsingular Matrices and Unique Solutions . . . 69
NME1 Nonsingular Matrix Equivalences, Round 1 . . . 70

Section VO Vector Operations
VSPCV Vector Space Properties of Column Vectors . . . 78

Section LC Linear Combinations
SLSLC Solutions to Linear Systems are Linear Combinations . . . 87
VFSLS Vector Form of Solutions to Linear Systems . . . 94
PSPHS Particular Solution Plus Homogeneous Solutions . . . 101

Section SS Spanning Sets
SSNS Spanning Sets for Null Spaces . . . 114

Section LI Linear Independence
LIVHS Linearly Independent Vectors and Homogeneous Systems . . . 124
615
LIVRN Linearly Independent Vectors, r and n . . . 126
MVSLD More Vectors than Size implies Linear Dependence . . . 127
NMLIC Nonsingular Matrices have Linearly Independent Columns . . . 128
NME2 Nonsingular Matrix Equivalences, Round 2 . . . 129
BNS Basis for Null Spaces . . . 131

Section LDS Linear Dependence and Spans
DLDS Dependency in Linearly Dependent Sets . . . 136
BS Basis of a Span . . . 142

Section O Orthogonality
CRVA Conjugation Respects Vector Addition . . . 148
CRSM Conjugation Respects Vector Scalar Multiplication . . . 149
IPVA Inner Product and Vector Addition . . . 150
IPSM Inner Product and Scalar Multiplication . . . 151
IPAC Inner Product is Anti-Commutative . . . 151
IPN Inner Products and Norms . . . 153
PIP Positive Inner Products . . . 153
OSLI Orthogonal Sets are Linearly Independent . . . 156
GSP Gram-Schmidt Procedure . . . 157

Section MO Matrix Operations
VSPM Vector Space Properties of Matrices . . . 164
SMS Symmetric Matrices are Square . . . 167
TMA Transpose and Matrix Addition . . . 167
TMSM Transpose and Matrix Scalar Multiplication . . . 167
TT Transpose of a Transpose . . . 168
CRMA Conjugation Respects Matrix Addition . . . 169
CRMSM Conjugation Respects Matrix Scalar Multiplication . . . 169
CCM Conjugate of the Conjugate of a Matrix . . . 169
MCT Matrix Conjugation and Transposes . . . 170
AMA Adjoint and Matrix Addition . . . 170
AMSM Adjoint and Matrix Scalar Multiplication . . . 171
AA Adjoint of an Adjoint . . . 171

Section MM Matrix Multiplication
SLEMM Systems of Linear Equations as Matrix Multiplication . . . 176
EMMVP Equal Matrices and Matrix-Vector Products . . . 178
EMP Entries of Matrix Products . . . 180
MMZM Matrix Multiplication and the Zero Matrix . . . 182
MMIM Matrix Multiplication and Identity Matrix . . . 182
MMDAA Matrix Multiplication Distributes Across Addition . . . 183
MMSMM Matrix Multiplication and Scalar Matrix Multiplication . . . . . 184<br />
MMA Matrix Multiplication is Associative . . . . . . . . . . . . . . . . 184<br />
MMIP Matrix Multiplication and Inner Products . . . . . . . . . . . . 185<br />
MMCC Matrix Multiplication and Complex Conjugation . . . . . . . . . 186<br />
MMT Matrix Multiplication and Transposes . . . . . . . . . . . . . . . 186<br />
MMAD Matrix Multiplication and Adjoints . . . . . . . . . . . . . . . . 187<br />
AIP Adjoint and Inner Product . . . . . . . . . . . . . . . . . . . . . 188<br />
HMIP Hermitian Matrices and Inner Products . . . . . . . . . . . . . . 189<br />
Section MISLE Matrix Inverses and Systems of Linear Equations<br />
TTMI Two-by-Two Matrix Inverse . . . . . . . . . . . . . . . . . . . . 196<br />
CINM Computing the Inverse of a Nonsingular Matrix . . . . . . . . . 200<br />
MIU Matrix Inverse is Unique . . . . . . . . . . . . . . . . . . . . . . 202<br />
SS Socks and Shoes . . . . . . . . . . . . . . . . . . . . . . . . . . . 202<br />
MIMI Matrix Inverse of a Matrix Inverse . . . . . . . . . . . . . . . . . 203<br />
MIT Matrix Inverse of a Transpose . . . . . . . . . . . . . . . . . . . 203<br />
MISM Matrix Inverse of a Scalar Multiple . . . . . . . . . . . . . . . . 204<br />
Section MINM Matrix Inverses and Nonsingular Matrices<br />
NPNT Nonsingular Product has Nonsingular Terms . . . . . . . . . . . 207<br />
OSIS One-Sided Inverse is Sufficient . . . . . . . . . . . . . . . . . . . 209<br />
NI Nonsingularity is Invertibility . . . . . . . . . . . . . . . . . . . 209<br />
NME3 Nonsingular Matrix Equivalences, Round 3 . . . . . . . . . . . . 210<br />
SNCM Solution with Nonsingular Coefficient Matrix . . . . . . . . . . . 210<br />
UMI Unitary Matrices are Invertible . . . . . . . . . . . . . . . . . . . 212<br />
CUMOS Columns of Unitary Matrices are Orthonormal Sets . . . . . . . 212<br />
UMPIP Unitary Matrices Preserve Inner Products . . . . . . . . . . . . 213<br />
Section CRS Column and Row Spaces<br />
CSCS Column Spaces and Consistent Systems . . . . . . . . . . . . . . 218<br />
BCS Basis of the Column Space . . . . . . . . . . . . . . . . . . . . . 221<br />
CSNM Column Space of a Nonsingular Matrix . . . . . . . . . . . . . . 224<br />
NME4 Nonsingular Matrix Equivalences, Round 4 . . . . . . . . . . . . 224<br />
REMRS Row-Equivalent Matrices have equal Row Spaces . . . . . . . . 227<br />
BRS Basis for the Row Space . . . . . . . . . . . . . . . . . . . . . . 228<br />
CSRST Column Space, Row Space, Transpose . . . . . . . . . . . . . . . 230<br />
Section FS Four Subsets<br />
PEEF Properties of Extended Echelon Form . . . . . . . . . . . . . . . 242<br />
FS Four Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244<br />
Section VS Vector Spaces
ZVU Zero Vector is Unique . . . . . . . . . . . . . . . . . . . . . . . . 266<br />
AIU Additive Inverses are Unique . . . . . . . . . . . . . . . . . . . . 266<br />
ZSSM Zero Scalar in Scalar Multiplication . . . . . . . . . . . . . . . . 266<br />
ZVSM Zero Vector in Scalar Multiplication . . . . . . . . . . . . . . . . 267<br />
AISM Additive Inverses from Scalar Multiplication . . . . . . . . . . . 267<br />
SMEZV Scalar Multiplication Equals the Zero Vector . . . . . . . . . . . 268<br />
Section S Subspaces<br />
TSS Testing Subsets for Subspaces . . . . . . . . . . . . . . . . . . . 274<br />
NSMS Null Space of a Matrix is a Subspace . . . . . . . . . . . . . . . 277<br />
SSS Span of a Set is a Subspace . . . . . . . . . . . . . . . . . . . . . 280<br />
CSMS Column Space of a Matrix is a Subspace . . . . . . . . . . . . . 285<br />
RSMS Row Space of a Matrix is a Subspace . . . . . . . . . . . . . . . 285<br />
LNSMS Left Null Space of a Matrix is a Subspace . . . . . . . . . . . . . 285<br />
Section LISS Linear Independence and Spanning Sets<br />
VRRB Vector Representation Relative to a Basis . . . . . . . . . . . . . 299<br />
Section B Bases<br />
SUVB Standard Unit Vectors are a Basis . . . . . . . . . . . . . . . . . 304<br />
CNMB Columns of Nonsingular Matrix are a Basis . . . . . . . . . . . . 309<br />
NME5 Nonsingular Matrix Equivalences, Round 5 . . . . . . . . . . . . 310<br />
COB Coordinates and Orthonormal Bases . . . . . . . . . . . . . . . . 311<br />
UMCOB Unitary Matrices Convert Orthonormal Bases . . . . . . . . . . 314<br />
Section D Dimension<br />
SSLD Spanning Sets and Linear Dependence . . . . . . . . . . . . . . 318<br />
BIS Bases have Identical Sizes . . . . . . . . . . . . . . . . . . . . . . 322<br />
DCM Dimension of C m . . . . . . . . . . . . . . . . . . . . . . . . . . 323<br />
DP Dimension of P n . . . . . . . . . . . . . . . . . . . . . . . . . . . 323<br />
DM Dimension of M mn . . . . . . . . . . . . . . . . . . . . . . . . . 323<br />
CRN Computing Rank and Nullity . . . . . . . . . . . . . . . . . . . . 326<br />
RPNC Rank Plus Nullity is Columns . . . . . . . . . . . . . . . . . . . 326<br />
RNNM Rank and Nullity of a Nonsingular Matrix . . . . . . . . . . . . 327<br />
NME6 Nonsingular Matrix Equivalences, Round 6 . . . . . . . . . . . . 328<br />
Section PD Properties of Dimension<br />
ELIS Extending Linearly Independent Sets . . . . . . . . . . . . . . . 331<br />
G Goldilocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332<br />
PSSD Proper Subspaces have Smaller Dimension . . . . . . . . . . . . 335<br />
EDYES Equal Dimensions Yields Equal Subspaces . . . . . . . . . . . . 335<br />
RMRT Rank of a Matrix is the Rank of the Transpose . . . . . . . . . . 335
DFS Dimensions of Four Subspaces . . . . . . . . . . . . . . . . . . . 337<br />
Section DM Determinant of a Matrix<br />
EMDRO Elementary Matrices Do Row Operations . . . . . . . . . . . . . 342<br />
EMN Elementary Matrices are Nonsingular . . . . . . . . . . . . . . . 345<br />
NMPEM Nonsingular Matrices are Products of Elementary Matrices . . . 345<br />
DMST Determinant of Matrices of Size Two . . . . . . . . . . . . . . . 347<br />
DER Determinant Expansion about Rows . . . . . . . . . . . . . . . . 348<br />
DT Determinant of the Transpose . . . . . . . . . . . . . . . . . . . 349<br />
DEC Determinant Expansion about Columns . . . . . . . . . . . . . . 350<br />
Section PDM Properties of Determinants of Matrices<br />
DZRC Determinant with Zero Row or Column . . . . . . . . . . . . . . 355<br />
DRCS Determinant for Row or Column Swap . . . . . . . . . . . . . . 355<br />
DRCM Determinant for Row or Column Multiples . . . . . . . . . . . . 356<br />
DERC Determinant with Equal Rows or Columns . . . . . . . . . . . . 357<br />
DRCMA Determinant for Row or Column Multiples and Addition . . . . 358<br />
DIM Determinant of the Identity Matrix . . . . . . . . . . . . . . . . 361<br />
DEM Determinants of Elementary Matrices . . . . . . . . . . . . . . . 361<br />
DEMMM Determinants, Elementary Matrices, Matrix Multiplication . . . 362<br />
SMZD Singular Matrices have Zero Determinants . . . . . . . . . . . . 363<br />
NME7 Nonsingular Matrix Equivalences, Round 7 . . . . . . . . . . . . 364<br />
DRMM Determinant Respects Matrix Multiplication . . . . . . . . . . . 365<br />
Section EE Eigenvalues and Eigenvectors<br />
EMHE Every Matrix Has an Eigenvalue . . . . . . . . . . . . . . . . . . 372<br />
EMRCP Eigenvalues of a Matrix are Roots of Characteristic Polynomials 376<br />
EMS Eigenspace for a Matrix is a Subspace . . . . . . . . . . . . . . . 377<br />
EMNS Eigenspace of a Matrix is a Null Space . . . . . . . . . . . . . . 378<br />
Section PEE Properties of Eigenvalues and Eigenvectors<br />
EDELI Eigenvectors with Distinct Eigenvalues are Linearly Independent 390<br />
SMZE Singular Matrices have Zero Eigenvalues . . . . . . . . . . . . . 391<br />
NME8 Nonsingular Matrix Equivalences, Round 8 . . . . . . . . . . . . 391<br />
ESMM Eigenvalues of a Scalar Multiple of a Matrix . . . . . . . . . . . 392<br />
EOMP Eigenvalues Of Matrix Powers . . . . . . . . . . . . . . . . . . . 392<br />
EPM Eigenvalues of the Polynomial of a Matrix . . . . . . . . . . . . 393<br />
EIM Eigenvalues of the Inverse of a Matrix . . . . . . . . . . . . . . . 394<br />
ETM Eigenvalues of the Transpose of a Matrix . . . . . . . . . . . . . 395<br />
ERMCP Eigenvalues of Real Matrices come <strong>in</strong> Conjugate Pairs . . . . . . 396<br />
DCP Degree of the Characteristic Polynomial . . . . . . . . . . . . . . 396<br />
NEM Number of Eigenvalues of a Matrix . . . . . . . . . . . . . . . . 397
ME Multiplicities of an Eigenvalue . . . . . . . . . . . . . . . . . . . 398<br />
MNEM Maximum Number of Eigenvalues of a Matrix . . . . . . . . . . 400<br />
HMRE Hermitian Matrices have Real Eigenvalues . . . . . . . . . . . . 400<br />
HMOE Hermitian Matrices have Orthogonal Eigenvectors . . . . . . . . 401<br />
Section SD Similarity and Diagonalization<br />
SER Similarity is an Equivalence Relation . . . . . . . . . . . . . . . 405<br />
SMEE Similar Matrices have Equal Eigenvalues . . . . . . . . . . . . . 406<br />
DC Diagonalization Characterization . . . . . . . . . . . . . . . . . . 408<br />
DMFE Diagonalizable Matrices have Full Eigenspaces . . . . . . . . . . 410<br />
DED Distinct Eigenvalues implies Diagonalizable . . . . . . . . . . . . 413<br />
Section LT Linear Transformations<br />
LTTZZ Linear Transformations Take Zero to Zero . . . . . . . . . . . . 425<br />
MBLT Matrices Build Linear Transformations . . . . . . . . . . . . . . 428<br />
MLTCV Matrix of a Linear Transformation, Column Vectors . . . . . . . 430<br />
LTLC Linear Transformations and Linear Combinations . . . . . . . . 432<br />
LTDB Linear Transformation Defined on a Basis . . . . . . . . . . . . 432<br />
SLTLT Sum of Linear Transformations is a Linear Transformation . . . 439<br />
MLTLT Multiple of a Linear Transformation is a Linear Transformation 440<br />
VSLT Vector Space of Linear Transformations . . . . . . . . . . . . . . 441<br />
CLTLT Composition of Linear Transformations is a Linear Transformation 441<br />
Section ILT Injective Linear Transformations<br />
KLTS Kernel of a Linear Transformation is a Subspace . . . . . . . . . 452<br />
KPI Kernel and Pre-Image . . . . . . . . . . . . . . . . . . . . . . . . 453<br />
KILT Kernel of an Injective Linear Transformation . . . . . . . . . . . 455<br />
ILTLI Injective Linear Transformations and Linear Independence . . . 457<br />
ILTB Injective Linear Transformations and Bases . . . . . . . . . . . . 457<br />
ILTD Injective Linear Transformations and Dimension . . . . . . . . . 458<br />
CILTI Composition of Injective Linear Transformations is Injective . . 459<br />
Section SLT Surjective Linear Transformations<br />
RLTS Range of a Linear Transformation is a Subspace . . . . . . . . . 469<br />
RSLT Range of a Surjective Linear Transformation . . . . . . . . . . . 471<br />
SSRLT Spanning Set for Range of a Linear Transformation . . . . . . . 473<br />
RPI Range and Pre-Image . . . . . . . . . . . . . . . . . . . . . . . . 475<br />
SLTB Surjective Linear Transformations and Bases . . . . . . . . . . . 475<br />
SLTD Surjective Linear Transformations and Dimension . . . . . . . . 476<br />
CSLTS Composition of Surjective Linear Transformations is Surjective . 476<br />
Section IVLT Invertible Linear Transformations
ILTLT Inverse of a Linear Transformation is a Linear Transformation . 483<br />
IILT Inverse of an Invertible Linear Transformation . . . . . . . . . . 484<br />
ILTIS Invertible Linear Transformations are Injective and Surjective . 484<br />
CIVLT Composition of Invertible Linear Transformations . . . . . . . . 488<br />
ICLT Inverse of a Composition of Linear Transformations . . . . . . . 488<br />
IVSED Isomorphic Vector Spaces have Equal Dimension . . . . . . . . . 492<br />
ROSLT Rank Of a Surjective Linear Transformation . . . . . . . . . . . 493<br />
NOILT Nullity Of an Injective Linear Transformation . . . . . . . . . . 493<br />
RPNDD Rank Plus Nullity is Domain Dimension . . . . . . . . . . . . . 493<br />
Section VR Vector Representations<br />
VRLT Vector Representation is a Linear Transformation . . . . . . . . 502<br />
VRI Vector Representation is Injective . . . . . . . . . . . . . . . . . 506<br />
VRS Vector Representation is Surjective . . . . . . . . . . . . . . . . 507<br />
VRILT Vector Representation is an Invertible Linear Transformation . . 507<br />
CFDVS Characterization of Finite Dimensional Vector Spaces . . . . . . 508<br />
IFDVS Isomorphism of Finite Dimensional Vector Spaces . . . . . . . . 508<br />
CLI Coordinatization and Linear Independence . . . . . . . . . . . . 509<br />
CSS Coordinatization and Spanning Sets . . . . . . . . . . . . . . . . 509<br />
Section MR Matrix Representations<br />
FTMR Fundamental Theorem of Matrix Representation . . . . . . . . . 517<br />
MRSLT Matrix Representation of a Sum of Linear Transformations . . . 522<br />
MRMLT Matrix Representation of a Multiple of a Linear Transformation 522<br />
MRCLT Matrix Representation of a Composition of Linear Transformations 523<br />
KNSI Kernel and Null Space Isomorphism . . . . . . . . . . . . . . . . 527<br />
RCSI Range and Column Space Isomorphism . . . . . . . . . . . . . . 530<br />
IMR Invertible Matrix Representations . . . . . . . . . . . . . . . . . 534<br />
IMILT Invertible Matrices, Invertible Linear Transformation . . . . . . 537<br />
NME9 Nonsingular Matrix Equivalences, Round 9 . . . . . . . . . . . . 537<br />
Section CB Change of Basis<br />
CB Change-of-Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . 544<br />
ICBM Inverse of Change-of-Basis Matrix . . . . . . . . . . . . . . . . . 544<br />
MRCB Matrix Representation and Change of Basis . . . . . . . . . . . 550<br />
SCB Similarity and Change of Basis . . . . . . . . . . . . . . . . . . . 553<br />
EER Eigenvalues, Eigenvectors, Representations . . . . . . . . . . . . 556<br />
Section OD Orthonormal Diagonalization<br />
PTMT Product of Triangular Matrices is Triangular . . . . . . . . . . . 568<br />
ITMT Inverse of a Triangular Matrix is Triangular . . . . . . . . . . . 569<br />
UTMR Upper Triangular Matrix Representation . . . . . . . . . . . . . 570
OBUTR Orthonormal Basis for Upper Triangular Representation . . . . 573<br />
OD Orthonormal Diagonalization . . . . . . . . . . . . . . . . . . . . 575<br />
OBNM Orthonormal Bases and Normal Matrices . . . . . . . . . . . . . 578<br />
Section CNO Complex Number Operations<br />
PCNA Properties of Complex Number Arithmetic . . . . . . . . . . . . 582<br />
ZPCN Zero Product, Complex Numbers . . . . . . . . . . . . . . . . . 583<br />
ZPZT Zero Product, Zero Terms . . . . . . . . . . . . . . . . . . . . . 584<br />
CCRA Complex Conjugation Respects Addition . . . . . . . . . . . . . 585<br />
CCRM Complex Conjugation Respects Multiplication . . . . . . . . . . 585<br />
CCT Complex Conjugation Twice . . . . . . . . . . . . . . . . . . . . 585<br />
Section SET Sets
Notation<br />
A Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />
[A] ij Matrix Entries . . . . . . . . . . . . . . . . . . . . . . . . . . 22<br />
v Column Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23<br />
[v] i Column Vector Entries . . . . . . . . . . . . . . . . . . . . . . 23<br />
0 Zero Column Vector . . . . . . . . . . . . . . . . . . . . . . . . . 23<br />
LS(A, b) Matrix Representation of a Linear System . . . . . . . . . . . . 24<br />
[A | b] Augmented Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 25<br />
R i ↔ R j Row Operation, Swap . . . . . . . . . . . . . . . . . . . . . . . . 26<br />
αR i Row Operation, Multiply . . . . . . . . . . . . . . . . . . . . . . 26<br />
αR i + R j Row Operation, Add . . . . . . . . . . . . . . . . . . . . . . . . 26<br />
r, D, F Reduced Row-Echelon Form Analysis . . . . . . . . . . . . . . . 28<br />
N (A) Null Space of a Matrix . . . . . . . . . . . . . . . . . . . . . . . 61<br />
I m Identity Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 67<br />
C m Vector Space of Column Vectors . . . . . . . . . . . . . . . . . . 74<br />
u = v Column Vector Equality . . . . . . . . . . . . . . . . . . . . . . 75<br />
u + v Column Vector Addition . . . . . . . . . . . . . . . . . . . . . . 76<br />
αu Column Vector Scalar Multiplication . . . . . . . . . . . . . . . 77<br />
〈S〉 Span of a Set of Vectors . . . . . . . . . . . . . . . . . . . . . . 107<br />
u Complex Conjugate of a Column Vector . . . . . . . . . . . . . 148<br />
〈u, v〉 Inner Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149<br />
‖v‖ Norm of a Vector . . . . . . . . . . . . . . . . . . . . . . . . . . 152<br />
e i Standard Unit Vectors . . . . . . . . . . . . . . . . . . . . . . . 155<br />
M mn Vector Space of Matrices . . . . . . . . . . . . . . . . . . . . . . 162<br />
A = B Matrix Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . 162<br />
A + B Matrix Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . 163<br />
αA Matrix Scalar Multiplication . . . . . . . . . . . . . . . . . . . . 163<br />
O Zero Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165<br />
A t Transpose of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . 166<br />
A Complex Conjugate of a Matrix . . . . . . . . . . . . . . . . . . 168<br />
A ∗ Adjoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170<br />
Au Matrix-Vector Product . . . . . . . . . . . . . . . . . . . . . . . 175<br />
AB Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . 179<br />
A −1 Matrix Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194<br />
C(A) Column Space of a Matrix . . . . . . . . . . . . . . . . . . . . . 217<br />
R(A) Row Space of a Matrix . . . . . . . . . . . . . . . . . . . . . . . 225<br />
L(A) Left Null Space . . . . . . . . . . . . . . . . . . . . . . . . . . . 236<br />
dim (V ) Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318<br />
n (A) Nullity of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 325<br />
r (A) Rank of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 325<br />
Notation Beezer: A First Course in Linear Algebra 624<br />
E i,j Elementary Matrix, Swap . . . . . . . . . . . . . . . . . . . . . . 340<br />
E i (α) Elementary Matrix, Multiply . . . . . . . . . . . . . . . . . . . . 340<br />
E i,j (α) Elementary Matrix, Add . . . . . . . . . . . . . . . . . . . . . . 340<br />
A (i|j) SubMatrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346<br />
|A| Determ<strong>in</strong>ant of a Matrix, Bars . . . . . . . . . . . . . . . . . . . 346<br />
det (A) Determ<strong>in</strong>ant of a Matrix, Functional . . . . . . . . . . . . . . . 346<br />
α A (λ) Algebraic Multiplicity of an Eigenvalue . . . . . . . . . . . . . . 379<br />
γ A (λ) Geometric Multiplicity of an Eigenvalue . . . . . . . . . . . . . . 380<br />
T : U → V Linear Transformation . . . . . . . . . . . . . . . . . . . . . . . 420<br />
K(T ) Kernel of a Linear Transformation . . . . . . . . . . . . . . . . . 451<br />
R(T ) Range of a Linear Transformation . . . . . . . . . . . . . . . . . 468<br />
r (T ) Rank of a Linear Transformation . . . . . . . . . . . . . . . . . 492<br />
n (T ) Nullity of a Linear Transformation . . . . . . . . . . . . . . . . . 492<br />
ρ B (w) Vector Representation . . . . . . . . . . . . . . . . . . . . . . . . 501<br />
MB,C T Matrix Representation . . . . . . . . . . . . . . . . . . . . . . . 514<br />
α = β Complex Number Equality . . . . . . . . . . . . . . . . . . . . . 582<br />
α + β Complex Number Addition . . . . . . . . . . . . . . . . . . . . . 582<br />
αβ Complex Number Multiplication . . . . . . . . . . . . . . . . . . 582<br />
α Conjugate of a Complex Number . . . . . . . . . . . . . . . . . 584<br />
x ∈ S Set Membership . . . . . . . . . . . . . . . . . . . . . . . . . . . 587<br />
S ⊆ T Subset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587<br />
∅ Empty Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587<br />
S = T Set Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588<br />
|S| Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589<br />
S ∪ T Set Union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589<br />
S ∩ T Set Intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . 589<br />
S Set Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
GNU Free Documentation License<br />
Version 1.3, 3 November 2008<br />
Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.<br />
<br />
Everyone is permitted to copy and distribute verbatim copies of this license document, but<br />
changing it is not allowed.<br />
Preamble<br />
The purpose of this License is to make a manual, textbook, or other functional and useful<br />
document “free” in the sense of freedom: to assure everyone the effective freedom to copy<br />
and redistribute it, with or without modifying it, either commercially or noncommercially.<br />
Secondarily, this License preserves for the author and publisher a way to get credit for their<br />
work, while not being considered responsible for modifications made by others.<br />
This License is a kind of “copyleft”, which means that derivative works of the document<br />
must themselves be free in the same sense. It complements the GNU General Public License,<br />
which is a copyleft license designed for free software.<br />
We have designed this License in order to use it for manuals for free software, because<br />
free software needs free documentation: a free program should come with manuals providing<br />
the same freedoms that the software does. But this License is not limited to software<br />
manuals; it can be used for any textual work, regardless of subject matter or whether it<br />
is published as a printed book. We recommend this License principally for works whose<br />
purpose is instruction or reference.<br />
1. APPLICABILITY AND DEFINITIONS<br />
This License applies to any manual or other work, in any medium, that contains a<br />
notice placed by the copyright holder saying it can be distributed under the terms of this<br />
License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use<br />
that work under the conditions stated herein. The “Document”, below, refers to any such<br />
manual or work. Any member of the public is a licensee, and is addressed as “you”. You<br />
accept the license if you copy, modify or distribute the work in a way requiring permission<br />
under copyright law.<br />
A “Modified Version” of the Document means any work containing the Document<br />
or a portion of it, either copied verbatim, or with modifications and/or translated into<br />
another language.<br />
A “Secondary Section” is a named appendix or a front-matter section of the Document<br />
that deals exclusively with the relationship of the publishers or authors of the Document<br />
to the Document’s overall subject (or to related matters) and contains nothing that could<br />
fall directly within that overall subject. (Thus, if the Document is in part a textbook of<br />
mathematics, a Secondary Section may not explain any mathematics.) The relationship<br />
could be a matter of historical connection with the subject or with related matters, or of<br />
legal, commercial, philosophical, ethical or political position regarding them.
GFDL Beezer: A First Course in Linear Algebra 626<br />
The “Invariant Sections” are certain Secondary Sections whose titles are designated,<br />
as being those of Invariant Sections, in the notice that says that the Document is released<br />
under this License. If a section does not fit the above definition of Secondary then it is not<br />
allowed to be designated as Invariant. The Document may contain zero Invariant Sections.<br />
If the Document does not identify any Invariant Sections then there are none.<br />
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover<br />
Texts or Back-Cover Texts, in the notice that says that the Document is released under<br />
this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be<br />
at most 25 words.<br />
A “Transparent” copy of the Document means a machine-readable copy, represented<br />
in a format whose specification is available to the general public, that is suitable for revising<br />
the document straightforwardly with generic text editors or (for images composed of pixels)<br />
generic paint programs or (for drawings) some widely available drawing editor, and that is<br />
suitable for input to text formatters or for automatic translation to a variety of formats<br />
suitable for input to text formatters. A copy made in an otherwise Transparent file format<br />
whose markup, or absence of markup, has been arranged to thwart or discourage subsequent<br />
modification by readers is not Transparent. An image format is not Transparent if used for<br />
any substantial amount of text. A copy that is not “Transparent” is called “Opaque”.<br />
Examples of suitable formats for Transparent copies include plain ASCII without<br />
markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly<br />
available DTD, and standard-conforming simple HTML, PostScript or PDF designed for<br />
human modification. Examples of transparent image formats include PNG, XCF and<br />
JPG. Opaque formats include proprietary formats that can be read and edited only by<br />
proprietary word processors, SGML or XML for which the DTD and/or processing tools are<br />
not generally available, and the machine-generated HTML, PostScript or PDF produced<br />
by some word processors for output purposes only.<br />
The “Title Page” means, for a printed book, the title page itself, plus such following<br />
pages as are needed to hold, legibly, the material this License requires to appear in the title<br />
page. For works in formats which do not have any title page as such, “Title Page” means<br />
the text near the most prominent appearance of the work’s title, preceding the beginning<br />
of the body of the text.<br />
The “publisher” means any person or entity that distributes copies of the Document<br />
to the public.<br />
A section “Entitled XYZ” means a named subunit of the Document whose title<br />
either is precisely XYZ or contains XYZ in parentheses following text that translates<br />
XYZ in another language. (Here XYZ stands for a specific section name mentioned below,<br />
such as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To<br />
“Preserve the Title” of such a section when you modify the Document means that it<br />
remains a section “Entitled XYZ” according to this definition.<br />
The Document may include Warranty Disclaimers next to the notice which states that<br />
this License applies to the Document. These Warranty Disclaimers are considered to be<br />
included by reference in this License, but only as regards disclaiming warranties: any other<br />
implication that these Warranty Disclaimers may have is void and has no effect on the<br />
meaning of this License.
2. VERBATIM COPYING<br />
You may copy and distribute the Document in any medium, either commercially or<br />
noncommercially, provided that this License, the copyright notices, and the license notice<br />
saying this License applies to the Document are reproduced in all copies, and that you<br />
add no other conditions whatsoever to those of this License. You may not use technical<br />
measures to obstruct or control the reading or further copying of the copies you make or<br />
distribute. However, you may accept compensation in exchange for copies. If you distribute<br />
a large enough number of copies you must also follow the conditions in section 3.<br />
You may also lend copies, under the same conditions stated above, and you may publicly<br />
display copies.<br />
3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled “History”, Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled “History” in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the “History” section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled “Acknowledgements” or “Dedications”, Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
M. Delete any section Entitled “Endorsements”. Such a section may not be included in the Modified Version.

N. Do not retitle any existing section to be Entitled “Endorsements” or to conflict in title with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.

You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements”.
6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.
9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, or distribute it is void, and will automatically terminate your rights under this License.
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, receipt of a copy of some or all of the same material does not give you any rights to use it.
10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. If the Document specifies that a proxy can decide which future versions of this License can be used, that proxy’s public statement of acceptance of a version permanently authorizes you to choose that version for the Document.
11. RELICENSING

“Massive Multiauthor Collaboration Site” (or “MMC Site”) means any World Wide Web server that publishes copyrightable works and also provides prominent facilities for anybody to edit those works. A public wiki that anybody can edit is an example of such a server. A “Massive Multiauthor Collaboration” (or “MMC”) contained in the site means any set of copyrightable works thus published on the MMC site.

“CC-BY-SA” means the Creative Commons Attribution-Share Alike 3.0 license published by Creative Commons Corporation, a not-for-profit corporation with a principal place of business in San Francisco, California, as well as future copyleft versions of that license published by that same organization.

“Incorporate” means to publish or republish a Document, in whole or in part, as part of another Document.

An MMC is “eligible for relicensing” if it is licensed under this License, and if all works that were first published under this License somewhere other than this MMC, and subsequently incorporated in whole or in part into the MMC, (1) had no cover texts or invariant sections, and (2) were thus incorporated prior to November 1, 2008.
The operator of an MMC Site may republish an MMC contained in the site under CC-BY-SA on the same site at any time before August 1, 2009, provided the MMC is eligible for relicensing.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with ... Texts.” line with this:

with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.