
Appendix A

Linear algebra

A.1 Main-diagonal δ operator, λ, trace, vec

We introduce notation δ denoting the main-diagonal linear self-adjoint operator. When linear function δ operates on a square matrix A ∈ R^{N×N}, δ(A) returns a vector composed of all the entries from the main diagonal in the natural order;

    δ(A) ∈ R^N    (1391)

Operating on a vector y ∈ R^N, δ naturally returns a diagonal matrix;

    δ(y) ∈ S^N    (1392)

Operating recursively on a vector Λ ∈ R^N or diagonal matrix Λ ∈ S^N, δ(δ(Λ)) returns Λ itself;

    δ²(Λ) ≡ δ(δ(Λ)) = Λ    (1393)

Defined in this manner, main-diagonal linear operator δ is self-adjoint [222, §3.10, §9.5-1];^A.1 videlicet, (§2.2)

    δ(A)^T y = ⟨δ(A), y⟩ = ⟨A, δ(y)⟩ = tr(A^T δ(y))    (1394)

A.1 Linear operator T : R^{m×n} → R^{M×N} is self-adjoint when, for each and every X₁, X₂ ∈ R^{m×n}, ⟨T(X₁), X₂⟩ = ⟨X₁, T(X₂)⟩.

© 2001 Jon Dattorro. co&edg version 2010.01.05. All rights reserved.
Citation: Dattorro, Convex Optimization & Euclidean Distance Geometry, Mεβoo Publishing USA, 2005, v2010.01.05.
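A minimal numerical sketch of the δ operator and its self-adjointness (1394), using NumPy; the function name `delta` and the random test data are illustrative assumptions, not notation from the text:

```python
import numpy as np

def delta(M):
    """Main-diagonal operator δ: a matrix maps to its diagonal vector, a vector to a diagonal matrix."""
    return np.diag(np.asarray(M))          # np.diag handles both directions

rng = np.random.default_rng(0)
N = 4
A = rng.standard_normal((N, N))
y = rng.standard_normal(N)
Lam = rng.standard_normal(N)

assert np.allclose(delta(delta(Lam)), Lam)                 # (1393): δ(δ(Λ)) = Λ
assert np.isclose(delta(A) @ y, np.trace(A.T @ delta(y)))  # (1394): self-adjointness
```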


A.1.1 Identities

This δ notation is efficient and unambiguous as illustrated in the following examples where: A ∘ B denotes Hadamard product [198] [155, §1.1.4] of matrices of like size, ⊗ Kronecker product [162] (§D.1.2.1), y a vector, X a matrix, e_i the i-th member of the standard basis for R^n, S^N_h the symmetric hollow subspace, σ(A) a vector of (nonincreasingly) ordered singular values, and λ(A) a vector of nonincreasingly ordered eigenvalues of matrix A :

1. δ(A) = δ(A^T)

2. tr(A) = tr(A^T) = δ(A)^T 1 = ⟨I, A⟩

3. δ(cA) = c δ(A) ,  c ∈ R

4. tr(cA) = c tr(A) = c 1^T λ(A) ,  c ∈ R

5. vec(cA) = c vec(A) ,  c ∈ R

6. A ∘ cB = cA ∘ B ,  c ∈ R

7. A ⊗ cB = cA ⊗ B ,  c ∈ R

8. δ(A + B) = δ(A) + δ(B)

9. tr(A + B) = tr(A) + tr(B)

10. vec(A + B) = vec(A) + vec(B)

11. (A + B) ∘ C = A ∘ C + B ∘ C
    A ∘ (B + C) = A ∘ B + A ∘ C

12. (A + B) ⊗ C = A ⊗ C + B ⊗ C
    A ⊗ (B + C) = A ⊗ B + A ⊗ C

13. sgn(c) λ(|c|A) = c λ(A) ,  c ∈ R

14. sgn(c) σ(|c|A) = c σ(A) ,  c ∈ R

15. tr(c √(A^T A)) = c tr √(A^T A) = c 1^T σ(A) ,  c ∈ R

16. π(δ(A)) = λ(I ∘ A) where π is the presorting function.

17. δ(AB) = (A ∘ B^T)1 = (B^T ∘ A)1

18. δ(AB)^T = 1^T (A^T ∘ B) = 1^T (B ∘ A^T)


19. δ(uv^T) = [u₁v₁ ⋯ u_N v_N]^T = u ∘ v ,  u,v ∈ R^N

20. tr(A^T B) = tr(AB^T) = tr(BA^T) = tr(B^T A)
    = 1^T (A ∘ B)1 = 1^T δ(AB^T) = δ(A^T B)^T 1 = δ(BA^T)^T 1 = δ(B^T A)^T 1

21. D = [d_ij] ∈ S^N_h , H = [h_ij] ∈ S^N_h , V = I − (1/N)11^T ∈ S^N  (confer §B.4.2 no.20)
    N tr(−V(D ∘ H)V) = tr(D^T H) = 1^T (D ∘ H)1 = tr(11^T (D ∘ H)) = Σ_{i,j} d_ij h_ij

22. tr(ΛA) = δ(Λ)^T δ(A) ,  δ²(Λ) = Λ ∈ S^N

23. y^T Bδ(A) = tr(Bδ(A)y^T) = tr(δ(B^T y)A) = tr(Aδ(B^T y))
    = δ(A)^T B^T y = tr(yδ(A)^T B^T) = tr(A^T δ(B^T y)) = tr(δ(B^T y)A^T)

24. δ²(A^T A) = Σ_i e_i e_i^T A^T A e_i e_i^T

25. δ(δ(A)1^T) = δ(1δ(A)^T) = δ(A)

26. δ(A1)1 = δ(A11^T) = A1 ,  δ(y)1 = δ(y1^T) = y

27. δ(I1) = δ(1) = I

28. δ(e_i e_j^T 1) = δ(e_i) = e_i e_i^T

29. For ζ = [ζ_i] ∈ R^k and x = [x_i] ∈ R^k ,  Σ_i ζ_i/x_i = ζ^T δ(x)^{−1} 1

30. vec(A ∘ B) = vec(A) ∘ vec(B) = δ(vec A) vec(B)    (42)(1747)
    = vec(B) ∘ vec(A) = δ(vec B) vec(A)

31. vec(AXB) = (B^T ⊗ A) vec X    (not B^H)

32. vec(BXA) = (A^T ⊗ B) vec X

33. tr(AXBX^T) = vec(X)^T vec(AXB) = vec(X)^T (B^T ⊗ A) vec X    [162]
    = δ(vec(X) vec(X)^T (B^T ⊗ A))^T 1

34. tr(AX^T BX) = vec(X)^T vec(BXA) = vec(X)^T (A^T ⊗ B) vec X
    = δ(vec(X) vec(X)^T (A^T ⊗ B))^T 1


35. For any permutation matrix Ξ and dimensionally compatible vector y or matrix A

        δ(Ξy) = Ξδ(y)Ξ^T    (1395)
        δ(ΞAΞ^T) = Ξδ(A)    (1396)

    So given any permutation matrix Ξ and any dimensionally compatible matrix B , for example,

        δ²(B) = Ξδ²(Ξ^T B Ξ)Ξ^T    (1397)

36. A ⊗ 1 = 1 ⊗ A = A

37. A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C

38. (A ⊗ B)(C ⊗ D) = AC ⊗ BD

39. For A a vector, (A ⊗ B) = (A ⊗ I)B

40. For B a row vector, (A ⊗ B) = A(I ⊗ B)

41. (A ⊗ B)^T = A^T ⊗ B^T

42. (A ⊗ B)^{−1} = A^{−1} ⊗ B^{−1}

43. tr(A ⊗ B) = tr A tr B

44. For A ∈ R^{m×m}, B ∈ R^{n×n} ,  det(A ⊗ B) = det^n(A) det^m(B)

45. There exist permutation matrices Ξ₁ and Ξ₂ such that [162, p.28]

        A ⊗ B = Ξ₁(B ⊗ A)Ξ₂    (1398)

46. For eigenvalues λ(A) ∈ C^n and eigenvectors v(A) ∈ C^{n×n} such that A = vδ(λ)v^{−1} ∈ R^{n×n}

        λ(A ⊗ B) = λ(A) ⊗ λ(B) ,  v(A ⊗ B) = v(A) ⊗ v(B)    (1399)

47. Given analytic function [80] f : C^{n×n} → C^{n×n}, then f(I ⊗ A) = I ⊗ f(A) and f(A ⊗ I) = f(A) ⊗ I  [162, p.28]
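A brief numerical spot-check of a few of these identities (nos. 17, 31, 38, 43, 44) with NumPy; the random matrices and dimensions are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n)); D = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
vec = lambda M: M.flatten(order='F')          # column-major vec, matching vec(AXB) = (Bᵀ⊗A) vecX

# no.17: δ(AB) = (A ∘ Bᵀ)1
assert np.allclose(np.diag(A @ B), (A * B.T) @ np.ones(n))
# no.31: vec(AXB) = (Bᵀ ⊗ A) vec X
assert np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))
# no.38: (A ⊗ B)(C ⊗ D) = AC ⊗ BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# no.43, no.44 (both factors n×n here): tr(A⊗B) = trA·trB ,  det(A⊗B) = detⁿA·detⁿB
assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
assert np.isclose(np.linalg.det(np.kron(A, B)), np.linalg.det(A)**n * np.linalg.det(B)**n)
```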


A.1.2 Majorization

A.1.2.0.1 Theorem. (Schur) Majorization. [386, §7.4] [198, §4.3] [199, §5.5]
Let λ ∈ R^N denote a given vector of eigenvalues and let δ ∈ R^N denote a given vector of main diagonal entries, both arranged in nonincreasing order. Then

    ∃A ∈ S^N  λ(A) = λ and δ(A) = δ  ⇐  λ − δ ∈ K*_λδ    (1400)

and conversely

    A ∈ S^N  ⇒  λ(A) − δ(A) ∈ K*_λδ    (1401)

The difference belongs to the pointed polyhedral cone of majorization (empty interior, confer (291))

    K*_λδ ≜ K*_M+ ∩ {ζ1 | ζ ∈ R}*    (1402)

where K*_M+ is the dual monotone nonnegative cone (413), and where the dual of the line is a hyperplane; ∂H = {ζ1 | ζ ∈ R}* = 1^⊥ .    ⋄

Majorization cone K*_λδ is naturally consequent to the definition of majorization; id est, vector y ∈ R^N majorizes vector x if and only if

    Σ_{i=1}^k x_i ≤ Σ_{i=1}^k y_i  ∀ 1 ≤ k ≤ N    (1403)

and

    1^T x = 1^T y    (1404)

Under these circumstances, rather, vector x is majorized by vector y .

In the particular circumstance δ(A) = 0 we get:

A.1.2.0.2 Corollary. Symmetric hollow majorization.
Let λ ∈ R^N denote a given vector of eigenvalues arranged in nonincreasing order. Then

    ∃A ∈ S^N_h  λ(A) = λ  ⇐  λ ∈ K*_λδ    (1405)

and conversely

    A ∈ S^N_h  ⇒  λ(A) ∈ K*_λδ    (1406)

where K*_λδ is defined in (1402).    ⋄
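A hedged numerical illustration of the Schur majorization theorem: for a random symmetric matrix, the sorted eigenvalue vector majorizes the sorted main diagonal. The helper name `majorizes` is an assumption for this sketch, not book notation:

```python
import numpy as np

def majorizes(y, x, tol=1e-9):
    """True if y majorizes x: partial sums of sorted y dominate those of sorted x, with equal totals."""
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]
    return np.all(np.cumsum(ys) >= np.cumsum(xs) - tol) and abs(xs.sum() - ys.sum()) < tol

rng = np.random.default_rng(2)
N = 6
M = rng.standard_normal((N, N))
A = (M + M.T) / 2                      # random symmetric matrix
lam = np.linalg.eigvalsh(A)            # eigenvalues share the trace with the diagonal
assert majorizes(lam, np.diag(A))      # (1401): λ(A) − δ(A) lies in the majorization cone
```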


A.2 Semidefiniteness: domain of test

The most fundamental necessary, sufficient, and definitive test for positive semidefiniteness of matrix A ∈ R^{n×n} is: [199, §1]

    x^T Ax ≥ 0 for each and every x ∈ R^n such that ‖x‖ = 1    (1407)

Traditionally, authors demand evaluation over a broader domain; namely, over all x ∈ R^n , which is sufficient but unnecessary. Indeed, that standard textbook requirement is far over-reaching because if x^T Ax is nonnegative for particular x = x_p , then it is nonnegative for any αx_p where α ∈ R . Thus, only normalized x in R^n need be evaluated.

Many authors add the further requirement that the domain be complex; the broadest domain. By so doing, only Hermitian matrices (A^H = A where superscript H denotes conjugate transpose)^A.2 are admitted to the set of positive semidefinite matrices (1410); an unnecessarily prohibitive condition.

A.2.1 Symmetry versus semidefiniteness

We call (1407) the most fundamental test of positive semidefiniteness. Yet some authors instead say, for real A and complex domain {x ∈ C^n} , the complex test x^H Ax ≥ 0 is most fundamental. That complex broadening of the domain of test causes nonsymmetric real matrices to be excluded from the set of positive semidefinite matrices. Yet admitting nonsymmetric real matrices or not is a matter of preference^A.3 unless that complex test is adopted, as we shall now explain.

Any real square matrix A has a representation in terms of its symmetric and antisymmetric parts; id est,

    A = (A + A^T)/2 + (A − A^T)/2    (53)

Because, for all real A , the antisymmetric part vanishes under real test,

    x^T ((A − A^T)/2) x = 0    (1408)

A.2 Hermitian symmetry is the complex analogue; the real part of a Hermitian matrix is symmetric while its imaginary part is antisymmetric. A Hermitian matrix has real eigenvalues and real main diagonal.
A.3 Golub & Van Loan [155, §4.2.2], for example, admit nonsymmetric real matrices.


only the symmetric part of A , (A + A^T)/2 , has a role determining positive semidefiniteness. Hence the oft-made presumption that only symmetric matrices may be positive semidefinite is, of course, erroneous under (1407). Because eigenvalue-signs of a symmetric matrix translate unequivocally to its semidefiniteness, the eigenvalues that determine semidefiniteness are always those of the symmetrized matrix. (§A.3) For that reason, and because symmetric (or Hermitian) matrices must have real eigenvalues, the convention adopted in the literature is that semidefinite matrices are synonymous with symmetric semidefinite matrices. Certainly misleading under (1407), that presumption is typically bolstered with compelling examples from the physical sciences where symmetric matrices occur within the mathematical exposition of natural phenomena.^A.4 [137, §52]

Perhaps a better explanation of this pervasive presumption of symmetry comes from Horn & Johnson [198, §7.1] whose perspective^A.5 is the complex matrix, thus necessitating the complex domain of test throughout their treatise. They explain, if A ∈ C^{n×n}

    ...and if x^H Ax is real for all x ∈ C^n , then A is Hermitian. Thus, the assumption that A is Hermitian is not necessary in the definition of positive definiteness. It is customary, however.

Their comment is best explained by noting, the real part of x^H Ax comes from the Hermitian part (A + A^H)/2 of A ;

    re(x^H Ax) = x^H ((A + A^H)/2) x    (1409)

rather,

    x^H Ax ∈ R  ⇔  A^H = A    (1410)

because the imaginary part of x^H Ax comes from the anti-Hermitian part (A − A^H)/2 ;

    im(x^H Ax) = x^H ((A − A^H)/2) x    (1411)

that vanishes for nonzero x if and only if A = A^H . So the Hermitian symmetry assumption is unnecessary, according to Horn & Johnson, not

A.4 Symmetric matrices are certainly pervasive in our chosen subject as well.
A.5 A totally complex perspective is not necessarily more advantageous. The positive semidefinite cone, for example, is not selfdual (§2.13.5) in the ambient space of Hermitian matrices. [192, §II]


because nonHermitian matrices could be regarded positive semidefinite, rather because nonHermitian (includes nonsymmetric real) matrices are not comparable on the real line under x^H Ax . Yet that complex edifice is dismantled in the test of real matrices (1407) because the domain of test is no longer necessarily complex; meaning, x^T Ax will certainly always be real, regardless of symmetry, and so real A will always be comparable.

In summary, if we limit the domain of test to all x in R^n as in (1407), then nonsymmetric real matrices are admitted to the realm of semidefinite matrices because they become comparable on the real line. One important exception occurs for rank-one matrices Ψ = uv^T where u and v are real vectors: Ψ is positive semidefinite if and only if Ψ = uu^T . (§A.3.1.0.7) We might choose to expand the domain of test to all x in C^n so that only symmetric matrices would be comparable. The alternative to expanding domain of test is to assume all matrices of interest to be symmetric; that is commonly done, hence the synonymous relationship with semidefinite matrices.
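A small numerical sketch of the point above: under the real test (1407), only the symmetric part of a real matrix matters, so a nonsymmetric matrix can pass the test. The matrix construction here is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
P = rng.standard_normal((n, n)); P = P @ P.T + np.eye(n)   # symmetric positive definite part
K = rng.standard_normal((n, n)); K = (K - K.T) / 2         # antisymmetric part
A = P + K                                                  # nonsymmetric real matrix

X = rng.standard_normal((n, 1000))                         # many test vectors as columns
quad     = np.einsum('ij,ik,kj->j', X, A, X)               # xᵀAx
sym_quad = np.einsum('ij,ik,kj->j', X, (A + A.T) / 2, X)   # xᵀ((A+Aᵀ)/2)x

assert np.allclose(quad, sym_quad)   # antisymmetric part contributes nothing (1408)
assert np.all(quad > 0)              # nonsymmetric A passes the real test (1407)
```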

A.2.1.0.1 Example. Nonsymmetric positive definite product.
Horn & Johnson assert and Zhang agrees:

    If A,B ∈ C^{n×n} are positive definite, then we know that the product AB is positive definite if and only if AB is Hermitian. [198, §7.6, prob.10] [386, §6.2, §3.2]

Implicitly in their statement, A and B are assumed individually Hermitian and the domain of test is assumed complex.

We prove that assertion to be false for real matrices under (1407) that adopts a real domain of test.

    A^T = A = ⎡ 3  0 −1  0⎤ ,      λ(A) = ⎡5.9⎤    (1412)
              ⎢ 0  5  1  0⎥               ⎢4.5⎥
              ⎢−1  1  4  1⎥               ⎢3.4⎥
              ⎣ 0  0  1  4⎦               ⎣2.0⎦

    B^T = B = ⎡ 4  4 −1 −1⎤ ,      λ(B) = ⎡8.8 ⎤    (1413)
              ⎢ 4  5  0  0⎥               ⎢5.5 ⎥
              ⎢−1  0  5  1⎥               ⎢3.3 ⎥
              ⎣−1  0  1  4⎦               ⎣0.24⎦


    (AB)^T ≠ AB = ⎡13 12 −8 −4⎤ ,      λ(AB) = ⎡36. ⎤    (1414)
                  ⎢19 25  5  1⎥                 ⎢29. ⎥
                  ⎢−5  1 22  9⎥                 ⎢10. ⎥
                  ⎣−5  0  9 17⎦                 ⎣0.72⎦

    (1/2)(AB + (AB)^T) = ⎡ 13   15.5 −6.5 −4.5⎤ ,   λ((1/2)(AB + (AB)^T)) = ⎡36.  ⎤    (1415)
                         ⎢15.5  25    3    0.5⎥                             ⎢30.  ⎥
                         ⎢−6.5   3   22    9  ⎥                             ⎢10.  ⎥
                         ⎣−4.5   0.5  9   17  ⎦                             ⎣0.014⎦

Whenever A ∈ S^n_+ and B ∈ S^n_+ , then λ(AB) = λ(√A B √A) will always be a nonnegative vector by (1439) and Corollary A.3.1.0.5. Yet positive definiteness of product AB is certified instead by the nonnegative eigenvalues λ((1/2)(AB + (AB)^T)) in (1415) (§A.3.1.0.1) despite the fact that AB is not symmetric.^A.6 Horn & Johnson and Zhang resolve the anomaly by choosing to exclude nonsymmetric matrices and products; they do so by expanding the domain of test to C^n .    
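A short NumPy verification of this example, assuming the matrices (1412) and (1413) as given; it confirms that A and B are positive definite, that AB is not symmetric, and that the symmetrized product nonetheless has positive eigenvalues:

```python
import numpy as np

A = np.array([[ 3, 0,-1, 0],
              [ 0, 5, 1, 0],
              [-1, 1, 4, 1],
              [ 0, 0, 1, 4]], dtype=float)
B = np.array([[ 4, 4,-1,-1],
              [ 4, 5, 0, 0],
              [-1, 0, 5, 1],
              [-1, 0, 1, 4]], dtype=float)

assert np.all(np.linalg.eigvalsh(A) > 0) and np.all(np.linalg.eigvalsh(B) > 0)

AB = A @ B
print(np.allclose(AB, AB.T))               # False: AB is not symmetric
print(np.linalg.eigvals(AB).real)          # ≈ 36, 29, 10, 0.72   as in (1414)
sym = (AB + AB.T) / 2
print(np.linalg.eigvalsh(sym))             # ≈ 0.014, 10, 30, 36  as in (1415)
assert np.all(np.linalg.eigvalsh(sym) > 0) # AB passes the real test (1407)
```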

A.3 Proper statements of positive semidefiniteness

Unlike Horn & Johnson and Zhang, we never adopt a complex domain of test with real matrices. So motivated is our consideration of proper statements of positive semidefiniteness under real domain of test. This restriction, ironically, complicates the facts when compared to corresponding statements for the complex case (found elsewhere [198] [386]).

We state several fundamental facts regarding positive semidefiniteness of real matrix A and the product AB and sum A + B of real matrices under fundamental real test (1407); a few require proof as they depart from the standard texts, while those remaining are well established or obvious.

A.6 It is a little more difficult to find a counter-example in R^{2×2} or R^{3×3} ; which may have served to advance any confusion.


A.3.0.0.1 Theorem. Positive (semi)definite matrix.
A ∈ S^M is positive semidefinite if and only if for each and every vector x ∈ R^M of unit norm, ‖x‖ = 1 ,^A.7 we have x^T Ax ≥ 0 (1407);

    A ≽ 0 ⇔ tr(xx^T A) = x^T Ax ≥ 0  ∀xx^T    (1416)

Matrix A ∈ S^M is positive definite if and only if for each and every ‖x‖ = 1 we have x^T Ax > 0 ;

    A ≻ 0 ⇔ tr(xx^T A) = x^T Ax > 0  ∀xx^T , xx^T ≠ 0    (1417)
                                                                ⋄

Proof. Statements (1416) and (1417) are each a particular instance of dual generalized inequalities (§2.13.2) with respect to the positive semidefinite cone; videlicet, [353]

    A ≽ 0 ⇔ ⟨xx^T , A⟩ ≥ 0  ∀xx^T (≽ 0)
    A ≻ 0 ⇔ ⟨xx^T , A⟩ > 0  ∀xx^T (≽ 0) , xx^T ≠ 0    (1418)

This says: positive semidefinite matrix A must belong to the normal side of every hyperplane whose normal is an extreme direction of the positive semidefinite cone. Relations (1416) and (1417) remain true when xx^T is replaced with "for each and every" positive semidefinite matrix X ∈ S^M_+ (§2.13.5) of unit norm, ‖X‖ = 1 , as in

    A ≽ 0 ⇔ tr(XA) ≥ 0  ∀X ∈ S^M_+
    A ≻ 0 ⇔ tr(XA) > 0  ∀X ∈ S^M_+ , X ≠ 0    (1419)

But that condition is more than what is necessary. By the discretized membership theorem in §2.13.4.2.1, the extreme directions xx^T of the positive semidefinite cone constitute a minimal set of generators necessary and sufficient for discretization of dual generalized inequalities (1419) certifying membership to that cone.    

A.7 The traditional condition requiring all x ∈ R^M for defining positive (semi)definiteness is actually more than what is necessary. The set of norm-1 vectors is necessary and sufficient to establish positive semidefiniteness; actually, any particular norm and any nonzero norm-constant will work.


A.3.1 Semidefiniteness, eigenvalues, nonsymmetric

When A ∈ R^{n×n} , let λ(½(A + A^T)) ∈ R^n denote eigenvalues of the symmetrized matrix^A.8 arranged in nonincreasing order.

By positive semidefiniteness of A ∈ R^{n×n} we mean,^A.9 [266, §1.3.1] (confer §A.3.1.0.1)

    x^T Ax ≥ 0 ∀x ∈ R^n ⇔ A + A^T ≽ 0 ⇔ λ(A + A^T) ≽ 0    (1420)

(§2.9.0.1)

    A ≽ 0 ⇒ A^T = A    (1421)

    A ≽ B ⇔ A − B ≽ 0 ⇏ A ≽ 0 or B ≽ 0    (1422)

    x^T Ax ≥ 0 ∀x ⇏ A^T = A    (1423)

Matrix symmetry is not intrinsic to positive semidefiniteness;

    A^T = A , λ(A) ≽ 0 ⇒ x^T Ax ≥ 0 ∀x    (1424)

    λ(A) ≽ 0 ⇐ A^T = A , x^T Ax ≥ 0 ∀x    (1425)

If A^T = A then

    λ(A) ≽ 0 ⇔ A ≽ 0    (1426)

meaning, matrix A belongs to the positive semidefinite cone in the subspace of symmetric matrices if and only if its eigenvalues belong to the nonnegative orthant.

    ⟨A , A⟩ = ⟨λ(A), λ(A)⟩    (45)

For µ ∈ R , A ∈ R^{n×n} , and vector λ(A) ∈ C^n holding the ordered eigenvalues of A

    λ(µI + A) = µ1 + λ(A)    (1427)

Proof: A = MJM^{−1} and µI + A = M(µI + J)M^{−1} where J is the Jordan form for A ; [325, §5.6, App.B] id est, δ(J) = λ(A) , so λ(µI + A) = δ(µI + J) because µI + J is also a Jordan form.    

A.8 The symmetrization of A is (A + A^T)/2 .  λ(½(A + A^T)) = λ(A + A^T)/2 .
A.9 Strang agrees [325, p.334] it is not λ(A) that requires observation. Yet he is mistaken by proposing the Hermitian part alone x^H(A + A^H)x be tested, because the anti-Hermitian part does not vanish under complex test unless A is Hermitian. (1411)


By similar reasoning,

    λ(I + µA) = 1 + λ(µA)    (1428)

For vector σ(A) holding the singular values of any matrix A

    σ(I + µA^T A) = π(|1 + µσ(A^T A)|)    (1429)
    σ(µI + A^T A) = π(|µ1 + σ(A^T A)|)    (1430)

where π is the nonlinear permutation-operator sorting its vector argument into nonincreasing order.

For A ∈ S^M and each and every ‖w‖ = 1  [198, §7.7, prob.9]

    w^T Aw ≤ µ ⇔ A ≼ µI ⇔ λ(A) ≼ µ1    (1431)

[198, §2.5.4] (confer (44))

    A is normal matrix ⇔ ‖A‖²_F = λ(A)^T λ(A)    (1432)

For A ∈ R^{m×n}

    A^T A ≽ 0 ,  AA^T ≽ 0    (1433)

because, for dimensionally compatible vector x , x^T A^T Ax = ‖Ax‖²₂ , x^T AA^T x = ‖A^T x‖²₂ .

For A ∈ R^{n×n} and c ∈ R

    tr(cA) = c tr(A) = c 1^T λ(A)    (§A.1.1 no.4)

For m a nonnegative integer, (1828)

    det(A^m) = ∏_{i=1}^n λ(A)_i^m    (1434)

    tr(A^m) = ∑_{i=1}^n λ(A)_i^m    (1435)


For A diagonalizable (§A.5), A = SΛS^{−1} , (confer [325, p.255])

    rank A = rank δ(λ(A)) = rank Λ    (1436)

meaning, rank is equal to the number of nonzero eigenvalues in vector

    λ(A) ≜ δ(Λ)    (1437)

by the 0 eigenvalues theorem (§A.7.3.0.1).

(Ky Fan) For A,B ∈ S^n  [53, §1.2] (confer (1703))

    tr(AB) ≤ λ(A)^T λ(B)    (1689)

with equality (Theobald) when A and B are simultaneously diagonalizable [198] with the same ordering of eigenvalues.

For A ∈ R^{m×n} and B ∈ R^{n×m}

    tr(AB) = tr(BA)    (1438)

and η eigenvalues of the product and commuted product are identical, including their multiplicity; [198, §1.3.20] id est,

    λ(AB)_{1:η} = λ(BA)_{1:η} ,  η ≜ min{m , n}    (1439)

Any eigenvalues remaining are zero. By the 0 eigenvalues theorem (§A.7.3.0.1),

    rank(AB) = rank(BA) ,  AB and BA diagonalizable    (1440)

For any compatible matrices A,B  [198, §0.4]

    min{rank A, rank B} ≥ rank(AB)    (1441)

For A,B ∈ S^n_+

    rank A + rank B ≥ rank(A + B) ≥ min{rank A, rank B} ≥ rank(AB)    (1442)

For linearly independent matrices A,B ∈ S^n_+ (§2.1.2; R(A) ∩ R(B) = 0 , R(A^T) ∩ R(B^T) = 0 , §B.1.1),

    rank A + rank B = rank(A + B) > min{rank A, rank B} ≥ rank(AB)    (1443)
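A quick NumPy spot-check of the Ky Fan trace inequality (1689) and of the shared nonzero spectrum of AB and BA (1439); the random test matrices are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 5, 3
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

# Ky Fan (1689): tr(AB) ≤ λ(A)ᵀλ(B), both spectra in nonincreasing order
lamA = np.sort(np.linalg.eigvalsh(A))[::-1]
lamB = np.sort(np.linalg.eigvalsh(B))[::-1]
assert np.trace(A @ B) <= lamA @ lamB + 1e-9

# (1439): AB and BA share their η = min{m,n} nonzero eigenvalues (compared here by magnitude)
C = rng.standard_normal((m, n)); D = rng.standard_normal((n, m))
eig_CD = np.abs(np.linalg.eigvals(C @ D))      # m eigenvalues
eig_DC = np.abs(np.linalg.eigvals(D @ C))      # n eigenvalues; the extra n−m are 0
assert np.allclose(np.sort(eig_DC)[-m:], np.sort(eig_CD))
assert np.allclose(np.sort(eig_DC)[:n - m], 0, atol=1e-9)
```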


Because R(A^T A) = R(A^T) and R(AA^T) = R(A) (p.624), for any A ∈ R^{m×n}

    rank(AA^T) = rank(A^T A) = rank A = rank A^T    (1444)

For A ∈ R^{m×n} having no nullspace, and for any B ∈ R^{n×k}

    rank(AB) = rank(B)    (1445)

Proof. For any compatible matrix C , N(CAB) ⊇ N(AB) ⊇ N(B) is obvious. By assumption ∃A†  A†A = I . Let C = A† , then N(AB) = N(B) and the stated result follows by conservation of dimension (1549).    

For A ∈ S^n and any nonsingular matrix Y

    inertia(A) = inertia(YAY^T)    (1446)

a.k.a Sylvester's law of inertia. (1489) [110, §2.4.3]

For A,B ∈ R^{n×n} square, [198, §0.3.5]

    det(AB) = det(BA)    (1447)

    det(AB) = det A det B    (1448)

Yet for A ∈ R^{m×n} and B ∈ R^{n×m}  [75, p.72]

    det(I + AB) = det(I + BA)    (1449)

For A,B ∈ S^n , product AB is symmetric if and only if AB is commutative;

    (AB)^T = AB ⇔ AB = BA    (1450)

Proof. (⇒) Suppose AB = (AB)^T . (AB)^T = B^T A^T = BA . AB = (AB)^T ⇒ AB = BA .
(⇐) Suppose AB = BA . BA = B^T A^T = (AB)^T . AB = BA ⇒ AB = (AB)^T .    

Commutativity alone is insufficient for symmetry of the product. [325, p.26]


Diagonalizable matrices A,B ∈ R^{n×n} commute if and only if they are simultaneously diagonalizable. [198, §1.3.12] A product of diagonal matrices is always commutative.

For A,B ∈ R^{n×n} and AB = BA

    x^T Ax ≥ 0 , x^T Bx ≥ 0 ∀x ⇒ λ(A+A^T)_i λ(B+B^T)_i ≥ 0 ∀i ⇏ x^T ABx ≥ 0 ∀x    (1451)

the negative result arising because of the schism between the product of eigenvalues λ(A+A^T)_i λ(B+B^T)_i and the eigenvalues of the symmetrized matrix product λ(AB + (AB)^T)_i . For example, X² is generally not positive semidefinite unless matrix X is symmetric; then (1433) applies. Simply substituting symmetric matrices changes the outcome:

For A,B ∈ S^n and AB = BA

    A ≽ 0 , B ≽ 0 ⇒ λ(AB)_i = λ(A)_i λ(B)_i ≥ 0 ∀i ⇔ AB ≽ 0    (1452)

Positive semidefiniteness of A and B is sufficient but not a necessary condition for positive semidefiniteness of the product AB .

Proof. Because all symmetric matrices are diagonalizable, (§A.5.1) [325, §5.6] we have A = SΛS^T and B = T∆T^T , where Λ and ∆ are real diagonal matrices while S and T are orthogonal matrices. Because (AB)^T = AB , then T must equal S , [198, §1.3] and the eigenvalues of A are ordered in the same way as those of B ; id est, λ(A)_i = δ(Λ)_i and λ(B)_i = δ(∆)_i correspond to the same eigenvector.
(⇒) Assume λ(A)_i λ(B)_i ≥ 0 for i = 1 ... n . AB = SΛ∆S^T is symmetric and has nonnegative eigenvalues contained in diagonal matrix Λ∆ by assumption; hence positive semidefinite by (1420). Now assume A,B ≽ 0 . That, of course, implies λ(A)_i λ(B)_i ≥ 0 for all i because all the individual eigenvalues are nonnegative.
(⇐) Suppose AB = SΛ∆S^T ≽ 0 . Then Λ∆ ≽ 0 by (1420), and so all products λ(A)_i λ(B)_i must be nonnegative; meaning, sgn(λ(A)) = sgn(λ(B)) . We may, therefore, conclude nothing about the semidefiniteness of A and B .


For A,B ∈ S^n and A ≽ 0 , B ≽ 0  (Example A.2.1.0.1)

    AB = BA ⇒ λ(AB)_i = λ(A)_i λ(B)_i ≥ 0 ∀i ⇒ AB ≽ 0    (1453)

    AB = BA ⇒ λ(AB)_i ≥ 0 , λ(A)_i λ(B)_i ≥ 0 ∀i ⇔ AB ≽ 0    (1454)

For A,B ∈ S^n  [199, §4.2.13]

    A ≽ 0 , B ≽ 0 ⇒ A ⊗ B ≽ 0    (1455)

    A ≻ 0 , B ≻ 0 ⇒ A ⊗ B ≻ 0    (1456)

and the Kronecker product is symmetric.

For A,B ∈ S^n  [386, §6.2]

    A ≽ 0 ⇒ tr A ≥ 0    (1457)

    A ≽ 0 , B ≽ 0 ⇒ tr A tr B ≥ tr(AB) ≥ 0    (1458)

Because A ≽ 0 , B ≽ 0 ⇒ λ(AB) = λ(√A B √A) ≽ 0 by (1439) and Corollary A.3.1.0.5, then we have tr(AB) ≥ 0 .

    A ≽ 0 ⇔ tr(AB) ≥ 0  ∀B ≽ 0    (356)

For A,B,C ∈ S^n  (Löwner)

    A ≼ B , B ≼ C ⇒ A ≼ C        (transitivity)
    A ≼ B ⇔ A + C ≼ B + C        (additivity)
    A ≼ B , A ≽ B ⇒ A = B        (antisymmetry)
    A ≼ A                        (reflexivity)            (1459)

    A ≼ B , B ≺ C ⇒ A ≺ C        (strict transitivity)
    A ≺ B ⇔ A + C ≺ B + C        (strict additivity)      (1460)

For A,B ∈ R^{n×n}

    x^T Ax ≥ x^T Bx ∀x ⇒ tr A ≥ tr B    (1461)

Proof. x^T Ax ≥ x^T Bx ∀x ⇔ λ((A − B) + (A − B)^T)/2 ≽ 0 ⇒ tr(A + A^T − (B + B^T))/2 = tr(A − B) ≥ 0 . There is no converse.


For A,B ∈ S^n  [386, §6.2] (Theorem A.3.1.0.4)

    A ≽ B ⇒ tr A ≥ tr B    (1462)

    A ≽ B ⇒ δ(A) ≽ δ(B)    (1463)

There is no converse, and restriction to the positive semidefinite cone does not improve the situation. All-strict versions hold.

    A ≽ B ≽ 0 ⇒ rank A ≥ rank B    (1464)

    A ≽ B ≽ 0 ⇒ det A ≥ det B ≥ 0    (1465)

    A ≻ B ≽ 0 ⇒ det A > det B ≥ 0    (1466)

For A,B ∈ int S^n_+  [33, §4.2] [198, §7.7.4]

    A ≽ B ⇔ A^{−1} ≼ B^{−1} ,  A ≻ 0 ⇔ A^{−1} ≻ 0    (1467)

For A,B ∈ S^n  [386, §6.2]

    A ≽ B ≽ 0 ⇒ √A ≽ √B    (1468)

For A,B ∈ S^n and AB = BA  [386, §6.2, prob.3]

    A ≽ B ≽ 0 ⇒ A^k ≽ B^k ,  k = 1, 2, ...    (1469)

A.3.1.0.1 Theorem. Positive semidefinite ordering of eigenvalues.
For A,B ∈ R^{M×M} , place the eigenvalues of each symmetrized matrix into the respective vectors λ(½(A + A^T)) , λ(½(B + B^T)) ∈ R^M . Then, [325, §6]

    x^T Ax ≥ 0 ∀x ⇔ λ(A + A^T) ≽ 0    (1470)

    x^T Ax > 0 ∀x ≠ 0 ⇔ λ(A + A^T) ≻ 0    (1471)

because x^T (A − A^T)x = 0 . (1408) Now arrange the entries of λ(½(A + A^T)) and λ(½(B + B^T)) in nonincreasing order so λ(½(A + A^T))₁ holds the largest eigenvalue of symmetrized A while λ(½(B + B^T))₁ holds the largest eigenvalue of symmetrized B , and so on. Then [198, §7.7, prob.1, prob.9] for κ ∈ R

    x^T Ax ≥ x^T Bx ∀x ⇒ λ(A + A^T) ≽ λ(B + B^T)
    x^T Ax ≥ x^T Ix κ ∀x ⇔ λ(½(A + A^T)) ≽ κ1    (1472)


Now let A,B ∈ S^M have diagonalizations A = QΛQ^T and B = UΥU^T with λ(A) = δ(Λ) and λ(B) = δ(Υ) arranged in nonincreasing order. Then

    A ≽ B ⇔ λ(A − B) ≽ 0    (1473)

    A ≽ B ⇒ λ(A) ≽ λ(B)    (1474)

    A ≽ B ⇍ λ(A) ≽ λ(B)    (1475)

    S^T AS ≽ B ⇐ λ(A) ≽ λ(B)    (1476)

where S = QU^T . [386, §7.5]

A.3.1.0.2 Theorem. (Weyl) Eigenvalues of sum. [198, §4.3.1]
For A,B ∈ R^{M×M} , place the eigenvalues of each symmetrized matrix into the respective vectors λ(½(A + A^T)) , λ(½(B + B^T)) ∈ R^M in nonincreasing order so λ(½(A + A^T))₁ holds the largest eigenvalue of symmetrized A while λ(½(B + B^T))₁ holds the largest eigenvalue of symmetrized B , and so on. Then, for any k ∈ {1 ... M}

    λ(A + A^T)_k + λ(B + B^T)_M ≤ λ((A + A^T) + (B + B^T))_k ≤ λ(A + A^T)_k + λ(B + B^T)₁    (1477)
                                                                                                    ⋄

Weyl's theorem establishes: concavity of the smallest λ_M and convexity of the largest eigenvalue λ₁ of a symmetric matrix, via (475), and positive semidefiniteness of a sum of positive semidefinite matrices; for A,B ∈ S^M_+

    λ(A)_k + λ(B)_M ≤ λ(A + B)_k ≤ λ(A)_k + λ(B)₁    (1478)

Because S^M_+ is a convex cone (§2.9.0.0.1), then by (166)

    A,B ≽ 0 ⇒ ζA + ξB ≽ 0  for all ζ,ξ ≥ 0    (1479)

A.3.1.0.3 Corollary. Eigenvalues of sum and difference. [198, §4.3]
For A ∈ S^M and B ∈ S^M_+ , place the eigenvalues of each matrix into the respective vectors λ(A), λ(B) ∈ R^M in nonincreasing order so λ(A)₁ holds the largest eigenvalue of A while λ(B)₁ holds the largest eigenvalue of B , and so on. Then, for any k ∈ {1 ... M}

    λ(A − B)_k ≤ λ(A)_k ≤ λ(A + B)_k    (1480)
                                            ⋄


When B is rank-one positive semidefinite, the eigenvalues interlace; id est, for B = qq^T

    λ(A)_{k−1} ≤ λ(A − qq^T)_k ≤ λ(A)_k ≤ λ(A + qq^T)_k ≤ λ(A)_{k+1}    (1481)
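A numerical sanity check of Weyl's bound (1478) and of the inner interlacing inequalities in (1481), assuming random positive semidefinite test matrices and nonincreasing eigenvalue order as above:

```python
import numpy as np

rng = np.random.default_rng(5)
M = 6
W1 = rng.standard_normal((M, M)); A = W1 @ W1.T          # A ∈ S^M_+
W2 = rng.standard_normal((M, M)); B = W2 @ W2.T          # B ∈ S^M_+
lam = lambda X: np.sort(np.linalg.eigvalsh(X))[::-1]     # nonincreasing order

lA, lB, lAB = lam(A), lam(B), lam(A + B)
# Weyl (1478): λ(A)_k + λ(B)_M ≤ λ(A+B)_k ≤ λ(A)_k + λ(B)_1
assert np.all(lA + lB[-1] - 1e-9 <= lAB) and np.all(lAB <= lA + lB[0] + 1e-9)

# Interlacing (1481), inner inequalities, for a rank-one update qqᵀ
q = rng.standard_normal(M)
lp, lm = lam(A + np.outer(q, q)), lam(A - np.outer(q, q))
assert np.all(lm - 1e-9 <= lA) and np.all(lA <= lp + 1e-9)
```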

A.3.1.0.4 Theorem. Positive (semi)definite principal submatrices.^A.10
A ∈ S^M is positive semidefinite if and only if all M principal submatrices of dimension M−1 are positive semidefinite and det A is nonnegative.
A ∈ S^M is positive definite if and only if any one principal submatrix of dimension M−1 is positive definite and det A is positive.    ⋄

If any one principal submatrix of dimension M−1 is not positive definite, conversely, then A can neither be. Regardless of symmetry, if A ∈ R^{M×M} is positive (semi)definite, then the determinant of each and every principal submatrix is (nonnegative) positive. [266, §1.3.1]

A.3.1.0.5 Corollary. Positive (semi)definite symmetric products. [198, p.399]

If A ∈ S^M is positive definite and any particular dimensionally compatible matrix Z has no nullspace, then Z^T AZ is positive definite.

If matrix A ∈ S^M is positive (semi)definite then, for any matrix Z of compatible dimension, Z^T AZ is positive semidefinite.

A ∈ S^M is positive (semi)definite if and only if there exists a nonsingular Z such that Z^T AZ is positive (semi)definite.

If A ∈ S^M is positive semidefinite and singular it remains possible, for some skinny Z ∈ R^{M×N} with N < M , that Z^T AZ becomes positive definite.


We can deduce from these, given nonsingular matrix Z and any particular dimensionally compatible Y : matrix A ∈ S^M is positive semidefinite if and only if [Z Y]^T A [Z Y] is positive semidefinite. In other words, from the Corollary it follows: for dimensionally compatible Z

    A ≽ 0 ⇔ Z^T AZ ≽ 0 and Z^T has a left inverse

Products such as Z†Z and ZZ† are symmetric and positive semidefinite although, given A ≽ 0 , Z†AZ and ZAZ† are neither necessarily symmetric nor positive semidefinite.

A.3.1.0.6 Theorem. Symmetric projector semidefinite. [20, §III] [21, §6] [217, p.55]
For symmetric idempotent matrices P and R

    P,R ≽ 0
    P ≽ R ⇔ R(P) ⊇ R(R) ⇔ N(P) ⊆ N(R)    (1482)

Projector P is never positive definite [327, §6.5, prob.20] unless it is the identity matrix.    ⋄

A.3.1.0.7 Theorem. Symmetric positive semidefinite.
Given real matrix Ψ with rank Ψ = 1

    Ψ ≽ 0 ⇔ Ψ = uu^T    (1483)

where u is some real vector; id est, symmetry is necessary and sufficient for positive semidefiniteness of a rank-1 matrix.    ⋄

Proof. Any rank-one matrix must have the form Ψ = uv^T . (§B.1) Suppose Ψ is symmetric; id est, v = u . For all y ∈ R^M , y^T uu^T y ≥ 0 . Conversely, suppose uv^T is positive semidefinite. We know that can hold if and only if uv^T + vu^T ≽ 0 ⇔ for all normalized y ∈ R^M , 2y^T uv^T y ≥ 0 ; but that is possible only if v = u .    

The same does not hold true for matrices of higher rank, as Example A.2.1.0.1 shows.


A.4 Schur complement

Consider Schur-form partitioned matrix G : Given A^T = A and C^T = C , then [57]

    G = [ A    B ]  ≽ 0
        [ B^T  C ]
      ⇔ A ≽ 0 , B^T(I − AA†) = 0 , C − B^T A† B ≽ 0
      ⇔ C ≽ 0 , B(I − CC†) = 0 , A − BC† B^T ≽ 0    (1484)

where A† denotes the Moore-Penrose (pseudo)inverse (§E). In the first instance, I − AA† is a symmetric projection matrix orthogonally projecting on N(A^T). (1870) It is apparently required

    R(B) ⊥ N(A^T)    (1485)

which precludes A = 0 when B is any nonzero matrix. Note that A ≻ 0 ⇒ A† = A^{−1} ; thereby, the projection matrix vanishes. Likewise, in the second instance, I − CC† projects orthogonally on N(C^T). It is required

    R(B^T) ⊥ N(C^T)    (1486)

which precludes C = 0 for B nonzero. Again, C ≻ 0 ⇒ C† = C^{−1} . So we get, for A or C nonsingular,

    G = [ A    B ]  ≽ 0
        [ B^T  C ]
      ⇔ A ≻ 0 , C − B^T A^{−1} B ≽ 0
        or
        C ≻ 0 , A − BC^{−1} B^T ≽ 0    (1487)

When A is full-rank then, for all B of compatible dimension, R(B) is in R(A). Likewise, when C is full-rank, R(B^T) is in R(C). Thus the flavor, for A and C nonsingular,

    G = [ A    B ]  ≻ 0
        [ B^T  C ]
      ⇔ A ≻ 0 , C − B^T A^{−1} B ≻ 0
      ⇔ C ≻ 0 , A − BC^{−1} B^T ≻ 0    (1488)

where C − B^T A^{−1} B is called the Schur complement of A in G , while the Schur complement of C in G is A − BC^{−1} B^T . [148, §4.8]
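A small NumPy sketch of the nonsingular case (1487): build a random G with A ≻ 0 and verify that G is positive semidefinite exactly when the Schur complement C − BᵀA⁻¹B is. The construction is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 3, 2
A = rng.standard_normal((m, m)); A = A @ A.T + np.eye(m)     # A ≻ 0
B = rng.standard_normal((m, n))

def schur_test(C):
    G = np.block([[A, B], [B.T, C]])
    psd_G = np.all(np.linalg.eigvalsh(G) >= -1e-9)
    psd_schur = np.all(np.linalg.eigvalsh(C - B.T @ np.linalg.inv(A) @ B) >= -1e-9)
    return psd_G, psd_schur

# C chosen large enough: both tests agree (True, True)
print(schur_test(B.T @ np.linalg.inv(A) @ B + np.eye(n)))
# C chosen too small: both tests agree (False, False)
print(schur_test(B.T @ np.linalg.inv(A) @ B - np.eye(n)))
```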


Origin of the term Schur complement is from complementary inertia: [110, §2.4.4] Define

    inertia(G ∈ S^M) ≜ {p, z, n}    (1489)

where p, z, n respectively represent number of positive, zero, and negative eigenvalues of G ; id est,

    M = p + z + n    (1490)

Then, when A is invertible,

    inertia(G) = inertia(A) + inertia(C − B^T A^{−1} B)    (1491)

and when C is invertible,

    inertia(G) = inertia(C) + inertia(A − BC^{−1} B^T)    (1492)

When A = C = 0 , denoting by σ(B) ∈ R^m_+ the nonincreasingly ordered singular values of matrix B ∈ R^{m×m} , then we have the eigenvalues [53, §1.2, prob.17]

    λ(G) = λ([ 0    B ])  =  [  σ(B)   ]    (1493)
             ([ B^T  0 ])     [ −Ξσ(B) ]

and

    inertia(G) = inertia(B^T B) + inertia(−B^T B)    (1494)

where Ξ is the order-reversing permutation matrix defined in (1690).

A.4.0.0.1 Example. Nonnegative polynomial. [33, p.163]
Schur-form positive semidefiniteness is sufficient for quadratic polynomial convexity in x , but it is necessary and sufficient for nonnegativity; videlicet, for all compatible x

    [x^T 1] [ A    b ] [ x ]  ≥ 0  ⇔  x^T Ax + 2b^T x + c ≥ 0    (1495)
            [ b^T  c ] [ 1 ]

if and only if it can be expressed as a sum of squares of polynomials. [33, p.157] Sublevel set {x | x^T Ax + 2b^T x + c ≤ 0} is convex if A ≽ 0 , but the quadratic polynomial is a convex function if and only if A ≽ 0 .


A.4.0.0.2 Example. Schur conditions.
From (1457),

    [ A    B ]  ≽ 0   ⇒  tr(A + C) ≥ 0
    [ B^T  C ]
        ⇕
    [ A    0                ]  ≽ 0   ⇒  tr(C − B^T A^{−1} B) ≥ 0    (1496)
    [ 0^T  C − B^T A^{−1} B ]
        ⇕
    [ A − BC^{−1} B^T  0 ]  ≽ 0   ⇒  tr(A − BC^{−1} B^T) ≥ 0
    [ 0^T              C ]

Since tr(C − B^T A^{−1} B) ≥ 0 ⇔ tr C ≥ tr(B^T A^{−1} B) ≥ 0 for example, then minimization of tr C is necessary and sufficient for minimization of tr(C − B^T A^{−1} B) when both are under constraint [ A B ; B^T C ] ≽ 0 .    

A.4.0.0.3 Example. Sparse Schur conditions.
Setting matrix A to the identity simplifies the Schur conditions. One consequence relates the definiteness of three quantities:

    [ I    B ]  ≽ 0  ⇔  C − B^T B ≽ 0  ⇔  [ I    0           ]  ≽ 0    (1497)
    [ B^T  C ]                            [ 0^T  C − B^T B   ]
                                                                        

A.4.0.0.4 Exercise. Eigenvalues λ of sparse Schur-form.
Prove: given C − B^T B = 0 , for B ∈ R^{m×n} and C ∈ S^n

    λ([ I    B ])   =  { 1 + λ(C)_i ,  1 ≤ i ≤ n
      ([ B^T  C ])_i   { 1 ,           n < i ≤ m
                       { 0 ,           otherwise    (1498)


A.4.0.0.5 Theorem. Rank of partitioned matrices.
When symmetric matrix A is invertible and C is symmetric,

    rank [ A    B ]  =  rank [ A    0                ]
         [ B^T  C ]          [ 0^T  C − B^T A^{−1} B ]
                     =  rank A + rank(C − B^T A^{−1} B)    (1499)

equals rank of a block on the main diagonal plus rank of its Schur complement [386, §2.2, prob.7]. Similarly, when symmetric matrix C is invertible and A is symmetric,

    rank [ A    B ]  =  rank [ A − BC^{−1} B^T  0 ]
         [ B^T  C ]          [ 0^T              C ]
                     =  rank(A − BC^{−1} B^T) + rank C    (1500)
                                                            ⋄

Proof. The first assertion (1499) holds if and only if [198, §0.4.6(c)]

    ∃ nonsingular X,Y   X [ A    B ] Y  =  [ A    0                ]    (1501)
                          [ B^T  C ]       [ 0^T  C − B^T A^{−1} B ]

Let [198, §7.7.6]

    Y = X^T = [ I    −A^{−1} B ]    (1502)
              [ 0^T   I        ]
                                                            

A.4.0.0.6 Lemma. Rank of Schur-form block. [133] [131]
Matrix B ∈ R^{m×n} has rank B ≤ ρ if and only if there exist matrices A ∈ S^m and C ∈ S^n such that

    rank [ A    0 ]  ≤ 2ρ   and   G = [ A    B ]  ≽ 0    (1503)
         [ 0^T  C ]                   [ B^T  C ]

Schur-form positive semidefiniteness alone implies rank A ≥ rank B and rank C ≥ rank B . But, even in absence of semidefiniteness, we must always have rank G ≥ rank A, rank B, rank C by fundamental linear algebra.


A.4.1 Determinant

    G = [ A    B ]    (1504)
        [ B^T  C ]

We consider again a matrix G partitioned like (1484), but not necessarily positive (semi)definite, where A and C are symmetric.

When A is invertible,

    det G = det A det(C − B^T A^{−1} B)    (1505)

When C is invertible,

    det G = det C det(A − BC^{−1} B^T)    (1506)

When B is full-rank and skinny, C = 0 , and A ≽ 0 , then [59, §10.1.1]

    det G ≠ 0 ⇔ A + BB^T ≻ 0    (1507)

When B is a (column) vector, then for all C ∈ R and all A of dimension compatible with G

    det G = det(A)C − B^T A_cof^T B    (1508)

while for C ≠ 0

    det G = C det(A − (1/C)BB^T)    (1509)

where A_cof is the matrix of cofactors [325, §4] corresponding to A .

When B is full-rank and fat, A = 0 , and C ≽ 0 , then

    det G ≠ 0 ⇔ C + B^T B ≻ 0    (1510)

When B is a row-vector, then for A ≠ 0 and all C of dimension compatible with G

    det G = A det(C − (1/A)B^T B)    (1511)

while for all A ∈ R

    det G = det(C)A − BC_cof^T B^T    (1512)

where C_cof is the matrix of cofactors corresponding to C .
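A quick NumPy verification of the determinant factorizations (1505) and (1506) on random symmetric invertible blocks (an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 3, 2
A = rng.standard_normal((m, m)); A = (A + A.T) / 2 + m * np.eye(m)   # symmetric, invertible
C = rng.standard_normal((n, n)); C = (C + C.T) / 2 + n * np.eye(n)   # symmetric, invertible
B = rng.standard_normal((m, n))
G = np.block([[A, B], [B.T, C]])

det = np.linalg.det
assert np.isclose(det(G), det(A) * det(C - B.T @ np.linalg.inv(A) @ B))   # (1505)
assert np.isclose(det(G), det(C) * det(A - B @ np.linalg.inv(C) @ B.T))   # (1506)
```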


A.5 Eigenvalue decomposition

When a square matrix X ∈ R^{m×m} is diagonalizable, [325, §5.6] then

    X = SΛS^{−1} = [s₁ ⋯ s_m] Λ [w₁^T ; ⋮ ; w_m^T] = Σ_{i=1}^m λ_i s_i w_i^T    (1513)

where {s_i ∈ N(X − λ_i I) ⊆ C^m} are l.i. (right-)eigenvectors^A.12 constituting the columns of S ∈ C^{m×m} defined by

    XS = SΛ    (1514)

{w_i ∈ N(X^T − λ_i I) ⊆ C^m} are linearly independent left-eigenvectors of X (eigenvectors of X^T) constituting the rows of S^{−1} defined by [198]

    S^{−1} X = ΛS^{−1}    (1515)

and where {λ_i ∈ C} are eigenvalues (1437)

    δ(λ(X)) = Λ ∈ C^{m×m}    (1516)

corresponding to both left and right eigenvectors; id est, λ(X) = λ(X^T).

There is no connection between diagonalizability and invertibility of X . [325, §5.2] Diagonalizability is guaranteed by a full set of linearly independent eigenvectors, whereas invertibility is guaranteed by all nonzero eigenvalues.

    distinct eigenvalues ⇒ l.i. eigenvectors ⇔ diagonalizable
    not diagonalizable ⇒ repeated eigenvalue    (1517)

A.5.0.0.1 Theorem. Real eigenvector.
Eigenvectors of a real matrix corresponding to real eigenvalues must be real.    ⋄

Proof. Ax = λx . Given λ = λ* , x^H Ax = λx^H x = λ‖x‖² = x^T Ax* ⇒ x = x* , where x^H = x*^T . The converse is equally simple.    

A.12 Eigenvectors must, of course, be nonzero. The prefix eigen is from the German; in this context meaning, something akin to "characteristic". [322, p.14]


A.5.0.1 Uniqueness

From the fundamental theorem of algebra, [214] which guarantees existence of zeros for a given polynomial, it follows: eigenvalues, including their multiplicity, for a given square matrix are unique; meaning, there is no other set of eigenvalues for that matrix. (Conversely, many different matrices may share the same unique set of eigenvalues; e.g., for any X , λ(X) = λ(X^T).)

Uniqueness of eigenvectors, in contrast, disallows multiplicity of the same direction:

A.5.0.1.1 Definition. Unique eigenvectors.
When eigenvectors are unique, we mean: unique to within a real nonzero scaling, and their directions are distinct.    △

If S is a matrix of eigenvectors of X as in (1513), for example, then −S is certainly another matrix of eigenvectors decomposing X with the same eigenvalues.

For any square matrix, the eigenvector corresponding to a distinct eigenvalue is unique; [322, p.220]

    distinct eigenvalues ⇒ eigenvectors unique    (1518)

Eigenvectors corresponding to a repeated eigenvalue are not unique for a diagonalizable matrix;

    repeated eigenvalue ⇒ eigenvectors not unique    (1519)

Proof follows from the observation: any linear combination of distinct eigenvectors of diagonalizable X , corresponding to a particular eigenvalue, produces another eigenvector. For eigenvalue λ whose multiplicity^A.13 dim N(X − λI) exceeds 1, in other words, any choice of independent vectors from N(X − λI) (of the same multiplicity) constitutes eigenvectors corresponding to λ .    

A.13 A matrix is diagonalizable iff algebraic multiplicity (number of occurrences of same eigenvalue) equals geometric multiplicity dim N(X − λI) = m − rank(X − λI) [322, p.15] (number of Jordan blocks w.r.t λ or number of corresponding l.i. eigenvectors).


Caveat diagonalizability insures linear independence which implies existence of distinct eigenvectors. We may conclude, for diagonalizable matrices,

    distinct eigenvalues ⇔ eigenvectors unique    (1520)

A.5.0.2 Invertible matrix

When diagonalizable matrix X ∈ R^{m×m} is nonsingular (no zero eigenvalues), then it has an inverse obtained simply by inverting eigenvalues in (1513):

    X^{−1} = SΛ^{−1}S^{−1}    (1521)

A.5.0.3 eigenmatrix

The (right-)eigenvectors {s_i} (1513) are naturally orthogonal w_i^T s_j = 0 to left-eigenvectors {w_i} except, for i = 1 ... m , w_i^T s_i = 1 ; called a biorthogonality condition [358, §2.2.4] [198] because neither set of left or right eigenvectors is necessarily an orthogonal set. Consequently, each dyad from a diagonalization is an independent (§B.1.1) nonorthogonal projector because

    s_i w_i^T s_i w_i^T = s_i w_i^T    (1522)

(whereas the dyads of singular value decomposition are not inherently projectors (confer (1529))).

Dyads of eigenvalue decomposition can be termed eigenmatrices because

    X s_i w_i^T = λ_i s_i w_i^T    (1523)

Sum of the eigenmatrices is the identity;

    Σ_{i=1}^m s_i w_i^T = I    (1524)
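A short NumPy illustration of the biorthogonality condition and of (1522)-(1524) for a random diagonalizable matrix; the left-eigenvector rows are taken from S⁻¹ as in (1515), and the random test matrix is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(8)
m = 4
X = rng.standard_normal((m, m))              # generic matrix: diagonalizable with probability 1
lam, S = np.linalg.eig(X)                    # columns of S are right-eigenvectors
W = np.linalg.inv(S)                         # rows of S⁻¹ are left-eigenvectors

assert np.allclose(W @ S, np.eye(m))         # biorthogonality: wᵢᵀ sⱼ = δᵢⱼ

dyads = [np.outer(S[:, i], W[i, :]) for i in range(m)]
P = dyads[0]
assert np.allclose(P @ P, P)                 # (1522): each dyad is a (nonorthogonal) projector
assert np.allclose(X @ P, lam[0] * P)        # (1523): eigenmatrix property
assert np.allclose(sum(dyads), np.eye(m))    # (1524): the eigenmatrices sum to the identity
```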

A.5.1 Symmetric matrix diagonalization

The set of normal matrices is, precisely, that set of all real matrices having a complete orthonormal set of eigenvectors; [386, §8.1] [327, prob.10.2.31] id est, any matrix X for which XX^T = X^T X ; [155, §7.1.3] [322, p.3] e.g., orthogonal and circulant matrices [164]. All normal matrices are diagonalizable.


A symmetric matrix is a special normal matrix whose eigenvalues Λ must be real^A.14 and whose eigenvectors S can be chosen to make a real orthonormal set; [327, §6.4] [325, p.315] id est, for X ∈ S^m

    X = SΛS^T = [s₁ ⋯ s_m] Λ [s₁^T ; ⋮ ; s_m^T] = Σ_{i=1}^m λ_i s_i s_i^T    (1525)

where δ²(Λ) = Λ ∈ S^m (§A.1) and S^{−1} = S^T ∈ R^{m×m} (orthogonal matrix, §B.5) because of symmetry: SΛS^{−1} = S^{−T}ΛS^T . By the 0 eigenvalues theorem (§A.7.3.0.1),

    R{s_i | λ_i ≠ 0} = R(A) = R(A^T)
    R{s_i | λ_i = 0} = N(A^T) = N(A)    (1526)

A.5.1.1 Diagonal matrix diagonalization

Because arrangement of eigenvectors and their corresponding eigenvalues is arbitrary, we almost always arrange eigenvalues in nonincreasing order as is the convention for singular value decomposition. Then to diagonalize a symmetric matrix that is already a diagonal matrix, orthogonal matrix S becomes a permutation matrix.

A.5.1.2 Invertible symmetric matrix

When symmetric matrix X ∈ S^m is nonsingular (invertible), then its inverse (obtained by inverting eigenvalues in (1525)) is also symmetric:

    X^{−1} = SΛ^{−1}S^T ∈ S^m    (1527)

A.5.1.3 Positive semidefinite matrix square root

When X ∈ S^m_+ , its unique positive semidefinite matrix square root is defined

    √X ≜ S√Λ S^T ∈ S^m_+    (1528)

where the square root of nonnegative diagonal matrix √Λ is taken entrywise and positive. Then X = √X √X .

A.14 Proof. Suppose λ_i is an eigenvalue corresponding to eigenvector s_i of real A = A^T . Then s_i^H As_i = s_i^T As_i* (by transposition) ⇒ s_i*^T λ_i s_i = s_i^T λ_i* s_i* because (As_i)* = (λ_i s_i)* by assumption. So we have λ_i‖s_i‖² = λ_i*‖s_i‖² . There is no converse.
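A minimal NumPy sketch of the square root (1528) via eigendecomposition; the clipping of tiny negative eigenvalues to zero is a numerical safeguard assumed here, not something the text prescribes:

```python
import numpy as np

def psd_sqrt(X):
    """Unique positive semidefinite square root of X ∈ S^m_+, per (1528)."""
    lam, S = np.linalg.eigh(X)                 # X = S Λ Sᵀ with real Λ, orthogonal S
    lam = np.clip(lam, 0, None)                # guard against small negative round-off
    return S @ np.diag(np.sqrt(lam)) @ S.T

rng = np.random.default_rng(9)
M = rng.standard_normal((5, 5))
X = M @ M.T                                    # X ∈ S^5_+
R = psd_sqrt(X)
assert np.allclose(R, R.T) and np.all(np.linalg.eigvalsh(R) >= -1e-9)
assert np.allclose(R @ R, X)                   # X = √X √X
```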


A.6 Singular value decomposition, SVD

A.6.1 Compact SVD

[155, §2.5.4] For any A ∈ R^{m×n}

    A = UΣQ^T = [u₁ ⋯ u_η] Σ [q₁^T ; ⋮ ; q_η^T] = Σ_{i=1}^η σ_i u_i q_i^T    (1529)

    U ∈ R^{m×η} ,  Σ ∈ R^{η×η} ,  Q ∈ R^{n×η}

where U and Q are always skinny-or-square each having orthonormal columns, and where

    η ≜ min{m , n}    (1530)

Square matrix Σ is diagonal (§A.1.1)

    δ²(Σ) = Σ ∈ R^{η×η}    (1531)

holding the singular values {σ_i ∈ R} of A which are always arranged in nonincreasing order by convention and are related to eigenvalues λ by^A.15

    σ(A)_i = σ(A^T)_i = √λ(A^T A)_i = √λ(AA^T)_i = λ(√(A^T A))_i = λ(√(AA^T))_i > 0 ,  1 ≤ i ≤ ρ
    σ(A)_i = 0 ,  ρ < i ≤ η    (1532)

of which the last η − ρ are 0 ,^A.16 where

    ρ ≜ rank A = rank Σ    (1533)

A point sometimes lost: Any real matrix may be decomposed in terms of its real singular values σ(A) ∈ R^η and real matrices U and Q as in (1529), where [155, §2.5.3]

    R{u_i | σ_i ≠ 0} = R(A)
    R{u_i | σ_i = 0} ⊆ N(A^T)
    R{q_i | σ_i ≠ 0} = R(A^T)
    R{q_i | σ_i = 0} ⊆ N(A)    (1534)

A.15 When matrix A is normal, σ(A) = |λ(A)| . [386, §8.1]
A.16 For η = n , σ(A) = √λ(A^T A) = λ(√(A^T A)) . For η = m , σ(A) = √λ(AA^T) = λ(√(AA^T)) .
.


A.6.2 Subcompact SVD

Some authors allow only nonzero singular values. In that case the compact decomposition can be made smaller; it can be redimensioned in terms of rank ρ because, for any A ∈ R^{m×n}

    ρ = rank A = rank Σ = max{i ∈ {1 ... η} | σ_i ≠ 0} ≤ η    (1535)

There are η singular values. For any flavor SVD, rank is equivalent to the number of nonzero singular values on the main diagonal of Σ .

Now

    A = UΣQ^T = [u₁ ⋯ u_ρ] Σ [q₁^T ; ⋮ ; q_ρ^T] = Σ_{i=1}^ρ σ_i u_i q_i^T    (1536)

    U ∈ R^{m×ρ} ,  Σ ∈ R^{ρ×ρ} ,  Q ∈ R^{n×ρ}

where the main diagonal of diagonal matrix Σ has no 0 entries, and

    R{u_i} = R(A)
    R{q_i} = R(A^T)    (1537)

A.6.3 Full SVD

Another common and useful expression of the SVD makes U and Q square; making the decomposition larger than compact SVD. Completing the nullspace bases in U and Q from (1534) provides what is called the full singular value decomposition of A ∈ R^{m×n} [325, App.A]. Orthonormal matrices U and Q become orthogonal matrices (§B.5):

    R{u_i | σ_i ≠ 0} = R(A)
    R{u_i | σ_i = 0} = N(A^T)
    R{q_i | σ_i ≠ 0} = R(A^T)
    R{q_i | σ_i = 0} = N(A)    (1538)


For any matrix A having rank ρ (= rank Σ)

    A = UΣQ^T = [u₁ ⋯ u_m] Σ [q₁^T ; ⋮ ; q_n^T] = Σ_{i=1}^η σ_i u_i q_i^T

      = [ m×ρ basis R(A)   m×(m−ρ) basis N(A^T) ] ⎡σ₁        ⎤ ⎡ (n×ρ basis R(A^T))^T   ⎤
                                                   ⎢   σ₂     ⎥ ⎣ (n×(n−ρ) basis N(A))^T ⎦
                                                   ⎣      ⋱   ⎦

    U ∈ R^{m×m} ,  Σ ∈ R^{m×n} ,  Q ∈ R^{n×n}    (1539)

where upper limit of summation η is defined in (1530). Matrix Σ is no longer necessarily square, now padded with respect to (1531) by m−η zero rows or n−η zero columns; the nonincreasingly ordered (possibly 0) singular values appear along its main diagonal as for compact SVD (1532).

An important geometrical interpretation of SVD is given in Figure 152 for m = n = 2 : The image of the unit sphere under any m × n matrix multiplication is an ellipse. Considering the three factors of the SVD separately, note that Q^T is a pure rotation of the circle. Figure 152 shows how the axes q₁ and q₂ are first rotated by Q^T to coincide with the coordinate axes. Second, the circle is stretched by Σ in the directions of the coordinate axes to form an ellipse. The third step rotates the ellipse by U into its final position. Note how q₁ and q₂ are rotated to end up as u₁ and u₂ , the principal axes of the final ellipse. A direct calculation shows that Aq_j = σ_j u_j . Thus q_j is first rotated to coincide with the j-th coordinate axis, stretched by a factor σ_j , and then rotated to point in the direction of u_j . All of this is beautifully illustrated for 2×2 matrices by the Matlab code eigshow.m (see [328]).

A direct consequence of the geometric interpretation is that the largest singular value σ₁ measures the "magnitude" of A (its 2-norm):

    ‖A‖₂ = sup_{‖x‖₂=1} ‖Ax‖₂ = σ₁    (1540)

This means that ‖A‖₂ is the length of the longest principal semiaxis of the ellipse.


[Figure 152: Geometrical interpretation of full SVD [265]: Image of circle {x ∈ R² | ‖x‖₂ = 1} under matrix multiplication Ax is, in general, an ellipse. For the example illustrated, U ≜ [u₁ u₂] ∈ R^{2×2} , Q ≜ [q₁ q₂] ∈ R^{2×2} .]


Expressions for U , Q , and Σ follow readily from (1539),

    AA^T U = UΣΣ^T   and   A^T AQ = QΣ^T Σ    (1541)

demonstrating that the columns of U are the eigenvectors of AA^T and the columns of Q are the eigenvectors of A^T A .  −Muller, Magaia, & Herbst [265]

A.6.4 Pseudoinverse by SVD

Matrix pseudoinverse (§E) is nearly synonymous with singular value decomposition because of the elegant expression, given A = UΣQ^T ∈ R^{m×n}

    A† = QΣ†^T U^T ∈ R^{n×m}    (1542)

that applies to all three flavors of SVD, where Σ† simply inverts nonzero entries of matrix Σ .

Given symmetric matrix A ∈ S^n and its diagonalization A = SΛS^T (§A.5.1), its pseudoinverse simply inverts all nonzero eigenvalues:

    A† = SΛ†S^T ∈ S^n    (1543)
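A compact NumPy sketch of the pseudoinverse built from the full SVD per (1542), compared against numpy's own pinv; the tolerance choice is an assumption of this sketch:

```python
import numpy as np

def pinv_by_svd(A, tol=1e-12):
    """Moore-Penrose pseudoinverse via full SVD, per (1542)."""
    U, s, QT = np.linalg.svd(A, full_matrices=True)      # A = U Σ Qᵀ with Q = QT.T
    s_inv = np.where(s > tol * s.max(), 1.0 / s, 0.0)    # invert only nonzero singular values
    Sigma_pinv = np.zeros((A.shape[1], A.shape[0]))      # Σ† is n×m
    Sigma_pinv[:len(s), :len(s)] = np.diag(s_inv)
    return QT.T @ Sigma_pinv @ U.T

rng = np.random.default_rng(10)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))   # rank-deficient 5×4 matrix
Apinv = pinv_by_svd(A)
assert np.allclose(Apinv, np.linalg.pinv(A))
assert np.allclose(A @ Apinv @ A, A)                             # a Moore-Penrose condition
```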

A.6.5 SVD of symmetric matrices

From (1532) and (1528) for A = A^T

    σ(A)_i = √λ(A²)_i = λ(√(A²))_i = |λ(A)_i| > 0 ,  1 ≤ i ≤ ρ
    σ(A)_i = 0 ,  ρ < i ≤ η    (1544)

A.6.5.0.1 Definition. Step function. (confer §4.3.2.0.1)
Define the signum-like quasilinear function ψ : R^n → R^n that takes value 1 corresponding to a 0-valued entry in its argument:

    ψ(a) ≜ [ lim_{x_i→a_i} x_i/|x_i| = { 1 ,  a_i ≥ 0
                                        { −1 , a_i < 0  ,  i = 1 ... n ] ∈ R^n    (1545)

Eigenvalue signs of a symmetric matrix having diagonalization A = SΛS^T (1525) can be absorbed either into real U or real Q from the full SVD; [343, p.34] (confer §C.4.2.1)

    A = SΛS^T = Sδ(ψ(δ(Λ))) |Λ|S^T ≜ U ΣQ^T ∈ S^n    (1546)

or

    A = SΛS^T = S|Λ| δ(ψ(δ(Λ)))S^T ≜ UΣ Q^T ∈ S^n    (1547)

where matrix of singular values Σ = |Λ| denotes entrywise absolute value of diagonal eigenvalue matrix Λ .

A.7 Zeros

A.7.1 norm zero

For any given norm, by definition,

    ‖x‖_ℓ = 0 ⇔ x = 0    (1548)

Consequently, a generally nonconvex constraint in x like ‖Ax − b‖ = κ becomes convex when κ = 0 .

A.7.2 0 entry

If a positive semidefinite matrix A = [A_ij] ∈ R^{n×n} has a 0 entry A_ii on its main diagonal, then A_ij + A_ji = 0 ∀j . [266, §1.3.1]

Any symmetric positive semidefinite matrix having a 0 entry on its main diagonal must be 0 along the entire row and column to which that 0 entry belongs. [155, §4.2.8] [198, §7.1, prob.2]

A.7.3 0 eigenvalues theorem

This theorem is simple, powerful, and widely applicable:

A.7.3.0.1 Theorem. Number of 0 eigenvalues.
For any matrix A ∈ R^{m×n}

    rank(A) + dim N(A) = n    (1549)


616 APPENDIX A. LINEAR ALGEBRA<br />

by conservation of dimension. [198,0.4.4]

For any square matrix A ∈ R^{m×m}, the number of 0 eigenvalues is at least equal to dim N(A)

dim N(A) ≤ number of 0 eigenvalues ≤ m      (1550)

while the eigenvectors corresponding to those 0 eigenvalues belong to N(A). [325,5.1]^{A.17}

For diagonalizable matrix A (A.5), the number of 0 eigenvalues is precisely dim N(A) while the corresponding eigenvectors span N(A). The real and imaginary parts of the eigenvectors remaining span R(A).

(TRANSPOSE.)
Likewise, for any matrix A ∈ R^{m×n}

rank(A^T) + dim N(A^T) = m      (1551)

For any square A ∈ R^{m×m}, the number of 0 eigenvalues is at least equal to dim N(A^T) = dim N(A) while the left-eigenvectors (eigenvectors of A^T) corresponding to those 0 eigenvalues belong to N(A^T).

For diagonalizable A , the number of 0 eigenvalues is precisely dim N(A^T) while the corresponding left-eigenvectors span N(A^T). The real and imaginary parts of the left-eigenvectors remaining span R(A^T).      ⋄
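A tiny numerical sketch of bound (1550) (numpy assumed): for a non-diagonalizable matrix, the number of 0 eigenvalues can strictly exceed dim N(A).

    import numpy as np

    A = np.array([[0.0, 1.0],
                  [0.0, 0.0]])     # nilpotent: both eigenvalues are 0

    dim_null = 2 - np.linalg.matrix_rank(A)
    n_zero_eigs = int(np.sum(np.isclose(np.linalg.eigvals(A), 0)))
    print(dim_null, n_zero_eigs)   # 1 2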

Proof. First we show, for a diagonalizable matrix, the number of 0 eigenvalues is precisely the dimension of its nullspace while the eigenvectors corresponding to those 0 eigenvalues span the nullspace:

Any diagonalizable matrix A ∈ R^{m×m} must possess a complete set of linearly independent eigenvectors. If A is full-rank (invertible), then all m = rank(A) eigenvalues are nonzero. [325,5.1]

A.17 We take as given the well-known fact that the number of 0 eigenvalues cannot be less than dimension of the nullspace. We offer an example of the converse:

A = [ 1 0 1 0
      0 0 1 0
      0 0 0 0
      1 0 0 0 ]

dim N(A) = 2 , λ(A) = [0 0 0 1]^T ; three eigenvectors in the nullspace but only two are independent. The right-hand side of (1550) is tight for nonzero matrices; e.g., (B.1) dyad uv^T ∈ R^{m×m} has m 0-eigenvalues when u ∈ v^⊥.



Suppose rank(A) < m . Then dim N(A) = m − rank(A). Thus there is a set of m − rank(A) linearly independent vectors spanning N(A). Each of those can be an eigenvector associated with a 0 eigenvalue because A is diagonalizable ⇔ ∃ m linearly independent eigenvectors. [325,5.2] Eigenvectors of a real matrix corresponding to 0 eigenvalues must be real.^{A.18} Thus A has at least m − rank(A) eigenvalues equal to 0.

Now suppose A has more than m − rank(A) eigenvalues equal to 0. Then there are more than m − rank(A) linearly independent eigenvectors associated with 0 eigenvalues, and each of those eigenvectors must be in N(A). Thus there are more than m − rank(A) linearly independent vectors in N(A) ; a contradiction.

Diagonalizable A therefore has rank(A) nonzero eigenvalues and exactly m − rank(A) eigenvalues equal to 0 whose corresponding eigenvectors span N(A).

By similar argument, the left-eigenvectors corresponding to 0 eigenvalues span N(A^T).

Next we show, when A is diagonalizable, the real and imaginary parts of its eigenvectors (corresponding to nonzero eigenvalues) span R(A) :

The (right-)eigenvectors of a diagonalizable matrix A ∈ R^{m×m} are linearly independent if and only if the left-eigenvectors are. So, matrix A has a representation in terms of its right- and left-eigenvectors; from the diagonalization (1513), assuming 0 eigenvalues are ordered last,

A = Σ_{i=1}^{m} λ_i s_i w_i^T = Σ_{i=1, λ_i≠0}^{k ≤ m} λ_i s_i w_i^T      (1552)

From the linearly independent dyads theorem (B.1.1.0.2), the dyads {s_i w_i^T} must be independent because each set of eigenvectors are; hence rank A = k , the number of nonzero eigenvalues. Complex eigenvectors and eigenvalues are common for real matrices, and must come in complex conjugate pairs for the summation to remain real. Assume that conjugate pairs of eigenvalues appear in sequence. Given any particular conjugate pair from (1552), we get the partial summation

λ_i s_i w_i^T + λ_i^* s_i^* w_i^{*T} = 2 re(λ_i s_i w_i^T)
                                     = 2 ( re(s_i) re(λ_i w_i^T) − im(s_i) im(λ_i w_i^T) )      (1553)

A.18 Proof. Let ∗ denote complex conjugation. Suppose A = A^∗ and A s_i = 0 . Then s_i = s_i^∗ ⇒ A s_i = A s_i^∗ ⇒ A s_i^∗ = 0 . Conversely, A s_i^∗ = 0 ⇒ A s_i = A s_i^∗ ⇒ s_i = s_i^∗ .



where^{A.19} λ_i^* ≜ λ_{i+1} , s_i^* ≜ s_{i+1} , and w_i^* ≜ w_{i+1} . Then (1552) is equivalently written

A = 2 Σ_{i : λ_{2i}∈C, λ_{2i}≠0} ( re(s_{2i}) re(λ_{2i} w_{2i}^T) − im(s_{2i}) im(λ_{2i} w_{2i}^T) )  +  Σ_{j : λ_j∈R, λ_j≠0} λ_j s_j w_j^T      (1554)

The summation (1554) shows: A is a linear combination of real and imaginary parts of its (right-)eigenvectors corresponding to nonzero eigenvalues. The k vectors {re s_i ∈ R^m , im s_i ∈ R^m | λ_i ≠ 0 , i ∈ {1... m}} must therefore span the range of diagonalizable matrix A .

The argument is similar regarding the span of the left-eigenvectors. ■

A.7.4 0 trace and matrix product

For X, A ∈ R^{M×N}_+ (39)

tr(X^T A) = 0 ⇔ X ∘ A = A ∘ X = 0      (1555)

For X, A ∈ S^M_+ [33,2.6.1, exer.2.8] [352,3.1]

tr(XA) = 0 ⇔ XA = AX = 0      (1556)
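A sketch of (1556), assuming numpy; X and A are built as dyads with orthogonal ranges (our construction):

    import numpy as np

    a = np.array([1.0, 1.0, 0.0])
    b = np.array([1.0, -1.0, 2.0])          # orthogonal to a
    A = np.outer(a, a)                      # positive semidefinite, rank 1
    X = np.outer(b, b)                      # positive semidefinite, rank 1

    print(np.trace(X @ A))                                  # 0.0
    print(np.allclose(X @ A, 0), np.allclose(A @ X, 0))     # True True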

Proof. (⇐) Suppose XA = AX = 0 . Then tr(XA) = 0 is obvious.
(⇒) Suppose tr(XA) = 0 . tr(XA) = tr(√A X √A) whose argument is positive semidefinite by Corollary A.3.1.0.5. Trace of any square matrix is equivalent to the sum of its eigenvalues. Eigenvalues of a positive semidefinite matrix can total 0 if and only if each and every nonnegative eigenvalue is 0. The only positive semidefinite matrix, having all 0 eigenvalues, resides at the origin; (confer (1580)) id est,

√A X √A = (√X √A)^T √X √A = 0      (1557)

implying √X √A = 0 which in turn implies √X (√X √A) √A = XA = 0 . Arguing similarly yields AX = 0 . ■

Diagonalizable matrices A and X are simultaneously diagonalizable if and only if they are commutative under multiplication; [198,1.3.12] id est, iff they share a complete set of eigenvectors.
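A sketch of simultaneous diagonalizability (numpy assumed): two symmetric matrices built on a common orthonormal eigenvector basis necessarily commute.

    import numpy as np

    rng = np.random.default_rng(5)
    S, _ = np.linalg.qr(rng.standard_normal((4, 4)))    # common eigenvector basis
    A = S @ np.diag([3.0, 1.0, -2.0, 0.5]) @ S.T
    X = S @ np.diag([1.0, 4.0,  2.0, -1.0]) @ S.T

    print(np.allclose(A @ X, X @ A))                    # True: A and X commute
    D = S.T @ X @ S
    print(np.allclose(D, np.diag(np.diag(D))))          # True: S diagonalizes X too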

A.19 Complex conjugate of w is denoted w^∗ . Conjugate transpose is denoted w^H = w^{∗T} .



A.7.4.0.1 Example. An equivalence in nonisomorphic spaces.
Identity (1556) leads to an unusual equivalence relating convex geometry to traditional linear algebra: The convex sets, given A ≽ 0

{X | ⟨X , A⟩ = 0} ∩ {X ≽ 0} ≡ {X | N(X) ⊇ R(A)} ∩ {X ≽ 0}      (1558)

(one expressed in terms of a hyperplane, the other in terms of nullspace and range) are equivalent only when symmetric matrix A is positive semidefinite. We might apply this equivalence to the geometric center subspace, for example,

S^M_c = {Y ∈ S^M | Y 1 = 0}
      = {Y ∈ S^M | N(Y) ⊇ 1} = {Y ∈ S^M | R(Y) ⊆ N(1^T)}      (1957)

from which we derive (confer (970))

S^M_c ∩ S^M_+ ≡ {X ≽ 0 | ⟨X , 11^T⟩ = 0}      (1559)
■

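A numerical sketch of (1559), assuming numpy (the particular V below, whose columns are orthogonal to 1, is our choice):

    import numpy as np

    V = np.array([[ 1.0,  0.0],
                  [-1.0,  1.0],
                  [ 0.0, -1.0],
                  [ 0.0,  0.0]])            # each column sums to 0
    X = V @ V.T                             # positive semidefinite
    ones = np.ones(4)

    print(np.trace(X @ np.outer(ones, ones)))   # 0.0 : <X, 11^T> = 0
    print(np.allclose(X @ ones, 0))             # True: equivalently X 1 = 0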
A.7.5 Zero definite

The domain over which an arbitrary real matrix A is zero definite can exceed its left and right nullspaces. For any positive semidefinite matrix A ∈ R^{M×M} (for A + A^T ≽ 0)

{x | x^T A x = 0} = N(A + A^T)      (1560)

because ∃ R  A + A^T = R^T R , ‖Rx‖ = 0 ⇔ Rx = 0 , and N(A + A^T) = N(R). Then given any particular vector x_p , x_p^T A x_p = 0 ⇔ x_p ∈ N(A + A^T). For any positive definite matrix A (for A + A^T ≻ 0)

{x | x^T A x = 0} = 0      (1561)

Further, [386,3.2, prob.5]

{x | x^T A x = 0} = R^M ⇔ A^T = −A      (1562)

while

{x | x^H A x = 0} = C^M ⇔ A = 0      (1563)



The positive semidefinite matrix

A = [ 1 2
      0 1 ]      (1564)

for example, has no nullspace. Yet

{x | x^T A x = 0} = {x | 1^T x = 0} ⊂ R²      (1565)

which is the nullspace of the symmetrized matrix. Symmetric matrices are not spared from the excess; videlicet,

B = [ 1 2
      2 1 ]      (1566)

has eigenvalues {−1, 3} , no nullspace, but is zero definite on^{A.20}

X ≜ {x ∈ R² | x_2 = (−2 ± √3) x_1}      (1567)
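Both examples check out numerically; a sketch assuming numpy:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [0.0, 1.0]])              # (1564)
    x = np.array([1.0, -1.0])               # spans N(A + A^T) = {x | 1^T x = 0}
    print(x @ A @ x)                        # 0.0

    B = np.array([[1.0, 2.0],
                  [2.0, 1.0]])              # (1566)
    for slope in (-2 + np.sqrt(3), -2 - np.sqrt(3)):
        x = np.array([1.0, slope])          # x on one of the two lines (1567)
        print(np.isclose(x @ B @ x, 0))     # True True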

A.7.5.0.1 Proposition. (Sturm/Zhang) Dyad-decompositions. [331,5.2]
Let positive semidefinite matrix X ∈ S^M_+ have rank ρ . Then given symmetric matrix A ∈ S^M , ⟨A , X⟩ = 0 if and only if there exists a dyad-decomposition

X = Σ_{j=1}^{ρ} x_j x_j^T      (1568)

satisfying

⟨A , x_j x_j^T⟩ = 0 for each and every j ∈ {1... ρ}      (1569)
⋄

The dyad-decomposition of X proposed is generally not that obtained from a standard diagonalization by eigenvalue decomposition, unless ρ = 1 or the given matrix A is simultaneously diagonalizable (A.7.4) with X . That means, elemental dyads in decomposition (1568) constitute a generally nonorthogonal set. Sturm & Zhang give a simple procedure for constructing the dyad-decomposition [Wıκımization]; matrix A may be regarded as a parameter.

A.20 These two lines represent the limit in the union of two generally distinct hyperbolae; id est, for matrix B and set X as defined

lim_{ε→0+} {x ∈ R² | x^T B x = ε} = X



A.7.5.0.2 Example. Dyad.
The dyad uv^T ∈ R^{M×M} (B.1) is zero definite on all x for which either x^T u = 0 or x^T v = 0 ;

{x | x^T uv^T x = 0} = {x | x^T u = 0} ∪ {x | v^T x = 0}      (1570)

id est, on u^⊥ ∪ v^⊥ . Symmetrizing the dyad does not change the outcome:

{x | x^T (uv^T + vu^T) x / 2 = 0} = {x | x^T u = 0} ∪ {x | v^T x = 0}      (1571)
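A sketch of (1570) with arbitrary vectors (numpy assumed): the quadratic form of the dyad vanishes exactly when x is orthogonal to u or to v.

    import numpy as np

    u = np.array([1.0, 2.0, 0.0])
    v = np.array([0.0, 1.0, 1.0])
    D = np.outer(u, v)                       # dyad u v^T

    x_perp_u = np.array([2.0, -1.0, 5.0])    # x^T u = 0
    x_perp_v = np.array([7.0, 1.0, -1.0])    # v^T x = 0
    x_other  = np.array([1.0, 1.0, 1.0])     # orthogonal to neither

    print(x_perp_u @ D @ x_perp_u)           # 0.0
    print(x_perp_v @ D @ x_perp_v)           # 0.0
    print(x_other  @ D @ x_other)            # 6.0 (nonzero)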
