Appendix A<br />
Linear algebra<br />
A.1 Main-diagonal δ operator, λ , trace, vec<br />
We introduce notation δ denoting the main-diagonal linear self-adjoint<br />
operator. When linear function δ operates on a square matrix A∈ R N×N ,<br />
δ(A) returns a vector composed of all the entries from the main diagonal in<br />
the natural order;<br />
δ(A) ∈ R N (1391)<br />
Operating on a vector y ∈ R N , δ naturally returns a diagonal matrix;<br />
δ(y) ∈ S N (1392)<br />
Operating recursively on a vector Λ∈ R N or diagonal matrix Λ∈ S N ,<br />
δ(δ(Λ)) returns Λ itself;<br />
δ 2 (Λ) ≡ δ(δ(Λ)) = Λ (1393)<br />
Defined in this manner, main-diagonal linear operator δ is self-adjoint<br />
[222,3.10,9.5-1]; A.1 videlicet, (2.2)<br />
δ(A) T y = 〈δ(A), y〉 = 〈A , δ(y)〉 = tr ( A T δ(y) ) (1394)<br />
A.1 Linear operator T : R m×n → R M×N is self-adjoint when, for each and every<br />
X 1 , X 2 ∈ R m×n 〈T(X 1 ), X 2 〉 = 〈X 1 , T(X 2 )〉<br />
2001 Jon Dattorro. co&edg version 2010.01.05. All rights reserved.<br />
citation: Dattorro, Convex Optimization & Euclidean Distance Geometry,<br />
Mεβoo Publishing USA, 2005, v2010.01.05.<br />
581
582 APPENDIX A. LINEAR ALGEBRA<br />
A.1.1<br />
Identities<br />
This δ notation is efficient and unambiguous as illustrated in the following<br />
examples where: A ◦ B denotes Hadamard product [198] [155,1.1.4] of<br />
matrices of like size, ⊗ Kronecker product [162] (D.1.2.1), y a vector, X<br />
a matrix, e i the i th member of the standard basis for R n , S N h the symmetric<br />
hollow subspace, σ(A) a vector of (nonincreasingly) ordered singular values,<br />
and λ(A) a vector of nonincreasingly ordered eigenvalues of matrix A :<br />
1. δ(A) = δ(A T )<br />
2. tr(A) = tr(A T ) = δ(A) T 1 = 〈I, A〉<br />
3. δ(cA) = cδ(A) c∈R<br />
4. tr(cA) = c tr(A) = c1 T λ(A) c∈R<br />
5. vec(cA) = c vec(A) c∈R<br />
6. A ◦ cB = cA ◦ B c∈R<br />
7. A ⊗ cB = cA ⊗ B c∈R<br />
8. δ(A + B) = δ(A) + δ(B)<br />
9. tr(A + B) = tr(A) + tr(B)<br />
10. vec(A + B) = vec(A) + vec(B)<br />
11. (A + B) ◦ C = A ◦ C + B ◦ C<br />
A ◦ (B + C) = A ◦ B + A ◦ C<br />
12. (A + B) ⊗ C = A ⊗ C + B ⊗ C<br />
A ⊗ (B + C) = A ⊗ B + A ⊗ C<br />
13. sgn(c)λ(|c|A) = cλ(A) c∈R<br />
14. sgn(c)σ(|c|A) = cσ(A) c∈R<br />
15. tr(c √ A T A) = c tr √ A T A = c1 T σ(A) c∈R<br />
16. π(δ(A)) = λ(I ◦A) where π is the presorting function.<br />
17. δ(AB) = (A ◦ B T )1 = (B T ◦ A)1<br />
18. δ(AB) T = 1 T (A T ◦ B) = 1 T (B ◦ A T )
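These identities lend themselves to quick numerical spot checks; the sketch below, in NumPy, verifies no.2 and nos.17-18, with np.diag standing in for the δ operator (acting on a matrix it extracts the main diagonal):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
one = np.ones(4)

# no.2: tr(A) = delta(A)^T 1 = <I, A>
assert np.isclose(np.trace(A), np.diag(A) @ one)
assert np.isclose(np.trace(A), np.sum(np.eye(4) * A))

# no.17: delta(AB) = (A o B^T)1 -- diagonal of a product without forming the product
assert np.allclose(np.diag(A @ B), (A * B.T) @ one)

# no.18: delta(AB)^T = 1^T (A^T o B)
assert np.allclose(np.diag(A @ B), one @ (A.T * B))
```

Identity no.17 is worth knowing in practice: it computes δ(AB) in O(N²) operations rather than the O(N³) of forming AB first.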
A.1. MAIN-DIAGONAL δ OPERATOR, λ , TRACE, VEC 583<br />
19. δ(uv T ) = [ u 1 v 1 ... u N v N ] T = u ◦ v , u,v ∈ R N<br />
20. tr(A T B) = tr(AB T ) = tr(BA T ) = tr(B T A)<br />
= 1 T (A ◦ B)1 = 1 T δ(AB T ) = δ(A T B) T 1 = δ(BA T ) T 1 = δ(B T A) T 1<br />
21. D = [d ij ] ∈ S N h , H = [h ij ] ∈ S N h , V = I − (1/N)11 T ∈ S N (confer B.4.2 no.20)<br />
N tr(−V (D ◦ H)V ) = tr(D T H) = 1 T (D ◦ H)1 = tr(11 T (D ◦ H)) = ∑ i,j d ij h ij<br />
22. tr(ΛA) = δ(Λ) T δ(A), δ 2 (Λ) Λ ∈ S N<br />
23. y T Bδ(A) = tr ( Bδ(A)y T) = tr ( δ(B T y)A ) = tr ( Aδ(B T y) )<br />
= δ(A) T B T y = tr ( yδ(A) T B T) = tr ( A T δ(B T y) ) = tr ( δ(B T y)A T)<br />
24. δ 2 (A T A) = ∑ i e i e T i A T A e i e T i<br />
25. δ ( δ(A)1 T) = δ ( 1δ(A) T) = δ(A)<br />
26. δ(A1)1 = δ(A11 T ) = A1 , δ(y)1 = δ(y1 T ) = y<br />
27. δ(I1) = δ(1) = I<br />
28. δ(e i e T j 1) = δ(e i ) = e i e T i<br />
29. For ζ =[ζ i ]∈ R k and x=[x i ]∈ R k , ∑ i ζ i /x i = ζ T δ(x) −1 1<br />
30. vec(A ◦ B) = vec(A) ◦ vec(B) = δ(vecA) vec(B)<br />
= vec(B) ◦ vec(A) = δ(vecB) vec(A)<br />
31. vec(AXB) = (B T ⊗ A) vec X (42)(1747)<br />
32. vec(BXA) = (A T ⊗ B) vec X<br />
33. tr(AXBX T ) = vec(X) T vec(AXB) = vec(X) T (B T ⊗ A) vec X [162]<br />
= δ ( vec(X) vec(X) T (B T ⊗ A) ) T 1<br />
34. tr(AX T BX) = vec(X) T vec(BXA) = vec(X) T (A T ⊗ B) vec X<br />
= δ ( vec(X) vec(X) T (A T ⊗ B) ) T 1
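Identities 31-34 are the workhorse vec/Kronecker rules; a NumPy sketch, using the column-stacking vec convention these identities assume:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 2))

def vec(M):
    # column-stacking vectorization, as assumed by identities 31-34
    return M.flatten(order='F')

# no.31: vec(AXB) = (B^T kron A) vec X
assert np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))

# no.33: tr(AXBX^T) = vec(X)^T (B^T kron A) vec X  (square A, B)
m, n = 3, 4
A2 = rng.standard_normal((m, m))
B2 = rng.standard_normal((n, n))
X2 = rng.standard_normal((m, n))
assert np.isclose(np.trace(A2 @ X2 @ B2 @ X2.T),
                  vec(X2) @ np.kron(B2.T, A2) @ vec(X2))
```

Identity no.31 is how a linear matrix equation AXB = C is turned into an ordinary linear system in vec X.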
584 APPENDIX A. LINEAR ALGEBRA<br />
35. For any permutation matrix Ξ and dimensionally compatible vector y<br />
or matrix A<br />
δ(Ξy) = Ξδ(y) Ξ T (1395)<br />
δ(ΞAΞ T ) = Ξδ(A) (1396)<br />
So given any permutation matrix Ξ and any dimensionally compatible<br />
matrix B , for example,<br />
δ 2 (B) = Ξδ 2 (Ξ T B Ξ)Ξ T (1397)<br />
36. A ⊗ 1 = 1 ⊗ A = A<br />
37. A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C<br />
38. (A ⊗ B)(C ⊗ D) = AC ⊗ BD<br />
39. For A a vector, (A ⊗ B) = (A ⊗ I)B<br />
40. For B a row vector, (A ⊗ B) = A(I ⊗ B)<br />
41. (A ⊗ B) T = A T ⊗ B T<br />
42. (A ⊗ B) −1 = A −1 ⊗ B −1<br />
43. tr(A ⊗ B) = trA trB<br />
44. For A∈ R m×m , B ∈ R n×n , det(A ⊗ B) = det n (A) det m (B)<br />
45. There exist permutation matrices Ξ 1 and Ξ 2 such that [162, p.28]<br />
A ⊗ B = Ξ 1 (B ⊗ A)Ξ 2 (1398)<br />
46. For eigenvalues λ(A)∈ C n and eigenvectors v(A)∈ C n×n such that<br />
A = vδ(λ)v −1 ∈ R n×n<br />
λ(A ⊗ B) = λ(A) ⊗ λ(B) , v(A ⊗ B) = v(A) ⊗ v(B) (1399)<br />
47. Given analytic function [80] f : C n×n → C n×n , then f(I ⊗A)= I ⊗f(A)<br />
and f(A ⊗ I) = f(A) ⊗ I [162, p.28]
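Several of the Kronecker identities above (nos.38, 43, 44, 46) can be spot-checked numerically; symmetric matrices are used for the eigenvalue identity so all eigenvalues are real:

```python
import numpy as np

rng = np.random.default_rng(2)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))

# no.38: (A kron B)(C kron D) = AC kron BD
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
# no.43: tr(A kron B) = trA trB
assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
# no.44: det(A kron B) = det(A)^n det(B)^m, here m = n = 3
assert np.isclose(np.linalg.det(np.kron(A, B)),
                  np.linalg.det(A)**3 * np.linalg.det(B)**3)

# no.46: eigenvalues of a Kronecker product are the pairwise products
As, Bs = A + A.T, B + B.T                      # symmetric, so eigenvalues are real
lam = np.sort(np.linalg.eigvalsh(np.kron(As, Bs)))
assert np.allclose(lam, np.sort(np.kron(np.linalg.eigvalsh(As),
                                        np.linalg.eigvalsh(Bs))))
```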
A.1. MAIN-DIAGONAL δ OPERATOR, λ , TRACE, VEC 585<br />
A.1.2<br />
Majorization<br />
A.1.2.0.1 Theorem. (Schur) Majorization. [386,7.4] [198,4.3]<br />
[199,5.5] Let λ∈ R N denote a given vector of eigenvalues and let<br />
δ ∈ R N denote a given vector of main diagonal entries, both arranged in<br />
nonincreasing order. Then<br />
and conversely<br />
∃A∈ S N λ(A)=λ and δ(A)= δ ⇐ λ − δ ∈ K ∗ λδ (1400)<br />
A∈ S N ⇒ λ(A) − δ(A) ∈ K ∗ λδ (1401)<br />
The difference belongs to the pointed polyhedral cone of majorization (empty<br />
interior, confer (291))<br />
K ∗ λδ ≜ K ∗ M+ ∩ {ζ1 | ζ ∈ R} ∗ (1402)<br />
where K ∗ M+ is the dual monotone nonnegative cone (413), and where the<br />
dual of the line is a hyperplane; ∂H = {ζ1 | ζ ∈ R} ∗ = 1 ⊥ .<br />
⋄<br />
Majorization cone K ∗ λδ is naturally consequent to the definition of<br />
majorization; id est, vector y ∈ R N majorizes vector x if and only if<br />
∑ i=1 k x i ≤ ∑ i=1 k y i ∀ 1 ≤ k ≤ N (1403)<br />
and<br />
1 T x = 1 T y (1404)<br />
Under these circumstances, rather, vector x is majorized by vector y .<br />
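The partial-sum test (1403)(1404) translates directly into code; a sketch with a hypothetical helper majorizes, also confirming the Schur theorem above on a random symmetric matrix:

```python
import numpy as np

def majorizes(y, x, tol=1e-9):
    """True if y majorizes x: partial sums of the nonincreasingly sorted
    entries of y dominate those of x (1403), with equal total sum (1404)."""
    xs = np.sort(x)[::-1]
    ys = np.sort(y)[::-1]
    return (np.all(np.cumsum(ys) >= np.cumsum(xs) - tol)
            and abs(x.sum() - y.sum()) < tol)

# Schur: the eigenvalues of a symmetric matrix majorize its main diagonal
rng = np.random.default_rng(3)
G = rng.standard_normal((5, 5))
A = G + G.T
assert majorizes(np.linalg.eigvalsh(A), np.diag(A))
```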
In the particular circumstance δ(A)=0 we get:<br />
A.1.2.0.2 Corollary. Symmetric hollow majorization.<br />
Let λ∈ R N denote a given vector of eigenvalues arranged in nonincreasing<br />
order. Then<br />
∃A∈ S N h λ(A)=λ ⇐ λ ∈ K ∗ λδ (1405)<br />
and conversely<br />
A∈ S N h ⇒ λ(A) ∈ K ∗ λδ (1406)<br />
where K ∗ λδ is defined in (1402). ⋄
586 APPENDIX A. LINEAR ALGEBRA<br />
A.2 Semidefiniteness: domain of test<br />
The most fundamental necessary, sufficient, and definitive test for positive<br />
semidefiniteness of matrix A ∈ R n×n is: [199,1]<br />
x T Ax ≥ 0 for each and every x ∈ R n such that ‖x‖ = 1 (1407)<br />
Traditionally, authors demand evaluation over a broader domain; namely,<br />
over all x ∈ R n , which is sufficient but unnecessary. Indeed, that standard<br />
textbook requirement is far over-reaching because if x T Ax is nonnegative for<br />
particular x = x p , then it is nonnegative for any αx p where α∈ R . Thus,<br />
only normalized x in R n need be evaluated.<br />
Many authors add the further requirement that the domain be complex;<br />
the broadest domain. By so doing, only Hermitian matrices (A H = A where<br />
superscript H denotes conjugate transpose) A.2 are admitted to the set of<br />
positive semidefinite matrices (1410); an unnecessary prohibitive condition.<br />
A.2.1<br />
Symmetry versus semidefiniteness<br />
We call (1407) the most fundamental test of positive semidefiniteness. Yet<br />
some authors instead say, for real A and complex domain {x∈ C n } , the<br />
complex test x H Ax≥0 is most fundamental. That complex broadening of the<br />
domain of test causes nonsymmetric real matrices to be excluded from the set<br />
of positive semidefinite matrices. Yet admitting nonsymmetric real matrices<br />
or not is a matter of preference A.3 unless that complex test is adopted, as we<br />
shall now explain.<br />
Any real square matrix A has a representation in terms of its symmetric<br />
and antisymmetric parts; id est,<br />
A = (A +A T )/2 + (A −A T )/2 (53)<br />
Because, for all real A , the antisymmetric part vanishes under real test,<br />
x T ( (A −A T )/2 ) x = 0 (1408)<br />
A.2 Hermitian symmetry is the complex analogue; the real part of a Hermitian matrix<br />
is symmetric while its imaginary part is antisymmetric. A Hermitian matrix has real<br />
eigenvalues and real main diagonal.<br />
A.3 Golub & Van Loan [155,4.2.2], for example, admit nonsymmetric real matrices.
A.2. SEMIDEFINITENESS: DOMAIN OF TEST 587<br />
only the symmetric part of A , (A +A T )/2, has a role determining positive<br />
semidefiniteness. Hence the oft-made presumption that only symmetric<br />
matrices may be positive semidefinite is, of course, erroneous under (1407).<br />
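A small numerical illustration of (1408): the antisymmetric part of a nonsymmetric real matrix contributes nothing to the quadratic form, so only the symmetric part matters under the real test:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))      # generic nonsymmetric real matrix
x = rng.standard_normal(4)

sym  = (A + A.T) / 2
anti = (A - A.T) / 2
# (1408): the antisymmetric part vanishes under the real quadratic form
assert np.isclose(x @ anti @ x, 0.0)
assert np.isclose(x @ A @ x, x @ sym @ x)
```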
Because eigenvalue-signs of a symmetric matrix translate unequivocally to<br />
its semidefiniteness, the eigenvalues that determine semidefiniteness are<br />
always those of the symmetrized matrix. (A.3) For that reason, and<br />
because symmetric (or Hermitian) matrices must have real eigenvalues,<br />
the convention adopted in the literature is that semidefinite matrices are<br />
synonymous with symmetric semidefinite matrices. Certainly misleading<br />
under (1407), that presumption is typically bolstered with compelling<br />
examples from the physical sciences where symmetric matrices occur within<br />
the mathematical exposition of natural phenomena. A.4 [137,52]<br />
Perhaps a better explanation of this pervasive presumption of symmetry<br />
comes from Horn & Johnson [198,7.1] whose perspective A.5 is the complex<br />
matrix, thus necessitating the complex domain of test throughout their<br />
treatise. They explain, if A∈C n×n<br />
...and if x H Ax is real for all x ∈ C n , then A is Hermitian.<br />
Thus, the assumption that A is Hermitian is not necessary in the<br />
definition of positive definiteness. It is customary, however.<br />
Their comment is best explained by noting, the real part of x H Ax comes<br />
from the Hermitian part (A +A H )/2 of A ;<br />
re(x H Ax) = x H ( (A +A H )/2 ) x (1409)<br />
rather,<br />
x H Ax ∈ R ⇔ A H = A (1410)<br />
because the imaginary part of x H Ax comes from the anti-Hermitian part<br />
(A −A H )/2 ;<br />
im(x H Ax) = x H ( (A −A H )/2 ) x (1411)<br />
that vanishes for nonzero x if and only if A = A H . So the Hermitian<br />
symmetry assumption is unnecessary, according to Horn & Johnson, not<br />
A.4 Symmetric matrices are certainly pervasive in our chosen subject as well.<br />
A.5 A totally complex perspective is not necessarily more advantageous. The positive<br />
semidefinite cone, for example, is not selfdual (2.13.5) in the ambient space of Hermitian<br />
matrices. [192,II]
588 APPENDIX A. LINEAR ALGEBRA<br />
because nonHermitian matrices could be regarded positive semidefinite,<br />
rather because nonHermitian (includes nonsymmetric real) matrices are not<br />
comparable on the real line under x H Ax . Yet that complex edifice is<br />
dismantled in the test of real matrices (1407) because the domain of test<br />
is no longer necessarily complex; meaning, x T Ax will certainly always be<br />
real, regardless of symmetry, and so real A will always be comparable.<br />
In summary, if we limit the domain of test to all x in R n as in (1407),<br />
then nonsymmetric real matrices are admitted to the realm of semidefinite<br />
matrices because they become comparable on the real line. One important<br />
exception occurs for rank-one matrices Ψ=uv T where u and v are real<br />
vectors: Ψ is positive semidefinite if and only if Ψ=uu T . (A.3.1.0.7)<br />
We might choose to expand the domain of test to all x in C n so that only<br />
symmetric matrices would be comparable. The alternative to expanding<br />
domain of test is to assume all matrices of interest to be symmetric; that<br />
is commonly done, hence the synonymous relationship with semidefinite<br />
matrices.<br />
A.2.1.0.1 Example. Nonsymmetric positive definite product.<br />
Horn & Johnson assert and Zhang agrees:<br />
If A,B ∈ C n×n are positive definite, then we know that the<br />
product AB is positive definite if and only if AB is Hermitian.<br />
[198,7.6, prob.10] [386,6.2,3.2]<br />
Implicitly in their statement, A and B are assumed individually Hermitian<br />
and the domain of test is assumed complex.<br />
We prove that assertion to be false for real matrices under (1407) that<br />
adopts a real domain of test.<br />
A T = A = [ 3 0 −1 0 ; 0 5 1 0 ; −1 1 4 1 ; 0 0 1 4 ] , λ(A) = [ 5.9 4.5 3.4 2.0 ] T (1412)<br />
B T = B = [ 4 4 −1 −1 ; 4 5 0 0 ; −1 0 5 1 ; −1 0 1 4 ] , λ(B) = [ 8.8 5.5 3.3 0.24 ] T (1413)
A.3. PROPER STATEMENTS 589<br />
(AB) T ≠ AB = [ 13 12 −8 −4 ; 19 25 5 1 ; −5 1 22 9 ; −5 0 9 17 ] , λ(AB) = [ 36. 29. 10. 0.72 ] T (1414)<br />
(AB + (AB) T )/2 = [ 13 15.5 −6.5 −4.5 ; 15.5 25 3 0.5 ; −6.5 3 22 9 ; −4.5 0.5 9 17 ] ,<br />
λ ( (AB + (AB) T )/2 ) = [ 36. 30. 10. 0.014 ] T (1415)<br />
Whenever A∈ S n + and B ∈ S n + , then λ(AB)=λ( √ AB √ A) will always<br />
be a nonnegative vector by (1439) and Corollary A.3.1.0.5. Yet positive<br />
definiteness of product AB is certified instead by the nonnegative eigenvalues<br />
λ ( (AB + (AB) T )/2 ) in (1415) (A.3.1.0.1) despite the fact that AB is not<br />
symmetric. A.6 Horn & Johnson and Zhang resolve the anomaly by choosing<br />
to exclude nonsymmetric matrices and products; they do so by expanding<br />
the domain of test to C n .<br />
<br />
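The matrices (1412)(1413) of this example are easy to verify numerically; eigvalsh presumes a symmetric argument, so it is applied only to A, B, and the symmetrized product:

```python
import numpy as np

A = np.array([[ 3, 0, -1, 0],
              [ 0, 5,  1, 0],
              [-1, 1,  4, 1],
              [ 0, 0,  1, 4]], dtype=float)
B = np.array([[ 4, 4, -1, -1],
              [ 4, 5,  0,  0],
              [-1, 0,  5,  1],
              [-1, 0,  1,  4]], dtype=float)

# A and B are symmetric positive definite, as (1412)(1413) claim
assert np.all(np.linalg.eigvalsh(A) > 0) and np.all(np.linalg.eigvalsh(B) > 0)

AB = A @ B
assert not np.allclose(AB, AB.T)            # the product is not symmetric ...
sym = (AB + AB.T) / 2
assert np.all(np.linalg.eigvalsh(sym) > 0)  # ... yet x^T(AB)x > 0 for all x != 0
```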
A.3 Proper statements<br />
of positive semidefiniteness<br />
Unlike Horn & Johnson and Zhang, we never adopt a complex domain of test<br />
with real matrices. So motivated is our consideration of proper statements<br />
of positive semidefiniteness under real domain of test. This restriction,<br />
ironically, complicates the facts when compared to corresponding statements<br />
for the complex case (found elsewhere [198] [386]).<br />
We state several fundamental facts regarding positive semidefiniteness of<br />
real matrix A and the product AB and sum A +B of real matrices under<br />
fundamental real test (1407); a few require proof as they depart from the<br />
standard texts, while those remaining are well established or obvious.<br />
A.6 It is a little more difficult to find a counterexample in R 2×2 or R 3×3 , which may<br />
have served to advance the confusion.
590 APPENDIX A. LINEAR ALGEBRA<br />
A.3.0.0.1 Theorem. Positive (semi)definite matrix.<br />
A∈ S M is positive semidefinite if and only if for each and every vector x∈ R M<br />
of unit norm, ‖x‖ = 1 , A.7 we have x T Ax ≥ 0 (1407);<br />
A ≽ 0 ⇔ tr(xx T A) = x T Ax ≥ 0 ∀xx T (1416)<br />
Matrix A ∈ S M is positive definite if and only if for each and every ‖x‖ = 1<br />
we have x T Ax > 0 ;<br />
A ≻ 0 ⇔ tr(xx T A) = x T Ax > 0 ∀xx T , xx T ≠ 0 (1417)<br />
⋄<br />
Proof. Statements (1416) and (1417) are each a particular instance<br />
of dual generalized inequalities (2.13.2) with respect to the positive<br />
semidefinite cone; videlicet, [353]<br />
A ≽ 0 ⇔ 〈xx T , A〉 ≥ 0 ∀xx T (≽ 0)<br />
A ≻ 0 ⇔ 〈xx T , A〉 > 0 ∀xx T (≽ 0), xx T ≠ 0<br />
(1418)<br />
This says: positive semidefinite matrix A must belong to the normal side<br />
of every hyperplane whose normal is an extreme direction of the positive<br />
semidefinite cone. Relations (1416) and (1417) remain true when xx T is<br />
replaced with “for each and every” positive semidefinite matrix X ∈ S M +<br />
(2.13.5) of unit norm, ‖X‖= 1, as in<br />
A ≽ 0 ⇔ tr(XA) ≥ 0 ∀X ∈ S M +<br />
A ≻ 0 ⇔ tr(XA) > 0 ∀X ∈ S M + , X ≠ 0<br />
(1419)<br />
But that condition is more than what is necessary. By the discretized<br />
membership theorem in 2.13.4.2.1, the extreme directions xx T of the positive<br />
semidefinite cone constitute a minimal set of generators necessary and<br />
sufficient for discretization of dual generalized inequalities (1419) certifying<br />
membership to that cone.<br />
<br />
A.7 The traditional condition requiring all x∈ R M for defining positive (semi)definiteness<br />
is actually more than what is necessary. The set of norm-1 vectors is necessary and<br />
sufficient to establish positive semidefiniteness; actually, any particular norm and any<br />
nonzero norm-constant will work.
A.3. PROPER STATEMENTS 591<br />
A.3.1<br />
Semidefiniteness, eigenvalues, nonsymmetric<br />
When A∈ R n×n , let λ ( (A +A T )/2 ) ∈ R n denote eigenvalues of the<br />
symmetrized matrix A.8 arranged in nonincreasing order.<br />
By positive semidefiniteness of A∈ R n×n we mean, A.9 [266,1.3.1]<br />
(confer A.3.1.0.1)<br />
x T Ax ≥ 0 ∀x∈ R n ⇔ A +A T ≽ 0 ⇔ λ(A +A T ) ≽ 0 (1420)<br />
(2.9.0.1)<br />
A ≽ 0 ⇒ A T = A (1421)<br />
A ≽ B ⇔ A −B ≽ 0 ⇏ A ≽ 0 or B ≽ 0 (1422)<br />
x T Ax≥0 ∀x ⇏ A T = A (1423)<br />
Matrix symmetry is not intrinsic to positive semidefiniteness;<br />
A T = A, λ(A) ≽ 0 ⇒ x T Ax ≥ 0 ∀x (1424)<br />
λ(A) ≽ 0 ⇐ A T = A, x T Ax ≥ 0 ∀x (1425)<br />
If A T = A then λ(A) ≽ 0 ⇔ A ≽ 0 (1426)<br />
meaning, matrix A belongs to the positive semidefinite cone in the<br />
subspace of symmetric matrices if and only if its eigenvalues belong to<br />
the nonnegative orthant.<br />
〈A , A〉 = 〈λ(A), λ(A)〉 (45)<br />
For µ∈ R , A∈ R n×n , and vector λ(A)∈ C n holding the ordered<br />
eigenvalues of A<br />
λ(µI + A) = µ1 + λ(A) (1427)<br />
Proof: A=MJM −1 and µI + A = M(µI + J)M −1 where J<br />
is the Jordan form for A ; [325,5.6, App.B] id est, δ(J) = λ(A) ,<br />
so λ(µI + A) = δ(µI + J) because µI + J is also a Jordan form. <br />
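A numerical check of (1427), comparing eigenvalue multisets of A and µI + A (eigenvalues of a nonsymmetric real matrix may be complex, so they are sorted lexicographically for comparison):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
mu = 2.5

lamA = np.linalg.eigvals(A)
lam_shift = np.linalg.eigvals(mu * np.eye(4) + A)
# (1427): adding mu*I shifts every eigenvalue by mu
assert np.allclose(np.sort_complex(lam_shift), np.sort_complex(mu + lamA))
```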
A.8 The symmetrization of A is (A +A T )/2. λ ( (A +A T )/2 ) = λ(A +A T )/2.<br />
A.9 Strang agrees [325, p.334] it is not λ(A) that requires observation. Yet he is mistaken<br />
by proposing the Hermitian part alone x H (A+A H )x be tested, because the anti-Hermitian<br />
part does not vanish under complex test unless A is Hermitian. (1411)
592 APPENDIX A. LINEAR ALGEBRA<br />
By similar reasoning,<br />
λ(I + µA) = 1 + λ(µA) (1428)<br />
For vector σ(A) holding the singular values of any matrix A<br />
σ(I + µA T A) = π(|1 + µσ(A T A)|) (1429)<br />
σ(µI + A T A) = π(|µ1 + σ(A T A)|) (1430)<br />
where π is the nonlinear permutation-operator sorting its vector<br />
argument into nonincreasing order.<br />
For A∈ S M and each and every ‖w‖= 1 [198,7.7, prob.9]<br />
w T Aw ≤ µ ⇔ A ≼ µI ⇔ λ(A) ≼ µ1 (1431)<br />
[198,2.5.4] (confer (44))<br />
A is normal matrix ⇔ ‖A‖ 2 F = λ(A) T λ(A) (1432)<br />
For A∈ R m×n A T A ≽ 0, AA T ≽ 0 (1433)<br />
because, for dimensionally compatible vector x , x T A T Ax = ‖Ax‖ 2 2 ,<br />
x T AA T x = ‖A T x‖ 2 2 .<br />
For A∈ R n×n and c∈R<br />
tr(cA) = c tr(A) = c1 T λ(A)<br />
(A.1.1 no.4)<br />
For m a nonnegative integer, (1828)<br />
det(A m ) = ∏ i=1 n λ(A) i m (1434)<br />
tr(A m ) = ∑ i=1 n λ(A) i m (1435)
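A quick check of (1434)(1435) for a random real matrix (its eigenvalues may be complex, but the product and sum are real up to roundoff):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)
m = 3

Am = np.linalg.matrix_power(A, m)
# (1434): det(A^m) is the product of the lambda_i^m
assert np.isclose(np.linalg.det(Am), np.prod(lam**m).real)
# (1435): tr(A^m) is the sum of the lambda_i^m
assert np.isclose(np.trace(Am), np.sum(lam**m).real)
```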
A.3. PROPER STATEMENTS 593<br />
For A diagonalizable (A.5), A = SΛS −1 , (confer [325, p.255])<br />
rankA = rankδ(λ(A)) = rank Λ (1436)<br />
meaning, rank is equal to the number of nonzero eigenvalues in vector<br />
λ(A) ≜ δ(Λ) (1437)<br />
by the 0 eigenvalues theorem (A.7.3.0.1).<br />
(Ky Fan) For A,B∈ S n [53,1.2] (confer (1703))<br />
tr(AB) ≤ λ(A) T λ(B) (1689)<br />
with equality (Theobald) when A and B are simultaneously<br />
diagonalizable [198] with the same ordering of eigenvalues.<br />
For A∈ R m×n and B ∈ R n×m<br />
tr(AB) = tr(BA) (1438)<br />
and η eigenvalues of the product and commuted product are identical,<br />
including their multiplicity; [198,1.3.20] id est,<br />
λ(AB) 1:η = λ(BA) 1:η , η ≜ min{m , n} (1439)<br />
Any eigenvalues remaining are zero. By the 0 eigenvalues theorem<br />
(A.7.3.0.1),<br />
rank(AB) = rank(BA), AB and BA diagonalizable (1440)<br />
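The identities (1438)(1439) can be illustrated with a rectangular pair: AB and BA share the η = min{m , n} nonzero eigenvalues, and the eigenvalues remaining in the larger product vanish:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))

# (1438): traces of the commuted products agree
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# (1439): AB (3x3) and BA (5x5) share eta = min{3,5} = 3 eigenvalues;
# the remaining eigenvalues of BA are zero
lam_AB = np.sort_complex(np.linalg.eigvals(A @ B))
lam_BA = np.linalg.eigvals(B @ A)
nonzero = np.sort_complex(lam_BA[np.abs(lam_BA) > 1e-6])
assert nonzero.shape == (3,) and np.allclose(nonzero, lam_AB, atol=1e-6)
```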
For any compatible matrices A,B [198,0.4]<br />
min{rankA, rankB} ≥ rank(AB) (1441)<br />
For A,B ∈ S n +<br />
rankA + rankB ≥ rank(A + B) ≥ min{rankA, rankB} ≥ rank(AB)<br />
(1442)<br />
For linearly independent matrices A,B ∈ S n + (2.1.2, R(A) ∩ R(B)=0,<br />
R(A T ) ∩ R(B T )=0,B.1.1),<br />
rankA + rankB = rank(A + B) > min{rankA, rankB} ≥ rank(AB)<br />
(1443)
594 APPENDIX A. LINEAR ALGEBRA<br />
Because R(A T A)= R(A T ) and R(AA T )= R(A) (p.624), for any A∈ R m×n<br />
rank(AA T ) = rank(A T A) = rankA = rankA T (1444)<br />
For A∈ R m×n having no nullspace, and for any B ∈ R n×k<br />
rank(AB) = rank(B) (1445)<br />
Proof. For any compatible matrix C , N(CAB)⊇ N(AB)⊇ N(B)<br />
is obvious. By assumption ∃A † A † A = I . Let C = A † , then<br />
N(AB)= N(B) and the stated result follows by conservation of<br />
dimension (1549).<br />
<br />
For A∈ S n and any nonsingular matrix Y<br />
inertia(A) = inertia(YAY T ) (1446)<br />
a.k.a, Sylvester’s law of inertia. (1489) [110,2.4.3]<br />
For A,B∈R n×n square, [198,0.3.5]<br />
det(AB) = det(BA) (1447)<br />
det(AB) = detA detB (1448)<br />
Yet for A∈ R m×n and B ∈ R n×m [75, p.72]<br />
det(I + AB) = det(I + BA) (1449)<br />
For A,B ∈ S n , product AB is symmetric if and only if AB is<br />
commutative;<br />
(AB) T = AB ⇔ AB = BA (1450)<br />
Proof. (⇒) Suppose AB=(AB) T . (AB) T =B T A T =BA .<br />
AB=(AB) T ⇒ AB=BA .<br />
(⇐) Suppose AB=BA . BA=B T A T =(AB) T . AB=BA ⇒<br />
AB=(AB) T .<br />
<br />
Commutativity alone is insufficient for symmetry of the product.<br />
[325, p.26]
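A sketch of (1450): for generic symmetric A and B, symmetry of the product and commutativity fail together, while diagonal matrices always commute:

```python
import numpy as np

rng = np.random.default_rng(8)
G, H = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
A, B = G + G.T, H + H.T                  # generic symmetric matrices

# (1450): for symmetric A, B the product AB is symmetric iff AB = BA;
# for a generic pair both conditions fail simultaneously
AB = A @ B
assert np.allclose(AB, AB.T) == np.allclose(AB, B @ A)

# diagonal matrices always commute, and their product is symmetric
D1, D2 = np.diag([1., 2, 3]), np.diag([4., 5, 6])
assert np.allclose(D1 @ D2, D2 @ D1)
```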
A.3. PROPER STATEMENTS 595<br />
Diagonalizable matrices A,B∈R n×n commute if and only if they are<br />
simultaneously diagonalizable. [198,1.3.12] A product of diagonal<br />
matrices is always commutative.<br />
For A,B ∈ R n×n and AB = BA<br />
x T Ax ≥ 0, x T Bx ≥ 0 ∀x ⇒ λ(A+A T ) i λ(B+B T ) i ≥ 0 ∀i ⇏ x T ABx ≥ 0 ∀x<br />
(1451)<br />
the negative result arising because of the schism between the product<br />
of eigenvalues λ(A + A T ) i λ(B + B T ) i and the eigenvalues of the<br />
symmetrized matrix product λ(AB + (AB) T ) i . For example, X 2 is<br />
generally not positive semidefinite unless matrix X is symmetric; then<br />
(1433) applies. Simply substituting symmetric matrices changes the<br />
outcome:<br />
For A,B ∈ S n and AB = BA<br />
A ≽ 0, B ≽ 0 ⇒ λ(AB) i =λ(A) i λ(B) i ≥0 ∀i ⇔ AB ≽ 0 (1452)<br />
Positive semidefiniteness of A and B is sufficient but not a necessary<br />
condition for positive semidefiniteness of the product AB .<br />
Proof. Because all symmetric matrices are diagonalizable, (A.5.1)<br />
[325,5.6] we have A=SΛS T and B=T∆T T , where Λ and ∆ are<br />
real diagonal matrices while S and T are orthogonal matrices. Because<br />
(AB) T =AB , then T must equal S , [198,1.3] and the eigenvalues of<br />
A are ordered in the same way as those of B ; id est, λ(A) i =δ(Λ) i and<br />
λ(B) i =δ(∆) i correspond to the same eigenvector.<br />
(⇒) Assume λ(A) i λ(B) i ≥0 for i=1... n . AB=SΛ∆S T is<br />
symmetric and has nonnegative eigenvalues contained in diagonal<br />
matrix Λ∆ by assumption; hence positive semidefinite by (1420). Now<br />
assume A,B ≽0. That, of course, implies λ(A) i λ(B) i ≥0 for all i<br />
because all the individual eigenvalues are nonnegative.<br />
(⇐) Suppose AB=SΛ∆S T ≽ 0. Then Λ∆≽0 by (1420),<br />
and so all products λ(A) i λ(B) i must be nonnegative; meaning,<br />
sgn(λ(A))= sgn(λ(B)). We may, therefore, conclude nothing about<br />
the semidefiniteness of A and B .
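A constructive check of (1452): commuting PSD matrices can be built from a shared orthogonal eigenvector basis, and the product's eigenvalues are then exactly the products λ(A) i λ(B) i :

```python
import numpy as np

rng = np.random.default_rng(9)
G = rng.standard_normal((4, 4))
S, _ = np.linalg.qr(G)                   # shared orthogonal eigenvector basis

A = S @ np.diag([1., 2, 0, 3]) @ S.T     # A >= 0
B = S @ np.diag([4., 0, 5, 1]) @ S.T     # B >= 0, AB = BA by construction

assert np.allclose(A @ B, B @ A)
# (1452): the commuting PSD product is PSD, eigenvalues lambda(A)_i lambda(B)_i
lam = np.linalg.eigvalsh(A @ B)
assert np.all(lam > -1e-9)
assert np.allclose(np.sort(lam), np.sort([1*4., 2*0, 0*5, 3*1]))
```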
596 APPENDIX A. LINEAR ALGEBRA<br />
For A,B ∈ S n and A ≽ 0, B ≽ 0 (Example A.2.1.0.1)<br />
AB = BA ⇒ λ(AB) i =λ(A) i λ(B) i ≥ 0 ∀i ⇒ AB ≽ 0 (1453)<br />
AB = BA ⇒ λ(AB) i ≥ 0, λ(A) i λ(B) i ≥ 0 ∀i ⇔ AB ≽ 0 (1454)<br />
For A,B ∈ S n [199,4.2.13]<br />
A ≽ 0, B ≽ 0 ⇒ A ⊗ B ≽ 0 (1455)<br />
A ≻ 0, B ≻ 0 ⇒ A ⊗ B ≻ 0 (1456)<br />
and the Kronecker product is symmetric.<br />
For A,B ∈ S n [386,6.2]<br />
A ≽ 0 ⇒ trA ≥ 0 (1457)<br />
A ≽ 0, B ≽ 0 ⇒ trA trB ≥ tr(AB) ≥ 0 (1458)<br />
Because A ≽ 0, B ≽ 0 ⇒ λ(AB) = λ( √ AB √ A) ≽ 0 by (1439) and<br />
Corollary A.3.1.0.5, then we have tr(AB) ≥ 0.<br />
A ≽ 0 ⇔ tr(AB)≥ 0 ∀B ≽ 0 (356)<br />
For A,B,C ∈ S n (Löwner)<br />
A ≼ B , B ≼ C ⇒ A ≼ C (transitivity)<br />
A ≼ B ⇔ A + C ≼ B + C (additivity)<br />
A ≼ B , A ≽ B ⇒ A = B (antisymmetry)<br />
A ≼ A (reflexivity) (1459)<br />
A ≼ B , B ≺ C ⇒ A ≺ C (strict transitivity)<br />
A ≺ B ⇔ A + C ≺ B + C (strict additivity) (1460)<br />
For A,B ∈ R n×n x T Ax ≥ x T Bx ∀x ⇒ trA ≥ trB (1461)<br />
Proof. x T Ax≥x T Bx ∀x ⇔ λ((A −B) + (A −B) T )/2 ≽ 0 ⇒<br />
tr(A+A T −(B+B T ))/2 = tr(A −B)≥0. There is no converse.
A.3. PROPER STATEMENTS 597<br />
For A,B ∈ S n [386,6.2] (Theorem A.3.1.0.4)<br />
A ≽ B ⇒ trA ≥ trB (1462)<br />
A ≽ B ⇒ δ(A) ≽ δ(B) (1463)<br />
There is no converse, and restriction to the positive semidefinite cone<br />
does not improve the situation. All-strict versions hold.<br />
A ≽ B ≽ 0 ⇒ rankA ≥ rankB (1464)<br />
A ≽ B ≽ 0 ⇒ detA ≥ detB ≥ 0 (1465)<br />
A ≻ B ≽ 0 ⇒ detA > detB ≥ 0 (1466)<br />
For A,B ∈ int S n + [33,4.2] [198,7.7.4]<br />
A ≽ B ⇔ A −1 ≼ B −1 , A ≻ 0 ⇔ A −1 ≻ 0 (1467)<br />
For A,B ∈ S n [386,6.2]<br />
A ≽ B ≽ 0 ⇒ √ A ≽ √ B (1468)<br />
For A,B ∈ S n and AB = BA [386,6.2, prob.3]<br />
A ≽ B ≽ 0 ⇒ A k ≽ B k , k=1, 2,... (1469)<br />
A.3.1.0.1 Theorem. Positive semidefinite ordering of eigenvalues.<br />
For A,B∈ R M×M , place the eigenvalues of each symmetrized matrix into<br />
the respective vectors λ ( (A +A T )/2 ) , λ ( (B +B T )/2 ) ∈ R M . Then, [325,6]<br />
x T Ax ≥ 0 ∀x ⇔ λ ( A +A T ) ≽ 0 (1470)<br />
x T Ax > 0 ∀x ≠ 0 ⇔ λ ( A +A T ) ≻ 0 (1471)<br />
because x T (A −A T )x=0. (1408) Now arrange the entries of λ ( (A +A T )/2 )<br />
and λ ( (B +B T )/2 ) in nonincreasing order so λ ( (A +A T )/2 ) 1 holds the<br />
largest eigenvalue of symmetrized A while λ ( (B +B T )/2 ) 1 holds the largest<br />
eigenvalue of symmetrized B , and so on. Then [198,7.7, prob.1, prob.9]<br />
for κ ∈ R<br />
x T Ax ≥ x T Bx ∀x ⇒ λ ( A +A T ) ≽ λ ( B +B T )<br />
x T Ax ≥ x T Ix κ ∀x ⇔ λ ( (A +A T )/2 ) ≽ κ1<br />
(1472)
598 APPENDIX A. LINEAR ALGEBRA<br />
Now let A,B ∈ S M have diagonalizations A=QΛQ T and B =UΥU T with<br />
λ(A)=δ(Λ) and λ(B)=δ(Υ) arranged in nonincreasing order. Then<br />
A ≽ B ⇔ λ(A−B) ≽ 0 (1473)<br />
A ≽ B ⇒ λ(A) ≽ λ(B) (1474)<br />
A ≽ B ⇍ λ(A) ≽ λ(B) (1475)<br />
S T AS ≽ B ⇐ λ(A) ≽ λ(B) (1476)<br />
where S = QU T . [386,7.5]<br />
A.3.1.0.2 Theorem. (Weyl) Eigenvalues of sum. [198,4.3.1]<br />
For A,B∈ R M×M , place the eigenvalues of each symmetrized matrix into<br />
the respective vectors λ ( (A +A T )/2 ) , λ ( (B +B T )/2 ) ∈ R M in nonincreasing<br />
order so λ ( (A +A T )/2 ) 1 holds the largest eigenvalue of symmetrized A while<br />
λ ( (B +B T )/2 ) 1 holds the largest eigenvalue of symmetrized B , and so on.<br />
Then, for any k ∈{1... M }<br />
λ ( A +A T ) k + λ ( B +B T ) M ≤ λ ( (A +A T ) + (B +B T ) ) k ≤ λ ( A +A T ) k + λ ( B +B T ) 1<br />
(1477)<br />
⋄<br />
Weyl’s theorem establishes: concavity of the smallest λ M and convexity<br />
of the largest eigenvalue λ 1 of a symmetric matrix, via (475), and positive<br />
semidefiniteness of a sum of positive semidefinite matrices; for A,B∈ S M +<br />
λ(A) k + λ(B) M ≤ λ(A + B) k ≤ λ(A) k + λ(B) 1 (1478)<br />
Because S M + is a convex cone (2.9.0.0.1), then by (166)<br />
A,B ≽ 0 ⇒ ζA + ξB ≽ 0 for all ζ,ξ ≥ 0 (1479)<br />
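Weyl's bounds (1477)(1478) are easy to confirm numerically for a random symmetric pair:

```python
import numpy as np

rng = np.random.default_rng(10)
G, H = rng.standard_normal((5, 5)), rng.standard_normal((5, 5))
A, B = G + G.T, H + H.T
M = 5

lamA = np.sort(np.linalg.eigvalsh(A))[::-1]   # nonincreasing order
lamB = np.sort(np.linalg.eigvalsh(B))[::-1]
lamS = np.sort(np.linalg.eigvalsh(A + B))[::-1]

# Weyl: lambda(A)_k + lambda(B)_M <= lambda(A+B)_k <= lambda(A)_k + lambda(B)_1
for k in range(M):
    assert lamA[k] + lamB[M-1] - 1e-9 <= lamS[k] <= lamA[k] + lamB[0] + 1e-9
```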
A.3.1.0.3 Corollary. Eigenvalues of sum and difference. [198,4.3]<br />
For A∈ S M and B ∈ S M + , place the eigenvalues of each matrix into the<br />
respective vectors λ(A), λ(B)∈ R M in nonincreasing order so λ(A) 1 holds<br />
the largest eigenvalue of A while λ(B) 1 holds the largest eigenvalue of B ,<br />
and so on. Then, for any k ∈{1... M}<br />
λ(A − B) k ≤ λ(A) k ≤ λ(A +B) k (1480)<br />
⋄
A.3. PROPER STATEMENTS 599<br />
When B is rank-one positive semidefinite, the eigenvalues interlace; id est,<br />
for B = qq T<br />
λ(A) k+1 ≤ λ(A − qq T ) k ≤ λ(A) k ≤ λ(A + qq T ) k ≤ λ(A) k−1 (1481)<br />
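The interlacing behavior under a rank-one update can be observed numerically (a sketch; small tolerances absorb roundoff):

```python
import numpy as np

rng = np.random.default_rng(11)
G = rng.standard_normal((5, 5))
A = G + G.T
q = rng.standard_normal(5)

lam   = np.sort(np.linalg.eigvalsh(A))[::-1]                    # nonincreasing
lam_p = np.sort(np.linalg.eigvalsh(A + np.outer(q, q)))[::-1]
lam_m = np.sort(np.linalg.eigvalsh(A - np.outer(q, q)))[::-1]

tol = 1e-9
for k in range(5):
    assert lam[k] - tol <= lam_p[k]          # adding qq^T raises each eigenvalue ...
    assert lam_m[k] <= lam[k] + tol          # ... subtracting it lowers each one
    if k > 0:
        assert lam_p[k] <= lam[k-1] + tol    # but no further than the next one up
    if k < 4:
        assert lam[k+1] - tol <= lam_m[k]    # nor below the next one down
```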
A.3.1.0.4 Theorem. Positive (semi)definite principal submatrices. A.10<br />
A∈ S M is positive semidefinite if and only if all M principal submatrices<br />
of dimension M−1 are positive semidefinite and detA is nonnegative.<br />
A ∈ S M is positive definite if and only if any one principal submatrix<br />
of dimension M −1 is positive definite and detA is positive. ⋄<br />
If any one principal submatrix of dimension M−1 is not positive definite,<br />
conversely, then neither can A be. Regardless of symmetry, if A∈ R M×M is<br />
positive (semi)definite, then the determinant of each and every principal<br />
submatrix is (nonnegative) positive. [266,1.3.1]<br />
A.3.1.0.5 Corollary. Positive (semi)definite symmetric products. [198, p.399]<br />
If A∈ S M is positive definite and any particular dimensionally<br />
compatible matrix Z has no nullspace, then Z T AZ is positive definite.<br />
If matrix A∈ S M is positive (semi)definite then, for any matrix Z of<br />
compatible dimension, Z T AZ is positive semidefinite.<br />
A∈ S M is positive (semi)definite if and only if there exists a nonsingular<br />
Z such that Z T AZ is positive (semi)definite.<br />
If A∈ S M is positive semidefinite and singular it remains possible, for<br />
some skinny Z ∈ R M×N with N
We can deduce from these, given nonsingular matrix Z and any particular dimensionally compatible Y : matrix A∈ S^M is positive semidefinite if and only if [Z Y]^T A [Z Y] is positive semidefinite. In other words, from the Corollary it follows: for dimensionally compatible Z

    A ≽ 0 ⇔ Z^T A Z ≽ 0 and Z^T has a left inverse
Products such as Z†Z and ZZ† are symmetric and positive semidefinite although, given A ≽ 0 , Z†AZ and ZAZ† are neither necessarily symmetric nor positive semidefinite.
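The first two statements of the Corollary can be illustrated numerically; a minimal NumPy sketch with random instances (an illustration, not a proof):

```python
# Corollary A.3.1.0.5: congruence Z^T A Z preserves (semi)definiteness.
import numpy as np

rng = np.random.default_rng(2)
M, N = 6, 3
W = rng.standard_normal((M, M))
A_pd = W @ W.T + np.eye(M)        # positive definite
A_psd = W[:, :2] @ W[:, :2].T     # positive semidefinite, rank 2
Z = rng.standard_normal((M, N))   # generically full column rank (no nullspace)

min_eig = lambda X: np.linalg.eigvalsh(X)[0]
pd_case = min_eig(Z.T @ A_pd @ Z) > 0         # Z^T A Z positive definite
psd_case = min_eig(Z.T @ A_psd @ Z) > -1e-9   # Z^T A Z positive semidefinite
```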
A.3.1.0.6 Theorem. Symmetric projector semidefinite. [20, §III] [21, §6] [217, p.55]
For symmetric idempotent matrices P and R

    P , R ≽ 0
    P ≽ R ⇔ R(P) ⊇ R(R) ⇔ N(P) ⊆ N(R)        (1482)

Projector P is never positive definite [327, §6.5, prob.20] unless it is the identity matrix. ⋄
A.3.1.0.7 Theorem. Symmetric positive semidefinite.
Given real matrix Ψ with rank Ψ = 1

    Ψ ≽ 0 ⇔ Ψ = uu^T        (1483)

where u is some real vector; id est, symmetry is necessary and sufficient for positive semidefiniteness of a rank-1 matrix. ⋄
Proof. Any rank-one matrix must have the form Ψ = uv^T . (§B.1) Suppose Ψ is symmetric; id est, v = u . For all y∈ R^M , y^T u u^T y ≥ 0 . Conversely, suppose uv^T is positive semidefinite. That can hold if and only if uv^T + vu^T ≽ 0 ⇔ for all normalized y∈ R^M , 2y^T u v^T y ≥ 0 ; but that is possible only if v = u . ∎
The same does not hold true for matrices of higher rank, as Example A.2.1.0.1 shows.
A.4 Schur complement
Consider Schur-form partitioned matrix G : given A^T = A and C^T = C , then [57]

    G = [ A B ; B^T C ] ≽ 0
    ⇔ A ≽ 0 , B^T(I − AA†) = 0 , C − B^T A† B ≽ 0        (1484)
    ⇔ C ≽ 0 , B(I − CC†) = 0 , A − BC†B^T ≽ 0
where A† denotes the Moore-Penrose (pseudo)inverse (§E). In the first instance, I − AA† is a symmetric projection matrix orthogonally projecting on N(A^T). (1870) It is apparently required

    R(B) ⊥ N(A^T)        (1485)

which precludes A = 0 when B is any nonzero matrix. Note that A ≻ 0 ⇒ A† = A^{-1} ; thereby, the projection matrix vanishes. Likewise, in the second instance, I − CC† projects orthogonally on N(C^T). It is required

    R(B^T) ⊥ N(C^T)        (1486)

which precludes C = 0 for B nonzero. Again, C ≻ 0 ⇒ C† = C^{-1} . So we get, for A or C nonsingular,

    G = [ A B ; B^T C ] ≽ 0
    ⇔ A ≻ 0 , C − B^T A^{-1} B ≽ 0
      or
      C ≻ 0 , A − BC^{-1}B^T ≽ 0        (1487)
When A is full-rank then, for all B of compatible dimension, R(B) is in R(A). Likewise, when C is full-rank, R(B^T) is in R(C). Thus the flavor, for A and C nonsingular,

    G = [ A B ; B^T C ] ≻ 0
    ⇔ A ≻ 0 , C − B^T A^{-1} B ≻ 0        (1488)
    ⇔ C ≻ 0 , A − BC^{-1}B^T ≻ 0

where C − B^T A^{-1} B is called the Schur complement of A in G , while the Schur complement of C in G is A − BC^{-1}B^T . [148, §4.8]
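The equivalences (1487)–(1488) can be spot-checked; a minimal NumPy sketch on a randomly built positive definite G (illustration only):

```python
# Schur conditions (1487): for G = [A B; B^T C] built positive definite,
# both  A > 0, C - B^T A^{-1} B >= 0  and  C > 0, A - B C^{-1} B^T >= 0.
import numpy as np

rng = np.random.default_rng(3)
n = 3
W = rng.standard_normal((2 * n, 2 * n))
G = W @ W.T + np.eye(2 * n)               # G is positive definite
A, B, C = G[:n, :n], G[:n, n:], G[n:, n:]

min_eig = lambda X: np.linalg.eigvalsh(X)[0]
schur_A = min_eig(A) > 0 and min_eig(C - B.T @ np.linalg.inv(A) @ B) > 0
schur_C = min_eig(C) > 0 and min_eig(A - B @ np.linalg.inv(C) @ B.T) > 0
G_pd = min_eig(G) > 0
```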
Origin of the term Schur complement is from complementary inertia: [110, §2.4.4] Define

    inertia(G∈ S^M) ≜ {p, z, n}        (1489)

where p, z, n respectively represent number of positive, zero, and negative eigenvalues of G ; id est,

    M = p + z + n        (1490)

Then, when A is invertible,

    inertia(G) = inertia(A) + inertia(C − B^T A^{-1} B)        (1491)

and when C is invertible,

    inertia(G) = inertia(C) + inertia(A − BC^{-1}B^T)        (1492)

When A = C = 0 , denoting by σ(B)∈ R^m_+ the nonincreasingly ordered singular values of matrix B∈ R^{m×m} , then we have the eigenvalues [53, §1.2, prob.17]

    λ(G) = λ([ 0 B ; B^T 0 ]) = [ σ(B) ; −Ξσ(B) ]        (1493)

and

    inertia(G) = inertia(B^T B) + inertia(−B^T B)        (1494)

where Ξ is the order-reversing permutation matrix defined in (1690).
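Identity (1493) says the eigenvalues of [0 B; B^T 0] are the singular values of B together with their negatives. A NumPy spot-check (illustration under a random square B):

```python
# Eigenvalues (1493): lambda([0 B; B^T 0]) = [sigma(B); -Xi sigma(B)],
# i.e. the +/- singular values of B, with Xi reversing order.
import numpy as np

rng = np.random.default_rng(4)
m = 4
B = rng.standard_normal((m, m))
G = np.block([[np.zeros((m, m)), B], [B.T, np.zeros((m, m))]])

sigma = np.linalg.svd(B, compute_uv=False)        # nonincreasing
lamG = np.sort(np.linalg.eigvalsh(G))[::-1]       # nonincreasing
expected = np.concatenate([sigma, -sigma[::-1]])  # [sigma; -reversed sigma]
matches = np.allclose(lamG, expected)
```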
A.4.0.0.1 Example. Nonnegative polynomial. [33, p.163]
Schur-form positive semidefiniteness is sufficient for quadratic polynomial convexity in x , but it is necessary and sufficient for nonnegativity; videlicet, for all compatible x

    [x^T 1][ A b ; b^T c ][ x ; 1 ] ≥ 0 ⇔ x^T A x + 2b^T x + c ≥ 0        (1495)

A quadratic polynomial is nonnegative if and only if it can be expressed as a sum of squares of polynomials. [33, p.157] Sublevel set {x | x^T A x + 2b^T x + c ≤ 0} is convex if A ≽ 0 , but the quadratic polynomial is a convex function if and only if A ≽ 0 .
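A small NumPy sketch of (1495): building [A b; b^T c] positive semidefinite forces the quadratic polynomial nonnegative everywhere (sampled here at random points; an illustration, not a proof):

```python
# (1495): positive semidefinite [A b; b^T c] makes x^T A x + 2 b^T x + c
# nonnegative for every x.
import numpy as np

rng = np.random.default_rng(5)
n = 3
W = rng.standard_normal((n + 1, n + 1))
H = W @ W.T                     # [A b; b^T c] is positive semidefinite
A, b, c = H[:n, :n], H[:n, n], H[n, n]

poly = lambda x: x @ A @ x + 2 * b @ x + c
nonnegative = all(poly(rng.standard_normal(n)) >= -1e-9 for _ in range(200))
```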
A.4.0.0.2 Example. Schur conditions.
From (1457),

    [ A B ; B^T C ] ≽ 0 ⇒ tr(A + C) ≥ 0
        ⇕
    [ A 0 ; 0^T C − B^T A^{-1} B ] ≽ 0 ⇒ tr(C − B^T A^{-1} B) ≥ 0        (1496)
        ⇕
    [ A − BC^{-1}B^T 0 ; 0^T C ] ≽ 0 ⇒ tr(A − BC^{-1}B^T) ≥ 0

Since tr(C − B^T A^{-1} B) ≥ 0 ⇔ tr C ≥ tr(B^T A^{-1} B) ≥ 0 for example, then minimization of tr C is necessary and sufficient for minimization of tr(C − B^T A^{-1} B) when both are under constraint [ A B ; B^T C ] ≽ 0 . ∎
A.4.0.0.3 Example. Sparse Schur conditions.
Setting matrix A to the identity simplifies the Schur conditions. One consequence relates the definiteness of three quantities:

    [ I B ; B^T C ] ≽ 0  ⇔  C − B^T B ≽ 0  ⇔  [ I 0 ; 0^T C − B^T B ] ≽ 0        (1497)
∎
A.4.0.0.4 Exercise. Eigenvalues λ of sparse Schur-form.
Prove: given C − B^T B = 0 , for B∈ R^{m×n} and C∈ S^n

    λ([ I B ; B^T C ])_i = { 1 + λ(C)_i ,  1 ≤ i ≤ n
                             1 ,           n < i ≤ m
                             0 ,           otherwise        (1498)
A.4.0.0.5 Theorem. Rank of partitioned matrices.
When symmetric matrix A is invertible and C is symmetric,

    rank[ A B ; B^T C ] = rank[ A 0 ; 0^T C − B^T A^{-1} B ]
                        = rank A + rank(C − B^T A^{-1} B)        (1499)

equals rank of a block on the main diagonal plus rank of its Schur complement [386, §2.2, prob.7]. Similarly, when symmetric matrix C is invertible and A is symmetric,

    rank[ A B ; B^T C ] = rank[ A − BC^{-1}B^T 0 ; 0^T C ]
                        = rank(A − BC^{-1}B^T) + rank C        (1500)
⋄

Proof. The first assertion (1499) holds if and only if [198, §0.4.6(c)]

    ∃ nonsingular X, Y    X[ A B ; B^T C ]Y = [ A 0 ; 0^T C − B^T A^{-1} B ]        (1501)

Let [198, §7.7.6]

    Y = X^T = [ I −A^{-1}B ; 0^T I ]        (1502)
∎
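The rank identity (1499) can be illustrated numerically; a NumPy sketch in which the Schur complement is forced to rank one by construction (assumes default `matrix_rank` tolerances suffice, which they do at this scale):

```python
# Rank identity (1499): rank G = rank A + rank(C - B^T A^{-1} B)
# for invertible symmetric A.
import numpy as np

rng = np.random.default_rng(6)
n = 3
A = rng.standard_normal((n, n)); A = A + A.T + 10 * np.eye(n)  # invertible
B = rng.standard_normal((n, n))
u = rng.standard_normal((n, 1))
C = B.T @ np.linalg.inv(A) @ B + u @ u.T   # symmetric; Schur complement = uu^T

G = np.block([[A, B], [B.T, C]])
S = C - B.T @ np.linalg.inv(A) @ B
rank_ok = (np.linalg.matrix_rank(G)
           == np.linalg.matrix_rank(A) + np.linalg.matrix_rank(S))
```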
A.4.0.0.6 Lemma. Rank of Schur-form block. [133] [131]
Matrix B∈ R^{m×n} has rank B ≤ ρ if and only if there exist matrices A∈ S^m and C∈ S^n such that

    rank[ A 0 ; 0^T C ] ≤ 2ρ   and   G = [ A B ; B^T C ] ≽ 0        (1503)

Schur-form positive semidefiniteness alone implies rank A ≥ rank B and rank C ≥ rank B . But, even in absence of semidefiniteness, we must always have rank G ≥ rank A , rank B , rank C by fundamental linear algebra. ⋄
A.4.1 Determinant

    G = [ A B ; B^T C ]        (1504)

We consider again a matrix G partitioned like (1484), but not necessarily positive (semi)definite, where A and C are symmetric.
When A is invertible,

    det G = det A det(C − B^T A^{-1} B)        (1505)

When C is invertible,

    det G = det C det(A − BC^{-1}B^T)        (1506)

When B is full-rank and skinny, C = 0 , and A ≽ 0 , then [59, §10.1.1]

    det G ≠ 0 ⇔ A + BB^T ≻ 0        (1507)

When B is a (column) vector, then for all C∈ R and all A of dimension compatible with G

    det G = det(A)C − B^T A_cof^T B        (1508)

while for C ≠ 0

    det G = C det(A − (1/C)BB^T)        (1509)

where A_cof is the matrix of cofactors [325, §4] corresponding to A .
When B is full-rank and fat, A = 0 , and C ≽ 0 , then

    det G ≠ 0 ⇔ C + B^T B ≻ 0        (1510)

When B is a row-vector, then for A ≠ 0 and all C of dimension compatible with G

    det G = A det(C − (1/A)B^T B)        (1511)

while for all A∈ R

    det G = det(C)A − B C_cof^T B^T        (1512)

where C_cof is the matrix of cofactors corresponding to C .
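The two basic determinant identities (1505)–(1506) admit a quick NumPy check on random invertible blocks (illustration only):

```python
# Determinant identities (1505)-(1506) for partitioned G = [A B; B^T C].
import numpy as np

rng = np.random.default_rng(7)
n = 3
A = rng.standard_normal((n, n)); A = A + A.T + 5 * np.eye(n)  # invertible
C = rng.standard_normal((n, n)); C = C + C.T + 5 * np.eye(n)  # invertible
B = rng.standard_normal((n, n))
G = np.block([[A, B], [B.T, C]])

d = np.linalg.det
via_A = np.isclose(d(G), d(A) * d(C - B.T @ np.linalg.inv(A) @ B))
via_C = np.isclose(d(G), d(C) * d(A - B @ np.linalg.inv(C) @ B.T))
```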
A.5 Eigenvalue decomposition
When a square matrix X∈ R^{m×m} is diagonalizable, [325, §5.6] then

    X = SΛS^{-1} = [ s_1 ··· s_m ] Λ [ w_1^T ; ... ; w_m^T ] = Σ_{i=1}^m λ_i s_i w_i^T        (1513)

where {s_i∈ N(X − λ_i I) ⊆ C^m} are linearly independent (right-)eigenvectors^{A.12} constituting the columns of S∈ C^{m×m} defined by

    XS = SΛ        (1514)

{w_i∈ N(X^T − λ_i I) ⊆ C^m} are linearly independent left-eigenvectors of X (eigenvectors of X^T) constituting the rows of S^{-1} defined by [198]

    S^{-1}X = ΛS^{-1}        (1515)

and where {λ_i∈ C} are eigenvalues (1437)

    δ(λ(X)) = Λ∈ C^{m×m}        (1516)

corresponding to both left and right eigenvectors; id est, λ(X) = λ(X^T).
There is no connection between diagonalizability and invertibility of X . [325, §5.2] Diagonalizability is guaranteed by a full set of linearly independent eigenvectors, whereas invertibility is guaranteed by all nonzero eigenvalues.

    distinct eigenvalues ⇒ l.i. eigenvectors ⇔ diagonalizable
    not diagonalizable ⇒ repeated eigenvalue        (1517)
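The decomposition (1513)–(1515) maps directly onto NumPy's `eig`; a minimal sketch, assuming the random test matrix is diagonalizable (generically true):

```python
# Diagonalization (1513): X = S Lambda S^{-1}; the rows of S^{-1} are
# the left-eigenvectors, satisfying S^{-1} X = Lambda S^{-1}  (1515).
import numpy as np

rng = np.random.default_rng(8)
m = 4
X = rng.standard_normal((m, m))       # generically diagonalizable

lam, S = np.linalg.eig(X)             # right-eigenvectors: columns of S
W = np.linalg.inv(S)                  # left-eigenvectors: rows of S^{-1}
reconstructs = np.allclose(S @ np.diag(lam) @ W, X)
left_eig = np.allclose(W @ X, np.diag(lam) @ W)
```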
A.5.0.0.1 Theorem. Real eigenvector.
Eigenvectors of a real matrix corresponding to real eigenvalues must be real. ⋄

Proof. Ax = λx . Given λ = λ* , x^H A x = λ x^H x = λ‖x‖² = x^T A x* ⇒ x = x* , where x^H = x*^T . The converse is equally simple. ∎

A.12 Eigenvectors must, of course, be nonzero. The prefix eigen is from the German; in this context meaning, something akin to "characteristic". [322, p.14]
A.5.0.1 Uniqueness
From the fundamental theorem of algebra, [214] which guarantees existence of zeros for a given polynomial, it follows: eigenvalues, including their multiplicity, for a given square matrix are unique; meaning, there is no other set of eigenvalues for that matrix. (Conversely, many different matrices may share the same unique set of eigenvalues; e.g., for any X , λ(X) = λ(X^T).)
Uniqueness of eigenvectors, in contrast, disallows multiplicity of the same direction:

A.5.0.1.1 Definition. Unique eigenvectors.
When eigenvectors are unique, we mean: unique to within a real nonzero scaling, and their directions are distinct. △

If S is a matrix of eigenvectors of X as in (1513), for example, then −S is certainly another matrix of eigenvectors decomposing X with the same eigenvalues.
For any square matrix, the eigenvector corresponding to a distinct eigenvalue is unique; [322, p.220]

    distinct eigenvalues ⇒ eigenvectors unique        (1518)

Eigenvectors corresponding to a repeated eigenvalue are not unique for a diagonalizable matrix;

    repeated eigenvalue ⇒ eigenvectors not unique        (1519)

Proof follows from the observation: any linear combination of distinct eigenvectors of diagonalizable X , corresponding to a particular eigenvalue, produces another eigenvector. For eigenvalue λ whose multiplicity^{A.13} dim N(X − λI) exceeds 1, in other words, any choice of independent vectors from N(X − λI) (of the same multiplicity) constitutes eigenvectors corresponding to λ . ∎

A.13 A matrix is diagonalizable iff algebraic multiplicity (number of occurrences of same eigenvalue) equals geometric multiplicity dim N(X − λI) = m − rank(X − λI) [322, p.15] (number of Jordan blocks w.r.t λ or number of corresponding l.i. eigenvectors).
Caveat: diagonalizability insures linear independence, which implies existence of distinct eigenvectors. We may conclude, for diagonalizable matrices,

    distinct eigenvalues ⇔ eigenvectors unique        (1520)
A.5.0.2 Invertible matrix
When diagonalizable matrix X∈ R^{m×m} is nonsingular (no zero eigenvalues), then it has an inverse obtained simply by inverting eigenvalues in (1513):

    X^{-1} = SΛ^{-1}S^{-1}        (1521)

A.5.0.3 eigenmatrix
The (right-)eigenvectors {s_i} (1513) are naturally orthogonal to the left-eigenvectors {w_i} , w_i^T s_j = 0 , except w_i^T s_i = 1 for i = 1... m ; called a biorthogonality condition [358, §2.2.4] [198] because neither set of left or right eigenvectors is necessarily an orthogonal set. Consequently, each dyad from a diagonalization is an independent (§B.1.1) nonorthogonal projector because

    s_i w_i^T s_i w_i^T = s_i w_i^T        (1522)

(whereas the dyads of singular value decomposition are not inherently projectors (confer (1529))).
Dyads of eigenvalue decomposition can be termed eigenmatrices because

    X s_i w_i^T = λ_i s_i w_i^T        (1523)

Sum of the eigenmatrices is the identity;

    Σ_{i=1}^m s_i w_i^T = I        (1524)

A.5.1 Symmetric matrix diagonalization
The set of normal matrices is, precisely, that set of all real matrices having a complete orthonormal set of eigenvectors; [386, §8.1] [327, prob.10.2.31] id est, any matrix X for which XX^T = X^T X ; [155, §7.1.3] [322, p.3] e.g., orthogonal and circulant matrices [164]. All normal matrices are diagonalizable.
A symmetric matrix is a special normal matrix whose eigenvalues Λ must be real^{A.14} and whose eigenvectors S can be chosen to make a real orthonormal set; [327, §6.4] [325, p.315] id est, for X∈ S^m

    X = SΛS^T = [ s_1 ··· s_m ] Λ [ s_1^T ; ... ; s_m^T ] = Σ_{i=1}^m λ_i s_i s_i^T        (1525)

where δ²(Λ) = Λ∈ S^m (§A.1) and S^{-1} = S^T∈ R^{m×m} (orthogonal matrix, §B.5) because of symmetry: SΛS^{-1} = S^{-T}ΛS^T . By the 0 eigenvalues theorem (A.7.3.0.1),

    R{s_i | λ_i ≠ 0} = R(X) = R(X^T)
    R{s_i | λ_i = 0} = N(X^T) = N(X)        (1526)
A.5.1.1 Diagonal matrix diagonalization
Because arrangement of eigenvectors and their corresponding eigenvalues is arbitrary, we almost always arrange eigenvalues in nonincreasing order as is the convention for singular value decomposition. Then to diagonalize a symmetric matrix that is already a diagonal matrix, orthogonal matrix S becomes a permutation matrix.

A.5.1.2 Invertible symmetric matrix
When symmetric matrix X∈ S^m is nonsingular (invertible), then its inverse (obtained by inverting eigenvalues in (1525)) is also symmetric:

    X^{-1} = SΛ^{-1}S^T ∈ S^m        (1527)

A.5.1.3 Positive semidefinite matrix square root
When X∈ S^m_+ , its unique positive semidefinite matrix square root is defined

    √X ≜ S√Λ S^T ∈ S^m_+        (1528)

where the square root of nonnegative diagonal matrix √Λ is taken entrywise and positive. Then X = √X √X .
A.14 Proof. Suppose λ_i is an eigenvalue corresponding to eigenvector s_i of real A = A^T . Then s_i^H A s_i = s_i^T A s_i* (by transposition) ⇒ s_i*^T λ_i s_i = s_i^T λ_i* s_i* because (A s_i)* = (λ_i s_i)* by assumption. So we have λ_i‖s_i‖² = λ_i*‖s_i‖² . There is no converse. ∎
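The square-root construction (1528) translates directly into NumPy; a minimal sketch (eigenvalues clipped at zero only to guard against roundoff):

```python
# Matrix square root (1528): sqrt(X) = S sqrt(Lambda) S^T for X in S^m_+.
import numpy as np

rng = np.random.default_rng(9)
m = 4
W = rng.standard_normal((m, m))
X = W @ W.T                                     # positive semidefinite

lam, S = np.linalg.eigh(X)                      # X = S diag(lam) S^T
rootX = S @ np.diag(np.sqrt(np.clip(lam, 0, None))) @ S.T
squares_back = np.allclose(rootX @ rootX, X)    # sqrt(X) sqrt(X) = X
root_psd = np.linalg.eigvalsh(rootX)[0] >= -1e-9
```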
A.6 Singular value decomposition, SVD

A.6.1 Compact SVD
[155, §2.5.4] For any A∈ R^{m×n}

    A = UΣQ^T = [ u_1 ··· u_η ] Σ [ q_1^T ; ... ; q_η^T ] = Σ_{i=1}^η σ_i u_i q_i^T        (1529)

    U∈ R^{m×η} ,  Σ∈ R^{η×η} ,  Q∈ R^{n×η}

where U and Q are always skinny-or-square, each having orthonormal columns, and where

    η ≜ min{m, n}        (1530)

Square matrix Σ is diagonal (§A.1.1)

    δ²(Σ) = Σ∈ R^{η×η}        (1531)

holding the singular values {σ_i∈ R} of A , which are always arranged in nonincreasing order by convention and are related to eigenvalues λ by^{A.15}

    σ(A)_i = σ(A^T)_i = √λ(A^T A)_i = √λ(AA^T)_i = λ(√(A^T A))_i = λ(√(AA^T))_i > 0 ,   1 ≤ i ≤ ρ
    σ(A)_i = 0 ,   ρ < i ≤ η        (1532)

of which the last η−ρ are 0 ,^{A.16} where

    ρ ≜ rank A = rank Σ        (1533)

A point sometimes lost: any real matrix may be decomposed in terms of its real singular values σ(A)∈ R^η and real matrices U and Q as in (1529), where [155, §2.5.3]

    R{u_i | σ_i ≠ 0} = R(A)
    R{u_i | σ_i = 0} ⊆ N(A^T)
    R{q_i | σ_i ≠ 0} = R(A^T)        (1534)
    R{q_i | σ_i = 0} ⊆ N(A)

A.15 When matrix A is normal, σ(A) = |λ(A)| . [386, §8.1]
A.16 For η = n , σ(A) = √λ(A^T A) = λ(√(A^T A)) . For η = m , σ(A) = √λ(AA^T) = λ(√(AA^T)) .
A.6.2 Subcompact SVD
Some authors allow only nonzero singular values. In that case the compact decomposition can be made smaller; it can be redimensioned in terms of rank ρ because, for any A∈ R^{m×n}

    ρ = rank A = rank Σ = max{i∈{1... η} | σ_i ≠ 0} ≤ η        (1535)

There are η singular values. For any flavor SVD, rank is equivalent to the number of nonzero singular values on the main diagonal of Σ . Now

    A = UΣQ^T = [ u_1 ··· u_ρ ] Σ [ q_1^T ; ... ; q_ρ^T ] = Σ_{i=1}^ρ σ_i u_i q_i^T        (1536)

    U∈ R^{m×ρ} ,  Σ∈ R^{ρ×ρ} ,  Q∈ R^{n×ρ}

where the main diagonal of diagonal matrix Σ has no 0 entries, and

    R{u_i} = R(A)
    R{q_i} = R(A^T)        (1537)

A.6.3 Full SVD
Another common and useful expression of the SVD makes U and Q square, making the decomposition larger than compact SVD. Completing the nullspace bases in U and Q from (1534) provides what is called the full singular value decomposition of A∈ R^{m×n} [325, App.A]. Orthonormal matrices U and Q become orthogonal matrices (§B.5):

    R{u_i | σ_i ≠ 0} = R(A)
    R{u_i | σ_i = 0} = N(A^T)
    R{q_i | σ_i ≠ 0} = R(A^T)        (1538)
    R{q_i | σ_i = 0} = N(A)
For any matrix A having rank ρ (= rank Σ)

    A = UΣQ^T = [ u_1 ··· u_m ] Σ [ q_1^T ; ... ; q_n^T ] = Σ_{i=1}^η σ_i u_i q_i^T
      = [ (m×ρ basis R(A))  (m×(m−ρ) basis N(A^T)) ] δ([σ_1 σ_2 ···]) [ (n×ρ basis R(A^T))^T ; (n×(n−ρ) basis N(A))^T ]        (1539)

    U∈ R^{m×m} ,  Σ∈ R^{m×n} ,  Q∈ R^{n×n}

where upper limit of summation η is defined in (1530). Matrix Σ is no longer necessarily square, now padded with respect to (1531) by m−η zero rows or n−η zero columns; the nonincreasingly ordered (possibly 0) singular values appear along its main diagonal as for compact SVD (1532).
An important geometrical interpretation of SVD is given in Figure 152 for m = n = 2 : The image of the unit sphere under any m×n matrix multiplication is an ellipse. Considering the three factors of the SVD separately, note that Q^T is a pure rotation of the circle. Figure 152 shows how the axes q_1 and q_2 are first rotated by Q^T to coincide with the coordinate axes. Second, the circle is stretched by Σ in the directions of the coordinate axes to form an ellipse. The third step rotates the ellipse by U into its final position. Note how q_1 and q_2 are rotated to end up as u_1 and u_2 , the principal axes of the final ellipse. A direct calculation shows that Aq_j = σ_j u_j . Thus q_j is first rotated to coincide with the j th coordinate axis, stretched by a factor σ_j , and then rotated to point in the direction of u_j . All of this is beautifully illustrated for 2×2 matrices by the Matlab code eigshow.m (see [328]).
A direct consequence of the geometric interpretation is that the largest singular value σ_1 measures the "magnitude" of A (its 2-norm):

    ‖A‖_2 = sup_{‖x‖_2 = 1} ‖Ax‖_2 = σ_1        (1540)

This means that ‖A‖_2 is the length of the longest principal semiaxis of the ellipse.
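Equality (1540) is what NumPy's 2-norm computes; a hedged sketch sampling unit vectors as a bound check (illustration, not proof):

```python
# (1540): the matrix 2-norm equals the largest singular value sigma_1.
import numpy as np

rng = np.random.default_rng(10)
A = rng.standard_normal((3, 5))

sigma1 = np.linalg.svd(A, compute_uv=False)[0]
norm_matches = np.isclose(np.linalg.norm(A, 2), sigma1)

# no unit vector maps to anything longer than sigma_1
X = rng.standard_normal((5, 300))
X /= np.linalg.norm(X, axis=0)
bounded = np.all(np.linalg.norm(A @ X, axis=0) <= sigma1 + 1e-9)
```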
Figure 152: Geometrical interpretation of full SVD [265]: Image of circle {x∈ R² | ‖x‖_2 = 1} under matrix multiplication Ax is, in general, an ellipse. For the example illustrated, U ≜ [u_1 u_2]∈ R^{2×2} , Q ≜ [q_1 q_2]∈ R^{2×2} . (Figure panels omitted.)
Expressions for U , Q , and Σ follow readily from (1539),

    AA^T U = UΣΣ^T   and   A^T A Q = QΣ^T Σ        (1541)

demonstrating that the columns of U are the eigenvectors of AA^T and the columns of Q are the eigenvectors of A^T A . −Muller, Magaia, & Herbst [265]
A.6.4 Pseudoinverse by SVD
Matrix pseudoinverse (§E) is nearly synonymous with singular value decomposition because of the elegant expression, given A = UΣQ^T∈ R^{m×n}

    A† = QΣ†^T U^T ∈ R^{n×m}        (1542)

that applies to all three flavors of SVD, where Σ† simply inverts nonzero entries of matrix Σ .
Given symmetric matrix A∈ S^n and its diagonalization A = SΛS^T (§A.5.1), its pseudoinverse simply inverts all nonzero eigenvalues:

    A† = SΛ†S^T ∈ S^n        (1543)
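A minimal NumPy sketch of (1542), on a generically full-rank matrix so every singular value is safely invertible:

```python
# Pseudoinverse by SVD (1542): A^+ = Q Sigma^+ U^T, inverting the nonzero
# singular values of the compact decomposition A = U Sigma Q^T.
import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((5, 3))               # generically full rank

U, s, Qt = np.linalg.svd(A, full_matrices=False)
A_pinv = Qt.T @ np.diag(1.0 / s) @ U.T        # all s nonzero here
agrees = np.allclose(A_pinv, np.linalg.pinv(A))
```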
A.6.5 SVD of symmetric matrices
From (1532) and (1528) for A = A^T

    σ(A)_i = √λ(A²)_i = λ(√(A²))_i = |λ(A)_i| > 0 ,   1 ≤ i ≤ ρ
    σ(A)_i = 0 ,   ρ < i ≤ η        (1544)

A.6.5.0.1 Definition. Step function. (confer §4.3.2.0.1)
Define the signum-like quasilinear function ψ : R^n → R^n that takes value 1 corresponding to a 0-valued entry in its argument:

    ψ(a) ≜ [ lim_{x_i→a_i} x_i/|x_i| = { 1 ,  a_i ≥ 0
                                         −1 , a_i < 0 } ,  i = 1... n ] ∈ R^n        (1545)
△
Eigenvalue signs of a symmetric matrix having diagonalization A = SΛS^T (1525) can be absorbed either into real U or real Q from the full SVD; [343, p.34] (confer §C.4.2.1)

    A = SΛS^T = S δ(ψ(δ(Λ))) |Λ| S^T ≜ UΣQ^T ∈ S^n        (1546)

or

    A = SΛS^T = S |Λ| δ(ψ(δ(Λ))) S^T ≜ UΣQ^T ∈ S^n        (1547)

where matrix of singular values Σ = |Λ| denotes entrywise absolute value of diagonal eigenvalue matrix Λ .
A.7 Zeros

A.7.1 norm zero
For any given norm, by definition,

    ‖x‖_ℓ = 0 ⇔ x = 0        (1548)

Consequently, a generally nonconvex constraint in x like ‖Ax − b‖ = κ becomes convex when κ = 0 .

A.7.2 0 entry
If a positive semidefinite matrix A = [A_ij]∈ R^{n×n} has a 0 entry A_ii on its main diagonal, then A_ij + A_ji = 0 ∀j . [266, §1.3.1]
Any symmetric positive semidefinite matrix having a 0 entry on its main diagonal must be 0 along the entire row and column to which that 0 entry belongs. [155, §4.2.8] [198, §7.1, prob.2]
A.7.3 0 eigenvalues theorem
This theorem is simple, powerful, and widely applicable:

A.7.3.0.1 Theorem. Number of 0 eigenvalues.
For any matrix A∈ R^{m×n}

    rank(A) + dim N(A) = n        (1549)

by conservation of dimension. [198, §0.4.4]
For any square matrix A∈ R^{m×m} , the number of 0 eigenvalues is at least equal to dim N(A)

    dim N(A) ≤ number of 0 eigenvalues ≤ m        (1550)

while the eigenvectors corresponding to those 0 eigenvalues belong to N(A). [325, §5.1]^{A.17}
For diagonalizable matrix A (§A.5), the number of 0 eigenvalues is precisely dim N(A) while the corresponding eigenvectors span N(A). The real and imaginary parts of the eigenvectors remaining span R(A).
(TRANSPOSE.)
Likewise, for any matrix A∈ R^{m×n}

    rank(A^T) + dim N(A^T) = m        (1551)

For any square A∈ R^{m×m} , the number of 0 eigenvalues is at least equal to dim N(A^T) = dim N(A) while the left-eigenvectors (eigenvectors of A^T) corresponding to those 0 eigenvalues belong to N(A^T).
For diagonalizable A , the number of 0 eigenvalues is precisely dim N(A^T) while the corresponding left-eigenvectors span N(A^T). The real and imaginary parts of the left-eigenvectors remaining span R(A^T). ⋄
Proof. First we show, for a diagonalizable matrix, the number of 0 eigenvalues is precisely the dimension of its nullspace while the eigenvectors corresponding to those 0 eigenvalues span the nullspace:
Any diagonalizable matrix A∈ R^{m×m} must possess a complete set of linearly independent eigenvectors. If A is full-rank (invertible), then all m = rank(A) eigenvalues are nonzero. [325, §5.1]

A.17 We take as given the well-known fact that the number of 0 eigenvalues cannot be less than dimension of the nullspace. We offer an example of the converse:

    A = [ 1 0 1 0
          0 0 1 0
          0 0 0 0
          1 0 0 0 ]

dim N(A) = 2 , λ(A) = [0 0 0 1]^T ; three eigenvectors in the nullspace but only two are independent. The right-hand side of (1550) is tight for nonzero matrices; e.g., (§B.1) dyad uv^T∈ R^{m×m} has m 0-eigenvalues when u∈ v^⊥ .
Suppose rank(A) < m . Then dim N(A) = m − rank(A) . Thus there is a set of m − rank(A) linearly independent vectors spanning N(A) . Each of those can be an eigenvector associated with a 0 eigenvalue because A is diagonalizable ⇔ ∃ m linearly independent eigenvectors. [325, §5.2] Eigenvectors of a real matrix corresponding to 0 eigenvalues must be real.^{A.18} Thus A has at least m − rank(A) eigenvalues equal to 0.
Now suppose A has more than m − rank(A) eigenvalues equal to 0. Then there are more than m − rank(A) linearly independent eigenvectors associated with 0 eigenvalues, and each of those eigenvectors must be in N(A) . Thus there are more than m − rank(A) linearly independent vectors in N(A) ; a contradiction.
Diagonalizable A therefore has rank(A) nonzero eigenvalues and exactly m − rank(A) eigenvalues equal to 0 whose corresponding eigenvectors span N(A) .
By similar argument, the left-eigenvectors corresponding to 0 eigenvalues span N(A^T) .
Next we show when A is diagonalizable, the real and imaginary parts of its eigenvectors (corresponding to nonzero eigenvalues) span R(A) :
The (right-)eigenvectors of a diagonalizable matrix A∈ R^{m×m} are linearly independent if and only if the left-eigenvectors are. So, matrix A has a representation in terms of its right- and left-eigenvectors; from the diagonalization (1513), assuming 0 eigenvalues are ordered last,

    A = Σ_{i=1}^m λ_i s_i w_i^T = Σ_{i=1, λ_i≠0}^{k ≤ m} λ_i s_i w_i^T        (1552)

From the linearly independent dyads theorem (B.1.1.0.2), the dyads {s_i w_i^T} must be independent because each set of eigenvectors are; hence rank A = k , the number of nonzero eigenvalues. Complex eigenvectors and eigenvalues are common for real matrices, and must come in complex conjugate pairs for the summation to remain real. Assume that conjugate pairs of eigenvalues appear in sequence. Given any particular conjugate pair from (1552), we get the partial summation

    λ_i s_i w_i^T + λ_i* s_i* w_i*^T = 2 re(λ_i s_i w_i^T)
                                     = 2( re s_i re(λ_i w_i^T) − im s_i im(λ_i w_i^T) )        (1553)
A.18 Proof. Let * denote complex conjugation. Suppose A = A* and A s_i = 0 . Then s_i = s_i* ⇒ A s_i = A s_i* ⇒ A s_i* = 0 . Conversely, A s_i* = 0 ⇒ A s_i = A s_i* ⇒ s_i = s_i* . ∎
where^{A.19} λ_i* ≜ λ_{i+1} , s_i* ≜ s_{i+1} , and w_i* ≜ w_{i+1} . Then (1552) is equivalently written

    A = 2 Σ_{i : λ_{2i}∈C, λ_{2i}≠0} ( re s_{2i} re(λ_{2i} w_{2i}^T) − im s_{2i} im(λ_{2i} w_{2i}^T) ) + Σ_{j : λ_j∈R, λ_j≠0} λ_j s_j w_j^T        (1554)
The summation (1554) shows: A is a linear combination of real and imaginary parts of its (right-)eigenvectors corresponding to nonzero eigenvalues. The k vectors {re s_i∈ R^m , im s_i∈ R^m | λ_i ≠ 0 , i∈ {1... m}} must therefore span the range of diagonalizable matrix A .
The argument is similar regarding the span of the left-eigenvectors. ∎
A.7.4 0 trace and matrix product
For X, A∈ R^{M×N}_+ (39)

    tr(X^T A) = 0 ⇔ X ◦ A = A ◦ X = 0        (1555)

For X, A∈ S^M_+ [33, §2.6.1, exer.2.8] [352, §3.1]

    tr(XA) = 0 ⇔ XA = AX = 0        (1556)
Proof. (⇐) Suppose XA = AX = 0 . Then tr(XA) = 0 is obvious.
(⇒) Suppose tr(XA) = 0 . tr(XA) = tr(√A X √A) whose argument is positive semidefinite by Corollary A.3.1.0.5. Trace of any square matrix is equivalent to the sum of its eigenvalues. Eigenvalues of a positive semidefinite matrix can total 0 if and only if each and every nonnegative eigenvalue is 0. The only positive semidefinite matrix, having all 0 eigenvalues, resides at the origin; (confer (1580)) id est,

    √A X √A = (√X √A)^T √X √A = 0        (1557)

implying √X √A = 0 which in turn implies √X(√X √A)√A = XA = 0 . Arguing similarly yields AX = 0 . ∎
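Identity (1556) can be illustrated by constructing PSD matrices with mutually orthogonal ranges, a case in which the trace condition holds by design (NumPy sketch, illustration only):

```python
# (1556): for PSD X and A, tr(XA) = 0 forces XA = AX = 0.
# Build X and A positive semidefinite with mutually orthogonal ranges.
import numpy as np

rng = np.random.default_rng(12)
M = 5
Q, _ = np.linalg.qr(rng.standard_normal((M, M)))  # orthonormal basis
X = Q[:, :2] @ Q[:, :2].T        # PSD projector onto a 2-dim subspace
A = Q[:, 2:] @ Q[:, 2:].T        # PSD projector onto its complement

trace_zero = np.isclose(np.trace(X @ A), 0)
products_zero = np.allclose(X @ A, 0) and np.allclose(A @ X, 0)
```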
Diagonalizable matrices A and X are simultaneously diagonalizable if and only if they are commutative under multiplication; [198, §1.3.12] id est, iff they share a complete set of eigenvectors.

A.19 Complex conjugate of w is denoted w* . Conjugate transpose is denoted w^H = w*^T .
A.7.4.0.1 Example. An equivalence in nonisomorphic spaces.
Identity (1556) leads to an unusual equivalence relating convex geometry to traditional linear algebra: the convex sets, given A ≽ 0

    {X | 〈X, A〉 = 0} ∩ {X ≽ 0} ≡ {X | N(X) ⊇ R(A)} ∩ {X ≽ 0}        (1558)

(one expressed in terms of a hyperplane, the other in terms of nullspace and range) are equivalent only when symmetric matrix A is positive semidefinite. We might apply this equivalence to the geometric center subspace, for example,

    S^M_c = {Y∈ S^M | Y1 = 0}
          = {Y∈ S^M | N(Y) ⊇ 1} = {Y∈ S^M | R(Y) ⊆ N(1^T)}        (1957)

from which we derive (confer (970))

    S^M_c ∩ S^M_+ ≡ {X ≽ 0 | 〈X, 11^T〉 = 0}        (1559)
∎
A.7.5 Zero definite
The domain over which an arbitrary real matrix A is zero definite can exceed its left and right nullspaces. For any positive semidefinite matrix A∈ R^{M×M} (for A + A^T ≽ 0)

    {x | x^T A x = 0} = N(A + A^T)        (1560)

because ∃R A + A^T = R^T R , ‖Rx‖ = 0 ⇔ Rx = 0 , and N(A + A^T) = N(R) . Then given any particular vector x_p , x_p^T A x_p = 0 ⇔ x_p∈ N(A + A^T) . For any positive definite matrix A (for A + A^T ≻ 0)

    {x | x^T A x = 0} = 0        (1561)

Further, [386, §3.2, prob.5]

    {x | x^T A x = 0} = R^M ⇔ A^T = −A        (1562)

while

    {x | x^H A x = 0} = C^M ⇔ A = 0        (1563)
The positive semidefinite matrix

    A = [ 1 2 ; 0 1 ]        (1564)

for example, has no nullspace. Yet

    {x | x^T A x = 0} = {x | 1^T x = 0} ⊂ R²        (1565)

which is the nullspace of the symmetrized matrix. Symmetric matrices are not spared from the excess; videlicet,

    B = [ 1 2 ; 2 1 ]        (1566)

has eigenvalues {−1, 3} , no nullspace, but is zero definite on^{A.20}

    X ≜ {x∈ R² | x_2 = (−2 ± √3)x_1}        (1567)
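The concrete matrix B of (1566) makes a direct NumPy check possible:

```python
# Matrix B of (1566) is indefinite with no nullspace, yet x^T B x = 0
# exactly on the two lines x_2 = (-2 +/- sqrt(3)) x_1 of (1567).
import numpy as np

B = np.array([[1.0, 2.0], [2.0, 1.0]])
full_rank = np.linalg.matrix_rank(B) == 2          # no nullspace

on_lines = all(
    np.isclose(np.array([1.0, t]) @ B @ np.array([1.0, t]), 0)
    for t in (-2 + np.sqrt(3), -2 - np.sqrt(3)))
```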
A.7.5.0.1 Proposition. (Sturm/Zhang) Dyad-decompositions. [331, §5.2]
Let positive semidefinite matrix X∈ S^M_+ have rank ρ . Then given symmetric matrix A∈ S^M , 〈A, X〉 = 0 if and only if there exists a dyad-decomposition

    X = Σ_{j=1}^ρ x_j x_j^T        (1568)

satisfying

    〈A, x_j x_j^T〉 = 0 for each and every j∈ {1... ρ}        (1569)
⋄

The dyad-decomposition of X proposed is generally not that obtained from a standard diagonalization by eigenvalue decomposition, unless ρ = 1 or the given matrix A is simultaneously diagonalizable (§A.7.4) with X . That means, elemental dyads in decomposition (1568) constitute a generally nonorthogonal set. Sturm & Zhang give a simple procedure for constructing the dyad-decomposition [Wıκımization]; matrix A may be regarded as a parameter.
A.20 These two lines represent the limit in the union of two generally distinct hyperbolae; id est, for matrix B and set X as defined

    lim_{ε→0⁺} {x∈ R² | x^T B x = ε} = X
A.7.5.0.2 Example. Dyad.
The dyad uv^T∈ R^{M×M} (§B.1) is zero definite on all x for which either x^T u = 0 or x^T v = 0 ;

    {x | x^T u v^T x = 0} = {x | x^T u = 0} ∪ {x | v^T x = 0}        (1570)

id est, on u^⊥ ∪ v^⊥ . Symmetrizing the dyad does not change the outcome:

    {x | x^T (uv^T + vu^T)x/2 = 0} = {x | x^T u = 0} ∪ {x | v^T x = 0}        (1571)