v2009.01.01 - Convex Optimization
APPENDIX D. MATRIX CALCULUS

In the case $g(X) : \mathbb{R}^{K} \to \mathbb{R}$ has vector argument, they further simplify:

$$\overset{\to Y}{dg}(X) = \nabla g(X)^{T} Y \tag{1711}$$

$$\overset{\to Y}{dg^{2}}(X) = Y^{T} \nabla^{2} g(X)\, Y \tag{1712}$$

$$\overset{\to Y}{dg^{3}}(X) = \nabla_{X} \big( Y^{T} \nabla^{2} g(X)\, Y \big)^{T} Y \tag{1713}$$

and so on.

D.1.7 Taylor series

Series expansions of the differentiable matrix-valued function $g(X)$, of matrix argument, were given earlier in (1685) and (1706). Assuming $g(X)$ has continuous first-, second-, and third-order gradients over the open set $\operatorname{dom} g$, then for $X \in \operatorname{dom} g$ and any $Y \in \mathbb{R}^{K \times L}$ the complete Taylor series on some open interval of $\mu \in \mathbb{R}$ is expressed

$$g(X + \mu Y) = g(X) + \mu\, \overset{\to Y}{dg}(X) + \frac{1}{2!}\, \mu^{2}\, \overset{\to Y}{dg^{2}}(X) + \frac{1}{3!}\, \mu^{3}\, \overset{\to Y}{dg^{3}}(X) + o(\mu^{4}) \tag{1714}$$

or on some open interval of $\|Y\|_{2}$

$$g(Y) = g(X) + \overset{\to Y - X}{dg}(X) + \frac{1}{2!}\, \overset{\to Y - X}{dg^{2}}(X) + \frac{1}{3!}\, \overset{\to Y - X}{dg^{3}}(X) + o(\|Y\|^{4}) \tag{1715}$$

which are third-order expansions about $X$. The mean value theorem from calculus is what ensures finite order of the series. [190] [37, 1.1] [36, App.A.5] [173, 0.4]

D.1.7.0.1 Example. Inverse matrix function.
Say $g(Y) = Y^{-1}$. Let's find its Taylor series expansion about $X = I$. Continuing the table on page 632,

$$\overset{\to Y}{dg}(X) = \left. \frac{d}{dt} \right|_{t=0} g(X + tY) = -X^{-1} Y X^{-1} \tag{1716}$$

$$\overset{\to Y}{dg^{2}}(X) = \left. \frac{d^{2}}{dt^{2}} \right|_{t=0} g(X + tY) = 2\, X^{-1} Y X^{-1} Y X^{-1} \tag{1717}$$

$$\overset{\to Y}{dg^{3}}(X) = \left. \frac{d^{3}}{dt^{3}} \right|_{t=0} g(X + tY) = -6\, X^{-1} Y X^{-1} Y X^{-1} Y X^{-1} \tag{1718}$$
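The closed forms (1716) and (1717) can be sanity-checked by finite differences in t. The following is only a minimal sketch, not part of the text: it assumes 2x2 matrices stored as nested lists so that it needs no external library, and the helper names (`matmul`, `axpy`, `inv2`, and so on), the test point X, the direction Y, and the step h are all illustrative choices.

```python
# Finite-difference check of (1716) and (1717) for g(X) = X^{-1} on 2x2 matrices.
# All helper names and numeric values are illustrative assumptions.

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def axpy(a, A, B):
    """Return a*A + B entrywise."""
    return [[a * x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(a, A):
    """Return a*A entrywise."""
    return [[a * x for x in row] for row in A]

def inv2(A):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between A and B."""
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

X = [[2.0, 1.0], [0.5, 3.0]]    # arbitrary invertible point
Y = [[0.3, -0.2], [0.1, 0.4]]   # arbitrary direction
h = 1e-4                        # finite-difference step in t

g = inv2
Xi = inv2(X)                    # g(X) = X^{-1}
g_plus = g(axpy(h, Y, X))       # g(X + hY)
g_minus = g(axpy(-h, Y, X))     # g(X - hY)

# Central differences in t, evaluated at t = 0.
d1_numeric = scale(1 / (2 * h), axpy(-1.0, g_minus, g_plus))
d2_numeric = scale(1 / h**2, axpy(1.0, g_plus, axpy(-2.0, Xi, g_minus)))

XiY = matmul(Xi, Y)
d1_exact = scale(-1.0, matmul(XiY, Xi))              # right side of (1716)
d2_exact = scale(2.0, matmul(XiY, matmul(XiY, Xi)))  # right side of (1717)

err1 = max_abs_diff(d1_numeric, d1_exact)
err2 = max_abs_diff(d2_numeric, d2_exact)
print(err1, err2)
```

Both errors should be small relative to the entries of the derivatives. The second difference is more sensitive to rounding because it divides by h squared, so its error is expected to be noticeably larger than the first.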
Since $g(I) = I$, $(\mu = 1)$ for $\|Y\|_{2} < 1$

$$g(I + Y) = (I + Y)^{-1} = I - Y + Y^{2} - Y^{3} + \ldots \tag{1719}$$

If $Y$ is small, $(I + Y)^{-1} \approx I - Y$. (Had we instead set $g(Y) = (I + Y)^{-1}$, then the equivalent expansion would have been about $X = 0$.)

D.1.7.0.2 Exercise. log det. (confer [53, p.644])
Find the first three terms of a Taylor series expansion for $\log \det Y$. Specify an open interval over which the expansion holds in vicinity of $X$.

D.1.8 Correspondence of gradient to derivative

From the foregoing expressions for directional derivative, we derive a relationship between the gradient with respect to matrix $X$ and the derivative with respect to real variable $t$:

D.1.8.1 first-order

Removing from (1686) the evaluation at $t = 0$ (see footnote D.4), we find an expression for the directional derivative of $g(X)$ in direction $Y$ evaluated anywhere along a line $\{X + tY \mid t \in \mathbb{R}\}$ intersecting $\operatorname{dom} g$

$$\overset{\to Y}{dg}(X + tY) = \frac{d}{dt}\, g(X + tY) \tag{1720}$$

In the general case $g(X) : \mathbb{R}^{K \times L} \to \mathbb{R}^{M \times N}$, from (1679) and (1682) we find

$$\operatorname{tr}\!\big( \nabla_{X}\, g_{mn}(X + tY)^{T}\, Y \big) = \frac{d}{dt}\, g_{mn}(X + tY) \tag{1721}$$

which is valid at $t = 0$, of course, when $X \in \operatorname{dom} g$. In the important case of a real function $g(X) : \mathbb{R}^{K \times L} \to \mathbb{R}$, from (1708) we have simply

$$\operatorname{tr}\!\big( \nabla_{X}\, g(X + tY)^{T}\, Y \big) = \frac{d}{dt}\, g(X + tY) \tag{1722}$$

Footnote D.4. Justified by replacing $X$ with $X + tY$ in (1679)-(1681); beginning,

$$dg_{mn}(X + tY)\big|_{dX \to Y} = \sum_{k,\,l} \frac{\partial g_{mn}(X + tY)}{\partial X_{kl}}\, Y_{kl}$$
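The first-order correspondence (1722) can be checked numerically at a point with t not equal to 0. The sketch below is an illustration under stated assumptions rather than anything taken from the text: it uses the real function g(X) = tr(X^T X), whose gradient is 2X, on a 2x3 matrix held as nested lists, and the function names, the matrices X and Y, and the values of t and h are all hypothetical choices.

```python
# Check of (1722) at t != 0 for the real function g(X) = tr(X^T X),
# whose gradient is 2X. The function and all values are illustrative assumptions.

def g(X):
    """g(X) = tr(X^T X): the sum of squared entries of X."""
    return sum(x * x for row in X for x in row)

def grad_g(X):
    """Gradient of g with respect to X, which is 2X."""
    return [[2.0 * x for x in row] for row in X]

def line_point(X, Y, t):
    """Return X + t*Y entrywise."""
    return [[x + t * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def trace_inner(A, B):
    """tr(A^T B), i.e. the sum of entrywise products of A and B."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[1.0, -2.0, 0.5], [0.0, 3.0, 1.5]]    # a point in R^{2x3}
Y = [[0.2, 0.1, -0.4], [1.0, -0.3, 0.6]]   # a direction in R^{2x3}
t, h = 0.7, 1e-5                           # evaluation point on the line, step size

# Left side of (1722): tr( grad g(X + tY)^T Y ).
lhs = trace_inner(grad_g(line_point(X, Y, t)), Y)

# Right side of (1722): d/dt g(X + tY), approximated by a central difference.
rhs = (g(line_point(X, Y, t + h)) - g(line_point(X, Y, t - h))) / (2 * h)

print(lhs, rhs)
```

Because this g is quadratic along the line, the central difference carries no truncation error, so the two sides should agree up to floating-point rounding.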