v2009.01.01 - Convex Optimization
v2009.01.01 - Convex Optimization v2009.01.01 - Convex Optimization
620 APPENDIX D. MATRIX CALCULUS υ ∆ = ⎡ ⎢ ⎣ ∇ x f(α) →∇ xf(α) 1df(α) 2 ⎤ ⎥ ⎦ f(α + t y) υ T (f(α),α) ✡ ✡✡✡✡✡✡✡✡✡ f(x) ∂H Figure 136: Convex quadratic bowl in R 2 ×R ; f(x)= x T x : R 2 → R versus x on some open disc in R 2 . Plane slice ∂H is perpendicular to function domain. Slice intersection with domain connotes bidirectional vector y . Slope of tangent line T at point (α , f(α)) is value of directional derivative ∇ x f(α) T y (1711) at α in slice direction y . Negative gradient −∇ x f(x)∈ R 2 is direction of steepest descent. [327] [190,15.6] [128] When vector υ ∈[ R 3 entry ] υ 3 is half directional derivative in gradient direction at α υ1 and when = ∇ υ x f(α) , then −υ points directly toward bowl bottom. 2 [239,2.1,5.4.5] [31,6.3.1] which is simplest. In case of a real function g(X) : R K×L →R In case g(X) : R K →R →Y dg(X) = tr ( ∇g(X) T Y ) (1708) →Y dg(X) = ∇g(X) T Y (1711) Unlike gradient, directional derivative does not expand dimension; directional derivative (1686) retains the dimensions of g . The derivative with respect to t makes the directional derivative resemble ordinary calculus →Y (D.2); e.g., when g(X) is linear, dg(X) = g(Y ). [215,7.2]
D.1. DIRECTIONAL DERIVATIVE, TAYLOR SERIES 621 D.1.4.1 Interpretation of directional derivative In the case of any differentiable real function g(X) : R K×L →R , the directional derivative of g(X) at X in any direction Y yields the slope of g along the line {X+ t Y | t ∈ R} through its domain evaluated at t = 0. For higher-dimensional functions, by (1683), this slope interpretation can be applied to each entry of the directional derivative. Figure 136, for example, shows a plane slice of a real convex bowl-shaped function f(x) along a line {α + t y | t ∈ R} through its domain. The slice reveals a one-dimensional real function of t ; f(α + t y). The directional derivative at x = α in direction y is the slope of f(α + t y) with respect to t at t = 0. In the case of a real function having vector argument h(X) : R K →R , its directional derivative in the normalized direction of its gradient is the gradient magnitude. (1711) For a real function of real variable, the directional derivative evaluated at any point in the function domain is just the slope of that function there scaled by the real direction. (confer3.1.8) Directional derivative generalizes our one-dimensional notion of derivative to a multidimensional domain. When direction Y coincides with a member of the standard unit Cartesian basis e k e T l (53), then a single partial derivative ∂g(X)/∂X kl is obtained from directional derivative (1684); such is each entry of gradient ∇g(X) in equalities (1708) and (1711), for example. D.1.4.1.1 Theorem. Directional derivative optimality condition. [215,7.4] Suppose f(X) : R K×L →R is minimized on convex set C ⊆ R p×k by X ⋆ , and the directional derivative of f exists there. Then for all X ∈ C D.1.4.1.2 Example. Simple bowl. Bowl function (Figure 136) →X−X ⋆ df(X) ≥ 0 (1687) f(x) : R K → R ∆ = (x − a) T (x − a) − b (1688) has function offset −b ∈ R , axis of revolution at x = a , and positive definite Hessian (1638) everywhere in its domain (an open hyperdisc in R K ); id est, strictly convex quadratic f(x) has unique global minimum equal to −b at ⋄
- Page 569 and 570: A.7. ZEROS 569 Given symmetric matr
- Page 571 and 572: A.7. ZEROS 571 (TRANSPOSE.) Likewis
- Page 573 and 574: A.7. ZEROS 573 For X,A∈ S M + [31
- Page 575 and 576: A.7. ZEROS 575 A.7.5.0.1 Propositio
- Page 577 and 578: Appendix B Simple matrices Mathemat
- Page 579 and 580: B.1. RANK-ONE MATRIX (DYAD) 579 R(v
- Page 581 and 582: B.1. RANK-ONE MATRIX (DYAD) 581 ran
- Page 583 and 584: B.2. DOUBLET 583 R([u v ]) R(Π)= R
- Page 585 and 586: B.3. ELEMENTARY MATRIX 585 If λ
- Page 587 and 588: B.4. AUXILIARY V -MATRICES 587 the
- Page 589 and 590: B.4. AUXILIARY V -MATRICES 589 18.
- Page 591 and 592: B.5. ORTHOGONAL MATRIX 591 B.5 Orth
- Page 593 and 594: B.5. ORTHOGONAL MATRIX 593 Figure 1
- Page 595 and 596: Appendix C Some analytical optimal
- Page 597 and 598: C.2. TRACE, SINGULAR AND EIGEN VALU
- Page 599 and 600: C.2. TRACE, SINGULAR AND EIGEN VALU
- Page 601 and 602: C.2. TRACE, SINGULAR AND EIGEN VALU
- Page 603 and 604: C.3. ORTHOGONAL PROCRUSTES PROBLEM
- Page 605 and 606: C.4. TWO-SIDED ORTHOGONAL PROCRUSTE
- Page 607 and 608: C.4. TWO-SIDED ORTHOGONAL PROCRUSTE
- Page 609 and 610: Appendix D Matrix calculus From too
- Page 611 and 612: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 613 and 614: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 615 and 616: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 617 and 618: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 619: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 623 and 624: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 625 and 626: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 627 and 628: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 629 and 630: D.1. DIRECTIONAL DERIVATIVE, TAYLOR
- Page 631 and 632: D.2. TABLES OF GRADIENTS AND DERIVA
- Page 633 and 634: D.2. TABLES OF GRADIENTS AND DERIVA
- Page 635 and 636: D.2. TABLES OF GRADIENTS AND DERIVA
- Page 637 and 638: D.2. TABLES OF GRADIENTS AND DERIVA
- Page 639 and 640: Appendix E Projection For any A∈
- Page 641 and 642: 641 U T = U † for orthonormal (in
- Page 643 and 644: E.1. IDEMPOTENT MATRICES 643 where
- Page 645 and 646: E.1. IDEMPOTENT MATRICES 645 order,
- Page 647 and 648: E.1. IDEMPOTENT MATRICES 647 When t
- Page 649 and 650: E.3. SYMMETRIC IDEMPOTENT MATRICES
- Page 651 and 652: E.3. SYMMETRIC IDEMPOTENT MATRICES
- Page 653 and 654: E.3. SYMMETRIC IDEMPOTENT MATRICES
- Page 655 and 656: E.5. PROJECTION EXAMPLES 655 E.4.0.
- Page 657 and 658: E.5. PROJECTION EXAMPLES 657 a ∗
- Page 659 and 660: E.5. PROJECTION EXAMPLES 659 E.5.0.
- Page 661 and 662: E.6. VECTORIZATION INTERPRETATION,
- Page 663 and 664: E.6. VECTORIZATION INTERPRETATION,
- Page 665 and 666: E.6. VECTORIZATION INTERPRETATION,
- Page 667 and 668: E.6. VECTORIZATION INTERPRETATION,
- Page 669 and 670: E.7. ON VECTORIZED MATRICES OF HIGH
D.1. DIRECTIONAL DERIVATIVE, TAYLOR SERIES 621<br />
D.1.4.1<br />
Interpretation of directional derivative<br />
In the case of any differentiable real function g(X) : R K×L →R , the<br />
directional derivative of g(X) at X in any direction Y yields the slope of<br />
g along the line {X+ t Y | t ∈ R} through its domain evaluated at t = 0.<br />
For higher-dimensional functions, by (1683), this slope interpretation can be<br />
applied to each entry of the directional derivative.<br />
Figure 136, for example, shows a plane slice of a real convex bowl-shaped<br />
function f(x) along a line {α + t y | t ∈ R} through its domain. The slice<br />
reveals a one-dimensional real function of t ; f(α + t y). The directional<br />
derivative at x = α in direction y is the slope of f(α + t y) with respect<br />
to t at t = 0. In the case of a real function having vector argument<br />
h(X) : R K →R , its directional derivative in the normalized direction of its<br />
gradient is the gradient magnitude. (1711) For a real function of real variable,<br />
the directional derivative evaluated at any point in the function domain is just<br />
the slope of that function there scaled by the real direction. (confer3.1.8)<br />
Directional derivative generalizes our one-dimensional notion of derivative<br />
to a multidimensional domain. When direction Y coincides with a member<br />
of the standard unit Cartesian basis e k e T l (53), then a single partial derivative<br />
∂g(X)/∂X kl is obtained from directional derivative (1684); such is each entry<br />
of gradient ∇g(X) in equalities (1708) and (1711), for example.<br />
D.1.4.1.1 Theorem. Directional derivative optimality condition.<br />
[215,7.4] Suppose f(X) : R K×L →R is minimized on convex set C ⊆ R p×k<br />
by X ⋆ , and the directional derivative of f exists there. Then for all X ∈ C<br />
D.1.4.1.2 Example. Simple bowl.<br />
Bowl function (Figure 136)<br />
→X−X ⋆<br />
df(X) ≥ 0 (1687)<br />
f(x) : R K → R ∆ = (x − a) T (x − a) − b (1688)<br />
has function offset −b ∈ R , axis of revolution at x = a , and positive definite<br />
Hessian (1638) everywhere in its domain (an open hyperdisc in R K ); id est,<br />
strictly convex quadratic f(x) has unique global minimum equal to −b at<br />
⋄