Chapter 3 Geometry of convex functions - Meboo Publishing ...
3.3.0.0.1 Example. Projecting the origin on an affine subset, in 1-norm.

In (1844) we interpret least norm solution to linear system $Ax = b$ as orthogonal projection of the origin $\mathbf{0}$ on affine subset $\mathcal{A} = \{x\in\mathbb{R}^n \mid Ax=b\}$ where $A\in\mathbb{R}^{m\times n}$ is fat full-rank. Suppose, instead of the Euclidean metric, we use taxicab distance to do projection. Then the least 1-norm problem is stated, for $b\in\mathcal{R}(A)$

$$
\begin{array}{ll}
\underset{x}{\text{minimize}} & \|x\|_1\\
\text{subject to} & Ax = b
\end{array}
\tag{496}
$$

Optimal solution can be interpreted as an oblique projection on $\mathcal{A}$ simply because the Euclidean metric is not employed. This problem statement sometimes returns optimal $x^\star$ having minimum cardinality, which can be explained intuitively with reference to Figure 68: [19]

Projection of the origin, in 1-norm, on affine subset $\mathcal{A}$ is equivalent to maximization (in this case) of the 1-norm ball until it kisses $\mathcal{A}$; rather, a kissing point in $\mathcal{A}$ achieves the distance in 1-norm from the origin to $\mathcal{A}$. For the example illustrated ($m=1$, $n=3$), it appears that a vertex of the ball will be first to touch $\mathcal{A}$. 1-norm ball vertices in $\mathbb{R}^3$ represent nontrivial points of minimum cardinality 1, whereas edges represent cardinality 2, while relative interiors of facets represent maximum cardinality 3. By reorienting affine subset $\mathcal{A}$ so it were parallel to an edge or facet, it becomes evident as we expand or contract the ball that a kissing point is not necessarily unique.^3.8

The 1-norm ball in $\mathbb{R}^n$ has $2^n$ facets and $2n$ vertices.^3.9 For $n > 0$

$$
\mathcal{B}_1 = \{x\in\mathbb{R}^n \mid \|x\|_1 \leq 1\} = \operatorname{conv}\{\pm e_i\in\mathbb{R}^n,\ i=1\ldots n\}
\tag{497}
$$

is a vertex-description of the unit 1-norm ball. Maximization of the 1-norm ball until it kisses $\mathcal{A}$ is equivalent to minimization of the 1-norm ball until it no longer intersects $\mathcal{A}$. Then projection of the origin on affine subset $\mathcal{A}$ is

$$
\begin{array}{ll}
\underset{x\in\mathbb{R}^n}{\text{minimize}} & \|x\|_1\\
\text{subject to} & Ax = b
\end{array}
\quad\equiv\quad
\begin{array}{cl}
\underset{c\in\mathbb{R},\ x\in\mathbb{R}^n}{\text{minimize}} & c\\
\text{subject to} & x\in c\,\mathcal{B}_1\\
& Ax = b
\end{array}
\tag{498}
$$

where

$$
c\,\mathcal{B}_1 = \{[\,I\ \ {-I}\,]\,a \mid a^{\mathrm T}\mathbf{1}=c,\ a\succeq 0\}
\tag{499}
$$

^3.8 This is unlike the case for the Euclidean ball (1844) where minimum-distance projection on a convex set is unique (E.9); all kissable faces of the Euclidean ball are single points (vertices).

^3.9 The ∞-norm ball in $\mathbb{R}^n$ has $2n$ facets and $2^n$ vertices.
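The illustrated case ($m=1$, $n=3$) can be checked by hand. The following is a minimal sketch, not taken from the text: it assumes the standard Hölder bound $|b| = |a^{\mathrm T}x| \leq \|a\|_\infty \|x\|_1$, which gives the least 1-norm $c = |b|/\|a\|_\infty$ for a single equation $a^{\mathrm T}x = b$, attained at a scaled ball vertex $(b/a_j)\,e_j$ for any index $j$ with $|a_j| = \|a\|_\infty$. The function name `least_1norm_single_equation` is hypothetical.

```python
def least_1norm_single_equation(a, b):
    """Least 1-norm solutions of a^T x = b lying at vertices of the scaled ball c*B_1.

    Returns (c, vertices): c is the optimal 1-norm |b| / ||a||_inf, and
    vertices lists every cardinality-1 optimal solution (b / a_j) * e_j.
    """
    a_inf = max(abs(ai) for ai in a)
    if a_inf == 0:
        raise ValueError("a must be nonzero")
    c = abs(b) / a_inf
    vertices = []
    for j, aj in enumerate(a):
        if abs(aj) == a_inf:          # this vertex of c*B_1 touches the affine subset
            x = [0.0] * len(a)
            x[j] = b / aj
            vertices.append(x)
    return c, vertices


# Generic orientation: a single vertex kisses the affine subset.
c, sols = least_1norm_single_equation([1.0, 2.0, -4.0], 8.0)

# Affine subset parallel to an edge of the ball: two vertices kiss at once,
# so every point on the edge between them is also optimal (cardinality 2).
c_tie, sols_tie = least_1norm_single_equation([2.0, 2.0, 1.0], 4.0)

print(c, sols)
print(c_tie, sols_tie)
```

In the second call the two returned vertices span an edge of the scaled ball; their midpoint `[1.0, 1.0, 0.0]` is also feasible with 1-norm 2, which is the non-uniqueness described above.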
3.3. PRACTICAL NORM FUNCTIONS, ABSOLUTE VALUE

[Figure 69 plot omitted: two panels labeled "signed" and "positive"; horizontal axis m/n from 0 to 1, vertical axis k/m from 0 to 1. The signed panel corresponds to problem (496), the positive panel to problem (501).]

$$
\begin{array}{ll}
\underset{x}{\text{minimize}} & \|x\|_1\\
\text{subject to} & Ax = b
\end{array}
\tag{496}
$$

$$
\begin{array}{ll}
\underset{x}{\text{minimize}} & \|x\|_1\\
\text{subject to} & Ax = b\\
& x \succeq 0
\end{array}
\tag{501}
$$

Figure 69: Exact recovery transition: Respectively signed [114] [116] or positive [121] [119] [120] solutions $x$ to $Ax=b$ with sparsity $k$ below thick curve are recoverable. For Gaussian random matrix $A\in\mathbb{R}^{m\times n}$, thick curve demarcates phase transition in ability to find sparsest solution $x$ by linear programming. These results were empirically reproduced in [37].

[Figure 70 plot omitted: three panels labeled $f_2(x)$, $f_3(x)$, $f_4(x)$, each with a hatched histogram.]

Figure 70: Under 1-norm $f_2(x)$, histogram (hatched) of residual amplitudes $Ax-b$ exhibits predominant accumulation of zero-residuals. Nonnegatively constrained 1-norm $f_3(x)$ from (501) accumulates more zero-residuals than $f_2(x)$. Under norm $f_4(x)$ (not discussed), histogram would exhibit predominant accumulation of (nonzero) residuals at gradient discontinuities.
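Problems (496) and (501) can both be posed as linear programs. The sketch below is one possible setup, not the experiment behind Figure 69: it assumes NumPy and SciPy's `linprog` with the HiGHS solver, and the helper name `least_1norm`, the sizes `m`, `n`, `k`, and the random seed are illustrative choices. For the signed problem it uses the parametrization $x = [\,I\ \ {-I}\,]\,a$, $a\succeq 0$ from (499) and minimizes $c = a^{\mathrm T}\mathbf{1}$; for the positive problem the 1-norm reduces to $\mathbf{1}^{\mathrm T}x$.

```python
import numpy as np
from scipy.optimize import linprog


def least_1norm(A, b, nonnegative=False):
    """Solve: minimize ||x||_1 subject to Ax = b (optionally x >= 0), as an LP."""
    m, n = A.shape
    if nonnegative:
        # (501): with x >= 0 the objective ||x||_1 equals 1^T x.
        res = linprog(np.ones(n), A_eq=A, b_eq=b,
                      bounds=(0, None), method="highs")
        if not res.success:
            raise ValueError(res.message)
        return res.x
    # (498)-(499): x = [I -I] a with a >= 0; minimize c = 1^T a.
    res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    a = res.x
    return a[:n] - a[n:]


# Illustrative instance: Gaussian random fat matrix and a planted k-sparse x0.
rng = np.random.default_rng(0)
m, n, k = 20, 40, 3
A = rng.standard_normal((m, n))
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

b = A @ x0
x_signed = least_1norm(A, b)                     # problem (496)

x0_pos = np.abs(x0)                              # nonnegative planted solution
b_pos = A @ x0_pos
x_pos = least_1norm(A, b_pos, nonnegative=True)  # problem (501)

print("signed:   max |x - x0| =", np.max(np.abs(x_signed - x0)))
print("positive: max |x - x0| =", np.max(np.abs(x_pos - x0_pos)))
```

Whether the planted vector is recovered exactly depends on where $(m/n,\ k/m)$ falls relative to the transition curve; the script only reports the deviation and makes no claim about any particular instance.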