where
\[
D(x^k) \;=\; \sqrt{\lambda/2}\,
\begin{pmatrix}
\bar\rho''(x_1^k) & & \\
& \ddots & \\
& & \bar\rho''(x_N^k)
\end{pmatrix}^{1/2},
\]
and
\[
g(x^k) \;=\; \frac{\lambda}{2}
\begin{pmatrix}
\bar\rho'(x_1^k) \\
\vdots \\
\bar\rho'(x_N^k)
\end{pmatrix}.
\]
Note that this is the [N] case. Recall that $\bar\rho'$ and $\bar\rho''$ are defined in (4.4) and (4.5).

We can transform problem (LS) into a damped LS problem by solving for a shifted variable $\bar d$, defined by $\delta\bar d = D(x^k)\,d + D^{-1}(x^k)\,g(x^k)$. Problem (LS) becomes
\[
(\mathrm{dLS}): \quad
\min_{\bar d}\;
\left\|
\begin{bmatrix} \Phi D^{-1}(x^k)\,\delta \\ \delta I \end{bmatrix}
\bar d
\;-\;
\begin{bmatrix} y - \Phi\bigl(x^k - D^{-2}(x^k)\,g(x^k)\bigr) \\ 0 \end{bmatrix}
\right\|^2.
\]
This is the [R] case. This method is also called diagonal preconditioning. Potentially, it turns the original LS problem into one with clustered singular values, so that LSQR may take fewer iterations. LSQR also works with shorter vectors, since it handles the $\delta I$ term implicitly. (A concrete numerical sketch of this change of variables is given at the end of Section 5.2.2.)

5.2.2 Why LSQR?

Since the Hessian in (5.1) is at least positive semidefinite, we prefer a conjugate-gradient-type method. LSQR is analytically equivalent to the standard method of conjugate gradients (CG), but it has more favorable numerical properties. In particular, when the Hessian is ill-conditioned, which is very likely to happen as the iterates approach the convergence point, LSQR is more stable than standard CG. The price is that LSQR requires slightly more storage and work per iteration.
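To make the change of variables concrete, the following sketch forms $D(x^k)$, $g(x^k)$, and the stacked matrix and right-hand side of (dLS) explicitly, and recovers the Newton direction $d$ from a minimizer $\bar d$. It is written in Python with NumPy; the callables rho_d1 and rho_d2 stand in for $\bar\rho'$ and $\bar\rho''$ of (4.4) and (4.5) and are placeholders, and the dense stacking is only for illustration. In practice one would not form the $\delta I$ block: an LSQR routine with built-in damping (for example scipy.sparse.linalg.lsqr with its damp argument) treats that term implicitly.

    import numpy as np

    def damped_ls_system(Phi, y, x_k, lam, delta, rho_d1, rho_d2):
        """Form the stacked matrix and right-hand side of problem (dLS).

        rho_d1, rho_d2: callables giving rho-bar' and rho-bar'' elementwise
        (placeholders for the penalty derivatives defined in (4.4) and (4.5)).
        """
        N = x_k.size
        # D(x^k) = sqrt(lam/2) * diag(rho-bar''(x_i^k))^{1/2}, stored as the
        # vector of diagonal entries; g(x^k) = (lam/2) * rho-bar'(x^k).
        D = np.sqrt(lam / 2.0) * np.sqrt(rho_d2(x_k))
        g = (lam / 2.0) * rho_d1(x_k)
        # Top block: Phi D^{-1} delta.  Bottom block: delta I.
        A = np.vstack([Phi * (delta / D), delta * np.eye(N)])
        # Right-hand side: [ y - Phi (x^k - D^{-2} g) ; 0 ].
        b = np.concatenate([y - Phi @ (x_k - g / D**2), np.zeros(N)])
        return A, b, D, g

    def recover_direction(d_bar, D, g, delta):
        """Undo the shift delta d-bar = D d + D^{-1} g to get the direction d."""
        return (delta * d_bar - g / D) / D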
5.2.3 Algorithm LSQR

A formal description of LSQR is given in [117, page 50]. We list it here for the reader's convenience; a short transcription of these recurrences into code is given at the end of this section.

Algorithm LSQR: to minimize $\|b - Ax\|_2$.

1. Initialize. $\beta_1 u_1 = b$, $\alpha_1 v_1 = A^T u_1$, $w_1 = v_1$, $x_0 = 0$, $\bar\phi_1 = \beta_1$, $\bar\rho_1 = \alpha_1$.

2. For $i = 1, 2, 3, \ldots$

   (a) Continue the bidiagonalization.
       i. $\beta_{i+1} u_{i+1} = A v_i - \alpha_i u_i$
       ii. $\alpha_{i+1} v_{i+1} = A^T u_{i+1} - \beta_{i+1} v_i$.

   (b) Construct and apply the next orthogonal transformation.
       i. $\rho_i = (\bar\rho_i^2 + \beta_{i+1}^2)^{1/2}$
       ii. $c_i = \bar\rho_i / \rho_i$
       iii. $s_i = \beta_{i+1} / \rho_i$
       iv. $\theta_{i+1} = s_i \alpha_{i+1}$
       v. $\bar\rho_{i+1} = -c_i \alpha_{i+1}$
       vi. $\phi_i = c_i \bar\phi_i$
       vii. $\bar\phi_{i+1} = s_i \bar\phi_i$.

   (c) Update $x$, $w$.
       i. $x_i = x_{i-1} + (\phi_i / \rho_i) w_i$
       ii. $w_{i+1} = v_{i+1} - (\theta_{i+1} / \rho_i) w_i$.

   (d) Test for convergence. Exit if some stopping criteria have been met.

5.2.4 Discussion

There are two possible dangers in the previous approaches (LS) and (dLS). They are both caused by the existence of large entries in the vector $D^{-1}(x^k) g(x^k)$. The first danger occurs in (LS): when $D^{-1}(x^k) g(x^k)$ is large, the right-hand side is large even though the elements of $d$ will be small as Newton's method converges. Converting (5.1) to (LS) is a
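For reference, here is a minimal transcription of the listing in Section 5.2.3 into Python with NumPy. It runs the recurrences for a fixed number of iterations on a dense matrix and omits step 2(d), whose stopping tests follow [117]; it is a sketch for checking the algebra, not a substitute for a careful LSQR implementation.

    import numpy as np

    def lsqr_sketch(A, b, iters=50):
        """Minimal LSQR iteration for min ||b - A x||_2, following the
        listing in Section 5.2.3 (fixed iteration count, no stopping tests)."""
        m, n = A.shape
        x = np.zeros(n)

        # 1. Initialize: beta_1 u_1 = b, alpha_1 v_1 = A^T u_1, w_1 = v_1,
        #    phi_bar_1 = beta_1, rho_bar_1 = alpha_1.
        beta = np.linalg.norm(b)
        u = b / beta
        v = A.T @ u
        alpha = np.linalg.norm(v)
        v = v / alpha
        w = v.copy()
        phi_bar, rho_bar = beta, alpha

        for _ in range(iters):
            # 2(a) Continue the bidiagonalization.
            u = A @ v - alpha * u
            beta = np.linalg.norm(u)
            u = u / beta
            v = A.T @ u - beta * v
            alpha = np.linalg.norm(v)
            v = v / alpha

            # 2(b) Construct and apply the next orthogonal transformation.
            rho = np.hypot(rho_bar, beta)
            c, s = rho_bar / rho, beta / rho
            theta = s * alpha
            rho_bar = -c * alpha
            phi = c * phi_bar
            phi_bar = s * phi_bar

            # 2(c) Update x and w.
            x = x + (phi / rho) * w
            w = v - (theta / rho) * w

        return x

Applying this routine to the stacked matrix and right-hand side from the sketch at the end of Section 5.2.2 reproduces the (dLS) iteration, at the cost of forming the $\delta I$ rows explicitly.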