Sparse Image Representation via Combined Transforms


where
\[
D(x^k) \;=\; \sqrt{\lambda/2}\,
\begin{pmatrix}
\bar\rho''(x^k_1) & & \\
& \ddots & \\
& & \bar\rho''(x^k_N)
\end{pmatrix}^{1/2},
\]
and
\[
g(x^k) \;=\; \frac{\lambda}{2}
\begin{pmatrix}
\bar\rho'(x^k_1) \\
\vdots \\
\bar\rho'(x^k_N)
\end{pmatrix}.
\]
Note that this is the [N] case. Recall that $\bar\rho'$ and $\bar\rho''$ are defined in (4.4) and (4.5).

We can transform problem (LS) into a damped LS problem by solving for a shifted variable $\bar d$, where $\delta\bar d = D(x^k)\,d + D^{-1}(x^k)\,g(x^k)$. Problem (LS) becomes
\[
\text{(dLS)}: \quad
\operatorname*{minimize}_{\bar d}\;
\left\|
\begin{bmatrix}
\Phi D^{-1}(x^k)\,\delta \\
\delta I
\end{bmatrix}
\bar d
\;-\;
\begin{bmatrix}
y - \Phi\bigl(x^k - D^{-2}(x^k)\,g(x^k)\bigr) \\
0
\end{bmatrix}
\right\|^2 .
\]
This is the [R] case. This method is also called diagonal preconditioning. Potentially, it turns the original LS problem into one with clustered singular values, so that LSQR may take fewer iterations. LSQR also works with shorter vectors, since it handles the $\delta I$ term implicitly.

5.2.2 Why LSQR?

Since the Hessian in (5.1) is at least positive semidefinite, we prefer a conjugate-gradient-type method. LSQR is analytically equivalent to the standard method of conjugate gradients, but it has more favorable numerical properties. In particular, when the Hessian is ill-conditioned, which is very likely to happen as the iterates approach the convergence point, LSQR is more stable than standard CG. The price is that LSQR requires slightly more storage and work per iteration.
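To make the change of variables concrete, the following Python/NumPy sketch forms and solves (dLS) for one Newton direction. It is an illustration, not the thesis implementation: SciPy's lsqr routine supplies the $\delta I$ block through its damp argument, the callables rho_bar_p and rho_bar_pp are placeholders standing in for $\bar\rho'$ and $\bar\rho''$ of (4.4) and (4.5), and the function name newton_direction_dls is invented here.

# Sketch of the diagonally preconditioned damped LS step (dLS).
# Assumes generic penalty-derivative callables; not the thesis code.
import numpy as np
from scipy.sparse.linalg import lsqr

def newton_direction_dls(Phi, y, x_k, lam, delta, rho_bar_p, rho_bar_pp):
    # Diagonal matrices are kept as length-N vectors of diagonal entries:
    # D = sqrt(lam/2) * diag(rho_bar'')^{1/2},  g = (lam/2) * rho_bar'.
    D = np.sqrt(lam / 2.0) * np.sqrt(rho_bar_pp(x_k))
    g = (lam / 2.0) * rho_bar_p(x_k)

    # Right-hand side  y - Phi (x^k - D^{-2} g)  from (dLS).
    rhs = y - Phi @ (x_k - g / D**2)

    # Coefficient block  Phi D^{-1} delta; lsqr's `damp` adds the delta*I block
    # implicitly, so only the short vectors are carried.
    A = (Phi / D) * delta
    d_bar = lsqr(A, rhs, damp=delta)[0]

    # Undo the change of variables: delta * d_bar = D d + D^{-1} g.
    d = (delta * d_bar - g / D) / D
    return d   # Newton direction; the update of x^k is outside this sketch.

Keeping D and g as length-N vectors avoids ever forming the diagonal matrices explicitly; the division Phi / D applies $D^{-1}$ column by column.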

5.2.3 Algorithm LSQR

A formal description of LSQR is given in [117, page 50]. We list it here for the convenience of the reader.

Algorithm LSQR: to minimize $\|b - Ax\|_2$.

1. Initialize.
   $\beta_1 u_1 = b$, $\alpha_1 v_1 = A^T u_1$, $w_1 = v_1$, $x_0 = 0$, $\bar\phi_1 = \beta_1$, $\bar\rho_1 = \alpha_1$.

2. For $i = 1, 2, 3, \ldots$

   (a) Continue the bidiagonalization.
       i. $\beta_{i+1} u_{i+1} = A v_i - \alpha_i u_i$
       ii. $\alpha_{i+1} v_{i+1} = A^T u_{i+1} - \beta_{i+1} v_i$.

   (b) Construct and apply the next orthogonal transformation.
       i. $\rho_i = (\bar\rho_i^2 + \beta_{i+1}^2)^{1/2}$
       ii. $c_i = \bar\rho_i / \rho_i$
       iii. $s_i = \beta_{i+1} / \rho_i$
       iv. $\theta_{i+1} = s_i \alpha_{i+1}$
       v. $\bar\rho_{i+1} = -c_i \alpha_{i+1}$
       vi. $\phi_i = c_i \bar\phi_i$
       vii. $\bar\phi_{i+1} = s_i \bar\phi_i$.

   (c) Update $x$, $w$.
       i. $x_i = x_{i-1} + (\phi_i / \rho_i) w_i$
       ii. $w_{i+1} = v_{i+1} - (\theta_{i+1} / \rho_i) w_i$.

   (d) Test for convergence. Exit if some stopping criterion has been met.

(A code sketch of these recurrences is given below, after the discussion.)

5.2.4 Discussion

There are two possible dangers in the previous approaches (LS) and (dLS). Both are caused by the presence of large entries in the vector $D^{-1}(x^k)\,g(x^k)$. The first danger occurs in (LS): when $D^{-1}(x^k)\,g(x^k)$ is large, the right-hand side is large even though the elements of $d$ will be small as Newton's method converges. Converting (5.1) to (LS) is a
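The recurrences of Algorithm LSQR in Section 5.2.3 translate almost line by line into code. The following NumPy sketch is a direct transcription of steps 1 and 2, with a fixed iteration count standing in for the stopping tests of step (d); the names lsqr_sketch and normalize are illustrative, and in practice one would rely on a tested LSQR routine rather than this sketch.

# Direct transcription of Algorithm LSQR; convergence tests of step (d) omitted.
import numpy as np

def normalize(v):
    # Return (||v||, v/||v||); the scalar plays the role of beta_i or alpha_i.
    nrm = np.linalg.norm(v)
    return nrm, (v / nrm if nrm > 0 else v)

def lsqr_sketch(A, b, iters=50):
    # 1. Initialize.
    beta, u = normalize(b)                  # beta_1 u_1 = b
    alpha, v = normalize(A.T @ u)           # alpha_1 v_1 = A^T u_1
    w = v.copy()
    x = np.zeros(A.shape[1])
    phi_bar, rho_bar = beta, alpha

    for _ in range(iters):                  # 2. For i = 1, 2, 3, ...
        # (a) Continue the bidiagonalization.
        beta, u = normalize(A @ v - alpha * u)
        alpha, v = normalize(A.T @ u - beta * v)

        # (b) Construct and apply the next orthogonal transformation.
        rho = np.hypot(rho_bar, beta)       # rho_i = (rho_bar_i^2 + beta_{i+1}^2)^{1/2}
        c, s = rho_bar / rho, beta / rho
        theta = s * alpha
        rho_bar = -c * alpha
        phi, phi_bar = c * phi_bar, s * phi_bar

        # (c) Update x, w.
        x = x + (phi / rho) * w
        w = v - (theta / rho) * w

        # (d) Stopping criteria would be tested here.
    return x

In exact arithmetic the iterates coincide with those of conjugate gradients applied to the normal equations, which is the analytic equivalence mentioned in Section 5.2.2.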

