
column vector;

• $x^o = (x_+^T, x_-^T)^T$, where $x_+$ and $x_-$ are the positive and negative parts of the column vector $x$, which is the same as the “x” in (4.2): $x = x_+ - x_-$, $x_+ \ge 0$ and $x_- \ge 0$;

• $A = [\Phi, -\Phi]$, where $\Phi$ is the same matrix as the one specified in (4.2);

• $y$ is equivalent to the “y” in (4.2);

• $x^o \ge 0$ means that each entry of the vector $x^o$ is greater than or equal to 0.

The main effort of this approach is to solve the following system of linear equations [27, equation (6.4), page 56]:
$$(A D A^T + \delta^2 I)\,\Delta y = r - A D (X^{-1} v - t), \tag{4.13}$$
where $r$, $v$ and $t$ are column vectors given in the CDS paper, $D = (X^{-1} Z + \gamma^2 I)^{-1}$, and $X$ and $Z$ are diagonal matrices formed from the primal variable $x^o$ and the dual slack variable $z$. (For a more detailed description, we refer readers to the original paper.)

Recall that in our approach, to obtain the Newton direction, we solve
$$\bigl[\, 2\Phi^T \Phi + \lambda \rho''(x^k) \,\bigr]\, n = 2\Phi^T (y - \Phi x^k) - \lambda \rho'(x^k), \tag{4.14}$$
where $\rho''(x^k) = \mathrm{diag}\{\bar\rho''(x^k_1), \ldots, \bar\rho''(x^k_N)\}$, $\rho'(x^k) = (\bar\rho'(x^k_1), \ldots, \bar\rho'(x^k_N))^T$, and $n$ is the desired Newton direction, equivalent to $n(x^{(k)})$ in (4.12).

After careful examination of the matrices on the left-hand sides of (4.13) and (4.14), we observe that both are positive definite and of the same size. In this sense the two approaches are similar, but it is hard to say which one has the better eigenvalue distribution. When they are close to the solution, both algorithms become slow to converge. Note that the system of linear equations in (4.13) comes from a perturbed LP problem: if both $\gamma$ and $\delta$ are extremely small, the eigenvalue distribution of the matrix $A D A^T + \delta^2 I$ can be very bad, so that iterative methods converge slowly. The same discussion applies to the matrix on the left-hand side of (4.14): when the vector $x^k$ has many nonzero and spread-out entries, the eigenvalue distribution of the positive definite matrix $2\Phi^T \Phi + \lambda \rho''(x^k)$ can be bad for any iterative method. The appealing properties of our approach are:



1. It avoids introducing dual variables, as the primal-dual log-barrier perturbed LP approach must. We transform the problem into an unconstrained convex optimization problem and apply Newton's method to it directly. There is no need to introduce dual variables, to double the length of the primal variable $x$ (as is done in the CDS approach), or to consider the KKT conditions. Although the core system (4.14) is very similar to the core system (4.13) of the CDS approach, its derivation is simpler and more direct. (A small numerical sketch of the Newton iteration built around (4.14) is given after this list.)

2. Instead of the standard conjugate gradient (CG) method, we choose LSQR, which is analytically equivalent to CG applied to the normal equations but numerically more stable.
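To make the Newton iteration around (4.14) concrete, here is a minimal dense sketch in Python. Everything specific in it is an assumption made only for illustration: the random matrix standing in for $\Phi$, the smooth penalty $\bar\rho(t) = \sqrt{t^2 + \epsilon}$ as a surrogate for $|t|$, and the plain backtracking line search. The thesis uses its own penalty and solves the Newton system with LSQR rather than a dense factorization.

```python
import numpy as np

# Minimal dense sketch of the Newton iteration around (4.14).
# Assumptions (illustration only): Phi is a random stand-in for the combined
# transform; rho(t) = sqrt(t^2 + eps) is a smooth surrogate for |t|.
eps, lam = 1e-3, 0.1

def objective(z, Phi, y):
    # ||y - Phi z||_2^2 + lam * sum_i rho(z_i)
    return np.sum((y - Phi @ z) ** 2) + lam * np.sum(np.sqrt(z ** 2 + eps))

rng = np.random.default_rng(0)
M, N = 64, 128
Phi = rng.standard_normal((M, N))   # stand-in for the combined transform matrix
y = rng.standard_normal(M)
x = np.zeros(N)                     # current iterate x^k

for _ in range(30):
    rp = x / np.sqrt(x ** 2 + eps)            # rho'(x^k), entrywise
    rpp = eps / (x ** 2 + eps) ** 1.5         # rho''(x^k), entrywise
    g = 2 * Phi.T @ (y - Phi @ x) - lam * rp  # right-hand side of (4.14)
    if np.linalg.norm(g) < 1e-8:
        break
    H = 2 * Phi.T @ Phi + lam * np.diag(rpp)  # left-hand side of (4.14)
    n = np.linalg.solve(H, g)                 # Newton direction (dense solve here)
    step = 1.0                                # simple backtracking to ensure descent
    while objective(x + step * n, Phi, y) > objective(x, Phi, y):
        step *= 0.5
    x = x + step * n

print("final objective:", objective(x, Phi, y))
```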

Chen, Donoho and Saunders [26] have successfully developed software to carry out numerical experiments on problems of size thousands by tens of thousands. We choose to implement a new approach in the hope that it simplifies the algorithm and improves numerical efficiency. A careful comparison of our new approach with the previously implemented one will be an interesting topic for future research. At the same time, we choose LSQR instead of CG. (Chen et al. showed how to use LSQR when δ > 0 in (4.13), but their experiments used CG.)
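The stability difference mentioned in item 2 above can be seen even on a tiny synthetic least-squares problem. The sketch below is not from the thesis; the matrix, sizes and tolerances are arbitrary choices. It applies SciPy's LSQR directly to an ill-conditioned matrix A and, for comparison, CG to the explicitly formed normal equations AᵀA x = Aᵀb, whose condition number is the square of cond(A).

```python
import numpy as np
from scipy.sparse.linalg import lsqr, cg

# Illustration only: an ill-conditioned dense A with a known solution.
rng = np.random.default_rng(0)
m, n = 200, 50
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -6, n)           # singular values spanning six decades
A = U @ np.diag(s) @ V.T
x_true = rng.standard_normal(n)
b = A @ x_true

# LSQR works with A itself (Golub-Kahan bidiagonalization of A).
x_lsqr = lsqr(A, b, atol=1e-12, btol=1e-12, iter_lim=5000)[0]

# CG on the explicitly squared system: cond(A^T A) = cond(A)^2.
x_cg, _ = cg(A.T @ A, A.T @ b, maxiter=5000)

print("LSQR error:              ", np.linalg.norm(x_lsqr - x_true))
print("CG on normal eqns error: ", np.linalg.norm(x_cg - x_true))
```

The exact numbers depend on the spectrum and the stopping tolerances, but the squared condition number generally limits the accuracy attainable through the explicitly formed normal equations.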

4.9 Iterative Methods

In our problem, we choose an iterative solver for the system in (4.12) because the Hessian is a positive definite matrix whose special structure admits fast algorithms for matrix-vector multiplication. When a fast (low-complexity) matrix-vector multiplication is available and the iterative solver takes a moderate number of iterations to converge, an iterative solver is an ideal tool for solving a system of linear equations [74, 78].

Here we have a fast algorithm for multiplying by the Hessian $H(x)$: as in (4.8), its second term is a diagonal matrix, and multiplying by a diagonal matrix costs only O(N) operations. The $\Phi$ in our model is a combination of transforms having fast algorithms (for example, 2-D wavelet transforms, the 2-D DCT and edgelet-like transforms), so we have fast algorithms for multiplying by $\Phi$ and by $\Phi^T$ (which is simply the adjoint). Hence we have a fast algorithm for multiplying by the Hessian.
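This structure is easy to exploit with a matrix-free operator. The sketch below is illustrative only: `phi` and `phi_t` are placeholders for the fast forward and adjoint combined transforms (backed here by an explicit random matrix so the code runs), and the diagonal of ρ''(x^k) is set to ones.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
M, N = 64, 128
Phi = rng.standard_normal((M, N))   # explicit matrix only so the sketch runs

def phi(v):      # placeholder for the fast combined transform (wavelets, DCT, ...)
    return Phi @ v

def phi_t(r):    # placeholder for its adjoint
    return Phi.T @ r

lam = 0.1
rpp = np.ones(N)                    # placeholder diagonal of rho''(x^k); O(N) to apply

# H(x) v = 2 Phi^T (Phi v) + lam * rho''(x) * v, without ever forming H.
H = LinearOperator((N, N), matvec=lambda v: 2 * phi_t(phi(v)) + lam * rpp * v,
                   dtype=float)

rhs = rng.standard_normal(N)
sol, info = cg(H, rhs, maxiter=500)             # a Krylov solver needs only matvecs
print("residual:", np.linalg.norm(H.matvec(sol) - rhs))
```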

Choosing the “correct” iterative method is a large topic in its own right, and the next chapter is dedicated to it. We choose a variation of the conjugate gradient method: LSQR [116, 117].
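Since LSQR expects a least-squares problem rather than a square symmetric system, one standard way to hand it the Newton system (4.14) is to rewrite (4.14) as an equivalent stacked least-squares problem. The recasting below is offered only as an illustration; it assumes $\bar\rho''(x^k_i) > 0$ for every $i$, writes $D = \rho''(x^k)$, and is not necessarily the formulation used in the thesis:
$$\min_{n} \left\| \begin{bmatrix} \sqrt{2}\,\Phi \\ \sqrt{\lambda}\, D^{1/2} \end{bmatrix} n \;-\; \begin{bmatrix} \sqrt{2}\,(y - \Phi x^k) \\ -\sqrt{\lambda}\, D^{-1/2} \rho'(x^k) \end{bmatrix} \right\|_2^2 ,$$
whose normal equations are exactly (4.14). LSQR applied to this stacked system requires only the fast multiplications by $\Phi$ and $\Phi^T$ described above, plus O(N) work for the diagonal block.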
