29.07.2014 Views

User's Guide to lqa - LMU

User's Guide to lqa - LMU

User's Guide to lqa - LMU

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

2<br />

must be linear. Note that the sum of all J penalty functions determines the penalty term.<br />

The penalty level λ must not be identical for all J functions but <strong>to</strong> keep the notation<br />

simple we omit a further subindex j. Furthermore, the penalty function type must not<br />

be identical for all j. The number J of penalty functions is not necessarily equal <strong>to</strong> p the<br />

number of dimensions. For example, the fused lasso penalty (Tibshirani et al., 2005)<br />

can be written as<br />

where<br />

P fl<br />

λ (β) = λ 1<br />

P fl<br />

p∑<br />

p∑<br />

|β j | + λ 2 |β k − β k−1 |<br />

j=1<br />

k=2<br />

2p−1<br />

λ (β) = ∑<br />

p λ,j (|a ⊤ j β|), (3)<br />

j=1<br />

p λ,j (·) = λ 1 |a ⊤ j β|,<br />

j = 1, . . . , p<br />

and a j = (0, . . . , 0, 1, 0, . . . , 0) ⊤ with a one at the j-th position, and<br />

p λ,j (·) = λ 2 |a ⊤ j β|, j = p + 1, . . . , 2p − 1<br />

where a j = (0, . . . , 0, −1, 1, 0, . . . , 0) ⊤ with a one at the (j − p + 1)-th position and a<br />

minus one at the (j − p)-th position. The parameters λ 1 and λ 2 correspond <strong>to</strong> s 1 and s 2<br />

in the constrained regression problem formulation, see Tibshirani et al. (2005).<br />

Ulbricht (2010b) has shown that if the penalty term has structure (2) then the penalized<br />

regression problem (1) can be solved by a New<strong>to</strong>n-type algorithm based on local quadratic<br />

approximations of the penalty term. That is, the unknown coefficients b can be estimated<br />

iteratively with the estimation equation<br />

b (k+1) = b (k) − γ ( F(b (k) ) + A ∗ λ) −1<br />

{−s(b(k) ) + A ∗ λb (k) }, (4)<br />

where s(b) and F(b) denote score vec<strong>to</strong>r and Fisher information matrix of the underlying<br />

log-likelihood, respectively, and<br />

[ ] 0 0<br />

A ∗ ⊤<br />

λ =<br />

p<br />

,<br />

0 p A λ<br />

where 0 p is the p-dimensional null vec<strong>to</strong>r. The (approximated) penalty matrix A λ is<br />

defined as<br />

J∑ p ′ λ,j<br />

A λ =<br />

(|a⊤ j β (k) |)<br />

√ (a ) a j a ⊤ j , (5)<br />

⊤ 2<br />

j β (k) + c<br />

j=1<br />

where p ′ λ,j (|a⊤ j β (k) |) = dp λ,j (|a ⊤ j β|)/d|a ⊤ j β| denotes the first derivative with p ′ λ,j (0) ≡ 0,<br />

and c > 0 is a small positive real number. In the <strong>lqa</strong> package we will use c = 1e −6 as<br />

default value. To our experience, this value works quite well.<br />

The New<strong>to</strong>n update (4) is repeated until convergence. We apply the rule<br />

‖b (k+1) − b (k) ‖<br />

‖b (k) ‖<br />

≤ ɛ, ɛ > 0

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!