
5.8.1 ν-SVR

The ε parameter in ε-SVR determines the desired accuracy of the approximation. Determining a good ε may be difficult in practice. ν-SVR is an alternative method that removes the need to pre-specify the free parameter ε in (5.65) by introducing (yet another!) parameter ν. The idea is that this new parameter makes it easier to estimate ε, as one finds a tradeoff between model complexity and the slack variables.
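As a practical aside (not part of the original derivation), the following is a minimal sketch of ν-SVR using scikit-learn's NuSVR on synthetic data; the dataset and hyperparameter choices are illustrative assumptions. The point is only that ν is specified directly, while ε is left to the optimizer:

```python
# Minimal sketch of nu-SVR in practice (assumes scikit-learn is installed).
# The dataset and hyperparameter values below are illustrative assumptions.
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(100, 1)), axis=0)
y = np.sinc(X).ravel() + 0.1 * rng.normal(size=100)

# nu (in (0, 1]) replaces a hand-picked epsilon: it upper-bounds the fraction
# of training points lying outside the tube and lower-bounds the fraction of
# support vectors; epsilon itself is determined by the optimization.
model = NuSVR(nu=0.3, C=1.0, kernel="rbf", gamma=1.0)
model.fit(X, y)

print("number of support vectors:", model.support_vectors_.shape[0])
print("fraction of support vectors:", model.support_vectors_.shape[0] / len(X))
```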

By introducing a penalty through ν ≥ 0, the optimization of (5.65) becomes:

$$
\begin{aligned}
\min_{w,\;\xi^{(*)},\;\varepsilon}\quad & L\left(w,\xi^{(*)},\varepsilon\right)=\frac{1}{2}\|w\|^{2}+C\left(\nu\varepsilon+\frac{1}{M}\sum_{i=1}^{M}\left(\xi_{i}+\xi_{i}^{*}\right)\right)\\[4pt]
\text{subject to}\quad & \left\langle w,\phi\!\left(x^{i}\right)\right\rangle+b-y^{i}\leq\varepsilon+\xi_{i}\\
& y^{i}-\left\langle w,\phi\!\left(x^{i}\right)\right\rangle-b\leq\varepsilon+\xi_{i}^{*}\\
& \xi_{i}\geq0,\;\xi_{i}^{*}\geq0,\;\varepsilon\geq0
\end{aligned}
\tag{5.74}
$$

Notice that we now also optimize over the value of ε. The term νε in the objective function is a weighting term that balances the growth of ε against the effect this has on the increase of the poorly fit datapoints (last term of the objective function).
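To make this tradeoff concrete, one can fix w (and hence the absolute residuals) and minimize the ε-dependent part of (5.74), νε + (1/M)∑(ξ_i + ξ_i*), over ε alone. The short numerical sketch below, using made-up residuals, illustrates that the minimizing ε is roughly the (1−ν)-quantile of the absolute residuals, so a larger ν favors a narrower tube with a larger fraction of points left outside it:

```python
# Illustrative sketch (not from the notes): for fixed absolute residuals,
# minimize nu*eps + (1/M) * sum(max(|r_i| - eps, 0)) over eps >= 0.
import numpy as np

rng = np.random.default_rng(1)
abs_residuals = np.abs(rng.normal(scale=1.0, size=200))   # assumed |f(x_i) - y_i|
M = len(abs_residuals)

def objective_part(eps, nu):
    # The epsilon-dependent part of (5.74) for fixed w (slacks at their optimum).
    slacks = np.maximum(abs_residuals - eps, 0.0)
    return nu * eps + slacks.sum() / M

eps_grid = np.linspace(0.0, abs_residuals.max(), 2001)
for nu in (0.1, 0.3, 0.6):
    best_eps = eps_grid[np.argmin([objective_part(e, nu) for e in eps_grid])]
    outside = np.mean(abs_residuals > best_eps)
    print(f"nu={nu:.1f}: best eps ~ {best_eps:.3f}, "
          f"fraction outside the tube ~ {outside:.2f}")
```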

The new constraint ε ≥ 0 introduces an additional Lagrange multiplier β in the Lagrangian. One can then proceed through the same steps as when solving ε-SVR, i.e. write the Lagrangian, take the partial derivatives, write the dual and solve the KKT conditions (see Schölkopf & Smola 2002 for details). This yields the following ν-SVR optimization problem:

For ν ≥ 0, C > 0:

$$
\begin{aligned}
\max_{\alpha^{(*)}\in\mathbb{R}^{M}}\quad & \sum_{i=1}^{M}\left(\alpha_{i}-\alpha_{i}^{*}\right)y^{i}-\frac{1}{2}\sum_{i,j=1}^{M}\left(\alpha_{i}-\alpha_{i}^{*}\right)\left(\alpha_{j}-\alpha_{j}^{*}\right)k\!\left(x^{i},x^{j}\right)\\[4pt]
\text{subject to}\quad & \sum_{i=1}^{M}\left(\alpha_{i}-\alpha_{i}^{*}\right)=0,\\
& \alpha_{i}^{(*)}\in\left[0,\frac{C}{M}\right],\\
& \sum_{i=1}^{M}\left(\alpha_{i}+\alpha_{i}^{*}\right)\leq C\nu.
\end{aligned}
\tag{5.75}
$$
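For small problems, the dual (5.75) can be handed directly to a general-purpose constrained solver. The sketch below is an illustration assuming SciPy's SLSQP and a toy RBF-kernel dataset; dedicated QP or SMO-type solvers, as used in libsvm, are what one would use in practice:

```python
# Sketch of solving the nu-SVR dual (5.75) with a generic constrained solver
# (assumes SciPy and a small toy dataset; values are illustrative).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
M = 30
X = np.sort(rng.uniform(-3, 3, size=(M, 1)), axis=0)
y = np.sinc(X).ravel() + 0.1 * rng.normal(size=M)

C, nu, gamma = 1.0, 0.3, 1.0
K = np.exp(-gamma * (X - X.T) ** 2)           # RBF Gram matrix k(x^i, x^j), 1-D inputs

def neg_dual(z):
    a, a_star = z[:M], z[M:]
    d = a - a_star
    return -(d @ y - 0.5 * d @ K @ d)         # negate: (5.75) is a maximization

constraints = [
    {"type": "eq",   "fun": lambda z: np.sum(z[:M] - z[M:])},           # sum(a - a*) = 0
    {"type": "ineq", "fun": lambda z: C * nu - np.sum(z[:M] + z[M:])},  # sum(a + a*) <= C*nu
]
bounds = [(0.0, C / M)] * (2 * M)             # a_i, a_i* in [0, C/M]

res = minimize(neg_dual, x0=np.zeros(2 * M), method="SLSQP",
               bounds=bounds, constraints=constraints)
alpha, alpha_star = res.x[:M], res.x[M:]
print("dual objective:", -res.fun)
print("support vectors:", np.sum(np.abs(alpha - alpha_star) > 1e-6))
```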

The regression estimate then takes the same form as before and is expressed as a linear combination of the kernel evaluated at each of the datapoints with non-zero α_i^(*) (the support vectors), i.e.:

$$
f(x)=\sum_{i=1}^{M}\left(\alpha_{i}-\alpha_{i}^{*}\right)k\!\left(x^{i},x\right)+b.
\tag{5.76}
$$

As we did before for b alone, we can now also find b and ε by solving the KKT conditions.
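As a quick numerical check of (5.76), the kernel expansion over the support vectors can be compared against a library prediction. The sketch below assumes scikit-learn, where the fitted NuSVR stores the signed dual coefficients of the support vectors (the differences appearing in (5.76)) and the offset b:

```python
# Sketch verifying the expansion (5.76) on a fitted model (assumes scikit-learn;
# dual_coef_ holds the signed coefficients of the support vectors, intercept_ holds b).
import numpy as np
from sklearn.svm import NuSVR
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(3)
X = np.sort(rng.uniform(-3, 3, size=(100, 1)), axis=0)
y = np.sinc(X).ravel() + 0.1 * rng.normal(size=100)

gamma = 1.0
model = NuSVR(nu=0.3, C=1.0, kernel="rbf", gamma=gamma).fit(X, y)

X_test = rng.uniform(-3, 3, size=(10, 1))
K = rbf_kernel(X_test, model.support_vectors_, gamma=gamma)   # k(x, x^i)
f_manual = K @ model.dual_coef_.ravel() + model.intercept_    # eq. (5.76)

print(np.allclose(f_manual, model.predict(X_test)))           # expected: True
```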

