MACHINE LEARNING TECHNIQUES - LASA
In SVM, we found that minimizing ‖w‖ was of interest because it yielded a better separation of the two classes. Schölkopf and Smola argue that, in SVR, minimizing ‖w‖ is also of interest, albeit for a different geometrical reason. They start by defining the ε-margin:
$$ m_\varepsilon(f) := \inf \left\{ \left\| \phi(x) - \phi(x') \right\| \,:\, x, x' \ \text{s.t.}\ f(x) - f(x') \ge 2\varepsilon \right\} \qquad (5.61) $$
Figure 5-11: Minimizing the ε-margin in SVR is equivalent to maximizing the slope of the function f.
Similarly to the margin in SVM, the ε-margin is a function of the projection vector w. As illustrated in Figure 5-11, the flatter the slope of the function f, the larger the margin; conversely, the steeper the slope, the smaller the margin. Hence, to maximize the margin, we must minimize ‖w‖ (minimizing each component of w flattens the slope). The linear illustration in Figure 5-11 carries over to feature space, since we perform a linear fit in feature space.
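To make the link between the ε-margin and ‖w‖ explicit, here is a short derivation for the linear model f(x) = ⟨w, φ(x)⟩ + b (a sketch consistent with (5.61), not reproduced from the original text). By the Cauchy–Schwarz inequality,

$$ f(x) - f(x') = \langle w, \phi(x) - \phi(x') \rangle \le \|w\| \, \|\phi(x) - \phi(x')\| , $$

so every pair with f(x) − f(x′) ≥ 2ε satisfies ‖φ(x) − φ(x′)‖ ≥ 2ε/‖w‖. Assuming a pair exists for which φ(x) − φ(x′) is parallel to w, the bound is attained and

$$ m_\varepsilon(f) = \frac{2\varepsilon}{\|w\|} \, , $$

which shows that maximizing the ε-margin is equivalent to minimizing ‖w‖.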
Finding the optimal estimate of f can now be formulated as an optimization problem of the form:
$$ \min_{w} \left( \frac{1}{2} \left\| w \right\|^{2} + C \cdot R_{\varepsilon}[f] \right) \qquad (5.62) $$
where R_ε[f] is an empirical risk function that measures the ε-insensitive error:
$$ R_{\varepsilon}[f] = \frac{1}{M} \sum_{i=1}^{M} \left| y^{i} - f\!\left(x^{i}\right) \right|_{\varepsilon} \qquad (5.63) $$

where |y − f(x)|_ε := max{0, |y − f(x)| − ε} denotes the ε-insensitive loss: deviations smaller than ε incur no penalty.
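As a concrete illustration, the empirical risk of (5.63) takes only a few lines of code. The following is a minimal sketch in Python/NumPy; the function name and the example values are illustrative assumptions, not taken from the text.

import numpy as np

def eps_insensitive_risk(y, f_x, eps):
    # R_eps[f] of (5.63): residuals with |y - f(x)| <= eps incur no loss.
    return np.mean(np.maximum(0.0, np.abs(y - f_x) - eps))

# Residuals of 0.05 and 0.3 with eps = 0.1 give a risk of (0 + 0.2) / 2 = 0.1.
print(eps_insensitive_risk(np.array([1.0, 1.0]), np.array([0.95, 0.7]), eps=0.1))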
C in (5.62) is a constant, and hence a free parameter, that determines the tradeoff between minimizing the ε-insensitive error and keeping the complexity of the fit low. This procedure is called ε-SVR.
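In practice, ε-SVR solvers are readily available. Below is a minimal sketch using scikit-learn's SVR class; the RBF kernel and the values of C and epsilon are illustrative assumptions, not taken from the text.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 5.0, size=(40, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(40)  # noisy 1-D targets

# C sets the tradeoff of (5.62): flatness (small ||w||) versus the
# eps-insensitive error of (5.63); residuals below epsilon cost nothing.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)
y_pred = model.predict(X)  # fitted regression values at the training inputs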
Formalism: