
MACHINE LEARNING TECHNIQUES - LASA



Figure 5-7: Linear separating hyperplanes for the non-separable case.

5.7.2 Support Vector Machine for Non-linearly Separable Datasets

The above algorithm for separable data, when applied to non-separable data, will find no feasible solution: this will be evidenced by the objective function (i.e. the dual Lagrangian) growing arbitrarily large. In order to handle non-separable data, one must relax the constraints (5.37). This can be done by introducing non-negative slack variables $\xi_i$, $i = 1, \dots, M$, in the constraints, which then become:

$$w^T x^i + b \geq +1 - \xi_i \quad \text{for } y_i = +1 \tag{5.47}$$

$$w^T x^i + b \leq -1 + \xi_i \quad \text{for } y_i = -1 \tag{5.48}$$

$$\xi_i \geq 0 \quad \forall i \tag{5.49}$$

Thus, for an error to occur the corresponding $\xi_i$ must exceed unity, so $\sum_i \xi_i$ is an upper bound on the number of training errors. Hence a natural way to assign an extra cost for errors is to change the objective function to be minimized to include a cost function, such as

$$\min_{w,\,\xi} \left( \frac{1}{2}\|w\|^2 + \frac{C}{M} \sum_{i=1}^{M} \xi_i \right),$$

where $C$ is a parameter to be chosen by the user, a larger $C$ corresponding to assigning a higher penalty to errors. As it stands, this is a convex programming problem and the Wolfe dual problem becomes:

$$\max_{\alpha} \; L_D(\alpha) \equiv \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \left\langle x^i, x^j \right\rangle \tag{5.50}$$

subject to:

$$0 \leq \alpha_i \leq \frac{C}{M} \tag{5.51}$$

$$\sum_i \alpha_i y_i = 0 \tag{5.52}$$

The solution is again given by:

$$w = \sum_{i=1}^{N_s} \alpha_i y_i x^i \tag{5.53}$$
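The quantities in (5.47) to (5.53) can be checked numerically on toy data. The sketch below is illustrative only and is not taken from the notes: the helper names (`slacks`, `training_errors`, `primal_cost`, `is_dual_feasible`, `weights_from_alphas`, `dual_objective`), the toy points and the hand-picked multipliers are all assumptions. It also assumes the common convention of a factor 1/2 in front of the squared norm of w, which is the convention that matches the dual (5.50). It does not solve the quadratic program; it only evaluates the constraints and objectives for given values.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))


def slacks(w: Vector, b: float, X: Sequence[Vector], y: Sequence[int]) -> List[float]:
    """Smallest xi_i >= 0 satisfying (5.47)-(5.49) for a given hyperplane (w, b)."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]


def training_errors(w: Vector, b: float, X: Sequence[Vector], y: Sequence[int]) -> int:
    """Number of points on the wrong side of the hyperplane (these have xi_i > 1)."""
    return sum(1 for xi, yi in zip(X, y) if yi * (dot(w, xi) + b) < 0)


def primal_cost(w: Vector, b: float, X: Sequence[Vector], y: Sequence[int], C: float) -> float:
    """Assumed primal cost: 0.5 * ||w||^2 + (C / M) * sum_i xi_i."""
    M = len(X)
    return 0.5 * dot(w, w) + (C / M) * sum(slacks(w, b, X, y))


def is_dual_feasible(alphas: Sequence[float], y: Sequence[int], C: float, tol: float = 1e-9) -> bool:
    """Check the box constraint (5.51) and the equality constraint (5.52)."""
    box = C / len(alphas)
    in_box = all(-tol <= a <= box + tol for a in alphas)
    balanced = abs(sum(a * yi for a, yi in zip(alphas, y))) <= tol
    return in_box and balanced


def weights_from_alphas(alphas: Sequence[float], X: Sequence[Vector], y: Sequence[int]) -> List[float]:
    """Recover w = sum_i alpha_i y_i x^i as in (5.53)."""
    w = [0.0] * len(X[0])
    for a, xi, yi in zip(alphas, X, y):
        for k in range(len(w)):
            w[k] += a * yi * xi[k]
    return w


def dual_objective(alphas: Sequence[float], X: Sequence[Vector], y: Sequence[int]) -> float:
    """Evaluate L_D(alpha) from (5.50)."""
    quad = sum(
        ai * aj * yi * yj * dot(xi, xj)
        for ai, xi, yi in zip(alphas, X, y)
        for aj, xj, yj in zip(alphas, X, y)
    )
    return sum(alphas) - 0.5 * quad


# Toy data (assumed): the last point carries label +1 but lies on the negative side.
X = [(2.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (-2.0, 0.0), (-0.5, 0.0)]
y = [1, 1, -1, -1, 1]
w, b = (1.0, 0.0), 0.0  # hand-picked hyperplane, not an optimised one

xi = slacks(w, b, X, y)
n_err = training_errors(w, b, X, y)
cost = primal_cost(w, b, X, y, C=10.0)

# Two-point example with hand-picked multipliers sitting on the upper bound C / M.
X2 = [(1.0, 0.0), (-1.0, 0.0)]
y2 = [1, -1]
alphas = [0.25, 0.25]
feasible = is_dual_feasible(alphas, y2, C=0.5)
w_dual = weights_from_alphas(alphas, X2, y2)
dual_val = dual_objective(alphas, X2, y2)
```

In the first example only the mislabelled point has a non-zero slack, and that slack exceeds one, so the sum of slacks is at least the number of training errors. In the second example a small C caps both multipliers at C/M, which in turn limits the norm of the recovered w.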

© A.G.Billard 2004 – Last Update March 2011
