

that $k(x^i, x^j) = \left\langle \phi(x^i), \phi(x^j) \right\rangle$, then training depends only on knowing $k$ and would not require knowing $\phi$.
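As a minimal numerical sketch of this identity (not part of the original notes; the toy data and the choice of a degree-2 homogeneous polynomial kernel are illustrative assumptions), the Gram matrix can be computed either through an explicit feature map $\phi$ or directly through $k$, and the two routes coincide:

```python
import numpy as np

# Illustrative data: three points in 2 dimensions (assumed, not from the notes).
X = np.array([[1.0, 2.0], [0.5, -1.0], [-2.0, 0.3]])

def phi(x):
    # Explicit feature map of the degree-2 homogeneous polynomial kernel:
    # phi(x) = (x1^2, x2^2, sqrt(2)*x1*x2), so that <phi(x), phi(z)> = (x.z)^2.
    return np.array([x[0]**2, x[1]**2, np.sqrt(2) * x[0] * x[1]])

def k(x, z):
    # The same kernel evaluated directly, without ever forming phi.
    return (x @ z) ** 2

# Gram matrix via the explicit map ...
K_phi = np.array([[phi(xi) @ phi(xj) for xj in X] for xi in X])
# ... and via the kernel function alone.
K = np.array([[k(xi, xj) for xj in X] for xi in X])

assert np.allclose(K_phi, K)  # k(x^i, x^j) = <phi(x^i), phi(x^j)>
```

Training therefore only ever needs the entries of this Gram matrix, never the (possibly high-dimensional) images $\phi(x^i)$.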

The optimization problem then consists of maximizing the following quantity:

$$
-\frac{1}{2}\sum_{i,j=1}^{M}\left(\alpha_i-\alpha_i'\right)\left(\alpha_j-\alpha_j'\right)k(x^i,x^j)
\;-\;\varepsilon\sum_{i=1}^{M}\left(\alpha_i+\alpha_i'\right)
\;+\;\sum_{i=1}^{M} y^i\left(\alpha_i-\alpha_i'\right)
\qquad (5.55)
$$

subject to $\sum_{i=1}^{M}\left(\alpha_i-\alpha_i'\right)=0$ and $\alpha_i,\alpha_i' \in [0, A]$, $\varepsilon \in \mathbb{R}$. In this case, the class label is computed as follows:

$$
y = \operatorname{sgn}\left(\sum_{i=1}^{M}\alpha_i\, k\left(x, x^i\right) + b\right)
\qquad (5.56)
$$

Each expansion corresponds to a separating hyperplane in a feature space. In this sense, the $\alpha_i$ can be considered a dual representation of the hyperplane's normal vector. A test point is classified by comparing it to all the training points with non-zero weight.
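As an illustrative sketch of this dual representation (using scikit-learn rather than anything from the notes; the dataset, RBF kernel and parameter values below are assumptions), the decision value of a trained kernel SVM can be recomputed by hand from its support vectors and their weights, exactly as in Eq. (5.56):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Assumed toy dataset; it only serves to produce a trained model.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

clf = SVC(kernel="rbf", gamma=0.5, C=1.0).fit(X, y)

def rbf(a, b, gamma=0.5):
    # Same kernel as the model above: k(a, b) = exp(-gamma * ||a - b||^2).
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

x_test = X[:5]

# Decision value recomputed from the dual representation:
# f(x) = sum_i alpha_i y^i k(x, x^i) + b, where only the support vectors
# (points with non-zero weight) contribute. sklearn stores alpha_i * y^i
# in dual_coef_ and the support vectors in support_vectors_.
f_manual = np.array([
    np.sum(clf.dual_coef_[0] * rbf(x, clf.support_vectors_)) + clf.intercept_[0]
    for x in x_test
])

assert np.allclose(f_manual, clf.decision_function(x_test))
labels = np.sign(f_manual)  # class label as in Eq. (5.56)
```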

5.7.4 ν-SVM

Choosing the right parameter $C$ may be difficult in practice. ν-SVM is an alternative that automatically optimizes the tradeoff between model complexity (the largest margin) and the penalty on the error. To this end, it introduces two other parameters, ν and ρ. $\nu \geq 0$ is an open parameter, while $\rho$ is optimized for. The objective function becomes:

$$
\min_{w,\,b,\,\xi,\,\rho}\left\{\frac{1}{2}\|w\|^{2} - \nu\rho + \frac{1}{M}\sum_{i=1}^{M}\xi_i\right\}
$$

subject to $y^i\left(\left\langle w, x^i\right\rangle + b\right) \geq \rho - \xi_i$, and $\xi_i \geq 0$, $\rho \geq 0$.  (5.57)

To understand the role of $\rho$, observe first that when the points are well classified, the margin has now been changed to $2\rho/\|w\|$: the larger $\rho$, the larger the margin. Since points within the margin may be misclassified, one can compute the margin error, i.e. the fraction of points that lie inside the margin or are misclassified. ν controls the tradeoff between this increase in the margin error and optimizing for a large value of $\rho$. One can show that (see the sketch after this list):

• ν is an upper bound on the fraction of margin errors (i.e. the fraction of datapoints that are misclassified or fall inside the margin)

• ν is a lower bound on the fraction of support vectors
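A small empirical sketch of these two bounds, using scikit-learn's NuSVC (the dataset and the value ν = 0.2 are illustrative assumptions, and the margin-error count relies on libsvm's convention, used by scikit-learn, of rescaling the decision function so that the margin sits at |f(x)| = 1):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC

# Assumed toy problem.
X, y = make_classification(n_samples=200, n_features=4, random_state=1)

nu = 0.2
clf = NuSVC(nu=nu, kernel="rbf", gamma="scale").fit(X, y)

M = len(X)
frac_support_vectors = len(clf.support_) / M

# Margin errors are the points lying inside the margin or on the wrong side
# of it. Assuming libsvm's rescaled decision function (margin at |f(x)| = 1),
# these are the points with y_signed * f(x) < 1.
y_signed = np.where(y == clf.classes_[1], 1, -1)
frac_margin_errors = np.mean(y_signed * clf.decision_function(X) < 1)

# nu upper-bounds the fraction of margin errors and
# lower-bounds the fraction of support vectors.
print(frac_margin_errors, "<=", nu, "<=", frac_support_vectors)
```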

