If the kernel function $k$ is such that $k\left(x^i, x^j\right) = \left\langle \varphi\left(x^i\right), \varphi\left(x^j\right) \right\rangle$, then training depends only on knowing $k$ and does not require knowing $\varphi$.
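To make this concrete, here is a minimal sketch (added here for illustration; it is not part of the original notes) using the degree-2 homogeneous polynomial kernel, whose feature map $\varphi$ can be written out explicitly so that the identity $k(x,y)=\left\langle\varphi(x),\varphi(y)\right\rangle$ can be verified numerically:

```python
import numpy as np

def phi(x):
    # Explicit feature map of the degree-2 homogeneous polynomial kernel:
    # all pairwise products x_a * x_b, so that phi(x) . phi(y) = (x . y) ** 2.
    return np.outer(x, x).ravel()

def k(x, y):
    # The same quantity computed directly in input space: phi never appears.
    return np.dot(x, y) ** 2

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)

# Both routes give the same number, so a training algorithm that touches
# the data only through k never needs the feature map phi explicitly.
assert np.isclose(k(x, y), phi(x) @ phi(y))
print(k(x, y), phi(x) @ phi(y))
```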
The optimization problem then consists in maximizing the following quantity:
$$-\frac{1}{2}\sum_{i,j=1}^{M}\left(\alpha_i-\alpha'_i\right)\left(\alpha_j-\alpha'_j\right)k\left(x^i,x^j\right)-\varepsilon\sum_{i=1}^{M}\left(\alpha_i+\alpha'_i\right)+\sum_{i=1}^{M}y_i\left(\alpha_i-\alpha'_i\right) \qquad (5.55)$$

subject to $\sum_{i=1}^{M}\left(\alpha_i-\alpha'_i\right)=0$ and $\alpha_i,\,\alpha'_i\in\left[0,A\right]$, with $\varepsilon\in\mathbb{R}$. In this case, the class label is
computed as follows:
$$y=\operatorname{sgn}\left(\sum_{i=1}^{M}\alpha_i\,k\left(x,x^i\right)+b\right) \qquad (5.56)$$
Each expansion corresponds to a separating hyperplane in a feature space. In this sense, the $\alpha_i$
can be considered a dual representation of the hyperplane's normal vector. A test point is
classified by comparing it to all the training points with non-zero weight.
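As an illustration of (5.56), the sketch below evaluates such a kernel expansion for a test point. The RBF kernel choice, the support vectors, the weights $\alpha_i$, and the offset $b$ are all hypothetical placeholders standing in for the output of a trained SVM:

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian RBF kernel, one common choice for k.
    return np.exp(-gamma * np.sum((x - y) ** 2))

def label(x, support_x, alpha, b, kernel=rbf_kernel):
    # Class label as in (5.56): the sign of the kernel expansion over the
    # training points with non-zero weight (the support vectors).
    s = sum(a * kernel(x, xi) for a, xi in zip(alpha, support_x))
    return np.sign(s + b)

# Toy usage with made-up support vectors and signed weights:
support_x = [np.array([0.0, 1.0]), np.array([1.0, 0.0])]
alpha = [0.7, -0.7]  # sign of alpha_i encodes the class of the support vector
b = 0.0
print(label(np.array([0.1, 0.9]), support_x, alpha, b))  # near the first SV -> 1.0
```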
5.7.4 ν-SVM
Choosing the right parameter $C$ may be difficult in practice. ν-SVM is an alternative that automatically
optimizes the tradeoff between model complexity (the largest margin) and the penalty on the errors.
To this end, it introduces two further parameters, $\nu$ and $\rho$: $\nu \ge 0$ is an open parameter,
while $\rho$ is optimized for. The objective function becomes:
$$\min_{w,\,\xi,\,\rho}\;\left\{\left\|w\right\|^{2}-\nu\rho+\frac{1}{M}\sum_{i=1}^{M}\xi_i\right\} \qquad (5.57)$$

subject to $y_i\left(\left\langle w,x^i\right\rangle+b\right)\ge\rho-\xi_i$ and $\xi_i\ge 0$, $\rho\ge 0$.
To understand the role of $\rho$, observe first that when the points are well classified, the margin
is now $2\rho/\left\|w\right\|$: the larger $\rho$, the larger the margin. Since points within the margin
may be misclassified, one can compute the margin error, i.e., the fraction of points that lie within
the margin or are misclassified. $\nu$ controls the tradeoff between this margin error and the
maximization of $\rho$. One can show that:
• ν is an upper bound on the fraction of margin errors (i.e., the fraction of datapoints that are
misclassified or lie within the margin)
• ν is a lower bound on the fraction of support vectors (both bounds can be checked empirically, as in the sketch below)
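Both bounds can be observed empirically. The sketch below assumes scikit-learn's NuSVC on synthetic data (neither appears in the original notes); since misclassified points are a subset of the margin errors, the training-error fraction should stay below $\nu$, while the support-vector fraction should stay above it.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import NuSVC

# Two overlapping blobs, so that some training points cannot be separated.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

for nu in (0.1, 0.3, 0.5):
    clf = NuSVC(nu=nu, kernel="rbf", gamma="scale").fit(X, y)
    train_err = np.mean(clf.predict(X) != y)   # subset of the margin errors
    sv_frac = len(clf.support_) / len(X)       # fraction of support vectors
    # Expected: train_err <= nu <= sv_frac (the two bounds above).
    print(f"nu={nu:.1f}  train_err={train_err:.2f}  sv_frac={sv_frac:.2f}")
```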
© A.G.Billard 2004 – Last Update March 2011