SVM Example

Dan Ventura
Figure 8: The discriminating hyperplane corresponding to the values \(\alpha_1 = -7\) and \(\alpha_2 = 4\).

\[
\tilde{w} = \sum_i \alpha_i \Phi_1(s_i)
= -7 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + 4 \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \\ 1 \\ -3 \end{pmatrix}
\]

giving us the separating hyperplane equation \(w \cdot x + b = 0\) with \(w = \begin{pmatrix} 1 \\ 1 \end{pmatrix}\) and \(b = -3\). Plotting the line gives the expected decision surface (see Figure 8).

3.1 Using the SVM

Let's briefly look at how we would use the SVM model to classify data. Given \(x\), the classification \(f(x)\) is given by the equation

\[
f(x) = \sigma\left( \sum_i \alpha_i \, \Phi(s_i) \cdot \Phi(x) \right) \tag{2}
\]

where \(\sigma(z)\) returns the sign of \(z\). For example, if we wanted to classify the point \(x = (4, 5)\) using the mapping function of Eq. 1,

\[
f\begin{pmatrix} 4 \\ 5 \end{pmatrix}
= \sigma\left( -7\, \Phi_1\begin{pmatrix} 1 \\ 1 \end{pmatrix} \cdot \Phi_1\begin{pmatrix} 4 \\ 5 \end{pmatrix}
+ 4\, \Phi_1\begin{pmatrix} 2 \\ 2 \end{pmatrix} \cdot \Phi_1\begin{pmatrix} 4 \\ 5 \end{pmatrix} \right)
\]
\[
= \sigma\left( -7 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}
+ 4 \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \right)
= \sigma(-2)
\]
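The arithmetic above is easy to check numerically. Below is a minimal sketch (not part of the original text) that plugs the already-mapped, augmented feature vectors from the worked example into Eq. 2; the trailing 1 in each vector is the augmentation that absorbs the bias term.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sigma(z):
    """Sign function from Eq. 2 (z = 0 is a tie, arbitrarily returned as -1)."""
    return 1 if z > 0 else -1

# Support-vector coefficients and their feature-space images from the example:
alphas = [-7, 4]
phi_s = [(1, 1, 1),   # Phi_1 of support vector (1, 1), augmented
         (2, 2, 1)]   # Phi_1 of support vector (2, 2), augmented

def f(phi_x):
    """Eq. 2: f(x) = sigma(sum_i alpha_i * Phi(s_i) . Phi(x))."""
    return sigma(sum(a * dot(ps, phi_x) for a, ps in zip(alphas, phi_s)))

# Phi_1((4, 5)) = (0, 1), augmented to (0, 1, 1):
print(f((0, 1, 1)))   # -> -1, i.e. sigma(-2)
```

Running it reproduces \(\sigma(-7 \cdot 2 + 4 \cdot 3) = \sigma(-2) = -1\), and the two support vectors themselves land on opposite sides of the hyperplane, as they should.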
Figure 9: The decision surface in input space corresponding to \(\Phi_1\). Note the singularity.

and thus we would classify \(x = (4, 5)\) as negative. Looking again at the input space, we might be tempted to think this is not a reasonable classification; however, it is what our model says, and our model is consistent with all the training data. As always, there are no guarantees on generalization accuracy, and if we are not happy about our generalization, the likely culprit is our choice of \(\Phi\). Indeed, if we map our discriminating hyperplane (which lives in feature space) back into input space, we can see the effective decision surface of our model (see Figure 9). Of course, we may or may not be able to improve generalization accuracy by choosing a different \(\Phi\); however, there is another reason to revisit our choice of mapping function.

4 The Kernel Trick

Our definition of \(\Phi\) in Eq. 1 preserved the number of dimensions. In other words, our input and feature spaces are the same size. However, it is often the case that in order to effectively separate the data, we must use a feature space that is of (sometimes very much) higher dimension than our input space. Let us now consider an alternative mapping function

\[
\Phi_2\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= \begin{pmatrix} x_1 \\ x_2 \\ \dfrac{x_1^2 + x_2^2 - 5}{3} \end{pmatrix} \tag{3}
\]

which transforms our data from 2-dimensional input space to 3-dimensional feature space. Using this alternative mapping, the data in the new feature space looks like
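The remapped data can be generated with a couple of lines of code. This is a minimal sketch of Eq. 3 (not from the original text), applied here to the two support vectors from the earlier example:

```python
def phi2(x1, x2):
    """Eq. 3: Phi_2((x1, x2)) = (x1, x2, (x1^2 + x2^2 - 5) / 3)."""
    return (x1, x2, (x1**2 + x2**2 - 5) / 3)

# The points (1, 1) and (2, 2) from the earlier example are lifted to
# heights -1 and +1 in the new third dimension:
print(phi2(1, 1))   # -> (1, 1, -1.0)
print(phi2(2, 2))   # -> (2, 2, 1.0)
```

Note that the lift is radial: the third coordinate depends only on \(x_1^2 + x_2^2\), so points near the origin are pushed down and points far from it are pushed up, which is what makes a linear separator in 3-D feasible.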