SVM Example



Figure 8: The discriminating hyperplane corresponding to the values $\alpha_1 = -7$ and $\alpha_2 = 4$.

\[
= -7 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + 4 \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ -3 \end{pmatrix}
\]

giving us the separating hyperplane equation $y = wx + b$ with $w = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $b = -3$. Plotting the line gives the expected decision surface (see Figure 8).

3.1 Using the SVM

Let's briefly look at how we would use the SVM model to classify data. Given $x$, the classification $f(x)$ is given by the equation

\[
f(x) = \sigma\!\left(\sum_i \alpha_i \, \Phi(s_i) \cdot \Phi(x)\right) \tag{2}
\]

where $\sigma(z)$ returns the sign of $z$. For example, if we wanted to classify the point $x = (4, 5)$ using the mapping function of Eq. 1,

\[
f\begin{pmatrix} 4 \\ 5 \end{pmatrix}
= \sigma\!\left(-7\,\Phi_1\begin{pmatrix} 1 \\ 1 \end{pmatrix} \cdot \Phi_1\begin{pmatrix} 4 \\ 5 \end{pmatrix} + 4\,\Phi_1\begin{pmatrix} 2 \\ 1 \end{pmatrix} \cdot \Phi_1\begin{pmatrix} 4 \\ 5 \end{pmatrix}\right)
\]
\[
= \sigma\!\left(-7 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + 4 \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}\right)
= \sigma(-2)
\]

and thus we would classify $x = (4, 5)$ as negative.
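To make the arithmetic concrete, here is a minimal Python sketch of Eq. 2 (an illustration added here, not part of the tutorial). It hard-codes the multipliers $\alpha_1 = -7$ and $\alpha_2 = 4$ and the support vectors in their augmented feature-space form, $\Phi_1(s_1) = (1, 1, 1)$ and $\Phi_1(s_2) = (2, 2, 1)$; the mapped query point $\Phi_1((4, 5)) = (0, 1, 1)$ is taken from the worked example above, since Eq. 1 itself is defined earlier in the tutorial.

```python
import numpy as np

# Multipliers and support vectors in augmented feature space,
# taken from the worked example above.
alphas = np.array([-7.0, 4.0])
phi_support = np.array([[1.0, 1.0, 1.0],   # Phi_1(s_1)
                        [2.0, 2.0, 1.0]])  # Phi_1(s_2)

def classify(phi_x):
    """Eq. 2: f(x) = sigma(sum_i alpha_i * Phi(s_i) . Phi(x))."""
    return np.sign(alphas @ (phi_support @ phi_x))

# Phi_1((4, 5)) = (0, 1, 1), as computed in the text from Eq. 1.
print(classify(np.array([0.0, 1.0, 1.0])))  # -1.0, i.e. sigma(-2)

# Sanity check: the same multipliers recover the augmented
# hyperplane vector (w, b) = (1, 1, -3) derived above.
print(alphas @ phi_support)  # [ 1.  1. -3.]
```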

Figure 9: The decision surface in input space corresponding to $\Phi_1$. Note the singularity.

Looking again at the input space, we might be tempted to think this is not a reasonable classification; however, it is what our model says, and our model is consistent with all the training data. As always, there are no guarantees on generalization accuracy, and if we are not happy with our generalization, the likely culprit is our choice of $\Phi$. Indeed, if we map our discriminating hyperplane (which lives in feature space) back into input space, we can see the effective decision surface of our model (see Figure 9). Of course, we may or may not be able to improve generalization accuracy by choosing a different $\Phi$; however, there is another reason to revisit our choice of mapping function.

4 The Kernel Trick

Our definition of $\Phi$ in Eq. 1 preserved the number of dimensions. In other words, our input and feature spaces are the same size. However, it is often the case that in order to effectively separate the data, we must use a feature space of (sometimes very much) higher dimension than our input space. Let us now consider an alternative mapping function

\[
\Phi_2\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \\ \dfrac{(x_1^2 + x_2^2) - 5}{3} \end{pmatrix} \tag{3}
\]

which transforms our data from 2-dimensional input space to 3-dimensional feature space. Using this alternative mapping, the data in the new feature space looks like
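As a quick check on Eq. 3 before looking at the mapped data, the following short Python sketch (again an added illustration, not from the tutorial) applies $\Phi_2$ to the points that appear in this section; the full training set is given earlier in the document.

```python
import numpy as np

def phi2(x):
    """Eq. 3: lift a 2-D input (x1, x2) into 3-D feature space."""
    x1, x2 = x
    return np.array([x1, x2, ((x1**2 + x2**2) - 5.0) / 3.0])

# Points appearing in this section: the two support vectors
# and the query point x = (4, 5).
for p in [(1.0, 1.0), (2.0, 1.0), (4.0, 5.0)]:
    print(p, "->", phi2(p))
```

Note that the third coordinate is the only nonlinear term; it is what allows a flat hyperplane in the 3-dimensional feature space to correspond to a curved decision surface back in the 2-dimensional input space.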

