# Support Vector Machine (SVM)

An SVM is a method for solving two-class problems and belongs to the class of discriminative models [33, 39]. A linear classifier assigns data, x, to one of two categories based on the sign of the following linear classification function:

*f*(x) = *w*ᵀ*φ*(x) + *b*,

where *φ*(x) denotes a vector whose components are features extracted from the data x, and *w* and *b* denote the coefficients of the linear function. The features to be extracted are determined in advance at the data structure analysis step shown in Fig. 2.6, and in the classifier design step, the values of these coefficients are estimated by using a set of given training data. One of the strong points of SVMs is the ability to estimate the coefficient values by solving a convex optimization problem: if the training data are linearly separable, then the globally optimal values of the coefficients can be attained. Assuming that the data are linearly separable, the convex optimization problem is derived from a criterion for the goodness of the decision boundary, *f*(x) = 0, which is a hyperplane in the feature space represented by *φ*(x). Let the term *margin* denote the distance between the decision boundary, *f*(x) = 0, and the training data closest to the decision boundary. An SVM estimates the values of the coefficients that generate the decision boundary maximizing the margin (see Fig. 2.8). This margin-maximization strategy serves to minimize the generalization error of the resultant linear classifier.
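The two quantities above, the sign of *f*(x) and the margin, can be illustrated with a small numerical sketch. The coefficients *w* and *b*, the data points, and the identity feature map used here are all hypothetical choices for illustration, not values from the text; the margin is computed as the smallest distance |*f*(xᵢ)|/‖*w*‖ over the training data.

```python
import numpy as np

# Hypothetical coefficients of a learned linear classifier f(x) = w^T phi(x) + b.
# For simplicity, phi is taken to be the identity map (features = raw inputs).
w = np.array([2.0, 1.0])
b = -1.0

def f(x):
    """Linear classification function; its sign gives the predicted class."""
    return w @ x + b

# A few illustrative training points with class labels in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 0.0], [0.0, 0.0], [-1.0, 1.0]])
y = np.array([1, 1, -1, -1])

scores = np.array([f(x) for x in X])

# Every point is correctly classified when y_i * f(x_i) > 0.
correct = y * scores > 0

# The margin is the distance from the decision boundary f(x) = 0
# to the closest training point: min_i |f(x_i)| / ||w||.
margin = np.min(np.abs(scores)) / np.linalg.norm(w)
```

An SVM would search over *w* and *b* for the coefficients that make this `margin` as large as possible while keeping every `correct` entry true.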

Fig. 2.8 **An example of a decision boundary computed by an SVM**

The assumption of linear separability can be relaxed by using soft-margin techniques [39], and nonlinear classification functions can be constructed by using the kernel trick [40].
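The idea behind the kernel trick mentioned above can be verified numerically: a kernel function evaluates the inner product *φ*(x)ᵀ*φ*(z) in a high-dimensional feature space without ever constructing *φ* explicitly. The sketch below uses the degree-2 polynomial kernel as an example (the specific kernel and inputs are illustrative choices, not taken from the text).

```python
import numpy as np

def poly_kernel(x, z):
    """Degree-2 polynomial kernel: k(x, z) = (x . z + 1)^2."""
    return (x @ z + 1.0) ** 2

def phi(x):
    """Explicit feature map for 2-D inputs satisfying
    (x . z + 1)^2 = phi(x) . phi(z)."""
    x1, x2 = x
    return np.array([x1**2, x2**2,
                     np.sqrt(2.0) * x1 * x2,
                     np.sqrt(2.0) * x1,
                     np.sqrt(2.0) * x2,
                     1.0])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# The kernel reproduces the feature-space inner product exactly,
# so a linear classifier in phi-space needs only kernel evaluations.
k_val = poly_kernel(x, z)
inner = phi(x) @ phi(z)
```

Because only kernel evaluations are needed, the decision function can be nonlinear in x while the optimization problem remains the same convex problem as in the linear case.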