What is SVM? :- A support vector machine (SVM) is a supervised learning model that classifies data using a separating boundary called a hyperplane. In other words, given a set of labeled training data, the support vector classifier outputs the hyperplane that best separates the classes, and that hyperplane is then used to categorize new inputs.
Some important terms:-
Hyperplane:- Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side of the hyperplane are assigned to different classes. In two dimensions a hyperplane is simply a line, and in three dimensions it is a plane. As a first approximation, an SVM looks for a separating line (or hyperplane) between the data of two classes: it takes the data as input and outputs the line that best separates those classes, if one exists. Infinitely many lines can separate the two classes, so how does SVM find the ideal one? Suppose we have two candidates, a green line and a yellow line. Which line do you think best separates the data? If you picked the yellow line, congratulations, because that is the line we are looking for.
Support vectors:- In the SVM algorithm, we find the points from both classes that lie closest to the line. These points are called support vectors. We then compute the distance between the line and the support vectors. This distance is called the margin. Our goal is to maximize the margin. The hyperplane for which the margin is maximum is the optimal hyperplane.
Margin:- The distance between the hyperplane and the support vectors is called the margin. Our goal is to maximize the margin by finding the best separator (hyperplane).
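To make the terms above concrete, here is a minimal sketch that fits a linear SVM on a tiny made-up dataset and reads off the support vectors and the margin. The data points and the large value of C (used here to approximate a hard margin) are assumptions chosen for illustration. With scikit-learn's linear SVC, the distance from the hyperplane to the closest points works out to 1 / ||w||.

```python
import numpy as np
from sklearn import svm

# Toy, linearly separable data (made up for illustration)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations,
# which approximates the hard-margin SVM described above
clf = svm.SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]          # normal vector of the hyperplane
b = clf.intercept_[0]     # offset of the hyperplane

# Distance from the hyperplane to the closest points (the support vectors)
margin = 1.0 / np.linalg.norm(w)

print("support vectors:")
print(clf.support_vectors_)
print("margin:", margin)
```

Only the points closest to the boundary show up in support_vectors_; moving any of the other points slightly would not change the hyperplane.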
Let's consider a slightly more complex dataset, one that is not linearly separable.
This data is not linearly separable, so no straight line can divide the two classes. But we can map it into a higher dimension. Let's add one more dimension and call it the z-axis, defined as z = x^2 + y^2. In this new space the data becomes linearly separable. Let the black line separating the data be z = k, where k is a constant. Since z = x^2 + y^2, this becomes x^2 + y^2 = k, which is the equation of a circle. So, using this transformation, we can project the linear separator from the higher dimension back into the original dimensions, where it appears as a circle.
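The idea of adding a z-axis can be tried out with a short sketch. The dataset below is synthetic (an inner disc and an outer ring, generated with an arbitrary random seed), so the radii and sample counts are assumptions, not data from this post. A linear SVM is fitted once on the original (x, y) features and once with the extra feature z = x^2 + y^2.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)

# Synthetic data: class 0 inside radius 1, class 1 between radius 2 and 3
n = 100
angles = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radii = np.concatenate([rng.uniform(0.0, 1.0, n), rng.uniform(2.0, 3.0, n)])
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.concatenate([np.zeros(n, dtype=int), np.ones(n, dtype=int)])

# A linear separator in the original 2-D space cannot split the classes
acc_2d = svm.SVC(kernel="linear").fit(X, y).score(X, y)

# Add the third dimension z = x^2 + y^2
z = (X ** 2).sum(axis=1)
X3 = np.column_stack([X, z])

# In 3-D the classes can be split by a flat plane
acc_3d = svm.SVC(kernel="linear").fit(X3, y).score(X3, y)

print("training accuracy in 2-D:", acc_2d)
print("training accuracy in 3-D:", acc_3d)
```

In practice, SVM kernels such as the RBF kernel perform this kind of mapping implicitly, so the extra feature does not have to be built by hand.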
Some Important points.
Hyperplanes close to data points have smaller margins.
The farther a hyperplane is from a data point, the larger its margin will be.
This means that the optimal hyperplane is the one with the biggest margin, because with a larger margin, slight deviations in the data points are less likely to change the outcome of the model.
Math behind SVM
The equation of a line is y = ax + b. Treating x and y as features and renaming them x1 and x2 (in general, x1, x2, ..., xn), the line becomes x2 = ax1 + b, or equivalently ax1 - x2 + b = 0.
Let x = (x1, x2) and w = (a, -1).
We get:-
w·x + b = 0
This equation is derived from 2-d vectors. It can work for any number of dimensions:-
Hyperplane :- w^T x + b = 0
Line :- y = ax+b
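As a quick numerical check of the rewrite from line form to vector form, the sketch below uses a hypothetical slope a = 2 and intercept b = 1 (values picked only for illustration). A point on the line gives w·x + b = 0, and points on either side of the line give values of opposite sign, which is what a classifier uses to decide the class.

```python
import numpy as np

# Hypothetical slope and intercept of the line y = a*x + b
a, b = 2.0, 1.0

# Vector form of the same line: w . x + b = 0 with w = (a, -1)
w = np.array([a, -1.0])

def hyperplane_value(point):
    """Return w . x + b for a 2-D point (x1, x2)."""
    return float(w @ np.asarray(point, dtype=float) + b)

on_line = (1.0, a * 1.0 + b)   # a point that satisfies y = a*x + b
below = (1.0, 0.0)             # a point under the line
above = (1.0, 5.0)             # a point over the line

print(hyperplane_value(on_line))
print(hyperplane_value(below))
print(hyperplane_value(above))
```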
Let's code.
>>> from sklearn import svm
First, we import the svm module from sklearn.
>>> X = [[0, 0], [1, 1]]
>>> y = [0, 1]
SVC takes two arrays as input: an array X of shape (n_samples, n_features) holding the training samples, and an array y of class labels (strings or integers) of shape (n_samples,).
>>> clf = svm.SVC()
>>> clf.fit(X, y)
SVC()
Here, we create a support vector classifier, store it in a variable named clf, and fit it to the training data. After being fitted, the model can then be used to predict new values:-
>>> clf.predict([[2., 2.]])
array([1])
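If you want to look a little further inside the fitted model, the sketch below repeats the same two-point example as a standalone script and inspects two attributes that scikit-learn exposes on a fitted SVC: support_vectors_ and decision_function. The variable names score and pred are just illustrative. For a two-class problem, a positive decision value corresponds to the second class (here, label 1).

```python
from sklearn import svm

# Same training data as in the example above
X = [[0, 0], [1, 1]]
y = [0, 1]

clf = svm.SVC()
clf.fit(X, y)

# The training points that define the decision boundary
print(clf.support_vectors_)

# Signed score for a new point: its sign decides the predicted class
score = clf.decision_function([[2.0, 2.0]])
pred = clf.predict([[2.0, 2.0]])
print(score, pred)
```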
This is a basic SVM classifier. In the next post, we will see the types of SVM classifier. Thank you for being a patient reader. I hope this helps you in understanding SVM well.
You can connect with me on LinkedIn for further tech conversation and networking.