scikit-learn-svm
Credits: Forked from PyCon 2015 Scikit-learn Tutorial by Jake VanderPlas
Support Vector Machine Classifier
Support Vector Machine with Kernels Classifier
Support Vector Machine Classifier
Support Vector Machines (SVMs) are a powerful supervised learning algorithm used for classification or regression. An SVM draws a boundary between clusters of data, attempting to maximize the margin between the two sets of points. Many different lines separate the two clusters in the plot below equally well:
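A minimal sketch of that setup, assuming the two-blob toy data commonly used for this example (make_blobs and the specific random_state are assumptions here):

```python
# Two well-separated blobs of points, plus several candidate lines that all
# separate them. Each line is a "valid" separator; the SVM picks the one
# with the largest margin.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')

xfit = np.linspace(-1, 3.5)
for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    plt.plot(xfit, m * xfit + b, '-k')
plt.xlim(-1, 3.5)
plt.show()
```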
Fit the model:
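Continuing with X and y from the cell above, a sketch of fitting a linear-kernel SVC:

```python
# Fit a linear support vector classifier (C is left at its default).
from sklearn.svm import SVC

clf = SVC(kernel='linear')
clf.fit(X, y)
```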
Plot the boundary:
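One way to draw the boundary is to evaluate the classifier's decision_function on a grid and contour the 0 level (the boundary) together with the ±1 levels (the margins); the helper below is an assumption modeled on the usual tutorial code:

```python
# Evaluate the decision function on a grid and draw the boundary (solid)
# and the margins (dashed).
def plot_svc_decision_function(clf, ax=None):
    if ax is None:
        ax = plt.gca()
    xx, yy = np.meshgrid(np.linspace(*ax.get_xlim(), 50),
                         np.linspace(*ax.get_ylim(), 50))
    P = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contour(xx, yy, P, colors='k',
               levels=[-1, 0, 1], linestyles=['--', '-', '--'])

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(clf)
plt.show()
```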
In the following plot, the dashed lines touch a few of the training points. These points are known as the support vectors, and they are stored in the support_vectors_ attribute of the classifier:
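Continuing from the cells above, a sketch of inspecting and highlighting the support vectors:

```python
# The support vectors are the training points lying on the margins.
print(clf.support_vectors_)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(clf)
# Circle the support vectors on top of the scatter plot.
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=200, facecolors='none', edgecolors='k')
plt.show()
```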
Use IPython's interact functionality to explore how the distribution of points affects the support vectors and the discriminative fit:
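A hedged sketch using interact from ipywidgets; the slider range and the re-generated blob data are assumptions:

```python
# Refit and replot the SVM as the number of training points changes.
from ipywidgets import interact

def plot_svm(N=100):
    X, y = make_blobs(n_samples=200, centers=2, random_state=0, cluster_std=0.60)
    X, y = X[:N], y[:N]
    clf = SVC(kernel='linear')
    clf.fit(X, y)
    plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
    plt.xlim(-1, 4)
    plt.ylim(-1, 6)
    plot_svc_decision_function(clf, plt.gca())
    plt.show()

interact(plot_svm, N=(10, 200, 10))
```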
Support Vector Machine with Kernels Classifier
Kernels are useful when the decision boundary is not linear. A kernel is a functional transformation of the input data, and the "kernel trick" lets an SVM work in the transformed space efficiently without computing the transformation explicitly. In the example below, a linear boundary is not useful for separating the two groups of points:
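A sketch of such a case, assuming the concentric-circles toy data from make_circles; a linear-kernel SVC fitted to it produces an unhelpful boundary:

```python
# Two concentric rings of points: no straight line can separate them.
from sklearn.datasets import make_circles

X, y = make_circles(100, factor=0.1, noise=0.1, random_state=0)

clf = SVC(kernel='linear').fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(clf)
plt.show()
```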
A simple transformation that could be useful here is a radial basis function:
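For example, the Gaussian of each point's distance from the origin can be added as a third coordinate (the specific basis function and its center are assumptions here):

```python
# Lift the 2D circles data into 3D using a radial basis function
# centered at the origin.
r = np.exp(-(X ** 2).sum(1))

from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib

ax = plt.figure().add_subplot(projection='3d')
ax.scatter(X[:, 0], X[:, 1], r, c=y, s=50, cmap='autumn')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('r')
plt.show()
```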
In three dimensions there is a clear separation between the two classes. Run the SVM with the rbf kernel:
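Continuing from the circles data above:

```python
# The built-in RBF kernel performs this kind of lift implicitly via the
# kernel trick, so the nonlinear boundary is learned directly in 2D.
clf = SVC(kernel='rbf')
clf.fit(X, y)

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(clf)
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=200, facecolors='none', edgecolors='k')
plt.show()
```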
SVM additional notes:
When using an SVM you need to choose good values for parameters such as C and gamma. Model validation (for example a cross-validated grid search) can help determine these values systematically rather than by trial and error.
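A sketch of one systematic way to do this, using GridSearchCV; the particular grid values are arbitrary examples:

```python
# Cross-validated grid search over C and gamma for an RBF-kernel SVC.
from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': [0.001, 0.01, 0.1, 1]}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_)
```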
Kernel SVM training scales poorly, roughly O(n^2) to O(n^3) in the number of samples, so SVC does not scale well to large data sets; LinearSVC scales much better. For large data sets, try mapping the data through an explicit approximation of the RBF feature space and then using LinearSVC on the transformed data.
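A sketch of that workaround, assuming scikit-learn's kernel_approximation module: an approximate explicit RBF feature map (RBFSampler) followed by the scalable LinearSVC:

```python
# Approximate the RBF kernel with an explicit random feature map, then
# train a linear SVM on the transformed data.
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

approx_rbf_svm = make_pipeline(RBFSampler(gamma=1.0, random_state=0),
                               LinearSVC())
approx_rbf_svm.fit(X, y)
print(approx_rbf_svm.score(X, y))
```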