Understanding One-Class SVM for Anomaly Detection

The applications of machine learning are endless. From predicting the price of a house or a product (regression) and classifying customer reviews as positive or negative (classification) to more advanced techniques like anomaly detection, machine learning algorithms are being used everywhere.

Machine learning algorithms can be broadly divided into supervised and unsupervised techniques. In supervised learning, the model is trained on labeled data. In unsupervised learning, the model is trained on unlabeled data in order to discover patterns in it.

Most models are designed for one of these settings. However, a few algorithms can be used for both supervised and unsupervised tasks. One such machine learning algorithm is the Support Vector Machine (SVM).

SVM tries to find the optimal hyperplane that separates the data points into different classes. The concept of a separating hyperplane makes sense when there is more than one class to deal with. That brings us to the main question: how is SVM useful when there is only one class?

That is what we try to answer in this post. Keep reading!

One-Class SVM, a variant of Support Vector Machines, specializes in anomaly detection, primarily used in unsupervised learning tasks. This algorithm identifies outliers by training on a single class of data, making it ideal for spotting anomalies in complex datasets, such as fraud detection or unusual patterns in medical imaging.

Introduction to Support Vector Machine (SVM)

As discussed above, SVM can be used in supervised learning tasks (both regression and classification), although it is most commonly used for classification problems. SVM finds the hyperplane that best separates the data points in an n-dimensional space into their respective classes.

SVM can be used in binary classification (two classes) and multiclass classification (more than two classes). It can also predict numerical values, which makes it suitable for regression tasks.
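As a rough illustration of the two supervised uses described above, the sketch below fits scikit-learn's SVC on two synthetic clusters and SVR on a noisy line. The data, the fixed seed, the linear kernel, and all variable names are assumptions made for this example only.

```python
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Classification: two well-separated 2-D clusters labeled 0 and 1
X_cls = np.vstack([rng.normal(-2, 0.5, (50, 2)), rng.normal(2, 0.5, (50, 2))])
y_cls = np.array([0] * 50 + [1] * 50)
classifier = SVC(kernel="linear").fit(X_cls, y_cls)
cls_pred = classifier.predict([[-2, -2], [2, 2]])
print("Predicted classes:", cls_pred)

# Regression: a noisy linear relationship y = 3x
X_reg = rng.uniform(0, 10, (100, 1))
y_reg = 3 * X_reg.ravel() + rng.normal(0, 0.1, 100)
regressor = SVR(kernel="linear").fit(X_reg, y_reg)
reg_pred = regressor.predict([[5.0]])
print("Predicted value at x=5:", reg_pred)
```

The classifier should assign each cluster center to its own class, and the regressor's prediction at x = 5 should land near 15.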

Now that we have understood the basic concept of the support vector machine, let us move on to the next topic: One-Class SVM.

One-Class SVM

One-Class SVM is available in the scikit-learn library as the OneClassSVM class in the sklearn.svm module. Its signature is given below:

class sklearn.svm.OneClassSVM(*, kernel='rbf', degree=3, gamma='scale', coef0=0.0, tol=0.001, nu=0.5, shrinking=True, cache_size=200, verbose=False, max_iter=-1)

The important parameters of this class are:

kernel: The kernel function used by the model. The options are ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, and ‘precomputed’. If no kernel is specified, ‘rbf’ is used by default
degree: The degree of the polynomial kernel function. It is only used when the kernel is ‘poly’ (default = 3)
gamma: The kernel coefficient. It is only used when the kernel is ‘rbf’, ‘poly’, or ‘sigmoid’ (default = ‘scale’)
coef0: The independent term in the kernel function. It is only significant when the kernel is ‘poly’ or ‘sigmoid’ (default = 0.0)
max_iter: The maximum number of iterations the solver is allowed to run during training. The default of -1 means there is no limit
nu: An upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. It must lie in the interval (0, 1] and defaults to 0.5. In practice, it controls roughly what share of the training points the model may treat as outliers
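To get a feel for the nu parameter, the sketch below fits two models on the same synthetic data and measures the fraction of training points each one flags as outliers. The data, the seed, and the two nu values are assumptions chosen for illustration. The measured fractions usually track nu only approximately.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)  # fixed seed, chosen only for reproducibility
X_train = rng.normal(0, 1, (500, 2))  # synthetic "normal" data

outlier_fraction = {}
for nu in (0.05, 0.5):
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X_train)
    # predict returns -1 for points the model treats as outliers
    outlier_fraction[nu] = float(np.mean(model.predict(X_train) == -1))
    print(f"nu={nu}: fraction of training points flagged = {outlier_fraction[nu]:.2f}")
```

A larger nu should flag a noticeably larger share of the training set, which is why nu is often set close to the expected proportion of anomalies.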

Let us see an example of using One-Class SVM for anomaly detection by training it on normal data and testing it on both normal and abnormal data.

from sklearn import svm
import numpy as np
import matplotlib.pyplot as plt

We import the svm module from the scikit-learn library, the NumPy library for numerical computation, and the Matplotlib library for data visualization.

# Normal dataset used for training
normaldata = np.random.normal(0, 1, (100, 2))
# Train the model on the normal data only
clf = svm.OneClassSVM(nu=0.1, kernel="rbf")
clf.fit(normaldata)
# Test data with both normal and anomalous data points
testnormal = np.random.normal(0, 1, (50, 2))
testanomalous = np.random.uniform(-5, 5, (10, 2))

First, we generate a set of synthetic points using NumPy’s random module. These points, drawn from a standard normal distribution, are taken as the normal data. The One-Class SVM is fitted on these points with kernel = rbf and nu = 0.1, which bounds the fraction of training points treated as outliers at roughly 10%.

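We then generate synthetic test data for both the normal and anomalous labels. The normal test points come from the same distribution as the training data, while the anomalous points are drawn uniformly from the range -5 to 5.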

The trained model can then label each test point with its predict method, which returns 1 for points it considers normal and -1 for points it considers anomalous.
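The prediction step can be sketched as follows. This is a self-contained version that regenerates similar data with a fixed seed. The variable names and the seed are assumptions for this example and are not part of the snippet above.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
train_normal = rng.normal(0, 1, (100, 2))
test_normal = rng.normal(0, 1, (50, 2))
test_anomalous = rng.uniform(-5, 5, (10, 2))

model = OneClassSVM(nu=0.1, kernel="rbf").fit(train_normal)

# predict returns +1 for inliers and -1 for outliers
pred_normal = model.predict(test_normal)
pred_anomalous = model.predict(test_anomalous)

flagged_normal = float(np.mean(pred_normal == -1))
flagged_anomalous = float(np.mean(pred_anomalous == -1))
print(f"Normal test points flagged as anomalies: {flagged_normal:.0%}")
print(f"Anomalous test points flagged as anomalies: {flagged_anomalous:.0%}")
```

The model should flag a much larger share of the uniformly drawn points than of the normal test points, although some normal points will still be flagged and some anomalous points may be missed.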

# Visualization
plt.figure(figsize=(10, 6))
plt.scatter(normaldata[:, 0], normaldata[:, 1], label='Normal Data', color='green')
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=100, facecolors='none', edgecolors='r', label='Support Vectors')
plt.scatter(testnormal[:, 0], testnormal[:, 1], label='Test Data - Normal', color='blue')
plt.scatter(testanomalous[:, 0], testanomalous[:, 1], label='Test Data - Anomalous', color='red')
plt.title('1-Class SVM for Anomaly Detection')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()

We display the training data, the normal test data, and the anomalous test data using a scatter plot. In addition, we highlight the support vectors. The support vectors are the training points that determine the decision boundary, so they are crucial in classifying a point as abnormal or normal.
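The support vectors can also be inspected numerically. The sketch below assumes the same kind of synthetic training data as in the example, with a fixed seed chosen for illustration. It checks that the support vectors are rows of the training set and counts them.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
X_train = rng.normal(0, 1, (100, 2))
nu = 0.1
model = OneClassSVM(nu=nu, kernel="rbf").fit(X_train)

# support_ holds the indices of the support vectors in the training set
n_support = model.support_vectors_.shape[0]
are_training_rows = bool(np.allclose(model.support_vectors_, X_train[model.support_]))

print(f"Support vectors: {n_support} of {len(X_train)} training points")
print("Support vectors are rows of the training data:", are_training_rows)
```

Because nu is a lower bound on the fraction of support vectors, the count here should be at least nu times the number of training points.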

The title for the plot is set along with the x-axis and y-axis labels. All these components are displayed using the show method.

Because the anomalous test points are drawn uniformly from a much wider range than the normal data, most of them lie far from the normal cluster and can be considered outliers. A few of them may still fall inside the normal region by chance.
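The effect of distance from the normal cluster can be checked with decision_function, which returns a signed score for each point. A negative score corresponds to a predicted anomaly. The probe points and the seed below are assumptions picked for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)  # fixed seed, chosen only for reproducibility
X_train = rng.normal(0, 1, (100, 2))
model = OneClassSVM(nu=0.1, kernel="rbf").fit(X_train)

# Hypothetical probe points: the cluster center and a point far outside it
probe_points = np.array([[0.0, 0.0], [5.0, 5.0]])
scores = model.decision_function(probe_points)
labels = model.predict(probe_points)

for point, score, label in zip(probe_points, scores, labels):
    print(f"point={point}, score={score:.3f}, label={label}")
```

The far point should receive a lower score than the point at the center of the training data and should be labeled -1.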

Conclusion

One-Class SVM stands out in machine learning for its unique approach to anomaly detection, especially in unsupervised scenarios. Its versatility in handling complex data makes it a powerful tool for real-world applications like fraud detection in finance and identifying anomalies in healthcare data. How might One-Class SVM evolve to tackle emerging challenges in anomaly detection?

References

One class SVM documentation