Anomaly detection is the discovery or classification of events or observations that differ substantially from most of the data. Anomalies are also known as outliers, deviations, novelties, exceptions, or noise. Anomaly detection techniques fall into the three categories given below.

Unsupervised
Supervised
Semi-supervised

Unsupervised anomaly detection techniques work on unlabeled data and assume that the majority of the data points are normal. They detect anomalies by looking for the data points that fit the rest of the dataset least well.
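
As a minimal sketch of the unsupervised approach, the Python example below scores each point by its distance to its k-th nearest neighbor, so the point that fits the rest of the data least gets the highest score. The function name knn_outlier_scores, the parameter k, and the sample data are illustrative assumptions and are not part of Rubiscape.

```python
import math

def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbor.

    A larger score means the point fits the rest of the data less well.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical unlabeled data: a tight cluster plus one distant point.
data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]
scores = knn_outlier_scores(data, k=2)

# Index of the point with the highest score, i.e. the most anomalous one.
most_anomalous = max(range(len(data)), key=lambda i: scores[i])
print(most_anomalous, scores)
```

No labels are needed here: the score depends only on how each point relates to the others.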

Supervised anomaly detection techniques require a dataset in which the data points are labeled as "normal" and "abnormal". They involve training a classifier to distinguish between the two classes.
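
The supervised approach can be sketched with a deliberately simple nearest-centroid classifier trained on labeled points. This is an illustrative stand-in, not the classifier Rubiscape uses; the function names fit_centroids and predict and the training data are hypothetical.

```python
import math

def fit_centroids(points, labels):
    """Compute the mean point (centroid) of each labeled class."""
    groups = {}
    for p, label in zip(points, labels):
        groups.setdefault(label, []).append(p)
    return {
        label: tuple(sum(c) / len(members) for c in zip(*members))
        for label, members in groups.items()
    }

def predict(centroids, point):
    """Assign the label of the closest class centroid."""
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))

# Hypothetical labeled training data.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (7.5, 8.0), (8.2, 7.9)]
train_y = ["normal", "normal", "normal", "abnormal", "abnormal"]

centroids = fit_centroids(train_x, train_y)
label_a = predict(centroids, (1.1, 1.0))
label_b = predict(centroids, (7.9, 8.1))
print(label_a, label_b)
```

In practice, abnormal examples are usually far rarer than normal ones, so a real classifier would also need to handle that class imbalance.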

Semi-supervised anomaly detection techniques build a model that represents normal behavior from a given training dataset of normal data. They then test the likelihood that a new data point was generated by the model.
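
A minimal sketch of the semi-supervised approach, assuming one-dimensional readings and a Gaussian model of normal behavior: the model is fitted only to normal data, and a new value is flagged when it is unlikely under that model. The z-score is used here as a simple proxy for likelihood; the threshold of 3.0, the function name is_anomaly, and the readings are illustrative assumptions.

```python
from statistics import NormalDist

# Hypothetical training set that contains only normal readings.
normal_readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 20.1, 19.7]

# Model of normal behavior: a Gaussian fitted to the normal data.
model = NormalDist.from_samples(normal_readings)

def is_anomaly(value, model, max_z=3.0):
    """Flag a value that is unlikely to have been generated by the model.

    max_z is an assumed threshold on the absolute z-score.
    """
    z = (value - model.mean) / model.stdev
    return abs(z) > max_z

flags = [is_anomaly(v, model) for v in (20.0, 25.0)]
print(flags)
```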

Anomaly detection is applicable in:

detecting intrusions
detecting fraud
detecting faults
monitoring system health
detecting ecosystem disturbances
detecting defects in images using machine vision

Anomaly detection algorithms are used in data preprocessing to remove inconsistent data from a dataset. In supervised learning, this step, also known as data cleansing, is an important part of preparing the dataset on which a model is trained.
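
As a hedged illustration of this cleansing step, the sketch below removes values that fall outside the common interquartile range (IQR) fences before the data is used for training. The IQR rule is one simple choice among many and is not necessarily what Rubiscape applies; the function name remove_outliers_iqr, the factor of 1.5, and the sample values are assumptions.

```python
from statistics import quantiles

def remove_outliers_iqr(values, factor=1.5):
    """Keep only values inside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the data
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical raw column with one inconsistent value.
raw = [12, 14, 13, 15, 14, 13, 95, 12, 14]
cleaned = remove_outliers_iqr(raw)
print(cleaned)
```

The cleaned list keeps the original order of the remaining values, so it can replace the raw column directly in a later training step.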

List of Anomaly Detection Algorithms

The anomaly detection algorithms available in Rubiscape are listed below.

Density-Based Spatial Clustering
Isolation Forest
One Class SVM
Local Outlier Factor (LOF)

Notes:

The Reader (dataset) should be connected to the algorithm.
The LOF algorithm can be used only on categorical data.
These algorithms are used to detect anomalies in a given dataset.
