Anomaly detection is the identification of events or observations that differ substantially from most of the data. Anomalies are also known as outliers, deviations, novelties, exceptions, or noise. Anomaly detection techniques fall into the three categories given below.

  • Unsupervised
  • Supervised
  • Semi-supervised

Unsupervised anomaly detection techniques work on unlabeled data and assume that the majority of the data points are normal. They detect anomalies by looking for the data points that fit least well with the rest of the dataset.
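
As a rough illustration of the unsupervised idea, the sketch below scores each value by its mean distance to its nearest neighbors and flags values whose score is far above the typical score. The function names, the k and factor parameters, and the sample readings are assumptions made for this example. It is not the method used by any particular Rubiscape algorithm.

```python
import statistics

def knn_scores(values, k=3):
    """Score each value by its mean distance to its k nearest neighbors."""
    scores = []
    for i, v in enumerate(values):
        # Distances from this value to every other value, smallest first.
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_anomalies(values, k=3, factor=3.0):
    """Return values whose score exceeds `factor` times the median score.

    No labels are used: the median score stands in for "normal", which
    relies on the assumption that most data points are normal.
    """
    scores = knn_scores(values, k)
    cutoff = factor * statistics.median(scores)
    return [v for v, s in zip(values, scores) if s > cutoff]

# Hypothetical sensor readings with one value far from the rest.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
anomalies = flag_anomalies(readings)
print(anomalies)
```

The cutoff is relative to the data itself, so the sketch only works when anomalies are rare. If many values are identical, the median score can be zero and the cutoff would need a different definition.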

Supervised anomaly detection techniques require a dataset in which the data points are labeled as "normal" or "abnormal". They involve training a classifier to distinguish between the two classes.
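
The supervised approach can be sketched with a very small classifier. The example below uses a nearest-centroid rule, chosen only because it fits in a few lines. The feature values, the function names, and the choice of classifier are all assumptions for illustration.

```python
def train_centroids(samples, labels):
    """Compute the mean feature vector (centroid) of each labeled class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the label of the centroid closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

# Hypothetical labeled data: [cpu_load, error_rate] per observation.
samples = [[0.20, 0.01], [0.30, 0.02], [0.25, 0.01],
           [0.90, 0.30], [0.95, 0.25]]
labels = ["normal", "normal", "normal", "abnormal", "abnormal"]

centroids = train_centroids(samples, labels)
print(predict(centroids, [0.28, 0.02]))
print(predict(centroids, [0.88, 0.40]))
```

In practice, labeled anomalies are usually much rarer than normal points, so a real classifier would need to account for class imbalance.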

Semi-supervised anomaly detection techniques build a model that represents normal behavior from a given training dataset of normal data. They then test the likelihood that a new data point was generated by the model.
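
A minimal sketch of the semi-supervised idea, assuming a single numeric feature and a Gaussian model of normal behavior. The training values, the is_anomaly helper, and the threshold are hypothetical. A density threshold depends on the scale of the data and would need tuning for a real dataset.

```python
from statistics import NormalDist

# Hypothetical training data that contains only normal observations.
normal_training = [50.2, 49.8, 50.5, 49.9, 50.1, 50.0, 49.7, 50.3]

# The model of normal behavior: a Gaussian fitted to the normal data.
model = NormalDist.from_samples(normal_training)

def is_anomaly(x, model, threshold=0.001):
    """Flag x when its likelihood under the normal model is below threshold."""
    return model.pdf(x) < threshold

print(is_anomaly(50.1, model))
print(is_anomaly(53.0, model))
```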

Anomaly detection is applicable in:

  • detecting intrusions
  • detecting fraud
  • detecting faults
  • monitoring system health
  • detecting ecosystem disturbances
  • detecting defects in images using machine vision

Anomaly detection algorithms are used in data preprocessing to remove inconsistent data from a dataset. In supervised learning, removing anomalous data from the training dataset, a step also known as data cleansing, is an important part of data preprocessing.
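
To illustrate data cleansing as a preprocessing step, the sketch below drops values that fall outside the common 1.5 × IQR fences before the data would be passed to a training step. The IQR rule is a simple stand-in chosen for this example, not one of the algorithms listed on this page. The function name and sample values are assumptions.

```python
import statistics

def remove_outliers_iqr(values, whisker=1.5):
    """Keep only values within `whisker` * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical raw column with one inconsistent entry.
raw = [12, 14, 13, 15, 14, 13, 98, 12, 15, 14]
cleaned = remove_outliers_iqr(raw)
print(cleaned)
```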

List of Anomaly Detection Algorithms

The Anomaly Detection algorithms available in Rubiscape are as given below.

Notes:

  • The Reader (dataset) should be connected to the algorithm.
  • The LOF (Local Outlier Factor) algorithm can be used only on categorical data.
  • These algorithms are used to detect anomalies in a given dataset.

