Anomaly detection is the discovery or classification of events or observations that differ substantially from most of the data. Anomalies are also known as outliers, deviations, novelties, exceptions, or noise. Anomaly detection is categorized into three techniques as given below.
- Unsupervised
- Supervised
- Semi-supervised
Unsupervised anomaly detection techniques assume that the majority of the data points are normal. The techniques look for data points that fit the least in the remaining data points in a dataset to detect anomalies.
Supervised anomaly detection techniques detect anomalies in a dataset with data points labeled as "normal" and "abnormal". It involves training a classifier to remove the anomalies.
Semi-supervised anomaly detection techniques build a model which represents normal behavior from a given trained dataset. It then tests the likelihood of outliers being generated by the model.
Anomaly detection is applicable in,
- detecting intrusions
- detecting frauds
- detecting faults
- monitoring system health
- detecting ecosystem disturbances
- detecting defects in images using machine vision
Anomaly detection algorithms are used in data preprocessing to remove inconsistent data from a dataset. In supervised learning, it is an important step of data preprocessing to train a dataset, also known as data cleansing.
List of Anomaly Detection Algorithms
The Anomaly Detection algorithms available in Rubiscape are as given below.
- Density-Based Spatial Clustering
- Isolation Forest
- One Class SVM
- Local Outlier Factor (LOF)
|
|
Table of Contents