Supervised and unsupervised machine learning techniques are becoming increasingly popular for identifying patterns in data and for building systems that can better predict the future. As the volume, variety, and velocity of data increase, many big data techniques are being used to scale these methods.
Anomaly detection techniques, however, focus on detecting the asymmetric outliers in data sets. Many techniques are used for anomaly detection, and a comprehensive understanding of them helps in understanding the nature of the data and in detecting and acting upon these outliers.
In this workshop, we will discuss the various anomaly detection techniques that are practiced in industry. Through practical case studies, we will discuss how these techniques can be used to identify anomalies in cross-sectional and time-series datasets. Using Apache Spark, we will also illustrate how these techniques can be scaled to address big data challenges in the enterprise.
On day one, we will review the core techniques in anomaly detection. Through examples, we will understand the different outlier detection techniques and review evaluation criteria.
What you will learn
- Anomaly detection: an introduction
- Graphical and exploratory analysis techniques
- Statistical techniques in anomaly detection
- Machine learning methods for outlier analysis
- Evaluating the performance of anomaly detection techniques
- Case study 1: Anomalies in Freddie Mac mortgage data
- Case study 2: Detecting anomalies in time series data
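As a small taste of the statistical techniques listed above, the following is a minimal sketch of one common approach, the interquartile range (IQR) rule. It is an illustration only and not necessarily the method used in the workshop; the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample `readings` are all assumptions made for this example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; k=1.5 is a conventional
    choice, not a value taken from the workshop material.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample data: seven similar readings and one extreme value.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = iqr_outliers(readings)
print(outliers)
```

The IQR rule is based on quartiles rather than the mean and standard deviation, so a single extreme value has little influence on the thresholds used to flag it.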
On day two, we will discuss advanced techniques in anomaly detection and use Apache Spark for anomaly detection. We will also discuss best practices for scaling and using anomaly detection techniques.