ADSketch

This repo contains the source code for the paper "Adaptive Time Series Anomaly Detection for Online Services via System Failure Sketching" (ICSE'22).

ADSketch (Anomaly Detection via Pattern Sketching) is an interpretable and adaptive performance anomaly detection algorithm for online service systems. Its core idea is to locate metric subsequences that deviate significantly from those seen in the historical data. ADSketch achieves interpretability by identifying groups of anomalous metric patterns, each of which represents a particular type of performance issue. The underlying issue can then be recognized immediately if a similar pattern emerges again. Figure 1 illustrates the algorithm. In addition, an adaptive learning algorithm lets the model accommodate previously unseen patterns.

[Figure 1: GRLIA Framework]
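
The core idea described above (scoring a subsequence by how far it lies from everything seen in the history) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code in adsketch/motif_operations.py: the function names, the brute-force nearest-neighbour search, and the choice to scale both series with the history's minimum and maximum are all assumptions made for the example.

```python
import math

def scale_with_history(history, series):
    """Min-max scale `series` using the min and max of `history` (assumed scheme)."""
    lo, hi = min(history), max(history)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in series]
    return [(v - lo) / span for v in series]

def subsequences(series, m):
    """All contiguous windows of length m."""
    return [series[i:i + m] for i in range(len(series) - m + 1)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def deviation_scores(history, new_series, m):
    """For each window of `new_series`, the distance to its closest window in `history`."""
    hist_windows = subsequences(scale_with_history(history, history), m)
    new_windows = subsequences(scale_with_history(history, new_series), m)
    return [min(euclidean(w, h) for h in hist_windows) for w in new_windows]

# Hypothetical data: a repeating pattern as history, then a spike in the new series.
history = [0, 1, 2, 1] * 5
new_series = [0, 1, 2, 1, 0, 1, 9, 1]
scores = deviation_scores(history, new_series, m=4)
print(scores)
```

Windows that match a pattern already present in the history score zero, while the windows containing the spike score much higher. A real detector would also need a threshold on these scores and a faster similarity search than this quadratic loop.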

Prerequisites

  • Python 3.6
  • All required packages installed (run pip install -r requirements.txt)
  • A Windows, Linux, or macOS system

Usage

To run the model, unzip the data, change into the project directory, and execute one of the following commands:

  • python yahoo_demo.py (for Yahoo dataset)
  • python aiops18_demo.py (for AIOps18 dataset)
  • python industry_demo.py (for Industry dataset)

Project Structure

  1. adsketch/motif_operations.py contains the core functions of ADSketch
  2. ./data contains the datasets used in the paper
  3. params.json contains the parameter settings for different datasets
  4. yahoo_demo.py, aiops18_demo.py, and industry_demo.py are the scripts to run experiments with different datasets

adsketch's People

Contributors

zbchern

adsketch's Issues

Min-Max Scaling in Online Adaptive Learning

Hi there,

First of all, thank you for sharing your work here! It's been incredibly insightful.

I have a question regarding the use of min-max scaling in the online adaptive learning stage of the algorithm.

In the paper, it is mentioned that "min-max scaling" is used instead of the "z-normalization" from the original MASS algorithm. I noticed that in the codebase, sklearn.preprocessing.MinMaxScaler is employed for this purpose. Specifically, I see the following line applying MinMaxScaler on online test data, where the scaler is fit on the original anomaly-free training samples:

# motif_operations.online_anomaly_detection()
_, online_scaled_test_metrics = scale_two_metrics(train_metric_values, online_test_metric_values)

However, in real-world scenarios, new data samples continuously stream in, and they are mostly labeled as new nominal points (i.e., anomaly-free, similar to the training data). My concern is:

  • How does this implementation handle concept drift if it always keeps the old min-max boundaries for scaling? Could this approach degrade detection performance in the long term?
  • If so, is there a simple way to mitigate the impact?

I am relatively new to the field of anomaly detection, so I apologize if my question seems basic. I appreciate your time and any insights you can offer!

Thank you!
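
To make the concern in this issue concrete, the sketch below contrasts a scaler whose bounds are frozen on the training data with one whose bounds follow a sliding window of recent points. It is a hedged illustration only: both classes are hypothetical, written in plain Python rather than with sklearn.preprocessing.MinMaxScaler, and the rolling-window scheme is one possible mitigation, not something the ADSketch code or paper is known to implement.

```python
from collections import deque

class FixedMinMaxScaler:
    """Bounds are computed once from the training data and never change."""
    def __init__(self, train):
        self.lo, self.hi = min(train), max(train)

    def transform(self, x):
        span = self.hi - self.lo
        return 0.0 if span == 0 else (x - self.lo) / span

class RollingMinMaxScaler:
    """Hypothetical alternative: bounds come from the most recent `window` points."""
    def __init__(self, train, window):
        self.buf = deque(train[-window:], maxlen=window)

    def transform(self, x):
        lo, hi = min(self.buf), max(self.buf)
        span = hi - lo
        return 0.0 if span == 0 else (x - lo) / span

    def update(self, x):
        # Intended to be called only for points judged normal,
        # so that anomalies do not stretch the bounds.
        self.buf.append(x)

# Hypothetical data: the metric level shifts from about 10 to about 20.
train = [10, 12, 11, 13, 12, 10]
drifted = [20, 22, 21, 23, 22, 20]

fixed = FixedMinMaxScaler(train)
rolling = RollingMinMaxScaler(train, window=6)
for x in drifted:
    rolling.update(x)

fixed_value = fixed.transform(21)      # far outside [0, 1] after the shift
rolling_value = rolling.transform(21)  # back inside [0, 1]
print(fixed_value, rolling_value)
```

With frozen bounds, every point after the level shift lands well above 1, so distances to the historical patterns grow even though the shape of the signal is unchanged. The rolling variant keeps scaled values in range, at the cost of needing a policy for which points are allowed to update the window.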
