mala-lab / couta

This project forked from xuhongzuo/couta

Implementation of TKDE paper "Calibrated One-class classification-based Unsupervised Time series Anomaly detection"

License: Apache License 2.0

anomaly-detection one-class-classification time-series-analysis time-series-anomaly-detection

COUTA - time series anomaly detection

Implementation of "Calibrated One-class classification-based Unsupervised Time series Anomaly detection" (COUTA for short).
The full paper is available at link.
Please consider citing our paper if you use this repository. 😉

@article{xu2022deep,
  title={Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection},
  author={Xu, Hongzuo and Wang, Yijie and Jian, Songlei and Liao, Qing and Wang, Yongjun and Pang, Guansong},
  journal={arXiv preprint arXiv:2207.12201},
  year={2022}
}

Environment

main packages

torch==1.10.1+cu113  
numpy==1.20.3  
pandas==1.3.3  
scipy==1.4.1  
scikit-learn==1.1.1  

We provide a requirements.txt in our repository.

Takeaways

APIs

COUTA provides a simple API in the sklearn/pyod style. First, instantiate the model class with its parameters:

from src.algorithms.couta_algo import COUTA
model_configs = {'sequence_length': 50, 'stride': 1}
model = COUTA(**model_configs)
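
To make the two parameters above concrete, here is a minimal sketch of sliding-window segmentation. It assumes that sequence_length and stride carry their conventional meaning (window length and step between window starts); the helper sliding_windows is hypothetical and is not part of the COUTA code base.

```python
import numpy as np


def sliding_windows(values: np.ndarray, sequence_length: int, stride: int) -> np.ndarray:
    """Cut a (n_timestamps, n_features) array into overlapping windows.

    Hypothetical helper for illustration only; COUTA's internal
    preprocessing may differ.
    """
    n_timestamps = values.shape[0]
    if n_timestamps < sequence_length:
        raise ValueError("series is shorter than one window")
    # Window start positions: 0, stride, 2 * stride, ...
    starts = range(0, n_timestamps - sequence_length + 1, stride)
    return np.stack([values[s:s + sequence_length] for s in starts])


# Synthetic series with 200 timestamps and 3 features.
series = np.random.default_rng(0).normal(size=(200, 3))
windows = sliding_windows(series, sequence_length=50, stride=1)
print(windows.shape)  # (151, 50, 3)
```

With stride 1, a series of 200 observations yields 200 - 50 + 1 = 151 windows; a larger stride yields fewer, less overlapping windows.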

The instantiated model can then be used to fit and predict. Please use pandas DataFrames as input data:

model.fit(train_df)
score_dic = model.predict(test_df)
score = score_dic['score_t']

The prediction output is a dictionary, for consistency with an evaluation work on time series anomaly detection (link).
score_t is a vector of anomaly scores, one per time observation in the testing dataframe; a higher value indicates a higher likelihood of being an anomaly.
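
Since score_t holds continuous scores, a downstream step is usually needed to obtain binary labels. The sketch below shows one simple option, a quantile threshold. The function scores_to_labels and its contamination parameter are assumptions for illustration and are not part of the COUTA API; the score vector here is synthetic rather than real model output.

```python
import numpy as np


def scores_to_labels(score_t: np.ndarray, contamination: float = 0.01) -> np.ndarray:
    """Flag roughly the top `contamination` fraction of observations as anomalies.

    Hypothetical post-processing step; COUTA itself only returns scores.
    """
    threshold = np.quantile(score_t, 1.0 - contamination)
    return (score_t > threshold).astype(int)


# Synthetic stand-in for model.predict(test_df)['score_t']:
# one score per time observation.
rng = np.random.default_rng(0)
score_t = rng.random(1000)
score_t[[100, 500]] = 5.0  # two injected high scores

labels = scores_to_labels(score_t, contamination=0.01)
print(labels.sum(), labels[100], labels[500])
```

The threshold choice is application-specific; a quantile is only one of several common heuristics.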

model save and load

If the save_model_path parameter is given at training time, the model is saved to that path.

from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'sequence_length': 50, 'stride': 1, 'save_model_path': path}
model = COUTA(**model_configs)
model.fit(train_df)

COUTA can then be loaded from that path and used without fitting.

from src.algorithms.couta_algo import COUTA
path = 'saved_models/couta.pth'
model_configs = {'load_model_path': path}
model = COUTA(**model_configs)
model.predict(test_df)
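
The two snippets above can be combined into a train-once, reuse-later pattern. The following is a minimal sketch: the helper couta_configs is hypothetical, and only the parameter names save_model_path and load_model_path come from the examples above. The COUTA calls are left as comments so that the block runs without the repository.

```python
import os


def couta_configs(path: str, sequence_length: int = 50, stride: int = 1) -> dict:
    """Return loading configs if `path` exists, else training configs that save to `path`.

    Hypothetical helper; it only builds the keyword dictionary shown above.
    """
    if os.path.exists(path):
        return {'load_model_path': path}
    return {'sequence_length': sequence_length, 'stride': stride, 'save_model_path': path}


configs = couta_configs('saved_models/couta.pth')
needs_fit = 'load_model_path' not in configs
print(configs, needs_fit)

# With the repository on the path, the configs would be used as:
# from src.algorithms.couta_algo import COUTA
# model = COUTA(**configs)
# if needs_fit:
#     model.fit(train_df)
# score_dic = model.predict(test_df)
```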

Datasets used in our paper

Due to the license issues of these datasets, we only provide download links here. We also offer a preprocessing script in data_preprocessing.ipynb: download the original data and run this notebook to generate processed datasets that can be fed directly into our pipeline.

The used datasets can be downloaded from:

Reproduction of experiment results

Experiments of the effectiveness (4.2)

After preparing the datasets, use main.py to run COUTA on different time series datasets. We use six datasets in our paper, and --data can be chosen from [ASD, SMD, SWaT, WaQ, Epilepsy, DSADS].

For example, perform COUTA on the ASD dataset by

python main.py --data ASD --algo COUTA

or you can directly use script_effectiveness.sh
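
As an alternative to the shell script, the per-dataset invocations can be driven from Python. This is a minimal sketch that only reuses the command line shown above; it is not the content of the provided script, and the helpers build_commands and run_all are hypothetical. By default it prints the commands without executing them.

```python
import subprocess
import sys

# Dataset names as listed for the --data argument.
DATASETS = ["ASD", "SMD", "SWaT", "WaQ", "Epilepsy", "DSADS"]


def build_commands(datasets, algo="COUTA"):
    """Build one `python main.py --data <name> --algo <algo>` command per dataset."""
    return [[sys.executable, "main.py", "--data", name, "--algo", algo] for name in datasets]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Must be run from the repository root, where main.py lives.
            subprocess.run(cmd, check=True)


commands = build_commands(DATASETS)
run_all(commands, dry_run=True)
```

Passing a different algo value to build_commands produces the same loop for another --algo setting.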

Generalization test (4.3)

The synthetic datasets used are included in data_processed/

python main_showcase.py --type point
python main_showcase.py --type pattern

Two anomaly score npy files are generated. Use experiment_generalization_ability.ipynb to visualize the data and our results.

Robustness (4.4)

Use src/experiments/data_contaminated_generator_dsads.py and src/experiments/data_contaminated_generator_ep.py to generate datasets with various contamination ratios.
Then use main.py to run COUTA on these datasets, or directly execute script_robustness.sh
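
To illustrate what a contamination ratio means, the sketch below mixes anomalous samples into a normal training set so that they make up a given fraction of the result. It is a generic illustration under that assumption and does not reproduce the generator scripts named above; the function contaminate and the array shapes are hypothetical.

```python
import numpy as np


def contaminate(normal: np.ndarray, anomalies: np.ndarray, ratio: float, seed: int = 0):
    """Mix anomalies into `normal` so they form about `ratio` of the output.

    Hypothetical helper for illustration; returns shuffled data and 0/1 labels.
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    rng = np.random.default_rng(seed)
    # Solve n_anom / (n_normal + n_anom) = ratio for n_anom.
    n_anom = int(round(ratio * len(normal) / (1.0 - ratio)))
    n_anom = min(n_anom, len(anomalies))
    picked = anomalies[rng.choice(len(anomalies), size=n_anom, replace=False)]
    data = np.concatenate([normal, picked])
    labels = np.concatenate([np.zeros(len(normal), dtype=int), np.ones(n_anom, dtype=int)])
    order = rng.permutation(len(data))
    return data[order], labels[order]


# Synthetic samples: each sample is a (sequence_length, n_features) sequence.
rng = np.random.default_rng(1)
normal = rng.normal(size=(900, 50, 3))
anomalies = rng.normal(loc=5.0, size=(200, 50, 3))

train, labels = contaminate(normal, anomalies, ratio=0.1)
print(train.shape, labels.mean())
```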

Ablation study (4.5)

Change the --algo argument to COUTA_wto_umc, COUTA_wto_nac, or Canonical, e.g.,

python main.py --algo COUTA_wto_umc --data ASD

Running script_effectiveness.sh also produces detection results of the ablated variants.

Others

As for the sensitivity test (4.6), please adjust the parameters in the yaml file.
As for the scalability test (4.7), the produced result files also contain execution time.

Competing methods

All of the anomaly detectors in our paper are implemented in Python. We list their publicly available implementations below.
