gpla / faster-etapr

Faster implementation of the enhanced time-aware precision and recall (eTaPR) for the evaluation of anomaly detection methods on time series data

License: Other

faster-eTaPR

Faster implementation (~200x) of the enhanced time-aware precision and recall (eTaPR) from Hwang et al. The original implementation is saurf4ng/eTaPR and this implementation is fully tested against it.

Motivation

The motivation behind eTaPR is that it is enough for a detection method to partially detect an anomaly segment, as long as a human expert can find the anomaly around this prediction. The following illustration (a recreation from the paper) highlights the four cases considered by eTaPR:

[Figure: Motivation behind eTaPR]

  1. A successful detection: A human expert can likely find the anomaly A_1 based on the prediction P_1.
  2. A failed detection: Only a small portion of the prediction P_2 overlaps with the anomaly A_2.
  3. A failed detection: Most of the prediction P_3 lies in the range of non-anomalous behavior (prediction starts too early). A human expert will likely regard the prediction P_3 as incorrect or a false alarm. The prediction P_3 is too imprecise and the anomaly A_3 is likely to be missed.
  4. A failed detection: The prediction P_4 mostly overlaps with the anomaly A_4, but covers only a small portion of the actual anomaly segment. Thus, a human expert is likely to dismiss the prediction P_4 as incorrect because the full extent of the anomaly remains hidden. The prediction P_4 contains insufficient information about the anomaly.

Note that in case 4, the anomaly could still be marked as detected if further predictions overlapped with the anomaly A_4. The handling of cases 3 and 4 is what sets eTaPR apart from other scoring methods.
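Cases 3 and 4 can be made concrete with two overlap ratios: the share of a prediction that lies inside the anomaly, and the share of the anomaly that the prediction covers. The sketch below is only an intuition aid and not the eTaPR calculation itself (see the documentation for that). The helper `overlap_ratios` and the example ranges are hypothetical and not part of the package.

```python
def overlap_ratios(pred, anomaly):
    """Return (share of the prediction inside the anomaly,
               share of the anomaly covered by the prediction).

    Both arguments are inclusive (start, end) index ranges.
    Hypothetical helper for illustration, not part of faster-etapr.
    """
    p_start, p_end = pred
    a_start, a_end = anomaly
    # Number of time steps shared by both ranges (0 if they are disjoint).
    overlap = max(0, min(p_end, a_end) - max(p_start, a_start) + 1)
    pred_len = p_end - p_start + 1
    anomaly_len = a_end - a_start + 1
    return overlap / pred_len, overlap / anomaly_len


# Case 3: the prediction starts far too early, so most of it is a false alarm.
early = overlap_ratios(pred=(0, 9), anomaly=(8, 9))
print(early)  # (0.2, 1.0): the anomaly is covered, but the prediction is imprecise

# Case 4: the prediction lies inside the anomaly but reveals only a sliver of it.
tiny = overlap_ratios(pred=(10, 11), anomaly=(10, 29))
print(tiny)  # (1.0, 0.1): the prediction is precise, but the anomaly is mostly hidden
```

A score that looks at only one of the two ratios would accept one of these cases. The parameters `theta_p` and `theta_r` in the example further down look like thresholds of this kind, but that mapping is an assumption here.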

If you want an in-depth explanation of the calculation, check out the documentation.

Getting Started

Install this package from PyPI using pip or uv:

pip install faster-etapr
uv pip install faster-etapr

Now you can run your evaluation in Python:

import faster_etapr
faster_etapr.evaluate_from_preds(
    y_hat=[0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0],
    y=    [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1],
    theta_p=0.5,
    theta_r=0.1,
)

This call returns a dictionary with the following entries:

{
    'eta/recall': 0.3875,
    'eta/recall_detection': 0.5,
    'eta/recall_portion': 0.275,
    'eta/detected_anomalies': 2.0,
    'eta/precision': 0.46476766302377037,
    'eta/precision_detection': 0.46476766302377037,
    'eta/precision_portion': 0.46476766302377037,
    'eta/correct_predictions': 2.0,
    'eta/f1': 0.4226312395393011,
    'eta/TP': 4,
    'eta/FP': 5,
    'eta/FN': 7,
    'eta/wrong_predictions': 2,
    'eta/missed_anomalies': 2,
    'eta/anomalies': 4,
    'eta/segments': 0.499999999999875,
    'point/recall': 0.45454545454541323,
    'point/precision': 0.5555555555554939,
    'point/f1': 0.49999999999945494,
    'point/TP': 5,
    'point/FP': 4,
    'point/FN': 6,
    'point/anomalies': 4,
    'point/detected_anomalies': 3.0,
    'point/segments': 0.75,
    'point_adjust/recall': 0.9090909090909091,
    'point_adjust/precision': 0.7142857142857143,
    'point_adjust/f1': 0.7999999999995071
}

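We calculate three types of metrics, distinguished by the key prefix in the result: eta/ (eTaPR), point/ (point-wise) and point_adjust/ (point-adjusted).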
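The `point/` and `point_adjust/` entries in the result above can be reproduced by hand, which helps when comparing them against the `eta/` scores. The following is a minimal sketch in plain Python, assuming the usual definitions: point-wise metrics compare labels step by step, and point-adjust marks a whole anomaly segment as predicted once any point in it is hit. The helper names are hypothetical and this is not the package's implementation.

```python
def segments(labels):
    """Return inclusive (start, end) ranges of consecutive 1s."""
    ranges, start = [], None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            ranges.append((start, i - 1))
            start = None
    if start is not None:
        ranges.append((start, len(labels) - 1))
    return ranges


def point_metrics(y_hat, y):
    """Point-wise confusion counts, precision, recall and F1."""
    tp = sum(1 for p, t in zip(y_hat, y) if p and t)
    fp = sum(1 for p, t in zip(y_hat, y) if p and not t)
    fn = sum(1 for p, t in zip(y_hat, y) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"TP": tp, "FP": fp, "FN": fn,
            "precision": precision, "recall": recall, "f1": f1}


def point_adjust(y_hat, y):
    """Mark an entire anomaly segment as predicted if any point in it was hit."""
    adjusted = list(y_hat)
    for start, end in segments(y):
        if any(y_hat[start:end + 1]):
            adjusted[start:end + 1] = [1] * (end - start + 1)
    return adjusted


y_hat = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
y     = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1]

plain = point_metrics(y_hat, y)
adjusted = point_metrics(point_adjust(y_hat, y), y)
print(len(segments(y)))                         # 4 anomaly segments
print(plain["TP"], plain["FP"], plain["FN"])    # 5 4 6
print(adjusted["precision"], adjusted["recall"])
```

Up to small numerical differences, these values agree with the `point/` and `point_adjust/` entries listed above. Point-adjust lifts recall from 5/11 to 10/11 on this input, which is the kind of overrating that eTaPR is meant to avoid.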

Benchmark

A small benchmark with randomly generated inputs (np.random.randint(0, 2, size=size)):

size      eTaPR_pkg   faster_etapr   factor
1 000     0.4090      0.0032         ~125x
10 000    35.8264     0.1810         ~198x
20 000    148.2670    0.6547         ~226x
100 000   too long    55.04712       -
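A comparable measurement can be sketched with the standard library alone. The harness below is an assumption about how such a benchmark might be set up, not the script behind the table: `benchmark` and `count_hits` are hypothetical names, and `random.randint(0, 1)` stands in for `np.random.randint(0, 2, size=size)`.

```python
import random
import time


def benchmark(evaluate, sizes, seed=0):
    """Time evaluate(y_hat=..., y=...) once per input size.

    Returns a dict mapping size -> elapsed wall-clock seconds.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    timings = {}
    for size in sizes:
        # Random binary labels, one list for predictions and one for ground truth.
        y_hat = [rng.randint(0, 1) for _ in range(size)]
        y = [rng.randint(0, 1) for _ in range(size)]
        start = time.perf_counter()
        evaluate(y_hat=y_hat, y=y)
        timings[size] = time.perf_counter() - start
    return timings


def count_hits(y_hat, y):
    """Stand-in workload so that the sketch runs without the package installed."""
    return sum(1 for p, t in zip(y_hat, y) if p and t)


timings = benchmark(count_hits, sizes=[1_000, 10_000])
for size, seconds in timings.items():
    print(f"{size:>7} {seconds:.4f}")
```

To time the package itself, replace `count_hits` with a wrapper such as `lambda y_hat, y: faster_etapr.evaluate_from_preds(y_hat=y_hat, y=y, theta_p=0.5, theta_r=0.1)`, using the arguments from the example above. A single run per size is noisy, so repeating each measurement and taking the minimum gives steadier numbers.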

Citation

If you use eTaPR, please cite the original author/paper:

@inproceedings{10.1145/3477314.3507024,
author = {Hwang, Won-Seok and Yun, Jeong-Han and Kim, Jonguk and Min, Byung Gil},
title = {"Do You Know Existing Accuracy Metrics Overrate Time-Series Anomaly Detections?"},
year = {2022},
isbn = {9781450387132},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3477314.3507024},
doi = {10.1145/3477314.3507024},
booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
pages = {403–412},
numpages = {10},
keywords = {accuracy metric, anomaly detection, precision, recall, time-series},
location = {Virtual Event},
series = {SAC '22}
}

faster-etapr's Issues

Fix example in readme and docstring

Problem to Solve

  • Docstring for evaluate_from_preds has evaluate_from_ranges in the docstring example
  • Example in readme has evaluate_from_ranges instead of evaluate_from_preds
