CLAD: Confidence-based self-Labeling Anomaly Detection

This is a PyTorch implementation of "What is Wrong with One-Class Anomaly Detection?", published at the ICLR 2021 Workshop on Security and Safety in Machine Learning Systems.

Abstract

From a safety perspective, a machine learning method embedded in real-world applications is required to distinguish irregular situations. For this reason, there has been a growing interest in the anomaly detection (AD) task. Since we cannot observe abnormal samples in most cases, recent AD methods attempt to formulate it as a task of classifying whether the sample is normal or not. However, they potentially fail when the given normal samples are inherited from diverse semantic labels. To tackle this problem, we introduce a latent class-condition-based AD scenario. In addition, we propose a confidence-based self-labeling AD framework tailored to our proposed scenario. Since our method leverages the hidden class information, it successfully avoids generating the undesirable loose decision region that one-class methods suffer from. Our proposed framework outperforms the recent one-class AD methods in the latent multi-class scenarios.

Requirements

  • Python 3.8
  • PyTorch 1.7.1

To install all the required elements, run the code:

bash requirements.txt

Run scripts

Implemented datasets: MNIST, GTSRB, CIFAR-10, Tiny-ImageNet

To run the experiments, run the scripts:

cd <path-to-CLAD-directory>

# change to run_scripts directory
cd src/run_scripts

# run MNIST experiments
sh run_mnist.sh

# run GTSRB experiments
sh run_gtsrb.sh

# run CIFAR-10 experiments
sh run_cifar10.sh

# run Tiny-ImageNet experiments
sh run_tiny_imagenet.sh

Scenario (Latent Class-condition Anomaly Detection Scenario)

To reduce the gap between real-world settings and one-class AD scenarios, we simulate an environment in which latent sub-classes exist implicitly. In this environment, it is crucial to learn a decision boundary by considering not only the normality of the data samples but also their semantics. Note that such class information is not observable, so the AD framework may need to learn the semantic representation in an unsupervised or self-supervised manner.

CLAD (Confidence-based self-Labeling Anomaly Detection)

We propose a Confidence-based self-Labeling Anomaly Detection (CLAD) framework with four stages, illustrated in the figure below.

Categorizing Each Dataset

We devised super-categories by merging the semantic labels to simulate our AD scenario as illustrated in the figure and table below.
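
As a rough illustration of how merging semantic labels yields an AD scenario, the sketch below maps class names to super-categories and derives binary anomaly labels for evaluation. The grouping shown (`vehicle` / `animal` over CIFAR-10 class names) and the helper name `make_scenario` are assumptions made for this example, not the categorization used in the paper or the code in this repository.

```python
from typing import Dict, List, Sequence

# Hypothetical grouping of CIFAR-10 class names (an assumption for
# illustration; see the figure and table for the actual categories).
SUPER_CATEGORIES: Dict[str, List[str]] = {
    "vehicle": ["airplane", "automobile", "ship", "truck"],
    "animal": ["bird", "cat", "deer", "dog", "frog", "horse"],
}


def make_scenario(
    class_names: Sequence[str],
    normal_category: str,
    super_categories: Dict[str, List[str]] = SUPER_CATEGORIES,
) -> List[int]:
    """Return 0 (normal) or 1 (anomaly) for each sample.

    Every class inside the chosen super-category counts as normal, so the
    normal set implicitly contains several latent sub-classes. The class
    names themselves are discarded and would not be given to the detector.
    """
    normal_classes = set(super_categories[normal_category])
    return [0 if name in normal_classes else 1 for name in class_names]


anomaly_labels = make_scenario(["cat", "truck", "frog", "ship"], "animal")
```

Here `cat` and `frog` are both normal even though they are different semantic classes, which is the situation one-class methods struggle with.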

Experimental Results

We compare with one-class AD methods: OCSVM, OCNN, OCCNN, SVDD, and DeepSVDD.

Proposed Scenario

One-Class Classification

Ablation study on hyper-parameters

Implementation Details

For latent feature extraction, we used a convolutional autoencoder. For clustering visualization, we used Multicore-TSNE for efficiency. For self-labeling via clustering, we followed the approach of DEC to self-assign labels to data samples. For the classifier in confidence-based AD, we used ResNet-18. To scale the confidence scores, we adopted the OOD detection mechanism from ODIN to obtain a more robust AD score.
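
The DEC-style self-labeling step can be sketched as follows. This is a minimal pure-Python illustration of the soft cluster assignment (Student's t kernel) and the sharpened target distribution from DEC, assuming embeddings and centroids are already available; the function names and the toy data are hypothetical, and the repository's actual implementation operates on PyTorch tensors and may differ.

```python
from typing import List, Sequence

Vector = Sequence[float]


def soft_assign(embeddings: Sequence[Vector], centroids: Sequence[Vector],
                alpha: float = 1.0) -> List[List[float]]:
    """Soft assignment q[i][j] of embedding i to centroid j (Student's t kernel)."""
    q = []
    for z in embeddings:
        sims = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            sims.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(sims)
        q.append([s / total for s in sims])
    return q


def target_distribution(q: Sequence[Sequence[float]]) -> List[List[float]]:
    """Sharpened targets p[i][j] proportional to q[i][j]^2 / sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def self_labels(q: Sequence[Sequence[float]]) -> List[int]:
    """Pseudo-label for each sample: index of its most likely cluster."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy 2-D embeddings forming two well-separated groups (illustrative only).
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assign(embeddings, centroids)
p = target_distribution(q)
labels = self_labels(q)
```

In DEC, the encoder and centroids are refined by minimizing the KL divergence between `p` and `q`; the resulting `labels` can then serve as pseudo-classes for a downstream classifier.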

  • Note that the hyper-parameters may vary depending on the scenarios for each dataset.
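
To illustrate confidence scaling, the sketch below computes a temperature-scaled maximum softmax probability and thresholds it. It covers only the temperature-scaling part of ODIN; ODIN's input perturbation step is omitted, and which parts this repository uses is not stated here. The function names, the default temperature, and the threshold are assumptions for the example.

```python
import math
from typing import Sequence


def scaled_confidence(logits: Sequence[float], temperature: float = 1000.0) -> float:
    """Maximum softmax probability after dividing the logits by a temperature."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(v - peak) for v in scaled]
    return max(exps) / sum(exps)


def is_anomaly(logits: Sequence[float], threshold: float,
               temperature: float = 1000.0) -> bool:
    """Flag a sample as anomalous when its scaled confidence is below the threshold.

    The threshold is a placeholder; in practice it would be tuned per dataset.
    """
    return scaled_confidence(logits, temperature) < threshold


confident = scaled_confidence([10.0, 0.0, 0.0], temperature=1.0)
uncertain = scaled_confidence([1.0, 0.0, 0.0], temperature=1.0)
```

A sample whose logits are dominated by one self-assigned class receives a higher score than one with nearly uniform logits, so low scores suggest the sample fits none of the latent classes.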

Citation

@misc{park2021wrong,
  title         = {What is Wrong with One-Class Anomaly Detection?}, 
  author        = {JuneKyu Park and Jeong-Hyeon Moon and Namhyuk Ahn and Kyung-Ah Sohn},
  year          = {2021},
  eprint        = {2104.09793},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG}
}
