
This project forked from ku-milab/moana


Pytorch implementation of "MoANA: Module of Axis-based Nexus Attention for Weakly Supervised Object Localization and Semantic Segmentation"

Python 98.88% Dockerfile 0.98% Shell 0.14%


MoANA: Module of Axis-based Nexus Attention for Weakly Supervised Object Localization and Semantic Segmentation

This repository provides the official PyTorch implementation of the following paper:

MoANA: Module of Axis-based Nexus Attention for Weakly Supervised Object Localization and Semantic Segmentation
Junghyo Sohn¹, Eunjin Jeon², Wonsik Jung², Eunsong Kang², Heung-Il Suk¹,²
¹ Department of Artificial Intelligence, Korea University, Seoul 02841, Republic of Korea
² Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea

Under review, IEEE Transactions on Image Processing

Abstract: Although recent advances in deep learning have led to an acceleration in the improvement in weakly supervised object localization (WSOL) tasks, it remains challenging to identify and segment an entire object rather than only discriminative parts of the object. To tackle this problem, corruption-based approaches have been devised, which involve the training of non-discriminative regions by corrupting (erasing) the input images or intermediate feature maps. However, this approach requires an additional hyperparameter, the corrupting threshold, to determine the degree of corruption and can unfavorably disrupt training. It also tends to localize object regions coarsely. In this paper, we propose a novel approach, Module of Axis-based Nexus Attention (MoANA), which helps to adaptively activate less discriminative regions along with the class-discriminative regions without an additional hyperparameter, and elaborately localizes an entire object by utilizing information distributed over widths, heights, and channels with an attention mechanism for calibrating features. Specifically, MoANA consists of three mechanisms: (1) triple-view attentions representation, (2) attentions expansion, and (3) features calibration. Unlike other attention-based methods that train a coarse attention map with the same values across elements in feature maps, our proposed MoANA trains fine-grained values in an attention map by assigning different attention values to each element in a cost-efficient manner. We validated our proposed method by comparing it with recent WSOL and weakly supervised semantic segmentation (WSSS) methods over various datasets. We also analyzed the effect of each component in our MoANA and visualized attention maps to provide insights into the calibration.
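
To make the three mechanisms named in the abstract more concrete, here is a minimal, dependency-free sketch. It is not the implementation in this repository: the function name `axis_attention_sketch`, the use of plain averaging to build the three views, the additive expansion, and the sigmoid gate are all assumptions chosen for illustration. The actual module presumably uses learned layers. The sketch only shows how three one-dimensional views (channel, height, width) can be expanded into a distinct attention value for every element while storing just C + H + W numbers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def axis_attention_sketch(features):
    """Illustrative only. `features` is a nested list shaped [C][H][W]."""
    C, H, W = len(features), len(features[0]), len(features[0][0])

    # (1) Triple-view representation (assumed): average over the other two axes.
    ch = [sum(features[c][h][w] for h in range(H) for w in range(W)) / (H * W)
          for c in range(C)]
    ht = [sum(features[c][h][w] for c in range(C) for w in range(W)) / (C * W)
          for h in range(H)]
    wd = [sum(features[c][h][w] for c in range(C) for h in range(H)) / (C * H)
          for w in range(W)]

    # (2) Expansion (assumed): combine the three 1-D views so that every
    #     (c, h, w) element gets its own attention value.
    # (3) Calibration (assumed): gate each feature with a sigmoid of that value.
    return [[[features[c][h][w] * sigmoid(ch[c] + ht[h] + wd[w])
              for w in range(W)]
             for h in range(H)]
            for c in range(C)]


if __name__ == "__main__":
    x = [[[1.0, 2.0], [3.0, 4.0]],
         [[0.0, 0.0], [0.0, 0.0]]]
    y = axis_attention_sketch(x)
    print(len(y), len(y[0]), len(y[0][0]))  # shape is preserved
```

Because the gate lies strictly between 0 and 1, the calibrated features keep the input shape and never exceed the input in magnitude.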

Dependencies

Requirements

conda env create --name moana -f environment.yaml
conda activate moana

Dataset

  • ILSVRC: 1.2 million images in about 1,000 categories for training and 50,000 images for validation.
  • CUB-200-2011: 11,788 images from 200 bird categories, divided into 5,994 images for training and 5,794 images for evaluation.
  • Pascal VOC 2012: 21 classes, composed of 1,464 training images, 1,449 validation images and 1,456 test images.

Code Reference

Acknowledgement

This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00079, Artificial Intelligence Graduate School Program (Korea University)).
