pkurainbow / defgrid-release

This project forked from fidler-lab/defgrid-release

Official PyTorch implementation of Deformable Grid (ECCV 2020)

Home Page: http://www.cs.toronto.edu/~jungao/def-grid/

License: Other

Python 64.76% Shell 0.03% C++ 3.93% Cuda 31.28%

defgrid-release's Introduction

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid

This is the official PyTorch implementation of Deformable Grid (ECCV 2020). For technical details, please refer to:


Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid
Jun Gao, Zian Wang, Jinchen Xuan, Sanja Fidler

University of Toronto [Paper] [Video] [Supplementary] [Project]

ECCV 2020

  • In modern computer vision, images are typically represented as a fixed uniform grid with some stride and processed via a deep convolutional neural network. We argue that deforming the grid to better align with the high-frequency image content is a more effective strategy. We introduce Deformable Grid (DefGrid), a learnable neural network module that predicts location offsets of vertices of a 2-dimensional triangular grid, such that the edges of the deformed grid align with image boundaries. We showcase our DefGrid in a variety of use cases, i.e., by inserting it as a module at various levels of processing. We utilize DefGrid as an end-to-end learnable geometric downsampling layer that replaces standard pooling methods for reducing feature resolution when feeding images into a deep CNN. We show significantly improved results at the same grid resolution compared to using CNNs on uniform grids for the task of semantic segmentation. We also utilize DefGrid at the output layers for the task of object mask annotation, and show that reasoning about object boundaries on our predicted polygonal grid leads to more accurate results over existing pixel-wise and curve-based approaches. We finally showcase DefGrid as a standalone module for unsupervised image partitioning, showing superior performance over existing approaches.
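To make the grid representation concrete, here is a minimal sketch of a uniform 2D triangular grid whose vertices are moved by per-vertex offsets. It is not the code from this repository: the function names, the choice to split each cell along one diagonal, and the normalized [0, 1] coordinates are all assumptions made for illustration. In DefGrid the offsets are predicted by a neural network; here they are simply passed in.

```python
def build_uniform_triangular_grid(rows, cols):
    """Return (vertices, triangles) for a grid of rows x cols cells.

    Vertices are (x, y) tuples in [0, 1] x [0, 1], stored row by row.
    Each cell is split into two triangles along one diagonal
    (an assumption; the real triangulation pattern may differ).
    """
    vertices = [
        (float(c) / cols, float(r) / rows)
        for r in range(rows + 1)
        for c in range(cols + 1)
    ]
    triangles = []
    for r in range(rows):
        for c in range(cols):
            top_left = r * (cols + 1) + c
            top_right = top_left + 1
            bottom_left = top_left + cols + 1
            bottom_right = bottom_left + 1
            triangles.append((top_left, top_right, bottom_left))
            triangles.append((top_right, bottom_right, bottom_left))
    return vertices, triangles


def apply_offsets(vertices, offsets):
    """Move every vertex by its (dx, dy) offset; the topology is unchanged."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(vertices, offsets)]
```

Because only vertex positions change, the triangle index list stays valid after deformation, which is what lets the deformed grid keep a fixed topology while its edges move toward image boundaries.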

License

Copyright (C) University of Toronto. Jun Gao, Zian Wang, Jinchen Xuan, Sanja Fidler
All rights reserved.
Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).

Permission to use, copy, modify, and distribute this software and its documentation
for any non-commercial purpose is hereby granted without fee, provided that the above
copyright notice appear in all copies and that both that copyright notice and this
permission notice appear in supporting documentation, and that the name of the author
not be used in advertising or publicity pertaining to distribution of the software
without specific, written prior permission.

THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING
OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Environment Setup

All the code has been run and tested on Ubuntu 16.04, Python 2.7 (and 3.8), PyTorch 1.1.0 (and 1.2.0), CUDA 10.0, and TITAN X/Xp and GTX 1080Ti GPUs.

  • Go into the downloaded code directory
cd <path_to_downloaded_directory>
  • Set up the Python environment
conda create --name defgrid
conda activate defgrid
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
pip install opencv-python matplotlib networkx tensorboardx tqdm scikit-image ipdb
  • Add the project to PYTHONPATH
export PYTHONPATH=$PWD:$PYTHONPATH
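After running the steps above, a quick check that the installed packages are importable can save a failed training launch. The following is a small sketch, not part of the repository; the helper name `find_missing_modules` is hypothetical. Note that several import names differ from their pip names.

```python
import importlib.util

# Import names for the packages installed above. Some differ from the pip
# names: opencv-python -> cv2, scikit-image -> skimage,
# tensorboardx -> tensorboardX.
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "matplotlib", "networkx",
    "tensorboardX", "tqdm", "skimage", "ipdb",
]


def find_missing_modules(module_names=REQUIRED_MODULES):
    """Return the names in module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

This sketch requires Python 3 and only checks that each module can be located; it does not verify versions or CUDA availability.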

Example use cases

We provide several use cases of DefGrid, with more on the way. We hope these use cases offer insights and improvements for other image-based computer vision tasks as well.

Train DefGrid on Cityscapes Images

Data

  • Download the Cityscapes dataset (leftImg8bit_trainvaltest.zip) from the official website [11 GB]
  • Our dataloaders work with our processed annotation files which can be downloaded from here.
  • From the root directory, run the following command with appropriate paths to get the annotation files ready for your machine
python scripts/dataloaders/change_paths.py --city_dir <path_to_downloaded_leftImg8bit_folder> --json_dir <path_to_downloaded_annotation_file> --out_dir <output_dir>
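If you prefer to drive this step from Python, the command above can be assembled with a small wrapper that fails early when an input path does not exist. This is only a sketch: the helper name `build_change_paths_command` is hypothetical, and it uses nothing beyond the script path and the three flags shown in the command above.

```python
import os
import sys


def build_change_paths_command(city_dir, json_dir, out_dir):
    """Build the argument list for scripts/dataloaders/change_paths.py.

    Raises FileNotFoundError if an input path is missing, so the mistake
    shows up before the script is launched.
    """
    for label, path in (("city_dir", city_dir), ("json_dir", json_dir)):
        if not os.path.exists(path):
            raise FileNotFoundError("{} does not exist: {}".format(label, path))
    return [
        sys.executable, "scripts/dataloaders/change_paths.py",
        "--city_dir", city_dir,
        "--json_dir", json_dir,
        "--out_dir", out_dir,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.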

Training

Train DefGrid on the whole training set.

python scripts/train/train_def_grid_full.py --debug false --version train_on_cityscapes_full --encoder_backbone simplenn --resolution 512 1024 --grid_size 20 40 --w_area 0.005
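To get a feel for how `--resolution` and `--grid_size` relate, the sketch below computes the average image area covered by one grid cell before deformation. It assumes that both flags are given as (height, width) and that `--grid_size` counts cells per axis; neither assumption is confirmed by this README, and the function name is hypothetical.

```python
def cell_size_in_pixels(resolution, grid_size):
    """Return the (height, width) in pixels of one undeformed grid cell.

    Assumes resolution = (height, width) and grid_size = (rows, cols).
    """
    height, width = resolution
    rows, cols = grid_size
    return float(height) / rows, float(width) / cols


# Values taken from the training command above.
print(cell_size_in_pixels((512, 1024), (20, 40)))  # (25.6, 25.6)
```

Under these assumptions the default settings give square cells, so changing only one of the two flags would change the cell aspect ratio.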

Train DefGrid on Cityscapes "MultiComp" cropped images

Training

Train DefGrid on the whole training set.

python scripts/train/train_def_grid_multi_comp.py --debug false --version train_on_cityscapes_multicomp

To train with a custom dataloader, please add a new DataLoader class modeled on the ones we provide. The hyper-parameters might also need to change accordingly.
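As a starting point, a map-style dataset only needs `__len__` and `__getitem__`, which is the interface `torch.utils.data.DataLoader` expects. The class below is a dependency-free sketch, not the repository's loader: the class name and the `img_path` key are hypothetical, and the item layout must be adapted to match what the provided dataloaders return.

```python
import os


class ImageFolderListing(object):
    """Map-style dataset sketch: one item per image file in a directory."""

    IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")

    def __init__(self, image_dir):
        self.image_dir = image_dir
        # Sort the file names so that indices are stable across runs.
        self.file_names = sorted(
            name for name in os.listdir(image_dir)
            if name.lower().endswith(self.IMAGE_EXTENSIONS)
        )

    def __len__(self):
        return len(self.file_names)

    def __getitem__(self, index):
        # Hypothetical item layout: only the path is returned here.
        # A real loader would also read, resize, and normalize the image.
        path = os.path.join(self.image_dir, self.file_names[index])
        return {"img_path": path}
```

An instance can be wrapped as `torch.utils.data.DataLoader(ImageFolderListing(image_dir), batch_size=...)` once image loading has been added to `__getitem__`.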

Learnable downsampling for semantic segmentation on Cityscapes Images.

We provide the code in this branch

Citation

If you use this code, please cite:

@inproceedings{deformablegrid,
title={Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid},
author={Jun Gao and Zian Wang and Jinchen Xuan and Sanja Fidler},
booktitle={ECCV},
year={2020}
}
