nqanh / object_captioning

Object Captioning and Retrieval with Natural Language - ICCVW19

Shell 2.32% Makefile 0.03% MATLAB 0.54% Python 95.58% C++ 0.04% Cuda 1.50%

object_captioning's Introduction

By Anh Nguyen, Quang D. Tran, Thanh-Toan Do, Ian Reid, Darwin G. Caldwell, Nikos G. Tsagarakis

object_captioning

Contents

  1. Requirements
  2. Quick Demo
  3. Training

Requirements

  1. TensorFlow (version > 1.0)
  2. Hardware
    • A GPU with ~6GB of memory

Quick Demo

  • Clone the repo to your $PROJECT_PATH folder
  • Download the pretrained weights from this link, and put them under your $PROJECT_PATH/trained_weight folder
  • Download the Flickr5k dataset, and put it under your $PROJECT_PATH/data/VOCdevkit2007 folder
  • Change the project path in file lib/model/config.py: __C.root_folder_path = '$PROJECT_PATH'
  • Build the lib module: cd $PROJECT_PATH/lib then make
  • Run the demo: cd $PROJECT_PATH/tool then python demo_caption.py to generate captions for your images
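
Before building, it can help to confirm that the folders named in the steps above are in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dirs` and the constant `EXPECTED_DIRS` are hypothetical, and the folder list is only what the steps above mention.

```python
from pathlib import Path

# Folder names come from the Quick Demo steps; this helper is hypothetical
# and does not exist in the repository.
EXPECTED_DIRS = ("lib", "tool", "trained_weight", "data/VOCdevkit2007")


def missing_dirs(project_path):
    """Return the expected sub-folders that do not exist under project_path."""
    root = Path(project_path)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs("/path/to/your/project")` returns an empty list when every folder is present, and otherwise lists the ones still to be created or downloaded.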

Training

  1. We train the network on the Flickr5k dataset.

    • The Flickr5k dataset must be formatted like the Pascal-VOC dataset for training.
    • For your convenience, we have already done this. Just download this file (Google Drive) and extract it into your $PROJECT_PATH/data/VOCdevkit2007 folder.
  2. Train the network:

    • python $PROJECT_PATH/tool/trainval_net.py
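
For readers unfamiliar with the Pascal-VOC layout mentioned in step 1, the sketch below builds a standard VOC-style annotation with one bounding box. It is an illustration only: the function `voc_annotation`, the sample file name, and the box values are made up, the depth of 3 assumes RGB images, and how this repository stores the captions themselves is not described here, so captions are left out.

```python
import xml.etree.ElementTree as ET


def voc_annotation(filename, width, height, objects):
    """Build a Pascal-VOC-style <annotation> element.

    objects: iterable of (name, xmin, ymin, xmax, ymax) tuples.
    Hypothetical helper; only the standard bounding-box fields are covered.
    """
    ann = ET.Element("annotation")
    ET.SubElement(ann, "filename").text = filename
    size = ET.SubElement(ann, "size")
    # depth=3 is an assumption (RGB images)
    for tag, value in (("width", width), ("height", height), ("depth", 3)):
        ET.SubElement(size, tag).text = str(value)
    for name, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(ann, "object")
        ET.SubElement(obj, "name").text = name
        box = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            ET.SubElement(box, tag).text = str(value)
    return ann


# Made-up sample values, for illustration only.
sample = voc_annotation("example.jpg", 640, 480, [("dog", 48, 60, 320, 400)])
print(ET.tostring(sample, encoding="unicode"))
```

The pre-formatted download in step 1 already contains the annotations, so a helper like this would only matter when adding new images.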

If you find this source code useful in your research, please consider citing:

@article{Nguyen_objcaption,
  author    = {Anh Nguyen and
			   Duy Q. Tran and
			   Thanh{-}Toan Do and
			   Ian D. Reid and
			   Darwin G. Caldwell and
			   Nikos G. Tsagarakis},
  title     = {Object Captioning and Retrieval with Natural Language},
  journal   = {International Conference on Computer Vision Workshop},
  year      = {2019},
}

License

MIT License

Acknowledgement

This repo uses a lot of source code from Faster-RCNN and AffordanceNet.

Contact

If you have any questions or comments, please send an email to: [email protected]


object_captioning's Issues

CUDA = locate_cuda() Issue

When I run the command "make", it gives the following error:

python setup.py build_ext --inplace
Traceback (most recent call last):
  File "setup.py", line 55, in <module>
    CUDA = locate_cuda()
  File "setup.py", line 43, in locate_cuda
    raise EnvironmentError('The nvcc binary could not be '
OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME
make: *** [Makefile:2: all] Error 1
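
The error message says the build looks for nvcc either on $PATH or via $CUDAHOME. A quick way to check both before running make is sketched below. The helper `find_nvcc` is hypothetical and only approximates the lookup; the assumption that nvcc sits at $CUDAHOME/bin/nvcc reflects the usual CUDA toolkit layout and is not confirmed against this repository's setup.py.

```python
import os
import shutil


def find_nvcc(environ=None):
    """Return a path to nvcc, or None if it cannot be found.

    Checks $CUDAHOME/bin/nvcc first (assumed layout), then searches $PATH.
    Hypothetical helper, not the repository's locate_cuda().
    """
    env = os.environ if environ is None else environ
    cuda_home = env.get("CUDAHOME")
    if cuda_home:
        candidate = os.path.join(cuda_home, "bin", "nvcc")
        if os.path.isfile(candidate):
            return candidate
    # shutil.which returns None when nvcc is not on the given search path
    return shutil.which("nvcc", path=env.get("PATH", ""))
```

If `find_nvcc()` returns None, the CUDA toolkit is either not installed or not visible to the shell. In that case, adding the toolkit's bin folder to $PATH or setting $CUDAHOME to the toolkit's root folder, as the error message suggests, should let the build find nvcc.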
