In this project, we propose a deep learning approach to automatically generate captions for images. This project was done as part of the course Deep Learning: Models and Optimization taught by Marco Cuturi (Google Brain).
Authors: Ryan Boustany, Emma Sarfati
This repo is split into two parts:
DLnotebook1: our first approach, based on a CNN-RNN without visual attention weights and a GRU decoder. This notebook is not based on any original paper or code; we built our own architectures and training procedures. We use a simple RNN model with the convolved image as the initial state.
DLnotebook2: our second approach, based on the paper by Xu et al., Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (https://arxiv.org/abs/1502.03044). The idea is to add an attention mechanism over the encoded image.
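At each decoding step, the soft attention of Xu et al. scores every spatial location of the CNN feature map against the decoder's hidden state, turns the scores into weights with a softmax, and feeds the resulting weighted context vector to the decoder. A minimal NumPy sketch of this additive (Bahdanau-style) attention step is below; the weight names `W1`, `W2`, `v` and the dimensions are illustrative, not taken from our notebooks:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(features, hidden, W1, W2, v):
    # features: (L, D) annotation vectors, one per spatial location of the CNN map
    # hidden:   (H,)   current decoder hidden state
    scores = np.tanh(features @ W1 + hidden @ W2) @ v  # (L,) one score per location
    alpha = softmax(scores)                            # attention weights, sum to 1
    context = alpha @ features                         # (D,) weighted context vector
    return context, alpha

# Illustrative sizes: an 8x8 feature grid (L=64), feature dim 256, hidden dim 512
rng = np.random.default_rng(0)
L, D, H, A = 64, 256, 512, 128
features = rng.normal(size=(L, D))
hidden = rng.normal(size=(H,))
W1 = rng.normal(size=(D, A)) * 0.01
W2 = rng.normal(size=(H, A)) * 0.01
v = rng.normal(size=(A,))

context, alpha = additive_attention(features, hidden, W1, W2, v)
print(context.shape, alpha.shape)  # context is (D,), alpha is (L,)
```

In the full model, `context` is concatenated with the previous word embedding as input to the GRU, and `alpha` can be reshaped to the spatial grid to visualize where the model "looks" for each word.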
The notebooks might not display correctly on GitHub because the files are too large. You can use Jupyter nbviewer instead: https://nbviewer.jupyter.org.
If you want to run the notebooks locally, you will need to run the !wget cells at the beginning, which may take a long time. Once that is done, do not forget to change the paths to the images and captions in the notebooks.
The theoretical bases and motivations are detailed in our report, in the file DLreport.pdf.
A little foretaste of the results of the visual attention model...
If you wish to run the notebooks yourself, you can either git clone this repo or make a copy of the Google Colab notebooks:
Notebook 1: https://colab.research.google.com/drive/1L3K0pHwiu0UEVfWn1njjh-MgN2IgHe_I?usp=sharing
Notebook 2: https://colab.research.google.com/drive/1GTyrtpoKGYDHSC9429cAhKFiTrSVZDCh?usp=sharing