This project forked from ayush-patel-10/show-attend-and-tell-neural-image-caption-generation-with-visual-attention

Neural Image Caption Generation Project

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

Website: https://expo.baulab.info/2023-Fall/Ayush-Patel-10/

This repository provides two implementations of the paper "Show, Attend and Tell", one using VGG and one using InceptionV3, trained on the MS COCO dataset. Below are instructions to train, test, and evaluate the models.

Instructions

  1. Training the Model:
  • Open the TrainingAndSavingModel.ipynb Jupyter Notebook file.
  • Run the entire file to train the model. Ensure that the model is saved on your Google Drive.
  • You can adjust hyperparameters according to your system needs.
  2. Testing the Model:
  • Open the LoadingAndTestingModel.ipynb Jupyter Notebook file.
  • Run the file to check the model's performance on your own images.
  • Ensure that the images are in the required format.
  3. Evaluation:
  • Open the cocoEvalCapDemo.ipynb Jupyter Notebook file.
  • Run the file to check the BLEU and METEOR scores for the models.
  • Note: A customized copy of coco-caption is included in the repository, with some modifications made to meet the models' requirements.

Usage

To use the pre-trained models on your images, follow the steps in the LoadingAndTestingModel.ipynb file.

Training

If you want to train the models on your dataset:

  1. Prepare your dataset in the required format.
  2. Open the TrainingAndSavingModel.ipynb file.
  3. Run the file, adjusting hyperparameters as needed.

We trained the model with a vocabulary size of 5,000 on 100,000 images and captions.
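
As a rough illustration of what a capped vocabulary means in practice, the sketch below keeps only the most frequent words and maps everything else to an unknown-word token. The function names (`build_vocab`, `encode`) and the special tokens are hypothetical and are not taken from the notebooks, which may tokenize captions differently.

```python
from collections import Counter

# Hypothetical special tokens; the notebooks may use different ones.
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"


def build_vocab(captions, vocab_size):
    """Map the most frequent words to integer ids, capped at vocab_size entries."""
    counts = Counter(word for cap in captions for word in cap.lower().split())
    specials = [PAD, START, END, UNK]
    # Reserve room for the special tokens, then keep the most frequent words.
    kept = [w for w, _ in counts.most_common(max(vocab_size - len(specials), 0))]
    return {word: idx for idx, word in enumerate(specials + kept)}


def encode(caption, vocab):
    """Convert a caption to ids, replacing out-of-vocabulary words with <unk>."""
    words = [START] + caption.lower().split() + [END]
    return [vocab.get(w, vocab[UNK]) for w in words]


# Toy data standing in for the real captions.
captions = ["a dog runs on the grass", "a cat sits on the mat", "a dog sits"]
vocab = build_vocab(captions, vocab_size=8)
ids = encode("a dog sits on the sofa", vocab)
```

With the real data, `vocab_size=5000` would correspond to the setting described above. A larger value keeps more rare words at the cost of a larger output layer.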

Evaluation

Evaluate the models using the cocoEvalCapDemo.ipynb file to obtain BLEU and METEOR scores.

Customization

Feel free to customize the code, adjust hyperparameters, or use different pre-trained models. These hyperparameters were adjusted according to our system requirements. You can increase the vocab size and the number of images and captions used.

Results

We evaluated the model on two metrics, BLEU and METEOR. We achieved a BLEU-1 score of 36 and a METEOR score of 10.04.
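
For readers unfamiliar with BLEU-1, the following is a minimal sentence-level sketch: clipped unigram precision multiplied by a brevity penalty. It is not the code used to produce the scores above. Those come from the bundled coco-caption code, which may tokenize and aggregate over the whole dataset differently. The function name `bleu1` and the whitespace tokenization are assumptions made for illustration.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Sentence-level BLEU-1: clipped unigram precision times a brevity penalty."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0

    # Clip each candidate word count by its maximum count in any single reference.
    cand_counts = Counter(cand)
    max_ref_counts = Counter()
    for ref in refs:
        for word, count in Counter(ref).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)
    clipped = sum(min(count, max_ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(cand)

    # The brevity penalty uses the reference length closest to the candidate
    # length (the shorter one on ties) and only penalizes short candidates.
    ref_len = min((len(r) for r in refs), key=lambda n: (abs(n - len(cand)), n))
    penalty = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))
    return penalty * precision


# Repeated words are clipped: only one "the" is credited.
repeated = bleu1("the the the", ["the cat"])
# A short candidate is penalized even when every word matches.
short = bleu1("a dog", ["a dog runs fast"])
```

The function returns a value between 0 and 1. A reported figure such as 36 suggests a 0 to 100 scale, although the scale is not stated here.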

Contact

For any issues, contributions, or questions, please feel free to reach out.

LinkedIn: https://www.linkedin.com/in/ayush-patel-ml/

Contributors

ayush-patel-10
