mcan-vqa-thai

This repository is derived from https://github.com/MILVLG/mcan-vqa. You can read the original authors' README at https://github.com/suakow/mcan-vqa-thai/blob/master/old_README.md.

This repository is part of a term project for the NLP course at Chulalongkorn University, semester 2/2020. The project is about visual question answering (VQA) in Thai. We chose the model from https://github.com/MILVLG/mcan-vqa and modified its language-understanding part by replacing the Embedding and LSTM layers with WangchanBERTa from VISTEC-AI (https://github.com/vistec-AI/thai2transformers, https://arxiv.org/abs/2101.09635).
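As a rough illustration of that swap, the sketch below wraps an arbitrary pretrained encoder in a PyTorch module that returns per-token question features. It is not the code from this repository: `PretrainedQuestionEncoder`, `ToyEncoder`, the projection layer, and all dimensions are assumptions made for the example, and `ToyEncoder` is only a stand-in so the snippet runs without downloading WangchanBERTa.

```python
import torch
import torch.nn as nn


class PretrainedQuestionEncoder(nn.Module):
    """Hypothetical wrapper: pretrained encoder followed by a linear projection."""

    def __init__(self, encoder, encoder_dim, hidden_size):
        super().__init__()
        self.encoder = encoder
        # Assumption: project encoder states to the size the rest of the model expects.
        self.proj = nn.Linear(encoder_dim, hidden_size)

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids, attention_mask)  # (batch, seq, encoder_dim)
        return self.proj(states)                          # (batch, seq, hidden_size)


class ToyEncoder(nn.Module):
    """Stand-in for a pretrained language model; NOT the real WangchanBERTa."""

    def __init__(self, vocab_size=100, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, input_ids, attention_mask):
        # Zero out the states of padded positions.
        mask = attention_mask.unsqueeze(-1).to(self.emb.weight.dtype)
        return self.emb(input_ids) * mask


question_encoder = PretrainedQuestionEncoder(ToyEncoder(), encoder_dim=16, hidden_size=32)
input_ids = torch.randint(0, 100, (2, 5))            # two toy questions, five tokens each
attention_mask = torch.ones(2, 5, dtype=torch.long)  # no padding in this toy batch
features = question_encoder(input_ids, attention_mask)
print(features.shape)
```

In a real setup the stand-in would be replaced by the pretrained model and its tokenizer. Whether a projection layer is needed depends on the hidden sizes involved.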

We used VQA 2.0 as the dataset, selecting 8,000 question-answer pairs as the training set and 2,000 pairs as the test set from the original VQA 2.0 validation set. All 10,000 selected question-answer pairs were translated into Thai with Google Translate and manually verified by our group.
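A minimal sketch of such an 8,000/2,000 split is shown below. This README does not state how the pairs were selected, so the random sampling, the fixed seed, the helper name `split_qa_pairs`, and the toy records are all assumptions for illustration.

```python
import random


def split_qa_pairs(pairs, n_train=8000, n_test=2000, seed=0):
    """Sample n_train + n_test pairs and split them into two disjoint lists."""
    if len(pairs) < n_train + n_test:
        raise ValueError("not enough question-answer pairs to sample from")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    selected = rng.sample(pairs, n_train + n_test)
    return selected[:n_train], selected[n_train:]


# Toy stand-in for the VQA 2.0 validation pairs (hypothetical records).
toy_pairs = [
    {"question_id": i, "question": f"q{i}", "answer": f"a{i}"}
    for i in range(12000)
]
train_set, test_set = split_qa_pairs(toy_pairs)
print(len(train_set), len(test_set))
```

Because the sample is drawn without replacement, no question-answer pair can appear in both sets.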

For the image features, we used the pre-extracted features from the original repository, which can be downloaded from this link

Training

You need Google Colaboratory Pro with a GPU enabled for training and inference. If you set up your own environment, you can install the dependencies required by this project from requirements.txt:

$ pip install -r requirements.txt

You also have to download the image features from this link

The question-answer pairs for training and inference (test set) are included in this repository. You can find them at this link.

Once you have completed the steps above, you can fine-tune the model for Thai with the notebook at this link, either in Google Colab or on your own machine. You have to change the image feature path before running. You can click this link to open the notebook in Google Colab directly.
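Before launching a long run, it can help to confirm that the image feature path actually points at the downloaded features. The helper below is a small sketch, not part of this repository: the name `count_feature_files` and the `.npz` suffix are assumptions, so adjust the suffix to however your feature files are stored.

```python
import tempfile
from pathlib import Path


def count_feature_files(feature_dir, suffix=".npz"):
    """Return how many files ending in `suffix` exist under feature_dir."""
    root = Path(feature_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"image feature directory not found: {root}")
    return sum(1 for p in root.rglob(f"*{suffix}") if p.is_file())


# Demo on a temporary directory holding two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.npz", "b.npz"):
        (Path(tmp) / name).touch()
    n_found = count_feature_files(tmp)
print(n_found)
```

Calling the helper on your own feature directory at the top of the notebook surfaces a wrong path immediately instead of partway through data loading.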

Inference

For inference, you have to download the original images from this link so that they can be displayed during inference.

Complete the same setup as for training before running inference. The inference notebook is at this link, or click this link to open it in Google Colab directly. You may have to change the model weight file path (click here to see the code) and the original image file path (click here to see the code) before running.
