GithubHelp home page GithubHelp logo

mostafa-eltalawy / nqg Goto Github PK

View Code? Open in Web Editor NEW

This project forked from magic282/nqg

0.0 0.0 0.0 52 KB

Code for the paper "Neural Question Generation from Text: A Preliminary Study"

License: GNU General Public License v3.0

Python 98.86% Shell 1.14%

nqg's Introduction

NQG

This repository contains code for the paper "Neural Question Generation from Text: A Preliminary Study"

About this code

The experiments in the paper were done with an in-house deep learning tool. Therefore, we re-implement this with PyTorch as a reference.

This code only implements the setting NQG+ in the paper. Within 1 hour's training on Tesla P100, the NQG+ model achieves 12.78 BLEU-4 score on the dev set.

If you find this code useful in your research, please consider citing:

@article{zhou2017neural,
  title={Neural Question Generation from Text: A Preliminary Study},
  author={Zhou, Qingyu and Yang, Nan and Wei, Furu and Tan, Chuanqi and Bao, Hangbo and Zhou, Ming},
  journal={arXiv preprint arXiv:1704.01792},
  year={2017}
}

How to run

Prepare the dataset and code

Make an experiment home folder for NQG data and code:

NQG_HOME=~/workspace/nqg
mkdir -p $NQG_HOME/code
mkdir -p $NQG_HOME/data
cd $NQG_HOME/code
git clone https://github.com/magic282/NQG.git
cd $NQG_HOME/data
wget https://res.qyzhou.me/redistribute.zip
unzip redistribute.zip

Put the data in the folder $NQG_HOME/code/data/giga and organize them as:

nqg
├── code
│   └── NQG
│       └── seq2seq_pt
└── data
    └── redistribute
        ├── QG
        │   ├── dev
        │   ├── test
        │   ├── test_sample
        │   └── train
        └── raw

Then collect vocabularies:

python $NQG_HOME/code/NQG/seq2seq_pt/CollectVocab.py \
       $NQG_HOME/data/redistribute/QG/train/train.txt.source.txt \
       $NQG_HOME/data/redistribute/QG/train/train.txt.target.txt \
       $NQG_HOME/data/redistribute/QG/train/vocab.txt
python $NQG_HOME/code/NQG/seq2seq_pt/CollectVocab.py \
       $NQG_HOME/data/redistribute/QG/train/train.txt.bio \
       $NQG_HOME/data/redistribute/QG/train/bio.vocab.txt
python $NQG_HOME/code/NQG/seq2seq_pt/CollectVocab.py \
       $NQG_HOME/data/redistribute/QG/train/train.txt.pos \
       $NQG_HOME/data/redistribute/QG/train/train.txt.ner \
       $NQG_HOME/data/redistribute/QG/train/train.txt.case \
       $NQG_HOME/data/redistribute/QG/train/feat.vocab.txt
head -n 20000 $NQG_HOME/data/redistribute/QG/train/vocab.txt > $NQG_HOME/data/redistribute/QG/train/vocab.txt.20k

Setup the environment

Package Requirements:

nltk scipy numpy pytorch

PyTorch version: This code requires PyTorch v0.4.0.

Python version: This code requires Python3.

Warning: Older versions of NLTK have a bug in the PorterStemmer. Therefore, a fresh installation or update of NLTK is recommended.

A Docker image is also provided.

Docker image

docker pull magic282/pytorch:0.4.0

Run training

The file run.sh is an example. Modify it according to your configuration.

Without Docker

bash $NQG_HOME/code/NQG/seq2seq_pt/run_squad_qg.sh $NQG_HOME/data/redistribute/QG $NQG_HOME/code/NQG/seq2seq_pt

With Docker

nvidia-docker run --rm -ti -v $NQG_HOME:/workspace magic282/pytorch:0.4.0

Then inside the docker:

bash code/NQG/seq2seq_pt/run_squad_qg.sh /workspace/data/redistribute/QG /workspace/code/NQG/seq2seq_pt

nqg's People

Contributors

magic282 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.