AirDialogue Model

Prerequisites

General

  • python (verified on 2.7.16)
  • wget

Python Packages

  • tensorflow (verified on 1.14.0)
  • airdialogue

1. Prepare Dataset

The AirDialogue dataset and its metadata can be downloaded using our download script, which also downloads the NLTK corpus used for preprocessing.

bash ./scripts/download.sh
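
To sanity-check the download, list the data directory; this assumes the script places its files under ./data/airdialogue/, the location used by the later steps:

ls ./data/airdialogue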

We will also generate a set of synthesized context pairs for self-play training. These context pairs contain the initial conditions and the optimal decisions of the synthesized dialogues. Here we also generate the out-of-domain evaluation set (OOD1). See the AirDialogue paper for more details.

bash ./scripts/gen_syn.sh --ood1

2. Preprocessing

We preprocess the train set in order to begin training our model.

bash ./scripts/preprocess.sh -p train

3. Training

Supervised Learning

The first step is to train our model using supervised learning.

python airdialogue_model_tf.py --task_type TRAINEVAL --num_gpus 8 \
                            --input_dir ./data/airdialogue/tokenized \
                            --out_dir ./data/out_dir \
                            --num_units 256 --num_layers 2
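
If you have fewer than 8 GPUs, the same command should work with --num_gpus lowered accordingly; a sketch under that assumption, not a tested configuration:

python airdialogue_model_tf.py --task_type TRAINEVAL --num_gpus 1 \
                            --input_dir ./data/airdialogue/tokenized \
                            --out_dir ./data/out_dir \
                            --num_units 256 --num_layers 2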

Training with Reinforcement Learning Self-play

The second step is to train our model using self-play, starting from the supervised learning checkpoint.

python airdialogue_model_tf.py --task_type SP_DISTRIBUTED --num_gpus 8 \
                            --input_dir ./data/airdialogue/tokenized \
                            --self_play_pretrain_dir ./data/out_dir \
                            --out_dir ./data/selfplay_out_dir

Examine Training Meta Information

Training metadata is written to the output directory and can be examined using TensorBoard. The following command inspects the training procedure of the supervised learning model.

tensorboard --logdir=./data/out_dir

To view the training metadata for the self-play model, swap --logdir to ./data/selfplay_out_dir:
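
tensorboard --logdir=./data/selfplay_out_dir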

4. Evaluating on the AirDialogue dev set

Preprocessing

Similar to training, we first need to preprocess the dev dataset. Here we also preprocess the OOD1 dataset for evaluation.

bash ./scripts/preprocess.sh -p dev --ood1

Predicting

We use the following script to evaluate our trained model on the dev set. Following the AirDialogue paper, we also evaluate the model's performance on the OOD1 evaluation set generated earlier. The evaluation script first tries to find the self-play model; if that fails, it falls back to the supervised model.

bash ./scripts/evaluate.sh -p dev -a ood1
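
The -m, -o, and -i flags used later for the synthesized data (presumably the model, output, and input directories) can likely be passed here as well to point the script at specific locations; a sketch under that assumption, using the directories from the previous steps:

bash ./scripts/evaluate.sh -p dev -a ood1 -m ./data/out_dir -o ./data/out_dir -i ./data/airdialogue/tokenized/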

5. Scoring

Once the prediction files are generated, we rely on the AirDialogue toolkit for scoring. We are currently working on the scoring script.

airdialogue score --pred_data ./data/out_dir/dev_inference_out.txt \
                  --true_data ./data/airdialogue/tokenized/dev.infer.tar.data \
                  --true_kb ./data/airdialogue/tokenized/dev.infer.kb \
                  --task infer \
                  --output ./data/out_dir/dev_bleu.json
airdialogue score --pred_data ./data/out_dir/dev_selfplay_out.txt \
                  --true_data ./data/airdialogue/tokenized/dev.selfplay.eval.data \
                  --true_kb ./data/airdialogue/tokenized/dev.selfplay.eval.kb \
                  --task selfplay \
                  --output ./data/out_dir/dev_selfplay.json
airdialogue score --pred_data ./data/out_dir/ood1_selfplay_out.txt \
                  --true_data ./data/airdialogue/tokenized/ood1.selfplay.eval.data \
                  --true_kb ./data/airdialogue/tokenized/ood1.selfplay.eval.kb \
                  --task selfplay \
                  --output ./data/out_dir/ood1_selfplay.json
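
The scores are written as JSON files; a simple way to inspect them is to pretty-print them (the exact fields depend on the airdialogue toolkit version):

python -m json.tool ./data/out_dir/dev_bleu.json
python -m json.tool ./data/out_dir/dev_selfplay.json
python -m json.tool ./data/out_dir/ood1_selfplay.json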

6. Evaluating on the AirDialogue test set

We are currently working on the evaluation process for the test set.

7. Benchmark Results

We are currently working on benchmarking the results.

8. Misc

a. Task and Dataset Alignments

Training

| Task | Data files | Source |
| --- | --- | --- |
| Supervised | train.data, train.kb | AirDialogue |
| Self-play | train.selfplay.data, train.selfplay.kb | synthesized (meta1) |

Testing-Dev

| Task | Data files | Source |
| --- | --- | --- |
| Inference | dev.infer.src.data, dev.infer.tar.data, dev.infer.kb | AirDialogue |
| Self-play Eval | dev.selfplay.eval.data, dev.selfplay.eval.kb | AirDialogue |
| Self-play Eval (OOD1) | ood1.selfplay.data, ood1.selfplay.eval.kb | synthesized (meta2) |
| Eval | dev.eval.data, dev.eval.kb | AirDialogue |

Testing-Test (hidden)

| Task | Data files | Source |
| --- | --- | --- |
| Inference | test.infer.src.data, test.infer.tar.data, test.infer.kb | AirDialogue |
| Self-play Eval | test.selfplay.eval.data, test.selfplay.eval.kb | AirDialogue |
| Self-play Eval (OOD2) | ood2.selfplay.data, ood2.selfplay.kb | synthesized (meta3) |

b. Working with Synthesized Data

As an alternative to the AirDialogue dataset, we can verify our model using synthesized data.

Training

To generate a synthesized dataset for training, pass the -s option to the data generation script. By default, synthesized data is placed under ./data/synthesized/.

bash ./scripts/gen_syn.sh -s --ood1

We then need to preprocess the synthesized data for training:

bash ./scripts/preprocess.sh -s -p train
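
The preprocessed files should land under ./data/synthesized/tokenized/, which is the input directory used by the commands below; a quick check:

ls ./data/synthesized/tokenized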

Similar to experiments on the AirDialogue dataset, we can train a supervised model for the synthesized data:

python airdialogue_model_tf.py --task_type TRAINEVAL --num_gpus 8 \
                            --input_dir ./data/synthesized/tokenized \
                            --out_dir ./data/synthesized_out_dir \
                            --num_units 256 --num_layers 2

Using the supervised model as pre-training, we can also train the synthesized model with self-play:

python airdialogue_model_tf.py --task_type SP_DISTRIBUTED --num_gpus 8 \
                            --input_dir ./data/synthesized/tokenized \
                            --self_play_pretrain_dir ./data/synthesized_out_dir \
                            --out_dir ./data/synthesized_selfplay_out_dir
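
As with the AirDialogue runs, training metadata for these models can be examined with TensorBoard:

tensorboard --logdir=./data/synthesized_out_dir
tensorboard --logdir=./data/synthesized_selfplay_out_dir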

Testing

Before testing on the dev data, we first need to preprocess it.

Dev Dataset

bash ./scripts/preprocess.sh -p dev --ood1 -s

We can then execute the evaluation script on the synthesized dev set:

bash ./scripts/evaluate.sh -p dev -a ood1 -m ./data/synthesized_out_dir -o ./data/synthesized_out_dir -i ./data/synthesized/tokenized/

Scoring

airdialogue score --pred_data ./data/synthesized_out_dir/dev_inference_out.txt \
                  --true_data ./data/synthesized/tokenized/dev.infer.tar.data \
                  --true_kb ./data/synthesized/tokenized/dev.infer.kb \
                  --task infer \
                  --output ./data/synthesized_out_dir/dev_bleu.json
airdialogue score --pred_data ./data/synthesized_out_dir/dev_selfplay_out.txt \
                  --true_data ./data/synthesized/tokenized/dev.selfplay.eval.data \
                  --true_kb ./data/synthesized/tokenized/dev.selfplay.eval.kb \
                  --task selfplay \
                  --output ./data/synthesized_out_dir/dev_selfplay.json
airdialogue score --pred_data ./data/synthesized_out_dir/ood1_selfplay_out.txt \
                  --true_data ./data/synthesized/tokenized/ood1.selfplay.eval.data \
                  --true_kb ./data/synthesized/tokenized/ood1.selfplay.eval.kb \
                  --task selfplay \
                  --output ./data/synthesized_out_dir/ood1_selfplay.json

One can repeat the same steps for the synthesized test set as well. Please refer to the AirDialogue paper for the results on the synthesized dataset.
