keras_dialogue_generation_toolkit
Introduction
This is a Keras framework for dialogue generation. It includes the following basic generative models:
- Seq2Seq with Attention Mechanism (Neural machine translation by jointly learning to align and translate)
- Pointer-Generator (Get to the point: Summarization with pointer-generator networks)
- Memory Network (End-to-end memory networks)
- Multi-Task (A knowledge-grounded neural conversation model)
- Transformer (Attention is all you need)
- Universal Transformer (Universal transformers)
- Transformer_ED: Transformer with Expanded Decoder (Enhancing Conversational Dialogue Models with Grounded Knowledge)
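Several of the models above (Transformer, Universal Transformer, Transformer_ED) are built around attention. As a reference for readers new to the idea, here is a minimal NumPy sketch of the scaled dot-product attention from "Attention Is All You Need" — it is illustrative only, not the toolkit's Keras implementation (which lives in the models folder):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    q: (seq_q, d_k) queries, k: (seq_k, d_k) keys, v: (seq_k, d_v) values.
    Returns the (seq_q, d_v) context matrix and the (seq_q, seq_k) weights.
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)               # (seq_q, seq_k)
    # numerically stable softmax over the key axis
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy example: 5 query positions attending over 7 key/value positions.
q = np.random.rand(5, 8)
k = np.random.rand(7, 8)
v = np.random.rand(7, 16)
context, attn = scaled_dot_product_attention(q, k, v)
```

Each row of `attn` is a probability distribution over the key positions, and `context` is the corresponding weighted mixture of the values.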
Structure
- commonly_used_code: This folder contains code shared by all of the sub-tasks, so it can be reused across them.
- configuration: This folder contains the task configuration, including all kinds of file paths.
- run_script: This folder contains the parameter parser file, and run scripts can be put here as well. All models' hyper-parameters are set in args_parser.py.
- data: This folder contains all data. The framework has a specific requirement for the data set format; an example data set is included in this folder.
- pre_precessing: This folder contains the data processing code. All data processing steps should go in this folder, and the final processed data should be written to the data folder.
- example: For all of the models listed above, the training entry point is in this folder. You can get started from here.
- models: The core model files are in this folder; here you can find the code for each model.
Reference and Acknowledgement:
- Pointer_generator: Thanks to Abisee; we use his repo from GitHub. In accordance with the Apache License version 2.0, we use this repo for research only. We refer readers to the original repo: Pointer-Generator
- Memory Neural Network: Thanks to these authors. We drew inspiration from two repos: the memn2n TensorFlow version and the memn2n Keras version.
- Transformer and Universal Transformer: Thanks to kpot for his repo keras-transformer. We changed his original code to fit our experiments. He also implemented BERT on top of this repo. Although his repo does not implement the entire Transformer, it is still easy to add the Decoder part within this framework. The original repo can be found here: keras-transformer.
Usage:
Go into the example folder and run the 'train_***.py' file. An example is given below:
python train_transformer.py --data_set=wizard \
--exp_name=transformer \
--batch_size=40 \
--src_seq_length=30 \
--tar_seq_length=30 \
--early_stop_patience=2 \
--lr_decay_patience=1 \
--lr=0.001
As for the hyper-parameters, they can all be found in run_script/args_parser.py.
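The flags shown above are defined in run_script/args_parser.py. As a rough sketch of what such a parser might look like (the option names below are taken from the example invocation; the defaults shown are hypothetical, so consult args_parser.py for the real ones):

```python
import argparse

def build_parser():
    # Hypothetical sketch mirroring the flags from the example command;
    # the real definitions and defaults live in run_script/args_parser.py.
    p = argparse.ArgumentParser(description="Dialogue model training options")
    p.add_argument("--data_set", type=str, default="wizard")
    p.add_argument("--exp_name", type=str, default="transformer")
    p.add_argument("--batch_size", type=int, default=40)
    p.add_argument("--src_seq_length", type=int, default=30)
    p.add_argument("--tar_seq_length", type=int, default=30)
    p.add_argument("--early_stop_patience", type=int, default=2)
    p.add_argument("--lr_decay_patience", type=int, default=1)
    p.add_argument("--lr", type=float, default=0.001)
    return p

# Parse with no CLI arguments to illustrate the defaults.
args = build_parser().parse_args([])
```

A training script can then read its hyper-parameters from `args` (e.g. `args.batch_size`, `args.lr`) when building and fitting the model.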