Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues

License: MIT License

SL4DU

Environment

Option 1: container

frontlibrary/transformers-pytorch-gpu:4.6.1-pyarrow

~$ docker run --runtime=nvidia -it --rm -v $HOME/SL4DU:/workspace frontlibrary/transformers-pytorch-gpu:4.6.1-pyarrow

Option 2: build from scratch

Reproduction steps: an example

  1. Initialize directories
    SL4DU
        code
        data
        pretrained
    
  2. Download code and the Ubuntu data
    ~/SL4DU/code$ git clone https://github.com/RayXu14/SL4DU.git
    ~/SL4DU/data$ wget https://www.dropbox.com/s/2fdn26rj6h9bpvl/ubuntu_data.zip
    ~/SL4DU/data$ unzip ubuntu_data.zip
  3. Add the bert-base-uncased pretrained model to pretrained
    • config.json
    • vocab.txt
    • pytorch_model.bin
  4. Preprocess data
    ~/SL4DU/code/SL4DU$ python3 preprocess.py --task=RS --dataset=Ubuntu --raw_data_path=../../data/ubuntu_data --pkl_data_path=../../data/ubuntu_data --pretrained_model=bert-base-uncased
  5. Reproduce BERT result
    ~/SL4DU/code/SL4DU$ python3 -u train.py --save_ckpt --task=RS --dataset=Ubuntu --pkl_data_path=../../data/ubuntu_data --pretrained_model=bert-base-uncased --add_EOT --freeze_layers=0 --train_batch_size=8 --eval_batch_size=100 --log_dir=? # --pkl_valid_file=test.pkl
  6. Add post-ubuntu-bert-base-uncased to pretrained
    • Download Whang's Ubuntu checkpoint and convert it to our format with deprecated/whangpth2bin.py. Compared with bert-base-uncased, the only changes are to increase the vocab size in config.json by 1 and to append the new token [EOS] to the end of vocab.txt
    • Or use our pretrained models (already transformed) instead
  7. Reproduce BERT-VFT result
    ~/SL4DU/code/SL4DU$ python3 -u train.py --save_ckpt --task=RS --dataset=Ubuntu --pkl_data_path=../../data/ubuntu_data --pretrained_model=post-ubuntu-bert-base-uncased --freeze_layers=8 --train_batch_size=16 --eval_batch_size=100 --log_dir=? #--pkl_valid_file=test.pkl
  8. Reproduce SL4RS result
    ~/SL4DU/code/SL4DU$ python3 -u train.py --save_ckpt --task=RS --dataset=Ubuntu --pkl_data_path=../../data/ubuntu_data --pretrained_model=post-ubuntu-bert-base-uncased --freeze_layers=8 --train_batch_size=4 --eval_batch_size=100 --log_dir=? --use_NSP --use_UR --use_ID --use_CD --train_view_every=80 #--pkl_valid_file=test.pkl
  9. Evaluation
    ~/SL4DU/code/SL4DU$ python3 -u eval.py --task=Ubuntu --data_path=../../data/ubuntu_data --pretrained_model=post-ubuntu-bert-base-uncased --freeze_layers=8 --eval_batch_size=100 --log_dir ? --load_path=?
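
The file-level parts of steps 1, 3 and 6 can be sketched in Python as follows. This is only an illustration, not part of the repository: the function names are hypothetical, and it assumes that each model lives in its own subdirectory of pretrained (for example pretrained/bert-base-uncased) and that config.json stores the vocabulary size under the usual BERT key vocab_size. The sketch does not produce pytorch_model.bin for the post-trained model; the weights still have to come from the converted checkpoint.

```python
import json
from pathlib import Path

# Files that step 3 lists for a pretrained model directory.
REQUIRED_FILES = ("config.json", "vocab.txt", "pytorch_model.bin")


def init_layout(root):
    """Create the code/, data/ and pretrained/ directories from step 1."""
    root = Path(root)
    for name in ("code", "data", "pretrained"):
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


def missing_model_files(model_dir):
    """Return the required files (step 3) that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


def add_eos_token(base_dir, target_dir, token="[EOS]"):
    """Derive config.json and vocab.txt for the post-trained model (step 6).

    Assumption: the vocabulary size is stored under "vocab_size".
    The model weights are NOT created here.
    """
    base_dir, target_dir = Path(base_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    config = json.loads((base_dir / "config.json").read_text(encoding="utf-8"))
    config["vocab_size"] += 1  # one extra entry for the new token
    (target_dir / "config.json").write_text(
        json.dumps(config, indent=2), encoding="utf-8"
    )

    vocab_text = (base_dir / "vocab.txt").read_text(encoding="utf-8")
    if not vocab_text.endswith("\n"):
        vocab_text += "\n"
    # The new token goes on the last line, so it receives the highest id.
    (target_dir / "vocab.txt").write_text(vocab_text + token + "\n", encoding="utf-8")
    return config["vocab_size"]
```

Calling missing_model_files on a model directory before preprocessing or training gives a quick check that all three files are in place.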

Pretraining by yourself

Using Whang's repo

Remember to convert the saved model to our format using deprecated/whangpth2bin.py.

Additional information for pretraining settings

Set the number of epochs to 2 for post-training on the data with 10 duplications, and set the virtual batch size to 384.
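
A virtual batch size is usually reached by accumulating gradients over several smaller batches. The sketch below assumes that this is what is meant here and shows the arithmetic. The function name, the per-step batch size of 32 and the device counts are illustrative assumptions, not settings taken from this repository or from Whang's.

```python
def accumulation_steps(virtual_batch_size, per_step_batch_size, num_devices=1):
    """Number of forward/backward passes to accumulate per optimizer step.

    Assumes virtual batch = per-step batch * devices * accumulation steps.
    """
    per_pass = per_step_batch_size * num_devices
    if virtual_batch_size % per_pass != 0:
        raise ValueError(
            f"virtual batch size {virtual_batch_size} is not a multiple of {per_pass}"
        )
    return virtual_batch_size // per_pass


# Hypothetical setups that both reach a virtual batch size of 384.
print(accumulation_steps(384, 32))                 # 12
print(accumulation_steps(384, 16, num_devices=4))  # 6
```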

