vin-ivar / simple_elmo_training

This project is forked from ltgoslo/simple_elmo_training

Minimal code to train ELMo models in recent versions of TensorFlow

License: Apache License 2.0

Python 100.00%

simple_elmo_training

Minimal code to train ELMo models in TensorFlow.

Heavily based on https://github.com/allenai/bilm-tf .

Most changes are simplifications and updates that make the code work with recent versions of TensorFlow 1. See also our repository with simple code to infer contextualized word vectors from pre-trained ELMo models.

Training

python3 bilm/train_elmo.py --train_prefix $DATA --size $SIZE --vocab_file $VOCAB --save_dir $OUT

where

$DATA is a path to the directory containing two or more (possibly gzipped) plain-text files: your training corpus.

$SIZE is the number of word tokens in $DATA (necessary to properly construct and log batches).

$VOCAB is a (possibly gzipped) one-word-per-line vocabulary file to be used for language modeling; it should always contain at least <S>, </S> and <UNK>.

$OUT is a directory where the TensorFlow checkpoints will be saved.
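
The values for $SIZE and $VOCAB can be derived from the corpus itself. The following is a minimal sketch, not part of this repository: the helper names count_tokens and write_vocab are hypothetical, tokens are assumed to be whitespace-separated, and the special tokens <S>, </S> and <UNK> follow the bilm-tf convention.

```python
import gzip
from collections import Counter
from pathlib import Path


def open_text(path):
    """Open a plain or gzipped text file for reading as UTF-8."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def count_tokens(data_dir):
    """Return a Counter of whitespace-separated tokens over all files in data_dir."""
    counts = Counter()
    for path in sorted(Path(data_dir).iterdir()):
        if not path.is_file():
            continue
        with open_text(path) as handle:
            for line in handle:
                counts.update(line.split())
    return counts


def write_vocab(counts, vocab_path, special=("<S>", "</S>", "<UNK>")):
    """Write a one-word-per-line vocabulary: special tokens first, then words by frequency."""
    words = [word for word, _ in counts.most_common() if word not in special]
    with open(vocab_path, "w", encoding="utf-8") as out:
        for word in list(special) + words:
            out.write(word + "\n")
```

With counts = count_tokens(data_dir), the value to pass as --size is sum(counts.values()). Write the vocabulary file outside the corpus directory so that it is not counted as training data on a later run.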

Before training, please review the settings in bilm/train_elmo.py. The most important are:

  • batch_size (default 128)
  • n_gpus (default 2; if no GPU, all available CPU cores are used)
  • LSTM dimensionality (default 2048; the original paper used 4096)
  • n_epochs (default 3; optimal value depends on the size of your corpus)
  • n_negative_samples_batch (default 4096; the original paper used 8192)
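
To see how these settings interact with $SIZE, it can help to estimate the number of training batches in advance. The sketch below assumes the batch arithmetic used in bilm-tf (tokens per batch = batch_size × unroll_steps × n_gpus) and an unroll_steps value of 20; both are assumptions that should be checked against bilm/train_elmo.py, and training_schedule is a hypothetical helper, not a function from this repository.

```python
def training_schedule(n_train_tokens, batch_size=128, n_gpus=2,
                      n_epochs=3, unroll_steps=20):
    """Estimate batch counts for a training run.

    unroll_steps=20 is an assumed value; the other defaults mirror
    the settings listed above.
    """
    # Each batch consumes batch_size sequences of unroll_steps tokens per GPU.
    tokens_per_batch = batch_size * unroll_steps * n_gpus
    batches_per_epoch = n_train_tokens // tokens_per_batch
    return {
        "tokens_per_batch": tokens_per_batch,
        "batches_per_epoch": batches_per_epoch,
        "batches_total": batches_per_epoch * n_epochs,
    }
```

Under these assumptions, a corpus of 10,240,000 tokens with the default settings gives 5,120 tokens per batch, 2,000 batches per epoch and 6,000 batches in total.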

Converting to HDF5

After training, use the bilm/dump_weights.py script to convert the checkpoints to an HDF5 model.

python3 bilm/dump_weights.py --save_dir $MODEL_DIR --outfile $MODEL_DIR/model.hdf5

Save your vocabulary file in the same directory. Change the n_characters value in the options.json file from 261 to 262 to use the saved model for inference.
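
The n_characters change can be made by hand or scripted. A minimal sketch follows, assuming that options.json stores the value under the char_cnn key as in bilm-tf; set_n_characters is a hypothetical helper, not part of this repository.

```python
import json
from pathlib import Path


def set_n_characters(model_dir, new_value=262):
    """Update char_cnn.n_characters in model_dir/options.json; return the old value."""
    options_path = Path(model_dir) / "options.json"
    with open(options_path, "r", encoding="utf-8") as handle:
        options = json.load(handle)

    # Assumed layout: {"char_cnn": {"n_characters": 261, ...}, ...}
    old_value = options["char_cnn"]["n_characters"]
    options["char_cnn"]["n_characters"] = new_value

    with open(options_path, "w", encoding="utf-8") as handle:
        json.dump(options, handle, indent=2)
    return old_value
```

Calling set_n_characters(model_dir) on the directory that holds model.hdf5 rewrites the file in place, so keep a copy of the original options.json if you plan to resume training from the same checkpoints.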

More details at https://github.com/allenai/bilm-tf

Contributors

akutuzov, vinbo8
