blank_language_model's Issues

How to fill a <blank> with different options?

python test.py --checkpoint checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch=???.ckpt
--fill data/yelp/blank/test.1.blank --output test.1.tsf

I used the script above to fill the blanks, but it always produces the same fillings.
How can I generate different feasible fillings?
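
A common reason for identical output on every run is greedy (argmax) decoding, which is deterministic; sampling from the predicted distribution gives varied results. The following is a minimal sketch of that difference, not code from the repository: `greedy_token`, `sample_token`, and the `temperature` parameter are hypothetical names, and whether `test.py` exposes a comparable option is not confirmed here.

```python
import math
import random

def greedy_token(logits):
    """Return the index of the highest-scoring candidate (always the same)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_token(logits, temperature=1.0, rng=random):
    """Sample an index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical scores for three candidate words at one blank.
logits = [2.0, 1.5, 0.3]
rng = random.Random(0)
print(greedy_token(logits))                                  # same index every run
print({sample_token(logits, rng=rng) for _ in range(200)})   # several distinct indices
```

A higher `temperature` flattens the distribution and increases variety; a value close to zero approaches greedy behaviour.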

Is there any way to do only 1-to-1 blank filling without expanding?

Example:
I am used to <blank> at <blank>.

--> I am used to reading at night. (Good)

--> I am used to reading books at every night. (Bad)

I want each <blank> to be replaced by exactly one word, with no additional words,
so that the total number of words stays the same.

How can I do this?
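
One workaround that does not touch the model is to filter its outputs after the fact, keeping only fillings in which every `<blank>` became exactly one whitespace-delimited word. This is a sketch under that assumption; `is_one_to_one` and `keep_one_to_one` are hypothetical helpers, not part of the repository, and they do not constrain decoding itself.

```python
import re

BLANK = "<blank>"

def is_one_to_one(template, filled):
    """True if `filled` equals `template` with each blank replaced by one word."""
    # Escape the literal text between blanks, then require a single
    # run of non-whitespace characters where each blank used to be.
    parts = [re.escape(p) for p in template.split(BLANK)]
    pattern = r"(\S+)".join(parts)
    return re.fullmatch(pattern, filled) is not None

def keep_one_to_one(template, candidates):
    """Filter a list of candidate fillings down to the 1-to-1 ones."""
    return [c for c in candidates if is_one_to_one(template, c)]

template = "I am used to <blank> at <blank>."
candidates = [
    "I am used to reading at night.",
    "I am used to reading books at every night.",
]
print(keep_one_to_one(template, candidates))
```

Combined with a decoder that produces several different fillings per input, this filter selects the ones that preserve the word count.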

How to train the model?

Hello, how do I train a new model? Could you explain the data processing and training steps in detail? I would like to replicate your approach.

Sharing embeddings

Hi all,

I was wondering what the benefits are of sharing the word embedding and projection weights when training a BLM model.
Would you suggest using it as the default hyperparameter when training the BLM model, or are we better off fine-tuning it?

Thank you all :)

No such file or directory: 'checkpoints/yelp/neg/blm/lightning_logs/version_0/hparams.yaml' when testing the model on negative yelp.

Hi,

I have successfully trained the model using:

python train.py --train data/yelp/train.0 --valid data/yelp/valid.0 --root_dir checkpoints/yelp/neg/blm/ \
--vocab_size 10000 --max_len 20 --model_type blm --share_emb_prj_weight --gpus 1 --max_steps 10 

But testing the model

python test.py --checkpoint checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch\=???.ckpt \
--fill data/yelp/blank/test.1.blank --output test.1.tsf

results in an error:

 No such file or directory: 'checkpoints/yelp/neg/blm/lightning_logs/version_0/hparams.yaml'
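
Comparing the two paths in this report, the missing `hparams.yaml` is expected two directory levels above the checkpoint file, in the `version_0` directory. The sketch below only derives that location and checks whether the file exists; the relationship is inferred from the paths quoted above rather than from the repository's code, and `expected_hparams_path` is a hypothetical helper.

```python
from pathlib import Path

def expected_hparams_path(checkpoint):
    """Map <version_dir>/checkpoints/<name>.ckpt to <version_dir>/hparams.yaml."""
    return Path(checkpoint).parent.parent / "hparams.yaml"

# Checkpoint path as given in the report (the epoch number is elided there).
ckpt = "checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch=???.ckpt"
hparams = expected_hparams_path(ckpt)
print(hparams.as_posix(), "exists" if hparams.exists() else "is missing")
```

If the file is reported missing, listing the contents of the `lightning_logs` directory shows whether the training run wrote it under a different version directory.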

Beam search code?

Hi, it looks like there's no option to run beam search during decoding. Would you be willing to release code for computing BLEU results on the Yahoo Answers dataset (which I assume uses beam search)? Thanks in advance.

Can we adapt the BLM model to source code?

Hi everybody,

I'm trying to adapt the blank language model to source code, since in my opinion there are many code-related tasks where such a model could give astonishing results. When dealing with source code, one of the main limitations is the open-vocabulary problem.
Do you think it is feasible to plug in a full-fledged BPE/SentencePiece model to cope with this problem without altering the underlying network?
For the record, I tried increasing the vocabulary size, but the number of UNK tokens I get still hinders the learning process considerably.

Thank you :)
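
To quantify the problem before and after changing the tokenization, it can help to measure the fraction of tokens that fall outside the vocabulary. This is a rough sketch that assumes whitespace tokenization and a vocabulary built from the most frequent training tokens; the repository's own vocabulary construction may differ, and `unk_rate` is a hypothetical helper.

```python
from collections import Counter

def unk_rate(train_tokens, eval_tokens, vocab_size):
    """Fraction of eval tokens not among the `vocab_size` most frequent training tokens."""
    vocab = {tok for tok, _ in Counter(train_tokens).most_common(vocab_size)}
    if not eval_tokens:
        return 0.0
    return sum(tok not in vocab for tok in eval_tokens) / len(eval_tokens)

# Toy code snippets, split on whitespace.
train = "x = foo ( y ) ; x = bar ( y ) ;".split()
held_out = "z = foo ( w ) ;".split()
print(unk_rate(train, held_out, vocab_size=100))
```

Identifiers such as `z` and `w` are unseen no matter how large the word-level vocabulary is, which is why subword tokenization is the usual remedy for source code.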

Speed up the filling process

Hi Guys,

I was wondering if there is a way to speed up the filling process when testing the model.
I found that even if I change the number of available GPUs, it does not seem to have any effect:
'''
# Last few lines of test.py

cuda = not args.no_cuda and torch.cuda.is_available()
device = torch.device('cuda' if cuda else 'cpu')
args.gpus = 3 if cuda else 0
'''

Any ideas?

Thank you in advance ;)
