varal7 / blank_language_model
Blank Language Models
License: Apache License 2.0
python test.py --checkpoint checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch=???.ckpt \
  --fill data/yelp/blank/test.1.blank --output test.1.tsf
I used the script above to fill in blanks, but it always produces the same fillings.
How can I generate different, equally plausible fillings?
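If decoding always takes the argmax at each step, the output is deterministic by construction. Replacing the argmax with sampling is the usual way to get different fillings on different runs. A framework-free sketch of the idea (not the repo's actual decoding code):

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Sample an index from unnormalized logits (softmax + multinomial).

    Unlike argmax, repeated calls can return different tokens, giving
    different blank fillings across runs. temperature > 1 flattens the
    distribution (more diversity); temperature < 1 sharpens it.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5]
greedy = max(range(len(logits)), key=lambda i: logits[i])  # always index 0
sampled = sample_token(logits)  # varies from run to run
```

Check whether test.py exposes a sampling or temperature flag; if it only does greedy decoding, this is the change you would make to its token-selection step.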
Is there any way to do only 1-to-1 blank filling, without expanding a blank into multiple words?
Example:
I am used to <blank> at <blank>.
--> I am used to reading at night. (good)
--> I am used to reading books at every night. (bad)
I want each blank to be replaced by exactly one word, without inserting additional words, so that the total number of words stays the same.
How can I do this?
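In the BLM formulation, each generation step picks a blank, a word to insert, and whether to leave new blanks to the left and/or right of the inserted word. Restricting generation to 1-to-1 filling amounts to masking out the actions that create new blanks. A hypothetical sketch of that masking (the action representation below is illustrative, not the repo's actual API):

```python
# Hypothetical sketch: represent each candidate action as a tuple
# (word_id, left_blank, right_blank), where the boolean flags say
# whether a new blank is kept to the left/right of the inserted word.
# Forcing both flags to False makes every fill exactly one word.

def best_one_to_one_action(action_scores):
    """action_scores: dict mapping (word_id, left_blank, right_blank)
    -> score. Returns the highest-scoring action with expansion
    disallowed, so each blank is replaced by a single word."""
    allowed = {a: s for a, s in action_scores.items()
               if not a[1] and not a[2]}  # no new blank on either side
    return max(allowed, key=allowed.get)

actions = {
    (7, False, False): 1.2,  # fill with word 7, no new blanks
    (7, True,  False): 2.0,  # would expand: insert word 7 and keep a blank
    (9, False, False): 0.5,
}
best = best_one_to_one_action(actions)  # (7, False, False)
```

Note the model was trained with expansion actions available, so forcing this constraint at decoding time may yield less fluent fills than it would produce unconstrained.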
Hello, how do I train a new model? Could you explain the data processing and model training steps in detail? I'd like to replicate your approach.
Hi all,
I was wondering what the benefits are of sharing the word embedding and projection weights when training a BLM model.
Would you suggest using it as a default hyperparameter when training the BLM model, or are we better off tuning it?
Thank you all :)
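For context, this sharing is weight tying (Press & Wolf, 2017), the `--share_emb_prj_weight` flag in train.py: the output projection reuses the input embedding matrix, so no separate vocab-size-by-dimension output matrix is stored or trained. A framework-free sketch of the idea, assuming output scores are dot products against the shared embedding rows:

```python
# Minimal illustration of weight tying: output scores are computed as
# dot products between the hidden state and the *embedding* rows, so
# the embedding matrix doubles as the output projection. This halves
# those parameters, which tends to help on smaller vocabularies/datasets.

def output_scores(hidden, emb):
    """Score each vocabulary item v as dot(hidden, emb[v])."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in emb]

emb = [[1.0, 0.0],   # embedding of token 0 (shared with output layer)
       [0.0, 1.0],   # token 1
       [1.0, 1.0]]   # token 2
scores = output_scores([0.5, 2.0], emb)  # [0.5, 2.0, 2.5]
```

In PyTorch this is typically a one-liner along the lines of assigning the decoder's weight to be the embedding's weight, so both layers train the same tensor.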
Hi,
I have successfully trained the model using:
python train.py --train data/yelp/train.0 --valid data/yelp/valid.0 --root_dir checkpoints/yelp/neg/blm/ \
--vocab_size 10000 --max_len 20 --model_type blm --share_emb_prj_weight --gpus 1 --max_steps 10
But testing the model
python test.py --checkpoint checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch\=???.ckpt \
--fill data/yelp/blank/test.1.blank --output test.1.tsf
results in an error:
No such file or directory: 'checkpoints/yelp/neg/blm/lightning_logs/version_0/hparams.yaml'
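PyTorch Lightning writes hparams.yaml into the version directory during training (via `save_hyperparameters()`), and the test script evidently expects to find it there. If your run was interrupted before it was written, one workaround sketch is to recreate it by hand from the flags you trained with. The keys below mirror the train.py flags in the command above and are assumptions; match them to your actual run:

```python
# Recreate a minimal hparams.yaml from the known training arguments and
# place it in checkpoints/yelp/neg/blm/lightning_logs/version_0/.
# Key names are assumed from the train.py flags shown above.
hparams = {
    "vocab_size": 10000,
    "max_len": 20,
    "model_type": "blm",
    "share_emb_prj_weight": True,
}

with open("hparams.yaml", "w") as f:
    for key, value in hparams.items():
        f.write(f"{key}: {value}\n")
```

Also note that `--max_steps 10` is far too few steps for a usable checkpoint; it may even end before the first epoch completes, which is another way the log directory can end up missing files.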
Hi, it looks like there's no option to run beam search during decoding. Would you be willing to release code for computing BLEU results on the Yahoo Answers dataset (which I assume uses beam search)? Thanks in advance.
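For reference, beam search itself is generic: keep the k highest-scoring partial hypotheses and extend each by its successors at every step. A self-contained sketch with a toy expansion function (not the authors' decoding code, where a step would score blank/word choices):

```python
import math

def beam_search(start, expand, steps, beam_size=3):
    """Generic beam search. `expand(state)` yields (next_state, logprob)
    pairs; at each step keep the `beam_size` best-scoring hypotheses
    by cumulative log-probability."""
    beam = [(0.0, start)]
    for _ in range(steps):
        candidates = []
        for score, state in beam:
            for nxt, lp in expand(state):
                candidates.append((score + lp, nxt))
        beam = sorted(candidates, key=lambda x: x[0], reverse=True)[:beam_size]
    return beam[0][1]  # best final hypothesis

# Toy example: states are strings; each step appends 'a' (p=0.6) or 'b' (p=0.4).
def expand(s):
    return [(s + "a", math.log(0.6)), (s + "b", math.log(0.4))]

best = beam_search("", expand, steps=3)  # "aaa"
```

With `beam_size=1` this reduces to greedy decoding, which may be what test.py currently does.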
Hi everybody,
I'm trying to adapt the blank language model to source code, since in my opinion there are many code-related tasks where such a model could give astonishing results. When we deal with source code, one of the main limitations is the open-vocabulary problem.
Do you think it is doable to plug in a fully fledged BPE/SentencePiece model to cope with this problem without altering the under-the-hood network?
Just for the record, I tried increasing the vocabulary size, but the number of UNK tokens I get hinders the learning process by a lot.
Thank you :)
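For what it's worth, subword tokenization should be compatible with the network as-is, since the model only ever sees token ids; in practice you would train a SentencePiece or BPE model on your corpus and feed its pieces into the existing vocabulary pipeline. A minimal, self-contained BPE-merge sketch of why rare identifiers stop being UNK (a real project would use the `sentencepiece` library instead):

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merges from a list of words. Each word starts as a
    tuple of characters; every step merges the most frequent adjacent
    pair into one symbol. Unseen words then decompose into learned
    pieces instead of collapsing to a single UNK token."""
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = bpe_merges(["low", "lower", "lowest"], num_merges=2)
```

After two merges the shared stem "low" becomes a single piece, so "lowest" is covered even if it never appeared during vocabulary construction.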
Hi Guys,
I was wondering if there is a way to speed up the filling process when testing the model.
I found that even if I change the number of available GPUs, it does not actually seem to have any effect:
'''
# Last few lines of test.py
cuda = not args.no_cuda and torch.cuda.is_available()
device = torch.device('cuda' if cuda else 'cpu')
args.gpus = 3 if cuda else 0
'''
Any ideas?
Thank you in advance ;)
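Note the snippet only sets `args.gpus`; if decoding then processes sentences one at a time on a single device, extra GPUs won't help by themselves. One pragmatic speed-up is to shard the input file and run an independent decoding process per GPU (file names and flags in the comment are illustrative):

```python
# Decoding is embarrassingly parallel across input lines: split the
# blank file into shards and decode each shard in its own process.

def shard(lines, num_shards):
    """Split input lines round-robin into num_shards lists."""
    return [lines[i::num_shards] for i in range(num_shards)]

lines = [f"sentence {i}" for i in range(10)]
shards = shard(lines, 3)
# Each shard would then be written to its own file and decoded with e.g.
#   CUDA_VISIBLE_DEVICES=k python test.py --fill shard_k.blank ...
# and the outputs concatenated back in order afterwards.
```

Batching multiple sentences per forward pass, if test.py decodes one at a time, would likely give a larger speed-up than adding GPUs.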