varal7 / blank_language_model
Blank Language Models
License: Apache License 2.0
python test.py --checkpoint checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch=???.ckpt \
  --fill data/yelp/blank/test.1.blank --output test.1.tsf
I used the script above to fill in blanks, but it always produces the same fillings.
How can I generate different, equally plausible fillings?
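If decoding always takes the argmax at each step, the output is deterministic by construction. Replacing the argmax with sampling is the usual way to get different fillings on different runs. A framework-free sketch of the idea (not the repo's actual decoding code):

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Sample an index from unnormalized logits (softmax + multinomial).

    Unlike argmax, repeated calls can return different tokens, giving
    different blank fillings across runs. temperature > 1 flattens the
    distribution (more diversity); temperature < 1 sharpens it.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5]
greedy = max(range(len(logits)), key=lambda i: logits[i])  # always index 0
sampled = sample_token(logits)  # varies from run to run
```

Check whether test.py exposes a sampling or temperature flag; if it only does greedy decoding, this is the change you would make to its token-selection step.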
Is there any way to do only 1-to-1 blank filling, without expanding a blank into multiple words?
Example:
I am used to <blank> at <blank>.
--> I am used to reading at night. (good)
--> I am used to reading books at every night. (bad)
I want each blank to be replaced by exactly one word, without inserting additional words, so that the total number of words stays the same.
How can I do this?
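In the BLM formulation, each generation step picks a blank, a word to insert, and whether to leave new blanks to the left and/or right of the inserted word. Restricting generation to 1-to-1 filling amounts to masking out the actions that create new blanks. A hypothetical sketch of that masking (the action representation below is illustrative, not the repo's actual API):

```python
# Hypothetical sketch: represent each candidate action as a tuple
# (word_id, left_blank, right_blank), where the boolean flags say
# whether a new blank is kept to the left/right of the inserted word.
# Forcing both flags to False makes every fill exactly one word.

def best_one_to_one_action(action_scores):
    """action_scores: dict mapping (word_id, left_blank, right_blank)
    -> score. Returns the highest-scoring action with expansion
    disallowed, so each blank is replaced by a single word."""
    allowed = {a: s for a, s in action_scores.items()
               if not a[1] and not a[2]}  # no new blank on either side
    return max(allowed, key=allowed.get)

actions = {
    (7, False, False): 1.2,  # fill with word 7, no new blanks
    (7, True,  False): 2.0,  # would expand: insert word 7 and keep a blank
    (9, False, False): 0.5,
}
best = best_one_to_one_action(actions)  # (7, False, False)
```

Note the model was trained with expansion actions available, so forcing this constraint at decoding time may yield less fluent fills than it would produce unconstrained.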
Hello, how do I train a new model? Could you explain the data processing and model training steps in detail? I'd like to replicate your approach.
Hi all,
I was wondering what the benefits are of sharing the word embedding and projection weights when training a BLM model.
Would you suggest using it as a default hyperparameter when training the BLM model, or are we better off tuning it?
Thank you all :)
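For context, this sharing is weight tying (Press & Wolf, 2017), the `--share_emb_prj_weight` flag in train.py: the output projection reuses the input embedding matrix, so no separate vocab-size-by-dimension output matrix is stored or trained. A framework-free sketch of the idea, assuming output scores are dot products against the shared embedding rows:

```python
# Minimal illustration of weight tying: output scores are computed as
# dot products between the hidden state and the *embedding* rows, so
# the embedding matrix doubles as the output projection. This halves
# those parameters, which tends to help on smaller vocabularies/datasets.

def output_scores(hidden, emb):
    """Score each vocabulary item v as dot(hidden, emb[v])."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in emb]

emb = [[1.0, 0.0],   # embedding of token 0 (shared with output layer)
       [0.0, 1.0],   # token 1
       [1.0, 1.0]]   # token 2
scores = output_scores([0.5, 2.0], emb)  # [0.5, 2.0, 2.5]
```

In PyTorch this is typically a one-liner along the lines of assigning the decoder's weight to be the embedding's weight, so both layers train the same tensor.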
Hi,
I have successfully trained the model using:
python train.py --train data/yelp/train.0 --valid data/yelp/valid.0 --root_dir checkpoints/yelp/neg/blm/ \
--vocab_size 10000 --max_len 20 --model_type blm --share_emb_prj_weight --gpus 1 --max_steps 10
But testing the model
python test.py --checkpoint checkpoints/yelp/neg/blm/lightning_logs/version_0/checkpoints/epoch\=???.ckpt \
--fill data/yelp/blank/test.1.blank --output test.1.tsf
results in an error:
No such file or directory: 'checkpoints/yelp/neg/blm/lightning_logs/version_0/hparams.yaml'
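PyTorch Lightning writes hparams.yaml into the version directory during training (via `save_hyperparameters()`), and the test script evidently expects to find it there. If your run was interrupted before it was written, one workaround sketch is to recreate it by hand from the flags you trained with. The keys below mirror the train.py flags in the command above and are assumptions; match them to your actual run:

```python
# Recreate a minimal hparams.yaml from the known training arguments and
# place it in checkpoints/yelp/neg/blm/lightning_logs/version_0/.
# Key names are assumed from the train.py flags shown above.
hparams = {
    "vocab_size": 10000,
    "max_len": 20,
    "model_type": "blm",
    "share_emb_prj_weight": True,
}

with open("hparams.yaml", "w") as f:
    for key, value in hparams.items():
        f.write(f"{key}: {value}\n")
```

Also note that `--max_steps 10` is far too few steps for a usable checkpoint; it may even end before the first epoch completes, which is another way the log directory can end up missing files.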
Hi, it looks like there's no option to run beam search during decoding. Would you be willing to release code for computing BLEU results on the Yahoo Answers dataset (which I assume uses beam search)? Thanks in advance.
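For reference, beam search itself is generic: keep the k highest-scoring partial hypotheses and extend each by its successors at every step. A self-contained sketch with a toy expansion function (not the authors' decoding code, where a step would score blank/word choices):

```python
import math

def beam_search(start, expand, steps, beam_size=3):
    """Generic beam search. `expand(state)` yields (next_state, logprob)
    pairs; at each step keep the `beam_size` best-scoring hypotheses
    by cumulative log-probability."""
    beam = [(0.0, start)]
    for _ in range(steps):
        candidates = []
        for score, state in beam:
            for nxt, lp in expand(state):
                candidates.append((score + lp, nxt))
        beam = sorted(candidates, key=lambda x: x[0], reverse=True)[:beam_size]
    return beam[0][1]  # best final hypothesis

# Toy example: states are strings; each step appends 'a' (p=0.6) or 'b' (p=0.4).
def expand(s):
    return [(s + "a", math.log(0.6)), (s + "b", math.log(0.4))]

best = beam_search("", expand, steps=3)  # "aaa"
```

With `beam_size=1` this reduces to greedy decoding, which may be what test.py currently does.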
Hi everybody,
I'm trying to adapt the blank language model to source code, since in my opinion there are many code-related tasks where such a model could give astonishing results. When we deal with source code, one of the main limitations is the open-vocabulary problem.
Do you think it is doable to plug in a fully fledged BPE/SentencePiece model to cope with this problem without altering the under-the-hood network?
Just for the record, I tried increasing the vocabulary size, but the number of UNK tokens I get hinders the learning process by a lot.
Thank you :)
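For what it's worth, subword tokenization should be compatible with the network as-is, since the model only ever sees token ids; in practice you would train a SentencePiece or BPE model on your corpus and feed its pieces into the existing vocabulary pipeline. A minimal, self-contained BPE-merge sketch of why rare identifiers stop being UNK (a real project would use the `sentencepiece` library instead):

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merges from a list of words. Each word starts as a
    tuple of characters; every step merges the most frequent adjacent
    pair into one symbol. Unseen words then decompose into learned
    pieces instead of collapsing to a single UNK token."""
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = bpe_merges(["low", "lower", "lowest"], num_merges=2)
```

After two merges the shared stem "low" becomes a single piece, so "lowest" is covered even if it never appeared during vocabulary construction.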
Hi Guys,
I was wondering if there is a way to speed up the filling process when testing the model.
I found that even if I change the number of available GPUs, it does not actually seem to have any effect:
'''
# Last few lines of test.py
cuda = not args.no_cuda and torch.cuda.is_available()
device = torch.device('cuda' if cuda else 'cpu')
args.gpus = 3 if cuda else 0
'''
Any ideas?
Thank you in advance ;)
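Note the snippet only sets `args.gpus`; if decoding then processes sentences one at a time on a single device, extra GPUs won't help by themselves. One pragmatic speed-up is to shard the input file and run an independent decoding process per GPU (file names and flags in the comment are illustrative):

```python
# Decoding is embarrassingly parallel across input lines: split the
# blank file into shards and decode each shard in its own process.

def shard(lines, num_shards):
    """Split input lines round-robin into num_shards lists."""
    return [lines[i::num_shards] for i in range(num_shards)]

lines = [f"sentence {i}" for i in range(10)]
shards = shard(lines, 3)
# Each shard would then be written to its own file and decoded with e.g.
#   CUDA_VISIBLE_DEVICES=k python test.py --fill shard_k.blank ...
# and the outputs concatenated back in order afterwards.
```

Batching multiple sentences per forward pass, if test.py decodes one at a time, would likely give a larger speed-up than adding GPUs.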