atcbosselut / comet-commonsense
Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction" https://arxiv.org/abs/1906.05317
License: Apache License 2.0
I know this question is a little off topic, but an answer would be helpful to me. Thank you.
Text-to-(word)set generation or sequence-to-(token)set generation.
For example, input a text and then output the tags for this text:
'Peter is studying English'
--> {'good behavior','person','doing something'}
Thank you!
Dear authors,
Due to the GPU memory limitations of my devices, I would like to train the model on multiple GPUs. After scanning your code, I modified config/atomic/config_2.json (my experiment_num is 2) and config/default.json to enable multi-GPU training, as shown below:
Hi,
It seems that the token-level averaged NLL loss is logged for all loss types:
comet-commonsense/src/train/train.py
Line 118 in 070aad1
If you can confirm that this is indeed an issue, I can submit a pull request.
Thanks
Hi, I want to run the task ‘Classifying Generated Tuples’ and have the following questions:
Can the experiment be run on Windows?
Which version of torch is used for this task? The official PyTorch website does not provide a Windows build for Python 2.7.
Running the command ‘bash scripts/classify/classify.sh’ produces no output.
Hi,
Thank you for the very nice and interesting work.
I have a question regarding the model. The paper mentions that you used the same architecture as GPT, which is a transformer decoder. However, you also discuss an input encoder and how the input to the encoder is configured.
I am a bit confused: do you have both an encoder and a decoder, or is it a language model, in which case it would be just a decoder? And if it is a decoder, how do you encode your [X_s, X_r]?
Could you please clarify this? Many thanks.
Can this model give a knowledge graph according to the input sentence?
Such as "One absolutely cardinal reason is the fact that universities offer an opportunity to intensify the knowledge in a particular field of interest."
Maybe this is a long sentence. Does the model support this?
Hi,
I've tested a number of cases using either the code and parameters from this repo or the demo at https://mosaickg.apps.allenai.org/comet_atomic, and I'm confused because there are obvious differences in the results. What exactly is the difference between them?
Thanks
In the README.md, the tensorflow and spacy install instructions need to change. pip install tensorflow downloads a different version from the one that is needed, and installing with conda works better. Also, spacy 3.0, which is now the default install, doesn't work with your dataloader. Proposed changes:
pip install tensorflow --> conda install tensorflow
conda install -c conda-forge spacy --> conda install -c conda-forge spacy=2.3.5
I am trying to build a codebase for my ACL paper, in which I used COMET.
I previously tested my code on 3.7, but because of an additional feature I downgraded to 3.6, and now the COMET results differ.
For example, for the input:
cereal crisp terrible dinner
3.6
['cereal', 'breakfast', 'drink milk', 'eat it', 'coffee']
3.7
['stomach ache', 'eat it', 'bad breath', 'vomit', 'you will feel bad']
In such situations reproducibility is an issue.
I set paperresults = True in both cases.
Is there any way to make this work?
I have finished training on ConceptNet. How can I generate outputs for my own test file? I replaced 'test.txt' with my own file, but the number of generations is fixed at 1200, the same as the number of positive examples in the original 'test.txt'. How can I make the number of generations match my own test set? Thank you.
Hi, when I run training I get:
(PyTorch) zys@zys:~/Documents/gits/comet-commonsense$ python src/main.py --experiment_type atomic --experiment_num 0
config/atomic/config_0.json
{'gpu_mode': 'T', 'gpu_index': 0, 'gpu_indices': [0, 1], 'multigpu': 'F', 'topk_size': 10, 'beam_size': 1, 'gen_seqlength': 40, 'eval_sampler': 'greedy', 'num_sequences': 1, 'generate_sequences': 1000, 'evaluate_sequences': 10000, 'random_seed': 123, 'optimizer': 'adam', 'batch_size': 64, 'learning_rate': 6.25e-05, 'clip': 1, 'loss': 'nll', 'weight_decay': 0, 'adam': {'b2': 0.999, 'b1': 0.9, 'e': 1e-08}, 'model': 'transformer', 'pretrain': 'gpt', 'hidden_dim': 768, 'num_layers': 12, 'num_heads': 12, 'embedding_dropout': 0.1, 'attention_dropout': 0.1, 'residual_dropout': 0.1, 'output_dropout': 0.1, 'activation': 'gelu', 'init': 'pt', 'trainer': 'iteration', 'iterations': 50000, 'cycle': 500, 'save_strategy': 'best', 'epochs': 20, 'toy': 'F', 'do_gen': 'F', 'save': 'T', 'test_save': 'F', 'dataset': 'atomic', 'categories': ['oEffect'], 'eval_categories': ['oEffect'], 'exp': 'generation', 'labels': 'individual', 'encoder_path': 'model/encoder_bpe_40000.json', 'bpe_path': 'model/vocab_40000.bpe', 'learning_rate_schedule': 'warmup_linear', 'learning_rate_warmup': 0.002, 'l2': 0.01, 'vector_l2': 'T'}
{'b2': 0.999, 'b1': 0.9, 'e': 1e-08}
DD{'net': DD{'model': 'transformer', 'nL': 12, 'nH': 12, 'hSize': 768, 'edpt': 0.1, 'adpt': 0.1, 'rdpt': 0.1, 'odpt': 0.1, 'pt': 'gpt', 'afn': 'gelu', 'init': 'pt'}, 'mle': 0, 'dataset': 'atomic', 'train': DD{'static': DD{'exp': 'generation', 'seed': 123, 'l2': 0.01, 'vl2': True, 'lrsched': 'warmup_linear', 'lrwarm': 0.002, 'clip': 1, 'loss': 'nll', 'b2': 0.999, 'b1': 0.9, 'e': 1e-08}, 'dynamic': DD{'lr': 6.25e-05, 'bs': 64, 'optim': 'adam'}}, 'model': 'transformer', 'exp': 'generation', 'data': DD{'categories': ['oEffect'], 'maxe1': 17, 'maxe2': 35, 'maxr': 1}, 'eval': DD{'bs': 1, 'smax': 40, 'sample': 'greedy', 'numseq': 1, 'gs': 1000, 'es': 10000, 'categories': ['oEffect']}, 'trainer': 'iteration', 'cycle': 500, 'iters': 50000}
Loading Data
Loading data from: data/atomic/processed/generation/categories_oEffect-maxe1_17-maxe2_35-maxr_1.pickle
61795
Done.
dict_keys(['data', 'sequences', 'masks', 'offsets', 'categories', 'opt', 'vocab_encoder', 'vocab_decoder', 'special_chars', 'max_event', 'max_effect', 'batch_size'])
Building Model
50
LOADING PRETRAINED TRANSFORMER
Loading weights...
Done.
Files will be logged at: results/losses/atomic-generation/iteration-500-50000/transformer/categories_oEffect-maxe1_17-maxe2_33-maxr_1/model_transformer-nL_12-nH_12-hSize_768-edpt_0.1-adpt_0.1-rdpt_0.1-odpt_0.1-pt_gpt-afn_gelu-init_pt-vSize_40532/exp_generation-seed_123-l2_0.01-vl2_T-lrsched_warmup_linear-lrwarm_0.002-clip_1-loss_nll-b2_0.999-b1_0.9-e_1e-08/bs_1-smax_40-sample_greedy-numseq_1-gs_1000-es_10000-categories_oEffect/6.25e-05_adam_64_0
Pushing to GPU: 0
Done.
Training
Logging Tensorboard Files at: logs/atomic-generation/iteration-500-50000/transformer/categories_oEffect-maxe1_17-maxe2_33-maxr_1/model_transformer-nL_12-nH_12-hSize_768-edpt_0.1-adpt_0.1-rdpt_0.1-odpt_0.1-pt_gpt-afn_gelu-init_pt-vSize_40532/exp_generation-seed_123-l2_0.01-vl2_T-lrsched_warmup_linear-lrwarm_0.002-clip_1-loss_nll-b2_0.999-b1_0.9-e_1e-08/bs_1-smax_40-sample_greedy-numseq_1-gs_1000-es_10000-categories_oEffect/6.25e-05_adam_64
0%| | 0/50000 [00:00<?, ?it/s]{'total_micro': [0], 'total_macro': [0], 'oEffect_micro': [0], 'oEffect_macro': [0]}
0%| | 1/50000 [00:00<10:46:40, 1.29it/s]Traceback (most recent call last):
File "src/main.py", line 16, in <module>
main(args.experiment_num)
File "/home/zys/Documents/gits/comet-commonsense/src/main_atomic.py", line 125, in main
trainer.run()
File "/home/zys/Documents/gits/comet-commonsense/src/train/train.py", line 194, in run
self.cycle(bar, cycle_num)
File "/home/zys/Documents/gits/comet-commonsense/src/train/train.py", line 212, in cycle
loss, nums, reset = self.do_forward_pass(nums)
File "/home/zys/Documents/gits/comet-commonsense/src/train/train.py", line 159, in do_forward_pass
self.batch_variables)
File "/home/zys/Documents/gits/comet-commonsense/src/train/atomic_train.py", line 34, in batch
outputs = batch.batch_atomic_generate(opt, *args)
File "/home/zys/Documents/gits/comet-commonsense/src/train/batch.py", line 37, in batch_atomic_generate
attention_mask[:, :-1], loss_reduction="none")
File "/home/zys/Documents/gits/comet-commonsense/src/train/batch.py", line 108, in mle_steps
attention_mask, i)
File "/home/zys/Documents/gits/comet-commonsense/src/train/batch.py", line 125, in decode
return model(input_, sequence_mask=attention_mask)
File "/home/zys/anaconda3/envs/PyTorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zys/Documents/gits/comet-commonsense/src/models/gpt.py", line 212, in forward
lm_logits = lm_logits + self.pos_emb_mask
RuntimeError: CUDA out of memory. Tried to allocate 486.00 MiB (GPU 0; 7.76 GiB total capacity; 6.32 GiB already allocated; 48.56 MiB free; 437.98 MiB cached)
0%| | 1/50000 [00:00<13:40:12, 1.02it/s]
Could you tell me the minimum GPU configuration required? I am using an NVIDIA 2060 and got these errors.
Just want to make sure: are we meant to increase the original setting values of 17 and 35 for ATOMIC in order to feed in a sentence longer than 18 words?
Or can we feed in a long sentence as is?
FileNotFoundError: [Errno 2] No such file or directory: 'comet/data/atomic/processed/generation/categories_oEffect#oReact#oWant#xAttr#xEffect#xIntent#xNeed#xReact#xWant-maxe1_17-maxe2_35-maxr_1.pickle'
scripts/setup/get_atomic_data.sh tries to download atomic_data.tgz from washington.edu, but the URL is not available.
The URL could probably be changed to https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz
I believe the following change is needed to make comet-commonsense compatible with the latest tensorboardX:
src/train/train.py
83 print("Logging Tensorboard Files at: {}".format(self.logger.log_dir))
83 print("Logging Tensorboard Files at: {}".format(self.logger.logdir))
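A version-tolerant way to read the directory could look like the minimal sketch below. It assumes only that the writer object exposes either a `logdir` or a `log_dir` attribute, as the two lines above suggest. The helper name `writer_log_dir` and the stand-in classes are hypothetical and not part of the repository or of tensorboardX.

```python
def writer_log_dir(writer):
    """Return the writer's log directory under either attribute name."""
    for name in ("logdir", "log_dir"):
        value = getattr(writer, name, None)
        if value is not None:
            return value
    raise AttributeError("writer exposes neither 'logdir' nor 'log_dir'")


# Stand-in objects so the sketch runs without tensorboardX installed.
class WriterWithLogdir:
    logdir = "logs/a"


class WriterWithLogDir:
    log_dir = "logs/b"


for writer in (WriterWithLogdir(), WriterWithLogDir()):
    print("Logging Tensorboard Files at: {}".format(writer_log_dir(writer)))
```

In train.py the same helper would be called with self.logger in place of the stand-in objects.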
Hi,
I wonder whether it is possible to use the trained COMET model but limit the vocabulary used for generation. In other words, can I define a customized subset of the current vocabulary and force the model to generate only from that subset?
I need this specifically for one of the nine available dimensions.
Thanks in advance,
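One common way to do this, sketched here independently of the repository's sampler code, is to mask the scores of all disallowed tokens before picking the next token. The function names, the toy logits, and the token ids below are illustrative assumptions; in practice the allowed ids would come from encoding your own word list with the model's vocabulary.

```python
import math


def restrict_logits(logits, allowed_ids):
    """Return a copy of logits with every disallowed token set to -inf."""
    allowed = set(allowed_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_pick(logits):
    """Index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy vocabulary of 5 tokens; token 3 scores highest but is not allowed.
logits = [0.1, 2.0, -1.0, 3.5, 0.7]
allowed_ids = [0, 1, 4]

masked = restrict_logits(logits, allowed_ids)
print(greedy_pick(logits))  # unrestricted choice
print(greedy_pick(masked))  # choice restricted to allowed_ids
```

If you try this, the end-of-sequence token probably needs to stay in the allowed set, otherwise generation cannot terminate before the maximum length.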
I'm having some difficulty reproducing the conceptnet accuracy scores, could you please point me in the right direction? These are the steps I took:
However, because I achieved an accuracy of 80.04%, which is much lower than the 95.25% reported in your Table, I'm sure I must be doing something wrong! The (short) code I used to get the scores is here: https://colab.research.google.com/drive/10oaX-_1qS75xrgbm67MDkRy_hOHEJA0s
Hello,
I have a question about the MASK tokens in the input examples (Figure 3 in the paper). What role do they play in training and prediction? Unfortunately, I could not find an explanation in the paper. What is the role of the mask tokens between the subject tokens and the relation tokens (ConceptNet) when only the object tokens need to be predicted?
Thank you very much in advance.
You said that you assume Python 3.6, so why do you use Python 2 style like this?
comet-commonsense/scripts/classify/demo_bilinear.py
line 21 print 'can not find corresponding vector total:', array[0].lower()
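For reference, `print` is a function in Python 3, so the quoted statement needs parentheses. A minimal sketch, where the sample `array` value is made up for illustration:

```python
array = ["Example_Word"]  # hypothetical stand-in for the script's data

# Python 3 form of the statement quoted above.
message = "can not find corresponding vector total:"
print(message, array[0].lower())
```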
Can I save this ATOMIC model to an h5 file or to language-model files (such as added_tokens.json, config.json, pytorch_model.bin, vocab.json)?
Dear authors,
I found that you set max_event and max_effect to 17 and 35 respectively. However, my input events are much longer than 17. Can I change these values according to my needs?
After scanning your code, I plan to change opt['data']["maxe1"] in main_atomic.py and self.max_event in atomic.py. Will that work?
I'm following the setup instructions but get_conceptnet_data.sh is returning errors:
bash scripts/setup/get_conceptnet_data.sh
--2020-05-17 11:44:00-- https://ttic.uchicago.edu/~kgimpel/comsense_resources/train100k.txt.gz
Resolving ttic.uchicago.edu (ttic.uchicago.edu)... 128.135.8.186
Connecting to ttic.uchicago.edu (ttic.uchicago.edu)|128.135.8.186|:443... connected.
OpenSSL: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol
Unable to establish SSL connection.
...
Visiting https://ttic.uchicago.edu/~kgimpel/comsense_resources/ also gives a 403 access forbidden error.
Dear author,
I followed the instructions in the README file. When I tried to make the data loaders for ConceptNet, I ran into this problem:
~/comet-commonsense$ python scripts/data/make_conceptnet_data_loader.py
100%███████████████████████████████████████████| 100000/100000 [04:06<00:00, 405.46it/s]
100%███████████████████████████████████████████████████| 2400/2400 [00:05<00:00, 408.77it/s]
100%|████████████████████████████████████| 2400/2400 [00:05<00:00, 426.74it/s]
28
16
16
dev
Traceback (most recent call last):
File "scripts/data/make_conceptnet_data_loader.py", line 66, in <module>
data_loader.make_tensors(text_encoder, special, test=False)
File "/home/caihanqing/comet-commonsense/src/data/conceptnet.py", line 196, in make_tensors
self.data[split]['total']) if not j[3]]))
RuntimeError: index out of range: Tried to access index 2306 out of table with 2305 rows. at /opt/conda/conda-bld/pytorch_1570910687230/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418
Can you give me some advice so that I can continue with training? @madaan @atcbosselut
I ran bash scripts/setup/get_model_files.sh, but it seems to stall when the git clone reaches 36% of the files.
Congratulations on your work! Which part of the code should I use to give a text file with some new words (begin nodes) as input and receive relations with newly generated words/concepts (end nodes)? Thank you.
The ATOMIC evaluation does not report a BLEU-2 score. Will you update the script to evaluate the generations with BLEU-2?
Can I run this project on the CPU? I don't have sufficient GPU resources.
Here are the details of the problem:
File "comet-commonsense/src/interactive/functions.py", line 124, in get_atomic_sequence
input_event, category, data_loader, text_encoder)
File "comet-commonsense/src/interactive/functions.py", line 158, in set_atomic_inputs
XMB[:, :len(prefix)] = torch.LongTensor(prefix)
RuntimeError: The expanded size of the tensor (18) must match the existing size (93) at non-singleton dimension 1. Target sizes: [1, 18]. Tensor sizes: [93]
Do you know how to deal with it? @madaan
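The error message indicates that the encoded input (93 tokens) is longer than the 18-column buffer it is copied into. A minimal sketch of guarding against this, using plain lists instead of tensors: `fit_prefix` and its `truncate` flag are hypothetical helpers, not functions from the repository, and truncation discards the end of the input.

```python
def fit_prefix(prefix, max_len, truncate=False):
    """Check that a list of token ids fits into a buffer of max_len columns.

    Raises ValueError with a readable message when the input is too long,
    unless truncate=True, in which case the extra tokens are dropped.
    """
    if len(prefix) <= max_len:
        return list(prefix)
    if truncate:
        return list(prefix[:max_len])
    raise ValueError(
        "input has {} tokens but the buffer holds only {}".format(
            len(prefix), max_len))


long_input = list(range(93))  # same length as in the error above
print(len(fit_prefix(long_input, 18, truncate=True)))
```

Whether truncating is acceptable depends on the input; shortening the event text itself may be the safer option.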
My command is python3 scripts/interactive/pretrain_atomic_txt.py --device 0 --model_file pretrained_models/atomic_pretrained_model.pickle
What I got:
in forward
h = self.transformer(x, sequence_mask)
File "tmp/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "tmp/comet-commonsense/src/models/gpt.py", line 184, in forward
e = self.embed(x)
File "lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "tmp/lib64/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "tmp/lib64/python3.6/site-packages/torch/nn/functional.py", line 1724, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of device type cuda but got device type cpu for argument #3 'index' in call to _th_index_select
Hello, Sir.
I have a question about this code.
I wonder why you do not exclude samples whose target answer is 'none'.
Your code seems to include samples such as "PersonX loses personX's sight xNeed none " in the training and test data.
I think learning to generate 'none' does not seem meaningful.
I found the same thing in COMET-ATOMIC 2020.
Is there a specific reason?
Thanks.
I want to know how to use multiple GPUs. Do I need to implement a multi-GPU version myself?
Hello!
Thank you for sharing your work! I want to generate xIntent for some sequences. Is there a way that I can give an input file and have the output be stored in another file?
Thanks for your time!
Frank
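A batch workflow like this can be sketched as a small wrapper that reads one event per line and writes one JSON record per line. Everything here is an assumption for illustration: `generate_fn` is a placeholder for whatever call produces COMET outputs for an event and a category, `placeholder_generate` returns dummy text rather than model output, and the file names are made up.

```python
import json
import os
import tempfile


def generate_to_file(input_path, output_path, generate_fn, category="xIntent"):
    """Read events line by line and write one JSON result per line."""
    count = 0
    with open(input_path, encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as dst:
        for line in src:
            event = line.strip()
            if not event:
                continue  # skip blank lines
            record = {
                "event": event,
                "category": category,
                "generations": generate_fn(event, category),
            }
            dst.write(json.dumps(record) + "\n")
            count += 1
    return count


def placeholder_generate(event, category):
    """Placeholder standing in for the real model call."""
    return ["<{} of: {}>".format(category, event)]


# Demo in a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, "events.txt")
    out_path = os.path.join(tmp, "generations.jsonl")
    with open(in_path, "w", encoding="utf-8") as f:
        f.write("PersonX goes to the mall\n\nPersonX studies English\n")
    written = generate_to_file(in_path, out_path, placeholder_generate)
    print(written, "records written")
```

To use it with the real model, `placeholder_generate` would be replaced by a function that wraps the repository's interactive generation code.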
It seems that the beam sampler does something odd when computing losses, at the following line in src/evaluate/sampler.py:
beam_lls, top_beam_idxs = (hyp_beam_lls / temp_counts).topk(self.opt.eval.bs)
Inside the loop, the scores are divided by temp_counts again and again, which makes the losses of tokens at the beginning less and less important.
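To make the reported effect concrete, here is a toy sketch that does not use the repository's sampler. It compares dividing a running score by the step count at every step (the behaviour described above, taken on the report's word) with keeping an unnormalized running sum and dividing once for ranking. The function names and the log-likelihood values are illustrative assumptions.

```python
def repeatedly_divided(token_lls):
    """Divide the running score by the step count at every step."""
    score = 0.0
    for count, ll in enumerate(token_lls, start=1):
        score = (score + ll) / count
    return score


def length_normalized(token_lls):
    """Keep the raw sum and divide once by the sequence length."""
    return sum(token_lls) / len(token_lls)


# The same token scores in two orders: bad token first vs. bad token last.
bad_start = [-5.0, -0.1, -0.1, -0.1]
bad_end = [-0.1, -0.1, -0.1, -5.0]

print(repeatedly_divided(bad_start), repeatedly_divided(bad_end))
print(length_normalized(bad_start), length_normalized(bad_end))
```

Under repeated division the sequence with the bad first token ends up with a much better score than the one with the bad last token, while a single length normalization scores both orders the same.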
Dear authors,
Thanks for the great work. I just have one small question about your paper. How is the model selected, especially for the ConceptNet task? Is it selected based on its performance on the dev set, or did you simply take the final model after training?
Thank you so much for your help in advance.
In Figure 2(c), the model takes words 0:n-1 as input to predict word n. Or the model takes s and r as input and predicts o.
But in Figure 3, the model takes the whole tuple s r o as input.