
atcbosselut / comet-commonsense


Code for ACL 2019 Paper: "COMET: Commonsense Transformers for Automatic Knowledge Graph Construction" https://arxiv.org/abs/1906.05317

License: Apache License 2.0

Python 99.51% Shell 0.49%

comet-commonsense's People

Contributors

atcbosselut, gruentee, madaan, panaceai


comet-commonsense's Issues

Are there some research papers about text-to-set generation?

I know this question is a little off topic, but it would be helpful to me. Thank you.

Text-to-(word)set generation or sequence-to-(token)set generation.

For example, input a text and then output the tags for this text:

'Peter is studying English' --> {'good behavior','person','doing something'}

Thank you!

Can your code support multi-GPU training?

Dear authors,

Due to the GPU memory limitations of my devices, I would like to train the model on multiple GPUs. After scanning your code, I modified config/atomic/config_2.json (my experiment_num is 2) and config/default.json to enable multi-GPU training, as shown below (a sketch of the assumed config keys follows this list):

  • config/atomic/config_2.json:
    (screenshot not shown)

  • config/default.json:
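A minimal sketch of the kind of change being described, assuming the flat option names used in the repo's configs ('multigpu', 'gpu_indices', 'gpu_index'); whether the training loop actually parallelizes over the listed devices is not confirmed here:

import json

config_path = "config/atomic/config_2.json"  # experiment_num 2, as above

with open(config_path) as f:
    cfg = json.load(f)

# Assumed key names, mirroring the option names that the repo prints at startup.
cfg["multigpu"] = "T"
cfg["gpu_indices"] = [0, 1]
cfg["gpu_index"] = 0

with open(config_path, "w") as f:
    json.dump(cfg, f, indent=2)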

Task: Classifying Generated Tuples

Hi, I want to implement the task ‘Classifying Generated Tuples’ and have the following questions:

  1. Can the experiment be run on Windows?

  2. Which version of PyTorch is used for this task? The official PyTorch website does not provide a Windows build corresponding to Python 2.7.

  3. Running the command ‘bash scripts/classify/classify.sh’ produces no output.

Is the model a transformer encoder-decoder or just a decoder?

Hi,

Thank you for the very nice and interesting work.
I have a question regarding the model. In the paper it is mentioned that you used the same architecture as GPT, which is a transformer decoder. However, you also talk about an input encoder and how you configure the input to the encoder.
I was a bit confused: do you have both an encoder and a decoder, or is it a language model, in which case it would be just a decoder? And if it is a decoder, how do you encode your [X_s, X_r]?
Could you please clarify this? Many thanks.

Can this model give a knowledge graph according to the input sentence?

Can this model produce a knowledge graph from an input sentence such as "One absolutely cardinal reason is the fact that universities offer an opportunity to intensify the knowledge in a particular field of interest."?
This may be a long sentence; I want to ask whether the model has such a function.

Environment install problems

In the README.md, the TensorFlow and spaCy install instructions need to change. pip install tensorflow downloads a different version than the one needed; it works better if you install it with conda. Also, spaCy 3.0, which is now installed by default, doesn't work with your data loader. Proposed changes:

pip install tensorflow --> conda install tensorflow
conda install -c conda-forge spacy --> conda install -c conda-forge spacy=2.3.5

Results differ between Python versions

I am trying to construct a codebase for my ACL paper in which I used COMET.
I previously tested my code on Python 3.7, but due to an additional feature I downgraded to 3.6, and with that the COMET results differ.

For example, for the input:
cereal crisp terrible dinner

3.6
['cereal', 'breakfast', 'drink milk', 'eat it', 'coffee']

3.7
['stomach ache', 'eat it', 'bad breath', 'vomit', 'you will feel bad']

In such situations reproducibility is an issue.
I set paperresults = True for both cases.

Is there any way this can be made to work?
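A minimal reproducibility sketch, assuming the divergence comes from unseeded or differently seeded RNG state rather than from the Python version itself; these are generic PyTorch/NumPy seeding calls, not a documented option of the repo beyond its random_seed config value:

import random

import numpy as np
import torch

def set_seed(seed: int = 123) -> None:
    # Seed every RNG that can influence sampling-based generation.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

set_seed(123)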

How can we generate from our own test file for ConceptNet?

I have completed training on ConceptNet. However, how can I generate on my own test file? I converted my test file into 'test.txt', but the number of generations is fixed at 1200, the same as the number of positive examples in the original 'test.txt'. How can I make the generation length match my own test set? Thank you.

RuntimeError: CUDA out of memory.

Hi, when I execute the training I get:

(PyTorch) zys@zys:~/Documents/gits/comet-commonsense$ python src/main.py --experiment_type atomic --experiment_num 0
config/atomic/config_0.json
{'gpu_mode': 'T', 'gpu_index': 0, 'gpu_indices': [0, 1], 'multigpu': 'F', 'topk_size': 10, 'beam_size': 1, 'gen_seqlength': 40, 'eval_sampler': 'greedy', 'num_sequences': 1, 'generate_sequences': 1000, 'evaluate_sequences': 10000, 'random_seed': 123, 'optimizer': 'adam', 'batch_size': 64, 'learning_rate': 6.25e-05, 'clip': 1, 'loss': 'nll', 'weight_decay': 0, 'adam': {'b2': 0.999, 'b1': 0.9, 'e': 1e-08}, 'model': 'transformer', 'pretrain': 'gpt', 'hidden_dim': 768, 'num_layers': 12, 'num_heads': 12, 'embedding_dropout': 0.1, 'attention_dropout': 0.1, 'residual_dropout': 0.1, 'output_dropout': 0.1, 'activation': 'gelu', 'init': 'pt', 'trainer': 'iteration', 'iterations': 50000, 'cycle': 500, 'save_strategy': 'best', 'epochs': 20, 'toy': 'F', 'do_gen': 'F', 'save': 'T', 'test_save': 'F', 'dataset': 'atomic', 'categories': ['oEffect'], 'eval_categories': ['oEffect'], 'exp': 'generation', 'labels': 'individual', 'encoder_path': 'model/encoder_bpe_40000.json', 'bpe_path': 'model/vocab_40000.bpe', 'learning_rate_schedule': 'warmup_linear', 'learning_rate_warmup': 0.002, 'l2': 0.01, 'vector_l2': 'T'}
{'b2': 0.999, 'b1': 0.9, 'e': 1e-08}
DD{'net': DD{'model': 'transformer', 'nL': 12, 'nH': 12, 'hSize': 768, 'edpt': 0.1, 'adpt': 0.1, 'rdpt': 0.1, 'odpt': 0.1, 'pt': 'gpt', 'afn': 'gelu', 'init': 'pt'}, 'mle': 0, 'dataset': 'atomic', 'train': DD{'static': DD{'exp': 'generation', 'seed': 123, 'l2': 0.01, 'vl2': True, 'lrsched': 'warmup_linear', 'lrwarm': 0.002, 'clip': 1, 'loss': 'nll', 'b2': 0.999, 'b1': 0.9, 'e': 1e-08}, 'dynamic': DD{'lr': 6.25e-05, 'bs': 64, 'optim': 'adam'}}, 'model': 'transformer', 'exp': 'generation', 'data': DD{'categories': ['oEffect'], 'maxe1': 17, 'maxe2': 35, 'maxr': 1}, 'eval': DD{'bs': 1, 'smax': 40, 'sample': 'greedy', 'numseq': 1, 'gs': 1000, 'es': 10000, 'categories': ['oEffect']}, 'trainer': 'iteration', 'cycle': 500, 'iters': 50000}
Loading Data
Loading data from: data/atomic/processed/generation/categories_oEffect-maxe1_17-maxe2_35-maxr_1.pickle
61795
Done.
dict_keys(['data', 'sequences', 'masks', 'offsets', 'categories', 'opt', 'vocab_encoder', 'vocab_decoder', 'special_chars', 'max_event', 'max_effect', 'batch_size'])
Building Model
50
LOADING PRETRAINED TRANSFORMER
Loading weights...
Done.
Files will be logged at: results/losses/atomic-generation/iteration-500-50000/transformer/categories_oEffect-maxe1_17-maxe2_33-maxr_1/model_transformer-nL_12-nH_12-hSize_768-edpt_0.1-adpt_0.1-rdpt_0.1-odpt_0.1-pt_gpt-afn_gelu-init_pt-vSize_40532/exp_generation-seed_123-l2_0.01-vl2_T-lrsched_warmup_linear-lrwarm_0.002-clip_1-loss_nll-b2_0.999-b1_0.9-e_1e-08/bs_1-smax_40-sample_greedy-numseq_1-gs_1000-es_10000-categories_oEffect/6.25e-05_adam_64_0
Pushing to GPU: 0
Done.
Training
Logging Tensorboard Files at: logs/atomic-generation/iteration-500-50000/transformer/categories_oEffect-maxe1_17-maxe2_33-maxr_1/model_transformer-nL_12-nH_12-hSize_768-edpt_0.1-adpt_0.1-rdpt_0.1-odpt_0.1-pt_gpt-afn_gelu-init_pt-vSize_40532/exp_generation-seed_123-l2_0.01-vl2_T-lrsched_warmup_linear-lrwarm_0.002-clip_1-loss_nll-b2_0.999-b1_0.9-e_1e-08/bs_1-smax_40-sample_greedy-numseq_1-gs_1000-es_10000-categories_oEffect/6.25e-05_adam_64
0%| | 0/50000 [00:00<?, ?it/s]{'total_micro': [0], 'total_macro': [0], 'oEffect_micro': [0], 'oEffect_macro': [0]}
0%| | 1/50000 [00:00<10:46:40, 1.29it/s]Traceback (most recent call last):
File "src/main.py", line 16, in
main(args.experiment_num)
File "/home/zys/Documents/gits/comet-commonsense/src/main_atomic.py", line 125, in main
trainer.run()
File "/home/zys/Documents/gits/comet-commonsense/src/train/train.py", line 194, in run
self.cycle(bar, cycle_num)
File "/home/zys/Documents/gits/comet-commonsense/src/train/train.py", line 212, in cycle
loss, nums, reset = self.do_forward_pass(nums)
File "/home/zys/Documents/gits/comet-commonsense/src/train/train.py", line 159, in do_forward_pass
self.batch_variables)
File "/home/zys/Documents/gits/comet-commonsense/src/train/atomic_train.py", line 34, in batch
outputs = batch.batch_atomic_generate(opt, *args)
File "/home/zys/Documents/gits/comet-commonsense/src/train/batch.py", line 37, in batch_atomic_generate
attention_mask[:, :-1], loss_reduction="none")
File "/home/zys/Documents/gits/comet-commonsense/src/train/batch.py", line 108, in mle_steps
attention_mask, i)
File "/home/zys/Documents/gits/comet-commonsense/src/train/batch.py", line 125, in decode
return model(input_, sequence_mask=attention_mask)
File "/home/zys/anaconda3/envs/PyTorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/zys/Documents/gits/comet-commonsense/src/models/gpt.py", line 212, in forward
lm_logits = lm_logits + self.pos_emb_mask
RuntimeError: CUDA out of memory. Tried to allocate 486.00 MiB (GPU 0; 7.76 GiB total capacity; 6.32 GiB already allocated; 48.56 MiB free; 437.98 MiB cached)
0%| | 1/50000 [00:00<13:40:12, 1.02it/s]

Could you tell me the minimum GPU configuration required? I used an NVIDIA RTX 2060 and got these errors.
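A hedged mitigation sketch, not an official recommendation: the options printed above show 'batch_size': 64, and lowering that value is usually the most direct way to fit training into 8 GB of GPU memory. The key name is copied from the printed options; the edit itself is hypothetical:

import json

path = "config/atomic/config_0.json"  # the config used in the run above
with open(path) as f:
    cfg = json.load(f)

cfg["batch_size"] = 16  # reduce from 64 until training fits in memory

with open(path, "w") as f:
    json.dump(cfg, f, indent=2)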

Feeding an input event longer than 18 tokens

Just want to make sure: are we meant to change the original ATOMIC setting values of 17 and 35 to larger values in order to feed a sentence longer than 18 words?

Or can we feed a long sentence as is?

URL of ATOMIC dataset is not available

scripts/setup/get_atomic_data.sh tries to download atomic_data.tgz from washington.edu but the URL is not available.
Probably the URL could be changed to https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz.

tensorboardX logdir name change

I believe the following change is needed to make comet-commonsense compatible with the latest tensorboardX:

src/train/train.py, line 83:

current:  print("Logging Tensorboard Files at: {}".format(self.logger.log_dir))
proposed: print("Logging Tensorboard Files at: {}".format(self.logger.logdir))

Is it possible to limit the vocabulary used for generation?

Hi,

I wonder if it is possible to use the trained COMET model but limit the vocabulary it generates from. In other words, can I define a custom subset of the current vocabulary and force the model to generate only from that subset? (A generic sketch follows below.)
I need this specifically for one of the nine available dimensions.

Thanks in advance,
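A minimal sketch of one generic way to do this, assuming you can intercept the logits before the sampler picks a token; this is ordinary decoder-side logit masking, not an existing option in the repo, and restrict_logits and allowed_token_ids are hypothetical names:

import torch

def restrict_logits(lm_logits: torch.Tensor, allowed_token_ids: list) -> torch.Tensor:
    # Push every token outside the allowed subset to -inf so that greedy, beam,
    # or top-k sampling can only choose tokens from the custom vocabulary.
    mask = torch.full_like(lm_logits, float("-inf"))
    mask[..., allowed_token_ids] = 0.0
    return lm_logits + mask

Applied inside the sampler right before the argmax/topk call, this would constrain every decoding step; the allowed id list would have to be built from the repo's own vocab_encoder.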

Difficulty reproducing ConceptNet scores

I'm having some difficulty reproducing the ConceptNet accuracy scores; could you please point me in the right direction? These are the steps I took:

  1. Download pretrained COMET model from https://drive.google.com/open?id=1FccEsYPUHnjzmX-Y5vjCBeyRt1pLo8FB
  2. Run generations with greedy decoding on conceptnet test set
  3. Use Bilinear AVG model by Li et al. (2016) to score generated tuples and threshold by 0.5 to get accuracy

However, because I achieved an accuracy of 80.04%, which is much lower than the 95.25% reported in your Table, I'm sure I must be doing something wrong! The (short) code I used to get the scores is here: https://colab.research.google.com/drive/10oaX-_1qS75xrgbm67MDkRy_hOHEJA0s

MASK tokens

Hello,

I have a question about the MASK tokens in the input examples (Figure 3 in the paper). What role do they play in training and prediction? Unfortunately, I could not find an explanation in the paper. What is the role of the mask tokens between the subject tokens and the relation tokens (ConceptNet) when only the object tokens need to be predicted?
Thank you very much in advance.
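A hedged illustration of how such padding is typically laid out, inferred from the fixed maxe1/maxr sizes used elsewhere in the repo rather than from a confirmed description of Figure 3; the token name and function below are hypothetical:

# Each field is padded to a fixed width with a special mask token so that the
# relation and object tokens always start at the same positions; the model is
# then trained to predict only the object tokens.
MASK = "<mask>"

def build_input(subject_toks, relation_toks, object_toks, max_e1=17, max_r=1):
    padded_subject = subject_toks + [MASK] * (max_e1 - len(subject_toks))
    padded_relation = relation_toks + [MASK] * (max_r - len(relation_toks))
    return padded_subject + padded_relation + object_toks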

Python 2 or Python 3?

You said that you assume I have Python 3.6, so why does the code use Python 2 style like this?
comet-commonsense/scripts/classify/demo_bilinear.py
line 21: print 'can not find corresponding vector total:', array[0].lower()

Can I use a longer input?

Dear authors,

I found that you set max_event and max_effect to 17 and 35 respectively. However, my input events are much longer than 17 tokens; can I change this according to my needs?

After scanning your code, I plan to change opt['data']["maxe1"] in main_atomic.py and self.max_event in atomic.py; will that work?

get_conceptnet_data.sh gives access denied error

I'm following the setup instructions but get_conceptnet_data.sh is returning errors:

bash scripts/setup/get_conceptnet_data.sh
--2020-05-17 11:44:00-- https://ttic.uchicago.edu/~kgimpel/comsense_resources/train100k.txt.gz
Resolving ttic.uchicago.edu (ttic.uchicago.edu)... 128.135.8.186
Connecting to ttic.uchicago.edu (ttic.uchicago.edu)|128.135.8.186|:443... connected.
OpenSSL: error:1425F102:SSL routines:ssl_choose_client_version:unsupported protocol
Unable to establish SSL connection.
...

Visiting https://ttic.uchicago.edu/~kgimpel/comsense_resources/ also gives a 403 access forbidden error.

Question about Making the Data loaders

Dear author:
I followed the instructions in the README file. When I tried to make the ConceptNet data loaders, I ran into the following problem:
~/comet-commonsense$ python scripts/data/make_conceptnet_data_loader.py
100%███████████████████████████████████████████| 100000/100000 [04:06<00:00, 405.46it/s]
100%███████████████████████████████████████████████████| 2400/2400 [00:05<00:00, 408.77it/s]
100%|████████████████████████████████████| 2400/2400 [00:05<00:00, 426.74it/s]
28
16
16
dev
Traceback (most recent call last):
File "scripts/data/make_conceptnet_data_loader.py", line 66, in
data_loader.make_tensors(text_encoder, special, test=False)
File "/home/caihanqing/comet-commonsense/src/data/conceptnet.py", line 196, in make_tensors
self.data[split]['total']) if not j[3]]))
RuntimeError: index out of range: Tried to access index 2306 out of table with 2305 rows. at /opt/conda/conda-bld/pytorch_1570910687230/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418
Can you give me some advice for my further training? @madaan @atcbosselut

Testing COMET on new ConceptNet-like words

Congratulations on your work! What part of the code should I use in order to give as input a text file with some new words (begin nodes) and receive relations with generated new words/concepts (end nodes)? Thank you.

Atomic Evaluation

For ATOMIC evaluation, the evaluation script does not report a BLEU-2 score. Will you update the script to score the generations with BLEU-2?

Question about a runtime error

Here are the details of the problem:
File "comet-commonsense/src/interactive/functions.py", line 124, in get_atomic_sequence
input_event, category, data_loader, text_encoder)
File "comet-commonsense/src/interactive/functions.py", line 158, in set_atomic_inputs
XMB[:, :len(prefix)] = torch.LongTensor(prefix)
RuntimeError: The expanded size of the tensor (18) must match the existing size (93) at non-singleton dimension 1. Target sizes: [1, 18]. Tensor sizes: [93]
Do you know how to deal with it? @madaan
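A hedged reading of the error, not a confirmed fix: the encoded input event is 93 tokens long, while the preallocated input tensor only has room for 18, so set_atomic_inputs cannot copy the prefix in. One illustrative workaround is to truncate the prefix before copying; the helper below is hypothetical and only mirrors the names in the traceback:

import torch

def clip_prefix_to_window(prefix, XMB):
    # Keep only as many prefix tokens as the fixed-size input tensor can hold.
    return prefix[: XMB.size(1)]

# Hypothetical use inside set_atomic_inputs:
#   prefix = clip_prefix_to_window(prefix, XMB)
#   XMB[:, :len(prefix)] = torch.LongTensor(prefix)

Truncation loses part of the event, so shortening the input text (or raising the maximum event length and retraining, as discussed in other issues) may be preferable.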

Having trouble with interactive-mode generation when using a GPU

My command is python3 scripts/interactive/pretrain_atomic_txt.py --device 0 --model_file pretrained_models/atomic_pretrained_model.pickle

What I got:
in forward
h = self.transformer(x, sequence_mask)
File "tmp/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "tmp/comet-commonsense/src/models/gpt.py", line 184, in forward
e = self.embed(x)
File "lib64/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "tmp/lib64/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "tmp/lib64/python3.6/site-packages/torch/nn/functional.py", line 1724, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of device type cuda but got device type cpu for argument #3 'index' in call to _th_index_select
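A hedged interpretation, not a verified fix: the model weights are on the GPU (--device 0) while the encoded input tensor is still on the CPU when it reaches the embedding layer. A generic PyTorch sketch of the usual remedy, with illustrative names that are not the repo's own:

import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

def to_model_device(input_ids: torch.Tensor) -> torch.Tensor:
    # The embedding lookup needs its index tensor on the same device as the weights.
    return input_ids.to(device)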

Why not exclude "none" answers in the ATOMIC dataset?

Hello, Sir.

I have something to ask you about this code.

I wonder why you do not exclude the 'none' target answers.

Your code seems to include data samples like "PersonX loses personX's sight xNeed none" as training or test data.

I think learning to generate 'none' does not seem meaningful.

And I found the same thing in COMET-ATOMIC 2020.

Is there any specific reason?

Thanks.

How can I use multiple GPUs?

I want to know how to use multiple GPUs, and whether I need to implement a multi-GPU version myself.

Potential problems with BeamSampler in src/evaluate/sampler.py

It seems that the beam sampler is doing something odd when computing losses in the following line of src/evaluate/sampler.py:

beam_lls, top_beam_idxs = (hyp_beam_lls / temp_counts).topk(self.opt.eval.bs)

Inside the loop, temp_counts is applied again and again, making the losses for tokens at the beginning less and less important.
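A hedged illustration of the concern, not a claim about what the repo actually does: length normalization is normally applied once per ranking step to the accumulated log-likelihoods, so re-dividing an already normalized score on every iteration would progressively down-weight early tokens. A generic sketch of the usual pattern, with illustrative names:

import torch

def rank_beams(cumulative_lls: torch.Tensor, lengths: torch.Tensor, beam_size: int):
    # Normalize the accumulated log-likelihood by the current hypothesis length
    # exactly once per ranking step, then take the top-scoring beams.
    scores = cumulative_lls / lengths
    return scores.topk(beam_size)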

A question regarding the model selection

Dear authors,

Thanks for the great work. I just have one small question about your paper: how is the model selected, especially for the ConceptNet task? Is it selected based on its performance on the dev set, or did you simply take the final model after training?

Thank you so much for your help in advance.
