nmn-drop's Introduction

Neural Module Networks for Reasoning over Text

This is the official code for the ICLR 2020 paper, Neural Module Networks for Reasoning over Text. This repository contains the code for replicating our experiments and can be used to extend our model as you wish.

Live Demo: https://demo.allennlp.org/ -- Go to 'Reading Comprehension' and select 'NMN (trained on DROP)' in the Model section.

Resources

  1. Download the data and a trained model checkpoint from here. Unzip the downloaded contents and place the resulting directory iclr_cameraready in a convenient location, henceforth referred to as MODEL_CKPT_PATH.

  2. Clone the allennlp-semparse repository from here to a convenient location, henceforth referred to as PATH_TO_allennlp-semparse. Check out the specific commit this code is built on with git checkout 937d594. This pinning will become unnecessary once allennlp-semparse is pip-installable.

Installation

The code is written in Python using AllenNLP and allennlp-semparse.

The following commands create a miniconda environment, install allennlp, and create symlinks for allennlp-semparse and the downloaded resources.

# Make conda environment
conda create --name nmn-drop python=3.6
conda activate nmn-drop

# Install required packages
pip install allennlp==0.9
pip install dateparser==0.7.2
python -m spacy download en_core_web_lg

# Clone code and make symlinks
git clone [email protected]:nitishgupta/nmn-drop.git
cd nmn-drop/
mkdir resources; cd resources; ln -s MODEL_CKPT_PATH/iclr_cameraready ./; cd ..
ln -s PATH_TO_allennlp-semparse/allennlp-semparse/allennlp_semparse/ ./
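
With the symlinks in place, a quick sanity check (a minimal sketch; run it from the nmn-drop/ root so the symlinked packages resolve) is that the three packages import cleanly:

import allennlp            # expect version 0.9
import allennlp_semparse   # resolved via the symlink created above
import semqa               # this repository's model code
print("imports OK")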

Prediction

To make predictions on your own data, format it in JSON Lines format -- a file input.jsonl where each line is a valid JSON object containing the keys "question" and "passage".
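
For example, a minimal sketch for producing such a file (the passage text here is a placeholder):

import json

examples = [
    {"question": "How many yards was the shortest touchdown pass?",
     "passage": "..."},  # "..." is a placeholder; put your passage text here
]

with open("input.jsonl", "w") as f:
    for ex in examples:
        # One JSON object per line, with "question" and "passage" keys
        f.write(json.dumps(ex) + "\n")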

Run the command

allennlp predict \
    --output-file output.jsonl \
    --predictor drop_demo_predictor \
    --include-package semqa \
    --silent \
    --batch-size 1 \
    resources/iclr_cameraready/ckpt/model.tar.gz \
    input.jsonl

The output file output.jsonl contains the answer under an additional key "answer".
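
A minimal sketch for reading the answers back out of output.jsonl (the input "question" is echoed in each output line):

import json

with open("output.jsonl") as f:
    for line in f:
        prediction = json.loads(line)
        print(prediction["question"], "->", prediction["answer"])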

Evaluation

To evaluate the model on the dev set, run the command -- bash scripts/iclr/evaluate.sh

The model checkpoint and data paths in the script can be modified to evaluate a different model or a different dataset.

Visualization

To generate a text-based visualization of the model's predictions on the development data, run the command -- bash scripts/iclr/predict.sh

A file drop_mydev_verbosepred.txt is written to MODEL_CKPT_PATH/iclr_cameraready/ckpt/predictions containing this visualization.

An interactive demo of our model is available at the link at the top of this README.

Training

The resources above include a trained model checkpoint and the subset of the DROP data used in the ICLR 2020 paper.

If you would like to re-train the model on this data, run the command -- bash scripts/iclr/train.sh.

The model checkpoint will be saved at MODEL_CKPT_PATH/iclr_cameraready/my_ckpt.

Note that this code requires the DROP data to be preprocessed with additional information such as tokenization, numbers, and dates. To train a model on a different subset of the DROP data, this preprocessing can be performed by running the python script datasets/drop/preprocess/tokenize.py on any DROP-formatted json file.
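
For reference, a minimal sketch of a DROP-formatted file, following the publicly documented DROP layout (all identifiers and values below are illustrative placeholders; the exact fields the preprocessing script consumes have not been confirmed against it):

import json

# Illustrative DROP layout: passage_id -> {"passage": ..., "qa_pairs": [...]}.
drop_data = {
    "example_passage_id": {
        "passage": "The Falcons scored on a 30-yard touchdown run ...",
        "qa_pairs": [
            {
                "query_id": "example_query_id",
                "question": "How many yards was the touchdown run?",
                # A DROP answer is a number, a date, or a list of spans.
                "answer": {
                    "number": "30",
                    "date": {"day": "", "month": "", "year": ""},
                    "spans": [],
                },
            }
        ],
    }
}

with open("my_drop_subset.json", "w") as f:
    json.dump(drop_data, f)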

References

Please consider citing our work if you find this code or our paper beneficial to your research.

@inproceedings{nmn:iclr20,
  author = {Nitish Gupta and Kevin Lin and Dan Roth and Sameer Singh and Matt Gardner},
  title = {Neural Module Networks for Reasoning over Text},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year = {2020}
}

Contributions and Contact

This code was developed by Nitish Gupta, contact [email protected].

If you'd like to contribute code, feel free to open a pull request. If you find an issue with the code or require additional support, please open an issue.

nmn-drop's Issues

ImportError: cannot import name 'SpacyTokenizer'

Hi,

I followed the instructions and got the following error:
2020-03-11 17:10:55,134 - INFO - pytorch_pretrained_bert.modeling - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2020-03-11 17:10:55,473 - INFO - pytorch_transformers.modeling_bert - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2020-03-11 17:10:55,476 - INFO - pytorch_transformers.modeling_xlnet - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2020-03-11 17:10:55,637 - INFO - allennlp.common.registrable - instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2020-03-11 17:10:55,638 - INFO - allennlp.common.registrable - instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2020-03-11 17:10:55,638 - INFO - allennlp.common.registrable - instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
2020-03-11 17:10:55,639 - INFO - allennlp.common.registrable - instantiating registered subclass relu of <class 'allennlp.nn.activations.Activation'>
Traceback (most recent call last):
  File "/home/detuvoldo/anaconda3/envs/nmn/bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "/home/detuvoldo/anaconda3/envs/nmn/lib/python3.6/site-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/home/detuvoldo/anaconda3/envs/nmn/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 101, in main
    import_submodules(package_name)
  File "/home/detuvoldo/anaconda3/envs/nmn/lib/python3.6/site-packages/allennlp/common/util.py", line 323, in import_submodules
    module = importlib.import_module(package_name)
  File "/home/detuvoldo/anaconda3/envs/nmn/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "./semqa/__init__.py", line 1, in <module>
    import semqa.state_machines
  File "./semqa/state_machines/__init__.py", line 1, in <module>
    from semqa.state_machines.constrained_beam_search import FirstStepConstrainedBeamSearch
  File "./semqa/state_machines/constrained_beam_search.py", line 6, in <module>
    from allennlp_semparse.state_machines.states import State
  File "./allennlp_semparse/state_machines/__init__.py", line 26, in <module>
    from allennlp_semparse.state_machines.beam_search import BeamSearch
  File "./allennlp_semparse/state_machines/beam_search.py", line 9, in <module>
    from allennlp_semparse.state_machines.states import State
  File "./allennlp_semparse/state_machines/states/__init__.py", line 14, in <module>
    from allennlp_semparse.state_machines.states.coverage_state import CoverageState
  File "./allennlp_semparse/state_machines/states/coverage_state.py", line 7, in <module>
    from allennlp_semparse.fields.production_rule_field import ProductionRule
  File "./allennlp_semparse/fields/__init__.py", line 1, in <module>
    from allennlp_semparse.fields.knowledge_graph_field import KnowledgeGraphField
  File "./allennlp_semparse/fields/knowledge_graph_field.py", line 14, in <module>
    from allennlp.data.tokenizers import Token, Tokenizer, SpacyTokenizer
ImportError: cannot import name 'SpacyTokenizer'

It seems that the error comes from allennlp_semparse, which I cloned from the provided link. How can I fix it?

Different results on demo and trained model checkpoint

I have followed the instructions to run the checkpoint model and was able to run it successfully. However, I get different answers from the demo and from the trained model checkpoint.

Demo:

  • Passage: The family of a cancer victim that had won a $4.4 million judgment against Laboratory Corp. of America over a botched Pap smear test settled with the company Monday just before the parties were set to start a new trial on noneconomic damages in Florida federal court.
  • Question: Who was the injured?
  • Answer: cancer victim

Trained Model Checkpoint:

  • input.jsonl :
    {"passage": "The family of a cancer victim that had won a $4.4 million judgment against Laboratory Corp. of America over a botched Pap smear test settled with the company Monday just before the parties were set to start a new trial on noneconomic damages in Florida federal court.", "question":"Who was the injured?"}
  • output.jsonl:
    {"question": "Who was the injured?", "passage": "The family of a cancer victim that had won a $4.4 million judgment against Laboratory Corp. of America over a botched Pap smear test settled with the company Monday just before the parties were set to start a new trial on noneconomic damages in Florida federal court.", "predicted_ans": "against Laboratory Corp.", "answer": "against Laboratory Corp.", "inputs": [{"name": "question", "tokens": ["Who", "was", "the", "injured", "?"]}, {"name": "passage", "tokens": ["The", "family", "of", "a", "cancer", "victim", "that", "had", "won", "a", "$", "4.4", "million", "judgment", "against", "Laboratory", "Corp.", "of", "America", "over", "a", "botched", "Pap", "smear", "test", "settled", "with", "the", "company", "Monday", "just", "before", "the", "parties", "were", "set", "to", "start", "a", "new", "trial", "on", "noneconomic", "damages", "in", "Florida", "federal", "court", "."]}, {"name": "numbers", "tokens": ["4.4", "0", "100.0"]}, {"name": "dates", "tokens": ["-1/-1/-1"]}, {"name": "composed_numbers", "tokens": ["0.0", "8.8", "95.6", "104.4", "200.0"]}, {"name": "year_diffs", "tokens": ["0"]}, {"name": "count", "tokens": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]}], "program_nested_expression": [{"name": "span", "identifier": 4}, [{"name": "relocate", "identifier": 3}, [{"name": "filter", "identifier": 2}, {"name": "find", "identifier": 1}]]], "program_lisp": "(span (relocate (filter find)))", "program_execution": []}

==> cancer victim vs. against Laboratory Corp

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Hi! This is almost a duplicate of this issue, except I wanted to add that this behavior is consistent for me: I get this error every time I train the model.

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Nitish suggested earlier that I should try restarting the training, but that doesn't work for me. I thought maybe this has something to do with the fact that I'm using the master version of this repo, which might be too old. Is it possible that it contains bugs that were fixed in later versions? Does it make sense for me to try switching to other branches of this repository, like qmdr or ai2, or to the code from the latest pre-release?

For now, I'm only interested in getting it to work in some way or another and getting roughly the same numbers as in the paper (with or without QDMR representations and paired training).

Issue with running the model

I followed all the steps to set up the model and ran the allennlp predict command as instructed in the README. However, I ran into the following error:

allennlp.common.checks.ConfigurationError: 'Cannot register atis_parser as Model; name already in use for AtisSemanticParser'

How might this be resolved? If more information is needed, I'd love to clarify. Thank you.

gc_params problem

Hi, I'm using the allennlp version you recommended. When I ran your code, I ran into the following problem:

Traceback (most recent call last):
  File "/home/hustchenwenhu/anaconda3/envs/pytorch1.4/bin/allennlp", line 11, in <module>
    load_entry_point('allennlp', 'console_scripts', 'allennlp')()
  File "/data/wenhu/allennlp/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/data/wenhu/allennlp/allennlp/commands/__init__.py", line 102, in main
    args.func(args)
  File "/data/wenhu/allennlp/allennlp/commands/train.py", line 124, in train_model_from_args
    args.cache_prefix)
  File "/data/wenhu/allennlp/allennlp/commands/train.py", line 168, in train_model_from_file
    cache_directory, cache_prefix)
  File "/data/wenhu/allennlp/allennlp/commands/train.py", line 234, in train_model
    validation_iterator=pieces.validation_iterator)
  File "/data/wenhu/allennlp/allennlp/training/trainer.py", line 726, in from_params
    params.assert_empty(cls.__name__)
  File "/data/wenhu/allennlp/allennlp/common/params.py", line 433, in assert_empty
    raise ConfigurationError("Extra parameters passed to {}: {}".format(class_name, self.params))
allennlp.common.checks.ConfigurationError: "Extra parameters passed to Trainer: {'gc_freq': 500}"

Any idea how to fix this problem?

Loss tensor without grad_fn arises during training

After following the instructions and pip-installing the modified version of allennlp, I was able to run the training script, but it led to CUDA out-of-memory errors on my GPU, so I brought the batch size down from 4 to 3. Then it works, but I get the following error around 7% of the way into the first epoch:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Is this somehow related to the batch size, or is there another way of solving it?

Note: I'm running this with CUDA 10.1

ImportError: cannot import name 'SpacyTokenizer'

When I try to run scripts/iclr/evaluate.sh, I get this error:


Traceback (most recent call last):
  File "/home/milk/.conda/envs/nmn-drop/bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "/home/milk/.conda/envs/nmn-drop/lib/python3.6/site-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/home/milk/.conda/envs/nmn-drop/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 101, in main
    import_submodules(package_name)
  File "/home/milk/.conda/envs/nmn-drop/lib/python3.6/site-packages/allennlp/common/util.py", line 323, in import_submodules
    module = importlib.import_module(package_name)
  File "/home/milk/.conda/envs/nmn-drop/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "./semqa/__init__.py", line 1, in <module>
    import semqa.state_machines
  File "./semqa/state_machines/__init__.py", line 1, in <module>
    from semqa.state_machines.constrained_beam_search import FirstStepConstrainedBeamSearch
  File "./semqa/state_machines/constrained_beam_search.py", line 6, in <module>
    from allennlp_semparse.state_machines.states import State
  File "./allennlp_semparse/state_machines/__init__.py", line 26, in <module>
    from allennlp_semparse.state_machines.beam_search import BeamSearch
  File "./allennlp_semparse/state_machines/beam_search.py", line 9, in <module>
    from allennlp_semparse.state_machines.states import State
  File "./allennlp_semparse/state_machines/states/__init__.py", line 14, in <module>
    from allennlp_semparse.state_machines.states.coverage_state import CoverageState
  File "./allennlp_semparse/state_machines/states/coverage_state.py", line 7, in <module>
    from allennlp_semparse.fields.production_rule_field import ProductionRule
  File "./allennlp_semparse/fields/__init__.py", line 1, in <module>
    from allennlp_semparse.fields.knowledge_graph_field import KnowledgeGraphField
  File "./allennlp_semparse/fields/knowledge_graph_field.py", line 14, in <module>
    from allennlp.data.tokenizers import Token, Tokenizer, SpacyTokenizer
ImportError: cannot import name 'SpacyTokenizer'


I've tried --upgrade and --ignore-installed, but they didn't work.

Get the operation extracted from the question only

Hi,
First, I want to thank you for publishing a great paper with code.
I have a question about the operations: can I get the operation results in addition to the answers?
I mean the operation extracted from each question.

Thanks,
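
For what it's worth, the prediction output shown in the issue above already carries the predicted program under the "program_lisp" and "program_nested_expression" keys, so a minimal sketch for pulling the operations out of output.jsonl would be:

import json

with open("output.jsonl") as f:
    for line in f:
        prediction = json.loads(line)
        # e.g. "(span (relocate (filter find)))"
        print(prediction["question"], "->", prediction["program_lisp"])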

Not loading the right pre trained model

I followed the steps for setting up the project, but I get the error below when trying to run a prediction.

Any hint?

2020-02-05 18:42:39,044 - INFO - allennlp.nn.initializers -    qp_matrix_attention._bias
2020-02-05 18:42:39,044 - INFO - allennlp.nn.initializers -    qp_matrix_attention._weight_vector
2020-02-05 18:42:41,383 - INFO - allennlp.common.from_params - instantiating class <class 'allennlp.data.dataset_readers.dataset_reader.DatasetReader'> from params {'lazy': False, 'pretrained_model': 'bert-base-uncased', 'question_length_limit': 50, 'skip_due_to_gold_programs': False, 'skip_instances': False, 'token_indexers': {'tokens': {'pretrained_model': 'bert-base-uncased', 'type': 'bert-drop'}}, 'type': 'drop_reader_bert'} and extras set()
2020-02-05 18:42:41,384 - INFO - allennlp.common.params - validation_dataset_reader.type = drop_reader_bert
2020-02-05 18:42:41,384 - INFO - allennlp.common.from_params - instantiating class <class 'semqa.data.dataset_readers.drop_reader_bert.DROPReaderNew'> from params {'lazy': False, 'pretrained_model': 'bert-base-uncased', 'question_length_limit': 50, 'skip_due_to_gold_programs': False, 'skip_instances': False, 'token_indexers': {'tokens': {'pretrained_model': 'bert-base-uncased', 'type': 'bert-drop'}}} and extras set()
2020-02-05 18:42:41,385 - INFO - allennlp.common.params - validation_dataset_reader.lazy = False
2020-02-05 18:42:41,385 - INFO - allennlp.common.params - validation_dataset_reader.pretrained_model = bert-base-uncased
2020-02-05 18:42:41,386 - INFO - allennlp.common.from_params - instantiating class allennlp.data.token_indexers.token_indexer.TokenIndexer from params {'pretrained_model': 'bert-base-uncased', 'type': 'bert-drop'} and extras set()
2020-02-05 18:42:41,386 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.type = bert-drop
2020-02-05 18:42:41,386 - INFO - allennlp.common.from_params - instantiating class semqa.data.dataset_readers.drop_reader_bert.BertDropTokenIndexer from params {'pretrained_model': 'bert-base-uncased'} and extras set()
2020-02-05 18:42:41,387 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.pretrained_model = bert-base-uncased
2020-02-05 18:42:41,387 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.max_pieces = 512
2020-02-05 18:42:41,614 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /root/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2020-02-05 18:42:41,718 - INFO - allennlp.common.params - validation_dataset_reader.relaxed_span_match = True
2020-02-05 18:42:41,719 - INFO - allennlp.common.params - validation_dataset_reader.do_augmentation = True
2020-02-05 18:42:41,719 - INFO - allennlp.common.params - validation_dataset_reader.question_length_limit = 50
2020-02-05 18:42:41,719 - INFO - allennlp.common.params - validation_dataset_reader.only_strongly_supervised = False
2020-02-05 18:42:41,719 - INFO - allennlp.common.params - validation_dataset_reader.skip_instances = False
2020-02-05 18:42:41,719 - INFO - allennlp.common.params - validation_dataset_reader.skip_due_to_gold_programs = False
2020-02-05 18:42:41,720 - INFO - allennlp.common.params - validation_dataset_reader.convert_spananswer_to_num = False
2020-02-05 18:42:42,003 - INFO - pytorch_pretrained_bert.tokenization - loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /root/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
2020-02-05 18:42:42,094 - INFO - allennlp.common.registrable - instantiating registered subclass drop_demo_predictor of <class 'allennlp.predictors.predictor.Predictor'>
Traceback (most recent call last):
  File "/root/anaconda3/envs/py3/bin/allennlp", line 10, in <module>
    sys.exit(run())
  File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/allennlp/commands/__init__.py", line 102, in main
    args.func(args)
  File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/allennlp/commands/predict.py", line 227, in _predict
    manager.run()
  File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/allennlp/commands/predict.py", line 206, in run
    for model_input_json, result in zip(batch_json, self._predict_json(batch_json)):
  File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/allennlp/commands/predict.py", line 151, in _predict_json
    results = [self._predictor.predict_json(batch_data[0])]
  File "./semqa/predictors/demo_predictor.py", line 180, in predict_json
    instance = self._json_to_instance(inputs)
  File "./semqa/predictors/demo_predictor.py", line 100, in _json_to_instance
    passage_spacydoc = spacyutils.getSpacyDoc(cleaned_passage_text, spacy_nlp)
  File "./utils/spacyutils.py", line 38, in getSpacyDoc
    return nlp(sent)
  File "/root/anaconda3/envs/py3/lib/python3.6/site-packages/spacy/language.py", line 435, in __call__
    doc = proc(doc, **component_cfg.get(name, {}))
  File "pipes.pyx", line 397, in spacy.pipeline.pipes.Tagger.__call__
  File "pipes.pyx", line 442, in spacy.pipeline.pipes.Tagger.set_annotations
  File "morphology.pyx", line 312, in spacy.morphology.Morphology.assign_tag_id
  File "morphology.pyx", line 200, in spacy.morphology.Morphology.add
ValueError: [E167] Unknown morphological feature: 'ConjType' (9141427322507498425). This can happen if the tagger was trained with a different set of morphological features. If you're using a pretrained model, make sure that your models are up to date:
python -m spacy validate
2020-02-05 18:43:25,401 - INFO - allennlp.models.archival - removing temporary unarchived model dir at /tmp/tmp1kf0l594

The input file looks like this:

{"passage":" Hoping to snap a two-game losing streak, the Falcons went home for a Week 9 duel with the Washington Redskins.  Atlanta would take flight in the first quarter as quarterback Matt Ryan completed a 2-yard touchdown pass to tight end Tony Gonzalez, followed by cornerback Tye Hill returning an interception 62 yards for a touchdown.  The Redskins would answer in the second quarter as kicker Shaun Suisham nailed a 48-yard field goal, yet the Falcons kept their attack on as running back Michael Turner got a 30-yard touchdown run, followed by kicker Jason Elam booting a 33-yard field goal. Washington began to rally in the third quarter with a 1-yard touchdown run from running back Ladell Betts.  The Redskins would come closer in the fourth quarter as quarterback Jason Campbell hooked up with tight end Todd Yoder on a 3-yard touchdown pass, yet Atlanta closed out the game with Turner's 58-yard touchdown run.","question":"How many yards was the shortest touchdown pass?"}

ImportError: cannot import name 'TokenType'

I've installed everything exactly as requested; however, I still get the following error:

from allennlp.data.token_indexers.token_indexer import TokenIndexer, TokenType
ImportError: cannot import name 'TokenType' from 'allennlp.data.token_indexers.token_indexer' (/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp/data/token_indexers/token_indexer.py)

The full error is:


Traceback (most recent call last):
  File "/home/mvictor96/miniconda3/envs/myenv/bin/allennlp", line 8, in <module>
    sys.exit(run())
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp/__main__.py", line 19, in run
    main(prog="allennlp")
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp/commands/__init__.py", line 91, in main
    import_module_and_submodules(package_name)
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp/common/util.py", line 340, in import_module_and_submodules
    module = importlib.import_module(package_name)
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/mnt/c/Users/matth/Documents/RedMan/src/nmn-drop/semqa/__init__.py", line 1, in <module>
    import semqa.state_machines
  File "/mnt/c/Users/matth/Documents/RedMan/src/nmn-drop/semqa/state_machines/__init__.py", line 1, in <module>
    from semqa.state_machines.constrained_beam_search import FirstStepConstrainedBeamSearch
  File "/mnt/c/Users/matth/Documents/RedMan/src/nmn-drop/semqa/state_machines/constrained_beam_search.py", line 6, in <module>
    from allennlp_semparse.state_machines.states import State
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp_semparse/state_machines/__init__.py", line 26, in <module>
    from allennlp_semparse.state_machines.beam_search import BeamSearch
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp_semparse/state_machines/beam_search.py", line 9, in <module>
    from allennlp_semparse.state_machines.states import State
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp_semparse/state_machines/states/__init__.py", line 14, in <module>
    from allennlp_semparse.state_machines.states.coverage_state import CoverageState
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp_semparse/state_machines/states/coverage_state.py", line 7, in <module>
    from allennlp_semparse.fields.production_rule_field import ProductionRule
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp_semparse/fields/__init__.py", line 1, in <module>
    from allennlp_semparse.fields.knowledge_graph_field import KnowledgeGraphField
  File "/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp_semparse/fields/knowledge_graph_field.py", line 14, in <module>
    from allennlp.data.token_indexers.token_indexer import TokenIndexer, TokenType
ImportError: cannot import name 'TokenType' from 'allennlp.data.token_indexers.token_indexer' (/home/mvictor96/miniconda3/envs/myenv/lib/python3.7/site-packages/allennlp/data/token_indexers/token_indexer.py)

Training NMN model on full DROP (with numerical answers)

I have some queries regarding retraining your model on the subset of DROP that has only numerical answers. This subset has ~45K training instances, but the preprocessing scripts are able to generate additional supervision for only ~12K of them, and the final model is only able to use these for training (the remaining instances are not used).

Specifically, I am interested in training a version without 'execution_supervised' or 'qattn_supervised'. In all of these cases (removing one of these supervisions or both), the validation results are quite poor (around 13-15% Exact Match). Trying different learning rates and beam sizes also did not help.

Also, by turning on all the kinds of supervision in the config file on this subset of data, I am getting poor validation performance of ~20% (Exact Match). Does this sound reasonable, or am I doing something wrong? It would be great if you could shed some light on this.

I am using the tokenize.py code followed by the preprocessing scripts in datasets/drop/preprocess to generate the different types of supervision in the training data, and I use merge_data.py to get the final training data.
At the end, I get the following supervision from the training data: {'program_supervised': 9669, 'qattn_supervised': 7260, 'execution_supervised': 1750}.

For your reference, I am attaching the scripts I used for generating this supervision for the training data. Can you kindly let me know if I am doing it the right way?

generate_annotations.txt

Question about extracting Q and P representations

Hi, thanks for sharing the code!
While reading through it, I noticed the following:

    # Skip [CLS]; then the next max_ques_len tokens are question tokens
    encoded_question = bert_out[:, 1 : self.max_ques_len + 1, :]
    question_mask = (pad_mask[:, 1 : self.max_ques_len + 1]).float()

    # Skip [CLS] Q_tokens [SEP]
    encoded_passage = bert_out[:, 1 + self.max_ques_len + 1 :, :]
    passage_mask = (pad_mask[:, 1 + self.max_ques_len + 1 :]).float()
    passage_token_idxs = question_passage_tokens[:, 1 + self.max_ques_len + 1 :]

I understand that it is much more convenient to separate the Q and P representations for all instances if all questions are padded to a fixed length (max_ques_len).
However, can we separate Q and P dynamically depending on the real length of each instance? Is there any limitation here?
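
A minimal sketch (not from the repository, assuming per-instance question lengths are available) of what a dynamic split could look like; the catch is that variable-length slices are ragged and must be re-padded per batch, which is largely why a fixed max_ques_len keeps the downstream module code simpler:

import torch

def split_question_passage(bert_out, question_lengths):
    """bert_out: (batch, seq_len, dim); question_lengths: each instance's true question length."""
    questions, passages = [], []
    for i, qlen in enumerate(question_lengths.tolist()):
        questions.append(bert_out[i, 1 : 1 + qlen, :])   # skip [CLS]
        passages.append(bert_out[i, 1 + qlen + 1 :, :])  # skip [CLS] Q_tokens [SEP]
    # Re-pad the ragged per-instance slices into rectangular tensors
    encoded_question = torch.nn.utils.rnn.pad_sequence(questions, batch_first=True)
    encoded_passage = torch.nn.utils.rnn.pad_sequence(passages, batch_first=True)
    return encoded_question, encoded_passage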

Accelerate predictors

Thanks for this interesting work and the released code! I am currently using drop_demo_predictor to predict answers on my dataset of <passage, question> pairs, but it seems quite slow (even with GPUs). I guess it's because predict_json only processes one example at a time (not in a batch) and because of the preprocessing that converts each raw passage/question pair into your internal format. Any ideas for improving the speed?
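
One option worth trying is batching: a minimal sketch, assuming the allennlp 0.9 Predictor API and the checkpoint/predictor names used in this README. Note that this only batches the model's forward pass; the spacy-based preprocessing inside the predictor may still run per example.

import json
from allennlp.models.archival import load_archive
from allennlp.predictors.predictor import Predictor
import semqa  # noqa: F401 -- registers drop_demo_predictor and the model

# Load the archive once; cuda_device=0 puts the model on the first GPU.
archive = load_archive("resources/iclr_cameraready/ckpt/model.tar.gz", cuda_device=0)
predictor = Predictor.from_archive(archive, "drop_demo_predictor")

with open("input.jsonl") as f:
    examples = [json.loads(line) for line in f]

batch_size = 16  # tune to your GPU memory
for i in range(0, len(examples), batch_size):
    for result in predictor.predict_batch_json(examples[i : i + batch_size]):
        print(result["answer"])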
