azpoliak / hypothesis-only-nli

Code and data corresponding to "Hypothesis Only Baselines in Natural Language Inference" (StarSem 2018)

Hypothesis-only NLI baselines

Add description about the repo & project

Requirements

All code in the repo relies on Python 2.7. Running pip install -r requirements.txt from the root of the project will install the required Python packages.

This project relies on PyTorch.

Data

We provide a bash script that downloads all data used in our experiments. The script also cleans and processes the data. To get and process the data, run data/get_data.sh.

Training

To train a hypothesis-only NLI model, use src/train.py.

All command line arguments have default values. If you ran get_data.sh as described above, the default paths already point to the processed data, so you can run src/train.py without any arguments.

The most useful command line arguments are:

  • embdfile - File containing the word embeddings
  • outputdir - Output directory to store the model after training
  • train_lbls_file - NLI train data labels file
  • train_src_file - NLI train data source file
  • val_lbls_file - NLI validation (dev) data labels file
  • val_src_file - NLI validation (dev) data source file
  • test_lbls_file - NLI test data labels file
  • test_src_file - NLI test data source file
  • remove_dup - 1 to remove duplicate hypotheses from the train data, 0 to keep them (default: 0)
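
To make the effect of remove_dup concrete, here is a minimal sketch of removing duplicate hypotheses while keeping the labels aligned. It is not taken from src/train.py: the function name remove_duplicate_hypotheses is hypothetical, and it assumes the data is held as two parallel lists and that the first occurrence of each hypothesis (with its label) is the one that is kept.

```python
def remove_duplicate_hypotheses(hypotheses, labels):
    """Drop repeated hypotheses, keeping the first occurrence and its label.

    Hypothetical helper for illustration only; the repo's own handling of
    remove_dup may differ (for example in which duplicate it keeps).
    """
    if len(hypotheses) != len(labels):
        raise ValueError("hypotheses and labels must have the same length")

    seen = set()
    kept_hypotheses, kept_labels = [], []
    for hypothesis, label in zip(hypotheses, labels):
        if hypothesis in seen:
            continue  # an identical hypothesis string was already kept
        seen.add(hypothesis)
        kept_hypotheses.append(hypothesis)
        kept_labels.append(label)
    return kept_hypotheses, kept_labels


# Made-up example data, not drawn from any of the datasets.
hyps = ["A man is sleeping.", "A dog runs.", "A man is sleeping."]
lbls = ["contradiction", "entailment", "neutral"]
dedup_hyps, dedup_lbls = remove_duplicate_hypotheses(hyps, lbls)
```

In a hypothesis-only setting the same sentence can appear with different labels, so the choice of which copy to keep changes the training signal. Keeping the first one is only one possible convention.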

To see a description of more command line arguments, run src/train.py --help.
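
When launching several training runs, it can help to assemble the command from Python. The sketch below builds an argument list for src/train.py from the options listed above. The helper build_train_command is hypothetical, the "--name value" flag style is an assumption (check src/train.py --help for the exact spelling), and the paths in the example are placeholders.

```python
def build_train_command(script="src/train.py", python="python", **options):
    """Return an argv list for the training script.

    Assumes every option is passed as "--name value"; this flag style is
    an assumption, not something confirmed by the README.
    """
    command = [python, script]
    for name, value in sorted(options.items()):
        command.extend(["--" + name, str(value)])
    return command


# Placeholder values for illustration only.
command = build_train_command(outputdir="models/run1", remove_dup=1)
print(" ".join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from the root of the project. Any option left out keeps its default value.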

Bibliography

If you use our code, please cite us using the following citation:

@inproceedings{hypothsesis-only-nli-baselines,
  author = {Poliak, Adam and Naradowsky, Jason and Haldar, Aparajita and Rudinger, Rachel and {Van Durme}, Benjamin},
  title = {Hypothesis Only Baselines in Natural Language Inference},
  booktitle = {The Seventh Joint Conference on Lexical and Computational Semantics (*SEM)},
  year = {2018}
}

We also provide the bibliography entries for the datasets that we use. If you use any of those datasets, please cite them as well.

@InProceedings{white-EtAl:2017:I17-1,
  author    = {White, Aaron Steven  and  Rastogi, Pushpendre  and  Duh, Kevin  and  Van Durme, Benjamin},
  title     = {Inference is Everything: Recasting Semantic Resources into a Unified Evaluation Framework},
  booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)},
  month     = {November},
  year      = {2017},
  address   = {Taipei, Taiwan},
  publisher = {Asian Federation of Natural Language Processing},
  pages     = {996--1005},
  url       = {http://www.aclweb.org/anthology/I17-1100}
}

@article{williams2017broad,
  title={A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference},
  author={Williams, Adina and Nangia, Nikita and Bowman, Samuel R},
  journal={arXiv preprint arXiv:1704.05426},
  year={2017}
}

@inproceedings{snli:emnlp2015,
        Author = {Bowman, Samuel R. and Angeli, Gabor and Potts, Christopher and Manning, Christopher D.},
        Booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        Publisher = {Association for Computational Linguistics},
        Title = {A large annotated corpus for learning natural language inference},
        Year = {2015}
}

@InProceedings{lai-bisk-hockenmaier:2017:I17-1,
  author    = {Lai, Alice  and  Bisk, Yonatan  and  Hockenmaier, Julia},
  title     = {Natural Language Inference from Multiple Premises},
  booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)},
  month     = {November},
  year      = {2017},
  address   = {Taipei, Taiwan},
  publisher = {Asian Federation of Natural Language Processing},
  pages     = {100--109},
  url       = {http://www.aclweb.org/anthology/I17-1011}
}

@inproceedings{scitail,
     author = {Tushar Khot and Ashish Sabharwal and Peter Clark},
     booktitle = {AAAI},
     title = {{SciTail}: A Textual Entailment Dataset from Science Question Answering},
     year = {2018}
}

@InProceedings{P16-1204,
  author =      "Pavlick, Ellie
                and Callison-Burch, Chris",
  title =       "Most ``babies'' are ``little'' and most ``problems'' are ``huge'': Compositional Entailment in Adjective-Nouns",
  booktitle =   "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  year =        "2016",
  publisher =   "Association for Computational Linguistics",
  pages =       "2164--2173",
  location =    "Berlin, Germany",
  doi =         "10.18653/v1/P16-1204",
  url =         "http://www.aclweb.org/anthology/P16-1204"
}

@inproceedings{MARELLI14.363.L14-1314,
    author = {Marco Marelli and Stefano Menini and Marco Baroni and Luisa Bentivogli and Raffaella Bernardi and Roberto Zamparelli},
    url = {http://www.lrec-conf.org/proceedings/lrec2014/pdf/363_Paper.pdf},
    note = {ACL Anthology Identifier: L14-1314},
    title = {A SICK cure for the evaluation of compositional distributional semantic models},
    booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
    year = {2014},
    month = {May},
    date = {26-31},
    address = {Reykjavik, Iceland},
    publisher = {European Language Resources Association (ELRA)},
    isbn = {978-2-9517408-8-4},
    pages = {216--223}
}

hypothesis-only-nli's Issues

joci.zip not available

mldl@ub1604:~/ub16_prj/hypothesis-only-NLI$ wget -c http://decomp.net/wp-content/uploads/2015/08/joci.zip
--2018-09-24 10:33:58-- http://decomp.net/wp-content/uploads/2015/08/joci.zip
Connecting to 127.0.0.1:39739... connected.
Proxy request sent, awaiting response... 301 Moved Permanently
Location: http://decomp.io/wp-content/uploads/2015/08/joci.zip [following]
--2018-09-24 10:34:00-- http://decomp.io/wp-content/uploads/2015/08/joci.zip
Reusing existing connection to 127.0.0.1:39739.
Proxy request sent, awaiting response... 404 Not Found
2018-09-24 10:34:01 ERROR 404: Not Found.