scylior-hu / pytorch-nli

This project forked from bentrevett/pytorch-nli

A tutorial on how to implement models for natural language inference using PyTorch and TorchText. [IN PROGRESS]

License: MIT License

Jupyter Notebook 100.00%

PyTorch Natural Language Inference [In Progress]

This repo contains tutorials covering how to do natural language inference (NLI) with PyTorch 1.2 and TorchText 0.4 on Python 3.7.

The first tutorial covers getting started with NLI by introducing a simple network with no recurrent layers. The second covers how to use TorchText's NestedField to get the characters for each word, which are used to construct a sentence encoder for the premise and hypothesis that processes not only the words in each sentence but also the individual characters.

If you find any mistakes or disagree with any of the explanations, please do not hesitate to submit an issue. I welcome any feedback, positive or negative!

Getting Started

To install PyTorch, see installation instructions on the PyTorch website.

To install TorchText:

pip install torchtext

We'll also use spaCy to tokenize our data. To install spaCy, follow the installation instructions on the spaCy website, making sure to install the English model with:

python -m spacy download en

Tutorials

  • 1 - Simple NLI Model

    This tutorial covers how to implement a basic NLI model. The model embeds the tokens in each sentence using frozen 300-dimensional GloVe embeddings and then creates an embedding for the entire sentence by simply summing the embeddings of all tokens within the sentence. These sentence embeddings are then fed into three linear layers, which output a prediction. Although this model is simple, it achieves performance comparable to running RNNs over each sentence. We also show how to implement a simple model with RNNs.

  • 2 - NestedField, Sentence Encoders and Inference

    Now that we have a basic NLI model working, we can improve on it. In this tutorial we introduce the NestedField, a TorchText field that processes another field. The NestedField provides an easy way to get both the words and the characters for the sequences we want to process. We will also create a sentence encoder that takes in both the words and the characters within a sentence, processing the words with an RNN and the characters with a CNN, and produces a single sentence embedding vector for each of the premise and the hypothesis, which are then fed into the rest of our model. Finally, we show how to use the model for inference, allowing us to perform natural language inference on any pair of sentences.
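
As a rough illustration of the model described in the first tutorial, here is a minimal sketch of a sum-of-embeddings classifier. It is not the notebook's code: the class name `SimpleNLI`, the hidden size, the ReLU activations, and the choice to concatenate the premise and hypothesis vectors before the linear layers are all assumptions, and the embedding weights are random stand-ins for GloVe.

```python
import torch
import torch.nn as nn


class SimpleNLI(nn.Module):
    """Sum-of-embeddings NLI model (hypothetical name and layer sizes)."""

    def __init__(self, vocab_size, emb_dim=300, hid_dim=300, n_classes=3, pad_idx=0):
        super().__init__()
        # In the tutorial these weights would come from GloVe; here they are
        # random stand-ins. padding_idx keeps the padding vector at zero.
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=pad_idx)
        self.embedding.weight.requires_grad = False  # frozen embeddings
        # Three linear layers; the ReLUs between them are an assumption.
        self.fc = nn.Sequential(
            nn.Linear(2 * emb_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, n_classes),
        )

    def encode(self, tokens):
        # tokens: [batch, seq_len] -> sentence vector: [batch, emb_dim]
        return self.embedding(tokens).sum(dim=1)

    def forward(self, premise, hypothesis):
        # Assumption: the two sentence vectors are concatenated.
        pair = torch.cat([self.encode(premise), self.encode(hypothesis)], dim=1)
        return self.fc(pair)  # unnormalised class scores: [batch, n_classes]


model = SimpleNLI(vocab_size=1000)
premise = torch.randint(1, 1000, (4, 7))     # 4 sentences, 7 tokens each
hypothesis = torch.randint(1, 1000, (4, 5))  # 4 sentences, 5 tokens each
logits = model(premise, hypothesis)
print(logits.shape)  # torch.Size([4, 3])
```

Because the padding vector is zero, padded positions contribute nothing to the sum, so sentences of different lengths can share a batch without masking.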
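
The sentence encoder from the second tutorial could look roughly like the sketch below. It shows one plausible arrangement, not necessarily the notebook's: a CNN turns each word's characters into a feature vector, that vector is concatenated with the word embedding, and an LSTM reads the result. The class name `SentenceEncoder`, all layer sizes, and the use of an LSTM with max-pooling over characters are assumptions. The inputs have the nested shape a NestedField-style pipeline would provide: word indices per sentence, plus character indices per word.

```python
import torch
import torch.nn as nn


class SentenceEncoder(nn.Module):
    """Word-level RNN plus character-level CNN (hypothetical sizes)."""

    def __init__(self, n_words, n_chars, word_dim=50, char_dim=16,
                 n_filters=20, kernel_size=3, hid_dim=64):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_cnn = nn.Conv1d(char_dim, n_filters, kernel_size, padding=1)
        self.rnn = nn.LSTM(word_dim + n_filters, hid_dim, batch_first=True)

    def forward(self, words, chars):
        # words: [batch, seq_len]; chars: [batch, seq_len, word_len]
        batch, seq_len, word_len = chars.shape
        # Treat every word as its own short character sequence.
        c = self.char_emb(chars.reshape(batch * seq_len, word_len))
        c = self.char_cnn(c.transpose(1, 2))   # [batch*seq_len, n_filters, word_len]
        c = torch.max(c, dim=2)[0]             # max-pool over the characters
        c = c.reshape(batch, seq_len, -1)      # [batch, seq_len, n_filters]
        # Concatenate word embeddings with per-word character features.
        x = torch.cat([self.word_emb(words), c], dim=2)
        _, (hidden, _) = self.rnn(x)           # hidden: [1, batch, hid_dim]
        return hidden.squeeze(0)               # one vector per sentence


encoder = SentenceEncoder(n_words=1000, n_chars=60)
words = torch.randint(1, 1000, (4, 7))    # 4 sentences, 7 words each
chars = torch.randint(1, 60, (4, 7, 10))  # each word padded to 10 characters
sentence_vec = encoder(words, chars)
print(sentence_vec.shape)  # torch.Size([4, 64])
```

The same encoder would be applied to the premise and to the hypothesis, and the two resulting vectors passed on to the classification layers.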
