cosecant-csc / annualreview-complearning

Forked from cgpotts/annualreview-complearning. Written in Python. License: GNU General Public License v2.0.

Bringing machine learning and compositional semantics together

Demonstration code for the paper

Liang, Percy and Christopher Potts. 2015. Bringing machine learning and compositional semantics together. Annual Review of Linguistics 1(1): 355–376.

The purpose of the code is just to illustrate how the algorithms work, as an aid to understanding the paper and developing new models that synthesize compositionality and machine learning.

All of the files contain detailed explanations and documentation, with cross-references to the paper. evenodd.py, grammar.py, and synthesis.py run demos corresponding to examples and discussion in the paper.

The current version has been tested with Python 2.7 and Python 3.5. The only other requirement is NumPy 1.10 or greater, and that dependency is only for the neural network in distributed.py.

Code snippets

Build the gold-standard grammar and use it for parsing:

from grammar import Grammar, gold_lexicon, rules, functions
gram = Grammar(gold_lexicon, rules, functions)
gram.gen('minus two plus three')

And interpretation:

for lf in gram.gen('minus two plus three'):
    print(gram.sem(lf))

Check out the crazy things that the crude grammar does with the simple example (it creates 486 logical forms):

from synthesis import crude_lexicon
crude_gram = Grammar(crude_lexicon, rules, functions)
crude_gram.gen('minus two plus three')

Train a semantic parsing model using the default training set from semdata.py and the crude grammar as a starting point:

from semdata import sem_train
from synthesis import phi_sem
from learning import SGD
# For semantic parsing, the denotations are ignored:
semparse_train = [[x,y] for x, y, d in sem_train]

# The space of output classes is determined by GEN:
weights = SGD(D=semparse_train, phi=phi_sem, classes=crude_gram.gen)

And now see that the crude grammar plus the learned weights delivers the right result for 'minus two plus three' (the training set happens to favor the second parse in the list returned by the gold grammar gram):

from learning import predict
predict(x='minus two plus three', w=weights, phi=phi_sem, classes=crude_gram.gen)

For semantic parsing with the trees/derivations as latent variables, and for learning from denotations, see synthesis.evaluate_interpretive and synthesis.evaluate_latent_semparse.

grammar.py

Implements a simple interpreted context-free grammar formalism in which each nonterminal node is a tuple (s, r) where s is the syntactic category and r is the logical form representation. The user supplies a lexicon, a rule-set, and possibly a set of functions that, together with Python itself, make the logical forms interpretable as Python code.
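
The (s, r) node idea can be sketched in a few lines. The names below (lexicon, combine, functions) are illustrative only, not grammar.py's actual API: a binary rule builds a new logical-form string from its children, and Python's eval interprets the result.

```python
# Toy illustration of (category, logical form) nodes.
# NOTE: illustrative names only, not the Grammar class from grammar.py.
lexicon = {
    'two':   [('N', '2')],
    'three': [('N', '3')],
    'plus':  [('R', 'add')],
}

def combine(left, op, right):
    """Binary rule N R N -> N: build the logical form op(left, right)."""
    return ('N', '{}({}, {})'.format(op[1], left[1], right[1]))

# Functions that make the logical forms interpretable as Python code:
functions = {'add': lambda x, y: x + y}

node = combine(('N', '2'), ('R', 'add'), ('N', '3'))
print(node)                      # the (s, r) pair: ('N', 'add(2, 3)')
print(eval(node[1], functions))  # its denotation: 5
```

The denotation falls out of evaluating the logical-form string against the user-supplied functions, which is the same division of labor the module's description above sets up.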

semdata.py

Uses grammar.py to create training and testing data for the semantic models.

learning.py

The core learning framework from section 3.2 of the paper.
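
The framework scores candidate outputs with a weighted feature function and nudges the weights toward the gold output when the model's prediction is wrong. A minimal sketch, with illustrative names rather than learning.py's actual signatures:

```python
# Hedged sketch of a structured SGD/perceptron-style learner in the
# spirit of section 3.2. Names and signatures here are assumptions.
from collections import defaultdict

def score(x, y, w, phi):
    """Linear score: w . phi(x, y), with phi a sparse feature dict."""
    return sum(w[f] * v for f, v in phi(x, y).items())

def sgd(data, phi, classes, eta=0.1, epochs=10):
    """data: (x, y) pairs; classes(x): candidate outputs for x."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, y in data:
            # Predict under the current weights:
            y_hat = max(classes(x), key=lambda z: score(x, z, w, phi))
            if y_hat != y:
                # Move weights toward the gold output, away from the error:
                for f, v in phi(x, y).items():
                    w[f] += eta * v
                for f, v in phi(x, y_hat).items():
                    w[f] -= eta * v
    return w
```

Because prediction is an argmax over classes(x), the same loop handles plain classification (a fixed label set) and semantic parsing (where classes is GEN, as in the snippet above).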

evenodd.py

A simple supervised learning example using learning.py. The examples correspond to table 3 in the paper.

synthesis.py

Implements three different theories for learning compositional semantics. All are illustrated with the same feature function and train/test data.

  • Basic semantic parsing, in which we learn from and predict entire tree-structure logical forms. This involves no latent variables; optimization is with SGD. This is the focus of section 4.1 of the paper.

  • Learning from denotations, in which the tree-structural logical forms are latent variables; optimization is with LatentSGD. This is the focus of section 4.2 of the paper.

  • Latent variable semantic parsing, in which we learn from and predict only the root node of the logical form, making the tree structure a hidden variable. This is not covered in detail in the paper due to space constraints, but it achieves a richer connection with the literature on semantic parsing. To make things interesting, we add a type-lifting rule to the grammar for this example, so that individual logical forms correspond to multiple derivations.
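
For learning from denotations, the update step differs from plain SGD only in how the "gold" output is chosen: it is the best-scoring candidate whose interpretation matches the observed denotation. A hedged sketch of one such update (all names here are assumptions, not synthesis.py's API):

```python
# Illustrative LatentSGD-style update: the supervision is a denotation d,
# so the target logical form is itself chosen by the model.
def latent_update(x, d, w, phi, classes, interpret, eta=0.1):
    candidates = list(classes(x))
    def s(y):
        return sum(w.get(f, 0.0) * v for f, v in phi(x, y).items())
    # The model's current prediction:
    y_hat = max(candidates, key=s)
    # The best candidate consistent with the observed denotation:
    consistent = [y for y in candidates if interpret(y) == d]
    if not consistent:
        return w  # no candidate yields d; skip this example
    y_star = max(consistent, key=s)
    if y_hat != y_star:
        for f, v in phi(x, y_star).items():
            w[f] = w.get(f, 0.0) + eta * v
        for f, v in phi(x, y_hat).items():
            w[f] = w.get(f, 0.0) - eta * v
    return w
```

Starting from zero weights, a single call shifts weight onto a candidate that denotes the observed value, even though no logical form was ever observed directly.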
