GithubHelp home page GithubHelp logo

plstm's Introduction

Phased LSTM implementation in Tensorflow

Welcome to the Tensorflow implementation of the recently introduced Phased LSTM by Neil et. al @ NIPS 2016

You can find here the original implementation from Daniel Neil (in Theano)

Results

Here is the results of frequency discrimination at high resolution sampling (described in the paper): in dark blue you see PLSTM that converges way faster then GRU (in light blue)

PLSTMvsGRU

Implementation

Here I implemented the PLSTM in a plug-and-play fashion such that if you wanna use it in one of your models you can switch from LSTMCell/GRUCell to PhasedLSTMCell.

The core of the PLSTM is

    tau = vs.get_variable(
        "T", shape=[self._num_units],
        initializer=random_exp_initializer(0, self.tau_init), dtype=dtype)

    r_on = vs.get_variable(
        "R", shape=[self._num_units],
        initializer=init_ops.constant_initializer(self.r_on_init), dtype=dtype)

    s = vs.get_variable(
        "S", shape=[self._num_units],
        initializer=init_ops.random_uniform_initializer(0., tau.initialized_value()), dtype=dtype)
        # for backward compatibility (v < 0.12.0) use the following line instead of the above
        # initializer = init_ops.random_uniform_initializer(0., tau), dtype = dtype)

    tau_broadcast = tf.expand_dims(tau, dim=0)
    r_on_broadcast = tf.expand_dims(r_on, dim=0)
    s_broadcast = tf.expand_dims(s, dim=0)

    r_on_broadcast = tf.abs(r_on_broadcast)
    tau_broadcast = tf.abs(tau_broadcast)
    times = tf.tile(times, [1, self._num_units])

    # calculate kronos gate
    phi = tf.div(tf.mod(tf.mod(times - s_broadcast, tau_broadcast) + tau_broadcast, tau_broadcast),
                 tau_broadcast)
    is_up = tf.less(phi, (r_on_broadcast * 0.5))
    is_down = tf.logical_and(tf.less(phi, r_on_broadcast), tf.logical_not(is_up))

    k = tf.select(is_up, phi / (r_on_broadcast * 0.5),
                  tf.select(is_down, 2. - 2. * (phi / r_on_broadcast), self.alpha * phi))

then the kronos gate is applied to the cell simply by

        c = k * c + (1. - k) * c_prev
        m = k * m + (1. - k) * m_prev

PhasedLSTMCell has the same parameters set has the LSTMCell plus, here I report the default parameters (as indicated by the paper)

  • The slope in the off period of the gate
   alpha=0.001
  • The initial value of r_on
        r_on_init=0.05
  • The parameter for the initial sampling of tau
        tau_init=6.

tau is sampled as ~exp(uniform(0, tau_init))

Notes for backward compatibility (v < 0.12.0)

The current implementation uses Tensorflow 0.12.0. If you don't wanna update Tensorflow (BTW you should :)) I inserted in the code some commented lines to be backward compatible.

In PLSTM

    initializer=init_ops.random_uniform_initializer(0., tau.initialized_value()), dtype=dtype)
    # for backward compatibility (v < 0.12.0) use the following line instead of the above
    # initializer = init_ops.random_uniform_initializer(0., tau), dtype = dtype)

In the unit test

    sess.run(init)
    # for backward compatibility (v < 0.12.0) use the following line instead of the above
    # initialize_all_variables(sess)

Remember also to change the summaries calls, that is:

    tf.summary.scalar -> tf.scalar_summary 
    tf.summary.histogram -> tf.histogram_summary 
    tf.summary.FileWriter -> tf.train.SummaryWriter 
    tf.summary.merge -> tf.merge_summary 

Paper's Task

I implemented the first task described in the paper, that is frequency discrimination. The network is presented with sine waves and has to discriminate between waves of a target range of frequencies (e.g. 5-6 Hz) and waves outside of this range. Furthermore there are three different ways in which you can sample these sine waves:

  • Low resolution (1 ms)
  • High resolution (0.1 ms)
  • Asynchronously

The 3 ways are implemented and you can select them with the flags.


Update 17-02-2017

Transition to tensorflow v1.0 The folder has now two different scripts to support v1.0 and older To use v1.0 refer to PLSTM_v1

Updated also

    outputs = multiPLSTM(cells, inputs, lens, n_input, initial_states)

now the function takes a list of cells previously generated, in this way you can paramterize every cell separately.

The function supports also in place copy of initial states


Contact

Let me know if you encounter any problem: [email protected]


+Enea

plstm's People

Contributors

enny1991 avatar jamessergeant avatar vivanov10 avatar williemaddox avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.