Customized Attention Span (CAS) Models

The recurrent weighted average (RWA) model is a new kind of recurrent neural network (RNN) based on the attention mechanism. The model is described in this manuscript. The RWA model exhibits no bias as to where information is located along a sequence: recent information is treated as just as important as information at the beginning of the sequence. For many problems this is undesirable, which is why the RWA model needs an attention span.

In this repository, the RWA model is reimplemented as a TensorFlow RNNCell, so it can be used as easily as any other RNN architecture available in TensorFlow. To enforce an attention span, a decay term has been introduced. The value of the decay term determines how quickly a memory is forgotten and is inversely proportional to the expected half-life. When the decay term is 0, memories are retained indefinitely, although more important information can still overwrite less important information. When the decay term is larger than 0, memories are gradually forgotten. The larger the decay term, the quicker the model forgets.
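The effect of the decay term can be sketched in plain Python. This is an illustration only, not the repository's code: it assumes the decay acts exponentially on the running numerator and denominator of the weighted average, so that the half-life equals ln(2) divided by the decay. The function names `decay_from_half_life` and `decayed_weighted_average` are hypothetical.

```python
import math


def decay_from_half_life(half_life):
    """Return the decay term for a given half-life in time steps.

    Assumes exponential forgetting, so decay = ln(2) / half_life.
    An infinite half-life maps to a decay of 0 (nothing is forgotten).
    The half-life must be positive.
    """
    if math.isinf(half_life):
        return 0.0
    return math.log(2.0) / half_life


def decayed_weighted_average(values, scores, decay):
    """Running weighted average in which older terms fade exponentially.

    Each step contributes `value` with weight exp(score). Before adding
    the new term, the accumulated numerator and denominator are scaled
    by exp(-decay). With decay == 0 this is an ordinary running
    weighted average over the whole sequence.
    """
    keep = math.exp(-decay)  # fraction of the old memory kept per step
    numerator = 0.0
    denominator = 0.0
    averages = []
    for value, score in zip(values, scores):
        weight = math.exp(score)
        numerator = keep * numerator + weight * value
        denominator = keep * denominator + weight
        averages.append(numerator / denominator)
    return averages


# Same input, equal attention scores, two different attention spans.
values = [1.0, 0.0, 0.0, 0.0]
scores = [0.0, 0.0, 0.0, 0.0]
no_decay = decayed_weighted_average(values, scores, decay=0.0)
short_span = decayed_weighted_average(values, scores, decay_from_half_life(1.0))
```

With a decay of 0, the first value keeps an equal share of the average at every step. With a half-life of one step, its share shrinks faster, because its weight is halved each time a new value arrives.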

Each unit can be given a different decay value, so some units can have very short attention spans while others have indefinite attention spans. I hope that this model will exhibit superior performance on a range of problems in fields like NLP and bioinformatics.
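Per-unit decay values can be pictured as one decay per element of the state. The sketch below is a hypothetical illustration rather than the RNNCell in this repository: it assumes exponential forgetting applied independently to each unit, and the function name `decayed_step` and the example decay values are made up for the example.

```python
import math


def decayed_step(numerators, denominators, values, scores, decays):
    """Advance every unit by one time step, each with its own decay.

    All arguments are lists with one entry per unit. A unit with a
    decay of 0 never forgets; a unit with a larger decay forgets faster.
    """
    new_numerators = []
    new_denominators = []
    for n, d, value, score, decay in zip(
        numerators, denominators, values, scores, decays
    ):
        keep = math.exp(-decay)   # per-unit retention factor
        weight = math.exp(score)  # attention weight of the new input
        new_numerators.append(keep * n + weight * value)
        new_denominators.append(keep * d + weight)
    return new_numerators, new_denominators


# Two units: one with an indefinite span, one with a half-life of one step.
decays = [0.0, math.log(2.0)]
numerators = [0.0, 0.0]
denominators = [0.0, 0.0]

# Both units see a 1 followed by three 0s, all with equal attention scores.
for x in [1.0, 0.0, 0.0, 0.0]:
    numerators, denominators = decayed_step(
        numerators, denominators, [x, x], [0.0, 0.0], decays
    )

state = [n / d for n, d in zip(numerators, denominators)]
```

After the four steps, the unit with no decay still holds a quarter of the first input. The unit with the short half-life holds much less of it.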

If you like the idea, I am open to collaboration! This project is still in its infancy.

mrchristophrivera / cas
