vene / sparse-structured-attention

Sparse and structured neural attention mechanisms

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
attention-mechanism attention-mechanisms fused-lasso deep-learning deeplearning deep-neural-networks sparsity sparse segmentation

sparse-structured-attention's Introduction

Sparse and structured attention mechanisms



Efficient implementation of structured sparsity-inducing attention mechanisms: fusedmax, oscarmax, and sparsemax.

Note: If you are just looking for sparsemax, I recommend the implementation in the entmax package.

Currently available for pytorch >= 0.4.1. (For older versions, use a previous release of this package.) Requires python >= 2.7, cython, numpy, scipy.

Usage example:

In [1]: import torch
In [2]: import torchsparseattn
In [3]: a = torch.tensor([1, 2.1, 1.9], dtype=torch.double)
In [4]: lengths = torch.tensor([3])
In [5]: fusedmax = torchsparseattn.Fusedmax(alpha=.1)
In [6]: fusedmax(a, lengths)
Out[6]: tensor([0.0000, 0.5000, 0.5000], dtype=torch.float64)
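
The other mechanisms follow the same call pattern. Here is a minimal sketch, assuming the Sparsemax and Oscarmax classes take the same (values, lengths) call signature as Fusedmax above (class names inferred from the mechanism names listed earlier); for this input, the sparsemax projection works out to [0.0, 0.6, 0.4]:

import torch
import torchsparseattn

a = torch.tensor([1, 2.1, 1.9], dtype=torch.double)
lengths = torch.tensor([3])

# Sparsemax: Euclidean projection onto the probability simplex;
# low-scoring entries are set exactly to zero.
sparsemax = torchsparseattn.Sparsemax()
p_sparse = sparsemax(a, lengths)

# Oscarmax: the OSCAR penalty encourages groups of equal attention
# weights (larger alpha means more clustering).
oscarmax = torchsparseattn.Oscarmax(alpha=0.1)
p_oscar = oscarmax(a, lengths)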

For details, check out our paper:

Vlad Niculae and Mathieu Blondel. A Regularized Framework for Sparse and Structured Neural Attention. In: Proceedings of NIPS, 2017. https://arxiv.org/abs/1705.07704

See also:

André F. T. Martins and Ramón Fernandez Astudillo. From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification. In: Proceedings of ICML, 2016. https://arxiv.org/abs/1602.02068

X. Zeng and M. Figueiredo. The ordered weighted L1 norm: Atomic formulation, dual norm, and projections. eprint: http://arxiv.org/abs/1409.4271

sparse-structured-attention's People

Contributors: vene

sparse-structured-attention's Issues

How to run tests

Please add some docs about how to run the tests presented in the paper.
Thank you!

Support for PyTorch 1.6.0

Hi,
Thanks for sharing your code!
I ran the tests with PyTorch 1.6.0 and quite a lot of them fail. It might be because PyTorch changed how it handles bool tensors starting with version 1.1.0.

It would be nice if you had time to update your code to support the newest version of PyTorch.

Best,
Pierre

About compiling in latest pytorch

Hi, thank you for sharing the code.
I am considering recompiling the project using the latest PyTorch and other related packages.

It seems that there are some pre-compiled files, such as _fused_jv.pyx, without source files.
My questions are: Are they important? Is it possible to recompile while simply ignoring them, without affecting the modules' functionality?

Thank you very much!

Comparison to softmax with temperature

Very interesting work.
However, I noticed that neither the paper nor the repo has results for softmax with a tunable temperature "T" (written as "gamma" in the paper's notation). Setting T < 1 will result in a sparser softmax output, and the optimal value of T can be determined on a held-out validation set. This type of temperature-softened (or hardened) softmax has often been used in areas like knowledge distillation.

I wonder if this is a meaningful baseline to compare sparsemax against?
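
For reference, a minimal sketch of the temperature-scaled softmax baseline described in this issue; T is the issue's notation and is not a parameter of this package:

import torch

def softmax_with_temperature(scores, T=1.0):
    # T < 1 sharpens (peaks) the distribution; T > 1 flattens it.
    # Unlike sparsemax or fusedmax, the weights never become exactly zero.
    return torch.softmax(scores / T, dim=-1)

scores = torch.tensor([1.0, 2.1, 1.9])
print(softmax_with_temperature(scores, T=0.5))   # more peaked
print(softmax_with_temperature(scores, T=1.0))   # standard softmax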

Different results of SparseMax

Hi, thank you very much for providing the open source code!
However, I ran into a problem when using this code.
My model does not converge when I replace the softmax in the attention mechanism with the Sparsemax function from this package.
Next, I tried the Sparsemax implementation from https://github.com/msobroza/SparsemaxPytorch, and the model converges quickly.
So, I want to know whether there are errors in this code.
Thank you!
