SepTr: Separable Transformer for Audio Spectrogram Processing @ INTERSPEECH 2022 (official repository)

We propose the Separable Transformer (SepTr), an architecture that employs two transformer blocks in a sequential manner, the first attending to tokens within the same frequency bin, and the second attending to tokens within the same time interval.

The original paper is available at: https://www.isca-speech.org/archive/interspeech_2022/ristea22_interspeech.html

The arXiv version is available at: https://arxiv.org/pdf/2203.09581.pdf

This code is released under the CC BY-SA 4.0 license.


[Figure: overview of the SepTr architecture]


Information

Our architecture does not impose a particular axis (time or frequency) for the first transformer block and is flexible in this regard. Without loss of generality, the figure above illustrates a model that separates the tokens along the time axis first. The separable transformer block can be repeated L times to increase the depth of the architecture. The final prediction of the model is made by the MLP block.
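
To make the description above concrete, here is a minimal sketch of a separable block in PyTorch. It is not the code shipped in this repository: the class names `SeparableBlock` and `SepTrSketch` are hypothetical, `nn.TransformerEncoderLayer` stands in for the transformer blocks, the input is assumed to be already tokenized into a tensor of shape (batch, frequency, time, dim), and the tokens are mean-pooled before the MLP head purely for brevity, which may differ from how the paper aggregates tokens.

```python
import torch
import torch.nn as nn


class SeparableBlock(nn.Module):
    """Hypothetical sketch of one separable block (not the repository's code)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        # First transformer: attention among tokens that share a frequency bin,
        # so each sequence runs along the time axis.
        self.along_time = nn.TransformerEncoderLayer(
            dim, heads, dim_feedforward=2 * dim, batch_first=True
        )
        # Second transformer: attention among tokens that share a time interval,
        # so each sequence runs along the frequency axis.
        self.along_freq = nn.TransformerEncoderLayer(
            dim, heads, dim_feedforward=2 * dim, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, freq, time, dim)
        b, f, t, d = x.shape
        # Fold the frequency axis into the batch: one sequence per frequency bin.
        x = self.along_time(x.reshape(b * f, t, d)).reshape(b, f, t, d)
        # Swap axes and fold time into the batch: one sequence per time interval.
        x = x.transpose(1, 2)  # (batch, time, freq, dim)
        x = self.along_freq(x.reshape(b * t, f, d)).reshape(b, t, f, d)
        return x.transpose(1, 2)  # back to (batch, freq, time, dim)


class SepTrSketch(nn.Module):
    """Stack of L separable blocks followed by an MLP head (assumed layout)."""

    def __init__(self, dim: int = 32, depth: int = 2, heads: int = 4, num_classes: int = 10):
        super().__init__()
        self.blocks = nn.ModuleList([SeparableBlock(dim, heads) for _ in range(depth)])
        self.mlp_head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, num_classes))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            tokens = block(tokens)
        # Assumption: mean-pool over all tokens before classification.
        pooled = tokens.mean(dim=(1, 2))
        return self.mlp_head(pooled)


model = SepTrSketch().eval()
with torch.no_grad():
    # 2 spectrograms, 8 frequency bins, 16 time intervals, 32-dim tokens.
    logits = model(torch.randn(2, 8, 16, 32))
print(logits.shape)  # torch.Size([2, 10])
```

Because each attention call only sees one row or one column of the token grid, the sequence length is the number of time intervals or the number of frequency bins rather than their product. Swapping the order of `along_time` and `along_freq` in `forward` would give the variant that separates along the frequency axis first.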

Implementation

We implemented the model in PyTorch and provide all scripts to run our architecture.

For the code to work properly, we recommend Python 3.6 or newer. We used Python 3.6.8.

Cite us

@inproceedings{Ristea-INTERSPEECH-2022,
  title={SepTr: Separable Transformer for Audio Spectrogram Processing},
  author={Ristea, Nicolae-Catalin and Ionescu, Radu Tudor and Khan, Fahad Shahbaz},
  year={2022},
  booktitle={Proceedings of INTERSPEECH},
  pages={4103--4107},
  doi={10.21437/Interspeech.2022-249}
}

Related Projects

pytorch-vit

You can send your questions or suggestions to:

[email protected], [email protected]
