Where can get [re-trained model?

Speech-Transformer

PyTorch implementation of The SpeechTransformer for Large-scale Mandarin Chinese Speech Recognition.

Speech Transformer is a transformer framework specialized in speech recognition tasks.
This repository contains only model code, but you can train with speech transformer with this repository.
I appreciate any kind of feedback or contribution

Usage

Training

import torch
from speech_transformer import SpeechTransformer

BATCH_SIZE, SEQ_LENGTH, DIM, NUM_CLASSES = 3, 12345, 80, 4

cuda = torch.cuda.is_available()
device = torch.device('cuda' if cuda else 'cpu')

inputs = torch.rand(BATCH_SIZE, SEQ_LENGTH, DIM).to(device)
input_lengths = torch.IntTensor([100, 50, 8])
targets = torch.LongTensor([[2, 3, 3, 3, 3, 3, 2, 2, 1, 0],
                            [2, 3, 3, 3, 3, 3, 2, 1, 2, 0],
                            [2, 3, 3, 3, 3, 3, 2, 2, 0, 1]]).to(device)  # 1 means <eos_token>
target_lengths = torch.IntTensor([10, 9, 8])

model = SpeechTransformer(num_classes=NUM_CLASSES, d_model=512, num_heads=8, input_dim=DIM)
predictions, logits = model(inputs, input_lengths, targets, target_lengths)

Beam Search Decoding

import torch
from speech_transformer import SpeechTransformer

BATCH_SIZE, SEQ_LENGTH, DIM, NUM_CLASSES = 3, 12345, 80, 10

cuda = torch.cuda.is_available()
device = torch.device('cuda' if cuda else 'cpu')

inputs = torch.rand(BATCH_SIZE, SEQ_LENGTH, DIM).to(device)  # BxTxD
input_lengths = torch.LongTensor([SEQ_LENGTH, SEQ_LENGTH - 10, SEQ_LENGTH - 20]).to(device)

model = SpeechTransformer(num_classes=NUM_CLASSES, d_model=512, num_heads=8, input_dim=DIM)
model.set_beam_decoder(batch_size=BATCH_SIZE, beam_size=3)
predictions, _ = model(inputs, input_lengths)

Troubleshoots and Contributing

If you have any questions, bug reports, and feature requests, please open an issue on github or
contacts [email protected] please.

I appreciate any kind of feedback or contribution. Feel free to proceed with small issues like bug fixes, documentation improvement. For major contributions and new features, please discuss with the collaborators in corresponding issues.

Code Style

I follow PEP-8 for code style. Especially the style of docstrings is important to generate documentation.

Reference

Author

Soohwan Kim @sooftware
Contacts: [email protected]

sooftware / speech-transformer Goto Github PK

speech-transformer's Introduction

Speech-Transformer

Usage

Troubleshoots and Contributing

Code Style

Reference

Author

speech-transformer's People

Contributors

Stargazers

Watchers

Forkers

speech-transformer's Issues

Where can get [re-trained model?

There are some obvious bugs.

How can I use the transformer model to inference?

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

Jobs