1. Datasets

Speech enhancement datasets (sorted by usage frequency in papers)

English

Name	Source	Duration (seconds)
Dataset by University of Edinburgh	https://datashare.ed.ac.uk	39h (141165.816338)
VCTK (2009)	https://datashare.ed.ac.uk	82h (297552.223038)
LibriSpeech	http://www.openslr.org	983h (3539995.721648)
Common Voice	https://commonvoice.mozilla.org	1837h (6615925.423006)
The VoxCeleb1 Dataset	https://www.robots.ox.ac.uk	x
The VoxCeleb2 Dataset	https://www.robots.ox.ac.uk	x

German

Name	Source	Duration (seconds)
Common Voice	https://commonvoice.mozilla.org/	832h (2995609.871861)

Augmentation noise sources (sorted by usage frequency in papers)

Name	Source	Duration (seconds)
DEMAND	https://zenodo.org	8h (28800.384000)
100 Noise	http://web.cse.ohio-state.edu	293s (293.299375)
RIRS_NOISES	https://www.openslr.org	27h (97661.178407)
QUT-NOISE	https://research.qut.edu.au	27h (98262.946746)
MUSAN	https://www.openslr.org	48h (175827.483386)
Deep Noise Suppression (DNS) Challenge - Interspeech 2020	https://github.com/breizhn/DNS-Challenge
Deep Noise Suppression (DNS) Challenge - Interspeech 2022	https://github.com/microsoft/DNS-Challenge

Audio data augmentation

Name	Language	Description
Data simulation	Python	Adds reverberation or noise, or mixes speakers.
audio-SNR	Python	Mixes an audio file with a noise file at a chosen signal-to-noise ratio (SNR).
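
Mixing at a target SNR, as the audio-SNR entry describes, can be sketched as follows. This is a minimal illustration with NumPy, not the audio-SNR tool's own code or API: the helper name `mix_at_snr`, the mono float-array inputs, and the choice to loop short noise clips are all assumptions made for this example.

```python
import numpy as np


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix mono float `speech` with `noise` at `snr_db` dB (hypothetical helper)."""
    if speech.size == 0 or noise.size == 0:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise if it is too short, then trim it to the speech length.
    repeats = int(np.ceil(speech.size / noise.size))
    noise = np.tile(noise, repeats)[: speech.size]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        raise ValueError("noise is silent; SNR is undefined")

    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    speech = 0.5 * np.sin(2 * np.pi * 220 * t)   # 1 s synthetic "speech" tone
    noise = rng.normal(0.0, 1.0, size=4000)      # 0.25 s of white noise
    mixed = mix_at_snr(speech, noise, snr_db=5.0)

    # Measure the SNR actually achieved in the mixture.
    added = mixed - speech
    achieved = 10 * np.log10(np.mean(speech ** 2) / np.mean(added ** 2))
    print(f"achieved SNR: {achieved:.2f} dB")
```

The mixture is not re-normalized here, so low SNR values can push samples outside [-1.0, 1.0]; scaling the result before writing it to a file would be a reasonable extra step.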

2. Papers

1. Speech processing

1. Blog

2. TensorFlow

3. Audio recognition using TensorFlow Lite

Quantization


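import android.graphics.Bitmap;
import android.graphics.Color;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Assumes yourInputImage (a Bitmap) and interpreter (a loaded TensorFlow Lite
// interpreter) already exist.
// Scale the image to the 224x224 input size this example model expects.
Bitmap bitmap = Bitmap.createScaledBitmap(yourInputImage, 224, 224, true);
// Float32 input buffer: 224 * 224 pixels * 3 channels * 4 bytes per float.
ByteBuffer input = ByteBuffer.allocateDirect(224 * 224 * 3 * 4).order(ByteOrder.nativeOrder());
for (int y = 0; y < 224; y++) {
    for (int x = 0; x < 224; x++) {
        int px = bitmap.getPixel(x, y);

        // Get channel values (0 to 255) from the pixel value.
        int r = Color.red(px);
        int g = Color.green(px);
        int b = Color.blue(px);

        // Normalize channel values to [-1.0, 1.0]. This requirement depends
        // on the model. For example, some models might require values to be
        // normalized to the range [0.0, 1.0] instead.
        float rf = (r - 127.5f) / 127.5f;
        float gf = (g - 127.5f) / 127.5f;
        float bf = (b - 127.5f) / 127.5f;

        // Write the channels in RGB order for each pixel.
        input.putFloat(rf);
        input.putFloat(gf);
        input.putFloat(bf);
    }
}
// Output buffer for 1000 float values, 4 bytes each.
int bufferSize = 1000 * Float.SIZE / Byte.SIZE;
ByteBuffer modelOutput = ByteBuffer.allocateDirect(bufferSize).order(ByteOrder.nativeOrder());
interpreter.run(input, modelOutput);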
...

Recognize Flowers with TensorFlow Lite on Android
GitHub:

4. Voice Filter

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
