VoiceDatasetAz

Common Voice dataset for DeepSpeech

  • The dataset, which pairs prepared text with voice recordings, was created using the Whisper model (by OpenAI).
  • The dataset is a CSV with 'path', 'filesize', and 'transcript' columns, which give the location of the audio file, the size of the audio file, and the transcribed text, respectively.
  1. Download the clips archive: https://huggingface.co/datasets/RashadGarazadeh/CommonVoiceAz/blob/main/clips.zip

First, we need to install a few libraries and packages:

  • sudo apt install ffmpeg
  • pip install pydub
  • pip install faster_whisper
  • Because the audio files are large, I split them into segments and placed them into folders using ffmpeg, and then prepared the dataset with the faster_whisper library:

python spleetaudio.py
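
spleetaudio.py itself ships with the repository; the following is only a minimal sketch of the same idea, splitting a long recording into fixed-length segments with pydub (the input path, output folder, and 10-second segment length are assumptions, not values from the repository):

```python
# Hypothetical sketch of audio splitting; not the repository's spleetaudio.py.
import os
from pydub import AudioSegment  # pydub uses ffmpeg for decoding

SOURCE_FILE = "recordings/full_audio.mp3"  # assumed input file
OUT_DIR = "clips"                          # assumed output folder
SEGMENT_MS = 10_000                        # assumed segment length: 10 seconds

os.makedirs(OUT_DIR, exist_ok=True)
audio = AudioSegment.from_file(SOURCE_FILE)

# Slice the recording into consecutive segments and export each one as WAV.
for idx, start in enumerate(range(0, len(audio), SEGMENT_MS)):
    chunk = audio[start:start + SEGMENT_MS]
    chunk.export(os.path.join(OUT_DIR, f"clip_{idx}.wav"), format="wav")
```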

You can use Python scripts in the helper folder to assist in preparing the dataset.

By running the datacreate.py file in the helper directory, you can prepare the train.csv, test.csv, and dev.csv files. Note that you should adjust the folder names in the file. Also, when transcribing the text, the script cleans it before writing it into the CSV files. Set the values in the code as described below, then run:

  • python datacreate.py

First, place a specific portion of the audio files into the audio folder so they are available for creating train.csv. Note that the train and dev CSV files should differ in size: train.csv should be significantly larger than dev.csv. Once you have prepared train.csv and move on to dev.csv, replace the audio files in the audio folder with the ones intended for dev.csv (replace them rather than adding to them, so that the train, dev, and test files contain distinct text and audio). Also, within datacreate.py, update the value of i = 0 (row 22), which represents the number of clips already in the clips folder, so that numbering continues from the previous run; for example, i = 22400.
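
datacreate.py is the script to use for this; purely as an illustration of the workflow described above, a stripped-down version of such a loop might look like the following (the folder names, model size, cleaning rule, and the way i offsets the clip numbering are assumptions):

```python
# Hypothetical sketch of the datacreate.py workflow; not the actual helper script.
import csv
import os
import re
import shutil

from faster_whisper import WhisperModel

AUDIO_DIR = "audio"     # folder holding the clips for the current split (assumption)
CLIPS_DIR = "clips"     # folder the numbered clips accumulate in (assumption)
OUT_CSV = "train.csv"   # switch to dev.csv or test.csv for the other splits
i = 0                   # clips already in the clips folder, e.g. i = 22400 on a later run

model = WhisperModel("small", device="cpu", compute_type="int8")

def clean(text: str) -> str:
    # Assumed minimal cleaning: collapse whitespace and strip stray punctuation.
    return re.sub(r"\s+", " ", text).strip(" .,!?")

os.makedirs(CLIPS_DIR, exist_ok=True)
with open(OUT_CSV, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["path", "filesize", "transcript"])
    for n, name in enumerate(sorted(os.listdir(AUDIO_DIR)), start=i):
        src = os.path.join(AUDIO_DIR, name)
        dst = os.path.join(CLIPS_DIR, f"clip_{n}.wav")  # numbering continues after i
        shutil.copy(src, dst)
        segments, _ = model.transcribe(dst, language="az")
        transcript = clean(" ".join(seg.text for seg in segments))
        writer.writerow([dst, os.path.getsize(dst), transcript])
```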

If, after some time, you have additional entries for the train.csv file, you can merge train.csv and merge.csv using the concatcsv.py file I added to the helper directory.
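
concatcsv.py in the helper directory handles the merge; the equivalent operation, sketched here with pandas (file names assumed from the description above), is just:

```python
# Hypothetical sketch of merging two dataset CSVs; not the actual concatcsv.py.
import pandas as pd

train = pd.read_csv("train.csv")   # existing entries
extra = pd.read_csv("merge.csv")   # newly collected entries

# Stack the two tables and write the result back without the pandas index column.
pd.concat([train, extra], ignore_index=True).to_csv("train.csv", index=False)
```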

  • If needed, one of the essential files for preparing the model is corpus.txt, an example of which I've provided in the helper directory. If you wish to create a large-scale corpus, you can use the corporacreate.py file located in the helper folder. Following the same procedure, convert several mp4 files to audio, segment them, and run the script: it reads each audio file individually, converts the audio to text with the model, and writes the text into corpus.txt (a sketch of this loop follows below).
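
corporacreate.py is the script provided for this; as a rough, hypothetical sketch of the loop it implements (the folder name and model size are assumptions), the core could be:

```python
# Hypothetical sketch of the corpus-building loop; not the actual corporacreate.py.
import os
from faster_whisper import WhisperModel

AUDIO_DIR = "corpus_audio"   # assumed folder of segmented audio files
model = WhisperModel("small", device="cpu", compute_type="int8")

# Transcribe each audio file in turn and append its text as one line of corpus.txt.
with open("corpus.txt", "a", encoding="utf-8") as corpus:
    for name in sorted(os.listdir(AUDIO_DIR)):
        segments, _ = model.transcribe(os.path.join(AUDIO_DIR, name), language="az")
        corpus.write(" ".join(seg.text for seg in segments).strip() + "\n")
```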
