GithubHelp home page GithubHelp logo

baiiiiiiiiii / asr-trainer Goto Github PK

View Code? Open in Web Editor NEW

This project forked from voidful/asr-trainer

0.0 0.0 0.0 97 KB

one script for xls-r/xlsr/whisper/hubert alpaca fine-tuning

Python 91.81% Jupyter Notebook 8.19%

asr-trainer's Introduction

one script for xls-r/xlsr/whisper fine-tuning

script modify from https://huggingface.co/blog/fine-tune-xlsr-wav2vec2

  • fix large memory usage during eval metric calculation
  • add cer and wer for evaluation

install requirement

pip install -r requirements.txt

example usage - common voice

python -m torch.distributed.launch --nproc_per_node=2 \
train.py \
--train_subset zh-TW \
--train_split train \
--test_split validation \
--tokenize_config voidful/wav2vec2-large-xlsr-53-tw-gpt \
--model_config facebook/wav2vec2-xls-r-300m \
--batch 10 \
--group_by_length \
--max_input_length_in_sec 20
python -m torch.distributed.launch --nproc_per_node=2 \
train.py \
--train_subset zh-TW \
--model_config openai/whisper-base \
--batch 10 \
--group_by_length \
--max_input_length_in_sec 20

example usage - custom set

custom data format data.csv:

path,text
/xxx/2.wav,被你拒絕而記仇
/xxx/4.wav,電影界的人
/xxx/7.wav,其實我最近在想
python -m torch.distributed.launch --nproc_per_node=2 \
--custom_set ./data.csv \
--tokenize_config voidful/wav2vec2-large-xlsr-53-tw-gpt \
--model_config facebook/wav2vec2-xls-r-300m \
--batch 3 \
--grad_accum 15 \
--max_input_length_in_sec 15 \
--eval_step 10000
python -m torch.distributed.launch --nproc_per_node=2 \
train.py --tokenize_config facebook/hubert-large-ls960-ft \
--model_config ntu-spml/distilhubert \
--group_by_length \
--train_set librispeech_asr \
--train_subset all \
--train_split train.clean.100+train.clean.360+train.other.500 \
--test_split test.other \
--learning_rate 0.0003 \
--batch 30 \
--logging_steps 10 \
--eval_steps 60 \
--epoch 150 \
--use_auth_token True \
--output_dir ./model_sweep \
--overwrite_output_dir

Train whisper on custom dataset

python train.py --tokenize_config openai/whisper-base \
--model_config openai/whisper-base \
--group_by_length \
--custom_set_train cm_train_unified.csv \
--custom_set_test cm_dev_unified.csv \
--output_dir ./whisper-custom \
--overwrite_output_dir

sweep usage

python -m wandb sweep ./sweep_xxx.yaml
python -m wandb agent xxxxxxxxxxxxxx

test command

python train.py --tokenize_config facebook/hubert-large-ls960-ft --model_config ntu-spml/distilhubert --group_by_length --train_set hf-internal-testing/librispeech_asr_dummy --train_split validation --train_subset clean --test_split validation --test_subset clean --learning_rate 0.0003 --batch 30 --logging_steps 10 --eval_steps 60 --epoch 150 --use_auth_token True --output_dir ./model_test --overwrite_output_dir --batch 3

python train.py --tokenize_config openai/whisper-tiny --model_config openai/whisper-tiny --group_by_length --train_set hf-internal-testing/librispeech_asr_dummy --train_split validation --train_subset clean --test_split validation --test_subset clean --learning_rate 0.0003 --batch 30 --logging_steps 10 --eval_steps 60 --epoch 150 --use_auth_token True --output_dir ./model_test --overwrite_output_dir --batch 3

asr-trainer's People

Contributors

voidful avatar qanastek avatar baiiiiiiiiii avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.