
thevasudevgupta / gsoc-wav2vec2


GSoC'2021 | TensorFlow implementation of Wav2Vec2

Home Page: https://thevasudevgupta.github.io/gsoc-wav2vec2/assets/final_report

License: Apache License 2.0

Python 4.97% Jupyter Notebook 95.03%
gsoc librispeech-dataset speech-to-text tensorflow wav2vec2

gsoc-wav2vec2's People

Contributors

sayakpaul, thevasudevgupta

gsoc-wav2vec2's Issues

unable to create TPU node

Hello @sayakpaul @MorganR ,

I have been unable to create a TPU node for the last 2 days. I am getting the following error:

ERROR: (gcloud.compute.tpus.execution-groups.create) There is no more capacity in the zone "europe-west4-a"; you can try in another zone where Cloud TPU Nodes are offered (see https://cloud.google.com/tpu/docs/regions) [EID: 0xba393c906974e61]

I only have v3-8 access in the europe-west4-a zone, so I can't create a node in other zones. Is there a solution to the above issue, or will I just have to wait?

Error in accompanying Colab: "Failed to get convolution algorithm"

In the accompanying Colab, under Setting up training state at this step:

model(tf.random.uniform(shape=(BATCH_SIZE, AUDIO_MAXLEN)))
model.summary()

Running all the code in the colab up to this point produces an error here:

---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
<ipython-input-7-59995e55f4ed> in <module>()
----> 1 model(tf.random.uniform(shape=(BATCH_SIZE, AUDIO_MAXLEN)))
      2 model.summary()

11 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[{{node StatefulPartitionedCall/wav2vec2/feature_extractor/conv_layers/0/conv/conv1d}}]] [Op:__inference_restored_function_body_38644]

Function call stack:
restored_function_body

Here's a screenshot:

Screen Shot 2022-01-09 at 1 38 39 PM

Ideas from the wav2vec2 repo

Initial action plans

Copying these things from the wav2vec2 repo for safe housekeeping.

  • An immediate step could be to quantize the fine-tuned model by converting it using TFLite APIs. Post-training quantization, specifically, might be very useful. Quantization-aware training might be even more helpful, but its support on TPUs is limited. I remember you had tried post-training quantization, but the resulting model size was around 400 MB, and I had shared some thoughts around that. It might be a good idea to revisit post-training quantization in that case.
  • Google Research recently published FRILL which could be relevant for us. Basically, they perform knowledge distillation with a smaller student model with careful design choices along with quantization-aware training.
  • Meanwhile, if you have any other ideas that you think might be worth trying out please feel free to share them. If we have anything concrete and novel we can even target a publication in that case.
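As a rough sanity check on the 400 MB figure mentioned above, the arithmetic below estimates the weight size at different precisions. It is only a sketch: the parameter count of about 95 million is an assumption (the approximate figure commonly quoted for the wav2vec 2.0 base model), and real TFLite files also carry some metadata overhead.

```python
# Rough size of the model weights at different numeric precisions.
# ASSUMPTION: ~95 million parameters (approximate size of wav2vec 2.0 base,
# not measured from this repository's checkpoint).
NUM_PARAMS = 95_000_000

BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1}


def weights_size_mb(num_params: int, dtype: str) -> float:
    """Return the approximate weight size in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_WEIGHT[dtype] / 1e6


for dtype in BYTES_PER_WEIGHT:
    print(f"{dtype}: ~{weights_size_mb(NUM_PARAMS, dtype):.0f} MB")
```

Under that assumption, a file of roughly 400 MB is about what unquantized float32 weights would occupy, which would suggest the weights were not actually stored as int8.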

Suggesting another important resource here: Knowledge distillation: A good teacher is patient and consistent. The paper introduces simple recipes to get the best possible student model. But the study is based on image classification models. So, might be a fun exercise to try to think of ways in which this could be extended here.

A baseline approach to distil Wav2Vec2: Shrinking Bigfoot: Reducing wav2vec 2.0 footprint

Other useful resources

Model Optimization

Efficient Methods and Hardware for Deep Learning by Song Han
Lecture on Quantization by Pete Warden

For non-trivial model conversions in TFLite you can refer to the following repositories

https://github.com/tulasiram58827/ocr_tflite/
https://github.com/tulasiram58827/TTS_TFLite
https://github.com/sayakpaul/Adventures-in-TensorFlow-Lite

Kaggle TPU loading/ initialization fails

Really awesome work.
I was able to work with the tf-hub layer with the following code in kaggle TPU

    load_locally = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')
    pretrained_layer = hub.KerasLayer("https://tfhub.dev/vasudevgupta7/wav2vec2/1",load_options=load_locally,trainable=True)
    inputs = tf.keras.Input(shape=cfg.audio_shape)
    states = pretrained_layer(inputs)
    logits= tf.keras.layers.Dense(cfg.vocab_len)(states)
    model = tf.keras.Model(inputs=inputs, outputs=logits)

However, I need to load a custom model (which converts fine with your provided code and is available here: https://www.kaggle.com/code/nazmuddhohaansary/test-conversion), but on TPU the following fails:

tf_model = Wav2Vec2Model(config)
tf_model.summary()

This specific section worked on GPU after conversion in this script: https://www.kaggle.com/code/nazmuddhohaansary/test-conversion
but fails in this Kaggle notebook: https://www.kaggle.com/code/nazmuddhohaansary/tpu-loading-test?scriptVersionId=100483620

  • jit_compile is not a recognized parameter of @tf.function on Kaggle TPU.

Please help. Any guidance is much appreciated. Thanks in advance.

Training related doubt

Hope everyone is doing well!

I am working on fine-tuning the Wav2vec model for Indian accents, and the dataset is about 1.7 TB.

What would you suggest for this task? Are there any better models to fine-tune?

You have also mentioned loading data lazily. Could you please briefly explain how it is used?

If anyone has good knowledge of this, please comment. Thank you.
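On the lazy-loading question, here is a generic sketch of the idea in plain Python, not the repository's own data pipeline. With a dataset that does not fit in memory, files are read one at a time, only when the training loop asks for the next example. The function name `iter_audio_files` and the `*.flac` pattern are illustrative assumptions.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_audio_files(root: str, pattern: str = "*.flac") -> Iterator[Tuple[str, bytes]]:
    """Yield (path, raw bytes) one file at a time instead of reading everything up front."""
    for path in sorted(Path(root).rglob(pattern)):
        # The file is read only when the consumer requests the next item,
        # so memory use stays at roughly one file at a time.
        yield str(path), path.read_bytes()
```

In TensorFlow, a generator like this could be wrapped with `tf.data.Dataset.from_generator` so that decoding and batching also happen on demand.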

Training Wav2Vec2 model on 100h & experiment-2

@sayakpaul, sorry for the delay again. I have started serious experimentation now and will keep you posted with the results. I am starting with experiment-2 for now, as mentioned in vasudevgupta7/compressed-wav2vec2#1. I will post all the results in this issue by tomorrow (TPUs are running now!!)

Experiment description | WER | Wandb
wav2vec2-960h (Facebook version) | 3% | -
wav2vec2-960h (trained during gsoc) | 5.6% | -
wav2vec2-100h | 7.4% | https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/lwiepmm0
wav2vec2-100h (skipped stage-1) | 8.2% | https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/h0bug1zp
wav2vec2-100h (train conv also) | 9.1% | https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/2iro0pl0, https://wandb.ai/7vasudevgupta/gsoc-wav2vec2/runs/284a713r
distilled wav2vec2-100h | | https://wandb.ai/7vasudevgupta/wav2vec2-distillation/runs/2h82mhgc

Evaluation script: https://colab.research.google.com/drive/1aNgochNmchx1R5TcoVH7nM0uPkmxNqE1?usp=sharing
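For readers unfamiliar with the WER column above, the sketch below shows how word error rate is commonly defined: the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. It is a plain-Python illustration for a single utterance. Whether the linked evaluation script computes WER in exactly this way is not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) over 4 reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```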

Just wanted to ask one thing: is it fine if I code in my gsoc repository, or should I code in this private repo?

URL issues

When I run the following code:

from wav2vec2 import Wav2Vec2Config, Wav2Vec2Processor
tokenizer = Wav2Vec2Processor(is_tokenizer=True)

I get the following error:

Downloading `vocab.json` from https://github.com/vasudevgupta7/gsoc-wav2vec2/raw/main/data/vocab.json ... Traceback (most recent call last):
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\processor.py", line 43, in _setup_vocab
    subprocess.run(
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\processor.py", line 21, in __init__
    self._setup_vocab()
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\processor.py", line 47, in _setup_vocab
    raise ValueError(f"Couldn't download `vocab.json` from {url}")
ValueError: Couldn't download `vocab.json` from https://github.com/vasudevgupta7/gsoc-wav2vec2/raw/main/data/vocab.json

When I look up the URL, I get redirected to 'https://raw.githubusercontent.com/thevasudevgupta/gsoc-wav2vec2/main/data/vocab.json', which does have the vocab, but maybe the fact that it needs a redirect causes an issue for Python?

I get a similar issue when I try to get the finetuned model:

from wav2vec2 import Wav2Vec2ForCTC, Wav2Vec2Config, Wav2Vec2Processor
model_id = "finetuned-wav2vec2-960h"
model = Wav2Vec2ForCTC.from_pretrained(model_id)

raises the following error:

Downloading model weights from https://huggingface.co/finetuned-wav2vec2-960h ... Traceback (most recent call last):
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\modeling.py", line 69, in from_pretrained
    subprocess.run(url.split(), check=True, stderr=subprocess.PIPE)
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 505, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 951, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 1420, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\romar\AppData\Local\Programs\Python\Python39\lib\site-packages\wav2vec2\modeling.py", line 71, in from_pretrained
    raise ValueError(
ValueError: Couldn't download model weights from https://huggingface.co/finetuned-wav2vec2-960h

When I try to look up the URL, I get a 404 error. Maybe the URLs should be updated? If it's some issue on my end, please let me know.
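A note on the tracebacks above: in both cases the first exception is a FileNotFoundError raised while subprocess is starting a process, which on Windows generally means the external program being launched could not be found, rather than a problem with the redirect. Which program the package launches is not visible in the traceback. As a possible workaround (a sketch, not the package's own code), the file could be fetched with Python's standard library, which follows HTTP redirects without spawning an external program. The `download` helper below is hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Fetch `url` into `dest` using only the standard library."""
    dest_path = Path(dest)
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # urlopen follows HTTP redirects by default and does not rely on any
    # external command being installed.
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

For example, `download("https://raw.githubusercontent.com/thevasudevgupta/gsoc-wav2vec2/main/data/vocab.json", "vocab.json")` would save the vocab file locally. Whether the processor can then be pointed at a local file depends on the package and is not covered here. The 404 on the model URL is a separate problem that a different download method would not fix.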

Questions about processor

What does this code do?

def _normalize(self, x):
        """You must call this before padding."""
        # -> (1, seqlen)
        mean = tf.reduce_mean(x, axis=-1, keepdims=True)
        var = tf.math.reduce_variance(x, axis=-1, keepdims=True)
        return tf.squeeze((x - mean) / tf.sqrt(var + 1e-5))
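To illustrate what the function above computes, here is a plain-Python sketch of the same arithmetic. It shifts the waveform to zero mean and scales it to roughly unit variance, with a small epsilon to avoid division by zero. The name `normalize` and the sample values are illustrative only.

```python
import math


def normalize(samples, eps=1e-5):
    """Zero-mean, unit-variance scaling of a 1-D list of audio samples."""
    mean = sum(samples) / len(samples)
    # Population variance (mean of squared deviations), as tf.math.reduce_variance computes.
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return [(s - mean) / math.sqrt(var + eps) for s in samples]


out = normalize([1.0, 2.0, 3.0, 4.0])
print(out)  # values centred on 0 with variance close to 1
```

This also explains the docstring "You must call this before padding": padded zeros would be counted as real samples and would shift the mean and variance.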

My other question: on what basis are numbers assigned to the vocab list? By that I mean this:
image

I understand the code in the picture: it basically gets all the characters from the text. My question is, when it turns the characters into a dictionary with their indices as values, does it matter which character is at which index? If yes, how does the right character end up at the right index? I was trying to test my version of your tokenizer and had trouble producing the right outputs with your vocab.json, so I took the one here, which worked fine. Also, I was using a fine-tuned model for making predictions, which was associated with this tokenizer via Hugging Face.
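On whether the index assignment matters: a model's output ids only have meaning relative to the vocab it was trained with, so decoding with a differently ordered vocab gives different text. The toy example below uses two hypothetical vocabularies (not the repository's vocab.json) to show the effect.

```python
def decode(ids, vocab):
    """Map a sequence of ids back to characters using a char -> id vocab."""
    id_to_char = {idx: ch for ch, idx in vocab.items()}
    return "".join(id_to_char[i] for i in ids)


vocab_a = {"a": 0, "b": 1, "c": 2}  # hypothetical mapping used at training time
vocab_b = {"c": 0, "a": 1, "b": 2}  # same characters, different index order

ids = [2, 0, 1]  # ids as a model trained with vocab_a might emit them
print(decode(ids, vocab_a))  # cab
print(decode(ids, vocab_b))  # bca
```

This would be consistent with the observation above that predictions only looked right with the vocab file that shipped with the fine-tuned checkpoint.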

About the README

@vasudevgupta7 the README looks superb now!

I have a few suggestions that might make it even better:

  • I see that you have included instructions on how to load the model from Hugging Face Hub, which is great. Do you think we could add something similar for TF Hub too? Maybe a note to guide the readers that they can pretty much do a similar thing in TensorFlow by doing [...]?
  • While we are on the topic of ASR, do you think it might be good to mention a few other projects that are doing exceptional work in the area and even compare performances qualitatively? I understand this might be difficult for you to accommodate right away. So, feel free to either have this under future works or you can completely discard the choice (I won't mind).

Fused conv implementation error when running fine-tuning notebook

Hi,

I'm getting this error when running the following line in the fine-tuning notebook for Wav2Vec.

model(tf.random.uniform(shape=(BATCH_SIZE, AUDIO_MAXLEN)))

UnimplementedError:  Fused conv implementation does not support grouped convolutions for now.
	 [[{{node StatefulPartitionedCall/wav2vec2/encoder/pos_conv_embed/conv/Conv1DWithWeightNorm}}]] [Op:__inference_restored_function_body_38644]

This is with Python 3.7 and TensorFlow 2.7.0.

Any idea what the issue is?

Thanks!

installing wav2vec2 package

Hello, I am having a problem installing the package. I am using Python 3.7, Linux 18.04, and a conda environment. I have installed a lot of Python packages before, but I am not very knowledgeable about compiling from scratch. Here is the error message I get. If this question is not appropriate for this forum please let me know. thanks!

ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/setup.py'"'"'; file='"'"'/tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-_39rp1n8
cwd: /tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/
Complete output (37 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/Levenshtein
copying Levenshtein/__init__.py -> build/lib.linux-x86_64-3.6/Levenshtein
copying Levenshtein/StringMatcher.py -> build/lib.linux-x86_64-3.6/Levenshtein
running egg_info
writing python_Levenshtein.egg-info/PKG-INFO
writing dependency_links to python_Levenshtein.egg-info/dependency_links.txt
writing entry points to python_Levenshtein.egg-info/entry_points.txt
writing namespace_packages to python_Levenshtein.egg-info/namespace_packages.txt
writing requirements to python_Levenshtein.egg-info/requires.txt
writing top-level names to python_Levenshtein.egg-info/top_level.txt
reading manifest file 'python_Levenshtein.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*pyc' found anywhere in distribution
warning: no previously-included files matching '*so' found anywhere in distribution
warning: no previously-included files matching '.project' found anywhere in distribution
warning: no previously-included files matching '.pydevproject' found anywhere in distribution
adding license file 'COPYING'
writing manifest file 'python_Levenshtein.egg-info/SOURCES.txt'
copying Levenshtein/_levenshtein.c -> build/lib.linux-x86_64-3.6/Levenshtein
copying Levenshtein/_levenshtein.h -> build/lib.linux-x86_64-3.6/Levenshtein
running build_ext
building 'Levenshtein._levenshtein' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/Levenshtein
/home/jonathanfolstein/anaconda3/envs/DataScience/bin/x86_64-conda_cos6-linux-gnu-cc -DNDEBUG -g -fwrapv -O2 -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/jonathanfolstein/anaconda3/envs/DataScience/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/jonathanfolstein/anaconda3/envs/DataScience/include -fPIC -I/usr/include/python3.6m -c Levenshtein/_levenshtein.c -o build/temp.linux-x86_64-3.6/Levenshtein/_levenshtein.o
In file included from /usr/include/python3.6m/Python.h:8:0,
from Levenshtein/_levenshtein.c:99:
/usr/include/python3.6m/pyconfig.h:3:12: fatal error: x86_64-linux-gnu/python3.6m/pyconfig.h: No such file or directory

# include <x86_64-linux-gnu/python3.6m/pyconfig.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compilation terminated.
error: command '/home/jonathanfolstein/anaconda3/envs/DataScience/bin/x86_64-conda_cos6-linux-gnu-cc' failed with exit status 1

ERROR: Failed building wheel for python-Levenshtein
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-srjmq01i/psutil_1614f330d50941d385cd51c6e38e8d71/setup.py'"'"'; file='"'"'/tmp/pip-install-srjmq01i/psutil_1614f330d50941d385cd51c6e38e8d71/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-t444e8dl
cwd: /tmp/pip-install-srjmq01i/psutil_1614f330d50941d385cd51c6e38e8d71/
Complete output (47 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/psutil
copying psutil/_psbsd.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_pswindows.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_psosx.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_pssunos.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_psaix.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_psposix.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_common.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_pslinux.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/_compat.py -> build/lib.linux-x86_64-3.6/psutil
copying psutil/__init__.py -> build/lib.linux-x86_64-3.6/psutil
creating build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_windows.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_misc.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_sunos.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_testutils.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_unicode.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_process.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_connections.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_contracts.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_system.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_posix.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_aix.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/runner.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_bsd.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/__main__.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_linux.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_memleaks.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/__init__.py -> build/lib.linux-x86_64-3.6/psutil/tests
copying psutil/tests/test_osx.py -> build/lib.linux-x86_64-3.6/psutil/tests
running build_ext
building 'psutil._psutil_linux' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/psutil
/home/jonathanfolstein/anaconda3/envs/DataScience/bin/x86_64-conda_cos6-linux-gnu-cc -DNDEBUG -g -fwrapv -O2 -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/jonathanfolstein/anaconda3/envs/DataScience/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/jonathanfolstein/anaconda3/envs/DataScience/include -fPIC -DPSUTIL_POSIX=1 -DPSUTIL_SIZEOF_PID_T=4 -DPSUTIL_VERSION=590 -DPSUTIL_LINUX=1 -I/usr/include/python3.6m -c psutil/_psutil_common.c -o build/temp.linux-x86_64-3.6/psutil/_psutil_common.o
In file included from /usr/include/python3.6m/Python.h:8:0,
from psutil/_psutil_common.c:9:
/usr/include/python3.6m/pyconfig.h:3:12: fatal error: x86_64-linux-gnu/python3.6m/pyconfig.h: No such file or directory

# include <x86_64-linux-gnu/python3.6m/pyconfig.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compilation terminated.
error: command '/home/jonathanfolstein/anaconda3/envs/DataScience/bin/x86_64-conda_cos6-linux-gnu-cc' failed with exit status 1

ERROR: Failed building wheel for psutil
ERROR: Command errored out with exit status 1:
command: /usr/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/setup.py'"'"'; file='"'"'/tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-vt2xki_3/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /home/jonathanfolstein/.local/include/python3.6m/python-Levenshtein
cwd: /tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/
Complete output (37 lines):
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.6
creating build/lib.linux-x86_64-3.6/Levenshtein
copying Levenshtein/__init__.py -> build/lib.linux-x86_64-3.6/Levenshtein
copying Levenshtein/StringMatcher.py -> build/lib.linux-x86_64-3.6/Levenshtein
running egg_info
writing python_Levenshtein.egg-info/PKG-INFO
writing dependency_links to python_Levenshtein.egg-info/dependency_links.txt
writing entry points to python_Levenshtein.egg-info/entry_points.txt
writing namespace_packages to python_Levenshtein.egg-info/namespace_packages.txt
writing requirements to python_Levenshtein.egg-info/requires.txt
writing top-level names to python_Levenshtein.egg-info/top_level.txt
reading manifest file 'python_Levenshtein.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*pyc' found anywhere in distribution
warning: no previously-included files matching '*so' found anywhere in distribution
warning: no previously-included files matching '.project' found anywhere in distribution
warning: no previously-included files matching '.pydevproject' found anywhere in distribution
adding license file 'COPYING'
writing manifest file 'python_Levenshtein.egg-info/SOURCES.txt'
copying Levenshtein/_levenshtein.c -> build/lib.linux-x86_64-3.6/Levenshtein
copying Levenshtein/_levenshtein.h -> build/lib.linux-x86_64-3.6/Levenshtein
running build_ext
building 'Levenshtein._levenshtein' extension
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/Levenshtein
/home/jonathanfolstein/anaconda3/envs/DataScience/bin/x86_64-conda_cos6-linux-gnu-cc -DNDEBUG -g -fwrapv -O2 -Wall -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/jonathanfolstein/anaconda3/envs/DataScience/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/jonathanfolstein/anaconda3/envs/DataScience/include -fPIC -I/usr/include/python3.6m -c Levenshtein/_levenshtein.c -o build/temp.linux-x86_64-3.6/Levenshtein/_levenshtein.o
In file included from /usr/include/python3.6m/Python.h:8:0,
from Levenshtein/_levenshtein.c:99:
/usr/include/python3.6m/pyconfig.h:3:12: fatal error: x86_64-linux-gnu/python3.6m/pyconfig.h: No such file or directory
# include <x86_64-linux-gnu/python3.6m/pyconfig.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/home/jonathanfolstein/anaconda3/envs/DataScience/bin/x86_64-conda_cos6-linux-gnu-cc' failed with exit status 1
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/bin/python3 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/setup.py'"'"'; file='"'"'/tmp/pip-install-srjmq01i/python-levenshtein_8ebad8d7f4cd4c6689fe83843411f653/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-vt2xki_3/install-record.txt --single-version-externally-managed --user --prefix= --compile --install-headers /home/jonathanfolstein/.local/include/python3.6m/python-Levenshtein Check the logs for full command output.

TPU error: Input 2 to node `CTCLoss/ctc-loss/ctc_state_trans/ScatterNd_1` must be a compile-time constant

Thanks for all the work that has gone into this project. I tried to run the code from the fine-tuning Colab notebook on a TPU, but I get the following error when running model.fit:

  (0) INVALID_ARGUMENT: {{function_node __inference_train_function_84983}} Input 2 to node `CTCLoss/ctc-loss/ctc_state_trans/ScatterNd_1` with op ScatterNd must be a compile-time constant.

XLA compilation requires that operator arguments that represent shapes or dimensions be evaluated to concrete values at compile time. This error means that a shape or dimension argument could not be evaluated at compile time, usually because the value of the argument depends on a parameter to the computation, on a variable, or on a stateful operation such as a random number generator.

As per the Colab notebook, I am loading a pretrained wav2vec2 from tfhub:

  load_locally = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')  # required for TPU
  pretrained_layer = tfhub.KerasLayer("https://tfhub.dev/vasudevgupta7/wav2vec2/1", trainable=True, load_options=load_locally)

(I added "load_locally" to make tfhub.KerasLayer work on the TPU)

I guess the pretrained model uses the scatter_nd op in a way that is not compatible with the TPU, right? Any idea what I can do about this?

I did see your end-to-end training script, by the way, which I see has TPU support, but I was hoping to load a pretrained model.

some issue with running sample notebook

When I run the code below:

processor = Wav2Vec2Processor(is_tokenizer=False)
tokenizer = Wav2Vec2Processor(is_tokenizer=True)
model = Wav2Vec2ForCTC.from_pretrained("vasudevgupta/finetuned-wav2vec2-960h")

I get the following error. Is it related to incompatible package versions?

Loading weights locally from vasudevgupta/finetuned-wav2vec2-960h

ValueError                                Traceback (most recent call last)
<ipython-input-12-84c97bf7852e> in <module>()
      1 processor = Wav2Vec2Processor(is_tokenizer=False)
      2 tokenizer = Wav2Vec2Processor(is_tokenizer=True)
----> 3 model = Wav2Vec2ForCTC.from_pretrained("vasudevgupta/finetuned-wav2vec2-960h")

8 frames
/usr/local/lib/python3.7/dist-packages/wav2vec2/tensorflow_addons.py in build(self, input_shape)
     22 
     23     def build(self, input_shape):
---> 24         super().build(input_shape)
     25 
     26         kernel_norm_axes = list(range(self.kernel.shape.rank))

ValueError: Exception encountered when calling layer "pos_conv_embed" (type PositionalConvEmbedding).

One of the dimensions in the output is <= 0 due to downsampling in conv. Consider increasing the input size. Received input shape [1, 6, 768] which would produce output shape with a zero or negative value in a dimension.

Call arguments received:
  • batch=tf.Tensor(shape=(1, 6, 768), dtype=float32)
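The shape [1, 6, 768] in the error above means only 6 time frames reach pos_conv_embed. The sketch below estimates how many frames the convolutional feature extractor produces for a given number of audio samples. The kernel sizes and strides are an assumption (the standard wav2vec 2.0 base configuration), not values read from this package, and why the layer rejects 6 frames is not established here.

```python
# ASSUMPTION: kernel sizes and strides of the standard wav2vec 2.0 base
# feature extractor; this package's configuration may differ.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_frames(num_samples: int) -> int:
    """Number of output frames after the stack of unpadded 1-D convolutions."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded ("valid") convolution.
        length = (length - kernel) // stride + 1
    return length


print(num_frames(2048))    # 6
print(num_frames(246000))  # 768
```

Under these assumptions, a 2048-sample input yields exactly the 6 frames seen in the error, while the 246000-sample input length mentioned in the "How to change input signature" issue yields 768 frames.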

Error in accompanying Colab: "InternalError: libdevice not found at ./libdevice.10.bc [Op:__inference_tf_forward_55230]"

In an accompanying Colab (different Colab than for #29), under Setting up training state at this step:

predictions, labels = infer_librispeech(dataset, num_batches=2618)

Running all the code in the colab up to this point produces an error here:

---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
<ipython-input-16-85b57072fd5d> in <module>()
----> 1 predictions, labels = infer_librispeech(dataset, num_batches=2618)

6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

InternalError: libdevice not found at ./libdevice.10.bc [Op:__inference_tf_forward_55230]

Here's a screenshot:

Screen Shot 2022-01-09 at 2 18 26 PM

How to change input signature

hey there!

thanks for making this repository! This may be a huge help for me.

When I download the model from https://tfhub.dev/vasudevgupta7/wav2vec2/1, saved_model_cli says that the input signature for the model is actually (None, 50000) and not (None, 246000). However, when using tfhub to load the model into a Keras layer (as done in this Colab), it is (None, 246000).

I am confused... please help :)
thanks a lot!

Feedback on the fine-tuning notebook

@vasudevgupta7 great work thus far. Here are some more pointers.

"How to train TensorFlow saved-model with extra head", I suggest "Fine-tuning with an extra head".

"In this notebook, we will load the pre-trained wav2vec2 model from TFHub and will train it on LibriSpeech dataset by appending LM head over the top of our pre-trained model.", I suggest something like -

In this notebook, we will load the pre-trained wav2vec2 model from TFHub (should be a link to the model when available) and will train it on LibriSpeech dataset by appending LM head over the top of our pre-trained model. The underlying task is to ...

"You can also refer to this repositary for some more amazing tutorials on speech-related tasks. In case you encountered any bug in this notebook, please create an issue here."

Typo.

Additional feedback:

  • Let's try to wrap the training and evaluation loop as a subclassed model (tf.keras.Model). Let us know if you face any problems there.
  • Is it possible to load a few FLAC files in an Audio widget and play them for reference? See if this Colab Notebook helps.

Cc: @MorganR

Discussion

Hey @sayakpaul, @MorganR,

I have a few questions before I can start training the model:

  1. The LibriSpeech dataset is available in .flac format, which can be read using tensorflow_io. But AFAIU Cloud TPUs use a special build of TensorFlow, and tensorflow_io does not work with that version. Is there any workaround for this problem?
  2. There are multiple variants of the LibriSpeech dataset: 100h, 360h, 500h (see this). 100h takes 6.3 GB, 360h takes 23 GB, and 500h takes 30 GB of disk space in compressed form. The best model in the paper is obtained by training on the combination of all datasets (i.e. 960h). Which dataset should I target? Or should I target 960h only (the dataset will be quite large in uncompressed form)?

Thanks!
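To put the "quite large in uncompressed form" remark in numbers, here is a back-of-the-envelope estimate. It assumes the audio is stored as raw 16 kHz mono samples (LibriSpeech is sampled at 16 kHz) and ignores any container or TFRecord overhead. The compressed sizes are the ones quoted in the question.

```python
SAMPLE_RATE_HZ = 16_000  # LibriSpeech sampling rate

# Compressed sizes in GB, as quoted in the question above.
compressed_gb = {"100h": 6.3, "360h": 23.0, "500h": 30.0}
total_hours = 100 + 360 + 500


def raw_size_gb(hours: float, bytes_per_sample: int) -> float:
    """Size of raw mono audio in GB (10**9 bytes), without any file overhead."""
    return hours * 3600 * SAMPLE_RATE_HZ * bytes_per_sample / 1e9


print(f"compressed total: {sum(compressed_gb.values()):.1f} GB")
print(f"raw 16-bit PCM:   {raw_size_gb(total_hours, 2):.1f} GB")
print(f"raw float32:      {raw_size_gb(total_hours, 4):.1f} GB")
```

So the full 960h set is about 59 GB compressed, roughly 111 GB as 16-bit samples, and roughly 221 GB if stored as float32.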
