Comments (9)

Jeronymous avatar Jeronymous commented on May 16, 2024

Can you please test with the latest version (unless you are already on it)?

If it still fails with the latest version, can you give me the full set of options you use (in the CLI or in Python): model, language, ...?

from whisper-timestamped.

tkorchagin avatar tkorchagin commented on May 16, 2024

python3

import whisper_timestamped as whisper
model = whisper.load_model("base")
audio = whisper.load_audio(audio_path)
result = whisper.transcribe(model, audio, language='ru')

Jeronymous avatar Jeronymous commented on May 16, 2024

I was asking about the version of this package.
If you used pip to install it, you can check with:

pip freeze | grep whisper

You should see version 1.5.4 if you're up to date.
If not, try:

pip install --upgrade --no-deps --force-reinstall git+https://github.com/Jeronymous/whisper-timestamped
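
On an environment such as Colab it can be handy to check the installed version from Python rather than from the shell. The following is a minimal sketch using the standard-library importlib.metadata; the helper name installed_version is hypothetical, and the distribution name "whisper-timestamped" is taken from the pip freeze output format. Note that for installs made directly from git, the reported version string may not tell you which commit you have.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Hypothetical usage: prints the version string, or None if not installed.
print(installed_version("whisper-timestamped"))
```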

tkorchagin avatar tkorchagin commented on May 16, 2024

@Jeronymous I use Google Colab, and it resets the virtual machine every time, so I always install the latest version.

I still get the same error. Here is the output of the command you asked for:

!pip freeze | grep whisper

openai-whisper @ git+https://github.com/openai/whisper.git@9f7aba609971434b9de2a8d34ca2de766976904d
whisper-timestamped @ git+https://github.com/Jeronymous/whisper-timestamped@826778f91f9dbadbd80b6f86df64e7352b0c9796

samheutmaker avatar samheutmaker commented on May 16, 2024

I'm also having the same issue on the latest version.

Jeronymous avatar Jeronymous commented on May 16, 2024

Thanks a lot @tkorchagin for your effort in helping to narrow this down.
Unfortunately, I am not able to reproduce the failure with your audio ekaterina_koval 05.01.2023, 17-31.wav and the base model,
neither on CPU nor on GPU (I also tried other model sizes, to be sure).

Maybe @samheutmaker can share the audio and option details, to check whether I have "more chance" with his case?

Jeronymous avatar Jeronymous commented on May 16, 2024

@tkorchagin I pushed a new version, where the assertion failure gives more details (the list of logprobs).
I would appreciate it if you could re-run the transcription that fails and share the new failure message. Maybe I can see something obvious...

Also, you can use the option compute_word_confidence=False in transcribe().
This should prevent the failure from occurring (you just won't have word confidence). I would then be interested in seeing the output you get (sharing the JSON file would be awesome), to try to understand why I cannot reproduce your issue.
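
A minimal sketch of that workaround, reusing the calls from the snippet earlier in this thread and adding compute_word_confidence=False. The helper names transcribe_without_confidence and save_result, and the default arguments, are hypothetical; the import is done lazily inside the function so the JSON helper can be used on its own.

```python
import json


def transcribe_without_confidence(audio_path, model_name="base", language="ru"):
    """Transcribe with word confidence disabled (the workaround suggested above)."""
    # Imported lazily so the rest of this sketch works without the package installed.
    import whisper_timestamped as whisper

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(audio_path)
    return whisper.transcribe(
        model, audio, language=language, compute_word_confidence=False
    )


def save_result(result, json_path):
    """Write a transcription result (a dict) to a JSON file for sharing."""
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-Latin text (e.g. Russian) readable.
        json.dump(result, f, indent=2, ensure_ascii=False)
```

Something like save_result(transcribe_without_confidence("audio.wav"), "result.json") would then produce a file that can be attached to the issue; "audio.wav" is a placeholder path.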

Jeronymous avatar Jeronymous commented on May 16, 2024

I'm assuming that it's not occurring on the latest version.
Feel free to re-open (and give details) if it occurs again.

Rtut654 avatar Rtut654 commented on May 16, 2024

@Jeronymous
I'm getting a similar issue roughly once in every 30 calls to the transcribe function. I probably can't share the audio, since it is user data. I use the default setup with no custom parameters, the same as in the example.
Here are the logs:

File "/home/test/rep/rep/lib/python3.8/site-packages/whisper_timestamped/transcribe.py", line 688, in may_flush_segment
    assert min([p.isfinite().item() for p in logprobs]), \
AssertionError: Got infinite logprob among (24) [(286, ' I', -inf), (519, ' think', -5.2494893074035645), (8815, ' television', -16.253812789916992), (815, ' may', -10.851532936096191), (362, ' have', -6.786816120147705), (257, ' a', -9.276606559753418), (562, ' when', -inf), (436, ' they', -5.740912437438965), (1401, ' read', -7.83854341506958), (3642, ' books', -12.358419418334961), (11, ',', -6.387020587921143), (50257, '<|endoftext|>', -11.639397621154785)]

P.S. I understand what it means, but I suppose it shouldn't break the whole transcription procedure as it does now?
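
Until the root cause is found, one caller-side way to keep an intermittent failure like this from aborting a batch is to retry with word confidence disabled, as suggested earlier in the thread. This is only a sketch under assumptions, not part of whisper-timestamped: the wrapper name transcribe_with_fallback is hypothetical, and it relies on the check being an assert (as the traceback shows), so the failure surfaces as AssertionError.

```python
def transcribe_with_fallback(transcribe_fn, model, audio, **options):
    """Call transcribe_fn; on AssertionError, retry once without word confidence."""
    try:
        return transcribe_fn(model, audio, **options)
    except AssertionError as err:
        # The "Got infinite logprob" check is an assert, hence AssertionError.
        print(f"Retrying without word confidence after: {err}")
        retry_options = dict(options, compute_word_confidence=False)
        return transcribe_fn(model, audio, **retry_options)


# Hypothetical usage with the real library:
#   import whisper_timestamped as whisper
#   result = transcribe_with_fallback(whisper.transcribe, model, audio, language="ru")
```

The retried result will not contain word confidence scores, so downstream code should not assume they are present.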
