I've found this to be amazingly effective; it gives good real-time performance

Whisper_mic for faster-whisper/CTranslate2? about whisper_mic HOT 7 CLOSED

mallorbc commented on July 22, 2024

Whisper_mic for faster-whisper/CTranslate2?

from whisper_mic.

Comments (7)

Toxotei commented on July 22, 2024

@mallorbc > I actually heard about faster-whisper recently. I plan on looking into it closer to other projects and may add support here.

Well, I'm no coder, but I did figure out a rudimentary way to get it working with faster-whisper. It works on pretty much the same principle, except that the output of faster-whisper's transcribe isn't a dictionary like in normal Whisper; it is a tuple that contains a generator of segments and an audio info object. So you just need to import the model class (from faster_whisper import WhisperModel) and modify result to take index 0 of the tuple, which is the generator:

            predicted_text = result[0]
            result_queue.put_nowait(predicted_text)

And then when it comes time to print it out, iterate over it to get the text field specifically:

        for segment in result_queue.get():
            finished_text = segment.text
            print(finished_text)

And that's really all it takes to get it going. It seems to work well, but I imagine you might have a more sophisticated solution. If not, maybe I could try to bang it together and do my first ever pull request :)
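For anyone following along, here is a minimal sketch of the handoff described above: the segment generator goes into the queue, and the consumer iterates over it to read each text field. StubModel, Segment, Info, enqueue_transcription and drain_one are hypothetical stand-ins invented for this illustration so that it runs without faster-whisper installed; only the (segments, info) return shape is taken from the comment.

```python
import queue
from collections import namedtuple

# Hypothetical stand-ins for the objects faster-whisper hands back.
Segment = namedtuple("Segment", ["text"])
Info = namedtuple("Info", ["language", "language_probability"])


class StubModel:
    """Stand-in that mimics the (segments, info) return shape described above."""

    def transcribe(self, audio):
        segments = (Segment(text=t) for t in [" Hello", " world."])
        return segments, Info("en", 0.99)


def enqueue_transcription(model, audio, result_queue):
    result = model.transcribe(audio)
    # Index 0 of the tuple is the segment generator.
    result_queue.put_nowait(result[0])


def drain_one(result_queue):
    # Iterating over the generator yields segments; keep only their text.
    return [segment.text for segment in result_queue.get()]


result_queue = queue.Queue()
enqueue_transcription(StubModel(), b"", result_queue)
texts = drain_one(result_queue)
print("".join(texts))
```

With a real model in place of the stub, the same two steps would apply: queue the first tuple element, then iterate over it on the consuming side.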


DevenBL commented on July 22, 2024

I have tried crudely hammering this into the script, but I cannot for the life of me get the following function working:

def transcribe_forever(audio_queue, result_queue, audio_model, english, verbose, save_file):
    while True:
        audio_data = audio_queue.get()
        if english:
            #result = audio_model.transcribe(audio_data,language='english')
            result, _ = audio_model.transcribe(audio_data,language='english')
        else:
            #result = audio_model.transcribe(audio_data)
            result, _ = audio_model.transcribe(audio_data)

        if not verbose:
            predicted_text = result[0]
            result_queue.put_nowait("You said: " + predicted_text)
        else:
            result_queue.put_nowait(result)

        if save_file:
            os.remove(audio_data)

It always dies at result, _ = audio_model.transcribe(audio_data)
No clue what this magic syntax from the faster-whisper documentation is supposed to be: , _


mallorbc commented on July 22, 2024

I actually heard about faster-whisper recently. I plan on looking into it closer to other projects and may add support here.


elia-ashraf commented on July 22, 2024

@Toxotei > [...] It seems to work well, but I imagine you might have a more sophisticated solution. If not, maybe I could try to bang it together and do my first ever pull request :)

Hey. Did you end up making a pull request for this? I was really hoping this would work with faster-whisper, which is definitely much better than the standard Whisper (or maybe with WhisperJAX).


DeluxeMonster commented on July 22, 2024

I am trying to build a Whisper + LLM + Bark bot, so this is an awesome repo and just what I was looking for. Thanks, mallorbc!

The catch with faster-whisper is that the model actually runs while you iterate over the segments, so you can't leave out the iteration:

segments, info = audio_model.transcribe(audio_data, without_timestamps=True)
result = ""
for segment in segments:
    result += segment.text

without_timestamps=True is much faster than the default.
The second return value (the _ in the code above) holds some information you can work with:

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
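To illustrate the point about iteration, here is a small sketch of how a lazy generator behaves in plain Python. lazy_segments and the calls list are hypothetical names made up for this example; the append stands in for the expensive decoding work. Nothing runs until the loop starts, and the generator is empty once it has been consumed.

```python
calls = []


def lazy_segments(chunks):
    """Hypothetical stand-in for a lazily evaluated segment generator."""
    for chunk in chunks:
        calls.append(chunk)  # stands in for the expensive decoding step
        yield chunk.upper()


segments = lazy_segments(["a", "b", "c"])
work_before = len(calls)  # no work has happened yet

result = ""
for segment in segments:
    result += segment  # each step of the loop triggers one unit of work

work_after = len(calls)
leftover = list(segments)  # a consumed generator yields nothing more

print(work_before, work_after, result, leftover)
```

One consequence, under the same assumption that the segments behave like a regular generator: if the text is needed in more than one place, it is safer to join it into a string once and pass the string around.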


evranch commented on July 22, 2024

The main issue was that faster-whisper doesn't want to be passed a Tensor. Got it working, way better performance.


mallorbc commented on July 22, 2024

I'm going to close this issue since this has now been merged into main.

Thanks for the PRs everybody!

