Comments (7)
@mallorbc > I actually heard about faster-whisper recently. I plan on looking into it closer to other projects and may add support here.
Well, I'm no coder, but I did manage to work out a rudimentary way to get it working with faster-whisper. It works on pretty much the same principle, except that faster-whisper doesn't return a dictionary like normal whisper does; it returns a tuple containing a segment generator and an audio info object. So you just need to add the import "from faster_whisper import WhisperModel", and modify result to take index 0 of the tuple, which is the generator:
predicted_text = result[0]
result_queue.put_nowait(predicted_text)
And then when it comes time to print it out, iterate over it to get the text field specifically:
for segment in result_queue.get():
    finished_text = segment.text
    print(finished_text)
And that's really all it takes to get it going. It seems to work well, but I imagine you might have a more sophisticated solution. If not, maybe I could try to bang it together and do my first ever pull request :)
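For readers who want to try the idea without a model installed, here is a minimal sketch of the flow described above. FakeModel, Segment and Info are hypothetical stand-ins that only mimic the shape of the return value (a segment generator plus an info object); they are not part of faster-whisper, and the real objects carry more fields.

```python
import queue
from collections import namedtuple

# Hypothetical stand-ins that mimic the shape of the return value described
# above; they are not part of faster-whisper.
Segment = namedtuple("Segment", ["text"])
Info = namedtuple("Info", ["language"])


class FakeModel:
    def transcribe(self, audio):
        def segments():
            yield Segment(" hello")
            yield Segment(" world")

        # A tuple: (segment generator, info object).
        return segments(), Info("en")


result_queue = queue.Queue()
result = FakeModel().transcribe("audio.wav")

# Index 0 of the tuple is the segment generator; index 1 is the info object.
result_queue.put_nowait(result[0])

# The consumer iterates over the generator and reads each segment's text.
texts = [segment.text for segment in result_queue.get()]
print("".join(texts))
```

Since a generator can only be consumed once, whichever side takes it off the queue has to be the one that iterates over it.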
from whisper_mic.
I have tried crudely hammering this into the script, but I cannot for the life of me get this function working:
def transcribe_forever(audio_queue, result_queue, audio_model, english, verbose, save_file):
    while True:
        audio_data = audio_queue.get()
        if english:
            #result = audio_model.transcribe(audio_data,language='english')
            result, _ = audio_model.transcribe(audio_data,language='english')
        else:
            #result = audio_model.transcribe(audio_data)
            result, _ = audio_model.transcribe(audio_data)
        if not verbose:
            predicted_text = result[0]
            result_queue.put_nowait("You said: " + predicted_text)
        else:
            result_queue.put_nowait(result)
        if save_file:
            os.remove(audio_data)
It always dies at result, _ = audio_model.transcribe(audio_data)
I have no clue what this magic syntax from the faster-whisper documentation is supposed to be: , _
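On the ", _" question: this is ordinary Python tuple unpacking. When a function returns two values, "a, b = f()" assigns one to each name, and "_" is just a conventional name for a value you do not intend to use. A small sketch, where transcribe_stub is a hypothetical stand-in and not the real transcribe call:

```python
def transcribe_stub():
    # Hypothetical stand-in that returns two values, like (segments, info).
    return ["segment one", "segment two"], {"language": "en"}


# Unpack both values into separate names.
segments, info = transcribe_stub()

# Unpack, but discard the second value by binding it to the throwaway name `_`.
result, _ = transcribe_stub()

# Equivalent to taking index 0 of the returned tuple.
also_result = transcribe_stub()[0]
```

Note that after unpacking, result is already the first element of the tuple. With faster-whisper that first element is the segment generator, which has to be iterated rather than indexed, because generators do not support result[0].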
I actually heard about faster-whisper recently. I plan on looking into it closer to other projects and may add support here.
@mallorbc > I actually heard about faster-whisper recently. I plan on looking into it closer to other projects and may add support here.
Well, I'm no coder, but I did manage to work out a rudimentary way to get it working with faster-whisper. It works on pretty much the same principle, except that faster-whisper doesn't return a dictionary like normal whisper does; it returns a tuple containing a segment generator and an audio info object. So you just need to add the import "from faster_whisper import WhisperModel", and modify result to take index 0 of the tuple, which is the generator:
predicted_text = result[0]
result_queue.put_nowait(predicted_text)
And then when it comes time to print it out, iterate over it to get the text field specifically:
for segment in result_queue.get():
    finished_text = segment.text
    print(finished_text)
And that's really all it takes to get it going. It seems to work well, but I imagine you might have a more sophisticated solution. If not, maybe I could try to bang it together and do my first ever pull request :)
Hey. Did you make a pull request for this? I was really hoping this would work with faster-whisper, which is definitely much better than the standard Whisper (or maybe with WhisperJAX).
I'm trying to make a Whisper + LLM + Bark bot, so this is an awesome repo, just what I was looking for. Thanks, mallorbc!
The problem with faster-whisper is that the model actually runs while you iterate over the segments, so you can't leave out the iteration:
segments, info = audio_model.transcribe(audio_data, without_timestamps=True)
result = ""
for segment in segments:
    result += segment.text
without_timestamps=True is much faster than the default.
The second return value (info here, _ in the earlier snippet) holds some information you can work with, for example:
print("Detected language '%s' with probability %f" % (info.language, info.language_probability))
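The point about iteration can be illustrated with a plain Python generator. In this sketch, lazy_segments is a hypothetical stand-in for the segment generator, and the calls list only exists to record when work happens; it assumes nothing about faster-whisper beyond the lazy behaviour described above.

```python
calls = []


def lazy_segments():
    # Hypothetical stand-in for the segment generator: each step does "work".
    for text in (" first", " second"):
        calls.append(text)  # records the moment the work actually happens
        yield text


segments = lazy_segments()
work_before = len(calls)  # creating the generator has not run anything yet

result = ""
for text in segments:
    result += text  # the work happens here, one segment at a time

work_after = len(calls)
print(work_before, work_after, repr(result))
```

This is why putting the generator on a queue and never looping over it produces no text: the decoding is deferred until something consumes the segments.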
The main issue was that faster-whisper doesn't want to be passed a Tensor. Got it working, way better performance.
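The comment does not say what replaced the Tensor. One plausible approach, offered as an assumption rather than a description of the merged change, is to pass a plain NumPy float32 array, which faster-whisper's transcribe accepts as far as I know. The sketch below assumes the microphone audio arrives as raw 16-bit PCM bytes; pcm16_to_float32 is a hypothetical helper name.

```python
import numpy as np


def pcm16_to_float32(raw_bytes: bytes) -> np.ndarray:
    """Convert raw 16-bit PCM bytes to a float32 array scaled to [-1.0, 1.0)."""
    samples = np.frombuffer(raw_bytes, dtype=np.int16)
    return samples.astype(np.float32) / 32768.0


# Three example samples: silence, half scale, and the most negative value.
raw = np.array([0, 16384, -32768], dtype=np.int16).tobytes()
audio = pcm16_to_float32(raw)
print(audio.dtype, audio.tolist())
```

The resulting array would then be passed to transcribe directly, without wrapping it in torch.from_numpy.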
I'm going to close this issue since it has now been merged into main.
Thanks for the PRs everybody!