A tiny vision language model that runs anywhere
Website | Hugging Face | Demo | Moondream Github
Using Whisper speech-to-text:
pip install -r whisper_requirements.txt
python whisper_client.py
Or use Google Speech Recognition:
pip install -r google_requirements.txt
python google_client.py
app_url="https://460c1d4fa3515c02dd.gradio.live/"
If you don't want to use VS Code because of low RAM,
first set app_url
in your client,
then
modify the path in run.py,
and finally click on run.bat.
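Since the client depends on app_url being set correctly, a quick sanity check before starting it can save a confusing connection error. The sketch below is only an illustration: normalize_app_url is a hypothetical helper, not something shipped in this repository, and it assumes the client simply expects an http(s) URL ending in a slash, like the example above.

```python
from urllib.parse import urlparse


def normalize_app_url(raw: str) -> str:
    """Clean up a pasted app URL, or raise ValueError if it is not a URL.

    Hypothetical helper: the real clients may read app_url differently.
    """
    # Drop surrounding whitespace and any quotes copied along with the value.
    url = raw.strip().strip("\"'")
    parsed = urlparse(url)
    # Require an explicit http(s) scheme and a host name.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"app_url does not look like a valid URL: {raw!r}")
    # Keep exactly one trailing slash, matching the example URL above.
    return url.rstrip("/") + "/"


if __name__ == "__main__":
    app_url = normalize_app_url("https://460c1d4fa3515c02dd.gradio.live/")
    print(app_url)
```

The check only looks at the shape of the URL. It does not confirm that the Gradio app behind it is still running.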