stt-bot's Issues
Chat model: "punctuation" property
Let each chat decide whether to include punctuation or not
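A minimal sketch of how a per-chat flag could feed the recognition options. The Chat class, its default value, and the recognition_options helper are hypothetical names used for illustration. It also assumes the Google engine is used, where enable_automatic_punctuation is the corresponding RecognitionConfig field.

```python
from dataclasses import dataclass


@dataclass
class Chat:
    """Hypothetical per-chat settings; the real model may look different."""
    chat_id: int
    punctuation: bool = True  # assumed default, not taken from the project


def recognition_options(chat: Chat) -> dict:
    # Map the chat preference onto the engine option that toggles punctuation.
    return {"enable_automatic_punctuation": chat.punctuation}
```

The returned dict could be merged into whatever configuration the bot already sends to the speech engine.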
Test transcription quality after converting voice to FLAC
ffmpeg conversion: https://github.com/backmeupplz/voicy/blob/32c94d0e4b6114352fc64a31ae367e74ba652d42/helpers/urlToText.js#L87
encoding to pass to google: https://github.com/backmeupplz/voicy/blob/32c94d0e4b6114352fc64a31ae367e74ba652d42/engines/google.js#L231
Google's docs about which encoding to prefer: https://cloud.google.com/speech-to-text/docs/encoding#audio-encodings and https://cloud.google.com/speech-to-text/docs/best-practices
Google docs about optimizing audio files for recognition, with ffmpeg examples: https://cloud.google.com/solutions/media-entertainment/optimizing-audio-files-for-speech-to-text
ffmpeg to convert to linear16: https://medium.com/cod3/convert-speech-from-an-audio-file-to-text-using-google-speech-api-b951f4032a64
Python
ffmpeg commands from python: https://github.com/MarshalX/tgcalls/blob/7e6b5b11877fa39d6959ea429af3c6950e666768/examples/radio_as_smart_plugin.py#L63
Opus to flac conversion
pydub docs: https://github.com/jiaaro/pydub
about opus export (shouldn't be an issue because we would import only): https://github.com/jiaaro/pydub#ogg-exporting-and-default-codecs
voicybot conversion: https://github.com/backmeupplz/voicy/blob/d31f159ee18204587f1a1d73ed8c3d141503d3e3/helpers/urlToText.js#L76
voicybot ffmpeg command: https://github.com/backmeupplz/voicy/blob/master/helpers/flac.js#L18
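One possible way to run the conversion from Python is to call ffmpeg through subprocess. This is a sketch, not the command voicybot uses: the function names are made up, ffmpeg is assumed to be on PATH, and downmixing to mono is an assumption. The sample rate is left untouched unless one is passed explicitly.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_flac_command(src: str, dst: str, sample_rate: Optional[int] = None) -> List[str]:
    """Build an ffmpeg argument list that re-encodes src as FLAC."""
    cmd = ["ffmpeg", "-y", "-i", src, "-ac", "1"]  # -y: overwrite, -ac 1: mono
    if sample_rate is not None:
        cmd += ["-ar", str(sample_rate)]  # resample only when asked to
    cmd += ["-c:a", "flac", dst]
    return cmd


def convert_to_flac(src: str) -> str:
    """Convert an audio file (e.g. a Telegram .oga voice note) to FLAC."""
    dst = str(Path(src).with_suffix(".flac"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_flac_command(src, dst), check=True, capture_output=True)
    return dst
```

Keeping the command builder separate from the subprocess call makes it easy to compare different encodings when testing transcription quality.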
Plotting transcription durations
Export the table to a pandas DataFrame, then see here: https://realpython.com/pandas-plot-python/

Animate "..." while waiting for a result
Edit the "transcribing..." message while we are waiting for a result, and maybe show the elapsed time. Possible solution: spawn a new thread, pass it the message, then join the thread when the transcription is completed.
Or maybe VoiceMessageLocal should yield a "result" object when running long operations. The object should have a "done" property that signals when we are done.
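The thread-based option could look roughly like this. It is a sketch under assumptions: edit_message stands in for whatever call edits the Telegram message, work is the blocking transcription call, and the function name is invented.

```python
import threading
import time


def animate_while_running(edit_message, work, interval=1.0):
    """Run work() while a helper thread keeps editing the status message."""
    done = threading.Event()
    started = time.monotonic()

    def animate():
        dots = 0
        # Event.wait() returns True as soon as done is set, which ends the loop.
        while not done.wait(interval):
            dots = dots % 3 + 1
            elapsed = time.monotonic() - started
            edit_message(f"transcribing{'.' * dots} ({elapsed:.0f}s)")

    animator = threading.Thread(target=animate, daemon=True)
    animator.start()
    try:
        return work()
    finally:
        # Stop the animation and join the thread, even if work() raised.
        done.set()
        animator.join()
```

Using an Event instead of a plain sleep lets the animation thread stop immediately when the transcription finishes.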
New model: TranscriptionRequest
Log how long it takes to transcribe each audio, in seconds as a float (e.g. 6.7).
What the model should track:
audio_duration
elapsed_seconds
success
"success" is optional: we could instead add the model instance to the session only when the transcription is successful.
This model is useful for giving the user an estimated transcription time, based on past transcriptions of audios with a similar length.
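As an illustration of the estimate, here is a plain dataclass version with the three fields listed above. The real model would presumably be a database model. The estimate_seconds helper and its 25% tolerance are assumptions, not part of the issue.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TranscriptionRequest:
    audio_duration: float   # length of the audio, in seconds
    elapsed_seconds: float  # time the transcription took, in seconds
    success: bool = True


def estimate_seconds(history: Iterable[TranscriptionRequest],
                     audio_duration: float,
                     tolerance: float = 0.25) -> Optional[float]:
    """Average elapsed time of past successful requests of similar length."""
    low = audio_duration * (1 - tolerance)
    high = audio_duration * (1 + tolerance)
    similar = [r.elapsed_seconds for r in history
               if r.success and low <= r.audio_duration <= high]
    if not similar:
        return None  # no comparable request yet
    return sum(similar) / len(similar)
```

Returning None when nothing comparable has been logged lets the bot skip the estimate instead of showing a misleading number.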
Catch "Wrong file_id or the file is temporarily unavailable" errors
We should make the bot retry the download (getFile) request in .from_message() when the API raises this error, sleeping a few seconds between attempts. If it keeps happening, raise it.
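The retry could be sketched as below. FileUnavailableError is a stand-in for whatever exception the Telegram library actually raises for this error, download is any callable that performs the getFile request, and the attempt count and delay are arbitrary.

```python
import time


class FileUnavailableError(Exception):
    """Stand-in for the library error carrying the 'Wrong file_id...' message."""


def download_with_retry(download, attempts=3, delay=3.0, sleep=time.sleep):
    """Call download(), retrying after a pause when the file is unavailable."""
    for attempt in range(1, attempts + 1):
        try:
            return download()
        except FileUnavailableError:
            if attempt == attempts:
                raise  # still failing after the last attempt: give up
            sleep(delay)
```

The sleep function is passed in as a parameter only so the helper can be exercised without actually waiting.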
Transcribe video messages