
Comments (3)

jech commented on July 25, 2024

I'm open to the idea, but I'd need to speak with the people who actually need the feature. In particular, I'd need to understand why they don't use a system-wide speech-to-text system.

I have spoken to visually impaired users of Galene, and they tell me that they use a system-wide screen reader and therefore don't need TTS support in Galene itself; they just need the Galene UI to be accessible (which is apparently the case). Before implementing the feature you request, I need to understand whether hearing impaired users use a system-wide speech-to-text system, and, if they don't, why.

If the issue is that there are no good speech-to-text systems for free OSes, then in my opinion we should work on building one, rather than adding speech-to-text support to every single application.

from galene.

TechnologyClassroom commented on July 25, 2024

Those are good questions.

The technology exists today for free desktop OSes, but it still requires developer-level skills and is not yet user-friendly. The above script could be run in a local terminal on old laptops and connected to the desktop audio instead of the microphone to get local live transcription in near real time. The terminal would need to be always on top and take up enough screen real estate to be useful. Setting up local Whisper models takes some command-line experience, which not everyone has. There is definitely work that could be done to make this process easier, such as GUIs, packaging, and installers. On the mobile front, it is still in the very early stages, and processing power could be an issue.

If you run an event where some attendees may have hearing impairments and you supply all of the technology yourself, the local Whisper system would need to be configured on every desktop machine, and someone would need to show attendees how to start it if and when it is needed. Per-machine configuration scales poorly in this scenario.

Jitsi Meet with Jigasi adds optional transcription, followed by optional translation through LibreTranslate. Transcription would be the first step towards translation.

If the event organizer could get speech-to-text working once on the conferencing system, then all users could benefit, whether they need transcription, prefer subtitles, or are not native speakers of the language. The transcription could be integrated into the chat system or presented in some other intuitive way that does not leave users switching between two windows, trying to balance the window sizes to experience the chat to the fullest extent, waiting for a model to download before they can participate, or being unable to participate on their mobile device.


jech commented on July 25, 2024

Ah-ha, you're thinking of server-side speech-to-text. Yes, that makes more sense.

I think this could be done by writing a separate client that connects to the Galene server, does speech-to-text, and then publishes the resulting text in the chat. This could be run on any computer, which would avoid putting CPU-intensive work on the Galene server.

Please don't hold your breath.
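
The separate-client idea above could be sketched roughly as follows. This is only an illustration of the shape of such a client, not Galene's API and not an existing implementation: `run_captioner`, `ChatLine`, `transcribe`, and `publish` are all hypothetical names. The connection to the Galene server and the speech-to-text model are deliberately left as pluggable callables, since neither is specified in this thread.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ChatLine:
    """One transcribed utterance, ready to be posted to the chat (hypothetical type)."""
    speaker: str
    text: str


def run_captioner(
    audio_chunks: Iterable[tuple[str, bytes]],
    transcribe: Callable[[bytes], str],
    publish: Callable[[ChatLine], None],
) -> int:
    """Transcribe each (speaker, audio) chunk and publish non-empty results.

    audio_chunks: audio received from the conference, one chunk per utterance
                  (assumption: the client can attribute audio to a speaker).
    transcribe:   any speech-to-text function, e.g. a wrapper around a local model.
    publish:      whatever posts a line to the conference chat.

    Returns the number of chat lines published.
    """
    published = 0
    for speaker, chunk in audio_chunks:
        text = transcribe(chunk).strip()
        if not text:
            continue  # silence or noise: nothing worth posting
        publish(ChatLine(speaker=speaker, text=text))
        published += 1
    return published
```

Because the client only depends on the two callables, the CPU-intensive transcription can run on any machine, and the same loop can be exercised with stand-ins (for example a `transcribe` that decodes bytes and a `publish` that appends to a list) before wiring it to a real server.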

