Comments (4)

syl22-00 commented on June 16, 2024

I think download time is probably not a big deal: if you use recognizer.js, the download happens in a web worker, and the file is cached, so it is only downloaded once.

However, browsers might not handle very large JavaScript files. I do not know; you would have to try.

A possible alternative is to use HTML5 storage instead of compiling the model files into the JavaScript. For that, you'd need to look at Emscripten's documentation:
https://github.com/kripken/emscripten/wiki/Filesystem-Guide
https://github.com/kripken/emscripten/wiki/Filesystem-API
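
To illustrate the "download once, then reuse" idea behind caching or browser storage, here is a minimal sketch. It is not Emscripten's API: `loadModelOnce`, the `store` object, the `download` callback, and the file name are all hypothetical, and the code is synchronous for brevity, whereas real browser storage such as IndexedDB is asynchronous.

```javascript
// Hypothetical helper (not part of pocketsphinx.js or Emscripten):
// return the model data from `store` if it is already there,
// otherwise call `download` once and remember the result.
function loadModelOnce(store, key, download) {
  if (store.has(key)) {
    return store.get(key); // already stored: no download needed
  }
  const data = download(key); // first request: fetch the model
  store.set(key, data);
  return data;
}

// Usage with an in-memory Map standing in for persistent storage.
const store = new Map();
let downloads = 0;
const fakeDownload = (key) => {
  downloads += 1;
  return `bytes of ${key}`; // placeholder for the real model bytes
};

loadModelOnce(store, "acoustic-model.bin", fakeDownload);
loadModelOnce(store, "acoustic-model.bin", fakeDownload);
console.log(downloads); // the second call is served from the store
```

With persistent storage in place of the `Map`, the model would survive page reloads instead of being compiled into the JavaScript file.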

About the basics of speech recognition, you'll find everything on http://cmusphinx.org.

  • Acoustic models: parameters that describe the phonemes (building blocks of words).
  • Statistical language model: a type of language model that assigns probabilities to sequences of words, as opposed to a grammar, which describes the language as a graph.
  • Pronunciation dictionary: gives the mapping between words and phonemes. You can see it as the bridge between acoustic and language models.

For a speech recognition system, you need an acoustic model, a pronunciation dictionary and a language model (either a grammar or a statistical language model).
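
To make the roles of these pieces more concrete, here is a toy sketch. It is not how pocketsphinx.js works internally: the two-word dictionary, the bigram probabilities, and the function names are made up for illustration, and the acoustic model itself is left out.

```javascript
// Toy pronunciation dictionary: word -> phonemes (illustrative entries only).
const dictionary = {
  hello: ["HH", "AH", "L", "OW"],
  world: ["W", "ER", "L", "D"],
};

// Toy statistical (bigram) language model: P(word | previous word).
// The values are invented; <s> and </s> mark sentence start and end.
const bigrams = {
  "<s> hello": 0.6,
  "hello world": 0.5,
  "world </s>": 0.7,
};

// Expand a word sequence into the phoneme sequence that an acoustic
// model would be asked to score.
function toPhonemes(words) {
  return words.flatMap((word) => {
    const phones = dictionary[word];
    if (!phones) throw new Error(`word not in dictionary: ${word}`);
    return phones;
  });
}

// Probability of a sentence under the bigram model
// (0 as soon as one word pair has never been seen).
function sentenceProbability(words) {
  const padded = ["<s>", ...words, "</s>"];
  let p = 1;
  for (let i = 1; i < padded.length; i++) {
    p *= bigrams[`${padded[i - 1]} ${padded[i]}`] ?? 0;
  }
  return p;
}

console.log(toPhonemes(["hello", "world"]));
console.log(sentenceProbability(["hello", "world"]));
console.log(sentenceProbability(["world", "hello"]));
```

The dictionary is what lets the recognizer go from the words the language model proposes to the phonemes the acoustic model can score.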

I hope that helps.

from pocketsphinx.js.

hashimawan commented on June 16, 2024

Thanks for your quick response!

I built pocketsphinx.js with an en_us acoustic model (57 MB), without a language model or dictionary file. It produced a pocketsphinx.js file of 253 MB, which is obviously far too large for a web-based application.
Could you please share links where I can download a generic acoustic model, language model, and dictionary files that I can build into pocketsphinx.js to produce a small JS file? I want to use it on my website to understand general user conversation.

Regarding HTML5 storage, I think I would need to send the audio files to the server (HTML5 storage), which would then transform them into text. Please correct me if I'm wrong.

Thanks,

syl22-00 commented on June 16, 2024

@hashimawan you can find many resources, documentation, and help at http://cmusphinx.org, including acoustic and language models. You'll also find links to http://voxforge.org/, which has more resources as well as acoustic and language models.

For HTML5 storage, please follow the docs I sent you. I don't know more than that, but it would be great if you shared your experience in a wiki entry if you get anything working.

syl22-00 commented on June 16, 2024

@hashimawan you can now package your acoustic model in separate files; see README.md. That will solve your issue.
