Acoustic and language models

syl22-00 edited this page Sep 22, 2014 · 2 revisions

Pocketsphinx.js ships with a default acoustic model, but any acoustic model compatible with Pocketsphinx can be used. There is no default language model. Grammars can be added at runtime, and statistical language models can be packaged inside pocketsphinx.js and loaded at init time.
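As a rough illustration of adding a grammar at runtime, here is a minimal sketch. It assumes the message format described in the pocketsphinx.js README as I understand it (an `addGrammar` command whose data holds `numStates`, `start`, `end` and a list of `transitions`); check the field names against the version you use. `buildLinearGrammar` and `addGrammarMessage` are hypothetical helpers, not part of the library.

```javascript
// Hypothetical helper: builds a grammar that accepts the given words in order.
// The object shape (numStates/start/end/transitions) is an assumption based on
// the pocketsphinx.js README; verify it against your version.
function buildLinearGrammar(words) {
  if (words.length === 0) {
    throw new Error("A grammar needs at least one word");
  }
  return {
    numStates: words.length + 1,
    start: 0,
    end: words.length,
    // One transition per word, chaining state i to state i + 1.
    transitions: words.map((word, i) => ({ from: i, to: i + 1, word: word })),
  };
}

// Hypothetical helper: wraps the grammar in the message posted to the
// recognizer worker. The command name "addGrammar" is assumed from the README.
function addGrammarMessage(grammar, callbackId) {
  return { command: "addGrammar", data: grammar, callbackId: callbackId };
}
```

With a recognizer worker already initialized, the message would be sent with something like `recognizer.postMessage(addGrammarMessage(buildLinearGrammar(["HELLO", "WORLD"]), id))`. The words used in the grammar presumably need to be in the recognizer's dictionary beforehand.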

Here are a few pointers to tools and resources about acoustic and language models:

1. Acoustic models

In principle, the features extracted by the recognizer should not depend on signal characteristics such as sample rate and resolution (at least as I understand it), but in practice acoustic models do seem to depend on them. The recorder included in pocketsphinx.js produces 16 kHz, 16-bit audio, so you'll probably want to make sure your model was trained on audio in that format.
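One way to check whether a training recording matches that format is to read its WAV header. The sketch below is not part of pocketsphinx.js; `readWavFormat` and `matchesRecorderFormat` are hypothetical helpers, and they assume a canonical PCM WAV file whose `fmt ` chunk directly follows the RIFF header.

```javascript
// Reads format fields from a canonical PCM WAV header (little-endian).
// Assumption: the "fmt " chunk starts at byte 12; files with extra chunks
// before it are rejected by this sketch.
function readWavFormat(buffer) {
  const view = new DataView(buffer);
  const tag = (offset) =>
    String.fromCharCode(
      view.getUint8(offset),
      view.getUint8(offset + 1),
      view.getUint8(offset + 2),
      view.getUint8(offset + 3)
    );
  if (
    buffer.byteLength < 36 ||
    tag(0) !== "RIFF" ||
    tag(8) !== "WAVE" ||
    tag(12) !== "fmt "
  ) {
    throw new Error("Not a canonical RIFF/WAVE file");
  }
  return {
    audioFormat: view.getUint16(20, true), // 1 means uncompressed PCM
    channels: view.getUint16(22, true),
    sampleRate: view.getUint32(24, true),
    bitsPerSample: view.getUint16(34, true),
  };
}

// True when the audio is 16 kHz, 16-bit PCM, the format mentioned above.
function matchesRecorderFormat(fmt) {
  return (
    fmt.audioFormat === 1 &&
    fmt.sampleRate === 16000 &&
    fmt.bitsPerSample === 16
  );
}
```

`readWavFormat` expects an `ArrayBuffer`, for example the result of `await file.arrayBuffer()` on a file selected in the browser.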

a. Default model

By default, pocketsphinx.js ships with and loads a semi-continuous model with 200 senones, trained on the RM1 corpus.

b. Training a model

Refer to the tutorial on CMU Sphinx's website:

http://cmusphinx.sourceforge.net/wiki/tutorialam

c. Existing models

http://www.speech.cs.cmu.edu/sphinx/models/