Delays when transcribing streaming audio #58

scott-vector · 2022-11-10T00:36:12Z

First of all, excellent work. Vosk is great as it is, and this library makes it even better.

I am experiencing a heavy delay in transcription (both partial and full results) when pulling in a stream from WebRTC.

I suspect maybe it is because of the deprecated "createScriptProcessor" and "onaudioprocess" pieces, but I am unsure.

Here is how I am processing things. If you have any ideas as to why things would be delayed, please let me know. Thank you.

this.recognizeSpeech = async () => {
  console.log("starting recognizeSpeech");
  let audioContext = this.remoteAudioContext;
  let remoteStream = this.incomingAudioStream;

  const recognizerNode = audioContext.createScriptProcessor(4096, 1, 1);
  const model = await createModel("./softphone/model.tar.gz");
  const recognizer = new model.KaldiRecognizer(48000);
  recognizer.setWords(true);
  recognizer.on("partialresult", function (message) {
    console.log("PARTIAL: " + message.result.partial);
  });
  recognizerNode.onaudioprocess = async (event) => {
    try {
      recognizer.acceptWaveform(event.inputBuffer);
    } catch (error) {
      console.error("acceptWaveform failed", error);
    }
  };
  this.remoteTrack.connect(recognizerNode).connect(audioContext.destination);
};

ccoreilly · 2022-11-12T08:38:31Z

Hi @scott-vector, interesting use case!
Please take into account that transcription speed depends on the CPU of the user as it is running locally. I have experienced low speed on an Android phone vs a Macbook.
You might try using AudioWorklets (https://github.com/ccoreilly/vosk-browser/tree/master/examples/modern-vanilla) to see if it helps. Otherwise it'd be interesting to analyze where the latencies are coming from.
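
One way to start that analysis is to compare how much audio each callback delivers with how long the recognizer takes to consume it. The following is only a rough sketch in plain JavaScript: the helper names `chunkDurationMs` and `estimateBacklogMs` are hypothetical and not part of vosk-browser, it assumes chunks are processed one at a time at a constant cost, and the 120 ms processing time is an invented figure for illustration, not a measurement.

```javascript
// Hypothetical helpers for a rough latency budget; not part of vosk-browser.

// Duration of audio carried by one chunk, in milliseconds.
function chunkDurationMs(frames, sampleRate) {
  return (frames / sampleRate) * 1000;
}

// Estimated lag after `chunks` chunks, assuming sequential processing and a
// constant per-chunk processing time. Zero when the recognizer keeps up.
function estimateBacklogMs(frames, sampleRate, processingMsPerChunk, chunks) {
  const perChunkDeficit =
    processingMsPerChunk - chunkDurationMs(frames, sampleRate);
  return Math.max(0, perChunkDeficit) * chunks;
}

// Values from the snippet in this thread: 4096-frame buffer at 48 kHz.
console.log(chunkDurationMs(4096, 48000).toFixed(1)); // 85.3

// Assumed figure for illustration only: 120 ms of processing per chunk.
console.log(estimateBacklogMs(4096, 48000, 120, 100).toFixed(0)); // 3467
```

If the measured per-chunk processing time on the target device stays below the chunk duration, the delay is more likely to come from buffering or the audio graph than from recognition speed.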

scott-vector · 2022-11-12T22:33:17Z

Yeah I was going to convert to that style. Do you have any interest in that?

ccoreilly · 2022-11-13T10:01:13Z

Generally speaking, yes, it'd be nice to integrate AudioWorklets seamlessly in the library but still allow developers to use the underlying API directly.
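
As a rough illustration of what an AudioWorklet-based capture path might involve, the sketch below collects the small blocks an AudioWorklet receives into larger chunks and posts them out of the audio thread. The names `ChunkAccumulator`, `ChunkForwarder`, and `"chunk-forwarder"` are hypothetical, and the 4096-frame chunk size simply mirrors the ScriptProcessor buffer used earlier in this thread. How the chunks are then handed to the recognizer depends on the library's API and is deliberately left out.

```javascript
// Hypothetical sketch; ChunkAccumulator, ChunkForwarder and
// "chunk-forwarder" are illustrative names, not part of vosk-browser.

// Collects incoming sample blocks into fixed-size chunks.
class ChunkAccumulator {
  constructor(chunkFrames) {
    this.buffer = new Float32Array(chunkFrames);
    this.filled = 0;
  }

  // Appends samples and returns an array of completed chunks (possibly empty).
  push(samples) {
    const completed = [];
    let offset = 0;
    while (offset < samples.length) {
      const n = Math.min(
        samples.length - offset,
        this.buffer.length - this.filled
      );
      this.buffer.set(samples.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.buffer.length) {
        completed.push(this.buffer.slice()); // copy so the buffer can be reused
        this.filled = 0;
      }
    }
    return completed;
  }
}

// The processor is only registered inside an AudioWorkletGlobalScope.
if (typeof AudioWorkletProcessor !== "undefined") {
  class ChunkForwarder extends AudioWorkletProcessor {
    constructor() {
      super();
      this.accumulator = new ChunkAccumulator(4096);
    }

    process(inputs) {
      const channel = inputs[0] && inputs[0][0]; // first input, first channel
      if (channel) {
        for (const chunk of this.accumulator.push(channel)) {
          // Transfer the chunk's buffer instead of copying it.
          this.port.postMessage(chunk, [chunk.buffer]);
        }
      }
      return true; // keep the processor alive
    }
  }
  registerProcessor("chunk-forwarder", ChunkForwarder);
}
```

Keeping the accumulation logic in a plain class makes it possible to exercise it outside the browser, while the processor itself stays a thin wrapper.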

korabelnikov · 2024-02-19T12:02:17Z

@scott-vector Hey, have you managed to fix the delays?
