Delays when transcribing streaming audio #58

Open
scott-vector opened this issue Nov 10, 2022 · 4 comments

Comments

@scott-vector

scott-vector commented Nov 10, 2022

First of all, excellent work. Vosk is great as it is, and this library makes it even better.

I am experiencing a heavy delay in transcription (both partial and full results) when pulling in a stream from WebRTC.

I suspect it may be caused by the deprecated createScriptProcessor and onaudioprocess APIs, but I am not sure.

Here is how I am processing the audio. If you have any idea why it would be delayed, please let me know. Thank you.

this.recognizeSpeech = async () => {
  console.log("starting recognizeSpeech");
  const audioContext = this.remoteAudioContext;
  const remoteStream = this.incomingAudioStream; // not used below

  // Deprecated ScriptProcessorNode: 4096 frames per callback, mono in, mono out.
  const recognizerNode = audioContext.createScriptProcessor(4096, 1, 1);
  const model = await createModel("./softphone/model.tar.gz");
  // 48000 is the sample rate (Hz) passed to the recognizer.
  const recognizer = new model.KaldiRecognizer(48000);
  recognizer.setWords(true);
  recognizer.on("partialresult", (message) => {
    console.log("PARTIAL: " + message.result.partial);
  });

  // Hand each captured buffer to the recognizer.
  recognizerNode.onaudioprocess = (event) => {
    try {
      recognizer.acceptWaveform(event.inputBuffer);
    } catch (error) {
      console.error("acceptWaveform failed", error);
    }
  };

  // Route the remote audio through the recognizer node to the output.
  this.remoteTrack.connect(recognizerNode).connect(audioContext.destination);
};
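
For reference, the buffering cost of the ScriptProcessor settings in the snippet above can be estimated with simple arithmetic. The sketch below uses the 4096-frame buffer and 48000 Hz rate from that snippet; `bufferDurationMs` is a hypothetical helper name, and 128 frames is the render quantum an AudioWorklet would process per call.

```javascript
// Duration of the audio held in one processing buffer, in milliseconds.
function bufferDurationMs(frames, sampleRate) {
  return (frames / sampleRate) * 1000;
}

// Values taken from the snippet above.
const scriptProcessorMs = bufferDurationMs(4096, 48000);
// An AudioWorklet is called once per 128-frame render quantum.
const workletQuantumMs = bufferDurationMs(128, 48000);

console.log(scriptProcessorMs.toFixed(1)); // 85.3
console.log(workletQuantumMs.toFixed(1)); // 2.7
```

Each callback therefore carries roughly 85 ms of audio. If the observed delay is much larger than that, the buffer size alone is probably not the main cause.
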
@ccoreilly
Owner

Hi @scott-vector, interesting use case!
Please keep in mind that transcription runs locally, so its speed depends on the user's CPU. I have seen it run noticeably slower on an Android phone than on a MacBook.
You might try using AudioWorklets to see if that helps. Otherwise, it would be interesting to analyze where the latency is coming from.
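
One way to start narrowing this down is to check whether the audio callbacks themselves arrive on schedule. The sketch below is only an illustration: `IntervalMonitor` is a hypothetical helper, not part of this library. The caller passes in timestamps (for example from `performance.now()`), and the helper reports how late each callback was relative to the expected buffer duration. It says nothing about the time spent inside the recognizer.

```javascript
// Tracks how late periodic callbacks arrive compared with their expected interval.
class IntervalMonitor {
  constructor(expectedMs) {
    this.expectedMs = expectedMs; // expected gap between callbacks
    this.lastMs = null; // timestamp of the previous callback
    this.worstLatenessMs = 0; // largest lateness seen so far
  }

  // Call once per callback with a millisecond timestamp.
  // Returns how many ms later than expected this callback arrived
  // (0 for the first call or for callbacks that are on time or early).
  tick(nowMs) {
    let lateness = 0;
    if (this.lastMs !== null) {
      lateness = Math.max(0, nowMs - this.lastMs - this.expectedMs);
      this.worstLatenessMs = Math.max(this.worstLatenessMs, lateness);
    }
    this.lastMs = nowMs;
    return lateness;
  }
}
```

With a 4096-frame buffer at 48000 Hz, the expected interval is about 85 ms, so calling `monitor.tick(performance.now())` at the top of `onaudioprocess` would show whether the main thread is delivering buffers late. If the callbacks are on time, the delay is more likely in the recognition itself.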

@scott-vector
Author

scott-vector commented Nov 12, 2022 via email

@ccoreilly
Owner

Generally speaking, yes, it'd be nice to integrate AudioWorklets seamlessly into the library while still allowing developers to use the underlying API directly.
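
A possible starting point for the AudioWorklet route is sketched below. It is not part of this library: `ChunkAccumulator`, `RecognizerFeedProcessor`, the processor name `recognizer-feed`, and the 2048-frame chunk size are all hypothetical choices for illustration. The accumulator gathers the 128-frame blocks a worklet receives into larger chunks, and the processor posts each completed chunk to the main thread. The processor is only registered when the file is loaded inside an audio worklet scope.

```javascript
// Collects small blocks of samples into fixed-size chunks.
class ChunkAccumulator {
  constructor(chunkSize) {
    this.chunkSize = chunkSize;
    this.buffer = new Float32Array(chunkSize);
    this.filled = 0; // number of samples currently stored in this.buffer
  }

  // Appends samples and returns an array of any chunks completed by this call.
  push(frames) {
    const completed = [];
    let offset = 0;
    while (offset < frames.length) {
      const n = Math.min(frames.length - offset, this.chunkSize - this.filled);
      this.buffer.set(frames.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.chunkSize) {
        completed.push(this.buffer);
        // Start a fresh buffer so the completed one can be handed off safely.
        this.buffer = new Float32Array(this.chunkSize);
        this.filled = 0;
      }
    }
    return completed;
  }
}

// Only defined inside an AudioWorkletGlobalScope, where these globals exist.
if (typeof AudioWorkletProcessor !== "undefined" && typeof registerProcessor === "function") {
  class RecognizerFeedProcessor extends AudioWorkletProcessor {
    constructor() {
      super();
      this.accumulator = new ChunkAccumulator(2048); // assumed chunk size
    }

    process(inputs) {
      const channel = inputs[0] && inputs[0][0]; // first channel of first input
      if (channel) {
        for (const chunk of this.accumulator.push(channel)) {
          // Transfer the underlying buffer instead of copying it.
          this.port.postMessage(chunk, [chunk.buffer]);
        }
      }
      return true; // keep the processor alive
    }
  }
  registerProcessor("recognizer-feed", RecognizerFeedProcessor);
}
```

On the main thread, such a module would be loaded with `audioContext.audioWorklet.addModule(...)` and wrapped in `new AudioWorkletNode(audioContext, "recognizer-feed")`, with the chunks arriving on `node.port.onmessage`. How those chunks are then handed to the recognizer is left open here, since the snippet in this thread only shows `acceptWaveform` being called with an `AudioBuffer`.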

@korabelnikov

@scott-vector Hey, have you managed to fix the delays?
