AI Voice Connector - Community Edition - Flavors

A "flavor" is an individual AI engine, or a combination of services/engines, that together power an end-to-end, speech-to-speech application. Each flavor is a distinct configuration of engines and services for real-time voice interaction. The currently supported flavors, and the engines and services each one integrates, are described below.

Flavors

Deepgram

The Deepgram flavor uses Deepgram's Speech-to-Text API to transcribe the SIP user's input into text, then sends that text to the ChatGPT API to interpret it and obtain a response. The response is then passed to Deepgram's Text-to-Speech API, and the resulting audio is played back to the user. You can read more about this flavor here.
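The three stages described above can be pictured as a simple chain. The following is only a minimal sketch of that flow, not the connector's actual code: `handle_turn` and its three callable parameters are hypothetical stand-ins for the Speech-to-Text, ChatGPT, and Text-to-Speech calls, and no real Deepgram or OpenAI API is used here.

```python
from typing import Callable


def handle_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],   # hypothetical stand-in for Speech-to-Text
    chat: Callable[[str], str],           # hypothetical stand-in for the ChatGPT call
    synthesize: Callable[[str], bytes],   # hypothetical stand-in for Text-to-Speech
) -> bytes:
    """Run one user utterance through the three stages and return reply audio."""
    text = transcribe(audio_in)   # caller audio -> text
    reply = chat(text)            # text -> response text
    return synthesize(reply)      # response text -> audio to play back
```

Passing the stages in as callables keeps the sketch independent of any particular SDK; a real deployment would stream audio rather than process one complete buffer per turn.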

OpenAI

The OpenAI flavor hooks directly into OpenAI's Realtime API, which provides direct Speech-to-Speech interpretation of the user's conversation. Find out more about the OpenAI flavor here.

Flavor Selection

For every new call, the engine needs to select an AI flavor to use. The selection is based on the user part of the SIP To header, using the following logic:

  1. For each flavor defined as a section in the configuration file, the user is checked against that section's match node. If it matches, the corresponding section is used. Flavors are considered in the order in which they appear in the configuration file, so the first match wins.
  2. If nothing matches in the previous step, the engine checks whether the user matches the name of a flavor (in lowercase).
  3. If the name does not match either, the flavor is selected by hashing the user value. This ensures that the same user is always mapped to the same engine.

Note that at every step, a flavor that is disabled in the configuration file is skipped: its settings are completely ignored in the selection process.
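The selection steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the connector's actual implementation: the function name `select_flavor`, the dictionary layout, the `match` and `disabled` key names, the treatment of `match` as a regular expression, and the use of SHA-256 for the hash step are all assumptions made for the example.

```python
import hashlib
import re
from typing import Optional


def select_flavor(user: str, flavors: dict) -> Optional[str]:
    """Pick a flavor name for the SIP To user.

    `flavors` maps a flavor name to its settings, in configuration-file order
    (Python dicts preserve insertion order). The optional keys "match" and
    "disabled" are assumed names for this sketch.
    """
    # Disabled flavors are ignored in every step.
    enabled = {n: c for n, c in flavors.items() if not c.get("disabled", False)}
    if not enabled:
        return None

    # Step 1: first section whose match pattern accepts the user wins.
    for name, cfg in enabled.items():
        pattern = cfg.get("match")
        if pattern and re.match(pattern, user):  # assumption: match is a regex
            return name

    # Step 2: the user equals a flavor name in lowercase.
    for name in enabled:
        if user == name.lower():
            return name

    # Step 3: hash the user so the same user always maps to the same flavor.
    # Assumption: SHA-256; the built-in hash() is avoided because it is
    # randomized per process for strings.
    names = sorted(enabled)
    digest = hashlib.sha256(user.encode("utf-8")).digest()
    return names[int.from_bytes(digest[:8], "big") % len(names)]
```

For example, with `{"deepgram": {"match": "^sales"}, "openai": {}}`, the user `sales-bot` is routed to `deepgram` by step 1, the user `openai` is routed to `openai` by step 2, and any other user falls through to the hash in step 3 and gets the same flavor on every call.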