.Net: New Feature: Add support for GPT-4o Real Time endpoint #9075

Open
rboen opened this issue Oct 3, 2024 · 3 comments
Labels
ai connector: Anything related to AI connectors
Build: Features planned for next Build conference
.NET: Issue or Pull requests regarding .NET code

Comments

@rboen

rboen commented Oct 3, 2024

Low-latency conversational interaction using speech is an impressive enhancement and a game changer for audio chatbots. With the emergence of gpt-4o-realtime-preview in Azure OpenAI, I'd love to see an integration with Semantic Kernel to facilitate agents and skills / plugins in call-agent scenarios.

Please have a look at https://github.com/azure-samples/aoai-realtime-audio-sdk

@RogerBarreto
Member

@rboen Thanks for the ask.

We will keep track of this feature and investigate how to bring it to SK as a Speech-to-Speech streaming abstraction. For now, while we don't have this abstraction in place, our suggestion is to use our current APIs with the break-glass option (providing either the OpenAIClient or AzureOpenAIClient directly) and to consume the RealtimeConversationClient directly.

@RogerBarreto RogerBarreto moved this to Backlog: Planned in Semantic Kernel Oct 3, 2024
@RogerBarreto RogerBarreto added .NET Issue or Pull requests regarding .NET code ai connector Anything related to AI connectors labels Oct 3, 2024
@github-actions github-actions bot changed the title New Feature: Add support for GPT-4o Real Time endpoint .Net: New Feature: Add support for GPT-4o Real Time endpoint Oct 3, 2024
@joslat
Contributor

joslat commented Oct 7, 2024

@RogerBarreto this can be easily implemented, even though the .NET C# code is a bit obfuscated (I tried to improve this here: Azure-Samples/aoai-realtime-audio-sdk#28). It would be great to use this in the context of an Agent / Assistant where I can provide tools and a nice metaprompt.

Using it as the "UserProxy Agent" basically ;)
Otherwise we can just provide the initial prompt and that's all. We can also bind some tools here for function calling, but the current way to do this is hyper-counter-intuitive :( - see here https://github.com/joslat/aoai-realtime-audio-sdk/blob/main/dotnet/samples/console-from-file/Program.cs (lines 28, 238 and 109...)

@jerry2007

When will this be implemented? This could be a game changer...

Projects
Status: Backlog: Planned
Development

No branches or pull requests

5 participants