Integrate OpenAI Whisper API as Additional Transcription Backend (Issue #137) #146
Pull Request Description
Title: Support Additional Whisper Services - OpenAI Whisper API Integration
Overview:
This pull request addresses Issue #137 by integrating the OpenAI Whisper API into the Telegram bot as an additional transcription option alongside the existing AWS Whisper service. The aim is to improve flexibility, performance, and cost-effectiveness by letting users choose the transcription backend that best suits their needs.
Summary of Changes Implemented:
New OpenAI Whisper API Support:
- OpenAITranscriptionService: a new class responsible for handling audio transcription requests using the OpenAI Whisper API.
Transcription Service Architecture:
- TranscriptionService: defines the essential functionality that all transcription services must implement. This lays the groundwork for any future integrations.
- AWSTranscriptionService: holds the existing AWS transcription logic, improving code organization and readability.
Configuration Updates:
- config.py: now includes a configuration option for the OpenAI API key. A new environment variable, TRANSCRIPTION_SERVICE, lets users select between the AWS and OpenAI transcription services.
Bot Handlers Adaptation:
- bot_handlers.py: updated to accommodate the new architecture. The bot determines at runtime which transcription service to instantiate, creating either the AWS or the OpenAI service based on the current configuration settings.
Instructions for Users:
- To use the OpenAI Whisper API, set the OPENAI_API_KEY environment variable.
- Set TRANSCRIPTION_SERVICE=openai in the configuration. The default option remains aws if the variable is not set.
Impact:
These changes make the bot's transcription backend selectable while maintaining backward compatibility with the existing functionality. Users can now choose the service that fits their needs, which should improve the overall user experience.
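The runtime selection described in the summary could look roughly like the sketch below. It is not the code from this pull request: the class names, TRANSCRIPTION_SERVICE, and OPENAI_API_KEY come from the description, while the transcribe method, the create_transcription_service helper, and the error messages are assumptions made for illustration. The actual AWS and OpenAI calls are left out.

```python
import os
from abc import ABC, abstractmethod
from typing import Mapping, Optional


class TranscriptionService(ABC):
    """Common interface for transcription backends (method name is assumed)."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript of the audio file at audio_path."""


class AWSTranscriptionService(TranscriptionService):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the real class would call the existing AWS Whisper service.
        raise NotImplementedError("AWS call omitted in this sketch")


class OpenAITranscriptionService(TranscriptionService):
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the real class would call the OpenAI Whisper API.
        raise NotImplementedError("OpenAI call omitted in this sketch")


def create_transcription_service(
    env: Optional[Mapping[str, str]] = None,
) -> TranscriptionService:
    """Hypothetical factory: pick a backend from TRANSCRIPTION_SERVICE."""
    env = os.environ if env is None else env
    # The description states that aws remains the default when the variable is unset.
    name = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if name == "aws":
        return AWSTranscriptionService()
    if name == "openai":
        api_key = env.get("OPENAI_API_KEY")
        if not api_key:
            raise ValueError(
                "OPENAI_API_KEY must be set when TRANSCRIPTION_SERVICE=openai"
            )
        return OpenAITranscriptionService(api_key)
    raise ValueError(f"Unknown TRANSCRIPTION_SERVICE: {name!r}")
```

Under these assumptions, a handler would call create_transcription_service() once and then use only the TranscriptionService interface, so adding a further backend means adding one class and one branch in the factory.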
Next Steps:
If you have any questions or require further assistance, please feel free to reach out! Thank you for reviewing this pull request.