Integrate OpenAI Whisper API as Additional Transcription Backend (Issue #137) #146

agentmarketbot · 2025-01-26T16:02:23Z

Pull Request Description

Title: Support Additional Whisper Services - OpenAI Whisper API Integration

Overview:

This pull request addresses Issue #137 by integrating the OpenAI Whisper API as an additional transcription backend for the Telegram bot, alongside the existing AWS Whisper service. The goal is to let users choose the backend that best fits their performance and cost requirements.

Summary of Changes Implemented:

New OpenAI Whisper API Support:
- Introduced a new class called OpenAITranscriptionService responsible for handling audio transcription requests using the OpenAI Whisper API.
Transcription Service Architecture:
- Created an abstract base class, TranscriptionService, which defines the essential functionality that all transcription services must implement. This lays the groundwork for any future integrations.
- Refactored the existing AWS transcription logic into a dedicated class AWSTranscriptionService, improving code organization and readability.
Configuration Updates:
- Updated the config.py file to include a configuration option for the OpenAI API key. A new environment variable, TRANSCRIPTION_SERVICE, has been added to enable users to select between the AWS and OpenAI transcription services based on their preferences.
Bot Handlers Adaptation:
- Modified the bot_handlers.py file to accommodate the new architecture, allowing the bot to determine which transcription service to instantiate at runtime. Added appropriate logic to conditionally instantiate either the AWS or OpenAI service based on the current configuration settings.
Instructions for Users:
- To use the OpenAI service, set the OPENAI_API_KEY environment variable.
- To switch the bot's transcription service to OpenAI's Whisper API, set TRANSCRIPTION_SERVICE=openai. If the variable is not set, the bot defaults to aws.
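
The architecture described above could be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: the class names and the two environment variables come from the description, while the `transcribe` method name, the `create_transcription_service` factory function, the case-insensitive handling of the variable, and the placeholder return values are assumptions made for the example. The real services would call AWS and the OpenAI Whisper API where the placeholders are.

```python
import os
from abc import ABC, abstractmethod


class TranscriptionService(ABC):
    """Interface every transcription backend is expected to implement."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript for the audio file at audio_path."""


class AWSTranscriptionService(TranscriptionService):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the real class would run the existing AWS logic here.
        return f"[aws] transcript of {audio_path}"


class OpenAITranscriptionService(TranscriptionService):
    def __init__(self, api_key: str) -> None:
        # Fail early if the key is missing (an assumption of this sketch).
        if not api_key:
            raise ValueError(
                "OPENAI_API_KEY must be set when TRANSCRIPTION_SERVICE=openai"
            )
        self.api_key = api_key

    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the real class would send the file to the
        # OpenAI Whisper API using self.api_key.
        return f"[openai] transcript of {audio_path}"


def create_transcription_service(env=None) -> TranscriptionService:
    """Hypothetical factory: pick a backend from TRANSCRIPTION_SERVICE."""
    env = os.environ if env is None else env
    # Default to "aws" when the variable is not set, as the PR describes.
    name = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if name == "aws":
        return AWSTranscriptionService()
    if name == "openai":
        return OpenAITranscriptionService(env.get("OPENAI_API_KEY", ""))
    raise ValueError(f"Unknown TRANSCRIPTION_SERVICE: {name!r}")
```

With a factory of this shape, a handler would only call `create_transcription_service().transcribe(path)` and would not need to know which backend is active; adding a third backend means adding one subclass and one branch in the factory.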

Impact:

These changes let users choose between two transcription backends while maintaining backward compatibility: deployments that do not set TRANSCRIPTION_SERVICE continue to use the AWS service as before.

Next Steps:

- Review and test the integration to ensure compatibility with the existing features.
- Provide feedback or raise any concerns regarding the implementation.

If you have any questions or require further assistance, please feel free to reach out! Thank you for reviewing this pull request.

- Implement OpenAI Whisper as an alternative to AWS transcription
- Create abstract TranscriptionService base class for flexibility
- Add configuration option to switch between AWS and OpenAI services
- Update bot handlers to use new transcription service factory
- Add OPENAI_API_KEY and TRANSCRIPTION_SERVICE config variables
- Improve error logging with service-specific messages
- Update logging to handle case when summary is None

vadanrod14 closed this Jan 27, 2025
