Add Support for OpenAI Whisper API as an Additional Transcription Backend (Issue #137) #141

agentmarketbot · 2025-01-26T15:23:43Z

Pull Request Description

Title: Support Additional Whisper Services: Integration of OpenAI Whisper API

Related Issue: #137

Summary:

This pull request adds OpenAI's Whisper API as an alternative transcription service alongside the existing AWS Transcribe functionality in our Telegram bot. The change gives users more flexibility and performance options, and potentially lower costs, depending on their needs.

Changes Made:

OpenAI Whisper API Support:
- Implemented a new class, OpenAITranscriber, located in services.py. This class encapsulates the logic for audio transcription using the OpenAI Whisper API.
- Updated the AudioTranscriber class to accommodate both AWS Transcribe and OpenAI Whisper, allowing selection based on configuration.
Configuration Updates:
- Modified config.py to introduce new settings for the OpenAI API key.
- Added a new environment variable, TRANSCRIPTION_SERVICE, enabling users to specify their desired transcription service (either 'aws' or 'openai').
Bot Handlers Modification:
- Adjusted bot_handlers.py to use the configured transcription service, so the bot selects AWS or OpenAI for voice message transcription according to the TRANSCRIPTION_SERVICE setting.
Dependency Management:
- Updated pyproject.toml to include the openai package as a dependency. Installing it through Poetry initially failed because of permission issues, so the package was installed with pip instead.
Documentation Enhancements:
- Modified the README.md file to incorporate detailed instructions on the new features and configuration options, specifically guiding users on setting up and using the OpenAI Whisper API.
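
As a rough illustration of the service selection described above, here is a minimal sketch. It is not the code from services.py: the `Transcriber` protocol, the `AWSTranscriber` placeholder, and the `make_transcriber` factory are hypothetical names introduced for this example, and the PR itself routes selection through `AudioTranscriber`. The body of `OpenAITranscriber` assumes the v1 `openai` Python client and the `whisper-1` model.

```python
from typing import Optional, Protocol


class Transcriber(Protocol):
    """Anything that can turn an audio file into text (hypothetical interface)."""

    def transcribe(self, audio_path: str) -> str: ...


class OpenAITranscriber:
    """Sketch of a Whisper API backend; the real class in services.py may differ."""

    def __init__(self, api_key: str, model: str = "whisper-1") -> None:
        if not api_key:
            raise ValueError("OpenAI backend selected but no API key was provided")
        self.api_key = api_key
        self.model = model

    def transcribe(self, audio_path: str) -> str:
        # Imported lazily so this module still loads when only AWS is in use.
        from openai import OpenAI

        client = OpenAI(api_key=self.api_key)
        with open(audio_path, "rb") as audio_file:
            result = client.audio.transcriptions.create(
                model=self.model, file=audio_file
            )
        return result.text


class AWSTranscriber:
    """Placeholder for the existing AWS Transcribe logic, which is not shown here."""

    def transcribe(self, audio_path: str) -> str:
        raise NotImplementedError("stand-in for the existing AWS code path")


def make_transcriber(service: str, openai_api_key: Optional[str] = None) -> Transcriber:
    """Pick a backend from the TRANSCRIPTION_SERVICE value ('aws' or 'openai')."""
    name = service.strip().lower()
    if name == "aws":
        return AWSTranscriber()
    if name == "openai":
        return OpenAITranscriber(api_key=openai_api_key or "")
    raise ValueError(f"Unknown transcription service: {service!r}")
```

Keeping both backends behind one `transcribe` method means the bot handlers only need to call the selected object and never branch on the service name themselves.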

Conclusion:

With this update, users of the grouplang-secretary-bot can easily switch between the AWS Transcribe service and the OpenAI Whisper API. This is done by adjusting the TRANSCRIPTION_SERVICE environment variable to either 'aws' or 'openai', along with providing the necessary credentials for the selected service.
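
The configuration step could look roughly like the following sketch. Only the `TRANSCRIPTION_SERVICE` variable and its two values come from this PR; the `OPENAI_API_KEY` variable name, the `load_transcription_settings` helper, and the default of `'aws'` when the variable is unset are assumptions made for illustration (the default is chosen here to match the stated goal of backward compatibility).

```python
import os
from typing import Mapping, Optional, Tuple

VALID_SERVICES = ("aws", "openai")


def load_transcription_settings(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (service, openai_api_key) read from environment-style settings."""
    env = os.environ if env is None else env

    # Assumption: fall back to 'aws' so existing deployments keep working.
    service = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if service not in VALID_SERVICES:
        raise ValueError(
            f"TRANSCRIPTION_SERVICE must be one of {VALID_SERVICES}, got {service!r}"
        )

    # Assumption: the key is read from OPENAI_API_KEY; the PR does not name it.
    api_key = env.get("OPENAI_API_KEY") or None
    if service == "openai" and api_key is None:
        raise ValueError("OPENAI_API_KEY is required when using the 'openai' service")

    return service, api_key
```

Failing at startup when the selected service lacks credentials surfaces a misconfiguration before the first voice message arrives, rather than during transcription.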

We believe this integration will significantly enhance the user experience by allowing more tailored voice transcription solutions.

If there are any questions or further clarifications needed, please don't hesitate to reach out!

Thank you for considering this pull request!

Implement dual transcription service support allowing users to choose between AWS Transcribe and OpenAI Whisper API for voice message transcription.

Changes include:
- Add OpenAITranscriber class for Whisper API integration
- Refactor AudioTranscriber to support multiple services
- Update configuration to include OpenAI API key and service selection
- Add new environment variable TRANSCRIPTION_SERVICE
- Update documentation with setup instructions for both services
- Add openai package dependency
- Update API documentation references

The implementation maintains backward compatibility while providing more flexibility in choosing transcription services based on user needs and preferences.

vadanrod14 closed this Jan 27, 2025
