Add Support for OpenAI Whisper API as an Additional Transcription Backend (Issue #137) #141

agentmarketbot
Pull Request Description

Title: Support Additional Whisper Services: Integration of OpenAI Whisper API

Related Issue: #137


Summary:

This pull request adds OpenAI's Whisper API as an alternative transcription service alongside the existing AWS Transcribe functionality in our Telegram bot. It gives users more flexibility in choosing a backend, with different performance characteristics and potentially lower costs depending on their needs.


Changes Made:

  1. OpenAI Whisper API Support:

    • Implemented a new class, OpenAITranscriber, located in services.py. This class encapsulates the logic for audio transcription using the OpenAI Whisper API.
    • Updated the AudioTranscriber class to accommodate both AWS Transcribe and OpenAI Whisper, allowing selection based on configuration.
  2. Configuration Updates:

    • Modified config.py to introduce new settings for the OpenAI API key.
    • Added a new environment variable, TRANSCRIPTION_SERVICE, enabling users to specify their desired transcription service (either 'aws' or 'openai').
  3. Bot Handlers Modification:

    • Adjusted bot_handlers.py to use the new configurable transcription service. The bot now selects AWS or OpenAI for voice message transcription based on the TRANSCRIPTION_SERVICE setting.
  4. Dependency Management:

    • Updated the pyproject.toml file to include OpenAI as a dependency. Installing it through Poetry initially failed because of permission issues, so the openai package was installed with pip instead.
  5. Documentation Enhancements:

    • Modified the README.md file to incorporate detailed instructions on the new features and configuration options, specifically guiding users on setting up and using the OpenAI Whisper API.
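
For reviewers, the class layout described in the list above could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the constructor signatures, the `Transcriber` protocol, the `backends` mapping, and the default model name `whisper-1` are all hypothetical. The only external call used is `client.audio.transcriptions.create` from the openai Python package (v1 client).

```python
from __future__ import annotations

from typing import Protocol


class Transcriber(Protocol):
    """Hypothetical interface shared by every transcription backend."""

    def transcribe(self, audio_path: str) -> str: ...


class OpenAITranscriber:
    """Sketch of a Whisper API backend; the real class in services.py may differ."""

    def __init__(self, api_key: str, model: str = "whisper-1") -> None:
        self.api_key = api_key
        self.model = model  # assumed default model name

    def transcribe(self, audio_path: str) -> str:
        # Imported lazily so an AWS-only deployment does not need the openai package.
        from openai import OpenAI

        client = OpenAI(api_key=self.api_key)
        with open(audio_path, "rb") as audio_file:
            result = client.audio.transcriptions.create(
                model=self.model, file=audio_file
            )
        return result.text


class AudioTranscriber:
    """Delegates to whichever backend the configuration selects."""

    def __init__(self, backends: dict[str, Transcriber], service: str) -> None:
        if service not in backends:
            raise ValueError(
                f"Unknown transcription service {service!r}; "
                f"expected one of {sorted(backends)}"
            )
        self._backend = backends[service]

    def transcribe(self, audio_path: str) -> str:
        return self._backend.transcribe(audio_path)
```

Keeping the backends behind one small interface means bot_handlers.py only ever talks to `AudioTranscriber`, so adding a third service later would not touch the handlers.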

Conclusion:

With this update, users of the grouplang-secretary-bot can easily switch between the AWS Transcribe service and the OpenAI Whisper API. This is done by adjusting the TRANSCRIPTION_SERVICE environment variable to either 'aws' or 'openai', along with providing the necessary credentials for the selected service.
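
A minimal sketch of how the service selection might be read and validated at startup follows. Only `TRANSCRIPTION_SERVICE` and its two values come from this PR; the function name, the `OPENAI_API_KEY` variable name, and the fallback to 'aws' when the variable is unset (chosen here to match the backward-compatibility goal) are assumptions. AWS credential checks are omitted because their setting names are not given in this description.

```python
import os
from typing import Mapping, Optional

SUPPORTED_SERVICES = ("aws", "openai")


def load_transcription_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the transcription settings from an environment-like mapping."""
    env = os.environ if env is None else env

    # Assumed default: fall back to AWS so existing deployments keep working.
    service = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if service not in SUPPORTED_SERVICES:
        raise ValueError(
            f"TRANSCRIPTION_SERVICE must be one of {SUPPORTED_SERVICES}, "
            f"got {service!r}"
        )

    settings = {"service": service}
    if service == "openai":
        # OPENAI_API_KEY is a hypothetical name for the new API key setting.
        api_key = env.get("OPENAI_API_KEY", "")
        if not api_key:
            raise ValueError("OPENAI_API_KEY is required when using 'openai'")
        settings["openai_api_key"] = api_key
    return settings
```

Failing fast on a missing key or an unrecognized service name surfaces misconfiguration when the bot starts rather than when the first voice message arrives.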

We believe this integration will significantly enhance the user experience by allowing more tailored voice transcription solutions.

If there are any questions or further clarifications needed, please don't hesitate to reach out!


Thank you for considering this pull request!

Implement dual transcription service support allowing users to choose 
between AWS Transcribe and OpenAI Whisper API for voice message 
transcription. Changes include:

- Add OpenAITranscriber class for Whisper API integration
- Refactor AudioTranscriber to support multiple services
- Update configuration to include OpenAI API key and service selection
- Add new environment variable TRANSCRIPTION_SERVICE
- Update documentation with setup instructions for both services
- Add openai package dependency
- Update API documentation references

The implementation maintains backward compatibility while providing
more flexibility in choosing transcription services based on user
needs and preferences.
@vadanrod14 vadanrod14 closed this Jan 27, 2025