Integrate OpenAI Whisper API Support for Enhanced Transcription Options #137 #142

Conversation

agentmarketbot (Contributor)

Pull Request Description

Title: Support for Additional Whisper Services with OpenAI Whisper API Integration

Overview:
This pull request addresses issue #137, which requests the integration of additional transcription services into the Telegram bot that currently uses AWS Transcribe. Specifically, this update adds support for the OpenAI Whisper API, giving users flexible transcription options.

Key Changes:

  1. Integration of OpenAI Whisper API:

    • Introduced a new class OpenAITranscriber to handle interactions with the OpenAI API for voice message transcription.
    • Established a base class BaseTranscriber that encapsulates shared functionality for different transcription services.
    • Refactored existing code related to AWS transcription into a dedicated class AWSTranscriber, promoting cleaner, more modular code (see the sketch after this list).
  2. Configuration Adjustments:

    • Enhanced the config.py file to allow for seamless configuration of the chosen transcription service:
      • OPENAI_API_KEY: Added for authenticating requests to the OpenAI API.
      • TRANSCRIPTION_SERVICE: Introduced a new setting, allowing the user to specify their preferred transcription service (default set to 'aws').
  3. Updates to Bot Handlers:

    • Revised bot_handlers.py to dynamically initialize the AudioTranscriber class based on the specified configuration, enabling a clear separation of concerns.
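
A minimal sketch of how these classes might fit together. The transcribe() method, its signature, and the internal details are illustrative assumptions rather than the exact code in this PR; only the class names BaseTranscriber, AWSTranscriber, and OpenAITranscriber come from the description above.

```python
# Illustrative sketch only -- method names and internals are assumptions,
# not the actual implementation in this PR.
from abc import ABC, abstractmethod

import boto3
from openai import OpenAI


class BaseTranscriber(ABC):
    """Shared interface for all transcription backends."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript text for the given audio file."""


class AWSTranscriber(BaseTranscriber):
    """Wraps the existing AWS Transcribe flow behind the shared interface."""

    def __init__(self, region_name: str = "us-east-1"):
        # Credentials are resolved through the usual boto3 mechanisms.
        self.client = boto3.client("transcribe", region_name=region_name)

    def transcribe(self, audio_path: str) -> str:
        # The real implementation would upload the audio, call
        # start_transcription_job(), and poll for the result; elided here.
        raise NotImplementedError


class OpenAITranscriber(BaseTranscriber):
    """Sends audio to the OpenAI Whisper API for transcription."""

    def __init__(self, api_key: str):
        self.client = OpenAI(api_key=api_key)

    def transcribe(self, audio_path: str) -> str:
        with open(audio_path, "rb") as audio_file:
            result = self.client.audio.transcriptions.create(
                model="whisper-1",
                file=audio_file,
            )
        return result.text
```

Keeping both backends behind a common transcribe() interface is what allows bot_handlers.py to stay agnostic about which service is configured.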

New Features:

  • Users can now switch between the AWS and OpenAI transcription services simply by adjusting the TRANSCRIPTION_SERVICE environment variable. When selecting OpenAI, users must also provide their OPENAI_API_KEY, as illustrated below.
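
As a rough illustration, config.py might expose these settings as plain environment-variable lookups; the exact handling in the real config.py may differ, and the default shown is taken from the description above.

```python
# config.py sketch -- assumes simple os.getenv lookups; the real file may
# also load a .env file or validate values differently.
import os

# "aws" (default) or "openai"
TRANSCRIPTION_SERVICE = os.getenv("TRANSCRIPTION_SERVICE", "aws")

# Required only when TRANSCRIPTION_SERVICE is "openai"
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
```

With that in place, switching backends is a matter of setting TRANSCRIPTION_SERVICE=openai and OPENAI_API_KEY in the environment before starting the bot.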

Testing:
All new features have been tested to confirm compatibility and that they meet the expectations outlined in the original issue description.

Conclusion:
This implementation is a step forward in supporting diverse transcription services, ultimately improving the user experience and operational efficiency. We welcome any feedback or questions regarding this enhancement.

Related Links:

Thank you for considering this pull request!

Implement a new audio transcription option using OpenAI's Whisper API 
alongside the existing AWS transcription service. This change:

- Add BaseTranscriber class to share common functionality
- Create OpenAITranscriber class for Whisper API integration
- Refactor AudioTranscriber to support multiple transcription services
- Add configuration options to switch between AWS and OpenAI services
- Move AWS service initialization to transcriber class
- Add OPENAI_API_KEY and TRANSCRIPTION_SERVICE config variables

The service can now be configured via environment variables to use 
either AWS Transcribe or OpenAI Whisper for audio transcription.
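
A hedged sketch of what the dynamic initialization in bot_handlers.py could look like. The get_transcriber() factory and the transcribers module name are hypothetical, and the error handling in the actual PR may differ.

```python
# bot_handlers.py sketch -- get_transcriber() and the transcribers module
# are illustrative assumptions, not necessarily what this PR defines.
import config
from transcribers import AWSTranscriber, BaseTranscriber, OpenAITranscriber


def get_transcriber() -> BaseTranscriber:
    """Instantiate the backend named by the TRANSCRIPTION_SERVICE setting."""
    service = (config.TRANSCRIPTION_SERVICE or "aws").lower()
    if service == "openai":
        if not config.OPENAI_API_KEY:
            raise ValueError(
                "OPENAI_API_KEY must be set when TRANSCRIPTION_SERVICE=openai"
            )
        return OpenAITranscriber(api_key=config.OPENAI_API_KEY)
    return AWSTranscriber()
```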
vadanrod14 closed this on Jan 27, 2025