Integrate OpenAI Whisper API as Additional Transcription Backend (Issue #137) #146

agentmarketbot
Contributor

Pull Request Description

Title: Support Additional Whisper Services - OpenAI Whisper API Integration

Overview:

This pull request addresses Issue #137 by adding the OpenAI Whisper API as a transcription backend for the Telegram bot, alongside the existing AWS Whisper service. Users can then choose the backend that best fits their flexibility, performance, and cost requirements.

Summary of Changes Implemented:

  1. New OpenAI Whisper API Support:

    • Introduced a new class called OpenAITranscriptionService responsible for handling audio transcription requests using the OpenAI Whisper API.
  2. Transcription Service Architecture:

    • Created an abstract base class, TranscriptionService, which defines the essential functionality that all transcription services must implement. This lays the groundwork for any future integrations.
    • Refactored the existing AWS transcription logic into a dedicated class AWSTranscriptionService, improving code organization and readability.
  3. Configuration Updates:

    • Updated config.py with a setting for the OpenAI API key and a new environment variable, TRANSCRIPTION_SERVICE, that selects between the AWS and OpenAI transcription services.
  4. Bot Handlers Adaptation:

    • Modified bot_handlers.py to fit the new architecture: the bot now decides at runtime which transcription service to instantiate, AWS or OpenAI, based on the current configuration.
  5. Instructions for Users:

    • To use the OpenAI service, set the OPENAI_API_KEY environment variable.
    • To switch the bot’s transcription service to OpenAI’s Whisper API, set TRANSCRIPTION_SERVICE=openai. If the variable is not set, the bot defaults to aws.
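
The architecture described above could look roughly like the following sketch. It is not the code from this pull request: the `transcribe` method name, the `create_transcription_service` factory, and its `env` parameter are hypothetical names chosen for illustration, the AWS body is left as a placeholder, and the OpenAI call assumes the `openai` Python package (version 1.x) with the `whisper-1` model.

```python
import os
from abc import ABC, abstractmethod
from typing import Mapping, Optional


class TranscriptionService(ABC):
    """Interface that every transcription backend implements."""

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript of the audio file at audio_path."""


class AWSTranscriptionService(TranscriptionService):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the existing AWS logic is not reproduced in this sketch.
        raise NotImplementedError("AWS call omitted in this sketch")


class OpenAITranscriptionService(TranscriptionService):
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    def transcribe(self, audio_path: str) -> str:
        # Imported lazily so the module loads without the openai package.
        from openai import OpenAI  # assumption: openai>=1.0 is installed

        client = OpenAI(api_key=self.api_key)
        with open(audio_path, "rb") as audio_file:
            result = client.audio.transcriptions.create(
                model="whisper-1", file=audio_file
            )
        return result.text


def create_transcription_service(
    env: Optional[Mapping[str, str]] = None,
) -> TranscriptionService:
    """Pick a backend from TRANSCRIPTION_SERVICE, defaulting to aws."""
    env = os.environ if env is None else env
    name = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if name == "aws":
        return AWSTranscriptionService()
    if name == "openai":
        api_key = env.get("OPENAI_API_KEY")
        if not api_key:
            raise ValueError(
                "OPENAI_API_KEY must be set when TRANSCRIPTION_SERVICE=openai"
            )
        return OpenAITranscriptionService(api_key)
    raise ValueError(f"Unknown TRANSCRIPTION_SERVICE: {name!r}")
```

Passing the environment as a mapping keeps the selection logic testable without touching real environment variables. Failing early on a missing key or an unknown service name is a design choice of this sketch, not something the pull request states.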

Impact:

These changes let users choose the transcription backend that suits them while maintaining backward compatibility: when TRANSCRIPTION_SERVICE is not set, the bot keeps using the AWS service as before.

Next Steps:

  • Review and test the integration to ensure compatibility with the existing features.
  • Provide feedback or raise any concerns regarding the implementation.

If you have any questions or require further assistance, please feel free to reach out! Thank you for reviewing this pull request.

- Implement OpenAI Whisper as an alternative to AWS transcription
- Create abstract TranscriptionService base class for flexibility
- Add configuration option to switch between AWS and OpenAI services
- Update bot handlers to use new transcription service factory
- Add OPENAI_API_KEY and TRANSCRIPTION_SERVICE config variables
- Improve error logging with service-specific messages
- Update logging to handle case when summary is None
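
The two logging items above might be handled along these lines. This is a minimal sketch under assumptions, not the committed code: the logger name, both function names, the `limit` parameter, and the `<no summary>` placeholder are all hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("transcription")  # hypothetical logger name


def describe_summary(summary: Optional[str], limit: int = 50) -> str:
    """Return a log-safe preview of a summary that may be None."""
    if summary is None:
        return "<no summary>"
    return summary[:limit]


def log_transcription_error(service_name: str, error: Exception) -> None:
    """Log a failure with the name of the service that raised it."""
    logger.error("%s transcription failed: %s", service_name, error)


def log_summary(summary: Optional[str]) -> None:
    """Log the summary preview without failing when summary is None."""
    logger.info("Summary: %s", describe_summary(summary))
```

Guarding the slice with an explicit `None` check avoids the `TypeError` that `summary[:limit]` would raise when no summary was produced.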
@vadanrod14 vadanrod14 closed this Jan 27, 2025
2 participants