Add Support for OpenAI Whisper API as Alternative Transcription Service (Issue #137) #150

agentmarketbot
Contributor

Pull Request Description

Background

This pull request addresses issue #137, "support additional whisper services (focus on openai whisper api)". It integrates the OpenAI Whisper API as an additional transcription backend for the Telegram bot, alongside the existing AWS Transcribe service, so that users can choose a backend based on transcription performance, cost, and their own needs.

Summary of Changes

This PR reworks the transcription layer of the Telegram bot. Specifically, the changes include:

  1. Introduction of New Transcription Services:

    • Added support for the OpenAI Whisper API by creating a new transcription service class called OpenAITranscriptionService.
    • Refactored the existing AWS Transcribe functionality into a dedicated class named AWSTranscriptionService.
    • Established an abstract base class TranscriptionService to standardize the interface across various transcription services.
  2. Service Architecture Update:

    • Modified the bot handlers to support the updated transcription service architecture.
    • Integrated environment variable options that allow users to select their preferred transcription service.
  3. Environment Configuration:

    • Users can now set the TRANSCRIPTION_SERVICE environment variable to either 'aws' or 'openai'. If it is not set, the bot defaults to AWS Transcribe.
    • Required authentication details:
      • For AWS: Users must provide AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
      • For OpenAI: Users must provide OPENAI_API_KEY.
  4. Documentation Updates:

    • Updated README.md to cover the new features, prerequisites, and configuration instructions for both transcription services.
  5. Dependency Management:

    • Added the OpenAI package as a required dependency of the project.
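
As a rough illustration of the architecture described above, the sketch below shows how an abstract base class and environment-based selection could fit together. It is not the code from this PR: the class names and the TRANSCRIPTION_SERVICE variable come from the description, while the `transcribe` method name, the `create_transcription_service` factory, and the case-insensitive handling of the variable are assumptions made for the example. The service bodies are placeholders and make no real AWS or OpenAI calls.

```python
import os
from abc import ABC, abstractmethod


class TranscriptionService(ABC):
    """Common interface for transcription backends.

    The method name `transcribe` and its signature are assumptions.
    """

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the transcript of the audio file at `audio_path`."""


class AWSTranscriptionService(TranscriptionService):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the real class would call AWS Transcribe here.
        raise NotImplementedError("AWS Transcribe call omitted in this sketch")


class OpenAITranscriptionService(TranscriptionService):
    def transcribe(self, audio_path: str) -> str:
        # Placeholder: the real class would call the OpenAI Whisper API here.
        raise NotImplementedError("OpenAI Whisper call omitted in this sketch")


# Maps the accepted TRANSCRIPTION_SERVICE values to their implementations.
_SERVICES = {
    "aws": AWSTranscriptionService,
    "openai": OpenAITranscriptionService,
}


def create_transcription_service(env=None) -> TranscriptionService:
    """Hypothetical factory: pick a backend from TRANSCRIPTION_SERVICE.

    Falls back to 'aws' when the variable is not set, as the PR describes.
    """
    env = os.environ if env is None else env
    name = env.get("TRANSCRIPTION_SERVICE", "aws").strip().lower()
    if name not in _SERVICES:
        raise ValueError(
            f"Unknown TRANSCRIPTION_SERVICE {name!r}; expected 'aws' or 'openai'"
        )
    return _SERVICES[name]()
```

With this shape, the bot handlers only depend on `TranscriptionService`, so adding another backend means adding one class and one entry in the mapping.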

Next Steps

To use the new functionality, set the environment variables for the desired transcription service. Then run the bot and confirm that transcription works as intended with both AWS Transcribe and OpenAI.
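
Before starting the bot, a small preflight check can confirm that the credentials for the selected service are present. The following is a minimal sketch, not part of this PR: the variable names are the ones listed in the description, and the `missing_credentials` helper is hypothetical.

```python
import os

# Credentials required per service, as listed in this PR description.
REQUIRED_VARS = {
    "aws": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "openai": ("OPENAI_API_KEY",),
}


def missing_credentials(env=None):
    """Hypothetical helper: list required variables that are unset or empty."""
    env = os.environ if env is None else env
    # AWS Transcribe is the default when no service is selected.
    service = env.get("TRANSCRIPTION_SERVICE", "aws")
    if service not in REQUIRED_VARS:
        raise ValueError(f"Unknown TRANSCRIPTION_SERVICE: {service!r}")
    return [name for name in REQUIRED_VARS[service] if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Configuration looks complete.")
```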

Feel free to reach out if any questions or further assistance are needed. Thank you for considering this enhancement!

Implement alternative transcription service using OpenAI Whisper API:
- Create abstract TranscriptionService base class
- Add AWSTranscriptionService and OpenAITranscriptionService implementations
- Update configuration to support service selection via TRANSCRIPTION_SERVICE env var
- Add OpenAI dependencies and configuration requirements
- Update documentation with new service options and setup instructions
- Enhance logging to include transcription service type

The change allows users to choose between AWS Transcribe for enterprise-grade 
transcription or OpenAI Whisper API for high-accuracy multilingual support.
@vadanrod14 closed this on Jan 27, 2025