Audio Transcription and Summarization AI Agent with OpenAI API

This Python script transcribes audio files and generates contextual summaries (e.g., for meetings, phone calls, or interviews) using OpenAI's Whisper and GPT-4 models. The results can be exported as .txt, .md, or .pdf and automatically emailed using Gmail. I've tested it with .wav and .m4a files. If you have ffmpeg installed, it should be able to handle many other types of audio files; according to the ffmpeg site, "It supports the most obscure ancient formats up to the cutting edge." Just make sure the audio file is named correctly in the config.json file.

I also recommend setting up a Gmail App Password rather than storing your account password in the config.json file.

Features

  • Audio Transcription: Transcribe audio files using OpenAI's Whisper model.
  • Contextual Summarization: Generate detailed summaries for meetings, phone calls, or interviews based on user input using OpenAI's GPT-4o.
  • Multi-Format Export: Save transcriptions and summaries as .txt, .md, or .pdf.
  • Email Integration: Automatically email the transcription and summary files as attachments.
  • Parallel Processing: Speed up transcription by processing audio chunks concurrently.
  • Error Handling: Robust error handling with retry logic for API requests.
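
The parallel processing and retry features can be pictured roughly as follows. This is a minimal sketch, not the code in main.py: `with_retry`, `transcribe_chunks`, and the `transcribe_one` callable are hypothetical names, and the script's actual chunking and API calls may differ.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retry(func, attempts=3, base_delay=1.0):
    """Wrap func so it is retried with exponential backoff on any exception."""
    def wrapper(*args, **kwargs):
        for attempt in range(attempts):
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == attempts - 1:
                    raise  # out of retries: surface the last error
                # Wait base_delay, then 2x, 4x, ... before the next attempt
                time.sleep(base_delay * 2 ** attempt)
    return wrapper


def transcribe_chunks(chunks, transcribe_one, max_workers=4):
    """Transcribe chunks concurrently and join the text in the original order.

    transcribe_one is a placeholder for whatever sends a single chunk to
    the transcription API and returns its text.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so the transcript stays in sequence
        texts = list(pool.map(with_retry(transcribe_one), chunks))
    return " ".join(texts)
```

Threads suit this kind of work because each chunk spends most of its time waiting on a network response rather than using the CPU.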

Installation

  1. Clone the Repository

    git clone https://github.com/holmesha/GPT-Meeting-Summary.git
    cd GPT-Meeting-Summary
  2. Install Python Dependencies

    • Install the Python dependencies from the requirements.txt file
      pip install -r requirements.txt
  3. Edit config.json File

    • Add your OpenAI API key, the path to the audio file, the name of the audio file, the output format you want ("pdf" for .pdf, "markdown" for .md, or "plain" for .txt), your recipient's email address, and your own email info.
    • I recommend using an app-specific email password. I personally used Gmail, which allowed me to generate one (instructions here).
  4. Install ffmpeg for Audio Processing

    • Install ffmpeg using the command for your operating system:
      • macOS: brew install ffmpeg
      • Ubuntu: sudo apt install ffmpeg
      • Windows: Download from ffmpeg.org.
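
Before the first run, a quick check of the setup can catch the two most likely mistakes: an unsupported output format and a missing ffmpeg. The sketch below is an illustration only and is not part of the repository. The function name `preflight` and the key name `output_format` are assumptions, so check config.json for the real key; the three format values are the ones listed in step 3.

```python
import json
import shutil

# Format values taken from step 3 of the installation instructions.
ALLOWED_FORMATS = {"pdf", "markdown", "plain"}


def preflight(config_text, format_key="output_format"):
    """Return a list of problems found; an empty list means ready to run.

    format_key is an assumed key name, not necessarily the one used
    in the repository's config.json.
    """
    problems = []
    config = json.loads(config_text)
    fmt = config.get(format_key)
    if fmt not in ALLOWED_FORMATS:
        problems.append(
            f"{format_key} must be one of {sorted(ALLOWED_FORMATS)}, got {fmt!r}"
        )
    # shutil.which returns None when the executable is not on PATH
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

It could be called as `preflight(open("config.json").read())`, printing any returned problems before starting the script.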

Once everything is set up, you can start the script by running: python main.py

Contributions are welcome! If you have suggestions for improvements, new features, or prompt ideas, feel free to submit a pull request or open an issue.

GPT Meeting Summary © 2024 by AH is licensed under CC BY-NC-SA 4.0