ClearSpeak, developed by Pavan Kumar, is a Python application that uses Google's Speech-to-Text API for real-time audio transcription. It includes a graphical interface built with Tkinter and is designed to transcribe human speech clearly while filtering out background noise.
- Features
- Prerequisites
- Setup and Installation
- Running the Application
- How to Use
- Troubleshooting
- Contributing
- License
- Contact
- Real-Time Transcription: Instantaneous transcription of speech from the microphone.
- Noise Filtering: Distinguishes between human speech and background noise.
- User Interface: Easy-to-use GUI for starting and stopping transcription.
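This README does not say how ClearSpeak separates speech from noise, so the following is only a minimal sketch of one common starting point: an energy gate that skips audio frames whose loudness falls below a threshold. The function names `rms` and `is_probable_speech`, the threshold of 500, and the 16-bit mono PCM frame format are all assumptions made for illustration, not details taken from the project.

```python
import math
import struct


def rms(frame: bytes) -> float:
    """Root-mean-square amplitude of a little-endian 16-bit mono PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    # Decode the raw bytes into signed 16-bit samples.
    samples = struct.unpack("<%dh" % count, frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_probable_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Crude gate: treat a frame as speech if it is loud enough.

    The default threshold is a hypothetical value and would need tuning
    for a real microphone and room.
    """
    return rms(frame) >= threshold
```

An energy gate cannot tell loud noise from speech, so it is best seen as a cheap pre-filter that avoids sending silence to the API rather than as real speech detection.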
Before starting, ensure you have the following:
- Python 3.x installed.
- An active Google Cloud Platform (GCP) account.
- Speech-to-Text API enabled in your GCP account.
- Your GCP service account key file downloaded.
Clone the repository:
git clone https://github.com/ascender1729/ClearSpeak.git
cd ClearSpeak
Create a virtual environment to manage your project's dependencies:
python -m venv myenv
.\myenv\Scripts\Activate.ps1 # On Windows
source myenv/bin/activate # On Unix or macOS
Install the required libraries:
pip install google-cloud-speech pyaudio
Set your credentials to authenticate with Google Cloud (in Python, before the Speech client is created):
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_your_service_account_key.json'
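A missing or mistyped key path is an easy mistake to make at this step, so it can help to check the setting before starting the application. The sketch below is not part of ClearSpeak. The helper name `credentials_problem` is hypothetical, and it only checks that the variable is set and points at an existing file, not that the key itself is valid.

```python
import os
from pathlib import Path


def credentials_problem(env=os.environ):
    """Return a description of what is wrong, or None if the key file is present."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not Path(path).is_file():
        return f"key file not found: {path}"
    return None


if __name__ == "__main__":
    problem = credentials_problem()
    print(problem or "Credentials file found.")
```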
Execute the application with:
python transcribe.py
- Click "Start Transcription" to begin.
- Click "Stop Transcription" to end. The transcribed text will be displayed in the application window.
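The start and stop behaviour described above can be sketched as a small controller that runs work on a background thread until it is told to stop. This is an illustration under assumptions, not ClearSpeak's actual code: the class name `TranscriptionController` and the `work` callable (standing in for "read one chunk of audio and transcribe it") are hypothetical.

```python
import threading


class TranscriptionController:
    """Calls `work` repeatedly on a background thread until stop() is called."""

    def __init__(self, work):
        self._work = work                  # hypothetical: processes one audio chunk
        self._stop = threading.Event()
        self._thread = None

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()

    def start(self):
        if self.is_running():
            return                         # ignore repeated clicks on Start
        self._stop.clear()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()            # waits for the current chunk to finish

    def _loop(self):
        while not self._stop.is_set():
            self._work()


# Wiring to Tkinter buttons would look roughly like:
#   tk.Button(root, text="Start Transcription", command=controller.start)
#   tk.Button(root, text="Stop Transcription", command=controller.stop)
```

Keeping the work off the main thread keeps the window responsive. Tkinter widgets should be updated only from the main thread, so a worker would usually pass text back through a `queue.Queue` that the GUI polls with `root.after`.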
If pyaudio fails to install on Windows, try installing it through pipwin:
pip install pipwin
pipwin install pyaudio
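After reinstalling, a quick check can confirm that the libraries are visible to the interpreter inside the virtual environment. The sketch below is an optional aid, not part of ClearSpeak, and the helper name `module_available` is hypothetical. The import names `pyaudio` and `google.cloud.speech` correspond to the `pyaudio` and `google-cloud-speech` packages installed earlier.

```python
import importlib.util


def module_available(name: str) -> bool:
    """True if the import system can locate `name`.

    The module itself is not imported, although parent packages of a
    dotted name are.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


if __name__ == "__main__":
    for name in ("pyaudio", "google.cloud.speech"):
        status = "ok" if module_available(name) else "MISSING"
        print(f"{name}: {status}")
```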
To contribute:
- Fork the repository.
- Create a new branch (git checkout -b feature/YourFeature).
- Commit your changes (git commit -m 'Add YourFeature').
- Push to the branch (git push origin feature/YourFeature).
- Create a new Pull Request.
This project is available under the MIT License.
Pavan Kumar - [email protected]
LinkedIn: linkedin.com/in/im-pavankumar
Project Link: https://github.com/ascender1729/ClearSpeak