Image Text Extractor

Project Overview

This project uses OpenCV and Tesseract OCR to detect and extract text from images. The program preprocesses images, identifies text regions, and converts them into readable text, which is saved to a file and optionally displayed. This project is ideal for automating text recognition tasks in scanned documents, photographs, or other image files.

Features

Preprocessing:
- Converts images to grayscale for simpler processing.
- Uses adaptive thresholding to binarize the image and make text regions prominent.
- Applies dilation to expand text regions for better detection.
Text Detection:
- Finds contours to identify potential text regions.
- Filters out small contours to remove irrelevant areas.
Text Extraction:
- Crops and resizes text regions for improved OCR accuracy.
- Extracts text using Tesseract OCR.
Output:
- Saves the extracted text to detected text.txt.
- Prints the extracted text directly to the terminal.

Getting Started

Prerequisites

Python (3.x recommended)
Required Libraries:
- Install using pip:
```
pip install opencv-python pytesseract
```
Tesseract OCR:
- macOS:
```
brew install tesseract
```
- Linux:
```
sudo apt update
sudo apt install tesseract-ocr
```
- Windows:
  - Download and install from Tesseract OCR GitHub.

Installation

Clone the repository:

```
git clone https://github.com/your-username/text-extraction-opencv.git
cd text-extraction-opencv
```

Ensure Tesseract is installed and accessible. Update the path to Tesseract in the script if necessary:
```
pytesseract.pytesseract.tesseract_cmd = '/path/to/tesseract'
```
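
If you would rather not hard-code the path, one option is to look the executable up at startup. The sketch below is an assumption about how that could be done, not part of the script; `resolve_tesseract` is a hypothetical helper that uses only the standard library.

```python
import shutil
from pathlib import Path


def resolve_tesseract(explicit_path=None):
    """Return a path to the tesseract executable, or None if none is found.

    Hypothetical helper: prefers an explicitly configured path, then falls
    back to whatever `tesseract` resolves to on the system PATH.
    """
    if explicit_path and Path(explicit_path).is_file():
        return str(explicit_path)
    return shutil.which("tesseract")
```

The returned value can then be assigned to `pytesseract.pytesseract.tesseract_cmd`. A result of `None` means Tesseract was not found and should be treated as a setup error.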

Usage

Add your input image (e.g., sample.jpg) to the project directory.
Run the script:
```
python text_extraction.py
```
Output:
- The extracted text will be saved in detected text.txt.
- Detected text will also be displayed in the terminal.

Example

Input Image:

Output:

Code Explanation

The program works as follows:

Load and Preprocess the Image:
- Converts the input image to grayscale.
- Applies adaptive thresholding to binarize the image.
- Uses dilation to merge close text components.
Detect Text Regions:
- Finds contours in the processed image.
- Filters small contours to avoid noise.
- Crops and resizes potential text regions.
Extract and Output Text:
- Passes the cropped regions to Tesseract for text recognition.
- Writes the detected text to a file and displays it in the terminal.

Possible Future Enhancements

Improved OCR Accuracy:
- Implement advanced preprocessing techniques like noise reduction, skew correction, and edge enhancement for cleaner text regions.
- Experiment with different OCR configurations for handling complex layouts.
Multi-Language Support:
- Install additional Tesseract language data and pass the matching language code to the OCR call so that text in languages other than English can be extracted.
Support for Curved Text:
- Enhance the program to handle rotated or curved text, for example by estimating line angles with the Hough Transform and deskewing regions before OCR.
Error Handling and Validation:
- Add error handling for missing dependencies, invalid input files, and unreadable text regions.
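
As one possible starting point for the input validation mentioned above, the sketch below checks a path before it is handed to OpenCV. The helper name `validate_image_path` and the extension list are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of accepted extensions; adjust to the formats you actually use.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def validate_image_path(path):
    """Return the path as a Path object, or raise if it is clearly unusable."""
    candidate = Path(path)
    if not candidate.is_file():
        raise FileNotFoundError(f"Input image not found: {candidate}")
    if candidate.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported image extension: {candidate.suffix!r}")
    return candidate
```

Note that `cv2.imread` returns `None` instead of raising when a file cannot be decoded, so the loaded image should be checked as well.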
