OCR_Sumerian

Jupyter Book: https://ancient-world-citation-analysis.github.io/OCR_Sumerian/
Medium article: https://medium.com/@austinpereira6602/unearthing-the-past-the-journey-from-cuneiform-inscriptions-to-ai-translations-5948c2dccd45

Welcome to the OCR (Optical Character Recognition) JupyterBook! This guide walks you through character recognition and translation, focusing on transliterating text from cuneiform scripts and translating it into modern-day English using OCR tools like Tesseract through Python.

Overview

This repository contains code and documentation for Optical Character Recognition (OCR) using Tesseract, an open-source OCR engine whose development was sponsored by Google for many years.

What You'll Learn

In this JupyterBook, you'll learn how optical character recognition works and how it can be applied to ancient scripts and languages. We'll explore step by step how to use Python with OCR tools like Tesseract to perform these transformations.
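
As a rough illustration of what calling Tesseract from Python can look like, here is a minimal sketch. It assumes the `pytesseract` wrapper and Pillow, neither of which is named in this README, plus a locally installed Tesseract binary. The helper names `build_tesseract_config` and `ocr_image` and the default language code `eng` are hypothetical placeholders, not the notebook's actual code.

```python
def build_tesseract_config(psm: int = 6, oem: int = 3) -> str:
    """Build a Tesseract command-line option string.

    psm: page segmentation mode (0-13); 6 treats the image as one uniform text block.
    oem: OCR engine mode (0-3); 3 lets Tesseract choose its default engine.
    """
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    if not 0 <= oem <= 3:
        raise ValueError(f"oem must be between 0 and 3, got {oem}")
    return f"--oem {oem} --psm {psm}"


def ocr_image(image_path: str, lang: str = "eng", psm: int = 6) -> str:
    """Run Tesseract on a single image file and return the recognized text."""
    # Imported lazily so this module loads even without the OCR stack installed.
    import pytesseract  # assumed wrapper around the Tesseract binary
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang=lang, config=build_tesseract_config(psm)
        )
```

Calling `ocr_image("tablet.png")` (a hypothetical file name) would return the recognized text as a string. Recognizing cuneiform rather than English would presumably need a language model trained for that script in place of `eng`.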

Getting Started

To get started, run the notebook's Drive-mounting cell in Google Colaboratory. This mounts your Google Drive in the Colab runtime, allowing you to open the Jupyter Notebooks, execute the code, and follow along with the examples.
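
The original cell is not reproduced in this README. A typical Colab mount cell might look like the sketch below; the function name `mount_drive_if_colab` and the mount point `/content/drive` are assumptions, and the import guard is added only so the cell does not fail outside Colab.

```python
def mount_drive_if_colab(mount_point: str = "/content/drive") -> bool:
    """Mount Google Drive when running inside Colab; return True if mounted."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        print("Not running in Google Colab; skipping Drive mount.")
        return False
    # Colab prompts for authorization the first time this runs.
    drive.mount(mount_point)
    return True


mounted = mount_drive_if_colab()
```

Inside Colab, the mounted Drive contents then appear under the chosen mount point.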

Why This Book?

Cuneiform scripts, some of the earliest known systems of writing, have held the secrets of ancient civilizations for centuries. By learning how to transliterate and translate these scripts using modern OCR technology, you'll gain the ability to uncover and understand the rich history they contain.

Who Should Read This Book

This JupyterBook is designed for anyone curious about the world of OCR, from beginners who want to grasp the basics to advanced users seeking to tackle complex transliteration and translation tasks. Whether you're an archaeologist, historian, linguist, or simply an enthusiast eager to decode ancient texts, you'll find valuable insights here.

Features

Introduction to OCR
Setting Up Your Environment
Tesseract Walkthrough
OCR Demonstration
Advanced OCR
Transliteration Techniques
Statistics on Our Models
Optional Modules

Usage

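Prerequisites include the Tesseract OCR engine, Pandas, and other required dependencies. Tesseract itself is a standalone program rather than a Python package, so it must be installed separately; the Python libraries can be installed using pip.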
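
The exact dependency list is not given here. The sketch below shows one way to check an environment before running the notebook. The package set (`pytesseract`, `pandas`, `Pillow`) is an assumption, and `missing_packages` and `tesseract_on_path` are hypothetical helper names.

```python
import importlib.util
import shutil

# Assumed dependency list, mapping import name -> pip package name.
ASSUMED_PACKAGES = {"pytesseract": "pytesseract", "pandas": "pandas", "PIL": "Pillow"}


def missing_packages(packages=ASSUMED_PACKAGES):
    """Return the pip names of packages that cannot be found for import."""
    return [
        pip_name
        for import_name, pip_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    ]


def tesseract_on_path() -> bool:
    """Report whether a `tesseract` executable is available on PATH."""
    return shutil.which("tesseract") is not None


missing = missing_packages()
if missing:
    print("Install with: pip install " + " ".join(missing))
if not tesseract_on_path():
    print("Tesseract binary not found; install it with your system package manager.")
```

Running this prints a suggested `pip install` command only for the packages that are actually missing.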

File Structure

OCR.ipynb: Jupyter Notebook containing the OCR implementation.

Contributors

Austin Pereira
Adam Anderson

License

This project is licensed under the MIT License.

Acknowledgments

Special thanks to the developers of prior OCR tools used for cuneiform extraction.

Feedback and Support

We welcome contributions to improve this project. Feel free to provide feedback or seek support through GitHub issues.
