Convert invoice PDFs to images, read the text from those images, and then extract invoice information such as the invoice number or vendor name from the text.

Hermann-web/python-OCR

Latest commit: 498d795 · Aug 4, 2023 (29 commits)
python-OCR

In accounting, working with thousands of vendors makes it challenging to search scanned documents for an invoice by its invoice number.

Text invoices contain a variety of information such as product names, VAT, product prices, vendor or customer names, tax information, and the date of the transaction. The process of reading text from images is called Optical Character Recognition (OCR).

In this repository, I go through some ways to convert PDFs to images using Python. Then, we can read text from these images. Further content extraction is not provided here.
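The PDF-to-image-to-text flow described above can be sketched as follows. This is a minimal illustration, not the code from this repository: it assumes the third-party packages `pdf2image` (which needs Poppler installed) and `pytesseract` (which needs the Tesseract binary installed). The helper names `pdf_to_text` and `find_invoice_number`, the `dpi` and `lang` defaults, and the invoice-number pattern are all hypothetical and chosen only for the example.

```python
import re


def pdf_to_text(pdf_path, dpi=300, lang="eng"):
    """Render each PDF page to an image, OCR it, and return one string per page."""
    # Imported here so the module still loads when the OCR stack is not installed.
    from pdf2image import convert_from_path  # assumption: Poppler is available
    import pytesseract  # assumption: the Tesseract binary is available

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # list of PIL images
    return [pytesseract.image_to_string(page, lang=lang) for page in pages]


# Hypothetical pattern for labels such as "Invoice No: INV-2021-001" or
# "Invoice #12345". Real invoices vary by vendor, so this needs adapting.
INVOICE_NUMBER = re.compile(
    r"invoice\s*(?:(?:number|no)\b\.?|#)\s*:?\s*([A-Z0-9][A-Z0-9/-]*)",
    re.IGNORECASE,
)


def find_invoice_number(text):
    """Return the first invoice number found in OCR text, or None."""
    match = INVOICE_NUMBER.search(text)
    return match.group(1) if match else None
```

A typical use would be to call `pdf_to_text("invoice.pdf")` and pass each page's text to `find_invoice_number`. OCR output is noisy, so a higher `dpi` or some image preprocessing may be needed before a pattern like this matches reliably.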

#Prerequisites

#Bibliography

#More resources

#More on Tesseract
https://learnopencv.com/deep-learning-based-text-recognition-ocr-using-tesseract-and-opencv/
https://learnopencv.com/category/text-recognition/

#datasets
