
Convert invoice PDFs to images, convert the images to text, and then extract invoice information such as the invoice number or vendor name from the text


Hermann-web/python-OCR


python-OCR

In accounting, working with thousands of vendors makes it challenging to search scanned documents for an invoice by its invoice number.

Invoices contain a variety of information, such as product names, VAT, product prices, vendor or customer names, tax information, and the date of the transaction. The process of reading text from images is called Optical Character Recognition (OCR).

In this repository, I go through some ways to convert PDFs to images using Python. Then, we can read text from these images. Further content extraction from the text is not provided here.
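The two steps above (PDF to images, then images to text) can be sketched as follows. This is a minimal sketch, not the repository's actual code: it assumes the third-party packages `pdf2image` (which needs Poppler installed) and `pytesseract` (which needs the Tesseract binary installed). The function names, the `dpi=300` default and the `join_pages` helper are illustrative choices.

```python
def pdf_to_images(pdf_path, dpi=300):
    """Render each page of a PDF as a PIL image (assumes pdf2image + Poppler)."""
    from pdf2image import convert_from_path  # third-party, imported lazily

    return convert_from_path(str(pdf_path), dpi=dpi)


def image_to_text(image, lang="eng"):
    """Run Tesseract OCR on a single image (assumes pytesseract + Tesseract)."""
    import pytesseract  # third-party, imported lazily

    return pytesseract.image_to_string(image, lang=lang)


def join_pages(page_texts):
    """Strip surrounding whitespace (form feeds included) and join non-empty pages."""
    cleaned = [text.strip() for text in page_texts]
    return "\n\n".join(text for text in cleaned if text)


def pdf_to_text(pdf_path, dpi=300, lang="eng"):
    """Full pipeline: PDF -> page images -> OCR text, one block per page."""
    images = pdf_to_images(pdf_path, dpi=dpi)
    return join_pages(image_to_text(image, lang=lang) for image in images)
```

Calling `pdf_to_text("invoice.pdf")` (a hypothetical file name) would return the recognized text of all pages as one string. A higher `dpi` generally gives Tesseract more pixels per character, at the cost of slower rendering.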

#Prerequisites

#Bibliography

#More resources

#More on Tesseract

https://learnopencv.com/deep-learning-based-text-recognition-ocr-using-tesseract-and-opencv/

https://learnopencv.com/category/text-recognition/

#datasets
