Python OCR packages

OCR packages

Showing projects tagged as OCR

PyMuPDF

8.5 9.7 Python

PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents.

pytesseract

8.1 1.0 L5 Python

A Python wrapper for Google's Tesseract OCR engine.

Kreuzberg

6.6 9.9 HTML

A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documents, images, and 50+ formats. Available for Rust, Python, Ruby, Go, PHP, Elixir, and TypeScript/Node.js, or usable via CLI, REST API, or MCP server.

pdftabextract

6.4 0.0 L3 Python

A set of tools for extracting tables from PDF files, intended to help with data mining on (OCR-processed) scanned documents.

normcap

6.0 9.5 Python

OCR-powered screen-capture tool that captures information instead of images.

pyocr

5.0 0.0 L5 Python

DISCONTINUED. A wrapper for Tesseract and Cuneiform.

* Code Quality Rankings and insights are calculated and provided by Lumnify. They range from L1 to L5, with L5 being the highest.

Awesome Python is part of the LibHunt network.
