OCR packages
Showing projects tagged as OCR
- PyMuPDF
  8.5 9.7 Python
  PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
- Kreuzberg
  6.6 9.9 HTML
  A polyglot document intelligence framework with a Rust core. Extract text, metadata, and structured information from PDFs, Office documents, images, and 50+ formats. Available for Rust, Python, Ruby, Go, PHP, Elixir, and TypeScript/Node.js, or use via CLI, REST API, or MCP server.
- pdftabextract
  6.4 0.0 L3 Python
  A set of tools for extracting tables from PDF files, helping to do data mining on (OCR-processed) scanned documents.
* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.