Talabov/Invoice-OCR-Parser-API-Web-UI


📄 Invoice OCR Parser API & Web UI


AI-powered REST API + beautiful frontend to extract invoice data (number, date, supplier, amount, and more) from PDFs, scans, or smartphone photos.

Includes: Dockerfile, full project structure, HTML5/CSS frontend, Postman-ready endpoints, and setup guides. All in one ZIP.

👉 Buy it on Gumroad


A modern Flask REST API that turns ugly business paperwork into structured, ready-to-use JSON. Perfect for SaaS, internal tools, freelancers, accountants, integrators, and anyone who’s tired of manual data entry.


✅ Key Features

  • 🖼 Works with PDF, JPG, PNG (scanned, camera, digital, whatever)
  • 🔎 Extracts: invoice number, date, supplier, total (auto-detects in most layouts)
  • 🌍 Multilingual OCR (Tesseract)
  • 🧠 Smart text parsing (handles most weird invoice templates, even messy scans)
  • ⚡ Lightning-fast: avg 1–3 seconds per invoice
  • 🖥 Built-in beautiful HTML/CSS UI (drag & drop, mobile ready)
  • 🚦 API and web frontend served from a single server: no CORS setup, no extra configuration
  • 🐳 Docker-ready & classic Python scripts
  • 🔒 No cloud/3rd party: runs 100% locally, your docs never leave your PC
  • 🧑‍💻 Easy to customize/extend for your business logic
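
This README does not show how the parser locates each field. As a rough illustration only, extraction from OCR text is commonly done with regular expressions. The patterns and the `extract_fields` helper below are hypothetical and are not taken from the project; real invoices will need broader patterns.

```python
import re

# Hypothetical patterns, tuned only for layouts like the sample text below.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
INVOICE_NUMBER = re.compile(r"\b[A-Z]{2,5}-\d{4}-\d{1,6}\b")
INVOICE_DATE = re.compile(rf"\b(?:{MONTHS})\s+\d{{1,2}},\s+\d{{4}}\b")
TOTAL = re.compile(r"total[^\d$]*(\$?\s?\d[\d,]*\.\d{2})", re.IGNORECASE)


def extract_fields(raw_text: str) -> dict:
    """Return the first match for each field, or None when nothing matches."""

    def first(pattern: re.Pattern, group: int = 0):
        match = pattern.search(raw_text)
        return match.group(group) if match else None

    return {
        "invoice_number": first(INVOICE_NUMBER),
        "invoice_date": first(INVOICE_DATE),
        "total_amount": first(TOTAL, 1),
    }


sample = (
    "INVOICE Invoice #\n\nINV-2024-117\n"
    "Supplier: Date: April 15, 2024\nWidget Solutions\n"
)
fields = extract_fields(sample)
```

On this sample, `fields["invoice_number"]` is `"INV-2024-117"` and `fields["total_amount"]` is `None`, because the sample contains no total line.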

🚀 API Endpoint

Parse Invoice (OCR)

POST /parse-invoice

Request:

  • multipart/form-data with a file (file=...) — PDF/JPG/PNG
  • (optional) lang — OCR language code (default: "eng")

Example using curl:

curl -X POST -F "file=@invoice.pdf" http://localhost:5000/parse-invoice
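
The same request can be sent from Python. This is a minimal sketch that uses only the standard library and assumes the server is reachable at `http://localhost:5000`; the helper names `build_multipart` and `parse_invoice` are illustrative, not part of the project.

```python
import json
import mimetypes
import os
import urllib.request
import uuid


def build_multipart(filename: str, content: bytes, lang: str = "eng") -> tuple[bytes, str]:
    """Encode the `file` and `lang` form fields as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="lang"\r\n\r\n'
        f"{lang}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def parse_invoice(path: str,
                  url: str = "http://localhost:5000/parse-invoice",
                  lang: str = "eng") -> dict:
    """POST one invoice file to /parse-invoice and return the decoded JSON."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(os.path.basename(path), fh.read(), lang)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires the server to be running):
# result = parse_invoice("invoice.pdf")
# print(result["parsed_fields"])
```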

Response (200):

{
  "parsed_fields": {
    "invoice_date": "April 15, 2024",
    "invoice_number": "INV-2024-117",
    "supplier_name": "Widget Solutions",
    "total_amount": "$750.00"
  },
  "raw_text": "INVOICE Invoice #\n\nINV-2024-117\nSupplier: Date: April 15, 2024\nWidget Solutions\n123 Industrial Park\nSpringfield, IL 62701\n..."
}
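
The values in `parsed_fields` come back as strings. If you need typed values downstream, something like the following sketch can convert them. It assumes dates shaped like the example above ("April 15, 2024") and amounts that use a dot as the decimal separator; the helper names are hypothetical, and other formats simply yield `None`.

```python
import re
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_date(raw: Optional[str]) -> Optional[date]:
    """Parse dates shaped like "April 15, 2024"; return None for anything else."""
    if not raw:
        return None
    try:
        # %B matches English month names under the default C/English locale.
        return datetime.strptime(raw.strip(), "%B %d, %Y").date()
    except ValueError:
        return None


def parse_amount(raw: Optional[str]) -> Optional[Decimal]:
    """Drop currency symbols and thousands separators, then parse as Decimal."""
    if not raw:
        return None
    digits = re.sub(r"[^\d.]", "", raw)
    try:
        return Decimal(digits)
    except InvalidOperation:
        return None


def normalize_fields(parsed_fields: dict) -> dict:
    """Return a copy of parsed_fields with typed date and amount values."""
    result = dict(parsed_fields)
    result["invoice_date"] = parse_date(parsed_fields.get("invoice_date"))
    result["total_amount"] = parse_amount(parsed_fields.get("total_amount"))
    return result
```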

⛔ Error Handling

On failure, the API responds with a JSON object containing a single "error" key, for example:

{"error": "No file part in the request"}
{"error": "Unsupported file type"}
{"error": "Text extraction failed: ..."}
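
A client can tell these bodies apart from a successful result by checking for the "error" key. This README does not state which HTTP status codes accompany each error, so the sketch below inspects only the decoded JSON; `InvoiceParseError` and `unwrap_response` are hypothetical client-side names.

```python
class InvoiceParseError(Exception):
    """Raised when the API body carries an "error" key (client-side helper)."""


def unwrap_response(payload: dict) -> dict:
    """Return parsed_fields from a success body, or raise on an error body."""
    if "error" in payload:
        raise InvoiceParseError(payload["error"])
    # A success body is expected to carry "parsed_fields" and "raw_text".
    return payload.get("parsed_fields", {})


# Usage with an already-decoded JSON body:
try:
    unwrap_response({"error": "Unsupported file type"})
except InvoiceParseError as exc:
    message = str(exc)
```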

🖥 Frontend Demo

Open http://localhost:5000/ — drag & drop your invoice, get structured results and raw OCR instantly.

  • JSON response — formatted for devs
  • Raw Text — human readable
  • "Copy" and "Download" buttons for instant reuse
  • Works on desktop/mobile, looks clean as hell 😎

⚙️ Requirements

Install the Python packages:

pip install -r requirements.txt

  • Flask
  • pytesseract
  • pdf2image
  • Pillow
  • flask-cors
  • Flask-Limiter
  • flasgger (optional, for API docs)
  • python-magic-bin (Windows) / python-magic (Linux)

System dependency (installed separately, not through pip):

  • tesseract-ocr
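
Since pip cannot install system tools, a quick check before starting the server can save some debugging. The sketch below looks for the `tesseract` binary on `PATH`, and also for `pdftoppm`, because pdf2image relies on Poppler's command-line tools for PDF conversion. The `missing_system_tools` helper is illustrative and not part of the project.

```python
import shutil


def missing_system_tools(tools=("tesseract", "pdftoppm")) -> list:
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_system_tools()
if missing:
    print("Missing system tools:", ", ".join(missing))
```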

🐳 Run with Docker

docker build -t invoice-ocr-api .
docker run -p 5000:5000 invoice-ocr-api

🧑‍💻 Manual Run (dev mode)

python app.py

🧪 Screenshots

  • ✅ API result
  • ✅ Web frontend demo
  • ✅ Error handling
  • ✅ Real OCR with tricky invoices

See /screens/ for live examples and raw data.


💼 Buy & Support

Get the full ZIP: project structure, Dockerfile, API + UI, and all the love:

👉 Buy it on Gumroad


📬 Contacts


Need this in Node.js, Go, or another stack? Custom integration? DM me — I'm ready for business.

About

Turn messy scans and invoice PDFs into structured JSON. This Flask API + drag-and-drop UI uses Tesseract to extract invoice number, date, amount, and supplier — even from smartphone photos.
