Document Processing Tool
A structured document-processing workflow for invoices and receipts, with embedded-text extraction, scanned-document fallback, and fixed-schema output.
Overview

This project showcases a Python backend built to process invoices, receipts, and similar financial documents into one dependable output format. Instead of treating every document the same way, the workflow first checks whether a PDF already contains readable text, then switches to a scanned-document vision path only when needed.
The goal is simple: accept messy real-world documents, keep the response shape fixed, and reduce all unstructured information into a single summary field that is still useful downstream.
Try It Out
Upload your own invoice, receipt, scanned PDF, or image to test the live document-processing flow.
If embedded text is missing, the backend automatically falls back to OCR-style vision extraction and still returns the same fixed schema.
The backend is deployed on Render's free tier, so the first request after a period of inactivity can take a moment while the service wakes up.
How It Works
- The user uploads a PDF or image from the browser.
- The backend checks whether the PDF has enough embedded text to extract directly.
- If the embedded text is too sparse or missing, the PDF pages are rendered as images and processed through a vision-based OCR fallback.
- The extracted content is sent to the model with a strict Pydantic schema, so the response stays structured.
- The API returns one consistent JSON shape with fields like vendor, date, totals, line items, category, and a concise summary.
This makes the system useful for both clean digital invoices and rough scans without forcing the frontend to handle multiple result formats.
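The routing step between the two paths can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `choose_processing_mode`, the mode labels, and the `MIN_CHARS_PER_PAGE` threshold are all hypothetical, since the page does not say how "enough embedded text" is measured. The per-page strings are assumed to come from a PDF text extractor such as pdfplumber.

```python
from typing import Optional, Sequence

# Hypothetical threshold: the real cutoff used by the backend is not documented.
MIN_CHARS_PER_PAGE = 40


def choose_processing_mode(
    page_texts: Sequence[Optional[str]],
    min_chars_per_page: int = MIN_CHARS_PER_PAGE,
) -> str:
    """Pick a processing path from the text extracted for each page.

    Returns "embedded_text" when the pages carry enough text on average,
    otherwise "vision_fallback". Both labels are illustrative.
    """
    if not page_texts:
        # No pages with text at all (for example, an uploaded image).
        return "vision_fallback"

    # Extractors may return None for a page with no text layer.
    total_chars = sum(len((text or "").strip()) for text in page_texts)
    average_chars = total_chars / len(page_texts)

    if average_chars >= min_chars_per_page:
        return "embedded_text"
    return "vision_fallback"
```

An average over pages is only one possible heuristic. A stricter variant could require every page to pass the threshold, which would send mixed digital-and-scanned PDFs to the vision path.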
Fixed Output
The backend always aims to return a single schema, including:
- document_type
- processing_mode
- vendor
- document_number
- purchase_order_number
- date
- due_date
- currency
- subtotal
- tax
- total
- category
- payment_method
- line_items
- summary
That structure is what keeps the tool suitable for financial workflows, Excel pipelines, and downstream automation.
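To illustrate what a fixed response shape means in practice, here is a dependency-free sketch. The project itself enforces the schema with Pydantic. This example swaps that for a plain dictionary projection so it runs with the standard library alone. The field names are taken from the list above, while the helper `to_fixed_shape`, the `None` defaults, and the decision to drop unknown keys are assumptions made for illustration.

```python
from typing import Any, Dict, Mapping

# Field names as listed on this page; their types are not documented here.
FIXED_FIELDS = (
    "document_type", "processing_mode", "vendor", "document_number",
    "purchase_order_number", "date", "due_date", "currency", "subtotal",
    "tax", "total", "category", "payment_method", "line_items", "summary",
)


def to_fixed_shape(raw: Mapping[str, Any]) -> Dict[str, Any]:
    """Project an arbitrary mapping onto the fixed key set.

    Missing fields become None (an assumed default), line_items becomes
    an empty list, and keys outside the schema are dropped.
    """
    shaped: Dict[str, Any] = {name: raw.get(name) for name in FIXED_FIELDS}
    if shaped["line_items"] is None:
        shaped["line_items"] = []
    return shaped


if __name__ == "__main__":
    partial = {"vendor": "Acme", "total": 12.5, "unexpected": "ignored"}
    print(to_fixed_shape(partial))
```

Whatever the input looks like, a consumer always sees the same keys in the same order, which is the property that spreadsheet exports and automation steps rely on.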
Tech Stack
- Backend: FastAPI, Python
- Parsing: pdfplumber, pypdfium2
- AI Extraction: OpenAI Responses API
- Schema Enforcement: pydantic
- Frontend Demo: Astro + Svelte
Need something similar?
I help startups, agencies, and small remote teams automate workflows, improve reporting, and build internal tools around real operational problems.
If this project looks close to what your team needs, feel free to reach out and I can suggest a practical approach.