40 Pages of PDF, Read in One Breath
Baidu has open-sourced an OCR model that reads entire documents in one pass. It needs no page-by-page tricks, it is MIT licensed, and in head-to-head tests it outperforms giants like Gemini 2.5 Pro.
A small model that reads like a human
Imagine handing someone a thick contract and asking them to type it all out. Most tools handle this by reading one page, stopping, then starting again — losing the thread every time. Baidu just released something that works more like a person who reads the whole document before putting pen to paper.
It's called Unlimited-OCR. It came out last week, it's free to use, and it can process over 40 pages of a PDF in a single pass, with no chopping things up and no stitching them back together. The result is cleaner, more accurate text extraction than most paid cloud services deliver.
The surprising part? In head-to-head tests, it outperformed models that are 80 times its size — including Google's Gemini 2.5 Pro. Smaller doesn't mean worse when the design is genuinely clever.
Why this matters for your business
If you're doing anything with documents — invoices, contracts, menus, intake forms, reports — this kind of tool sits at the front door of the whole process. The better it reads, the better everything downstream works.
Developers are already building contract review tools, accounting assistants, and document search systems on top of paid cloud OCR APIs. This model gives them (and you) a good free alternative that can run privately, on your own servers if needed.
If documents are a bottleneck in any of your workflows, this is worth raising with your tech person.
Words worth knowing
OCR — Optical Character Recognition. The technology that turns a scanned page or photo of a document into actual text a computer can read and search.
Open-source — The code is public and free. Anyone can use it, inspect it, or build on it — no subscription required.
Context window — How much text an AI can hold in its head at once. A bigger context window means it can read longer documents without losing track of what came earlier.
RAG pipeline — A setup where an AI searches through your own documents to answer questions. Think of it as giving your AI a filing cabinet of your own business knowledge.
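For the tech person: the "filing cabinet" idea behind a RAG pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Unlimited-OCR or any particular product works. The sample documents, the question, and the function names (score, retrieve, build_prompt) are all made up for the example, and simple word overlap stands in for the embedding-based search that real systems usually use.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word/number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(question: str, passage: str) -> int:
    """Count how often the question's words appear in the passage.

    Assumption: word overlap is a stand-in for real semantic search.
    """
    question_words = set(tokenize(question))
    passage_counts = Counter(tokenize(passage))
    return sum(passage_counts[word] for word in question_words)


def retrieve(question: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k passages that best match the question."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved text and the question into one prompt for an AI model."""
    context = "\n".join(retrieve(question, passages))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical text, as it might look after OCR has extracted it from scans.
documents = [
    "Invoice 1042: payment is due within 30 days of delivery.",
    "Lunch menu: soup of the day, grilled sandwich, and salad.",
    "Contract clause 7: either party may terminate with 60 days written notice.",
]
question = "How many days notice to terminate the contract?"

print(build_prompt(question, documents))
```

The better the OCR step reads the original pages, the better the text in that filing cabinet, which is why extraction quality matters for everything built on top of it.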