
What if your AI assistant actually read the document — like a human?

February 25, 2026 · via GitHub · @VectifyAI
AI · open-source · MCP · tools · workflow

The problem with how AI reads today

When you ask an AI tool a question about a long document — a contract, a report, a manual — it doesn't actually read the whole thing. It breaks the document into tiny pieces, converts each piece into numbers, and then searches for the pieces that look most similar to your question. It's a bit like trying to understand a book by sorting its pages by colour.

It works, sometimes. But it misses things that matter. Context. Structure. The fact that clause 7.2 only makes sense if you read section 3 first.
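
For readers who want to see the mechanics, here is a minimal sketch of the chunk-and-match approach described above. It is a toy under stated assumptions: the `embed` function is a simple word count rather than a real neural embedding, and the names `chunk`, `embed`, `cosine` and `retrieve` are made up for illustration and do not belong to any particular tool.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Split text into fixed-size word windows, ignoring document structure."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy 'embedding': word counts. Real systems use a neural model (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks whose vectors look most like the question's."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "Supplier means Acme Ltd",             # the definition, from an early section
    "The Supplier shall pay the late fee",  # the clause that depends on it
]
top = retrieve("Who pays the late fee?", chunks, k=1)
```

In this toy run the clause about the late fee is returned, but the chunk that defines who "the Supplier" is shares no words with the question and is left out. That is the kind of missing context the paragraph above describes.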

What PageIndex does differently

PageIndex builds a kind of table of contents for the document, a map of how it's organised, and then reasons over that map. Instead of asking "which pieces sound like the question?", it asks "where would a knowledgeable person actually look?"
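
The idea of reasoning over a map can be sketched in a few lines. This is an illustration only, not PageIndex's actual code or API: the `Node` class, the `relevance` scorer and the `navigate` function are hypothetical names, and the word-overlap scorer stands in for the step where a real system would ask a language model which section to open.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in the document's table of contents (hypothetical structure)."""
    title: str
    summary: str
    pages: tuple
    children: list = field(default_factory=list)


def relevance(question, node):
    """Stand-in for the reasoning step: counts words shared with the question.
    A real system would ask a language model instead (assumption)."""
    def words(s):
        return set(re.findall(r"[a-z]+", s.lower()))
    return len(words(question) & words(node.title + " " + node.summary))


def navigate(question, node):
    """Walk from the root down, opening the most promising section at each level."""
    path = [node.title]
    while node.children:
        node = max(node.children, key=lambda child: relevance(question, child))
        path.append(node.title)
    return path, node.pages


root = Node("Annual Report", "Full report", (1, 120), [
    Node("Financial Statements", "balance sheet, income, cash flow figures", (40, 90), [
        Node("Income Statement", "revenue, expenses and net income for the year", (41, 50)),
        Node("Balance Sheet", "assets, liabilities and equity", (51, 60)),
    ]),
    Node("Risk Factors", "market, legal and operational risk", (10, 39)),
])

path, pages = navigate("What was net income for the year?", root)
```

Here the walk goes from the report to its financial statements to the income statement, and hands back a page range to read in full. The design choice worth noticing is that the search follows the document's structure, so the answer arrives with its surrounding section rather than as an isolated fragment.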

The result is noticeably more accurate. On FinanceBench, a standard benchmark for question answering over financial documents, its developers report 98.7% accuracy, well ahead of the traditional vector-based approach. It's also open-source and free to run.

You can connect it directly to Claude or other AI tools you already use, so your assistant can actually find what it's looking for in a long document rather than relying on a lucky guess.

Words worth knowing

RAG — "Retrieval-Augmented Generation." A way of giving an AI access to your own documents before it answers a question. Think of it as handing your assistant a folder before a meeting.

Vector database — A special kind of filing system that stores documents as patterns of numbers so AI can find "similar" ones quickly. Useful, but blunt.

MCP — "Model Context Protocol." A connector standard that lets AI assistants plug into outside tools and data sources. Like a universal plug socket for AI.

Open-source — Software whose inner workings are public and free to use or modify. Usually maintained by a community of developers.


If you work with long documents — legal, financial, technical — this is worth watching. The question to sit with: what decisions in your business depend on someone actually reading something carefully?
