
Documents & OCR

Turning messy PDFs, scans, and images into clean, structured text.

3 tools in this category

A vision model that reads documents and compresses each page into very few tokens, making long PDFs cheap to feed to a model.

Parses document images (tables, columns, figures) into a clean, structured layout for downstream use.

Uses an LLM to pull structured fields out of messy text, keeping each value linked back to where it came from.
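
To make "linked back to where it came from" concrete, here is a minimal sketch of the idea, not the way any particular tool implements it. It assumes the model has already returned a dictionary of field values, and it grounds each value with a plain exact-substring search. The names `GroundedField` and `ground_fields`, and the sample invoice text, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroundedField:
    name: str
    value: str
    start: Optional[int]  # character offset in the source, or None if not found
    end: Optional[int]    # exclusive end offset, or None if not found


def ground_fields(source: str, fields: dict[str, str]) -> list[GroundedField]:
    """Attach a source span to each extracted value via exact substring match."""
    grounded = []
    for name, value in fields.items():
        idx = source.find(value)
        if idx == -1:
            # The value does not appear verbatim, so it cannot be traced back.
            grounded.append(GroundedField(name, value, None, None))
        else:
            grounded.append(GroundedField(name, value, idx, idx + len(value)))
    return grounded


# Made-up input text for the example.
source = "Invoice no. INV-0042, issued 2024-03-01 to Acme Corp. Total due: $1,250.00"

# Hypothetical model output; a real tool would obtain this from an LLM call.
fields = {"invoice_id": "INV-0042", "total": "$1,250.00", "currency": "USD"}

results = ground_fields(source, fields)
for f in results:
    span = source[f.start:f.end] if f.start is not None else "<not found in source>"
    print(f"{f.name}: {f.value!r} -> {span!r}")
```

A value with no span (here `currency`, which the text never states) is a useful warning sign: the model may have inferred or invented it, so it deserves a manual check. Real tools likely need fuzzier matching than an exact substring search, since OCR noise and reformatted values will not match verbatim.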