
[ From the team behind LlamaParse ]

Parse Any Document.
Locally. Fast.

Open-source document parsing from the team behind LlamaParse. Get parsed text from PDFs, Office docs, and images. No cloud, no LLM tokens, no limits.

npm i -g @llamaindex/liteparse
lit parse anything.pdf

Fully open-source

Fast local processing

All major formats

Bounding box output

How it works

How LiteParse works

01

Input

Drop in any document: PDF, DOCX, PPTX, XLSX, or image. LiteParse auto-detects the format and selects the right parsing strategy.
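To make format auto-detection concrete, here is a minimal sketch that guesses a file's type from its leading bytes rather than its extension. It is an illustration only, not LiteParse's actual detection logic: `detect_format` is a hypothetical helper, and the signature table covers just a few well-known formats.

```python
import zipfile
from pathlib import Path

# Well-known file signatures (magic bytes). Illustrative, not LiteParse's own table.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
]

# DOCX, PPTX, and XLSX are ZIP archives; their top-level folder tells them apart.
OOXML_MARKERS = {"word/": "docx", "ppt/": "pptx", "xl/": "xlsx"}


def detect_format(path):
    """Guess a document format from its content (hypothetical helper)."""
    path = Path(path)
    with path.open("rb") as fh:
        head = fh.read(8)
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    # ZIP local-file header: look inside the archive for an Office folder.
    if head.startswith(b"PK\x03\x04") and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for member in zf.namelist():
                for prefix, name in OOXML_MARKERS.items():
                    if member.startswith(prefix):
                        return name
    return "unknown"
```

Checking content instead of the extension means a renamed or extension-less file can still be routed to a suitable parser.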

02

Text Parsing + OCR

A hybrid approach: extract and structure the text embedded in files, then fall back to traditional OCR for scanned regions. Both run locally, with no API calls and no data leaving your machine.
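The fallback idea can be sketched in a few lines. This is a simplified illustration, not LiteParse's implementation: it decides per page rather than per region, and the `Page` type, the `parse_pages` function, and the `ocr` callable are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Page:
    number: int
    embedded_text: str  # text layer stored in the file; empty for scanned pages


def parse_pages(pages: List[Page], ocr: Callable[[Page], str],
                min_chars: int = 1) -> List[Dict]:
    """Use the embedded text layer when present, otherwise fall back to OCR."""
    results = []
    for page in pages:
        text = page.embedded_text.strip()
        if len(text) >= min_chars:
            # Cheap path: the file already carries machine-readable text.
            results.append({"page": page.number, "source": "embedded", "text": text})
        else:
            # Slow path: only pages without a usable text layer reach OCR.
            results.append({"page": page.number, "source": "ocr", "text": ocr(page)})
    return results
```

Passing the OCR engine in as a callable keeps the sketch independent of any particular OCR library and makes the fallback easy to test with a stub.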

03

AI-Ready Output

Get clean JSON with every text element tagged by position, bounding boxes included. Ready for AI agents, citations, or downstream tasks.
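As a rough sketch of what position-tagged output enables, the snippet below sorts text elements into reading order using their bounding boxes. The JSON shape (`pages`, `items`, `text`, `bbox` as `[x0, y0, x1, y1]` with a top-left origin) is an assumption made for illustration, not LiteParse's documented schema, and the sample values are made up.

```python
import json

# Hypothetical output shape; field names and coordinates are assumptions.
SAMPLE = """
{"pages": [{"page": 1, "items": [
  {"text": "world", "bbox": [120, 40, 170, 52]},
  {"text": "Title", "bbox": [40, 10, 200, 30]},
  {"text": "Hello", "bbox": [40, 40, 110, 52]}
]}]}
"""


def reading_order(items):
    """Sort items top-to-bottom, then left-to-right, by bbox [x0, y0, x1, y1]."""
    return sorted(items, key=lambda it: (it["bbox"][1], it["bbox"][0]))


doc = json.loads(SAMPLE)
for page in doc["pages"]:
    words = [it["text"] for it in reading_order(page["items"])]
    print(page["page"], " ".join(words))  # prints: 1 Title Hello world
```

Real layouts with multiple columns need more than a two-key sort, but the same coordinates are the starting point.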

One tool, every format

Stop juggling different parsers

One command handles PDFs, Office documents, images, and more. Same interface, same structured output, every time.

Precise spatial output

Know exactly where every element lives

Every parsed element comes with precise bounding box coordinates. Titles, paragraphs, tables, and figures are all tagged with their exact position on the page.

  • Citations — Point users to exact locations in source documents.

  • Multimodal pipelines — Pair extracted text with visual screenshots for richer LLM inputs.
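The citation use case can be sketched as a lookup from a quoted phrase to its page and bounding box. The document shape below (`pages`, `items`, `text`, `bbox`) is a hypothetical one chosen for illustration, `locate_quote` is not part of any LiteParse API, and the sample text and coordinates are made up.

```python
def locate_quote(doc, quote):
    """Return (page, bbox) for every element whose text contains the quote."""
    needle = quote.casefold()
    hits = []
    for page in doc["pages"]:
        for item in page["items"]:
            if needle in item["text"].casefold():
                hits.append((page["page"], item["bbox"]))
    return hits


# Made-up sample in an assumed schema: bbox is [x0, y0, x1, y1].
doc = {"pages": [
    {"page": 1, "items": [{"text": "The quick brown fox.", "bbox": [72, 140, 310, 154]}]},
    {"page": 2, "items": [{"text": "It jumps over the dog.", "bbox": [72, 90, 250, 104]}]},
]}

print(locate_quote(doc, "quick brown"))  # prints: [(1, [72, 140, 310, 154])]
```

A UI can then highlight the returned rectangle on the matching page, or crop that region into a screenshot for a multimodal prompt.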

Built for agents

Runs anywhere. Integrates with any workflow.

A fast CLI and Python package designed for automation. No API keys, no cloud dependency. Parse documents in CI/CD pipelines, agent workflows, or local scripts.

  • Pipe output directly to LLMs or vector stores

  • Batch-process entire directories in seconds

  • JSON output for programmatic workflows

  • Zero configuration, just install and run
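Batch processing from a script could look like the sketch below, which shells out to the `lit parse` command shown at the top of this page for each matching file. Two things here are assumptions, not documented behavior: that the CLI writes its result to standard output, and the list of file suffixes. The `runner` parameter is a hypothetical hook that lets you swap in the package's own API or a test stub.

```python
import subprocess
from pathlib import Path


def run_lit(path):
    """Run `lit parse <path>` and return its stdout.

    Assumption: the CLI prints the parsed result to stdout. Verify this
    against the tool's own documentation before relying on it.
    """
    done = subprocess.run(
        ["lit", "parse", str(path)],
        capture_output=True, text=True, check=True,
    )
    return done.stdout


def parse_directory(root, suffixes=(".pdf", ".docx", ".pptx", ".xlsx"), runner=run_lit):
    """Parse every matching file under root; keys are paths relative to root."""
    root = Path(root)
    results = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            results[path.relative_to(root).as_posix()] = runner(path)
    return results
```

Calling `parse_directory("docs/")` would return a dictionary mapping each relative file path to its parsed output, ready to hand to an LLM or a vector store.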

Comparison

LiteParse vs LlamaParse

Features

LiteParse

LlamaParse

Spatial Text Output

Text bounding boxes

Screenshot Image Capture

Local-Only

Markdown Output

Figure/Chart Understanding

Scalable: LiteParse scales with the number of CPU cores; LlamaParse scales in the cloud.

Embedded Image Extraction

Image Captioning

Layout Detection

SOTA OCR for Scanned Docs


Get started in seconds

Parse your documents with a fast, open-source solution by the LlamaIndex team. No API keys required.