Advanced Parsing Modes
LlamaParse leverages Large Language Models (LLMs) and Large Vision Models (LVMs) to parse documents. By setting parse_mode
it is possible to control which parsing method is used.
Parse without LLM
Used by setting parse_mode="parse_page_without_llm" on the API. Equivalent to setting fast_mode=True.
In this mode LlamaParse will not use an LLM or LVM to parse the document. Only layered text is output. This mode does not return markdown.
By default this mode extracts images from the document and runs OCR on them.
If faster results are required, it is possible to disable image extraction (by setting disable_image_extraction=True) and OCR (by setting disable_ocr=True).
In Python:
parser = LlamaParse(
  parse_mode="parse_page_without_llm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_without_llm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
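For the fastest possible raw extraction, the flags mentioned above can be combined with this mode. A minimal Python sketch, assuming the llama-parse package's standard import and load_data method (the file path is a placeholder as in the curl example):

# Minimal sketch: fastest parse, no LLM/LVM, no image extraction, no OCR.
# Assumes the llama-parse Python package; adjust the import if your install differs.
from llama_parse import LlamaParse

parser = LlamaParse(
    parse_mode="parse_page_without_llm",   # no LLM or LVM involved
    disable_image_extraction=True,          # skip image extraction for speed
    disable_ocr=True,                       # skip OCR for speed
)

documents = parser.load_data("/path/to/your/file.pdf")  # placeholder path
print(documents[0].text)  # plain layered text; this mode does not return markdown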
Parse page with LLM
Used by setting parse_mode="parse_page_with_llm" on the API.
This is our default mode (equivalent to balanced mode). In this mode LlamaParse will first extract layered text (output as text), then feed it to a Large Language Model to reconstruct the page structure (output as markdown).
This mode feeds the document to the model page by page.
It offers a good quality / cost balance (LLMs are cheaper to run than LVMs).
The model used is not configurable.
In Python:
parser = LlamaParse(
  parse_mode="parse_page_with_llm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_with_llm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
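As a usage sketch, the markdown reconstruction described above can be read back from the returned documents. The result_type option is an assumption here (it is not covered on this page) and the file path is a placeholder:

from llama_parse import LlamaParse

# Default / balanced mode: layered text extraction followed by LLM reconstruction.
# result_type is assumed to select the markdown output described above.
parser = LlamaParse(
    parse_mode="parse_page_with_llm",
    result_type="markdown",
)

documents = parser.load_data("/path/to/your/file.pdf")  # placeholder path
for doc in documents:
    print(doc.text)  # markdown reconstruction of each page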
Parse document with LLM
Used by setting parse_mode="parse_document_with_llm" on the API.
Same as parse_page_with_llm, but the document is fed to the model in full, leading to better coherence across headings and multi-page tables.
In Python:
parser = LlamaParse(
  parse_mode="parse_document_with_llm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_document_with_llm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
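Because this mode is geared toward whole-document coherence, it pairs naturally with a reader that hands complete files to the parser. A sketch using the common SimpleDirectoryReader integration; the import paths and the directory name are assumptions, not part of this page:

from llama_index.core import SimpleDirectoryReader
from llama_parse import LlamaParse

# Whole-document mode: better coherence for headings and multi-page tables.
parser = LlamaParse(parse_mode="parse_document_with_llm")

# Route every PDF in a folder through LlamaParse; "./reports" is a placeholder.
reader = SimpleDirectoryReader(
    input_dir="./reports",
    file_extractor={".pdf": parser},
)
documents = reader.load_data()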
Parse page with LVM
Used by setting parse_mode="parse_page_with_lvm" on the API. Equivalent to use_vendor_multimodal_model=True.
In this mode LlamaParse will take a screenshot of each page of the document and feed it to an LVM for reconstruction.
By default this mode uses openai-gpt4o as the model. This can be changed using vendor_multimodal_model_name=<model_name>.
Available models are:

| Model | Model string | Price |
|---|---|---|
| OpenAI GPT-4o (default) | openai-gpt4o | 10 credits per page (3c/page) |
| OpenAI GPT-4o Mini | openai-gpt-4o-mini | 5 credits per page (1.5c/page) |
| Anthropic Sonnet 3.5 | anthropic-sonnet-3.5 | 20 credits per page (6c/page) |
| Anthropic Sonnet 3.7 | anthropic-sonnet-3.7 | 20 credits per page (6c/page) |
| Gemini 2.0 Flash 001 | gemini-2.0-flash-001 | 5 credits per page (1.5c/page) |
| Gemini 1.5 Flash | gemini-1.5-flash | 5 credits per page (1.5c/page) |
| Gemini 1.5 Pro | gemini-1.5-pro | 10 credits per page (3c/page) |
| Custom Azure Model | custom-azure-model | N/A |
See Multimodal for more options / details.
In Python:
parser = LlamaParse(
  parse_mode="parse_page_with_lvm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_with_lvm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
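To pick a specific vision model from the table above, pass its model string via vendor_multimodal_model_name. A minimal sketch (the import and file path follow the same assumptions as the earlier examples):

from llama_parse import LlamaParse

# Screenshot-based parsing with a cheaper vision model from the table above.
parser = LlamaParse(
    parse_mode="parse_page_with_lvm",
    vendor_multimodal_model_name="openai-gpt-4o-mini",  # 5 credits per page
)

documents = parser.load_data("/path/to/your/file.pdf")  # placeholder path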
Parse page with Agent
Used by setting parse_mode="parse_page_with_agent" on the API. Equivalent to premium_mode=True.
This is our most accurate mode. In this mode LlamaParse will first extract layered text (output as text) and take a screenshot of each page of the document.
It then uses an agentic process to feed these to a Large Language Model / Large Vision Model to reconstruct the page structure (output as markdown).
This mode feeds the document to the model(s) page by page.
The models used are not configurable.
In Python:
parser = LlamaParse(
  parse_mode="parse_page_with_agent"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_with_agent"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
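Since this mode is stated above to be equivalent to premium mode, the two spellings in this sketch should be interchangeable in the Python client (same import and placeholder-path assumptions as the earlier examples):

from llama_parse import LlamaParse

# The two configurations below are described above as equivalent.
parser = LlamaParse(parse_mode="parse_page_with_agent")
# parser = LlamaParse(premium_mode=True)

documents = parser.load_data("/path/to/your/file.pdf")  # placeholder path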