Advanced parsing modes

LlamaParse leverages Large Language Models (LLMs) and Large Vision Models (LVMs) to parse documents. By setting parse_mode you can control which parsing method is used.

Parse without LLM

Enabled by setting parse_mode="parse_page_without_llm" on the API.

Equivalent to setting fast_mode=True. In this mode LlamaParse does not use an LLM or LVM to parse the document; only layered text is output. This mode does not return markdown.

By default this mode extracts images from the document and runs OCR on them.

If faster results are required, you can disable image extraction (by setting disable_image_extraction=True) and OCR (by setting disable_ocr=True).
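
For example, a minimal sketch setting both flags (flag names as documented above; the import is from the llama-parse package):

from llama_parse import LlamaParse

# Fastest configuration: no LLM/LVM reconstruction, no image extraction, no OCR.
parser = LlamaParse(
  parse_mode="parse_page_without_llm",
  disable_image_extraction=True,
  disable_ocr=True,
)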

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
  parse_mode="parse_page_without_llm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_without_llm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
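
Whichever mode is configured, the parse job is run the same way from Python. A minimal usage sketch (load_data is the standard LlamaParse entry point; the file path is illustrative):

from llama_parse import LlamaParse

parser = LlamaParse(parse_mode="parse_page_without_llm")

# load_data uploads the file, waits for the parse job to finish,
# and returns a list of parsed documents.
documents = parser.load_data("/path/to/your/file.pdf")
print(documents[0].text)  # layered text; this mode produces no markdown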

Parse page with LLM

Enabled by setting parse_mode="parse_page_with_llm" on the API.

Our default mode (equivalent to balanced mode). In this mode LlamaParse first extracts layered text (output as text), then feeds it to a Large Language Model to reconstruct the page structure (output as markdown).

This mode feeds the document to the model page by page.

This offers a good quality/cost balance (LLMs are cheaper to run than LVMs).

The model used is not configurable.

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
  parse_mode="parse_page_with_llm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_with_llm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'

Parse document with LLM

Enabled by setting parse_mode="parse_document_with_llm" on the API.

Same as parse_page_with_llm, but feeds the document to the model in full, leading to better coherence across headings and multi-page tables.

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
  parse_mode="parse_document_with_llm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_document_with_llm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'

Parse page with LVM

Enabled by setting parse_mode="parse_page_with_lvm" on the API.

Equivalent to use_vendor_multimodal_model=True.

In this mode LlamaParse takes a screenshot of each page of the document and feeds them to an LVM for reconstruction.

By default this mode uses openai-gpt4o as the model. This can be changed by setting vendor_multimodal_model_name=<model_name> (a sketch follows the table below).

Available models are:

| Model | Model string | Price |
|---|---|---|
| OpenAI GPT-4o (default) | openai-gpt4o | 10 credits per page (3c/page) |
| OpenAI GPT-4o Mini | openai-gpt-4o-mini | 5 credits per page (1.5c/page) |
| Sonnet 3.5 | anthropic-sonnet-3.5 | 20 credits per page (6c/page) |
| Sonnet 3.7 | anthropic-sonnet-3.7 | 20 credits per page (6c/page) |
| Gemini 2.0 Flash 001 | gemini-2.0-flash-001 | 5 credits per page (1.5c/page) |
| Gemini 1.5 Flash | gemini-1.5-flash | 5 credits per page (1.5c/page) |
| Gemini 1.5 Pro | gemini-1.5-pro | 10 credits per page (3c/page) |
| Custom Azure Model | custom-azure-model | N/A |
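
For example, a sketch selecting one of the model strings from the table (Gemini 2.0 Flash is an illustrative choice):

from llama_parse import LlamaParse

parser = LlamaParse(
  parse_mode="parse_page_with_lvm",
  vendor_multimodal_model_name="gemini-2.0-flash-001",  # model string from the table above
)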

See Multimodal for more options and details.

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
  parse_mode="parse_page_with_lvm"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_with_lvm"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'

Parse page with Agent

Enabled by setting parse_mode="parse_page_with_agent" on the API.

Equivalent to premium_mode=True.

Our most accurate mode. In this mode LlamaParse first extracts layered text (output as text) and takes a screenshot of each page of the document.

It then uses an agentic process to feed these to a Large Language Model / Large Vision Model to reconstruct the page structure (output as markdown).

This mode feeds the document to the model(s) page by page.

The models used are not configurable.

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
  parse_mode="parse_page_with_agent"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'parse_mode="parse_page_with_agent"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'