LlamaParse API v2 Guide

This guide covers the v2 API endpoint for LlamaParse, which replaces v1's individual form parameters with a single structured JSON configuration that is easier to organize and validate.

⚠️ Alpha Version Warning: The v2 endpoint is currently in alpha (v2alpha1) and is subject to breaking changes until the stable release. We recommend testing thoroughly and being prepared for potential API changes during development.

Quick Start

Basic Usage

curl -X POST \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
-F "file=@document.pdf" \
-F 'configuration={
"parse_options": {
"parse_mode": "preset",
"preset_options": {
"preset": "scientific"
}
}
}' \
"https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"

What's Different from v1

  • Single configuration parameter: Instead of 70+ individual form parameters, v2 uses one JSON configuration string
  • Parse mode-specific options: Only relevant options for your chosen parsing mode are available
  • Better validation: Structured JSON schema with clear error messages
  • Hierarchical organization: Related settings are grouped logically

Endpoint Details

  • URL: https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload
  • Method: POST
  • Content-Type: multipart/form-data
  • Required Headers: Authorization: Bearer YOUR_API_KEY
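The endpoint details above can also be assembled from Python. The sketch below only builds the request pieces and does not send anything; `build_parse_request` is a hypothetical helper name, and the commented-out call assumes the third-party `requests` package is installed.

```python
import json

PARSE_UPLOAD_URL = "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"


def build_parse_request(api_key: str, preset: str = "scientific") -> dict:
    """Return the URL, headers, and form fields for a preset-mode parse request."""
    configuration = {
        "parse_options": {
            "parse_mode": "preset",
            "preset_options": {"preset": preset},
        }
    }
    return {
        "url": PARSE_UPLOAD_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
        # The whole configuration travels as one JSON string in a form field.
        "data": {"configuration": json.dumps(configuration)},
    }


request = build_parse_request("YOUR_API_KEY")
print(request["data"]["configuration"])

# Sending it with `requests` (assumed to be installed, not part of the stdlib):
#   import requests
#   with open("document.pdf", "rb") as fh:
#       response = requests.post(**request, files={"file": fh})
```

Passing the file through `files=` makes `requests` encode the body as multipart/form-data, which matches the Content-Type listed above.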

Configuration Structure

The v2 API accepts two form parameters:

  1. file (optional): The document file to parse
  2. configuration (required): JSON string containing all parsing options

Input Methods

You can provide input in two ways (but not both):

  1. File upload: Use the file parameter with multipart form data
  2. URL: Specify a URL in the configuration's source_url.url field
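Because the two input methods are mutually exclusive, it can help to check the choice on the client before sending. This is a minimal sketch of such a check; `check_input_source` is a hypothetical helper, not part of any LlamaParse client library.

```python
def check_input_source(has_file: bool, configuration: dict) -> str:
    """Return "file" or "url"; raise ValueError when zero or two sources are given."""
    url = configuration.get("source_url", {}).get("url")
    if has_file and url:
        raise ValueError("Provide either a file upload or source_url.url, not both")
    if not has_file and not url:
        raise ValueError("Provide a file upload or source_url.url")
    return "file" if has_file else "url"


print(check_input_source(True, {}))
print(check_input_source(False, {"source_url": {"url": "https://example.com/document.pdf"}}))
```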

Parse Modes

The parse_mode field determines how your document is processed. Each mode has specific options available only to that mode.
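In the configurations shown in this guide, each mode's options live under a key named after the mode plus `_options`. The sketch below uses that naming pattern to catch a mismatched options block before the request is sent. `check_mode_options` is a hypothetical client-side helper; the server performs its own validation.

```python
PARSE_MODES = (
    "preset",
    "parse_without_ai",
    "parse_with_llm",
    "parse_with_external_provider",
    "parse_with_agent",
    "parse_with_layout_agent",
    "auto",
)


def check_mode_options(parse_options: dict) -> None:
    """Raise ValueError if an options block belongs to a mode other than parse_mode."""
    mode = parse_options.get("parse_mode")
    if mode not in PARSE_MODES:
        raise ValueError(f"Unknown parse_mode: {mode!r}")
    for other in PARSE_MODES:
        key = f"{other}_options"
        if other != mode and key in parse_options:
            raise ValueError(f"{key} can only be used with parse_mode '{other}'")


# Passes silently: the options key matches the selected mode.
check_mode_options({"parse_mode": "preset", "preset_options": {"preset": "slides"}})
```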

Preset Mode ("preset")

Best for: Quick setup with predefined configurations optimized for specific document types.

Available Presets:

  • "invoice" / "invoice-v-1" - Optimized for invoices and receipts
  • "scientific" / "scientific-v-1" - For scientific papers and research documents
  • "forms" / "forms-v-1" - For forms and questionnaires
  • "technicalDocumentation" / "technicalDocumentation-v-1" - For technical docs with schematics
  • "slides" - For presentation slides
  • "formsBboxExperimental" - Experimental forms parsing with bounding boxes

Configuration:

{
"parse_options": {
"parse_mode": "preset",
"preset_options": {
"preset": "scientific",
"ocr_parameters": {
"languages": ["en", "es"]
}
}
}
}

Parse Without AI ("parse_without_ai")

Best for: Fast text extraction from simple documents without complex layouts.

How it works: Extracts text directly without AI reconstruction. This is the fastest option, but the output has no Markdown formatting.

Configuration:

{
"parse_options": {
"parse_mode": "parse_without_ai",
"parse_without_ai_options": {
"ignore": {
"ignore_diagonal_text": true,
"ignore_text_in_image": false
},
"ocr_parameters": {
"languages": ["en"]
}
}
}
}

Parse with LLM ("parse_with_llm")

Best for: Documents with mixed content (text, tables, images) requiring structured output.

How it works: Uses a Large Language Model to reconstruct document structure from extracted text and images.

Configuration:

{
"parse_options": {
"parse_mode": "parse_with_llm",
"parse_with_llm_options": {
"model": "gpt-4o",
"prompts": {
"user_prompt": "Extract key financial information",
"system_prompt_append": "Focus on tables and charts"
},
"ignore": {
"ignore_diagonal_text": false,
"ignore_text_in_image": false
},
"ocr_parameters": {
"languages": ["en", "fr"]
}
}
}
}

Parse with External Provider ("parse_with_external_provider")

Best for: Using your own API keys for multimodal models or specific Azure deployments.

How it works: Sends page screenshots to external vision models for processing.

Configuration:

{
"parse_options": {
"parse_mode": "parse_with_external_provider",
"parse_with_external_provider_options": {
"model": "openai-gpt4o",
"vendor_multimodal_api_key": "sk-proj-...",
"prompts": {
"user_prompt": "Extract structured data"
},
"azure_openai": {
"deployment_name": "gpt-4-vision",
"endpoint": "https://myresource.openai.azure.com/",
"api_key": "your-key",
"api_version": "2024-02-01"
}
}
}
}

Supported Models:

  • openai-gpt4o (default)
  • openai-gpt-4o-mini
  • openai-gpt-4-1-nano
  • openai-gpt-4-1-mini
  • openai-gpt-4-1
  • anthropic-sonnet-3.5
  • anthropic-sonnet-3.7
  • anthropic-sonnet-4.0
  • gemini-2.0-flash
  • gemini-2.5-flash
  • gemini-2.5-pro
  • gemini-1.5-flash
  • gemini-1.5-pro

Parse with Agent ("parse_with_agent")

Best for: Complex documents requiring highest accuracy (financial reports, dense layouts).

How it works: Uses an agentic reasoning loop with both text and visual analysis for maximum fidelity.

Configuration:

{
"parse_options": {
"parse_mode": "parse_with_agent",
"parse_with_agent_options": {
"model": "anthropic-sonnet-4.0",
"ignore": {
"ignore_diagonal_text": false
},
"ocr_parameters": {
"languages": ["en"]
},
"prompts": {
"user_prompt": "Preserve all table structure and equations"
}
}
}
}

Parse with Layout Agent ("parse_with_layout_agent")

Best for: Documents where precise positioning matters (visual citations, dense layouts).

How it works: Uses vision-language models optimized for layout preservation.

Configuration:

{
"parse_options": {
"parse_mode": "parse_with_layout_agent",
"parse_with_layout_agent_options": {}
}
}

Auto Mode ("auto")

Best for: Dynamic parsing that adapts based on document content.

How it works: Automatically selects parsing strategy based on detected content types.

Configuration:

{
"parse_options": {
"parse_mode": "auto",
"auto_options": {
"configuration_json": "{}",
"trigger_on": {
"image": true,
"table": true,
"text": "financial",
"regexp": "\\$[0-9,]+\\.[0-9]{2}"
},
"ignore": {
"ignore_diagonal_text": false
},
"ocr_parameters": {
"languages": ["en"]
}
}
}
}
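The doubled backslashes in the regexp trigger are JSON escaping, so the pattern the service receives has a single backslash before "$" and ".". The sketch below decodes the value and tries it against two sample lines. It assumes the service's regexp dialect agrees with Python's `re` module for this pattern, which this guide does not state.

```python
import json
import re

# The trigger exactly as written in the JSON configuration above.
raw = r'{"regexp": "\\$[0-9,]+\\.[0-9]{2}"}'

pattern = json.loads(raw)["regexp"]
print(pattern)  # the decoded pattern, with single backslashes

compiled = re.compile(pattern)
for line in ["Total due: $1,234.56", "Total due: 1234.56 USD"]:
    # Only the first line contains a dollar amount with two decimals.
    print(line, "->", bool(compiled.search(line)))
```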

Input Options

Configure how different file types are processed:

{
"input_options": {
"html": {
"make_all_elements_visible": true,
"remove_fixed_elements": true,
"remove_navigation_elements": true
},
"pdf": {
"disable_image_extraction": false
},
"spreadsheet": {
"detect_sub_tables_in_sheets": true
}
}
}

HTML Options

  • make_all_elements_visible: Forces hidden elements to be visible during parsing
  • remove_fixed_elements: Removes fixed-position elements (headers, sidebars)
  • remove_navigation_elements: Removes navigation menus

PDF Options

  • disable_image_extraction: Skip extracting embedded images from PDFs

Spreadsheet Options

  • detect_sub_tables_in_sheets: Find and extract sub-tables within each sheet

Source URL Configuration

Parse documents from web URLs instead of file uploads:

{
"source_url": {
"url": "https://example.com/document.pdf",
"http_proxy": "https://proxy.company.com:8080"
}
}

  • url: Direct URL to the document (must be publicly accessible)
  • http_proxy: Optional proxy server for URL requests

Page Ranges

Control which pages to process:

{
"page_ranges": {
"max_pages": 10,
"target_pages": "1,3,5-10"
}
}

  • max_pages: Maximum number of pages to process
  • target_pages: Specific pages using 1-based indexing (e.g., "1,3,5-10" for pages 1, 3, and 5 through 10)

Important: v2 uses 1-based page indexing, unlike v1, which used 0-based indexing.
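To make the indexing change concrete, the sketch below expands a target_pages string into page numbers and shifts a 0-based v1 string to its 1-based v2 equivalent. Both helpers are hypothetical, and the conversion assumes the v1 string uses the same comma and range syntax.

```python
def expand_target_pages(target_pages: str) -> list[int]:
    """Expand a spec such as "1,3,5-10" into a sorted list of 1-based page numbers."""
    pages = set()
    for part in target_pages.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


def v1_to_v2_target_pages(v1_spec: str) -> str:
    """Shift every page number in a 0-based v1 spec up by one."""
    def shift(part: str) -> str:
        return "-".join(str(int(p) + 1) for p in part.strip().split("-"))

    return ",".join(shift(part) for part in v1_spec.split(","))


print(expand_target_pages("1,3,5-10"))   # [1, 3, 5, 6, 7, 8, 9, 10]
print(v1_to_v2_target_pages("0,2,4-9"))  # 1,3,5-10
```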

Crop Box

Define a specific area of each page to parse:

{
"crop_box": {
"top": 0.1,
"right": 0.1,
"bottom": 0.1,
"left": 0.1
}
}

Values are ratios (0.0 to 1.0) of the page dimensions. The example above crops a 10% margin from every side.
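As a worked illustration of the ratios, the sketch below converts a crop_box into the rectangle that remains on a page of a given size. It assumes each value is the fraction trimmed from its own side and that coordinates start at the top-left corner; `cropped_region` is a hypothetical helper.

```python
def cropped_region(page_width: float, page_height: float, crop_box: dict) -> tuple:
    """Return (left, top, right, bottom) of the area kept after cropping."""
    left = page_width * crop_box.get("left", 0.0)
    top = page_height * crop_box.get("top", 0.0)
    right = page_width * (1.0 - crop_box.get("right", 0.0))
    bottom = page_height * (1.0 - crop_box.get("bottom", 0.0))
    if left >= right or top >= bottom:
        raise ValueError("crop_box leaves no area to parse")
    return (left, top, right, bottom)


# A 1000 x 2000 page with a 10% margin trimmed from every side.
box = {"top": 0.1, "right": 0.1, "bottom": 0.1, "left": 0.1}
print(cropped_region(1000, 2000, box))
```

Under this reading, opposite sides must sum to less than 1.0, otherwise nothing is left to parse.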

Output Options

Customize the output format and structure:

Markdown Options

{
"output_options": {
"markdown": {
"annotate_links": true,
"pages": {
"prefix": "## Page {pageNumber}\n",
"custom_page_separator": "\n\n== {pageNumber} ==\n\n"
"headers_footers": {
"hide_headers": true,
"hide_footers": false,
"page_header_prefix": "Header: ",
"page_footer_suffix": " (Footer)"
},
"tables": {
"compact_markdown_tables": false,
"output_tables_as_markdown": false,
"markdown_table_multiline_separator": " | "
}
}
}
}
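The `{pageNumber}` token in the page prefix and separator reads as a placeholder for the page number. The sketch below shows that substitution locally. It is an assumption about how the service expands the template, and `render_page_template` is a hypothetical helper.

```python
def render_page_template(template: str, page_number: int) -> str:
    """Replace every {pageNumber} placeholder with the given page number."""
    return template.replace("{pageNumber}", str(page_number))


prefix = "## Page {pageNumber}\n"
separator = "\n\n== {pageNumber} ==\n\n"

print(repr(render_page_template(prefix, 3)))
print(repr(render_page_template(separator, 3)))
```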

Spatial Text Options

{
"output_options": {
"spatial_text": {
"preserve_layout_alignment_across_pages": true,
"preserve_very_small_text": false,
"do_not_unroll_columns": false
}
}
}

Export Options

{
"output_options": {
"tables_as_spreadsheet": {
"enable": true,
"guess_sheet_name": true
},
"extract_layout": {
"enable": true,
"ignore_document_elements_for_layout_detection": false
},
"vectorial_objects": {
"enable": true
},
"embedded_images": {
"enable": true
},
"screenshots": {
"enable": true
},
"export_pdf": {
"enable": false
}
}
}

Webhook Configuration

Set up notifications for job completion:

{
"webhook_configurations": [
{
"webhook_url": "https://your-app.com/webhook",
"webhook_headers": {
"X-Custom-Header": "value"
},
"webhook_events": ["parse.done"]
}
]
}

Note: Currently only the first webhook configuration is used.
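Since only the first entry takes effect, a client can check which webhook will actually be used and how many entries are ignored. A minimal sketch with a hypothetical helper name:

```python
def effective_webhook(configuration: dict) -> tuple:
    """Return (first webhook entry or None, number of ignored entries)."""
    hooks = configuration.get("webhook_configurations", [])
    if not hooks:
        return (None, 0)
    return (hooks[0], len(hooks) - 1)


config = {
    "webhook_configurations": [
        {"webhook_url": "https://your-app.com/webhook", "webhook_events": ["parse.done"]},
        {"webhook_url": "https://your-app.com/backup", "webhook_events": ["parse.done"]},
    ]
}
hook, ignored = effective_webhook(config)
print(hook["webhook_url"], "ignored entries:", ignored)
```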

Processing Control

Configure timeouts and error handling:

{
"processing_control": {
"timeouts": {
"base_in_seconds": 300,
"extra_time_per_page_in_seconds": 30
},
"job_failure_conditions": {
"allowed_page_failure_ratio": 0.1,
"fail_on_image_extraction_error": false,
"fail_on_image_ocr_error": false,
"fail_on_markdown_reconstruction_error": true,
"fail_on_buggy_font": false
},
"fallback_content": {
"mode": "empty_page",
"prefix": "ERROR: ",
"suffix": " (failed to parse)"
}
}
}
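The two timeout fields suggest a budget that grows with document length. The sketch below assumes the total is the base plus the per-page allowance multiplied by the page count. This guide does not spell out the formula, so treat it as an estimate; the zero defaults are also an assumption.

```python
def estimate_timeout_seconds(timeouts: dict, page_count: int) -> int:
    """Estimate the job timeout as base + per-page extra * pages (assumed formula)."""
    base = timeouts.get("base_in_seconds", 0)
    per_page = timeouts.get("extra_time_per_page_in_seconds", 0)
    return base + per_page * page_count


timeouts = {"base_in_seconds": 300, "extra_time_per_page_in_seconds": 30}
print(estimate_timeout_seconds(timeouts, 10))  # 300 + 30 * 10 = 600
```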

Cache Control

Disable caching for fresh results:

{
"disable_cache": true
}

When true, this both invalidates any existing cache and prevents caching of new results.

Always-Enabled Features

The following features are always enabled in v2 and cannot be disabled:

  • adaptive_long_table: Adaptive long table detection
  • high_res_ocr: High-resolution OCR processing
  • merge_tables_across_pages_in_markdown: Table merging across pages
  • outlined_table_extraction: Outlined table extraction

These were made default because they improve results for most documents.
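When carrying settings over from v1, these four flags no longer need to be sent. The sketch below filters them out of a parameter dictionary. It assumes the v1 parameter names match the feature names listed above, and `drop_always_enabled` is a hypothetical helper.

```python
ALWAYS_ENABLED = frozenset({
    "adaptive_long_table",
    "high_res_ocr",
    "merge_tables_across_pages_in_markdown",
    "outlined_table_extraction",
})


def drop_always_enabled(v1_params: dict) -> dict:
    """Return a copy of v1_params without the flags that v2 always enables."""
    return {k: v for k, v in v1_params.items() if k not in ALWAYS_ENABLED}


v1_params = {"high_res_ocr": True, "outlined_table_extraction": True, "disable_cache": True}
print(drop_always_enabled(v1_params))  # {'disable_cache': True}
```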

Complete Configuration Example

{
"parse_options": {
"parse_mode": "parse_with_llm",
"parse_with_llm_options": {
"model": "gpt-4o",
"prompts": {
"user_prompt": "Extract all financial data and preserve table structure"
},
"ignore": {
"ignore_diagonal_text": true,
"ignore_text_in_image": false
},
"ocr_parameters": {
"languages": ["en", "es"]
}
}
},
"source_url": {
"url": "https://example.com/report.pdf"
},
"page_ranges": {
"max_pages": 20,
"target_pages": "1-5,10,15-20"
},
"crop_box": {
"top": 0.05,
"bottom": 0.05,
"left": 0.05,
"right": 0.05
},
"output_options": {
"markdown": {
"annotate_links": true,
"pages": {
"prefix": "# Page {pageNumber}\n"
},
"tables": {
"output_tables_as_markdown": true
}
},
"extract_layout": {
"enable": true
},
"screenshots": {
"enable": true
}
},
"webhook_configurations": [
{
"webhook_url": "https://myapp.com/webhook",
"webhook_events": ["parse.done"]
}
],
"processing_control": {
"timeouts": {
"base_in_seconds": 600
},
"job_failure_conditions": {
"allowed_page_failure_ratio": 0.05
}
},
"disable_cache": false
}

Error Handling

v2 provides detailed validation errors:

{
"detail": [
{
"type": "value_error",
"loc": ["parse_options", "parse_with_llm_options"],
"msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
"input": {...}
}
]
}
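A client can flatten this error structure into one readable line per problem. The sketch below relies only on the fields shown in the example above (`loc`, `msg`, `type`); `format_validation_errors` is a hypothetical helper.

```python
def format_validation_errors(body: dict) -> list[str]:
    """Turn a validation error response into "location: message [type]" lines."""
    lines = []
    for err in body.get("detail", []):
        location = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{location}: {err.get('msg', '')} [{err.get('type', 'unknown')}]")
    return lines


body = {
    "detail": [
        {
            "type": "value_error",
            "loc": ["parse_options", "parse_with_llm_options"],
            "msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
        }
    ]
}
for line in format_validation_errors(body):
    print(line)
```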

Response Format

The response structure remains the same as v1, returning a ParsingJob object with job details and status.

Migration from v1

If you're migrating from v1, see our detailed migration guide for parameter mapping and breaking changes.