LlamaParse API v2 Guide

This guide covers the v2 API endpoint for LlamaParse, which replaces v1's individual form parameters with a single structured JSON configuration that is easier to organize and validate.

⚠️ Alpha Version Warning: The v2 endpoint is currently in alpha (v2alpha1) and is subject to breaking changes until the stable release. We recommend testing thoroughly and being prepared for potential API changes during development.

Quick Start

Basic Usage

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "preset",
      "preset_options": {
        "preset": "scientific"
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
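The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_parse_request` and `send` are hypothetical, and decoding the response body as JSON is an assumption based on the endpoint returning a job object.

```python
import json
import uuid
import urllib.request

PARSE_URL = "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"


def build_parse_request(file_name, file_bytes, configuration, api_key):
    """Build (but do not send) a multipart/form-data POST for the v2 endpoint.

    Hypothetical helper: file_name is assumed to contain no double quotes.
    """
    boundary = uuid.uuid4().hex
    # The "configuration" form field carries the whole JSON document as a string.
    config_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="configuration"\r\n\r\n'
        f"{json.dumps(configuration)}\r\n"
    ).encode("utf-8")
    # The "file" form field carries the raw document bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + file_bytes + b"\r\n"
    body = config_part + file_part + f"--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        PARSE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send(request):
    """Send the request and decode the response body as JSON (assumed format)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Passing the bytes of document.pdf, the configuration shown above as a Python dict, and the value of LLAMA_CLOUD_API_KEY to `build_parse_request`, then calling `send` on the result, would mirror the curl command.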

What's Different from v1

Single configuration parameter: Instead of 70+ individual form parameters, v2 uses one JSON configuration string.
Parse mode-specific options: Only the options relevant to your chosen parse mode are available.
Better validation: The configuration follows a structured JSON schema and produces clear error messages.
Hierarchical organization: Related settings are grouped logically.

Endpoint Details

URL: https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload
Method: POST
Content-Type: multipart/form-data
Required Headers: Authorization: Bearer YOUR_API_KEY

Configuration Structure

The v2 API accepts two form parameters:

file (optional): The document file to parse
configuration (required): JSON string containing all parsing options

Input Methods

You can provide input in two ways (but not both):

File upload: Use the file parameter with multipart form data
URL: Specify a URL in the configuration's source_url.url field
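A client can check this rule before sending anything. The following is a minimal sketch: `check_input_source` is a hypothetical helper, and treating "neither a file nor a URL" as an error is an assumption, since the guide only states that both cannot be used together.

```python
def check_input_source(configuration, has_file):
    """Return "file" or "url", or raise ValueError for an invalid combination."""
    url = (configuration.get("source_url") or {}).get("url")
    if has_file and url:
        raise ValueError("Provide either a file upload or source_url.url, not both")
    if not has_file and not url:
        # Assumption: a request with no input at all is also invalid.
        raise ValueError("Provide a file upload or set source_url.url")
    return "file" if has_file else "url"
```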

Parse Modes

The parse_mode field determines how your document is processed. Each mode has specific options available only to that mode.
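In every mode documented in this guide, the options object is named after the mode with an `_options` suffix. A small helper can keep the two in sync; this is a sketch with a hypothetical function name, and the list of modes is taken from the sections of this guide rather than from a published schema.

```python
PARSE_MODES = (
    "preset",
    "parse_without_ai",
    "parse_with_llm",
    "parse_with_external_provider",
    "parse_with_agent",
    "parse_with_layout_agent",
    "auto",
)


def build_parse_options(parse_mode, options=None):
    """Pair a parse mode with its matching "<mode>_options" object."""
    if parse_mode not in PARSE_MODES:
        raise ValueError(f"Unknown parse_mode: {parse_mode!r}")
    return {
        "parse_options": {
            "parse_mode": parse_mode,
            # The options key is derived from the mode name, so the two cannot drift apart.
            f"{parse_mode}_options": dict(options or {}),
        }
    }
```

Building the configuration this way avoids the mismatch that triggers a validation error, such as sending `parse_with_llm_options` alongside a different `parse_mode`.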

Preset Mode (`"preset"`)

Best for: Quick setup with predefined configurations optimized for specific document types.

Available Presets:

"invoice" / "invoice-v-1" - Optimized for invoices and receipts
"scientific" / "scientific-v-1" - For scientific papers and research documents
"forms" / "forms-v-1" - For forms and questionnaires
"technicalDocumentation" / "technicalDocumentation-v-1" - For technical docs with schematics
"slides" - For presentation slides
"formsBboxExperimental" - Experimental forms parsing with bounding boxes

Configuration:

{
  "parse_options": {
    "parse_mode": "preset",
    "preset_options": {
      "preset": "scientific",
      "ocr_parameters": {
        "languages": ["en", "es"]
      }
    }
  }
}

Parse Without AI (`"parse_without_ai"`)

Best for: Fast text extraction from simple documents without complex layouts.

How it works: Extracts text directly without AI reconstruction. This is the fastest option, but the output has no markdown formatting.

Configuration:

{
  "parse_options": {
    "parse_mode": "parse_without_ai",
    "parse_without_ai_options": {
      "ignore": {
        "ignore_diagonal_text": true,
        "ignore_text_in_image": false
      },
      "ocr_parameters": {
        "languages": ["en"]
      }
    }
  }
}

Parse with LLM (`"parse_with_llm"`)

Best for: Documents with mixed content (text, tables, images) requiring structured output.

How it works: Uses a Large Language Model to reconstruct document structure from extracted text and images.

Configuration:

{
  "parse_options": {
    "parse_mode": "parse_with_llm",
    "parse_with_llm_options": {
      "model": "gpt-4o",
      "prompts": {
        "user_prompt": "Extract key financial information",
        "system_prompt_append": "Focus on tables and charts"
      },
      "ignore": {
        "ignore_diagonal_text": false,
        "ignore_text_in_image": false
      },
      "ocr_parameters": {
        "languages": ["en", "fr"]
      }
    }
  }
}

Parse with External Provider (`"parse_with_external_provider"`)

Best for: Using your own API keys for multimodal models or specific Azure deployments.

How it works: Sends page screenshots to external vision models for processing.

Configuration:

{
  "parse_options": {
    "parse_mode": "parse_with_external_provider",
    "parse_with_external_provider_options": {
      "model": "openai-gpt4o",
      "vendor_multimodal_api_key": "sk-proj-...",
      "prompts": {
        "user_prompt": "Extract structured data"
      },
      "azure_openai": {
        "deployment_name": "gpt-4-vision",
        "endpoint": "https://myresource.openai.azure.com/",
        "api_key": "your-key",
        "api_version": "2024-02-01"
      }
    }
  }
}

Supported Models:

openai-gpt4o (default)
openai-gpt-4o-mini
openai-gpt-4-1-nano
openai-gpt-4-1-mini
openai-gpt-4-1
anthropic-sonnet-3.5
anthropic-sonnet-3.7
anthropic-sonnet-4.0
gemini-2.0-flash
gemini-2.5-flash
gemini-2.5-pro
gemini-1.5-flash
gemini-1.5-pro

Parse with Agent (`"parse_with_agent"`)

Best for: Complex documents requiring highest accuracy (financial reports, dense layouts).

How it works: Uses an agentic reasoning loop with both text and visual analysis for maximum fidelity.

Configuration:

{
  "parse_options": {
    "parse_mode": "parse_with_agent",
    "parse_with_agent_options": {
      "model": "anthropic-sonnet-4.0",
      "ignore": {
        "ignore_diagonal_text": false
      },
      "ocr_parameters": {
        "languages": ["en"]
      },
      "prompts": {
        "user_prompt": "Preserve all table structure and equations"
      }
    }
  }
}

Parse with Layout Agent (`"parse_with_layout_agent"`)

Best for: Documents where precise positioning matters (visual citations, dense layouts).

How it works: Uses vision-language models optimized for layout preservation.

Configuration:

{
  "parse_options": {
    "parse_mode": "parse_with_layout_agent",
    "parse_with_layout_agent_options": {}
  }
}

Auto Mode (`"auto"`)

Best for: Dynamic parsing that adapts based on document content.

How it works: Automatically selects parsing strategy based on detected content types.

Configuration:

{
  "parse_options": {
    "parse_mode": "auto",
    "auto_options": {
      "configuration_json": "{}",
      "trigger_on": {
        "image": true,
        "table": true,
        "text": "financial",
        "regexp": "\\$[0-9,]+\\.[0-9]{2}"
      },
      "ignore": {
        "ignore_diagonal_text": false
      },
      "ocr_parameters": {
        "languages": ["en"]
      }
    }
  }
}
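The doubled backslashes in `regexp` are JSON escaping: after decoding, the pattern is `\$[0-9,]+\.[0-9]{2}`, which matches dollar amounts with two decimal places. The sketch below checks this locally with Python's `re` module. Which regex engine the service uses is not stated here, so treat the local behavior as an approximation.

```python
import json
import re

# Raw string so the JSON text keeps its doubled backslashes exactly as in the example.
config_fragment = r'{"trigger_on": {"regexp": "\\$[0-9,]+\\.[0-9]{2}"}}'

# JSON decoding turns each "\\" into a single backslash.
pattern = json.loads(config_fragment)["trigger_on"]["regexp"]


def triggers(text):
    """Return True if the decoded pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None
```

With this pattern, `triggers("Total due: $1,234.56")` is true, while text without a dollar sign or without two decimal places does not match.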

Input Options

Configure how different file types are processed:

{
  "input_options": {
    "html": {
      "make_all_elements_visible": true,
      "remove_fixed_elements": true,
      "remove_navigation_elements": true
    },
    "pdf": {
      "disable_image_extraction": false
    },
    "spreadsheet": {
      "detect_sub_tables_in_sheets": true
    }
  }
}

HTML Options

make_all_elements_visible: Forces hidden elements to be visible during parsing
remove_fixed_elements: Removes fixed-position elements (headers, sidebars)
remove_navigation_elements: Removes navigation menus

PDF Options

disable_image_extraction: Skip extracting embedded images from PDFs

Spreadsheet Options

detect_sub_tables_in_sheets: Detect and extract sub-tables within each sheet

Source URL Configuration

Parse documents from web URLs instead of file uploads:

{
  "source_url": {
    "url": "https://example.com/document.pdf",
    "http_proxy": "https://proxy.company.com:8080"
  }
}

url: Direct URL to the document (must be publicly accessible)
http_proxy: Optional proxy server for URL requests

Page Ranges

Control which pages to process:

{
  "page_ranges": {
    "max_pages": 10,
    "target_pages": "1,3,5-10"
  }
}

max_pages: Maximum number of pages to process
target_pages: Specific pages using 1-based indexing (e.g., "1,3,5-10" for pages 1, 3, and 5 through 10)

Important: v2 uses 1-based page indexing, unlike v1 which used 0-based indexing.
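When migrating, an existing 0-based page specification has to be shifted by one. The following is a minimal sketch with a hypothetical helper name; it assumes the v1 string uses the same comma-and-range syntax as the v2 example above.

```python
def v1_to_v2_target_pages(v1_pages):
    """Shift a 0-based page spec such as "0,2,4-9" to its 1-based equivalent."""
    shifted = []
    for part in v1_pages.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        # A part is either a single page ("2") or a range ("4-9").
        bounds = [int(number) + 1 for number in part.split("-")]
        shifted.append("-".join(str(number) for number in bounds))
    return ",".join(shifted)
```

For example, the v1 value "0,2,4-9" becomes "1,3,5-10", the same pages as in the v2 example.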

Crop Box

Define a specific area of each page to parse:

{
  "crop_box": {
    "top": 0.1,
    "right": 0.1,
    "bottom": 0.1,
    "left": 0.1
  }
}

Each value is a ratio (0.0 to 1.0) of the page dimension measured inward from that side. The example above crops a 10% margin from every side.
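To see what a given crop box keeps, the ratios can be converted to page coordinates. This sketch assumes each value is a margin measured inward from its own side, as in the 10% example, and that coordinates start at the top-left corner of the page; `cropped_region` is a hypothetical helper.

```python
def cropped_region(page_width, page_height, crop_box):
    """Return (left, top, right, bottom) of the area kept after cropping."""
    left = page_width * crop_box.get("left", 0.0)
    right = page_width * (1.0 - crop_box.get("right", 0.0))
    top = page_height * crop_box.get("top", 0.0)
    bottom = page_height * (1.0 - crop_box.get("bottom", 0.0))
    if left >= right or top >= bottom:
        # Opposite margins that add up to 100% or more leave nothing to parse.
        raise ValueError("crop_box leaves no area to parse")
    return (left, top, right, bottom)
```

On a page that is 1000 units wide and 2000 units tall, margins of 0.1 on every side keep roughly the region from (100, 200) to (900, 1800).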

Output Options

Customize the output format and structure:

Markdown Options

{
  "output_options": {
    "markdown": {
      "annotate_links": true,
      "pages": {
        "prefix": "## Page {pageNumber}\n",
        "custom_page_separator": "\n\n== {pageNumber} ==\n\n"
      },
      "headers_footers": {
        "hide_headers": true,
        "hide_footers": false,
        "page_header_prefix": "Header: ",
        "page_footer_suffix": " (Footer)"
      },
      "tables": {
        "compact_markdown_tables": false,
        "output_tables_as_markdown": false,
        "markdown_table_multiline_separator": " | "
      }
    }
  }
}

Spatial Text Options

{
  "output_options": {
    "spatial_text": {
      "preserve_layout_alignment_across_pages": true,
      "preserve_very_small_text": false,
      "do_not_unroll_columns": false
    }
  }
}

Export Options

{
  "output_options": {
    "tables_as_spreadsheet": {
      "enable": true,
      "guess_sheet_name": true
    },
    "extract_layout": {
      "enable": true,
      "ignore_document_elements_for_layout_detection": false
    },
    "vectorial_objects": {
      "enable": true
    },
    "embedded_images": {
      "enable": true
    },
    "screenshots": {
      "enable": true
    },
    "export_pdf": {
      "enable": false
    }
  }
}

Webhook Configuration

Set up notifications for job completion:

{
  "webhook_configurations": [
    {
      "webhook_url": "https://your-app.com/webhook",
      "webhook_headers": {
        "X-Custom-Header": "value"
      },
      "webhook_events": ["parse.done"]
    }
  ]
}

Note: Currently only the first webhook configuration is used.

Processing Control

Configure timeouts and error handling:

{
  "processing_control": {
    "timeouts": {
      "base_in_seconds": 300,
      "extra_time_per_page_in_seconds": 30
    },
    "job_failure_conditions": {
      "allowed_page_failure_ratio": 0.1,
      "fail_on_image_extraction_error": false,
      "fail_on_image_ocr_error": false,
      "fail_on_markdown_reconstruction_error": true,
      "fail_on_buggy_font": false
    },
    "fallback_content": {
      "mode": "empty_page",
      "prefix": "ERROR: ",
      "suffix": " (failed to parse)"
    }
  }
}
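The field names suggest how these settings combine, although the exact server-side rules are not spelled out here. The sketch below assumes the total timeout is the base plus the per-page allowance times the page count, and that a job fails once the share of failed pages exceeds `allowed_page_failure_ratio`. Both function names and both formulas are assumptions for illustration.

```python
def job_timeout_seconds(timeouts, page_count):
    """Estimate the total timeout: base plus a per-page allowance (assumed formula)."""
    base = timeouts.get("base_in_seconds", 0)
    per_page = timeouts.get("extra_time_per_page_in_seconds", 0)
    return base + per_page * page_count


def exceeds_failure_ratio(failed_pages, total_pages, allowed_ratio):
    """Return True if the failed share is above the allowed ratio (assumed strict)."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return failed_pages / total_pages > allowed_ratio
```

Under these assumptions, the configuration above would allow 300 + 30 × 10 = 600 seconds for a 10-page document, and 2 failed pages out of 10 would exceed the 0.1 ratio.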

Cache Control

Disable caching for fresh results:

{
  "disable_cache": true
}

When true, this both invalidates any existing cache and prevents caching of new results.

Always-Enabled Features

The following features are always enabled in v2 and cannot be disabled:

adaptive_long_table: Adaptive long table detection
high_res_ocr: High-resolution OCR processing
merge_tables_across_pages_in_markdown: Table merging across pages
outlined_table_extraction: Outlined table extraction

These were made default because they improve results for most documents.

Complete Configuration Example

{
  "parse_options": {
    "parse_mode": "parse_with_llm",
    "parse_with_llm_options": {
      "model": "gpt-4o",
      "prompts": {
        "user_prompt": "Extract all financial data and preserve table structure"
      },
      "ignore": {
        "ignore_diagonal_text": true,
        "ignore_text_in_image": false
      },
      "ocr_parameters": {
        "languages": ["en", "es"]
      }
    }
  },
  "source_url": {
    "url": "https://example.com/report.pdf"
  },
  "page_ranges": {
    "max_pages": 20,
    "target_pages": "1-5,10,15-20"
  },
  "crop_box": {
    "top": 0.05,
    "bottom": 0.05,
    "left": 0.05,
    "right": 0.05
  },
  "output_options": {
    "markdown": {
      "annotate_links": true,
      "pages": {
        "prefix": "# Page {pageNumber}\n"
      },
      "tables": {
        "output_tables_as_markdown": true
      }
    },
    "extract_layout": {
      "enable": true
    },
    "screenshots": {
      "enable": true
    }
  },
  "webhook_configurations": [
    {
      "webhook_url": "https://myapp.com/webhook",
      "webhook_events": ["parse.done"]
    }
  ],
  "processing_control": {
    "timeouts": {
      "base_in_seconds": 600
    },
    "job_failure_conditions": {
      "allowed_page_failure_ratio": 0.05
    }
  },
  "disable_cache": false
}

Error Handling

v2 provides detailed validation errors:

{
  "detail": [
    {
      "type": "value_error",
      "loc": ["parse_options", "parse_with_llm_options"],
      "msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
      "input": {...}
    }
  ]
}
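A client can flatten this structure into readable messages. This is a minimal sketch that relies only on the `detail`, `loc`, `msg`, and `type` fields shown above; `summarize_validation_errors` is a hypothetical helper.

```python
def summarize_validation_errors(error_body):
    """Turn a validation error body into one readable line per problem."""
    lines = []
    for item in error_body.get("detail", []):
        # "loc" is a path into the configuration; join it into dotted form.
        location = ".".join(str(part) for part in item.get("loc", []))
        message = item.get("msg", "")
        error_type = item.get("type", "unknown")
        lines.append(f"{location}: {message} [{error_type}]")
    return lines
```

For the error above, this yields a single line starting with `parse_options.parse_with_llm_options:` followed by the message and the error type.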

Response Format

The response structure remains the same as v1, returning a ParsingJob object with job details and status.

Migration from v1

If you're migrating from v1, see our detailed migration guide for parameter mapping and breaking changes.
