Migration Guide: Parse Upload Endpoint v1 to v2

This guide will help you migrate from the v1 Parse upload endpoint to the new v2 endpoint, which introduces a structured configuration approach and improved organization of parsing options.

⚠️ Alpha Version Warning: The v2 endpoint is currently in alpha (v2alpha1) and is subject to breaking changes until the stable release. We recommend testing thoroughly and being prepared for potential API changes during development.

Overview of Changes

The v2 endpoint replaces individual form parameters with a single JSON configuration string, providing:

- Better organization: Related options are grouped into logical sections
- Type safety: Structured validation with clear schemas
- Parse mode separation: Only relevant options for your chosen parse mode are required
- Extensibility: Easier to add new features without endpoint bloat
- Validation: Better error messages and configuration validation

Key Differences

v1 Endpoint

POST /api/v1/parsing/upload
Content-Type: multipart/form-data

- 70+ individual form parameters
- Flat parameter structure
- All parameters available regardless of parse mode

v2 Endpoint

POST /api/v2alpha1/parse/upload
Content-Type: multipart/form-data

- Single 'configuration' JSON string parameter
- Hierarchical, structured configuration
- Parse mode-specific options
- Strict validation with clear error messages

Migration Steps

1. Update the Endpoint URL

Before (v1):

POST https://api.cloud.llamaindex.ai/api/v1/parsing/upload

After (v2):

POST https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload

2. Replace Form Parameters with Configuration JSON

Instead of sending individual form parameters, you now send a single configuration parameter containing a JSON string.

Note: The file parameter remains unchanged - you still upload files the same way using multipart form data. Only the configuration approach has changed.
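
As a minimal sketch of this step in Python, the configuration can be built as an ordinary dictionary and serialized once into the single `configuration` form field. The helper name `build_form_fields` is hypothetical and only illustrates the shape of the request; it does not send anything.

```python
import json

V2_UPLOAD_URL = "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"


def build_form_fields(configuration: dict) -> dict:
    """Return the non-file form fields for a v2 upload request.

    The whole configuration travels as one JSON string under the
    `configuration` field; the file is still sent as its own form part.
    """
    return {"configuration": json.dumps(configuration)}


config = {
    "parse_options": {
        "parse_mode": "parse_with_agent",
        "parse_with_agent_options": {},
    }
}
fields = build_form_fields(config)
print(fields["configuration"])
```

With an HTTP client such as `requests`, these fields would typically be passed as `data=fields` next to `files={"file": open("document.pdf", "rb")}` and the same `Authorization` header used in v1.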

3. Migration Checklist

Before migrating, review this checklist:

- Check for always-enabled parameters: adaptive_long_table, high_res_ocr, merge_tables_across_pages_in_markdown, outlined_table_extraction are always enabled in v2
- Update page indexing: Change target_pages from 0-based to 1-based indexing
- Replace deprecated parameters: Remove gpt4o_mode, premium_mode, fast_mode, etc.
- Move language parameter: Move language to parse mode specific ocr_parameters
- Update cache parameters: Replace invalidate_cache + do_not_cache with single disable_cache
- Convert webhooks: Change from single webhook_url to webhook_configurations array
- Update prompts: Move prompt parameters to parse mode specific sections
- Test thoroughly: The alpha API may have additional breaking changes

Configuration Structure

The v2 configuration follows this structure:

{
  "client_name": "string (optional)",
  "parse_options": {
    "parse_mode": "preset|parse_with_llm|parse_with_agent|etc.",
    // Mode-specific options (see examples below)
  },
  "source_url": {
    "url": "string (optional)",
    "http_proxy": "string (optional)"
  },
  "webhook_configurations": [...],
  "input_options": {...},
  "crop_box": {...},
  "page_ranges": {...},
  "disable_cache": "boolean (optional)",
  "output_options": {...},
  "processing_control": {...}
}

Parse Mode Options

Important: You can only include the sub-object that corresponds to your chosen parse mode. For example, if you choose parse_mode: "preset", you can only include preset_options, not parse_with_llm_options.
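
A client-side check for this rule might look like the sketch below. It assumes that every options sub-object is named `{parse_mode}_options`, which matches the examples in this guide but is not a documented guarantee; `check_mode_options` is a hypothetical helper, not part of any SDK.

```python
def check_mode_options(parse_options: dict) -> list[str]:
    """Return the problems found in a `parse_options` object (empty if none)."""
    mode = parse_options.get("parse_mode")
    if mode is None:
        return ["parse_mode is missing"]

    # Assumption: the sub-object for a mode is always named "<mode>_options".
    expected = f"{mode}_options"
    problems = []
    for key in parse_options:
        if key.endswith("_options") and key != expected:
            problems.append(f"{key} cannot be used with parse_mode '{mode}'")
    return problems


ok = {"parse_mode": "preset", "preset_options": {"preset": "scientific"}}
bad = {"parse_mode": "preset", "parse_with_llm_options": {"model": "gpt-4o"}}
print(check_mode_options(ok))
print(check_mode_options(bad))
```

Running a check like this before uploading catches a mismatched sub-object locally instead of waiting for the server's validation error.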

Preset Mode

v1 Example:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "preset=scientific" \
  -F "language=en,es" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"

v2 Example:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "preset",
      "preset_options": {
        "preset": "scientific",
        "ocr_parameters": {
          "languages": ["en", "es"]
        }
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"

Parse with LLM Mode

v1 Example:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_llm" \
  -F "model=gpt-4o" \
  -F "user_prompt=Extract key information" \
  -F "disable_ocr=true" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"

v2 Example:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_llm",
      "parse_with_llm_options": {
        "model": "gpt-4o",
        "prompts": {
          "user_prompt": "Extract key information"
        },
        "ignore": {
          "ignore_text_in_image": true
        }
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"

External Provider Mode (Azure OpenAI)

v1 Example:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_lvm" \
  -F "azure_openai_endpoint=https://myresource.openai.azure.com/" \
  -F "azure_openai_deployment_name=gpt-4-vision" \
  -F "azure_openai_key=your-key" \
  -F "azure_openai_api_version=2024-02-01" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"

v2 Example:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_external_provider",
      "parse_with_external_provider_options": {
        "azure_openai": {
          "endpoint": "https://myresource.openai.azure.com/",
          "deployment_name": "gpt-4-vision",
          "api_key": "your-key",
          "api_version": "2024-02-01"
        }
      }
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"
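
The renaming between the two Azure OpenAI examples above can be sketched as a small lookup table. The field pairs are read off the v1 and v2 curl commands in this section; the helper name `azure_parse_options` is hypothetical.

```python
# v1 flat form parameter -> key inside the v2 "azure_openai" object,
# taken from the v1/v2 examples above.
AZURE_FIELD_MAP = {
    "azure_openai_endpoint": "endpoint",
    "azure_openai_deployment_name": "deployment_name",
    "azure_openai_key": "api_key",
    "azure_openai_api_version": "api_version",
}


def azure_parse_options(v1_params: dict) -> dict:
    """Build a v2 `parse_options` object from v1 Azure OpenAI parameters."""
    azure = {
        new_key: v1_params[old_key]
        for old_key, new_key in AZURE_FIELD_MAP.items()
        if old_key in v1_params
    }
    return {
        "parse_mode": "parse_with_external_provider",
        "parse_with_external_provider_options": {"azure_openai": azure},
    }


v1_params = {
    "azure_openai_endpoint": "https://myresource.openai.azure.com/",
    "azure_openai_deployment_name": "gpt-4-vision",
    "azure_openai_key": "your-key",
    "azure_openai_api_version": "2024-02-01",
}
print(azure_parse_options(v1_params))
```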

Parameter Mapping Reference

Basic Options

v1 Parameter	v2 Location	Notes
`input_url`	`source_url.url`	Moved to structured source configuration
`http_proxy`	`source_url.http_proxy`	Same functionality
`max_pages`	`page_ranges.max_pages`	Same functionality
`target_pages`	`page_ranges.target_pages`	Breaking change: Now uses 1-based indexing (user inputs "1,2,3" instead of "0,1,2")
`invalidate_cache` and `do_not_cache`	`disable_cache`	Breaking change: Single boolean combines both v1 parameters
`language`	`parse_options.{mode}_options.ocr_parameters.languages`	Same functionality

Important: In v1, target_pages used 0-based indexing (e.g., "0,1,2" for pages 1, 2, 3). In v2, it uses 1-based indexing (e.g., "1,2,3" for the same pages) to be consistent with the rest of the platform.
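
The value conversions in this table can be sketched as three small helpers. All three names are hypothetical. Two points are assumptions rather than documented behavior: that `target_pages` may contain ranges such as "2-4" in addition to single pages, and that `disable_cache` should be true when either v1 cache flag was set.

```python
def shift_target_pages(v1_target_pages: str) -> str:
    """Convert a 0-based v1 page list such as "0,2-4" to 1-based "1,3-5"."""
    parts = []
    for part in v1_target_pages.split(","):
        # Assumption: a part is either a single page or a "start-end" range.
        bounds = [str(int(number) + 1) for number in part.strip().split("-")]
        parts.append("-".join(bounds))
    return ",".join(parts)


def split_languages(v1_language: str) -> list[str]:
    """Turn the v1 comma-separated `language` value into the v2 list."""
    return [code.strip() for code in v1_language.split(",") if code.strip()]


def merge_cache_flags(invalidate_cache: bool = False, do_not_cache: bool = False) -> bool:
    """Assumption: either v1 cache flag being set maps to disable_cache = true."""
    return bool(invalidate_cache or do_not_cache)


print(shift_target_pages("0,1,2"))  # 1,2,3
print(split_languages("en,es"))     # ['en', 'es']
```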

Always Enabled in v2 (Breaking Changes)

The following parameters are always enabled in v2 and cannot be disabled. We're doing this to simplify calling LlamaParse and because these options give better results:

v1 Parameter	v2 Behavior	Breaking Change
`adaptive_long_table`	Always `true`	Breaking: Cannot be disabled in v2
`high_res_ocr`	Always `true`	Breaking: Cannot be disabled in v2
`merge_tables_across_pages_in_markdown`	Always `true`	Breaking: Cannot be disabled in v2
`outlined_table_extraction`	Always `true`	Breaking: Cannot be disabled in v2
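
When converting stored v1 settings, it may help to drop these four parameters and flag any job that had explicitly turned one off, since its output can change under v2. The sketch below assumes the v1 parameters are held as Python booleans in a dictionary; `strip_always_enabled` is a hypothetical helper.

```python
ALWAYS_ENABLED_IN_V2 = (
    "adaptive_long_table",
    "high_res_ocr",
    "merge_tables_across_pages_in_markdown",
    "outlined_table_extraction",
)


def strip_always_enabled(v1_params: dict) -> tuple[dict, list[str]]:
    """Remove parameters that v2 always enables and report explicit opt-outs.

    Parameters that were simply omitted in v1 are not flagged, because this
    sketch does not know what the v1 defaults were.
    """
    remaining = {
        name: value
        for name, value in v1_params.items()
        if name not in ALWAYS_ENABLED_IN_V2
    }
    warnings = [
        f"{name} was disabled in v1 but is always enabled in v2"
        for name in ALWAYS_ENABLED_IN_V2
        if name in v1_params and not v1_params[name]
    ]
    return remaining, warnings


remaining, warnings = strip_always_enabled(
    {"high_res_ocr": False, "adaptive_long_table": True, "max_pages": 10}
)
print(remaining)
print(warnings)
```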

Removed/Deprecated Parameters

The following v1 parameters are not supported in v2:

v1 Parameter	v2 Status	Migration Path
`use_vendor_multimodal_model`	Removed (was deprecated)	Use `parse_mode: "parse_with_external_provider"` instead
`gpt4o_mode`	Removed	Use `parse_mode: "parse_with_llm"` with `model: "gpt-4o"`
`gpt4o_api_key`	Removed	Use `parse_mode: "parse_with_external_provider"` with appropriate provider config
`premium_mode`	Removed	Use appropriate parse mode instead
`fast_mode`	Removed	Use `parse_mode: "parse_without_ai"` for faster processing
`continuous_mode`	Removed	No direct equivalent
`parsing_instruction`	Renamed	Use `parse_options.{mode}_options.prompts.user_prompt`
`formatting_instruction`	Renamed	Use `parse_options.{mode}_options.prompts.user_prompt`
`system_prompt`	Renamed	Use `parse_options.{mode}_options.prompts.system_prompt_append`
`bounding_box`	Renamed	Use `crop_box` object instead
`input_s3_path` and `input_s3_region`	Removed	Not supported in v2alpha1
`output_s3_path_prefix` and `output_s3_region`	Removed	Not supported in v2alpha1
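
One way to audit an existing integration is to scan its v1 parameters for the entries marked "Removed" above. The sketch below copies the migration hints from the table; `find_removed` is a hypothetical helper and covers only the removed parameters, not the renamed ones.

```python
# Removed v1 parameters and the migration path given in the table above.
REMOVED_IN_V2 = {
    "use_vendor_multimodal_model": 'use parse_mode "parse_with_external_provider"',
    "gpt4o_mode": 'use parse_mode "parse_with_llm" with model "gpt-4o"',
    "gpt4o_api_key": 'use parse_mode "parse_with_external_provider" with a provider config',
    "premium_mode": "use an appropriate parse mode",
    "fast_mode": 'use parse_mode "parse_without_ai"',
    "continuous_mode": "no direct equivalent",
    "input_s3_path": "not supported in v2alpha1",
    "input_s3_region": "not supported in v2alpha1",
    "output_s3_path_prefix": "not supported in v2alpha1",
    "output_s3_region": "not supported in v2alpha1",
}


def find_removed(v1_params: dict) -> dict:
    """Map each removed parameter present in `v1_params` to its migration hint."""
    return {
        name: hint for name, hint in REMOVED_IN_V2.items() if name in v1_params
    }


for name, hint in find_removed({"fast_mode": True, "max_pages": 5}).items():
    print(f"{name}: {hint}")
```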

Webhook Configuration Breaking Changes

v1 Parameter	v2 Location	Notes
`webhook_url`	`webhook_configurations[0].webhook_url`	Breaking: Now an array, but only the first entry is currently used
`webhook_configurations` (string)	`webhook_configurations` (array)	Breaking: Format changed from JSON string to structured array
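
Converting a single v1 `webhook_url` into the v2 array can be sketched as follows. The `webhook_events` value `"parse.done"` is copied from the complete example elsewhere in this guide; whether that field is required, and which other event names exist, is not covered here. The helper name is hypothetical.

```python
def webhook_configurations_from_url(webhook_url: str, events=("parse.done",)) -> list[dict]:
    """Wrap a single v1 webhook URL in the v2 `webhook_configurations` array.

    The result has exactly one entry, since only the first entry is
    currently used.
    """
    return [{"webhook_url": webhook_url, "webhook_events": list(events)}]


print(webhook_configurations_from_url("https://example.com/webhook"))
```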

Not Yet Implemented in v2

The following options exist in the v2 schema but are not yet implemented:

- ignore_strikethrough_text (exists in schema but not processed)
- input_options.pdf.password (placeholder for future implementation)

Crop Box Options

v1 Parameter	v2 Location
`bbox_top`	`crop_box.top`
`bbox_bottom`	`crop_box.bottom`
`bbox_left`	`crop_box.left`
`bbox_right`	`crop_box.right`

Input Format Options

v1 Parameter	v2 Location
`html_make_all_elements_visible`	`input_options.html.make_all_elements_visible`
`html_remove_fixed_elements`	`input_options.html.remove_fixed_elements`
`html_remove_navigation_elements`	`input_options.html.remove_navigation_elements`
`disable_image_extraction`	`input_options.pdf.disable_image_extraction`
`spreadsheet_extract_sub_tables`	`input_options.spreadsheet.detect_sub_tables_in_sheets`

Ignore Options (Parse Mode Specific)

v1 Parameter	v2 Location	Available In Modes
`skip_diagonal_text`	`parse_options.{mode}_options.ignore.ignore_diagonal_text`	All modes except preset
`disable_ocr`	`parse_options.{mode}_options.ignore.ignore_text_in_image`	All modes except preset

Output Options

v1 Parameter	v2 Location
`annotate_links`	`output_options.markdown.annotate_links`
`page_prefix`	`output_options.markdown.pages.prefix`
`page_separator`	`output_options.markdown.pages.custom_page_separator`
`page_suffix`	`output_options.markdown.pages.suffix`
`hide_headers`	`output_options.markdown.headers_footers.hide_headers`
`hide_footers`	`output_options.markdown.headers_footers.hide_footers`
`compact_markdown_table`	`output_options.markdown.tables.compact_markdown_tables`
`output_tables_as_HTML`	`output_options.markdown.tables.output_tables_as_markdown` (inverted: `output_tables_as_HTML=true` in v1 corresponds to `output_tables_as_markdown: false` in v2)
`guess_xlsx_sheet_name`	`output_options.tables_as_spreadsheet.guess_sheet_name`
`extract_layout`	`output_options.extract_layout.enable`
`take_screenshot`	`output_options.screenshots.enable`
`output_pdf_of_document`	`output_options.export_pdf.enable`
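
Because every row in this table maps a flat name to a dotted path, the conversion can be sketched with a generic path setter. Only a few rows are included for brevity, and the helper names `set_path` and `migrate_output_options` are hypothetical. The inverted table option is handled separately, assuming the v1 value is a Python boolean.

```python
# A subset of the table above: v1 parameter -> dotted path in the v2 config.
OUTPUT_OPTION_PATHS = {
    "annotate_links": "output_options.markdown.annotate_links",
    "page_prefix": "output_options.markdown.pages.prefix",
    "page_suffix": "output_options.markdown.pages.suffix",
    "hide_headers": "output_options.markdown.headers_footers.hide_headers",
    "hide_footers": "output_options.markdown.headers_footers.hide_footers",
    "extract_layout": "output_options.extract_layout.enable",
    "take_screenshot": "output_options.screenshots.enable",
}
TABLES_AS_MARKDOWN = "output_options.markdown.tables.output_tables_as_markdown"


def set_path(config: dict, dotted_path: str, value) -> None:
    """Set `value` at `dotted_path`, creating intermediate objects as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def migrate_output_options(v1_params: dict) -> dict:
    """Build the v2 `output_options` portion of a config from v1 parameters."""
    config: dict = {}
    for name, path in OUTPUT_OPTION_PATHS.items():
        if name in v1_params:
            set_path(config, path, v1_params[name])
    if "output_tables_as_HTML" in v1_params:
        # Inverted flag: HTML tables in v1 means "not markdown" in v2.
        set_path(config, TABLES_AS_MARKDOWN, not v1_params["output_tables_as_HTML"])
    return config


print(migrate_output_options({"page_prefix": "## Page ", "hide_headers": True}))
```

The same `set_path` approach would extend to the crop box, input format, and processing control tables by adding their rows to the mapping.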

Processing Control

v1 Parameter	v2 Location
`job_timeout_in_seconds`	`processing_control.timeouts.base_in_seconds`
`job_timeout_extra_time_per_page_in_seconds`	`processing_control.timeouts.extra_time_per_page_in_seconds`
`page_error_tolerance`	`processing_control.job_failure_conditions.allowed_page_failure_ratio`

Complete Migration Examples

Simple Document Parsing

v1:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_agent" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"

v2:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_agent",
      "parse_with_agent_options": {}
    }
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"

Complex Configuration with Custom Output

v1:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F "parse_mode=parse_page_with_llm" \
  -F "model=gpt-4o" \
  -F "user_prompt=Extract financial data" \
  -F "max_pages=10" \
  -F "page_prefix=## Page " \
  -F "hide_headers=true" \
  -F "extract_layout=true" \
  -F "webhook_url=https://example.com/webhook" \
  "https://api.cloud.llamaindex.ai/api/v1/parsing/upload"

v2:

curl -X POST \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -F "file=@document.pdf" \
  -F 'configuration={
    "parse_options": {
      "parse_mode": "parse_with_llm",
      "parse_with_llm_options": {
        "model": "gpt-4o",
        "prompts": {
          "user_prompt": "Extract financial data"
        }
      }
    },
    "page_ranges": {
      "max_pages": 10
    },
    "output_options": {
      "markdown": {
        "pages": {
          "prefix": "## Page "
        },
        "headers_footers": {
          "hide_headers": true
        }
      },
      "extract_layout": {
        "enable": true
      }
    },
    "webhook_configurations": [{
      "webhook_url": "https://example.com/webhook",
      "webhook_events": ["parse.done"]
    }]
  }' \
  "https://api.cloud.llamaindex.ai/api/v2alpha1/parse/upload"

Python SDK Migration

v1 (llama-parse):

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="llx-...",
    result_type="markdown",
    parsing_instruction="Extract key information",
    max_pages=10
)
result = parser.load_data("document.pdf")

v2 (llama-cloud-services):

from llama_cloud_services import LlamaParse

# Simple preset usage
parser = LlamaParse(
    api_key="llx-...",
    preset="scientific",
    max_pages=10
)
result = parser.parse("document.pdf")

# Advanced configuration
config = {
    "parse_options": {
        "parse_mode": "parse_with_llm",
        "parse_with_llm_options": {
            "prompts": {
                "user_prompt": "Extract key information"
            }
        }
    },
    "page_ranges": {"max_pages": 10}
}

parser = LlamaParse(api_key="llx-...", configuration=config)
result = parser.parse("document.pdf")

Error Handling

v2 provides more detailed error messages:

v1 Errors:

400: Invalid parameter combination

v2 Errors:

{
  "detail": [
    {
      "type": "value_error",
      "loc": ["parse_options", "parse_with_llm_options"],
      "msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
      "input": {...}
    }
  ]
}
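
A client can turn this structure into readable messages by joining each `loc` path and pairing it with `msg`. The sketch below assumes the response body has the shape shown above, and additionally tolerates a plain-string `detail` as a defensive measure; `summarize_errors` is a hypothetical helper.

```python
import json


def summarize_errors(response_body: str) -> list[str]:
    """Flatten a v2 validation error body into "path: message" strings."""
    detail = json.loads(response_body).get("detail", [])
    if isinstance(detail, str):
        # Defensive assumption: some errors may carry a plain string instead.
        return [detail]

    messages = []
    for error in detail:
        location = ".".join(str(part) for part in error.get("loc", []))
        messages.append(f"{location}: {error.get('msg', '')}")
    return messages


sample = json.dumps({
    "detail": [
        {
            "type": "value_error",
            "loc": ["parse_options", "parse_with_llm_options"],
            "msg": "parse_with_llm_options can only be used with parse_mode 'parse_with_llm'",
            "input": {},
        }
    ]
})
for line in summarize_errors(sample):
    print(line)
```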

The v1 endpoint will remain available for the foreseeable future, so you can migrate at your own pace. However, new features and improvements will be focused on the v2 endpoint structure.
