
Structured output (beta)

About structured output

Structured output lets you extract structured data (such as JSON) from a document directly at the parsing stage, reducing the cost and time of a separate extraction step.

Structured output is currently only compatible with our default parsing mode and can be activated by setting structured_output=True in the API.

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
  structured_output=True
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'structured_output="true"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'

You then need to provide either:

  • a JSON schema in the structured_output_json_schema API variable, which will be used to extract data in the desired format
  • or the name of one of our pre-defined schemas in the variable structured_output_json_schema_name
In Python:
parser = LlamaParse(
  structured_output_json_schema='A JSON SCHEMA'
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'structured_output_json_schema="A JSON SCHEMA"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
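
The 'A JSON SCHEMA' placeholder above stands for a full JSON schema passed as a string. As a minimal sketch, the schema can be written as a Python dict and serialized with the standard json module. The receipt fields below are hypothetical and chosen only for illustration, and the commented-out LlamaParse call assumes the parameter accepts the serialized string, as the placeholder suggests.

```python
import json

# Hypothetical schema for illustration; these field names are not defined by LlamaParse.
receipt_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total": {"type": "number"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "price": {"type": "number"},
                },
                "required": ["description", "price"],
            },
        },
    },
    "required": ["vendor", "total"],
}


def schema_to_string(schema: dict) -> str:
    """Serialize a schema dict to a compact, single-line JSON string."""
    return json.dumps(schema, separators=(",", ":"))


schema_string = schema_to_string(receipt_schema)

# Assumption: the parameter takes the schema as a string.
# parser = LlamaParse(
#   structured_output=True,
#   structured_output_json_schema=schema_string
# )
```

Building the schema as a dict and serializing it once avoids hand-escaping quotes, which is easy to get wrong when the same schema is pasted into a curl command.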

or

In Python:
parser = LlamaParse(
  structured_output_json_schema_name="invoice"
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'structured_output_json_schema_name="invoice"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
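
The two options above are alternatives, so it can help to enforce that exactly one is supplied before creating the parser. The helper below is a hypothetical sketch (structured_output_kwargs is not part of LlamaParse); it hard-codes the pre-defined schema names listed on this page, which may not be the complete set.

```python
from typing import Optional

# Pre-defined schema names listed on this page (assumed, possibly incomplete).
PREDEFINED_SCHEMAS = {"imFeelingLucky", "invoice", "resume"}


def structured_output_kwargs(
    json_schema: Optional[str] = None,
    schema_name: Optional[str] = None,
) -> dict:
    """Return keyword arguments that select exactly one schema source."""
    # Reject both "neither given" and "both given".
    if (json_schema is None) == (schema_name is None):
        raise ValueError("Provide exactly one of json_schema or schema_name")

    kwargs = {"structured_output": True}
    if schema_name is not None:
        if schema_name not in PREDEFINED_SCHEMAS:
            raise ValueError(f"Unknown pre-defined schema: {schema_name!r}")
        kwargs["structured_output_json_schema_name"] = schema_name
    else:
        kwargs["structured_output_json_schema"] = json_schema
    return kwargs
```

With this in place, a parser could be created as LlamaParse(**structured_output_kwargs(schema_name="invoice")).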

Supported pre-defined schemas

imFeelingLucky

This schema is a wildcard: it tells LlamaParse to come up with the output schema itself. Use at your own risk.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'structured_output_json_schema_name="imFeelingLucky"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
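
Because the schema is not known in advance with imFeelingLucky, downstream code should not assume that any particular field exists. A minimal sketch of inspecting an unknown result before using it is shown below. The sample payload is made up for illustration and is not real LlamaParse output, and describe_top_level is a hypothetical helper.

```python
import json


def describe_top_level(raw: str) -> dict:
    """Map each top-level key of a JSON object to the type name of its value."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in data.items()}


# Made-up payload for illustration; real output will differ.
sample = '{"title": "Quarterly report", "pages": 12, "sections": []}'
print(describe_top_level(sample))
```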

invoice

This schema represents an invoice.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'structured_output_json_schema_name="invoice"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'


resume

This schema represents a resume. It is based on the JSON Resume schema (https://github.com/jsonresume/resume-schema).

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'structured_output_json_schema_name="resume"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
