Structured output (beta)
About structured output
Structured output allows you to extract structured data (such as JSON) from a document directly at the parsing stage, reducing cost and time needed.
Structured output is currently only compatible with our default parsing mode. To activate it, set structured_output=True in the API.
In Python:
parser = LlamaParse(
    structured_output=True
)
Using the API:
curl -X 'POST' \
'https://api.cloud.llamaindex.ai/api/parsing/upload' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
--form 'structured_output="true"' \
-F 'file=@/path/to/your/file.pdf;type=application/pdf'
You then need to provide either:
- a JSON schema in the structured_output_json_schema API variable, which will be used to extract data in the desired format
- or the name of one of our pre-defined schemas in the structured_output_json_schema_name variable
In Python:
parser = LlamaParse(
    structured_output_json_schema='A JSON SCHEMA'
)
Using the API:
curl -X 'POST' \
'https://api.cloud.llamaindex.ai/api/parsing/upload' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
--form 'structured_output_json_schema="A JSON SCHEMA"' \
-F 'file=@/path/to/your/file.pdf;type=application/pdf'
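The 'A JSON SCHEMA' placeholder above stands for a full JSON Schema document. A minimal sketch of building one in Python follows. It assumes the structured_output_json_schema parameter accepts the schema serialized as a JSON string, and every field name in the schema (invoice_number, total, line_items) is hypothetical and chosen only for illustration.

```python
import json

# Hypothetical schema for illustration only; these field names are
# assumptions and do not come from the LlamaParse documentation.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
            },
        },
    },
    "required": ["invoice_number", "total"],
}

# Serialize the schema to a string so it can stand in for 'A JSON SCHEMA'.
schema_string = json.dumps(invoice_schema)

# Assumption: the parameter takes the schema as a JSON string.
# from llama_parse import LlamaParse
# parser = LlamaParse(structured_output_json_schema=schema_string)
```

Building the schema as a dictionary and serializing it with json.dumps avoids hand-written quoting mistakes in the schema text.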
or
In Python:
parser = LlamaParse(
    structured_output_json_schema_name="invoice"
)
Using the API:
curl -X 'POST' \
'https://api.cloud.llamaindex.ai/api/parsing/upload' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
--form 'structured_output_json_schema_name="invoice"' \
-F 'file=@/path/to/your/file.pdf;type=application/pdf'
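The either/or choice described above can be sketched as a small helper that assembles the form fields for an upload request. The helper name build_structured_output_fields is hypothetical. The sketch also assumes that exactly one of the two schema options should be supplied and that structured_output is sent alongside it, neither of which is confirmed here.

```python
import json
from typing import Optional


def build_structured_output_fields(
    json_schema: Optional[dict] = None,
    schema_name: Optional[str] = None,
) -> dict:
    """Return form fields for a structured-output upload (hypothetical helper)."""
    # Assumption: exactly one of the two options is expected.
    if (json_schema is None) == (schema_name is None):
        raise ValueError("Provide either json_schema or schema_name, but not both.")

    # Assumption: structured_output is sent together with the schema field.
    fields = {"structured_output": "true"}
    if json_schema is not None:
        # Custom schema, serialized as a JSON string.
        fields["structured_output_json_schema"] = json.dumps(json_schema)
    else:
        # Name of one of the pre-defined schemas.
        fields["structured_output_json_schema_name"] = schema_name
    return fields


fields = build_structured_output_fields(schema_name="invoice")
print(fields)
```

The returned dictionary maps directly onto the --form arguments used in the curl examples, so it could be passed as form data by whichever HTTP client you prefer.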
Supported pre-defined schemas
imFeelingLucky
This schema is a wildcard: it tells LlamaParse to come up with the output schema on its own. Use at your own risk.
Using the API:
curl -X 'POST' \
'https://api.cloud.llamaindex.ai/api/parsing/upload' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
--form 'structured_output_json_schema_name="imFeelingLucky"' \
-F 'file=@/path/to/your/file.pdf;type=application/pdf'
invoice
This schema represents an invoice.
Using the API:
curl -X 'POST' \
'https://api.cloud.llamaindex.ai/api/parsing/upload' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
--form 'structured_output_json_schema_name="invoice"' \
-F 'file=@/path/to/your/file.pdf;type=application/pdf'
resume
This schema represents a resume. It is based on https://github.com/jsonresume/resume-schema
Using the API:
curl -X 'POST' \
'https://api.cloud.llamaindex.ai/api/parsing/upload' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
--form 'structured_output_json_schema_name="resume"' \
-F 'file=@/path/to/your/file.pdf;type=application/pdf'
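Because the pre-defined schema names mix casing styles (imFeelingLucky versus invoice), a client-side check can catch typos before a document is uploaded. The sketch below assumes the three names listed in this section are the complete set and that the service expects their exact spelling. Both points are assumptions, and normalize_schema_name is a hypothetical helper.

```python
# The three pre-defined schema names listed in this section.
# Assumption: this list is complete and the spelling is case-sensitive.
PREDEFINED_SCHEMA_NAMES = ("imFeelingLucky", "invoice", "resume")


def normalize_schema_name(name: str) -> str:
    """Map a loosely typed name to its documented spelling (hypothetical helper)."""
    by_lowercase = {known.lower(): known for known in PREDEFINED_SCHEMA_NAMES}
    key = name.strip().lower()
    if key not in by_lowercase:
        allowed = ", ".join(PREDEFINED_SCHEMA_NAMES)
        raise ValueError(f"Unknown schema name {name!r}; expected one of: {allowed}")
    return by_lowercase[key]


print(normalize_schema_name("imfeelinglucky"))
```

The normalized value can then be passed as structured_output_json_schema_name.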