Metadata

In JSON mode, LlamaParse returns a JSON data structure representing the parsed document, which is useful for further processing or analysis.

To use this mode, set the result type to "json".

Using the API:
curl -X 'GET' \
  'https://api.cloud.llamaindex.ai/api/parsing/parsing/job/<job_id>/result/json' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"
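
The same request can be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `build_json_result_request` and `fetch_json_result` are hypothetical, the URL is copied from the curl command above, and the GET method is an assumption about how this endpoint is called.

```python
import json
import urllib.request

# Base URL copied from the curl example; adjust it if your deployment differs.
BASE_URL = "https://api.cloud.llamaindex.ai/api/parsing/parsing/job"


def build_json_result_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the request for a job's JSON result."""
    return urllib.request.Request(
        f"{BASE_URL}/{job_id}/result/json",
        headers={
            "accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="GET",  # assumption: results are retrieved with GET
    )


def fetch_json_result(job_id: str, api_key: str) -> dict:
    """Send the request and decode the JSON body into a dict."""
    request = build_json_result_request(job_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the API key exported as in the curl example, calling `fetch_json_result("<job_id>", os.environ["LLAMA_CLOUD_API_KEY"])` (after `import os`) should return the structure described under "Result format".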

Result format

{
  "pages": [
    ..page objects..
  ],
  "job_metadata": {
    "credits_used": int,
    "credits_max": int,
    "job_credits_usage": int,
    "job_pages": int,
    "job_is_cache_hit": boolean
  }
}
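
As an illustration of working with this structure, the sketch below copies the `job_metadata` fields into a typed record. The `JobMetadata` class and `parse_job_metadata` helper are hypothetical names, and the sample values are made up for the example rather than taken from a real job.

```python
from dataclasses import dataclass


@dataclass
class JobMetadata:
    """Typed view of the job_metadata object in the result format."""
    credits_used: int
    credits_max: int
    job_credits_usage: int
    job_pages: int
    job_is_cache_hit: bool


def parse_job_metadata(result: dict) -> JobMetadata:
    """Extract job_metadata from a decoded JSON result."""
    meta = result["job_metadata"]
    return JobMetadata(
        credits_used=meta["credits_used"],
        credits_max=meta["credits_max"],
        job_credits_usage=meta["job_credits_usage"],
        job_pages=meta["job_pages"],
        job_is_cache_hit=meta["job_is_cache_hit"],
    )


# Illustrative values only; this is not real API output.
sample_result = {
    "pages": [],
    "job_metadata": {
        "credits_used": 10,
        "credits_max": 1000,
        "job_credits_usage": 3,
        "job_pages": 3,
        "job_is_cache_hit": False,
    },
}

meta = parse_job_metadata(sample_result)
print(meta.job_pages, meta.job_is_cache_hit)
```

Using an explicit record makes a missing or renamed field fail loudly with a `KeyError` instead of propagating silently.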

Page objects

Within page objects, the following keys may be present depending on your document.

  • page: The page number of the document.
  • text: The text extracted from the page.
  • md: The markdown version of the extracted text.
  • images: Any images extracted from the page.
  • items: An array of heading, text and table objects in the order they appear on the page.
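
Because any of these keys may be absent, code that walks the pages should read them defensively. The sketch below shows one way to do that; `summarize_pages` is a hypothetical helper, and the sample data is invented for illustration.

```python
def summarize_pages(result: dict) -> list[dict]:
    """Summarize each page object, tolerating missing keys."""
    summaries = []
    for page in result.get("pages", []):
        summaries.append(
            {
                "page": page.get("page"),
                # "text" may be missing or null, so fall back to an empty string.
                "text_chars": len(page.get("text") or ""),
                "has_md": bool(page.get("md")),
                "image_names": [img["name"] for img in page.get("images") or []],
                "item_count": len(page.get("items") or []),
            }
        )
    return summaries


# Illustrative data only; real page objects may carry additional keys.
sample_pages = {
    "pages": [
        {"page": 1, "text": "Hello", "md": "# Hello"},
        {"page": 2, "images": [{"name": "img_p2_5.png", "height": 718, "width": 251}]},
    ]
}

for summary in summarize_pages(sample_pages):
    print(summary)
```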

Retrieving images

Images are returned as an array of image objects, of the form:

{
  "name": "img_p2_5.png",
  "height": 718,
  "width": 251
}

You can retrieve an extracted image directly by using the value of its name field in the request URL, like this:

Using the API:
curl -X 'GET' \
  'https://api.cloud.llamaindex.ai/api/parsing/parsing/job/<job_id>/result/image/<name>' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --output "file.png"

Note the additional --output argument, which tells curl to save the binary image data to a file instead of writing it to the terminal.
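
A rough Python equivalent of the image download is sketched below, again using only the standard library. `image_result_url` and `download_image` are hypothetical helper names, the URL pattern is copied from the curl command above, and the GET method is an assumption.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Base URL copied from the curl example; adjust it if your deployment differs.
BASE_URL = "https://api.cloud.llamaindex.ai/api/parsing/parsing/job"


def image_result_url(job_id: str, name: str) -> str:
    """Build the image URL, percent-encoding the image name."""
    return f"{BASE_URL}/{job_id}/result/image/{urllib.parse.quote(name, safe='')}"


def download_image(job_id: str, name: str, api_key: str, out_dir: str = ".") -> Path:
    """Fetch one extracted image and write its bytes to out_dir."""
    request = urllib.request.Request(
        image_result_url(job_id, name),
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",  # assumption: images are retrieved with GET
    )
    # Keep only the final path component so the name cannot escape out_dir.
    out_path = Path(out_dir) / Path(name).name
    with urllib.request.urlopen(request) as response:
        out_path.write_bytes(response.read())
    return out_path
```

To save every image on a page, one could loop over the page's `images` array and call `download_image(job_id, image["name"], api_key)` for each entry.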