Skip to main content

Output

LlamaParse supports the following output formats:

  • Text: A basic text representation of the parsed document
  • Markdown: A Markdown representation of the parsed document
  • JSON : A JSON representation of the content of the document
  • XLSX: A spreadsheet containing all the tables found in the document
  • PDF: A PDF representation of the parsed document (note: this is not the same as the original document)
  • Images: All images contained in the document
  • Page Screenshot: Screenshots of document pages
  • Structured: if structured output is required, a JSON object containing the required data.

Parsing modes and output​

LlamaParse supports different output formats depending on the parsing mode:

Modetextmarkdownjsonxlsxpdfstructured1imagesscreenshots2
default (accurate) mode✅✅✅✅✅✅✅✅
fast_mode✅🚫✅🚫✅🚫✅✅
vendor_multimodal_mode🚫✅✅✅✅🚫🚫✅
premium_mode✅✅✅✅✅🚫✅✅
auto_mode✅✅✅✅✅🚫✅✅
continuous_mode✅✅✅✅✅🚫✅✅
spreadsheet3✅✅✅✅🚫🚫✅🚫
audio files4✅🚫🚫🚫🚫🚫🚫🚫

Result endpoint​

LlamaParse allows you to retrieve your job results in different ways using the result endpoint. The supported result formats are text, markdown, json, xlsx, pdf, or structured.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}/result/markdown'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

The return result is a json object containing the requested result and a job_metadata field. The job_metadata contain:

  • credits_used : How much credit you used so far today
  • job_credits_usage : How much credits did this job used.
  • job_pages : How many pages (or for spreadsheet sheets) were in your document.
  • job_auto_mode_triggered_pages : How many pages where upgraded to premium_mode after triggering auto_mode
  • job_is_cache_hit : If the job was a cache hit (we do not bill cache hits).
    {
"markdown" : "Here the markdown of the document if you asked for markdown as the result type....",
"job_metadata": {
"credits_used": 500,
"job_credits_usage": 5,
"job_pages": 5,
"job_auto_mode_triggered_pages": 0,
"job_is_cache_hit": false
}
}

Raw endpoint​

Instead of returning a JSON object containing your parsed document, you can set LlamaParse to return the raw text extracted from the document by retrieving the data in "raw" mode. The raw result can be text, markdown, json, xlsx, or structured.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}/raw/result/markdown'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

Images​

Image (and screenshot) can be download using the job/{job_id}/result/image/image_name.png endpoint.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}/result/image/image_name.png'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

Details endpoint​

It is possible to see the details of a job including eventual job error or warning (both at the document and page model), but also the original job parameter using the job/{job_id}/details endpoint.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}/details'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

Status endpoint​

It is possible to see the status of a job using the job/{job_id} endpoint.

Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"

Footnotes​

  1. structured output is only available if structured_output=True ↩

  2. document screenshots are available when take_screenshot=True ↩

  3. Spreadsheets have their own pipeline and are processed differently than other documents, independently of the selected mode. ↩

  4. Audio file have their own pipeline and are processed differently than other documents, independently of the selected mode. ↩