
GPT-4o mode

You can use OpenAI's GPT-4o to handle document extraction. This costs more than regular parsing (10 credits per page instead of 1), but it can produce better results for some documents.
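To get a feel for the price difference, the two per-page rates quoted above can be turned into a quick estimate. This is only a minimal sketch: the helper name `estimate_credits` is hypothetical and not part of LlamaParse, and it assumes nothing beyond the 1-credit and 10-credit rates stated above.

```python
# Per-page credit rates as quoted in the text above.
REGULAR_CREDITS_PER_PAGE = 1
GPT4O_CREDITS_PER_PAGE = 10


def estimate_credits(num_pages: int, gpt4o_mode: bool = False) -> int:
    """Estimate the credits needed to parse a document (hypothetical helper)."""
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    rate = GPT4O_CREDITS_PER_PAGE if gpt4o_mode else REGULAR_CREDITS_PER_PAGE
    return num_pages * rate


print(estimate_credits(25))                   # 25
print(estimate_credits(25, gpt4o_mode=True))  # 250
```

For a 25-page document, the estimate is 25 credits with regular parsing and 250 credits in GPT-4o mode.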

When using this mode, LlamaParse's regular parsing is bypassed and instead the following process is used:

  • A screenshot is taken of every page of your document.
  • Each page screenshot is sent to GPT-4o with instructions to extract its content as markdown.
  • The resulting markdown from each page is consolidated into the final result.
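The steps above can be sketched conceptually as follows. This is an illustration, not LlamaParse's actual implementation: the function `parse_with_vision_model`, the `extract_markdown` callable, and the choice to join pages with a blank line are all assumptions made for the example. A stand-in function takes the place of the GPT-4o call so the sketch runs without network access.

```python
from typing import Callable, List


def parse_with_vision_model(
    page_screenshots: List[bytes],
    extract_markdown: Callable[[bytes], str],
) -> str:
    """Send each page image to a model and join the per-page markdown (illustrative)."""
    page_markdown = []
    for screenshot in page_screenshots:
        # In the real service this step would call GPT-4o; here it is any callable.
        page_markdown.append(extract_markdown(screenshot).strip())
    # Consolidate the per-page results in page order (blank-line separator is assumed).
    return "\n\n".join(page_markdown)


def fake_model(screenshot: bytes) -> str:
    """Stand-in for the model call: reports the image size instead of reading it."""
    return f"# Page with {len(screenshot)} bytes"


print(parse_with_vision_model([b"abc", b"defgh"], fake_model))
```

Because each page is processed independently, the number of model calls grows with the page count, which is consistent with the per-page pricing described earlier.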

Using GPT-4o

To use the GPT-4o mode, set gpt4o_mode to True.

In Python:
from llama_parse import LlamaParse
# Enable GPT-4o mode for this parser.
parser = LlamaParse(
  gpt4o_mode=True,
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'gpt4o_mode="true"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'

Bring your own LLM key

When using GPT-4o mode, you can opt to supply your own OpenAI API key to control your costs directly.

In Python:
from llama_parse import LlamaParse
parser = LlamaParse(
  gpt4o_mode=True,
  gpt4o_api_key="sk-proj-xxxxxx",  # your own OpenAI API key
)
Using the API:
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload'  \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'gpt4o_mode="true"' \
  --form 'gpt4o_api_key="sk-proj-xxxxxx"' \
  -F 'file=@/path/to/your/file.pdf;type=application/pdf'
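If you prefer to issue the same request from Python rather than curl, the headers and form fields could be assembled as below. This is a sketch under assumptions: `build_upload_request` is a hypothetical helper, the `OPENAI_API_KEY` environment variable name is an assumption, and only the URL, header names, and form field names are taken from the curl examples above. It builds the request pieces without sending anything.

```python
import os
from typing import Dict, Optional, Tuple

# Endpoint taken from the curl examples above.
UPLOAD_URL = "https://api.cloud.llamaindex.ai/api/parsing/upload"


def build_upload_request(
    llama_cloud_api_key: str,
    openai_api_key: Optional[str] = None,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, form_fields) for a GPT-4o mode upload (hypothetical helper)."""
    headers = {
        "accept": "application/json",
        "Authorization": f"Bearer {llama_cloud_api_key}",
    }
    form_fields = {"gpt4o_mode": "true"}
    if openai_api_key:
        # Only include the key field when bringing your own OpenAI key.
        form_fields["gpt4o_api_key"] = openai_api_key
    return headers, form_fields


headers, form_fields = build_upload_request(
    os.environ.get("LLAMA_CLOUD_API_KEY", "placeholder-key"),
    os.environ.get("OPENAI_API_KEY"),  # assumed variable name; may be unset
)
print(sorted(form_fields))
```

With an HTTP client such as `requests`, these values could then be passed along with the file, for example `requests.post(UPLOAD_URL, headers=headers, data=form_fields, files={"file": open(path, "rb")})`. The client sets the multipart `Content-Type` header itself, so it is left out of `headers` here.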