Extracting Figures from Documents

LlamaCloud provides several API endpoints for extracting and working with figures (images) from your documents, including charts, tables, and other visual elements. This guide shows how to use them.

You can use the extracted figures to create visual summaries, generate reports, enrich chatbot responses, and more.

note

To extract figures from documents, create an index with a file and enable the Extract Layout option when you create the index. The option is under Parse Settings -> Text and images handling -> Extract Layout.

Available Endpoints

  1. List All Figures: /v1/files/{id}/page-figures
  2. List Figures on a Specific Page: /v1/files/{id}/page-figures/{page_index}
  3. Get a Specific Figure: /v1/files/{id}/page-figures/{page_index}/{figure_name}
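
The three paths share one prefix and differ only in how many segments follow it. As a minimal sketch for anyone calling the REST API directly rather than through the client, the path can be assembled as below. The helper name `figure_path` is hypothetical and not part of the llama-cloud package; the host name and authentication headers are deliberately left out because this guide does not state them.

```python
from typing import Optional
from urllib.parse import quote


def figure_path(file_id: str,
                page_index: Optional[int] = None,
                figure_name: Optional[str] = None) -> str:
    """Build the page-figures path for a file, one page, or one figure.

    Hypothetical helper: it only mirrors the three path templates listed
    in this guide and does not perform any request.
    """
    if figure_name is not None and page_index is None:
        raise ValueError("figure_name requires page_index")

    # Percent-encode the variable segments so unusual characters
    # cannot change the shape of the path.
    path = f"/v1/files/{quote(file_id, safe='')}/page-figures"
    if page_index is not None:
        path += f"/{page_index}"
    if figure_name is not None:
        path += f"/{quote(figure_name, safe='')}"
    return path
```

Calling it with only a file ID gives the list-all path; adding `page_index` and then `figure_name` narrows it to a page and to a single figure.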

How to Use

App setup

Install the API client package:

pip install llama-cloud

Import and configure the client:

from llama_cloud.client import LlamaCloud

client = LlamaCloud(token='<llama-cloud-api-key>')

1. List All Figures in a Document

To get a list of all figures across all pages in a document:

# Get all figures from a document
figures = client.files.list_file_pages_figures(id="your-file-id")

output:

[
  {
    "figure_name": "page_1_figure_1.jpg",
    "file_id": "71370e55-0f32-4977-b347-460735079386",
    "page_index": 1,
    "figure_size": 87724,
    "is_likely_noise": true,
    "confidence": 0.423
  },
  {
    "figure_name": "page_2_figure_1.jpg",
    "file_id": "71370e55-0f32-4977-b347-460735079386",
    "page_index": 2,
    "figure_size": 87724,
    "is_likely_noise": true,
    "confidence": 0.423
  }
]
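
Each entry carries an `is_likely_noise` flag and a `confidence` score, so a common next step is to drop the entries flagged as noise and group the rest by page. The sketch below assumes the entries are plain dictionaries shaped like the output above (the client may return model objects instead, in which case attribute access would be needed) and assumes that a higher `confidence` means a more reliable detection. The function name and the `min_confidence` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_useful_figures(figures: Iterable[dict],
                         min_confidence: float = 0.0) -> Dict[int, List[str]]:
    """Map page_index -> figure names, skipping entries flagged as noise.

    Assumes dict entries with the keys shown in the sample output.
    """
    grouped: Dict[int, List[str]] = defaultdict(list)
    for fig in figures:
        if fig["is_likely_noise"]:
            continue  # flagged as noise by the service
        if fig["confidence"] < min_confidence:
            continue  # below the caller's (assumed) threshold
        grouped[fig["page_index"]].append(fig["figure_name"])
    return dict(grouped)
```

With the sample output above, every entry is flagged as likely noise, so the result would be an empty dictionary.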

2. List Figures on a Specific Page

To get figures from a specific page in your document:

# Get figures from a specific page
page_figures = client.files.list_file_page_figures(
    id="your-file-id",
    page_index=1,  # Page numbers start at 0
)

output:

[
  {
    "figure_name": "page_1_figure_1.jpg",
    "file_id": "71370e55-0f32-4977-b347-460735079386",
    "page_index": 1,
    "figure_size": 87724,
    "is_likely_noise": true,
    "confidence": 0.423
  },
  {
    "figure_name": "page_1_figure_2.jpg",
    "file_id": "71370e55-0f32-4977-b347-460735079386",
    "page_index": 1,
    "figure_size": 47724,
    "is_likely_noise": true,
    "confidence": 0.423
  }
]
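
When a page has several figures, you may want to choose one of them before fetching its image. As an illustration only, the sketch below picks the entry with the largest `figure_size`. It assumes plain dictionaries shaped like the output above and assumes that a larger `figure_size` means a bigger image; the function name is hypothetical.

```python
from typing import Optional, Sequence


def largest_figure_name(page_figures: Sequence[dict]) -> Optional[str]:
    """Return the figure_name with the largest figure_size, or None if empty."""
    if not page_figures:
        return None
    biggest = max(page_figures, key=lambda fig: fig["figure_size"])
    return biggest["figure_name"]
```

The returned name has the same form as the `figure_name` values in the listing, so it can be used wherever a figure name is expected.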

3. Get a Specific Figure

To retrieve a specific figure from your document:

# Get a specific figure
figure = client.files.get_file_page_figure(
    id="your-file-id",
    page_index=1,
    figure_name="page_1_figure_1.jpg",  # A figure_name returned by the list endpoints
)

output:

The base64-encoded image.
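
To turn that result into an image file, decode it and write the bytes to disk. This is a minimal sketch that assumes the value returned by `get_file_page_figure` is a base64 string, as described above; if your client version returns raw bytes instead, the decoding step is unnecessary. The helper names `decode_figure` and `save_figure` are hypothetical.

```python
import base64
from pathlib import Path


def decode_figure(encoded: str) -> bytes:
    """Decode a base64 string into raw image bytes."""
    return base64.b64decode(encoded)


def save_figure(encoded: str, destination: str) -> int:
    """Decode the figure and write it to destination; return bytes written."""
    data = decode_figure(encoded)
    Path(destination).write_bytes(data)
    return len(data)
```

For example, `save_figure(figure, "page_1_figure_1.jpg")` would write the figure fetched above to the current directory, given the base64 assumption.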