Python SDK
This guide shows how to classify documents using the Python SDK. You will:
- Create classification rules
- Upload files
- Submit a classify job
- Read predictions (type, confidence, reasoning)
The SDK ships in the llama-cloud-services package.
Setup
First, get an API key.
Put it in a .env file:
LLAMA_CLOUD_API_KEY=llx-xxxxxx
Install dependencies:
pip install llama-cloud-services python-dotenv
or with uv:
uv add llama-cloud-services python-dotenv
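Before constructing a client, it can help to fail fast when the key is missing. The following is a minimal sketch, assuming the key is read from a mapping such as os.environ after load_dotenv() has run. require_api_key is a hypothetical helper, not part of the SDK, and the llx- prefix check only mirrors the sample value shown above.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str], name: str = "LLAMA_CLOUD_API_KEY") -> str:
    """Return the API key from `env`, or raise a clear error (hypothetical helper)."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell environment")
    if not key.startswith("llx-"):
        # Assumption: real keys look like the sample value in this guide.
        raise RuntimeError(f"{name} does not look like an API key (expected an 'llx-' prefix)")
    return key


if __name__ == "__main__":
    # After load_dotenv(), os.environ also contains the values from .env.
    try:
        require_api_key(os.environ)
        print("API key found")
    except RuntimeError as exc:
        print("Configuration problem:", exc)
```

Taking the environment as a parameter keeps the check easy to test without touching the real process environment.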
Quick start
The snippet below uses the convenience ClassifyClient wrapper from llama-cloud-services, which uploads files, creates a classify job, polls for completion, and returns the results.
import os
from dotenv import load_dotenv
from llama_cloud.client import AsyncLlamaCloud
from llama_cloud.types import ClassifierRule, ClassifyParsingConfiguration, ParserLanguages
from llama_cloud_services.classify.client import ClassifyClient # helper wrapper
load_dotenv()
client = AsyncLlamaCloud(token=os.environ["LLAMA_CLOUD_API_KEY"])
project_id = "your-project-id"
organization_id = "your-organization-id"
classify = ClassifyClient(client, project_id=project_id, organization_id=organization_id)
rules = [
    ClassifierRule(
        type="invoice",
        description="Documents that contain an invoice number, invoice date, bill-to section, and line items with totals.",
    ),
    ClassifierRule(
        type="receipt",
        description="Short purchase receipts, typically from POS systems, with merchant, items and total, often a single page.",
    ),
]
parsing = ClassifyParsingConfiguration(
    lang=ParserLanguages.EN,
    max_pages=5,  # optional, parse at most 5 pages
    # target_pages=[1]  # optional, parse only specific pages (1-indexed), can't be used with max_pages
)
# for async usage, use `await classify.aclassify_file_paths(...)`
results = classify.classify_file_paths(
    rules=rules,
    file_input_paths=["/path/to/doc1.pdf", "/path/to/doc2.pdf"],
    parsing_configuration=parsing,
)
for item in results.items:
    # in cases of partial success, some of the items may not have a result
    if item.result is None:
        print(f"Classification job {item.classify_job_id} failed on file {item.file_id}")
        continue
    print(item.file_id, item.result.type, item.result.confidence)
    print(item.result.reasoning)
Notes:
- ClassifierRule requires a type and a descriptive description that the model can follow.
- ClassifyParsingConfiguration is optional; set lang, max_pages, or target_pages to control parsing.
- results.items contains one FileClassification per file with result.type, result.confidence, and result.reasoning.
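One way to act on the type and confidence fields is to separate confident predictions from those that need a human look. The sketch below uses a small stand-in dataclass instead of the SDK's own types so that it runs on its own. Prediction, triage, and the 0.8 threshold are illustrative assumptions, and the code assumes confidence is a number where higher means more certain.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Prediction:
    """Stand-in for one classified file (not the SDK's FileClassification)."""
    file_id: str
    type: Optional[str]  # None when the file has no result
    confidence: float = 0.0


def triage(
    predictions: Iterable[Prediction], threshold: float = 0.8
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into (accepted, needs_review) lists."""
    accepted: List[Prediction] = []
    needs_review: List[Prediction] = []
    for p in predictions:
        # Files without a result, or below the threshold, go to review.
        if p.type is not None and p.confidence >= threshold:
            accepted.append(p)
        else:
            needs_review.append(p)
    return accepted, needs_review


if __name__ == "__main__":
    sample = [
        Prediction("file-a", "invoice", 0.97),
        Prediction("file-b", "receipt", 0.55),
        Prediction("file-c", None),
    ]
    accepted, needs_review = triage(sample)
    print("accepted:", [p.file_id for p in accepted])
    print("needs review:", [p.file_id for p in needs_review])
```

With real results, each Prediction could be built from item.file_id, item.result.type, and item.result.confidence, using None for the type when item.result is None.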
Tips for writing good rules
- Be specific about content features that distinguish the type.
- Include key fields the document usually contains (e.g., invoice number, total amount).
- Add multiple rules when needed to cover distinct patterns.
- Start simple, test on a small set, then refine.
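The "test on a small set, then refine" loop is easier to follow with a score. The sketch below compares predicted types against a hand-labeled sample and counts which confusions occur; score_rules and the file names are hypothetical, and the predicted mapping is assumed to have been collected from a classify run.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def score_rules(
    expected: Dict[str, str], predicted: Dict[str, Optional[str]]
) -> Tuple[float, Counter]:
    """Return overall accuracy and a Counter of (expected, predicted) mistakes."""
    mistakes: Counter = Counter()
    correct = 0
    for file_name, want in expected.items():
        got = predicted.get(file_name)  # None if the file had no result
        if got == want:
            correct += 1
        else:
            mistakes[(want, got)] += 1
    accuracy = correct / len(expected) if expected else 0.0
    return accuracy, mistakes


if __name__ == "__main__":
    expected = {"a.pdf": "invoice", "b.pdf": "receipt", "c.pdf": "invoice"}
    predicted = {"a.pdf": "invoice", "b.pdf": "invoice", "c.pdf": "invoice"}
    accuracy, mistakes = score_rules(expected, predicted)
    print(f"accuracy: {accuracy:.2f}")
    for (want, got), count in mistakes.most_common():
        print(f"expected {want}, got {got}: {count}")
```

A frequent (expected, predicted) pair points at the two rule descriptions that most need sharper distinguishing features.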
Next steps
- Explore programmatic batching and progress tracking: run multiple uploads concurrently, then classify them in one job.
- Combine with Extract for downstream structured parsing after classification.
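As a rough illustration of the concurrent-upload idea, the sketch below bounds the number of uploads in flight with an asyncio semaphore. It does not call the SDK: upload_all is a hypothetical helper, and fake_upload is a placeholder for whatever upload coroutine your client provides.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def upload_all(
    paths: Sequence[str],
    upload_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Upload every path with at most `max_concurrency` uploads in flight.

    Returns the file IDs in the same order as `paths`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path: str) -> str:
        async with semaphore:
            return await upload_one(path)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(guarded(p) for p in paths))


async def fake_upload(path: str) -> str:
    """Placeholder for a real upload call; returns a made-up file ID."""
    await asyncio.sleep(0.01)
    return f"file-id-for-{path}"


if __name__ == "__main__":
    ids = asyncio.run(upload_all(["a.pdf", "b.pdf", "c.pdf"], fake_upload, max_concurrency=2))
    print(ids)
```

The collected IDs could then be submitted together in a single classify job; the exact method for that is not covered in this guide, so check the SDK reference before relying on a specific call.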