
Python Usage

Python options

Some parameters are specific to the Python implementation.

Number of workers

This sets the number of workers used to send API requests for parsing. The default is 4.

In Python:
from llama_parse import LlamaParse

parser = LlamaParse(
    num_workers=10,  # use 10 workers instead of the default 4
)
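
To give a feel for what a worker count means, the sketch below caps the number of requests in flight with an asyncio.Semaphore. It is an illustration only and makes no claim about how LlamaParse implements num_workers internally; `fake_parse` and `run_with_workers` are hypothetical names, and the sleep stands in for a real API request.

```python
import asyncio


async def fake_parse(path: str, limiter: asyncio.Semaphore, stats: dict) -> str:
    """Hypothetical stand-in for one parsing request; it only sleeps."""
    async with limiter:  # at most num_workers tasks are inside this block at once
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # placeholder for the network round trip
        stats["active"] -= 1
    return f"parsed:{path}"


async def run_with_workers(paths: list[str], num_workers: int = 4):
    """Process every path while keeping at most num_workers requests in flight."""
    limiter = asyncio.Semaphore(num_workers)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(*(fake_parse(p, limiter, stats) for p in paths))
    return results, stats["peak"]


# Ten files, but never more than three simulated requests at the same time.
paths = [f"file{i}.pdf" for i in range(10)]
results, peak = asyncio.run(run_with_workers(paths, num_workers=3))
```

Raising the worker count shortens a large batch only as long as the service accepts that many simultaneous requests.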

Check interval

In synchronous mode (see below), the Python client polls to check the status of the job. This sets the interval between checks, in seconds. The default is 1 second.

In Python:
parser = LlamaParse(
    check_interval=10,  # check the job status every 10 seconds
)
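
The polling pattern behind a check interval can be sketched in a few lines. This is a generic illustration, not LlamaParse's own code: `wait_for_job`, `get_status`, `max_wait`, and the status strings are all hypothetical names chosen for the example.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    check_interval: float = 1.0,
    max_wait: float = 60.0,
) -> str:
    """Call get_status every check_interval seconds until the job finishes."""
    deadline = time.monotonic() + max_wait
    while True:
        status = get_status()
        if status in ("SUCCESS", "ERROR"):  # assumed terminal states
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish within max_wait seconds")
        time.sleep(check_interval)  # wait before asking again
```

A longer interval sends fewer status requests, at the cost of noticing a finished job up to one interval late.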

Verbose mode

By default, LlamaParse will print the status of the job as it is uploaded and checked. You can disable this output.

In Python:
parser = LlamaParse(
    verbose=False,  # do not print job status messages
)

Use with SimpleDirectoryReader

You can use LlamaParse directly within LlamaIndex by using SimpleDirectoryReader. The example below reads the files in a directory called data, routes the .pdf files through LlamaParse, and returns the parsed documents.

from llama_parse import LlamaParse
from llama_index.core import SimpleDirectoryReader

parser = LlamaParse()

# Map the .pdf extension to the LlamaParse parser.
file_extractor = {".pdf": parser}
documents = SimpleDirectoryReader(
    "./data", file_extractor=file_extractor
).load_data()

Direct usage

It is also possible to call the parser directly, in one of four modes:

Synchronous parsing

documents = parser.load_data("./my_file.pdf")

Synchronous batch parsing

documents = parser.load_data(["./my_file1.pdf", "./my_file2.pdf"])

Asynchronous parsing

documents = await parser.aload_data("./my_file.pdf")

Asynchronous batch parsing

documents = await parser.aload_data(["./my_file1.pdf", "./my_file2.pdf"])
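
The asynchronous calls above use a bare await, which works in a notebook but not at the top level of an ordinary script. A minimal sketch of one way to run them from a script follows; `parse_all` and `run_from_script` are hypothetical helper names, and `parser` is assumed to be any object exposing `aload_data`, such as a LlamaParse instance.

```python
import asyncio


async def parse_all(parser, paths: list[str]) -> list:
    """Await the parser's asynchronous batch call and return its documents."""
    return await parser.aload_data(paths)


def run_from_script(parser, paths: list[str]) -> list:
    """Start an event loop, run the batch parse, and return the documents."""
    return asyncio.run(parse_all(parser, paths))
```

From a script this could be called as run_from_script(parser, ["./my_file1.pdf", "./my_file2.pdf"]). Inside code that already runs an event loop, awaiting parse_all directly is the better fit, since asyncio.run cannot be nested.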