Using in Python
First, get an API key. We recommend putting your key in a file called .env
that looks like this:
LLAMA_CLOUD_API_KEY=llx-xxxxxx
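Once the key has been loaded into the process environment (the script further down does this with load_dotenv()), a quick check can fail early with a readable message instead of a less obvious error later. The following is a minimal sketch using only the standard library; the helper name require_api_key is hypothetical and not part of llama-extract or python-dotenv.

```python
import os


def require_api_key(env=None, name="LLAMA_CLOUD_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of any library.
    """
    # Default to the real process environment; a plain dict can be
    # passed instead, which makes the check easy to try out.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return key
```

Calling require_api_key() right after load_dotenv() would surface a missing or empty key before any extraction is attempted.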
Set up a new Python environment using the tool of your choice (we used poetry init).
Then install the dependencies you’ll need:
pip install llama-extract python-dotenv
Now that our libraries and API key are available, let’s create an extract.py
file and extract data from files. In this case, we're using invoice documents from our examples:
# bring in our LLAMA_CLOUD_API_KEY
from dotenv import load_dotenv
load_dotenv()
# bring in deps
from llama_extract import LlamaExtract
# set up extractor
extractor = LlamaExtract()
# infer a schema from the files
extraction_schema = extractor.infer_schema("Our Schema", ["data/file1.pdf", "data/file2.pdf"])
# extract data using the inferred schema
results = extractor.extract(
    extraction_schema.id,
    ["data/file3.pdf", "data/file4.pdf"],
)
print(results)
Now run it like any Python file:
python extract.py
This prints the results of the extraction, containing the extracted data along with some additional information.
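Because the script passes file paths as plain strings, a mistyped path may only show up once the extractor is called. As a hedged sketch, the paths could be checked up front with the standard library; check_input_files is a hypothetical helper name, not part of llama-extract.

```python
from pathlib import Path


def check_input_files(paths):
    """Return the paths as strings, or raise if any file is missing.

    Hypothetical helper for illustration; not part of any library.
    """
    # Collect every missing path so that a single error reports them all.
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError("Missing input files: " + ", ".join(missing))
    return [str(p) for p in paths]
```

In extract.py, the list of paths could be passed through check_input_files(...) before handing it to the extractor, so a missing file is reported by name before the extractor is called.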