Input
The LlamaParse API supports several ways to supply a file to parse:
File
It is possible to send a file directly to LlamaParse using the `file` parameter. In this case, the `/upload` endpoint accepts multipart form data.
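As a sketch, a direct upload might look like the following (assuming a local file named `document.pdf`; the endpoint and headers mirror the curl examples used for the other input methods):

```shell
# Direct upload: POST the file as multipart form data to the /upload endpoint.
# Assumes $LLAMA_CLOUD_API_KEY is set and ./document.pdf exists locally.
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload' \
  -H 'accept: application/json' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'file=@./document.pdf'
```

With `--form`, curl sets the `Content-Type: multipart/form-data` header (including the boundary) automatically, so it does not need to be passed explicitly.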
S3
It is possible to specify an S3 path where the file is located. The bucket containing the file must be publicly accessible so that LlamaParse can read it.
To specify the S3 path, set it as `input_s3_path`.

In Python:

```python
parser = LlamaParse(
    input_s3_path="s3://bucketname/s3path"
)
```

Using the API:

```shell
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'input_s3_path="s3://bucketname/s3path"'
```
URL
It is possible to specify a URL for the file to parse. In this case, LlamaParse will try to download the file from the specified URL. If the URL is not accessible to LlamaParse, the job will fail. If the URL points to a website rather than a file, LlamaParse will try to parse the website's contents.
In Python:

```python
parser = LlamaParse(
    input_url="https://example.com/file.pdf"
)
```

Using the API:

```shell
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'input_url="https://example.com/file.pdf"'
```
It is also possible to specify an HTTP proxy URL to use when accessing the file. This can be useful for files on a private network that is not exposed to the internet. In this case, set the `http_proxy` argument.
In Python:

```python
parser = LlamaParse(
    http_proxy="http://proxyaddress.com"
)
```

Using the API:

```shell
curl -X 'POST' \
  'https://api.cloud.llamaindex.ai/api/parsing/upload' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  --form 'http_proxy="http://proxyaddress.com"'
```