Service Configurations

Each LlamaCloud service can be configured to suit your deployment. This page covers the available settings for each service.

Global Configurations

At the time of writing, the only global configurations are for external dependencies. For more information, please refer to the Dependencies page.

Backend Service

Qdrant

Qdrant is a popular vector database used to store and retrieve embeddings. Qdrant can be configured as a Data Sink on a project-by-project basis, or as a Data Sink shared across all projects and organizations. For the latter, set the following configuration:

# basic example
backend:
  config:
    qdrant:
      enabled: true
      url: "http://qdrant:6333"
      apiKey: "your-api-key"

# or, if you prefer to use an existing secret
backend:
  config:
    qdrant:
      enabled: true
      existingSecret: "qdrant-secret"

Jobs Worker Service

Several settings control how the Jobs Worker handles job execution.

Concurrency Settings

These settings control how concurrent jobs are distributed across the Jobs Worker pods. They are mainly used to:

- Prevent a noisy-neighbor problem, where a single user floods the job workers and starves other users.
- Prevent external resources such as Mongo, the Embeddings API, and vector databases from being overloaded with too many requests.

maxJobsInExecutionPerJobType: This setting defines the maximum number of concurrent jobs a user can have running per job type. It is used by the job runner to help prevent any one user from overloading the system.
- Set this to 0 to disable this concurrency check.
maxIndexJobsInExecution: This configuration specifies the maximum number of ingestion (indexing) jobs that a single pipeline is allowed to execute concurrently. It is applied to pipelines handling document ingestion and indexing operations to control resource usage.
- Set this to 0 to disable this concurrency check.
maxDocumentIngestionJobsInExecution: This parameter limits the number of concurrent document ingestion jobs a user can have in execution. Document ingestion is typically resource intensive, so this should be kept relatively low to avoid overloading the system.
- Set this to 0 to disable this concurrency check.

# example values.yaml for high throughput
jobsWorker:
  config:
    maxJobsInExecutionPerJobType: 25
    maxIndexJobsInExecution: 0
    maxDocumentIngestionJobsInExecution: 10
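
For contrast with the high-throughput example above, a more conservative profile might favor protecting external resources over throughput. The values below are illustrative assumptions, not tested recommendations; tune them against the capacity of your own Mongo, Embeddings API, and vector database deployments.

# example values.yaml for conservative resource usage (illustrative values only)
jobsWorker:
  config:
    maxJobsInExecutionPerJobType: 5
    maxIndexJobsInExecution: 1
    maxDocumentIngestionJobsInExecution: 2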

Timeout and Limit Settings

These settings control the execution timeout and data processing limits for jobs handled by the Jobs Worker service:

defaultTransformDocumentTimeoutSeconds: This setting defines the default timeout in seconds for document transformation jobs. Document transformation can be resource-intensive and may take significant time depending on document size and complexity.
- Default value is 240 (4 minutes).
transformEmbeddingCharLimit: This configuration specifies the maximum number of characters that can be processed in a single transform embedding operation. This helps prevent memory issues and ensures consistent performance when processing large documents.
- Default value is 11520000 (11.52 million characters).

# example values.yaml with custom timeout and limits
jobsWorker:
  config:
    defaultTransformDocumentTimeoutSeconds: "7200"  # 2 hours
    transformEmbeddingCharLimit: "2000000"  # 2 million characters

Note: These configuration values must be provided as quoted strings to prevent YAML parsing issues with large numbers.

LlamaParse

Job Throughput Settings

maxQueueConcurrency: This configuration sets the maximum number of jobs that the LlamaParse service processes concurrently. Raising it lets the service handle a higher volume of jobs, but also increases resource usage, so size it against the resources available to the service.
- Default value is 3.

# example values.yaml for high throughput
llamaParse:
  config:
    maxQueueConcurrency: 10
