Job Settings

Several environment variables can be set on the Job Workers pod in your Helm deployment's values.yaml to change how it handles job execution.

Example values.yaml with modified settings

Add custom environment variables to the jobsWorker.extraEnvVariables section of your values.yaml in the following format:

# values.yaml
jobsWorker:
  extraEnvVariables:
    - name: MAX_JOBS_IN_EXECUTION_PER_JOB_TYPE
      value: "10"
    - name: MAX_INDEX_JOBS_IN_EXECUTION
      value: "0"
    - name: MAX_DOCUMENT_INGESTION_JOBS_IN_EXECUTION
      value: "1"
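
The values in the example are quoted, presumably because Kubernetes expects environment variable values to be strings, so the worker has to convert them to integers before use. The following is a minimal sketch of that conversion, not the worker's actual code. The helper name `read_limit` is hypothetical, and treating an unset variable the same as a disabled check is an assumption made only for this illustration.

```python
import os
from typing import Mapping, Optional


def read_limit(name: str, env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the configured limit, or None when the check is disabled.

    Hypothetical helper. Assumption: an unset variable is treated like "0"
    (check disabled); the real worker's default behavior may differ.
    """
    raw = env.get(name, "0")
    limit = int(raw)  # raises ValueError for non-numeric input such as "ten"
    if limit < 0:
        raise ValueError(f"{name} must be >= 0, got {limit}")
    # A value of 0 disables the concurrency check.
    return None if limit == 0 else limit


# The same values as the values.yaml example, as the pod would see them.
example_env = {
    "MAX_JOBS_IN_EXECUTION_PER_JOB_TYPE": "10",
    "MAX_INDEX_JOBS_IN_EXECUTION": "0",
    "MAX_DOCUMENT_INGESTION_JOBS_IN_EXECUTION": "1",
}

for setting in example_env:
    print(setting, "->", read_limit(setting, example_env))
```

Returning `None` for a disabled check keeps "no limit" distinct from any real numeric limit, so callers cannot accidentally compare a job count against 0.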

Concurrency Settings

These settings control how concurrent jobs are distributed across the job worker pods. They are mainly used to:

  1. Prevent a noisy-neighbor problem, where a single user floods the job workers and starves other users
  2. Prevent external resources such as Mongo, the Embeddings API, and vector DBs from being overloaded with too many requests

MAX_JOBS_IN_EXECUTION_PER_JOB_TYPE

This setting defines the maximum number of concurrent jobs a user can have running per job type. It is used by the job runner to help prevent any one user from overloading the system.

Set this to 0 to disable this concurrency check.

MAX_INDEX_JOBS_IN_EXECUTION

This setting specifies the maximum number of ingestion (indexing) jobs that a single pipeline can execute concurrently. It applies to pipelines that handle document ingestion and indexing, and controls their resource usage.

Set this to 0 to disable this concurrency check.

MAX_DOCUMENT_INGESTION_JOBS_IN_EXECUTION

This setting limits the number of concurrent document ingestion jobs a user can have in execution. Document ingestion is typically resource-intensive, so keep this value relatively low to avoid overloading the system.

Set this to 0 to disable this concurrency check.
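
The three settings differ mainly in what they count by: per user and job type, per pipeline, and per user. The sketch below illustrates those scopes and the "0 disables the check" rule with a simple in-memory counter. It is an illustration under stated assumptions, not the job runner's implementation: the job type names `"index"` and `"document_ingestion"`, the functions `try_start` and `finish`, and the in-memory `running` counter are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Values mirror the example values.yaml; 0 disables a check.
LIMITS = {
    "MAX_JOBS_IN_EXECUTION_PER_JOB_TYPE": 10,
    "MAX_INDEX_JOBS_IN_EXECUTION": 0,
    "MAX_DOCUMENT_INGESTION_JOBS_IN_EXECUTION": 1,
}

# Number of jobs currently executing, keyed by scope (hypothetical store).
running: Counter = Counter()


def scopes(user: str, job_type: str, pipeline: str) -> List[Tuple[tuple, int]]:
    """Return (counter key, limit) pairs that apply to this job."""
    checks = [
        # Counted per user and per job type.
        (("user+type", user, job_type), LIMITS["MAX_JOBS_IN_EXECUTION_PER_JOB_TYPE"]),
    ]
    if job_type == "index":  # assumed job type name
        # Counted per pipeline.
        checks.append((("pipeline", pipeline), LIMITS["MAX_INDEX_JOBS_IN_EXECUTION"]))
    if job_type == "document_ingestion":  # assumed job type name
        # Counted per user.
        checks.append((("user", user), LIMITS["MAX_DOCUMENT_INGESTION_JOBS_IN_EXECUTION"]))
    return checks


def try_start(user: str, job_type: str, pipeline: str) -> bool:
    """Start the job only if every enabled limit still has room."""
    checks = scopes(user, job_type, pipeline)
    if any(limit != 0 and running[key] >= limit for key, limit in checks):
        return False  # at least one limit is reached; the job has to wait
    for key, _ in checks:
        running[key] += 1
    return True


def finish(user: str, job_type: str, pipeline: str) -> None:
    """Release the slots taken by a job that has completed."""
    for key, _ in scopes(user, job_type, pipeline):
        running[key] -= 1
```

With these example values, a second `try_start("alice", "document_ingestion", "p1")` is refused until the first job calls `finish`, while another user's ingestion job is still admitted. That is the noisy-neighbor protection in miniature.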