Set Concurrent Kubernetes Lake Garbage Collector Job Limits
On AWS, Qrvey 9.3 deployments include Kueue, a Kubernetes controller that limits how many Lake Garbage Collector (LGC) jobs can run concurrently.
Overview
New deployments automatically enable the job control mechanism for Lake Garbage Collector jobs. Kueue runs Garbage Collector jobs in the order they arrive in the queue, with a default limit of five concurrent pods (jobs).
Customers can ask DevOps to add data sync execution jobs to a similar queue.
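To see how the queue is behaving, the ClusterQueue status can be read directly. The following is a minimal sketch, assuming the installed Kueue version exposes the `pendingWorkloads` and `admittedWorkloads` status fields; the function name `lgc_queue_status` is hypothetical and not part of Qrvey or Kueue.

```shell
# Hypothetical helper: prints how many workloads are waiting and how many
# are currently admitted on the LGC ClusterQueue.
# Assumes the status fields below exist in your Kueue version.
lgc_queue_status() {
  kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc \
    -o jsonpath='pending={.status.pendingWorkloads} admitted={.status.admittedWorkloads}{"\n"}'
}
```

Running `lgc_queue_status` while jobs are queued should show a pending count above zero once the concurrent limit is reached.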
Before You Begin
Before setting up job controls, verify the following items:
- You have kubectl access to the Kubernetes cluster.
- The qrveyapps-jobs namespace is available with the Kueue controller installed.
- The ClusterQueue resource qrvey-jobs-cluster-queue-lakegc exists in the qrveyapps-jobs namespace.
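The checks above can be scripted before you change anything. This is a minimal sketch, assuming kubectl is already configured for the target cluster; the function name `lgc_preflight` is hypothetical, and the script only confirms that the named resources are visible to your credentials.

```shell
# Hypothetical helper: confirms the prerequisites listed above are reachable.
lgc_preflight() {
  # kubectl must be available on the PATH.
  command -v kubectl >/dev/null 2>&1 || {
    echo "kubectl not found" >&2; return 1; }

  # The jobs namespace must exist (this also proves cluster access).
  kubectl get namespace qrveyapps-jobs >/dev/null 2>&1 || {
    echo "namespace qrveyapps-jobs not reachable" >&2; return 1; }

  # The Kueue ClusterQueue CRD must be installed.
  kubectl get crd clusterqueues.kueue.x-k8s.io >/dev/null 2>&1 || {
    echo "Kueue CRDs not found" >&2; return 1; }

  # The LGC ClusterQueue itself must exist.
  kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc >/dev/null 2>&1 || {
    echo "ClusterQueue qrvey-jobs-cluster-queue-lakegc not found" >&2; return 1; }

  echo "preflight checks passed"
}
```

Call `lgc_preflight` and continue only if it prints the success message.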
Change the Concurrent Job Limit
The ClusterQueue's nominalQuota for pods determines how many pods (jobs) can run at the same time.
- Patch the ClusterQueue resource by running the following command, replacing <value> with the desired number:

      kubectl patch clusterqueue qrvey-jobs-cluster-queue-lakegc --type json -p='[
        {
          "op": "replace",
          "path": "/spec/resourceGroups/0/flavors/0/resources/2/nominalQuota",
          "value": "<value>"
        }
      ]'

  For example, to set the limit to 10 concurrent jobs:

      kubectl patch clusterqueue qrvey-jobs-cluster-queue-lakegc --type json -p='[
        {
          "op": "replace",
          "path": "/spec/resourceGroups/0/flavors/0/resources/2/nominalQuota",
          "value": "10"
        }
      ]'

- Verify the change by inspecting the ClusterQueue resource:

      kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc -o yaml
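The patch path addresses the pods quota by position (`resources/2`), which only works if pods is the third resource in the list. As a hedged sketch, the helpers below look up that position from the live resource instead of assuming it. The function names `resource_index` and `lgc_pods_patch` are hypothetical, and the sketch assumes the pods quota lives in the first resource group and first flavor, as in the path above.

```shell
# Print the zero-based position of a resource name within a list of names.
# Usage: resource_index pods cpu memory pods
resource_index() {
  target=$1
  shift
  i=0
  for name in "$@"; do
    if [ "$name" = "$target" ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1  # name not present
}

# Build the JSON patch for a given resource index and concurrent job limit.
# Usage: lgc_pods_patch <index> <limit>
lgc_pods_patch() {
  printf '[{"op":"replace","path":"/spec/resourceGroups/0/flavors/0/resources/%s/nominalQuota","value":"%s"}]' "$1" "$2"
}

# Example (requires cluster access):
#   names=$(kubectl get clusterqueue qrvey-jobs-cluster-queue-lakegc \
#     -o jsonpath='{.spec.resourceGroups[0].flavors[0].resources[*].name}')
#   idx=$(resource_index pods $names) || { echo "no pods resource" >&2; exit 1; }
#   kubectl patch clusterqueue qrvey-jobs-cluster-queue-lakegc \
#     --type json -p "$(lgc_pods_patch "$idx" 10)"
```

If the lookup returns an index other than 2, use that index in the patch path rather than the hard-coded one.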
Configure LGC Job Settings
LGC jobs are triggered from pods managed by the qrvey-job-manager deployment. To modify job settings, update the environment variables in that deployment.
| Variable | Default | Description |
|---|---|---|
| LAKEGC_QUEUE_NAME | qrvey-lakegc-queue | Name of the local queue where LGC jobs are sent. |
| LAKEGC_TTL_SECONDS_AFTER_FINISHED | 5 | Number of seconds to keep a completed LGC Kubernetes Job after it finishes before Kubernetes automatically deletes it. Use this variable to control how long finished jobs remain available for inspection, logs, and debugging before cleanup. |
| LAKEGC_JOB_CPU_REQUEST | 100m | Kubernetes CPU request assigned to the LGC job container. Defines the minimum CPU the scheduler reserves for the job. |
| LAKEGC_JOB_MEM_REQUEST | 256Mi | Kubernetes memory request assigned to the LGC job container. Defines the minimum memory the scheduler reserves for the job. |
| LAKEGC_JOB_CPU_LIMIT | 500m | Kubernetes CPU limit assigned to the LGC job container. Defines the maximum CPU the container is allowed to use at runtime. |
| LAKEGC_JOB_MEM_LIMIT | 512Mi | Kubernetes memory limit assigned to the LGC job container. Defines the maximum memory the container is allowed to use at runtime. |
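One way to change these variables is `kubectl set env`, which updates the deployment's pod template and rolls out new pods. The sketch below rests on assumptions: this page does not say which namespace holds qrvey-job-manager, so `JOB_MANAGER_NAMESPACE` is a hypothetical variable you must set yourself, the function name `lgc_set_env` is also hypothetical, and the values in the example are illustrative only.

```shell
# Hypothetical helper: updates LGC settings on the qrvey-job-manager deployment.
# Requires JOB_MANAGER_NAMESPACE to be set to the deployment's namespace
# (an assumption; the namespace is not stated in this document).
lgc_set_env() {
  if [ -z "${JOB_MANAGER_NAMESPACE:-}" ]; then
    echo "set JOB_MANAGER_NAMESPACE first" >&2
    return 1
  fi

  # Accept only NAME=VALUE pairs for the LAKEGC_* variables in the table above.
  for pair in "$@"; do
    case $pair in
      LAKEGC_*=*) ;;
      *) echo "not an LGC setting: $pair" >&2; return 1 ;;
    esac
  done

  kubectl set env deployment/qrvey-job-manager -n "$JOB_MANAGER_NAMESPACE" "$@"
}

# Example (requires cluster access):
#   JOB_MANAGER_NAMESPACE=<namespace>
#   lgc_set_env LAKEGC_TTL_SECONDS_AFTER_FINISHED=300 LAKEGC_JOB_MEM_LIMIT=1Gi
#   kubectl rollout status deployment/qrvey-job-manager -n "$JOB_MANAGER_NAMESPACE"
#   kubectl set env deployment/qrvey-job-manager -n "$JOB_MANAGER_NAMESPACE" --list
```

Keep each request at or below its matching limit; Kubernetes rejects a container whose request exceeds its limit.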