Understanding Chalk's job queue and resource groups
Together, Chalk's job queue and resource groups work much like warehouses in analytical data platforms: they provide dedicated, configurable compute resources for processing workloads.
A job queue consumer is a persistent worker process that consumes jobs from a queue and executes them one at a time. By configuring multiple resource groups with different job queue consumers, you can create isolated compute environments optimized for different workload types.
In the Chalk dashboard, this is configured under the “Job Queue Consumer” service. Earlier documentation referred to this service as the “Job Queue Server”; both names refer to the same underlying component.
The job queue handles two primary types of workloads:
- Scheduled queries
- Offline queries when `run_asynchronously=True` is set

```python
from chalk.client import ChalkClient

client = ChalkClient()

# This runs on the job queue
client.offline_query(
    input={'user.id': range(1_000_000)},
    output=['user.name'],
    run_asynchronously=True,  # Runs as a task on job queue
)

# This runs on the query server (NOT the job queue)
client.offline_query(
    input={'user.id': [1, 2, 3]},
    output=['user.name'],
    # run_asynchronously=False by default - runs as synchronous RPC
)
```

Jobs are processed in first-in, first-out (FIFO) order. Each job queue consumer processes one job at a time, sequentially.
Each job queue consumer has a single, pre-configured resource allocation (CPU and memory).
If a job requests resources larger than the job queue consumer can handle, Chalk automatically skips the queue and runs the job as a standalone Kubernetes pod with the requested resources.
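The fallback rule above can be pictured as a simple size check. This is a minimal sketch only: `route_job` and its parameters are hypothetical names, and it assumes a job leaves the queue when either its CPU or its memory request exceeds the consumer's allocation. The exact comparison Chalk performs is not specified here.

```python
def route_job(
    requested_cpu: float,
    requested_memory_gb: float,
    consumer_cpu: float,
    consumer_memory_gb: float,
) -> str:
    """Return where a job would run under the assumed rule (hypothetical helper)."""
    # Assumption: the job fits only if both CPU and memory fit the consumer.
    fits = (
        requested_cpu <= consumer_cpu
        and requested_memory_gb <= consumer_memory_gb
    )
    return "job_queue" if fits else "standalone_pod"


# Illustrative consumer sized at 8 CPU / 16 GB.
small = route_job(4, 8, consumer_cpu=8, consumer_memory_gb=16)
large = route_job(32, 450, consumer_cpu=8, consumer_memory_gb=16)
print(small, large)
```

Under this assumption the small job waits its turn in the queue, while the large one runs as a standalone Kubernetes pod.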
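Resource groups allow you to create multiple job queue consumers with different resource configurations. This is useful for isolating workloads from one another and for running jobs whose resource requirements differ significantly.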
In the Chalk dashboard under Infrastructure > Resource Configuration, you can configure the “Job Queue Consumer” for each resource group, including its CPU and memory allocation.
All Chalk environments start with a Default resource group.
To run a scheduled query on a specific resource group, set `resource_group`:

```python
from chalk import ScheduledQuery

# User is your feature class, defined elsewhere in your Chalk project
ScheduledQuery(
    name="large-batch-job",
    schedule="0 0 * * *",
    output=[User.features],
    resource_group="large-jobs",  # Runs on the "large-jobs" resource group
)
```

For an asynchronous offline query, pass the resource group through `ResourceRequests`:

```python
from chalk.client import ChalkClient, ResourceRequests

client = ChalkClient()

client.offline_query(
    input={'user.id': range(1_000_000)},
    output=['user.name'],
    run_asynchronously=True,
    resources=ResourceRequests(
        resource_group="large-jobs"  # Runs on the "large-jobs" resource group
    ),
)
```

For spill-heavy or Iceberg-backed async offline queries, see “Local SSDs for spilling and scan caching” for the full setup of a dedicated LSSD-backed resource group.
| Aspect | Job Queue Consumer | Query Server |
|---|---|---|
| Processes | Scheduled queries, async offline queries | Synchronous offline queries, online queries |
| Execution | One job at a time (FIFO) | Multiple concurrent requests |
| Resources | Fixed per resource group | Requested per query |
| Scaling | Horizontal (more instances) | Vertical (larger pods) |
| Workload Isolation | Jobs run sequentially without resource contention | Multiple concurrent queries may compete for resources on the same server |
| Timeout Behavior | Can run indefinitely beyond load balancer timeout | Will report an error if execution exceeds load balancer timeout |
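The timeout row suggests one way to choose between the two paths: if a query is likely to outlast the load balancer timeout, submit it asynchronously. The sketch below is illustrative only. `choose_execution`, the runtime estimates, and the 60-second timeout are all hypothetical; the real timeout depends on your deployment.

```python
def choose_execution(estimated_seconds: float, lb_timeout_seconds: float) -> dict:
    """Return offline_query keyword arguments for a runtime estimate.

    Hypothetical helper: not part of the Chalk client.
    """
    # Queries expected to reach the timeout go to the job queue instead.
    run_async = estimated_seconds >= lb_timeout_seconds
    return {"run_asynchronously": run_async}


LB_TIMEOUT_SECONDS = 60.0  # assumed value; check your own load balancer settings

quick = choose_execution(estimated_seconds=5, lb_timeout_seconds=LB_TIMEOUT_SECONDS)
slow = choose_execution(estimated_seconds=900, lb_timeout_seconds=LB_TIMEOUT_SECONDS)
print(quick, slow)
```

The returned dictionary could then be unpacked into a call such as `client.offline_query(input=..., output=..., **slow)`.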
- Create separate resource groups for jobs with significantly different resource requirements.
- Right-size your default job queue to handle typical workloads.
- Use resource groups for isolation.
- Monitor queue depth and adjust max instances if jobs are waiting too long.
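To get a feel for how instance count affects waiting time, the FIFO, one-job-at-a-time behavior can be simulated in a few lines. This is a rough model, not Chalk's scheduler: `fifo_wait_times` is a hypothetical helper, all jobs are assumed to be submitted at the same moment, and the durations are made up.

```python
import heapq


def fifo_wait_times(durations, num_consumers):
    """Simulate FIFO dispatch where each consumer runs one job at a time.

    durations: job run times in minutes, in submission order.
    Returns how long each job waits before it starts.
    """
    # Min-heap of the times at which each consumer next becomes free.
    free_at = [0.0] * num_consumers
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest free consumer takes the next job
        waits.append(start)
        heapq.heappush(free_at, start + duration)
    return waits


jobs = [30, 30, 30, 30]  # hypothetical: four 30-minute jobs submitted together
one = fifo_wait_times(jobs, num_consumers=1)
two = fifo_wait_times(jobs, num_consumers=2)
print(max(one), max(two))
```

In this toy model the last job waits 90 minutes with one consumer and 30 minutes with two, which is the kind of gap that rising queue depth tends to signal.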
Here’s a common setup with two resource groups:
```python
from chalk import ScheduledQuery

# Default resource group: moderate sizing for typical scheduled queries
# Configured in dashboard: 8 CPU, 16 GB memory
ScheduledQuery(
    name="daily-features",
    schedule="0 1 * * *",
    output=[User.daily_features],
    # Uses default resource group
)

# Large jobs resource group: high-memory machines for big batch processing
# Configured in dashboard: 32 CPU, 450 GB memory
ScheduledQuery(
    name="weekly-aggregations",
    schedule="0 0 * * 0",
    output=[User.historical_aggregates],
    resource_group="large-jobs",  # Uses dedicated high-memory queue
)
```