Automatic parallelization of scheduled offline queries
Metaplanning is an environment-level feature that automatically parallelizes scheduled offline queries by splitting them into multiple shards that run concurrently.
Contact our support team to enable metaplanning in your environments.
When metaplanning is enabled for your environment, every scheduled offline query goes through a metaplanning workflow that splits it into shards and runs those shards concurrently.
Example: A query with 100,000 rows and the default target of 10,000 rows per shard creates 10 parallel shard jobs.
Metaplanning is configured at the environment level. Once enabled, all scheduled offline queries in that environment use metaplanning automatically.
The shard size can be controlled via the AUTOSHARDER_TARGET_ROWS_PER_SHARD environment variable (default: 10,000 rows per shard).
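The shard arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration, not the metaplanner's actual implementation: the helper names (`target_rows_per_shard`, `shard_count`, `shard_ranges`) are hypothetical, and rounding up when the row count is not an exact multiple of the target is an assumption that the text above does not confirm.

```python
import os

# Default taken from the documentation: 10,000 rows per shard.
DEFAULT_TARGET_ROWS_PER_SHARD = 10_000


def target_rows_per_shard(env=None):
    """Read the shard size from the environment, falling back to the default.

    Hypothetical helper: how the platform itself reads this variable is not
    documented here.
    """
    env = os.environ if env is None else env
    raw = env.get("AUTOSHARDER_TARGET_ROWS_PER_SHARD")
    return int(raw) if raw else DEFAULT_TARGET_ROWS_PER_SHARD


def shard_count(total_rows, rows_per_shard):
    """Number of shards needed, rounding up (assumed behavior for remainders)."""
    if rows_per_shard <= 0:
        raise ValueError("rows_per_shard must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (total_rows + rows_per_shard - 1) // rows_per_shard


def shard_ranges(total_rows, rows_per_shard):
    """Half-open (start, end) row ranges, one per shard."""
    return [
        (start, min(start + rows_per_shard, total_rows))
        for start in range(0, total_rows, rows_per_shard)
    ]


if __name__ == "__main__":
    size = target_rows_per_shard()
    print(shard_count(100_000, size), "shards of up to", size, "rows")
```

With the default target, `shard_count(100_000, 10_000)` gives the 10 shard jobs from the example. Under the same assumptions, lowering the target produces more, smaller shards, and raising it produces fewer, larger ones.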
ScheduledQuery(
    name="daily_user_scores",
    outputs=[User.id, User.score],
    schedule="0 0 * * *",
)

With metaplanning enabled, this query will:
input is specified)