Infrastructure
What data flows between the Chalk Metadata Plane and Data Plane, and what the security implications are.
Chalk’s service architecture is divided into two planes: the Metadata Plane and the Data Plane. Understanding what flows between them is critical for security-conscious deployments, especially when configuring data residency controls or auditing what can transit outside your cloud boundary.
Key principle: The Metadata Plane orchestrates the Data Plane but never stores customer feature values. All production query traffic flows directly from your API clients to the Data Plane.
| Flow | Direction | Contains Customer Data? | Required? | Can Be Disabled? |
|---|---|---|---|---|
| Logs | Data → Metadata | No (metadata only) | No | Yes |
| Metrics | Data → Metadata | No | No | Yes |
| Query Execution | Metadata → Data | Yes (feature values) | No | Yes |
| EKS API Access | Metadata → Data | No | Yes | No |
| Container Images (ECR/Artifact Registry) | Metadata → Data | No | Yes | No |
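For audit purposes, the table above can be encoded as data and filtered. The following is a minimal sketch, not part of Chalk's tooling: the `Flow` class and the helper functions are hypothetical names introduced here for illustration, and the values are copied directly from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One row of the flow table (hypothetical helper type, not a Chalk API)."""
    name: str
    direction: str
    contains_customer_data: bool
    required: bool
    can_be_disabled: bool


# Values transcribed from the table above.
FLOWS = [
    Flow("Logs", "Data -> Metadata", False, False, True),
    Flow("Metrics", "Data -> Metadata", False, False, True),
    Flow("Query Execution", "Metadata -> Data", True, False, True),
    Flow("EKS API Access", "Metadata -> Data", False, True, False),
    Flow("Container Images", "Metadata -> Data", False, True, False),
]


def flows_with_customer_data(flows):
    """Names of flows that can carry customer feature values."""
    return [f.name for f in flows if f.contains_customer_data]


def optional_flows(flows):
    """Names of flows that can be turned off."""
    return [f.name for f in flows if f.can_be_disabled]


print(flows_with_customer_data(FLOWS))
print(optional_flows(FLOWS))
```

Filtering this way makes the key property of the table explicit: the only flow that can carry customer data is also one that can be disabled.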
Direction: Data Plane → Metadata Plane
Logs are collected by an OpenTelemetry Collector running in the Data Plane EKS/GKE cluster and optionally forwarded to the Metadata Plane for centralized dashboarding.
What’s included:
What’s NOT included: Actual feature values or customer PII. Logs contain metadata about computations (e.g. resolver name, latency, error type), not the data itself.
Disabling: Logs can be kept exclusively within the Data Plane by configuring the OpenTelemetry Collector to export to your own observability tooling (e.g. Dynatrace, Datadog) instead of forwarding to the Metadata Plane.
Direction: Data Plane → Metadata Plane
Performance and operational metrics are emitted by the Data Plane and optionally forwarded for centralized monitoring.
Metrics collected:
Metrics do not contain customer data—only aggregated operational statistics.
Disabling: Like logs, metrics can be routed exclusively to your own monitoring systems via OpenTelemetry exporter configuration.
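One way to verify that logs and metrics stay inside your boundary is to audit the Collector configuration for exporters that point at the Metadata Plane. The sketch below assumes the standard OpenTelemetry Collector layout (`exporters` plus `service.pipelines`) loaded into a Python dict. The hostnames, the `otlphttp/internal` exporter name, and the `exporters_leaving_boundary` helper are all hypothetical; substitute the values from your own deployment.

```python
from urllib.parse import urlparse

# Hypothetical DNS suffix standing in for the Metadata Plane; not a real Chalk hostname.
METADATA_PLANE_SUFFIX = "metadata.example.com"

# A Collector config as a dict, e.g. the result of parsing the YAML file.
collector_config = {
    "receivers": {"otlp": {"protocols": {"grpc": {}}}},
    "exporters": {
        # Hypothetical in-boundary observability endpoint.
        "otlphttp/internal": {"endpoint": "https://otel.observability.internal.example:4318"},
    },
    "service": {
        "pipelines": {
            "logs": {"receivers": ["otlp"], "exporters": ["otlphttp/internal"]},
            "metrics": {"receivers": ["otlp"], "exporters": ["otlphttp/internal"]},
        }
    },
}


def exporters_leaving_boundary(config, forbidden_suffix):
    """Return (pipeline, exporter) pairs whose endpoint host matches the forbidden suffix.

    Assumes endpoints are written as URLs with a scheme; exporters without an
    `endpoint` key are ignored by this simple check.
    """
    leaks = []
    for pipeline_name, pipeline in config["service"]["pipelines"].items():
        for exporter_name in pipeline.get("exporters", []):
            endpoint = config["exporters"][exporter_name].get("endpoint", "")
            host = urlparse(endpoint).hostname or ""
            if host == forbidden_suffix or host.endswith("." + forbidden_suffix):
                leaks.append((pipeline_name, exporter_name))
    return leaks


print(exporters_leaving_boundary(collector_config, METADATA_PLANE_SUFFIX))
```

An empty result means no active pipeline in the inspected configuration exports to a host under the forbidden suffix. It does not replace a network-level control such as an egress rule.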
Direction: Metadata Plane → Data Plane
This flow enables the Chalk web UI to execute queries interactively against your live Data Plane. The UI sends a request to the Metadata Plane, which acts as an API client, forwards the request to your Data Plane, and returns the results to the UI.
What’s included: Query inputs, feature outputs, and execution plan metadata. This flow can transmit customer feature values (including PII).
Disabling: This flow requires a VPC Endpoint (VPCE / PrivateLink) between the Metadata Plane and Data Plane. Removing that connection disables it entirely, ensuring customer data never transits to the Metadata Plane.
Disabling query execution connectivity is not a simple on/off toggle—it removes a significant portion of Chalk’s product capabilities:
Regardless of how you configure Metadata-to-Data-Plane connectivity, production online query traffic is not affected. When using Named Queries, your API clients talk directly to the Data Plane and never route through the Metadata Plane. OAuth token exchange still occurs via the Metadata Plane, but no feature values transit it.
Direction: Metadata Plane → Data Plane
The Metadata Plane needs access to your Data Plane’s Kubernetes API server to manage the lifecycle of your Chalk deployment.
What it’s used for:
What’s NOT included: No customer feature data is exposed via the Kubernetes API—only infrastructure metadata (pod status, deployment state, etc.).
Disabling: This flow is required. Without it, Chalk cannot deploy code changes, scale resources, or perform health monitoring.
There are two main options for how the Metadata Plane connects to the Data Plane Kubernetes API:
Option A: Public EKS API with IP whitelisting (recommended)
The EKS API server endpoint is publicly accessible, but a whitelist restricts access to the Chalk Metadata Plane’s IP ranges. All traffic is encrypted in transit (TLS), and every API call must pass AWS IAM authentication.
Benefits: simpler setup, guaranteed zero-downtime deployments, and no dependency on VPC Endpoint infrastructure.
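The effect of the IP whitelist in Option A can be illustrated with Python's standard `ipaddress` module. This is a sketch of the matching logic only: the CIDR blocks below are reserved documentation ranges used as placeholders, not Chalk's actual Metadata Plane ranges, and `is_allowed` is a hypothetical helper.

```python
import ipaddress

# Placeholder CIDRs from the reserved documentation ranges; replace with the
# Metadata Plane ranges that apply to your deployment.
ALLOWED_CIDRS = ["203.0.113.0/24", "198.51.100.16/28"]


def is_allowed(source_ip, allowed_cidrs):
    """Return True if source_ip falls inside any of the allowed CIDR blocks."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


print(is_allowed("203.0.113.42", ALLOWED_CIDRS))
print(is_allowed("192.0.2.1", ALLOWED_CIDRS))
```

A request from an address outside every listed block is rejected at the network layer before TLS or IAM authentication is attempted.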
Option B: Fully private EKS API
The EKS API server is only accessible from within the VPC. If the Metadata Plane and Data Plane are in separate VPCs, this requires VPC peering or a Transit Gateway.
Drawbacks: this option is operationally complex, and zero-downtime deployments cannot be guaranteed because EKS API endpoints do not have stable IP addresses. A change in endpoint IP can break connectivity until network rules are manually updated (estimated recovery: 15+ minutes).
This option is typically only warranted when regulatory requirements mandate a fully private control plane.
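If you do choose Option B, detecting endpoint IP changes early can shorten the recovery window. The sketch below is one possible monitoring approach, not a Chalk feature: it compares the IPs a hostname currently resolves to against the IPs your network rules reference. Both function names are hypothetical, and `resolve_ips` must run from a host that can resolve the private endpoint.

```python
import socket


def resolve_ips(hostname, port=443):
    """Resolve a hostname to the sorted list of IP addresses it currently maps to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def diff_endpoint_ips(resolved_ips, rule_ips):
    """Compare currently resolved IPs with the IPs referenced by network rules."""
    resolved, rules = set(resolved_ips), set(rule_ips)
    return {
        # Endpoint IPs the rules do not cover yet: connectivity is at risk.
        "missing_from_rules": sorted(resolved - rules),
        # Rule entries that no longer match any endpoint IP: safe to clean up.
        "stale_in_rules": sorted(rules - resolved),
    }


# Example with fixed inputs (private addresses chosen for illustration only).
print(diff_endpoint_ips(["10.0.1.15", "10.0.2.27"], ["10.0.1.15", "10.0.3.9"]))
```

A non-empty `missing_from_rules` list indicates that the endpoint has moved and the network rules need updating.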
Direction: Metadata Plane → Data Plane
When you deploy a new version of your Chalk project, the Metadata Plane’s Argo Image Builder builds Docker container images and pushes them to your ECR (AWS) or Artifact Registry (GCP) repository. The Data Plane then pulls these images when deploying.
What’s included: Docker container images for Chalk services. No customer feature data.
Disabling: This flow is required for deployments.
Establish a VPC Endpoint between the Metadata Plane and Data Plane, and rely on Chalk’s RBAC system for access control. This gives you full product functionality: planning engines, web UI query testing, real-time dashboards, and data quality tooling.
In a Customer Cloud deployment or Air-Gapped deployment where both planes run within your cloud boundary, this is the recommended configuration. Your data never leaves your infrastructure, and Chalk RBAC provides granular user-level access control over what can be queried via the UI.
Do not establish a VPCE connection from the Metadata Plane to the Data Plane. This ensures customer data can never transit the Metadata Plane under any circumstances.
Trade-offs:
This configuration is appropriate for organizations with strict data residency requirements where no customer data—even via authorized queries—can touch infrastructure outside a defined boundary.