Infrastructure
Background Persistence Installation Via the UI
Chalk uses background writers hosted in the “Customer Cloud” Kubernetes cluster to write information about queries to various storage locations.
To install Chalk persistence writers, you need the following:
If using Kafka:
If using Pubsub:
Navigate to the Settings/Team/Shared Resources/Background Persistence page in the Chalk UI
to view the background persistence configuration. If no background persistence is configured,
a message indicates that none is currently present, and the first save and apply will create
the background persistence writers.
Chalk supports different types of background persistence writers, each designed for specific data flow and storage purposes:
COPY INTO operations.
bigquery-streaming-write-loader is typically used instead.

Each writer type requires specific subscription IDs and topics to be configured in the common persistence specifications.
With Pubsub, topics and subscriptions are two separate entities, but with Kafka the same topic is used for both publishing and subscribing. Additionally, Kafka requires authentication credentials, whereas Pubsub authenticates with its Google identity.
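The Kafka rule above (one topic name serves both publishing and subscribing) can be checked mechanically. The following is a minimal sketch, not part of Chalk: the function name and the pairing of subscription fields to topic fields are assumptions inferred from the field names and the example values in this page.

```python
# Hypothetical pairing of subscription fields to topic fields, inferred from
# the field names and example values; this is not an official Chalk mapping.
TOPIC_FOR_SUBSCRIPTION = {
    "bigquery_streaming_write_subscription_id": "bigquery_streaming_write_topic",
    "bigquery_parquet_upload_subscription_id": "bigquery_upload_topic",
    "metrics_bus_subscription_id": "metrics_bus_topic_id",
}


def mismatched_kafka_topics(specs: dict) -> list[str]:
    """Return subscription fields whose value differs from the paired topic.

    Only applies when bus_backend is "KAFKA"; for any other backend the
    topic and subscription may legitimately differ, so nothing is reported.
    """
    if specs.get("bus_backend") != "KAFKA":
        return []
    return [
        sub
        for sub, topic in TOPIC_FOR_SUBSCRIPTION.items()
        if sub in specs and topic in specs and specs[sub] != specs[topic]
    ]
```

Passing the common persistence specs as a dictionary returns an empty list when every paired subscription and topic share a name, which is what a Kafka setup is expected to look like.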
In the JSON format, these fields are in the common_specs field.
Field                                 Type
bus_backend                           string
namespace                             string
service_account_name                  string
secret_client                         string
kafka_dlq_topic                       string
api_server_host                       string
kafka_sasl_secret                     string
kafka_bootstrap_servers               string
kafka_security_protocol               string
kafka_sasl_mechanism                  string
redis_is_clustered                    string
snowflake_storage_integration_name    string
metadata_provider                     string
name                                  string
bus_subscriber_type                   string
request                               object
limit                                 object
version                               string
default_replica_count                 int

In the JSON format, these fields are in the common_specs field but are not necessarily required.
Each writer requires an image and some, but not all, of the subscription and topic IDs.
Each writer’s specification form asks for the fields and images that writer requires.
Field                                       Type
bus_writer_image_go                         string
bus_writer_image_python                     string
bus_writer_image_bswl                       string
bigquery_parquet_upload_subscription_id     string
bigquery_streaming_write_subscription_id    string
bigquery_streaming_write_topic              string
bigquery_upload_bucket                      string
bigquery_upload_topic                       string
metrics_bus_subscription_id                 string
metrics_bus_topic_id                        string
result_bus_metrics_subscription_id          string
result_bus_offline_store_subscription_id    string
result_bus_online_store_subscription_id     string

The following is an example configuration for background persistence writers:
{
  "common_persistence_specs": {
    "bus_backend": "KAFKA",
    "bus_writer_image_go": "<go bus writer image>",
    "bus_writer_image_python": "<python bus writer image>",
    "bus_writer_image_bswl": "<bswl bus writer image>",
    "namespace": "background-persistence",
    "service_account_name": "background-persistence-sa",
    "secret_client": "AWS",
    "bigquery_parquet_upload_subscription_id": "offline-store-bulk-insert-bus-1",
    "bigquery_streaming_write_subscription_id": "offline-store-streaming-insert-bus-1",
    "bigquery_streaming_write_topic": "offline-store-streaming-insert-bus-1",
    "bigquery_upload_bucket": "s3://<your data bucket>",
    "bigquery_upload_topic": "offline-store-bulk-insert-bus-1",
    "metrics_bus_subscription_id": "metrics-bus-1",
    "metrics_bus_topic_id": "metrics-bus-1",
    "result_bus_metrics_subscription_id": "result-bus-1",
    "result_bus_offline_store_subscription_id": "result-bus-1",
    "result_bus_online_store_subscription_id": "result-bus-1",
    "kafka_dlq_topic": "dlq-1",
    "operation_subscription_id": "operation-bus-1"
  },
  "api_server_host": "<your api server here>",
  "kafka_sasl_secret": "<your aws kafka auth secret here>",
  "kafka_bootstrap_servers": "<bootstrap server1>:<port>, <bootstrap server2>:<port>, ...",
  "kafka_security_protocol": "SASL_SSL",
  "kafka_sasl_mechanism": "SCRAM-SHA-512",
  "redis_is_clustered": "1",
  "snowflake_storage_integration_name": "<snowflake integration name>",
  "metadata_provider": "GRPC_SERVER",
  "writers": [
    {
      "name": "go-metrics-bus-writer",
      "bus_subscriber_type": "GO_METRICS_BUS_WRITER",
      "request": {
        "cpu": "200m",
        "memory": "512Mi"
      },
      "limit": {
        "cpu": "1",
        "memory": "512Mi"
      },
      "version": "1.0",
      "default_replica_count": 1
    },
    {
      "name": "go-result-bus-metrics-writer",
      "bus_subscriber_type": "GO_RESULT_BUS_METRICS_WRITER",
      "request": {
        "cpu": "400m",
        "memory": "1024Mi"
      },
      "limit": {
        "cpu": "1",
        "memory": "1024Mi"
      },
      "version": "1.0",
      "default_replica_count": 1
    }
  ]
}
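
Before pasting a configuration like the one above into the UI, it can help to run a quick local sanity check. The sketch below is illustrative only and is not a Chalk tool: the function names, the set of Kafka fields treated as required, and the per-writer keys are assumptions drawn from the example configuration. Because the field listing and the example place the Kafka settings differently, the check accepts them either at the top level or inside common_persistence_specs.

```python
import json

# Assumption: a Kafka-backed setup needs these four fields (taken from the example).
KAFKA_FIELDS = (
    "kafka_bootstrap_servers",
    "kafka_sasl_secret",
    "kafka_security_protocol",
    "kafka_sasl_mechanism",
)
# Keys that every writer entry in the example configuration carries.
WRITER_KEYS = (
    "name",
    "bus_subscriber_type",
    "request",
    "limit",
    "version",
    "default_replica_count",
)


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    common = config.get("common_persistence_specs", {})

    if common.get("bus_backend") == "KAFKA":
        for field in KAFKA_FIELDS:
            # Accept the field at the top level or inside the common specs.
            if not (config.get(field) or common.get(field)):
                problems.append(f"missing Kafka field: {field}")

    seen = set()
    for index, writer in enumerate(config.get("writers", [])):
        for key in WRITER_KEYS:
            if key not in writer:
                problems.append(f"writer {index}: missing {key}")
        name = writer.get("name")
        if name in seen:
            problems.append(f"writer {index}: duplicate name {name}")
        seen.add(name)
    return problems


def check_config_text(text: str) -> list[str]:
    """Parse a JSON document and run check_config on it."""
    return check_config(json.loads(text))
```

Calling check_config_text with the contents of a saved configuration file returns the list of problems found, so an empty list suggests the document is at least structurally complete under these assumptions.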