1. Online Store Survey

Chalk SQL provides a virtual table that lets you survey the contents of your Redis or Valkey online store. This is useful for capacity planning, identifying deprecated features that haven’t been cleaned up, and understanding which feature namespaces consume the most memory.

Prerequisites

To run online store survey queries, you need:

  • A Redis or Valkey online store configured and operational
  • Access to the SQL Explorer in the Chalk dashboard
  • Queries can run in either synchronous or asynchronous mode—the online store schema is available in both execution paths

For background on these tools, see:

The chalk.online_store.keys table

The chalk.online_store.keys view exposes metadata about every key in your Redis or Valkey online store. The table has the following schema:

  • key (binary) — The raw key stored in the online store.
  • key_type (string) — The type of key: Scalar, HasMany, TimeSeries, etc.
  • cache_prefix (string) — The configured cache prefix for this key.
  • fqn (string) — The fully qualified feature name (e.g. user.fraud_score).
  • memory_usage (uint64) — Bytes consumed in the store by this key.
  • ttl_seconds (uint64) — Time-to-live remaining in seconds.

Safety warning

Querying chalk.online_store.keys without a LIMIT performs a full Redis/Valkey SCAN, which can significantly impact production performance. Always use aLIMIT clause in the inner subquery to cap the number of keys scanned. Because keys are proportionally distributed across the scan space, a limited sample provides a representative picture of your online store's contents.

Redis/Valkey are key-value stores and as such do not support indexing on different "columns" out-of-the-box. A "select-from-where" query with a filter will still scan the online store, yielding only rows that match the filter. This means that a filter that doesn't match any keys might perform a full scan -- e.g. `SELECT * FROM "chalk.online_store.keys" WHERE fqn='nonexistent_feature'`. Support for filtering by _primary_ key (i.e. performing a direct redis GET instead of a SCAN for a certain id) is not currently supported for this SQL interface but is coming soon!

Survey queries

In the SQL Explorer settings sidebar, enable Persist Results to save query output to cloud storage. This makes results available in the dashboard for later review. Asynchronous queries always persist results automatically, but for synchronous queries you must enable this toggle explicitly. This is recommended for larger survey queries where you want to revisit or share results.

Sampling keys

Start with a small sample to see what’s in your online store:

SELECT * FROM chalk.online_store.keys LIMIT 100;

Key distribution and memory by feature namespace

Aggregate over a sample to see how keys and memory are distributed across feature namespaces:

SELECT key_type, fqn, count(*), sum(memory_usage)
FROM (SELECT * FROM chalk.online_store.keys LIMIT 20000)
GROUP BY key_type, fqn;

Ordered by memory usage

Find the largest consumers by sorting on total memory:

SELECT key_type, fqn, count(*), sum(memory_usage)
FROM (SELECT * FROM chalk.online_store.keys LIMIT 100000)
GROUP BY key_type, fqn
ORDER BY sum(memory_usage) DESC;

Checking for a specific feature namespace

Verify whether a particular namespace still has keys in the store — useful after removing a feature class:

SELECT * FROM (
  SELECT * FROM chalk.online_store.keys LIMIT 100
) WHERE fqn = 'my_feature_namespace';

Interpreting results

  • Representative sampling — Because keys are proportionally distributed across the Redis/Valkey scan space, a limited sample gives an accurate picture of the overall distribution. Derive percentages by dividing the count for each fqn by the total count in your sample.
  • Memory usage — The memory_usage column reports bytes per key. Sum these values per namespace to estimate how much of the store each feature class occupies.
  • Key types — The key_type column indicates the kind of feature data stored: Scalar for single values, HasMany for relationship data, TimeSeries for time-indexed features, etc.
  • TTL — The ttl_seconds column shows how long until a key expires. Keys with no expiration will show a large or zero value depending on your store configuration.

Common use cases

Capacity planning

Use the memory-ordered query to identify which namespaces dominate your online store. This informs scaling decisions and helps you decide whether to tune max_staleness to reduce retention for features that don’t need long cache lifetimes.

Verifying deprecated feature cleanup

After removing a feature class from your Chalk project, keys may still exist in the online store. Chalk runs a nightly cleanup job that removes keys for features no longer in active deployments. Use the namespace-specific query above to confirm that the cleanup has completed.

Reducing online store size

If your online store is growing large, consider:

  • Lowering max_staleness on features that don’t need long retention — see Feature Caching
  • Removing feature classes that are no longer used and waiting for the nightly cleanup
  • Reviewing HasMany and TimeSeries key types, which tend to consume more memory per entity