# BigQuery
source: https://docs.chalk.ai/docs/bigquery

## Integrate with your BigQuery data warehouse.

Chalk has an integration with BigQuery
that makes it easy to read queries and tables into your feature store.

### Authorization

To use BigQuery in your resolvers, you first need to add the Chalk GCP integration
to the environments where you would like to use BigQuery.

When querying your BigQuery data source, Chalk pushes filters down into your
queries to reduce the amount of data read from your tables. For larger queries, interpolating
values directly into the SQL string can run into BigQuery's query length limits, so Chalk
instead uses a table to temporarily hold the values to query against.
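The trade-off between inlining values and joining against a temporary table can be illustrated with a minimal sketch. This is not Chalk's implementation: the function `build_filtered_query`, the `max_inline_values` threshold, and the scratch table ID are hypothetical names chosen only to show the general technique.

```python
from typing import List, Tuple


def build_filtered_query(
    base_query: str,
    key_column: str,
    keys: List[int],
    scratch_table: str,
    max_inline_values: int = 1000,  # hypothetical threshold, not a Chalk or BigQuery constant
) -> Tuple[str, bool]:
    """Return (sql, uses_scratch_table) for a filter on `key_column`.

    Small key sets are inlined into an IN (...) list. Larger key sets are
    assumed to have been loaded into `scratch_table` beforehand, and the
    query joins against that table so the SQL string stays short.
    """
    if not keys:
        raise ValueError("keys must not be empty")
    if len(keys) <= max_inline_values:
        inline = ", ".join(str(int(k)) for k in keys)
        sql = f"SELECT * FROM ({base_query}) AS q WHERE q.{key_column} IN ({inline})"
        return sql, False
    sql = (
        f"SELECT q.* FROM ({base_query}) AS q "
        f"JOIN `{scratch_table}` AS t ON q.{key_column} = t.{key_column}"
    )
    return sql, True


base = "SELECT id, credit_score FROM users"
scratch = "scratch-project.scratch_dataset.tmp_keys"  # placeholder table ID

small_sql, small_uses_table = build_filtered_query(base, "id", [1, 2, 3], scratch)
large_sql, large_uses_table = build_filtered_query(base, "id", list(range(5000)), scratch)
```

In this sketch the three-key query is filtered inline, while the 5000-key query is rewritten as a join, so its length no longer grows with the number of keys.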

Chalk stores these temporary tables in a dedicated project and dataset that you configure for scratch usage. The same dataset is also used for unload operations.
We recommend configuring this dataset with a TTL (Time To Live) on all tables, for example 6 hours.
The TTL should be bounded by your maximum job lifetime, so that temporary resources
are cleaned up automatically.
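One way to apply such a TTL is to set a default table expiration on the scratch dataset. The sketch below assumes the `google-cloud-bigquery` client library and credentials that are allowed to update the dataset. The helper names and the dataset ID in the usage comment are placeholders, not values taken from Chalk.

```python
def hours_to_ms(hours: float) -> int:
    """Convert a TTL in hours to the milliseconds BigQuery expects."""
    return int(hours * 60 * 60 * 1000)


def set_default_table_ttl(dataset_id: str, ttl_hours: float = 6) -> None:
    """Set the default table expiration on an existing dataset.

    `dataset_id` has the form "project.dataset".
    """
    # Imported lazily so hours_to_ms can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    dataset = client.get_dataset(dataset_id)
    dataset.default_table_expiration_ms = hours_to_ms(ttl_hours)
    # Send only the field that changed.
    client.update_dataset(dataset, ["default_table_expiration_ms"])


# Usage (placeholder IDs, requires credentials):
# set_default_table_ttl("my-scratch-project.chalk_scratch", ttl_hours=6)
```

Note that a dataset's default expiration applies to tables created after the setting is changed. Tables that already exist keep their current expiration.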

The service account that you register in the data source needs the following permissions for Chalk
to integrate fully with BigQuery. Here, the "target dataset" is the temporary project and temporary dataset
if they are specified within the data source. If not, the target dataset is the project and dataset
of the data source itself.

- bigquery.readsessions.create on the target dataset
- bigquery.readsessions.getData on the target dataset
- bigquery.tables.getData on the target dataset and all referenced datasets in queries
- bigquery.tables.get on the target dataset and all referenced datasets in queries
- bigquery.tables.create on the target dataset, for table pushdown in larger Chalk queries
- bigquery.tables.delete on the target dataset, for cleaning up the aforementioned table pushdown
- bigquery.tables.updateData on the target dataset
- bigquery.jobs.create on the target project
- bigquery.jobs.get on the target project
- bigquery.datasets.get on all projects, for Chalk SQL compatibility
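The list above can be turned into a small checklist for auditing a service account. The following is a minimal sketch: the scope labels and the `missing_permissions` helper are hypothetical, and how you obtain the set of granted permissions (for example from an IAM policy export) is left out.

```python
from typing import Dict, Iterable, List

# Permissions from the list above, grouped by the scope they apply to.
REQUIRED_PERMISSIONS: Dict[str, List[str]] = {
    "target dataset": [
        "bigquery.readsessions.create",
        "bigquery.readsessions.getData",
        "bigquery.tables.getData",
        "bigquery.tables.get",
        "bigquery.tables.create",
        "bigquery.tables.delete",
        "bigquery.tables.updateData",
    ],
    "referenced datasets": [
        "bigquery.tables.getData",
        "bigquery.tables.get",
    ],
    "target project": [
        "bigquery.jobs.create",
        "bigquery.jobs.get",
    ],
    "all projects": [
        "bigquery.datasets.get",
    ],
}


def missing_permissions(scope: str, granted: Iterable[str]) -> List[str]:
    """Return the required permissions for `scope` that are not in `granted`."""
    granted_set = set(granted)
    return [p for p in REQUIRED_PERMISSIONS[scope] if p not in granted_set]


# Example: a service account that can create jobs but not read their status.
gaps = missing_permissions("target project", ["bigquery.jobs.create"])
```

An empty result for every scope means the account covers the full list.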

Using BigQuery's predefined IAM roles, you can get these permissions by ensuring that the service account has the following:

- roles/bigquery.jobUser on the target project
- roles/bigquery.dataEditor on the target dataset
- roles/bigquery.dataViewer on the target dataset and all referenced datasets in queries

You can learn more about the various BigQuery IAM roles and permissions in Google Cloud's BigQuery access control documentation.

### Integrations Setup

After adding the GCP integration and configuring BigQuery in it, define your data sources in Python:

```
from chalk.sql import BigQuerySource

risk = BigQuerySource(name="RISK")
marketing = BigQuerySource(name="MARKETING")
```

You can then reference them in SQL file resolvers using the name parameter. For example, to query from the RISK source:

```
-- type: online
-- resolves: User
-- source: RISK
SELECT id, credit_score FROM users
```

And to query from the MARKETING source:

```
-- type: online
-- resolves: User
-- source: MARKETING
SELECT id, email, campaign_status FROM users
```
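Because SQL file resolvers refer to a source only by name, a typo in the `-- source:` comment is easy to miss. The sketch below shows one way to check resolver files against the names you declared. It is a hypothetical helper, not Chalk's own parser, and it assumes the comment format shown in the examples above.

```python
import re
from typing import Optional, Set

# Matches a line such as "-- source: RISK" and captures the name.
SOURCE_COMMENT = re.compile(r"^\s*--\s*source:\s*(\S+)\s*$", re.MULTILINE)


def resolver_source(sql_text: str) -> Optional[str]:
    """Return the source name declared in a SQL file resolver, if any."""
    match = SOURCE_COMMENT.search(sql_text)
    return match.group(1) if match else None


def uses_declared_source(sql_text: str, declared: Set[str]) -> bool:
    """True if the resolver names a source that is in `declared`."""
    return resolver_source(sql_text) in declared


declared_sources = {"RISK", "MARKETING"}  # names passed to BigQuerySource(name=...)

risk_resolver = """-- type: online
-- resolves: User
-- source: RISK
SELECT id, credit_score FROM users
"""

typo_resolver = risk_resolver.replace("RISK", "RISC")

risk_ok = uses_declared_source(risk_resolver, declared_sources)
typo_ok = uses_declared_source(typo_resolver, declared_sources)
```

Here `risk_ok` is true and `typo_ok` is false, since `RISC` does not match any declared source name.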