Google Cloud BigQuery Dataset¶
Google Cloud BigQuery is a service that provides a relational database optimized for analytical workloads. It is a good choice for storing relational data that you want to analyze.
NAIS Application yaml manifest options¶
Full documentation of all available options can be found at spec.gcp.bigQueryDatasets[].
An example of an application using a BigQuery Dataset provisioned through nais.yaml can be found here: testapp.
Minimal Working Example¶
Connecting to the Dataset¶
When connecting your BigQuery client, you need to specify the project ID and the dataset ID.
The project ID is available in the GCP_TEAM_PROJECT_ID environment variable.
There's no automatic environment variable for the dataset ID.
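// Assumes the official google-cloud-bigquery Java client library.
import com.google.cloud.bigquery.BigQueryOptions
import com.google.cloud.bigquery.DatasetId
val projectId = System.getenv("GCP_TEAM_PROJECT_ID")
val datasetId = DatasetId.of(projectId, "my_bigquery_dataset") // no env variable for the dataset ID
val bigQueryClient = BigQueryOptions.newBuilder().setProjectId(projectId).build().service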
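Since only the project ID comes from the environment, it can help to fail fast when the variable is missing and to build table references in one place. The following is a minimal sketch using only the Kotlin standard library. The helper names `requireProjectId` and `qualifiedTableName`, the sample project ID, and the table name `my_table` are hypothetical and are not part of NAIS or of any BigQuery client library.

```kotlin
// Hypothetical helper: reads the project ID and fails fast if it is not set.
// The environment is passed as a map so the function can be tested without a NAIS pod.
fun requireProjectId(env: Map<String, String> = System.getenv()): String =
    env["GCP_TEAM_PROJECT_ID"] ?: error("GCP_TEAM_PROJECT_ID is not set")

// Hypothetical helper: builds a backtick-quoted `project.dataset.table` reference
// of the form used in BigQuery standard SQL.
fun qualifiedTableName(projectId: String, datasetId: String, table: String): String =
    "`${projectId}.${datasetId}.${table}`"

fun main() {
    // Sample environment for illustration; inside a NAIS pod, call requireProjectId() without arguments.
    val projectId = requireProjectId(mapOf("GCP_TEAM_PROJECT_ID" to "my-team-project"))
    // The dataset ID is hard-coded because no environment variable is provided for it.
    println(qualifiedTableName(projectId, "my_bigquery_dataset", "my_table"))
}
```

Keeping the dataset ID as an explicit constant mirrors the fact that it has to match the name given in the application manifest.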
Caveats to be aware of¶
Once a BigQuery Dataset is provisioned, it will not be automatically deleted unless one explicitly sets spec.gcp.bigQueryDatasets[].cascadingDelete to true.
Clean-up is done by deleting the application resource and deleting the BigQuery Dataset directly in console.cloud.google.com.
When the specified BigQuery Dataset contains no tables, deleting the NAIS application will delete the whole BigQuery Dataset, even if spec.gcp.bigQueryDatasets[].cascadingDelete is set to false.
The name of your Dataset must be unique within your team's GCP project.
The NAIS Manifest does not currently support updating any setting of existing BigQuery Datasets.
Thus, if you want a read connection to an already-created BigQuery Dataset, take a look at nais/dp.
Since Kubernetes does not permit underscores (_) in the names of any K8s resource, any underscores will be converted to hyphens (-).
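Two of the caveats above can be restated as small functions, which may make the rules easier to check at a glance. This is only an illustrative model of the behaviour described in this section, not NAIS code; both function names are hypothetical, and the name conversion covers only the underscore rule mentioned here.

```kotlin
// Hypothetical model of the naming caveat: underscores in the dataset name
// become hyphens in the name of the Kubernetes resource.
fun toKubernetesResourceName(datasetName: String): String =
    datasetName.replace('_', '-')

// Hypothetical model of the deletion caveats: deleting the application removes
// the dataset when cascadingDelete is true, or when the dataset has no tables.
fun datasetDeletedWithApplication(cascadingDelete: Boolean, hasTables: Boolean): Boolean =
    cascadingDelete || !hasTables

fun main() {
    println(toKubernetesResourceName("my_bigquery_dataset")) // my-bigquery-dataset
    println(datasetDeletedWithApplication(cascadingDelete = false, hasTables = true))  // false
    println(datasetDeletedWithApplication(cascadingDelete = false, hasTables = false)) // true
    println(datasetDeletedWithApplication(cascadingDelete = true, hasTables = true))   // true
}
```

The hyphenated form is the one to expect when looking up the dataset resource in the cluster.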
Example with all configuration options¶
See full example.
Troubleshooting¶
If you have problems getting your BigQuery Dataset up and running, check the event log for errors.
Created: 2021-06-09