
Persistent Data

This section describes how to work with persistent data in your applications and the storage options available to you.

Persistent data is data that is stored on disk and survives application restarts. This is in contrast to ephemeral data which is stored in memory and is lost when the application is restarted.
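
The difference can be illustrated with a minimal Python sketch. It assumes a local JSON file stands in for persistent storage; the file name and the helpers `load_counter` and `save_counter` are hypothetical and not part of the platform. A real application would use one of the storage options described in this section.

```python
import json
import tempfile
from pathlib import Path


def load_counter(path: Path) -> int:
    """Read the persisted counter, or start at 0 if nothing was saved yet."""
    if path.exists():
        return json.loads(path.read_text())["count"]
    return 0


def save_counter(path: Path, count: int) -> None:
    """Write the counter to disk so it is still there after a restart."""
    path.write_text(json.dumps({"count": count}))


# Hypothetical location for the demo; created fresh on every run.
state_file = Path(tempfile.mkdtemp()) / "counter.json"

# Ephemeral: lives only in this process's memory.
ephemeral_count = 0
ephemeral_count += 1

# Persistent: read from disk, increment, write back.
save_counter(state_file, load_counter(state_file) + 1)

# Simulate a restart: in-memory state is reset, the file is untouched.
ephemeral_count = 0
persistent_count = load_counter(state_file)

print(ephemeral_count, persistent_count)
```

After the simulated restart the in-memory counter has lost its value, while the counter read back from the file has kept it.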

Responsibilities

The team is responsible for any data that is stored in the various storage options that are available through the platform. You can read more in the Data Responsibilities section.

Availability

Some of the storage options are only available from certain environments. Make sure to check what storage options are available in your environment in the Storage Comparison section below.

What should I choose?

Ask yourself the following sequence of questions to choose the right storage option. Choose wisely.

graph TD
  A[I got data!] --> B[Is it structured?]
  B --> |Yes| C[Is it events?]
  B --> |No| D[Is it files?]
  C --> |Yes| Kafka
  C --> |No| E[Is it analytical?]
  D --> |Yes| GCS[Cloud Storage]
  D --> |No| Opensearch[OpenSearch]
  E --> |Yes| GBQ[BigQuery]
  E --> |No| GCSQL[Cloud SQL]

  click GBQ "#bigquery"
  click GCS "#cloud-storage-buckets"
  click GCSQL "#cloud-sql"
  click Kafka "#kafka"
  click Opensearch "#opensearch"
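
The same sequence of questions can be written down as a small Python function. This is only a sketch of the flowchart above: the function name `choose_storage` and its boolean parameters are hypothetical, and the returned names are the labels used in the diagram.

```python
def choose_storage(structured: bool, events: bool = False,
                   analytical: bool = False, files: bool = False) -> str:
    """Walk the flowchart questions in order and return the suggested option."""
    if structured:
        # Structured data: events first, then analytical, else relational.
        if events:
            return "Kafka"
        if analytical:
            return "BigQuery"
        return "Cloud SQL"
    # Unstructured data: files go to object storage, the rest to OpenSearch.
    if files:
        return "Cloud Storage"
    return "OpenSearch"


print(choose_storage(structured=True, events=True))
print(choose_storage(structured=False, files=True))
```

Parameters that a branch never reaches are ignored, which mirrors the diagram: the "Is it files?" question is only asked when the data is not structured.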

Storage Comparison

The table below compares the storage options available to you.

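| Name          | Type        | Recommendation | Availability | Backup |
|---------------|-------------|----------------|--------------|--------|
| Kafka         | Streaming   |                | All          | Yes*   |
| Cloud Storage | Object      |                | GCP          | Yes*   |
| Cloud SQL     | Relational  |                | GCP          | Yes    |
| BigQuery      | Relational  |                | GCP          | Yes*   |
| Elasticsearch | Document    | ⚠️             | All          | Yes    |
| OpenSearch    | Document    |                | All          | Yes    |
| Redis         | Key/Value   |                | All          | Yes    |
| InfluxDB      | Time Series |                | All          | Yes    |
| IBM MQ        | Message     | ⚠️             | All          | Yes*   |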

* Data is highly available and fault-tolerant but not backed up if deleted by mistake.
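
To check which options apply to your environment, the comparison can be encoded as data and filtered. The sketch below copies the rows from the table above. It makes two assumptions that the table does not spell out: the ⚠️ marker is read as "not recommended", and the environment name "on-prem" is a hypothetical stand-in for any environment other than GCP. The names `StorageOption` and `options_for` are illustrative only.

```python
from typing import NamedTuple


class StorageOption(NamedTuple):
    name: str
    kind: str
    recommended: bool       # False where the table shows ⚠️ (assumed meaning)
    availability: str       # "All" or "GCP", copied from the table
    deletion_backup: bool   # False where Backup is marked "Yes*"


OPTIONS = [
    StorageOption("Kafka", "Streaming", True, "All", False),
    StorageOption("Cloud Storage", "Object", True, "GCP", False),
    StorageOption("Cloud SQL", "Relational", True, "GCP", True),
    StorageOption("BigQuery", "Relational", True, "GCP", False),
    StorageOption("Elasticsearch", "Document", False, "All", True),
    StorageOption("OpenSearch", "Document", True, "All", True),
    StorageOption("Redis", "Key/Value", True, "All", True),
    StorageOption("InfluxDB", "Time Series", True, "All", True),
    StorageOption("IBM MQ", "Message", False, "All", False),
]


def options_for(environment: str, include_discouraged: bool = False) -> list:
    """Return the names of options usable in the given environment."""
    return [
        o.name
        for o in OPTIONS
        if o.availability in ("All", environment)
        and (o.recommended or include_discouraged)
    ]


print(options_for("GCP"))
print(options_for("on-prem"))
```

Options marked "All" match every environment name, while options marked "GCP" are returned only when the environment is "GCP".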

Kafka

Kafka is a streaming platform for storing and processing data. It is powerful and fits a wide variety of use cases, but it is also complex and takes considerable knowledge to use effectively.

Getting started with Kafka

Cloud Storage (Buckets)

Cloud Storage is an object storage service. It is simple to use and flexible, and it is a good choice for storing files and other data that is not relational in nature.

Getting started with Cloud Storage

Cloud SQL

Cloud SQL is a PostgreSQL relational database service that is provided by Google Cloud Platform. It is a good choice for storing data that is relational in nature.

Getting started with Cloud SQL

BigQuery

BigQuery is a relational database service that is optimized for analytical workloads. It is a good choice for storing relational data that you want to analyze.

Getting started with Google BigQuery

OpenSearch

OpenSearch is a document database that is used for storing and searching data. It is a good choice for storing data that is not relational in nature. OpenSearch offers a drop-in replacement for Elasticsearch.

Getting started with OpenSearch

Redis

Redis is a key-value database that is used for storing and querying data. It is a good choice for storing data that is not relational in nature.

Getting started with Redis

InfluxDB

InfluxDB is a time series database that is used for storing and querying timestamped data. It is a good choice for storing data that is not relational in nature.

Getting started with InfluxDB

IBM MQ

Legacy

IBM MQ is considered legacy technology. Please use Kafka instead.

IBM MQ is a message broker that is used for storing and processing messages. It is a good choice for storing data that is not relational in nature.

Getting started with IBM MQ

MongoDB

Unsupported

MongoDB is not officially supported by the platform. It is provided as a reference. Please consider Cloud SQL and PostgreSQL JSON Types, which are supported since PostgreSQL 10.

MongoDB is a document database that is used for storing and searching data. It is a good choice for storing data that is not relational in nature.

Run your own MongoDB in GCP


Last update: 2023-09-14
Created: 2022-11-08