Managing topics and access¶
This feature applies only to Aiven hosted Kafka. On-premises Kafka is deprecated, and creating new topics on-premises was disabled in the summer of 2021. For on-premises Kafka, see the on-premises Kafka documentation.
Creating topics and defining access¶
Creating or modifying a Topic Kubernetes resource triggers topic creation and ACL management with Aiven (the hosted Kafka provider). The topic name is prefixed with your team namespace, so in the example below the fully qualified topic name is myteam.mytopic. This name is set in the .status.fullyQualifiedName field on your Topic resource once the Topic is synchronized to Aiven.
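As an illustration of the naming rule, a synchronized Topic named mytopic in the myteam namespace would be expected to carry a fragment like the one below. This is only a sketch: fullyQualifiedName is the one status field named on this page, and any other status fields are left out rather than guessed.

```yaml
# Relevant part of a synchronized Topic resource (sketch, not full output).
metadata:
  name: mytopic
  namespace: myteam
status:
  fullyQualifiedName: myteam.mytopic  # <namespace>.<name>
```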
To add access to this topic for your application, see the next section: Accessing topics from an application.
Topic resources can only be specified in GCP clusters. However, applications might access topics from any cluster, including on-premises. For details, read the next section.
Currently, use the nav-dev pool for development, and nav-prod for production.
If you need cross-environment communication, use the nav-infrastructure pool, but please consult the NAIS team before you do.
|Pool||Min. replication||Max. replication||Topic declared in||Available from|
On your topic, you must define ACLs to manage access to the topic.
Access is granted to applications, belonging to teams.
Every ACL must contain team, application and which access to grant.
Possible access is read, write, or readwrite.
It is possible to use simple wildcards (*) in both team and application names. A wildcard matches any character any number of times.
Be aware that, because of the way ACLs are generated and because of length limits, the end of a long name can be cut off, which removes any wildcard at the end.
---
apiVersion: kafka.nais.io/v1
kind: Topic
metadata:
  name: mytopic
  namespace: myteam
  labels:
    team: myteam
spec:
  pool: nav-dev
  acl:
    - team: myteam
      application: ownerapp
      access: readwrite   # read, write, readwrite
    - team: bigteam
      application: consumerapp1
      access: read
    - team: bigteam
      application: consumerapp2
      access: read
    - team: bigteam
      application: producerapp1
      access: write
    - team: producerteam
      application: producerapp
      access: write
    - team: trusted-team
      application: "*"    # quoted, since a bare * is not valid YAML
      access: read        # All applications from trusted-team have read access
    - team: "*"
      application: aivia
      access: read        # Applications named aivia from any team have read access
    - team: myteam
      application: rapid-*
      access: readwrite   # Applications from myteam with names starting with `rapid-` have readwrite access
Topics may be configured beyond the default settings for various use cases. Only a subset of the possible topic-level configurations is available.
---
apiVersion: kafka.nais.io/v1
kind: Topic
metadata:
  name: mytopic
  namespace: myteam
  labels:
    team: myteam
spec:
  pool: nav-dev
  config:  # optional; all fields are optional too; defaults shown
    cleanupPolicy: delete     # delete, compact, compact,delete
    maxMessageBytes: 1048588  # 1 MiB
    minimumInSyncReplicas: 2
    partitions: 1
    replication: 3            # see min/max requirements
    retentionBytes: -1        # -1 means unlimited
    retentionHours: 168       # -1 means unlimited
    segmentHours: 168         # 1 week
  acl:
    - team: myteam
      application: ownerapp
      access: readwrite
Maximum message size¶
The maxMessageBytes configuration controls the largest record batch size allowed by Kafka.
It has a default value of 1048588 (1 MiB) and a maximum value of 5242880 (5 MiB).
Generally speaking, Kafka is not designed to handle large messages.
We recommend that you do not increase the maxMessageBytes value above the default unless absolutely necessary.
To keep your Kafka messages below the size limit, consider strategies such as:
- using an efficient serialization format such as Avro or Protobuf
- using compression: set the compression.type configuration for your producer(s)
- using patterns such as claim checks, or splitting messages into multiple segments
If you do increase the maxMessageBytes value, you will also need to configure all your producers and consumers to accommodate it.
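If a larger limit really is unavoidable, the setting goes under spec.config on the Topic. The following is a minimal sketch that reuses the example topic from this page. The value 2097152 (2 MiB) is an arbitrary illustration that stays within the 5 MiB maximum, not a recommendation.

```yaml
---
apiVersion: kafka.nais.io/v1
kind: Topic
metadata:
  name: mytopic
  namespace: myteam
  labels:
    team: myteam
spec:
  pool: nav-dev
  config:
    maxMessageBytes: 2097152  # 2 MiB; illustrative value, maximum is 5242880
  acl:
    - team: myteam
      application: ownerapp
      access: readwrite
```

Fields left out of config keep their default values.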
Each topic partition is split into segments. The segmentHours configuration controls the period of time after which Kafka will roll (close) the segment, even if it is not full, so that retention can delete or compact old data.
It has a default value of 168 (1 week) and a minimum value of 1 (1 hour).
Setting this value lower can be useful for GDPR purposes, where you need to compact or delete data more often than the default of 1 week allows.
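As a sketch of how this might look in practice, the fragment below shortens both the segment period and the retention period. The specific numbers (24 and 72 hours) are assumptions chosen for illustration; pick values that match your own retention requirements.

```yaml
---
apiVersion: kafka.nais.io/v1
kind: Topic
metadata:
  name: mytopic
  namespace: myteam
  labels:
    team: myteam
spec:
  pool: nav-dev
  config:
    cleanupPolicy: delete
    retentionHours: 72  # illustrative: keep data for about three days
    segmentHours: 24    # illustrative: close segments daily so retention can act on them
  acl:
    - team: myteam
      application: ownerapp
      access: readwrite
```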
Data catalog metadata¶
If your topic exposes data meant for consumption by a wider audience, you should define metadata describing the topic and its contents. This data is automatically scraped and added to the internal data catalog. If the catalog key is set to public, the topic metadata is also published to the external data catalog and the National Data Catalog.
Use the following annotations, prefixing each key with dcat.data.nav.no/. Default values are used where none are supplied.
|Key||Importance||Comment||Example Value||Default value|
|title||mandatory||String||Inntektskjema mottatt fra Altinn||topic name|
|description||mandatory||String||Inntektsmeldingen arbeidsgiveren sender fra eget lønns- og personalsystem eller fra altinn.no|
|theme||recommended||A main category of the resource. A resource can have multiple themes entered as a comma-separated list of strings.||inntekt|
|keyword||recommended||A string or a list of strings||inntekt,arbeidsgiver,altinn|
One or more of the following keys can also be supplied if the default values below are not sufficient:
|Key||Importance||Comment||Example Value||Default value|
|temporal||optional||An interval of time covered by the topic, start and end date. Formatted as two ISO 8601 dates (or datetimes) separated by a slash.||2020/2020 or 2020-06/2020-06||current year/current year|
|language||optional||Two or three letter code.||NO||NO|
|creator||optional||The entity responsible for producing the topic. An agent (e.g. person, group, software or physical artifact).||NAV||team name|
|publisher||optional||The entity responsible for making the topic available. An agent (e.g. person, group, software or physical artifact).||NAV||NAV|
|accessRights||optional||Information about who can access the topic or an indication of its security status.||internal||internal|
|license||optional||Either a license URI or a title.||MIT|
|rights||optional||A statement that concerns all rights not addressed with license or accessRights.||Copyright 2020, NAV||Copyright year, NAV|
|catalog||optional||The catalog(s) where the metadata will be published. The value can be either
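Put together, the annotations might look like the sketch below. It assumes the annotations are placed under metadata.annotations on the Topic resource, as is usual for Kubernetes annotations, and it reuses the example values from the tables above. The keys are the table keys with the dcat.data.nav.no/ prefix.

```yaml
---
apiVersion: kafka.nais.io/v1
kind: Topic
metadata:
  name: mytopic
  namespace: myteam
  labels:
    team: myteam
  annotations:
    # Mandatory keys
    dcat.data.nav.no/title: "Inntektskjema mottatt fra Altinn"
    dcat.data.nav.no/description: "Inntektsmeldingen arbeidsgiveren sender fra eget lønns- og personalsystem eller fra altinn.no"
    # Recommended keys
    dcat.data.nav.no/theme: "inntekt"
    dcat.data.nav.no/keyword: "inntekt,arbeidsgiver,altinn"
    # Optional: also publish to the external catalogs
    dcat.data.nav.no/catalog: "public"
spec:
  pool: nav-dev
  acl:
    - team: myteam
      application: ownerapp
      access: readwrite
```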
Permanently deleting topic and data¶
Permanent deletes are irreversible. Enable this feature only as a step to completely remove your data.
When a Topic resource is deleted from a Kubernetes cluster, the Kafka topic is retained and its data is kept intact. If you need to remove the data and start from scratch, you must add the following annotation to your Topic resource.
When this annotation is in place, deleting the topic resource from Kubernetes will also delete the Kafka topic and all of its data.
Accessing topics from an application¶
Adding .kafka.pool to your Application spec will inject Kafka credentials into your pod. Your application needs to follow some design guidelines; see the next section on application design guidelines. Make sure that the topic name matches the
fullyQualifiedName found in the Topic resource, e.g.