I am running Kafka on Kubernetes (deployed on Azure) using Strimzi for a development environment, and I would prefer to use internal Kubernetes node storage. If I use persistent-claim or jbod, it creates standard disks on Azure storage. However, I prefer to use internal node storage, as I have 16 GB available there. I do not want to use ephemeral storage because I want the data to be persisted at least on the Kubernetes nodes. The following is my deployment.yml:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-cluster
spec:
  kafka:
    version: 3.1.0
    replicas: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
      - name: external
        type: loadbalancer
        tls: false
        port: 9094
    config:
      offsets.topic.replication.factor: 2
      transaction.state.log.replication.factor: 2
      transaction.state.log.min.isr: 2
      default.replication.factor: 2
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.1"
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
  zookeeper:
    replicas: 2
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
The persistent-claim storage as you use it will provision the storage using the default storage class, which in your case I guess creates standard Azure disks.
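If you want the PersistentVolumeClaims to use a particular storage class instead of the default one, the persistent-claim storage in Strimzi accepts a class field. A minimal sketch (the class name my-storage-class is just a placeholder for whatever class exists in your cluster):

```yaml
storage:
  type: persistent-claim
  size: 2Gi
  deleteClaim: false
  class: my-storage-class   # placeholder: use a storage class defined in your cluster
```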
You have two options for using the local disk space of the worker node:
- The ephemeral type storage. But keep in mind that this is like a temporary directory: it will be lost in every rolling update. Also, if you for example delete all the pods at the same time, you will lose all data. As such, it is recommended only for short-lived clusters in CI, maybe some short development, etc., but for sure not for anything where you need reliability.
- Local persistent volumes, which are persistent volumes bound to a particular worker node. These can (unlike the ephemeral storage) be used with reliability and availability when done right. The local persistent volumes are normally provisioned through a StorageClass as well, so in the Kafka custom resource in Strimzi you will still use the persistent-claim type storage, just with a different storage class.

You should really think about what exactly you want to use and why. From my experience, the local persistent volumes are a great option when you run on bare metal, where networked storage might not be available.
But in public clouds with good support for high-quality networked block storage, such as Amazon EBS volumes and their Azure or Google counterparts, local storage often brings more problems than advantages because of how it binds your Kafka brokers to a particular worker node.
Some more details about the local persistent volumes can be found here: https://kubernetes.io/docs/concepts/storage/volumes/#local ... there are also different provisioners which can help you use them. I'm not sure if Azure supports anything out of the box.
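To make the StorageClass route concrete, here is a hedged sketch of a no-provisioner StorageClass plus one manually created local PersistentVolume, which you could then reference via the class field of Strimzi's persistent-claim storage. The names, path, and node hostname are placeholders, and you would need one such PersistentVolume per broker:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage           # placeholder name
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer  # bind only once the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-local-pv-0        # one PV like this per broker
spec:
  capacity:
    storage: 16Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/kafka-data       # placeholder: a directory on the node's disk
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - my-worker-node  # placeholder: the node that owns this disk
```

With this in place, setting class: local-storage on the Kafka custom resource's persistent-claim storage would make the broker PVCs bind to these local volumes, pinning each broker to the node holding its data.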
Sidenote: 2Gi of space is very small for Kafka. I'm not sure how much you will be able to do before running out of disk space. Even 16Gi would be quite small. If you know what you are doing, then fine; but if not, you should be careful.