
Application fails to start because Kafka Streams store is waiting to be RUNNING

I have a Spring Boot application that uses Kafka Streams (Kafka Docker image: wurstmeister/kafka:2.12-2.1.1, Kafka dependency: org.apache.kafka:kafka-streams:2.4.1). During application startup I check whether the topic my-topic exists and, if it does not, the application creates it. After that the application creates a KTable like this:

streamsBuilder.table("my-topic", Consumed.with(Serdes.String(), Serdes.String()), Materialized.as("my-topic-store"))
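For context, the startup topic check described above might look roughly like this. It is only a sketch using Kafka's AdminClient; the bootstrap address, replication factor, and method name are assumptions rather than the application's actual code:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

static void ensureTopicExists() throws Exception {
    Properties adminProps = new Properties();
    adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
    try (AdminClient admin = AdminClient.create(adminProps)) {
        // Create "my-topic" only if it is not already present on the broker
        if (!admin.listTopics().names().get().contains("my-topic")) {
            admin.createTopics(Collections.singleton(new NewTopic("my-topic", 1, (short) 1)))
                 .all().get();
        }
    }
}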

Further on, I retrieve the store in order to query it:

while (true) {
    try {
        // Retry until the store is queryable on this instance
        return kafkaStreams.store("my-topic-store", QueryableStoreTypes.keyValueStore());
    } catch (InvalidStateStoreException e) {
        log.info("Waiting for store {} to be RUNNING", "my-topic-store");
        Thread.sleep(1000);
    }
}
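As a side note, the same wait can be expressed without a busy loop by registering a state listener before start() and blocking until the instance reports RUNNING. This is only a sketch (the method name and 60-second timeout are illustrative), and the final store() call can still throw InvalidStateStoreException if the store is not hosted on this instance:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

static ReadOnlyKeyValueStore<String, String> waitForStore(KafkaStreams kafkaStreams)
        throws InterruptedException {
    CountDownLatch running = new CountDownLatch(1);
    // Register the listener before start(); it fires on every state transition
    kafkaStreams.setStateListener((newState, oldState) -> {
        if (newState == KafkaStreams.State.RUNNING) {
            running.countDown();
        }
    });
    kafkaStreams.start();
    if (!running.await(60, TimeUnit.SECONDS)) {
        throw new IllegalStateException("Streams instance did not reach RUNNING in time");
    }
    // May still fail if the store is assigned to another instance
    return kafkaStreams.store("my-topic-store", QueryableStoreTypes.keyValueStore());
}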

My application is deployed in k8s. When a new version of the application is ready, k8s starts the new instance and then scales down the old one. The problem is that when the new application starts up, the logs show nothing but repeated lines like: "Waiting for store my-topic-store to be RUNNING".

I tried to dig into the problem. According to the Kafka documentation, one partition is read by only one consumer within a group, while one consumer can read from multiple partitions. If a new consumer joins and all partitions are already assigned, that consumer becomes idle. In our case, when the new application starts up, a new consumer joins the group and becomes idle, because the old application with its consumers is still running and the new consumer therefore cannot read from any partition. I should note that the application is configured with 5 Kafka Streams threads, and there are 23 topics in total, each with 1 partition (I tried changing the partition count from 1 to 5, but it did not help). The redeployment happens with no load at all.
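For reference, the 5-thread setup corresponds to a Streams configuration roughly like the following sketch; the application id and broker address are placeholders:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties streamsProps = new Properties();
streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");            // placeholder id (also the consumer group id)
streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker address
// 5 stream threads per instance; a single partition is still read by exactly one of them
streamsProps.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 5);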

What you describe (in the comment) is expected behavior.

When you start the new app, it joins the consumer group. Because there is only one partition, the new app does not get any work assigned (there is no reason to re-assign the work, since that would just trigger an expensive state migration; note that from a rebalancing point of view, your application has scaled out, and it is unknown that you plan to stop the already existing app).

When you finally stop the old app, work (and state) is reassigned.

Also note that starting a new instance would never stop any existing instance. Instead, as mentioned already, it's considered a scale out of your application.

The recommended way to upgrade an application is to stop the old instance first and start the new instance on the same server, so that it can pick up the old instance's state from disk. This avoids an expensive state migration.
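In a k8s deployment this typically means putting the Streams state directory on storage that survives pod restarts, for example a persistent volume, so the new pod finds the old pod's local state on disk. A minimal sketch, where the mount path is an assumption:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties streamsProps = new Properties();
// Point state.dir at a persistent volume so a restarted instance can reuse
// the local RocksDB state instead of rebuilding it from the changelog topic
streamsProps.put(StreamsConfig.STATE_DIR_CONFIG, "/data/kafka-streams"); // assumed mount path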
