
Kafka Streams: Store is not ready

We recently upgraded Kafka to v1.1 and Confluent to v4.0. Since the upgrade we have encountered a persistent problem with state stores. Our application starts a collection of streams, and we check that the state stores are ready, killing the application after 100 failed tries. Since the upgrade, at least one stream always reports: Store is not ready : the state store, <your stream>, may have migrated to another instance. The stream itself is in the RUNNING state and messages flow through it, but the store still shows up as not ready, so I have no idea what may be happening.

  • Should I not check for store state?
  • And since our application has a lot of streams (~15), would starting them simultaneously cause problems?
  • Should we not do a hard restart? We currently run it as a service on Linux.
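For reference, the bounded readiness check described above can be sketched as a generic retry helper. The class and method names here are hypothetical; in the real application the supplier would wrap the Kafka call streams.store(storeName, QueryableStoreTypes.keyValueStore()), which throws InvalidStateStoreException while a store is migrating or rebalancing:

```java
import java.util.function.Supplier;

public final class StoreReadiness {

    // Retry an action that throws while a resource (e.g. a Streams state
    // store mid-rebalance) is not ready yet. Returns the first successful
    // result, or throws after maxTries failed attempts.
    public static <T> T withRetries(Supplier<T> action, int maxTries, long sleepMs) {
        RuntimeException last = null;
        for (int i = 0; i < maxTries; i++) {
            try {
                return action.get();
            } catch (RuntimeException e) { // e.g. InvalidStateStoreException
                last = e;
                try {
                    Thread.sleep(sleepMs); // back off before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        throw new IllegalStateException("Not ready after " + maxTries + " tries", last);
    }
}
```

In the application this would be called roughly as withRetries(() -> streams.store(name, QueryableStoreTypes.keyValueStore()), 100, 100), mirroring the 100-try limit described above.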

We are running Kafka in a cluster with 3 brokers. Below is a sample stream (not the entire code):

public BaseStream createStreamInstance() {
    final Serializer<JsonNode> jsonSerializer = new JsonSerializer();
    final Deserializer<JsonNode> jsonDeserializer = new JsonDeserializer();
    final Serde<JsonNode> jsonSerde = Serdes.serdeFrom(jsonSerializer, jsonDeserializer);

    MessagePayLoadParser<Note> noteParser = new MessagePayLoadParser<Note>(Note.class);
    GenericJsonSerde<Note> noteSerde = new GenericJsonSerde<Note>(Note.class);

    StreamsBuilder builder = new StreamsBuilder();

    //The reducer below uses sets to combine values.
    //value1 is what is already present in the store.
    //value2 is the incoming message; for notes it should have at most 1 item in its list (since there is 1 attachment / 1 tag per row, but multiple rows per note)
    Reducer<Note> reducer = new Reducer<Note>() {
        @Override
        public Note apply(Note value1, Note value2) {
            value1.merge(value2);
            return value1;
        }
    };

    KTable<Long, Note> noteTable = builder
            .stream(this.subTopic, Consumed.with(jsonSerde, jsonSerde))
            .map(noteParser::parse)
            .groupByKey(Serialized.with(Serdes.Long(), noteSerde))
            .reduce(reducer);

    noteTable.toStream().to(this.pubTopic, Produced.with(Serdes.Long(), noteSerde));

    this.stream = new KafkaStreams(builder.build(), this.properties);
    return this;
}

There are some open questions here, like the ones Matthias raised in the comments, but I will try to answer your actual questions:

  • Should I not check for store state? Rebalancing is the usual cause here. But in that case you should not see the original thread keep consuming that partition; its processing should be "transferred" to another thread that took over. Check whether it is really the original thread that keeps processing the partition, and not a new one. Use the kafka-consumer-groups utility to follow the consumers (threads).
  • And since our application has a lot of streams (~15), would starting them simultaneously cause problems? No, rebalancing is automatic.
  • Should we not do a hard restart -- currently we run it as a service on Linux? Are you keeping your state stores in a specific, non-default directory? You should configure the state store directory properly and make sure it is accessible and survives application restarts. I am not sure how you perform the hard restart, but your exception-handling code should cover it by closing the streams application cleanly.
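On the restart point, two things help: pinning state.dir to a durable path (the default, /tmp/kafka-streams, may be wiped on reboot) and closing the application cleanly on shutdown. A minimal sketch, where the application id, broker addresses, and directory are assumptions rather than values from the question:

```java
import java.util.Properties;

public class StreamsConfigSketch {
    public static Properties buildProps() {
        Properties props = new Properties();
        // "application.id" doubles as the consumer group id of the Streams app.
        props.put("application.id", "note-stream-app"); // assumed name
        // The 3-broker cluster (assumed host names).
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        // "state.dir" (StreamsConfig.STATE_DIR_CONFIG): keep local state on a
        // durable, writable path so it survives service restarts.
        props.put("state.dir", "/var/lib/kafka-streams");
        return props;
    }
}
```

With a KafkaStreams instance built from these properties, registering Runtime.getRuntime().addShutdownHook(new Thread(streams::close)) lets a service stop (SIGTERM) flush and close the stores instead of leaving them in a state that forces a full restore on the next start.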
