Kafka Streams committing just the latest message of KGroupedTable

I've got a Kafka Streams application as follows:

static KafkaStreams build(AppConfig appConfig, SerdesHelper serdes) {
  final KStreamBuilder builder = new KStreamBuilder();

  builder
      // read the source topic as a changelog (KTable)
      .table(serdes.sourceKeySerde, serdes.sourceValueSerde, appConfig.sourceTopic)
      // re-key each update and group by the new key
      .groupBy(StreamBuilder::groupByMapper, serdes.intSerde, serdes.longSerde)
      // maintain one aggregate per grouped key, with adder and subtractor
      .aggregate(
          StreamBuilder::initialize,
          StreamBuilder::add,
          StreamBuilder::subtract,
          serdes.sinkValueSerde)
      // write every update of the aggregated KTable to the sink topic
      .to(serdes.intSerde, serdes.sinkValueSerde, appConfig.sinkTopic);

  return new KafkaStreams(builder, appConfig.streamConfig);
}

My concrete example groups records as follows:

((k, v)) -> ((k), v[])
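
For context, here is a purely hypothetical sketch of the helper methods referenced above (StreamBuilder::groupByMapper, ::initialize, ::add, ::subtract), assuming Integer keys, Long values, and a per-key List<Long> aggregate; the actual implementation is not shown in the question, so all names and types here are assumptions.

import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.streams.KeyValue;

// Hypothetical reconstruction, only to illustrate the ((k, v)) -> ((k), v[]) grouping.
final class StreamBuilder {

  // Re-key each record under its grouping key (here: the original key is kept).
  static KeyValue<Integer, Long> groupByMapper(Integer key, Long value) {
    return KeyValue.pair(key, value);
  }

  // Start every per-key aggregate with an empty collection.
  static List<Long> initialize() {
    return new ArrayList<>();
  }

  // Adder: append the new value to the per-key collection.
  static List<Long> add(Integer key, Long value, List<Long> aggregate) {
    aggregate.add(value);
    return aggregate;
  }

  // Subtractor: remove the old value when an upstream KTable entry is updated or deleted.
  static List<Long> subtract(Integer key, Long value, List<Long> aggregate) {
    aggregate.remove(value);
    return aggregate;
  }
}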

While running this with dummy data of 3,000,000 messages that contain only two unique keys, I ended up with about 10,000 messages in sinkTopic in less than a minute, whereas I had hoped to get either 4 or 6 (depending on the moment I manage to stop the application).

How can I ensure that only the latest grouped value for each key is committed back to Kafka, instead of every intermediate message?

It's stream processing, not batch processing. There is no "latest grouped value" -- the input is infinite, and thus the output is infinite, too...

You can only reduce the number of intermediate results by

  1. increasing the KTable cache size (though this does not seem to be the issue in your case: with only 2 unique keys, both fit into the cache unless you disabled caching), or
  2. increasing the commit interval (a configuration sketch for both settings follows below).
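
Both knobs are regular StreamsConfig properties. Below is a minimal sketch of the two settings, assuming they end up in the Properties object that backs appConfig.streamConfig from the question; the helper name tunedStreamConfig is made up for this sketch and the concrete values are illustrative only, not recommendations.

import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

static Properties tunedStreamConfig() {
  final Properties props = new Properties();

  // Larger record cache: more aggregation updates per key are deduplicated
  // in memory before being forwarded downstream (default is 10 MB).
  props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 50L * 1024 * 1024);

  // Longer commit interval: the cache is flushed, and intermediate updates
  // emitted, less frequently (default is 30 seconds).
  props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 60_000);

  return props;
}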
