
Send message to topic based on message key using Kafka Streams

I want to be able to send all records in a KStream to a different topic based on the message key. Ex. a stream in Kafka contains name as key and record as value, and I want to fan out these records to different topics based on the key of each record.

Data: (jhon -> {jhonsRecord}), (sean -> {seansRecord}), (mary -> {marysRecord}), (jhon -> {jhonsRecord2}). Expected:

  • topic1: jhon -> (jhon -> {jhonsRecord}), (jhon -> {jhonsRecord2})
  • topic2: sean -> (sean -> {seansRecord})
  • topic3: mary -> (mary -> {marysRecord})

Below is the way I am doing this right now, but since the list of names is huge this is slow. Also, even if there are records for only a few names, I still need to traverse the entire list. Please suggest a fix.

    for (String name : names)
    {
        // keep only the records whose key matches this name and send them to the topic of the same name
        recordsByName.filter((k, v) -> k.equalsIgnoreCase(name)).to(name);
    }

I think you should use the KStream::to(final TopicNameExtractor<K, V> topicExtractor) overload. It lets you compute the topic name for each message.

Sample code:

final KStream<String, String> stream = ???;
stream.to((key, value, recordContext) -> key);
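
For reference, here is a fuller, untested sketch of the same idea, assuming an input topic named "records", String keys and values, and a local broker. Note that the target topics typically have to exist already, since Kafka Streams does not create dynamically routed output topics:

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class KeyRouter {
    public static void main(String[] args) {
        final StreamsBuilder builder = new StreamsBuilder();

        // "records" is an assumed input topic name
        final KStream<String, String> stream =
            builder.stream("records", Consumed.with(Serdes.String(), Serdes.String()));

        // TopicNameExtractor lambda: route each record to the topic named after its key
        stream.to((key, value, recordContext) -> key,
                  Produced.with(Serdes.String(), Serdes.String()));

        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "key-router");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        new KafkaStreams(builder.build(), props).start();
    }
}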

I think what you're looking for is KStream#branch.

The following is untested, but it shows the general idea:

import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Predicate;

// get a list of predicates to branch the stream on
final List<String> names = Arrays.asList("jhon", "sean", "mary");
final Predicate[] predicates = names.stream()
    .map((Function<String, Predicate<String, Object>>) n -> (s, o) -> s.equals(n))
    .toArray(Predicate[]::new);

// example input
final KStream<String, Object> stream = new StreamsBuilder().stream("names");

// split the stream into one branch per name
final KStream<String, Object>[] branches = stream.branch(predicates);
for (int i = 0; i < names.size(); i++) {
    // write each branch to the topic named after the key it matches
    branches[i].to(names.get(i));
}

// KStream branches[0] contains all records whose keys are "jhon"
// KStream branches[1] contains all records whose keys are "sean"
...
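
On newer Kafka Streams versions (2.8+), KStream#branch is deprecated in favour of KStream#split. Here is an untested sketch of the same routing with that API, again assuming String keys and values and the "names" input topic:

import java.util.Arrays;
import java.util.List;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Branched;
import org.apache.kafka.streams.kstream.BranchedKStream;
import org.apache.kafka.streams.kstream.KStream;

final StreamsBuilder builder = new StreamsBuilder();
final KStream<String, String> stream = builder.stream("names");

final List<String> names = Arrays.asList("jhon", "sean", "mary");

BranchedKStream<String, String> branched = stream.split();
for (String name : names) {
    // each matching branch is written straight to the topic named after the key
    branched = branched.branch(
        (key, value) -> key.equals(name),
        Branched.withConsumer(ks -> ks.to(name)));
}
// records that match none of the names are dropped
branched.noDefaultBranch();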

If you need to generate aggregate data for each user, you don't need to write to a separate topic per user. You'd be better off writing an aggregate on the source stream. That way you won't end up with one topic per key, but you can still run operations on each user independently.

Serde<UserRecord> recordSerde = ...
KStream<String, UserAggregate> aggregateByName = recordsByName
   .groupByKey(Grouped.with(Serdes.String(), recordSerde))
   .aggregate(...)
   .toStream();
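
A slightly more concrete, untested sketch of what the aggregation could look like. UserRecord and UserAggregate are the value types from above; the aggregate.add(record) method, the no-arg UserAggregate constructor, and the serdes are assumptions that depend on your data model:

Serde<UserAggregate> aggregateSerde = ...

KTable<String, UserAggregate> aggregatesByName = recordsByName
    .groupByKey(Grouped.with(Serdes.String(), recordSerde))
    .aggregate(
        UserAggregate::new,                                 // initializer: empty aggregate per user (assumed constructor)
        (name, record, aggregate) -> aggregate.add(record), // adder: fold each record into the aggregate (assumed method)
        Materialized.with(Serdes.String(), aggregateSerde));

// further per-user processing, still keyed by name, without any per-user topics
aggregatesByName.toStream().foreach((name, aggregate) -> {
    // e.g. send alerts, update a dashboard, or write to one downstream topic
});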

See https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#aggregating for details.

This approach will scale to millions of users, something you won't be able to achieve with a one-topic-per-user approach.
