[英]KafkaHeaders.RECEIVED_MESSAGE_KEY vs KafkaHeaders.MESSAGE_KEY header on Spring Cloud
[英]Spring Cloud Stream send/consume message to different partitions with KafkaHeaders.Message_KEY
I am trying to implement a prototype for implementing messaging system using Spring Cloud Stream. I selected Apache Kafka as binder. I created a topic with 2 partitions for scalability. Then I tried to send different messages to different partitions using following rest api method.
我為 2 個分區設置了 2 個不同的消息鍵。
@PostMapping("/publish")
public void publish(@RequestParam String message) {
log.debug("REST request the message : {} to send to Kafka topic ", message);
Message message1 = MessageBuilder.withPayload("Hello from a")
.setHeader(KafkaHeaders.MESSAGE_KEY, "node1")
.build();
Message message2 = MessageBuilder.withPayload("Hello from b")
.setHeader(KafkaHeaders.MESSAGE_KEY, "node1")
.build();
Message message3 = MessageBuilder.withPayload("Hello from c")
.setHeader(KafkaHeaders.MESSAGE_KEY, "node1")
.build();
Message message4 = MessageBuilder.withPayload("Hello from d")
.setHeader(KafkaHeaders.MESSAGE_KEY, "node2")
.build();
Message message5 = MessageBuilder.withPayload("Hello from e")
.setHeader(KafkaHeaders.MESSAGE_KEY, "node2")
.build();
Message message6 = MessageBuilder.withPayload("Hello from f")
.setHeader(KafkaHeaders.MESSAGE_KEY, "node2")
.build();
output.send("simulatePf-out-0", message1);
output.send("simulatePf-out-0", message2);
output.send("simulatePf-out-0", message3);
output.send("simulatePf-out-0", message4);
output.send("simulatePf-out-0", message5);
output.send("simulatePf-out-0", message6);
}
這是我用於生產者應用程序的 application.yml
cloud:
stream:
kafka:
binder:
replicationFactor: 2
auto-create-topics: true
brokers: localhost:9092,localhost:9093,localhost:9094
auto-add-partitions: true
bindings:
simulatePf-out-0:
producer:
configuration:
key.serializer: org.apache.kafka.common.serialization.StringSerializer
value.serializer: org.springframework.kafka.support.serializer.JsonSerializer
bindings:
simulatePf-out-0:
producer:
useNativeEncoding: true
partition-count: 3
destination: pf-topic
content-type: text/plain
group: dsa-back-end
為了測試並行性,我創建了一個從 pf-topic 讀取消息的消費者應用程序。 這是來自消費者應用程序的配置。
cloud:
stream:
kafka:
binder:
replicationFactor: 2
auto-create-topics: true
brokers: localhost:9092, localhost:9093, localhost:9094
min-partition-count: 2
bindings:
simulatePf-in-0:
consumer:
configuration:
key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
value.deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
bindings:
simulatePf-in-0:
destination: pf-topic
content-type: text/plain
group: powerflowservice
consumer:
use-native-decoding: true
. 我在消費者應用程序中創建了一個 function 來消費消息
@Bean
public Consumer<Message> simulatePf() {
return message -> {
log.info("header " + message.getHeaders());
log.info("received " + message.getPayload());
};
}
現在是測試的時候了。 為了測試並行性,我運行了 2 個 spring 引導使用者應用程序實例。 我期待看到一個消費者從一個分區消費消息,其他消費者消費者從另一個分區消費消息。 所以我希望消息a,消息b,消息被消費者一消費。 消息 d、消息 e 和消息 f 是其他消費者的消費者。 因為我設置了不同的消息鍵來分配不同的分區。 但所有消息僅由一個應用程序使用
2022-06-30 20:34:48.895 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : header {deliveryAttempt=1, kafka_timestampType=CREATE_TIME, kafka_receivedMessageKey=node1, kafka_receivedTopic=pf-topic, skip-input-type-conversion=true, kafka_offset=270, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@1eaf51df, source-type=streamBridge, id=a77d12f2-f184-0f2f-6a76-147803dd43f3, kafka_receivedPartitionId=0, kafka_receivedTimestamp=1656610488838, kafka_groupId=powerflowservice, timestamp=1656610488890}
2022-06-30 20:34:48.901 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : received Hello from a
2022-06-30 20:34:48.929 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : header {deliveryAttempt=1, kafka_timestampType=CREATE_TIME, kafka_receivedMessageKey=node1, kafka_receivedTopic=pf-topic, skip-input-type-conversion=true, kafka_offset=271, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@1eaf51df, source-type=streamBridge, id=2e89f9b7-b6e7-482f-3c46-f73b2ad0705c, kafka_receivedPartitionId=0, kafka_receivedTimestamp=1656610488840, kafka_groupId=powerflowservice, timestamp=1656610488929}
2022-06-30 20:34:48.932 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : received Hello from b
2022-06-30 20:34:48.933 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : header {deliveryAttempt=1, kafka_timestampType=CREATE_TIME, kafka_receivedMessageKey=node1, kafka_receivedTopic=pf-topic, skip-input-type-conversion=true, kafka_offset=272, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@1eaf51df, source-type=streamBridge, id=15640532-b57f-b58e-62e7-c2bc9375fdf0, kafka_receivedPartitionId=0, kafka_receivedTimestamp=1656610488841, kafka_groupId=powerflowservice, timestamp=1656610488933}
2022-06-30 20:34:48.934 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : received Hello from c
2022-06-30 20:34:48.935 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : header {deliveryAttempt=1, kafka_timestampType=CREATE_TIME, kafka_receivedMessageKey=node2, kafka_receivedTopic=pf-topic, skip-input-type-conversion=true, kafka_offset=273, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@1eaf51df, source-type=streamBridge, id=590f0fb7-042f-e134-d214-ead570e42fe3, kafka_receivedPartitionId=0, kafka_receivedTimestamp=1656610488842, kafka_groupId=powerflowservice, timestamp=1656610488934}
2022-06-30 20:34:48.938 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : received Hello from d
2022-06-30 20:34:48.940 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : header {deliveryAttempt=1, kafka_timestampType=CREATE_TIME, kafka_receivedMessageKey=node2, kafka_receivedTopic=pf-topic, skip-input-type-conversion=true, kafka_offset=274, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@1eaf51df, source-type=streamBridge, id=9a67e68b-95d4-a02e-cc14-ac30c684b639, kafka_receivedPartitionId=0, kafka_receivedTimestamp=1656610488842, kafka_groupId=powerflowservice, timestamp=1656610488940}
2022-06-30 20:34:48.941 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : received Hello from e
2022-06-30 20:34:48.943 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : header {deliveryAttempt=1, kafka_timestampType=CREATE_TIME, kafka_receivedMessageKey=node2, kafka_receivedTopic=pf-topic, skip-input-type-conversion=true, kafka_offset=275, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer@1eaf51df, source-type=streamBridge, id=333269af-bbd5-12b0-09de-8bd7959ebf08, kafka_receivedPartitionId=0, kafka_receivedTimestamp=1656610488843, kafka_groupId=powerflowservice, timestamp=1656610488943}
2022-06-30 20:34:48.943 INFO 11860 --- [container-0-C-1] c.s.powerflow.config.AsyncConfiguration : received Hello from f
你能幫我解決我所缺少的嗎。
您僅在發送時將消息密鑰設置為 header。 您可以在消息上添加KafkaHeaders.PARTITION
header 以強制執行特定分區。
如果您不想通過 header 添加硬編碼分區,則可以在應用程序中設置分區鍵 SpEL 表達式或分區鍵提取器 bean。 這兩種機制都是 Spring Cloud Stream 特定的。 如果您提供其中任何一個,您仍然需要告訴 Spring Cloud Stream 您希望如何 select 分區。 為此,您可以使用分區選擇器 SpEL 表達式或分區選擇器策略。 如果您不提供它們,那么它將使用默認選擇器策略,方法是獲取消息鍵 % 主題分區數的hashCode
。
我想你昨天問了另一個相關的問題,我在回答中鏈接了這個博客。 在該博客的最后幾節中,解釋了所有這些細節。
引用博客:
如果您不提供分區鍵表達式或分區鍵提取器 bean,那么 Spring Cloud Stream 將完全不為您做出任何分區決定。 在這種情況下,如果主題有多個分區,則會觸發 Kafka 的默認分區機制。 默認情況下,Kafka 使用 DefaultPartitioner,如果消息有一個鍵(見上文),則使用該鍵的 hash 來計算分區。
我認為您在應用程序中看到了 Kafka 的默認行為。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.