
Global state store doesn't create a changelog topic: what is the workaround if the input topic to the global store has null keys?

I have read a lot about global state stores: they do not create a changelog topic for restoration and instead use the source topic to restore.

I am creating a custom key and storing the data in a global state store, but after a restart the data is gone, because on restore the global store reads directly from the source topic and bypasses the processor.

My input topic has the following data:

{
      "id": "user-12345",
      "user_client": [
        "clientid-1",
        "clientid-2"
      ]
} 

I am maintaining two state stores, as follows:

  1. id -> record (the record is the JSON above)
  2. clientid -> list of user ids, e.g. clientid-1: ["user-12345"] and clientid-2: ["user-12345"]

The workaround I have seen is to create a custom changelog topic, send the data to it with a key, and use that topic as the source topic for the global state store.

But in my scenario I have to write two kinds of records to the state store. What is the best way to do that?

Example scenario:

Record1: {
          "id": "user-1",
          "user_client": [
            "clientid-1",
            "clientid-2"
          ]
    } 



 Record2:{
          "id": "user-2",
          "user_client": [
            "clientid-1",
            "clientid-3"
          ]
    } 

The global state store should contain:

id -> JSON record

clientid-1: ["user-1", "user-2"]
clientid-2: ["user-1"]
clientid-3: ["user-2"]

How can I handle the restore case for this scenario in a global state store?
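
The restore problem described in the question can be modeled in plain Java. This is only a simplified sketch, not Kafka Streams code: `TopicRecord`, `deriveKey`, and the `userId|payload` value format are hypothetical, and the model assumes that a record without a key cannot be put into the store during restore. It shows why the derived entries disappear when the store is rebuilt from the un-keyed source topic, and why restoring from a keyed copy of the topic gives the same result as normal processing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalStoreRestoreModel {

    /** One record on a topic; the key may be null, as on the question's input topic. */
    public static final class TopicRecord {
        final String key;
        final String value;

        public TopicRecord(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical key extraction: the value has the form "userId|payload". */
    static String deriveKey(String value) {
        return value.substring(0, value.indexOf('|'));
    }

    /** Normal processing: the custom processor derives a key and writes to the store. */
    public static Map<String, String> process(List<TopicRecord> topic) {
        Map<String, String> store = new HashMap<>();
        for (TopicRecord r : topic) {
            store.put(deriveKey(r.value), r.value);
        }
        return store;
    }

    /** Restore: records are copied as they are; the processor is not involved. */
    public static Map<String, String> restore(List<TopicRecord> topic) {
        Map<String, String> store = new HashMap<>();
        for (TopicRecord r : topic) {
            if (r.key != null) { // model assumption: a record without a key cannot be stored
                store.put(r.key, r.value);
            }
        }
        return store;
    }

    /** The workaround: write keyed copies to a separate topic and restore from that one. */
    public static List<TopicRecord> rekey(List<TopicRecord> topic) {
        List<TopicRecord> keyed = new ArrayList<>();
        for (TopicRecord r : topic) {
            keyed.add(new TopicRecord(deriveKey(r.value), r.value));
        }
        return keyed;
    }

    /** Two un-keyed records shaped like the example scenario. */
    public static List<TopicRecord> exampleTopic() {
        List<TopicRecord> topic = new ArrayList<>();
        topic.add(new TopicRecord(null, "user-1|clientid-1,clientid-2"));
        topic.add(new TopicRecord(null, "user-2|clientid-1,clientid-3"));
        return topic;
    }

    public static void main(String[] args) {
        List<TopicRecord> unkeyed = exampleTopic();
        System.out.println("entries after processing:            " + process(unkeyed).size());
        System.out.println("entries after restore (un-keyed):    " + restore(unkeyed).size());
        System.out.println("entries after restore (keyed topic): " + restore(rekey(unkeyed)).size());
    }
}
```

In this model the keyed topic plays the role of the custom changelog topic: whatever the store should contain after a restart has to be present there, already keyed.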

One approach is to maintain your own changelog topic (with cleanup.policy=compact) for the GlobalKTable; let's call it user_client_global_ktable_changelog. For simplicity, assume the messages are deserialized into Java classes (you could just as well use a HashMap, JsonNode, or similar):

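//initial message format (value of the un-keyed input topic)
public class UserClients {
    String id;
    Set<String> userClient;
    public String getId() { return id; }
    public Set<String> getUserClient() { return userClient; }
}
//message when the key is a client id
public class ClientUsers {
    String clientId;
    //initialized so that the aggregator below can add to it (needs java.util.HashSet)
    Set<String> userIds = new HashSet<>();
    public Set<String> getUserIds() { return userIds; }
}
//your initial topic (records have null keys)
KStream<String, UserClients> userClientKStream = streamsBuilder.stream("un_keyed_topic");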
  1. Re-keying the record by user_id is easy: re-key the KStream, then send it to the output topic:
//re-map the initial message to user_id:{initial_message_payload}
userClientKStream
        .map((defaultNullKey, userClients) -> KeyValue.pair(userClients.getId(), userClients))
        .to("user_client_global_ktable_changelog");//please provide appropriate serdes
  2. To aggregate the user_ids for a particular client, we can use a local state store (a KTable) that holds the current list of user_ids for each client_id:
userClientKStream
        //will cause data re-partition before running groupByKey (will create an internal -repartition topic)
        .flatMap((defaultNullKey, userClients)
                -> userClients.getUserClient().stream().map(clientId -> KeyValue.pair(clientId, userClients.getId())).collect(Collectors.toList()))
        //we have to maintain a current aggregated store for user_ids for a particular client_id
        .groupByKey()
        .aggregate(ClientUsers::new, (clientId, userId, clientUsers) -> {
            clientUsers.getUserIds().add(userId);
            return clientUsers;
        }, Materialized.as("client_with_aggregated_user_ids"))
        .toStream()
        .to("user_client_global_ktable_changelog");//please provide appropriate serdes

For example, aggregating user_ids in local state:

//re-keyed client-based message
clientid-1:user-1
//current aggregate for `clientid-1`
"clientid-1"
{
    "user_id": ["user-1"]
}

//re-keyed client-based message
clientid-1:user-2
//current aggregate for `clientid-1`
"clientid-1"
{
    "user_id": ["user-1", "user-2"]
}
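
The accumulation above can be checked without a Kafka cluster. The following is a minimal plain-Java sketch of the same flatMap-then-aggregate logic, applied to the two records from the example scenario. The class and method names are made up for illustration, and a `TreeMap` stands in for the local state store; it is not the Kafka Streams topology itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ClientUserAggregation {

    /** Hypothetical stand-in for the UserClients message. */
    public static final class UserClients {
        final String id;
        final List<String> userClient;

        public UserClients(String id, List<String> userClient) {
            this.id = id;
            this.userClient = userClient;
        }
    }

    /**
     * Emits one (clientId, userId) pair per client of each record (the flatMap step)
     * and collects the user ids per client id (the groupByKey + aggregate step).
     */
    public static Map<String, Set<String>> aggregate(List<UserClients> records) {
        Map<String, Set<String>> clientToUsers = new TreeMap<>();
        for (UserClients record : records) {
            for (String clientId : record.userClient) {
                clientToUsers
                        .computeIfAbsent(clientId, k -> new TreeSet<>())
                        .add(record.id);
            }
        }
        return clientToUsers;
    }

    /** Record1 and Record2 from the example scenario. */
    public static List<UserClients> exampleRecords() {
        List<UserClients> records = new ArrayList<>();
        records.add(new UserClients("user-1", Arrays.asList("clientid-1", "clientid-2")));
        records.add(new UserClients("user-2", Arrays.asList("clientid-1", "clientid-3")));
        return records;
    }

    public static void main(String[] args) {
        // prints {clientid-1=[user-1, user-2], clientid-2=[user-1], clientid-3=[user-2]}
        System.out.println(aggregate(exampleRecords()));
    }
}
```

Note that, like the aggregator in the answer, this sketch only ever adds user ids. If a later record for the same user drops a client, the old entry stays in the aggregate unless removal is handled separately.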

Alternatively, with some changes we could use the changelog topic of the local state store, your_application-client_with_aggregated_user_ids-changelog, directly as the source topic for the GlobalKTable: adjust the state so that it keeps the payloads of both the user-keyed and the client-keyed messages.

