Global state store doesn't create a changelog topic: what is the workaround if the input topic to the global store has null keys?
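I have read a lot about global state stores: they do not create a changelog topic for restoration and instead restore from the source topic.
I am creating a custom key and storing the data in a global state store, but after a restart the data is gone, because on restore the global store reads directly from the source topic and bypasses the processor.
My input topic contains data like the following.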
{
"id": "user-12345",
"user_client": [
"clientid-1",
"clientid-2"
]
}
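The restart problem described above can be illustrated with a toy model in plain Java. This is only a sketch: it uses no Kafka classes, every name in it (`GlobalStoreRestoreModel`, `SourceRecord`, `process`, `restore`) is hypothetical, and the rule that records without a key are skipped on restore is an assumption of the model. It shows that a key derived inside the processor exists after normal processing but not after a restore that copies the source records as they are.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: no Kafka classes are used and all names are hypothetical.
public class GlobalStoreRestoreModel {

    // A record on the source topic: the key is null, the id lives inside the payload.
    static final class SourceRecord {
        final String key;
        final String id;
        final String payload;

        SourceRecord(String key, String id, String payload) {
            this.key = key;
            this.id = id;
            this.payload = payload;
        }
    }

    // Normal processing: the processor derives a custom key (the payload id).
    static Map<String, String> process(List<SourceRecord> topic) {
        Map<String, String> store = new HashMap<>();
        for (SourceRecord r : topic) {
            store.put(r.id, r.payload);
        }
        return store;
    }

    // Restore in this model: records are copied as-is and the processor is not run.
    // Assumption of the sketch: records without a key are skipped.
    static Map<String, String> restore(List<SourceRecord> topic) {
        Map<String, String> store = new HashMap<>();
        for (SourceRecord r : topic) {
            if (r.key != null) {
                store.put(r.key, r.payload);
            }
        }
        return store;
    }

    static List<SourceRecord> sampleTopic() {
        List<SourceRecord> topic = new ArrayList<>();
        topic.add(new SourceRecord(null, "user-12345",
                "{\"id\":\"user-12345\",\"user_client\":[\"clientid-1\",\"clientid-2\"]}"));
        return topic;
    }

    public static void main(String[] args) {
        List<SourceRecord> topic = sampleTopic();
        System.out.println(process(topic).containsKey("user-12345")); // prints true
        System.out.println(restore(topic).containsKey("user-12345")); // prints false
    }
}
```

In this model the derived key survives only if it is already the record key on the topic the store restores from, which is what motivates writing keyed records to a separate topic.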
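I am maintaining two kinds of mappings in the state store (see the example scenario below).
The workaround I have found is to create a custom changelog topic and send the data, with a key, to that topic, which then acts as the source topic for the global state store.
But in my scenario I have to populate two kinds of records in the state store. What is the best way to do that?
Example scenario: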
Record1: {
"id": "user-1",
"user_client": [
"clientid-1",
"clientid-2"
]
}
Record2:{
"id": "user-2",
"user_client": [
"clientid-1",
"clientid-3"
]
}
The global state store should contain:
id -> JSON record
clientid-1: ["user-1", "user-2"]
clientid-2: ["user-1"]
clientid-3: ["user-2"]
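The client-to-users part of this state can be worked out by hand from the two records. The following plain Java sketch does that with an ordinary inverted index; it does not involve Kafka, and the class and method names (`ExpectedGlobalState`, `invert`, `sampleRecords`) are made up for illustration. For Record1 and Record2 as given, clientid-2 ends up mapped to user-1 only.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain Java sketch, no Kafka involved; all names are hypothetical.
public class ExpectedGlobalState {

    // Turns userId -> client ids into clientId -> sorted set of user ids.
    static Map<String, Set<String>> invert(Map<String, List<String>> userClients) {
        Map<String, Set<String>> clientUsers = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : userClients.entrySet()) {
            for (String clientId : entry.getValue()) {
                clientUsers.computeIfAbsent(clientId, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return clientUsers;
    }

    // Record1 and Record2 from the example scenario.
    static Map<String, List<String>> sampleRecords() {
        Map<String, List<String>> records = new LinkedHashMap<>();
        records.put("user-1", Arrays.asList("clientid-1", "clientid-2"));
        records.put("user-2", Arrays.asList("clientid-1", "clientid-3"));
        return records;
    }

    public static void main(String[] args) {
        // prints {clientid-1=[user-1, user-2], clientid-2=[user-1], clientid-3=[user-2]}
        System.out.println(invert(sampleRecords()));
    }
}
```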
How can I maintain this state in the global state store so that it survives a restore?
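One approach is to maintain a changelog topic for the GlobalKTable (with cleanup.policy=compact); let's call it user_client_global_ktable_changelog. For simplicity, assume your messages are deserialized into Java classes (you could just as well use a HashMap, a JsonNode, or something else):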
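import java.util.HashSet;
import java.util.Set;

//initial message format
public class UserClients {
    String id;
    Set<String> userClient = new HashSet<>();
    //getters used by the stream code below
    public String getId() { return id; }
    public Set<String> getUserClient() { return userClient; }
}
//message when the key is a client id
public class ClientUsers {
    String clientId;
    //initialized so the aggregator can add to it without a NullPointerException
    Set<String> userIds = new HashSet<>();
    public String getClientId() { return clientId; }
    public Set<String> getUserIds() { return userIds; }
}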
//your initial topic
KStream<String, UserClients> userClientKStream = streamsBuilder.stream("un_keyed_topic");
//re-map initial message to user_id:{initial_message_payload}
userClientKStream
    .map((defaultNullKey, userClients) -> KeyValue.pair(userClients.getId(), userClients))
    .to("user_client_global_ktable_changelog");//please provide appropriate serdes
userClientKStream
    //re-key each record to client_id:user_id; this causes a repartition before groupByKey (an internal -repartition topic is created)
    .flatMap((defaultNullKey, userClients) -> userClients.getUserClient().stream()
        .map(clientId -> KeyValue.pair(clientId, userClients.getId()))
        .collect(Collectors.toList()))
    //we have to maintain a current aggregated store of user_ids for a particular client_id
    .groupByKey()
    .aggregate(ClientUsers::new, (clientId, userId, clientUsers) -> {
        clientUsers.getUserIds().add(userId);
        return clientUsers;
    }, Materialized.as("client_with_aggregated_user_ids"))
    .toStream()
    .to("user_client_global_ktable_changelog");//please provide appropriate serdes
For example, the user_ids are aggregated in the local state like this:
//re-key message for client-based message
clientid-1:user-1
//your current aggregated for `clientid-1`
"clientid-1"
{
"user_id": ["user-1"]
}
//re-key message for client-based message
clientid-1:user-2
//your current aggregated for `clientid-1`
"clientid-1"
{
"user_id": ["user-1", "user-2"]
}
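Because every update for clientid-1 is written under the same key, a compacted topic only needs to keep the most recent aggregate for that key. The sketch below models that end state in plain Java. It is not Kafka code: `CompactionModel`, `compact` and `sampleLog` are hypothetical names, and real log compaction runs in the background, so older records can remain on the topic for some time before they are cleaned up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the end state of log compaction; no Kafka classes, names are hypothetical.
public class CompactionModel {

    // For each key, only the most recent value in the log is retained.
    static Map<String, String> compact(List<String[]> log) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] keyValue : log) {
            latest.put(keyValue[0], keyValue[1]);
        }
        return latest;
    }

    // The two successive aggregates for clientid-1 from the example above.
    static List<String[]> sampleLog() {
        List<String[]> log = new ArrayList<>();
        log.add(new String[] {"clientid-1", "[user-1]"});
        log.add(new String[] {"clientid-1", "[user-1, user-2]"});
        return log;
    }

    public static void main(String[] args) {
        // prints {clientid-1=[user-1, user-2]}
        System.out.println(compact(sampleLog()));
    }
}
```

A store rebuilt from the compacted records in this model holds the full aggregate for clientid-1, because the latest record already contains every user id seen so far.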
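In fact, with a few changes you can use the changelog topic of the local state directly as the changelog for the GlobalKTable, that is, the topic your_application-client_with_aggregated_user_ids-changelog, by adjusting the state so that it holds the payloads of both the user-keyed and the client-keyed messages.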