
How to join topics with Kafka Streams

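I am trying to process data with Kafka Streams, but I have run into some problems and it does not work. First, I have three connectors, but I can't use my own key. I need keys in order to join the topics, right? How can I join on two or more keys? I am trying to replicate something like this:

select * from (select a.* from users a inner join deps b on a.dep = b.dep and a.group = b.group) a inner join user_afy b on a.id = b.id

I want to save the inner-join result in a topic and then use it for the outer join. Here is an example of what I have.

Connector properties: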

....
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
mode=timestamp
query=select id, user, dep,tal, group,time from users
numeric.mapping=best_fit
table.types=TABLE
topic=users  
# I tried this with one or more fields, but it did not work
transforms=createKey, extractInt  
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey  
transforms.createKey.fields=dep, group  
transforms.extractInt.type=org.apache.kafka.connect.transforms.ExtractField$Key  
transforms.extractInt.field=dep, group  
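
One possible connector-side setup for a two-field key is sketched below. It assumes the record value carries both `dep` and `group`, as in the query above. `ValueToKey` accepts a comma-separated list of fields and builds a structured key from them, whereas `ExtractField$Key` pulls out a single field, so passing it `dep, group` is unlikely to work. The sketch therefore drops the `extractInt` step and keeps the structured key.

```properties
# Sketch only: build a structured key from two value fields
transforms=createKey
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.createKey.fields=dep,group
```

With `JsonConverter` and `key.converter.schemas.enable=false`, the key should then be written as a JSON object along the lines of `{"dep":"ofi","group":"ingen"}`, so a Streams application reading the topic would need a matching key serde rather than `Serdes.String()` semantics based on a single field.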

Standalone properties:

bootstrap.servers=localhost:9092  
key.converter=org.apache.kafka.connect.json.JsonConverter  
value.converter=org.apache.kafka.connect.json.JsonConverter  
key.converter.schemas.enable=false  
value.converter.schemas.enable=false  
offset.storage.file.filename=D:/tmp/connect.offsets  
plugin.path=D:/connector/lib  

Topics:

Topic users
{"id":"0001", "user":"Alex", "dep":"ofi", "postal":170, "group":"ingen", "time":"xxx"}
{"id":"0002", "user":"Emy", "dep":"lab", "postal":170, "group":"itn", "time":"xxx"}
{"id":"0003", "user":"Lea", "dep":"lab", "postal":170, "group":"itn", "time":"xxx"}
{"id":"0004", "user":"Silva", "dep":"cent", "postal":170, "group":"ingen", "time":"xxx"}
{"id":"0005", "user":"Foxy", "dep":"cent", "postal":170, "group":"ete", "time":"xxx"}

Topic user_afy
{"id":"0001", "name":"bask"}
{"id":"0001", "name":"Silf"}
{"id":"0002", "name":"BTT"}
{"id":"0005", "name":"butf"}

Topic deps
{"id_dep":"1", "dep":"ofi", "sind":"worker", "group":"ingen."}
{"id_dep":"2", "dep":"lab", "sind":"worker", "group":"iti."}
{"id_dep":"3", "dep":"cent", "sind":"worker", "group":"etc."}

My code is based on an example from the official site, but I have not been able to test it:

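import java.util.Properties;
import com.fasterxml.jackson.databind.JsonNode;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.connect.json.JsonDeserializer;
import org.apache.kafka.connect.json.JsonSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Joined;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public static void main(String[] args) {
    Properties props = new Properties();
    props.put(......);

    // JSON serde built from the Connect JSON serializer and deserializer
    final Serializer<JsonNode> jsonSerializer = new JsonSerializer();
    final Deserializer<JsonNode> jsonDeserializer = new JsonDeserializer();
    final Serde<JsonNode> jsonSerde = Serdes.serdeFrom(jsonSerializer, jsonDeserializer);
    final Consumed<String, JsonNode> consumed = Consumed.with(Serdes.String(), jsonSerde);

    StreamsBuilder builder = new StreamsBuilder();
    // Topic name matches the connector setting topic=users
    final KStream<String, JsonNode> left = builder.stream("users", consumed);
    KTable<String, JsonNode> right = builder.table("deps", consumed);

    // Stream-table inner join: matches records whose keys are equal
    KStream<String, String> joined = left.join(right,
        (leftValue, rightValue) -> "left=" + leftValue + ", right=" + rightValue,
        Joined.with(Serdes.String(), jsonSerde, jsonSerde)
    );

    // Print every joined record
    joined.foreach((k, v) -> System.out.println("key=" + k + ", val=" + v));
}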

What will the output look like? If I create a new topic for it, is it best to save the values as a hashmap in JSON format? I will create custom Serdes later.

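What do you mean by "I can't use my own key"? In Kafka Streams, you can always set a new key as needed for your processing.

However, if you want to read the data into a KTable, you cannot change the key directly. You need to read the topic as a KStream, set the new key, and then convert the KStream into a KTable (see Kafka Streams API: KStream to KTable).

For multiple consecutive joins, you can "chain" the corresponding operations together.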

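// Re-key topic-1, write it to a new topic, and read that topic back as a table.
builder.stream("topic-1").selectKey(...).to("table-topic-1");
KTable t1 = builder.table("table-topic-1");

// First join: re-key the stream so that its key matches the key of t1.
KStream firstJoinResult = builder.stream(...).selectKey(...).join(t1, ...);

// Prepare the second table in the same way.
builder.stream("topic-2").selectKey(...).to("table-topic-2");
KTable t2 = builder.table("table-topic-2");

// Second join: re-key the first result to match t2, then write the output.
firstJoinResult.selectKey(...).join(t2, ...).to("result-topic");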
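
To make the re-key-then-join idea concrete without a running cluster, here is a minimal plain-Java sketch. It does not use the Kafka Streams API: the lists and maps merely stand in for a stream and for tables, and `ChainedJoinSketch`, `compositeKey`, `rekeyAndJoin`, and the `|` separator are hypothetical names chosen for illustration. The sample records are loosely modeled on the topics in the question, with the group values adjusted so that some of them match. The point it illustrates is that a join on two fields becomes a join on one key once both fields are folded into a single composite key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Plain-Java illustration of "set a new key, then join"; no Kafka dependency.
class ChainedJoinSketch {

    // Folds two join columns into one key, the kind of value a selectKey step could emit.
    static String compositeKey(String dep, String group) {
        return dep + "|" + group;
    }

    // Re-keys each record with keyFn and inner-joins it against the table:
    // records without a matching table entry are dropped.
    static List<Map<String, String>> rekeyAndJoin(
            List<Map<String, String>> stream,
            Function<Map<String, String>, String> keyFn,
            Map<String, Map<String, String>> table) {
        List<Map<String, String>> joined = new ArrayList<>();
        for (Map<String, String> rec : stream) {
            Map<String, String> match = table.get(keyFn.apply(rec));
            if (match != null) {
                Map<String, String> merged = new LinkedHashMap<>(rec);
                merged.putAll(match);
                joined.add(merged);
            }
        }
        return joined;
    }

    // Chains two joins: users with deps on (dep, group), then the result with user_afy on id.
    static List<Map<String, String>> demo() {
        List<Map<String, String>> users = List.of(
            Map.of("id", "0001", "user", "Alex", "dep", "ofi", "group", "ingen"),
            Map.of("id", "0002", "user", "Emy", "dep", "lab", "group", "itn"),
            Map.of("id", "0004", "user", "Silva", "dep", "cent", "group", "ingen"));
        Map<String, Map<String, String>> deps = Map.of(
            compositeKey("ofi", "ingen"), Map.of("id_dep", "1"),
            compositeKey("lab", "itn"), Map.of("id_dep", "2"),
            compositeKey("cent", "etc"), Map.of("id_dep", "3"));
        Map<String, Map<String, String>> userAfy = Map.of(
            "0001", Map.of("name", "Silf"),
            "0005", Map.of("name", "butf"));

        List<Map<String, String>> first = rekeyAndJoin(
            users, r -> compositeKey(r.get("dep"), r.get("group")), deps);
        return rekeyAndJoin(first, r -> r.get("id"), userAfy);
    }

    public static void main(String[] args) {
        for (Map<String, String> r : demo()) {
            System.out.println(r.get("user") + " -> " + r.get("name")); // prints "Alex -> Silf"
        }
    }
}
```

In this sketch only Alex survives both joins: Silva has no matching `dep|group` entry, and Emy has no `user_afy` entry. In a real topology the same two steps would correspond to the `selectKey(...)` and `join(...)` calls shown above, with serdes chosen to match the new key type.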
