
Can't successfully override my connectors in Kafka Connect 2.4

Hi, I'm looking to use the new override policy, released in 2.3, through Java code.

I want to create an example like this:

  • Create a topic with 10 messages

  • Create a consumer that consumes the messages and then sends them to a default FileSink

  • Create an override sink that should not take the data from the consumer (it's configured as earliest)

  • Produce a message that is consumed and picked up by both sinks!

Here is the configuration of my sink (file) connector (the default one):

        // connectorOut is assumed to be a FileStreamSinkConnector created alongside the task
        connectorOut = new FileStreamSinkConnector();
        taskOut = new FileStreamSinkTask();

        Map<String, String> sinkProperties = new HashMap<>();
        sinkProperties.put(FileStreamSinkConnector.TOPICS_CONFIG, new ConstantSettingsBehavior().SINGLE_TOPIC);
        sinkProperties.put(FileStreamSinkConnector.FILE_CONFIG, new ConstantSettingsBehavior().FILE_OUT_LATEST);
        sinkProperties.put(OFFSET_COMMIT_INTERVAL_MS_CONFIG, String.valueOf(5_000));
        // per-connector consumer override: this sink should start from the latest offset
        sinkProperties.put(ConnectorConfig.CONNECTOR_CLIENT_CONSUMER_OVERRIDES_PREFIX + ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");

        connectorOut.start(sinkProperties);
        taskOut.initialize(createMock(SinkTaskContext.class));
        taskOut.start(connectorOut.taskConfigs(1).get(0));

And here is the earliest one (only what changes):

        sinkProperties.put(FileStreamSinkConnector.TOPICS_CONFIG, new ConstantSettingsBehavior().SINGLE_TOPIC);
        sinkProperties.put(FileStreamSinkConnector.FILE_CONFIG, new ConstantSettingsBehavior().FILE_OUT_EARLY);
        sinkProperties.put(OFFSET_COMMIT_INTERVAL_MS_CONFIG, String.valueOf(5_000));
        // per-connector consumer override: this sink should start from the earliest offset
        sinkProperties.put(ConnectorConfig.CONNECTOR_CLIENT_CONSUMER_OVERRIDES_PREFIX + ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

Next I create a consumer that takes the messages from the topic as a List<SinkRecord>.

I give this list to the task of each connector:

        myLatestOne.getTaskOut().put(data);
        myEarlyOne.getTaskOut().put(data);

But it looks like I'm not doing this the right way, because all messages are taken by each connector!

Here is the pull request for the override code that I'm using.

If I missed something, don't hesitate to tell me (this is my first question).

Thanks!

Each connector will create a new consumer group ID. If they both read from the same topics, then they will both get all messages.
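As a quick check (a sketch; it assumes Connect's default consumer group naming of connect-<connector name> and a broker on localhost:9092), you can list the consumer groups that the connectors created:

./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
# each sink connector should appear as its own group, e.g. connect-<connector name>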

Also, consumer and producer overrides have already been possible at the worker level, and I've not seen anyone write their own connector like this, since you could just use connect-standalone.
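For reference, a minimal sketch of such a worker-level default; placed in the worker properties file (e.g. connect-standalone.properties), it applies to every sink connector on that worker, unlike the per-connector consumer.override.* keys used further down:

# worker-level consumer default, inherited by all sink connectors on this worker
consumer.auto.offset.reset=earliest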

So I gave up doing it entirely through Java. I found a way to do it with the terminal that is pretty easy:

Commands to run

First we launch our ZooKeeper server:

bin/zookeeper-server-start.sh config/zookeeper.properties

Next we start our Kafka server:

bin/kafka-server-start.sh config/server.properties

We need to create a topic:

./bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test

Now we need to produce messages:

./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
> [Your message]

And now we can launch our worker with one connector attached. Their properties go in config files.

bootstrap.servers=localhost:9092

key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter

key.converter.schemas.enable=true
value.converter.schemas.enable=true

offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
connector.client.config.override.policy=All 

connector.client.config.override.policy=All allows the connector to override the client configuration.
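For reference, a sketch of the values this worker setting can take (as far as I know, None is the default):

# None      - reject consumer.override.* / producer.override.* keys in connector configs (default)
# All       - allow any client property to be overridden per connector
# Principal - allow only the security principal to be overridden
connector.client.config.override.policy=All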

Here is our connector with the earliest option (if no offset is saved, it starts from the first entry):

name=local-file-earliest-sink 
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
file=/tmp/test.sink.earliest.txt
topics=test
consumer.override.auto.offset.reset=earliest
value.converter=org.apache.kafka.connect.storage.StringConverter
We launch it in standalone mode:

sudo ./bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-sink-early.properties

We stop it a few seconds later (you can look at /tmp/test.sink.earliest.txt).

This time we add a new connector:

name=local-file-latest-sink
connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max=1
file=/tmp/test.sink.latest.txt
topics=test
consumer.override.auto.offset.reset=latest
value.converter=org.apache.kafka.connect.storage.StringConverter

We can launch both of them:

sudo ./bin/connect-standalone.sh config/connect-standalone.properties  config/connect-file-sink-early.properties config/connect-file-sink-latest.properties 

We can add new messages and check that /tmp/test.sink.latest.txt is filled only with those messages.
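For example (a sketch reusing the console producer and the two output files from above):

echo "only-in-latest" | ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
# the earliest sink keeps every message, the latest sink only those produced after it started
cat /tmp/test.sink.earliest.txt
cat /tmp/test.sink.latest.txt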

Explanation

The main idea here is to keep a default configuration while reconfiguring each connector in a different way. To do that we use the newly added override policy (connector.client.config.override.policy).
