
Kafka-connect issue

I installed Apache Kafka (Confluent distribution) on CentOS 7 and am trying to run the FileStream Kafka Connect connector in distributed mode, but I was getting the error below:

[2017-08-10 05:26:27,355] INFO Added alias 'ValueToKey' to plugin 'org.apache.kafka.connect.transforms.ValueToKey' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:290)
Exception in thread "main" org.apache.kafka.common.config.ConfigException: Missing required configuration "internal.key.converter" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:463)
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:453)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:62)
at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:75)
at org.apache.kafka.connect.runtime.WorkerConfig.<init>(WorkerConfig.java:197)
at org.apache.kafka.connect.runtime.distributed.DistributedConfig.<init>(DistributedConfig.java:289)
at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:65)

This is now resolved by updating workers.properties as described in http://docs.confluent.io/current/connect/userguide.html#connect-userguide-distributed-config.

Command used:

/home/arun/kafka/confluent-3.3.0/bin/connect-distributed.sh ../../../properties/file-stream-demo-distributed.properties

Filestream properties file (workers.properties):

name=file-stream-demo-distributed
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/tmp/demo-file.txt
bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
config.storage.topic=demo-2-distributed
offset.storage.topic=demo-2-distributed
status.storage.topic=demo-2-distributed
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter.schemas.enable=false
group.id=""

I added the properties below, and the command went through without any errors.

bootstrap.servers=localhost:9092,localhost:9093,localhost:9094
config.storage.topic=demo-2-distributed
offset.storage.topic=demo-2-distributed
status.storage.topic=demo-2-distributed
group.id=""

But now when I run the consumer command, I am unable to see the messages from /tmp/demo-file.txt. Is there a way I can check whether the messages are published to Kafka topics and partitions?

kafka-console-consumer --zookeeper localhost:2181 --topic demo-2-distributed --from-beginning

I believe I am missing something really basic here. Can someone please help?

You need to define unique topics for the Kafka Connect framework to store its config, offsets, and status.

In your workers.properties file, change these parameters to something like the following:

config.storage.topic=demo-2-distributed-config
offset.storage.topic=demo-2-distributed-offset
status.storage.topic=demo-2-distributed-status
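
If these topics do not already exist, you can create them up front. A minimal sketch, assuming the three local brokers from your workers.properties (hence replication factor 3) and partition counts along the lines of the Confluent docs' suggestions; the config topic must have a single partition, and all three should be compacted so Connect's metadata is never expired:

kafka-topics --zookeeper localhost:2181 --create --topic demo-2-distributed-config --partitions 1 --replication-factor 3 --config cleanup.policy=compact
kafka-topics --zookeeper localhost:2181 --create --topic demo-2-distributed-offset --partitions 25 --replication-factor 3 --config cleanup.policy=compact
kafka-topics --zookeeper localhost:2181 --create --topic demo-2-distributed-status --partitions 5 --replication-factor 3 --config cleanup.policy=compact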

These topics are used to store Connect's state and configuration metadata, not the messages for any of the connectors that run on top of Connect. Do not point the console consumer at any of these three topics and expect to see connector messages.

The messages are stored in the topic configured in the connector configuration JSON with the parameter called "topics" (source connectors such as FileStreamSource use "topic" instead).

Example file-sink-config.json file:

{
  "name": "MyFileSink",
  "config": {
      "topics": "mytopic",
      "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
      "tasks.max": 1,
      "key.converter": "org.apache.kafka.connect.storage.StringConverter",
      "value.converter": "org.apache.kafka.connect.storage.StringConverter",
      "file": "/tmp/demo-file.txt"
    }
}
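
To answer the original question about checking whether messages are published: point the console consumer at the data topic itself ("mytopic" in the example above), not at the internal topics. A sketch, assuming one of the brokers from workers.properties:

kafka-console-consumer --bootstrap-server localhost:9092 --topic mytopic --from-beginning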

Once the distributed worker is running, you need to apply the config file to it using curl, like so:

curl -X POST -H "Content-Type: application/json" --data @file-sink-config.json http://localhost:8083/connectors
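
Once submitted, you can verify that the connector and its tasks are running through the same standard Connect REST interface, for example:

curl http://localhost:8083/connectors
curl http://localhost:8083/connectors/MyFileSink/status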

After that, the config will be safely stored in the config topic you created for all distributed workers to use. Make sure the config topic (and the status and offset topics) will not expire messages, or you will lose your connector configuration when it does.
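
If the internal topics were created earlier with the default delete cleanup policy, one way to switch them to compaction is kafka-configs; a sketch for the config topic (repeat for the offset and status topics):

kafka-configs --zookeeper localhost:2181 --alter --entity-type topics --entity-name demo-2-distributed-config --add-config cleanup.policy=compact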
