
Kafka Connect Elasticsearch sink no documents are indexed

I'm trying to set up a test to move data from MySQL to Elasticsearch.

I have a dockerized setup with broker, zookeeper, connect, ksql server and cli, schema registry, and Elasticsearch. I'm using the Confluent Docker images at version 5.1.0, and for Elasticsearch I'm using elasticsearch:6.5.4.

I configured a JDBC source connector to get data from MySQL into Kafka. This part works: I can see my topics being created, and using ksql-cli I can see new messages in the stream as I update rows in MySQL.
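For reference, a minimal JDBC source configuration of the kind described might look like this (the connection URL, credentials, incrementing column, and topic prefix here are illustrative placeholders, not the exact values used):

```json
{
    "name": "jdbc-source",
    "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": "jdbc:mysql://mysql:3306/testdb",
            "connection.user": "user",
            "connection.password": "password",
            "mode": "incrementing",
            "incrementing.column.name": "id",
            "topic.prefix": "test_"
    }
}
```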

I also configured an Elasticsearch sink connector. The connector is created successfully and the index exists in Elasticsearch, but I see no documents in my Elasticsearch index.

This is the ES sink connector configuration:

{
    "name": "es-connector",
    "config": {
            "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter.schema.registry.url": "http://schema-registry:8081",
            "value.converter.schema.registry.url": "http://schema-registry:8081",
            "connection.url": "http://es:9200",
            "type.name": "_doc",
            "topics": "test_topic",
            "drop.invalid.message": true,
            "behavior.on.null.values": "ignore",
            "behavior.on.malformed.documents": "ignore",
            "schema.ignore": true
    }
}

This is what I see when I query the status of the sink connector: curl -X GET http://connect:8083/connectors/es-connector/status

{
    "name": "es-connector",
    "connector": {
        "state": "RUNNING",
        "worker_id": "connect:8083"
    },
    "tasks": [
        {
            "state": "RUNNING",
            "id": 0,
            "worker_id": "connect:8083"
        }
    ],
    "type": "sink"
}

In Elasticsearch I can see the index: http://es:9200/test_topic/_search

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I keep making updates and inserts in MySQL, and I see the messages in the stream using ksql-cli, but no documents are created in Elasticsearch. I even created a topic manually using kafka-avro-console-producer, published messages to it, and created a second sink connector for that topic, with the same result: I see the index but no documents.

I see no errors in Kafka Connect, so I don't understand why it is not working. Is there something wrong with the connector configuration? Am I missing something?

Edit:

For the Elasticsearch sink configuration I tried with and without these lines:

"drop.invalid.message": true,
"behavior.on.null.values": "ignore",
"behavior.on.malformed.documents": "ignore",
"schema.ignore": true

And the result is the same: no documents.

Edit

I found the error:

Key is used as document id and cannot be null

With

"key.ignore": true

the Elasticsearch sink will use topic+partition+offset as the Elasticsearch document ID. As you found, you will get a new document for every message.
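As a sketch of what that default document ID looks like (this illustrates the topic+partition+offset format, it is not the connector's actual code):

```python
def default_doc_id(topic: str, partition: int, offset: int) -> str:
    # With "key.ignore": true, the sink derives each document's _id
    # from the record's Kafka coordinates, joined by "+".
    return f"{topic}+{partition}+{offset}"

# Two records at different offsets always get distinct IDs,
# which is why every message produces a new document.
print(default_doc_id("test_topic", 0, 42))  # → test_topic+0+42
```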

With

"key.ignore": false

the Elasticsearch sink will use the key of the Kafka message as the Elasticsearch document ID. If you don't have a key in your Kafka message, you will understandably get the error Key is used as document id and cannot be null. You can use various methods for setting the key in a Kafka message, including a Single Message Transform to set the Kafka message key if you're ingesting through Kafka Connect, detailed here.
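For example, you could add a transform chain like the following to the JDBC source connector to copy a column from the value into the message key (the column name id is an assumption; substitute your table's primary key):

```json
"transforms": "createKey,extractId",
"transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields": "id",
"transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractId.field": "id"
```

With the key populated this way and "key.ignore": false on the sink, the column value becomes the Elasticsearch _id, so subsequent updates to the same row update the same document instead of creating new ones.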

