
Kafka Connect Elasticsearch sink no documents are indexed

I'm trying to set up a test to move data from MySQL to Elasticsearch.

I have a dockerized setup with broker, zookeeper, connect, ksql server and cli, schema registry, and Elasticsearch. I'm using the Confluent Docker images at version 5.1.0, and for Elasticsearch I'm using elasticsearch:6.5.4.

I configured a JDBC source connector to get data from MySQL into Kafka. This part works: I can see my topics being created, and using ksql-cli I can see new messages in the stream as I update rows in MySQL.
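For reference, a minimal JDBC source configuration of the kind described might look like this (the connection URL, credentials, incrementing column, and topic prefix here are illustrative placeholders, not the exact values used):

```json
{
    "name": "jdbc-source",
    "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": "jdbc:mysql://mysql:3306/testdb",
            "connection.user": "user",
            "connection.password": "password",
            "mode": "incrementing",
            "incrementing.column.name": "id",
            "topic.prefix": "test_"
    }
}
```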

I also configured an Elasticsearch sink connector. The connector is created successfully and the index exists in Elasticsearch, but I see no documents in my Elasticsearch index.

This is the ES sink connector configuration:

{
    "name": "es-connector",
    "config": {
            "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
            "value.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter": "io.confluent.connect.avro.AvroConverter",
            "key.converter.schema.registry.url": "http://schema-registry:8081",
            "value.converter.schema.registry.url": "http://schema-registry:8081",
            "connection.url": "http://es:9200",
            "type.name": "_doc",
            "topics": "test_topic",
            "drop.invalid.message": true,
            "behavior.on.null.values": "ignore",
            "behavior.on.malformed.documents": "ignore",
            "schema.ignore": true
    }
}

This is what I see when I query the status of the sink connector: curl -X GET http://connect:8083/connectors/es-connector/status

{
    "name": "es-connector",
    "connector": {
        "state": "RUNNING",
        "worker_id": "connect:8083"
    },
    "tasks": [
        {
            "state": "RUNNING",
            "id": 0,
            "worker_id": "connect:8083"
        }
    ],
    "type": "sink"
}

In Elasticsearch I can see the index: http://es:9200/test_topic/_search

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I keep making updates and inserts in MySQL, and I see the messages in the stream using ksql-cli, but no documents are created in Elasticsearch. I even created a topic manually using kafka-avro-console-producer, published messages to it, and created a second sink connector for that topic, with the same result: I see the index but no documents.

I see no errors in Kafka Connect, so I don't understand why it is not working. Is there something wrong with the connector configuration? Am I missing something?

Edit:

For the Elasticsearch sink configuration I tried with and without these lines:

"drop.invalid.message": true,
"behavior.on.null.values": "ignore",
"behavior.on.malformed.documents": "ignore",
"schema.ignore": true

And the result is the same: no documents.

Edit

I found the error:

Key is used as document id and cannot be null

With

"key.ignore": true

the Elasticsearch sink will use topic+partition+offset as the Elasticsearch document ID. As you found, you will get a new document for every message.
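As a sketch of what that default document ID looks like (this illustrates the topic+partition+offset format, it is not the connector's actual code):

```python
def default_doc_id(topic: str, partition: int, offset: int) -> str:
    # With "key.ignore": true, the sink derives each document's _id
    # from the record's Kafka coordinates, joined by "+".
    return f"{topic}+{partition}+{offset}"

# Two records at different offsets always get distinct IDs,
# which is why every message produces a new document.
print(default_doc_id("test_topic", 0, 42))  # → test_topic+0+42
```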

With

"key.ignore": false

the Elasticsearch sink will use the key of the Kafka message as the Elasticsearch document ID. If you don't have a key in your Kafka message, you will understandably get the error Key is used as document id and cannot be null. You can use various methods for setting the key in a Kafka message, including a Single Message Transform to set the Kafka message key if you're ingesting through Kafka Connect, detailed here.
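For example, you could add a transform chain like the following to the JDBC source connector to copy a column from the value into the message key (the column name id is an assumption; substitute your table's primary key):

```json
"transforms": "createKey,extractId",
"transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.createKey.fields": "id",
"transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractId.field": "id"
```

With the key populated this way and "key.ignore": false on the sink, the column value becomes the Elasticsearch _id, so subsequent updates to the same row update the same document instead of creating new ones.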

