Kafka Connect Elasticsearch connector not consuming topics

Question

These are my Kafka Connect worker properties:

##
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##

# This file contains some of the configurations for the Kafka Connect distributed worker. This file is intended
# to be used with the examples, and some settings may differ from those used in a production system, especially
# the `bootstrap.servers` and those specifying replication factors.

# A list of host/port pairs to use for establishing the initial connection to the Kafka cluster.
bootstrap.servers=localhost:9092

# unique name for the cluster, used in forming the Connect cluster group. Note that this must not conflict with consumer group IDs
group.id=connect-cluster

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=false
value.converter.schemas.enable=false

# Topic to use for storing offsets. This topic should have many partitions and be replicated and compacted.
# Kafka Connect will attempt to create the topic automatically when needed, but you can always manually create
# the topic before starting Kafka Connect if a specific topic configuration is needed.
# Most users will want to use the built-in default replication factor of 3 or in some cases even specify a larger value.
# Since this means there must be at least as many brokers as the maximum replication factor used, we'd like to be able
# to run this example on a single-broker cluster and so here we instead set the replication factor to 1.
offset.storage.topic=__connect_offsets
offset.storage.replication.factor=1

#offset.storage.partitions=25

# Topic to use for storing connector and task configurations; note that this should be a single partition, highly replicated,
# and compacted topic. Kafka Connect will attempt to create the topic automatically when needed, but you can always manually create
# the topic before starting Kafka Connect if a specific topic configuration is needed.
# Most users will want to use the built-in default replication factor of 3 or in some cases even specify a larger value.
# Since this means there must be at least as many brokers as the maximum replication factor used, we'd like to be able
# to run this example on a single-broker cluster and so here we instead set the replication factor to 1.
config.storage.topic=__connect_configs
config.storage.replication.factor=1

# Topic to use for storing statuses. This topic can have multiple partitions and should be replicated and compacted.
# Kafka Connect will attempt to create the topic automatically when needed, but you can always manually create
# the topic before starting Kafka Connect if a specific topic configuration is needed.
# Most users will want to use the built-in default replication factor of 3 or in some cases even specify a larger value.
# Since this means there must be at least as many brokers as the maximum replication factor used, we'd like to be able
# to run this example on a single-broker cluster and so here we instead set the replication factor to 1.
status.storage.topic=__connect_status
status.storage.replication.factor=1
#status.storage.partitions=5

# Flush much faster than normal, which is useful for testing/debugging
#offset.flush.interval.ms=10000

# These are provided to inform the user about the presence of the REST host and port configs
# Hostname & Port for the REST API to listen on. If this is set, it will bind to the interface used to listen to requests.
#rest.host.name=
#rest.port=8083

# The Hostname & Port that will be given out to other workers to connect to i.e. URLs that are routable from other servers.
#rest.advertised.host.name=
#rest.advertised.port=

# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/usr/share/java

This is the POST body I use to create the Elasticsearch sink:

{
 "name" : "test-distributed-connector",
 "config" : {
  "connector.class" : "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
  "tasks.max" : "2",
  "topics.regex" : "^test[0-9A-Za-z-_]*(?<!-raw$)$", 
  "connection.url" : "http://elasticsearch:9200",
  "connection.username": "admin",
  "connection.password": "admin",
  "type.name" : "_doc",
  "key.ignore" : "true",
  "schema.ignore" : "true",
  "transforms": "TimestampRouter",
  "transforms.TimestampRouter.type": "org.apache.kafka.connect.transforms.TimestampRouter",
  "transforms.TimestampRouter.topic.format": "${topic}-${timestamp}",
  "transforms.TimestampRouter.timestamp.format": "YYYY.MM.dd",
  "batch.size": "100",
  "offset.flush.interval.ms":"60000",
  "offset.flush.timeout.ms": "15000",
  "read.timeout.ms": "15000",
  "connection.timeout.ms": "10000",
  "max.buffered.records": "1500"
 }
}
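For reference, a body like the one above is normally submitted to the Connect worker's REST API, and the task state can then be read back from the status endpoint. The following is only a minimal sketch: the base URL `http://localhost:8083` is an assumption (8083 is the default REST port shown commented out in the worker properties), and the helper names `build_create_request` and `get_status` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the Connect worker's REST API is reachable at this address.
CONNECT_URL = "http://localhost:8083"


def build_create_request(name, config, base_url=CONNECT_URL):
    """Build (without sending) the POST /connectors request for a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_status(name, base_url=CONNECT_URL):
    """Call GET /connectors/<name>/status and return the parsed JSON reply."""
    with urllib.request.urlopen(f"{base_url}/connectors/{name}/status") as resp:
        return json.load(resp)
```

To actually send the request, pass the object returned by `build_create_request(...)` to `urllib.request.urlopen`. Calling `get_status("test-distributed-connector")` afterwards shows whether the connector and its tasks report RUNNING or FAILED, which helps separate a failed task from a consumer that is merely stuck rejoining its group.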

The problem I'm having is that sometimes this sink works, sends data to Elasticsearch, and logs:

[2020-09-15 20:27:05,904] INFO WorkerSinkTask{id=test-distributed-connector-0} Committing offsets asynchronously using sequence number 1.......

But most of the time it just gets stuck and repeats this part:

[2020-09-15 20:24:29,458] INFO [Consumer clientId=consumer-4, groupId=connect-test-distributed-connector] Group coordinator kafka:9092 (id: 2147483543 rack: null) is unavailable or invalid, will attempt rediscovery (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:706)
[2020-09-15 20:24:29,560] INFO [Consumer clientId=consumer-4, groupId=connect-test-distributed-connector] Discovered group coordinator kafka:9092 (id: 2147483543 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:654)
[2020-09-15 20:24:29,561] INFO [Consumer clientId=consumer-4, groupId=connect-test-distributed-connector] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:486)
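To quantify how often the consumer is cycling through this rediscover-and-rejoin loop, the worker log can be scanned for the `(Re-)joining group` message. This is a rough sketch that assumes the log lines keep the `[Consumer clientId=..., groupId=...]` prefix shown above; the function name `count_rejoins` is illustrative only.

```python
import re
from collections import Counter

# Matches the consumer prefix that appears in the log lines quoted above.
CONSUMER_PREFIX = re.compile(
    r"\[Consumer clientId=(?P<client>[^,]+), groupId=(?P<group>[^\]]+)\]"
)


def count_rejoins(lines):
    """Count '(Re-)joining group' messages per (groupId, clientId) pair."""
    counts = Counter()
    for line in lines:
        if "(Re-)joining group" not in line:
            continue
        match = CONSUMER_PREFIX.search(line)
        if match:
            counts[(match.group("group"), match.group("client"))] += 1
    return counts
```

For example, `count_rejoins(open("connect.log"))` (the file name is hypothetical) returns a counter keyed by group and client. A count that keeps growing for the same group suggests the consumer never completes a stable join.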

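One suspicion I have is that this Elasticsearch sink reads a lot of topics, each with a lot of messages/data.

So a problem may occur when it tries to read those topics from Kafka.

I suspect this because I have another Elasticsearch sink with essentially the same settings as this one, and that one works fine.

Is there any approach or tuning I can use to make this Elasticsearch sink work?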
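One way to check the "too many topics" suspicion is to see which topic names the `topics.regex` from the connector config actually selects. The sketch below uses Python's `re` module as a stand-in for Java's regex engine; the two are not identical, so treat the result as an approximation. The topic names in the usage note are invented for illustration.

```python
import re

# The topics.regex value from the connector config above.
TOPICS_REGEX = r"^test[0-9A-Za-z-_]*(?<!-raw$)$"


def matching_topics(topic_names, pattern=TOPICS_REGEX):
    """Return the topic names that the whole-string pattern accepts."""
    compiled = re.compile(pattern)
    return [name for name in topic_names if compiled.fullmatch(name)]
```

For instance, `matching_topics(["test-orders", "test-orders-raw", "other"])` keeps only `test-orders`: the negative lookbehind excludes names ending in `-raw`, and the `^test` prefix excludes unrelated topics. Feeding in the cluster's real topic list gives a rough count of how many topics this one sink subscribes to.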

######## UPDATE #########

Sometimes (very frequently) I see this log:

[2020-09-16 09:51:18,189] WARN [Consumer clientId=consumer-6, groupId=connect-test-distributed-connector] Close timed out with 1 pending requests to coordinator, terminating client connections (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:769)

and

[2020-09-16 10:17:43,369] WARN [Consumer clientId=consumer-16, groupId=connect-test-distributed-connector-] Close timed out with 1 pending requests to coordinator, terminating client connections (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:769)

Answer 1

You are using a secured connection, so I assume X-Pack is enabled on your cluster.

Try changing "connection.url": "http://elasticsearch:9200" to "connection.url": "https://elasticsearch:9200"
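The change suggested here amounts to swapping the URL scheme in the connector config. As a small sketch (the helper name `with_https` is made up, and it assumes `connection.url` may hold a comma-separated list of URLs), it could look like this:

```python
from urllib.parse import urlsplit, urlunsplit


def with_https(config):
    """Return a copy of a connector config whose connection.url uses https."""
    updated = dict(config)
    urls = [u.strip() for u in config["connection.url"].split(",")]
    updated["connection.url"] = ",".join(
        urlunsplit(urlsplit(u)._replace(scheme="https")) for u in urls
    )
    return updated
```

The returned map can then be applied to the existing connector with `PUT /connectors/test-distributed-connector/config` on the Connect REST API. Whether HTTPS is the right scheme depends on whether TLS is actually enabled on the Elasticsearch HTTP layer, which the question does not state.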
