
Kafka topic creation failing when 3 brokers are up out of 4 in a cluster

Kafka topic creation is failing in the following scenario:

Nodes in Kafka cluster: 4

Replication factor: 4

Number of nodes up and running in cluster: 3

Below is the error:

./kafka-topics.sh --zookeeper :2181 --create --topic test_1 --partitions 1 --replication-factor 4
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Error while executing topic command : Replication factor: 4 larger than available brokers: 3.
[2018-10-31 11:58:13,084] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 4 larger than available brokers: 3.

Is this valid behavior or some known issue in Kafka?

If all the nodes in a cluster must always be up and running, then what about fault tolerance?

Updating the JSON file to increase the replication factor for an already created topic:

$ cat /tmp/increase-replication-factor.json
{"version":1,
  "partitions":[
     {"topic":"vHost_v81drv4","partition":0,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":1,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":2,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":3,"replicas":[4,1,2,3]}
     {"topic":"vHost_v81drv4","partition":4,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":5,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":6,"replicas":[4,1,2,3]},
     {"topic":"vHost_v81drv4","partition":7,"replicas":[4,1,2,3]}
]}

When a new topic is created in Kafka, it is replicated N=replication-factor times across your brokers. Since you have 3 brokers up and running and replication-factor set to 4, the topic cannot be replicated 4 times and thus you get an error.

When creating a new topic you either need to ensure that all 4 of your brokers are up and running, or set the replication factor to a smaller value, in order to avoid failure on topic creation when one of your brokers is down.
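For instance, you could first check which broker IDs are currently alive (i.e., registered in ZooKeeper) and pick a matching replication factor. A minimal sketch, assuming localhost:2181 is your ZooKeeper address:

# list the IDs of the brokers currently registered in ZooKeeper
zookeeper-shell localhost:2181 ls /brokers/ids

# with only 3 live brokers, creation succeeds once the replication factor is lowered to 3
kafka-topics --zookeeper localhost:2181 --create --topic test_1 --partitions 1 --replication-factor 3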

In case you want to create a topic with the replication factor set to 4 while one broker is down, you can initially create the topic with replication-factor=3, and once your 4th broker is up and running you can modify the configuration of that topic and increase its replication factor by following the steps below (assuming you have a topic example with 4 partitions):

Create an increase-replication-factor.json file with this content:

{"version":1,
  "partitions":[
     {"topic":"example","partition":0,"replicas":[0,1,2,3]},
     {"topic":"example","partition":1,"replicas":[0,1,2,3]},
     {"topic":"example","partition":2,"replicas":[0,1,2,3]},
     {"topic":"example","partition":3,"replicas":[0,1,2,3]}
]}

Then execute the following command:

kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
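You can then monitor the progress of the reassignment by running the same tool with --verify instead of --execute:

kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --verify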

And finally you'd be able to confirm that your topic is replicated across the 4 brokers:

kafka-topics --zookeeper localhost:2181 --topic example --describe
Topic:example   PartitionCount:4    ReplicationFactor:4 Configs:retention.ms=1000000000
Topic: example  Partition: 0    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
Topic: example  Partition: 1    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
Topic: example  Partition: 2    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
Topic: example  Partition: 3    Leader: 2   Replicas: 0,1,2,3 Isr: 2,0,1,3
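Note that in the sample output above broker 2 happens to lead every partition, even though the first (preferred) replica of each partition is broker 0. If you want leadership handed back to the preferred replicas, Kafka ships a tool for that; a sketch, with the same placeholder ZooKeeper address:

kafka-preferred-replica-election --zookeeper localhost:2181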

Regarding high availability, let me explain how Kafka works:

Every topic is a particular stream of data (similar to a table in a database). Topics are split into partitions (as many as you like), where each message within a partition gets an incremental id, known as an offset, as shown below.

Partition 0:

+---+---+---+-----+
| 0 | 1 | 2 | ... |
+---+---+---+-----+

Partition 1:

+---+---+---+---+----+
| 0 | 1 | 2 | 3 | .. |
+---+---+---+---+----+
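As a side note, the latest offset of each partition can be inspected with the GetOffsetShell tool bundled with Kafka (a sketch; topic1 and the broker address are placeholders):

# prints topic:partition:latest-offset for every partition (--time -2 would print the earliest)
kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic topic1 --time -1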

Now a Kafka cluster is composed of multiple brokers. Each broker is identified with an ID and can contain certain topic partitions.

Example of 2 topics (with 3 and 2 partitions respectively):

Broker 1:

+-------------------+
|      Topic 1      |
|    Partition 0    |
|                   |
|                   |
|     Topic 2       |
|   Partition 1     |
+-------------------+

Broker 2:

+-------------------+
|      Topic 1      |
|    Partition 2    |
|                   |
|                   |
|     Topic 2       |
|   Partition 0     |
+-------------------+

Broker 3:

+-------------------+
|      Topic 1      |
|    Partition 1    |
|                   |
|                   |
|                   |
|                   |
+-------------------+

Note that data is distributed (and Broker 3 doesn't hold any data of topic 2).

Topics should have a replication-factor > 1 (usually 2 or 3) so that when a broker is down, another one can serve the data of a topic. For instance, assume that we have a topic with 2 partitions and a replication-factor set to 2, as shown below:

Broker 1:

+-------------------+
|      Topic 1      |
|    Partition 0    |
|                   |
|                   |
|                   |
|                   |
+-------------------+

Broker 2:

+-------------------+
|      Topic 1      |
|    Partition 0    |
|                   |
|                   |
|     Topic 1       |
|   Partition 1     |
+-------------------+

Broker 3:

+-------------------+
|      Topic 1      |
|    Partition 1    |
|                   |
|                   |
|                   |
|                   |
+-------------------+
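A layout like the one pictured could be produced by a command along these lines (a sketch; topic1 is a placeholder name, and Kafka itself decides which brokers receive the replicas):

kafka-topics --zookeeper localhost:2181 --create --topic topic1 --partitions 2 --replication-factor 2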

Now assume that Broker 2 has failed. Brokers 1 and 3 can still serve the data for topic 1. So a replication-factor of 3 is always a good idea, since it allows one broker to be taken down for maintenance purposes and another one to go down unexpectedly. Therefore, Apache Kafka offers strong durability and fault tolerance guarantees.
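You can observe this with the describe command: while Broker 2 is down it drops out of the Isr (in-sync replicas) column, but stays listed under Replicas (same placeholder names as above):

kafka-topics --zookeeper localhost:2181 --describe --topic topic1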

Note about Leaders: At any time, only one broker can be the leader of a partition, and only that leader can receive and serve data for that partition. The remaining brokers will just synchronize the data (in-sync replicas). Also note that when the replication-factor is set to 1, the leader cannot be moved elsewhere when a broker fails. In general, when all replicas of a partition fail or go offline, the leader is automatically set to -1.
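Both situations are easy to spot with the describe command's filter flags (same placeholder ZooKeeper address):

# partitions whose leader is unavailable (Leader: -1)
kafka-topics --zookeeper localhost:2181 --describe --unavailable-partitions

# partitions with replicas that have fallen out of the ISR
kafka-topics --zookeeper localhost:2181 --describe --under-replicated-partitions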

This is valid behavior. When creating a new topic, all the brokers it will be replicated to must be up and running, i.e., the number of live brokers must be at least the requested replication factor.

From the Kafka Confluence wiki, Replica placements - Initial placement:

Only create the topic, making decisions based on the current live brokers (manual create topic command);

Not all nodes need to be up and running while using this topic (after it is created).

Apache documentation about the replication factor:

The replication factor controls how many servers will replicate each message that is written. If you have a replication factor of 3, then up to 2 servers can fail before you will lose access to your data. We recommend you use a replication factor of 2 or 3 so that you can transparently bounce machines without interrupting data consumption.
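If you rely on automatically created topics, the corresponding broker-side defaults can be set in server.properties; the values below are illustrative, not a recommendation from the docs:

# defaults applied to auto-created topics (illustrative values)
num.partitions=4
default.replication.factor=3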
