How to change the number of replicas of a Kafka topic?

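After a Kafka topic has been created by a producer or an administrator, how would you go about changing the number of replicas of that topic?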

To increase the number of replicas for a given topic you have to:

1. Specify the extra replicas in a custom reassignment JSON file

For example, you could create increase-replication-factor.json and put this content in it:

{"version":1,
  "partitions":[
     {"topic":"signals","partition":0,"replicas":[0,1,2]},
     {"topic":"signals","partition":1,"replicas":[0,1,2]},
     {"topic":"signals","partition":2,"replicas":[0,1,2]}
]}

2. Use the file with the --execute option of the kafka-reassign-partitions tool

[or kafka-reassign-partitions.sh - depending on the kafka package]

For example:

$ kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute

3. Verify the replication factor with the kafka-topics tool

[or kafka-topics.sh - depending on the kafka package]

 $ kafka-topics --zookeeper localhost:2181 --topic signals --describe

Topic:signals   PartitionCount:3    ReplicationFactor:3 Configs:retention.ms=1000000000
Topic: signals  Partition: 0    Leader: 2   Replicas: 0,1,2 Isr: 2,0,1
Topic: signals  Partition: 1    Leader: 2   Replicas: 0,1,2 Isr: 2,0,1
Topic: signals  Partition: 2    Leader: 2   Replicas: 0,1,2 Isr: 2,0,1

See also: the part of the official documentation that describes how to increase the replication factor.
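For topics with many partitions, writing the file from step 1 by hand gets tedious. The following is a minimal sketch, not part of the original answer, of a bash function that prints the same JSON layout. The function name make_reassignment_json and its arguments are made up for illustration, and it assumes every partition should get the same replica list.

```shell
#!/bin/bash
# Hypothetical helper: print a reassignment JSON for one topic.
# Usage: make_reassignment_json TOPIC PARTITION_COUNT REPLICAS
#   REPLICAS is a comma-separated broker id list, e.g. "0,1,2".
# Assumption: every partition gets the same replica list.
make_reassignment_json() {
    local topic="$1" partitions="$2" replicas="$3"
    local p sep=""
    printf '{"version":1,\n  "partitions":[\n'
    for ((p = 0; p < partitions; p++)); do
        # The separator is printed before every entry except the first,
        # so the last entry is not followed by a trailing comma.
        printf '%s     {"topic":"%s","partition":%d,"replicas":[%s]}' \
            "$sep" "$topic" "$p" "$replicas"
        sep=$',\n'
    done
    printf '\n]}\n'
}

# Example: the 3-partition topic "signals" replicated to brokers 0, 1 and 2.
make_reassignment_json signals 3 "0,1,2"
```

Redirect the output to increase-replication-factor.json and use it in step 2. The first broker in each replicas list is the preferred leader, so for a real cluster you may want to vary the order per partition instead of reusing one list.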

You can also use kafkactl for this:

# first run with --validate-only to see what kafkactl will do
kafkactl alter topic my-topic --replication-factor 2 --validate-only

# then do the replica reassignment
kafkactl alter topic my-topic --replication-factor 2

Note that the Kafka API that kafkactl is using for this is only available for Kafka ≥ 2.4.0.

Disclaimer: I am a contributor to this project.

Edit: I was proven to be wrong - please check the excellent answer from Łukasz Dumiszewski.

I'm leaving my original answer for completeness for now.



I don't think you can. Normally it would be something like

./kafka-topics.sh --zookeeper localhost:2181 --alter --topic test2 --replication-factor 3

but it says

Option "[replication-factor]" can't be used with option"[alter]"

It is funny that you can change the number of partitions on the fly (which is often a hugely destructive action when done at runtime), but cannot increase the replication factor, which should be transparent. But remember, it is 0.10, not 10.0... Please see the enhancement request here: https://issues.apache.org/jira/browse/KAFKA-1543

Łukasz Dumiszewski's answer is correct, but manually generating that file is a bit hard. Luckily there are some easy ways to achieve what @Łukasz Dumiszewski said.

  • If you are using the kafka-manager tool, from version 2.0.0.2 you can change the replication factor in the Generate Partition Assignment section of a topic view. Then click Reassign Partitions to apply the generated partition assignment (if you select a different replication factor you will get a warning, but you can click Force Reassign afterward).

  • If you have ruby installed you can use this helper script.

  • If you prefer nodejs you can generate the file with this gist too.

This script may help you if you want to change the replication factor for all topics:

#!/bin/bash

# List every topic, one name per line.
topics=$(kafka-topics --list --zookeeper zookeeper:2181)

# Keep the names in an array so the last topic can be recognized below.
while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
  "partitions":[' > tmp.json
for t in $topics; do
    # Only partition 0 of each topic is listed.
    # The last entry is written without a trailing comma to keep the JSON valid.
    if [ "${t}" == "${lines[-1]}" ]; then
        echo "    {\"topic\":\"${t}\",\"partition\":0,\"replicas\":[0,1,2]}" >> tmp.json
    else
        echo "    {\"topic\":\"${t}\",\"partition\":0,\"replicas\":[0,1,2]}," >> tmp.json
    fi
done

echo '  ]
}' >> tmp.json

kafka-reassign-partitions --zookeeper zookeeper:2181 --reassignment-json-file tmp.json --execute

The scripted answer of @Дмитрий-Шепелев did not include a solution for topics with multiple partitions. This updated version does:

#!/bin/bash

brokerids="1,2,3"
topics=`kafka-topics --list --zookeeper zookeeper:2181`

while read -r line; do lines+=("$line"); done <<<"$topics"
echo '{"version":1,
  "partitions":['
for t in $topics; do
    sep=","
    pcount=$(kafka-topics --describe --zookeeper zookeeper:2181 --topic $t | awk '{print $2}' | uniq -c |awk 'NR==2{print $1}')
    for i in $(seq 0 $[pcount - 1]); do
        if [ "${t}" == "${lines[-1]}" ] && [ "$[pcount - 1]" == "$i" ]; then sep=""; fi
        randombrokers=$(echo "$brokerids" | sed -r 's/,/ /g' | tr " " "\n" | shuf | tr  "\n" "," | head -c -1)
        echo "    {\"topic\":\"${t}\",\"partition\":${i},\"replicas\":[${randombrokers}]}$sep"
    done
done

echo '  ]
}'

Note: it also randomizes the broker order and uses every ID in brokerids as a replica of each partition, so the resulting replication factor equals the number of IDs listed. Make sure the broker IDs in the script are correctly defined.

Execute as follows:

$ ./reassign.sh > reassign.json
$ kafka-reassign-partitions --zookeeper zookeeper:2181 --reassignment-json-file reassign.json --execute

If you have a lot of partitions, using kafka-reassign-partitions to generate the JSON file required by Łukasz Dumiszewski's answer (and the official documentation) can be a timesaver. Here is an example of replicating a 64-partition topic from 1 to 2 servers without having to specify all the partitions:

expand_topic=TestTopic
current_server=111
new_servers=111,222
echo '{"topics": [{"topic":"'${expand_topic}'"}], "version":1}' > /tmp/topics-to-expand.json
/bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file /tmp/topics-to-expand.json --broker-list "${current_server}" --generate | tail -1 | sed s/\\[${current_server}\\]/\[${new_servers}\]/g | tee /tmp/topic-expand-plan.json
/bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /tmp/topic-expand-plan.json --execute
/bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic ${expand_topic}

Outputs:

Topic:TestTopic PartitionCount:64   ReplicationFactor:2 Configs:retention.ms=6048000
    Topic: TestTopic    Partition: 0    Leader: 111 Replicas: 111,222   Isr: 111,222
    Topic: TestTopic    Partition: 1    Leader: 111 Replicas: 111,222   Isr: 111,222
    ....
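The sed step in the pipeline above is dense, so here it is in isolation as a rough sketch. The function name widen_replicas is hypothetical. It assumes the generated plan lists each partition's replicas as a single-broker array such as [111], which is what the example relies on.

```shell
#!/bin/bash
# Hypothetical helper: read a reassignment plan on stdin and rewrite every
# single-broker replica list [OLD] to [NEW].
# Usage: widen_replicas OLD NEW   (NEW is a comma-separated broker id list)
widen_replicas() {
    local old="$1" new="$2"
    # The brackets are part of the pattern, so [111] matches but [1111]
    # or [111,333] do not.
    sed "s/\[${old}\]/[${new}]/g"
}

# Example with one partition entry:
echo '{"topic":"TestTopic","partition":0,"replicas":[111]}' | widen_replicas 111 111,222
```

Partitions whose replica list does not match [OLD] exactly are left untouched, so it is worth reading the resulting plan before running it with --execute.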

1. Copy all topics to a JSON file

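#!/bin/bash
# List every topic, one name per line.
topics=$(kafka-topics.sh --zookeeper localhost:2181 --list)

echo '{"version":1,
 "topics":['
sep=""
for t in $topics; do
    # Print the separator before every entry except the first, so the last
    # entry has no trailing comma (which would make the JSON invalid).
    printf '%s     { "topic": "%s" }' "$sep" "$t"
    sep=$',\n'
done
echo

echo '  ]
}'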

bash alltopics.sh > alltopics.json

2. Run kafka-reassign-partitions.sh to generate the rebalanced file

kafka-reassign-partitions.sh --zookeeper localhost:2181 --broker-list "0,1,2" --generate --topics-to-move-json-file alltopics.json > reassign.json

3. Clean up the reassign.json file: it contains both the existing and the proposed assignment, and only the proposed one should remain

4. Run kafka-reassign-partitions.sh to rebalance the topics

kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file reassign.json --execute
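The cleanup in step 3 can be scripted. This is only a sketch under an assumption about the tool's output: that the proposed plan is the first non-empty line after a heading that starts with "Proposed partition reassignment configuration". The function name extract_proposed_plan is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read the output of
#   kafka-reassign-partitions.sh ... --generate
# on stdin and print only the proposed plan.
# Assumption: the plan is the first non-empty line after the heading
# "Proposed partition reassignment configuration".
extract_proposed_plan() {
    awk '
    found && NF { print; exit }
    /^Proposed partition reassignment configuration/ { found = 1 }
    '
}
```

With that in place, step 2 could be piped through the function, for example `... --generate | extract_proposed_plan > reassign.json`. If your Kafka version prints the plan differently, check the raw output first and adjust the heading accordingly.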

First, describe the topic to see its partitions and their current replicas:

./kafka-topics.sh --describe --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --topic test2

Next, identify the brokers that the replicas should be synced to. Changing them requires a partition reassignment. To do this, create a JSON file that lists the topic, each partition, and the brokers that should hold its replicas:

    {"version":1,
    "partitions":[
     {"topic":"test2","partition":0,"replicas":[0,10]},
     {"topic":"test2","partition":1,"replicas":[10,20]}
    ]}

Finally, run the reassignment for the topic's partitions:

./kafka-reassign-partitions.sh --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --reassignment-json-file /tmp/increase-replication-factor.json --execute

To verify:

[root@prod-az-p2-kafka02 bin]# ./kafka-topics.sh --describe --zookeeper prod-az-p1-zk01.<domain>.prod:2181 --topic test2
Topic: test2    TopicId: -LoL36ztSeyC8rzvnp4YMw PartitionCount: 2   ReplicationFactor: 2    Configs:
    Topic: test2    Partition: 0    Leader: 10  Replicas: 0,10  Isr: 10
    Topic: test2    Partition: 1    Leader: 20  Replicas: 10,20 Isr: 20,10
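In the output above, partition 0 lists replicas 0,10 but only broker 10 in its Isr, meaning the new replica has not caught up yet. A small sketch for spotting such partitions follows. The function name list_out_of_sync is hypothetical, and it assumes partition lines carry "Partition:", "Replicas:" and "Isr:" labels each followed by its value, as in the output shown here.

```shell
#!/bin/bash
# Hypothetical helper: read `kafka-topics.sh --describe` output on stdin and
# print every partition whose Isr list is shorter than its Replicas list.
# Assumption: each partition line contains "Partition: N",
# "Replicas: a,b,..." and "Isr: a,b,...".
list_out_of_sync() {
    awk '
    /Partition:/ && /Replicas:/ && /Isr:/ {
        part = ""; replicas = ""; isr = ""
        # Each label is followed by its value in the next field.
        for (i = 1; i < NF; i++) {
            if ($i == "Partition:") part = $(i + 1)
            if ($i == "Replicas:")  replicas = $(i + 1)
            if ($i == "Isr:")       isr = $(i + 1)
        }
        if (split(replicas, r, ",") > split(isr, s, ","))
            print "partition " part ": replicas=" replicas " isr=" isr
    }'
}
```

Pipe the describe command into it, for example `./kafka-topics.sh --describe ... --topic test2 | list_out_of_sync`. No output means every listed replica is in sync.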

This script will generate the JSON for kafka-reassign-partitions.sh and feed it into that script to increase the replication factor. The new set of replicas will:

  • Keep the current replicas
  • Add new unique brokers (this will prevent unneeded data migrations)

This script was tested with the Kafka 2.8.0 scripts. Only the variables at the top of the file need to be modified.

#!/bin/bash

KAFKA_BIN="./bin"
KAFKA_CONNECTION_ARGS="--bootstrap-server localhost:9094"

broker_ids="1,2,3"
topic="topic_foobar"
new_replication_factor=3 # New replication factor


reassignment_file="./reassignment.json"


#~~~~ Don't change anything after this line ~~~~#


# Generate a list of "partition|replicas"
topic_data="$("$KAFKA_BIN/kafka-topics.sh" $KAFKA_CONNECTION_ARGS --describe --topic "$topic" | tail -n +2 | sed -E 's/.*Partition:\s+([0-9]+).*Replicas:\s+([0-9,]+).*/\1|\2/g')"
partition_count=$(echo "$topic_data" | wc -l)

echo '{
    "version": 1,
    "partitions": [' > "$reassignment_file"


log_dirs="$(yes '"any"' | head -n $new_replication_factor | sed -e ':a;N;$!ba;s/\n/,/g')"
obj_sep=","
while read -r partition_data; do
    partition=$(echo "$partition_data" | cut -d '|' -f 1)
    replicas=$(echo "$partition_data" | cut -d '|' -f 2)

    # Randomize the replicas (using this list as a queue)
    random_replicas="$(echo $broker_ids | tr "," "\n" | shuf)"
    
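    # Loop until the replicas has desired RF - 1 commas
    while [ "$(echo "$replicas" | tr -dc , | wc -c)" -lt $((new_replication_factor-1)) ]; do
        # Pick the next replica, add it to the list if it isn't already there, otherwise advance the queue
        next_replica="$(echo "$random_replicas" | head -1)"
        # Stop instead of looping forever when there are fewer brokers than the desired RF
        if [ -z "$next_replica" ]; then echo "Not enough brokers for partition $partition" >&2; exit 1; fi
        # Compare whole IDs so that broker 1 is not mistaken for broker 10
        if [[ ",$replicas," != *",$next_replica,"* ]]; then
            replicas="$replicas,$next_replica"
        else
            random_replicas="$(echo "$random_replicas" | tail -n +2)"
        fi
    done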
    
    # Don't add a comma on the last object
    if [ "$((partition_count-1))" == "$partition" ]; then obj_sep=""; fi
    
    echo '      {
            "topic": "'"$topic"'",
            "partition": '"$partition"',
            "replicas": ['"$replicas"'],
            "log_dirs": ['"$log_dirs"']
        }'$obj_sep >> "$reassignment_file"
done < <(echo "$topic_data")

echo '  ]
}' >> "$reassignment_file"


cat "$reassignment_file"
read -p "Apply the above reassignment? (Ctrl-C to exit): "


"$KAFKA_BIN/kafka-reassign-partitions.sh" $KAFKA_CONNECTION_ARGS --execute --reassignment-json-file "$reassignment_file"

To increase the number of replicas for a given topic you have to:

1. Add the extra partitions to the existing topic with the command below (let us say increasing from 2 to 3)

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic-to-increase --partitions 3

2. Specify the extra replicas in a custom reassignment JSON file

For example, you could create increase-replication-factor.json and put this content in it:

{"version":1,
  "partitions":[
     {"topic":"topic-to-increase","partition":0,"replicas":[0,1,2]},
     {"topic":"topic-to-increase","partition":1,"replicas":[0,1,2]},
     {"topic":"topic-to-increase","partition":2,"replicas":[0,1,2]}
]}

3. Use the file with the --execute option of the kafka-reassign-partitions tool

bin/kafka-reassign-partitions --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute

4. Verify the replication factor with the kafka-topics tool

bin/kafka-topics --zookeeper localhost:2181 --topic topic-to-increase --describe
