
How to migrate Clickhouse's Zookeeper to new instances?

I'm hosting ClickHouse (v20.4.3.16) with 2 replicas on Kubernetes, and it uses Zookeeper (v3.5.5) with 3 replicas (also hosted on the same Kubernetes cluster).

I need to migrate the Zookeeper used by ClickHouse to a new installation, still 3 replicas but running v3.6.2.

What I tried to do was the following:

  • I stopped all instances of ClickHouse in order to freeze the Zookeeper nodes. Then, using zk-shell, I mirrored all znodes from /clickhouse of the old ZK cluster to the new one (it took some time but completed without problems; a rough sketch of the command is shown after the log excerpt below).
  • I restarted all instances of ClickHouse, one at a time, now attached to the new instance of Zookeeper.
  • Both ClickHouse instances started correctly, without any errors, but every time I (or someone else) try to add rows to a table with an INSERT, ClickHouse logs something like the following:
2021.01.13 13:03:36.454415 [ 135 ] {885576c1-832e-4ac6-82d8-45fbf33b7790} <Warning> default.check_in_availability: Tried to add obsolete part 202101_0_0_0 covered by 202101_0_1159_290 (state Committed)

and the new data is never inserted.
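For context, the mirroring was done with zk-shell roughly as follows; the hostnames are placeholders and this is only a sketch of the command form, not the exact invocation:

    # connect to the old ensemble (placeholder address)
    $ zk-shell zk-old-0.zk-old:2181
    # recursively mirror the /clickhouse subtree onto the new ensemble
    (CONNECTED) /> mirror /clickhouse zk://zk-new-0.zk-new:2181/clickhouse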

I've read all the info about Data Replication and Deduplication, but I am sure I'm adding new data with the insert; moreover, all tables use temporal fields (event_time, update_timestamp and so on), yet it simply doesn't work.

When I attach ClickHouse back to the old Zookeeper, the problem does not happen with the same data inserted.

Is there something that needs to be done before changing the Zookeeper endpoints? Am I missing something obvious?

"Using zk-shell, I ..."

You cannot use this method because it does not copy the auto-increment values that are used for part block numbers.
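To see why, note that ReplicatedMergeTree allocates block numbers from ephemeral sequential znodes, and the counter behind a sequential znode is the parent znode's cversion; the ephemeral children themselves are removed after use, so copying the surviving znodes cannot carry the counter over. A rough illustration with zkCli.sh (the table path and the exact numbers are made up for the example):

    # Old ensemble: the allocator state is the parent's cversion, not any child znode
    $ zkCli.sh -server zk-old:2181 stat /clickhouse/tables/01/some_table/block_numbers/202101
    ...
    cversion = 1160     # next sequential child (= next block number) would be 1160

    # New ensemble, right after mirroring the znodes: the parent was freshly
    # created, so its cversion starts over
    $ zkCli.sh -server zk-new:2181 stat /clickhouse/tables/01/some_table/block_numbers/202101
    ...
    cversion = 0        # new inserts get block numbers that are already "covered"

This matches the warning above: the existing part 202101_0_1159_290 already covers blocks 0..1159, so a freshly inserted part that is assigned block number 0 (202101_0_0_0) is discarded as obsolete.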

There is a much simpler way: you can migrate the ZK cluster by adding the new ZK nodes as followers.

Here is a plan for ZK 3.4.9 (no dynamic reconfiguration):
1. Configure the 3 new ZK nodes as a cluster of 6 nodes (3 old + 3 new) and start them. No changes are needed on the 3 old ZK nodes at this point (a sample transitional zoo.cfg is sketched after this list).
    In my case the new servers would not connect and download a snapshot, so I had to start one of them first, as a cluster of 4 nodes (3 old + 1 new).
2. Make sure the 3 new ZK nodes have connected to the old ZK cluster as followers (run echo stat | nc localhost 2181 on the 3 new ZK nodes).
3. Confirm that the leader has 5 synced followers (run echo mntr | nc localhost 2181 on the leader and look for zk_synced_followers).
4. Remove the 3 old ZK nodes from zoo.cfg on the 3 new ZK nodes (do not restart them yet).
5. Stop data loading in CH (this is to minimize errors when CH loses ZK).
6. Change the zookeeper section in the configs on the CH nodes: remove the 3 old ZK servers and add the 3 new ones (a sample section is sketched after this list).
7. Restart all CH nodes (CH must be restarted to connect to different ZK servers).
8. Make sure there are no connections from CH to the 3 old ZK nodes (run echo stat | nc localhost 2181 on the 3 old nodes and check their Clients section).
9. Turn off the 3 old ZK nodes.
10. Restart the 3 new ZK nodes. They should form a cluster of 3 nodes.
11. When CH reconnects to ZK, start data loading again.
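To make steps 1 and 4 concrete, here is a sketch of zoo.cfg on the 3 new ZK nodes during the transition; the hostnames and server ids are examples, not taken from the question, and the usual settings (dataDir, clientPort, tickTime, each node's myid) are omitted:

    # zoo.cfg on each of the 3 new ZK nodes while they join the old ensemble (step 1)
    server.1=zk-old-0:2888:3888
    server.2=zk-old-1:2888:3888
    server.3=zk-old-2:2888:3888
    server.4=zk-new-0:2888:3888
    server.5=zk-new-1:2888:3888
    server.6=zk-new-2:2888:3888

    # Step 4: delete the server.1-3 lines (the old nodes) so that only
    # server.4-6 remain; the change only takes effect when the new nodes are
    # restarted later (step 10).

The 3 old nodes keep their original 3-server zoo.cfg untouched the whole time.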
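Likewise, a sketch of what step 6 changes on the CH nodes, wherever the zookeeper section lives in your server configuration; the file path and hostnames are examples, and for ClickHouse 20.4 the root tag is <yandex>:

    <!-- e.g. /etc/clickhouse-server/config.d/zookeeper.xml -->
    <yandex>
        <zookeeper>
            <!-- the 3 old ZK servers removed, the 3 new ones listed instead -->
            <node index="1">
                <host>zk-new-0</host>
                <port>2181</port>
            </node>
            <node index="2">
                <host>zk-new-1</host>
                <port>2181</port>
            </node>
            <node index="3">
                <host>zk-new-2</host>
                <port>2181</port>
            </node>
        </zookeeper>
    </yandex>

If the section is kept in a config.d override, remember that ClickHouse merges config.d with the main config, so make sure the old servers are not still listed elsewhere.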
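Finally, the checks in steps 2, 3 and 8, with the fields to look at (on ZK 3.5+ the stat/mntr four-letter words must be allowed via 4lw.commands.whitelist):

    # step 2, on each new ZK node: it should be a follower of the old cluster
    $ echo stat | nc localhost 2181 | grep Mode
    Mode: follower

    # step 3, on the leader: all 5 followers (2 old + 3 new) should be synced
    $ echo mntr | nc localhost 2181 | grep zk_synced_followers
    zk_synced_followers     5

    # step 8, on each old ZK node: the Clients section of the output should no
    # longer list any ClickHouse hosts
    $ echo stat | nc localhost 2181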
