How is data node selection done in MySQL multi-site clustering? Is it based on the geographic proximity of data nodes in a node group?

Original post: 2017-10-23 11:27:06, tags: mysql, mysql-cluster

I am exploring the option of deploying MySQL multi-site clustering. The MySQL cluster deployment spans three sites and has two node groups; each node group contains three data nodes, with NoOfReplicas=3. The three data nodes in a node group are placed at three different sites for geographic spread.

All three sites access the MySQL cluster through the mysqld server and perform SELECT and INSERT/UPDATE operations on a single table.

Question 1:

Which data node is accessed when queries are issued from a site? Will the query access the local data node at the same site where the query originates?

Question 2:

Using the EXPLAIN statement, I can see which partition is used by the query, but not the exact data node that is accessed. Is there any way to find out which data node in the node group was accessed for the query?

Question 3:

Is there a way to set site affinity / tagging for data node selection in a node group?

1 Answer

Question 1: The answer depends on whether the table uses the READ BACKUP feature. If not, the query will almost always be sent to the primary replica, regardless of where that replica is located.

For READ BACKUP tables, the MySQL Server will send the query to a data node in the same node group and on the same host. This is normally automatic, based on the same hostname being used for the MySQL Server and the data node. It is also possible to set the variable ndb_data_node_neighbour to the data node that you are closest to (this is a configuration variable in the MySQL Server).

When the data node evaluates where to send the query, it will use the local node if the data resides there. Otherwise it will go to the primary replica node.

You can also use fully replicated tables, in which case the data resides in every node. The query will then always go to a data node on the same host and will find the data there.
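To make the rules for Question 1 concrete, here is a toy Python model of the selection logic described above. It is only a sketch of the behaviour as stated in this answer, not NDB's actual implementation. The `Replica` class, the `choose_read_node` function, the host names, and the choice to check the neighbour setting before the hostname match are all assumptions made for illustration. For a fully replicated table you would pass one replica per data node, so a same-host node can always be found.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    node_id: int      # data node ID (illustrative values below)
    host: str         # hostname the data node runs on
    is_primary: bool  # True for the primary replica of the partition


def choose_read_node(replicas: Sequence[Replica], server_host: str,
                     read_backup: bool = False,
                     neighbour_id: Optional[int] = None) -> int:
    """Return the node ID a read would be routed to in this toy model."""
    primary = next(r for r in replicas if r.is_primary)
    # Without READ BACKUP, reads go to the primary replica
    # (the answer says "almost always"; the model ignores the exceptions).
    if not read_backup:
        return primary.node_id
    # Assumption: an explicitly configured neighbour wins over hostname matching.
    if neighbour_id is not None and any(r.node_id == neighbour_id for r in replicas):
        return neighbour_id
    # Otherwise prefer a replica on the same host as the MySQL Server.
    for r in replicas:
        if r.host == server_host:
            return r.node_id
    # No local replica: fall back to the primary.
    return primary.node_id


# One node group with three replicas, one per site (hypothetical host names).
NODE_GROUP = [
    Replica(1, "site-a", True),
    Replica(2, "site-b", False),
    Replica(3, "site-c", False),
]

print(choose_read_node(NODE_GROUP, "site-b"))                    # plain table
print(choose_read_node(NODE_GROUP, "site-b", read_backup=True))  # READ BACKUP table
```

In this model a plain table is always read from node 1, while a READ BACKUP table queried from `site-b` is read from node 2, which is the behaviour the question is asking for.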

Question 2: The mapping of a partition to nodes and LDM threads is static. This information is available in the ndbinfo table table_fragments, documented at: https://dev.mysql.com/doc/mysql-cluster-excerpt/5.7/en/mysql-cluster-ndbinfo-table-fragments.html
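As a rough sketch of how that table could be used, the Python snippet below holds a query against `ndbinfo.table_fragments` and turns the result rows into a partition-to-nodes map. The column names are as I recall them from the linked documentation, so check them against your version. The `%s` placeholder assumes a MySQL driver with that parameter style, the `table_id` has to be looked up separately, and `partition_to_nodes` plus the sample rows are made up for illustration.

```python
# Query text only; run it through whatever MySQL client or driver you use.
# Column names should be verified against the ndbinfo documentation.
FRAGMENT_QUERY = """
SELECT partition_id, current_primary,
       current_first_backup, current_second_backup
FROM ndbinfo.table_fragments
WHERE table_id = %s
"""


def partition_to_nodes(rows):
    """Map partition_id -> {'primary': node_id, 'backups': [node_id, ...]}.

    Each row is (partition_id, current_primary, first_backup, second_backup).
    Empty backup slots (None or 0) are skipped; that convention is an assumption.
    """
    mapping = {}
    for partition_id, primary, *backups in rows:
        mapping[partition_id] = {
            "primary": primary,
            "backups": [node for node in backups if node],
        }
    return mapping


# Illustrative rows for two node groups with three replicas each (not real output).
SAMPLE_ROWS = [
    (0, 1, 2, 3),
    (1, 4, 5, 6),
]

for partition, nodes in partition_to_nodes(SAMPLE_ROWS).items():
    print(partition, nodes)
```

Combined with the partition shown by EXPLAIN, this tells you which nodes hold that partition and which one is currently the primary, though not which replica served a particular read.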

Question 3: Interesting question; I have been working on such a feature quite recently. Whether and when it will actually be released is, as usual, not something that can be promised, but the idea is in line with your thoughts: one defines a LocationDomainId for each data node and MySQL Server and uses this to route read requests. Again, it will only be applicable to tables that use the READ BACKUP feature or to fully replicated tables.
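Purely to illustrate the idea, a location-domain based choice could look something like the Python sketch below. This is not the feature's implementation or configuration syntax; the function name, the dictionary layout, the domain numbers, and the tie-breaking by lowest node ID are all hypothetical.

```python
def choose_by_location_domain(replica_domains, primary_id, server_domain):
    """Pick a replica for a read on a READ BACKUP or fully replicated table.

    replica_domains: dict of data node ID -> location domain ID (hypothetical layout)
    primary_id:      node ID of the primary replica, used as the fallback
    server_domain:   location domain ID assigned to the MySQL Server
    """
    # Prefer a replica in the same location domain as the MySQL Server.
    # Assumption: ties are broken by lowest node ID.
    for node_id in sorted(replica_domains):
        if replica_domains[node_id] == server_domain:
            return node_id
    # No replica in the server's domain: fall back to the primary replica.
    return primary_id


# Three replicas of one node group, one per site (domain IDs are made up).
DOMAINS = {1: 1, 2: 2, 3: 3}

print(choose_by_location_domain(DOMAINS, primary_id=1, server_domain=3))
```

With one replica per site and one location domain per site, each MySQL Server would read from the replica at its own site and only cross sites when no local replica is available.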