
How to set up clusters and sharding in ArangoDB?

I want to use sharding in ArangoDB. I have set up coordinators and DBServers as described in the documentation for 2.8.5. Could someone explain the setup in more detail, and also how I can check the performance of my query before and after sharding?

Testing your application can be done with a local cluster, where all instances run on one machine, which is what you already did, if I understand correctly?

An ArangoDB cluster consists of coordinator and dbserver nodes. Coordinators don't have their own user-specific local collections on disk. Their role is to handle the I/O with the clients; to parse, optimize, and distribute the queries; and to distribute the user data to the dbserver nodes. Foxx services also run on the coordinators. DBServers are the storage nodes in this setup; they keep the user data.

To compare the performance between clustered and non-clustered mode, import the same dataset into a clustered instance and a non-clustered one and compare the query execution times. Since the cluster setup can involve more network communication (e.g. if you do a join) than the single-server case, the performance can differ. On a physically distributed cluster you may achieve higher throughput, since the cluster nodes are separate machines with their own I/O paths that end on separate physical hard disks.
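The timing comparison described above could be sketched roughly as follows. The helper names `median` and `medianRuntimeMs` are hypothetical, and the `run` callback is an assumption: any zero-argument function that executes your query, for instance one that wraps `db._query("FOR x IN sharded RETURN x").toArray()` in arangosh. Taking the median over several repetitions is just one way to smooth out noise, not a prescribed method.

```javascript
// Returns the median of an array of numbers (the input array is not modified).
function median(values) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper: runs `run` several times and returns the median
// wall-clock time in milliseconds.
// `run` is an assumption: a zero-argument function that executes the query,
// e.g. function () { return db._query("FOR x IN sharded RETURN x").toArray(); }
function medianRuntimeMs(run, repetitions) {
  var samples = [];
  for (var i = 0; i < repetitions; i++) {
    var start = Date.now();
    run();
    samples.push(Date.now() - start);
  }
  return median(samples);
}
```

Run the same helper with the same query against the single-server instance and against a coordinator of the cluster, then compare the two medians.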

In the cluster case you create collections specifying the number of shards using the numberOfShards parameter; the shardKeys parameter controls the distribution of your documents across the shards. You should choose that key so documents distribute well across the shards (i.e. are not imbalanced toward just one shard). The numberOfShards can be an arbitrary value and doesn't have to correspond to the number of dbserver nodes. It can even be larger, so you can more easily move a shard from one dbserver to a new dbserver when scaling up your cluster to more nodes in the future to adapt to higher loads.
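To see why the choice of shard key matters, here is a small simulation that runs in plain Node.js. It is only an illustration: `simpleHash` is a made-up string hash and is not the hash function ArangoDB uses internally, and `shardCounts` is a hypothetical helper. The point is just that a key with many distinct values spreads documents out, while a key that is the same for every document puts them all on one shard.

```javascript
// Illustrative only: a simple string hash, NOT ArangoDB's internal hash.
function simpleHash(str) {
  var h = 0;
  for (var i = 0; i < str.length; i++) {
    h = (h * 31 + str.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

// Hypothetical helper: counts how many documents land on each shard
// when documents are assigned by hashing the given attribute.
function shardCounts(docs, shardKey, numberOfShards) {
  var counts = [];
  for (var s = 0; s < numberOfShards; s++) {
    counts.push(0);
  }
  docs.forEach(function (doc) {
    var shard = simpleHash(String(doc[shardKey])) % numberOfShards;
    counts[shard]++;
  });
  return counts;
}

// 1000 sample documents: unique _key, identical status.
var docs = [];
for (var i = 0; i < 1000; i++) {
  docs.push({ _key: "doc" + i, status: "active" });
}

console.log(shardCounts(docs, "_key", 4));   // spread over several shards
console.log(shardCounts(docs, "status", 4)); // all 1000 on a single shard
```

In arangosh, the corresponding collection would be created with something like `db._create("sharded", {numberOfShards: 4, shardKeys: ["_key"]})`.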

When you're developing AQL queries with cluster use in mind, it's essential to use the explain command to inspect how the query is distributed across the cluster, and where filters can be deployed:

db._create("sharded", {numberOfShards: 2})
db._explain("FOR x IN sharded RETURN x")
Query string:
 FOR x IN sharded RETURN x

Execution plan:
 Id   NodeType                  Est.   Comment
  1   SingletonNode                1   * ROOT
  2   EnumerateCollectionNode      1     - FOR x IN sharded /* full collection scan */
  6   RemoteNode                   1       - REMOTE
  7   GatherNode                   1       - GATHER
  3   ReturnNode                   1       - RETURN x

Indexes used:
 none

Optimization rules applied:
 Id   RuleName
  1   scatter-in-cluster
  2   remove-unnecessary-remote-scatter

In this simple query the RETURN and GATHER nodes are on the coordinator; the nodes above them, up to and including the REMOTE node, are deployed to the DB server.

In general, fewer REMOTE / SCATTER -> GATHER pairs mean less cluster communication. The closer FILTER nodes can be deployed to the *CollectionNodes, reducing the number of documents that have to be sent via the REMOTE nodes, the better the performance.
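If you want to compare query variants by how many of these nodes they need, a rough sketch is to count node types in the plan. The helper `countNodeTypes` is hypothetical. It assumes each plan node is an object with a `type` string such as "RemoteNode", which is how I understand the JSON from `db._createStatement({query: ...}).explain().plan.nodes` to look; check that against your version. The stub array below simply mirrors the plan printed above so the snippet runs without a server.

```javascript
// Hypothetical helper: counts how often each node type occurs in a plan.
// Assumption: every node object has a `type` string, e.g. "RemoteNode".
function countNodeTypes(nodes) {
  var counts = {};
  nodes.forEach(function (node) {
    counts[node.type] = (counts[node.type] || 0) + 1;
  });
  return counts;
}

// Stub that mirrors the execution plan shown above.
// In arangosh you would (assumption) pass in
// db._createStatement({query: "FOR x IN sharded RETURN x"}).explain().plan.nodes
var nodes = [
  { type: "SingletonNode" },
  { type: "EnumerateCollectionNode" },
  { type: "RemoteNode" },
  { type: "GatherNode" },
  { type: "ReturnNode" }
];

var counts = countNodeTypes(nodes);
console.log("REMOTE:", counts.RemoteNode, "GATHER:", counts.GatherNode);
```

A rewritten query that yields lower REMOTE and GATHER counts for the same result is usually the one to prefer in a cluster.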
