ArangoDB sharded cluster performance issue

Question

I have a query that runs well in a single-instance setup. However, when I ran it on a sharded cluster, performance dropped (4x longer execution time).

The query plan shows that practically all processing is done on the Coordinator node, not on the DbServer. How can I push the query to be executed on the DbServer?

To give a bit of context: I have a collection of ~120k (will grow to several million) multi-level JSON documents with nested arrays. The query needs to unnest these arrays before getting to the proper node.

AQL Query:

for doc IN doccollection
for arrayLevel1Elem in doc.report.container.children.container
for arrayLevel2Elem in arrayLevel1Elem.children.container.children.num
for arrayLevel3Elem in arrayLevel2Elem.children.code

filter doc.report.container.concept.simpleCodedValue == 'A' 
filter arrayLevel1Elem.concept.codedValue == "B"
filter arrayLevel2Elem.concept.simpleCodedValue == "C"   
filter arrayLevel3Elem.concept.simpleCodedValue == 'X'
filter arrayLevel3Elem.value.simpleCodedValue == 'Y'     

collect studyUid = doc.report.study.uid, personId = doc.report.person.id, metricName = arrayLevel2Elem.concept.meaning, value = to_number(arrayLevel2Elem.value)

return {studyUid, personId, metricName, value}

Query Plan:

 Id   NodeType                  Site          Est.   Comment
  1   SingletonNode             DBS              1   * ROOT
  2   EnumerateCollectionNode   DBS         121027     - FOR doc IN doccollection   /* full collection scan, projections: `report`, 2 shard(s) */   FILTER (doc.`report`.`container`.`concept`.`simpleCodedValue` == "A")   /* early pruning */
  3   CalculationNode           DBS         121027       - LET #8 = doc.`report`.`container`.`children`.`container`   /* attribute expression */   /* collections used: doc : doccollection */
 19   CalculationNode           DBS         121027       - LET #24 = doc.`report`.`study`.`uid`   /* attribute expression */   /* collections used: doc : doccollection */
 20   CalculationNode           DBS         121027       - LET #26 = doc.`report`.`person`.`id`   /* attribute expression */   /* collections used: doc : doccollection  */
 29   RemoteNode                COOR        121027       - REMOTE
 30   GatherNode                COOR        121027       - GATHER   /* parallel, unsorted */
  4   EnumerateListNode         COOR      12102700       - FOR arrayLevel1Elem IN #8   /* list iteration */
 11   CalculationNode           COOR      12102700         - LET #16 = (arrayLevel1Elem.`concept`.`codedValue` == "B")   /* simple expression */
 12   FilterNode                COOR      12102700         - FILTER #16
  5   CalculationNode           COOR      12102700         - LET #10 = arrayLevel1Elem.`children`.`container`.`children`.`num`   /* attribute expression */
  6   EnumerateListNode         COOR    1210270000         - FOR arrayLevel2Elem IN #10   /* list iteration */
 13   CalculationNode           COOR    1210270000           - LET #18 = (arrayLevel2Elem.`concept`.`simpleCodedValue` == "C")   /* simple expression */
 14   FilterNode                COOR    1210270000           - FILTER #18
  7   CalculationNode           COOR    1210270000           - LET #12 = arrayLevel2Elem.`children`.`code`   /* attribute expression */
 21   CalculationNode           COOR    1210270000           - LET #28 = arrayLevel2Elem.`concept`.`meaning`   /* attribute expression */
 22   CalculationNode           COOR    1210270000           - LET #30 = TO_NUMBER(arrayLevel2Elem.`value`)   /* simple expression */
  8   EnumerateListNode         COOR  121027000000           - FOR arrayLevel3Elem IN #12   /* list iteration */
 15   CalculationNode           COOR  121027000000             - LET #20 = ((arrayLevel3Elem.`concept`.`simpleCodedValue` == "X") && (arrayLevel3Elem.`value`.`simpleCodedValue` == "Y"))   /* simple expression */
 16   FilterNode                COOR  121027000000             - FILTER #20
 23   CollectNode               COOR   96821600000             - COLLECT studyUid = #24, personId = #26, metricName = #28, value = #30   /* hash */
 26   SortNode                  COOR   96821600000             - SORT studyUid ASC, personId ASC, metricName ASC, value ASC   /* sorting strategy: standard */
 24   CalculationNode           COOR   96821600000             - LET #32 = { "studyUid" : studyUid, "personId" : personId, "metricName" : metricName, "value" : value }   /* simple expression */
 25   ReturnNode                COOR   96821600000             - RETURN #32

Thanks a lot for any hint.

Answer 1

Queries are not actually executed at the DB server - the coordinators handle query compilation and execution, only really asking the DB server(s) for data.

This means the memory load for query execution happens on the coordinators (good), but the coordinator has to transport (sometimes LARGE amounts of) data across the network. This is THE BIGGEST downside to moving to a cluster - and not one that is easily solved.
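As a rough illustration using the plan from the question: every node after GATHER (Id 30) is marked COOR, including all three list iterations. The Est. column grows by a factor of 100 at each nested FOR. That factor is my inference from the numbers in the plan, not something taken from the documentation, and it looks like a fixed per-array guess rather than a measured size. A small Python sketch that reproduces the estimates under that assumption:

```python
# Estimate for the collection scan, copied from node 2 of the plan in the question.
collection_estimate = 121_027

# ASSUMPTION: inferred from the ratios between successive Est. values in the plan,
# not taken from ArangoDB documentation.
assumed_items_per_array = 100

# One entry for the scan, then one per nested list iteration (plan nodes 4, 6 and 8).
estimates = [collection_estimate]
for _ in range(3):
    estimates.append(estimates[-1] * assumed_items_per_array)

for level, est in enumerate(estimates):
    print(f"nesting level {level}: estimated rows = {est}")
```

The real cost depends on the actual array lengths, but the shape is the same: each nesting level multiplies the number of rows the coordinator has to process.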

I walked this same road in the beginning and found ways to optimize some of my queries, but in the end, it was easier to go with a "one-shard" cluster or an "active-failover" setup.

It's tricky to make architecture suggestions because each use case can be so different, but there are some general AQL guidelines I follow:

1. Grouping all the FOR statements together and putting all the FILTER statements after them is not recommended (see #2). Try this version, which places each FILTER directly after the FOR it applies to, to see if it runs any faster (and try indexing report.container.concept.simpleCodedValue):

FOR doc IN doccollection
    FILTER doc.report.container.concept.simpleCodedValue == 'A'
    FOR arrayLevel1Elem in doc.report.container.children.container
        FILTER arrayLevel1Elem.concept.codedValue == 'B'
        FOR arrayLevel2Elem in arrayLevel1Elem.children.container.children.num
            FILTER arrayLevel2Elem.concept.simpleCodedValue == 'C'
            FOR arrayLevel3Elem in arrayLevel2Elem.children.code
                FILTER arrayLevel3Elem.concept.simpleCodedValue == 'X'
                FILTER arrayLevel3Elem.value.simpleCodedValue == 'Y'
                COLLECT
                    studyUid = doc.report.study.uid,
                    personId = doc.report.person.id,
                    metricName = arrayLevel2Elem.concept.meaning,
                    value = to_number(arrayLevel2Elem.value)
                RETURN { studyUid, personId, metricName, value }

2. The FOR doc IN doccollection pattern will recall the ENTIRE document from the DB server for each item in doccollection. Best practice is to limit the number of documents you are retrieving (best done with an index-backed search) and/or return only a few attributes. Don't be afraid of using LET - in-memory on the coordinator can be faster than in-memory on the DB. This example does both - filters and returns a smaller set of data:

LET filteredDocs = (
    FOR doc IN doccollection
        FILTER doc.report.container.concept.simpleCodedValue == 'A'
        RETURN {
            study_id: doc.report.study.uid,
            person_id: doc.report.person.id,
            arrayLevel1: doc.report.container.children.container
        }
)
FOR doc IN filteredDocs
    FOR arrayLevel1Elem in doc.arrayLevel1
        FILTER arrayLevel1Elem.concept.codedValue == 'B'
        ...
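To get a feel for why filtering early and returning only a few attributes matters, here is a toy Python model of the two approaches. Everything in it is an assumption made for illustration: the sample documents, the `attachments` padding field, and the use of JSON size as a stand-in for network transfer (ArangoDB does not ship JSON between servers). It is not a measurement of a real cluster.

```python
import json

def make_doc(i):
    """Build a hypothetical document shaped like the ones in the question."""
    return {
        "report": {
            "study": {"uid": f"study-{i}"},
            "person": {"id": f"person-{i}"},
            "container": {
                # One document in four matches the top-level filter.
                "concept": {"simpleCodedValue": "A" if i % 4 == 0 else "Z"},
                "children": {"container": [{"concept": {"codedValue": "B"}}]},
            },
            # Made-up padding standing in for attributes the query never reads.
            "attachments": "x" * 500,
        }
    }

def payload_bytes(rows):
    """Serialized JSON size of the rows, used as a rough proxy for transfer size."""
    return len(json.dumps(rows).encode("utf-8"))

docs = [make_doc(i) for i in range(1000)]

# Approach 1: ship every full document and filter afterwards.
full = payload_bytes(docs)

# Approach 2: filter first, then return only the attributes needed downstream,
# mirroring the RETURN { study_id, person_id, arrayLevel1 } subquery above.
projected_rows = [
    {
        "study_id": d["report"]["study"]["uid"],
        "person_id": d["report"]["person"]["id"],
        "arrayLevel1": d["report"]["container"]["children"]["container"],
    }
    for d in docs
    if d["report"]["container"]["concept"]["simpleCodedValue"] == "A"
]
projected = payload_bytes(projected_rows)

print(f"full documents:       {full} bytes")
print(f"filtered + projected: {projected} bytes")
```

How big the saving is on real data depends on how selective the first FILTER is and how much of each document the query never touches.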
