ArangoDB 分片集群性能问题

Question

我有一个在单实例设置中运行良好的查询。 但是，当我尝试在分片集群上运行它时，性能下降了（执行时间延长了 4 倍）。

查询计划显示几乎所有处理都在 Coordinator 节点上完成，而不是在 DbServer 上。 如何推送要在 DbServer 上执行的查询？

提供一些背景信息：我收集了大约 120k（将增长到数百万）的多级 JSON 文档，其中嵌套了 arrays。 并且查询需要在到达正确的节点之前取消嵌套这些 arrays。

AQL 查询：

for doc IN doccollection
for arrayLevel1Elem in doc.report.container.children.container
for arrayLevel2Elem in arrayLevel1Elem.children.container.children.num
for arrayLevel3Elem in arrayLevel2Elem.children.code

filter doc.report.container.concept.simpleCodedValue == 'A' 
filter arrayLevel1Elem.concept.codedValue == "B"
filter arrayLevel2Elem.concept.simpleCodedValue == "C"   
filter arrayLevel3Elem.concept.simpleCodedValue == 'X'
filter arrayLevel3Elem.value.simpleCodedValue == 'Y'     

collect studyUid = doc.report.study.uid, personId = doc.report.person.id, metricName = arrayLevel2Elem.concept.meaning, value = to_number(arrayLevel2Elem.value)

return {studyUid, personId, metricName, value}

查询计划：

 Id   NodeType                  Site          Est.   Comment
  1   SingletonNode             DBS              1   * ROOT
  2   EnumerateCollectionNode   DBS         121027     - FOR doc IN doccollection   /* full collection scan, projections: `report`, 2 shard(s) */   FILTER (doc.`report`.`container`.`concept`.`simpleCodedValue` == "A")   /* early pruning */
  3   CalculationNode           DBS         121027       - LET #8 = doc.`report`.`container`.`children`.`container`   /* attribute expression */   /* collections used: doc : doccollection */
 19   CalculationNode           DBS         121027       - LET #24 = doc.`report`.`study`.`uid`   /* attribute expression */   /* collections used: doc : doccollection */
 20   CalculationNode           DBS         121027       - LET #26 = doc.`report`.`person`.`id`   /* attribute expression */   /* collections used: doc : doccollection  */
 29   RemoteNode                COOR        121027       - REMOTE
 30   GatherNode                COOR        121027       - GATHER   /* parallel, unsorted */
  4   EnumerateListNode         COOR      12102700       - FOR arrayLevel1Elem IN #8   /* list iteration */
 11   CalculationNode           COOR      12102700         - LET #16 = (arrayLevel1Elem.`concept`.`codedValue` == "B")   /* simple expression */
 12   FilterNode                COOR      12102700         - FILTER #16
  5   CalculationNode           COOR      12102700         - LET #10 = arrayLevel1Elem.`children`.`container`.`children`.`num`   /* attribute expression */
  6   EnumerateListNode         COOR    1210270000         - FOR arrayLevel2Elem IN #10   /* list iteration */
 13   CalculationNode           COOR    1210270000           - LET #18 = (arrayLevel2Elem.`concept`.`simpleCodedValue` == "C")   /* simple expression */
 14   FilterNode                COOR    1210270000           - FILTER #18
  7   CalculationNode           COOR    1210270000           - LET #12 = arrayLevel2Elem.`children`.`code`   /* attribute expression */
 21   CalculationNode           COOR    1210270000           - LET #28 = arrayLevel2Elem.`concept`.`meaning`   /* attribute expression */
 22   CalculationNode           COOR    1210270000           - LET #30 = TO_NUMBER(arrayLevel2Elem.`value`)   /* simple expression */
  8   EnumerateListNode         COOR  121027000000           - FOR arrayLevel3Elem IN #12   /* list iteration */
 15   CalculationNode           COOR  121027000000             - LET #20 = ((arrayLevel3Elem.`concept`.`simpleCodedValue` == "X") && (arrayLevel3Elem.`value`.`simpleCodedValue` == "Y"))   /* simple expression */
 16   FilterNode                COOR  121027000000             - FILTER #20
 23   CollectNode               COOR   96821600000             - COLLECT studyUid = #24, personId = #26, metricName = #28, value = #30   /* hash */
 26   SortNode                  COOR   96821600000             - SORT studyUid ASC, personId ASC, metricName ASC, value ASC   /* sorting strategy: standard */
 24   CalculationNode           COOR   96821600000             - LET #32 = { "studyUid" : studyUid, "personId" : personId, "metricName" : metricName, "value" : value }   /* simple expression */
 25   ReturnNode                COOR   96821600000             - RETURN #32

非常感谢任何提示。

Answer 1

查询实际上并没有在数据库服务器上执行——协调器处理查询编译和执行，只是真正地向数据库服务器询问数据。

这意味着用于查询执行的 memory 负载发生在协调器上（很好。）但是协调器必须通过网络传输（有时是大量的）数据。 这是迁移到集群的最大缺点 - 而不是一个容易解决的问题。

我一开始也走这条路，并找到了优化我的一些查询的方法，但最后，使用“ 单分片”集群或“ 主动故障转移”设置更容易实现 go。

提出架构建议很棘手，因为每个用例都可能如此不同，但我遵循一些通用的 AQL 指南：

不建议收集FOR和FILTER语句（参见 #2）。 试试这个版本，看看它是否运行得更快（并尝试索引report.container.concept.simpleCodedValue ）：

FOR doc IN doccollection
    FILTER doc.report.container.concept.simpleCodedValue == 'A'
    FOR arrayLevel1Elem in doc.report.container.children.container
        FILTER arrayLevel1Elem.concept.codedValue == 'B'
        FOR arrayLevel2Elem in arrayLevel1Elem.children.container.children.num
            FILTER arrayLevel2Elem.concept.simpleCodedValue == 'C'
            FOR arrayLevel3Elem in arrayLevel2Elem.children.code
                FILTER arrayLevel3Elem.concept.simpleCodedValue == 'X'
                FILTER arrayLevel3Elem.value.simpleCodedValue == 'Y'
                COLLECT
                    studyUid = doc.report.study.uid,
                    personId = doc.report.person.id,
                    metricName = arrayLevel2Elem.concept.meaning,
                    value = to_number(arrayLevel2Elem.value)
                RETURN { studyUid, personId, metricName, value }

FOR doc IN doccollection模式将从数据库服务器中为doccollection中的每个项目调用整个文档。 最佳做法是限制您要检索的文档数量（最好使用索引支持的搜索）和/或只返回几个属性。 不要害怕使用LET - 协调器上的内存可以比数据库上的内存更快。 此示例同时进行 - 过滤并返回一组较小的数据：

LET filteredDocs = (
    FOR doc IN doccollection
        FILTER doc.report.container.concept.simpleCodedValue == 'A'
        RETURN {
            study_id: doc.report.study.uid,
            person_id: doc.report.person.id,
            arrayLevel1: doc.report.container.children.container
        }
)
FOR doc IN filteredDocs
    FOR arrayLevel1Elem in doc.arrayLevel1
        FILTER arrayLevel1Elem.concept.codedValue == 'B'
        ...

ArangoDB 分片集群性能问题

问题描述

1 个解决方案

解决方案1
0 2020-12-15 20:19:31

ArangoDB 分片集群性能问题

问题描述

1 个解决方案

解决方案1 0 2020-12-15 20:19:31

解决方案1
0 2020-12-15 20:19:31