
Two FULLTEXT searches on ArangoDB Cluster: V8 is involved

I am investigating an ArangoDB cluster and found that when a query uses two FULLTEXT() searches, one of them involves the V8 engine. My data:

[
  {
    "TITL": "Attacks induced by bromocryptin in Parkinson patients",
    "WORD": [
      "hascites",
      "Six patients with Parkinson's disease"
    ],
    "ID": 1
  },
  {
    "TITL": "Linear modeling of possible mechanisms for Parkinson tremor generation",
    "WORD": [
      "hascites",
      "jsubsetIM"
    ],
    "ID": 2
  },
  {
    "TITL": "Drug-induced parkinsonism in the rat- a model for biochemical ...",
    "WORD": [
      "hascites",
      "Following treatment with reserpine or alternatively with ...",
      "hasabstract"
    ],
    "ID": 3
  }
]

The simplest query:

FOR title IN FULLTEXT(pmshort,"TITL","parkinson")
    FOR word IN FULLTEXT(pmshort,"WORD","hascites")
        FILTER title.ID==word.ID
    RETURN title

In other words, I am trying to find all documents that have parkinson in TITL and hascites in WORD. This example is heavily simplified, so something like

FILTER word.WORD=='hascites'

is not possible. Two or more FULLTEXT searches are required to provide the necessary functionality. The collection contains about 520,000 documents, and a fulltext index is set up on each field.
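To make the intended result concrete, here is a minimal Python sketch of what the two-FULLTEXT query is meant to compute on the three sample documents. It is only a model: the `tokens` and `fulltext_ids` helpers are hypothetical stand-ins, and the tokenizer (lowercase, split on non-alphanumeric characters, whole-word match) is an assumption rather than ArangoDB's actual fulltext implementation.

```python
import re

# Sample documents from the question (field names as in the post).
DOCS = [
    {"ID": 1,
     "TITL": "Attacks induced by bromocryptin in Parkinson patients",
     "WORD": ["hascites", "Six patients with Parkinson's disease"]},
    {"ID": 2,
     "TITL": "Linear modeling of possible mechanisms for Parkinson tremor generation",
     "WORD": ["hascites", "jsubsetIM"]},
    {"ID": 3,
     "TITL": "Drug-induced parkinsonism in the rat- a model for biochemical ...",
     "WORD": ["hascites",
              "Following treatment with reserpine or alternatively with ...",
              "hasabstract"]},
]


def tokens(value):
    """Assumed tokenizer: lowercase, split on non-alphanumeric characters."""
    texts = value if isinstance(value, list) else [value]
    return {t for text in texts for t in re.findall(r"[a-z0-9]+", text.lower())}


def fulltext_ids(docs, field, word):
    """Hypothetical stand-in for FULLTEXT(): IDs of docs whose field holds the whole word."""
    return {d["ID"] for d in docs if word.lower() in tokens(d[field])}


def both_match(docs):
    """IDs matching 'parkinson' in TITL and 'hascites' in WORD (set intersection)."""
    title_ids = fulltext_ids(docs, "TITL", "parkinson")
    word_ids = fulltext_ids(docs, "WORD", "hascites")
    return sorted(title_ids & word_ids)


print(both_match(DOCS))
```

Under the whole-word assumption, document 3 drops out because its title contains only "parkinsonism"; a prefix search would behave differently.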

I found that each of the FULLTEXT queries, when run separately, uses the index:

Execution plan:
 Id   NodeType        Site         Est.   Comment
  1   SingletonNode   DBS             1   * ROOT
  5   IndexNode       DBS        526577     - FOR title IN pmshort   /* fulltext index scan */
  8   RemoteNode      COOR       526577       - REMOTE
  9   GatherNode      COOR       526577       - GATHER 
  4   ReturnNode      COOR       526577       - RETURN title

But when both FOR loops are used, the first one is processed by V8 (JavaScript) and runs on the coordinator, not on the DBS:

Execution plan:
 Id   NodeType            Site           Est.   Comment
  1   SingletonNode       COOR              1   * ROOT
  2   CalculationNode     COOR              1     - LET #2 = FULLTEXT(pmshort   /* all collection documents */, "TITL", "parkinson")   /* v8 expression */
  3   EnumerateListNode   COOR            100     - FOR title IN #2   /* list iteration */
 10   ScatterNode         COOR            100       - SCATTER
 11   RemoteNode          DBS             100       - REMOTE
  9   IndexNode           DBS        52657700       - FOR word IN pmshort   /* fulltext index scan */
  6   CalculationNode     DBS        52657700         - LET #6 = (title.`ID` == word.`ID`)   /* simple expression */   /* collections used: word : pmshort */
  7   FilterNode          DBS        52657700         - FILTER #6
 12   RemoteNode          COOR       52657700         - REMOTE
 13   GatherNode          COOR       52657700         - GATHER 
  8   ReturnNode          COOR       52657700         - RETURN title
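The estimates in this plan appear to follow a nested-loop pattern: the 100 rows estimated for the outer list, multiplied by the 526577 rows estimated for one fulltext index scan, give the 52657700 shown on the inner nodes. The short Python check below reproduces that arithmetic; treating the estimate as a plain product is an inference from the numbers in the two plans, not something the plans state.

```python
# Estimates copied from the two execution plans in the question.
OUTER_ESTIMATE = 100      # EnumerateListNode: FOR title IN #2
INDEX_ESTIMATE = 526577   # IndexNode estimate from the single-FULLTEXT plan


def nested_loop_estimate(outer, inner):
    """Estimated rows for a nested loop: every outer row repeats the inner scan."""
    return outer * inner


print(nested_loop_estimate(OUTER_ESTIMATE, INDEX_ESTIMATE))  # 52657700
```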

Of course, this slows the system down a lot. So my questions are:

1. Why can't the ArangoDB cluster process both conditions on the DBS rather than on the coordinator (COOR)?
2. How can I avoid this situation, given that performance drops by a factor of 300-500?
3. Can somebody point me to additional material to read about this?

Any help is appreciated. Thanks!

It looks like the query optimizer stops looking for further fulltext improvements once it has applied one fulltext transformation in each query/subquery.

A potential fix for this can be found in this pull request (which targets 3.3.10).

Thanks a lot! It should be available in 3.3.10 and the upcoming 3.4, right?
