
Design a counting measure of the element number in a MongoDB array property in cube.js

I'm using cube.js with MongoDB through MongoDB Connector for BI and MongoBI Driver, and so far so good. I'd like to have a cube.js numerical measure that counts the number of elements in a MongoDB array of objects stored in a nested property. Something like:

{
  "nested": {
    "arrayPropertyName": [
      {
        "name": "Leatha Bauch",
        "email": "Leatha.Bauch76@hotmail.com"
      },
      {
        "name": "Pedro Hermiston",
        "email": "Pedro76@hotmail.com"
      }
    ]
  }
}

I wasn't able to figure that out from the docs, and I was wondering whether it is even possible.

I tried with type: count:

    MyNestedArrayPropertyCounter: {
      sql: `${CUBE}.\`nested.arrayPropertyName\``,
      type: `count`,
      format: `number`,
    },

but I'm getting

Error: Error: Unknown column 'nested.arrayPropertyName' in 'field list'

Any help/advice is really appreciated. Thanks

BI treats nested arrays as separate relational tables. See https://www.mongodb.com/blog/post/introducing-the-mongodb-connector-for-bi-20

That's why you get the unknown column error: the array is not part of the parent document table.

So my guess is that you have to build a schema on the nested array table and then build a count measure with a dimension on the parent object id.
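A minimal sketch of what such a schema could look like. The table name `my_collection_nested_arrayPropertyName`, the cube name, and the member names are hypothetical, not taken from the question; the real table and column names depend on how mongosqld maps the collection and should be checked with `SHOW TABLES` and `DESCRIBE` first. The small `cube`/`CUBE` stand-ins at the top exist only so the snippet runs under plain Node.

```javascript
// Stand-ins so this sketch runs under plain Node. In a real Cube schema file,
// `cube` and `CUBE` are supplied by the schema compiler: delete these lines.
const registry = {};
const CUBE = "array_items";
function cube(name, definition) {
  registry[name] = definition;
}

// Hypothetical names: verify the real table and columns against mongosqld.
cube("NestedArrayItems", {
  sql: "SELECT * FROM `my_collection_nested_arrayPropertyName`",

  measures: {
    // Assuming one row per array element, a plain row count is the element count.
    elementCount: {
      type: "count",
    },
  },

  dimensions: {
    // Id of the parent document, assumed to be repeated on every element row.
    parentId: {
      sql: `${CUBE}.\`_id\``,
      type: "string",
    },
  },
});
```

Grouping `elementCount` by `parentId` would then give the array length per parent document, assuming the array table does carry the parent `_id` on each row.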

Hope it helps.

I followed Michael Parshin's advice, and here are my findings and outcomes in overcoming the problem:

  1. LEFT JOIN approach with cube.js joins. I found it painfully slow, and most of the time it ended in a timeout, even when the query was run through command-line SQL clients;

  2. Launch mongosqld with the --prejoin flag. That was a better option, since mongosqld automatically adds master table columns/properties to the secondary tables, thus enabling you to conveniently query cube.js measures without joining a secondary Cube;

  3. Wrote a mongo script that fetches the documents, iterates over them, precalculates the nested.arrayPropertyName count, and persists it in a separate property of the collection documents.
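Option 3 could be sketched roughly as follows. This is not the script from the post: instead of fetching and iterating client-side, it builds a pipeline-style update (available from MongoDB 4.2) that lets the server compute the count. The target field `nested.arrayPropertyCount` and the collection name in the comment are hypothetical.

```javascript
// Hypothetical field name for the stored count; not from the original post.
const COUNT_FIELD = "nested.arrayPropertyCount";
const ARRAY_PATH = "nested.arrayPropertyName";

// Plain-JS version of the count, handy for checking results on sample documents.
function countNestedArray(doc) {
  const items = doc && doc.nested && doc.nested.arrayPropertyName;
  return Array.isArray(items) ? items.length : 0;
}

// Pipeline-style update: $size fails on a missing field, so $ifNull falls back to [].
function buildCountUpdate() {
  return [
    { $set: { [COUNT_FIELD]: { $size: { $ifNull: ["$" + ARRAY_PATH, []] } } } },
  ];
}

// In mongosh (MongoDB 4.2+), against a hypothetical collection:
//   db.myCollection.updateMany({}, buildCountUpdate());
```

Once the count is stored on the parent document, a cube.js measure of type `sum` over that column should be enough, with no join to the array table; the count has to be kept in sync whenever the array changes.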

Conclusion

Leaving out option 1, option 3 significantly outperforms option 2: typically less than a second against more than 20 seconds on my local machine. I compared both options with the same measure and with different timeDimension ranges and granularities.

Most probably I'll incorporate the array count precalculation into the back-end logic that persists the mongo documents.
