
Complex Mongo query in Python

I have a collection named measurements. In this collection, there are only three fields: id (unique), tmstamp (unique) and value. Each row that contains a value greater than 0 is considered an alert. All the alerts that occur between one zero state and the next zero state are considered an episode. I want to query the data so that it is returned in episode format. That is to say, each row is an episode.

To make it easier to understand, here is an example:

{"id":1, tmstamp:1577644027, value:0}
{"id":2, tmstamp:1577644028, value:0}
{"id":3, tmstamp:1577644029, value:1}
{"id":4, tmstamp:1577644030, value:1}
{"id":5, tmstamp:1577644031, value:2}
{"id":6, tmstamp:1577644032, value:2}
{"id":7, tmstamp:1577644033, value:3}
{"id":8, tmstamp:1577644034, value:2}
{"id":9, tmstamp:1577644035, value:1}
{"id":10, tmstamp:1577644036, value:0}
{"id":11, tmstamp:1577644037, value:1}
{"id":12, tmstamp:1577644038, value:1}
{"id":13, tmstamp:1577644039, value:1}
{"id":14, tmstamp:1577644040, value:0}

Given this data, the episodes would be:

episode1:

{"id":3, tmstamp:1577644029, value:1}
{"id":4, tmstamp:1577644030, value:1}
{"id":5, tmstamp:1577644031, value:2}
{"id":6, tmstamp:1577644032, value:2}
{"id":7, tmstamp:1577644033, value:3}
{"id":8, tmstamp:1577644034, value:2}
{"id":9, tmstamp:1577644035, value:1}

episode2:

{"id":11, tmstamp:1577644037, value:1}
{"id":12, tmstamp:1577644038, value:1}
{"id":13, tmstamp:1577644039, value:1}
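
To pin down the expected output, the grouping rule can be sketched in plain Python on the client side. This is only an illustration of the rule, not a server-side solution; the helper name split_into_episodes is hypothetical, and the sketch assumes that ordering by tmstamp gives the right sequence. Unlike the strict definition above, it would also keep a trailing run of alerts that is not closed by a zero.

```python
from itertools import groupby

# Sample data rebuilt from the example above (ids 1..14, consecutive timestamps).
values = [0, 0, 1, 1, 2, 2, 3, 2, 1, 0, 1, 1, 1, 0]
measurements = [
    {"id": i + 1, "tmstamp": 1577644027 + i, "value": v}
    for i, v in enumerate(values)
]


def split_into_episodes(rows):
    """Group consecutive rows with value > 0 into episodes (hypothetical helper)."""
    ordered = sorted(rows, key=lambda r: r["tmstamp"])
    return [
        list(run)
        for is_alert, run in groupby(ordered, key=lambda r: r["value"] > 0)
        if is_alert  # drop the zero-state runs that separate episodes
    ]


episodes = split_into_episodes(measurements)
print([[r["id"] for r in ep] for ep in episodes])
# [[3, 4, 5, 6, 7, 8, 9], [11, 12, 13]]
```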

My question is: is there any way to query the data in Mongo so that the result comes back in this format, without having to do these operations after the query itself?

You need to combine $facet and the array expression operators.
As @aws_apprentice mentioned, $bucket will do it for you only if you know the ids of the zero-state measurements beforehand, because boundaries does not accept expressions.
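
For reference, the $bucket variant could look roughly like the sketch below when the zero-state ids are already known on the client. The function name build_bucket_pipeline, the "outside" default label and the measurements output field are assumptions made for illustration; the boundaries have to be passed as literal, ascending values, which is exactly the limitation mentioned above.

```python
def build_bucket_pipeline(zero_ids):
    """Build a $bucket pipeline from already-known zero-state ids (hypothetical helper)."""
    return [
        # Keep only the non-zero measurements; the zero rows act as separators.
        {"$match": {"value": {"$gt": 0}}},
        {
            "$bucket": {
                "groupBy": "$_id",
                # Literal values in ascending order, no expressions allowed here.
                "boundaries": sorted(zero_ids),
                # Catch-all bucket for ids outside the known boundaries.
                "default": "outside",
                "output": {"measurements": {"$push": "$$ROOT"}},
            }
        },
    ]


pipeline = build_bucket_pipeline([1, 2, 10, 14])
```

Each resulting bucket has the lower zero-state id as its _id, so with the sample data the two episodes would land in the buckets 2 and 10.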

So, we need to separate zero-state and non-zero-state data. Let's call them alerts (value = 0) and episodes (value > 0). Note that alerts here holds the zero-state rows, which is the opposite of how the question uses the word.
For alerts, we store the _id of every zero-state measurement in an array (we need it to filter the episodes). With $indexOfArray and $arrayElemAt we can take the next _id, at position i+1, and keep the episodes whose _id lies between the ids at positions i and i+1.
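
The idea can be mirrored in plain Python to make the index arithmetic easier to follow. This is only a sketch of the logic, not what the server executes; the function name episodes_between_zero_ids is hypothetical, and it assumes the zero-state ids are collected in ascending order and that _id grows with time.

```python
def episodes_between_zero_ids(docs):
    """Mimic the facet/filter logic client-side (hypothetical helper)."""
    zero_ids = [d["_id"] for d in docs if d["value"] == 0]  # the "alerts" facet
    non_zero = [d for d in docs if d["value"] > 0]          # the "episodes" facet
    result = {}
    for i, start in enumerate(zero_ids[:-1]):
        end = zero_ids[i + 1]  # next zero-state id, position i + 1
        between = [d for d in non_zero if start < d["_id"] < end]
        if between:  # skip pairs of adjacent zero-state rows
            result[f"Episode{i}"] = between
    return result


sample_values = [0, 0, 1, 1, 2, 2, 3, 2, 1, 0, 1, 1, 1, 0]
sample_docs = [{"_id": i + 1, "value": v} for i, v in enumerate(sample_values)]
grouped = episodes_between_zero_ids(sample_docs)
print({k: [d["_id"] for d in v] for k, v in grouped.items()})
# {'Episode1': [3, 4, 5, 6, 7, 8, 9], 'Episode2': [11, 12, 13]}
```

The episode label is the position of the opening zero-state id in the array, so the numbering is not necessarily consecutive when several zero rows follow each other.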

ASSUMPTIONS

I've replaced id with _id to perform the aggregation
You know how to translate a MongoDB aggregate command into Python syntax

db.measurements.aggregate([
  {
    $facet: {
      alerts: [
        {
          $match: {
            value: 0
          }
        },
        {
          $group: {
            _id: "",
            ids: {
              $push: "$_id"
            }
          }
        }
      ],
      episodes: [
        {
          $match: {
            value: {
              $gt: 0
            }
          }
        }
      ]
    }
  },
  {
    $unwind: "$alerts"
  },
  {
    $addFields: {
      alert_idx: "$alerts.ids"
    }
  },
  {
    $unwind: "$alerts.ids"
  },
  {
    $project: {
      "k": {
        $concat: [
          "Episode",
          {
            $toString: {
              $indexOfArray: [
                "$alert_idx",
                "$alerts.ids"
              ]
            }
          }
        ]
      },
      "v": {
        $filter: {
          input: "$episodes",
          cond: {
            $and: [
              {
                $gt: [
                  "$$this._id",
                  "$alerts.ids"
                ]
              },
              {
                $lt: [
                  "$$this._id",
                  {
                    $arrayElemAt: [
                      "$alert_idx",
                      {
                        $sum: [
                          {
                            $indexOfArray: [
                              "$alert_idx",
                              "$alerts.ids"
                            ]
                          },
                          1
                        ]
                      }
                    ]
                  }
                ]
              }
            ]
          }
        }
      }
    }
  },
  {
    $match: {
      "v": {
        $ne: []
      }
    }
  }
])

MongoPlayground
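
For completeness, a possible PyMongo translation of the pipeline above is sketched here. The stages are copied one to one from the shell version; the function names, the connection URI parameter and the default database name "test" are assumptions, and fetch_episodes needs the pymongo package and a reachable server, so it is only defined here and not called.

```python
def build_episode_pipeline():
    """Return the aggregation pipeline above as Python dicts (hypothetical helper)."""
    # Position of the current zero-state id inside the full array of zero-state ids.
    current_idx = {"$indexOfArray": ["$alert_idx", "$alerts.ids"]}
    # The zero-state id that follows the current one (position + 1).
    next_zero_id = {"$arrayElemAt": ["$alert_idx", {"$sum": [current_idx, 1]}]}
    return [
        {"$facet": {
            "alerts": [
                {"$match": {"value": 0}},
                {"$group": {"_id": "", "ids": {"$push": "$_id"}}},
            ],
            "episodes": [{"$match": {"value": {"$gt": 0}}}],
        }},
        {"$unwind": "$alerts"},
        {"$addFields": {"alert_idx": "$alerts.ids"}},
        {"$unwind": "$alerts.ids"},
        {"$project": {
            "k": {"$concat": ["Episode", {"$toString": current_idx}]},
            "v": {"$filter": {
                "input": "$episodes",
                "cond": {"$and": [
                    {"$gt": ["$$this._id", "$alerts.ids"]},
                    {"$lt": ["$$this._id", next_zero_id]},
                ]},
            }},
        }},
        {"$match": {"v": {"$ne": []}}},
    ]


def fetch_episodes(uri, db_name="test", coll_name="measurements"):
    """Run the pipeline against a live server; uri and db_name are assumptions."""
    from pymongo import MongoClient  # imported lazily so the builder works without it

    client = MongoClient(uri)
    try:
        return list(client[db_name][coll_name].aggregate(build_episode_pipeline()))
    finally:
        client.close()
```

The main differences from the shell syntax are that every operator and field name becomes a quoted string key, and the repeated $indexOfArray expression can be stored once in a Python variable and reused.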
