Complex Mongo query in Python
I have a collection named measurements. It has only three fields: id (unique), tmstamp (unique) and value. Each row whose value is greater than 0 is considered an alert. All the alerts that occur between one zero state and the next zero state are considered an episode. I want to query the data so that it is returned in episode format, that is to say, each row is an episode.
To make this easier to understand, here is an example:
{"id":1, tmstamp:1577644027, value:0}
{"id":2, tmstamp:1577644028, value:0}
{"id":3, tmstamp:1577644029, value:1}
{"id":4, tmstamp:1577644030, value:1}
{"id":5, tmstamp:1577644031, value:2}
{"id":6, tmstamp:1577644032, value:2}
{"id":7, tmstamp:1577644033, value:3}
{"id":8, tmstamp:1577644034, value:2}
{"id":9, tmstamp:1577644035, value:1}
{"id":10, tmstamp:1577644036, value:0}
{"id":11, tmstamp:1577644037, value:1}
{"id":12, tmstamp:1577644038, value:1}
{"id":13, tmstamp:1577644039, value:1}
{"id":14, tmstamp:1577644040, value:0}
Given this data, the episodes would be:
episode1:
{"id":3, tmstamp:1577644029, value:1}
{"id":4, tmstamp:1577644030, value:1}
{"id":5, tmstamp:1577644031, value:2}
{"id":6, tmstamp:1577644032, value:2}
{"id":7, tmstamp:1577644033, value:3}
{"id":8, tmstamp:1577644034, value:2}
{"id":9, tmstamp:1577644035, value:1}
episode2:
{"id":11, tmstamp:1577644037, value:1}
{"id":12, tmstamp:1577644038, value:1}
{"id":13, tmstamp:1577644039, value:1}
My question is: is there any way to query the data in Mongo so that the result comes back in this format, without having to do these operations after the query itself?
You need to combine $facet and the array expression operators.
As @aws_apprentice mentioned, $bucket will do it for you if you know the ids of the zero-state measurements in advance, because boundaries does not accept expressions.
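For the case where the zero-state ids are already known, a $bucket pipeline could look roughly like the sketch below. It assumes the id field is stored as _id, that the boundaries [2, 10, 14] were obtained beforehand from the sample data, and that zero-state rows are filtered out first so they do not land in a bucket. The helper name build_bucket_pipeline and the "outside" default label are made up for illustration.

```python
def build_bucket_pipeline(boundaries):
    """Build a pipeline that groups non-zero rows between known zero-state ids."""
    return [
        # Drop the zero-state rows; only alert rows should end up in buckets.
        {"$match": {"value": {"$gt": 0}}},
        {"$bucket": {
            "groupBy": "$_id",
            # Boundaries must be literal values in ascending order.
            "boundaries": sorted(boundaries),
            # Rows outside every boundary go to this catch-all bucket.
            "default": "outside",
            "output": {"measurements": {"$push": "$$ROOT"}},
        }},
    ]


pipeline = build_bucket_pipeline([2, 10, 14])
# With pymongo this would be passed as: db.measurements.aggregate(pipeline)
```

Each resulting bucket document carries the lower boundary as its _id and the rows of one episode in measurements.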
So, we need to separate the zero-state and non-zero-state data. Let's call them alerts (value = 0) and episodes (value > 0). Note that in this answer "alerts" names the zero-state rows that delimit the episodes, unlike in the question.
For alerts, we store the _id of each zero-state measurement in an array (we need it to filter the episodes). With $indexOfArray and $arrayElemAt we can look up the next zero-state _id (position i+1) and keep the episodes rows whose _id lies between the ids at positions i and i+1.
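To make the pairing logic concrete, here is a plain-Python sketch of the same idea run on the sample data from the question. It is only an illustration of what the aggregation computes, not a replacement for it; the function name split_into_episodes is made up, and the id field is assumed to be stored as _id.

```python
def split_into_episodes(measurements):
    """Group non-zero rows that lie between consecutive zero-state ids."""
    # Ids of the zero-state rows, playing the role of the "alerts" array.
    zero_ids = sorted(m["_id"] for m in measurements if m["value"] == 0)
    # Non-zero rows, playing the role of the "episodes" facet.
    nonzero = [m for m in measurements if m["value"] > 0]

    episodes = {}
    # The last zero-state id has no successor, so it opens no episode.
    for i, start in enumerate(zero_ids[:-1]):
        end = zero_ids[i + 1]
        rows = [m for m in nonzero if start < m["_id"] < end]
        if rows:  # skip empty gaps such as two adjacent zero-state rows
            episodes[f"Episode{i}"] = rows
    return episodes


values = [0, 0, 1, 1, 2, 2, 3, 2, 1, 0, 1, 1, 1, 0]
sample = [
    {"_id": i + 1, "tmstamp": 1577644027 + i, "value": v}
    for i, v in enumerate(values)
]
episodes = split_into_episodes(sample)
for name, rows in episodes.items():
    print(name, [row["_id"] for row in rows])
```

The episode label is the position of the opening zero-state id in the array, which is why the numbering depends on how many zero-state rows precede each episode.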
ASSUMPTIONS
I've replaced id with _id to perform the aggregation.
You know how to translate a MongoDB aggregate command into Python syntax.
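db.measurements.aggregate([
  {
    $facet: {
      // Zero-state rows: collect their _id values into a single array.
      alerts: [
        { $match: { value: 0 } },
        { $group: { _id: "", ids: { $push: "$_id" } } }
      ],
      // Non-zero rows: the candidates to be split into episodes.
      episodes: [
        { $match: { value: { $gt: 0 } } }
      ]
    }
  },
  { $unwind: "$alerts" },
  // Keep a copy of the full id array before unwinding it.
  { $addFields: { alert_idx: "$alerts.ids" } },
  // One document per zero-state id.
  { $unwind: "$alerts.ids" },
  {
    $project: {
      // Label: "Episode" + position of this zero-state id in the array.
      "k": {
        $concat: [
          "Episode",
          { $toString: { $indexOfArray: [ "$alert_idx", "$alerts.ids" ] } }
        ]
      },
      // Rows whose _id lies between this zero-state id and the next one.
      "v": {
        $filter: {
          input: "$episodes",
          cond: {
            $and: [
              { $gt: [ "$$this._id", "$alerts.ids" ] },
              {
                $lt: [
                  "$$this._id",
                  {
                    $arrayElemAt: [
                      "$alert_idx",
                      { $sum: [ { $indexOfArray: [ "$alert_idx", "$alerts.ids" ] }, 1 ] }
                    ]
                  }
                ]
              }
            ]
          }
        }
      }
    }
  },
  // Drop the gaps that contain no rows.
  { $match: { "v": { $ne: [] } } }
])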
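For reference, the shell pipeline above translates to Python roughly as follows. This is a sketch: the function names, the connection URI and the database name are placeholders, and pymongo is imported only inside fetch_episodes so that the pipeline itself can be built and inspected without a running server.

```python
def build_episode_pipeline():
    """Return the aggregation pipeline as a list of Python dicts."""
    # Position of the current zero-state id in the full id array.
    idx = {"$indexOfArray": ["$alert_idx", "$alerts.ids"]}
    # The zero-state id that follows the current one.
    next_id = {"$arrayElemAt": ["$alert_idx", {"$sum": [idx, 1]}]}
    return [
        {"$facet": {
            "alerts": [
                {"$match": {"value": 0}},
                {"$group": {"_id": "", "ids": {"$push": "$_id"}}},
            ],
            "episodes": [{"$match": {"value": {"$gt": 0}}}],
        }},
        {"$unwind": "$alerts"},
        {"$addFields": {"alert_idx": "$alerts.ids"}},
        {"$unwind": "$alerts.ids"},
        {"$project": {
            "k": {"$concat": ["Episode", {"$toString": idx}]},
            "v": {"$filter": {
                "input": "$episodes",
                "cond": {"$and": [
                    {"$gt": ["$$this._id", "$alerts.ids"]},
                    {"$lt": ["$$this._id", next_id]},
                ]},
            }},
        }},
        {"$match": {"v": {"$ne": []}}},
    ]


def fetch_episodes(uri, db_name):
    """Run the pipeline; uri and db_name are placeholders for your setup."""
    from pymongo import MongoClient  # third-party, imported lazily

    client = MongoClient(uri)
    return list(client[db_name].measurements.aggregate(build_episode_pipeline()))


pipeline = build_episode_pipeline()
```

Each returned document has a k field with the episode label and a v field with the rows of that episode.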