MongoDB, PyMongo: how to filter results by the count of a unique field?
MongoDB contains the following set of data:
[{"user": "a", "domain": "some.com"},
{"user": "b", "domain": "some.com"},
{"user": "b1", "domain": "some.com"},
{"user": "c", "domain": "test.com"},
{"user": "d", "domain": "work.com"},
{"user": "aaa", "domain": "work.com"},
{"user": "some user", "domain": "work.com"} ]
I need to select the first items per domain, with no more than 2 items sharing the same domain in the result. After the Mongo query, the result should look like this:
[{"user": "a", "domain": "some.com"},
{"user": "b", "domain": "some.com"},
{"user": "c", "domain": "test.com"},
{"user": "d", "domain": "work.com"},
{"user": "aaa", "domain": "work.com"}]
Only 2 results with the same domain should be returned; any further results with that domain must be skipped. Is this possible with aggregation, $filter, or something else?
Is there a way to group by domain and get only the first N (2 in this example) users? Example:
[{"domain": "some.com", "users": ["a", "b"]}]
so {"user": "b1", "domain": "some.com"} will be skipped.
You can get the desired result with a MongoDB aggregation pipeline.
It consists of four stages:
1. Group by the domain field and accumulate the documents that share a domain into a data array.
2. Slice the array to keep at most 2 items per domain.
3. Flatten the data field with the $unwind operator.
4. Restore the original document structure with the $replaceRoot operator.
db.collection.aggregate([
  {
    "$group": {
      "_id": "$domain",
      "data": { "$push": "$$ROOT" }
    }
  },
  {
    "$addFields": {
      "data": {
        "$slice": [ "$data", 0, 2 ]
      }
    }
  },
  {
    "$unwind": "$data"
  },
  {
    "$replaceRoot": { "newRoot": "$data" }
  }
])
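Since the question mentions PyMongo, the same pipeline can be written as a Python list of dicts and passed to `collection.aggregate()`. The sketch below is not part of the original answer: the function names `build_pipeline`, `build_grouped_pipeline` and `first_per_domain` are hypothetical, and the leading `$sort` on `_id` is an added assumption so that "first N" refers to insertion order rather than whatever order the documents happen to arrive in. The second builder is one possible way to get the grouped `{"domain": ..., "users": [...]}` shape asked about in the question.

```python
from typing import Any, Dict, List


def build_pipeline(limit: int = 2) -> List[Dict[str, Any]]:
    """Stages that keep at most `limit` whole documents per domain."""
    return [
        # Assumption: sort by _id so the "first" documents are well defined.
        {"$sort": {"_id": 1}},
        # Collect every document of a domain into the `data` array.
        {"$group": {"_id": "$domain", "data": {"$push": "$$ROOT"}}},
        # Keep only the first `limit` elements of that array.
        {"$addFields": {"data": {"$slice": ["$data", 0, limit]}}},
        # One output document per remaining array element.
        {"$unwind": "$data"},
        # Promote the embedded document back to the top level.
        {"$replaceRoot": {"newRoot": "$data"}},
    ]


def build_grouped_pipeline(limit: int = 2) -> List[Dict[str, Any]]:
    """Stages that return one document per domain with its first user names."""
    return [
        {"$sort": {"_id": 1}},
        {"$group": {"_id": "$domain", "users": {"$push": "$user"}}},
        {
            "$project": {
                "_id": 0,
                "domain": "$_id",
                "users": {"$slice": ["$users", limit]},
            }
        },
    ]


def first_per_domain(collection, limit: int = 2) -> List[Dict[str, Any]]:
    """Run the flat pipeline on any object exposing `aggregate(pipeline)`."""
    return list(collection.aggregate(build_pipeline(limit)))
```

With a real PyMongo collection (for example `MongoClient()["test"]["users"]`, where the database and collection names are placeholders), `first_per_domain(users)` returns the filtered documents. Note that `$group` does not guarantee the order of its output groups, so append a final `$sort` stage if the order of domains in the result matters.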