Get count of unique ObjectIds from an array in MongoDB
I'm new to MongoDB and there is a lot I don't know yet. I need to write an aggregation request. Here is the JSON document structure.
{
"_id" : ObjectId("5a72f7a75ef7d430e8c462d2"),
"crawler_id" : ObjectId("5a71cbb746e0fb0007adc6c2"),
"skill" : "stack",
"created_date" : ISODate("2018-02-01T13:19:03.522+0000"),
"modified_date" : ISODate("2018-02-01T13:22:23.078+0000"),
"connects" : [
{
"subskill" : "we’re",
"weight" : NumberInt(1),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec11")
]
},
{
"subskill" : "b1",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec11"),
ObjectId("5a71d88d5ef7d41964fbec1b")
]
},
{
"subskill" : "making",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec1b"),
ObjectId("5a71d88d5ef7d41964fbec1c")
]
},
{
"subskill" : "delivery",
"weight" : NumberInt(2),
"parser_id" : [
ObjectId("5a71d88d5ef7d41964fbec1c"),
ObjectId("5a71d88d5ef7d41964fbec1e")
]
}
]
}
I need the result to return the skill name and the number of unique parser_id values. In this case, the result should be:
[
{
"skill": "stack",
"quantity": 4
}
]
where "stack" is the skill name and "quantity" is the count of unique parser_id values:
ObjectId("5a71d88d5ef7d41964fbec11")
ObjectId("5a71d88d5ef7d41964fbec1b")
ObjectId("5a71d88d5ef7d41964fbec1c")
ObjectId("5a71d88d5ef7d41964fbec1e")
Can someone help me with this request?
Given the document supplied in your question, this command ...
db.collection.aggregate([
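{ $unwind: "$connects" },
// unwind the inner array too, so each parser_id is compared individually rather than as a whole array
{ $unwind: "$connects.parser_id" },
// one group per distinct (skill, parser_id) pair
{ "$group": { "_id": {skill: "$skill", parser_id: "$connects.parser_id"}, "count": { "$sum": 1 } }},
// count the distinct pairs for each skill
{ "$group": { "_id": "$_id.skill", "quantity": { "$sum": 1 } }},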
// (optional) rename the '_id' attribute to 'skill'
{ $project: { 'skill': '$_id', 'quantity': 1, _id: 0 } }
])
... will return:
{
"quantity" : 4,
"skill" : "stack"
}
The above command groups by skill and connects.parser_id and then gets a distinct count of those groups.
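To make the group-then-count idea easier to follow, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB at all. It assumes the ObjectIds are written as plain hex strings, keeps only the fields that matter for the count, and uses a hypothetical helper name (countDistinctParsers). Note that it walks into each parser_id array, so individual ids are compared rather than whole arrays.

```javascript
// Sample data from the question, reduced to the fields used here.
// Assumption: ObjectIds are represented as plain hex strings.
const sampleDocs = [{
  skill: "stack",
  connects: [
    { parser_id: ["5a71d88d5ef7d41964fbec11"] },
    { parser_id: ["5a71d88d5ef7d41964fbec11", "5a71d88d5ef7d41964fbec1b"] },
    { parser_id: ["5a71d88d5ef7d41964fbec1b", "5a71d88d5ef7d41964fbec1c"] },
    { parser_id: ["5a71d88d5ef7d41964fbec1c", "5a71d88d5ef7d41964fbec1e"] }
  ]
}];

// Hypothetical helper: counts distinct (skill, parser_id) pairs per skill.
function countDistinctParsers(documents) {
  const idsBySkill = new Map(); // skill -> Set of distinct parser ids
  for (const doc of documents) {
    for (const connect of doc.connects) {          // one step per "connects" element
      for (const parserId of connect.parser_id) {  // one step per id inside the element
        if (!idsBySkill.has(doc.skill)) {
          idsBySkill.set(doc.skill, new Set());
        }
        // Adding to a Set keeps one entry per distinct (skill, parser_id) pair.
        idsBySkill.get(doc.skill).add(parserId);
      }
    }
  }
  // One output row per skill, with the number of distinct ids as "quantity".
  return [...idsBySkill].map(([skill, ids]) => ({ skill, quantity: ids.size }));
}

const groupedResult = countDistinctParsers(sampleDocs);
console.log(JSON.stringify(groupedResult)); // [{"skill":"stack","quantity":4}]
```

The Set plays the role of the first grouping step (one entry per distinct pair), and reading its size plays the role of the second (counting those entries per skill).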
Your question includes the java tag, so I suspect you are looking to execute the same command using the MongoDB Java driver. The code below (using MongoDB Java driver v3.x) will return the same result:
MongoClient mongoClient = ...;
MongoCollection<Document> collection = mongoClient.getDatabase("...").getCollection("...");
List<Document> documents = collection.aggregate(Arrays.asList(
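Aggregates.unwind("$connects"),
// unwind the inner array too, so each parser_id is compared individually rather than as a whole array
Aggregates.unwind("$connects.parser_id"),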
new Document("$group", new Document("_id", new Document("skill", "$skill").append("parser_id", "$connects.parser_id"))
.append("count", new Document("$sum", 1))),
new Document("$group", new Document("_id", "$_id.skill").append("quantity", new Document("$sum", 1))),
new Document("$project", new Document("skill", "$_id").append("quantity", 1).append("_id", 0))
)).into(new ArrayList<>());
for (Document document : documents) {
logger.info("{}", document.toJson());
}
Note: this code deliberately uses the form new Document(<pipeline aggregator>, ...) instead of the Aggregates utilities to make it easier to see the translation between the shell command and its Java equivalent.
Try $project with $reduce. $setUnion is used to keep only the distinct ids, and finally $size is used to get the count of the distinct array.
db.col.aggregate(
[
{$project : {
_id : 0,
skill : 1,
quantity : {$size :{$reduce : {input : "$connects.parser_id", initialValue : [] , in : {$setUnion : ["$$value", "$$this"]}}}}
}
}
]
).pretty()
Result:
{ "skill" : "stack", "quantity" : 4 }
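For readers who want to see what the $reduce / $setUnion / $size combination computes step by step, the following is a rough plain JavaScript analogue that runs without a database. It is only a sketch: ObjectIds are assumed to be plain hex strings, the document is trimmed to the relevant fields, and setUnion is a hypothetical stand-in written for this example, not a MongoDB or driver function.

```javascript
// The question's document, trimmed to the fields used here.
// Assumption: ObjectIds are represented as plain hex strings.
const sampleDoc = {
  skill: "stack",
  connects: [
    { parser_id: ["5a71d88d5ef7d41964fbec11"] },
    { parser_id: ["5a71d88d5ef7d41964fbec11", "5a71d88d5ef7d41964fbec1b"] },
    { parser_id: ["5a71d88d5ef7d41964fbec1b", "5a71d88d5ef7d41964fbec1c"] },
    { parser_id: ["5a71d88d5ef7d41964fbec1c", "5a71d88d5ef7d41964fbec1e"] }
  ]
};

// Hypothetical stand-in for $setUnion: merges two arrays and drops duplicates.
function setUnion(left, right) {
  return [...new Set([...left, ...right])];
}

// "$connects.parser_id" resolves to an array of arrays, one per connects element.
const parserIdArrays = sampleDoc.connects.map((connect) => connect.parser_id);

// Like $reduce with initialValue [] and in: { $setUnion: ["$$value", "$$this"] }.
const distinctIds = parserIdArrays.reduce(
  (value, current) => setUnion(value, current),
  []
);

// Like the $project stage: keep skill, and take the size of the distinct array.
const projected = { skill: sampleDoc.skill, quantity: distinctIds.length };
console.log(JSON.stringify(projected)); // {"skill":"stack","quantity":4}
```

Because the whole computation happens per document, this approach needs no unwinding or grouping, which is why the pipeline can be a single $project stage.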