Get count of unique ObjectId from array MongoDB

I'm new to MongoDB and there is a lot I don't know yet. I need to write an aggregation query. Here is the JSON document structure:

{ 
    "_id" : ObjectId("5a72f7a75ef7d430e8c462d2"), 
    "crawler_id" : ObjectId("5a71cbb746e0fb0007adc6c2"), 
    "skill" : "stack", 
    "created_date" : ISODate("2018-02-01T13:19:03.522+0000"), 
    "modified_date" : ISODate("2018-02-01T13:22:23.078+0000"), 
    "connects" : [
        {
            "subskill" : "we’re", 
            "weight" : NumberInt(1), 
            "parser_id" : [
                ObjectId("5a71d88d5ef7d41964fbec11")
            ]
        }, 
        {
            "subskill" : "b1", 
            "weight" : NumberInt(2), 
            "parser_id" : [
                ObjectId("5a71d88d5ef7d41964fbec11"), 
                ObjectId("5a71d88d5ef7d41964fbec1b")
            ]
        }, 
        {
            "subskill" : "making", 
            "weight" : NumberInt(2), 
            "parser_id" : [
                ObjectId("5a71d88d5ef7d41964fbec1b"), 
                ObjectId("5a71d88d5ef7d41964fbec1c")
            ]
        }, 
        {
            "subskill" : "delivery", 
            "weight" : NumberInt(2), 
            "parser_id" : [
                ObjectId("5a71d88d5ef7d41964fbec1c"), 
                ObjectId("5a71d88d5ef7d41964fbec1e")
            ]
        }
    ]
}

I need the result to return the name of the skill and the number of unique parser_id values. In this case, the result should be:

[
    {
        "skill": "stack",
        "quantity": 4
    }
]

where "stack" is the skill name and "quantity" is the count of unique parser_id values, namely:

ObjectId("5a71d88d5ef7d41964fbec11")
ObjectId("5a71d88d5ef7d41964fbec1b")
ObjectId("5a71d88d5ef7d41964fbec1c")
ObjectId("5a71d88d5ef7d41964fbec1e")

Can someone help me with this query?

Given the document supplied in your question, this command ...

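db.collection.aggregate([
    // emit one document per element of 'connects'
    { $unwind: "$connects" },

    // emit one document per individual parser_id; without this stage the
    // grouping below would compare whole parser_id arrays, not single ids
    { $unwind: "$connects.parser_id" },

    // one group per distinct (skill, parser_id) pair; 'count' holds its occurrences
    { "$group": { "_id": {skill: "$skill", parser_id: "$connects.parser_id"}, "count": { "$sum": 1 } }},

    // count the distinct pairs per skill
    { "$group": { "_id": "$_id.skill", "quantity": { "$sum": 1 } }},

    // (optional) rename the '_id' attribute to 'skill'
    { $project: { 'skill': '$_id', 'quantity': 1, _id: 0 } }
])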

... will return:

{
    "quantity" : 4,
    "skill" : "stack"
}

The above command groups by skill and connects.parser_id, which produces one group per distinct combination, and then counts those groups for each skill.
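The group-then-count idea can also be tried without a database. The following is a minimal in-memory sketch in plain JavaScript, not MongoDB code: the helper name countDistinctParserIds is hypothetical, and the ObjectIds are written as plain hex strings on the assumption that string comparison is an acceptable stand-in for ObjectId equality. It visits every individual id, collects the distinct (skill, parser_id) pairs, and then counts the pairs per skill.

```javascript
// Sample data mirroring the question's document. ObjectIds are represented
// as hex strings (an assumption, so the sketch runs without MongoDB).
const doc = {
  skill: "stack",
  connects: [
    { subskill: "we’re", parser_id: ["5a71d88d5ef7d41964fbec11"] },
    { subskill: "b1", parser_id: ["5a71d88d5ef7d41964fbec11", "5a71d88d5ef7d41964fbec1b"] },
    { subskill: "making", parser_id: ["5a71d88d5ef7d41964fbec1b", "5a71d88d5ef7d41964fbec1c"] },
    { subskill: "delivery", parser_id: ["5a71d88d5ef7d41964fbec1c", "5a71d88d5ef7d41964fbec1e"] },
  ],
};

// Hypothetical helper: returns [{ skill, quantity }] for a list of documents.
function countDistinctParserIds(docs) {
  // First pass: one set entry per distinct (skill, parser_id) pair.
  const pairs = new Set();
  for (const d of docs) {
    for (const connect of d.connects) {
      for (const id of connect.parser_id) {
        pairs.add(JSON.stringify([d.skill, id]));
      }
    }
  }
  // Second pass: count how many distinct pairs belong to each skill.
  const quantityBySkill = new Map();
  for (const key of pairs) {
    const [skill] = JSON.parse(key);
    quantityBySkill.set(skill, (quantityBySkill.get(skill) || 0) + 1);
  }
  return [...quantityBySkill].map(([skill, quantity]) => ({ skill, quantity }));
}

console.log(countDistinctParserIds([doc]));
```

For the sample document this yields a single entry with skill "stack" and quantity 4, matching the expected result in the question.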

Your question includes the java tag, so I suspect you want to execute the same command using the MongoDB Java driver. The code below (using MongoDB Java driver v3.x) will return the same result:

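import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Aggregates;
import org.bson.Document;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

MongoClient mongoClient = ...;

MongoCollection<Document> collection = mongoClient.getDatabase("...").getCollection("...");

List<Document> documents = collection.aggregate(Arrays.asList(
        // one document per 'connects' element, then one per individual parser_id
        Aggregates.unwind("$connects"),
        Aggregates.unwind("$connects.parser_id"),
        // one group per distinct (skill, parser_id) pair
        new Document("$group", new Document("_id", new Document("skill", "$skill").append("parser_id", "$connects.parser_id"))
                .append("count", new Document("$sum", 1))),
        // count the distinct pairs per skill
        new Document("$group", new Document("_id", "$_id.skill").append("quantity", new Document("$sum", 1))),
        // rename '_id' to 'skill'
        new Document("$project", new Document("skill", "$_id").append("quantity", 1).append("_id", 0))
)).into(new ArrayList<>());

for (Document document : documents) {
    logger.info("{}", document.toJson());
}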

Note: this code deliberately uses the form new Document(<pipeline stage>, ...) for the $group and $project stages instead of the Aggregates utilities, to make it easier to see the translation between the shell command and its Java equivalent.

Try $project with $reduce.

$reduce merges the parser_id arrays one at a time, $setUnion keeps only the distinct ids while merging, and finally $size returns the number of elements in the resulting array of distinct ids.

db.col.aggregate(
    [
        {$project : {
                _id : 0,
                skill : 1,
                quantity : {$size :{$reduce : {input : "$connects.parser_id", initialValue : [] , in : {$setUnion : ["$$value", "$$this"]}}}}
            }
        }
    ]
).pretty()

Result:

{ "skill" : "stack", "quantity" : 4 }
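The same reduce-with-union logic can be sketched in plain JavaScript to show what the pipeline computes step by step. This is only an illustration under stated assumptions: setUnion and distinctParserIdCount are hypothetical helpers, not MongoDB APIs, and the ObjectIds are represented as hex strings so the example runs without a database.

```javascript
// Sample data mirroring the question's document, with ObjectIds written as
// hex strings (an assumption, so the sketch runs without MongoDB).
const sampleDoc = {
  skill: "stack",
  connects: [
    { parser_id: ["5a71d88d5ef7d41964fbec11"] },
    { parser_id: ["5a71d88d5ef7d41964fbec11", "5a71d88d5ef7d41964fbec1b"] },
    { parser_id: ["5a71d88d5ef7d41964fbec1b", "5a71d88d5ef7d41964fbec1c"] },
    { parser_id: ["5a71d88d5ef7d41964fbec1c", "5a71d88d5ef7d41964fbec1e"] },
  ],
};

// Hypothetical stand-in for $setUnion on two arrays: keeps each value once.
function setUnion(a, b) {
  return [...new Set([...a, ...b])];
}

// Hypothetical helper mirroring the $project stage.
function distinctParserIdCount(doc) {
  // Like "$connects.parser_id": an array holding each connect's id array.
  const idArrays = doc.connects.map((connect) => connect.parser_id);
  // Like $reduce with initialValue [] and $setUnion of $$value and $$this.
  const distinctIds = idArrays.reduce((value, current) => setUnion(value, current), []);
  // Like $size applied to the merged array.
  return { skill: doc.skill, quantity: distinctIds.length };
}

console.log(distinctParserIdCount(sampleDoc));
```

Because the union is applied per document, no $unwind or $group stage is involved, which is why this approach fits in a single $project.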
