
Get unique values from arrays per record in MongoDB

I have a collection in MongoDB that looks like this:

{
    "_id" : ObjectId("56d3e53b965b57e4d1eb3e71"),
    "name" : "John",
    "posts" : [
                 {
                    "topic" : "Harry Potter",
                    "obj_ids" : [
                            "1234"
                    ],
                    "dates_posted" : [
                            "2014-12-24"
                    ]
                 },
                 {
                    "topic" : "Daniel Radcliffe",
                    "obj_ids" : [
                            "1235",
                            "1236",
                            "1237"
                    ],
                    "dates_posted" : [
                            "2014-12-22",
                            "2015-01-13",
                            "2014-12-24"
                    ]
                 }
              ],
},
{
    "_id" : ObjectId("56d3e53b965b57e4d1eb3e72"),
    "name" : "Jane",
    "posts" : [
                 {
                    "topic" : "Eragon",
                    "tweet_ids" : [
                            "1672",
                            "1673",
                            "1674"
                    ],
                    "dates_posted" : [
                            "2014-12-27",
                            "2014-11-16"
                    ]
                }
            ],
}

How could I query to get a result like:

{
       "name": "John",
       "dates": ["2014-12-24", "2014-12-22", "2015-01-13"]
},
{
       "name": "Jane",
       "dates" : ["2014-12-27", "2014-11-16"]
}

I need the dates to be unique: "2014-12-24" appears in both elements of "posts", but I need it only once.

I tried doing db.collection.aggregate([{$unwind: "$posts"}, {$group:{_id:"$posts.dates_posted"}}]) and that gave me results like this:

{ "_id" : [ "2014-12-24", "2014-12-22", "2015-01-13", "2014-12-24" ] }
{ "_id" : [ "2014-12-27", "2014-11-16" ] }

How can I remove the duplicates and also get the name corresponding to the dates?

You would need to use the $addToSet operator to maintain unique values. One way of doing it would be to:

  • unwind "posts".
  • unwind "posts.dates_posted", so that the array gets flattened and each value can be aggregated in the group stage.
  • Then group by _id and accumulate the unique values of the date field, along with name.

code:

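db.collection.aggregate([
// one document per element of "posts"
{
  $unwind:"$posts"
},
// one document per date inside each post
{
  $unwind:"$posts.dates_posted"
},
// regroup per original document; $addToSet drops duplicate dates
{
  $group:
         {
           "_id":"$_id",
           "dates":{$addToSet:"$posts.dates_posted"},
           "name":{$first:"$name"}
         }
},
// keep only name and dates in the output
{
  $project:
            {
              "name":1,
              "dates":1,
              "_id":0
            }
}
])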
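
To see what each stage does to the sample documents, here is a rough emulation in plain JavaScript (Node.js). It is only a sketch, not MongoDB code: the variable names are made up for illustration, the sample data is trimmed to the fields the pipeline reads, and the _id values are plain strings rather than ObjectIds.

```javascript
// Plain JavaScript emulation of the pipeline stages (illustrative only).
const docs = [
  { _id: "56d3e53b965b57e4d1eb3e71", name: "John", posts: [
      { topic: "Harry Potter", dates_posted: ["2014-12-24"] },
      { topic: "Daniel Radcliffe",
        dates_posted: ["2014-12-22", "2015-01-13", "2014-12-24"] } ] },
  { _id: "56d3e53b965b57e4d1eb3e72", name: "Jane", posts: [
      { topic: "Eragon", dates_posted: ["2014-12-27", "2014-11-16"] } ] },
];

// $unwind: "$posts" -> one document per element of posts
const unwoundPosts = docs.flatMap(d =>
  d.posts.map(p => ({ ...d, posts: p })));

// $unwind: "$posts.dates_posted" -> one document per date
const unwoundDates = unwoundPosts.flatMap(d =>
  d.posts.dates_posted.map(date =>
    ({ ...d, posts: { ...d.posts, dates_posted: date } })));

// $group by _id: a Set stands in for $addToSet, the first name seen for $first
const groups = new Map();
for (const d of unwoundDates) {
  if (!groups.has(d._id)) {
    groups.set(d._id, { name: d.name, dates: new Set() });
  }
  groups.get(d._id).dates.add(d.posts.dates_posted);
}

// $project: keep name and dates, drop _id
const result = [...groups.values()].map(g =>
  ({ name: g.name, dates: [...g.dates] }));

console.log(JSON.stringify(result, null, 2));
```

A JavaScript Set keeps insertion order, whereas MongoDB does not guarantee the order of elements produced by $addToSet, so the dates may come back in a different order from the real pipeline.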

The drawback of this approach is that it uses two $unwind stages, which is quite costly: each $unwind emits one document per array element, so the number of documents passed to the subsequent stages grows by a factor of n, where n is the number of values in the flattened array.
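
As a rough illustration of that growth, the following plain JavaScript (Node.js) snippet counts how many documents each $unwind would emit for the two sample documents in the question. It only does the arithmetic and does not talk to MongoDB; the variable names are made up for the example.

```javascript
// Sample data reduced to the arrays that the two $unwind stages flatten.
const docs = [
  { name: "John", posts: [
      { dates_posted: ["2014-12-24"] },
      { dates_posted: ["2014-12-22", "2015-01-13", "2014-12-24"] } ] },
  { name: "Jane", posts: [
      { dates_posted: ["2014-12-27", "2014-11-16"] } ] },
];

// After $unwind: "$posts" there is one document per post.
const afterFirstUnwind = docs.reduce((n, d) => n + d.posts.length, 0);

// After $unwind: "$posts.dates_posted" there is one document per date.
const afterSecondUnwind = docs.reduce(
  (n, d) => n + d.posts.reduce((m, p) => m + p.dates_posted.length, 0), 0);

console.log(docs.length, afterFirstUnwind, afterSecondUnwind); // 2 3 6
```

Here 2 input documents become 3 after the first unwind and 6 after the second; with larger arrays the intermediate document count grows accordingly.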

