简体   繁体   中英

Aggregate nested documents with missing data

I have a mongoDB collection with data such as:

{
    "_id" : ObjectId("..."),
    "records" : [
        ISODate("2020-04-19T00:49:18.945Z"),
        {
            "_id" : ObjectId(""),
            "date" : ISODate("2020-05-07T04:49:55.643Z"),
            "text" : "someText"
        }
    ],
}

The value in records is different due to version upgrades.

I want to aggregate records.text across all documents ignoring the missing data. The code from MongoDB: Aggregate and flatten an array field

db.collection.aggregate({$unwind : "records"},
                      {$project: {_id: 1, 'text': '$records.text'}})

throws:

path option to $unwind stage should be prefixed with a '$': records

and fixing the error from these directions to accommodate empty fields:

db.collection.aggregate({$unwind : "records", includeEmpty: false},
                      {$project: {_id: 1, 'text': '$records.text'}})

throws

A pipeline stage specification object must contain exactly one field.

How can I aggregate values from nested arrays with possibly empty values?

You can filter out the empty ones using $exists :

db.collection.aggregate([
    { $unwind: "$records" },
    { $match: { "records.text": { $exists: true } } },
    { $project: { _id: 1, text: "$records.text" }}
    {$group: {_id: "$text", count: {$sum: 1}}},
    {$sort: {count: -1}}
])

Mongo Playground

In your first query, you are missing the '$', since record is a field value, so you should prefix it with '$'. The final query will be:

db.collection.aggregate({$unwind : "$records"},
                      {$project: {_id: 1, 'text': '$records.text'}})

I hope this works for you.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM