
MongoDB aggregate: why are the counts of items in two arrays across different documents the same?

Here is my MongoDB collection schema:

company: String
model: String
cons: [String] // array of tags that were marked as "cons"
pros: [String] // array of tags that were marked as "pros"

Here is my query:

[
    { "$project": {
        "company": 1,
        "model": 1,
        "data": {
            "$setUnion": [
                { "$map": {
                    "input": "$pros",
                    "as": "pro",
                    "in": {
                        "type": "$pro",
                        "value": "$$pro"
                    }
                }},
                { "$map": {
                    "input": "$cons",
                    "as": "con",
                    "in": {
                        "type": "$con",
                        "value": "$$con"
                    }
                }}
            ]
        }
    }},
    { "$unwind": "$data" },
    { "$group": {
      "_id": { 
          "company": "$company",
          "model": "$model",
          "theTag": "$data.value"
      },
      "sumPros": { 
        "$sum": { 
          "$cond": [
            { "$eq": [ "$data.type", "$pro" ] },
              1,
              0
          ]
        }
      },
      "sumCons": { 
        "$sum": { 
          "$cond": [
            { "$eq": [ "$data.type", "$con" ] },
              1,
              0
          ]
        }
      }
    }},
    { "$group": {
        "_id": { 
            "company": "$_id.company",
            "model": "$_id.model",
        },
        "tags": {$push: { 
          "tag": "$_id.theTag", 
          "pros": "$sumPros",
          "cons": "$sumCons"
        }

      }}
}]

Here is the output:

{
        "_id": {
            "company": "Lenovo",
            "model": "T400"
        },
        "tags": [
            {
                "tag": "Quality",
                "pros": 64, // expected value is 54
                "cons": 64  // expected value is 10
            },
            {
                "tag": "Value",
                "pros": 76, // expected value is 30
                "cons": 76  // expected value is 46
            }
        ]
}
...

Notice that the pros and cons values are the same. For some reason, each one is the sum of pros and cons, and I can't figure out why.

What am I doing wrong?

Update:

Here is a document from the collection:

{
  "company": "Lenovo",
  "model": "X200",

  "cons": [
      "Quality"
  ],
  "pros": [
      "Value",
      "Styling"
  ]
}

As the author of the content you are using in the query, and having asked you to provide sample data that actually supports the claim in the question, I have to say that what you are describing is incorrect.

For the record, this is your sample at time of answer:

{
  "company": "Lenovo",
  "model": "X200",

  "cons": [
      "Quality"
  ],
  "pros": [
      "Value",
      "Styling"
  ]
}

On your sample, if I run the following query (I do take responsibility for any misleading operations in previous answers and will amend those immediately), the results are what is expected:

db.collection.aggregate([
    { "$project": {
        "company": 1,
        "model": 1,
        "data": {
            "$setUnion": [
                { "$map": {
                    "input": "$cons",
                    "as": "con",
                    "in": {
                        "type": { "$literal": "con" },
                        "value": "$$con"
                    }
                }},
                { "$map": {
                    "input": "$pros",
                    "as": "pro",
                    "in": {
                        "type": { "$literal": "pro" },
                        "value": "$$pro"
                    }
                }}
            ]
        }
    }},
    { "$unwind": "$data" },
    { "$group": {
        "_id": {
            "company": "$company",
            "model": "$model",
            "tag": "$data.value"
        },
        "pros": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "pro" ] },
                    1,
                    0
                ]
            }
        },
        "cons": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "con" ] },
                    1,
                    0
                ]
            }
        }
    }}
])

From your sample, this produces:

{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200",
            "tag" : "Quality"
    },
    "pros" : 0,
    "cons" : 1
}
{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200",
            "tag" : "Value"
    },
    "pros" : 1,
    "cons" : 0
}
{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200",
            "tag" : "Styling"
    },
    "pros" : 1,
    "cons" : 0
}

This correctly allocates the "pros" and "cons" totals across the grouping keys, as expected.
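For anyone who wants to check the tallying logic without a MongoDB server, here is a rough plain-JavaScript imitation of the $project/$map, $unwind and $group stages. It is only a sketch: `tallyTags` is a made-up helper name, not a MongoDB API, and the de-duplication step merely approximates what `$setUnion` does (the order of the groups is not guaranteed to match the server's).

```javascript
// Hypothetical helper: imitates $project/$map + $setUnion + $unwind + $group
// in plain JavaScript. Not part of any MongoDB driver.
function tallyTags(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Tag every entry with a literal type, like the two $map expressions.
    const tagged = [
      ...(doc.cons || []).map((value) => ({ type: "con", value })),
      ...(doc.pros || []).map((value) => ({ type: "pro", value })),
    ];
    // Approximate $setUnion: drop exact duplicates within one document.
    const seen = new Set();
    const data = tagged.filter((item) => {
      const k = JSON.stringify([item.type, item.value]);
      if (seen.has(k)) return false;
      seen.add(k);
      return true;
    });
    // $unwind + $group: one counter pair per (company, model, tag).
    for (const item of data) {
      const key = JSON.stringify([doc.company, doc.model, item.value]);
      if (!groups.has(key)) {
        groups.set(key, {
          _id: { company: doc.company, model: doc.model, tag: item.value },
          pros: 0,
          cons: 0,
        });
      }
      const group = groups.get(key);
      if (item.type === "pro") group.pros += 1;
      if (item.type === "con") group.cons += 1;
    }
  }
  return [...groups.values()];
}

// The sample document from the question.
const sample = [
  { company: "Lenovo", model: "X200", cons: ["Quality"], pros: ["Value", "Styling"] },
];
console.log(JSON.stringify(tallyTags(sample), null, 2));
```

Because each counter only increments when the element's literal type matches, "pros" and "cons" can only come out equal when a tag really does appear equally often in both arrays.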

So what I see here is that the values are not in fact "the same" but are actually "different", matching the different conditions given to each field accumulator.
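One difference from the query in the question is worth spelling out. There the type is written as "$pro" / "$con", both in the $map and in the $eq. A string with a leading "$" is read as a field path, and the documents have no `pro` or `con` field, so the type is never set and each $eq appears to compare one missing value with another. That would make both conditions true for every element, which is consistent with the reported output (64 = 54 + 10, 76 = 30 + 46). The plain-JavaScript sketch below imitates that behaviour. It assumes two missing values compare as equal, and `resolve` and `countIf` are made-up helper names, not MongoDB functions.

```javascript
// Hypothetical stand-in for how an aggregation operand is read:
// "$a.b" is treated as a field path, anything else as a literal value.
function resolve(doc, operand) {
  if (typeof operand === "string" && operand.startsWith("$")) {
    return operand
      .slice(1)
      .split(".")
      .reduce((value, key) => (value == null ? undefined : value[key]), doc);
  }
  return operand;
}

// Imitates { $cond: [ { $eq: [left, right] }, 1, 0 ] } for one unwound element.
// Assumption: a missing value compared with a missing value counts as equal.
function countIf(doc, left, right) {
  return resolve(doc, left) === resolve(doc, right) ? 1 : 0;
}

// Question's version: "type": "$pro" resolves to nothing, so type is absent.
const unwoundQuestion = { company: "Lenovo", model: "X200", data: { value: "Quality" } };
// This answer's version: "type": { "$literal": "con" } stores the string "con".
const unwoundAnswer = { company: "Lenovo", model: "X200", data: { type: "con", value: "Quality" } };

const questionCounts = {
  pros: countIf(unwoundQuestion, "$data.type", "$pro"),
  cons: countIf(unwoundQuestion, "$data.type", "$con"),
};
const answerCounts = {
  pros: countIf(unwoundAnswer, "$data.type", "pro"),
  cons: countIf(unwoundAnswer, "$data.type", "con"),
};

console.log(questionCounts); // { pros: 1, cons: 1 }
console.log(answerCounts);   // { pros: 0, cons: 1 }
```

So the fix is to use literal strings for the type, via `$literal` in the $map and a plain "pro" / "con" in the $eq, as in the queries in this answer.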


Taking that further, and based on your original question:

db.collection.aggregate([
    { "$project": {
        "company": 1,
        "model": 1,
        "data": {
            "$setUnion": [
                { "$map": {
                    "input": "$cons",
                    "as": "con",
                    "in": {
                        "type": { "$literal": "con" },
                        "value": "$$con"
                    }
                }},
                { "$map": {
                    "input": "$pros",
                    "as": "pro",
                    "in": {
                        "type": { "$literal": "pro" },
                        "value": "$$pro"
                    }
                }}
            ]
        }
    }},
    { "$unwind": "$data" },
    { "$group": {
        "_id": {
            "company": "$company",
            "model": "$model",
            "tag": "$data.value"
        },
        "pros": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "pro" ] },
                    1,
                    0
                ]
            }
        },
        "cons": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "con" ] },
                    1,
                    0
                ]
            }
        }
    }},
    { "$group": {
        "_id": {
            "company": "$_id.company",
            "model": "$_id.model"
        },
        "data": { "$push": {
            "tag": "$_id.tag",
            "pros": "$pros",
            "cons": "$cons"
        }}
    }}
])

Produces:

{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200"
    },
    "data" : [
            {
                    "tag" : "Quality",
                    "pros" : 0,
                    "cons" : 1
            },
            {
                    "tag" : "Value",
                    "pros" : 1,
                    "cons" : 0
            },
            {
                    "tag" : "Styling",
                    "pros" : 1,
                    "cons" : 0
            }
    ]
}

Which is exactly what you are asking for.
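If it helps to see what the final $group with $push is doing, here is a small plain-JavaScript sketch of that regrouping step, fed with the three per-tag rows shown earlier. `regroup` is a made-up helper name for illustration only, and the sketch keeps input order, which the server does not promise unless you add a $sort.

```javascript
// Hypothetical helper: imitates the second $group stage, which collects the
// per-tag counters into one array per (company, model).
function regroup(rows) {
  const byModel = new Map();
  for (const row of rows) {
    const key = JSON.stringify([row._id.company, row._id.model]);
    if (!byModel.has(key)) {
      byModel.set(key, {
        _id: { company: row._id.company, model: row._id.model },
        data: [],
      });
    }
    // Equivalent of "$push": { tag, pros, cons }.
    byModel.get(key).data.push({ tag: row._id.tag, pros: row.pros, cons: row.cons });
  }
  return [...byModel.values()];
}

// Output of the first $group stage for the sample document.
const rows = [
  { _id: { company: "Lenovo", model: "X200", tag: "Quality" }, pros: 0, cons: 1 },
  { _id: { company: "Lenovo", model: "X200", tag: "Value" }, pros: 1, cons: 0 },
  { _id: { company: "Lenovo", model: "X200", tag: "Styling" }, pros: 1, cons: 0 },
];
console.log(JSON.stringify(regroup(rows), null, 2));
```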
