MongoDB aggregate count of items in two arrays across different documents is the same?

Here is my MongoDB collection schema:

company: String
model: String
cons: [String] // array of tags that were marked as "cons"
pros: [String] // array of tags that were marked as "pros"

Here is my query:

[
    { "$project": {
        "company": 1,
        "model": 1,
        "data": {
            "$setUnion": [
                { "$map": {
                    "input": "$pros",
                    "as": "pro",
                    "in": {
                        "type": "$pro",
                        "value": "$$pro"
                    }
                }},
                { "$map": {
                    "input": "$cons",
                    "as": "con",
                    "in": {
                        "type": "$con",
                        "value": "$$con"
                    }
                }}
            ]
        }
    }},
    { "$unwind": "$data" },
    { "$group": {
      "_id": { 
          "company": "$company",
          "model": "$model",
          "theTag": "$data.value"
      },
      "sumPros": { 
        "$sum": { 
          "$cond": [
            { "$eq": [ "$data.type", "$pro" ] },
              1,
              0
          ]
        }
      },
      "sumCons": { 
        "$sum": { 
          "$cond": [
            { "$eq": [ "$data.type", "$con" ] },
              1,
              0
          ]
        }
      }
    }},
    { "$group": {
        "_id": {
            "company": "$_id.company",
            "model": "$_id.model"
        },
        "tags": { "$push": {
            "tag": "$_id.theTag",
            "pros": "$sumPros",
            "cons": "$sumCons"
        }}
    }}
]

Here is the output:

{
        "_id": {
            "company": "Lenovo",
            "model": "T400"
        },
        "tags": [
            {
                "tag": "Quality",
                "pros": 64, // expected value is 54
                "cons": 64  // expected value is 10
            },
            {
                "tag": "Value",
                "pros": 76, // expected value is 30
                "cons": 76  // expected value is 46
            }
        ]
}
...

Notice that the pros and cons values are the same. For some reason, they each represent the sum of pros and cons, and I can't figure out why.

What am I doing wrong?

Update:

Here is a document from the collection:

{
  "company": "Lenovo",
  "model": "X200",

  "cons": [
      "Quality"
  ],
  "pros": [
      "Value",
      "Styling"
  ]
}

As the author of the content you are using in your query, and having asked you to provide sample data that actually supports the claim in the question, I have to say that what you are saying is incorrect.

For the record, this is your sample at the time of answering:

{
  "company": "Lenovo",
  "model": "X200",

  "cons": [
      "Quality"
  ],
  "pros": [
      "Value",
      "Styling"
  ]
}

On your sample here, if I run the following query (and I do accept responsibility for any misleading operations in previous answers and will amend those immediately), then the results I see are what is expected:

db.collection.aggregate([
    { "$project": {
        "company": 1,
        "model": 1,
        "data": {
            "$setUnion": [
                { "$map": {
                    "input": "$cons",
                    "as": "con",
                    "in": {
                        "type": { "$literal": "con" },
                        "value": "$$con"
                    }
                }},
                { "$map": {
                    "input": "$pros",
                    "as": "pro",
                    "in": {
                        "type": { "$literal": "pro" },
                        "value": "$$pro"
                    }
                }}
            ]
        }
    }},
    { "$unwind": "$data" },
    { "$group": {
        "_id": {
            "company": "$company",
            "model": "$model",
            "tag": "$data.value"
        },
        "pros": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "pro" ] },
                    1,
                    0
                ]
            }
        },
        "cons": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "con" ] },
                    1,
                    0
                ]
            }
        }
    }}
])

Which produces the following from your sample:

{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200",
            "tag" : "Quality"
    },
    "pros" : 0,
    "cons" : 1
}
{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200",
            "tag" : "Value"
    },
    "pros" : 1,
    "cons" : 0
}
{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200",
            "tag" : "Styling"
    },
    "pros" : 1,
    "cons" : 0
}

This clearly allocates both the "pros" and "cons" totals correctly across the grouping keys, as should be expected.

Therefore what "I see" here is that the values are not in fact "the same" but are actually "different", matching the different conditions given to each field accumulator.
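For what it's worth, the likely reason the query in the question reports identical totals is that a string beginning with `$`, such as `"$pro"`, is read by the aggregation framework as a reference to a field named `pro`, not as the text "pro". No such field exists in these documents, so `type` ends up missing on every element, and comparing a missing `data.type` with the equally missing `$pro` (or `$con`) should come out as equal for every element. The plain JavaScript sketch below only imitates that behaviour: `resolve` and `countMatches` are hypothetical helpers, not MongoDB APIs, and `undefined` stands in for a missing field.

```javascript
// Hypothetical stand-in for how an expression operand is evaluated:
// a string starting with "$" is a field path, anything else is a literal.
// (Simplified: no nested paths, no "$$" variables.)
function resolve(doc, operand) {
  if (typeof operand === "string" && operand.startsWith("$")) {
    return doc[operand.slice(1)]; // undefined when the field is missing
  }
  return operand;
}

// Count elements whose "type" equals the operand, like $sum over $cond/$eq.
function countMatches(elements, operand) {
  return elements.filter(
    (el) => resolve(el, "$type") === resolve(el, operand)
  ).length;
}

// Question's version: "type": "$pro" / "$con" point at missing fields,
// so no element ever receives a type.
const untyped = [{ value: "Value" }, { value: "Styling" }, { value: "Quality" }];
const brokenPros = countMatches(untyped, "$pro"); // missing equals missing
const brokenCons = countMatches(untyped, "$con"); // same for every element

// Answer's version: literal type strings, compared against literals.
const typed = [
  { type: "pro", value: "Value" },
  { type: "pro", value: "Styling" },
  { type: "con", value: "Quality" },
];
const fixedPros = countMatches(typed, "pro");
const fixedCons = countMatches(typed, "con");

console.log({ brokenPros, brokenCons, fixedPros, fixedCons });
```

In the imitation, both "broken" counters match every element, which mirrors each total in the question being the sum of pros and cons, while the literal version splits them as intended.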


Taking that further, and based on your original question:

db.collection.aggregate([
    { "$project": {
        "company": 1,
        "model": 1,
        "data": {
            "$setUnion": [
                { "$map": {
                    "input": "$cons",
                    "as": "con",
                    "in": {
                        "type": { "$literal": "con" },
                        "value": "$$con"
                    }
                }},
                { "$map": {
                    "input": "$pros",
                    "as": "pro",
                    "in": {
                        "type": { "$literal": "pro" },
                        "value": "$$pro"
                    }
                }}
            ]
        }
    }},
    { "$unwind": "$data" },
    { "$group": {
        "_id": {
            "company": "$company",
            "model": "$model",
            "tag": "$data.value"
        },
        "pros": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "pro" ] },
                    1,
                    0
                ]
            }
        },
        "cons": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$data.type", "con" ] },
                    1,
                    0
                ]
            }
        }
    }},
    { "$group": {
        "_id": {
            "company": "$_id.company",
            "model": "$_id.model"
        },
        "data": { "$push": {
            "tag": "$_id.tag",
            "pros": "$pros",
            "cons": "$cons"
        }}
    }}
])

Produces:

{
    "_id" : {
            "company" : "Lenovo",
            "model" : "X200"
    },
    "data" : [
            {
                    "tag" : "Quality",
                    "pros" : 0,
                    "cons" : 1
            },
            {
                    "tag" : "Value",
                    "pros" : 1,
                    "cons" : 0
            },
            {
                    "tag" : "Styling",
                    "pros" : 1,
                    "cons" : 0
            }
    ]
}

Which is exactly what you are asking for.
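As a cross-check that does not need a running server, the same counting can be sketched in plain JavaScript. This is only an approximation of the pipeline above, assuming the collection has been loaded into an array; `tallyTags` is a hypothetical helper name, and treating a missing array as empty is a simplification made here rather than something the pipeline is claimed to do.

```javascript
// Plain JavaScript approximation of the pipeline; tallyTags is a hypothetical
// helper and docs is assumed to be the collection loaded into an array.
function tallyTags(docs) {
  // One entry per company/model pair, each holding per-tag counters.
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([doc.company, doc.model]);
    if (!groups.has(key)) {
      groups.set(key, {
        _id: { company: doc.company, model: doc.model },
        counts: new Map(),
      });
    }
    const counts = groups.get(key).counts;
    for (const [field, list] of [["pros", doc.pros], ["cons", doc.cons]]) {
      // A Set mimics $setUnion dropping duplicate { type, value } pairs.
      // Simplification: a missing array is treated as empty.
      for (const tag of new Set(list || [])) {
        if (!counts.has(tag)) counts.set(tag, { tag, pros: 0, cons: 0 });
        counts.get(tag)[field] += 1;
      }
    }
  }
  return [...groups.values()].map(({ _id, counts }) => ({
    _id,
    data: [...counts.values()],
  }));
}

const result = tallyTags([
  { company: "Lenovo", model: "X200", cons: ["Quality"], pros: ["Value", "Styling"] },
]);
console.log(JSON.stringify(result, null, 2));
```

For the sample document this gives one group for Lenovo X200 with "Value" and "Styling" counted once each under pros and "Quality" counted once under cons. The order of the entries in `data` may differ from the server output and should not be relied on.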
