
Querying for a repeated element in an array (MongoDB)

I have a MongoDB collection with an array field containing a list of strings. Those strings may repeat. For example:

doc1 = {a: ["p", "q", "r", "p", "r"]}
doc2 = {a: ["p", "q", "q"]}
doc3 = {a: ["p"]}
doc4 = {a: ["p", "r", "r"]}

Given a string (say, "p"), I want to find all the documents that contain that string at least twice in the array.

For example:

query("p") == [doc1]
query("q") == [doc2]
query("r") == [doc1, doc4]

Is there a way to do this directly in MongoDB? I know I can query for a single occurrence and then filter the results in my application, but I'd rather avoid that.

You could try something like the following. This query returns the _id of each document matching your query, along with the count.

db.mycoll.aggregate([
    {$unwind:"$a"}, 
    {$group:{_id:{_id:"$_id", a:"$a"}, count:{$sum:1}}}, 
    {$match:{"_id.a":"r", count:{$gte:2}}}, 
    {$project:{_id:0, id:"$_id._id", count:1}}
])

Note that the $match stage above filters on "r". You can substitute "p" or "q".
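To make the stages concrete, here is a minimal plain-JavaScript sketch of what that pipeline computes, run in memory rather than against MongoDB. The _id values and the function name repeatedAtLeast are made up for illustration; the question's sample documents don't specify any _id.

```javascript
// Hypothetical sample data: the question's four documents with made-up _id values.
const docs = [
  { _id: 1, a: ["p", "q", "r", "p", "r"] },
  { _id: 2, a: ["p", "q", "q"] },
  { _id: 3, a: ["p"] },
  { _id: 4, a: ["p", "r", "r"] },
];

// In-memory stand-in for $unwind + $group + $match + $project (not a MongoDB API).
function repeatedAtLeast(docs, value, min = 2) {
  const counts = new Map(); // key: JSON of [_id, element], value: occurrences
  for (const doc of docs) {
    for (const item of doc.a) {                    // $unwind: one "row" per element
      const key = JSON.stringify([doc._id, item]);
      counts.set(key, (counts.get(key) || 0) + 1); // $group with {$sum: 1}
    }
  }
  const result = [];
  for (const [key, count] of counts) {
    const [id, item] = JSON.parse(key);
    if (item === value && count >= min) {          // $match on "_id.a" and count
      result.push({ count, id });                  // $project
    }
  }
  return result;
}

console.log(repeatedAtLeast(docs, "r"));
```

With this data the sketch yields one { count, id } entry each for documents 1 and 4. The real aggregation does not guarantee any particular order after $group, so add a $sort stage if the order matters.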

var search = 'r';
docs.aggregate([
  {$match: { a : search } }, //step 1, filter to the arrays we care about for speed
  //could do a project here to trim fields depending on object size
  {$unwind: '$a'}, //unwind to create a separate row for each letter
  { $group: { _id: '$_id', total: { $sum: { $cond : [ { $eq: ['$a', search] }, 1, 0] } } } }, //the real work, explained below
  {$match : {total : {$gte: 2} } }, //grab the summed items with at least 2
  {$project: {_id: 1} } //grab just the _id field
]  )

Notes:

I believe $elemMatch won't work here: it matches as soon as one element of the array satisfies the condition, so it can't tell how many times the value occurs.

The real work happens in the $group stage, where $sum adds 1 for each element equal to the value you're searching for and 0 otherwise. This works because $unwind has already turned each array element into a separate row.
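The conditional sum can be checked against the question's expected results with a small in-memory sketch in plain JavaScript. The _id values and the function name idsWithAtLeastTwo are assumptions made for illustration, not part of the original answer or of any MongoDB API.

```javascript
// Hypothetical sample data: the question's four documents with made-up _id values.
const sampleDocs = [
  { _id: 1, a: ["p", "q", "r", "p", "r"] },
  { _id: 2, a: ["p", "q", "q"] },
  { _id: 3, a: ["p"] },
  { _id: 4, a: ["p", "r", "r"] },
];

// In-memory stand-in for the pipeline above.
function idsWithAtLeastTwo(docs, search) {
  return docs
    .filter((doc) => doc.a.includes(search)) // first $match: skip arrays without the value
    .map((doc) => ({
      _id: doc._id,
      // $unwind + $group: add 1 for each element equal to `search`, otherwise 0
      total: doc.a.reduce((sum, item) => sum + (item === search ? 1 : 0), 0),
    }))
    .filter((row) => row.total >= 2)         // second $match: at least two occurrences
    .map((row) => row._id);                  // $project: keep only the _id
}

console.log(idsWithAtLeastTwo(sampleDocs, "p")); // [ 1 ]
console.log(idsWithAtLeastTwo(sampleDocs, "q")); // [ 2 ]
console.log(idsWithAtLeastTwo(sampleDocs, "r")); // [ 1, 4 ]
```

These results correspond to query("p") == [doc1], query("q") == [doc2] and query("r") == [doc1, doc4] from the question.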

Enjoy!
