
Search in nested documents in MongoDB

Hi, I have the following issue with a collection that stores each user's contact info. A document looks like this:

[{
  "_id": {
    "$oid": "5836b917885383034437d26b"
  },
  "Nombre": "Juan",
  "Email": "jsanzrobles@hotmail.es",
  "Edad": 34,
  "País": "España",
  "Contactos": [
    {
      "Usuario_contacto": {
        "_id": {
          "$oid": "5836b916885383034437d23d"
        },
        "Nombre": "Alejandro",
        "Email": "aamericagarzon@hotmail.es",
        "Edad": 32,
        "País": "España",
        "Tipo": "Usuario individual",
        "Apellidos": "América Garzón",
        "Teléfono": 639123123,
        "Ciudad": "Salamanca",
        "Identificador": "U-3491",
        "Información_creación": {
          "Fecha_creación": {
            "Mes": 7,
            "Día": 14,
            "Año": 2016
          },
          "Hora_creación": {
            "Hora": 5,
            "Minutos": 22,
            "Segundos": 16
          }
        }
      },
      "Fecha_alta": {
        "Mes": 10,
        "Día": 27,
        "Año": 2016
      },
      "Hora_alta": {
        "Hora": 23,
        "Minutos": 2,
        "Segundos": 31
      }
    },
    {
      "Usuario_contacto": {
        "_id": {
          "$oid": "5836b916885383034437d21f"
        },
        "Nombre": "Alfonso",
        "Email": "amartinezosorio@hotmail.es",
        "Edad": 23,
        "País": "España",
        "Tipo": "Usuario individual",
        "Apellidos": "Martínez Osorio",
        "Teléfono": 612311456,
        "Ciudad": "Bilbao",
        "Identificador": "U-3461",
        "Información_creación": {
          "Fecha_creación": {
            "Mes": 8,
            "Día": 22,
            "Año": 2016
          },
          "Hora_creación": {
            "Hora": 7,
            "Minutos": 22,
            "Segundos": 30
          }
        }
      },
      "Fecha_alta": {
        "Mes": 10,
        "Día": 27,
        "Año": 2016
      },
      "Hora_alta": {
        "Hora": 12,
        "Minutos": 7,
        "Segundos": 48
      }
    },
    {
      "Usuario_contacto": {
        "_id": {
          "$oid": "5836b916885383034437d232"
        },
        "Nombre": "Mercedes",
        "Email": "mreysordo@gmail.es",
        "Edad": 50,
        "País": "España",
        "Tipo": "Usuario individual",
        "Apellidos": "Rey Sordo",
        "Teléfono": 635456989,
        "Ciudad": "Castellón",
        "Identificador": "U-3480",
        "Información_creación": {
          "Fecha_creación": {
            "Mes": 4,
            "Día": 28,
            "Año": 2016
          },
          "Hora_creación": {
            "Hora": 15,
            "Minutos": 22,
            "Segundos": 15
          }
        }
      },
      "Fecha_alta": {
        "Mes": 10,
        "Día": 24,
        "Año": 2016
      },
      "Hora_alta": {
        "Hora": 14,
        "Minutos": 35,
        "Segundos": 26
      }
    }
  ],
  "Información_creación": {
    "Fecha_creación": {
      "Mes": 10,
      "Día": 23,
      "Año": 2016
    },
    "Hora_creación": {
      "Hora": 10,
      "Minutos": 12,
      "Segundos": 10
    }
  },
  "Apellidos": "Sanz Robles",
  "Identifier": "U-3455",
  "Tipo": "Usuario individual",
  "Teléfono": 675456789,
  "Ciudad": "Granada"
}]

The exercise I've been asked to do is to create a new document for every user that has 2 or more contacts in the same city ("Ciudad"). It has to look like this:

[...{
      _id : { Identifier:    ...
                  Ciudad:   ... },
      Counter:  3 
}, ...
]

I'm new to Mongo. I've tried a lot and I know I have to create an aggregate, but I don't know how to filter like that.

To start with, I'm going to work with a simplified data structure, as we only need to know about the city and the document id.

So let's insert some records:

db.test.insertMany([
   { "_id": 1, "contacts" : [
     { "name": "Name-1", "city" : "Manchester" },
     { "name": "Name-2", "city" : "Manchester" },
     { "name": "Name-3", "city" : "Manchester" }]
   },
   { "_id": 2, "contacts" : [
     { "name": "Name-4", "city" : "York" },
     { "name": "Name-5", "city" : "Manchester" },
     { "name": "Name-6", "city" : "Sheffield" }]
   },
   { "_id": 3, "contacts" : [
     { "name": "Name-7", "city" : "Sheffield" },
     { "name": "Name-8", "city" : "York" },
     { "name": "Name-9", "city" : "Sheffield" }]
   }
])

Then we'll need an aggregation pipeline that unwinds the sub-documents of contacts and then regroups them by document id and city, so we can count them with a $sum:

db.test.aggregate([
  { "$unwind": "$contacts" },
  { "$group": { "_id": { "_id": "$_id", "city": "$contacts.city" }, "count": { $sum: 1 } } }
]);

{ "_id" : { "_id" : 2, "city" : "Manchester" }, "count" : 1 }
{ "_id" : { "_id" : 2, "city" : "Sheffield" }, "count" : 1 }
{ "_id" : { "_id" : 3, "city" : "York" }, "count" : 1 }
{ "_id" : { "_id" : 1, "city" : "Manchester" }, "count" : 3 }
{ "_id" : { "_id" : 2, "city" : "York" }, "count" : 1 }
{ "_id" : { "_id" : 3, "city" : "Sheffield" }, "count" : 2 }
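To make the two stages less magical, here is a rough in-memory sketch in plain JavaScript of what $unwind and $group do to the sample data. It is only an illustration under the assumption that the documents fit in an array; the variable names (docs, unwound, grouped) are made up for this example and MongoDB does not run anything like this internally.

```javascript
// Same simplified documents as in the insertMany call above.
const docs = [
  { _id: 1, contacts: [
    { name: "Name-1", city: "Manchester" },
    { name: "Name-2", city: "Manchester" },
    { name: "Name-3", city: "Manchester" }] },
  { _id: 2, contacts: [
    { name: "Name-4", city: "York" },
    { name: "Name-5", city: "Manchester" },
    { name: "Name-6", city: "Sheffield" }] },
  { _id: 3, contacts: [
    { name: "Name-7", city: "Sheffield" },
    { name: "Name-8", city: "York" },
    { name: "Name-9", city: "Sheffield" }] }
];

// Rough equivalent of { $unwind: "$contacts" }:
// one row per array element, with the parent _id copied onto each row.
const unwound = docs.flatMap(doc =>
  doc.contacts.map(contact => ({ _id: doc._id, contacts: contact }))
);

// Rough equivalent of the $group stage:
// key each row on (_id, city) and add 1 to that key's counter.
const counts = new Map();
for (const row of unwound) {
  const key = JSON.stringify({ _id: row._id, city: row.contacts.city });
  counts.set(key, (counts.get(key) || 0) + 1);
}

// Turn the counters back into documents shaped like the pipeline output.
const grouped = [...counts].map(([key, count]) => ({ _id: JSON.parse(key), count }));
console.log(grouped);
```

The key point is that after unwinding, every contact is its own row, so counting rows per (document id, city) pair is the same as counting contacts per city for each user.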

Then we'll need to filter out the ones with a count of less than 2. This can be done with a $match stage using the greater-than-or-equal-to operator ($gte):

db.test.aggregate([
  { "$unwind": "$contacts" },
  { "$group": { "_id": { "_id": "$_id", "city": "$contacts.city" }, "count": { $sum: 1 } } },
  { "$match" : { "count" : { "$gte" : 2 } } }
]);

{ "_id" : { "_id" : 1, "city" : "Manchester" }, "count" : 3 }
{ "_id" : { "_id" : 3, "city" : "Sheffield" }, "count" : 2 }

This gives us the required results, so now we just need to pipe them into another collection. This is achieved with a $out stage:

db.test.aggregate([
  { "$unwind": "$contacts" },
  { "$group": { "_id": { "_id": "$_id", "city": "$contacts.city" }, "count": { $sum: 1 } } },
  { "$match" : { "count" : { "$gte" : 2 } } },
  { "$out" : "test2" }
]);
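One thing to be aware of is that $out replaces the target collection if it already exists. If you would rather update an existing collection, a possible variation (assuming MongoDB 4.2 or later, where the $merge stage is available) is sketched below. The pipeline is stored in a variable here only so that it can be inspected; the rest is the same as above.

```javascript
// Same first three stages as before; only the final stage differs.
const pipeline = [
  { $unwind: "$contacts" },
  { $group: { _id: { _id: "$_id", city: "$contacts.city" }, count: { $sum: 1 } } },
  { $match: { count: { $gte: 2 } } },
  // $merge matches on _id by default: existing documents are replaced,
  // new ones are inserted, and unrelated documents in test2 are left alone.
  { $merge: { into: "test2", whenMatched: "replace", whenNotMatched: "insert" } }
];

// In the mongo shell this would be run as:
//   db.test.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

For a one-off exercise like this one, $out is simpler and does the job.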

This will output all the results into a collection called test2, which we can then query directly:

db.test2.find()
{ "_id" : { "_id" : 1, "city" : "Manchester" }, "count" : 3 }
{ "_id" : { "_id" : 3, "city" : "Sheffield" }, "count" : 2 }
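Mapping this back onto the document structure from the question should mostly be a matter of changing the field paths. The following is a sketch, not something tested against the real data: the field names (Contactos, Usuario_contacto.Ciudad, Identifier) are taken from the sample document, while the collection names usuarios and contactos_por_ciudad are assumptions made up for this example, since the question does not give them.

```javascript
const pipeline = [
  // One row per entry in the Contactos array.
  { $unwind: "$Contactos" },
  // Group by the user's Identifier and the contact's city, as in the requested shape.
  { $group: {
      _id: {
        Identifier: "$Identifier",
        Ciudad: "$Contactos.Usuario_contacto.Ciudad"
      },
      Counter: { $sum: 1 }
  } },
  // Keep only users with 2 or more contacts in the same city.
  { $match: { Counter: { $gte: 2 } } },
  // Hypothetical output collection name.
  { $out: "contactos_por_ciudad" }
];

// In the mongo shell, assuming the source collection is called usuarios:
//   db.usuarios.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Note that the sample document in the question has its three contacts in three different cities, so that particular user would not appear in the output.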

To read more on the aggregation stages used here, see the MongoDB documentation for $unwind, $group, $match and $out.

