Search in nested documents in MongoDB
Hi, I have the following issue with a collection that saves the contact info of a user. It looks like this:
[{
"_id": {
"$oid": "5836b917885383034437d26b"
},
"Nombre": "Juan",
"Email": "jsanzrobles@hotmail.es",
"Edad": 34,
"País": "España",
"Contactos": [
{
"Usuario_contacto": {
"_id": {
"$oid": "5836b916885383034437d23d"
},
"Nombre": "Alejandro",
"Email": "aamericagarzon@hotmail.es",
"Edad": 32,
"País": "España",
"Tipo": "Usuario individual",
"Apellidos": "América Garzón",
"Teléfono": 639123123,
"Ciudad": "Salamanca",
"Identificador": "U-3491",
"Información_creación": {
"Fecha_creación": {
"Mes": 7,
"Día": 14,
"Año": 2016
},
"Hora_creación": {
"Hora": 5,
"Minutos": 22,
"Segundos": 16
}
}
},
"Fecha_alta": {
"Mes": 10,
"Día": 27,
"Año": 2016
},
"Hora_alta": {
"Hora": 23,
"Minutos": 2,
"Segundos": 31
}
},
{
"Usuario_contacto": {
"_id": {
"$oid": "5836b916885383034437d21f"
},
"Nombre": "Alfonso",
"Email": "amartinezosorio@hotmail.es",
"Edad": 23,
"País": "España",
"Tipo": "Usuario individual",
"Apellidos": "Martínez Osorio",
"Teléfono": 612311456,
"Ciudad": "Bilbao",
"Identificador": "U-3461",
"Información_creación": {
"Fecha_creación": {
"Mes": 8,
"Día": 22,
"Año": 2016
},
"Hora_creación": {
"Hora": 7,
"Minutos": 22,
"Segundos": 30
}
}
},
"Fecha_alta": {
"Mes": 10,
"Día": 27,
"Año": 2016
},
"Hora_alta": {
"Hora": 12,
"Minutos": 7,
"Segundos": 48
}
},
{
"Usuario_contacto": {
"_id": {
"$oid": "5836b916885383034437d232"
},
"Nombre": "Mercedes",
"Email": "mreysordo@gmail.es",
"Edad": 50,
"País": "España",
"Tipo": "Usuario individual",
"Apellidos": "Rey Sordo",
"Teléfono": 635456989,
"Ciudad": "Castellón",
"Identificador": "U-3480",
"Información_creación": {
"Fecha_creación": {
"Mes": 4,
"Día": 28,
"Año": 2016
},
"Hora_creación": {
"Hora": 15,
"Minutos": 22,
"Segundos": 15
}
}
},
"Fecha_alta": {
"Mes": 10,
"Día": 24,
"Año": 2016
},
"Hora_alta": {
"Hora": 14,
"Minutos": 35,
"Segundos": 26
}
}
],
"Información_creación": {
"Fecha_creación": {
"Mes": 10,
"Día": 23,
"Año": 2016
},
"Hora_creación": {
"Hora": 10,
"Minutos": 12,
"Segundos": 10
}
},
"Apellidos": "Sanz Robles",
"Identifier": "U-3455",
"Tipo": "Usuario individual",
"Teléfono": 675456789,
"Ciudad": "Granada"
}
The exercise I've been asked to do is to create a new document for every user that has 2 or more contacts in the same city ("Ciudad"). It has to look like this:
[...{
_id : { Identifier: ...
Ciudad: ... },
Counter: 3
}, ...
]
I'm new to Mongo. I've tried a lot and I know I have to create an aggregate, but I don't know how to filter like that.
To start with, I'm going to work with a simplified data structure, as we only need to know about the city and the document id.
So let's insert some records:
db.test.insertMany([
{ "_id": 1, "contacts" : [
{ "name": "Name-1", "city" : "Manchester" },
{ "name": "Name-2", "city" : "Manchester" },
{ "name": "Name-3", "city" : "Manchester" }]
},
{ "_id": 2, "contacts" : [
{ "name": "Name-4", "city" : "York" },
{ "name": "Name-5", "city" : "Manchester" },
{ "name": "Name-6", "city" : "Sheffield" }]
},
{ "_id": 3, "contacts" : [
{ "name": "Name-7", "city" : "Sheffield" },
{ "name": "Name-8", "city" : "York" },
{ "name": "Name-9", "city" : "Sheffield" }]
}
])
Then we'll need an aggregation pipeline that unwinds the sub-documents of contacts, then regroups them by document id and city so we can count them with a $sum:
db.test.aggregate([
{ "$unwind": "$contacts" },
{ "$group": { "_id": { "_id": "$_id", "city": "$contacts.city" }, "count": { $sum: 1 } } }
]);
{ "_id" : { "_id" : 2, "city" : "Manchester" }, "count" : 1 }
{ "_id" : { "_id" : 2, "city" : "Sheffield" }, "count" : 1 }
{ "_id" : { "_id" : 3, "city" : "York" }, "count" : 1 }
{ "_id" : { "_id" : 1, "city" : "Manchester" }, "count" : 3 }
{ "_id" : { "_id" : 2, "city" : "York" }, "count" : 1 }
{ "_id" : { "_id" : 3, "city" : "Sheffield" }, "count" : 2 }
Then we'll need to filter out the ones with a count of less than 2. This can be done with a $match stage using the greater-than-or-equal-to operator ($gte):
db.test.aggregate([
{ "$unwind": "$contacts" },
{ "$group": { "_id": { "_id": "$_id", "city": "$contacts.city" }, "count": { $sum: 1 } } },
{ "$match" : { "count" : { "$gte" : 2 } } }
]);
{ "_id" : { "_id" : 1, "city" : "Manchester" }, "count" : 3 }
{ "_id" : { "_id" : 3, "city" : "Sheffield" }, "count" : 2 }
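To make it clearer what each stage contributes, here is a rough in-memory sketch of the same three steps in plain JavaScript, with no database involved. It only mimics the behaviour of $unwind, $group and $match on the sample data above and is not how MongoDB implements them; the variable names (docs, unwound, grouped, matched) are made up for illustration.

```javascript
// Same sample data as the insertMany call above, held in a plain array.
const docs = [
  { _id: 1, contacts: [
    { name: "Name-1", city: "Manchester" },
    { name: "Name-2", city: "Manchester" },
    { name: "Name-3", city: "Manchester" } ] },
  { _id: 2, contacts: [
    { name: "Name-4", city: "York" },
    { name: "Name-5", city: "Manchester" },
    { name: "Name-6", city: "Sheffield" } ] },
  { _id: 3, contacts: [
    { name: "Name-7", city: "Sheffield" },
    { name: "Name-8", city: "York" },
    { name: "Name-9", city: "Sheffield" } ] }
];

// Like $unwind: emit one record per element of the contacts array,
// keeping the parent _id on each record.
const unwound = docs.flatMap(doc =>
  doc.contacts.map(contact => ({ _id: doc._id, contacts: contact })));

// Like $group with { $sum: 1 }: count records per (_id, city) pair.
const counts = new Map();
for (const rec of unwound) {
  const key = JSON.stringify({ _id: rec._id, city: rec.contacts.city });
  counts.set(key, (counts.get(key) || 0) + 1);
}
const grouped = [...counts].map(([key, count]) => ({ _id: JSON.parse(key), count }));

// Like $match with { count: { $gte: 2 } }: keep groups with 2 or more contacts.
const matched = grouped.filter(group => group.count >= 2);

console.log(matched);
```

The nine unwound records collapse into six (document, city) groups, and only the two groups with a count of at least 2 survive the filter, which matches the pipeline output shown above.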
This gets us the required results, so now we just need to pipe them into another collection. This is achieved with a $out stage:
db.test.aggregate([
{ "$unwind": "$contacts" },
{ "$group": { "_id": { "_id": "$_id", "city": "$contacts.city" }, "count": { $sum: 1 } } },
{ "$match" : { "count" : { "$gte" : 2 } } },
{ "$out" : "test2" }
]);
This will output all the results into a collection called test2, which we can then query directly:
db.test2.find()
{ "_id" : { "_id" : 1, "city" : "Manchester" }, "count" : 3 }
{ "_id" : { "_id" : 3, "city" : "Sheffield" }, "count" : 2 }
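Mapping this back to the document structure in the question, the pipeline might look something like the sketch below. The field paths (Contactos, Usuario_contacto.Ciudad, Identifier) and the output field name Counter are taken from the question; the collection names usuarios and usuarios_por_ciudad are assumptions, since the question does not name its collections.

```javascript
// Sketch of the same four stages against the question's field names.
const pipeline = [
  // One record per entry in the Contactos array.
  { $unwind: "$Contactos" },
  // Group by the user's Identifier and the contact's city, counting contacts.
  { $group: {
      _id: {
        Identifier: "$Identifier",
        Ciudad: "$Contactos.Usuario_contacto.Ciudad"
      },
      Counter: { $sum: 1 }
  } },
  // Keep only user/city pairs with 2 or more contacts.
  { $match: { Counter: { $gte: 2 } } },
  // Write the results to a new collection (hypothetical name).
  { $out: "usuarios_por_ciudad" }
];

// In the mongo shell, assuming the source collection is called "usuarios":
// db.usuarios.aggregate(pipeline);
```

Note that $out replaces the target collection if it already exists, so it is worth picking a name that is not in use. In the sample document from the question, all three contacts live in different cities, so that user would produce no output rows.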
To read more on the aggregation stages used here, see the MongoDB documentation for $unwind, $group, $match and $out.