Say I have 17,000 documents that have a structure similar to the document below:
{
someInfo: "blah blah blah",
// and another dozen or so attributes here, followed by:
answers:[
{
email: "test@test.com,
values:[
{value: 1, label: "test1"},
{value: 2, label: "test2"}
]
},
{
email: "someone@somewhere.com,
values:[
{value: 6, label: "test1"},
{value: 1, label: "test2"}
]
}
]
}
Say I use aggregate to unwind both answers
and answers.values
like so:
db.participants.aggregate(
{$unwind: "$answers"},
{$unwind: "$answers.values"}
);
I assume it would create a fairly large result set in memory since it would essentially be replicating the parent object 17,000 * # of answers * # of values
times.
I have been testing a query that does something similar on a development environment and the performance of the query itself is fine, but I'm wondering if I should be concerned about running this on a production environment where the unwound result set could potentially eat up a lot of memory. Mongo's documentation on $unwind goes into how it works, but does not discuss potential performance problems.
Should I be worried about doing this on a production system? Will it slow down other queries against the db?
It is always a good idea to be cognizant of memory resources when $unwind
ing because of the replication of data that occurs.
Using $match
to narrow down the results to the specific documents you are looking for is of course one way to reduce the amount of memory necessary to hold the returned data.
Another way to reduce the memory footprint is with $project
. $project
allows you to re-organize the documents in the pipeline so that you only return the elements in which you are interested.
To use your example,
{
someInfo: "blah blah blah",
answers: [
{
email: "test@test.com",
values: [
{value: 1, label: "test1"},
{value: 2, label: "test2"}
]
},
{
email: "someone@somewhere.com",
values: [
{value: 6, label: "test1"},
{value: 1, label: "test2"}
]
}
]
}
With
db.collection.aggregate([{ $match: { <element>: <value> }}, { $project: { _id: 0, answers: 1}}])
will remove the someInfo
and other attributes you may not be interested in. Then you could $project
again after unwinding...
db.collection.aggregate([
{ $match: { <element>: <value> }},
{ $project: { _id: 0, answers: 1}},
{ $unwind: "$answers"},
{ $unwind: "$answers.tags"},
{ $project: { e: "$answers.email", v: "$answers.values"}}
])
will return fairly compact results like:
{ e: "test@test.com", v: { value: 1, label: "test1" } }
{ e: "test@test.com", v: { value: 2, label: "test2" } }
{ e: "someone@somewhere.com", v: { value: 6, label: "test1" } }
{ e: "someone@somewhere.com", v: { value: 1, label: "test2" } }
Although the single letter attribute names reduce human-readability, it does cut down on the size of the data that is inflated by lengthy repeated attribute names.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.