
Mongoose subdocuments vs nested schema

I'm curious about the pros and cons of using subdocuments versus a deeper layer in my main schema:

var subDoc = new Schema({
  name: String
});

var mainDoc = new Schema({
  names: [subDoc]
});

or

var mainDoc = new Schema({
  names: [{
    name: String
  }]
});

I'm currently using subdocs everywhere, but I am wondering primarily about performance or querying issues I might encounter.

According to the docs, it's exactly the same. However, using a Schema would add an _id field as well (as long as you don't have that disabled), and presumably uses some more resources for tracking subdocs.

Alternate declaration syntax

New in v3: If you don't need access to the sub-document schema instance, you may also declare sub-docs by simply passing an object literal [...]

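If your schemas are reused in various parts of your model, it can be useful to define separate schemas for the subdocuments so that you don't have to repeat yourself.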

You should use embedded documents if they are static, or if there are no more than a few hundred of them, because of the performance impact. I looked into this issue a while ago. More recently, Asya Kamsky, who works as a solutions architect for MongoDB, wrote an article about using subdocuments.

I hope this helps anyone who is looking for solutions or best practices.

Original post: http://askasya.com/post/largeembeddedarrays. Her Stack Overflow profile is at https://stackoverflow.com/users/431012/asya-kamsky

First of all, we have to consider why we would want to do such a thing. Normally, I would advise people to embed things that they always want to get back when they are fetching this document. The flip side of this is that you don't want to embed things in the document that you don't want to get back with it.

If you embed activity I perform into the document, it'll work great at first because all of my activity is right there, and with a single read you can get back everything you might want to show me: "you recently clicked on this and here are your last two comments". But what happens after six months go by, when I don't care about things I did a long time ago and you don't want to show them to me unless I specifically go looking for some old activity?

First, you'll end up returning a bigger and bigger document and caring about a smaller and smaller portion of it. You can use projection to return only some of the array, but the real pain is that the document on disk will get bigger, and it will still all be read even if you're only going to return part of it to the end user. And since my activity is not going to stop as long as I'm active, the document will continue growing and growing.
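To make the growth problem concrete, here is a small plain-JavaScript sketch (no MongoDB or Mongoose involved). It assumes a hypothetical user document with an embedded activity array and uses the byte length of the JSON serialization as a rough stand-in for the stored document size; real BSON sizes will differ. The names makeUser, approxSize, and recentActivity are invented for illustration.

```javascript
// Hypothetical user document with an embedded, ever-growing activity array.
function makeUser(activityCount) {
  const activity = [];
  for (let i = 0; i < activityCount; i++) {
    activity.push({ type: 'click', target: `/page/${i}`, at: i });
  }
  return { _id: 'user-1', name: 'Example User', activity };
}

// Rough size estimate: bytes of the JSON form (an approximation, not BSON).
function approxSize(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// What the application actually shows: only the last n entries.
function recentActivity(doc, n) {
  return doc.activity.slice(-n);
}

for (const count of [10, 1000, 100000]) {
  const user = makeUser(count);
  const shown = recentActivity(user, 2);
  console.log(
    `${count} entries: ~${approxSize(user)} bytes stored, ${shown.length} entries shown`
  );
}
```

The stored size keeps climbing with the number of entries, while the part the user sees stays at two entries.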

The most obvious problem with this is that eventually you'll hit the 16MB document limit, but that's not at all what you should be concerned about. A document that continuously grows will incur a higher and higher cost every time it has to be relocated on disk, and even if you take steps to mitigate the effects of fragmentation, your writes will overall be unnecessarily long, impacting the overall performance of your entire application.
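As a back-of-the-envelope illustration of how far away the 16MB limit usually is, the sketch below estimates how many embedded entries fit in one document. It takes 16MB as 16 * 1024 * 1024 bytes; the base document size, per-entry size, and daily entry rate are made-up assumptions, and entriesUntilLimit and daysUntilLimit are hypothetical helper names.

```javascript
// Document size limit, taken here as 16 * 1024 * 1024 bytes.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// How many array entries fit before the document reaches the limit.
// baseBytes: size of the document without the array (assumed).
// bytesPerEntry: average size of one embedded entry (assumed).
function entriesUntilLimit(baseBytes, bytesPerEntry) {
  return Math.floor((MAX_DOC_BYTES - baseBytes) / bytesPerEntry);
}

// How many days of activity that corresponds to at a steady rate.
function daysUntilLimit(baseBytes, bytesPerEntry, entriesPerDay) {
  return Math.floor(entriesUntilLimit(baseBytes, bytesPerEntry) / entriesPerDay);
}

// Example with assumed numbers: 1 KB base document, 200-byte entries,
// 100 new entries per day.
console.log(entriesUntilLimit(1024, 200));
console.log(daysUntilLimit(1024, 200, 100));
```

With numbers like these, the hard limit is years away, which is consistent with the point above: the cost of writing an ever-larger document is paid long before the limit is reached.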

There is one more thing that you can do that will completely kill your application's performance, and that's to index this ever-increasing array. What that means is that every single time the document with this array is relocated, the number of index entries that need to be updated is directly proportional to the number of indexed values in that document, and the bigger the array, the larger that number will be.
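The proportionality described above can be pictured with a toy model in plain JavaScript. This is not how MongoDB implements its indexes; it only assumes, following the description above, that an index on an array field holds one entry per distinct indexed value in the document, and that all of those entries must be updated when the document moves. The function names are hypothetical.

```javascript
// Toy model of an index on an array field: one entry per distinct
// indexed value, each pointing back at the owning document.
function buildIndexEntries(doc, arrayField, keyField) {
  const distinctKeys = new Set(doc[arrayField].map((item) => item[keyField]));
  return [...distinctKeys].map((key) => ({ key, docId: doc._id }));
}

// In this model, relocating the document touches every one of its entries.
function entriesTouchedOnRelocation(doc, arrayField, keyField) {
  return buildIndexEntries(doc, arrayField, keyField).length;
}

// Hypothetical documents whose "activity" arrays differ only in length.
function makeDoc(n) {
  const activity = Array.from({ length: n }, (_, i) => ({ target: `/page/${i}` }));
  return { _id: 'user-1', activity };
}

for (const n of [3, 500, 50000]) {
  console.log(
    `${n} array items -> ${entriesTouchedOnRelocation(makeDoc(n), 'activity', 'target')} index entries to update`
  );
}
```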

I don't want this to scare you from using arrays when they are a good fit for the data model - they are a powerful feature of the document database data model, but like all powerful tools, they need to be used in the right circumstances and with care.

Basically, create a variable nestedDoc and use it in the parent schema as names: [nestedDoc]

Simple Version:

var nestedDoc = new Schema({
  name: String
});

var mainDoc = new Schema({
  names: [nestedDoc]
});

JSON Example

{
    "_id" : ObjectId("57c88bf5818e70007dc72e85"),
    "name" : "Corinthia Hotel Budapest",
    "stars" : 5,
    "description" : "The 5-star Corinthia Hotel Budapest on the Grand Boulevard offers free access to its Royal Spa",
    "photos" : [
        "/photos/hotel/corinthiahotelbudapest/1.jpg",
        "/photos/hotel/corinthiahotelbudapest/2.jpg"
    ],
    "currency" : "HUF",
    "rooms" : [
        {
            "type" : "Superior Double or Twin Room",
            "number" : 20,
            "description" : "These are some great rooms",
            "photos" : [
                "/photos/room/corinthiahotelbudapest/2.jpg",
                "/photos/room/corinthiahotelbudapest/5.jpg"
            ],
            "price" : 73000
        },
        {
            "type" : "Deluxe Double Room",
            "number" : 50,
            "description" : "These are amazing rooms",
            "photos" : [
                "/photos/room/corinthiahotelbudapest/4.jpg",
                "/photos/room/corinthiahotelbudapest/6.jpg"
            ],
            "price" : 92000
        },
        {
            "type" : "Executive Double Room",
            "number" : 25,
            "description" : "These are amazing rooms",
            "photos" : [
                "/photos/room/corinthiahotelbudapest/4.jpg",
                "/photos/room/corinthiahotelbudapest/6.jpg"
            ],
            "price" : 112000
        }
    ],
    "reviews" : [
        {
            "name" : "Tamas",
            "id" : "/user/tamas.json",
            "review" : "Great hotel",
            "rating" : 4
        }
    ],
    "services" : [
        "Room service",
        "Airport shuttle (surcharge)",
        "24-hour front desk",
        "Currency exchange",
        "Tour desk"
    ]
}


I think this is handled elsewhere by multiple posts on SO.

Just a few:

The big key is that there is no single answer here, only a set of rather complex trade-offs.

There are some differences between the two:

  • Using a nested schema is helpful for validation.

  • A nested schema can be reused in other schemas.

  • A nested schema adds an '_id' field to the subdocument unless you set "_id: false".
