
Do multiple $project stages in a MongoDB aggregation affect performance?

TL;DR

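We are adding $project stages between the $match and $lookup stages in order to filter out unnecessary data or to alias fields. Those $project stages make the query easier to read while debugging, but do they affect performance in any way when every collection involved in the query holds a large number of documents?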

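Question in detail

For example, I have two collections, schools and students, as shown below:

Yes, I know the schema design is bad! MongoDB says to put everything in the same collection to avoid relations, but let's continue with this approach for now.

schools collection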

{
    "_id": ObjectId("5c04dca4289c601a393d9db8"),
    "name": "First School Name",
    "address": "1 xyz",
    "status": 1,
    // Many more fields
},
{
    "_id": ObjectId("5c04dca4289c601a393d9db9"),
    "name": "Second School Name",
    "address": "2 xyz",
    "status": 1,
    // Many more fields
},
// Many more Schools

students collection

{
    "_id": ObjectId("5c04dcd5289c601a393d9dbb"),
    "name": "One Student Name",
    "school_id": ObjectId("5c04dca4289c601a393d9db8"),
    "address": "1 abc",
    "Gender": "Male",
    // Many more fields
},
{
    "_id": ObjectId("5c04dcd5289c601a393d9dbc"),
    "name": "Second Student Name",
    "school_id": ObjectId("5c04dca4289c601a393d9db9"),
    "address": "1 abc",
    "Gender": "Male",
    // Many more fields
},
// Many more students

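Now, in my query shown below, I have a $project stage after $match and just before $lookup. Is this $project stage necessary? Will it affect performance when all the collections involved in the query hold a huge number of documents?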

db.students.aggregate([
    {
        $match: {
            "Gender": "Male"
        }
    },
    // 1. Below $project stage is not necessary apart from filtering out and aliasing.
    // 2. Will this stage affect performance when there are huge number of documents?
    {
        $project: {
            "_id": 0,
            "student_id": "$_id",
            "student_name": "$name",
            "school_id": 1
        }
    },
    {
        $lookup: {
            from: "schools",
            let: {
                "school_id": "$school_id"
            },
            pipeline: [
                {
                    $match: {
                        "status": 1,
                        $expr: {
                            $eq: ["$_id", "$$school_id"]
                        }
                    }
                },
                {
                    $project: {
                        "_id": 0,
                        "name": 1
                    }
                }
            ],
            as: "school"
        }
    },
    {
        $unwind: "$school"
    }
]);

Read this: https://docs.mongodb.com/v3.2/core/aggregation-pipeline-optimization/

The part relevant to your specific case is: "The aggregation pipeline can determine if it requires only a subset of the fields in the documents to obtain the results. If so, the pipeline will only use those required fields, reducing the amount of data passing through the pipeline."

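So some optimization already happens behind the scenes. You can run your aggregation with the explain option to see exactly what MongoDB does when it tries to optimize your pipeline.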
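As a rough sketch of the explain suggestion, the snippet below builds the pipeline from the question with and without the intermediate $project stage. When it runs in the mongo shell, where a global `db` exists, it prints the explain output for both variants so they can be compared. The names `buildPipeline`, `withProject`, and `withoutProject` are made up for this illustration; the `students` and `schools` collections are the ones from the question. The shape of the explain output differs between server versions, so it is only printed here, not parsed.

```javascript
// Hypothetical helper: builds the question's pipeline, optionally
// including the intermediate $project stage.
function buildPipeline(includeProject) {
    var stages = [{ $match: { "Gender": "Male" } }];
    if (includeProject) {
        stages.push({
            $project: {
                "_id": 0,
                "student_id": "$_id",
                "student_name": "$name",
                "school_id": 1
            }
        });
    }
    stages.push({
        $lookup: {
            from: "schools",
            let: { "school_id": "$school_id" },
            pipeline: [
                { $match: { "status": 1, $expr: { $eq: ["$_id", "$$school_id"] } } },
                { $project: { "_id": 0, "name": 1 } }
            ],
            as: "school"
        }
    });
    stages.push({ $unwind: "$school" });
    return stages;
}

var withProject = buildPipeline(true);
var withoutProject = buildPipeline(false);

// Only talk to the server when a mongo shell `db` handle is available.
if (typeof db !== "undefined") {
    printjson(db.students.explain("executionStats").aggregate(withProject));
    printjson(db.students.explain("executionStats").aggregate(withoutProject));
}
```

Comparing the two outputs on your own data is more reliable than reasoning about the optimizer in the abstract, since the result depends on document size, indexes, and server version.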

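I think what you are doing should actually help performance, because you are reducing the amount of data flowing through the pipeline.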
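To make "reducing the amount of data flowing through" a little more concrete, here is a small plain-JavaScript illustration (not MongoDB code) that applies the same field selection as the question's $project stage to one sample student and compares the serialized sizes. The `projectStudent` helper and the extra `notes` field are invented for the example, and ObjectIds are written as plain strings; real documents and their BSON sizes will differ.

```javascript
// Hypothetical stand-in for the $project stage: keep only what the
// later $lookup and the final result need.
function projectStudent(doc) {
    return {
        student_id: doc._id,
        student_name: doc.name,
        school_id: doc.school_id
    };
}

// Sample student; `notes` stands in for the "many more fields"
// mentioned in the question.
var student = {
    _id: "5c04dcd5289c601a393d9dbb",
    name: "One Student Name",
    school_id: "5c04dca4289c601a393d9db8",
    address: "1 abc",
    Gender: "Male",
    notes: "placeholder for many more fields"
};

var projected = projectStudent(student);
var fullSize = JSON.stringify(student).length;
var projectedSize = JSON.stringify(projected).length;

console.log("full:", fullSize, "projected:", projectedSize);
```

The wider the original documents are, the larger the gap between the two sizes, which is the saving the answer is pointing at.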
