
How to fetch all the documents of a collection and its sub collection efficiently in Cloud Firestore

TL;DR: Fetching a small number of documents takes a long time.

Scenario:
I have a collection for each account, and each account contains a projects sub collection and a tasks sub collection. Each document in the tasks sub collection can further contain checklists in a checkLists sub collection.

Note:

  • Projects can contain tasks, which can in turn contain checklists.
  • Tasks can be created standalone, i.e. a task needn't always be part of a project.
  • projects and tasks are both top-level sub collections, and the checkLists sub collection is nested inside every task.

Illustration:

someTopLevelDB
   |
   |____ accountId1
   |         |______projects
   |         |         |_______ projectId1
   |         |
   |         |______tasks
   |                  |________taskId1 (belongs to projectId1)
   |                  |           |
   |                  |           |________checkLists
   |                  |                         |
   |                  |                         |_____checkListId1
   |                  |
   |                  |________taskId2 (standalone)

Use case: When a user clicks duplicate project (from the UI), I have to create a replica of the entire project, i.e. all tasks, checklists, etc.

Code: The process was slow, and when I profiled the code, this snippet took a lot of time to execute. The snippet fetches all the tasks and their checklists.

let db = admin.firestore();

function getTasks(accountId) {
    return db.collection('someTopLevelDB')
        .doc(accountId)
        .collection('tasks')
        .where('deleted', '==', false)
        .get();
}


function getCheckLists(accountId, taskId) {
    return db.collection('someTopLevelDB')
        .doc(accountId)
        .collection('tasks')
        .doc(taskId)
        .collection('checkLists')
        .where('deleted', '==', false)
        .get();
}


async function getTasksAndCheckLists(accountId) {
    try {
        let records = { tasks: [], checkLists: [] };

        // prepare tasks details
        const tasks = await getTasks(accountId);
        const tasksQueryDocumentSnapshot = tasks.docs;
        for (let taskDocumentSnapshot of tasksQueryDocumentSnapshot) {
            const taskId = taskDocumentSnapshot.id;
            const taskData = taskDocumentSnapshot.data();
            const taskDetails = {
                id: taskId,
                ...taskData
            };
            records.tasks.push(taskDetails);

            // prepare check list details
            checkListQueryDocumentSnapshot = (await getCheckLists(accountId, taskId)).docs;
            for (let checkListDocumentSnapshot of checkListQueryDocumentSnapshot) {
                const checkListId = checkListDocumentSnapshot.id;
                const checkListData = checkListDocumentSnapshot.data();
                const checkListDetails = {
                    id: checkListId,
                    ...checkListData
                };
                records.checkLists.push(checkListDetails);
            }
        }
        console.log(`successfully fetched ${records.tasks.length} tasks and ${records.checkLists.length} checklists`);
        return records;
    } catch (error) {
        console.log('Error fetching docs ====>', error);
    }
}




// Call the function to fetch records
getTasksAndCheckLists('someAccountId')
    .then(result => {
        console.log(result);
        return true;
    })
    .catch(error => {
        console.error('Error fetching docs ===>', error);
        return false;
    });

Execution stats:
successfully fetched 627 tasks and 51 checklists in 220.532 seconds

I came to the conclusion that retrieving the checklists was slowing the entire process down, as retrieval of the tasks was fairly quick.

So my questions are as follows:

  • Is there any way to optimize retrieving the documents in the above code?
  • Is there any way to retrieve the documents of the sub collection more quickly by remodelling the data, using collectionGroup queries, etc.?

Thanks.

The problem is caused by using await inside of your for loop here:

checkListQueryDocumentSnapshot = (await getCheckLists(accountId, taskId)).docs;

This causes your for loop to stall for as long as it takes to get the check lists of that particular task, so the requests run one after another and their latencies add up.

The way to avoid this is to process the check lists asynchronously using Promise chaining. As you loop over the tasks, you create the request for that task's check lists, add a listener to its result, send it, and immediately move on to the next task.
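To make the difference concrete, here is a minimal sketch that compares the two patterns using a simulated request. The helper fakeFetch and the 50 ms latency are assumptions made up for illustration; they stand in for getCheckLists() and are not part of Firestore.

```javascript
// Simulated request: resolves after `ms` milliseconds. This is a hypothetical
// stand-in for getCheckLists(), not a Firestore API.
function fakeFetch(taskId, ms) {
    return new Promise((resolve) => {
        setTimeout(() => resolve({ taskId, checkLists: [] }), ms);
    });
}

// One request at a time: total time is roughly the sum of all latencies.
async function fetchSequentially(taskIds, ms) {
    const results = [];
    for (const taskId of taskIds) {
        results.push(await fakeFetch(taskId, ms));
    }
    return results;
}

// All requests in flight at once: total time is roughly the slowest latency.
function fetchConcurrently(taskIds, ms) {
    return Promise.all(taskIds.map((taskId) => fakeFetch(taskId, ms)));
}

// Runs `fn`, logs how long it took, and returns the elapsed milliseconds.
async function timeIt(label, fn) {
    const start = Date.now();
    const results = await fn();
    const elapsed = Date.now() - start;
    console.log(`${label}: ${results.length} results in ${elapsed} ms`);
    return elapsed;
}

async function main() {
    const taskIds = ['taskA', 'taskB', 'taskC', 'taskD', 'taskE'];
    const sequentialMs = await timeIt('sequential', () => fetchSequentially(taskIds, 50));
    const concurrentMs = await timeIt('concurrent', () => fetchConcurrently(taskIds, 50));
    return { sequentialMs, concurrentMs };
}

main().catch(console.error);
```

With real Firestore queries the exact numbers will differ, but the shape is the same: the sequential version pays the round-trip latency once per task, while the concurrent version pays it roughly once overall.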

With your data structure, the check lists are related to their specific task on the server, but they aren't tied to it in your code above. Working asynchronously with the same data structure would mean the check lists end up out of order relative to your tasks if you just use a standard array with push() (e.g. task B's checklist fetch may finish before task A's). To fix this, in the code below, I have nested the check lists under the taskDetails object so they are still linked.
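As a side note, the ordering problem comes from pushing results as they arrive. Promise.all itself resolves with results in the same order as the promises passed to it, regardless of which one settled first, so pairing by index is another way to keep tasks and check lists linked. The sketch below illustrates this with a simulated fetch; fakeGetCheckLists and its delays are hypothetical and only stand in for getCheckLists().

```javascript
// Hypothetical stand-in for getCheckLists(): resolves after `delayMs`
// with a one-element list naming the task it belongs to.
function fakeGetCheckLists(taskId, delayMs) {
    return new Promise((resolve) => {
        setTimeout(() => resolve([`${taskId}-checkList`]), delayMs);
    });
}

async function pairTasksWithCheckLists(taskIds) {
    // Start every fetch immediately. The first task gets the longest delay,
    // so responses arrive in reverse order.
    const pending = taskIds.map((taskId, index) =>
        fakeGetCheckLists(taskId, (taskIds.length - index) * 10)
    );

    // Promise.all preserves input order, not completion order.
    const checkListsPerTask = await Promise.all(pending);

    return taskIds.map((taskId, index) => ({
        id: taskId,
        checkLists: checkListsPerTask[index],
    }));
}

pairTasksWithCheckLists(['taskA', 'taskB', 'taskC'])
    .then((paired) => console.log(JSON.stringify(paired)))
    .catch(console.error);
```

Either approach works; nesting inside taskDetails, as done below, keeps the link explicit in the returned data.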

async function getTasksAndCheckLists(accountId) {
    try {
        let taskDetailsArray = [];

        // fetch task details
        const tasks = await getTasks(accountId);

        // init Promise holder
        const getCheckListsPromises = [];

        tasks.forEach((taskDocumentSnapshot) => {
            const taskId = taskDocumentSnapshot.id;
            const taskData = taskDocumentSnapshot.data();
            const taskDetails = {
                id: taskId,
                checkLists: [], // for storing this task's checklists
                ...taskData
            };
            taskDetailsArray.push(taskDetails);

            // asynchronously get check lists for this task
            let getCheckListPromise = getCheckLists(accountId, taskId)
                .then((checkListQuerySnapshot) => {
                    checkListQuerySnapshot.forEach((checkListDocumentSnapshot) => {
                        const checkListId = checkListDocumentSnapshot.id;
                        const checkListData = checkListDocumentSnapshot.data();
                        const checkListDetails = {
                            id: checkListId,
                            ...checkListData
                        };

                        taskDetails.checkLists.push(checkListDetails);
                    });
                });

            // add this task to the promise holder
            getCheckListsPromises.push(getCheckListPromise);
        });

        // wait for all check list fetches - this is an all-or-nothing operation
        await Promise.all(getCheckListsPromises);

        // calculate the checklist count for all tasks
        let checkListsCount = taskDetailsArray.reduce((acc, v) => acc+v.checkLists.length, 0);

        console.log(`successfully fetched ${taskDetailsArray.length} tasks and ${checkListsCount} checklists`);
        return taskDetailsArray;
    } catch (error) {
        console.log('Error fetching docs ====>', error);
    }
}

With these changes, you should see your function's running time drop considerably. Based on the timings you have provided, I'd guess it would drop to roughly 2-3 seconds.
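On the second question, a collection group query could fetch every check list for an account in a single request instead of one request per task. The following is only a sketch under stated assumptions: it assumes each checkLists document stores an accountId field (the data model above does not show one, so existing documents would need to be updated), and it takes the Firestore instance as a db parameter. Note that collectionGroup('checkLists') matches every collection named checkLists anywhere in the database, and a filtered collection group query generally needs an index with collection group scope.

```javascript
// Assumption: every checkLists document carries an `accountId` field.
// The original data model does not show one, so this is hypothetical.
async function getAllCheckListsForAccount(db, accountId) {
    const snapshot = await db.collectionGroup('checkLists')
        .where('accountId', '==', accountId)
        .where('deleted', '==', false)
        .get();

    // Group check lists by their parent task. For a document at
    // .../tasks/{taskId}/checkLists/{checkListId}, ref.parent is the
    // checkLists collection and ref.parent.parent is the task document.
    const checkListsByTaskId = {};
    snapshot.forEach((doc) => {
        const taskId = doc.ref.parent.parent.id;
        if (!checkListsByTaskId[taskId]) {
            checkListsByTaskId[taskId] = [];
        }
        checkListsByTaskId[taskId].push({ id: doc.id, ...doc.data() });
    });
    return checkListsByTaskId;
}
```

Called as getAllCheckListsForAccount(admin.firestore(), 'someAccountId'), this would return an object keyed by task ID, which can then be merged into the task details. Whether it is faster than the concurrent per-task fetch depends on the data, so it is worth measuring both.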


