
Cloud Function scheduler with memory allocation management

I'm trying to create a scheduled function on Firebase Cloud Functions that calls third-party APIs. Because the data collected through the third-party API and passed to this scheduled function is huge, it fails with `Function invocation was interrupted. Error: memory limit exceeded.`

I have written this index.js (below), but I'm still looking for the right way to handle output data of this size inside the scheduler function.

index.js

const firebaseAdmin = require("firebase-admin");
const firebaseFunctions = require("firebase-functions");
firebaseAdmin.initializeApp();
const fireStore = firebaseAdmin.firestore();
const express = require("express");
const axios = require("axios");
const cors = require("cors");
const serviceToken = "SERVICE-TOKEN";
const serviceBaseUrl = "https://api.service.com";

const app = express();
app.use(cors());

const getAllExamples = async () => {
    const url = `${serviceBaseUrl}/examples?token=${serviceToken}`;
    try {
        const res = await axios.get(url);
        console.log("Data fetched!");
        return res.data;
    } catch (err) {
        // Rethrow so callers don't mistake the error object for data
        console.log("Data not fetched: ", err);
        throw err;
    }
};

const setExample = async (documentId, dataObject) => {
    return fireStore.collection("examples").doc(documentId).set(dataObject).then(() => {
        console.log("Document written!");
    }).catch((err) => {
        console.log("Document not written: ", err);
    });
}

module.exports.updateExamplesRoutinely = firebaseFunctions.pubsub.schedule("0 0 * * *").timeZone("America/Los_Angeles").onRun(async (context) => {
    const examples = await getAllExamples(); // Returns a large array of 10,000+ example objects
    const promises = [];
    for (const example of examples) {
        if (example && example.id) promises.push(setExample(example.id, example));
    }
    return Promise.all(promises);
});

Firebase's official documentation simply shows how to set the timeout and memory allocation manually, as below, and I'm looking for how to incorporate that into the scheduler function above.

exports.convertLargeFile = functions
    .runWith({
      // Ensure the function has enough memory and time
      // to process large files
      timeoutSeconds: 300,
      memory: "1GB",
    })
    .storage.object()
    .onFinalize((object) => {
      // Do some complicated things that take a lot of memory and time
    });


You should do as follows:

module.exports.updateExamplesRoutinely = firebaseFunctions
    .runWith({
        timeoutSeconds: 540,
        memory: "8GB",
    })
    .pubsub
    .schedule("0 0 * * *")
    .timeZone("America/Los_Angeles")
    .onRun(async (context) => { ... });

However, you may still encounter the same error if your Cloud Function processes a huge number of "examples". As you mentioned in a comment on the other answer, it is advisable to cut the work into chunks.

How to do that? It depends heavily on your specific case (e.g., do you process 10,000+ examples on every run, or is this a one-off job to digest a backlog?).

You could process only a couple of thousand docs per run of the scheduled function and schedule it to run every xx seconds. Or you could distribute the work among several instances by using Pub/Sub-triggered versions of your Cloud Function.
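As a minimal sketch of the chunking idea (the names `chunkArray` and `saveExamplesInChunks` are illustrative, and `fireStore` is assumed to be the Firestore instance from the question's index.js): split the examples into slices of at most 500, which is Firestore's limit for a single batched write, and commit the slices sequentially so only one slice's worth of write operations is in flight at a time.

```javascript
// Split an array into slices of at most `size` elements.
// Pure helper with no Firebase dependency.
const chunkArray = (items, size) => {
    const chunks = [];
    for (let i = 0; i < items.length; i += size) {
        chunks.push(items.slice(i, i + size));
    }
    return chunks;
};

// Sketch: commit each slice with a Firestore batched write
// (at most 500 operations per batch). `fireStore` is the
// firebaseAdmin.firestore() instance from the question.
const saveExamplesInChunks = async (examples) => {
    for (const chunk of chunkArray(examples, 500)) {
        const batch = fireStore.batch();
        for (const example of chunk) {
            if (example && example.id) {
                batch.set(fireStore.collection("examples").doc(example.id), example);
            }
        }
        await batch.commit(); // sequential commits keep peak memory low
    }
};
```

Batched writes also replace 10,000+ individual `set()` round trips with ~20 commits, which helps both runtime and memory pressure; if even that is too much for one invocation, the same `chunkArray` output can instead be published as Pub/Sub messages and handled by a separate triggered function.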
