
Cloud Function scheduler with memory allocation management

I'm trying to create a scheduled function on Firebase Cloud Functions that calls third-party APIs. Because the data collected through the third-party API and passed to this scheduled function is huge, it fails with `Function invocation was interrupted. Error: memory limit exceeded.`

I have written this index.js (below), but I'm still looking for the right way to handle output data of this size inside the scheduler function.

index.js

const firebaseAdmin = require("firebase-admin");
const firebaseFunctions = require("firebase-functions");
firebaseAdmin.initializeApp();
const fireStore = firebaseAdmin.firestore();
const express = require("express");
const axios = require("axios");
const cors = require("cors");
const serviceToken = "SERVICE-TOKEN";
const serviceBaseUrl = "https://api.service.com";

const app = express();
app.use(cors());

const getAllExamples = async () => {
    const url = `${serviceBaseUrl}/examples?token=${serviceToken}`;
    try {
        const res = await axios.get(url);
        console.log("Data fetched!");
        return res.data;
    } catch (err) {
        // Rethrow so callers don't mistake the error object for data
        console.log("Data not fetched: ", err);
        throw err;
    }
};

const setExample = async (documentId, dataObject) => {
    return fireStore.collection("examples").doc(documentId).set(dataObject).then(() => {
        console.log("Document written!");
    }).catch((err) => {
        console.log("Document not written: ", err);
    });
}

module.exports.updateExamplesRoutinely = firebaseFunctions.pubsub.schedule("0 0 * * *").timeZone("America/Los_Angeles").onRun(async (context) => {
    const examples = await getAllExamples(); // Returns a large array of 10,000+ example objects
    const promises = [];
    for (const example of examples) {
        if (example && example.id) promises.push(setExample(example.id, example));
    }
    return Promise.all(promises);
});

Firebase's official documentation simply shows how to set the timeout and memory allocation manually, as below, and I'm looking for how to incorporate that into the scheduler function above.

exports.convertLargeFile = functions
    .runWith({
      // Ensure the function has enough memory and time
      // to process large files
      timeoutSeconds: 300,
      memory: "1GB",
    })
    .storage.object()
    .onFinalize((object) => {
      // Do some complicated things that take a lot of memory and time
    });


You should do as follows:

module.exports.updateExamplesRoutinely = firebaseFunctions
    .runWith({
        timeoutSeconds: 540,
        memory: "8GB",
    })
    .pubsub
    .schedule("0 0 * * *")
    .timeZone("America/Los_Angeles")
    .onRun(async (context) => { ... });

However, you may still encounter the same error if your Cloud Function processes a huge number of "examples". As you mentioned in a comment on the other answer, it is advisable to cut the work into chunks.

How to do that? It depends heavily on your specific case (e.g., do you process 10,000+ examples on every run, or is this a one-off job to digest a backlog?).

You could process only a couple of thousand docs per run of the scheduled function and schedule it to run every xx seconds. Or you could distribute the work among several instances by using Pub/Sub-triggered versions of your Cloud Function.
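As a minimal sketch of the chunking idea (the names `chunkArray` and `saveExamplesInChunks` are illustrative, and `fireStore` is assumed to be the Firestore instance from the question's index.js): split the examples into slices of at most 500, which is Firestore's limit for a single batched write, and commit the slices sequentially so only one slice's worth of write operations is in flight at a time.

```javascript
// Split an array into slices of at most `size` elements.
// Pure helper with no Firebase dependency.
const chunkArray = (items, size) => {
    const chunks = [];
    for (let i = 0; i < items.length; i += size) {
        chunks.push(items.slice(i, i + size));
    }
    return chunks;
};

// Sketch: commit each slice with a Firestore batched write
// (at most 500 operations per batch). `fireStore` is the
// firebaseAdmin.firestore() instance from the question.
const saveExamplesInChunks = async (examples) => {
    for (const chunk of chunkArray(examples, 500)) {
        const batch = fireStore.batch();
        for (const example of chunk) {
            if (example && example.id) {
                batch.set(fireStore.collection("examples").doc(example.id), example);
            }
        }
        await batch.commit(); // sequential commits keep peak memory low
    }
};
```

Batched writes also replace 10,000+ individual `set()` round trips with ~20 commits, which helps both runtime and memory pressure; if even that is too much for one invocation, the same `chunkArray` output can instead be published as Pub/Sub messages and handled by a separate triggered function.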
