
How to batch.commit() a massive number of Firestore documents?

I have approximately 31,000 documents to write with batch.commit().

I'm using Blaze plan.

A batch is limited to 500 writes, so I split the data into batches of 490 documents each. That gives me 65 batches.

Here is my Firebase function code:

'use strict';

const express = require('express');
const cors = require('cors');
const axios = require('axios');

// Firebase init
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
const firestore = admin.firestore();


const echo = express().use(cors()).post("/", (request, response) => {
    axios.get('https://example.com/api').then(result => {

        const data = result.data.items;
        let batchArray = [];
        let batchIndex = 0;
        let operationCounter = 0;

        //initiate batch;
        batchArray.push(firestore.batch());

        data.forEach(item => {

            const collectionRef = firestore.collection('items').doc();

            const row = {
                itemName: item.name,
                // ... and so on...
            };


            batchArray[batchIndex].set(collectionRef, row);
            operationCounter++;

            if (operationCounter === 490) {
                batchArray.push(firestore.batch());
                functions.logger.info(`Batch index added.`, batchIndex);
                batchIndex++;
                operationCounter = 0;
            }

        });

        /*  
        This code wrote only 140 documents.
        Throws Error: 4 DEADLINE_EXCEEDED: Deadline exceeded

        
        batchArray.forEach(batch => {
            batch.commit()
                .then(result=> functions.logger.info("batch.commit() succeeded:", result) )
                .catch(error=>functions.logger.info("batch.commit() failed:", error));
        })

        */

        /* 
        This code wrote only 630 documents 
        Throws Error: 4 DEADLINE_EXCEEDED: Deadline exceeded

        Promise.all([
            batchArray.forEach(batch => {
                setTimeout(
                    ()=>batch.commit().then(result=> functions.logger.info("batch.commit() succeeded:", result) ).catch(error=>functions.logger.info("batch.commit() failed:", error)),
                    1000);
            })
        ]).catch(error => functions.logger.error("batch.commit() error:", error));

        */
        // This code wrote 2100 documents.
        return Promise.all([
            batchArray.forEach(batch => {
                batch.commit()
                    .then(result => functions.logger.info("batch.commit() succeeded:", result))
                    .catch(error => functions.logger.warn("batch.commit() failed:", error))
            })
        ]).then(result => {
            functions.logger.info("all batches succeeded:", result);
            return response.status(200).json({ "status": "success", "data": `success` });
        })
        .catch(error => {
            functions.logger.warn("all batches failed:", error);
            return response.status(200).json({ "status": "error", "data": `${error}` });
        });


    }).catch(error => {
        functions.logger.error("HTTPS Response Error", error);
        return response.status(204).json({ "status": "error", "data": `${error}` });

    });
});


exports.echo = functions.runWith({
    timeoutSeconds: 60 * 9,
}).https.onRequest(echo);

I got a "success" response after a few seconds, but the inserted Firestore data appeared only after 7 minutes, and the Cloud Functions log shows errors, with only 5 of the 65 batches succeeding.

The thrown error is:

batch.commit() failed: { Error: 4 DEADLINE_EXCEEDED: Deadline exceeded 
at Object.callErrorFromStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call.js:31:26) 
at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client.js:176:52) 
at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342:141) 
at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181) 
at process.nextTick (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:124:78) 
at process._tickCallback (internal/process/next_tick.js:61:11) 
Caused by: Error at WriteBatch.commit (/workspace/node_modules/@google-cloud/firestore/build/src/write-batch.js:419:23) 
at Promise.all.batchArray.forEach.batch (/workspace/index.js:100:23) 
at Array.forEach (<anonymous>) at axios.get.then.result (/workspace/index.js:99:24) 
at process._tickCallback (internal/process/next_tick.js:68:7) code: 4, details: 'Deadline exceeded', metadata: Metadata { internalRepr: Map {}, options: {} }, note: 'Exception occurred in retry method that was not classified as transient' }

The error 4 DEADLINE_EXCEEDED may be related to Firestore quotas, but I don't know which limit applies to this issue.

It's most likely because the forEach where you do the commits isn't working the way you expect. Time and time again, combining promises (or await) with forEach causes problems like this. Long ago, I thought that an await inside a forEach would wait until it finished before moving on to the next item in the array, but that isn't true: it starts them all at once. I would suggest going with a traditional for loop.

Also, I would suggest not using the .then syntax. In this case, it would still run them all at once. Try using await with a traditional for loop; that should solve your issue.

Another thing: your Promise.all isn't helping here. Promise.all is for waiting on several promises that run at the same time, but because of the deadline error you need to run the commits one at a time (I know, it's a pain since you have so many).
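This may also explain why the function answered "success" within seconds: Array.prototype.forEach always returns undefined, so Promise.all([batchArray.forEach(...)]) is handed [undefined] and resolves without waiting for any commit. The sketch below uses hypothetical stand-in objects (fakeBatches) instead of real Firestore batches, so it only demonstrates the JavaScript behaviour, not Firestore itself:

```javascript
// Hypothetical stand-ins for Firestore WriteBatch objects (assumption:
// only the promise-returning commit() matters for this demonstration).
const fakeBatches = [1, 2, 3].map(id => ({
  commit: () => Promise.resolve(`batch ${id} committed`),
}));

// forEach discards whatever the callback returns and itself returns
// undefined, so wrapping it in Promise.all([...]) waits on nothing.
const forEachResult = fakeBatches.forEach(batch => {
  batch.commit();
});
console.log(forEachResult); // undefined

// map collects the promises, so the caller has something to wait on.
// Note that this still starts every commit at once.
const commitPromises = fakeBatches.map(batch => batch.commit());
console.log(commitPromises.length); // 3
```

Switching to map would make Promise.all actually wait, but all the commits would still be in flight together, which is why the sequential loop is the safer starting point.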

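// Must run inside an async function, e.g. .then(async result => { ... })
for (const batch of batchArray) {
  // Wait for each commit to finish before starting the next one
  await batch.commit();
}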
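As a slightly fuller sketch of the same loop, the helper below commits batches one after another and records failures instead of stopping at the first one. The name commitSequentially and the demoBatches stand-ins are hypothetical, not part of the Firestore SDK; real code would pass the batchArray from the question:

```javascript
// Hypothetical helper (not part of the Firestore SDK): commits batches one
// after another and records which ones failed instead of stopping early.
async function commitSequentially(batches) {
  const summary = { succeeded: 0, failed: [] };
  for (const [index, batch] of batches.entries()) {
    try {
      await batch.commit(); // the next iteration starts only after this settles
      summary.succeeded++;
    } catch (error) {
      summary.failed.push({ index, message: String(error) });
    }
  }
  return summary;
}

// Stand-in batches for a dry run; the second one simulates a failed commit.
const demoBatches = [
  { commit: () => Promise.resolve() },
  { commit: () => Promise.reject(new Error('DEADLINE_EXCEEDED')) },
  { commit: () => Promise.resolve() },
];

const demoRun = commitSequentially(demoBatches);
demoRun.then(summary => console.log(summary.succeeded, summary.failed.length)); // 2 1
```

In the question's handler, this would presumably mean making the axios callback async and awaiting commitSequentially(batchArray) before sending the response, so that "success" is only reported once every commit has settled.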

I'm not sure how many commits you can make before hitting the deadline error with the above approach, and I'm curious what happens if you do 2-3 commits at a time. Generally, though, it's best to do one at a time.
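For anyone who wants to try the "2-3 at a time" idea, here is a minimal sketch. The helper commitInGroups and the trackedBatches stand-ins are hypothetical, and the code assumes a runtime with Promise.allSettled (Node.js 12.9 or later). Whether a group size of 2 or 3 avoids DEADLINE_EXCEEDED in practice is untested here:

```javascript
// Hypothetical helper: commits batches in groups of `groupSize`, waiting for
// each group to settle before starting the next.
async function commitInGroups(batches, groupSize) {
  const results = [];
  for (let start = 0; start < batches.length; start += groupSize) {
    const group = batches.slice(start, start + groupSize);
    // allSettled never rejects, so one failed commit does not hide the others
    const settled = await Promise.allSettled(group.map(batch => batch.commit()));
    results.push(...settled);
  }
  return results;
}

// Stand-in batches that track how many commits are in flight at once.
let inFlight = 0;
let maxInFlight = 0;
const trackedBatches = Array.from({ length: 7 }, () => ({
  commit: async () => {
    inFlight++;
    maxInFlight = Math.max(maxInFlight, inFlight);
    await new Promise(resolve => setTimeout(resolve, 5));
    inFlight--;
  },
}));

const groupedRun = commitInGroups(trackedBatches, 3);
groupedRun.then(results => console.log(results.length, maxInFlight)); // 7 3
```

Setting groupSize to 1 reduces this to the sequential loop, so the same helper can be used to experiment with how much concurrency the writes tolerate.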
