How can I update more than 500 docs in Firestore using Batch?

I'm trying to update a field timestamp with the Firestore admin timestamp in a collection with more than 500 docs.

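const batch = db.batch();
const serverTimestamp = admin.firestore.FieldValue.serverTimestamp();

db
  .collection('My Collection')
  .get()
  .then((docs) => {
    // Queue one merge write per document in a single batch
    docs.forEach((doc) => {
      batch.set(doc.ref, {
        serverTimestamp,
      }, {
        merge: true,
      });
    });

    return batch.commit();
  })
  .then(() => res.send('All docs updated'))
  .catch(console.error);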

This throws an error:

{ Error: 3 INVALID_ARGUMENT: cannot write more than 500 entities in a single call
    at Object.exports.createStatusError (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\common.js:87:15)
    at Object.onReceiveStatus (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:1188:28)
    at InterceptingListener._callNext (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:564:42)
    at InterceptingListener.onReceiveStatus (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:614:8)
    at callback (C:\Users\Growthfile\Desktop\cf-test\functions\node_modules\grpc\src\client_interceptors.js:841:24)
  code: 3,
  metadata: Metadata { _internal_repr: {} },
  details: 'cannot write more than 500 entities in a single call' }

Is there a way to write a recursive method that creates a batch object and updates the docs 500 at a time until all of them are updated?

From the docs I know that a delete operation is possible with the recursive approach, as mentioned here:

https://firebase.google.com/docs/firestore/manage-data/delete-data#collections

But, for updating, I'm not sure how to end the execution since the docs are not being deleted.

I also ran into the problem of updating more than 500 documents inside a Firestore collection, and I would like to share how I solved it.

I use Cloud Functions to update my collection inside Firestore, but this should also work in client-side code.

The solution counts every operation added to the batch. Once the limit is reached, a new batch is created and pushed to the batchArray.

After all updates have been queued, the code loops through the batchArray and commits every batch in the array.

It is important to count every operation, whether set(), update(), or delete(), because they all count toward the 500-operation limit.

const documentSnapshotArray = await firestore.collection('my-collection').get();

const batchArray = [];
batchArray.push(firestore.batch());
let operationCounter = 0;
let batchIndex = 0;

documentSnapshotArray.forEach(documentSnapshot => {
    const documentData = documentSnapshot.data();

    // update document data here...

    batchArray[batchIndex].update(documentSnapshot.ref, documentData);
    operationCounter++;

    if (operationCounter === 499) {
      batchArray.push(firestore.batch());
      batchIndex++;
      operationCounter = 0;
    }
});

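// forEach does not wait for async callbacks, so collect the commit
// promises and await them all before returning
await Promise.all(batchArray.map((batch) => batch.commit()));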

return;

I liked this simple solution:

const users = await db.collection('users').get()

const batches = _.chunk(users.docs, 500).map(userDocs => {
    const batch = db.batch()
    userDocs.forEach(doc => {
        batch.set(doc.ref, { field: 'myNewValue' }, { merge: true })
    })
    return batch.commit()
})

await Promise.all(batches)

Just remember to add import * as _ from "lodash" at the top. Based on this answer.
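If you would rather not pull in lodash just for this, the chunking step is easy to write by hand. Below is a minimal sketch, not a drop-in replacement for every _.chunk edge case. The names chunk and mergeIntoAll are made up for illustration, and db is assumed to be your Firestore instance.

```javascript
// Dependency-free stand-in for lodash's _.chunk: split `items` into
// consecutive arrays of at most `size` elements.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let start = 0; start < items.length; start += size) {
    chunks.push(items.slice(start, start + size));
  }
  return chunks;
}

// Hypothetical helper: `db` is assumed to be a Firestore instance and
// `docs` an array of document snapshots (for example `snapshot.docs`).
async function mergeIntoAll(db, docs, data, batchSize = 500) {
  const commits = chunk(docs, batchSize).map((group) => {
    const batch = db.batch();
    group.forEach((doc) => batch.set(doc.ref, data, { merge: true }));
    return batch.commit();
  });
  await Promise.all(commits);
  return commits.length; // number of batches committed
}
```

Usage would look something like await mergeIntoAll(db, users.docs, { field: 'myNewValue' }). Note that the batches are committed in parallel, so the update as a whole is not atomic.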

Your solution will not really be "recursive". If you want to batch update a collection with more than 500 documents, you will have to iterate the documents, create batches of 500 or less on your own, then commit the batches individually. Or, you can simply update each document individually, as there might not be any real need for batches in your situation.
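To make the "iterate and commit in groups" idea concrete, and to show where the loop ends when nothing is being deleted, here is a rough sketch that pages through the collection with a query cursor. It assumes db is an initialized Firestore instance (Admin SDK style). The helper name updateCollectionInPages is invented for this example, and ordering by '__name__' is borrowed from the delete-collection sample in the docs linked in the question.

```javascript
// Sketch: update a collection page by page using a query cursor.
// `db` is assumed to be an initialized Firestore instance;
// `updateCollectionInPages` is a hypothetical helper name.
async function updateCollectionInPages(db, collectionPath, data, pageSize = 500) {
  const baseQuery = () =>
    db.collection(collectionPath).orderBy("__name__").limit(pageSize);
  let query = baseQuery();
  let updated = 0;

  while (true) {
    const snapshot = await query.get();
    if (snapshot.empty) break; // nothing left: this is the stopping condition

    const batch = db.batch();
    snapshot.docs.forEach((doc) => batch.set(doc.ref, data, { merge: true }));
    await batch.commit();
    updated += snapshot.size;

    if (snapshot.size < pageSize) break; // a short page was the last page

    // Continue right after the last document of this page.
    const last = snapshot.docs[snapshot.docs.length - 1];
    query = baseQuery().startAfter(last);
  }
  return updated;
}
```

Because each page is read and committed before the next one is fetched, memory use stays bounded even for large collections. The trade-off is that the pages are processed one after another rather than in parallel.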

As mentioned above, @Sebastian's answer is good and I upvoted it too. However, I faced an issue while updating 25000+ documents in one go. The tweak to the logic is below.

console.log(`Updating documents...`);
let collectionRef = db.collection('cities');
try {
  let batch = db.batch();
  const documentSnapshotArray = await collectionRef.get();
  const records = documentSnapshotArray.docs;
  const index = documentSnapshotArray.size;
  console.log(`TOTAL SIZE=====${index}`);
  for (let i=0; i < index; i++) {
    const docRef = records[i].ref;
    // YOUR UPDATES
    batch.update(docRef, {isDeleted: false});
    if ((i + 1) % 499 === 0) {
      await batch.commit();
      batch = db.batch();
    }
  }
  // For committing final batch
  if (index % 499 !== 0) {
    await batch.commit();
  }
  console.log('write completed');
} catch (error) {
  console.error(`updateWorkers() errored out : ${error.stack}`);
  throw error;
}

A simple solution: just commit twice. My array is "resultsFinal". I commit one batch with a limit of 490 items, and then a second batch that runs up to the length of the array (results.length). This works fine for me.

How do you check it? Go to Firebase and delete your collection. Firebase tells you how many docs you deleted. If that number matches the length of your array, you are good to go.

async function quickstart(results) {
    // results is passed in as a parameter so the data is available inside quickstart
    const resultsFinal = results;
    // console.log(resultsFinal.length);
    let batch = firestore.batch();
    // Firestore allows at most 500 writes per batch, so the first batch stops at 490
    const firstLimit = Math.min(490, resultsFinal.length);
    for (let i = 0; i < firstLimit; i++) {
        const doc = firestore.collection('testMore490').doc();
        const object = resultsFinal[i];
        batch.set(doc, object);
    }
    await batch.commit();
    // const batchTwo = firestore.batch();
    batch = firestore.batch();

    // Continue at index 490 (not 491) so that no element is skipped.
    // This second batch is still limited to 500 writes, so the approach
    // only covers arrays of up to 990 items.
    for (let i = firstLimit; i < resultsFinal.length; i++) {
        const objectPartTwo = resultsFinal[i];
        const doc = firestore.collection('testMore490').doc();
        batch.set(doc, objectPartTwo);
    }
    await batch.commit();

}

Explanations given on previous comments already explain the issue.

I'm sharing the final code that I built and that worked for me, since I needed something more decoupled than most of the solutions presented above.

import { FireDb } from "@services/firebase"; // = firebase.firestore();

type TDocRef = FirebaseFirestore.DocumentReference;
type TDocData = FirebaseFirestore.DocumentData;

let fireBatches = [FireDb.batch()];
let batchSizes = [0];
let batchIdxToUse = 0;

export default class FirebaseUtil {
  static addBatchOperation(
    operation: "create",
    ref: TDocRef,
    data: TDocData
  ): void;
  static addBatchOperation(
    operation: "update",
    ref: TDocRef,
    data: TDocData,
    precondition?: FirebaseFirestore.Precondition
  ): void;
  static addBatchOperation(
    operation: "set",
    ref: TDocRef,
    data: TDocData,
    setOpts?: FirebaseFirestore.SetOptions
  ): void;
  static addBatchOperation(
    operation: "create" | "update" | "set",
    ref: TDocRef,
    data: TDocData,
    opts?: FirebaseFirestore.Precondition | FirebaseFirestore.SetOptions
  ): void {
    // Lines below make sure we stay below the limit of 500 writes per
    // batch
    if (batchSizes[batchIdxToUse] === 500) {
      fireBatches.push(FireDb.batch());
      batchSizes.push(0);
      batchIdxToUse++;
    }
    batchSizes[batchIdxToUse]++;

    const batchArgs: [TDocRef, TDocData] = [ref, data];
    if (opts) batchArgs.push(opts);

    switch (operation) {
      // Specific case for "set" is required because of some weird TS
      // glitch that doesn't allow me to use the arg "operation" to
      // call the function
      case "set":
        fireBatches[batchIdxToUse].set(...batchArgs);
        break;
      default:
        fireBatches[batchIdxToUse][operation](...batchArgs);
        break;
    }
  }

  public static async runBatchOperations() {
    // The lines below clear the globally available batches so we
    // don't run them twice if we call this function more than once
    const currentBatches = [...fireBatches];
    fireBatches = [FireDb.batch()];
    batchSizes = [0];
    batchIdxToUse = 0;

    await Promise.all(currentBatches.map((batch) => batch.commit()));
  }
}

You can use the default BulkWriter. This method uses the 500/50/5 rule.

Example:

let bulkWriter = firestore.bulkWriter();

bulkWriter.create(documentRef, {foo: 'bar'});
bulkWriter.update(documentRef2, {foo: 'bar'});
bulkWriter.delete(documentRef3);
await bulkWriter.close().then(() => {
  console.log('Executed all writes');
});
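For anyone unfamiliar with the 500/50/5 rule: it is the ramp-up guideline of starting at 500 operations per second and increasing traffic by 50% every 5 minutes. The snippet below is only worked arithmetic for that guideline. rampUpLimit is a made-up helper, not part of any Firestore SDK, and it says nothing about how BulkWriter throttles internally.

```javascript
// Worked arithmetic for the 500/50/5 ramp-up guideline: start at 500
// operations per second and allow 50% more every 5 minutes.
// `rampUpLimit` is a hypothetical helper, not part of any Firestore SDK.
function rampUpLimit(minutesElapsed, base = 500, growth = 1.5, stepMinutes = 5) {
  if (minutesElapsed < 0) {
    throw new RangeError("minutesElapsed must be >= 0");
  }
  const steps = Math.floor(minutesElapsed / stepMinutes);
  return Math.floor(base * growth ** steps);
}

console.log(rampUpLimit(0));  // 500
console.log(rampUpLimit(5));  // 750
console.log(rampUpLimit(10)); // 1125
console.log(rampUpLimit(14)); // 1125 (still inside the third 5-minute window)
```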

Based on all the above answers, I put together the following pieces of code that one can put into a module in JavaScript back-end and front-end to easily use Firestore batch writes, without worrying about the 500 writes limit.

Back-end (Node.js)

// The Firebase Admin SDK to access Firestore.
const admin = require("firebase-admin");
admin.initializeApp();

// Firestore does not accept more than 500 writes in a transaction or batch write.
const MAX_TRANSACTION_WRITES = 499;

const isFirestoreDeadlineError = (err) => {
  console.log({ err });
  const errString = err.toString();
  return (
    errString.includes("Error: 13 INTERNAL: Received RST_STREAM") ||
    errString.includes("Error: 4 DEADLINE_EXCEEDED: Deadline exceeded")
  );
};

const db = admin.firestore();

// How many writes have been added to the current batch so far (the limit is 500).
// The following functions make it easy to use batch writes without worrying about the 500 limit.
let writeCounts = 0;
let batchIndex = 0;
let batchArray = [db.batch()];

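// Commit all pending batches, then reset the batches and the counter.
const makeCommitBatch = async () => {
  console.log("makeCommitBatch");
  await Promise.all(batchArray.map((bch) => bch.commit()));
  // Start over with a fresh batch so committed batches are not committed again
  batchArray = [db.batch()];
  batchIndex = 0;
  writeCounts = 0;
};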

// Commit the batchWrite; if you got a Firestore Deadline Error try again every 4 seconds until it gets resolved.
const commitBatch = async () => {
  try {
    await makeCommitBatch();
  } catch (err) {
    console.log({ err });
    if (isFirestoreDeadlineError(err)) {
      const theInterval = setInterval(async () => {
        try {
          await makeCommitBatch();
          clearInterval(theInterval);
        } catch (err) {
          console.log({ err });
          if (!isFirestoreDeadlineError(err)) {
            clearInterval(theInterval);
            throw err;
          }
        }
      }, 4000);
    }
  }
};

// If the current batch reaches 499 writes, start a new batch object and reset the counter.
const checkRestartBatchWriteCounts = () => {
  writeCounts += 1;
  if (writeCounts >= MAX_TRANSACTION_WRITES) {
    batchIndex++;
    batchArray.push(db.batch());
    writeCounts = 0;
  }
};

const batchSet = (docRef, docData) => {
  batchArray[batchIndex].set(docRef, docData);
  checkRestartBatchWriteCounts();
};

const batchUpdate = (docRef, docData) => {
  batchArray[batchIndex].update(docRef, docData);
  checkRestartBatchWriteCounts();
};

const batchDelete = (docRef) => {
  batchArray[batchIndex].delete(docRef);
  checkRestartBatchWriteCounts();
};

module.exports = {
  admin,
  db,
  MAX_TRANSACTION_WRITES,
  checkRestartBatchWriteCounts,
  commitBatch,
  isFirestoreDeadlineError,
  batchSet,
  batchUpdate,
  batchDelete,
};
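One thing to be aware of with the setInterval retry above: the caller of commitBatch cannot await the retries, and an error thrown inside the interval callback is not propagated to the caller. Here is a sketch of the same retry-on-deadline idea written as an awaitable loop. commitWithRetry, sleep, and the maxAttempts option are names I made up for this example. It also assumes commitFn builds and commits a fresh batch on every call, since reusing a batch object after a failed commit may not be supported.

```javascript
// Same string checks as in the module above, without the logging.
const isFirestoreDeadlineError = (err) => {
  const errString = String(err);
  return (
    errString.includes("Error: 13 INTERNAL: Received RST_STREAM") ||
    errString.includes("Error: 4 DEADLINE_EXCEEDED: Deadline exceeded")
  );
};

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: run `commitFn` and retry only on deadline errors,
// waiting `delayMs` between attempts and giving up after `maxAttempts`.
// Assumption: `commitFn` creates and commits a fresh batch each time.
async function commitWithRetry(commitFn, { maxAttempts = 5, delayMs = 4000 } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await commitFn();
    } catch (err) {
      // Give up on non-deadline errors or when attempts are exhausted.
      if (!isFirestoreDeadlineError(err) || attempt >= maxAttempts) {
        throw err;
      }
      await sleep(delayMs);
    }
  }
}
```

With this shape, await commitWithRetry(() => buildAndCommit()) either resolves once a commit succeeds or rejects with the last error, so the calling function can handle the failure itself. buildAndCommit here stands for whatever function of yours assembles and commits the batch.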

Front-end

// Firestore does not accept more than 500 writes in a transaction or batch write.
const MAX_TRANSACTION_WRITES = 499;

const isFirestoreDeadlineError = (err) => {
  return (
    err.message.includes("DEADLINE_EXCEEDED") ||
    err.message.includes("Received RST_STREAM")
  );
};

class Firebase {
  constructor(fireConfig, instanceName) {
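    // fbApp is not defined in this snippet; it is assumed to be the
    // Firebase app namespace imported elsewhere in the module.
    let app = fbApp;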
    if (instanceName) {
      app = app.initializeApp(fireConfig, instanceName);
    } else {
      app.initializeApp(fireConfig);
    }
    this.name = app.name;
    this.db = app.firestore();
    this.firestore = app.firestore;
    // How many writes have been added to the current batch so far (the limit is 500).
    // The following functions make it easy to use batch writes without worrying about the 500 limit.
    this.writeCounts = 0;
    this.batch = this.db.batch();
    this.isCommitting = false;
  }

  async makeCommitBatch() {
    console.log("makeCommitBatch");
    if (!this.isCommitting) {
      this.isCommitting = true;
      await this.batch.commit();
      this.writeCounts = 0;
      this.batch = this.db.batch();
      this.isCommitting = false;
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.isCommitting = true;
          await this.batch.commit();
          this.writeCounts = 0;
          this.batch = this.db.batch();
          this.isCommitting = false;
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async commitBatch() {
    try {
      await this.makeCommitBatch();
    } catch (err) {
      console.log({ err });
      if (isFirestoreDeadlineError(err)) {
        const theInterval = setInterval(async () => {
          try {
            await this.makeCommitBatch();
            clearInterval(theInterval);
          } catch (err) {
            console.log({ err });
            if (!isFirestoreDeadlineError(err)) {
              clearInterval(theInterval);
              throw err;
            }
          }
        }, 4000);
      }
    }
  }

  async checkRestartBatchWriteCounts() {
    this.writeCounts += 1;
    if (this.writeCounts >= MAX_TRANSACTION_WRITES) {
      await this.commitBatch();
    }
  }

  async batchSet(docRef, docData) {
    if (!this.isCommitting) {
      this.batch.set(docRef, docData);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.set(docRef, docData);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async batchUpdate(docRef, docData) {
    if (!this.isCommitting) {
      this.batch.update(docRef, docData);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.update(docRef, docData);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }

  async batchDelete(docRef) {
    if (!this.isCommitting) {
      this.batch.delete(docRef);
      await this.checkRestartBatchWriteCounts();
    } else {
      const batchWaitInterval = setInterval(async () => {
        if (!this.isCommitting) {
          this.batch.delete(docRef);
          await this.checkRestartBatchWriteCounts();
          clearInterval(batchWaitInterval);
        }
      }, 400);
    }
  }
}

There are no citations or documentation for this; I came up with the code myself. It worked for me, and it looks clean and is simple to read and use. If you like it, feel free to use it too.

It is best to cover this with an automated test, because the code relies on the private variable _ops, which can change when the package is upgraded. For example, in older versions it may be called _mutations.

async function commitBatch(batch) {
  const MAX_OPERATIONS_PER_COMMIT = 500;

  while (batch._ops.length > MAX_OPERATIONS_PER_COMMIT) {
    const batchPart = admin.firestore().batch();

    batchPart._ops = batch._ops.splice(0, MAX_OPERATIONS_PER_COMMIT - 1);

    await batchPart.commit();
  }

  await batch.commit();
}

Usage:

const batch = admin.firestore().batch();

batch.delete(someRef);
batch.update(someRef);

...

await commitBatch(batch);
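If depending on a private field is a concern, a similar effect can be had by recording the operations yourself and replaying them into real batches at commit time. This is only a sketch under that assumption. createChunkedBatch is a hypothetical name, db is assumed to be a Firestore instance exposing batch(), and the writes are no longer atomic once they are split across several commits.

```javascript
// Sketch of a wrapper that records operations itself instead of reading
// a private field. `createChunkedBatch` is a hypothetical name;
// `db` is assumed to be a Firestore instance exposing `batch()`.
function createChunkedBatch(db, maxOps = 500) {
  const queued = []; // each entry: [methodName, argsArray]
  const record = (method) => (...args) => {
    queued.push([method, args]);
  };

  return {
    set: record("set"),
    update: record("update"),
    delete: record("delete"),
    get size() {
      return queued.length;
    },
    // Replay the queue into real batches of at most `maxOps` writes,
    // committing them one after another. Returns the number of commits.
    async commit() {
      let committed = 0;
      while (queued.length > 0) {
        const batch = db.batch();
        const part = queued.splice(0, maxOps);
        part.forEach(([method, args]) => batch[method](...args));
        await batch.commit();
        committed++;
      }
      return committed;
    },
  };
}
```

Usage mirrors a normal batch: const batch = createChunkedBatch(admin.firestore()), then batch.set(...), batch.update(...), batch.delete(...), and finally await batch.commit().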
