Firestore - Recursively Copy a Document and all its subcollections/documents
We use Google's Firestore for embedded machine configuration data. Because this data controls configurable page flows and many other things, it is split across many subcollections. Each machine has its own top-level document in this system. However, adding a machine to the fleet takes a long time, because we have to copy all of this data across multiple documents by hand. Does anyone know how to recursively copy a Firestore document, all of its subcollections, their documents, their subcollections, and so on, in Python? You would have a reference to the top-level document and the name of the new top-level document.
The question asks for Python, but in my case I needed a recursive deep copy of Firestore documents/collections in NodeJS (TypeScript), using a document as the starting point of the recursion.
(This solution is based on @cristi's Python script.)
import {
  CollectionReference,
  DocumentReference,
  DocumentSnapshot,
  QueryDocumentSnapshot,
  WriteBatch,
  getFirestore,
} from 'firebase-admin/firestore';

// Assumes the Admin SDK has been initialized elsewhere (initializeApp()).
const firebaseFirestore = getFirestore();
interface FirestoreCopyRecursiveContext {
  batchSize: number;
  /**
   * Wrapped Firestore WriteBatch. In firebase-admin@11.0.1, you can't continue
   * using a WriteBatch object after you call WriteBatch.commit().
   *
   * Hence, we need to replace "used up" WriteBatches with new ones.
   * We also need to reset the count after committing, and because we
   * want all recursive invocations to share the same count + WriteBatch instance,
   * we pass this data via object reference.
   */
  writeBatch: {
    writeBatch: WriteBatch;
    /** Number of items in the current batch. Reset to 0 when `commitBatch` commits. */
    count: number;
  };
  /**
   * Function that commits the batch if it has reached the limit or is forced to.
   * The WriteBatch instance is automatically replaced with a fresh one
   * if a commit did happen.
   */
  commitBatch: (force?: boolean) => Promise<void>;
  /** Callback to insert custom logic / write operations when we encounter a document */
  onDocument?: (
    sourceDoc: QueryDocumentSnapshot | DocumentSnapshot,
    targetDocRef: DocumentReference,
    context: FirestoreCopyRecursiveContext
  ) => unknown;
  /** Callback to insert custom logic / write operations when we encounter a collection */
  onCollection?: (
    sourceColl: CollectionReference,
    targetColl: CollectionReference,
    context: FirestoreCopyRecursiveContext
  ) => unknown;
  logger?: Console['info'];
}
type FirestoreCopyRecursiveOptions = Partial<Omit<FirestoreCopyRecursiveContext, 'commitBatch'>>;
/**
* Copy all data from one document to another, including
* all subcollections and documents within them, etc.
*/
export const firestoreCopyDocRecursive = async (
  /** Source Firestore document snapshot, descendants of which we want to copy */
  sourceDoc: QueryDocumentSnapshot | DocumentSnapshot,
  /** Destination Firestore document ref */
  targetDocRef: DocumentReference,
  options?: FirestoreCopyRecursiveOptions,
) => {
  const batchSize = options?.batchSize ?? 500;
  const writeBatchRef = options?.writeBatch || { writeBatch: firebaseFirestore.batch(), count: 0 };
  const onDocument = options?.onDocument;
  const onCollection = options?.onCollection;
  const logger = options?.logger || console.info;

  const commitBatch = async (force?: boolean) => {
    // Commit the batch only if the size limit was hit or if forced
    if (writeBatchRef.count < batchSize && !force) return;
    logger(`Committing ${writeBatchRef.count} batched operations...`);
    await writeBatchRef.writeBatch.commit();
    // Once we commit the batched data, we have to create another WriteBatch,
    // otherwise we get the error:
    // "Cannot modify a WriteBatch that has been committed."
    // See https://dev.to/wceolin/cannot-modify-a-writebatch-that-has-been-committed-265f
    writeBatchRef.writeBatch = firebaseFirestore.batch();
    writeBatchRef.count = 0;
  };

  const context = {
    batchSize,
    writeBatch: writeBatchRef,
    onDocument,
    onCollection,
    commitBatch,
  };

  // Copy the contents of the current doc
  const sourceDocData = sourceDoc.data();
  writeBatchRef.writeBatch.set(targetDocRef, sourceDocData, { merge: false });
  writeBatchRef.count += 1;
  await commitBatch();

  // Allow additional changes to the target document from outside
  // this function after the copy command is enqueued / committed.
  await onDocument?.(sourceDoc, targetDocRef, context);
  // And try to commit in case the user updated the count but forgot to commit
  await commitBatch();

  // Check for subcollections and docs within them
  for (const sourceSubcoll of await sourceDoc.ref.listCollections()) {
    const targetSubcoll = targetDocRef.collection(sourceSubcoll.id);

    // Allow additional changes to the target collection from outside
    // this function after the copy command is enqueued / committed.
    await onCollection?.(sourceSubcoll, targetSubcoll, context);
    // And try to commit in case the user updated the count but forgot to commit
    await commitBatch();

    for (const sourceSubcollDoc of (await sourceSubcoll.get()).docs) {
      const targetSubcollDocRef = targetSubcoll.doc(sourceSubcollDoc.id);
      await firestoreCopyDocRecursive(sourceSubcollDoc, targetSubcollDocRef, context);
    }
  }

  // Commit all remaining operations
  return commitBatch(true);
};
const sourceDocRef = getYourFaveFirestoreDocRef(x);
const sourceDoc = await sourceDocRef.get();
const targetDocRef = getYourFaveFirestoreDocRef(y);

// Copy Firestore resources
await firestoreCopyDocRecursive(sourceDoc, targetDocRef, {
  logger: console.info,
  // Note: In my case some docs had their doc ID also copied as a field.
  // Because the copied documents get a new doc ID, we need to update
  // those fields too.
  onDocument: async (sourceDoc, targetDocRef, context) => {
    const someDocPattern = /^nameOfCollection\/[^/]+?$/;
    const subcollDocPattern = /^nameOfCollection\/[^/]+?\/nameOfSubcoll\/[^/]+?$/;

    // Update the field that holds the document ID
    if (targetDocRef.path.match(someDocPattern)) {
      const docId = targetDocRef.id;
      context.writeBatch.writeBatch.set(targetDocRef, { docId }, { merge: true });
      context.writeBatch.count += 1;
      await context.commitBatch();
      return;
    }

    // In a subcollection, I had to update multiple ID fields
    if (targetDocRef.path.match(subcollDocPattern)) {
      const docId = targetDocRef.parent.parent?.id;
      const subcolDocId = targetDocRef.id;
      context.writeBatch.writeBatch.set(targetDocRef, { docId, subcolDocId }, { merge: true });
      context.writeBatch.count += 1;
      await context.commitBatch();
      return;
    }
  },
});
You can read from one collection and write to another recursively with something like this:
import logging
from typing import Any, Generator

from google.cloud import firestore

log = logging.getLogger(__name__)

# Assumes application default credentials are configured.
db_client = firestore.Client()
batch_nr = 0


def read_recursive(
    source: Generator[firestore.DocumentReference, Any, None],
    target: firestore.CollectionReference,
    batch: firestore.WriteBatch,
) -> None:
    global batch_nr

    for source_doc_ref in source:
        document_data = source_doc_ref.get().to_dict()
        target_doc_ref = target.document(source_doc_ref.id)
        if batch_nr == 500:
            log.info("committing %s batched operations..." % batch_nr)
            batch.commit()
            batch_nr = 0
        batch.set(
            reference=target_doc_ref,
            document_data=document_data,
            merge=False,
        )
        batch_nr += 1
        for source_coll_ref in source_doc_ref.collections():
            target_coll_ref = target_doc_ref.collection(source_coll_ref.id)
            read_recursive(
                source=source_coll_ref.list_documents(),
                target=target_coll_ref,
                batch=batch,
            )


batch = db_client.batch()
read_recursive(
    source=db_client.collection("src_collection_name").list_documents(),
    target=db_client.collection("target_collection_name"),
    batch=batch,
)
batch.commit()
Writes are batched, which saves a lot of time (in my case it finished in about half the time compared to individual `set` calls).
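The 500-operation cap comes from Firestore's WriteBatch limit, and the commit-and-reset bookkeeping above is just chunking work into groups of at most 500. A standalone sketch of that chunking logic, with no Firestore dependency (function and variable names here are illustrative, not from either answer):

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")

def chunk_operations(ops: Iterable[T], batch_size: int = 500) -> List[List[T]]:
    """Group write operations into batches of at most batch_size,
    mirroring the commit-when-full logic in read_recursive()."""
    batches: List[List[T]] = []
    current: List[T] = []
    for op in ops:
        if len(current) == batch_size:  # batch is full: "commit" it, start a new one
            batches.append(current)
            current = []
        current.append(op)
    if current:  # commit whatever remains (the final forced commit)
        batches.append(current)
    return batches
```

Each inner list then corresponds to one `batch.commit()` call; committing per chunk rather than per document is where the time savings come from.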