Loading data from a file in GCS to GCP Firestore
I have written a script which loops through each record in the file and writes it to the Firestore collection.
Firestore schema: {COLLECTION.DOCUMENT.SUBCOLLECTION.DOCUMENT.SUBCOLLECTION}
'{"KEY":"1234","DATE":"2022-10-10","SUB_COLLECTION":{"KEY":1234,"SUB_DOC":{"KEY1" : :"VAL1"}}'
'{"KEY":"1235","DATE":"2022-10-10","SUB_COLLECTION":{"KEY":1235,"SUB_DOC":{"KEY1" : :"VAL1"}}'
'{"KEY":"1236","DATE":"2022-10-10","SUB_COLLECTION":{"KEY":1236,"SUB_DOC":{"KEY1" : :"VAL1"}}'
...
import json
from google.cloud import firestore

# `filename` is the google.cloud.storage blob for the file in GCS
read_file = filename.download_as_string()
fire_client = firestore.Client(project=PROJECT)

# One JSON record per line; the trailing split element is empty, so skip it
records = read_file.decode("UTF-8").split('\n')
for line in records[:-1]:
    record = json.loads(line)
    doc_ref = fire_client.collection('STATIC_COLLECTION_NAME').document(record['KEY'])
    doc_ref.set({"KEY": int(record['KEY']), "DATE": record['DATE']})
    sub_ref = doc_ref.collection('STATIC_SUB_COLLECTION_NAME').document('STATIC_SUB_DOC_NAME')
    sub_ref.set(record['SUB_COLLECTION'])
However, this job takes hours to finish a 100 MB file. Is there a way to issue multiple writes at a time, for example batching X records from the file and writing them to X documents and sub-collections in Firestore? I am looking for a way to make this more efficient instead of looping over millions of records one write at a time; my current script eventually failed with "503 The datastore operation timed out, or the data was temporarily unavailable."
You'll want to use a BulkWriter to accumulate and send writes to Firestore.
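As a minimal sketch (assuming the same PROJECT, read_file, and STATIC_* names from the question), the loop can hand its writes to the Python client's BulkWriter, which accumulates them and sends them in rate-limited batches in the background:

import json
from google.cloud import firestore

fire_client = firestore.Client(project=PROJECT)
bulk_writer = fire_client.bulk_writer()  # accumulates writes and sends them in batches

for line in read_file.decode("UTF-8").split('\n'):
    if not line:
        continue  # skip the empty trailing line
    record = json.loads(line)
    doc_ref = fire_client.collection('STATIC_COLLECTION_NAME').document(record['KEY'])
    bulk_writer.set(doc_ref, {"KEY": int(record['KEY']), "DATE": record['DATE']})
    sub_ref = doc_ref.collection('STATIC_SUB_COLLECTION_NAME').document('STATIC_SUB_DOC_NAME')
    bulk_writer.set(sub_ref, record['SUB_COLLECTION'])

bulk_writer.close()  # flush queued writes and wait for them to complete

Unlike doc_ref.set(), the bulk_writer.set() calls return immediately rather than blocking on one round trip per document; close() blocks until everything queued has been sent.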