
Python Boto3 - how to check if an S3 file is completely written before a process starts copying it to another bucket

How can I ensure that Process A has completely written a large file (5+ GB) to AWS S3 Bucket A before Process B starts copying that file to AWS S3 Bucket B using boto3?

If a new object is being created in Amazon S3, it will only appear after the upload is complete. Other processes will not be able to see it until it has finished uploading.
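Because an object only becomes visible once its upload has finished, a simple existence check is enough for Process B to know the file is complete. The following is a minimal sketch, not a definitive implementation: `object_exists` is a hypothetical helper name, `s3_client` is assumed to be a `boto3.client("s3")` instance, and the bucket and key in the usage comment are placeholders.

```python
def object_exists(s3_client, bucket, key):
    """Return True if the object is visible in S3 (its upload has completed)."""
    try:
        # head_object fetches metadata only, so nothing is downloaded.
        s3_client.head_object(Bucket=bucket, Key=key)
        return True
    except s3_client.exceptions.ClientError as err:
        # A missing key is reported with a 404 error code.
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        # Anything else (e.g. access denied) is a real problem; re-raise it.
        raise


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   if object_exists(s3, "bucket-a", "big-file.bin"):
#       ...  # safe to start copying
```

Polling like this works, but it wastes requests while the upload is still running, which is one reason an event-driven trigger is usually preferable.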

Objects cannot be updated in S3. Rather, they are replaced with a new object. So, if an object is in the process of being updated, it will still appear as the old object to other processes.

The best way would be to trigger Process B by configuring Amazon S3 Event Notifications. Once the new object is uploaded, S3 can trigger a Lambda function (or send a notification) that can then perform the second step.
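As a rough sketch of what that configuration could look like from boto3: the helper names and the ARN in the usage comment are hypothetical, and `s3_client` is assumed to be a `boto3.client("s3")` instance. The `s3:ObjectCreated:*` event type covers regular puts as well as completed multipart uploads.

```python
def build_notification_config(lambda_arn):
    """Build a notification config that invokes a Lambda on every new object."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # Fires for Put, Post, Copy and CompleteMultipartUpload.
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


def enable_copy_trigger(s3_client, bucket, lambda_arn):
    """Attach the notification config to the source bucket."""
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn),
    )


# Example usage (placeholder names; requires boto3 and AWS credentials):
#   import boto3
#   enable_copy_trigger(
#       boto3.client("s3"),
#       "bucket-a",
#       "arn:aws:lambda:us-east-1:123456789012:function:copy-to-bucket-b",
#   )
```

Two caveats: this call replaces any notification configuration already on the bucket, and the Lambda function needs a resource policy that allows S3 to invoke it, otherwise the call is rejected.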

You should definitely use an S3 event notification as the trigger for a Lambda function that copies your file from Bucket A to Bucket B. The trigger ensures that the copy starts only once the file has been uploaded completely.
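A minimal sketch of such a Lambda function is shown here. It assumes the destination bucket name arrives through a `DEST_BUCKET` environment variable (a made-up setting, with `bucket-b` as a placeholder default), and `copy_new_objects` is a hypothetical helper split out so the logic can be tried with a stub client. It uses the managed `copy` method rather than `copy_object`, since a single `copy_object` call is limited to 5 GB while `copy` switches to a multipart copy for larger objects.

```python
import os
from urllib.parse import unquote_plus

# Assumption: destination bucket is supplied via an environment variable.
DEST_BUCKET = os.environ.get("DEST_BUCKET", "bucket-b")


def copy_new_objects(event, s3_client, dest_bucket=DEST_BUCKET):
    """Copy every object named in an S3 event to dest_bucket; return the keys."""
    copied = []
    for record in event.get("Records", []):
        src_bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event payloads are URL-encoded (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        # Managed copy: uses multipart copy automatically for large objects.
        s3_client.copy({"Bucket": src_bucket, "Key": key}, dest_bucket, key)
        copied.append(key)
    return copied


def lambda_handler(event, context):
    import boto3  # provided by the AWS Lambda Python runtime

    return copy_new_objects(event, boto3.client("s3"))
```

For very large objects it is worth checking that the function's timeout leaves enough room for the copy to finish.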

Moreover, if you have further operations to perform, you can use AWS Step Functions, in which you define the workflow of your processes: e.g. process B starts 2 seconds after process A, processes C and D execute in parallel after process B finishes its execution, and so on.
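To make that workflow concrete, here is a sketch of a state machine definition in Amazon States Language, built as a Python dictionary. The state names and the `build_state_machine` helper are invented for illustration, and the three ARNs are assumed to point at existing Lambda functions for processes B, C and D. The execution is assumed to start when process A finishes.

```python
import json


def build_state_machine(arn_b, arn_c, arn_d):
    """Wait 2 seconds, run B, then run C and D in parallel."""

    def task_branch(name, arn):
        # A Parallel branch is itself a small state machine.
        return {
            "StartAt": name,
            "States": {name: {"Type": "Task", "Resource": arn, "End": True}},
        }

    return {
        "Comment": "B starts 2 s after A; C and D run in parallel after B",
        "StartAt": "WaitAfterA",
        "States": {
            "WaitAfterA": {"Type": "Wait", "Seconds": 2, "Next": "ProcessB"},
            "ProcessB": {"Type": "Task", "Resource": arn_b, "Next": "RunCAndD"},
            "RunCAndD": {
                "Type": "Parallel",
                "Branches": [
                    task_branch("ProcessC", arn_c),
                    task_branch("ProcessD", arn_d),
                ],
                "End": True,
            },
        },
    }


# The JSON string is what Step Functions expects as the definition, e.g.:
#   definition = json.dumps(build_state_machine(arn_b, arn_c, arn_d))
```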

I also do uploads of up to 40 GB.

Since I do multipart uploads, I check whether the file I am writing to is closed. An S3 file (object) is only closed when all operations are finished; in S3 terms, a multipart upload only produces a visible object once the upload has been completed.
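The sequence behind a multipart upload can be sketched with the low-level client calls. This is an illustration under assumptions rather than the answerer's actual code: `multipart_upload` is a hypothetical helper, `s3_client` is assumed to be a `boto3.client("s3")` instance, and `chunks` is any iterable of byte strings (on real S3 every part except the last must be at least 5 MiB).

```python
def multipart_upload(s3_client, bucket, key, chunks):
    """Upload chunks as one object; it becomes visible only after completion."""
    upload_id = s3_client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        for number, chunk in enumerate(chunks, start=1):
            resp = s3_client.upload_part(
                Bucket=bucket,
                Key=key,
                PartNumber=number,
                UploadId=upload_id,
                Body=chunk,
            )
            parts.append({"ETag": resp["ETag"], "PartNumber": number})
        # Until this call succeeds, readers see no object (or the old one).
        s3_client.complete_multipart_upload(
            Bucket=bucket,
            Key=key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Discard the uploaded parts so they do not linger and incur charges.
        s3_client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return parts
```

In everyday use, boto3's `upload_file` and `upload_fileobj` perform these steps for you, so when they return the object is already complete.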

Another way is to use an asynchronous task queue such as Celery. You will get a notification when a task is done.

I now use Golang, but both of those methods have worked very well for me.
