
Consistency for file listing on Google Cloud Storage

I am developing an application that creates files on Google Cloud Storage, which other processes then read.

File creation may take a while (for example, when the file is large), so I am concerned that incomplete files (with a write still in progress) may exist on Cloud Storage.

I need to prevent readers from picking up incomplete files. According to this page, bucket listing is strongly consistent: a newly created file can be listed immediately after it is created.

From that document, my guess is that a newly created file is not listed until its creation has completed, so incomplete files are never listed.

Is my guess correct? If not, what should I do to prevent reading incomplete files?

Your guess is correct: writes to the bucket are atomic (when you upload a file, the content is cached before being pushed to your bucket, so the object only becomes visible once the upload has finished). You can see this in the documentation:

  • Read-after-write (i.e. atomic operation, no transient state)

Thus, you don't need to worry about incomplete files.
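
Because an object only appears in a listing once its upload has finished, a reader process can treat every listed object as complete. The following is a minimal sketch of such a reader, assuming the `google-cloud-storage` Python client library is installed and credentials are configured. The function name `read_listed_objects` and the optional `client` parameter are illustrative and not part of the original answer.

```python
def read_listed_objects(bucket_name, prefix=None, client=None):
    """Return {object name: content bytes} for every object currently listed.

    Assumption: only fully uploaded objects are returned by the listing,
    so each download yields a complete file.
    """
    if client is None:
        # Imported lazily so the function can also be exercised with a stand-in client.
        from google.cloud import storage
        client = storage.Client()

    contents = {}
    # list_blobs yields one Blob per object in the bucket (optionally filtered by prefix).
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        contents[blob.name] = blob.download_as_bytes()
    return contents
```

For example, `read_listed_objects("bucket_name", prefix="incoming/")` would download everything currently listed under the hypothetical `incoming/` prefix.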

I downloaded a large file (512 MB) from the internet and then uploaded it to GCS.

While the upload was in progress, I tested by listing the bucket's objects with the command:

gsutil ls gs://bucket_name

The new object was not listed until the upload had completed successfully.

Therefore your guess is correct.
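
The same check can be scripted by polling the listing while an upload runs in another process. This is only a sketch under assumptions: it relies on the `google-cloud-storage` Python client, and the function name `wait_until_listed`, its parameters, and the default intervals are hypothetical choices, not something from the test described above.

```python
import time


def wait_until_listed(bucket_name, object_name, client=None,
                      interval_s=1.0, timeout_s=600.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll the bucket listing until object_name appears; return the number of polls.

    Raises TimeoutError if the object is still not listed after timeout_s seconds.
    """
    if client is None:
        # Imported lazily so a stand-in client can be passed instead.
        from google.cloud import storage
        client = storage.Client()

    deadline = clock() + timeout_s
    polls = 0
    while True:
        polls += 1
        # Restrict the listing to names starting with object_name to keep it small.
        names = {blob.name for blob in client.list_blobs(bucket_name, prefix=object_name)}
        if object_name in names:
            return polls
        if clock() >= deadline:
            raise TimeoutError(f"{object_name} was not listed within {timeout_s} seconds")
        sleep(interval_s)
```

Started just before a large upload, this should keep polling for the duration of the upload and return only once the object has become visible.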

