
What is the best way to make a copy of a blobstore entity on App Engine in Java?

Original post 2013-04-28 00:27:28 1 3 java / google-app-engine / blobstore

Here is our simple use case: user2 wants to copy user1's document into his or her own repository within our application. Should be simple, right? All we need to do is create a second, identical blob in the blobstore and get back a key that we can associate with user2. We must be missing something here. It appears that the App Engine blobstore's primary function is to handle blobs uploaded from and downloaded to a browser, and a simple copy operation initiated server-side is not so simple.

The obvious solution seemed to be the experimental File API in Java, but no luck. It works until the file size gets beyond a MB or so, then it fails, somewhat unpredictably. Reading the whole blob into the server layer also seems silly when we just need to make a copy in the storage layer. In addition, the odds of us getting an experimental feature into our production environment are slim, although non-zero.

Some information about our environment: the app is written in Java, and we're using the blobstore, not Cloud Storage, and are committed to it for now. We're a small departmental team and would like to make the case that App Engine is a great platform to use, but this one has us stumped. S3 makes this blindingly simple; are we missing something really stupid here?

3 solutions

We ended up scrapping the idea of making a programmatic copy of the blob with the File API and went with a reference, as Kalle suggested in his comment. We created a new xref entity that stores information about the copy and the original. When an image or file is deleted, we check the xref entity for references and delete the ones that point to that image/file (i.e., the ones created when the deleted image/file was copied from another one). If we don't find any xrefs at all, we delete the blob itself. We didn't like the privacy/compliance implications of leaving orphaned blobs lying around, and even though storage is cheap, every $$$ helps. We also liked the idea of keeping a clean house, so to speak.
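For readers who want something concrete, here is a minimal sketch of that bookkeeping, not our production code. It uses in-memory maps as a stand-in for the datastore, and every name in it (`BlobXrefRegistry`, `addOriginal`, `copy`, `delete`, `isBlobDeleted`) is hypothetical. It also simplifies the xref idea to "each document records which blob it points at": a copy only adds a reference, and the blob is removed only when the last document referencing it is deleted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for the xref bookkeeping; all names are hypothetical. */
class BlobXrefRegistry {
    // documentId -> key of the blob that document points at
    private final Map<String, String> blobKeyByDocument = new HashMap<>();
    // Stand-in for the real blobstore delete call: keys we would have deleted
    private final Set<String> deletedBlobs = new HashSet<>();

    /** Registers a freshly uploaded document and its blob. */
    void addOriginal(String documentId, String blobKey) {
        blobKeyByDocument.put(documentId, blobKey);
    }

    /** "Copies" a document by adding a second reference to the same blob. */
    void copy(String sourceDocumentId, String newDocumentId) {
        String blobKey = blobKeyByDocument.get(sourceDocumentId);
        if (blobKey == null) {
            throw new IllegalArgumentException("Unknown document: " + sourceDocumentId);
        }
        blobKeyByDocument.put(newDocumentId, blobKey);
    }

    /**
     * Deletes a document reference. Returns true only if this was the last
     * reference and the underlying blob was therefore deleted as well.
     */
    boolean delete(String documentId) {
        String blobKey = blobKeyByDocument.remove(documentId);
        if (blobKey == null) {
            return false; // nothing to delete
        }
        if (blobKeyByDocument.containsValue(blobKey)) {
            return false; // another document still references the blob
        }
        deletedBlobs.add(blobKey);
        return true;
    }

    boolean isBlobDeleted(String blobKey) {
        return deletedBlobs.contains(blobKey);
    }
}
```

With this shape, deleting user1's original while user2 still holds a copy leaves the blob in place, and deleting user2's copy afterwards removes it, so no orphaned blob is left behind.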

Solution 1: I would launch a Google Compute Engine instance and use the gsutil command to do the copy.

Then shut down the instance when it's done. To my knowledge, this is the fastest way to do the copy.

gsutil documentation

Solution 2: Personally, though, I would use a counter as suggested in the comments, because the part you said is scary would be just as much of a problem with the copy. So just use counters, with strong unit tests around them, for example; that will be less scary.

One idea to make it less scary: when your counter reaches 0, don't delete the blob right away; instead, schedule a job to do it later, using a scheduled task in Google App Engine. For example, delete the file and your actual record of it a month later.
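A rough sketch of the counter plus delayed delete might look like the following. It is plain Java with in-memory maps rather than the datastore, and the class and method names (`DeferredBlobDeletion`, `addReference`, `removeReference`, `sweep`) are made up for illustration. The assumption is that a scheduled task would call `sweep` periodically and then perform the real blob deletion for each key it returns.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Reference counter with a grace period before deletion; names are hypothetical. */
class DeferredBlobDeletion {
    private final Duration gracePeriod;
    // blobKey -> number of records currently pointing at the blob
    private final Map<String, Integer> referenceCount = new HashMap<>();
    // blobKey -> earliest time at which the blob may be deleted
    private final Map<String, Instant> deleteNotBefore = new HashMap<>();

    DeferredBlobDeletion(Duration gracePeriod) {
        this.gracePeriod = gracePeriod;
    }

    void addReference(String blobKey) {
        referenceCount.merge(blobKey, 1, Integer::sum);
        // The blob is in use again, so cancel any pending deletion.
        deleteNotBefore.remove(blobKey);
    }

    void removeReference(String blobKey, Instant now) {
        Integer count = referenceCount.get(blobKey);
        if (count == null || count == 0) {
            throw new IllegalStateException("No references left for " + blobKey);
        }
        int remaining = count - 1;
        referenceCount.put(blobKey, remaining);
        if (remaining == 0) {
            // Do not delete yet; only note when deletion becomes allowed.
            deleteNotBefore.put(blobKey, now.plus(gracePeriod));
        }
    }

    /** Intended to be called by a scheduled job; returns keys that are due for deletion. */
    List<String> sweep(Instant now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : deleteNotBefore.entrySet()) {
            if (!entry.getValue().isAfter(now)) {
                due.add(entry.getKey());
            }
        }
        for (String blobKey : due) {
            deleteNotBefore.remove(blobKey);
            referenceCount.remove(blobKey);
        }
        return due;
    }
}
```

Constructing it with `Duration.ofDays(30)` approximates the "a month later" suggestion; passing the current time in explicitly keeps the counter logic easy to unit test.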

As already mentioned in the comments, keep one blob and pass the key around. But you never really need to delete it. It is good practice to keep the blob for archival purposes. So how would delete actually work? In your datastore model, have a boolean delete field. You don't remove the blob key from an entity upon deletion; rather, you mark the boolean field as true. This way, your product has a record of every user who has ever owned a file, but the user never needs to know.
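To illustrate the flag idea, here is a small sketch in plain Java. A list stands in for the datastore, and the names (`FileRecord`, `SoftDeleteStore`, `markDeleted`, `visibleBlobKeys`, `everOwnedBy`) are invented for this example rather than taken from any App Engine API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** One user's ownership record for a shared blob; names are hypothetical. */
class FileRecord {
    final String owner;
    final String blobKey;
    boolean deleted; // the boolean "delete" field; the blob key is never cleared

    FileRecord(String owner, String blobKey) {
        this.owner = owner;
        this.blobKey = blobKey;
    }
}

/** In-memory stand-in for the datastore model. */
class SoftDeleteStore {
    private final List<FileRecord> records = new ArrayList<>();

    FileRecord add(String owner, String blobKey) {
        FileRecord record = new FileRecord(owner, blobKey);
        records.add(record);
        return record;
    }

    /** "Deleting" only flips the flag; the record and the blob stay. */
    void markDeleted(FileRecord record) {
        record.deleted = true;
    }

    /** What the user sees: only records not marked as deleted. */
    List<String> visibleBlobKeys(String owner) {
        List<String> keys = new ArrayList<>();
        for (FileRecord record : records) {
            if (record.owner.equals(owner) && !record.deleted) {
                keys.add(record.blobKey);
            }
        }
        return keys;
    }

    /** The archive view: every user who has ever owned the blob. */
    Set<String> everOwnedBy(String blobKey) {
        Set<String> owners = new LinkedHashSet<>();
        for (FileRecord record : records) {
            if (record.blobKey.equals(blobKey)) {
                owners.add(record.owner);
            }
        }
        return owners;
    }
}
```

The user-facing query filters on the flag, while the archive query ignores it, which is what keeps the ownership history intact.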