AWS DynamoDB to S3

I want to move (export) data from DynamoDB to S3.

I have seen this tutorial, but I'm not sure whether the extracted DynamoDB data will be deleted or will coexist in both DynamoDB and S3 at the same time.

What I expect is that the data will be deleted from DynamoDB and stored in S3 (after being stored in DynamoDB for X time).

The main purpose of the project could be similar to this.

Is there any way to do this without having to develop a Lambda function?

In summary, I have found these 2 different ways:

  • DynamoDB -> Data Pipeline -> S3 (is the DynamoDB data deleted?)

  • DynamoDB TTL + DynamoDB Streams -> Lambda -> Firehose -> S3 (this appears to be more difficult)

Is this post currently valid for this purpose?

What would be the simplest and fastest way?

In your first option, by default, data is not removed from DynamoDB. You can design a pipeline to make this work, but I think that is not the best solution.

In your second option, you must evaluate the solution based on your expected data volume:

  1. If the volume of data expiring under the TTL definition is not very large, you can use Lambda to persist the removed data into S3 without Firehose. You can design a simple Lambda function that is triggered by the DynamoDB stream and persists each stream event as an S3 object (see the first sketch after this list). You can even trigger another Lambda function to consolidate the objects into a single file at the end of the day, week, or month. But again, this depends on your expected volume.

  2. If you have a lot of data expiring at the same time and you must perform transformations on this data, the best solution is to use Firehose. Firehose can transform, encrypt, and compress your data before sending it to S3 (see the second sketch below). If the volume of data is too big, running consolidation functions at the end of the day, week, or month may not be feasible, so it is better to perform all of these steps before persisting the data.
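For the first case, here is a minimal sketch of such a Lambda function in Python (boto3). The bucket name and key layout are assumptions; the `userIdentity` check is what distinguishes TTL expirations from ordinary deletes in the stream, and the table's stream must include old images:

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-archive-bucket"  # assumption: replace with your bucket


def handler(event, context):
    for record in event["Records"]:
        # TTL expirations arrive as REMOVE events issued by the
        # DynamoDB service principal; skip everything else.
        if record["eventName"] != "REMOVE":
            continue
        identity = record.get("userIdentity", {})
        if identity.get("principalId") != "dynamodb.amazonaws.com":
            continue

        # OldImage holds the item as it was before expiration (requires
        # the stream view type OLD_IMAGE or NEW_AND_OLD_IMAGES).
        item = record["dynamodb"]["OldImage"]
        key = f"expired/{record['eventID']}.json"  # one object per event
        s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(item))
```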
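For the second case, the Lambda function only needs to forward the expired items to a Kinesis Data Firehose delivery stream, and Firehose takes care of batching, transformation, and compression before writing to S3. A minimal sketch, assuming a delivery stream named `ddb-archive` already exists and points at your bucket:

```python
import json
import boto3

firehose = boto3.client("firehose")
STREAM = "ddb-archive"  # assumption: an existing Firehose delivery stream


def handler(event, context):
    records = []
    for record in event["Records"]:
        if record["eventName"] != "REMOVE":
            continue
        item = record["dynamodb"].get("OldImage", {})
        # Newline-delimited JSON keeps the objects Firehose writes
        # easy to consume later.
        records.append({"Data": (json.dumps(item) + "\n").encode("utf-8")})

    if records:
        # put_record_batch accepts up to 500 records per call; the default
        # Lambda batch size for stream sources (100) stays under that.
        firehose.put_record_batch(DeliveryStreamName=STREAM, Records=records)
```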

You can use AWS Data Pipeline to dump a DynamoDB table to S3, and the data will not be deleted from the table.
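If you prefer to script the export instead of clicking through the console, a rough sketch with boto3 might look like this. All names here are hypothetical, and the definition is deliberately abbreviated: a working pipeline also needs the DynamoDBDataNode, S3DataNode, EmrCluster, and EmrActivity objects that the console's "Export DynamoDB table to S3" template generates:

```python
import boto3

dp = boto3.client("datapipeline")

# Create an empty pipeline shell (name and uniqueId are hypothetical).
pipeline_id = dp.create_pipeline(
    name="ddb-to-s3-export", uniqueId="ddb-to-s3-export-1"
)["pipelineId"]

# Attach a (deliberately abbreviated) definition and activate it.
dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=[
        {
            "id": "Default",
            "name": "Default",
            "fields": [
                {"key": "scheduleType", "stringValue": "ondemand"},
                {"key": "failureAndRerunMode", "stringValue": "CASCADE"},
                {"key": "pipelineLogUri", "stringValue": "s3://my-bucket/logs/"},
                {"key": "role", "stringValue": "DataPipelineDefaultRole"},
                {"key": "resourceRole", "stringValue": "DataPipelineDefaultResourceRole"},
            ],
        },
        # ...data nodes, EMR cluster, and export activity objects go here...
    ],
)
dp.activate_pipeline(pipelineId=pipeline_id)
```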
