

AWS: load data from S3 to RDS

I have a question regarding technical architecture on AWS.

Situation: There are a couple of sales units, each with its own database in a different location; the databases are not connected to each other. The business requirement is that the sales units place aggregated data in CSV files, which will later be loaded into a reporting database.

I already know that I will need to build complex ETL processes (I work with SSIS), schedule jobs, and write procedures and execute them automatically. Basically everything that MS SQL Server plus Data Tools does.

Question: Is it possible to load data securely to S3 and then load it into RDS (MSSQL) via an ETL process that runs exclusively on AWS? Is it a good idea? Can AWS Glue or Data Pipeline do the job?

If so, please name the services and, if possible, link to how to do those tasks.

Thank you for your opinions.

Absolutely.

At a high level, within the data pipeline you would need:

  1. S3 data node - your input data.
  2. Activity - any transformation you want to do.
  3. Resource - either EMR or EC2, depending on what resources/software are needed.
  4. RDS data node - the output of the process, your RDS database.
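
To make the four pieces concrete, here is a minimal sketch of how they might be wired together in a pipeline definition, built as a plain Python dict and printed as JSON. The object types (`S3DataNode`, `CopyActivity`, `Ec2Resource`, `SqlDataNode`, `RdsDatabase`) are taken from the Data Pipeline object reference as I recall it; the ids, bucket path, instance id, table and column names are all hypothetical, and whether an RDS SQL Server target needs a `JdbcDatabase` object with a custom driver instead of `RdsDatabase` is something to verify against the documentation.

```python
import json


def build_pipeline_definition(csv_dir, rds_instance_id, table, columns):
    """Assemble a Data Pipeline style definition as a plain dict (illustrative only)."""
    placeholders = ", ".join("?" for _ in columns)
    # "#{table}" is a Data Pipeline expression, so the braces are escaped in the f-string.
    insert_query = (
        f"INSERT INTO #{{table}} ({', '.join(columns)}) VALUES ({placeholders});"
    )
    objects = [
        # 1. S3 data node: the CSV files dropped by the sales units.
        {"id": "InputCsv", "type": "S3DataNode",
         "directoryPath": csv_dir, "dataFormat": {"ref": "CsvFormat"}},
        {"id": "CsvFormat", "type": "CSV"},
        # 2. Activity: a straight copy here; a transformation step would replace it.
        {"id": "LoadCsv", "type": "CopyActivity",
         "input": {"ref": "InputCsv"}, "output": {"ref": "ReportTable"},
         "runsOn": {"ref": "Worker"}},
        # 3. Resource: the EC2 instance the activity runs on (assumed size).
        {"id": "Worker", "type": "Ec2Resource",
         "instanceType": "t2.micro", "terminateAfter": "1 Hour"},
        # 4. RDS data node: the target table plus its database connection.
        {"id": "ReportTable", "type": "SqlDataNode",
         "database": {"ref": "ReportDb"}, "table": table,
         "insertQuery": insert_query},
        {"id": "ReportDb", "type": "RdsDatabase",
         "rdsInstanceId": rds_instance_id,
         # Credentials are left as pipeline parameters rather than literals.
         "username": "#{myRdsUsername}", "*password": "#{*myRdsPassword}"},
    ]
    return {"objects": objects}


def unresolved_refs(definition):
    """Return (object id, missing ref) pairs for refs that point at no object."""
    ids = {obj["id"] for obj in definition["objects"]}
    missing = []
    for obj in definition["objects"]:
        for value in obj.values():
            if isinstance(value, dict) and "ref" in value and value["ref"] not in ids:
                missing.append((obj["id"], value["ref"]))
    return missing


if __name__ == "__main__":
    definition = build_pipeline_definition(
        "s3://example-bucket/sales/", "report-db", "sales_summary", ["unit", "total"]
    )
    print(json.dumps(definition, indent=2))
```

The `unresolved_refs` helper is only a local sanity check that every `ref` points at a defined object; it does not replace validation by the service itself.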

In addition to the above, you can also set up retries and alerts for failures, successes, etc.
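
As a rough illustration of the retry and alert settings, the sketch below attaches them to an activity object represented as a Python dict. The field names (`maximumRetries`, `onFail`, `onSuccess`) and the `SnsAlarm` object type are given as I remember them from the Data Pipeline object reference and should be checked there; the helper name, the activity id, the role name and the topic ARN are hypothetical.

```python
def with_retries_and_alerts(activity, topic_arn, max_retries=3):
    """Return a copy of `activity` with retry/alert fields, plus the alarm objects."""
    base_id = activity["id"]
    on_fail = {
        "id": f"{base_id}Failed", "type": "SnsAlarm", "topicArn": topic_arn,
        "subject": f"{base_id} failed",
        "message": f"Activity {base_id} failed after retries.",
        "role": "DataPipelineDefaultRole",  # assumed role name
    }
    on_success = {
        "id": f"{base_id}Succeeded", "type": "SnsAlarm", "topicArn": topic_arn,
        "subject": f"{base_id} succeeded",
        "message": f"Activity {base_id} finished.",
        "role": "DataPipelineDefaultRole",  # assumed role name
    }
    updated = dict(
        activity,
        maximumRetries=str(max_retries),
        onFail={"ref": on_fail["id"]},
        onSuccess={"ref": on_success["id"]},
    )
    # The original activity dict is left untouched; callers add the alarms
    # to the pipeline's object list alongside the updated activity.
    return updated, [on_fail, on_success]


if __name__ == "__main__":
    copy_activity = {"id": "LoadCsv", "type": "CopyActivity"}
    activity, alarms = with_retries_and_alerts(
        copy_activity, "arn:aws:sns:us-east-1:123456789012:etl-alerts"
    )
    print(activity)
    print([alarm["id"] for alarm in alarms])
```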

You can refer to the AWS documentation here: https://aws.amazon.com/documentation/data-pipeline/ and https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/welcome.html
