
AWS Redshift ETL Process

I'm investigating Amazon Redshift for our data warehouse, and I'm trying to work out how to architect a solution.

I have an Amazon Kinesis Firehose delivery stream that writes to my Redshift database, and all of that works fine.
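(For reference, producers write to such a delivery stream roughly like the following minimal boto3 sketch; the stream name and payload here are hypothetical. Firehose buffers the records and COPYs them into the configured Redshift table.)

```python
import json
import boto3

firehose = boto3.client("firehose")

def send_event(event: dict) -> None:
    # "my-redshift-stream" is a placeholder for your own delivery stream name.
    firehose.put_record(
        DeliveryStreamName="my-redshift-stream",
        Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
    )

send_event({"user_id": 42, "action": "click", "ts": "2020-01-01T00:00:00Z"})
```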

Now my issue is how to automate the creation of dimension and fact tables.

Can I use a Lambda function in the delivery stream to write to the fact table and update the dimensions?

The Data Transformation capability of AWS Lambda on an Amazon Kinesis Firehose is purely to modify or exclude streaming data. It cannot be used to create other tables.
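To illustrate the limitation: a Firehose transformation Lambda receives a batch of records and must return each one marked `Ok`, `Dropped`, or `ProcessingFailed`; there is no hook for issuing DDL or writing to other tables. A sketch of the handler contract (the payload fields are hypothetical; the event/response shape follows the Firehose data-transformation format):

```python
import base64
import json

def handler(event, context):
    """Firehose data-transformation handler: may modify or drop
    records, but cannot create tables or route data elsewhere."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        if payload.get("action") is None:
            # Exclude malformed records from delivery.
            output.append({
                "recordId": record["recordId"],
                "result": "Dropped",
                "data": record["data"],
            })
            continue

        # Example in-place modification before delivery to Redshift.
        payload["action"] = payload["action"].lower()
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8"),
        })
    return {"records": output}
```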

If you wish to create dimension and fact tables, or otherwise perform ETL, you'll need to trigger it externally, such as by having a scheduled task run SQL commands on your Amazon Redshift instance. This task would connect via JDBC/ODBC to run the commands.
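A minimal sketch of such a scheduled task, assuming Firehose lands raw rows in a staging table and using psycopg2 for the connection (the cluster endpoint and all table/column names below are hypothetical):

```python
import psycopg2

# Connection details are placeholders for your own Redshift cluster.
conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="warehouse",
    user="etl_user",
    password="...",
)

ETL_STATEMENTS = [
    # Insert any users not yet present in the dimension.
    """
    INSERT INTO dim_user (user_id)
    SELECT DISTINCT s.user_id
    FROM stg_events s
    LEFT JOIN dim_user d ON d.user_id = s.user_id
    WHERE d.user_id IS NULL;
    """,
    # Load facts, resolving the surrogate key from the dimension.
    """
    INSERT INTO fact_events (user_key, action, event_ts)
    SELECT d.user_key, s.action, s.ts
    FROM stg_events s
    JOIN dim_user d ON d.user_id = s.user_id;
    """,
    # Clear the staging table (DELETE rather than TRUNCATE, since
    # TRUNCATE commits immediately in Redshift and would break the
    # single-transaction behavior).
    "DELETE FROM stg_events;",
]

# The context manager commits all statements as one transaction,
# or rolls back if any of them fails.
with conn, conn.cursor() as cur:
    for stmt in ETL_STATEMENTS:
        cur.execute(stmt)
conn.close()
```

The same pattern works from cron, a Lambda function on a schedule, or any orchestrator that can reach the cluster.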
