
AWS Redshift ETL Process

I'm investigating Amazon Redshift for our data warehouse, and I'm trying to work out how to architect a solution.

I have an Amazon Kinesis Firehose delivery stream that writes to my Redshift database, and all of that works fine.
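(For reference, producers write to such a delivery stream roughly like the following minimal boto3 sketch; the stream name and payload here are hypothetical. Firehose buffers the records and COPYs them into the configured Redshift table.)

```python
import json
import boto3

firehose = boto3.client("firehose")

def send_event(event: dict) -> None:
    # "my-redshift-stream" is a placeholder for your own delivery stream name.
    firehose.put_record(
        DeliveryStreamName="my-redshift-stream",
        Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
    )

send_event({"user_id": 42, "action": "click", "ts": "2020-01-01T00:00:00Z"})
```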

Now my issue is how to automate the creation of dimension and fact tables.

Can I use a Lambda function in the delivery stream to write to the fact table and update the dimensions?

The Data Transformation capability of AWS Lambda on an Amazon Kinesis Firehose is purely to modify or exclude streaming data. It cannot be used to create other tables.
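To illustrate the limitation: a Firehose transformation Lambda receives a batch of records and must return each one marked `Ok`, `Dropped`, or `ProcessingFailed`; there is no hook for issuing DDL or writing to other tables. A sketch of the handler contract (the payload fields are hypothetical; the event/response shape follows the Firehose data-transformation format):

```python
import base64
import json

def handler(event, context):
    """Firehose data-transformation handler: may modify or drop
    records, but cannot create tables or route data elsewhere."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        if payload.get("action") is None:
            # Exclude malformed records from delivery.
            output.append({
                "recordId": record["recordId"],
                "result": "Dropped",
                "data": record["data"],
            })
            continue

        # Example in-place modification before delivery to Redshift.
        payload["action"] = payload["action"].lower()
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8"),
        })
    return {"records": output}
```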

If you wish to create dimension and fact tables, or otherwise perform ETL, you'll need to trigger it externally, such as by having a scheduled task run SQL commands on your Amazon Redshift instance. This task would connect via JDBC/ODBC to run the commands.
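A minimal sketch of such a scheduled task, assuming Firehose lands raw rows in a staging table and using psycopg2 for the connection (the cluster endpoint and all table/column names below are hypothetical):

```python
import psycopg2

# Connection details are placeholders for your own Redshift cluster.
conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="warehouse",
    user="etl_user",
    password="...",
)

ETL_STATEMENTS = [
    # Insert any users not yet present in the dimension.
    """
    INSERT INTO dim_user (user_id)
    SELECT DISTINCT s.user_id
    FROM stg_events s
    LEFT JOIN dim_user d ON d.user_id = s.user_id
    WHERE d.user_id IS NULL;
    """,
    # Load facts, resolving the surrogate key from the dimension.
    """
    INSERT INTO fact_events (user_key, action, event_ts)
    SELECT d.user_key, s.action, s.ts
    FROM stg_events s
    JOIN dim_user d ON d.user_id = s.user_id;
    """,
    # Clear the staging table (DELETE rather than TRUNCATE, since
    # TRUNCATE commits immediately in Redshift and would break the
    # single-transaction behavior).
    "DELETE FROM stg_events;",
]

# The context manager commits all statements as one transaction,
# or rolls back if any of them fails.
with conn, conn.cursor() as cur:
    for stmt in ETL_STATEMENTS:
        cur.execute(stmt)
conn.close()
```

The same pattern works from cron, a Lambda function on a schedule, or any orchestrator that can reach the cluster.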
