
Data Pipeline Solution

We have a use case for a data pipeline solution that needs the following:

  1. Ability to have multiple steps (the output of one step should feed in as the input to the next).
  2. Ability to have multiple algorithms (a SQL query, or possibly a call to a REST endpoint) in each step.
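
To make the two requirements above concrete, here is a minimal sketch in plain Python. It is not taken from any existing product, and every name in it (`run_pipeline`, `sql_scalar`, `rest_average_stub`, the `orders` table) is hypothetical. It assumes each step is a dict of named algorithms, that every algorithm in a step receives the same input, and that the dict of their results becomes the input of the next step. An in-memory SQLite database stands in for the DW tables, and a local function stands in for the REST call.

```python
import sqlite3
from typing import Any, Callable, Dict, List

# An algorithm takes the previous step's output and returns a result.
Algorithm = Callable[[Any], Any]


def sql_scalar(conn: sqlite3.Connection, query: str) -> Algorithm:
    """Wrap a SQL query as an algorithm that returns a single value.

    The step input (a dict) supplies the query's named parameters.
    """
    def run(params: Dict[str, Any]) -> Any:
        return conn.execute(query, params).fetchone()[0]
    return run


def rest_average_stub(payload: Dict[str, Any]) -> float:
    """Hypothetical stand-in for a REST endpoint; a real step would send
    `payload` over HTTP and parse the response instead."""
    return payload["total"] / payload["count"]


def run_pipeline(steps: List[Dict[str, Algorithm]], first_input: Any) -> Any:
    """Run steps in order, feeding each step's named results to the next."""
    data = first_input
    for step in steps:
        data = {name: algo(data) for name, algo in step.items()}
    return data


# Demo data: an in-memory table standing in for a DW table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("eu", 10.0), ("eu", 5.0), ("us", 7.0)],
)

steps = [
    {   # Step 1: two SQL algorithms run on the same input.
        "total": sql_scalar(conn, "SELECT SUM(amount) FROM orders WHERE region = :region"),
        "count": sql_scalar(conn, "SELECT COUNT(*) FROM orders WHERE region = :region"),
    },
    {   # Step 2: one "REST" algorithm consuming step 1's output.
        "average": rest_average_stub,
    },
]

result = run_pipeline(steps, {"region": "eu"})
print(result)  # {'average': 7.5}
```

A managed service would add what this sketch leaves out: scheduling, retries, and persisting intermediate results (for example in S3) between steps.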

The input to the first step can be anything. We have data warehouse (DW) tables, but we can pre-process them and keep the relevant information in AWS S3 or another data store.

Something like this: [image: data pipeline]

Is there an existing solution that already provides functionality like this, or that can be modified to support it?

Something within AWS would be easier for us to integrate.

How about AWS Glue? It sounds like a fit for your goals...
