How to design a multi-client preprocessing pipeline using AWS?
My goal is to automate a preprocessing pipeline. The pipeline has three code blocks:
Fetching the data, either through an API or by the client uploading a CSV to an S3 bucket.
Processing the data: my goal is to unify the data from the different clients into one common end schema.
Storing the result in a database.
I know this is a very common system, but I have failed to find the best design for it.
The requirements are:
I thought of the following:
The Lambda solution: schedule a Lambda for each client that fetches the data every X days; that Lambda triggers another Lambda that does the processing. But with 100 clients, managing 200 Lambdas would be awful.
2.1 Make a project called Api with a different script for each client, and a schedule for each script on EC2 or ECS.
2.2 Have another project called Processing, where the parent class holds the common code and every client subclass inherits from it; the API script activates the relevant processing script.
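A minimal sketch of the layout described in 2.2, assuming Python. The class names, the `PROCESSORS` registry, and the column names are all hypothetical and only illustrate the shape of the idea:

```python
from abc import ABC, abstractmethod


class BaseProcessor(ABC):
    """Common code shared by all clients (hypothetical parent class)."""

    # Columns of the unified end schema (assumed for illustration).
    SCHEMA = ("customer_id", "amount")

    @abstractmethod
    def to_unified(self, row: dict) -> dict:
        """Map one raw client row onto the unified schema."""

    def process(self, rows: list) -> list:
        unified = [self.to_unified(row) for row in rows]
        # Shared validation: every output row must match the schema exactly.
        for row in unified:
            if tuple(row) != self.SCHEMA:
                raise ValueError(f"row does not match schema: {row}")
        return unified


class ClientAProcessor(BaseProcessor):
    def to_unified(self, row):
        return {"customer_id": str(row["id"]), "amount": float(row["total"])}


class ClientBProcessor(BaseProcessor):
    def to_unified(self, row):
        # Hypothetical client B reports amounts in cents.
        return {"customer_id": row["cust"], "amount": int(row["cents"]) / 100}


# Registry mapping a client name to its processor class.
PROCESSORS = {"client_a": ClientAProcessor, "client_b": ClientBProcessor}


def run_processing(client: str, rows: list) -> list:
    """Entry point the API script would call for a given client."""
    return PROCESSORS[client]().process(rows)
```

With this layout, adding a client means adding one subclass and one registry entry; the validation in `process` stays in a single place.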
In the end I am very confused about what the best practice is. I only found examples that handle a single client, or general approaches and block diagrams that are too broad. Because I know this is such a common system, I would appreciate learning from others' experience.
I would appreciate any reference links or wisdom.
Take a look at Step Functions. It will allow you to decouple the execution of each stage and to reuse your Lambdas.
By passing input into the step function, the top Lambda might be able to make decisions that feed into the others.
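As a rough sketch of that idea, assuming Python Lambdas: one state machine with a fetch step and a process step, where the client is named in the execution input instead of being baked into a per-client function. The ARNs are placeholders, and the input fields (`client`, `source`, `bucket`, `api_url`) are hypothetical:

```python
import json

# Placeholder ARNs (assumptions); replace with the ARNs of your own functions.
FETCH_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:fetch"
PROCESS_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:process"


def build_definition() -> str:
    """Amazon States Language definition: Fetch, then Process."""
    definition = {
        "StartAt": "Fetch",
        "States": {
            "Fetch": {"Type": "Task", "Resource": FETCH_ARN, "Next": "Process"},
            "Process": {"Type": "Task", "Resource": PROCESS_ARN, "End": True},
        },
    }
    return json.dumps(definition, indent=2)


def fetch_handler(event, context):
    """Shared fetch Lambda: the execution input says which client to handle."""
    client = event["client"]
    source = event.get("source", "api")  # "api" or "s3" (hypothetical field)
    if source == "s3":
        location = f"s3://{event['bucket']}/{client}/latest.csv"
    else:
        location = event["api_url"]
    # The returned dict becomes the input of the Process state.
    return {"client": client, "source": source, "location": location}
```

Under these assumptions, 100 clients means 100 different execution inputs against the same two Lambdas, not 200 separate functions.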
To schedule this, use a scheduled CloudWatch event.
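One possible way to set that up from Python is sketched below, assuming boto3 and one rule per client. CloudWatch Events is exposed in boto3 as the `events` client (the service is now called EventBridge). The rule naming scheme, the target `Id`, and the `{"client": ...}` input shape are assumptions for illustration:

```python
import json


def schedule_client(events, client, every_days, state_machine_arn, role_arn):
    """Create one scheduled rule per client that starts a state machine.

    `events` is expected to be a boto3 client, e.g. boto3.client("events").
    """
    # Rate expressions use the singular unit when the value is 1.
    unit = "day" if every_days == 1 else "days"
    rule_name = f"preprocess-{client}"  # hypothetical naming scheme
    events.put_rule(
        Name=rule_name,
        ScheduleExpression=f"rate({every_days} {unit})",
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "start-pipeline",
            "Arn": state_machine_arn,
            "RoleArn": role_arn,  # role allowed to call states:StartExecution
            "Input": json.dumps({"client": client}),
        }],
    )
    return rule_name
```

Calling it would look like `schedule_client(boto3.client("events"), "acme", 7, state_machine_arn, role_arn)`, where both ARNs come from your own account.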