
Running three “connected” scripts on AWS EC2

I have 3 scripts: the 1st and the 3rd are written in R, and the 2nd in Python.

The output of the 1st script is the input of the 2nd script, and the output of the 2nd is the input of the 3rd.

The inputs and outputs are search keywords or phrases.

For example, the output of the 1st script is Hello, then the 2nd turns the word into olleH, and the 3rd converts the letters to uppercase: OLLEH.

My question is how I can connect those scripts and let them run automatically, without my intervention, on AWS. What would the commands be? How can the output of the 1st script be saved and used as the input of the 2nd one, and so on?

I would start with an sh script (or a bat file on a Windows machine), then use the output of each script as the input for the next. So something like:

#!/bin/sh
# Capture what each script prints and pass it to the next script as an argument.
var1=$(Rscript script1.R)
var2=$(python3 script2.py "$var1")
var3=$(Rscript script3.R "$var2")
echo "$var3"
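The same chaining can also be sketched in Python with the standard subprocess module, which may be convenient if the driver should later grow logging or error handling. This is only a minimal sketch: the helper names run_step and run_pipeline are made up for illustration, and it assumes each script prints its result to standard output and accepts the previous result as its first argument.

```python
import subprocess

# Commands taken from the shell example; "python3" is an assumption about
# how the interpreter is named on the EC2 instance.
PIPELINE = [
    ["Rscript", "script1.R"],
    ["python3", "script2.py"],
    ["Rscript", "script3.R"],
]


def run_step(command, argument=None):
    """Run one command, optionally appending an argument, and return its stdout."""
    args = list(command)
    if argument is not None:
        args.append(argument)
    # check=True raises CalledProcessError if the script exits with an error.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_pipeline(commands):
    """Run the commands in order, feeding each output to the next command."""
    output = None
    for command in commands:
        output = run_step(command, output)
    return output
```

Calling print(run_pipeline(PIPELINE)) on a machine with R and Python installed would run the three scripts in order and print the final result.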

Of course, you need to change your scripts so that they use the inputs passed to them.

I have never used AWS, so I'm unfamiliar with it, but this sounds like a problem a workflow management system would solve. Take a look at snakemake or nextflow. With these tools you can easily (once you get used to them) do exactly what you describe: run scripts or tools that depend on each other sequentially (and also in parallel).

You can use AWS Step Functions to achieve your goal. For the Python part you can use AWS Lambda tasks, and for the R parts AWS ECS tasks, orchestrating the data flow between them accordingly.

https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html
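To make the Step Functions suggestion a little more concrete, the sketch below builds a three-state definition in Amazon States Language as a Python dictionary and serializes it to JSON. Everything named here is an assumption for illustration: the state names, the ARNs with their REGION and ACCOUNT_ID placeholders, and the choice of one ECS task definition per R script. Note also that an ECS task does not hand its container's printed output back to the state machine, so in practice the R steps would probably need to exchange results through some shared store; that part is left out.

```python
import json

# Hypothetical identifiers; replace them with values from your own account.
ECS_CLUSTER = "arn:aws:ecs:REGION:ACCOUNT_ID:cluster/pipeline"
SCRIPT1_TASK = "arn:aws:ecs:REGION:ACCOUNT_ID:task-definition/script1-r"
SCRIPT3_TASK = "arn:aws:ecs:REGION:ACCOUNT_ID:task-definition/script3-r"
SCRIPT2_FUNCTION = "arn:aws:lambda:REGION:ACCOUNT_ID:function:script2"


def ecs_state(task_definition):
    """State that runs an ECS task and waits for it to finish (.sync)."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::ecs:runTask.sync",
        "Parameters": {"Cluster": ECS_CLUSTER, "TaskDefinition": task_definition},
    }


def lambda_state(function_arn):
    """State that invokes a Lambda function with the current state input."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {"FunctionName": function_arn, "Payload.$": "$"},
        # Pass only the function's return value on to the next state.
        "OutputPath": "$.Payload",
    }


def build_definition():
    """Chain the three steps: R (ECS), then Python (Lambda), then R (ECS)."""
    script1 = ecs_state(SCRIPT1_TASK)
    script1["Next"] = "RunScript2"
    script2 = lambda_state(SCRIPT2_FUNCTION)
    script2["Next"] = "RunScript3"
    script3 = ecs_state(SCRIPT3_TASK)
    script3["End"] = True
    return {
        "Comment": "Sketch: run script1.R, script2.py and script3.R in order",
        "StartAt": "RunScript1",
        "States": {
            "RunScript1": script1,
            "RunScript2": script2,
            "RunScript3": script3,
        },
    }


def definition_json():
    """Return the definition as a JSON string."""
    return json.dumps(build_definition(), indent=2)
```

The string returned by definition_json() is the kind of document a state machine is created from; a real setup would additionally need IAM roles, networking for the ECS tasks, and a way to trigger executions.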

For commands, I wouldn't count on receiving a comprehensive response: workflows are complex and very individual in each case. I would recommend defining them via some sort of IaC solution such as CloudFormation or AWS CDK and keeping them under git.

https://docs.aws.amazon.com/cdk/api/latest/docs/aws-stepfunctions-readme.html


 