How do I get a Python Glue job to return when it is called from a Step Function?

Question

I have a Glue job, written in Python, that I call from a Step Function. The Step Function starts the job successfully, and the job finishes successfully, but the Step Function never moves on to the next step. Is there some configuration or permission the Step Function needs in order to respond to job success? Or something I need to do in the Python script?

Here is the relevant state from the Step Function (state machine) definition:

"MyGlueTask": {
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName": "my_glue_job"
  },
  "ResultPath": "$.MyGlueTask",
  "Next": "NextGlueJob"
}

Answer 1

Are you sure it never moves to the next step? Maybe it does, but only after, say, 5 minutes?

I'm asking because Step Functions has a limitation here: even if your Glue job finishes in a few seconds, Step Functions only polls the Glue job for its result about once every 5 minutes.

One workaround is to change arn:aws:states:::glue:startJobRun.sync to arn:aws:states:::glue:startJobRun. The task then only triggers the Glue job and moves straight on to the next step without waiting for it.

Most likely, though, you need to wait until the Glue job has finished and get some result out of it. In that case you have to wrap the previous state with a few more:

1. Start the Glue job. Besides starting it, we also need the JobRunId of the run. I don't know whether it can be retrieved from the Glue task itself, so I created a Lambda that starts the Glue job with the boto3 start_job_run function and takes JobRunId from the response.
2. Create a second Lambda that fetches the status (JobRunState) of the Glue job with the boto3 get_job_run function, using the JobRunId from the previous step.
3. Using a Wait state, run that status Lambda every N seconds.
4. Use a Choice state to branch on the Glue job status:
- If RUNNING, go back to the Wait step.
- If SUCCEEDED, go ahead to the next state.
- If FAILED or STOPPED, go wherever else you need to.
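The two Lambdas from steps 1 and 2 could be sketched roughly as follows. This is only an illustration, not the answerer's actual code: the handler names, the event keys (JobName, JobRunId) and the returned dictionary shape are assumptions. The boto3 calls start_job_run and get_job_run are the ones named above. The Glue client is passed in as a parameter so the logic can be exercised without AWS access.

```python
def start_glue_job(glue, job_name):
    """Start a Glue job run and return what the polling step needs."""
    response = glue.start_job_run(JobName=job_name)
    # start_job_run returns the run identifier under the "JobRunId" key.
    return {"JobName": job_name, "JobRunId": response["JobRunId"]}


def check_glue_job(glue, job_name, run_id):
    """Look up the current state of one Glue job run."""
    response = glue.get_job_run(JobName=job_name, RunId=run_id)
    state = response["JobRun"]["JobRunState"]  # e.g. RUNNING, SUCCEEDED, FAILED
    return {"JobName": job_name, "JobRunId": run_id, "JobRunState": state}


def start_handler(event, context):
    """Hypothetical Lambda entry point for step 1."""
    import boto3  # imported here so the helpers above stay testable offline
    return start_glue_job(boto3.client("glue"), event["JobName"])


def status_handler(event, context):
    """Hypothetical Lambda entry point for step 2."""
    import boto3
    return check_glue_job(boto3.client("glue"), event["JobName"], event["JobRunId"])
```

The status handler returns JobName and JobRunId again, so its output can be fed straight back into itself on the next pass through the Wait loop.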

Finally, it looks something like this.
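As a rough sketch of such a Wait/Choice polling loop, the snippet below builds a state machine definition as a Python dictionary and prints it as JSON. The state names, the 30-second wait, the Lambda ARNs and the Succeed placeholder for NextGlueJob are all assumptions made for illustration. It also assumes the status Lambda returns an object containing JobName, JobRunId and JobRunState.

```python
import json


def build_polling_machine(start_arn, status_arn, wait_seconds=30,
                          failure_states=("FAILED", "STOPPED")):
    """Build a Wait/Choice loop that polls a Glue job run via two Lambdas."""
    state_var = "$.MyGlueTask.JobRunState"
    choices = [{"Variable": state_var, "StringEquals": "SUCCEEDED",
                "Next": "NextGlueJob"}]
    # Other terminal states (for example TIMEOUT) could be added to failure_states.
    choices += [{"Variable": state_var, "StringEquals": s, "Next": "GlueJobFailed"}
                for s in failure_states]
    return {
        "StartAt": "StartGlueJob",
        "States": {
            "StartGlueJob": {
                "Type": "Task",
                "Resource": start_arn,  # Lambda that calls start_job_run
                "Parameters": {"JobName": "my_glue_job"},
                "ResultPath": "$.MyGlueTask",
                "Next": "WaitForGlueJob",
            },
            "WaitForGlueJob": {"Type": "Wait", "Seconds": wait_seconds,
                               "Next": "CheckGlueJob"},
            "CheckGlueJob": {
                "Type": "Task",
                "Resource": status_arn,  # Lambda that calls get_job_run
                "InputPath": "$.MyGlueTask",
                "ResultPath": "$.MyGlueTask",
                "Next": "IsGlueJobDone",
            },
            "IsGlueJobDone": {
                "Type": "Choice",
                "Choices": choices,
                "Default": "WaitForGlueJob",  # RUNNING and anything else: keep waiting
            },
            "GlueJobFailed": {"Type": "Fail", "Error": "GlueJobFailed"},
            "NextGlueJob": {"Type": "Succeed"},  # placeholder for the real next state
        },
    }


machine = build_polling_machine(
    "arn:aws:lambda:REGION:ACCOUNT:function:start-glue-job",    # hypothetical ARN
    "arn:aws:lambda:REGION:ACCOUNT:function:check-glue-status",  # hypothetical ARN
)
print(json.dumps(machine, indent=2))
```

Routing every unlisted status back to the Wait state keeps the loop simple, but it also means an unexpected terminal status would loop forever, so in practice you would probably want a retry limit or an execution timeout as well.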

Answer 2

The solution to my actual problem was permissions. The Step Function's role needs four permissions when running startJobRun.sync:

glue:StartJobRun
glue:GetJobRun
glue:GetJobRuns
glue:BatchStopJobRun

Those are the values as I wrote them in Terraform, but they should help anybody struggling with this.
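For readers not using Terraform, the same four actions can be expressed as an IAM policy document. The sketch below builds one as a Python dictionary and prints it as JSON. The "*" resource is an assumption chosen for brevity; in practice you would probably scope it to the ARN of your Glue job. The function name is made up for this example.

```python
import json

# The four actions listed above for glue:startJobRun.sync.
GLUE_SYNC_ACTIONS = [
    "glue:StartJobRun",
    "glue:GetJobRun",
    "glue:GetJobRuns",
    "glue:BatchStopJobRun",
]


def glue_sync_policy(resource="*"):
    """Return an IAM policy document allowing the four Glue actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(GLUE_SYNC_ACTIONS),
                "Resource": resource,  # assumption: narrow this to your job's ARN
            }
        ],
    }


print(json.dumps(glue_sync_policy(), indent=2))
```

The policy would be attached to the role that the state machine executes under, not to the Glue job's own role.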

Answer 1: score 0, posted 2019-07-02 10:48:08

Answer 2 (accepted): score 0, posted 2019-07-31 01:16:25
