How to have a Python Glue job return when called in a Step Function?
I have a Glue job, written in Python, that I call from a Step Function. The Step Function successfully starts the job, and the job finishes successfully, but the Step Function never moves to the next step. Is there some required configuration or permission for the Step Function to respond to job success? Or is there something I need to do in the Python script?
Here is the Step Function (state machine) definition:
"MyGlueTask": {
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName": "my_glue_job"
  },
  "ResultPath": "$.MyGlueTask",
  "Next": "NextGlueJob"
}
Are you sure it never moves to the next step? Maybe it does, but only after, say, 5 minutes?
I ask because Step Functions has a limitation: even if your Glue job finishes in a few seconds, Step Functions polls the Glue job for its result only once every 5 minutes.
One workaround is to change arn:aws:states:::glue:startJobRun.sync to arn:aws:states:::glue:startJobRun. The task then only triggers the Glue job and moves straight on to the next step without waiting for the job to finish.
Most likely, though, you need to wait until the Glue job has finished and get some result out of it. In that case you need to wrap the previous state with a few more:
1. Run the Glue job and get its JobRunId. I don't know whether it can be retrieved from the Glue job itself, so I created a Lambda that runs the Glue job using the boto3 start_job_run function and then takes the JobRunId from the response.
2. Create a Lambda that checks the status (JobRunState) of the Glue job (via the boto3 get_job_run function) using the JobRunId from the previous step.
3. Using the Wait Step Functions state type, run the Lambda you created every N seconds.
4. Use the Choice state type to filter the Glue job statuses:
   - If the status is RUNNING, go back to the Wait step.
   - If it is SUCCEEDED, go ahead to the next state.
   - If it is FAILED or STOPPED, go wherever else you need.
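The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code: the helper and handler names, the state names, the `Status` field, the 30-second default, and the decision to treat STARTING, STOPPING and WAITING like RUNNING are all hypothetical choices. Only `start_job_run`, `get_job_run`, `JobRunId` and `JobRunState` come from boto3's Glue client, and `NextGlueJob` from the question.

```python
# Run states treated as "still in progress". The answer names only RUNNING;
# the other three are an assumption added for this sketch.
IN_PROGRESS = {"STARTING", "RUNNING", "STOPPING", "WAITING"}


def start_job(glue, job_name):
    """Step 1: start a Glue job run and return what the poller needs."""
    response = glue.start_job_run(JobName=job_name)
    return {"JobName": job_name, "JobRunId": response["JobRunId"]}


def check_job(glue, job_name, run_id):
    """Step 2: look up one run and collapse its state for the Choice state."""
    run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
    state = run["JobRunState"]
    if state in IN_PROGRESS:
        status = "RUNNING"
    elif state == "SUCCEEDED":
        status = "SUCCEEDED"
    else:  # FAILED, STOPPED, TIMEOUT, ERROR, ...
        status = "FAILED"
    return {"JobName": job_name, "JobRunId": run_id, "Status": status}


def start_handler(event, context):
    """Lambda entry point for step 1; expects {"JobName": ...} as input."""
    import boto3  # imported lazily so the helpers above can be tested offline
    return start_job(boto3.client("glue"), event["JobName"])


def check_handler(event, context):
    """Lambda entry point for step 2; expects the output of start_handler."""
    import boto3
    return check_job(boto3.client("glue"), event["JobName"], event["JobRunId"])


def polling_states(start_arn, check_arn, wait_seconds=30):
    """Steps 3 and 4: the Wait/Choice loop as a state machine fragment."""
    return {
        "StartGlueJob": {"Type": "Task", "Resource": start_arn, "Next": "WaitForJob"},
        "WaitForJob": {"Type": "Wait", "Seconds": wait_seconds, "Next": "CheckGlueJob"},
        "CheckGlueJob": {"Type": "Task", "Resource": check_arn, "Next": "IsJobDone"},
        "IsJobDone": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.Status", "StringEquals": "RUNNING", "Next": "WaitForJob"},
                {"Variable": "$.Status", "StringEquals": "SUCCEEDED", "Next": "NextGlueJob"},
            ],
            "Default": "JobFailed",  # FAILED, STOPPED and anything else
        },
        "JobFailed": {"Type": "Fail", "Error": "GlueJobFailed"},
    }
```

Passing the Glue client in as an argument keeps the two helpers testable without AWS credentials. The dictionary returned by `polling_states` can be dumped with `json.dumps` and merged into the `States` section of a state machine definition.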
The solution to my actual problem was permissions. You need four permissions when running startJobRun.sync:
Those are actually the Terraform values, but they should help anybody struggling with this.
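For reference, the AWS documentation for the glue:startJobRun.sync integration lists four IAM actions for the state machine's execution role. A minimal sketch, written in Python rather than Terraform, that builds such a policy document is shown here; the function name and the wildcard default for the resource are assumptions, and in practice the resource would usually be narrowed to the job's ARN.

```python
import json

# IAM actions listed in the AWS documentation for the Glue "Run a Job (.sync)"
# integration pattern.
GLUE_SYNC_ACTIONS = [
    "glue:StartJobRun",
    "glue:GetJobRun",
    "glue:GetJobRuns",
    "glue:BatchStopJobRun",
]


def glue_sync_policy(resource="*"):
    """Build an IAM policy document for the state machine's execution role.

    `resource` defaults to "*" only to keep the sketch short (an assumption).
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(GLUE_SYNC_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(glue_sync_policy(), indent=2))
```

The actions beyond glue:StartJobRun are what allow Step Functions to follow the run to completion and to stop it if the execution is aborted, which fits the symptom in the question: the job starts, but the state never completes.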