
Triggering AWS Glue Workflow through Lambda function

I am new to AWS Glue and am trying to trigger a Glue workflow from a Lambda function.

I am using boto3.client('glue'), but I am getting an error saying:

'Glue' object has no attribute 'start_workflow_run'

Here is the piece of code that I am trying to run:

import json
import boto3

def lambda_handler(event, context):
    client = boto3.client('glue')
    client.start_workflow_run(Name = 'Workflow_New', Arguments = {})

Is there any other way to achieve what I am trying to do?

Please refer to this Stack Overflow question on how to call AWS Glue from a Lambda function, with a code snippet:

How to Trigger Glue ETL Pyspark job through S3 Events or AWS Lambda?

import boto3
print('Loading function')

def lambda_handler(event, context):
    source_bucket = event['Records'][0]['s3']['bucket']['name']
    s3 = boto3.client('s3')
    glue = boto3.client('glue')
    gluejobname = "YOUR GLUE JOB NAME"

    try:
        runId = glue.start_job_run(JobName=gluejobname)
        status = glue.get_job_run(JobName=gluejobname, RunId=runId['JobRunId'])
        print("Job Status : ", status['JobRun']['JobRunState'])
    except Exception as e:
        print(e)
        print('Error starting Glue job {} for an event from bucket {}. Make sure '
              'the job exists and is in the same region as this '
              'function.'.format(gluejobname, source_bucket))
        # re-raise inside the except block so Lambda records the failure
        raise e

Thanks

Yuva

This works to invoke a Glue workflow from Lambda (Python):

import json
import boto3
def lambda_handler(event, context):

    # add your region_name
    glue = boto3.client(service_name='glue', region_name='eu-west-2')

    # only the 'Name' parameter is required
    response = glue.start_workflow_run(Name='Your_Workflow')

    # the response is a dict; the run ID is under the 'RunId' key
    print(f"workflow_run_id: {response['RunId']}")

AWS Glue Developer Guide: https://docs.aws.amazon.com/glue/latest/dg/glue-dg.pdf
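Once the run has been started, its state can be looked up with get_workflow_run. The following is only a minimal sketch: the helper name get_workflow_status and the workflow name 'Your_Workflow' are placeholders, and it assumes the Lambda role is allowed to call glue:StartWorkflowRun and glue:GetWorkflowRun.

```python
def get_workflow_status(glue, workflow_name, run_id):
    """Return the status string of one workflow run, e.g. 'RUNNING' or 'COMPLETED'."""
    response = glue.get_workflow_run(Name=workflow_name, RunId=run_id)
    return response['Run']['Status']


def lambda_handler(event, context):
    # imported here so the helper above can be exercised without AWS access
    import boto3

    glue = boto3.client('glue')
    workflow_name = 'Your_Workflow'  # placeholder, replace with your workflow's name

    # start_workflow_run returns a dict with the new run's ID under 'RunId'
    run_id = glue.start_workflow_run(Name=workflow_name)['RunId']

    # a workflow is asynchronous, so right after the start this is usually 'RUNNING'
    status = get_workflow_status(glue, workflow_name, run_id)
    print(f'workflow run {run_id} is {status}')
    return {'RunId': run_id, 'Status': status}
```

Because the workflow keeps running after the Lambda returns, a single status check like this mostly confirms that the run was accepted; waiting for completion inside the Lambda is usually not a good fit.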

Please try using the code snippet below:

import boto3

glueClient = boto3.client('glue')

response = glueClient.start_workflow_run(Name = 'wf_name')

You can also use this documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue.html#Glue.Client.start_workflow_run
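If the method is documented but the client still reports that it has no such attribute, one possible explanation (an assumption, not something confirmed in the question) is that the boto3 version available at runtime is older than the Glue workflow API. A rough sketch for checking this, where start_workflow_or_explain is a hypothetical helper name:

```python
def start_workflow_or_explain(glue, workflow_name, sdk_version='unknown'):
    """Start the workflow, or fail with a message that names the SDK version."""
    if not hasattr(glue, 'start_workflow_run'):
        # the client class is generated from the installed botocore data,
        # so an old SDK simply has no such method
        raise RuntimeError(
            f'This Glue client has no start_workflow_run (boto3 {sdk_version}); '
            'a newer boto3 may need to be packaged with the function.'
        )
    return glue.start_workflow_run(Name=workflow_name)['RunId']


def lambda_handler(event, context):
    # imported here so the helper above can be exercised without AWS access
    import boto3

    glue = boto3.client('glue')
    # 'Workflow_New' is the workflow name used in the question
    run_id = start_workflow_or_explain(glue, 'Workflow_New', boto3.__version__)
    print(f'started workflow run {run_id}')
    return run_id
```

If the check fails, bundling a recent boto3 in the deployment package or in a Lambda layer is the usual way to get a newer client than the one shipped with the runtime.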

I don't think the Glue client has a function named 'start_workflow_run'. Please try 'start_job_run':

response = client.start_job_run(JobName = 'Workflow_New', Arguments = {} )

Try using:

import json
import boto3

def lambda_handler(event, context):
    glueClient = boto3.client('glue', region_name='us-west-2')
    response = glueClient.start_workflow_run(Name='Workflow_name')  # replace with your workflow's name

Also, I think you might want to add error handling around the response as well!
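One way the error handling could look is sketched below. The helper name start_workflow_safely is made up for illustration, and which errors to swallow versus re-raise is a judgment call; here a missing workflow and a concurrency limit are logged and reported as None, while anything else propagates so that Lambda marks the invocation as failed.

```python
def start_workflow_safely(glue, workflow_name):
    """Start a Glue workflow and return its run ID, or None for expected failures."""
    try:
        response = glue.start_workflow_run(Name=workflow_name)
    except glue.exceptions.EntityNotFoundException:
        # the workflow name does not exist in this account/region
        print(f'Workflow {workflow_name!r} was not found')
        return None
    except glue.exceptions.ConcurrentRunsExceededException:
        # the workflow is already running as many times as it is allowed to
        print(f'Workflow {workflow_name!r} has too many concurrent runs')
        return None

    run_id = response.get('RunId')
    if not run_id:
        raise RuntimeError(f'Unexpected response from Glue: {response!r}')
    return run_id


def lambda_handler(event, context):
    # imported here so the helper above can be exercised without AWS access
    import boto3

    # region and workflow name are placeholders, as in the answer above
    glue = boto3.client('glue', region_name='us-west-2')
    return {'RunId': start_workflow_safely(glue, 'Workflow_name')}
```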
