How to keep Lambda from triggering multiple times?

Tech stack: Salesforce data -> AWS AppFlow -> S3 -> Databricks job

Hello. I have an AppFlow flow that grabs Salesforce data and uploads it to S3 as a folder containing multiple Parquet files. I have a Lambda listening on the prefix where this folder is dropped. The Lambda then triggers a Databricks job, which is an ingestion process I have created.

My main issue is that when these files are uploaded to S3, my Lambda is triggered once per uploaded file. How can I have the Lambda run just once?

Amazon AppFlow publishes a flow notification when a flow run is complete. From "Flow notification" in the Amazon AppFlow documentation:

Amazon AppFlow is integrated with Amazon CloudWatch Events to publish events related to the status of a flow. The following flow events are published to your default event bus.

AppFlow End Flow Run Report: This event is published when a flow run is complete.

You could trigger the Lambda function when this event is published instead of on each S3 upload. That way, it is triggered only once, when the flow run is complete.
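As a rough sketch of that approach, the Lambda below is meant to sit behind an EventBridge rule on the default event bus and start the Databricks job once per flow run. Several things here are assumptions rather than anything from the question: the `detail` field names `flow-name` and `status` and the status text `Execution Successful` (check them against a real event from your account), the environment variable names, and the use of the Databricks Jobs API `run-now` endpoint with a personal access token.

```python
import json
import os
import urllib.request

# EventBridge rule pattern for AppFlow's end-of-run event on the default bus.
APPFLOW_RUN_COMPLETE_PATTERN = {
    "source": ["aws.appflow"],
    "detail-type": ["AppFlow End Flow Run Report"],
}


def should_start_ingestion(event, flow_name):
    """Return True only for a successful end-of-run report of the given flow."""
    if event.get("source") != "aws.appflow":
        return False
    if event.get("detail-type") != "AppFlow End Flow Run Report":
        return False
    detail = event.get("detail", {})
    # Assumed field names and status text; confirm against a real event.
    return (detail.get("flow-name") == flow_name
            and detail.get("status") == "Execution Successful")


def build_run_now_request(host, token, job_id):
    """Build the Databricks Jobs API 'run now' request without sending it."""
    return urllib.request.Request(
        url=f"{host}/api/2.1/jobs/run-now",
        data=json.dumps({"job_id": job_id}).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def lambda_handler(event, context):
    # Hypothetical environment variable names, set on the Lambda function.
    if not should_start_ingestion(event, os.environ["FLOW_NAME"]):
        return {"started": False}
    request = build_run_now_request(
        os.environ["DATABRICKS_HOST"],
        os.environ["DATABRICKS_TOKEN"],
        int(os.environ["DATABRICKS_JOB_ID"]),
    )
    with urllib.request.urlopen(request) as response:
        return {"started": True, "run": json.loads(response.read())}
```

With this in place you would remove the S3 trigger from the Lambda, so the only thing invoking it is the EventBridge rule.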

I hope I've understood your issue correctly, but it sounds like your Lambda is working as designed. If it is set up to run every time a file is dropped into the S3 bucket, the S3 trigger will invoke the Lambda on every upload.
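To see this for yourself, a small helper like the one below can log what each invocation receives. It is only an illustration: `uploaded_keys` is a made-up name, and the event shape is the standard S3 notification structure (`Records` -> `s3` -> `bucket`/`object`). With several Parquet files in one folder you would typically see a separate invocation per file, each with its own key.

```python
from urllib.parse import unquote_plus


def uploaded_keys(event):
    """Return (bucket, key) pairs carried by one S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def lambda_handler(event, context):
    # Printing the keys makes the one-invocation-per-file pattern visible in the logs.
    for bucket, key in uploaded_keys(event):
        print(f"invoked for s3://{bucket}/{key}")
```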

If you want to reduce how often your Lambda runs, you could replace the S3 trigger with an EventBridge rule on a cron schedule that invokes the Lambda at defined intervals to check the bucket for new files. You could then send all the files to your Databricks job in bulk rather than individually.
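A minimal sketch of the scheduled variant, assuming an hourly schedule and treating "new" as "modified within the last hour". The bucket name, prefix, and look-back window are placeholders, and `keys_modified_since` is a hypothetical helper. A fixed time window can miss or repeat files if a run is delayed, so a stored checkpoint of the last processed time would be more robust.

```python
from datetime import datetime, timedelta, timezone


def keys_modified_since(pages, since):
    """Collect object keys whose LastModified is later than `since`."""
    keys = []
    for page in pages:
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            if obj["LastModified"] > since:
                keys.append(obj["Key"])
    return keys


def lambda_handler(event, context):
    import boto3  # bundled with the AWS Lambda Python runtime

    bucket, prefix = "my-bucket", "salesforce/"  # placeholder values
    since = datetime.now(timezone.utc) - timedelta(hours=1)  # assumes an hourly schedule
    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = keys_modified_since(paginator.paginate(Bucket=bucket, Prefix=prefix), since)
    # Pass `keys` to the Databricks job in a single run here.
    return {"new_files": keys}
```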
