
How to trigger AWS Lambda function when multiple files in S3 are ready

I am trying to build a service with AWS Lambda/S3 that takes a user's email as input and replies with an email carrying a PDF attachment. The final PDF I send to the user is produced by merging two types of PDFs that I generate earlier in the process from the input email. The full architecture is shown in the diagram below.

Diagram of Architecture

The issue I am encountering concerns the Merge PDFs Lambda function, which takes in the type 1 and type 2 PDFs and produces a type 3 PDF. I need it to trigger once a complete set of type 1 and type 2 PDFs is ready and waiting in S3. For example, a user sends an email and the Parse Email function kicks off the production of one type 2 PDF and fifty type 1 PDFs; as soon as these 51 PDFs are generated, I want the Merge PDFs function to run. How do I get an AWS Lambda function to trigger once a set of multiple files in S3 is ready?

There is no trigger that I am aware of that waits for several objects to be put into one or more S3 buckets before raising an event.

I originally thought about using an S3 trigger that fires when a file with the suffix '50.pdf' is created, but that leaves a lot of issues around what finishes first and what happens if something50.pdf fails to generate. But if you do want to go down that route, there is some good documentation from AWS here.
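As a rough illustration of the suffix approach, the sketch below filters the keys in an S3 event notification inside the handler. S3 notifications can also apply a suffix filter before the function is ever invoked; this version only shows the equivalent check in code. The names `TRIGGER_SUFFIX`, `keys_matching_suffix` and `handler` are made up for this example, and the print call is a placeholder for whatever would actually start the merge.

```python
from urllib.parse import unquote_plus

# Assumption: the last type 1 PDF of a set is named "...50.pdf".
TRIGGER_SUFFIX = "50.pdf"


def keys_matching_suffix(event, suffix=TRIGGER_SUFFIX):
    """Return the object keys in an S3 event notification that end with `suffix`."""
    keys = []
    for record in event.get("Records", []):
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.endswith(suffix):
            keys.append(key)
    return keys


def handler(event, context):
    """Hypothetical Lambda entry point: react only to the '50.pdf' objects."""
    for key in keys_matching_suffix(event):
        # Placeholder: a real function would start the merge here.
        print(f"last file of a set arrived: {key}")
```

This does nothing about the ordering and failure problems mentioned above: the '50.pdf' object can land before the other 49, or never land at all.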

An alternative would be to have the Lambdas that generate the type 1 and type 2 PDFs invoke the Merge PDF Lambda once they have finished their processing.

You would need some sort of external state held somewhere (such as a database) that records an ID (which could be included in the naming of the type 1 and type 2 PDFs), whether type 1 PDF generation is complete, and whether type 2 PDF generation is complete.

So the Parse Email Lambda would need to seed the database with a reference before doing its work. Then the URL to PDF Lambda would record in the database that it had finished and check whether the HTML to PDF Lambda had finished. If so, it would invoke the Merge PDF Lambda (probably via SNS); if not, it would simply finish. The HTML to PDF Lambda would do the same thing, except that it would check whether the URL to PDF Lambda had finished before either starting the merge or finishing.
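The record-then-check flow described above could be sketched roughly as follows. This is only an illustration under assumptions: `JobStateStore` is an in-memory stand-in for the external database, and the stage names and `invoke_merge` callback are hypothetical. The one detail worth noting is that recording completion and checking the other stage happen as a single atomic step, so that if both Lambdas finish at nearly the same time only one of them starts the merge. With a real database that would need an atomic or conditional update rather than a separate write and read.

```python
import threading


class JobStateStore:
    """In-memory stand-in for the external database (hypothetical)."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def seed(self, job_id, stages):
        """Called by Parse Email before any PDF work starts."""
        with self._lock:
            self._jobs[job_id] = {
                "done": {stage: False for stage in stages},
                "merge_started": False,
            }

    def mark_done(self, job_id, stage):
        """Record a finished stage.

        Returns True only for the single caller that completes the set.
        """
        with self._lock:
            job = self._jobs[job_id]
            job["done"][stage] = True
            if all(job["done"].values()) and not job["merge_started"]:
                job["merge_started"] = True
                return True
            return False


def finish_stage(store, job_id, stage, invoke_merge):
    """Last step of each PDF-generating Lambda: record, then maybe merge."""
    if store.mark_done(job_id, stage):
        # Placeholder for publishing to SNS or invoking Merge PDF.
        invoke_merge(job_id)
```

A usage example, with a list standing in for the merge invocation:

```python
store = JobStateStore()
store.seed("job-1", ["url_to_pdf", "html_to_pdf"])
started = []
finish_stage(store, "job-1", "url_to_pdf", started.append)   # nothing yet
finish_stage(store, "job-1", "html_to_pdf", started.append)  # merge starts
```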

On a slightly separate note, I'd probably trigger the Clean Buckets Lambda at the end of the Merge PDF Lambda. That way you could have a Check For Unprocessed Work Lambda that triggers every hour and sends some form of notification if it finds anything in the buckets older than x.
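The age check in such a Check For Unprocessed Work Lambda might look something like the sketch below. It assumes the bucket listing has already been fetched and that each entry is a dict with a `Key` and a timezone-aware `LastModified` datetime, which is the shape of the entries returned when listing S3 objects with boto3. The function name `find_stale_keys` and the `max_age` parameter (the "x" above) are made up for this example, and the notification itself is left out.

```python
from datetime import datetime, timedelta, timezone


def find_stale_keys(objects, max_age, now=None):
    """Return the keys of objects last modified more than `max_age` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    return [obj["Key"] for obj in objects if obj["LastModified"] < cutoff]
```

For instance, `find_stale_keys(listing, timedelta(hours=2))` would return every key that has been sitting in the bucket for more than two hours.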

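Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish this content, please credit this site's URL or the original source address. For any questions, contact yoyou2525@163.com.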
