
Invoking AWS Lambda Functions from Java

Original post: 2021-05-03 13:47:31 | Tags: java, amazon-web-services, aws-lambda

I have a long-running AWS Lambda function that I invoke from my web app. Following the documentation [1], it works fine. However, this particular Lambda function does not return anything to the application: its output is saved to S3, and it runs for a long time (20-30 s). Is there a way to trigger the Lambda without waiting for the return value? I don't want to wait on or block my app while the Lambda is running. Right now I am using an ExecutorService as a queue to execute Lambda requests, since I have to wait for each invocation. When the app crashes or restarts, I lose the jobs that are waiting to be executed.

[1] https://aws.amazon.com/blogs/developer/invoking-aws-lambda-functions-from-java/

1 Answer

Tracking status is not necessarily a difficult issue. Use a simple S3 "file exists" call after each job execution to know whether the Lambda is done.
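The "file exists" check above could be wrapped in a small polling helper. The following is only a minimal sketch: `OutputPoller`, `waitForOutput`, and the `exists` predicate are hypothetical names, and the predicate is a stand-in for whatever S3 existence check your application uses (for example, one built on the AWS SDK's `headObject` call).

```java
import java.time.Duration;
import java.util.function.Predicate;

// Hypothetical helper: polls an "object exists" check until the Lambda's output appears.
final class OutputPoller {
    private OutputPoller() {}

    /**
     * Returns true as soon as exists.test(key) reports true, or false after
     * maxAttempts unsuccessful checks (or if the calling thread is interrupted).
     * `exists` is an assumed stand-in for a real S3 existence check.
     */
    static boolean waitForOutput(Predicate<String> exists, String key,
                                 int maxAttempts, Duration delay) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (exists.test(key)) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay.toMillis()); // back off before the next check
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false;
    }
}
```

A caller might run `OutputPoller.waitForOutput(myS3Check, "jobs/42/result.json", 30, Duration.ofSeconds(2))` on a background thread, so the request thread is never blocked; the key name here is made up for illustration.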

However, as you've pointed out, you might lose job information at some point. To remove this issue, you need a persistence layer outside your JVM. A key-value store would work: store (timestamp, jobId, status) fields in a database, check them periodically from your web server, and update them only from the Lambda.
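As a rough illustration of the (timestamp, jobId, status) record described above, here is a sketch of a job store interface. All names (`JobStatus`, `JobRecord`, `JobStore`, `InMemoryJobStore`) are hypothetical, and the in-memory map only stands in for the external database; in practice the store has to live outside the JVM, otherwise jobs are still lost on restart.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical status values for a job.
enum JobStatus { PENDING, RUNNING, SUCCEEDED, FAILED }

// One row of the (timestamp, jobId, status) table.
final class JobRecord {
    final String jobId;
    final Instant timestamp;
    final JobStatus status;

    JobRecord(String jobId, Instant timestamp, JobStatus status) {
        this.jobId = jobId;
        this.timestamp = timestamp;
        this.status = status;
    }
}

// Assumed abstraction over the external key-value store or database.
interface JobStore {
    void put(JobRecord record);                     // insert or overwrite by jobId
    Optional<JobRecord> get(String jobId);          // look up a single job
    List<JobRecord> findByStatus(JobStatus status); // e.g. jobs still PENDING
}

// In-memory stand-in for demonstration only; it does NOT survive a restart.
final class InMemoryJobStore implements JobStore {
    private final Map<String, JobRecord> records = new ConcurrentHashMap<>();

    @Override
    public void put(JobRecord record) {
        records.put(record.jobId, record);
    }

    @Override
    public Optional<JobRecord> get(String jobId) {
        return Optional.ofNullable(records.get(jobId));
    }

    @Override
    public List<JobRecord> findByStatus(JobStatus status) {
        return records.values().stream()
                .filter(r -> r.status == status)
                .collect(Collectors.toList());
    }
}
```

In this sketch the web server would write a PENDING record when it submits a job and later call `findByStatus` periodically, while the Lambda would be the only writer of the final SUCCEEDED or FAILED status.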

Alternatively, to reduce end-to-end latency further, a queuing mechanism would be better (unless you also want the full history of jobs, but that can be built alongside the queue). As mentioned in the comments, AWS offers many built-in solutions that can be used directly with Lambda; otherwise you need additional infrastructure such as RabbitMQ or Redis to build a task event bus.

With that in place, Lambda becomes optional. You would periodically pull events off the queue into workers, which can either be very simple pass-throughs that invoke the Lambda or do the work themselves directly. Combine this with ECS/EKS/EC2 autoscaling and it might actually run faster than Lambda, since you can scale in or out based on queue size. The workers then write an output event to a success/error notification "channel" after the S3 file is written.
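The worker side of this design might look roughly like the sketch below. It is not tied to any particular broker: the two `BlockingQueue`s are in-process stand-ins for the real job queue and notification channel, and `JobEvent`, `JobResult`, `Worker`, and the `task` function are hypothetical names. The `task` function represents either "invoke the Lambda" or "do the work directly" and is assumed to return the S3 key it wrote.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical message pulled from the job queue.
final class JobEvent {
    final String jobId;
    final String payload;

    JobEvent(String jobId, String payload) {
        this.jobId = jobId;
        this.payload = payload;
    }
}

// Hypothetical message written to the success/error notification channel.
final class JobResult {
    final String jobId;
    final boolean success;
    final String detail; // S3 key on success, error message on failure

    JobResult(String jobId, boolean success, String detail) {
        this.jobId = jobId;
        this.success = success;
        this.detail = detail;
    }
}

final class Worker {
    private final BlockingQueue<JobEvent> jobs;
    private final BlockingQueue<JobResult> notifications; // assumed unbounded here
    private final Function<JobEvent, String> task;        // returns the S3 key written

    Worker(BlockingQueue<JobEvent> jobs, BlockingQueue<JobResult> notifications,
           Function<JobEvent, String> task) {
        this.jobs = jobs;
        this.notifications = notifications;
        this.task = task;
    }

    /** Handles at most one job; returns false if none arrived within the timeout. */
    boolean pollOnce(long timeoutMillis) {
        JobEvent event;
        try {
            event = jobs.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (event == null) {
            return false;
        }
        JobResult result;
        try {
            String outputKey = task.apply(event); // runs the job, writes to S3
            result = new JobResult(event.jobId, true, outputKey);
        } catch (RuntimeException e) {
            result = new JobResult(event.jobId, false, String.valueOf(e.getMessage()));
        }
        notifications.add(result); // publish only after the work has finished
        return true;
    }
}
```

Each worker instance would call `pollOnce` in a loop; scaling in or out then amounts to running more or fewer of these loops depending on queue size.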

Back in the web server, you'll have to modify the code to listen asynchronously for messages from that channel; when you get a success message, you'll know that you should be able to access the S3 resources.
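One possible shape for the asynchronous listener on the web server side is sketched here. Again the `BlockingQueue` is only a stand-in for the real notification channel, and `Notification`, `NotificationListener`, and the two callbacks are hypothetical names introduced for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical message received from the notification channel.
final class Notification {
    final String jobId;
    final boolean success;
    final String s3Key; // location of the output; may be null on failure

    Notification(String jobId, boolean success, String s3Key) {
        this.jobId = jobId;
        this.success = success;
        this.s3Key = s3Key;
    }
}

// Listens on a background thread so request-handling threads are never blocked.
final class NotificationListener implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "notification-listener");
        t.setDaemon(true); // do not keep the JVM alive just for this listener
        return t;
    });

    NotificationListener(BlockingQueue<Notification> channel,
                         Consumer<Notification> onSuccess,
                         Consumer<Notification> onError) {
        executor.execute(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Notification n = channel.take(); // blocks only this thread
                    try {
                        if (n.success) {
                            onSuccess.accept(n); // the S3 output should now be readable
                        } else {
                            onError.accept(n);
                        }
                    } catch (RuntimeException e) {
                        // A failing callback should not stop the listener loop.
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // close() was called
            }
        });
    }

    @Override
    public void close() {
        executor.shutdownNow(); // interrupts the blocking take()
    }
}
```

The `onSuccess` callback is where the web app would fetch the result from S3 or mark the job as complete for the user.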