
AWS Lambda and Java concurrency

It is known that AWS Lambda may reuse previously created handler objects, and it really does so (see the FAQ):

Q: Will AWS Lambda reuse function instances?

To improve performance, AWS Lambda may choose to retain an instance of your function and reuse it to serve a subsequent request, rather than creating a new copy. Your code should not assume that this will always happen.


The question is about Java concurrency. If I have a handler class, say:

public class MyHandler {
    private Foo foo;
    public void handler(Map<String,String> request, Context context) {
       ...
    }
}

so, will it be thread-safe to access and work with the instance variable foo here or not?

In other words: may AWS Lambda use the same object concurrently for different calls?

EDIT: My function is triggered by an event source; specifically, it is invoked by an API Gateway method.

EDIT-2: This kind of question arises when you want to implement some kind of connection pool to external resources, so I want to keep the connection to the external resource as an instance variable. It actually works as desired, but I'm afraid of concurrency problems.

EDIT-3: More specifically, I'm wondering: can instances of AWS Lambda handlers share a common heap (memory) or not? I have to add this detail in order to head off answers that merely list obvious, well-known things about thread-safe Java objects.

May AWS Lambda use the same object concurrently for different calls?

Can instances of AWS Lambda handlers share a common heap (memory) or not?

A strong, definite NO. Instances of AWS Lambda handlers cannot even share files (in /tmp).

An AWS Lambda container may not be reused for two or more concurrently existing invocations of a Lambda function, since that would break the isolation requirement:

Q: How does AWS Lambda isolate my code?

Each AWS Lambda function runs in its own isolated environment, with its own resources and file system view.

The section "How Does AWS Lambda Run My Code? The Container Model" in the official description of how Lambda functions work states:

After a Lambda function is executed, AWS Lambda maintains the container for some time in anticipation of another Lambda function invocation. In effect, the service freezes the container after a Lambda function completes, and thaws the container for reuse, if AWS Lambda chooses to reuse the container when the Lambda function is invoked again. This container reuse approach has the following implications:

  • Any declarations in your Lambda function code remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. You can add logic in your code to check if a connection already exists before creating one.

  • Each container provides some disk space in the /tmp directory. The directory content remains when the container is frozen, providing transient cache that can be used for multiple invocations. You can add extra code to check if the cache has the data that you stored.

  • Background processes or callbacks initiated by your Lambda function that did not complete when the function ended resume if AWS Lambda chooses to reuse the container. You should make sure any background processes or callbacks (in case of Node.js) in your code are complete before the code exits.
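The first two implications can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: FakeConnection, ReusingHandler and cachedOrCompute are hypothetical names that do not come from the AWS documentation, the Context parameter is left out so the snippet compiles without the AWS Lambda Java libraries, and the cache file location is passed in by the caller rather than hard-coded to /tmp.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a real client (JDBC connection, HTTP client, ...).
final class FakeConnection {
    static final AtomicInteger OPENED = new AtomicInteger(); // counts constructions
    private boolean open = true;

    FakeConnection() { OPENED.incrementAndGet(); }
    boolean isOpen() { return open; }
    void close() { open = false; }
    String query(String key) { return "value-for-" + key; }
}

// Handler sketch (hypothetical). The Context parameter from the question is
// omitted so the example compiles without the AWS Lambda Java libraries.
class ReusingHandler {
    // Survives between invocations only if Lambda reuses this container.
    private FakeConnection connection;

    public String handler(Map<String, String> request) {
        // Check whether a usable connection already exists before creating one.
        if (connection == null || !connection.isOpen()) {
            connection = new FakeConnection();
        }
        return connection.query(request.get("key"));
    }

    // Transient file cache: reuse the file if an earlier invocation in the
    // same container left it behind, otherwise compute and store the value.
    static String cachedOrCompute(Path cacheFile, Supplier<String> compute) {
        try {
            if (Files.exists(cacheFile)) {
                return Files.readString(cacheFile);
            }
            String value = compute.get();
            Files.writeString(cacheFile, value);
            return value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In both cases the code treats reuse as an optimization only: if the container is fresh, the field is null and the file is absent, and the handler simply does the work again.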

As you can see, there is absolutely no warning about race conditions between multiple concurrent invocations of a Lambda function when trying to take advantage of container reuse. The only note is "don't rely on it!".

Taking advantage of execution context reuse is definitely an established practice when working with AWS Lambda (see AWS Lambda Best Practices). But this does not apply to concurrent executions, because a new container, and thus a new context, is created for each concurrent execution. In short, for concurrent executions, if one handler changes a value, the other won't see the new value.

As I see it, there are no concurrency issues related to Lambda. Only a single invocation "owns" the container at a time. A second invocation will get another container (or possibly have to wait until the first one becomes free).

BUT I didn't find any guarantee that Java memory visibility issues cannot happen. In that case, changes made by the first invocation could stay invisible to the second one, or the changes of the first invocation could be written to RAM after the changes made by the second invocation.

In most cases, visibility issues are handled in the same way as concurrency issues. Therefore I would suggest making the Lambda function thread-safe (or synchronized), at least as long as AWS doesn't guarantee that it does something on its side to flush CPU state to memory after every invocation.
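A minimal sketch of that defensive approach, assuming a hypothetical SynchronizedHandler class (the name, the StringBuilder state and the omission of the Context parameter are illustrative choices, not anything AWS prescribes). It relies on a standard Java Memory Model rule: releasing a monitor happens-before any later acquisition of the same monitor, so state written under the lock by one invocation is visible to the next one even if it runs on a different thread.

```java
import java.util.Map;

// Sketch of a handler whose mutable state is guarded by a single lock.
class SynchronizedHandler {
    private final Object lock = new Object();
    private final StringBuilder foo = new StringBuilder(); // hypothetical shared state
    private int invocations;

    // Context parameter omitted so the example compiles without AWS libraries.
    public int handler(Map<String, String> request) {
        synchronized (lock) {
            // All reads and writes of foo and invocations happen under the lock.
            foo.append(request.getOrDefault("key", ""));
            invocations++;
            return invocations;
        }
    }

    public int invocations() {
        synchronized (lock) { return invocations; }
    }

    public int fooLength() {
        synchronized (lock) { return foo.length(); }
    }
}
```

If invocations within one container really are sequential, the lock is never contended, so its cost should be small compared with a typical handler's work.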

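Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you republish, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.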

 