
AWS lambda task timed out issue with large data while processing data from S3 bucket

I have a 120 MB data file in my S3 bucket, and I am loading it into Lambda with Python pandas and processing it. After 15 minutes (the time set in the timeout option of the basic settings) it gives me a "task timed out" error and stops the process. The same process run locally in Sublime Text and a terminal takes only 2-3 minutes. What is the problem and how can I solve it? Thanks in advance.
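For reference, a minimal sketch of the kind of handler being described; the bucket and key names here are placeholders, not taken from the question:

    import boto3
    import pandas as pd

    s3 = boto3.client("s3")

    def lambda_handler(event, context):
        # Stream the ~120 MB object straight from S3 into pandas.
        # "my-bucket" and "data.csv" are hypothetical names.
        obj = s3.get_object(Bucket="my-bucket", Key="data.csv")
        df = pd.read_csv(obj["Body"])
        # ... processing that hits the 15-minute timeout ...
        return {"rows": len(df)}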

You should take a look at the resources used on your local machine if you believe the same job takes significantly less time there. Increasing the amount of memory available to your Lambda can significantly improve performance when the function is memory-constrained, and it also increases the CPU allocated to it.
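As a hedged sketch, the memory setting (and with it the CPU share) can be raised with boto3; the function name below is a placeholder:

    import boto3

    client = boto3.client("lambda")

    # Raise memory to ~3 GB; Lambda's CPU allocation scales with memory.
    # 15 minutes is already Lambda's maximum timeout, so memory/CPU is
    # the main lever here. "my-pandas-function" is a placeholder name.
    client.update_function_configuration(
        FunctionName="my-pandas-function",
        MemorySize=3008,
    )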

If there are large volumes of data, can they be moved into EFS? A Lambda function can have an EFS mount attached and access it as if it were local storage. By doing this you take the data transfer out of your Lambda script, which then only has to do the processing.
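A minimal sketch of reading from an attached EFS mount, assuming an access point mounted at /mnt/data and a file that has already been staged there (both names are assumptions):

    import pandas as pd

    def lambda_handler(event, context):
        # With EFS attached, the mount path behaves like local disk,
        # so pandas reads it without any S3 transfer inside the handler.
        df = pd.read_csv("/mnt/data/data.csv")
        # ... processing ...
        return {"rows": len(df)}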

Finally, if neither of the above cuts down the execution time, look at whether you can break the Lambda up into smaller Lambda functions and orchestrate them via Step Functions, as in the sketch below. By doing this you create a chained sequence of Lambda functions that together perform the original operation of the single Lambda function.
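One hedged way to structure such a chain: each worker Lambda processes one slice of the file and returns state that Step Functions passes as input to the next step. The handler below is illustrative only; the event fields, chunking scheme, and the surrounding state machine definition are assumptions:

    import boto3
    import pandas as pd

    s3 = boto3.client("s3")

    def lambda_handler(event, context):
        # Each invocation handles one row range of the CSV; Step Functions
        # feeds this state's output into the next state in the chain.
        start = event["start_row"]        # assumed input fields
        rows = event["rows_per_chunk"]
        obj = s3.get_object(Bucket=event["bucket"], Key=event["key"])
        # Skip already-processed data rows (row 0 is the header).
        df = pd.read_csv(obj["Body"], skiprows=range(1, start + 1), nrows=rows)
        # ... process this chunk ...
        return {**event, "start_row": start + rows, "done": len(df) < rows}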
