How can I get more specific CloudWatch alerts when an AWS Lambda function fails?

I have a variety of functions, all in Node.js, in AWS Lambda. They're triggered by events such as S3 triggers or API Gateway methods, or are sometimes just called manually. I create them by pasting code into the console or uploading a zip file I've built locally.

On rare occasions, a function will fail. To detect failures, I've set up a CloudWatch alarm that looks like this:

[Image: CloudWatch alarm]

This works, to an extent: when a function anywhere in my account fails, I get an email. The problem is that the email just states that the alarm got tripped. It doesn't state which Lambda function actually failed, so I have to dig through Lambda to find which function caused the alarm.

I've considered the following:

  1. Setting up a CloudWatch alarm per function. This is the most obvious solution, but it is also the most tedious and the highest maintenance.
  2. Building a CI/CD pipeline for my Lambda functions instead of entering the code or uploading zips in the console. I can then add a step that sets up a CloudWatch alarm for the function automatically. This is better than the first option, but it is also a lot of infrastructure to set up for what is potentially a simple problem.
  3. Using another Lambda function to custom handle the alert. The problem is that, as best I can tell, the SNS message that CloudWatch publishes doesn't contain any more data than the email; it just says, in essence, "your alarm named X tripped" but not why.
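Options 1 and 2 both come down to generating one alarm definition per function. A minimal sketch of that step follows, assuming the alarms watch the `Errors` metric in the `AWS/Lambda` namespace with a `FunctionName` dimension. The helper names, the alarm naming scheme, the threshold and period values, the function names, and the SNS topic ARN are all hypothetical choices for illustration, not anything taken from the question.

```javascript
// Sketch only: builds request parameters shaped for CloudWatch PutMetricAlarm,
// one per Lambda function. Nothing here calls AWS.

// Hypothetical helper: alarm definition for a single function.
function buildErrorAlarmParams(functionName, topicArn) {
  return {
    // Assumed naming scheme, so the function name shows up in the alert email.
    AlarmName: `lambda-errors-${functionName}`,
    AlarmDescription: `Errors reported by Lambda function ${functionName}`,
    Namespace: 'AWS/Lambda',
    MetricName: 'Errors',
    // Scoping the metric to one function is what makes the alarm specific.
    Dimensions: [{ Name: 'FunctionName', Value: functionName }],
    Statistic: 'Sum',
    Period: 60, // seconds; assumed value
    EvaluationPeriods: 1, // assumed value
    Threshold: 1, // alarm on the first error in a period; assumed value
    ComparisonOperator: 'GreaterThanOrEqualToThreshold',
    // A function that was not invoked reports no data; treat that as healthy.
    TreatMissingData: 'notBreaching',
    AlarmActions: [topicArn],
  };
}

// Hypothetical helper: alarm definitions for a list of function names.
function buildAlarmsForFunctions(functionNames, topicArn) {
  return functionNames.map((name) => buildErrorAlarmParams(name, topicArn));
}

// Example with made-up function names and a made-up topic ARN.
const exampleAlarms = buildAlarmsForFunctions(
  ['resize-image', 'orders-api'],
  'arn:aws:sns:us-east-1:123456789012:lambda-alerts'
);
console.log(exampleAlarms.map((alarm) => alarm.AlarmName));
```

Each object could then be passed to the CloudWatch `PutMetricAlarm` API, for example from a small script or a pipeline step, and the list of names could come from the Lambda `ListFunctions` API rather than being typed by hand. That would reduce the maintenance cost of option 1 without a full CI/CD setup.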

Any ideas on how to achieve this?

We handle it internally. When there is a problem, the Lambda function attempts to handle it and sends an alert itself. The CloudWatch metric is only for truly unhandled exceptions. Remember that Lambda automatically retries a failed function for some invocation types (asynchronous invocations such as S3 triggers, for example), which can be undesirable in certain situations. So it may be preferable to handle any exceptions inside the Lambda function.
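One way to read this answer is as a wrapper around each handler that catches the error, sends an alert naming the function, and then decides whether the invocation should still count as failed. The sketch below is an illustration under assumptions, not the answerer's actual code: `withAlerting`, the `notify` callback, and the `rethrow` option are hypothetical names, and `notify` stands in for whatever alert channel is used (publishing to an SNS topic, for instance). `context.functionName` and `context.awsRequestId` are properties of the Node.js Lambda context object.

```javascript
// Sketch only: wrap a Lambda handler so failures produce a specific alert.

// Hypothetical wrapper. `notify` is an async function supplied by the caller.
function withAlerting(handler, notify, options = {}) {
  // Rethrow by default so the invocation still shows up as an error.
  const rethrow = options.rethrow !== false;

  return async (event, context) => {
    try {
      return await handler(event, context);
    } catch (err) {
      try {
        await notify({
          functionName: context && context.functionName,
          requestId: context && context.awsRequestId,
          message: err.message,
          stack: err.stack,
        });
      } catch (notifyErr) {
        // A failing alert channel should not hide the original error.
        console.error('Alert could not be sent:', notifyErr);
      }
      if (rethrow) {
        throw err;
      }
      // With rethrow disabled the error is swallowed: no Errors metric and
      // no automatic retry, and the handler resolves with undefined.
      return undefined;
    }
  };
}

// Usage inside a Lambda module (names are placeholders):
//   exports.handler = withAlerting(myHandler, sendAlert, { rethrow: false });
```

Leaving `rethrow` at its default keeps the account-wide CloudWatch alarm as a backstop, while setting it to `false` matches the answer's point about avoiding unwanted retries. A function behind API Gateway would likely want to return an explicit error response rather than `undefined` in that case.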

Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address.
