
Set up CloudWatch alarms for EC2 instances in an Auto Scaling group (CloudFormation)

Original post: 2021-04-21 10:03:05 | Tags: amazon-web-services, amazon-cloudformation, amazon-cloudwatch, autoscaling, alerts

I have an AWS::AutoScaling::AutoScalingGroup configuration that runs two EC2 instances. Is it possible to attach CloudWatch alarms to both instances? For example, I want to monitor the StatusCheckFailed_Instance metric for each EC2 instance in the group.

Usually you attach alarms through the EC2 instance ID, but how do I find the instance ID of each EC2 instance in the AutoScalingGroup in order to attach alarms? Or is there another way to attach them? I can't find anything useful and workable on the internet.

1 answer

Option 1)

Create your own script that's triggered on instance launch/terminate events.
Each event triggers a Lambda function that reads the instance ID from the event and creates (on launch) or deletes (on terminate) an alarm for that instance.
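A minimal sketch of such a Lambda handler, assuming it is invoked by an EventBridge rule on the Auto Scaling "EC2 Instance Launch Successful" and "EC2 Instance Terminate Successful" events. The alarm naming scheme, the 60-second period, and the two evaluation periods are illustrative assumptions, not something the answer specifies. The `cloudwatch` parameter is a hypothetical hook so the logic can be exercised without AWS credentials.

```python
METRIC = "StatusCheckFailed_Instance"


def alarm_name(instance_id):
    # Hypothetical naming scheme: one alarm per instance.
    return f"{METRIC}-{instance_id}"


def handler(event, context=None, cloudwatch=None):
    """Create or delete a per-instance alarm from an Auto Scaling event."""
    if cloudwatch is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3
        cloudwatch = boto3.client("cloudwatch")

    detail_type = event.get("detail-type", "")
    instance_id = event.get("detail", {}).get("EC2InstanceId")
    if not instance_id:
        return "ignored"

    if detail_type == "EC2 Instance Launch Successful":
        cloudwatch.put_metric_alarm(
            AlarmName=alarm_name(instance_id),
            Namespace="AWS/EC2",
            MetricName=METRIC,
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            Statistic="Maximum",
            Period=60,               # assumed value
            EvaluationPeriods=2,     # assumed value
            Threshold=1,             # the metric is 0 (pass) or 1 (fail)
            ComparisonOperator="GreaterThanOrEqualToThreshold",
        )
        return "created"

    if detail_type == "EC2 Instance Terminate Successful":
        cloudwatch.delete_alarms(AlarmNames=[alarm_name(instance_id)])
        return "deleted"

    return "ignored"
```

To get notified, you would also pass `AlarmActions` (for example an SNS topic ARN) to `put_metric_alarm`; it is left out here because the question does not say where alerts should go.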

Option 2)

If you're not trying to use the auto-recover option (which you shouldn't need in an ASG, since the ASG will just replace the instances), then you can make 1 aggregate alarm for the ASG.
Create the alarm on the StatusCheckFailed_Instance metric with the AutoScalingGroupName=<> dimension.
Set it to trigger if the Maximum statistic is >= 1. The metric is 0 when the check passes and 1 when it fails, and each instance pushes its own datapoints to the ASG versions of the EC2 metrics, so a maximum of 1 means at least 1 instance is failing.
Since you only have 2 instances, you can just manually check both if it ever triggers. For larger ASGs, using the SEARCH() math expression in the CloudWatch metrics console (or on a dashboard) would be a good way to look through all the ASG instances and see which one is failing.
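As a rough illustration of Option 2, the sketch below builds the keyword arguments for boto3's `put_metric_alarm` for one aggregate alarm, plus a `get_metric_data` query that uses SEARCH(). The function names, the 60-second period, and the two evaluation periods are assumptions made for the example. Note that the SEARCH() expression as written matches every per-instance series in the account and region, not only the instances in one ASG.

```python
def aggregate_alarm_params(asg_name, period=60, evaluation_periods=2):
    """Keyword arguments for put_metric_alarm: one alarm for the whole ASG."""
    return {
        "AlarmName": f"StatusCheckFailed_Instance-{asg_name}",  # assumed name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_Instance",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Maximum",
        "Period": period,
        "EvaluationPeriods": evaluation_periods,
        # The metric is 0 or 1, so Maximum >= 1 means some instance is failing.
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


def per_instance_search_query(period=60):
    """A get_metric_data query listing the metric for each InstanceId."""
    expression = (
        "SEARCH('{AWS/EC2,InstanceId} "
        'MetricName="StatusCheckFailed_Instance"\', '
        f"'Maximum', {period})"
    )
    return {"Id": "failing", "Expression": expression, "ReturnData": True}


# Usage (requires boto3 and AWS credentials; "my-asg" is a placeholder):
#   import boto3
#   cw = boto3.client("cloudwatch")
#   cw.put_metric_alarm(**aggregate_alarm_params("my-asg"))
```

The same SEARCH() string can be pasted into the math expression field of the CloudWatch metrics console or a dashboard widget.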