
Reference Instance ID in CW Alarm Dimension - Terraform

I am adding alarms/monitoring to a logging pipeline. Specifically, I am creating CW Alarms that are triggered on 50+% disk/memory utilization for EC2 instances within an Auto Scaling Group. The ASG is created in the "workers" module directory and outputs the scaling group name for reference in alarm creation which occurs in the "cloudwatch" module directory.

I am struggling to understand a few things about creating this alarm:

  • do all dimensions of a metric have to be referenced in alarm creation?
  • and, if so, how do I reference InstanceID when only target group/scaling groups are defined in the TF files?

in "alarms" parent module:

resource "aws_cloudwatch_metric_alarm" "pipeline_DiskUtilization" {
  alarm_name          = "pipeline-disk-alarm"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 5
  metric_name         = "disk_used_percent"
  namespace           = "CWAgent"
  period              = 60
  statistic           = "Average"
  threshold           = 50

  dimensions = {
    AutoScalingGroupName = var.scaling_name
  }

  alarm_description = "This metric monitors ec2 disk utilization"
  alarm_actions     = [var.scaling_group]
}

in "workers" parent module:

resource "aws_autoscaling_group" "pipeline-scaling-group" {
  name                = "pipeline-worker-asg"
  vpc_zone_identifier = var.operating_subnets
  desired_capacity    = 2
  max_size            = 4
  min_size            = 2

  target_group_arns = [var.target_group]

  launch_template {
    id      = aws_launch_template.pipeline-worker-launch-template.id
    version = "$Latest"
  }
}

do all dimensions of a metric have to be referenced in alarm creation?

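Yes. CloudWatch treats every distinct combination of dimension names and values as a separate metric, so an alarm only receives data if its dimensions exactly match the set the metric was published with.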
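To illustrate what matching every dimension looks like, here is a minimal sketch of an alarm for one specific instance. It assumes the CloudWatch agent publishes disk_used_percent with the dimensions InstanceId, path, device and fstype; the real set depends on how the agent is configured. The variable instance_id and the literal dimension values are hypothetical placeholders, not taken from the question.

```hcl
# Hypothetical input: the ID of one specific EC2 instance.
variable "instance_id" {
  type = string
}

resource "aws_cloudwatch_metric_alarm" "single_instance_disk" {
  alarm_name          = "pipeline-disk-alarm-${var.instance_id}"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 5
  metric_name         = "disk_used_percent"
  namespace           = "CWAgent"
  period              = 60
  statistic           = "Average"
  threshold           = 50

  # Every dimension the metric was published with must be listed here.
  # Leaving one out makes the alarm watch a metric that has no data.
  dimensions = {
    InstanceId = var.instance_id
    path       = "/"     # assumed mount point
    device     = "xvda1" # assumed device name
    fstype     = "xfs"   # assumed filesystem type
  }
}
```

An alarm whose dimensions do not match any published metric is still created without error; it simply stays in the INSUFFICIENT_DATA state.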

and, if so, how do I reference InstanceID when only target group/scaling groups are defined in the TF files?

You can't do this (easily) from TF. Once an ASG manages your instances, they are out of your control, so you can't get their IDs directly. You also shouldn't do this, even if you could. Instances in an ASG should be treated as a group (hence the "Group" in Auto Scaling Group), not as individual entities.

Even if you could do this (easily), how would you manage these alarms? The ASG can replace your instances at any time, so after a while you would be left with many dead alarms for terminated instances, and new instances with no alarms at all.

The proper way to manage this would be through CloudWatch Events rules (now Amazon EventBridge), outside of your TF. You would have to detect when your ASG launches and terminates instances. Each such event would trigger a Lambda function that adds or removes alarms dynamically in response.
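The alarms themselves would be created at runtime by the function, but the rule that triggers it could still be declared in Terraform. The following is a sketch of one possible arrangement, not something the answer prescribes. The variable alarm_manager_lambda_arn and the resource names are hypothetical, and the Lambda function is assumed to exist already; var.scaling_name is the ASG name output referenced in the question.

```hcl
# Hypothetical input: ARN of a Lambda function that creates or deletes
# per-instance alarms. The function itself is not shown here.
variable "alarm_manager_lambda_arn" {
  type = string
}

# Match successful launch and terminate events emitted by the worker ASG.
resource "aws_cloudwatch_event_rule" "asg_instance_changes" {
  name        = "pipeline-asg-instance-changes"
  description = "Fires when the worker ASG launches or terminates an instance"

  event_pattern = jsonencode({
    source = ["aws.autoscaling"]
    "detail-type" = [
      "EC2 Instance Launch Successful",
      "EC2 Instance Terminate Successful",
    ]
    detail = {
      AutoScalingGroupName = [var.scaling_name]
    }
  })
}

# Send matching events to the Lambda function.
resource "aws_cloudwatch_event_target" "alarm_manager" {
  rule = aws_cloudwatch_event_rule.asg_instance_changes.name
  arn  = var.alarm_manager_lambda_arn
}

# Allow the rule to invoke the function.
resource "aws_lambda_permission" "allow_events" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = var.alarm_manager_lambda_arn
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.asg_instance_changes.arn
}
```

The event delivered to the function carries the instance ID in detail.EC2InstanceId, which the function can use as the InstanceId dimension when it creates or deletes the alarm.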
