Reference Instance ID in CW Alarm Dimension - Terraform

Question

I am adding alarms/monitoring to a logging pipeline. Specifically, I am creating CloudWatch (CW) alarms that trigger at 50% or higher disk/memory utilization for EC2 instances within an Auto Scaling Group (ASG). The ASG is created in the "workers" module directory and outputs the scaling group name, which is then referenced during alarm creation in the "cloudwatch" module directory.

I am struggling to understand a few things about creating this alarm:

Do all dimensions of a metric have to be referenced in alarm creation?
If so, how do I reference InstanceID when only target groups/scaling groups are defined in the TF files?

In the "alarms" parent module:

resource "aws_cloudwatch_metric_alarm" "pipeline_DiskUtilization" {
  alarm_name          = "pipeline-disk-alarm"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = "5"
  metric_name         = "disk_used_percent"
  namespace           = "CWAgent"
  period              = "60"
  statistic           = "Average"
  threshold           = "50"

  dimensions = {
    AutoScalingGroupName = var.scaling_name
  }

  alarm_description = "This metric monitors ec2 disk utilization"
  alarm_actions     = [var.scaling_group]
}

In the "workers" parent module:

resource "aws_autoscaling_group" "pipeline-scaling-group" {
  name                = "pipeline-worker-asg"
  vpc_zone_identifier = var.operating_subnets
  desired_capacity    = 2
  max_size            = 4
  min_size            = 2
  target_group_arns   = [var.target_group]

  launch_template {
    id      = aws_launch_template.pipeline-worker-launch-template.id
    version = "$Latest"
  }
}

Answer 1

do all dimensions of a metric have to be referenced in alarm creation?

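Yes. CloudWatch treats each unique combination of dimensions as a separate metric, so an alarm only matches data if it specifies exactly the set of dimensions the metric was published with.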

and, if so, how do I reference InstanceID when only target group/scaling groups are defined in the TF files?

You can't do this (easily) from TF. Once you use an ASG to manage your instances, they are out of your direct control, so you can't get their IDs directly. Also, you shouldn't do this even if you could. Instances in an ASG should be treated as a group (hence the "Group" in Auto Scaling Group), not as individual entities.

Even if you could do this (easily), how would you manage these alarms? The ASG can replace your instances at any time, which over time leaves a lot of dead alarms for terminated instances and new instances without any alarms.

The proper way to manage this would be through CloudWatch Event rules, outside of your TF. You would have to detect instance launches and terminations performed by your ASG. Each such event would trigger a Lambda function that adds or removes alarms dynamically in response.
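As a rough illustration of that Lambda, here is a minimal Python sketch. It assumes the rule forwards the EC2 Auto Scaling events "EC2 Instance Launch Successful" and "EC2 Instance Terminate Successful", whose detail carries EC2InstanceId and AutoScalingGroupName. The helper names (alarm_name, build_alarm_params, plan_action) and the alarm naming scheme are made up for this example. The dimension list is also an assumption: it has to match exactly what your CloudWatch agent publishes (disk metrics often carry path, device and fstype as well), so adjust it to your agent config.

```python
# Sketch only: keeps one disk alarm per ASG instance in sync with ASG events.
# Helper names and the alarm naming scheme are hypothetical.

LAUNCH = "EC2 Instance Launch Successful"
TERMINATE = "EC2 Instance Terminate Successful"


def alarm_name(instance_id):
    """Derive a per-instance alarm name (naming scheme is an assumption)."""
    return f"pipeline-disk-alarm-{instance_id}"


def build_alarm_params(instance_id, asg_name):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    Thresholds mirror the Terraform alarm in the question. The dimensions are
    an assumption and must match what the agent actually publishes.
    """
    return {
        "AlarmName": alarm_name(instance_id),
        "Namespace": "CWAgent",
        "MetricName": "disk_used_percent",
        "Statistic": "Average",
        "Period": 60,
        "EvaluationPeriods": 5,
        "Threshold": 50.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "Dimensions": [
            {"Name": "InstanceId", "Value": instance_id},
            {"Name": "AutoScalingGroupName", "Value": asg_name},
        ],
    }


def plan_action(event):
    """Map an ASG lifecycle event to ("put" | "delete" | "ignore", params)."""
    detail = event.get("detail", {})
    instance_id = detail.get("EC2InstanceId")
    asg_name = detail.get("AutoScalingGroupName")
    kind = event.get("detail-type")
    if instance_id is None:
        return ("ignore", {})
    if kind == LAUNCH:
        return ("put", build_alarm_params(instance_id, asg_name))
    if kind == TERMINATE:
        return ("delete", {"AlarmNames": [alarm_name(instance_id)]})
    return ("ignore", {})


def handler(event, context):
    """Lambda entry point: create or delete the alarm for the affected instance."""
    action, params = plan_action(event)
    if action == "ignore":
        return action
    import boto3  # imported lazily so the helpers above run without AWS access

    cloudwatch = boto3.client("cloudwatch")
    if action == "put":
        cloudwatch.put_metric_alarm(**params)
    else:
        cloudwatch.delete_alarms(**params)
    return action
```

Keeping the decision logic in plan_action, separate from the boto3 calls, means you can check the event handling locally before wiring it to a real rule. Alarm actions (for example an SNS topic) are left out here and would need to be added to the parameters.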

solution1
0 ACCEPTED 2021-05-08 02:29:44