Amazon EC2 AutoScaling CPUUtilization Alarm-INSUFFICIENT DATA

So I've been using Boto in Python to try to configure autoscaling based on CPUUtilization, more or less exactly as specified in this example: http://boto.readthedocs.org/en/latest/autoscale_tut.html

However, both alarms in CloudWatch just report:

State Details: State changed to 'INSUFFICIENT_DATA' at 2012/11/12 16:30 UTC. Reason: Unchecked: Initial alarm creation

Auto scaling is working fine, but the alarms aren't picking up any CPUUtilization data at all. Any ideas for things I can try?

Edit: The instance itself reports CPU utilisation data, just not when I try to create an alarm in CloudWatch, whether programmatically in Python or in the interface. Detailed monitoring is also enabled, just in case...

Thanks!

The official answer from AWS goes like this:

Hi, There is an inherent delay in transitioning into INSUFFICIENT_DATA state (only) as alarms wait for a period of time to compensate for metric generation latency. For an alarm with a 60 second period, the delay before transition into I_D state will be between 5 and 10 minutes.

John.

Apparently this is a temporary state and will likely resolve itself.

I am not sure what's going on in the backend, but if you compare the alarm history, you will see that AWS removes the 'unit' field when you just modify and save the alarm without any change, as at7000ft said. So remove the unit from your script.

Make sure that the alarm's Namespace is 'AWS/EC2'.

I know this is a long time after the original question, but in case others find this via Google: I had the same problem, and it turned out I had set the alarm's Namespace improperly.
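For illustration, the namespace point above could be written down like this in Python. This is only a minimal sketch, not the original poster's script: the helper name `build_cpu_alarm_params`, the alarm name, the group name and the threshold are all hypothetical, and the dictionary keys assume the parameter names of boto3's `put_metric_alarm` rather than the older Boto API used in the question.

```python
def build_cpu_alarm_params(alarm_name, group_name, threshold, comparison):
    """Build keyword arguments for a CPUUtilization alarm on an Auto Scaling group.

    Hypothetical helper: the keys are assumed to match boto3's put_metric_alarm.
    """
    return {
        "AlarmName": alarm_name,
        # EC2 publishes CPUUtilization under this namespace; a different
        # value makes the alarm watch a metric that receives no data.
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": group_name}],
        "Statistic": "Average",
        "Period": 300,  # seconds
        "EvaluationPeriods": 2,
        "Threshold": float(threshold),
        "ComparisonOperator": comparison,
    }


# Example values are made up for this sketch.
params = build_cpu_alarm_params(
    "scale_up_on_cpu", "my_group", 70, "GreaterThanThreshold"
)
```

With boto3 installed and credentials configured, the dictionary could then be passed along as `boto3.client("cloudwatch").put_metric_alarm(**params)`.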

You need to publish data with the same unit that was used to create the alarm. If you didn't specify one, the unit will be <None>.

The unit can be specified in aws cloudwatch put-metric-data and aws cloudwatch put-metric-alarm with --unit <value>

Unit <value> can be:

  • Seconds
  • Bytes
  • Bits
  • Percent
  • Count
  • Bytes/Second (bytes per second)
  • Bits/Second (bits per second)
  • Count/Second (counts per second)
  • None (default when no unit is specified)

Units are also case-sensitive, so be careful about that in your scripts.

For CPUUtilization, you can use Percent.

After the first data set is sent to your alarm (it can take up to 5 minutes for an instance without detailed monitoring), the alarm will switch to the OK or ALARM state instead of INSUFFICIENT_DATA.
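Since the unit is case-sensitive, a script could check it before creating the alarm. The following is a rough sketch under stated assumptions: `LISTED_UNITS` contains only the units named in this answer (CloudWatch accepts others as well), and `check_unit` is a hypothetical helper, not part of any AWS library.

```python
# Only the units named in the answer above; CloudWatch accepts more than these.
LISTED_UNITS = {
    "Seconds", "Bytes", "Bits", "Percent", "Count",
    "Bytes/Second", "Bits/Second", "Count/Second", "None",
}


def check_unit(unit):
    """Return (is_valid, suggestion) for a unit string.

    The comparison is case-sensitive. If the unit only differs in case from
    a listed unit, the correctly cased spelling is returned as the suggestion.
    """
    if unit in LISTED_UNITS:
        return True, None
    by_lower = {listed.lower(): listed for listed in LISTED_UNITS}
    return False, by_lower.get(unit.lower())


print(check_unit("Percent"))   # (True, None)
print(check_unit("percent"))   # (False, 'Percent')
```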

I am seeing the same INSUFFICIENT_DATA alarm state in CloudWatch for an RDS CPUUtilization > 60 alarm created with CloudFormation ("Reason: Unchecked: Initial alarm creation" shows up under details). This is a very crude fix, but I found that by selecting the alarm, clicking the Modify button, and then the Save button (without changing anything), the alarm goes to the OK state and everything is fine.

I had this problem. Make sure the metric name you use to create the alarm matches the actual metric name.

You can list your metrics with:

aws cloudwatch list-metrics --namespace=<NAMESPACE, e.g. System/Linux, etc>

Find the metric and its MetricName. Make sure your alarm is configured for that metric.
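The comparison can also be done in a script once the list-metrics output has been parsed as JSON. This is a sketch only: `find_matching_metrics` is a hypothetical helper, the sample data is made up, and it assumes the usual response shape with a top-level "Metrics" list whose entries carry "Namespace", "MetricName" and "Dimensions".

```python
def find_matching_metrics(response, namespace, metric_name, dimensions=None):
    """Return the metrics in a parsed list-metrics response that an alarm would watch.

    `dimensions` is a plain {name: value} dict. When given, it must equal the
    metric's full dimension set, because each dimension combination is a
    separate metric. When omitted, dimensions are not compared.
    """
    wanted = None if dimensions is None else set(dimensions.items())
    matches = []
    for metric in response.get("Metrics", []):
        if metric.get("Namespace") != namespace:
            continue
        if metric.get("MetricName") != metric_name:
            continue
        have = {(d["Name"], d["Value"]) for d in metric.get("Dimensions", [])}
        if wanted is None or wanted == have:
            matches.append(metric)
    return matches


# Made-up sample in the assumed response shape.
sample = {
    "Metrics": [
        {
            "Namespace": "AWS/EC2",
            "MetricName": "CPUUtilization",
            "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "my_group"}],
        }
    ]
}

found = find_matching_metrics(
    sample, "AWS/EC2", "CPUUtilization", {"AutoScalingGroupName": "my_group"}
)
```

An empty result for the namespace, metric name and dimensions used in the alarm suggests the alarm is pointed at a metric that does not exist.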

As far as I know, the default metric resolution is 5 minutes (which can be lowered to 1 minute if you pay for it, or something like that), so if your alarm's measurement period is shorter than that, it will remain permanently in the INSUFFICIENT_DATA state. In my case, I had a 1 minute measurement period on CPU utilization, and changing it to 5 minutes fixed the state issue.
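The check described here is simple enough to put in a script. A minimal sketch, assuming the 5 minute and 1 minute resolutions mentioned above; the function and constant names are hypothetical.

```python
BASIC_MONITORING_SECONDS = 300     # default 5 minute resolution
DETAILED_MONITORING_SECONDS = 60   # 1 minute resolution with detailed monitoring


def period_is_too_short(alarm_period_seconds, detailed_monitoring=False):
    """Return True if the alarm period is shorter than the metric resolution."""
    if detailed_monitoring:
        resolution = DETAILED_MONITORING_SECONDS
    else:
        resolution = BASIC_MONITORING_SECONDS
    return alarm_period_seconds < resolution


print(period_is_too_short(60))                            # True
print(period_is_too_short(60, detailed_monitoring=True))  # False
print(period_is_too_short(300))                           # False
```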

I had a similar problem: my alarm was constantly in the INSUFFICIENT_DATA state, although I could see the metric in the GUI.

It turned out that this happened because I specified the wrong Unit for the metric when I created the alarm. No error was reported back, but it never became GREEN.

If you are not sure, it is better not to specify it, and AWS will do the correct match in the background.
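In a script, "not specifying it" usually means leaving the key out entirely rather than passing an empty value. A small sketch, assuming the alarm parameters are kept in a dict whose keys follow boto3's `put_metric_alarm` naming; `with_optional_unit` is a hypothetical helper.

```python
def with_optional_unit(params, unit=None):
    """Return a copy of the alarm parameters with Unit set only when one is given."""
    result = dict(params)
    result.pop("Unit", None)  # drop any unit left over from earlier code
    if unit is not None:
        result["Unit"] = unit
    return result


# Made-up parameters for this sketch.
base = {"AlarmName": "cpu_high", "MetricName": "CPUUtilization", "Unit": "Count"}

no_unit = with_optional_unit(base)
explicit = with_optional_unit(base, "Percent")
```

Here `no_unit` carries no "Unit" key at all, while `explicit` sets it to "Percent"; the original `base` dict is left unchanged.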

There is a directory, /var/tmp/aws-mon/, that contains a couple of files. One is instance-id. The instance I was on was created from an AMI, and this file retained the old instance ID. I just edited it and made sure /var/tmp/aws-mon/placement/availability-zone was also correct. The alarms changed to OK almost instantly.
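A quick way to spot this is to compare the cached file with the instance's real ID. This is a sketch, not part of the monitoring scripts: `cached_instance_id_is_stale` is a hypothetical helper, and the real instance ID has to be supplied by the caller (for example from the EC2 console or the instance metadata service).

```python
from pathlib import Path


def cached_instance_id_is_stale(actual_instance_id,
                                cache_file="/var/tmp/aws-mon/instance-id"):
    """Return True if the cached instance ID differs from the real one.

    A missing cache file is not treated as stale, since there is nothing
    left over from the AMI in that case.
    """
    path = Path(cache_file)
    if not path.exists():
        return False
    return path.read_text().strip() != actual_instance_id
```

If it returns True, the file still holds the ID of the instance the AMI was created from, which matches the situation described above.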

I also ran into this problem, but for a different reason: I passed the ES cluster ARN instead of the domain name in my CloudFormation template. It was pretty frustrating.
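To show the difference between the two values, here is a small Python sketch that reduces either form to the bare domain name. It assumes the ARN layout arn:aws:es:<region>:<account-id>:domain/<domain-name>; `es_domain_name` is a hypothetical helper and the ARN in the example is made up.

```python
def es_domain_name(value):
    """Return the bare domain name, given either a domain name or a domain ARN."""
    if not value.startswith("arn:"):
        return value  # already a plain domain name
    parts = value.split(":", 5)
    prefix = "domain/"
    if len(parts) != 6 or not parts[5].startswith(prefix):
        raise ValueError("not an ES domain ARN: %r" % value)
    return parts[5][len(prefix):]


print(es_domain_name("arn:aws:es:us-east-1:123456789012:domain/my-domain"))  # my-domain
print(es_domain_name("my-domain"))                                           # my-domain
```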

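Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original source address. For any questions, contact: yoyou2525@163.com.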

 