
AWS: None of the Instances are sending data

I'm trying to set up an Elastic Beanstalk application with Amazon Web Services, but I keep receiving errors with the message "None of the instances are sending data". I've tried deleting the Elastic Beanstalk application and the EC2 instance several times and recreating them with the sample application, but I get the same error.

I also tried uploading a Flask application with the AWS Elastic Beanstalk command line tools, but then I received the error below:

Environment health has transitioned from Pending to Severe. 100.0 % of the requests to the ELB are failing with HTTP 5xx. Insufficient request rate (0.5 requests/min) to determine application health (7 minutes ago). ELB health is failing or not available for all instances. None of the instances are sending data

Why do I get this error and how do I fix it? Thanks.

You are using Enhanced Health Monitoring. With enhanced health monitoring, an agent installed on your EC2 instance monitors vital system and application level health metrics and sends them directly to Elastic Beanstalk.

When you see an error message like "None of the instances are sending data", it means that either the agent on the instance has crashed or it is unable to post data to Elastic Beanstalk because of a networking error or some other error.

To debug this, I would recommend downloading "Full logs" from the AWS console. You can follow the instructions for getting logs in the section "Downloading Bundle Logs from Elastic Beanstalk Console" here. If you are unable to download logs using the console for any reason, you can also SSH to the instance and look at the logs in /var/log.

You will find logs for the health agent in /var/log/healthd/daemon.log. Additional logs useful for this situation are /var/log/cfn-init.log, /var/log/eb-cfn-init.log and /var/log/eb-activity.log. Can you look at the logs and give more details of the errors you see? They should give you more details regarding the error "None of the instances are sending data".
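If you are on the instance over SSH, a small script can show the tail of each of these logs in one go. This is only a sketch: the function names `tail_logs` and `print_report` and the default of 20 lines are illustrative choices, and reading files under /var/log may require root.

```python
from collections import deque
from pathlib import Path

# Log files named in the answer above.
HEALTH_LOGS = [
    "/var/log/healthd/daemon.log",
    "/var/log/cfn-init.log",
    "/var/log/eb-cfn-init.log",
    "/var/log/eb-activity.log",
]


def tail_logs(paths, lines=20):
    """Return {path: last `lines` lines}, with None for files that do not exist."""
    result = {}
    for path in paths:
        p = Path(path)
        if not p.is_file():
            result[str(path)] = None  # file not present on this instance
            continue
        with p.open(errors="replace") as handle:
            # deque with maxlen keeps only the final `lines` lines while streaming
            result[str(path)] = [line.rstrip("\n") for line in deque(handle, maxlen=lines)]
    return result


def print_report(paths=HEALTH_LOGS, lines=20):
    """Print the tail of every log, marking the ones that are missing."""
    for path, tail in tail_logs(paths, lines).items():
        print(f"== {path} ==")
        print("(missing)" if tail is None else "\n".join(tail))
```

Calling `print_report()` on the instance prints each log's last lines, which is usually enough to spot a repeated error from the health agent.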

Regarding the other health "causes" you are seeing:

  • Environment health has transitioned from Pending to Severe - Your environment's health status is initially Pending. If the instances do not become healthy within the grace period, the health status transitions to Severe. In your case, since none of the instances is healthy or sending data, the health transitioned to Severe.

  • 100.0 % of the requests to the ELB are failing with HTTP 5xx. Insufficient request rate (0.5 requests/min) to determine application health (7 minutes ago). - With enhanced health monitoring, Elastic Beanstalk monitors other resources in addition to your EC2 instances. For example, it monitors CloudWatch metrics for your ELB. This error means that all requests sent to your environment CNAME/load balancer are failing with HTTP 5xx errors. At the same time, the request rate is very low, only 0.5 requests per minute, so even though every request is failing, only a handful of requests are involved. "7 minutes ago" means that the information about ELB metrics is slightly old. Elastic Beanstalk checks CloudWatch metrics every few minutes, so the data can be slightly stale. This is in contrast to the health data it gets directly from the EC2 instances, which is "near real time". In your case, since the instances are not sending data, the only available source of health information is the ELB metrics, which are delayed by about 7 minutes.

  • ELB health is failing or not available for all instances - Elastic Beanstalk is looking at the health of your ELB, i.e. it is checking how many instances are in service behind the ELB. In your case, either all instances behind the ELB are out of service or the health is not available for some other reason. You should double check that your service role is correctly configured. You can read how to configure the service role correctly here or in the documentation. It is also possible that your application failed to start.
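To make the numbers in the second cause concrete, here is a rough sketch that pulls them out of the message text. It assumes the wording matches the message quoted in the question; `parse_elb_cause` is a hypothetical helper and the 10-minute window is just an illustrative figure.

```python
import re


def parse_elb_cause(message):
    """Pull the failure percentage, request rate, and data age out of a cause string."""
    pct = re.search(r"([\d.]+)\s*% of the requests to the ELB are failing", message)
    rate = re.search(r"\(([\d.]+) requests/min\)", message)
    age = re.search(r"\((\d+) minutes? ago\)", message)
    return {
        "failing_pct": float(pct.group(1)) if pct else None,
        "requests_per_min": float(rate.group(1)) if rate else None,
        "age_minutes": int(age.group(1)) if age else None,
    }


cause = (
    "100.0 % of the requests to the ELB are failing with HTTP 5xx. "
    "Insufficient request rate (0.5 requests/min) to determine application health "
    "(7 minutes ago)."
)
info = parse_elb_cause(cause)

# At 0.5 requests/min, a 10-minute window holds only about 5 requests,
# so "100 % failing" describes very few requests.
requests_in_10_min = info["requests_per_min"] * 10
print(info, requests_in_10_min)
```

The point of the arithmetic is that a 100 % failure rate at this request rate is weak evidence on its own, which is why the agent data matters more.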

In your case I would suggest focusing on the first error, "None of the instances are sending data". For this you need to look at the logs as outlined above. Let me know what you see in the logs. The agent is started fairly early in the bootstrap process on the instance, so if you see an error like "None of the instances are sending data", it is very likely that bootstrap failed or the agent failed to start for some reason. The logs should tell you more.

Also make sure you are using an instance profile with your environment. The instance profile allows the health agent running on your EC2 instance to authenticate with Elastic Beanstalk. If no instance profile is associated with your environment, the agent will not be able to send data to Elastic Beanstalk. Read more about instance profiles with Elastic Beanstalk here.

Update: One common reason for the health cause "None of the instances are sending data" is that your instance is in a VPC and your VPC does not allow NTP access (UDP port 123). A typical indicator of this problem is the following message in /var/log/messages: ntpdate: Synchronizing with time server: [FAILED]. When this happens, the clock on your EC2 instance can get out of sync and the data is considered invalid. You should also see a health cause on the instances on the health page of the AWS web console telling you that the instance clock is out of sync. The fix is to make sure that your VPC allows access to NTP.
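A quick way to check for the NTP symptom is to search /var/log/messages for that exact marker. The following is a minimal sketch; the helper names are made up for illustration, and it assumes the log line contains the marker text exactly as quoted above.

```python
# Marker text quoted in the answer above.
NTP_FAILURE = "ntpdate: Synchronizing with time server: [FAILED]"


def ntp_sync_failures(log_text):
    """Return the lines of a /var/log/messages dump that contain the failure marker."""
    return [line for line in log_text.splitlines() if NTP_FAILURE in line]


def check_messages_log(path="/var/log/messages"):
    """Read the log file (may need root) and return the matching lines."""
    with open(path, errors="replace") as handle:
        return ntp_sync_failures(handle.read())
```

An empty list does not prove the clock is in sync; it only means this particular marker was not logged.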


There can be many reasons why the health agent is not able to send any data, so this may not be the answer to your problem, but it was the answer to mine and hopefully it can help somebody else:

I got the same error, and looking into /var/log/healthd/daemon.log I found the following reported repeatedly:

sending message(s) failed: (Aws::Healthd::Errors::GroupNotFoundException) Group 97c30ca2-5eb5-40af-8f9a-eb3074622172 does not exist

This was caused by me making and using an AMI image from an EC2 instance inside an Elastic Beanstalk environment. That is, I created a temporary environment with one instance and the same configuration as my production environment, went into the EC2 console and created an image of the instance, terminated the temporary environment, and then created yet another environment using the new custom AMI.

Of course (in hindsight) this meant some settings of the temporary environment were still being used. In this case it was specifically /etc/healthd/config.yaml, which resulted in the health agent trying to send messages to a health group that no longer existed.

To fix this and make sure there was no other stale configuration around, I instead started a new EC2 instance by hand from the default AMI used in the production environment (find it under the 'Instances' configuration page of your environment), provisioned that, then created a new image from it and used that image in my new EB environment.
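If you suspect the same stale-AMI problem, you can extract the group ID from the GroupNotFoundException lines in daemon.log and compare it with what you expect for the current environment. This is a sketch that assumes the log line has the same shape as the one quoted above; `stale_group_ids` is a hypothetical helper.

```python
import re

# Pattern built from the daemon.log line quoted in this answer.
GROUP_NOT_FOUND = re.compile(
    r"Aws::Healthd::Errors::GroupNotFoundException\) Group (\S+) does not exist"
)


def stale_group_ids(log_text):
    """Return the distinct group IDs healthd reported as nonexistent, in first-seen order."""
    seen = []
    for match in GROUP_NOT_FOUND.finditer(log_text):
        group_id = match.group(1)
        if group_id not in seen:
            seen.append(group_id)
    return seen


line = (
    "sending message(s) failed: (Aws::Healthd::Errors::GroupNotFoundException) "
    "Group 97c30ca2-5eb5-40af-8f9a-eb3074622172 does not exist"
)
print(stale_group_ids("\n".join([line] * 3)))
```

A single ID repeated over and over, as here, is consistent with a configuration baked into the image rather than a transient error.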

Check whether your instance type has enough RAM for your app, the OS, and the Amazon tooling. We suffered from this for a long time before we discovered that t2.micro is barely enough for our use cases. The problem went away right after switching to t2.small (2GB).

I solved this by adding another security group (the default security group of my Elastic Beanstalk environment).

It turned out my problem was that I had not associated a public IP address with my instance... after I set that up, it worked fine.

I just set the health check path on the load balancer to a URL that responds with status code 200. I did this only for a study environment.

For my real app, I use actuator.
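For illustration, an endpoint that always answers 200 can be as small as the sketch below. It uses only the Python standard library rather than any particular framework; the `/health` path, the handler name, and the port are all assumptions, so use whatever path you configured on the load balancer.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

HEALTH_PATH = "/health"  # hypothetical path; match the one set on the load balancer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer 200 on the health path and 404 everywhere else.
        if self.path == HEALTH_PATH:
            body = b"ok"
            self.send_response(200)
        else:
            body = b"not found"
            self.send_response(404)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_health_server(host="127.0.0.1", port=8080):
    """Build (but do not start) a server that exposes the health endpoint."""
    return ThreadingHTTPServer((host, port), HealthHandler)


# To run it: make_health_server("0.0.0.0", 8080).serve_forever()
```

In a real application the health endpoint should reflect whether the app can actually serve traffic, which is what a dedicated health library gives you.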

If you see something like this, where you don't get any enhanced metrics, check that you haven't accidentally removed the conf.d/elasticbeanstalk/healthd.conf include from your nginx config. This conf adds a machine-readable log format that is responsible for reporting that data in EB (see Enhanced health log format - AWS).

(Image: no metrics)
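One way to verify the include is still present is to scan the nginx config text for an `include` directive that covers healthd.conf, either by name or through a wildcard. This is a rough sketch, not a real nginx parser: it assumes relative includes resolve against /etc/nginx, and the helper names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

HEALTHD_CONF = "conf.d/elasticbeanstalk/healthd.conf"
NGINX_PREFIX = "/etc/nginx/"  # assumption: default nginx config directory

# Matches uncommented "include <pattern>;" directives at the start of a line.
INCLUDE = re.compile(r"^\s*include\s+([^;#]+?)\s*;", re.MULTILINE)


def _glob_match(path, pattern):
    """Match segment by segment so that '*' does not cross a '/'."""
    path_parts, pattern_parts = path.split("/"), pattern.split("/")
    return len(path_parts) == len(pattern_parts) and all(
        fnmatchcase(part, glob) for part, glob in zip(path_parts, pattern_parts)
    )


def includes_healthd_conf(nginx_conf_text):
    """Return True if any include directive covers healthd.conf."""
    for pattern in INCLUDE.findall(nginx_conf_text):
        pattern = pattern.strip("\"'")
        if _glob_match(HEALTHD_CONF, pattern) or _glob_match(NGINX_PREFIX + HEALTHD_CONF, pattern):
            return True
    return False
```

If this returns False for your custom nginx.conf, re-adding the include is the first thing to try.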

My instance profile's IAM role was lacking the elasticbeanstalk:PutInstanceStatistics permission.

I found this by looking at /var/log/healthd/daemon.log, as suggested in one of the other answers.

I had to SSH into the machine directly to discover this, as the Get Logs function itself was failing due to missing S3 write permissions.

If you're running a Worker Tier EB environment, you need to add this policy:

arn:aws:iam::aws:policy/AWSElasticBeanstalkWorkerTier

I was running an app in an Elastic Beanstalk environment with Docker as the platform. I got the same error that none of the instances are sending data, and I was unable to fetch logs as well. Rebuilding the environment worked for me.

For anyone arriving here in 2022…

After launching a new environment that was identical to a current healthy environment and seeing no data, I raised an AWS Support ticket. I was informed:

Here, I would like to inform you that recently Elastic Beanstalk introduced a new feature called EnhancedHealthAuthEnabled to increase the security of your environment and help prevent health data spoofing on your behalf. This option is enabled by default when you create a new environment.

If you use managed policies for your instance profile, this feature is available for your new environment without any further configuration, as Elastic Beanstalk instance profile managed policies contain permissions for the elasticbeanstalk:PutInstanceStatistics action. However, if you use a custom instance profile instead of a managed policy, your environment might display a No Data health status. This happens because a custom instance profile doesn't have the PutInstanceStatistics permission by default, so instances aren't authorised for the action that communicates enhanced health data to the service. Hence, your environment health shows Unknown/No data status.

The policy that I needed to attach to my existing EC2 role (as advised by AWS Support) looked like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ElasticBeanstalkHealthAccess",
      "Action": [
        "elasticbeanstalk:PutInstanceStatistics"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:elasticbeanstalk:*:*:application/*",
        "arn:aws:elasticbeanstalk:*:*:environment/*"
      ]
    }
  ]
}

Adding this policy to my EC2 role solved the issue for me.
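If you have the JSON of a custom policy at hand, a simplified check like the one below can tell you whether any Allow statement covers elasticbeanstalk:PutInstanceStatistics. It is only a sketch: `allows_action` is a hypothetical helper that ignores Deny statements, resources, and conditions, so it is not a substitute for real IAM policy evaluation.

```python
import json
from fnmatch import fnmatchcase


def allows_action(policy_document, action):
    """Return True if some Allow statement lists `action`, honouring * and ? wildcards."""
    if isinstance(policy_document, str):
        policy_document = json.loads(policy_document)
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lower case.
        if any(fnmatchcase(action.lower(), pattern.lower()) for pattern in actions):
            return True
    return False


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ElasticBeanstalkHealthAccess",
            "Action": ["elasticbeanstalk:PutInstanceStatistics"],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:elasticbeanstalk:*:*:application/*",
                "arn:aws:elasticbeanstalk:*:*:environment/*",
            ],
        }
    ],
}
print(allows_action(policy, "elasticbeanstalk:PutInstanceStatistics"))
```

Running the same check against each policy attached to your instance profile's role shows quickly whether the permission is missing.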

In my case, it was resolved when I increased the RAM by moving to a larger instance type (t2.micro to c5.xlarge).
