CloudFormation AutoScalingGroup not waiting for signal on update/scale-up
I'm working with a CloudFormation template that brings up as many instances as I request, and I want to wait for them to finish initialising (via User Data) before the stack creation/update is considered complete.
Creating or updating the stack should wait for signals from all newly created instances, so as to ensure that their initialisation is complete.
I don't want the stack creation or update to be considered successful if any of the created instances fail to initialise.
CloudFormation only seems to wait for signals from instances when the stack is first created. Updating the stack and increasing the number of instances seems to disregard signalling. The update operation finishes successfully very quickly, whilst instances are still being initialised.
Instances created as a result of updating the stack can fail to initialise, but the update action will already have been considered a success.
Using CloudFormation, how can I make the reality meet the expectation?
I want the same behaviour that applies when the stack is created to also apply when the stack is updated.
I have found only the following question that matches my problem: UpdatePolicy in Autoscaling group not working correctly for AWS CloudFormation update
It's been open for a year and has not received an answer.
I'm creating another question as I have more information to add, and I'm not sure whether these particulars match those of the author of that question.
To demonstrate the problem, I've created a template based on the example beneath the Auto Scaling Group header on this AWS documentation page, which includes signalling.
The template has been adapted as follows:
- […] (`ap-northeast-1`). The `cfn-signal` command has been bootstrapped and called as necessary considering this change.
Here's the template, saved to `template.yml`:
Parameters:
  DesiredCapacity:
    Type: Number
    Description: How many instances would you like in the Auto Scaling Group?
Resources:
  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AvailabilityZones: !GetAZs ''
      LaunchConfigurationName: !Ref LaunchConfig
      MinSize: !Ref DesiredCapacity
      MaxSize: !Ref DesiredCapacity
    CreationPolicy:
      ResourceSignal:
        Count: !Ref DesiredCapacity
        Timeout: PT5M
    UpdatePolicy:
      AutoScalingScheduledAction:
        IgnoreUnmodifiedGroupSizeProperties: true
      AutoScalingRollingUpdate:
        MinInstancesInService: 1
        MaxBatchSize: 2
        PauseTime: PT5M
        WaitOnResourceSignals: true
  LaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-b7d829d6
      InstanceType: t2.micro
      UserData:
        'Fn::Base64':
          !Sub |
            #!/bin/bash -xe
            sleep 120
            apt-get -y install python-setuptools
            TMP=`mktemp -d`
            curl https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz | \
              tar xz -C $TMP --strip-components 1
            easy_install $TMP
            /usr/local/bin/cfn-signal -e $? \
              --stack ${AWS::StackName} \
              --resource AutoScalingGroup \
              --region ${AWS::Region}
Now I create the stack with a single instance, via:
$ aws cloudformation create-stack \
    --region=ap-northeast-1 \
    --stack-name=asg-test \
    --template-body=file://template.yml \
    --parameters ParameterKey=DesiredCapacity,ParameterValue=1
After waiting a few minutes for the creation to complete, let's look at some key stack events:
$ aws cloudformation describe-stack-events \
    --region=ap-northeast-1 \
    --stack-name=asg-test
...
{
    "Timestamp": "2017-02-03T05:36:45.445Z",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    ...
    "ResourceStatus": "CREATE_COMPLETE",
    ...
},
{
    "Timestamp": "2017-02-03T05:36:42.487Z",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    ...
    "ResourceStatusReason": "Received SUCCESS signal with UniqueId ...",
    "ResourceStatus": "CREATE_IN_PROGRESS"
},
{
    "Timestamp": "2017-02-03T05:33:33.274Z",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    ...
    "ResourceStatusReason": "Resource creation Initiated",
    "ResourceStatus": "CREATE_IN_PROGRESS",
    ...
}
...
You can see that the auto scaling group started initiating at 05:33:33. At 05:36:42 (3 minutes after initiation), it received a success signal. This allowed the auto scaling group to reach its own success status only moments later, at 05:36:45.
That's awesome - working like a charm.
Now let's try increasing the number of instances in this auto scaling group to 2 by updating the stack:
$ aws cloudformation update-stack \
    --region=ap-northeast-1 \
    --stack-name=asg-test \
    --template-body=file://template.yml \
    --parameters ParameterKey=DesiredCapacity,ParameterValue=2
After waiting a much shorter time for the update to complete, let's look at some of the new stack events:
$ aws cloudformation describe-stack-events \
    --region=ap-northeast-1 \
    --stack-name=asg-test
{
    "ResourceStatus": "UPDATE_COMPLETE",
    ...
    "ResourceType": "AWS::CloudFormation::Stack",
    ...
    "Timestamp": "2017-02-03T05:45:47.063Z"
},
...
{
    "ResourceStatus": "UPDATE_COMPLETE",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    "Timestamp": "2017-02-03T05:45:43.047Z"
},
{
    "ResourceStatus": "UPDATE_IN_PROGRESS",
    ...
    "LogicalResourceId": "AutoScalingGroup",
    "Timestamp": "2017-02-03T05:44:20.845Z"
},
{
    "ResourceStatus": "UPDATE_IN_PROGRESS",
    ...
    "ResourceType": "AWS::CloudFormation::Stack",
    ...
    "Timestamp": "2017-02-03T05:44:15.671Z",
    "ResourceStatusReason": "User Initiated"
},
...
Now you can see that whilst the auto scaling group started updating at 05:44:20, it completed at 05:45:43. That's less than one and a half minutes to completion, which shouldn't be possible considering the 120-second sleep in the user data.
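The two intervals quoted in this question can be checked with a little shell arithmetic. This is only an illustrative sketch: `seconds_of_day` and `elapsed_seconds` are helper names made up here, the timestamps are the ones from the stack events truncated to whole seconds, and both times are assumed to fall on the same day.

```shell
#!/bin/sh
# Convert an HH:MM:SS time of day into seconds since midnight.
# A single leading zero is stripped from each field so that values
# such as "08" are not misread as octal by the shell.
seconds_of_day() {
  h=${1%%:*}      # text before the first colon
  rest=${1#*:}    # remaining "MM:SS"
  m=${rest%%:*}
  s=${rest#*:}
  echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# Elapsed seconds between two times of day (start, end) on the same day.
elapsed_seconds() {
  echo $(( $(seconds_of_day "$2") - $(seconds_of_day "$1") ))
}

# Create: "Resource creation Initiated" -> "Received SUCCESS signal"
elapsed_seconds 05:33:33 05:36:42   # prints 189
# Update: UPDATE_IN_PROGRESS -> UPDATE_COMPLETE on the AutoScalingGroup
elapsed_seconds 05:44:20 05:45:43   # prints 83
```

The create took longer than the 120-second sleep, as expected, while the update finished in roughly 83 seconds, which is shorter than the sleep alone.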
The stack update then proceeds to completion without the auto scaling group ever having received any signals.
The new instance does indeed exist.
In my real use case I've SSHed into one of these new instances to find that it was still in the process of initialising even after the stack update completed.
I've read and re-read the documentation surrounding `CreationPolicy` and `UpdatePolicy`, but have failed to identify what I'm missing.
Taking a look at the update policy in use above, I don't understand what it's actually doing. Why is `WaitOnResourceSignals` true, but it's not waiting? Is it serving some other purpose?
Or are these new instances not falling under the "rolling update" policy? If they don't belong there, then I'd expect them to fall under the creation policy, but that doesn't seem to apply either.
As such, I don't really know what else to try.
I have a sneaking feeling that it's functioning as designed/expected, but if it is, then what's the point of that `WaitOnResourceSignals` property, and how can I meet the expectation set above?
The `AutoScalingRollingUpdate` policy handles rotating out an entire set of instances in an Auto Scaling group in response to changes to the underlying `LaunchConfiguration`. It doesn't apply to individual changes to the number of instances in the existing group. According to the UpdatePolicy Attribute documentation,
The `AutoScalingReplacingUpdate` and `AutoScalingRollingUpdate` policies apply only when you do one or more of the following:
- Change the Auto Scaling group's `AWS::AutoScaling::LaunchConfiguration`.
- Change the Auto Scaling group's `VPCZoneIdentifier` property
- Update an Auto Scaling group that contains instances that don't match the current `LaunchConfiguration`.
Changing the Auto Scaling group's `DesiredCapacity` property is not in this list, so the `AutoScalingRollingUpdate` policy does not apply to this type of change.
As far as I know, it is not possible (using standard AWS CloudFormation resources) to delay the completion of a stack update that modifies `DesiredCapacity` until any new instances added to the Auto Scaling group are fully provisioned.
Here are some alternative options:
- When changing `DesiredCapacity`, modify a `LaunchConfiguration` property at the same time. This will trigger an `AutoScalingRollingUpdate` to the desired capacity (the downside is that it will also update existing instances, which may not actually need to be modified).
- Add an `AWS::AutoScaling::LifecycleHook` resource to your Auto Scaling group, and call `aws autoscaling complete-lifecycle-action` in addition to `cfn-signal`, to signal lifecycle-hook completion. This won't delay your CloudFormation stack update as desired, but it will delay the individual auto-scaled instances from entering the `InService` state until the lifecycle signal is received. (See Lifecycle Hooks documentation for more info.)
- […] completes only when there are a `DesiredCapacity` number of instances, all in the `InService` state.
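The last two options could be approximated from the command line roughly as follows. This is a sketch under assumptions, not a tested recipe: the function names, the polling interval and the attempt limit are all made up for illustration, and the commands rely on your default AWS CLI region and credentials. Note that `InService` only reflects finished initialisation if a launch lifecycle hook holds each instance back until its user data completes.

```shell
#!/bin/sh
# Look up the generated (physical) name of the Auto Scaling group
# from the stack name and the resource's logical ID.
asg_physical_name() {
  aws cloudformation describe-stack-resource \
    --stack-name "$1" \
    --logical-resource-id "$2" \
    --query 'StackResourceDetail.PhysicalResourceId' \
    --output text
}

# Count the instances of a group that are currently InService.
in_service_count() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query "length(AutoScalingGroups[0].Instances[?LifecycleState=='InService'])" \
    --output text
}

# Poll until at least $2 instances of group $1 are InService.
# $3 = maximum attempts (default 30), $4 = seconds between attempts (default 10).
# Returns 0 on success, 1 on timeout, 2 if the AWS CLI call fails.
wait_for_in_service() {
  group=$1 wanted=$2 attempts=${3:-30} delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    count=$(in_service_count "$group") || return 2
    [ "$count" -ge "$wanted" ] && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Called from the instance once initialisation succeeds, so that the
# lifecycle hook lets it move on to InService.
# $1 = hook name, $2 = group name, $3 = instance ID.
complete_launch_hook() {
  aws autoscaling complete-lifecycle-action \
    --lifecycle-hook-name "$1" \
    --auto-scaling-group-name "$2" \
    --instance-id "$3" \
    --lifecycle-action-result CONTINUE
}
```

After an `update-stack` call that raises the capacity to 2, something like `wait_for_in_service "$(asg_physical_name asg-test AutoScalingGroup)" 2` would block until both instances are in service or the attempts run out.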
The rolling update only works for existing instances. The documentation says:
Rolling updates enable you to specify whether AWS CloudFormation updates instances that are in an Auto Scaling group in batches or all at once.
So to test this, create a stack based on your template. Then make a small modification to the launch config (e.g. change `sleep 120` to `sleep 121`) and update the stack. Now you should see a rolling update.
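That test could be scripted along these lines. Treat it as a sketch: `bump_sleep` and `rolling_update_test` are hypothetical helper names, the output file name is arbitrary, and the region is assumed to come from your AWS CLI configuration (for example `AWS_DEFAULT_REGION=ap-northeast-1`).

```shell
#!/bin/sh
# Write a copy of template $1 to $2 with the user data sleep changed
# from 120 to 121 seconds, which alters the LaunchConfiguration.
bump_sleep() {
  sed 's/sleep 120/sleep 121/' "$1" > "$2"
}

# Update stack $1 with template file $2 and capacity $3,
# then block until the stack update finishes.
rolling_update_test() {
  stack=$1 template=$2 capacity=$3
  aws cloudformation update-stack \
    --stack-name "$stack" \
    --template-body "file://$template" \
    --parameters "ParameterKey=DesiredCapacity,ParameterValue=$capacity" || return 1
  aws cloudformation wait stack-update-complete --stack-name "$stack"
}
```

For example, `bump_sleep template.yml template-121.yml && rolling_update_test asg-test template-121.yml 1`. If the rolling update applies, the subsequent `describe-stack-events` output should show the AutoScalingGroup staying in `UPDATE_IN_PROGRESS` for longer than the sleep.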