
Auto-scaling ECS Cluster to/from zero instances

I have implemented the Job Observer Pattern using SQS and ECS. Job descriptions are pushed to the SQS queue for processing. The job processing runs on an ECS cluster within an Auto Scaling Group, executing ECS Docker tasks.

Each ECS Task does:

  1. Read message from SQS queue
  2. Execute job on data (~1 hour)
  3. Delete message
  4. Loop while there are more messages
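
A minimal sketch of that worker loop, assuming boto3, a hypothetical QUEUE_URL, and a placeholder process_job function standing in for the ~1 hour of work:

```python
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/jobs"  # hypothetical queue


def process_job(body):
    """Placeholder for the ~1 hour of work on the job description."""
    ...


def worker_loop():
    while True:
        # 1. Read a message (long polling to avoid busy-waiting)
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,
        )
        messages = resp.get("Messages", [])
        if not messages:
            break  # 4. No more messages -> stop looping

        msg = messages[0]
        # 2. Execute the job on the data
        process_job(msg["Body"])
        # 3. Delete the message only after the job succeeds
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])


if __name__ == "__main__":
    worker_loop()
```

For jobs of roughly an hour, the queue's visibility timeout would need to be at least that long (or the worker would have to extend it with change_message_visibility), otherwise the message becomes visible again mid-job.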

I would like to scale the cluster down as instances run out of work, eventually to zero instances.

Looking at this similar post, the answers suggest that scale-in would need to be handled outside of the ASG in some way: instances would scale themselves in, either by explicitly self-terminating or by turning ASG Instance Protection off when there are no more messages.

This also doesn't handle the case of running multiple ECS Tasks on a single instance, since an individual task shouldn't terminate the instance while other tasks are still running in parallel.

Am I limited to self scale-in and only one Task per Instance? Any way to only terminate once all ECS Tasks on an instance have exited? Any other scale-in alternatives?

You could use CloudWatch Alarms with Actions:

detect and terminate worker instances that have been idle for a certain period of time
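
For example, a per-instance CloudWatch alarm with the built-in EC2 terminate action can kill an instance whose CPU has been low for a while. This is only a sketch; the instance ID, region, metric choice, and thresholds are assumptions:

```python
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Hypothetical instance to watch; in practice each worker would create its own alarm.
INSTANCE_ID = "i-0123456789abcdef0"

cloudwatch.put_metric_alarm(
    AlarmName=f"idle-terminate-{INSTANCE_ID}",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": INSTANCE_ID}],
    Statistic="Average",
    Period=300,                 # 5-minute samples...
    EvaluationPeriods=6,        # ...evaluated over 30 minutes
    Threshold=5.0,
    ComparisonOperator="LessThanThreshold",
    # Built-in CloudWatch alarm action that terminates the instance
    AlarmActions=["arn:aws:automate:us-east-1:ec2:terminate"],
)
```

Note that if the instance belongs to an Auto Scaling Group, the group may launch a replacement for a terminated instance unless its desired capacity is also reduced.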

I ended up using:

  • A scale-out policy that adds as many instances as there are pending SQS queue messages
  • A scale-in policy that sets the group to zero instances once the SQS queue is empty
  • Enabling ASG Instance Protection at the start of each batch job and disabling it at the end
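
A rough sketch of the scale-in side (the ASG name, queue name, region, and thresholds are assumptions): a CloudWatch alarm fires when the queue has no visible messages and triggers a simple scaling policy that sets the group to exactly zero instances.

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

ASG_NAME = "batch-workers"   # hypothetical Auto Scaling Group
QUEUE_NAME = "jobs"          # hypothetical SQS queue name

# Simple scaling policy that sets the group to exactly 0 instances.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName=ASG_NAME,
    PolicyName="scale-in-to-zero",
    PolicyType="SimpleScaling",
    AdjustmentType="ExactCapacity",
    ScalingAdjustment=0,
    Cooldown=300,
)

# Alarm on the queue being empty; when it fires it invokes the policy above.
cloudwatch.put_metric_alarm(
    AlarmName="jobs-queue-empty",
    Namespace="AWS/SQS",
    MetricName="ApproximateNumberOfMessagesVisible",
    Dimensions=[{"Name": "QueueName", "Value": QUEUE_NAME}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=3,
    Threshold=0,
    ComparisonOperator="LessThanOrEqualToThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```

Scale-in events skip instances that are protected from scale-in, which is why toggling Instance Protection around the batch job (third bullet above) matters.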

This restricts me to one batch job per instance, but worked well for my scenario.
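
The protection toggle can be done by the job itself. A sketch assuming boto3, access to the EC2 instance metadata service, and a hypothetical ASG name:

```python
import boto3
import urllib.request

ASG_NAME = "batch-workers"  # hypothetical Auto Scaling Group
autoscaling = boto3.client("autoscaling")


def my_instance_id():
    # EC2 instance metadata service (IMDSv1 shown for brevity)
    with urllib.request.urlopen(
        "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
    ) as resp:
        return resp.read().decode()


def set_protection(protected: bool):
    autoscaling.set_instance_protection(
        AutoScalingGroupName=ASG_NAME,
        InstanceIds=[my_instance_id()],
        ProtectedFromScaleIn=protected,
    )


# Wrap the batch job so the instance can't be scaled in mid-run.
set_protection(True)
try:
    pass  # run the batch job here
finally:
    set_protection(False)
```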

Another solution to this problem is the AWS Batch service, announced in late 2016.
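
With AWS Batch, the queue-and-worker plumbing above is replaced by submitting jobs to a managed job queue; a minimal sketch, assuming a job queue and job definition have already been created (all names here are hypothetical):

```python
import boto3

batch = boto3.client("batch")

# Hypothetical job queue and job definition created ahead of time.
batch.submit_job(
    jobName="process-data-0001",
    jobQueue="batch-worker-queue",
    jobDefinition="process-data-job:1",
    containerOverrides={
        "command": ["python", "run_job.py", "--input", "s3://bucket/key"],
    },
)
```

Batch's managed compute environments then handle the scale-out and scale-in (down to zero) themselves.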
