ECS task only able to pick one message from SQS queue

I have an architecture that looks like this:

  • As soon as a message is sent to an SQS queue, an ECS task picks up this message and processes it.
  • This means that if X messages are sent to the queue, X ECS tasks are spun up in parallel. Each ECS task fetches only one message at a time (per the code below).

The ECS task runs a dockerized Python container and uses the boto3 SQS client to retrieve and parse the SQS message:

sqs_response = get_sqs_task_data('<sqs_queue_url>')
sqs_message = parse_sqs_message(sqs_response)

while sqs_message is not None:
    # Process it
    # Delete it from the queue
    
    # Get next message in queue
    sqs_response = get_sqs_task_data('<sqs_queue_url>')
    sqs_message = parse_sqs_message(sqs_response)

import logging

import boto3


def get_sqs_task_data(queue_url):
    client = boto3.client('sqs')

    response = client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1
    )

    return response

def parse_sqs_message(response_sqs_message):

    if 'Messages' not in response_sqs_message:
        logging.info('No messages found in queue')
        return None
    
    # ... parse it and return a dict

    return {
        'data_1': ...,
        'data_2': ...
    }

All in all, pretty straightforward.

In get_sqs_task_data(), I explicitly specify that I want to retrieve only one message (because one ECS task has to process only one message). In parse_sqs_message(), I test whether there are any messages left in the queue with:

if 'Messages' not in response_sqs_message:
    logging.info('No messages found in queue')
    return None

When there is only one message in the queue (meaning one ECS task has been triggered), everything works fine. The ECS task is able to pick up the message, process it, and delete it.
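For reference, the pick/process/delete cycle can be sketched roughly as follows. This is only a minimal sketch: `drain_queue` and `process_message` are hypothetical names that do not appear in the real task, the message body is passed through unparsed, and the client is passed in as an argument so the loop can be exercised without AWS credentials.

```python
def drain_queue(client, queue_url, process_message):
    """Receive, process, and delete messages one at a time until none is returned.

    `client` is assumed to be a boto3 SQS client (or any object exposing the
    same receive_message / delete_message calls). `process_message` is a
    hypothetical callback standing in for the real processing step.
    Returns the number of messages handled.
    """
    handled = 0
    while True:
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
        )
        # The 'Messages' key is absent when nothing was returned.
        messages = response.get('Messages', [])
        if not messages:
            return handled

        message = messages[0]
        process_message(message['Body'])

        # Delete only after processing succeeded, so that a crash leaves the
        # message to reappear once its visibility timeout expires.
        client.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message['ReceiptHandle'],
        )
        handled += 1
```

In the real task this would be called with something like `drain_queue(boto3.client('sqs'), '<sqs_queue_url>', my_handler)`, where `my_handler` is whatever function does the actual work.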

However, when the queue is populated with X messages (X > 1) at the same time, X ECS tasks are triggered, but only one ECS task is able to fetch one of the messages and process it.
All the other ECS tasks exit with No messages found in queue, although there are X - 1 messages left to be processed.

Why is that? Why are the other tasks not able to pick up the remaining messages?

If that matters, the VisibilityTimeout of the SQS queue is set to 30 minutes.

Any help would be greatly appreciated!

Feel free to ask for more details if needed.

I forgot to post an answer to this question.

The problem was that the SQS queue was set up as a FIFO queue. A FIFO queue delivers the messages of a message group strictly in order, so while one message of a group is in flight, other consumers receive nothing from that group. With all messages in the same group, this effectively allows only one consumer at a time. Changing it to a normal (standard) queue fixed the issue.
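If the queue had to stay FIFO, one option (which I did not use, so treat it as an untested sketch) would be to give every message its own MessageGroupId, since ordering is only enforced within a group. The helper names below are hypothetical, and the FIFO check relies only on the rule that FIFO queue names end in `.fifo`. Note that this gives up ordering between messages entirely.

```python
import uuid


def is_fifo_queue(queue_url):
    """FIFO queue names must end with the '.fifo' suffix."""
    return queue_url.endswith('.fifo')


def build_send_kwargs(queue_url, body):
    """Build keyword arguments for an SQS send_message call.

    For FIFO queues, each message gets its own group and deduplication ID,
    so no message blocks another one from being received.
    """
    kwargs = {'QueueUrl': queue_url, 'MessageBody': body}
    if is_fifo_queue(queue_url):
        unique_id = str(uuid.uuid4())
        kwargs['MessageGroupId'] = unique_id
        # Required unless content-based deduplication is enabled on the queue.
        kwargs['MessageDeduplicationId'] = unique_id
    return kwargs


def send_task_message(client, queue_url, body):
    """Send one message; `client` is assumed to be a boto3 SQS client."""
    return client.send_message(**build_send_kwargs(queue_url, body))
```

For a standard queue the extra parameters are simply left out, so the same helper works for both queue types.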

I'm not sure I understand how the tasks are triggered from SQS, but from what I understand of the SQS SDK documentation, this might happen when the number of messages is small and short polling is used. From the get_sqs_task_data definition, I see that you are using short polling.

Short poll is the default behavior where a weighted random set of machines is sampled on a ReceiveMessage call. Thus, only the messages on the sampled machines are returned. If the number of messages in the queue is small (fewer than 1,000), you most likely get fewer messages than you requested per ReceiveMessage call. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response. If this happens, repeat the request.

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sqs.html#SQS.Client.receive_message

You might want to try long polling instead, by setting WaitTimeSeconds on the receive call. Note that the maximum is 20 seconds, so it cannot be set higher than your 30-minute visibility timeout.
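Something along these lines might work as a starting point. It is a rough sketch rather than a drop-in replacement: `receive_one_message` and `max_attempts` are names I made up, and the client is passed in so the function can be tried with a stub. It combines long polling (WaitTimeSeconds) with the "repeat the request" advice from the documentation.

```python
def receive_one_message(client, queue_url, wait_seconds=20, max_attempts=3):
    """Long-poll for a single message, retrying a few times on empty responses.

    `client` is assumed to be a boto3 SQS client. Returns the message dict,
    or None if every attempt came back empty.
    """
    for _ in range(max_attempts):
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            # Long polling: wait up to this many seconds (SQS allows 0 to 20)
            # for a message to arrive instead of returning immediately.
            WaitTimeSeconds=wait_seconds,
        )
        messages = response.get('Messages', [])
        if messages:
            return messages[0]
    return None
```

With the defaults, a task would wait up to about a minute in total before concluding that the queue is empty.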

I hope it helps.
