Parallel polling of the AWS SQS standard queue - Message processing is too slow
I have a module that polls an AWS SQS queue at specified intervals, one message at a time, using ReceiveMessageRequest. Following is the method:
public static ReceiveMessageResult receiveMessageFromQueue() {
    String targetedQueueUrl = sqsClient.getQueueUrl("myAWSqueueName").getQueueUrl();
    ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(targetedQueueUrl)
            .withWaitTimeSeconds(10).withMaxNumberOfMessages(1);
    return sqsClient.receiveMessage(receiveMessageRequest);
}
Once a message is received and processed, it gets deleted from the queue with the following method, which returns a DeleteMessageResult:
public static DeleteMessageResult deleteMessageFromQueue(String receiptHandle) {
    log.info("Deleting Message with receipt handle - [{}]", receiptHandle);
    String targetedQueueUrl = sqsClient.getQueueUrl("myAWSqueueName").getQueueUrl();
    return sqsClient.deleteMessage(new DeleteMessageRequest(targetedQueueUrl, receiptHandle));
}
I've created an executable jar file which is deployed on around 40 instances, all of which are actively polling the queue. I can see that each of them receives messages. But in the AWS SQS console I only ever see the numbers 0, 1, 2 or 3 in the 'in flight messages' column. Why is that so, even when 40+ different consumers are receiving messages from the queue? Also, the number of messages available in the queue decreases very slowly.
Following are the configuration parameters of the queue.
Default Visibility Timeout: 30 seconds
Message Retention Period: 4 days
Maximum Message Size: 256 KB
Receive Message Wait Time: 0 seconds
Messages Available (Visible): 4,776
Delivery Delay: 0 seconds
Messages in Flight (Not Visible): 2
Queue Type: Standard
Messages Delayed: 0
Content-Based Deduplication: N/A
Why are the messages not getting processed quickly even when there are multiple consumers? Do I need to modify any of the queue parameters or something in the receive message/delete message requests? Please advise.
UPDATE:
All the EC2 instances and the SQS queue are in the same region. The consumers (the jar file that polls the queue) run as part of the start-up script of each EC2 instance. Each one has a scheduled task that polls the queue every 12 seconds. Before I push the messages to the queue, I spun up 2-3 instances. (We may have some instances already running at that time; this adds to the number of receivers for the queue, which is capped at 50.) On receiving a message, a consumer does some tasks (including some DB operations, data analysis and calculations, report file generation, and uploading the report to S3), which take approximately 10-12 seconds. After that is done, it deletes the message from the queue. The image below is a screenshot of the SQS metrics for the last week (from the SQS monitoring console).
I'll do the best I can with the information given. More details about your processing loop logic, region setup, and metrics (see below) would help improve this answer.
"I've created an executable jar file which is deployed in around 40 instances and are actively polling the queue. I could see each of them receives messages. But in AWS SQS console I can see only the numbers 0, 1, 2 or 3 on the 'in flight messages' column. Why that so even when there are 40+ different consumers are receiving messages from the queue? Also the number of messages available in the queue reduces very slowly."
"Why the messages are not getting processed quickly even when there are multiple consumers? Do I need to modify any of the queue parameters or something in the receive message/delete message requests?"
The fact that you're not seeing in-flight numbers that correspond more closely with the number of hosts you have processing messages definitely points to a problem: either your message processing is blazing fast (which doesn't seem to be the case) or your hosts aren't doing the work you think they are.
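As a rough sanity check on that mismatch, the figures from the question can be plugged into Little's law (average in-flight = receive rate x time each message is held). The sketch below is only an estimate under stated assumptions: every one of the 40 consumers really does receive one message on each 12-second poll, and each message is held for about 10 seconds (the low end of the reported 10-12 seconds) before it is deleted. The class and method names are made up for illustration.

```java
// Back-of-the-envelope estimate only; all inputs come from the question,
// and the class/method names are hypothetical.
final class InFlightEstimate {

    // Little's law: average in-flight = (messages received per second) * (seconds each is held).
    static double expectedInFlight(int consumers, double pollIntervalSeconds, double holdSeconds) {
        double receivesPerSecond = consumers / pollIntervalSeconds;
        return receivesPerSecond * holdSeconds;
    }

    // Time to drain a backlog if every consumer takes one message per poll interval.
    static double drainMinutes(int backlog, int consumers, double pollIntervalSeconds) {
        double receivesPerSecond = consumers / pollIntervalSeconds;
        return backlog / receivesPerSecond / 60.0;
    }

    public static void main(String[] args) {
        // 40 consumers, one poll every 12 s, ~10 s of processing per message.
        System.out.printf("expected in flight: %.1f%n", expectedInFlight(40, 12.0, 10.0));
        // 4,776 visible messages, as shown in the queue attributes.
        System.out.printf("drain time (min): %.1f%n", drainMinutes(4776, 40, 12.0));
    }
}
```

Under those assumptions you would expect roughly 33 messages in flight and the backlog of 4,776 to drain in about 24 minutes. Seeing only 0-3 in flight suggests that only a handful of consumers are actually holding a message at any given moment.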
In general, fetching and deleting a single message from SQS should take on the order of a few milliseconds. Without more detail on your setup, this should get you started on troubleshooting. (Some of these steps may seem obvious, but every single one of these was the source of real-life problems I've seen developers run into.)
Call getQueueUrl just once for the lifetime of your app, during some initialization step. You don't need to call this repeatedly, as it'll be the same URL.
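A minimal sketch of that idea follows. To keep it runnable without the AWS SDK on the classpath, the real lookup is represented by a plain Function; in the question's code that function would be something like name -> sqsClient.getQueueUrl(name).getQueueUrl(). The class name QueueUrlHolder and the placeholder URL are hypothetical and not part of any AWS API.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: resolves the queue URL once at construction time
// and hands back the cached value on every later call.
final class QueueUrlHolder {
    private final String queueUrl;

    // urlLookup stands in for the real network call to SQS (an assumption
    // made so that this sketch compiles without the AWS SDK).
    QueueUrlHolder(String queueName, Function<String, String> urlLookup) {
        this.queueUrl = urlLookup.apply(queueName); // the only lookup ever made
    }

    String queueUrl() {
        return queueUrl;
    }

    public static void main(String[] args) {
        AtomicInteger lookups = new AtomicInteger();
        QueueUrlHolder holder = new QueueUrlHolder("myAWSqueueName", name -> {
            lookups.incrementAndGet();
            return "https://sqs.example.invalid/" + name; // placeholder URL
        });

        // Simulate many receive/delete cycles reusing the cached URL.
        for (int i = 0; i < 100; i++) {
            holder.queueUrl();
        }
        System.out.println("lookups performed: " + lookups.get());
    }
}
```

With a holder like this created during start-up, receiveMessageFromQueue and deleteMessageFromQueue could read the stored URL instead of making an extra request to SQS on every receive and every delete.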
Further note: