
Consume SQS messages using an AWS Lambda function

I have 2 FIFO SQS queues which receive JSON messages that are to be indexed into Elasticsearch. One queue constantly receives delta changes made to the database. The second queue is used for database re-indexing, i.e. the entire 50 TB of data is re-indexed every couple of months (at which point everything is added to the queue). I have a Lambda function that consumes the messages from the queues and places them into the appropriate index (either the active index or the index being rebuilt).

How should I trigger the Lambda function to best process the backlog of messages in SQS, so that it processes both queues as quickly as possible?

A constraint I have is that the queue items need to be processed in order. If the Lambda function could run indefinitely, without the 5 minute limit, I could keep running one function that constantly processes messages.

The standard way to do this is to use CloudWatch Events that run periodically. This lets you pull data from the queue on a regular schedule.

Because you have to poll SQS, this may not give you the fastest processing of messages. Also, be careful if you constantly have messages to process: Lambda will end up being far more expensive than a small EC2 instance handling the same messages.
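As a rough sketch of the scheduled approach, the handler below drains a queue until it is empty or the invocation is close to timing out. It assumes Python with boto3; the `QUEUE_URL` environment variable, the 10 second safety margin and the `index_message` callback are placeholders for illustration, not part of the question.

```python
import json
import os
import time


def drain_queue(sqs, queue_url, index_message, deadline, clock=time.monotonic):
    """Process messages in the order received until the queue is empty or time runs out."""
    processed = 0
    while clock() < deadline:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
            WaitTimeSeconds=1,       # short long-poll so an empty queue ends the loop quickly
        )
        messages = response.get("Messages", [])
        if not messages:
            break
        for message in messages:
            index_message(json.loads(message["Body"]))
            # Delete only after indexing succeeded, so a failure leads to redelivery.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
            )
            processed += 1
    return processed


def handler(event, context):
    """Entry point for an invocation triggered by a CloudWatch Events schedule."""
    import boto3  # provided by the Lambda Python runtime

    # Assumption: stop 10 seconds before the Lambda timeout.
    budget_seconds = context.get_remaining_time_in_millis() / 1000 - 10
    return drain_queue(
        boto3.client("sqs"),
        os.environ["QUEUE_URL"],  # hypothetical environment variable
        index_message=print,      # placeholder for the real Elasticsearch indexing call
        deadline=time.monotonic() + budget_seconds,
    )
```

Keeping the loop in `drain_queue` separate from the handler makes it possible to exercise the ordering and deletion logic locally with a stub client.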

Instead of pushing your messages directly into SQS, you could publish them to an SNS topic with 2 subscribers registered:

  1. Subscriber: SQS
  2. Subscriber: Lambda function

This has the benefit that your Lambda is invoked at the same time as the message is stored in SQS.
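For illustration, a Lambda subscribed to the topic receives each published message wrapped in an SNS event record. The sketch below, assuming Python and JSON message bodies, unwraps those records; the function names are hypothetical and `print` stands in for the real indexing call. Keep in mind that a standard SNS topic does not guarantee delivery order, so the ordering constraint from the question would still need separate handling.

```python
import json


def messages_from_sns_event(event):
    """Return the decoded JSON payload of every SNS record in a Lambda event."""
    return [
        json.loads(record["Sns"]["Message"])
        for record in event.get("Records", [])
    ]


def handler(event, context):
    """Entry point for a Lambda function subscribed to the SNS topic."""
    documents = messages_from_sns_event(event)
    for document in documents:
        print(document)  # placeholder for the real Elasticsearch indexing call
    return len(documents)
```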

Not sure I fully understand your problem, but here are my 2 cents:

  1. If you have a constant, real-time stream of data, consider using Kinesis Streams with 1 shard in order to preserve FIFO ordering. You can consume the data in batches of n items using Lambda. It is up to you to decide the batch size n and the memory size of the Lambda.

    • With this solution you pay a low constant price for Kinesis Streams and a variable price for Lambda.
  2. If you really are in love with SQS and real-time processing does not matter, you can consume items with Lambda, EC2 or Batch. You either trigger many Lambdas with CloudWatch Events, keep an EC2 instance alive, or trigger an AWS Batch job on a regular basis.

    • There is an economic equation to explore: each solution is the best for one use case and the worst for another, so make your choice ;)
    • I prefer SQS + Lambda when there are few items to consume and SQS + Batch when there are a lot of items to consume.
  3. You could also consider using SNS + SQS + Lambda as @maikay suggests in his answer, but I wouldn't choose that solution.
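To make option 1 a bit more concrete, here is a minimal sketch of a Lambda handler attached to a Kinesis stream, assuming Python and JSON payloads. Lambda delivers Kinesis records in batches with the payload base64-encoded under `kinesis.data`, and records from a single shard arrive in sequence order. The function names are hypothetical and `print` stands in for the indexing call.

```python
import base64
import json


def documents_from_kinesis_event(event):
    """Decode the JSON payload of each Kinesis record, preserving batch order."""
    documents = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        documents.append(json.loads(raw))
    return documents


def handler(event, context):
    """Entry point for a Lambda function consuming batches from the stream."""
    documents = documents_from_kinesis_event(event)
    for document in documents:
        print(document)  # placeholder for the real Elasticsearch indexing call
    return len(documents)
```

The batch size n mentioned above is set on the event source mapping between the stream and the function, not in the handler code.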

Hope it helps. Feel free to ask for clarifications. Good luck!
