
Add thousands of messages to an Azure Storage Queue

I am trying to add about 6000 messages to my Azure Storage Queue from an Azure Function written in Node.js.

I have tried multiple ways to do this. Right now I wrap the QueueService method in a Promise and resolve the 6000 promises through a Promise.map with a concurrency of about 50 using Bluebird.

const addMessages = Promise.map(messages, (msg) => {
  //returns a promise wrapping the Azure QueueService method
  return myQueueService.addMessage(msg);
}, { concurrency: 50 });

//this returns a promise that resolves when all promises have resolved.
//it rejects when one of the promises have rejected.
addMessages.then((results) => {
  console.log("SUCCESS");
}, (error) => {
  console.log("ERROR");
});
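For reference, the concurrency-limited mapping that Bluebird's Promise.map performs can be sketched in plain JavaScript. The worker callback below is a stand-in for the promisified addMessage call; names like mapWithConcurrency are illustrative, not part of any library:

```javascript
// Minimal sketch of a concurrency-limited map without Bluebird.
// `worker` must return a Promise (here it would wrap addMessage).
async function mapWithConcurrency(items, worker, concurrency) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unprocessed index until none remain,
  // so at most `concurrency` calls are in flight at any time.
  async function run() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runners = [];
  for (let n = 0; n < Math.min(concurrency, items.length); n++) {
    runners.push(run());
  }
  await Promise.all(runners);
  return results;
}
```

Usage would mirror the snippet above: `mapWithConcurrency(messages, (msg) => myQueueService.addMessage(msg), 50)`. This does not change the failure modes described below, but it makes explicit what the concurrency option does.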

My QueueService is created with an ExponentialRetry policy.


I have had mixed results using this strategy:

  • All messages get added to my queue and the promise resolves correctly.
  • All messages get added to my queue and the promise does not resolve (or reject).
  • Not all messages get added to my queue and the promise does not resolve (or reject).

Am I missing something, or is it normal for these calls to sometimes take 2 minutes to resolve and sometimes more than 10 minutes?

In the future I will probably have to add about 100,000 messages, so I am worried about the unpredictable results I am seeing now.

What would be the best strategy to add a large number of messages in Node.js (in an Azure Function)?


EDIT:

Not sure how I missed this, but a pretty reliable way to add my messages to my Storage Queue is to use the queue output binding of my Azure Function:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue#storage-queue-output-binding

It makes my code a lot easier as well!

// initialize the output binding as an array, then add each message
context.bindings.outputQueue = [];
for (var i = 0; i < messages.length; i++) {
  context.bindings.outputQueue.push(messages[i]);
}
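For completeness, the output binding itself has to be declared in the function's function.json. A minimal sketch, assuming the binding is named outputQueue and the target queue is called myqueue (both names are placeholders):

```json
{
  "bindings": [
    {
      "type": "queue",
      "direction": "out",
      "name": "outputQueue",
      "queueName": "myqueue",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```

The "name" field is what maps the binding to context.bindings.outputQueue in the function code above.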

EDIT 2:

I am going to split up my messages into batches of about 1000 and store these batches in Azure Blob Storage.

Another Azure Function can be triggered each time a new blob is added, and this function will handle the queueing of my messages 1000 at a time.

This should make my queueing much more reliable and scalable: when I tried adding 20,000 messages to my queue through my output binding, the Azure Function timed out after 5 minutes, having processed only about 15,000 messages.

What triggers this function? What I would recommend, instead of having a single function add all of those messages, is to fan out and allow those functions to scale and take better advantage of concurrency by limiting the amount of work each one is doing.

With what I'm proposing above, you'd have the function that handles the trigger you have in place today queue up the work, which would in turn be processed by another function that performs the actual work of adding a (much) smaller number of messages to the queue. You may need to play with the numbers to see what works well for your workload, but this pattern would allow those functions to scale better (including across multiple machines), handle failures better, and improve reliability and predictability.

As an example, you could include the batch size in the message you queue to trigger the work: if you wanted 1000 messages as the final output, you could queue 10 messages instructing your "worker" functions to add 100 messages each. I would also recommend experimenting with much smaller numbers per function.
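The splitting step of this fan-out can be sketched as a simple batching helper; each batch would then become one unit of work (one blob, or one "instruction" queue message) for a worker function. The helper name is illustrative:

```javascript
// Split a list of messages into fixed-size batches. Each batch becomes
// one unit of work handed to a worker function, so no single function
// invocation has to enqueue the full set.
function toBatches(messages, batchSize) {
  const batches = [];
  for (let i = 0; i < messages.length; i += batchSize) {
    batches.push(messages.slice(i, i + batchSize));
  }
  return batches;
}
```

For 100,000 messages and a batch size of 1000, this yields 100 work items that can be processed in parallel by independently scaled worker invocations.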

I hope this helps!
