How to avoid multiple reads of the same data in MongoDB?

I am developing a service (in .NET) that sends a large number of emails.

The service can run on multiple nodes.

I have a collection in MongoDB that holds all the email data. How can one node read a document without the other nodes reading it too?

I want to avoid sending the same mail multiple times.

Basically, you need to signal to the other nodes that a document is already being processed. To do that, you could add a marker property to your document, e.g.:

using System;

public class MyDocument
{
  // ... other members

  // I suppose there is some kind of status property on the document
  public bool Done { get; set; }

  // null while no node has claimed the mail; otherwise the time the claim was made
  public DateTime? StartedProcession { get; set; }
}

Why is this a nullable DateTime property (DateTime?) rather than a simple flag? Because a timestamp lets you recover from errors that a node runs into while sending the mail or updating the status flag.

You can use MongoDB's FindAndModify method to identify the next mail a node should send. This method finds exactly one document and performs an atomic update on it. The conditions would be:

  • Done is false
  • StartedProcession is null or lies more than an hour in the past (or whatever timeout better suits your needs)

The update should set the StartedProcession property to the current time. This way, other nodes will not try to send the same mail.
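The two conditions and the update described above could be sketched roughly as follows. This assumes the current MongoDB .NET driver (MongoDB.Driver package), where the findAndModify command is exposed as FindOneAndUpdate. The MailClaimer class, the ClaimNextAsync method, the Id property and the timeout parameter are illustrative names that do not appear in the original answer.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public class MyDocument
{
    public ObjectId Id { get; set; }                  // assumed identifier
    public bool Done { get; set; }
    public DateTime? StartedProcession { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class MailClaimer
{
    // Atomically claims one unsent mail for the calling node.
    // The returned task yields null when no mail is currently claimable.
    public static Task<MyDocument> ClaimNextAsync(
        IMongoCollection<MyDocument> mails, TimeSpan timeout)
    {
        DateTime? now = DateTime.UtcNow;
        DateTime? staleBefore = now - timeout;

        var f = Builders<MyDocument>.Filter;

        // Done is false AND (never claimed OR claimed longer ago than the timeout)
        var filter = f.And(
            f.Eq(d => d.Done, false),
            f.Or(
                f.Eq(d => d.StartedProcession, (DateTime?)null),
                f.Lt(d => d.StartedProcession, staleBefore)));

        // Stamp the claim so that other nodes skip this document.
        var update = Builders<MyDocument>.Update
            .Set(d => d.StartedProcession, now);

        var options = new FindOneAndUpdateOptions<MyDocument>
        {
            // Return the document as it looks after the update.
            ReturnDocument = ReturnDocument.After
        };

        return mails.FindOneAndUpdateAsync(filter, update, options);
    }
}
```

A node would call something like ClaimNextAsync(mails, TimeSpan.FromHours(1)) in a loop and stop, or wait, when the result is null. Because matching and updating happen in one atomic operation on the server, two nodes calling this at the same time should not receive the same document.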

The node then sends the mail. If all goes well, you set the Done flag on the document; if not, you could reset the StartedProcession property to null to allow for a retry (in addition, you could store the error details in the document for later analysis).
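A minimal sketch of this send-then-update step, again assuming the MongoDB .NET driver. The MailProcessor class, the sendMailAsync delegate, the Id property and the LastError field are assumptions made for illustration; the original answer only says that error details could be stored somewhere in the document.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public class MyDocument
{
    public ObjectId Id { get; set; }                  // assumed identifier
    public bool Done { get; set; }
    public DateTime? StartedProcession { get; set; }
    public string LastError { get; set; }             // assumed field for error details
}

// Hypothetical helper, not part of the original answer.
public static class MailProcessor
{
    // Sends a mail this node has already claimed, then records the outcome.
    public static async Task ProcessAsync(
        IMongoCollection<MyDocument> mails,
        MyDocument claimed,
        Func<MyDocument, Task> sendMailAsync)         // caller-supplied send routine
    {
        var byId = Builders<MyDocument>.Filter.Eq(d => d.Id, claimed.Id);
        var u = Builders<MyDocument>.Update;

        try
        {
            await sendMailAsync(claimed);
        }
        catch (Exception ex)
        {
            // Sending failed: release the claim so any node may retry,
            // and keep the error message for later analysis.
            await mails.UpdateOneAsync(byId,
                u.Set(d => d.StartedProcession, (DateTime?)null)
                 .Set(d => d.LastError, ex.Message));
            return;
        }

        // Sending succeeded: mark the mail as done. This call is deliberately
        // outside the try block, so a failing update never releases the claim.
        await mails.UpdateOneAsync(byId, u.Set(d => d.Done, true));
    }
}
```

Keeping the Done update outside the try block is a design choice in this sketch: if that update fails after a successful send, the claim stays in place until the timeout expires instead of being released immediately.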

If an error is so catastrophic that the node cannot reset the StartedProcession property, the filter above ensures that another node will retry the transmission after an hour.

Please note that in some rare cases the mail might still be sent twice, e.g. if the mail is sent successfully but the update of the Done flag fails. However, this should be very rare, because the FindAndModify in the first step succeeded and the update is performed only some milliseconds later.
