Reserve documents in sharded MongoDB

I have a sharded collection of documents in MongoDB and several application servers accessing it.

Each application contributes new documents and eventually needs to remove some as well.

It doesn't matter which documents are removed, but it's critical that each application removes (claims) an exact number of documents, and that no other application removes (claims) the same document(s).

My idea is:

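// makeUniqueValue() is a placeholder for a per-application token;
// "collection" stands in for the real collection name
var unique = makeUniqueValue();
var docs = [];

for (var i = 0; i < 10; i++) {
    // atomically find one unclaimed document and mark it as ours
    var r = db.collection.findAndModify({
        query: { claim: false },
        update: { $set: { claim: unique } }
    });
    if (r) docs.push(r);  // the shell helper returns the document, or null if none matched
}

if (docs.length < 10) {
    // release all docs by updating (claim: false) and try again in some time
    db.collection.update({ claim: unique }, { $set: { claim: false } }, { multi: true });
}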

One potential problem with this solution is that, given too many applications (and few docs), they would just keep claiming some documents and releasing them again.
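One way the "try again in some time" step might be spaced out is with a randomized, growing delay, so that competing applications are less likely to collide again at the same moment. This is only a sketch: `retryWithJitter`, `maxAttempts`, and `baseMs` are hypothetical names, and `attempt` is assumed to be any async function that resolves to a truthy value on success.

```javascript
// Sketch only: retryWithJitter, maxAttempts and baseMs are illustrative names.
// `attempt` is any async function that resolves to a truthy value on success.
async function retryWithJitter(attempt, maxAttempts = 5, baseMs = 50) {
  for (let n = 0; n < maxAttempts; n++) {
    const result = await attempt();
    if (result) return result;
    if (n < maxAttempts - 1) {
      // Wait a random time whose upper bound doubles after each failed attempt.
      const delay = Math.random() * baseMs * 2 ** n;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  return null; // gave up after maxAttempts tries
}
```

A delay like this does not remove the contention, it only makes repeated simultaneous claim-and-release cycles less likely.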

What is the well-known and well-tested solution to this problem?

Do "update" and "findAndModify" guarantee that the updated document matches the query at the moment of the update?

Or could another application "steal" it between matching and updating, so that both applications think they've claimed the document?

Once the update is running on that document, it ensures that the query matches the document and that it is working on the latest version.

No other program should be able to steal a document; the guarantee holds on a per-document basis.

To explain a bit further, since I realise this answer is kind of bare: MongoDB has a writer-greedy read/write lock at the database level.

This means that findAndModify cannot find something while a write operation has been given the ability to run. So it can't, for example, find a document that is about to be updated as claimed by another thread or application.

So this code immediately isolates the claiming of documents to one application: each iteration of the loop run by another application will only ever see documents that are still unclaimed, never an in-between/partial state on the MongoDB server.
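The claim loop described here could look roughly like this with the MongoDB Node.js driver. This is a sketch under assumptions, not the asker's actual code: `claimDocs`, `wanted`, and `token` are hypothetical names, `collection` is assumed to be a driver `Collection`, and the `includeResultMetadata` option is assumed to be supported by the driver version in use.

```javascript
// Sketch only: `collection` is assumed to be a MongoDB Node.js driver Collection.
// claimDocs, wanted and token are illustrative names, not from the original post.
async function claimDocs(collection, wanted, token) {
  const claimed = [];
  for (let i = 0; i < wanted; i++) {
    // The filter and the $set are applied to one document as a single atomic
    // step, so two callers cannot both match the same unclaimed document.
    const r = await collection.findOneAndUpdate(
      { claim: false },
      { $set: { claim: token } },
      { includeResultMetadata: true } // ask for { value, ... } instead of the bare document
    );
    if (!r || !r.value) break; // nothing left to claim
    claimed.push(r.value); // by default this is the document as it was before the update
  }
  if (claimed.length < wanted) {
    // Shortfall: hand back everything tagged with our token so others can claim it.
    await collection.updateMany({ claim: token }, { $set: { claim: false } });
    return [];
  }
  return claimed;
}
```

Each call touches a single document, so the atomicity relied on is per document; the release step is a separate operation and only affects documents carrying this caller's token.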

When actually updating it doesn't matter, since you know those documents are the documents you need to update. However, operators like $set are run in sequence on a single document, so the update operations themselves cannot see a partial document state either: they either see claim: false or nothing. The update will also pick the rows directly from the data files, not from a static result set written out.

If you were to update using the _id or another static piece of data, then it would be different.
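One reading of that last point: if a document is looked up first and then updated by `_id` alone, the update no longer checks whether it is still unclaimed. A hedged sketch of keeping the check in a two-step approach, again assuming the Node.js driver (`claimById` and `token` are hypothetical names):

```javascript
// Sketch only: `collection` is assumed to be a MongoDB Node.js driver Collection;
// claimById and token are illustrative names.
async function claimById(collection, id, token) {
  // Keeping `claim: false` in the filter re-checks the state at write time.
  // Filtering on `_id` alone would overwrite another application's claim.
  const res = await collection.updateOne(
    { _id: id, claim: false },
    { $set: { claim: token } }
  );
  return res.modifiedCount === 1; // true only if this caller made the change
}
```

A `false` return means another application claimed the document between the lookup and the update, so the caller should pick a different one.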
