Iterating over MongoDB cursor is slow
I need to iterate over a full MongoDB collection with ~2 million documents, so I am using the cursor feature and the eachAsync function. However, I noticed that it is pretty slow (it takes more than 40 minutes). I tried different batch sizes up to 5000 (which would be just 400 queries against MongoDB).
The application doesn't use much CPU (0.2% - 1%), nor does it use much RAM or IOPS. So apparently my code can be optimized to speed up this process.
The code:
const playerProfileCursor = PlayerProfile.find({}, { tag: 1 }).cursor({ batchSize: 5000 })
const p2 = new Promise<Array<string>>((resolve, reject) => {
  const playerTags: Array<string> = []
  playerProfileCursor.eachAsync((playerProfile) => {
    playerTags.push(playerProfile.tag)
  }).then(() => {
    resolve(playerTags)
  }).catch((err) => {
    reject(err)
  })
})
When I set a breakpoint inside the eachAsync callback, it is hit immediately. So nothing is stuck; it is just slow. Is there a way to speed this up?
That feature was added in version 4.12 (the most recent version at the moment) and isn't really documented yet.
eachAsync runs with a concurrency of 1 by default, but you can change it through the 'parallel' option.
Thus your code could look something like this:
const playerProfileCursor = PlayerProfile.find({}, { tag: 1 }).cursor({ batchSize: 5000 })
const p2 = new Promise<Array<string>>((resolve, reject) => {
  const playerTags: Array<string> = []
  playerProfileCursor.eachAsync((playerProfile) => {
    playerTags.push(playerProfile.tag)
  }, { parallel: 50 }).then(() => {
    resolve(playerTags)
  }).catch((err) => {
    reject(err)
  })
})
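To make the idea of a concurrency limit concrete, here is a minimal, self-contained sketch that does not use Mongoose at all. The names eachWithConcurrency, fakeCursor and collectTags are hypothetical and exist only for this illustration; this is not how the library implements 'parallel', just one way such a limit can work, assuming the source can be consumed as an async iterable.

```typescript
// Hypothetical helper, NOT part of Mongoose: calls `fn` for every item of an
// async source, with at most `parallel` calls in flight at the same time.
async function eachWithConcurrency<T>(
  source: AsyncIterable<T>,
  fn: (item: T) => Promise<void> | void,
  parallel = 1,
): Promise<void> {
  const iterator = source[Symbol.asyncIterator]();

  // Each worker pulls the next item as soon as its previous call has finished.
  async function worker(): Promise<void> {
    while (true) {
      const result = await iterator.next();
      if (result.done) return;
      await fn(result.value);
    }
  }

  const workers = Array.from({ length: Math.max(1, parallel) }, () => worker());
  await Promise.all(workers);
}

// Stand-in for a database cursor (assumption: documents only carry a `tag`).
interface FakeProfile {
  tag: string;
}

async function* fakeCursor(count: number): AsyncGenerator<FakeProfile> {
  for (let i = 0; i < count; i++) {
    yield { tag: `#TAG${i}` };
  }
}

// Collects all tags and records the highest number of simultaneous callbacks.
async function collectTags(
  count: number,
  parallel: number,
): Promise<{ tags: string[]; peak: number }> {
  const tags: string[] = [];
  let inFlight = 0;
  let peak = 0;

  await eachWithConcurrency(
    fakeCursor(count),
    async (profile) => {
      inFlight++;
      peak = Math.max(peak, inFlight);
      await Promise.resolve(); // stand-in for real async work per document
      tags.push(profile.tag);
      inFlight--;
    },
    parallel,
  );

  return { tags, peak };
}
```

With a limit of 1 the callbacks run strictly one after another, which preserves document order. A higher limit lets several callbacks overlap, which tends to matter most when each callback does asynchronous work of its own.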