
Is there a default batch size that is used by MongoDB in the Bulk API?

I am new to MongoDB, and I ask because I read in the following link, "How can I improve MongoDB bulk performance?", that MongoDB internally breaks the list down into 1,000 operations at a time. However, I could not find this information in the MongoDB documentation.

Does this mean that when I insert 50,000 documents into a MongoDB collection using the Bulk API, MongoDB will internally break the list into batches of 1,000 and perform the bulk insert operation 50 times? If so, would I achieve the same performance by breaking the list of 50,000 documents into sublists of 1,000 documents myself and calling the bulk insert operation in a for loop? Which is the better approach?
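
For concreteness, here is roughly what the two approaches I am comparing look like with the pymongo driver (a minimal sketch; the connection string, the `testdb` database, and the `docs` collection are placeholders of mine, not part of the question):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client["testdb"]["docs"]

documents = [{"seq": i} for i in range(50_000)]

# Approach 1: hand the driver all 50,000 documents at once; the driver
# is said to split them into batches internally before sending them.
coll.insert_many(documents, ordered=False)

# Reset the collection so the second run inserts the same data
# (cleanup only needed for this side-by-side sketch).
coll.drop()

# Approach 2: batch manually, 1,000 documents per insert_many call.
BATCH = 1_000
for start in range(0, len(documents), BATCH):
    coll.insert_many(documents[start:start + BATCH], ordered=False)
```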

Please help me understand.

Thanks.

Yes, if you insert 50,000 documents into a MongoDB collection using the Bulk API, MongoDB will break them down into groups of at most 1,000 operations. Ideally you would make the batches of 1,000 yourself and do the inserts, but in this case it will not make any difference, because the data is already in memory. In a production-ready system, though, you should not accept this huge amount of data in a single request. The client should send small chunks of data, so that you can store them in a queue on the server and process them in the background (on another thread).
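
A rough sketch of that last suggestion, again with pymongo (the in-process queue, the batch size, the sentinel value, and all names here are my own illustrative assumptions, not part of the answer):

```python
import queue
import threading
from pymongo import MongoClient

coll = MongoClient("mongodb://localhost:27017")["testdb"]["docs"]

work_queue = queue.Queue()  # request handlers enqueue documents here
BATCH = 1_000

def writer():
    """Background thread: drain the queue and insert in batches."""
    batch = []
    while True:
        doc = work_queue.get()  # blocks until a document arrives
        if doc is None:         # sentinel: flush what is left and stop
            break
        batch.append(doc)
        if len(batch) >= BATCH:  # full batch: write it out to MongoDB
            coll.insert_many(batch, ordered=False)
            batch = []
    if batch:
        coll.insert_many(batch, ordered=False)

worker = threading.Thread(target=writer, daemon=True)
worker.start()

# Request handlers only enqueue; the worker thread does the inserts.
for i in range(50_000):
    work_queue.put({"seq": i})
work_queue.put(None)  # signal the worker to finish
worker.join()
```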
