
mongoimport performance degrading over time

I'm using mongoimport to import some JSON files into my MongoDB database. I have 5 files with about 2M documents each, and the collection has 4 regular indexes and 2 multikey ones.

When I start importing the first file I see ~500 documents inserted per second, but performance starts going down after a while. I'm now importing the 3rd file and I see a throughput of less than 50 documents per second. It seems like the import degrades as the collection size increases. What's going on? How can I improve this?

While it may not be the reason in this case, indexes do have some overhead when it comes to writes, because you are inserting into each index as well as into the collection. There is an additional performance hit if you update a document in a way that causes it to grow larger than its allotted size; in that case, all indexes that include the document need to be updated.

With 6 indexes on the collection, there are 6 indexes to update for every document that is inserted. This will have some impact on the effective speed of mongoimport.

You could test this by importing into an unindexed collection and creating the indexes after the fact.
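A minimal sketch of that test, assuming a database named `mydb`, a target collection named `mycol_noindex`, and example field/file names (all names here are placeholders, not from the question):

```shell
# Bulk-load all files into a collection that has no secondary indexes yet,
# so each insert only has to write the document and the default _id index.
for f in file1.json file2.json file3.json file4.json file5.json; do
    mongoimport --db mydb --collection mycol_noindex --file "$f"
done

# Build the secondary indexes once, after the load, from the mongo shell.
# "fieldA" and "tags" stand in for whatever the real indexed fields are.
mongo mydb --eval 'db.mycol_noindex.createIndex({ fieldA: 1 })'
mongo mydb --eval 'db.mycol_noindex.createIndex({ tags: 1 })'  # multikey if "tags" is an array
```

If the per-second throughput stays roughly flat across all five files with this setup, the slowdown in the original run was coming from index maintenance rather than from the collection size itself.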

You can see here for more information on the write impact of indexes.

