
MongoDB: insert big collections

I have MongoDB (version 2) in production in a replica set configuration (the next step is to add sharding).

I need to implement the following:

  • Once a day I'll receive a file with millions of rows, and I have to load it into MongoDB.
  • I have a runtime application that constantly reads from this collection - a very large volume of reads whose performance is very important. The collection is indexed and every read is an indexed lookup (readByIndex).

My current implementation of the load is (a minimal sketch follows the list):

  1. drop collection
  2. create collection
  3. insert into collection new documents
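
A rough sketch of that flow, assuming Python with pymongo, a local replica set, and a CSV input file; the connection string, database name, collection name, field names and batch size are placeholders for illustration, not details from the question:

```python
import csv
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["mydb"]

# 1. drop the old collection
db.drop_collection("daily_data")

# 2. create the collection (would also happen implicitly on first insert)
collection = db.create_collection("daily_data")

# 3. insert the new documents in batches to limit memory use
batch = []
with open("daily_file.csv", newline="") as f:
    for row in csv.DictReader(f):
        batch.append(row)
        if len(batch) == 1000:
            collection.insert_many(batch, ordered=False)
            batch = []
if batch:
    collection.insert_many(batch, ordered=False)

# recreate the index the readers rely on (placeholder field name)
collection.create_index("lookup_key")
```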

One thing I see is that, because of the MongoDB lock, my overall performance gets worse during the load. I've tested with collections of up to 10 million entries; beyond that size I think I should start using sharding.

What is the best way to solve this issue? Or should I use a different strategy altogether?

You could use two collections :)

  • collectionA contains this day's data
  • new data arrives
  • create a new collection (collectionB) and insert the data
  • now use collectionB as your data

Then, next day, repeat the above just swapping A and B :)

This will let collectionA still service requests while collectionB is being updated.
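
A rough sketch of that swap, again assuming pymongo; the "meta" pointer document, the collection names and the index field are illustrative choices rather than anything MongoDB prescribes:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["mydb"]

def active_collection_name():
    # a single pointer document tells readers which collection is live
    meta = db["meta"].find_one({"_id": "daily_data"}) or {"active": "daily_data_a"}
    return meta["active"]

def load_and_swap(documents):
    live = active_collection_name()
    staging_name = "daily_data_b" if live == "daily_data_a" else "daily_data_a"

    # rebuild the inactive collection while readers keep hitting the live one
    db.drop_collection(staging_name)
    staging = db[staging_name]
    staging.insert_many(documents, ordered=False)
    staging.create_index("lookup_key")

    # flip the pointer; readers pick up the new collection on their next query
    db["meta"].replace_one(
        {"_id": "daily_data"},
        {"_id": "daily_data", "active": staging_name},
        upsert=True,
    )

# readers resolve the collection name per query
def read_by_key(key):
    return db[active_collection_name()].find_one({"lookup_key": key})
```

Because readers resolve the collection name on each query (or on a short cache interval), flipping the pointer document switches all traffic to the freshly loaded collection without ever dropping or locking the one currently being read.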

PS Just noticed that I'm about a year late answering this question :)
