I ran into a situation where I need to update a large number of documents very frequently.
I have collections like the ones below:
coll1
{
"identification_id" : String,
"name" : String,
"mobile_number" : Number,
"location" : String,
"user_properties" : [Mixed types],
"profile_url" : String
}
coll2
{
"identification_id": String,
"user_id" : String,
"name" : String,
"mobile_number" : Number,
"location" : String,
"user_properties" : String,
"profile_url": String,
"qualified_user" : String,
"user_interest_stage" :Number,
"source" : String,
"fb_id" : String,
"comments":String
}
updated coll1
{
"identification_id": String,
"name" : String,
"mobile_number" : Number,
"location" : String,
"user_properties" : String,
"profile_url": String,
"qualified_user" : String,
"user_interest_stage" :Number,
"source" : String,
"fb_id" : String,
"comments":String
}
Given coll1 and coll2 as shown above, documents are inserted according to the scenarios below.
Now, for various reasons, we are merging these collections into one collection, coll1. We have decided to identify qualified visitors by the key 'qualified_user' and update the corresponding user fields in coll1.
I have written a script, using Node.js and mongoose, that fetches documents from coll1, checks for a qualified_user in coll2, and updates coll1 based on the scenarios below.
When I run this script, I get the error below.
<--- JS stacktrace --->
==== JS stack trace =========================================
coll1 contains 1 lakh (100,000) documents. I ran into this error because of the large number of documents being processed at once. I then used skip and limit to process the documents in pages, but it took 1 hour to process all of them.
Is there a better way to handle this kind of DB update for a large number of documents?
You're trying to hold too many documents in memory at once, which makes the process run out of memory.
You have two easy options:
Use the --max-old-space-size flag when running your script. With it you can manually set the amount of memory (in megabytes) the script has access to, like so: node --max-old-space-size=4096 script.js
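To check that the flag actually took effect, one option is to print V8's heap limit from inside the script. This is a minimal sketch: heapLimitMiB is a hypothetical helper name, and the reported value is usually somewhat higher than the number passed to the flag because it also covers heap spaces other than the old space.

```javascript
const v8 = require('v8');

// heap_size_limit is reported in bytes; convert it to MiB for readability.
// heapLimitMiB is a hypothetical helper, not part of Node.js itself.
function heapLimitMiB() {
  const { heap_size_limit } = v8.getHeapStatistics();
  return Math.round(heap_size_limit / (1024 * 1024));
}

console.log(`V8 heap limit: ${heapLimitMiB()} MiB`);
```

Running the script once with the flag and once without it should show two different limits.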
That said, neither of these is optimal, and assuming your scale keeps increasing, both will eventually stop working. I personally recommend re-thinking the data structure. MongoDB, being a schemaless database, does not handle data duplication well. This means you want to keep all the data in one collection and then just update certain fields under certain conditions.
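For the one-off merge itself, a cursor combined with batched bulkWrite calls keeps memory use bounded without paging through skip and limit. The following is only a sketch under assumptions that the question does not confirm: documents are matched on identification_id, every coll2 document that has qualified_user set should have its five extra fields copied into coll1, and Coll1 and Coll2 are mongoose models passed in by the caller. The function names are hypothetical.

```javascript
// Fields that exist only in coll2 and are added to coll1 (from the schemas in the question).
const MERGED_FIELDS = ['qualified_user', 'user_interest_stage', 'source', 'fb_id', 'comments'];

// Build one bulkWrite operation for a single coll2 document.
// Assumption: coll1 and coll2 documents are matched on identification_id.
function buildUpdateOp(doc) {
  const $set = {};
  for (const field of MERGED_FIELDS) {
    if (doc[field] !== undefined) $set[field] = doc[field];
  }
  return {
    updateOne: {
      filter: { identification_id: doc.identification_id },
      update: { $set },
    },
  };
}

// Stream coll2 through a cursor and flush updates to coll1 in fixed-size batches,
// so at most batchSize operations are held in memory at any time.
// Coll1 and Coll2 are mongoose models supplied by the caller (hypothetical names).
async function mergeQualifiedUsers(Coll1, Coll2, batchSize = 1000) {
  let ops = [];
  let total = 0;
  // lean() returns plain objects instead of full mongoose documents, which saves memory.
  const cursor = Coll2.find({ qualified_user: { $exists: true } }).lean().cursor();
  for await (const doc of cursor) {
    ops.push(buildUpdateOp(doc));
    if (ops.length >= batchSize) {
      await Coll1.bulkWrite(ops, { ordered: false });
      total += ops.length;
      ops = [];
    }
  }
  // Flush whatever is left after the cursor is exhausted.
  if (ops.length > 0) {
    await Coll1.bulkWrite(ops, { ordered: false });
    total += ops.length;
  }
  return total; // number of update operations sent
}
```

An index on identification_id in coll1 would likely matter here, since every update filters on that field. The batch size of 1000 is an arbitrary starting point, not a tuned value.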