I'm fairly new to Node and Mongo. I'm a developer from a relational database background.
I have been asked to write a report to calculate the conversion rate from leads relating to vehicle workshop bookings to invoices. A conversion is where an invoice was produced within 60 days of a lead being generated.
Using MongoDB, Mongoose and Node.js, I have managed to import all of the data from flat files into two collections, leads and invoices. I have 1M leads and about 30M invoices over a 5 year period, and the rates are to be produced on a month by month basis. All data has the vehicle reg in common.
So my problem is how do I join the data together with mongoose and nodejs?
So far I have attempted, for each single lead, to find any invoices within a 60 day period, which would qualify the lead as a conversion. This works, but my script stops after about 20 or so successful updates. I think the script, which makes an individual invoice query per lead, puts too heavy a load on MongoDB, and I can see that making millions of individual queries is too much.
After hours of browsing, I'm not sure what I should be looking for!?
Any help would be greatly appreciated.
Your attempt should be working without a problem. What helps me, though, with large MongoDB instances and analysis on them: run the queries directly in Mongo (in the shell), not through Node. That way you avoid converting Mongo structures (e.g. cursors) into Node structures (e.g. arrays) and generally shed a lot of overhead.
Also, make sure you have the right indexes set up. That can make a HUGE difference in terms of performance in big databases.
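What I would do then is something like the following (this should be considered pseudo code: leadId, id and date are placeholder field names, and in your data the vehicle reg is probably the field that links the two collections):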
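let converted = 0;
const SIXTY_DAYS_MS = 60 * 24 * 60 * 60 * 1000;
db.leads.find({}, {id: 1, date: 1}).forEach(lead => {
  // Window end: 60 days after the lead (assumes date is stored as a Date).
  const cutoff = new Date(lead.date.getTime() + SIXTY_DAYS_MS);
  // limit: 1 stops counting at the first matching invoice.
  const hasInvoices = db.invoices.countDocuments({leadId: lead.id, date: {$gte: lead.date, $lte: cutoff}}, {limit: 1}) > 0;
  if (hasInvoices) converted++;
});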
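Since the rates are wanted month by month rather than as one total, the single counter could be swapped for a per-month tally. A minimal sketch of that idea, where createMonthlyTally and monthKey are hypothetical helper names (not part of Mongo or Mongoose) and months are bucketed in UTC by assumption:

```javascript
// Hypothetical helper: formats a Date as a "YYYY-MM" bucket key (UTC).
function monthKey(date) {
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  return `${date.getUTCFullYear()}-${month}`;
}

// Hypothetical helper: keeps lead and conversion counts per calendar month.
function createMonthlyTally() {
  const months = new Map();
  return {
    // Call once per lead with its date and whether it converted.
    record(leadDate, converted) {
      const key = monthKey(leadDate);
      const entry = months.get(key) || { leads: 0, converted: 0 };
      entry.leads += 1;
      if (converted) entry.converted += 1;
      months.set(key, entry);
    },
    // Returns one row per month, oldest first, with the conversion rate.
    rates() {
      return [...months.entries()]
        .sort(([a], [b]) => a.localeCompare(b))
        .map(([month, { leads, converted }]) => ({
          month,
          leads,
          converted,
          rate: converted / leads,
        }));
    },
  };
}
```

Inside the forEach you would then call tally.record(lead.date, hasInvoices) instead of incrementing a counter, and read tally.rates() once the loop has finished.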
To speed things up, I'd use the following index for this case:
db.invoices.createIndex({leadId: 1, date: -1});
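Coming from a relational background, the closest thing to a JOIN is the $lookup aggregation stage, which lets the server do the matching in one pass instead of one query per lead. The following is only a sketch under assumptions: buildConversionPipeline is a hypothetical helper, the field names reg and date are guesses at your schema, and dates are assumed to be stored as Date values in both collections.

```javascript
const SIXTY_DAYS_MS = 60 * 24 * 60 * 60 * 1000;

// Hypothetical helper: builds an aggregation pipeline to run on the leads
// collection. Field and collection names are assumptions; adjust to the schema.
function buildConversionPipeline({
  invoiceCollection = 'invoices',
  joinField = 'reg',
  dateField = 'date',
} = {}) {
  return [
    {
      // For each lead, look for one invoice with the same reg in the window.
      $lookup: {
        from: invoiceCollection,
        let: {
          reg: `$${joinField}`,
          start: `$${dateField}`,
          // Adding a number to a date adds milliseconds.
          end: { $add: [`$${dateField}`, SIXTY_DAYS_MS] },
        },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: [`$${joinField}`, '$$reg'] },
                  { $gte: [`$${dateField}`, '$$start'] },
                  { $lte: [`$${dateField}`, '$$end'] },
                ],
              },
            },
          },
          { $limit: 1 }, // one matching invoice is enough
          { $project: { _id: 1 } },
        ],
        as: 'matches',
      },
    },
    {
      // Count leads and converted leads per calendar month of the lead date.
      $group: {
        _id: { year: { $year: `$${dateField}` }, month: { $month: `$${dateField}` } },
        leads: { $sum: 1 },
        converted: { $sum: { $cond: [{ $gt: [{ $size: '$matches' }, 0] }, 1, 0] } },
      },
    },
    {
      $project: {
        _id: 0,
        year: '$_id.year',
        month: '$_id.month',
        leads: 1,
        converted: 1,
        rate: { $divide: ['$converted', '$leads'] },
      },
    },
    { $sort: { year: 1, month: 1 } },
  ];
}
```

In the shell this would be run as db.leads.aggregate(buildConversionPipeline()), or from Mongoose as await Lead.aggregate(buildConversionPipeline()), where Lead is whatever model you defined for the leads collection. An index on the invoices join field plus date still matters here; whether the range comparisons inside $expr can use it depends on the server version, so it is worth checking with explain() on a small sample first.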