简体   繁体   中英

Node.js optimizing module for best performance

I'm writing a crawler module which is calling it-self recursively to download more and more links depending on a depth option parameter passed.

Besides that, I'm doing more tasks on the returned resources I've downloaded (enrich/change it depending on the configuration passed to the crawler).
This process is going on recursively until it's done which might take a-lot of time (or not) depending on the configurations used.

I wish to optimize it to be as fast as possible and not to hinder on any Node.js application that will use it.
I've set up an express server that one of its routes launch the crawler for a user defined (query string) host.
After launching a few crawling sessions for different hosts, I've noticed that I can sometimes get real slow responses from other routes that only return simple text.
The delay can be anywhere from a few milliseconds to something like 30 seconds, and it's seems to be happening at random times (well nothing is random but I can't pinpoint the cause).
I've read an article of Jetbrains about CPU profiling using V8 profiler functionality that is integrated with Webstorm, but unfortunately it only shows on how to collect the information and how to view it, but it doesn't give me any hints on how to find such problems, so I'm pretty much stuck here.

Could anyone help me with this matter and guide me, any tips on what could hinder the express server that my crawler might do (A lot of recursive calls), or maybe how to find those hotspots I'm looking for and optimize them?

It's hard to say anything more specific on how to optimize code that is not shown, but I can give some advice that is relevant to the described situation.

One thing that comes to mind is that you may be running some blocking code. Never use deep recursion without using setTimeout or process.nextTick to break it up and give the event loop a chance to run once in a while.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM