
Are callbacks for requests a bad practice in node.js?

Imagine you want to download an image or a file. This is the first way the internet will teach you to do it:

request(url, function(err, res, body) {
    if (err) throw err;
    fs.writeFile(filename, body, function(err) {
        if (err) throw err;
    });
});

But doesn't this accumulate all the data in body, filling up memory? Wouldn't a pipe be far more efficient?

request(url).pipe(fs.createWriteStream(filename));

Or is this handled internally in a similar manner, buffering the stream anyway and making the difference irrelevant?

Furthermore, if I want to use the callback but not the body (because you can still pipe), will this memory buffer still be filled?

I am asking because the first (callback) method allows me to chain downloads instead of launching them in parallel (*), but I don't want to fill a buffer I'm not going to use either. So I need the callback if I don't want to resort to something fancy like async just to use a queue to prevent this.

(*) Which is bad, because if you just request too many files before they are complete, the async nature of request will cause node to choke to death in an overdose of events and memory loss. First you'll get these:

"possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit."

And when stretching it, 500 piped requests will fill your memory up and crash node. That's why you need the callback instead of the pipe, so you know when to start the next file.

But doesn't this accumulate all data in body, filling the memory?

Yes, many operations such as your first snippet buffer data into memory for processing. This uses memory, but it is at least convenient and sometimes required depending on how you intend to process that data. If you want to load an HTTP response and parse the body as JSON, that is almost always done via buffering. It is possible with a streaming parser, but it is much more complicated and usually unnecessary. Most JSON data is not large enough for streaming to be a big win.

Or is this handled internally in a similar manner, making this irrelevant?

No, APIs that provide you an entire piece of data as a string use buffering and not streaming.

Multimedia data, however, you generally cannot realistically buffer in memory, so streaming is more appropriate there. That data also tends to be opaque (you don't parse or process it), which also makes it a good fit for streaming.

Streaming is nice when circumstances permit it, but that doesn't mean there's anything necessarily wrong with buffering. The truth is buffering is how the vast majority of things work most of the time. In the big picture, streaming is just buffering 1 chunk at a time and capping them at some size limit that is well within the available resources. Some portion of the data needs to go through memory at some point if you are going to process it.

Because if you just request too many files one by one, the async nature of request will cause node to choke to death in an overdose of events and memory loss.

Not sure exactly what you are stating/asking here, but yes, writing effective programs requires thinking about resources and efficiency.

See also substack's rant on streaming/pooling in the hyperquest README.

I figured out a solution that renders the questions about memory irrelevant (although I'm still curious).

if I want to use the callback but not the body (because you can still pipe), will this memory buffer still be filled?

You don't need the callback from request() to know when the request is finished. The pipe() destination will close itself when the stream ends, and that close emits an event you can listen for:

request(url).pipe(fs.createWriteStream(filename)).on('close', function() {
    next();
});

Now you can queue all your requests and download files one by one.

Of course you can vacuum the internet using 8 parallel requests all the time with libraries such as async.queue, but if all you want to do is fetch some files with a simple script, async is probably overkill.

Besides, you're not gonna want to max out your system resources for a single trick on a multi-user system anyway.
