
Are callbacks for requests a bad practice in node.js?

Imagine you want to download an image or a file. This is the first way the internet will teach you to go about it:

request(url, function (err, res, body) {
    if (err) throw err;
    // for binary files, also pass { encoding: null } so body is a Buffer
    fs.writeFile(filename, body, function (err) { if (err) throw err; });
});

But doesn't this accumulate all the data in body, filling up memory? Would a pipe be more efficient?

request(url).pipe(fs.createWriteStream(filename));
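One caveat worth noting as an aside: pipe() does not forward errors from the source stream to the destination, so a more robust version of that one-liner attaches error handlers to both ends. A minimal sketch (the handler bodies are just placeholders):

request(url)
    .on('error', function (err) { console.error('request failed:', err); })
    .pipe(fs.createWriteStream(filename))
    .on('error', function (err) { console.error('write failed:', err); });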

Or is this handled internally in a similar manner, buffering the stream anyway and making this irrelevant?

Furthermore, if I want to use the callback but not the body (because you can still pipe), will this memory buffer still be filled?

I am asking because the first (callback) method allows me to chain downloads instead of launching them in parallel(*), but I don't want to fill a buffer I'm not going to use either. So I need the callback if I don't want to resort to something fancy like async just to use its queue to prevent this.

(*) Which is bad, because if you just request too many files before they are complete, the async nature of request will cause node to choke to death in an overdose of events and memory loss. First you'll get these:

"possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit."

And when you stretch it, 500 piped requests will fill up your memory and crash node. That's why you need the callback instead of the pipe, so you know when to start the next file.
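For illustration, the naive parallel pattern described above boils down to something like this (a sketch; urls and filenameFor are hypothetical placeholders):

urls.forEach(function (url) {
    // every request starts immediately; each open socket and write stream
    // stays alive until its download finishes, which is what exhausts memory
    request(url).pipe(fs.createWriteStream(filenameFor(url)));
});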

But doesn't this accumulate all data in body, filling the memory?

Yes, many operations, such as your first snippet, buffer data into memory for processing. Yes, this uses memory, but it is at least convenient and sometimes required, depending on how you intend to process that data. If you want to load an HTTP response and parse the body as JSON, that is almost always done via buffering; it is possible with a streaming parser, but that is much more complicated and usually unnecessary. Most JSON data is not sufficiently large for streaming to be a big win.
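For example, a buffered JSON fetch is only a few lines (a sketch using the request library's json option, which parses the buffered body for you):

request({ url: url, json: true }, function (err, res, body) {
    if (err) throw err;
    // the whole response was buffered first; body is already a parsed object
    console.log(body);
});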

Or is this handled internally in a similar manner, making this irrelevant?

No. APIs that provide you an entire piece of data as a string use buffering, not streaming.

However, for multimedia data, yes: you cannot realistically buffer it in memory, so streaming is more appropriate. Also, that data tends to be opaque (you don't parse it or process it), which is also good for streaming.

Streaming is nice when circumstances permit it, but that doesn't mean there's necessarily anything wrong with buffering. The truth is that buffering is how the vast majority of things work most of the time. In the big picture, streaming is just buffering one chunk at a time and capping the chunks at some size limit that is well within the available resources. Some portion of the data needs to go through memory at some point if you are going to process it.
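That cap is even tunable in Node: a minimal sketch, assuming Node's standard stream options, where highWaterMark bounds how much of the file sits in memory at once:

var fs = require('fs');

// copy a file while holding at most 64 KiB in memory at a time
fs.createReadStream('big.bin', { highWaterMark: 64 * 1024 })
    .pipe(fs.createWriteStream('copy.bin'));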

Because if you just request too many files one by one, the async nature of request will cause node to choke to death in an overdose of events and memory loss.

I'm not sure exactly what you are stating/asking here, but yes, writing effective programs requires thinking about resources and efficiency.

See also substack's rant on streaming/pooling in the hyperquest README.

I figured out a solution that renders the questions about memory irrelevant (although I'm still curious).

if I want to use the callback but not the body (because you can still pipe), will this memory buffer still be filled?

You don't need the callback from request() in order to know when the request is finished. The pipe() will close itself when the stream ends. The close emits an event that can be listened for:

request(url).pipe(fs.createWriteStream(filename)).on('close', function () {
    next(); // the file is fully written, so it is safe to start the next download
});

Now you can queue all your requests and download the files one by one.
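A minimal sequential queue built on that 'close' event could look like this (a sketch; urls is assumed to be an array of { url, filename } objects):

var urls = [ /* { url: '...', filename: '...' }, ... */ ];

function next() {
    var item = urls.shift();
    if (!item) return; // queue drained, all downloads done
    request(item.url)
        .pipe(fs.createWriteStream(item.filename))
        .on('close', next); // start the next download only after this file is written
}

next();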

Of course you can vacuum the internet with 8 parallel requests all the time using libraries such as async.queue, but if all you want to do is get some files with a simple script, async is probably overkill.
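For completeness, bounded parallelism with async.queue is short too (a sketch, assuming the async library and the same urls array as above; the queue keeps at most 8 downloads in flight):

var async = require('async');

var q = async.queue(function (item, done) {
    request(item.url)
        .pipe(fs.createWriteStream(item.filename))
        .on('close', done); // tell the queue this worker slot is free
}, 8); // concurrency: up to 8 parallel downloads

urls.forEach(function (item) { q.push(item); });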

Besides, you're not gonna want to max out your system resources for a single trick on a multi-user system anyway.
