NodeJS streams and premature end

Assuming a Readable Stream in NodeJS with a relatively slow Data ( on('data', ...) ) event handler tied to it, is it possible for the End event to fire before the last Data handler(s) have finished, and if so, will it prematurely terminate that handler? Or will all Data events get dispatched and run?

In my case, I am working with large files and want to commit to a DB on every data chunk. I am worried that I may lose the last record or two (or more) if End is fired before the last DB calls in the handler actually complete.

The 'end' event fires after the last 'data' event, but it may fire before the work started by the last 'data' handler has finished. The stream also does not wait for one 'data' handler's asynchronous work to finish before it dispatches the next chunk. Depending on your code, work started by a later 'data' event may even finish before work started by an earlier one. 'end' does not terminate work that is already in progress, but code that treats 'end' as "everything is done" can run into errors or lose the last records.

An example that reproduces the problem (for your own tests):

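  var fs = require('fs');
  var rr = fs.createReadStream('somebigfile.jpg');
  var i = 0;
  rr.on('data', function(chunk) {
    i++;
    var s = i; // remember which chunk this handler call belongs to
    console.log('data:' + s);
    // Simulate slow async work. Later chunks get shorter delays,
    // so they tend to finish before earlier ones.
    setTimeout(function() {
      console.log('timeout:' + s);
    }, Math.max(0, 50 - i * 10));
  });
  rr.on('end', function() {
    // Can print while some of the timeouts above are still pending.
    console.log('end');
  });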

It prints a line to the console when each 'data' event handler starts, and another a few milliseconds later when that handler's work finishes. The finishing order may differ from the starting order.
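The same effect can be shown without a big file. The following is a minimal sketch, not part of the original answer: it assumes an in-memory stream built with `Readable.from` stands in for the file stream, and `demoFlowing` is a hypothetical helper name. It records the order of events and resolves only once 'end' has fired and all pending timeouts have completed.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: consumes `chunks` in flowing mode, where every
// 'data' handler starts slow async work (simulated with setTimeout).
// Resolves with a log of what happened, in order.
function demoFlowing(chunks, delayMs) {
  return new Promise((resolve) => {
    const log = [];
    let pending = 0;   // async jobs started but not yet finished
    let ended = false; // set once 'end' has fired
    const stream = Readable.from(chunks);

    const finishIfDone = () => {
      if (ended && pending === 0) resolve(log);
    };

    stream.on('data', (chunk) => {
      pending++;
      log.push('data:' + chunk);
      setTimeout(() => {
        log.push('done:' + chunk);
        pending--;
        finishIfDone();
      }, delayMs);
    });

    stream.on('end', () => {
      ended = true;
      log.push('end');
      finishIfDone();
    });
  });
}

demoFlowing(['a', 'b', 'c'], 50).then((log) => console.log(log.join(' ')));
```

In this sketch 'end' should show up in the log before the first 'done:' entry, yet none of the slow jobs is cancelled: they all still complete afterwards. Counting pending jobs, as done here, is one way to know when the work is really finished.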

Solution:

Readable Streams have two modes: 'flowing mode' and 'paused mode'. When you add a 'data' event handler, you automatically switch the Readable Stream to flowing mode.
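The mode switch can be observed through the stream's `readableFlowing` property. This is a small sketch under the assumption that an in-memory stream from `Readable.from` behaves like the file stream here; `flowingStates` is a hypothetical helper name.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: returns the value of readableFlowing at three
// points in a stream's life.
function flowingStates() {
  const stream = Readable.from(['a']);
  const initial = stream.readableFlowing;     // no consumer attached yet
  stream.on('data', () => {});                // attaching 'data' starts the flow
  const afterData = stream.readableFlowing;
  stream.pause();                             // explicit pause
  const afterPause = stream.readableFlowing;
  return [initial, afterData, afterPause];
}

console.log(flowingStates()); // [ null, true, false ]
```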

From the documentation:

When in flowing mode, data is read from the underlying system and provided to your program as fast as possible

In this mode, events will not wait for your slow actions to finish. What you need is 'paused mode'.

From the documentation:

In paused mode, you must explicitly call stream.read() to get chunks of data out. Streams start out in paused mode.

In other words: you ask for a chunk of data, you get it, you work with it, and when you are ready you ask for the next chunk. In this mode you control when you get your data.

How to change to 'paused mode':

Paused mode is the default for this stream, but registering a 'data' event handler switches it to 'flowing mode'. Therefore, do not use readstream.on('data', ...). Instead use readstream.on('readable', function(){...}). When it fires, the stream is ready to give you a chunk of data. To get the chunk, use var chunk = readstream.read();

Example from the docs:

var fs = require('fs');
var rr = fs.createReadStream('foo.txt');
rr.on('readable', function() {
  console.log('readable:', rr.read());
});
rr.on('end', function() {
  console.log('end');
});
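Building on the 'readable' example above, here is a hedged sketch of how slow, promise-based work could be combined with paused mode so that only one chunk is handled at a time. `consumePaused` and `handleChunk` are hypothetical names, and `Readable.from` is used as a stand-in for the file stream. The sketch does not assume anything about when 'end' fires relative to the last handler; it tracks both conditions explicitly.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: pulls chunks with read() and waits for
// handleChunk(chunk) (which must return a promise) before pulling more.
// Resolves once 'end' has fired AND no chunk is being handled.
function consumePaused(stream, handleChunk) {
  return new Promise((resolve, reject) => {
    let busy = false;  // true while the read loop below is running
    let ended = false; // true once 'end' has fired

    async function pump() {
      if (busy) return; // the running loop will pick up the new data
      busy = true;
      try {
        let chunk;
        while ((chunk = stream.read()) !== null) {
          await handleChunk(chunk);
        }
      } catch (err) {
        reject(err);
        return;
      }
      busy = false;
      if (ended) resolve();
    }

    stream.on('readable', pump);
    stream.on('end', () => {
      ended = true;
      if (!busy) resolve();
    });
    stream.on('error', reject);
  });
}

// Usage: a fake "DB write" that takes 20 ms per chunk.
const handled = [];
const slowWrite = (chunk) =>
  new Promise((done) => setTimeout(() => { handled.push(chunk); done(); }, 20));

consumePaused(Readable.from(['a', 'b', 'c']), slowWrite)
  .then(() => console.log('all handled:', handled.join(',')));
```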

Please read the documentation for more details, because there are other situations in which a stream is automatically switched to 'flowing mode'.

Working with slow handlers in flowing mode:

If you want or need to work in 'flowing mode', there is also a solution: you can pause and resume the stream. When you get a chunk from readstream.on('data', ...), pause the stream, and when you finish your work, resume it.

Example from the documentation:

var readable = getReadableStreamSomehow();
readable.on('data', function(chunk) {
  console.log('got %d bytes of data', chunk.length);
  readable.pause();
  console.log('there will be no more data for 1 second');
  setTimeout(function() {
    console.log('now data will start flowing again');
    readable.resume();
  }, 1000);
});
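Applied to the DB case from the question, the pause/resume idea could look roughly like the sketch below. It is an illustration under assumptions, not the documented API of any DB library: `consumeWithPause` and `saveChunk` are hypothetical names, `saveChunk` is assumed to return a promise, and `Readable.from` stands in for the file stream. The sketch counts in-flight saves rather than assuming that 'end' is always held back while the stream is paused.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: pauses the stream while saveChunk(chunk) is
// running and resumes it afterwards. Resolves once 'end' has fired
// AND every started save has completed.
function consumeWithPause(stream, saveChunk) {
  return new Promise((resolve, reject) => {
    let inFlight = 0;  // saves started but not yet finished
    let ended = false; // true once 'end' has fired

    const finishIfDone = () => {
      if (ended && inFlight === 0) resolve();
    };

    stream.on('data', (chunk) => {
      stream.pause(); // no further 'data' events until resume()
      inFlight++;
      saveChunk(chunk).then(() => {
        inFlight--;
        finishIfDone();
        stream.resume();
      }, (err) => {
        stream.destroy(err); // surfaces through the 'error' event
      });
    });

    stream.on('end', () => {
      ended = true;
      finishIfDone();
    });
    stream.on('error', reject);
  });
}

// Usage: a fake "DB commit" that takes 20 ms per chunk.
const saved = [];
const fakeCommit = (chunk) =>
  new Promise((done) => setTimeout(() => { saved.push(chunk); done(); }, 20));

consumeWithPause(Readable.from(['a', 'b', 'c']), fakeCommit)
  .then(() => console.log('all saved:', saved.join(',')));
```

The returned promise gives the caller a single place to run cleanup (for example, closing a connection) that is safe even if 'end' arrives while the last save is still running.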
