Nodejs Running Functions in Series

So right now I'm trying to use Node.js to access files in order to write them to a server and process them.

I've split it into the following steps:

  • Traverse directories to generate an array of all of the file paths
  • Put the raw text data from each of the file paths in another array
  • Process the raw data

The first two steps are working fine, using these functions:

var walk = function(dir, done) {
    var results = [];
    fs.readdir(dir, function(err, list) {
        if (err) return done(err);
        var pending = list.length;
        if (!pending) return done(null, results);
        list.forEach(function(file) {
            file = path.resolve(dir, file);
            fs.stat(file, function(err, stat) {
                if (stat && stat.isDirectory()) {
                    walk(file, function(err, res) {
                        results = results.concat(res);
                        if (!--pending) done(null, results);
                    });
                } else {
                    results.push(file);
                    if (!--pending) done(null, results);
                }
            });
        });
    });
};

function processfilepaths(callback) {
    // reading each file
    for (var k in filepaths) { if (arrayHasOwnIndex(filepaths, k)) {
        fs.readFile(filepaths[k], function (err, data) {
            if (err) throw err;
            rawdata[k] = data.toString().split(/ *[\t\r\n\v\f]+/g);
            for (var j in rawdata[k]) { if (arrayHasOwnIndex(rawdata[k], j)) {
                rawdata[k][j] = rawdata[k][j].split(/: *|: +/);
            }}
        });
    }}
    if (callback) callback();
}

Obviously, I want to call the function processrawdata() after all of the data has been loaded. However, using callbacks doesn't seem to work.

walk(rootdirectory, function(err, results) {
    if (err) throw err;
    filepaths = results.slice();
    processfilepaths(processrawdata);
});

This never causes an error. Everything seems to run perfectly except that processrawdata() always finishes before processfilepaths() does. What am I doing wrong?

I think for your problem, you can use the async module for Node.js:

async.series([
    function(callback){ ... },
    function(callback){ ... }
]);


To answer your actual question, I need to explain how Node.js works:
When you call an async operation (say, a MySQL database query), Node.js sends "execute this query" to MySQL. Since the query will take some time (maybe a few milliseconds), Node.js performs it through the MySQL async library, then returns to the event loop and does something else (like handling an HTTP request) while waiting for MySQL to get back to us. So, in your case, the two functions are independent and execute almost in parallel: processrawdata() is called while the file reads started by processfilepaths() are still in flight.
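The ordering described above can be seen with a small sketch that uses only Node's built-in modules. The function name demoOrder, the temporary file, and the order array are illustrative assumptions, not part of the original code; the file read simply stands in for any slow I/O such as the database query.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical demo: records the order in which the two pushes happen.
function demoOrder(done) {
    var order = [];
    // Create a throwaway file in a unique temp directory so the read succeeds.
    var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'order-demo-'));
    var tmpfile = path.join(dir, 'demo.txt');
    fs.writeFileSync(tmpfile, 'hello');

    // readFile only starts the read and registers the callback, then returns.
    fs.readFile(tmpfile, function (err) {
        fs.unlinkSync(tmpfile);
        fs.rmdirSync(dir);
        if (err) return done(err);
        order.push('read finished');
        done(null, order);
    });

    // This runs right away, before the callback above gets a chance to run.
    order.push('code after readFile');
}

demoOrder(function (err, order) {
    if (err) throw err;
    console.log(order.join(' -> ')); // code after readFile -> read finished
});
```

The line after the readFile call always runs first, because the callback cannot be invoked until the current synchronous code has returned to the event loop.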


You are having a problem with callback invocation and asynchronously called functions. IMO, I'd recommend that you use a library such as after-all to execute a callback once all your functions have been executed.

Here's an example. The function done will be called once all the functions wrapped with next have been called.

var afterAll = require('after-all');

// Call `done` once all the functions
// wrapped with next() get called
var next = afterAll(done);

// scheduled first, but fires second (after 500 ms)
setTimeout(next(function() {
  console.log('Step two.');
}), 500);

// scheduled second, but fires first (after 100 ms)
setTimeout(next(function() {
  console.log('Step one.');
}), 100);

function done() {
  console.log("Yay we're done!");
}

function processfilepaths(callback) {
    // reading each file
    for (var k in filepaths) { if (arrayHasOwnIndex(filepaths, k)) {
        fs.readFile(filepaths[k], function (err, data) {
            if (err) throw err;
            rawdata[k] = data.toString().split(/ *[\t\r\n\v\f]+/g);
            for (var j in rawdata[k]) { if (arrayHasOwnIndex(rawdata[k], j)) {
                rawdata[k][j] = rawdata[k][j].split(/: *|: +/);
            }}
        });
    }}
    if (callback) callback();
}

Realize that you have:

for
    readfile (err, callback) {... }
if ...

Node will call each readFile asynchronously, which only sets up the event and callback. Once it has finished calling every readFile, it reaches the if and invokes the callback, before any of the readFile callbacks has had a chance to run.
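One way to fix this without any library is the same pending counter that walk already uses: call the final callback from inside the last readFile callback to finish. The following is a minimal sketch under that assumption. The helper name readAllFiles and its (err, rawdata) callback signature are hypothetical, chosen for illustration. It also uses forEach instead of for (var k in ...), because with var every readFile callback would otherwise see the final value of k.

```javascript
var fs = require('fs');

// Hypothetical helper: reads every path, then calls callback(err, rawdata) once.
function readAllFiles(filepaths, callback) {
    var rawdata = [];
    var pending = filepaths.length;
    var failed = false;
    if (!pending) return callback(null, rawdata);

    // forEach gives each readFile callback its own index k.
    filepaths.forEach(function (filepath, k) {
        fs.readFile(filepath, function (err, data) {
            if (failed) return; // an earlier read already reported an error
            if (err) {
                failed = true;
                return callback(err);
            }
            // Same parsing as the original: split into lines, then on colons.
            rawdata[k] = data.toString()
                .split(/ *[\t\r\n\v\f]+/g)
                .map(function (line) { return line.split(/: *|: +/); });
            // Only the last read to finish triggers the final callback.
            if (!--pending) callback(null, rawdata);
        });
    });
}
```

With this shape, processrawdata() would be called from inside the callback passed to readAllFiles, so it cannot run until every file has been read and parsed.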

You need to use either Promises or a control-flow module like async to coordinate the reads. What you would then do looks like:

async.XXXX(filepaths, processRawData, 
   function (err, ...) {
      // function for when all are done
      if (callback) callback();
   }
);

Here XXXX is one of the functions from the library, such as series, parallel, each, etc. The only other thing you need to know is that async passes your per-item function (processRawData here) a callback to call when it is done. Unless you really need sequential access (I don't think you do), use a parallel variant so that you can queue up as many I/O events as possible. It should execute faster, maybe only marginally, but it will make better use of the hardware.
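For the Promises route mentioned above, here is a minimal sketch that starts all reads at once and waits for every one of them. It assumes a Node.js version that provides fs.promises. The helper name readAllFilesParallel is hypothetical, and the parsing simply mirrors the regular expressions from the question.

```javascript
var fs = require('fs');

// Hypothetical helper: resolves with the parsed contents of every file,
// in the same order as `filepaths`.
function readAllFilesParallel(filepaths) {
    // Start every read at once; each call returns a promise.
    var reads = filepaths.map(function (filepath) {
        return fs.promises.readFile(filepath, 'utf8').then(function (text) {
            return text
                .split(/ *[\t\r\n\v\f]+/g)
                .map(function (line) { return line.split(/: *|: +/); });
        });
    });
    // Promise.all resolves once all reads finish, or rejects on the first error.
    return Promise.all(reads);
}
```

The processing step would then go in a .then() on the returned promise, which only runs after all files are loaded, while the reads themselves still overlap.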
